TORNADO: an automated pipeline for de novo hybrid genome assembly based on free software packages for Sanger and next generation sequencing technologies (NGS).

Next generation sequencing technologies (NGS) have made it possible to sequence entire genomes quickly and at low cost, from unicellular to complex organisms such as plants and mammals. These sequences can be assembled using a reference genome or by a de novo bioinformatics method, such as Velvet, SOAPDenovo, Edena, ABYSS, GS Assembler 454, Mira and ZORRO. These tools are mainly based on de Bruijn graphs or, in a few cases, on read overlaps to form contigs and scaffolds. The filtering and assembly steps are very sensitive to the tool used, and can be a key factor in obtaining the best assembly results. Thus, when a set of sequences comes from distinct technologies, from Sanger to NGS, a distinct assembly strategy is needed for each type of data. To our knowledge, there is currently no automated hybrid strategy that combines distinct assembly software packages and can be applied to assemble hybrid data generated by NGS or Sanger platforms. This work presents TORNADO, an automated pipeline for hybrid genome assembly based on free software packages. TORNADO does not propose new methods for genome assembly. It uses the best described software strategy for each type of genomic data to perform the hybrid assembly. It is organized in two main modules that are configured by an XML file. In the first module, input data are filtered by trimming and by clipping experimental artifacts. In the second module, based on sequence type, TORNADO automatically performs the assembly using Mira for 454 and Sanger reads, and Velvet for Illumina/Solexa or Solid/Life Tech reads. Finally, the individual assemblies are merged into a single assembly using ZORRO. If there are paired-end (mate-pair) reads, an additional step uses the CloseGaps software, which closes the gaps between assembled scaffolds. TORNADO has already been applied to assemble hybrid genomic reads from the fungus Moniliophthora perniciosa, the causal agent of witches' broom disease in cacao. Results showed that our strategy can serve as a useful method to automatically assemble hybrid genome data. TORNADO was implemented using the Java and Perl programming languages.
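The routing described in the abstract can be illustrated with a minimal sketch. It is written in Python for brevity and is not TORNADO's own Java/Perl code. The `ReadSet` type, the `plan_assembly` function, the platform keys, and the file names are all hypothetical; the sketch also assumes that each input read set is assembled separately and that the filtering tool is unspecified, since the abstract does not say otherwise. It only plans the steps and does not invoke any assembler.

```python
from dataclasses import dataclass

# Assembler per sequencing platform, as stated in the abstract.
# The dictionary keys are hypothetical labels chosen for this sketch.
ASSEMBLER_BY_PLATFORM = {
    "sanger": "Mira",
    "454": "Mira",
    "illumina": "Velvet",
    "solid": "Velvet",
}


@dataclass(frozen=True)
class ReadSet:
    path: str            # hypothetical input file name
    platform: str        # one of the keys in ASSEMBLER_BY_PLATFORM
    paired: bool = False  # True for paired-end (mate-pair) reads


def plan_assembly(read_sets):
    """Return an ordered list of (stage, tool, inputs) tuples.

    This only plans the pipeline; it does not run any external program.
    """
    for rs in read_sets:
        if rs.platform not in ASSEMBLER_BY_PLATFORM:
            raise ValueError(f"unknown platform: {rs.platform}")

    steps = []

    # Module 1: trimming and artifact clipping (tool not named in the abstract).
    for rs in read_sets:
        steps.append(("filter", None, [rs.path]))

    # Module 2: pick the assembler from the sequence type.
    # Assumption: one assembly per input read set.
    for rs in read_sets:
        steps.append(("assemble", ASSEMBLER_BY_PLATFORM[rs.platform], [rs.path]))

    # Merge the individual assemblies into a single assembly.
    steps.append(("merge", "ZORRO", [rs.path for rs in read_sets]))

    # Optional gap closing when paired-end (mate-pair) reads are present.
    paired_paths = [rs.path for rs in read_sets if rs.paired]
    if paired_paths:
        steps.append(("close_gaps", "CloseGaps", paired_paths))

    return steps


if __name__ == "__main__":
    plan = plan_assembly([
        ReadSet("reads_illumina.fastq", "illumina", paired=True),
        ReadSet("reads_454.sff", "454"),
    ])
    for stage, tool, inputs in plan:
        print(stage, tool, inputs)
```

In a real pipeline the platform of each input would come from the XML configuration file mentioned in the abstract; its schema is not given in this record, so the sketch takes the read sets as plain Python objects instead.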

Bibliographic Details
Main Authors: HERAI, R. H., COSTA, G. G. D. L., R. JÚNIOR, O., VIDAL, R. O., NASCIMENTO, L. C., PARIZZI, L. P., PEREIRA, G. G. A., CARAZZOLLE, M. F.
Other Authors: LGE/IB/UNICAMP, LBA/CNPTIA; LGE/UNICAMP; LGE/IB/UNICAMP; LGE/IB/UNICAMP, LNBio; LGE/IB/UNICAMP; LGE/IB/UNICAMP; LGE/IB/UNICAMP, LNBio; LGE/IB/UNICAMP, CENAPAD.
Format: Conference annals and proceedings
Language: English
Published: 2010-12-02
Subjects: Databases, Bioinformatics, Genome, Moniliophthora perniciosa, Computer software
Online Access: http://www.alice.cnptia.embrapa.br/alice/handle/doc/868519
id dig-alice-doc-868519
record_format koha
spelling dig-alice-doc-868519 2017-08-15T21:38:03Z X-meeting 2010. In: INTERNATIONAL CONFERENCE OF THE BRAZILIAN ASSOCIATION FOR BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, 6., 2010, Ouro Preto. Abstracts... [S.l.: s.n.], 2010. p. 119. openAccess
institution EMBRAPA
collection DSpace
country Brasil
countrycode BR
component Bibliográfico
access En linea
databasecode dig-alice
tag biblioteca
region America del Sur
libraryname Sistema de bibliotecas de EMBRAPA