The Cacao Criollo Genome v2.0: An improved version of the genome for genetic and functional genomic studies. [W098]

Theobroma cacao L., native to the Amazon basin of South America, is an economically important fruit tree crop for tropical countries and the source of chocolate. The first draft genome of the species, from a Criollo cultivar, was published in 2011. Although it is a useful resource, it can be improved by identifying misassemblies and by reducing the number of scaffolds, gaps, and sequences not anchored to the ten chromosomes. In this work, we used an NGS-based approach to significantly improve the assembly of the Belizean Criollo B97-61/B2 genome. We combined four Illumina large-insert-size mate-pair libraries with 52x coverage of Pacific Biosciences long reads to correct misassembled regions and to reduce the number of scaffolds to 554 (4,792 in assembly V1), with the N50 increasing from 0.47 Mb to 6.5 Mb. In the new assembly, 96.7% of the sequence is anchored to the 10 chromosomes, compared with 66.8% previously. Unknown sites (Ns) were reduced from 10.8% to 5.7%. Moreover, the NCBI Eukaryotic Genome Annotation Pipeline produced a new RefSeq structural annotation based on RNA-seq evidence, and functional annotations have been updated. The release of the Theobroma cacao Criollo genome version 2 will be a valuable resource for investigating complex traits at the genomic level and is an important step for future comparative genomics and genetics studies on cocoa. New functional tools and annotations are available through the cacao genome hub (http://cocoa-genome-hub.southgreen.fr). (Full text)
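
The abstract compares the two assemblies by scaffold N50 and by the share of unknown sites (Ns). As a minimal sketch of how such statistics are commonly computed, and not the authors' pipeline, the Python below uses two hypothetical helper names, `n50` and `unknown_fraction`, on toy sequences invented for illustration; it assumes the scaffolds are already loaded as plain strings.

```python
def n50(lengths):
    """Return the N50 of a list of scaffold lengths.

    N50 is the length L such that scaffolds of length >= L
    together cover at least half of the total assembly size.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # empty input


def unknown_fraction(sequences):
    """Return the fraction of bases that are N (unknown sites)."""
    total = sum(len(s) for s in sequences)
    unknown = sum(s.upper().count("N") for s in sequences)
    return unknown / total if total else 0.0


# Toy scaffolds, invented for illustration only (not cacao data).
toy_scaffolds = ["ACGTNNACGT", "ACGTAC", "ACNT", "AC"]
toy_n50 = n50([len(s) for s in toy_scaffolds])
toy_unknown = unknown_fraction(toy_scaffolds)
print(f"N50 = {toy_n50} bp, unknown sites = {toy_unknown:.1%}")
```

A higher N50 with fewer scaffolds, as reported for version 2, indicates that more of the assembly sits in long contiguous pieces, while a lower N fraction indicates that fewer gaps remain.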

Bibliographic Details
Main Authors: Argout, Xavier; Martin, Guillaume; Droc, Gaëtan; Labadie, Karine; Rivals, Eric; Aury, Jean-Marc; Lanaud, Claire
Format: conference_item biblioteca
Language: eng
Published: PAG
Subjects: F30 - Plant genetics and breeding
Online Access: http://agritrop.cirad.fr/583452/
http://agritrop.cirad.fr/583452/2/ID583452.pdf