ddRADseq‑mediated detection of genetic variants in sugarcane

Sugarcane (Saccharum sp.), a feedstock known worldwide for sugar, bioethanol, and energy production, has an extremely complex genome, being highly polyploid and aneuploid. A double-digestion restriction site-associated DNA sequencing protocol (ddRADseq) was tested in four commercial sugarcane hybrids and one high-fibre biotype for the detection of single nucleotide polymorphisms (SNPs). In this work we tested two Illumina sequencing platforms, two read sizes (70 vs. 150 bp), different sequencing coverage per individual (medium and high coverage), and single reads versus paired-end reads. We also explored different variant calling strategies (with and without a reference genome) and filtering schemes [combining two minor allele frequencies (MAFs) with three depth-of-coverage thresholds]. For the discovery of a large number of novel SNPs in sugarcane, we recommend longer, paired-end reads, medium sequencing coverage per individual and the Illumina NovaSeq6000 platform for a cost-effective approach, and filter parameters of lower MAF and higher depth-of-coverage thresholds. Although the de novo analysis retrieved more SNPs, the reference-based method allows downstream characterization of variants. For the two best-performing matrices, the number of SNPs per chromosome correlated positively with chromosome length, demonstrating the presence of variants throughout the genome. Multivariate comparisons, with both matrices, showed closer relationships among commercial hybrids than with the high-fibre biotype. Functional analysis of the SNPs demonstrated that more than half of them landed within regulatory regions, whereas the other half affected coding, intergenic and intronic regions. Allelic distance values were lower than 0.07 when analysing two replicated genotypes, confirming the robustness of the protocol.
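
The filtering scheme described above (two MAF thresholds crossed with three depth-of-coverage thresholds, giving six SNP matrices) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the threshold values, the read-count-based MAF, the rule that every sample must reach the minimum depth, and the toy SNP names are hypothetical and are not taken from the article.

```python
from itertools import product

# Hypothetical thresholds: the abstract states only that two MAFs and
# three depth thresholds were combined, not their actual values.
MAF_THRESHOLDS = (0.05, 0.10)
DEPTH_THRESHOLDS = (5, 10, 20)


def minor_allele_frequency(counts):
    """Pooled, read-based MAF for one biallelic SNP.

    counts is a list of (ref_reads, alt_reads) tuples, one per sample.
    Using read counts instead of genotype calls is an assumption.
    """
    ref = sum(r for r, _ in counts)
    alt = sum(a for _, a in counts)
    total = ref + alt
    return min(ref, alt) / total if total else 0.0


def passes(counts, min_maf, min_depth):
    """Keep a SNP if every sample reaches min_depth and the MAF is high enough."""
    deep_enough = all(r + a >= min_depth for r, a in counts)
    return deep_enough and minor_allele_frequency(counts) >= min_maf


def filter_grid(snps, mafs=MAF_THRESHOLDS, depths=DEPTH_THRESHOLDS):
    """Return the SNP names retained under each (MAF, depth) combination."""
    return {
        (maf, depth): [name for name, counts in snps.items()
                       if passes(counts, maf, depth)]
        for maf, depth in product(mafs, depths)
    }


# Toy data: three SNPs, three samples each, as (ref_reads, alt_reads).
TOY_SNPS = {
    "snp_a": [(18, 4), (20, 3), (15, 6)],  # deep coverage, moderate MAF
    "snp_b": [(9, 1), (11, 0), (10, 1)],   # medium coverage, low MAF
    "snp_c": [(3, 3), (4, 2), (2, 4)],     # shallow coverage, high MAF
}
```

Calling `filter_grid(TOY_SNPS)` returns one list of retained SNP names per threshold pair, which makes it easy to compare how many variants each of the six filter combinations keeps.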

Bibliographic Details
Main Authors: Molina, Catalina; Aguirre, Natalia Cristina; Vera, Pablo Alfredo; Filippi, Carla Valeria; Puebla, Andrea Fabiana; Marcucci Poltri, Susana Noemi; Paniego, Norma Beatriz; Acevedo, Alberto
Format: info:ar-repo/semantics/artículo biblioteca
Language: eng
Published: Springer 2022-11-11
Subjects: Single Nucleotide Polymorphism, Hybrids, Sugar Cane, Saccharum, Genotyping by Sequencing, Polyploid Genome, Sequencing
Online Access: http://hdl.handle.net/20.500.12123/13460
https://link.springer.com/article/10.1007/s11103-022-01322-4
https://doi.org/10.1007/s11103-022-01322-4