Increase in taxonomic assignment efficiency of viral reads in metagenomic studies

Metagenomic studies have revolutionized biology by revealing the presence of many previously unisolated and uncultured micro-organisms. However, one of the main problems encountered in these studies is the high percentage of sequences that cannot be assigned taxonomically using common similarity-based approaches (e.g. BLAST or HMM). These unassigned sequences are metaphorically called "dark matter" in the metagenomic literature and are often described as being derived from new or unknown organisms. Here, based on published and original metagenomic datasets from samples enriched in virus-like particles, we present and quantify the improvement in viral taxonomic assignment that is achievable with a new similarity-based approach. Prior to any use of similarity-based taxonomic assignment methods, we propose assembling contigs from short reads, as is routinely done in metagenomic studies, and then mapping the unassembled reads to the assembled contigs. This additional mapping step significantly increases the proportion of taxonomically assignable sequence reads across a variety of virome studies: plant, insect and environmental (estuary, lakes, soil, feces).
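The effect of the extra mapping step can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, read and contig identifiers, and the taxon label are all hypothetical toy values. It assumes contigs have already been labeled by a similarity search and that a read mapper has reported which contig each read hits. The sketch only shows how labels propagate from contigs to reads.

```python
def assign_reads(contig_taxon, read_to_contig, all_reads):
    """Propagate contig-level taxonomic labels to the reads placed on them.

    contig_taxon:   contig id -> taxon label, or None if the contig is unassigned
    read_to_contig: read id -> contig id (reads absent from the dict are unplaced)
    all_reads:      every read id in the dataset
    """
    assignments = {}
    for read in all_reads:
        contig = read_to_contig.get(read)
        taxon = contig_taxon.get(contig) if contig is not None else None
        if taxon is not None:
            assignments[read] = taxon
    return assignments


def assigned_fraction(assignments, all_reads):
    """Fraction of reads that received a taxonomic label."""
    return len(assignments) / len(all_reads) if all_reads else 0.0


# Toy data (hypothetical): two contigs, only one of which has a similarity hit.
contig_taxon = {"contig_1": "Parvoviridae", "contig_2": None}
all_reads = ["r1", "r2", "r3", "r4", "r5", "r6"]

# Reads incorporated into contigs by the assembler.
assembled = {"r1": "contig_1", "r2": "contig_1", "r3": "contig_2"}

# Reads left out of the assembly but placed on a contig by the extra mapping step.
mapped_back = {"r4": "contig_1"}

before = assigned_fraction(assign_reads(contig_taxon, assembled, all_reads), all_reads)
after = assigned_fraction(
    assign_reads(contig_taxon, {**assembled, **mapped_back}, all_reads), all_reads
)
print(f"assigned before mapping step: {before:.2f}, after: {after:.2f}")
```

In this toy case, read r4 gains a label only because it maps back to a labeled contig. Reads r5 and r6 stay unassigned because they map to no contig, and r3 stays unassigned because its contig has no similarity hit.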

Bibliographic Details
Main Authors: François, Sarah; Filloux, Denis; Frayssinet, Marie; Roumagnac, Philippe; Martin, Darren Patrick; Ogliastro, Mylène; Froissart, Rémy
Format: article biblioteca
Language: eng
Subjects: L73 - Animal diseases, H20 - Plant diseases, feces, variety, http://aims.fao.org/aos/agrovoc/c_2772, http://aims.fao.org/aos/agrovoc/c_8157, http://aims.fao.org/aos/agrovoc/c_3081
Online Access: http://agritrop.cirad.fr/586204/
http://agritrop.cirad.fr/586204/1/1-s2.0-S0168170217307505-main.pdf