FINDMAP

The findmap.f90 program aligns sequence reads to a reference map, calls previously known variants, and identifies new variants. Program and download information can be found at the Animal Improvement Program (AIP) web site: http://aipl.arsusda.gov/software/findhap

Sequencing research requires efficient computation. Few programs use already known information about DNA variants when aligning sequence data to the reference map. The new program findmap.f90 reads the previous variant list before aligning sequence, calling variant alleles, and summing the allele counts for each DNA source in a single pass. Its advantages are faster processing, more precise alignment, more useful data summaries, more compact output, and fewer steps.

Programs findmap and BWA were compared using simulated paired-end reads of length 150 from fragments of length 1,000 at random locations within the UMD3.1 bovine reference assembly. Each base had a 1% probability of error and a 1% probability of being missing. The 39 million variants from run 5 of the 1,000 bull genomes project were included, with every other variant set to reference or alternate. With 1 processor, BWA required 629 minutes per 1X for alignment, whereas findmap required 12 minutes per 1X for alignment and variant calling. The percentage of correctly mapped reads was 90.5% from BWA and 92.9% from findmap. Variant calls were output by findmap only for the 88.2% of pairs where both ends were located within the fragment length and were of opposite orientation. Percentages of variants called correctly were 99.8% for SNPs and 99.9% for deletions, while insertions had 99.9% of alternate calls correct but only 98.6% of reference calls. Memory required by BWA was 4.6 Gbytes per processor, whereas findmap required 46 Gbytes that could be shared by multiple processors. Simultaneous alignment and variant calling is an efficient and accurate strategy.

Resources in this dataset:

Resource Title: FINDMAP.
File Name: Web Page, url: https://www.ars.usda.gov/research/software/download/?softwareid=495&modecode=80-42-05-30 (download page)
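
The timing and memory figures quoted in the description can be put side by side with a little arithmetic. The Python sketch below is illustrative only and is not part of the findmap distribution: the constant and function names are hypothetical, and it assumes that BWA memory grows linearly with the number of processors while the findmap table is loaded once and shared, which is a simplification of what the description states.

```python
from __future__ import annotations

# Figures quoted in the description (1 processor, simulated reads).
BWA_MINUTES_PER_1X = 629.0      # alignment only
FINDMAP_MINUTES_PER_1X = 12.0   # alignment plus variant calling
BWA_GB_PER_PROCESSOR = 4.6      # memory needed by each BWA processor
FINDMAP_GB_SHARED = 46.0        # one copy, shared by all processors


def speedup(slow_minutes: float, fast_minutes: float) -> float:
    """Ratio of run times: how many times faster the second program is."""
    return slow_minutes / fast_minutes


def total_memory_gb(processors: int) -> tuple[float, float]:
    """Return (BWA total, findmap total) in Gbytes.

    Assumption (not from the source): BWA memory scales linearly with
    the processor count and findmap memory stays constant.
    """
    return BWA_GB_PER_PROCESSOR * processors, FINDMAP_GB_SHARED


if __name__ == "__main__":
    ratio = speedup(BWA_MINUTES_PER_1X, FINDMAP_MINUTES_PER_1X)
    print(f"findmap speedup over BWA per 1X: {ratio:.1f}x")
    for n in (1, 10, 20):
        bwa_gb, findmap_gb = total_memory_gb(n)
        print(f"{n:2d} processors: BWA {bwa_gb:.1f} GB, findmap {findmap_gb:.1f} GB")
```

Under these assumptions the two memory totals meet at ten processors, so the larger findmap footprint matters mainly when only a few processors are used.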

Bibliographic Details
Main Author: Paul M. VanRaden (17477646)
Format: dataset library
Published: 2019
Subjects: Genomics and transcriptomics, Genetics, computer software, models, animals, genome, computers, memory, humans, autosomes, mitochondrial DNA, homozygosity, alleles, heterozygosity, single nucleotide polymorphism, reading, bulls, DNA, probability
Online Access: https://figshare.com/articles/model/FINDMAP/24664257
id dat-usda-us-article24664257
record_format figshare
spelling dat-usda-us-article24664257 2019-05-02T00:00:00Z FINDMAP Paul M. VanRaden (17477646) 2019-05-02T00:00:00Z dataset Model 10113/AA22726 https://figshare.com/articles/model/FINDMAP/24664257 CC0
institution USDA US
collection Figshare
country United States
countrycode US
component Research data
access Online
databasecode dat-usda-us
tag library
region North America
libraryname National Agricultural Library of USDA
topic Genomics and transcriptomics
Genetics
computer software
models
animals
genome
computers
memory
humans
autosomes
mitochondrial DNA
homozygosity
alleles
heterozygosity
single nucleotide polymorphism
reading
bulls
DNA
probability
format dataset
author Paul M. VanRaden (17477646)
title FINDMAP
publishDate 2019
url https://figshare.com/articles/model/FINDMAP/24664257