Evaluation of genome similarities using a wavelet-domain approach

Abstract
INTRODUCTION: Tuberculosis is listed among the top 10 causes of death worldwide. The resistant strains causing this disease are considered responsible for public health emergencies and health security threats. According to the World Health Organization (WHO), around 558,000 cases with resistance to rifampicin (the most effective first-line drug) have been estimated to date. Therefore, in order to detect resistant strains using the genomes of Mycobacterium tuberculosis (MTB), we propose a new methodology for the analysis of genomic similarities that combines the different decomposition levels of the genome signal (discrete non-decimated wavelet transform) with the Hurst exponent.
METHODS: The signals corresponding to the ten analyzed sequences were obtained from their GC content. These signals were then decomposed using the discrete non-decimated wavelet transform with the Daubechies wavelet with four vanishing moments at five levels of decomposition. The Hurst exponent was calculated at each decomposition level using five different methods. Cluster analysis was performed on the resulting Hurst exponent values.
RESULTS: The aggregated variance, differenced aggregated variance, and aggregated absolute value methods produced three groups, whereas the Peng and R/S methods produced two groups. The aggregated variance method gave the best results with respect to grouping similar strains together.
CONCLUSION: The Hurst exponent, evaluated together with the discrete non-decimated wavelet transform, can be used as a measure of similarity between genome sequences, thus refining the analysis.
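Two of the steps named in the abstract, building a GC-content signal and estimating the Hurst exponent by the aggregated variance method, can be illustrated with a minimal Python sketch. This is not the authors' implementation, which this record does not describe. The function names, the non-overlapping window scheme, the window length, and the block sizes are all assumptions made for illustration, and the wavelet decomposition step is omitted: the estimator is applied directly to the raw signal.

```python
import math
import random


def gc_content_signal(sequence, window):
    """Fraction of G/C bases in consecutive non-overlapping windows.

    The windowing scheme is an assumption; the record does not state
    how the GC content signal was built.
    """
    seq = sequence.upper()
    signal = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        signal.append(sum(base in "GC" for base in chunk) / window)
    return signal


def hurst_aggregated_variance(series, block_sizes):
    """Estimate the Hurst exponent H by the aggregated variance method.

    For each block size m, the series is split into blocks of length m and
    the sample variance of the block means is computed. That variance
    scales like m**(2H - 2), so H = 1 + slope / 2, where slope comes from
    a least-squares fit of log(variance) against log(m).
    """
    xs, ys = [], []
    for m in block_sizes:
        n_blocks = len(series) // m
        if n_blocks < 2:
            continue  # need at least two block means for a variance
        means = [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]
        grand_mean = sum(means) / n_blocks
        variance = sum((v - grand_mean) ** 2 for v in means) / (n_blocks - 1)
        if variance > 0:
            xs.append(math.log(m))
            ys.append(math.log(variance))
    if len(xs) < 2:
        raise ValueError("not enough usable block sizes to fit a slope")
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return 1.0 + slope / 2.0


if __name__ == "__main__":
    # Synthetic random sequence, used only to exercise the functions.
    random.seed(1)
    toy_genome = "".join(random.choice("ACGT") for _ in range(200_000))
    signal = gc_content_signal(toy_genome, window=50)
    print(hurst_aggregated_variance(signal, [2, 4, 8, 16, 32, 64]))
```

In the workflow the abstract describes, an estimator of this kind would be applied to each of the five wavelet decomposition levels rather than to the raw signal, and the resulting per-level values would feed the cluster analysis.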

Bibliographic Details
Main Authors: Ferreira, Leila Maria; Sáfadi, Thelma; Ferreira, Juliano Lino
Format: Digital journal
Language: English
Published: Sociedade Brasileira de Medicina Tropical - SBMT 2020
Online Access: http://old.scielo.br/scielo.php?script=sci_arttext&pid=S0037-86822020000100322
id oai:scielo:S0037-86822020000100322
record_format ojs
spelling oai:scielo:S0037-86822020000100322
2020-05-28
Evaluation of genome similarities using a wavelet-domain approach
Ferreira,Leila Maria
Sáfadi,Thelma
Ferreira,Juliano Lino
GC content
Hurst exponent
Mycobacterium tuberculosis
Discrete non-decimated wavelet transform
Grouping
info:eu-repo/semantics/openAccess
Sociedade Brasileira de Medicina Tropical - SBMT
Revista da Sociedade Brasileira de Medicina Tropical v.53 2020
2020-01-01
info:eu-repo/semantics/article
text/html
http://old.scielo.br/scielo.php?script=sci_arttext&pid=S0037-86822020000100322
en
10.1590/0037-8682-0470-2019
institution SCIELO
collection OJS
country Brazil
countrycode BR
component Journal
access Online
databasecode rev-scielo-br
tag journal
region South America
libraryname SciELO
language English
format Digital
author Ferreira,Leila Maria
Sáfadi,Thelma
Ferreira,Juliano Lino
title Evaluation of genome similarities using a wavelet-domain approach
publisher Sociedade Brasileira de Medicina Tropical - SBMT
publishDate 2020
url http://old.scielo.br/scielo.php?script=sci_arttext&pid=S0037-86822020000100322