ExomeHMM: A Hidden Markov Model for Detecting Copy Number Variation Using Whole-Exome Sequencing Data

Ao Li; Minghui Wang; Zhenhua Yu; Cheng Guo

doi:10.2174/1574893611666160727160757

Abstract

Background: Copy number variations (CNVs), including amplifications and deletions, are alterations of DNA copy number relative to a reference genome. CNVs play a crucial role in tumorigenesis and tumor progression: amplification of oncogenes and deletion of tumor suppressor genes may significantly increase the risk of cancer. CNVs are also reported to be closely associated with non-cancer diseases, such as Down syndrome, Parkinson's disease, and Alzheimer's disease.

Objective: Whole-exome sequencing (WES) has been successfully applied to the discovery of gene mutations and to clinical diagnosis. However, estimating copy number from WES data is challenging because of read depth bias, the distribution pattern of exons, and normal cell contamination. Our aim is to develop an efficient method that overcomes these challenges and detects CNVs using WES data.

Method: In this study, we present ExomeHMM, a hidden Markov model (HMM) based CNV detection algorithm. ExomeHMM exploits relative read depth, a ratio-based signal, to mitigate read depth distortion and employs an exponentially attenuated transition matrix to handle sparsely and non-uniformly distributed exons. The expectation–maximization algorithm is used to optimize the parameters of the proposed model. Finally, the standard Viterbi algorithm is used to infer the copy number of each exon.
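The following is a minimal sketch of the kind of model described above, not the ExomeHMM implementation. The abstract does not give the model's equations, so everything specific here is an assumption made for illustration: three states (deletion, normal, amplification), Gaussian emissions on a median-centered log2 ratio of test to reference depth, a transition matrix that decays from a base matrix toward the state prior as exp(-d / DECAY_LENGTH) with inter-exon distance d, and hand-set parameters in place of the expectation–maximization fit. All function and constant names are hypothetical.

```python
import numpy as np

# Assumed three-state model: 0 = deletion, 1 = normal, 2 = amplification.
# Expected log2 ratios for copy numbers 1, 2 and 3 against a diploid baseline.
STATE_MEANS = np.log2(np.array([1.0, 2.0, 3.0]) / 2.0)
STATE_SD = 0.25                       # assumed emission spread (not fit by EM)
PRIOR = np.array([0.05, 0.90, 0.05])  # assumed state prior
BASE_TRANS = np.array([               # assumed transitions for adjacent exons
    [0.900, 0.090, 0.010],
    [0.005, 0.990, 0.005],
    [0.010, 0.090, 0.900],
])
DECAY_LENGTH = 50_000.0               # assumed attenuation length in bp


def log2_relative_depth(test_depth, ref_depth, pseudo=0.5):
    """Median-centered log2 ratio of test depth to reference depth per exon."""
    test = np.asarray(test_depth, dtype=float) + pseudo
    ref = np.asarray(ref_depth, dtype=float) + pseudo
    log_ratio = np.log2(test / ref)
    return log_ratio - np.median(log_ratio)


def transition_matrix(distance_bp):
    """Blend BASE_TRANS with the prior; the blend weight decays with distance."""
    rho = np.exp(-distance_bp / DECAY_LENGTH)
    # Each row is a convex combination of two distributions, so rows sum to 1.
    return rho * BASE_TRANS + (1.0 - rho) * PRIOR[None, :]


def viterbi(log_ratio, positions):
    """Most likely state path given per-exon log ratios and exon positions."""
    log_ratio = np.asarray(log_ratio, dtype=float)
    positions = np.asarray(positions, dtype=float)
    n, k = len(log_ratio), len(STATE_MEANS)
    # Gaussian log-likelihood up to a constant shared by all states.
    log_emit = -0.5 * ((log_ratio[:, None] - STATE_MEANS[None, :]) / STATE_SD) ** 2
    score = np.log(PRIOR) + log_emit[0]
    back = np.zeros((n, k), dtype=int)
    for t in range(1, n):
        log_a = np.log(transition_matrix(positions[t] - positions[t - 1]))
        cand = score[:, None] + log_a          # cand[i, j]: state i -> state j
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_emit[t]
    path = np.empty(n, dtype=int)
    path[-1] = score.argmax()
    for t in range(n - 1, 0, -1):              # trace back the best path
        path[t - 1] = back[t, path[t]]
    return path
```

Under these assumptions, neighbouring exons that lie close together share information through BASE_TRANS, while exons separated by a large gap are treated as nearly independent draws from the prior. That is one plausible way to account for sparse, unevenly spaced exons.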

Results: Using previously identified CNVs in 1000 Genomes Project data as the gold standard, ExomeHMM achieves the highest F-score among the four methods compared in this study. When applied to triple-negative breast cancer data, ExomeHMM is able to find abnormal genes that are significantly associated with breast cancer.

Conclusion: ExomeHMM is a suitable tool for CNV detection in both healthy samples and clinical tumor samples using whole-exome sequencing data.

Keywords: Copy number variation, expectation–maximization algorithm, hidden Markov model, next-generation sequencing, Viterbi algorithm, whole-exome sequencing.

DOI https://dx.doi.org/10.2174/1574893611666160727160757	Print ISSN 1574-8936
Publisher Name Bentham Science Publisher	Online ISSN 2212-392X

Current Bioinformatics
