Current Bioinformatics

ISSN (Print): 1574-8936
ISSN (Online): 2212-392X

Research Article

Clin-mNGS: Automated Pipeline for Pathogen Detection from Clinical Metagenomic Data

Author(s): Akshatha Prasanna and Vidya Niranjan*

Volume 16, Issue 2, 2021

Published on: 08 June, 2020

Page: [306 - 314] Pages: 9

DOI: 10.2174/1574893615999200608130029

Abstract

Background: Bacteria are among the earliest known organisms, and their diversity and biology have attracted significant interest, particularly with respect to human health. Recent advances in metagenomic next-generation sequencing (mNGS), a culture-independent sequencing technology, have accelerated developments in clinical microbiology and our understanding of pathogens.

Objective: For mNGS to become feasible in routine clinical practice, a practical and scalable strategy for analyzing mNGS data is essential. This study presents a robust automated pipeline that analyzes clinical metagenomic data for pathogen identification and classification.

Methods: The proposed Clin-mNGS pipeline is an integrated, open-source, scalable, reproducible, and user-friendly framework scripted using the Snakemake workflow management software. The implementation avoids the manual installation and configuration of multiple command-line tools and their dependencies. The approach screens for pathogens directly from raw clinical reads and generates a consolidated report for each sample.
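As a rough illustration of how a workflow manager can derive the execution order of such a pipeline from declared dependencies, the following Python sketch builds a small dependency graph and sorts it topologically. It is not the Clin-mNGS Snakefile: the step names are hypothetical labels loosely based on the outputs named in this abstract, and the dependencies between them are assumptions made for the example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical steps (keys) mapped to the steps they depend on (values).
# The names and the edges are assumptions for illustration only.
STEP_DEPENDENCIES = {
    "raw_reads": set(),
    "quality_check": {"raw_reads"},
    "read_filtering": {"raw_reads"},
    "host_subtraction": {"read_filtering"},
    "assembly": {"host_subtraction"},
    "assembly_metrics": {"assembly"},
    "taxonomic_profiling": {"host_subtraction"},
    "amr_genes": {"assembly"},
    "plasmids": {"assembly"},
    "virulence_factors": {"assembly"},
    "sample_report": {
        "quality_check",
        "assembly_metrics",
        "taxonomic_profiling",
        "amr_genes",
        "plasmids",
        "virulence_factors",
    },
}


def execution_order(dependencies):
    """Return one valid order in which every step runs after its inputs."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for position, step in enumerate(execution_order(STEP_DEPENDENCIES), start=1):
        print(position, step)
```

Steps with no dependency on each other (for example, the assumed quality check and read filtering) may appear in either order, which is also what allows a workflow engine to run them in parallel.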

Results: The pipeline is demonstrated on publicly available data and tested on a desktop Linux system and a high-performance computing cluster. The study compares the variability in results across different tools and versions, and the tool versions are user-modifiable. The pipeline produces quality-check reports, filtered reads, host-subtracted reads, assembled contigs, assembly metrics, relative abundances of bacterial species, antimicrobial resistance genes, plasmid identification, and virulence factor identification. The results are evaluated in terms of sensitivity and positive predictive value.
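For reference, sensitivity is TP / (TP + FN) and positive predictive value is TP / (TP + FP), where TP, FN, and FP are the counts of true positives, false negatives, and false positives. The minimal Python sketch below applies these standard definitions to species-level calls. The function name and the species lists are made up for illustration; they are not data or code from the study.

```python
def sensitivity_and_ppv(expected, detected):
    """Compute (sensitivity, PPV) from expected and detected species names."""
    expected, detected = set(expected), set(detected)
    true_pos = len(expected & detected)    # expected and reported
    false_neg = len(expected - detected)   # expected but missed
    false_pos = len(detected - expected)   # reported but not expected
    sensitivity = (
        true_pos / (true_pos + false_neg) if (true_pos + false_neg) else float("nan")
    )
    ppv = (
        true_pos / (true_pos + false_pos) if (true_pos + false_pos) else float("nan")
    )
    return sensitivity, ppv


# Illustrative inputs only (not results from the paper).
expected_species = [
    "Escherichia coli",
    "Klebsiella pneumoniae",
    "Staphylococcus aureus",
    "Enterococcus faecalis",
]
detected_species = [
    "Escherichia coli",
    "Klebsiella pneumoniae",
    "Staphylococcus aureus",
    "Cutibacterium acnes",
]

if __name__ == "__main__":
    sens, ppv = sensitivity_and_ppv(expected_species, detected_species)
    print(f"sensitivity={sens:.2f} ppv={ppv:.2f}")
```

In this made-up case, three of the four expected species are detected and one unexpected species is reported, so both measures come out at 0.75.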

Conclusion: Clin-mNGS is an automated Snakemake pipeline, validated for the analysis of clinical microbial metagenomic reads, that performs taxonomic classification and antimicrobial resistance prediction.

Keywords: Metagenomic analysis, clinical metagenomics, clinical diagnostics, Snakemake, pathogen detection, taxonomic identification, antimicrobial drug resistance, virulence factor genes.
