Abstract
Background: Three serotypes of foot-and-mouth disease (FMD) virus circulate in Asia and are commonly identified by serological assays. Such tests are time-consuming and require a bio-containment facility. To the best of our knowledge, no computational solution for predicting FMD virus serotypes is available in the literature. There is therefore an urgent need for user-friendly tools for FMD virus serotyping.
Methods: We present a computational solution based on a machine-learning model for FMD virus classification and serotype prediction. The approach also implements several data pre-processing techniques to improve model prediction. We used sequence data from 2509 FMD virus isolates reported from India and seven other Asian FMD-endemic countries for model training, testing, and validation. We also assessed the utility of the solution in a wet-lab setup by collecting and sequencing 12 virus isolates reported in India. The solution is implemented in two user-friendly tools: an online web-prediction server (https://nifmd-bbf.icar.gov.in/FMDVSerPred) and an R statistical software package (https://github.com/sam-dfmd/FMDVSerPred).
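As an illustration of what a sequence pre-processing step can look like, the sketch below converts a nucleotide sequence into normalized k-mer frequencies, a common numeric encoding for machine-learning classifiers. This is an assumption made for illustration only: the abstract does not state which pre-processing techniques or features were used, and the function name `kmer_features`, the choice of k, and the example sequence are hypothetical.

```python
from collections import Counter
from itertools import product


def kmer_features(sequence, k=3, alphabet="ACGT"):
    """Return normalized k-mer frequencies in a fixed, reproducible order.

    Hypothetical helper for illustration; not taken from FMDVSerPred.
    """
    # Normalize case and map RNA 'U' to 'T' so one alphabet covers both.
    seq = sequence.upper().replace("U", "T")
    # Enumerate every possible k-mer so all sequences share the same columns.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Windows containing ambiguous bases (e.g. 'N') are left out of the total.
    total = sum(counts[km] for km in kmers)
    return [counts[km] / total if total else 0.0 for km in kmers]


# Toy sequence, not a real FMD virus isolate.
features = kmer_features("ACGTACGTNACG", k=2)
```

Each sequence maps to a vector of length 4^k, so sequences of different lengths can be fed to the same classifier.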
Results: The random forest model was selected for the computational solution because it outperformed seven other machine-learning models when evaluated on ten test and independent datasets. The solution achieved validation accuracies of up to 99.87% on test data, and up to 98.64% and 90.24% on independent data reported from India and from its seven neighboring Asian countries, respectively. In addition, our approach successfully predicted the serotypes of field FMD virus isolates reported from various parts of India.
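The model comparison described above rests on accuracy, the fraction of isolates whose predicted serotype matches the known one. A minimal sketch of that comparison follows, assuming each candidate model's predictions are already available as lists. The model names, labels, and values are toy placeholders, not the study's data or results, and `accuracy` and `best_model` are hypothetical helper names.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true serotype labels."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def best_model(predictions_by_model, true_labels):
    """Score every model and return the top-scoring name with all scores."""
    scores = {
        name: accuracy(true_labels, preds)
        for name, preds in predictions_by_model.items()
    }
    return max(scores, key=scores.get), scores


# Toy labels and predictions for illustration only.
true_serotypes = ["O", "A", "Asia1", "O"]
toy_predictions = {
    "random_forest": ["O", "A", "Asia1", "O"],
    "other_model": ["O", "O", "Asia1", "O"],
}
winner, scores = best_model(toy_predictions, true_serotypes)
```

In practice the same scoring would be repeated over each test and independent dataset before choosing a model.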
Conclusion: High-throughput sequencing combined with machine learning offers a promising solution for FMD virus serotyping.