
Current Bioinformatics


ISSN (Print): 1574-8936
ISSN (Online): 2212-392X

Research Article

Development and Study of a Knowledge Graph for Retrieving the Relationship Between BVDV and Related Genes

Author(s): Jia Lv, Yunli Bai*, Lu Chang, Yingfei Li, Rulin Wang and Weiguang Zhou*

Volume 18, Issue 5, 2023

Published on: 12 April, 2023

Pages: 448-457 (10 pages)

DOI: 10.2174/1574893618666230224142324


Abstract

Background: Bovine viral diarrhea virus (BVDV) can cause diarrhea, abortion, and immunosuppression in cattle, imposing huge economic losses on the global cattle industry. The pathogenic and immune mechanisms of BVDV remain elusive. A BVDV-gene knowledge base could provide clues to how BVDV interacts with host cells. However, building a knowledge base manually is time-consuming and inefficient, and deep learning-based methods for building knowledge bases have recently attracted considerable attention.

Objective: The study aimed to explore replacing manual mining of BVDV-related genes with deep learning and to develop a knowledge graph of the relationships between BVDV and related genes.

Methods: A deep learning-based method for developing a biomedical knowledge graph was proposed. It uses deep learning to mine biomedical knowledge, models BVDV and various gene concepts, and stores the data in a graph database. First, a web crawler was used to collect abstracts on the relationship between BVDV and various host genes from the PubMed database. A pre-trained BioBERT model was then used for biomedical named entity recognition to obtain gene entities of all types, and a pre-trained BERT model was used for relation extraction to obtain the relationships between BVDV and the gene entities. The extracted results were then manually proofread to obtain structured triple data with high accuracy. Finally, the Neo4j graph database was used to store the data and build the knowledge graph of the relationships between BVDV and related genes.
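The final storage step can be illustrated with a minimal sketch, assuming the proofread output is a list of (head, relation, tail) triples. It is not the authors' implementation: the node labels `Virus` and `Gene`, the helper names, and the placeholder genes `GENE_A` and `GENE_B` are all hypothetical. The sketch only builds Cypher `MERGE` statements as strings, so it runs without a Neo4j server.

```python
import re
from typing import Iterable, List, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation phrase, tail entity)


def escape(value: str) -> str:
    """Escape backslashes and single quotes for a Cypher string literal."""
    return value.replace("\\", "\\\\").replace("'", "\\'")


def relation_type(relation: str) -> str:
    """Turn a relation phrase into an upper-case Cypher relationship type."""
    return re.sub(r"\W+", "_", relation.strip()).upper()


def triples_to_cypher(triples: Iterable[Triple]) -> List[str]:
    """Build one idempotent MERGE statement per triple.

    The labels Virus and Gene are assumptions made for this sketch.
    """
    statements = []
    for head, relation, tail in triples:
        statements.append(
            f"MERGE (v:Virus {{name: '{escape(head)}'}}) "
            f"MERGE (g:Gene {{name: '{escape(tail)}'}}) "
            f"MERGE (v)-[:{relation_type(relation)}]->(g)"
        )
    return statements


# Placeholder triples for illustration only; not results from the paper.
example_triples = [
    ("BVDV", "can upregulate expression of", "GENE_A"),
    ("BVDV", "can downregulate expression of", "GENE_B"),
]
cypher_statements = triples_to_cypher(example_triples)
for statement in cypher_statements:
    print(statement)
```

`MERGE` is used rather than `CREATE` so that reloading the same triple does not duplicate nodes or edges. When writing to a live database, query parameters are generally preferable to string interpolation.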

Results: A total of 71 gene entity types were obtained, including PRL4, MMP-7, and TGIF1, along with nine relation types between BVDV and gene entities, including "can downregulate expression of", "can upregulate expression of", and "can suppress expression of". Developing the knowledge graph by mining biomedical knowledge with deep learning and then proofreading manually was faster and more efficient than the traditional method of establishing a knowledge base manually, and storing the data in a graph database also made retrieval of semantic information more efficient.
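To make the retrieval point concrete, the sketch below mimics a graph lookup in memory: triples are indexed by head entity and relation, so a query follows edges directly instead of scanning every record. It is an illustration under stated assumptions, not the paper's Neo4j setup; the function names and the placeholder genes `GENE_A`, `GENE_B`, and `GENE_C` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation phrase, tail entity)
Index = Dict[str, Dict[str, Set[str]]]


def build_index(triples: Iterable[Triple]) -> Index:
    """Index triples as head -> relation -> set of tail entities."""
    index: Index = defaultdict(lambda: defaultdict(set))
    for head, relation, tail in triples:
        index[head][relation].add(tail)
    return index


def genes_for(index: Index, virus: str, relation: str) -> List[str]:
    """Return the sorted tail entities linked to `virus` by `relation`."""
    return sorted(index.get(virus, {}).get(relation, set()))


# Placeholder triples for illustration only; not results from the paper.
triples = [
    ("BVDV", "can upregulate expression of", "GENE_B"),
    ("BVDV", "can upregulate expression of", "GENE_A"),
    ("BVDV", "can suppress expression of", "GENE_C"),
]
index = build_index(triples)
upregulated = genes_for(index, "BVDV", "can upregulate expression of")
print(upregulated)
```

A graph database offers the same kind of edge-following lookup, plus persistence and multi-hop pattern queries that this in-memory dictionary does not attempt.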

Conclusion: A BVDV-gene knowledge graph was preliminarily developed, which provided a basis for studying the interaction between BVDV and host cells.


