Abstract
Background: Bovine viral diarrhea virus (BVDV) can cause diarrhea, abortion, and immunosuppression in cattle, imposing substantial economic losses on the global cattle industry. The pathogenic and immune mechanisms of BVDV remain elusive. A BVDV-gene knowledge base can provide clues to the interaction of BVDV with host cells. However, building a knowledge base manually is time-consuming and inefficient, and deep learning-based knowledge base construction has recently attracted considerable attention.
Objective: The study aimed to explore replacing manual mining of BVDV-related genes with deep learning and to develop a knowledge graph of the relationships between BVDV and related genes.
Methods: A deep learning-based method for developing a biomedical knowledge graph was proposed, which used deep learning to mine biomedical knowledge, modeled BVDV and various gene concepts, and stored the data in a graph database. First, the PubMed database was used as the data source, and crawler technology was used to obtain abstracts describing the relationships between BVDV and various host genes. A pre-trained BioBERT model was then used for biomedical named entity recognition to obtain all types of gene entities, and a pre-trained BERT model was used for relation extraction to identify the relationships between BVDV and the gene entities. The extracted results were then manually proofread to obtain structured triple data with high accuracy. Finally, the Neo4j graph database was used to store the data and to develop the knowledge graph of the relationships between BVDV and related genes.
Results: A total of 71 gene entity types were obtained, including PRL4, MMP-7, TGIF1, etc. Nine relation types between BVDV and gene entities were obtained, including "can downregulate expression of", "can upregulate expression of", "can suppress expression of", etc. Developing the knowledge graph by mining biomedical knowledge with deep learning and then proofreading manually was faster and more efficient than the traditional method of establishing a knowledge base manually, and storing the data in a graph database also made retrieval of semantic information more efficient.
Conclusion: A BVDV-gene knowledge graph was preliminarily developed, which provided a basis for studying the interaction between BVDV and host cells.
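The final storage step described in the Methods, writing proofread triples into Neo4j, might look roughly like the following minimal Python sketch. It only builds parameterized Cypher MERGE statements and does not connect to a database. The node labels `Virus` and `Gene`, the `name` property, the helper names, and the placeholder gene `GENE_X` are assumptions made for illustration; the abstract does not describe the actual schema or pair specific genes with specific relations.

```python
import re
from typing import Dict, NamedTuple, Tuple


class Triple(NamedTuple):
    """One proofread (head, relation, tail) triple."""
    head: str      # virus entity, e.g. "BVDV"
    relation: str  # relation phrase, e.g. "can downregulate expression of"
    tail: str      # gene entity


def relation_to_type(relation: str) -> str:
    """Convert a free-text relation phrase into a Cypher-safe relationship type.

    Cypher cannot take a relationship type as a query parameter, so the
    phrase is reduced to upper-case letters, digits, and underscores.
    """
    cleaned = re.sub(r"[^A-Za-z0-9]+", "_", relation).strip("_").upper()
    if not cleaned:
        raise ValueError("relation phrase has no usable characters")
    return cleaned


def triple_to_cypher(triple: Triple) -> Tuple[str, Dict[str, str]]:
    """Build an idempotent MERGE query plus its parameters for one triple.

    The labels Virus/Gene and the property 'name' are hypothetical choices.
    """
    rel_type = relation_to_type(triple.relation)
    query = (
        "MERGE (v:Virus {name: $head}) "
        "MERGE (g:Gene {name: $tail}) "
        f"MERGE (v)-[:`{rel_type}`]->(g)"
    )
    return query, {"head": triple.head, "tail": triple.tail}


if __name__ == "__main__":
    # GENE_X is a placeholder, not a result reported in the study.
    example = Triple("BVDV", "can downregulate expression of", "GENE_X")
    query, params = triple_to_cypher(example)
    print(query)
    print(params)
```

Using MERGE rather than CREATE keeps the load repeatable: running the same triple twice does not duplicate nodes or relationships. Entity names are passed as query parameters, so only the sanitized relationship type is interpolated into the query string.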