Abstract
Introduction: Membrane proteins are one of the main components of biological membranes and play an important role in living organisms. Classifying and predicting membrane protein types is an important topic in membrane proteomics research, because a protein's function can be determined quickly once its membrane protein type has been identified.
Methods: Most current methods for classifying membrane proteins are labor-intensive and resource-demanding. To predict membrane proteins on a large scale, this study used five feature extraction methods: Average Block (AvBlock), Discrete Cosine Transform (DCT), Discrete Wavelet Transform (DWT), Histogram of Oriented Gradients (HOG), and Pseudo-PSSM (PsePSSM). We then combined the five resulting feature matrices and constructed the corresponding hypergraph association matrix. Finally, the feature matrices and hypergraph association matrices were integrated in a hypergraph neural network (HGNN) model to identify the types of membrane proteins.
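The hypergraph step described above can be sketched as follows. This is a minimal NumPy sketch under stated assumptions, not the authors' implementation. The abstract does not say how hyperedges are formed, so the k-nearest-neighbour construction, the concatenation of one association matrix per descriptor, the function names (`knn_incidence`, `hgnn_propagation`, `hgnn_layer`), and the random stand-in features are all illustrative. The propagation rule is the standard HGNN hyperedge convolution.

```python
import numpy as np

def knn_incidence(X, k=3):
    """Association (incidence) matrix H of shape (n_nodes, n_edges).

    Assumption: one hyperedge per sample, joining that sample with its
    k nearest neighbours in Euclidean distance.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances between all samples.
    d = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    H = np.zeros((n, n))
    for e in range(n):
        nbrs = np.argsort(d[e])[: k + 1]  # nearest k+1 points, centroid included
        H[nbrs, e] = 1.0
        H[e, e] = 1.0  # keep the centroid in its own hyperedge even with ties
    return H

def hgnn_propagation(H, edge_w=None):
    """G = Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2} (standard HGNN operator)."""
    n_edges = H.shape[1]
    w = np.ones(n_edges) if edge_w is None else np.asarray(edge_w, dtype=float)
    dv = H @ w          # weighted vertex degrees
    de = H.sum(axis=0)  # hyperedge degrees (number of nodes per edge)
    dv_inv_sqrt = np.diag(dv ** -0.5)
    de_inv = np.diag(1.0 / de)
    return dv_inv_sqrt @ H @ np.diag(w) @ de_inv @ H.T @ dv_inv_sqrt

def hgnn_layer(X, G, Theta):
    """One hyperedge convolution layer with ReLU: max(G X Theta, 0)."""
    return np.maximum(G @ X @ Theta, 0.0)

# Random stand-ins for the five descriptor matrices (8 proteins each).
rng = np.random.default_rng(0)
feats = [rng.normal(size=(8, dim)) for dim in (4, 6, 5, 3, 7)]

# Assumption: one incidence matrix per descriptor, concatenated edge-wise.
H = np.hstack([knn_incidence(F, k=2) for F in feats])
X = np.hstack(feats)
G = hgnn_propagation(H)
Theta = rng.normal(size=(X.shape[1], 4))  # untrained, illustrative weights
out = hgnn_layer(X, G, Theta)
print(H.shape, G.shape, out.shape)
```

In a real model, `Theta` would be learned and a final softmax layer would map the node embeddings to membrane protein types. The sketch only shows how the feature matrices and the association matrix enter the propagation.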
Results: The proposed method was evaluated on four membrane protein benchmark datasets, achieving accuracies of 92.8%, 88.6%, 88.2%, and 99.0%, respectively.
Conclusion: HGNN achieved better prediction performance than traditional machine learning classifiers such as Random Forest (RF) and Support Vector Machine (SVM).