Abstract
Recent technical advances in identifying protein-protein interactions (PPIs) have generated genome-wide interaction data, collectively referred to as the interactome. These interaction data give insight into the underlying mechanisms of biological processes. However, the PPI data determined by experimental and computational methods include an extremely large number of false positives, which are not confirmed to occur in vivo. Filtering PPI data is thus a critical preprocessing step to improve analysis accuracy. In this article, we propose integrating Gene Ontology (GO) data to assess the reliability of PPIs. We evaluate the performance of various semantic similarity measures in terms of functional consistency. Protein pairs with high semantic similarity are considered highly likely to share common functions and are therefore more likely to interact. We also propose a combined semantic similarity method for predicting false positive PPIs. The experimental results show that the combined method performs better than the individual semantic similarity classifiers. The proposed classifier predicted that 58.6% of the S. cerevisiae PPIs from the BioGRID database are false positives.
Keywords: Gene ontology, protein-protein interactions, semantic similarity.
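The filtering idea summarized in the abstract can be illustrated with a minimal sketch. The abstract does not name the semantic similarity measures or the threshold used in the article, so this example assumes a simple Jaccard overlap of GO term sets as a stand-in measure. The protein names, term identifiers, and the 0.4 cutoff are all hypothetical placeholders, not values from the article.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical toy annotations: protein -> set of GO term IDs.
# "T1".."T5" are placeholders, not real GO identifiers.
ANNOTATIONS: Dict[str, FrozenSet[str]] = {
    "P1": frozenset({"T1", "T2", "T3"}),
    "P2": frozenset({"T2", "T3", "T4"}),
    "P3": frozenset({"T5"}),
}


def jaccard_similarity(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Term-overlap similarity: |a & b| / |a | b| (0.0 if both sets are empty)."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def flag_false_positives(
    ppis: List[Tuple[str, str]],
    annotations: Dict[str, FrozenSet[str]],
    threshold: float,
) -> List[Tuple[str, str]]:
    """Return the interactions whose similarity falls below the threshold."""
    flagged = []
    for p, q in ppis:
        sim = jaccard_similarity(
            annotations.get(p, frozenset()),
            annotations.get(q, frozenset()),
        )
        if sim < threshold:
            flagged.append((p, q))
    return flagged


# Assumed cutoff of 0.4, chosen only for illustration.
PPIS = [("P1", "P2"), ("P1", "P3")]
FLAGGED = flag_false_positives(PPIS, ANNOTATIONS, threshold=0.4)
print(FLAGGED)
```

Here the pair (P1, P2) shares two of four terms and is retained, while (P1, P3) shares none and is flagged. A combined classifier of the kind the abstract describes would merge several such scores rather than rely on one.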
Current Bioinformatics
Title: Predicting False Positives of Protein-Protein Interaction Data by Semantic Similarity Measures§
Volume: 8 Issue: 3
Author(s): George Montanez and Young-Rae Cho
About this article
Cite this article as:
Montanez George and Cho Young-Rae, Predicting False Positives of Protein-Protein Interaction Data by Semantic Similarity Measures§, Current Bioinformatics 2013; 8(3). https://dx.doi.org/10.2174/1574893611308030009
DOI: https://dx.doi.org/10.2174/1574893611308030009
Print ISSN: 1574-8936
Publisher Name: Bentham Science Publisher
Online ISSN: 2212-392X
