Abstract
Recent proteomics studies of clinical samples have generated substantial interest. Aided by advances in analytical chemistry and bioinformatics, clinical proteomics has become a driving force behind molecular biomarker development. However, large volumes of clinical proteomics data remain difficult to manage and interpret because of data integration challenges. The lack of practical standards for representing metadata has hindered the sharing and interpretation of mass spectrometry results obtained under different experimental conditions or in different proteomics laboratories, and has ultimately led to missed opportunities for proteomic biomarker discovery. In this paper, we describe methods for using Semantic Web technologies to design an ontology for clinical proteomics information in the Web Ontology Language (OWL) and to manage that information with systems such as the Computational Proteomics Analysis System (CPAS). We developed a practical metadata model for proteomics experiments using Semantic Web technologies and demonstrated how this model can be integrated with current proteomics data analysis software. We also demonstrated how systems that employ the metadata model can begin to enable inter-laboratory sharing and analysis of clinical proteomics data, and we discuss how these tools and techniques have aided proteomic biomarker discovery studies. Our work illustrates one approach to building a software system compliant with the Cancer Biomedical Informatics Grid (caBIG) through the use of an ontology-based metadata model. This effort is a first step in a larger initiative toward ontology-based, standards-driven integration and analysis of large-scale inter-laboratory proteomics data, with the overarching goal of discovering proteomic biomarkers.
Keywords: clinical proteomics, OWL, RDF, Semantic Web, metadata model, mass spectrometry, proteomic biomarkers, CPAS, Cancer Biomedical Informatics Grid (caBIG).