Combinatorial Chemistry & High Throughput Screening

ISSN (Print): 1386-2073
ISSN (Online): 1875-5402

Building a Biological Space Based on Protein Sequence Similarities and Biological Ontologies

Author(s): Paul Kersey, David Lonsdale, Nicky J. Mulder, Robert Petryszak and Rolf Apweiler

Volume 11, Issue 8, 2008

Page: [653 - 660] Pages: 8

DOI: 10.2174/138620708785739925

Abstract

Assignment of function to protein sequence is a task of growing importance in the life sciences, as new high-throughput DNA sequencing technologies generate ever-increasing quantities of genomic and metagenomic data. Patterns within the sequence space, caused by the evolutionary conservation and assembly of protein domains, make it possible to infer function from sequence similarity. Clustering similar sequences is a useful technique for finding conserved sequences; the CluSTr database is a publicly available resource that arranges proteins in a hierarchy structured by similarity. The protein classification tool InterProScan builds on this approach by applying a range of methods to detect proteins that contain signatures indicative of the presence of particular conserved domains. The use of ontologies to describe protein function provides a flexible and abstract language for classifying proteins. Together, these techniques can provide an understanding of the shape of the protein space, and can be used to explore the uncharted waters of the emerging metagenomic world.

Keywords: CluSTr, clustering, genomes, GO, InterPro, metagenomes, orthology, paralogy


© 2024 Bentham Science Publishers