Bolten, Eva, Schliep, Alexander, Schneckener, Sebastian, Schomburg, Dietmar and Schrader, Rainer (2001). Clustering Protein Sequences - Structure Prediction by transitive homology. Bioinformatics, 17 (10). pp. 935-941. Oxford University Press.
PDF: zaik2000-383.pdf - Draft Version (151kB)
Abstract
It is widely believed that for two proteins A and B, sequence identity above some threshold implies structural similarity. It is not fully understood whether transitivity holds: when the sequence similarity between A and B is below this threshold, does the existence of a third protein that is sufficiently similar in sequence to both A and B suffice to infer structural similarity of A and B? We examined the protein sequences in the SwissProt database, determining their similarity with the Smith-Waterman algorithm. These data were transformed into a directed graph in which protein sequences constitute the vertices. A directed edge was drawn from vertex A to vertex B if the sequences A and B showed similarity above a fixed threshold. A length-dependent scaling of the alignment scores gives us a criterion for avoiding clustering errors due to multi-domain proteins. To deal with the resulting large graphs we have developed a very efficient library. Methods include both a novel graph-based clustering algorithm capable of handling multi-domain proteins and cluster comparison algorithms. The parameters of these algorithms were fine-tuned using SCOP as a test set. We present our algorithmic advances, which yield a 24 percent improvement over pair-wise comparisons, statistics of the clusterings obtained, and general methodology relevant for testing our hypothesis.
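The graph construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' library or their clustering algorithm. The scaling rule used here (raw alignment score divided by the length of the source sequence) is an assumption chosen only to show why a length-dependent criterion yields a directed graph, and the names `build_similarity_graph` and `transitive_links`, the scores, the lengths, and the threshold are all hypothetical.

```python
from collections import defaultdict


def build_similarity_graph(scores, lengths, threshold):
    """Build a directed graph from pairwise alignment scores.

    scores:  dict mapping an unordered pair (a, b) to a raw alignment score
    lengths: dict mapping a sequence id to its length
    Assumed scaling (not taken from the paper): score / length of the source.
    """
    graph = defaultdict(set)
    for (a, b), score in scores.items():
        # Test both directions; the scaled score differs when lengths differ.
        for src, dst in ((a, b), (b, a)):
            if score / lengths[src] >= threshold:
                graph[src].add(dst)
    return dict(graph)


def transitive_links(graph):
    """Return triples (a, c, b) where a -> c and c -> b but there is no edge a -> b."""
    triples = set()
    for a, successors in graph.items():
        for c in successors:
            for b in graph.get(c, ()):
                if b != a and b not in successors:
                    triples.add((a, c, b))
    return triples


# Hypothetical data: D is a long sequence standing in for a multi-domain protein.
lengths = {"A": 100, "B": 110, "C": 120, "D": 400}
scores = {("A", "C"): 60, ("C", "B"): 66, ("A", "B"): 20, ("A", "D"): 60}

graph = build_similarity_graph(scores, lengths, threshold=0.5)
links = transitive_links(graph)
```

In this toy data A and B fall below the threshold directly but are connected through C, which is the situation the transitivity question addresses. The edge from A to D has no counterpart from D to A, because the same raw score is small relative to the length of D; this asymmetry is the kind of signal a length-dependent criterion can use to treat multi-domain proteins with care.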
Item Type: Journal Article
Creators: Bolten, Eva; Schliep, Alexander; Schneckener, Sebastian; Schomburg, Dietmar; Schrader, Rainer
URN: urn:nbn:de:hbz:38-548539
Journal or Publication Title: Bioinformatics
Volume: 17
Number: 10
Page Range: pp. 935-941
Date: 2001
Publisher: Oxford University Press
Language: English
Faculty: Faculty of Mathematics and Natural Sciences
Divisions: Faculty of Mathematics and Natural Sciences > Department of Mathematics and Computer Science > Institute of Computer Science
Subjects: Data processing Computer science
Refereed: No
URI: http://kups.ub.uni-koeln.de/id/eprint/54853