(See also: Enhanced fold recognition for proteins in Mycoplasma genitalium via 3D-PSSM)
Overview & Introduction
The recognition of remote protein homologies is a major aspect of the structural and functional annotation of newly determined genomes. Here we benchmark the coverage and error rate of genome annotation using the widely used homology-searching program PSI-BLAST (position-specific iterated basic local alignment search tool). This study evaluates the one-to-many success rate for recognition: there are often several homologues in the database, and only one needs to be identified to annotate the sequence. In contrast, previous benchmarks considered one-to-one recognition, in which a single query was required to find a particular target. The benchmark constructs a model genome from the full sequences of the Structural Classification of Proteins (SCOP) database and searches against a target library of remotely homologous domains (<20% identity). The structural benchmark provides a reliable list of correct and false homology assignments. PSI-BLAST successfully annotated 40% of the domains in the model genome that had at least one homologue in the target library. This coverage is more than twice that obtained when one-to-one recognition is evaluated (11% coverage of domains). Although a structural benchmark was used, the results apply equally to purely sequence-based homology searches. Accordingly, structural and sequence assignments were made to the sequences in the genomes of Mycoplasma genitalium and Mycobacterium tuberculosis.
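The difference between the two coverage measures can be illustrated with a minimal sketch. The function names, the toy data, and the exact definitions are assumptions made for illustration and are not taken from the benchmark: one-to-many coverage is taken here as the fraction of queries with at least one true homologue for which any true homologue is found, and one-to-one coverage as the fraction of all true query-target pairs that are found.

```python
# Illustrative sketch only: names, data, and definitions are hypothetical,
# not the benchmark's actual code or results.

def one_to_many_coverage(true_homologues, hits):
    """Fraction of annotatable queries for which at least one true homologue was hit.

    A query is 'annotatable' if it has at least one true homologue in the library.
    """
    annotatable = [q for q, targets in true_homologues.items() if targets]
    if not annotatable:
        return 0.0
    found = sum(1 for q in annotatable if true_homologues[q] & hits.get(q, set()))
    return found / len(annotatable)


def one_to_one_coverage(true_homologues, hits):
    """Fraction of all true (query, target) pairs that were hit."""
    pairs = [(q, t) for q, targets in true_homologues.items() for t in targets]
    if not pairs:
        return 0.0
    found = sum(1 for q, t in pairs if t in hits.get(q, set()))
    return found / len(pairs)


# Toy data: each query maps to the set of library domains that are true homologues.
TRUE_HOMOLOGUES = {
    "q1": {"a", "b", "c"},
    "q2": {"d", "e"},
    "q3": {"f"},
    "q4": set(),  # no homologue in the library, so not annotatable
}
# Toy search results: each query maps to the set of library domains reported as hits.
HITS = {
    "q1": {"a"},  # one of three true homologues found
    "q2": {"x"},  # only a false hit
    "q3": {"f"},
    "q4": {"y"},
}

if __name__ == "__main__":
    print("one-to-many:", one_to_many_coverage(TRUE_HOMOLOGUES, HITS))
    print("one-to-one: ", one_to_one_coverage(TRUE_HOMOLOGUES, HITS))
```

In this toy case query q1 counts as fully annotated under the one-to-many measure even though only one of its three homologues is detected, which is why one-to-many coverage is generally at least as high as one-to-one coverage.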
The web pages contain detailed structural and functional annotations for the two genomes and data files essential for the benchmarks.
Copyright © 1999-2002 Cancer Research UK
All Rights Reserved
Comments to author: email@example.com
Generated: Thu Jun 27, 2002