Protein fold recognition from secondary structure assignments
Robert B. Russell, Richard R. Copley & Geoffrey J. Barton
From Proc. 29th Ann. Hawaii Int. Conf. Sys. Sci., 5, 302-311, 1995.
Abstract
A novel method is described for finding protein tertiary folds consistent with a set of
secondary structure assignments. Given a secondary structure pattern and other restraints
for the protein or protein family, all matches within a non-redundant database of known
protein three-dimensional structural domains are found that are both structurally sensible and
consistent with any experimental information provided. All possible matches between the query pattern and every database
structure are first generated by a comparison of secondary structure strings. This comparison accounts for likely errors
in predicted secondary structure elements, and for likely variations between query and database structure, by allowing
a user-defined number of deletions of whole secondary structure elements.
These matches are then passed through a series of filters to leave only those structures which are compact,
have good beta sheet bonding, and allow the provided loop or turn lengths to bridge the distance between adjacent
secondary structures. Matches are then filtered further by user-defined restraints, based on the requirement for particular
secondary structures (e.g. those predicted strongly, or those having active site residues), and any distance
restraints known from experiments (e.g. disulphide bonds). The final list of matches provides a set of plausible
topologies for the protein of unknown 3D structure, which can be inspected visually using computer graphics, or
tested by experiment. To demonstrate the power of the method, a prediction for the src homology 2 (SH2)
domain is used to search the database. The search reveals 13 possible topologies, one of which is a portion of the
E. coli bio operon protein, which is known to adopt a structure similar to the SH2 domain. The use and
further development of the method are discussed.
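
The string comparison step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each secondary structure element is encoded as a single letter ('H' for helix, 'E' for strand), and the function name find_matches and the two deletion limits are hypothetical names chosen for illustration. A query element may be deleted (left unmatched), and a database element lying between matched elements may be skipped; database elements before the first or after the last matched element are ignored at no cost.

```python
from typing import List, Optional, Tuple

# One entry per query element: the index of the matched database element,
# or None if that query element was deleted.
Match = Tuple[Optional[int], ...]


def find_matches(query: str, target: str,
                 max_query_del: int = 0,
                 max_target_del: int = 0) -> List[Match]:
    """Enumerate in-order matches of query elements onto target elements.

    Both strings hold one letter per secondary structure element
    (assumed encoding: 'H' = helix, 'E' = strand).
    """
    results: List[Match] = []

    def extend(qi: int, ti: int, q_del: int, t_del: int,
               mapping: List[Optional[int]]) -> None:
        started = any(m is not None for m in mapping)
        if qi == len(query):
            if started:  # ignore the match in which everything is deleted
                results.append(tuple(mapping))
            return
        # Option 1: delete the current query element.
        if q_del < max_query_del:
            extend(qi + 1, ti, q_del + 1, t_del, mapping + [None])
        # Option 2: match it to a target element of the same type,
        # skipping (deleting) any target elements in between.
        for tj in range(ti, len(target)):
            skipped = (tj - ti) if started else 0  # leading elements are free
            if t_del + skipped > max_target_del:
                break
            if target[tj] == query[qi]:
                extend(qi + 1, tj + 1, q_del, t_del + skipped, mapping + [tj])

    extend(0, 0, 0, 0, [])
    # Sort with None treated as -1 so tuples remain comparable.
    return sorted(results, key=lambda m: tuple(-1 if i is None else i for i in m))


if __name__ == "__main__":
    # Strand-helix-strand query against a five-element domain.
    print(find_matches("EHE", "EEHEE"))
    print(find_matches("EHE", "EEHEE", max_target_del=1))
```

In this sketch each returned tuple is one candidate topology; the compactness, sheet-bonding, loop-length and user-restraint filters described in the abstract would then be applied to each candidate in turn.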