Protein fold recognition by mapping predicted secondary structures

Robert B. Russell*, Richard R. Copley & Geoffrey J. Barton
J. Mol. Biol., Accepted 8/2/1996.

University of Oxford
Laboratory of Molecular Biophysics
The Rex Richards Building, South Parks Road
Oxford, OX1 3QU, England
Tel: 44 1865 275368 FAX: 44 1865 510454
E-mail: gjb@bioch.ox.ac.uk

*Present Address:
Biomolecular Modelling Laboratory
Imperial Cancer Research Fund Laboratories
44 Lincoln's Inn Fields, P.O. Box 123
London, WC2A 3PX, England
E-mail: russell@icrf.icnet.uk

Abstract

A strategy is presented for protein fold recognition from secondary structure assignments (alpha-helix and beta-strand). The method can detect similarities between protein folds in the absence of sequence similarity. Secondary structure mapping first identifies all possible matches (maps) between a query string of secondary structures and the secondary structures of protein domains of known three-dimensional structure. The maps are then passed through a series of structural filters to remove those that do not obey simple rules of protein structure. The surviving maps are ranked by scores from the alignment of predicted and experimental accessibilities. Searches made with secondary structure assignments for a test set of eleven fold-families put the correct sequence-dissimilar fold in the first rank 8/11 times. With cross-validated predictions of secondary structure this drops to 4/11, which compares favourably with the widely used THREADER program (1/11). The structural class is correctly predicted 10/11 times by the method, in contrast to 5/11 for THREADER. The new technique obtains comparable accuracy in the alignment of amino acid residues and secondary structure elements. Searches are also performed with published secondary structure predictions for the von Willebrand factor type A domain, the proteasome 20S alpha subunit and the phosphotyrosine interaction domain. These searches demonstrate how the method can find the correct fold for a protein from a carefully constructed secondary structure prediction, multiple sequence alignment and distance restraints. Scans with experimentally determined secondary structures and accessibility recognise the correct fold with high alignment accuracy (86% on secondary structures). This suggests that the accuracy of mapping will improve alongside any improvements in the prediction of secondary structure or accessibility. Application to NMR structure determination is also discussed.
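
The three stages named in the abstract (enumerate maps, filter them, rank the survivors) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the filters or the scoring function, so the gap filter, the per-element accessibility score, and every function and parameter name here (`enumerate_maps`, `gap_filter`, `accessibility_score`, `rank_maps`, `max_gap`) are hypothetical stand-ins. Secondary structures are written as strings with H for helix and E for strand, and every query element is assumed to need a partner in the template.

```python
from itertools import combinations


def enumerate_maps(query, template):
    """Yield every order-preserving assignment of query elements to
    template elements of the same type (H = helix, E = strand).

    Each map is a tuple of template indices, one per query element.
    Template elements may be skipped; query elements may not (assumption).
    """
    for idx in combinations(range(len(template)), len(query)):
        if all(query[i] == template[j] for i, j in enumerate(idx)):
            yield idx


def gap_filter(idx, max_gap=2):
    """Hypothetical structural filter: reject a map that skips more than
    max_gap template elements between two consecutive matched elements."""
    return all(b - a - 1 <= max_gap for a, b in zip(idx, idx[1:]))


def accessibility_score(idx, query_acc, template_acc):
    """Hypothetical score: negative summed absolute difference between the
    predicted (query) and experimental (template) mean accessibility of
    each matched element pair. Higher is better."""
    return -sum(abs(query_acc[i] - template_acc[j]) for i, j in enumerate(idx))


def rank_maps(query, template, query_acc, template_acc, max_gap=2):
    """Enumerate maps, drop those failing the filter, sort best first."""
    survivors = [m for m in enumerate_maps(query, template)
                 if gap_filter(m, max_gap)]
    return sorted(
        survivors,
        key=lambda m: accessibility_score(m, query_acc, template_acc),
        reverse=True,
    )


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    query, template = "HEH", "HEEHH"
    query_acc = [0.2, 0.5, 0.3]
    template_acc = [0.2, 0.1, 0.5, 0.9, 0.3]
    for m in rank_maps(query, template, query_acc, template_acc):
        print(m, round(accessibility_score(m, query_acc, template_acc), 2))
```

Exhaustive enumeration is practical here only because secondary structure strings are short (tens of elements, not hundreds of residues); a real search would apply its filters during enumeration rather than afterwards to prune the combinatorial growth.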