Fold recognition methods and links

Some links for methods of FOLD recognition:

Even with no homologue of known 3D structure, it may be possible to find a suitable fold for your protein among known 3D structures by way of fold recognition methods.

3D structural similarities

Ab initio prediction of protein 3D structures is not possible at present, and a general solution to the protein folding problem is not likely to be found in the near future. However, it has long been recognised that proteins often adopt similar folds despite no significant sequence or functional similarity and that nature is apparently restricted to a limited number of protein folds.

There are numerous protein structure classifications now available via the WWW:

Thus for many proteins (~70%) there will be a suitable structure in the database from which to build a 3D model. Unfortunately, the lack of sequence similarity will mean that many of these go undetected until after 3D structure determination.

The goal of fold recognition

Methods of protein fold recognition attempt to detect similarities between protein 3D structures that are not accompanied by any significant sequence similarity. There are many approaches, but the unifying theme is to try to find folds that are compatible with a particular sequence. Unlike sequence-only comparison, these methods take advantage of the extra information made available by 3D structure. In effect, they turn the protein folding problem on its head: rather than predicting how a sequence will fold, they predict how well a fold will fit a sequence.
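As a rough illustration of "how well a fold will fit a sequence", the toy sketch below scores one sequence against each fold in a small library and ranks the folds. Everything in it is an assumption made for illustration: the buried/exposed environment strings, the hydrophobic residue set, the +1/-1 scoring and the names `fit_score` and `rank_folds` are hypothetical and are not taken from any of the methods discussed here. Real methods use far richer structural information and allow gaps in the alignment.

```python
# Toy illustration only: the fold library, environment strings and scoring
# scheme are hypothetical and not taken from any published method.
HYDROPHOBIC = set("AVLIMFWC")  # assumed, simplified hydrophobic residue set


def position_score(residue, environment):
    """+1 if the residue suits the environment ('B' buried, 'E' exposed), else -1."""
    buried_ok = residue in HYDROPHOBIC and environment == "B"
    exposed_ok = residue not in HYDROPHOBIC and environment == "E"
    return 1 if (buried_ok or exposed_ok) else -1


def fit_score(sequence, fold_environments):
    """Best ungapped score of the sequence placed anywhere along the fold."""
    n, m = len(sequence), len(fold_environments)
    if n > m:
        return float("-inf")  # the sequence does not fit on this fold at all
    return max(
        sum(
            position_score(res, env)
            for res, env in zip(sequence, fold_environments[start:start + n])
        )
        for start in range(m - n + 1)
    )


def rank_folds(sequence, fold_library):
    """Return fold names sorted from best to worst fit."""
    return sorted(
        fold_library,
        key=lambda name: fit_score(sequence, fold_library[name]),
        reverse=True,
    )


# Hypothetical two-fold library described by per-position environments.
fold_library = {
    "fold_a": "BEBEBEBE",
    "fold_b": "EEBBEEBB",
}
ranking = rank_folds("LKVELKAE", fold_library)
print(ranking)
```

The output is a ranked list of folds for one sequence, which is the form of result that the assessments described below are based on.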

Some papers on the subject:

The realities of fold recognition

Despite initially promising results, methods of fold recognition are not always accurate. Guides to the accuracy of protein fold recognition can be found in the proceedings of the Critical Assessment of Structure Predictions (CASP) conferences. At the first meeting in 1994 (CASP1) the methods were found to be about 50 % accurate at best with respect to their ability to place a correct fold at the top of a ranked list. Though many methods failed to detect the correct fold at the top of a ranked list, a correct fold was often found in the top 10 scoring folds. Even when the methods were successful, alignments of sequence on to protein 3D structure were usually incorrect, meaning that comparative modelling performed using such models would be inaccurate.

The CASP2 meeting, held in December 1996, showed that many of the methods had improved, though it is difficult to compare the results of the two assessments (i.e. CASP1 & CASP2) since very different criteria were used to assess correct answers. It would be foolish and over-ambitious for me to present a detailed assessment of the results here. However, an important thing to note is that Murzin & Bateman managed to attain near 100% success by the use of careful human insight, a knowledge of known structures, secondary structure predictions and thoughts about the function of the target sequences. Their results strongly support the arguments given below that human insight can be a powerful aid during fold recognition. A summary of the results from this meeting can be found in the PROTEINS issue dedicated to the meeting (PROTEINS, Suppl 1, 1997).

The CASP3 meeting was held in December 1998. It showed some progress in the ability of fold recognition methods to detect correct protein folds and in the quality of alignments obtained. A detailed summary of the results will appear towards the end of 1999 in the PROTEINS supplement.

For my talk, I did a crude assessment of 5 methods of fold recognition. I took 12 proteins of known structure (3 from each folding class) and ran each of the five methods using default parameters. I then asked how often a correct fold (not allowing trivial, sequence-detectable folds) was found in the first rank, or in the top 10 scoring folds. I also asked how often the method found the correct folding class in the first rank. The results are summarised here in a PostScript file.
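The "first rank" and "top 10" questions can be stated precisely as a top-k hit rate. The sketch below is a minimal illustration under stated assumptions: the function name `top_k_hit_rate`, the target names, the fold names and the rankings are all invented placeholders, not the data or results from the talk.

```python
def top_k_hit_rate(ranked_predictions, correct_folds, k):
    """Fraction of targets with at least one correct fold in the top k ranks.

    ranked_predictions: target name -> list of fold names, best score first
    correct_folds:      target name -> set of fold names counted as correct
    """
    hits = sum(
        1
        for target, ranking in ranked_predictions.items()
        if any(fold in correct_folds[target] for fold in ranking[:k])
    )
    return hits / len(ranked_predictions)


# Hypothetical output of one method on four made-up targets.
ranked_predictions = {
    "target1": ["foldA", "foldB", "foldC"],
    "target2": ["foldC", "foldA", "foldB"],
    "target3": ["foldB", "foldC", "foldD"],
    "target4": ["foldD", "foldB", "foldA"],
}
correct_folds = {
    "target1": {"foldA"},
    "target2": {"foldA"},
    "target3": {"foldA"},
    "target4": {"foldD"},
}

first_rank = top_k_hit_rate(ranked_predictions, correct_folds, k=1)
top_three = top_k_hit_rate(ranked_predictions, correct_folds, k=3)
print(first_rank, top_three)
```

The same function can be used for the folding class question by replacing fold names with class labels in both dictionaries.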

Perhaps the worst result from this study is shown below:

One method suggested that the sequence for the Probe (left) (a four helix bundle) would best fit onto the structure shown on the right (an OB fold, comprising a six stranded barrel).

The results suggest that one should use caution when using these methods. In spite of this, the methods remain very useful.

A practical approach:

To use the methods, I would suggest the following:

Fold recognition slides from my talk:

Slide 16 Slide 17 Slide 18 Slide 19 Slide 20 Slide 21 Slide 22 Slide 23 Slide 24 Slide 25 Slide 26 Slide 27
