Rules were learnt to recognise members of a fold from
negative examples of the same class. The experiment is further refined
with integrity constraints, which ensure that every rule
considered contains at least one of the following predicates: unit_len, unit_aveh, unit_hmom, coil or has_pro. Judged
by our knowledge of protein structure, this adds complexity and richness to the rules. Knowing that the dataset was
constructed with twice as many negative examples as positives, one
could devise a rule that rejects every example, fold(X,_) :- fail, and achieves an overall accuracy of about 66% (two thirds of the cases). Here,
accuracy is defined as the number of true positives plus the number of true
negatives, divided by the total number of cases. The cross-validated overall
accuracy in our test is 74.75%, which is significantly better than this baseline (t-test
at the 99.0% confidence level); see Table 1. The overall
accuracy is slightly higher for folds of the all-α and all-β
classes; these proteins are in general smaller and less
complex than those of the α/β and α+β classes.
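The baseline argument above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' evaluation code: the function names and the example count of 30 positives are hypothetical, and the only fact taken from the text is the 2:1 ratio of negative to positive examples.

```python
def accuracy(true_pos, true_neg, total):
    """Overall accuracy: (true positives + true negatives) / total cases."""
    return (true_pos + true_neg) / total


def reject_all_accuracy(n_pos, n_neg):
    """Accuracy of a rule that rejects every example.

    Such a rule yields no true positives and classifies every
    negative example correctly.
    """
    return accuracy(0, n_neg, n_pos + n_neg)


# Hypothetical counts: any dataset with twice as many negatives as
# positives gives the same ratio.
n_pos = 30
n_neg = 2 * n_pos
baseline = reject_all_accuracy(n_pos, n_neg)
print(f"reject-all baseline: {100 * baseline:.2f}%")
```

The result does not depend on the chosen count, only on the ratio of negatives to positives, which is why the trivial rule is a natural reference point for the cross-validated accuracy.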
| Super | Fam | Dom | Acc (%) | ± Err | Fold |
|---|---|---|---|---|---|
| All-α | | | | | |
| 139 | 210 | 111 | | | other folds (92) |
| 4 | 17 | 30 | 81.92 | 3.15 | DNA-binding 3-helical bundle |
| 2 | 7 | 14 | 68.48 | 5.10 | EF Hand-like |
| 1 | 2 | 13 | 94.56 | 2.54 | Globin-like |
| 1 | 3 | 10 | 73.13 | 5.67 | 4-helical cytokines |
| 1 | 3 | 10 | 63.37 | 5.95 | [fold name lost in extraction] |
| | | | 76.29 | 10.99 | average |
| All-β | | | | | |
| 123 | 220 | 90 | | | other folds (56) |
| 8 | 12 | 45 | 71.07 | 2.85 | Immunoglobulin-like beta-sandwich |
| 1 | 4 | 21 | 81.47 | 3.58 | Trypsin-like serine proteases |
| 4 | 11 | 20 | 76.92 | 3.99 | OB-fold |
| 6 | 7 | 16 | 76.53 | 4.52 | SH3-like barrel |
| 1 | 2 | 14 | 78.50 | 3.97 | Lipocalins |
| | | | 76.90 | 3.39 | average |
| α/β | | | | | |
| 131 | 200 | 88 | | | other folds (70) |
| 17 | 28 | 55 | 66.14 | 2.61 | [fold name lost in extraction] |
| 1 | 7 | 21 | 78.47 | 3.69 | NAD(P)-binding Rossmann-fold domains |
| 1 | 4 | 14 | 81.21 | 4.10 | P-loop containing nucleotide triphosphate hydrolases |
| 1 | 2 | 13 | 62.94 | 5.82 | Periplasmic binding protein-like II |
| 1 | 10 | 12 | 75.08 | 4.80 | [fold name lost in extraction] |
| | | | 72.77 | 7.07 | average |
| α+β | | | | | |
| 158 | 240 | 113 | | | other folds (96) |
| 17 | 21 | 26 | 80.38 | 3.40 | Ferredoxin-like |
| 2 | 8 | 13 | 56.30 | 5.71 | Zincin-like |
| 1 | 1 | 13 | 79.38 | 4.39 | SH2-like |
| 6 | 6 | 12 | 63.56 | 5.79 | beta-Grasp |
| 1 | 1 | 9 | 85.63 | 4.42 | Interleukin 8-like chemokines |
| | | | 73.05 | 11.16 | average |
| | | | 74.75 | 8.95 | overall |
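The text does not say how the "average" and "overall" rows of the table are derived. As a hedged sketch, the script below assumes they are the mean and the population standard deviation of the per-fold accuracies. That is an inference from the numbers, not a statement by the authors. The class labels used as dictionary keys are likewise inferred from the fold names (for example Globin-like, Ferredoxin-like), and the variable and function names are hypothetical.

```python
from statistics import mean, pstdev

# Per-fold accuracies (%) copied from the table, grouped by structural class.
# Class labels are inferred from the fold names, not stated in the text.
fold_accuracy = {
    "all-alpha": [81.92, 68.48, 94.56, 73.13, 63.37],
    "all-beta": [71.07, 81.47, 76.92, 76.53, 78.50],
    "alpha/beta": [66.14, 78.47, 81.21, 62.94, 75.08],
    "alpha+beta": [80.38, 56.30, 79.38, 63.56, 85.63],
}


def summarise(values):
    """Return (mean, population standard deviation) of the accuracies."""
    return mean(values), pstdev(values)


# One summary per class, plus one over all twenty folds.
class_summary = {name: summarise(vals) for name, vals in fold_accuracy.items()}
overall_summary = summarise(
    [value for vals in fold_accuracy.values() for value in vals]
)

for name, (avg, spread) in class_summary.items():
    print(f"{name:10s} {avg:6.2f} +/- {spread:5.2f}")
print(f"{'overall':10s} {overall_summary[0]:6.2f} +/- {overall_summary[1]:5.2f}")
```

Under this assumption the computed values agree with the tabulated average and overall rows to two decimal places. The per-fold Err values cannot be recomputed this way, since they depend on cross-validation results that the table does not list.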