DOMAIN ASSIGNMENT FROM THE LITERATURE


The publications describing the structures were obtained from the citations in the Brookhaven entry. For pre-release entries, the papers were identified by a literature search or from the series Macromolecular Structure (Hendrickson & Wüthrich, 1991; 1992; 1993). The proteins and their domain assignments are listed in Table 1.
We followed Richardson's (1981) concept of a domain in deciding the extent to which a structure should be subdivided. Occasionally, a long link joined two obviously compact domains; in such cases we identified the end of the first domain and the start of the second, omitting the linking region. In addition, the N- and C-termini of a chain sometimes carried short extensions that were not part of any of the component domains. Where we considered that a chain should be split into domains, we generally followed any explicit assignment of residues to domains by the authors, except for these linking regions and terminal extensions. However, because different groups of workers make inconsistent decisions as to whether to split a protein into domains, we did not slavishly follow the authors' decision on chain dissection. In the absence of guidelines from the authors, our assignment was based on the description of the structure, especially the definition of β-sheets, together with inspection on the graphics using the display program PREPI developed by Islam & Sternberg (unpublished). This procedure is inevitably subjective but, being performed by one team on all the chains, is likely to be reasonably consistent. The list of domain assignments will be referred to as the authors' assignment and is identified in Table 1 as Da.
The percentage of α-helices (including 3₁₀-helices) and β-structure (based on β-ladders but excluding β-bridges) was obtained for each chain with all main-chain atoms by our implementation (S.A. Islam, unpublished) of the Kabsch & Sander algorithm (Kabsch & Sander, 1983).
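The percentage calculation described above can be illustrated with a minimal sketch. It is not the unpublished implementation referred to in the text. It assumes that the secondary structure of a chain is already available as a string of per-residue states in the single-letter convention of the published DSSP program (H for α-helix, G for 3₁₀-helix, E for a strand in a β-ladder, B for an isolated β-bridge), and that the percentage is taken over all residues in the string. The function name and the example string are hypothetical.

```python
from collections import Counter

# Assumed DSSP-style one-letter states (not taken from the authors' program):
# H = alpha-helix, G = 3-10 helix, E = strand in a beta-ladder,
# B = isolated beta-bridge (deliberately not counted as beta-structure).
HELIX_CODES = {"H", "G"}
BETA_CODES = {"E"}


def secondary_structure_percentages(ss_string):
    """Return (percent_helix, percent_beta) for one chain.

    ss_string holds one state character per residue of the chain.
    """
    if not ss_string:
        raise ValueError("empty secondary-structure string")
    counts = Counter(ss_string)
    n_residues = len(ss_string)
    n_helix = sum(counts[code] for code in HELIX_CODES)
    n_beta = sum(counts[code] for code in BETA_CODES)
    return 100.0 * n_helix / n_residues, 100.0 * n_beta / n_residues


# Hypothetical 20-residue chain: 6 H, 2 G, 5 E, 1 B, 6 other states.
example = "HHHHHHGGEEEEEB--TT--"
helix_pct, beta_pct = secondary_structure_percentages(example)
print(helix_pct, beta_pct)  # 40.0 25.0
```

In this sketch the single B residue contributes to the chain length but not to the β-structure count, which mirrors the exclusion of β-bridges stated in the text.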
The structural family of a domain is often classified as α/α, β/β, α/β, α+β or coil (Levitt & Chothia, 1976). However, there is no automatic approach to obtaining this classification, and accordingly we assigned a structural class by inspection to each domain of the multi-domain chains. Figure 1 illustrates the processing of the 284 chains into their domain assignments.