Links for protein structure prediction

Multiple Sequence Alignment

Regardless of the outcome of your searches, you will want a multiple sequence alignment containing your sequence and all the homologues you have found above.
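Most alignment programs take their input as a single file holding all the sequences, commonly in FASTA format. As a minimal sketch of assembling that input, the function below formats your sequence and its homologues as FASTA text. The function name `to_fasta`, the record names, and the sequences are all made up for illustration; check which input formats your chosen alignment program accepts.

```python
def to_fasta(records, width=60):
    """Format (name, sequence) pairs as FASTA text, wrapping sequence lines."""
    lines = []
    for name, seq in records:
        lines.append(f">{name}")
        # Wrap long sequences so each line holds at most `width` residues.
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
    return "\n".join(lines) + "\n"


# Hypothetical query and homologues; names and sequences are invented.
query = ("my_protein", "MKTAYIAKQRQISFVKSHFSRQ")
homologues = [
    ("homologue_1", "MKTAYLAKQRQISFVKAHFSRQ"),
    ("homologue_2", "MRTAYIAKQKQLSFVKSHFARQ"),
]

fasta_text = to_fasta([query] + homologues)
print(fasta_text)
```

Putting your own sequence first is only a convenience: it makes the query easy to find when you inspect the finished alignment.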

Some sites for performing multiple alignment:

If you are going to do a lot of alignments, it is probably best to get your own copy of one of the many available programs. FTP sites for some of these are:

HMMer (HMM method, Wash U)
SAM (HMM method, Santa Cruz)
ClustalW (EBI, UK)
ClustalW (USA)
MSA (USA)
AMPS (UK)

Note that PileUp is contained within the GCG commercial package. Most institutions with people doing this sort of work will have access to this software, so ask around if you want to use it.

Probably the most important advance since these pages first appeared is the use of Hidden Markov Models (HMMs) for sequence alignment. Several HMM-based methods are listed above.

Alignments can provide:

Information about protein domain structure
The location of residues likely to be involved in protein function
Information about which residues are likely to be buried in the protein core or exposed to solvent
More information than a single sequence for applications like homology modelling and secondary structure prediction
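As an illustration of how an alignment can point to functionally important residues, the sketch below scores each column by the fraction of sequences that share its most common residue. This simple majority score is an assumption chosen for brevity, not a method described on this page, and the toy alignment is invented. A fully conserved column is only a hint that a position matters; it still needs to be judged in context.

```python
from collections import Counter


def column_conservation(aligned_seqs, gap="-"):
    """Return, per column, the fraction of sequences sharing the top residue."""
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")

    scores = []
    for column in zip(*aligned_seqs):
        residues = [c for c in column if c != gap]
        if not residues:
            # A column made only of gaps carries no conservation signal.
            scores.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        # Divide by the number of sequences, so gaps lower the score.
        scores.append(top_count / len(column))
    return scores


# Toy alignment, invented for illustration.
alignment = [
    "MKT-YIAK",
    "MKTAYLAK",
    "MRTAYIAK",
]

scores = column_conservation(alignment)
conserved = [i for i, s in enumerate(scores) if s == 1.0]
print(conserved)  # 0-based indices of fully conserved columns
```

Dividing by the total number of sequences, rather than by the number of non-gap residues, is a deliberate choice here: a column that is mostly gaps should not look perfectly conserved.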

Some tips

Don't just take everything found in the searches and feed it directly into the alignment program. Searches will almost always return some matches that do not reflect significant sequence similarity. Look through the output carefully and discard any sequences that do not appear to be members of the sequence family. Including non-members in your alignment will confuse things and likely lead to errors later.
Remember that the programs for aligning sequences aren't perfect, and do not always provide the best alignment. This is particularly so for large families of proteins with low sequence identities. If you can see a better way of aligning the sequences, then by all means edit the alignment manually.
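The first tip above, pruning search hits before alignment, could be sketched as follows. The hit records, the field name `evalue`, and the cutoff of 1e-3 are all assumptions for illustration; real search output will need parsing into this shape, and a numeric cutoff is only a first pass that does not replace looking through the output yourself.

```python
def split_hits(hits, max_evalue=1e-3):
    """Separate hits into likely family members and ones needing a closer look."""
    keep, review = [], []
    for hit in hits:
        if hit["evalue"] <= max_evalue:
            keep.append(hit)
        else:
            # Weak matches are set aside for manual inspection, not silently lost.
            review.append(hit)
    return keep, review


# Hypothetical search hits; names and E-values are invented.
hits = [
    {"name": "homologue_1", "evalue": 1e-40},
    {"name": "homologue_2", "evalue": 2e-8},
    {"name": "weak_match", "evalue": 0.5},
]

keep, review = split_hits(hits)
print([h["name"] for h in keep])
print([h["name"] for h in review])
```

Keeping the rejected hits in a separate list, instead of dropping them, makes it easy to revisit borderline cases if the alignment later suggests they belong to the family after all.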

Next: secondary structure prediction.
