Protein sequence data

Some initial analysis of your protein sequence is worthwhile before going any further. If a protein has come directly from a gene prediction, for example, it may consist of multiple domains. More seriously, it may contain regions that are unlikely to be globular or soluble. This flowchart assumes that your protein is soluble, likely comprises a single domain, and does not contain non-globular regions.

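Things to consider are whether your sequence is likely to contain more than one domain, and whether any part of it is unlikely to be globular or soluble.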

If the answer to any of the above questions is yes, it is worth breaking your sequence into pieces or ignoring particular sections of it. This is related to the problem of locating domains.
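As a rough illustration of what "ignoring particular sections" can look like in practice, here is a minimal Python sketch that flags compositionally biased stretches with a sliding-window Shannon entropy and masks them with X. It is not the method this flowchart prescribes: the function names, the window length of 12 and the entropy threshold of 2.2 bits are all assumptions chosen for the example, and a dedicated low-complexity or transmembrane predictor should be preferred for real work.

```python
import math
from collections import Counter


def window_entropy(window: str) -> float:
    """Shannon entropy (in bits) of the residue composition of a window."""
    counts = Counter(window)
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def low_complexity_regions(seq: str, window: int = 12, threshold: float = 2.2):
    """Return merged half-open (start, end) spans covering every window
    whose entropy falls below the threshold.

    The window length and threshold are illustrative assumptions,
    not calibrated values.
    """
    spans = []
    for i in range(len(seq) - window + 1):
        if window_entropy(seq[i:i + window]) < threshold:
            start, end = i, i + window
            if spans and start <= spans[-1][1]:
                # Overlaps or touches the previous span: extend it.
                spans[-1] = (spans[-1][0], max(spans[-1][1], end))
            else:
                spans.append((start, end))
    return spans


def mask(seq: str, spans, char: str = "X") -> str:
    """Replace every residue inside the given spans with a mask character."""
    residues = list(seq)
    for start, end in spans:
        residues[start:end] = char * (end - start)
    return "".join(residues)


if __name__ == "__main__":
    # Hypothetical sequence: two diverse stretches around a poly-Q run.
    example = "ACDEFGHIKLMNPQRSTVWY" + "Q" * 15 + "ACDEFGHIKLMNPQRSTVWY"
    regions = low_complexity_regions(example)
    print(regions)
    print(mask(example, regions))
```

The spans returned can be used either to mask the sequence, as above, or as cut points for splitting it into pieces that are analysed separately.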
