Division of Molecular Biosciences
Department of Life Sciences
Faculty of Natural Sciences

Documentation and Tutorials

PPI data file format:

Tab-delimited file containing two columns:
ID1 ID2
ID3 ID2
...
where ID1, ID2, ID3 are the strings identifying the proteins in the species.
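
A quick format check can catch malformed lines before a PPI file is used. The following is only a minimal sketch and not part of PINALOG: the file name example.ppi and the function name check_ppi are hypothetical, and the IDs are the placeholders used above.

```shell
# Hypothetical sample PPI file (tab-delimited, two columns); IDs are placeholders.
printf 'ID1\tID2\nID3\tID2\n' > example.ppi

# Report every line that does not have exactly two non-empty tab-separated fields.
# Exit status is 0 when the file is well-formed, 1 otherwise.
check_ppi() {
    awk -F'\t' '
        NF != 2 || $1 == "" || $2 == "" {
            printf "line %d: expected 2 tab-separated IDs\n", NR
            bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}

check_ppi example.ppi && echo "example.ppi looks well-formed"
```

A file that separates the IDs with spaces instead of tabs would be reported line by line, since each such line counts as a single field.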



BLAST data file:

This file contains the all-against-all BLAST results for the proteins in the input species. It includes the BLAST results between proteins within each species as well as between proteins of different species. The self-BLAST score of each protein is required so that the sequence similarity score can be normalized accurately.
The format of the BLAST data file is as below:


ID1 ID2 BLAST-score


where ID1, ID2 are protein IDs of the input species, and BLAST-score is the score obtained from the BLAST sequence alignment result.


An example of the BLAST data file:
....
A0MZ67 A0MZ67 1062.0
A0MZ67 Q8VDD5 61.62
A0MZ67 Q8VDD5 50.83
A0MZ67 Q8VDD5 42.74
A0MZ67 Q62209 55.84
A0MZ67 Q62209 49.68
A0MZ67 Q62209 43.90
....
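
Because the self-BLAST score of every protein is required, it may be worth checking that each ID in the file has a row where ID1 equals ID2. The sketch below is an illustration only: the file name sample.blast_score and the function missing_self_hits are hypothetical, and the sample deliberately contains just three rows taken from the excerpt above, so the two proteins without a self-hit in that sample are reported.

```shell
# Reduced sample in the ID1 ID2 BLAST-score layout, tab-delimited (as cut produces it).
printf 'A0MZ67\tA0MZ67\t1062.0\nA0MZ67\tQ8VDD5\t61.62\nA0MZ67\tQ62209\t55.84\n' > sample.blast_score

# Print every protein ID that appears in the file but has no self-hit row (ID1 == ID2).
missing_self_hits() {
    awk -F'\t' '
        { seen[$1] = 1; seen[$2] = 1 }      # remember every ID in either column
        $1 == $2 { self[$1] = 1 }           # remember IDs that have a self-hit
        END { for (id in seen) if (!(id in self)) print id }
    ' "$1" | sort
}

missing_self_hits sample.blast_score
```

An empty result means every protein in the file has a self-hit row.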

The all-against-all BLAST results of the proteins can be obtained using blastall.
To do this, you need a file containing the FASTA sequences of all the proteins in the input networks. You first need to create your own BLAST database, which can be done using formatdb.

  • >formatdb -i proteins.fasta -n mydb
  • >blastall -p blastp -i proteins.fasta -m 8 -v 100000000 -d mydb -e 10e-5 >proteins.blast
  • >cut -f1,2,12 proteins.blast >proteins.blast_score

"proteins.blast_score" is the file containing the all-against-all BLAST scores required to run PINALOG.
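
To see what the cut step keeps, recall that the -m 8 tabular output has 12 tab-separated columns, of which column 1 is the query ID, column 2 the subject ID and column 12 the bit score. The sketch below applies the same cut command to a single line; the IDs P1 and P2, all values on that line and the file names are made up purely for illustration.

```shell
# One made-up line in the blastall -m 8 tabular layout (12 tab-separated columns):
# query, subject, %identity, alignment length, mismatches, gap opens,
# q.start, q.end, s.start, s.end, e-value, bit score
printf 'P1\tP2\t35.00\t100\t60\t5\t1\t100\t1\t100\t1e-10\t61.6\n' > sample_m8.blast

# Keep query ID, subject ID and bit score, as in the cut step above.
cut -f1,2,12 sample_m8.blast > sample_m8.blast_score
cat sample_m8.blast_score
```

The result is a three-column, tab-delimited line in the ID1 ID2 BLAST-score layout described earlier.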

GO annotation file format:

The annotation file follows the GO Consortium annotation file format.