HOW CAN ONE SEARCH THE DATABASE BY REPEAT UNIT SEQUENCE?

To search by repeat unit sequence, enter the sequence in the box and choose a species from the dropdown menu. The search returns a list of all the SNPSTRs containing microsatellites with that repeat unit sequence.
For example, to retrieve all rat SNPSTRs that contain microsatellites with the repeat unit sequence GTTTTT, enter GTTTTT in the sequence box and select rat from the species dropdown menu.



The lists of SNPSTRs generated are likely to be very long, and clicking the search button displays them as one large table on a single page. It is therefore advisable to download the data rather than view them as an HTML page.

Note also that the repeat unit sequence shown in the results will sometimes differ from the one entered in the search box. This is because, when the database was created, SNPSTRs were grouped into classes to ease the analysis of the datasets. These classes are most easily explained by an example.
There are 64 possible sequences of three nucleotides, but the database contains only 10 classes of triSNPSTRs (SNPSTRs containing microsatellites whose repeat unit sequence is three nucleotides long). This is because an ATC microsatellite is treated here as equivalent to a TCA one and a CTA one, as well as to their reverse sequences (TAG, AGT, GAT). So when the downloaded sequences were searched, SNPSTRs whose microsatellites have the repeat unit sequence ATC, TCA, CTA, TAG, AGT or GAT were all stored in the database as ATC-repeat SNPSTRs.
When the user submits a sequence, all variations of that sequence are searched, i.e. if one inputs the sequence TAG, the database will be searched for all six variations (ATC, TCA, CTA, TAG, AGT and GAT). Classifying microsatellites in this way is common practice (e.g. Toth et al. (2000)).
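
The counts quoted above (64 possible trinucleotides, 10 classes) can be checked with a short Python sketch. It is only an illustration and not the code used to build the database: it assumes the commonly used convention in which a repeat unit, its circular shifts and the circular shifts of its reverse complement all fall into one class, and it leaves out units that are just repeats of a shorter unit (e.g. AAA). The function names are made up for this example, and the class members it reports under this assumption may not match the grouping stored in the database in every detail.

```python
from itertools import product

# Translation table for complementing a DNA sequence.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def rotations(seq):
    """Return the set of all circular shifts of seq."""
    return {seq[i:] + seq[:i] for i in range(len(seq))}


def motif_class(seq):
    """Circular shifts of seq plus circular shifts of its reverse complement
    (assumed convention, see the text above)."""
    seq = seq.upper()
    return rotations(seq) | rotations(reverse_complement(seq))


def class_label(seq):
    """Pick the alphabetically first member as the label of the class."""
    return min(motif_class(seq))


def count_classes(length):
    """Count the classes of repeat units of the given length."""
    labels = set()
    for letters in product("ACGT", repeat=length):
        motif = "".join(letters)
        # Skip units that are repeats of a shorter unit (AAA, ACAC, ...):
        # such units have fewer distinct circular shifts than letters.
        if len(rotations(motif)) != length:
            continue
        labels.add(class_label(motif))
    return len(labels)


if __name__ == "__main__":
    print(4 ** 3)            # number of possible trinucleotide sequences
    print(count_classes(3))  # number of trinucleotide classes
    print(class_label("TCA"))
```

Under this convention every trinucleotide class has six members, which is why the 60 trinucleotides that are not runs of a single base collapse into 10 classes.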

If you wish to retrieve SNPSTRs by repeat unit sequence length, lists of SNPSTRs per species can be downloaded from the FTP site.


WHY IS THERE A SEARCH AND A DOWNLOAD BUTTON?

By default the results are displayed as HTML, which works well when one is looking for a small number of SNPSTRs. However, when more than a few SNPSTRs are being examined, the HTML format is not very useful. In addition, one might want the information in a format that can be easily manipulated; for example, one may want to feed the data into a Perl script for further analysis, something that HTML output would not allow.
For this reason, the database interface provides the option of downloading the data rather than viewing them. The user can choose whether the file should be tab-delimited or comma-delimited. Clicking the download button opens a window that asks whether to open the file or save it to the computer. The files contain all the information shown in the HTML page, with the exception of the multiple gene accession ids (only the Ensembl gene ids are provided).
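
As an illustration of why a delimited file is easier to work with than HTML, here is a minimal Python sketch that reads such a file and filters it by repeat unit. It rests on several assumptions: that the downloaded file starts with a header row, and that the columns carry the names used below (snpstr_id, species, repeat_unit, ensembl_gene_id). Those names and the sample rows are invented placeholders, so the real column names in a downloaded file should be substituted.

```python
import csv
import io


def read_snpstr_table(text, delimiter="\t"):
    """Parse delimited text into a list of dicts keyed by the header row.

    Use delimiter="\t" for a tab-delimited download and "," for a
    comma-delimited one.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return list(reader)


def filter_by_repeat_unit(rows, repeat_unit, column="repeat_unit"):
    """Keep the rows whose repeat unit column matches (column name is hypothetical)."""
    return [row for row in rows if row[column] == repeat_unit]


# Invented placeholder data standing in for a downloaded file.
SAMPLE_TSV = (
    "snpstr_id\tspecies\trepeat_unit\tensembl_gene_id\n"
    "EXAMPLE_1\trat\tGTTTTT\tEXAMPLE_GENE_1\n"
    "EXAMPLE_2\trat\tATC\t\n"
)

if __name__ == "__main__":
    rows = read_snpstr_table(SAMPLE_TSV)
    for row in filter_by_repeat_unit(rows, "GTTTTT"):
        print(row["snpstr_id"], row["ensembl_gene_id"])
```

To read a saved download instead of the sample string, the contents of the file can be passed in, e.g. read_snpstr_table(open("snpstrs.txt").read()), where snpstrs.txt is a hypothetical file name.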