BLAST Help: main search parameters

PROGRAM
You can use BLAST to search for similarity in either nucleotide or protein sequences.
  1. blastn: nucleotide to nucleotide search
    Search your DNA sequence against a nucleotide sequence database
  2. tblastn: protein to nucleotide search
    Search your amino-acid sequence against a nucleotide sequence database. Each database sequence is translated in all six reading frames before it is compared to the query sequence.
  3. blastx: nucleotide to protein search
    Search your DNA sequence against a protein sequence database (only available for our genomes annotated with predicted genes)
  4. blastp: protein to protein search
    Search your amino-acid sequence against a protein sequence database (only available for our genomes annotated with predicted genes)
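
The four programs above are determined by two things: the type of the query sequence and the type of the database. A minimal Python sketch of that lookup follows; the function name `choose_program` and the string labels "nucleotide" and "protein" are illustrative assumptions, not part of the BLAST server.

```python
# Hypothetical lookup table: (query type, database type) -> BLAST program.
PROGRAMS = {
    ("nucleotide", "nucleotide"): "blastn",
    ("protein", "nucleotide"): "tblastn",
    ("nucleotide", "protein"): "blastx",
    ("protein", "protein"): "blastp",
}


def choose_program(query_type, database_type):
    """Return the program name for a query/database type combination."""
    try:
        return PROGRAMS[(query_type, database_type)]
    except KeyError:
        raise ValueError(
            f"unsupported combination: {query_type!r} query, {database_type!r} database"
        ) from None


print(choose_program("protein", "nucleotide"))
```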

EXPECT (E value)
The statistical significance threshold for reporting matches against database sequences. The default value is 10, meaning that 10 matches are expected to be found merely by chance, according to the stochastic model of Karlin and Altschul (1990). If the E value ascribed to a match is greater than the EXPECT threshold, the match will not be reported. Lower EXPECT thresholds are more stringent, leading to fewer chance matches being reported. Fractional values are acceptable. (See parameter E in the BLAST Manual.)
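
In the Karlin-Altschul model, the expected number of chance matches scoring at least S is E = K * m * n * exp(-lambda * S), where m is the query length and n is the database length. The sketch below applies that formula and the reporting rule described above. The function names are hypothetical, and the values used for lambda and K are illustrative placeholders; the real values depend on the scoring system the server uses.

```python
import math


def expected_chance_matches(score, query_len, db_len, lam, k):
    """Karlin-Altschul estimate: E = K * m * n * exp(-lambda * S)."""
    return k * query_len * db_len * math.exp(-lam * score)


def is_reported(e_value, expect=10.0):
    """A match is reported only if its E value does not exceed EXPECT."""
    return e_value <= expect


# Placeholder parameters for illustration only, not the server's values.
e = expected_chance_matches(score=60, query_len=250, db_len=5_000_000, lam=0.3, k=0.1)
print(e, is_reported(e), is_reported(e, expect=1e-6))
```

Lowering `expect` makes `is_reported` reject more matches, which is the sense in which lower EXPECT thresholds are more stringent.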

CUTOFF
Cutoff score for reporting high-scoring segment pairs. The default value is calculated from the EXPECT value (see above). HSPs are reported for a database sequence only if the statistical significance ascribed to them is at least as high as would be ascribed to a lone HSP having a score equal to the CUTOFF value. Higher CUTOFF values are more stringent, leading to fewer chance matches being reported. (See parameter S in the BLAST Manual). Typically, significance thresholds can be more intuitively managed using EXPECT.
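
One way to see the link between EXPECT and CUTOFF is to invert the Karlin-Altschul formula E = K * m * n * exp(-lambda * S) for a lone HSP, which gives S = ln(K * m * n / E) / lambda. The sketch below does only that; it is an assumption about the general relationship, not a description of how this server computes its default, and the parameter values are illustrative placeholders.

```python
import math


def cutoff_from_expect(expect, query_len, db_len, lam, k):
    """Score S at which a lone HSP would have E value equal to `expect`."""
    if expect <= 0:
        raise ValueError("expect must be positive")
    return math.log(k * query_len * db_len / expect) / lam


# Placeholder parameters for illustration only.
for expect in (10.0, 1.0, 0.001):
    s = cutoff_from_expect(expect, query_len=250, db_len=5_000_000, lam=0.3, k=0.1)
    print(expect, round(s, 1))
```

The loop shows the direction of the relationship: a smaller EXPECT corresponds to a higher cutoff score.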

MATRIX
Specify an alternate scoring matrix for searches that compare amino-acid sequences, such as TBLASTN, where the translated database sequences are scored against your protein query. The default matrix is BLOSUM62 (Henikoff & Henikoff, 1992). The matrix parameter is ignored for BLASTN nucleotide searches.

FILTER
Mask off segments of the query sequence that have low compositional complexity, as determined by the SEG program of Wootton & Federhen (Computers and Chemistry, 1993), or segments consisting of short-periodicity internal repeats, as determined by the XNU program of Claverie & States (Computers and Chemistry, 1993). For BLASTN, masking is done by the DUST program of Tatusov and Lipman (in preparation). Filtering can eliminate statistically significant but biologically uninteresting reports from the BLAST output (e.g., hits against common acidic-, basic- or proline-rich regions), leaving the more biologically interesting regions of the query sequence available for specific matching against database sequences.

Low-complexity sequence found by a filter program is replaced with the letter "N" in nucleotide sequences (e.g., "NNNNNNNNNNNNN") and the letter "X" in protein sequences (e.g., "XXXXXXXXX"). Users may turn off filtering by using the "Filter" option on the "Advanced options for the BLAST server" page.

Filtering is only applied to the query sequence (or its translation products), not to database sequences. Default filtering is DUST for BLASTN, SEG for other programs.

It is not unusual for nothing at all to be masked by SEG, XNU, or both, when applied to sequences in SWISS-PROT, so filtering should not be expected to always yield an effect. Furthermore, in some cases, sequences are masked in their entirety, indicating that the statistical significance of any matches reported against the unfiltered query sequence should be suspect.
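
The substitution step described above can be sketched in a few lines of Python. This does not implement SEG, XNU, or DUST; it assumes a filter program has already produced a list of regions to mask. The function name `mask_regions` and the 0-based, end-exclusive `(start, end)` region format are hypothetical.

```python
def mask_regions(sequence, regions, is_protein=False):
    """Replace each (start, end) region with 'X' (protein) or 'N' (nucleotide)."""
    mask_char = "X" if is_protein else "N"
    chars = list(sequence)
    for start, end in regions:
        if not 0 <= start <= end <= len(chars):
            raise ValueError(f"region out of range: {(start, end)}")
        # Overwrite the region in place; the sequence length is unchanged.
        chars[start:end] = mask_char * (end - start)
    return "".join(chars)


print(mask_regions("ACGT" + "A" * 8 + "CGT", [(4, 12)]))
print(mask_regions("MK" + "P" * 6 + "LV", [(2, 8)], is_protein=True))
```

Passing an empty region list returns the query unchanged, which corresponds to the case where nothing is masked; a single region covering the whole query corresponds to the case where the sequence is masked in its entirety.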

ALIGNMENTS FORMAT
Gapped alignment allows for gaps in the regions of sequence similarity.

DESCRIPTIONS
Restricts the number of short descriptions of matching sequences reported to the number specified; the default limit is 100 descriptions. (See parameter V in the BLAST Manual.) See also EXPECT and CUTOFF.

ALIGNMENTS
Restricts the number of database sequences for which high-scoring segment pairs (HSPs) are reported to the number specified; the default limit is 50. If more database sequences than this happen to satisfy the statistical significance threshold for reporting (see EXPECT and CUTOFF above), only the matches ascribed the greatest statistical significance are reported. (See parameter B in the BLAST Manual.)
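
The way the DESCRIPTIONS and ALIGNMENTS limits interact with EXPECT can be sketched as follows. This is an illustration under stated assumptions, not the server's code: hits are modeled as hypothetical `(sequence_id, e_value)` pairs, and "greatest statistical significance" is taken to mean the lowest E value.

```python
def select_reported(hits, expect=10.0, max_descriptions=100, max_alignments=50):
    """Return (descriptions, alignments) after applying EXPECT and both limits."""
    # Keep hits that pass the EXPECT threshold, most significant first.
    significant = sorted((h for h in hits if h[1] <= expect), key=lambda h: h[1])
    return significant[:max_descriptions], significant[:max_alignments]


hits = [("a", 5.0), ("b", 0.001), ("c", 20.0), ("d", 1.0)]
descriptions, alignments = select_reported(
    hits, expect=10.0, max_descriptions=3, max_alignments=2
)
print(descriptions)
print(alignments)
```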

For further information about BLAST, refer to the documentation at NCBI.

 



