How BLAST works

1. BLAST first breaks the query sequence into short regions of a fixed length (W) called "words" (or substrings). Using a substitution matrix, it then searches the sequences in the database ("target sequences") for words that score at least a threshold (T) when compared with a word from the query.

2. For every pair of sequences (query and target) that share one or more such word hits, BLAST extends the alignment in both directions from each hit to find alignments whose score is greater than a certain threshold (S), that is, alignments that are sufficiently similar. These alignments are called high-scoring segment pairs, or HSPs; the highest-scoring segment pair for a given query and target is called the maximal segment pair, or MSP.
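
The two steps above can be sketched in a few lines of Python. This is only a toy illustration, not the actual BLAST implementation: it assumes a DNA alphabet, a made-up scoring scheme (+5 for a match, -4 for a mismatch) in place of a real substitution matrix, ungapped extension only, and an "X-drop" cutoff (x_drop) that stops an extension once the running score falls too far below the best score seen. The function names (neighborhood, extend, find_hsps) and the default parameter values are hypothetical, chosen for this example.

```python
from itertools import product

ALPHABET = "ACGT"
MATCH, MISMATCH = 5, -4  # toy scores (assumption), not a real substitution matrix


def score(a, b):
    """Score two equal-length strings position by position."""
    return sum(MATCH if x == y else MISMATCH for x, y in zip(a, b))


def neighborhood(query, w, t):
    """Step 1: map every word that scores >= t against a query word
    to the query offsets of the words it matches."""
    words = {}
    for i in range(len(query) - w + 1):
        q_word = query[i:i + w]
        for cand in map("".join, product(ALPHABET, repeat=w)):
            if score(q_word, cand) >= t:
                words.setdefault(cand, []).append(i)
    return words


def extend(query, target, qi, ti, w, x_drop):
    """Step 2: ungapped extension of one word hit in both directions.
    Returns (score, query_start, target_start, length)."""
    best = cur = score(query[qi:qi + w], target[ti:ti + w])
    right = k = 0
    while qi + w + k < len(query) and ti + w + k < len(target):
        cur += MATCH if query[qi + w + k] == target[ti + w + k] else MISMATCH
        k += 1
        if cur > best:
            best, right = cur, k
        elif best - cur > x_drop:  # score dropped too far: stop extending
            break
    cur = best  # continue leftwards from the best right-hand endpoint
    left = k = 0
    while qi - k - 1 >= 0 and ti - k - 1 >= 0:
        cur += MATCH if query[qi - k - 1] == target[ti - k - 1] else MISMATCH
        k += 1
        if cur > best:
            best, left = cur, k
        elif best - cur > x_drop:
            break
    return best, qi - left, ti - left, w + left + right


def find_hsps(query, target, w=3, t=15, s=25, x_drop=10):
    """Return all extended hits scoring greater than s, best first."""
    words = neighborhood(query, w, t)
    hsps = set()  # different seeds often extend to the same HSP
    for ti in range(len(target) - w + 1):
        for qi in words.get(target[ti:ti + w], []):
            hsp = extend(query, target, qi, ti, w, x_drop)
            if hsp[0] > s:
                hsps.add(hsp)
    return sorted(hsps, reverse=True)


print(find_hsps("GATTACA", "CCGATTACACC"))  # [(35, 0, 2, 7)]
```

In this example every word hit lies on the same diagonal, so all seeds extend to the same segment pair: 7 matching positions starting at query offset 0 and target offset 2, with score 7 x 5 = 35. The first entry of the returned list plays the role of the MSP for this query and target. Lowering t below 15 in this scoring scheme admits words with one mismatch as seeds, which is the toy analogue of scoring neighbouring words with a substitution matrix.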

Note: this is a much simplified explanation. For a more detailed explanation of the statistical algorithms, see Saccone & Pesole (2003).