Analysis
Sequence analysis
Pairwise Alignment:
- Dot Matrix
- Dynamic Programming
- Heuristic Methods:
    - FASTA
    - BLAST
Pairwise Sequence Alignment
[Figure: % identity scale (100, 75, 50, 60, 25, 20), labelled BLOSUM]
1. Matrix construction.
2. Matrix filling.
3. Back tracing.
4. Alignment plotting.
The two example sequences taken are:
5’ GAATTCAGTTA 3’
5’ CCATCGG 3’
Matrix construction and filling:
The matrix is constructed as shown in the slides that follow. The
number of rows and columns depends on the number of bases in the
two sequences to be aligned, with one extra row and one extra
column for the initial scores. Once the matrix is constructed, it is
filled according to the scoring rules illustrated below.
For example, the score of cell S2,2 is calculated as follows:
S1,1 = 0, S1,2 = -2, S2,1 = -2 and W(G,C) = -1
The maximum of the three neighbouring scores is 0, so S2,2 = 0 + (-1) = -1.
This score is assigned to the cell, and an arrow is drawn to the cell from
which the maximum was obtained (here the diagonal cell). If several
neighbouring cells tie for the maximum, an arrow is drawn to each of them.
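The arithmetic of this worked example can be sketched in a few lines of Python. This is only an illustration of the rule as the slide states it (largest of the three neighbouring scores, plus W); the names `cell_score` and `GAP_STEP` are hypothetical, and the border step of -2 is read off the first row of the matrix. Note that the textbook Needleman-Wunsch recurrence adds the gap penalty to the vertical and horizontal neighbours before taking the maximum, which this sketch does not do.

```python
def cell_score(diag, up, left, w):
    """Score of one cell under the rule of the worked example:
    take the largest of the three neighbouring scores and add W."""
    return max(diag, up, left) + w


GAP_STEP = -2              # assumption: step between border cells, as on the slide
seq_top = "GAATTCAGTTA"    # bases along the columns (j)
seq_side = "CCATCGG"       # bases along the rows (i)

# Border cells: one extra row and column in front of the bases.
first_row = [GAP_STEP * j for j in range(len(seq_top) + 1)]
first_col = [GAP_STEP * i for i in range(len(seq_side) + 1)]

# Cell S2,2 compares G (top) with C (side), so W(G,C) = -1.
s22 = cell_score(diag=first_row[0], up=first_row[1], left=first_col[1], w=-1)

print(first_row)
print(s22)
```

Here `s22` reproduces the value computed by hand above from S1,1, S1,2 and S2,1.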
i = row, j = col
i\j      1    2    3    4    5    6    7    8    9   10   11   12
              G    A    A    T    T    C    A    G    T    T    A
1        0   -2   -4   -6   -8  -10  -12  -14  -16  -18  -20  -22
2  C    -2   -1
3  C    -4
4  A    -6
5  T    -8
6  C   -10
7  G   -12
8  G   -14
Similarly, every remaining cell is assigned a score and its arrows are placed.
Once the entire matrix is filled, back tracing is performed.
CAUTION:
A gap must be inserted only when the arrow followed from a score is
horizontal or vertical, never when it is diagonal. A diagonal arrow
represents an aligned pair of bases, which can be either a match or a
mismatch.
Check carefully where each gap must be inserted (the sequence towards
which the arrow points should get the gap).
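The gap rule above can be illustrated with a small Python sketch. It is a hedged example, not the procedure from the slides: the function name `apply_arrows` and the arrow codes `D`, `L` and `U` are hypothetical, and it assumes that a horizontal arrow places the gap in the side (row) sequence while a vertical arrow places it in the top (column) sequence. The toy sequences are chosen only to keep the example short.

```python
def apply_arrows(top, side, arrows):
    """Build two aligned strings from arrows listed from the
    bottom-right cell back toward the top-left cell.

    'D' diagonal:   pair one base from each sequence (match or mismatch)
    'L' horizontal: base from the top sequence, gap in the side sequence
    'U' vertical:   base from the side sequence, gap in the top sequence
    """
    i, j = len(side), len(top)
    out_top, out_side = [], []
    for arrow in arrows:
        if arrow == "D":
            i -= 1
            j -= 1
            out_top.append(top[j])
            out_side.append(side[i])
        elif arrow == "L":
            j -= 1
            out_top.append(top[j])
            out_side.append("-")
        elif arrow == "U":
            i -= 1
            out_top.append("-")
            out_side.append(side[i])
        else:
            raise ValueError(f"unknown arrow: {arrow!r}")
    # The arrows were followed backwards, so reverse before joining.
    return "".join(reversed(out_top)), "".join(reversed(out_side))


# Toy example: only the horizontal arrow introduces a gap.
aligned_top, aligned_side = apply_arrows("GAT", "GT", "DLD")
print(aligned_top)
print(aligned_side)
```

Each diagonal arrow consumes one base from both sequences, so only the `L` and `U` branches ever write a gap character.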
Back tracing the matrix on the previous slide gives a total of four
OPTIMAL GLOBAL ALIGNMENTS. These are:
GA A T T C A GT T A
C C ATCG G
GA A T T C A GT T A
C C ATC GG
G AA T T C A GT T A
C C ATCG G
G AA T T C A GT T A
C C ATC GG
The final filled matrix for local alignment of the example sequences
is shown below. On the slide, the optimal alignment trace is marked
in red.
i = row, j = col
i\j      1    2    3    4    5    6    7    8    9   10   11   12
              G    A    A    T    T    C    A    G    T    T    A
1        0    0    0    0    0    0    0    0    0    0    0    0
2  C     0    0    0    0    0    0    2    1    0    0    0    0
3  C     0    0    0    0    0    0    4    3    2    1    0    0
4  A     0    0    2    4    3    2    3    6    5    4    3    5
5  T     0    0    1    3    6    8    7    6    5    7    9    8
6  C     0    0    0    2    5    7   10    9    8    7    8    8
7  G     0    2    1    1    4    6    9    9   11   10    9    8
8  G     0    4    3    2    3    5    8    8   13   12   11   10
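The values in this matrix can be reproduced with a short Python sketch. The fill rule and scores used here are inferred from the printed numbers, not stated on the slide: each cell takes the largest of its three neighbours plus W, negative results are reset to 0, and W is assumed to be +2 for a match and -1 for a mismatch. The function name `fill_local` is hypothetical. This is not the textbook Smith-Waterman recurrence, which also charges a gap penalty on vertical and horizontal moves.

```python
def fill_local(top, side, match=2, mismatch=-1):
    """Fill a local-alignment matrix with the rule inferred from the slide:
    S[i][j] = max(0, max(diagonal, up, left) + W)."""
    rows, cols = len(side) + 1, len(top) + 1
    s = [[0] * cols for _ in range(rows)]  # first row and column stay 0
    for i in range(1, rows):
        for j in range(1, cols):
            w = match if side[i - 1] == top[j - 1] else mismatch
            best = max(s[i - 1][j - 1], s[i - 1][j], s[i][j - 1])
            s[i][j] = max(0, best + w)
    return s


matrix = fill_local("GAATTCAGTTA", "CCATCGG")
for row in matrix:
    print(" ".join(f"{v:>3}" for v in row))

# Back tracing for a local alignment starts from the highest score.
best_score = max(max(row) for row in matrix)
print(best_score)
```

Under these assumed scores, the printed rows can be compared directly with the matrix above, and `best_score` marks the cell from which back tracing starts.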
Back tracing the matrix on the previous slide gives a total of two
OPTIMAL LOCAL ALIGNMENTS. These are:
A A T T C A G
A TC G G
A A T T C A G
A TC GG
Distance