Alignment: - PowerPoint PPT Presentation

Slides: 9
Provided by: jonas168
Learn more at: http://cobweb.cs.uga.edu

Transcript and Presenter's Notes


1
Alignment: Global, local, repeated and overlapping
Jan 21 2004
2
  1. Global Alignment by the Needleman-Wunsch
    algorithm

From scoring matrix S
Linear gap penalty d
[Figure: matrix F with cells F(i,j) for i = 0, ..., n and j = 0, ..., m. Boundary values: F(0,0) = 0; F(i,0) = -d, -2d, ..., -nd; F(0,j) = -d, -2d, ..., -md. The total score is F(n,m), where trace back starts.]
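
The fill and trace back for global alignment can be sketched as follows. This is a minimal illustration, not the presenter's code: the function name, the simple match/mismatch scores standing in for the scoring matrix S, and the default gap penalty d = 2 are all assumptions made for the example.

```python
def needleman_wunsch(x, y, match=1, mismatch=-1, d=2):
    """Global alignment with a linear gap penalty d (illustrative sketch)."""
    n, m = len(x), len(y)
    # Hypothetical stand-in for the scoring matrix S
    s = lambda a, b: match if a == b else mismatch

    # F[i][j] = best score for aligning x[:i] with y[:j]
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = -i * d          # boundary: -d, -2d, ..., -nd
    for j in range(1, m + 1):
        F[0][j] = -j * d          # boundary: -d, -2d, ..., -md

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(F[i - 1][j - 1] + s(x[i - 1], y[j - 1]),  # pair x_i with y_j
                          F[i - 1][j] - d,                          # x_i against a gap
                          F[i][j - 1] - d)                          # y_j against a gap

    # Trace back from F(n,m) to F(0,0)
    ax, ay = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + s(x[i - 1], y[j - 1]):
            ax.append(x[i - 1]); ay.append(y[j - 1]); i -= 1; j -= 1
        elif i > 0 and F[i][j] == F[i - 1][j] - d:
            ax.append(x[i - 1]); ay.append("-"); i -= 1
        else:
            ax.append("-"); ay.append(y[j - 1]); j -= 1
    return F[n][m], "".join(reversed(ax)), "".join(reversed(ay))


print(needleman_wunsch("ACGT", "ACT"))  # (1, 'ACGT', 'AC-T')
```

When several alignments tie for the best score, this sketch returns only one of them (it prefers the diagonal move during trace back).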
3
2. Local alignment by the Smith-Waterman algorithm
Breaks off uninformative pairwise combinations: a cell is reset to 0 whenever its score would drop below 0
[Figure: matrix F with cells F(i,j) for i = 0, ..., n and j = 0, ..., m. The 0th row and the 0th column are all 0. The total score is the max over all F(i,j), where trace back starts.]
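
A minimal sketch of the local variant, under the same assumptions as any toy example: the function name, the match/mismatch scores and the gap penalty d = 2 are hypothetical choices, not values from the slides. It differs from global alignment in three places: the boundary is 0, every cell is floored at 0, and trace back runs from the maximum cell until a 0 is reached.

```python
def smith_waterman(x, y, match=1, mismatch=-1, d=2):
    """Local alignment with a linear gap penalty d (illustrative sketch)."""
    n, m = len(x), len(y)
    # Hypothetical stand-in for the scoring matrix S
    s = lambda a, b: match if a == b else mismatch

    # The 0th row and 0th column stay at 0
    F = [[0] * (m + 1) for _ in range(n + 1)]
    best, bi, bj = 0, 0, 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(0,                                        # start a new alignment
                          F[i - 1][j - 1] + s(x[i - 1], y[j - 1]),
                          F[i - 1][j] - d,
                          F[i][j - 1] - d)
            if F[i][j] > best:
                best, bi, bj = F[i][j], i, j

    # Trace back from the maximum cell until a cell with score 0 is reached
    ax, ay = [], []
    i, j = bi, bj
    while i > 0 and j > 0 and F[i][j] > 0:
        if F[i][j] == F[i - 1][j - 1] + s(x[i - 1], y[j - 1]):
            ax.append(x[i - 1]); ay.append(y[j - 1]); i -= 1; j -= 1
        elif F[i][j] == F[i - 1][j] - d:
            ax.append(x[i - 1]); ay.append("-"); i -= 1
        else:
            ax.append("-"); ay.append(y[j - 1]); j -= 1
    return best, "".join(reversed(ax)), "".join(reversed(ay))


print(smith_waterman("XXACGTXX", "YYACGTYY"))  # (4, 'ACGT', 'ACGT')
```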
4
3. Repeated matches
  • could be used to search for a repeated domain or
    motif in a protein
  • care is required in implementation of this
    algorithm as it is asymmetric in the sense that
    • y represents the sequence containing the domain
    • x represents the sequence in which we look for
      repeated matches

5
3. Repeated matches
  • Traceback from F(n+1,0) records the best
    alignment
  • The global alignment contains matched and
    unmatched regions
  • Only matches above a threshold, T, are recorded
  • Scores are computed relative to this threshold
  • Changing T will thus affect the outcome of the
    algorithm
  • If T is too large, some matches will be missed
  • If T is too small, match regions may be split and
    too many weak matches may be found
  • A variation of this algorithm is the
    Waterman-Eggert algorithm (1987)
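
The effect of the threshold T can be sketched with a score-only version of the repeated-match fill. This assumes the usual textbook recurrence, in which the first index runs over the sequence being searched and the top-row cell F(i,0) carries the running total up to F(n+1,0). The function and parameter names (`searched`, `motif`), the match/mismatch scores and d = 2 are hypothetical, T is assumed positive, and trace back is omitted for brevity.

```python
def repeated_matches(searched, motif, T, match=1, mismatch=-1, d=2):
    """Total score F(n+1,0) of the best set of repeated matches (sketch)."""
    n, m = len(searched), len(motif)
    # Hypothetical stand-in for the scoring matrix S
    s = lambda a, b: match if a == b else mismatch

    # One extra position (n + 1) so that F[n+1][0] can hold the total score
    F = [[0] * (m + 1) for _ in range(n + 2)]
    for i in range(1, n + 2):
        # Top cell: stay unmatched, or close a match that ended at position
        # i - 1, which only counts by the amount it scores above T
        F[i][0] = max([F[i - 1][0]] + [v - T for v in F[i - 1][1:]])
        if i == n + 1:
            break
        for j in range(1, m + 1):
            F[i][j] = max(F[i][0],                                  # start a new match
                          F[i - 1][j - 1] + s(searched[i - 1], motif[j - 1]),
                          F[i - 1][j] - d,
                          F[i][j - 1] - d)
    return F[n + 1][0]


# Two copies of the motif, each scoring 4; with T = 2 each contributes 4 - 2
print(repeated_matches("XXACGTXXACGTXX", "ACGT", T=2))  # 4
# With T = 5 no match scores above the threshold, so nothing is recorded
print(repeated_matches("XXACGTXXACGTXX", "ACGT", T=5))  # 0
```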

6
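3. Repeated matches
[Figure: matrix F for repeated matches. The 0th row holds F(0,0) = 0, F(1,0), F(2,0), ..., F(n,0), F(n+1,0). Each matched region is a local alignment scoring above the threshold value T. The total score is F(n+1,0), where trace back starts.]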
7
4. Overlap matches
  • global alignment strategy which does not penalize
    overlapping ends
  • The algorithm is the same as for global alignment,
    except for the initialization (the 0th row and
    column are set to 0) and a slightly altered
    traceback
  • Traceback
    • starts from the maximum recorded element in the
      mth row and the nth column, i.e.,
      max{F(n,0), F(n,1), ..., F(n,m), F(n-1,m),
      F(n-2,m), ..., F(0,m)}
    • ends when the 0th row or column is reached
    • Hitting the 0th row corresponds to the end of
      sequence x
    • Hitting the 0th column corresponds to the end of
      sequence y
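
The two changes relative to global alignment (zero initialization, and a trace back that starts on the far border) can be sketched as below. The function name, the match/mismatch scores and d = 2 are assumptions for illustration. When several border cells tie for the maximum, this sketch simply takes the one with the largest indices.

```python
def overlap_alignment(x, y, match=1, mismatch=-1, d=2):
    """Overlap alignment: overhanging ends are not penalized (sketch)."""
    n, m = len(x), len(y)
    # Hypothetical stand-in for the scoring matrix S
    s = lambda a, b: match if a == b else mismatch

    # F(i,0) = F(0,j) = 0, so leading overhangs cost nothing
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(F[i - 1][j - 1] + s(x[i - 1], y[j - 1]),
                          F[i - 1][j] - d,
                          F[i][j - 1] - d)

    # Start from the maximum over F(n,0..m) and F(0..n,m)
    border = [(F[n][j], n, j) for j in range(m + 1)]
    border += [(F[i][m], i, m) for i in range(n + 1)]
    best, i, j = max(border)

    # Trace back until the 0th row or 0th column is reached
    ax, ay = [], []
    while i > 0 and j > 0:
        if F[i][j] == F[i - 1][j - 1] + s(x[i - 1], y[j - 1]):
            ax.append(x[i - 1]); ay.append(y[j - 1]); i -= 1; j -= 1
        elif F[i][j] == F[i - 1][j] - d:
            ax.append(x[i - 1]); ay.append("-"); i -= 1
        else:
            ax.append("-"); ay.append(y[j - 1]); j -= 1
    return best, "".join(reversed(ax)), "".join(reversed(ay))


# The end of x overlaps the start of y
print(overlap_alignment("XXXACGT", "ACGTYYY"))  # (4, 'ACGT', 'ACGT')
```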

8
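4. Overlap matches
[Figure: matrix F for overlap matches. The 0th row F(0,0), F(1,0), F(2,0), ..., F(n,0) and the 0th column are initialized to 0. The total score is the max over the last row and last column, where trace back starts. Figure labels: (Local alignment), (Global alignment).]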