Multiple Sequence Alignment - PowerPoint PPT Presentation

About This Presentation
Title:

Multiple Sequence Alignment

Description:

Multiple sequence alignment can detect similarities that pairwise sequence alignment methods cannot detect, and it supports the detection of family characteristics.

Slides: 14
Provided by: bioinfo37
Transcript and Presenter's Notes

1
Multiple Sequence Alignment
  • Multiple sequence alignment can detect similarities
    that pairwise sequence alignment methods cannot
    detect.
  • Detection of family characteristics.
  • Three questions:
  • 1. Scoring.
  • 2. Computation of the multiple sequence alignment.
  • 3. Family representation.

2
Multiple Sequence Alignment
3
(No Transcript)
4
Example of MSA (Multiple Sequence Alignment)
5
Scoring: SP (sum of pairs)
SP: the sum of the pairwise scores of all pairs of
symbols in the column.
Here, we will assume that $\sigma(-,-) = 0$.
$\sigma_3(-,A,A) = \sigma(-,A) + \sigma(-,A) + \sigma(A,A)$
SP total score $= \sum_i \sigma(\text{column } i)$
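The sum-of-pairs rule above can be sketched in a few lines of Python. The pair score used here (0 for identical symbols, including gap against gap, and 1 otherwise) is an assumption chosen for illustration; the slides do not fix a particular scoring matrix, and the function names are hypothetical.

```python
from itertools import combinations


def pair_score(a, b):
    """Assumed distance-style score: 0 for identical symbols (so (-,-) = 0), else 1."""
    return 0 if a == b else 1


def sp_column(column):
    """Sum of pairwise scores over all pairs of symbols in one column."""
    return sum(pair_score(a, b) for a, b in combinations(column, 2))


def sp_total(alignment):
    """SP total score: sum of the column scores over all columns."""
    # zip(*alignment) turns equal-length rows into columns.
    return sum(sp_column(col) for col in zip(*alignment))


print(sp_column(("-", "A", "A")))        # (-,A) + (-,A) + (A,A)
print(sp_total(["AC--BC", "DCA-BC", "DCAABC"]))
```

Any other pair score can be swapped in by replacing `pair_score`, as long as it returns 0 for a gap aligned with a gap.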
6
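Induced pairwise alignment
Induced pairwise alignment: the projection of a
multiple alignment onto two of its sequences.
Induced alignments for three sequences: $a(S_1, S_2)$, $a(S_2, S_3)$, $a(S_1, S_3)$
SP total score $= \sum_{i<j} \mathrm{score}(a(S_i, S_j))$
Score of $(-,-)$: 0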
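As an illustration of the projection, the sketch below extracts the induced pairwise alignment of two rows and sums the induced scores over all pairs. It assumes the same illustrative pair score as a unit-cost distance (0 for identical symbols, 1 otherwise); the function names are hypothetical.

```python
from itertools import combinations


def pair_score(a, b):
    """Assumed distance-style score: 0 for identical symbols (so (-,-) = 0), else 1."""
    return 0 if a == b else 1


def induced_pair(alignment, i, j):
    """Project the multiple alignment onto rows i and j, dropping gap-gap columns."""
    cols = [(a, b) for a, b in zip(alignment[i], alignment[j]) if (a, b) != ("-", "-")]
    return "".join(a for a, _ in cols), "".join(b for _, b in cols)


def pairwise_score(row_a, row_b):
    """Score of one pairwise alignment given as two equal-length rows."""
    return sum(pair_score(a, b) for a, b in zip(row_a, row_b))


def sp_total_by_pairs(alignment):
    """SP total score computed as the sum over all induced pairwise alignments."""
    pairs = combinations(range(len(alignment)), 2)
    return sum(pairwise_score(*induced_pair(alignment, i, j)) for i, j in pairs)


msa = ["AC--BC", "DCA-BC", "DCAABC"]
print(induced_pair(msa, 0, 1))   # the gap-gap column disappears
print(sp_total_by_pairs(msa))
```

Because a gap aligned with a gap scores 0, dropping those columns does not change the total, so the pair-by-pair sum agrees with the column-by-column sum.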
7
Dyn.Prog. Solution
8
Dynamic Programming Solution
  • The best multiple alignment of r sequences is
    calculated using an r-dimensional hyper-cube.
  • The size of the hyper-cube is $O(\prod_i n_i)$.
  • Time complexity: $O(2^r n^r) \times O(\text{computation of the column score function})$.
  • The exact problem is NP-hard (for the sum-of-pairs
    or evolutionary-tree metrics).
  • A more efficient solution is needed.
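A minimal sketch of the hyper-cube recurrence, assuming a unit-cost distance score that is minimized (the slides do not fix the scoring function, and `optimal_sp_score` is a hypothetical name). Each cell of the r-dimensional table looks at up to 2^r - 1 predecessor cells, which is where the exponential factor comes from. The recursion is only practical for a handful of short sequences.

```python
from functools import lru_cache
from itertools import combinations, product


def pair_score(a, b):
    """Assumed distance-style score: 0 for identical symbols (so (-,-) = 0), else 1."""
    return 0 if a == b else 1


def optimal_sp_score(seqs):
    """Minimum SP distance over all multiple alignments of seqs (exponential in len(seqs))."""
    r = len(seqs)
    # The 2**r - 1 non-zero moves: move[i] == 1 means sequence i contributes
    # its next character to the column, 0 means it contributes a gap.
    moves = [m for m in product((0, 1), repeat=r) if any(m)]

    @lru_cache(maxsize=None)
    def best(cell):
        # cell[i] is the number of characters of seqs[i] aligned so far.
        if not any(cell):
            return 0
        candidates = []
        for move in moves:
            prev = tuple(c - m for c, m in zip(cell, move))
            if min(prev) < 0:
                continue  # this move would step outside the hyper-cube
            column = [seqs[i][cell[i] - 1] if move[i] else "-" for i in range(r)]
            cost = sum(pair_score(a, b) for a, b in combinations(column, 2))
            candidates.append(best(prev) + cost)
        return min(candidates)

    return best(tuple(len(s) for s in seqs))


print(optimal_sp_score(["ACBC", "DCABC", "DCAABC"]))
```

The memoized table has one entry per cell of the hyper-cube, so memory as well as time grows with the product of the sequence lengths.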

9
Multiple Alignment from Pairwise Alignments?
  • Problem:
  • The best pairwise alignments do not necessarily
    lead to the best multiple alignment.

10
[Figure: three sequences built from shared patterns]
S1: Pattern-A, Pattern-X, Pattern-B
S2: Pattern-A, Pattern-X, Pattern-D
S3: Pattern-X, Pattern-B, Pattern-D
Correct solution: S1, S2 and S3 aligned on Pattern-X.
11
Center Star Alignment
  • (a) Scoring scheme: distance.
  • (b) The scoring scheme satisfies the triangle inequality:
    for any characters a, b, c, $\mathrm{dist}(a,c) \le \mathrm{dist}(a,b) + \mathrm{dist}(b,c)$
    (in practice, not all scoring matrices satisfy
    the triangle inequality).
  • (c) $D(S_i, S_j)$: score of the optimal pairwise
    alignment.
  • (d) $D(M) = \sum_{i<j} a_M(S_i, S_j)$: score of the
    multiple alignment M.
  • (e) $a_M(S_i, S_j)$: pairwise alignment/score induced
    by M.
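To make the definitions concrete, here is a small sketch of D(Si, Sj) as a standard global-alignment dynamic program, together with a brute-force check of the triangle inequality on the character scores. The unit-cost `pair_score` is an assumption for illustration; the function names are hypothetical.

```python
from itertools import product


def pair_score(a, b):
    """Assumed distance-style score: 0 for identical symbols (so (-,-) = 0), else 1."""
    return 0 if a == b else 1


def satisfies_triangle(alphabet):
    """Check dist(a,c) <= dist(a,b) + dist(b,c) for all characters, gap included."""
    return all(
        pair_score(a, c) <= pair_score(a, b) + pair_score(b, c)
        for a, b, c in product(alphabet, repeat=3)
    )


def optimal_distance(s, t):
    """D(s, t): minimum total cost of a global pairwise alignment of s and t."""
    # prev[j] is the cost of aligning the current prefix of s with t[:j].
    prev = [0]
    for b in t:
        prev.append(prev[-1] + pair_score("-", b))
    for a in s:
        curr = [prev[0] + pair_score(a, "-")]
        for j, b in enumerate(t, start=1):
            curr.append(min(
                prev[j - 1] + pair_score(a, b),   # align a with b
                prev[j] + pair_score(a, "-"),     # align a with a gap
                curr[j - 1] + pair_score("-", b), # align b with a gap
            ))
        prev = curr
    return prev[-1]


print(satisfies_triangle("ABCD-"))
print(optimal_distance("ACBC", "DCABC"))
```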

12
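The Center Star Algorithm
(a) Find $S_c$ minimizing $\sum_{i \ne c} D(S_c, S_i)$.
(b) Iteratively construct the multiple alignment $M_c$:
1. $M_c = \{S_c\}$.
2. Add the sequences in $S \setminus \{S_c\}$ to $M_c$ one by one
so that the induced alignment $a_{M_c}(S_c, S_i)$ of every
newly added sequence $S_i$ with $S_c$ is optimal. Add spaces,
when needed, to all pre-aligned sequences.
Example:
Current alignment: AC-BC, DCABC
Optimal pairwise alignment of the center with the new sequence: AC--BC, DCAABC
Alignment after adding the new sequence: AC--BC, DCA-BC, DCAABC
Running time: $O(n^2)$ per pairwise alignment, so $O(k^2 n^2)$ to align all pairs of k sequences of length n.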
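The two steps above can be sketched as follows. This is an illustrative implementation under assumptions, not the presenter's code: it uses a unit-cost distance score, breaks ties by taking the first minimum, and the names `align_pair` and `center_star` are hypothetical.

```python
def pair_score(a, b):
    """Assumed distance-style score: 0 for identical symbols (so (-,-) = 0), else 1."""
    return 0 if a == b else 1


def align_pair(s, t):
    """Return (distance, aligned_s, aligned_t) for one optimal global alignment."""
    n, m = len(s), len(t)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i
    for j in range(1, m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(d[i - 1][j - 1] + pair_score(s[i - 1], t[j - 1]),
                          d[i - 1][j] + 1, d[i][j - 1] + 1)
    a, b, i, j = [], [], n, m
    while i > 0 or j > 0:  # trace back one optimal path
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + pair_score(s[i - 1], t[j - 1]):
            a.append(s[i - 1]); b.append(t[j - 1]); i -= 1; j -= 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            a.append(s[i - 1]); b.append("-"); i -= 1
        else:
            a.append("-"); b.append(t[j - 1]); j -= 1
    return d[n][m], "".join(reversed(a)), "".join(reversed(b))


def center_star(seqs):
    """Return (center_index, rows) with rows in the same order as seqs."""
    k = len(seqs)
    # Step (a): the center minimizes the sum of distances to all other sequences.
    dist = [[align_pair(s, t)[0] for t in seqs] for s in seqs]
    c = min(range(k), key=lambda i: sum(dist[i]))
    order, rows = [c], [seqs[c]]  # rows[0] is always the (gapped) center
    # Step (b): add the remaining sequences one by one.
    for i in range(k):
        if i == c:
            continue
        _, ca, sa = align_pair(seqs[c], seqs[i])
        cur, merged, new_row = rows[0], [[] for _ in rows], []
        p = q = 0
        while p < len(cur) or q < len(ca):
            cur_gap = p < len(cur) and cur[p] == "-"
            new_gap = q < len(ca) and ca[q] == "-"
            if cur_gap and not new_gap:      # old gap column: new row gets a gap
                for row, out in zip(rows, merged):
                    out.append(row[p])
                new_row.append("-"); p += 1
            elif new_gap and not cur_gap:    # new gap in center: pad all old rows
                for out in merged:
                    out.append("-")
                new_row.append(sa[q]); q += 1
            else:                            # same center position in both
                for row, out in zip(rows, merged):
                    out.append(row[p])
                new_row.append(sa[q]); p += 1; q += 1
        rows = ["".join(out) for out in merged] + ["".join(new_row)]
        order.append(i)
    by_index = dict(zip(order, rows))
    return c, [by_index[i] for i in range(k)]


center, rows = center_star(["ACBC", "DCABC", "DCAABC"])
print(center)
print("\n".join(rows))
```

With this scoring, the center chosen for these three sequences need not match the one drawn on the slide, since the slide's full input set is not shown. The exact gap placement also depends on how ties are broken in the traceback.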
13
  • $D(M_c)$ is at most twice $D(M_{opt})$:
  • $D(M_c) / D(M_{opt}) \le 2(k-1)/k$ ($< 2$)
  • Proof:
  • (a) $a(S_i, S_j) \ge D(S_i, S_j)$ (any induced alignment is
    not better than the optimal alignment), and $a_{M_c}(S_c, S_j) = D(S_c, S_j)$.
  • (b) $a_{M_c}(S_i, S_j) \le a_{M_c}(S_i, S_c) + a_{M_c}(S_c, S_j) = D(S_i, S_c) + D(S_c, S_j)$
    (follows from the triangle inequality).
  • (c) $2 D(M_c) = \sum_{i=1}^{k} \sum_{j=1, j \ne i}^{k} a_{M_c}(S_i, S_j)$
  • $\le \sum_{i=1}^{k} \sum_{j=1, j \ne i}^{k} \left( a_{M_c}(S_i, S_c) + a_{M_c}(S_c, S_j) \right)$
  • $= 2(k-1) \sum_{j \ne c} a_{M_c}(S_c, S_j)$
  • $= 2(k-1) \sum_{j \ne c} D(S_c, S_j)$

14
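(d) $k \sum_{j=1, j \ne c}^{k} D(S_c, S_j) = \sum_{i=1}^{k} \sum_{j=1, j \ne c}^{k} D(S_c, S_j)$
$\le \sum_{i=1}^{k} \sum_{j=1, j \ne i}^{k} D(S_i, S_j)$ (because $S_c$ minimizes the sum of distances to the other sequences)
$\le \sum_{i=1}^{k} \sum_{j=1, j \ne i}^{k} a_{M_{opt}}(S_i, S_j) = 2 D(M_{opt})$
(e) $\Rightarrow$ $2 D(M_c) \le 2(k-1) \sum_{j \ne c} D(S_c, S_j)$ and
$k \sum_{j \ne c} D(S_c, S_j) \le 2 D(M_{opt})$
$\Rightarrow$ $D(M_c)/(k-1) \le \sum_{j \ne c} D(S_c, S_j)$ and
$\sum_{j \ne c} D(S_c, S_j) \le 2 D(M_{opt})/k$
$\Rightarrow$ $D(M_c) / D(M_{opt}) \le 2(k-1)/k$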
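For a feel of how tight the guarantee is, the short sketch below evaluates the bound 2(k-1)/k exactly for a few values of k. The function name is hypothetical.

```python
from fractions import Fraction


def center_star_ratio_bound(k):
    """Upper bound 2(k-1)/k on D(Mc) / D(Mopt) for k >= 2 sequences."""
    if k < 2:
        raise ValueError("the bound is stated for at least two sequences")
    return Fraction(2 * (k - 1), k)


for k in (2, 3, 10, 100):
    print(k, center_star_ratio_bound(k))
```

For k = 2 the bound equals 1, which matches the fact that a single optimal pairwise alignment is already the optimal multiple alignment; as k grows, the bound approaches 2 from below.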