Title: PowerPoint Presentation
Author: heringa
Last modified by: heringa
Created Date: 2/20/2003 6:00:44 PM
Document presentation format: On-screen Show

Slides: 24
Provided by: heringa

1
Bioinformatics
  • Nothing in Biology makes sense except in the
    light of evolution (Theodosius Dobzhansky
    (1900-1975))
  • Nothing in bioinformatics makes sense except in
    the light of Biology

2
Evolution
  • Three requirements
  • Template structure providing stability (DNA)
  • Copying mechanism (meiosis)
  • Mechanism providing variation (mutations,
    insertions and deletions, crossing-over, etc.)

3
Evolution
  • Ancestral sequence: ABCD
  • Descendants: ACCD (mutation, B → C) and
    ABD (deletion, C → ø)
  • Pairwise alignment, two candidates:
      ACCD      ACCD
      AB-D  or  A-BD
4
Evolution
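  • Ancestral sequence: ABCD
  • Descendants: ACCD (mutation, B → C) and
    ABD (deletion, C → ø)
  • Pairwise alignment, two candidates:
      ACCD      ACCD
      AB-D  or  A-BD
  • True alignment: ACCD over AB-D, because the deleted
    C was the third residue of the ancestral sequence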
5
Example: Pairwise sequence alignment needs sense of evolution
Global dynamic programming
[Figure: the sequence MDAGSTVILCFVG evolves into MDAASTILCGS; an amino acid exchange matrix and gap penalties (open, extension) are used to fill the search matrix, which yields the alignment below]
MDAGSTVILCFVG-
MDAAST-ILC--GS
6
Sequence alignment: History
  • 1970 Needleman-Wunsch: global pair-wise alignment
  • 1981 Smith-Waterman: local pair-wise alignment
  • 1984 Hogeweg-Hesper: progressive multiple alignment
  • 1989 Lipman-Altschul-Kececioglu: simultaneous multiple alignment
  • 1994 Hidden Markov Models (HMM) for multiple alignment
  • 1996 Iterative strategies for progressive multiple alignment revived
  • 1997 PSI-Blast (PSSM)
7
Pair-wise alignment
T D W V T A L K
T D W L - - I K
Combinatorial explosion
  - 1 gap in 1 sequence: n+1 possibilities
  - 2 gaps in 1 sequence: (n+1)n
  - 3 gaps in 1 sequence: (n+1)n(n-1), etc.
$$\binom{2n}{n} = \frac{(2n)!}{(n!)^2} \approx \frac{2^{2n}}{\sqrt{\pi n}}$$
2 sequences of 300 a.a.: $10^{88}$ alignments
2 sequences of 1000 a.a.: $10^{600}$ alignments!
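The growth of the binomial count above can be checked numerically. The following is a minimal sketch, assuming the count is exactly the central binomial coefficient from the slide; the function names are illustrative and not part of the original presentation. It works in log10 space so that no huge integers are needed.

```python
import math


def log10_alignment_count(n: int) -> float:
    """log10 of C(2n, n) = (2n)! / (n!)^2, via lgamma to avoid huge integers."""
    return (math.lgamma(2 * n + 1) - 2 * math.lgamma(n + 1)) / math.log(10)


def log10_stirling_estimate(n: int) -> float:
    """log10 of the approximation 2^(2n) / sqrt(pi * n)."""
    return 2 * n * math.log10(2) - 0.5 * math.log10(math.pi * n)


# Print the exponent of 10 for a few sequence lengths n.
for n in (150, 300, 1000):
    exact = log10_alignment_count(n)
    approx = log10_stirling_estimate(n)
    print(f"n={n}: exact 10^{exact:.1f}, approximation 10^{approx:.1f}")
```

Under this formula n = 1000 comes out near $10^{600}$, in line with the slide. For n = 300 the same formula gives roughly $10^{179}$, while a value near $10^{89}$ corresponds to n = 150, so the 300 a.a. figure may refer to a combined length of 300.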
8
A protein sequence alignment
MSTGAVLIY--TSILIKECHAMPAGNE-----
---GGILLFHRTHELIKESHAMANDEGGSNNS
A DNA sequence alignment
attcgttggcaaatcgcccctatccggccttaa
attt---ggcggatcg-cctctacgggcc----
9
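Dynamic programming: Scoring alignments
The score $S_{a,b}$ of an alignment of sequences a and b is the sum of the exchange values of the aligned residue pairs minus the penalty $gp(k)$ for each gap of length k.
$$gp(k) = p_i + k \cdot p_e \quad \text{(affine gap penalties)}$$
$p_i$ and $p_e$ are the penalties for gap initialisation and extension, respectively.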
10
Dynamic programming: Scoring alignments
T D W V T A L K
T D W L - - I K
Amino Acid Exchange Matrix (20×20)
Affine gap penalties (open, extension): 10, 1
Score = s(T,T) + s(D,D) + s(W,W) + s(V,L) - Po - 2Px + s(L,I) + s(K,K)
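The score above can be worked through in code. This is a minimal sketch, not the presenter's implementation: `score_alignment` and the `PAM250_SUBSET` dictionary are hypothetical names, the exchange values are the ones listed for those residue pairs in the PAM250 matrix of slide 12, and the gap penalties 10 (open) and 1 (extension) are assumed to be the values shown on the slide.

```python
# Exchange values for the residue pairs in the example (from the PAM250 slide).
PAM250_SUBSET = {
    ("T", "T"): 3, ("D", "D"): 4, ("W", "W"): 17,
    ("V", "L"): 2, ("L", "I"): 2, ("K", "K"): 5,
}


def score_alignment(row_a, row_b, exchange, p_open, p_ext):
    """Sum exchange values over aligned pairs; subtract p_open + k * p_ext per gap of length k."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    total = 0
    gap_len, gap_row = 0, None  # current gap run and the row it sits in
    for x, y in zip(row_a, row_b):
        current = "a" if x == "-" else "b" if y == "-" else None
        if current != gap_row and gap_len:
            # A gap run just ended (or switched rows): charge it once.
            total -= p_open + gap_len * p_ext
            gap_len = 0
        gap_row = current
        if current is not None:
            gap_len += 1
        else:
            key = (x, y) if (x, y) in exchange else (y, x)
            total += exchange[key]
    if gap_len:  # gap run reaching the end of the alignment
        total -= p_open + gap_len * p_ext
    return total


print(score_alignment("TDWVTALK", "TDWL--IK", PAM250_SUBSET, p_open=10, p_ext=1))
```

With these assumed values the terms are 3 + 4 + 17 + 2 - (10 + 2·1) + 2 + 5.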
11
Amino acid exchange matrices
20×20
How do we get one? And how do we get associated
gap penalties? First systematic method to derive
a.a. exchange matrices: Margaret Dayhoff et al.
(1978), Atlas of Protein Sequence and Structure
12
A   2
R  -2   6
N   0   0   2
D   0  -1   2   4
C  -2  -4  -4  -5  12
Q   0   1   1   2  -5   4
E   0  -1   1   3  -5   2   4
G   1  -3   0   1  -3  -1   0   5
H  -1   2   2   1  -3   3   1  -2   6
I  -1  -2  -2  -2  -2  -2  -2  -3  -2   5
L  -2  -3  -3  -4  -6  -2  -3  -4  -2   2   6
K  -1   3   1   0  -5   1   0  -2   0  -2  -3   5
M  -1   0  -2  -3  -5  -1  -2  -3  -2   2   4   0   6
F  -4  -4  -4  -6  -4  -5  -5  -5  -2   1   2  -5   0   9
P   1   0  -1  -1  -3   0  -1  -1   0  -2  -3  -1  -2  -5   6
S   1   0   1   0   0  -1   0   1  -1  -1  -3   0  -2  -3   1   2
T   1  -1   0   0  -2  -1   0   0  -1   0  -2   0  -1  -3   0   1   3
W  -6   2  -4  -7  -8  -5  -7  -7  -3  -5  -2  -3  -4   0  -6  -2  -5  17
Y  -3  -4  -2  -4   0  -4  -4  -5   0  -1  -1  -4  -2   7  -5  -3  -3   0  10
V   0  -2  -2  -2  -2  -2  -2  -1  -2   4   2  -2   2  -1  -1  -1   0  -6  -2   4
B   0  -1   2   3  -4   1   2   0   1  -2  -3   1  -2  -5  -1   0   0  -5  -3  -2   2
Z   0   0   1   3  -5   3   3  -1   2  -2  -3   0  -2  -5   0   0  -1  -6  -4  -2   2   3
    A   R   N   D   C   Q   E   G   H   I   L   K   M   F   P   S   T   W   Y   V   B   Z
PAM250 matrix: amino acid exchange matrix (log odds)
Positive exchange values denote mutations that
are more likely than randomly expected, while
negative numbers correspond to avoided mutations
compared to the randomly expected situation
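The sign convention of a log-odds value can be illustrated with a small calculation. This is a sketch under stated assumptions: the scaling `10 * log10` followed by rounding is the convention commonly described for PAM matrices, and the frequencies in the example are made up for illustration, not taken from Dayhoff's data. The function name `log_odds` is hypothetical.

```python
import math


def log_odds(pair_freq: float, freq_a: float, freq_b: float, scale: float = 10.0) -> int:
    """Rounded, scaled log10 of observed pair frequency over the chance expectation."""
    expected = freq_a * freq_b  # frequency expected if the two residues paired at random
    return round(scale * math.log10(pair_freq / expected))


# Illustrative (invented) frequencies: both residues occur with frequency 0.1.
print(log_odds(0.02, 0.1, 0.1))   # pair seen twice as often as expected: positive
print(log_odds(0.005, 0.1, 0.1))  # pair seen half as often as expected: negative
```

A pair observed exactly as often as chance predicts scores 0, which is why the matrix mixes positive and negative entries.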
13
Pairwise sequence alignment: Global dynamic programming
[Figure: the sequence MDAGSTVILCFVG evolves into MDAASTILCGS; an amino acid exchange matrix and gap penalties (open, extension) are used to fill the search matrix, which yields the alignment below]
MDAGSTVILCFVG-
MDAAST-ILC--GS
14
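Global dynamic programming
[Figure: search matrix with row i-1 and column j-1 marked]
$$S_{i,j} = s_{i,j} + \max \begin{cases} \max_{0<x<i-1} \left( S_{x,j-1} - P_i - (i-x-1)P_x \right) \\ S_{i-1,j-1} \\ \max_{0<y<j-1} \left( S_{i-1,y} - P_i - (j-y-1)P_x \right) \end{cases}$$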
Global dynamic programming
16
Global dynamic programming
17
Pairwise alignment
  • Global alignment: all gaps are penalised
  • Semi-global alignment: N- and C-terminal gaps
    (end-gaps) are not penalised

    MSTGAVLIY--TS-----
    ---GGILLFHRTSGTSNS

    The leading gap in the second sequence and the trailing
    gap in the first sequence are the end-gaps.
18
Local dynamic programming (Smith-Waterman, 1981)
[Figure: the sequences LCFVMLAGSTVIVGTR and EDASTILCGS are compared using an amino acid exchange matrix that contains negative numbers and gap penalties (open, extension); the search matrix yields the local alignment below]
AGSTVIVG
A-STILCG
19
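Local dynamic programming (Smith-Waterman, 1981)
[Figure: search matrix with row i-1 and column j-1 marked]
$$S_{i,j} = \max \begin{cases} s_{i,j} + \max_{0<x<i-1} \left( S_{x,j-1} - P_i - (i-x-1)P_x \right) \\ s_{i,j} + S_{i-1,j-1} \\ s_{i,j} + \max_{0<y<j-1} \left( S_{i-1,y} - P_i - (j-y-1)P_x \right) \\ 0 \end{cases}$$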
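The effect of the extra 0 in the local recurrence can be sketched as follows. This is an illustrative prefix-based variant, not the slide's exact formulation: scores are clamped at 0 so an alignment can start anywhere, and the result is the highest cell in the matrix so it can end anywhere. The identity scoring (+5 match, -4 mismatch) and the penalties 10 and 1 are assumptions standing in for a real exchange matrix and tuned gap penalties.

```python
def gap_penalty(k, p_open=10, p_ext=1):
    """Affine penalty for a gap of length k: open + k * extension."""
    return p_open + k * p_ext


def identity_score(x, y):
    """Hypothetical stand-in for an exchange matrix: +5 match, -4 mismatch."""
    return 5 if x == y else -4


def local_score(a, b, score=identity_score):
    """Best local alignment score: the highest-scoring pair of substrings of a and b."""
    m, n = len(a), len(b)
    # H[i][j]: best score of a local alignment ending at a[:i], b[:j] (never below 0).
    H = [[0] * (n + 1) for _ in range(m + 1)]
    best_overall = 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # Either restart here (0) or extend the diagonal with this residue pair.
            best = max(0, H[i - 1][j - 1] + score(a[i - 1], b[j - 1]))
            for k in range(1, i + 1):
                best = max(best, H[i - k][j] - gap_penalty(k))  # gap in b
            for k in range(1, j + 1):
                best = max(best, H[i][j - k] - gap_penalty(k))  # gap in a
            H[i][j] = best
            best_overall = max(best_overall, best)
    return best_overall


print(local_score("LCFVMLAGSTVIVGTR", "EDASTILCGS"))
```

The scoring scheme needs negative values for mismatches, otherwise the 0 floor never triggers and the local alignment degenerates towards a global one.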
20
Local dynamic programming
21
Dot plots
  • Way of representing (visualising) sequence
    similarity without doing dynamic programming (DP)
  • Make same matrix, but locally represent sequence
    similarity by averaging using a window
  • See Lesk's book, pp. 167-171

22
Comparing two sequences
We want to be able to choose the best alignment between two
sequences. A simple method of finding similarities between
two sequences is to use dot plots. The first sequence to be
compared is assigned to the horizontal axis and the second is
assigned to the vertical axis.
23
Dot plots can be filtered by window approaches
(to calculate running averages) and by applying a
threshold. They can identify insertions,
deletions and inversions.
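Window filtering with a threshold can be sketched in a few lines. This is a minimal illustration, assuming identity matches rather than exchange-matrix values: each cell counts matches along the diagonal inside a window centred on it and is marked only if the count reaches the threshold. The function name `dot_plot` and the default window and threshold are hypothetical choices.

```python
def dot_plot(seq_x, seq_y, window=3, threshold=2):
    """Return rows of booleans; rows follow seq_y (vertical), columns follow seq_x (horizontal)."""
    half = window // 2
    plot = []
    for j in range(len(seq_y)):
        row = []
        for i in range(len(seq_x)):
            matches = 0
            for d in range(-half, half + 1):
                x, y = i + d, j + d  # walk along the diagonal through (i, j)
                if 0 <= x < len(seq_x) and 0 <= y < len(seq_y) and seq_x[x] == seq_y[y]:
                    matches += 1
            row.append(matches >= threshold)
        plot.append(row)
    return plot


# Print the plot as text: '*' marks a window that passes the threshold.
for row in dot_plot("MDAGSTVILCFVG", "MDAASTILCGS"):
    print("".join("*" if hit else "." for hit in row))
```

In such a plot a shared region shows up as a diagonal run, and an insertion or deletion shifts the run onto a neighbouring diagonal.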