GA for Sequence Alignment - PowerPoint PPT Presentation

Provided by: SURV151

1
GA for Sequence Alignment
  • Pair-wise alignment
  • Multiple string alignment

2
Pairwise Sequence Alignment
  • VNRLQQNIVSLEVDHKVANYKP
  • VNRLQQSIVSLRDAFNDGELD HRVLNYKP
  • Solved by dynamic programming using a Dayhoff (PAM)
    scoring matrix
  • Each pairwise alignment takes O(n1·n2) time, where n1 and n2
    are the two sequence lengths
  • VNRLQQNIVSL__________EVDHKVANYKP
  • VNRLQQSIVSLRDAFND GELD HRVLNYKP
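
To make the O(n1·n2) cost concrete, here is a minimal sketch of the dynamic-programming score computation (Needleman-Wunsch style). It assumes flat match/mismatch/gap scores rather than a Dayhoff matrix, and the function name `nw_score` is made up for illustration; a real protein alignment would look up each pair in the scoring matrix instead.

```python
def nw_score(a, b, match=1, mismatch=-2, gap=-1):
    """Best global alignment score of a and b, filled row by row.

    Assumption: flat match/mismatch/gap scores stand in for a
    Dayhoff/PAM matrix lookup.
    """
    n1, n2 = len(a), len(b)
    # prev[j] = best score for aligning the first i-1 letters of a
    # with the first j letters of b (row 0: b aligned to gaps only).
    prev = [j * gap for j in range(n2 + 1)]
    for i in range(1, n1 + 1):
        curr = [i * gap] + [0] * n2
        for j in range(1, n2 + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                prev[j] + gap,       # a[i-1] against a gap
                curr[j - 1] + gap,   # b[j-1] against a gap
            )
        prev = curr
    return prev[n2]


if __name__ == "__main__":
    print(nw_score("VNRLQQNIVSL", "VNRLQQSIVSL"))
```

The two nested loops visit every (i, j) cell once, which is where the n1·n2 term comes from.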

3
How to implement a GA?
  • Representation
  • Fitness
  • Operators design
  • Selection strategy

4
Pair-wise Alignment Representation
  • What do you think?
  • For example (my intuitive approach)
  • Guess an alignment length n
  • Chromosome

5
Pair-wise Alignment Representation
  • So the chromosome becomes
  • You can also use the gap position

(1, 2, 4, 5, 6, 8, ...)
(2, 4, 5, 7, 8, 10, ...)
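
One possible reading of the two tuples above, sketched in Python: each tuple lists the 1-based alignment columns where the residues of one sequence are placed, and every other column is a gap. That reading is an assumption (the slide's figure is missing), and `decode`, `gap_positions`, and the sequence "ABCDEF" are hypothetical names and data used only for illustration.

```python
def decode(seq, columns, length, gap_char="_"):
    """Place seq[k] in alignment column columns[k]; fill the rest with gaps.

    Assumption: columns are 1-based and strictly increasing.
    """
    row = [gap_char] * length
    for residue, col in zip(seq, columns):
        row[col - 1] = residue
    return "".join(row)


def gap_positions(columns, length):
    """The equivalent 'gap position' encoding: columns holding no residue."""
    occupied = set(columns)
    return [c for c in range(1, length + 1) if c not in occupied]


if __name__ == "__main__":
    print(decode("ABCDEF", (1, 2, 4, 5, 6, 8), 10))    # AB_CDE_F__
    print(decode("ABCDEF", (2, 4, 5, 7, 8, 10), 10))   # _A_BC_DE_F
    print(gap_positions((1, 2, 4, 5, 6, 8), 8))        # [3, 7]
```

When an alignment has few gaps, the gap-position list is shorter than the residue-position list.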
6
Pair-wise Alignment Fitness Function
  • Simplest scheme
  • Match: +1
  • Mismatch: -2
  • Gap: -1
  • Or use a scoring matrix
  • Protein: PAM
  • DNA: substitution matrix
  • Sum the column scores to get the total score.
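
The simplest scheme above can be sketched as a fitness function over two already-gapped rows. The function name `pairwise_fitness`, the "_" gap character, and the choice to score a gap-against-gap column as 0 are assumptions, not something the slides specify.

```python
def pairwise_fitness(row1, row2, match=1, mismatch=-2, gap=-1, gap_char="_"):
    """Sum per-column scores of two aligned rows of equal length."""
    if len(row1) != len(row2):
        raise ValueError("aligned rows must have the same length")
    total = 0
    for x, y in zip(row1, row2):
        if x == gap_char and y == gap_char:
            continue            # assumption: a gap/gap column scores 0
        if x == gap_char or y == gap_char:
            total += gap        # residue against a gap
        elif x == y:
            total += match
        else:
            total += mismatch
    return total


if __name__ == "__main__":
    print(pairwise_fitness("AC_GT", "ACGGA"))  # 1 + 1 - 1 + 1 - 2 = 0
```

To use a PAM or DNA substitution matrix instead, the match/mismatch branch would become a lookup keyed on the pair (x, y).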

7
Pair-wise Alignment Genetic Operators
  • All our previous operators can be used.
  • Imagine one of your own!
  • Selection
  • Try it!
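
As one example of an operator you might imagine, here is a sketch of a gap-shift mutation together with tournament selection. Neither is taken from the slides: both function names are hypothetical, rows are assumed to be strings with "_" as the gap character, and tournament selection is just one common selection strategy.

```python
import random


def shift_gap_mutation(row, rng, gap_char="_"):
    """Move one randomly chosen gap to a random new position.

    Residue order and row length are preserved, so the result is
    still a valid gapped version of the same sequence.
    """
    gaps = [i for i, ch in enumerate(row) if ch == gap_char]
    if not gaps:
        return row
    chars = list(row)
    del chars[rng.choice(gaps)]
    chars.insert(rng.randrange(len(chars) + 1), gap_char)
    return "".join(chars)


def tournament_select(population, fitness, rng, k=2):
    """Pick k random individuals and return the fittest of them."""
    contenders = rng.sample(population, k)
    return max(contenders, key=fitness)


if __name__ == "__main__":
    rng = random.Random(0)
    print(shift_gap_mutation("AC__GT", rng))
```

Passing an explicit `random.Random` instance keeps runs reproducible, which helps when comparing operators.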

8
Conclusion About Pair-wise Alignment
  • DP solves it exactly in O(NM) time.
  • So a GA offers little advantage here.

9
S (input sequences)
s1  RPCVCPVLRQAAQ
s2  RPCACCPVLRQVVQ
s3  KPCLCPRQLRQV
s4  KPCCPRQAAQ

A (aligned sequences)
a1  RPCVC_P__VLRQAAQ
a2  RPCACCP__VLRQVVQ
a3  KPCLC_ P RQLRQV_ _
a4  KPC_C_P____RQAAQ

10
Multiple String Alignment Representation
  • What do you think?
  • For example (my intuitive approach)
  • Guess an alignment length n
  • Chromosome

11
Multiple String Alignment Representation
  • So the chromosome becomes
  • You can also use the gap position
  • Needs less space
  • Some good operators...

(1, 2, 4, 5, 6, 8, ...)
(2, 4, 5, 7, 8, 10, ...)

12
Multiple String Alignment Fitness Function
  • The hardest part
  • Nobody can know the real scoring system,
    not even biologists!
  • Approximations
  • Using SOP (sum of pairs)
  • The most widely used approach
  • Uses PAM
  • Motif-based
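
A minimal sketch of the sum-of-pairs (SOP) idea: score every pair of rows in the multiple alignment and add the results. It assumes flat match/mismatch/gap scores in place of PAM, treats gap-against-gap columns as 0, and uses hypothetical function names and toy rows.

```python
from itertools import combinations


def pair_score(row1, row2, match=1, mismatch=-2, gap=-1, gap_char="_"):
    """Column-by-column score of two aligned rows (flat scores, not PAM)."""
    total = 0
    for x, y in zip(row1, row2):
        if x == gap_char and y == gap_char:
            continue            # assumption: gap/gap columns score 0
        if x == gap_char or y == gap_char:
            total += gap
        elif x == y:
            total += match
        else:
            total += mismatch
    return total


def sum_of_pairs(rows):
    """SOP fitness: sum of pair_score over all unordered pairs of rows."""
    return sum(pair_score(r1, r2) for r1, r2 in combinations(rows, 2))


if __name__ == "__main__":
    print(sum_of_pairs(["AC_", "ACG", "A_G"]))  # 1 + (-1) + 1 = 1
```

With k rows of length n this costs O(k²·n) per fitness evaluation, which is worth keeping in mind when choosing the population size.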