The Genome Access Course Multiple Sequence Alignments - PowerPoint PPT Presentation

Slides: 21
Provided by: james858
Transcript and Presenter's Notes


1
The Genome Access Course
Multiple Sequence Alignments
2
Pairwise vs. Multiple Alignment
  • Aligning two sequences with the Smith-Waterman
    algorithm is straightforward, but the cost of
    extending dynamic programming to more sequences
    grows exponentially with the number of sequences.
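
To make the growth concrete, the sketch below counts the cells in a full dynamic-programming table, which has one axis per sequence and therefore (L+1)^N cells for N sequences of length L. The function name and the example length of 100 are illustrative assumptions, not values from the slides.

```python
def dp_table_cells(seq_length: int, num_sequences: int) -> int:
    """Cells in a full dynamic-programming table with one axis per sequence."""
    return (seq_length + 1) ** num_sequences


# Each added sequence multiplies the table size by (seq_length + 1).
for n in (2, 3, 4, 5):
    print(n, dp_table_cells(100, n))
```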

3
Methods for Aligning Multiple Sequences
  • Dynamic Programming
  • Progressive
  • Iterative
  • Genetic Algorithm
  • Hidden Markov Models (HMM)

4
Dynamic Programming
  • A technique for designing efficient algorithms
    for optimization problems
  • Some specific properties of the problem are
    required for the dynamic programming technique to
    be applicable
  • Applicable when a large search problem can be
    broken down into stages
  • Solutions to the smallest (trivial) sub-problems
    are combined to build the overall solution
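
As an illustration of these properties, here is a minimal sketch of Smith-Waterman local alignment scoring, where each table cell is a sub-problem solved from its three neighbours. The scoring values (match 2, mismatch -1, linear gap -2) are assumptions chosen for the example, and the sketch returns only the best score, not the alignment itself.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score of a and b with a linear gap penalty."""
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = H[i - 1][j] + gap      # gap in b
            left = H[i][j - 1] + gap    # gap in a
            # The 0 lets a local alignment restart anywhere.
            H[i][j] = max(0, diag, up, left)
            best = max(best, H[i][j])
    return best


print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

The table has (len(a)+1) × (len(b)+1) cells, so the work grows with the product of the two sequence lengths.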

5
Optimal Alignment
  • For a given group of sequences, there is no
    single "correct" alignment, only an alignment
    that is "optimal" according to some set of
    calculations.
  • Determining which alignment is best for a given
    set of sequences is an individual decision.
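
One way to see that "optimal" depends on the calculation is to score two candidate alignments of the same sequences under different gap penalties. The sketch below uses a sum-of-pairs score; the scoring values and the two example alignments are assumptions made up for illustration.

```python
from itertools import combinations


def sum_of_pairs(alignment, match=1, mismatch=-1, gap=-2):
    """Sum-of-pairs score: score every pair of rows in every column."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all rows must have the same length")
    total = 0
    for column in zip(*alignment):
        for x, y in combinations(column, 2):
            if x == "-" and y == "-":
                continue  # two gaps facing each other are not scored
            if x == "-" or y == "-":
                total += gap
            elif x == y:
                total += match
            else:
                total += mismatch
    return total


# Two alignments of the same three sequences (ACGT, AGT, ACT).
with_internal_gaps = ["ACGT", "A-GT", "AC-T"]
with_end_gaps = ["ACGT", "AGT-", "ACT-"]

for penalty in (-2, -5):
    print(penalty,
          sum_of_pairs(with_internal_gaps, gap=penalty),
          sum_of_pairs(with_end_gaps, gap=penalty))
```

With the milder gap penalty the first alignment scores higher; with the harsher one the ranking flips, even though the sequences are unchanged.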

6
Progressive Methods
  • Add new sequences one at a time to a pairwise
    alignment generated with dynamic programming
  • Most closely related aligned first
  • Modeled by an evolutionary tree
  • Sensitive to initial alignments

7
Progressive Pairwise Methods
  • Most of the available multiple alignment programs
    use some sort of incremental or progressive
    method that makes pairwise alignments, then adds
    new sequences one at a time to these
    aligned groups.
  • This is an approximate method!

8
ClustalW
  • Weighted Clustal
  • Performs pairwise alignments of all sequences
  • Produces a phylogenetic tree by neighbor-joining
    method
  • Aligns sequences progressively, following the
    branching order of the tree
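
The first two stages can be sketched as follows. This is a simplified stand-in, not ClustalW itself: it assumes equal-length sequences so that positions can be compared directly instead of running real pairwise alignments, and it uses greedy average-linkage clustering in place of neighbor joining. The names `p_distance`, `merge_order`, and the example sequences are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of positions that differ between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length in this sketch")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def merge_order(seqs):
    """Return the pairs of clusters merged, closest first (average linkage)."""
    clusters = {name: [name] for name in seqs}

    def average_distance(c1, c2):
        pairs = [(m, n) for m in clusters[c1] for n in clusters[c2]]
        return sum(p_distance(seqs[m], seqs[n]) for m, n in pairs) / len(pairs)

    order = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(sorted(clusters), 2),
                     key=lambda pair: average_distance(*pair))
        order.append((c1, c2))
        clusters[f"{c1}+{c2}"] = clusters.pop(c1) + clusters.pop(c2)
    return order


example = {
    "A": "ACGTACGT",
    "B": "ACGTACGA",
    "C": "ACGTTCCA",
    "D": "TTGTTCCA",
}
print(merge_order(example))
```

The resulting merge order plays the role of the guide tree: the most similar sequences are aligned first, and groups are joined afterwards.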

9
Other Software
  • MultAlin
  • SAM
  • HMMER
  • PIMA
  • treealign
  • PAM

10
Commercial Software
  • VectorNTI Suite AlignX
  • GCG Pileup
  • DNAStar

11
Genetic Algorithm
  • Machine learning algorithm
  • Simulates evolutionary changes in the sequences
  • Alignments not necessarily optimal
  • Seeks to increase initial MSA score by simulating
    gap insertion and recombination
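
A minimal sketch of this idea is shown below, assuming a sum-of-pairs fitness, a mutation that relocates one gap, and a recombination that takes each row from one of two parents. All function names, scoring values, and population settings are assumptions for illustration; published genetic-algorithm aligners use richer operators.

```python
import random
from itertools import combinations


def sp_score(rows, match=1, mismatch=-1, gap=-2):
    """Sum-of-pairs fitness of an alignment given as equal-length rows."""
    score = 0
    for col in zip(*rows):
        for x, y in combinations(col, 2):
            if x == "-" and y == "-":
                continue
            score += gap if "-" in (x, y) else (match if x == y else mismatch)
    return score


def random_individual(seqs, width, rng):
    """Pad every sequence to `width` with gaps at random positions."""
    rows = []
    for s in seqs:
        chars = list(s)
        for _ in range(width - len(s)):
            chars.insert(rng.randint(0, len(chars)), "-")
        rows.append("".join(chars))
    return rows


def mutate(rows, rng):
    """Move one gap of one randomly chosen row to a new position."""
    rows = list(rows)
    i = rng.randrange(len(rows))
    gaps = [k for k, c in enumerate(rows[i]) if c == "-"]
    if gaps:
        chars = list(rows[i])
        del chars[rng.choice(gaps)]
        chars.insert(rng.randint(0, len(chars)), "-")
        rows[i] = "".join(chars)
    return rows


def recombine(a, b, rng):
    """Build a child by taking each row from one of the two parents."""
    return [rng.choice(pair) for pair in zip(a, b)]


def evolve(seqs, width, generations=200, pop_size=30, seed=0):
    if width < max(len(s) for s in seqs):
        raise ValueError("width must be at least the longest sequence")
    rng = random.Random(seed)
    population = [random_individual(seqs, width, rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=sp_score, reverse=True)
        parents = population[: pop_size // 2]  # keep the fitter half
        children = [
            mutate(recombine(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=sp_score)


best = evolve(["ACGT", "AGT", "ACT"], width=5)
print(best, sp_score(best))
```

Because the search is stochastic, the result is a good alignment under the chosen score, not a guaranteed optimum.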

12
Markov Models
A Markov model is a probabilistic process over a
finite set, S1, ..., Sk, usually called its
states. Each state-transition generates a
character from the alphabet of the
process. State transitions are determined by a
transition probability matrix, which is
ordinarily independent of history.
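
As a small worked example, the sketch below computes the probability of a DNA sequence under a first-order Markov model in which the states are the four nucleotides. The uniform initial and transition probabilities are placeholder assumptions, not values from the slides.

```python
import math


def log_probability(sequence, initial, transition):
    """Log-probability of a sequence under a first-order Markov chain."""
    logp = math.log(initial[sequence[0]])
    # Each step depends only on the previous state, not on earlier history.
    for prev, curr in zip(sequence, sequence[1:]):
        logp += math.log(transition[prev][curr])
    return logp


alphabet = "ACGT"
initial = {s: 0.25 for s in alphabet}
transition = {s: {t: 0.25 for t in alphabet} for s in alphabet}

print(math.exp(log_probability("ACGT", initial, transition)))
```

Working in log space avoids numerical underflow when the sequence is long.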
13
Hidden Markov Models
  • A Markov model in which the states are hidden.
  • A given state in the sequence cannot be
    determined, but probabilities can be calculated.
  • Can be designed such that biological relevance
    can be assigned to a state
  • Originally applied to speech recognition

14
Uses of Hidden Markov Models
  • Multiple sequence alignment
  • Gene prediction
  • Protein families (Profile HMM)
  • Fold Recognition
  • CpG island detection

15
DNA Sequence Transition Tables
[Figure: transition probability tables for normal DNA and for a CpG island]
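
Two such tables can be used to classify a stretch of DNA by a log-odds score: sum, over consecutive nucleotide pairs, the log of the island probability divided by the normal-DNA probability. The transcript does not include the tables' values, so the numbers below are hypothetical, chosen only so that the C-to-G transition is rarer in normal DNA than in an island.

```python
import math

ALPHABET = "ACGT"

# Hypothetical tables: uniform except that C->G is depleted in normal DNA.
island = {a: {b: 0.25 for b in ALPHABET} for a in ALPHABET}
normal = {a: {b: 0.25 for b in ALPHABET} for a in ALPHABET}
normal["C"] = {"A": 0.30, "C": 0.30, "G": 0.10, "T": 0.30}


def log_odds(sequence, model_plus=island, model_minus=normal):
    """Sum of log2 ratios over consecutive pairs; positive favours model_plus."""
    return sum(
        math.log2(model_plus[a][b] / model_minus[a][b])
        for a, b in zip(sequence, sequence[1:])
    )


print(log_odds("CGCG"), log_odds("ACTA"))
```

A positive score suggests the sequence looks more like the island model; a negative score suggests normal DNA.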
16
Building a Profile HMM
  1. Construct a transition matrix for same-length
    sequences with no gaps
  2. Correct for single insertions
  3. Correct for variable length insertions
  4. Correct for constant deletions
  5. Correct for variable length deletions
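
Step 1 can be sketched as follows. With same-length, ungapped sequences the model is a simple chain of match states, so what is estimated is one emission distribution per column. The pseudocount of 1 and the function name are assumptions for this example; the insert and delete states of steps 2 to 5 are not modeled here.

```python
from collections import Counter


def match_emissions(sequences, alphabet="ACGT", pseudocount=1.0):
    """Per-column emission probabilities from ungapped, equal-length sequences."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must all have the same length")
    profile = []
    for column in zip(*sequences):
        counts = Counter(column)
        total = len(column) + pseudocount * len(alphabet)
        # Pseudocounts keep unseen residues from getting probability zero.
        profile.append({c: (counts[c] + pseudocount) / total for c in alphabet})
    return profile


profile = match_emissions(["ACGT", "ACGA", "ACCT"])
for position, probs in enumerate(profile, start=1):
    print(position, probs)
```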

17
Sequence Alignment HMM
[Figure: profile HMM with match, delete, and insert states]
18
MSA with HMMs
  1. Construct a profile HMM
  2. Find most likely path for each sequence
  3. Sequence of matching/insert/delete states is the
    alignment
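
Step 3 can be illustrated by turning state paths into alignment rows. The sketch below assumes each path is already known (for example from a Viterbi search) and is written as a string of M, I, and D. Showing inserted residues in lowercase and padding insert columns with "." is a display choice made for this example.

```python
def render_alignment(sequences, paths, num_match_states):
    """Convert per-sequence state paths (M/I/D) into aligned rows."""
    parsed = []
    for seq, path in zip(sequences, paths):
        residues = iter(seq)
        columns = []                              # one entry per match state
        inserts = [""] * (num_match_states + 1)   # inserts[k] follows match state k
        for state in path:
            if state == "M":
                columns.append(next(residues))
            elif state == "D":
                columns.append("-")               # match state skipped
            elif state == "I":
                inserts[len(columns)] += next(residues).lower()
            else:
                raise ValueError(f"unknown state {state!r}")
        if len(columns) != num_match_states:
            raise ValueError("path does not visit every match position")
        parsed.append((columns, inserts))

    # Each insert region is as wide as the longest insertion found there.
    widths = [max(len(ins[k]) for _, ins in parsed)
              for k in range(num_match_states + 1)]
    rows = []
    for columns, inserts in parsed:
        row = inserts[0].ljust(widths[0], ".")
        for k, ch in enumerate(columns, start=1):
            row += ch + inserts[k].ljust(widths[k], ".")
        rows.append(row)
    return rows


rows = render_alignment(["ACG", "AG", "ACTG"], ["MMM", "MDM", "MMIM"], 3)
print("\n".join(rows))
```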

19
Editing MSAs
  • CINEMA
  • MACAW
  • BOXSHADE
  • PRETTYBOX

20
Databases Based on MSAs
  • PROSITE
  • FINGERPRINTS
  • BLOCKS