Alignment of Pairs of Sequences 1

Provided by: mli1

1
Alignment of Pairs of Sequences (1)
  • BCI 3043 Introduction to Bioinformatics

2
  • Biosequence alignment problems
  • Global alignment
  • Local alignment
  • Gap penalty
  • Affine gap penalty
  • Methods of sequence alignment
  • Dot matrix
  • Dynamic programming
  • Heuristic
  • Scoring matrices
  • PAM
  • BLOSUM
  • GONNET
  • Efficiency measurement

3
Genes encode the recipes for proteins
4
Proteins: Molecular Machines
  • Proteins in your muscles allow you to move:
    myosin and actin

5
Intro
  • Pairwise sequence alignment: the process of
    comparing two sequences by searching for a series
    of individual characters that appear in the same
    order in both sequences
  • Compare two sequences (DNA or protein) and find the
    similarities
  • DNA alignment can give ambiguous results
  • Translate into protein before aligning them!

6
Intro
  • Consider two sequences X and Y
  • AGGCTATCACCTGACCTCCAGGCCGATGCCC (sequence X)
  • TAGCTATCACGACCGCGGTCGATTTGCCCGAC (sequence Y)
  • We insert gaps at positions 0..n in X and Y
  • -AGGCTATCACCTGACCTCCAGGCCGA--TGCCC---
  • TAG-CTATCAC--GACCGC--GGTCGATTTGCCCGAC
  • We get two sequences of the same length that are as
    similar to each other as possible
  • But how similar are they?
  • Assign scores to matches, mismatches and gaps

7
Why do alignment?
  • Hypothesis: all genetic material has one common
    ancestor
  • But evolution and mutation have modified it and
    created differences
  • Insertion, deletion, substitution
  • Indel: an insertion or a deletion

8
Cont.
  • Comparison of macromolecular sequences.
  • Nucleic acids (DNA, RNA) or proteins.
  • Suggest evolutionary, structural and functional
    relationships.
  • Heuristic algorithms for practical database
    searching.

9
Cont.
  • The draft human genome is available
  • Automated gene finding is possible
  • Gene: AGTACGTATCGTATAGCGTAA
  • What does it do?
  • One approach: is there a similar gene in another
    species?
  • Align sequences with known genes
  • Find the gene with the best match

10
Distance vs Similarity
  • Distance: the minimal sum of weights for a set of
    mutations transforming one sequence into the other
  • Similarity: ???
  • Similarity is useful for database searching; distance
    is useful for phylogenetic tree construction
  • Maximizing similarity corresponds to minimizing distance
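
A minimal sketch of the distance idea, assuming every mutation (substitution, insertion, deletion) has weight 1. This is the classic edit distance; the function name edit_distance and the unit weights are assumptions for illustration, not something the slides specify.

```python
def edit_distance(s: str, t: str) -> int:
    """Minimal number of unit-weight mutations that turn s into t."""
    # prev[j] = distance between the prefix of s processed so far and t[:j]
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, start=1):
        curr = [i]  # turning s[:i] into the empty string takes i deletions
        for j, b in enumerate(t, start=1):
            substitute = prev[j - 1] + (a != b)  # free when characters match
            delete = prev[j] + 1
            insert = curr[j - 1] + 1
            curr.append(min(substitute, delete, insert))
        prev = curr
    return prev[-1]


print(edit_distance("ACGT", "AGT"))  # 1 (delete the C)
```

With non-unit weights the same recurrence applies; only the three costs change.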

11
Homology
  • A common evolutionary relationship between two
    individuals
  • Indicated by a high level of similarity
  • 25% identity over 100 amino acids suggests that the two
    sequences have a common ancestry
  • Homology is different from similarity
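
A small sketch of how percent identity could be computed from two already-aligned rows. It assumes identity is counted over alignment columns (gap columns included in the denominator); other conventions exist, and the name percent_identity is hypothetical.

```python
def percent_identity(row_a: str, row_b: str) -> float:
    """Percentage of alignment columns holding the same residue in both rows."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    # Ignore columns where both rows hold a gap; they carry no information.
    columns = [(a, b) for a, b in zip(row_a, row_b) if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    identical = sum(1 for a, b in columns if a == b)
    return 100.0 * identical / len(columns)


# 13 identical columns out of 19, so about 68%
print(round(percent_identity("-GCGC-ATGGATTGAGCGA", "TGCGCCATTGAT-GACC-A"), 1))
```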

12
Biological motivations of alignment problems
  • 5 major variants
  • Global alignments
  • Local alignments
  • Ends free-space alignment
  • Gap penalty
  • Affine gap penalty

13
1) Global alignment
  • Finds similarities between sequences S and T
  • Done across the entire sequence length
  • E.g. the Needleman-Wunsch algorithm
  • EMBOSS: http://www.hgmp.mrc.ac.uk/Software/EMBOSS/Aps/needle.html
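
A minimal sketch of the dynamic-programming idea behind Needleman-Wunsch. It is not the EMBOSS needle implementation. The function name needleman_wunsch and the default scores (match 1, mismatch -1, indel -2, the values this deck uses in its scoring scheme) are assumptions chosen for illustration.

```python
def needleman_wunsch(s, t, match=1, mismatch=-1, indel=-2):
    """Return (score, aligned_s, aligned_t) for one optimal global alignment."""
    n, m = len(s), len(t)
    # score[i][j] = best score for aligning s[:i] with t[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * indel  # s[:i] against nothing: i indels
    for j in range(1, m + 1):
        score[0][j] = j * indel
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if s[i - 1] == t[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + indel,
                              score[i][j - 1] + indel)
    # Trace back from the bottom-right corner to recover one alignment.
    out_s, out_t = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and s[i - 1] == t[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_s.append(s[i - 1]); out_t.append(t[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + indel:
            out_s.append(s[i - 1]); out_t.append("-"); i -= 1
        else:
            out_s.append("-"); out_t.append(t[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_s)), "".join(reversed(out_t))


best, row_s, row_t = needleman_wunsch("GCGCATGGATTGAGCGA", "TGCGCCATTGATGACCA")
print(best)
print(row_s)
print(row_t)
```

The table needs time and memory proportional to len(s) x len(t), which is why heuristic methods are preferred for database-scale searches.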

14
Global Alignment
  • Input: two sequences over the same alphabet
  • Output: an alignment of the two sequences
  • Example:
  • GCGCATGGATTGAGCGA and TGCGCCATTGATGACCA
  • A possible alignment:
  • -GCGC-ATGGATTGAGCGA
  • TGCGCCATTGAT-GACC-A

15
Global Alignment
Example (cont)
  • -GCGC-ATGGATTGAGCGA
  • TGCGCCATTGAT-GACC-A
  • Three elements
  • Perfect matches
  • Mismatches
  • Insertions and deletions (indels)

Symmetric view of evolution
16
Global Alignment: scoring scheme
  • Score each position independently
  • Match: +1
  • Mismatch: -1
  • Indel: -2
  • The score of an alignment is the sum of its position scores

Example
-GCGC-ATGGATTGAGCGA
TGCGCCATTGAT-GACC-A
Score = (1 x 13) + (-1 x 2) + (-2 x 4) = 3

------GCGCATGGATTGAGCGA
TGCGCC----ATTGATGACCA--
Score = (1 x 5) + (-1 x 6) + (-2 x 12) = -25
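
The position-by-position scoring above can be sketched as a short function. The name score_alignment is hypothetical; the default values are the match, mismatch and indel scores from this slide.

```python
def score_alignment(row_a, row_b, match=1, mismatch=-1, indel=-2):
    """Sum the per-column scores of two aligned rows of equal length."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    total = 0
    for a, b in zip(row_a, row_b):
        if a == "-" or b == "-":
            total += indel      # a gap in either row counts as an indel
        elif a == b:
            total += match
        else:
            total += mismatch
    return total


print(score_alignment("-GCGC-ATGGATTGAGCGA", "TGCGCCATTGAT-GACC-A"))  # 3
```

Scoring each candidate alignment this way is what lets two alignments of the same pair of sequences be compared.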
17
2) Local alignment
  • Definition:
  • Given strings S1 and S2, find substrings α of S1 and
    β of S2 whose similarity is maximum over all pairs
    of substrings from S1 and S2.
  • Search for the substrings of the two sequences that
    are most similar
  • More meaningful than global alignment
  • Useful for detecting potential sequence and functional
    motifs
  • Cannot give the overall similarity
  • E.g. the Smith-Waterman algorithm
  • EMBOSS:
  • http://www.hgmp.mrc.ac.uk/Software/EMBOSS/Aps/water.html
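
A minimal sketch of the Smith-Waterman recurrence, which differs from global alignment by letting every cell fall back to 0 and by starting the traceback at the highest-scoring cell. It is not the EMBOSS water implementation. The name smith_waterman and the default scores (match 1, mismatch -1, indel -2) are assumptions for illustration.

```python
def smith_waterman(s, t, match=1, mismatch=-1, indel=-2):
    """Return (score, aligned_s, aligned_t) for one best local alignment."""
    n, m = len(s), len(t)
    # h[i][j] = best score of a local alignment ending at s[i-1], t[j-1]
    h = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if s[i - 1] == t[j - 1] else mismatch
            h[i][j] = max(0,                      # start a fresh alignment
                          h[i - 1][j - 1] + sub,
                          h[i - 1][j] + indel,
                          h[i][j - 1] + indel)
            if h[i][j] > best:
                best, best_pos = h[i][j], (i, j)
    # Trace back from the best cell until the score drops to 0.
    out_s, out_t = [], []
    i, j = best_pos
    while i > 0 and j > 0 and h[i][j] > 0:
        sub = match if s[i - 1] == t[j - 1] else mismatch
        if h[i][j] == h[i - 1][j - 1] + sub:
            out_s.append(s[i - 1]); out_t.append(t[j - 1]); i -= 1; j -= 1
        elif h[i][j] == h[i - 1][j] + indel:
            out_s.append(s[i - 1]); out_t.append("-"); i -= 1
        else:
            out_s.append("-"); out_t.append(t[j - 1]); j -= 1
    return best, "".join(reversed(out_s)), "".join(reversed(out_t))


# Sequences A and B from the local alignment example in this deck
print(smith_waterman("aaaacccccggggtta", "ttcccgggaaccaacc"))  # (6, 'cccggg', 'cccggg')
```

Under these assumed scores, the best local alignment of the two example sequences is the shared region cccggg.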

18
Local Alignment
  • Consider sequences A and B
  • A: aaaacccccggggtta
  • B: ttcccgggaaccaacc
  • Only the region highlighted in red on the slide is similar
  • If
  • A: aaaaccccgggtta
  • B: ttcccgggaaccaacc
  • then
  • ccc gg
  • cccggg
  • A: aaaacccc ggctta
  • B: ttcccgggaaccaacc

19
3) Gap penalty
  • Consider sequences P and Q, with a gap in Q
  • ACGTCTGATACGCCGTATAGTCTATCT (Seq P)
  • ACGTCTGAT-------ATAGTCTATCT (Seq Q)
  • We penalize gaps using n x d, where n is the number of
    gap positions and d is the gap penalty
  • But gaps usually occur in bunches
  • A more realistic approach: reduce the gap penalty
    as the gap length increases

20
Affine gap penalty
  • An example of a gap penalty method
  • Each gap is given a weight that depends on its length
  • Wx = g + rx
  • Wx: total weight, g: gap opening penalty, r:
    gap penalty for each element, x: gap length
  • Example: Genematcher2
  • http://www.paracel.com/products/gm2.html
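
A short sketch comparing the linear penalty (n x d) with the affine weight, taken here as opening penalty g plus r for each of the x gap elements. The function names and the numeric values d = 2, g = 3 and r = 1 are hypothetical; penalties are written as positive costs.

```python
import re


def gap_lengths(aligned_row: str) -> list:
    """Lengths of each run of consecutive '-' characters in an aligned row."""
    return [len(run) for run in re.findall(r"-+", aligned_row)]


def linear_gap_penalty(x: int, d: int) -> int:
    """Every gap position costs d, no matter how the positions are grouped."""
    return x * d


def affine_gap_penalty(x: int, g: int, r: int) -> int:
    """Opening a gap costs g once, then r for each of its x elements."""
    return g + r * x if x > 0 else 0


seq_q = "ACGTCTGAT-------ATAGTCTATCT"
for x in gap_lengths(seq_q):
    print(x, linear_gap_penalty(x, d=2), affine_gap_penalty(x, g=3, r=1))  # 7 14 10
```

With these assumed values, one gap of length 7 costs less under the affine scheme than seven separately opened one-element gaps, which reflects the observation that gaps tend to occur in bunches.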

22
Group presentation
  • Wednesday, 12/08/07
  • About 10 minutes per group
  • Smith-Waterman overview
  • Smith-Waterman algorithm
  • Needleman-Wunsch overview
  • Needleman-Wunsch algorithm
  • Affine gap penalty