Space Efficient Alignment Algorithms - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Space Efficient Alignment Algorithms


1
Space Efficient Alignment Algorithms
  • Dr. Nancy Warter-Perez
  • June 24, 2005

2
Outline
  • Algorithm complexity
  • Complexity of dynamic programming alignment
    algorithms
  • Hirschberg's Divide and Conquer algorithm

3
Algorithm Complexity
  • Indicates the space and time (computational)
    efficiency of a program
  • Space complexity refers to how much memory is
    required to execute the algorithm
  • Time complexity refers to how long it will take
    to execute (compute) the algorithm
  • Generally written in Big-O notation
  • O represents the complexity (order)
  • n represents the size of the data set
  • Examples
  • O(n): order n, linear complexity
  • O(n^2): order n squared, quadratic complexity
  • Constants and lower-order terms are ignored
  • O(2n) = O(n) and O(n^2 + n + 1) = O(n^2)
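
As a rough illustration of linear versus quadratic growth, the sketch below counts loop iterations. The function names are hypothetical and are not part of the slides.

```python
def linear_steps(n):
    """Count the iterations of a single loop over n items: O(n)."""
    steps = 0
    for _ in range(n):
        steps += 1
    return steps


def quadratic_steps(n):
    """Count the iterations of a doubly nested loop over n items: O(n^2)."""
    steps = 0
    for _ in range(n):
        for _ in range(n):
            steps += 1
    return steps


if __name__ == "__main__":
    # Doubling n doubles the linear count and quadruples the quadratic one.
    for n in (100, 200, 400):
        print(n, linear_steps(n), quadratic_steps(n))
```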

4
Complexity of Dynamic Programming Algorithms for
Global/Local Alignment
  • Time complexity: O(mn)
  • For each cell in the score matrix, perform 3
    operations
  • Compute Up, Left, and Diagonal scores
  • O(3mn) = O(mn)
  • Space complexity: O(mn)
  • Size of scoring matrix = mn
  • Size of traceback matrix = mn
  • O(2mn) = O(mn)
  • Where m and n are the lengths of the sequences
    being aligned
  • Since m ≈ n, O(n^2): quadratic complexity!
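
A minimal sketch of the full-matrix approach described above, to make the two O(mn) matrices concrete. The scoring values (match +1, mismatch -1, gap -1) and the function name `global_alignment` are assumptions for illustration; the slides do not specify a scoring scheme.

```python
MATCH, MISMATCH, GAP = 1, -1, -1  # assumed scoring scheme, not from the slides


def global_alignment(a, b):
    """Global alignment with full score and traceback matrices (O(mn) space)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]   # (n+1) x (m+1) scores
    trace = [[""] * (m + 1) for _ in range(n + 1)]  # (n+1) x (m+1) moves
    for i in range(1, n + 1):
        score[i][0], trace[i][0] = i * GAP, "U"
    for j in range(1, m + 1):
        score[0][j], trace[0][j] = j * GAP, "L"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # The three operations per cell: Diagonal, Up, and Left scores.
            diag = score[i - 1][j - 1] + (MATCH if a[i - 1] == b[j - 1] else MISMATCH)
            up = score[i - 1][j] + GAP
            left = score[i][j - 1] + GAP
            best = max(diag, up, left)
            score[i][j] = best
            trace[i][j] = "D" if best == diag else ("U" if best == up else "L")
    # Follow the traceback matrix from the sink (n, m) back to the source (0, 0).
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        move = trace[i][j]
        if move == "D":
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif move == "U":
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    print(global_alignment("AGTACGCA", "TATGC"))
```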

5
Memory Requirements
  • For a sequence of 200-500 amino acids or
    nucleotides
  • O(n^2): 500^2 = 250,000 cells
  • If each score is stored as a 32-bit value (4 bytes),
    the scoring matrix requires 1,000,000 bytes!
  • If each traceback symbol is stored as a character
    (8-bit value), the traceback matrix requires
    250,000 bytes
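
The arithmetic above can be checked with a few lines of code. The helper name `matrix_bytes` is hypothetical. Like the slide, it counts n × n cells and ignores the extra boundary row and column.

```python
def matrix_bytes(rows, cols, bytes_per_cell):
    """Memory needed for a rows x cols matrix with fixed-size cells."""
    return rows * cols * bytes_per_cell


if __name__ == "__main__":
    n = 500
    print("cells:", n * n)
    print("scoring matrix bytes (4 bytes per score):", matrix_bytes(n, n, 4))
    print("traceback matrix bytes (1 byte per symbol):", matrix_bytes(n, n, 1))
```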

6
Simple Improvement for Scoring Matrix
  • In reality, only two columns (or rows) of the scoring
    matrix are needed at a time, so its space complexity
    is linear: O(2 min(m,n)) = O(min(m,n))
  • O(min(m,n)) ≈ O(n) for sequences of comparable
    lengths
  • About 2,000 bytes per stored column of 500 scores
    (instead of 1 million for the full matrix)
  • But the traceback still has quadratic space complexity
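
A minimal sketch of the linear-space score computation, assuming a match/mismatch/gap scheme of +1/-1/-1 (not given in the slides). The function name is hypothetical. It returns only the optimal score, since no traceback information is kept.

```python
def alignment_score_linear_space(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score using only two rows of the score matrix."""
    if len(b) > len(a):
        a, b = b, a  # rows then have min(m, n) + 1 entries (scoring is symmetric)
    prev = [j * gap for j in range(len(b) + 1)]  # row 0 of the score matrix
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(diag, prev[j] + gap, curr[j - 1] + gap)
        prev = curr  # the older row is discarded
    return prev[-1]


if __name__ == "__main__":
    print(alignment_score_linear_space("AGTACGCA", "TATGC"))
```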

7
Hirschberg's Divide and Conquer Space Efficient
Algorithm
  • Compute the score matrix (s) between the source
    (0,0) and (n, m/2). Save column m/2 of s.
  • Compute the reverse score matrix (sreverse)
    between the sink (n, m) and (0, m/2). Save
    column m/2 of sreverse.
  • Find the middle vertex (i, m/2): the i that
    maximizes s(i, m/2) + sreverse(n-i, m/2)
    over 0 ≤ i ≤ n
  • Recursively partition the problem into 2 subproblems
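
The middle-vertex step can be sketched as follows, assuming +1/-1/-1 scoring and rows indexed by the first sequence `v` (length n) and columns by the second sequence `w` (length m). The helper names `last_column` and `middle_vertex` are hypothetical.

```python
def last_column(v, w, match=1, mismatch=-1, gap=-1):
    """Scores of aligning each prefix v[:i] with all of w, keeping one column."""
    col = [i * gap for i in range(len(v) + 1)]  # column 0
    for j in range(1, len(w) + 1):
        new = [j * gap] + [0] * len(v)
        for i in range(1, len(v) + 1):
            diag = col[i - 1] + (match if v[i - 1] == w[j - 1] else mismatch)
            new[i] = max(diag, col[i] + gap, new[i - 1] + gap)
        col = new
    return col


def middle_vertex(v, w):
    """Return (i, m/2, score) for a vertex on an optimal path in column m/2."""
    n, mid = len(v), len(w) // 2
    s = last_column(v, w[:mid])                  # column m/2 of s
    s_rev = last_column(v[::-1], w[mid:][::-1])  # column m/2 of sreverse
    best_i = max(range(n + 1), key=lambda i: s[i] + s_rev[n - i])
    return best_i, mid, s[best_i] + s_rev[n - best_i]


if __name__ == "__main__":
    print(middle_vertex("GATTACA", "GATTACA"))
```

The third returned value equals the optimal global alignment score, because the best path must cross column m/2 at some row i.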

8
Pseudo Code of Space-Efficient Alignment Algorithm
  • Path (source, sink)
  • If source and sink are in consecutive columns
  • output the longest path from the source to the
    sink
  • Else
  • middle ← middle vertex between source and sink
  • Path (source, middle)
  • Path (middle, sink)
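
One possible Python rendering of this pseudocode is sketched below. It works on substrings rather than explicit source and sink coordinates, and it assumes +1/-1/-1 scoring. Both choices, along with the helper names, are illustrative assumptions rather than the slides' own implementation.

```python
MATCH, MISMATCH, GAP = 1, -1, -1  # assumed scoring scheme, not from the slides


def pair_score(x, y):
    return MATCH if x == y else MISMATCH


def last_column(v, w):
    """Scores of aligning each prefix v[:i] with all of w, keeping one column."""
    col = [i * GAP for i in range(len(v) + 1)]
    for j in range(1, len(w) + 1):
        new = [j * GAP] + [0] * len(v)
        for i in range(1, len(v) + 1):
            new[i] = max(col[i - 1] + pair_score(v[i - 1], w[j - 1]),
                         col[i] + GAP,
                         new[i - 1] + GAP)
        col = new
    return col


def path(v, w):
    """Return an optimal global alignment (aligned_v, aligned_w) of v and w."""
    if not v:
        return "-" * len(w), w
    if not w:
        return v, "-" * len(v)
    if len(w) == 1:
        # Base case (consecutive columns): pair w with its best partner in v,
        # unless leaving both unpaired (two extra gaps) scores higher.
        k = max(range(len(v)), key=lambda i: pair_score(v[i], w))
        if pair_score(v[k], w) >= 2 * GAP:
            return v, "-" * k + w + "-" * (len(v) - k - 1)
        return v + "-", "-" * len(v) + w
    mid = len(w) // 2
    s = last_column(v, w[:mid])                  # forward scores at column m/2
    s_rev = last_column(v[::-1], w[mid:][::-1])  # reverse scores at column m/2
    n = len(v)
    i = max(range(n + 1), key=lambda k: s[k] + s_rev[n - k])  # middle vertex row
    left = path(v[:i], w[:mid])    # Path(source, middle)
    right = path(v[i:], w[mid:])   # Path(middle, sink)
    return left[0] + right[0], left[1] + right[1]


if __name__ == "__main__":
    top, bottom = path("AGTACGCA", "TATGC")
    print(top)
    print(bottom)
```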

9
Complexity of Space-Efficient Alignment Algorithm
  • Time complexity
  • Equal to the sum of the areas of the rectangles
  • Area + ½ Area + ¼ Area + ... ≤ 2 Area
  • where Area = nm
  • O(2nm) = O(nm)
  • Quadratic time/computation complexity (same as
    before)
  • Space complexity
  • Need to save a column of s and sreverse for each
    computation (but can discard after computing
    middle)
  • O(min(n,m)): if m < n, switch the sequences (or
    save a row of s and sreverse instead)
  • Linear space complexity!!
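
The time bound can be checked numerically by summing the halving areas. This is a simplified sketch of the geometric-series argument, not a cell-by-cell simulation of the recursion. The function name is hypothetical.

```python
def total_area(n, m):
    """Sum Area + Area/2 + Area/4 + ... until the term drops below one cell."""
    total, area = 0.0, float(n * m)
    while area >= 1:
        total += area
        area /= 2  # each recursion level covers about half the previous area
    return total


if __name__ == "__main__":
    n = m = 500
    print(total_area(n, m), "<=", 2 * n * m)
```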

10
Project Teams and Presentation Assignments
(Revised)
  • Base Project (Global Alignment)
  • Miguel and Joseph
  • Extension 0 (Global Alignment - all alignments
    with max score)
  • Mario
  • Extension 1 (Ends-Free Global Alignment)
  • Sherri and Dhonam
  • Extension 2 (Local Alignment)
  • Kung-Hua and Swapna
  • Extension 3 (Affine Gap Penalty)
  • Cory and Nam
  • Extension 4 (Database)
  • Melinda and Dana
  • Extension 5 (Space Efficient Algorithm)
  • Michael and Sarah

11
Workshop
  • Work on Sequence Alignment project
  • Email me a progress report on Tuesday, June 28th
  • Specify the implementation status for each module
  • List each function within a module and specify
    its status
  • Date written
  • Date testing completed
  • Author
  • Include functions in the list that are not
    completed (i.e., not yet written or not fully
    tested). For these cases, write TBD (to be
    determined) in the respective date field.
  • Only one report per group, but cc your partner on
    your e-mail!