# Space Efficient Alignment Algorithms - PowerPoint PPT Presentation

## Space Efficient Alignment Algorithms

Slides: 12
Provided by: lanct9
Transcript and Presenter's Notes

1
Space Efficient Alignment Algorithms
• Dr. Nancy Warter-Perez
• June 24, 2005

2
Outline
• Algorithm complexity
• Complexity of dynamic programming alignment
algorithms
• Hirschberg's Divide and Conquer algorithm

3
Algorithm Complexity
• Indicates the space and time (computational)
efficiency of a program
• Space complexity refers to how much memory is
required to execute the algorithm
• Time complexity refers to how long it will take
to execute (compute) the algorithm
• Generally written in Big-O notation
• O represents the complexity (order)
• n represents the size of the data set
• Examples
• O(n): order n, linear complexity
• O(n^2): order n squared, quadratic complexity
• Constants and lower orders are ignored
• O(2n) = O(n) and O(n^2 + n + 1) = O(n^2)

4
Complexity of Dynamic Programming Algorithms for
Global/Local Alignment
• Time complexity O(mn)
• For each cell in the score matrix, perform 3
operations
• Compute Up, Left, and Diagonal scores
• O(3mn) = O(mn)
• Space complexity O(mn)
• Size of scoring matrix: mn
• Size of trace back matrix: mn
• O(2mn) = O(mn)
• where m and n are the lengths of the sequences
being aligned.
• Since m ≈ n, O(n^2): quadratic complexity!
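
The quadratic cost can be seen in a minimal Python sketch of full-matrix global alignment. The scoring scheme (match +1, mismatch -1, gap -1), the function name `global_align`, and the trace back symbols D/U/L are assumptions chosen for illustration; the slides do not specify them.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment that stores full score and trace back matrices."""
    n, m = len(a), len(b)
    # Both matrices have (n + 1) x (m + 1) cells: O(mn) space.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    trace = [[""] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
        trace[i][0] = "U"
    for j in range(1, m + 1):
        score[0][j] = j * gap
        trace[0][j] = "L"
    # Three candidate scores per cell: O(3mn) = O(mn) time.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap
            left = score[i][j - 1] + gap
            best = max(diag, up, left)
            score[i][j] = best
            trace[i][j] = "D" if best == diag else ("U" if best == up else "L")
    # Follow the trace back matrix from the sink (n, m) to the source (0, 0).
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        move = trace[i][j]
        if move == "D":
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif move == "U":
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


print(global_align("AC", "AGC"))
```

Under the assumed scores, aligning "AC" with "AGC" needs one gap and two matches.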

5
Memory Requirements
• For a sequence of 200-500 amino acids or
nucleotides
• O(n^2): 500^2 = 250,000 cells
• If each score is stored as a 32-bit value (4 bytes),
it requires 1,000,000 bytes to represent the
scoring matrix!
• If each trace back symbol is stored as a character
(8-bit value), it requires 250,000 bytes to
represent the trace back matrix
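
The arithmetic on this slide can be reproduced with a few lines of Python. The helper name `matrix_bytes` is hypothetical, and like the slide it counts n × n cells, ignoring the extra boundary row and column.

```python
def matrix_bytes(m, n, bytes_per_cell):
    """Memory needed for an m x n matrix with a fixed cell size."""
    return m * n * bytes_per_cell


score_bytes = matrix_bytes(500, 500, 4)  # 32-bit scores
trace_bytes = matrix_bytes(500, 500, 1)  # 8-bit trace back symbols
print(score_bytes, trace_bytes)
```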

6
Simple Improvement for Scoring Matrix
• In reality, the space complexity of the scoring
matrix is only linear, i.e., O(2 min(m,n)) =
O(min(m,n))
• O(min(m,n)) ≈ O(n) for sequences of comparable
lengths
• About 2,000 bytes (instead of 1 million)
• But, trace back still quadratic space complexity
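
One way to realize this improvement is to keep only the previous and current rows of the scoring matrix. The sketch below is an illustration under an assumed scoring scheme (match +1, mismatch -1, gap -1); the name `alignment_score` is hypothetical. It returns only the optimal score, since no trace back information is kept.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score using two rows of the scoring matrix."""
    if len(b) > len(a):
        a, b = b, a  # rows then have min(m, n) + 1 entries
    prev = [j * gap for j in range(len(b) + 1)]  # row 0
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # first column of row i
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr  # the older row is no longer needed
    return prev[-1]


print(alignment_score("AC", "AGC"))
```

Swapping the sequences is safe here because the assumed scoring scheme is symmetric.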

7
Hirschberg's Divide and Conquer Space Efficient
Algorithm
• Compute the score matrix (s) between the source
(0,0) and (n, m/2). Save the m/2 column of s.
Compute the reverse score matrix (s_reverse)
between the sink (n, m) and (0, m/2). Save the
m/2 column of s_reverse.
• Find the middle vertex (i, m/2), where i maximizes
s(i, m/2) + s_reverse(n-i, m/2) over 0 ≤ i ≤ n
• Recursively partition problem into 2 subproblems
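
The middle-vertex step can be sketched in Python as follows. The scoring scheme (match +1, mismatch -1, gap -1) and the names `column_scores` and `find_middle` are assumptions for illustration. The reverse scores are obtained here by running the same column computation on the reversed sequences.

```python
def column_scores(a, b, match=1, mismatch=-1, gap=-1):
    """Scores of aligning every prefix a[:i] with all of b, keeping one column."""
    col = [i * gap for i in range(len(a) + 1)]  # column 0: prefixes of a vs. ""
    for ch in b:  # advance one column per character of b
        new = [col[0] + gap]
        for i in range(1, len(a) + 1):
            diag = col[i - 1] + (match if a[i - 1] == ch else mismatch)
            new.append(max(diag, col[i] + gap, new[i - 1] + gap))
        col = new
    return col


def find_middle(a, b):
    """Return (i, m // 2, score) for the middle vertex of an optimal path."""
    n, mid = len(a), len(b) // 2
    s = column_scores(a, b[:mid])                  # s(i, m/2)
    s_rev = column_scores(a[::-1], b[mid:][::-1])  # s_reverse(k, m/2), k = n - i
    totals = [s[i] + s_rev[n - i] for i in range(n + 1)]
    i = max(range(n + 1), key=totals.__getitem__)
    return i, mid, totals[i]


print(find_middle("AC", "AGC"))
```

The returned total equals the optimal global alignment score, because an optimal path must cross column m/2 at some row i.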

8
Pseudo Code of Space-Efficient Alignment Algorithm
• Path (source, sink)
• If source and sink are in consecutive columns
• output the longest path from the source to the
sink
• Else
• middle ← middle vertex between source and sink
• Path (source, middle)
• Path (middle, sink)
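
The pseudo code above might be turned into runnable Python along these lines. This is a sketch, not the course's reference implementation: the scoring constants, the function names, and the handling of the base case (a single remaining column) are assumptions, and the base case assumes MATCH >= 2 * GAP.

```python
MATCH, MISMATCH, GAP = 1, -1, -1  # assumed scoring scheme, not from the slides


def column_scores(a, b):
    """Scores of aligning every prefix a[:i] with all of b, keeping one column."""
    col = [i * GAP for i in range(len(a) + 1)]
    for ch in b:
        new = [col[0] + GAP]
        for i in range(1, len(a) + 1):
            diag = col[i - 1] + (MATCH if a[i - 1] == ch else MISMATCH)
            new.append(max(diag, col[i] + GAP, new[i - 1] + GAP))
        col = new
    return col


def path(a, b):
    """Return an optimal global alignment (aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    if n == 0:
        return "-" * m, b
    if m == 0:
        return a, "-" * n
    if m == 1:  # source and sink are in consecutive columns
        k = a.find(b)
        if k < 0 and MISMATCH < 2 * GAP:
            return a + "-", "-" * n + b  # two gaps beat a mismatch
        k = max(k, 0)  # pair b with its match, or with a[0] if none exists
        return a, "-" * k + b + "-" * (n - k - 1)
    mid = m // 2
    s = column_scores(a, b[:mid])
    s_rev = column_scores(a[::-1], b[mid:][::-1])
    i = max(range(n + 1), key=lambda r: s[r] + s_rev[n - r])  # middle vertex
    left_a, left_b = path(a[:i], b[:mid])    # Path(source, middle)
    right_a, right_b = path(a[i:], b[mid:])  # Path(middle, sink)
    return left_a + right_a, left_b + right_b


print(path("AC", "AGC"))
```

Only the two score columns are alive during each call, and they can be discarded before recursing.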

9
Complexity of Space-Efficient Alignment Algorithm
• Time complexity
• Equal to the sum of the areas of the rectangles
• Area + ½ Area + ¼ Area + ... ≤ 2 Area
• where Area = nm
• O(2nm) = O(nm)
• Quadratic time/computation complexity (same as
before)
• Space complexity
• Need to save a column of s and s_reverse for each
computation (but can discard after computing
middle)
• O(min(n,m)): if m < n, switch the sequences (or
save a row of s and s_reverse instead)
• Linear space complexity!!
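
The time bound can be checked with a short derivation. Once the middle vertex (i, m/2) is fixed, the two subproblems cover rectangles of sizes i × m/2 and (n-i) × m/2, so together they have half the area of the parent rectangle. Summing over all levels of the recursion gives a geometric series:

$$
nm + \frac{nm}{2} + \frac{nm}{4} + \cdots \;\le\; nm \sum_{k=0}^{\infty} \frac{1}{2^{k}} \;=\; 2nm
$$

The total work is therefore at most twice that of filling the full matrix once, which is O(nm).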

10
Project Teams and Presentation Assignments
(Revised)
• Base Project (Global Alignment)
• Miguel and Joseph
• Extension 0 (Global Alignment - all alignments
with max score)
• Mario
• Extension 1 (Ends-Free Global Alignment)
• Sherri and Dhonam
• Extension 2 (Local Alignment)
• Kung-Hua and Swapna
• Extension 3 (Affine Gap Penalty)
• Cory and Nam
• Extension 4 (Database)
• Melinda and Dana
• Extension 5 (Space Efficient Algorithm)
• Michael and Sarah

11
Workshop
• Work on Sequence Alignment project
• Email me a progress report on Tuesday, June 28th
• Specify the implementation status for each module
• List each function within a module and specify
its status
• Date written
• Date testing completed
• Author
• Include functions in the list that are not
completed (i.e., not yet written or not fully
tested). For these cases, write TBD (to be
determined) in the respective date field.
• Only one report per group, but cc your partner on