A Note on Useful Algorithmic Strategies

Title: Finding conserved regions in sequence alignments Author: Veriton Last modified by: Kun-Mao Chao Created Date: 7/28/2001 12:54:06 AM Document presentation format – PowerPoint PPT presentation


1
A Note on Useful Algorithmic Strategies
  • Kun-Mao Chao
  • Department of Computer Science and Information
    Engineering
  • National Taiwan University, Taiwan
  • WWW: http://www.csie.ntu.edu.tw/~kmchao

2
Greedy Algorithm
  • A greedy method always makes a locally optimal
    (greedy) choice.
  • The greedy-choice property: a globally optimal
    solution can be reached by making greedy choices.
  • Optimal substructures: an optimal solution contains
    optimal solutions to its subproblems.

3
Huffman Codes (1952)
David Huffman (August 9, 1925 – October 7, 1999)
4
Huffman Codes
Expected number of bits per character:
3 × 0.1 + 3 × 0.1 + 2 × 0.3 + 1 × 0.5 = 1.7
(vs. 2 bits per character with a simple fixed-length scheme)
5
An example
Sequence: GTTGTTATCGTTTATGTGGC
By Huffman coding: 0111011100010010111100010110101001
20 characters; 34 bits in total (T = 1, G = 01, A = 000, C = 001)
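
The example above can be reproduced with a small sketch of Huffman's greedy construction: repeatedly merge the two least frequent subtrees. The function name `huffman_code` and the tie-breaking rule are assumptions made for illustration, so the exact codewords may differ from the slide's, although the code lengths and the 34-bit total should agree.

```python
import heapq
from collections import Counter


def huffman_code(text):
    """Build a prefix code by greedily merging the two lightest subtrees."""
    freq = Counter(text)
    # Heap entries: (weight, unique tie-breaker, {symbol: code so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    if len(heap) == 1:
        # Degenerate input with a single distinct symbol: give it one bit.
        return {s: "0" for s in heap[0][2]}
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, codes1 = heapq.heappop(heap)  # lightest subtree
        w2, _, codes2 = heapq.heappop(heap)  # second lightest subtree
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


sequence = "GTTGTTATCGTTTATGTGGC"
codes = huffman_code(sequence)
encoded = "".join(codes[ch] for ch in sequence)
print(len(sequence), "characters,", len(encoded), "bits in total")
```

The most frequent symbol (T) receives the shortest codeword, which is where the saving over the 2-bit fixed-length scheme comes from.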
6
Divide-and-Conquer
  1. Divide the problem into smaller subproblems.
  2. Conquer each subproblem recursively.
  3. Combine the solutions to the child subproblems
    into the solution for the parent problem.

7
Merge Sort (invented in 1938; coded in 1945)
John von Neumann (December 28, 1903 – February 8, 1957)
8
Merge Sort (merge two solutions into one)
9
Merge Sort
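
As a minimal sketch of the three divide-and-conquer steps applied to sorting, the code below splits the list, sorts each half recursively, and merges the two sorted halves. The function names `merge` and `merge_sort` are chosen here for illustration and do not come from the slides.

```python
def merge(left, right):
    """Combine step: merge two sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # at most one of these two is non-empty
    merged.extend(right[j:])
    return merged


def merge_sort(items):
    """Divide the list, conquer each half recursively, then combine."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    return merge(merge_sort(items[:mid]), merge_sort(items[mid:]))


print(merge_sort([9, 2, 5, 3, 7, 11, 8, 10, 13, 6]))
```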
10
Dynamic Programming
  • Dynamic programming is a class of solution
    methods for solving sequential decision problems
    with a compositional cost structure.
  • Richard Bellman was one of the principal founders
    of this approach.

Richard Ernest Bellman (1920–1984)
11
Two key ingredients
  • Two key ingredients for an optimization problem
    to be suitable for a dynamic-programming solution:

  1. Optimal substructures: each substructure is
    optimal. (Principle of optimality)
  2. Overlapping subproblems: subproblems are
    dependent. (Otherwise, a divide-and-conquer
    approach is the choice.)
12
Three basic components
  • The development of a dynamic-programming
    algorithm has three basic components:
  1. The recurrence relation (for defining the value
    of an optimal solution)
  2. The tabular computation (for computing the value
    of an optimal solution)
  3. The traceback (for delivering an optimal
    solution)

13
Fibonacci numbers
The Fibonacci numbers are defined by the
following recurrence:

$F_0 = 0$, $F_1 = 1$, and $F_i = F_{i-1} + F_{i-2}$ for $i > 1$.

Leonardo of Pisa (c. 1170 – c. 1250)

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144,
233, 377, 610, 987, 1597, 2584, 4181, 6765,
10946, 17711, 28657, 46368, 75025, 121393, ...

14
How to compute F10?


15
Tabular computation
  • The tabular computation can avoid recomputation.

F0  F1  F2  F3  F4  F5  F6  F7  F8  F9  F10
0   1   1   2   3   5   8   13  21  34  55
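
The table can be filled in left to right, so each entry is computed once from the two entries before it. The following is a minimal sketch; the function name `fib_table` is an assumption for illustration.

```python
def fib_table(n):
    """Return the list [F0, F1, ..., Fn], filling the table left to right."""
    table = [0, 1]
    for i in range(2, n + 1):
        # Each entry reuses the two stored values instead of recomputing them.
        table.append(table[i - 1] + table[i - 2])
    return table[: n + 1]


print(fib_table(10))
```

A plain recursive implementation would recompute the same subproblems many times, while this loop performs one addition per entry.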
16
Longest increasing subsequence(LIS)
  • The longest increasing subsequence (LIS) problem
    is to find a longest increasing subsequence of a
    given sequence of distinct integers a1 a2 ... an.
  • e.g. 9 2 5 3 7 11 8 10 13 6
  • 3 7, 7 10 13, and 7 11 are increasing
    subsequences.
  • 3 5 11 13 is not an increasing subsequence of it,
    because 5 precedes 3 in the sequence.

We want to find a longest one.
17
A naive approach for LIS
  • Let $L_i$ be the length of a longest increasing
    subsequence ending at position i.

$L_i = 1 + \max_{0 \le j < i} \{ L_j : a_j < a_i \}$
(use a dummy $a_0$ = minimum, and $L_0 = 0$)

a_i   9   2   5   3   7  11   8  10  13   6
L_i   1   1   2   2   3   4   ?
18
A naive approach for LIS
  • $L_i = 1 + \max_{0 \le j < i} \{ L_j : a_j < a_i \}$

a_i   9   2   5   3   7  11   8  10  13   6
L_i   1   1   2   2   3   4   4   5   6   3

The maximum length is 6 (reached at 13).
The subsequence 2, 3, 7, 8, 10, 13 is a longest
increasing subsequence. This method runs in $O(n^2)$
time.
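
A minimal sketch of this quadratic method, including a traceback through predecessor links, might look as follows. The function name `lis_quadratic` and the `prev` array are assumptions for illustration; when several longest subsequences exist, the one returned depends on how ties are broken.

```python
def lis_quadratic(a):
    """Return (lengths, one longest increasing subsequence) in O(n^2) time."""
    n = len(a)
    if n == 0:
        return [], []
    length = [1] * n   # length[i]: LIS length ending at position i
    prev = [-1] * n    # prev[i]: predecessor of i on such a subsequence
    for i in range(n):
        for j in range(i):
            if a[j] < a[i] and length[j] + 1 > length[i]:
                length[i] = length[j] + 1
                prev[i] = j
    # Traceback from a position holding the maximum length.
    end = max(range(n), key=length.__getitem__)
    result = []
    while end != -1:
        result.append(a[end])
        end = prev[end]
    result.reverse()
    return length, result


lengths, subsequence = lis_quadratic([9, 2, 5, 3, 7, 11, 8, 10, 13, 6])
print(lengths)
print(subsequence)
```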
19
An O(n log n) method for LIS
  • Define BestEnd_k to be the smallest ending number
    of an increasing subsequence of length k.

State after each element is processed (6 not yet processed):

Input      9   2   5   3   7  11   8  10  13   6
BestEnd1   9   2   2   2   2   2   2   2   2
BestEnd2           5   3   3   3   3   3   3
BestEnd3                   7   7   7   7   7
BestEnd4                      11   8   8   8
BestEnd5                              10  10
BestEnd6                                  13
20
An O(n log n) method for LIS
  • Define BestEnd_k to be the smallest ending number
    of an increasing subsequence of length k.

State after each element is processed:

Input      9   2   5   3   7  11   8  10  13   6
BestEnd1   9   2   2   2   2   2   2   2   2   2
BestEnd2           5   3   3   3   3   3   3   3
BestEnd3                   7   7   7   7   7   6
BestEnd4                      11   8   8   8   8
BestEnd5                              10  10  10
BestEnd6                                  13  13

For each position, we perform a binary search to
update BestEnd. Therefore, the running time is
O(n log n).
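
The BestEnd update can be sketched with Python's standard `bisect` module. The function name `lis_length` is an assumption for illustration, and the list is 0-indexed, so `best_end[k - 1]` plays the role of BestEnd_k.

```python
from bisect import bisect_left


def lis_length(a):
    """Return (LIS length, final BestEnd list) in O(n log n) time."""
    best_end = []  # always strictly increasing, so binary search applies
    for x in a:
        # First slot whose value is >= x: x improves (or extends) that length.
        k = bisect_left(best_end, x)
        if k == len(best_end):
            best_end.append(x)   # x extends the longest subsequence so far
        else:
            best_end[k] = x      # x is a smaller ending for length k + 1
    return len(best_end), best_end


length, best_end = lis_length([9, 2, 5, 3, 7, 11, 8, 10, 13, 6])
print(length, best_end)
```

Note that the final BestEnd list is not itself a longest increasing subsequence in general; recovering one requires keeping predecessor links as well.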
21
Binary search
  • Given an ordered sequence x1 x2 ... xn, where
    x1 < x2 < ... < xn, and a number y, a binary search
    finds the largest xi such that xi < y in O(log n)
    time.

(Figure: the search range shrinks from n to n/2 to n/4, ...)
22
Binary search
  • How many steps does a binary search take to reduce
    the problem size to 1?
    n → n/2 → n/4 → n/8 → n/16 → ... → 1

How many steps? O(log n) steps.
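
A minimal sketch of this search, halving the candidate range at each step, is shown below. The function name `largest_below` is an assumption for illustration; it returns `None` when no element is smaller than y.

```python
def largest_below(xs, y):
    """Return the largest element of sorted xs that is < y, or None."""
    lo, hi = 0, len(xs)
    # Invariant: every element of xs[:lo] is < y, every element of xs[hi:] is >= y.
    while lo < hi:
        mid = (lo + hi) // 2
        if xs[mid] < y:
            lo = mid + 1   # discard the left half, including mid
        else:
            hi = mid       # discard the right half, including mid
    return xs[lo - 1] if lo > 0 else None


print(largest_below([2, 3, 7, 8, 10, 13], 6))
```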
23
Longest Common Subsequence (LCS)
  • A subsequence of a sequence S is obtained by
    deleting zero or more symbols from S. For
    example, the following are all subsequences of
    "president": "pred", "sdn", "predent".
  • The longest common subsequence problem is to find
    a maximum-length common subsequence between two
    sequences.

24
LCS
  • For instance,
  • Sequence 1: president
  • Sequence 2: providence
  • Its LCS is "priden".

president
providence
25
LCS
  • Another example:
  • Sequence 1: algorithm
  • Sequence 2: alignment
  • One of its LCSs is "algm".

a l g o r i t h m
a l i g n m e n t
26
How to compute LCS?
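  • Let $A = a_1 a_2 \ldots a_m$ and $B = b_1 b_2 \ldots b_n$.
  • len(i, j): the length of an LCS between
    $a_1 a_2 \ldots a_i$ and $b_1 b_2 \ldots b_j$.
  • With proper initializations, len(i, j) can be
    computed as follows.

$$
\mathrm{len}(i, j) =
\begin{cases}
0 & \text{if } i = 0 \text{ or } j = 0, \\
\mathrm{len}(i-1, j-1) + 1 & \text{if } i, j > 0 \text{ and } a_i = b_j, \\
\max\{\mathrm{len}(i-1, j),\ \mathrm{len}(i, j-1)\} & \text{otherwise.}
\end{cases}
$$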
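
A minimal sketch of the tabular computation and traceback for LCS follows, using the standard recurrence: len(i, j) is 0 when i or j is 0, len(i-1, j-1) + 1 when the last symbols match, and the larger of len(i-1, j) and len(i, j-1) otherwise. The function name `lcs` and the tie-breaking in the traceback are assumptions for illustration, so a different but equally long subsequence may be returned when several exist.

```python
def lcs(a, b):
    """Return one longest common subsequence of strings a and b."""
    m, n = len(a), len(b)
    # length[i][j]: LCS length of a[:i] and b[:j]; row 0 and column 0 stay 0.
    length = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                length[i][j] = length[i - 1][j - 1] + 1
            else:
                length[i][j] = max(length[i - 1][j], length[i][j - 1])
    # Traceback from the bottom-right corner of the table.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif length[i - 1][j] >= length[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


print(lcs("president", "providence"))
print(len(lcs("algorithm", "alignment")))
```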

31
Longest Common Increasing Subsequence
  • Proposed by Yang, Huang, and Chao
    (Information Processing Letters, 2005)
  1. 2 5 3 7 11 8 10 13 6

6 5 2 8 3 7 4 10 1 13