A Note on Useful Algorithmic Strategies

File metadata: Title: Finding conserved regions in sequence alignments. Author: Veriton. Last modified by: Kun-Mao Chao. Created: 7/28/2001 12:54:06 AM. Format: PowerPoint presentation.

1
A Note on Useful Algorithmic Strategies

Kun-Mao Chao
Department of Computer Science and Information
Engineering
National Taiwan University, Taiwan
WWW: http://www.csie.ntu.edu.tw/kmchao

2
Greedy Algorithm

A greedy method always makes a locally optimal
(greedy) choice.
The greedy-choice property: a globally optimal
solution can be reached by making greedy choices.
Optimal substructure: an optimal solution contains
optimal solutions to its subproblems.

3
Huffman Codes (1952)
David Huffman (August 9, 1925 – October 7, 1999)
4
Huffman Codes
Expected number of bits per character:
3 × 0.1 + 3 × 0.1 + 2 × 0.3 + 1 × 0.5 = 1.7 (vs. 2 bits by a
simple scheme)
5
An example
Sequence: GTTGTTATCGTTTATGTGGC
By Huffman coding: 0111011100010010111100010110101001
20 characters, 34 bits in total
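The greedy construction behind these numbers can be sketched in Python. This is a minimal sketch, not the slides' own code: the function name `huffman_code` and the tie-breaking rule are assumptions made for illustration. Because the 0/1 labels are arbitrary, the individual bits may differ from the encoding shown above, while the code lengths and the total length agree.

```python
import heapq
from collections import Counter


def huffman_code(text):
    """Return a dict mapping each symbol of `text` to a prefix-free bit string.

    Hypothetical helper for illustration; ties are broken by insertion order.
    """
    counts = Counter(text)
    if not counts:
        return {}
    if len(counts) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(counts)): "0"}

    # Heap entries: (weight, unique tie-breaker, {symbol: partial code}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    next_id = len(heap)

    while len(heap) > 1:
        # Greedy choice: merge the two lightest subtrees.
        w0, _, left = heapq.heappop(heap)
        w1, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w0 + w1, next_id, merged))
        next_id += 1

    return heap[0][2]


sequence = "GTTGTTATCGTTTATGTGGC"
code = huffman_code(sequence)
encoded = "".join(code[ch] for ch in sequence)
print({s: len(c) for s, c in code.items()})
print(len(encoded))  # 34
```

For this sequence T occurs 10 times, G 6 times, and A and C twice each, so the code lengths come out as 1, 2, 3, and 3 bits, matching the expected 1.7 bits per character.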
6
Divide-and-Conquer

Divide the problem into smaller subproblems.
Conquer each subproblem recursively.
Combine the solutions to the child subproblems
into the solution for the parent problem.

7
Merge Sort (Invented in 1938; coded in 1945)
John von Neumann (December 28, 1903 – February 8, 1957)
8
Merge Sort (Merge two solutions into one.)
9
Merge Sort
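The divide, conquer, and combine steps can be sketched for merge sort as follows. This is a minimal illustration under the assumption that the input is a Python list of mutually comparable items; the function names `merge` and `merge_sort` are chosen here and do not come from the slides.

```python
def merge(left, right):
    """Combine step: merge two sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:  # "<=" keeps equal items in their original order
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # at most one of these two tails is non-empty
    merged.extend(right[j:])
    return merged


def merge_sort(items):
    """Divide the list in half, sort each half recursively, then merge."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    return merge(merge_sort(items[:mid]), merge_sort(items[mid:]))


print(merge_sort([9, 2, 5, 3, 7, 11, 8, 10, 13, 6]))
```

Each level of recursion does linear work in `merge`, and there are about log n levels, which gives the familiar O(n log n) running time.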
10
Dynamic Programming

Dynamic programming is a class of solution
methods for solving sequential decision problems
with a compositional cost structure.
Richard Bellman was one of the principal founders
of this approach.

Richard Ernest Bellman (1920–1984)
11
Two key ingredients

Two key ingredients for an optimization problem
to be suitable for a dynamic-programming solution:

1. optimal substructures
Each substructure is optimal. (Principle of
optimality)
2. overlapping subproblems
Subproblems are dependent. (Otherwise, a
divide-and-conquer approach is the choice.)
12
Three basic components

The development of a dynamic-programming
algorithm has three basic components:
1. The recurrence relation (for defining the value
of an optimal solution)
2. The tabular computation (for computing the value
of an optimal solution)
3. The traceback (for delivering an optimal
solution)

13
Fibonacci numbers
The Fibonacci numbers are defined by the
following recurrence:

F_0 = 0, F_1 = 1, and F_i = F_{i-1} + F_{i-2} for i >= 2.

Leonardo of Pisa (c. 1170 – c. 1250)

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144,
233, 377, 610, 987, 1597, 2584, 4181, 6765,
10946, 17711, 28657, 46368, 75025, 121393, ...

14
How to compute F10?

15
Tabular computation

The tabular computation can avoid recomputation.

F0  F1  F2  F3  F4  F5  F6  F7  F8  F9  F10
0   1   1   2   3   5   8   13  21  34  55
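The difference between direct recursion and tabular computation can be sketched as follows. The function names `fib_naive` and `fib_table` are assumptions for illustration; the slides do not give code.

```python
def fib_naive(n):
    """Direct recursion on the recurrence; recomputes the same values many times."""
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)


def fib_table(n):
    """Fill a table F[0..n] from left to right; each entry is computed once."""
    table = [0, 1] + [0] * max(0, n - 1)
    for i in range(2, n + 1):
        table[i] = table[i - 1] + table[i - 2]
    return table[: n + 1]


print(fib_table(10))  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
print(fib_naive(10))  # 55
```

The table version performs n - 1 additions, whereas the number of calls made by the naive version grows exponentially in n.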
16
Longest increasing subsequence (LIS)

The longest increasing subsequence problem is to find a
longest increasing subsequence of a given
sequence of distinct integers a1, a2, ..., an.

e.g. 9 2 5 3 7 11 8 10 13 6

3 7, 7 11, and 7 10 13 are increasing subsequences.
3 5 11 13 is not an increasing subsequence, because 5 precedes 3 in the sequence.
We want to find a longest one.
17
A naive approach for LIS

Let L_i be the length of a longest increasing
subsequence ending at position i.

L_i = 1 + max { L_j : 0 <= j <= i-1 and a_j < a_i }
(use a dummy a_0 = minimum, and L_0 = 0)

a_i   9   2   5   3   7  11   8  10  13   6
L_i   1   1   2   2   3   4   ?
18
A naive approach for LIS

L_i = 1 + max { L_j : 0 <= j <= i-1 and a_j < a_i }

a_i   9   2   5   3   7  11   8  10  13   6
L_i   1   1   2   2   3   4   4   5   6   3

The maximum length is 6.
The subsequence 2, 3, 7, 8, 10, 13 is a longest
increasing subsequence. This method runs in O(n^2)
time.
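The recurrence, the table, and the traceback for this quadratic method can be sketched together. This is an illustrative sketch only: the function name `lis_quadratic` and the `prev` array used for the traceback are assumptions, and 0-based indexing replaces the dummy a_0 of the slide.

```python
def lis_quadratic(a):
    """Return (lengths, one longest increasing subsequence) in O(n^2) time.

    lengths[i] is the length of a longest increasing subsequence ending at a[i].
    """
    n = len(a)
    if n == 0:
        return [], []

    lengths = [1] * n   # every element alone is a subsequence of length 1
    prev = [-1] * n     # predecessor index for the traceback (-1 means none)

    for i in range(n):
        for j in range(i):
            if a[j] < a[i] and lengths[j] + 1 > lengths[i]:
                lengths[i] = lengths[j] + 1
                prev[i] = j

    # Traceback: start from a position with maximum length and follow prev.
    i = max(range(n), key=lengths.__getitem__)
    subsequence = []
    while i != -1:
        subsequence.append(a[i])
        i = prev[i]
    subsequence.reverse()
    return lengths, subsequence


lengths, subsequence = lis_quadratic([9, 2, 5, 3, 7, 11, 8, 10, 13, 6])
print(lengths)      # [1, 1, 2, 2, 3, 4, 4, 5, 6, 3]
print(subsequence)  # [2, 5, 7, 8, 10, 13]
```

With this tie-breaking the traceback returns 2, 5, 7, 8, 10, 13 rather than 2, 3, 7, 8, 10, 13; both have the maximum length 6, since a longest increasing subsequence need not be unique.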
19
An O(n log n) method for LIS

Define BestEnd[k] to be the smallest ending number of an
increasing subsequence of length k.

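Each column shows BestEnd after the corresponding number has been processed (up to 13).

a_i          9   2   5   3   7  11   8  10  13   6
BestEnd[1]   9   2   2   2   2   2   2   2   2
BestEnd[2]           5   3   3   3   3   3   3
BestEnd[3]                   7   7   7   7   7
BestEnd[4]                      11   8   8   8
BestEnd[5]                              10  10
BestEnd[6]                                  13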
20
An O(n log n) method for LIS

Define BestEnd[k] to be the smallest ending number of an
increasing subsequence of length k.

a_i          9   2   5   3   7  11   8  10  13   6
BestEnd[1]   9   2   2   2   2   2   2   2   2   2
BestEnd[2]           5   3   3   3   3   3   3   3
BestEnd[3]                   7   7   7   7   7   6
BestEnd[4]                      11   8   8   8   8
BestEnd[5]                              10  10  10
BestEnd[6]                                  13  13

For each position, we perform a binary search to
update BestEnd. Therefore, the running time is
O(n log n).
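The BestEnd update can be sketched with Python's standard `bisect` module. This is a minimal sketch assuming distinct integers, as in the slides; the function name `lis_best_end` is hypothetical, and the list is 0-based, so `best_end[k - 1]` plays the role of BestEnd for length k.

```python
from bisect import bisect_left


def lis_best_end(a):
    """Return the final BestEnd list; its length is the LIS length.

    best_end[k - 1] is the smallest ending number of an increasing
    subsequence of length k seen so far. The list stays sorted, so a
    binary search finds the entry to update in O(log n) time.
    """
    best_end = []
    for x in a:
        k = bisect_left(best_end, x)  # first position whose value is >= x
        if k == len(best_end):
            best_end.append(x)        # x extends the longest subsequence so far
        else:
            best_end[k] = x           # x is a smaller ending for length k + 1
    return best_end


best_end = lis_best_end([9, 2, 5, 3, 7, 11, 8, 10, 13, 6])
print(best_end)       # [2, 3, 6, 8, 10, 13]
print(len(best_end))  # 6
```

The returned list matches the last column of the table, but it is not itself an increasing subsequence of the input (6 appears after 13). Recovering an actual subsequence would need extra bookkeeping, which this sketch omits.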
21
Binary search

Given an ordered sequence x1, x2, ..., xn, where
x1 < x2 < ... < xn, and a number y, a binary search
finds the largest xi such that xi < y in O(log n)
time.

[Figure: the search range shrinks from n to n/2 to n/4, ...]
22
Binary search

How many steps does a binary search take to reduce the
problem size to 1?
n → n/2 → n/4 → n/8 → n/16 → ... → 1

How many steps? O(log n) steps.
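The search described on these slides can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `largest_less_than` is chosen here, indices are 0-based, and -1 is returned when no element is smaller than y.

```python
def largest_less_than(xs, y):
    """Return the index of the largest element of sorted `xs` that is < y.

    Returns -1 if every element is >= y. Each iteration halves the
    remaining range, so the loop runs O(log n) times.
    """
    lo, hi = 0, len(xs)
    # Invariant: every element of xs[:lo] is < y, every element of xs[hi:] is >= y.
    while lo < hi:
        mid = (lo + hi) // 2
        if xs[mid] < y:
            lo = mid + 1
        else:
            hi = mid
    return lo - 1


xs = [2, 3, 7, 8, 10, 13]
print(largest_less_than(xs, 6))   # 1  (xs[1] == 3)
print(largest_less_than(xs, 1))   # -1 (no element is smaller than 1)
```

In the O(n log n) LIS method, the entry to overwrite is the one immediately after the index this search returns.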
23
Longest Common Subsequence (LCS)

A subsequence of a sequence S is obtained by
deleting zero or more symbols from S. For
example, the following are all subsequences of
"president": pred, sdn, predent.
The longest common subsequence problem is to find
a maximum-length common subsequence between two
sequences.
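The two definitions can be made concrete with a short sketch. The subsequence test follows directly from the definition above. The `lcs` function uses the standard textbook dynamic-programming recurrence, which is an assumption here and not necessarily the formulation used on the following slides; both function names and the second test word, "providence", are chosen for illustration.

```python
def is_subsequence(sub, s):
    """True if `sub` can be obtained from `s` by deleting zero or more symbols."""
    it = iter(s)
    # "ch in it" advances the iterator, so matches must occur in order.
    return all(ch in it for ch in sub)


def lcs(s, t):
    """Return one longest common subsequence of strings s and t."""
    m, n = len(s), len(t)
    # table[i][j] = length of an LCS of s[:i] and t[:j]
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s[i - 1] == t[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])

    # Traceback from the bottom-right corner to recover one optimal solution.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if s[i - 1] == t[j - 1]:
            out.append(s[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


print(all(is_subsequence(w, "president") for w in ["pred", "sdn", "predent"]))  # True
print(lcs("president", "providence"))  # priden
```

The table takes O(mn) time and space for sequences of lengths m and n.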

24
LCS