CS222 Algorithms First Semester 2003/2004

About This Presentation

Description:

Title: CS222 Algorithms Lecture 7 String Matching 2 + Greedy Approach
Author: Sanath Jayasena
Last modified by: Sanath Jayasena
Created Date: 9/7/2003 3:36:19 PM

Slides: 24

Transcript and Presenter's Notes

1
CS222 Algorithms, First Semester 2003/2004

Dr. Sanath Jayasena
Dept. of Computer Science & Eng.
University of Moratuwa
Lecture 7 (28/10/2003)
String Matching Part 2
Greedy Approach

2
Overview

Previous lecture: String Matching Part 1
Naïve Algorithm, Rabin-Karp Algorithm
This lecture:
String Matching Part 2
String Matching using Finite Automata
Knuth-Morris-Pratt (KMP) Algorithm
Greedy Approach to Algorithm Design

3
String Matching

PART 2

4
Finite Automata

A finite automaton M is a 5-tuple (Q, q0, A, Σ,
δ), where
Q is a finite set of states
q0 ∈ Q is the start state
A ⊆ Q is a set of accepting states
Σ is a finite input alphabet
δ is the transition function that gives the next
state for a given current state and input

5
How a Finite Automaton Works

The finite automaton M begins in state q0
Reads the characters of its input string (drawn from Σ) one at a time
If M is in state q and reads input character a, M
moves to state δ(q, a)
If its current state q is in A, M is said to have
accepted the string read so far
An input string that is not accepted is said to
be rejected

6
Example

Q = {0, 1}, q0 = 0, A = {1}, Σ = {a, b}
δ(q, a) shown in the transition table/diagram
This accepts strings that end in an odd number of
a's, e.g., abbaaa is accepted, aa is rejected

Transition table:
state   input a   input b
0       1         0
1       0         0

Transition diagram: state 0 moves to state 1 on a and stays at 0 on b; state 1 moves to state 0 on a or b
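
The two-state example can be simulated in a few lines of Python. This is only an illustrative sketch: the function name accepts and the nested-dictionary encoding of δ are assumptions made for this example, not part of the lecture.

```python
def accepts(delta, start, accepting, text):
    """Run a finite automaton over text; True if it ends in an accepting state."""
    q = start
    for ch in text:
        q = delta[q][ch]  # follow the transition for the character just read
    return q in accepting


# delta for the example: reading 'a' toggles between 0 and 1, reading 'b' resets to 0.
DELTA = {
    0: {"a": 1, "b": 0},
    1: {"a": 0, "b": 0},
}

print(accepts(DELTA, 0, {1}, "abbaaa"))  # True  (ends in three a's)
print(accepts(DELTA, 0, {1}, "aa"))      # False (ends in two a's)
```

State 1 therefore means "the input read so far ends in an odd number of a's", which is why it is the only accepting state.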
7
String-Matching Automata

Given the pattern P[1..m], build a finite
automaton M
The state set is Q = {0, 1, 2, ..., m}
The start state is 0
The only accepting state is m
Time to build M can be large if Σ is large

8
String-Matching Automata contd

Scan the text string T[1..n] to find all
occurrences of the pattern P[1..m]
String matching is efficient: Θ(n)
Each character is examined exactly once
Constant time for each character
But time to compute δ is O(m |Σ|)
δ has O(m |Σ|) entries
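
One way to fill the (m + 1) × |Σ| table within the O(m |Σ|) bound is sketched below in Python. It is not necessarily the construction used in the lecture: it is a common incremental method that tracks a "restart" state x. The names build_delta and alphabet are assumptions for this example, and the pattern is assumed to use only characters from the alphabet.

```python
def build_delta(pattern, alphabet):
    """Return delta as a list of dicts: delta[q][c] is the next state.

    State q means "the last q characters read equal pattern[:q]".
    Assumes every character of pattern appears in alphabet.
    """
    m = len(pattern)
    delta = [{c: 0 for c in alphabet}]   # state 0: every mismatch stays at 0
    if m > 0:
        delta[0][pattern[0]] = 1
    x = 0  # state reached after reading pattern[1:q], i.e. the fallback state
    for q in range(1, m + 1):
        delta.append(dict(delta[x]))     # on a mismatch, behave like state x
        if q < m:
            delta[q][pattern[q]] = q + 1  # on a match, advance
            x = delta[x][pattern[q]]      # update the fallback state
    return delta


delta = build_delta("ababaca", "abc")
print(delta[5])  # {'a': 1, 'b': 4, 'c': 6}
```

Each of the m + 1 states copies one row of |Σ| entries, which gives the O(m |Σ|) construction time and table size.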

9
Algorithm

Input: Text string T[1..n], δ and m
Result: All valid shifts displayed
FINITE-AUTOMATON-MATCHER (T, m, δ)
  n ← length[T]
  q ← 0
  for i ← 1 to n
    q ← δ(q, T[i])
    if q = m
      print "pattern occurs with shift" i − m
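
The matcher translates almost line for line into Python. In this sketch the function returns the shifts instead of printing them, and δ is supplied as a list of dictionaries; both choices, and the hand-built table DELTA_AB for the pattern "ab", are assumptions made for the example.

```python
def finite_automaton_matcher(text, m, delta):
    """Return every shift s such that the pattern occurs at text[s:s + m]."""
    shifts = []
    q = 0
    for i, ch in enumerate(text, start=1):  # i is 1-based, as in the pseudocode
        q = delta[q][ch]
        if q == m:
            shifts.append(i - m)
    return shifts


# Hand-built transition table for the pattern "ab" over the alphabet {a, b}.
DELTA_AB = [
    {"a": 1, "b": 0},  # state 0: nothing matched yet
    {"a": 1, "b": 2},  # state 1: "a" matched
    {"a": 1, "b": 0},  # state 2: "ab" matched (accepting)
]

print(finite_automaton_matcher("abab", 2, DELTA_AB))  # [0, 2]
```

The loop does one table lookup per text character, which is the Θ(n) matching time once δ is available.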

10
Knuth-Morris-Pratt (KMP) Method

Avoids computing δ (transition function)
Instead computes a prefix function π in O(m) time
π has only m entries
Prefix function stores info about how the pattern
matches against shifts of itself
Can avoid testing useless shifts

11
Terminology/Notations

String w is a prefix of string x if x = wy for
some string y (e.g., srilan is a prefix of srilanka)
String w is a suffix of string x if x = yw for
some string y (e.g., anka is a suffix of srilanka)
The k-character prefix P[1..k] of the pattern
P[1..m] is denoted by Pk
E.g., P0 = ε, Pm = P = P[1..m]

12
Prefix Function for a Pattern

Given that the pattern prefix P[1..q] matches text
characters T[s+1..s+q], what is the least
shift s' > s such that
P[1..k] = T[s'+1..s'+k], where s'+k = s+q?
At the new shift s', there is no need to compare the first
k characters of P with the corresponding characters
of T,
since we know that they match

13
Prefix Function Example 1
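T: b a c b a b a b a a b c b a
P (at shift s): a b a b a c a, with its first q = 5 characters matching T
P (at shift s' = s + 2): a b a b a c a, with its first k = 3 characters matching T
Pq = P5: a b a b a
Pk = P3: a b a
Compare the pattern against itself: the longest prefix of
P that is also a proper suffix of P5 is P3, so π[5] = 3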
14
Prefix Function Example 2
i     1 2 3 4 5 6 7 8 9 10
P[i]  a b a b a b a b c a
π[i]  0 0 1 2 3 4 5 6 0 1
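
The table can be reproduced with a short Python sketch of the standard prefix-function computation. The function name compute_prefix is an assumption; the returned list carries an unused entry at index 0 so that pi[q] lines up with the 1-based π[q] on the slides.

```python
def compute_prefix(pattern):
    """Return pi where pi[q] is the length of the longest proper prefix of
    pattern[:q] that is also a suffix of pattern[:q]. pi[0] is a placeholder."""
    m = len(pattern)
    pi = [0] * (m + 1)
    k = 0  # length of the prefix currently matched against the pattern itself
    for q in range(2, m + 1):
        while k > 0 and pattern[k] != pattern[q - 1]:
            k = pi[k]          # fall back to the next shorter candidate
        if pattern[k] == pattern[q - 1]:
            k += 1
        pi[q] = k
    return pi


print(compute_prefix("ababababca")[1:])  # [0, 0, 1, 2, 3, 4, 5, 6, 0, 1]
print(compute_prefix("ababaca")[5])      # 3
```

The value k rises by at most one per iteration and every fallback lowers it, so the total work is O(m).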
15
Knuth-Morris-Pratt (KMP) Algorithm

Information stored in prefix function
Can speed up both the naïve algorithm and the
finite-automaton matcher
KMP Algorithm on the board
2 parts: KMP-MATCHER, PREFIX
Running time
PREFIX takes O(m)
KMP-MATCHER takes O(m+n)
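
Since the algorithm itself was written on the board, the following Python sketch is only a reconstruction of the usual textbook formulation, not a transcript of what was presented. The name kmp_search and the decision to return a list of 0-based shifts are assumptions.

```python
def kmp_search(text, pattern):
    """Return all shifts s with text[s:s + len(pattern)] == pattern."""
    m = len(pattern)
    if m == 0:
        return []  # convention chosen for this sketch

    # PREFIX part: pi[q] = longest proper prefix of pattern[:q] that is also its suffix.
    pi = [0] * (m + 1)
    k = 0
    for q in range(2, m + 1):
        while k > 0 and pattern[k] != pattern[q - 1]:
            k = pi[k]
        if pattern[k] == pattern[q - 1]:
            k += 1
        pi[q] = k

    # MATCHER part: q is the number of pattern characters currently matched.
    shifts = []
    q = 0
    for i, ch in enumerate(text):
        while q > 0 and pattern[q] != ch:
            q = pi[q]              # skip shifts that cannot match
        if pattern[q] == ch:
            q += 1
        if q == m:
            shifts.append(i + 1 - m)
            q = pi[q]              # keep going to find overlapping matches
    return shifts


print(kmp_search("abracadabra", "abra"))  # [0, 7]
```

The text index never moves backwards, so the scan costs O(n) on top of the O(m) preprocessing.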

16
Greedy Approach to Algorithm Design
17
Introduction

Greedy methods typically apply to optimization
problems in which a set of choices must be made
to arrive at an optimal solution
Optimization problem
There can be many solutions
Each solution has a value
We wish to find a solution with the optimal
(minimum or maximum) value

18
Example Optimization Problems

How to give change (a balance) using the minimum number of coins?
How to allocate resources to maximize profit from
your business?
A thief has a knapsack of capacity c; what items
should go in it to maximize profit?
0-1 knapsack problem (binary choice)
Fractional knapsack problem

19
Greedy Approach

Make each choice in a locally optimal manner
Always makes the choice that looks best at the
moment
We hope that this will lead to a globally optimal
solution
Greedy method doesn't always give optimal
solutions, but for many problems it does
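
The two knapsack variants mentioned earlier illustrate both outcomes. In the Python sketch below, picking items by value-to-weight ratio is optimal for the fractional problem but not for the 0-1 problem. The item list, the capacity of 50 and all function names are made up for illustration.

```python
from itertools import combinations


def by_ratio(items):
    """Sort (value, weight) pairs by value per unit weight, best first."""
    return sorted(items, key=lambda it: it[0] / it[1], reverse=True)


def fractional_knapsack(items, capacity):
    """Greedy: take as much as possible of the best-ratio item first."""
    total, remaining = 0.0, capacity
    for value, weight in by_ratio(items):
        take = min(weight, remaining)
        total += value * take / weight
        remaining -= take
    return total


def greedy_01(items, capacity):
    """Same greedy rule, but each item must be taken whole or not at all."""
    total, remaining = 0, capacity
    for value, weight in by_ratio(items):
        if weight <= remaining:
            total += value
            remaining -= weight
    return total


def best_01(items, capacity):
    """Brute force over all subsets, only to check the greedy answer."""
    best = 0
    for r in range(len(items) + 1):
        for subset in combinations(items, r):
            if sum(w for _, w in subset) <= capacity:
                best = max(best, sum(v for v, _ in subset))
    return best


ITEMS = [(60, 10), (100, 20), (120, 30)]  # hypothetical (value, weight) pairs
print(fractional_knapsack(ITEMS, 50))  # 240.0
print(greedy_01(ITEMS, 50))            # 160
print(best_01(ITEMS, 50))              # 220
```

For the 0-1 case the greedy choice leaves 20 units of capacity unused, so it falls short of the optimum found by exhaustive search.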

20
Example

A cashier gives change using coins of Rs.10, 5, 2
and 1
Suppose the amount is Rs. 37
Need to minimize the number of coins
Try to use the largest coin to cover the
remaining balance
So, we get 10 + 10 + 10 + 5 + 2
Does this give the optimal solution?
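
The cashier's rule and a check of its answer can be sketched in Python. The names greedy_change and min_coins are assumptions; min_coins is a small dynamic-programming routine used here only as a reference, and the greedy routine assumes a coin of value 1 exists so that the amount always reaches zero.

```python
def greedy_change(amount, coins):
    """Repeatedly hand out the largest coin that still fits."""
    result = []
    for coin in sorted(coins, reverse=True):
        while amount >= coin:
            result.append(coin)
            amount -= coin
    return result


def min_coins(amount, coins):
    """Fewest coins needed for amount, by dynamic programming (reference check)."""
    best = [0] + [float("inf")] * amount
    for a in range(1, amount + 1):
        for c in coins:
            if c <= a and best[a - c] + 1 < best[a]:
                best[a] = best[a - c] + 1
    return best[amount]


print(greedy_change(37, [10, 5, 2, 1]))  # [10, 10, 10, 5, 2]
print(min_coins(37, [10, 5, 2, 1]))      # 5
```

For these denominations and Rs. 37 the greedy answer of five coins matches the minimum. That is not guaranteed for every coin system: with hypothetical coins 4, 3 and 1 and an amount of 6, the same rule gives 4 + 1 + 1 (three coins) while 3 + 3 uses two.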

21
Elements of Greedy Approach

Greedy-choice property
A globally optimal solution can be arrived at by
making a locally optimal (greedy) choice
Proving this may not be trivial
Optimal substructure
Optimal solution to the problem contains within
it optimal solutions to subproblems

22
Applications of Greedy Approach