CS222 Algorithms, First Semester 2003/2004

- Dr. Sanath Jayasena
- Dept. of Computer Science & Eng.
- University of Moratuwa
- Lecture 7 (28/10/2003)
- String Matching Part 2
- Greedy Approach

Overview

- Previous lecture: String Matching Part 1
- Naïve Algorithm, Rabin-Karp Algorithm
- This lecture:
- String Matching Part 2
- String Matching using Finite Automata
- Knuth-Morris-Pratt (KMP) Algorithm
- Greedy Approach to Algorithm Design

String Matching

- PART 2

Finite Automata

- A finite automaton M is a 5-tuple (Q, q0, A, Σ, δ), where
- Q is a finite set of states
- q0 ∈ Q is the start state
- A ⊆ Q is a set of accepting states
- Σ is a finite input alphabet
- δ is the transition function that gives the next state for a given current state and input

How a Finite Automaton Works

- The finite automaton M begins in state q0
- Reads the characters of the input string (each drawn from Σ) one at a time
- If M is in state q and reads input character a, M moves to state δ(q, a)
- If its current state q is in A, M is said to have accepted the string read so far
- An input string that is not accepted is said to be rejected

Example

- Q = {0, 1}, q0 = 0, A = {1}, Σ = {a, b}
- δ(q, a) shown in the transition table/diagram
- This accepts strings that end in an odd number of a's, e.g., abbaaa is accepted, aa is rejected

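Transition table (next state for each current state and input):

state | input a | input b
0     | 1       | 0
1     | 0       | 0

Transition diagram: state 0 moves to state 1 on a and loops to itself on b; state 1 moves to state 0 on either a or b.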
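
The example automaton can be simulated in a few lines. The following is a minimal sketch in Python, assuming the transition function is stored as a dictionary keyed by (state, character); the names DELTA, START_STATE, ACCEPTING and accepts are illustrative and not from the lecture.

```python
# Transition function for the example automaton, keyed by (state, character).
# State 1 means "the input read so far ends in an odd number of a's".
DELTA = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 0, (1, "b"): 0,
}
START_STATE = 0
ACCEPTING = {1}


def accepts(text, delta=DELTA, start=START_STATE, accepting=ACCEPTING):
    """Run the automaton over text; accept if it stops in an accepting state."""
    state = start
    for ch in text:
        state = delta[(state, ch)]  # one table lookup per input character
    return state in accepting


print(accepts("abbaaa"))  # True: ends in three a's
print(accepts("aa"))      # False: ends in two a's
```

Each character costs one dictionary lookup, so the run takes time proportional to the length of the input.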

String-Matching Automata

- Given the pattern P[1..m], build a finite automaton M
- The state set is Q = {0, 1, 2, ..., m}
- The start state is 0
- The only accepting state is m
- Time to build M can be large if Σ is large

String-Matching Automata (contd.)

- Scan the text string T[1..n] to find all occurrences of the pattern P[1..m]
- String matching is efficient: Θ(n)
- Each character is examined exactly once
- Constant time for each character
- But time to compute δ is O(m |Σ|)
- δ has O(m |Σ|) entries

Algorithm

- Input: Text string T[1..n], δ and m
- Result: All valid shifts displayed
- FINITE-AUTOMATON-MATCHER (T, m, δ)
  - n ← length[T]
  - q ← 0
  - for i ← 1 to n
    - q ← δ(q, T[i])
    - if q = m
      - print "pattern occurs with shift" i - m
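
The matcher translates almost line for line into Python. The sketch below also builds δ, since the matcher needs it as input. The construction follows the textbook definition directly (δ(q, a) is the length of the longest prefix of P that is a suffix of Pq followed by a); it is an assumption made to keep the example short, and it is slower than the O(m |Σ|) bound quoted above. The function names and the sample text and pattern are illustrative. Text characters must belong to the given alphabet.

```python
def build_transition_function(pattern, alphabet):
    """delta[q][a] = length of the longest prefix of pattern that is a
    suffix of pattern[:q] + a (computed directly from the definition)."""
    m = len(pattern)
    delta = []
    for q in range(m + 1):
        row = {}
        for a in alphabet:
            candidate = pattern[:q] + a
            k = min(m, q + 1)
            # Shrink k until pattern[:k] is a suffix of the candidate string.
            while k > 0 and not candidate.endswith(pattern[:k]):
                k -= 1
            row[a] = k
        delta.append(row)
    return delta


def finite_automaton_matcher(text, m, delta):
    """Return every shift s such that the pattern occurs at text[s:s + m]."""
    shifts = []
    q = 0
    for i, ch in enumerate(text, start=1):  # i is 1-based, as in the pseudocode
        q = delta[q][ch]
        if q == m:
            shifts.append(i - m)
    return shifts


pattern = "ababaca"
delta = build_transition_function(pattern, alphabet="abc")
print(finite_automaton_matcher("abababacaba", len(pattern), delta))  # [2]
```

Once δ is built, the scan does a single lookup per text character, which is where the Θ(n) matching time comes from.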

Knuth-Morris-Pratt (KMP) Method

- Avoids computing δ (transition function)
- Instead computes a prefix function π in O(m) time
- π has only m entries
- Prefix function stores info about how the pattern matches against shifts of itself
- Can avoid testing useless shifts

Terminology/Notations

- String w is a prefix of string x if x = wy for some string y (e.g., srilan is a prefix of srilanka)
- String w is a suffix of string x if x = yw for some string y (e.g., anka is a suffix of srilanka)
- The k-character prefix of the pattern P[1..m] is denoted by Pk
- E.g., P0 = ε, Pm = P = P[1..m]

Prefix Function for a Pattern

- Given that pattern prefix P[1..q] matches text characters T[s+1..s+q], what is the least shift s' > s such that
- P[1..k] = T[s'+1..s'+k], where s' + k = s + q?
- At the new shift s', no need to compare the first k characters of P with the corresponding characters of T
- Since we know that they match

Prefix Function Example 1

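T:        b a c b a b a b a a b c b a
P (s):            a b a b a c a
P (s'):               a b a b a c a

At shift s, the first q = 5 characters of P match the text. At the next useful shift s' = s + 2, the first k = 3 characters of P are already known to match.

Pq = P5:  a b a b a
Pk = P3:      a b a

Compare the pattern against itself: the longest prefix of P that is also a proper suffix of P5 is P3, so π[5] = 3.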

Prefix Function Example 2

i     | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10
P[i]  | a | b | a | b | a | b | a | b | c | a
π[i]  | 0 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 0 | 1
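
The values in Example 2 can be checked straight from the definition: π[q] is the length of the longest prefix of P that is also a proper suffix of Pq. The sketch below does exactly that by brute force. It is meant only as a check and is not the O(m) PREFIX procedure; the function name is illustrative.

```python
def prefix_function_by_definition(pattern):
    """Return {q: pi[q]} for q = 1..m, where pi[q] is the length of the
    longest prefix of pattern that is also a proper suffix of pattern[:q]."""
    pi = {}
    for q in range(1, len(pattern) + 1):
        pq = pattern[:q]
        k = q - 1  # "proper" suffix: strictly shorter than Pq itself
        while k > 0 and not pq.endswith(pattern[:k]):
            k -= 1
        pi[q] = k
    return pi


pi = prefix_function_by_definition("ababababca")
print([pi[q] for q in range(1, 11)])  # [0, 0, 1, 2, 3, 4, 5, 6, 0, 1]
```

The same function applied to the pattern of Example 1, ababaca, gives π[5] = 3.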

Knuth-Morris-Pratt (KMP) Algorithm

- Information stored in prefix function
- Can speed up both the naïve algorithm and the finite-automaton matcher
- KMP Algorithm on the board
- 2 parts: KMP-MATCHER, PREFIX
- Running time
- PREFIX takes O(m)
- KMP-MATCHER takes O(m + n), including the call to PREFIX
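
Since the algorithm itself was written on the board, here is a hedged reconstruction of the two parts in Python, following the standard textbook formulation rather than the lecturer's exact version. The names compute_prefix and kmp_matcher and the sample strings are illustrative. Python strings are 0-indexed, so pattern[k] below plays the role of P[k+1].

```python
def compute_prefix(pattern):
    """PREFIX: pi[q] for q = 1..m in O(m) time (pi[0] is unused)."""
    m = len(pattern)
    pi = [0] * (m + 1)
    k = 0  # length of the current matched prefix
    for q in range(2, m + 1):
        while k > 0 and pattern[k] != pattern[q - 1]:
            k = pi[k]  # fall back to the next shorter candidate prefix
        if pattern[k] == pattern[q - 1]:
            k += 1
        pi[q] = k
    return pi


def kmp_matcher(text, pattern):
    """KMP-MATCHER: return all shifts at which pattern occurs in text."""
    m = len(pattern)
    if m == 0:
        return []
    pi = compute_prefix(pattern)
    shifts = []
    q = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        while q > 0 and pattern[q] != ch:
            q = pi[q]  # skip shifts that cannot match
        if pattern[q] == ch:
            q += 1
        if q == m:
            shifts.append(i - m + 1)
            q = pi[q]  # keep looking for further (possibly overlapping) matches
    return shifts


print(kmp_matcher("abababacaba", "ababaca"))  # [2]
```

The text index never moves backwards, and q can only decrease as often as it has increased, which gives the linear running time.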

Greedy Approach to Algorithm Design

Introduction

- Greedy methods typically apply to optimization problems in which a set of choices must be made to arrive at an optimal solution
- Optimization problem
- There can be many solutions
- Each solution has a value
- We wish to find a solution with the optimal (minimum or maximum) value

Example Optimization Problems

- How to give change (a balance) using the minimum number of coins?
- How to allocate resources to maximize profit from your business?
- A thief has a knapsack of capacity c; what items to put in it to maximize profit?
- 0-1 knapsack problem (binary choice)
- Fractional knapsack problem

Greedy Approach

- Make each choice in a locally optimal manner
- Always make the choice that looks best at the moment
- We hope that this will lead to a globally optimal solution
- Greedy method doesn't always give optimal solutions, but for many problems it does

Example

- A cashier gives change using coins of Rs. 10, 5, 2 and 1
- Suppose the amount is Rs. 37
- Need to minimize the number of coins
- Try to use the largest coin to cover the remaining balance
- So, we get 10 + 10 + 10 + 5 + 2
- Does this give the optimal solution?
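
One way to explore the last question is to compare the greedy answer with an exhaustive check. The sketch below is illustrative: greedy_change follows the rule on the slide, while min_coins is a small dynamic-programming check added here for comparison. The coin set {4, 3, 1} in the last line is a hypothetical system, not from the lecture, chosen to show a case where the greedy rule is not optimal.

```python
def greedy_change(amount, coins):
    """Repeatedly take the largest coin that does not exceed the balance."""
    chosen = []
    for coin in sorted(coins, reverse=True):
        while amount >= coin:
            chosen.append(coin)
            amount -= coin
    return chosen


def min_coins(amount, coins):
    """Fewest coins that sum to amount, by dynamic programming (a check only)."""
    inf = float("inf")
    best = [0] + [inf] * amount
    for value in range(1, amount + 1):
        for coin in coins:
            if coin <= value and best[value - coin] + 1 < best[value]:
                best[value] = best[value - coin] + 1
    return best[amount]


print(greedy_change(37, [10, 5, 2, 1]))  # [10, 10, 10, 5, 2]
print(min_coins(37, [10, 5, 2, 1]))      # 5, so greedy is optimal here

# Hypothetical coin system where greedy uses 3 coins (4+1+1) but 2 suffice (3+3).
print(len(greedy_change(6, [4, 3, 1])), min_coins(6, [4, 3, 1]))  # 3 2
```

For Rs. 37 with coins 10, 5, 2 and 1 the greedy answer matches the optimum, but the second case shows that this depends on the coin system.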

Elements of Greedy Approach

- Greedy-choice property
- A globally optimal solution can be arrived at by making a locally optimal (greedy) choice
- Proving this may not be trivial
- Optimal substructure
- Optimal solution to the problem contains within it optimal solutions to subproblems

Applications of Greedy Approach

- Graph algorithms
- Minimum spanning tree
- Shortest path
- Data compression
- Huffman coding
- Activity selection (scheduling) problems
- Fractional knapsack problem
- Not the 0-1 knapsack problem
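
The contrast between the two knapsack problems can be seen on a small instance. The sketch below uses made-up data (three items given as (value, weight) pairs and a capacity of 50) and assumes the greedy rule "take items in decreasing order of value per unit weight", which the slides do not spell out. The brute-force function is only a check for the 0-1 case. All function names are illustrative.

```python
from itertools import combinations


def by_ratio(items):
    """Items sorted by value per unit weight, best first."""
    return sorted(items, key=lambda it: it[0] / it[1], reverse=True)


def fractional_knapsack(items, capacity):
    """Greedy by ratio, taking a fraction of the last item if needed."""
    total, remaining = 0.0, capacity
    for value, weight in by_ratio(items):
        if remaining <= 0:
            break
        take = min(weight, remaining)
        total += value * take / weight
        remaining -= take
    return total


def greedy_zero_one(items, capacity):
    """Same greedy rule, but each item is taken whole or not at all."""
    total, remaining = 0, capacity
    for value, weight in by_ratio(items):
        if weight <= remaining:
            total += value
            remaining -= weight
    return total


def best_zero_one(items, capacity):
    """Optimal 0-1 value by trying every subset (fine for tiny inputs)."""
    best = 0
    for r in range(len(items) + 1):
        for subset in combinations(items, r):
            if sum(w for _, w in subset) <= capacity:
                best = max(best, sum(v for v, _ in subset))
    return best


items = [(60, 10), (100, 20), (120, 30)]  # hypothetical (value, weight) pairs
print(fractional_knapsack(items, 50))  # 240.0
print(greedy_zero_one(items, 50))      # 160
print(best_zero_one(items, 50))        # 220
```

On this instance the greedy rule fills the fractional knapsack completely, but in the 0-1 version it leaves capacity unused and falls short of the best subset.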

Announcements

- Assignment 4
- assigned today
- due next week
- Next 2 lectures
- Topic: Graphs
- By Ms Sudanthi Wijewickrema