SCALED Pattern Matching - PowerPoint PPT Presentation

About This Presentation
Title:

SCALED Pattern Matching

Description:

SCALED Pattern Matching Amihood Amir Ayelet Butman Bar-Ilan University Moshe Lewenstein and Johns Hopkins University Bar-Ilan ... – PowerPoint PPT presentation

Transcript and Presenter's Notes

Title: SCALED Pattern Matching


1
SCALED Pattern Matching
  • Amihood Amir, Bar-Ilan University and Johns Hopkins University
  • Ayelet Butman and Moshe Lewenstein, Bar-Ilan University

2
Motivation
  • Searching for templates in aerial photographs
  • Input: aerial photo, template
  • Task: Search for all locations where the template
    appears in the image.

3
(No Transcript)
4
Model
  • Low level (pixel level)
  • avoid costly processing
  • Asymptotically efficient solutions.
  • Serial, exact algorithms.

5
Types of Approximations
  • Local errors, level of detail
  • Occlusion
  • Noise
  • Results:
  • O(n² log m): mismatches
  • O(n²k²): edit distance, k errors, rectangular patterns
  • O(n²k √(m log m) √(k log k)): edit distance, k errors,
    half-rectangular patterns
  • References: AL-88, AF-95

6
Types of Approximation
  • Orientation.
  • Results: O(n²m ) FU-98
  • O(n²m³) ACL-98
  • Scaling: natural scales
  • Results: O(n) 1-d EV-88
  • O(n² log S) 2-d ALV-92
  • O(n²) dictionary AC-96
  • Real scales
  • This result: O(n) 1-d, truncation

7
It seems daunting, but
8
CPM 2003 Morelia, Mexico
9
Problem inherently inexact
  • What if occurrence is 1½ times bigger?
  • What is the meaning of ½ a pixel?
  • Solutions until now: natural scales
  • Consider only discrete scales
  • 1, 2, 3, 4, 5, . . .

10
Definition
  • Text: n × n
  • Pattern: m × m
  • Find all occurrences of the pattern in the text
    in all discrete sizes.

11
Discrete exact Scaled Matching
  • T (13 × 13):
    A A A A A A A A A A A A A
    A A A A A A A A A A C C A
    A A A C C A A A A A C C A
    A A A C C A A A A A A A A
    A A A A A A A A A A A A A
    A A A A A A A A A C C A A
    A A A A A A A A A C C A A
    A A A C C C A A A A A A A
    A A A C C C A A A A A A A
    A A A C C C A A A A C A A
    A A A A A A A A A A A A A
    A A A A A A A A A C C A C
    A A A A A A A A A A A A A
  • P (3 × 3):
    A A A
    A C A
    A A A

12
Discrete exact Scaled Matching

P:
Z U Y
K V S
X E T
P scaled to 3:
Z Z Z U U U Y Y Y
Z Z Z U U U Y Y Y
Z Z Z U U U Y Y Y
K K K V V V S S S
K K K V V V S S S
K K K V V V S S S
X X X E E E T T T
X X X E E E T T T
X X X E E E T T T
13
Idea: Fix a scale s
  • Constant amount of work for each square (s-block)
  • (Figure: text of side n divided into s-blocks of side s, n/s blocks per side.)

14
Algorithm time
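  • Time for scale s: O((n/s)²), constant work for each of the (n/s)² s-blocks
  • Total time: O(n² · (1/1² + 1/2² + 1/3² + . . .))

  • The sum 1/1² + 1/2² + 1/3² + . . . converges to a constant
  • Making the total time O(n²)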
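
The convergence claim can be checked numerically. The sketch below assumes (it is not stated in the transcript) that scale s costs one unit of work per s-block, i.e. (n/s)² units in total; the function name `total_blocks` is made up for illustration.

```python
import math


def total_blocks(n: int) -> float:
    """Sum the assumed per-scale cost (n/s)^2 over all scales s = 1..n."""
    return sum((n / s) ** 2 for s in range(1, n + 1))


n = 1000
# Ratio of the total work to n^2; it stays below pi^2/6, the limit of sum 1/s^2.
ratio = total_blocks(n) / n ** 2
print(ratio, math.pi ** 2 / 6)
```

Under that assumption the total work is a constant multiple of n², whatever n is.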

15
Problem: Real scales
  • Was open even for strings
  • How do we define?
  • aabcccbb
  • Scaled to 2: aaaabbccccccbbbb
  • Scaled to 1½: aaabccccbbb
  • The leftover ½b and ½c are truncated
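
A minimal sketch of scaling with truncation, as in the example above: each run of equal symbols has its length multiplied by the scale and rounded down. The function name `scale` is an assumption made for illustration; `Fraction` is used so that 1½ is represented exactly.

```python
from fractions import Fraction
from itertools import groupby
from math import floor


def scale(s: str, alpha) -> str:
    """Scale every run of equal symbols in s by alpha, truncating fractions."""
    out = []
    for symbol, run in groupby(s):
        length = sum(1 for _ in run)
        out.append(symbol * floor(length * alpha))
    return "".join(out)


print(scale("aabcccbb", 2))               # aaaabbccccccbbbb
print(scale("aabcccbb", Fraction(3, 2)))  # aaabccccbbb
```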

16
Formally
Denote a^r = aa . . . a (r times).
Problem Definition 1
Input: Pattern P, Text T.
Output: All text locations where P, scaled to α, appears for some α ≥ 1.
17
Remark
  • α ≥ 1 means we only scale up
  • Reasons: Avoid the conceptual problem of loss of
    resolution.
  • From far enough away everything looks the same.
  • By our definition, for a scale k < 1/m there is a match at
    every text location.

18
Simplify definition
Definition 2: Look for the scaled pattern in the text with every run, including the first and last, matched in full.
Example: P = aabcccbbbb
Match by definition 2: daaabccccbbbbbbe
Match by definition 1 but not by definition 2: daaaabccccbbbbbbbe
19
Why are definitions equivalent?
  • Split text and pattern into
  • a symbol part Ts, Ps and
  • a length part TL, PL.
  • Example: P = aabcccbbbb
  • Ps = abcb
  • PL = 2 1 3 4
  • T = daaabccccbbbbbbe
  • Ts = dabcbe
  • TL = 1 3 1 4 6 1
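
The split into a symbol part and a length part is a run-length encoding. A small sketch, with the hypothetical helper name `split_runs`, reproduces the example:

```python
from itertools import groupby


def split_runs(s: str):
    """Return (symbol part, length part) of the run-length encoding of s."""
    symbols, lengths = [], []
    for symbol, run in groupby(s):
        symbols.append(symbol)
        lengths.append(sum(1 for _ in run))
    return "".join(symbols), lengths


print(split_runs("aabcccbbbb"))        # ('abcb', [2, 1, 3, 4])
print(split_runs("daaabccccbbbbbbe"))  # ('dabcbe', [1, 3, 1, 4, 6, 1])
```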

20
Time
  • Time for split: O(n+m)
  • Finding Ps in Ts: O(n+m) (e.g. KMP)
  • HARD PART: Finding PL in TL.

21
Definitions are Equivalent
Claim: Solving def 2 in time O(f(n)) implies solving def 1 in time O(f(n)).
Why?
- Find the pattern without its first and last runs in time O(f(n)).
- For each match, verify the 1st and last symbol in constant time in Ts and TL.
Total time: O(f(n) + n) = O(f(n)).
22
Naïve algorithm for matching PL in TL
For each text location, position the pattern starting at that location and calculate the interval [t/p, (t+1)/p) for each resulting <text, pattern> pair. This is the interval of possible scales, since
(t/p)·p = t, and for every α < t/p, αp < t;
((t+1)/p)·p = t+1, and for every α ≥ (t+1)/p, αp > t.
23
Check intersection
  • If intersection of all intervals is not empty
    then there is a match.
  • Time O(nm)
  • Example
  • PL = 2 1 2 3 2
  • TL = 2 4 2 4 7 4 5 3
  • [1, 3/2) [4, 5)
  • The intersection is empty, thus no scaled match in
    location 1. But

24
Check intersection
  • If intersection of all intervals is not empty
    then there is a match.
  • Time O(nm)
  • Example
  • PL = 2 1 2 3 2
  • TL = 2 4 2 4 7 4 5 3
  • [2, 5/2) [2, 3) [2, 5/2) [7/3, 8/3) [2, 5/2)
  • The intersection is [7/3, 5/2), thus there is a
    scaled match in location 2.
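
The naive O(nm) check can be sketched as follows. The function names are made up for illustration, locations are 0-based (so the slide's location 2 is index 1), and intersecting with [1, ∞) to enforce scaling up only is an assumption of this sketch.

```python
from fractions import Fraction


def scale_interval(pl, tl, start):
    """Intersect [t/p, (t+1)/p) over all pairs; return (lo, hi) or None if empty."""
    lo, hi = Fraction(1), None  # only scales alpha >= 1 are considered
    for p, t in zip(pl, tl[start:start + len(pl)]):
        lo = max(lo, Fraction(t, p))
        upper = Fraction(t + 1, p)
        hi = upper if hi is None else min(hi, upper)
    return (lo, hi) if hi is not None and lo < hi else None


def naive_scaled_matches(pl, tl):
    """Return every 0-based location of tl where pl has a scaled match."""
    return [i for i in range(len(tl) - len(pl) + 1)
            if scale_interval(pl, tl, i) is not None]


PL = [2, 1, 2, 3, 2]
TL = [2, 4, 2, 4, 7, 4, 5, 3]
print(scale_interval(PL, TL, 0))  # None: [1, 3/2) and [4, 5) do not intersect
print(scale_interval(PL, TL, 1))  # (Fraction(7, 3), Fraction(5, 2))
```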

25
Improvement: Parameterized Matching
Introduced by Baker, 1994. Motivation:
copying code.
26
Parameterized Matching
  • Input: two strings s and t, |s| = |t|, over
    alphabets Σs and Σt.
  • s parameterize-matches t if there is a bijection
    π: Σs → Σt such that π(s) = t.

Example
s = a a b b b
t = x x y y y
π(a) = x, π(b) = y
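
A minimal sketch of the definition for two strings of equal length: build the mapping symbol by symbol and reject as soon as it stops being a bijection. This is a direct check of the definition, not the linear-time matching algorithm; the name `p_match` is an assumption.

```python
def p_match(s, t) -> bool:
    """True if some bijection between the alphabets maps s onto t."""
    if len(s) != len(t):
        return False
    forward, backward = {}, {}
    for a, b in zip(s, t):
        # setdefault records the first pairing and returns it on later visits.
        if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
            return False
    return True


print(p_match("aabbb", "xxyyy"))  # True: a -> x, b -> y
print(p_match("aabbb", "xxyyx"))  # False: b would map to both y and x
```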
27
Parameterized Matching
  • Claim (AFM-94)
  • For Σ that can be sorted in linear time (e.g.
    Σ = {1, . . . , n})
  • Parameterized matching can be done in time O(n).

28
The reduction
Lemma: There is an α ≥ 1 for which PL matches TL at location i scaled to α only if PL p-matches TL at i.
Proof: Assume PL does not p-match TL at location i. The possible situations are:
29
Possibility 1
TL: a, c ≠ a
PL: b, b
w.l.o.g. c ≥ a+1
For c = a+1 (smallest possible), the intervals [a/b, (a+1)/b) and [(a+1)/b, (a+2)/b) do not intersect.
30
Possibility 2
TL: a, a
PL: b, c ≠ b
w.l.o.g. c ≥ b+1
Intersection not empty only if (a+1)/(b+1) > a/b, i.e. ab + b > ab + a, i.e. b > a. But this can never happen if α ≥ 1.
31
Algorithm for Real Scaled String Matching
  • Let Pi1, Pi2, . . ., Pij be the different
    numbers in PL.
  • P-match PL in TL.
  • For each match, check the intersection of intervals
    between Pi1, . . . , Pij and the corresponding
    symbols in TL.
  • End Algorithm
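
The steps above can be sketched as follows. This is an illustration under stated assumptions, not the talk's implementation: the p-matching step is a simple per-location check rather than the linear-time algorithm, locations are 0-based, and all function names are hypothetical.

```python
from fractions import Fraction


def p_match(s, t) -> bool:
    """True if some bijection between the alphabets maps s onto t."""
    forward, backward = {}, {}
    for a, b in zip(s, t):
        if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
            return False
    return len(s) == len(t)


def real_scaled_matches(pl, tl):
    """Return (location, lo, hi) for each scaled match; scales lie in [lo, hi)."""
    m = len(pl)
    first = {}  # position of the first occurrence of each distinct number in pl
    for k, p in enumerate(pl):
        first.setdefault(p, k)
    matches = []
    for i in range(len(tl) - m + 1):
        window = tl[i:i + m]
        if not p_match(pl, window):
            continue  # no scaled match without a p-match
        # Only the distinct numbers need checking: repeats give equal intervals.
        lo = max([Fraction(1)] + [Fraction(window[k], p) for p, k in first.items()])
        hi = min(Fraction(window[k] + 1, p) for p, k in first.items())
        if lo < hi:
            matches.append((i, lo, hi))
    return matches


print(real_scaled_matches([2, 1, 2, 3, 2], [2, 4, 2, 4, 7, 4, 5, 3]))
```

On the earlier example (PL = 2 1 2 3 2, TL = 2 4 2 4 7 4 5 3) the only p-match is at index 1, and its interval is [7/3, 5/2).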

32
Example
PL = 2 3 2 3 2
Pi1 = 2, Pi2 = 3
p-matches TL 5 6 5 6 5 6
10 6 10 6 10 7
scaled match
33
Important Fact
  • So there are at most O(√m) different Piks.
  • Time: O(n) for parameterized matching
    (Σ = {1, 2, . . . , n}).
  • O(√m) verification for each location.
  • Total: O(n√m).

34
Tighter analysis
  • Upper bound number of possible p-matches.
  • Lemma: Let |P| = m, |T| = n, and let Pi1, Pi2, . . ., Pij
    be the different numbers in PL.
  • Then there are at most 2n/j p-matches of PL in
    TL.
  • Meaning: Since verification time is O(j) per
    p-match, the lemma implies that total
    verification time is
  • O((2n/j) · j) = O(n)

35
Proof of Lemma
  • 1st appearance of Pi1, . . . , Pij
  • PL: Pi1 . . . Pi2 . . . Pij
  • TL: a1 . . . a2 . . . aj
  • m-match

36
Lemma's proof (cont.)
  • Let x be the total number of p-matches in the
    text.
  • The sum of all text elements that match 1st
    occurrences of Piks in the pattern is at least
  • (xj²)/2
  • But there are overlaps!
  • How many?

37
Lemma's proof (cont.)
  • For each text location, at most j matches will
    count it. Therefore
  • Total count without overlaps ≥ (xj²/2)/j = xj/2
  • Clearly xj/2 ≤ n, thus
  • x ≤ (2n)/j

38
Open Problem
  • Give 1-d algorithm linear in run-length
    compressed text and pattern.