Estimating the longest increasing sequence in polylogarithmic time - PowerPoint PPT Presentation

About This Presentation
Title:

Estimating the longest increasing sequence in polylogarithmic time

Description:

Estimating the longest increasing sequence in polylogarithmic time C. Seshadhri (Sandia National Labs) Joint work with Michael Saks (Rutgers University) – PowerPoint PPT presentation

Slides: 46
Provided by: Ses114
Learn more at: http://www.wisdom.weizmann.ac.il
Transcript and Presenter's Notes

Title: Estimating the longest increasing sequence in polylogarithmic time


1
Estimating the longest increasing sequence in
polylogarithmic time
  • C. Seshadhri (Sandia National Labs)
  • Joint work with Michael Saks
  • (Rutgers University)

Sandia National Laboratories is a multi-program
laboratory managed and operated by Sandia
Corporation, a wholly owned subsidiary of
Lockheed Martin Corporation, for the U.S.
Department of Energy's National Nuclear Security
Administration under contract DE-AC04-94AL85000
2
The problem
[Figure: example array 4 24 10 9 15 17 20 18 4 19 3, with the increasing subsequence 4 10 15 17 18 19 highlighted]
  • Given array f: [n] → N, find (length of)
    Longest Increasing Subsequence (LIS)
  • Rather self-explanatory
  • By now, textbook dynamic programming problem
  • [CLRS 01] Chapter 15.4 (Longest Common
    Subsequence), Starred Problem 15.4-6
  • [Schensted 61, Fredman 75]: O(n log n) algorithm
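
For reference, here is a minimal sketch of an O(n log n) computation of the LIS length. It is the standard "smallest tail" formulation with binary search, not necessarily the exact procedure of the cited papers, and it assumes strictly increasing subsequences. The name lis_length is chosen here for illustration.

```python
from bisect import bisect_left

def lis_length(f):
    """Length of the longest strictly increasing subsequence of f."""
    # tails[k] = smallest possible last value of an increasing
    # subsequence of length k + 1 seen so far.
    tails = []
    for x in f:
        k = bisect_left(tails, x)
        if k == len(tails):
            tails.append(x)   # x extends the longest subsequence found so far
        else:
            tails[k] = x      # x is a smaller tail for length k + 1
    return len(tails)

example = [4, 24, 10, 9, 15, 17, 20, 18, 4, 19, 3]
print(lis_length(example))  # 6, e.g. 4 10 15 17 18 19
```

This reads every entry of the array, which is exactly what the rest of the talk tries to avoid.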

3
Too much to read
[Figure: Algorithm reads a few entries of the array (5, 7, 4, 8, 9, 2) and reports "LIS is in range [0.4n, 0.6n]"]
  • Array f is extremely large, so we can't read all of
    it
  • What can we say about LIS length, if we see very
    little?
  • Notation: LIS = LIS length
  • Read only poly(log n) positions
  • Obviously randomized

4
Uniform sample says nothing
[Figure: array 2 1 4 3 6 5 8 7 10 9; the sampled entries 4 and 9 appear in increasing order]
  • Choose uniform random sample of poly(log n) size
  • LIS = n/2, but a random sample is almost always
    increasing
  • So not really that easy to learn about LIS
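
A small experiment illustrates the point. The sketch below assumes the array on this slide continues the pattern 2 1 4 3 6 5 ... (adjacent pairs swapped), so its LIS is n/2. The values n = 10**6, k = 50 and the helper names are arbitrary choices made for this example.

```python
import random

def swapped_pairs(i):
    """Value at 0-based position i of the array 2 1 4 3 6 5 ..."""
    return i + 2 if i % 2 == 0 else i

def sample_is_increasing(n, k, rng):
    """Read k uniformly random positions; are the values seen in increasing order?"""
    positions = sorted(rng.sample(range(n), k))
    values = [swapped_pairs(i) for i in positions]
    return all(a < b for a, b in zip(values, values[1:]))

rng = random.Random(0)
n, k, trials = 10**6, 50, 200
hits = sum(sample_is_increasing(n, k, rng) for _ in range(trials))
print(f"{hits}/{trials} samples looked perfectly sorted")
```

A sample looks unsorted only if it happens to contain both positions of one swapped pair, which is unlikely when k is much smaller than the square root of n.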

5
Our result
[Figure: Algorithm outputs a range inside [1, n]; LIS is in this range]
  • We want range to be small

6
Our result
[Figure: Algorithm outputs a range of width δn inside [1, n]; LIS is in this range]
  • We want range to be small
  • This work: for any (constant) δ > 0
  • Algorithm gives additive δn approximation to
    LIS
  • Running time is (1/δ)^(1/δ) (log n)^c

7
Our result
[Figure: range of width δn around LIS on the line from 1 to n, with n/2 marked. Callout: "Ad alert!"]
  • We want range to be small
  • This work: for any (constant) δ > 0
  • Algorithm gives additive δn approximation to
    LIS
  • Running time is (1/δ)^(1/δ) (log n)^c
  • [Ailon Chazelle Liu S 03], [Parnas Ron Rubinfeld
    03]
  • Previous best: δ = ½

8
Our result
[Figure: range of width δn around LIS on the line from 1 to n, with n/2 marked. Callout: "Ad alert!"]
  • We want range to be small
  • This work: for any (constant) δ > 0
  • Algorithm gives additive δn approximation to
    LIS
  • Running time is (1/δ)^(1/δ) (log n)^c
  • We get (1 + δ)-approx to distance to monotonicity
  • Previous best was factor 2

9
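Prelims: the array in space
[Figure: the array plotted as points in the plane, position (1, 2, 3, ...) on the horizontal axis and value (4, 10, 15, 20) on the vertical axis]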
10
Prelims: the array in space
[Figure: the same plot, with a violation and an increasing sequence marked]
  • Input is points in plane, given as array
  • (LIS is longest chain in partial order)
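
The geometric view can be sketched in code. The version below assumes that point (i, f[i]) precedes (j, f[j]) when i < j and f[i] < f[j] (strict increase), and that a violation is a pair of positions whose values are out of order. The function names are illustrative, not from the talk.

```python
def is_violation(f, i, j):
    """Positions i < j form a violation when their values are out of order."""
    return i < j and f[i] > f[j]

def longest_chain(f):
    """Longest chain of points (i, f[i]) under the order above; O(n^2) time."""
    best = [1] * len(f)   # best[j] = longest chain ending at point j
    for j in range(len(f)):
        for i in range(j):
            if f[i] < f[j]:
                best[j] = max(best[j], best[i] + 1)
    return max(best, default=0)

f = [4, 24, 10, 9, 15, 17, 20, 18, 4, 19, 3]
print(is_violation(f, 1, 2))  # True: 24 comes before 10
print(longest_chain(f))       # 6
```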

11
A hard example
[Figure: two arrangements of blocks of sizes k and 3k, 10 points in each small group; one is labeled LIS = 4k, the other LIS = 2k]
  • The decision for a point depends on small scale
    properties of far away portions

12
A hard example
[Figure: the same arrangements of blocks of sizes k and 3k]
  • Random samples in neighborhoods of points are
    identical!
  • Can we really estimate LIS in polylog time?
  • Is it time for some heavy work?
  • I mean, time for lbs (lower bounds).

13
Outline (or lack thereof)
  • Will I show proofs?
  • No
  • Will I show the algorithm?
  • Maybe
  • I will try to demonstrate the main insight
  • By a series of thought experiments

14
The dynamic program
[Figure: splitter at n/2, given by the closest LIS point to the left]
  • Closest LIS point to left gives splitter
  • Find LIS in each blue region. Piece together!
  • So we break up original problem into subproblems

15
The dynamic program
[Figure: candidate splitter S at n/2]
  • But we don't know the right splitter.
  • So try all possible! Only n different choices
  • Choose the one that gives the largest sum of
    LISs
  • max_S (LIS-below-S + LIS-above-S)
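
The recursion can be written out as follows. This is a sketch under stated assumptions, not the talk's exact formulation: values are integers, subsequences are strictly increasing, a box is an index range [lo, hi) together with a value range [vlo, vhi), and a splitter is represented by a threshold t (left half keeps values below t, right half keeps values from t up). All names are hypothetical.

```python
from functools import lru_cache

def lis_by_splitters(f):
    """LIS length via the 'try every splitter' recursion (illustrative, slow)."""
    n = len(f)
    if n == 0:
        return 0

    @lru_cache(maxsize=None)
    def best(lo, hi, vlo, vhi):
        # LIS using positions lo..hi-1 and values v with vlo <= v < vhi.
        if hi - lo == 0:
            return 0
        if hi - lo == 1:
            return 1 if vlo <= f[lo] < vhi else 0
        mid = (lo + hi) // 2
        # Candidate thresholds: just above each value that lies in the box,
        # plus the two extremes (everything right / everything left).
        thresholds = {vlo, vhi} | {
            f[i] + 1 for i in range(lo, hi) if vlo <= f[i] < vhi
        }
        return max(best(lo, mid, vlo, t) + best(mid, hi, t, vhi)
                   for t in thresholds)

    return best(0, n, min(f), max(f) + 1)

print(lis_by_splitters([4, 24, 10, 9, 15, 17, 20, 18, 4, 19, 3]))  # 6
```

Any threshold gives a valid increasing sequence when the two halves are concatenated, and the threshold just above the last LIS value in the left half recovers the true LIS, so the maximum is exact.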

16
The dynamic program
[Figure: split at n/2]
  • If you know LIS in all small boxes, you can build
    LIS for bigger boxes
  • Not the most efficient DP
  • So our sublinear algo will mimic this process

17
The IP
[Figure: splitter at n/2; LIS is in blue region]
Is this point on LIS?
Where is the splitter?
It is there.
18
The IP
[Figure: split at n/2; LIS is in blue region]
This point NOT on LIS
Where is the splitter?
It is there.
19
The IP
[Figure: splits at n/2 and 3n/4]
I wish we knew the splitter in that region
It is there.
20
The IP
[Figure: splits at n/2, 3n/4 and 5n/8]
I think I know what will happen next
You're lucky I'm here
23
The interactive protocol
  • If point stays in blue region till very end, then
    it is good (on LIS). Otherwise, bad.
  • This takes (log n) steps, with the help of the
    wizard

24
The interactive protocol
  • If point stays in blue region till very end, then
    it is good (on LIS). Otherwise, bad.
  • This takes (log n) steps, with the help of the
    wizard
  • If we could simulate the wizard

25
The interactive protocol
  • If point stays in blue region till very end, then
    it is good (on LIS). Otherwise, bad.
  • This takes (log n) steps, with the help of the
    wizard
  • If we could simulate the wizard

What?? If you could simulate the wizard, you know
the LIS!
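
To make the protocol concrete, the sketch below simulates the wizard by computing one LIS outright, which is exactly the objection raised above, so it only illustrates the logic. It assumes the wizard answers with the value of the closest LIS point to the left of the midpoint within the current index range, and that the blue box keeps values v with vlo < v <= vhi. All names are hypothetical.

```python
from bisect import bisect_left

def one_lis_indices(f):
    """Sorted positions of one longest strictly increasing subsequence."""
    tails, tail_idx, parent = [], [], [None] * len(f)
    for i, x in enumerate(f):
        k = bisect_left(tails, x)
        if k == len(tails):
            tails.append(x)
            tail_idx.append(i)
        else:
            tails[k], tail_idx[k] = x, i
        parent[i] = tail_idx[k - 1] if k > 0 else None
    out, i = [], tail_idx[-1] if tail_idx else None
    while i is not None:
        out.append(i)
        i = parent[i]
    return out[::-1]

def stays_in_blue(f, i, lis):
    """Follow position i for about log n halving steps, asking the wizard each time."""
    lo, hi = 0, len(f)
    vlo, vhi = float("-inf"), float("inf")
    while hi - lo > 1:
        mid = (lo + hi) // 2
        # Wizard: closest LIS point left of mid inside [lo, mid), if any.
        k = bisect_left(lis, mid) - 1
        s = f[lis[k]] if k >= 0 and lis[k] >= lo else vlo
        if i < mid:
            hi, vhi = mid, min(vhi, s)   # left box: values up to the splitter
        else:
            lo, vlo = mid, max(vlo, s)   # right box: values above the splitter
        if not (vlo < f[i] <= vhi):
            return False                 # the point left the blue region
    return True

f = [4, 24, 10, 9, 15, 17, 20, 18, 4, 19, 3]
lis = one_lis_indices(f)
print([i for i in range(len(f)) if stays_in_blue(f, i, lis)] == lis)  # True
```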
26
Find a splitter
[Figure: candidate splitter at n/2. Callout: "If very few LIS points outside blue, this is not a bad splitter"]
  • Finding splitter may be hard, so try for
    approximate versions?
  • But how do we determine the number of LIS points?

27
Find a splitter
[Figure: conservative splitter at n/2; total no. of points outside blue < µn]
  • If µ < 1/(100 log n), being conservative is good
    enough
28
Easy to check
[Figure: splitter at n/2]
  • Count fraction of sample outside blue
  • poly(log n) samples check this accurately
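
A minimal sketch of this check follows. It assumes the blue region of a splitter (mid, s) consists of the points left of mid with value at most s and the points from mid onward with value above s. That reading of the figure, the function names and the parameter num_samples are assumptions made for this example.

```python
import random

def fraction_outside_blue(f, mid, s, num_samples, rng):
    """Estimate the fraction of points outside the blue region of splitter (mid, s)."""
    n = len(f)
    outside = 0
    for _ in range(num_samples):
        i = rng.randrange(n)   # one uniformly random position
        in_blue = f[i] <= s if i < mid else f[i] > s
        if not in_blue:
            outside += 1
    return outside / num_samples

def looks_conservative(f, mid, s, mu, num_samples, rng):
    """Accept the splitter if the sampled fraction outside blue is below mu."""
    return fraction_outside_blue(f, mid, s, num_samples, rng) < mu

rng = random.Random(1)
f = [4, 24, 10, 9, 15, 17, 20, 18, 4, 19, 3]
print(fraction_outside_blue(f, 5, 15, 1000, rng))
```

The number of samples has to grow as µ shrinks (roughly like 1/µ times a logarithmic factor) for the estimate to be meaningful, which is still poly(log n) when µ is about 1/log n.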

29
Getting a conservative splitter
[Figure: candidate splitters at n/2]
  • We can sample (log n) different candidates and
    check which of them is conservative
  • What if no conservative splitter exists?

30
A liberal paradise
[Figure: choose any line at n/2; no. of points outside at least µn]
  • So we know that LIS ≤ (1−µ)n
  • Leads to the next idea. Boosting approximations!
  • Given δ-approx to LIS, can we improve on δ?

31
Boosting approximations
[Figure: real splitter at n/2; run δn-approx on the points in each blue box (n1 and n2 points); no. of points outside at least µn]
  • Take sum of outputs as total LIS estimate
  • LIS = LIS1 + LIS2, Est = Est1 + Est2
  • |Est1 − LIS1| < δn1, |Est2 − LIS2| < δn2
  • So |Est − LIS| < δ(n1 + n2)
  • n1 + n2 ≤ (1−µ)n, so |Est − LIS| < δ(1−µ)n !
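
Written out with the triangle inequality, and assuming the split is at the real splitter so that LIS = LIS1 + LIS2, the calculation on this slide reads:

```latex
\begin{align*}
|\mathrm{Est} - \mathrm{LIS}|
  &= |(\mathrm{Est}_1 - \mathrm{LIS}_1) + (\mathrm{Est}_2 - \mathrm{LIS}_2)| \\
  &\le |\mathrm{Est}_1 - \mathrm{LIS}_1| + |\mathrm{Est}_2 - \mathrm{LIS}_2| \\
  &< \delta n_1 + \delta n_2 = \delta (n_1 + n_2) \le \delta (1 - \mu)\, n .
\end{align*}
```

The last step uses that at least µn points lie outside the two boxes, so n1 + n2 is at most (1−µ)n.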

32
Putting it together
[Figure: candidate splitters at n/2. Conservative splitter?]
  • Check if each is conservative splitter
  • If it is, we've found right subproblems
  • Otherwise

33
Putting it together
[Figure: candidate splitter S at n/2; run δn-approx on the points in each box]
  • One of these is close enough to real splitter
  • Est(S) = Left-Est(S) + Right-Est(S)

34
Putting it together
[Figure: candidate splitter S at n/2; run δn-approx on the points in each box]
  • One of these is close enough to real splitter
  • Est(S) = Left-Est(S) + Right-Est(S)
  • Final Estimate = max_S Est(S)
  • Looks like a great idea!
  • We go from δn to δ(1−µ)n. Recurse to keep
    improving approximation

35
It fails, miserably
[Figure: recursion tree of Alg calls of depth 1/µ; approximation δ0 = δ1(1−µ) at the root, δ1 one level down, δ2 below that, down to ½ at the leaves]
  • As we go up each level, approx gets better by
    a factor of (1−µ).
  • So to get δ0 = ¼, how many levels are needed?
  • ¼ = ½ (1−µ)^t. So t ≈ 1/µ
  • We have running time at least 2^(1/µ).
  • So, µ needs to be > 1/log log n.
36
Find a splitter
[Figure: conservative splitter at n/2; total no. of points outside blue < µn]
  • If µ < 1/(100 log n), being conservative is good
    enough

37
The basic dichotomy
[Figure: at point P, either we find a splitter and continue IP (the Interactive Protocol phase), or we cannot find a splitter (the Dynamic Programming phase)]
  • For IP, we need µ < 1/log n
  • µn is error in each level of IP
  • For DP, we need µ > 1/log log n
  • (1−µ) is decrease in approximation

38
The basic dichotomy
[Figure: at point P, either we find a splitter and continue IP (the Interactive Protocol phase), or we cannot find a splitter (the Dynamic Programming phase); added labels "Strengthen" and "Weaken"]
  • For IP, we need µ < 1/log n
  • µn is error in each level of IP
  • For DP, we need µ > 1/log log n
  • (1−µ) is decrease in approximation

39
Reducing to smaller DP!
[Figure: grid of boxes of side n/(log n); run δ-approx to get LIS estimate inside each box]
  • Run δ-approx on all poly(log n) such boxes

40
Reducing to smaller DP!
[Figure: grid of boxes of side n/(log n)]
  • Run δ-approx on all poly(log n) such boxes
  • Use Dynamic Program to find chain with largest
    sum of estimates
  • Longest path in DAG
  • Can solve in poly(log n) time
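
One way to picture this step is the sketch below. It assumes a fixed grid of boxes with est[c][r] holding the estimate for the box in column c and row r, and it allows a chain to move only to boxes that are strictly further right and strictly higher, so that increasing sequences from consecutive boxes can be concatenated. The box family actually used in the talk may differ, and the names are hypothetical.

```python
def best_chain_of_boxes(est):
    """Largest sum of estimates over chains of boxes increasing in column and row."""
    cols = len(est)
    rows = len(est[0]) if cols else 0
    # prefix[c][r] = best chain sum using only boxes with column < c and row < r
    prefix = [[0] * (rows + 1) for _ in range(cols + 1)]
    for c in range(cols):
        for r in range(rows):
            ending_here = est[c][r] + prefix[c][r]   # chain whose last box is (c, r)
            prefix[c + 1][r + 1] = max(prefix[c][r + 1],
                                       prefix[c + 1][r],
                                       ending_here)
    return prefix[cols][rows]

print(best_chain_of_boxes([[3, 1], [2, 4]]))  # 7: box (0, 0) then box (1, 1)
```

With poly(log n) boxes this longest-path computation takes poly(log n) time, independent of n.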

41
Dichotomy theorem
Either it is easy to find the right subproblems
OR
One can go from δ-approx to (δ−δ²)-approx by a
(log n) sized DP
42
The algorithm, in one slide
[Figure: at point P, either we find a splitter and continue IP, or we cannot find a splitter: make poly(log n) calls to δ-approx and solve a DP of poly(log n) size]
  • Overall running time becomes (log n)^(1/δ)
  • miracle that the math works out

43
The even better version
  • Don't exactly solve this dynamic program!
  • Use our sublinear algo to approximately solve in
    (loglog n) time. Then do it recursively
  • It's painful
  • It's all Greek: α β ? δ ε ? ? µ ?
  • We had ?, but got rid of it

44
What next?
  • Sublinear dynamic programming!
  • We get (1/δ)^(1/δ) (log n)^c time. Can we get
    (log n)/δ time?
  • Would be extremely cool. Completely optimal
  • Applications for other dynamic programs?
  • How does one find the right subproblems in
    sublinear time?
  • Generalize the dichotomy
  • Longest common subsequence/edit distance?

45
Ask and you shall know