Estimating the longest increasing sequence in polylogarithmic time

- C. Seshadhri (Sandia National Labs)
- Joint work with Michael Saks (Rutgers University)

Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.

The problem

[Figure: array 4, 24, 10, 9, 15, 17, 20, 18, 4, 19, with increasing subsequence 4, 10, 15, 17, 18, 19]

- Given array f : [n] → N, find (length of) Longest Increasing Subsequence (LIS)
- Rather self-explanatory
- By now, textbook dynamic programming problem
- [CLRS 01] Chapter 15.4 (Longest Common Subsequence), Starred Problem 15.4-6
- [Schensted 61, Fredman 75]: O(n log n) algorithm
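
The O(n log n) bound can be met with a short binary-search routine. The following is a minimal sketch of the standard "smallest tail per length" formulation, not necessarily the exact procedure of the cited papers; the name `lis_length` is illustrative.

```python
from bisect import bisect_left

def lis_length(f):
    """Length of the longest strictly increasing subsequence of f."""
    # tails[k] is the smallest possible last value of an increasing
    # subsequence of length k + 1 seen so far; tails stays sorted.
    tails = []
    for x in f:
        k = bisect_left(tails, x)  # first position with tails[k] >= x
        if k == len(tails):
            tails.append(x)        # x extends the longest subsequence
        else:
            tails[k] = x           # x gives a smaller tail for length k + 1
    return len(tails)

print(lis_length([4, 24, 10, 9, 15, 17, 20, 18, 4, 19]))  # 6
```

This reads every entry once, which is exactly what the sublinear setting of the talk cannot afford.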

Too much to read

[Figure: array entries (5, 7, 4, 8, 9, 2 shown) feed into "Algorithm", which outputs "LIS is in range [0.4n, 0.6n]"]

- Array f is extremely large, so can't read all of it
- What can we say about LIS length, if we see very little?
- LIS = LIS length
- Read only poly(log n) positions
- Obviously randomized

Uniform sample says nothing

[Figure: array 2, 1, 4, 3, 6, 5, 8, 7, 10, 9; sampled entries 4 and 9]

- Choose uniform random sample of poly(log n) size
- LIS = n/2, but random sample is almost always increasing
- So not really that easy to learn about LIS
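
A small experiment illustrates the point. The sketch below builds the swapped-pairs array 2, 1, 4, 3, ..., whose LIS has length n/2, and measures how often a small uniform sample comes out increasing. The helper names, the array size, and the sample size are illustrative assumptions.

```python
import random
from bisect import bisect_left

def swapped_pairs(n):
    """Array 2, 1, 4, 3, ..., n, n-1 (n even)."""
    out = []
    for k in range(1, n, 2):
        out += [k + 1, k]
    return out

def lis_length(f):
    """Length of the longest strictly increasing subsequence."""
    tails = []
    for x in f:
        k = bisect_left(tails, x)
        if k == len(tails):
            tails.append(x)
        else:
            tails[k] = x
    return len(tails)

def sample_is_increasing(f, positions):
    """True if f restricted to the sorted sample positions is increasing."""
    vals = [f[i] for i in sorted(positions)]
    return all(a < b for a, b in zip(vals, vals[1:]))

def fraction_increasing(f, sample_size, trials, rng):
    """Fraction of uniform samples (without replacement) that look sorted."""
    hits = 0
    for _ in range(trials):
        positions = rng.sample(range(len(f)), sample_size)
        hits += sample_is_increasing(f, positions)
    return hits / trials

n = 100_000
f = swapped_pairs(n)
print(lis_length(f) == n // 2)  # True: the LIS is only half the array
print(fraction_increasing(f, 20, 1000, random.Random(0)))  # close to 1
```

A sample looks unsorted only if it happens to contain both entries of one swapped pair, which is rare when the sample is tiny compared to n.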

Our result

[Figure: number line from 1 to n with n/2 marked; the LIS lies in a range of width δn. Callout: "Ad alert!"]

- We want range to be small
- This work: For any (constant) δ > 0
- Algorithm gives additive δn approximation to LIS
- Running time is (1/δ)^(1/δ) (log n)^c
- [Ailon Chazelle Liu S 03], [Parnas Ron Rubinfeld 03]: Previous best δ = ½

Our result

[Figure: number line from 1 to n with n/2 marked; the LIS lies in a range of width δn. Callout: "Ad alert!"]

- We want range to be small
- This work: For any (constant) δ > 0
- Algorithm gives additive δn approximation to LIS
- Running time is (1/δ)^(1/δ) (log n)^c
- We get (1+δ)-approx to distance to monotonicity
- Previously best was factor 2
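
The two quantities are closely linked. Assuming distance to monotonicity is counted as the number of entries that must be deleted to leave an increasing array (one common convention; the slides do not spell out the definition), it equals n minus the LIS length, as this sketch shows. The function names are illustrative.

```python
from bisect import bisect_left

def lis_length(f):
    """Length of the longest strictly increasing subsequence."""
    tails = []
    for x in f:
        k = bisect_left(tails, x)
        if k == len(tails):
            tails.append(x)
        else:
            tails[k] = x
    return len(tails)

def distance_to_monotonicity(f):
    """Fewest deletions that leave a strictly increasing array (assumed definition)."""
    return len(f) - lis_length(f)

f = [4, 24, 10, 9, 15, 17, 20, 18, 4, 19]
print(distance_to_monotonicity(f))  # 4
```

Under this convention an additive δn error on the LIS is also an additive δn error on the distance, so a multiplicative (1+δ) guarantee on the distance is the stronger statement whenever the distance is small.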

Prelims: the array in space

[Figure: the array plotted as points in the plane, with position (1, 2, 3, ...) on one axis and value (4, 10, 15, 20) on the other; labels: "Violation", "Increasing sequence"]

- Input is points in plane, given as array
- (LIS is longest chain in partial order)
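
To make the geometric view concrete, the sketch below treats entry i of the array as the point (i, f(i)), calls a pair of points a violation when they cannot both lie on one increasing sequence, and checks whether a set of positions forms a chain. The function names are illustrative, not from the talk.

```python
def precedes(p, q):
    """Partial order on points: p is strictly to the left of and below q."""
    return p[0] < q[0] and p[1] < q[1]

def violations(f):
    """Pairs of positions (i, j), i < j, whose points are incomparable."""
    pts = list(enumerate(f))
    return [(i, j)
            for i in range(len(pts))
            for j in range(i + 1, len(pts))
            if not precedes(pts[i], pts[j])]

def is_chain(f, positions):
    """True if the points at the given positions are pairwise comparable."""
    pts = [(i, f[i]) for i in sorted(positions)]
    # Checking consecutive points suffices because the order is transitive.
    return all(precedes(p, q) for p, q in zip(pts, pts[1:]))

f = [4, 24, 10, 9, 15]
print(violations(f))           # [(1, 2), (1, 3), (1, 4), (2, 3)]
print(is_chain(f, [0, 3, 4]))  # True: 4 < 9 < 15
```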

A hard example

[Figure: construction from blocks of size k and 3k, with 10 points in each; labels: "LIS 4k", "LIS 2k"]

- The decision for a point depends on small scale properties of far away portions

A hard example

[Figure: construction from blocks of size k and 3k]

- Random samples in neighborhoods of points are identical!
- Can we really estimate LIS in polylog time?
- Is it time for some heavy work?
- I mean, time for lbs (lower bounds).

Outline (or lack thereof)

- Will I show proofs?
- No
- Will I show the algorithm?
- Maybe
- I will try to demonstrate the main insight
- By a series of thought experiments

The dynamic program

[Figure: point set with n/2 marked; labels: "Closest LIS point to left", "Splitter"]

- Closest LIS point to left gives splitter
- Find LIS in each blue region. Piece together!
- So we break up original problem into subproblems

The dynamic program

[Figure: candidate splitter S, with n/2 marked]

- But we don't know the right splitter.
- So try all possible! Only n different choices
- Choose the one that gives the largest sum of LISs
- Max_S (LIS-below-S + LIS-above-S)
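
The recursion can be written down directly. The following is a minimal sketch of a splitter-based dynamic program, deliberately simple rather than efficient: it splits the index range at its midpoint, tries every candidate splitter value S, and adds the best answers below and above S. The function name `lis_by_splitters` and the half-open box convention are assumptions made for illustration.

```python
from functools import lru_cache

def lis_by_splitters(f):
    """LIS length via the 'try every splitter' recursion (numeric values)."""
    f = tuple(f)

    @lru_cache(maxsize=None)
    def best(lo, hi, vlo, vhi):
        # Longest increasing subsequence using positions lo..hi-1
        # and values v with vlo <= v < vhi.
        if hi - lo == 0:
            return 0
        if hi - lo == 1:
            return 1 if vlo <= f[lo] < vhi else 0
        mid = (lo + hi) // 2
        # Candidate splitters: values found in the right half, plus vhi
        # (meaning "the whole subsequence lies left of the midpoint").
        candidates = {v for v in f[mid:hi] if vlo <= v < vhi} | {vhi}
        # Left box takes values below S, right box takes values from S up.
        return max(best(lo, mid, vlo, s) + best(mid, hi, s, vhi)
                   for s in candidates)

    if not f:
        return 0
    return best(0, len(f), min(f), max(f) + 1)

print(lis_by_splitters([4, 24, 10, 9, 15, 17, 20, 18, 4, 19]))  # 6
```

Taking S to be the value of the first LIS point right of the midpoint always works, which is why it is enough to try values from the right half.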

The dynamic program

[Figure: point set with n/2 marked]

- If you know LIS in all small boxes, you can build LIS for bigger boxes
- Not the most efficient DP
- So our sublinear algo will mimic this process

The IP

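[Figure: point set with n/2 marked and a splitter; label: "LIS is in blue region". Speech bubbles: "Is this point on LIS?", "Where is the splitter?", "It is there."]

The IP

[Figure: point set with n/2 marked; labels: "This point NOT on LIS", "LIS is in blue region". Speech bubbles: "Where is the splitter?", "It is there."]

The IP

[Figure: point set with n/2 and 3n/4 marked. Speech bubbles: "I wish we knew the splitter in that region", "It is there."]

The IP

[Figure: point set with n/2, 3n/4 and 5n/8 marked. Speech bubbles: "I think I know what will happen next", "You're lucky I'm here"]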

The interactive protocol

- If point stays in blue region till very end, then it is good (on LIS). Otherwise, bad.
- This takes (log n) steps, with the help of the wizard
- If we could simulate the wizard

What?? If you could simulate the wizard, you know the LIS!
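
The protocol is easy to simulate when the wizard is simply handed the answer. In the sketch below the wizard holds one fixed LIS and, for a requested position, reports the splitter there (the value of the closest LIS point to the left). A point query halves its index range each round and rejects as soon as the point leaves the blue region. `Wizard`, `lis_positions`, and `on_lis` are illustrative names, and computing the LIS up front is exactly what a sublinear algorithm cannot do.

```python
from bisect import bisect_left

def lis_positions(f):
    """Positions of one longest strictly increasing subsequence of f."""
    tail_vals, tail_pos = [], []  # smallest tail per length, and its position
    prev = [-1] * len(f)          # predecessor links for backtracking
    for i, x in enumerate(f):
        k = bisect_left(tail_vals, x)
        prev[i] = tail_pos[k - 1] if k > 0 else -1
        if k == len(tail_vals):
            tail_vals.append(x)
            tail_pos.append(i)
        else:
            tail_vals[k] = x
            tail_pos[k] = i
    out = []
    i = tail_pos[-1] if tail_pos else -1
    while i != -1:
        out.append(i)
        i = prev[i]
    return out[::-1]

class Wizard:
    """Knows one LIS and answers splitter queries about it."""
    def __init__(self, f):
        self.f = f
        self.lis = lis_positions(f)
        self.queries = 0

    def splitter(self, m):
        """Value of the closest LIS point strictly left of position m."""
        self.queries += 1
        k = bisect_left(self.lis, m)
        return self.f[self.lis[k - 1]] if k > 0 else float("-inf")

def on_lis(f, i, wizard):
    """Decide whether position i is on the wizard's LIS in about log n rounds."""
    lo, hi = 0, len(f)
    while hi - lo > 1:
        m = (lo + hi) // 2
        s = wizard.splitter(m)
        if i < m:
            if f[i] > s:    # left of the line but above the splitter
                return False
            hi = m
        else:
            if f[i] <= s:   # right of the line but not above the splitter
                return False
            lo = m
    return True

f = [4, 24, 10, 9, 15, 17, 20, 18, 4, 19]
w = Wizard(f)
print([i for i in range(len(f)) if on_lis(f, i, w)])  # [0, 3, 4, 5, 7, 9]
```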

Find a splitter

[Figure: point set with n/2 marked; callout: "If very few LIS points outside blue, this is not a bad splitter"]

- Finding splitter may be hard, so try for approximate versions?
- But how do we determine the number of LIS points?

Find a splitter

[Figure: point set with n/2 marked and a "Conservative splitter"; total no. of points outside blue < µn]

- If µ < 1/(100 log n), being [against health care] conservative is good enough

Easy to check

[Figure: point set with n/2 marked]

- Count fraction of sample outside blue
- poly(log n) samples check this accurately
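
A sampling check of this kind might look as follows. This is a minimal sketch that assumes the blue region of a candidate splitter (line position m, value s) consists of points left of m with value at most s and points at or right of m with value above s; the function names and parameters are illustrative.

```python
import random

def outside_blue(f, i, m, s):
    """True if point (i, f[i]) lies outside the blue region of splitter (m, s)."""
    return f[i] > s if i < m else f[i] <= s

def estimate_fraction_outside(f, m, s, num_samples, rng):
    """Estimate the fraction of points outside blue from uniform samples."""
    hits = sum(outside_blue(f, rng.randrange(len(f)), m, s)
               for _ in range(num_samples))
    return hits / num_samples

rng = random.Random(1)
f = list(range(1000))  # a sorted array
print(estimate_fraction_outside(f, 500, 499, 200, rng))  # 0.0: nothing is outside
print(estimate_fraction_outside(f, 500, 100, 200, rng))  # roughly 0.4
```

By a standard Hoeffding bound, estimating the fraction to within an additive ε takes on the order of 1/ε² samples (times a log factor for the failure probability), which is poly(log n) when ε is 1/poly(log n).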

Getting a conservative splitter

[Figure: point set with n/2 marked]

- We can sample (log n) different candidates and check which of them [disbelieves evolution] is conservative
- What if no conservative splitter exists?
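
Candidate selection can then be a simple loop. In this sketch the candidate splitter values are supplied by the caller (for example, values read at a few random positions), each one is tested with a sampling check, and the first whose estimated outside fraction falls below µ is returned. The names, the acceptance rule, and the way candidates are produced are assumptions for illustration, not the talk's exact procedure.

```python
import random

def estimate_fraction_outside(f, m, s, num_samples, rng):
    """Sampled fraction of points outside the blue region of splitter (m, s)."""
    hits = 0
    for _ in range(num_samples):
        i = rng.randrange(len(f))
        # Left of m a point is outside if its value exceeds s;
        # at or right of m it is outside if its value is at most s.
        hits += (f[i] > s) if i < m else (f[i] <= s)
    return hits / num_samples

def find_conservative_splitter(f, m, candidates, mu, num_samples, rng):
    """First candidate whose estimated outside fraction is below mu, else None."""
    for s in candidates:
        if estimate_fraction_outside(f, m, s, num_samples, rng) < mu:
            return s
    return None

f = list(range(1000))  # sorted array: the value 499 splits it perfectly at n/2
rng = random.Random(0)
print(find_conservative_splitter(f, 500, [100, 499, 900], 0.05, 400, rng))
```

A `None` result corresponds to the case asked about in the last bullet, where no candidate passes the check.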

A liberal paradise

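[Figure: point set with n/2 marked; callouts: "Choose any line", "No. of points outside at least µn"]

- So we know that LIS ≤ (1-µ)n: the points outside the real splitter's blue region are not on the LIS
- Leads to the next idea. Boosting approximations!
- Given δ-approx to LIS, can we improve δ?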

Boosting approximations

[Figure: real splitter at n/2; run δn-approx on points in each of the two boxes, which contain n1 and n2 points; no. of points outside at least µn]

- Take sum of outputs as total LIS estimate
- LIS = LIS1 + LIS2, Est = Est1 + Est2
- |Est1 - LIS1| < δn1, |Est2 - LIS2| < δn2
- So |Est - LIS| < δ(n1 + n2)
- n1 + n2 ≤ (1-µ)n, so |Est - LIS| < δ(1-µ)n !

Putting it together

[Figure: point set with n/2 marked; callout: "Conservative splitter?"]

- Check if each is conservative splitter
- If it is, we've found the right subproblems
- Otherwise

Putting it together

[Figure: candidate splitter S at n/2; run δn-approx on points in each of the two boxes]

- One of these is close enough to real splitter
- Est(S) = Left-Est(S) + Right-Est(S)
- Final Estimate = max_S Est(S)
- Looks like a great idea!
- We go from δn to δ(1-µ)n. Recur to keep improving approximation
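
One level of this scheme can be sketched as follows. The sub-estimator `approx` is a black box standing in for the δn-approximation (an exact LIS routine is plugged in here only to make the sketch runnable), and `splitters` is whatever set of candidate values is being tried; both names are hypothetical.

```python
from bisect import bisect_left

def lis_length(f):
    """Exact LIS length, used here as a stand-in for the approximate oracle."""
    tails = []
    for x in f:
        k = bisect_left(tails, x)
        if k == len(tails):
            tails.append(x)
        else:
            tails[k] = x
    return len(tails)

def estimate_via_splitters(f, splitters, approx=lis_length):
    """Final estimate = max over S of Left-Est(S) + Right-Est(S)."""
    m = len(f) // 2
    best = 0
    for s in splitters:
        left = [v for v in f[:m] if v <= s]   # box below S, left of n/2
        right = [v for v in f[m:] if v > s]   # box above S, right of n/2
        best = max(best, approx(left) + approx(right))
    return best

f = [4, 24, 10, 9, 15, 17, 20, 18, 4, 19]
print(estimate_via_splitters(f, sorted(set(f))))  # 6 with the exact stand-in
```

With a real approximate oracle, the error of the output is the sum of the two box errors, which is where the δ(n1 + n2) bound comes from.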

It fails, miserably

[Figure: recursion tree of Alg calls with 1/µ levels; approximation ½ at the bottom, then δ2, δ1, and δ0 = δ1(1-µ) at the top]

- As we go up each level, approx gets better by (1-µ).
- So to get δ0 = ¼, how many levels needed?
- ¼ = ½(1-µ)^t. So t ≈ 1/µ
- We have running time at least 2^(1/µ).
- So, µ needs to be > 1/log log n.
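
The arithmetic behind this failure is quick to reproduce. The sketch below counts the levels t needed for ½(1-µ)^t to reach ¼ and prints the resulting lower bound of 2^t calls for two choices of µ; the function name and the sample value of n are illustrative.

```python
import math

def levels_needed(mu, start=0.5, target=0.25):
    """Smallest t with start * (1 - mu)**t <= target."""
    return math.ceil(math.log(target / start) / math.log(1.0 - mu))

n = 2 ** 20
choices = [
    ("mu = 1/(100 log n)", 1 / (100 * math.log2(n))),
    ("mu = 1/log log n", 1 / math.log2(math.log2(n))),
]
for label, mu in choices:
    t = levels_needed(mu)
    print(f"{label}: t = {t} levels, at least 2**{t} calls")
```

Since -ln(1-µ) is about µ for small µ, t is about (ln 2)/µ, which matches the slide's t ≈ 1/µ up to a constant.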

Find a splitter

[Figure: point set with n/2 marked and a "Conservative splitter"; total no. of points outside blue < µn]

- If µ < 1/(100 log n), being [against health care] conservative is good enough

The basic dichotomy

[Figure: flowchart for problem P. "We find splitter": continue IP (the Interactive Protocol phase). "Cannot find splitter": the Dynamic Programming phase. Arrows labeled "Strengthen" and "Weaken"]

- For IP, we need µ < 1/log n
- µn is error in each level of IP
- For DP, we need µ > 1/log log n
- (1-µ) is decrease in approximation

Reducing to smaller DP!

[Figure: boxes of size n/(log n); run δ-approx to get LIS estimate inside each box]

- Run δ-approx on all poly(log n) such boxes
- Use Dynamic Program to find chain with largest sum of estimates
- Longest path in DAG
- Can solve in poly(log n) time
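
The box-level dynamic program is a longest-path computation in a DAG. The sketch below assumes each box is given as a tuple (x0, x1, y0, y1, est), where est is the LIS estimate inside the box, and that box A may precede box B when A lies entirely to the left of and below B; this representation and the function name are assumptions for illustration.

```python
def best_chain(boxes):
    """Largest total estimate over chains of boxes (quadratic-time DP)."""
    # Sorting by right edge puts every possible predecessor of a box
    # before it (boxes are assumed to have x0 < x1).
    order = sorted(boxes, key=lambda b: b[1])
    best = []  # best[j] = best chain total ending at order[j]
    for j, (x0, x1, y0, y1, est) in enumerate(order):
        total = est
        for k in range(j):
            px0, px1, py0, py1, _ = order[k]
            if px1 <= x0 and py1 <= y0:  # order[k] is left of and below box j
                total = max(total, best[k] + est)
        best.append(total)
    return max(best, default=0)

boxes = [
    (0, 5, 0, 5, 3),
    (5, 10, 5, 10, 4),
    (5, 10, 0, 5, 6),
    (10, 15, 10, 15, 2),
]
print(best_chain(boxes))  # 9, from the chain with estimates 3 + 4 + 2
```

With poly(log n) boxes, the quadratic loop stays within poly(log n) time.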

Dichotomy theorem

Either it is easy to find the right subproblems

OR

One can go from δ-approx to (δ - δ²)-approx by a (log n) sized DP

The algorithm, in one slide

[Figure: flowchart for problem P. "We find splitter": continue IP. "Cannot find splitter": make poly(log n) calls to δ-approx and solve DP of poly(log n) size.]

- Overall running time becomes (log n)^(1/δ)
- Miracle that the math works out

The even better version

- Don't exactly solve this dynamic program!
- Use our sublinear algo to approximately solve in (loglog n) time. Then do it recursively
- It's painful
- It's all Greek: α β ? δ ε ? ? µ ?
- We had ?, but got rid of it

What next?

- Sublinear dynamic programming!
- We get (1/δ)^(1/δ) (log n)^c time. Can we get (log n)/δ time?
- Would be extremely cool. Completely optimal
- Applications for other dynamic programs?
- How does one find the right subproblems in sublinear time?
- Generalize the dichotomy
- Longest common subsequence/edit distance?

Ask and you shall know