Research in Theoretical Computer Science, Madhu Sudan, CSAIL



1
Research in Theoretical Computer Science
Madhu Sudan
CSAIL
2
Overview
  • Part I: Introduction to Theory of Computation.
  • Part II: Perspective on (immediate) relevance.
  • Part III: A current research direction.
  • Introverted Algorithms
  • Communication with errors: meaning of bits

3
Part I: Introduction to Theory of CS
4
Theory of Computing
  • Mathematical study of Computation and its
    consequences.
  • Computation: Sequence of simple steps, leading to
    complex change in information.
  • Measures: Efficiency of algorithm/program
  • Depends on hardware and implementation.
  • Can ask: how does it scale?
  • If I double the hardware capacity (speed/memory)
  • Will this increase the biggest size of problem I
    can solve by constant factor? (polynomial
    solution)
  • Or by additive constant? (exponential solution)
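
The scaling question above can be checked numerically. The following is a minimal sketch, not taken from the slides: the cost models `quadratic` and `exponential` and the step budget are assumptions chosen only for illustration.

```python
def largest_solvable(cost, budget):
    """Largest n with cost(n) <= budget, found by a simple upward search."""
    n = 0
    while cost(n + 1) <= budget:
        n += 1
    return n

def quadratic(n):
    # Assumed polynomial cost model: n^2 steps.
    return n * n

def exponential(n):
    # Assumed exponential cost model: 2^n steps.
    return 2 ** n

budget = 1_000_000  # hypothetical number of steps the hardware can afford
for name, cost in [("n^2", quadratic), ("2^n", exponential)]:
    before = largest_solvable(cost, budget)
    after = largest_solvable(cost, 2 * budget)
    print(f"{name}: n = {before} before doubling, n = {after} after")
```

Under these assumptions, doubling the budget multiplies the reachable size for the quadratic cost by roughly the square root of 2, while for the exponential cost it only adds one.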

5
Theory of Computing
  • Mathematical study of Computation and its
    consequences.
  • Computation: Sequence of simple steps, leading to
    complex change in information.
  • Issues:
  • Algorithms: Design efficient sequence of steps
    that produce a desired effect. What is efficient?
  • Complexity: When is inefficiency inherent?
  • Implications: What effect does (in)efficiency
    have on human (intelligent) interaction?
  • Surprisingly broad in scope and impact.

6
Example: Integer Arithmetic
  • Addition
  • Multiplication
  • Factoring

  2 3 1 5 6 7
+   5 8 9 1 4
-------------
  2 9 0 4 8 1
Linear!
7
Example: Integer Arithmetic
  • Addition: Linear!
  • Multiplication
  • Factoring?

          2 3 1 5 6 7
        x   5 8 9 1 4
---------------------
          9 2 6 2 6 8
        2 3 1 5 6 7
    2 0 8 4 1 0 3
  1 8 5 2 5 3 6
1 1 5 7 8 3 5
---------------------
1 3 6 4 2 5 3 8 2 3 8
Quadratic! Fastest? Not Linear?
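
The quadratic cost of the grade-school method can be made concrete by counting single-digit multiplications. This is a minimal sketch; the function name and the step counter are illustrative and not part of the slides.

```python
def schoolbook_multiply(a, b):
    """Multiply non-negative integers digit by digit.

    Returns (product, number_of_single_digit_multiplications).
    """
    xs = [int(d) for d in str(a)][::-1]  # least significant digit first
    ys = [int(d) for d in str(b)][::-1]
    result = [0] * (len(xs) + len(ys))
    steps = 0
    for j, y in enumerate(ys):
        carry = 0
        for i, x in enumerate(xs):
            steps += 1  # one digit-by-digit multiplication
            total = result[i + j] + x * y + carry
            result[i + j] = total % 10
            carry = total // 10
        result[j + len(xs)] += carry
    product = int("".join(map(str, result[::-1])))
    return product, steps

product, steps = schoolbook_multiply(231567, 58914)
print(product, steps)
```

For the 6-digit and 5-digit numbers on the slide, the inner loop runs 6 x 5 = 30 times, which is the product of the two lengths.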
8
Example: Integer Arithmetic
  • Addition: Linear!
  • Multiplication: Quadratic! Fastest? Not linear?
  • Factoring? Write 13642538238 as a product of two
    integers (each less than 1000000)
  • Inverse of above problem.
  • Not known to be linear/quadratic/cubic.
  • Believed to require exponential time.
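
For contrast with multiplication, here is a minimal sketch of the obvious way to attack the factoring task above: try divisors one by one. The function name `split_below` is hypothetical. The number of candidates grows with the square root of the input, which is exponential in the number of digits.

```python
from math import isqrt

def split_below(n, limit):
    """Find (a, b) with a * b == n and both factors below limit, by trial division.

    Returns None if no such pair exists.
    """
    for d in range(2, isqrt(n) + 1):
        if n % d == 0 and n // d < limit:
            return d, n // d
    return None

pair = split_below(13642538238, 1_000_000)
print(pair)
```

Checking a proposed answer takes one multiplication, while finding it this way takes many thousands of trial divisions even for this small input.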

9
Fundamental quests of CS Theory
  • Algorithms: Given a task (e.g., multiplication)
    find fast algorithms.
  • First algorithm we think of may not be fastest.
  • Complexity: Prove lower bounds on resources
    required to solve problem.
  • Is multiplication harder than addition?
  • Is factoring harder than multiplication?
  • Implications: Cryptography
  • Economics: Markets implement efficient
    computation.
  • Biology: Nature implements efficient computation.
  • Networks: Errors implement efficient computation.
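
As one illustration of "first algorithm we think of may not be fastest", the sketch below uses Karatsuba's method, which the slides do not name; it is swapped in here as a well-known example. It replaces four half-size products with three, giving roughly n^1.585 digit operations instead of n^2.

```python
def karatsuba(x, y):
    """Multiply non-negative integers using three recursive half-size products."""
    if x < 10 or y < 10:
        return x * y  # single-digit base case
    half = max(len(str(x)), len(str(y))) // 2
    base = 10 ** half
    x_hi, x_lo = divmod(x, base)
    y_hi, y_lo = divmod(y, base)
    low = karatsuba(x_lo, y_lo)
    high = karatsuba(x_hi, y_hi)
    # (x_lo + x_hi)(y_lo + y_hi) - low - high equals x_lo*y_hi + x_hi*y_lo
    cross = karatsuba(x_lo + x_hi, y_lo + y_hi) - low - high
    return high * base * base + cross * base + low

print(karatsuba(231567, 58914))
```

This is only a sketch in decimal for readability; practical implementations work on machine words rather than decimal strings.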

10
Long-range questions
  • Is P = NP?
  • Formally: Is all computation reversible? (e.g.,
    multiplication vs. factoring?)
  • Philosophically: can every designer
    (mathematician, physicist, engineer, biologist)
    be replaced by a computer?
  • (Most of us don't expect this).
  • Can we factor integers efficiently?
  • (Hopefully, still no).
  • If not, can we build secure communication based
    on this?
  • Led to RSA. Still many challenges today.
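
To show how the presumed hardness of factoring supports secure communication, here is a toy RSA round trip. It is a sketch only: the primes, exponent, and message are tiny textbook values chosen for illustration, and there is no padding, so it is not secure.

```python
# Toy parameters (assumptions for illustration, far too small for real use).
p, q = 61, 53
n = p * q                    # public modulus; factoring n reveals p and q
phi = (p - 1) * (q - 1)
e = 17                       # public exponent, coprime to phi
d = pow(e, -1, phi)          # private exponent (modular inverse, Python 3.8+)

message = 65
cipher = pow(message, e, n)      # encrypt with the public key (e, n)
recovered = pow(cipher, d, n)    # decrypt with the private key (d, n)
print(n, d, recovered)
```

Anyone who can factor `n` can recompute `phi` and therefore `d`, which is why the security rests on factoring being hard.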

11
Modern addenda to long-term quests
  • Is the universe random?
  • Maybe. If so:
  • Can build efficient algorithms this way (modern
    examples due to Karger, Rubinfeld, Indyk, Kelner)
  • Can synchronize distributed systems (essential,
    as shown by Lynch et al.)
  • Can generate and preserve secrets (essential, as
    shown by Goldwasser and Micali).
  • Maybe not. If so:
  • Might still look random to us, because P ≠ NP.
    (Long history: Blum, Micali, Yao)
  • Is the universe quantum? Then factoring is easy (Shor)

12
Current quests in computation
  • Algorithms for Massive data sets
  • How can we leverage the computational power of a
    laptop to understand data such as the WWW? Main
    issue: Massive data won't fit in our storage.
  • Factors in our favor:
  • We can perform random sampling
  • We don't have to deliver guaranteed answers
  • Many results [Karger, Vempala, Rubinfeld, Indyk]
  • Can tell if there's a trend change [Rubinfeld
    et al.]
  • Can tell if a signal has high intensity in some
    frequency. [Indyk et al.]
  • Underlying emphasis on Randomness.
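
One standard way to sample from data that does not fit in storage is reservoir sampling, sketched below. The slides do not name this technique; it is used here as a familiar example, and the function name and parameters are illustrative.

```python
import random

def reservoir_sample(stream, k, rng=None):
    """Keep a uniform random sample of k items from a stream of unknown length.

    Uses O(k) memory regardless of how long the stream is.
    """
    rng = rng or random.Random()
    sample = []
    for index, item in enumerate(stream):
        if index < k:
            sample.append(item)          # fill the reservoir first
        else:
            j = rng.randint(0, index)    # inclusive on both ends
            if j < k:
                sample[j] = item         # replace with probability k / (index + 1)
    return sample

print(reservoir_sample(range(1_000_000), 5, random.Random(0)))
```

The stream is read once and never stored, which matches the constraint that massive data will not fit in memory.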

13
Part II: Perspective of theory
14
History of theoretical CS
  • 1930s: Turing invented the Turing machine.
  • Universality: One machine implements all
    algorithms.
  • Why? To model thought/reasoning/logic
  • theorems and proofs
  • Became foundation of modern computers (von
    Neumann)
  • 1960s: Non-trivial algorithms
  • Peterson: BCH decoder
  • Cooley-Tukey: FFT
  • Dijkstra: shortest paths
  • 1970s: NP-completeness, Cryptography, RSA.
  • 1990s: Internet algorithms (Yahoo!, Akamai,
    Google).

15
Theory vs. Practice
  • Theoretical Perspective
  • Focus on long-term time horizon; not very close
    attention to current nature of:
  • Hardware
  • Domain-specific information
  • Solution feasibility
  • Why should you care (today)?
  • Lessons learned from past are useful (theories
    more important than theorems).
  • Good insight into problems of the future.
  • Occasionally solutions useful today!

16
Part III: Recent Research Problems, Solutions
17
Part IIIa: Introverted Algorithms
18
Sublinear time algorithms [R. Rubinfeld, P. Valiant]
  • Typical Algorithmic Tasks.
  • Given x, compute some f(x) in time |x|. Linear
    time!
  • Modern challenges
  • Data too massive to allow time |x| to process
    it.
  • Can we do much faster?
  • Allow randomness in algorithms.
  • Allow some approximation error.
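
A minimal sketch of the idea, assuming the task is to estimate what fraction of the input satisfies some predicate: read only a fixed number of random positions instead of all |x| of them. The function name, predicate, and sample count are illustrative assumptions.

```python
import random

def estimate_fraction(data, predicate, samples, rng=None):
    """Estimate the fraction of items satisfying predicate from random probes.

    Running time depends on `samples`, not on len(data).
    """
    rng = rng or random.Random()
    hits = sum(
        1 for _ in range(samples)
        if predicate(data[rng.randrange(len(data))])
    )
    return hits / samples

data = range(1_000_000)  # stands in for a massive input with random access
estimate = estimate_fraction(data, lambda v: v % 4 == 0, 10_000, random.Random(1))
print(estimate)
```

The answer is approximate and can be wrong with small probability, which is exactly the trade the two bullets above allow.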

19
Motivations
  • Internet Traffic
  • Suppose we maintain vast amounts of logs of
    internet traffic through a router.
  • Was there a major shift in the nature of requests
    within the last hour (perhaps a denial of service
    attack)?
  • Disease Patterns
  • Suppose we have data for spread of a disease.
  • What are the causal factors?
  • Theme: Data abundant; processing is the bottleneck

22
Introverted Algorithms: New Area, Many Problems, Few Tools
[P. Valiant]
Symmetric Approximation Properties of Distributions
[Figure: the distribution space is mapped to the reals, with thresholds α and β dividing outcomes into "yes", "?", and "no". Labels: invariant under renaming; continuous; intrinsic properties. Example distributions: Uniform over a to m, Uniform over n to z.]
Includes approximating Entropy, Statistical (L1)
Distance, Support Size, Information Divergences,
other Lc distances, weighted distances
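
"Invariant under renaming" can be illustrated with two of the listed properties. The sketch below uses a made-up three-element distribution and a made-up relabeling; it only shows that entropy and support size ignore the labels.

```python
from math import log2

def entropy(dist):
    """Shannon entropy in bits of a distribution given as {label: probability}."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def support_size(dist):
    """Number of labels with non-zero probability."""
    return sum(1 for p in dist.values() if p > 0)

def rename(dist, mapping):
    """Apply a one-to-one relabeling of the domain."""
    return {mapping[label]: p for label, p in dist.items()}

dist = {"a": 0.5, "b": 0.25, "c": 0.25}                 # hypothetical example
renamed = rename(dist, {"a": "x", "b": "y", "c": "z"})  # hypothetical relabeling
print(entropy(dist), entropy(renamed))
print(support_size(dist), support_size(renamed))
```

Both values depend only on the multiset of probabilities, which is what makes these properties symmetric.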

23
New Contribution
  • Entropy Approximation (< α or > β?): n^(α/β)
  • Statistical Distance (< α or > β?): n
Two Components of a Solution
  • An Upper Bound (Algorithm): n^(α/β) [BDKR 02], n [B 01]
  • A Lower Bound (Impossibility Proof): n^(2α/3β) [RRSS 07], n^(1/2) [BFRSW 00]
24
New Contribution
Canonical Tester
  • Entropy Approximation (< α or > β?): n^(α/β)
  • Statistical Distance (< α or > β?): n
Canonical Testing Theorem
If the Canonical Tester does not work, nothing
does.
Both an upper and a lower bound
Determining the sample complexity of property
testing is now a question of algorithm analysis
What's the algorithm?
25
The Canonical Tester
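[Figure: the Canonical Tester applied to the sample (a,b,b,a,a,a,f,e,e,e) with threshold 3. Step 1: estimate high frequencies (.4 and .3). Step 2: constrain low frequencies (each < .3). Output: yes or no, with regions yes / ? / no over the domain a, b, c, d, e.]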
If the Canonical Tester does not work, nothing
will
π is (ε, δ)-weakly continuous if |d1 - d2| < δ implies
|π(d1) - π(d2)| < ε.
If the k-sample Canonical Tester with threshold
O( ) does not correctly distinguish π < α - ε
from π > β + ε, then no tester can distinguish π < α + ε
from π > β - ε in k?/n^(o(1)) samples.
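
The first stage of the tester on this slide, splitting the sample into "high" and "low" frequency elements, can be sketched as follows. This is an assumption-laden illustration: it treats "threshold 3" as "seen at least 3 times", and it omits the later stage that decides yes or no from these constraints.

```python
from collections import Counter

def split_by_threshold(sample, threshold):
    """Estimate frequencies of often-seen elements; bound the rest.

    Returns (high, low_bound): empirical frequencies for elements seen at
    least `threshold` times (assumed reading of the slide), and the upper
    bound threshold / k that applies to every other element.
    """
    k = len(sample)
    counts = Counter(sample)
    high = {s: c / k for s, c in counts.items() if c >= threshold}
    low_bound = threshold / k
    return high, low_bound

sample = ("a", "b", "b", "a", "a", "a", "f", "e", "e", "e")  # sample from the slide
high, low_bound = split_by_threshold(sample, 3)
print(high, low_bound)
```

On the slide's sample this gives estimates .4 and .3 for the two frequent elements and the constraint "< .3" for the rest, matching the numbers in the figure.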
26
Part IIIb: Robust Intelligent Communication
27
Intelligence and Interaction [Juba, S.]
  • Typical communication protocols are non-robust.
  • Depend on perfect understanding between sender
    and receiver. Require universal adoption of fixed
    standards. Is this essential?
  • Why?
  • To reduce human oversight in critical tasks.
  • E.g., cars that exchange information, hospitals
    exchanging medical records.
  • Heterogeneity leads to violation of standards.
  • Technical issues
  • Classical communication suppresses/fears
    intelligence of communicators. Need new models and
    methods to exploit the intelligence of sender and
    receiver.

29
Modelling the Problem
  • Alice wishes to send algorithm A to Bob
  • Both know programming but do so in different
    languages.
  • Can she send him the algorithm?
  • Theorem [Juba, S.]: Not possible to do this
    unambiguously.
  • Implications: Perfect understanding impossible in
    evolving settings (when two communicators evolve)
  • What should we do?

30
Communication Goals
  • Communication is not an end in itself; it is a
    means to some (selfish, verifiable) end.
  • Bob must be trying to use Alice to some benefit
  • E.g., to alter the environment (remote control)
  • To learn something (intellectual curiosity).
  • Test Case: Bob (weak computer) tries to
    communicate with Alice (strong computer) to use
    her computational abilities.
  • Theorem [Juba, S.]: Bob can use Alice's help to
    solve his problem iff the problem is verifiable
    (without common prior background).
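
The role of verifiability can be sketched with a toy example. Everything here is hypothetical and not the construction from the theorem: Alice answers in an encoding Bob does not know (reversed text), Bob tries a short list of candidate interpreters, and he accepts only an answer he can check himself.

```python
def alice(n):
    """Helpful Alice: factors n, but replies in her own 'language'.

    Assumed encoding for this sketch: the text "d,n/d" written backwards.
    Assumes n is composite.
    """
    d = next(d for d in range(2, n) if n % d == 0)
    return f"{d},{n // d}"[::-1]

# Hypothetical candidate interpreters that Bob enumerates.
DECODERS = [
    lambda s: s,        # guess 1: Alice speaks Bob's language
    lambda s: s[::-1],  # guess 2: Alice writes backwards
]

def bob(n, helper):
    """Use an untrusted helper to factor n; accept only verified answers."""
    reply = helper(n)
    for decode in DECODERS:
        try:
            a, b = (int(part) for part in decode(reply).split(","))
        except ValueError:
            continue  # this interpretation makes no sense; try the next one
        if 1 < a < n and a * b == n:  # Bob's own verification step
            return a, b
    return None

print(bob(91, alice))
```

Because Bob can verify the result by multiplying, a wrong interpretation or a lying helper is rejected rather than trusted.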

31
Examples
  • Bob uses Alice to determine which programs are
    viruses.
  • Undecidable problem. Bob cannot verify.
  • Eventually he will make an error.
  • Bob uses Alice to break a cryptosystem.
  • He knows when he has broken in. Should do so.
  • In the process of doing so he learns Alice's
    language (and realizes he is learning).
  • Bob uses Alice to add integers.
  • Can verify, so he won't make mistakes.
  • But probably won't learn her language.

32
Implications
  • Architecture for communicating computers
  • Each interface should have a dedicated
    interpreter
  • Interpreter is constantly in mode of checking and
    adapting.
  • Will future of communication look like this?
  • Answer in 20 years

33
Recap: Why is Theory Important?
  • Lessons learned from past are useful (theories
    more important than theorems).
  • Message of FoxConn Algorithms Course!
  • Good insight into problems of the future.
  • Occasionally solutions useful today!
  • RSA, Akamai (CSAIL has more royalties from theory
    than all other sources put together)!

34
Thank You!
35
Property Testers
[Figure: a distribution over a, b, c, d, e (values shown: .07, .13, 0, .03, .11) is sampled to give (a,b,b,a,a,a,f,e,e,e); a tester reads the sample and answers yes or no, with regions yes / ? / no.]