1
Property Testing: A Learning Theory Perspective
  • Dana Ron
  • Tel Aviv University

2
Property Testing (Informal Definition)
For a fixed property P and any object
O, determine whether O has property P, or whether
O is far from having property P (i.e., far
from any other object having P ).
3
Examples
  • The object can be a function and the property can
    be linearity.
  • The object can be a string and the property can
    be membership in a fixed regular language L.
  • The object can be a graph and the property can be
    3-colorability.

4
Two Views of Property Testing
  • A relaxation of exactly deciding whether the
    object has the property or does not have the
    property.
  • A relaxation of learning the object (with
    membership queries and under the uniform
    distribution).

In either case want testing algorithm to be
significantly more efficient than
decision/learning algorithm.
Q: Which view is more right?
A: It depends, mainly on the type of objects and
properties studied: combinatorial objects and
properties vs. function classes that are of
interest to the learning community.
5
A Little Background
  • Initially defined by Rubinfeld and Sudan in the
    context of Program Testing (of algebraic
    functions).
  • Together with Goldreich and Goldwasser, initiated
    the study of testing properties of combinatorial
    objects, and in particular graphs.
  • Growing body of work deals with properties of
    functions, graphs, strings, sets of points ...

Many algorithms with complexity that is
sub-linear in (or even independent of) size
of object.
6
Formal Definition (standard model)
  • A property testing algorithm for property P is
    given a distance parameter ε and query access(*)
    to a function f.
  • If f has property P then the algorithm should
    accept w.h.p.
  • If f is ε-far from any function having property
    P then the algorithm should reject w.h.p.
    Distance is measured with respect to the
    uniform distribution.(**)

(*) May consider testing with random examples only
(**) May consider other distributions (unknown dist.)
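
To make the definition concrete, here is a minimal Python sketch (hypothetical helper names; brute force over a small domain {0,1}^n, for illustration only) of distance under the uniform distribution and of being ε-far from a property:

```python
import itertools

def dist(f, g, n):
    # Fraction of points in {0,1}^n on which f and g disagree (uniform distribution).
    points = list(itertools.product((0, 1), repeat=n))
    return sum(f(x) != g(x) for x in points) / len(points)

def is_eps_far(f, property_functions, eps, n):
    # f is eps-far from the property if dist(f, g) > eps for every g having the property.
    return all(dist(f, g, n) > eps for g in property_functions)
```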
7
Property Testing and Learning: Basic Comments/Observations
  • Comment (Motivation): Can use testing as a
    preliminary step to learning, that is, for
    efficiently selecting a good hypothesis class.
  • Observation: Testing is no harder than (proper)
    learning: If we have a learning algorithm for
    function class F that outputs a hypothesis in F,
    then we can use it to test the property of
    belonging to F.

That is, run the learning algorithm with accuracy
parameter set to ε/2, and check that the hypothesis
it outputs is at most 3ε/4-far from f on an
independent sample.
Want testing algorithm to be more efficient than
learning algorithm.
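
A minimal sketch of this reduction (the proper_learner interface and the sample size are assumptions, not part of the slide):

```python
import random

def test_via_proper_learning(f, proper_learner, eps, n, sample_size=None):
    # (1) Run the proper learner with accuracy eps/2; it returns a hypothesis h in the class F.
    h = proper_learner(f, eps / 2)
    # (2) Estimate dist(f, h) on an independent uniform sample; accept iff the estimate <= 3*eps/4.
    m = sample_size if sample_size is not None else int(16 / eps)
    disagreements = 0
    for _ in range(m):
        x = tuple(random.randint(0, 1) for _ in range(n))
        if f(x) != h(x):
            disagreements += 1
    return disagreements / m <= 3 * eps / 4  # True = ACCEPT, False = REJECT
```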
8
Classes/Properties for which testing is more
efficient than learning
  • Linear functions
  • Low-degree polynomials
  • Singletons, Monomials
  • (small) DNF and general Boolean formula and
    circuits
  • Monotone functions
  • Juntas
  • Halfspaces
  • Decision Lists, Decision Trees, Branching
    Programs
  • Clustering
  • Properties of Distributions

Two recurring approaches: the Self-Correcting
approach and the Enforce & Test approach.
9
Linearity Testing [Blum, Luby, Rubinfeld]
Def 1: Let F be a finite field. A function f: F^n → F is called linear (multi-linear) if
there exist constants a1,…,an ∈ F s.t. for every x = (x1,…,xn) ∈ F^n it holds that
f(x) = Σ_i a_i·x_i.
Def 2: A function f is said to be ε-far from linear if for every linear function g,
dist(f,g) > ε, where dist(f,g) = Pr[f(x) ≠ g(x)] (x selected uniformly in F^n).
Fact: A function f: F^n → F is linear iff for every x,y ∈ F^n it holds that
f(x) + f(y) = f(x+y).
10
Linearity Testing Cont
Linearity Test: 1) Uniformly and independently select Θ(1/ε) pairs of elements
x,y ∈ F^n. 2) For every pair x,y selected, verify that f(x) + f(y) = f(x+y).
3) If linearity is violated for any of the selected pairs (i.e., f(x) + f(y) ≠ f(x+y)),
then REJECT, otherwise ACCEPT.
Query complexity: Θ(1/ε), i.e., independent of n. In contrast to learning, where
Ω(n) queries/examples are needed.
Theorem: If f is linear then the test accepts w.p. 1, and if f is ε-far from linear
then with probability at least 2/3 the test rejects it.
Lemma: If f is accepted with probability greater than 1/3, then f is ε-close to linear.
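
A minimal sketch of the test, specialized to F = GF(2) so that addition is coordinatewise XOR (the constant in the Θ(1/ε) repetition count is an assumption):

```python
import random

def blr_linearity_test(f, n, eps, const=2):
    # Repeat Theta(1/eps) times: pick x, y uniformly in {0,1}^n and
    # check that f(x) + f(y) = f(x + y), where "+" is XOR over GF(2).
    for _ in range(max(1, int(const / eps))):
        x = tuple(random.randint(0, 1) for _ in range(n))
        y = tuple(random.randint(0, 1) for _ in range(n))
        x_plus_y = tuple(a ^ b for a, b in zip(x, y))
        if (f(x) ^ f(y)) != f(x_plus_y):
            return False  # REJECT: found a violating pair
    return True           # ACCEPT: no violation observed
```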
11
Linearity Testing Cont
Lemma: If f is accepted with probability greater than 1/3, then f is ε-close to linear.
Suppose f is accepted w.p. > 1/3 ⇒ small (< ε/2) fraction of violating pairs
(f(x) + f(y) ≠ f(x+y)).
Define a self-corrected version of f, denoted g:
For each x,y let V_y(x) = f(x+y) - f(y) (the vote of y on x); g(x) = Plurality_y(V_y(x)).
Can show that (conditioned on a < ε/2 fraction of violating pairs):
(1) g is linear. (2) dist(f,g) ≤ ε.
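
The self-corrected function g can be sketched the same way over GF(2) (the number of votes taken is an assumption of this sketch):

```python
import random
from collections import Counter

def self_correct(f, x, n, votes=50):
    # g(x) = Plurality over random y of V_y(x) = f(x + y) - f(y);
    # over GF(2) both "+" and "-" are XOR.
    tally = Counter()
    for _ in range(votes):
        y = tuple(random.randint(0, 1) for _ in range(n))
        x_plus_y = tuple(a ^ b for a, b in zip(x, y))
        tally[f(x_plus_y) ^ f(y)] += 1
    return tally.most_common(1)[0][0]
```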
12
Testing Basic Properties of Functions [Parnas, R, Samorodnitsky]
Considers the following function classes:
  • Singletons
  • Monomials
  • DNF

13
Testing Basic Properties of Functions Cont
  • Can test whether f is a singleton using O(1/ε)
    queries.
  • Can test whether f is a monomial using O(1/ε)
    queries.
  • Can test whether f is a monotone DNF with at
    most t terms using Õ(t²/ε) queries.

Common theme: no dependence of the query complexity
on the size of the input, n, and a linear dependence
on the distance parameter ε (as opposed to learning
these classes, where there is a (logarithmic)
dependence on n).
The recent result of [Diakonikolas, Lee, Matulef,
Onak, Rubinfeld, Servedio, Wan] greatly extends
the above.
14
Testing (Monotone) Singletons
Natural test: check, by sampling, that the
conditions hold (approximately).
Can analyze the natural test for the case that the
distance between the function and the class of
singletons is not too big (bounded away from 1/2).
15
Testing Singletons II - Parity Testing
16
Testing Singletons III - Self Correcting
This almost works: If f is a singleton - always
accepted. If f is ε-far from parity - rejected
w.h.p. But if f is ε-close to a parity function
g, then we cannot simply apply the claim to argue
that there are many violating pairs w.r.t. f.
If we could only test violations w.r.t. g
instead of f ...
Use the Self-Corrector of [BLR] to fix f into a
parity function (g), and then test violations on
the self-corrected version.
17
Testing Singletons IV - The Algorithm
Final Algorithm for Testing Singletons:
(1) Test whether f is a parity function with distance parameter ε using the
algorithm of [BLR].
(2) Uniformly select a constant number of pairs x,y. Verify that
Self-Cor(f,x) ∧ Self-Cor(f,y) = Self-Cor(f,x∧y).
(3) Verify that Self-Cor(f, 1^n) = 1 (where 1^n is the all-1 vector).
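
A sketch of the three steps, reusing blr_linearity_test and self_correct from the earlier sketches (the number of pairs and the reading of steps (2) and (3) follow the reconstruction above and are assumptions):

```python
import random

def test_monotone_singleton(f, n, eps, num_pairs=20):
    # (1) Test that f is (close to) a parity function.
    if not blr_linearity_test(f, n, eps):
        return False
    # (2) Check AND-violations with respect to the self-corrected version of f.
    for _ in range(num_pairs):
        x = tuple(random.randint(0, 1) for _ in range(n))
        y = tuple(random.randint(0, 1) for _ in range(n))
        x_and_y = tuple(a & b for a, b in zip(x, y))
        corrected_and = self_correct(f, x, n) & self_correct(f, y, n)
        if corrected_and != self_correct(f, x_and_y, n):
            return False
    # (3) The self-corrected function should evaluate to 1 on the all-ones vector.
    return self_correct(f, (1,) * n, n) == 1
```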
18
Testing Monomials and Monotone DNF
Monomial testing algorithm has a similar structure to the Singleton testing
algorithm. (Here too it suffices to find a test for monotone monomials.)
- The first stage of linearity testing is replaced by Affinity Testing: if f is a
monomial then F_1 = {x : f(x)=1} is an affine subspace.
Fact: H is an affine subspace iff ∀x,y,z ∈ H, x⊕y⊕z ∈ H. The affinity test is
similar to the parity test: select x,y ∈ F_1 and z ∈ {0,1}^n, and verify that
f(x⊕y⊕z) = f(x)⊕f(y)⊕f(z).
- The second stage is also analogous to the singleton test (check for violating
pairs). Here affinity adds structure that helps analyze the second stage.
Testing monotone DNF: use the monomial test as a sub-routine.
The result of [DLMORSW07], which extends to other families (e.g., non-monotone
DNF), uses different techniques.
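
A hedged sketch of the affinity check as reconstructed above; obtaining points of F_1 by rejection sampling and the trial counts are assumptions of this sketch (the actual [PRS] test is more refined):

```python
import random

def affinity_test(f, n, trials=50, max_tries=1000):
    # If f is a (monotone) monomial then F_1 = {x : f(x)=1} is an affine subspace,
    # i.e., closed under x XOR y XOR z for x, y, z in F_1.
    def sample_from_F1():
        for _ in range(max_tries):
            x = tuple(random.randint(0, 1) for _ in range(n))
            if f(x) == 1:
                return x
        return None  # f looks (almost) identically 0

    for _ in range(trials):
        x, y = sample_from_F1(), sample_from_F1()
        if x is None or y is None:
            return True  # found no points of F_1 to test
        z = tuple(random.randint(0, 1) for _ in range(n))
        xyz = tuple(a ^ b ^ c for a, b, c in zip(x, y, z))
        # With x, y in F_1 the check below reduces to f(x XOR y XOR z) == f(z).
        if f(xyz) != (f(x) ^ f(y) ^ f(z)):
            return False
    return True
```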
19
Testing of Clustering [Alon, Dar, Parnas, R]
Notation:
X - set of points, |X| = n
dist(x,y) - distance between points x and y.
Assume that the triangle inequality holds
(dist(x,y) ≤ dist(x,z) + dist(z,y)).
For any subset S of X:
Diameter of S: d(S) = max_{x,y∈S} dist(x,y)
20
Clustering Cont
X is (k,b)-clusterable if there exists a k-way
partition (clustering) of X s.t. each cluster
has diameter at most b.
X is ε-far from being (k,b')-clusterable (b' ≥ b)
if there is no k-way partition of any Y ⊆ X,
|Y| ≥ (1-ε)n, s.t. each cluster has diameter
at most b'.
(In particular, will look at b' = (1+β)·b, β ≤ 1.)
In the first case the algorithm should accept and
in the second reject with probability at least 2/3.
21
Clustering Cont
Testing Algorithm (input: k, b, ε, β):
(1) Take a sample of m = m(k, ε, β) points from X.
(2) If the sample is (k,b)-clusterable then accept, o.w. reject.
If X is (k,b)-clusterable then always accept.
Suppose X is ε-far from being (k,(1+β)·b)-clusterable. Show that the test rejects
w.p. at least 2/3.
Will prove for a general metric and β = 1, where m = O(k/ε).
Note: for a general metric cannot go below β = 1 unless we allow m = Ω(|X|^(1/2)),
but can do so for Euclidean distance in d dimensions (in that case must have a
dependence on (1/β)^(d/2)).
Other sublinear clustering work (e.g., for k-median) includes [Indyk],
[Mishra, Oblinger, Pitt], [Ben-David], [Meyerson, O'Callaghan, Plotkin].
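
A minimal sketch of the testing algorithm (dist is any metric on the points; the brute-force clusterability check and the concrete sample-size constant are assumptions, the slide only states m = O(k/ε) for a general metric with β = 1):

```python
import itertools
import random

def diameter(points, dist):
    return max((dist(p, q) for p, q in itertools.combinations(points, 2)), default=0.0)

def is_k_b_clusterable(points, k, b, dist):
    # Brute-force search over all k-way assignments of the points; exponential in
    # the number of points, so intended only for the small sample.
    for assignment in itertools.product(range(k), repeat=len(points)):
        clusters = [[p for p, c in zip(points, assignment) if c == i] for i in range(k)]
        if all(diameter(cl, dist) <= b for cl in clusters):
            return True
    return False

def cluster_test(X, k, b, eps, dist, m=None):
    # (1) Take a sample of m = m(k, eps, beta) points from X.
    # (2) Accept iff the sample itself is (k,b)-clusterable.
    m = m if m is not None else max(1, int(4 * k / eps))
    sample = random.sample(list(X), min(m, len(X)))
    return is_k_b_clusterable(sample, k, b, dist)
```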
22
Clustering Cont
Consider the following mental experiment:
- Let the points in the sample be x1, x2, …, xm.
- Construct a (growing) set of cluster
representatives REPS.
- Initially, REPS = {x1}.
- At each step, take the next point that is at
distance > b from every x in REPS, and add it to
REPS.
Claim: If X is ε-far from being (k,2b)-clusterable
then w.h.p. |REPS| > k, causing the algorithm to
reject as required.
Proof Idea: At each step consider the clustering
enforced by REPS: each point in X\REPS is assigned
to the closest x_i in REPS. Additional sample
points test this clustering.
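
The mental experiment translates directly into a short sketch (dist is the same metric as in the earlier sketch):

```python
def greedy_representatives(sample, b, dist):
    # Scan the sampled points in order; a point becomes a new representative
    # whenever it is at distance > b from every representative chosen so far.
    reps = []
    for x in sample:
        if all(dist(x, r) > b for r in reps):
            reps.append(x)
    return reps

# If X is eps-far from being (k, 2b)-clusterable, then w.h.p. the sample yields
# more than k representatives, and the tester rejects.
```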
23
Tolerant Testing of Clustering [Parnas, R, Rubinfeld]

Tolerant Testing: Reject when ε-far but accept when
ε'-close (for ε' < ε).
Tolerant Testing Algorithm (input: k, b, ε, ε', β):
(1) Take a sample of m = m(k, ε, ε', β) points from X.
(2) If the sample is (ε' + (ε - ε')/2)-close to being
(k,b)-clusterable then accept, o.w. reject.
Can analyze using a generalization of a framework
by [Czumaj, Sohler] for (standard) testing that
captures aspects of the Enforce & Test approach.
The sample size has a quadratic dependence on
1/(ε - ε'), and the same dependence on the other
parameters as the (standard) testing algorithm.
24
Distribution-Free Testing (with queries)
First results [GGR]: trivial positive results that
follow from learning ⇒ testing, and simple negative
results for combinatorial properties (e.g.,
bipartiteness).
First non-trivial positive results [Halevi,
Kushilevitz]: (1) Linearity and low-degree
polynomials; in general, whenever we have a
self-corrector. (2) Monotonicity in low dimensions.
(They also had positive results for graph
properties.)
On the other hand, [Halevi, Kushilevitz] showed
hardness of distribution-free testing of
monotonicity in high dimensions (i.e., exponential
in d over {0,1}^d).
Recently, [Glasner, Servedio] showed that for
several classes (monomials, decision lists, linear
threshold functions) one needs Ω((n/log n)^(1/5))
queries for dist-free testing.
25
Conclusions and Open Problems
  • Property testing: a relaxation of learning where
    one should determine (w.h.p.) whether there exists
    a good approximation of the function f in the
    class F, rather than find such an approximation.
  • Can serve as a preliminary step to learning.
  • For quite a few function classes there are testing
    algorithms that are more query-efficient than
    learning algorithms.
  • Some extend to dist-free testing; for some there
    are strong lower bounds.
  • Still much is left to be understood about the
    relation between testing and learning (has the
    flavor of the relation between decision and
    search).

26
Thanks
27
Property Testing and Learning: Basic Comments/Observations
  • Motivation I: Can use testing as a preliminary
    step to learning, that is, for efficiently
    selecting a good hypothesis class.
  • Motivation II: The relation between testing and
    learning has a similarity to the relation between
    decision and search.
  • Testing is no harder than (proper) learning: If
    we have a learning algorithm for function class F
    that outputs a hypothesis in F, then we can use it
    to test the property of belonging to F.

That is, run the learning algorithm with accuracy
parameter set to ε/2, and check that the hypothesis
it outputs is at most 3ε/4-far from f on an
independent sample.
Want testing algorithm to be significantly more
efficient than learning algorithm.
28
Linearity Testing Cont
Lemma: If f is accepted with probability greater than 1/3, then f is ε-close to linear.
Suppose f is accepted w.p. > 1/3 ⇒ small (< ε/2) fraction of violating pairs
(f(x) + f(y) ≠ f(x+y)).
Define a self-corrected version of f, denoted g:
For each x,y let V_y(x) = f(x+y) - f(y) (the vote of y on x); g(x) = Plurality_y(V_y(x)).
Can show that (conditioned on a < ε/2 fraction of violating pairs):
(1) g is linear. (2) dist(f,g) ≤ ε.
Main Technical Lemma (informal): if there are few violating pairs then ∀x we have
that for almost all y, V_y(x) = g(x).
29
Learning Boolean Formulae
  • Can learn singletons and monomials under the
    uniform distribution using O(log n/ε)
    queries/examples (variation on Occam's razor).
  • Can properly learn monotone DNF with t terms and
    r literals using Õ(r·log²n/ε + t·(r + 1/ε))
    queries [Angluin, Bshouty, Jackson, Tamon].

Main difference w.r.t. the testing results: in
testing there is no dependence on n, and a
different algorithmic approach is used.