Title: Active Learning and the Importance of Feedback in Sampling
1 Active Learning and the Importance of Feedback in Sampling
- Rui Castro
- Rebecca Willett and Robert Nowak
2 Motivation: Twenty Questions
Goal: accurately learn a concept, as fast as possible, by strategically focusing on regions of interest
3 Active Sampling in Regression
Learning by asking carefully chosen questions,
constructed using information gleaned from
previous observations
4 Passive Sampling
Sample locations are chosen a priori, before any
observations are made
5 Active Sampling
Sample locations are chosen as a function of
previous observations
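The distinction can be sketched in code. This is only a toy illustration, not the authors' procedure: the active rule below is a made-up one that bisects the largest jump observed so far.

```python
import numpy as np

def passive_samples(n):
    """Passive: the design is fixed before any data are seen."""
    return np.linspace(0, 1, n)

def active_samples(f, n, sigma=0.1, rng=None):
    """Active: each new location depends on all previous observations.
    Toy rule: repeatedly bisect the pair of neighboring samples whose
    observed values differ the most (the apparent jump)."""
    rng = rng or np.random.default_rng()
    xs = list(np.linspace(0, 1, 4))               # small initial design
    ys = [f(x) + sigma * rng.standard_normal() for x in xs]
    for _ in range(n - 4):
        order = np.argsort(xs)
        gaps = np.abs(np.diff(np.array(ys)[order]))
        i = int(np.argmax(gaps))                  # biggest observed jump
        x_new = (xs[order[i]] + xs[order[i + 1]]) / 2
        xs.append(x_new)
        ys.append(f(x_new) + sigma * rng.standard_normal())
    return np.array(xs)
```

On a step function, the active design ends up concentrated around the jump, while the passive design spends most of its budget on flat regions.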
6 Problem Formulation
7 Passive vs. Active
Passive Sampling
Active Sampling
8 Estimation and Sampling Strategies
Goal
The estimator
The sampling strategy
9 Classical Smoothness Spaces
Functions with homogeneous complexity over the
entire domain
- Hölder smooth function class
10 Smooth Functions: Minimax Lower Bound
Theorem (Castro, Willett, Nowak '05)
The performance one can achieve with active learning is the same as that achievable with passive learning!
11 Inhomogeneous Functions
Homogeneous functions: spread-out complexity
Inhomogeneous functions: localized complexity
The relevant features of inhomogeneous functions
are very localized in space, making active
sampling promising
12 Piecewise Constant Functions (d = 2)
13 Passive Learning in the PC Class
Estimation using Recursive Dyadic Partitions (RDPs):
- Distribute sample points uniformly over [0,1]^d
- Recursively divide the domain into hypercubes
- Decorate each partition set with a constant
- Prune the partition, adapting to the data
14 RDP-based Algorithm
Choose an RDP that fits the data well but is not overly complicated.
This estimator can be computed efficiently using a tree-pruning algorithm.
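A minimal one-dimensional sketch of the pruning idea, assuming a squared-error fit with an additive per-leaf complexity penalty; the function name, penalty form, and depth limit are illustrative choices, not details from the talk. The bottom-up merge is the dynamic program that makes tree pruning efficient.

```python
import numpy as np

def rdp_estimate(x, y, depth, penalty):
    """Fit a pruned Recursive Dyadic Partition estimator on [0, 1).

    Returns (cost, cells, values): the penalized squared-error cost,
    the intervals of the pruned partition, and the constant fitted
    on each interval.
    """
    def fit(lo, hi, d):
        mask = (x >= lo) & (x < hi)
        c = y[mask].mean() if mask.any() else 0.0
        # Cost of keeping this cell as a leaf: fit error + complexity.
        leaf_cost = ((y[mask] - c) ** 2).sum() + penalty
        if d == 0:
            return leaf_cost, [(lo, hi)], [c]
        mid = (lo + hi) / 2
        lc, lcells, lvals = fit(lo, mid, d - 1)
        rc, rcells, rvals = fit(mid, hi, d - 1)
        if lc + rc < leaf_cost:        # keep the split only if it pays
            return lc + rc, lcells + rcells, lvals + rvals
        return leaf_cost, [(lo, hi)], [c]

    return fit(0.0, 1.0, depth)
```

Splits survive only where they reduce the penalized cost, so the pruned partition stays coarse on flat regions and refines only where the data demand it.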
15 Error Bounds
Oracle bounding techniques, akin to the work of Barron '91, can be used to upper bound the performance of our estimator
16 Active Sampling in the PC Class
Key to active sampling: learn the location of the boundary
Use Recursive Dyadic Partitions to find the boundary
17 Active Sampling in the PC Class
Stage 1: oversample at coarse resolution
- n/2 samples uniformly distributed
- Limit the resolution: many more samples than cells
- Biased, but very low variance result (high approximation error, but low estimation error)
- Boundary zone is reliably detected
18 Active Sampling in the PC Class
Stage 2: critically sample in the boundary zone
- n/2 samples uniformly distributed within the boundary zone
- Construct a fine partition around the boundary
- Prune the partition according to standard multiscale methods
- High-resolution estimate of the boundary
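A d = 1 caricature of the two stages. The coarse grid, the repeat counts, and the split-based Stage 2 estimator below are illustrative choices, not the exact construction in the talk.

```python
import numpy as np

def two_stage_step_estimate(f, n, sigma=0.1, rng=None):
    """Two-stage active estimate of a step location in [0, 1].

    Stage 1: n/2 noisy samples on a coarse grid, many repeats per
    cell, so cell averages have very low variance and the jump is
    reliably localized to one coarse zone.  Stage 2: the remaining
    n/2 samples are spent inside that zone alone.
    """
    rng = rng or np.random.default_rng()
    # Stage 1: coarse resolution, many more samples than cells.
    k = max(2, int(np.sqrt(n // 2)))             # number of coarse cells
    reps = (n // 2) // k                         # samples per cell
    centers = (np.arange(k) + 0.5) / k
    means = np.array([(f(c) + sigma * rng.standard_normal(reps)).mean()
                      for c in centers])
    j = int(np.argmax(np.abs(np.diff(means))))   # cell pair with the jump
    lo, hi = j / k, (j + 2) / k                  # detected boundary zone
    # Stage 2: critically sample inside the boundary zone.
    xs = np.linspace(lo, hi, n // 2)
    ys = f(xs) + sigma * rng.standard_normal(xs.size)
    # Estimate the jump by the best single split of the fine data.
    sse = [((ys[:i] - ys[:i].mean()) ** 2).sum() +
           ((ys[i:] - ys[i:].mean()) ** 2).sum()
           for i in range(1, xs.size)]
    return xs[int(np.argmin(sse)) + 1]
```

All of Stage 2's budget lands in an interval of width 2/k, so the effective sampling density near the boundary is roughly k times the passive density.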
19 Main Theorem
Main Theorem (Castro '05)
Cusp-free boundaries cannot behave like the graph of x^(1/2) at the origin, but milder kinks like |x| at 0 are allowed.
20 Sketch of the Proof - Approach
21 Controlling the Bias
Cells intersecting the boundary may be pruned if the boundary is aligned with a cell edge
Solution:
- Repeat Stage 1 d times, using d slightly offset partitions
- Small cells remaining in any of the d+1 partitions are passed on to Stage 2
(Figure: a potential problem area before the shift is not a problem after the shift)
22 Multi-Stage Approach
Iterating the approach yields an L-step method
Compare with the minimax lower bound
23 Learning PC Functions - Summary
Passive Sampling
Active Sampling
These rates are nearly achieved using RDP-based estimators, which are easily implemented and have low computational complexity.
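The rate formulas themselves are missing from this slide. As a hedged reconstruction, from the published Castro-Willett-Nowak results for piecewise constant functions with a (d-1)-dimensional boundary (d ≥ 2), the minimax mean squared error rates being compared are:

```latex
\mathbb{E}\left[\|\hat{f}_n - f\|_{L^2}^2\right] \asymp
\begin{cases}
n^{-1/d} & \text{(passive sampling)} \\[2pt]
n^{-1/(d-1)} & \text{(active sampling)}
\end{cases}
```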
24 Spatial Adaptivity and Active Learning
Spatially adaptive estimators based on sparse
model selection (e.g., wavelet thresholding) may
provide automatic mechanisms for guiding active
learning processes
Instead of choosing where to sample, one can also choose where to compute, to actively reduce computation.
Can active learning provably work in even more realistic situations, under little or no prior assumptions?
27 Piecewise Constant Functions (d = 1)
Consider first the simplest inhomogeneous function class: the step function.
This is a parametric class.
28 Passive Sampling
Distribute sample points uniformly over [0,1] and use a maximum likelihood estimator
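A sketch of this passive baseline, assuming Gaussian noise so that maximum likelihood reduces to least squares over candidate split points; the function name and the cumulative-sum trick are mine, not the talk's.

```python
import numpy as np

def passive_mle_step(x, y):
    """Maximum-likelihood (least-squares, under Gaussian noise)
    estimate of a step function's jump location from passive samples."""
    order = np.argsort(x)
    x, y = x[order], y[order]
    n = y.size
    # Cumulative sums give each candidate split's SSE in O(1).
    csum, csq = np.cumsum(y), np.cumsum(y ** 2)
    best, best_sse = x[0], np.inf
    for i in range(1, n):
        left = csq[i - 1] - csum[i - 1] ** 2 / i
        right = (csq[-1] - csq[i - 1]) - (csum[-1] - csum[i - 1]) ** 2 / (n - i)
        if left + right < best_sse:
            best_sse, best = left + right, x[i]
    return best
```

With uniform passive samples the gap between consecutive points is of order 1/n, which caps how precisely any estimator can pin down the jump.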
29 Active Sampling
30 Learning Rates (d = 1)
Passive Sampling
Active Sampling
(Burnashev & Zigangirov '74)
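The Burnashev-Zigangirov procedure maintains a posterior over the jump location and updates it after every single query. The sketch below is a simplified stand-in, not their algorithm: it batches repeated noisy queries at each midpoint of a plain bisection, which already exhibits the geometric (rather than polynomial) shrinkage of the search interval that active sampling buys in d = 1.

```python
import numpy as np

def noisy_bisection(f, n, sigma=0.1, rng=None):
    """Locate the jump of a 0/1 step function on [0, 1] by bisection,
    spending a batch of noisy queries at each midpoint so that each
    left/right decision is reliable."""
    rng = rng or np.random.default_rng()
    steps = int(np.log2(n)) or 1     # number of bisection steps
    batch = n // steps               # noisy queries per step
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = (lo + hi) / 2
        obs = f(mid) + sigma * rng.standard_normal(batch)
        if obs.mean() > 0.5:         # jump lies to the left of mid
            hi = mid
        else:
            lo = mid
        # The interval halves every step: error ~ 2^(-n/batch).
    return (lo + hi) / 2
```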
31 Sketch of the Proof - Stage 1
The error decomposes into approximation error from the boundary region plus estimation error.
Intuition tells us that this should be the error we experience away from the boundary
32 Sketch of the Proof - Stage 1
Key: limit the resolution of the RDPs (cells of side length 1/k)
This is the performance away from the boundary
33 Sketch of the Proof - Stage 1
Are we finding more than the boundary?
Lemma: at least we are not detecting too many areas outside the boundary.
34 Sketch of the Proof - Stage 2
n/2 more samples distributed uniformly over the boundary zone
Total error contribution from the boundary zone
35 Sketch of the Proof - Overall Error
Error away from the boundary
Error in the boundary region
Balancing the two errors yields
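The resulting rate is missing from the transcript; the following is only a heuristic reconstruction. Assume cells of side 1/k, so the estimation error away from the boundary scales like k^d/n, while Stage 2, spending n/2 samples inside a boundary zone of volume about 1/k, behaves like a rescaled passive problem contributing roughly (1/k) n^{-1/d}:

```latex
\frac{k^d}{n} \;\asymp\; \frac{1}{k}\, n^{-1/d}
\quad\Longrightarrow\quad
k \asymp n^{\frac{d-1}{d(d+1)}},
\qquad
\mathrm{MSE} \asymp n^{-\frac{2}{d+1}} .
```

For d = 2 this gives n^{-2/3}, already better than the passive n^{-1/2}; iterating the two stages (the L-step method of slide 22) pushes the rate toward the active limit n^{-1/(d-1)}.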