Title: Incremental Integration of Probabilistic Models Learned from Data
1. Incremental Integration of Probabilistic Models Learned from Data
- Jian Xu (Louisiana State U)
- Pedrito Maynard-Zhang (Amazon.com)
- Jianhua Chen (Louisiana State U)
2. Outline
- Incremental integration problem
- Existing batch integration approach
- BN incremental integration
- Pros, cons, and experiments
- BN subtraction
- Conclusion and future work
3. Motivating Scenario
- A doctor diagnoses from symptoms, history, and test results, drawing on the doctor's own knowledge of the domain
- Expert i's knowledge arrives at time ti, expert j's at time tj, and expert k's knowledge is en route
- Knowledge is represented as probability models learned from data
- Goal: an integrated model for diagnosis
4. Incremental Integration Problem
- At each time ti, a learning algorithm produces BNi from that source's data
- An integration algorithm combines BN1, ..., BNn into an aggregate BN
- The ideal baseline would run the learning algorithm on all M samples generated from the true BN to obtain the optimal BN, but this is not possible in practice
5. Outline
- Incremental integration problem
- Existing batch integration approach
- BN incremental integration
- Pros, cons, and experiments
- BN subtraction
- Conclusion and future work
6. MC Batch Integration
- Maynard-Reid and Chajewska (2001) show that if sources learn joint distributions using MLE, then LinOP (the linear opinion pool) is the correct integration rule, with each source weighted by the fraction of the data it saw
- For BN integration, they adapt the MDL learning algorithm
- Use LinOP to approximate the needed statistics, since the data itself is unavailable
- Use the estimated fraction of data expert i saw as its weight ai
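The LinOP rule above can be sketched in a few lines. This is a minimal illustration over full joint distributions represented as dicts on a shared outcome space; the representation and all names are illustrative, not from the paper.

```python
# LinOP (linear opinion pool): a weighted mixture of distributions,
# with each source weighted by the fraction of data it saw.

def linop(joints, sample_counts):
    """Pool joint distributions; weights are proportional to sample counts."""
    total = sum(sample_counts)
    return {
        x: sum((m / total) * p[x] for p, m in zip(joints, sample_counts))
        for x in joints[0]
    }

# Two sources over one binary variable, based on 300 and 100 samples.
p1 = {"T": 0.9, "F": 0.1}
p2 = {"T": 0.5, "F": 0.5}
pooled = linop([p1, p2], [300, 100])  # weights 0.75 and 0.25
# pooled["T"] == 0.75 * 0.9 + 0.25 * 0.5 == 0.8
```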
7. MC Batch Integration Algorithm
- Select the BN most likely to have generated the data, using MDL and LinOP
- Search over structures by adding, deleting, and reversing edges
- Score candidates using LinOP-based MDL
- Parameterize the selected structure using LinOP
- Use random restarts to avoid getting stuck in a local maximum
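The search loop above can be sketched as greedy hill climbing with random restarts. This is a toy sketch: the `score` callable stands in for the LinOP-based MDL score (not implemented here), acyclicity checks are omitted, and all names are illustrative.

```python
import itertools
import random

def neighbors(edges, nodes):
    """Candidate structures one edit away: add, delete, or reverse an edge."""
    for u, v in itertools.permutations(nodes, 2):
        if (u, v) in edges:
            yield edges - {(u, v)}                # delete
            yield (edges - {(u, v)}) | {(v, u)}   # reverse
        elif (v, u) not in edges:
            yield edges | {(u, v)}                # add

def hill_climb(nodes, score, restarts=5, seed=0):
    """Greedy ascent from several random start points; returns the best
    edge set found and its score."""
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(restarts):
        current = set()
        if rng.random() < 0.5 and len(nodes) >= 2:  # randomized start
            u, v = rng.sample(nodes, 2)
            current = {(u, v)}
        cur_score = score(current)
        improved = True
        while improved:
            improved = False
            for cand in neighbors(current, nodes):
                s = score(cand)
                if s > cur_score:
                    current, cur_score, improved = cand, s, True
        if cur_score > best_score:
            best, best_score = current, cur_score
    return best, best_score

# Toy score that rewards the edge ("A", "B") and penalizes extra edges.
toy = lambda e: (("A", "B") in e) - 0.1 * len(e)
structure, s = hill_climb(["A", "B", "C"], toy)  # finds {("A", "B")}
```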
8. Outline
- Incremental integration problem
- Existing batch integration approach
- BN incremental integration
- Pros, cons, and experiments
- BN subtraction
- Conclusion and future work
9. Batch-Based Strawman 1
- Algorithm
- Wait for all models to arrive
- Apply batch algorithm
- Drawbacks
- Must store all models
- Can do nothing while waiting for models
- May not be able to tell when all models have arrived
- Models may never stop arriving (e.g., periodic reports)
10. Batch-Based Strawman 2
- Algorithm
- Store each model that arrives
- Apply the batch algorithm to all stored models after each new arrival
- Drawbacks
- Must store all models
- Roughly O(i) time to add the ith model, so O(n²) total for n models
11. Incremental Integration Algorithm
- Integrate the first group of sources to arrive, and treat this intermediate result as an aggregate source BN
- Assign the aggregate source a new weight by setting its sample count to the sum of the sample counts of all sources integrated so far
- When new BNs arrive, integrate them with the current aggregate BN
12. Source Definition
- Tuple ⟨p, M, ρ, ess⟩ where
- p: BN representing the source's beliefs
- M: number of samples the distribution is based on
- ρ, ess: parameters defining the prior over the space of distributions
- ρ: prior over the sample space
- ess: number of "virtual" samples the distribution-space prior is based on
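A minimal container for this tuple might look as follows. The field names are illustrative (the symbol for the sample-space prior is written `rho` here); in a real implementation `p` would be a Bayesian network object.

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class Source:
    """One source's model plus the bookkeeping the integrator needs."""
    p: Any        # BN (or joint distribution) representing the source's beliefs
    M: int        # number of real samples the distribution is based on
    rho: Any      # prior over the sample space
    ess: float    # "virtual" sample count behind the distribution-space prior

s = Source(p={"T": 0.6, "F": 0.4}, M=500, rho=None, ess=10.0)
```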
13. Incremental Integration Algorithm
1. DM ← ⟨pD, 0, ρD, essD⟩
2. loop
   (a) Wait until a new group g of sources Sg = {S1, ..., Skg} arrives, with associated weights and cumulative estimated sample size Mg
   (b) DM ← ⟨pD, MD, ρD, essD⟩ where
       - pD ← the integration of pD and Sg using the batch integration algorithm, with MD/(MD + Mg) as the aggregated source's weight, and
       - MD ← MD + Mg
3. until no new sources arrive
4. return DM
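For full joint distributions the loop above reduces to repeated weighted pooling, with the aggregate weighted MD/(MD + Mg) at each step. The sketch below uses LinOP in place of the BN batch-integration step; the representation and names are illustrative. Note the incremental result matches a batch LinOP over all sources, which is the order-independence property for joints.

```python
# Incrementally fold source groups into an aggregate joint distribution.

def integrate_group(agg_p, agg_M, group):
    """Fold a group of (joint, sample_count) sources into the aggregate,
    giving the aggregate weight M_D / (M_D + M_g) as in step 2(b)."""
    M_g = sum(m for _, m in group)
    total = agg_M + M_g
    pooled = {
        x: (agg_M / total) * agg_p[x] + sum((m / total) * p[x] for p, m in group)
        for x in agg_p
    }
    return pooled, total

sources = [({"T": 0.8, "F": 0.2}, 100),
           ({"T": 0.4, "F": 0.6}, 300),
           ({"T": 0.5, "F": 0.5}, 100)]

# Integrate one source at a time, as in the extreme case the paper tests.
agg_p, agg_M = sources[0]
for src in sources[1:]:
    agg_p, agg_M = integrate_group(agg_p, agg_M, [src])

# Batch check: (100*0.8 + 300*0.4 + 100*0.5) / 500 == 0.5 for outcome "T".
```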
14. Justification
- We show the algorithm is order-independent when applied to joint distributions
- The order-independence property holds only approximately for BNs
- The approximation is due to generalization and the greedy optimization search
15. Outline
- Incremental integration problem
- Existing batch integration approach
- BN incremental integration
- Pros, cons, and experiments
- BN subtraction
- Conclusion and future work
16. Structure of Asia BN
[Diagram: the Asia network over eight nodes: Visit to Asia, Smoking, Tuberculosis, Lung Cancer, Bronchitis, Abnormality in Chest, X-Ray, Dyspnea]
17. Pro: Performance
- Anytime response
- The most up-to-date aggregate model is always available
- Efficient integration
- Typically, fewer sources are involved in each iteration
- Total integration time is O(n) for n sources
- Idle time utilization
- Can take advantage of wait times to do integration, reducing the total wall-clock time for integration
- Space saving
- Space is required only for the current aggregate model and arriving sources
18. Time Comparison
Comparing the total integration time of the incremental and batch integration algorithms as the number of sources increases from 1 to 15, for fixed source sizes of 20–1000
19. Pro: Accuracy
- Incremental integration accuracy is relatively close to batch integration accuracy
- The difference is introduced by local optima in the search space
- The difference generally decreases with larger source sizes
20. Accuracy Comparison
Comparing incremental, batch, and source accuracy
over time when incrementally combining sources of
size 50
21. Con: Bias and Inertia
- Bias is introduced via local optima in the search space
- Inertia: incoming sources with small weights are unable to change the aggregate significantly after a point
- Inertia cuts both ways: bias in the aggregate can be countered and held at bay by accurate sources with relatively large weights
22. Bias and Inertia
Effect of a highly weighted, inaccurate source arriving early (third) among 10 lower-weight, higher-accuracy sources
23. Con: Sensitivity to Order
- Different source orderings can produce markedly different results, even for same-size or same-weight sources
- The accuracy of the sources also matters: less accurate sources can introduce bias, which is then subject to the inertia effect
- Source ordering vs. bias tradeoff
- If bad sources arrive early, the bias they introduce is easier to undo, but also easier to introduce in the first place
- If bad sources arrive late, they are less likely to introduce bias, but any bias they do introduce is more difficult to undo
24. Outline
- Incremental integration problem
- Existing batch integration approach
- BN incremental integration
- Pros, cons, and experiments
- BN subtraction
- Conclusion and future work
25. BN Subtraction
- Scenarios
- Incorporating updates
- De-duplicating shared BNs
- Algorithm: the incremental integration algorithm, but using negative weights for the BNs to remove
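On joint distributions, the negative-weight idea can be illustrated directly: a source folded in with weight M can be "un-mixed" later by folding it in again with weight -M. The sketch below uses LinOP as the integration step; all names are illustrative.

```python
# Subtract a previously integrated source by pooling it back in with a
# negative sample count.

def linop(joints_and_counts):
    """Weighted pool over (joint, sample_count) pairs; counts may be negative."""
    total = sum(m for _, m in joints_and_counts)
    first = joints_and_counts[0][0]
    return {x: sum((m / total) * p[x] for p, m in joints_and_counts)
            for x in first}, total

p1, M1 = {"T": 0.9, "F": 0.1}, 100
p2, M2 = {"T": 0.5, "F": 0.5}, 300

agg, M = linop([(p1, M1), (p2, M2)])          # combined model, M == 400
restored, M_r = linop([(agg, M), (p2, -M2)])  # remove p2 again
# `restored` equals p1 up to floating point, and M_r == M1.
```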
26. Outline
- Incremental integration problem
- Existing batch integration approach
- BN incremental integration
- Pros, cons, and experiments
- BN subtraction
- Conclusion and future work
27. Conclusion
- The incremental algorithm supports anytime querying, utilizes idle time, and saves space
- The result of incremental integration of joint distributions is independent of the source order
- Experiments show that the BN integration result depends on the source order to a degree, mainly due to bias introduced by greedy optimization and maintained by an inertial effect
- The reduction in accuracy of the incremental algorithm may be acceptable
28. Future Work
- Optimally grouping sources to minimize the total integration time (we only explored the extreme of integrating one source at a time)
- Reducing the high computation cost caused by heavy reliance on BN inference
- Seek faster inference algorithms, e.g., approximate inference
- Organize the sources into a hierarchical integration tree, which allows parallel, distributed integration
- Subtraction experiments
- Detecting shared sources
29. Acknowledgment
- Work partially supported by
- NSF grant ITR-0326387
- AFOSR grants FA9550-05-1-0454, F49620-03-1-0238,
F49620-03-1-0239, and F49620-03-1-0241
30. Thank you!