1
Using Sparse Candidate Algorithm for
Constructing Bayesian Network
  • Lei, Seak Fei
  • EECS 800 Protein Informatics

Reading: Learning Bayesian Network Structure from Massive Datasets: The Sparse Candidate Algorithm, by Nir Friedman, Iftach Nachman, and Dana Pe'er
2
Background
  • A Bayesian network for X = {X1, X2, …, Xn}
  • B = <G, Θ>
  • G: a directed acyclic graph
  • Θ: the conditional probability tables (CPTs)
  • Joint probability: P(X1, …, Xn) = ∏i P(Xi | Pa(Xi)) (a sketch of this
    factorization follows below)
  • Learning a Bayesian network:
  • Given a training set D = {x(1), x(2), …, x(N)},
  • find a network B that best matches D
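
A minimal sketch of the joint-probability factorization, with hypothetical binary variables and CPTs (none of these names come from the paper):

    # P(X1, ..., Xn) = product over i of P(Xi | Pa(Xi))
    # parents[X] lists Pa(X) in G; cpt[X] maps parent values to P(X = 1 | pa)
    parents = {"A": [], "B": ["A"], "C": ["A", "B"]}
    cpt = {
        "A": {(): 0.3},
        "B": {(0,): 0.2, (1,): 0.7},
        "C": {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.4, (1, 1): 0.9},
    }

    def joint_prob(assignment):
        """Multiply one CPT entry per variable, given its parents' values."""
        p = 1.0
        for var, pa in parents.items():
            p1 = cpt[var][tuple(assignment[q] for q in pa)]
            p *= p1 if assignment[var] == 1 else 1.0 - p1
        return p

    print(joint_prob({"A": 1, "B": 0, "C": 1}))  # 0.3 * 0.3 * 0.4 = 0.036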

3
Possible Approaches
  • Constraint-satisfaction approach
  • Statistical tests, e.g., the χ²-test
  • Sensitive to failures in the independence tests
  • Optimization approach
  • Scores, e.g., BDe, MDL
  • Decomposability (see the sketch below)
  • Find the structure that maximizes the score
  • Search techniques
  • Generally NP-hard
  • Heuristics: greedy hill-climbing, simulated annealing
  • O(n²) candidate moves at each search step
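
Decomposability is what makes local search moves cheap: the total score is a sum of per-family terms, so a single edge change re-scores only the family it touches. A minimal sketch, where fam_score is a hypothetical local scoring function (e.g. BDe or MDL):

    def total_score(parents, fam_score, data):
        # Score(G : D) = sum over i of FamScore(Xi, Pa(Xi) : D)
        return sum(fam_score(x, tuple(pa), data) for x, pa in parents.items())

    def delta_add_edge(y, x, parents, fam_score, data):
        """Score change from adding Y -> X: only X's family term moves."""
        old = fam_score(x, tuple(parents[x]), data)
        new = fam_score(x, tuple(parents[x]) + (y,), data)
        return new - old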

4
Idea of Sparse Candidate
  • If the numbers of examples and attributes are large, the computational
    cost is too high
  • Most of the candidates considered during the search procedure can be
    eliminated in advance, based on our statistical understanding of the
    domain
  • If X and Y are independent in the data, we don't need to consider Y as
    a parent of X
  • Restrict the possible parents of each variable to a small candidate
    set of size k
  • k << n − 1
  • The search space is greatly reduced → more efficient
  • Measure: mutual information (a sketch follows below)
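
A minimal sketch of mutual-information-based candidate selection (these function names are hypothetical; the paper's Restrict step also considers the measures on slides 8-9):

    import numpy as np

    def mutual_information(x, y):
        """Empirical I(X;Y) between two discrete sample columns."""
        mi = 0.0
        for a in np.unique(x):
            for b in np.unique(y):
                p_xy = np.mean((x == a) & (y == b))
                if p_xy > 0:
                    mi += p_xy * np.log(p_xy / (np.mean(x == a) * np.mean(y == b)))
        return mi

    def restrict_candidates(data, k):
        """Keep, for each Xi, the k variables with the highest I(Xi; Xj)."""
        n = data.shape[1]
        return {i: [j for _, j in sorted(
                    ((mutual_information(data[:, i], data[:, j]), j)
                     for j in range(n) if j != i), reverse=True)[:k]]
                for i in range(n)}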

5
Related work
  • Chow and Liu's algorithm
  • Uses the MI between all pairs of variables to build a maximum-weight
    spanning tree (see the sketch below)
  • Problems:
  • Network ≠ tree
  • Can't deal with complex interactions

[Figure: a four-variable example with nodes A, B, C, D in which I(A;C) > I(A;D) > I(A;B), yet A and D are conditionally independent.]
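
A minimal sketch of Chow and Liu's construction, assuming networkx and the mutual_information helper sketched above:

    import networkx as nx

    def chow_liu_tree(data, mi):
        """Maximum-weight spanning tree over pairwise mutual information."""
        n = data.shape[1]
        g = nx.Graph()
        for i in range(n):
            for j in range(i + 1, n):
                g.add_edge(i, j, weight=mi(data[:, i], data[:, j]))
        # The undirected skeleton; rooting it anywhere gives the tree BN.
        return nx.maximum_spanning_tree(g)
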
6
Related work (contd)
  • Solution (the Sparse Candidate algorithm):
  • Use the network structure found at the previous stage to find better
    candidate parents → iterate
  • How to stop?
  • Two stopping conditions:
  • Score based: Score(B_{n+1}) = Score(B_n)
  • Candidate based: for all i, C_i^{n+1} = C_i^n
  • Include the current parents as candidates
  • Monotonic improvement: Score(B_{n+1} | D) ≥ Score(B_n | D)

7
Outline of the Sparse Candidate algorithm
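The algorithm outline shown on this slide does not survive in the transcript; below is a minimal sketch of the loop as the paper describes it, where restrict, maximize, and score are hypothetical callables standing in for the steps detailed on the following slides:

    def sparse_candidate(data, k, restrict, maximize, score, max_iter=50):
        network = {i: [] for i in range(data.shape[1])}  # empty graph B0
        prev_score, prev_cands = float("-inf"), None
        for _ in range(max_iter):
            # Restrict: pick <= k candidate parents per variable, always
            # keeping the current parents (this guarantees monotonicity).
            cands = restrict(data, network, k)
            # Maximize: search for the best network whose parents are
            # drawn only from the candidate sets.
            network = maximize(data, cands)
            new_score = score(network, data)
            # Stop when the score or the candidate sets stop changing.
            if new_score <= prev_score or cands == prev_cands:
                break
            prev_score, prev_cands = new_score, cands
        return network
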
8
Restrict Step
  • Three possible measures for selecting the candidate parents of Xi
  • Discrepancy (Disc) measure
  • Based on the Kullback-Leibler divergence (a generalization of MI)
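
The formula on this slide is not preserved; one reconstruction consistent with the bullet above is the discrepancy between the empirical joint distribution and the one the current network B predicts, which reduces to the mutual information I(X_i; X_j) when B is the empty network:

    \mathrm{Disc}(X_i, X_j \mid B) = D_{\mathrm{KL}}\big( \hat{P}(X_i, X_j) \,\|\, P_B(X_i, X_j) \big)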

9
Restrict Step (contd)
  • Shield (Shld) measure
  • Based on conditional independence
  • Score measure
  • Penalizes structures with more parameters (more possible parent values)
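
The shield formula is likewise not preserved; a reconstruction consistent with "based on conditional independence" is the conditional mutual information of X_i and X_j given X_i's current parents, which is near zero when Pa(X_i) already shields X_i from X_j:

    \mathrm{Shield}(X_i, X_j \mid B) = I\big( X_i ; X_j \mid \mathrm{Pa}(X_i) \big)

The score measure, as I read the paper, instead rates each X_j by the family score of X_i with X_j added as an extra parent, which naturally penalizes parents that introduce many parameters.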

10
Maximize Step
  • Task: find a network that maximizes the score, subject to
    Pa(Xi) ⊆ Ci for every variable Xi

11
Maximize Step (contd)
  • Standard heuristics
  • Unconstrained
  • Search space (# of possible parent sets): O(C(n, k))
  • Time: O(n²)
  • Constrained by the small candidate sets (a hill-climbing sketch
    follows this list)
  • Search space (# of possible parent sets): O(2^k)
  • Time: O(kn)
  • Divide-and-conquer heuristics
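
A minimal sketch of greedy hill-climbing constrained to the candidate sets (additions only; a full implementation would also consider edge deletions and reversals; fam_score is the hypothetical local score from the earlier sketches):

    def creates_cycle(parents, y, x):
        """Would the edge Y -> X close a directed cycle? True iff X is
        already an ancestor of Y."""
        stack, seen = [y], set()
        while stack:
            v = stack.pop()
            if v == x:
                return True
            if v not in seen:
                seen.add(v)
                stack.extend(parents[v])
        return False

    def constrained_hill_climb(data, cands, fam_score):
        parents = {x: [] for x in cands}
        while True:
            best_delta, best_move = 0.0, None
            for x in cands:
                cur = fam_score(x, tuple(parents[x]), data)
                for y in cands[x]:  # only candidate parents are tried
                    if y in parents[x] or creates_cycle(parents, y, x):
                        continue
                    delta = fam_score(x, tuple(parents[x]) + (y,), data) - cur
                    if delta > best_delta:
                        best_delta, best_move = delta, (y, x)
            if best_move is None:
                return parents
            y, x = best_move
            parents[x].append(y)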

12
Divide and Conquer heuristics
  • Idea: break the problem down into manageable components, then combine
    the component solutions into a global solution

13
Divide and Conquer heuristics (contd)
  • Strongly connected components (SCCs)
  • A subset of vertices A is strongly connected if for each X, Y ∈ A
    there is a directed path from X to Y and a directed path from Y to X
  • Decompose the graph into maximal strongly connected components; the
    resulting component graph contains no cycles (see the sketch below)
  • Separator decomposition
  • Search for a separator of H that splits H into H1 and H2 with no
    edges between them
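
A minimal sketch of the SCC decomposition over the directed graph induced by the candidate sets, assuming networkx (an edge X_j → X_i is drawn whenever X_j ∈ C_i):

    import networkx as nx

    def scc_decomposition(cands):
        h = nx.DiGraph()
        for x, cs in cands.items():
            h.add_node(x)
            for y in cs:
                h.add_edge(y, x)          # y is a candidate parent of x
        comps = list(nx.strongly_connected_components(h))
        # Contracting each SCC yields an acyclic component graph, so each
        # component can be searched largely independently.
        return comps, nx.condensation(h, scc=comps)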

14
Divide and Conquer heuristics (contd)
  • Cluster-tree decomposition
  • Decompose the graph into a cluster tree
  • Similar to a clique tree (each node is a cluster)
  • Use dynamic programming to find the best configuration within the tree

15
Experiments
  • General greedy hill-climbing vs. greedy hill-climbing with the Sparse
    Candidate algorithm
  • Two tests:
  • Synthetic data
  • 10,000 instances sampled from the ALARM network
  • Real-life data
  • 10,000 messages from 20 newsgroups
  • 5,000 instances of gene-expression data

16
Results: Synthetic data
17
Results: Real-life data
18
Conclusion
  • Using sparse candidate sets enables us to search for good structures
    efficiently
  • Suggests a new way to search for a structure that maximizes the score
  • Concerns:
  • Limited samples from biological experiments
  • Ignores combination effects, for example Z = X XOR Y (see the sketch
    below)
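
The XOR concern in one small sketch: with Z = X XOR Y and X, Y independent fair coins, every pairwise mutual information involving Z is approximately zero, so neither true parent would survive a pairwise-MI Restrict step, even though X and Y jointly determine Z:

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.integers(0, 2, 10_000)
    y = rng.integers(0, 2, 10_000)
    z = x ^ y                                   # Z = X XOR Y

    # Pairwise, Z looks independent of X: P(Z = 1 | X = a) is ~0.5 either way.
    print(np.mean(z[x == 0]), np.mean(z[x == 1]))
    # Jointly, X and Y determine Z exactly.
    print(np.all(z == (x ^ y)))                 # True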

19
Reference
  • Nir Friedman, Iftach Nachman, and Dana Pe'er. Learning Bayesian
    Network Structure from Massive Datasets: The Sparse Candidate
    Algorithm. UAI 1999.
  • Lecture slides from Kyu-Baek Hwang (Soongsil University)
  • Lecture slides from Jincheng Gao (KSU)