Parameter Related Domain Knowledge for Learning in Bayesian Networks (Presentation Transcript)

Transcript and Presenter's Notes


1
Parameter Related Domain Knowledge for Learning in Bayesian Networks
Stefan Niculescu, PhD Candidate, Carnegie Mellon University
Joint work with Professor Tom Mitchell and Dr. Bharat Rao
April 2005
2
Domain Knowledge
  • In the real world, data is often too sparse to build an accurate model
  • Domain knowledge can help alleviate this problem
  • Several types of domain knowledge:
  • Relevance of variables (feature selection)
  • Conditional independences among variables
  • Parameter Domain Knowledge

3
Parameter Domain Knowledge
  • A Bayes Net for a real-world domain can have a huge number of parameters
  • There is often not enough data to estimate them accurately
  • Parameter Domain Knowledge constraints
  • reduce the number of parameters to estimate
  • reduce the variance of parameter estimates

4
Outline
  • Motivation
  • Parameter Related Domain Knowledge
  • Experiments
  • Related Work
  • Summary / Future Work

5
Parameters and Counts
Theorem. The Maximum Likelihood estimators are
given by
CPT for variable Xi
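The slide's equation image did not survive the transcript. The standard Maximum Likelihood result for a discrete CPT, which this theorem presumably states (notation assumed here: N_ijk is the number of training cases in which X_i takes value k while its parents take their j-th configuration), is:

```latex
\hat{\theta}_{ijk}
  \;=\; \hat{P}\bigl(X_i = k \mid Pa(X_i) = j\bigr)
  \;=\; \frac{N_{ijk}}{\sum_{k'} N_{ijk'}}
```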
6
Parameter Sharing
Theorem. The Maximum Likelihood estimators are
given by
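The equation image is missing here as well. As a hedged reconstruction in the same notation as the previous slide, for the simplest parameter-sharing constraint (a set S of CPT rows required to have identical parameters) the Maximum Likelihood estimate pools the counts of the tied rows:

```latex
\hat{\theta}_{k}
  \;=\; \frac{\sum_{(i,j) \in S} N_{ijk}}
             {\sum_{(i,j) \in S} \sum_{k'} N_{ijk'}}
```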
7
Incomplete Data, Frequentist
8
Dependent Dirichlet Priors
9
Bayesian Averaging
10
Hierarchical Parameter Sharing
11
Probability Mass Sharing
Domain Knowledge: parameters of a given color have the same sum across all distributions.
...
12
Probability Ratio Sharing
Domain Knowledge: parameters of a given color preserve their relative ratios across all distributions.
...
13
Where are we right now?
14
Outline
  • Motivation
  • Parameter Related Domain Knowledge
  • Experiments
  • Related Work
  • Summary / Future Work

15
Datasets
  • Project World - CALO
  • 6 persons, 200 emails
  • Manually labeled as About / Not About Meetings
  • Data: (Person, Email, Topic)
  • Artificial Datasets
  • Kept most of the characteristics of the real data, but ...
  • ... new emails were generated in which the frequencies
    of certain words were shared across users
  • Purpose
  • Domain Knowledge readily available
  • To be able to study the effect of training set
    size (up to 5000)
  • To be able to compare our estimated distribution
    to the true distribution

16
Approach
  • Can model Email using a Naive Bayes model:
  • without parameter sharing (PSNB)
  • with parameter sharing (SSNB)
  • Also compare with a model that assumes the sender
    is irrelevant (GNB):
  • the frequencies of words within a topic are learned
    from all examples

[Two Naive Bayes network diagrams over the variables Sender, Topic, and Word]
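As a concrete illustration of the difference between the models (a sketch on toy data of my own, not the authors' code): PSNB fits a separate word distribution per (sender, topic), while parameter sharing pools counts across senders, so sparse senders borrow strength. For brevity this sketch shares every word parameter; the SSNB model in the talk shares only the frequencies of certain words.

```python
from collections import Counter

# Toy corpus: (sender, topic, tokenized email). Illustrative data only.
emails = [
    ("alice", "meeting", ["meet", "room", "agenda"]),
    ("alice", "meeting", ["meet", "time"]),
    ("bob",   "meeting", ["meet", "meet", "room"]),
]

def per_sender(emails):
    """No sharing (PSNB-style): one word distribution per (sender, topic)."""
    counts = {}
    for s, t, ws in emails:
        counts.setdefault((s, t), Counter()).update(ws)
    return {k: {w: n / sum(c.values()) for w, n in c.items()}
            for k, c in counts.items()}

def shared(emails):
    """Full sharing: one word distribution per topic, counts pooled
    across senders (every sender uses the same parameters)."""
    counts = {}
    for _, t, ws in emails:
        counts.setdefault(t, Counter()).update(ws)
    return {t: {w: n / sum(c.values()) for w, n in c.items()}
            for t, c in counts.items()}

ps = per_sender(emails)   # alice's "meet" frequency: 2 of her 5 tokens
sh = shared(emails)       # pooled "meet" frequency: 4 of 8 tokens
```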
17
Effect of Training Set Size
  • As expected:
  • SSNB performs better than both other models
  • SSNB and PSNB perform similarly as the training set
    size increases, but SSNB is much better when data
    is sparse

18
Outline
  • Motivation
  • Parameter Related Domain Knowledge
  • Experiments
  • Related Work
  • Summary / Future Work

19
Dirichlet Priors in a Bayes Net
The domain expert specifies an assignment of parameters (Prior Belief) but leaves room for some error (Spread).
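For reference (standard facts about Dirichlet priors, not recovered from the slide): writing the Dirichlet parameters as alpha_k = a*p_k, the expert's assignment p_k is the prior mean and the equivalent sample size a controls the spread; the posterior mean then interpolates between prior belief and data counts N_k:

```latex
P(\theta) \;\propto\; \prod_k \theta_k^{\,a p_k - 1},
\qquad
E[\theta_k \mid D] \;=\; \frac{N_k + a\, p_k}{N + a}
```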
20
HMMs and DBNs
...
21
Module Networks
  • In a Module
  • Same parents
  • Same CPTs

Image from Learning Module Networks by Eran
Segal and Daphne Koller
22
Context Specific Independence
[Network diagram with nodes Burglary, Set, and Alarm]
23
Outline
  • Motivation
  • Parameter Related Domain Knowledge
  • Experiments
  • Related Work
  • Summary / Future Work

24
Summary
  • Parameter Related Domain Knowledge is needed when
    data is scarce
  • Developed methods to estimate parameters
  • For each of four types of Domain Knowledge
    presented
  • From both complete and incomplete Data
  • Markov Models, Module Networks, and Context Specific
    Independence are particular cases of our parameter
    sharing domain knowledge
  • Models using Parameter Sharing performed better
    than two classical Bayes Nets on synthetic data

25
Future Work
  • Automatically find Shared Parameters
  • Study interactions among different types of
    Domain Knowledge
  • Incorporate Domain Knowledge about continuous
    variables
  • Investigate Domain Knowledge in the form of
    inequality constraints

26
Questions?
27
THE END
28
Backup Slides
29
Hierarchical Parameter Sharing
30
Full Data Observability, Frequentist
31
Probability Mass Sharing
  • Want to model P(Word | Language)
  • Two languages: English and Spanish
  • Different sets of words
  • Domain Knowledge:
  • the aggregate probability mass of nouns is the same
    in both languages
  • the same holds for adjectives, verbs, etc.
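A small sketch of the constrained ML estimate this example suggests (the toy counts and word-to-group mapping are my own assumptions): the shared mass of each group is estimated from counts pooled over both languages, and each language then splits that mass among its own words in proportion to its own counts.

```python
from collections import Counter

# Toy counts per language; groups (parts of speech) are the "colors".
counts = {
    "english": Counter({"dog": 6, "house": 4, "red": 2}),
    "spanish": Counter({"perro": 2, "casa": 2, "rojo": 4}),
}
group_of = {"dog": "noun", "house": "noun", "red": "adj",
            "perro": "noun", "casa": "noun", "rojo": "adj"}

def mass_sharing_mle(counts, group_of):
    """Each group's total mass is shared across languages (estimated
    from pooled counts); within a language, that mass is split in
    proportion to the language's own counts."""
    total = sum(sum(c.values()) for c in counts.values())
    group_mass = Counter()                    # pooled counts per group
    for c in counts.values():
        for w, n in c.items():
            group_mass[group_of[w]] += n
    shared = {g: n / total for g, n in group_mass.items()}
    est = {}
    for lang, c in counts.items():
        in_group = Counter()                  # this language's group counts
        for w, n in c.items():
            in_group[group_of[w]] += n
        est[lang] = {w: shared[group_of[w]] * n / in_group[group_of[w]]
                     for w, n in c.items()}
    return est

est = mass_sharing_mle(counts, group_of)
```

Note that each estimated distribution still sums to one, and the noun mass comes out identical in both languages, as the constraint requires.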

32
Probability Mass Sharing
33
Full Data Observability, Frequentist
34
Probability Ratio Sharing
  • Want to model P(Word | Language)
  • Two languages: English and Spanish
  • Different sets of words
  • Domain Knowledge:
  • word groups
  • about computers: computer, mouse, monitor, etc.
  • the relative frequency of "computer" to "mouse" is
    the same in both languages
  • aggregate mass can be different

T1: Computer Words
T2: Business Words
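The ratio-sharing counterpart can be sketched the same way (toy data assumed; for simplicity, corresponding words across the two languages are identified by a shared token): within a group, word ratios are estimated from counts pooled across languages, while each language keeps its own aggregate mass per group.

```python
from collections import Counter

# Toy counts; tokens and the word-to-group mapping are illustrative.
counts = {
    "english": Counter({"computer": 6, "mouse": 2, "profit": 2}),
    "spanish": Counter({"computer": 3, "mouse": 1, "profit": 6}),
}
group_of = {"computer": "T1", "mouse": "T1", "profit": "T2"}

def ratio_sharing_mle(counts, group_of):
    """Within each group, word ratios are shared (pooled counts);
    each language estimates its own group mass from its own data."""
    pooled = {}                               # group -> pooled Counter
    for c in counts.values():
        for w, n in c.items():
            pooled.setdefault(group_of[w], Counter())[w] += n
    est = {}
    for lang, c in counts.items():
        n_lang = sum(c.values())
        g_count = Counter()                   # this language's group counts
        for w, n in c.items():
            g_count[group_of[w]] += n
        est[lang] = {
            w: (g_count[group_of[w]] / n_lang)
               * pooled[group_of[w]][w] / sum(pooled[group_of[w]].values())
            for w in c
        }
    return est

est = ratio_sharing_mle(counts, group_of)
```

In the result, the computer-to-mouse ratio is identical in both languages while the total mass of the T1 group differs, which is exactly the stated constraint.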
35
Probability Ratio Sharing
36
Full Data Observability, Frequentist