Markov Logic

1
Markov Logic
• Pedro Domingos
• Dept. of Computer Science & Eng.
• University of Washington

2
Desiderata
• A language for cognitive modeling should:
• Handle uncertainty
• Noise
• Incomplete information
• Ambiguity
• Handle complexity
• Many objects
• Relations among them
• IsA and IsPart hierarchies
• Etc.

3
Solution
• Probability handles uncertainty
• Logic handles complexity
• What is the simplest way to combine the two?

4
Markov Logic
• Assign weights to logical formulas
• Treat formulas as templates for features of
Markov networks

5
Overview
• Representation
• Inference
• Learning
• Applications

6
Propositional Logic
• Atoms: Symbols representing propositions
• Logical connectives: ¬, ∧, ∨, etc.
• Knowledge base: Set of formulas
• World: Truth assignment to all atoms
• Every KB can be converted to CNF
• CNF: Conjunction of clauses
• Clause: Disjunction of literals
• Literal: Atom or its negation
• Entailment: Does KB entail query?

7
First-Order Logic
• Atom: Predicate(variables, constants), e.g. Friends(x, Anna)
• Ground atom: All arguments are constants
• Quantifiers: ∀ (universal), ∃ (existential)
• This tutorial: Finite, Herbrand interpretations

8
Markov Networks
• Undirected graphical models

(Figure: undirected graph over Smoking, Cancer, Cough, Asthma.)
• Potential functions defined over cliques

Smoking  Cancer  Φ(S,C)
False    False   4.5
False    True    4.5
True     False   2.7
True     True    4.5
9
Markov Networks
• Undirected graphical models

(Figure: the same undirected graph over Smoking, Cancer, Cough, Asthma.)
• Log-linear model:

P(x) = (1/Z) exp( Σi wi fi(x) )

wi: weight of feature i; fi(x): feature i
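To make the log-linear form concrete, here is a minimal sketch (not from the slides) that reproduces the potential table above with a single feature f(S,C) = [Smoking ⇒ Cancer]; the weight w = log(4.5/2.7) and the two-variable encoding are illustrative assumptions.

```python
import itertools, math

# A minimal sketch: one binary feature f(S,C) = [Smoking => Cancer],
# with weight w = log(4.5/2.7) chosen so the log-linear model matches
# the potential table above up to a constant factor (illustrative).
w = math.log(4.5 / 2.7)

def feature(s, c):
    # f(S,C) = 1 if the clause "Smoking => Cancer" is true
    return 1.0 if (not s) or c else 0.0

# Partition function Z sums exp(w * f) over all four worlds
Z = sum(math.exp(w * feature(s, c))
        for s, c in itertools.product([False, True], repeat=2))

def prob(s, c):
    return math.exp(w * feature(s, c)) / Z

print(prob(True, False))   # least likely world: smoker without cancer
```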
10
Probabilistic Knowledge Bases
• PKB: Set of formulas and their probabilities
• Consistency: Maximum entropy
• Equivalently: Set of formulas and their weights
• Equivalently: Set of formulas and their potentials (1 if formula true, Φi if formula false)

11
Markov Logic
• A Markov Logic Network (MLN) is a set of pairs
(F, w) where
• F is a formula in first-order logic
• w is a real number
• An MLN defines a Markov network with
• One node for each grounding of each predicate in
the MLN
• One feature for each grounding of each formula F
in the MLN, with the corresponding weight w
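A toy sketch of this definition (hypothetical code, not Alchemy): ground the single weighted formula Smokes(x) ⇒ Cancer(x) over two constants, so each ground atom is a node and each ground formula a feature, and score a world by exp(w · n), where n counts true ground formulas.

```python
import itertools, math

# Sketch of the Markov network an MLN defines: the MLN is
# {(Smokes(x) => Cancer(x), w)} over constants Anna and Bob
# (formula, weight, and constants are invented for illustration).
constants = ["Anna", "Bob"]
w = 1.5

def world_weight(world):
    # world maps ground atoms like ('Smokes', 'Anna') to True/False;
    # n_true counts the true groundings of the formula
    n_true = sum(1 for x in constants
                 if (not world[("Smokes", x)]) or world[("Cancer", x)])
    return math.exp(w * n_true)

atoms = [(p, x) for p in ("Smokes", "Cancer") for x in constants]
Z = sum(world_weight(dict(zip(atoms, vals)))
        for vals in itertools.product([False, True], repeat=len(atoms)))
print(Z)  # partition function by brute-force enumeration
```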

12
Relation to Statistical Models
• Special cases
• Markov networks
• Markov random fields
• Bayesian networks
• Log-linear models
• Exponential models
• Max. entropy models
• Gibbs distributions
• Boltzmann machines
• Logistic regression
• Hidden Markov models
• Conditional random fields
• Obtained by making all predicates zero-arity
• Markov logic allows objects to be interdependent
(non-i.i.d.)
• Markov logic facilitates composition

13
Relation to First-Order Logic
• Infinite weights → First-order logic
• Satisfiable KB, positive weights → Satisfying assignments = Modes of distribution
• Markov logic allows contradictions between formulas

14-18
Example
(Figure-only slides.)
19
Overview
• Representation
• Inference
• Learning
• Applications

20
Theorem Proving
• TP(KB, Query)
• KBQ ← KB ∪ {¬Query}
• return ¬SAT(CNF(KBQ))

21
Satisfiability (DPLL)
• SAT(CNF)
• if CNF is empty return True
• if CNF contains empty clause return False
• choose an atom A
• return SAT(CNF(A)) ∨ SAT(CNF(¬A))
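A runnable sketch of this procedure under an assumed encoding: a CNF is a list of clauses, a clause is a set of (atom, polarity) literals, and conditioning on an atom drops satisfied clauses and deletes falsified literals.

```python
# Compact DPLL sketch (encoding is an assumption, not from the slides).
def condition(cnf, atom, value):
    out = []
    for clause in cnf:
        if (atom, value) in clause:            # clause satisfied: drop it
            continue
        out.append(clause - {(atom, not value)})  # literal falsified
    return out

def sat(cnf):
    if not cnf:                                # all clauses satisfied
        return True
    if any(len(c) == 0 for c in cnf):          # empty clause: unsatisfiable
        return False
    atom = next(iter(next(iter(cnf))))[0]      # choose an atom
    return sat(condition(cnf, atom, True)) or sat(condition(cnf, atom, False))

# (A v B) ^ (!A v B) ^ (!B) is unsatisfiable
cnf = [{("A", True), ("B", True)}, {("A", False), ("B", True)}, {("B", False)}]
print(sat(cnf))  # False
```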

22
First-Order Theorem Proving
• Propositionalization
• 1. Form all possible ground atoms
• 2. Apply propositional theorem prover
• Lifted inference: Resolution
• Resolve pairs of clauses until empty clause derived
• Unify literals by substitution, e.g. the substitution {x/Anna} unifies Smokes(x) and Smokes(Anna)
23
Probabilistic Theorem Proving
• Given: Probabilistic knowledge base K, query formula Q
• Output: P(Q | K)

24
Weighted Model Counting
• ModelCount(CNF) = no. of worlds that satisfy CNF
• Assign a weight to each literal
• Weight(world) = ∏ weights(true literals)
• Weighted model counting:
Given: CNF C and literal weights W
Output: Σ Weight(world) over worlds that satisfy C
• PTP is reducible to lifted WMC
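A brute-force sketch of this definition, using the same CNF encoding as the DPLL sketch above (illustrative and exponential in the number of atoms):

```python
import itertools

# Brute-force weighted model counting: weights maps each literal
# (atom, polarity) to a number; a world's weight is the product of
# the weights of its true literals.
def wmc_brute(cnf, weights, atoms):
    total = 0.0
    for vals in itertools.product([False, True], repeat=len(atoms)):
        world = dict(zip(atoms, vals))
        if all(any(world[a] == pol for a, pol in clause) for clause in cnf):
            w = 1.0
            for a, v in world.items():
                w *= weights[(a, v)]
            total += w
    return total

cnf = [{("A", True), ("B", True)}]          # A v B
weights = {("A", True): 2.0, ("A", False): 1.0,
           ("B", True): 3.0, ("B", False): 1.0}
print(wmc_brute(cnf, weights, ["A", "B"]))  # 2*3 + 2*1 + 1*3 = 11.0
```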
25-29
Example
(Figure-only slides.)

30
Inference Problems
(Figure-only slide.)
31
Propositional Case
• All conditional probabilities are ratios of
partition functions
• All partition functions can be computed by
weighted model counting
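A small illustration of both points at once, reusing the toy encoding from the WMC sketch above: the conditional probability P(Q | K) is the weighted count of worlds satisfying K ∧ Q divided by the weighted count of worlds satisfying K.

```python
import itertools

# P(Q | K) as a ratio of two weighted model counts (toy example).
def wmc(cnf, weights, atoms):
    total = 0.0
    for vals in itertools.product([False, True], repeat=len(atoms)):
        world = dict(zip(atoms, vals))
        if all(any(world[a] == p for a, p in c) for c in cnf):
            prod = 1.0
            for a, v in world.items():
                prod *= weights[(a, v)]
            total += prod
    return total

atoms = ["A", "B"]
weights = {("A", True): 2.0, ("A", False): 1.0,
           ("B", True): 3.0, ("B", False): 1.0}
kb = [{("A", True), ("B", True)}]        # K: A v B
query = {("A", True)}                    # Q: A
print(wmc(kb + [query], weights, atoms) / wmc(kb, weights, atoms))
# = (6 + 2) / (6 + 2 + 3) = 8/11
```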

32
Conversion to CNF & Weights
• WCNF(PKB)
• for all (Fi, Φi) ∈ PKB s.t. Φi > 0 do
• PKB ← PKB ∪ {(Fi ⇔ Ai, 0)} \ {(Fi, Φi)}
• CNF ← CNF(PKB)
• for all ¬Ai literals do W¬Ai ← Φi
• for all other literals L do WL ← 1
• return (CNF, weights)

33
Probabilistic Theorem Proving
PTP(PKB, Query)
PKBQ ← PKB ∪ {(Query, 0)}
return WMC(WCNF(PKBQ)) / WMC(WCNF(PKB))
34
Probabilistic Theorem Proving
PTP(PKB, Query)
PKBQ ← PKB ∪ {(Query, 0)}
return WMC(WCNF(PKBQ)) / WMC(WCNF(PKB))

Compare:
TP(KB, Query)
KBQ ← KB ∪ {¬Query}
return ¬SAT(CNF(KBQ))
35
Weighted Model Counting
• WMC(CNF, weights)
• if all clauses in CNF are satisfied
• return ∏A (WA + W¬A)  (over atoms A not yet assigned)
• if CNF has empty unsatisfied clause return 0

Base Case
36
Weighted Model Counting
• WMC(CNF, weights)
• if all clauses in CNF are satisfied
• return ∏A (WA + W¬A)
• if CNF has empty unsatisfied clause return 0
• if CNF can be partitioned into CNFs C1, ..., Ck sharing no atoms
• return ∏i WMC(Ci, weights)

Decomp. Step
37
Weighted Model Counting
• WMC(CNF, weights)
• if all clauses in CNF are satisfied
• return ∏A (WA + W¬A)
• if CNF has empty unsatisfied clause return 0
• if CNF can be partitioned into CNFs C1, ..., Ck sharing no atoms
• return ∏i WMC(Ci, weights)
• choose an atom A
• return WA · WMC(CNF | A) + W¬A · WMC(CNF | ¬A)

Splitting Step
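The full recursion as a runnable sketch: both base cases, a connected-component decomposition, and splitting. Atoms not mentioned in any clause contribute a factor (WA + W¬A) in the base case. The component finding and the atom-choice heuristic are simplistic placeholders of my own, not PTP's actual implementation.

```python
# WMC recursion sketch (CNF = list of sets of (atom, polarity) literals).
def condition(cnf, atom, value):
    return [c - {(atom, not value)} for c in cnf if (atom, value) not in c]

def components(cnf):
    # Partition clauses into groups sharing no atoms (decomposition step)
    comps = []
    for clause in cnf:
        atoms = {a for a, _ in clause}
        merged = [clause]
        for entry in comps[:]:
            comp, catoms = entry
            if catoms & atoms:
                comps.remove(entry)
                merged += comp
                atoms |= catoms
        comps.append((merged, atoms))
    return comps

def wmc(cnf, atoms, W):
    if any(len(c) == 0 for c in cnf):              # empty clause: weight 0
        return 0.0
    if not cnf:                                    # all clauses satisfied
        prod = 1.0
        for a in atoms:
            prod *= W[(a, True)] + W[(a, False)]
        return prod
    comps = components(cnf)
    if len(comps) > 1:                             # decomposition step
        prod, used = 1.0, set()
        for comp, catoms in comps:
            prod *= wmc(comp, catoms, W)
            used |= catoms
        return prod * wmc([], set(atoms) - used, W)
    A = next(iter(next(iter(cnf))))[0]             # splitting step
    rest = set(atoms) - {A}
    return (W[(A, True)]  * wmc(condition(cnf, A, True),  rest, W) +
            W[(A, False)] * wmc(condition(cnf, A, False), rest, W))

W = {("A", True): 2.0, ("A", False): 1.0,
     ("B", True): 3.0, ("B", False): 1.0}
print(wmc([{("A", True), ("B", True)}], {"A", "B"}, W))  # 11.0
```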
38
First-Order Case
• PTP schema remains the same
• Conversion of PKB to hard CNF and weights: New atom in Fi ⇔ Ai is now Predicatei(variables in Fi, constants in Fi)
• New argument in WMC: Set of substitution constraints of the form x = A, x ≠ A, x = y, x ≠ y
• Lift each step of WMC

39
Lifted Weighted Model Counting
• LWMC(CNF, substs, weights)
• if all clauses in CNF are satisfied
• return ∏A (WA + W¬A)^nA, where nA = no. of groundings of A consistent with substs
• if CNF has empty unsatisfied clause return 0

Base Case
40
Lifted Weighted Model Counting
• LWMC(CNF, substs, weights)
• if all clauses in CNF are satisfied
• return ∏A (WA + W¬A)^nA
• if CNF has empty unsatisfied clause return 0
• if there exists a lifted decomposition of CNF
• return ∏i [LWMC(Ci, substs, weights)]^mi, where Ci has mi identical copies

Decomp. Step
41
Lifted Weighted Model Counting
• LWMC(CNF, substs, weights)
• if all clauses in CNF are satisfied
• return ∏A (WA + W¬A)^nA
• if CNF has empty unsatisfied clause return 0
• if there exists a lifted decomposition of CNF
• return ∏i [LWMC(Ci, substs, weights)]^mi
• choose an atom A
• return Σk (nA choose k) · WA^k · W¬A^(nA−k) · LWMC(CNF | A true in k groundings)

Splitting Step
42
Extensions
• Unit propagation, etc.
• Caching / Memoization
• Knowledge-based model construction

43
Approximate Inference
• WMC(CNF, weights)
• if all clauses in CNF are satisfied
• return ∏A (WA + W¬A)
• if CNF has empty unsatisfied clause return 0
• if CNF can be partitioned into CNFs C1, ..., Ck sharing no atoms
• return ∏i WMC(Ci, weights)
• choose an atom A
• with probability p: return WA · WMC(CNF | A) / p
• with probability 1 − p: return W¬A · WMC(CNF | ¬A) / (1 − p), etc.

Splitting Step
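One way to read the sampled splitting step is as importance sampling: follow a single branch with probability p and divide by p, which keeps the estimate unbiased. A sketch, with the proposal p proportional to the literal weights (an assumption on my part; any p strictly between 0 and 1 preserves unbiasedness), and the decomposition step omitted for brevity:

```python
import random

# Approximate WMC: replace the splitting step's sum with one sampled
# branch, divided by its probability -- an unbiased estimator.
def condition(cnf, atom, value):
    return [c - {(atom, not value)} for c in cnf if (atom, value) not in c]

def wmc_sample(cnf, atoms, W):
    if any(len(c) == 0 for c in cnf):
        return 0.0
    if not cnf:
        prod = 1.0
        for a in atoms:
            prod *= W[(a, True)] + W[(a, False)]
        return prod
    A = next(iter(next(iter(cnf))))[0]
    rest = set(atoms) - {A}
    p = W[(A, True)] / (W[(A, True)] + W[(A, False)])   # proposal
    if random.random() < p:
        return W[(A, True)] * wmc_sample(condition(cnf, A, True), rest, W) / p
    return W[(A, False)] * wmc_sample(condition(cnf, A, False), rest, W) / (1 - p)

W = {("A", True): 2.0, ("A", False): 1.0,
     ("B", True): 3.0, ("B", False): 1.0}
est = sum(wmc_sample([{("A", True), ("B", True)}], {"A", "B"}, W)
          for _ in range(10000)) / 10000
print(est)  # close to the exact answer, 11.0
```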
44
MPE Inference
• Replace sums by maxes
• Use branch-and-bound for efficiency
• Do traceback
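A sketch of the max version of the same recursion (my illustration; decomposition step and the traceback that recovers the argmax world are omitted for brevity):

```python
# MPE sketch: the WMC recursion with sums replaced by maxes; tracing
# back the argmax choices would recover the most probable world.
def condition(cnf, atom, value):
    return [c - {(atom, not value)} for c in cnf if (atom, value) not in c]

def mpe(cnf, atoms, W):
    if any(len(c) == 0 for c in cnf):
        return 0.0
    if not cnf:
        prod = 1.0
        for a in atoms:
            prod *= max(W[(a, True)], W[(a, False)])   # max, not sum
        return prod
    A = next(iter(next(iter(cnf))))[0]
    rest = set(atoms) - {A}
    return max(W[(A, True)]  * mpe(condition(cnf, A, True),  rest, W),
               W[(A, False)] * mpe(condition(cnf, A, False), rest, W))

W = {("A", True): 2.0, ("A", False): 1.0,
     ("B", True): 3.0, ("B", False): 1.0}
print(mpe([{("A", True), ("B", True)}], {"A", "B"}, W))  # 6.0 (A, B true)
```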

45
Overview
• Representation
• Inference
• Learning
• Applications

46
Learning
• Data is a relational database
• Closed world assumption (if not EM)
• Learning parameters (weights)
• Generatively
• Discriminatively
• Learning structure (formulas)

47
Generative Weight Learning
• Maximize likelihood
• Use gradient ascent or L-BFGS
• No local maxima
• Requires inference at each step (slow!)

∂/∂wi log Pw(x) = ni(x) − Ew[ni(x)]

ni(x): no. of true groundings of clause i in data
Ew[ni(x)]: expected no. of true groundings according to model
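A toy illustration of the resulting update, w ← w + η (ni(data)/N − Ew[ni]), for a single formula over one binary atom, with the expectation computed exactly by enumeration (the numbers and learning rate are invented; in real MLNs computing Ew[ni] is the expensive inference step the slide warns about):

```python
import math

# Gradient ascent for one weighted formula on one binary atom:
# f(x) = [x is true], true in 7 of 10 training worlds.
n_data, n_worlds = 7, 10
w, lr = 0.0, 0.5
for _ in range(200):
    e_true = math.exp(w)
    expected = e_true / (e_true + 1.0)        # E[f] under current model
    w += lr * (n_data / n_worlds - expected)  # data count minus expected
print(w, math.exp(w) / (math.exp(w) + 1))     # model matches data: ~0.7
```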
48
Pseudo-Likelihood
• Likelihood of each variable given its neighbors in the data [Besag, 1975]
• Does not require inference at each step
• Consistent estimator
• Widely used in vision, spatial statistics, etc.
• But PL parameters may not work well for long
inference chains
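In its standard form, the objective the slide refers to is:

```latex
% Pseudo-likelihood (Besag, 1975): the product of each variable's
% conditional probability given its Markov blanket, evaluated in the data.
\mathrm{PL}_w(x) \;=\; \prod_{l=1}^{n} P_w\!\left(X_l = x_l \,\middle|\, MB_x(X_l)\right)
```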

49
Discriminative Weight Learning
• Maximize conditional likelihood of query (y)
given evidence (x)
• Expected counts can be approximated by counts in
MAP state of y given x

∂/∂wi log Pw(y | x) = ni(x, y) − Ew[ni(x, y)]

ni(x, y): no. of true groundings of clause i in data
Ew[ni(x, y)]: expected no. of true groundings according to model
50
Voted Perceptron
• Originally proposed for training HMMs discriminatively [Collins, 2002]
• Assumes network is linear chain

wi ← 0
for t ← 1 to T do
  yMAP ← Viterbi(x)
  wi ← wi + η [counti(yData) − counti(yMAP)]
return Σt wi / T
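A runnable toy version of the loop above (my illustration: learning rate η = 1, two invented features, and brute-force argmax standing in for Viterbi on a 4-token chain):

```python
import itertools

# Toy voted perceptron: an emission-match feature and a transition-
# equality feature; weights are averaged over iterations.
x = [0, 0, 1, 1]                      # observations
y_data = [0, 0, 1, 1]                 # true labels

def counts(x, y):
    emit = sum(1 for xi, yi in zip(x, y) if xi == yi)
    trans = sum(1 for a, b in zip(y, y[1:]) if a == b)
    return [emit, trans]

def score(w, x, y):
    return sum(wi * ci for wi, ci in zip(w, counts(x, y)))

w, w_sum, T = [0.0, 0.0], [0.0, 0.0], 10
for t in range(T):
    y_map = max(itertools.product([0, 1], repeat=len(x)),
                key=lambda y: score(w, x, list(y)))   # "Viterbi" by brute force
    w = [wi + (cd - cm) for wi, cd, cm in
         zip(w, counts(x, y_data), counts(x, list(y_map)))]
    w_sum = [s + wi for s, wi in zip(w_sum, w)]
print([s / T for s in w_sum])         # averaged ("voted") weights
```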
51
Voted Perceptron for MLNs
• HMMs are special case of MLNs
• Replace Viterbi by prob. theorem proving
• Network can now be arbitrary graph

wi ← 0
for t ← 1 to T do
  yMAP ← PTP(MLN ∪ x, y)
  wi ← wi + η [counti(yData) − counti(yMAP)]
return Σt wi / T
52
Structure Learning
• Generalizes feature induction in Markov nets
• Any inductive logic programming approach can be
used, but . . .
• Goal is to induce any clauses, not just Horn
• Evaluation function should be likelihood
• Requires learning weights for each candidate
• Turns out not to be bottleneck
• Bottleneck is counting clause groundings
• Solution: Subsampling

53
Structure Learning
• Initial state: Unit clauses or hand-coded KB
• Operators: Add/remove literal, flip sign
• Evaluation function: Pseudo-likelihood + structure prior
• Search
• Beam, shortest-first [Kok & Domingos, 2005]
• Bottom-up [Mihalkova & Mooney, 2007]
• Relational pathfinding [Kok & Domingos, 2009, 2010]

54
Alchemy
• Open-source software including
• Full first-order logic syntax
• MAP and marginal/conditional inference
• Generative & discriminative weight learning
• Structure learning
• Programming language features

alchemy.cs.washington.edu
55
Alchemy vs. Prolog vs. BUGS

                 Alchemy                      Prolog            BUGS
Representation   F.O. logic + Markov nets     Horn clauses      Bayes nets
Inference        Probabilistic thm. proving   Theorem proving   Gibbs sampling
Learning         Parameters + structure       No                Params.
Uncertainty      Yes                          No                Yes
Relational       Yes                          Yes               No
56
Overview
• Representation
• Inference
• Learning
• Applications

57
Applications to Date
• Natural language processing
• Information extraction
• Entity resolution
• Collective classification
• Social network analysis
• Robot mapping
• Activity recognition
• Scene analysis
• Computational biology
• Probabilistic Cyc
• Personal assistants
• Etc.

58
Information Extraction
Parag Singla and Pedro Domingos, "Memory-Efficient Inference in Relational Domains" (AAAI-06).

Singla, P., & Domingos, P. (2006). Memory-efficent inference in relatonal domains. In Proceedings of the Twenty-First National Conference on Artificial Intelligence (pp. 500-505). Boston, MA: AAAI Press.

H. Poon & P. Domingos, Sound and Efficient Inference with Probabilistic and Deterministic Dependencies, in Proc. AAAI-06, Boston, MA, 2006.

P. Hoifung (2006). Efficent inference. In Proceedings of the Twenty-First National Conference on Artificial Intelligence.

(The misspellings and inconsistent formats are deliberate: four noisy variants of the same citations.)
59
Segmentation
Author / Title / Venue

(Same four citations as slide 58, with author, title, and venue segments marked.)
60
Entity Resolution
(Same four citations as slide 58.)
61
Entity Resolution
(Same four citations as slide 58.)
62
State of the Art
• Segmentation
• HMM (or CRF) to assign each token to a field
• Entity resolution
• Logistic regression to predict same
field/citation
• Transitive closure
• Alchemy implementation: Seven formulas

63-66
Types and Predicates

token = {Parag, Singla, and, Pedro, ...}
field = {Author, Title, Venue, ...}
citation = {C1, C2, ...}
position = {0, 1, 2, ...}

Token(token, position, citation)
InField(position, field, citation)
SameField(field, citation, citation)
SameCit(citation, citation)

(Builds on these slides mark the extended field list as optional, Token as evidence, and InField, SameField, and SameCit as the query.)
67-73
Formulas

Token(+t,i,c) => InField(i,+f,c)
InField(i,+f,c) <=> InField(i+1,+f,c)
f != f' => (!InField(i,f,c) v !InField(i,f',c))
Token(+t,i,c) ^ InField(i,+f,c) ^ Token(+t,i',c') ^ InField(i',+f,c') => SameField(+f,c,c')
SameField(+f,c,c') <=> SameCit(c,c')
SameField(f,c,c') ^ SameField(f,c',c") => SameField(f,c,c")
SameCit(c,c') ^ SameCit(c',c") => SameCit(c,c")

74
Formulas

Token(+t,i,c) => InField(i,+f,c)
InField(i,+f,c) ^ !Token(".",i,c) <=> InField(i+1,+f,c)
f != f' => (!InField(i,f,c) v !InField(i,f',c))
Token(+t,i,c) ^ InField(i,+f,c) ^ Token(+t,i',c') ^ InField(i',+f,c') => SameField(+f,c,c')
SameField(+f,c,c') <=> SameCit(c,c')
SameField(f,c,c') ^ SameField(f,c',c") => SameField(f,c,c")
SameCit(c,c') ^ SameCit(c',c") => SameCit(c,c")
75
Results: Segmentation on Cora
76
Results: Matching Venues on Cora
77
Summary
• Cognitive modeling requires combination of
logical and statistical techniques
• We need to unify the two
• Markov logic
• Syntax: Weighted logical formulas
• Semantics: Markov network templates
• Inference: Probabilistic theorem proving
• Learning: Statistical inductive logic programming
• Many applications to date

78
Resources
• Open-source software/Web site: Alchemy
• Learning and inference algorithms
• Tutorials, manuals, etc.
• MLNs, datasets, etc.
• Publications
• Book: Domingos & Lowd, Markov Logic, Morgan & Claypool, 2009.

alchemy.cs.washington.edu