Cross-Domain Action-Model Acquisition for Planning via Web Search

Transcript and Presenter's Notes

Title: Cross-Domain Action-Model Acquisition for Planning via Web Search


1
Cross-Domain Action-Model Acquisition for
Planning via Web Search
  • Hankz Hankui Zhuo^a, Qiang Yang^b, Rong Pan^a and
    Lei Li^a
  • ^a Sun Yat-sen University, China
  • ^b Hong Kong University of Science and Technology,
    Hong Kong

2
Motivation
  • There are many domains that share knowledge with
    each other, e.g.,

3
Motivation
  • There are many domains that share knowledge with
    each other, e.g.,
  • walking in the driverlog domain

http://www.superstock.com/stock-photos-images/1778R-4701
4
Motivation
  • There are many domains that share knowledge with
    each other, e.g.,
  • walking in the driverlog domain
  • navigating in the rovers domain

http://www.superstock.com/stock-photos-images/1778R-4701
http://www.pixelparadox.com/mars.htm
5
Motivation
  • There are many domains that share knowledge with
    each other, e.g.,
  • walking in the driverlog domain
  • navigating in the rovers domain
  • moving in the elevator domain
  • etc.

http://www.superstock.com/stock-photos-images/1778R-4701
http://www.venusengineers.com/goods-lift.html
http://www.pixelparadox.com/mars.htm
6
Motivation
  • The actions in these domains all share common
    knowledge about location change; thus,
  • it may be possible to borrow knowledge from
    one another.
  • Specifically (next slide):

http://www.superstock.com/stock-photos-images/1778R-4701
http://www.venusengineers.com/goods-lift.html
http://www.pixelparadox.com/mars.htm
7
Motivation
http://www.superstock.com/stock-photos-images/1778R-4701
http://www.pixelparadox.com/mars.htm
walk(?d - driver ?l1 - loc ?l2 - loc)
  precondition: (and (at ?d ?l1) (path ?l1 ?l2))
  effect: (and (not (at ?d ?l1)) (at ?d ?l2))
8
Motivation
http://www.superstock.com/stock-photos-images/1778R-4701
http://www.pixelparadox.com/mars.htm
navigate(?d - rover ?x - waypoint ?y - waypoint)
  precondition: ??
  effect: ??
guess?
walk(?d - driver ?l1 - loc ?l2 - loc)
  precondition: (and (at ?d ?l1) (path ?l1 ?l2))
  effect: (and (not (at ?d ?l1)) (at ?d ?l2))
9
Motivation
http://www.superstock.com/stock-photos-images/1778R-4701
http://www.pixelparadox.com/mars.htm
walk(?d - driver ?l1 - loc ?l2 - loc)
  precondition: (and (at ?d ?l1) (path ?l1 ?l2))
  effect: (and (not (at ?d ?l1)) (at ?d ?l2))
navigate(?d - rover ?x - waypoint ?y - waypoint)
  precondition: (at ?x ?y) (visible ?y ?z)
  effect: (not (at ?x ?y)) (at ?x ?z)
guess?
11
Motivation
http://www.superstock.com/stock-photos-images/1778R-4701
http://www.pixelparadox.com/mars.htm
walk(?d - driver ?l1 - loc ?l2 - loc)
  precondition: (and (at ?d ?l1) (path ?l1 ?l2))
  effect: (and (not (at ?d ?l1)) (at ?d ?l2))
navigate(?d - rover ?x - waypoint ?y - waypoint)
  precondition: (at ?d ?x) (visible ?x ?y)
  effect: (not (at ?d ?x)) (at ?d ?y)
guess?
12
Motivation
  • In this work, we aim to learn action models
    for a target domain,
  • e.g., learning the model of navigate in rovers,
  • by transferring knowledge from another domain,
    called the source domain,
  • e.g., the knowledge of the model of walk in
    driverlog.

13
Problem Formulation
  • Formally, our learning problem can be described
    as follows.
  • Given as inputs:
  • Action models from a source domain, As
  • A few plan traces from the target domain:
  • <s0, a1, s1, ..., an, sn>,
  • where si is a partial state and ai
    is an action.
  • Action schemas from the target domain, A
  • Predicates from the target domain, P

14
Problem Formulation
  • Formally, our learning problem can be described
    as follows.
  • Given as inputs:
  • Action models from a source domain, As
  • A few plan traces from the target domain:
  • <s0, a1, s1, ..., an, sn>,
  • where si is a partial state and ai
    is an action.
  • Action schemas from the target domain, A
  • Predicates from the target domain, P
  • Output:
  • Action models in the target domain, At
    (a representation sketch follows)
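As a concrete illustration of these inputs and the output, here is a
minimal Python sketch of one possible representation; all class and
field names are our own illustrative choices, not the paper's.

from dataclasses import dataclass, field

@dataclass
class ActionModel:
    name: str                 # e.g., "walk"
    parameters: list          # e.g., ["?d - driver", "?l1 - loc", "?l2 - loc"]
    preconditions: set = field(default_factory=set)  # e.g., {"at ?d ?l1", "path ?l1 ?l2"}
    add_effects: set = field(default_factory=set)    # e.g., {"at ?d ?l2"}
    del_effects: set = field(default_factory=set)    # e.g., {"at ?d ?l1"}

@dataclass
class LearningProblem:
    source_models: list      # A_s: complete action models of the source domain
    plan_traces: list        # each trace: [s0, a1, s1, ..., an, sn]; each si may be partial
    target_schemas: list     # A: action names and parameters only, conditions empty
    target_predicates: list  # P: the predicates of the target domain
    # Output to produce: A_t, the target schemas with preconditions
    # and effects filled in.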

15
Problem Formulation
  • Our assumptions are:
  • the domains are STRIPS domains;
  • people do not write action names randomly,
  • e.g., not using eat to express move!
  • full intermediate states need not be observed in
    plan traces, i.e., intermediate states can be
    partial or empty;
  • action sequences in plan traces are correct;
  • actions in plan traces are totally ordered, i.e.,
    there are no concurrent actions;
  • there is information available on the Web related
    to the actions.

16
Our Algorithm LAWS
Constraints from web searching
17
Our Algorithm LAWS
Constraints from states between actions
18
Our Algorithm LAWS
Constraints imposed on action models
19
Our Algorithm LAWS
Constraints to ensure causal links in traces.
20
Our Algorithm LAWS
Solving the constraints using a weighted MAX-SAT
solver.
21
Web constraints
  • Used to measure the similarity between two
    actions.
  • To do this, we search for the two actions on the
    Web.
  • Specifically, we build predicate-action pairs
    from the target domain as follows:
  • PA_t = { <p, a> | p's parameters are included in
    a's parameters },
  • where
  • p is a predicate, and
  • a is an action schema.

22
Web constraints
  • Similarly, we build predicate-action pairs from
    the source domain:
  • PA_s^pre, PA_s^add and PA_s^del denote the sets of
    precondition-action pairs, add-action pairs and
    del-action pairs, respectively.
  • Note that for PA_s^pre we require p ∈ PRE(a), which
    is different from PA_t. A sketch of building these
    pairs follows.
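A minimal sketch of building these pair sets, reusing the illustrative
ActionModel representation above; it assumes predicates are written
over the schema's own parameter names (e.g., "at ?d ?l1"), which is a
simplification.

def build_target_pairs(predicates, schemas):
    # PA_t: all <p, a> such that p's parameters are included in a's parameters.
    pairs = set()
    for a in schemas:
        a_params = {prm.split(" - ")[0] for prm in a.parameters}  # "?d - driver" -> "?d"
        for p in predicates:
            p_params = set(p.split()[1:])  # "at ?d ?l1" -> {"?d", "?l1"}
            if p_params <= a_params:
                pairs.add((p, a.name))
    return pairs

def build_source_pairs(models):
    # PA_s^pre / PA_s^add / PA_s^del: pairs where p actually occurs in the model.
    pre = {(p, m.name) for m in models for p in m.preconditions}
    add = {(p, m.name) for m in models for p in m.add_effects}
    dele = {(p, m.name) for m in models for p in m.del_effects}
    return pre, add, dele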

23
Web constraints
  • Next, we collect a set of web documents D = {di}
    by searching for the keyword
  • w = <p, a> ∈ PA_t.
  • We represent each page di as a vector yi by
    calculating its tf-idf weights (Jones 1972).
  • As a result, we have a set of real-valued vectors
    Y = {yi}.
  • Likewise, we can easily get a set of vectors
    X = {xi} by searching for the keyword
    w' = <p', a'> ∈ PA_s^pre. (A sketch follows.)
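A sketch of the tf-idf step, assuming the pages for each keyword have
already been downloaded (the web-search call itself is engine-specific
and omitted); scikit-learn's TfidfVectorizer stands in for the tf-idf
computation of Jones (1972).

from sklearn.feature_extraction.text import TfidfVectorizer

def pages_to_vectors(target_pages, source_pages):
    # Fit a single vocabulary over both document sets so that the vectors
    # in Y (target keyword w) and X (source keyword w') are comparable.
    vectorizer = TfidfVectorizer(stop_words="english")
    vecs = vectorizer.fit_transform(target_pages + source_pages)
    Y = vecs[:len(target_pages)].toarray()
    X = vecs[len(target_pages):].toarray()
    return Y, X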

24
Web constraints
  • We define the similarity function between two
    keywords w and w' as follows:
  • similarity(w, w') = MMD^2(F, Y, X),

MMD is the Maximum Mean Discrepancy, given by
(Borgwardt et al. 2006):

  MMD(F, Y, X) = sup_{f in F} [ (1/n) sum_i f(yi) - (1/m) sum_i f(xi) ],

where F is a set of feature mapping functions of a
Gaussian kernel. (A code sketch follows.)
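A sketch of the empirical (biased) MMD^2 estimate with a Gaussian
kernel, following the standard formulation of Borgwardt et al. (2006);
the kernel bandwidth sigma is an assumed free choice.

import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    # k(a, b) = exp(-||a - b||^2 / (2 * sigma^2)) for all pairs of rows.
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def mmd2(Y, X, sigma=1.0):
    # Biased empirical estimate of MMD^2 between samples Y and X.
    return (gaussian_kernel(Y, Y, sigma).mean()
            + gaussian_kernel(X, X, sigma).mean()
            - 2.0 * gaussian_kernel(Y, X, sigma).mean())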
27
Web constraints
  • Finally, we generate weighted web constraints by
    the following steps (a sketch follows the list):
  • For each w = <p, a> ∈ PA_t and each
    w' = <p', a'> ∈ PA_s^pre, we calculate
    similarity(w, w').
  • Generate a constraint
  • p ∈ PRE(a),
  • and associate it with similarity(w, w') as
    its weight.
  • Likewise for ADD(a) and DEL(a).
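A sketch of the constraint-generation loop, reusing the helpers above;
vectors_for is a hypothetical lookup from a keyword to the tf-idf
vectors of its pages, and the constraint encoding is abstract.

def web_constraints(PA_t, PA_s_pre, vectors_for):
    # For each target pair <p, a> and each source pair <p', a'>, emit the
    # soft constraint "p in PRE(a)" weighted by similarity(w, w').
    constraints = []
    for (p, a) in PA_t:
        for (p2, a2) in PA_s_pre:
            weight = mmd2(vectors_for((p, a)), vectors_for((p2, a2)))
            constraints.append((("PRE", p, a), weight))
    # Likewise for PA_s^add -> ("ADD", p, a) and PA_s^del -> ("DEL", p, a).
    return constraints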

28
State constraints (from Yang et al. 2007)
  • Generally, if p frequently appears before a, it
    is probably a precondition of a.
  • The weights of all the state constraints are
    calculated by counting their occurrences in all
    the plan traces (see the counting sketch below).
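A sketch of the occurrence counting that produces the weights, assuming
a trace is a list alternating partial states (sets of literals) and
actions, as in the representation above; lifting grounded literals to
the schema level is simplified away.

from collections import Counter

def precondition_counts(traces):
    # Count how often predicate p is observed just before action a:
    # the more often, the more likely p is a precondition of a.
    counts = Counter()
    for trace in traces:
        states, actions = trace[0::2], trace[1::2]
        for s_before, act in zip(states, actions):
            for p in s_before:
                counts[(p, act.name)] += 1  # act is assumed to expose .name
    # Effects can be weighted the same way from the states after each action.
    return counts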

29
Action constraints (from Yang et al. 2007)
  • Action constraints are imposed to ensure that the
    learned action models are succinct.
  • These constraints are associated with the maximal
    weight of all the state constraints, to ensure
    that they are maximally satisfied.

30
Plan constraints (from Yang et al. 2007)
  • We require that causal links in plan traces are
    not broken. Thus, we build constraints as follows
    (a clause-building sketch follows).
  • For each precondition p of an action aj in a plan
    trace, either p is in the initial state s0, or there
    is some ai prior to aj that adds p and no ak between
    ai and aj that deletes p,
  • where i < k < j.
  • For each literal q in the goal, either q is in the
    initial state s0, or there is some ai that adds q and
    no ak after ai that deletes q.
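A sketch that enumerates the causal-link options for each precondition
occurrence; translating each option into propositional clauses for the
MAX-SAT solver is left abstract.

def causal_link_clauses(s0, actions, preconditions_of):
    # For each precondition p of a_j: either p holds in s0, or some earlier
    # a_i adds p and no a_k with i < k < j deletes p. Each inner list is a
    # disjunction of such options; encoding them as CNF is left out.
    clauses = []
    for j, a_j in enumerate(actions):
        for p in preconditions_of(a_j):
            options = []
            if p in s0:
                options.append(("holds-initially", p))
            for i in range(j):
                options.append(("supported-by", p, i, j))  # a_i adds p, no deleter between
            clauses.append(options)
    return clauses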

31
Plan constraints (from Yang et al. 2007)
  • To ensure that these constraints are maximally
    satisfied, we assign them the maximal weight of
    the state constraints.

32
Solve constraints
  • Before solving all these constraints, we adjust
    the weights of the web constraints, replacing each
    original weight wo with an adjusted weight wo',
  • where wm is the maximal weight among the state
    constraints, and λ ∈ [0, 1).
  • We can easily adjust wo' from 0 to ∞ by varying
    λ from 0 to 1. (A sketch with an assumed formula
    follows.)
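The defining formula for wo' did not survive transcription. The sketch
below therefore assumes one expression that matches the stated behavior
(wo' = 0 at λ = 0, wo' → ∞ as λ → 1, scaled by wm); the paper's exact
formula may differ.

def adjust_web_weight(w_o, w_m, lam):
    # ASSUMED formula, chosen only to reproduce the stated behavior:
    # 0 at lam = 0, unbounded as lam -> 1, scaled by the maximal state
    # constraint weight w_m. The paper's exact expression may differ.
    assert 0.0 <= lam < 1.0
    return (lam / (1.0 - lam)) * w_m * w_o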

33
Solve constraints
  • We solve these weighted constraints by running a
    weighted MAX-SAT solver.
  • The resulting assignment is converted into action
    models (a sketch with an off-the-shelf solver
    follows).
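A sketch using the RC2 MAX-SAT solver from the PySAT library (one
off-the-shelf choice, not necessarily the solver used in the paper);
the clause encoding and variable meanings are purely illustrative.

from pysat.formula import WCNF
from pysat.examples.rc2 import RC2

wcnf = WCNF()
# Illustrative encoding: variable 1 means "(at ?d ?x) in PRE(navigate)",
# variable 2 means "(at ?d ?y) in ADD(navigate)".
wcnf.append([1], weight=7)   # soft: web/state evidence for the precondition
wcnf.append([2], weight=5)   # soft: evidence for the add effect
wcnf.append([-1, 2])         # hard: e.g., a plan (causal-link) constraint

with RC2(wcnf) as solver:
    model = solver.compute()  # e.g., [1, 2]; positive literals become conditions
print(model)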

34
Experimental Result
  • Example result (the callouts below mark a missing
    condition and an extra condition):

walk(?d - rover ?x - waypoint ?y - waypoint)
  precondition: (and (at ?d ?x) (visible ?x ?y))
  effect: (and (not (at ?d ?x)) (at ?d ?y)
               (not (visible ?x ?y)))

Missing condition / extra condition: by comparing to
hand-written action models, we know when there is a
missing or an extra condition.
We calculate the error rate by counting all the
missing and extra conditions, and from that we get
the accuracy (a sketch follows).
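A sketch of the accuracy computation under one plausible normalization
(errors divided by the number of conditions in the hand-written
models), reusing the illustrative ActionModel above; the paper's exact
normalization may differ.

def accuracy(learned, hand_written):
    # Error rate = (missing + extra conditions) / total hand-written conditions.
    errors, total = 0, 0
    for l, h in zip(learned, hand_written):  # models paired by action name
        for got, expected in [(l.preconditions, h.preconditions),
                              (l.add_effects, h.add_effects),
                              (l.del_effects, h.del_effects)]:
            errors += len(expected - got)    # missing conditions
            errors += len(got - expected)    # extra conditions
            total += len(expected)
    return 1.0 - errors / total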
35
Experimental Result
  • We compared LAWS to t-LAMP (Zhuo et al. 2009)
    and ARMS (Yang et al. 2007), where
  • t-LAMP borrows knowledge by building syntax
    mappings
  • ARMS learns without borrowing knowledge.
  • The results are shown below

36
Experimental Result
  • We can see that:
  • LAWS > t-LAMP > ARMS: the accuracies of LAWS are
    higher than those of t-LAMP and ARMS, which
    empirically shows the advantage of LAWS;
  • accuracies increase as the number of plan traces
    increases, which is consistent with our intuition,
    since more information helps learning.

37
Experimental Result
  • We also test the following three cases:
  • Case I (λ = 0): not borrowing knowledge;
  • Case II (λ = 0.5 and wo = 1): the weights of all
    web constraints are the same, i.e., the similarity
    function is not used;
  • Case III (λ = 0.5): using the similarity function.
  • The results are shown below.

38
Experimental Result
  • We can see that:
  • Case III > the other two: this suggests that the
    similarity function really helps improve the
    learning result;
  • Case II > Case I: this suggests that the web
    constraints are helpful.

39
Experimental Result
  • Next, we test different ratios of observed
    states:
  • accuracy generally increases as the ratio
    increases.
  • This is consistent with our intuition, since the
    additional information helps improve the
    learning result.

40
Experimental Result
  • We also test different values of λ:
  • when λ increases from 0 to 0.5, the accuracy
    increases, which shows that as the effect of web
    knowledge grows, the accuracy gets higher;
  • however, when λ is larger than 0.5, the accuracy
    decreases as λ increases. This is because the
    impact of the plan traces is relatively reduced.
    This suggests that knowledge from plan traces is
    also important in learning high-quality action
    models.

41
CPU Time
  • The CPU time is below 1,000 seconds on a
    typical 2 GHz PC with 1 GB of memory.
  • This is quite reasonable for learning. However, it
    does not include the web-searching time, since that
    mainly depends on the specific network quality.

42
Conclusion
  • In this paper, we propose an algorithmic framework
    for borrowing knowledge from another domain via
    web search, and we empirically show that it
    improves the learning quality.
  • Our work can be extended to more complex action
    models, e.g., full PDDL models.
  • It can also be extended to multi-task action-model
    acquisition.

43
Thank You