Cross-Domain Action-Model Acquisition for Planning via Web Search

Transcript and Presenter's Notes

Title: Cross-Domain Action-Model Acquisition for Planning via Web Search


1
Cross-Domain Action-Model Acquisition for
Planning via Web Search
  • Hankz Hankui Zhuo^a, Qiang Yang^b, Rong Pan^a and
    Lei Li^a
  • ^a Sun Yat-sen University, China
  • ^b Hong Kong University of Science and Technology,
    Hong Kong

2
Motivation
  • There are many domains that share knowledge with
    each other, e.g.,

3
Motivation
  • There are many domains that share knowledge with
    each other, e.g.,
  • walking in the driverlog domain

http://www.superstock.com/stock-photos-images/1778R-4701
4
Motivation
  • There are many domains that share knowledge with
    each other, e.g.,
  • walking in the driverlog domain
  • navigating in the rovers domain

http://www.superstock.com/stock-photos-images/1778R-4701
http://www.pixelparadox.com/mars.htm
5
Motivation
  • There are many domains that share knowledge with
    each other, e.g.,
  • walking in the driverlog domain
  • navigating in the rovers domain
  • moving in the elevator domain
  • etc.

http://www.superstock.com/stock-photos-images/1778R-4701
http://www.venusengineers.com/goods-lift.html
http://www.pixelparadox.com/mars.htm
6
Motivation
  • The actions in these domains all share common
    knowledge about location change; thus,
  • it may be possible to borrow knowledge from
    one another.
  • Specifically (next slide):

http://www.superstock.com/stock-photos-images/1778R-4701
http://www.venusengineers.com/goods-lift.html
http://www.pixelparadox.com/mars.htm
7
Motivation
http://www.superstock.com/stock-photos-images/1778R-4701
http://www.pixelparadox.com/mars.htm
walk(?d - driver ?l1 - loc ?l2 - loc)
  precondition: (and (at ?d ?l1) (path ?l1 ?l2))
  effect: (and (not (at ?d ?l1)) (at ?d ?l2))
8
Motivation
http://www.superstock.com/stock-photos-images/1778R-4701
http://www.pixelparadox.com/mars.htm
navigate(?d - rover ?x - waypoint ?y - waypoint)
  precondition: ??
  effect: ??
guess?
walk(?d - driver ?l1 - loc ?l2 - loc)
  precondition: (and (at ?d ?l1) (path ?l1 ?l2))
  effect: (and (not (at ?d ?l1)) (at ?d ?l2))
9
Motivation
http://www.superstock.com/stock-photos-images/1778R-4701
http://www.pixelparadox.com/mars.htm
walk(?d - driver ?l1 - loc ?l2 - loc)
  precondition: (and (at ?d ?l1) (path ?l1 ?l2))
  effect: (and (not (at ?d ?l1)) (at ?d ?l2))
navigate(?d - rover ?x - waypoint ?y - waypoint)
  precondition: (at ?x ?y) (visible ?y ?z)
  effect: (not (at ?x ?y)) (at ?x ?z)
guess?
11
Motivation
http://www.superstock.com/stock-photos-images/1778R-4701
http://www.pixelparadox.com/mars.htm
walk(?d - driver ?l1 - loc ?l2 - loc)
  precondition: (and (at ?d ?l1) (path ?l1 ?l2))
  effect: (and (not (at ?d ?l1)) (at ?d ?l2))
navigate(?d - rover ?x - waypoint ?y - waypoint)
  precondition: (at ?d ?x) (visible ?x ?y)
  effect: (not (at ?d ?x)) (at ?d ?y)
guess?
12
Motivation
  • In this work, we aim to learn action models
    for a target domain,
  • e.g., learning the model of navigate in rovers,
  • by transferring knowledge from another domain,
    called the source domain,
  • e.g., the knowledge of the model of walk in
    driverlog.

13
Problem Formulation
  • Formally, our learning problem can be described
    as follows.
  • Given as inputs:
  • Action models from a source domain, As
  • A few plan traces from the target domain:
  • <s0, a1, s1, ..., an, sn>,
  • where si is a partial state and ai
    is an action.
  • Action schemas from the target domain, A
  • Predicates from the target domain, P

14
Problem Formulation
  • Formally, our learning problem can be described
    as follows.
  • Given as inputs:
  • Action models from a source domain, As
  • A few plan traces from the target domain:
  • <s0, a1, s1, ..., an, sn>,
  • where si is a partial state and ai
    is an action.
  • Action schemas from the target domain, A
  • Predicates from the target domain, P
  • Output:
  • Action models in the target domain, At
    (a representation sketch follows)
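As a concrete illustration of these inputs and the output, here is a
minimal Python sketch of one possible representation; all class and
field names are our own illustrative choices, not the paper's.

from dataclasses import dataclass, field

@dataclass
class ActionModel:
    name: str                 # e.g., "walk"
    parameters: list          # e.g., ["?d - driver", "?l1 - loc", "?l2 - loc"]
    preconditions: set = field(default_factory=set)  # e.g., {"at ?d ?l1", "path ?l1 ?l2"}
    add_effects: set = field(default_factory=set)    # e.g., {"at ?d ?l2"}
    del_effects: set = field(default_factory=set)    # e.g., {"at ?d ?l1"}

@dataclass
class LearningProblem:
    source_models: list      # A_s: complete action models of the source domain
    plan_traces: list        # each trace: [s0, a1, s1, ..., an, sn]; each si may be partial
    target_schemas: list     # A: action names and parameters only, conditions empty
    target_predicates: list  # P: the predicates of the target domain
    # Output to produce: A_t, the target schemas with preconditions
    # and effects filled in.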

15
Problem Formulation
  • Our assumptions are:
  • the domains are STRIPS domains;
  • people do not write action names randomly,
  • e.g., not using eat to express move!
  • full intermediate states need not be observed in
    plan traces, i.e., intermediate states can be
    partial or empty;
  • action sequences in plan traces are correct;
  • actions in plan traces are totally ordered, i.e.,
    there are no concurrent actions;
  • there is information available on the Web related
    to the actions.

16
Our Algorithm LAWS
Constraints from web searching
17
Our Algorithm LAWS
Constraints from states between actions
18
Our Algorithm LAWS
Constraints imposed on action models
19
Our Algorithm LAWS
Constraints to ensure causal links in traces.
20
Our Algorithm LAWS
Solving the constraints using a weighted MAX-SAT
solver.
21
Web constraints
  • Used to measure the similarity between two
    actions.
  • To do this, we search for the two actions on the
    Web.
  • Specifically, we build predicate-action pairs
    from the target domain as follows:
  • PA_t = { <p, a> | p's parameters are included in
    a's parameters },
  • where
  • p is a predicate, and
  • a is an action schema.

22
Web constraints
  • Similarly, we build predicate-action pairs from
    the source domain:
  • PA_s^pre, PA_s^add and PA_s^del denote the sets of
    precondition-action pairs, add-action pairs and
    del-action pairs, respectively.
  • Note that for PA_s^pre we require p ∈ PRE(a), which
    is different from PA_t. A sketch of building these
    pairs follows.
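A minimal sketch of building these pair sets, reusing the illustrative
ActionModel representation above; it assumes predicates are written
over the schema's own parameter names (e.g., "at ?d ?l1"), which is a
simplification.

def build_target_pairs(predicates, schemas):
    # PA_t: all <p, a> such that p's parameters are included in a's parameters.
    pairs = set()
    for a in schemas:
        a_params = {prm.split(" - ")[0] for prm in a.parameters}  # "?d - driver" -> "?d"
        for p in predicates:
            p_params = set(p.split()[1:])  # "at ?d ?l1" -> {"?d", "?l1"}
            if p_params <= a_params:
                pairs.add((p, a.name))
    return pairs

def build_source_pairs(models):
    # PA_s^pre / PA_s^add / PA_s^del: pairs where p actually occurs in the model.
    pre = {(p, m.name) for m in models for p in m.preconditions}
    add = {(p, m.name) for m in models for p in m.add_effects}
    dele = {(p, m.name) for m in models for p in m.del_effects}
    return pre, add, dele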

23
Web constraints
  • Next, we collect a set of web documents D = {di}
    by searching for the keyword
  • w = <p, a> ∈ PA_t.
  • We represent each page di as a vector yi by
    calculating its tf-idf weights (Jones 1972).
  • As a result, we have a set of real-valued vectors
    Y = {yi}.
  • Likewise, we can easily get a set of vectors
    X = {xi} by searching for the keyword
    w' = <p', a'> ∈ PA_s^pre. (A sketch follows.)
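A sketch of the tf-idf step, assuming the pages for each keyword have
already been downloaded (the web-search call itself is engine-specific
and omitted); scikit-learn's TfidfVectorizer stands in for the tf-idf
computation of Jones (1972).

from sklearn.feature_extraction.text import TfidfVectorizer

def pages_to_vectors(target_pages, source_pages):
    # Fit a single vocabulary over both document sets so that the vectors
    # in Y (target keyword w) and X (source keyword w') are comparable.
    vectorizer = TfidfVectorizer(stop_words="english")
    vecs = vectorizer.fit_transform(target_pages + source_pages)
    Y = vecs[:len(target_pages)].toarray()
    X = vecs[len(target_pages):].toarray()
    return Y, X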

24
Web constraints
  • We define the similarity function between two
    keywords w and w' as follows:
  • similarity(w, w') = MMD^2(F, Y, X),

MMD is the Maximum Mean Discrepancy, given by
(Borgwardt et al. 2006):

  MMD(F, Y, X) = sup_{f in F} [ (1/n) sum_i f(yi) - (1/m) sum_i f(xi) ],

where F is a set of feature mapping functions of a
Gaussian kernel. (A code sketch follows.)
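A sketch of the empirical (biased) MMD^2 estimate with a Gaussian
kernel, following the standard formulation of Borgwardt et al. (2006);
the kernel bandwidth sigma is an assumed free choice.

import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    # k(a, b) = exp(-||a - b||^2 / (2 * sigma^2)) for all pairs of rows.
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def mmd2(Y, X, sigma=1.0):
    # Biased empirical estimate of MMD^2 between samples Y and X.
    return (gaussian_kernel(Y, Y, sigma).mean()
            + gaussian_kernel(X, X, sigma).mean()
            - 2.0 * gaussian_kernel(Y, X, sigma).mean())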
27
Web constraints
  • Finally, we generate weighted web constraints by
    the following steps (a sketch follows the list):
  • For each w = <p, a> ∈ PA_t and each
    w' = <p', a'> ∈ PA_s^pre, we calculate
    similarity(w, w').
  • Generate a constraint
  • p ∈ PRE(a),
  • and associate it with similarity(w, w') as
    its weight.
  • Likewise for ADD(a) and DEL(a).
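A sketch of the constraint-generation loop, reusing the helpers above;
vectors_for is a hypothetical lookup from a keyword to the tf-idf
vectors of its pages, and the constraint encoding is abstract.

def web_constraints(PA_t, PA_s_pre, vectors_for):
    # For each target pair <p, a> and each source pair <p', a'>, emit the
    # soft constraint "p in PRE(a)" weighted by similarity(w, w').
    constraints = []
    for (p, a) in PA_t:
        for (p2, a2) in PA_s_pre:
            weight = mmd2(vectors_for((p, a)), vectors_for((p2, a2)))
            constraints.append((("PRE", p, a), weight))
    # Likewise for PA_s^add -> ("ADD", p, a) and PA_s^del -> ("DEL", p, a).
    return constraints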

28
State constraints (from Yang et al. 2007)
  • Generally, if p frequently appears before a, it
    is probably a precondition of a.
  • The weights of all the state constraints are
    calculated by counting their occurrences in all
    the plan traces (see the counting sketch below).
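A sketch of the occurrence counting that produces the weights, assuming
a trace is a list alternating partial states (sets of literals) and
actions, as in the representation above; lifting grounded literals to
the schema level is simplified away.

from collections import Counter

def precondition_counts(traces):
    # Count how often predicate p is observed just before action a:
    # the more often, the more likely p is a precondition of a.
    counts = Counter()
    for trace in traces:
        states, actions = trace[0::2], trace[1::2]
        for s_before, act in zip(states, actions):
            for p in s_before:
                counts[(p, act.name)] += 1  # act is assumed to expose .name
    # Effects can be weighted the same way from the states after each action.
    return counts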

29
Action constraints (from Yang et al. 2007)
  • Action constraints are imposed to ensure that the
    learned action models are succinct.
  • These constraints are associated with the maximal
    weight of all the state constraints, to ensure
    that they are maximally satisfied.

30
Plan constraints (from Yang et al. 2007)
  • We require that causal links in plan traces are
    not broken. Thus, we build constraints as follows
    (a clause-building sketch follows).
  • For each precondition p of an action aj in a plan
    trace, either p is in the initial state s0, or there
    is some ai prior to aj that adds p and no ak between
    ai and aj that deletes p,
  • where i < k < j.
  • For each literal q in the goal, either q is in the
    initial state s0, or there is some ai that adds q and
    no ak after ai that deletes q.
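A sketch that enumerates the causal-link options for each precondition
occurrence; translating each option into propositional clauses for the
MAX-SAT solver is left abstract.

def causal_link_clauses(s0, actions, preconditions_of):
    # For each precondition p of a_j: either p holds in s0, or some earlier
    # a_i adds p and no a_k with i < k < j deletes p. Each inner list is a
    # disjunction of such options; encoding them as CNF is left out.
    clauses = []
    for j, a_j in enumerate(actions):
        for p in preconditions_of(a_j):
            options = []
            if p in s0:
                options.append(("holds-initially", p))
            for i in range(j):
                options.append(("supported-by", p, i, j))  # a_i adds p, no deleter between
            clauses.append(options)
    return clauses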

31
Plan constraints (from Yang et al. 2007)
  • To ensure that these constraints are maximally
    satisfied, we assign them the maximal weight of
    the state constraints.

32
Solve constraints
  • Before solving all these constraints, we adjust
    the weights of the web constraints, replacing each
    original weight wo with an adjusted weight wo',
  • where wm is the maximal weight among the state
    constraints, and λ ∈ [0, 1).
  • We can easily adjust wo' from 0 to ∞ by varying
    λ from 0 to 1. (A sketch with an assumed formula
    follows.)
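The defining formula for wo' did not survive transcription. The sketch
below therefore assumes one expression that matches the stated behavior
(wo' = 0 at λ = 0, wo' → ∞ as λ → 1, scaled by wm); the paper's exact
formula may differ.

def adjust_web_weight(w_o, w_m, lam):
    # ASSUMED formula, chosen only to reproduce the stated behavior:
    # 0 at lam = 0, unbounded as lam -> 1, scaled by the maximal state
    # constraint weight w_m. The paper's exact expression may differ.
    assert 0.0 <= lam < 1.0
    return (lam / (1.0 - lam)) * w_m * w_o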

33
Solve constraints
  • We solve these weighted constraints by running a
    weighted MAX-SAT solver.
  • The resulting assignment is converted into action
    models (a sketch with an off-the-shelf solver
    follows).
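A sketch using the RC2 MAX-SAT solver from the PySAT library (one
off-the-shelf choice, not necessarily the solver used in the paper);
the clause encoding and variable meanings are purely illustrative.

from pysat.formula import WCNF
from pysat.examples.rc2 import RC2

wcnf = WCNF()
# Illustrative encoding: variable 1 means "(at ?d ?x) in PRE(navigate)",
# variable 2 means "(at ?d ?y) in ADD(navigate)".
wcnf.append([1], weight=7)   # soft: web/state evidence for the precondition
wcnf.append([2], weight=5)   # soft: evidence for the add effect
wcnf.append([-1, 2])         # hard: e.g., a plan (causal-link) constraint

with RC2(wcnf) as solver:
    model = solver.compute()  # e.g., [1, 2]; positive literals become conditions
print(model)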

34
Experimental Result
  • Example result (the callouts below mark a missing
    condition and an extra condition):

walk(?d - rover ?x - waypoint ?y - waypoint)
  precondition: (and (at ?d ?x) (visible ?x ?y))
  effect: (and (not (at ?d ?x)) (at ?d ?y)
               (not (visible ?x ?y)))

Missing condition / extra condition: by comparing to
hand-written action models, we know when there is a
missing or an extra condition.
We calculate the error rate by counting all the
missing and extra conditions, and from that we get
the accuracy (a sketch follows).
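A sketch of the accuracy computation under one plausible normalization
(errors divided by the number of conditions in the hand-written
models), reusing the illustrative ActionModel above; the paper's exact
normalization may differ.

def accuracy(learned, hand_written):
    # Error rate = (missing + extra conditions) / total hand-written conditions.
    errors, total = 0, 0
    for l, h in zip(learned, hand_written):  # models paired by action name
        for got, expected in [(l.preconditions, h.preconditions),
                              (l.add_effects, h.add_effects),
                              (l.del_effects, h.del_effects)]:
            errors += len(expected - got)    # missing conditions
            errors += len(got - expected)    # extra conditions
            total += len(expected)
    return 1.0 - errors / total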
35
Experimental Result
  • We compared LAWS to t-LAMP (Zhuo et al. 2009)
    and ARMS (Yang et al. 2007), where
  • t-LAMP borrows knowledge by building syntax
    mappings
  • ARMS learns without borrowing knowledge.
  • The results are shown below

36
Experimental Result
  • We can see that:
  • LAWS > t-LAMP > ARMS: the accuracies of LAWS are
    higher than those of t-LAMP and ARMS, which
    empirically shows the advantage of LAWS;
  • accuracies increase as the number of plan traces
    increases, which is consistent with our intuition,
    since more information helps learning.

37
Experimental Result
  • We also test the following three cases:
  • Case I (λ = 0): not borrowing knowledge;
  • Case II (λ = 0.5 and wo = 1): the weights of all
    web constraints are the same, i.e., the similarity
    function is not used;
  • Case III (λ = 0.5): using the similarity function.
  • The results are shown below.

38
Experimental Result
  • We can see that:
  • Case III > the other two: this suggests that the
    similarity function really helps improve the
    learning result;
  • Case II > Case I: this suggests that the web
    constraints are helpful.

39
Experimental Result
  • Next, we test different ratios of observed
    states:
  • accuracy generally increases as the ratio
    increases.
  • This is consistent with our intuition, since the
    additional information helps improve the
    learning result.

40
Experimental Result
  • We also test different values of λ:
  • when λ increases from 0 to 0.5, the accuracy
    increases, which shows that as the effect of web
    knowledge grows, the accuracy gets higher;
  • however, when λ is larger than 0.5, the accuracy
    decreases as λ increases. This is because the
    impact of the plan traces is relatively reduced.
    This suggests that knowledge from plan traces is
    also important in learning high-quality action
    models.

41
CPU Time
  • The CPU time is below 1,000 seconds on a
    typical 2 GHz PC with 1 GB of memory.
  • This is quite reasonable for learning. However, it
    does not include the web-searching time, since that
    mainly depends on the specific network quality.

42
Conclusion
  • In this paper, we propose an algorithmic framework
    for borrowing knowledge from another domain via
    web search, and we empirically show that it
    improves the learning quality.
  • Our work can be extended to more complex action
    models, e.g., full PDDL models.
  • It can also be extended to multi-task action-model
    acquisition.

43
Thank You