Expressive and Efficient Frameworks for Partial Satisfaction Planning


Provided by: hennin3

Learn more at: https://rakaposhi.eas.asu.edu


1
Expressive and Efficient Frameworks for Partial
Satisfaction Planning

Subbarao Kambhampati
Arizona State University
(Proposal submitted for consideration to
Behzad Kamgar-Parsi/ONR)

2
Partial Satisfaction/Over-Subscription Planning

Traditional planning problems:
Find the (lowest cost) plan that satisfies all the given goals
PSP planning:
Find the highest utility plan given the resource constraints
Goals have utilities and actions have costs
..arises naturally in many real-world planning scenarios:
Mars rovers attempting to maximize scientific return, given resource constraints
UAVs attempting to maximize reconnaissance returns, given fuel and other constraints
Logistics problems with resource constraints
..due to a variety of reasons:
Constraints on the agent's resources
Conflicting goals, with complex inter-dependencies between goal utilities
Soft constraints
Limited time

3
Supporting PSP planning

PSP planning changes planning from a satisficing to an optimizing problem
It is trivial to find a plan; it is hard to find a good one!
Rich connections to OR (IP) and MDPs
Requires selecting objectives in addition to actions
Which subset of goals to achieve
To what degree to satisfy individual goals
E.g., collect as much soil sample as possible; get done as close to 2pm as possible
Currently, objective selection is left to humans
This leads to highly suboptimal plans, since objective selection cannot be done independently of planning
We propose to develop scalable methods for synthesizing plans in such over-subscribed scenarios

4
Proposal Overview

Preliminary work
Simple formal model: PSP Net Benefit
MDP-based, IP-based, and heuristic-planning based approaches
Proposed directions
Improving expressiveness of PSP planners
Handling goals needing degree of satisfaction (e.g., numeric goals)
Handling goals with soft deadlines (where the utility of delayed goals is reduced)
Handling complex interactions between objectives
Interactions between the plans of the goals
Interactions between the utilities of the goals
Improving search in PSP planners
More powerful heuristics for PSP planning (which take interactions into account)
More flexible search frameworks for non-combinable costs and utilities
Multi-objective search
Applications
Replanning as a PSP planning problem

5
Formulation

PSP Net Benefit
Given a planning problem $P = (F, A, I, G)$, and for each action $a$ a cost $c_a \ge 0$, and for each goal fluent $f \in G$ a utility $u_f \ge 0$, and a positive number $k$: is there a finite sequence of actions $\Delta = (a_1, a_2, \ldots, a_n)$ that, starting from $I$, leads to a state $S$ that has net benefit $\sum_{f \in (S \cap G)} u_f - \sum_{a \in \Delta} c_a \ge k$?

Maximize the Net Benefit
Actions have execution costs, goals have utilities, and the objective is to find the plan that has the highest net benefit. It is easy enough to extend this to a mixture of soft and hard goals.
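As a minimal sketch of the net benefit definition above, the following Python computes the utility of the goals that hold in the final state minus the cost of the actions in the plan. The rover fluents, action names, and all numbers are hypothetical and chosen only for illustration; they do not come from the proposal.

```python
from typing import Dict, Iterable, Set


def net_benefit(final_state: Set[str],
                plan: Iterable[str],
                action_cost: Dict[str, float],
                goal_utility: Dict[str, float]) -> float:
    """Utility of the goals holding in final_state minus the total plan cost."""
    achieved = sum(u for goal, u in goal_utility.items() if goal in final_state)
    spent = sum(action_cost[name] for name in plan)
    return achieved - spent


# Hypothetical rover instance (names and numbers are illustrative assumptions).
goal_utility = {"have_soil": 20, "have_rock": 30, "have_image": 10}
action_cost = {"navigate_x_y": 10, "sample_soil_y": 5, "sample_rock_y": 25}

plan = ["navigate_x_y", "sample_soil_y"]
final_state = {"at_y", "have_soil"}  # state assumed to result from the plan

benefit = net_benefit(final_state, plan, action_cost, goal_utility)

# The decision version of the problem asks whether the net benefit reaches k.
k = 3
meets_threshold = benefit >= k
```

Here only one of the three goals is achieved, which is acceptable in PSP: the plan is judged by its net benefit, not by whether it satisfies every goal.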
6
A spectrum of approaches for PSP Net Benefit
AAAI 2004; KBCS 2004

EXACT METHODS
Deterministic MDPs
Model the problem as a deterministic MDP with action costs, where a state has a reward equal to the utility of the goals that hold in it.
A special action Done takes the agent from any state S to a sink state Sd.
Guaranteed optimal, but very slow (using SPUDD, a state-of-the-art MDP solver)
Optiplan
Integer-programming based STRIPS planner
Optimal for a given plan length
Equivalent to a bounded-horizon MDP
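To make the deterministic MDP view with a Done action concrete, here is a small sketch that solves a bounded-horizon version by memoized dynamic programming. It is neither SPUDD nor the Optiplan IP encoding; it only illustrates the model. The `Action` class, the `solve` function, and the rover domain are hypothetical.

```python
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Action:
    name: str
    pre: frozenset
    add: frozenset
    delete: frozenset
    cost: int


# Hypothetical rover domain (names and numbers are illustrative assumptions).
ACTIONS = (
    Action("navigate_x_y", frozenset({"at_x"}), frozenset({"at_y"}),
           frozenset({"at_x"}), 10),
    Action("sample_soil_y", frozenset({"at_y"}), frozenset({"have_soil"}),
           frozenset(), 5),
    Action("sample_rock_y", frozenset({"at_y"}), frozenset({"have_rock"}),
           frozenset(), 25),
)
UTILITY = {"have_soil": 20, "have_rock": 30}


@lru_cache(maxsize=None)
def solve(state: frozenset, horizon: int):
    """Return (best net benefit, best plan) using at most `horizon` actions."""
    # Option 1: take Done now and collect the reward of the goals that hold.
    best_value = sum(u for goal, u in UTILITY.items() if goal in state)
    best_plan = ()
    if horizon == 0:
        return best_value, best_plan
    # Option 2: pay for one applicable action and continue from its successor.
    for action in ACTIONS:
        if action.pre <= state:
            successor = (state - action.delete) | action.add
            value, plan = solve(successor, horizon - 1)
            value -= action.cost
            if value > best_value:
                best_value, best_plan = value, (action.name,) + plan
    return best_value, best_plan


start = frozenset({"at_x"})
short = solve(start, 2)   # optimal among plans of length <= 2
longer = solve(start, 3)  # a longer horizon can uncover a better plan
```

In this toy domain the result is optimal only for the given horizon: with two steps the best plan samples soil only, while three steps also allow sampling the rock.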

HEURISTIC METHODS
AltAlt^ps
Heuristic planner that selects the objectives up front heuristically
Novel use of planning-graph based reachability analysis to pick objectives
Not optimal, but quite fast
Sapa^ps
Models PSP as heuristic search. Can be optimal given admissible heuristics.
Can be thought of as a search-based solution to the deterministic MDP

Source of strength: planning-graph based reachability heuristics for PSP
7
Comparison of approaches
Exact algorithms based on MDPs don't scale at all
AAAI 2004
8
Adapting PG heuristics for PSP
(optional)

Challenges
Need to propagate costs on the planning graph
The exact set of goals is not clear
Interactions between goals
The obvious approach of considering all $2^n$ goal subsets is infeasible

Idea: Select a subset of the top-level goals up front
Challenge: Goal interactions
Approach: Estimate the net benefit of each goal in terms of its utility minus the cost of its relaxed plan
Bias the relaxed plan extraction to (re)use the actions already chosen for other goals
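The goal-selection idea above can be sketched as a greedy loop: estimate each goal's net benefit as its utility minus the cost of the relaxed-plan actions not already chosen, and keep adding the best goal while the estimate is positive. This is an illustrative simplification, not the actual AltAlt^ps procedure. The relaxed plans are assumed to be given as precomputed sets of actions, and `select_goals` and all data are hypothetical.

```python
def select_goals(goal_utility, relaxed_plan, action_cost):
    """Greedily pick goals whose marginal estimated net benefit is positive."""
    selected_goals = []
    chosen_actions = set()
    remaining = set(goal_utility)

    def marginal(goal):
        # Actions already chosen for other goals are reused at no extra cost.
        new_actions = relaxed_plan[goal] - chosen_actions
        return goal_utility[goal] - sum(action_cost[a] for a in new_actions)

    while remaining:
        best = max(sorted(remaining), key=marginal)
        if marginal(best) <= 0:
            break  # no remaining goal is estimated to pay for itself
        selected_goals.append(best)
        chosen_actions |= relaxed_plan[best]
        remaining.remove(best)
    return selected_goals, chosen_actions


# Hypothetical rover data (names and numbers are illustrative assumptions).
goal_utility = {"have_soil": 20, "have_rock": 30, "have_image": 10}
action_cost = {"navigate_x_y": 10, "sample_soil_y": 5, "sample_rock_y": 25,
               "navigate_x_z": 15, "take_picture_z": 5}
relaxed_plan = {
    "have_soil": {"navigate_x_y", "sample_soil_y"},
    "have_rock": {"navigate_x_y", "sample_rock_y"},
    "have_image": {"navigate_x_z", "take_picture_z"},
}

goals, actions = select_goals(goal_utility, relaxed_plan, action_cost)
```

In this example `have_rock` looks unprofitable on its own (30 - 35), but becomes worthwhile once the navigation action chosen for `have_soil` is reused, which is the interaction the slide points at.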

9
Sapa^ps: A forward A* approach for PSP
(optional)
Anytime A* algorithm: search through the best beneficial nodes
[Figure: example rover actions A1: Navigate(X,Y), A2: SampleSoil(Y), A3: TakePicture, A4: Navigate(Y,Z), A5: SampleRock(Y)]
A*: $f(S) = g(S) + h(S)$
$g(S)$ is the net benefit of the plan that got us from the initial state to $S$: the difference between the utility of the goals holding in $S$ and the cost of the actions that took us from $I$ to $S$
$h^*(S)$ is the additional net benefit of the best plan $P$ starting from $S$ (if $S'$ is the result of applying $P$ to $S$, then we want to maximize $U(S') - U(S) - C(P)$)
$h(S)$ is the estimate of $h^*(S)$
10
Sapa^ps: Modeling A* search for PSP
(optional)

Many state-of-the-art planners use best-first A* search.
How can A* search be modeled for PSP Net Benefit?

Classical planning
Search node evaluation ($f = g + h$)
Lowest expected total number of actions
Candidate plans
Qualifying plans: achieve all goals
Search termination criterion
Achieving all goals

PSP Net Benefit
Search node evaluation ($f = g + h$)
Highest expected total benefit (goal utility - action cost)
Candidate plans
Beneficial plans: total achieved goal utility > total action cost
Search termination criterion
No search node appears to be extendable to be more beneficial than the best beneficial plan found
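The PSP search scheme described above (evaluate nodes by f = g + h, keep the best beneficial plan found so far, stop when no open node promises more) can be sketched as follows. This is a toy illustration, not the Sapa^ps implementation: the heuristic is a deliberately crude optimistic bound (the utility of all goals not yet achieved, ignoring costs), and the `Action` class, `anytime_search`, and the rover domain are hypothetical.

```python
import heapq
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Action:
    name: str
    pre: frozenset
    add: frozenset
    delete: frozenset
    cost: int


def anytime_search(initial, actions, utility):
    """Best-first search maximizing net benefit; returns (value, plan)."""
    def achieved(state):
        return sum(u for goal, u in utility.items() if goal in state)

    def h(state):
        # Optimistic bound: every goal not yet achieved is gained for free.
        return sum(u for goal, u in utility.items() if goal not in state)

    best_value, best_plan = achieved(initial), ()  # the empty plan
    tie = count()  # tie-breaker so heap entries never compare states
    f0 = achieved(initial) + h(initial)
    open_list = [(-f0, next(tie), initial, 0, ())]
    cheapest = {initial: 0}  # lowest cost at which each state was reached

    while open_list:
        neg_f, _, state, cost, plan = heapq.heappop(open_list)
        if -neg_f <= best_value:
            break  # no open node can beat the best beneficial plan found
        g = achieved(state) - cost
        if g > best_value:
            best_value, best_plan = g, plan  # improved anytime solution
        for action in actions:
            if action.pre <= state:
                successor = (state - action.delete) | action.add
                new_cost = cost + action.cost
                if new_cost >= cheapest.get(successor, float("inf")):
                    continue  # same state was already reached as cheaply
                cheapest[successor] = new_cost
                f = achieved(successor) - new_cost + h(successor)
                heapq.heappush(open_list, (-f, next(tie), successor,
                                           new_cost, plan + (action.name,)))
    return best_value, best_plan


# Hypothetical rover domain (names and numbers are illustrative assumptions).
ACTIONS = (
    Action("navigate_x_y", frozenset({"at_x"}), frozenset({"at_y"}),
           frozenset({"at_x"}), 10),
    Action("sample_soil_y", frozenset({"at_y"}), frozenset({"have_soil"}),
           frozenset(), 5),
    Action("sample_rock_y", frozenset({"at_y"}), frozenset({"have_rock"}),
           frozenset(), 25),
)
UTILITY = {"have_soil": 20, "have_rock": 30}

value, plan = anytime_search(frozenset({"at_x"}), ACTIONS, UTILITY)
```

Because the heuristic never underestimates the achievable benefit, the loop can stop as soon as the best open node's f does not exceed the incumbent. A stronger planning-graph based estimate would prune far more, which is the point of the heuristics discussed in the slides.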

11
Proposal Overview

Preliminary work
Simple formal model: PSP Net Benefit
MDP-based, IP-based, and heuristic-planning based approaches
Proposed directions
Improving expressiveness of PSP planners
Handling goals needing degree of satisfaction (e.g., numeric goals)
Handling goals with soft deadlines (where the utility of delayed goals is reduced)
Handling complex interactions between objectives
Interactions between the plans of the goals
Interactions between the utilities of the goals
Improving search in PSP planners
More powerful heuristics for PSP planning (which take interactions into account)
More flexible search frameworks for non-combinable costs and utilities
Multi-objective search
Applications
Replanning as a PSP planning problem

12
Search Heuristic Improvements

Make objective selection more sensitive to goal (achievement) interactions
Consider group interactions
Consider negative interactions
Preliminary work in ICAPS 2005 (with Sanchez Nigenda)
Consider faster techniques for exact methods
Leverage our recent work on novel IP encodings
Based on loosely coupled network flow problems, which are highly competitive with SAT methods
ICAPS 2005 (with van den Briel)
Consider adapting directed and anytime MDP techniques

13
Degree and Delay of Satisfaction

In metric temporal domains, PSP will involve
Partial degree of satisfaction
If you can't give me 1000, give me at least half
Need to track costs for various intervals of a numeric quantity
Delayed satisfaction
If you submit the homework past the deadline, you will get penalty points

Preliminary work on degree of satisfaction in IJCAI 2005
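As a rough sketch of the two notions above, goal utility can be made a function of how much of a numeric goal is achieved and of how late it is achieved. The linear shapes below are assumptions made purely for illustration (the slides do not specify a utility shape), and both function names are hypothetical.

```python
def partial_degree_utility(amount, target, full_utility):
    """Utility proportional to the achieved fraction of a numeric goal."""
    fraction = max(0.0, min(amount / target, 1.0))
    return full_utility * fraction


def delayed_utility(finish_time, deadline, full_utility, penalty_per_unit):
    """Full utility up to the soft deadline, then a linear penalty per time unit."""
    lateness = max(0.0, finish_time - deadline)
    return max(0.0, full_utility - penalty_per_unit * lateness)


# "If you can't give me 1000, give me at least half."
half = partial_degree_utility(amount=500, target=1000, full_utility=100)

# Finishing two time units after a deadline at t = 14.
late = delayed_utility(finish_time=16, deadline=14, full_utility=100,
                       penalty_per_unit=20)
on_time = delayed_utility(finish_time=13, deadline=14, full_utility=100,
                          penalty_per_unit=20)
```

With such functions the planner no longer faces a yes/no choice per goal: it must weigh the extra cost of collecting more, or finishing sooner, against the utility gained.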
14
Utility interactions between goals

PSP Net Benefit considers goal achievement interactions
..but assumes an additive model of goal utilities
$U(G_1, G_2) = U(G_1) + U(G_2)$
The additive utility model is often unrealistic
The utility of having two shoes is much more than the sum of the utilities of having either one of them
The utility of having two cars is less than the sum of the utilities of having either one of them
Challenges
Elicit utility models (preference elicitation)
Model utility interactions
Adapt and extend CP-nets for modeling goal utilities
Can also consider qualitative preference models
Extend the reachability heuristics to consider both plan interactions and goal interactions
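One simple way to move beyond the additive model is to add a bonus or penalty whenever a particular group of goals is achieved together. The sketch below uses that scheme for the shoes and cars examples. It is a plain interaction-term model, not a CP-net, and the function name and all numbers are hypothetical.

```python
def set_utility(achieved, base_utility, interaction):
    """Sum of individual utilities plus adjustments for goal groups held together."""
    total = sum(base_utility[goal] for goal in achieved)
    for group, adjustment in interaction.items():
        if group <= achieved:  # every goal in the group is achieved
            total += adjustment
    return total


# Illustrative numbers only.
base_utility = {"left_shoe": 5, "right_shoe": 5, "car_a": 100, "car_b": 100}
interaction = {
    frozenset({"left_shoe", "right_shoe"}): 40,   # complements: worth more together
    frozenset({"car_a", "car_b"}): -60,           # substitutes: worth less together
}

shoes = set_utility({"left_shoe", "right_shoe"}, base_utility, interaction)
cars = set_utility({"car_a", "car_b"}, base_utility, interaction)
one_shoe = set_utility({"left_shoe"}, base_utility, interaction)
```

Under this model the pair of shoes is worth 50 rather than the additive 10, and the two cars are worth 140 rather than 200, so objective selection has to reason about goal sets instead of individual goals.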

15
Non-combinable costs/utilities

PSP Net Benefit assumes costs and utilities are in the same units
..this often does not hold
E.g., different types of resource costs (fuel, manpower) and different types of utilities
Solution: Multi-objective search
Either elicit utility models
Alpha * manpower + Beta * mission utility
..or search for the highest utility plans given a specific resource bound
..or provide a Pareto (non-dominated) set of solution plans and let the user choose
Challenge: Need to adapt reachability heuristics to separately track the various types of costs and utilities
We plan to build on our work on multi-objective temporal planning in SAPA
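The Pareto option above can be illustrated with a small non-dominated filter over candidate plans, each summarized by its resource costs and utility. The `PlanSummary` type, the choice of fuel and manpower as the two cost dimensions, and the numbers are assumptions for illustration only.

```python
from typing import List, NamedTuple


class PlanSummary(NamedTuple):
    name: str
    fuel: float
    manpower: float
    utility: float


def dominates(p: PlanSummary, q: PlanSummary) -> bool:
    """p dominates q if it is no worse on every objective and better on one."""
    no_worse = (p.fuel <= q.fuel and p.manpower <= q.manpower
                and p.utility >= q.utility)
    better = (p.fuel < q.fuel or p.manpower < q.manpower
              or p.utility > q.utility)
    return no_worse and better


def pareto_front(plans: List[PlanSummary]) -> List[PlanSummary]:
    """Keep only the plans that no other plan dominates."""
    return [p for p in plans if not any(dominates(q, p) for q in plans)]


# Hypothetical candidate plans.
candidates = [
    PlanSummary("A", fuel=10, manpower=2, utility=50),
    PlanSummary("B", fuel=12, manpower=2, utility=50),  # dominated by A
    PlanSummary("C", fuel=5, manpower=1, utility=20),
    PlanSummary("D", fuel=10, manpower=3, utility=60),
]

front_names = [p.name for p in pareto_front(candidates)]
```

No weights between fuel, manpower, and utility are needed here; the user is shown plans A, C, and D and makes the trade-off.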

16
Combining uncertainty and partial satisfaction

Time permitting, we hope to extend our PSP framework to handle stochastic domains
Planning in stochastic domains already has many natural affinities to PSP
If the planner wants to ensure that its plan reaches goals with higher probability, it often needs to go for longer (costlier) plans
..Many challenges remain in selecting objectives in stochastic domains
We expect to leverage our significant work in extending reachability heuristics for stochastic and non-deterministic domains
UAI 2005; AAAI 2005; ICAPS 2004; JAIR (in review)

Note: Not in the proposal draft
17
Explaining the planner's decisions in mixed-initiative scenarios

In mixed-initiative scenarios, humans would like to get explanations of the selected objectives
Anecdotal evidence suggests that in military planning applications, human users are not willing to accept a plan when the objectives selected by the planner do not match their intuition
Challenge: Explaining the optimality of the planner's decisions is technically hard
In contrast, explaining correctness is much simpler
Proposed approach: Modify the reachability heuristic computations to leave a trace of their reasoning
The intent would be to explain at least the Pareto-optimality of the selected set of objectives
When a subgoal cannot be included because of cost-based or preference-based interactions with other selected subgoals, annotate this fact
Summarize the Pareto set (in multi-objective optimization cases) in terms of conditional plans explaining which member of the set is optimal under what conditions
Support sensitivity analysis on the stability of the selected objectives (i.e., under what conditions will they no longer be optimal)

18
Modeling Replanning as a PSP problem

Traditionally, replanning has been cast as a procedure rather than a problem
Modify the old plan to handle the new situation
..we take the stance that replanning is a problem
Achieve the original goals of the agent from the current initial situation
Subject to various constraints that were imposed by the partial execution of the original plan
Reservations, commitments: these are, however, soft constraints
..Replanning can best be modeled as a PSP problem!
We propose to do this..

19
Summary and Impact

PSP planning problems are ubiquitous and extend the modeling power of planning frameworks
..by foregrounding user preferences among different objectives
They pose interesting technical challenges to the state of the art
..by emphasizing plan-quality considerations
We have already made significant progress in handling PSP problems
AAAI 2004; ICAPS 2005 (2); IJCAI 2005
..and propose to extend our framework significantly
..as well as demonstrate its power through applications