Title: Fifth International Conference on Autonomous Agents and Multiagent Systems AAMAS06
1 Fifth International Conference on Autonomous Agents and Multi-agent Systems (AAMAS-06)
Exact Solutions of Interactive POMDPs Using Behavioral Equivalence
Speaker: Prashant Doshi, University of Georgia
Authors: B. Rathnasabapathy, Prashant Doshi, and Piotr Gmytrasiewicz
2 Overview
- I-POMDP: a framework for sequential decision making for an agent in a multi-agent setting
- Takes the perspective of an individual in an interaction
- Problem
- Cardinality of the interactive state space is infinite
- Other agent's models (incl. beliefs) are part of an agent's state space (interactive epistemology)
- An algorithm for solving I-POMDPs exactly
- Aggregate behaviorally equivalent models of other agents
3 Background: Properties of POMDPs and I-POMDPs
- Finitely nested
- Beliefs are nested up to a finite strategic level l
- Level 0 models are POMDPs
- The value function of a POMDP and of a finitely nested I-POMDP is piecewise linear and convex (PWLC)
- Agents' behaviors in a POMDP and in a finitely nested I-POMDP can be represented using policy trees
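The PWLC property means the value function can be written as the maximum over a finite set of linear functions of the belief, often called alpha vectors. The following is a minimal sketch of that evaluation; the function name and the numbers in `ALPHAS` are hypothetical and chosen only for illustration, not taken from the talk.

```python
from typing import Sequence


def value(belief: Sequence[float], alpha_vectors: Sequence[Sequence[float]]) -> float:
    """Evaluate a PWLC value function: V(b) = max over alpha of (alpha . b)."""
    return max(
        sum(a * p for a, p in zip(alpha, belief))
        for alpha in alpha_vectors
    )


# Hypothetical alpha vectors over two states (illustrative numbers only).
ALPHAS = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6)]

corner = value((1.0, 0.0), ALPHAS)   # the first vector is best at this corner
middle = value((0.5, 0.5), ALPHAS)   # the flat vector is best in the middle
```

Because each alpha vector is linear in the belief and the maximum of linear functions is convex, the value at a mixture of two beliefs never exceeds the same mixture of their values.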
4 Interactive POMDPs
- Definition
- Interactive state space: $IS_i = S \times M_j$, where $M_j = \Theta_j \cup SM_j$
- $S$: set of physical states
- $\Theta_j$: set of intentional models
- $SM_j$: set of subintentional models
- Intentional models contain the other agent's beliefs
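5 Example: Single-Agent Tiger Problem
[Figure: single-agent tiger problem. Labels shown: -100, 10, -1 (the standard tiger-problem rewards for opening the tiger's door, opening the gold door, and listening)]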
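In the tiger problem the agent's belief reduces to one number, the probability that the tiger is behind the left door, and each growl updates it by Bayes' rule. The sketch below assumes a listening accuracy of 0.85, the value commonly used for this benchmark (the slide does not state it); the function name is hypothetical.

```python
def listen_update(p_left: float, heard_left: bool, accuracy: float = 0.85) -> float:
    """Bayes update of P(tiger behind left door) after one growl.

    `accuracy` is the assumed probability that the growl comes from the
    tiger's true side.
    """
    like_left = accuracy if heard_left else 1.0 - accuracy    # P(obs | tiger left)
    like_right = 1.0 - accuracy if heard_left else accuracy   # P(obs | tiger right)
    numerator = like_left * p_left
    return numerator / (numerator + like_right * (1.0 - p_left))


# Starting from a uniform belief, one growl from the left, then one from the right.
after_left = listen_update(0.5, heard_left=True)
after_both = listen_update(after_left, heard_left=False)
```

Two contradictory growls cancel out and return the belief to its starting point, which is why the agent typically needs several consistent observations before opening a door.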
6 Behaviorally Equivalent Models
Equivalence Classes of Beliefs
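Beliefs are behaviorally equivalent when they lead to the same optimal behavior, so a PWLC value function splits the belief space into a finite number of regions, one per maximizing alpha vector. The sketch below computes those regions for a two-state problem. As an illustrative horizon-1 case (not the figure from the talk), it uses the tiger rewards -100, 10, and -1 as alpha vectors; the function and variable names are hypothetical.

```python
from itertools import combinations


def belief_classes(alphas):
    """Partition p = P(state 0) in [0, 1] into intervals with one optimal alpha vector.

    Each alpha is (value in state 0, value in state 1); its value at belief p
    is p * alpha[0] + (1 - p) * alpha[1].
    Returns a list of (low, high, index of the optimal alpha vector).
    """
    def val(alpha, p):
        return p * alpha[0] + (1.0 - p) * alpha[1]

    # Candidate breakpoints: beliefs in (0, 1) where two lines cross.
    points = {0.0, 1.0}
    for a, b in combinations(alphas, 2):
        slope_diff = (a[0] - a[1]) - (b[0] - b[1])
        if slope_diff != 0:
            p = (b[1] - a[1]) / slope_diff
            if 0.0 < p < 1.0:
                points.add(p)
    cuts = sorted(points)

    # Label each interval by its best vector and merge equal neighbours.
    classes = []
    for low, high in zip(cuts, cuts[1:]):
        mid = (low + high) / 2.0
        best = max(range(len(alphas)), key=lambda k: val(alphas[k], mid))
        if classes and classes[-1][2] == best:
            classes[-1] = (classes[-1][0], high, best)
        else:
            classes.append((low, high, best))
    return classes


# State 0 = tiger left, state 1 = tiger right (assumed ordering).
TIGER_ALPHAS = [
    (-100.0, 10.0),   # open left door
    (-1.0, -1.0),     # listen
    (10.0, -100.0),   # open right door
]
classes = belief_classes(TIGER_ALPHAS)
```

For these vectors the belief interval splits into three classes with boundaries at 0.1 and 0.9: open the left door, listen, and open the right door. Every belief inside one interval can be represented by a single model.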
7 Equivalence Classes of Interactive States
- Definition
- Combination of a physical state and an equivalence class of models
8 Lossless Aggregation
- In a finitely nested I-POMDP, a probability distribution over the equivalence classes of interactive states, ECIS_i, provides a sufficient statistic for the past history of i's observations
- Transformation of the interactive state space into behavioral equivalence classes is value-preserving
- Optimal policy of the transformed finitely nested I-POMDP remains unchanged
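The aggregation step can be pictured as summing agent i's probability mass over all models of j that fall in the same equivalence class. The sketch below is an illustration under stated assumptions: j's models are identified by a single belief P(tiger left), and the class boundaries 0.1 and 0.9 are those of a horizon-1 tiger problem with rewards -100, 10, and -1. All names are hypothetical.

```python
from collections import defaultdict


def classify(p_left: float) -> str:
    """Map one of j's beliefs to its behavioral equivalence class.

    Thresholds are the assumed horizon-1 boundaries, not values from the talk.
    """
    if p_left < 0.1:
        return "open-left"
    if p_left > 0.9:
        return "open-right"
    return "listen"


def aggregate(belief_over_models: dict) -> dict:
    """Turn i's belief over individual models of j into a belief over classes."""
    belief_over_classes = defaultdict(float)
    for model, prob in belief_over_models.items():
        belief_over_classes[classify(model)] += prob
    return dict(belief_over_classes)


# i's belief over four candidate models of j (keys are j's P(tiger left)).
belief_i = {0.05: 0.2, 0.3: 0.3, 0.5: 0.1, 0.95: 0.4}
belief_on_classes = aggregate(belief_i)
```

Models 0.3 and 0.5 both prescribe listening, so their mass is merged. Agent i's prediction of j's behavior is the same before and after merging, which is the intuition behind the aggregation being lossless.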
9 Solving I-POMDPs Exactly
- Procedure Solve-IPOMDP(AGENT_i, Belief Nesting L) Returns Policy
- If L = 0 Then
- Return Policy ← Solve-POMDP(AGENT_i)
- Else
- For all AGENT_j <> AGENT_i
- Policy_j ← Solve-IPOMDP(AGENT_j, L-1)
- End
- M_j ← Behavioral-Equivalence-Models(Policy_j)
- ECIS_i ← S × {×_j M_j}
- Policy ← Modified-GIP(ECIS_i, A_i, T_i, Ω_i, O_i, R_i)
- Return Policy
- End
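The recursive structure of the procedure can be sketched in Python as follows. This is only a skeleton: the three solver steps (`solve_pomdp`, `equivalence_classes`, `modified_gip`) are hypothetical callables passed in by the caller, and their signatures are assumptions made for this illustration rather than the authors' implementation. The equivalence classes are collected inside the loop so the sketch also covers more than one other agent.

```python
from itertools import product


def solve_ipomdp(agent, level, agents, states,
                 solve_pomdp, equivalence_classes, modified_gip):
    """Recursive skeleton: solve the other agents one level down, then solve agent."""
    if level == 0:
        # Level 0 models are ordinary POMDPs.
        return solve_pomdp(agent)

    classes_per_agent = []
    for other in agents:
        if other == agent:
            continue
        policy_j = solve_ipomdp(other, level - 1, agents, states,
                                solve_pomdp, equivalence_classes, modified_gip)
        # Finite set of behavioral equivalence classes induced by j's policy.
        classes_per_agent.append(equivalence_classes(policy_j))

    # ECIS_i: physical states crossed with one class per other agent.
    ecis = [(s, *combo)
            for s in states
            for combo in product(*classes_per_agent)]
    return modified_gip(agent, ecis)


# Stub solvers, only to show the call structure (not real solvers).
policy = solve_ipomdp(
    "i", 1, ["i", "j"], ["tiger-left", "tiger-right"],
    solve_pomdp=lambda a: ("pomdp-policy", a),
    equivalence_classes=lambda pol: [(pol, 0), (pol, 1)],
    modified_gip=lambda a, ecis: (a, len(ecis)),
)
```

With two physical states and two equivalence classes for j, the stubbed call builds four equivalence classes of interactive states, a finite space that a POMDP-style solver can handle.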
10 Multi-Agent Persistent-Tiger Problem
[Figure: multi-agent tiger problem. Labels shown: -100, 10]
- Observations: {Growl Left, Growl Right} × {Creak Right, Creak Left, Silence}
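Agent i's observation set is the Cartesian product of the two growls and the three door sounds, giving six joint observations. The sketch below enumerates them and defines an observation probability under two assumptions that are not stated on the slide: growls depend only on the tiger's location, creaks depend only on j's action, and the accuracies 0.85 and 0.9 are placeholder values. The names are hypothetical.

```python
from itertools import product

GROWLS = ("Growl Left", "Growl Right")
CREAKS = ("Creak Left", "Creak Right", "Silence")
OBSERVATIONS = list(product(GROWLS, CREAKS))   # six joint observations

# Sound that j's action is assumed to produce most of the time.
EXPECTED_CREAK = {
    "open-left": "Creak Left",
    "open-right": "Creak Right",
    "listen": "Silence",
}


def obs_prob(growl, creak, tiger, other_action, growl_acc=0.85, creak_acc=0.9):
    """P(growl, creak | tiger location, j's action), assuming independence."""
    expected_growl = "Growl Left" if tiger == "left" else "Growl Right"
    p_growl = growl_acc if growl == expected_growl else 1.0 - growl_acc
    # The remaining creak probability is split evenly over the two other sounds.
    p_creak = creak_acc if creak == EXPECTED_CREAK[other_action] else (1.0 - creak_acc) / 2.0
    return p_growl * p_creak


total = sum(obs_prob(g, c, "left", "listen") for g, c in OBSERVATIONS)
```

A higher `creak_acc` corresponds to agent i being better able to observe j's actions.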
11 Beliefs on ECIS
Agent j's policy
12 Agent i's policy in the presence of another agent j
Policy becomes diverse as i's ability to observe j's actions improves
14 Conclusions
- A method that enables exact solution of finitely nested interactive POMDPs
- Aggregate agent models into behavioral equivalence classes
- Discretization is lossless
- Interesting behaviors emerge in the multi-agent Tiger problem
15 Thank You and Please Stop by My Poster
Questions?