Title: Automated negotiations: Agents interacting with other automated agents and with humans
1 Automated negotiations: Agents interacting with other automated agents and with humans
- Sarit Kraus
- Department of Computer Science
- Bar-Ilan University
- University of Maryland
- sarit_at_cs.biu.ac.il
http://www.cs.biu.ac.il/sarit/
2 Negotiations
- A discussion in which interested parties
exchange information and come to an agreement.
Davis and Smith, 1977
3 Negotiations
- NEGOTIATION is an interpersonal decision-making
process necessary whenever we cannot achieve our
objectives single-handedly.
4 Agent environments
- Teams of agents that need to coordinate joint activities. Problems: distributed information, distributed decision making, local conflicts.
- Open agent environments: agents acting in the same environment. Problems: need motivation to cooperate, conflict resolution, trust, distributed and hidden information.
5 Open Agent Environments
- Consist of:
  - Automated agents developed by or serving different people or organizations.
  - People with a variety of interests and institutional affiliations.
- The computer agents are self-interested; they may cooperate to further their interests.
- The set of agents is not fixed.
6 Open Agent Environments (examples)
- Agents support people
  - Collaborative interfaces
  - CSCW (Computer Supported Cooperative Work) systems
  - Cooperative learning systems
  - Military-support systems
- Agents act as proxies for people
  - Coordinating schedules
  - Patient care-delivery systems
  - Online auctions
- Groups of agents act autonomously alongside people
  - Simulation systems for education and training
  - Computer games and other forms of entertainment
  - Robots in rescue operations
  - Software personal assistants
8 Examples
- Monitoring electricity networks (Jennings)
- Distributed design and engineering (Petrie et al.)
- Distributed meeting scheduling (Sen & Durfee)
- Teams of robotic systems acting in hostile environments (Balch & Arkin, Tambe)
- Collaborative Internet agents (Etzioni & Weld, Weiss)
- Collaborative interfaces (Grosz & Ortiz, Andre)
- Information agents on the Internet (Klusch)
- Cooperative transportation scheduling (Fischer)
- Supporting hospital patient scheduling (Decker & Jin)
- Intelligent agents for command and control (Sycara)
9 Types of agents
- Fully rational agents
- Bounded rational agents
10 Using other disciplines' results
- No need to start from scratch!
- Required modification and adjustment: AI gives insights and complementary methods.
- Is it worth it to use formal methods for multi-agent systems?
11 Negotiating with rational agents
- Quantitative decision making
- Maximizing expected utility
- Nash equilibrium, Bayesian Nash equilibrium
- Automated negotiator
  - Model the scenario as a game
  - The agent computes (if complexity allows) the equilibrium strategy, and acts accordingly.
- (Kraus, Strategic Negotiation in Multiagent Environments, MIT Press, 2001)
12 Game Theory studies situations of strategic
interaction in which each decision maker's plan
of action depends on the plans of the other
decision makers.
- Short introduction to game theory
13 Decision Theory (reminder): How to make decisions
- Decision Theory = Probability Theory (deals with chance) + Utility Theory (deals with outcomes)
- Fundamental idea:
  - The MEU (Maximum Expected Utility) principle
  - Weigh the utility of each outcome by the probability that it occurs
14 Basic Principle
- Given probabilities P(outj | Ai) and utilities U(outj) for the outcomes outj of each action Ai
- Expected utility of an action Ai: EU(Ai) = Σ_{outj ∈ OUT} U(outj) · P(outj | Ai)
- Choose the Ai that maximizes EU: MEU = argmax_{Ai ∈ Ac} Σ_{outj ∈ OUT} U(outj) · P(outj | Ai)
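As a sketch, the MEU rule above is a one-line maximization; the actions, probabilities, and utilities below are illustrative, not from the slides:

```python
# MEU: pick the action whose probability-weighted utility is largest.
def expected_utility(outcomes):
    # outcomes: list of (probability, utility) pairs for one action
    return sum(p * u for p, u in outcomes)

def meu_choice(actions):
    # actions: dict mapping action name -> list of (probability, utility)
    return max(actions, key=lambda a: expected_utility(actions[a]))

# Illustrative example: carry an umbrella or not, with a 30% chance of rain.
actions = {
    "umbrella":    [(0.3, 50), (0.7, 80)],    # dry but encumbered
    "no_umbrella": [(0.3, -40), (0.7, 100)],  # soaked if it rains
}
best = meu_choice(actions)  # compares 0.3*50 + 0.7*80 = 71 vs 0.3*(-40) + 0.7*100 = 58
```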
15 Risk Averse, Risk Neutral, Risk Seeking
(figure: utility-of-money curves: convex for a risk seeker, linear for risk neutral, concave for risk averse)
16 Game Description
- Players
- Who participates in the game?
- Actions / Strategies
- What can each player do?
- In what order do the players act?
- Outcomes / Payoffs
- What is the outcome of the game?
- What are the players' preferences over the
possible outcomes?
17 Game Description (cont.)
- Information
  - What do the players know about the parameters of the environment or about one another?
  - Can they observe the actions of the other players?
- Beliefs
  - What do the players believe about the unknown parameters of the environment or about one another?
  - What can they infer from observing the actions of the other players?
18 Strategies and Equilibrium
- Strategy
  - Complete plan, describing an action for every contingency
- Nash Equilibrium
  - Each player's strategy is a best response to the strategies of the other players
  - Equivalently: no player can improve his payoff by changing his strategy alone
  - A self-enforcing agreement; no need for formal contracting
- Other equilibrium concepts also exist
19 Classification of Games
- Depending on the timing of moves
  - Games with simultaneous moves
  - Games with sequential moves
- Depending on the information available to the players
  - Games with perfect information
  - Games with imperfect (or incomplete) information
- We concentrate on non-cooperative games
  - Groups of players cannot deviate jointly
  - Players cannot make binding agreements
20 Games with Simultaneous Moves and Perfect Information
- All players choose their actions simultaneously, or just independently of one another
- There is no private information: all aspects of the game are known to the players
- Representation by game matrices
- Often called normal form games or strategic form games
21 Matching Pennies
Example of a zero-sum game. Strategic issue of
competition.
22 Prisoner's Dilemma
- Each player can cooperate or defect

                      Column
               cooperate   defect
Row cooperate    -1,-1     -10,0
    defect       0,-10     -8,-8

Main issue: tension between social optimality and individual incentives.
23 Coordination Games
- A supplier and a buyer need to decide whether to adopt a new purchasing system.

                   Buyer
                new     old
Supplier new   20,20    0,0
         old    0,0     5,5
24 Battle of the Sexes
The game involves both coordination and competition.
25 Definition of Nash Equilibrium
- A game has n players.
- Each player i has a strategy set Si
  - This is his set of possible actions
- Each player i has a payoff function
  - πi : S → R
- A strategy ti in Si is a best response if there is no other strategy in Si that produces a higher payoff, given the opponents' strategies
26 Definition of Nash Equilibrium
- A strategy profile is a list (s1, s2, …, sn) of the strategies each player is using
- If each strategy is a best response given the other strategies in the profile, the profile is a Nash equilibrium
- Why is this important?
  - If we assume players are rational, they will play Nash strategies
  - Even less-than-rational play will often converge to Nash in repeated settings
27 An Example of a Nash Equilibrium

            Column
           a      b
Row  a    1,2    0,1
     b    2,1    1,0

(b, a) is a Nash equilibrium: given that Column is playing a, Row's best response is b; given that Row is playing b, Column's best response is a.
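The best-response check can be automated by brute force. A sketch over the example's bimatrix (strategies indexed 0 = a, 1 = b, so the equilibrium (b, a) comes out as the index pair (1, 0)):

```python
# Brute-force search for pure-strategy Nash equilibria of a 2-player game.
# payoffs[r][c] = (row player's payoff, column player's payoff).
payoffs = [
    [(1, 2), (0, 1)],  # Row plays a
    [(2, 1), (1, 0)],  # Row plays b
]

def pure_nash(payoffs):
    equilibria = []
    rows, cols = len(payoffs), len(payoffs[0])
    for r in range(rows):
        for c in range(cols):
            # (r, c) is an equilibrium if neither player gains by deviating alone.
            row_best = all(payoffs[r][c][0] >= payoffs[r2][c][0] for r2 in range(rows))
            col_best = all(payoffs[r][c][1] >= payoffs[r][c2][1] for c2 in range(cols))
            if row_best and col_best:
                equilibria.append((r, c))
    return equilibria
```

Running it on this matrix returns only (1, 0), i.e. (b, a).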
28 Mixed strategies
- Unfortunately, not every game has a pure strategy equilibrium.
  - Rock-paper-scissors
- However, every finite game has a mixed strategy Nash equilibrium
  - Each action is assigned a probability of play
  - Each player is indifferent between actions, given these probabilities
29 Mixed Strategies

                        Wife
                  shopping   football
Husband football    0,0        2,1
        shopping    1,2        0,0
30 Mixed strategy
- Instead, each player selects a probability associated with each action
- Goal: the utility of each action is equal
  - Players are indifferent between their choices at these probabilities
- a = probability the husband chooses football
- b = probability the wife chooses shopping
- Since the husband's payoffs must be equal: b · 1 = (1 - b) · 2, so b = 2/3
- For the wife: a · 1 = (1 - a) · 2, so a = 2/3
- In each case, the expected payoff is 2/3
- 2/9 of the time they go to football together, 2/9 shopping together, and 5/9 they miscoordinate
- If they could synchronize ahead of time they could do better.
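The indifference argument can be verified numerically; a sketch with the Battle of the Sexes payoffs (football together gives (2, 1), shopping together (1, 2), miscoordination (0, 0)):

```python
# Verify the mixed-strategy equilibrium of the Battle of the Sexes.
a = 2 / 3  # probability the husband chooses football
b = 2 / 3  # probability the wife chooses shopping

# Husband's expected utility for each pure action against the wife's mix:
hu_football = b * 0 + (1 - b) * 2  # wife shops w.p. b, watches football w.p. 1-b
hu_shopping = b * 1 + (1 - b) * 0

# Wife's expected utility for each pure action against the husband's mix:
wu_shopping = a * 0 + (1 - a) * 2
wu_football = a * 1 + (1 - a) * 0

# Both players are indifferent, each with expected payoff 2/3;
# the couple miscoordinates with probability a*b + (1-a)*(1-b) = 5/9.
miscoordinate = a * b + (1 - a) * (1 - b)
```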
31 Rock, Paper, Scissors

                        Column
               rock    paper   scissors
Row  rock       0,0    -1,1     1,-1
     paper      1,-1    0,0    -1,1
     scissors  -1,1     1,-1    0,0
32 Setup
- Player 1 plays rock with probability pr, scissors with probability ps, and paper with probability 1 - pr - ps
- Utility2(rock) = 0·pr + 1·ps - 1·(1 - pr - ps) = 2ps + pr - 1
- Utility2(scissors) = -1·pr + 0·ps + 1·(1 - pr - ps) = 1 - 2pr - ps
- Utility2(paper) = 1·pr - 1·ps + 0·(1 - pr - ps) = pr - ps
- Player 2 wants to choose a probability for each action so that the expected payoff for each action is the same.
33 Setup
- Player 2's expected payoff is qr·(2ps + pr - 1) + qs·(1 - 2pr - ps) + (1 - qr - qs)·(pr - ps)
- It turns out (after some algebra) that the optimal mixed strategy is to play each action 1/3 of the time
- Intuition: what if you played rock half the time? Your opponent would then play paper half the time, and you'd lose more often than you won
- So you'd decrease the fraction of times you played rock, until your opponent had no edge in guessing what you'll do
34 Extensive Form Games
Any finite game of perfect information has a pure
strategy Nash equilibrium. It can be found by
backward induction.
Chess is a finite game of perfect information.
Therefore it is a trivial game from a game
theoretic point of view.
35 Extensive Form Games - Intro
- A game can have complex temporal structure
- Information:
  - set of players
  - who moves when and under what circumstances
  - what actions are available when called upon to move
  - what is known when called upon to move
  - what payoffs each player receives
- Foundation is a game tree
36 Example: Cuban Missile Crisis

(game tree: Khrushchev moves first, choosing Arm or Retract; after Arm, Kennedy chooses Nuke or Fold)
- Arm, then Nuke: -100, -100
- Arm, then Fold: 10, -10
- Retract: -1, 1

Pure strategy Nash equilibria: (Arm, Fold) and (Retract, Nuke)
37 Subgame perfect equilibrium: credible threats
- Proper subgame: a subtree (of the game tree) whose root is alone in its information set
- Subgame perfect equilibrium:
  - Strategy profile that is in Nash equilibrium in every proper subgame (including the root), whether or not that subgame is reached along the equilibrium path of play
38 Example: Cuban Missile Crisis (cont.)

(game tree as before: after Arm, Kennedy chooses Nuke (-100, -100) or Fold (10, -10); Retract yields (-1, 1))

Pure strategy Nash equilibria: (Arm, Fold) and (Retract, Nuke)
Pure strategy subgame perfect equilibrium: (Arm, Fold)
Conclusion: Kennedy's Nuke threat was not credible.
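Backward induction on this two-move tree takes only a few lines. A sketch, assuming (as the listed equilibria suggest) that Khrushchev moves first and Kennedy replies after Arm, with payoffs written as (Khrushchev, Kennedy):

```python
# Backward induction for the Cuban Missile Crisis game tree.
kennedy_payoffs = {"Nuke": (-100, -100), "Fold": (10, -10)}
retract_payoff = (-1, 1)

# Step 1: in the subgame after Arm, Kennedy picks his best reply.
kennedy_choice = max(kennedy_payoffs, key=lambda move: kennedy_payoffs[move][1])

# Step 2: Khrushchev compares Arm (anticipating Kennedy's reply) with Retract.
arm_payoff = kennedy_payoffs[kennedy_choice]
khrushchev_choice = "Arm" if arm_payoff[0] > retract_payoff[0] else "Retract"
```

The computation returns Fold for Kennedy and Arm for Khrushchev, recovering the subgame perfect equilibrium (Arm, Fold) and showing why the Nuke threat is not credible.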
39 Types of games
Diplomacy
40 Take-it-or-leave-it deals
- The rules of the game:
  - You will be randomly paired up with someone in the other section; this pairing will remain completely anonymous.
  - One of you will be chosen (by coin flip) to be either the Proposer or the Responder in this experiment.
  - The Proposer gets to make an offer to split 100 in some proportion with the Responder. So the Proposer can offer x to the Responder, proposing to keep 100-x for themselves.
  - The Responder must decide the lowest amount offered by the Proposer that he/she will accept, i.e. "I will accept any offer which is greater than or equal to y."
  - If the Responder accepts the offer made by the Proposer, they split the sum according to the proposal. If the Responder rejects, both parties lose their shares.
41 An example of buyer/seller negotiation
42 BARGAINING

(diagram: ZOPA, the zone of possible agreement, lies between s and b; x = final price; the seller's surplus is x - s, the buyer's surplus is b - x)
- Seller's RP s: the seller wants s or more
- Buyer's RP b: the buyer wants b or less
43 BARGAINING
- If b < s: negative bargaining zone, no possible agreements
- If b > s: positive bargaining zone, agreement possible
- (x - s) = seller's surplus
- (b - x) = buyer's surplus
- The total surplus to divide, (b - s), is independent of x: a constant-sum game!
44 POSITIVE BARGAINING ZONE

(diagram: the seller's bargaining range runs from the seller's reservation point to the seller's target point, the buyer's range from the buyer's target point to the buyer's reservation point; the two ranges overlap in a POSITIVE bargaining zone)
45 NEGATIVE BARGAINING ZONE

(diagram: as above, but the seller's and buyer's bargaining ranges do not overlap, leaving a NEGATIVE bargaining zone)
46 Single issue negotiation
- Agents a and b negotiate over a pie of size 1
- Offer (x, y), with x + y = 1
- Deadline n and discount factor d
- Utilities:
  - Ua((x, y), t) = x · d^(t-1) if t ≤ n, 0 otherwise
  - Ub((x, y), t) = y · d^(t-1) if t ≤ n, 0 otherwise
- The agents negotiate using Rubinstein's alternating offers protocol
47 Alternating offers protocol

Time   Offer           Response
1      a: (x1, y1)     b accepts/rejects
2      b: (x2, y2)     a accepts/rejects
...
n
48 Equilibrium strategies
- How much should an agent offer if there is only one time period?
- Let n = 1 and let a be the first mover
- Agent a's offer:
  - Propose to keep the whole pie: (1, 0); agent b will accept this
49 Equilibrium strategies for n = 2
- d = 1/4, first mover a
- Offer (x, y): x is a's share, y is b's share
- Optimal offers obtained using backward induction:

Time   Offering agent   Offer        Utilities
1      a → b            (3/4, 1/4)   3/4, 1/4  (agreement)
2      b → a            (0, 1)       0, 1/4

The offer (3/4, 1/4) forms a P.E. (subgame perfect) Nash equilibrium
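The same backward induction works for any deadline n and discount factor d; a minimal sketch (the function name is mine):

```python
# First mover's equilibrium share in single-issue alternating offers
# with deadline n and discount factor d, computed by backward induction.
def first_mover_share(n, d):
    share = 1.0  # at t = n the proposer takes the whole pie
    # Walk back from period n-1 to 1: the proposer keeps the pie minus
    # what the responder could secure (discounted) by waiting one period.
    for _ in range(n - 1):
        share = 1.0 - d * share
    return share
```

It reproduces the slides' cases: share 1 for n = 1, and 3/4 for n = 2 with d = 1/4; increasing d lowers the first mover's share, which answers the questions on the next slide.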
50 Effect of discount factor and deadline on the equilibrium outcome
- What happens to the first mover's share as d increases?
- What happens to the second mover's share as d increases?
- As the deadline increases, what happens to the first mover's share?
- Likewise for the second mover?
51 Effect of d and deadline on the agents' shares
52 Multiple issues
- Set of issues S = {1, 2, …, m}; each issue is a pie of size 1
- The issues are divisible
- Deadline n (for all the issues)
- Discount factor dc for issue c
- Utility: U(x, t) = Σc U(xc, t)
53 Multi-issue procedures
- Package deal procedure: the issues are bundled and discussed together as a package
- Simultaneous procedure: the issues are negotiated in parallel but independently of each other
- Sequential procedure: the issues are negotiated sequentially, one after another
54 Package deal procedure
- Issues negotiated using the alternating offers protocol
- An offer specifies a division for each of the m issues
- The agents are allowed to accept/reject a complete offer
- The agents may have different preferences over the issues
- The agents can make trade-offs across the issues to maximize their utility; this leads to a Pareto optimal outcome
55 Utility for two issues
Ua = 2X + Y
Ub = X + 2Y
56 Making tradeoffs
What is a's utility for Ub = 2?
(figure: offers along the line Ub = 2)
57 Example for two issues
- DEADLINE: n = 2. DISCOUNT FACTORS: d1 = d2 = 1/2.
- UTILITIES: Ua = (1/2)^(t-1) · (x1 + 2x2), Ub = (1/2)^(t-1) · (2y1 + y2)

Time   Offering agent   Package offer
1      a → b            (1/4, 3/4), (1, 0)  OR  (3/4, 1/4), (0, 1)
2      b → a            (0, 1), (0, 1); Ub = 1.5

The outcome is not symmetric
58 P.E. Nash equilibrium strategies
For t = n: the offering agent takes 100 percent of all the issues; the receiving agent accepts.
For t < n (for agent a):
  OFFER (x, y) s.t. Ub(y, t) = EQUB(t+1). If there is more than one such (x, y), perform trade-offs across issues to find the best offer.
  RECEIVE (x, y): if Ua(x, t) ≥ EQUA(t+1) ACCEPT, else REJECT.
EQUA(t+1) is a's equilibrium utility for t+1
EQUB(t+1) is b's equilibrium utility for t+1
59 Making trade-offs: divisible issues
Agent a's trade-off problem at time t (TR): find a package (x, y) to

  Maximize    Σ_{c=1..m} kac · xc
  Subject to  Σ_{c=1..m} kbc · yc = EQUB(t+1)
              0 ≤ xc ≤ 1,  0 ≤ yc ≤ 1

This is the fractional knapsack problem
60 Making trade-offs: divisible issues
- Agent a's perspective (time t):
- Agent a considers the m issues in increasing order of ka/kb and assigns to b the maximum possible share of each of them, until b's cumulative utility equals EQUB(t+1)
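This greedy rule is easy to sketch; the call below reuses the two-issue example from slide 57 (Ua = x1 + 2x2, Ub = 2y1 + y2, with b owed utility 1.5), and the function name is mine:

```python
# Greedy trade-off for divisible issues: concede shares to b in increasing
# order of ka/kb until b's cumulative utility reaches the target EQU_B(t+1).
def tradeoff(ka, kb, target):
    m = len(ka)
    y = [0.0] * m  # b's shares per issue
    order = sorted(range(m), key=lambda c: ka[c] / kb[c])
    remaining = target
    for c in order:
        give = min(1.0, remaining / kb[c])  # largest share of issue c for b
        y[c] = give
        remaining -= give * kb[c]
        if remaining <= 1e-12:
            break
    x = [1.0 - yc for yc in y]  # a keeps the rest of each issue
    return x, y

x, y = tradeoff(ka=[1, 2], kb=[2, 1], target=1.5)
```

The result, y = (3/4, 0) and x = (1/4, 1), matches the equilibrium offer (1/4, 3/4), (1, 0) in the slide-57 example.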
61 Equilibrium strategies
For t = n: the offering agent takes 100 percent of all the issues; the receiving agent accepts.
For t < n (for agent a):
  OFFER (x, y) s.t. Ub(y, t) = EQUB(t+1). If there is more than one such (x, y), perform trade-offs across issues to find the best offer.
  RECEIVE (x, y): if Ua(x, t) ≥ EQUA(t+1) ACCEPT, else REJECT.
62 Equilibrium solution
- An agreement on all the m issues occurs in the first time period
- Time to compute the equilibrium offer for the first time period is O(mn)
- The equilibrium solution is Pareto-optimal (an outcome is Pareto optimal if it is impossible to improve the utility of both agents simultaneously)
- The equilibrium solution is not unique, and it is not symmetric
63 Making trade-offs: indivisible issues
- Agent a's trade-off problem at time t is to find a package (x, y) that solves the analogous optimization problem with integral shares.
- For indivisible issues, this is the integer knapsack problem
64 Key points
- Single issue
  - Time to compute the equilibrium is O(n)
  - The equilibrium is not unique and not symmetric
- Multiple divisible issues (exact solution)
  - Time to compute the equilibrium for t = 1 is O(mn)
  - The equilibrium is Pareto optimal; it is not unique and not symmetric
- Multiple indivisible issues (approximate solution)
  - There is an FPTAS to compute an approximate equilibrium
  - The equilibrium is Pareto optimal; it is not unique and not symmetric
65 Negotiation on data allocation in a multi-server environment
R. Azulay-Schwartz and S. Kraus. Negotiation On Data Allocation in Multi-Agent Environments. Autonomous Agents and Multi-Agent Systems, 5(2):123-172, 2002.
66 Cooperative Web Servers
- The Data and Information System component of the Earth Observing System (EOSDIS) of NASA is a distributed knowledge system which supports archival and distribution of data at multiple and independent servers.
67 Cooperative Web Servers (cont.)
- Each data collection, or file, is called a dataset. The datasets are huge, so each dataset has only one copy.
- The current policy for data allocation in NASA is static: old datasets are not reallocated; each new dataset is located at the server with the nearest topics (defined according to the topics of the datasets stored by this server).
68 Related Work: the File Allocation Problem
- The original problem: how to distribute files among computers in order to optimize system performance.
- Our problem: how can self-motivated servers decide about the distribution of files, when each server has its own objectives?
69 Environment Description
- There are several information servers. Each server is located in a different geographical area.
- Each server receives queries from the clients in its area, and sends documents as responses to queries. These documents can be stored locally, or at another server.
70 Environment Description

(diagram: a client in area i sends a query to server i; server i forwards the query over a distance to server j, which returns the document(s) to server i and then to the client)
71 Basic Definitions
- SERVERS: the set of the servers.
- DATASETS: the set of datasets (files) to be allocated.
- Allocation: a mapping of each dataset to one of the servers. The set of all possible allocations is denoted by Allocs.
- U: the utility function of each server.
72 The Conflict Allocation
- If at least one server opts out of the negotiation, then the conflict allocation conflict_alloc is implemented.
- We consider the conflict allocation to be the static allocation (each dataset is stored at the server with the closest topics).
73 Utility Function
- Userver(alloc, t) specifies the utility of server from alloc ∈ Allocs at time t.
- It consists of:
  - The utility from the assignment of each dataset.
  - The cost of negotiation delay.
- Userver(alloc, 0) = Σ_{x ∈ DATASETS} Vserver(x, alloc(x)).
74 Parameters of utility
- query price: payment for retrieved documents.
- usage(ds, s): the expected number of documents of dataset ds requested by clients in the area of server s.
- storage costs, retrieval costs, answer costs.
75 Cost over time
- Cost of communication and computation time of the negotiation.
- Loss of unused information: new documents cannot be used until the negotiation ends.
- Dataset usage and storage cost are assumed to decrease over time, with the same discount ratio (d < 1).
- Thus, there is a constant discount ratio of the utility from an allocation: Userver(alloc, t) = d^t · Userver(alloc, 0) - tC.
76 Assumptions
- Each server prefers any agreement over continuation of the negotiation indefinitely.
- The utility of each server from the conflict allocation is always greater than or equal to 0.
- OFFERS: the set of allocations that are preferred by all the agents over opting out.
77 Negotiation Analysis - Simultaneous Responses
- Simultaneous responses: a server, when responding, is not informed of the other responses.
- Theorem: for each offer x ∈ OFFERS, there is a subgame-perfect equilibrium of the bargaining game with the outcome x offered and unanimously accepted in period 0.
78 Choosing the Allocation
- The designers of the servers can agree in advance on a joint technique for choosing x:
  - giving each server its conflict utility
  - maximizing a social welfare criterion:
    - the sum of the servers' utilities, or
    - the generalized Nash product of the servers' utilities: Π (Us(x) - Us(conflict))
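Either criterion can be evaluated by enumerating allocations on a small instance. A sketch for the generalized Nash product; the two servers, two datasets, utility numbers, and conflict utilities below are all invented for illustration:

```python
from itertools import product

# Illustrative only: 2 servers (0 and 1), 2 datasets. V[s][ds][host] is the
# utility server s derives from dataset ds being stored at server `host`.
V = {
    0: {"d1": {0: 5, 1: 2}, "d2": {0: 1, 1: 3}},
    1: {"d1": {0: 1, 1: 4}, "d2": {0: 2, 1: 2}},
}
conflict_utility = {0: 4, 1: 3}  # each server's utility from the conflict allocation

def utility(server, alloc):
    return sum(V[server][ds][host] for ds, host in alloc.items())

best, best_np = None, float("-inf")
for hosts in product([0, 1], repeat=2):
    alloc = {"d1": hosts[0], "d2": hosts[1]}
    gains = [utility(s, alloc) - conflict_utility[s] for s in (0, 1)]
    if all(g >= 0 for g in gains):  # consider only allocations preferred over opting out
        nash_product = gains[0] * gains[1]
        if nash_product > best_np:
            best, best_np = alloc, nash_product
```

On this toy instance the Nash-product criterion picks the allocation that gives both servers a strictly positive gain over the conflict allocation, rather than one that maximizes a single server's gain.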
79 Experimental Evaluation
- How do the parameters influence the results of the negotiation?
- vcost(alloc): the variable costs due to an allocation (excludes storage_cost and the gains due to queries).
- vcost_ratio: the ratio between the vcosts when using negotiation and the vcosts of the static allocation.
80 Effect of Parameters on the Results
- As the number of servers grows, vcost_ratio increases (more complex computations) [worse].
- As the number of datasets grows, vcost_ratio decreases (negotiation is more beneficial) [better].
- Changing the mean usage did not influence vcost_ratio significantly, but vcost_ratio decreases as the standard deviation of the usage increases [better].
81 Influence of Parameters (cont.)
- When the standard deviation of the distances between servers increases, vcost_ratio decreases [better].
- When the distance between servers increases, vcost_ratio decreases [better].
- In the domains tested:
  - higher answer_cost → higher vcost_ratio [worse].
  - higher storage_cost → higher vcost_ratio [worse].
  - higher retrieve_cost → lower vcost_ratio [better].
  - higher query_price → lower vcost_ratio [better].
82 Incomplete Information
- Each server knows:
  - The usage frequency of all datasets by clients from its area
  - The usage frequency of the datasets stored in it, by all clients
83 BARGAINING (incomplete information)

(diagram: as before, but the seller's reservation price may be sL or sH and the buyer's may be bH or bL; x = final price)
- Seller's RP: the seller wants s or more
- Buyer's RP: the buyer wants b or less
84 Definition of a Bayesian game
- N is the set of players.
- Ω is the set of the states of nature.
- Ai is the set of actions for player i. A = A1 × A2 × … × An.
- Ti is the type set of player i. For each state of nature, the game will have different types of players (one type per player).
- ui : Ω × A → R is the payoff function for player i.
- pi is the probability distribution over Ω for each player i; that is, each player may have a different view of the probability distribution over the states of nature. In the game, they never know the exact state of nature.
85 Solution concepts for Bayesian games
- A (Bayesian) Nash equilibrium is a strategy profile, together with beliefs specified for each player about the types of the other players, that maximizes the expected utility for each player given their beliefs about the other players' types and given the strategies played by the other players.
86 Incomplete Information (cont.)
- A revelation mechanism:
  - First, all the servers simultaneously report all their private information:
    - for each dataset, the past usage of the dataset by this server.
    - for each server, the past usage of each local dataset by that server.
  - Then, the negotiation proceeds as in the complete information case.
87 Incomplete Information (cont.)
- Lemma: there is a Nash equilibrium where each server tells the truth about its past usage of remote datasets, and about the other servers' usage of its local datasets.
- Lies concerning details about local usage of local datasets are intractable.
88 Summary: negotiation on data allocation
- We have considered the data allocation problem in a distributed environment.
- We have presented the utility function of the servers, which expresses their preferences.
- We have proposed using a negotiation protocol for solving the problem.
- For incomplete information situations, a revelation process was added to the protocol.
89 Agent-Human Negotiation
90 Computers interacting with people
(spectrum: from the computer having the control, through the computer persuading the human, to the human having the control)
92 Culture-sensitive agents
- The development of a standardized agent to be used in the collection of data for studies on culture and negotiation

Buyer/seller agents negotiate well across cultures
93 Semi-autonomous cars
94 Medical applications
Gertner Institute for Epidemiology and Health
Policy Research
95 Automated care-taker

Agent: "I scheduled an appointment for you at the physiotherapist this afternoon."
Human: "I will be too tired in the afternoon!!!"
Agent (tries to reschedule and fails): "The physiotherapist has no other available appointments this week. How about resting before the appointment?"
96 Security applications
- Collect
- Update
- Analyze
- Prioritize
- ...
97 People often follow suboptimal decision strategies
- Irrationalities attributed to:
  - sensitivity to context
  - lack of knowledge of own preferences
  - the effects of complexity
  - the interplay between emotion and cognition
  - the problem of self-control
  - bounded rationality
  - ...
98 Agents that play repeatedly with the same person
99 AutONA [BY03]
- Buyers and sellers
- Using data from previous experiments
- Belief function to model the opponent
- Implemented several tactics and heuristics, including a concession mechanism

A. Byde, M. Yearworth, K.-Y. Chen, and C. Bartolini. AutONA: A system for automated multiple 1-1 negotiation. In CEC, pages 59-67, 2003.
100 Cliff-Edge
- Virtual learning and reinforcement learning
- Using data from previous interactions
- Implemented several tactics and heuristics, qualitative in nature
- Non-deterministic behavior, by means of randomization

R. Katz and S. Kraus. Efficient agents for cliff-edge environments with a large set of decision options. In AAMAS, pages 697-704, 2006.
101 General opponent modeling
- Agents that play with the same person only once
102 Challenges of human opponent modeling
- Small number of examples
  - it is difficult to collect data on people
- Noisy data
  - people are inconsistent (the same person may act differently)
  - people are diverse
103 Guessing Heuristic
- Multi-issue, multi-attribute, with incomplete information
- Domain independent
- Implemented several tactics and heuristics, including a concession mechanism

C. M. Jonker, V. Robu, and J. Treur. An agent architecture for multi-attribute negotiation using incomplete preference information. JAAMAS, 15(2):221-252, 2007.
104 PURB Agent
- Building blocks: Personality model, Utility function, Rules for guiding choice.
- Key idea: models the personality traits of its negotiation partners over time.
- Uses decision theory to decide how to negotiate, with a utility function that depends on the models and other environmental features.
- Pre-defined rules facilitate computation.

Plays as well as people; adapts to culture
105 QOAgent [LIN08]
Played at least as well as people
- Multi-issue, multi-attribute, with incomplete information
- Domain independent
- Implemented several tactics and heuristics, qualitative in nature
- Non-deterministic behavior, also by means of randomization

Is it possible to improve the QOAgent? Yes, if you have data.

R. Lin, S. Kraus, J. Wilkenfeld, and J. Barry. Negotiating with bounded rational agents in environments with incomplete information using an automated agent. Artificial Intelligence, 172(6-7):823-851, 2008.
106 KBAgent
- Multi-issue, multi-attribute, with incomplete information
- Domain independent
- Implemented several tactics and heuristics, qualitative in nature
- Non-deterministic behavior, also by means of randomization
- Uses data from previous interactions

Y. Oshrat, R. Lin, and S. Kraus. Facing the challenge of human-agent negotiations via effective general opponent modeling. In AAMAS, 2009.
107 Example scenario
- Employer and job candidate
- Objective: reach an agreement over hiring terms after a successful interview
108 General opponent modeling
- Challenge: sparse data from past negotiation sessions of people negotiating
- Technique: Kernel Density Estimation
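A minimal Gaussian-KDE sketch of this idea: smooth a handful of observed values from past sessions into a density usable as a likelihood. The sample values and bandwidth are invented:

```python
import math

# Gaussian kernel density estimate built from a small sample of past
# negotiation observations (values and bandwidth are illustrative).
def kde(samples, bandwidth):
    def density(x):
        n = len(samples)
        return sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
            / (bandwidth * math.sqrt(2 * math.pi))
            for s in samples
        ) / n
    return density

accepted_offers = [0.30, 0.35, 0.40, 0.45, 0.70]  # invented fractions of the pie
density = kde(accepted_offers, bandwidth=0.1)
```

With sparse data, the estimate assigns meaningful likelihood between the few observed points: the density near the 0.3-0.45 cluster is much higher than near 0.9.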
109 General opponent modeling
- Estimate the likelihood that the other party will:
  - accept an offer
  - make an offer
  - and estimate its expected average utility
- The estimation is done separately for each possible agent type
- The type of a negotiator is determined using a simple Bayes classifier
- The estimates are used for decision making
110 KBAgent as the job candidate
- Best result: 20,000, Project manager, with leased car, 20% pension funds, fast promotion, 8 hours

KBAgent's offer: 20,000, Team Manager, with leased car, pension 20%, slow promotion, 9 hours
Human's offer: 12,000, Programmer, without leased car, pension 10%, fast promotion, 10 hours
Outcome: 20,000, Project manager, without leased car, pension 20%, slow promotion, 9 hours
111 KBAgent as the job candidate
- Best agreement: 20,000, Project manager, with leased car, 20% pension funds, fast promotion, 8 hours

Round 7
KBAgent's offer: 20,000, Programmer, with leased car, pension 10%, slow promotion, 9 hours
Human's response: ?
112 Experiments
Learned from 20 games of human-human negotiation
- 172 grad and undergrad students in Computer Science
- People were told they may be playing a computer agent or a person.
- Scenarios:
  - Employer-Employee
  - Tobacco Convention: England vs. Zimbabwe
113 Results: Comparing KBAgent to others

Player              Type           Average utility value (std)
KBAgent vs. people  Employer       468.9 (37.0)
QOAgent vs. people  Employer       417.4 (135.9)
People vs. people   Employer       408.9 (106.7)
People vs. QOAgent  Employer       431.8 (80.8)
People vs. KBAgent  Employer       380.4 (48.5)
KBAgent vs. people  Job candidate  482.7 (57.5)
QOAgent vs. people  Job candidate  397.8 (86.0)
People vs. people   Job candidate  310.3 (143.6)
People vs. QOAgent  Job candidate  320.5 (112.7)
People vs. KBAgent  Job candidate  370.5 (58.9)
114 Main results
- In comparison to the QOAgent:
  - The KBAgent achieved higher utility values than the QOAgent
  - More agreements were accepted by people
  - The sum of utility values (social welfare) was higher when the KBAgent was involved
- The KBAgent achieved significantly higher utility values than people
- The results demonstrate the proficient negotiation done by the KBAgent

General opponent modeling improves agent negotiation and bargaining
115 Automated care-taker

Agent: "I arranged for you to go to the physiotherapist in the afternoon."
Human: "I will be too tired in the afternoon!!!"
Agent: How can I convince him? What argument should I give?
116 Security applications
How should I convince him to provide me with
information?
117 Argumentation
Should I tell him that we are running out of antibiotics?
- Which information to reveal?

Build a game that combines information revelation and bargaining
120 Colored Trails (CT)
- An infrastructure for agent design, implementation and evaluation for open environments
- Designed with Barbara Grosz (AAMAS 2004)
- Implemented by the Harvard team and the BIU team
121 An experimental test-bed
- Interesting for people to play
  - analogous to task settings
  - vivid representation of the strategy space (not just a list of outcomes).
- Possible for computers to play
- Can vary in complexity
  - repeated vs. one-shot setting
  - availability of information
  - communication protocol.
122 Social Preference Agent
- Learns the extent to which people are affected by social preferences such as social welfare and competitiveness.
- Designed for one-shot take-it-or-leave-it scenarios.
- Does not reason about the future ramifications of its actions.

Y. Gal and A. Pfeffer. Predicting people's bidding behavior in negotiation. AAMAS 2006, 370-376.
123 Agents for Revelation Games
Noam Peled, Kobi Gal, Sarit Kraus
124 Introduction - Revelation games
- Combine two types of interaction:
  - Signaling games (Spence 1974): players choose whether to convey private information to each other
  - Bargaining games (Osborne and Rubinstein 1999): players engage in multiple negotiation rounds
- Example: a job interview
125 Colored Trails (CT)
126Why not equilibrium agents?
- Results from the social sciences suggest people do not follow equilibrium strategies
- Equilibrium-based agents that played against people failed.
- People rarely design agents to follow equilibrium strategies (Sarne et al., AAMAS 2008).
- Equilibrium strategies are usually not cooperative - all lose.
127Perfect Equilibrium (PE) Agent
- Solved using backward induction.
- No signaling.
- Counter-proposal round (selfish):
- Second proposer: finds the most beneficial proposal while the responder's benefit remains positive.
- Second responder: accepts any proposal that gives it a positive benefit.
128PE agent: Phase one
- First proposal round (generous):
- First proposer: proposes the opponent's counter-proposal.
- First responder: accepts any proposal that gives it the same or higher benefit than its counter-proposal.
- Revelation phase: revelation vs. non-revelation
- In both boards, the PE with goal revelation yields expected utility lower than or equal to that of the non-revelation PE
129Benefits Diversity
- Average proposed benefit to players from first
and second rounds
130Performance of the PE agent
131Revelation Effect
- Only 35% of the games played by humans included revelation
- Revelation had a significant effect on human performance but not on agent performance
- Revelation didn't help the agent
- People were deterred by the strategic machine-generated proposals
132SIGAL agent
- Agent based on general opponent modeling
Genetic algorithm
Logistic Regression
133SIGAL Agent
- Learns from previous games.
- Predicts the acceptance probability of each proposal using logistic regression.
- Models the human as using a weighted utility function of:
- Human's benefit
- Benefits difference
- Revelation decision
- Benefits in previous round
134Logistic Regression using a Genetic Algorithm
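A minimal sketch of the idea named in this slide's title: fitting logistic-regression weights with a genetic algorithm instead of gradient-based optimization. The toy data, fitness function, and GA parameters are all illustrative assumptions, not SIGAL's actual training setup.

```python
import math
import random

random.seed(0)

def predict(w, x):
    # Logistic model: sigmoid of the weighted feature sum.
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

def fitness(w, data):
    # Log-likelihood of the observed accept/reject labels: higher is better.
    eps = 1e-9
    return sum(math.log(predict(w, x) + eps) if y
               else math.log(1 - predict(w, x) + eps)
               for x, y in data)

def evolve(data, dim, pop_size=30, generations=40):
    # Simple GA: keep the fitter half, refill with mutated copies.
    pop = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda w: fitness(w, data), reverse=True)
        survivors = pop[:pop_size // 2]
        children = [[wi + random.gauss(0, 0.1) for wi in random.choice(survivors)]
                    for _ in range(pop_size - len(survivors))]
        pop = survivors + children
    return max(pop, key=lambda w: fitness(w, data))

# Toy data: offers are accepted (y=True) when the responder's benefit is high.
data = [([b / 10.0], b > 5) for b in range(11)]
w = evolve(data, dim=1)
print(w[0] > 0)  # True: the GA learns a positive weight on responder benefit
```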
135Expected benefit maximization
136Maximization round 2
137Strategy Comparison
- Strategies for the asymmetric board, where none of the players has revealed, the human lacks 2 chips for reaching the goal, and the agent lacks 1
In the first round the agent was offered a benefit of 90
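The expected-benefit maximization named on the preceding slides can be sketched as follows: score every candidate proposal by the probability the human accepts it times the agent's own benefit, plus a fallback value if rejected. The candidates and the toy acceptance model (people accept more often when their own share is larger) are illustrative assumptions.

```python
# Sketch of expected-benefit maximization over candidate proposals.
# Candidates and the acceptance model are illustrative, not from the game.

def best_proposal(candidates, p_accept, fallback=0.0):
    def expected_benefit(c):
        p = p_accept(c)
        return p * c["own"] + (1 - p) * fallback
    return max(candidates, key=expected_benefit)

candidates = [{"own": 90, "other": 10},
              {"own": 60, "other": 40},
              {"own": 50, "other": 50}]
p = lambda c: c["other"] / 100.0  # toy acceptance probability
print(best_proposal(candidates, p))  # {'own': 50, 'other': 50}
```

The greedy 90/10 split scores only 90 x 0.1 = 9 in expectation, which is why an agent that models human acceptance behavior proposes more generous splits than a purely selfish one.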
138Heuristics
- Tit for tat
- Never give more than you ask for in the counter-proposal
- Risk aversion
- Isoelastic utility
139Learned Coefficients
- Responder benefit (0.96)
- Benefits difference (-0.79)
- Responder revelation (0.26)
- Proposer revelation (0.03)
- Responder benefit in first round (0.45)
- Proposer benefit in first round (0.33)
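The listed coefficients can be plugged into a logistic acceptance model as follows. The feature encoding, scaling, and zero intercept are illustrative assumptions; only the coefficient values come from the slide.

```python
import math

# Sketch of the learned acceptance model: logistic regression over the
# features above, using the coefficients from the slide.
COEFFS = {
    "responder_benefit": 0.96,
    "benefits_difference": -0.79,
    "responder_revelation": 0.26,
    "proposer_revelation": 0.03,
    "responder_benefit_round1": 0.45,
    "proposer_benefit_round1": 0.33,
}

def acceptance_probability(features, intercept=0.0):
    # Weighted feature sum pushed through the logistic function.
    z = intercept + sum(COEFFS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

base = dict.fromkeys(COEFFS, 0.0)
generous = dict(base, responder_benefit=1.0)
print(acceptance_probability(base))                                    # 0.5
print(acceptance_probability(generous) > acceptance_probability(base)) # True
```

The signs match the intuition behind the model: a larger responder benefit raises the acceptance probability, while a larger gap between the players' benefits lowers it.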
140Methodology
- Cross-validation: 10-fold
- Over-fitting removal: stop learning at the minimum of the generalization error
- Error calculation on a held-out test set, using new human-human games
- Performance prediction criteria
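The evaluation steps above can be sketched in a few lines: build 10-fold splits, and stop training where the estimated generalization error is minimal. The fold-assignment scheme and the error curve below are toy stand-ins, not the actual experimental data.

```python
# Sketch of 10-fold cross-validation splits and early stopping at the
# minimum of the (estimated) generalization error.

def k_fold_indices(n, k=10):
    # Assign example i to fold i mod k.
    return [list(range(fold, n, k)) for fold in range(k)]

def best_stopping_point(val_errors):
    # Stop training where the validation (generalization) error is minimal.
    return min(range(len(val_errors)), key=lambda t: val_errors[t])

folds = k_fold_indices(n=50, k=10)
print(len(folds), len(folds[0]))  # 10 5
print(best_stopping_point([0.40, 0.31, 0.27, 0.25, 0.26, 0.30]))  # 3
```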
141Performance
General opponent modeling improves agent
negotiations
142General opponent modeling in Maximization
problems
143AAT agent
- Agent based on general opponent modeling
Decision Tree / Naïve Bayes
AAT
144Aspiration Adaptation Theory (AAT)
- Economic theory of people's behavior (Selten)
- No utility function exists for decisions (!)
- Relative decisions are used instead
- Retreat and urgency are used for goal variables
Avi Rosenfeld and Sarit Kraus. Modeling Agents through Bounded Rationality Theories. Proc. of IJCAI 2009; JAAMAS, 2010.
145Commodity search
(figure: price at the first store: 1000)
146Commodity search
(figure: prices seen so far: 1000, 900)
147Commodity search
(figure: prices seen so far: 1000, 900, 950)
If the price < 800, buy; otherwise visit 5 stores and buy at the cheapest.
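The threshold rule from this slide can be sketched directly; the price sequences below are illustrative.

```python
# Sketch of the AAT-style commodity-search rule: buy immediately if the
# price is below 800; otherwise visit 5 stores and buy at the cheapest.

def shop(prices, threshold=800, sample_size=5):
    seen = []
    for price in prices:
        if price < threshold:
            return price           # good enough: buy on the spot
        seen.append(price)
        if len(seen) == sample_size:
            return min(seen)       # visited 5 stores: buy at the cheapest
    return min(seen)               # fewer stores than planned: take the best seen

print(shop([1000, 900, 950, 980, 940]))  # 900
print(shop([1000, 790]))                 # 790
```

This is the kind of aspiration-based rule AAT predicts: a relative, threshold-driven decision rather than the maximization of an explicit utility function.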
148Results
149General opponent modeling in cooperative environments
150Coordination with limited communication
- Communication is not always possible:
- High communication costs
- Need to act undetected
- Damaged communication devices
- Language incompatibilities
- Goal: limited interruption of human activities
I. Zuckerman, S. Kraus and J. S. Rosenschein. Using Focal Point Learning to Improve Human-Machine Tacit Coordination, JAAMAS, 2010.
151Focal Points (Examples)
- Divide 100 into two piles; if your piles are identical to your coordination partner's, you get the 100. Otherwise, you get nothing.
101 equilibria
152Focal points (Examples)
9 equilibria
16 equilibria
153Focal Points
- Thomas Schelling (1963)
- Focal points: prominent solutions to tacit coordination games
154Prior work: Focal-Point-Based Coordination for closed environments
- Domain-independent rules that could be used by automated agents to identify focal points
- Properties: Centrality, Firstness, Extremeness, Singularity
- Logic-based model
- Decision-theory-based model
- Algorithms for agent coordination
Kraus and Rosenschein, MAAMAW 1992; Fenster et al., ICMAS 1995; Annals of Mathematics and Artificial Intelligence, 2000
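The four properties listed above can be turned into a simple scoring rule over candidate choices. The weights and the toy "pick one pile" domain below are illustrative assumptions, not the actual rules from the cited papers.

```python
# Illustrative sketch of scoring candidate choices by focal-point
# properties: centrality, firstness, extremeness, singularity.

def focal_point_scores(options):
    n = len(options)
    scores = [0.0] * n
    for i, option in enumerate(options):
        if i == n // 2:
            scores[i] += 1.0              # centrality: the middle position
        if i == 0:
            scores[i] += 1.0              # firstness: the first position
        if i in (0, n - 1):
            scores[i] += 0.5              # extremeness: either end
        if options.count(option) == 1:
            scores[i] += 1.0              # singularity: a unique value
    return scores

options = ["red", "red", "blue", "red", "red"]  # one pile differs in color
scores = focal_point_scores(options)
print(scores.index(max(scores)))  # 2: the unique middle pile stands out
```

Two agents that apply the same scoring rule pick the same option without communicating, which is precisely how focal points enable tacit coordination.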
155FPL agent
- Agent based on general opponent modeling
Decision Tree/ neural network
Focal Point
156FPL agent
- Agent based on general opponent modeling
raw data vector + FP vector
Decision Tree/ neural network
157Focal Point Learning
158Results (cont.)
General opponent modeling improves agent coordination
- very similar domain (VSD) vs. similar domain (SD) of the "pick the pile" game
159Experimenting with people is a costly process
160Evaluation of agents (EDA)
- Peer Designed Agents (PDA): computer agents developed by humans
- Experiment: 300 human subjects, 50 PDAs, 3 EDAs
- Results:
- EDAs outperformed PDAs in the same situations in which they outperformed people
- on average, EDAs exhibited the same measure of generosity
R. Lin, S. Kraus, Y. Oshrat and Y. Gal.
Facilitating the Evaluation of Automated
Negotiators using Peer Designed Agents, in AAAI
2010.
161Conclusions
- Negotiation and argumentation with people is required for many applications
- General opponent modeling is beneficial:
- Machine learning
- Behavioral models
- Challenge: how to integrate machine learning and behavioral models
162References
- S.S. Fatima, M. Wooldridge, and N.R. Jennings, Multi-issue negotiation with deadlines, Journal of AI Research, 21:381-471, 2006.
- R. Keeney and H. Raiffa, Decisions with Multiple Objectives: Preferences and Value Trade-offs, John Wiley, 1976.
- S. Kraus, Strategic Negotiation in Multiagent Environments, The MIT Press, 2001.
- S. Kraus and D. Lehmann. Designing and Building a Negotiating Automated Agent, Computational Intelligence, 11(1):132-171, 1995.
- S. Kraus, K. Sycara and A. Evenchik. Reaching agreements through argumentation: a logical model and implementation. Artificial Intelligence Journal, 104(1-2):1-69, 1998.
- R. Lin and S. Kraus. Can Automated Agents Proficiently Negotiate With Humans? Communications of the ACM, 53(1):78-88, January 2010.
- R. Lin, S. Kraus, Y. Oshrat and Y. Gal. Facilitating the Evaluation of Automated Negotiators using Peer Designed Agents, in AAAI 2010.
163References cont'd.
- R. Lin, S. Kraus, J. Wilkenfeld, and J. Barry. Negotiating with bounded rational agents in environments with incomplete information using an automated agent. Artificial Intelligence, 172(6-7):823-851, 2008.
- A. Lomuscio, M. Wooldridge, and N.R. Jennings, A classification scheme for negotiation in electronic commerce, Int. Jnl. of Group Decision and Negotiation, 12(1):31-56, 2003.
- M.J. Osborne and A. Rubinstein, A Course in Game Theory, The MIT Press, 1994.
- M.J. Osborne and A. Rubinstein, Bargaining and Markets, Academic Press, 1990.
- Y. Oshrat, R. Lin, and S. Kraus. Facing the challenge of human-agent negotiations via effective general opponent modeling. In AAMAS, 2009.
- H. Raiffa, The Art and Science of Negotiation, Harvard University Press, 1982.
- J.S. Rosenschein and G. Zlotkin, Rules of Encounter, The MIT Press, 1994.
- I. Stahl, Bargaining Theory, Economics Research Institute, Stockholm School of Economics, 1972.
- I. Zuckerman, S. Kraus and J. S. Rosenschein. Using Focal Point Learning to Improve Human-Machine Tacit Coordination, JAAMAS, 2010.
164Tournament
- 2nd annual competition of state-of-the-art negotiating agents, to be held at AAMAS 2011
Do you want to participate? At least $2,000 for the winner! Contact us: sarit_at_cs.biu.ac.il