Title: Automated negotiations: Agents interacting with other automated agents and with humans
1 Automated negotiations: Agents interacting with other automated agents and with humans
- Sarit Kraus
- Department of Computer Science
- Bar-Ilan University
- University of Maryland
- sarit_at_cs.biu.ac.il
http://www.cs.biu.ac.il/sarit/
2 Negotiations
- A discussion in which interested parties
exchange information and come to an agreement.
Davis and Smith, 1977
3 Negotiations
- NEGOTIATION is an interpersonal decision-making
process necessary whenever we cannot achieve our
objectives single-handedly.
4 Agent environments
- Teams of agents that need to coordinate joint activities. Problems: distributed information, distributed decision making, local conflicts.
- Open agent environments: agents acting in the same environment. Problems: need motivation to cooperate, conflict resolution, trust, distributed and hidden information.
5 Open Agent Environments
- Consist of:
  - Automated agents developed by or serving different people or organizations.
  - People with a variety of interests and institutional affiliations.
- The computer agents are self-interested; they may cooperate to further their interests.
- The set of agents is not fixed.
6 Open Agent Environments (examples)
- Agents support people
  - Collaborative interfaces
  - CSCW (Computer Supported Cooperative Work) systems
  - Cooperative learning systems
  - Military-support systems
- Agents act as proxies for people
  - Coordinating schedules
  - Patient care-delivery systems
  - Online auctions
- Groups of agents act autonomously alongside people
  - Simulation systems for education and training
  - Computer games and other forms of entertainment
  - Robots in rescue operations
  - Software personal assistants
8 Examples
- Monitoring electricity networks (Jennings)
- Distributed design and engineering (Petrie et al.)
- Distributed meeting scheduling (Sen & Durfee)
- Teams of robotic systems acting in hostile environments (Balch & Arkin, Tambe)
- Collaborative Internet agents (Etzioni & Weld, Weiss)
- Collaborative interfaces (Grosz & Ortiz, Andre)
- Information agents on the Internet (Klusch)
- Cooperative transportation scheduling (Fischer)
- Supporting hospital patient scheduling (Decker & Jin)
- Intelligent agents for command and control (Sycara)
9 Types of agents
- Fully rational agents
- Bounded rational agents
10 Using other disciplines' results
- No need to start from scratch!
- Required modification and adjustment: AI gives insights and complementary methods.
- Is it worth it to use formal methods for multi-agent systems?
11 Negotiating with rational agents
- Quantitative decision making
- Maximizing expected utility
- Nash equilibrium, Bayesian Nash equilibrium
- Automated negotiator
  - Model the scenario as a game
  - The agent computes (if complexity allows) the equilibrium strategy, and acts accordingly.
- (Kraus, Strategic Negotiation in Multiagent Environments, MIT Press, 2001)
12 Game Theory studies situations of strategic
interaction in which each decision maker's plan
of action depends on the plans of the other
decision makers.
- Short introduction to game theory
13 Decision Theory (reminder): How to make decisions
- Decision Theory = Probability Theory (deals with chance) + Utility Theory (deals with outcomes)
- Fundamental idea:
  - The MEU (Maximum Expected Utility) principle
  - Weigh the utility of each outcome by the probability that it occurs
14 Basic Principle
- Given probabilities P(outj | Ai) and utilities U(outj) for the outcomes outj of each action Ai
- Expected utility of an action Ai: EU(Ai) = Σ_{outj ∈ OUT} U(outj) · P(outj | Ai)
- Choose the Ai that maximizes EU: MEU = argmax_{Ai ∈ Ac} Σ_{outj ∈ OUT} U(outj) · P(outj | Ai)
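As a sketch, the MEU rule above is a one-line maximization; the actions, probabilities, and utilities below are illustrative, not from the slides:

```python
# MEU: pick the action whose probability-weighted utility is largest.
def expected_utility(outcomes):
    # outcomes: list of (probability, utility) pairs for one action
    return sum(p * u for p, u in outcomes)

def meu_choice(actions):
    # actions: dict mapping action name -> list of (probability, utility)
    return max(actions, key=lambda a: expected_utility(actions[a]))

# Illustrative example: carry an umbrella or not, with a 30% chance of rain.
actions = {
    "umbrella":    [(0.3, 50), (0.7, 80)],    # dry but encumbered
    "no_umbrella": [(0.3, -40), (0.7, 100)],  # soaked if it rains
}
best = meu_choice(actions)  # compares 0.3*50 + 0.7*80 = 71 vs 0.3*(-40) + 0.7*100 = 58
```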
15 Risk Averse, Risk Neutral, Risk Seeking
(figure: utility-of-money curves: convex for a risk seeker, linear for risk neutral, concave for risk averse)
16 Game Description
- Players
- Who participates in the game?
- Actions / Strategies
- What can each player do?
- In what order do the players act?
- Outcomes / Payoffs
- What is the outcome of the game?
- What are the players' preferences over the
possible outcomes?
17 Game Description (cont.)
- Information
  - What do the players know about the parameters of the environment or about one another?
  - Can they observe the actions of the other players?
- Beliefs
  - What do the players believe about the unknown parameters of the environment or about one another?
  - What can they infer from observing the actions of the other players?
18 Strategies and Equilibrium
- Strategy
  - Complete plan, describing an action for every contingency
- Nash Equilibrium
  - Each player's strategy is a best response to the strategies of the other players
  - Equivalently: no player can improve his payoff by changing his strategy alone
  - A self-enforcing agreement; no need for formal contracting
- Other equilibrium concepts also exist
19 Classification of Games
- Depending on the timing of moves
  - Games with simultaneous moves
  - Games with sequential moves
- Depending on the information available to the players
  - Games with perfect information
  - Games with imperfect (or incomplete) information
- We concentrate on non-cooperative games
  - Groups of players cannot deviate jointly
  - Players cannot make binding agreements
20 Games with Simultaneous Moves and Perfect Information
- All players choose their actions simultaneously, or just independently of one another
- There is no private information: all aspects of the game are known to the players
- Representation by game matrices
- Often called normal form games or strategic form games
21 Matching Pennies
Example of a zero-sum game. Strategic issue of
competition.
22 Prisoner's Dilemma
- Each player can cooperate or defect

                      Column
               cooperate   defect
Row cooperate    -1,-1     -10,0
    defect       0,-10     -8,-8

Main issue: tension between social optimality and individual incentives.
23 Coordination Games
- A supplier and a buyer need to decide whether to adopt a new purchasing system.

                   Buyer
                new     old
Supplier new   20,20    0,0
         old    0,0     5,5
24 Battle of the Sexes
The game involves both coordination and competition.
25 Definition of Nash Equilibrium
- A game has n players.
- Each player i has a strategy set Si
  - This is his set of possible actions
- Each player i has a payoff function
  - πi : S → R
- A strategy ti in Si is a best response if there is no other strategy in Si that produces a higher payoff, given the opponents' strategies
26 Definition of Nash Equilibrium
- A strategy profile is a list (s1, s2, …, sn) of the strategies each player is using
- If each strategy is a best response given the other strategies in the profile, the profile is a Nash equilibrium
- Why is this important?
  - If we assume players are rational, they will play Nash strategies
  - Even less-than-rational play will often converge to Nash in repeated settings
27 An Example of a Nash Equilibrium

            Column
           a      b
Row  a    1,2    0,1
     b    2,1    1,0

(b, a) is a Nash equilibrium: given that Column is playing a, Row's best response is b; given that Row is playing b, Column's best response is a.
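The best-response check can be automated by brute force. A sketch over the example's bimatrix (strategies indexed 0 = a, 1 = b, so the equilibrium (b, a) comes out as the index pair (1, 0)):

```python
# Brute-force search for pure-strategy Nash equilibria of a 2-player game.
# payoffs[r][c] = (row player's payoff, column player's payoff).
payoffs = [
    [(1, 2), (0, 1)],  # Row plays a
    [(2, 1), (1, 0)],  # Row plays b
]

def pure_nash(payoffs):
    equilibria = []
    rows, cols = len(payoffs), len(payoffs[0])
    for r in range(rows):
        for c in range(cols):
            # (r, c) is an equilibrium if neither player gains by deviating alone.
            row_best = all(payoffs[r][c][0] >= payoffs[r2][c][0] for r2 in range(rows))
            col_best = all(payoffs[r][c][1] >= payoffs[r][c2][1] for c2 in range(cols))
            if row_best and col_best:
                equilibria.append((r, c))
    return equilibria
```

Running it on this matrix returns only (1, 0), i.e. (b, a).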
28 Mixed strategies
- Unfortunately, not every game has a pure strategy equilibrium.
  - Rock-paper-scissors
- However, every finite game has a mixed strategy Nash equilibrium
  - Each action is assigned a probability of play
  - Each player is indifferent between actions, given these probabilities
29 Mixed Strategies

                        Wife
                  shopping   football
Husband football    0,0        2,1
        shopping    1,2        0,0
30 Mixed strategy
- Instead, each player selects a probability associated with each action
- Goal: the utility of each action is equal
  - Players are indifferent between their choices at these probabilities
- a = probability the husband chooses football
- b = probability the wife chooses shopping
- Since the husband's payoffs must be equal: b · 1 = (1 - b) · 2, so b = 2/3
- For the wife: a · 1 = (1 - a) · 2, so a = 2/3
- In each case, the expected payoff is 2/3
- 2/9 of the time they go to football together, 2/9 shopping together, and 5/9 they miscoordinate
- If they could synchronize ahead of time they could do better.
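The indifference argument can be verified numerically; a sketch with the Battle of the Sexes payoffs (football together gives (2, 1), shopping together (1, 2), miscoordination (0, 0)):

```python
# Verify the mixed-strategy equilibrium of the Battle of the Sexes.
a = 2 / 3  # probability the husband chooses football
b = 2 / 3  # probability the wife chooses shopping

# Husband's expected utility for each pure action against the wife's mix:
hu_football = b * 0 + (1 - b) * 2  # wife shops w.p. b, watches football w.p. 1-b
hu_shopping = b * 1 + (1 - b) * 0

# Wife's expected utility for each pure action against the husband's mix:
wu_shopping = a * 0 + (1 - a) * 2
wu_football = a * 1 + (1 - a) * 0

# Both players are indifferent, each with expected payoff 2/3;
# the couple miscoordinates with probability a*b + (1-a)*(1-b) = 5/9.
miscoordinate = a * b + (1 - a) * (1 - b)
```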
31 Rock, Paper, Scissors

                        Column
               rock    paper   scissors
Row  rock       0,0    -1,1     1,-1
     paper      1,-1    0,0    -1,1
     scissors  -1,1     1,-1    0,0
32 Setup
- Player 1 plays rock with probability pr, scissors with probability ps, and paper with probability 1 - pr - ps
- Utility2(rock) = 0·pr + 1·ps - 1·(1 - pr - ps) = 2ps + pr - 1
- Utility2(scissors) = -1·pr + 0·ps + 1·(1 - pr - ps) = 1 - 2pr - ps
- Utility2(paper) = 1·pr - 1·ps + 0·(1 - pr - ps) = pr - ps
- Player 2 wants to choose a probability for each action so that the expected payoff for each action is the same.
33 Setup
- Player 2's expected payoff is qr·(2ps + pr - 1) + qs·(1 - 2pr - ps) + (1 - qr - qs)·(pr - ps)
- It turns out (after some algebra) that the optimal mixed strategy is to play each action 1/3 of the time
- Intuition: what if you played rock half the time? Your opponent would then play paper half the time, and you'd lose more often than you won
- So you'd decrease the fraction of times you played rock, until your opponent had no edge in guessing what you'll do
34 Extensive Form Games
Any finite game of perfect information has a pure
strategy Nash equilibrium. It can be found by
backward induction.
Chess is a finite game of perfect information.
Therefore it is a trivial game from a game
theoretic point of view.
35 Extensive Form Games - Intro
- A game can have complex temporal structure
- Information:
  - set of players
  - who moves when and under what circumstances
  - what actions are available when called upon to move
  - what is known when called upon to move
  - what payoffs each player receives
- Foundation is a game tree
36 Example: Cuban Missile Crisis

(game tree: Khrushchev moves first, choosing Arm or Retract; after Arm, Kennedy chooses Nuke or Fold)
- Arm, then Nuke: -100, -100
- Arm, then Fold: 10, -10
- Retract: -1, 1

Pure strategy Nash equilibria: (Arm, Fold) and (Retract, Nuke)
37 Subgame perfect equilibrium: credible threats
- Proper subgame: a subtree (of the game tree) whose root is alone in its information set
- Subgame perfect equilibrium:
  - Strategy profile that is in Nash equilibrium in every proper subgame (including the root), whether or not that subgame is reached along the equilibrium path of play
38 Example: Cuban Missile Crisis (cont.)

(game tree as before: after Arm, Kennedy chooses Nuke (-100, -100) or Fold (10, -10); Retract yields (-1, 1))

Pure strategy Nash equilibria: (Arm, Fold) and (Retract, Nuke)
Pure strategy subgame perfect equilibrium: (Arm, Fold)
Conclusion: Kennedy's Nuke threat was not credible.
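Backward induction on this two-move tree takes only a few lines. A sketch, assuming (as the listed equilibria suggest) that Khrushchev moves first and Kennedy replies after Arm, with payoffs written as (Khrushchev, Kennedy):

```python
# Backward induction for the Cuban Missile Crisis game tree.
kennedy_payoffs = {"Nuke": (-100, -100), "Fold": (10, -10)}
retract_payoff = (-1, 1)

# Step 1: in the subgame after Arm, Kennedy picks his best reply.
kennedy_choice = max(kennedy_payoffs, key=lambda move: kennedy_payoffs[move][1])

# Step 2: Khrushchev compares Arm (anticipating Kennedy's reply) with Retract.
arm_payoff = kennedy_payoffs[kennedy_choice]
khrushchev_choice = "Arm" if arm_payoff[0] > retract_payoff[0] else "Retract"
```

The computation returns Fold for Kennedy and Arm for Khrushchev, recovering the subgame perfect equilibrium (Arm, Fold) and showing why the Nuke threat is not credible.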
39 Types of games
Diplomacy
40 Take-it-or-leave-it deals
- The rules of the game:
  - You will be randomly paired up with someone in the other section; this pairing will remain completely anonymous.
  - One of you will be chosen (by coin flip) to be either the Proposer or the Responder in this experiment.
  - The Proposer gets to make an offer to split 100 in some proportion with the Responder. So the Proposer can offer x to the Responder, proposing to keep 100-x for themselves.
  - The Responder must decide the lowest amount offered by the Proposer that he/she will accept, i.e. "I will accept any offer which is greater than or equal to y."
  - If the Responder accepts the offer made by the Proposer, they split the sum according to the proposal. If the Responder rejects, both parties lose their shares.
41 An example of buyer/seller negotiation
42 BARGAINING

(diagram: ZOPA, the zone of possible agreement, lies between s and b; x = final price; the seller's surplus is x - s, the buyer's surplus is b - x)
- Seller's RP s: the seller wants s or more
- Buyer's RP b: the buyer wants b or less
43 BARGAINING
- If b < s: negative bargaining zone, no possible agreements
- If b > s: positive bargaining zone, agreement possible
- (x - s) = seller's surplus
- (b - x) = buyer's surplus
- The total surplus to divide, (b - s), is independent of x: a constant-sum game!
44 POSITIVE BARGAINING ZONE

(diagram: the seller's bargaining range runs from the seller's reservation point to the seller's target point, the buyer's range from the buyer's target point to the buyer's reservation point; the two ranges overlap in a POSITIVE bargaining zone)
45 NEGATIVE BARGAINING ZONE

(diagram: as above, but the seller's and buyer's bargaining ranges do not overlap, leaving a NEGATIVE bargaining zone)
46 Single issue negotiation
- Agents a and b negotiate over a pie of size 1
- Offer (x, y), with x + y = 1
- Deadline n and discount factor d
- Utilities:
  - Ua((x, y), t) = x · d^(t-1) if t ≤ n, 0 otherwise
  - Ub((x, y), t) = y · d^(t-1) if t ≤ n, 0 otherwise
- The agents negotiate using Rubinstein's alternating offers protocol
47 Alternating offers protocol

Time   Offer           Response
1      a: (x1, y1)     b accepts/rejects
2      b: (x2, y2)     a accepts/rejects
...
n
48 Equilibrium strategies
- How much should an agent offer if there is only one time period?
- Let n = 1 and let a be the first mover
- Agent a's offer:
  - Propose to keep the whole pie: (1, 0); agent b will accept this
49 Equilibrium strategies for n = 2
- d = 1/4, first mover a
- Offer (x, y): x is a's share, y is b's share
- Optimal offers obtained using backward induction:

Time   Offering agent   Offer        Utilities
1      a → b            (3/4, 1/4)   3/4, 1/4  (agreement)
2      b → a            (0, 1)       0, 1/4

The offer (3/4, 1/4) forms a P.E. (subgame perfect) Nash equilibrium
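The same backward induction works for any deadline n and discount factor d; a minimal sketch (the function name is mine):

```python
# First mover's equilibrium share in single-issue alternating offers
# with deadline n and discount factor d, computed by backward induction.
def first_mover_share(n, d):
    share = 1.0  # at t = n the proposer takes the whole pie
    # Walk back from period n-1 to 1: the proposer keeps the pie minus
    # what the responder could secure (discounted) by waiting one period.
    for _ in range(n - 1):
        share = 1.0 - d * share
    return share
```

It reproduces the slides' cases: share 1 for n = 1, and 3/4 for n = 2 with d = 1/4; increasing d lowers the first mover's share, which answers the questions on the next slide.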
50 Effect of discount factor and deadline on the equilibrium outcome
- What happens to the first mover's share as d increases?
- What happens to the second mover's share as d increases?
- As the deadline increases, what happens to the first mover's share?
- Likewise for the second mover?
51 Effect of d and deadline on the agents' shares
52 Multiple issues
- Set of issues S = {1, 2, …, m}; each issue is a pie of size 1
- The issues are divisible
- Deadline n (for all the issues)
- Discount factor dc for issue c
- Utility: U(x, t) = Σc U(xc, t)
53 Multi-issue procedures
- Package deal procedure: the issues are bundled and discussed together as a package
- Simultaneous procedure: the issues are negotiated in parallel but independently of each other
- Sequential procedure: the issues are negotiated sequentially, one after another
54 Package deal procedure
- Issues negotiated using the alternating offers protocol
- An offer specifies a division for each of the m issues
- The agents are allowed to accept/reject a complete offer
- The agents may have different preferences over the issues
- The agents can make trade-offs across the issues to maximize their utility; this leads to a Pareto optimal outcome
55 Utility for two issues
Ua = 2X + Y
Ub = X + 2Y
56 Making tradeoffs
What is a's utility for Ub = 2?
(figure: offers along the line Ub = 2)
57 Example for two issues
- DEADLINE: n = 2. DISCOUNT FACTORS: d1 = d2 = 1/2.
- UTILITIES: Ua = (1/2)^(t-1) · (x1 + 2x2), Ub = (1/2)^(t-1) · (2y1 + y2)

Time   Offering agent   Package offer
1      a → b            (1/4, 3/4), (1, 0)  OR  (3/4, 1/4), (0, 1)
2      b → a            (0, 1), (0, 1); Ub = 1.5

The outcome is not symmetric
58 P.E. Nash equilibrium strategies
For t = n: the offering agent takes 100 percent of all the issues; the receiving agent accepts.
For t < n (for agent a):
  OFFER (x, y) s.t. Ub(y, t) = EQUB(t+1). If there is more than one such (x, y), perform trade-offs across issues to find the best offer.
  RECEIVE (x, y): if Ua(x, t) ≥ EQUA(t+1) ACCEPT, else REJECT.
EQUA(t+1) is a's equilibrium utility for t+1
EQUB(t+1) is b's equilibrium utility for t+1
59 Making trade-offs: divisible issues
Agent a's trade-off problem at time t (TR): find a package (x, y) to

  Maximize    Σ_{c=1..m} kac · xc
  Subject to  Σ_{c=1..m} kbc · yc = EQUB(t+1)
              0 ≤ xc ≤ 1,  0 ≤ yc ≤ 1

This is the fractional knapsack problem
60 Making trade-offs: divisible issues
- Agent a's perspective (time t):
- Agent a considers the m issues in increasing order of ka/kb and assigns to b the maximum possible share of each of them, until b's cumulative utility equals EQUB(t+1)
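This greedy rule is easy to sketch; the call below reuses the two-issue example from slide 57 (Ua = x1 + 2x2, Ub = 2y1 + y2, with b owed utility 1.5), and the function name is mine:

```python
# Greedy trade-off for divisible issues: concede shares to b in increasing
# order of ka/kb until b's cumulative utility reaches the target EQU_B(t+1).
def tradeoff(ka, kb, target):
    m = len(ka)
    y = [0.0] * m  # b's shares per issue
    order = sorted(range(m), key=lambda c: ka[c] / kb[c])
    remaining = target
    for c in order:
        give = min(1.0, remaining / kb[c])  # largest share of issue c for b
        y[c] = give
        remaining -= give * kb[c]
        if remaining <= 1e-12:
            break
    x = [1.0 - yc for yc in y]  # a keeps the rest of each issue
    return x, y

x, y = tradeoff(ka=[1, 2], kb=[2, 1], target=1.5)
```

The result, y = (3/4, 0) and x = (1/4, 1), matches the equilibrium offer (1/4, 3/4), (1, 0) in the slide-57 example.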
61 Equilibrium strategies
For t = n: the offering agent takes 100 percent of all the issues; the receiving agent accepts.
For t < n (for agent a):
  OFFER (x, y) s.t. Ub(y, t) = EQUB(t+1). If there is more than one such (x, y), perform trade-offs across issues to find the best offer.
  RECEIVE (x, y): if Ua(x, t) ≥ EQUA(t+1) ACCEPT, else REJECT.
62 Equilibrium solution
- An agreement on all the m issues occurs in the first time period
- Time to compute the equilibrium offer for the first time period is O(mn)
- The equilibrium solution is Pareto-optimal (an outcome is Pareto optimal if it is impossible to improve the utility of both agents simultaneously)
- The equilibrium solution is not unique, and it is not symmetric
63 Making trade-offs: indivisible issues
- Agent a's trade-off problem at time t is to find a package (x, y) that solves the analogous optimization problem with integral shares.
- For indivisible issues, this is the integer knapsack problem
64 Key points
- Single issue
  - Time to compute the equilibrium is O(n)
  - The equilibrium is not unique and not symmetric
- Multiple divisible issues (exact solution)
  - Time to compute the equilibrium for t = 1 is O(mn)
  - The equilibrium is Pareto optimal; it is not unique and not symmetric
- Multiple indivisible issues (approximate solution)
  - There is an FPTAS to compute an approximate equilibrium
  - The equilibrium is Pareto optimal; it is not unique and not symmetric
65 Negotiation on data allocation in a multi-server environment
R. Azulay-Schwartz and S. Kraus. Negotiation On Data Allocation in Multi-Agent Environments. Autonomous Agents and Multi-Agent Systems, 5(2):123-172, 2002.
66 Cooperative Web Servers
- The Data and Information System component of the Earth Observing System (EOSDIS) of NASA is a distributed knowledge system which supports archival and distribution of data at multiple and independent servers.
67 Cooperative Web Servers (cont.)
- Each data collection, or file, is called a dataset. The datasets are huge, so each dataset has only one copy.
- The current policy for data allocation in NASA is static: old datasets are not reallocated; each new dataset is located at the server with the nearest topics (defined according to the topics of the datasets stored by this server).
68 Related Work: the File Allocation Problem
- The original problem: how to distribute files among computers in order to optimize system performance.
- Our problem: how can self-motivated servers decide about the distribution of files, when each server has its own objectives?
69 Environment Description
- There are several information servers. Each server is located in a different geographical area.
- Each server receives queries from the clients in its area, and sends documents as responses to queries. These documents can be stored locally, or at another server.
70 Environment Description

(diagram: a client in area i sends a query to server i; server i forwards the query over a distance to server j, which returns the document(s) to server i and then to the client)
71 Basic Definitions
- SERVERS: the set of the servers.
- DATASETS: the set of datasets (files) to be allocated.
- Allocation: a mapping of each dataset to one of the servers. The set of all possible allocations is denoted by Allocs.
- U: the utility function of each server.
72 The Conflict Allocation
- If at least one server opts out of the negotiation, then the conflict allocation conflict_alloc is implemented.
- We consider the conflict allocation to be the static allocation (each dataset is stored at the server with the closest topics).
73 Utility Function
- Userver(alloc, t) specifies the utility of server from alloc ∈ Allocs at time t.
- It consists of:
  - The utility from the assignment of each dataset.
  - The cost of negotiation delay.
- Userver(alloc, 0) = Σ_{x ∈ DATASETS} Vserver(x, alloc(x)).
74 Parameters of utility
- query price: payment for retrieved documents.
- usage(ds, s): the expected number of documents of dataset ds requested by clients in the area of server s.
- storage costs, retrieval costs, answer costs.
75 Cost over time
- Cost of communication and computation time of the negotiation.
- Loss of unused information: new documents cannot be used until the negotiation ends.
- Dataset usage and storage cost are assumed to decrease over time, with the same discount ratio (d < 1).
- Thus, there is a constant discount ratio of the utility from an allocation: Userver(alloc, t) = d^t · Userver(alloc, 0) - tC.
76 Assumptions
- Each server prefers any agreement over continuation of the negotiation indefinitely.
- The utility of each server from the conflict allocation is always greater than or equal to 0.
- OFFERS: the set of allocations that are preferred by all the agents over opting out.
77 Negotiation Analysis - Simultaneous Responses
- Simultaneous responses: a server, when responding, is not informed of the other responses.
- Theorem: for each offer x ∈ OFFERS, there is a subgame-perfect equilibrium of the bargaining game with the outcome x offered and unanimously accepted in period 0.
78 Choosing the Allocation
- The designers of the servers can agree in advance on a joint technique for choosing x:
  - giving each server its conflict utility
  - maximizing a social welfare criterion:
    - the sum of the servers' utilities, or
    - the generalized Nash product of the servers' utilities: Π (Us(x) - Us(conflict))
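Either criterion can be evaluated by enumerating allocations on a small instance. A sketch for the generalized Nash product; the two servers, two datasets, utility numbers, and conflict utilities below are all invented for illustration:

```python
from itertools import product

# Illustrative only: 2 servers (0 and 1), 2 datasets. V[s][ds][host] is the
# utility server s derives from dataset ds being stored at server `host`.
V = {
    0: {"d1": {0: 5, 1: 2}, "d2": {0: 1, 1: 3}},
    1: {"d1": {0: 1, 1: 4}, "d2": {0: 2, 1: 2}},
}
conflict_utility = {0: 4, 1: 3}  # each server's utility from the conflict allocation

def utility(server, alloc):
    return sum(V[server][ds][host] for ds, host in alloc.items())

best, best_np = None, float("-inf")
for hosts in product([0, 1], repeat=2):
    alloc = {"d1": hosts[0], "d2": hosts[1]}
    gains = [utility(s, alloc) - conflict_utility[s] for s in (0, 1)]
    if all(g >= 0 for g in gains):  # consider only allocations preferred over opting out
        nash_product = gains[0] * gains[1]
        if nash_product > best_np:
            best, best_np = alloc, nash_product
```

On this toy instance the Nash-product criterion picks the allocation that gives both servers a strictly positive gain over the conflict allocation, rather than one that maximizes a single server's gain.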
79 Experimental Evaluation
- How do the parameters influence the results of the negotiation?
- vcost(alloc): the variable costs due to an allocation (excludes storage_cost and the gains due to queries).
- vcost_ratio: the ratio between the vcosts when using negotiation and the vcosts of the static allocation.
80 Effect of Parameters on the Results
- As the number of servers grows, vcost_ratio increases (more complex computations) [worse].
- As the number of datasets grows, vcost_ratio decreases (negotiation is more beneficial) [better].
- Changing the mean usage did not influence vcost_ratio significantly, but vcost_ratio decreases as the standard deviation of the usage increases [better].
81 Influence of Parameters (cont.)
- When the standard deviation of the distances between servers increases, vcost_ratio decreases [better].
- When the distance between servers increases, vcost_ratio decreases [better].
- In the domains tested:
  - higher answer_cost → higher vcost_ratio [worse].
  - higher storage_cost → higher vcost_ratio [worse].
  - higher retrieve_cost → lower vcost_ratio [better].
  - higher query_price → lower vcost_ratio [better].
82 Incomplete Information
- Each server knows:
  - The usage frequency of all datasets by clients from its area
  - The usage frequency of the datasets stored in it, by all clients
83 BARGAINING (incomplete information)

(diagram: as before, but the seller's reservation price may be sL or sH and the buyer's may be bH or bL; x = final price)
- Seller's RP: the seller wants s or more
- Buyer's RP: the buyer wants b or less
84 Definition of a Bayesian game
- N is the set of players.
- Ω is the set of the states of nature.
- Ai is the set of actions for player i. A = A1 × A2 × … × An.
- Ti is the type set of player i. For each state of nature, the game will have different types of players (one type per player).
- ui : Ω × A → R is the payoff function for player i.
- pi is the probability distribution over Ω for each player i; that is, each player may have a different view of the probability distribution over the states of nature. In the game, they never know the exact state of nature.
85 Solution concepts for Bayesian games
- A (Bayesian) Nash equilibrium is a strategy profile, together with beliefs specified for each player about the types of the other players, that maximizes the expected utility for each player given their beliefs about the other players' types and given the strategies played by the other players.
86 Incomplete Information (cont.)
- A revelation mechanism:
  - First, all the servers simultaneously report all their private information:
    - for each dataset, the past usage of the dataset by this server.
    - for each server, the past usage of each local dataset by that server.
  - Then, the negotiation proceeds as in the complete information case.
87 Incomplete Information (cont.)
- Lemma: there is a Nash equilibrium where each server tells the truth about its past usage of remote datasets, and about the other servers' usage of its local datasets.
- Lies concerning details about local usage of local datasets are intractable.
88 Summary: negotiation on data allocation
- We have considered the data allocation problem in a distributed environment.
- We have presented the utility function of the servers, which expresses their preferences.
- We have proposed using a negotiation protocol for solving the problem.
- For incomplete information situations, a revelation process was added to the protocol.
89 Agent-Human Negotiation
90 Computers interacting with people
(spectrum: from the computer having the control, through the computer persuading the human, to the human having the control)
92 Culture-sensitive agents
- The development of a standardized agent to be used in the collection of data for studies on culture and negotiation

Buyer/seller agents negotiate well across cultures
93 Semi-autonomous cars
94 Medical applications
Gertner Institute for Epidemiology and Health
Policy Research
95 Automated care-taker

Agent: "I scheduled an appointment for you at the physiotherapist this afternoon."
Human: "I will be too tired in the afternoon!!!"
Agent (tries to reschedule and fails): "The physiotherapist has no other available appointments this week. How about resting before the appointment?"
96 Security applications
- Collect
- Update
- Analyze
- Prioritize
- ...
97 People often follow suboptimal decision strategies
- Irrationalities attributed to:
  - sensitivity to context
  - lack of knowledge of own preferences
  - the effects of complexity
  - the interplay between emotion and cognition
  - the problem of self-control
  - bounded rationality
  - ...
98 Agents that play repeatedly with the same person
99 AutONA [BY03]
- Buyers and sellers
- Using data from previous experiments
- Belief function to model the opponent
- Implemented several tactics and heuristics, including a concession mechanism

A. Byde, M. Yearworth, K.-Y. Chen, and C. Bartolini. AutONA: A system for automated multiple 1-1 negotiation. In CEC, pages 59-67, 2003.
100 Cliff-Edge
- Virtual learning and reinforcement learning
- Using data from previous interactions
- Implemented several tactics and heuristics, qualitative in nature
- Non-deterministic behavior, by means of randomization

R. Katz and S. Kraus. Efficient agents for cliff-edge environments with a large set of decision options. In AAMAS, pages 697-704, 2006.
101 General opponent modeling
- Agents that play with the same person only once
102 Challenges of human opponent modeling
- Small number of examples
  - it is difficult to collect data on people
- Noisy data
  - people are inconsistent (the same person may act differently)
  - people are diverse
103 Guessing Heuristic
- Multi-issue, multi-attribute, with incomplete information
- Domain independent
- Implemented several tactics and heuristics, including a concession mechanism

C. M. Jonker, V. Robu, and J. Treur. An agent architecture for multi-attribute negotiation using incomplete preference information. JAAMAS, 15(2):221-252, 2007.
104 PURB Agent
- Building blocks: Personality model, Utility function, Rules for guiding choice.
- Key idea: models the personality traits of its negotiation partners over time.
- Uses decision theory to decide how to negotiate, with a utility function that depends on the models and other environmental features.
- Pre-defined rules facilitate computation.

Plays as well as people; adapts to culture
105 QOAgent [LIN08]
Played at least as well as people
- Multi-issue, multi-attribute, with incomplete information
- Domain independent
- Implemented several tactics and heuristics, qualitative in nature
- Non-deterministic behavior, also by means of randomization

Is it possible to improve the QOAgent? Yes, if you have data.

R. Lin, S. Kraus, J. Wilkenfeld, and J. Barry. Negotiating with bounded rational agents in environments with incomplete information using an automated agent. Artificial Intelligence, 172(6-7):823-851, 2008.
106 KBAgent
- Multi-issue, multi-attribute, with incomplete information
- Domain independent
- Implemented several tactics and heuristics, qualitative in nature
- Non-deterministic behavior, also by means of randomization
- Uses data from previous interactions

Y. Oshrat, R. Lin, and S. Kraus. Facing the challenge of human-agent negotiations via effective general opponent modeling. In AAMAS, 2009.
107 Example scenario
- Employer and job candidate
- Objective: reach an agreement over hiring terms after a successful interview
108 General opponent modeling
- Challenge: sparse data from past negotiation sessions of people negotiating
- Technique: Kernel Density Estimation
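A minimal Gaussian-KDE sketch of this idea: smooth a handful of observed values from past sessions into a density usable as a likelihood. The sample values and bandwidth are invented:

```python
import math

# Gaussian kernel density estimate built from a small sample of past
# negotiation observations (values and bandwidth are illustrative).
def kde(samples, bandwidth):
    def density(x):
        n = len(samples)
        return sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
            / (bandwidth * math.sqrt(2 * math.pi))
            for s in samples
        ) / n
    return density

accepted_offers = [0.30, 0.35, 0.40, 0.45, 0.70]  # invented fractions of the pie
density = kde(accepted_offers, bandwidth=0.1)
```

With sparse data, the estimate assigns meaningful likelihood between the few observed points: the density near the 0.3-0.45 cluster is much higher than near 0.9.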
109 General opponent modeling
- Estimate the likelihood that the other party will:
  - accept an offer
  - make an offer
  - and estimate its expected average utility
- The estimation is done separately for each possible agent type
- The type of a negotiator is determined using a simple Bayes classifier
- The estimates are used for decision making
110 KBAgent as the job candidate
- Best result: 20,000, Project manager, with leased car, 20% pension funds, fast promotion, 8 hours

KBAgent's offer: 20,000, Team Manager, with leased car, pension 20%, slow promotion, 9 hours
Human's offer: 12,000, Programmer, without leased car, pension 10%, fast promotion, 10 hours
Outcome: 20,000, Project manager, without leased car, pension 20%, slow promotion, 9 hours
111 KBAgent as the job candidate
- Best agreement: 20,000, Project manager, with leased car, 20% pension funds, fast promotion, 8 hours

Round 7
KBAgent's offer: 20,000, Programmer, with leased car, pension 10%, slow promotion, 9 hours
Human's response: ?
112 Experiments
Learned from 20 games of human-human negotiation
- 172 grad and undergrad students in Computer Science
- People were told they may be playing a computer agent or a person.
- Scenarios:
  - Employer-Employee
  - Tobacco Convention: England vs. Zimbabwe
113 Results: Comparing KBAgent to others

Player              Type           Average utility value (std)
KBAgent vs. people  Employer       468.9 (37.0)
QOAgent vs. people  Employer       417.4 (135.9)
People vs. people   Employer       408.9 (106.7)
People vs. QOAgent  Employer       431.8 (80.8)
People vs. KBAgent  Employer       380.4 (48.5)
KBAgent vs. people  Job candidate  482.7 (57.5)
QOAgent vs. people  Job candidate  397.8 (86.0)
People vs. people   Job candidate  310.3 (143.6)
People vs. QOAgent  Job candidate  320.5 (112.7)
People vs. KBAgent  Job candidate  370.5 (58.9)
114 Main results
- In comparison to the QOAgent:
  - The KBAgent achieved higher utility values than the QOAgent
  - More agreements were accepted by people
  - The sum of utility values (social welfare) was higher when the KBAgent was involved
- The KBAgent achieved significantly higher utility values than people
- The results demonstrate the proficient negotiation done by the KBAgent

General opponent modeling improves agent negotiation and bargaining
115 Automated care-taker

Agent: "I arranged for you to go to the physiotherapist in the afternoon."
Human: "I will be too tired in the afternoon!!!"
Agent: How can I convince him? What argument should I give?
116 Security applications
How should I convince him to provide me with
information?
117 Argumentation
Should I tell him that we are running out of antibiotics?
- Which information to reveal?

Build a game that combines information revelation and bargaining
120 Colored Trails (CT)
- An infrastructure for agent design, implementation and evaluation for open environments
- Designed with Barbara Grosz (AAMAS 2004)
- Implemented by the Harvard team and the BIU team
121 An experimental test-bed
- Interesting for people to play
  - analogous to task settings
  - vivid representation of the strategy space (not just a list of outcomes).
- Possible for computers to play
- Can vary in complexity
  - repeated vs. one-shot setting
  - availability of information
  - communication protocol.
122 Social Preference Agent
- Learns the extent to which people are affected by social preferences such as social welfare and competitiveness.
- Designed for one-shot take-it-or-leave-it scenarios.
- Does not reason about the future ramifications of its actions.

Y. Gal and A. Pfeffer. Predicting people's bidding behavior in negotiation. AAMAS 2006, 370-376.
123 Agents for Revelation Games
Noam Peled, Kobi Gal, Sarit Kraus
124 Introduction - Revelation games
- Combine two types of interaction:
  - Signaling games (Spence 1974): players choose whether to convey private information to each other
  - Bargaining games (Osborne and Rubinstein 1999): players engage in multiple negotiation rounds
- Example: a job interview
125 Colored Trails (CT)
126Why not equilibrium agents?
- Results from the social sciences suggest people do not follow equilibrium strategies
- Equilibrium-based agents that played against people failed.
- People rarely design agents to follow equilibrium strategies (Sarne et al., AAMAS 2008).
- Equilibrium strategies are usually not cooperative - all lose.
127Perfect Equilibrium (PE) Agent
- Solved using backward induction.
- No signaling.
- Counter-proposal round (selfish):
- Second proposer: finds the most beneficial proposal while the responder's benefit remains positive.
- Second responder: accepts any proposal that gives it a positive benefit.
128PE agent: Phase one
- First proposal round (generous):
- First proposer: proposes the opponent's counter-proposal.
- First responder: accepts any proposal that gives it the same or higher benefit than its counter-proposal.
- Revelation phase: revelation vs. non-revelation
- In both boards, the PE with goal revelation yields expected utility lower than or equal to that of the non-revelation PE
129Benefits Diversity
- Average proposed benefit to players from first
and second rounds
130Performance of the PE agent
131Revelation Effect
- Only 35% of the games played by humans included revelation
- Revelation had a significant effect on human performance but not on agent performance
- Revelation didn't help the agent
- People were deterred by the strategic machine-generated proposals
132SIGAL agent
- Agent based on general opponent modeling
Genetic algorithm
Logistic Regression
133SIGAL Agent
- Learns from previous games.
- Predicts the acceptance probability of each proposal using logistic regression.
- Models the human as using a weighted utility function of:
- Human's benefit
- Benefits difference
- Revelation decision
- Benefits in previous round
134Logistic Regression using a Genetic Algorithm
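A minimal sketch of the idea named in this slide's title: fitting logistic-regression weights with a genetic algorithm instead of gradient-based optimization. The toy data, fitness function, and GA parameters are all illustrative assumptions, not SIGAL's actual training setup.

```python
import math
import random

random.seed(0)

def predict(w, x):
    # Logistic model: sigmoid of the weighted feature sum.
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

def fitness(w, data):
    # Log-likelihood of the observed accept/reject labels: higher is better.
    eps = 1e-9
    return sum(math.log(predict(w, x) + eps) if y
               else math.log(1 - predict(w, x) + eps)
               for x, y in data)

def evolve(data, dim, pop_size=30, generations=40):
    # Simple GA: keep the fitter half, refill with mutated copies.
    pop = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda w: fitness(w, data), reverse=True)
        survivors = pop[:pop_size // 2]
        children = [[wi + random.gauss(0, 0.1) for wi in random.choice(survivors)]
                    for _ in range(pop_size - len(survivors))]
        pop = survivors + children
    return max(pop, key=lambda w: fitness(w, data))

# Toy data: offers are accepted (y=True) when the responder's benefit is high.
data = [([b / 10.0], b > 5) for b in range(11)]
w = evolve(data, dim=1)
print(w[0] > 0)  # True: the GA learns a positive weight on responder benefit
```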
135Expected benefit maximization
136Maximization round 2
137Strategy Comparison
- Strategies for the asymmetric board, where none of the players has revealed, the human lacks 2 chips for reaching the goal, and the agent lacks 1
In the first round the agent was offered a benefit of 90
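The expected-benefit maximization named on the preceding slides can be sketched as follows: score every candidate proposal by the probability the human accepts it times the agent's own benefit, plus a fallback value if rejected. The candidates and the toy acceptance model (people accept more often when their own share is larger) are illustrative assumptions.

```python
# Sketch of expected-benefit maximization over candidate proposals.
# Candidates and the acceptance model are illustrative, not from the game.

def best_proposal(candidates, p_accept, fallback=0.0):
    def expected_benefit(c):
        p = p_accept(c)
        return p * c["own"] + (1 - p) * fallback
    return max(candidates, key=expected_benefit)

candidates = [{"own": 90, "other": 10},
              {"own": 60, "other": 40},
              {"own": 50, "other": 50}]
p = lambda c: c["other"] / 100.0  # toy acceptance probability
print(best_proposal(candidates, p))  # {'own': 50, 'other': 50}
```

The greedy 90/10 split scores only 90 x 0.1 = 9 in expectation, which is why an agent that models human acceptance behavior proposes more generous splits than a purely selfish one.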
138Heuristics
- Tit for tat
- Never give more than you ask for in the counter-proposal
- Risk aversion
- Isoelastic utility
139Learned Coefficients
- Responder benefit (0.96)
- Benefits difference (-0.79)
- Responder revelation (0.26)
- Proposer revelation (0.03)
- Responder benefit in first round (0.45)
- Proposer benefit in first round (0.33)
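The listed coefficients can be plugged into a logistic acceptance model as follows. The feature encoding, scaling, and zero intercept are illustrative assumptions; only the coefficient values come from the slide.

```python
import math

# Sketch of the learned acceptance model: logistic regression over the
# features above, using the coefficients from the slide.
COEFFS = {
    "responder_benefit": 0.96,
    "benefits_difference": -0.79,
    "responder_revelation": 0.26,
    "proposer_revelation": 0.03,
    "responder_benefit_round1": 0.45,
    "proposer_benefit_round1": 0.33,
}

def acceptance_probability(features, intercept=0.0):
    # Weighted feature sum pushed through the logistic function.
    z = intercept + sum(COEFFS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

base = dict.fromkeys(COEFFS, 0.0)
generous = dict(base, responder_benefit=1.0)
print(acceptance_probability(base))                                    # 0.5
print(acceptance_probability(generous) > acceptance_probability(base)) # True
```

The signs match the intuition behind the model: a larger responder benefit raises the acceptance probability, while a larger gap between the players' benefits lowers it.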
140Methodology
- Cross-validation: 10-fold
- Over-fitting removal: stop learning at the minimum of the generalization error
- Error calculation on a held-out test set, using new human-human games
- Performance prediction criteria
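The evaluation steps above can be sketched in a few lines: build 10-fold splits, and stop training where the estimated generalization error is minimal. The fold-assignment scheme and the error curve below are toy stand-ins, not the actual experimental data.

```python
# Sketch of 10-fold cross-validation splits and early stopping at the
# minimum of the (estimated) generalization error.

def k_fold_indices(n, k=10):
    # Assign example i to fold i mod k.
    return [list(range(fold, n, k)) for fold in range(k)]

def best_stopping_point(val_errors):
    # Stop training where the validation (generalization) error is minimal.
    return min(range(len(val_errors)), key=lambda t: val_errors[t])

folds = k_fold_indices(n=50, k=10)
print(len(folds), len(folds[0]))  # 10 5
print(best_stopping_point([0.40, 0.31, 0.27, 0.25, 0.26, 0.30]))  # 3
```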
141Performance
General opponent modeling improves agent
negotiations
142General opponent modeling in Maximization
problems
143AAT agent
- Agent based on general opponent modeling
Decision Tree / Naïve Bayes
AAT
144Aspiration Adaptation Theory (AAT)
- Economic theory of people's behavior (Selten)
- No utility function exists for decisions (!)
- Relative decisions are used instead
- Retreat and urgency are used for goal variables
Avi Rosenfeld and Sarit Kraus. Modeling Agents through Bounded Rationality Theories. Proc. of IJCAI 2009; JAAMAS, 2010.
145Commodity search
(figure: price at the first store: 1000)
146Commodity search
(figure: prices seen so far: 1000, 900)
147Commodity search
(figure: prices seen so far: 1000, 900, 950)
If the price < 800, buy; otherwise visit 5 stores and buy at the cheapest.
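The threshold rule from this slide can be sketched directly; the price sequences below are illustrative.

```python
# Sketch of the AAT-style commodity-search rule: buy immediately if the
# price is below 800; otherwise visit 5 stores and buy at the cheapest.

def shop(prices, threshold=800, sample_size=5):
    seen = []
    for price in prices:
        if price < threshold:
            return price           # good enough: buy on the spot
        seen.append(price)
        if len(seen) == sample_size:
            return min(seen)       # visited 5 stores: buy at the cheapest
    return min(seen)               # fewer stores than planned: take the best seen

print(shop([1000, 900, 950, 980, 940]))  # 900
print(shop([1000, 790]))                 # 790
```

This is the kind of aspiration-based rule AAT predicts: a relative, threshold-driven decision rather than the maximization of an explicit utility function.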
148Results
149General opponent modeling in cooperative environments
150Coordination with limited communication
- Communication is not always possible:
- High communication costs
- Need to act undetected
- Damaged communication devices
- Language incompatibilities
- Goal: limited interruption of human activities
I. Zuckerman, S. Kraus and J. S. Rosenschein. Using Focal Point Learning to Improve Human-Machine Tacit Coordination, JAAMAS, 2010.
151Focal Points (Examples)
- Divide 100 into two piles; if your piles are identical to your coordination partner's, you get the 100. Otherwise, you get nothing.
101 equilibria
152Focal points (Examples)
9 equilibria
16 equilibria
153Focal Points
- Thomas Schelling (1963)
- Focal points: prominent solutions to tacit coordination games
154Prior work: Focal-Point-Based Coordination for closed environments
- Domain-independent rules that could be used by automated agents to identify focal points
- Properties: Centrality, Firstness, Extremeness, Singularity
- Logic-based model
- Decision-theory-based model
- Algorithms for agent coordination
Kraus and Rosenschein, MAAMAW 1992; Fenster et al., ICMAS 1995; Annals of Mathematics and Artificial Intelligence, 2000
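The four properties listed above can be turned into a simple scoring rule over candidate choices. The weights and the toy "pick one pile" domain below are illustrative assumptions, not the actual rules from the cited papers.

```python
# Illustrative sketch of scoring candidate choices by focal-point
# properties: centrality, firstness, extremeness, singularity.

def focal_point_scores(options):
    n = len(options)
    scores = [0.0] * n
    for i, option in enumerate(options):
        if i == n // 2:
            scores[i] += 1.0              # centrality: the middle position
        if i == 0:
            scores[i] += 1.0              # firstness: the first position
        if i in (0, n - 1):
            scores[i] += 0.5              # extremeness: either end
        if options.count(option) == 1:
            scores[i] += 1.0              # singularity: a unique value
    return scores

options = ["red", "red", "blue", "red", "red"]  # one pile differs in color
scores = focal_point_scores(options)
print(scores.index(max(scores)))  # 2: the unique middle pile stands out
```

Two agents that apply the same scoring rule pick the same option without communicating, which is precisely how focal points enable tacit coordination.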
155FPL agent
- Agent based on general opponent modeling
Decision Tree/ neural network
Focal Point
156FPL agent
- Agent based on general opponent modeling
raw data vector + FP vector
Decision Tree/ neural network
157Focal Point Learning
158Results (cont.)
General opponent modeling improves agent coordination
- very similar domain (VSD) vs. similar domain (SD) of the "pick the pile" game
159Experimenting with people is a costly process
160Evaluation of agents (EDA)
- Peer Designed Agents (PDA): computer agents developed by humans
- Experiment: 300 human subjects, 50 PDAs, 3 EDAs
- Results:
- EDAs outperformed PDAs in the same situations in which they outperformed people
- on average, EDAs exhibited the same measure of generosity
R. Lin, S. Kraus, Y. Oshrat and Y. Gal.
Facilitating the Evaluation of Automated
Negotiators using Peer Designed Agents, in AAAI
2010.
161Conclusions
- Negotiation and argumentation with people is required for many applications
- General opponent modeling is beneficial:
- Machine learning
- Behavioral models
- Challenge: how to integrate machine learning and behavioral models
162References
- S.S. Fatima, M. Wooldridge, and N.R. Jennings, Multi-issue negotiation with deadlines, Journal of AI Research, 21:381-471, 2006.
- R. Keeney and H. Raiffa, Decisions with Multiple Objectives: Preferences and Value Trade-offs, John Wiley, 1976.
- S. Kraus, Strategic Negotiation in Multiagent Environments, The MIT Press, 2001.
- S. Kraus and D. Lehmann. Designing and Building a Negotiating Automated Agent, Computational Intelligence, 11(1):132-171, 1995.
- S. Kraus, K. Sycara and A. Evenchik. Reaching agreements through argumentation: a logical model and implementation. Artificial Intelligence Journal, 104(1-2):1-69, 1998.
- R. Lin and S. Kraus. Can Automated Agents Proficiently Negotiate With Humans? Communications of the ACM, 53(1):78-88, January 2010.
- R. Lin, S. Kraus, Y. Oshrat and Y. Gal. Facilitating the Evaluation of Automated Negotiators using Peer Designed Agents, in AAAI 2010.
163References cont'd.
- R. Lin, S. Kraus, J. Wilkenfeld, and J. Barry. Negotiating with bounded rational agents in environments with incomplete information using an automated agent. Artificial Intelligence, 172(6-7):823-851, 2008.
- A. Lomuscio, M. Wooldridge, and N.R. Jennings, A classification scheme for negotiation in electronic commerce, Int. Jnl. of Group Decision and Negotiation, 12(1):31-56, 2003.
- M.J. Osborne and A. Rubinstein, A Course in Game Theory, The MIT Press, 1994.
- M.J. Osborne and A. Rubinstein, Bargaining and Markets, Academic Press, 1990.
- Y. Oshrat, R. Lin, and S. Kraus. Facing the challenge of human-agent negotiations via effective general opponent modeling. In AAMAS, 2009.
- H. Raiffa, The Art and Science of Negotiation, Harvard University Press, 1982.
- J.S. Rosenschein and G. Zlotkin, Rules of Encounter, The MIT Press, 1994.
- I. Stahl, Bargaining Theory, Economics Research Institute, Stockholm School of Economics, 1972.
- I. Zuckerman, S. Kraus and J. S. Rosenschein. Using Focal Point Learning to Improve Human-Machine Tacit Coordination, JAAMAS, 2010.
164Tournament
- 2nd annual competition of state-of-the-art negotiating agents, to be held at AAMAS 2011
Do you want to participate? At least $2,000 for the winner! Contact us: sarit_at_cs.biu.ac.il