Automated negotiations: Agents interacting with other automated agents and with humans

1
Automated negotiations: Agents interacting with
other automated agents and with humans
  • Sarit Kraus
  • Department of Computer Science
  • Bar-Ilan University
  • University of Maryland
  • sarit@cs.biu.ac.il

http://www.cs.biu.ac.il/~sarit/
2
Negotiations
  • A discussion in which interested parties
    exchange information and come to an agreement.
    (Davis and Smith, 1977)

3

Negotiations
  • NEGOTIATION is an interpersonal decision-making
    process necessary whenever we cannot achieve our
    objectives single-handedly.

4
Agent environments
  • Teams of agents that need to coordinate joint
    activities. Problems: distributed information,
    distributed decision making, local conflicts.
  • Open agent environments: agents acting in the same
    environment. Problems: need motivation to
    cooperate, conflict resolution, trust,
    distributed and hidden information.

5
Open Agent Environments
  • Consist of:
  • Automated agents developed by or serving
    different people or organizations.
  • People with a variety of interests and
    institutional affiliations.
  • The computer agents are self-interested;
    they may cooperate to further their interests.
  • The set of agents is not fixed.

6
Open Agent Environments (examples)
  • Agents support people
  • Collaborative interfaces
  • CSCW (Computer Supported Cooperative Work) systems
  • Cooperative learning systems
  • Military-support systems
  • Agents act as proxies for people
  • Coordinating schedules
  • Patient care-delivery systems
  • Online auctions
  • Groups of agents act autonomously alongside
    people
  • Simulation systems for education and training
  • Computer games and other forms of entertainment
  • Robots in rescue operations
  • Software personal assistants

8
Examples
  • Monitoring electricity networks (Jennings)
  • Distributed design and engineering (Petrie et
    al.)
  • Distributed meeting scheduling (Sen & Durfee)
  • Teams of robotic systems acting in hostile
    environments (Balch & Arkin, Tambe)
  • Collaborative Internet-agents (Etzioni & Weld,
    Weiss)
  • Collaborative interfaces (Grosz & Ortiz, Andre)
  • Information agents on the Internet (Klusch)
  • Cooperative transportation scheduling (Fischer)
  • Supporting hospital patient scheduling (Decker &
    Jin)
  • Intelligent Agents for Command and Control
    (Sycara)

9
Types of agents
  • Fully rational agents
  • Bounded rational agents

10
Using other disciplines' results
  • No need to start from scratch!
  • Requires modification and adjustment; AI gives
    insights and complementary methods.
  • Is it worth it to use formal methods for
    multi-agent systems?

11
Negotiating with rational agents
  • Quantitative decision making
  • Maximizing expected utility
  • Nash equilibrium, Bayesian Nash equilibrium
  • Automated Negotiator:
  • Models the scenario as a game
  • The agent computes (if complexity allows)
    the equilibrium strategy, and acts accordingly
  • (Kraus, Strategic Negotiation in Multiagent
    Environments, MIT Press 2001)

12
Game Theory studies situations of strategic
interaction in which each decision maker's plan
of action depends on the plans of the other
decision makers.
  • Short introduction to game theory

13
Decision Theory (reminder): How to make decisions
  • Decision Theory =
    Probability Theory (deals with chance) +
    Utility Theory (deals with outcomes)
  • Fundamental idea:
  • The MEU (Maximum Expected Utility) principle:
  • Weigh the utility of each outcome by the
    probability that it occurs

14
Basic Principle
  • Given probability P(outj | Ai) and utility U(outj)
    for each outcome outj of action Ai
  • Expected utility of an action Ai:
    EU(Ai) = Σ_{outj ∈ OUT} U(outj) · P(outj | Ai)
  • Choose the Ai ∈ Ac that maximizes EU:
    MEU = argmax_{Ai ∈ Ac} Σ_{outj ∈ OUT} U(outj) · P(outj | Ai)
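To make the principle concrete, here is a minimal sketch
in Python; the actions, outcomes, and probabilities below
are illustrative, not taken from the talk.

# MEU sketch: EU(Ai) = sum over outj of U(outj) * P(outj | Ai);
# pick the action with the largest expected utility.

def expected_utility(action, probs, utils):
    """probs[action][outcome] = P(outcome | action); utils[outcome] = U(outcome)."""
    return sum(p * utils[out] for out, p in probs[action].items())

utils = {"win": 10.0, "draw": 0.0, "lose": -5.0}        # hypothetical U(outj)
probs = {                                               # hypothetical P(outj | Ai)
    "aggressive": {"win": 0.5, "draw": 0.1, "lose": 0.4},
    "cautious":   {"win": 0.2, "draw": 0.7, "lose": 0.1},
}

best = max(probs, key=lambda a: expected_utility(a, probs, utils))
print(best, expected_utility(best, probs, utils))       # aggressive 3.0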
15
Risk Averse, Risk Neutral, Risk Seeking
[Figure: utility curves for a risk seeker, a risk-averse
agent, and a risk-neutral agent]
16
Game Description
  • Players
  • Who participates in the game?
  • Actions / Strategies
  • What can each player do?
  • In what order do the players act?
  • Outcomes / Payoffs
  • What is the outcome of the game?
  • What are the players' preferences over the
    possible outcomes?

17
Game Description (cont)
  • Information
  • What do the players know about the parameters of
    the environment or about one another?
  • Can they observe the actions of the other
    players?
  • Beliefs
  • What do the players believe about the unknown
    parameters of the environment or about one
    another?
  • What can they infer from observing the actions of
    the other players?

18
Strategies and Equilibrium
  • Strategy
  • Complete plan, describing an action for every
    contingency
  • Nash Equilibrium
  • Each player's strategy is a best response to the
    strategies of the other players
  • Equivalently No player can improve his payoffs
    by changing his strategy alone
  • Self-enforcing agreement. No need for formal
    contracting
  • Other equilibrium concepts also exist

19
Classification of Games
  • Depending on the timing of moves
  • Games with simultaneous moves
  • Games with sequential moves
  • Depending on the information available to the
    players
  • Games with perfect information
  • Games with imperfect (or incomplete) information
  • We concentrate on non-cooperative games
  • Groups of players cannot deviate jointly
  • Players cannot make binding agreements

20
Games with Simultaneous Moves and Perfect
Information
  • All players choose their actions simultaneously
    or just independently of one another
  • There is no private information
  • All aspects of the game are known to the players
  • Representation by game matrices
  • Often called normal form games or strategic form
    games

21
Matching Pennies
Example of a zero-sum game. Strategic issue of
competition.
22
Prisoner's Dilemma
  • Each player can cooperate or defect

                          Column
                  cooperate      defect
Row   cooperate    -1, -1       -10, 0
      defect        0, -10       -8, -8

Main issue: Tension between social optimality
and individual incentives.
23
Coordination Games
  • A supplier and a buyer need to decide whether to
    adopt a new purchasing system.

                      Buyer
                  new         old
Supplier   new   20, 20       0, 0
           old    0, 0        5, 5
24
Battle of the Sexes
The game involves both the issues of coordination
and competition.
25
Definition of Nash Equilibrium
  • A game has n players.
  • Each player i has a strategy set Si
  • This is his set of possible actions
  • Each player i has a payoff function
  • πi : S → R
  • A strategy ti in Si is a best response if there
    is no other strategy in Si that produces a higher
    payoff, given the opponents' strategies

26
Definition of Nash Equilibrium
  • A strategy profile is a list (s1, s2, …, sn) of
    the strategies each player is using
  • If each strategy is a best response given the
    other strategies in the profile, the profile is a
    Nash equilibrium
  • Why is this important?
  • If we assume players are rational, they will play
    Nash strategies
  • Even less-than-rational play will often converge
    to Nash in repeated settings

27
An Example of a Nash Equilibrium

                 Column
              a          b
Row    a     0, 1       1, 0
       b     1, 2       2, 1

(b, a) is a Nash equilibrium: given that Column is
playing a, Row's best response is b; given that
Row is playing b, Column's best response is a
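To illustrate the definition, a small Python sketch that
brute-forces the pure-strategy Nash equilibria of the
matrix above (the dictionary encoding of the payoffs is
mine):

from itertools import product

# payoffs[(row_action, col_action)] = (row_payoff, col_payoff), from the matrix above
payoffs = {
    ("a", "a"): (0, 1), ("a", "b"): (1, 0),
    ("b", "a"): (1, 2), ("b", "b"): (2, 1),
}
actions = ["a", "b"]

def is_nash(r, c):
    # r is a best response to c, and c is a best response to r
    row_best = all(payoffs[(r, c)][0] >= payoffs[(r2, c)][0] for r2 in actions)
    col_best = all(payoffs[(r, c)][1] >= payoffs[(r, c2)][1] for c2 in actions)
    return row_best and col_best

print([rc for rc in product(actions, actions) if is_nash(*rc)])   # [('b', 'a')]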
28
Mixed strategies
  • Unfortunately, not every game has a pure strategy
    equilibrium.
  • Rock-paper-scissors
  • However, every game has a mixed strategy Nash
    equilibrium
  • Each action is assigned a probability of play
  • Player is indifferent between actions, given
    these probabilities

29
Mixed Strategies
                      Wife
                shopping    football
Husband
      football    0, 0        2, 1
      shopping    1, 2        0, 0
30
Mixed strategy
  • Instead, each player selects a probability
    associated with each action
  • Goal: the expected utility of each action is equal;
    players are then indifferent between their choices
  • a = probability the husband chooses football
  • b = probability the wife chooses shopping
  • Since the husband's payoffs must be equal:
    b · 1 = (1 - b) · 2, so b = 2/3
  • For the wife:
    a · 1 = (1 - a) · 2, so a = 2/3
  • In each case, the expected payoff is 2/3
  • 2/9 of the time they go to football, 2/9 of the
    time shopping, and 5/9 of the time they
    miscoordinate
  • If they could synchronize ahead of time they
    could do better.
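A short Python check of the indifference calculation
above, using the payoffs from the Battle of the Sexes
matrix:

# a = P(husband plays football), b = P(wife plays shopping); from the
# indifference conditions b*1 = (1-b)*2 and a*1 = (1-a)*2:
a = b = 2 / 3

# Husband's expected payoffs (wife shops with probability b):
eu_football = b * 0 + (1 - b) * 2     # = 2/3
eu_shopping = b * 1 + (1 - b) * 0     # = 2/3
assert abs(eu_football - eu_shopping) < 1e-12

p_both_football = a * (1 - b)         # 2/9
p_both_shopping = (1 - a) * b         # 2/9
print(p_both_football, p_both_shopping, 1 - p_both_football - p_both_shopping)  # 5/9 miscoordinate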

31
Rock, Paper, Scissors

                          Column
              rock        paper       scissors
Row  rock     0, 0       -1, 1         1, -1
     paper    1, -1       0, 0        -1, 1
     scissors -1, 1       1, -1        0, 0
32
Setup
  • Player 1 plays rock with probability pr, scissors
    with probability ps, and paper with probability
    1 - pr - ps
  • Utility2(rock) = 0·pr + 1·ps - 1·(1 - pr - ps)
    = 2ps + pr - 1
  • Utility2(scissors) = 0·ps + 1·(1 - pr - ps) - 1·pr
    = 1 - 2pr - ps
  • Utility2(paper) = 0·(1 - pr - ps) + 1·pr - 1·ps
    = pr - ps
  • Player 2 wants to choose a probability for each
    action so that the expected payoff for each
    action is the same.

33
Setup
  • Player 2's expected utility is
    qr·(2ps + pr - 1) + qs·(1 - 2pr - ps) + (1 - qr - qs)·(pr - ps)
  • It turns out (after some algebra) that the
    optimal mixed strategy is to play each action 1/3
    of the time
  • Intuition: What if you played rock half the time?
    Your opponent would then play paper half the
    time, and you'd lose more often than you won
  • So you'd decrease the fraction of times you
    played rock, until your opponent had no edge in
    guessing what you'll do
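The indifference conditions derived from these utilities
form a small linear system; a sketch solving it with
numpy (assuming numpy is available):

import numpy as np

# Setting U2(rock) = U2(scissors) and U2(rock) = U2(paper) from the
# previous slide gives:  3*pr + 3*ps = 2  and  3*ps = 1.
A = np.array([[3.0, 3.0],
              [0.0, 3.0]])
rhs = np.array([2.0, 1.0])
pr, ps = np.linalg.solve(A, rhs)
print(pr, ps, 1 - pr - ps)   # 1/3 each: play every action a third of the time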

34
Extensive Form Games
Any finite game of perfect information has a pure
strategy Nash equilibrium. It can be found by
backward induction.
Chess is a finite game of perfect information.
Therefore it is a trivial game from a game
theoretic point of view.
35
Extensive Form Games - Intro
  • A game can have complex temporal structure
  • Information
  • set of players
  • who moves when and under what circumstances
  • what actions are available when called upon to
    move
  • what is known when called upon to move
  • what payoffs each player receives
  • Foundation is a game tree

36
Example: Cuban Missile Crisis

[Game tree: Khrushchev moves first, choosing Arm or
Retract. After Arm, Kennedy chooses Nuke, with payoffs
(-100, -100), or Fold, with payoffs (10, -10); Retract
yields (-1, 1).]

Pure strategy Nash equilibria: (Arm, Fold) and
(Retract, Nuke)
37
Subgame perfect equilibrium and credible threats
  • Proper subgame: a subtree (of the game tree) whose
    root is alone in its information set
  • Subgame perfect equilibrium
  • Strategy profile that is in Nash equilibrium in
    every proper subgame (including the root),
    whether or not that subgame is reached along the
    equilibrium path of play

38
Example: Cuban Missile Crisis

[The same game tree as above: Arm then Nuke yields
(-100, -100), Arm then Fold yields (10, -10), and
Retract yields (-1, 1).]

Pure strategy Nash equilibria: (Arm, Fold) and
(Retract, Nuke)
Pure strategy subgame perfect equilibrium: (Arm, Fold)
Conclusion: Kennedy's Nuke threat was not credible.
39
Types of games
Diplomacy
40
Take it or leave it deals
  • The rules of the game:
  • You will be randomly paired up with someone in
    the other section; this pairing will remain
    completely anonymous.
  • One of you will be chosen (by coin flip) to be
    either the Proposer or the Responder in this
    experiment.
  • The Proposer gets to make an offer to split $100
    in some proportion with the Responder. So the
    proposer can offer $x to the responder, proposing
    to keep $(100 - x) for themselves.
  • The Responder must decide the lowest amount
    offered by the proposer that he / she will
    accept, i.e. "I will accept any offer which is
    greater than or equal to $y."
  • If the responder accepts the offer made by the
    proposer, they split the sum according to the
    proposal. If the responder rejects, both parties
    lose their shares.

41
An Example of Buyer/Seller Negotiation
42
BARGAINING
[Figure: a price line showing the ZOPA (zone of possible
agreement) between the seller's reservation price s and
the buyer's reservation price b, with the final price x
in between; the seller's surplus is x - s and the
buyer's surplus is b - x.]

Seller's RP: the seller wants s or more
Buyer's RP: the buyer wants b or less
43
BARGAINING
  • If b < s: negative bargaining zone, no
    possible agreements
  • If b > s: positive bargaining zone,
    agreement possible
  • (x - s) = seller's surplus
  • (b - x) = buyer's surplus
  • The total surplus to divide does not depend on x:
    a constant-sum game!

44
POSITIVE BARGAINING ZONE
[Figure: the seller's bargaining range (from the
seller's target point to the seller's reservation point)
overlaps the buyer's bargaining range (from the buyer's
reservation point to the buyer's target point), yielding
a POSITIVE bargaining zone.]
45
NEGATIVE BARGAINING ZONE
[Figure: the same ranges, but the buyer's reservation
point lies below the seller's reservation point, so the
ranges do not overlap and the bargaining zone is
NEGATIVE.]
46
Single issue negotiation
  • Agents a and b negotiate over a pie of size 1
  • Offer: (x, y), with x + y = 1
  • Deadline n and discount factor d
  • Utility:
    Ua((x, y), t) = x · d^(t-1) if t ≤ n, 0 otherwise
    Ub((x, y), t) = y · d^(t-1) if t ≤ n, 0 otherwise
  • The agents negotiate using Rubinstein's
    alternating offers protocol

47
Alternating offers protocol
Time   Offer          Respond
1      a: (x1, y1)    b (accept/reject)
2      b: (x2, y2)    a (accept/reject)
...
n

48
Equilibrium strategies
  • How much should an agent offer if there is only
    one time period?
  • Let n = 1 and let a be the first mover
  • Agent a's offer:
  • Propose to keep the whole pie: (1, 0); agent b
    will accept this

49
Equilibrium strategies for n = 2
  • d = 1/4, first mover: a
  • Offer (x, y): x = a's share, y = b's share
  • Optimal offers obtained using backward induction

Time   Offering agent   Offer         Utilities (Ua, Ub)
1      a → b            (3/4, 1/4)    3/4, 1/4  (agreement)
2      b → a            (0, 1)        0, 1/4

The offer (3/4, 1/4) forms a P.E. Nash equilibrium
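The same backward induction works for any deadline n and
discount factor d; a minimal Python sketch:

def first_mover_share(n, d):
    """Equilibrium pie share of the proposer at t = 1 in the single-issue
    alternating-offers game with deadline n and discount factor d."""
    share = 1.0                    # at t = n the proposer keeps the whole pie
    for _ in range(n - 1):         # walk backwards from t = n-1 down to t = 1
        # The responder accepts iff offered what waiting is worth: d * share.
        share = 1.0 - d * share
    return share

print(first_mover_share(1, 0.25))  # 1.0  -> offer (1, 0)
print(first_mover_share(2, 0.25))  # 0.75 -> offer (3/4, 1/4), as in the table above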
50
Effect of discount factor and deadline on the
equilibrium outcome
  • What happens to the first mover's share as d
    increases?
  • What happens to the second mover's share as d
    increases?
  • As the deadline increases, what happens to the
    first mover's share?
  • Likewise for the second mover?

51
Effect of d and deadline on the agents' shares
52
Multiple issues
  • Set of issues S = {1, 2, …, m}. Each issue is a
    pie of size 1
  • The issues are divisible
  • Deadline n (for all the issues)
  • Discount factor dc for issue c
  • Utility: U(x, t) = Σc U(xc, t)

53
Multi-issue procedures
  • Package deal procedure The issues are bundled
    and discussed together as a package
  • Simultaneous procedure The issues are negotiated
    in parallel but independently of each other
  • Sequential procedure The issues are negotiated
    sequentially one after another

54
Package deal procedure
  • Issues negotiated using alternating offers
    protocol
  • An offer specifies a division for each of the
    m issues
  • The agents are allowed to accept/reject a
    complete offer
  • The agents may have different preferences over
    the issues
  • The agents can make tradeoffs across the
    issues to maximize their utility; this leads
    to a Pareto-optimal outcome

55
Utility for two issues
Ua = 2X + Y
Ub = X + 2Y
56
Making tradeoffs
What is a's utility when Ub = 2?
[Figure: the line Ub = 2 in the space of offers]
57
Example for two issues
DEADLINE: n = 2;  DISCOUNT FACTORS: d1 = d2 = 1/2
UTILITIES: Ua = (1/2)^(t-1) · (x1 + 2·x2)
           Ub = (1/2)^(t-1) · (2·y1 + y2)

Time   Offering agent   Package Offer
1      a → b            (1/4, 3/4)(1, 0) OR (3/4, 1/4)(0, 1)
2      b → a            (0, 1)(0, 1), Ub = 1.5

The outcome is not symmetric
58
P.E. Nash equilibrium strategies
For t = n: The offering agent takes 100 percent of
all the issues; the receiving agent accepts.
For t < n (for agent a):

OFFER (x, y) s.t. Ub(y, t) = EQUB(t+1). If there is more
than one such (x, y), perform trade-offs across issues
to find the best offer.
RECEIVE (x, y): If Ua(x, t) ≥ EQUA(t+1), ACCEPT;
else REJECT.

EQUA(t+1) is a's equilibrium utility for t+1
EQUB(t+1) is b's equilibrium utility for t+1
59
Making trade-offs: divisible issues
Agent a's trade-off problem (TR) at time t: find
a package (x, y) to

Maximize    Σ_{c=1}^{m} ka_c · xc
Subject to  Σ_{c=1}^{m} kb_c · yc = EQUB(t+1),
            0 ≤ xc ≤ 1,  0 ≤ yc ≤ 1

This is the fractional knapsack problem
60
Making trade-offs: divisible issues
  • Agent a's perspective (time t):
  • Agent a considers the m issues in
    increasing order of ka/kb and assigns to b the
    maximum possible share for each of them until b's
    cumulative utility equals EQUB(t+1)
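A sketch of this greedy computation (the fractional
knapsack from the previous slide), applied to the
two-issue example from earlier:

def trade_off(ka, kb, equb):
    """Agent a's trade-off: choose shares x[c] for a (b gets y[c] = 1 - x[c])
    maximizing sum(ka[c]*x[c]) subject to sum(kb[c]*y[c]) = equb.
    Greedy over issues in increasing order of ka[c]/kb[c]."""
    x = [1.0] * len(ka)              # start by giving a everything
    owed = equb                      # utility still owed to b
    for c in sorted(range(len(ka)), key=lambda c: ka[c] / kb[c]):
        give = min(1.0, owed / kb[c])    # b's share of issue c
        x[c] = 1.0 - give
        owed -= give * kb[c]
        if owed <= 1e-12:
            break
    return x

# The two-issue example: Ua = x1 + 2*x2, Ub = 2*y1 + y2, EQUB = 1.5
print(trade_off([1.0, 2.0], [2.0, 1.0], 1.5))   # [0.25, 1.0], i.e. (1/4, 3/4)(1, 0)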

61
Equilibrium strategies
For t = n: The offering agent takes 100 percent of
all the issues; the receiving agent accepts.
For t < n (for agent a):

OFFER (x, y) s.t. Ub(y, t) = EQUB(t+1). If there is more
than one such (x, y), perform trade-offs across issues
to find the best offer.
RECEIVE (x, y): If Ua(x, t) ≥ EQUA(t+1), ACCEPT;
else REJECT.
62
Equilibrium solution
  • An agreement on all the m issues occurs in the
    first time period
  • The time to compute the equilibrium offer for the
    first time period is O(mn)
  • The equilibrium solution is Pareto-optimal (an
    outcome is Pareto optimal if it is impossible to
    improve the utility of both agents
    simultaneously)
  • The equilibrium solution is not unique, and it is
    not symmetric

63
Making trade-offs: indivisible issues
  • Agent a's trade-off problem at time t is to find
    a package (x, y) that solves the same program TR
    as above, but with indivisible (0/1) shares

For indivisible issues, this is the integer
knapsack problem
64
Key points
  • Single issue:
  • Time to compute the equilibrium is O(n)
  • The equilibrium is not unique and not symmetric
  • Multiple divisible issues (exact solution):
  • Time to compute the equilibrium for t = 1 is O(mn)
  • The equilibrium is Pareto optimal; it is not
    unique and not symmetric
  • Multiple indivisible issues (approx. solution):
  • There is an FPTAS to compute an approximate
    equilibrium
  • The equilibrium is Pareto optimal; it is not
    unique and not symmetric

65
Negotiation on data allocation in a multi-server
environment. R. Azulay-Schwartz and S. Kraus.
Negotiation On Data Allocation in Multi-Agent
Environments. Autonomous Agents and Multi-Agent
Systems journal, 5(2):123-172, 2002.
66
Cooperative Web Servers
  • The Data and Information System component of the
    Earth Observing System (EOSDIS) of NASA is a
    distributed knowledge system which supports
    archival and distribution of data at multiple and
    independent servers.

67
Cooperative Web Servers- cont.
  • Each data collection, or file, is called a
    dataset. The datasets are huge, so each dataset
    has only one copy.
  • The current policy for data allocation in NASA is
    static: old datasets are not reallocated; each
    new dataset is located at the server with the
    nearest topics (defined according to the topics
    of the datasets stored by this server).

68
Related Work -File Allocation Problem
  • The original problem: How to distribute files
    among computers in order to optimize system
    performance.
  • Our problem: How can self-motivated servers
    decide about the distribution of files, when each
    server has its own objectives?

69
Environment Description
  • There are several information servers. Each
    server is located at a different geographical
    area.
  • Each server receives queries from the clients in
    its area, and sends documents as responses to
    queries. These documents can be stored locally,
    or in another server.

70
Environment Description
[Figure: servers i and j, located in areas i and j. A
client sends a query to its local server; the server
answers with documents stored locally or retrieved, at
some distance cost, from the other server.]
71
Basic Definitions
  • SERVERS: the set of the servers.
  • DATASETS: the set of datasets (files) to be
    allocated.
  • Allocation: a mapping of each dataset to one of
    the servers. The set of all possible allocations
    is denoted by Allocs.
  • U: the utility function of each server.

72
The Conflict Allocation
  • If at least one server opts out of the
    negotiation, then the conflict allocation
    conflict_alloc is implemented.
  • We consider the conflict allocation to be the
    static allocation (each dataset is stored at the
    server with the closest topics).

73
Utility Function
  • Userver(alloc, t) specifies the utility of a server
    from alloc ∈ Allocs at time t.
  • It consists of:
  • The utility from the assignment of each dataset.
  • The cost of negotiation delay.
  • Userver(alloc, 0) = Σ_{x ∈ DATASETS} Vserver(x, alloc(x)).

74
Parameters of utility
  • query price: payment for retrieved documents.
  • usage(ds, s): the expected number of documents of
    dataset ds requested by clients in the area of
    server s.
  • storage costs, retrieval costs, answer costs.

75
Cost over time
  • Cost of communication and computation time of the
    negotiation.
  • Loss of unused information: new documents cannot
    be used until the negotiation ends.
  • Dataset usage and storage cost are assumed to
    decrease over time, with the same discount ratio.
  • Thus, there is a constant discount ratio of the
    utility from an allocation:
    Userver(alloc, t) = d^t · Userver(alloc, 0) - tC.

76
Assumptions
  • Each server prefers any agreement over
    continuation of the negotiation indefinitely.
  • The utility of each server from the conflict
    allocation is always greater than or equal to 0.
  • OFFERS: the set of allocations that are
    preferred by all the agents over opting out.

77
Negotiation Analysis - Simultaneous Responses
  • Simultaneous responses: a server, when
    responding, is not informed of the other
    responses.
  • Theorem: For each offer x ∈ OFFERS, there is a
    subgame-perfect equilibrium of the bargaining
    game, with the outcome x offered and unanimously
    accepted in period 0.

78
Choosing the Allocation
  • The designers of the servers can agree in advance
    on a joint technique for choosing x:
  • giving each server its conflict utility
  • maximizing a social welfare criterion:
  • the sum of the servers' utilities,
  • or the generalized Nash product of the servers'
    utilities: Π (Us(x) - Us(conflict))
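As a small illustration of the last criterion, a Python
sketch that picks the allocation maximizing the
generalized Nash product; the servers, allocations, and
utility numbers are made up:

# Choose the allocation maximizing prod_s (U_s(alloc) - U_s(conflict)).
conflict = {"s1": 10.0, "s2": 8.0}            # hypothetical conflict utilities
utility = {                                    # hypothetical U_s(alloc)
    "s1": {"A": 14.0, "B": 12.0},
    "s2": {"A": 9.0,  "B": 13.0},
}

def nash_product(alloc):
    prod = 1.0
    for s in utility:
        prod *= max(utility[s][alloc] - conflict[s], 0.0)
    return prod

print(max(["A", "B"], key=nash_product))   # "B": (12-10)*(13-8) = 10 beats (14-10)*(9-8) = 4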

79
Experimental Evaluation
  • How do the parameters influence the results of
    the negotiation?
  • vcost(alloc): the variable costs due to an
    allocation (excludes storage_cost and the gains
    due to queries).
  • vcost_ratio: the ratio of vcosts when using
    negotiation to the vcosts of the static allocation.

80
Effect of Parameters on The Results
  • As the number of servers grows, vcost_ratio
    increases (more complex computations).
  • As the number of datasets grows, vcost_ratio
    decreases (negotiation is more beneficial).
  • Changing the mean usage did not influence
    vcost_ratio significantly, but vcost_ratio
    decreases as the standard deviation of the usage
    increases.

81
Influence of Parameters - cont.
  • When the standard deviation of the distances
    between servers increases, vcost_ratio decreases.
  • When the distance between servers increases,
    vcost_ratio decreases.
  • In the domains tested:
  • as answer_cost increases, vcost_ratio increases.
  • as storage_cost increases, vcost_ratio increases.
  • as retrieve_cost increases, vcost_ratio decreases.
  • as query_price increases, vcost_ratio decreases.

82
Incomplete Information
  • Each server knows:
  • The usage frequency of all datasets, by clients
    from its area
  • The usage frequency of datasets stored in it, by
    all clients

83
BARGAINING
[Figure: the ZOPA under incomplete information. The
seller's reservation price is either sL or sH, and the
buyer's reservation price is either bL or bH; x is the
final price.]

Seller's RP: the seller wants s or more
Buyer's RP: the buyer wants b or less
84
Definition of a Bayesian game
  • N is the set of players.
  • Ω is the set of the states of nature.
  • Ai is the set of actions for player i. A = A1 ×
    A2 × … × An
  • Ti is the type set of player i. For each state
    of nature, the game will have different types of
    players (one type per player).
  • ui : Ω × A → R is the payoff function for player i.
  • pi is the probability distribution over Ω for
    each player i; that is to say, each player has
    a different view of the probability distribution
    over the states of nature. In the game, they
    never know the exact state of nature.

85
Solution concepts for Bayesian games
  • A (Bayesian) Nash equilibrium is a strategy
    profile and beliefs specified for each player
    about the types of the other players that
    maximizes the expected utility for each player
    given their beliefs about the other players'
    types and given the strategies played by the
    other players.

86
Incomplete Information - cont.
  • A revelation mechanism:
  • First, all the servers report simultaneously all
    their private information:
  • for each dataset, the past usage of the dataset
    by this server;
  • for each server, the past usage of each local
    dataset by this server.
  • Then, the negotiation proceeds as in the complete
    information case.

87
Incomplete Information - cont.
  • Lemma: There is a Nash equilibrium where each
    server tells the truth about its past usage of
    remote datasets, and the other servers' usage of
    its local datasets.
  • Lies concerning details about local usage of
    local datasets are intractable.

88
Summary negotiation on data allocation
  • We have considered the data allocation problem in
    a distributed environment.
  • We have presented the utility function of the
    servers, which expresses their preferences.
  • We have proposed using a negotiation protocol for
    solving the problem.
  • For incomplete information situations, a
    revelation process was added to the protocol.

89
Agent-Human Negotiation
90
Computers interacting with people
[Figure: a spectrum of control, from "computer has the
control", through "computer persuades human", to "human
has the control".]
91
92
Culture sensitive agents
  • The development of a standardized agent to be used
    in the collection of data for studies on culture
    and negotiation

Buyer/Seller agents negotiate well across
cultures
  • PURB agent

93
Semi-autonomous cars
94
Medical applications
Gertner Institute for Epidemiology and Health
Policy Research
95
Automated care-taker
Agent: "I scheduled an appointment for you at the
physiotherapist this afternoon."
Patient: "I will be too tired in the afternoon!!!"
[The agent tries to reschedule and fails.]
Agent: "The physiotherapist has no other available
appointments this week. How about resting before
the appointment?"
96
Security applications
  • Collect
  • Update
  • Analyze
  • Prioritize

97
People often follow suboptimal decision strategies
  • Irrationalities attributed to:
  • sensitivity to context
  • lack of knowledge of own preferences
  • the effects of complexity
  • the interplay between emotion and cognition
  • the problem of self control
  • bounded rationality

98
  • Agents that play repeatedly with the same person

99
AutONA [BY03]
  • Buyers and sellers
  • Uses data from previous experiments
  • Belief function to model the opponent
  • Implemented several tactics and heuristics,
    including a concession mechanism

A. Byde, M. Yearworth, K.-Y. Chen, and C.
Bartolini. AutONA: A system for automated
multiple 1-1 negotiation. In CEC, pages 59-67,
2003
100
Cliff-Edge
  • Virtual learning and reinforcement learning
  • Using data from previous interactions
  • Implemented several tactics and heuristics
  • qualitative in nature
  • Non-deterministic behavior, by means of
    randomization

R. Katz and S. Kraus. Efficient agents for cliff
edge environments with a large set of decision
options. In AAMAS, pages 697-704, 2006
101
General opponent modeling
  • Agents that play with the same person only once

102
Challenges of human opponent modeling
  • Small number of examples
  • difficult to collect data on people
  • Noisy data
  • people are inconsistent (the same person may act
    differently)
  • people are diverse

103
Guessing Heuristic
  • Multi-issue, multi-attribute, with incomplete
    information
  • Domain independent
  • Implemented several tactics and heuristics,
    including a concession mechanism

C. M. Jonker, V. Robu, and J. Treur. An agent
architecture for multi-attribute negotiation
using incomplete preference information. JAAMAS,
15(2):221-252, 2007
104
PURB Agent
  • Building blocks: Personality model, Utility
    function, Rules for guiding choice.
  • Key idea: models the personality traits of its
    negotiation partners over time.
  • Uses decision theory to decide how to negotiate,
    with a utility function that depends on the models
    and other environmental features.
  • Pre-defined rules facilitate computation.

Plays as well as people; adapts to the culture
105
QOAgent [LIN08]
Played at least as well as people
  • Multi-issue, multi-attribute, with incomplete
    information
  • Domain independent
  • Implemented several tactics and heuristics,
    qualitative in nature
  • Non-deterministic behavior, also by means of
    randomization

Is it possible to improve on the QOAgent?
Yes, if you have data
R. Lin, S. Kraus, J. Wilkenfeld, and J. Barry.
Negotiating with bounded rational agents in
environments with incomplete information using an
automated agent. Artificial Intelligence,
172(6-7):823-851, 2008
106
KBAgent
  • Multi-issue, multi-attribute, with incomplete
    information
  • Domain independent
  • Implemented several tactics and heuristics,
    qualitative in nature
  • Non-deterministic behavior, also by means of
    randomization
  • Uses data from previous interactions

Y. Oshrat, R. Lin, and S. Kraus. Facing the
challenge of human-agent negotiations via
effective general opponent modeling. In AAMAS,
2009
107
Example scenario
  • Employer and job candidate
  • Objective: reach an agreement over hiring terms
    after a successful interview

108
General opponent modeling
  • Challenge: sparse data from past negotiation
    sessions between people
  • Technique: Kernel Density Estimation
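A sketch of one way kernel density estimation can smooth
such sparse acceptance data (a kernel-weighted estimate;
the bandwidth and history below are illustrative, not
the KBAgent's actual model):

import math

def gaussian_kernel(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)

def accept_probability(offer_utility, history, bandwidth=0.1):
    """Kernel-weighted estimate of P(accept | offer) from past sessions.
    history: list of (offer_utility_to_responder, was_accepted)."""
    num = den = 0.0
    for u, accepted in history:
        w = gaussian_kernel((offer_utility - u) / bandwidth)
        num += w * (1.0 if accepted else 0.0)
        den += w
    return num / den if den > 0 else 0.5     # no data: fall back to a prior

history = [(0.30, False), (0.45, False), (0.55, True), (0.70, True)]
print(accept_probability(0.50, history))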

109
General opponent modeling
  • Estimate the likelihood that the other party will:
  • accept an offer
  • make an offer
  • and estimate its expected average utility
  • The estimation is done separately for each
    possible agent type
  • The type of a negotiator is determined using a
    simple Bayes classifier
  • Use the estimates for decision making

110
KBAgent as the job candidate
  • Best result: $20,000, Project manager, with
    leased car, 20% pension funds, fast promotion, 8
    hours

Offers exchanged between the KBAgent and the human:
  • $20,000, Team Manager, with leased car, 20%
    pension, slow promotion, 9 hours
  • $12,000, Programmer, without leased car, 10%
    pension, fast promotion, 10 hours
  • $20,000, Project manager, without leased car,
    20% pension, slow promotion, 9 hours
111
KBAgent as the job candidate
  • Best agreement: $20,000, Project manager, with
    leased car, 20% pension funds, fast promotion, 8
    hours

Round 7: $20,000, Programmer, with leased car, 10%
pension, slow promotion, 9 hours
112
Experiments
Learned from 20 games of human-human negotiation
  • 172 grad and undergrad students in Computer
    Science
  • People were told they may be playing a computer
    agent or a person.
  • Scenarios:
  • Employer-Employee
  • Tobacco Convention: England vs. Zimbabwe

113
Results: Comparing the KBAgent to others

Player               Type            Average Utility Value (std)
KBAgent vs. people   Employer        468.9 (37.0)
QOAgent vs. people   Employer        417.4 (135.9)
People vs. people    Employer        408.9 (106.7)
People vs. QOAgent   Employer        431.8 (80.8)
People vs. KBAgent   Employer        380.4 (48.5)
KBAgent vs. people   Job Candidate   482.7 (57.5)
QOAgent vs. people   Job Candidate   397.8 (86.0)
People vs. people    Job Candidate   310.3 (143.6)
People vs. QOAgent   Job Candidate   320.5 (112.7)
People vs. KBAgent   Job Candidate   370.5 (58.9)
114
Main results
  • In comparison to the QOAgent:
  • The KBAgent achieved higher utility values than
    the QOAgent
  • More agreements were accepted by people
  • The sums of utility values (social welfare) were
    higher when the KBAgent was involved
  • The KBAgent achieved significantly higher utility
    values than people
  • The results demonstrate the proficient negotiation
    done by the KBAgent

General opponent modeling improves agent
bargaining
115
Automated care-taker
Agent: "I arranged for you to go to the
physiotherapist in the afternoon."
Patient: "I will be too tired in the afternoon!!!"
Agent: How can I convince him? What argument should
I give?
116
Security applications
How should I convince him to provide me with
information?
117
Argumentation
Should I tell him that we are running out of
antibiotics?
  • Which information to reveal?

Build a game that combines information revelation
and bargaining
120
Color Trails (CT)
  • An infrastructure for agent design,
    implementation and evaluation for open
    environments
  • Designed with Barbara Grosz (AAMAS 2004)
  • Implemented by the Harvard and BIU teams

121
An experimental test-bed
  • Interesting for people to play:
  • analogous to task settings
  • vivid representation of the strategy space (not
    just a list of outcomes).
  • Possible for computers to play
  • Can vary in complexity:
  • repeated vs. one-shot setting
  • availability of information
  • communication protocol.

122
Social Preference Agent
  • Learns the extent to which people are affected by
    social preferences such as social welfare and
    competitiveness.
  • Designed for one-shot take-it-or-leave-it
    scenarios.
  • Does not reason about the future ramifications of
    its actions.

Y. Gal and A. Pfeffer. Predicting people's
bidding behavior in negotiation. AAMAS 2006,
370-376
123
  • Agents for Revelation Games

Noam Peled, Kobi Gal, Sarit Kraus
124
Introduction - Revelation games
  • Combine two types of interaction:
  • Signaling games (Spence 1974):
  • Players choose whether to convey private
    information to each other
  • Bargaining games (Osborne and Rubinstein 1999):
  • Players engage in multiple negotiation rounds
  • Example: Job interview

125
Colored Trails (CT)
  • Asymmetric and symmetric boards

126
Why not equilibrium agents?
  • Results from the social sciences suggest people
    do not follow equilibrium strategies
  • Equilibrium-based agents playing against people
    have failed.
  • People rarely design agents to follow equilibrium
    strategies (Sarne et al., AAMAS 2008).
  • Equilibrium strategies are usually not
    cooperative:
  • all lose.

127
Perfect Equilibrium (PE) Agent
  • Solved using backward induction.
  • No signaling.
  • Counter-proposal round (selfish):
  • Second proposer: finds the most beneficial
    proposal while the responder's benefit remains
    positive.
  • Second responder: accepts any proposal which
    gives it a positive benefit.

128
PE agent Phase one
  • First proposal round (generous):
  • First proposer: proposes the opponent's
    counter-proposal.
  • First responder: accepts any proposal which
    gives it the same or higher benefit than its
    counter-proposal.
  • Revelation phase: revelation vs. non-revelation
  • On both boards, the PE with goal revelation
    yields lower or equal expected utility than the
    non-revelation PE

129
Benefits Diversity
  • Average proposed benefit to players from first
    and second rounds

130
Performance of PEQ agent
131
Revelation Effect
  • Only 35% of the games played by humans included
    revelation
  • Revelation had a significant effect on human
    performance but not on agent performance
  • Revelation didn't help the agent:
  • People were deterred by the strategic
    machine-generated proposals

132
SIGAL agent
  • Agent based on general opponent modeling

Logistic regression trained with a genetic algorithm
133
SIGAL Agent
  • Learns from previous games.
  • Predicts the acceptance probability of each
    proposal using logistic regression (see the
    sketch below).
  • Models the human as using a weighted utility
    function of:
  • the human's benefit
  • the difference between the players' benefits
  • the revelation decision
  • the benefits in the previous round
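A sketch of this kind of acceptance model and of
choosing the proposal that maximizes expected benefit;
the weights echo the "Learned Coefficients" slide below,
but the bias, feature scaling, and candidate proposals
are illustrative:

import math

def accept_probability(features, weights, bias=0.0):
    """Logistic model: P(accept) = sigmoid(w . f + bias). Feature order:
    responder benefit, benefits difference, revelation decision,
    benefits in the previous round (scaling and bias are illustrative)."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

weights = [0.96, -0.79, 0.26, 0.45]    # cf. the learned coefficients slide below

def best_proposal(candidates):
    """Pick the proposal maximizing the agent's expected benefit."""
    return max(candidates, key=lambda c:
               c["agent_benefit"] * accept_probability(c["features"], weights))

candidates = [                          # hypothetical proposals
    {"agent_benefit": 90, "features": [0.2, 0.5, 1.0, 0.3]},
    {"agent_benefit": 60, "features": [0.6, 0.1, 1.0, 0.3]},
]
print(best_proposal(candidates))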

134
Logistic Regression using a Genetic Algorithm
135
Expected benefit maximization
136
Maximization round 2
137
Strategy Comparison
  • Strategies for the asymmetric board, where none
    of the players has revealed, the human lacks 2
    chips for reaching the goal, and the agent lacks 1

In the first round the agent was proposed a benefit
of 90
138
Heuristics
  • Tit for tat
  • Never give more than you ask for in the
    counter-proposal
  • Risk averseness
  • Isoelastic utility

139
Learned Coefficients
  • Responder benefit (0.96)
  • Benefits difference (-0.79)
  • Responder revelation (0.26)
  • Proposer revelation (0.03)
  • Responder benefit in first round (0.45)
  • Proposer benefit in first round (0.33)

140
Methodology
  • Cross validation:
  • 10-fold
  • Over-fitting removal:
  • stop learning at the minimum of the
    generalization error
  • Error calculation on a held-out test set:
  • using new human-human games
  • Performance prediction criteria.

141
Performance
General opponent modeling improves agent
negotiations
142
General opponent modeling in Maximization
problems
143
AAT agent
  • Agent based on general opponent modeling

Decision Tree / Naïve Bayes
AAT
144
Aspiration Adaptation Theory (AAT)
  • An economic theory of people's behavior (Selten)
  • No utility function exists for decisions (!)
  • Relative decisions are used instead
  • Retreat and urgency are used for goal variables

Avi Rosenfeld and Sarit Kraus. Modeling Agents
through Bounded Rationality Theories. Proc. of
IJCAI 2009; JAAMAS, 2010.
145
Commodity search
[Figure: first store visited, price 1000]
146
Commodity search
[Figure: stores visited so far, with prices 900 and 1000]
147
Commodity search
[Figure: stores visited, with prices 900, 1000, and 950]
If the price is < 800, buy; otherwise visit 5 stores and
buy at the cheapest.
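The threshold rule on this slide as a small Python
sketch:

def commodity_search(prices, threshold=800, max_stores=5):
    """AAT-style rule from the slide: buy as soon as a price is below the
    threshold; otherwise visit max_stores stores and buy at the cheapest."""
    for i, price in enumerate(prices):
        if price < threshold:
            return price
        if i + 1 == max_stores:
            return min(prices[:max_stores])
    return min(prices)              # fewer stores than max_stores available

print(commodity_search([1000, 900, 950, 980, 940]))   # -> 900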
148
Results
149
  • General opponent modeling in cooperative
    environments

150
Coordination with limited communication
  • Communication is not always possible:
  • High communication costs
  • Need to act undetected
  • Damaged communication devices
  • Language incompatibilities
  • Goal: limited interruption of human activities

I. Zuckerman, S. Kraus and J. S. Rosenschein.
Using Focal Points Learning to Improve
Human-Machine Tactic Coordination, JAAMAS, 2010.
151
Focal Points (Examples)
  • Divide $100 into two piles; if your piles are
    identical to your coordination partner's, you get
    the $100. Otherwise, you get nothing.

101 equilibria
152
Focal points (Examples)
[Figures: two coordination games, one with 9 equilibria
and one with 16 equilibria]
153
Focal Points
  • Thomas Schelling ('63)
  • Focal Points: prominent solutions to tacit
    coordination games

154
Prior work: Focal Point based coordination for
closed environments
  • Domain-independent rules that could be used by
    automated agents to identify focal points
  • Properties: Centrality, Firstness, Extremeness,
    Singularity.
  • Logic-based model
  • Decision-theory-based model
  • Algorithms for agents' coordination

Kraus and Rosenschein, MAAMAW 1992; Fenster et al.,
ICMAS 1995; Annals of Mathematics and Artificial
Intelligence, 2000
155
FPL agent
  • Agent based on general opponent modeling

Decision Tree/ neural network
Focal Point
156
FPL agent
  • Agent based on general opponent modeling

raw data vector → FP vector
Decision tree / neural network
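A sketch of that transformation: each raw option is
mapped to Schelling-style focal point properties
(centrality, firstness, extremeness, singularity) before
the classifier is trained; the scoring functions here
are illustrative.

def focal_point_vector(options):
    """Map each raw option to illustrative focal-point features:
    centrality, firstness, extremeness, singularity."""
    n = len(options)
    fp = []
    for i, opt in enumerate(options):
        centrality = 1.0 - abs(i - (n - 1) / 2) / max(n - 1, 1)
        firstness = 1.0 if i == 0 else 0.0
        extremeness = max(abs(v) for v in opt)
        singularity = 1.0 if options.count(opt) == 1 else 0.0
        fp.append([centrality, firstness, extremeness, singularity])
    return fp

# The FP vectors (not the raw data) are fed to the decision tree / neural net.
print(focal_point_vector([[0.1], [0.9], [0.1]]))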
157
Focal Point Learning
  • 3 experimental domains

158
Results (cont.)
General opponent modeling improves agent
coordination
  • "very similar domain" (VSD) vs. "similar domain"
    (SD) of the "pick the pile" game.
159
Experimenting with people is a costly process
160
Evaluation of agents (EDA)
  • Peer Designed Agents (PDAs): computer agents
    developed by humans
  • Experiment: 300 human subjects, 50 PDAs, 3 EDAs
  • Results:
  • EDAs outperformed PDAs in the same situations in
    which they outperformed people;
  • on average, EDAs exhibited the same measure of
    generosity

R. Lin, S. Kraus, Y. Oshrat and Y. Gal.
Facilitating the Evaluation of Automated
Negotiators using Peer Designed Agents, in AAAI
2010.
161
Conclusions
  • Negotiation and argumentation with people are
    required for many applications
  • General opponent modeling is beneficial:
  • Machine learning
  • Behavioral models
  • Challenge: how to integrate machine learning and
    behavioral models

162
References
  • S.S. Fatima, M. Wooldridge, and N.R. Jennings,
    Multi-issue negotiation with deadlines, Journal of
    AI Research, 27:381-417, 2006.
  • R. Keeney and H. Raiffa, Decisions with multiple
    objectives: Preferences and value trade-offs,
    John Wiley, 1976.
  • S. Kraus, Strategic negotiation in multiagent
    environments, The MIT Press, 2001.
  • S. Kraus and D. Lehmann. Designing and Building a
    Negotiating Automated Agent, Computational
    Intelligence, 11(1):132-171, 1995.
  • S. Kraus, K. Sycara and A. Evenchik. Reaching
    agreements through argumentation: a logical model
    and implementation. Artificial Intelligence
    journal, 104(1-2):1-69, 1998.
  • R. Lin and S. Kraus. Can Automated Agents
    Proficiently Negotiate With Humans?
    Communications of the ACM, 53(1):78-88,
    January 2010.
  • R. Lin, S. Kraus, Y. Oshrat and Y. Gal.
    Facilitating the Evaluation of Automated
    Negotiators using Peer Designed Agents, in AAAI
    2010.

163
References contd.
  1. R. Lin, S. Kraus, J. Wilkenfeld, and J. Barry.
    Negotiating with bounded rational agents in
    environments with incomplete information using an
    automated agent. Artificial Intelligence,
    172(6-7):823-851, 2008.
  2. A. Lomuscio, M. Wooldridge, and N.R. Jennings, A
    classification scheme for negotiation in
    electronic commerce, Int. Jnl. of Group Decision
    and Negotiation, 12(1):31-56, 2003.
  3. M.J. Osborne and A. Rubinstein, A course in game
    theory, The MIT Press, 1994.
  4. M.J. Osborne and A. Rubinstein, Bargaining and
    Markets, Academic Press, 1990.
  5. Y. Oshrat, R. Lin, and S. Kraus. Facing the
    challenge of human-agent negotiations via
    effective general opponent modeling. In AAMAS,
    2009.
  6. H. Raiffa, The Art and Science of Negotiation,
    Harvard University Press, 1982.
  7. J.S. Rosenschein and G. Zlotkin, Rules of
    encounter, The MIT Press, 1994.
  8. I. Stahl, Bargaining Theory, Economics Research
    Institute, Stockholm School of Economics, 1972.
  9. I. Zuckerman, S. Kraus and J. S. Rosenschein.
    Using Focal Points Learning to Improve
    Human-Machine Tactic Coordination, JAAMAS, 2010.

164
Tournament
  • The 2nd annual competition of state-of-the-art
    negotiating agents will be held at AAMAS'11

Do you want to participate? At least $2,000
for the winner! Contact us! sarit@cs.biu.ac.il