UT^2: Human-like Behavior via Neuroevolution of Combat Behavior and Replay of Human Traces

Learn more at: https://nn.cs.utexas.edu

1
UT^2: Human-like Behavior via Neuroevolution of
Combat Behavior and Replay of Human Traces
  • Jacob Schrum (schrum2_at_cs.utexas.edu)
  • Igor V. Karpov (ikarpov_at_cs.utexas.edu)
  • Risto Miikkulainen (risto_at_cs.utexas.edu)

2
Our Approach: UT^2
  • Human traces to get unstuck and navigate
      • Filter data to get general-purpose traces
  • Evolve skilled combat behavior
      • Restrictions/filters maintain humanness
  • Observe and judge like a human
      • Necessary to account for the judging game

3
Bot Architecture
4
Human Trace Replay
6
Record and Index Human Games
  • Synthetic pose data
  • Indexed by nearest navpoint
  • Replay nearest trace when needed
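
The indexing idea on this slide can be sketched roughly as follows. This is an illustration only, not the authors' implementation: the `TraceIndex` class, the tuple-based positions, and the Euclidean nearest-navpoint rule are all assumptions made for the example.

```python
import math
from collections import defaultdict

def nearest_navpoint(pos, navpoints):
    """Return the id of the navpoint closest to pos (Euclidean distance)."""
    return min(navpoints, key=lambda nid: math.dist(pos, navpoints[nid]))

class TraceIndex:
    """Hypothetical store of recorded human pose sequences, keyed by navpoint."""

    def __init__(self, navpoints):
        self.navpoints = navpoints          # {navpoint_id: (x, y, z)}
        self.traces = defaultdict(list)     # {navpoint_id: [pose sequence, ...]}

    def record(self, poses):
        """Index a recorded trace under the navpoint nearest its first pose."""
        self.traces[nearest_navpoint(poses[0], self.navpoints)].append(poses)

    def lookup(self, bot_pos):
        """Return the stored trace starting nearest to the bot, or None."""
        candidates = self.traces.get(nearest_navpoint(bot_pos, self.navpoints))
        if not candidates:
            return None
        return min(candidates, key=lambda poses: math.dist(bot_pos, poses[0]))
```

Keying by navpoint keeps each lookup local: only traces recorded near the bot's current navpoint are compared, rather than every trace in the database.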
7
Unstuck Controller
  • Mix scripted responses and human traces
  • Previous UT^2 used only human traces
  • Human traces also used after repeated failures

Stuck Condition      | Response
Still                | Move Forward
Collide With Wall    | Move Away
Frequent Collisions  | Dodge Away
Under Elevator       | Go to Nearest Item or Dodge
Bump Agent           | Move Away
Same Navpoint        | Human Traces
Off Navpoint Grid    | Human Traces
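
The condition/response table above could be expressed as a simple lookup. The sketch below is hypothetical: the enum names, the string responses, and in particular the failure threshold are assumptions, since the slides do not say how many failures count as "repeated".

```python
from enum import Enum, auto

class Stuck(Enum):
    STILL = auto()
    COLLIDE_WITH_WALL = auto()
    FREQUENT_COLLISIONS = auto()
    UNDER_ELEVATOR = auto()
    BUMP_AGENT = auto()
    SAME_NAVPOINT = auto()
    OFF_NAVPOINT_GRID = auto()

# Condition -> response, copied from the slide's table.
RESPONSES = {
    Stuck.STILL: "move forward",
    Stuck.COLLIDE_WITH_WALL: "move away",
    Stuck.FREQUENT_COLLISIONS: "dodge away",
    Stuck.UNDER_ELEVATOR: "go to nearest item or dodge",
    Stuck.BUMP_AGENT: "move away",
    Stuck.SAME_NAVPOINT: "human traces",
    Stuck.OFF_NAVPOINT_GRID: "human traces",
}

MAX_SCRIPTED_FAILURES = 3  # assumed threshold; the slides give no number

def unstuck_response(condition, failures):
    """Pick a response; fall back to human traces after repeated failures."""
    if failures >= MAX_SCRIPTED_FAILURES:
        return "human traces"
    return RESPONSES[condition]
```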
8
Explorative Retrace
  • Explore the level like a human
  • Collisions allowed when using RETRACE
      • Humans often bump walls with no problem
  • If RETRACE fails
      • No trace available, or trace gets bot stuck
      • Fall through to PATH module (Nav graph)
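
The fall-through from RETRACE to PATH might look something like the sketch below. Everything here is assumed for illustration: the dict-based `bot_state`, the `trace_lookup` callback, and the convention that a module returns `None` when it has nothing to offer.

```python
def explore_step(retrace, path, bot_state):
    """Try RETRACE first; fall through to PATH if it has nothing usable."""
    action = retrace(bot_state)
    if action is not None:
        return ("RETRACE", action)
    return ("PATH", path(bot_state))

def make_retrace(trace_lookup):
    """Build a RETRACE module from a hypothetical trace_lookup(position)."""
    def retrace(bot_state):
        # Collisions alone are tolerated; only being stuck counts as failure.
        if bot_state.get("stuck"):
            return None
        trace = trace_lookup(bot_state["position"])
        if not trace:                       # no trace available
            return None
        return trace[0]                     # next recorded pose to move toward
    return retrace
```

Note that a `colliding` flag, if present, is deliberately ignored, matching the slide's point that humans often bump walls with no problem.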

10
Evolved Battle Controller
12
Battle Controller Outputs
  • 6 movement outputs
      • Advance
      • Retreat
      • Strafe left
      • Strafe right
      • Move to nearest item
      • Stand still
  • Additional output
      • Jump?
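
One plausible way to turn these seven outputs into a decision is sketched below. The slides do not say how outputs are decoded, so the winner-take-all rule over the six movement outputs and the 0.5 jump threshold are both assumptions.

```python
MOVEMENTS = ["advance", "retreat", "strafe left", "strafe right",
             "move to nearest item", "stand still"]

def decode_outputs(outputs, jump_threshold=0.5):
    """Map 7 network outputs to (movement, jump).

    Assumes the first six outputs score the movements and the largest wins,
    and that the seventh output triggers a jump above jump_threshold.
    """
    if len(outputs) != len(MOVEMENTS) + 1:
        raise ValueError("expected 6 movement outputs plus 1 jump output")
    scores = outputs[:len(MOVEMENTS)]
    best = max(range(len(MOVEMENTS)), key=lambda i: scores[i])
    return MOVEMENTS[best], outputs[-1] > jump_threshold
```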

13
Battle Controller Inputs
  • Pie slice sensors for enemies
  • Ray traces for walls/level geometry
  • Other misc. sensors for current weapon
    properties, nearby item properties, etc.
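
A pie slice sensor can be pictured as dividing the area around the bot into angular sectors and reporting how close the nearest enemy in each sector is. The sketch below is a guess at the general idea, not the sensor model used in UT^2: the 2D top-down geometry, the slice count, the range cutoff, and the linear distance falloff are all assumptions.

```python
import math

def pie_slice_sensors(bot_pos, bot_yaw, enemies, num_slices=8, max_range=1000.0):
    """Return one activation per angular slice around the bot (2D, top-down).

    A slice's activation is 1 - distance/max_range for the closest enemy
    falling in it, or 0.0 if the slice is empty. Slice 0 starts at the bot's
    heading and slices proceed counter-clockwise.
    """
    activations = [0.0] * num_slices
    width = 2 * math.pi / num_slices
    for ex, ey in enemies:
        dx, dy = ex - bot_pos[0], ey - bot_pos[1]
        dist = math.hypot(dx, dy)
        if dist == 0.0 or dist > max_range:
            continue                        # on top of the bot, or out of range
        rel = (math.atan2(dy, dx) - bot_yaw) % (2 * math.pi)
        idx = min(int(rel // width), num_slices - 1)
        activations[idx] = max(activations[idx], 1.0 - dist / max_range)
    return activations
```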
14
Battle Controller Inputs
  • Opponent movement sensors
      • Opponent performing movement action X?
  • Opponents modeled as moving like bot
      • Approximation used

15
Constructive Neuroevolution
  • Genetic Algorithms + Neural Networks
  • Build structure incrementally (complexification)
  • Good at generating control policies
  • Three basic mutations (no crossover used)
      • Perturb Weight
      • Add Connection
      • Add Node
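
The three mutations can be illustrated on a toy genome. This is a minimal sketch under stated assumptions, not the authors' code: the `Genome` class, the Gaussian noise scale, the weight range, and the convention of giving the incoming half of a split link a weight of 1.0 are all choices made for the example. Recurrent links and self-loops are allowed here for simplicity.

```python
import random
from dataclasses import dataclass, field

@dataclass
class Genome:
    """Minimal network genome: node ids plus weighted directed links."""
    nodes: list
    links: dict = field(default_factory=dict)   # {(src, dst): weight}

def perturb_weight(g, rng, scale=0.5):
    """Add Gaussian noise to one randomly chosen existing link weight."""
    if g.links:
        key = rng.choice(sorted(g.links))
        g.links[key] += rng.gauss(0.0, scale)

def add_connection(g, rng):
    """Link two nodes that are not yet connected, with a random weight."""
    free = [(a, b) for a in g.nodes for b in g.nodes if (a, b) not in g.links]
    if free:
        g.links[rng.choice(free)] = rng.uniform(-1.0, 1.0)

def add_node(g, rng):
    """Split an existing link src->dst into src->new->dst."""
    if g.links:
        src, dst = rng.choice(sorted(g.links))
        weight = g.links.pop((src, dst))
        new = max(g.nodes) + 1
        g.nodes.append(new)
        g.links[(src, new)] = 1.0       # assumed convention for the new link
        g.links[(new, dst)] = weight    # old weight carried on the second half

def mutate(g, rng=random):
    """Apply one of the three mutations, chosen uniformly (no crossover)."""
    rng.choice([perturb_weight, add_connection, add_node])(g, rng)
```

Starting from a small network and only ever adding structure is what the slide calls complexification: the network grows one link or node at a time.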
16
Evolving Battle Controller
  • Used NSGA-II with 3 objectives
      • Damage dealt
      • Damage received (negative)
      • Geometry collisions (negative)
  • Evolved in DM-1on1-Albatross
      • Small level to encourage combat
      • One native bot opponent
  • High score favored in
    selection of final network
  • Final combat behavior
    highly constrained

K. Deb et al. A Fast and Elitist Multiobjective
Genetic Algorithm: NSGA-II. Evol. Comp. 2002
17
Action Filtering
  • Network choice not always used
      • Forced to stand still sometimes
          • Sniping, not threatened, high ground
      • Prevented from jumping while still
      • Prevented from jumping near walls/opponents
      • Prevented from going to unwanted items
      • Prevented from strafing/retreating into walls
      • Etc.
  • Forced lower accuracy
  • Forced delays to simulate human response time
  • Evolution constrained to look human
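
A filter of this kind could sit between the network and the game as a small rule function. The sketch below covers only the movement and jump rules listed above. The flag names in `state` are hypothetical, and the substitute actions ("stand still" when a move is blocked, "advance" instead of an unwanted item) are assumptions, since the slides only say what is prevented, not what replaces it.

```python
def filter_action(action, jump, state):
    """Override the network's (movement, jump) choice with hand-written rules.

    `state` is a dict of hypothetical boolean flags describing the situation.
    """
    # Forced to stand still: sniping, not threatened, and on high ground.
    if state.get("sniping") and not state.get("threatened") and state.get("high_ground"):
        action = "stand still"
    # No strafing or retreating into walls.
    blocked = {
        "strafe left": "wall_left",
        "strafe right": "wall_right",
        "retreat": "wall_behind",
    }
    if action in blocked and state.get(blocked[action]):
        action = "stand still"
    # No detours to items the bot does not want.
    if action == "move to nearest item" and not state.get("item_wanted"):
        action = "advance"
    # No jumping while still, or near walls or opponents.
    if action == "stand still" or state.get("near_wall") or state.get("near_opponent"):
        jump = False
    return action, jump
```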

18
Importance of Observing
  • Humans don't just want max score
      • Human goal is to judge correctly
      • Requires observation w/o fighting
  • Observe module
      • Bot hasn't judged opponent
      • Avoids crowds
  • Judging module
      • Lengthy observation leads to judging

19
Observation Behavior
  • Still
  • Approach
  • Use Battle Controller
  • Retreat
20
Human Subject Evaluation
  • BotPrize tests humanness without saying what is
    human-like vs. bot-like
  • Idea: BotPrize-style experiment in which players
    are extensively interviewed
  • IRB Human Subject Study w/ cash prizes
      • Performed at UT
      • 6 human volunteers
      • 3 human interviewers
      • 4 versions of UT^2
      • Native bots

21
Justify Judgments
  • Record each match and replay to human
  • Human explains rationale for judgments
  • Downsides
      • Humans forget
      • Humans make things up
      • Humans change their minds
  • Still, many common themes emerged

22
Humans Aren't Killing Machines
  • Accuracy affected by movement/distraction
  • Pause before responding to surprises
  • Humans don't fire non-stop
      • Waiting for opportune shot
      • Saving ammo
  • Few weapon switches
  • Pause to observe

23
Humans Aren't Stupid
  • Humans rapidly correct mistakes
      • Get unstuck quickly
      • Move/dodge when fired upon
      • Don't stare at walls
  • Humans know their limitations
      • Prefer weapons requiring less accuracy
      • Don't fight with a weak weapon

24
Complex Human Movements
  • Do
      • Chase opponents tenaciously
      • Retreat while firing on opponent
      • Move in and out from cover
  • Don't
      • Perform many rapid movements too quickly
      • Turn around too quickly

25
Cognitive Issues
  • Theory of Mind
  • Behavior transitions
  • A chasing human expects to fight
  • Humans expect to be chased (traps)
  • Communication via judging
  • Human knows that its action will be
    recognized as human-like by humans
  • Emotion
  • Revenge on humans more satisfying
  • Fear of dangerous opponents

26
Conclusion
  • Human trace replay provides human-style
    exploration and gets bot unstuck
  • Multiobjective neuroevolution provides combat
    behavior
  • Simulated observation makes bot seem more
    human-like
  • Future work: Incorporate Theory of Mind

27
Questions?
  • Jacob Schrum (schrum2_at_cs.utexas.edu)
  • Igor V. Karpov (ikarpov_at_cs.utexas.edu)
  • Risto Miikkulainen (risto_at_cs.utexas.edu)

28
Auxiliary Slides
29
Multiobjective Optimization
  • Game with two objectives
      • Damage Dealt
      • Remaining Health
  • A dominates B iff A is strictly better in at least
    one objective and at least as good in all others
  • Population of points not dominated are best:
    Pareto Front
  • Weighted-sum provably incapable of capturing
    non-convex front

Plot annotations:
  • High health but did not deal much damage
  • Tradeoff between objectives
  • Dealt lot of damage, but lost lots of health
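
The dominance relation and the Pareto front for the two-objective game above can be written in a few lines. The function names and the sample scores are made up for illustration; both objectives are assumed to be maximized.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere.

    Objectives are maximized; a and b are equal-length tuples of scores.
    """
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(points):
    """Return the points not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# (damage dealt, remaining health) for five hypothetical players
scores = [(10, 90), (50, 50), (90, 10), (40, 40), (10, 10)]
front = pareto_front(scores)
```

Here `front` keeps `(10, 90)`, `(50, 50)`, and `(90, 10)`: each trades damage for health differently, and none is better than another in both objectives. `(40, 40)` and `(10, 10)` drop out because `(50, 50)` beats them on both counts.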
30
NSGA-II
  • Evolution: natural approach for finding optimal
    population
  • Non-Dominated Sorting Genetic Algorithm II
      • Population P with size N; evaluate P
      • Use mutation to get P' with size N; evaluate P'
      • Calculate non-dominated fronts of P ∪ P' (size 2N)
      • New population of size N from highest fronts of
        P ∪ P'

K. Deb et al. A Fast and Elitist Multiobjective
Genetic Algorithm: NSGA-II. Evol. Comp. 2002
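
The generation loop on this slide can be sketched as below. It is a simplification, not a full NSGA-II: when the last front does not fit, it is cut in index order, where the published algorithm ranks its members by crowding distance. It also re-evaluates the parents each generation for brevity. The `evaluate` and `mutate` callbacks are placeholders supplied by the caller, and all objectives are assumed to be maximized.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def non_dominated_fronts(scores):
    """Split indices into fronts: front 0 is non-dominated, front 1 is
    non-dominated once front 0 is removed, and so on."""
    remaining = set(range(len(scores)))
    fronts = []
    while remaining:
        front = sorted(i for i in remaining
                       if not any(dominates(scores[j], scores[i]) for j in remaining))
        fronts.append(front)
        remaining -= set(front)
    return fronts

def next_generation(parents, evaluate, mutate):
    """One generation: P -> P' by mutation, then keep the best N of P ∪ P'."""
    n = len(parents)
    children = [mutate(p) for p in parents]
    combined = parents + children
    scores = [evaluate(ind) for ind in combined]
    survivors = []
    for front in non_dominated_fronts(scores):
        # Simplification: fill in index order instead of by crowding distance.
        for i in front:
            if len(survivors) < n:
                survivors.append(combined[i])
    return survivors
```

Because parents and children compete in the same pool, a good parent is never lost to a worse child, which is the elitism the cited paper's title refers to.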