UT^2: Human-like Behavior via Neuroevolution of Combat Behavior and Replay of Human Traces

Learn more at: https://nn.cs.utexas.edu


1
UT^2: Human-like Behavior via Neuroevolution of Combat Behavior and Replay of Human Traces

Jacob Schrum (schrum2_at_cs.utexas.edu)
Igor V. Karpov (ikarpov_at_cs.utexas.edu)
Risto Miikkulainen (risto_at_cs.utexas.edu)

2
Our Approach: UT^2

Human traces to get unstuck and navigate
Filter data to get general-purpose traces
Evolve skilled combat behavior
Restrictions/filters maintain humanness
Observe and judge like a human
Necessary to account for the judging game

3
Bot Architecture
4
Human Trace Replay
6
Record and Index Human Games
Synthetic pose data
Indexed by nearest navpoint
Replay nearest trace when needed
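
A minimal sketch of how recorded poses might be indexed by nearest navpoint and looked up at replay time. The data layout (dictionaries of 3D positions), the function names, and the tie-breaking by Euclidean distance are all assumptions for illustration; the slides do not describe the actual data structures.

```python
import math
from collections import defaultdict

def nearest_navpoint(pos, navpoints):
    """Return the id of the navpoint closest to pos (Euclidean distance)."""
    return min(navpoints, key=lambda nid: math.dist(pos, navpoints[nid]))

def build_trace_index(traces, navpoints):
    """Map navpoint id -> list of (trace_id, frame) recorded nearest to it."""
    index = defaultdict(list)
    for trace_id, poses in traces.items():
        for frame, pos in enumerate(poses):
            index[nearest_navpoint(pos, navpoints)].append((trace_id, frame))
    return index

def find_replay_start(bot_pos, traces, navpoints, index):
    """Among poses indexed under the bot's nearest navpoint, pick the closest.

    Returns (trace_id, frame), or None when no trace is available there.
    """
    nid = nearest_navpoint(bot_pos, navpoints)
    candidates = index.get(nid, [])
    if not candidates:
        return None
    return min(candidates, key=lambda c: math.dist(bot_pos, traces[c[0]][c[1]]))

# Hypothetical toy level: two navpoints and one recorded human trace.
navpoints = {"A": (0.0, 0.0, 0.0), "B": (100.0, 0.0, 0.0)}
traces = {"t1": [(5.0, 0.0, 0.0), (40.0, 0.0, 0.0), (90.0, 0.0, 0.0)]}
index = build_trace_index(traces, navpoints)
start = find_replay_start((80.0, 10.0, 0.0), traces, navpoints, index)
```

Indexing by navpoint keeps the lookup local: only poses recorded near the bot's current navpoint are compared, rather than every pose in every recorded game.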
7
Unstuck Controller

Mix scripted responses and human traces
Previous UT^2 used only human traces
Human traces also used after repeated failures

Stuck Condition        Response
Still                  Move Forward
Collide With Wall      Move Away
Frequent Collisions    Dodge Away
Under Elevator         Goto Nearest Item or Dodge
Bump Agent             Move Away
Same Navpoint          Human Traces
Off Navpoint Grid      Human Traces
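
The condition/response pairs above can be sketched as a lookup table with a fallback to human traces after repeated failures. The enum names, response strings, and the failure threshold are hypothetical; the slide only states that human traces are also used after repeated failures, not how many.

```python
from enum import Enum, auto

class Stuck(Enum):
    STILL = auto()
    COLLIDE_WITH_WALL = auto()
    FREQUENT_COLLISIONS = auto()
    UNDER_ELEVATOR = auto()
    BUMP_AGENT = auto()
    SAME_NAVPOINT = auto()
    OFF_NAVPOINT_GRID = auto()

# Scripted responses taken from the slide's table.
RESPONSES = {
    Stuck.STILL: "move_forward",
    Stuck.COLLIDE_WITH_WALL: "move_away",
    Stuck.FREQUENT_COLLISIONS: "dodge_away",
    Stuck.UNDER_ELEVATOR: "goto_nearest_item_or_dodge",
    Stuck.BUMP_AGENT: "move_away",
    Stuck.SAME_NAVPOINT: "human_trace",
    Stuck.OFF_NAVPOINT_GRID: "human_trace",
}

MAX_SCRIPTED_FAILURES = 3  # assumed value, not given in the slides

def unstuck_response(condition, failures=0):
    """Return the response for a stuck condition.

    After repeated failed attempts, fall back to replaying a human trace.
    """
    if failures >= MAX_SCRIPTED_FAILURES:
        return "human_trace"
    return RESPONSES[condition]
```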
8
Explorative Retrace

Explore the level like a human
Collisions allowed when using RETRACE
Humans often bump walls with no problem
If RETRACE fails
No trace available, or trace gets bot stuck
Fall through to PATH module (Nav graph)
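
The fall-through from RETRACE to PATH might look roughly like the following. The module interface (a callable that returns an action or None) and the state keys are assumptions made for this sketch.

```python
def retrace(state):
    """Hypothetical RETRACE module: returns None when it cannot act."""
    if not state["trace_available"] or state["stuck"]:
        return None
    return "follow_trace"

def path(state):
    """Hypothetical PATH module: always able to follow the nav graph."""
    return "follow_nav_graph"

def choose_exploration_action(state):
    """Try RETRACE first; if it fails, fall through to PATH."""
    action = retrace(state)
    if action is not None:
        return ("RETRACE", action)
    return ("PATH", path(state))
```

Note that a wall collision alone does not make RETRACE fail in this sketch; only a missing trace or an actual stuck condition does, matching the point that humans often bump walls with no problem.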

10
Evolved Battle Controller
12
Battle Controller Outputs

6 movement outputs
Advance
Retreat
Strafe left
Strafe right
Move to nearest item
Stand still
Additional output
Jump?
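
One plausible way to turn the seven network outputs into an action is sketched below. Picking the movement with the highest activation and thresholding the jump output are assumptions; the slides list the outputs but not how they are decoded.

```python
MOVEMENTS = [
    "advance",
    "retreat",
    "strafe_left",
    "strafe_right",
    "move_to_nearest_item",
    "stand_still",
]

def decode_outputs(outputs, jump_threshold=0.5):
    """Decode 6 movement outputs plus 1 jump output into (movement, jump).

    Assumed scheme: highest movement activation wins; jump if the
    seventh output exceeds jump_threshold.
    """
    if len(outputs) != len(MOVEMENTS) + 1:
        raise ValueError("expected 6 movement outputs and 1 jump output")
    movement_values = outputs[:len(MOVEMENTS)]
    best = max(range(len(MOVEMENTS)), key=movement_values.__getitem__)
    return MOVEMENTS[best], outputs[-1] > jump_threshold
```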

13
Battle Controller Inputs
Pie slice sensors for enemies
Ray traces for walls/level geometry
Other misc. sensors for current weapon properties, nearby item properties, etc.
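
A rough 2D sketch of pie slice sensors for enemies. The number of slices, the sensor range, the slice layout (slice 0 starts at the bot's heading and slices proceed counterclockwise), and the distance-based activation are all assumptions; the slide only names the sensor type.

```python
import math

def pie_slice_sensors(bot_pos, bot_yaw, enemies, num_slices=8, max_range=1000.0):
    """Return one activation per slice around the bot.

    Each slice reports its closest enemy as 1 - dist / max_range,
    or 0.0 when no enemy within range falls inside the slice.
    """
    sensors = [0.0] * num_slices
    width = 2 * math.pi / num_slices
    for ex, ey in enemies:
        dx, dy = ex - bot_pos[0], ey - bot_pos[1]
        dist = math.hypot(dx, dy)
        if dist > max_range:
            continue  # out of sensor range
        # Angle to the enemy relative to the bot's heading, in [0, 2*pi).
        rel = (math.atan2(dy, dx) - bot_yaw) % (2 * math.pi)
        idx = min(int(rel / width), num_slices - 1)
        sensors[idx] = max(sensors[idx], 1.0 - dist / max_range)
    return sensors

# One enemy ahead, one behind, one out of range.
sensors = pie_slice_sensors((0.0, 0.0), 0.0, [(500, 100), (-300, -10), (5000, 0)])
```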
14
Battle Controller Inputs

Opponent movement sensors
Opponent performing movement action X?
Opponents modeled as moving like bot
Approximation used

15
Constructive Neuroevolution

Genetic Algorithms + Neural Networks
Build structure incrementally (complexification)
Good at generating control policies
Three basic mutations (no crossover used)

Perturb Weight
Add Connection
Add Node
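
The three mutations can be sketched on a toy genome as follows. The genome layout (a dict of node ids, input ids, and connection records), the Gaussian perturbation, and the add-node step that splits an existing connection are assumptions in the style of common constructive neuroevolution methods, not the authors' confirmed implementation. Recurrent links and self-loops are allowed here for simplicity.

```python
import random

def perturb_weight(genome, rng, sigma=0.1):
    """Add Gaussian noise to one randomly chosen connection weight."""
    conn = rng.choice(genome["connections"])
    conn["weight"] += rng.gauss(0.0, sigma)

def add_connection(genome, rng):
    """Add a new link between two nodes that are not yet connected.

    Returns False when no new link is possible.
    """
    existing = {(c["src"], c["dst"]) for c in genome["connections"]}
    candidates = [
        (s, d)
        for s in genome["nodes"]
        for d in genome["nodes"]
        if d not in genome["inputs"] and (s, d) not in existing
    ]
    if not candidates:
        return False
    src, dst = rng.choice(candidates)
    genome["connections"].append(
        {"src": src, "dst": dst, "weight": rng.uniform(-1.0, 1.0)}
    )
    return True

def add_node(genome, rng):
    """Split a random connection by inserting a new hidden node."""
    conn = rng.choice(genome["connections"])
    genome["connections"].remove(conn)
    new_id = max(genome["nodes"]) + 1
    genome["nodes"].append(new_id)
    # Incoming weight 1.0 and outgoing old weight roughly preserve behavior.
    genome["connections"].append({"src": conn["src"], "dst": new_id, "weight": 1.0})
    genome["connections"].append(
        {"src": new_id, "dst": conn["dst"], "weight": conn["weight"]}
    )

def mutate(genome, rng):
    """Apply one of the three mutations (no crossover)."""
    rng.choice([perturb_weight, add_connection, add_node])(genome, rng)
```

Starting from a minimal network and applying only these operators is what lets structure grow incrementally (complexification).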
16
Evolving Battle Controller

Used NSGA-II with 3 objectives
Damage dealt
Damage received (negative)
Geometry collisions (negative)
Evolved in DM-1on1-Albatross
Small level to encourage combat
One native bot opponent
High score favored in selection of final network
Final combat behavior highly constrained

K. Deb et al. A Fast and Elitist Multiobjective Genetic Algorithm: NSGA-II. IEEE Trans. Evol. Comp., 2002
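
As a small illustration of the three objectives, the sketch below packs one evaluation into a tuple in which larger is better for every component, so the two "negative" objectives are negated. The `MatchStats` record and its field names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class MatchStats:
    damage_dealt: float
    damage_received: float
    geometry_collisions: int

def objectives(stats):
    """Return (damage dealt, -damage received, -collisions).

    Negating the last two turns all three into maximization objectives,
    which is one common convention for multiobjective selection.
    """
    return (
        stats.damage_dealt,
        -stats.damage_received,
        -float(stats.geometry_collisions),
    )
```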
17
Action Filtering

Network choice not always used
Forced to stand still sometimes
Sniping, not threatened, high ground
Prevented from jumping while still
Prevented from jumping near walls/opponents
Prevented from going to unwanted items
Prevented from strafing/retreating into walls
Etc
Forced lower accuracy
Forced delays to simulate human response time
Evolution constrained to look human
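
A sketch of how filters might override the network's choice. The context flags are hypothetical, treating each stand-still condition as an independent trigger is an assumption, and replacing a vetoed movement with standing still is also an assumption; the slides say which actions are prevented but not what replaces them.

```python
def filter_action(movement, jump, ctx):
    """Apply human-likeness filters to the network's chosen action.

    ctx is a dict of boolean flags describing the bot's situation.
    """
    # Forced to stand still in some situations.
    if ctx.get("sniping") or ctx.get("not_threatened") or ctx.get("high_ground"):
        movement = "stand_still"
    # Prevented from strafing or retreating into walls.
    blocked = ("strafe_left", "strafe_right", "retreat")
    if movement in blocked and ctx.get("wall_in_move_direction"):
        movement = "stand_still"  # assumed replacement action
    # Prevented from going to unwanted items.
    if movement == "move_to_nearest_item" and ctx.get("item_unwanted"):
        movement = "stand_still"  # assumed replacement action
    # Prevented from jumping while still or near walls/opponents.
    if jump and (
        movement == "stand_still"
        or ctx.get("near_wall")
        or ctx.get("near_opponent")
    ):
        jump = False
    return movement, jump
```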

18
Importance of Observing

Humans don't just want max score
Human goal is to judge correctly
Requires observation w/o fighting
Observe module
Bot hasn't judged opponent
Avoids crowds
Judging module
Lengthy observation leads to judging

19
Observation Behavior
Still
Approach
Use Battle Controller
Retreat
20
Human Subject Evaluation

BotPrize tests humanness without saying what is human-like vs. bot-like
Idea: BotPrize-style experiment in which players are extensively interviewed
IRB Human Subject Study w/ cash prizes
Performed at UT
6 human volunteers
3 human interviewers
4 versions of UT^2
Native bots

21
Justify Judgments

Record each match and replay to human
Human explains rationale for judgments
Downsides
Humans forget
Humans make things up
Humans change their minds
Still, many common themes emerged

22
Humans Aren't Killing Machines

Accuracy affected by movement/distraction
Pause before responding to surprises
Humans don't fire non-stop
Waiting for opportune shot
Saving ammo
Few weapon switches
Pause to observe

23
Humans Aren't Stupid

Humans rapidly correct mistakes
Get unstuck quickly
Move/dodge when fired upon
Don't stare at walls
Humans know their limitations
Prefer weapons requiring less accuracy
Don't fight with a weak weapon

24
Complex Human Movements

Do
Chase opponents tenaciously
Retreat while firing on opponent
Move in and out from cover
Don't
Perform many rapid movements too quickly
Turn around too quickly

25
Cognitive Issues

Theory of Mind
Behavior transitions
A chasing human expects to fight
Humans expect to be chased (traps)
Communication via judging
Human knows that its action will be recognized as human-like by humans
Emotion
Revenge on humans more satisfying
Fear of dangerous opponents

26
Conclusion

Human trace replay provides human-style exploration and gets bot unstuck
Multiobjective neuroevolution provides combat behavior
Simulated observation makes bot seem more human-like
Future work: Incorporate Theory of Mind

27
Questions?

Jacob Schrum (schrum2_at_cs.utexas.edu)
Igor V. Karpov (ikarpov_at_cs.utexas.edu)
Risto Miikkulainen (risto_at_cs.utexas.edu)

28
Auxiliary Slides
29
Multiobjective Optimization
Game with two objectives
Damage Dealt
Remaining Health
A dominates B iff A is strictly better in at least one objective and at least as good in all others
Population of points not dominated are best: Pareto Front
Weighted-sum provably incapable of capturing non-convex front
(Figure annotation: High health but did not deal much damage)
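
The dominance definition and the weighted-sum limitation can be illustrated on a toy set of (damage dealt, remaining health) scores. The numbers are invented for illustration, and the sweep over weights is a demonstration on this one example rather than a proof.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(points):
    """Return the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

def weighted_sum_best(points, weights):
    """Return the point maximizing the weighted sum of its objectives."""
    return max(points, key=lambda p: sum(w * v for w, v in zip(weights, p)))

# Hypothetical (damage dealt, remaining health) scores.
points = [(10, 1), (4, 4), (1, 10), (3, 3)]
front = pareto_front(points)  # (3, 3) is dominated by (4, 4)

# Sweep weights (w, 1 - w); (4, 4) lies in a non-convex part of the front,
# so no weighting in this sweep ever selects it.
winners = {weighted_sum_best(points, (w / 100, 1 - w / 100)) for w in range(101)}
```

Here `front` keeps the balanced point (4, 4), while `winners` contains only the two extreme points, which is the reason a Pareto-based method such as NSGA-II is used instead of a single weighted fitness.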