Transcript and Presenter's Notes

Title: GDC 2005


1
(No Transcript)
2
GDC 2006 tutorial abstract: Engineering issues
for online games
  • As the size and scope of multiplayer games
    continue to grow, the engineering requirements of
    multiplayer development expand drastically.
    Moreover, the lifecycle demands for successful
    massively multiplayer games can involve more than
    a decade of sustained development after launch.
  • This tutorial focuses on the software engineering
    challenges of producing multiplayer games,
    including single-player versus multi-player code,
    testing and regressions, security in peer to peer
    and server-oriented games, and protecting
    consumers in an increasingly dangerous online
    environment. Common fail points of single-player
    engineering tactics in a multi-player world are
    addressed, as are the longer-term issues of
    building and maintaining server-based games,
    where downtime means a direct and public loss of
    revenue and market share.
  • This slide deck contains several background
    slides, hidden in slide-show mode. Print for
    complete data.

3
Tutorial Takeaway Messages
  • Building online games with single-player game
    techniques is painful
  • Early changes to your development process &
    system architecture to accommodate online play
    will greatly ease the pain
  • Lessons learned from our background will give
    your project ways to avoid the especially
    painful places we've found ourselves in

4
Today's Schedule
  • 10:00am - Automated Testing for Online Games
    Larry Mellon, Emergent Game Technologies
  • 11:00am - Coffee break
  • 11:15am - Crash Course in Security: What it is,
    and why you need it
    Dave Weinstein, Microsoft
  • 12:30pm - Lunch break
  • 2:00pm - Integrating reliability and
    performance into the production process: How to
    survive when five nines is a must
    Neil Kirby, Bell Labs
  • 3:00pm - Single-player woes for MP design
    Gordon Walton, Bioware (Austin)
  • 4:00pm - Snack break
  • 4:15pm - Building continually updated
    technology: the MMO lifecycle
    Bill Dalton, Bioware (Austin)
  • 5:30pm - Questions & War Stories (all panelists)
  • Question to ponder for this session: Inherited
    problems. What do you do once the bad decisions
    have already been made, and you're the one who
    has to deal with them?
  • 6:00pm - End of tutorial

5
Introduction: Why Online? Our Focus
  • Non-determinism & multi-process → difficult to
    debug
  • Scale, reliability, long lifecycle, … →
    difficult to get right
6
Automated testing supports your ability to deal
with all these problems
  • Development: multi-player testing; find and
    reproduce hard bugs, @ scale; scale &
    repeatability; speed of automation
  • Operations: prediction & stability focus;
    accurate, repeatable tests
7
Handout notes: automated testing benefits
  • Multi-player inputs
  • Expensive test cycles & difficulty for both QA &
    engineers
  • Scale & non-determinism
  • Difficult bug reproduction & long, risky
    development schedules
  • Constantly evolving game play & long-term
    persistence
  • Large & frequent regression tests
  • High Quality of Service bar
  • Alternatives are costly and less effective

8
Automated Testing for Online Games (One Hour)
  • Overview
  • Hooking up your game
    - external tools
    - internal game changes
  • Applications & Gotchas
    - engineering, QA, operations
    - production management
  • Summary & Questions

9
A big green autoTest button gives controlled
tests & actionable results that help across your
team
[Diagram: one "Test Game" button drives repeatable
tests using N synchronized game clients; the
results serve the programmer, the development
director, and the executive alike]
10
Handout notes: automated testing is a strong tool
for online games!
  • Pushbutton, large-scale, repeatable tests
  • Benefits
  • Accurate, repeatable & measurable tests during
    development and operations
  • Stable software, faster & measurable progress
  • Base key decisions on fact, not opinion
  • Augment your team's ability to do their jobs,
    find problems faster
  • Measure / change / measure & repeat
  • Increased developer efficiency is key
  • Get the game out the door faster, with higher
    stability & less pain

11
Handout notes: more benefits of automated testing
  • Comfort and confidence level
  • Managers/Producers can easily judge how
    development is progressing
  • Just like bug count reports, test reports
    indicate the overall quality of the current
    state of the game
  • Frequent, repeatable tests show progress &
    backsliding
  • Investing developers in the test process helps
    prevent QA vs. Development shouting matches
  • Smart developers like numbers and metrics just as
    much as producers do
  • Making your goals: you will ship cheaper,
    better, sooner
  • Cheaper: even though initial costs may be
    higher, issues get exposed when it's cheaper to
    fix them (and developer efficiency increases)
  • Better: robust code
  • Sooner: "it's OK to ship now" is based on real
    data, not supposition

12
Automated testing accelerates online game
development & helps predictability
[Chart: percent complete vs. time, from project
start to target launch; without autoTest the trend
line misses the ship date ("Oops"), with autoTest
it reaches "Complete" on time]
13
Measurable targets & projected trends give you
actionable progress metrics, early enough to react
[Chart: any test metric (e.g. # clients) over time,
against a target at any milestone (e.g. Alpha); the
projected trend exposes an "Oops" gap early]
14
Success stories
  • Many game teams work with automated testing
  • EA, Microsoft, any MMO, …
  • Automated testing has many highly successful
    applications outside of game development
  • Caveat: there are many ways to fail

15
How to succeed
  • Plan for testing early
  • It's a non-trivial system, with architectural
    implications
  • Fast, cheap test coverage is a major change in
    production; be willing to adapt your processes
  • Make sure the entire team is on board
  • Deeper integration leads to greater value
  • Kearneyism: make it easier to use than not to
    use

16
Automated testing components
[Diagram: a Test Manager handles test selection,
setup, startup & control of N test clients running
against any online game, with real-time (RT)
probes]
17
Input systems for automated testing: scripted,
algorithmic, and recorders, all driving game code.
Multiple test applications are required, but each
input type differs in value per application.
Scripting gives the best coverage.
18
Handout notes: Input systems for automated testing
  • Multiple forms of input sources
  • Multiple sets of test types & requirements
  • Make sure the input technology you pick matches
    the test types you need
  • Cost of systems, types of testing required,
    support, cross-team needs, …
  • A single, data-driven autoTest system is
    usually the best option

19
Handout notes: Input sources (algorithmic)
  • Powerful & low development cost
  • Exploits game semantics for test coverage
  • Highly useful for some test types, but limited
    verification
  • E.g. (see the sketch below):
      for each <avatarType> CreateNewAvatar
        for each <objectCategory>
          BuyAndPlaceAllObjects <currentCategory>
          for each <Object>
            UseAllActions <currentAvatar>, <currentObject>
  • Broad, shallow test of all object-based content
  • Combine with automated errorManagers to increase
    verification, and/or currentObject.selfTest()
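
A minimal sketch of the loop above in Python; the game-API names
(create_new_avatar, buy_and_place_all_objects, use_all_actions,
self_test) are hypothetical stand-ins for whatever verbs your own
test client exposes:

    # Hypothetical algorithmic test driver: broad, shallow coverage
    # of all object-based content.
    AVATAR_TYPES = ["warrior", "mage", "rogue"]           # from game data
    OBJECT_CATEGORIES = {"furniture": ["chair", "lamp"],
                         "weapons": ["knife", "bow"]}

    def run_content_sweep(game_api, error_manager):
        for avatar_type in AVATAR_TYPES:
            avatar = game_api.create_new_avatar(avatar_type)
            for category, objects in OBJECT_CATEGORIES.items():
                game_api.buy_and_place_all_objects(category)
                for obj in objects:
                    game_api.use_all_actions(avatar, obj)
                    # Deepen verification: content checks itself, and an
                    # errorManager scans logs for newly raised errors.
                    game_api.self_test(obj)
                    error_manager.check_for_new_errors()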

20
Handout notes: Input (recorders)
  • Internal event pump / external UI actions
  • Both are brittle to maintain
  • Neither can effectively support load or
    multi-client synchronization, and both are
    limited for regression testing
  • Best use: capturing defects that are hard to
    reproduce; effective in overnight random testing
    of builds & some play testing
  • Semantic recorders are much less brittle and more
    useful

21
Input (Scripted Test Clients)
Pseudo-code script of how users play the game, and
what the game should do in response

Command steps:
  createAvatar "sam"
  enterLevel 99
  buyObject "knife"
  attack "opponent"

Validation steps:
  checkAvatar "sam" exists
  checkLevel 99 loaded
  checkInventory "knife"
  checkDamage "opponent"
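
A sketch of how such a script might execute, assuming a
hypothetical ScriptEngine that maps each step name to a
presentation-layer call and treats any failed check* step as a
test failure:

    # Hypothetical script engine for scripted test clients.
    class ScriptEngine:
        def __init__(self, presentation_layer):
            self.api = presentation_layer  # createAvatar, checkLevel, ...
            self.failures = []

        def run(self, steps):
            for name, *args in steps:
                handler = getattr(self.api, name)      # look up the verb
                ok = handler(*args)
                if name.startswith("check") and not ok:
                    self.failures.append((name, args)) # validation failed
            return not self.failures

    script = [
        ("createAvatar", "sam"), ("enterLevel", 99),
        ("buyObject", "knife"), ("attack", "opponent"),
        ("checkAvatar", "sam"), ("checkLevel", 99),
        ("checkInventory", "knife"), ("checkDamage", "opponent"),
    ]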

22
Handout notes: scripted test clients
  • Scripts are emulated play sessions, just like
    somebody playing the game
  • Command steps: what the player does to the game
  • Validation steps: what the game should do in
    response
  • Scripted clients are flexible & powerful
  • Use for many different test types
  • Quick & easy to write tests
  • Easy for non-engineers to understand & create

23
Handout notes: scripted test clients
  • Scriptable test clients
  • Lightweight subset of the shipping client
  • Instrumented: spits out lots of useful
    information
  • Repeatable
  • Embedded automated debugging support helps you
    understand the test results
  • Log both server and client output (common
    format), w/ timestamps!
  • Automated metrics collection & aggregation
  • High-level at-a-glance reports with detail
    drill-down
  • Build in support for hung clients & triaging
    failures

24
Handout notes: scripted test client
  • Support costs: one (data-driven) client is
    better than N test systems
  • Tailorable validation output is a very powerful
    construct
  • Each test script contains required validation
    steps (flexible, tunable, …)
  • Minimize the state to regress against → fewer
    false positives
  • Presentation-layer tip: build a spreadsheet of
    key words/actions used by your manual testers,
    and automate the most common/expensive ones

25
Scripted Players Implementation
[Diagram: the Script Engine and the Game GUI both
issue commands through a shared Presentation Layer,
which drives the client-side game logic]
26
Test-specific input & output via a data-driven
test client gives maximum flexibility
[Diagram: reusable scripts & data (regression,
load, …) feed the test client's Input API; its
Output API emits key game states, pass/fail &
responsiveness, and script-specific logs & metrics]
27
A Presentation Layer is often unique to a game
  • Some automation scripts should read just like QA
    test scripts for your game
  • TSO examples (sketched below):
  • routeAvatar, useObject
  • buyLot, enterLot
  • socialInteraction (makeFriends, chat, …)

NullView Client
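
One way such a layer can be structured, sketched in Python; the
verb names mirror the TSO examples above, but the bodies and the
client calls (send, wait_for_state) are invented for illustration:

    # Hypothetical presentation layer: game-specific verbs shared by
    # the GUI and the script engine, so tests read like QA scripts.
    class TSOPresentationLayer:
        def __init__(self, client):
            self.client = client        # NullView or full client

        def routeAvatar(self, avatar, x, y):
            # One QA-level verb hides many low-level client messages.
            self.client.send("ROUTE", avatar, x, y)
            return self.client.wait_for_state(avatar, "at", (x, y))

        def buyLot(self, avatar, lot_id):
            self.client.send("BUY_LOT", avatar, lot_id)
            return self.client.wait_for_state(avatar, "owns", lot_id)

        def socialInteraction(self, avatar, other, kind="chat"):
            self.client.send("SOCIAL", avatar, other, kind)
            return self.client.wait_for_state(avatar, "interacted", other)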
28
Handout notes: Scriptable & tailorable for many
applications: engineering, QA and management
  • Unit testing: 1 feature = 1 script
  • Load testing: a representative play session,
    times 1,000s
  • Make sure your servers work, before the players
    do
  • Integration: test code changes for catastrophic
    failures
  • Build stability: quickly find problems and verify
    the fix
  • Content testing: exhaustive analysis of game play
    to help tuning, ensure all assets are correctly
    hooked up, and explore edge cases
  • Multi-player testing: engineers and QA can test
    multi-player game code without requiring multiple
    manual testers
  • Performance & compatibility testing: repeatable
    tests across a broad range of hardware give you
    a precise view of where you really are
  • Project completeness: how many features pass
    their core functionality tests? What are our
    current FPS, network lag and bandwidth numbers?

29
Input (data sets)
  • Mock data → repeatable tests in development,
    faster load, edge conditions
  • Real data → the unpredictable user element
    finds different bugs
30
Input (client synchronization)
  • RemoteCommand (x) → ordered actions to clients
  • waitFor (time) → brittle, less reproducible
  • waitUntil (localStateChange) → most realistic &
    flexible
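
A sketch of the three styles with hypothetical helpers; waitUntil
polls observed game state rather than the wall clock, which is
what makes it the most realistic and least brittle:

    import time

    def remote_command(clients, command):
        # Test manager imposes an explicit ordering on all clients.
        for c in clients:
            c.execute(command)

    def wait_for(seconds):
        # Brittle: assumes the game reached the right state "by now".
        time.sleep(seconds)

    def wait_until(predicate, timeout=30.0, poll=0.1):
        # Robust: block until a local state change is actually observed.
        deadline = time.time() + timeout
        while time.time() < deadline:
            if predicate():
                return True
            time.sleep(poll)
        return False    # timed out: report it, don't hang the run

    # e.g. wait_until(lambda: client.avatar("sam").in_level(99))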
31
Common Gotchas
  • Not designing for testability
  • Retrofitting is expensive
  • Blowing the implementation
  • Code blowout
  • Addressing perceived needs, not real needs
  • Using automated testing incorrectly
  • Testing the wrong thing @ the wrong time
  • Not integrating with your processes
  • Poor testing methodology

32
Testing the wrong thing at the wrong time
Applying detailed testing while the game design
is still shifting and the code is still
incomplete introduces noise and the need to keep
re-writing tests
33
More gotchas: poor testing methodology & tools
  • Case 1: recorders
  • Load & regression were needed; the maintenance
    cost was not understood
  • Case 2: completely invalid test procedures
  • Distorted view of what really worked (GIGO)
  • Case 3: poor implementation planning
  • Limited usage (the nature of the tests led to
    high test cost & required programming skill)
  • Common theme: limited or no senior engineering
    committed to the testing problem

34
Handout notes: more gotchas
  • Automating too late, or too much detail too early
  • No ability to change the development process of
    the game
  • Not having ways to measure the effects compared
    to no automation
  • People and processes are funny things
  • Sometimes the process is changed, and sometimes
    your testing goals have to shift
  • Games differ a lot
  • autoTest approaches will vary across games

35
Handout notes: BAT vs. FAT (build acceptance vs.
feature acceptance testing)
  • Feature drift → expensive test maintenance
  • Code is built incrementally; reporting failures
    nobody is prepared to deal with yet wastes
    everybody's time
  • New tools, new concept: focus on a few areas
    first, then measure, improve, iterate

36
Automated Testing for Online Games (One Hour)
  • Overview
  • Hooking up your game
    - external tools
    - internal game changes
  • Applications
    - engineering, QA, operations
    - production management
  • Summary & Questions

37
Handout notes: Applying automated testing
  • Know what automation is good / not good at &
    play to its strengths
  • Change your processes around it
  • Establish clear measures, iteratively improve
  • Make sure everybody can use it & has bought into
    it
  • Tests become a form of communication

38
The strength of automated testing is the ability
to repeat massive numbers of simple, easily
measurable tasks and mine the results
"The difference between us and a computer is that
the computer is blindingly stupid, but it is
capable of being stupid many, many millions of
times a second." - Douglas Adams (1997 SCO Forum)
39
Handout notes: autoTest complexity
  • Automation breaks down as individual test
    complexity increases
  • Repeating simple tests hundreds of times and
    combining the results is far easier to maintain
    and analyze than using long, complex tests, and
    parallelism allows a dramatically accelerated
    test cycle

40
Semi-automated testing is best for game
development
Match your testing requirements against what
automation does best:
  • Rote work (does door108 still open?)
  • Scale
  • Repeatability
  • Accuracy
  • Parallelism

Integrate the two (automated & manual) for best
impact
41
Handout notes: Semi-automated testing
  • Automation: simple tasks (repetitive or
    large-scale)
  • Load @ scale
  • Workflow & information management
  • Regression
  • All weapon damage / broad, shallow feature
    coverage / …
  • Integrated automated & manual testing
  • Tier 1 / Tier 2: automation flags potential
    errors, manual investigates
  • Within a single test: automation snapshots key
    game states, manual evaluates the results
  • Augmented / accelerated: complex build steps,
    full level play-through, …

42
Plan your attack (retire risk early)
  • Tough shipping requirements (e.g.)
  • Scale, reliability
  • Regression costs
  • Development risk
  • Cost / risk of engineering & debugging
  • Impact on content creation
  • Management risk
  • Schedule predictability & visibility

43
Handout notes: plan your attack
  • What are the big costs & risks on your project?
  • Technology development (e.g., scalable servers)
  • Breadth of content to be regressed, frequency of
    regressions
  • Your development team is significantly
    handicapped without automated tests &
    multi-client support; focus on production
    support to start
  • Often, sufficient machines & QA testers are not
    available
  • Run-time debugging of networked games often
    becomes post-mortem debugging: slower & harder

44
Factors to consider
  • Test applications: unit, full system,
    sub-system, game logic, graphics, manual
  • Test characteristics: repeatable / random,
    frequency of use, overlap w/ other tests,
    creation & maintenance, execution
45
Handout notes: design factors
  • Test overlap & code coverage
  • Cost of running the test (graphics high,
    logic/content low) vs. frequency of test need
  • Cost of building the test vs. manual cost (over
    time)
  • Maintenance cost of the test suites & the test
    system, and the churn rate of the game code

46
Automation focus areas (Larry's top 5)
  • Load testing → scale is hard to get right
47
Handout notes: automation focus areas
(recommendations)
  • Full-system scale/stability testing
  • Multi-client & server code must always function,
    or the team slows down
  • Hardest part to get right (and to debug) when
    running live players
  • Scale will screw you, over and over again
  • Non-determinism
  • Difficulty in debugging slows development and
    hurts system reliability
  • Content regression
  • Build stability
  • Complex systems & large development teams require
    extra care to keep running smoothly, or you'll
    pay the price in slower development and more
    antacids
  • And for some systems, compatibility testing or
    installer testing
  • A data-driven system is very important: you can
    cover all the above with one test system

48
Yikes, that all sounds very expensive!
  • Yes, but remember: the alternative costs are
    higher and do not always work
  • Costs of QA for a 6-player game: you need at
    least 6 testers at the same time
  • Testers
  • Consoles, TVs and disks
  • Network connections
  • MMO regression costs: yikes²
  • 10s to 100s of testers
  • 10-year code life cycle
  • Constant release iterations

49
Stability analysis (code & servers): What brings
down the team?
Test case: Can an avatar sit in a chair?
The critical path: login() → create_avatar() →
buy_house() → enter_house() → buy_object() →
use_object()
Failures on the critical path block access to
much of the game
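
The chain can be expressed directly as a test; a sketch, with the
function names taken from the slide and an assumed api object
exposing one boolean call per step:

    # Each critical-path step depends on every step before it.
    CRITICAL_PATH = ["login", "create_avatar", "buy_house",
                     "enter_house", "buy_object", "use_object"]

    def run_critical_path(api):
        for step in CRITICAL_PATH:
            if not getattr(api, step)():
                # A failure here blocks all later steps: report the
                # first break instead of a cascade of failures.
                return f"CRITICAL PATH BROKEN at {step}()"
        return "PASS: an avatar can sit in a chair"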
50
Unstable builds are expensive & slow down your
entire team!
[Diagram: code flows from development through
checkin, build, smoke test and regression to the
dev servers; a break anywhere means repeated
detection & validation costs, impact on others,
and firefighting instead of going forward]
51
Prevent critical-path code breaks that take down
your team
[Diagram: candidate code from development must
pass a Sniff Test (pass / fail, diagnostics)
before checkin; only safe code reaches the shared
codebase]
52
Stability & non-determinism (monkey tests):
continual repetition of critical-path unit tests
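
A minimal monkey-test harness sketch: repeat the critical-path
unit tests continuously and tally outcomes, so race conditions
surface as a measurable failure rate instead of a one-off mystery:

    import collections, traceback

    def monkey_test(tests, runs=1000):
        outcomes = collections.Counter()
        for _ in range(runs):
            for test in tests:
                try:
                    outcomes[(test.__name__, test())] += 1
                except Exception:
                    outcomes[(test.__name__, "crash")] += 1
                    traceback.print_exc()   # breadcrumb for triage
        # e.g. {("enter_lot", True): 996, ("enter_lot", "crash"): 4}
        return outcomes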
53
Handout notes: build stability
  • Poor build stability slows forward progress
    (especially on the critical path)
  • People are blocked from getting work done
  • Uncertainty: did I bust it, or did it just
    happen?
  • A lot of developers just didn't get
    non-determinism
  • Backsliding: things kept breaking
  • Monkey tests: an always-current baseline for
    developers
  • A common measuring stick across builds &
    deployments is extremely valuable
  • Monkey tests rock!
  • An instant trip-wire for problems & a focusing
    device
  • Server aging: fill the pipes, get some buffers
    dirty
  • Keeps the wheels in motion while developers use
    those servers
  • An accurate measure of race-condition bugs

54
Build stability & full testing: comb filtering
Sniff tests & monkey tests are fast to run, catch
major errors, and keep coders working

New code
  • Cheap tests catch gross errors early in the
    pipeline
  • More expensive tests only run on
    known-functional builds (a pipeline sketch
    follows)
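
A sketch of the comb filter as a gated pipeline (the stage names
come from these slides; the Build fields and wiring are
illustrative): each stage runs only if the cheaper stages before
it passed, so expensive suites never run on a grossly broken build.

    from dataclasses import dataclass

    @dataclass
    class Build:                     # toy stand-in for a candidate build
        compiles: bool = True
        critical_path_ok: bool = True
        survives_repetition: bool = True
        all_features_pass: bool = False

    def comb_filter(build, stages):
        # Each stage gates the next.
        for name, passes in stages:
            if not passes(build):
                return f"build rejected at: {name}"
        return "build promoted to dev servers"

    stages = [
        ("sniff test (fast, gross errors)",
         lambda b: b.compiles and b.critical_path_ok),
        ("monkey test (hourly, stability)",
         lambda b: b.survives_repetition),
        ("full regression (slow, thorough)",
         lambda b: b.all_features_pass),
    ]

    print(comb_filter(Build(), stages))  # -> rejected at full regression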

55
Handout notes: build stability
  • Much faster progress after stability checkers
    were added
  • Sniff
  • Hourly reference tests (sniff & monkey, unit &
    monkey)
  • Comb filters kept the manpower overhead low (on
    both sides), and gave quick feedback
  • Fewer redos for engineers, fewer bugs for QA to
    find & process
  • The size of the team makes broken builds costly
  • Fewer side-effect bugs

56
Handout notes: dealing with stability
  • Hourly stability checkers (monkey tests)
  • Aging (dirty processes, growing datasets, leaking
    memory)
  • Moving parts (race conditions)
  • Stability measure: what works, right now?
  • Flares go off, etc.
  • Unit tests (against features)
  • Minimal noise / side effects
  • Reference point: what should work?
  • Clarity in reporting / triaging

57
Handout notes: event ordering and poor
transactional atomicity increase both coding
errors and the difficulty of reproducing them
[Diagram: clients A and B each send events 1 and
2 to a single server thread; near-endless (and
illogical) orderings are possible, e.g.
A1,A2,B1,B2 / A1,B1,B2,A2 / B1,B2,A1,A2 /
A2,A1,B1,B2 / B2,A2,B1,A1]
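
The slide's point is easy to reproduce: even two clients sending
two ordered events each allow six legal server-side interleavings
(each client's own events keep their relative order, everything
else is up for grabs):

    from itertools import permutations

    A, B = ["A1", "A2"], ["B1", "B2"]

    def legal(order):
        # Order within each client's stream must be preserved.
        return (order.index("A1") < order.index("A2") and
                order.index("B1") < order.index("B2"))

    print(sum(1 for p in permutations(A + B) if legal(p)))   # 6
    # This grows as C(2n, n): two clients with 10 events each
    # already allow 184,756 orderings -- hence "near-endless".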
58
Handout notes: non-determinism is a big risk
factor in online development
  • Race conditions, dirty buffers, shared state, …
  • Developers test with a single client against a
    single server: no chance to expose race
    conditions
  • Fuzzy data views over networked connections
    further complicate implementation & debugging
  • Real-time debugging is replaced with post-mortem
    analysis

59
Handout notes: the effects of non-determinism
  • Multiple CPUs / players greatly complicate
    development & testing, while also increasing
    system complexity
  • You can't reliably reproduce bugs
  • Near-infinite code-path coverage:
    variable-latency transactions over time
    introduce massive code complexity, very hard to
    get right
  • Also hard to test edge cases or broad coverage
  • Each test can execute differently on any run

60
AutoTest addresses non-determinism
  • Detection & reproduction of race-condition
    defects
  • Even low-probability errors are exposed with
    sufficient testing (random, structured, load,
    aging)
  • Measurability of race-condition defects
  • "Occurs x% of the time, over 400x test runs"
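
A sketch of how that measurability falls out: run the same test
many times and report the observed rate, so a race condition
becomes "fails 13/400 runs" instead of "works on my machine"
(run_once is a hypothetical hook for any repeatable test):

    def measure_flakiness(run_once, runs=400):
        failures = sum(1 for _ in range(runs) if not run_once())
        rate = 100.0 * failures / runs
        return f"fails {failures}/{runs} runs ({rate:.1f}% of the time)"

    # e.g. measure_flakiness(lambda: engine.run(enter_lot_script))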

61
Monkey test: enterLot()
62
Monkey test 3: enterLot()
63
Four different failures in thirty runs!
64
Handout notes: non-deterministic failures
  • 30 test runs, 4 behaviours
  • Successful entry
  • Hang or crash
  • Owner evicted, all possessions stolen
  • Random results observed in all major features
  • Critical-path random failures outside of unit
    tests are very difficult to track

65
Content testing (areas)
  • Regression
  • Error detection
  • Balancing / tuning
  • This topic is a tutorial in and of itself
  • Content regression is a huge cost & problem
  • Many ways to automate it (algorithmic, scripted,
    combined, …)
  • It differs wildly across game genres

66
Content testing (more examples)
  • Light mapping, shadow detection
  • Asset correctness / sameness
  • Compatibility testing
  • Armor / damage
  • Class balances
  • Validating against old userData
  • (unique to each game)

67
Load testing, before paying customers show up
  • Expose issues that only occur at scale

Establish hardware requirements
Establish that play is acceptable @ scale
68
Handout notes: some examples of things caught
with load testing
  • Non-scalable algorithms
  • Server-side dirty buffers
  • Race conditions
  • Data bloat & clogged pipes
  • Poor end-user performance @ scale
  • You never really know what, but something will
    always go spang! @ scale

69
Load testing catches non-scalable designs
Global data: in single-player (SP), all data is
always available & up to date
Scalability is hard: shared data grows with
players, AI, objects, terrain, … more bugs!
70
Handout notes: why you need load testing
  • SP: all information is always available
  • MP: shared information must be packaged,
    transmitted and unpackaged
  • Each step costs CPU & bandwidth, and can happen
    10s to 100s of times per minute
  • May also cause additional overhead (e.g. DB
    calls)
  • Scalability is key: many shared data structures
    grow with the number of players, AI, objects,
    terrain, …
  • Caution: early prototypes may be cheap enough,
    but as the game progresses, costs may explode

71
Handout notes: why you need load testing
  • Case 1, initial design: transmit the entire
    lotList to all connected clients, every 30
    seconds
  • Initial fielding: no problem
  • Development testing: < 1,000 lots, < 10 clients
  • Complete disaster as clients & DB scaled
  • Shipping requirements: 100,000 lots, 4,000
    clients
  • DO THE MATH BEFORE CODING
  • LotElementSize × LotListSize × NumClients
  • = 20 bytes × 100,000 × 4,000
  • = 8,000,000,000 bytes, TWICE per minute!!

72
Load testing: find poor resource utilization
22,000,000 DS queries! (7,000 next highest)
73
Load test both client & server behaviors
74
Handout notes: automated data mining / triage
  • Test results → patterns of failures
  • Bug rate to source file comparison
  • Easy historical mining & results comparison
  • Triage & debugging aids that extract RT data from
    the game
  • Timeout & crash handlers
  • errorManagers
  • Log parsers
  • Scriptable verification conditions

75
Automated Testing for Online Games (One Hour)
  • Overview
  • Hooking up your game
    - external tools
    - internal game changes
  • Applications
    - engineering, QA, operations
    - production management
  • Summary & Questions

76
Summary: automated testing
  • Start early & make it easy to use
  • Strongly impacts your success
  • The bigger & more complex your game, the more
    automated testing you need
  • You need commitment across the team
  • Engineering, QA, management, content creation

77
Resources
  • Slides are on the web at www.emergent.net
  • My email: larry.mellon@emergent.net
  • More material on automated testing for games
  • http://www.maggotranch.com/mmp.html
  • Last year's online engineering slides
  • Talks on automated testing & scaling the
    development process
  • www.amazon.com: Massively Multiplayer Game
    Development II
  • Chapters on automated testing and automated
    metrics systems
  • www.gamasutra.com: Dag Frommhold, Fabian Röken
  • Lengthy article on applying automated testing in
    games
  • Microsoft: various groups' writings
  • From outside the gaming world
  • Kent Beck: anything on test-driven development
  • http://www.martinfowler.com/articles/continuousIntegration.html#id108619
    - continuous integration testing
  • Amazon & Google: inside & outside our industry