Transcript and Presenter's Notes

Title: GDC 2005


1
(No Transcript)
2
GDC 2006 tutorial abstract: Engineering issues
for online games
  • As the size and scope of multiplayer games
    continue to grow, the engineering requirements of
    multiplayer development expand drastically.
    Moreover, the lifecycle demands for successful
    massively multiplayer games can involve more than
    a decade of sustained development after launch.
  • This tutorial focuses on the software engineering
    challenges of producing multiplayer games,
    including single-player versus multi-player code,
    testing and regressions, security in peer to peer
    and server-oriented games, and protecting
    consumers in an increasingly dangerous online
    environment. Common fail points of single-player
    engineering tactics in a multi-player world are
    addressed, as are the longer-term issues of
    building and maintaining server-based games,
    where downtime means a direct and public loss of
    revenue and market share.
  • This slide deck contains several background
    slides, hidden in slide-show mode. Print for
    complete data.

3
Tutorial Takeaway Messages
  • Building online games with single-player game
    techniques is painful
  • Early changes to your development process &
    system architecture to accommodate online play
    will greatly ease the pain
  • Lessons learned from our background will give
    your project ways to avoid the especially
    painful places we've found ourselves in

4
Today's Schedule
  • 10:00am - Automated Testing for Online Games
    Larry Mellon, Emergent Game Technologies
  • 11:00am - Coffee break
  • 11:15am - Crash Course in Security: What it is,
    and why you need it
    Dave Weinstein, Microsoft
  • 12:30pm - Lunch break
  • 2:00pm - Integrating reliability and
    performance into the production process: How to
    survive when five nines is a must
    Neil Kirby, Bell Labs
  • 3:00pm - Single-player woes for MP design
    Gordon Walton, Bioware (Austin)
  • 4:00pm - Snack break
  • 4:15pm - Building continually updated
    technology: the MMO lifecycle
    Bill Dalton, Bioware (Austin)
  • 5:30pm - Questions & War Stories (all panelists)
  • Question to ponder for this session: Inherited
    problems. What do you do once the bad decisions
    have already been made, and you're the one who
    has to deal with them?
  • 6:00pm - End of tutorial

5
Introduction: Why Online? Our Focus
  • Non-determinism & multi-process → difficult to
    debug
  • Scale, reliability, long lifecycle, … →
    difficult to get right
6
Automated testing supports your ability to deal
with all these problems
  • Development: multi-player testing; find and
    reproduce hard bugs, @ scale; scale &
    repeatability; speed of automation
  • Operations: prediction & stability focus;
    accurate, repeatable tests
7
Handout notes: automated testing benefits
  • Multi-player inputs
  • Expensive test cycles & difficulty for both QA &
    engineers
  • Scale & non-determinism
  • Difficult bug reproduction & long, risky
    development schedules
  • Constantly evolving game play & long-term
    persistence
  • Large & frequent regression tests
  • High Quality of Service bar
  • Alternatives are costly and less effective

8
Automated Testing for Online Games (One Hour)
  • Overview
  • Hooking up your game
    - external tools
    - internal game changes
  • Applications & Gotchas
    - engineering, QA, operations
    - production management
  • Summary & Questions

9
A big green autoTest button gives controlled
tests & actionable results that help across your
team
[Diagram: one "Test Game" button drives repeatable
tests using N synchronized game clients; the
results serve the programmer, the development
director, and the executive alike]
10
Handout notes: automated testing is a strong tool
for online games!
  • Pushbutton, large-scale, repeatable tests
  • Benefits
  • Accurate, repeatable & measurable tests during
    development and operations
  • Stable software, faster & measurable progress
  • Base key decisions on fact, not opinion
  • Augment your team's ability to do their jobs,
    find problems faster
  • Measure / change / measure & repeat
  • Increased developer efficiency is key
  • Get the game out the door faster, with higher
    stability & less pain

11
Handout notes: more benefits of automated testing
  • Comfort and confidence level
  • Managers/Producers can easily judge how
    development is progressing
  • Just like bug count reports, test reports
    indicate the overall quality of the current
    state of the game
  • Frequent, repeatable tests show progress &
    backsliding
  • Investing developers in the test process helps
    prevent QA vs. Development shouting matches
  • Smart developers like numbers and metrics just as
    much as producers do
  • Making your goals: you will ship cheaper,
    better, sooner
  • Cheaper: even though initial costs may be
    higher, issues get exposed when it's cheaper to
    fix them (and developer efficiency increases)
  • Better: robust code
  • Sooner: "it's OK to ship now" is based on real
    data, not supposition

12
Automated testing accelerates online game
development & helps predictability
[Chart: percent complete vs. time, from project
start to target launch; without autoTest the trend
line misses the ship date ("Oops"), with autoTest
it reaches "Complete" on time]
13
Measurable targets & projected trends give you
actionable progress metrics, early enough to react
[Chart: any test metric (e.g. # clients) over time,
against a target at any milestone (e.g. Alpha); the
projected trend exposes an "Oops" gap early]
14
Success stories
  • Many game teams work with automated testing
  • EA, Microsoft, any MMO, …
  • Automated testing has many highly successful
    applications outside of game development
  • Caveat: there are many ways to fail

15
How to succeed
  • Plan for testing early
  • It's a non-trivial system, with architectural
    implications
  • Fast, cheap test coverage is a major change in
    production; be willing to adapt your processes
  • Make sure the entire team is on board
  • Deeper integration leads to greater value
  • Kearneyism: make it easier to use than not to
    use

16
Automated testing components
[Diagram: a Test Manager handles test selection,
setup, startup & control of N test clients running
against any online game, with real-time (RT)
probes]
17
Input systems for automated testing: scripted,
algorithmic, and recorders, all driving game code.
Multiple test applications are required, but each
input type differs in value per application.
Scripting gives the best coverage.
18
Handout notes: Input systems for automated testing
  • Multiple forms of input sources
  • Multiple sets of test types & requirements
  • Make sure the input technology you pick matches
    the test types you need
  • Cost of systems, types of testing required,
    support, cross-team needs, …
  • A single, data-driven autoTest system is
    usually the best option

19
Handout notes: Input sources (algorithmic)
  • Powerful & low development cost
  • Exploits game semantics for test coverage
  • Highly useful for some test types, but limited
    verification
  • E.g. (see the sketch below):
      for each <avatarType> CreateNewAvatar
        for each <objectCategory>
          BuyAndPlaceAllObjects <currentCategory>
          for each <Object>
            UseAllActions <currentAvatar>, <currentObject>
  • Broad, shallow test of all object-based content
  • Combine with automated errorManagers to increase
    verification, and/or currentObject.selfTest()
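
A minimal sketch of the loop above in Python; the game-API names
(create_new_avatar, buy_and_place_all_objects, use_all_actions,
self_test) are hypothetical stand-ins for whatever verbs your own
test client exposes:

    # Hypothetical algorithmic test driver: broad, shallow coverage
    # of all object-based content.
    AVATAR_TYPES = ["warrior", "mage", "rogue"]           # from game data
    OBJECT_CATEGORIES = {"furniture": ["chair", "lamp"],
                         "weapons": ["knife", "bow"]}

    def run_content_sweep(game_api, error_manager):
        for avatar_type in AVATAR_TYPES:
            avatar = game_api.create_new_avatar(avatar_type)
            for category, objects in OBJECT_CATEGORIES.items():
                game_api.buy_and_place_all_objects(category)
                for obj in objects:
                    game_api.use_all_actions(avatar, obj)
                    # Deepen verification: content checks itself, and an
                    # errorManager scans logs for newly raised errors.
                    game_api.self_test(obj)
                    error_manager.check_for_new_errors()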

20
Handout notes: Input (recorders)
  • Internal event pump / external UI actions
  • Both are brittle to maintain
  • Neither can effectively support load or
    multi-client synchronization, and both are
    limited for regression testing
  • Best use: capturing defects that are hard to
    reproduce; effective in overnight random testing
    of builds & some play testing
  • Semantic recorders are much less brittle and more
    useful

21
Input (Scripted Test Clients)
Pseudo-code script of how users play the game, and
what the game should do in response

Command steps:
  createAvatar "sam"
  enterLevel 99
  buyObject "knife"
  attack "opponent"

Validation steps:
  checkAvatar "sam" exists
  checkLevel 99 loaded
  checkInventory "knife"
  checkDamage "opponent"
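
A sketch of how such a script might execute, assuming a
hypothetical ScriptEngine that maps each step name to a
presentation-layer call and treats any failed check* step as a
test failure:

    # Hypothetical script engine for scripted test clients.
    class ScriptEngine:
        def __init__(self, presentation_layer):
            self.api = presentation_layer  # createAvatar, checkLevel, ...
            self.failures = []

        def run(self, steps):
            for name, *args in steps:
                handler = getattr(self.api, name)      # look up the verb
                ok = handler(*args)
                if name.startswith("check") and not ok:
                    self.failures.append((name, args)) # validation failed
            return not self.failures

    script = [
        ("createAvatar", "sam"), ("enterLevel", 99),
        ("buyObject", "knife"), ("attack", "opponent"),
        ("checkAvatar", "sam"), ("checkLevel", 99),
        ("checkInventory", "knife"), ("checkDamage", "opponent"),
    ]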

22
Handout notes: scripted test clients
  • Scripts are emulated play sessions, just like
    somebody playing the game
  • Command steps: what the player does to the game
  • Validation steps: what the game should do in
    response
  • Scripted clients are flexible & powerful
  • Use for many different test types
  • Quick & easy to write tests
  • Easy for non-engineers to understand & create

23
Handout notes: scripted test clients
  • Scriptable test clients
  • Lightweight subset of the shipping client
  • Instrumented: spits out lots of useful
    information
  • Repeatable
  • Embedded automated debugging support helps you
    understand the test results
  • Log both server and client output (common
    format), w/ timestamps!
  • Automated metrics collection & aggregation
  • High-level at-a-glance reports with detail
    drill-down
  • Build in support for hung clients & triaging
    failures

24
Handout notes: scripted test client
  • Support costs: one (data-driven) client is
    better than N test systems
  • Tailorable validation output is a very powerful
    construct
  • Each test script contains required validation
    steps (flexible, tunable, …)
  • Minimize the state to regress against → fewer
    false positives
  • Presentation-layer tip: build a spreadsheet of
    key words/actions used by your manual testers,
    and automate the most common/expensive ones

25
Scripted Players Implementation
[Diagram: the Script Engine and the Game GUI both
issue commands through a shared Presentation Layer,
which drives the client-side game logic]
26
Test-specific input & output via a data-driven
test client gives maximum flexibility
[Diagram: reusable scripts & data (regression,
load, …) feed the test client's Input API; its
Output API emits key game states, pass/fail &
responsiveness, and script-specific logs & metrics]
27
A Presentation Layer is often unique to a game
  • Some automation scripts should read just like QA
    test scripts for your game
  • TSO examples (sketched below):
  • routeAvatar, useObject
  • buyLot, enterLot
  • socialInteraction (makeFriends, chat, …)

NullView Client
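
One way such a layer can be structured, sketched in Python; the
verb names mirror the TSO examples above, but the bodies and the
client calls (send, wait_for_state) are invented for illustration:

    # Hypothetical presentation layer: game-specific verbs shared by
    # the GUI and the script engine, so tests read like QA scripts.
    class TSOPresentationLayer:
        def __init__(self, client):
            self.client = client        # NullView or full client

        def routeAvatar(self, avatar, x, y):
            # One QA-level verb hides many low-level client messages.
            self.client.send("ROUTE", avatar, x, y)
            return self.client.wait_for_state(avatar, "at", (x, y))

        def buyLot(self, avatar, lot_id):
            self.client.send("BUY_LOT", avatar, lot_id)
            return self.client.wait_for_state(avatar, "owns", lot_id)

        def socialInteraction(self, avatar, other, kind="chat"):
            self.client.send("SOCIAL", avatar, other, kind)
            return self.client.wait_for_state(avatar, "interacted", other)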
28
Handout notes: Scriptable & tailorable for many
applications: engineering, QA and management
  • Unit testing: 1 feature = 1 script
  • Load testing: a representative play session,
    times 1,000s
  • Make sure your servers work, before the players
    do
  • Integration: test code changes for catastrophic
    failures
  • Build stability: quickly find problems and verify
    the fix
  • Content testing: exhaustive analysis of game play
    to help tuning, ensure all assets are correctly
    hooked up, and explore edge cases
  • Multi-player testing: engineers and QA can test
    multi-player game code without requiring multiple
    manual testers
  • Performance & compatibility testing: repeatable
    tests across a broad range of hardware give you
    a precise view of where you really are
  • Project completeness: how many features pass
    their core functionality tests? What are our
    current FPS, network lag and bandwidth numbers?

29
Input (data sets)
  • Mock data → repeatable tests in development,
    faster load, edge conditions
  • Real data → the unpredictable user element
    finds different bugs
30
Input (client synchronization)
  • RemoteCommand (x) → ordered actions to clients
  • waitFor (time) → brittle, less reproducible
  • waitUntil (localStateChange) → most realistic &
    flexible
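
A sketch of the three styles with hypothetical helpers; waitUntil
polls observed game state rather than the wall clock, which is
what makes it the most realistic and least brittle:

    import time

    def remote_command(clients, command):
        # Test manager imposes an explicit ordering on all clients.
        for c in clients:
            c.execute(command)

    def wait_for(seconds):
        # Brittle: assumes the game reached the right state "by now".
        time.sleep(seconds)

    def wait_until(predicate, timeout=30.0, poll=0.1):
        # Robust: block until a local state change is actually observed.
        deadline = time.time() + timeout
        while time.time() < deadline:
            if predicate():
                return True
            time.sleep(poll)
        return False    # timed out: report it, don't hang the run

    # e.g. wait_until(lambda: client.avatar("sam").in_level(99))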
31
Common Gotchas
  • Not designing for testability
  • Retrofitting is expensive
  • Blowing the implementation
  • Code blowout
  • Addressing perceived needs, not real needs
  • Using automated testing incorrectly
  • Testing the wrong thing @ the wrong time
  • Not integrating with your processes
  • Poor testing methodology

32
Testing the wrong thing at the wrong time
Applying detailed testing while the game design
is still shifting and the code is still
incomplete introduces noise and the need to keep
re-writing tests
33
More gotchas: poor testing methodology & tools
  • Case 1: recorders
  • Load & regression were needed; the maintenance
    cost was not understood
  • Case 2: completely invalid test procedures
  • Distorted view of what really worked (GIGO)
  • Case 3: poor implementation planning
  • Limited usage (the nature of the tests led to
    high test cost & required programming skill)
  • Common theme: limited or no senior engineering
    committed to the testing problem

34
Handout notes: more gotchas
  • Automating too late, or too much detail too early
  • No ability to change the development process of
    the game
  • Not having ways to measure the effects compared
    to no automation
  • People and processes are funny things
  • Sometimes the process is changed, and sometimes
    your testing goals have to shift
  • Games differ a lot
  • autoTest approaches will vary across games

35
Handout notes: BAT vs. FAT (build acceptance vs.
feature acceptance testing)
  • Feature drift → expensive test maintenance
  • Code is built incrementally; reporting failures
    nobody is prepared to deal with yet wastes
    everybody's time
  • New tools, new concept: focus on a few areas
    first, then measure, improve, iterate

36
Automated Testing for Online Games (One Hour)
  • Overview
  • Hooking up your game
    - external tools
    - internal game changes
  • Applications
    - engineering, QA, operations
    - production management
  • Summary & Questions

37
Handout notes: Applying automated testing
  • Know what automation is good / not good at &
    play to its strengths
  • Change your processes around it
  • Establish clear measures, iteratively improve
  • Make sure everybody can use it & has bought into
    it
  • Tests become a form of communication

38
The strength of automated testing is the ability
to repeat massive numbers of simple, easily
measurable tasks and mine the results
"The difference between us and a computer is that
the computer is blindingly stupid, but it is
capable of being stupid many, many millions of
times a second." - Douglas Adams (1997 SCO Forum)
39
Handout notes: autoTest complexity
  • Automation breaks down as individual test
    complexity increases
  • Repeating simple tests hundreds of times and
    combining the results is far easier to maintain
    and analyze than using long, complex tests, and
    parallelism allows a dramatically accelerated
    test cycle

40
Semi-automated testing is best for game
development
Match your testing requirements against what
automation does best:
  • Rote work (does door108 still open?)
  • Scale
  • Repeatability
  • Accuracy
  • Parallelism

Integrate the two (automated & manual) for best
impact
41
Handout notes: Semi-automated testing
  • Automation: simple tasks (repetitive or
    large-scale)
  • Load @ scale
  • Workflow & information management
  • Regression
  • All weapon damage / broad, shallow feature
    coverage / …
  • Integrated automated & manual testing
  • Tier 1 / Tier 2: automation flags potential
    errors, manual investigates
  • Within a single test: automation snapshots key
    game states, manual evaluates the results
  • Augmented / accelerated: complex build steps,
    full level play-through, …

42
Plan your attack (retire risk early)
  • Tough shipping requirements (e.g.)
  • Scale, reliability
  • Regression costs
  • Development risk
  • Cost / risk of engineering & debugging
  • Impact on content creation
  • Management risk
  • Schedule predictability & visibility

43
Handout notes: plan your attack
  • What are the big costs & risks on your project?
  • Technology development (e.g., scalable servers)
  • Breadth of content to be regressed, frequency of
    regressions
  • Your development team is significantly
    handicapped without automated tests &
    multi-client support; focus on production
    support to start
  • Often, sufficient machines & QA testers are not
    available
  • Run-time debugging of networked games often
    becomes post-mortem debugging: slower & harder

44
Factors to consider
  • Test applications: unit, full system,
    sub-system, game logic, graphics, manual
  • Test characteristics: repeatable / random,
    frequency of use, overlap w/ other tests,
    creation & maintenance, execution
45
Handout notes: design factors
  • Test overlap & code coverage
  • Cost of running the test (graphics high,
    logic/content low) vs. frequency of test need
  • Cost of building the test vs. manual cost (over
    time)
  • Maintenance cost of the test suites & the test
    system, and the churn rate of the game code

46
Automation focus areas (Larry's top 5)
  • Load testing → scale is hard to get right
47
Handout notes: automation focus areas
(recommendations)
  • Full-system scale/stability testing
  • Multi-client & server code must always function,
    or the team slows down
  • Hardest part to get right (and to debug) when
    running live players
  • Scale will screw you, over and over again
  • Non-determinism
  • Difficulty in debugging slows development and
    hurts system reliability
  • Content regression
  • Build stability
  • Complex systems & large development teams require
    extra care to keep running smoothly, or you'll
    pay the price in slower development and more
    antacids
  • And for some systems, compatibility testing or
    installer testing
  • A data-driven system is very important: you can
    cover all the above with one test system

48
Yikes, that all sounds very expensive!
  • Yes, but remember: the alternative costs are
    higher and do not always work
  • Costs of QA for a 6-player game: you need at
    least 6 testers at the same time
  • Testers
  • Consoles, TVs and disks
  • Network connections
  • MMO regression costs: yikes²
  • 10s to 100s of testers
  • 10-year code life cycle
  • Constant release iterations

49
Stability analysis (code & servers): What brings
down the team?
Test case: Can an avatar sit in a chair?
The critical path: login() → create_avatar() →
buy_house() → enter_house() → buy_object() →
use_object()
Failures on the critical path block access to
much of the game
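
The chain can be expressed directly as a test; a sketch, with the
function names taken from the slide and an assumed api object
exposing one boolean call per step:

    # Each critical-path step depends on every step before it.
    CRITICAL_PATH = ["login", "create_avatar", "buy_house",
                     "enter_house", "buy_object", "use_object"]

    def run_critical_path(api):
        for step in CRITICAL_PATH:
            if not getattr(api, step)():
                # A failure here blocks all later steps: report the
                # first break instead of a cascade of failures.
                return f"CRITICAL PATH BROKEN at {step}()"
        return "PASS: an avatar can sit in a chair"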
50
Unstable builds are expensive & slow down your
entire team!
[Diagram: code flows from development through
checkin, build, smoke test and regression to the
dev servers; a break anywhere means repeated
detection & validation costs, impact on others,
and firefighting instead of going forward]
51
Prevent critical-path code breaks that take down
your team
[Diagram: candidate code from development must
pass a Sniff Test (pass / fail, diagnostics)
before checkin; only safe code reaches the shared
codebase]
52
Stability & non-determinism (monkey tests):
continual repetition of critical-path unit tests
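
A minimal monkey-test harness sketch: repeat the critical-path
unit tests continuously and tally outcomes, so race conditions
surface as a measurable failure rate instead of a one-off mystery:

    import collections, traceback

    def monkey_test(tests, runs=1000):
        outcomes = collections.Counter()
        for _ in range(runs):
            for test in tests:
                try:
                    outcomes[(test.__name__, test())] += 1
                except Exception:
                    outcomes[(test.__name__, "crash")] += 1
                    traceback.print_exc()   # breadcrumb for triage
        # e.g. {("enter_lot", True): 996, ("enter_lot", "crash"): 4}
        return outcomes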
53
Handout notes: build stability
  • Poor build stability slows forward progress
    (especially on the critical path)
  • People are blocked from getting work done
  • Uncertainty: did I bust it, or did it just
    happen?
  • A lot of developers just didn't get
    non-determinism
  • Backsliding: things kept breaking
  • Monkey tests: an always-current baseline for
    developers
  • A common measuring stick across builds &
    deployments is extremely valuable
  • Monkey tests rock!
  • An instant trip-wire for problems & a focusing
    device
  • Server aging: fill the pipes, get some buffers
    dirty
  • Keeps the wheels in motion while developers use
    those servers
  • An accurate measure of race-condition bugs

54
Build stability & full testing: comb filtering
Sniff tests & monkey tests are fast to run, catch
major errors, and keep coders working

New code
  • Cheap tests catch gross errors early in the
    pipeline
  • More expensive tests only run on
    known-functional builds (a pipeline sketch
    follows)
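
A sketch of the comb filter as a gated pipeline (the stage names
come from these slides; the Build fields and wiring are
illustrative): each stage runs only if the cheaper stages before
it passed, so expensive suites never run on a grossly broken build.

    from dataclasses import dataclass

    @dataclass
    class Build:                     # toy stand-in for a candidate build
        compiles: bool = True
        critical_path_ok: bool = True
        survives_repetition: bool = True
        all_features_pass: bool = False

    def comb_filter(build, stages):
        # Each stage gates the next.
        for name, passes in stages:
            if not passes(build):
                return f"build rejected at: {name}"
        return "build promoted to dev servers"

    stages = [
        ("sniff test (fast, gross errors)",
         lambda b: b.compiles and b.critical_path_ok),
        ("monkey test (hourly, stability)",
         lambda b: b.survives_repetition),
        ("full regression (slow, thorough)",
         lambda b: b.all_features_pass),
    ]

    print(comb_filter(Build(), stages))  # -> rejected at full regression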

55
Handout notes: build stability
  • Much faster progress after stability checkers
    were added
  • Sniff
  • Hourly reference tests (sniff & monkey, unit &
    monkey)
  • Comb filters kept the manpower overhead low (on
    both sides), and gave quick feedback
  • Fewer redos for engineers, fewer bugs for QA to
    find & process
  • The size of the team makes broken builds costly
  • Fewer side-effect bugs

56
Handout notes: dealing with stability
  • Hourly stability checkers (monkey tests)
  • Aging (dirty processes, growing datasets, leaking
    memory)
  • Moving parts (race conditions)
  • Stability measure: what works, right now?
  • Flares go off, etc.
  • Unit tests (against features)
  • Minimal noise / side effects
  • Reference point: what should work?
  • Clarity in reporting / triaging

57
Handout notes: event ordering and poor
transactional atomicity increase both coding
errors and the difficulty of reproducing them
[Diagram: clients A and B each send events 1 and
2 to a single server thread; near-endless (and
illogical) orderings are possible, e.g.
A1,A2,B1,B2 / A1,B1,B2,A2 / B1,B2,A1,A2 /
A2,A1,B1,B2 / B2,A2,B1,A1]
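
The slide's point is easy to reproduce: even two clients sending
two ordered events each allow six legal server-side interleavings
(each client's own events keep their relative order, everything
else is up for grabs):

    from itertools import permutations

    A, B = ["A1", "A2"], ["B1", "B2"]

    def legal(order):
        # Order within each client's stream must be preserved.
        return (order.index("A1") < order.index("A2") and
                order.index("B1") < order.index("B2"))

    print(sum(1 for p in permutations(A + B) if legal(p)))   # 6
    # This grows as C(2n, n): two clients with 10 events each
    # already allow 184,756 orderings -- hence "near-endless".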
58
Handout notes: non-determinism is a big risk
factor in online development
  • Race conditions, dirty buffers, shared state, …
  • Developers test with a single client against a
    single server: no chance to expose race
    conditions
  • Fuzzy data views over networked connections
    further complicate implementation & debugging
  • Real-time debugging is replaced with post-mortem
    analysis

59
Handout notes: the effects of non-determinism
  • Multiple CPUs / players greatly complicate
    development & testing, while also increasing
    system complexity
  • You can't reliably reproduce bugs
  • Near-infinite code-path coverage:
    variable-latency transactions over time
    introduce massive code complexity, very hard to
    get right
  • Also hard to test edge cases or broad coverage
  • Each test can execute differently on any run

60
AutoTest addresses non-determinism
  • Detection & reproduction of race-condition
    defects
  • Even low-probability errors are exposed with
    sufficient testing (random, structured, load,
    aging)
  • Measurability of race-condition defects
  • "Occurs x% of the time, over 400x test runs"
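
A sketch of how that measurability falls out: run the same test
many times and report the observed rate, so a race condition
becomes "fails 13/400 runs" instead of "works on my machine"
(run_once is a hypothetical hook for any repeatable test):

    def measure_flakiness(run_once, runs=400):
        failures = sum(1 for _ in range(runs) if not run_once())
        rate = 100.0 * failures / runs
        return f"fails {failures}/{runs} runs ({rate:.1f}% of the time)"

    # e.g. measure_flakiness(lambda: engine.run(enter_lot_script))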

61
Monkey test: enterLot()
62
Monkey test 3: enterLot()
63
Four different failures in thirty runs!
64
Handout notes: non-deterministic failures
  • 30 test runs, 4 behaviours
  • Successful entry
  • Hang or crash
  • Owner evicted, all possessions stolen
  • Random results observed in all major features
  • Critical-path random failures outside of unit
    tests are very difficult to track

65
Content testing (areas)
  • Regression
  • Error detection
  • Balancing / tuning
  • This topic is a tutorial in and of itself
  • Content regression is a huge cost & problem
  • Many ways to automate it (algorithmic, scripted,
    combined, …)
  • It differs wildly across game genres

66
Content testing (more examples)
  • Light mapping, shadow detection
  • Asset correctness / sameness
  • Compatibility testing
  • Armor / damage
  • Class balances
  • Validating against old userData
  • (unique to each game)

67
Load testing, before paying customers show up
  • Expose issues that only occur at scale

Establish hardware requirements
Establish that play is acceptable @ scale
68
Handout notes: some examples of things caught
with load testing
  • Non-scalable algorithms
  • Server-side dirty buffers
  • Race conditions
  • Data bloat & clogged pipes
  • Poor end-user performance @ scale
  • You never really know what, but something will
    always go spang! @ scale

69
Load testing catches non-scalable designs
Global data: in single-player (SP), all data is
always available & up to date
Scalability is hard: shared data grows with
players, AI, objects, terrain, … more bugs!
70
Handout notes: why you need load testing
  • SP: all information is always available
  • MP: shared information must be packaged,
    transmitted and unpackaged
  • Each step costs CPU & bandwidth, and can happen
    10s to 100s of times per minute
  • May also cause additional overhead (e.g. DB
    calls)
  • Scalability is key: many shared data structures
    grow with the number of players, AI, objects,
    terrain, …
  • Caution: early prototypes may be cheap enough,
    but as the game progresses, costs may explode

71
Handout notes: why you need load testing
  • Case 1, initial design: transmit the entire
    lotList to all connected clients, every 30
    seconds
  • Initial fielding: no problem
  • Development testing: < 1,000 lots, < 10 clients
  • Complete disaster as clients & DB scaled
  • Shipping requirements: 100,000 lots, 4,000
    clients
  • DO THE MATH BEFORE CODING
  • LotElementSize × LotListSize × NumClients
  • = 20 bytes × 100,000 × 4,000
  • = 8,000,000,000 bytes, TWICE per minute!!

72
Load testing: find poor resource utilization
22,000,000 DS queries! (7,000 next highest)
73
Load test both client & server behaviors
74
Handout notes: automated data mining / triage
  • Test results → patterns of failures
  • Bug rate to source file comparison
  • Easy historical mining & results comparison
  • Triage & debugging aids that extract RT data from
    the game
  • Timeout & crash handlers
  • errorManagers
  • Log parsers
  • Scriptable verification conditions

75
Automated Testing for Online Games (One Hour)
  • Overview
  • Hooking up your game
    - external tools
    - internal game changes
  • Applications
    - engineering, QA, operations
    - production management
  • Summary & Questions

76
Summary: automated testing
  • Start early & make it easy to use
  • Strongly impacts your success
  • The bigger & more complex your game, the more
    automated testing you need
  • You need commitment across the team
  • Engineering, QA, management, content creation

77
Resources
  • Slides are on the web at www.emergent.net
  • My email: larry.mellon@emergent.net
  • More material on automated testing for games
  • http://www.maggotranch.com/mmp.html
  • Last year's online engineering slides
  • Talks on automated testing & scaling the
    development process
  • www.amazon.com: Massively Multiplayer Game
    Development II
  • Chapters on automated testing and automated
    metrics systems
  • www.gamasutra.com: Dag Frommhold, Fabian Röken
  • Lengthy article on applying automated testing in
    games
  • Microsoft: various groups' writings
  • From outside the gaming world
  • Kent Beck: anything on test-driven development
  • http://www.martinfowler.com/articles/continuousIntegration.html#id108619
    - continuous integration testing
  • Amazon & Google: inside & outside our industry