Metrics for MMP Development and Operations Lessons Learned: The Sims Online

Transcript and Presenter's Notes



1
Metrics forMMP Development and
OperationsLessons Learned The Sims Online
  • Larry Mellon
  • GDC, Spring 2004

2
Metrics Catch-22
Useful
Careful
GI / GO
Hard Data
Optimization Tool
Expensive
3
Key Points
  • What you measure becomes what people base
    critical decisions on
  • Pick carefully (example: refactoring)
  • Red herrings: tracking the wrong metric produces
    the wrong result
  • Cross-check the numbers
  • Metrics bugs cause great confusion (garbage in /
    garbage out)
  • Misinformation: bad data that people then take
    action on leads to bad results

4
Importance of Metrics is Relative
Mark Twain
Lord Kelvin
Measure Everything
Measure Just Enough
5
Key Points
  • Need some form of scale to estimate how big a
    club to use on the metrics problem for your
    project
  • Everybody needs some level, but how much to
    spend?
  • Two commentators on metrics make good icons for
    the end points
  • Lord Kelvin: smart enough to get a scale named
    after him; a measure-everything, spare-no-expense
    kind of guy
  • Mark Twain: metrics have their uses, but he
    viewed them with a jaundiced eye
  • Be careful (pick well, cross-check)
  • Don't go overboard (metrics have uses, but are no
    silver bullet, and can hurt you)

6
Pro-Metrics: Lord Kelvin
  • "I often say that when you can measure what you
    are speaking about, and express it in numbers, you
    know something about it;
  • but when you cannot express it in numbers, your
    knowledge is of a meagre and unsatisfactory
    kind."
  • Institution of Civil Engineers, 1883

7
Mark Twain: Caveat Emptor / Siren Song
  • "Figures often beguile me, particularly when I
    have the arranging of them myself"
  • "There are three kinds of lies:"

Lies
Damned Lies
Statistics
8
Invest No More Than You Need
Kelvin
Twain


9
Key Points
  • Notional scale: Lord Kelvin vs. Mark Twain
  • Kelvin: the complexity of a system under study
    requires fine-grain visibility into many variables
  • Twain: the practical man's measurements, cut to
    fit; good enough, roughly correct
  • What the scale tells you about your problem is
    the size/complexity of your metrics system
  • Big metrics systems are expensive
  • Don't go postal (unless you need to)
  • Build no more than you need (why measure beyond
    what you care about, in precision, frequency,
    depth, or breadth?)

10
MMP Measurement Focal Points
Operational Costs
Infrastructure
Player Actions
Economy
11
Key Points: MMP Infrastructure
  • Complexity of implementation
  • Butterfly effect / non-determinism: what went
    wrong?
  • Number of moving parts: tens of interacting
    server processes, hundreds to thousands of
    (highly variable) user inputs
  • Many interacting developer teams
  • Scale: repeat the above, to support 50,000
    butterflies
  • Quality of Service requirements are high
  • Reliability
  • Performance

12
Key Points (2): Operations
  • Service business, not packaged goods
  • Driving requirements: reliability / performance /
    fun
  • ROI (value to customer vs. cost to build & run)
  • Player base
  • Who costs money
  • Who generates money
  • Minimize overhead
  • Anything you can measure, you can optimize
  • Where do the operational costs go?
  • E.g. bandwidth, service calls, crashes
  • What costs money
  • What generates money
  • Customer service
  • Who's being naughty?
  • Who is a loyal customer?

13
Key Points (3): Social / Economic
  • What do people do in-game?
  • Where does their in-game money come from?
  • What do they spend it on?
  • Why?
  • The need to please
  • What aspects of the game are used the most?
  • Are people having fun, right now?
  • Tuning the gameplay

14
Key Points
  • Profiler captures tons of data
  • Essential to have report generators that
    automatically create high-level views/summaries
    of the data
  • Each view is tailored to a particular class of
    user
  • Designers/community managers: daily summaries
  • Engineering/ops: time-driven charts

15
Similar Use Case: Casinos (Harrah's Total
Rewards)
Unified Player Action DB
Casino
Casino
Casino
Table / Machine
Table / Machine
Table / Machine
Track every Player Action
Player
Player
Player
16
Highly Profitable, Highly Popular
Analyze
Unified Player Action DB
Profit (per Casino, per Player)
Patterns of Play
Modify
Casino Operations
Player Awards Program
"This is one of the best investments that we have
ever made as a corporation and will prove to
forge key new business strategies and
opportunities in the future." John Boushy
(Harrah's CIO, 2000)
17
Key Points: Harrah's Total Rewards
  • One of the biggest success stories for CRM is in
    fact in a sibling game industry: casinos. It is, in
    fact, the only visible sign of one of the most
    successful computer-based loyalty schemes ever
    seen.
  • Well on the way to becoming a classic business
    school story to illustrate the transformational
    use of information technology
  • 26% of customers generate 82% of revenues
  • "Millionaire Maker," which ties regional
    properties to select "destination" properties
    through a slot machine contest held at all of
    Harrah's sites. Satre makes a personal invitation
    to the company's most loyal customers to
    participate, and winners of the regional
    tournaments then fly out to a destination
    property, such as Lake Tahoe, to participate in
    the finals. Each one of these contests is
    independently a valuable promotion and a
    profitable event for each property.
  • $286.3 million in such comps. Harrah's might
    award hotel vouchers to out-of-state guests,
    while free show tickets would be more appropriate
    for customers who make day trips to the casino.
  • At a Gartner Group conference on CRM in Chicago
    in September 1999, Tracy Austin highlighted the
    key areas of benefits and the ROI achieved in the
    first several years of utilizing the 'patron
    database' and the 'marketing workbench' (data
    warehouse): "We have achieved over $74 million in
    returns during our first few years of utilizing
    these exciting new tools and CRM processes within
    our entire organization."
  • John Boushy, CIO of Harrah's, in a speech at the
    DCI CRM Conference in Chicago in February 2000,
    stated: "We are achieving over 50% annual
    return-on-investment in our data warehousing and
    patron database activities. This is one of the
    best investments that we have ever made as a
    corporation and will prove to forge key new
    business strategies and opportunities in the
    future."

18
TSO Live Monitors, Summary Views
Embedded Profiler (Server Side) + Automated Report
Generators
19
Outline
  • Background: Metrics & MMPs
  • Implementation Overview
  • Metrics in TSO
  • Applications & Sample Charts
  • Wrap-up
  • Lessons Learned
  • Conclusions
  • Questions

20
Implementation: Driving Requirements
Low overhead
Common Infrastructure
Ease of use
21
Key Points: Driving Requirements
  • Ease of use / information management
  • Adding probes
  • Point & click to find things; speed
  • Automated collection & aggregation of data
  • Volume of data quickly becomes unmanageable;
    people stop looking.
  • Metrics are high entropy: if you rely on a
    person, some part will eventually become
    unreliable.
  • If you can't rely on metrics, they become
    useless, and then nobody bothers to look any
    more.
  • If you can't get the information you _need_ out
    of the information _available_, it isn't
    _useful_
  • Low runtime overhead
  • Don't disrupt the servers under study
  • Positive feedback loops
  • Schrödinger's cat dilemma
  • But, still need massive volumes of information
  • Common infrastructure
  • Less code (than N separately targeted systems)
  • Bonus: allows direct comparison of user actions
    to load spikes

22
Esper Architecture
Live Server CPUs
23
Key Points
  • Visualization tool for a parallel simulation
  • Entity actions heavily drove CPU performance, but
    the scale (of both) made finding patterns and
    problems very difficult
  • Esper: a golden-age SF term for one with ESP
    (Andre Norton, etc.)
  • Peers into the internal workings of the
    (distributed) mind, giving a high-level view of
    the data
  • TSO Esper (v. 4): eliminate raw data
  • Most of the time, detailed data is never used
  • Probes collect at the aggregate level
  • Repeatable tests could be done with detailed
    metrics, when required
  • Data capture is server-side only
  • Allows a single infrastructure for engine &
    player data
  • Untrusted client: privacy & spoofing
  • Summary-only data views mean we can collect
    aggregate-only data; it's most of what you need,
    and is far cheaper
  • Probe internal to every server process
  • Count/average/min/max values inside a fixed time
    window
  • Log out values at end of time window, reset probes
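
The probe behaviour described above (count/average/min/max inside a fixed time window, logged out and reset at window end) can be sketched as follows. This is an illustrative Python sketch with hypothetical names; Esper was not written in Python.

```python
import time

class AggregateProbe:
    """Aggregates samples within a fixed time window: count/min/max/average.
    At window end, one summary record is emitted and the probe resets, so
    raw per-event data is never stored (aggregate-only collection)."""

    def __init__(self, name, window_seconds=60.0, now=time.time):
        self.name = name
        self.window = window_seconds
        self.now = now          # injectable clock, handy for tests
        self._reset(self.now())

    def _reset(self, start):
        self.window_start = start
        self.count = 0
        self.total = 0.0
        self.min = None
        self.max = None

    def sample(self, value):
        """Record one sample; returns a summary dict if a window just closed."""
        t = self.now()
        summary = None
        if t - self.window_start >= self.window:
            summary = self.flush(t)
        self.count += 1
        self.total += value
        self.min = value if self.min is None else min(self.min, value)
        self.max = value if self.max is None else max(self.max, value)
        return summary

    def flush(self, t=None):
        """Emit the summary for the closing window and reset the probe."""
        summary = {
            "probe": self.name,
            "count": self.count,
            "min": self.min,
            "max": self.max,
            "avg": (self.total / self.count) if self.count else None,
        }
        self._reset(t if t is not None else self.now())
        return summary
```

The design choice this illustrates: memory and log volume stay constant per probe regardless of event rate, which is what keeps runtime overhead low on a live server.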

24
Event-level Sampling, Aggregated Reporting
esperProbes
esperStore
Min, Max, Av, Count
DBImporter
esperFetch
25
Esper Probes
  • Self-organizing class hierarchy
  • Data driven: new probes and/or new game content
    immediately visible on the web
  • Example: ESPER_PROBE
  • (Object.interaction.s, chair->picked)
  • (Object.interaction.puppet.s, self->picked)
  • Human-readable intermediate files
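
The self-organizing hierarchy works because probe names are dotted paths: a new probe slots itself into the presentation tree with no code changes. A minimal sketch (hypothetical function name, not Esper's actual implementation):

```python
def build_probe_tree(probe_names):
    """Organize dotted probe names into a nested dict, so newly added
    probes (or new game content) appear in the web view automatically."""
    tree = {}
    for name in probe_names:
        node = tree
        for part in name.split("."):
            # setdefault creates intermediate levels on first sight
            node = node.setdefault(part, {})
    return tree
```

For example, adding a probe named `Object.interaction.sofa.picked` to game content would make a `sofa` node appear under `Object.interaction` on the next report, with no report-generator changes.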

26
EsperView: Web-Driven Presentation
Daily Reports
Report Generator
Graph Caching & Archiving
Filtering & Meta Data
27
Key Points
  • Standard set of views posted to the Daily Reports
    page
  • Flexible report generator to generate new charts
  • Caching of large graphs (used in turn for
    archiving historical views)
  • Noise filters (something big you just don't care
    about right now)
  • Open-source graphing system
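
A noise filter of the kind mentioned above (hiding something big you just don't care about right now, such as dominant DB queries that drown out smaller patterns on a chart) can be as simple as a threshold on each series' peak. Hypothetical sketch, not the actual EsperView filter:

```python
def filter_dominant(series, max_peak):
    """Drop any data series whose peak value reaches max_peak, so the
    remaining low-magnitude series become visible on a shared-axis chart.

    series: dict mapping series name -> list of sampled values.
    """
    return {name: values
            for name, values in series.items()
            if max(values) < max_peak}
```

This mirrors the DB-query charts later in the deck: filtering out the 11,000,000-level queries lets the 7,000-level query patterns show up at a readable scale.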

28
EsperView: Hierarchical Presentation
Process-Level Collection
Server Cluster
Process Class
Process Instance
29
Key Points: Hierarchical Presentation
  • Metrics were collected and tracked at the process
    level, with two aggregated views
  • Server farm (anyProbe, averaged across all
    processes)
  • Process class (anyProbe, averaged across just the
    simulators within a system)
  • Process instance (anyProbe, averaged within a
    single simulator)
  • Data collections viewable at three levels
  • Server farm (all processes)
  • Process class (all simulators within a system)
  • Process instance (a single login server)
  • Viewable in time order or daily summary,
    w/drill-down
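
The three viewing levels reduce to averages over different groupings of the same process-level samples. A sketch under assumed data shapes (hypothetical names, not the actual EsperView code):

```python
from collections import defaultdict

def rollup(samples):
    """Roll one probe's process-level samples up to the three view levels.

    samples: list of (process_class, process_instance, value) tuples,
    e.g. ("simulator", 3, 42.0). Returns averages at farm, class, and
    instance level.
    """
    farm_vals = []
    class_vals = defaultdict(list)
    inst_vals = defaultdict(list)
    for cls, inst, v in samples:
        farm_vals.append(v)
        class_vals[cls].append(v)
        inst_vals[(cls, inst)].append(v)

    def avg(xs):
        return sum(xs) / len(xs)

    return {
        "farm": avg(farm_vals),                                  # all processes
        "class": {c: avg(vs) for c, vs in class_vals.items()},   # per process class
        "instance": {k: avg(vs) for k, vs in inst_vals.items()}, # per single process
    }
```

Drill-down then just means re-rendering the same stored samples with a narrower grouping key.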

30
Outline
  • Background: Metrics & MMPs
  • Implementation Overview
  • Metrics in TSO
  • Applications & Sample Charts
  • Wrap-up
  • Lessons Learned
  • Conclusions
  • Questions

31
Applications of Metrics
Load Testing (Realistic Inputs)
Beta Testing & Live Operations (Game Tuning,
Community Management)
Load Testing & Live Operations (Server
Performance)
32
Load Testing: Monkey See / Monkey Do
Sim Actions (Player Controlled)
Sim Actions (Script Controlled)
33
Key Points
  • Measure userLoad at peak in a live city
  • Change user_behaviour in the load testing script
    (automated testing), using Esper to measure
    emulatedLoad against liveLoad
  • Re-calibration as required (constant protocol /
    code shifts)
  • Example: WAH.txt
  • Used in turn to:
  • Measure the infrastructure for completeness: is
    the infrastructure ready for launch?
  • Find & fix bugs
  • Very realistic load testing!
  • "Oh, that's what happens when 1,000 simulators
    all start up at the same time"
  • Client-side response metrics tracked separately
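
The calibration step (comparing emulatedLoad against liveLoad per probe, and re-calibrating whenever protocol or code shifts push them apart) might look like the sketch below. The function name, probe names, and 10% tolerance are all hypothetical:

```python
def calibration_report(live, emulated, tolerance=0.10):
    """Compare per-probe averages from a live city against a scripted
    load test. Flags probes whose emulated load drifts from live load
    by more than the given tolerance (fraction of the live value).

    live, emulated: dicts mapping probe name -> average value.
    """
    report = {}
    for probe, live_avg in live.items():
        emu_avg = emulated.get(probe)
        if emu_avg is None:
            report[probe] = "missing from load test"
            continue
        drift = abs(emu_avg - live_avg) / live_avg
        if drift <= tolerance:
            report[probe] = "ok"
        else:
            report[probe] = f"recalibrate (drift {drift:.0%})"
    return report
```

Because both sides come from the same Esper probes (the "common infrastructure" bonus from earlier), the comparison needs no translation layer between test and live metrics.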

34
Applications of Metrics
Load Testing: Realistic Inputs
Beta Testing & Live Operations: Tuning/Management
Load Testing & Live Operations: Server Performance
35
Key Points: Game Play Analysis
  • Game designers were heavy Esper users
  • Validated metrics against community boards,
    tests, ...
  • Most popular interactions / objects / places
  • Trends
  • Length of time in a house
  • Chat rate
  • Types of characters chosen
  • Direct cycle, repeated N times:
  • Observe behaviour, tune play, observe changes

36
History
Make Friends and Shake Hands beat out Give Money /
Get Money. Least used: Disco Dancing.
Meta Data
37
Top TD Dance, Woohoo Bottom Dance
38
Players per Lot
0 to 70 players: < 2 / lot; 70 to 400 players: >
3 / lot
39
Top: Metrics Bug (sorta). Next: Garden Gnome,
Toilet. Bottom: Buffet Table.
40
Beta: numPlayers by numRMates
41
Key Points: Economy Analysis
  • Where did the money come from?
  • Where did it go?
  • How much did users play the money sub-game?
  • Average amount of money made per player over the
    first 10 days

42
(No Transcript)
43
Economy Detailed View
44
Visitor Bonus: Who Makes Money?
45
4 of top 5 windows??
46
House Categories (Beta Test)

47
Community Management
Community Actions Trends
Influencing Player Activity
Free Content
Tracking Problem Players
48
Key Points: Community Management
  • Observing community behaviour
  • Metrics that matter
  • Influencing player behaviour via publishing
    selected metrics
  • Example: shifting users to Calvin's Creek
  • Cheap content
  • Customer service
  • Who's being a pain?
  • Cheaters / griefers / ...

49
Marketing
In-Game Brand Exposure
Special Events
Press Release Teasers
50
Key Points: Marketing
  • Press releases
  • Teasers to catch media / free publicity
  • Paid sponsorship
  • How many eyes on their brand, and for how long?
  • Tracking special objects / events

51
NYEve Kiss Count
                     Esper Cities   All Cities (extrapolated)
New Year's Kiss            32,560          271,333
Be Kissed Hotly             7,674           63,950
Be Kissed                   5,658           47,150
Be Kissed Sweetly           2,967           24,725
Blow a Kiss                 1,639           13,658
Be Kissed Hello             1,161            9,675
Have Hand Kissed              415            3,458
Total                      52,074          433,949
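
The all-cities column is consistent with scaling each Esper-cities count by a constant factor of 25/3 and truncating (e.g. 32,560 × 25 / 3 = 271,333). Whether that reflects 3 instrumented cities out of 25 is an inference from the arithmetic, not stated on the slide. A sketch of the extrapolation under that assumption:

```python
def extrapolate(esper_counts, instrumented, total_cities):
    """Scale per-action counts from the instrumented (Esper) cities to
    an estimate for all cities, assuming instrumented cities are a
    representative sample. Integer division truncates, matching the
    slide's rounded figures.
    """
    return {action: count * total_cities // instrumented
            for action, count in esper_counts.items()}
```

As with any sampling-based extrapolation, the estimate is only as good as the assumption that the instrumented cities behave like the rest.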
52
Applications of Metrics
Load Testing: Realistic Inputs
Beta Testing & Live Operations: Tuning/Management
Load Testing & Live Operations: Server Performance
53
(No Transcript)
54
DB byte count oscillates out of control
55
A single DB Request is clearly at fault
56
(No Transcript)
57
Most-Used DB Queries (unfiltered)
The 11,000,000-level queries need attention, and
drown out the others
58
DB Queries (Filtered)
Filtering out the 11,000,000-level queries shows
patterns in the 7,000-level queries
59
Incoming & Outgoing Packets
60
Outline
  • Background: Metrics & MMPs
  • Implementation Overview
  • Metrics in TSO
  • Applications & Sample Charts
  • Wrap-up
  • Lessons Learned
  • Conclusions
  • Questions

61
Lessons Learned
  • Implement early
  • Ownership: senior engineers
  • Aggregated probes vs. event-level tracking
  • Automation: collect / summarize / alarm
  • There can be only one

62
Key Points
  • When to implement: (a) when the system is
    complex, (b) before you need it
  • Implementation notes
  • Easier to write your own report generator than to
    adopt a commercial data reporting tool (flirted
    with one, no follow-through)
  • Fully automated metrics: engineer yourself out of
    a job
  • Complex sub-system: a senior engineer needs to
    own & drive it
  • Ease of use: UI, UI, UI. Speed, speed, speed.
  • Automate error checking on inputs
  • Fast/easy turnaround on new metrics
  • Integration with server logs
  • Allows drill-downs by finding logs in the same
    time window via a quick & easy web UI
  • Excellent complement to automated testing
  • Repeatable inputs & accurate measurements allow
    cut & fit at scale
  • Scale: keep the break/fix cycle fast & repeatable
  • Closer integration with cityDB (lots of useful
    data)
  • Too many metrics collection systems
  • Lack of a useful central system meant N people
    went and built one for their (narrowly targeted)
    needs
  • Categories of players vs. playerEvent tracking
  • Debatable which to pick, but event tracking did
    not work out

63
Conclusion: Very Useful!
Game Design
Data Mining on Players: Untapped Gold
Realistic Load Testing
Engine Fixes & Optimization
Server Cost, Launch Timing
Critical Feature Accessibility
64
Key Points
  • Critical feature: accessibility
  • Collecting data is easy; doing something useful
    with it is much harder
  • High-level views of data
  • Ease of use
  • Fully automated collection / display / error
    checking
  • Very useful!
  • Game design
  • Engine optimization
  • Load testing accuracy
  • Server internals
  • Release planning (server capacity, launch timing)
  • The DB was least tested. Guess what was the sole
    real problem at launch? Guess why.
  • Real user data is different from load testing
    data.
  • Assumed stress on the DB would be from the number
    of queries, not relationships. Thus no
    DB-specific stress tests.
  • Oops. The DB has the most variable inputs _and_
    has what users care most about: persistent
    results. Should have pounded the snot out of it
    pre-launch.
  • Build it early
  • Data mining on players is very, _very_ cool
  • They are the source of your costs & revenue:
    analyze / optimize in a loop to maximize profit
  • They shape your game: tune your game based on
    direct observation

65
Questions
Slides available at www.maggotranch.com/MMP