1
Graphics Stability
  • Gershon Parent
  • Software Swordsman
  • WGGT
  • gershonp@microsoft.com
  • Microsoft Corporation

  • Steve Morrow
  • Software Design Engineer
  • WGGT
  • stevemor@microsoft.com
  • Microsoft Corporation
2
Session Outline
  • Stability Benchmark History
  • CRASH (Comparative Reliability Analyzer for
    Software and Hardware)
  • The CRASH Tool
  • The CRASH Plan
  • The Experiments
  • CDER (Customer Driver Experience Rating)
  • Program Background and Description
  • High-level Statistics of the Program
  • Factors Examined in the Crash Data
  • Normalized Ratings
  • Customer Experience and Loyalty

3
Stability Benchmark History
  • WinHEC, May '04
    • CRASH 1.0 released
    • Web portal has 52 non-MS members from 16
      companies
  • November '04
    • CRASH 1.1 released to the web; includes DB
      backend
  • December '04
    • Stability Benchmark components ship to 8,000
      customers and normalizable OCA data begins
      flowing in
    • CRASH Lab completes first data collection pass
    • Web portal has over 60 non-MS members from 17
      companies

4
CRASH Tool
  • CRASH is a new dynamic software loading tool
    designed to expose and easily reproduce
    reliability defects in drivers/hardware
  • Answers the call from IHVs and OEMs for more
    reliability test tools.
  • Enables wide range of endurance/load/stress
    testing
  • Configurable load profiles
  • Scheduled cycling (starting and stopping) of
    test applications
  • Replay-ability
  • Automatic failure cause determination
  • Scripting for multiple passes with different
    scenarios
  • Creation of a final score

5
CRASH Demo
6
CRASH Demo
7
CRASH Demo
8
CRASH 4 Phase Plan
  • Phase 1
  • Produce CRASH documentation for review by
    partners
  • Release 1.0 to our partners for feedback
  • Phase 2
  • Release 1.1 with database functionality to our
    partners
  • Execute controlled baseline experiments on a
    fixed set of HW and SW to evaluate the tool's
    effectiveness
  • Phase 3
  • Execute series of experiments and use results to
    increase accuracy and usefulness of the tool
  • Phase 4
  • Create a CRASH-based tool for release to a
    larger audience

9
Experiment 1 Objectives
  • Determine if the CRASH data collected is
    sufficient to draw meaningful conclusions about
    part/driver stability differences
  • Determine how machine configuration affects
    stability
  • Evaluate how the different scenarios relate to
    conclusions about stability
  • Find the minimum data-set needed to make
    meaningful conclusions about part/driver
    stability
  • Create a baseline from which to measure future
    experiments
  • Identify other dimensions of stability not
    exposed in the CRASH score

10
Experiment 1 Details
  • Standardize on one late-model driver/part from
    four IHVs
  • Part/Driver A, Part/Driver B, Part/Driver C,
    Part/Driver D
  • Test them across 12 different flavors of
    over-the-counter PCs from 4 OEMs
  • OEM A, OEM B, OEM C, OEM D
  • High End and Low End
  • Include at least two motherboard types
  • MB Type 1, MB Type 2
  • Clean install of XP SP2 plus latest WHQL drivers
  • Drivers snapped 8/16/04
  • Use the 36 hr benchmark profile shipped with
    CRASH 1.1

11
Important Considerations
  • Results apply only to these Part/Driver/System
    combinations
  • Extrapolation of these results to other parts or
    drivers or systems is impossible with this data

12
CRASH Terminology
  • Profile
  • Represents a complete run of the CRASH tool
    against a driver
  • Contains one or more scenarios
  • Scenario
  • Describes a session of CRASH testing
  • Load intensity/profile
  • What tests will be used
  • How many times to run this scenario (loops)
  • Score
  • Score is always a number that represents the
    percentage of the testing completed before a
    system failure (hang or kernel-break)

13
Profile Score Averages
14
CRASH Terminology Failures
  • Failure
  • Hang
  • No minidump found and loop did not complete
  • Targeted Failure
  • Minidump auto-analysis found the failure was in
    the display driver
  • Non-Targeted Failure
  • Minidump analysis found the failure was not in
    the display driver
  • Does not count against the score (this
    classification is sketched below)
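A minimal sketch of this failure taxonomy, and of the score
defined on the previous slide, assuming each test loop reports
whether it completed and which module (if any) the minidump
auto-analysis blamed; the function names and the display driver
module name are illustrative, not CRASH's actual interface.

```python
from enum import Enum

class Outcome(Enum):
    COMPLETED = "completed"
    HANG = "hang"                  # no minidump found, loop incomplete
    TARGETED = "targeted"          # minidump blames the display driver
    NON_TARGETED = "non-targeted"  # failure outside the display driver

DISPLAY_DRIVER = "display_driver.sys"  # illustrative module name

def classify(loop_completed, faulting_module):
    """faulting_module is the module blamed by minidump
    auto-analysis, or None when no minidump was found."""
    if loop_completed:
        return Outcome.COMPLETED
    if faulting_module is None:
        return Outcome.HANG
    if faulting_module == DISPLAY_DRIVER:
        return Outcome.TARGETED
    return Outcome.NON_TARGETED

def score(loops_completed, loops_planned):
    # Score (slide 12): percentage of planned testing completed
    # before a hang or targeted failure ended the run.
    return 100.0 * loops_completed / loops_planned

print(classify(False, "some_other.sys"))  # Outcome.NON_TARGETED
print(score(27, 36))                      # 75.0
```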

15
Percentage of Results by Type
16
Average Profile Score by Machine Group
17
Average Profile Score by OEM and MB
18
Effect of MB Type on Profile Score
19
Score Distribution for Part/Driver C & D (MB Type 1)
20
Experiment 1 Test Profile
  • Real Life
  • Moderate load and application cycling
  • 9 max and 3 min load
  • Tractor Pull
  • No load cycling
  • Moderate application cycling
  • Incrementally increasing load
  • Intense
  • High-frequency load and application cycling
  • 9 max and 0 min load (the three scenarios are
    sketched below)
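The CRASH profile format itself is not shown in this deck; purely
as an illustration, the three scenarios above could be captured in
a structure like the following, where every key name is invented.

```python
# Invented representation of the Experiment 1 test profile; the
# real CRASH 1.1 profile format is not shown in this deck.
PROFILE = {
    "Real Life": {
        "load_cycling": "moderate",
        "app_cycling": "moderate",
        "load_max": 9,              # "9 max ... load"
        "load_min": 3,              # "... 3 min load"
    },
    "Tractor Pull": {
        "load_cycling": None,       # no load cycling
        "app_cycling": "moderate",
        "load_ramp": "incremental", # incrementally increasing load
    },
    "Intense": {
        "load_cycling": "high-frequency",
        "app_cycling": "high-frequency",
        "load_max": 9,
        "load_min": 0,
    },
}

for name, settings in PROFILE.items():
    print(name, settings)
```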

21
Average Scenario Score by Part/Driver
22
Statistical Relevance Questions
  • Question: How do I know that the difference
    between the averages of Result Set 1 and Result
    Set 2 is meaningful?
  • Question: How can I find the smallest result set
    size that will give me 95% confidence?
  • Answer: Use the Randomization Test

23
Randomization Test
[Diagram: Set 1 and Set 2 yield the observed
difference, Delta 1; they are pooled into a
Combination Set and randomly re-split into Random
Set 1 and Random Set 2, which yield Delta 2]
  • Repeat the random re-split 10,000 times. If
    Delta 1 is greater than Delta 2 in at least 95%
    of the trials, you are assured the difference is
    meaningful (a minimal sketch follows below)
  • Try smaller sample sizes until the confidence
    drops below 95%. That is your minimum sample
    size.
  • Information on the Randomization Test can be
    found online at
    http://www.uvm.edu/~dhowell/StatPages/Resampling/RandomizationTests.html
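A minimal sketch of this procedure, assuming the result sets are
plain lists of scores and the statistic of interest is the
difference of means; the function name and sample values are
illustrative.

```python
import random

def randomization_test(set1, set2, trials=10_000):
    # Delta 1: observed difference of means between the two sets.
    delta1 = abs(sum(set1) / len(set1) - sum(set2) / len(set2))
    combined = list(set1) + list(set2)
    n1 = len(set1)
    exceed = 0
    for _ in range(trials):
        # Random re-split of the combination set.
        random.shuffle(combined)
        rand1, rand2 = combined[:n1], combined[n1:]
        # Delta 2: difference of means after the re-split.
        delta2 = abs(sum(rand1) / n1 - sum(rand2) / len(rand2))
        if delta1 > delta2:
            exceed += 1
    # Fraction of trials in which Delta 1 beat Delta 2; a value of
    # at least 0.95 marks the observed difference as meaningful at
    # roughly 95% confidence.
    return exceed / trials

# Illustrative profile scores for two part/driver combinations.
scores_1 = [88, 92, 75, 95, 81, 90]
scores_2 = [60, 72, 55, 68, 70, 64]
print(randomization_test(scores_1, scores_2))
```

To find the minimum sample size, rerun the test on progressively
smaller random subsets of each result set until the returned
fraction first drops below 0.95.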

24
Scores and Confidence Intervals for
Part/Driver/MB Combinations
25
The Experiment Matrix
  • With three experiments completed, we can now
    compare
  • One driver across two OS configurations
  • Two versions of one driver across a single OS
    configuration

26
Old vs. New Drivers
  • This table compares the profile scores for old
    drivers vs. new drivers on OEM Image
  • New drivers were noticeably better for
    parts/drivers C & D
  • Part/Driver A and B were unchanged

27
OEM Image vs. Clean Install
  • This table compares profile scores for OEM Image
    vs. Clean Install with Old Drivers
  • Clean install scores universally better than OEM
    image for parts/drivers C and D
  • Part/Driver A and B were unchanged

28
Future Plans
  • Collate with OCA data
  • CRASH failure to OCA bucket correlations
  • What buckets were fixed between 1st and 2nd
    driver versions?
  • Do our results match field data?
  • Customer machines have hardware that is typically
    several years old
  • Can we find the non-display failure discrepancy
    in the field?
  • Begin to tweak other knobs
  • Content
  • Driver-versions
  • HW-versions
  • Windows codenamed "Longhorn" Test Bench
  • PCIe cards

29
Suggested Future Experiments
  • Include more motherboard types
  • Newer drivers or use a Control Group driver.
    Reference Rasterizer?
  • Disable AGP to isolate chipset errors from AGP
    errors
  • Driver-Verifier enabled
  • Add non-graphics stress tests to the mix
  • Modified Loop Times

30
IHV Feedback
  • "There are definitely unique driver problems
    exposed through the use of CRASH and it is
    improving our driver stability greatly"
  • "CRASH is producing real failures and
    identifying areas of the driver that we are
    improving on"
  • "Thanks for a very useful tool"

31
CRASH 1.2 Features
  • RunOnExit
  • User-specified command run upon the completion of
    a CRASH profile
  • More logging
  • Logging to help troubleshoot problems with data
    flow
  • More information output in XML
  • More system information
  • More failure details from minidumps
  • More control over where files are put
  • More robust handling of network issues

32
Customer Device Experience Rating (CDER) Program
Background
  • Started from a desire to rate display driver
    stability based on OCA crashes
  • Controlled program addresses shortcomings of OCA
    data
  • Unknown market share
  • Unknown crash reporting habits
  • Unknown info on non-crashing machines
  • This allows normalization of OCA data to produce
    an accurate crashes-per-machine stability rating
    (sketched below)
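As an illustration of what this normalization could look like,
here is a sketch assuming the rating is each vendor's total
crashes divided by its total panel machines, crashing and
non-crashing alike; the panel records and vendor labels are
invented.

```python
from collections import Counter

# Invented panel records: (display vendor, crashes observed) per
# machine. Zero-crash machines still count toward their vendor's
# machine share, which is exactly the information plain OCA
# reporting lacks.
panel = [
    ("Vendor A", 0), ("Vendor A", 3), ("Vendor A", 0),
    ("Vendor B", 1), ("Vendor B", 0),
    ("Vendor C", 5), ("Vendor C", 2), ("Vendor C", 0), ("Vendor C", 0),
]

machines = Counter(vendor for vendor, _ in panel)
crashes = Counter()
for vendor, n in panel:
    crashes[vendor] += n

# Normalized rating: crashes per machine, so a vendor with a large
# installed base is not penalized for reporting more raw crashes.
for vendor in sorted(machines):
    print(vendor, round(crashes[vendor] / machines[vendor], 2))
```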

33
CDER Program Description & Status
  • Program Tools
  • A panel of customers (Windows XP only)
  • User opt-in allows extensive data collection,
    unique machine ID
  • System Agent/scheduler
  • System Configuration Collector
  • OCA Minidump Collector
  • System usage tool (not yet in the analysis)
  • Status
  • All tools for Windows XP in place and functioning
  • First set of data collected, parsed, analyzed

34
Overall Crash Statistics of Panel
  • Machines
  • 8,927 in panel
  • 49.9% experience no crashes
  • 50.1% experience crash(es)
  • 8,580 have valid device driver info
  • 82.2% have no display crashes
  • 17.8% have display crashes
  • Crashes
  • 16.1% of valid crashes are in display
  • Note: Crashes occurred over a 4-year period

35
Crash Analysis Factors
  • Examined several factors which may have an impact
    on stability ratings
  • Processor
  • Display Resolution
  • Bit Depth
  • Monitor Refresh Rate
  • Display Memory
  • Note: Vendor part naming does not correspond to
    that in the CRASH presentation
  • Note: Unless otherwise noted, data for these
    analyses are from the last 3 years

36
Display Resolution Crashes Distribution
37
Bit Depth Crashes Distribution
38
Refresh Rate Crashes Distribution
39
Display Memory Crashes Distribution
40
Display Crashes By Type (Over Last Year)
41
Normalized Crash Data
  • The following data is normalized by program share
    of crashing and non-crashing machines

42
Crashes per Machine Ranking by Display Vendor for
Last Year (2004)
43
Display Vendor A Normalized Crashes By Part/ASIC
Family Over Last 3 Years
44
Display Vendor B Normalized Crashes By
Part/ASIC Family Over Last 3 Years
45
Display Vendor C Normalized Crashes By
Part/ASIC Family Over Last 3 Years
46
Normalized Crashes Ranked by Part - 2004
47
Ranking and Rating Conclusions
  • This is a first look
  • Need to incorporate system usage data
  • Need to continue collecting configuration data to
    track driver and hardware changes
  • Need more panelists, and a higher proportion of
    newer parts
  • With that said
  • This is solid data
  • This demonstrates our tools work as designed
  • It shows the viability of a crash-based rating
    program

48
Customer Experience & Loyalty
  • A closer look at the segment of panelists who
  • Experienced display crashes, and
  • Switched or upgraded their display hardware or
    driver

49
Experience & Loyalty Highlights
  • 19.4% of users who experienced display crashes
    upgraded their drivers or hardware, or changed
    to a different display vendor
  • 7.9% of users (nearly 41% of the 19.4%) who
    experienced display crashes switched to a
    competitor's product
  • ALL users who switched to a competitor's product
    had the same or better experience
  • Only 91.3% of those who upgraded had the same or
    better experience afterwards, based on crashes
  • Time clustering of crashes

50
Overall Experience of Users After Changing
Display System
51
Experience of Users After Upgrading
52
Experience of Users After Switching Display
Vendors
53
Time-Clustering of Crashes for Users Who
Experienced 3 Or More Crashes
  • Our data indicates a user's crashes are generally
    highly clustered in time (a rough measure is
    sketched below)
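The deck does not say how clustering was measured; one rough,
hypothetical way to quantify it is to compare the median gap
between consecutive crashes against the gap expected if the same
number of crashes were spread evenly over the observation window.

```python
from datetime import datetime, timedelta
from statistics import median

def clustering_ratio(crash_times, window_days=365):
    # Median gap between consecutive crashes, divided by the gap
    # expected under an even spread. Values well below 1.0 mean
    # the crashes are bunched together in time.
    times = sorted(crash_times)
    gaps = [(b - a).days for a, b in zip(times, times[1:])]
    expected = window_days / len(times)
    return median(gaps) / expected

# Invented example: four crashes within one week of a year-long
# observation window.
crashes = [datetime(2004, 3, 1) + timedelta(days=d)
           for d in (0, 2, 3, 6)]
print(clustering_ratio(crashes))  # small value => highly clustered
```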

54
Time-Clustering of Crashes for Users Who
Experienced 6 Or More Crashes
55
User Experience Caveats
  • User Experience here is strictly concerned with
    how many crashes the users experienced
  • It doesn't include hardware changes/upgrades
    where different hardware used the same driver
  • Having fewer crashes may not always mean user
    experience was better, but for the vast majority
    we believe it does
  • Having fewer crashes may be attributable to other
    system changes, and/or other factors
  • Crashes going away may mean the user gave up
    using whatever was causing the crashes

56
Going Forward
  • Current Program (Windows XP-based)
  • Normalize by usage as data becomes available
  • Include periodic configuration data in analysis
  • Correlate with CRASH tool results
  • Continue to develop toward a rating program
  • Planned for Longhorn/LDDM
  • Modify tools for Longhorn and new display driver
    model
  • Larger set of participants for Longhorn Beta1
  • Recruit more users with newer hardware

57
Call To Action
  • Create LDDM Drivers
  • If you are a display vendor, leverage stability
    advances in the new Longhorn Display Driver Model
    (LDDM)
  • Join the Stability Benchmark Portal
  • If you are a display IHV or a System Builder,
    contact grphstab@microsoft.com
  • Get latest tools and documents
  • Join the Stability discussion on the portal
  • Use the tools
  • Send us feedback and suggestions
  • Share ideas for new experiments

58
Community Resources
  • Windows Hardware Driver Central (WHDC)
  • www.microsoft.com/whdc/default.mspx
  • Technical Communities
  • www.microsoft.com/communities/products/default.mspx
  • Non-Microsoft Community Sites
  • www.microsoft.com/communities/related/default.mspx
  • Microsoft Public Newsgroups
  • www.microsoft.com/communities/newsgroups
  • Technical Chats and Webcasts
  • www.microsoft.com/communities/chats/default.mspx
  • www.microsoft.com/webcasts
  • Microsoft Blogs
  • www.microsoft.com/communities/blogs

59
Additional Resources
  • Email: grphstab@microsoft.com
  • Related Sessions
  • Graphics Stability Part 2
  • WDK for Graphics: An Introduction
  • Longhorn Display Driver Model Roadmap and
    Requirements
  • Longhorn Display Driver Model Key Features