TRANSFERRING LEAN SIX SIGMA AND DFSS DATA SIMPLY AND EFFECTIVELY Baseball Analytics - PowerPoint PPT Presentation

1
TRANSFERRING LEAN SIX SIGMA AND DFSS DATA SIMPLY
AND EFFECTIVELY: Baseball Analytics
4th Annual Design for Six Sigma Conference
James M. Wasiloff, Cary Young
US Army TACOM LCMC
9 February 2009
Baseball is the only field of endeavor where a
man can succeed three times out of ten and be
considered a good performer.  Ted Williams
2
Agenda
  • Introduction of Baseball Analytics
  • Descriptive statistics and graphical data
    analysis
  • Hypothesis development and testing
  • Analysis of Variance (ANOVA)
  • Pearson Correlation Coefficient
  • Simple Linear Regression
  • Multiple Regression and Best Fit Model
  • Predictive Models
  • Statistical Process Control
  • Next Steps / Application in Other Sports

3
Introduction
Baseball quote
  • Why the session
  • Better way to understand and teach LSS and DFSS
    Tools
  • Can Money Spent Buy Wins?
  • Keep it Statistically Simple
  • Just the Beginning

The charm of baseball is that, dull as it may be
on the field, it is endlessly fascinating as a
rehash.  Jim Murray
4
Test of Hypothesis
  • Null Hypothesis
  • H0: μ1 = μ2
  • MLB example H0: Mean Batting Average of the NY
    Yankees from 2006-2008 equals the Mean Batting
    Average of the Tampa Bay Rays from 2006-2008
  • Alternative Hypothesis
  • Ha: μ1 ≠ μ2 (that is, μ1 < μ2 or μ1 > μ2)
  • They are not the same

During my 18 years I came to bat almost 10,000
times.  I struck out about 1,700 times and walked
maybe 1,800 times.  You figure a ballplayer will
average about 500 at bats a season.  That means I
played seven years without ever hitting the
ball.  Mickey Mantle, 1970
5
Batting Stats
American League
National League
It ain't like football.  You can't make up no
trick plays.  Yogi Berra
6
Test of Hypothesis
  • Are the batting averages of the National League
    different from those of the American League?
  • T-test
  • Interpretation: P low, null must go; P high,
    null will fly

Two-Sample T-Test and CI: AL, NL

      N    Mean      StDev     SE Mean
AL    14   0.27086   0.00772   0.0021
NL    16   0.26356   0.00704   0.0018

Difference = mu (AL) - mu (NL)
Estimate for difference: 0.007295
95% CI for difference: (0.001717, 0.012872)
T-Test of difference = 0 (vs not =): T-Value = 2.69, P-Value = 0.012
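As a rough cross-check of the Minitab output, the T-value can be recomputed from the summary statistics alone. The sketch below assumes the unpooled (Welch) form of the two-sample t-test, which is not stated on the slide; the function name `welch_t` is made up for illustration.

```python
from math import sqrt


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for an unpooled two-sample t-test from summary stats."""
    v1 = sd1 ** 2 / n1  # squared standard error of the first mean
    v2 = sd2 ** 2 / n2  # squared standard error of the second mean
    t = (mean1 - mean2) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Summary statistics as reported on the slide (AL first, NL second).
t_value, df = welch_t(0.27086, 0.00772, 14, 0.26356, 0.00704, 16)
print(f"t = {t_value:.2f}, df = {df:.1f}")
```

The P-value itself needs the t distribution's tail area, which a statistics package would supply; the point here is only that the reported T-value follows from the six numbers in the table.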
7
Are Salaries Correlated to Team Performance?
  • The trend is
  • Problem statement
  • Will increasing player salaries lead to more
    success?

Baseball was the major American sport in which
money bought success. George Will, Moneyball
8
2008 MLB Salaries and Win Count
9
Correlation Between Salary and Wins?
10
Use these Derivation Formulae or?
11
Use This Simple Graphic?
Pearson Correlation Coefficient: Definition, Values of r
12
Correlation Coefficient
  • Graphic approximation: what do you think?
  • Minitab results: Pearson correlation of Total
    Salary 2008 and Wins in 2008 = 0.323
  • Interpretation of results
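For readers who want to see what sits behind the Minitab number, here is a minimal sketch of the Pearson correlation coefficient computed from its textbook definition. The payroll and win figures are illustrative placeholders, not the 2008 MLB data from the slides, and `pearson_r` is a hypothetical helper name.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Illustrative placeholder data only (payroll in $M, season wins).
payroll = [40, 80, 120, 160]
wins = [70, 90, 80, 100]
print(round(pearson_r(payroll, wins), 3))
```

Values of r near +1 or -1 indicate a strong linear relationship, and values near 0 a weak one.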

13
American League West in 2002 (Moneyball Data
Set)
Pearson correlation of Wins and Payroll = -0.928
14
ANOVA
A baseball fan has the digestive apparatus of a
billy goat.  He can, and does, devour any set of
diamond statistics with insatiable appetite and
then nuzzles hungrily for more.  Arthur Daley
  • Null Hypothesis
  • H0: μ1 = μ2 = μ3 = ... = μn
  • MLB example H0: Mean Batting Average of the NY
    Yankees equals the Mean Batting Average of the
    Tampa Bay Rays equals the Mean Batting Average of
    the NY Mets equals the Mean Batting Average of
    the ...
  • Alternative Hypothesis
  • Ha: At least one μk is different from one
    other μk
  • MLB example Ha: At least one team has a Mean
    Batting Average different from that of at least
    one other team
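The F statistic behind a one-way ANOVA compares variation between group means with variation inside the groups. The sketch below is one way to compute it by hand; the function name `one_way_anova_f` and the three small groups are assumptions for illustration, not batting data from the presentation.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group sum of squares: group means around the grand mean
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Within-group sum of squares: observations around their own group mean
    ss_within = sum(
        sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups
    )
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Illustrative toy groups (not real team batting averages).
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_stat, df_b, df_w)
```

A large F relative to the F distribution with (df_between, df_within) degrees of freedom gives a low P-value, so by the rule from the earlier slide the null must go.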

15
Regression Analysis
  • Is it possible to model and predict number of
    wins for a season based on statistical
    parameters?
  • The initial simple linear regression model, 2002
    data
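The fitted 2002 model itself did not survive in this transcript. As a stand-in, here is a minimal sketch of how a simple linear regression line is obtained by ordinary least squares; `fit_line` is a hypothetical helper and the data points are placeholders, not the 2002 data.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Placeholder points lying exactly on y = 1 + 2x.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)
```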

16
Multiple Regression and Best Fit Model
  • Regression studies the relationship between the
    mean value of a random variable and the
    corresponding values of one or more independent
    variables.
  • A model for predicting one variable from
    another.
  • A statistical analysis assessing the association
    between two variables.
  • Regression analysis is a method of analysis that
    enables you to quantify the relationship between
    two or more variables (X) and (Y) by fitting a
    line or plane through all the points such that
    they are evenly distributed about the line or
    plane.
  • Multiple regression is a method of determining
    the relationship between a continuous process
    output (Y) and several factors (Xs).

17
American League West in 2002 (Moneyball Data
Set)
18
Exploratory Data Analysis
What does it mean?
19
Testing the Predictive Model
  • Tigers 2008 data
  • Here is the predictive transfer function from
    Minitab
  • Testing on 2008 Data
  • Actual win count: 74
  • Predicted win count: 74.26

Wins = 32.1 + 1.48 Average Age - 34.5 Team ERA
+ 154 Team Batting Average + 0.582 Saves (P)
+ 0.150 Runs (P) - 0.0202 Walks (P) - 0.0087 SO (P)
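The transfer function can be wrapped in a small function for trying out different inputs. This is a sketch under two assumptions: the "=" and "+" signs were lost in the transcript and are restored here, while the "-" signs are as printed; and "(P)" is read as a pitching statistic. The argument names are made up, and the sample inputs are placeholders, not the Tigers' 2008 figures.

```python
def predict_wins(avg_age, team_era, team_ba, saves_p, runs_p, walks_p, so_p):
    """Evaluate the slide's regression equation for season wins.

    Coefficients are copied from the slide; the plus signs are an
    assumption because the transcript dropped them.
    """
    return (
        32.1
        + 1.48 * avg_age
        - 34.5 * team_era
        + 154 * team_ba
        + 0.582 * saves_p
        + 0.150 * runs_p
        - 0.0202 * walks_p
        - 0.0087 * so_p
    )


# Placeholder inputs for illustration only.
example = predict_wins(
    avg_age=30, team_era=4.0, team_ba=0.270,
    saves_p=40, runs_p=700, walks_p=500, so_p=1000,
)
print(round(example, 2))
```

Comparing the output for a real season against the actual win count, as the slide does with 74 versus 74.26, is the simplest check of how well the model transfers.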
20
Statistical Process Control and Statistical
Thinking
  • Statistical process control is the application
    of statistical methods to identify and control
    the special cause of variation in a process
    (iSixSigma.com)
  • Statistical Thinking: the process of using wide
    ranging and interacting data to understand
    processes, problems, and solutions.
  • The opposite of one factor at a time, where the
    tendency is to change one factor and see what
    happens.
  • Statistical thinking is the tendency to want to
    understand situational phenomena over a wide
    range of data where several control factors may
    be interacting at once to produce an outcome.
  • Common cause variation becomes your friend and
    special cause variation your enemy.
  • Attribute judgements of good and bad are replaced
    with estimates of significance with given
    confidence.

21
Example 1: Notional Data, Status at Game 37
Range outside UCL indicates out of control
- Need to investigate special cause
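The chart itself is not in the transcript, so the following is only a sketch of how a range signal like this one could be detected. It assumes an individuals and moving range (I-MR) chart with the standard constants for moving ranges of two points (D4 = 3.267, E2 = 2.66); the function names and the game-by-game values are hypothetical.

```python
D4 = 3.267  # upper-limit constant for ranges of subgroup size 2
E2 = 2.66   # individuals-chart constant (3 / d2, with d2 = 1.128)


def imr_limits(values):
    """Centre lines and control limits for an I-MR chart."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    centre = sum(values) / len(values)
    return {
        "x_lcl": centre - E2 * mr_bar,
        "x_cl": centre,
        "x_ucl": centre + E2 * mr_bar,
        "mr_cl": mr_bar,
        "mr_ucl": D4 * mr_bar,
        "moving_ranges": moving_ranges,
    }


def range_signals(values):
    """Indices of observations whose moving range exceeds the range UCL."""
    limits = imr_limits(values)
    return [
        i + 1  # moving range i spans observations i and i + 1
        for i, mr in enumerate(limits["moving_ranges"])
        if mr > limits["mr_ucl"]
    ]


# Notional per-game values; the last game jumps sharply.
games = [5, 6, 5, 6, 5, 6, 5, 15]
print(range_signals(games))
```

Any index returned marks a game where the jump from the previous game is larger than common cause variation would explain, which is the cue to look for a special cause.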
22
Which Method is Earliest at Detecting a Special
Cause?
Analytics Approach
Old Way
23
Next Steps
  • Additional MLB Analytics
  • System approach to baseball
  • Other sports?
  • Golf Fishbone Cause and Effect Analysis example

Baseball statistics are like a girl in a bikini. 
They show a lot, but not everything.  Toby
Harrah, 1983
24
(No Transcript)
25
(No Transcript)
26
Our Mission: Develop world-class batters who
use consistent, disciplined, and proven methods
of eliminating or preventing hitting problems,
thereby providing our fans excellence in batting
and league-leading run creation, resulting in
high-level fan satisfaction
  • Lean Six Sigma Analytics
  • Design for Six Sigma
  • Statistical Methods
  • Correlation/Regression Analysis
  • Design of Experiments
  • VOC / QFD
  • Taguchi Methods
  • Innovation Methods

Systems Based Potential Causes
Stadium
Accessories
Optimal Batting System Design
Pre Emptive Batting Problem Discovery
Wasiloff Young Baseball Analytics Systems
Approach to Batting
Analytic Based Reactive Batting Problem Solving
27
(No Transcript)
28
Baseball?  It's just a game - as simple as a ball
and a bat.  Yet, as complex as the American
spirit it symbolizes.  It's a sport, business -
and sometimes even religion.  Ernie Harwell,
"The Game for All America," 1955
Questions / comments? Thanks!