Statistical Procedures for Comparing Real-World System and Simulation Output Data - PowerPoint PPT Presentation


About This Presentation

Slides: 25

Provided by: Alre9

Transcript and Presenter's Notes

Title: Statistical Procedures for Comparing Real-World System and Simulation Output Data

1
Statistical Procedures for Comparing Real-World
System and Simulation Output Data

There are three approaches to comparing observations from a real-world system with output data from a corresponding simulation model.
Inspection Approach
Compute one or more statistics from the real-world system observations and the corresponding statistics from the model output data, and then compare them without the use of a formal statistical procedure. This basic inspection approach is dangerous: each statistic is subject to sampling variability, so comparing single values can mislead.

2
Statistical Procedures (cont.)

To avoid the danger of the basic inspection approach, use the correlated inspection approach, in which the model is driven by the same historical input data that the real system experienced.

3
Statistical Procedures (cont.)

Confidence-Interval (CI) Approach Based on Independent Data
A more reliable approach.
It is used to answer two questions:
How large is the mean difference, and how precise is the estimator of the mean difference?
Is there a significant difference between the actual system and the model? This leads to one of two conclusions, where Ers and Esm are the expected responses of the real system and the simulation model:
If the CI for Ers - Esm does not contain 0, then there is strong evidence that Ers ≠ Esm.
If the CI for Ers - Esm contains 0, then, based on the data at hand, there is no strong evidence that Ers ≠ Esm.

4
Statistical Procedures (cont.)

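A two-sided 100(1-α)% CI will always be of the form
$\bar{Y} \pm t_{\nu,\,1-\alpha/2}\; se(\bar{Y})$
where $\bar{Y}$ is the sample mean, $\nu$ is the degrees of freedom, $t_{\nu,\,1-\alpha/2}$ is the 100(1-α/2) percentage point of the t distribution used for the paired-t interval, and se(.) is the standard error of the specified estimator.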
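
As an illustration of the interval form above, here is a minimal Python sketch of a paired-t interval for the mean difference. The function name `paired_t_ci`, the data values, and the choice to pass the critical value in from a t table are all assumptions made for this example, not part of the slides.

```python
import math
import statistics

def paired_t_ci(real, sim, t_crit):
    """Two-sided CI for the mean difference between paired observations.

    real, sim : equal-length sequences of paired observations
    t_crit    : upper 1 - alpha/2 critical value of the t distribution with
                len(real) - 1 degrees of freedom (looked up in a t table)
    """
    if len(real) != len(sim) or len(real) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [r - s for r, s in zip(real, sim)]
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference: s_d / sqrt(n)
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    half_width = t_crit * se
    return mean_diff - half_width, mean_diff + half_width

# Hypothetical data: five paired observations, so 4 degrees of freedom,
# and t_{4, 0.975} = 2.776 for a 95% interval.
real_obs = [10.2, 11.5, 9.8, 10.9, 10.4]
sim_obs = [10.0, 11.9, 9.5, 11.2, 10.1]
low, high = paired_t_ci(real_obs, sim_obs, t_crit=2.776)
contains_zero = low <= 0.0 <= high
```

With these made-up numbers the interval is roughly (-0.40, 0.44). It contains 0, so the data give no strong evidence of a difference between system and model.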

5
Statistical Procedures (cont.)

Time-Series Approaches
Spectral-analysis approach
Compute the sample spectrum, i.e., the Fourier cosine transform of the estimated autocovariance function, of each output process, then use existing theory to construct a confidence interval for the difference of the logarithms of the two spectra.
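
The point estimate behind this approach can be sketched in Python as follows. This is only a rough illustration: it uses an unweighted, truncated cosine sum with divisor n for the autocovariances (one common convention, assumed here), and it does not construct the confidence interval itself. The function names are hypothetical.

```python
import math

def autocovariance(x, lag):
    """Estimated autocovariance at the given lag (divisor n)."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag)) / n

def sample_spectrum(x, omega, max_lag):
    """Fourier cosine transform of the estimated autocovariances at frequency omega."""
    total = autocovariance(x, 0)
    for k in range(1, max_lag + 1):
        total += 2.0 * autocovariance(x, k) * math.cos(k * omega)
    return total / (2.0 * math.pi)

def log_spectrum_difference(x, y, omega, max_lag):
    """Difference of the log sample spectra of two output series.

    math.log raises ValueError if a truncated estimate is not positive,
    which can happen with this unweighted form.
    """
    return math.log(sample_spectrum(x, omega, max_lag)) - math.log(
        sample_spectrum(y, omega, max_lag)
    )

# Hypothetical output series from the system and from the model.
system_out = [1.0, -1.0, 1.0, -1.0]
model_out = [2.0, -2.0, 2.0, -2.0]
diff = log_spectrum_difference(system_out, model_out, math.pi, max_lag=1)
```

In practice a lag window is usually applied to the autocovariances before transforming. That refinement is omitted here to keep the sketch short.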

6
Statistical Procedures (cont.)

Alternative approach
Fit a parametric time-series model to each set of output data, then apply a hypothesis test to see whether the two models appear to be the same.
Chen and Sargent's method
Construct a confidence interval for the difference between the steady-state mean of the system and the corresponding steady-state mean of the model.

7
Selecting Input Probability Distributions

How might an analyst go about specifying the input probability distributions?
All real systems contain one or more sources of randomness.
Sources of randomness for common simulation applications are given in the following table.

8
(No Transcript)
9
Selecting Input (cont.)

Useful probability distributions
Parameterization of continuous distributions
The parameters can be classified as location (γ), scale (β), or shape (α) parameters.
Continuous distributions
Continuous random variables can be used to
describe random phenomena in which the variable
of interest can take on any value in some
interval.

10
Continuous distributions (cont.)

Uniform distribution U(a,b): Models complete uncertainty, since all outcomes are equally likely.
Exponential distribution expo(β): Models the time between independent events, or a process time that is memoryless.
Gamma distribution gamma(α,β): An extremely flexible distribution used to model nonnegative random variables.

11
Continuous distributions (cont.)

Beta distribution Beta(α1,α2): An extremely flexible distribution used to model bounded random variables.
Weibull distribution weibull(α,β): Models the time to failure for components.
Normal distribution N(µ,σ²): Models the distribution of a process that can be thought of as the sum of a number of component processes.

12
Continuous distributions (cont.)

Lognormal distribution LN(µ,σ²): Models the distribution of a process that can be thought of as the product of (meaning to multiply together) a number of component processes.
Pearson type V distribution PT5(α,β): Models the time to perform some task (similar to the lognormal).

13
Continuous distributions (cont.)

Pearson type VI distribution PT6(α,β): Models the time to perform some task.
Triangular distribution triang(a,b,c): Models a process when only the minimum, most likely, and maximum values of the distribution are known.
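
Most of the continuous distributions listed above can be sampled with Python's standard `random` module. The sketch below shows one way to do it. The parameter values are arbitrary, and the mapping from the slide notation to the library's argument order is my reading, so check it against the library documentation. Note in particular that `expovariate` takes a rate (1/β), `normalvariate` takes a standard deviation rather than a variance, and `weibullvariate` takes scale first and shape second. The Pearson type V and VI distributions have no direct standard-library sampler and are omitted.

```python
import random

def sample_continuous(rng):
    """Draw one value from each distribution; parameter values are arbitrary."""
    return {
        "U(0, 10)": rng.uniform(0.0, 10.0),
        "expo(beta=2)": rng.expovariate(1.0 / 2.0),  # argument is the rate 1/beta
        "gamma(alpha=2, beta=3)": rng.gammavariate(2.0, 3.0),  # (shape, scale)
        "beta(a1=2, a2=5)": rng.betavariate(2.0, 5.0),
        "weibull(alpha=1.5, beta=4)": rng.weibullvariate(4.0, 1.5),  # (scale, shape)
        "N(mu=5, sigma=2)": rng.normalvariate(5.0, 2.0),  # std dev, not variance
        "LN(mu=0, sigma=0.5)": rng.lognormvariate(0.0, 0.5),
        "triang(min=1, mode=3, max=8)": rng.triangular(1.0, 8.0, 3.0),  # (low, high, mode)
    }

rng = random.Random(42)  # fixed seed so the sketch is reproducible
one_draw = sample_continuous(rng)
```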

14
Discrete distributions

Bernoulli distribution Bernoulli(p): Used to generate other discrete random variates.
Discrete uniform distribution DU(i,j): Models complete uncertainty, since all outcomes are equally likely.
Binomial distribution bin(t,p): Models the number of successes in t trials, when the trials are independent with common success probability p.

15
Discrete distributions (cont.)

Geometric distribution geom(p): Models the number of failures before the first success in a sequence of independent Bernoulli trials with probability p of success on each trial.
Negative binomial distribution negbin(s,p): Models the number of trials required to achieve s successes.
Poisson distribution Poisson(λ): Models the number of independent events that occur in a fixed amount of time or space.
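
The remark that Bernoulli trials are used to generate other discrete variates can be made concrete with a short Python sketch. The function names are hypothetical, and these direct constructions are chosen for clarity rather than speed.

```python
import random

def bernoulli(rng, p):
    """Return 1 with probability p, else 0."""
    return 1 if rng.random() < p else 0

def binomial(rng, t, p):
    """Number of successes in t independent Bernoulli(p) trials."""
    return sum(bernoulli(rng, p) for _ in range(t))

def geometric(rng, p):
    """Number of failures before the first success."""
    failures = 0
    while bernoulli(rng, p) == 0:
        failures += 1
    return failures

def trials_until_successes(rng, s, p):
    """Number of trials needed to achieve s successes."""
    trials = 0
    successes = 0
    while successes < s:
        trials += 1
        successes += bernoulli(rng, p)
    return trials

def discrete_uniform(rng, i, j):
    """Integer chosen uniformly from i, i+1, ..., j."""
    return rng.randint(i, j)

rng = random.Random(7)  # fixed seed for reproducibility
example = binomial(rng, 10, 0.3)
```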

16
Empirical distributions

If the modeler has been unable to find a
theoretical distribution that provides a good
model for the input data, it may be necessary to
use the empirical distribution of the data.
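
A minimal Python sketch of working with an empirical distribution is shown below. It uses the simple step-function form, in which each observation has probability 1/n. This is an assumption for illustration, since the slide does not say which form is intended, and the data values are made up.

```python
import bisect
import random

def make_ecdf(data):
    """Return the empirical CDF of the data as a function of x."""
    ordered = sorted(data)
    n = len(ordered)

    def ecdf(x):
        # Proportion of observations less than or equal to x
        return bisect.bisect_right(ordered, x) / n

    return ecdf

def sample_empirical(rng, data):
    """Draw one value from the empirical distribution (resample the data)."""
    return rng.choice(data)

observed = [3.1, 0.7, 2.4, 1.9]  # hypothetical input data
ecdf = make_ecdf(observed)
rng = random.Random(0)
draw = sample_empirical(rng, observed)
```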

17
Identify the distribution with data

Methods for selecting families of input distributions when data are available are discussed next.
Activity I
Hypothesizing Families of Distributions: Decide what general families appear to be appropriate on the basis of their shapes, without worrying about the specific parameter values for these families.

18
Identify the distribution (cont.)

Histograms and line graphs
A frequency distribution or histogram is useful in identifying the shape of a distribution.
A histogram is constructed as follows:
Divide the range of the data ($X_1, X_2, \ldots, X_n$) into $k$ disjoint intervals $[b_{j-1}, b_j)$ of equal width $\Delta b = b_j - b_{j-1}$, $j = 1, 2, \ldots, k$.
Label the horizontal axis to conform to the intervals selected.

19
Identify the distribution (cont.)

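Determine the frequency of occurrences within each interval. Define the function
$h(x) = h_j$ if $b_{j-1} \le x < b_j$ for some $j = 1, 2, \ldots, k$, and $h(x) = 0$ otherwise, where $h_j$ is the proportion of the observations that fall in the $j$-th interval.
Label the vertical axis so that the total occurrences can be plotted for each interval.
Plot the frequencies on the vertical axis.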
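
The construction steps above can be sketched in Python as follows. The function name, the equal-width intervals starting at `b0`, and the data values are assumptions for this example. The function returns the proportion of observations in each interval, which is what would be plotted on the vertical axis.

```python
def histogram_proportions(data, b0, width, k):
    """Proportion of observations in each of k intervals [b0 + j*width, b0 + (j+1)*width)."""
    counts = [0] * k
    for x in data:
        j = int((x - b0) // width)  # index of the interval containing x
        if 0 <= j < k:  # observations outside the covered range are ignored
            counts[j] += 1
    n = len(data)
    return [c / n for c in counts]

data = [0.5, 1.5, 1.7, 2.2, 3.9]  # hypothetical observations
props = histogram_proportions(data, b0=0.0, width=1.0, k=4)
```

For this data the counts per interval are 1, 2, 1, and 1, so the proportions are 0.2, 0.4, 0.2, and 0.2.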

20
Identify the distribution (cont.)

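Then compare the histogram with plots of the densities of various distributions on the basis of shape alone. This works because, for some number $y$ in the $j$-th interval,
$P(b_{j-1} \le X < b_j) = \int_{b_{j-1}}^{b_j} f(x)\,dx = \Delta b\, f(y)$,
so the height of the histogram over that interval is roughly proportional to the density $f$ there, where $\Delta b = b_j - b_{j-1}$ is the interval width.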
A rule is also needed for choosing the number of intervals $k$.
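
The formula for the number of intervals is not present in this transcript. One widely used choice, assumed here rather than taken from the slide, is Sturges's rule, $k = \lfloor 1 + \log_2 n \rfloor$. A minimal Python sketch:

```python
import math

def sturges_intervals(n):
    """Number of histogram intervals suggested by Sturges's rule for n observations."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return math.floor(1 + math.log2(n))

k_small = sturges_intervals(64)    # 1 + 6 = 7
k_large = sturges_intervals(1000)  # floor(1 + 9.97) = 10
```

In practice it is common to try several values of k and keep the one that gives the smoothest-looking histogram.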

21
Identify the distribution (cont.)

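Quantile (q) summaries and box plots
Useful for determining whether the underlying probability distribution is symmetric or skewed to the right or to the left.
If F(x) is continuous and strictly increasing wherever $0 < F(x) < 1$, then for $0 < q < 1$ the q-quantile of F(x) is that number $x_q$ such that $F(x_q) = q$.
In terms of the inverse of F(x), $x_q = F^{-1}(q)$.
The median is $x_{0.5}$.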

22
Identify the distribution (cont.)

The lower and upper quartiles are $x_{0.25}$ and $x_{0.75}$.
The lower and upper octiles are $x_{0.125}$ and $x_{0.875}$.
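
A quantile summary can be sketched in Python as below. The sample quantile used here, the order statistic at position ceil(nq), is one simple convention chosen for this example, and other definitions exist. The idea is to compare the median with the midpoints of the quartiles and of the octiles. Roughly equal values suggest symmetry, while steadily increasing (decreasing) values suggest skew to the right (left).

```python
import math

def sample_quantile(data, q):
    """Order statistic at position ceil(n*q), a simple sample q-quantile."""
    ordered = sorted(data)
    index = max(1, math.ceil(len(ordered) * q))
    return ordered[index - 1]

def quantile_summary(data):
    """Median plus the midpoints of the quartiles and of the octiles."""
    return {
        "median": sample_quantile(data, 0.5),
        "quartile_midpoint": (sample_quantile(data, 0.25) + sample_quantile(data, 0.75)) / 2,
        "octile_midpoint": (sample_quantile(data, 0.125) + sample_quantile(data, 0.875)) / 2,
    }

symmetric = quantile_summary([1, 2, 3, 4, 5, 6, 7, 8])     # hypothetical data
right_skewed = quantile_summary([1, 1, 1, 2, 2, 3, 5, 9])  # hypothetical data
```

For the first data set all three values coincide. For the second they increase from the median to the octile midpoint, which points to a right-skewed distribution.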

23
Identify the distribution (cont.)

Activity II
Estimation of parameters
After a family of distributions has been
selected, the next step is to estimate the
parameters of the distribution.
MLEs (maximum-likelihood estimators)
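The likelihood function for an unknown parameter θ is $L(\theta) = f_\theta(X_1)\, f_\theta(X_2) \cdots f_\theta(X_n)$, and the MLE is the value of θ that maximizes $L(\theta)$.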
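
As a worked case, consider the exponential distribution expo(β) with density $f_\beta(x) = (1/\beta) e^{-x/\beta}$. Its log-likelihood is $-n \ln \beta - \sum X_i / \beta$, which is maximized at the sample mean. The Python sketch below checks this numerically. The function names and data values are made up for illustration.

```python
import math

def exp_log_likelihood(beta, data):
    """Log-likelihood of expo(beta), where beta is the mean, for the observed data."""
    n = len(data)
    return -n * math.log(beta) - sum(data) / beta

def exp_mle(data):
    """MLE of beta for the exponential distribution: the sample mean."""
    return sum(data) / len(data)

data = [1.2, 0.4, 2.5, 0.9]  # hypothetical observations
beta_hat = exp_mle(data)
# The log-likelihood at the MLE should be at least as large as at nearby values.
is_max = all(
    exp_log_likelihood(beta_hat, data) >= exp_log_likelihood(b, data)
    for b in (0.5, 1.0, 1.2, 1.3, 1.5, 3.0)
)
```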

24
MLE Properties