Data Analysis presentation

Title: Data Analysis

1
Data Analysis
Chapter 8

2
In this chapter, we focus on 3 parts
Chapter 6
Data Analysis

1. Descriptive Analysis
2. Two-way Analysis of Variance
3. Forecasting

3
1. Descriptive Analysis

1.1 Index Numbers
1.2 Exponential Smoothing

4
1.1 Index Numbers

Index Number: a number that measures the change in a variable over time relative to the value of the variable during a specific base period.
Simple Index Number: an index based on the relative changes (over time) in the price or quantity of a single commodity.

5
1.1 Index Numbers

Laspeyres and Paasche Indexes compared
The Laspeyres Index weights by the purchase quantities of the baseline period.
The Paasche Index weights by the purchase quantities of the period the index value represents.
The Laspeyres Index is most appropriate when baseline purchase quantities are reasonable approximations of purchases in subsequent periods.
The Paasche Index is most appropriate when you want to compare current to baseline prices at current purchase levels.

6
1.1 Index Numbers

Calculating a Laspeyres Index
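Collect price information for the k price series (the basket) to be used, denoted $P_{1t}, P_{2t}, \ldots, P_{kt}$.
Select a base period $t_0$.
Collect purchase quantity information for the base period, denoted $Q_{1t_0}, Q_{2t_0}, \ldots, Q_{kt_0}$.
Calculate the weighted total for each time period using the formula $\sum_{i=1}^{k} Q_{it_0} P_{it}$.
Calculate the index using the formula $I_t = \dfrac{\sum_{i=1}^{k} Q_{it_0} P_{it}}{\sum_{i=1}^{k} Q_{it_0} P_{it_0}} \times 100$.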
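
The Laspeyres steps above can be sketched in Python. This is a minimal illustration using the standard Laspeyres definition (base-period quantities weight both the current and the base prices); the function name and the three-commodity basket are hypothetical and not taken from the slides.

```python
def laspeyres_index(prices_t, prices_base, quantities_base):
    """Laspeyres index for period t, weighted by base-period quantities."""
    # Weighted total for period t: sum of Q_{i,t0} * P_{i,t}
    weighted_t = sum(q * p for q, p in zip(quantities_base, prices_t))
    # Weighted total for the base period: sum of Q_{i,t0} * P_{i,t0}
    weighted_base = sum(q * p for q, p in zip(quantities_base, prices_base))
    return 100.0 * weighted_t / weighted_base


# Hypothetical basket of k = 3 commodities (illustrative numbers only).
base_prices = [2.0, 5.0, 10.0]
base_quantities = [10, 4, 1]
current_prices = [3.0, 5.0, 12.0]

index_t = laspeyres_index(current_prices, base_prices, base_quantities)
index_base = laspeyres_index(base_prices, base_prices, base_quantities)
print(f"Laspeyres index: {index_t:.1f} (base period = {index_base:.1f})")
```

By construction the index equals 100 in the base period, so values above 100 indicate that the base basket has become more expensive.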

7
1.1 Index Numbers

Calculating a Paasche Index
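Collect price information for the k price series to be used, denoted $P_{1t}, P_{2t}, \ldots, P_{kt}$.
Select a base period $t_0$.
Collect purchase quantity information for every period, denoted $Q_{1t}, Q_{2t}, \ldots, Q_{kt}$.
Calculate the index for time t using the formula $I_t = \dfrac{\sum_{i=1}^{k} Q_{it} P_{it}}{\sum_{i=1}^{k} Q_{it} P_{it_0}} \times 100$.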
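
As a rough sketch of the Paasche calculation, the snippet below weights both price sets by the quantities of the period being indexed, which is the standard Paasche definition. The function name and the numbers are hypothetical.

```python
def paasche_index(prices_t, prices_base, quantities_t):
    """Paasche index for period t, weighted by period-t quantities."""
    # Current prices at current quantities: sum of Q_{i,t} * P_{i,t}
    weighted_t = sum(q * p for q, p in zip(quantities_t, prices_t))
    # Base prices at current quantities: sum of Q_{i,t} * P_{i,t0}
    weighted_base = sum(q * p for q, p in zip(quantities_t, prices_base))
    return 100.0 * weighted_t / weighted_base


# Hypothetical basket of k = 3 commodities (illustrative numbers only).
base_prices = [2.0, 5.0, 10.0]
current_prices = [3.0, 5.0, 12.0]
current_quantities = [8, 5, 2]

paasche_t = paasche_index(current_prices, base_prices, current_quantities)
print(f"Paasche index: {paasche_t:.2f}")
```

Unlike the Laspeyres index, the weights change every period, so quantity data must be collected for each period that is indexed.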

8
1.2 Exponential Smoothing

Exponential smoothing is a type of weighted average that applies a weight w to past and current values of the time series ($Y_t$: actual value at time t).
The exponential smoothing constant w lies between 0 and 1, and the smoothed series $E_t$ is calculated as
$E_1 = Y_1, \quad E_t = w Y_t + (1 - w) E_{t-1} \text{ for } t \ge 2$
How much influence does the past have when w = 0 and when w = 1?

9
1.2 Exponential Smoothing

The smoothing constant w is selected by the researcher.
Small values of w give less weight to the current value and yield a smoother series.
Large values of w give more weight to the current value and yield a more variable series.
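
The effect of w can be illustrated with a short sketch. It assumes the usual recursion $E_1 = Y_1$, $E_t = wY_t + (1-w)E_{t-1}$, in which w is the weight on the current value; the series `sales` is hypothetical.

```python
def exponential_smoothing(series, w):
    """Return the smoothed series E_t = w * Y_t + (1 - w) * E_{t-1}, with E_1 = Y_1."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie between 0 and 1")
    smoothed = [series[0]]  # E_1 = Y_1
    for y in series[1:]:
        smoothed.append(w * y + (1.0 - w) * smoothed[-1])
    return smoothed


# Hypothetical series that alternates between two levels.
sales = [10.0, 20.0, 10.0, 20.0]

smooth_small_w = exponential_smoothing(sales, 0.1)  # stays close to the past
smooth_large_w = exponential_smoothing(sales, 0.9)  # follows the current value
print(smooth_small_w)
print(smooth_large_w)
```

At the extremes, w = 0 freezes the series at its first value (only the past counts) and w = 1 reproduces the original series (the past has no influence).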

10
2 Two-way Analysis of Variance

Two-way ANOVA is a type of study design with one
numerical outcome variable and two categorical
explanatory variables.
Example: In a completely randomised design we may wish to compare outcome by age, gender or disease severity. Subjects are grouped by one such factor and then randomly assigned one treatment.
The technical term for such a group is a block, and the study design is also called a randomised block design.

11
2 Two-way Analysis of Variance

2.1 Randomised Block Design
2.2 Analysis in Two-way ANOVA 1
2.3 Analysis of Two-way ANOVA by the regression
method

12
2.1 Randomised Block Design

Blocks are formed on the basis of expected
homogeneity of response in each block (or group).
The purpose is to reduce variation in response
within each block (or group) due to biological
differences between individual subjects on
account of age, sex or severity of disease.

13
2.1 Randomised Block Design

Randomised block design is a more robust design
than the simple randomised design.
The investigator can take into account
simultaneously the effects of two factors on an
outcome of interest.
Additionally, the investigator can test for
interaction, if any, between the two factors.

14
Steps in Planning a Randomised Block Design
2.1 Randomised Block Design

Subjects are randomly selected to constitute a
random sample.
Subjects likely to have similar response
(homogeneity) are put together to form a block.
An intervention is assigned to each member of a block such that each subject receives one treatment.
Comparisons of treatment outcomes are made within each block.

15
2.2 Analysis in Two-way ANOVA - 1

The variance (total sum of squares) is first
partitioned into WITHIN and BETWEEN sum of
squares. Sum of Squares BETWEEN is next
partitioned by intervention, blocking and
interaction

SS TOTAL = SS BETWEEN + SS WITHIN
SS BETWEEN = SS INTERVENTION + SS BLOCKING + SS INTERACTION
16
2.2 Analysis in Two-way ANOVA - 1
Analysis of Two-way ANOVA is demonstrated in the slides that follow. The study is about an experiment involving a teaching method in which professional actors were brought in to play the role of patients in a medical school. The test scores of male and female students who were taught either by the conventional method of lectures, seminars and tutorials or by the role-play method were recorded.
The hypotheses being tested are:
Role-play method is superior to the conventional way of teaching.
Female students in general have better test scores than male students.
Role-play method makes a better impact on students of a particular gender.
Thus, there are two factors, gender and teaching method, and an interaction between teaching method and gender is being sought.
17
2.2 Analysis in Two-way ANOVA - 2

Each Sum of Squares (SS) is divided by its degrees of freedom (df) to get the Mean Sum of Squares (MS).
The F statistic is computed for each of the three ratios as
MS INTERVENTION / MS WITHIN
MS BLOCK / MS WITHIN
MS INTERACTION / MS WITHIN

18
2.2 Analysis of Two-way ANOVA - 3

Analysis of Variance for score
Source     DF    SS    MS      F      P
sex         1  2839  2839  22.75  0.000
Tchmthd     1  1782  1782  14.28  0.001
Error      29  3619   125
Total      31  8240

19
2.2 Analysis of Two-way ANOVA - 4

Individual 95% CIs for mean score by Sex (interval plot, axis from 40.0 to 64.0)
Sex      Mean
0        58.5
1        39.6
Individual 95% CIs for mean score by Tchmthd (interval plot, axis from 42.0 to 63.0)
Tchmthd  Mean
0        56.5
1        41.6

20
2.2 Analysis of Two-way ANOVA - 5
Analysis of Variance for SCORE
Source      DF    SS    MS      F      P
SEX          1  2839  2839  22.64  0.000
TCHMTHD      1  1782  1782  14.21  0.001
INTERACTN    1   108   108   0.86  0.361
Error       28  3511   125
Total       31  8240
Interaction is not significant (P = 0.361).
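
The F column of this table can be reproduced from the SS and DF columns by dividing each mean square by the error mean square. The sketch below does only that arithmetic; the function name is hypothetical, and the P values are not recomputed because that would need an F-distribution routine.

```python
def anova_f_table(effects, error_ss, error_df):
    """Map each effect name to (MS, F), where F = MS_effect / MS_error."""
    ms_error = error_ss / error_df
    table = {}
    for name, (ss, df) in effects.items():
        ms = ss / df
        table[name] = (ms, ms / ms_error)
    return table


# SS and DF values as printed in the ANOVA table with interaction.
effects = {"SEX": (2839, 1), "TCHMTHD": (1782, 1), "INTERACTN": (108, 1)}
table = anova_f_table(effects, error_ss=3511, error_df=28)

for name, (ms, f_value) in table.items():
    print(f"{name:10s} MS = {ms:7.1f}  F = {f_value:5.2f}")
```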
21
2.2 Analysis of Two-way ANOVA - 6
Individual 95% CIs for mean score by SEX (interval plot, axis from 40.0 to 64.0)
SEX      Mean
0        58.5
1        39.6
Individual 95% CIs for mean score by TCHMTHD (interval plot, axis from 42.0 to 63.0)
TCHMTHD  Mean
0        56.5
1        41.6
22
2.3 Analysis of Two-way ANOVA by the regression
method (reference coding)
The regression equation is
SCORE = 65.9 - 18.8 SEX - 14.9 TCHMTHD

Predictor      Coef  SE Coef      T      P
Constant     65.913    3.420  19.27  0.000
SEX         -18.838    3.950  -4.77  0.000
TCHMTHD     -14.925    3.950  -3.78  0.001

S = 11.17   R-Sq = 56.1%   R-Sq(adj) = 53.1%

Analysis of Variance
Source           DF      SS      MS      F      P
Regression        2  4620.9  2310.4  18.51  0.000
Residual Error   29  3619.0   124.8
Total            31  8239.8
23
2.3 Analysis of Two-way ANOVA by the regression
method (effect coding)
The regression equation is
SCORE = 49.0 - 9.42 EFCT-Sex - 7.46 EFCT-Tchmthd - 1.84 Interaction

Predictor     Coef  SE Coef      T      P
Constant    49.031    1.980  24.77  0.000
EFCT-Sex    -9.419    1.980  -4.76  0.000
EFCT-Tch    -7.463    1.980  -3.77  0.001
Interact    -1.838    1.980  -0.93  0.361

S = 11.20   R-Sq = 57.4%   R-Sq(adj) = 52.8%
24
Reference Coding and Effect Coding - 1

In both methods, k-1 dummy variables are created for an explanatory variable with k categories.
In reference coding the value 1 is assigned to the group of interest and 0 to all others (e.g. Female = 1, Male = 0).
In effect coding the value -1 is assigned to the control group, 1 to the group of interest (e.g. new treatment), and 0 to all others (e.g. Female = 1, Male (control group) = -1; Role play = 1, conventional teaching (control) = -1).

25
Reference Coding and Effect Coding - 2

In reference coding the β coefficients of the regression equation provide estimates of the differences in means from the control (reference) group for various treatment groups.
In effect coding the β coefficients provide the differences from the overall mean response for each treatment group.
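
The difference between the two codings can be seen on a tiny balanced example with one two-level factor. The sketch below fits a one-predictor least-squares line under each coding; the scores and group labels are hypothetical and are not the study data.

```python
def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x with a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical scores: control mean is 55, treated mean is 35.
groups = ["control", "control", "treated", "treated"]
scores = [50.0, 60.0, 30.0, 40.0]

reference = [0 if g == "control" else 1 for g in groups]  # reference coding
effect = [-1 if g == "control" else 1 for g in groups]    # effect coding

ref_b0, ref_b1 = fit_line(reference, scores)
eff_b0, eff_b1 = fit_line(effect, scores)
print(f"reference coding: constant = {ref_b0}, coefficient = {ref_b1}")
print(f"effect coding:    constant = {eff_b0}, coefficient = {eff_b1}")
```

Under reference coding the constant is the control-group mean and the coefficient is the full difference between group means. Under effect coding the constant is the overall mean and the coefficient is the treated group's deviation from it, which for two balanced groups is half the difference.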

26

3 Marketing Forecasting

3.1 The concept of market forecast
3.2 The theoretical bases of forecast
3.3 The classification of forecast methods
3.4 Qualitative Forecast Methods
3.5 Quantitative Forecast Methods
27
3.1 The concept of market forecast

Market forecasting is the process of estimating, on the basis of market surveys and scientific methods, how the objects being forecast will develop over a certain future period, in order to help managers improve the quality of their decisions.
In this chapter, the objects being forecast are mainly the quantities of products demanded; sometimes they may also be product prices, competitive situations, environmental factors, and so on.

28
3.2 The theoretical bases of forecast

(1) The continuity principle
It is also called the inertia principle. Because of inertia, a system does not change its basic characteristics in the short run.
Note: all time series analysis methods are based on this principle.

29
3.2 The theoretical bases of forecast

(2) The analogy principle
Time analogy: making an inference about the future from the past and the present. When two or more things share similar characteristics (structure, mode, properties, and development tendency), we can forecast things that are developing or about to develop by studying things that have already developed or are more advanced.
Note: analogy is applicable to homogeneous things and also to inhomogeneous things.

30
3.2 The theoretical bases of forecast

(2) The analogy principle (continued from the previous slide)
Sampling analogy: making an inference about the whole from the part. When the whole and the part share similar characteristics, we can forecast the whole by studying the part.
Note: similarity is the key point, whether between things at different stages of development or between the whole and the part.

31
3.2 The theoretical bases of forecast

(3) The relevancy principle
This principle holds that things are related to one another, especially two correlated things or things linked by cause and effect. All statistical regression analysis methods are based on this principle.

32
3.3 The classification of forecast methods

Although there are many theoretical forecast methods, forecasts can in general be classified into two types:
qualitative forecast
quantitative forecast

33
3.3.1 Qualitative forecast

Qualitative forecast emphasizes development tendencies (and possibly essential characteristics), and is suitable for cases in which data are few or lacking, such as science and technology forecasts, development forecasts for infant industries, long-term forecasts, and forecasts of things with high uncertainty.

34
3.3.2 Quantitative forecast

Quantitative forecast emphasizes the quantitative relationships of developing things. Essentially it is a family of methods based on quantitative trend extrapolation, and is suitable for cases in which plenty of data are available.

35
3.3.3 The comparison of two methods

Qualitative forecast can contribute to the analysis of basic trends, inflection points in development, and the essence of things. Quantitative forecast gives us a numerical picture of development and makes forecast results convenient to apply. Neither method should be favored over the other; otherwise we are likely to misuse forecast methods.

36
3.4 Qualitative Forecast Methods

Delphi method
Social investigation or consumer survey
Collating sellers' opinions
Informal team discussion
Integration of experts' forecasts
The method of subjective probabilities
All of the above are non-model methods.

37
3.5 Quantitative Forecast Methods

Exponential Smoothing
Mathematical model
Signs and meanings: explain every sign and its meaning
Value of α: the larger α is, the more influence the most recent sample observations have on the forecast results, and vice versa. Recommendation: α = 2/(n+1)

38
3.5 Quantitative Forecast Methods

Exponential Smoothing
Mathematical model: horizontal trend
Mathematical model: linear trend

39
3.5 Quantitative Forecast Methods

Exponential Smoothing
Mathematical model: quadratic curve trend

40
3.5 Quantitative Forecast Methods

Exponential Smoothing
How to choose a mathematical model: according to the trend shown by the sample observations when plotted on a coordinate diagram.

41
3.5 Quantitative Forecast Methods

Exponential Smoothing
How to determine the initial values of the smoothing parameters: in general, the first observed value is used in their place.
Advantages of exponential smoothing: only a small amount of data needs to be stored, and it is suitable for short-run forecasting.
Application cases: refer to other teaching materials.

42
3.5 Quantitative Forecast Methods

The growth curve
Mathematical model
Logistic curve
Gompertz curve

43
3.5 Quantitative Forecast Methods

The growth curve
Mathematical processing of the initial observations
1. For Logistic curve
2. For Gompertz curve

The processed observation data can be used to calculate the parameters k, a and b. The calculation formulas are as follows.
44
3.5 Quantitative Forecast Methods

Calculation of k, a, and b

Note: the processed observation data must be divided into 3 groups so that we can obtain 3 sums.
When the number of initial data points is not an integer multiple of 3, we must add or remove initial data points.
45
3.5 Quantitative Forecast Methods

Linear regression
One independent variable and one dependent variable are chosen for the model, and the relationship between y and x is linear. This model is widely applied in quantitative forecasts.
The standard model:
$y = a + bx$
A non-standard equation must first be transformed into the standard model.

46
3.5 Quantitative Forecast Methods

Linear regression
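Determination of the coefficients a and b
By the method of least squares, the sum of squared errors Q is minimized. Q is calculated as follows:
$Q = \sum_{i=1}^{n} (y_i - a - b x_i)^2$
Setting the derivatives of Q with respect to a and b equal to 0 gives
$\sum_{i=1}^{n} (y_i - a - b x_i) = 0, \quad \sum_{i=1}^{n} x_i (y_i - a - b x_i) = 0$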

47
3.5 Quantitative Forecast Methods

Linear regression

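We can get a and b:
$b = \dfrac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}, \quad a = \bar{y} - b\bar{x}$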
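
A minimal Python sketch of this calculation follows, using the standard closed-form least-squares solution for the model y = a + bx. The function name and the data are hypothetical; the data lie exactly on a line so the result is easy to check by hand.

```python
def least_squares(xs, ys):
    """Return (a, b) minimizing the sum of (y - a - b*x)^2 over the data."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    a = sum_y / n - b * sum_x / n  # a = mean(y) - b * mean(x)
    return a, b


# Hypothetical observations generated from y = 10 + 2x.
periods = [1, 2, 3, 4, 5]
demand = [12.0, 14.0, 16.0, 18.0, 20.0]

a, b = least_squares(periods, demand)
forecast_next = a + b * 6  # point forecast for the next period
print(f"a = {a}, b = {b}, forecast for x = 6: {forecast_next}")
```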
48
3.5 Quantitative Forecast Methods

Linear regression
Then the forecast model is
$\hat{y} = a + bx$
It is necessary to check whether the built model is of high quality. The checking methods are:
1. Standard error analysis
49
3.5 Quantitative Forecast Methods

Linear regression
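In general, the following is required:

2. Correlation coefficient and test of significance. The correlation coefficient is calculated as
$R = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}}$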
50
3.5 Quantitative Forecast Methods

Linear regression
Discussion of the correlation coefficient R
When R = 0, y has no correlation with x. This case is called zero correlation, and the built model cannot be applied to forecasting.
When R = 1, y has a direct correlation with x.
In general, R is required to satisfy R > 0.7. When R < 0.3, the built model cannot be applied. When 0.3 < R < 0.7, the model is poor and of little value.
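
The rule of thumb above can be sketched as follows. The snippet computes the usual Pearson correlation coefficient and applies the 0.3 and 0.7 thresholds from this slide; applying them to the absolute value of R is an assumption made here so that negative correlations are handled, and the function names and data are hypothetical.

```python
import math


def correlation_coefficient(xs, ys):
    """Pearson correlation coefficient between xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def judge_model(r):
    """Classify a model by the slide's thresholds (assumption: use |R|)."""
    strength = abs(r)
    if strength > 0.7:
        return "acceptable"
    if strength < 0.3:
        return "not applicable"
    return "poor"


# Hypothetical observations.
xs = [1, 2, 3, 4, 5]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

r = correlation_coefficient(xs, ys)
print(f"R = {r:.4f}: {judge_model(r)}")
```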

51
3.5 Quantitative Forecast Methods

Linear regression
The quality of the regression model is also tested by significance.
If $R > R_\alpha(n-m)$, the built model is good and worth applying. On the contrary, if $R \le R_\alpha(n-m)$, the built model is worthless.
$R_\alpha(n-m)$ is the critical value of the correlation coefficient R. It is found by looking it up in the given table. α is the given level of significance, such as 0.05. (n-m) is the degrees of freedom, such as n-2, where m is the number of variables.

52
3.5 Quantitative Forecast Methods

Linear regression
Application of the model: if the future value of x is known as $x_0$, the interval value of the forecast variable is

Here, $s_0$ is determined by the formula
53
3.5 Quantitative Forecast Methods

Linear regression
and the t value is taken from the t-distribution with significance level α and n-m-1 degrees of freedom, where n is the number of observations and m is the number of variables.
In addition, many non-linear equations can be transformed into linear regressions. For example

54
3.5 Quantitative Forecast Methods

Linear regression

Then we can get a linear equation. The same approach is suitable for exponential functions, logarithmic functions, reciprocal functions, etc. Such functions are said to be linearizable regressions with a single variable.
Application case: see other teaching materials.
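
As one illustration of such a transformation, the exponential function $y = a e^{bx}$ becomes linear after taking logarithms: $\ln y = \ln a + bx$. The sketch below fits that line by least squares and converts the intercept back. The exponential form, the function name and the data are assumptions chosen for illustration, since the slide's own example is not preserved in the transcript.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by regressing ln(y) on x (requires y > 0)."""
    log_ys = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_log_y = sum(log_ys) / n
    sxy = sum((x - mean_x) * (ly - mean_log_y) for x, ly in zip(xs, log_ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = math.exp(mean_log_y - b * mean_x)  # intercept of the line is ln(a)
    return a, b


# Hypothetical data generated exactly from y = 3 * exp(0.5 * x).
xs = [0, 1, 2, 3]
ys = [3.0 * math.exp(0.5 * x) for x in xs]

a_hat, b_hat = fit_exponential(xs, ys)
print(f"a = {a_hat:.4f}, b = {b_hat:.4f}")
```

Note that least squares on the transformed data minimizes errors in ln y rather than in y, so the fit can differ from a direct non-linear fit when the data are noisy.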
55
3.5 Quantitative Forecast Methods

Analogy forecasting method: a case of application
We can forecast a target variable by studying the relationship between that variable and an economic indicator (for example, per capita national income, NI, or gross national product, GNP).
The relationship between vehicle population and NI is given on page 78 of the textbook.

56
3.5 Quantitative Forecast Methods

Elastic coefficient method
For example, we can get the average growth rate of vehicle sales by observing sales in past years, but this rate is only a surface picture. If we analyze the growth rate of sales together with the growth rate of an economic indicator, we can improve forecast quality. A detailed case is given in the textbook.
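
One common way to formalize this idea is sketched below: the elasticity coefficient is taken as the ratio of the sales growth rate to the indicator growth rate, and the forecast sales growth is the elasticity times the expected indicator growth. This formulation, the function names and all numbers are assumptions for illustration; the textbook case may differ.

```python
def average_growth_rate(first, last, periods):
    """Compound average growth rate per period between two values."""
    return (last / first) ** (1.0 / periods) - 1.0


def elasticity_coefficient(sales_growth, indicator_growth):
    """Ratio of sales growth to growth of the economic indicator."""
    return sales_growth / indicator_growth


# Hypothetical history: sales 100 -> 121 and indicator 1000 -> 1102.5 over 2 years.
sales_growth = average_growth_rate(100.0, 121.0, 2)
indicator_growth = average_growth_rate(1000.0, 1102.5, 2)
elasticity = elasticity_coefficient(sales_growth, indicator_growth)

# Assumed indicator growth for next year (hypothetical planning figure).
expected_indicator_growth = 0.04
forecast_sales = 121.0 * (1.0 + elasticity * expected_indicator_growth)
print(f"elasticity = {elasticity:.2f}, forecast sales = {forecast_sales:.2f}")
```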

57
3.5 Quantitative Forecast Methods

Combination forecasting method
Concept of combination forecasting: reaching a final forecast conclusion by synthesizing multiple intermediate forecast results, obtained either from multiple models or from the same model with multiple independent variables.
The core idea: combination helps to eliminate the chance effects of a single model or a single independent variable.
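
A simple way to combine intermediate forecasts is a weighted average, sketched below. The slides do not specify a combination rule, so the weighting scheme, the function name and the numbers are assumptions for illustration only.

```python
def combine_forecasts(forecasts, weights=None):
    """Weighted average of forecasts; equal weights when none are given."""
    if weights is None:
        weights = [1.0] * len(forecasts)
    if len(weights) != len(forecasts):
        raise ValueError("forecasts and weights must have the same length")
    total_weight = sum(weights)
    return sum(w * f for w, f in zip(weights, forecasts)) / total_weight


# Hypothetical forecasts of the same quantity from three different models.
model_forecasts = [120.0, 130.0, 110.0]

simple_average = combine_forecasts(model_forecasts)
weighted_average = combine_forecasts(model_forecasts, [0.5, 0.3, 0.2])
print(simple_average, weighted_average)
```

In practice the weights might reflect each model's past accuracy; averaging tends to dampen errors that are specific to any single model.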

58
3.5 Quantitative Forecast Methods

Forecasting experience
Policy variables: it is very difficult to forecast changes of policy, but we can strengthen the monitoring of environmental factors, paying particular attention to the state of the national economy. Establishing a monitoring and early-warning system for the national economy is very necessary.

59
3.5 Quantitative Forecast Methods

Forecasting experience
Predictive accuracy versus goodness of fit of the model
Simple models versus complex models
A single forecast result versus multiple results
Reliability of forecast conclusions: three points are very important, namely real (authoritative) initial data, accuracy of mathematical models, and correctness of forecast procedures.
