Two-stage Cluster Sampling When Clusters are of Unequal Size - PowerPoint PPT Presentation

Last modified by: nkfust. Created Date: 5/13/2002 2:19:22 PM.

Slides: 87


1
Cluster Analysis
2
  • The term cluster analysis, first used by Tryon (1939),
    encompasses a number of different algorithms and methods
    for grouping objects of similar kind into respective
    categories.

3
(No Transcript)
4
(No Transcript)
5
(No Transcript)
6
(No Transcript)

7
  • Euclidean distance
  • N objects, each measured on M variables; X is the N x M data matrix

8
dij
9
  • Matching coefficient

10
Ex: attributes of objects i and j (coded 1 or 0 for each attribute)
11
(No Transcript)
12
  • Non-hierarchical methods

a. Sequential threshold
13
b. Parallel threshold
c. Optimizing partitioning, guided by a criterion measure
14
d. K-means method
15
Hierarchical methods
16
(No Transcript)
17
K-means procedure (steps 1 to 3)
18
Ex: initial partition into clusters (1,2) and (3,4)
19
$D^2 = (12-2)^2 + (8-6)^2 = 104$
20
Clusters (1) and (2,3,4); for cluster (1), X1 = 12 and X2 = 8.
21
With K = 2, the final clusters are (1) and (2,3,4).
22
Two-stage Cluster Sampling When Clusters are of
Unequal Size
  • p = n/N: desired sample proportion
  • a: desired number of clusters selected in the 1st
    stage
  • A: total number of clusters
  • b: sample size within each cluster selected
  • Ni: number of elements in cluster i

23
Simple Two-stage Cluster Sampling
  • The first-stage probability: p1 = a/A
  • The second-stage probability: p2 = p / (a/A)
  • Sample size in cluster i: ni = p2 Ni
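
The three formulas above can be sketched in a few lines of Python. This is only an illustration: the function name, the argument values, and the three cluster sizes are hypothetical; only the formulas p1 = a/A, p2 = p / (a/A), and ni = p2 Ni come from the slide.

```python
from fractions import Fraction

def simple_two_stage(n, N, a, A, cluster_sizes):
    """Return (p, p1, p2, n_i) for simple two-stage cluster sampling."""
    p = Fraction(n, N)    # desired overall sampling proportion
    p1 = Fraction(a, A)   # first-stage probability, identical for every cluster
    p2 = p / p1           # second-stage probability inside a selected cluster
    n_i = [p2 * Ni for Ni in cluster_sizes]  # expected sample size per selected cluster
    return p, p1, p2, n_i

# Hypothetical inputs: three selected clusters of unequal size.
p, p1, p2, n_i = simple_two_stage(n=1000, N=200_000, a=100, A=2000,
                                  cluster_sizes=[50, 100, 150])
print(p, p1, p2, [float(x) for x in n_i])
```

Because p2 is the same in every cluster, the sample size ni grows with the cluster size Ni, so the total sample size depends on which clusters happen to be drawn.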


24
Probability Proportional to Size
where
25
Example
  • Draw a sample of 1,000 households from a city
    that contains about 200,000 households
    distributed among 2000 blocks of unequal but
    known size.
  • The desired sample proportion: p = 1/200
  • The desired number of clusters selected in the 1st
    stage: a = 100
  • How do we conduct the two-stage cluster sampling?
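
One possible answer, sketched in Python under an assumption: the usual probability-proportional-to-size design, in which block i is selected with probability a·Ni/N and a fixed number b = n/a of households is then drawn inside each selected block. These two formulas are the standard textbook form and are assumed here, not taken from the slides; the block sizes in the loop are hypothetical.

```python
from fractions import Fraction

N = 200_000   # households in the city
n = 1_000     # desired sample size
A = 2_000     # blocks (clusters) in the city
a = 100       # blocks to select in the first stage

p = Fraction(n, N)   # overall sampling proportion, 1/200
b = n // a           # households taken from each selected block under PPS

def pps_probabilities(Ni):
    """First- and second-stage probabilities for a block of size Ni (assumed PPS design)."""
    p1 = Fraction(a * Ni, N)   # selection probability proportional to block size
    p2 = Fraction(b, Ni)       # fixed take of b households within the block
    return p1, p2

# Hypothetical block sizes: the product p1 * p2 equals p for each of them.
for Ni in (40, 100, 250):
    p1, p2 = pps_probabilities(Ni)
    print(Ni, p1, p2, p1 * p2)
```

Under this design every selected block contributes the same number of households, so the total sample size is fixed at a·b = 1,000 even though the blocks differ in size.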

26
What is Cluster Analysis?
  • Cluster Analysis is a class of statistical
    techniques that can be applied to data that
    exhibit natural groupings.
  • CA is an interdependence technique that makes no
    distinction between dependent and independent
    variables.
  • There is NO statistical significance testing in
    CA.
  • CA is more a group of different algorithms that
    put objects into clusters following well-defined
    similarity rules.

27
What is A Cluster?
  • A cluster is a group of relatively homogeneous
    cases and observations.
  • Clusters exhibit high internal homogeneity and
    high external heterogeneity.

28
A Cluster Diagram: Drinkers' Perceptions of
Alcohol
29
Characteristics of CA
  • Cluster Analysis is a tool of discovery.
  • It discovers structures in data but does NOT
    explain why they exist.
  • CA is used when we do not have an a priori
    hypothesis, but when we are in the exploratory
    phase.

30
How does CA differ?
  • From Discriminant Analysis
  • Discriminant analysis is a dependence technique.
  • It predicts the probability that an object will fall
    into one of two or more mutually exclusive
    categories based on several independent
    variables.
  • It finds a linear combination of independent
    variables.
  • Cluster analysis instead finds natural groupings based
    on distances among objects.

31
  • From Factor Analysis
  • Similar to cluster analysis in that it is an
    interdependence technique.
  • Primary difference lies in the focus on objects
    and variables.
  • Factor analysis reduces variables to a few
    factors. Cluster analysis reduces objects to a
    few clusters.

32
Cluster Analysis Methods
  • Three Cluster Analysis Methods
  • Joining (Tree Clustering)
  • Two-way Joining
  • K-means Clustering

33
Joining (Tree Clustering)
  • A type of hierarchical clustering (agglomerative)
  • Each unit starts as its own cluster.
  • Dendrogram
  • Many other methods

34
The first level shows all samples xi as singleton
clusters. At higher levels, more samples are
clustered together in a hierarchical manner.
35
It is based on sets where each cluster level may
contain sets that are subclusters as shown in the
Venn diagram.
36
Two-way Joining: Hartigan (1975)
  • Two-way Joining tries to cluster both variables
    and objects.
  • Only useful if you think clustering along BOTH
    lines will be useful.
  • Very rare in application.

37
k-Means Clustering
  • Begin with a preconception about the number of
    clusters (k).
  • Thought of as ANOVA in reverse.
  • ANOVA evaluates between-group variance against
    within-group variance when computing the statistical
    significance of the hypothesis that groups are different.
  • In k-means, the computer tries to move objects
    in and out of the groups to get the most
    significant ANOVA results.
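
The idea of moving objects between groups can be sketched as a plain k-means loop. This is a minimal illustration, not the procedure of any particular package: the function name `kmeans`, the toy points, and the seeds are hypothetical.

```python
import math

def kmeans(points, seeds, max_iter=20):
    """Plain k-means: assign each point to its nearest centroid, recompute means, repeat."""
    centroids = [tuple(s) for s in seeds]
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_assignment == assignment:   # no object moved, so stop
            break
        assignment = new_assignment
        for k in range(len(centroids)):
            members = [p for p, c in zip(points, assignment) if c == k]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[k] = tuple(sum(v) / len(members) for v in zip(*members))
    return assignment, centroids

# Hypothetical toy data with two obvious groups and k = 2.
points = [(1, 1), (1, 2), (8, 8), (9, 8)]
assignment, centroids = kmeans(points, seeds=[(1, 1), (9, 8)])
print(assignment, centroids)
```

Each pass reduces the within-group spread relative to the between-group spread, which is the sense in which the method works like ANOVA in reverse.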

38
It's all about distance
  • Distance Measures
  • Euclidean Distance
  • Squared Euclidean Distance
  • Manhattan Distance
  • Chebychev Distance
  • Power Distance

39
Equation: Euclidean Distance
  • Basic equation for determining distance measure.
  • $\text{distance}(x, y) = \left\{ \sum_i (x_i - y_i)^2 \right\}^{1/2}$
  • A standard formula for determining the distance
    between two points on a plane
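
The distance measures listed on the previous slide can be written as short Python functions. This is a sketch under stated assumptions: the function names are illustrative, and the power distance is assumed to take the common form (sum of |xi - yi|^p)^(1/r), which the slides do not spell out.

```python
def euclidean(x, y):
    """Square root of the summed squared differences."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5

def squared_euclidean(x, y):
    """Same as Euclidean distance without the square root."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def manhattan(x, y):
    """Sum of absolute differences (city-block distance)."""
    return sum(abs(a - b) for a, b in zip(x, y))

def chebyshev(x, y):
    """Largest absolute difference on any single dimension."""
    return max(abs(a - b) for a, b in zip(x, y))

def power_distance(x, y, p, r):
    """Assumed form: (sum |x_i - y_i|^p)^(1/r); p = r = 2 gives Euclidean distance."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / r)

x, y = (0, 0), (3, 4)
print(euclidean(x, y), squared_euclidean(x, y), manhattan(x, y),
      chebyshev(x, y), power_distance(x, y, p=2, r=2))
```

Squared Euclidean distance puts more weight on objects that are far apart, while Chebyshev distance treats two objects as different if they differ strongly on any one dimension.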

40
Fairly simple, right?
41
In other words, how do we get from this
42
To this
43
To this
44
How to Determine Clusters.
  • Use a computer.
  • Call a professional.

45
  • Clusters in the Real World

46
Why is Cluster Analysis Important?
  • Relatively new and still evolving technique
  • Highly useful for market segmentation
  • Segmentation: identifying groupings of customers
    using statistical multivariate analysis, often
    based on perceptions and attitudes as well as
    demographics and behavior.
  • Segmentation is helpful to small companies
    attempting to carve out a niche
  • It is also helpful to large companies trying to tailor
    their products/services to different segments

47
In addition to segmentation, clusters are used to
  • Design products and establish brands
  • Target direct mail
  • Make decisions about customer conversion and
    retention
  • Decide on marketing cost levels

48
Ex: Luxury Car Customers
  • Demographic examples are easier to illustrate
  • Demographics
  • Gender
  • Education
  • Age
  • 149 customers (objects) of a luxury car dealership

49
Using SPSS for Clustering
  • Chose TwoStep Cluster Analysis
  • Basically, the agglomerative technique
    (dendrogram).
  • Step One: creates very small (individual)
    sub-clusters.
  • Step Two: clusters the sub-clusters into the desired
    number of clusters.
  • Automatically finds the optimum number of clusters.

50
Two-Step CA Output
What are these clusters?
51
Two-Step CA Output
52
(No Transcript)
53
(No Transcript)
54
What does this mean?
  • Cluster 5
  • Age: 36-65
  • Education: high school graduate or above
  • Gender: female
  • We could have used k-means, which would have generated
    different results.
  • Clustering is a powerful marketing research tool.

55
Claritas Clustering Experts
  • Example: Claritas Corporation
  • Claritas founded the U.S. geodemographic industry
    when it launched the first PRIZM segmentation
    system in 1974.
  • PRIZM (Potential Rating Index for Zip Markets)
    categorizes every U.S. neighborhood into 1 of 62
    clusters.
  • Descriptive Names
  • Money and Brains
  • Young Literati
  • Shotguns and Pickups

56
Money and Brains
  • Sophisticated Urban Fringe Couples
  • Cluster is a mix of family types: singles,
    married couples with children, and married couples
    without children. These families own their homes
    in upscale neighborhoods near cities. Dual
    incomes provide luxuries, travel, and
    entertainment.
  • Demographics
  • Affluent
  • Age groups: 55-64, 65+
  • Predominantly White, High Asian

57
Clusters Work!
  • At a conservative estimate, more than 20,000
    companies in the United States and Canada alone
    used clusters as part of their marketing
    information mix last year.

58
Web Sources
  • http://cwis.livjm.ac.uk/bus/busrmccl/ae230/lect10.ppt
  • http://www.clusterbigip1.claritas.com/claritas/Default.jsp?main3submenusegsubcatsegprizm
  • http://www.clusterbigip1.claritas.com/claritas/Default.jsp?main3submenusegsubcatsegprizmne
  • http://www.insightsc.ie/newsletter7.htm
  • http://www.directionsmag.com/article.asp?article_id12
  • http://fun.supereva.it/scoleri.freeweb/cern/biografie/hawking.jpg
  • http://www.statsoft.com/textbook/stcluan.html
  • http://www-db.stanford.edu/ullman/mining/cluster1.pdf
  • http://www.snr.missouri.edu/multivariate/ClusterAnalysis.pdf

59
Print Sources
  • Recent Developments in Clustering and Data
    Analysis. Edited by Chikio Hayashi, Edwin Diday,
    Michel Jambu, Noboru Ohsumi. Academic Press,
    Inc. 1988.
  • Finding Groups in Data: An Introduction to
    Cluster Analysis. Leonard Kaufman, Peter J.
    Rousseeuw. John Wiley and Sons, Inc. 1990.
  • Marketing Research: An Aid to Decision Making.
    Dr. Alan T. Shao. South-Western. 2002.
  • Exploring Marketing Research. William G. Zikmund.
    South-Western. 2003.

60
Ex 7 Hypothetical Data
Subject Id. Income (1000) Education (years)
S1 5 5
S2 6 6
S3 15 14
S4 16 15
S5 25 20
S6 30 19
61
Similarity Matrix (Squared Euclidean Distances)
Id    S1     S2     S3     S4     S5     S6
S1    0      2      181    221    625    821
S2    2      0      145    181    557    745
S3    181    145    0      2      136    250
S4    221    181    2      0      106    212
S5    625    557    136    106    0      26
S6    821    745    250    212    26     0
$d(S_1, S_3) = (15-5)^2 + (14-5)^2 = 181$. The smallest entry is
$d(S_1, S_2) = 2$ (tied with $d(S_3, S_4) = 2$), so S1 and S2 are merged first.
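
The matrix can be reproduced from the six (income, education) pairs in the hypothetical data. The sketch below assumes the entries are squared Euclidean distances, which is what the printed values correspond to; the variable and function names are illustrative.

```python
from itertools import combinations

# Income (1000) and education (years) for the six subjects.
subjects = {"S1": (5, 5), "S2": (6, 6), "S3": (15, 14),
            "S4": (16, 15), "S5": (25, 20), "S6": (30, 19)}

def squared_distance(x, y):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

matrix = {a: {b: squared_distance(subjects[a], subjects[b]) for b in subjects}
          for a in subjects}

# The first pair with the smallest distance is the first candidate for merging.
closest = min(combinations(subjects, 2), key=lambda pair: matrix[pair[0]][pair[1]])

for name, row in matrix.items():
    print(name, [row[other] for other in subjects])
print("closest pair:", closest)
```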
62
Centroid Method: Data for Five Clusters
Cluster   Members   Income (1000)       Education (years)
1         S1, S2    5.5 = (5 + 6)/2     5.5 = (5 + 6)/2
2         S3        15                  14
3         S4        16                  15
4         S5        25                  20
5         S6        30                  19
63
Similarity Matrix (Squared Euclidean Distances)
Id       S1,S2    S3       S4       S5       S6
S1,S2    0        162.5    200.5    590.5    782.5
S3       162.5    0        2        136      250
S4       200.5    2        0        106      212
S5       590.5    136      106      0        26
S6       782.5    250      212      26       0
$d(\{S_1,S_2\}, S_3) = (5.5-15)^2 + (5.5-14)^2 = 162.5$. The smallest entry is
$d(S_3, S_4) = 2$, so S3 and S4 are merged next.
64
Centroid Method: Data for Four Clusters
Cluster   Members   Income (1000)         Education (years)
1         S1, S2    5.5 = (5 + 6)/2       5.5 = (5 + 6)/2
2         S3, S4    15.5 = (15 + 16)/2    14.5 = (14 + 15)/2
3         S5        25                    20
4         S6        30                    19
65
Similarity Matrix (Squared Euclidean Distances)
Id       S1,S2    S3,S4    S5       S6
S1,S2    0        181      590.5    782.5
S3,S4    181      0        120.5    230.5
S5       590.5    120.5    0        26
S6       782.5    230.5    26       0
$d(\{S_1,S_2\}, S_5) = (5.5-25)^2 + (5.5-20)^2 = 590.5$. The smallest entry is
$d(S_5, S_6) = 26$, so S5 and S6 are merged next.
66
Centroid Method: Data for Three Clusters
Cluster   Members   Income (1000)         Education (years)
1         S1, S2    5.5 = (5 + 6)/2       5.5 = (5 + 6)/2
2         S3, S4    15.5 = (15 + 16)/2    14.5 = (14 + 15)/2
3         S5, S6    27.5 = (25 + 30)/2    19.5 = (20 + 19)/2
67
Similarity Matrix (Squared Euclidean Distances)
Id       S1,S2    S3,S4    S5,S6
S1,S2    0        181      680
S3,S4    181      0        169
S5,S6    680      169      0
$d(\{S_1,S_2\}, \{S_5,S_6\}) = (5.5-27.5)^2 + (5.5-19.5)^2 = 680$. The smallest entry is
$d(\{S_3,S_4\}, \{S_5,S_6\}) = 169$, so these two clusters are merged next.
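
The whole merge sequence of the centroid method can be sketched in Python. This is a minimal illustration, not the SAS procedure: the function name `centroid_linkage` and the "+" naming of merged clusters are hypothetical, and ties are broken by taking the first closest pair found.

```python
import math
from itertools import combinations

def centroid_linkage(points):
    """Repeatedly merge the two clusters whose centroids are closest."""
    clusters = {name: [xy] for name, xy in points.items()}
    history = []
    while len(clusters) > 1:
        centroids = {
            name: tuple(sum(v) / len(members) for v in zip(*members))
            for name, members in clusters.items()
        }
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: math.dist(centroids[pair[0]], centroids[pair[1]]))
        history.append((a, b, math.dist(centroids[a], centroids[b])))
        # The merged cluster keeps all original members, so its centroid is their mean.
        clusters[a + "+" + b] = clusters.pop(a) + clusters.pop(b)
    return history

points = {"S1": (5, 5), "S2": (6, 6), "S3": (15, 14),
          "S4": (16, 15), "S5": (25, 20), "S6": (30, 19)}
history = centroid_linkage(points)
for a, b, d in history:
    print(f"merge {a} and {b} at centroid distance {d:.4f}")
```

The distances here are plain (unsquared) Euclidean distances between centroids, so the fourth merge appears at 13, the square root of the 169 in the matrix above.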
68
Exhibit 7-1: SAS Output for cluster analysis on
data in Table 7.1
Simple statistics
          Mean       Std Dev   Skewness   Kurtosis   Bimodality
INCOME    16.1667    9.9883     0.2684    -1.4015    0.2211
EDUC      13.1667    6.3692    -0.4510    -1.8108    0.2711
Root-Mean-Square Total-Sample Standard Deviation = 8.376555

69
Root-Mean-Square Total-Sample Standard
Deviation = 8.376555 (RMSSTD)
Step   Number of   Clusters      Frequency of   RMS STD of    Semipartial   R-Squared   Centroid
       Clusters    Joined        New Cluster    New Cluster   R-Squared                 Distance
1      5           S1    S2      2              0.707107      0.001425      0.998575     1.4142
2      4           S3    S4      2              0.707107      0.001425      0.997150     1.4142
3      3           S5    S6      2              2.549510      0.018527      0.978622     5.0990
4      2           CL4   CL3     4              5.522681      0.240855      0.737767    13.0000
5      1           CL5   CL2     6              8.376555      0.737767      0.000000    19.7041
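
Two columns of this output can be checked by hand. The sketch below assumes the following definitions, which reproduce the printed values: RMS STD of a cluster is the square root of its pooled variance over all variables, and the semipartial R-squared of a merge is the increase in within-cluster sum of squares divided by the total sum of squares. The function names are illustrative.

```python
def sum_of_squares(members):
    """Sum of squared deviations from the cluster mean, over all variables."""
    means = [sum(v) / len(members) for v in zip(*members)]
    return sum((x - m) ** 2 for row in members for x, m in zip(row, means))

def rmsstd(members):
    """Root-mean-square standard deviation of a cluster with at least two members."""
    n_vars = len(members[0])
    return (sum_of_squares(members) / (n_vars * (len(members) - 1))) ** 0.5

data = [(5, 5), (6, 6), (15, 14), (16, 15), (25, 20), (30, 19)]
total_ss = sum_of_squares(data)

def semipartial_r2(left, right):
    """Share of the total sum of squares lost by merging two clusters."""
    increase = (sum_of_squares(left + right)
                - sum_of_squares(left) - sum_of_squares(right))
    return increase / total_ss

print(round(rmsstd(data), 6))                                   # whole sample
print(round(rmsstd([(5, 5), (6, 6)]), 6))                       # step 1, S1 and S2
print(round(semipartial_r2([(5, 5)], [(6, 6)]), 6))             # step 1
print(round(semipartial_r2(data[2:4], data[4:6]), 6))           # step 4, CL4 and CL3
```

A large jump in semipartial R-squared, as at steps 4 and 5, signals that two dissimilar clusters are being forced together, which supports stopping at three clusters.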
70
CLUSTER 1                     CLUSTER 2                     CLUSTER 3
OBS  SID  INCOME  EDUC        OBS  SID  INCOME  EDUC        OBS  SID  INCOME  EDUC
1    S1   5       5           3    S3   15      14          5    S5   25      20
2    S2   6       6           4    S4   16      15          6    S6   30      19

71
Exhibit 7.2: Non-hierarchical Clustering On Data
Replace=FULL   Radius=0   Maxclusters=3   Maxiter=20   Converge=0.02
Initial Seeds
Cluster   INCOME     EDUC
1          5.0000     5.0000
2         30.0000    19.0000
3         16.0000    15.0000

The initial seeds are S1, S6, and S4.
72
Exhibit 7-2 (continued)
Minimum Distance Between Seeds = 14.56022
Iteration   Change in Cluster Seeds
            1           2          3
1           0.707107    2.54951    0.707107
2           0           0          0
Statistics for Variables
Variable    Total STD   Within STD   R-Squared   RSQ/(1-RSQ)
INCOME      9.988327    2.121320     0.972937     35.950617
EDUC        6.369197    0.707107     0.992605    134.222222
OVER-ALL    8.376555    1.581139     0.978622     45.777778

73
Exhibit 7-2 (continued)
Pseudo F Statistic = 68.67
Approximate Expected Over-All R-Squared = .
Cubic Clustering Criterion = .
WARNING: The two above values are invalid for
correlated variables.
Cluster Means
Cluster   INCOME     EDUC
1          5.5000     5.5000
2         27.5000    19.5000
3         15.5000    14.5000
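
The over-all R-squared and the pseudo F statistic for this three-cluster solution can be recomputed from the six data points. The sketch assumes the usual definitions, R-squared = 1 - within SS / total SS and pseudo F = [between SS / (k - 1)] / [within SS / (n - k)], which agree with the printed 0.978622 and 68.67; the names are illustrative.

```python
def sum_of_squares(members):
    """Sum of squared deviations from the mean, over all variables."""
    means = [sum(v) / len(members) for v in zip(*members)]
    return sum((x - m) ** 2 for row in members for x, m in zip(row, means))

# Final clusters from the exhibit: {S1, S2}, {S5, S6}, {S3, S4}.
clusters = [[(5, 5), (6, 6)], [(25, 20), (30, 19)], [(15, 14), (16, 15)]]
all_points = [p for c in clusters for p in c]

total = sum_of_squares(all_points)
within = sum(sum_of_squares(c) for c in clusters)
n, k = len(all_points), len(clusters)

r2 = 1 - within / total
pseudo_f = ((total - within) / (k - 1)) / (within / (n - k))
print(round(r2, 6), round(pseudo_f, 2))
```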
74
Exhibit 7.4: Hierarchical Cluster Analysis For
Food Data
SINGLE LINKAGE CLUSTER ANALYSIS
SIMPLE STATISTICS
            MEAN       STD DEV    SKEWNESS   KURTOSIS   BIMODALITY
CALORIES    207.407    101.208     0.542     -0.675     0.478
PROTEIN      19.000      4.252    -0.824      1.327     0.357
FAT          13.481     11.257     0.790     -0.624     0.589
CALCIUM      43.963     78.034     3.159     11.345     0.746
IRON          2.381      1.461     1.230      1.469     0.518

75
Exhibit 7.4 (continued)
COMPLETE LINKAGE CLUSTER ANALYSIS
NUMBER OF   CLUSTERS JOINED             FREQUENCY OF   RMS STD OF    SEMIPARTIAL   R-SQUARED   MAXIMUM
CLUSTERS                                NEW CLUSTER    NEW CLUSTER   R-SQUARED                 DISTANCE
10          CL15   CANNED CRABMEAT      4              11.32324      0.003476      0.985594     50.6665
9           CL17   ROAST LAMB SHOUL     3              12.59929      0.003226      0.982367     55.6611
8           CL14   CANNED SHRIMP        3              16.10565      0.005231      0.977136     71.1677
7           CL13   ROAST BEEF           6              14.34190      0.009755      0.967381     80.9343
6           CL10   CL8                  7              22.14096      0.023782      0.943599    108.1758
5           CL9    CL11                 11             20.22234      0.039103      0.904496    141.7814
4           CL6    CL12                 9              30.07489      0.048662      0.855835    154.4447
3           CL7    CL5                  17             38.73570      0.220433      0.635402    262.5666
2           CL4    CANNED SARDINES      10             51.36181      0.192623      0.442779    364.8934
1           CL3    CL2                  27             57.40958      0.442779      0.000000    433.7617

76
Exhibit 7.4 (continued)
ROOT-MEAN-SQUARE TOTAL-SAMPLE STANDARD DEVIATION = 57.4096
NUMBER OF   CLUSTERS JOINED                           FREQUENCY OF   RMS STD OF    SEMIPARTIAL   R-SQUARED   MINIMUM
CLUSTERS                                              NEW CLUSTER    NEW CLUSTER   R-SQUARED                 DISTANCE
10          CANNED MACKEREL   CANNED SALMON           2              11.16786      0.001455      0.973438     35.3159
9           CL14              ROAST LAMB SHOULDER     3              12.59929      0.003226      0.970211     35.4131
8           CL11              CANNED CRABMEAT         12             16.80697      0.014701      0.955510     39.5267
7           CL15              CL9                     8              20.48901      0.028341      0.927169     40.1627
6           CL7               CL8                     20             40.04817      0.285060      0.642109     40.2746
5           CL12              CANNED SHRIMP           3              16.10565      0.005231      0.636878     44.8504
4           CL6               ROAST BEEF              21             43.49500      0.085924      0.550954     45.7642
3           CL4               CL5                     24             48.72189      0.189548      0.361406     48.7139
2           CL3               CL10                    26             50.53988      0.106595      0.254811     62.2624
1           CL2               CANNED                  27             57.40958      0.254811      0.000000    211.5691

77
Exhibit 7.4 (continued)
CENTROID HIERARCHICAL CLUSTER ANALYSIS
NUMBER OF   CLUSTERS JOINED                FREQUENCY OF   RMS STD OF    SEMIPARTIAL   R-SQUARED   CENTROID
CLUSTERS                                   NEW CLUSTER    NEW CLUSTER   R-SQUARED                 DISTANCE
10          CL15   CANNED CRABMEAT         4              11.32324      0.003476      0.985594     44.5633
9           CL16   ROAST LAMB SHOULDER     3              12.59929      0.003226      0.982367     45.5370
8           CL14   CANNED SHRIMP           3              16.10565      0.005231      0.977136     57.9815
7           CL13   CL10                    12             16.80697      0.026857      0.950279     65.6901
6           CL12   ROAST BEEF              6              14.34190      0.009755      0.940524     70.8222
5           CL6    CL9                     9              24.36751      0.039727      0.900797     92.2533
4           CL8    CL11                    5              26.85628      0.026158      0.874639     96.6423
3           CL7    CL4                     17             31.36108      0.113709      0.760930    117.4906
2           CL5    CL3                     26             50.53988      0.506119      0.254811    191.9655
1           CL2    CANNED SARDINES         27             57.40958      0.254811      0.000000    336.7134

78
Exhibit 7.4 (continued)
WARD'S MINIMUM VARIANCE CLUSTER ANALYSIS
NUMBER OF   CLUSTERS JOINED            FREQUENCY OF   RMS STD OF    SEMIPARTIAL   R-SQUARED   BETWEEN-CLUSTER
CLUSTERS                               NEW CLUSTER    NEW CLUSTER   R-SQUARED                 SUM OF SQUARES
10          CL14   CANNED CRABMEAT     4              11.32324      0.003476      0.985908      1489.42
9           CL16   CL20                8               7.75641      0.003541      0.982367      1517.12
8           CL15   CANNED SHRIMP       3              16.10565      0.005231      0.977136      2241.24
7           CL12   ROAST BEEF          6              14.34190      0.009755      0.967381      4179.83
6           CL10   CL8                 7              22.14096      0.023782      0.943599     10189.5
5           CL11   CL9                 11             20.22234      0.039103      0.904496     16754.1
4           CL6    CL13                9              30.07489      0.048662      0.855835     20849.7
3           CL5    CL4                 20             36.22080      0.158726      0.697109     68007.8
2           CL3    CANNED SARDINES     21             47.72546      0.240715      0.456394    103137
1           CL7    CL2                 27             57.40958      0.456394      0.000000    195548

79
Exhibit 7.5: Non-Hierarchical Analysis For
Food-Nutrient Data
INITIAL SEEDS
CLUSTER   CALORIES   PROTEIN   FAT       CALCIUM    IRON
1         331.111    19.000    27.556      8.778    2.467
2         161.667    20.500     7.500     14.250    1.925
3         100.000    14.800     3.400    114.000    3.000

80
Exhibit 7.5 (continued)
MINIMUM DISTANCE BETWEEN SEEDS = 117.4876
ITERATION   CHANGE IN CLUSTER SEEDS
            1          2          3
1           10.8475    6.46446     0.3
2           0          6.85281    12.7855
3           0          0           0

81
CLUSTER SUMMARY
CLUSTER   FREQUENCY   RMS STD     MAXIMUM DISTANCE FROM   NEAREST   CENTROID
NUMBER                DEVIATION   SEED TO OBSERVATION     CLUSTER   DISTANCE
1         8           20.8936     78.8882                 2         168.5
2         12          16.3651     70.9576                 3         117.9
3         6           27.8059     79.6672                 2         117.9

82
STATISTICS FOR VARIABLES
VARIABLE    TOTAL STD    WITHIN STD   R-SQUARED   RSQ/(1-RSQ)
CALORIES    103.06085    39.89286     0.86216     6.25453
PROTEIN       4.29257     3.58590     0.35798     0.55758
FAT          11.44357     4.52989     0.85584     5.93681
CALCIUM      44.70188    22.76009     0.76150     3.19291
IRON          1.49005     1.51663     0.04688     0.04919
OVER-ALL     50.53988    20.71299     0.84547     5.47135
PSEUDO F STATISTIC = 62.92
APPROXIMATE EXPECTED OVER-ALL R-SQUARED = 0.78678
CUBIC CLUSTERING CRITERION = 2.186
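
The R-squared column can be recovered from the two standard-deviation columns. The sketch below assumes that TOTAL STD uses n - 1 degrees of freedom and WITHIN STD uses n - k, with n = 26 foods (the cluster frequencies 8 + 12 + 6) and k = 3 clusters; under that assumption it reproduces the printed values. The function name is illustrative.

```python
def r_squared_from_std(total_std, within_std, n, k):
    """R-squared implied by the total and pooled within-cluster standard deviations."""
    total_ss = (n - 1) * total_std ** 2
    within_ss = (n - k) * within_std ** 2
    return 1 - within_ss / total_ss

n, k = 26, 3   # observations and clusters in the food-nutrient solution

r2_calories = r_squared_from_std(103.06085, 39.89286, n, k)
r2_iron = r_squared_from_std(1.49005, 1.51663, n, k)
r2_overall = r_squared_from_std(50.53988, 20.71299, n, k)

# Pseudo F compares explained and unexplained variation per degree of freedom.
pseudo_f = (r2_overall / (k - 1)) / ((1 - r2_overall) / (n - k))

print(round(r2_calories, 5), round(r2_iron, 5), round(r2_overall, 5), round(pseudo_f, 2))
```

Read this way, calories and fat separate the three clusters well, while iron contributes almost nothing: its within-cluster spread is about as large as its total spread.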
83
Exhibit 7.5 (continued)
CLUSTER MEANS
CLUSTER   CALORIES   PROTEIN   FAT       CALCIUM    IRON
1         341.875    18.750    28.875      8.750    2.437
2         174.583    21.083     8.750     11.833    2.083
3          98.333    14.667     3.167    101.333    2.883
84
Exhibit 7.5 (continued)
(8 Cases)
CLUSTER 1
OBS   NAME              CLUSTER   DISTANCE   CALORIES   PROTEIN   FAT   CALCIUM   IRON
1     BRAISED BEEF      1          2.4357    340        20        28    9         2.6
2     ROAST BEEF        1         78.8882    420        15        39    7         2.0
3     BEEF STEAK        1         33.2744    375        19        32    9         2.6
4     ROAST LAMB LEG    1         77.3963    265        20        20    9         2.6
5     ROAST LAMB        1         42.0616    300        18        25    9         2.3
6     SMOKED HAM        1          2.4311    340        20        28    9         2.5
7     PORK ROAST        1          1.9132    340        19        29    9         2.5
8     PORK SIMMERED     1         13.1779    355        19        30    9         2.4

85
Exhibit 7.5 (continued)
(12 Cases)
CLUSTER 2
OBS   NAME                CLUSTER   DISTANCE   CALORIES   PROTEIN   FAT   CALCIUM   IRON
9     HAMBURGER           2         70.9576    245        21        17    9         2.7
10    CANNED BEEF         2          7.8135    180        22        10    17        3.7
11    BROILED CHICKEN     2         59.9964    115        20        3     8         1.4
12    CANNED CHICKEN      2          6.3070    170        25        7     12        1.5
13    BEEF HEART          2         16.4369    160        26        5     14        5.9
14    BEEF TONGUE         2         31.3971    205        18        14    7         2.5
15    VEAL CUTLET         2         10.9841    185        23        9     9         2.7
16    BAKED BLUEFISH      2         42.0215    135        22        4     25        0.6
17    FRIED HADDOCK       2         40.2403    135        16        5     15        0.5
18    BROILED MACKEREL    2         26.7634    200        19        13    5         1.0
19    FRIED PERCH         2         21.2850    195        16        11    14        1.3
20    CANNED TUNA         2          7.9719    170        25        7     7         1.2

86
Exhibit 7.5 (continued)
(6 Cases)
CLUSTER 3
OBS   NAME               CLUSTER   DISTANCE   CALORIES   PROTEIN   FAT   CALCIUM   IRON
21    RAW CLAMS          3         34.7046     70        11        1      82       6.0
22    CANNED CLAMS       3         60.5092     45         7        1      74       5.4
23    CANNED CRABMEAT    3         63.9273     90        14        2      38       0.8
24    CANNED MACKEREL    3         79.6672    155        16        9     157       1.8
25    CANNED SALMON      3         61.7127    120        17        5     159       0.7
26    CANNED SHRIMP      3         14.8809    110        23        1      98       2.6