An Introduction to Statistical Problem Solving in Geography - 2nd Edition

- Chapter 2 - Summary

Cathy Walker, February 13, 2010
GEOG 3000 - Advanced Geographic Statistics, Winter Qrtr. 2010
P. Sutton

Definitions

- Individual Level Data Sets - each data value represents an individual element or unit of the phenomenon under study.
- Spatially Aggregated Data Sets - each value entered into the statistical analysis is a summary or spatial aggregation of individual units of information for a particular place or area.
- Ecological Fallacy - the invalid transfer of conclusions from spatially aggregated analysis to smaller areas or to the individual.
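
The ecological fallacy can be illustrated with a small numerical sketch. The data below are hypothetical and chosen only to make the point: within each of three areas the individual-level relationship between x and y is negative, yet the area averages are perfectly positively related. The function `pearson_r` and the `areas` dictionary are illustrative names, not part of the original slides.

```python
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from first principles."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


# Hypothetical individual-level (x, y) observations grouped by area.
areas = {
    "A": [(1, 3), (2, 2), (3, 1)],
    "B": [(4, 6), (5, 5), (6, 4)],
    "C": [(7, 9), (8, 8), (9, 7)],
}

# Spatially aggregated data: one mean value per area.
area_mean_x = [mean(x for x, _ in pts) for pts in areas.values()]
area_mean_y = [mean(y for _, y in pts) for pts in areas.values()]
aggregated_r = pearson_r(area_mean_x, area_mean_y)

# Individual-level data: correlation inside each area.
within_r = {
    name: pearson_r([x for x, _ in pts], [y for _, y in pts])
    for name, pts in areas.items()
}

print(round(aggregated_r, 3))                          # close to +1
print({k: round(r, 3) for k, r in within_r.items()})   # each close to -1
```

Concluding from the aggregated result that individuals with high x also have high y would be wrong for every area in this made-up example.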

- Discrete Variable - a variable that has some restrictions placed on the values the variable can assume.
- Continuous Variable - a variable that has an infinite number of possible values along some interval of a real number line.
- In general, discrete data are the result of counting or tabulating the number of items, and potential values are limited to whole integers.
- Continuous data are the result of measurements, and values can be expressed as decimals.

- Quantitative - observations or responses are expressed numerically; units of data are assigned numerical values.
- Qualitative - each observation or response is assigned to one of two or more categories.

Four Levels of Measurement

1. Nominal Scale

- Each category is given a name or title, but no assumptions are made about any relationships between categories.
- Problems based on a nominal scale are considered categorical (qualitative).
- Two necessary conditions for nominal scale classifications:
  - Exhaustive: every value or unit of data can be assigned to a category.
  - Mutually exclusive: it is not possible to assign a value to more than one category because the categories do not overlap.
- Examples:
  - Religious affiliation classifications: Baptist, Catholic, Methodist, Presbyterian, Mormon, Jewish, etc.
  - Political party affiliation: Democrat, Republican, Independent

2. Ordinal Scale

- Values are placed in rank order.
- More quantitative distinctions are possible than with nominal scale variables.
- Strongly ordered: each value or unit of data is given a particular position in a rank-order sequence.
- Weakly ordered: the values are placed in categories, and the categories themselves are rank ordered.
- Example: Top 10 best places to live in the U.S.
  - No. 10 Des Moines, Iowa
  - No. 9 Charlotte, N.C.
  - No. 8 Austin, Texas
  - No. 7 San Antonio, Texas
  - No. 6 Fort Collins, Colorado
  - No. 5 Omaha, Neb.
  - No. 4 Houston, Texas
  - No. 3 Colorado Springs, Colorado
  - No. 2 Boise, Idaho
  - No. 1 Raleigh, N.C.

3. Interval Scale

- Each value or unit is based on a measurement scale, and the interval between any two units of data on this scale can be measured.
- The origin or zero starting point is assigned arbitrarily (i.e., the origin does not have a natural or real meaning).
- Example: the placement of the zero-degree point on temperature scales such as Celsius and Fahrenheit is arbitrary; zero does not mean a complete lack of heat.

4. Ratio Scale

- Each value or unit is based on a measurement scale, and the interval between any two units of data on this scale can be measured.
- The origin or zero starting point is natural or non-arbitrary, making it possible to determine the ratio between values.
- Example: the measurement of precipitation from a rain gauge. The ratio between 10 inches of rain and 5 inches of rain is precisely 2.
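
The difference between an arbitrary zero (interval scale) and a natural zero (ratio scale) can be checked with a little arithmetic. The sketch below uses the rain-gauge figures from the slide together with two hypothetical temperature readings (20 and 10 degrees Celsius). A ratio is only meaningful if it survives a change of units.

```python
def celsius_to_fahrenheit(celsius):
    """Standard conversion: F = C * 9/5 + 32."""
    return celsius * 9 / 5 + 32


# Interval scale: the zero point is arbitrary, so ratios depend on the unit.
warm_c, cool_c = 20.0, 10.0                    # hypothetical readings
ratio_celsius = warm_c / cool_c
ratio_fahrenheit = celsius_to_fahrenheit(warm_c) / celsius_to_fahrenheit(cool_c)

# Ratio scale: zero rainfall means no rain at all, so ratios are unit-free.
INCH_TO_MM = 25.4
rain_a_in, rain_b_in = 10.0, 5.0               # figures from the slide
ratio_inches = rain_a_in / rain_b_in
ratio_mm = (rain_a_in * INCH_TO_MM) / (rain_b_in * INCH_TO_MM)

print(ratio_celsius, round(ratio_fahrenheit, 2))   # 2.0 1.36
print(ratio_inches, ratio_mm)                      # 2.0 2.0
```

The temperature "ratio" changes from 2.0 to 1.36 when the unit changes, so saying that 20 degrees is twice as warm as 10 degrees is not meaningful. The rainfall ratio stays at 2 in either unit.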

Measurement Concepts

Precision and Accuracy

- Precision refers to the level of exactness associated with measurement.
- Accuracy refers to the extent of system-wide bias in the measurement process.
- It is possible for a measurement to be very precise yet inaccurate.
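
A short numerical sketch may make the distinction concrete. It assumes a hypothetical true value of 100 and two invented sets of repeated readings. Bias (mean error) is used here as a stand-in for accuracy, and the spread of repeated readings (sample standard deviation) as one rough stand-in for precision. These are illustrative choices, not definitions taken from the slides.

```python
from statistics import mean, stdev

TRUE_VALUE = 100.0   # hypothetical known quantity being measured

# Invented repeated readings from two hypothetical instruments.
precise_but_biased = [103.1, 103.0, 103.2, 103.1, 102.9]
accurate_but_imprecise = [97.0, 104.0, 99.5, 101.5, 98.0]


def bias(readings, truth=TRUE_VALUE):
    """Systematic error: how far the average reading sits from the truth."""
    return mean(readings) - truth


def spread(readings):
    """Scatter of repeated readings around their own mean."""
    return stdev(readings)


for name, readings in [
    ("precise but biased", precise_but_biased),
    ("accurate but imprecise", accurate_but_imprecise),
]:
    print(f"{name}: bias={bias(readings):+.2f}, spread={spread(readings):.2f}")
```

The first instrument gives tightly clustered readings that are all about 3 units too high: very precise, yet inaccurate.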

Validity

- Addresses measurement issues concerning the nature, meaning, or definition of a concept or variable.
- Expressing the true meaning of multi-faceted concepts is often too difficult, so geographers often find it necessary to create operational definitions that can serve as indirect or surrogate measures for these variables.

Reliability

- Reliability problems often occur when using international data, since fully comparable and totally consistent methods of collecting data rarely exist from country to country.
- One way to assess the degree of reliability of a measurement instrument is to compare at least two applications of the data collection method used at different times.

- When data are collected over time or when changes in spatial pattern are analyzed over time, the geographer must question the consistency and stability of the data.
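
Comparing two applications of the same data collection method could be sketched as follows. The readings are hypothetical, and the two summary measures used here (a correlation between applications and the mean absolute difference) are only one plausible way to make the comparison. The slides do not prescribe a specific statistic.

```python
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from first principles."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


# Hypothetical values for the same six places, collected at two times
# with the same method.
first_application = [12.0, 15.5, 9.0, 20.0, 17.5, 11.0]
second_application = [12.5, 15.0, 9.5, 19.0, 18.0, 11.5]

# High agreement between applications suggests a reliable method.
agreement_r = pearson_r(first_application, second_application)
mean_abs_diff = mean(
    abs(a - b) for a, b in zip(first_application, second_application)
)

print(round(agreement_r, 3), round(mean_abs_diff, 3))
```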

Basic Classification Methods

Equal Intervals Based on Range

- To determine class breaks, the range is divided into the desired number of equal-width class intervals.
- The range is simply the difference in magnitude between the smallest and largest values in an interval/ratio set of data.
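
The range-based procedure can be sketched in a few lines. The data values and the function names `equal_interval_breaks` and `assign_class` are hypothetical and used only for illustration.

```python
def equal_interval_breaks(values, n_classes):
    """Return n_classes + 1 break points that split the range into equal widths."""
    lowest, highest = min(values), max(values)
    data_range = highest - lowest      # range = largest value minus smallest value
    if data_range == 0:
        raise ValueError("all values are identical, so the range is zero")
    width = data_range / n_classes
    return [lowest + i * width for i in range(n_classes + 1)]


def assign_class(value, breaks):
    """Return the 0-based class index; the largest value goes in the last class."""
    n_classes = len(breaks) - 1
    width = breaks[1] - breaks[0]
    index = int((value - breaks[0]) / width)
    return min(index, n_classes - 1)


values = [2, 5, 7, 11, 14, 18, 22]     # hypothetical data, range = 20
breaks = equal_interval_breaks(values, 4)
classes = [assign_class(v, breaks) for v in values]

print(breaks)    # [2.0, 7.0, 12.0, 17.0, 22.0]
print(classes)   # [0, 0, 1, 1, 2, 3, 3]
```

In this sketch a value that falls exactly on a break is placed in the upper class, except for the maximum, which stays in the last class. Other conventions are equally defensible.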

Equal Intervals Not Based on Range

- This classification method also designates class breaks to create equal-interval classes, but the exact range is not used to select the class breaks.
- A convenient and practical interval width is selected arbitrarily, based on rounded-off class-break values.
- This method of classification is preferred for constructing a frequency distribution, histogram, or ogive to represent the data graphically.
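
A minimal sketch of this idea, assuming the analyst has already picked a convenient width (10 in the example): the breaks start at the multiple of the width at or below the minimum and end at the multiple at or above the maximum. The data and the function name `rounded_interval_breaks` are hypothetical.

```python
import math


def rounded_interval_breaks(values, width):
    """Equal-width breaks at round multiples of `width` that cover the data."""
    start = math.floor(min(values) / width) * width
    end = math.ceil(max(values) / width) * width
    n_classes = round((end - start) / width)
    return [start + i * width for i in range(n_classes + 1)]


values = [2, 5, 7, 11, 14, 18, 22]     # hypothetical data
breaks = rounded_interval_breaks(values, 10)

print(breaks)    # [0, 10, 20, 30]
```

Compared with range-based breaks for the same data, the class limits here are easier to read on a histogram axis, at the cost of classes that extend beyond the observed minimum and maximum.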

Quantile Breaks

- The total number of values is divided as equally as possible into the desired number of classes.
- The allocation of an equal number of values to each category is often an advantage in choropleth mapping, particularly if an approximately equal area on the map is desired for each category.
- The possible disadvantages of quantile breaks should also be evaluated before deciding to use this method.
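
Dividing the values "as equally as possible" can be sketched as follows. The data and the function name `quantile_classes` are hypothetical. When the number of values is not a multiple of the number of classes, this sketch gives the extra values to the lowest classes, which is an arbitrary choice.

```python
def quantile_classes(values, n_classes):
    """Split sorted values into n_classes groups of (nearly) equal size."""
    ordered = sorted(values)
    base, extra = divmod(len(ordered), n_classes)
    classes, start = [], 0
    for i in range(n_classes):
        size = base + (1 if i < extra else 0)   # first `extra` classes get one more
        classes.append(ordered[start:start + size])
        start += size
    return classes


values = [2, 5, 7, 11, 14, 18, 22, 40, 41, 95]   # hypothetical data
for group in quantile_classes(values, 4):
    print(group)
# [2, 5, 7]
# [11, 14, 18]
# [22, 40]
# [41, 95]
```

Note that this sketch does not treat tied values specially, so identical values could land in different classes, and that very different values (41 and 95 here) can share a class.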

Natural Breaks

- The most elementary natural-breaks method is known as the single-linkage approach.
- The logic is to identify natural breaks in the data and separate values into different classes based on these breaks.
- Similar values are kept together in the same category, dissimilar values are separated into different categories, and the gaps in the data are incorporated directly in the grouping procedure.
- This method will highlight extreme values, placing unusual outliers of data into their own unique categories.
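
One simple reading of this logic is to sort the values and cut at the largest gaps between neighbors. The sketch below follows that reading; it is an assumption about how the single-linkage idea might be implemented, since the slides do not give the exact procedure. The data and the function name `largest_gap_classes` are hypothetical.

```python
def largest_gap_classes(values, n_classes):
    """Split sorted values at the (n_classes - 1) largest gaps between neighbors."""
    ordered = sorted(values)
    # (gap size, index of the value just before the gap)
    gaps = [(ordered[i + 1] - ordered[i], i) for i in range(len(ordered) - 1)]
    largest = sorted(gaps, reverse=True)[:n_classes - 1]
    cut_after = sorted(index for _, index in largest)

    classes, start = [], 0
    for index in cut_after:
        classes.append(ordered[start:index + 1])
        start = index + 1
    classes.append(ordered[start:])
    return classes


values = [2, 5, 7, 11, 14, 18, 22, 40, 41, 95]   # hypothetical data
for group in largest_gap_classes(values, 3):
    print(group)
# [2, 5, 7, 11, 14, 18, 22]
# [40, 41]
# [95]
```

The outlier 95 ends up in a class of its own, which matches the behavior described above. When several gaps are tied in size, the choice among them in this sketch is arbitrary.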

What Can Be Concluded About The Disparities Among Classification Methods?

- Depending on the classification method used, outcomes can be quite different, even though the same data are used and the same number of classes is created.
- The logical conclusion is to recognize that any observed spatial pattern (map) is a function of the specific classification method applied and that using a different method of classification will likely result in a visually distinctive map.

Graphic Procedures

Definitions

- Histogram - the frequency of values is shown as a series of vertical bars, one for each value or class of values.
- When using categories instead of actual values along the horizontal scale of a histogram, classification by equal intervals not based on range is usually the best technique.
- Frequency Polygon - very similar to a histogram, except that the vertical position of each data value or class is shown as a point rather than a bar.

- Cumulative Frequency Diagram (or Ogive) - instead of showing actual frequencies for each value or class, this graphic aggregates frequencies from value to value or class to class and displays the cumulative frequencies at each position.
- The cumulative absolute frequencies can be divided by the sum of all frequencies to obtain cumulative relative values or proportions.
- Scattergram (or Scatterplot) - shows the pattern of association or relationship between two variables (a bivariate relationship).
- If a set of observations is plotted, analysis of the scatter of points suggests the amount and nature of association or relationship that exists between the two graphed variables.
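
The arithmetic behind a cumulative frequency diagram can be sketched as follows. The class labels and frequencies are hypothetical. The absolute frequencies are the bar heights of a histogram, and the running totals are the values an ogive would plot.

```python
from itertools import accumulate

# Hypothetical frequency distribution with four equal-width classes.
class_labels = ["0-9", "10-19", "20-29", "30-39"]
frequencies = [3, 7, 6, 4]

# Running total of frequencies from class to class.
cumulative = list(accumulate(frequencies))

# Divide by the sum of all frequencies to get cumulative proportions.
total = sum(frequencies)
cumulative_relative = [c / total for c in cumulative]

for label, f, c, p in zip(class_labels, frequencies, cumulative, cumulative_relative):
    print(f"{label:>6}  freq={f:2d}  cum={c:2d}  cum_rel={p:.2f}")
#    0-9  freq= 3  cum= 3  cum_rel=0.15
#  10-19  freq= 7  cum=10  cum_rel=0.50
#  20-29  freq= 6  cum=16  cum_rel=0.80
#  30-39  freq= 4  cum=20  cum_rel=1.00
```

The last cumulative value always equals the total number of observations, and the last cumulative proportion is always 1.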

Example figures: Histogram; Frequency Polygon; Cumulative Frequency Diagram; Scattergram

?? Questions ??