Title: INF 397C Introduction to Research in Library and Information Science Fall, 2003 Day 3
1INF 397C Introduction to Research in Library and Information Science, Fall 2003, Day 3
2Calculating percentiles
- From Runyon et al. (2000)
4Standard Deviation
5Measures of Dispersion
- Range
- Semi-interquartile range
- Standard deviation
- σ (sigma)
6Range
- Like the mode . . .
- Easy to calculate
- Potentially misleading
- Doesn't take EVERY score into account.
- What we need to do is calculate one number that will capture HOW spread out our numbers are from that Central Tendency.
- Standard Deviation
7Hmmm . . .
Mode Range
Median ?????
Mean Standard Deviation
8We need . . .
- A measure of spread that is NOT sensitive to every little score, just as median is not.
- SIQR: Semi-interquartile range.
- (Q3 - Q1)/2
9To summarize
Mode        Range     Easy to calculate. May be misleading.
Median      SIQR      Capture the center. Not influenced by extreme scores.
Mean (µ)    SD (σ)    Take every score into account. Allow later manipulations.
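The three measures of spread in this summary can be sketched in a few lines of Python. This is only an illustration: the scores are made up, and textbooks differ on exactly how Q1 and Q3 are located, so the "inclusive" quartile method used here is an assumption rather than the method from the course text.

```python
import statistics

def spread_summary(scores):
    """Return (range, semi-interquartile range, population SD) for a list of scores."""
    ordered = sorted(scores)
    score_range = ordered[-1] - ordered[0]
    # quantiles(n=4) returns [Q1, Q2, Q3]; the "inclusive" method is one of
    # several quartile conventions (an assumption, not the textbook's rule).
    q1, _, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    siqr = (q3 - q1) / 2
    # pstdev divides by N (population SD), matching the sigma used in these slides.
    sd = statistics.pstdev(ordered)
    return score_range, siqr, sd

# Hypothetical scores, for illustration only.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]
score_range, siqr, sd = spread_summary(scores)
print(score_range, siqr, round(sd, 3))
```

Changing the largest score from 9 to 90 changes the range and the SD a great deal but leaves the SIQR alone, which is the point of the middle row.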
10A quick, real-time, example
- How many pets have you ever owned?
- Order
- Freq. dist. (cumu. freq., rel. freq., cumu. rel. freq.)
- Histogram
- Measures of central tendency
- Measures of spread
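The ordering and frequency-distribution steps of this exercise could be sketched as follows. The pet counts are hypothetical (they are not the answers collected in class), and `frequency_distribution` is just an illustrative helper name.

```python
from collections import Counter

def frequency_distribution(scores):
    """Return rows of (value, freq, cum_freq, rel_freq, cum_rel_freq), ordered by value."""
    counts = Counter(scores)
    n = len(scores)
    rows = []
    cum = 0
    for value in sorted(counts):
        freq = counts[value]
        cum += freq
        rows.append((value, freq, cum, freq / n, cum / n))
    return rows

# Hypothetical answers to "How many pets have you ever owned?"
pets = [2, 0, 1, 5, 2, 3, 1, 2]
table = frequency_distribution(pets)
for row in table:
    print(row)
```

The last row's cumulative frequency equals the number of respondents, and its cumulative relative frequency equals 1, which is a quick check that nothing was dropped.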
11Graphs
- Graphs/tables/charts do a good job (when done well) of depicting all the data.
- But they cannot be manipulated mathematically.
- Plus it can be ROUGH when you have LOTS of data.
- Let's look at your examples.
12Your Charts/Graphs/Tables
- http://www.cnn.com/SPECIALS/2003/back.to.school/college/
- http://www.austin.isd.tenet.edu/k12/docs/ratings/2001-2002/227901006.pdf
- http://www.denverpost.com/Stories/0,1413,36257E53257E1609345,00.html
- http://www.usatoday.com/money/advertising/adtrack/2003-08-24-viagra_x.htm
- http://money.cnn.com/2003/08/07/pf/college/bestcollegedegrees/index.htm
- http://www.economist.com/agenda/displayStory.cfm?story_id=2034869
- http://www.usatoday.com/money/perfi/housing/2003-09-07-mym_x.htm
- http://www.bts.gov/publications/national_transportation_statistics/2002/html/table_01_35.html
- Phone numbers: memorability vs. dialability!!
- http://www.understandingusa.com/chaptercc12cs260.html
13Some rules . . .
- . . . For building graphs/tables/charts
- Label axes.
- Divide up the axes evenly.
- Indicate when there's a break in the rhythm!
- Keep the aspect ratio reasonable.
- Histogram, bar chart, line graph, pie chart, stacked bar chart: which when?
- Keep the user in mind.
14The Normal Distribution (From Jaisingh 2000)
16So far . . .
- . . . we've talked of summarizing ONE distribution of scores.
- By ordering the scores.
- By organizing them in graphs/tables/charts.
- By calculating a measure of central tendency and a measure of dispersion.
- What happens when we want to compare TWO distributions of scores?
17Now, why would I want to do that?
- Is your child taller or heavier?
- Is this month's SAT test any easier or harder than last month's?
- Is my 91 in my Research Methods class better than my 95 in my Digital Libraries class?
- Is the new library layout better than the old one?
- Can more employees sign up, more quickly, for benefits with our new intranet site than with our old one?
- Did my class perform better on the TAKS test than they did on the TAAS test?
18Well?
- COULD it be the case that your 91 in your Research Methods class is better than your 95 in your Digital Libraries class?
- How?
19What if . . .
- The mean in Research Methods was 50, and the mean in Digital Libraries was 99?
- (What, besides the fact that everyone else is trying to drop the Research class!)
- So:
              You    Mean
Res. Meth.    91     50
Dig. Lib.     95     99
20The Point
- As I said last week, you need to know BOTH a measure of central tendency AND a measure of spread to understand a distribution.
- BUT STILL, this can be convoluted . . .
- "Well, daughter, how are you doing in grad school this semester?"
21Well, Mom . . .
- ". . . I have a 91 in Research Methods but the mean is 50 and the standard deviation is 12. But I only have a 95 in Digital Libraries, whereas the mean in that class is 99 with a standard deviation of 1."
- Of course, your mom's reaction will be, "Just call home more often, dear."
22Wouldn't it be nice . . .
- . . . if there could be one score we could use for BOTH classes, for BOTH the TAKS test and the TAAS test, for BOTH your child's height and weight?
- There is, and it's called the "standard score," or "z score." (Get ready for another headache.)
23Standard Score
- z = (X - µ)/σ
- Hunh?
- Each score can be expressed as the number of standard deviations it is from the mean of its own distribution.
- Hunh?
- (X - µ): This is how far the score is from the mean. (Note: Could be negative! No squaring, this time.)
- Then divide by the SD to figure out how many SDs you are from the mean.
24Z scores (cont'd.)
- z = (X - µ)/σ
- Notice, if your score (X) equals the mean, then z is, what?
- If your score equals the mean PLUS one standard deviation, then z is, what?
- If your score equals the mean MINUS one standard deviation, then z is, what?
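One way to check the three questions above, and to settle the Research Methods vs. Digital Libraries comparison, is a short Python sketch. The function name `z_score` is an illustrative choice; the means and standard deviations are the ones quoted in the "Well, Mom" slide.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean of its own distribution."""
    return (x - mean) / sd

# The three "what?" questions, using the Research Methods figures (mean 50, SD 12).
mean, sd = 50, 12
z_at_mean = z_score(mean, mean, sd)        # score equals the mean
z_plus_one = z_score(mean + sd, mean, sd)  # mean PLUS one SD
z_minus_one = z_score(mean - sd, mean, sd) # mean MINUS one SD

# The two grades from the grad-school example.
z_research = z_score(91, 50, 12)  # Research Methods: 91, mean 50, SD 12
z_diglib = z_score(95, 99, 1)     # Digital Libraries: 95, mean 99, SD 1

print(z_at_mean, z_plus_one, z_minus_one)
print(round(z_research, 2), z_diglib)
```

The 91 sits more than three standard deviations above its class mean, while the 95 sits four standard deviations below its class mean, so on a common scale the 91 is the far better performance.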
25An example
         Test 1    Test 2
Kris     76        76
Robin    52        86
Marty    58        80
Terry    58        90
ΣX       244       332
µ
Mode, median?
26Let's calculate σ: Test 1
         X      X-µ    (X-µ)²
Kris     76     15     225
Robin    52     -9     81
Marty    58     -3     9
Terry    58     -3     9
Σ        244    0      324
/N       61            81
σ                      9
27Let's calculate σ: Test 2
         X      X-µ    (X-µ)²
Kris     76     -7     49
Robin    86     3      9
Marty    80     -3     9
Terry    90     7      49
Σ        332    0      116
/N       83            29
σ                      5.4
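The column-by-column calculation in the two tables above can be reproduced with a small Python sketch. It follows the same steps as the slides (deviations, squared deviations, divide by N, square root); `population_sd` is an illustrative name, not something from the course materials.

```python
import math

def population_sd(scores):
    """Population standard deviation, computed the long way as in the slides."""
    n = len(scores)
    mean = sum(scores) / n                        # mu = (sum of X) / N
    deviations = [x - mean for x in scores]       # X - mu (these sum to zero)
    squared = [d ** 2 for d in deviations]        # (X - mu) squared
    variance = sum(squared) / n                   # sum of squares / N
    return math.sqrt(variance)

test1 = [76, 52, 58, 58]  # Kris, Robin, Marty, Terry on Test 1
test2 = [76, 86, 80, 90]  # same students on Test 2

sd1 = population_sd(test1)
sd2 = population_sd(test2)
print(sd1, round(sd2, 1))
```

Note that this divides by N, not N - 1, because the slides treat the four scores as the whole population.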
28So . . . z = (X - µ)/σ
- Kris had a 76 on both tests.
- Test 1: µ = 61, σ = 9
- So her z score was (76-61)/9 or 15/9 or 1.67. So we say that Kris's score was 1.67 standard deviations above the mean.
- Test 2: µ = 83, σ = 5.4
- So her z score was (76-83)/5.4 or -7/5.4 or -1.3. So we say that Kris's score was 1.3 standard deviations BELOW the mean.
- Given what I said last week about two-thirds of the scores being within one standard deviation of the mean . . . .
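Kris's two z scores can be checked with a few lines of Python, using the means and standard deviations given on this slide. The sign matters: the same raw score of 76 is above the mean on one test and below it on the other.

```python
def z_score(x, mean, sd):
    """z = (X - mu) / sigma."""
    return (x - mean) / sd

z_test1 = z_score(76, 61, 9)    # Test 1: mean 61, SD 9
z_test2 = z_score(76, 83, 5.4)  # Test 2: mean 83, SD 5.4 (rounded, as on the slide)

print(round(z_test1, 2), round(z_test2, 2))
```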
29z = (X - µ)/σ
- If I tell you that the average IQ score is 100, and that the SD of IQ scores is 16, and that Bob's IQ score is 2 SD above the mean, what's Bob's IQ?
- If I tell you that your 75 was 1.5 standard deviations below the mean of a test that had a mean score of 90, what was the SD of that test?
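Both questions come down to rearranging the z-score formula. A minimal sketch, with illustrative helper names, solving once for the raw score and once for the standard deviation:

```python
def raw_from_z(z, mean, sd):
    """Rearranged z formula: X = mu + z * sigma."""
    return mean + z * sd

def sd_from_z(x, mean, z):
    """Rearranged z formula: sigma = (X - mu) / z."""
    return (x - mean) / z

bob_iq = raw_from_z(2, 100, 16)     # 2 SDs above a mean of 100, SD 16
test_sd = sd_from_z(75, 90, -1.5)   # a 75 that is 1.5 SDs BELOW a mean of 90

print(bob_iq, test_sd)
```

"Below the mean" is entered as a negative z; forgetting the sign would give a negative standard deviation, which is a useful warning that something went wrong.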
30Notice . . .
- The mean of all z scores (for a particular distribution) will be zero, as will be their sum.
- With z scores, we transform raw scores into standard scores.
- These standard scores are RELATIVE distances from their (respective) means.
- All are expressed in units of σ.
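The first claim can be checked numerically on the Test 1 scores from the earlier example. This sketch converts every score to a z score and confirms that the z scores sum (and therefore average) to zero, up to floating-point rounding; it also shows that their standard deviation is 1.

```python
import math
import statistics

scores = [76, 52, 58, 58]           # Test 1 scores from the earlier example
mean = statistics.fmean(scores)     # 61.0
sd = statistics.pstdev(scores)      # population SD, 9.0

z_scores = [(x - mean) / sd for x in scores]
z_sum = sum(z_scores)
z_mean = statistics.fmean(z_scores)
z_sd = statistics.pstdev(z_scores)

# Compare with a tolerance, because floating-point sums are rarely exactly zero.
print(math.isclose(z_sum, 0.0, abs_tol=1e-9), math.isclose(z_sd, 1.0))
```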
31Practice Questions
32Probability
- Remember all those decisions we talked about last week?
- VERY little of life is certain.
- It is PROBABILISTIC. (That is, something might happen, or it might not.)
33Prob. (cont'd.)
- Life's a gamble!
- Just about every decision is based on probable outcomes.
- None of you raised your hands last week when I asked for "statistical wizards." Yet every one of you does a pretty good job of navigating an uncertain world.
- None of you touched a hot stove (on purpose).
- All of you made it to class.
34Probabilities
- Always between zero and one (inclusive).
- Something with a probability of "one" will happen. (e.g., Death, Taxes).
- Something with a probability of "zero" will not happen. (e.g., My becoming a Major League Baseball player).
- Something that's unlikely has a small, but still positive, probability. (e.g., probability of someone else having the same birthday as you is 1/365 = .0027, or .27%.)
35Just because . . .
- . . . There are two possible outcomes, doesn't mean there's a 50/50 chance of each happening.
- When driving to school today, I could have arrived alive, or been killed in a fiery car crash. (Two possible outcomes, as I've defined them.) Not equally likely.
- But the odds of a flipped coin being "heads," . . . .
36Let's talk about socks
37Prob (cont'd.)
- Probability of something happening is
- # of "successes" / # of all events
- P(one flip of a coin landing heads) = ½ = .5
- P(one die landing as a "2") = 1/6 = .167
- P(some score in a distribution of scores is greater than the median) = ½ = .5
- P(some score in a normal distribution of scores is greater than the mean but has a z score of 1 or less) = . . . ?
- P(drawing a diamond from a complete deck of cards) = ?
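The "successes over all events" idea, including the two questions left open above, can be sketched in Python. The deck is assumed to be a standard 52-card deck with 13 diamonds, and the normal-curve question is answered with the standard library's `NormalDist`, which is one convenient tool and not necessarily how the class worked it out.

```python
from fractions import Fraction
from statistics import NormalDist

# Counting "successes" over all equally likely events.
p_heads = Fraction(1, 2)        # 1 head out of 2 faces
p_two_on_die = Fraction(1, 6)   # 1 face showing "2" out of 6 faces
p_diamond = Fraction(13, 52)    # 13 diamonds out of 52 cards (assumed standard deck)

# Area under the standard normal curve between the mean (z = 0) and z = 1.
std_normal = NormalDist(mu=0, sigma=1)
p_between_mean_and_z1 = std_normal.cdf(1) - std_normal.cdf(0)

print(p_heads, p_two_on_die, p_diamond, round(p_between_mean_and_z1, 4))
```

The last figure is about one-third, which fits the rule of thumb that roughly two-thirds of scores fall within one standard deviation on either side of the mean.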
38Probabilities: "and" & "or"
- From Runyon
- Addition Rule: The probability of selecting a sample that contains one or more elements is the sum of the individual probabilities for each element less the joint probability. When A and B are mutually exclusive,
- p(A and B) = 0.
- p(A or B) = p(A) + p(B) - p(A and B)
- Multiplication Rule: The probability of obtaining a specific sequence of independent events is the product of the probability of each event.
- p(A and B and . . .) = p(A) x p(B) x . . .
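Both rules can be checked by brute-force counting. The sketch below builds a standard 52-card deck (an assumed example, not one taken from Runyon), counts outcomes directly, and compares the counts with what the addition rule predicts; the multiplication rule is illustrated with two independent coin flips.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

def p(event):
    """Probability of an event = matching cards / all cards."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

def is_king(card):
    return card[0] == "K"

def is_queen(card):
    return card[0] == "Q"

def is_heart(card):
    return card[1] == "hearts"

# Addition rule, events that overlap (the king of hearts is both).
counted = p(lambda c: is_king(c) or is_heart(c))
by_rule = p(is_king) + p(is_heart) - p(lambda c: is_king(c) and is_heart(c))

# Addition rule, mutually exclusive events: the joint probability is 0.
p_king_or_queen = p(is_king) + p(is_queen)

# Multiplication rule for independent events: two heads in two flips.
p_two_heads = Fraction(1, 2) * Fraction(1, 2)

print(counted, by_rule, p_king_or_queen, p_two_heads)
```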
39Prob (II)
- From Slavin
- Addition Rule: If X and Y are mutually exclusive events, the probability of obtaining either of them is equal to the probability of X plus the probability of Y.
- Multiplication Rule: The probability of the simultaneous or successive occurrence of two events is the product of the separate probabilities of each event.
40Prob (II)
- http://www.midcoast.com.au/turfacts/maths.html
- The product or multiplication rule. "If two chances are mutually exclusive the chances of getting both together, or one immediately after the other, is the product of their respective probabilities."
- The addition rule. "If two or more chances are mutually exclusive, the probability of making ONE OR OTHER of them is the sum of their separate probabilities."
41Let's try with Venn diagrams
42Additional Resources
- Phil Doty, from the ISchool, has taught this class before. He has welcomed us to use his online video tutorials, available at http://www.gslis.utexas.edu/lis397pd/fa2002/tutorials.html
- Frequency Distributions
- z scores
- Intro to the normal curve
- Area under the normal curve
- Percentile ranks, z-scores, and area under the normal curve
- Pretty good discussion of probability
- http://ucsub.colorado.edu/maybin/mtop/ms16/exp.html
43Think this through.
- What are the odds (what are the chances) (what is the probability) of getting two heads in a row?
- Three heads in a row?
- Three flips the same (heads or tails) in a row?
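One way to think these through is to list every equally likely sequence of flips and count the ones that qualify. A minimal sketch, assuming a fair coin; `p_of` is an illustrative helper name.

```python
from fractions import Fraction
from itertools import product

def p_of(event, flips):
    """Probability of an event over all equally likely sequences of fair coin flips."""
    outcomes = list(product("HT", repeat=flips))
    favorable = [o for o in outcomes if event(o)]
    return Fraction(len(favorable), len(outcomes))

def all_heads(outcome):
    return all(flip == "H" for flip in outcome)

def all_same(outcome):
    return len(set(outcome)) == 1

p_two_heads = p_of(all_heads, 2)    # HH out of HH, HT, TH, TT
p_three_heads = p_of(all_heads, 3)  # HHH out of 8 sequences
p_three_same = p_of(all_same, 3)    # HHH or TTT out of 8 sequences

print(p_two_heads, p_three_heads, p_three_same)
```

Three flips the same is twice as likely as three heads, because either HHH or TTT counts; put another way, the first flip can be anything and the next two only have to match it.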
44So then . . .
- WHY were the odds in favor of having two people in our class with the same birthday?
- Think about the problem!
- What if there were 367 people in the class?
- P(2 people with same b'day) = 1.00
45Happy B'day to Us
- But we had 43.
- Probability that the first person has a birthday = 1.00.
- Prob of the second person having the same b'day = 1/365
- Prob of the third person having the same b'day as Person 1 and Person 2 is 1/365 x 1/365: the chances of all three of them having the same birthday.
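The usual way to get the class-sized answer is to turn the question around: work out the probability that NO two people share a birthday, then subtract from 1. The sketch below does that under simplifying assumptions that are mine, not the slides': 365 equally likely birthdays, no leap day, and no twins in the room.

```python
def p_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday.

    Assumes `days` equally likely birthdays and independent people
    (a simplification: leap days and seasonal birth patterns are ignored).
    """
    if n > days:
        return 1.0  # more people than days: a match is guaranteed
    p_all_different = 1.0
    for k in range(n):
        # Person k+1 must avoid the k birthdays already taken.
        p_all_different *= (days - k) / days
    return 1 - p_all_different

p_class = p_shared_birthday(43)      # the 43 people in this class
p_23 = p_shared_birthday(23)         # the classic break-even class size
p_full = p_shared_birthday(367)      # more people than possible birthdays

print(round(p_class, 3), round(p_23, 3), p_full)
```

With 43 people the chance of at least one shared birthday is better than nine in ten. The 1/365 figure answers a different question, namely whether one particular person matches one particular other person; the class has 43 x 42 / 2 = 903 possible pairs, and any one of them matching counts.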
46Sooooo . . .
- http://www.people.virginia.edu/rjh9u/birthday.html
47- http://highered.mcgraw-hill.com/sites/0072494468/student_view0/statistics_primer.html
- Click on Statistics Primer.
48Who wants to guess . . .
- . . . What I think is the most important sentence
in S, Z, Z (2003), Chapter 2?
49p. 19
- Penultimate paragraph, first sentence
- "If differences in the dependent variable are to be interpreted unambiguously as a result of the different independent variable conditions, proper control techniques must be used."
50Homework
- Keep reading.
- See you next week.