Enhancing the Teaching of Statistics Using SPSS Statistics

Slides: 40
Provided by: sharonw3

1
Enhancing the Teaching of Statistics Using SPSS
Statistics
  • Prof. Sharon L. Weinberg
  • New York University
  • sharon.weinberg_at_nyu.edu
  • Prof. Sarah K. Abramowitz
  • Drew University
  • sabramow_at_drew.edu

2
Commonly Asked Questions
  • 1. Will I be able to get copies of the slides
    after the event? Yes.
  • 2. Is this web seminar being taped so I or others
    can view it after the fact? Yes.

3
The Technology Revolution
  • Has changed the way statisticians work and
    should change what and how we teach (GAISE
    report, 2005).
  • A caution: In choosing among technologies, the
    focus cannot be on the technology itself, but on
    how that technology improves the teaching of our
    subject (Moore, 1997, p. 135).

4
Enhancing Student Learning of Statistics: The
GAISE (2005) Report
  • Use Real Data (many others have also made this
    recommendation, e.g., Chance, Ben-Zvi, Garfield,
    and Medina, 2007)
  • Use Technology for Developing Conceptual
    Understanding and Analyzing Such Data
  • Emphasize Statistical Literacy and Develop
    Statistical Thinking
  • Stress Conceptual Understanding Rather than Mere
    Knowledge of Procedures
  • Foster Active Learning in (and Outside) the
    Classroom
  • Use Assessments to Improve and Evaluate Student
    Learning

5
Recommendation 1: Use Real Data
  • Gives students a better idea of what
    statisticians do by having them analyze real and
    often messy data.
  • Facilitates the discussion of more interesting
    problems that often are captured by large and
    more complicated data sets.
  • Motivates students to take ownership of the
    process of analyzing and drawing conclusions
    about data so as to uncover answers to timely and
    relevant questions.
  • Creates an empowering experience: students learn
    skills that may be used in other settings and
    arenas unrelated to the class in question.

6
Sources for Real Data Classroom Activities
  • The Data and Story Library (DASL,
    http://lib.stat.cmu.edu/DASL)
  • The Journal of Statistics Education (JSE) Dataset
    and Stories feature (see
    http://www.amstat.org/publications/jse/jse_data_archive.html)
  • CAUSE (http://www.causeweb.org)

7
Additional Real Data Sources
  • http://www.statsci.org/datasets.html
  • http://www.stat.ucla.edu/data/
  • http://www.cdc.gov/datastatistics/
  • http://www.sci.usq.edu.au/staff/dunn/Datasets/
  • http://www.umass.edu/statdata/statdata/non-local-data.html
  • http://www.amstat.org/publications/jse/jse_data_archive.html
  • http://lib.stat.cmu.edu/
  • http://espn.go.com/
  • https://www.icpsr.umich.edu/
  • http://www.du.edu/idea/data.htm
  • http://mathforum.org/workshops/sum96/data.collections/datalibrary/other.resources.html
  • http://www.google.com/search?q=datasets

8
Another Source of Data: NELS
  • The NELS data set was collected by the U.S.
    Department of Education's National Center for
    Education Statistics (NCES).
  • Nationally Representative Longitudinal Data Set
    to measure achievement outcomes in four core
    subject areas (English, history, mathematics, and
    science), in addition to personal, familial,
    social, institutional, and cultural factors that
    might relate to these outcomes.

9
Students May Find Answers to Such Interesting and
Relevant Questions As
  • Do boys perform better on math achievement tests
    than girls?
  • Does socioeconomic status relate to educational
    and/or income aspirations?
  • To what extent does enrollment in advanced math
    in eighth grade predict twelfth-grade math
    achievement scores?
  • Can we distinguish between students who use
    marijuana and those who don't in terms of
    self-concept?
  • Does owning a computer vary as a function of
    geographical region of residence (Northeast,
    North Central, South, and West)?
  • Note: The text Statistics Using SPSS: An
    Integrative Approach (2nd ed.) by Weinberg &
    Abramowitz (2008) contains a subsample of these
    data, with 500 cases and 48 variables, on an
    accompanying CD.

10
And Yet Another Source of Data: Framingham Heart
Study
  • First prospective study of risk factors and their
    joint effects related to coronary heart disease
    (CHD).
  • Longitudinal data collection began in 1956 on
    5,209 subjects.
  • Risk factors and disease markers of CHD include
    blood pressure, blood chemistry, lung function,
    smoking history, health behaviors, and medication
    use.
  • Note: Data on a random subsample of 400 cases at
    first examination in 1956 and at third
    examination in 1968, blocking on smoking and
    gender, are included on a CD accompanying
    Statistics Using SPSS.

11
Students May Find Answers to Such Interesting and
Relevant Questions As
  • Is the mean body mass index of smokers lower than
    that for non-smokers among the population of
    non-institutionalized adults?
  • What evidence is there to suggest that HDL is
    good cholesterol and LDL is bad cholesterol?
  • Does total serum cholesterol predict the
    incidence of coronary heart disease?
  • Is there a difference in diastolic blood pressure
    levels, on average, between those who do and
    those who do not take anti-hypertensive
    medication?

12
Recommendation 2: Use Technology
  • Advantages:
  • Reduces focus on time-consuming computation
  • Frees students to focus on conceptual
    understanding
  • Enhances the teaching of our subject
  • Models statistical practice
  • Pay Attention to Such Features As:
  • Availability and cost, with student and network
    versions available
  • Ease of use for particular audiences
  • Ease of data entry, ability to import data in
    multiple formats
  • Dynamic linking between data, graphical, and
    numerical analyses
  • Interactive and high-speed capabilities
  • Versatility: an all-purpose tool for use
    throughout the course and beyond, in academic
    settings and industry
  • Portability for classroom and home
  • Our Software Choice: SPSS, which has all of the
    above features.

13
Recommendation 3: Emphasize Statistical Literacy
and Thinking
  • Modeling statistical thinking from conception to
    conclusion: Example 11.8 from Statistics Using
    SPSS.
  • Exploring a large data set and checking
    underlying inferential assumptions: Exercise 6.9
    from Statistics Using SPSS.
  • The importance of open-ended problems: Exercise
    3.15 from Statistics Using SPSS.

14
Modeling statistical thinking from conception to
conclusion
  • Example 11.8, Statistics Using SPSS, using the
    NELS data set. Does the population of
    college-bound males from the southern United
    States who have always been at grade level differ
    from the corresponding population of females in
    terms of the number of years of math in high
    school? Conduct the significance test at
    α = .05.

15
Modeling statistical thinking from conception to
conclusion
  • Use boxplots of the variable number of years in
    math in high school to investigate visually the
    tenability of the underlying assumptions of the
    t-test
  • Boxplots suggest that homogeneity of variance
    assumption is met.
  • Boxplots suggest also that median number of years
    taken in math by males and females is quite
    similar.

16
Modeling statistical thinking from conception to
conclusion
  • The output of the t-test on means

17
Modeling statistical thinking from conception to
conclusion
  • Levene's test suggests that the homogeneity of
    variance assumption of the t-test is met
    (p = .294 > .05).
  • Note that t(148) = 2.107, p = .04.
  • Reject null hypothesis in favor of alternative.
  • Conclude that for those from the South who have
    always stayed on track, mean number of years of
    math taken in high school by college-bound males
    is statistically significantly different from
    that taken by females.
  • Effect size of result, according to Cohen's d, is
    small to moderate.
  • Mean number of years of math taken by males is
    .35 standard deviations greater than mean number
    taken by females.
  • Results appear to contradict visual impressions
    based on the boxplots.
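
The pooled-variance t statistic and Cohen's d reported on this slide can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the SPSS procedure itself: the function name `pooled_t_and_d` and the two small data lists are hypothetical, and the values are not NELS data.

```python
import math
import statistics


def pooled_t_and_d(group_a, group_b):
    """Return (t, df, d) for an equal-variance independent-samples t-test."""
    n_a, n_b = len(group_a), len(group_b)
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    df = n_a + n_b - 2
    # Pooled variance: each group's sample variance weighted by its df.
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / df
    t = mean_diff / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    d = mean_diff / math.sqrt(pooled_var)  # Cohen's d using the pooled SD
    return t, df, d


# Hypothetical years-of-math values for illustration only (NOT the NELS data).
males = [4, 4, 3.5, 4, 3, 4, 3.5, 4]
females = [3.5, 3, 4, 3, 3.5, 3, 4, 3]

t, df, d = pooled_t_and_d(males, females)
print(f"t({df}) = {t:.3f}, Cohen's d = {d:.2f}")
```

With this definition, d equals t multiplied by sqrt(1/n1 + 1/n2), so the effect size depends on the group sizes as well as on t. The slide does not report the two group sizes, so the sketch does not attempt to reproduce the .35 figure.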

18
Modeling statistical thinking from conception to
conclusion
  • Explore the apparent contradiction in results
    with a population pyramid, which provides a more
    detailed view of the separate male and female
    distributions.
  • Even with similarly shaped distributions, medians
    can give a different result from means, so
    multiple representations of data can be useful.
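
A tiny numerical illustration of the same point, using made-up values rather than the NELS data: two groups can share a median while their means differ, which is why a boxplot comparison and a t-test on means need not agree.

```python
import statistics

# Hypothetical years-of-math values (NOT the NELS data).
group_1 = [3, 4, 4, 4, 4, 4, 4]
group_2 = [2, 2, 3, 4, 4, 4, 4]

for name, values in (("group 1", group_1), ("group 2", group_2)):
    print(name,
          "median =", statistics.median(values),
          "mean =", round(statistics.mean(values), 2))
```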

19
Modeling statistical thinking from conception to
conclusion
20
Exploring a large data set and checking underlying
inferential assumptions
  • Exercise 6.9, Statistics Using SPSS
  • Use the Framingham data set to regress initial
    total cholesterol (TOTCHOL1) on initial body mass
    index (BMI1).
  • Create the scatterplot for these two variables.
    Label the scatterplot by ID number and
    superimpose the regression line. Would you say
    that linear regression is appropriate in this
    case? Explain.
  • According to the scatterplot, which person is
    most unusual in terms of the linear trend between
    serum cholesterol and BMI? What is his or her ID
    number? Looking at the data set, is this person
    male or female? How old was this person when the
    study began?
  • With all people included in the data set, what is
    the equation of the regression line? With the
    bivariate outlier omitted, what is the equation
    of the regression line? Do the coefficients
    change as a result of omitting this person from
    the data set?

21
Exploring a large data set and checking underlying
inferential assumptions
  • Scatterplot suggests a linear, rather than a
    non-linear, relationship between these variables.
  • A bivariate outlier (case 205) may be identified
    as 55 years old and female. She has an unusual
    combination of total cholesterol and BMI values
    relative to the other cases.

22
Exploring a large data set and checking underlying
inferential assumptions
  • To what extent does this individual influence the
    regression coefficients?
  • With all people included, the equation is
    predicted TOTCHOL1 = 1.82(BMI1) + 189.98
  • When omitting ID 205, the equation is
    predicted TOTCHOL1 = 1.93(BMI1) + 186.52
  • Using technology we are able to demonstrate
    simply that even one case in a data set of 400
    cases can influence the results of a regression
    analysis.
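
The with-and-without comparison can be sketched outside SPSS as well. The example below fits a least-squares line twice, once on all cases and once with a single case omitted. The helper names (`fit_line`, `fit_without`) and the seven data rows are hypothetical, chosen only so that one case is a clear bivariate outlier. They are not the Framingham data.

```python
import statistics


def fit_line(xs, ys):
    """Least-squares (slope, intercept) for predicting y from x."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_without(cases, omit_id=None):
    """Fit the line after dropping the case whose id equals omit_id."""
    kept = [c for c in cases if c[0] != omit_id]
    return fit_line([c[1] for c in kept], [c[2] for c in kept])


# Hypothetical (id, BMI, total cholesterol) rows, NOT the Framingham data.
cases = [(1, 21.0, 210), (2, 23.5, 225), (3, 25.0, 232), (4, 26.5, 240),
         (5, 28.0, 246), (6, 30.0, 255), (7, 24.0, 400)]

for label, omit in (("all cases", None), ("omitting id 7", 7)):
    slope, intercept = fit_without(cases, omit_id=omit)
    print(f"{label}: predicted chol = {slope:.2f}(BMI) + {intercept:.2f}")
```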

23
The importance of open-ended problems
  • Open-ended problems -- the student is required to
    develop an analytic strategy on his/her own;
    he/she is not asked to carry out a series of
    directed analyses (e.g., compute the mean,
    compute the SD, etc.).
  • As an example, see Exercise 3.15, Statistics
    Using SPSS. This question asks students to
    provide a demographic assessment by sex of total
    serum cholesterol (TOTCHOL1) measured at baseline
    in the Framingham data set.
  • The student needs to construct an appropriate
    strategy for arriving at this assessment.

24
The importance of open-ended problems
  • One Approach. According to the boxplot, the
    cholesterol distribution is fairly symmetric for
    men, but positively skewed for women. The IQR for
    females is larger, so their cholesterol values
    are more heterogeneous. The median cholesterol
    for women is slightly higher than it is for men.
    Descriptive statistics quantify these
    impressions. According to the skewness ratio, the
    distribution is fairly symmetric for men (1.17)
    and severely positively skewed for women (4.34).
    The median cholesterol for men in the study at
    initial examination (median = 231.50) was
    slightly lower than for women (median = 239.00).
    The distribution of cholesterol for men was
    slightly more homogeneous (IQR = 54) than it was
    for women (IQR = 60). For men, the cholesterol
    levels ranged from 133 to 333, whereas for women
    they ranged from 152 to 464.
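
The descriptive statistics named in this approach (median, IQR, range, and skewness ratio) can be computed with a short helper. This sketch assumes the skewness ratio is the skewness statistic divided by its standard error, and it uses the adjusted Fisher-Pearson skewness formula. Quartile conventions also differ between packages, so the IQR may not match SPSS output exactly. The function name `describe` and the data are hypothetical, not the Framingham data.

```python
import math
import statistics


def describe(values):
    """Median, IQR, range, and skewness ratio for one group of scores."""
    n = len(values)
    mean, sd = statistics.mean(values), statistics.stdev(values)
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    # Adjusted Fisher-Pearson skewness and its standard error.
    skew = n / ((n - 1) * (n - 2)) * sum(((v - mean) / sd) ** 3 for v in values)
    se_skew = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    return {
        "median": statistics.median(values),
        "iqr": q3 - q1,
        "range": (min(values), max(values)),
        "skew_ratio": skew / se_skew,
    }


# Hypothetical cholesterol values with one high score (NOT the Framingham data).
right_skewed = [152, 180, 200, 215, 230, 239, 250, 270, 300, 464]
summary = describe(right_skewed)
print(summary)
```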

25
Additional Ways to Emphasize Statistical Literacy
and Thinking
  • Choosing an appropriate method of analysis and
    leaving it up to the student: Exercise 11.29,
    Statistics Using SPSS.
  • Visualizing data: Table 6.4, Anscombe data sets
    -- Statistics Using SPSS.
  • Using simulations: Example 9.5, Statistics Using
    SPSS.

26
Choosing an appropriate method of analysis:
leaving it up to the student
  • Exercise 11.29, Statistics Using SPSS. For each
    of the following questions based on the NELS data
    set, select an appropriate statistical procedure
    to use to answer it from the list that follows.
    Then use SPSS to conduct the appropriate
    hypothesis test in cases where the underlying
    assumptions are tenable. If the result of a
    hypothesis test is statistically significant,
    report and interpret an appropriate measure of
    effect size.
  • Among college-bound students who are always at
    grade level, do those who attended nursery school
    (NURSERY) tend to have higher SES than those who
    did not?
  • Among college-bound students who are always at
    grade level, does self-concept differ in eighth
    (SLFCNC08) and tenth (SLFCNC10) grades?
  • Among college-bound students who are always at
    grade level, do those who attend public school
    (SCHTYP8) perform differently in twelfth-grade
    math achievement (ACHMATH12) from those who
    attend private school? (Note that to answer this
    question the variable SCHTYP8 has to be recoded
    to be dichotomous as described in Chapter 4.)
  • Among college-bound students who are always at
    grade level, do families typically have four
    members (FAMSIZE)?
  • Among college-bound students who are always at
    grade level, do students tend to take more years
    of English (UNITENGL) than math (UNITMATH)?

27
Visualizing data: Anscombe's (1973) data sets
from Statistics Using SPSS.

28
Visualizing data: Anscombe's (1973) data sets
from Statistics Using SPSS.
  • While visually quite different, for all four
    panels of data
  • M_X = 9.0, M_Y = 7.5, S_X = 3.17, S_Y = 1.94,
    r_XY = .82,
  • the standard error of the estimate is 1.12, and
  • predicted Y = 0.5X + 3.
  • Underscores the importance of having the ability
    to visualize one's data.
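
The shared summary statistics can be checked directly. The sketch below uses the widely reproduced values of Anscombe's (1973) quartet, which are assumed here to match Table 6.4 of the text. The helper name `summarize` is hypothetical. It reports the means, the correlation, and the regression line for each panel.

```python
import math
import statistics

# Anscombe's quartet as commonly reproduced (assumed to match Table 6.4).
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                  6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                   6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
            5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Means, Pearson r, and least-squares line for one panel."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {"mean_x": mx, "mean_y": my, "r": sxy / math.sqrt(sxx * syy),
            "slope": slope, "intercept": my - slope * mx}


for panel, (xs, ys) in QUARTET.items():
    s = summarize(xs, ys)
    print(panel, {key: round(value, 2) for key, value in s.items()})
```

The four panels agree to about two decimal places on every statistic printed, which is the point of the slide: only a plot reveals how different the data sets are.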

29
The Utility of Simulations
  • See Example 9.5, Statistics Using SPSS, which
    uses simulation to appreciate the power of the
    Central Limit Theorem.
  • Based on the following syntax file, students may
    view firsthand how the sampling distribution of
    means varies as a function of sample size.

30
The Utility of Simulations
  • The syntax file SAMPDISVER2.SPS appears below:

    DEFINE bootmn (nsamp = !tokens(1) / nsize = !tokens(1)
      / samsize = !tokens(1) / bootvar = !tokens(1)
      / outfile = !tokens(1)).
    Sort cases by !bootvar.
    vector data (!nsize).
    compute data($casenum) = !bootvar.
    compute nobreak = 1.
    Aggregate outfile = *
      /break = nobreak
      /data1 to !concat(data,!nsize) = max(data1 to !concat(data,!nsize)).
    Vector data = data1 to !concat(data,!nsize).
    vector tmp(!nsize).
    loop p = 1 to !nsamp.
    loop q = 1 to !nsize.
    compute tmp(q) = 0.
    End loop.
    Loop I = 1 to !samsize.
    compute id = trunc(uniform(!nsize) + 1).
    Compute tmp(id) = tmp(id) + 1.

31
The Utility of Simulations
  • Figure 9.5. Positively skewed population of 1,000
    scores: μ = 8, σ = 4.

32
The Utility of Simulations
  • Figure 9.6. Sampling distribution of 10,000
    means, each of size 100: μ = 8, σ = 0.4.
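
The same demonstration can be sketched in Python. This is not a translation of the SPSS macro. It is a minimal simulation that assumes a gamma-distributed population (shape 4, scale 2, so a theoretical mean of 8 and SD of 4) as a stand-in for the positively skewed population in Figure 9.5. The function name `simulate_sample_means` and the seeds are hypothetical.

```python
import random
import statistics


def simulate_sample_means(population, n_samples, sample_size, seed=0):
    """Means of n_samples random samples drawn with replacement."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.choices(population, k=sample_size))
            for _ in range(n_samples)]


# Stand-in population of 1,000 positively skewed scores (assumed gamma shape).
pop_rng = random.Random(1)
population = [pop_rng.gammavariate(4, 2) for _ in range(1000)]

means = simulate_sample_means(population, n_samples=10_000, sample_size=100)

pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)
print(f"population:   mean = {pop_mean:.2f}, SD = {pop_sd:.2f}")
print(f"sample means: mean = {statistics.fmean(means):.2f}, "
      f"SD = {statistics.pstdev(means):.2f}")
print(f"SD predicted by the theorem: {pop_sd / 100 ** 0.5:.2f}")
```

Rerunning with a different `sample_size` shows how the spread of the sampling distribution shrinks as the sample size grows. A histogram of `means` would show the approach to a normal shape.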

33
Recommendation 4: Stress Conceptual Understanding
Over Procedural Knowledge
  • Avoid time-consuming mechanical hand computation
    through the use of technology.
  • Focus on data exploration of a single data set to
    show that only from multiple analyses can one
    achieve a thorough understanding of the
    information contained in that data set.
  • Present formulas in a form that enables greater
    conceptual understanding, not computational ease.
    See Equation 3.4 in Statistics Using SPSS.

34
Recommendation 5: Foster active learning in (and
outside) the classroom
  • Bring an applied perspective to an otherwise
    theoretical or conceptual presentation by
    demonstrating how one may apply the method(s)
    discussed. We have found that by using SPSS one
    can accomplish this well.
  • Demonstrate with the help of real data that
    students can relate to (e.g., the NELS).
  • Present the theoretical or conceptual
    underpinnings of a method before demonstrating on
    SPSS, for example.
  • Have instructor demonstrate on his/her own
    computer with class following along.
  • Provide good notes of how to replicate what has
    been demonstrated in class. Our text has the
    SPSS commands explicitly stated in boxed-in
    areas.
  • Assign data-driven problems for homework that
    continue to reinforce statistical literacy and
    thinking.

35
Recommendation 6: Assessments
  • Align with Learning Goals.
  • Focus on Understanding Key Ideas and Not Just on
    Skills, Procedures, and Computed Answers.
  • Include the Communication of Statistical Concepts
    and Analytic Results.
  • Use Multiple Formats: Quizzes, Exams, Homework
    Problem Sets, Article Critiques, Individual and
    Group Term Projects.

36
Poised for the Future
  • With a good conceptual understanding of
    statistics and a good hands-on working knowledge
    of a versatile and accessible software package
    like SPSS, students will be well-poised for
    future study in statistics and for conducting
    their own research at the undergraduate or
    graduate level that relies on a fundamental
    quantitative analysis of data.
  • Statistics Using SPSS: An Integrative Approach
    has been written to allow students to achieve
    both goals under one cover.

37
Where to Find Materials Discussed Today
  • Statistics Using SPSS: An Integrative Approach
  • http://www.cambridge.org
  • ISBN 9780521676373
  • Download a free, 30-day trial of SPSS Statistics
    17.0 at:
  • http://www.spss.com/statistics
  • Click on the Downloads tab

38
References
  • Anscombe, F.J. (1973). Graphs in statistical
    analysis. American Statistician, 27, 17-21.
  • Chance, B., Ben-Zvi, D., Garfield, J., & Medina,
    E. (2007). The role of technology in improving
    student learning of statistics. Technology
    Innovations in Statistics Education, 1(1),
    Article 2.
    http://repositories.cdlib.org/uclastat/cts/tise/vol1/iss1/art2
  • Friel, S. (2007). The research frontier: Where
    technology interacts with the teaching and
    learning of data analysis and statistics. In G.W.
    Blume & M.K. Heid (Eds.), Research on technology
    and the teaching and learning of mathematics:
    Cases and Perspectives (Vol. 2, pp. 279-331).
    Greenwich, CT: Information Age Publishing, Inc.
  • GAISE (2005). Guidelines for assessment and
    instruction in statistics education (GAISE)
    college report. The American Statistical
    Association (ASA). Retrieved August 25, 2008.
    http://www.amstat.org/education/gaise/GAISECollege.htm
  • Moore, D.S. (1997). New pedagogy and new
    content: The case of statistics. International
    Statistical Review, 65, 123-165.
  • Weinberg, S.L., & Abramowitz, S.K. (2008).
    Statistics Using SPSS: An Integrative Approach
    (2nd ed.). New York: Cambridge University Press.

39
Contact Information
  • Prof. Sharon L. Weinberg
  • New York University
  • sharon.weinberg_at_nyu.edu
  • Prof. Sarah K. Abramowitz
  • Drew University
  • sabramow_at_drew.edu