LEARNING TO USE SPSS: SOME COMMON PROBLEMS

1
LEARNING TO USE SPSS: SOME COMMON PROBLEMS
  • Paul Kinnear
  • University of Aberdeen

2
Introduction
  • Colin Gray and I have been teaching our students
    statistics and the use of statistical packages
    for many years.
  • We have written several books on SPSS, published
    by LEA and then Psychology Press.
  • The latest is SPSS12 Made Simple (2004).
  • We are presently updating it for SPSS14.

3
Our associates
  • Our teaching has benefited hugely from the advice
    and support of John Lemon, our University's
    Senior Computing Adviser, and Caroline Green,
    Senior Teaching Fellow in the School of
    Psychology.
  • Today, I would like to share with you some of our
    experiences in teaching SPSS to our students.

4
Purpose of this talk
  • The advent of Windows-based versions of SPSS has
    been a boon to students and teachers alike.
  • Certain aspects of SPSS, however, present
    problems for students.
  • My purpose today is to identify some of the most
    common difficulties.

5
Example of a problem
  • Many difficulties arise from users not taking
    time to make a list of what their dependent and
    independent variables are, and what they are
    trying to show in terms of their experimental
    hypothesis (or hypotheses).

6
Summary of problem areas
  • Not identifying the variables and their types.
  • Not understanding the differences between within
    subjects and between subjects designs.
  • Not undertaking exploratory data analysis (EDA)
    before launching into statistical analyses.
  • Not knowing how to select the appropriate
    analysis.

7
Summary of problem areas (continued)
  • Not knowing how to edit out unnecessary bits of
    output.
  • Misunderstanding the meaning of p-values recorded
    as .0000
  • Not knowing how to process frequency data in
    contingency tables.

8
Where problems can arise
  • Data entry.
  • Choosing the right statistics.
  • Interpretation of the output.
  • These areas are certainly not independent:
    problems in one area often arise from problems in
    another.

9
Data entry
  • Often, what seems to be a problem with data
    entry arises from a lack of clarity about the
    purpose and design of the investigation and
    about the variables involved.

10
Purpose and design
  • The user must be able to answer the following
    questions:
  • What was the hypothesis?
  • How was the study designed?
  • What pattern in the results would confirm the
    hypothesis?

11
The variables: some key questions for learners
  • Was your investigation an experiment or a
    correlational study?
  • If it was an experiment, what were the
    independent and dependent variables?
  • Was the experiment a between subjects or a within
    subjects (or possibly a mixed) design?

12
Key terms
  • To use SPSS, the user must be fully conversant
    with some standard terms in experimental design.
  • The user needs to know these terms in order to
    complete SPSS's dialog boxes.

13
Some common terms
  • Dependent (DV) and independent (IV) variables.
  • Between subjects versus within subjects (repeated
    measures).
  • Independent versus related samples.
  • Nominal, ordinal and interval (or scale) data.
  • Factors and levels.
  • Covariates.

14
Between subjects or within subjects?
  • Users are often unclear about this distinction.
  • A treatment factor is between subjects if there
    is no basis for pairing the data obtained under
    different conditions.
  • By contrast, if the same participant is tested
    twice under different conditions, a pair of
    scores will result and the factor is within
    subjects.
  • Such paired data are likely to be correlated.

15
Misunderstood concepts: univariate, bivariate and
multivariate statistics
  • Univariate statistics relate to analyses with
    just one DV (e.g. driving performance).
  • Bivariate statistics relate to analyses such as
    correlation where two variables are measured but
    neither can be considered as an IV (e.g. scores
    in two tests).
  • Multivariate statistics relate to analyses with
    more than one DV (e.g. reaction time and number of
    errors).

16
We discourage the use of MANOVA
  • MANOVA is multivariate analysis of variance, i.e.
    ANOVA applied to more than one DV.
  • In line with Tabachnick & Fidell's recommendation
    (Using Multivariate Statistics, 4th Edition,
    p. 323), we discourage students from using MANOVA
    and encourage them instead to use several
    univariate ANOVAs.
  • The difficulty is that a significant MANOVA
    result does not tell you which DVs are sensitive
    to the IV(s).

17
SPSS data sets
  • An SPSS data set in the Data Editor is not in all
    respects like the sort of table one sees in a
    book or a journal article.
  • In a published table, the columns may represent
    data gathered on different groups of people.
  • In an SPSS data set, each row contains data on
    only one participant.

18
Between subjects example (p.23)
  • An experiment has been carried out to investigate
    the effects of a drug upon performance.
  • The design is between subjects: a Placebo and a
    Drug group are compared in their performance on a
    test of skill.

19
A between subjects experiment (Kinnear & Gray
2004, page 23)
20
Grouping variables
  • Each participant's group membership is indicated
    by a code number: 1 = Placebo, 2 = Drug.
  • Together, the code numbers make up what is known
    as a grouping variable.
  • Now, each row in the data set contains
    information on only one participant.
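
The rearrangement described above can be sketched outside SPSS. The following Python snippet is only an illustration, not part of the original talk: the scores and the names published_table, group_codes and to_case_rows are hypothetical. It turns a published-style table (one column per group) into rows that each describe one participant, with a grouping variable coded 1 = Placebo, 2 = Drug.

```python
# Hypothetical scores laid out as in a published table: one column per group.
published_table = {
    "Placebo": [6, 5, 5],
    "Drug": [8, 6, 6],
}

# Code numbers for the grouping variable, as on the slide: 1 = Placebo, 2 = Drug.
group_codes = {"Placebo": 1, "Drug": 2}


def to_case_rows(table, codes):
    """Return one row per participant: case number, group code, score."""
    rows = []
    case = 1
    for group_name, scores in table.items():
        for score in scores:
            rows.append({"case": case, "group": codes[group_name], "score": score})
            case += 1
    return rows


rows = to_case_rows(published_table, group_codes)
for row in rows:
    print(row["case"], row["group"], row["score"])
```

Each printed line corresponds to one row of the Data Editor: the group column is the grouping variable and the score column is the DV.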

21
Between subjects data in SPSS format (ibid page
24)
22
Variable View of Data Editor for between subjects
design (ibid page 29)
23
Data View of Data Editor for between subjects
design (ibid p.32)
24
Within subjects example (p.250)
  • Participants were each asked to produce three
    pictures of an object using a different drawing
    medium for each.
  • The media were crayon, paintbrush and felt-tip
    pen.
  • A panel of judges then rated the aesthetic
    pleasingness of the pictures.
  • Since each participant used all three media, the
    design is within subjects. (The order of using
    the media was counterbalanced across participants.)
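
For a within subjects design, the one-row-per-participant rule means that each level of the factor gets its own column. The Python sketch below is an illustration only: the ratings and the names observations, media and to_wide_rows are hypothetical. It collects ratings recorded one at a time into one row per participant, with a column for each drawing medium.

```python
# Hypothetical ratings recorded one observation at a time:
# (participant, medium, rating).
observations = [
    (1, "crayon", 10), (1, "paintbrush", 12), (1, "felt_tip", 14),
    (2, "crayon", 18), (2, "paintbrush", 10), (2, "felt_tip", 16),
]

# The three levels of the within subjects factor, in column order.
media = ["crayon", "paintbrush", "felt_tip"]


def to_wide_rows(obs, levels):
    """Return one row per participant: [participant, rating per level...]."""
    by_participant = {}
    for participant, medium, rating in obs:
        by_participant.setdefault(participant, {})[medium] = rating

    rows = []
    for participant in sorted(by_participant):
        ratings = by_participant[participant]
        missing = [m for m in levels if m not in ratings]
        if missing:
            # A within subjects row needs a score under every condition.
            raise ValueError(f"participant {participant} lacks {missing}")
        rows.append([participant] + [ratings[m] for m in levels])
    return rows


wide_rows = to_wide_rows(observations, media)
for row in wide_rows:
    print(row)
```

No grouping variable is needed here, because every participant contributes a score under every condition.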

25
Within subjects experiment (ibid page 250)
26
Variable View of Data Editor for within subjects
design
27
Data View of Data Editor for within subjects
design
28
Not undertaking exploratory data analysis (EDA)
before launching into statistical analyses.
  • Learners are often so excited at having
    completed a data file that they do not bother to
    check:
  • Whether there have been any transcription
    errors.
  • Whether there are outliers needing attention.

29
Exploring your data
  • Statistics such as the mean, SD and r are very
    useful for summarising some feature of the data.
    But they can all be misleading.
  • It is essential to use graphical methods to
    confirm the summaries provided by the statistics.
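
A small numerical illustration of how a summary statistic can mislead, offered as a companion to graphical checks rather than a replacement for them. The data are hypothetical (a score of 15 mistyped as 150), and the 1.5 x IQR fence is the familiar boxplot rule of thumb, used here as an assumption rather than as the method recommended in the talk.

```python
import statistics

# Hypothetical test scores; the last value is a transcription error (15 typed as 150).
scores = [12, 14, 15, 15, 16, 17, 18, 150]

mean = statistics.mean(scores)      # pulled upwards by the single bad value
median = statistics.median(scores)  # barely affected by it

# Boxplot-style fences: 1.5 interquartile ranges beyond the quartiles.
q1, _, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [x for x in scores if x < lower_fence or x > upper_fence]

print("mean:", mean)
print("median:", median)
print("values needing attention:", outliers)
```

A large gap between the mean and the median is a cue to look at a boxplot or histogram before running any formal test.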

30
Get to know your data first
  • Learners must be warned against pressing ahead
    with formal tests before they have first explored
    their data thoroughly.

31
Panelled bar charts in SPSS14
32
Pie Chart
33
What to look for in a scatterplot
  • The cloud of points should either be elliptical
    or circular.
  • An ellipse indicates a linear relationship; a
    circular cloud indicates independence.

34
Example of an outlier
35
More problems
  • Failure to label variables and values carefully.
    This can make the output very difficult to read.
  • Inability to analyse nominal data.

36
Choosing the right statistic
  • We use statistical techniques for TWO purposes.
  • Summarising and exploring our data.
  • Confirming the patterns we find in our data.

37
Five common research situations (Kinnear & Gray
2004, page 6)
38
Not knowing how to select the appropriate analysis
  • Learners have difficulty finding the ANOVA
    analysis because SPSS does not have an item in
    the Analyze drop-down menu called ANOVA.
  • They also have difficulty finding the chi-square
    test of association and other statistics for
    analysing contingency tables because they do not
    remember that these are in the Crosstabs item of
    the Descriptive Statistics menu.

39
Yet more problems
  • Failure to remove unnecessary items from the SPSS
    output.
  • Confusion caused by expressions such as .0000 in
    columns of p-values.

40
Not knowing how to edit out unnecessary bits of
output
  • Learners are prone to print all SPSS's output
    without taking the trouble to edit out
    unnecessary bits.
  • Editing can include cutting out whole items
    (tables, graphics) or just bits within tables.
  • The output for within subjects ANOVA can be
    particularly obscure without appropriate editing.

41
Need to edit output too
  • Learners are also wary of trying to edit output.
  • The result is that they copy large tables with
    unnecessary columns and unnecessary decimal
    places into their documents.

42
More extensive outputs
  • Let's have a look at some ANOVA outputs and see
    how we can edit them.

43
Complete within subjects output
44
Within subjects Mauchly output
45
Within subjects effects output
46
ANOVA summary table after editing
47
Edited Bonferroni output
48
Profile plot before editing
49
Profile plot after editing: the differences are
less impressive
50
Results of editing output
  • Hopefully you will agree that suitable editing of
    the SPSS output has made the material much more
    suitable for inserting into laboratory or
    research reports.
  • Learners often think that they are obliged to
    paste everything SPSS outputs into their reports.

51
Misunderstanding the meaning of p-values recorded
as .0000
  • Learners seem to have difficulty interpreting
    p-values.
  • They often do not realise that .0600 is larger
    than .0500 and hence that the statistical test is
    not significant (assuming α = 0.05).
  • They are bewildered by decimals with just zeros
    after the point, e.g. .0000
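
A displayed value of .0000 does not mean that p is exactly zero; it means that p is too small to show in four decimal places. The Python sketch below illustrates this under stated assumptions: the function names displayed, report and is_significant are hypothetical, and the "p < .0001" wording is one common reporting convention, not something prescribed in the talk.

```python
def displayed(p, places=4):
    """Format p as a fixed-decimal table cell without a leading zero, e.g. .0600."""
    return f"{p:.{places}f}".lstrip("0")


def report(p, places=4):
    """Return a report string, avoiding the misleading all-zeros form."""
    if round(p, places) == 0:
        # p is below the display precision, so give an upper bound instead.
        return "p < " + displayed(10 ** -places, places)
    return "p = " + displayed(p, places)


def is_significant(p, alpha=0.05):
    """A result is significant only when p is smaller than alpha."""
    return p < alpha


for p in (0.06, 0.05, 0.00003):
    print(displayed(p), "|", report(p), "| significant:", is_significant(p))
```

The first case shows why .0600 is not significant at the .05 level, and the last shows how a very small p-value ends up displayed as .0000.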

52
Not knowing how to process frequency data in
contingency tables
  • Two problems here:
  • How to enter the data.
  • How to find chi-square and other contingency
    table statistics.
  • Learners need reminding that they need two coding
    variables as well as a count variable.

53
Contingency table example (Kinnear & Gray 2004,
page 308)
54
Contingency Table for SPSS
  • The table has to be re-arranged into two coding
    variables and a count variable as in the next
    slide.

55
Two coding variables and a count variable (ibid
p.309)
56
Remember the Weight Cases procedure
  • When data are aggregated as in the previous
    table, it is necessary to tell SPSS that the
    numbers in the Count variable are frequencies.
  • This is done by means of the Weight Cases item in
    the Data drop-down menu.
  • Weight Cases is not needed when each row of the
    data set represents a participant.
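
What weighting by a count variable achieves can be sketched in Python. This is an illustration of the arithmetic, not of SPSS itself: the counts are hypothetical, and the names aggregated, observed, expected and chi_square are assumptions made for the example. Each row holds two coding variables and a count, and the count is treated as the number of participants in that cell when the chi-square statistic is computed.

```python
from collections import defaultdict

# Hypothetical aggregated data: (row code, column code, count).
aggregated = [
    (1, 1, 30),
    (1, 2, 10),
    (2, 1, 20),
    (2, 2, 40),
]

# Treat each count as a frequency: this is the role that weighting plays.
observed = defaultdict(int)
row_totals = defaultdict(int)
col_totals = defaultdict(int)
for row_code, col_code, count in aggregated:
    observed[(row_code, col_code)] += count
    row_totals[row_code] += count
    col_totals[col_code] += count
n = sum(row_totals.values())

# Expected frequency for a cell = row total * column total / grand total.
expected = {
    (r, c): row_totals[r] * col_totals[c] / n
    for r in row_totals
    for c in col_totals
}

# Pearson chi-square: sum over cells of (observed - expected)^2 / expected.
chi_square = sum(
    (observed[cell] - expected[cell]) ** 2 / expected[cell] for cell in expected
)

print("N =", n)
print("expected frequencies:", expected)
print("chi-square =", round(chi_square, 3))
```

If the counts were ignored, the four rows would be treated as four participants, one per cell, which is the mistake that weighting prevents.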

57
The Weight Cases procedure for getting SPSS to
treat the data in Count as a frequency
58
Finding the Chi-square measure of association in
Crosstabs
59
The Crosstabs dialog box (ibid p.310)
60
Selecting Chi-square and expected frequencies
61
Contingency table with expected counts
62
Chi-square output (continued)
63
Conclusions
  • I have tried to identify the most common
    difficulties we have found for people learning to
    use SPSS.
  • Some of these are due to basic statistical
    inexperience but others relate to the way in
    which SPSS operates.
  • Learners find difficulties in working out how to
    re-arrange their data to fit in with SPSS's Data
    Editor requirement that rows generally represent
    single participants (or cases) and columns
    represent variables.

64
Conclusions (continued)
  • Learners need to be clear about what are their
    DVs and what are their IVs. They also need to be
    clear about whether the experiment has
    independent or related samples of scores (i.e.
    between subjects or within subjects designs).
  • Prudent editing of output is desirable.
  • They need to understand how to re-arrange data
    for the Chi-Square test of association.

65
The End
  • Many thanks for your attention.