Intro to Statistics for the Behavioral Sciences PSYC 1900 - PowerPoint PPT Presentation

About This Presentation
Title:

Intro to Statistics for the Behavioral Sciences PSYC 1900

Description:

Intro to Statistics for the Behavioral Sciences PSYC 1900 Lecture 2: Basic Concepts and Data Visualization

Slides: 30
Provided by: DavidDe164
Transcript and Presenter's Notes

Title: Intro to Statistics for the Behavioral Sciences PSYC 1900


1
Intro to Statistics for the Behavioral
Sciences PSYC 1900
  • Lecture 2: Basic Concepts and Data Visualization

2
Primary Goal
Statistics
3
Why do we use statistics?
Is This Difference Meaningful?
Do statistics lie?
Adherence to Scientific Method
Specific Assumptions
Long-Term Replicability
4
Definition of Terms
  • Variable
  • A concept or entity of interest on which
    variability exists
  • Goal of behavioral science research is to explain
    why scores differ
  • Sample
  • Set of observations used in analysis
  • Subset of the population
  • Population
  • Entire set of relevant observations
  • Findings with sample are used to generalize to
    population
  • What is the Harvard Student Body?

5
Definitions Continued
  • Statistics
  • Numerical values summarizing sample data
  • Examples: mean, median, variance
  • Parameters
  • Numerical values summarizing population data
  • We estimate population parameters based on sample
    statistics
  • Random Sample
  • Sample in which each member of population has an
    equal chance of inclusion.
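
The relationship between a parameter and a statistic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population is simulated (10,000 hypothetical scores with mean 100 and standard deviation 15), and the names `population`, `parameter`, and `statistic` are made up for the example.

```python
import random
from statistics import mean

rng = random.Random(1900)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 simulated scores (an assumption, not course data).
population = [rng.gauss(100, 15) for _ in range(10_000)]
parameter = mean(population)          # population parameter (normally unknown)

# Random sample: every member has an equal chance of inclusion.
sample = rng.sample(population, 50)
statistic = mean(sample)              # sample statistic used to estimate the parameter

print(f"population mean (parameter): {parameter:.2f}")
print(f"sample mean (statistic):     {statistic:.2f}")
```

The two means will usually be close but not identical, which is why the sample statistic is treated as an estimate of the parameter.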

6
Descriptive vs. Inferential Statistics
  • Distinct types for distinct purposes
  • Descriptive
  • Purpose is to provide statistics that summarize
    or capture nature of the sample
  • Mean is the average score
  • Standard deviation is a measure of the typical
    deviation of scores from the mean (i.e., how
    well the mean captures the scores of the sample)
  • Inferential
  • Purpose is to calculate probability that
    differences in statistics across groups or levels
    of relationships among variables reflect the
    operation of chance alone.
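
The contrast between the two types can be sketched in Python. The scores below are hypothetical, and the inferential step uses an exact permutation test, which is one simple way to ask how often chance alone would produce a difference this large; it is an illustrative choice, not necessarily the technique used later in the course.

```python
from itertools import combinations
from statistics import mean, stdev

# Hypothetical scores for two small groups (illustrative values only).
group_a = [5, 6, 7, 8]
group_b = [1, 2, 3, 4]

# Descriptive: summarize each sample with its mean and standard deviation.
summary = {
    "a": (mean(group_a), stdev(group_a)),
    "b": (mean(group_b), stdev(group_b)),
}

# Inferential: if group labels were assigned by chance alone, how often
# would the difference in means be at least as large as the observed one?
observed = mean(group_a) - mean(group_b)
pooled = group_a + group_b
n_a = len(group_a)

total = 0
count_extreme = 0
for idx in combinations(range(len(pooled)), n_a):
    chosen = [pooled[i] for i in idx]
    rest = [pooled[i] for i in range(len(pooled)) if i not in idx]
    total += 1
    if abs(mean(chosen) - mean(rest)) >= abs(observed):
        count_extreme += 1

p_value = count_extreme / total  # two-sided probability under chance alone
print(summary, observed, p_value)
```

Here only 2 of the 70 possible splits are as extreme as the observed one, so chance alone is an unlikely explanation for the difference.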

7
Measurement
  • In order to conduct analyses, we have to assign
    values or codes to observations.
  • Different types of data require different types
    of scales.
  • Scale types determine which analytic procedures
    are appropriate

8
Measurement Scales
  • There are two broad types containing four
    subtypes.
  • Qualitative: nominal scales
  • Quantitative: ordinal, interval, and ratio
    scales.

9
Nominal Scales
  • Categorical in nature
  • No ordering is possible
  • Examples: Religion, Ethnicity, Gender
  • We can assign numerical codes, but they do not
    represent any magnitude or ordering information
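
A short Python sketch of why nominal codes carry no magnitude: recoding the same categories changes the "mean" of the codes, while the most frequent category (the mode) stays the same. The category labels and both coding schemes are hypothetical.

```python
from collections import Counter
from statistics import mean

# Hypothetical nominal responses with arbitrary category labels.
responses = ["A", "C", "C", "B", "C", "A"]

# Two equally valid (and equally arbitrary) numeric coding schemes.
codes_1 = {"A": 1, "B": 2, "C": 3}
codes_2 = {"A": 2, "B": 3, "C": 1}

mean_1 = mean(codes_1[r] for r in responses)
mean_2 = mean(codes_2[r] for r in responses)

# The mode depends only on category counts, not on the codes.
modal_category = Counter(responses).most_common(1)[0][0]

print(mean_1, mean_2, modal_category)
```

Because the mean depends on which arbitrary codes were chosen, it is not a meaningful summary for nominal data; counts and the mode are.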

10
Ordinal Scales
  • Order is provided
  • No information provided about magnitudes of
    differences between points on the scale
  • Examples: Rankings
  • We can again use numerical codes, but they do not
    offer information on levels of difference or
    additivity

11
Interval Scales
  • Order is provided
  • Equivalence of differences between points is
    provided
  • Examples: Fahrenheit, Likert Scales (?)
  • Majority of statistical techniques we will cover
    are designed for use with interval or ratio data.

12
Ratio Scales
  • Order is provided
  • Equivalence of differences between points is
    provided
  • Scale has an absolute and meaningful zero point.
  • Examples: Kelvin, Salary, Hormone Levels
  • For ratio scaled data, we tend to use raw data
    descriptors. For interval, we often use
    standardized descriptors (e.g., z-scores)
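
The practical difference between interval and ratio scales can be illustrated with the two temperature examples. This is a minimal sketch with made-up temperatures: differences are meaningful on both scales, but ratios are only meaningful where zero is absolute (Kelvin).

```python
def fahrenheit_to_kelvin(deg_f):
    """Convert degrees Fahrenheit to kelvins."""
    return (deg_f - 32.0) * 5.0 / 9.0 + 273.15

cool_f, warm_f = 40.0, 80.0  # hypothetical temperatures

# Interval scale: the 40-degree difference is meaningful...
diff_f = warm_f - cool_f

# ...but the ratio is not, because 0 °F is an arbitrary zero point.
ratio_f = warm_f / cool_f

# Ratio scale: Kelvin has an absolute zero, so this ratio is meaningful.
ratio_k = fahrenheit_to_kelvin(warm_f) / fahrenheit_to_kelvin(cool_f)

print(f"F ratio: {ratio_f:.2f}, K ratio: {ratio_k:.2f}")
```

On the Fahrenheit numbers 80 looks like "twice" 40, yet on the Kelvin scale the warmer temperature is only about 8% higher.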

13
More Definitions
  • Discrete Variables
  • Take on a small set of possible values
  • Continuous Variables
  • Variables that can take any value within a range
  • Independent Variables
  • Variables that are controlled by experimenter or
    designated as possible causal factors
  • Dependent Variables
  • Variables being measured as data theorized to be
    caused by independent variables

14
Random Sampling
  • Used so that, on average, composition of sample
    matches composition of population
  • If sample deviates from population,
    generalizability is threatened
  • Randomization happens in many ways
  • Randomization programs, random number tables
  • Note that Chance is lumpy
  • Convenience samples

15
Random Assignment
  • Used to ensure that groups are equivalent in
    composition
  • If groups deviate on relevant variables, validity
    of experiment is reduced
  • Purpose of the control group is to match
    treatment group in every way except experimental
    manipulation.
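
Random assignment can be sketched as shuffling the participant list and splitting it in half. The function name `random_assignment`, the participant IDs, and the seed are all hypothetical choices for this illustration.

```python
import random

def random_assignment(participants, seed=None):
    """Shuffle participants and split them into treatment and control groups."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Twenty hypothetical participant IDs, 1 through 20.
treatment, control = random_assignment(range(1, 21), seed=42)
print("treatment:", sorted(treatment))
print("control:  ", sorted(control))
```

Each participant lands in exactly one group, and which group is decided by chance rather than by the experimenter or the participant.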

16
Notation
  • Sigma (Σ) is the symbol for summation.
  • Rules of summation.
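
The transcript does not preserve the rules themselves. As a sketch, the standard summation rules can be checked numerically in Python; the values of `x`, `y`, and `c` are arbitrary, and these are the usual textbook rules rather than a reproduction of the slide.

```python
x = [1, 2, 3, 4]
y = [2, 4, 6, 8]
c = 3
n = len(x)

# Rule 1: a constant factor can be moved outside the sum.
#   sum(c * x_i) == c * sum(x_i)
rule_constant_factor = sum(c * xi for xi in x) == c * sum(x)

# Rule 2: the sum of a sum is the sum of the sums.
#   sum(x_i + y_i) == sum(x_i) + sum(y_i)
rule_sum_of_sums = sum(xi + yi for xi, yi in zip(x, y)) == sum(x) + sum(y)

# Rule 3: summing a constant n times gives n * c.
rule_sum_of_constant = sum(c for _ in range(n)) == n * c

# Caution: the sum of squares is NOT the square of the sum.
sum_of_squares = sum(xi ** 2 for xi in x)   # 1 + 4 + 9 + 16
square_of_sum = sum(x) ** 2                 # (1 + 2 + 3 + 4) squared

print(rule_constant_factor, rule_sum_of_sums, rule_sum_of_constant)
print(sum_of_squares, square_of_sum)
```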

17
Sample Data
18
Visualizing Data
  • One of most useful things you can do is display
    data visually.
  • As we'll see, a picture is worth a thousand words
    when it comes to checking assumptions of data.

19
Frequency Distributions
  • Presents data in a logical order that is easy to
    see.
  • Values of variable are plotted against their
    frequency of occurrence.

20
Data: 1, 1, 1, 1, 1, 2, 2, 2, 3
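
A frequency distribution for these nine values can be tallied in Python with `collections.Counter`. This is a minimal sketch; the text-based bar display is just one convenient way to show the counts.

```python
from collections import Counter

data = [1, 1, 1, 1, 1, 2, 2, 2, 3]

# Count how often each value occurs.
freq = Counter(data)

# Print values in order, one asterisk per observation.
for value in sorted(freq):
    print(f"{value}: {'*' * freq[value]} ({freq[value]})")

# Output:
# 1: ***** (5)
# 2: *** (3)
# 3: * (1)
```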
21
Problems with Frequency Distributions
  • Sensitive to individual frequencies as opposed to
    general patterns
  • With a highly variable scale, there may be very
    few instances of any specific value
  • In such cases, a histogram provides a better
    description of the data

22
Histograms
  • Graph in which bars represent frequencies of
    observations within specific intervals

23
Each observed frequency
No true optimal number of intervals. Ten is a
good rule of thumb.
Binned into 6 intervals (34.5 to 38.5, 38.5 to
42.5, etc.)
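
Binning can be sketched as follows. The interval boundaries (starting at 34.5, width 4, six intervals) follow the slide; the data values and the function name `histogram_counts` are hypothetical, since the original data set is not in the transcript.

```python
def histogram_counts(values, start, width, n_bins):
    """Count observations falling in each of n_bins equal-width intervals."""
    counts = [0] * n_bins
    for v in values:
        k = int((v - start) // width)   # index of the interval containing v
        if 0 <= k < n_bins:             # ignore values outside the binned range
            counts[k] += 1
    return counts

# Hypothetical observations (not the data from the slide).
values = [36, 37, 40, 41, 41, 45, 47, 50, 52, 57]

counts = histogram_counts(values, start=34.5, width=4, n_bins=6)

# Print each interval with one asterisk per observation.
for k, count in enumerate(counts):
    low = 34.5 + 4 * k
    print(f"{low:5.1f} to {low + 4:5.1f}: {'*' * count}")
```

Using half-unit boundaries such as 34.5 means no whole-number score can fall exactly on the edge between two intervals.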
24
Stem and Leaf Displays
  • The benefit of stem-and-leaf displays is that they
    show both the pattern of frequencies and the
    individual-level data themselves.
  • As the name implies, the data are separated into
    stems (i.e., leading digits) and leaves
    (i.e., following digits marking each data point).

25
  • Stem
  • Vertical axis comprised of leading digits
  • Trailing Digits
  • Digits to the right of the leading ones
  • Leaves
  • Horizontal axis of trailing digits

Stem-and-Leaf Plot

 Frequency    Stem .  Leaf
     2.00        0 .  69
     5.00        1 .  01222
     5.00        1 .  67789
     4.00        2 .  1223
     2.00        2 .  57

 Stem width: 10.00

Data:
6, 9, 10, 11, 12, 12, 12, 16, 17, 17, 18, 19, 21, 22, 22, 23, 25, 27
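
The display for these 18 values can be rebuilt with a short Python sketch. The function name `stem_and_leaf` is hypothetical, and the code assumes a stem width of 10 with each stem split into two rows at the midpoint (leaves 0 to 4 and 5 to 9).

```python
from collections import defaultdict

data = [6, 9, 10, 11, 12, 12, 12, 16, 17, 17, 18, 19,
        21, 22, 22, 23, 25, 27]

def stem_and_leaf(values, stem_width=10):
    """Return (stem, leaves) rows, with two rows per stem split at the midpoint."""
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, stem_width)
        half = 0 if leaf < stem_width // 2 else 1   # lower or upper half of the stem
        rows[(stem, half)].append(str(leaf))
    return [(stem, "".join(leaves)) for (stem, _), leaves in sorted(rows.items())]

rows = stem_and_leaf(data)
for stem, leaves in rows:
    print(f"{len(leaves):5.2f}  {stem} . {leaves}")
```

Rows with no observations are simply omitted in this sketch, which is why stem 0 appears only once.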
26
The nature of the stems is determined by visual
ease. Here, there are two stems for each digit,
broken at the midpoint.
Stem-and-leaf of RxTime   N = 300
Leaf Unit = 1.0

   7   3  6788999
  27   4  00001112223333344444
  62   4  55555566666666666777777777888899999
 103   5  00000111111111111222222222233333333444444
 150   5  55555556666666666777777788888888888899999999999
 150   6  000000000000111111111112222222222222233333333334444444
  96   6  555555556666666677777777777777889999999
  57   7  0111122222222333444444
  35   7  5566667788899
  22   8  000112333
  13   8  5678
   9   9  044
   6   9  558
   3  10  44
   1  10
   1  11
   1  11
   1  12
   1  12  5
Outlier
27
Height Stem Leaf
Looking for Volunteers!!!
28
Modality and Skewness
  • Modality
  • Number of meaningful peaks
  • Unimodal = 1, Bimodal = 2
  • Skewness
  • Measure of the asymmetry of a distribution
  • Positive skew: tail to the right
  • Negative skew: tail to the left
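
The sign of skewness can be checked numerically. The sketch below uses one common definition, the moment coefficient of skewness (third central moment divided by the second central moment raised to the power 1.5); the slide does not say which formula the course uses, and the data sets are hypothetical.

```python
from statistics import mean

def skewness(values):
    """Moment coefficient of skewness: m3 / m2 ** 1.5."""
    m = mean(values)
    m2 = mean((v - m) ** 2 for v in values)   # second central moment
    m3 = mean((v - m) ** 3 for v in values)   # third central moment
    return m3 / m2 ** 1.5

right_tail = [1, 2, 2, 3, 10]            # one large value stretches the right tail
left_tail = [-v for v in right_tail]     # mirror image: tail to the left
symmetric = [1, 2, 3, 4, 5]

print(skewness(right_tail))   # positive
print(skewness(left_tail))    # negative
print(skewness(symmetric))    # zero
```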
