Statistics for CS 312 - PowerPoint PPT Presentation

Slides: 23
Provided by: CarolW165
Learn more at: http://csis.pace.edu
Transcript and Presenter's Notes

1
Statistics for CS 312
2
Descriptive vs. inferential statistics
  • Descriptive: used to describe an existing
    population
  • Inferential: used to draw conclusions about related
    populations

3
Graphical descriptions
  • Histograms
  • Frequency polygons/curves
  • Pie charts

4
Measures of central tendency
  • Mean: the average, used most often
  • Median: the midpoint value, used when data is
    skewed
  • Mode: the most frequently occurring value, used
    when interested in what most people think
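
The three measures can be compared on a small skewed data set. The sketch below uses Python's standard `statistics` module; the `hours` values are made up for illustration and are not from the slides.

```python
import statistics

# Hypothetical data: hours per week spent online by nine students.
# The single large value (40) skews the distribution to the right.
hours = [2, 3, 3, 4, 5, 5, 5, 8, 40]

mean = statistics.mean(hours)      # arithmetic average, pulled up by the outlier
median = statistics.median(hours)  # middle value of the sorted data
mode = statistics.mode(hours)      # most frequently occurring value

print(f"mean={mean:.2f} median={median} mode={mode}")
```

Here the mean lands well above the median, which is why the median is usually preferred for skewed data.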

5
Measures of variability
  • Range: highest value minus lowest value
  • Standard deviation: a measure of how distant the
    individual values typically are from the mean (the square
    root of the average squared deviation)
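
A minimal sketch of both measures, again with the standard `statistics` module. The `values` list is a made-up example; `pstdev` treats the data as a whole population (divide by n), while `stdev` treats it as a sample (divide by n - 1).

```python
import statistics

# Hypothetical data set with mean 6.
values = [4, 8, 6, 5, 7]

value_range = max(values) - min(values)  # highest minus lowest

# Population standard deviation: sqrt of the mean squared deviation.
pop_sd = statistics.pstdev(values)

# Sample standard deviation: same idea, but divides by n - 1.
sample_sd = statistics.stdev(values)

print(value_range, round(pop_sd, 4), round(sample_sd, 4))
```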

6
Normal curve
  • Bell-shaped curve: about 68% of values lie within one
    standard deviation of the mean
  • Non-normal: skewed either negatively (tail to
    left) or positively (tail to right)
  • Percentiles: percentage of values that fall between two
    percentile values
  • Standard scores: distance from the mean in terms of
    the standard deviation, z = (X - m) / s
  • Z scores: transformed standard scores, Z = 10z + 50
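
The 68% figure and the two score formulas can be checked numerically. This is a sketch using `statistics.NormalDist` (Python 3.8+); the function names and the example score of 85 with m = 70 and s = 10 are assumptions chosen for illustration.

```python
from statistics import NormalDist

# Share of a normal population within one standard deviation of the mean.
std_normal = NormalDist(mu=0, sigma=1)
within_one_sd = std_normal.cdf(1) - std_normal.cdf(-1)

def standard_score(x, m, s):
    """z = (X - m) / s: distance from the mean in standard deviations."""
    return (x - m) / s

def transformed_score(z):
    """Z = 10z + 50: rescale so the mean is 50 and one SD is 10 points."""
    return 10 * z + 50

z = standard_score(85, 70, 10)  # hypothetical raw score, mean and SD
Z = transformed_score(z)

print(round(within_one_sd, 4), z, Z)
```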

7
Variables
  • Quantitative: things that can be measured (age,
    income, number of credits)
  • Qualitative: things without an inherent order
    (college major, address)

8
Populations and samples
  • Population: entire universe from which a sample
    is drawn
  • Sample: subset of the population
  • Symbols (sample, population): mean m, µ; standard
    deviation s, σ; variance s², σ²

9
How representative is the sample
  • Random sample: use random numbers to choose
    members of the sample
  • Stratified sample: a sample that represents
    subgroups proportionally
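
The two sampling schemes might be sketched as follows. The population of 60 "CS" and 40 "IS" majors, the helper `stratified_sample`, and the 10% sampling fraction are all hypothetical.

```python
import random

rng = random.Random(312)  # fixed seed so the run is repeatable

# Hypothetical population: 60 CS majors and 40 IS majors.
population = [("CS", i) for i in range(60)] + [("IS", i) for i in range(40)]

# Simple random sample: every member is equally likely to be chosen.
simple = rng.sample(population, 10)

def stratified_sample(pop, key, fraction, rng):
    """Sample the same fraction from each subgroup (stratum)."""
    strata = {}
    for member in pop:
        strata.setdefault(key(member), []).append(member)
    chosen = []
    for group in strata.values():
        k = round(len(group) * fraction)
        chosen.extend(rng.sample(group, k))
    return chosen

stratified = stratified_sample(population, lambda m: m[0], 0.10, rng)
counts = {g: sum(1 for m in stratified if m[0] == g) for g in ("CS", "IS")}
print(counts)
```

The simple random sample may over- or under-represent either major by chance; the stratified sample always keeps the 60/40 split.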

10
Hypothesis testing
  • Hypothesis as to the relationship of variables:
    similar or different
  • Inference from a sample to the entire population

11
Statistical significance
  • Accept true hypotheses and reject false ones
  • Based on probability (10 heads in a row occurs
    about once in 1024 runs of 10 coin tosses)
  • Significant result: a significant departure
    from what might be expected from chance alone
  • Example: a result two or more standard deviations above
    the mean occurs about 2.3% of the time in a normally
    distributed population
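
Both probabilities on this slide can be reproduced directly. A short sketch, assuming a fair coin and a standard normal distribution:

```python
from statistics import NormalDist

# Probability of 10 heads in 10 tosses of a fair coin: (1/2)^10.
p_ten_heads = 0.5 ** 10

# Probability of landing two or more standard deviations above the mean
# in a normal population (one tail only).
upper_tail = 1 - NormalDist().cdf(2)

print(f"1 in {1 / p_ten_heads:.0f}", f"{upper_tail:.4f}")
```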

12
Null hypothesis
  • Assumption that there is no difference between
    two variables
  • Example: Male and female college students do
    similar amounts of music downloading using
    BitTorrent.
  • Example: School use of computers is unrelated to
    the income of the students' families.

13
Levels of significance
  • 5 percent level: event could occur by chance
    only 5 times in 100
  • 1 percent level: event could occur by chance
    only 1 time in 100
  • Significance level should be chosen before doing
    the experiment

14
Types of errors
  • Type I error: rejection of a true null
    hypothesis
  • Type II error: acceptance of (failure to reject) a
    false null hypothesis
  • For a fixed sample size, decreasing one type
    increases the other

15
One and two tailed tests
  • One-tailed test: experimental values will only
    fail the null hypothesis in one direction
  • Two-tailed test: values could occur on either
    the positive or negative tail of the curve
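
The difference between the two kinds of test shows up in the p-value. The sketch below assumes a test statistic `z` that is standard normal under the null hypothesis; the helper `p_values` and the value 1.8 are hypothetical.

```python
from statistics import NormalDist

def p_values(z):
    """Return (one-tailed upper, two-tailed) p-values for a z statistic."""
    nd = NormalDist()
    upper = 1 - nd.cdf(z)                # chance of a value this far above
    two = 2 * (1 - nd.cdf(abs(z)))       # chance of a value this far either way
    return upper, two

alpha = 0.05  # 5 percent level, chosen before the experiment
one_tailed, two_tailed = p_values(1.8)

print(f"one-tailed p={one_tailed:.4f}, significant: {one_tailed < alpha}")
print(f"two-tailed p={two_tailed:.4f}, significant: {two_tailed < alpha}")
```

With this z, the result is significant at the 5 percent level under a one-tailed test but not under a two-tailed test, which is one reason the choice of test should be made before looking at the data.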

16
Estimation
  • Concerns the magnitude of relationships between
    variables
  • Hypothesis testing asks: is there a relationship?
  • Estimation asks: how large is the relationship?
  • Confidence interval: provides an estimate of the
    interval that the mean will be in
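
A confidence interval for a mean might be computed as below. This is a sketch using the normal approximation; for a sample this small a t distribution would normally be used instead, which gives a somewhat wider interval. The function name and the `sample` data are made up.

```python
import math
import statistics
from statistics import NormalDist

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    n = len(sample)
    m = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    # z value that leaves (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return m - z * standard_error, m + z * standard_error

# Hypothetical sample of ten measurements with mean 13.5.
sample = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]

low95, high95 = mean_confidence_interval(sample, 0.95)
low99, high99 = mean_confidence_interval(sample, 0.99)
print(f"95%: ({low95:.2f}, {high95:.2f})  99%: ({low99:.2f}, {high99:.2f})")
```

Asking for more confidence widens the interval.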

17
Sequence of activities
  • Description
  • Tests of hypotheses
  • Estimation
  • Evaluation

18
Correlation
  • Quantifiable relationship between two variables
  • Example: relationship between age and type of
    computer games played
  • Example: relationship between family income and
    speed of home computer connection

19
Correlation chart
  • Two (or more) dimensional table
  • Variables on the axes, could be intervals
  • Scattergram: positively correlated values scatter
    with positive slope, negatively correlated with negative slope

20
Product-moment coefficient
  • Formula based on deviations from means
  • If deviations are the same or similar, values are
    positively correlated
  • If deviations are the opposite, values are
    negatively correlated
  • Most correlations fall somewhere between 1 and
    -1
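
The coefficient can be written out from the deviations described above. The slides do not give the formula itself, so this sketch uses the standard Pearson definition: the sum of products of paired deviations, divided by the square root of the product of the summed squared deviations. The data are made up.

```python
import math

def pearson_r(xs, ys):
    """Product-moment correlation from deviations about the two means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    # Same-sign deviations push the sum up, opposite signs push it down.
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator

xs = [1, 2, 3, 4, 5]
r_positive = pearson_r(xs, [2, 4, 6, 8, 10])  # y rises with x
r_negative = pearson_r(xs, [10, 8, 6, 4, 2])  # y falls as x rises
r_partial = pearson_r(xs, [2, 1, 4, 3, 5])    # related, but not perfectly

print(r_positive, r_negative, round(r_partial, 2))
```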

21
Perfect positive correlation: r = 1
[Figure: scatter plot of Y against X]
22
Perfect negative correlation: r = -1
[Figure: scatter plot of Y against X]