Statistics and Experimental Design for Animal Research: A Gentle Introduction

1
Statistics and Experimental Design for Animal
Research: A Gentle Introduction
  • Robert J. Tempelman
  • Department of Animal Science
  • Michigan State University
  • CANR Statistical Consulting Center

http://www.fw.msu.edu/orgs/canr_big/SCC.htm
2
What is statistics???
  • Ott and Longnecker (2001): "science of learning
    from data"
  • Biometry/Biostatistics
  • Statistics applied to biology
  • Double meaning of biometrics

3
Biological research involves data!!
  • 1) Collecting Data
  • Experimental Design
  • 2) Summarizing Data
  • Simple numerical and graphical descriptions
  • 3) Analyzing Data
  • Formal statistical methods for hypothesis testing
    and estimation
  • 4) Communicating Results
  • Discussion and Interpretation

4
Biological data will always be noisy
  • Why?
  • In a study, data is collected from a finite
    sample randomly drawn from a conceptually large
    population.
  • Every subject responds to the same treatment
    differently (experimental error).
  • Even the same subject might not respond the same
    way to the same treatment from one day to the
    next (measurement error).
  • Therefore, there is ALWAYS an element of
    uncertainty in drawing conclusions from the
    statistical analysis of data from finite samples.

5
Experimental Design
  • Definition: Plan for assigning experimental units
    to treatments.
  • Simplest experimental design: Completely
    Randomized Design (CRD)
  • In a CRD, experimental units are
  • 1) Randomly chosen from a representative
    population, then
  • 2) Randomly assigned to one of several treatments
  • Experimental units should be as homogeneous as
    possible; otherwise consider blocking (see later)
  • Randomization is essential to remove systematic
    bias!
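
The random-assignment step of a CRD can be sketched in a few lines of Python. This is only an illustration: the function name `assign_crd`, the rat labels, and the seed are assumptions made for the example, not part of the presentation.

```python
import random

def assign_crd(units, treatments, seed=None):
    """Randomly split units into equal-sized treatment groups (a CRD)."""
    if len(units) % len(treatments) != 0:
        raise ValueError("units must divide evenly among treatments")
    rng = random.Random(seed)          # seeded only so the plan can be reproduced
    shuffled = list(units)
    rng.shuffle(shuffled)              # the randomization step
    per_group = len(shuffled) // len(treatments)
    return {
        trt: sorted(shuffled[i * per_group:(i + 1) * per_group])
        for i, trt in enumerate(treatments)
    }

# Hypothetical example: 6 rats, 2 treatments, 3 rats per treatment.
rats = [f"rat{i}" for i in range(1, 7)]
allocation = assign_crd(rats, ["A", "B"], seed=1)
print(allocation)
```

Every rat ends up in exactly one treatment, and the investigator has no say in which one.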
6
Blood cholesterol example
  • Experimental study with randomization
  • 6 rats assigned at random to one of 2 treatments
    (n = 3 rats per treatment)
  • Blood cholesterol data collected (mg/dl)
  • Would you conclude the treatments lead to
    different mean blood cholesterol levels?

[Figure: blood cholesterol (mg/dl) for rats A1 to A3 and B1 to B3; treatment averages $\bar{y}_A = 146$, $\bar{y}_B = 152$]
7
A MEAN TREATMENT DIFFERENCE IS FOUND!
  • But is it
  • Due to mere chance (biological noise) ???
  • Or
  • The real thing (beyond reasonable doubt)?

Statistical inference
Statistical analysis for previous slide: consider
a regular t-test
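
A minimal sketch of that t-test, using only the Python standard library. The slides give only the two treatment averages (146 and 152), so the six individual values below are hypothetical numbers chosen to reproduce those averages. The critical value 2.776 is the tabulated two-sided 5% point of the t distribution with 4 degrees of freedom.

```python
import math
import statistics

def pooled_t(sample_a, sample_b):
    """Two-sample t statistic with pooled variance, plus its degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    diff = statistics.mean(sample_b) - statistics.mean(sample_a)
    pooled_var = ((n_a - 1) * statistics.variance(sample_a)
                  + (n_b - 1) * statistics.variance(sample_b)) / (n_a + n_b - 2)
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return diff / se, n_a + n_b - 2

# Hypothetical cholesterol values (mg/dl) with means 146 and 152.
trt_a = [140, 146, 152]
trt_b = [146, 152, 158]

t_stat, df = pooled_t(trt_a, trt_b)
T_CRIT_DF4 = 2.776                     # two-sided 5% critical value, 4 df
significant = abs(t_stat) > T_CRIT_DF4
print(f"t = {t_stat:.3f} on {df} df, significant at 5%: {significant}")
```

With these made-up values the 6 mg/dl difference is well within what chance alone could produce with n = 3 per treatment.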
8
Significance and Power
  • Practical significance vs. statistical
    significance
  • Statistically significant results may not be
    practically important.
  • Statistical Power issues!
  • Was the study large enough to allow a reasonable
    chance of definitively concluding a treatment
    difference, if one truly existed?

9
Scientific Method
  • a) Review and Research the problem
  • b) Formulate Hypothesis
  • c) Design experiment that will allow test of
    hypothesis
  • d) Evaluate the hypothesis
  • e) Draw Conclusions

10
How does statistical inference work?
  • Infer upon the characteristics of a large
    population based on data from a finite random
    sample
  • Mechanics (Part of the Scientific Method)
  • Design and collect data from an experiment (e.g.
    blood pressure).
  • Assess the probability of getting the
    experimental results assuming a true null
    hypothesis (status quo knowledge, e.g. no
    treatment difference).
  • Common investigator objective: disprove the
    status quo in favor of an alternative hypothesis
    (there is a treatment effect).
  • Conclusions are never made with absolute
    certainty; must establish proof beyond a
    reasonable doubt.
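
One way to make "the probability of getting the experimental results assuming a true null hypothesis" concrete is a permutation test: if treatment has no effect, every relabelling of the animals is equally likely. The sketch below is an illustration of that idea, not the method used in the presentation, and the data are hypothetical.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(sample_a, sample_b):
    """Share of relabellings whose mean difference is at least as large as observed."""
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    observed = abs(mean(sample_b) - mean(sample_a))
    indices = range(len(pooled))
    extreme = 0
    total = 0
    for idx_a in combinations(indices, n_a):       # every possible group A
        group_a = [pooled[i] for i in idx_a]
        group_b = [pooled[i] for i in indices if i not in idx_a]
        total += 1
        if abs(mean(group_b) - mean(group_a)) >= observed - 1e-12:
            extreme += 1
    return extreme / total, total

# Hypothetical responses for 3 animals per treatment.
trt_a = [140, 146, 152]
trt_b = [146, 152, 158]
p_value, n_relabellings = permutation_p_value(trt_a, trt_b)
print(f"p = {p_value} from {n_relabellings} relabellings")
```

A large p-value means the observed difference is unremarkable under the null hypothesis, so the status quo is not disproved.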

11
Terminology: Sample versus specimen
  • Biologist draws blood from 20 people
  • A biologist might state that he/she has 20
    samples of blood.
  • Statistician would state that the biologist has
    one sample of 20 glucose measurements.
  • 20 specimens or 20 experimental units rather than
    20 samples.

12
Sample vs. Population
[Diagram: a Sample is drawn at Random from the Population (All rats); Judicious inference is then needed to reach the Target pop'n (Humans)]
Actual versus target population differences could
be far more subtle
13
Variables, Variables!!!
  • Quantitative variables
  • Due to a true numerical measurement.
  • Ratio scale (e.g. weight) versus Interval Scale
    (e.g. temperature)
  • Discrete (countable) versus Continuous
  • Qualitative variables
  • Nominal scale (classification or group), e.g.
    GENOTYPE, SEX
  • Ordinal scale (ranked variables, e.g. small,
    medium, large)
  • Dependent variables (responses, e.g. weight) as a
    function of independent variables (GENOTYPE, SEX)

14
Discrete (quantitative) versus ordinal
(qualitative) variables
  • 1) Litter size (discrete quantitative)

15
Discrete (quantitative) versus ordinal
(qualitative) variables (cont'd)
  • 2) Calving ease scores (ordinal 1-5 scale in
    Holsteins)

1 = unassisted, 5 = Caesarean
16
Other variable issues (cont'd)
  • Don't confuse discrete variables with truly
    continuous variables
  • i.e. some variables appear to be discrete
    (integers) because of data recording round-off.
  • e.g. age of cattle recorded to the nearest month
    is a continuous variable.
  • Number of mastitis cases within a lactation is a
    discrete variable

17
Parameters versus statistics
  • Population: characterized by parameters (e.g.
    mean $\mu$), size N
  • Random selection of the sample from the population
  • Sample: characterized by statistics (e.g. mean
    $\bar{y}$), size n
  • Statistical inference: a sample statistic is an
    estimator of a population parameter
18
Usual distributional assumption for continuous
responses: Normality
[Figure: distribution of weight gains of 100 baby chicks over a specified time period]
Data do not have to be perfectly normally
distributed to use common statistical procedures
(t-tests, ANOVA)
19
Pseudo replication (an obvious example from Gill,
1978)
  • Suppose 6 rats per treatment, each measured twice
    (stimulus response to drugs on two different
    occasions), e.g. Rat A1 had response 33 the 1st
    time, then 35
  • How much replication?
  • Biological versus technical replication
    (subsampling)
  • [Diagram: rats A1 to A6, each shown twice, once
    per measurement]
  • One rat per treatment: no replication
  • n = 6, not 12
20
Remedy? Average each experimental unit's
responses
Treat each experimental unit's average as the
response, then do a regular t-test.
Note: there are still benefits to subsampling,
since it controls measurement error -> but you are
always better off increasing the number of rats per
treatment than the number of measurements per rat.
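
The averaging step can be sketched as follows. Only rat A1's two responses (33, then 35) come from the slides; the values for the other rats are hypothetical and are there only to make the example run.

```python
from statistics import mean

# Two measurements per rat in one treatment group.
# A1 is from the slides; A2 to A6 are hypothetical.
measurements = {
    "A1": [33, 35],
    "A2": [30, 31],
    "A3": [36, 38],
    "A4": [29, 33],
    "A5": [34, 34],
    "A6": [31, 35],
}

# Collapse the subsamples: one average per experimental unit (rat).
rat_means = {rat: mean(values) for rat, values in measurements.items()}

n = len(rat_means)                      # replication is counted in rats
n_measurements = sum(len(v) for v in measurements.values())
print(f"n = {n} rats (from {n_measurements} measurements)")
print(rat_means)
```

The six rat averages, not the twelve raw measurements, are what go into the t-test.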
21
Animals in pens
  • Suppose you have two pens/litters of pigs
  • Each pen has four pigs
  • All pigs in Pen 1 receive Diet A
  • All pigs in Pen 2 receive Diet B
  • Do you have replication?

Pen 1 -> Diet A
Pen 2 -> Diet B
22
Animals in pens (cont'd)
  • Answer to question on previous slide: NO! n = 1
  • Pens/Litters are the experimental units for diets
  • Pigs within pens are merely pseudoreplicates!
  • Need several pens per diet in order to have a
    valid study.

See also
Wainwright, Patricia E. 1998. Issues of Design
and Analysis Relating to the Use of Multiparous
Species in Developmental Nutrition Studies.
Journal of Nutrition 128:661-663.
23
Instructions to the Authors (Journal of Dairy
Science, 2007)
  • The experimental unit is the smallest unit to
    which an individual treatment is imposed. For
    group-fed animals, the group of animals in the
    pen or the paddock is the experimental unit;
    therefore, groups must be replicated.
  • i.e. must have more than 1 pen per treatment (and
    2 might not be nearly enough!)

24
Basic design concepts
  • Randomization
  • Experimental units need to be randomly assigned
    to treatments!
  • Replication
  • Several experimental units per treatment needed
    to assess experimental error
  • Power: having a sufficiently large sample size
    (number of experimental units) to detect a mean
    difference, if one truly exists
  • Blocking
  • Similar experimental units could be blocked
    together and randomization of units to treatments
    conducted within each block

25
BLOCKING
  • e.g. you wish to test a new diet supplement
    (Treatment B) versus a control diet (Treatment A)
    for growth in piglets.
  • It is established that there are known litter/pen
    effects on growth -> then consider blocking on
    litters!
  • Suppose the size of each litter is standardized
    to two pigs.
  • We randomly assign one piglet within each litter
    to Treatment A and the remaining piglet to
    Treatment B.
  • This is an example of a randomized complete block
    design (RCBD)!
26
The Randomized Complete Block Design (RCBD)
  • Population of litters of size 2
  • Draw random sample of litters: Litter 1, Litter 2,
    ..., Litter n
  • Randomly assign treatments to piglets within
    litters, e.g. Litter 1: Trt A, Trt B; Litter 2:
    Trt B, Trt A; Litter n: Trt A, Trt B
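
Randomizing within blocks can be sketched as below. The function name `assign_rcbd`, the litter and piglet labels, and the seed are hypothetical; the point is only that the shuffle happens separately inside each litter.

```python
import random

def assign_rcbd(litters, treatments=("A", "B"), seed=None):
    """Randomly assign each treatment to one piglet within every litter (block)."""
    rng = random.Random(seed)
    allocation = {}
    for litter, piglets in litters.items():
        if len(piglets) != len(treatments):
            raise ValueError(f"{litter}: need one piglet per treatment")
        order = list(treatments)
        rng.shuffle(order)              # a fresh randomization for each block
        allocation[litter] = dict(zip(piglets, order))
    return allocation

# Hypothetical sample of 4 litters, each standardized to 2 piglets.
litters = {f"litter{i}": [f"L{i}_pig1", f"L{i}_pig2"] for i in range(1, 5)}
plan = assign_rcbd(litters, seed=7)
for litter, assignment in plan.items():
    print(litter, assignment)
```

Each litter contributes exactly one piglet to each treatment, so treatment comparisons are made within litters.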
27
Why block?
  • Remove block (e.g. litter) as a source of
    variability
  • Greater statistical power since treatment
    comparisons conducted WITHIN each litter
  • Basis of paired t-test when block size = 2.
  • Remove litter as a potential source of bias
  • Other examples of blocking?
  • Identical twins
  • "Before and after" treatment on same subject
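
A paired t-test works on the within-block differences, which is why block-to-block variation drops out. The sketch below uses hypothetical weight gains for 5 litters of size 2; 2.776 is the tabulated two-sided 5% critical value of t with 4 degrees of freedom.

```python
import math
from statistics import mean, stdev

def paired_t(resp_a, resp_b):
    """Paired t statistic from within-block differences (B minus A)."""
    diffs = [b - a for a, b in zip(resp_a, resp_b)]
    n = len(diffs)
    se = stdev(diffs) / math.sqrt(n)    # standard error of the mean difference
    return mean(diffs) / se, n - 1

# Hypothetical gains (kg); position i in both lists is the same litter.
gain_a = [10.0, 12.0, 9.0, 11.0, 13.0]
gain_b = [11.0, 13.5, 9.5, 12.5, 14.0]

t_stat, df = paired_t(gain_a, gain_b)
T_CRIT_DF4 = 2.776                      # two-sided 5% critical value, 4 df
print(f"t = {t_stat:.2f} on {df} df, significant at 5%: {abs(t_stat) > T_CRIT_DF4}")
```

In this made-up data the litters differ a lot from one another, but the B minus A difference is consistent within litters, so the paired analysis detects it.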

28
Crossover designs
  • Where animals are blocks for treatments
  • 2 period crossover

Design is balanced with respect to diets and
periods, e.g. it is not a good idea to always feed
Diet A in Period 1 and Diet B in Period 2;
otherwise Diets and Periods are confounded with
each other.
29
Typical experimental designs exploiting blocking
and used for animal science.
  • Variants of crossover designs exploiting power of
    within-animal comparisons have been chosen for
    comparing two treatments
  • Constructed to be balanced with respect to
    periods
  • 4-period crossover: double reversal
  • half of animals -> A B A B; other half -> B A B A
  • 3-period crossover: switchback
  • half of animals -> A B A; other half -> B A B
  • 2-period crossover: simple crossover/Latin square
  • half of animals -> A B; other half -> B A
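
The balanced allocation described above (half of the animals on each sequence) could be generated like this. The function name, cow labels, and seed are hypothetical; the sequences themselves are the ones listed on the slide.

```python
import random

# Treatment sequences by number of periods, as listed on the slide.
SEQUENCES = {2: ("AB", "BA"), 3: ("ABA", "BAB"), 4: ("ABAB", "BABA")}

def assign_crossover(animals, periods, seed=None):
    """Randomly give half of the animals each of the two mirror-image sequences."""
    if len(animals) % 2 != 0:
        raise ValueError("need an even number of animals for a balanced design")
    seq_one, seq_two = SEQUENCES[periods]
    rng = random.Random(seed)
    shuffled = list(animals)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    plan = {animal: seq_one for animal in shuffled[:half]}
    plan.update({animal: seq_two for animal in shuffled[half:]})
    return plan

# Hypothetical example: 8 cows in a 3-period switchback.
cows = [f"cow{i}" for i in range(1, 9)]
plan = assign_crossover(cows, periods=3, seed=3)
for cow in cows:
    print(cow, " ".join(plan[cow]))
```

Because the two sequences mirror each other, every period has as many animals on Diet A as on Diet B, which keeps diets and periods from being confounded.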

30
Table 1 from Cox (1980)
Cox, D.R. 1980. Design and analysis in nutritional
and physiological experimentation. Journal of
Dairy Science 63:313-321.
  • Table 1. Classification of 24 weight gains
    measured on 12 cows fed in two pens (2 period
    crossover)

Period       Pen 1 containing Cows 1 to 6       Pen 2 containing Cows 7 to 12
One (4 wks)  Diet A used and 6 gains recorded   Diet B used and 6 gains recorded
Two (4 wks)  Diet B used and 6 gains recorded   Diet A used and 6 gains recorded

The experimental units in this situation were
pens of animals in a given period, so there is no
way to separate effects of diet from all other
possible factors. PSEUDOREPLICATION ISSUE!!!
31
Cox's internal torment
  • One merely could assume that effects of pens
    were negligible. This is equivalent to assuming
    that, when feeding a single diet, the gains of
    two animals in the same pen are no more alike
    than the gains of two animals each in different
    pens
  • BUT
  • Pen-to-pen variation has been important too
    often to make such an assumption credible.

32
How many animals do I need for a study?
  • It depends on
  • Your design (blocking versus not blocking, size
    of experimental unit -> pen vs. animal)
  • True mean difference that you hope to detect!
  • $\Delta = \mu_A - \mu_B$
  • Relative amount of variability ($\sigma$)
  • Between responses within same animal ($\sigma_e$:
    measurement error)
  • Between animals ($\sigma_a$: innate or biological
    variability)
  • Between litters/pens ($\sigma_p$: where applicable)
  • Type I error rate: Probability of concluding a
    treatment effect when one doesn't exist
    (typically set < 5%)
  • Why we choose P-values < 0.05
  • Desirable power: Probability of concluding a
    treatment effect when one truly exists (typically
    set > 80%)
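
To show how these ingredients combine, here is a rough sketch for the simplest case, a CRD comparing two treatments. It uses the common normal-approximation formula n = 2 (z_(1-alpha/2) + z_power)^2 (sigma / delta)^2 per treatment, which is an assumption of this example rather than the method behind the presentation; it slightly understates the n an exact t-based calculation would give. The function name and inputs are hypothetical.

```python
import math
from statistics import NormalDist

def n_per_treatment(delta, sigma, alpha=0.05, power=0.80):
    """Approximate experimental units needed per treatment (two-group CRD)."""
    z = NormalDist()                       # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)     # two-sided Type I error rate
    z_power = z.inv_cdf(power)             # desired power
    return math.ceil(2 * ((z_alpha + z_power) * sigma / delta) ** 2)

# Difference to detect expressed in standard-deviation units.
print(n_per_treatment(delta=1.0, sigma=1.0))   # difference of 1 SD
print(n_per_treatment(delta=0.5, sigma=1.0))   # difference of 0.5 SD
```

Halving the difference you hope to detect roughly quadruples the number of experimental units required.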

33
Well, how do we specify some of this stuff?
  • Literature and educated guessing
  • Uniformity trials or existing data on subjects in
    current facility under regular management
    conditions.
  • Range approximation: $\sigma \approx 1/4 \times$
    range of responses
  • $\sigma_e \approx 1/4 \times$ range of responses
    within same subject and treatment
  • $\sigma_a \approx 1/4 \times$ range of responses
    between subjects within same treatment.
  • $\sigma_p \approx 1/4 \times$ range of average
    responses between pens within same treatment.
  • What would be a practically important
    specification for $\Delta = \mu_A - \mu_B$?
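
The range approximation is simple enough to show directly. The responses below are hypothetical values for animals on the same treatment.

```python
def sd_from_range(values):
    """Rough standard deviation guess: one quarter of the observed range."""
    return (max(values) - min(values)) / 4

# Hypothetical responses from different subjects within one treatment.
responses = [42.0, 38.5, 45.0, 40.0, 47.5, 39.5]
sigma_guess = sd_from_range(responses)
print(f"range = {max(responses) - min(responses)}, sigma is roughly {sigma_guess}")
```

This is only a planning guess for use before the study; once data are in hand, the standard deviation should be estimated properly.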

34
Dairy Example: Relationship between within-cow
variance $\sigma^2_e$ (kg$^2$) and DIM (by parity)
[Figure: $\sigma^2_e$ plotted against DIM by parity; labels: Early lactation, Late lactation]
Reasonable assumption: $\sigma^2_e = \sigma^2_a$
From Jensen, J. 2001. Genetic evaluation of dairy
cattle using test-day models. Journal of Dairy
Science 84:2803-2812.
35
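CRD power for individual animal study as function
of n and $\Delta$
[Figure: two panels plotting Power against n, labelled Late lactation and Early lactation; panel variances $\sigma^2_e = \sigma^2_a = 4$ kg$^2$ and $\sigma^2_e = \sigma^2_a = 20$ kg$^2$]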
Two period crossover trial for individual animal
trials
[Figure: two power panels, labelled Early lactation and Late lactation; panel variances $\sigma^2_e = 2$ kg$^2$ and $\sigma^2_e = 10$ kg$^2$]
37
Binary data?
  • Yes or no responses, e.g. mastitis, conception
    rate
  • Consider comparison of Trt A versus Trt B
  • Incidence rate for Trt A: 5%
  • Incidence rate for Trt B: 7.5%, 10%, 12.5%, ...,
    25%

CRD: Power to conclude a difference in incidence
rates between Trt A (0.05) and Trt B (0.075 to
0.250)
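
A rough idea of these power values can be had from the usual normal approximation for comparing two proportions. This is a sketch under that assumption, not a reproduction of the slide's calculations; the function name and the choice of n = 100 animals per treatment are hypothetical.

```python
import math
from statistics import NormalDist

def power_two_proportions(p_a, p_b, n, alpha=0.05):
    """Approximate power of a two-sided test of p_a vs p_b, n units per treatment."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    p_bar = (p_a + p_b) / 2
    se_null = math.sqrt(2 * p_bar * (1 - p_bar) / n)               # SE if no difference
    se_alt = math.sqrt((p_a * (1 - p_a) + p_b * (1 - p_b)) / n)   # SE under the alternative
    return z.cdf((abs(p_b - p_a) - z_alpha * se_null) / se_alt)

# Trt A incidence 5%; a few of the Trt B incidences from the slide.
for p_b in (0.075, 0.10, 0.125, 0.25):
    print(p_b, round(power_two_proportions(0.05, p_b, n=100), 3))
```

Binary responses carry little information per animal, so small differences in incidence need far more animals than a typical continuous response.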
38
Power calculators
41
Questions?