CRITICAL ANALYSIS - PowerPoint PPT Presentation

About This Presentation
Title:

CRITICAL ANALYSIS

Description:

Was there planning for any adverse effects & dropouts. Statistical analysis sensible? ... allow for and manage dropouts? Significance? C.I.s? dose-response? ... – PowerPoint PPT presentation

Slides: 55
Provided by: FP79
Transcript and Presenter's Notes

Title: CRITICAL ANALYSIS


1
CRITICAL ANALYSIS
  • WHICH RESEARCH DESIGN FOR
  • WHICH CLINICAL PROBLEM?

6
Appraising a Clinical Experimental Study
  • Population/Subjects
  • What was the source population?
  • What were the inclusion/exclusion criteria?
  • Were they a representative and relevant
    sample?
  • How long was the follow-up period?

7
  • Are the Results of the Study Valid?
  • Randomization? Was randomization list hidden?
  • Were baseline characteristics of the groups the
    same at the start?
  • Was there an intention-to-treat analysis?
  • Were interventions and outcomes clearly defined
    and replicable?
  • How complete was blinding? Was it assessed at the
    end?
  • Apart from the experimental intervention, were
    the groups treated equally, i.e. with the same
    co-interventions?
  • Was the comparison group contaminated with the
    main interventions?
  • Was compliance with interventions
    measured/assured?
  • Were all subjects accounted for at the end? (was
    follow-up complete?)
  • Was follow-up time sufficient to detect relevant
    outcomes?

8
  • Results
  • How large were the intervention effects?
  • What measure(s) of 'event rate' or outcome were
    used?
  • What was the NNT or NNH?
  • How accurate were estimations of the intervention
    effects?
  • e.g. p-values, confidence intervals
  • Did the study have sufficient power?

9
  • Applicability and Conclusions
  • Applicability and Relevance?
  • (to your patients, and is the treatment
    feasible and available?)
  • Were all important outcomes considered?
  • Are the likely treatment benefits worth the
    potential harm and costs? (adverse effects)
  • Strengths and weaknesses?

10
Appraising a Diagnostic Study
  • Population/Subjects/Setting
  • What was the source population tested?
  • What were the inclusion/exclusion criteria?
  • Were subjects a representative and relevant
    sample?
  • How did they recruit subjects?

11
  • Validity
  • Was there a Comparison with a 'Gold Standard
    Test'?
  • How did they define 'caseness' to be detected by
    the test?
  • If there is no 'Gold Standard', can the test be
    validated in other ways?
  • Was there blinding of Subjects and of
    Investigators to theory?
  • How thorough was this and was it assessed at the
    end?
  • Was Sample Size OK re Power?
  • Did all subjects get both new test and Gold
    Standard test?
  • Was there testing by 2 independent investigators?
  • Was there planning for any adverse effects and
    dropouts?
  • Statistical analysis sensible?
  • Test-retest issues discussed?

12
  • Conclusions
  • Sensitivity - Proportion of true positives
    identified by a test or by epidemiological
    screening.
  • Specificity - Proportion of true negatives
    identified by a test or by epidemiological
    screening
  • Did the test work as well as Gold Standard?
  • Benefits vs harm?
  • Relevance? Practicality in the real world?
  • Are the likely clinical benefits worth the
    potential harm and costs? (e.g. adverse effects)
  • Strengths and weaknesses?
  • How could it be improved?
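
As a rough illustration of the sensitivity and specificity definitions above, the sketch below computes both from a 2x2 comparison of a new test against the Gold Standard. The counts are hypothetical and the function names are chosen only for this example.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of Gold Standard cases that the new test also calls positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Proportion of Gold Standard non-cases that the new test also calls negative."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts (not from any study): 50 true cases and 100 non-cases
# according to the Gold Standard.
tp, fn = 45, 5    # cases: detected vs missed by the new test
tn, fp = 80, 20   # non-cases: correctly cleared vs wrongly flagged

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"Sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

With these made-up counts the new test picks up 90% of true cases and correctly clears 80% of non-cases.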

13
APPRAISING A CAUSATION STUDY
  • Population/Subjects
  • What is the source population being studied?
  • Did they define 'exposed' group vs 'comparison'
    group (cohort study)
  • Or define controls (case-control study) - any
    randomisation?
  • What were the inclusion/exclusion criteria?
  • Were subjects a representative and relevant
    sample?
  • How did they recruit subjects and
    comparisons/controls?

14
  • Basic Structure of Study
  • Cohort study?
  • A Longitudinal study in which groups of
    people are interviewed repeatedly over a period
    of time - respondents usually share a common
    characteristic. Where the same group of people
    are followed up over time this is known as a
    cohort study. If a different group of people is
    interviewed in each wave of a survey this is known
    as a trend design.
  • Case-control study? (did exposure precede
    outcome?)
  • Cross-sectional study? (did exposure precede
    outcome?)
  • Did Researchers Define
  • The causal factor studied - is their theory
    sensible?
  • The 'outcome' caused by causal factor?
  • Often the Risk Ratio is discussed (A comparison
    of the risk of some health-related event such as
    disease or death in two groups)
  • Was there Blinding?
  • Re the hypothesis - ideally both subjects and
    assessors
  • How good was this and was it assessed at the end?

15
  • Data Validity
  • Was Sample Size Ok re Power
  • Did they follow-up long enough?
  • How did they allow for and manage dropouts?
  • Significance? C.I.s? dose-response? Specificity?
  • Conclusions
  • Relevance and usefulness?
  • Strengths and Weaknesses of study?
  • How could it be improved?

16
APPRAISING A PROGNOSIS STUDY
  • A Prognostic Factor is a patient characteristic
    that can predict the patient's eventual outcome
  • a demographic e.g. sex, age, race
  • disease-specific e.g. tumour stage, symptom
    pattern
  • comorbidity: other co-existing conditions
  • Articles that report prognostic factors often use
    two independent patient samples
  • derivation sets ask - "what factors might
    predict patient outcomes?"
  • validation sets ask - "do these prognostic
    factors predict patient outcomes accurately?"

17
  • Methods
  • Design? (cohort / case series / prospective vs.
    retrospective)
  • Setting? hospital / location / clinic
  • Patient Population? - number / screening or
    enrollment methods / number screened vs number
    enrolled
  • Description of prognostic or outcome factors
    considered
  • Prognostic Outcome Factors are the numbers of
    events that occur over time, expressed in
  • absolute terms, e.g. 5-year survival rate
  • relative terms, e.g. risk from prognostic factor
  • survival curves: a curve that starts at 100% of
    the study population and shows the % of the
    population still surviving at successive times.
    Applied to onset of a disease, complication or
    some other endpoint (e.g. time before relapse)

18
  • Validity
  • Was a defined, representative sample of patients
    assembled at a common (usually early) point of
    the illness ?
  • Inclusion and exclusion criteria?
  • Selection biases?
  • Stage of disease?
  • Was patient follow-up sufficiently long and
    complete?
  • Reasons for incomplete follow-up?
  • Prognostic factors similar for patients lost and
    not-lost to follow-up?
  • Were objective and unbiased outcome criteria used?
  • Outcomes defined at start of study?

19
  • Validity
  • Assessors and subjects blinded to prognostic
    factor theory?
  • Statistical models seem OK?
  • Follow-up duration / completeness / accounting
    for patients
  • If subgroups with different prognoses were
    identified
  • Was there adjustment for important prognostic
    factors?
  • Are the (hopefully valid) results of this
    prognosis study important? i.e.
  • How large is the likelihood of the outcome
    event(s) in a specified time?
  • Survival curves?
  • How precise are prognostic estimates?
  • Confidence intervals?

20
  • Conclusions
  • Strengths and Weaknesses of Study
  • In context of other studies and/or current
    standard of care?
  • Next steps for further study of this problem?
  • Can you apply the (hopefully valid and important)
    results of this study to caring for your own
    patients? - i.e.
  • were the study patients similar to your own?
  • patients similar for demographics, severity,
    co-morbidity, and other prognostic factors?
  • will this evidence make a clinically important
    impact on your views on what to tell or to offer
    your patients?
  • Compelling reason why the results should not be
    applied?
  • Will the results lead directly to you selecting
    or avoiding therapy?
  • Are the results useful for reassuring or
    counselling patients?

21
  • Incidence
  • can be defined as the number of new
    occurrences of a phenomenon, e.g. illness, in a
    defined population in a specified period. An
    incidence rate would be the rate at which new
    cases of the phenomenon occur in a given
    population.
  • Prevalence (also called Prevalence Rate re
    prevalence across time)
  • the number of cases (or events, or conditions)
    within a specified time period, e.g. prevalence
    of a condition includes all people with the
    condition even if the condition started prior to
    the start of the specified time period.
  • Period prevalence: the amount of a particular
    disease present in a population over a period of
    time.
  • Point prevalence: the amount of a particular
    disease present in a population at a single point
    in time.

22
Appraising Systematic Reviews (of Treatment /
Intervention Studies)
  • What were the relevant population(s)?
  • What were the main exposure(s)?
  • What were the comparison(s)?
  • What were the outcome(s)?
  • Design of the Studies
  • experimental or non-experimental ?
  • cross-sectional or longitudinal ?
  • All trials included in a review should first have
    been appraised using the model for experimental
    studies

23
  • Validity of Review Results
  • were the criteria used to select studies for
    inclusion in Review both explicit and
    appropriate?
  • Is it likely that any important, relevant studies
    were missed? (completeness of literature search)
  • Was the validity of the included studies
    appraised?
  • Were assessments of the studies reproducible?
    (documented and replicated)
  • Were the results similar from study to study?
    (tests of heterogeneity)

24
  • Results (Size of Effects and Precision)
  • What were the overall results of the review - how
    large were the effects ?
  • How precise were the results ?
  • Applicability and Relevance
  • Are the results applicable in normal practice?
  • Were all important outcomes considered?
  • Are the likely treatment benefits worth the
    potential harm and costs? (e.g. adverse effects
    etc.)
  • Strengths and weaknesses of the Review?
  • How could the Review be improved?

25
Critical Appraisal - NNTs and NNHs
  • Decide from reading the study if the experimental
    group had a better outcome than the control group
    - if so, do the NNT
  • Or
  • if the control group had a better outcome than
    the experimental group - if so, do the NNH
  • When the experimental treatment decreases risk of
    an undesirable outcome, NNT and RBI (relative
    benefit increase) are useful
  • Number Needed to Treat: number of patients who
    need to be treated to cause 1 good outcome
  • Number Needed to Harm: number of patients who
    need to be treated to cause 1 bad outcome

26
  • EER = event rate in the experimental group
  • CER = event rate in the control group
  • When taking a difference, ignore minus signs
    except as a reminder as to whether treatment was
    overall helpful or harmful
  • E (event) = outcome (express it as a
    decimal, e.g. 40% occurrence as 0.4)
  • e.g. in a study comparing mood stabilisers,
    a bad outcome might be that the manic state does
    not improve with the treatment, or gets worse
  • Absolute Benefit Increase: when the treatment
    benefits more experimental subjects than occurs
    with those in the control group
  • ABI = EER - CER
  • Relative Benefit Increase: fewer bad outcomes in
    the experimental group compared with the control
    group
  • RBI = (EER - CER) / CER
  • NNT = 1 / ABI

27
  • EXAMPLE
  • Treatment of acute mania.
  • Results are a reduction of a certain amount on
    the Young Mania Rating Scale (YMRS) after 1 week
  • DRUG A
  • 65% OF SUBJECTS HAD OUTCOME
  • PLACEBO
  • 30% OF SUBJECTS HAD OUTCOME
  • EER = event rate in the experimental group
  • CER = event rate in the control group
  • E (event) = outcome: 65% (S), 30% (C)
  • EER IS THUS 0.65; CER IS
    THUS 0.30
  • ABI = EER - CER = 0.35
  • NNT = 1 / ABI = 1 / 0.35 = 2.86
  • So number needed to treat is close to 3 - i.e. we
    have to treat 3 patients for 1 to get benefit.
    This would be an extremely good and impressive
    NNT.
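
The arithmetic in this example can be checked with a few lines of Python. This is only a sketch: the function name `benefit_summary` is made up for illustration, and the only inputs are the two event rates from the slide (0.65 and 0.30).

```python
import math


def benefit_summary(eer, cer):
    """Return (ABI, RBI, NNT) from experimental and control event rates.

    Rates are decimals (e.g. 65% is 0.65). Following the slides, the sign of
    the difference is ignored when computing the NNT.
    """
    abi = eer - cer        # Absolute Benefit Increase
    rbi = abi / cer        # Relative Benefit Increase
    nnt = 1 / abs(abi)     # Number Needed to Treat
    return abi, rbi, nnt


# Event rates from the acute mania example: Drug A 65%, placebo 30%.
abi, rbi, nnt = benefit_summary(eer=0.65, cer=0.30)
print(f"ABI = {abi:.2f}, RBI = {rbi:.2f}, NNT = {nnt:.2f} "
      f"(round up to {math.ceil(nnt)})")
# prints: ABI = 0.35, RBI = 1.17, NNT = 2.86 (round up to 3)
```

NNT is conventionally rounded up to the next whole patient, which gives the "close to 3" quoted above.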

28
Asking a Research Question
  • What is the Question? (the Clinical Problem to be
    answered)
  • What sort of Issue is being investigated?
  • An Intervention or Treatment ?
  • A Diagnostic Test or Instrument ?
  • A Causal factor ?
  • A Prognostic Factor ?
  • What is the main alternative for Comparison?
  • A Control group?
  • A Comparison group?
  • A Placebo group?
  • Comparing 2 interventions?
  • What is the main Outcome or Outcomes?

29
Examples
  • You are sure that on-call nights for
    psychiatric registrars and crisis nurses are
    always busier when there is a full moon. How
    would you try to determine whether this is in
    fact the case?

30
  • You are working in the C-L service of a
    general hospital. Budget cuts are threatened and
    you have to justify maintaining the C-L service
    to several medical wards. One ward refers to C-L
    a lot, and the other hardly ever. You feel that
    your service's C-L input shortens the length of
    stay for patients with delirium and self-harm.
    How could you demonstrate this in time for next
    year's budgeting round in 9 months' time?

31
Significance - p values
  • The statistical significance of a result (its
    p-value) is the probability of seeing a
    relationship (e.g., between variables) or
    difference (e.g., between means) at least as
    large as the one observed in the sample by pure
    chance, if in the population from which the
    sample was drawn no such relationship or
    difference exists.
  • A small p-value means the observed result would
    be unusual if chance alone were at work. It is
    often read loosely as the probability of error
    in accepting our observed result as valid, or
    "representative of the population."

32
P-values
  • A p-value of 0.05 (1 in 20) indicates that, if
    there were no real relation in the population, a
    relation at least this strong would appear in
    about 5% of samples as a "fluke."
  • p-values of <0.05 are by convention 'just'
    significant
  • but this level of significance still involves a
    pretty high probability of error (5%).
  • Results that are significant at the p <0.01 level
    are considered by convention statistically
    significant, and p <0.005 or p <0.001 levels are
    often called highly significant.

33
Data-mining and spurious significance
  • The more analyses you perform on a data set, the
    more results will "by chance" meet the
    conventional significance level.
  • For example, if you calculate correlations
    between ten variables (i.e., 45 different
    correlation coefficients), then you should expect
    to find by chance that about two (i.e., one in
    every 20) correlation coefficients are
    significant at the p <0.05 level, even if the
    values of the variables were totally random and
    don't correlate in the population.
  • Some statistical methods that involve many
    comparisons include some "correction" for the
    total no. of comparisons - but not all do.
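
The ten-variable example can be checked with a short calculation. This is a sketch that assumes, for simplicity, that the 45 tests are independent (correlations among the same ten variables are not strictly so); the Bonferroni threshold is shown as one common correction and is not one named in the slides.

```python
from math import comb

n_variables = 10
alpha = 0.05

# Number of distinct pairs of variables, i.e. correlation coefficients.
n_pairs = comb(n_variables, 2)  # 45

# Expected number of "significant" results if nothing is really correlated.
expected_false_positives = n_pairs * alpha  # about 2

# Chance of at least one spurious hit, ASSUMING the tests are independent.
prob_at_least_one = 1 - (1 - alpha) ** n_pairs

# Bonferroni correction (an illustrative choice): divide alpha by the
# number of comparisons and use that as the per-test threshold.
bonferroni_alpha = alpha / n_pairs

print(f"{n_pairs} correlations, about {expected_false_positives:.2f} "
      f"expected by chance at p < {alpha}")
print(f"P(at least one spurious result) is roughly {prob_at_least_one:.2f}")
print(f"Bonferroni per-test threshold: {bonferroni_alpha:.4f}")
```

Under these assumptions at least one spurious "significant" correlation is very likely, which is why uncorrected data-mining results deserve suspicion.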

35
Correlation Coefficients
  • Shows the extent to which a change in one
    variable is associated with change in another
    variable, i.e. the relationship between them.
  • Best to have +/-0.90 and above to show a
    correlation
  • Range from -1.00 to +1.00.
  • -1.00 = perfect (strong) negative relationship.
  • +1.00 = perfect (strong) positive relationship.
  • 0.00 (midpoint) = no relationship at all.
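
To make the -1.00 to +1.00 range concrete, here is a minimal sketch of the Pearson correlation coefficient. The slides do not say which coefficient they mean, so Pearson's r is an assumption, and the data are made up.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Made-up data: y rises in step with x, then falls in step with x.
x = [1, 2, 3, 4, 5]
r_positive = pearson_r(x, [2, 4, 6, 8, 10])   # perfect positive relationship
r_negative = pearson_r(x, [10, 8, 6, 4, 2])   # perfect negative relationship
print(round(r_positive, 2), round(r_negative, 2))
```

Real data rarely sit exactly on a line, so observed coefficients fall somewhere between these two extremes.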

36
Strength vs Reliability of a Relationship
Between Variables
  • In general, in a sample of a particular size, the
    larger the size of the relationship between
    variables, the more reliable the relationship.
  • If there are few observations, then there are
    also few possible combinations of values, so the
    probability of a chance combination showing a
    strong correlation is high - so small 'n' studies
    are statistically weak.
  • If a correlation between variables in question is
    very small in the population, then there's no way
    to identify it in a study unless the sample is
    very large.
  • Similarly, if a correlation is very large in the
    population, then it can be found to be highly
    significant even in a very small sample.
  • If a coin is slightly asymmetrical, and when
    tossed is slightly more likely to produce heads
    than tails (e.g. 60% vs. 40%), then ten tosses
    would not be enough to show that the coin is
    asymmetrical. But if the coin is weighted to
    almost always fall as heads, then ten tosses
    would be quite enough to show this.
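
The coin example can be worked through with an exact binomial calculation. In this sketch the test is assumed to be one-sided at the 0.05 level, and "almost always heads" is taken to mean 95% heads; both are assumptions made for illustration.

```python
from math import comb


def upper_tail(n, k, p):
    """P(X >= k) when X is the number of heads in n tosses with P(heads) = p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def critical_heads(n, alpha=0.05):
    """Smallest head count whose one-sided p-value under a fair coin is <= alpha."""
    for k in range(n + 1):
        if upper_tail(n, k, 0.5) <= alpha:
            return k
    return None


n_tosses = 10
k_crit = critical_heads(n_tosses)  # heads needed to call the coin unfair

# Chance of reaching that count (i.e. the power of the test) for two coins.
power_slight = upper_tail(n_tosses, k_crit, 0.60)  # slightly biased coin
power_heavy = upper_tail(n_tosses, k_crit, 0.95)   # heavily weighted coin (assumed 95%)

print(f"Need at least {k_crit} heads out of {n_tosses}")
print(f"Power with a 60% coin: {power_slight:.2f}")
print(f"Power with a 95% coin: {power_heavy:.2f}")
```

Under these assumptions ten tosses will almost never expose the 60% coin but will usually expose the heavily weighted one, matching the point made above.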

38
  • Other terms and concepts to learn
  • Measures of Central Tendency and of Variability
  • Types of Data

39
Confidence Interval
  • If the Confidence Interval for a difference does
    not overlap zero, the effect is said to be
    statistically significant
  • CI is a range of values, within which we're fairly
    sure the true value of the parameter being
    investigated lies.
  • If independent samples are taken repeatedly from
    the population and a Confidence Interval is
    calculated for each, a certain percentage (the
    confidence level) of the intervals will include
    the unknown population parameter. Confidence
    intervals are usually calculated so that this
    percentage is 95%.
  • Width of the confidence interval shows how
    uncertain we are about the unknown parameter.
    A very wide interval means more data should be
    collected before anything definite can be said
    about the parameter.
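
As an illustration of a confidence interval that does not overlap zero, the sketch below computes a 95% interval for a difference between two proportions using the simple normal approximation. It reuses the response rates from the acute mania example (65% vs 30%); the group size of 100 per arm is an assumption, since the slides do not give one.

```python
from math import sqrt
from statistics import NormalDist


def diff_in_proportions_ci(p1, n1, p2, n2, confidence=0.95):
    """Normal-approximation CI for the difference p1 - p2."""
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1 - p2
    # Standard error of the difference between two independent proportions.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff - z * se, diff + z * se


# Rates from the mania example; n = 100 per group is hypothetical.
low, high = diff_in_proportions_ci(0.65, 100, 0.30, 100)
print(f"95% CI for the difference: {low:.2f} to {high:.2f}")
```

With these inputs the interval runs from roughly 0.22 to 0.48, so it excludes zero. A smaller assumed sample would widen it.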

40
Odds Ratios
  • Compares frequency of exposure to risk factors in
    epidemiological studies
  • The odds ratio is a reasonable approximation of
    the relative risk when the outcome is relatively
    rare (e.g., when less than 1% of the people
    exposed to an agent develop disease). The odds
    ratio produces larger errors as the outcome rate
    rises above 1%.
  • You can say that a proposed risk factor acts as a
    significant risk to disease if
  • odds ratio is >1, and
  • lower edge of the C.I. is >1
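
The sketch below compares the odds ratio with the relative risk for a rare and a common outcome, to show that the odds ratio tracks the relative risk closely only when the outcome is rare. All counts are hypothetical.

```python
def relative_risk(exp_cases, exp_total, unexp_cases, unexp_total):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (exp_cases / exp_total) / (unexp_cases / unexp_total)


def odds_ratio(exp_cases, exp_total, unexp_cases, unexp_total):
    """Odds of disease in the exposed divided by odds in the unexposed."""
    exposed_odds = exp_cases / (exp_total - exp_cases)
    unexposed_odds = unexp_cases / (unexp_total - unexp_cases)
    return exposed_odds / unexposed_odds


# Hypothetical rare outcome: 0.8% of exposed vs 0.4% of unexposed.
rare = (8, 1000, 4, 1000)
# Hypothetical common outcome: 40% of exposed vs 20% of unexposed.
common = (400, 1000, 200, 1000)

rr_rare, or_rare = relative_risk(*rare), odds_ratio(*rare)
rr_common, or_common = relative_risk(*common), odds_ratio(*common)

print(f"Rare outcome:   RR = {rr_rare:.2f}, OR = {or_rare:.2f}")
print(f"Common outcome: RR = {rr_common:.2f}, OR = {or_common:.2f}")
```

In both scenarios the relative risk is 2, but the odds ratio stays close to 2 only for the rare outcome and overstates it for the common one.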

41
VARIOUS TESTS
  • Have some idea what each is for -
  • A reasonable reference is
  • http://www.une.edu.au/WebStat/unit_materials/c6_common_statistical_tests/
  • Parametric Tests and Non-Parametric Tests
  • Nonparametric methods are used when we know
    nothing about the distribution of the variable in
    the population. It is not so much that they are for
    non-normally distributed data, but that they make
    no assumption of a normal distribution
  • Parametric tests are used where there is a normal
    distribution

42
Parametric vs Non-Parametric tests
  • Memorize a name of each sort e.g.

43
Null Hypothesis
  • The alternative hypothesis (to the
    researcher's theory). It usually assumes that
    there is no relationship between the dependent
    and independent variables. The null hypothesis is
    assumed to be correct until research demonstrates
    that it is incorrect. This process is known as
    falsification.

44
POWER
  • Type I Error Rate (Alpha)
  • The probability of incorrectly rejecting a true
    null hypothesis (a Type I error gives a false
    positive result)
  • Type II Error Rate (Beta)
  • The probability of incorrectly accepting a false
    null hypothesis (a Type II error gives a false
    negative result)

45
  • In the social sciences there are conventions
    that
  • α, the Type I error (risk of a false positive)
    - must be kept at or below 0.05 (5%)
  • β, the Type II error (risk of a false
    negative) - must be kept low as well (20% or
    less, generally)
  • Statistical Power is equal to 1 - β
  • and must be kept correspondingly high
  • Power should be at least 0.80 (80%) to detect a
    reasonable departure from the null hypothesis
  • Statistical Power: the probability of
    rejecting a false null hypothesis
46
In Reject-Support (RS) research (the usual kind)
  • (the opposite is true in Accept-Support (AS)
    research)
  • The researcher wants to reject the null
    hypothesis
  • "Society" wants to control Type I error (false
    positives)
  • The researcher is very concerned about Type II
    error (false negative - missing the fact that
    you have a result that supports your theory - as
    such a result is much more likely to get
    published)
  • High sample size works for the researcher
  • But if there is "too much power", trivial effects
    become "highly significant"

47
Factors influencing power in a statistical test
  • 1. What kind of statistical test is being used
  • 2. Sample size
  • 3. Size of the experimental effect
  • 4. Level of error in experimental measurements
  • A Sampling Distribution
  • the distribution of a statistic over repeated
    samples
  • The Standard Error of the Proportion
  • the standard deviation of the distribution of the
    sample proportion over repeated samples

48
Power Analysis in Studies
  • In planning a study, one must estimate
  • What would be the reasonable minimum
    experimental effect that one wants to detect
  • A minimum Power to detect that effect
  • The sample size that will achieve that desired
    level of Power

49
Steps required for Power analysis and sample size
estimation
  1. The type of analysis and the null hypothesis are
    specified
  2. Power and required sample size for a reasonable
    range of likely experimental effects is
    investigated
  3. The sample size required to detect a reasonable
    experimental effect (i.e. departure from the null
    hypothesis) with a reasonable level of power is
    calculated, while allowing for a reasonable
    margin of error

50
  • Method (Excerpt) Statistical analysis
  • It was estimated that in order to detect a
    30% difference between the percentage of
    responders in the control group compared with
    that in the exercise group at the P=0.05 level of
    significance, a sample size of 40 subjects per
    group would be required to give a power of 90%.
    Data on poorly responsive depression are scant
    but the proportion of responders in the control
    group was reasonably anticipated to be 10%,
    compared with an anticipated 40% in the exercise
    group.
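
The sample size quoted in this excerpt can be roughly reproduced with a standard normal-approximation formula for comparing two proportions. This is a sketch only: the excerpt does not say which formula the authors used, so the formula below (two-sided alpha, unpooled variances, no continuity correction) is an assumption.

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_proportions(p_control, p_treatment, alpha=0.05, power=0.90):
    """Approximate subjects needed per group to detect p_treatment vs p_control."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    # Sum of the binomial variances of the two groups.
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_treatment - p_control
    return ceil((z_alpha + z_beta) ** 2 * variance / effect**2)


# Values from the excerpt: 10% vs 40% responders, P = 0.05, power 90%.
n_per_group = sample_size_two_proportions(0.10, 0.40, alpha=0.05, power=0.90)
print(f"About {n_per_group} subjects per group")
```

Under these assumptions the formula gives about 39 per group, in line with the 40 reported. Lowering the power to 0.80 reduces the required number, which illustrates the trade-off between sample size and the risk of a Type II error.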

51
  • Was a power analysis done prior to the study?
    What is the main implication?
  • Yes. The power was set at 0.9 (90%)
  • Power = 1 - beta (beta is the probability of
    making a Type-II error)
  • So, 0.9 = 1 - beta, or beta = 1 - 0.9, which is
    0.1 or 10%. Thus the risk of making a Type-II
    error in this study was 10%, as opposed to most
    studies which set Power at 0.8 - i.e. they
    tolerate a risk of 20% of making a Type-II error
    (a false negative)
  • Main Implication was that the study did have
    enough power to detect a significant improvement,
    which it did not do

52
Ethics in Research
  • http://www.wma.net/e/policy/b3.htm - World
    Medical Association Helsinki principles for
    research in humans
  • Ethics Committees - Think about their role and
    how to design studies to meet these requirements
  • RANZCP principles from Code of Ethics -
    Psychiatrists involved in clinical research shall
    adhere to those relevant ethical principles
    embodied in national and international guidelines

53
College Code of Ethics (paraphrased)
  • It's done on people so high standards are needed
    and must be scientifically justified
  • Must be OKd by an Ethics Committee
  • Minimize any harm to subjects
  • The interests of subjects always take precedence
    over science or society's interests
  • Informed consent must be obtained from people
    participating in research
  • Special care to be taken with consent from those
    in dependent relationships, e.g. students,
    prisoners, the elderly
  • For minors - consent from parent/guardian

54
College Code of Ethics (paraphrased)
  • If subjects aren't competent to consent, get this
    from a relative or guardian
  • Subjects can withdraw at any time and it won't
    jeopardise their care
  • If a researcher uncovers clinically relevant
    information needing acting on, the researcher
    should tell the patient and their doctor
  • Confidential information obtained from the
    research stays within the study
  • No plagiarism, acknowledge all references
  • Research reports to be truthful and accurate
  • Ensure participants are deidentified
  • Declare any conflict of interest in all
    publications