Session I How to use STATA

Provided by: Prof189
1
Session I: How to use STATA & Basic Data
Management Commands
2
What will be covered?
  • Introduction to STATA Software
  • General Guidelines in Data entry
  • Data Management in STATA

3
Introduction to STATA
6
Open & Close the Output File
  • To open the log file
  • log using directory\path\filename.log
  • log using d:\trials\zinc.log
  • To close
  • log close

zinc.dta
7
To Open Log (Output) File
8
To Close the Log File
9
Append & Replace the Existing Log File
  • To append to the existing log file
  • log using d:\trials\zinc.log, append
  • To replace the existing log file
  • log using d:\trials\zinc.log, replace
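A minimal sketch of the open, close, append, and replace cycle described above. The file name zinc_demo.log is hypothetical and is written to the current working directory rather than to d:\trials.

```stata
* Close any log that is already open; capture suppresses the error if none is
capture log close
* Start a fresh plain-text log, overwriting an existing file of the same name
log using "zinc_demo.log", text replace
display "first run"
log close

* Reopen the same file and add new output at the end
log using "zinc_demo.log", text append
display "second run"
log close
```

Without replace or append, log using stops with an error when the file already exists, which protects earlier output.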
10
Open the Data File
  • To open the data file
  • use directory\path\filename.dta
  • use d:\trials\zinc.dta
  • To save
  • save zinc.dta
zinc.dta
11
To Make A New Directory
12
To Change the Directory
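The two directory slides carry no transcript, so the following is only a sketch of how this is commonly done with Stata's mkdir, cd, and pwd commands. The folder name stata_demo is hypothetical.

```stata
* Remember the current working directory so we can return to it
local here "`c(pwd)'"
* Create a new folder; capture avoids an error if it already exists
capture mkdir "stata_demo"
* Move into the new folder and show where we are
cd "stata_demo"
pwd
* Go back to the starting directory
cd "`here'"
```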
13
General Guidelines in Data Entry
  • Each row in the datasheet should contain one
    individual's information (a record).
  • Each column should contain the values of a single
    entity for all individuals (a variable).
  • Variable names should not exceed eight
    characters.
  • Variables can be numeric, string, or
    alphanumeric.
  • A numeric variable must contain only numbers.
  • Every datasheet must include an identification number.

14
DATA DESCRIPTION
15
Data Management using STATA
16
Data Management using STATA
  • Inputting Data
  • Editing Data
  • Creating and Changing Variables
  • Saving and Reusing Data
  • Data Reorganization
  • Merging and Appending datasets

17
Inputting Data
  • Enter data from keyboard
  • input varlist
  • input str25 name age str1 sex
  • Best way is copy from excel and directly paste
    the data to STATA editor
  • Transfer from other programs
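A small sketch of keyboard entry with input, using the variable list from the slide. The two records are made up for illustration.

```stata
clear
* str25 and str1 declare string variables of up to 25 and 1 characters
input str25 name age str1 sex
"Asha Kumar" 4 "F"
"Ravi Singh" 7 "M"
end
* Show what was entered
list
```

Strings containing spaces must be quoted; the block is terminated with end.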

18
Arithmetic Operators
+ (Addition) - (Subtraction)
* (Multiplication) / (Division) ^ (Raise to
power)
19
Relational Operators
> (greater than) < (less than) >=
(greater than or equal) <= (less than or
equal) == (equal) != (not equal)
20
Logical Operators
& (and) | (or) ! (not)
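A quick way to try the arithmetic, relational, and logical operators is the display command. This is only an illustrative sketch; relational and logical expressions evaluate to 1 (true) or 0 (false).

```stata
* Arithmetic operators
display 7 + 2
display 7 - 2
display 7 * 2
display 7 / 2
display 7 ^ 2
* Relational operators return 1 for true and 0 for false
display 7 > 2
display 7 == 2
display 7 != 2
* Logical operators: and, or, not
display (7 > 2) & (2 > 7)
display (7 > 2) | (2 > 7)
display !(7 > 2)
```

Note that a single = assigns a value, while == tests for equality.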
21
Expressions
if: used when an expression is to be specified as
the condition
in: used when a range of observations is to be
specified in the condition
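A sketch of the if and in qualifiers on a made-up dataset. The variable names follow the slides (treatment, centre, age); the values are invented.

```stata
clear
input treatment centre age
1 3 30
2 3 20
1 4 40
2 3 28
1 2 26
end
* if: keep only observations that satisfy the expression
list treatment age if centre == 3 & age > 25
* in: keep only a range of observation numbers
list in 1/3
* Count how many observations satisfy the condition
count if centre == 3 & age > 25
```

Stata treats a missing value as larger than any number, so a condition such as age > 25 is also true when age is missing; adding & !missing(age) guards against that.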
22
Editing Data
  • Edit using Data Editor
  • edit [varlist] [if] [in]
  • edit treatment centre age
  • edit treatment age if centre==3 & age>25

23
Browsing Data
  • List using Data Browser
  • browse [varlist] [if] [in]
  • browse treatment centre age
  • browse treatment age if centre==3 & age>25

24
Do this Exercise
  • Edit the following
  • pcode, treatment and cough only for centre 4
  • browse for the same and feel the difference

zinc.dta
25
Creating & Changing Variables
  • Create new variable
  • generate newvar = exp [if] [in]
  • gen totstl24 = s1_tstool_wt + s2_tstool_wt + s3_tstool_wt
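A sketch of generate on invented values, using the stool-weight variable names from the slide. It also shows, as an aside not taken from the slides, how a missing component affects the total.

```stata
clear
input pcode s1_tstool_wt s2_tstool_wt s3_tstool_wt
1 100 150 200
2 80 . 120
3 60 90 30
end
* Sum with +: the result is missing if any component is missing
generate totstl24 = s1_tstool_wt + s2_tstool_wt + s3_tstool_wt
* egen rowtotal() treats missing components as zero instead
egen totstl24_nm = rowtotal(s1_tstool_wt s2_tstool_wt s3_tstool_wt)
list
```

Which behaviour is appropriate depends on whether a missing period should invalidate the total; that choice is left to the analyst.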

26
Do this Exercise
  • Generate total stool output from 0-48 hours

zinc.dta
27
Creating & Changing Variables
contd
  • Change contents of existing variable
  • To replace
  • replace oldvar = exp [if] [in]
  • replace sodium1 = . if sodium1 == 0
  • To recode
  • recode varlist (erule) [(erule) ...] [if] [in]
  • recode age (min/6=1) (7/11=2) (12/max=3), gen(agecat)

Rule              Example          Meaning
# = #             3 = 1            3 recoded to 1
# # = #           2 4 = 5          2 and 4 recoded to 5
#/# = #           4/8 = 3          4 through 8 recoded to 3
nonmissing = #    nonmissing = 2   all other nonmissing to 2
missing = #       missing = 9      all other missing to 9
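A sketch of replace and recode on invented data. The variable names sodium1, age, and agecat follow the slide; the values are made up.

```stata
clear
input pcode sodium1 age
1 135 4
2 0 6
3 140 9
4 0 14
5 138 11
end
* Set impossible zero values to missing
replace sodium1 = . if sodium1 == 0
* Group age into three categories and store them in a new variable
recode age (min/6 = 1) (7/11 = 2) (12/max = 3), generate(agecat)
tabulate agecat
count if missing(sodium1)
```

Using generate() keeps the original age variable intact, so the grouping can be checked against the raw values.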
28
Do this Exercise
Ex 1: Replace all zeros in serum Potassium as
missing.
Ex 2: Recode pre admission diarrhea
duration into 0-24h, 25-72h and > 72h
zinc.dta
29
Creating & Changing Variables
contd
  • Rename the existing variable
  • rename oldvarname newvarname
  • ren tlc_t2 tlc2
  • ren tlc_t3 tlc3
  • Eliminate the existing variable
  • To drop
  • drop varlist
  • drop name address
  • To keep
  • keep varlist
  • keep idno age sodium albumin-tlc
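A sketch of rename, drop, and keep on a made-up dataset. The variable names echo the slides; the values and the exact variable order are assumptions.

```stata
clear
input idno age sodium albumin tlc_t2 tlc_t3
1 5 135 3.5 9000 8500
2 8 140 4.0 11000 9800
end
generate str10 name = "n/a"
* Give the repeated measurements shorter names
rename tlc_t2 tlc2
rename tlc_t3 tlc3
* Remove a variable that is not needed
drop name
* Keep listed variables; albumin-tlc3 means every variable from albumin
* through tlc3 in dataset order
keep idno age albumin-tlc3
describe
```

A hyphenated range such as albumin-tlc3 depends on the current variable order, so it is worth running describe first.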

zinc.dta
30
Saving & Reusing Data in Stata Format
  • To Save data
  • save filename.dta
  • save zinc, replace
  • clear
  • To reuse data
  • use filename
  • use zinc
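A sketch of the save, clear, use round trip. The file name zinc_demo.dta is hypothetical and the data are invented.

```stata
clear
input pcode age
1 5
2 9
end
* Write the data to disk, overwriting an existing file of the same name
save "zinc_demo.dta", replace
* Remove the data from memory
clear
* Load the saved file again
use "zinc_demo.dta"
describe
```

clear discards unsaved changes without warning. use filename, clear combines the last two steps.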

zinc.dta
31
Data Reorganization
  • Sorting observations and changing variable order
  • To sort
  • sort varlist [in] (sorts in
    ascending order)
  • sort pcode
  • Move specified variables to front of dataset
  • order varlist
  • Move one variable to specified position
  • move varname1 varname2
  • Alphabetize specified variables and move to
    front of dataset
  • aorder varlist
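A sketch of sorting and reordering on invented data. In recent Stata releases the order command with its before(), after(), and alphabetic options covers what move and aorder do in the slides; the example uses order for that reason, which is a substitution and not the slides' own syntax.

```stata
clear
input pcode treatment age
3 1 9
1 2 5
2 1 14
end
* Sort observations in ascending order of pcode
sort pcode
* gsort with a minus sign sorts in descending order
gsort -age
* Move treatment and age to the front of the dataset
order treatment age
* Place pcode immediately after age
order pcode, after(age)
* Arrange all variables alphabetically
order _all, alphabetic
describe
```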

zinc.dta
32
Data Reorganization contd
  • Convert data from wide to long
  • reshape long stubnames, i(varlist) j(varname)
  • reshape long albumin, i(pcode) j(time)

Wide Shape Data
Long Shape Data
33
Data Reorganization contd
  • Convert data from long to wide
  • reshape wide stubnames, i(varlist) j(varname)
  • reshape wide albumin, i(pcode) j(time)

Long Shape Data
Wide Shape Data
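A sketch of the wide to long to wide round trip with reshape. The variables albumin1 and albumin2 are assumed names for two albumin measurements per patient; the values are invented.

```stata
clear
input pcode albumin1 albumin2
1 3.5 3.8
2 4.0 4.1
3 3.2 3.6
end
* Wide to long: one row per patient per time point
reshape long albumin, i(pcode) j(time)
list, sepby(pcode)
* Long to wide: back to one row per patient
reshape wide albumin, i(pcode) j(time)
list
```

The stub name (albumin) is the common prefix; the numeric suffix becomes the values of the j() variable.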
34
Do this Exercise
Convert serum zinc from wide to long shape data
using zinclab.dta
zinclab.dta
35
Answer!!!
zinclab.dta
36
Merging & Appending Datasets
  • To append datasets
  • append using filename
  • use zinc1.dta
  • append using zinc2.dta
  • To merge datasets
  • merge varlist using filename
  • use zinclab
  • sort pcode
  • save zinclab, replace
  • use zincprognostic
  • sort pcode
  • merge pcode using zinclab
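The merge syntax on the slide is the older form, which needs both files sorted by the key. As a hedged sketch, newer Stata releases (version 11 onward) also accept merge 1:1, which does not require sorting. The data are invented and a temporary file stands in for zinclab.dta.

```stata
clear
input pcode zinc
1 60
2 75
3 58
end
* Save the laboratory data to a temporary file
tempfile lab
save `lab'

clear
input pcode age
1 5
2 9
4 12
end
* One-to-one merge on the patient code
merge 1:1 pcode using `lab'
* _merge: 1 = master only, 2 = using only, 3 = matched
tabulate _merge
list
```

Checking _merge after every merge shows which patients appear in only one file.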

zinclab.dta
37
Do this Exercise
Merge file 1 (zinclab.dta) with file 2
(zincprognosis.dta)
zinclab.dta
38
Session II: Data Cleaning & Preparing Data for
Analysis
39
Preparing Data for Analysis
  • Inclusion criteria 35 months old children

40
Preparing Data for Analysis contd
41
Do this Exercise
Inclusion criterion for the study was pre
admission diarrhea duration < 7 days
Ex 1: Convert pre admission diarrhea duration from
hours to days using zincclean.dta
Ex 2: Find
values beyond expected range
zinc.dta
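One possible way to approach this exercise, shown on invented data. The variable name pre_dia_h is hypothetical, and the 7-day cut-off is taken from the inclusion criterion stated in the exercise.

```stata
clear
input pcode pre_dia_h
1 24
2 72
3 200
4 .
5 36
end
* Convert duration from hours to days
generate pre_dia_d = pre_dia_h / 24
summarize pre_dia_d
* List records at or beyond 7 days, excluding missing values
list pcode pre_dia_h pre_dia_d if pre_dia_d >= 7 & !missing(pre_dia_d)
count if pre_dia_d >= 7 & !missing(pre_dia_d)
```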
42
Answer!!!
43
Preparing Data for Analysis contd
44
Preparing Data for Analysis contd
45
Preparing Data for Analysis contd
?
46
Do this Exercise
Do similar exercise for hemoglobin using zinc.dta
zinc.dta
47
Answer!!!
48
Preparing Data for Analysis contd
What do you mean by 1 2???
zinc.dta
49
Preparing Data for Analysis contd
Label name
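The slide content is not in the transcript, so this is only a sketch of how numeric codes such as 1 and 2 are usually given meaning with value labels. The label name trtlbl and the codings 1 = A, 2 = B are assumptions.

```stata
clear
input pcode treatment
1 1
2 2
3 1
end
* Define a value label and attach it to the variable
label define trtlbl 1 "A" 2 "B"
label values treatment trtlbl
* A variable label describes the variable itself
label variable treatment "Treatment group"
tabulate treatment
```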
50
Preparing Data for Analysis contd
zinc.dta
What is wrong and how to correct it???
51
Preparing Data for Analysis contd
52
Preparing Data for Analysis contd
53
Do this Exercise
Generate total stool output for first 48 hrs
zincclean.dta
54
Preparing Data for Analysis contd
55
Do this Exercise
Draw a boxplot and identify extreme value, if
any, for s2_tstool_wt using zincclean.dta
zincclean.dta
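A sketch of how extreme values can be flagged numerically with the same 1.5 x IQR rule a box plot uses for outside values. The data are invented; hi_fence and lo_fence are hypothetical scalar names.

```stata
clear
input s2_tstool_wt
120
150
135
160
140
900
end
* Detailed summary stores the quartiles in r(p25) and r(p75)
summarize s2_tstool_wt, detail
scalar iqr = r(p75) - r(p25)
scalar hi_fence = r(p75) + 1.5 * iqr
scalar lo_fence = r(p25) - 1.5 * iqr
* List observations outside the fences
list if (s2_tstool_wt > hi_fence | s2_tstool_wt < lo_fence) & !missing(s2_tstool_wt)
* In an interactive session the plot itself can be drawn with:
* graph box s2_tstool_wt
```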
56
Session III: Introduction to Basic Data
Analysis
57
What will be Covered?
  • Descriptive Statistics
  • Parametric tests
  • Non-parametric tests

58
Analyses
  • Univariate (one variable at a time)
  • Bivariate (two variables at a time)
  • Multivariate (more than two variables at a time)

59
Descriptive Statistics
60
Univariate Analysis
Quantitative: Mean, Median, Range/IQ Range, SD
Categorical: Frequency, percentage
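A sketch of the two kinds of univariate summary on invented data: tabulate for a categorical variable and summarize for a quantitative one.

```stata
clear
input treatment age
1 4
2 6
1 9
2 14
1 11
end
* Categorical: frequency and percentage
tabulate treatment
* Quantitative: n, mean, SD, min, max
summarize age
* detail adds median, quartiles, and other percentiles
summarize age, detail
```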
61
Descriptive Statistics-Categorical Variable
Can we label the variables???
62
Contingency Table
63
Contingency Table contd
64
Contingency Table contd
65
Contingency Table contd
66
Contingency Table contd
Immediate commands
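A sketch of a two-way table from data and from typed-in cell counts. The variable names follow the exercise that comes next (treatment, withdrawn); the records and the counts given to tabi are invented.

```stata
clear
input treatment withdrawn
1 0
1 1
1 0
2 0
2 0
2 1
end
* Cross-tabulation with row and column percentages
tabulate treatment withdrawn, row column
* Immediate form: type the cell counts directly, rows separated by \
tabi 30 18 \ 38 14
```

Immediate commands work from numbers typed on the command line and leave the data in memory unchanged unless replace is specified.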
67
Do this Exercise
Ex 1: Draw a crosstab between treatment and
withdrawn using zinc.dta
Ex 2: Draw a crosstab
between treatment and diarr24, diarr48
zinc.dta
68
Descriptive Statistics-Quantitative Variable
69
Summary in Detail
70
Do this Exercise
  • Calculate summary statistics for the following
    variables
  • Total stool output 0-48h
  • Total ORS intake 0-24h
  • Total stool frequency in 24h before admission
  • Serum zinc at admission

zinc.dta
71
Summary Statistics by Group
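A sketch of two common ways to get summary statistics by group, on invented data: the by prefix with summarize, and tabstat with its by() option.

```stata
clear
input treatment age
1 4
2 6
1 9
2 14
1 11
end
* Separate summary for each treatment group
bysort treatment: summarize age
* Compact table of chosen statistics by group
tabstat age, by(treatment) statistics(n mean sd median)
```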
72
Do this Exercise
  • Calculate summary statistics by treatment for
    the following variables
  • Total stool output 0-48h
  • Total ORS intake 0-24h
  • Total stool frequency in 24h before admission
  • Serum zinc at admission

zinc.dta
73
Percentile Values
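A sketch of how arbitrary percentiles can be obtained, here the 3rd and 97th. The data are artificial (the values 1 to 100) and the variable name stool48 is hypothetical.

```stata
clear
set obs 100
generate stool48 = _n
generate treatment = 1 + mod(_n, 2)
* centile reports the requested percentiles with confidence intervals
centile stool48, centile(3 97)
* The same for one treatment group at a time
centile stool48 if treatment == 1, centile(3 97)
centile stool48 if treatment == 2, centile(3 97)
* _pctile stores the percentiles in r(r1), r(r2) for later use
_pctile stool48, percentiles(3 97)
display r(r1) "  " r(r2)
```

centile and _pctile use slightly different interpolation rules by default, so their results can differ a little.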
74
Do this Exercise
  • Calculate 3rd and 97th percentile value by
    treatment for the following variables
  • Total stool output 0-48h
  • Total ORS intake 0-24h

zinc.dta
75
Session IV (A): Bi-variate Analyses
76
Analysis of Clinical Trial Data
77
Analysis of Clinical Trial Data
  1. Compare patient characteristics at the time of
    randomization and baseline measurements between
    the groups
  2. Assess the difference in outcome variable(s)
    between the groups (adjusting for any imbalance
    in patient characteristics or baseline outcome
    variables)

78

Bi-variate Analyses
  • Categorical vs Categorical
  • Categorical vs Quantitative

79
1. Categorical vs Categorical

X = 2, Y = 2
  • Unrelated: Chi-square test, Fisher's exact test
  • Related: McNemar test

X > 2, Y > 2
  • Unrelated: Chi-square test, Fisher's exact test

X = Group variable, Y = Outcome variable
80
Chi-square test
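A sketch of the chi-square and Fisher's exact tests on a 2 x 2 table. The cell counts are invented and entered as frequency weights; ivfluid is a hypothetical variable name for the IV-fluid outcome.

```stata
clear
input treatment ivfluid n
1 1 30
1 0 18
2 1 38
2 0 14
end
* chi2 requests Pearson's chi-square, exact requests Fisher's exact test,
* expected adds the expected cell frequencies
tabulate treatment ivfluid [fweight = n], chi2 exact expected
```

Fisher's exact test is the usual choice when expected cell frequencies are small.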
81
Do this Exercise
Is there a difference between the proportion of
patients requiring IV fluids in the two treatment
groups?
zinc.dta
82
Chi-square Test/Fisher's Exact Test by Group
83
Comparison of two proportions
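A sketch of the immediate two-sample proportion test. prtesti takes the sample size and proportion of each group. The numbers below assume the exercise figures are percentages (91% of 248 and 95% of 252); that reading is an assumption.

```stata
* prtesti n1 p1 n2 p2
prtesti 248 0.91 252 0.95
```

With raw data the equivalent is prtest outcome, by(group), where outcome is a 0/1 variable.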
84
Do this Exercise
  1. Is there a difference in the proportion of
    patients recovered in rota virus negativity
    between the two treatment groups?
  2. 91% of patients recovered in treatment A (n=248)
    and 95% of patients recovered in treatment B
    (n=252). Test these proportions and find out the
    p-value

zinc.dta
85
McNemar's Chi-square Test
86
McNemar's Chi-square Test contd
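A sketch of McNemar's test using the immediate command mcci, which takes the four cells of the paired 2 x 2 table. The counts are invented; with paired 0/1 variables in memory the corresponding command is mcc.

```stata
* mcci a b c d, where b and c are the discordant pairs
* Here: 20 pairs positive both times, 15 and 5 discordant, 60 negative both times
mcci 20 15 5 60
```

McNemar's statistic depends only on the discordant pairs: (15 - 5)^2 / (15 + 5) = 5.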
87
Do this Exercise
Is there a shift in zinc deficiency from baseline
after giving treatment B?
zinc.dta
88
2. Categorical vs Quantitative

Parametric
  • X = 2, Y Normal
    Unrelated: Student's t test; Related: Paired t test
  • X > 2, Y Normal
    Unrelated: One-way ANOVA; Related: Repeated measures ANOVA

Non-Parametric
  • X = 2, Y Non-Normal
    Unrelated: Wilcoxon ranksum; Related: Wilcoxon signrank
  • X > 2, Y Non-Normal
    Unrelated: Kruskal-Wallis; Related: Friedman's test
89
Student's t Test for Independent Groups
90
Student's t Test for Independent
Groups contd
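A sketch of the independent-groups t test on invented data. The variable name ors24 is hypothetical.

```stata
clear
input treatment ors24
1 500
1 620
1 580
1 540
2 450
2 480
2 520
2 430
end
* Two-sample t test assuming equal variances
ttest ors24, by(treatment)
* Same comparison without assuming equal variances
ttest ors24, by(treatment) unequal
```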
91
What is the Difference in the Total ORS Intake in
the First 24h between the Two Groups?
92
Transformations
93
Transformations contd
94
Do this Exercise
Ex 1: What is the difference in total stool
output 0-48 hours between the two groups?
Ex 2: Is there a difference between total duration of
diarrhea (in hours) (varname: tot_du_dia_h)
between the two treatment groups?
zinc.dta
95
Geometric Mean if Log Transformation is Used
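A sketch of the log transformation and the geometric mean, which is the exponential of the mean of the logged values. The data are invented and stool48 is a hypothetical variable name.

```stata
clear
input stool48
1
10
100
end
* Natural-log transformation, stored in double precision
generate double ln_stool = ln(stool48)
* Mean on the log scale, then back-transform
summarize ln_stool
display exp(r(mean))
* ameans reports arithmetic, geometric, and harmonic means directly
ameans stool48
```

For these three values the geometric mean is 10, the cube root of 1 x 10 x 100.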
96
Do this Exercise
Ex: Calculate the geometric
mean for stool output 0-48 hours
zinc.dta
97
Paired t-Test
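A sketch of the paired t test on invented before-and-after values. The variable names zinc_base and zinc_rec are hypothetical.

```stata
clear
input pcode zinc_base zinc_rec
1 60 72
2 58 65
3 70 74
4 62 70
5 55 66
end
* Paired t test: each patient is compared with himself or herself
ttest zinc_rec == zinc_base
```

The paired form requires both measurements to sit in the same row (wide shape).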
98
Do this Exercise
Is there a change in zinc value from baseline
after giving treatment B?
zinc.dta
99
Is there a Change in the Serum Zinc from Baseline
to Recovery between Two Treatment Groups?
Discuss..
100
One-way ANOVA
Analysis of Variance
101
Multiple Comparisons
Difference in means of zinc values between age
groups 6 and > 12
P-value
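A sketch of one-way ANOVA with Bonferroni-adjusted pairwise comparisons. The data are invented; agecat is assumed to be a three-level age group variable.

```stata
clear
input agecat zinc
1 60
1 64
1 58
2 70
2 66
2 72
3 75
3 80
3 78
end
* tabulate adds group means and SDs; bonferroni adds pairwise comparisons
oneway zinc agecat, tabulate bonferroni
```

The pairwise table shows the difference in means for each pair of groups with its adjusted p-value.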
102
Non-Parametric Methods
103
Is there a difference in total stool output in
the first 24h between the two treatment groups?
Answer Wilcoxon Ranksum test
104
Is there a difference in total stool output in
the first 24h between the two treatment groups?
contd
Answer Wilcoxon Ranksum test
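A sketch of the Wilcoxon rank-sum test on invented data. The variable name stool24 is hypothetical.

```stata
clear
input treatment stool24
1 500
1 620
1 580
1 540
2 450
2 480
2 520
2 430
end
* Rank-sum (Mann-Whitney) test comparing two independent groups
ranksum stool24, by(treatment)
```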
105
Do this Exercise
Is there a difference in total diarrhea duration
between the two groups?
zinc.dta
106
Is there a Change in zinc from baseline after
giving treatment A?
Answer Wilcoxon signed-rank test
107
Is There a Change in zinc from baseline after
giving treatment A?
Answer Wilcoxon signed-rank test
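A sketch of the Wilcoxon signed-rank test on invented paired values. The variable names zinc_base and zinc_rec are hypothetical.

```stata
clear
input pcode zinc_base zinc_rec
1 60 72
2 58 65
3 70 74
4 62 70
5 55 66
end
* Signed-rank test for paired observations
signrank zinc_rec = zinc_base
```

To test within one treatment only, an if qualifier on the treatment variable can be added.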
108
Do this Exercise
  1. Is there any difference in zinc from baseline
    after giving treatment B?

zinc.dta
109
Is there a difference in total stool output
across age groups?
Answer Kruskal-Wallis Test
Contd
110
Is there a difference in total stool output
across age groups?
Answer Kruskal-Wallis Test
Contd
111
Is there a difference in total stool output
across age groups?
Answer Kruskal-Wallis Test
Contd
112
Is there a difference in total stool output
across age groups?
Answer Kruskal-Wallis Test
Contd
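A sketch of the Kruskal-Wallis test across three groups on invented data. The variable names stool and agecat are hypothetical.

```stata
clear
input agecat stool
1 300
1 340
1 280
2 400
2 360
2 420
3 450
3 500
3 480
end
* Kruskal-Wallis equality-of-populations rank test
kwallis stool, by(agecat)
```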
113
Do this Exercise
  1. Is there any difference in serum zinc (at
    admission) across the age groups ?

zinc.dta