I Do, I Do, I DoW! A look at SAS DO and DoW loops

1
I Do, I Do, I DoW! A look at SAS DO and DoW loops
  • John Matro
  • Virginia Commonwealth University

2
Topics
  • What is a DO loop?
  • Some simple examples
  • Using DO loops with SAS arrays
  • Using DO loops for reading data
  • The DoW loop

3
Acknowledgements
  • Do Which? Loop, Until or While? A Review Of Data
    Step And Macro Algorithms, Ronald J. Fehd, SAS
    Global Forum 2007 proceedings,
    http://www2.sas.com/proceedings/forum2007/067-2007.pdf
  • The Magnificent DO, Paul M. Dorfman, SESUG 2002
    proceedings,
    http://analytics.ncsu.edu/sesug/2002/TU05.pdf
  • In Lockstep with the DoW-Loop, Paul M. Dorfman,
    SESUG 2011 proceedings,
    http://analytics.ncsu.edu/sesug/2011/SS01.Dorfman.pdf

4
Preliminaries
We will use sashelp.class in some of our examples:

Obs    Name       Sex    Age    Height    Weight
  1    Alfred      M      14     69.0     112.5
  2    Alice       F      13     56.5      84.0
  3    Barbara     F      13     65.3      98.0
  4    Carol       F      14     62.8     102.5
  5    Henry       M      14     63.5     102.5
  6    James       M      12     57.3      83.0
  7    Jane        F      12     59.8      84.5
  8    Janet       F      15     62.5     112.5
...
 17    Ronald      M      15     67.0     133.0
 18    Thomas      M      11     57.5      85.0
 19    William     M      15     66.5     112.0
5
Preliminaries
    DATA _NULL_;
      SET sashelp.class (OBS=3);
      PUT 'Hello ' age age= name= _N_=;
    RUN;

Hello 14 Age=14 Name=Alfred _N_=1
Hello 13 Age=13 Name=Alice _N_=2
Hello 13 Age=13 Name=Barbara _N_=3

_NULL_: Means we are not creating a SAS table.
PUT: Writes info to the SAS log or to a file.
_N_: Automatic SAS variable that indicates the
current iteration of the Data step.
6
Preliminaries
  • You can modify _N_

    DATA _NULL_;
      PUT 'Top ' _N_=;
      SET sashelp.class (OBS=3);
      _N_ = _N_ + 10;
      PUT 'Bottom ' _N_= /;
    RUN;

Top _N_=1
Bottom _N_=11

Top _N_=2
Bottom _N_=12

Top _N_=3
Bottom _N_=13

Top _N_=4
7
Preliminaries
  • A data step can have multiple SET statements.
  • Each reads its own virtual copy of the file
    (buffer); the copies are completely independent
    of each other.
  • Each begins reading at the first record.
  • Each has its own file pointer to remember where
    it stopped reading (in its virtual copy).

    DATA _NULL_;
      SET sashelp.class (OBS=2);
      PUT name= age=;
      SET sashelp.class (OBS=2);
      PUT name= age=;
    RUN;

Name=Alfred Age=14
Name=Alfred Age=14
Name=Alice Age=13
Name=Alice Age=13
8
Preliminaries
  • A data step stops executing when any SET
    statement tries to read past the end of its file

    DATA _NULL_;
      PUT _N_=;
      SET sashelp.class (OBS=2);
      PUT 'Middle';
      SET sashelp.class (OBS=2);
    RUN;

_N_=1
Middle
_N_=2
Middle
_N_=3
9
Preliminaries
  • On a SET statement, END= creates a temporary
    variable, initialized to 0, that is set to 1
    when the last observation is read. In this
    example, we name the variable 'eof'.

    DATA _NULL_;
      SET sashelp.class END=eof;
      PUT _N_= eof=;
    RUN;

_N_=1 eof=0
_N_=2 eof=0
_N_=3 eof=0
_N_=4 eof=0
. . .
_N_=17 eof=0
_N_=18 eof=0
_N_=19 eof=1
10
Preliminaries
  • Variables created in a data step are reset to
    missing on each data step iteration

    DATA _NULL_;
      SET sashelp.class (OBS=3);
      IF (_N_ = 1) THEN a = 0;
      a = a + age;
      PUT _N_= age= a=;
    RUN;

_N_=1 Age=14 a=14
_N_=2 Age=13 a=.
_N_=3 Age=13 a=.

But you can override that reset by retaining a
variable.
11
Preliminaries
Retaining with a RETAIN statement

    DATA _NULL_;
      SET sashelp.class (OBS=3);
      RETAIN a 0;
      a = a + age;
      PUT _N_= age= a=;
    RUN;

_N_=1 Age=14 a=14
_N_=2 Age=13 a=27
_N_=3 Age=13 a=40

RETAIN a 0; SAS will not reset 'a' on each data
step iteration. Also initializes 'a' to 0.
12
Preliminaries
Retaining with a 'sum statement'

    DATA _NULL_;
      SET sashelp.class (OBS=3);
      a + age;
      b + 1;
      PUT a= b=;
    RUN;

a=14 b=1
a=27 b=2
a=40 b=3

x + ___ ; Similar to x = x + ___ ; but it also
initializes 'x' to zero and retains 'x'. (In SAS
docs, see Sum statement.)
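
A minimal sketch of that equivalence, assuming the same sashelp.class rows as above. The variable names 'a' and 'a2' are illustrative only. The second form spells out what the sum statement does implicitly: it retains the variable, starts it at zero, and adds in the same way the SUM function does.

```sas
DATA _NULL_;
  SET sashelp.class (OBS=3);

  /* Sum statement: 'a' is retained and starts at 0 */
  a + age;

  /* The same behavior written out explicitly, for comparison */
  RETAIN a2 0;
  a2 = SUM(a2, age);

  PUT a= a2=;
RUN;
```

Both variables should show the same running totals (14, 27, 40) in the log.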
13
DO Loop General Forms
  • Indexed DO loop
        DO variable = spec-1 <, ... spec-n>;
    where each spec-n is of the form
        start <TO stop> <BY increment>
              <WHILE(expression) | UNTIL(expression)>
  • Conditional DO loops
        DO WHILE (condition);
        DO UNTIL (condition);
  • Special
        DO OVER arrayname;

14
Indexed DO Loop
        DO variable = spec-1 <, ... spec-n>;
          <SAS statements>
        END;
  • Where each spec-n is of the form
        start <TO stop> <BY increment>
              <WHILE(expression) | UNTIL(expression)>

15
Indexed DO Loop
    DATA _NULL_;
      DO i = 1, 2, -7, 46;
        PUT i=;
      END;
    RUN;

i=1
i=2
i=-7
i=46
16
Indexed DO Loop
    DATA _NULL_;
      DO i = 1 TO 10;
        PUT i=;
      END;
    RUN;

i=1 i=2 i=3 i=4 i=5 i=6 i=7 i=8 i=9 i=10
17
Indexed DO Loop
You can modify the index variable, if desired

    DATA _NULL_;
      DO i = 1 TO 10;
        PUT i=;
        i = i + 2;
      END;
    RUN;

i=1 i=4 i=7 i=10
18
Indexed DO Loop
    DATA test;
      DO i = 1 TO 10;
        x = i * 4;
        OUTPUT;
      END;
    RUN;
    PROC PRINT DATA=test;
    RUN;

Obs     i     x
  1     1     4
  2     2     8
  3     3    12
  4     4    16
  5     5    20
  6     6    24
  7     7    28
  8     8    32
  9     9    36
 10    10    40

19
Indexed DO Loop

    DATA _NULL_;
      DO i = 1 TO 20 BY 2;
        PUT i=;
      END;
    RUN;

i=1 i=3 i=5 i=7 i=9 i=11 i=13 i=15 i=17 i=19
20
Indexed DO Loop
    DATA _NULL_;
      DO i = 20 TO 1 BY -2;
        PUT i=;
      END;
    RUN;

i=20 i=18 i=16 i=14 i=12 i=10 i=8 i=6 i=4 i=2
21
Indexed DO Loop
    DATA _NULL_;
      DO i = 1 BY 2 TO 20;   /* BY may precede TO */
        PUT i=;
      END;
    RUN;

i=1 i=3 i=5 i=7 i=9 i=11 i=13 i=15 i=17 i=19
22
Indexed DO Loop
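    DATA _NULL_;
      a = 0;
      b = 10;
      c = 2;
      DO i = a TO b BY c;
        PUT i=;
        /* No effect on the loop: start, stop, and
           increment were evaluated before it began */
        a = 22;
        b = 33;
        c = 44;
      END;
    RUN;

i=0 i=2 i=4 i=6 i=8 i=10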
23
Conditional DO Loops
  • Condition is checked at top of loop
        DO WHILE (condition);
          <SAS statements>
        END;
  • Condition is checked at bottom of loop
        DO UNTIL (condition);
          <SAS statements>
        END;
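
The difference between the two tests matters most when the condition is already decided before the loop starts. The sketch below is an illustration with made-up starting values, not code from the original slides. The WHILE body is skipped entirely, while the UNTIL body still runs once.

```sas
DATA _NULL_;
  /* The WHILE test fails immediately, so the body never runs */
  i = 10;
  DO WHILE (i LT 3);
    PUT 'WHILE body ' i=;
    i = i + 1;
  END;

  /* The UNTIL test is already true, but it is only checked
     after the body has run once */
  j = 10;
  DO UNTIL (j GE 3);
    PUT 'UNTIL body ' j=;
    j = j + 1;
  END;
RUN;
```

Only the 'UNTIL body' line should appear in the log.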

24
WHILE Conditional DO Loop
    DATA _NULL_;
      i = 0;
      DO WHILE (i LT 3);
        i = i + 1;
        PUT i=;
      END;
    RUN;

i=1 i=2 i=3
25
UNTIL Conditional DO Loop
    DATA _NULL_;
      i = 0;
      DO UNTIL (i GE 3);
        i = i + 1;
        PUT i=;
      END;
    RUN;

i=1 i=2 i=3
26
Indexed DO Loop WHILE/UNTIL
    DATA _NULL_;
      x = 0;
      DO i = 1 BY 1 WHILE (x LT 50);
        x = i * 10;
        PUT i= x=;
      END;
    RUN;

i=1 x=10
i=2 x=20
i=3 x=30
i=4 x=40
i=5 x=50
27
Indexed DO Loop Multiple Specs
    DATA _NULL_;
      DO i = -4, 21 TO 40 BY 3 WHILE(i<29), 7, 100,
             61 TO 66 BY 2;
        PUT i=;
      END;
    RUN;

i=-4 i=21 i=24 i=27 i=7 i=100 i=61 i=63 i=65
28
DO Loop Auxiliary Statements
  • CONTINUE: Skips the rest of the current
    iteration and begins the next iteration of the
    loop.
  • LEAVE: Exits the loop.

29
CONTINUE Statement
    DATA _NULL_;
      DO i = 1 TO 10;
        IF (i EQ 5) THEN CONTINUE;
        PUT i=;
      END;
    RUN;

i=1 i=2 i=3 i=4 i=6 i=7 i=8 i=9 i=10
30
CONTINUE Statement
    DATA _NULL_;
      i = 0;
      DO UNTIL (i GE 10);
        i = i + 1;
        IF (i=5) THEN CONTINUE;
        PUT i=;
      END;
    RUN;

i=1 i=2 i=3 i=4 i=6 i=7 i=8 i=9 i=10
31
LEAVE Statement
    DATA _NULL_;
      DO i = 1 TO 10;
        IF (i EQ 5) THEN LEAVE;
        PUT i=;
      END;
    RUN;

i=1 i=2 i=3 i=4
32
LEAVE Statement
    DATA _NULL_;
      DO i = 1 BY 1;
        IF (i EQ 5) THEN LEAVE;
        PUT i=;
      END;
    RUN;

i=1 i=2 i=3 i=4
33
LEAVE Statement
    DATA _NULL_;
      x = 0;
      DO WHILE (1 = 1);
        x = x + 100;
        IF (x GE 500) THEN LEAVE;
        PUT x=;
      END;
    RUN;

x=100 x=200 x=300 x=400
34
LEAVE Statement
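Caution: The LEAVE statement stops the processing
of the current DO loop or SELECT group only. In
nested loops, it exits the innermost loop that
contains it.

    DATA _NULL_;
      DO UNTIL(whatever);
        DO UNTIL(something);
          <SAS statements>
          IF (z>10) THEN LEAVE;   /* exits the inner loop only */
        END;
      END;
    RUN;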
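
A small runnable illustration of that caution, using made-up index names 'outer' and 'inner' rather than anything from the slides. LEAVE ends the inner loop each time, and the outer loop carries on.

```sas
DATA _NULL_;
  DO outer = 1 TO 2;
    DO inner = 1 TO 5;
      IF (inner EQ 3) THEN LEAVE;   /* exits the inner loop only */
      PUT outer= inner=;
    END;
    /* Still inside the outer loop here */
    PUT 'Left inner loop: ' outer=;
  END;
RUN;
```

For each value of 'outer', the log should show inner=1 and inner=2 followed by the 'Left inner loop' line.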
35
DO Loops With Arrays
  • In the next few slides, our data file is
    named one, with variables 'a', 'b', and 'c'

    DATA one;
      INPUT a b c;
      DATALINES;
    1 2 3
    11 22 33
    ;
    PROC PRINT;
    RUN;

Obs     a     b     c
  1     1     2     3
  2    11    22    33
36
DO Loops With Arrays
Here we create three new variables

    DATA test;
      SET one;
      x = a + 100;
      y = b + 100;
      z = c + 100;
      PUT a= b= c= @17 x= y= z=;
    RUN;

a=1 b=2 c=3     x=101 y=102 z=103
a=11 b=22 c=33  x=111 y=122 z=133
37
DO Loops With Arrays
  • We created 3 new variables, based on the original
    3 variables.
  • The coding was identical, except for the variable
    names.
  • We can automate that process by using arrays
    and a DO loop.

38
DO Loops With Arrays
    DATA test;
      SET one;
      ARRAY jack (3) a b c;
      ARRAY jill (3) x y z;
      DO i = 1 TO 3;
        jill[i] = jack[i] + 100;
      END;
      PUT a= b= c= @17 x= y= z=;
    RUN;

a=1 b=2 c=3     x=101 y=102 z=103
a=11 b=22 c=33  x=111 y=122 z=133
39
DO Loops With Arrays
  • Each ARRAY statement creates/defines an array
    that represents 3 variables.
  • If SAS sees jack[2] (for example), it substitutes
    the 2nd variable in the 'jack' array, that is,
    variable 'b'.
  • Likewise, for jill[2] it substitutes variable
    'y'.
  • Thus
        jill[2] = jack[2] + 100;
    is equivalent to
        y = b + 100;

40
Conditional DO Loops With Arrays
    DATA test;
      SET one;
      ARRAY jack (3) a b c;
      ARRAY jill (3) x y z;
      i = 0;
      DO WHILE (i LT 3);
        i = i + 1;
        jill[i] = jack[i] + 100;
      END;
      PUT a= b= c= @17 x= y= z=;
    RUN;

a=1 b=2 c=3     x=101 y=102 z=103
a=11 b=22 c=33  x=111 y=122 z=133
41
DO OVER Loop With Arrays
    DATA test;
      SET one;
      ARRAY jack a b c;
      ARRAY jill x y z;
      DO OVER jack;
        jill = jack + 100;
      END;
      PUT a= b= c= @17 x= y= z=;
    RUN;

a=1 b=2 c=3     x=101 y=102 z=103
a=11 b=22 c=33  x=111 y=122 z=133
42
DO OVER Loop With Arrays
  • DO OVER is an undocumented SAS feature, so use
    it with caution (possibly last documented in
    Version 6).
  • You must omit the count (3) portion of the ARRAY
    statement.
  • In the DO loop, you do not use any indexing. SAS
    does all the indexing for you, processing one
    variable at a time from each array, until it has
    exhausted all the variables in the array
    specified in the DO statement.
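
For comparison, here is a sketch of a documented way to get similar "no hard-coded count" behavior: declare the arrays with [*] and loop to DIM(arrayname). This is offered as an alternative, not as what the slides recommend. It recreates the 'one' data file so that it can be run on its own.

```sas
DATA one;
  INPUT a b c;
  DATALINES;
1 2 3
11 22 33
;

DATA test;
  SET one;
  ARRAY jack [*] a b c;     /* [*]: SAS counts the variables */
  ARRAY jill [*] x y z;
  DO i = 1 TO DIM(jack);    /* DIM returns the number of elements */
    jill[i] = jack[i] + 100;
  END;
  PUT a= b= c= @17 x= y= z=;
RUN;
```

The log lines should match those of the indexed and DO OVER versions above.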

43
Using DO Loops For Input
For data, we will use sashelp.class, stored in
the SAS Work library as 'class'

Obs    Name       Sex    Age    Height    Weight
  1    Alfred      M      14     69.0     112.5
  2    Alice       F      13     56.5      84.0
  3    Barbara     F      13     65.3      98.0
  4    Carol       F      14     62.8     102.5
  5    Henry       M      14     63.5     102.5
  6    James       M      12     57.3      83.0
  7    Jane        F      12     59.8      84.5
  8    Janet       F      15     62.5     112.5
...
 17    Ronald      M      15     67.0     133.0
 18    Thomas      M      11     57.5      85.0
 19    William     M      15     66.5     112.0
44
Using DO Loops For Input
  • The 'traditional' way to read data uses the
    implied data step loop ('automatic loop' or
    'observation loop')

    DATA test;
      SET class;
    RUN;

NOTE: There were 19 observations read from the
      data set WORK.CLASS.
NOTE: The data set WORK.TEST has 19 observations
      and 5 variables.
45
Using DO Loops For Input
  • Dorfman: "The automatic loop is engraved in the
    SAS usage mentality to such an extent that it has
    attained an almost religious status. ... As a
    result, almost every time when a file ... has to
    be processed, an attempt is subconsciously made
    to use the implied loop, whatever it takes.
  • Such a rigid approach is practically tantamount
    to forcing a program into the fixed cage of an
    existing programming construct. But programming
    is not meant to be this way. It makes sense to
    choose the tool best fitting the task, rather
    than tweaking the task to fit the tool."
  • (excerpt from The Magnificent DO)

46
Using DO Loops For Input
  • Instead, we can use a DO loop in the data step

    DATA test;
      DO UNTIL(eof);
        SET class END=eof;
        OUTPUT;
      END;
    RUN;

NOTE: There were 19 observations read from the
      data set WORK.CLASS.
NOTE: The data set WORK.TEST has 19 observations
      and 5 variables.

Each execution of SET reads another record.
47
Using DO Loops For Input
  • The obvious question: WHY use a DO loop?
  • To help answer that, notice how many times SAS
    processed the data step

    DATA test;
      PUT _N_=;
      DO UNTIL(eof);
        SET class END=eof;
        OUTPUT;
      END;
    RUN;

_N_=1
_N_=2
48
Using DO Loops For Input
  • Because the data step makes only one iteration
    to read the data, this method is ideal for tasks
    that require:
  • Retaining the values of variables.
  • Performing actions before and after reading the
    data.
  • Reading multiple files separately in one data
    step.
  • Performing break-event processing (the 'DoW
    loop').

49
Using DO Loops For Input
  • Task 1
  • Write a header containing today's date.
  • Read the data, select only the 'age=14' records,
    and print 'name'.
  • Write a trailer containing today's date and a
    count of the selected records.

50
Using DO Loops For Input
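  • Here is an example of the traditional approach

    DATA _NULL_;
      SET class END=eof;
      RETAIN count 0 date;
      IF (_N_ EQ 1) THEN DO;
        date = TODAY();
        PUT date MMDDYY10.;
      END;
      /* The trailer is written before the last record is
         tested. That is correct here only because the last
         record (William, age 15) is not selected. */
      IF (eof) THEN PUT date MMDDYY10. @15 count=;
      IF (age NE 14) THEN DELETE;
      PUT name=;
      count = count + 1;
    RUN;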
51
Using DO Loops For Input
  • What's 'wrong' with the traditional approach?
  • Must retain 'count' and 'date', otherwise they
    get reset to missing on each iteration of the
    data step.
  • For each record, must evaluate the header
    condition and the trailer condition.
  • The logic flow is not simple.

These problems arise because all actions, even if
needed only once, must be coded inside the
implied loop.
52
Using DO Loops For Input
  • Using a DO loop avoids these problems

    DATA _NULL_;
      date = TODAY();
      PUT date MMDDYY10.;
      count = 0;
      DO UNTIL(eof);
        SET class END=eof;
        IF (age NE 14) THEN CONTINUE;
        PUT name=;
        count = count + 1;
      END;
      PUT date MMDDYY10. @15 count=;
      STOP;
    RUN;
53
Using DO Loops For Input
  • Our program is straightforward:
  • compute 'date' and 'count' and print the header
  • read the data and do the required processing
  • print the trailer
  • We did not have to
  • evaluate any begin or end conditions
  • retain 'date' or 'count', since there was only
    one pass through the data step

54
Multiple Input Files
  • We next consider a task involving two input
    files. Our input files are named file1 and file2

    DATA file1;
      INPUT score1;
      DATALINES;
    12
    26
    37
    49
    ;

    DATA file2;
      INPUT score2;
      DATALINES;
    10
    20
    30
    ;
55
Multiple Input Files
  • Task 2
  • Print a beginning banner.
  • In file1, compute the average of 'score1'.
    (that average is 31)
  • Print that average and the file1 record count.
  • In file2, add the above average to 'score2' and
    print each new value as 'score3'.
  • Print an ending banner.

56
Multiple Input Files: DO Loop

    DATA _NULL_;
      PUT 'STARTING';
      count = 0;
      total = 0;
      DO UNTIL (eof1);
        SET file1 END=eof1;
        IF MISSING(score1) THEN CONTINUE;
        count = count + 1;
        total = total + score1;
      END;
      avg = total / count;
      PUT count= avg=;

      DO UNTIL (eof2);
        SET file2 END=eof2;
        score3 = score2 + avg;
        PUT score3=;
      END;
      PUT 'DONE';
      STOP;
    RUN;

STARTING
count=4 avg=31
score3=41
score3=51
score3=61
DONE
57
Multiple Input Files: Traditional

    DATA _NULL_;
      SET file1 (IN=in1)
          file2 (IN=in2) END=eof;
      RETAIN total count 0 avg;
      IF (_N_=1) THEN DO;
        PUT 'STARTING';
      END;
      IF (in1) THEN DO;
        /* Skip missing scores before accumulating */
        IF MISSING(score1) THEN DELETE;
        total = total + score1;
        count = count + 1;
      END;

      obs2 + in2;          /* counts the file2 records read */
      IF (in2 AND obs2=1) THEN DO;
        avg = total / count;
        PUT count= avg=;
      END;
      IF (in2) THEN DO;
        score3 = score2 + avg;
        PUT score3=;
      END;
      IF (eof) THEN PUT 'DONE';
    RUN;
58
Using DO Loops For Input
  • To mimic the error handling of the Data step
    implied loop

    DATA _NULL_;
      DO _N_ = 1 BY 1 UNTIL(eof);
        _ERROR_ = 0;
        SET class END=eof;
        <SAS statements>
        IF _ERROR_ THEN PUT _ALL_;
      END;
    RUN;
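
A hedged, runnable instance of that template, assuming a division by zero as the error source. The 'ratio' variable and its formula are hypothetical and exist only to trigger _ERROR_. The example reads sashelp.class directly so that it does not depend on the WORK copy.

```sas
DATA _NULL_;
  DO _N_ = 1 BY 1 UNTIL(eof);
    _ERROR_ = 0;                    /* reset, as the implied loop would */
    SET sashelp.class END=eof;
    ratio = weight / (age - 14);    /* division by zero when age = 14 */
    IF _ERROR_ THEN PUT _ALL_;      /* dump only the offending records */
  END;
RUN;
```

Without the reset at the top of the loop, _ERROR_ would stay at 1 after the first failure, and every later record would be dumped as well.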

59
The DoW Loop
  • Named after Ian Whitlock (the 'renowned Master of
    the SAS Universe') and perhaps Don Henderson.
  • Uses DO loop(s) to read data for tasks which
    require break-event processing, such as
  • BY-group processing (FIRST. and LAST.)
  • checking for a specific value or a missing
    value

60
The DoW Loop
  • Basic structure

    Data ... ;
      <Stuff done before break-event> ;
      Do <Index Specs> Until (Break-Event) ;
        Set A ;
        <Stuff done for each record> ;
      End ;
      <Stuff done after break-event...> ;
    Run ;
61
The DoW Loop
  • Dorfman: "The intent of organizing such a
    structure is to achieve logical isolation of
    instructions executed between two successive
    break-events from actions performed before and
    after a break-event, and to do it in the most
    programmatically natural manner."

62
The DoW Loop
  • Our data file is named base, has variables 'id'
    and 'score', and is sorted by 'id'

    DATA base;
      INPUT id $ score;
      DATALINES;
    a 1
    a 2
    b 3
    b 4
    b 5
    ;
    PROC SORT DATA=base;
      BY id;
    PROC PRINT;
    RUN;

Obs    id    score
 1     a       1
 2     a       2
 3     b       3
 4     b       4
 5     b       5
63
The DoW Loop
  • Task 3: Compute the mean of the 'score'
    variable for each 'id' group.

Obs    id    score
 1     a       1
 2     a       2
 3     b       3
 4     b       4
 5     b       5
64
Traditional approach

id    score
a       1
a       2
b       3
b       4
b       5

    DATA new;
      SET base;
      BY id;
      RETAIN count total;
      IF FIRST.id THEN DO;
        count = 0;
        total = 0;
      END;
      count = count + 1;
      total = total + score;
      IF LAST.id THEN DO;
        mean = total / count;
        OUTPUT;
      END;
    PROC PRINT DATA=new;
    RUN;

Obs    id    score    count    total    mean
 1     a       2        2        3       1.5
 2     b       5        3       12       4.0
65
DoW loop approach

id    score
a       1
a       2
b       3
b       4
b       5

    DATA new;
      count = 0;
      total = 0;
      DO UNTIL (LAST.id);
        SET base;
        BY id;
        count = count + 1;
        total = total + score;
      END;
      mean = total / count;
    PROC PRINT DATA=new;
    RUN;

Obs    count    total    id    score    mean
 1       2        3      a       2       1.5
 2       3       12      b       5       4.0
66
The DoW Loop
  • Dorfman: "What makes the DOW-loop special? It
    is all in the logic. The construct
    programmatically separates the before-, during-,
    and after-group actions in the same manner and
    sequence as does the stream-of-the-consciousness
    logic:
  • (continued next slide)

67
The DoW Loop
  • (continued)
  • If an action is to be done before the group is
    processed, simply code it before the DOW-loop.
    Note that it is unnecessary to predicate this
    action by the IF FIRST.ID condition.
  • If it is to be done with each record, code it
    inside the loop.
  • If it has to be done after the group, like
    computing an average and outputting summary
    values, code it after the DOW-loop. Note that it
    is unnecessary to predicate this action by the IF
    LAST.ID condition."
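
The three zones described above can be made visible with a few PUT statements. This is an illustrative sketch using the 'base' data file, which is recreated here so the example runs on its own. The message texts are made up.

```sas
DATA base;
  INPUT id $ score;
  DATALINES;
a 1
a 2
b 3
b 4
b 5
;

DATA _NULL_;
  PUT 'Before group';               /* no IF FIRST.id needed */
  DO UNTIL (LAST.id);
    SET base;
    BY id;
    PUT '  In group: ' id= score=;  /* runs for every record */
  END;
  PUT 'After group: ' id=;          /* no IF LAST.id needed */
RUN;
```

Expect one before/in/after sequence per 'id' value. A final 'Before group' line also appears, because the data step starts a third iteration before SET discovers the end of the file (the same effect as the extra 'Top' line in the Preliminaries).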

68
The DoW Loop
  • We can "improve" the previous DoW Loop approach
    by changing how we write the DO statement . . .

69
DoW loop approach (as before)

    DATA new;
      count = 0;
      total = 0;
      DO UNTIL (LAST.id);
        SET base;
        BY id;
        count = count + 1;
        total = total + score;
      END;
      mean = total / count;
    PROC PRINT DATA=new;
    RUN;

Obs    count    total    id    score    mean
 1       2        3      a       2       1.5
 2       3       12      b       5       4.0
70
DoW loop approach (improved)

    DATA new;
      DO n = 1 BY 1 UNTIL (LAST.id);
        SET base;
        BY id;
        total = SUM(total, score);
      END;
      mean = total / n;
    PROC PRINT DATA=new;
    RUN;

Obs    n    id    score    total    mean
 1     2    a       2        3       1.5
 2     3    b       5       12       4.0
71
DoW loop approach (further improved)

    DATA new;
      DO _N_ = 1 BY 1 UNTIL (LAST.id);
        SET base;
        BY id;
        total = SUM(total, score);
      END;
      mean = total / _N_;
    PROC PRINT DATA=new;
    RUN;

Obs    id    score    total    mean
 1     a       2        3       1.5
 2     b       5       12       4.0
72
The DoW Loop
  • Task 4: Create a table with the mean 'score'
    for each 'id' group merged in

Obs    id    score
 1     a       1
 2     a       2
 3     b       3
 4     b       4
 5     b       5

Obs    id    score    mean
 1     a       1       1.5
 2     a       2       1.5
 3     b       3       4.0
 4     b       4       4.0
 5     b       5       4.0
73
The DoW Loop
  • The 'traditional' approach would be to use
    one data step, as we did earlier, to create
    an intermediate file containing the
    averages

id    mean
a     1.5
b     4.0

and then use a second data step to merge that
file with the original file.
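
A sketch of that two-step approach, under the assumption that the intermediate file is called 'means' (a name chosen here for illustration; the slides do not name it). The first step reuses the traditional FIRST./LAST. logic from Task 3, and the second merges the result back by 'id'.

```sas
DATA base;
  INPUT id $ score;
  DATALINES;
a 1
a 2
b 3
b 4
b 5
;

/* Step 1: one record per id, holding the group mean */
DATA means (KEEP=id mean);
  SET base;
  BY id;
  RETAIN count total;
  IF FIRST.id THEN DO;
    count = 0;
    total = 0;
  END;
  count = count + 1;
  total = total + score;
  IF LAST.id THEN DO;
    mean = total / count;
    OUTPUT;
  END;
RUN;

/* Step 2: merge the means back onto the detail records */
DATA new;
  MERGE base means;
  BY id;
RUN;

PROC PRINT DATA=new;
RUN;
```

The printed table should have the 'id', 'score', and 'mean' columns shown for Task 4.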
74
The DoW Loop
  • OR
  • We could do it all in one data step using
  • the DoW loop!

75
DoW loop approach

id    score
a       1
a       2
b       3
b       4
b       5

    DATA new;
      DO _N_ = 1 BY 1 UNTIL (LAST.id);
        SET base;
        BY id;
        total = SUM(total, score);
      END;
      mean = total / _N_;
      DO UNTIL (LAST.id);
        SET base;
        BY id;
        OUTPUT;
      END;
    PROC PRINT DATA=new;
    RUN;

Obs    id    score    total    mean
 1     a       1        3       1.5
 2     a       2        3       1.5
 3     b       3       12       4.0
 4     b       4       12       4.0
 5     b       5       12       4.0
76
  • How It Works
  • This data step has two SET statements, and each
    one reads the same file.
  • Each SET statement reads its own virtual copy
    of the file. The two virtual copies are
    completely independent of each other.
  • Likewise, each SET statement uses its own file
    pointer to mark where it stops reading in its
    virtual copy, and these file pointers are
    independent of each other.
  • (continued next slide)

77
  • The first DO loop is similar to our earlier
    program. It reads all the id='a' records and
    computes a running total for 'score', storing it
    in the 'total' variable. SAS sets a file pointer
    to remember where it stopped reading in this
    file.
  • Next, the 'mean' variable is computed, using the
    values for 'total' and '_N_' obtained in the
    first DO loop. This value for 'mean' will be
    used in the second DO loop.
  • The second DO loop then reads its copy of the
    base file (beginning with case 1). For each case
    it reads, it does an OUTPUT to the new file.
    Each case will contain the 'mean' variable that
    was computed above. The DO loop continues until
    it has read and output all the records for
    id='a'.
  • After all the id='a' cases have been processed,
    the second DO loop stops, and SAS sets a file
    pointer (a different pointer, independent of the
    one used in the first DO loop) to remember where
    it stopped reading in this second copy of the
    base file. (continued next slide)

78
  • SAS reaches the end of the data step.
  • Now SAS goes through the data step again.  As
    such, it resets 'total' to missing (so there is
    no need to manually reset it).
  • In the first DO loop, SAS begins reading
    according to where the pointer was set previously
    in that first DO loop. Thus, it starts with the
    first id='b' record. Etc.
  • The 'mean' variable is computed (for the id='b'
    records).
  • The second DO loop begins reading according to
    where the pointer was set previously in the
    second DO loop (the id='b' records). Each case
    is output, and it includes the 'mean' computed
    above (which is the average for the id='b'
    records).
  • (continued next slide)

79
  • SAS reaches the end of the data step.
  • SAS goes through the data step a third time. 
  • This time, the SET statement in the first DO loop
    encounters the end of its file.  The data step
    stops processing.
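
The walkthrough above can be followed in the log with an instrumented version of the step. This is a sketch: the PUT messages are made up, and the first loop uses an ordinary index 'n' instead of _N_ so that the automatic _N_ still reports the data step iteration. As a side effect, 'n' becomes a variable in the output file.

```sas
DATA base;
  INPUT id $ score;
  DATALINES;
a 1
a 2
b 3
b 4
b 5
;

DATA new;
  PUT 'Data step iteration ' _N_;
  DO n = 1 BY 1 UNTIL (LAST.id);      /* first pass: accumulate */
    SET base;
    BY id;
    total = SUM(total, score);
    PUT '  loop 1 read  ' id= score= total=;
  END;
  mean = total / n;
  DO UNTIL (LAST.id);                 /* second pass: write out */
    SET base;
    BY id;
    OUTPUT;
    PUT '  loop 2 wrote ' id= score= mean=;
  END;
RUN;
```

The log should show the two loops alternating group by group, followed by a third 'Data step iteration' line with no reads after it, which is where the first SET meets the end of its file.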

80
The DoW Loop
  • We can "improve" the previous DoW Loop approach
    by changing how we write the second DO statement
    . . .

81
DoW loop approach (as before)

    DATA new;
      DO _N_ = 1 BY 1 UNTIL (LAST.id);
        SET base;
        BY id;
        total = SUM(total, score);
      END;
      mean = total / _N_;
      DO UNTIL (LAST.id);
        SET base;
        BY id;
        OUTPUT;
      END;
    PROC PRINT DATA=new;
    RUN;

Obs    id    score    total    mean
 1     a       1        3       1.5
 2     a       2        3       1.5
 3     b       3       12       4.0
 4     b       4       12       4.0
 5     b       5       12       4.0
82
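DoW loop approach (improved)

    DATA new;
      DO _N_ = 1 BY 1 UNTIL (LAST.id);
        SET base;
        BY id;
        total = SUM(total, score);
      END;
      mean = total / _N_;
      /* The stop value _N_ (the group's record count) is
         evaluated once, before this loop starts, so no BY
         statement or LAST.id test is needed here */
      DO _N_ = 1 TO _N_;
        SET base;
        OUTPUT;
      END;
    PROC PRINT DATA=new;
    RUN;

Obs    id    score    total    mean
 1     a       1        3       1.5
 2     a       2        3       1.5
 3     b       3       12       4.0
 4     b       4       12       4.0
 5     b       5       12       4.0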
83
Summary
  • The DO loop is a useful tool for performing
    various repetitive tasks in SAS.
  • In certain situations when reading data, it
    provides an alternative and perhaps better method
    than using the implied data step loop.
  • The DoW loop can be a valuable tool for
    performing break-event processing.

84
  • John Matro
  • Virginia Commonwealth University
  • jmatro@vcu.edu
  • SAS and all other SAS Institute Inc. product or
    service names are registered trademarks or
    trademarks of SAS Institute Inc. in the USA and
    other countries. ® indicates USA registration.