Concatenating SAS Data Sets - PowerPoint PPT Presentation

Slides: 18
Provided by: BLU54

1
Concatenating SAS Data Sets
  • STT 305

2
Concatenating
  • Concatenation: vertical combination, or
    chaining, of data sets.
  • General form of the data step:
  • data data-set-name;
  • set data-set1 data-set2 ... data-setn;
  • SAS statements
  • run;
  • Reads all records from the data sets named in the
    set statement, in the order listed, and writes them
    to the new data set.

3
Example
  • The following code concatenates two simple data
    sets:
  • data newhires;
  • set stt305.newemp1 stt305.newemp2;
  • run;
  • (Tables shown: newemp1, newemp2)
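
To try the example without access to the stt305 library, the sketch below builds two small stand-in data sets in the WORK library and concatenates them. The variable names (Name, Gender, Code) and all data values are assumptions made for illustration; they are not the contents of the course data sets.

```sas
/* Hypothetical stand-in for stt305.newemp1 (made-up values). */
data work.newemp1;
   input Name $ Gender $ Code $;
   datalines;
ADAMS M NA1
BAKER F FA1
;
run;

/* Hypothetical stand-in for stt305.newemp2 (made-up values). */
data work.newemp2;
   input Name $ Gender $ Code $;
   datalines;
CLARK M FA2
TORRES F NA2
;
run;

/* Concatenate: every record of newemp1, then every record of newemp2. */
data work.newhires;
   set work.newemp1 work.newemp2;
run;

proc print data=work.newhires;
run;
```

The printed data set should hold four observations, with the two from newemp1 listed before the two from newemp2.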

4
The Concatenation Process
  • data newhires;
  • set stt305.newemp1 stt305.newemp2;
  • run;
  • (Diagram: the PDV, or program data vector, alongside
    newemp1 and newemp2)
  • First pass: no new variables.

5
The Concatenation Process
  • data newhires;
  • set stt305.newemp1 stt305.newemp2;
  • run;
  • (Diagram: the PDV alongside newemp1 and newemp2)
  • Subsequent passes: records are read from the 1st
    data set, then from the next data set.

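
One way to watch the passes happen is to write the iteration counter and a variable to the log on each pass. The sketch below is only an illustration: the data sets work.first and work.second and their values are made up, and the put statement is a tracing aid added here, not part of the original slides.

```sas
/* Two tiny hypothetical data sets with a single variable each. */
data work.first;
   input Name $;
   datalines;
ADAMS
BAKER
;
run;

data work.second;
   input Name $;
   datalines;
CLARK
;
run;

/* Each pass of the data step reads one record into the PDV.      */
/* The put statement writes the pass number and Name to the log,  */
/* so the reading order (all of first, then second) is visible.   */
data work.traced;
   set work.first work.second;
   put _n_= Name=;
run;
```

The log should show three passes, with the records from work.first appearing before the record from work.second.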
6
The Result
7
Concatenating with Different Variable Lists
  • Suppose the data sets in the previous example
    contained the same variables, but under
    different names.
  • (Tables shown: newempA, newempB)

8
Concatenating with Different Variable Lists
  • The code
  • data newhires;
  • set stt305.newempA stt305.newempB;
  • run;
  • PDV at compilation: the second data set adds a
    new variable.

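
The effect of mismatched names can be reproduced with a small sketch. It assumes the job-code variable is called Code in one data set and JCode in the other; those names and the data values are illustrative assumptions, not the actual course data.

```sas
/* Hypothetical stand-in for newempA: job code stored as Code. */
data work.newempA;
   input Name $ Gender $ Code $;
   datalines;
ADAMS M NA1
;
run;

/* Hypothetical stand-in for newempB: job code stored as JCode. */
data work.newempB;
   input Name $ Gender $ JCode $;
   datalines;
TORRES F NA2
;
run;

/* The PDV receives Name, Gender, and Code from newempA at compilation, */
/* and newempB then adds JCode as a fourth variable.                    */
data work.newhires;
   set work.newempA work.newempB;
run;

proc print data=work.newhires;
run;
```

In the result, records from newempA have a missing JCode and records from newempB have a missing Code, which is usually not what is wanted.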
9
The Result
10
Re-naming Variables
  • The RENAME= data set option:
  • SAS-data-set(RENAME=(old-name=new-name))
  • One possible revision of the previous code:
  • data newhires;
  • set stt305.newempA stt305.newempB(RENAME=(JCode=Code));
  • run;
  • Re-naming is applied at compilation.
  • RENAME is also available as a data step statement.
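
The two ways of renaming can be compared in a short sketch. As before, the data sets, the variable names Code and JCode, and the values are hypothetical. The first variant renames on input with the data set option; the second uses the RENAME statement in a separate step, since the statement applies to the variables written to the output data set.

```sas
/* Hypothetical input data sets with mismatched job-code names. */
data work.newempA;
   input Name $ Gender $ Code $;
   datalines;
ADAMS M NA1
;
run;

data work.newempB;
   input Name $ Gender $ JCode $;
   datalines;
TORRES F NA2
;
run;

/* Variant 1: rename on input with the RENAME= data set option. */
data work.newhires;
   set work.newempA work.newempB(rename=(JCode=Code));
run;

/* Variant 2: make a renamed copy with the RENAME statement, */
/* then concatenate the copy.                                */
data work.newempB_fixed;
   set work.newempB;
   rename JCode=Code;
run;

data work.newhires2;
   set work.newempA work.newempB_fixed;
run;
```

Both variants should give a data set with the three variables Name, Gender, and Code and no missing job codes.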

11
The Result
12
Interleaving
  • Suppose the data sets were sorted by name, and we
    want to have the final data set also sorted by
    name after putting them together.
  • (Tables shown: sortemp1, sortemp2)

13
Interleaving
  • To accomplish this, use a by statement in
    conjunction with the set statement.
  • data newhires_sorted;
  • set stt305.sortemp1 stt305.sortemp2;
  • by name;
  • run;

14
Interleaving: The Process
  • data newhires_sorted;
  • set stt305.sortemp1 stt305.sortemp2;
  • by name;
  • run;

Pointers are placed in all data sets.
The record with the 1st by value is read.
The pointer is updated.
Same value: the record is read from the 1st data set.

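
The interleaving steps can be sketched end to end, including a tie on the by variable. The data sets, the variable names, and the values below are made up for illustration; the input data sets are sorted first because interleaving requires each one to be in by-variable order.

```sas
/* Hypothetical unsorted inputs; LEE appears in both. */
data work.emp1;
   input Name $ Code $;
   datalines;
TORRES NA2
ADAMS NA1
LEE FA1
;
run;

data work.emp2;
   input Name $ Code $;
   datalines;
LEE FA2
BAKER NA1
;
run;

/* Each input must be sorted by the by variable before interleaving. */
proc sort data=work.emp1 out=work.sortemp1;
   by Name;
run;

proc sort data=work.emp2 out=work.sortemp2;
   by Name;
run;

/* Interleave: records are read in Name order across both data sets. */
data work.newhires_sorted;
   set work.sortemp1 work.sortemp2;
   by Name;
run;

proc print data=work.newhires_sorted;
run;
```

The result should be in Name order, and for the shared name LEE the record from sortemp1 (Code FA1) should come before the one from sortemp2 (Code FA2).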
15
Interleaving: The Result
16
The IN= data set option
  • Suppose in our previous examples that employees
    from data set 1 correspond to main office
    employees and those from data set 2 are branch
    office employees.
  • Can I add a variable during the
    concatenation/interleaving process to note this?
  • SAS-data-set(IN=variable)
  • variable is 1 if the record was read from that
    data set, 0 if not.
  • variable can be any legal variable name not already
    present in the data set.

17
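The IN= data set option
  • Use the following:
  • data newhires_sorted;
  • set stt305.sortemp1(in=in1) stt305.sortemp2;
  • by name;
  • if in1 then office='Main';
  • else office='Branch';
  • run;
  • A length statement may be useful here: without one,
    office takes its length (4) from 'Main', and
    'Branch' is truncated to 'Bran'.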
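
A minimal sketch of the same step with a length statement added is shown below. The stand-in data sets and their values are hypothetical, and the choice of 6 characters is simply the length of the longer value, 'Branch'.

```sas
/* Hypothetical inputs, already in name order. */
data work.sortemp1;
   input name $ code $;
   datalines;
ADAMS NA1
TORRES NA2
;
run;

data work.sortemp2;
   input name $ code $;
   datalines;
BAKER FA1
;
run;

data work.newhires_sorted;
   /* Declare office before its first assignment so it is long */
   /* enough to hold 'Branch'.                                 */
   length office $ 6;
   set work.sortemp1(in=in1) work.sortemp2;
   by name;
   if in1 then office = 'Main';
   else office = 'Branch';
run;

proc print data=work.newhires_sorted;
run;
```

Because the length statement comes first, office becomes the first variable in the output data set. The in1 variable itself is temporary and is not written to newhires_sorted.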