An Introduction to PROC IML - PowerPoint PPT Presentation

Provided by: dand154

1
An Introduction to PROC IML
  • Dan DiPrimeo
  • ASG Users Group

2
What is IML?
  • Interactive
  • Matrix
  • Language

3
What is IML?
  • A powerful, flexible programming language in a
    dynamic, interactive environment.
  • The fundamental object is a data matrix.
  • Use IML interactively (Interactive!) to see
    results immediately, or store statements in a module.
  • Powerful built-in operators to perform matrix
    operations.
  • Data management commands

4
Why IML?
  • Can do graphs and analyses that other SAS modules
    don't readily do, using programming statements that
    translate easily from mathematical and
    statistical notation.
  • A powerful graphics package for scientific
    exploration

5
Goal
  • To get you acquainted with IML via
  • a VERY brief intro
  • An elementary example
  • A real life example
  • Touching on Graphics

6
Define Matrix A
  • proc iml;
  • reset print; /* send results to the .lst file */
  • A = {1 3 5, 4 4 1, 2 2 6}; /* Define A to be a 3 x 3 matrix */
  • A 3 rows 3 cols (numeric)
  • 1 3 5
  • 4 4 1
  • 2 2 6

7
Define Matrix B
  • B = {1 3 4, 3 5 2, 4 2 1};
  • /* Define B to be a 3 x 3 symmetric matrix. B is not
    positive definite: det(B) = -40 */
  • B 3 rows 3 cols (numeric)
  • 1 3 4
  • 3 5 2
  • 4 2 1

8
Define Matrix C
  • C = {2 1 1, 3 4 6}; /* C is a 2 x 3 matrix */
  • C 2 rows 3 cols (numeric)
  • 2 1 1
  • 3 4 6

9
Define Matrix D
  • D = {2 2, 4 5, 5 1}; /* D is a 3 x 2 matrix */
  • D 3 rows 2 cols (numeric)
  • 2 2
  • 4 5
  • 5 1

10
Define Matrix X
  • X = {1 1 0, 1 1 0, 1 1 0, 1 0 1, 1 0 1, 1 0 1};
  • /* X is a design matrix for a one-way ANOVA: 6 obs,
    1 independent variable with 2 levels */
  • X 6 rows 3 cols (numeric)
  • 1 1 0
  • 1 1 0
  • 1 1 0
  • 1 0 1
  • 1 0 1
  • 1 0 1

11
Compute the Inverse Of A
  • inverseA = inv(A); /* compute the inverse of A */
  • INVERSEA 3 rows 3 cols (numeric)
  • -0.5 0.1818182 0.3863636
  • 0.5 0.0909091 -0.431818
  • 0 -0.090909 0.1818182

12
Compute the Transpose Of C
  • transposeC = t(C); /* or, C` */
  • /* compute the transpose of C */
  • TRANSPOSEC 3 rows 2 cols (numeric)
  • 2 3
  • 1 4
  • 1 6

13
Binary Operation A+B
  • AplusB = A + B; /* Add 2 matrices of the same size */
  • APLUSB 3 rows 3 cols (numeric)
  • 2 6 9
  • 7 9 3
  • 6 4 7

14
Binary Operation A*B (Matrix Multiplication)
  • AtimesB = A * B; /* Matrix multiplication */
  • ATIMESB 3 rows 3 cols (numeric)
  • 30 28 15
  • 20 34 25
  • 32 28 18
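
The * operator above is true matrix multiplication, so it also applies to conformable matrices of different shapes, while # multiplies element by element. A minimal sketch using the A, B, C, and D defined on the earlier slides (the result names AelemB and CtimesD are illustrative, not from the original program):

```sas
proc iml;
  A = {1 3 5, 4 4 1, 2 2 6};
  B = {1 3 4, 3 5 2, 4 2 1};
  C = {2 1 1, 3 4 6};
  D = {2 2, 4 5, 5 1};

  AelemB  = A # B;   /* elementwise product: {1 9 20, 12 20 2, 8 4 6} */
  CtimesD = C * D;   /* (2 x 3) times (3 x 2) gives 2 x 2: {13 10, 52 32} */
  print AelemB, CtimesD;
quit;
```

A * B requires ncol(A) = nrow(B); A # B requires matching dimensions (or a scalar, row, or column operand).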

15
Reduction Operators
  • Asumrows = A[,+];
  • /* Reduction operator: sum across each row */
  • ASUMROWS 3 rows 1 col (numeric)
  • 9
  • 9
  • 10

16
Reduction Operators
  • Asumcols = A[+,];
  • /* Reduction operator: sum down each column */
  • ASUMCOLS 1 row 3 cols (numeric)
  • 7 9 12
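
Sum is only one of the subscript reduction operators. As a small sketch on the same matrix A, the following uses the maximum (<>), minimum (><), and mean (:) operators; the result names are illustrative:

```sas
proc iml;
  A = {1 3 5, 4 4 1, 2 2 6};

  Arowmax  = A[ ,<>];   /* largest element in each row: {5, 4, 6} */
  Acolmin  = A[><, ];   /* smallest element in each column: {1 2 1} */
  Acolmean = A[:, ];    /* mean of each column */
  Atotal   = A[+];      /* sum of all elements: 28 */
  print Arowmax Acolmin Acolmean Atotal;
quit;
```

The operator's position in the subscript decides the direction: in the column position it reduces across columns (one value per row), and in the row position it reduces down rows (one value per column).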

17
Catenating Matrices
  • AnextB = A || B;
  • /* Put 2 matrices side by side */
  • ANEXTB 3 rows 6 cols (numeric)
  • 1 3 5 1 3 4
  • 4 4 1 3 5 2
  • 2 2 6 4 2 1

18
Catenating Matrices
  • AtopB = A // B;
  • /* Put 2 matrices on top of each other */
  • ATOPB 6 rows 3 cols (numeric)
  • 1 3 5
  • 4 4 1
  • 2 2 6
  • 1 3 4
  • 3 5 2
  • 4 2 1

19
Computing Eigenvalues
  • eigvalsA = eigval(A);
  • /* Eigenvalues of A. A is not symmetric, so the result
    has 2 columns: real parts, then imaginary parts */
  • EIGVALSA 3 rows 2 cols (numeric)
  • 9.4488409 0
  • 3.0686516 0
  • -1.517492 0

20
Computing Eigenvalues
  • eigvalsB = eigval(B);
  • /* Eigenvalues of B (symmetric, so a single column) */
  • EIGVALSB 3 rows 1 col (numeric)
  • 8.5572316
  • 1.519351
  • -3.076583

21
Diagonal Operators
  • diagA = diag(A);
  • /* Change non-diagonal elements of A to 0, keep
    diagonals as they are */
  • DIAGA 3 rows 3 cols (numeric)
  • 1 0 0
  • 0 4 0
  • 0 0 6

22
Diagonal Operators
  • vdiagA = vecdiag(A);
  • /* Take the diagonal elements of A, put into a
    column matrix */
  • VDIAGA 3 rows 1 col (numeric)
  • 1
  • 4
  • 6

23
Functions
  • logA = log(A); /* Natural log of each element */
  • LOGA 3 rows 3 cols (numeric)
  • 0 1.0986123 1.6094379
  • 1.3862944 1.3862944 0
  • 0.6931472 0.6931472 1.7917595

24
Functions
  • ssgvdiagA = ssq(vdiagA);
  • /* Square each element of vdiagA, sum them */
  • SSGVDIAGA 1 row 1 col (numeric)
  • 53

25
Programming Statements
  • In addition to the functions and operators that
    we talked about earlier, we have programming
    statements
  • If/Then
  • Do
  • Jumping (GOTO, LINK Statements)
  • Modules
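
As a minimal sketch of the first two statement types, the following applies IF/THEN/ELSE and an iterative DO loop to the matrix A from the earlier slides. The tolerance 1e-12 and the variable names are assumptions made for illustration:

```sas
proc iml;
  A = {1 3 5, 4 4 1, 2 2 6};

  /* IF/THEN/ELSE: invert A only when it is not (numerically) singular */
  if abs(det(A)) < 1e-12 then print "A is singular";
  else do;
    Ainv = inv(A);
    print Ainv;
  end;

  /* Iterative DO: add up the diagonal elements to get the trace */
  tr = 0;
  do i = 1 to nrow(A);
    tr = tr + A[i, i];
  end;
  print tr;              /* 1 + 4 + 6 = 11 */
quit;
```

In practice the loop is unnecessary here, since sum(vecdiag(A)) or trace(A) gives the same value; matrix operations are usually preferred over element-by-element loops in IML.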

26
Modules
  • Modules are similar to subroutines, or
    functions, that can be called anywhere in a
    program, and reused later.

27
Programming Statements, Modules
  • start TransMatrix(D);
  • Dcol = ncol(D); /* Number of columns in D */
  • Drow = nrow(D); /* Number of rows in D */
  • Dtranspose_temp = shape(., Dcol, Drow);
  • do i = 1 to Dcol;
  • do j = 1 to Drow;
  • Dtranspose_temp[i,j] = D[j,i]; /* transpose the matrix D */
  • end;
  • end;
  • return (Dtranspose_temp);
  • finish TransMatrix;
  • Dtranspose = TransMatrix(X);
  • print Dtranspose;

28
Programming Statements, Modules
  • The previous slide illustrates
  • Programming Statements (In the form of Do Loop)
  • Modules (creating groups of statements that can
    be invoked anywhere in the program, i.e., a
    subroutine, and creating a separate environment
    local to the module).
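
The "separate environment" point can be seen in a short sketch: a module defined with arguments has its own local symbol table, so a loop index inside it does not disturb a variable of the same name outside. The module name ColSums is hypothetical:

```sas
proc iml;
  i = 100;                        /* variable i in the main program */

  start ColSums(M);               /* hypothetical module name */
    total = j(1, ncol(M), 0);     /* 1 x ncol row of zeros */
    do i = 1 to nrow(M);          /* this i is local to the module */
      total = total + M[i, ];
    end;
    return (total);
  finish ColSums;

  A = {1 3 5, 4 4 1, 2 2 6};
  s = ColSums(A);                 /* same totals as A[+,]: {7 9 12} */
  print s i;                      /* i is still 100 */
quit;
```

A module defined without arguments shares the main program's symbol table instead, so the same loop would overwrite i.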

29
An Application: Regression
  • To illustrate some of the ideas just presented,
    let's perform a regression analysis.
  • Of course, with PROC REG and PROC GLM available, one
    wouldn't normally do this by hand, but . . .
  • Example follows.

30
An Application: Regression
  • data Graybill;
  • input x y @@;
  • datalines;
  • 550 200 200 50 280 60 340 140 410 130
  • 160 20 380 120 510 190 510 160 475 180
  • run;

31
Bring the data into IML
  • proc iml;
  • use Graybill; /* identify dataset to read data from */
  • Note: The USE and READ statements are the
    method of getting data from a dataset into IML
    for converting to matrices.

32
Bring the data into IML: Read Y variable into matrix Y
  • read all var {y} into y; /* Y variable into Y matrix */
  • Y 10 rows 1 col (numeric)
  • 200
  • 50
  • 60
  • 140
  • 130
  • 20
  • 120
  • 190
  • 160
  • 180

33
Bring the data into IML: Read X variable into matrix XOBS
  • read all var {x} into xobs; /* X variable into XOBS matrix */
  • XOBS 10 rows 1 col (numeric)
  • 550
  • 200
  • 280
  • 340
  • 410
  • 160
  • 380
  • 510
  • 510
  • 475

34
Define the X Matrix
  • nobs = nrow(y); /* Let nobs = number of observations */
  • x = shape(1,nobs,1) || xobs; /* Column of 1s, nobs x 1,
    catenated to xobs */
  • X 10 rows 2 cols (numeric)
  • 1 550
  • 1 200
  • 1 280
  • 1 340
  • 1 410
  • 1 160
  • 1 380
  • 1 510
  • 1 510
  • 1 475

35
Compute (X'X)^-1
  • xinvx = inv(x` * x); /* inverse of X(transpose) X */
  • XINVX 2 rows 2 cols (numeric)
  • 0.9820609 -0.002312
  • -0.002312 6.0605E-6

36
Compute the Estimates
  • betahat = xinvx * x` * y;
  • /* estimate of beta: betahat = xinvx * x(transpose) * y */
  • BETAHAT 2 rows 1 col (numeric)
  • -45.22735
  • 0.4462054

37
Other Calculations: Degrees of Freedom
  • nobs = nrow(y);
  • nind = ncol(x);
  • dftot = nobs - 1;
  • dfmod = nind - 1;
  • dferr = dftot - dfmod;

38
Other Calculations: Residuals and Error
  • yhat = x * betahat; /* fitted values, needed for the residuals */
  • resid = y - yhat;
  • sserr = ssq(resid);
  • mserr = sserr / dferr;
  • rootmse = sqrt(mserr);

39
Other Calculations: Sums of Squares
  • ybar = sum(y)/nobs;
  • sstot = ssq(y - ybar);
  • ssmod = sstot - sserr;
  • msmod = ssmod/dfmod;

40
Other Calculations: Tests of Hypotheses
  • stdbeta = sqrt(vecdiag(xinvx) * mserr);
  • t = betahat/stdbeta;
  • probt = 1 - probf(t#t, 1, dferr); /* t squared is F(1, dferr) */

41
Compute the Estimates
  • We can do further analyses, such as R-squared, confidence intervals, etc.
  • This example can be extended to multiple
    regression.
  • Some of these are spelled out in the program
    attached here, and the online doc gives more
    details of a regression example to illustrate
    these points.
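
As one possible extension, the sketch below repeats the fit on the Graybill data and adds R-squared and 95% confidence intervals for the coefficients. It is not the attached program: the names rsquare, tcrit, lower, and upper are assumptions, and J(nobs, 1, 1) is used as an equivalent way to build the column of 1s.

```sas
data Graybill;
  input x y @@;
  datalines;
550 200 200 50 280 60 340 140 410 130
160 20 380 120 510 190 510 160 475 180
;
run;

proc iml;
  use Graybill;
  read all var {y} into y;
  read all var {x} into xobs;
  close Graybill;

  nobs    = nrow(y);
  x       = j(nobs, 1, 1) || xobs;       /* intercept column plus regressor */
  xinvx   = inv(x` * x);
  betahat = xinvx * x` * y;

  yhat  = x * betahat;                   /* fitted values */
  sserr = ssq(y - yhat);
  sstot = ssq(y - sum(y)/nobs);
  dferr = nobs - ncol(x);                /* same as dftot - dfmod */
  mserr = sserr / dferr;

  rsquare = 1 - sserr / sstot;           /* proportion of variation explained */
  stdbeta = sqrt(vecdiag(xinvx) * mserr);
  tcrit   = tinv(0.975, dferr);          /* two-sided 95% critical value */
  lower   = betahat - tcrit * stdbeta;
  upper   = betahat + tcrit * stdbeta;
  print rsquare, betahat stdbeta lower upper;
quit;
```

Nothing here is specific to one regressor: adding more columns to x extends the same statements to multiple regression.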

42
Question is.
  • Why?
  • Proc Reg already does this!

43
A real-life example
  • Consider the following table

44
A real-life example
  • The problem is to test the equality of the
    proportions responding between the two treatment
    groups, controlling for stratum.
  • We choose not to run the CMH test in PROC FREQ
    because the test isn't appropriate.
  • Test desired can be found in Mehrotra and
    Railkar, Statistics in Medicine, 2000.
  • How to proceed?

45
Defining the X, N matrices

46
Compute row sums, point estimates

47
Compute point estimates, weights

48
Compute point estimates of proportion (weighted) in each group, and estimated variance

49
Compute the weighted difference of proportions between two groups, and the estimated variance

50
Confidence intervals for proportion of each treatment group, and for the difference in proportions

51
p-values for testing the hypothesis

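
The code for these slides is not in the transcript, so the following is only a rough sketch of the sequence of steps the slide titles describe. The counts in X and N are made up for illustration, and the weights are simple sample-size weights proportional to n1*n2/(n1+n2), used as a stand-in; they are not the minimum risk weights of Mehrotra and Railkar.

```sas
proc iml;
  /* Hypothetical data: rows = strata, columns = treatment groups 1 and 2 */
  X = {12 18, 20 25, 9 15};          /* number responding */
  N = {40 40, 50 52, 30 31};         /* number of subjects */

  P = X / N;                         /* elementwise: proportion per stratum and group */
  v = P # (1 - P) / N;               /* binomial variance of each proportion */

  /* Stand-in weights (assumption), scaled to sum to 1 */
  w = (N[,1] # N[,2]) / N[,+];
  w = w / sum(w);

  pw   = w` * P;                     /* 1 x 2: weighted proportion in each group */
  vpw  = (w##2)` * v;                /* 1 x 2: variance of each weighted proportion */
  dhat = pw[1] - pw[2];              /* weighted difference of proportions */
  vard = sum(vpw);                   /* groups are independent, so variances add */

  zcrit  = probit(0.975);            /* normal critical value for 95% intervals */
  lower  = dhat - zcrit * sqrt(vard);
  upper  = dhat + zcrit * sqrt(vard);
  z      = dhat / sqrt(vard);
  pvalue = 2 * (1 - probnorm(abs(z)));   /* two-sided normal-approximation p-value */
  print pw vpw, dhat vard lower upper z pvalue;
quit;
```

Swapping in a different weighting scheme only changes the two lines that define w; the rest of the calculation is plain matrix arithmetic, which is the reason for doing it in IML.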
52
Other Uses
  • Graphics.
  • Extremely Powerful.
  • Two examples with Output Only.
  • I'll refer you to the programs.

53
Mosaic Plots
54
Mosaic Plots
55
Mosaic Plots (data not shown)
56
For more reference
  • See the SAS/IML User's Guide, or Online
    Documentation
  • Mosaic Plots: http://www.math.yorku.ca/SCS/Papers/moshist.pdf
  • Mehrotra, D., and Railkar, R. Minimum risk
    weights for comparing treatments in stratified
    binomial trials. Statistics in Medicine 2000;
    19: 811-825.
  • Graybill, F. Theory and Application of the
    General Linear Model. Wadsworth Brooks
    Pacific Grove, 1976

57
Summary
  • IML is a powerful tool for data analysis,
    statistics, and graphics.
  • Consider it instead of DATA step programming when the
    opportunity presents itself.
  • Not always the best alternative, but helps inform
    the decision making.