Title: A Case for Source-Level Transformations in MATLAB


1
A Case for Source-Level Transformations in MATLAB
Vijay Menon and Keshav Pingali, Cornell University
The MaJic Project at Illinois/Cornell
  • George Almasi
  • Luiz De Rose
  • David Padua

2
MATLAB
  • High-Level Interpreted Language for Numerical
    Computing
  • Matrix is a first-class type
  • Library of numerical functions
  • Application Domains
      • Image Processing
      • Structural Mechanics
      • Computational Finance

3
The Problem
  • Development is fast...
      • 10X as concise as C/Fortran
  • Performance is slow!
      • 10X as slow as C/Fortran
  • Conventional Approach
      • Rewrite
      • Compile

4
Our Approach: Source-Level Optimization
  • Apply high-level transformations directly on
    MATLAB codes
  • Significant performance benefit for
      • interpreted code
      • compiled code

5
Outline
  • Overheads in MATLAB
  • Conventional Compilation
  • Source-Level Optimization
  • Comparison
  • Implementation Status

6
Outline
  • Overheads in MATLAB
  • Type/Shape Checking
  • Memory Management
  • Array Bounds Checking
  • Conventional Compilation
  • Source-Level Optimization
  • Comparison
  • Implementation Status

7
Type/Shape Checking
  • MATLAB has no type/shape declarations
  • Consider
      • A * B
  • Interpreter checks needed to perform multiply (*)
      • Shape
          • Scalar * Scalar
          • Scalar * Matrix
          • Matrix * Matrix
      • Type
          • Real * Real
          • Real * Complex
          • Complex * Complex
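
The checks listed above can be sketched as a small MATLAB function. This is only an illustration of the kind of dispatch an interpreter has to perform before every multiply. The function name `classify_multiply` is hypothetical and does not come from the slides or from MATLAB itself; `isscalar`, `isreal`, and `size` are standard built-ins.

```matlab
% Hypothetical helper (not from the slides): classify the operands of A * B
% the way an interpreter must before it can pick a multiply routine.
function kind = classify_multiply(A, B)
    % Shape check: is each operand a scalar or a matrix?
    if isscalar(A) && isscalar(B)
        shape = 'Scalar*Scalar';
    elseif isscalar(A) || isscalar(B)
        shape = 'Scalar*Matrix';
    else
        % Matrix*Matrix additionally needs conformable inner dimensions.
        assert(size(A, 2) == size(B, 1), 'Inner dimensions must agree.');
        shape = 'Matrix*Matrix';
    end
    % Type check: is each operand real or complex?
    if isreal(A) && isreal(B)
        type = 'Real*Real';
    elseif isreal(A) || isreal(B)
        type = 'Real*Complex';
    else
        type = 'Complex*Complex';
    end
    kind = [shape ', ' type];
end
```

For example, `classify_multiply(2, [1 2; 3 4])` falls in the Scalar*Matrix, Real*Real case. None of this work is visible in the source program, yet it is repeated each time the expression is evaluated.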

8
Type/Shape Checking
  • Consider
      • for i = 1:n
      •     y = y + a * x(i)
      • end
  • Loops
      • perform redundant checks
      • magnify interpreter overhead

9
Memory Management: Dynamic Resizing
  • Consider
      • x(10) = 10
  • C/Fortran: x must have >= 10 elements
  • MATLAB: x is resized if needed
      • Memory reallocated
      • Data copied

10
Memory Management: Dynamic Resizing
  • MATLAB dynamically grows arrays
      • for i = 1:1000
      •     x(i) = i
      • end
  • Every iteration triggers resize!
      • 1,000 memory allocations
      • 500,000 elements copied
  • Execution Time
      • x is undefined: 14.2 seconds
      • x is already defined: 0.37 seconds
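
The allocation and copy counts above can be reproduced with a short bookkeeping script. It is a sketch under one assumption: every write past the end of `x` allocates a fresh block and copies all existing elements into it. The variable names are illustrative only.

```matlab
% Count allocations and element copies for the growth loop, assuming each
% resize allocates a new block and copies every existing element.
n = 1000;
allocations = 0;
copied = 0;
len = 0;                          % current length of x (initially undefined)
for i = 1:n
    if i > len                    % a write past the end triggers a resize
        allocations = allocations + 1;
        copied = copied + len;    % existing elements move to the new block
        len = i;
    end
end
fprintf('%d allocations, %d elements copied\n', allocations, copied);
```

Under that assumption the script reports 1,000 allocations and 499,500 copied elements, i.e. 0 + 1 + ... + 999, which the slide rounds to 500,000.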

11
Array Bounds Checking
  • Consider array indexing
      • x(i) = y(i)
  • Failed Bounds Check on
      • x(i) can trigger resize
      • y(i) can trigger error
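
The asymmetry between the two sides of the assignment can be seen in a few lines. The values below are made up for illustration.

```matlab
x = [1 2 3];
y = [4 5 6];

% Out-of-bounds write: x grows to 5 elements and the gap is zero-filled.
x(5) = y(1);

% Out-of-bounds read: y has only 3 elements, so y(5) raises an error.
readFailed = false;
try
    x(1) = y(5);
catch
    readFailed = true;
end
```

After this runs, `x` has five elements and `readFailed` is true, so the interpreter has to check both indices on every such statement, with a different outcome for each side.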

12
Array Bounds Checking
  • In a loop
      • for i = 3:100
      •     x(i) = x(i-1) + x(i-2)
      • end
  • Interpreter performs redundant checks
  • Compiler work
      • Nonresizable arrays: Gupta, PLDI '90
      • Resizable arrays: more difficult

13
Common Theme
  • Loops magnify overheads
      • every iteration: redundant checks, resizes
  • MATLAB interprets naively
      • computes as is
      • no reorganization to optimize

14
Outline
  • Overheads in MATLAB
  • Conventional Compilation
  • Compile to C/Fortran
  • Rely on C/Fortran compiler for optimization
  • Source-Level Optimization
  • Comparison
  • Implementation Status

15
MATLAB Compilers
  • Compile to C/C++/Fortran
      • MCC -> C (The MathWorks)
      • MATCOM -> C++ (Mathtools)
      • FALCON -> F90 (U of Illinois)
  • Native compiler generates executable code
      • Link back into MATLAB environment
      • Run as stand-alone program

16
The MCC Compiler
  • Safe Optimization
      • Type Inference - no declarations in MATLAB
      • Eliminate Type Checks / Reduce Storage
      • Specialize for real input variables
      • Always legal!
  • Unsafe Optimization
      • Assume all data is real
      • Eliminate all bounds checks - disallow resizing
      • User must ensure legality!

17
Falcon Benchmarks
  • Collected by DeRose from MATLAB users at
    Illinois/NCSA
  • Element/Loop Intensive
      • CN - Crank-Nicolson PDE Solver
      • Di - Dirichlet PDE Solver
      • FD - Finite Difference PDE Solver
      • Ga - Galerkin PDE Solver
      • IC - Incomplete Cholesky Factorization
  • Memory Intensive
      • AQ - Adaptive Quadrature w/ Simpson's Rule
      • EC - Euler-Cromer 2-body problem
      • RK - Runge-Kutta 2-body problem
  • Library Intensive
      • CG - Conjugate Gradients Iterative Solver
      • Mei - 3D Surface Generation
      • QMR - Quasi-Minimal Residual
      • SOR - Successive Over-Relaxation

18
MCC Safe Optimizations

19
MCC Unsafe Optimizations
Note: User must ensure legality!

20
Outline
  • Overheads in MATLAB
  • Conventional Compilation
  • Source-Level Optimization
  • Vectorization
  • Preallocation
  • Expression Optimization
  • Comparison
  • Implementation Status

21
Vectorization
  • Loops are expensive
      • Overheads are magnified
  • Idea: Eliminate Loops
      • Map loops to higher-level matrix operations
  • Interpreter uses efficient libraries
      • BLAS
      • LINPACK/EISPACK

22
Example of Vectorization
  • In Galerkin, 98% of execution time is spent in
      • for i = 1:N
      •     for j = 1:N
      •         phi(k) = phi(k) + a(i,j)*x(i)*y(j)
      •     end
      • end

23
Vectorized Code
  • In Optimized Galerkin
      • phi(k) = phi(k) + x*a*y'
  • Fragment Speedup: 260
  • Program Speedup: 110
  • Note: Not always possible!
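
A quick way to see that the vectorized statement computes the same value as a double loop is to run both on random data. The sketch below assumes `x` and `y` are 1-by-N row vectors, `a` is N-by-N, and the loop accumulates `a(i,j)*x(i)*y(j)`, which is the sum that `x*a*y'` evaluates. The sizes and data are illustrative and are not taken from the Galerkin benchmark.

```matlab
% Illustrative data (not from the benchmark).
N = 50;
rng(0);                           % fixed seed so the run is repeatable
a = rand(N);
x = rand(1, N);
y = rand(1, N);

% Loop form: accumulate a(i,j)*x(i)*y(j) element by element.
phiLoop = 0;
for i = 1:N
    for j = 1:N
        phiLoop = phiLoop + a(i,j) * x(i) * y(j);
    end
end

% Vectorized form: a single vector-matrix-vector product.
phiVec = x * a * y';

% The two results should agree up to floating-point rounding.
relErr = abs(phiLoop - phiVec) / abs(phiVec);
```

The loop form executes N^2 interpreted statements, each with its own type, shape, and bounds checks. The vectorized form is one expression handed to library routines.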

24
Effect of Vectorization

25
Preallocation
  • Eliminate Dynamic Resizing
  • Try to predict eventual size of array
  • Insert early allocation when possible
      • x = zeros(1000,1)
  • Resizing will not be triggered

26
Example of Preallocation
  • In Euler-Cromer, 87% of time is spent in
      • for i = 1:N
      •     r(i)
      •     th(i)
      •     t(i)
      •     k(i)
      •     p(i)
      • end

27
Preallocated Code
  • In Optimized Euler-Cromer
      • r = zeros(1,N)
      • ...
      • for i = 1:N
      •     r(i)
      • end
  • Fragment Speedup: 7
  • Program Speedup: 4
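
The transformation can be tried on a toy loop. The sketch below is not the Euler-Cromer code. The loop body `i^2` and the names `rGrow` and `rPre` are placeholders chosen for illustration. Timings depend heavily on the MATLAB version and the machine, so no expected numbers are given.

```matlab
N = 1e5;

% Growing version: rGrow starts undefined and is extended every iteration.
clear rGrow
tic
for i = 1:N
    rGrow(i) = i^2;
end
tGrow = toc;

% Preallocated version: allocate once, then fill in place.
tic
rPre = zeros(1, N);
for i = 1:N
    rPre(i) = i^2;
end
tPre = toc;

fprintf('grow: %.4f s, preallocated: %.4f s\n', tGrow, tPre);
```

Both loops produce the same 1-by-N vector. Only the allocation behavior differs, which is why the transformation is safe whenever the final size can be predicted before the loop.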

28
Effect of Preallocation

29
Expression Optimization
  • MATLAB interprets expressions naïvely in left to
    right order
  • Simple restructuring can significantly affect
    execution time, e.g.
      • A*B*x: O(n^3) flops
      • A*(B*x): O(n^2) flops
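
The effect of regrouping can be checked directly. In the sketch below, `A` and `B` are n-by-n and `x` is n-by-1. The size n = 300 and the random data are arbitrary choices for illustration, and the measured times will vary by machine.

```matlab
n = 300;
rng(0);                           % fixed seed so the run is repeatable
A = rand(n);
B = rand(n);
x = rand(n, 1);

% Left to right: forms the n-by-n product A*B first, O(n^3) flops.
tic; v1 = A * B * x;   tLeft = toc;

% Regrouped: two matrix-vector products, O(n^2) flops.
tic; v2 = A * (B * x); tRight = toc;

% Same result up to rounding, by associativity of matrix multiplication.
relDiff = norm(v1 - v2) / norm(v1);
fprintf('left-to-right: %.5f s, regrouped: %.5f s\n', tLeft, tRight);
```

The rewrite relies only on associativity, which is visible at the MATLAB source level but is lost once the expression has been lowered to loops in C or Fortran.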

30
Example of Expression Optimization
  • In QMR, 70% of execution time is spent in
      • w = A'*q
  • A: 420x420 matrix
  • q, w: 420x1 vectors
  • A' = transpose(A)

31
Expression Optimized Code
  • In Optimized QMR: A'*q = (q'*A)'
      • w = (q'*A)'
  • Transpose 2 vectors instead of 1 matrix
  • Fragment Speedup: 20
  • Program Speedup: 3
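
The identity behind this rewrite, A'*q = (q'*A)', can be confirmed numerically. The sketch uses the 420-by-420 size quoted for QMR but fills the matrix with random data, since the benchmark's actual matrix is not given here.

```matlab
n = 420;
rng(0);                 % fixed seed so the run is repeatable
A = rand(n);
q = rand(n, 1);

w1 = A' * q;            % as written: the transpose is applied to the matrix
w2 = (q' * A)';         % rewritten: the transpose is applied to two vectors

% Both forms give the same n-by-1 vector up to rounding.
relDiff = norm(w1 - w2) / norm(w1);
```

Transposing `q` and the resulting row vector touches about 2n elements, whereas explicitly transposing `A` touches n^2.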

32
Effect of Expression Optimization

33
Summary: Source-Level

34
Comparison

35
Point 1
  • Source optimizations can outperform MCC

36
Point 2
  • Source optimizations complement MCC

37
Benefits of Source-Level Optimizations
  • Vectorization
      • Directly eliminates loop overhead
      • Moves work to hand-optimized BLAS
  • Preallocation
      • Eliminates resizing overhead
      • Enables MCC array bounds elimination
  • Expression Optimization
      • Uses algebraic info unavailable in C/Fortran

38
Implementation Status
  • Illinois/Cornell MaJic system
  • Just-in-time MATLAB interpreter/compiler
  • Incorporates Source-Level Transformation
  • Semantic Optimization (Menon/Pingali, ICS '99)
  • Vectorization/BLAS call generation
  • Expression Optimization
  • Preallocation/Bounds Check Optimization (Work in
    progress)

39
Conclusion
  • Source-level optimizations are important for
    enhancing MATLAB performance, whether the code is
    only interpreted or later compiled

40
THE END

41
Unsafe Type Check Removal
  • Correct on 11/12 Codes

42
Unsafe Bounds Check Removal
  • Correct on 7/12 Codes