Compiling and Using the - PowerPoint PPT Presentation

Slides: 14
Provided by: vsac9
Learn more at: http://files.meetup.com
Category:
Tags: compiling | group | using

Transcript and Presenter's Notes

Title: Compiling and Using the


1
Compiling and Using the best R
  • Vipin Sachdeva
  • IBM Computational Science Division

2
Improving R performance
  • Performance improvements
  • Hardware (Number of cores etc.)
  • Intel Q6600 quad-core @ 2.4 GHz
  • Compilers
  • Intel versus GNU
  • Compiler flags (unoptimized versus optimized)
  • Libraries (BLAS)
  • netlib BLAS, GotoBLAS2, Intel MKL, Intel MKL-SMP

3
Benchmark for R
  • R-benchmark-25.R
  • http://r.research.att.com/benchmarks/R-benchmark-25.R
  • Measures timings for
  • B = A' * A,
  • C = A/B
  • Eigenvalues, Determinant, Cholesky, Inverse
    (BLAS)
  • Needs SuppDists package
  • ./Rscript --vanilla R-benchmark-25.R

4
Base R
  • ./configure --prefix=/home/vsachde/R-install
  • Source directory: .
  • Installation directory: /home/vsachde/R-project/all-R/GNU-R/R-native-unoptimized
  • C compiler: gcc -std=gnu99 -g -O2
  • Fortran 77 compiler: gfortran -g -O
  • C++ compiler: g++ -g -O2
  • Fortran 90/95 compiler: gfortran -g -O
  • Obj-C compiler:
  • Interfaces supported: X11, tcltk
  • External libraries: readline
  • Additional capabilities: PNG, JPEG, TIFF, NLS, cairo
  • Options enabled: static R library, shared BLAS, R profiling, Java
Slide callouts: compiler flags; GNU compilers; external libraries being used
5
Somewhat Optimized R
  • export optim_flags="-O3 -funroll-loops -ffast-math -march=core2"
  • CC="gcc" CFLAGS="$optim_flags" CXX="g++" CXXFLAGS="$optim_flags"
    F77="gfortran" FFLAGS="$optim_flags" FC="gfortran" FCFLAGS="$optim_flags"
    ./configure --prefix=$installdir

  • C compiler: gcc -std=gnu99 -O3 -funroll-loops -ffast-math -march=core2
  • Fortran 77 compiler: gfortran -O3 -funroll-loops -ffast-math -march=core2
  • C++ compiler: g++ -O3 -funroll-loops -ffast-math -march=core2
  • Fortran 90/95 compiler: gfortran -O3 -funroll-loops -ffast-math -march=core2
  • Compilers can be changed through the variables CC, CXX, F77
  • CC=icc CXX=icpc F77=ifort will use the Intel compilers.
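The optimized configure call on this slide can be kept in a small script so that the same flags reach all four compilers. The sketch below is only an illustration: the function name optimized_configure_cmd and the default installdir of $HOME/R-install are assumptions, not part of the slides, and the script prints the command rather than running it.

```shell
#!/bin/sh
# Flags from the slide; -march=core2 matches the Q6600 and should be
# changed for other CPUs.
optim_flags="-O3 -funroll-loops -ffast-math -march=core2"
# Assumed default install location (hypothetical); override via the environment.
installdir="${installdir:-$HOME/R-install}"

# Hypothetical helper: print one configure command line with the same
# optimization flags passed to the C, C++ and both Fortran compilers.
optimized_configure_cmd() {
    printf 'CC="gcc" CFLAGS="%s" CXX="g++" CXXFLAGS="%s" ' "$optim_flags" "$optim_flags"
    printf 'F77="gfortran" FFLAGS="%s" FC="gfortran" FCFLAGS="%s" ' "$optim_flags" "$optim_flags"
    printf './configure --prefix="%s"\n' "$installdir"
}

# Print the command; run it from the top of the R source tree.
optimized_configure_cmd
```

Note that -ffast-math relaxes IEEE floating-point rules, so it is worth running make check on a build configured this way before trusting its numerical results.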

6
Linking external BLAS with R
  • R uses its own unoptimized routines for linear algebra
    if it is not linked with an external BLAS.
  • ./configure --with-blas=<location of BLAS lib>
  • Various sources of BLAS
  • Netlib BLAS: generic and unoptimized
  • GotoBLAS2: optimized and multi-threaded
  • Intel MKL: optimized library from Intel (sequential)
  • Intel MKL-SMP: multi-threaded
  • Many others, including ACML and ATLAS.
  • Performance of the kernels changes with the library used.


Tries to link the BLAS library
7
Linking external BLAS with R
  • If everything goes well:
  • Source directory: .
  • Installation directory: /home/vsachde/R-project/all-R/GNU-R/R-netlib-blas
  • C compiler: gcc -std=gnu99 -O3 -funroll-loops -ffast-math -march=core2
  • Fortran 77 compiler: gfortran -O3 -funroll-loops -ffast-math -march=core2
  • C++ compiler: g++ -O3 -funroll-loops -ffast-math -march=core2
  • Fortran 90/95 compiler: gfortran -O3 -funroll-loops -ffast-math -march=core2
  • Obj-C compiler:
  • Interfaces supported: X11, tcltk
  • External libraries: readline, BLAS(generic)
  • Additional capabilities: PNG, JPEG, TIFF, NLS, cairo
  • Options enabled: static R library, R profiling, Java
  • Recommended packages: yes

BLAS was linked in properly
8
Linking external BLAS with R
  • What does --with-blas do?
  • It tries to link a test program (conftest) against dgemm_ from the given library.
  • configure:28567: checking for dgemm_ in
    /home/vsachde/R-project/all-blas/GNU-blas/netlib-blas/libblas_GNU.a
  • configure:28588: gcc -std=gnu99 -o conftest -g
    -O2 -I/usr/local/include -L/usr/local/lib64
    conftest.c /home/vsachde/R-project/all-blas/GNU-blas/netlib-blas/libblas_GNU.a
    -lgfortran -lm -ldl -lm >&5
  • configure:28595: result: yes
  • If the above linking step fails:
  • Installation won't fail, but the external BLAS will not be
    linked in.
  • The summary at the end won't show external BLAS linking.
  • Search for dgemm in config.log and look for
    errors.
  • Advice: compile static libraries, as they are
    easier to link.
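Because a failed BLAS link does not stop the build, it helps to check config.log right after configure finishes. The following is a minimal sketch, assuming the log has the shape shown on this slide (a "checking for dgemm_" line followed later by a "result:" line); the function name blas_link_result is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the result ("yes" or "no") of configure's
# dgemm_ link test, as recorded in config.log.
# Assumption: the first "result:" line after "checking for dgemm_"
# belongs to that test, as in the excerpt on the slide.
blas_link_result() {
    log="${1:-config.log}"
    awk '
        /checking for dgemm_/ { seen = 1; next }   # start of the BLAS test
        seen && /result:/     { print $NF; exit }  # last field is yes/no
    ' "$log"
}

# Only inspect the log if configure has already been run here.
if [ -f config.log ]; then
    blas_link_result config.log
fi
```

If this prints anything other than yes, the compiler errors sit between the two matching lines in config.log.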

9
Linking with different BLAS
  • Netlib-BLAS
  • Download source from netlib.org, unoptimized.
  • GotoBLAS2
  • Download from TACC website
  • Optimized and multi-threaded
  • Turn off CPU throttling to compile.
  • Intel MKL
  • Sequential and SMP
  • The linking step is the same for most BLAS libraries, except the Intel
    libraries.

10
Linking with Intel MKL libs
  • export MKLPATH=/opt/intel/Compiler/11.1/072/mkl/lib/em64t/
  • Intel MKL sequential
  • --with-blas="-Wl,--start-group
    $MKLPATH/libmkl_intel_lp64.a $MKLPATH/libmkl_sequential.a
    $MKLPATH/libmkl_core.a -Wl,--end-group
    -lpthread"
  • Intel MKL SMP
  • --with-blas="-Wl,--start-group
    $MKLPATH/libmkl_intel_lp64.a $MKLPATH/libmkl_intel_thread.a
    $MKLPATH/libmkl_core.a -Wl,--end-group
    -liomp5 -lpthread"
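The two MKL link lines differ only in the threading library and the trailing system libraries, so they can be generated from one place. This is a sketch under that assumption; the function name mkl_blas_arg is hypothetical, and the default MKLPATH is the one from the slide and will differ on other installations.

```shell
#!/bin/sh
# MKL library directory from the slide; override MKLPATH for other installs.
MKLPATH="${MKLPATH:-/opt/intel/Compiler/11.1/072/mkl/lib/em64t}"

# Hypothetical helper: print the value for --with-blas for one MKL variant.
#   sequential -> libmkl_sequential.a,   -lpthread
#   smp        -> libmkl_intel_thread.a, -liomp5 -lpthread
mkl_blas_arg() {
    case "$1" in
        sequential) thread_lib="libmkl_sequential.a";   extra="-lpthread" ;;
        smp)        thread_lib="libmkl_intel_thread.a"; extra="-liomp5 -lpthread" ;;
        *) echo "usage: mkl_blas_arg sequential|smp" >&2; return 1 ;;
    esac
    printf '%s\n' "-Wl,--start-group $MKLPATH/libmkl_intel_lp64.a $MKLPATH/$thread_lib $MKLPATH/libmkl_core.a -Wl,--end-group $extra"
}

# Show both variants.
mkl_blas_arg sequential
mkl_blas_arg smp
```

A configure call would then look like ./configure --with-blas="$(mkl_blas_arg smp)", with the quotes keeping the whole group as a single argument.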

Intel MKL SMP and GotoBLAS2 should show
performance improvements in quad-core (run 4
threads)
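Running with 4 threads is normally controlled through environment variables rather than through R itself. The wrapper below is a sketch, assuming the threaded MKL build honors OMP_NUM_THREADS and GotoBLAS2 honors GOTO_NUM_THREADS; the function name run_with_threads is hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command with the BLAS thread count fixed.
# OMP_NUM_THREADS is read by OpenMP-based libraries such as threaded MKL;
# GOTO_NUM_THREADS is GotoBLAS2's own setting. Both are set so the same
# wrapper works for either library.
run_with_threads() {
    n="$1"
    shift
    env OMP_NUM_THREADS="$n" GOTO_NUM_THREADS="$n" "$@"
}

# Demonstration with a command that just echoes the setting.
run_with_threads 4 sh -c 'echo "threads: $OMP_NUM_THREADS"'
```

For the benchmark itself the call would be run_with_threads 4 ./Rscript --vanilla R-benchmark-25.R, and repeating it with 1 gives the single-thread baseline.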
11
Performance: single-thread BLAS
12
Performance: BLAS
Benchmark run time went down by 15-20X through
compilers, compiler options and hardware (4
threads)
Revolution R uses Intel MKL-SMP
13
Results
  • Generic R can be optimized for performance.
  • Intel MKL libraries give the best performance,
    with the freely available GotoBLAS2 a close second.
  • Experiment with LAPACK as well.
  • Question: how important is performance for R
    users?