Compilers and Multi-Core Computing Systems
1
Compilers and Multi-Core Computing Systems
  • FRAN ALLEN
  • allen@watson.ibm.com
  • Triangle Computer Science Lecture UNC
  • February 18, 2008

2
Topics
  • New Technical Challenge: Performance
  • Solution?? Multicores and Parallelism
  • A personal tour of some languages, compilers, and
    computers for high performance systems
  • Meeting the Challenge

3
What Is The Challenge and Why Does It Matter?
  • Computers are hitting a performance limit
  • "The biggest problem Computer Science has ever
    faced." - John Hennessy
  • "The best opportunity Computer Science has to
    improve user productivity, application
    performance, and system integrity." - Fran Allen

4
The Performance Problem
  • Transistors continue to shrink
  • More and more transistors fit on a chip
  • The chips run faster and faster
  • Resulting in HOT CHIPS!

5
Performance Problem Solution: Multicores
  • Two or more processors (multicores) on a chip
  • Simpler, slower, cooler processors
  • Processors can work on independent parts of the
    same task
  • Users and software will organize tasks to
    maximize PARALLELISM

6
Parallelism Solves the Performance Problem! (or
does it?)
7
The (Very Conservative) Future of Multi-cores
  • 2007 - 8 cores on a chip
  • 2009 - 16 cores
  • 2013 - 64 cores
  • 2015 - 128 cores
  • 2021 - 1k cores
  • LUNATIC LEVELS OF PARALLELISM!!

8
Languages, Compilers, and Computers: A Personal
History
  • Fortran 1954-1957
  • Sequential Programs and Hardware Concurrency
  • 1955-1962 Stretch / Harvest
  • 1962-1968 Advanced Computing System (ACS)
  • 1970s Consolidation
  • Sequential Programs and Parallel Computers
  • 1983-1995 PTRAN

"In the beginning there was Fortran." - Jim Gray
9
Fortran Project (1954-1957) Goals
  • Increase user productivity
  • "...produce programs almost as efficient as hand
    coded ones and do so on virtually every job."
    - John Backus
THE FORTRAN GOALS BECAME MY GOALS
10
The Fortran Language and Compiler
  • Available April 15, 1957
  • Some features
  • Beginnings of formal parsing techniques
  • Intermediate language form for optimization
  • Control flow graphs
  • Common sub-expression elimination
  • Generalized register allocation - for only 3
    registers!
  • Spectacular object code!!
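One of the listed techniques, common sub-expression elimination, can be shown with a small before/after sketch. This is purely illustrative (the functions and temporaries are hypothetical, not the 1957 compiler's actual transformation):

```python
# Common sub-expression elimination: a repeated expression is
# computed once and reused, instead of being re-evaluated.

def area_naive(a, b, c):
    # (a + b) and (a + b) * c are each evaluated twice.
    return (a + b) * c + (a + b) * c * c

def area_cse(a, b, c):
    # An optimizer recognizes the repeated sub-expressions and
    # introduces temporaries so each is computed only once.
    t1 = a + b      # common sub-expression (a + b)
    t2 = t1 * c     # common sub-expression (a + b) * c
    return t2 + t2 * c

# Both versions compute the same value; the second does less work.
assert area_naive(1, 2, 3) == area_cse(1, 2, 3)
```

With only three registers to allocate, squeezing out redundant computations like this mattered enormously.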

11
Stretch (1956-1961)
  • Goal: 100 times faster than any existing machine
  • Main Performance Limitation: Memory Access Time
  • Extraordinarily ambitious hardware
  • Equally ambitious compiler

Fred Brooks
12
Stretch Concurrency
  • Overlapped storage references up to 6 at a time
  • Instruction lookahead unit
  • Up to 11 instructions executing in the CPU at
    the same time
  • Hardware gave the appearance of a sequential
    machine
  • Superscalar??
  • Multiprogramming

13
HARVEST (1958 - 1962)
  • Built for NSA for code breaking
  • Hosted by Stretch
  • Streaming data computation model
  • Eight instructions and unbounded execution times
  • Only system with balanced I/O, memory and
    computational speeds (per conversation with Jim
    Pomerene 11/2000)
  • ALPHA: a language designed to fit the problem
    and the machine

14
Stretch / Harvest Compiler Organization
  • Three front ends (Fortran, Autocoder II, ALPHA),
    each translating its source language into a
    common intermediate language (IL)
  • A shared back end operating on the IL:
    OPTIMIZER, then REGISTER ALLOCATOR, then
    ASSEMBLER, producing OBJECT CODE
  • Targets: Stretch and Stretch-Harvest
15
Stretch - Harvest Outcomes
  • April 1961 Stretch delivered to Los Alamos but
  • Stretch performance off by 50%
  • Considered a failure by IBM
  • Feb 1962 Harvest accepted by National Security
    Agency and used for 14 years
  • Stretch had a huge influence on future IBM
    systems!

16
The IBM 360 (1959?-1964)
  • Goal: Unify existing product lines
  • One Instruction set for scientific and business
    applications
  • Multiple hardware models ranging from small and
    cheap to powerful and expensive
  • One software product line
  • IBM bet the company and won!

Fred Brooks
17
Advanced Computing System (ACS) 1962-1968
  • Goal: Fastest Machine in the World
  • Pipelined and superscalar
  • Branch prediction
  • Out of order instruction execution
  • Instruction and data caches
  • Experimental Compiler
  • Built early to drive hardware design
  • Compiler code often faster than the best hand
    code

John Cocke
18
ACS Compiler Optimization Results
  • Language-independent machine-independent
    optimization
  • A theoretical basis for program analysis and
    optimization
  • A Catalogue of Optimizations which included
  • Procedure integration
  • Loop transformations: unrolling, jamming,
    unswitching
  • Redundant subexpression elimination, code motion,
    constant folding, dead code elimination, strength
    reduction, linear function test replacement,
    carry optimization, anchor pointing
  • Instruction scheduling
  • Register allocation
  • IBM CANCELLED ACS PROJECT IN 1968!
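One entry in the catalogue, strength reduction, is easy to show by hand. A hedged sketch (the function names and the address-computation example are hypothetical, chosen only to illustrate the transformation):

```python
# Strength reduction: replace an expensive operation inside a loop
# (here, a multiply per iteration) with a cheaper one (an add).

def addresses_naive(base, stride, n):
    # Each iteration recomputes base + i * stride with a multiply.
    return [base + i * stride for i in range(n)]

def addresses_strength_reduced(base, stride, n):
    # The multiply is replaced by a running addition: the classic
    # strength-reduced form for array address computations.
    out = []
    addr = base
    for _ in range(n):
        out.append(addr)
        addr += stride
    return out

# Both produce the same address sequence.
assert addresses_naive(100, 8, 5) == addresses_strength_reduced(100, 8, 5)
```

Transformations like this are language- and machine-independent in exactly the sense the ACS work argued for: they apply to the IL, not to any one source language or target.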

19
The 1970s: Consolidation and Simplification
  • Mainstreaming new optimization techniques
  • Lots of research on optimization algorithms
  • Whole program analysis
  • Experimental Compiling System

John Cocke gave up his goal of building the
world's fastest computer to build the best
cost/performance machine. The result: THE POWER
PC!!
20
PTRAN for Automatic Parallelization (1980s to
1995)
  • Research
  • Program Dependence Graphs
  • Constructing Useful Parallelism
  • Static Single Assignment (SSA)
  • Whole Program Analysis Framework
  • Compiler development
  • IBM's XL Family of Compilers
  • Fortran 90
  • Run-time technologies
  • Dynamic Process Scheduling
  • Debugging
  • Visualization
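Static Single Assignment, one of the PTRAN-era research results listed above, can be shown with a tiny hand-worked example. This sketch is illustrative only (the variable names are hypothetical); it shows the renaming idea, not a real SSA construction algorithm:

```python
# Static Single Assignment (SSA) form: every variable is assigned
# exactly once, so each use refers unambiguously to one definition,
# which makes dataflow analysis much simpler.

# Original straight-line code (x assigned twice):
#   x = a + 1
#   x = x * 2
#   y = x + b
# The same code in SSA form, with a fresh name per definition:
def ssa_example(a, b):
    x1 = a + 1      # first definition of x becomes x1
    x2 = x1 * 2     # second definition becomes x2
    y1 = x2 + b     # this use can only mean x2
    return y1

# Where control flow joins (e.g. after an if/else), SSA inserts a
# phi function such as x3 = phi(x1, x2) to select the definition
# that actually reached the join.
```

The single-assignment property is what lets an analyzer read off def-use chains directly, instead of recomputing reaching definitions.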

21
Automatic Parallelization is Hard
  • Identifying potential parallelism is hard
  • Pointers
  • Storage reuse
  • Procedure boundaries
  • Forming useful parallelism is hard
  • Caches
  • Data management
  • Multiple models of parallelism
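The pointer problem in the list above can be made concrete. A hedged sketch (the function and data are hypothetical): the same loop is safe to parallelize with independent storage but carries a dependence when its arguments alias.

```python
# Why aliasing makes automatic parallelization hard: whether the
# loop's iterations are independent depends on whether dst and src
# refer to the same storage.

def shift_copy(dst, src, n):
    # Looks trivially parallel, but only if dst and src do not
    # overlap; a compiler must prove that before running the
    # iterations concurrently.
    for i in range(n):
        dst[i] = src[i + 1] * 2

a = [1, 2, 3, 4, 5]
b = [0, 0, 0, 0]

shift_copy(b, a, 4)   # independent storage: any iteration order works
print(b)              # prints [4, 6, 8, 10]

shift_copy(a, a, 4)   # aliased: iteration i reads a[i + 1], which
print(a)              # iteration i + 1 overwrites, so sequential
                      # order matters; prints [4, 6, 8, 10, 5]
```

With aliasing, reordering the iterations would read already-overwritten values and change the result, which is exactly why pointer analysis sits on the critical path of automatic parallelization.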

22
Components of the Performance Solution
  • Very high level domain specific languages
  • Automatic parallelism
  • Data management optimization: locality,
    integrity, ownership, ...
  • Influence the architects before it is too late.
  • Remember the goals
  • User Productivity
  • Application Performance
  • Bold thinkers and high risk projects

23
Peak Performance Computers by Year
24
END OF TALK. START OF A NEW ERA IN COMPUTING
AND COMPUTER SCIENCE!