Lecture 1: Introduction to High Performance Computing

Transcript and Presenter's Notes
1
Lecture 1: Introduction to High Performance Computing
2
Grand Challenge Problem
  • A grand challenge problem is one that cannot be
    solved in a reasonable amount of time with
    today's computers.

3
Weather Forecasting
  • Cells of size 1 mile x 1 mile x 1 mile
  • → Whole global atmosphere: about 5 x 10^8 cells
  • If each calculation requires 200 Flops
  • → 10^11 Flops in one time step
  • To forecast the weather over 7 days using
    1-minute intervals, with a computer operating at
    100 Mflop/s (10^8 Flop/s)
  • → would take about 10^7 seconds, or over 100 days
  • To perform the calculation in 10 minutes would
    require a computer operating at 1.7 Tflop/s
    (1.7 x 10^12 Flop/s)
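A small C sketch of the arithmetic above; all inputs (cell count, flops
per cell, interval length, machine speeds) are the slide's assumptions:

    /* Back-of-the-envelope cost of the weather-forecasting example. */
    #include <stdio.h>

    int main(void) {
        double flops_per_step = 5e8 * 200.0;     /* 5e8 cells x 200 flops = 1e11 */
        double steps = 7.0 * 24.0 * 60.0;        /* 7-day forecast, 1-minute steps */
        double total = flops_per_step * steps;   /* ~1e15 flops for the whole run */

        double slow = 1e8;                       /* 100 Mflop/s machine */
        printf("serial run time: %.1e s (about %.0f days)\n",
               total / slow, total / slow / 86400.0);

        double deadline = 600.0;                 /* finish in 10 minutes */
        printf("rate needed: %.1e flop/s (about 1.7 Tflop/s)\n",
               total / deadline);
        return 0;
    }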

4
Some Grand Challenge Applications
  • Science
    • Global climate modeling
    • Astrophysical modeling
    • Biology: genomics, protein folding, drug design
    • Computational Chemistry
    • Computational Material Sciences and Nanosciences
  • Engineering
    • Crash simulation
    • Semiconductor design
    • Earthquake and structural modeling
    • Computational fluid dynamics (airplane design)
    • Combustion (engine design)
  • Business
    • Financial and economic modeling
    • Transaction processing, web services and search
      engines
  • Defense
    • Nuclear weapons -- test by simulations
    • Cryptography

5
Units of High Performance Computing
  • Speed
  • 1 Mflop/s = 1 Megaflop/s = 10^6 Flop/second
  • 1 Gflop/s = 1 Gigaflop/s = 10^9 Flop/second
  • 1 Tflop/s = 1 Teraflop/s = 10^12 Flop/second
  • 1 Pflop/s = 1 Petaflop/s = 10^15 Flop/second
  • Capacity
  • 1 MB = 1 Megabyte = 10^6 Bytes
  • 1 GB = 1 Gigabyte = 10^9 Bytes
  • 1 TB = 1 Terabyte = 10^12 Bytes
  • 1 PB = 1 Petabyte = 10^15 Bytes

6
Moore's Law
  • Gordon Moore (co-founder of Intel) predicted in
    1965 that the transistor density of semiconductor
    chips would double roughly every 18 months.

7
Moore's Law also holds for performance and capacity:

                                       1945 (ENIAC)    2002 (Laptop)
  Number of vacuum tubes/transistors   18 000          6 000 000 000
  Weight (kg)                          27 200          0.9
  Size (m^3)                           68              0.0028
  Power (watts)                        20 000          60
  Cost ($)                             4 630 000       1 000
  Memory (bytes)                       200             1 073 741 824
  Performance (Flop/s)                 800             5 000 000 000
8
Peak Performance
  • A contemporary RISC processor delivers about 10%
    of its peak performance
  • Two primary reasons behind this low efficiency:
  • IPC inefficiency
  • Memory inefficiency

9
Instructions per cycle (IPC) inefficiency
  • Today the theoretical IPC is 4-6
  • Detailed analysis for a spectrum of applications
    indicates that the average achieved IPC is only
    1.2-1.4
  • i.e., roughly 75% of the potential performance is
    not used

10
Reasons for IPC inefficiency
  • Latency
  • Waiting for access to memory or other parts of
    the system
  • Overhead
  • Extra work that has to be done to manage program
    concurrency and parallel resources, rather than the
    real work you want to perform
  • Starvation
  • Not enough work to do due to insufficient
    parallelism or poor load balancing among
    distributed resources
  • Contention
  • Delays due to fighting over what task gets to use
    a shared resource next. Network bandwidth is a
    major constraint

11
Memory Hierarchy
12
Processor-Memory Problem
  • Processors issue instructions roughly every
    nanosecond
  • DRAM can be accessed roughly every 100
    nanoseconds
  • The gap is growing
  • processors are getting faster by about 60% per year
  • DRAM is getting faster by only about 7% per year
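A minimal sketch of how that gap compounds, using the slide's growth
rates (processor +60% per year, DRAM +7% per year) as assumptions:

    /* Relative processor vs. DRAM speed after n years, starting equal. */
    #include <stdio.h>
    #include <math.h>

    int main(void) {
        for (int year = 0; year <= 10; year += 2) {
            double gap = pow(1.60, year) / pow(1.07, year);  /* (1.60/1.07)^year */
            printf("after %2d years: processor/DRAM speed ratio ~ %5.1fx\n",
                   year, gap);
        }
        return 0;
    }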

13
Processor-Memory Problem
14
How fast can a serial computer be?
  • Consider a 1 Tflop/s sequential machine:
  • data must travel some distance, r, to get from
    memory to the CPU
  • to get 1 data element per cycle, data must travel
    10^12 times per second at the speed of light,
    c = 3 x 10^8 m/s
  • so r < c / 10^12 = 0.3 mm
  • To fit 1 TB of storage in that 0.3 mm x 0.3 mm area:
  • each word would occupy a square about 3 Angstroms
    on a side, the size of a small atom
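A sketch of that arithmetic, reading "1 TB" as 10^12 words (an assumption
implicit in the slide's figures):

    /* The speed-of-light argument for a 1 Tflop/s serial machine. */
    #include <stdio.h>
    #include <math.h>

    int main(void) {
        double c = 3e8;                    /* speed of light, m/s */
        double rate = 1e12;                /* one memory access per cycle at 1 THz */
        double r = c / rate;               /* farthest memory can sit from the CPU */
        printf("r = %.1e m = %.1f mm\n", r, r * 1e3);

        double area = r * r;               /* 0.3 mm x 0.3 mm, in m^2 */
        double words = 1e12;               /* 1 TB of storage read as 1e12 words */
        double side = sqrt(area / words);  /* side of the square holding one word */
        printf("each word: %.1f x %.1f Angstroms\n",
               side * 1e10, side * 1e10);  /* 1 Angstrom = 1e-10 m */
        return 0;
    }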

15
  • So, we need Parallel Computing!

16
High Performance Computers
  • In the 1980s:
  • 1 x 10^6 Floating Point Ops/sec (Mflop/s)
  • Scalar based
  • In the 1990s:
  • 1 x 10^9 Floating Point Ops/sec (Gflop/s)
  • Vector, shared-memory computing
  • Today:
  • 1 x 10^12 Floating Point Ops/sec (Tflop/s)
  • Highly parallel, distributed processing, message
    passing

17
What is a Supercomputer?
  • A supercomputer is a hardware and software
    system that provides close to the maximum
    performance that can currently be achieved

18
Top500 Computers
  • Over the last 10 years the performance range of the
    Top500 has grown faster than Moore's law
  • 1993:
  • #1: 59.7 GFlop/s
  • #500: 422 MFlop/s
  • 2004:
  • #1: 70 TFlop/s
  • #500: 850 GFlop/s
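A quick check of that claim using the #1 entries above; the 18-month
doubling period is the usual reading of Moore's law:

    /* Top500 #1 growth (1993-2004) vs. Moore's-law doubling every 18 months. */
    #include <stdio.h>
    #include <math.h>

    int main(void) {
        double years = 2004 - 1993;               /* 11 years */
        double top1  = 70e12 / 59.7e9;            /* 70 TFlop/s vs. 59.7 GFlop/s */
        double moore = pow(2.0, years / 1.5);     /* doubling every 18 months */
        printf("Top500 #1 grew by    ~%4.0fx\n", top1);   /* ~1170x */
        printf("Moore's law predicts ~%4.0fx\n", moore);  /* ~160x */
        return 0;
    }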

19
Top500 List, June 2005

  #  Manufacturer  Computer    Installation Site                Country  Year  Rmax (Tflop/s)  Processors
  1  IBM           BlueGene/L  LLNL                             USA      2005  136.8           65536
  2  IBM           BlueGene/L  IBM Watson Research Center       USA      2005  91.3            40960
  3  SGI           Altix       NASA                             USA      2004  51.9            10160
  4  NEC           Vector      Earth Simulator Center           Japan    2002  35.9            5120
  5  IBM           Cluster     Barcelona Supercomputing Center  Spain    2005  27.9            4800
20
Performance Development
21
Increasing CPU Performance
  • Manycore Chip
  • Composed of hybrid cores
  • Some general purpose
  • Some graphics
  • Some floating point

22
What is Next?
  • Board composed of multiple manycore chips sharing
    memory
  • Rack composed of multiple boards
  • A room full of these racks
  • → Millions of cores
  • → Exascale systems (10^18 Flop/s)

23
Moore's Law Reinterpreted
  • Number of cores per chip doubles every 2 years,
    while clock speed decreases (not increases)
  • Need to deal with systems with millions of
    concurrent threads
  • Number of threads of execution doubles every 2
    years

24
Performance Projection
25
Directions
  • Move toward shared memory
  • SMPs and Distributed Shared Memory
  • Shared address space with deep memory hierarchy
  • Clustering of shared memory machines for
    scalability
  • Efficiency of message passing and data parallel
    programming
  • MPI and HPF
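For reference, a minimal message-passing sketch using MPI; this example is
illustrative rather than from the lecture, assumes an MPI installation, and
would be built with mpicc and run with, e.g., mpirun -np 2:

    /* Rank 0 sends one integer to rank 1 over MPI_COMM_WORLD. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            int value = 42;
            /* blocking send to rank 1, message tag 0 */
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            int value;
            /* blocking receive from rank 0, message tag 0 */
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("rank 1 received %d from rank 0\n", value);
        }

        MPI_Finalize();
        return 0;
    }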

26
Future of HPC
  • Yesterday's HPC is today's mainframe is
    tomorrow's workstation