PlayStation and IBM Cell Architecture

1
PlayStation and IBM Cell Architecture
  • Benjamin Levine,¹ Jacob Schroeder,² Pavan Tumati,³ Eric DeSturler,² Sanjay Patel,³ Todd J. Martínez¹

¹Department of Chemistry, ²Computer Science, ³Electrical and Computer Engineering
2
Commodity-Off-The-Shelf (COTS) Computing
  • Benefits
    • Economy of scale
    • Ease of upgrade
  • Obstacles
    • Dissimilarity between desktop applications and physics simulations

3
Commodity-Off-The-Toy-Shelf (COTTS) Computing
  • Benefits
    • Economics of game consoles
    • Similarity between video games and physics simulations
  • Obstacles
    • Complexity of hardware
    • Absence of scientific software (e.g. BLAS)
    • Lifespan of product

4
PS2 Architecture
[Block diagram labels: 16 MB RDRAM (×2), Emotion Engine (EE), Graphics Synthesizer, Video, IEEE 1394, Sound Processor, 16 MB SDRAM, I/O Processor, Sound, USB, Controller, Operating System ROM, PCMCIA interface, DVD]
5
Emotion Engine (EE) Architecture
[Block diagram labels: CPU, Vector Processing Unit 0 (VPU0), Vector Processing Unit 1 (VPU1, to Graphics Synthesizer), System Bus (128 bit), Direct Memory Access Controller (DMAC), Image Processing Unit, Memory Interface (to Main Memory), I/O Interface (to Peripherals)]
6
Key Components
[Block diagram labels: Vector Unit 0 (VU0), Floating Point Unit (FPU), Processor Core, Instruction Memory (4 kB), Data Memory (4 kB), Scratchpad RAM, Inst. Cache, Data Cache, Vector Interface 0 (VIF0)]
  • CPU: analogous to a Pentium
  • VPU0: macromode (coprocessor) or micromode (asynchronous, with 2 instruction streams)
7
Key Components
  • System Bus
    • 128-bit
    • 2.4 GB/s transfer
    • 10-channel DMA

[Block diagram labels: Vector Unit 1 (VU1), Instruction Memory (16 kB), Data Memory (16 kB), System Bus (128 bit), Graphics Interface (GIF), DMA, Memory Interface (to Main Memory), Vector Interface 1 (VIF1)]
  • VU1: micromode only
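
The system bus figures on this slide can be cross-checked with a little arithmetic. The sketch below assumes that the quoted 2.4 figure means gigabytes (not gigabits) per second and that the bus moves one full 128-bit word per transfer; the variable names are illustrative only.

```python
# Cross-check of the system bus figures quoted on the slide.
# Assumptions: the 2.4 figure is in gigabytes per second, and each
# transfer moves one full 128-bit word.
BUS_WIDTH_BITS = 128
BANDWIDTH_BYTES_PER_S = 2.4e9

bytes_per_transfer = BUS_WIDTH_BITS // 8
transfers_per_s = BANDWIDTH_BYTES_PER_S / bytes_per_transfer

print(f"{bytes_per_transfer} bytes per transfer")
print(f"{transfers_per_s / 1e6:.0f} million transfers per second")
```

Under those assumptions the two numbers are consistent with a bus clock of roughly 150 MHz.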
8
How do we compare to a Pentium?
(MFLOPS = million floating-point operations per second)
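
As a rough illustration of the MFLOPS metric used in this comparison, the sketch below times a plain dot product and converts the result. Counting one multiply and one add per element (2n operations) is a common convention assumed here, not something the slides specify, and the function names are hypothetical.

```python
import time

def mflops(flop_count, seconds):
    """Millions of floating-point operations per second."""
    return flop_count / seconds / 1e6

def timed_dot(x, y):
    """Return the dot product and the wall-clock time it took."""
    start = time.perf_counter()
    total = 0.0
    for a, b in zip(x, y):
        total += a * b  # one multiply and one add per element
    return total, time.perf_counter() - start

n = 100_000
value, elapsed = timed_dot([1.0] * n, [2.0] * n)
print(f"dot = {value}, about {mflops(2 * n, elapsed):.1f} MFLOPS")
```

The measured rate depends entirely on the machine and interpreter, so it is only meaningful relative to another run of the same test.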
9
How do we compare to a Pentium?
(MFLOPS = million floating-point operations per second)
10
How is the EE used by game designers?
[Block diagram labels: CPU, Vector Processing Unit 0 (VPU0), Vector Processing Unit 1 (VPU1, to Graphics Synthesizer), System Bus (128 bit), Direct Memory Access Controller (DMAC), Image Processing Unit, Memory Interface (to Main Memory), I/O Interface (to Peripherals)]
11
How is the EE used by game designers?
[Diagram: CPU, VPU0, VPU1, to Graphics Synthesizer (GS)]
  • CPU: control, basic physics

12
How is the EE used by game designers?
[Diagram: CPU, VPU0, VPU1, to Graphics Synthesizer (GS)]
  • CPU: control, basic physics
  • VPU0: basic graphics transformation

13
How is the EE used by game designers?
[Diagram: CPU, VPU0, VPU1, to Graphics Synthesizer (GS)]
  • CPU: control, basic physics
  • VPU0: basic graphics transformation
  • VPU1: further graphics transformation

14
How is the EE used by game designers?
[Diagram: CPU, VPU0, VPU1, to Graphics Synthesizer (GS)]
  • CPU: control, basic physics
  • VPU0: basic graphics transformation
  • VPU1: further graphics transformation
  • GS: texturing

15
How is the EE used by game designers?
[Diagram: CPU, VPU0, VPU1, to Graphics Synthesizer (GS)]
  • General characteristics
    • One-directional flow of data
    • Each VPU does one task
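
The one-directional, one-task-per-unit flow described on these slides can be pictured as a chain of stages. This is only a toy model: the stage functions and the arithmetic inside them are hypothetical stand-ins, not actual PS2 code.

```python
def cpu_stage(objects):
    # Control and basic physics: advance each position by its velocity.
    for pos, vel in objects:
        yield pos + vel

def vpu0_stage(positions):
    # Basic transformation (hypothetical: a uniform scale by 2).
    for p in positions:
        yield 2.0 * p

def vpu1_stage(positions):
    # Further transformation (hypothetical: a translation by 1).
    for p in positions:
        yield p + 1.0

def gs_stage(positions):
    # Stand-in for the Graphics Synthesizer: collect the final values.
    return list(positions)

# Data only ever moves forward: CPU -> VPU0 -> VPU1 -> GS.
frame = gs_stage(vpu1_stage(vpu0_stage(cpu_stage([(0.0, 1.0), (1.0, 0.5)]))))
print(frame)
```

Because no stage ever reads back from a later one, each unit can work on the next item while its output is consumed downstream.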

16
Ab Initio Molecular Dynamics (AIMD)
One- and two-electron (1-e⁻, 2-e⁻) integrals
MCSCF, DFT, etc.
Classical propagation
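
A minimal sketch of the AIMD cycle named on this slide: forces come from an electronic-structure step, and the nuclei are then propagated classically. The slides do not say which integrator was used; velocity Verlet is assumed here, and a harmonic potential stands in for the integral and MCSCF/DFT steps.

```python
def forces(x, k=1.0):
    # Placeholder for the electronic-structure step (integrals + MCSCF/DFT).
    # Assumed potential: E = 0.5 * k * x^2, so F = -k * x.
    return [-k * xi for xi in x]

def total_energy(x, v, k=1.0, m=1.0):
    # Kinetic plus potential energy of the toy system.
    return sum(0.5 * m * vi * vi + 0.5 * k * xi * xi for xi, vi in zip(x, v))

def velocity_verlet(x, v, dt, steps, m=1.0):
    f = forces(x)
    for _ in range(steps):
        v = [vi + 0.5 * dt * fi / m for vi, fi in zip(v, f)]  # half kick
        x = [xi + dt * vi for xi, vi in zip(x, v)]            # drift
        f = forces(x)                                         # new forces
        v = [vi + 0.5 * dt * fi / m for vi, fi in zip(v, f)]  # half kick
    return x, v

x0, v0 = [1.0], [0.0]
x1, v1 = velocity_verlet(x0, v0, dt=0.01, steps=1000)
print(total_energy(x0, v0), total_energy(x1, v1))
```

In a real AIMD run nearly all the time goes into the force evaluation, which is why the later slides focus on where the integrals and linear algebra would run.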
17
How would a quantum chemist use the EE for AIMD?
[Diagram: CPU, VPU0, VPU1, System Bus, to Graphics Synthesizer (GS)]
  • CPU: control, 1-e⁻ integrals

18
How would a quantum chemist use the EE?
[Diagram: CPU, VPU0, VPU1, System Bus, to Graphics Synthesizer (GS)]
  • CPU: control, 1-e⁻ integrals
  • VPUs: 2-e⁻ integrals

19
How would a quantum chemist use the EE?
[Diagram: CPU, VPU0, VPU1, System Bus, to Graphics Synthesizer (GS)]
  • CPU: control, 1-e⁻ integrals
  • VPUs: 2-e⁻ integrals
  • VPUs: linear algebra (for MCSCF, etc.)

20
How would a quantum chemist use the EE?
[Diagram: CPU, VPU0, VPU1, System Bus, to Graphics Synthesizer (GS)]
  • CPU: control, 1-e⁻ integrals
  • VPUs: 2-e⁻ integrals
  • VPUs: linear algebra (for MCSCF, etc.)
  • CPU: classical propagation

21
How would a quantum chemist use the EE?
[Diagram: CPU, VPU0, VPU1, System Bus, to Graphics Synthesizer (GS)]
  • General characteristics
    • Data flow is more bus-intensive and chaotic
    • Each VPU does a few tasks

22
Obstacles to Overcome
[Block diagram labels: CPU, VPU0, VPU1 (to Graphics Synthesizer), System Bus (128 bit), Image Processing Unit, Memory Interface (to Main Memory), I/O Interface (to Peripherals), DMAC]
23
Obstacles to Overcome
[Block diagram labels: CPU, VPU0, VPU1 (to Graphics Synthesizer), System Bus (128 bit), Image Processing Unit, Memory Interface (to Main Memory), I/O Interface (to Peripherals), DMAC]
24
Obstacles to Overcome
[Block diagram labels: CPU, VPU0, VPU1, VIF0, VIF1, GIF (to Graphics Synthesizer), System Bus (128 bit), Image Processing Unit, Memory Interface (to Main Memory), I/O Interface (to Peripherals), DMAC]
25
Dot Product Test: Macromode
[Diagram: CPU, VPU0, VPU1, to Graphics Synthesizer (GS)]
  • CPU uses VPU0 like a coprocessor
  • 4-element dot products
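
To make the macromode test concrete, here is a sketch in which a 4-element dot product is the only primitive, as if each call were one coprocessor operation issued by the CPU. The helper names (`dot4`, `pad4`, `dot`) are hypothetical, and this is plain Python rather than VPU0 code.

```python
def dot4(a, b):
    """Dot product of two 4-element vectors (stand-in for one VPU0 operation)."""
    assert len(a) == 4 and len(b) == 4
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]

def pad4(chunk):
    """Zero-pad a chunk of up to 4 elements to exactly 4."""
    return list(chunk) + [0.0] * (4 - len(chunk))

def dot(x, y):
    """Longer dot product assembled from 4-element pieces."""
    assert len(x) == len(y)
    total = 0.0
    for i in range(0, len(x), 4):
        # The host loop plays the CPU; each dot4 call plays the coprocessor.
        total += dot4(pad4(x[i:i + 4]), pad4(y[i:i + 4]))
    return total

print(dot4([1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]))
print(dot([1.0, 2.0, 3.0, 4.0, 5.0], [1.0] * 5))
```

The per-call overhead of driving the vector unit from the host loop is exactly what a macromode benchmark of this kind would expose.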

26
Dot Product Test: Macromode
[Diagram: CPU, VPU0, VPU1, to Graphics Synthesizer (GS)]
27
Micromode
[Diagram: CPU, VPU0, VPU1, to Graphics Synthesizer (GS)]
  • VPUs run independently of CPU
  • Data transfer controlled by either CPU or DMAC/VIF

28
Matrix-Vector Multiplication: Micromode
[Diagram: CPU, VPU0, VPU1, to Graphics Synthesizer (GS)]
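
One way to organize a matrix-vector product for a 4-wide vector unit is to keep four output rows in an accumulator and multiply-accumulate one broadcast element of x at a time. The sketch below shows that loop structure only; it is an assumed layout, not the authors' micromode program.

```python
def matvec(A, x):
    """y = A x for a row-major list-of-lists A, four rows at a time."""
    n_rows, n_cols = len(A), len(x)
    y = [0.0] * n_rows
    for r0 in range(0, n_rows, 4):
        rows = range(r0, min(r0 + 4, n_rows))
        acc = [0.0] * len(rows)          # one accumulator lane per row
        for j in range(n_cols):
            # Broadcast x[j] and multiply-accumulate into every lane.
            for lane, r in enumerate(rows):
                acc[lane] += A[r][j] * x[j]
        y[r0:r0 + len(rows)] = acc
    return y

A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]
print(matvec(A, [1.0, 1.0]))
```

Each matrix element is used exactly once, so this kernel is limited by how fast data can be streamed in rather than by arithmetic.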
29
Matrix-Matrix Multiplication: Micromode
[Diagram: CPU, VPU0, VPU1, to Graphics Synthesizer (GS)]
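
Matrix-matrix multiplication on a unit with a small local data memory is usually organized in tiles, so that each tile is loaded once and reused many times. The following is a generic blocked multiply under that assumption; the `block` parameter and the function name are illustrative, and nothing here is taken from the authors' implementation.

```python
def matmul_blocked(A, B, block=4):
    """C = A B computed tile by tile (A is n x k, B is k x m)."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, block):
        for j0 in range(0, m, block):
            for k0 in range(0, k, block):
                # On real hardware, the A and B tiles for this step would be
                # transferred into local memory before the inner loops run.
                for i in range(i0, min(i0 + block, n)):
                    for kk in range(k0, min(k0 + block, k)):
                        a = A[i][kk]
                        for j in range(j0, min(j0 + block, m)):
                            C[i][j] += a * B[kk][j]
    return C

print(matmul_blocked([[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]))
```

Unlike the matrix-vector case, every loaded element is reused about `block` times, which is why this kernel can approach peak arithmetic throughput.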
30
Playing GAMESS
  • GAMESS successfully ported to PS2
  • Uses CPU only, not VPUs
  • Test case: 11 steps of an HF geometry optimization of butadiene

Schmidt, M. W.; Baldridge, K. K.; Boatz, J. A.; Elbert, S. T.; Gordon, M. S.; Jensen, J. H.; Koseki, S.; Matsunaga, N.; Nguyen, K. A.; Su, S.; Windus, T. L.; Dupuis, M.; Montgomery, J. A. J. Comp. Chem. 1993, 14, 1347.
31
PlayStation Cluster
  • We installed MPICH on two PS2s.
  • All tests ran successfully, with no modifications
    at all.
  • We built and ran a driver to calculate numerical
    energy derivatives in parallel.
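
A parallel numerical-derivative driver can be sketched as follows: each rank evaluates central differences for its own share of the coordinates, and the partial results are summed. This toy version runs the "ranks" one after another in a single process; a real driver would obtain `rank` and `size` from MPI and combine results with a sum reduction such as `MPI_Reduce`. The `energy` function is a hypothetical stand-in for one electronic-structure calculation.

```python
def energy(x):
    # Hypothetical stand-in for one electronic-structure energy evaluation.
    return sum((i + 1) * xi * xi for i, xi in enumerate(x))

def partial_gradient(x, rank, size, h=1e-4):
    """Central differences for the coordinates owned by one rank."""
    grad = [0.0] * len(x)
    for i in range(rank, len(x), size):   # round-robin work assignment
        plus = list(x)
        plus[i] += h
        minus = list(x)
        minus[i] -= h
        grad[i] = (energy(plus) - energy(minus)) / (2 * h)
    return grad

def reduce_sum(parts):
    # Plays the role of a sum reduction across ranks.
    return [sum(column) for column in zip(*parts)]

x = [1.0, 2.0, 3.0]
size = 2  # two "ranks", standing in for the two PS2s
gradient = reduce_sum([partial_gradient(x, r, size) for r in range(size)])
print(gradient)
```

Each displaced-geometry energy is independent of the others, so this workload needs almost no communication, which suits a small cluster with a slow interconnect.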

32
PlayStation 3 and Cell
  • 256 MB RAM
  • 3.0 GHz
  • 8 SPU / 1 VMX / 1 PE
  • 400?
  • Spring 2006
  • Q1 2006
33
PlayStation 3 and Cell
  • Collaboration with IBM Yorktown (Ashwini Nanda)
  • Implemented DP matrix-matrix multiplication on the Cell simulator
    • 85% of peak performance (at 3.0 GHz, 80 GFlops)
  • Most PS2 problems go away!
    • DMA supports transfer to/from SPU and CPU
    • 256 KB local memory per SPU (compare to 32 KB on PS2 VU)
    • 3.0 GHz CPU/SPU (compare to 266 MHz for PS2)
    • Hardware DP floating-point support
    • Compiler support from IBM (no more assembler)
  • Remaining concerns
    • Limited memory: 256 MB on PS3, 512 MB on Cell blades
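
The local-memory comparison on this slide matters most for tiled matrix multiplication. As a back-of-the-envelope sketch, the code below estimates the largest square tile size when three tiles (one each of A, B and C) must fit in local memory. It assumes the whole local memory is available for data, 8-byte double precision on the SPU, and 4-byte single precision on the PS2 VU; these are simplifying assumptions, not figures from the slides.

```python
import math

def max_tile(local_bytes, bytes_per_element, tiles_resident=3):
    """Largest b such that `tiles_resident` b-by-b tiles fit in local memory."""
    return math.isqrt(local_bytes // (tiles_resident * bytes_per_element))

spu_tile = max_tile(256 * 1024, 8)  # Cell SPU, double precision (assumed)
vu_tile = max_tile(32 * 1024, 4)    # PS2 VU, single precision (assumed)
print(spu_tile, vu_tile)
```

Larger tiles mean more arithmetic per byte transferred, so the bigger local memory directly eases the bandwidth pressure seen on the PS2.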
34
Conclusion
  • Game consoles can potentially outperform conventional x86 computers for scientific computing, at a significantly lower cost.
  • With what we have learned from the PlayStation 2 and our collaboration with IBM on Cell, we will be able to use the PlayStation 3 soon after its release.

35
Dot Product Test: Micromode
[Diagram: CPU, VPU0, VPU1, to Graphics Synthesizer (GS)]