SIMD Architectures - PowerPoint PPT Presentation

About This Presentation
Title:

SIMD Architectures

Description:

Operations can be performed in parallel on each element of a large regular data ... When computers were large, could amortize the control portion of many replicated ... – PowerPoint PPT presentation

Number of Views:45
Avg rating:3.0/5.0
Slides: 25
Provided by: laxmib
Learn more at: http://www.cs.ucr.edu
Category:

less

Transcript and Presenter's Notes

Title: SIMD Architectures


1
SIMD Architectures
  • Laxmi Narayan Bhuyan
  • http//www.cs.ucr.edu/bhuyan

2
(No Transcript)
3
(No Transcript)
4
(No Transcript)
5
(No Transcript)
6
(No Transcript)
7
(No Transcript)
8
(No Transcript)
9
(No Transcript)
10
Data Parallel Model
  • Operations can be performed in parallel on each
    element of a large regular data structure, such
    as an array
  • 1 Control Processsor broadcast to many PEs (see
    Ch. 1, Fig. 1-25, page 45 of CSG99)
  • When computers were large, could amortize the
    control portion of many replicated PEs
  • Condition flag per PE so that can skip
  • Data distributed in each memory
  • Early 1980s VLSI gt SIMD rebirth 32 1-bit PEs
    memory on a chip was the PE
  • Data parallel programming languages lay out data
    to processor

11
Data Parallel Model
  • Vector processors have similar ISAs, but no data
    placement restriction
  • SIMD led to Data Parallel Programming languages
  • Advancing VLSI led to single chip FPUs and whole
    fast µProcs (SIMD less attractive)
  • SIMD programming model led to Single Program
    Multiple Data (SPMD) model
  • All processors execute identical program
  • Data parallel programming languages still useful,
    do communication all at once Bulk Synchronous
    phases in which all communicate after a global
    barrier

12
SIMD Programming High-Performance Fortran (HPF)
  • Single Program Multiple Data (SPMD)
  • FORALL Construct similar to Fork
  • FORALL (I1N), A(I) B(I) C(I), END
    FORALL
  • Data Mapping in HPF
  • 1. To reduce interprocessor communication
  • 2. Load balancing among processors
  • http//www.npac.syr.edu/hpfa/
  • http//www.crpc.rice.edu/HPFF/

13
How does an SIMD computer work?
  • A Host computer is necessary to do the I/O
    operations
  • The user program is loaded into the control
    memory
  • The data is distributed to all the memory modules
  • The control unit decodes the instn and executes
    it if it is a scalar instn. If it is a vector
    instn, it broadcasts the control signals to the
    PEs to do the executions
  • Before broadcasting the control signals, the CU
    broadcasts an enable vector which will enable the
    PEs

14
Masking and Data Routing Mechanisms
  • A,B,C working registers
  • Si status (1 active, 0 inactive)
  • Ri Data routing register
  • Di holds address
  • Ii Index register

15
Example
16
Matrix Multiplication
17
(No Transcript)
18
(No Transcript)
19
N N Mesh
20
The Illiac IV Architecture
  • Distributed memory architecture
  • 64 PEs connected as an 8X8 2-D mesh with end
    around connection
  • LDB Local Data Buffer
  • 64, 64-bit each
  • PEM 2K X 64 bits memory

21
The Illiac IV Network
22
Maspar MP-1 Architecture
  • Configuration with 1K-16K PEs are available
  • Each PE has a 4-bit ALU, 1-bit logic unit, a
    64-bit mantissa unit, a 16-bit exponent unit,
    communication input and output ports
  • Each PE has 40 32-bit registers available to the
    programmer
  • Each processor board has 1024 PEs arranges as 64
    PE clusters (PECs) with 16 PEs per cluster
  • Each PEC is a chip connected to 8 neighbors via
    an octagonal mesh
  • Another network, called Multistage Crossbar
    Network, with three router stages gives a
    function of 1024X1024 crossbar for routing from
    any PEC to another PEC

23
(No Transcript)
24
(No Transcript)
Write a Comment
User Comments (0)
About PowerShow.com