CSCI 4717/5717 Computer Architecture - PowerPoint PPT Presentation

Slides: 18
Provided by: facult2
Learn more at: http://faculty.etsu.edu
Transcript and Presenter's Notes

1
CSCI 4717/5717 Computer Architecture
  • Topic: Vector/Array Processors
  • Reading: Stallings, Section 18.7

2
Vector/Array Computing
  • Optimized for calculation rather than
    multitasking and I/O
  • Design focus is to perform parallel mathematical
    operations on a vector or array of data elements
  • A scalar processor, by contrast, must handle one
    element at a time
  • Limited market: research, government agencies,
    meteorology

3
Vector/Array Computing (continued)
  • Target applications:
      • Data-intensive scientific research, such as
        aerodynamics, seismology, and meteorology
      • Continuous field simulation
      • Specialized (high-performance) graphics
        applications
  • Applicable because of the ever-increasing need
    for improved resolution and model capabilities

4
Array Processor
  • Alternative to supercomputer
  • Configured as a peripheral to mainframe or
    minicomputer
  • Processor is only responsible for running vector
    portion of problem
  • The Sony PlayStation 3 uses a processor
    consisting of one scalar processor and eight
    vector processors, developed by IBM, Toshiba, and
    Sony. (Source:
    http://en.wikipedia.org/wiki/Vector_computer)

5
Vector/Array Operation
  • Power of vector computing comes in the form of
    special processing instructions (Single
    Instruction, Multiple Data or SIMD)
  • Code executes in lock step: a single instruction
    is issued to a large number of identical
    processors (or ALUs), backed by a large register
    set, each working on a different data element
  • A single master CPU keeps control of the entire
    process
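
The lock-step model on this slide can be illustrated with a minimal Python sketch. The function name `simd_execute` and the list-of-lanes representation are assumptions made for illustration; the loop only models the semantics, whereas real SIMD hardware would process all lanes at once.

```python
import operator


def simd_execute(instruction, lanes_a, lanes_b):
    """Apply one instruction to every lane; each lane holds different data.

    Hypothetical helper: it models SIMD semantics only. A real vector
    unit would execute all lanes in the same cycle instead of looping.
    """
    if len(lanes_a) != len(lanes_b):
        raise ValueError("operand vectors must have the same number of lanes")
    return [instruction(a, b) for a, b in zip(lanes_a, lanes_b)]


# The master CPU issues a single "add"; each ALU works on its own element.
result = simd_execute(operator.add, [1, 2, 3, 4], [10, 20, 30, 40])
print(result)
```

The same call with `operator.mul` issues a multiply to every lane instead, which is the point of the model: one instruction, many data elements.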

6
Speed-Up Not Linear
  • As with any parallel processing architecture, the
    realized speed-up of a vector processor is not
    linear because of:
      • Overhead for managing parallel computations
      • Bottlenecks in communication and storage
      • Application load that doesn't always match
        the available processors
  • These problems have a growing effect as the
    number of processors increases
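
One way to see this effect is a small model based on Amdahl's law. The slide does not name that law; it is brought in here purely as an illustration. The linear overhead term and the sample values (95% parallel work, 0.0005 overhead per added processor) are toy assumptions, not measurements.

```python
def speedup(parallel_fraction, processors, overhead_per_processor=0.0):
    """Estimated speed-up relative to a single processor.

    Amdahl's law plus a hypothetical overhead term that grows linearly
    with each processor added. Time is normalized so that the
    single-processor run takes 1.0.
    """
    serial_time = 1.0 - parallel_fraction
    parallel_time = parallel_fraction / processors
    overhead = overhead_per_processor * (processors - 1)
    return 1.0 / (serial_time + parallel_time + overhead)


for n in (1, 8, 64, 512):
    print(n, round(speedup(0.95, n, overhead_per_processor=0.0005), 2))
```

With these assumed numbers the estimate eventually falls as processors are added, because the overhead term outgrows the shrinking parallel time.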

7
Data Pipelining
  • The sequential nature of instructions allows for
    an instruction pipeline
  • Vector computing tends to have well-organized
    data too
  • This allows for pipelining the data as well:
      • Single decode for the instruction
      • Stages to fetch data, process data, and store
        the result in a register
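
The benefit of pipelining the data can be estimated with the standard pipeline timing formula. This sketch assumes three stages (fetch, process, store), one cycle per stage, and no stalls; both function names are hypothetical.

```python
def unpipelined_cycles(elements, stages=3):
    """Each element passes through every stage before the next one starts."""
    return elements * stages


def pipelined_cycles(elements, stages=3):
    """The first element fills the pipeline; each later one adds one cycle."""
    if elements == 0:
        return 0
    return stages + (elements - 1)


n = 100
print("without data pipeline:", unpipelined_cycles(n))
print("with data pipeline:   ", pipelined_cycles(n))
```

For long vectors the pipelined count approaches one cycle per element, which is why well-organized data matters.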

8
Data Pipelining (continued)
  • Example: To add an array of numbers, the
    processor must have the following information:
      • A single "add" instruction
      • Start address for the data
      • End address for the data
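
The example can be sketched in Python, treating a list as memory and list indices as addresses. The name `vector_sum` and the choice of an inclusive end address are assumptions; the slide does not say whether the end address is inclusive.

```python
def vector_sum(memory, start, end):
    """Sum memory[start..end] (end inclusive) as one "add" over a range.

    The instruction is decoded once; the loop body stands in for the
    data pipeline's fetch, process, and store stages.
    """
    accumulator = 0                          # result register
    for address in range(start, end + 1):
        operand = memory[address]            # fetch data
        accumulator = accumulator + operand  # process, store in register
    return accumulator


memory = [5, 1, 4, 2, 8, 7]
print(vector_sum(memory, start=1, end=4))
```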

9
Vector/Array Programming
  • The programming goal is to divide a large dataset
    into independent sets that can be operated on in
    parallel
  • Requires a deep understanding of the algorithm
    being applied to the data
  • Distribute data to different processors
  • Initiate parallel processing
  • Bring everything back together when parallel
    processing is complete

10
Vector/Array Programming (continued)
  • Example: Count the number of times a specific
    value appears in a large array
  • Begin by breaking up array into smaller arrays,
    one for each array processor
  • Each array processor, in parallel, counts the
    number of occurrences of the value
  • Final sum is then computed by adding the results
    from all of the processors
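
The three steps on this slide (split, count in parallel, sum) might look like the following Python sketch. Worker threads stand in for the array processors, and the names `count_in_chunk` and `parallel_count` are hypothetical. CPython threads will not actually speed up this loop; the sketch shows the structure, not the performance.

```python
from concurrent.futures import ThreadPoolExecutor


def count_in_chunk(chunk, target):
    """Work done by one processor: count matches in its own sub-array."""
    return sum(1 for value in chunk if value == target)


def parallel_count(data, target, processors=4):
    """Split data into one chunk per processor, count, then add the results."""
    chunk_size = max(1, -(-len(data) // processors))  # ceiling division
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor(max_workers=processors) as pool:
        partial_counts = pool.map(count_in_chunk, chunks, [target] * len(chunks))
        return sum(partial_counts)  # final sum over all processors


print(parallel_count([1, 2, 3, 2, 2, 5, 2], target=2, processors=3))
```

The chunks are independent, so no processor needs another's data until the final sum.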

11
Vector/Array Applications
  • Which of the following applications would be
    better served by a vector or array computer than
    an SMP, cluster, or scalar processor? What
    component of the problem is parallel?
  • Web search indexing
  • Generating the Fibonacci sequence,
    f(i) = f(i-1) + f(i-2)
  • Weather prediction
  • Image processing for a game
  • Web site server
  • Photoshop-type image processing

12
Scalar Programming
  • The following slides are based on the
    multiplication of two 100×100 matrices A and B
    (N = 100)

        DO 100 I = 1, N
          DO 100 J = 1, N
            C(I, J) = 0.0
            DO 100 K = 1, N
              C(I, J) = C(I, J) + A(I, K) * B(K, J)
    100 CONTINUE

13
Vector Programming
  • The notation (J = 1, N) indicates that operations
    on all indices J are to be carried out on N
    processors as a single operation

        DO 100 I = 1, N
          C(I, J) = 0.0 (J = 1, N)
          DO 100 K = 1, N
            C(I, J) = C(I, J) + A(I, K) * B(K, J) (J = 1, N)
    100 CONTINUE
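
For comparison, here is a hedged Python sketch of the scalar and vector listings. The function names are hypothetical. In `matmul_vector`, the list comprehension over `j` stands in for the single vector operation marked (J = 1, N); Python still evaluates it one element at a time.

```python
def matmul_scalar(a, b):
    """Triple loop: one multiply-add per (i, j, k), as in the scalar listing."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_vector(a, b):
    """The J loop is folded into whole-row operations, as in the vector listing."""
    n = len(a)
    c = []
    for i in range(n):
        row = [0.0] * n  # C(I, J) = 0.0 for all J at once
        for k in range(n):
            # One "vector" multiply-add across every J.
            row = [row[j] + a[i][k] * b[k][j] for j in range(n)]
        c.append(row)
    return c


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(matmul_scalar(a, b))
print(matmul_vector(a, b))
```

Both functions do the same arithmetic; the vector form issues N vector operations per row instead of N×N scalar ones.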

14
Fork/Join Parallel Programming
  • One method of parallel programming is fork-join
  • Programs start as a single process known as the
    master thread
  • The operation "fork" is used to indicate the
    beginning of sections of the program that are to
    be executed in parallel
  • The operation "join" is used to terminate the
    parallel threads created by "fork" to bring the
    program back to a single, master thread

15
Fork/Join Method (continued)
        DO 50 J = 1, N - 1
          FORK 100
     50 CONTINUE
        J = N
    100 DO 200 I = 1, N
          C(I, J) = 0.0
          DO 200 K = 1, N
            C(I, J) = C(I, J) + A(I, K) * B(K, J)
    200 CONTINUE
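
The fork-join listing can be approximated in Python with the standard `threading` module. This is a sketch under stated assumptions: `multiply_column` and `matmul_fork_join` are hypothetical names, `Thread.start()` plays the role of FORK, and `Thread.join()` plays the role of the join operation described on the previous slide.

```python
import threading


def multiply_column(a, b, c, j):
    """Body of the listing from label 100: compute column j of C."""
    n = len(a)
    for i in range(n):
        c[i][j] = 0.0
        for k in range(n):
            c[i][j] += a[i][k] * b[k][j]


def matmul_fork_join(a, b):
    n = len(a)
    c = [[None] * n for _ in range(n)]
    # "Fork": start one worker for each of the first N-1 columns.
    workers = [
        threading.Thread(target=multiply_column, args=(a, b, c, j))
        for j in range(n - 1)
    ]
    for worker in workers:
        worker.start()
    # The master thread computes the last column itself (J = N).
    multiply_column(a, b, c, n - 1)
    # "Join": wait for every worker, returning to a single master thread.
    for worker in workers:
        worker.join()
    return c


print(matmul_fork_join([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

Each worker writes only its own column of C, so the threads need no locking.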

16
Neural Networks

17
What?! A Blank Slide?! It must be over!!!