GP2: General Purpose Computation using Graphics Processors

1
GP2: General Purpose Computation using Graphics Processors
Dinesh Manocha, Avneesh Sud
  • http://gamma.cs.unc.edu/GPGP
  • Spring 2007
  • Department of Computer Science
  • UNC Chapel Hill

2
Instructors
  • Dinesh Manocha dm_at_cs.unc.edu 962-1749
  • Avneesh Sud sud_at_cs.unc.edu 962-1849

3
Class Schedule
  • Current time slot: 2:00-3:15pm, Mon/Wed, SN011
  • Office hours: TBD
  • Class mailing list: gpgp_at_cs.unc.edu (??)

7
GPGP: What kind of course is it?
  • Is it a graphics course?
  • Is it a system course?
  • Is it an application course?
  • It is all of them!!

8
Is this the right course for me?
  • No strict prerequisites
  • The course borrows concepts from:
    • Computer graphics
    • Linear algebra
    • Numerical computations
    • Architectures: CPUs and GPUs
    • Parallel programming (data parallel programming)
  • Applications:
    • Geometric computations
    • Database computations
    • Scientific computing and physical simulation
    • Computer vision

9
Modern Commodity Processors
[Diagram labels: GPU (1.3 GHz); Video Memory (768 MB); 2 x 4 MB Cache; System Memory (4 GB); PCI-E Bus (4 GB/s); HyperTransport (20 GB/s)]
10
GPUs of Today!
  • The GPU on commodity video cards has evolved into
    an extremely flexible and powerful processor
  • Programmability
  • Precision
  • Power

11
GPGP
  • The GPU on commodity video cards has evolved into
    an extremely flexible and powerful processor
  • Programmability
  • Precision
  • Power
  • This course will address how to harness that
    power for general-purpose computation
    (non-rasterization)
  • Algorithmic issues
  • Programming and systems
  • Applications

12
GeForce 7900 302M Transistors (2005)
13
GeForce 7900 302M Transistors (OUT OF DATE)
14
GeForce 8800 600M Transistors (2006)
15
Graphics Processing Units (GPUs)
  • Commodity processor for graphics applications
  • Massively parallel vector processors
  • High memory bandwidth
  • Low memory latency pipeline
  • Programmable
  • High growth rate
  • Power-efficient

16
GPU: Commodity Processor
[Image labels: Laptops, Consoles, Cell phones, PSP, Desktops]
17
GPU: Commodity Processor
[Image labels: Laptops, Consoles, Cell phones, PSP, Desktops, ????, SuperComputers]
18
GPU: Commodity Processor
[Image labels: Laptops, Consoles, Cell phones, PSP, Desktops, ????, iPhone]
19
Graphics Processing Units (GPUs)
  • Commodity processor for graphics applications
  • Massively parallel vector processors
  • 10-20x more operations per sec than CPUs
  • High memory bandwidth
  • Better hides memory latency pipeline
  • Programmable
  • High growth rate
  • Power-efficient

20
Parallelism on GPUs
Graphics FLOPS: GPU 1.3 TFLOPS, CPU 25.6 GFLOPS
21
Quad SLI: 1.3 billion transistors (Jan 2006)
22
Graphics Processing Units (GPUs)
  • Commodity processor for graphics applications
  • Massively parallel vector processors
  • High memory bandwidth
  • Better hides latency pipeline
  • Programmable
  • 10x more memory bandwidth than CPUs
  • High growth rate
  • Power-efficient

23
CPU vs. GPU Memory Hierarchy
[Diagram labels. CPU: Core 1, Core 2, Registers, L1 Dcache, L2 cache, DDR2 RAM. GPU: FP units, Registers, L1 cache, L2 cache, GDDR4 RAM]
24
CPU vs. GPU Memory Hierarchy: Broad Level Comparison
[Diagram labels as on slide 23, with added cache annotations: Write through; Write back]
25
CPU vs. GPU Memory Hierarchy
[Diagram labels as on slide 23, with added cache annotations: Very small; Small, 4 MB]
26
CPU vs. GPU Memory Hierarchy
[Diagram labels as on slide 23, with added memory annotations: High B/W, 86 GB/s; Low B/W, 8 GB/s]
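
As a rough illustration of the bandwidth labels on this slide, the sketch below estimates a lower bound on the time to stream a buffer once at each quoted peak rate. Pairing 86 GB/s with GDDR4 video memory and 8 GB/s with DDR2 system memory is an assumption read off the label order; the 256 MB buffer size and the name `stream_time_ms` are hypothetical.

```python
# Peak rates quoted on the slide; which memory each belongs to is an assumption.
GPU_BANDWIDTH_GBPS = 86.0  # "High B/W, 86 GB/s", assumed to be GDDR4 video memory
CPU_BANDWIDTH_GBPS = 8.0   # "Low B/W, 8 GB/s", assumed to be DDR2 system memory


def stream_time_ms(num_bytes: int, bandwidth_gbps: float) -> float:
    """Lower bound on the time (ms) to read num_bytes once at peak bandwidth."""
    return num_bytes / (bandwidth_gbps * 1e9) * 1e3


buffer_bytes = 256 * 1024 * 1024  # hypothetical 256 MB working set
cpu_ms = stream_time_ms(buffer_bytes, CPU_BANDWIDTH_GBPS)
gpu_ms = stream_time_ms(buffer_bytes, GPU_BANDWIDTH_GBPS)
print(f"CPU: {cpu_ms:.2f} ms, GPU: {gpu_ms:.2f} ms, ratio: {cpu_ms / gpu_ms:.2f}x")
```

The ratio depends only on the two quoted rates (86 / 8), which is in line with the "10x more memory bandwidth than CPUs" bullet on slide 22. Real transfers rarely reach peak bandwidth.
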
27
Graphics Processing Units (GPUs)
  • Commodity processor for graphics applications
  • Massively parallel vector processors
  • High memory bandwidth
  • Better hides latency pipeline
  • Programmable
  • High growth rate
  • Power-efficient

28
GFLOPS for GPUs and CPUs
[Chart labels: Graphics-Flops; Giga-Flops]
29
Graphics Processing Units (GPUs)
  • Commodity processor for graphics applications
  • Massively parallel vector processors
  • High memory bandwidth
  • Better hides latency pipeline
  • Programmable
  • High growth rate
  • Power-efficient (high throughput per watt)

30
Computational Power of GPUs
  • Why are GPUs getting faster so fast?
  • Arithmetic intensity: the specialized nature of
    GPUs makes it easier to use additional
    transistors for computation, not cache
  • Economics: the multi-billion dollar video game market
    is the killer application that pays for innovation
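
One common way to put a number on arithmetic intensity is floating-point operations per byte of memory traffic. The sketch below uses that definition, which is an assumption and not something the slide states, to compare two textbook kernels. The function names and the idealized traffic counts are hypothetical.

```python
def saxpy_intensity(n: int, bytes_per_value: int = 4) -> float:
    """FLOPs per byte for y = a * x + y on n-element vectors."""
    flops = 2 * n                       # one multiply and one add per element
    traffic = 3 * n * bytes_per_value   # read x, read y, write y
    return flops / traffic


def matmul_intensity(n: int, bytes_per_value: int = 4) -> float:
    """FLOPs per byte for an n x n matrix product, assuming ideal data reuse."""
    flops = 2 * n ** 3                      # n multiply-adds per output entry
    traffic = 3 * n * n * bytes_per_value   # read A and B once, write C once
    return flops / traffic


print(f"SAXPY:  {saxpy_intensity(1024):.3f} FLOPs/byte")
print(f"Matmul: {matmul_intensity(1024):.3f} FLOPs/byte")
```

Under these assumptions SAXPY stays at a fixed low intensity regardless of size, while the matrix product's intensity grows with n. Kernels of the second kind benefit most from hardware that spends its transistors on ALUs.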

31
GPUs and Computer Architecture
  • Current research in computer architecture is
    looking at
  • Streaming computation
  • Flexible polymorphous computing systems
  • Multi-core architecture
  • Heterogeneous architecture
  • More on these topics in the future

33
GPUs and Computer Architecture
  • Current research in computer architecture is
    looking at
  • Streaming computation
  • Flexible polymorphous computing systems
  • Multi-core architecture
  • Heterogeneous architecture
  • GPU-like architectures have a lot in common with
    all these research trends!
  • We plan to touch on many of these topics as part
    of the course!

34
Is There a Future of GPGPU?
  • http://www.informationweek.com/news/showArticle.jhtml?articleID=196800208
    One of the Five Disruptive Technologies for 2007
  • http://www.wired.com/news/technology/computers/0,72090-0.html?tw=wn_index_9
  • SuperComputing's Next Revolution

35
Capabilities of Current GPUs
  • Modern GPUs are deeply programmable
  • Programmable pixel, vertex, video engines
  • Solidifying high-level language support
  • Modern GPUs support 32-bit floating point
    precision
  • Great development in the last few years
  • 64-bit arithmetic may be coming soon
  • Almost IEEE FP compliant
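
To give a feel for what 32-bit precision means for accumulation, the sketch below emulates IEEE 754 single-precision addition on the CPU and compares it with Python's native 64-bit floats. It is only an illustration: it assumes fully IEEE-compliant rounding, which the slide says GPUs of the time only approximated. The helper names are hypothetical.

```python
import struct


def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit IEEE 754 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sum_float32(values) -> float:
    """Accumulate with a rounding to 32-bit precision after every addition."""
    total = 0.0
    for v in values:
        total = to_float32(total + to_float32(v))
    return total


values = [0.1] * 100_000
exact = 10_000.0
err32 = abs(sum_float32(values) - exact)
err64 = abs(sum(values) - exact)
print(f"32-bit error: {err32:.6f}, 64-bit error: {err64:.3e}")
```

The 32-bit running sum drifts much further from the exact value than the 64-bit one, which is one reason long reductions need care at single precision.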

36
The Potential of GPGP
  • The power and flexibility of GPUs makes them an
    attractive platform for general-purpose
    computation
  • Example applications range from in-game physics
    simulation, geometric applications to
    conventional computational science
  • Goal: make the inexpensive power of the GPU
    available to developers as a sort of
    computational coprocessor
  • Check out http://www.gpgpu.org

37
GPGP Challenges
  • GPUs are designed for and driven by video games
  • Programming model is unusual and tied to computer
    graphics
  • Programming environment is tightly constrained
  • Underlying architectures are:
    • Inherently parallel
    • Rapidly evolving (even in basic feature set!)
    • Largely secret
  • No clear standards (besides DirectX imposed by
    MSFT)
  • Can't simply port code written for the CPU!
  • Is there a formal class of problems that can be
    solved using current GPUs?

38
Importance of Data Parallelism
  • GPUs are designed for the graphics and gaming industry
  • Highly parallel tasks
  • GPUs process independent vertices and fragments
  • Temporary registers are zeroed
  • No shared or static data
  • No read-modify-write buffers
  • Data-parallel processing
  • GPU architecture is ALU-heavy
  • Multiple vertex and pixel pipelines, multiple ALUs
    per pipe
  • Hide memory latency (with more computation)
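
The constraints listed above (independent elements, no shared data, no read-modify-write) can be mimicked on the CPU as follows. This is a minimal sketch, not GPU code; `run_kernel` and `blur3` are hypothetical names. Each kernel invocation may read anywhere in the input but produces exactly one value in a separate output buffer.

```python
from typing import Callable, List, Sequence


def run_kernel(kernel: Callable[[int, Sequence[float]], float],
               src: Sequence[float]) -> List[float]:
    """Apply kernel once per output element.

    Invocations share no state, so they could run in any order or in parallel.
    The result is a new list, so the input is never modified in place.
    """
    return [kernel(i, src) for i in range(len(src))]


def blur3(i: int, src: Sequence[float]) -> float:
    """Hypothetical kernel: 3-tap box filter with clamped borders."""
    left = src[max(i - 1, 0)]
    right = src[min(i + 1, len(src) - 1)]
    return (left + src[i] + right) / 3.0


signal = [0.0, 0.0, 3.0, 0.0, 0.0]
print(run_kernel(blur3, signal))
```

Because the kernel only gathers from `src` and never writes to it, the order in which elements are processed cannot change the result.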

39
GPGPU Applications
  • Geometric computations
  • Database computations
  • Scientific computing and physical simulation
  • Signal processing
  • Computer vision
  • Efficient when computation domain is a uniform
    grid

40
Geometric Computations
  • Distance computations: data-parallel computation
  • Demo (2D)
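
A brute-force 2D distance field shows why distance computation maps well to data-parallel hardware: every grid cell can be evaluated on its own. The sketch below is a plain CPU illustration under that reading. It is not the method used in the demo, and the function name and site coordinates are hypothetical.

```python
import math
from typing import List, Tuple


def distance_field(width: int, height: int,
                   sites: List[Tuple[float, float]]) -> List[List[float]]:
    """Return a grid where each cell holds the distance to its nearest site."""
    field = []
    for y in range(height):
        row = []
        for x in range(width):
            # The work for one cell is independent of every other cell.
            row.append(min(math.hypot(x - sx, y - sy) for sx, sy in sites))
        field.append(row)
    return field


field = distance_field(4, 3, [(0.0, 0.0), (3.0, 2.0)])
for row in field:
    print(" ".join(f"{d:.2f}" for d in row))
```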

41
Geometric Computations
  • Distance computations

42
Geometric Computations
  • Collision Detection and Proximity Computations
  • GPU: a culling co-processor

[Pipeline diagram labels: N-Objects; Stage 1 Culling; GPU-Based Culling; Distance-Based Culling; Potential Colliding Set; Potential Neighbor Set; Exact Tests; Overlap Tests; Collision; Distance; CPU; GPU]
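
The two-stage idea in this diagram, a cheap conservative culling pass followed by exact tests on the survivors, can be sketched as below. This is an illustration only: it swaps in axis-aligned bounding boxes on the CPU for the GPU-based culling the slide refers to, and it uses circles as stand-in objects. All names are hypothetical.

```python
import math
from itertools import combinations
from typing import List, Tuple

Circle = Tuple[float, float, float]       # (center x, center y, radius)
Box = Tuple[float, float, float, float]   # (min x, min y, max x, max y)


def bounding_box(c: Circle) -> Box:
    x, y, r = c
    return (x - r, y - r, x + r, y + r)


def boxes_overlap(a: Box, b: Box) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def cull(objects: List[Circle]) -> List[Tuple[int, int]]:
    """Stage 1: conservative test that yields the potentially colliding set."""
    boxes = [bounding_box(c) for c in objects]
    return [(i, j) for i, j in combinations(range(len(objects)), 2)
            if boxes_overlap(boxes[i], boxes[j])]


def exact_test(a: Circle, b: Circle) -> bool:
    """Stage 2: exact overlap test, run only on pairs that survive culling."""
    return math.hypot(a[0] - b[0], a[1] - b[1]) <= a[2] + b[2]


def collisions(objects: List[Circle]) -> List[Tuple[int, int]]:
    return [(i, j) for i, j in cull(objects)
            if exact_test(objects[i], objects[j])]


objects = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (1.8, 1.8, 1.0), (10.0, 10.0, 1.0)]
pcs = cull(objects)
hits = collisions(objects)
print("potentially colliding:", pcs)
print("colliding:", hits)
```

The culling stage may report false positives but never misses a real collision, so the exact stage only has to examine a small fraction of all pairs.
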
43
Geometric Computations
  • Collision Detection

44
Geometric Computations
  • Proximity Computations

45
Database Computations
46
Physical Simulation
  • Solving PDEs
  • Numerical methods
  • Linear Algebra
  • Reaction-Diffusion Demo
  • Fluid Demo
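
As one concrete instance of solving a PDE on a uniform grid, the sketch below runs Jacobi relaxation for Laplace's equation. The choice of equation, the boundary values, and the function names are assumptions made for illustration and are not taken from the demos. Each step reads one grid and writes another, so no cell is updated in place.

```python
from typing import List

Grid = List[List[float]]


def jacobi_step(src: Grid) -> Grid:
    """One Jacobi relaxation step; boundary cells are held fixed.

    Every interior cell becomes the average of its four neighbours in src.
    """
    h, w = len(src), len(src[0])
    dst = [row[:] for row in src]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            dst[y][x] = 0.25 * (src[y - 1][x] + src[y + 1][x] +
                                src[y][x - 1] + src[y][x + 1])
    return dst


def solve(grid: Grid, iterations: int) -> Grid:
    for _ in range(iterations):
        grid = jacobi_step(grid)
    return grid


# Hypothetical setup: top edge held at 1.0, all other cells start at 0.0.
grid = [[1.0] * 5] + [[0.0] * 5 for _ in range(4)]
result = solve(grid, 200)
for row in result:
    print(" ".join(f"{v:.3f}" for v in row))
```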

47
Signal Processing
  • FFT, DCT, Video Processing
  • DCT demo
  • Video filtering demo
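
A naive DCT makes the data-parallel structure of these transforms visible: every output coefficient is an independent weighted sum over the input. The sketch below computes an unnormalized DCT-II on the CPU. It is not the algorithm used in the demo, and the function name is hypothetical.

```python
import math
from typing import List, Sequence


def dct2(signal: Sequence[float]) -> List[float]:
    """Naive unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi / N * (n + 0.5) * k)."""
    n_samples = len(signal)
    return [
        sum(x * math.cos(math.pi / n_samples * (n + 0.5) * k)
            for n, x in enumerate(signal))
        for k in range(n_samples)   # each k is independent of every other k
    ]


coeffs = dct2([1.0, 1.0, 1.0, 1.0])
print([round(c, 6) for c in coeffs])
```

For a constant input, all of the energy lands in the first coefficient and the rest are zero up to rounding.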

48
Computer Vision
  • Realtime feature tracker (KLT)

50
Goals of this Course
  • A detailed introduction to general-purpose
    computing on graphics hardware
  • Emphasis includes
  • Core computational building blocks
  • Strategies and tools for programming GPUs
  • Cover many applications and explore new
    applications
  • Highlight major research issues

51
Course Organization
  • Survey lectures
  • Instructors, other faculty, senior graduate
    students
  • Breadth and depth coverage
  • Student presentations

52
Course Contents
  • Overview of GPUs: architecture and features
  • Models of computation for GPU-based algorithms
  • System issues: cache and data management,
    languages and compilers
  • Numerical and scientific computations: linear
    algebra computations, optimization, FFT, rigid body
    simulation, fluid dynamics
  • Geometric computations: proximity computations,
    distance fields, motion planning and navigation
  • Database computations: database queries
    (predicates, booleans, aggregates), streaming
    databases and data mining, sorting and searching
  • GPU clusters: parallel computing environments for
    GPUs
  • Rendering: ray tracing, photon mapping, shadows
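
The database item above (predicates, booleans, aggregates) fits the same data-parallel pattern: a predicate is an independent test per record, booleans combine the resulting masks element by element, and an aggregate is a reduction. The sketch below is a CPU illustration under that reading. The record fields and function names are hypothetical.

```python
from typing import Callable, List, Sequence


def evaluate(predicate: Callable[[dict], bool],
             records: Sequence[dict]) -> List[int]:
    """Map step: one independent 0/1 result per record."""
    return [1 if predicate(r) else 0 for r in records]


def mask_and(a: Sequence[int], b: Sequence[int]) -> List[int]:
    """Boolean combination of two predicate masks, element by element."""
    return [x & y for x, y in zip(a, b)]


def reduce_sum(values: Sequence[int]) -> int:
    """Aggregate by repeated pairwise passes, halving the data each pass."""
    data = list(values)
    while len(data) > 1:
        if len(data) % 2:
            data.append(0)   # pad odd lengths with the additive identity
        data = [data[i] + data[i + 1] for i in range(0, len(data), 2)]
    return data[0] if data else 0


records = [{"age": 25, "salary": 40}, {"age": 41, "salary": 90},
           {"age": 37, "salary": 75}, {"age": 52, "salary": 30},
           {"age": 33, "salary": 85}]
older = evaluate(lambda r: r["age"] > 30, records)
well_paid = evaluate(lambda r: r["salary"] > 70, records)
count = reduce_sum(mask_and(older, well_paid))
print("matching records:", count)
```

The pairwise reduction takes a logarithmic number of passes, and each pass is itself an independent per-element operation.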

54
Student Load
  • Stay awake in classes!
  • One class lecture
  • Read a lot of papers
  • 2-3 small assignments
  • A MAJOR COURSE PROJECT WITH RESEARCH COMPONENT

55
Course Projects
  • Work by yourself or as part of a small team
  • Develop new algorithms for simulation, geometric
    problems, database computations
  • Formal model for GPU algorithms or GPU hacking
  • Issues in developing GPU clusters for scientific
    computation
  • Look into new architecture and parallel
    programming trends

56
Possible Course Projects
  • Check the WWW site
  • http://gamma.cs.unc.edu/GPGP/projects