Farhan Mohamed Ali (W2-1) Jigar Vora (W2-2) Sonali Kapoor (W2-3) Avni Jhunjhunwala (W2-4) - PowerPoint PPT Presentation

Slides: 16
Provided by: avn2
Learn more at: http://www.ece.cmu.edu
Transcript and Presenter's Notes

1
Presentation 3: MAD MAC 525
Farhan Mohamed Ali (W2-1), Jigar Vora (W2-2),
Sonali Kapoor (W2-3), Avni Jhunjhunwala (W2-4)
W2
Design Manager: Zack Menegakis
8th February, 2006: Size estimates/Floor plan
Project Objective: Design a crucial part of a GPU
called the Multiply Accumulate Unit (MAC), which
will revolutionize graphics.
2
MAD MAC 525 Status
  • Project chosen
  • Specifications defined
  • Architecture
  • Design
  • Behavioral Verilog
  • Testbenches
  • Verilog Gate Level Design (adder test)
  • Floor plan
  • To be done
  • Schematic (started)
  • Layout
  • Extraction, LVS, post-layout simulation

3
Overview - MAD MAC 525
  • Multiply Accumulate unit (MAC)
  • Executes the function AB+C on 16 bit floating point
    inputs
  • Multiply and add in parallel to greatly speed up
    operation
  • Rounding is performed only once, so accuracy is
    greater than with separate multiply and add
    functions
  • MAD MAC accelerates FP16 blending to enable true
    HDR graphics
  • Bright things can be really bright
  • Dark things can be really dark
  • And the details can be seen in both
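
The accuracy benefit of rounding once can be illustrated with a small software model. This is a sketch, not the team's design: the helper names are hypothetical, Python's `struct` half-precision format stands in for the hardware rounding, and the input values were chosen so that every intermediate result is exact in double precision.

```python
import struct

def to_fp16(x: float) -> float:
    """Round a Python float to the nearest FP16 value via struct's 'e' format."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def mac_single_rounding(a: float, b: float, c: float) -> float:
    """Model of a fused multiply-accumulate: one rounding at the very end."""
    return to_fp16(a * b + c)

def mac_two_roundings(a: float, b: float, c: float) -> float:
    """Model of a separate multiply then add: the product is rounded first."""
    return to_fp16(to_fp16(a * b) + c)

# All three inputs are exactly representable in FP16.
a = b = 64.0625      # 64 + 2**-4
c = -4104.0          # cancels the rounded product exactly

print(mac_single_rounding(a, b, c))  # 0.00390625 (the low product bits survive)
print(mac_two_roundings(a, b, c))    # 0.0 (the low product bits were rounded away)
```

Here the exact product is 4104.00390625; rounding it to FP16 before the add discards the 2**-8 term, so the two-step result collapses to zero.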

4
Block Diagram
(Figure: block diagram. Blocks, in order of appearance: three 16-bit Inputs, RegArray A, RegArray B, RegArray C, Multiplier, Exp Calc, Align, Control Logic / Sign Determine, Leading 0 Anticipator, Adder/Subtractor, Normalize, Round, Reg Y, 16-bit Output. Bus-width labels: 16, 10, 5, 14, 22, 35, 36, 4, 1.)
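
The 16, 10 and 5 bit bus widths in the block diagram are consistent with the usual half-precision layout of 1 sign bit, 5 exponent bits and 10 fraction bits. Assuming that is how the register arrays split each input, the field extraction can be sketched as follows (the function name is hypothetical):

```python
def unpack_fp16(word: int) -> tuple[int, int, int]:
    """Split a 16-bit word into (sign, exponent, fraction) fields.

    Assumes the standard half-precision layout: bit 15 is the sign,
    bits 14..10 are the biased exponent, bits 9..0 are the fraction.
    """
    sign = (word >> 15) & 0x1
    exponent = (word >> 10) & 0x1F
    fraction = word & 0x3FF
    return sign, exponent, fraction

# 0xC500 encodes -5.0: sign 1, biased exponent 17, fraction 0x100.
sign, exponent, fraction = unpack_fp16(0xC500)
value = (-1) ** sign * (1 + fraction / 1024) * 2 ** (exponent - 15)
print(sign, exponent, fraction, value)  # 1 17 256 -5.0
```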
5
Design Decisions (Week 3)
  • Decided last week on the implementation of all
    blocks
  • a) Multiplier: carry save
  • b) Adder: variable carry select adder
  • c) Leading zero counter: carry save adder to
    count leading zeros
  • d) Align: n pass shifter
  • e) Normalize: n pass shifter
  • f) Round: incrementer and shifter
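
As an illustration of the adder choice in b), a variable carry select adder can be modelled behaviorally: each block computes its sum for carry-in 0 and carry-in 1, and the real carry from the previous block selects between them. This is only a sketch; the function name and the block widths below are hypothetical and are not taken from the team's design.

```python
def carry_select_add(a: int, b: int, block_widths: list[int], carry_in: int = 0):
    """Behavioral model of a carry select adder with variable block sizes.

    Returns (sum, carry_out) for operands whose width is sum(block_widths).
    """
    result = 0
    shift = 0
    carry = carry_in
    for width in block_widths:
        mask = (1 << width) - 1
        a_blk = (a >> shift) & mask
        b_blk = (b >> shift) & mask
        # Both candidate sums are available before the real carry arrives.
        sum_if_carry0 = a_blk + b_blk
        sum_if_carry1 = a_blk + b_blk + 1
        # The incoming carry acts as the select line of a multiplexer.
        chosen = sum_if_carry1 if carry else sum_if_carry0
        result |= (chosen & mask) << shift
        carry = chosen >> width
        shift += width
    return result, carry

# Hypothetical block sizes that grow toward the MSB and total 36 bits.
WIDTHS = [2, 3, 4, 5, 6, 7, 9]
a, b = 0xABCDE1234, 0x123456789
total, carry_out = carry_select_add(a, b, WIDTHS)
print(hex(total), carry_out)
```

Growing block sizes let the later, wider blocks finish their two candidate sums while the carry is still rippling through the earlier, narrower ones.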

6
Floorplan
(Figure: floorplan. Blocks shown: Reg A, Reg C, Reg B, Align, Exp calc, Multiplier, Ld zero, Adder, Normalize, Register Y, Round.)
7
Floorplan
  • Estimated area
  • Registers: 6000 um sq
  • Multiplier: 22000 um sq
  • Exponent calc: 7000 um sq
  • Align: 15000 um sq
  • Adder: 22000 um sq
  • Normalize: 20000 um sq
  • Round: 4000 um sq
  • Leading zero counter: 3000 um sq
  • Total: 99000 um sq
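
A quick tally of the per-block estimates shows which blocks dominate the floorplan. The figures are the ones listed above; the script itself is only a convenience sketch.

```python
# Per-block area estimates from the slide, in square micrometres.
area_um2 = {
    "Registers": 6000,
    "Multiplier": 22000,
    "Exponent calc": 7000,
    "Align": 15000,
    "Adder": 22000,
    "Normalize": 20000,
    "Round": 4000,
    "Leading zero counter": 3000,
}

total_um2 = sum(area_um2.values())

# Largest blocks first, with each block's share of the summed area.
for name, area in sorted(area_um2.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:22s}{area:7d} um sq  {100 * area / total_um2:5.1f}%")
print(f"{'Sum of blocks':22s}{total_um2:7d} um sq")
```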

8
Floorplan
  • Metal Directionality
  • M1, M2: Local interconnect, Gnd, Vdd
  • M3, M4: Clock, global wiring
9
  • Updated Estimated Transistor Count
  • Registers (I/O, pipelining, threading): 1800
  • Carry-Save Multiplier: 4500
  • Carry-Select Adder/Subtractor: 3500
  • Alignment Shifter: 1500
  • Leading 0 Anticipator: 600
  • Normalize: 3400
  • Rounding: 600
  • Special Cases and Control Logic: 2000
  • Total: 17900
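
The transistor estimates can be cross-checked the same way, and combined with the per-block area figures from the floorplan slide to get a rough area per transistor. This is a back-of-the-envelope sketch: the two lists do not use identical block categories, so the density figure is only indicative.

```python
# Transistor estimates from this slide.
transistors = {
    "Registers (I/O, pipelining, threading)": 1800,
    "Carry-Save Multiplier": 4500,
    "Carry-Select Adder/Subtractor": 3500,
    "Alignment Shifter": 1500,
    "Leading 0 Anticipator": 600,
    "Normalize": 3400,
    "Rounding": 600,
    "Special Cases and Control Logic": 2000,
}

# Per-block area estimates from the floorplan slide, in um sq.
block_areas_um2 = [6000, 22000, 7000, 15000, 22000, 20000, 4000, 3000]

total_transistors = sum(transistors.values())
area_per_transistor = sum(block_areas_um2) / total_transistors

print(total_transistors)  # 17900
print(f"{area_per_transistor:.2f} um sq per transistor (rough)")
```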

10
Critical Path
(Figure: the block diagram repeated to show the critical path, with the same blocks and bus-width labels as on the Block Diagram slide: three 16-bit Inputs, RegArray A, RegArray B, RegArray C, Multiplier, Exp Calc, Align, Control Logic / Sign Determine, Leading 0 Anticipator, Adder/Subtractor, Normalize, Round, Reg Y.)
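
The Leading 0 Anticipator and Normalize blocks on this datapath work together: the leading-zero count sets the left shift that normalizes the adder result, and the exponent is reduced by the same amount. The sketch below models that relationship in software. It counts zeros directly rather than anticipating them in parallel with the adder, and the function names and the zero-result convention are assumptions.

```python
def count_leading_zeros(value: int, width: int) -> int:
    """Number of zero bits above the most significant 1 in a width-bit value."""
    for i in range(width):
        if (value >> (width - 1 - i)) & 1:
            return i
    return width  # value is zero

def normalize(value: int, exponent: int, width: int) -> tuple[int, int]:
    """Shift value left until its MSB is 1 and adjust the exponent to match."""
    lz = count_leading_zeros(value, width)
    if lz == width:
        # Assumed convention for an all-zero result: zero value, zero exponent.
        return 0, 0
    mask = (1 << width) - 1
    return (value << lz) & mask, exponent - lz

# 8-bit example: 0b00010110 has three leading zeros.
print(count_leading_zeros(0b00010110, 8))  # 3
print(normalize(0b00010110, 10, 8))        # (176, 7), i.e. 0b10110000 with exponent 7
```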
11
Structural Verilog
12
Structural Verilog (contd.)
13
Schematics
14
Problems and Questions?
  • We have tested our Verilog using our own
    testbenches, but have not yet been able to
    verify it against a high level simulation. We
    are currently looking into Simulink for a
    solution.
  • Suggestion from last week: PDP 11 code found
    online does 32 bit fp arithmetic.
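
One possible way to get a high level reference, sketched here as an alternative to Simulink and not something the team reports using, is a short script that produces expected AB+C results for random FP16 inputs. A Verilog testbench could then compare against them. The function names and the output format are hypothetical, and the model assumes round-to-nearest with overflow saturating to infinity, which may not match the hardware's rounding and special-case rules.

```python
import random
import struct

def bits_to_float(bits: int) -> float:
    """Interpret a 16-bit pattern as an FP16 value."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]

def float_to_bits(x: float) -> int:
    """Round a float to FP16 and return its bit pattern (overflow -> infinity)."""
    try:
        return struct.unpack("<H", struct.pack("<e", x))[0]
    except OverflowError:
        return 0xFC00 if x < 0 else 0x7C00

def reference_mac(a_bits: int, b_bits: int, c_bits: int) -> int:
    """Expected A*B+C result, computed in double precision and rounded once."""
    a, b, c = (bits_to_float(v) for v in (a_bits, b_bits, c_bits))
    return float_to_bits(a * b + c)

def random_finite_fp16(rng: random.Random) -> int:
    """Random FP16 pattern excluding infinities and NaNs (exponent field 31)."""
    sign = rng.getrandbits(1)
    exponent = rng.randrange(0, 31)
    fraction = rng.getrandbits(10)
    return (sign << 15) | (exponent << 10) | fraction

def make_vectors(count: int, seed: int = 0) -> list[tuple[int, int, int, int]]:
    """Generate (a, b, c, expected) tuples of 16-bit patterns."""
    rng = random.Random(seed)
    vectors = []
    for _ in range(count):
        a, b, c = (random_finite_fp16(rng) for _ in range(3))
        vectors.append((a, b, c, reference_mac(a, b, c)))
    return vectors

# One line per vector, in hex, e.g. for loading into a testbench.
for a, b, c, y in make_vectors(4):
    print(f"{a:04x} {b:04x} {c:04x} {y:04x}")
```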

15
  • Questions?