Lecture 2: System Metrics and Pipelining - PowerPoint PPT Presentation

Provided by: RajeevBalas182
Learn more at: http://www.eng.utah.edu
1
Lecture 2: System Metrics and Pipelining
  • Today's topics (Sections 1.6, 1.7, 1.9, A.1):
    • Quantitative principles of computer design
    • Measuring cost and dependability
    • Introduction to pipelining
  • Class web-page and class mailing list are now functional: http://www.eng.utah.edu/cs6810
  • Assignment 1 will be posted later today; due in 12 days

2
Amdahl's Law
  • Architecture design is very bottleneck-driven: make the common case fast, do not waste resources on a component that has little impact on overall performance/power
  • Amdahl's Law: the performance improvement from an enhancement is limited by the fraction of time the enhancement comes into play
  • Example: a web server spends 40% of time in the CPU and 60% of time doing I/O; a new processor that is ten times faster results in a 36% reduction in execution time (speedup of 1.56); Amdahl's Law states that maximum execution time reduction is 40% (max speedup of 1.66)
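
The web-server example can be checked with a few lines of arithmetic. This is a minimal sketch; the function name `amdahl_speedup` and its parameters are illustrative and not taken from the slides. Note that the limiting speedup 1 / 0.6 is about 1.67, which the slide truncates to 1.66.

```python
def amdahl_speedup(fraction_enhanced, enhancement_speedup):
    """Overall speedup when only a fraction of execution time is sped up."""
    # New execution time, normalized so that the old time is 1.0.
    new_time = (1.0 - fraction_enhanced) + fraction_enhanced / enhancement_speedup
    return 1.0 / new_time


cpu_fraction = 0.40  # 40% of time in the CPU, 60% doing I/O

speedup = amdahl_speedup(cpu_fraction, 10.0)   # processor is 10x faster
time_reduction = 1.0 - 1.0 / speedup           # fraction of time saved
max_speedup = 1.0 / (1.0 - cpu_fraction)       # infinitely fast processor

print(f"speedup with 10x CPU: {speedup:.2f}")
print(f"execution time reduction: {time_reduction:.0%}")
print(f"upper bound on speedup: {max_speedup:.2f}")
```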

3
Principle of Locality
  • Most programs are predictable in terms of instructions executed and data accessed
  • The 90-10 Rule: a program spends 90% of its execution time in only 10% of the code
  • Temporal locality: a program will shortly re-visit X
  • Spatial locality: a program will shortly visit X+1

4
Exploit Parallelism
  • Most operations do not depend on each other; hence, execute them in parallel
  • At the circuit level, simultaneously access multiple ways of a set-associative cache
  • At the organization level, execute multiple instructions at the same time
  • At the system level, execute a different program while one is waiting on I/O

5
Factors Determining Cost
  • Cost: amount spent by manufacturer to produce a finished good
  • High volume → faster learning curve, increased manufacturing efficiency (10% lower cost if volume doubles), lower R&D cost per produced item
  • Commodities: identical products sold by many vendors in large volumes (keyboards, DRAMs); low cost because of high volume and competition among suppliers

6
Wafers and Dies
An entire wafer is produced and chopped into
dies that undergo testing and packaging
7
Integrated Circuit Cost
  • Cost of an integrated circuit = (cost of die + cost of packaging and testing) / final test yield
  • Cost of die = cost of wafer / (dies per wafer x die yield)
  • Dies/wafer = wafer area / die area - π x wafer diam / die diag
  • Die yield = wafer yield x (1 + (defect rate x die area) / α)^(-α)
  • Thus, die yield depends on die area and complexity arising from multiple manufacturing steps (α = 4.0)

8
Integrated Circuit Cost Examples
  • A 30 cm diameter wafer cost $5-6K in 2001
  • Such a wafer yields about 366 good 1 cm² dies and 1014 good 0.49 cm² dies (note the effect of area and yield)
  • Die sizes: Alpha 21264 1.15 cm², Itanium 3.0 cm², embedded processors are between 0.1 and 0.25 cm²
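
The cost formulas (dies/wafer = wafer area / die area - π x wafer diameter / die diagonal, and die yield = wafer yield x (1 + defect rate x die area / α)^(-α)) can be sketched in code to see how the 366 and 1014 figures might arise. The slides do not state the defect rate or the wafer yield, so the values below are assumptions: 0.6 defects per cm² and a wafer yield of 100%, together with α = 4.0 and a square die (diagonal = sqrt(2 x area)). The $5,500 wafer cost is simply the midpoint of the quoted $5-6K range. All function names are illustrative.

```python
import math


def dies_per_wafer(wafer_diameter_cm, die_area_cm2):
    """Whole dies per wafer: wafer area / die area minus the edge loss term."""
    wafer_area = math.pi * (wafer_diameter_cm / 2.0) ** 2
    die_diagonal = math.sqrt(2.0 * die_area_cm2)  # assumes a square die
    return math.floor(wafer_area / die_area_cm2
                      - math.pi * wafer_diameter_cm / die_diagonal)


def die_yield(die_area_cm2, defects_per_cm2, alpha=4.0, wafer_yield=1.0):
    """Fraction of dies that are good."""
    return wafer_yield * (1.0 + defects_per_cm2 * die_area_cm2 / alpha) ** (-alpha)


def good_dies(wafer_diameter_cm, die_area_cm2, defects_per_cm2):
    """Expected number of good dies per wafer."""
    return (dies_per_wafer(wafer_diameter_cm, die_area_cm2)
            * die_yield(die_area_cm2, defects_per_cm2))


def die_cost(wafer_cost, wafer_diameter_cm, die_area_cm2, defects_per_cm2):
    """Cost of die = cost of wafer / (dies per wafer x die yield)."""
    return wafer_cost / good_dies(wafer_diameter_cm, die_area_cm2, defects_per_cm2)


ASSUMED_DEFECTS_PER_CM2 = 0.6   # assumption, not given on the slide
ASSUMED_WAFER_COST = 5500.0     # assumption: midpoint of $5-6K

for area in (1.0, 0.49):
    n_good = good_dies(30.0, area, ASSUMED_DEFECTS_PER_CM2)
    cost = die_cost(ASSUMED_WAFER_COST, 30.0, area, ASSUMED_DEFECTS_PER_CM2)
    print(f"{area} cm2 die: about {n_good:.0f} good dies, ${cost:.2f} per die")
```

Under these assumptions, halving the die area roughly triples the number of good dies, because both terms improve at once: more dies fit on the wafer and a larger fraction of them work.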

9
Contribution of IC Costs to Total System Cost
Subsystem                          Fraction of total cost
Cabinet (sheet metal, plastic, power supply, fans, cables, nuts, bolts, manuals, shipping box)   6%
Processor                          22%
DRAM (128 MB)                      5%
Video card                         5%
Motherboard                        5%
Processor board subtotal           37%
Keyboard and mouse                 3%
Monitor                            19%
Hard disk (20 GB)                  9%
DVD drive                          6%
I/O devices subtotal               37%
Software (OS + Office)             20%
10
Defining Fault, Error, and Failure
  • A fault produces a latent error; it becomes effective when activated; it leads to failure when the observed actual behavior deviates from the ideal specified behavior
  • Example I: a programming mistake is a fault; the buggy code is the latent error; when the code runs, it is effective; if the buggy code influences program output/behavior, a failure occurs
  • Example II: an alpha particle strikes DRAM (fault); if it changes the memory bit, it produces a latent error; when the value is read, the error becomes effective; if program output deviates, failure occurs

11
Defining Reliability and Availability
  • A system toggles between
    • Service accomplishment: service matches specifications
    • Service interruption: service deviates from specs
  • The toggle is caused by failures and restorations
  • Reliability measures continuous service accomplishment and is usually expressed as mean time to failure (MTTF)
  • Availability measures the fraction of time that service matches specifications, expressed as MTTF / (MTTF + MTTR), where MTTR is the mean time to repair
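
As a small worked example, availability = MTTF / (MTTF + MTTR) can be computed directly. The MTTF and MTTR values below are hypothetical and chosen only for illustration; they do not come from the slides.

```python
def availability(mttf_hours, mttr_hours):
    """Fraction of time the service matches its specification."""
    return mttf_hours / (mttf_hours + mttr_hours)


# Hypothetical component: fails on average every 1,000,000 hours
# and takes 24 hours to restore.
example = availability(1_000_000.0, 24.0)
print(f"availability: {example:.6f}")

# Shortening the repair time raises availability even if MTTF is unchanged.
print(availability(1_000_000.0, 1.0) > example)
```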

12
The Assembly Line
[Figure: jobs A, B, C plotted against time]
Unpipelined: start and finish a job before moving to the next
Pipelined: break the job into smaller stages
13
Performance Improvements?
  • Does it take longer to finish each individual job?
  • Does it take less time to finish a series of jobs?
  • What assumptions were made while answering these questions?
  • Is a 10-stage pipeline better than a 5-stage pipeline?

14
Quantitative Effects
  • As a result of pipelining:
    • Time in ns to complete each individual instruction goes up
    • Number of cycles each instruction takes goes up (note the increase in clock speed)
    • Total execution time goes down, resulting in lower average time per instruction
    • Average cycles per instruction increases slightly
  • Under ideal conditions, speedup
    = ratio of elapsed times between successive instruction completions
    = number of pipeline stages
    = increase in clock speed
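
A rough model makes the ideal-speedup claim concrete. The sketch below assumes perfectly balanced stages, no stalls, and a fixed per-stage latch overhead; the overhead parameter and the specific numbers are assumptions for illustration, not values from the lecture.

```python
def unpipelined_time(num_jobs, job_latency_ns):
    """Each job starts only after the previous one has finished."""
    return num_jobs * job_latency_ns


def pipelined_time(num_jobs, job_latency_ns, num_stages, latch_overhead_ns=0.0):
    """Total time when a new job enters the pipeline every cycle."""
    cycle_ns = job_latency_ns / num_stages + latch_overhead_ns
    # The first job needs num_stages cycles; each later job adds one cycle.
    return (num_stages + num_jobs - 1) * cycle_ns


jobs, latency = 1000, 10.0
base = unpipelined_time(jobs, latency)

for stages in (5, 10):
    ideal = base / pipelined_time(jobs, latency, stages)
    with_overhead = base / pipelined_time(jobs, latency, stages,
                                          latch_overhead_ns=0.5)
    print(f"{stages} stages: ideal speedup {ideal:.2f}, "
          f"with 0.5 ns latch overhead {with_overhead:.2f}")
```

With zero overhead the speedup approaches the number of stages as the number of jobs grows. With a nonzero overhead, the individual job latency rises and the gain from going to a deeper pipeline shrinks, which is one way to approach the 10-stage versus 5-stage question.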

15
A 5-Stage Pipeline
16
A 5-Stage Pipeline
Use the PC to access the I-cache and increment
PC by 4
17
A 5-Stage Pipeline
Read registers, compare registers, compute branch target; for now, assume branches take 2 cycles (there is enough work that branches can easily take more)
18
A 5-Stage Pipeline
ALU computation, effective address computation
for load/store
19
A 5-Stage Pipeline
Memory access to/from data cache; stores finish in 4 cycles
20
A 5-Stage Pipeline
Write result of ALU computation or load into
register file
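
The five steps described on these slides can be laid out as a cycle-by-cycle occupancy chart. This is only a sketch of an ideal pipeline with no stalls or branches; the stage labels IF, ID, EX, MEM, WB are the conventional textbook names and are an assumption here, since the transcript describes the stages without naming them.

```python
# Conventional stage names (assumed): fetch, decode/register read,
# execute, memory access, write-back.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_diagram(num_instructions):
    """Return one row per instruction; row[c] is the stage occupied in cycle c."""
    total_cycles = len(STAGES) + num_instructions - 1
    rows = []
    for i in range(num_instructions):
        row = [""] * total_cycles
        # Instruction i enters the first stage in cycle i and advances
        # one stage per cycle.
        for s, name in enumerate(STAGES):
            row[i + s] = name
        rows.append(row)
    return rows


for i, row in enumerate(pipeline_diagram(3)):
    print(f"I{i}: " + " ".join(f"{cell:>3}" for cell in row))
```

Reading a column of the printed chart shows which instructions overlap in a given cycle; reading a row shows that each instruction still passes through all five stages.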