Lecture 2: Evaluating Computer Architectures

Provided by: andrew638
Learn more at: http://www.cs.utexas.edu

1
Lecture 2: Evaluating Computer Architectures
  • Last Time
    • Technology → Architecture → Applications
    • Technology and applications are a moving target
  • Today
    • Evaluation as a key component of the iterative design process
    • Two components of evaluation
      • Performance
      • Cost

2
Metrics of Evaluation
  • Level of design → performance metric
  • Examples
    • Applications perspective
      • Time to run task (response time)
      • Tasks run per second (throughput)
    • Systems perspective
      • Millions of instructions per second (MIPS)
      • Millions of FP operations per second (MFLOPS)
    • Bus/network bandwidth: megabytes per second
    • Function units: cycles per instruction (CPI)
    • Fundamental elements (transistors, wires, pins): clock rate

3
Benchmarks
  • Microbenchmarks
    • measure one performance dimension
      • cache bandwidth
      • main memory bandwidth
      • procedure call overhead
      • FP performance
    • a weighted combination of microbenchmark performance can be a good predictor of application performance
    • give insight into the cause of performance bottlenecks
  • Macrobenchmarks
    • application execution time
    • system throughput
    • measure overall performance, but only on one application
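A rough sketch of the weighted-combination idea: if each microbenchmark yields a per-operation cost and a profile says how often an application performs each operation, the weighted sum estimates its run time. The operation names, costs, and counts below are hypothetical placeholders, not measurements from the lecture.

```python
def predict_time(cost_per_op, op_counts):
    """Estimate run time in seconds as sum(count * cost) over operation kinds."""
    return sum(op_counts[op] * cost_per_op[op] for op in op_counts)

# Hypothetical microbenchmark results: seconds per operation.
cost_per_op = {
    "cache_access": 2e-9,
    "memory_access": 100e-9,
    "procedure_call": 10e-9,
    "fp_op": 4e-9,
}

# Hypothetical application profile: how often each operation occurs.
op_counts = {
    "cache_access": 500_000_000,
    "memory_access": 5_000_000,
    "procedure_call": 20_000_000,
    "fp_op": 100_000_000,
}

predicted = predict_time(cost_per_op, op_counts)
# Per-operation contribution, useful for spotting the likely bottleneck.
breakdown = {op: op_counts[op] * cost_per_op[op] for op in op_counts}
print(f"predicted time: {predicted:.2f} s")
print("largest contributor:", max(breakdown, key=breakdown.get))
```

The per-operation breakdown is what gives the "insight into bottlenecks" that a single macrobenchmark number cannot.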

4
Some Warnings about Benchmarks
  • Benchmarks measure the whole system
    • application
    • compiler
    • operating system
    • architecture
    • implementation
  • Popular benchmarks typically reflect yesterday's programs
    • what about the programs people are running today?
    • need to design for tomorrow's problems
  • Benchmark timings are sensitive
    • alignment in cache
    • location of data on disk
    • values of data
  • Danger of inbreeding or positive feedback
    • if you make an operation fast (slow) it will be used more (less) often
    • therefore you make it faster (slower)
    • and so on, and so on
    • the optimized NOP

5
Know what you are measuring!
  • Compare apples to apples
  • Example
    • Wall clock execution time
    • User CPU time
    • System CPU time
    • Idle time (multitasking, I/O)

    csh> time latex lecture2.tex
    0.68u 0.05s 0:01.60 45.6%

  • 0.68u: user CPU time
  • 0.05s: system CPU time
  • 0:01.60: elapsed (wall clock) time
  • 45.6%: CPU time as a share of elapsed time
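The distinction between these times can be reproduced in a few lines. This is a minimal sketch using Python's standard `os.times()` and `time.perf_counter()`; the helper names `measure`, `busy`, and `idle` are made up for illustration. It reports the same four quantities as the `time` output in the example, where 45.6% is (0.68 + 0.05) / 1.60.

```python
import os
import time

def measure(fn):
    """Run fn() and return wall-clock, user CPU, and system CPU seconds."""
    t0 = os.times()
    w0 = time.perf_counter()
    fn()
    w1 = time.perf_counter()
    t1 = os.times()
    elapsed = w1 - w0
    user = t1.user - t0.user
    system = t1.system - t0.system
    share = (user + system) / elapsed if elapsed > 0 else 0.0
    return {"elapsed": elapsed, "user": user, "system": system, "cpu_share": share}

def busy():
    # CPU-bound work: most of the elapsed time is user CPU time.
    sum(i * i for i in range(2_000_000))

def idle():
    # Waiting: elapsed time passes but almost no CPU time is consumed.
    time.sleep(0.2)

for name, fn in (("busy", busy), ("idle", idle)):
    m = measure(fn)
    print(f"{name}: {m['user']:.2f}u {m['system']:.2f}s "
          f"{m['elapsed']:.2f} elapsed {100 * m['cpu_share']:.1f}%")
```

Comparing wall clock time for one run against CPU time for another would not be an apples-to-apples comparison, since the idle case shows how far apart the two can be.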
6
Performance Comparison Terminology
  • "X is n% faster than Y" means $T_Y / T_X = 1 + n/100$, where $T$ is execution time
  • Example: Y takes 15 seconds to complete a task, X takes 10 seconds
    • What % faster is X?
  • Same definitions apply to other metrics, such as throughput
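A small sketch of this terminology, assuming the usual reading that n = 100 × (time of Y / time of X − 1). The function names are illustrative. For rate metrics such as throughput, where higher is better, the ratio is inverted.

```python
def percent_faster(time_x, time_y):
    """Return n such that X is n% faster than Y, given execution times."""
    return 100.0 * (time_y / time_x - 1.0)

def percent_faster_from_rates(rate_x, rate_y):
    """Same definition for rate metrics such as throughput (higher is better)."""
    return 100.0 * (rate_x / rate_y - 1.0)

# The example from the slide: Y takes 15 seconds, X takes 10 seconds.
n = percent_faster(time_x=10.0, time_y=15.0)
print(f"X is {n:.0f}% faster than Y")
```

Note that the answer is 50%, not 33%: the difference is taken relative to the faster machine's time.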

7
Performance Comparison Example
  • Performance metrics
    • Time to run the task (travel time for each passenger)
    • Throughput (person-miles per hour, PMPH)
  • Comparisons
    • Speed of Concorde > 747
    • Throughput of 747 > Concorde
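The numbers behind this comparison did not survive in the transcript. The sketch below uses illustrative speed and capacity figures (assumed here, not taken from the slide) to show how the two metrics can rank the same pair of aircraft in opposite orders.

```python
# Illustrative figures only (assumed, not recovered from the slide).
planes = {
    "Boeing 747": {"speed_mph": 610, "passengers": 470},
    "Concorde": {"speed_mph": 1350, "passengers": 132},
}

def throughput_pmph(plane):
    """Person-miles per hour: passengers carried times cruising speed."""
    return plane["speed_mph"] * plane["passengers"]

for name, p in planes.items():
    print(f"{name}: {p['speed_mph']} mph, {throughput_pmph(p):,} PMPH")

# Concorde wins on speed (response time); the 747 wins on throughput.
speed_ratio = planes["Concorde"]["speed_mph"] / planes["Boeing 747"]["speed_mph"]
throughput_ratio = (throughput_pmph(planes["Boeing 747"])
                    / throughput_pmph(planes["Concorde"]))
print(f"Concorde is {speed_ratio:.2f}x faster")
print(f"747 has {throughput_ratio:.2f}x the throughput")
```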

8
Improving Performance: Fundamentals
  • Suppose we have a machine with two instructions
    • Instruction A executes in 100 cycles
    • Instruction B executes in 2 cycles
  • We want better performance.
    • Which instruction do we improve?

9
Amdahl's Law
  • Performance improvements depend on
    • how good the enhancement is
    • how often it is used
  • Speedup due to enhancement E (fraction p sped up by factor S):
    $\text{Speedup}(E) = \dfrac{1}{(1 - p) + p/S}$
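Amdahl's Law for a fraction p of execution time sped up by a factor S can be sketched as a one-line function. The two cases compared below are hypothetical, chosen to show that how often an enhancement is used can matter more than how good it is.

```python
def amdahl_speedup(p, s):
    """Overall speedup when a fraction p of execution time is sped up by factor s."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if s <= 0:
        raise ValueError("s must be positive")
    return 1.0 / ((1.0 - p) + p / s)

# Hypothetical cases: a large gain on a rare part vs. a modest gain on a common part.
rare_but_big = amdahl_speedup(p=0.05, s=50.0)
common_but_small = amdahl_speedup(p=0.80, s=1.5)
print(f"5% of time sped up 50x:   {rare_but_big:.3f}x")
print(f"80% of time sped up 1.5x: {common_but_small:.3f}x")
```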

10
Amdahl's Law Example
  • FP instructions improved by 2x
  • But only 10% of instructions are FP
  • Speedup bounded by
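Working the numbers through, under the assumption that the 10% of instructions that are FP also account for 10% of execution time (so p = 0.1 and S = 2):

```latex
\[
\text{Speedup} = \frac{1}{(1 - p) + \dfrac{p}{S}}
             = \frac{1}{0.9 + \dfrac{0.1}{2}}
             = \frac{1}{0.95} \approx 1.053
\]
\[
\lim_{S \to \infty} \text{Speedup} = \frac{1}{1 - p} = \frac{1}{0.9} \approx 1.11
\]
```

Doubling FP speed buys about 5% overall, and even infinitely fast FP hardware could not exceed about 11%.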

11
Amdahl's Law Example 2
  • Speed up vectorizable code by 100x
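The slide's numbers beyond the 100x factor are not in the transcript, so the sketch below turns the question around: solving Amdahl's Law for p shows how much of the execution time must be vectorizable to reach a given overall speedup. The target speedups are hypothetical.

```python
def required_fraction(target, s):
    """Fraction p of execution time that must be sped up by factor s
    to reach an overall speedup of `target`, from 1/target = (1 - p) + p/s."""
    if s <= 1.0:
        raise ValueError("s must be greater than 1")
    if not 1.0 <= target <= s:
        # The overall speedup can never exceed s (reached only when p = 1).
        raise ValueError("target must be between 1 and s")
    return (1.0 - 1.0 / target) / (1.0 - 1.0 / s)

S = 100.0  # vector speedup from the slide
for target in (2.0, 10.0, 50.0):
    p = required_fraction(target, S)
    print(f"overall {target:>4.0f}x needs {100 * p:.1f}% of time vectorizable")
```

Even a 10x overall gain requires roughly nine tenths of the run time to be vectorizable, which is why the non-vectorizable remainder dominates in practice.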

12
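Amdahl's Law assumes serial execution
[Figure: timelines of three phases T1, T2, T3. Serial execution: T = T1 + T2 + T3, with the enhanced phase shrinking to T3/n. Overlapped execution of T2 and T3: T = T1 + max(T2, T3).]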
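A short sketch of why the serial assumption matters, using the two timing models above. The phase times and the factor n are hypothetical. When T3 already overlaps with a longer T2, speeding up T3 gives no overall gain, whereas the serial model predicts one.

```python
def serial_time(t1, t2, t3):
    """Phases run one after another: T = T1 + T2 + T3."""
    return t1 + t2 + t3

def overlapped_time(t1, t2, t3):
    """T2 and T3 run concurrently after T1: T = T1 + max(T2, T3)."""
    return t1 + max(t2, t3)

# Hypothetical phase times in seconds, and a speedup factor n applied to T3 only.
t1, t2, t3, n = 2.0, 5.0, 4.0, 4.0

serial_gain = serial_time(t1, t2, t3) / serial_time(t1, t2, t3 / n)
overlap_gain = overlapped_time(t1, t2, t3) / overlapped_time(t1, t2, t3 / n)
print(f"serial speedup:     {serial_gain:.3f}x")
print(f"overlapped speedup: {overlap_gain:.3f}x")
```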
13
Amdahl's Law: Summary Message
  • Make the common case fast
  • Examples
    • All instructions require an instruction fetch; only a fraction require data
      • optimize instruction access first
    • Data locality (spatial, temporal); small memories are faster
      • storage hierarchy: most frequent accesses go to small, local memory

14
CPU Performance Equation
  • 3 components to execution time
  • Factors affecting CPU execution time
  • Consider all three elements when optimizing
  • Workloads change!
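The equation itself did not survive in this transcript. The standard form of the CPU performance equation, given here as an assumption about what the slide showed, splits execution time into three factors:

```latex
\[
\text{CPU time}
  = \frac{\text{seconds}}{\text{program}}
  = \frac{\text{instructions}}{\text{program}}
    \times \frac{\text{cycles}}{\text{instruction}}
    \times \frac{\text{seconds}}{\text{cycle}}
\]
```

The three factors are instruction count, CPI, and clock cycle time (the reciprocal of clock rate). Improving one often worsens another, which is why all three have to be considered together.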

15
Cycles Per Instruction (CPI)
  • Depends on the instruction
  • Average cycles per instruction
  • Example
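The slide's own example is missing from the transcript. As a stand-in, here is a minimal sketch of computing average CPI as a frequency-weighted sum over instruction classes, then plugging it into the CPU time product. The instruction mix, instruction count, and clock rate are all hypothetical.

```python
# Hypothetical instruction mix: (fraction of instructions, cycles per instruction).
mix = {
    "ALU": (0.50, 1),
    "load": (0.20, 5),
    "store": (0.10, 3),
    "branch": (0.20, 2),
}

def average_cpi(mix):
    """Average CPI = sum over classes of (frequency * cycles for that class)."""
    total_freq = sum(freq for freq, _ in mix.values())
    if abs(total_freq - 1.0) > 1e-9:
        raise ValueError("frequencies must sum to 1")
    return sum(freq * cycles for freq, cycles in mix.values())

def cpu_time(instruction_count, cpi, clock_hz):
    """CPU time in seconds = instructions * cycles/instruction / (cycles/second)."""
    return instruction_count * cpi / clock_hz

cpi = average_cpi(mix)
seconds = cpu_time(instruction_count=1_000_000_000, cpi=cpi, clock_hz=500e6)
print(f"average CPI: {cpi:.2f}")
print(f"CPU time:    {seconds:.2f} s")
```

In this mix, loads are only 20% of instructions but contribute 1.0 of the 2.2 average cycles, so they are the common case in terms of time.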

16
Comparing and Summarizing Performance
  • Fair way to summarize performance?
  • Capture in a single number?
  • Example: which of the following machines is best?

17
Means
  • Arithmetic mean
    • Represents total execution time
    • Can be weighted: $a_i T_i$
  • Harmonic mean
    • Works on rates: $R_i = 1/T_i$
  • Geometric mean
    • Consistent independent of reference
    • Best for combining results
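The formulas on this slide were lost in extraction. The sketch below uses the standard definitions of the three means and checks the property noted for the geometric mean: the relative ranking of two machines does not depend on which machine is chosen as the reference. The machine names and execution times are hypothetical.

```python
import math

def arithmetic_mean(times):
    """Proportional to total execution time when programs run equally often."""
    return sum(times) / len(times)

def harmonic_mean(rates):
    """Mean for rates R_i = 1/T_i; equals 1 / (arithmetic mean of the times)."""
    return len(rates) / sum(1.0 / r for r in rates)

def geometric_mean(values):
    """n-th root of the product, computed via logs."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical execution times (seconds) of two programs on three machines.
times = {"A": [1.0, 1000.0], "B": [10.0, 100.0], "C": [20.0, 20.0]}

def normalized_gm(machine, reference):
    """Geometric mean of execution times normalized to a reference machine."""
    pairs = zip(times[machine], times[reference])
    return geometric_mean([t / r for t, r in pairs])

# The ratio of two machines' geometric means does not depend on the reference.
ratio_ref_a = normalized_gm("B", "A") / normalized_gm("C", "A")
ratio_ref_c = normalized_gm("B", "C") / normalized_gm("C", "C")
print(f"arithmetic mean of A's times: {arithmetic_mean(times['A']):.1f} s")
print(f"B vs C, reference A: {ratio_ref_a:.3f}")
print(f"B vs C, reference C: {ratio_ref_c:.3f}")
```

The same consistency does not hold for an arithmetic mean of normalized times, which can rank machines differently depending on the reference.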
18
The Bottom Line: Cost
  • Many contributing factors
    • Wafer cost
    • Chip size
    • Wafer yield (good die/wafer)
    • Testing cost
    • Packaging cost
    • Design/engineering costs
    • Marketing costs
  • Chip cost is primarily a function of chip area
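To illustrate why area dominates, here is a sketch based on a commonly used textbook cost model; the model and every parameter value are assumptions for illustration, not something recovered from the slide. A larger die means fewer dies per wafer and a lower fraction of defect-free dies, so cost per good die grows faster than area.

```python
import math

def dies_per_wafer(wafer_diameter_cm, die_area_cm2):
    """Approximate die count on a round wafer: area term minus an edge-loss term."""
    radius = wafer_diameter_cm / 2.0
    return (math.pi * radius ** 2 / die_area_cm2
            - math.pi * wafer_diameter_cm / math.sqrt(2.0 * die_area_cm2))

def die_yield(die_area_cm2, defects_per_cm2, alpha=3.0, wafer_yield=1.0):
    """Fraction of good dies; falls as die area or defect density rises."""
    return wafer_yield * (1.0 + defects_per_cm2 * die_area_cm2 / alpha) ** (-alpha)

def die_cost(wafer_cost, wafer_diameter_cm, die_area_cm2, defects_per_cm2):
    """Wafer cost spread over the good dies only."""
    good_dies = (dies_per_wafer(wafer_diameter_cm, die_area_cm2)
                 * die_yield(die_area_cm2, defects_per_cm2))
    return wafer_cost / good_dies

# Hypothetical parameters: $1000 wafer, 20 cm diameter, 0.5 defects per cm^2.
for area in (0.5, 1.0, 2.0):
    cost = die_cost(1000.0, 20.0, area, 0.5)
    print(f"die area {area:.1f} cm^2 -> ${cost:.2f} per good die")
```

Testing and packaging costs would be added on top of the die cost; they are omitted here to keep the area effect visible.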

19
Real World Examples
From "Estimating IC Manufacturing Costs," by Linley Gwennap, Microprocessor Report, August 2, 1993.
20
CPU cost is only a part of the picture
21
Summary
  • Evaluation of Systems
    • Performance
      • Amdahl's Law, CPI
    • Cost
  • Next Time
    • Benchmarking
      • SPEC suite and others
    • Pitfalls of performance evaluation