NEW TRENDS IN COMPUTER ARCHITECTURE DESIGN

Slides: 36
Provided by: saeidnoo
Learn more at: http://arthur.sale.tripod.com


1
NEW TRENDS IN COMPUTER ARCHITECTURE DESIGN
Saeid Nooshabadi Arthur Sale University of
Tasmania
2
Outline
  • Desktop/Server Microprocessor State of the Art
  • Current Processor Limits
  • Embedded Processors Market
  • Mobile Multimedia Computing as New Direction
  • Conclusion

3
Computer in the News: Technology Marches On (1)
  • SANTA CLARA, Calif., March 8, 2000 --
  • Intel Corporation today introduced the Intel
    Pentium III processor 1.0 GHz (GigaHertz or
    1,000 MegaHertz), the world's highest performance
    microprocessor for PCs. The Pentium III processor
    at 1 GHz delivers a 15 percent performance gain
    over the fastest processors on the market today.
  • Source: http://www.intel.com

4
Computer in the News: Technology Marches On (2)
  • INTEL DEVELOPER FORUM, Calif., Feb. 15, 2000 --
  • Intel Corporation Chairman Andrew S. Grove
    today kicked off the semi-annual Intel Developer
    Forum by demonstrating the company's fastest
    microprocessor: a chip running at 1.5 GHz, or 1.5
    billion clock cycles per second, at room
    temperature. Based on a new microarchitecture
    from Intel, the chip is code-named "Willamette."
    (To be marketed towards the end of the year.)
  • Source: http://www.intel.com
  • Who needs 1.5 GHz Processor?

5
State of the Art: Alpha 21264
  • 15M transistors
  • 2 x 64KB caches on chip; 16MB L2 cache off chip
  • Clock: <1.7 nsec, or >600 MHz (fastest Cray
    supercomputer, T90: 2.2 nsec)
  • 90 watts
  • Superscalar: fetches up to 6 instructions/clock
    cycle, retires up to 4 instructions/clock cycle
  • Execution: out-of-order

6
Processor Limit: DRAM Gap
7
Processor-Memory Performance Gap Tax (1)
  • Share of each chip spent on caches:

    Processor           % Area (cost)   % Transistors (power)
    Alpha 21164         37%             77%
    StrongARM SA-110    61%             94%
    Pentium Pro         64%             88%

  • Pentium Pro: 2 dies per package (processor with
    I and D caches, plus L2 cache)
  • Caches have no inherent value; they only try to
    close the performance gap
  • Cost = F(Area^4)

8
Processor-Memory Performance Gap Tax (2)
  • Microprocessor-DRAM performance gap:
  • time of a full cache miss, in instructions
    executed
  • 1st Alpha (7000): 340 ns/5.0 ns = 68 clks x 2,
    or 136
  • 2nd Alpha (8400): 266 ns/3.3 ns = 80 clks x 4, or
    320
  • 3rd Alpha (21264): 180 ns/1.7 ns = 108 clks x 6, or
    648
  • 1/2X latency x 3X clock rate x 3X instr/clock =>
    about 5X
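
The arithmetic on this slide can be reproduced with a short Python sketch. The function name and the tuple layout are illustrative choices, not part of the original talk. Exact division gives slightly different clock counts than the slide's rounded figures (for example about 106 clocks rather than 108 for the 21264), so the results should be read as approximations.

```python
def miss_cost_in_instructions(miss_latency_ns: float, cycle_ns: float,
                              issue_width: int) -> float:
    """Instruction slots lost while one full cache miss is serviced."""
    clocks = miss_latency_ns / cycle_ns   # miss latency measured in clock cycles
    return clocks * issue_width           # each lost cycle wastes issue_width slots


# (name, miss latency in ns, cycle time in ns, instructions/clock) from the slide
alphas = [
    ("1st Alpha (7000)", 340.0, 5.0, 2),
    ("2nd Alpha (8400)", 266.0, 3.3, 4),
    ("3rd Alpha (21264)", 180.0, 1.7, 6),
]

for name, latency, cycle, width in alphas:
    cost = miss_cost_in_instructions(latency, cycle, width)
    print(f"{name}: {latency / cycle:.0f} clocks, about {cost:.0f} instructions")

# Growth in miss cost from the first to the third generation (the slide's "about 5X")
growth = (miss_cost_in_instructions(180.0, 1.7, 6)
          / miss_cost_in_instructions(340.0, 5.0, 2))
print(f"growth: {growth:.1f}x")
```

Latency halves across the three generations, yet the cost of a miss measured in lost instructions grows, because clock rate and issue width grow faster.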

9
Today's Situation: Microprocessor

    MIPS MPUs               R5000      R10000         10k/5k
    Clock rate              200 MHz    195 MHz        1.0x
    On-chip caches          32K/32K    32K/32K        1.0x
    Instructions/cycle      1 (+ FP)   4              4.0x
    Pipe stages             5          5-7            1.2x
    Model                   In-order   Out-of-order   ---
    Die size (mm^2)         84         298            3.5x
      without cache, TLB    32         205            6.3x
    Development (man yr.)   60         300            5.0x
    SPECint_base95          5.7        8.8            1.6x
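
One way to read the R5000/R10000 figures above is as performance per unit of silicon and per unit of design effort. The following Python sketch does that division; the dictionary keys and the helper name are hypothetical, and only the numbers come from the slide.

```python
# Figures from the slide: SPECint_base95, die size in mm^2, development man-years
r5000 = {"specint": 5.7, "die_mm2": 84, "man_years": 60}
r10000 = {"specint": 8.8, "die_mm2": 298, "man_years": 300}


def specint_per(chip: dict, resource: str) -> float:
    """SPECint_base95 delivered per unit of the named resource."""
    return chip["specint"] / chip[resource]


for resource in ("die_mm2", "man_years"):
    small = specint_per(r5000, resource)
    large = specint_per(r10000, resource)
    print(f"SPECint per {resource}: R5000 {small:.3f}, R10000 {large:.3f}, "
          f"R5000 advantage {small / large:.1f}x")
```

On these numbers the simpler in-order design delivers more than twice the integer performance per mm^2 and about three times as much per man-year.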

10
Processor Evaluation Metrics
  • SPECint95: suite of integer programs
  • SPECfp95: suite of floating-point programs
  • TPC-C: On-Line Transaction Processing programs
    (OLTP)
  • All state-of-the-art processors perform well on
    SPECint95 and SPECfp95 (scientific and technical
    applications)
  • TPC-C?

11
Processor Limits for TPC-C

    Pentium Pro                                SPECint95   TPC-C
    Multilevel caches: miss rate, 1MB L2       0.5%        5%
    Superscalar (2-3 instr. retired/clock):
      % of clks                                40%         10%
    Out-of-order execution: speedup            2.0X        1.4X
    Clocks per instruction                     0.8         3.4
    % of peak performance                      40%         10%

Source: Bhandarkar, D.; Ding, J. "Performance
characterization of the Pentium Pro processor."
Proc. 3rd Int'l. Symp. on High-Performance
Computer Architecture, Feb. 1997, pp. 288-97.
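
The "peak performance" figures on this slide follow from the clocks-per-instruction row. The Python sketch below assumes a peak of 3 instructions retired per clock, which is the upper end of the slide's "2-3 instr. retired/clock"; that peak value and the function name are assumptions made for illustration.

```python
PEAK_RETIRE_PER_CLOCK = 3  # assumption: upper end of "2-3 instr. retired/clock"


def fraction_of_peak(cpi: float, peak_ipc: float = PEAK_RETIRE_PER_CLOCK) -> float:
    """Achieved instructions per clock (1/CPI) as a fraction of peak."""
    return (1.0 / cpi) / peak_ipc


# Clocks per instruction from the slide
for workload, cpi in (("SPECint95", 0.8), ("TPC-C", 3.4)):
    print(f"{workload}: CPI {cpi} -> {fraction_of_peak(cpi):.0%} of peak")
```

Under that assumption a CPI of 0.8 gives roughly 40% of peak and a CPI of 3.4 gives roughly 10%, in line with the last row of the slide.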
12
Embedded Processor Market
  • Over 97% of the processors fabricated
  • 50% of the revenues from processor sales
  • Embedded devices cover a wide range of products:
  • simple devices such as thermostats and toasters
  • complex and mission-critical applications such as
    avionics systems
  • In between are phones, facsimile machines, ATM
    switches, digital cameras, automotive
    applications, set-top boxes, ...

13
Embedded Processor Design
  • Drives the technology of the post-PC era
  • Embedded processors incorporate capabilities
    traditionally associated with conventional
    CPUs.
  • They are subject to challenging constraints:
  • cost,
  • power consumption,
  • and application-imposed constraints.

14
Intel Embedded Mobile Celeron Processor
  • Available at 600, 566, 533, 500 and 466 MHz.
  • Dynamic Execution technology.
  • Includes Intel MMX media enhancement technology.
  • Intel Streaming SIMD Extensions (available on the
    Intel Celeron Processor at 566 and 600 MHz).
  • 32 Kbyte (16 Kbyte/16 Kbyte) Level 1 cache.
  • 128 Kbyte integrated Level 2 cache.
  • 66 MHz Intel P6 micro-architecture's
    multitransaction system bus.
  • Intel chipset support: Intel 810 chipset, Intel
    810E chipset, Intel 440BX, Intel 440EX and the
    Intel 440ZX-66 AGPset.
  • Power: 17-30 Watts
  • Source: http://www.intel.com

15
Desktop/Server Processors Summary (1)
  • SPEC performance doubling / 18 months
  • Growing CPU-DRAM performance gap tax
  • Running out of ideas, competition? Back to 2X /
    2.3 yrs?
  • Benchmarks: SPECint, SPECfp, TPC (for OLTP)
  • Benchmark at highest optimization, ship at lowest
    optimization?
  • Processor tricks not as useful for transactions?
  • Clock rate increase compensated by CPI increase?
  • When > 100 MIPS on TPC-C?
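
The two doubling periods mentioned on this slide are easier to compare as annual growth rates. A minimal Python sketch of the conversion follows; the function name is illustrative.

```python
def annual_growth(doubling_period_years: float) -> float:
    """Yearly performance multiplier implied by a given doubling period."""
    return 2 ** (1.0 / doubling_period_years)


for label, years in (("2X / 18 months", 1.5), ("2X / 2.3 yrs", 2.3)):
    rate = annual_growth(years) - 1.0
    print(f"{label}: about {rate:.0%} per year")
```

Doubling every 18 months corresponds to roughly 59% improvement per year, while doubling every 2.3 years corresponds to roughly 35% per year.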

16
Desktop/Server Processors Summary (2)
  • Embedded processors promising
  • StrongARM 110: 233 MHz, 268 MIPS, 0.36 W typ.,
    $49
  • 1/10 cost, 1/100 power, 1/2 integer performance?
  • Consolidation of desktop industry? Innovation?
  • Time to look for the computing trends and
    applications of tomorrow?
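
The StrongARM figures above can be turned into efficiency metrics with a few lines of Python. The sketch assumes that the "49" on the slide is a price in US dollars; the variable names are illustrative.

```python
# StrongARM 110 figures from the slide
mips = 268.0
typical_watts = 0.36
price_usd = 49.0  # assumption: the slide's "49" is a dollar price

mips_per_watt = mips / typical_watts
mips_per_dollar = mips / price_usd

print(f"{mips_per_watt:.0f} MIPS/W, {mips_per_dollar:.1f} MIPS/$")
```

These per-watt and per-dollar figures, rather than raw MIPS, are the metrics on which the slide argues embedded processors look promising.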

17
Billion Transistor Architectures and Stationary Computer Metrics
  • Architectures compared: SS, Trace, SMT, CMP,
    IA-64, RAW
  • Metrics: SPEC Int, SPEC FP, TPC (Database),
    SW Effort, Design Scal., Physical Design
    Complexity
  • (See IEEE Computer (9/97), Special Issue on
    Billion Transistor Microprocessors)
  • Very Long Instruction Word (Intel/HP
    IA-64/Merced)
  • multiple ops/instruction, compiler controls
    parallelism
  • Billed as the next-generation Intel/HP processor
  • Renamed Itanium (October 99)

18
Current Computer Design with the Bias for the Past
  • Most Billion Transistor Architectures show high
    physical design complexity
  • Most show impressive performance on the SPEC
    suites of programs
  • Suitability:
  • suitable for high-end traditional applications
  • unsuitable for the pervasive computing environment
    of the future
  • high power budget (>180 Watts),
  • expensive (>$500)
  • Applications of the past are being used to design
    computers of the future

19
Challenge for Future Microprocessors
  • "...wires are not keeping pace with scaling of
    other features. In fact, for CMOS processes
    below 0.25 micron ... an unacceptably small
    percentage of the die will be reachable during a
    single clock cycle."
  • "Architectures that require long-distance, rapid
    interaction will not scale well ..."
  • "Will Physical Scalability Sabotage Performance
    Gains?" Matzke, IEEE Computer (9/97)

20
Computer in the News: Expert Talking
  • "Intel specializes in designing
    microprocessors for the desktop PC, which in five
    years may no longer be the most important type of
    computer. Its successor may be a personal mobile
    computer that integrates the portable computer
    with a cellular phone, digital camera, and video
    game player. Such devices require low-cost,
    energy-efficient microprocessors, and Intel is
    far from a leader in that area."
  • -David Patterson, NY Times, June 9, 1998
  • David Patterson led the design of the Berkeley
    RISC Machine, the first RISC computer. He is also
    the author/co-author of two of the most popular
    textbooks on computer architecture.

21
Post-PC Motivation
  • Next generation fixes problems of last gen.
  • 1960s: batch processing, slow turnaround =>
    Timesharing
  • 15-20 years of performance improvement, cost
    reduction (minicomputers, semiconductor memory)
  • 1980s: time sharing, inconsistent response
    times => Workstations/Personal Computers
  • 15-20 years of performance improvement, cost
    reduction (microprocessors, DRAM memory, disk)
  • 2000s: PCs, difficulty of use/high cost of
    ownership => ???

22
Computing Trends Post-PC Era
  • Multimedia Applications
  • real-time data types: video, speech, animation,
    music
  • 90% of desktop cycles will be spent on media
    applications by end of 2000.
  • Multimedia workloads will continue to grow in
    importance
  • Image, handwriting, and speech recognition will
    pose other major challenges.
  • Pervasive Mobile Computing Devices
  • support an expanding range of functions
  • the challenge is in converging them into a single
    device
  • while keeping the size, weight, and power
    consumption constant.

23
Sony Playstation 2000
  • Emotion Engine: 6.2 GFLOPS, 75 million polygons
    per second (Microprocessor Report, 13:5)
  • Superscalar MIPS core + vector coprocessor +
    graphics/DRAM
  • Claim: Toy Story realism brought to games!

24
Intelligent PDA (2005?)
  • Pilot PDA
  • gameboy, cell phone, radio, timer, camera, TV
    remote, am/fm radio, garage door opener, ...
  • Wireless data (WWW)
  • Speech, vision recog.
  • Voice output for conversations

  • Speech control of all devices
  • Vision to see, scan documents, read bar code, ...,
    measure room
25
Billion Transistor Architectures and Mobile Multimedia Metrics
  • Architectures compared: SS, Trace, SMT, CMP,
    IA-64, RAW
  • Metrics: Design Scal., Energy/power, Code Size,
    Real-time, Cont. Data, Memory BW, Fine-grain
    Par., Coarse-gr. Par.
  • "A New Direction for Computer Architecture
    Research," Kozyrakis, Patterson, IEEE Computer
    (11/98)

26
New Architecture Directions
  • "...media processing will become the dominant
    force in computer arch. & microprocessor design."
  • "... new media-rich applications... involve
    significant real-time processing of continuous
    media streams, and make heavy use of vectors of
    packed 8-, 16-, and 32-bit integer and Fl. Pt."
  • Needs include high memory BW, high network BW,
    continuous media data types, real-time response,
    fine-grain parallelism
  • "How Multimedia Workloads Will Change Processor
    Design," Diefendorff, Dubey, IEEE Computer (9/97)
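
To make "vectors of packed 8-bit integers" concrete, the Python sketch below adds four unsigned 8-bit lanes held in one 32-bit word with a single pass of word-wide operations. It illustrates the general packed-arithmetic idea only and is not a model of any particular instruction set; the function and constant names are hypothetical.

```python
LANE_HIGH_BITS = 0x80808080  # top bit of each 8-bit lane
LANE_LOW_BITS = 0x7F7F7F7F   # lower 7 bits of each 8-bit lane


def packed_add_u8x4(a: int, b: int) -> int:
    """Add four unsigned 8-bit lanes packed in 32-bit words (wraps per lane)."""
    # Adding only the low 7 bits keeps every carry inside its own lane.
    low_sum = (a & LANE_LOW_BITS) + (b & LANE_LOW_BITS)
    # The top bit of each lane is then added without carry, using XOR.
    return low_sum ^ ((a ^ b) & LANE_HIGH_BITS)


def lanewise_add_reference(a: int, b: int) -> int:
    """Same result computed one lane at a time, for comparison."""
    result = 0
    for shift in (0, 8, 16, 24):
        lane = (((a >> shift) & 0xFF) + ((b >> shift) & 0xFF)) & 0xFF
        result |= lane << shift
    return result


pixels = 0x01FF7F80
delta = 0x01010101
print(hex(packed_add_u8x4(pixels, delta)))
```

The packed version processes four pixels per operation, which is the kind of fine-grain parallelism the slide lists among the needs of media workloads.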

27
Some Media-Processing Functions
    Kernel                                Vector length
    Matrix transpose/multiply (3D Gr.)    # vertices at once
    DCT (video, comm.)                    image width
    FFT (audio)                           256-1024
    Motion estimation (video)             image width, i.w./16
    Gamma correction (video)              image width
    Haar transform (media mining)         image width
    Median filter (image process.)        image width

(from http://www.research.ibm.com/people/p/pradeep/tutor.html)
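
As an illustration of one kernel from the list above, here is a minimal Python sketch of gamma correction applied to a whole image row at once, so that the vector length equals the image width. The lookup-table approach, the gamma value of 2.2, and the function names are assumptions chosen for the example, not details from the talk.

```python
def gamma_table(gamma: float) -> bytes:
    """256-entry lookup table mapping each 8-bit pixel to its corrected value."""
    return bytes(round(255 * (i / 255) ** (1.0 / gamma)) for i in range(256))


def gamma_correct_row(row: bytes, table: bytes) -> bytes:
    """Apply the table to every pixel of one image row in a single operation."""
    return row.translate(table)


table = gamma_table(2.2)                     # assumption: a typical display gamma
row = bytes([0, 64, 128, 255])               # one (very short) row of 8-bit pixels
corrected = gamma_correct_row(row, table)
print(list(corrected))
```

Every pixel in the row is independent of the others, which is why the kernel maps naturally onto vector or packed-SIMD hardware.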
28
Challenges for Mobile Multimedia
  • High performance for multimedia functions
  • Energy and power efficiency (<1 Watt)
  • Small size (fit in pocket)
  • Low design complexity and high degree of
    scalability (costs a few tens of $)

29
A Better Mobile Multimedia MPU: Logic + DRAM
  • Embedded DRAM processors: one possibility
  • Faster logic in DRAM process
  • DRAM vendors offer faster transistors + same
    number of metal layers as a good logic process?
    @ 20% higher cost per wafer?
  • Called Intelligent RAM (IRAM) since most of the
    transistors will be DRAM
  • Leave for another presentation
  • "A Case for Intelligent RAM," Patterson, Anderson,
    ... IEEE Computer (3/97)

30
Mobile Multimedia Conclusion
  • 10000X cost-performance increase in stationary
    computers, consolidation of industry => time for
    architecture/OS/compiler researchers to declare
    victory and search for new horizons?
  • Mobile multimedia offers many new challenges:
    energy efficiency, size, real-time performance,
    ...
  • Apps/metrics of the future to design the computer
    of the future!
  • Suppose the PDA replaces the desktop as the primary
    computer?
  • Work on FPPPP on PC vs. speech on PDA?

31
From the Horse's Mouth
  • "Personal mobile computing offers a vision of the
    future with a much richer and more exciting set
    of architecture research challenges than
    extrapolations of the current desktop
    architectures and benchmarks."
  • "Put another way, which problem would you rather
    work on: improving performance of PCs running
    FPPPP (a 1982 Fortran benchmark used in
    SPECfp95) or making speech input practical for
    PDAs?"
  • "A New Direction for Computer Architecture
    Research," Kozyrakis, Patterson, IEEE Computer
    (11/98)

32
References
  • IEEE Computer: Sept. 97, Jan. 98, Aug. 98, Nov.
    98
  • IEEE Micro: Dec. 96, Mar. 97, Sept. 97

33
Acknowledgement
  • Thanks to Dr. Vishv Malhotra for lending me some
    of his IEEE Computer issues.
  • Thanks to Prof. Sale for going through the slides
    and making useful suggestions.
  • WAIT FOR THE NEXT TWO SLIDES

34
Purpose of This Talk
  • To get Staff and Students excited about the new
    opportunities for research.
  • What would you be doing as a graduate?
  • Service Windows NT, and if lucky perhaps UNIX?
  • Develop web pages?
  • Do more of the same?
  • Or rather do something really exciting?
  • We need you if you choose the LATTER!
  • 50 postgraduate scholarships for IT up for grabs

35
Our Vision and Aim
  • Achieve Critical Mass in Research
  • Create a Group of Staff and Students Working on
    the Problems of the Future.
  • Pulling Australian IT Research Community Together
  • Identifying Niches Where We Can Make
    International Contribution.