???? ?????? 0368-2159 Lecture 1 ????? ??? ???????? ?????? ??? ???????: ???? ???? ???? ??-????

Slides: 81
Provided by: BobBro8
1
???? ??????0368-2159Lecture 1????? ???
???????? ?????? ??? ??????? ????
???? ???? ??-????
2
?? ?? ???? ???????
  • ????? - ???????????
  • ?????? ??????
  • ?????????? ??????

3
?? ?? ???? ????
  • Introduction Computer Architecture
  • Administrative Matters
  • History
  • ???????? ????? ??? ?????? ??????? ??????? ?????
  • ??? ?????
  • ???????
  • ??????? ????? ?????
  • ?????????
  • ?????? ??????? ??????? ??????????

4
Computing Devices Then
  • EDSAC, University of Cambridge, UK, 1949

5
Computing Devices Now
Sensor Nets
Cameras
Games
Set-top boxes
Media Players
Laptops
Servers
Robots
Routers
Smart phones
Automobiles
Supercomputers
6
???? ??????, ?? ???
7
(No Transcript)
8
Mother board
9
First Pacemaker, 1957
10
(No Transcript)
11
The paradigm (Patterson)
  • Every Computer Scientist should master the AAA
  • Architecture
  • Algorithms
  • Applications

12
Computer Architecture GOAL
Fast, Effective and Cheap
  • The goal of Computer Architecture
  • To build cost effective systems
  • How do we calculate the cost of a system?
  • How do we evaluate the effectiveness of the system?
  • To optimize the system
  • What are the optimization points?
  • Fact: most computer systems still use the
    von Neumann principle of operation, even though,
    internally, they are very different from the
    computers of that time.

13
Anatomy: 5 components of any computer (since 1946)
Personal Computer
  • Processor
  • Control ("brain")
  • Datapath ("brawn")
  • Memory (where programs and data live when running)
  • Devices
  • Input: keyboard, mouse
  • Output: display, printer
  • Disk (where programs and data live when not running)
14
Computer System Structure
15
The Instruction Set: a Critical Interface
software
instruction set
hardware
16
?? ?? Computer Architecture ?
  • Computer Architecture
  • Instruction Set Architecture
  • Machine Organization
  • ????? ??????????

17
What are Machine Structures?
Software
  • Application (e.g., browser)
  • Compiler, Assembler
  • Operating System (Linux, Win, ..)
Instruction Set Architecture
Hardware
  • Processor, Memory, I/O system
  • Datapath and Control
  • Digital Design
  • Circuit Design
  • Transistors
  • Physics
Coordination of many levels (layers) of abstraction

18
Levels of Representation
High Level Language Program
  • temp = v[k];
  • v[k] = v[k+1];
  • v[k+1] = temp;

Compiler
Assembly Language Program
  • lw $15, 0($2)
  • lw $16, 4($2)
  • sw $16, 0($2)
  • sw $15, 4($2)

Assembler
Machine Language Program
  • 0000 1001 1100 0110 1010 1111 0101 1000
  • 1010 1111 0101 1000 0000 1001 1100 0110
  • 1100 0110 1010 1111 0101 1000 0000 1001
  • 0101 1000 0000 1001 1100 0110 1010 1111

Machine Interpretation
Control Signal Specification
  • ALUOP[0:3] <= InstReg[9:11] & MASK
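The effect of the four load/store instructions on this slide can be traced with a small Python sketch. It is only an illustration under stated assumptions: memory is modelled as a dictionary from byte address to word, register 2 is assumed to hold the address of v[k], words are assumed to be 4 bytes, and the name run_swap is hypothetical.

```python
def run_swap(memory, base):
    """Trace the swap of v[k] and v[k+1]; base plays the role of register 2."""
    regs = {}
    regs[15] = memory[base + 0]   # lw 15, 0(2): load v[k]
    regs[16] = memory[base + 4]   # lw 16, 4(2): load v[k+1]
    memory[base + 0] = regs[16]   # sw 16, 0(2): store old v[k+1] into v[k]
    memory[base + 4] = regs[15]   # sw 15, 4(2): store old v[k] into v[k+1]
    return memory


# Hypothetical memory contents: v[k] = 7 at address 100, v[k+1] = 9 at address 104.
mem = {100: 7, 104: 9}
run_swap(mem, 100)
print(mem)
```

After the call the two words have exchanged places, which is what the three high-level assignments through temp express.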

19
Computer Architecture's Changing Definition
  • 1950s to 1960s Computer Architecture Course:
  • Computer Arithmetic
  • 1970s to mid 1980s Computer Architecture Course:
  • Instruction Set Design, especially ISA
    appropriate for compilers
  • 1990s Computer Architecture Course:
  • Design of CPU, memory system, I/O system,
    Multi-processors, Networks
  • 2000s Computer Architecture Course:
  • Special purpose architectures, Functionally
    reconfigurable, Special considerations for low
    power/mobile processing
  • 2005 to future (?): Multi-processors, Parallelism
  • Synchronization, Speed-up, How to Program ??? !!!

20
Forces on Computer Architecture
  • Technology
  • Programming Languages
  • Applications
  • Operating Systems
  • History
  • Cleverness
21
Computers in the News: Sony Playstation 2000
"The Playstation 3 will deliver nearly 2 teraflops
overall performance," said Ken Kutaragi,
president and group CEO of Sony Computer
Entertainment
  • As reported in Microprocessor Report, Vol 13, No.
    5:
  • Emotion Engine: 6.2 GFLOPS, 75 million polygons
    per second
  • Graphics Synthesizer: 2.4 billion pixels per
    second
  • Claim: Toy Story realism brought to games!

22
(No Transcript)
23
Ray Kurzweil: By 2029, reverse engineer the Human
Brain
http//singules-atarityhub.com/2010/01/25/kurzweil
-discusses-the-future-of-brain-computer-interfac-x
-prize-lab-video/
24
Where are We Going??
???? ??????
?
25
?????? ???? ??????? ????? ??? ??????
26
Course Administration
  • Instructors
  • Nathan Intrator (nin_at_post.tau.ac.il)
  • TA: Kiril Solovey (kirilsolo_at_gmail.com)
  • http://cs.tau.ac.il/nin/Courses/CompStruct/CompStruct.htm
  • http://virtual.tau.ac.il
  • Books
  • V. C. Hamacher, Z. G. Vranesic, S. G.
    Zaky, Computer Organization. McGraw-Hill, 1982
  • H. Taub, Digital Circuits and Microprocessors.
    McGraw-Hill, 1982
  • ?????? ??????? ??????? ??????????? ??????
  • Patterson and Hennessy, Computer Organization
    and Design: The Hardware/Software Interface, Morgan
    Kaufmann, 1998

27
Grading
  • ????
  • ???? ???? 80
  • ??????? 20
  • 6 - 7 ???????

28
Architecture & Microarchitecture Elements
  • Architecture
  • Register data width (8/16/32/64)
  • Instruction set
  • Addressing modes
  • Addressing methods (Segmentation, Paging, etc...)
  • µArchitecture
  • Physical memory size
  • Cache size and structure
  • Number of execution units, number of execution
    pipelines
  • Branch prediction
  • TLB
  • Timing is considered µArch (though it is user
    visible!)
  • Processors with the same arch may have different
    µArch

29
Compatibility
  • Backward compatibility
  • New hardware can run existing software
  • Example: Pentium® 4 can run software originally
    written for Pentium® III, Pentium® II, Pentium®,
    486, 386, 286
  • Forward compatibility
  • New software can run on existing (old) hardware
  • Example: new software written with MMX™ must
    still run on older Pentium processors which do
    not support MMX™
  • Less important than backward compatibility
  • New ideas: architecture independent
  • JIT (just-in-time) compiler: Java and .NET
  • Binary translation

30
How to compare between different systems?
31
Benchmarks: Programs for Evaluating Processor
Performance
  • Toy Benchmarks
  • 10-100 line programs
  • e.g., sieve, puzzle, quicksort
  • Synthetic Benchmarks
  • Attempt to match average frequencies of real
    workloads
  • e.g., Winstone, Dhrystone
  • Real programs
  • e.g., gcc, spice
  • SPEC: System Performance Evaluation Cooperative
  • SPECint (8 integer programs)
  • and SPECfp (10 floating point)
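To give a feel for the "toy benchmark" category, here is a minimal sketch of a sieve of Eratosthenes with a wall-clock timer around it. It is not one of the historical benchmark programs; the function name, the problem size and the use of time.perf_counter are choices made for this illustration.

```python
import time


def sieve(limit):
    """Return all primes <= limit using the sieve of Eratosthenes."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]          # 0 and 1 are not prime
    for n in range(2, int(limit ** 0.5) + 1):
        if is_prime[n]:
            # Cross out every multiple of n, starting from n*n.
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]


start = time.perf_counter()
primes = sieve(100_000)
elapsed = time.perf_counter() - start
print(f"{len(primes)} primes found in {elapsed:.4f} s")
```

The measured time depends on the machine, the interpreter and the problem size, which is one reason such small programs are a weak basis for comparing processors.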

32
CPI: to compare systems with the same instruction
set architecture (ISA)
  • The CPU is synchronous: it works according to a
    clock signal.
  • Clock cycle is measured in nsec ($10^{-9}$ of a
    second).
  • Clock rate (= 1/clock cycle) is measured in MHz
    ($10^{6}$ cycles/second).
  • CPI: cycles per instruction
  • Average cycles per instruction (in a given
    program)
  • IPC (= 1/CPI): instructions per cycle
  • Clock rate is mainly affected by technology, CPI
    by the architecture
  • CPI breakdown: how many cycles (on average) the
    program spends for different causes, e.g., in
    executing, memory, I/O etc.

33
CPI (cont.)
  • $CPI_i$: cycles to execute a given type of
    instruction
  • e.g., $CPI_{add} = 1$, $CPI_{mul} = 3$
  • Independent of the program
  • Calculating the CPI of a program
  • $IC_i$: number of times an instruction of type i was
    executed in the program
  • $IC$: number of instructions executed in the program
  • $F_i$: relative frequency of instructions of
    type i, $F_i = IC_i / IC$
  • $N_{cyc}$: number of cycles required to execute the program
  • $CPI = N_{cyc} / IC = \sum_i CPI_i \cdot IC_i / IC = \sum_i F_i \cdot CPI_i$
  • This calculation does not take into account other
    delays such as memory, I/O
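The CPI calculation described on this slide can be sketched in a few lines of Python. The instruction mix below is hypothetical (only the per-type cycle counts 1 for add and 3 for mul come from the slide), and the function names are made up for this illustration.

```python
def program_cpi(mix):
    """mix maps instruction type -> (IC_i, CPI_i). Returns Ncyc / IC."""
    total_ic = sum(count for count, _ in mix.values())
    total_cycles = sum(count * cpi for count, cpi in mix.values())
    return total_cycles / total_ic


def program_cpi_from_frequencies(mix):
    """Same result, computed as the sum over i of F_i * CPI_i."""
    total_ic = sum(count for count, _ in mix.values())
    return sum((count / total_ic) * cpi for count, cpi in mix.values())


# Hypothetical program: 800 adds (1 cycle each) and 200 multiplies (3 cycles each).
example_mix = {"add": (800, 1), "mul": (200, 3)}
example_cpi = program_cpi(example_mix)
print(example_cpi)
```

Both functions give the same value because dividing the total cycle count by IC is the same as weighting each per-type CPI by its relative frequency.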

34
CPU Time
  • CPU Time
  • The time required by the CPU to execute a given
    program
  • $\text{CPU Time} = \text{clock cycle} \times N_{cyc} = \text{clock cycle} \times CPI \times IC$
  • Our goal: minimize CPU Time
  • Minimize clock cycle: more MHz (process, circuit,
    µArch)
  • Minimize CPI: µArch (e.g., more execution units)
  • Minimize IC: architecture (e.g., MMX™
    technology)
  • Speedup due to enhancement E:
    $Speedup(E) = \text{ExTime without E} / \text{ExTime with E}$
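As a rough numerical sketch of the CPU Time relation (clock cycle times CPI times IC), the snippet below evaluates it for a hypothetical program and then computes the speedup obtained by doubling the clock rate. All numbers and the function name are assumptions chosen for illustration.

```python
def cpu_time_seconds(instruction_count, cpi, clock_rate_hz):
    """CPU Time = clock cycle * CPI * IC, with clock cycle = 1 / clock rate."""
    clock_cycle = 1.0 / clock_rate_hz
    return clock_cycle * cpi * instruction_count


# Hypothetical program: 1,000,000 instructions, CPI 1.4, 500 MHz clock.
baseline = cpu_time_seconds(1_000_000, 1.4, 500e6)

# Same program and CPI on a hypothetical 1000 MHz implementation.
faster = cpu_time_seconds(1_000_000, 1.4, 1000e6)

speedup = baseline / faster   # execution time without / with the enhancement
print(baseline, faster, speedup)
```

In practice a higher clock rate often changes CPI as well (for example through longer memory stalls measured in cycles), so the three factors are not independent.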

35
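Amdahl's Law
Suppose that enhancement E accelerates a fraction
F of the task by a factor S, and the remainder of
the task is unaffected, then
$ExTime_{new} = ExTime_{old} \times \left( (1 - Fraction_{enhanced}) + \frac{Fraction_{enhanced}}{Speedup_{enhanced}} \right)$
$Speedup_{overall} = \frac{ExTime_{old}}{ExTime_{new}} = \frac{1}{(1 - Fraction_{enhanced}) + \frac{Fraction_{enhanced}}{Speedup_{enhanced}}}$
where $Fraction_{enhanced} = F$ and $Speedup_{enhanced} = S$.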
36
Amdahl's Law: Example
  • Floating point instructions improved to run 2X,
    but only 10% of actual instructions are FP

$ExTime_{new} = ExTime_{old} \times (0.9 + 0.1/2) = 0.95 \times ExTime_{old}$
$Speedup_{overall} = 1 / 0.95 \approx 1.053$
Corollary: Make The Common Case Fast
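The floating point example can be checked with a short Python sketch of Amdahl's law. The function name is hypothetical; the inputs (10% of the instructions, 2X faster) are the ones from the slide, and the second call uses an arbitrarily large factor to show the upper bound.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when a fraction of the task is sped up by a given factor."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


# Slide example: FP instructions run 2X faster but are only 10% of the work.
fp_speedup = amdahl_speedup(0.10, 2.0)

# Even an (almost) infinitely fast FP unit cannot beat 1 / (1 - 0.10).
fp_limit = amdahl_speedup(0.10, 1e12)

print(round(fp_speedup, 3), round(fp_limit, 3))
```

The bound is why the corollary says to make the common case fast: the unaffected 90% dominates no matter how much the rare case improves.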
37
Instruction Set Design
The ISA is what the user and the compiler see.
The ISA is what the hardware needs to implement.
38
Why is the ISA important?
  • Code size
  • Long instructions may take more time to be
    fetched
  • Large code requires large memory (important in small
    devices, e.g., cell phones)
  • Number of instructions (IC)
  • Reducing IC reduces execution time (assuming the same
    CPI and frequency)
  • Code simplicity
  • Simple HW implementation, which leads to higher
    frequency and lower power
  • Code optimization can be applied better to
    simple code

39
The impact of the ISA
  • RISC vs CISC

40
CISC Processors
  • CISC: Complex Instruction Set Computer
  • The idea: a high level machine language
  • Characteristics
  • Many instruction types, with many addressing
    modes
  • Some of the instructions are complex
  • Perform complex tasks
  • Require many cycles
  • ALU operations directly on memory
  • Usually uses a limited number of registers
  • Variable length instructions
  • Common instructions get short codes → save code
    length
  • Example: x86

41
CISC Drawbacks
  • Compilers do not take advantage of the complex
    instructions and the complex indexing methods
  • Implementing complex instructions and complex
    addressing modes
  • → complicates the processor
  • → slows down the simple, common instructions
  • → contradicts the corollary of Amdahl's law:
  • Make The Common Case Fast
  • Variable length instructions are a real pain in the
    neck
  • It is difficult to decode several instructions in
    parallel
  • As long as an instruction is not decoded, its length
    is unknown
  • → It is unknown where the instruction ends
  • → It is unknown where the next instruction
    starts
  • An instruction may not fit into the right
    behavior of the memory hierarchy (will be
    discussed in later lectures)
  • Examples: VAX, x86 (!?!)

42
RISC Processors
  • RISC: Reduced Instruction Set Computer
  • The idea: simple instructions enable fast
    hardware
  • Characteristics
  • A small instruction set, with only a few
    instruction formats
  • Simple instructions
  • execute simple tasks
  • require a single cycle (with pipeline)
  • A few indexing methods
  • ALU operations on registers only
  • Memory is accessed using Load and Store
    instructions only.
  • Many orthogonal registers
  • Three-address machine: Add dst, src1, src2
  • Fixed length instructions
  • Examples: MIPS™, SPARC™, Alpha™, PowerPC™

43
RISC Processors (Cont.)
  • Simple architecture → Simple micro-architecture
  • Simple, small and fast control logic
  • Simpler to design and validate
  • Room for on-die caches: instruction cache + data
    cache
  • Parallelize data and instruction access
  • Shorten time-to-market
  • Using a smart compiler
  • Better pipeline usage
  • Better register allocation
  • Existing RISC processors are not pure RISC
  • e.g., support division which takes many cycles

44
RISC and Amdahl's Law (Example)
  • In comparison to the CISC architecture
  • 10% of the static code, which executes 90% of the
    dynamic code, has the same CPI
  • 90% of the static code, which is only 10% of the
    dynamic code, increases by 60%
  • The number of instructions being executed
    increases by 50%
  • The speed of the processor is doubled
  • This was true for the time the RISC processors
    were invented
  • We get
  • And then
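The formulas for this slide did not survive in the transcript, so the following Python sketch is one possible reading, not the lecturer's derivation. It assumes that the 60 figure is a 60% increase in the CPI of the rarely executed code, that the 50 figure is a 50% increase in instruction count, and that "speed doubled" means the clock cycle is halved. The function name is hypothetical.

```python
def relative_cpu_time(ic_ratio, cpi_ratio, cycle_ratio):
    """New CPU time divided by old CPU time, from CPU Time = cycle * CPI * IC."""
    return ic_ratio * cpi_ratio * cycle_ratio


# Assumed reading: 90% of dynamic instructions keep their CPI,
# the other 10% see their CPI grow by 60%.
cpi_ratio = 0.9 * 1.0 + 0.1 * 1.6

# Assumed reading: 50% more instructions executed, clock cycle halved.
time_ratio = relative_cpu_time(ic_ratio=1.5, cpi_ratio=cpi_ratio, cycle_ratio=0.5)

risc_speedup = 1.0 / time_ratio
print(round(cpi_ratio, 2), round(time_ratio, 3), round(risc_speedup, 2))
```

Under these assumptions the faster clock more than compensates for the extra instructions and the slightly higher average CPI.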

45
So, what is better, RISC or CISC?
  • Today CISC architectures (x86) run as
    fast as RISC (or even faster)
  • The main reasons are
  • The processor translates CISC instructions into
    RISC-like instructions (ucode)
  • CISC architectures use a RISC-like engine
  • We will discuss this kind of solution later on
    in this course.

46
Technology Trends: Microprocessor Complexity
  • Itanium 2: 410 million transistors
  • Athlon (K7): 22 million
  • Alpha 21264: 15 million
  • Alpha 21164: 9.3 million
  • PowerPC 620: 6.9 million
  • Pentium Pro: 5.5 million
  • Sparc Ultra: 5.2 million
  • Moore's Law: 2X transistors/chip every 1.5 years
47
(No Transcript)
48
(No Transcript)
49
Technology Trends: Processor Performance
[Figure: performance measure vs. year, growing 1.54X/yr; Intel P4 2000 MHz (Fall 2001) marked]
50
Technology Trends: Memory Capacity (Single-Chip
DRAM)
year  size (Mbit)
1980  0.0625
1983  0.25
1986  1
1989  4
1992  16
1996  64
1998  128
2000  256
2002  512
  • Now 1.4X/yr, or 2X every 2 years.
  • 8000X since 1980!
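The growth figures quoted for single-chip DRAM can be reproduced from the year/size data on this slide. The sketch below is only an illustration: the list and function names are hypothetical, and the choice of 1996 to 2002 as the "recent" window is an assumption.

```python
# (year, size in Mbit) pairs as listed on the slide.
dram_mbit = [
    (1980, 0.0625), (1983, 0.25), (1986, 1), (1989, 4), (1992, 16),
    (1996, 64), (1998, 128), (2000, 256), (2002, 512),
]


def annual_growth(first, last):
    """Compound growth factor per year between two (year, size) points."""
    (year0, size0), (year1, size1) = first, last
    return (size1 / size0) ** (1.0 / (year1 - year0))


total_growth = dram_mbit[-1][1] / dram_mbit[0][1]          # 1980 -> 2002
overall_rate = annual_growth(dram_mbit[0], dram_mbit[-1])  # whole period
recent_rate = annual_growth(dram_mbit[5], dram_mbit[-1])   # 1996 -> 2002

print(total_growth, round(overall_rate, 2), round(recent_rate, 2))
```

The total factor is 8192, which the slide rounds to 8000X, and the 1996 to 2002 window gives roughly 1.4X per year, i.e. a doubling every 2 years.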

51
Technology Trends Imply Dramatic Change
  • Processor
  • Logic capacity: about 30% per year
  • Clock rate: about 20% per year
  • Memory
  • DRAM capacity: about 60% per year (4x every 3
    years)
  • Memory speed: about 10% per year
  • Cost per bit: improves about 25% per year
  • Disk
  • Capacity: about 60% per year
  • Total data use: 100% per 9 months!
  • Network Bandwidth
  • Bandwidth increasing more than 100% per year!
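Reading the rates on this slide as percentages per year (the "4x every 3 years" note is consistent with 60% per year), a short Python sketch shows how an annual rate compounds and how long a doubling takes. The function names are hypothetical.

```python
import math


def growth_over(annual_rate_percent, years):
    """Total growth factor after compounding an annual rate for some years."""
    return (1.0 + annual_rate_percent / 100.0) ** years


def doubling_time_years(annual_rate_percent):
    """Years needed to double at a given compound annual growth rate."""
    return math.log(2) / math.log(1.0 + annual_rate_percent / 100.0)


dram_three_years = growth_over(60, 3)        # DRAM capacity at 60% per year
clock_doubling = doubling_time_years(20)     # clock rate at 20% per year
memory_doubling = doubling_time_years(10)    # memory speed at 10% per year

print(round(dram_three_years, 2), round(clock_doubling, 1), round(memory_doubling, 1))
```

The difference between the doubling times for clock rate and memory speed is one way to see why the gap between processor and memory performance keeps widening.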

52
1980-2003, CPU-DRAM Speed Gap
[Figure: Performance (1/latency), log scale from 10 to 10000, vs. Year (1980-2005), with separate CPU and DRAM curves]
Q. How do architects address this gap?
A. Put smaller, faster cache memories between
CPU and DRAM.
53
Dimensions
2001 devices (0.18 µm)
Chip size (1 cm)
Diameter of Human Hair (25 µm)
1996 devices (0.35 µm)
2007 devices (0.01 µm)
Silicon atom radius (1.17 Å)
Deep UV Wavelength (0.248 µm)
X-ray Wavelength (0.6 nm)
Demo
54
?????????? ?????? ????? ?????
  • ???? ?????? / ????? ???? non issue.
  • ???? Power Wall ???? ???. ??????????? ??
    ?????.
  • ???? ??????? ??????? ?"? ?????? ???? ??????
    ??????, ?????????? ?????, ???????????? CPU ????
    (pipelining, superscalar, out-of-order execution,
    speculations)
  • ???? ILP Wall ?????? ????? ?????? ??????? ??
    ?????.
  • ???? ??? ????, ???? ??????? ?????.
  • ???? Memory Wall ??? ???? ????? ???????
    ??????.
  • (200 ?????? ???? ?DRAM 4 ???????
    ????)
  • ???? ?????? ???? ???? X 2 ?? 1.5 ????.
  • ???? ?? ??"? ???? X 2 ?? 5 ??????
  • ??? X 2 ?????? (????? Cores) ?? ??????.
    ???? 4 ?? 40 ????? ?????

55
Physics / Transistors History
  • 1906: Audion (triode), Lee De Forest
  • 1947: First point-contact transistor (germanium),
    John Bardeen and Walter Brattain, Bell
    Laboratories
56
History
  • 1958: First integrated circuit (germanium), Jack
    S. Kilby, Texas Instruments. Contained five
    components of three types: transistors, resistors
    and capacitors
  • 1997: Intel Pentium II. Clock: 233 MHz, number of
    transistors: 7.5 M, gate length: 0.35 µm
57
Annual Sales
  • $10^{18}$ transistors manufactured in 2003 alone
  • 100 million for every human on the planet

58
(No Transcript)
59
(No Transcript)
60
(No Transcript)
61
Integrated Circuits (2003 state-of-the-art)
  • Primarily crystalline silicon
  • 1 mm - 25 mm on a side
  • 2003 feature size: 0.13 µm = 0.13 x $10^{-6}$ m
  • 100 - 400M transistors
  • (25 - 100M "logic gates")
  • 3 - 10 conductive layers
  • CMOS (complementary metal oxide semiconductor)
    - most common.
Bare Die
Chip in Package
  • Package provides
  • spreading of chip-level signal paths to
    board-level
  • heat dissipation.
  • Ceramic or plastic with gold wires.

62
Printed Circuit Boards
  • fiberglass or ceramic
  • 1-20 conductive layers
  • 1-20in on a side
  • IC packages are soldered down.

63
nMOS Transistor
  • Four terminals: gate, source, drain, body
  • Gate - oxide - body stack looks like a capacitor
  • Gate and body are conductors
  • SiO2 (oxide) is a very good insulator
  • Called metal - oxide - semiconductor (MOS)
    capacitor
  • Even though the gate is no longer made of metal

Off
On
64
nMOS Operation
  • Body is commonly tied to ground (0 V)
  • When the gate is at a low voltage
  • P-type body is at low voltage
  • Source-body and drain-body diodes are OFF
  • No current flows, transistor is OFF

Off
65
nMOS Operation Cont.
  • When the gate is at a high voltage
  • Positive charge on gate of MOS capacitor
  • Negative charge attracted to body
  • Inverts a channel under gate to n-type
  • Now current can flow through n-type silicon from
    source through channel to drain, transistor is ON

On
66
pMOS Transistor
  • Similar, but doping and voltages reversed
  • Body tied to high voltage (VDD)
  • Gate low: transistor ON
  • Gate high: transistor OFF
  • Bubble indicates inverted behavior

67
(No Transcript)
68
Example: Inverter
69
Example: NAND3
  • Horizontal n-diffusion and p-diffusion strips
  • Vertical polysilicon gates
  • Metal1 VDD rail at top
  • Metal1 GND rail at bottom
  • 32 λ by 40 λ

70
(No Transcript)
71
(No Transcript)
72
CMOS Inverter
A Y
0
1
73
CMOS Inverter
A Y
0
1 0
74
CMOS Inverter
A Y
0 1
1 0
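The inverter truth table follows from treating each transistor as a switch: an nMOS conducts when its gate is high, a pMOS when its gate is low. The Python sketch below is a switch-level illustration only (no voltages or timing), and the function names are hypothetical. It also covers a three-input NAND built the same way.

```python
def nmos_conducts(gate):
    """nMOS switch: ON when the gate is high (1)."""
    return gate == 1


def pmos_conducts(gate):
    """pMOS switch: ON when the gate is low (0)."""
    return gate == 0


def cmos_inverter(a):
    pull_up = pmos_conducts(a)      # pMOS connects Y to VDD
    pull_down = nmos_conducts(a)    # nMOS connects Y to GND
    assert pull_up != pull_down     # exactly one network conducts
    return 1 if pull_up else 0


def cmos_nand3(a, b, c):
    # Three nMOS in series pull Y low only when all inputs are high.
    pull_down = nmos_conducts(a) and nmos_conducts(b) and nmos_conducts(c)
    # Three pMOS in parallel pull Y high when any input is low.
    pull_up = pmos_conducts(a) or pmos_conducts(b) or pmos_conducts(c)
    assert pull_up != pull_down
    return 1 if pull_up else 0


inverter_table = [(a, cmos_inverter(a)) for a in (0, 1)]
print(inverter_table)
```

In both gates the pull-up and pull-down networks are complementary, so the output is always driven to exactly one rail.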
75
(No Transcript)
76
(No Transcript)
77
Multiplexers
  • 2:1 multiplexer chooses between two inputs

S D1 D0 Y
0 X 0
0 X 1
1 0 X
1 1 X
78
Multiplexers
  • 2:1 multiplexer chooses between two inputs

S D1 D0 Y
0 X 0 0
0 X 1 1
1 0 X 0
1 1 X 1
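The multiplexer truth table can be written as a one-line selection or as the equivalent AND/OR expression. This is a behavioural sketch in Python with hypothetical function names, not a description of any particular circuit implementation.

```python
def mux2(s, d1, d0):
    """2:1 multiplexer: Y = D1 when S is 1, otherwise Y = D0."""
    return d1 if s else d0


def mux2_gates(s, d1, d0):
    """Same function as a sum of products: Y = S*D1 + (not S)*D0."""
    return (s & d1) | ((1 - s) & d0)


# Print the full truth table; the unselected input does not affect Y.
for s in (0, 1):
    for d1 in (0, 1):
        for d0 in (0, 1):
            print(s, d1, d0, mux2(s, d1, d0))
```

The rows marked X in the table are the "don't care" inputs: whichever data input is not selected has no effect on the output.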
79
Transmission Gate Mux
  • Nonrestoring mux uses two transmission gates
  • Only 4 transistors

80
out
81
?? ????? ????
  • Computer Architecture integrates several levels,
    from programming languages to logic design.
  • Instruction Set Architecture (ISA)
  • Amdahl's law
  • Moore's law
  • Processor (CPU) - Memory speed gap
  • History
  • Transistors: what, and how.
  • From transistors to logic design