Internal Memory (PowerPoint PPT presentation)

Transcript and Presenter's Notes
1
William Stallings Computer Organization and
Architecture
Chapter 4 & 5: Cache Memory and Internal Memory
2
Computer Components: Top Level View
[Figure: top-level view of computer components (registers shown)]
3
Memory
  • How much? As much as possible
  • How fast? As fast as possible
  • How expensive? As cheap as possible
  • Fast memory is expensive
  • Large memory is expensive
  • The larger the memory, the slower the access

4
Memory Hierarchy
  • CPU Registers
  • L1 cache (on chip)
  • L2 cache (on board)
  • Main memory
  • Disk cache
  • Disk
  • Optical
  • Tape

Going down the hierarchy: access time and size increase,
access frequency and cost per bit decrease.
5
Characteristics
  • Location
  • Capacity
  • Unit of transfer
  • Access method
  • Performance
  • Physical type
  • Physical characteristics
  • Organisation

6
Location
  • CPU
  • Registers
  • Internal: accessed directly by the CPU
  • Cache
  • RAM
  • External: accessed through an I/O module
  • Disks
  • CD-ROM, ...

7
Capacity
  • Word size
  • The natural unit of organisation
  • Usually equal to the number of bits used
    for representing numbers or instructions
  • Typical word sizes: 8 bits, 16 bits, 32 bits
  • Number of words (or Bytes)
  • 1 Byte = 8 bits = 2^3 bits
  • 1 KByte = 2^10 Bytes = 2^10 x 2^3 bits = 1024
    Bytes (Kilo)
  • 1 MByte = 2^10 KBytes = 1024 KBytes (Mega)
  • 1 GByte = 2^10 MBytes = 2^30 Bytes (Giga)
  • 1 TByte = 2^10 GBytes = 1024 GBytes (Tera)

8
Unit of Transfer
  • Number of bits that can be read/written at the same
    time
  • Internal
  • Usually governed by the data bus width
  • The bus width may be equal to the word size or (often)
    larger
  • Typical bus widths: 64, 128, 256 bits
  • External
  • Usually a block, which is much larger than a word
  • A related concept: the addressable unit
  • The smallest location which can be uniquely addressed
  • Word internally
  • Cluster on magnetic disks

9
Access Methods (1)
  • Sequential
  • Start at the beginning and read through in order
  • Access time depends on location of data and
    previous location
  • e.g. tape
  • Direct
  • Individual blocks have unique address
  • Access is by jumping to vicinity plus sequential
    search
  • Access time depends on location and previous
    location
  • e.g. disk

10
Access Methods (2)
  • Random
  • Individual addresses identify locations exactly
  • Access time is independent of location or
    previous access
  • e.g. RAM
  • Associative
  • Data is located by a comparison with contents of
    a portion of the store
  • Access time is independent of location or
    previous access
  • e.g. cache

11
Performance
  • Access time
  • Time between presenting the address and getting
    the valid data
  • Memory Cycle time
  • Time may be required for the memory to recover
    before next access
  • Cycle time = access time + recovery time
  • Transfer Rate
  • Rate at which data can be moved
  • T_N = T_A + N/R

where N = number of bits, T_A = access time, T_N = time
needed to read N bits, R = transfer rate (bits per second)
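As an illustrative worked example (the numbers are assumed, not from the
slides): with T_A = 100 ns and R = 500 Mbit/s, reading a block of N = 512
bits takes T_N = 100 ns + 512 / (500 x 10^6) s = 100 ns + 1024 ns = 1124 ns.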
12
Physical Types
  • Semiconductor
  • RAM, ROM, EPROM, Cache
  • Magnetic
  • Disk & Tape
  • Optical
  • CD & DVD
  • Others

13
Semiconductor Memory
  • RAM (Random Access Memory)
  • Misnamed, as all semiconductor memory is random access
  • Read/Write
  • Volatile
  • Temporary storage
  • Static or dynamic
  • ROM (Read only memory)
  • Permanent storage
  • Read only

14
Dynamic RAM
  • Bits stored as charge in capacitors
  • Charges leak
  • Need refreshing even when powered
  • Simpler construction
  • Smaller per bit
  • Less expensive
  • Need refresh circuits
  • Slower
  • Main memory (static RAM would be too expensive)

15
Static RAM
  • Bits stored as on/off switches
  • No charges to leak
  • No refreshing needed when powered
  • More complex construction
  • Larger per bit
  • More expensive
  • Does not need refresh circuits
  • Faster
  • Cache (here the faster the better)

16
Read Only Memory (ROM)
  • Permanent storage
  • Microprogramming (see later)
  • Library subroutines
  • Systems programs (BIOS)
  • Function tables

17
Types of ROM
  • Written during manufacture
  • Very expensive for small runs
  • Programmable (once)
  • PROM
  • Needs special equipment to program
  • Read mostly
  • Erasable Programmable (EPROM)
  • Erased by UV light (it can take up to 20 minutes)
  • Electrically Erasable (EEPROM)
  • Takes much longer to write than read
  • a single byte can be erased
  • Flash memory
  • Erase memory electrically block-at-a-time

18
Physical Characteristics
  • Decay (refresh time)
  • Volatility (needs power source)
  • Erasable
  • Power consumption

19
Organisation
  • Physical arrangement of bits into words
  • Not always obvious
  • e.g. interleaved

20
Basic Organization (1)
  • Basic element: the memory cell
  • has 2 stable states, one representing 0 and the other 1
  • can be written (at least once)
  • can be read

[Figure: a memory cell with select and read/write control lines; a write
drives the input data line into the cell, a read senses the cell onto the
output data line]
21
Basic Organization (2)
  • Basic organization of a 512x512-bit chip

[Figure: 512x512 array of memory cells with timing and control logic; 9 row
address lines (A0-A8) feed the row address decoder and 9 column address
lines (A9-A17) feed the column address decoder; a sense amplifier and I/O
gate connects the array to the single data line D0]
22
Module Organisation
  • Basic organization of a 256-KByte module
  • 8 chips of 512x512 bits (each chip provides one bit of
    every byte)
  • For a 1-MByte module, replicate this organization 4
    times

23
Module Organisation (1 MByte)
24
Organisation for larger sizes
  • The larger the size the higher the number of
    address pins
  • For 2^k words, k address pins are needed
  • A solution to reduce the number of address pins
  • Multiplex the row address and the column address
  • k/2 pins to address 2^k Bytes (the row and column
    halves of the address are sent one after the other)
  • Adding one more pin doubles both the row and the column
    range, so capacity grows x4
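For instance, in the 16-Mbit DRAM of the next slide (organized as 4M x 4):
its 4M = 2^22 locations would need 22 address pins, but with row/column
multiplexing 11 pins carry an 11-bit row address followed by an 11-bit
column address; one extra pin gives 12 + 12 address bits, i.e. 4 times the
capacity.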

25
Typical 16 Mb DRAM (4M x 4)
26
Refreshing (Dynamic RAM)
  • Refresh circuit included on chip
  • Disable chip
  • Count through rows
  • Read & Write back
  • Takes time
  • Slows down apparent performance

27
Packaging
28
Error Correction
  • Hard Failure
  • Permanent defect
  • Soft Error
  • Random, non-destructive
  • No permanent damage to memory
  • Detected using a Hamming error-correcting code
  • able to detect and correct 1-bit errors

29
Error Correcting Code Function
30
A simple example of correction (1)
  • Correcting errors in 4-bit words
  • 3 control groups (A, B, C)
  • In each control group add 1 parity bit

[Figure: three overlapping circles A, B, C; the four data bits are placed in
the intersections and each circle's parity bit is set so that the circle
contains an even number of 1s]
31
A simple example of correction (2)
  • One of the bits changes value
  • Using the control (parity) bits, the right value is
    restored

[Figure: the same three-circle diagram with one data bit flipped; the
circles whose parity is now odd identify the flipped bit, which is then
restored]
32
Compare Circuit
  • it takes two K-bit binary strings X and Y as input
  • X = X_K ... X_1
  • Y = Y_K ... Y_1
  • it returns a K-bit binary string Z (the syndrome)
  • Z = Z_K ... Z_1
  • Z_i = X_i XOR Y_i for each i = 1, ..., K
  • Z = 00...0 means no error

33
Relation between M and K
  • Z may assume 2^K values
  • the value Z = 00...0 means no error
  • the error may be in any one of the M+K bits
  • so it must be

2^K - 1 >= M + K

Data bits (M)   Control bits (K)   Additional memory (%)
4               3                  75
8               4                  50
16              5                  31.25
32              6                  18.75
64              7                  10.94
128             8                  6.25
256             9                  3.52
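A minimal sketch in C (not from the slides) that, for a given number of
data bits M, finds the smallest number of control bits K satisfying
2^K - 1 >= M + K; it reproduces the table above.

#include <stdio.h>

/* Smallest K such that 2^K - 1 >= M + K (single-error-correcting code). */
static int control_bits(int m)
{
    int k = 1;
    while (((1 << k) - 1) < m + k)
        k++;
    return k;
}

int main(void)
{
    printf("data bits  control bits  overhead %%\n");
    for (int m = 4; m <= 256; m *= 2) {
        int k = control_bits(m);
        printf("%9d  %12d  %9.2f\n", m, k, 100.0 * k / m);
    }
    return 0;
}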
34
How to arrange the M+K bits
  • the M+K bits are arranged so that
  • if Z ≠ 0, the error occurred in the i-th bit, where i
    is the value (in binary) of Z

35
The case M = 4

bit position      7    6    5    4    3    2    1
position number   111  110  101  100  011  010  001
data bits         D4   D3   D2        D1
control bits                     C4        C2   C1

C1 = D1 ⊕ D2 ⊕ D4
C2 = D1 ⊕ D3 ⊕ D4
C4 = D2 ⊕ D3 ⊕ D4

[Figure: Venn diagram placing D1-D4 and the parity bits C1, C2, C4 in the
three overlapping control groups]
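A minimal sketch in C of this M = 4 case (the bit layout follows the table
above; the function names are illustrative): it encodes a 4-bit word, flips
one bit, and uses the syndrome produced by the compare circuit to locate
and correct the error.

#include <stdio.h>

/* Encode 4 data bits (D1..D4 = bits 0..3 of 'data') into a 7-bit word.
   Bit i of the result is position i+1 of the table above. */
static unsigned encode(unsigned data)
{
    unsigned d1 = (data >> 0) & 1, d2 = (data >> 1) & 1;
    unsigned d3 = (data >> 2) & 1, d4 = (data >> 3) & 1;
    unsigned c1 = d1 ^ d2 ^ d4;                /* C1 = D1 xor D2 xor D4 */
    unsigned c2 = d1 ^ d3 ^ d4;                /* C2 = D1 xor D3 xor D4 */
    unsigned c4 = d2 ^ d3 ^ d4;                /* C4 = D2 xor D3 xor D4 */
    return c1 | (c2 << 1) | (d1 << 2) | (c4 << 3) |
           (d2 << 4) | (d3 << 5) | (d4 << 6);  /* positions 1..7        */
}

/* The syndrome is the XOR of the positions of all 1 bits; it is 0 for a
   valid word and equals the position of the flipped bit otherwise. */
static unsigned correct(unsigned word)
{
    unsigned syndrome = 0;
    for (unsigned pos = 1; pos <= 7; pos++)
        if ((word >> (pos - 1)) & 1)
            syndrome ^= pos;
    if (syndrome != 0)
        word ^= 1u << (syndrome - 1);          /* flip the erroneous bit */
    return word;
}

int main(void)
{
    unsigned sent     = encode(0xB);           /* D4 D3 D2 D1 = 1 0 1 1  */
    unsigned received = sent ^ (1u << 4);      /* error at position 5    */
    printf("sent=%02X received=%02X corrected=%02X\n",
           sent, received, correct(received));
    return 0;
}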
36
Exercise
  • Design a Hamming error correcting code for
    8-bit words
  • See the textbook for the solution

37
Cache
  • Small amount of fast memory
  • Sits between normal main memory and CPU
  • May be located on CPU chip or module

38
Cache operation - overview
  • CPU requests contents of memory location
  • Check cache for this data
  • If present (hit), get from cache (fast)
  • If not present (miss), read required block from
    main memory to cache
  • Then deliver from cache to CPU
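A minimal sketch in C of this read flow (a tiny direct-mapped cache over a
stand-in main memory; all names and sizes are illustrative): on a miss the
whole block is loaded into the cache, and the word is then delivered from
there.

#include <stdio.h>
#include <string.h>

enum { BLOCK_SIZE = 4, NUM_LINES = 8, MEM_SIZE = 256 };

static unsigned char memory[MEM_SIZE];          /* stand-in for main memory */

static struct line {
    int valid;
    unsigned block_no;
    unsigned char data[BLOCK_SIZE];
} cache[NUM_LINES];                              /* direct-mapped for simplicity */

/* Read one byte: check the cache first, load the whole block on a miss. */
static unsigned char cpu_read(unsigned addr, int *hit)
{
    unsigned block_no = addr / BLOCK_SIZE;
    unsigned offset   = addr % BLOCK_SIZE;
    struct line *l    = &cache[block_no % NUM_LINES];

    *hit = l->valid && l->block_no == block_no;
    if (!*hit) {                                           /* miss: fill the line */
        memcpy(l->data, &memory[block_no * BLOCK_SIZE], BLOCK_SIZE);
        l->block_no = block_no;
        l->valid = 1;
    }
    return l->data[offset];                                /* deliver from cache */
}

int main(void)
{
    int hit;
    for (unsigned i = 0; i < MEM_SIZE; i++)
        memory[i] = (unsigned char)i;

    cpu_read(0x42, &hit); printf("first access:  %s\n", hit ? "hit" : "miss");
    cpu_read(0x43, &hit); printf("second access: %s\n", hit ? "hit" : "miss");
    return 0;
}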

39
Cache Performance
  • Cache access time: t = 1
  • Memory access time: T = 10
  • Hit ratio (hit probability): H
  • Average access time:
    T_avg = H t + (1 - H)(T + t) = t + (1 - H) T

[Figure: average access time plotted against the hit ratio H, falling from
t + T at H = 0 to t at H = 1]
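As an illustrative check (the hit ratio is assumed): with t = 1, T = 10 and
H = 0.95, T_avg = 1 + 0.05 x 10 = 1.5, close to the cache speed; with
H = 0.5 it would be 1 + 0.5 x 10 = 6, much closer to main-memory speed.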
40
Locality of Reference (Denning68)
  • Spatial Locality
  • Memory cells physically close to those just accessed
    tend to be accessed next
  • Temporal Locality
  • During the execution of a program, accesses to the same
    memory cells tend to be close together in time
  • e.g. loops, arrays

41
An example
address  instruction
200      ...
201      ...
202      SUB X, Y
203      BRZ 211            (conditional branch)
...
210      BRA 202            (unconditional branch)
211      ...
...
225      BRE R1, R2, 235    (conditional branch)
...
235      ...
42
Typical Cache Organization
43
Cache Design
  • Size
  • Mapping Function
  • Replacement Algorithm
  • Write Policy
  • Block Size
  • Number of Caches

44
Size does matter
  • Cost
  • More cache is expensive
  • Speed
  • More cache is faster (up to a point)
  • Checking cache for data takes time

45
Cache-memory mapping
  • Main memory contains M = 2^n / K blocks (2^n words,
    blocks of K words)
  • The cache has C lines, with C << M
  • Each block is mapped to a cache line

46
A simple example of Direct Mapping
[Figure: a 32-word main memory (addresses 00000 to 11111) divided into 16
two-word blocks, and a 4-line cache; the address splits into tag (s-r),
line (r) and word (w) fields]
  • Block 0 -> Line 0
  • Block 1 -> Line 1
  • Block 2 -> Line 2
  • Block 3 -> Line 3
  • Block 4 -> Line 0
  • ...
  • Block 15 -> Line 3
47
Direct Mapping (1)
  • Each block of main memory is mapped to a specific
    cache line
  • i.e. if a block is in cache, it must be in one
    specific place
  • In a cache of C lines, block j is stored into
    line i, where i = j mod C
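For example (numbers illustrative): with C = 4 lines, block 13 maps to line
13 mod 4 = 1, and blocks 1, 5, 9, 13, ... all compete for that same line.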

48
Direct Mapping (2)
  • Address is in two parts
  • w Least Significant Bits (LSB) identify unique
    word
  • s Most Significant Bits (MSB) specify one memory
    block
  • The s MSBs are split into
  • a cache line field of r bits (least significant)
  • a tag of s-r bits (most significant)

49
Direct Mapping Summarizing
  • address length: n = s + w bits
  • number of addressable units (words): 2^(s+w)
  • block size = cache line size = 2^w words
  • number of memory blocks: 2^(s+w) / 2^w = 2^s
  • number of cache lines: C = 2^r
  • tag length: s - r bits

50
Cache Line Mapping Table
  • Cache line      Main memory blocks held
  • 0                0, C, 2C, ..., 2^s - C
  • 1                1, C+1, 2C+1, ..., 2^s - C + 1
  • ...
  • C-1              C-1, 2C-1, 3C-1, ..., 2^s - 1

51
Mapping Function
  • Word size: 1 Byte
  • Cache of 64 KBytes (2^16 Bytes)
  • Cache blocks of 4 Bytes
  • 64 KB / 4 = 16K (2^14) lines of 4 Bytes
  • Main memory of 16 MBytes (2^24 Bytes)
  • 2^24 / 4 = 4M (2^22) blocks in main memory
  • Map 2^22 blocks onto 2^14 cache lines

52
Direct MappingAddress Structure
Tag (s-r): 8 bits | Line (slot) r: 14 bits | Word w: 2 bits

  • 24-bit address: 16 MBytes (2^24) of main memory
  • 2-bit word identifier (4-Byte blocks)
  • Cache: 64 KB / 4 = 16K (2^14) lines of 4 Bytes
  • 22-bit block identifier
  • 8-bit tag (22 - 14)
  • 14-bit slot or line
  • No two blocks mapping to the same line have the same
    tag field
  • Check the contents of the cache by finding the line and
    checking the tag (see the sketch below)
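A minimal sketch in C of this field extraction for the 24-bit example
(w = 2, r = 14, 8-bit tag); the sample address is arbitrary.

#include <stdio.h>

#define WORD_BITS 2            /* 4-Byte blocks          */
#define LINE_BITS 14           /* 2^14 = 16K cache lines */

int main(void)
{
    unsigned addr = 0x16339C & 0xFFFFFF;   /* arbitrary 24-bit address */

    unsigned word = addr & ((1u << WORD_BITS) - 1);
    unsigned line = (addr >> WORD_BITS) & ((1u << LINE_BITS) - 1);
    unsigned tag  = addr >> (WORD_BITS + LINE_BITS);

    /* A hit means: cache[line] is valid and cache[line].tag == tag. */
    printf("addr=%06X  tag=%02X  line=%04X  word=%u\n", addr, tag, line, word);
    return 0;
}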

53
Direct Mapping Cache Organization
54
Direct Mapping pros cons
  • Simple
  • Inexpensive
  • Fixed location for given block
  • If a program repeatedly accesses 2 distinct
    blocks that are mapped to the same line, cache
    misses are very high (thrashing)

55
Associative Mapping
  • A main memory block can load into any line of
    cache
  • Memory address is interpreted as tag and word
  • Tag uniquely identifies block of memory
  • Every line's tag is examined for a match
  • Cache searching gets expensive

56
A simple example of Associative Mapping
[Figure: the same 16 memory blocks and 4-line cache; with associative
mapping a block may load into any line, so the address splits only into a
tag (s bits) and a word (w) field. In the snapshot shown, lines 0-3 hold
the blocks whose tags are 0011, 0001, 0000 and 0100]
Note: a replacement algorithm is needed (see later)
57
Associative Mapping Summarizing
  • address length: n = s + w bits
  • number of addressable units (words): 2^(s+w)
  • block size = cache line size = 2^w words
  • number of memory blocks: 2^(s+w) / 2^w = 2^s
  • number of cache lines: not determined by the address
    format
  • tag length: s bits

58
Associative MappingAddress Structure
Tag: 22 bits | Word: 2 bits
  • 22 bit tag stored with each 4 byte block of data
  • Compare tag field with tag entry in cache to
    check for hit
  • Least significant 2 bits of address identify
    which byte is required from the 4 byte data block

59
Fully Associative Cache Organization
60
Set Associative Mapping
  • Cache is divided into v sets
  • Each set contains k lines
  • number of cache lines: C = v x k
  • A given block maps to any line of a given set
  • Block j can be in any line of set i, where i = j mod v
  • With k lines per set this is k-way set associative
    mapping
  • k = 1: direct mapping; k = C: associative mapping
  • The best choice in practice is 2 lines per set
  • 2-way set associative mapping
  • A given block can be in only one set, but in either of
    its 2 lines

61
A simple example of Set Associative Mapping
[Figure: the same 16 memory blocks and a 4-line cache organized as 2 sets
of 2 lines; the address splits into tag (s-d), set (d) and word (w) fields.
Even-numbered blocks map to set 0 and odd-numbered blocks to set 1; in the
snapshot shown the four lines hold the tags 010, 000, 111 and 000]
Note: a replacement algorithm is needed (see later)
62
Set Associative Mapping
  • Address is in two parts
  • w Least Significant Bits (LSB) identify unique
    word
  • s Most Significant Bits (MSB) specify one memory
    block
  • The s MSBs are split into
  • a cache set field of d bits (least significant)
  • a tag of s-d bits (most significant)

63
Set Associative Mapping Summarizing
  • address length: n = s + w bits
  • number of addressable units (words): 2^(s+w)
  • block size = cache line size = 2^w words
  • number of memory blocks: 2^(s+w) / 2^w = 2^s
  • number of lines per cache set: k
  • number of sets: v = 2^d
  • number of cache lines: C = k x v = k x 2^d
  • tag length: s - d bits

64
Set Associative MappingAddress Structure
Tag: 9 bits | Set: 13 bits | Word: 2 bits

  • number of cache lines: 2^14
  • number of cache sets: 2^13
  • each cache set has two lines: 2-way set associative
    mapping
  • Use the set field to determine which cache set to look
    in
  • Compare the tag field with all lines in the set to see
    if we have a hit (see the sketch below)
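A minimal sketch in C of the lookup for this 2-way example (w = 2, d = 13
set bits, 9-bit tag); the cache contents and the sample address are
illustrative, not from the slides.

#include <stdio.h>

#define WORD_BITS 2
#define SET_BITS  13                     /* 2^13 sets x 2 ways = 2^14 lines */
#define WAYS      2

static struct line { unsigned valid, tag; } cache[1u << SET_BITS][WAYS];

/* Returns 1 on a hit: the tag is compared against every line of the set. */
static int lookup(unsigned addr)
{
    unsigned set = (addr >> WORD_BITS) & ((1u << SET_BITS) - 1);
    unsigned tag = addr >> (WORD_BITS + SET_BITS);
    for (int way = 0; way < WAYS; way++)
        if (cache[set][way].valid && cache[set][way].tag == tag)
            return 1;
    return 0;
}

int main(void)
{
    unsigned addr = 0x123456;            /* arbitrary 24-bit address */
    /* Pre-load the matching set so the lookup hits. */
    cache[(addr >> WORD_BITS) & ((1u << SET_BITS) - 1)][1] =
        (struct line){ 1, addr >> (WORD_BITS + SET_BITS) };
    printf("addr=%06X -> %s\n", addr, lookup(addr) ? "hit" : "miss");
    return 0;
}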

65
Two Way Set Associative Cache Organization
66
Replacement Algorithms (1): Direct Mapping
  • No choice
  • Each block only maps to one line
  • Replace that line

67
Replacement Algorithms (2): Associative & Set Associative
  • Hardware implemented algorithm (to obtain speed)
  • Least Recently Used (LRU)
  • e.g. in 2-way set associative
  • Which of the 2 blocks is LRU? (see the sketch below)
  • First in first out (FIFO)
  • replace block that has been in cache longest
  • Least frequently used
  • replace block which has had fewest hits
  • Random
  • Almost as good as LRU
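A minimal sketch in C (data structure and names assumed, not from the
slides) of LRU for a 2-way set: a single bit per set records which line
was used most recently, so the other line is always the victim.

#include <stdio.h>

/* One LRU bit per 2-way set: it names the most recently used way,
   so the replacement victim is always the other way. */
struct set2 {
    unsigned tag[2];
    unsigned valid[2];
    unsigned mru;                          /* way used most recently: 0 or 1 */
};

/* Called on a hit in 'way'. */
static void touch(struct set2 *s, int way)
{
    s->mru = (unsigned)way;
}

/* Called on a miss: the way to evict is the least recently used one. */
static int victim(const struct set2 *s)
{
    return (int)(1u - s->mru);
}

int main(void)
{
    struct set2 s = { {0x1A, 0x2B}, {1, 1}, 0 };
    touch(&s, 1);                                  /* way 1 was just hit...  */
    printf("evict way %d\n", victim(&s));          /* ...so way 0 is the LRU */
    return 0;
}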

68
Write Policy
  • Multiple CPUs may have individual caches
  • I/O may address main memory directly

cache(s) and main memory may become inconsistent
69
Write through
  • All writes go to main memory as well as cache
  • Multiple CPUs can monitor main memory traffic to
    keep local (to CPU) cache up to date
  • Lots of traffic
  • Slows down writes

70
Write back
  • Updates initially made in cache only
  • Update bit for cache slot is set when update
    occurs
  • If block has to be replaced, write to main memory
    only if update bit is set
  • I/O must access main memory through cache
  • N.B. 15% of memory references are writes
  • Caches of other devices get out of sync
  • Cache coherency problem (a general problem in
    distributed systems !)
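A minimal sketch in C (structure and function names assumed, not from the
slides) of the write-back policy: a write only updates the cached copy and
sets the update (dirty) bit; main memory is written only when a dirty line
is replaced.

#include <stdio.h>
#include <string.h>

struct cache_line {
    unsigned valid, dirty, tag;
    unsigned char data[4];
};

/* Write-back: update the cached copy and mark the line dirty. */
static void cache_write(struct cache_line *line, unsigned offset,
                        unsigned char byte)
{
    line->data[offset] = byte;
    line->dirty = 1;                  /* main memory is now stale */
}

/* On replacement, the old contents go to memory only if the dirty bit is set. */
static void replace(struct cache_line *line, unsigned new_tag,
                    const unsigned char *block, unsigned char *memory)
{
    if (line->valid && line->dirty)
        memcpy(memory, line->data, sizeof line->data);   /* write back */
    memcpy(line->data, block, sizeof line->data);
    line->tag = new_tag;
    line->valid = 1;
    line->dirty = 0;
}

int main(void)
{
    unsigned char memory[4]   = {0};
    unsigned char incoming[4] = {9, 9, 9, 9};
    struct cache_line line = { 1, 0, 0x12, {1, 2, 3, 4} };

    cache_write(&line, 0, 42);        /* dirty bit set, memory untouched */
    replace(&line, 0x34, incoming, memory);
    printf("memory[0]=%u (written back on replacement)\n", memory[0]);
    return 0;
}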

71
Block Size
  • Too small
  • Locality of reference is not used
  • Too large
  • Locality of reference is lost
  • Typical block size: 8 to 32 bytes

72
Number of Caches
  • 2 levels of cache
  • L1 on chip (since technology allows it)
  • L2 on board (to fill the speed gap)
  • 2 kinds of cache
  • Data cache
  • Instruction cache
  • So that parallel instruction processing and data
    fetching do not interfere with each other