ICS220 Data Structures and Algorithms - PowerPoint PPT Presentation


Provided by: ace80

1
ICS220 Data Structures and Algorithms

Lecture 13
Dr. Ken Cosh

2
Review

Data Compression Techniques
Huffman Coding method

3
This week

Memory Management
Memory Allocation
Garbage Collection

4
The Heap

Not a heap, but the heap.
Not the treelike data structure.
But the area of the computer's memory that is
dynamically allocated to programs.
In C++ we allocate parts of the heap using the
new operator, and reclaim them using the
delete operator.
C++ allows close control over how much memory is
used by your program.
In some programming languages (FORTRAN, COBOL,
BASIC), the compiler decides how much to
allocate.
Some programming languages (LISP, Smalltalk,
Eiffel, Java) have automatic storage reclamation.
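
As a small illustration of manual heap management, the sketch below allocates a block with new[], uses it, and hands it back with delete[]. The function name sumOnHeap is made up for this example.

```cpp
#include <cstddef>

// Allocate n ints on the heap, fill them with 1..n, sum them, then release the block.
long sumOnHeap(std::size_t n) {
    int* block = new int[n];              // request n * sizeof(int) bytes from the heap
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        block[i] = static_cast<int>(i + 1);
        total += block[i];
    }
    delete[] block;                       // return the block; omitting this leaks it
    return total;
}
```

Calling sumOnHeap(4) returns 10. In a language with automatic storage reclamation the delete[] line would simply not exist.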

5
External Fragmentation

External Fragmentation occurs when sections of
the memory have been allocated, and then some
deallocated, leaving gaps between used memory.
The heap may end up being many small pieces of
available memory sandwiched between pieces of
used memory.
A request may come for a certain amount of
memory, but perhaps no block of memory is big
enough, even though there is plenty of actual
space in memory.
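
The situation can be shown with a few numbers. The sketch below models the heap only as a list of free gap sizes (an assumption made for illustration) and compares the total free space with what a single request can actually get.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Total free memory: the sum of all gaps between used blocks.
int totalFree(const std::vector<int>& freeGaps) {
    return std::accumulate(freeGaps.begin(), freeGaps.end(), 0);
}

// A request can only be satisfied by one contiguous gap that is large enough.
bool canSatisfy(const std::vector<int>& freeGaps, int request) {
    return std::any_of(freeGaps.begin(), freeGaps.end(),
                       [request](int gap) { return gap >= request; });
}
```

With gaps of 10, 20 and 15 units there are 45 units free in total, yet a request for 30 units fails because no single gap is that large.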

6
Internal Fragmentation

Internal Fragmentation occurs when the memory
allocated to certain processes or data is too
large for its contents.
Here space is wasted: it has been allocated, but
it is not being used.

7
Sequential Fit Methods

When memory is requested a decision needs to be
made about which block of memory is allocated to
the request.
In order to discuss which method is best, we need
to investigate how memory might be managed.
Consider a linked list, containing links to each
block of available memory.
When memory is allocated or returned, the list is
rearranged, either by deletion or insertion.

8
Sequential Fit Methods

First Fit Algorithm,
Here the allocated memory is the first block
in the linked list that is large enough.
Best Fit Algorithm,
Here the block closest in size to the requested
size (but not smaller) is allocated.
Worst Fit Algorithm,
Here the largest block on the list is allocated.
Next Fit Algorithm,
Here the search resumes from where the previous
one stopped, and the next available block that is
large enough is allocated.
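
The four strategies can be sketched as follows. This is a simplified model, not a real allocator: the free list is a std::vector of block sizes rather than a linked list, each function returns the index of the chosen block (or -1), and next fit is given the position where the previous search stopped as a parameter.

```cpp
#include <cstddef>
#include <vector>

// First fit: the first block that is large enough.
int firstFit(const std::vector<int>& freeList, int request) {
    for (std::size_t i = 0; i < freeList.size(); ++i)
        if (freeList[i] >= request) return static_cast<int>(i);
    return -1;
}

// Best fit: the smallest block that is still large enough.
int bestFit(const std::vector<int>& freeList, int request) {
    int best = -1;
    for (std::size_t i = 0; i < freeList.size(); ++i)
        if (freeList[i] >= request && (best == -1 || freeList[i] < freeList[best]))
            best = static_cast<int>(i);
    return best;
}

// Worst fit: the largest block, provided it is large enough.
int worstFit(const std::vector<int>& freeList, int request) {
    int worst = -1;
    for (std::size_t i = 0; i < freeList.size(); ++i)
        if (freeList[i] >= request && (worst == -1 || freeList[i] > freeList[worst]))
            worst = static_cast<int>(i);
    return worst;
}

// Next fit: like first fit, but the scan starts at `start` and wraps around once.
int nextFit(const std::vector<int>& freeList, int request, std::size_t start) {
    for (std::size_t k = 0; k < freeList.size(); ++k) {
        std::size_t i = (start + k) % freeList.size();
        if (freeList[i] >= request) return static_cast<int>(i);
    }
    return -1;
}
```

For free blocks of sizes 50, 20, 100, 30 and a request of 25, first fit picks the 50 block, best fit the 30 block, worst fit the 100 block, and next fit starting at position 1 picks the 100 block.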

9
Comparing Sequential Fit Methods

First Fit is the most efficient, and Next Fit is
comparable in speed. However, Next Fit can cause
more external fragmentation.
The Best Fit algorithm tends to leave very small
blocks of practically unusable memory.
Worst Fit tries to avoid this fragmentation by
delaying the creation of small blocks.
Methods can be combined by considering the order
in which the linked list is sorted: if the
linked list is sorted largest to smallest, First
Fit becomes the same as Worst Fit.
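
The last point can be checked with a short sketch. The helper names below are invented for this example, and the free list is again modelled as a vector of block sizes.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Size of the block first fit would pick, or -1 if no block is large enough.
int firstFitSize(const std::vector<int>& freeList, int request) {
    auto it = std::find_if(freeList.begin(), freeList.end(),
                           [request](int size) { return size >= request; });
    return it == freeList.end() ? -1 : *it;
}

// Size of the block worst fit would pick: the largest block, if it is large enough.
int worstFitSize(const std::vector<int>& freeList, int request) {
    auto it = std::max_element(freeList.begin(), freeList.end());
    return (it == freeList.end() || *it < request) ? -1 : *it;
}

// A copy of the free list sorted largest to smallest.
std::vector<int> sortedDescending(std::vector<int> freeList) {
    std::sort(freeList.begin(), freeList.end(), std::greater<int>());
    return freeList;
}
```

On the unsorted list 50, 20, 100, 30 with a request of 25, first fit takes the 50 block and worst fit the 100 block. Once the list is sorted largest to smallest, first fit also takes the 100 block.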

10
Non-Sequential Fit Methods

In reality with large memory, sequential fit
methods are inefficient.
Therefore non-sequential fit methods are used
where memory is divided into sections of a
certain size.
An example is a buddy system.

11
Buddy Systems

In buddy systems memory can be divided into
sections, with each location being a buddy of
another location.
Whenever possible the buddies are combined to
create a larger memory location.
If smaller memory needs to be allocated the
buddies are divided, and then reunited (if
possible) when the memory is returned.

12
Binary Buddy Systems

In binary buddy systems the memory is divided
into 2 equally sized blocks.
Suppose we have 8 memory locations
000,001, 010, 011, 100, 101, 110, 111
Each of these memory locations is of size 1.
Suppose we need a memory location of size 2.
000, 010, 100, 110
Or of size 4,
000, 100
Or size 8.
000
In reality the memory is combined and only broken
down when requested.
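
One reason binary buddies are convenient is that a block's buddy can be found from its address alone. The sketch below uses the usual bit trick, which is not stated on the slide: for a block of size 2^k starting at a multiple of 2^k, the buddy's address differs only in bit k. The function names are made up for this example.

```cpp
#include <cstdint>

// Address of the buddy of the block starting at addr; size must be a power of two.
std::uint32_t buddyOf(std::uint32_t addr, std::uint32_t size) {
    return addr ^ size;    // flip the one bit that distinguishes the two halves
}

// Start address of the larger block formed when a block and its buddy are reunited.
std::uint32_t parentOf(std::uint32_t addr, std::uint32_t size) {
    return addr & ~size;   // clear that bit: the lower of the two addresses
}
```

With the 8 locations above, the size-2 block at 010 has the block at 000 as its buddy, and the size-4 block at 100 has the block at 000 as its buddy.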

13
Buddy System in 1024k memory

14
Sequence of Requests.

Program A requests memory 34K..64K in size
Program B requests memory 66K..128K in size
Program C requests memory 35K..64K in size
Program D requests memory 67K..128K in size
Program C releases its memory
Program A releases its memory
Program B releases its memory
Program D releases its memory

15
If memory is to be allocated

1. Look for a memory slot of a suitable size.
   If one is found, it is allocated to the program.
   If not, the system tries to make a suitable
   memory slot, as follows.
2. Split a free memory slot larger than the
   requested memory size in half.
3. If the lower limit is reached, allocate that
   amount of memory.
4. Go back to step 1 (look for a memory slot of a
   suitable size).
5. Repeat this process until a suitable memory slot
   is found.
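
The arithmetic behind these steps can be sketched as below. The helper names are invented, sizes are in K, and the 64K lower limit used in the example is an assumption suggested by the request ranges on the previous slide.

```cpp
// Smallest power-of-two block (in K) that holds the request, never below the lower limit.
int blockSizeFor(int requestK, int minBlockK) {
    int size = minBlockK;
    while (size < requestK) size *= 2;
    return size;
}

// Number of halvings needed to carve a block of targetK out of a free block of freeK.
int splitsNeeded(int freeK, int targetK) {
    int splits = 0;
    while (freeK > targetK) {
        freeK /= 2;
        ++splits;
    }
    return splits;
}
```

With a 64K lower limit, a 34K request maps to a 64K block, and carving it out of one free 1024K block takes four splits (1024, 512, 256, 128, 64). A 66K request maps to a 128K block.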

16
If memory is to be freed

1. Free the block of memory.
2. Look at the neighbouring block (its buddy): is
   it free too?
3. If it is, combine the two, and go back to step 2.
   Repeat this process until either the upper
   limit is reached (all memory is freed), or
   a non-free neighbour block is encountered.
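
Putting allocation and freeing together, a minimal binary buddy allocator might look like the sketch below. It only tracks addresses and sizes (in K) and does not manage real memory. The class name, the per-size free sets and the 64K lower limit in the example are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <map>
#include <set>

// Minimal binary buddy allocator over addresses [0, total).
// total and minBlock must be powers of two (assumed, not checked).
class BuddyAllocator {
public:
    BuddyAllocator(int total, int minBlock) : total_(total), minBlock_(minBlock) {
        free_[total].insert(0);                  // one free block covering everything
    }

    // Returns the start address of the allocated block, or -1 if nothing fits.
    int allocate(int request) {
        int size = minBlock_;
        while (size < request) size *= 2;        // round up to a power-of-two block
        int avail = size;
        while (avail <= total_ && free_[avail].empty()) avail *= 2;  // smallest free block that fits
        if (avail > total_) return -1;
        int addr = *free_[avail].begin();
        free_[avail].erase(free_[avail].begin());
        while (avail > size) {                   // split, keeping the lower half
            avail /= 2;
            free_[avail].insert(addr + avail);   // the upper half becomes a free buddy
        }
        used_[addr] = size;
        return addr;
    }

    // Frees the block at addr and merges it with its buddy while the buddy is free.
    void release(int addr) {
        int size = used_.at(addr);
        used_.erase(addr);
        while (size < total_) {
            int buddy = addr ^ size;
            auto it = free_[size].find(buddy);
            if (it == free_[size].end()) break;  // buddy still in use: stop merging
            free_[size].erase(it);
            addr = std::min(addr, buddy);
            size *= 2;
        }
        free_[size].insert(addr);
    }

    // Size of the largest free block, or 0 if there is none.
    int largestFree() const {
        int best = 0;
        for (const auto& entry : free_)
            if (!entry.second.empty()) best = std::max(best, entry.first);
        return best;
    }

private:
    int total_, minBlock_;
    std::map<int, std::set<int>> free_;   // block size -> start addresses of free blocks
    std::map<int, int> used_;             // start address -> size of allocated block
};
```

Running the earlier request sequence on BuddyAllocator(1024, 64), programs A, B, C and D receive the blocks starting at 0K, 128K, 64K and 256K, leaving a largest free block of 512K. After C, A, B and D release their memory in that order, the buddies are reunited step by step until a single free 1024K block remains.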

17
Buddy Systems

Unfortunately with Buddy Systems there can be
significant internal fragmentation.
For example, Program A requested 34K of memory but
was assigned a 64K block.
The sequence of block sizes allowed is
1, 2, 4, 8, 16, ..., 2^m
An improvement can be gained from varying the
block size sequence.
1, 2, 3, 5, 8, 13, ...
Otherwise known as the Fibonacci sequence.
When using this sequence further complications
arise, for instance when finding the
buddy of a returned block.
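
The effect of the block size sequence on internal fragmentation can be worked through with a small sketch. The function names are made up, and sizes are treated as plain units (K in the example).

```cpp
#include <vector>

// Smallest size in the ascending sequence that holds the request, or -1 if none does.
int smallestBlock(const std::vector<int>& sizes, int request) {
    for (int size : sizes)
        if (size >= request) return size;
    return -1;
}

// Powers of two up to limit: 1, 2, 4, 8, ...
std::vector<int> binarySizes(int limit) {
    std::vector<int> sizes;
    for (int s = 1; s <= limit; s *= 2) sizes.push_back(s);
    return sizes;
}

// Fibonacci sizes up to limit: 1, 2, 3, 5, 8, 13, ...
std::vector<int> fibonacciSizes(int limit) {
    std::vector<int> sizes;
    for (int a = 1, b = 2; a <= limit;) {
        sizes.push_back(a);
        int next = a + b;
        a = b;
        b = next;
    }
    return sizes;
}
```

For a 34K request the binary sequence gives a 64K block (30K wasted), while the Fibonacci sequence gives a 34K block with nothing wasted. That is a best case, since 34 happens to be a Fibonacci number. For a 66K request the blocks are 128K and 89K, wasting 62K and 23K respectively.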

18
Fragmentation

It is worth noticing that internal and external
fragmentation are roughly inversely proportional.
As internal fragmentation is avoided through
precise memory allocation, external fragmentation
tends to increase.

19
Garbage Collection

Another key function of memory management is
garbage collection.
Garbage collection is the return of areas of
memory once their use is no longer required.
Garbage collection in some languages is
automated, while in others it is manual, such as
through the delete keyword.

20
Garbage Collection

Garbage collection follows two key phases:
Determine what data objects in a program will not
be accessed in the future.
Reclaim the storage used by those objects.

21
Mark and Sweep

The Mark and Sweep method of garbage collection
breaks the two tasks into distinct phases.
First, each used memory location is marked.
Second, the memory is swept to return the unused
cells to the memory pool.

22
Marking

A simple marking algorithm follows the preorder
tree traversal method
marking(node)
    if node is not marked
        mark node
        if node is not an atom
            marking(head(node))
            marking(tail(node))
This algorithm can then be called for all root
memory items.
Recall the problem with this algorithm?
Excessive use of the runtime stack through
recursion, especially with the potential size of
the data to traverse.
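
In C++ the recursive marking might be sketched as follows. The Cell layout (a marked flag, an atom flag, and head and tail links) is an assumption modelled on the LISP-style cells the pseudocode implies.

```cpp
// A LISP-style cell: an atom holds data only; a non-atom holds head and tail links.
struct Cell {
    bool marked = false;
    bool atom = true;
    Cell* head = nullptr;
    Cell* tail = nullptr;
};

// Preorder marking: mark the node, then recurse into its head and tail.
void marking(Cell* node) {
    if (node == nullptr || node->marked) return;   // already visited, or no link
    node->marked = true;
    if (!node->atom) {
        marking(node->head);
        marking(node->tail);
    }
}
```

The marked check is what stops the recursion when the structure contains cycles. The recursion depth still grows with the length of the longest chain of links.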

23
Alternative Marking

The obvious alternative to the recursive
algorithm is an iterative version.
The iterative version however just makes
excessive use of a stack which means using
memory in order to reclaim space from memory.
A better approach doesn't require extra memory.
Here each link is followed, and the path back is
remembered by temporarily inverting links between
nodes.
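
For comparison, an iterative version with an explicit stack might look like the sketch below, using the same assumed Cell layout as before. It makes the cost visible: the pending vector can grow with the size of the structure being marked.

```cpp
#include <vector>

struct Cell {
    bool marked = false;
    bool atom = true;
    Cell* head = nullptr;
    Cell* tail = nullptr;
};

// Iterative marking: an explicit stack replaces the runtime stack.
void markIterative(Cell* root) {
    std::vector<Cell*> pending;
    pending.push_back(root);
    while (!pending.empty()) {
        Cell* node = pending.back();
        pending.pop_back();
        if (node == nullptr || node->marked) continue;
        node->marked = true;
        if (!node->atom) {
            pending.push_back(node->tail);   // pushed first, so head is visited first
            pending.push_back(node->head);
        }
    }
}
```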

24
Schorr and Waite

SWmark(curr)
    prev = null
    while (1)
        mark curr
        if head(curr) is marked or atom
            if head(curr) is unmarked atom
                mark head(curr)
            while tail(curr) is marked or atom
                if tail(curr) is an unmarked atom
                    mark tail(curr)
                while prev is not null and tag(prev) is 1
                    tag(prev) = 0
                    invertLink(curr, prev, tail(prev))
                if prev is not null
                    invertLink(curr, prev, head(prev))
                else finished
            tag(curr) = 1
            invertLink(prev, curr, tail(curr))
        else invertLink(prev, curr, head(curr))
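
The link inversion idea can also be written out in C++. The sketch below is not a line-by-line transcription of the pseudocode: it descends into atoms instead of marking them in place, and it spells out the link swaps rather than calling invertLink. The Cell layout, including the tag bit that records which link is currently inverted, is an assumption.

```cpp
struct Cell {
    bool marked = false;
    bool tag = false;      // true while the tail link (not the head link) is inverted
    bool atom = true;
    Cell* head = nullptr;
    Cell* tail = nullptr;
};

// Marks everything reachable from root using no stack: the way back up is
// stored in the cells themselves by temporarily reversing the link just followed.
void swMark(Cell* root) {
    if (root == nullptr || root->marked) return;
    Cell* prev = nullptr;
    Cell* curr = root;
    curr->marked = true;
    while (true) {
        if (!curr->atom && curr->head && !curr->head->marked) {
            Cell* next = curr->head;     // descend via head
            curr->head = prev;           // head now points back to the parent
            curr->tag = false;
            prev = curr;
            curr = next;
            curr->marked = true;
        } else if (!curr->atom && curr->tail && !curr->tail->marked) {
            Cell* next = curr->tail;     // descend via tail
            curr->tail = prev;           // tail now points back to the parent
            curr->tag = true;
            prev = curr;
            curr = next;
            curr->marked = true;
        } else {
            if (prev == nullptr) return; // back at the root with nothing left to visit
            if (prev->tag) {             // we came down through prev's tail
                Cell* up = prev->tail;
                prev->tail = curr;       // restore the original link
                prev->tag = false;
                curr = prev;
                prev = up;
            } else {                     // we came down through prev's head
                Cell* up = prev->head;
                prev->head = curr;       // restore the original link
                curr = prev;
                prev = up;
            }
        }
    }
}
```

When the function returns, every link has been restored to its original value, so the structure is unchanged apart from the marks.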

25
Sweep

Having marked all used (linked) memory locations,
the next step is to sweep through the memory.
Sweep() checks every item in the memory; any
which haven't been marked are then returned to
available memory.
Sadly, this can often leave the memory with used
locations sparsely scattered throughout.
A further phase is required: compaction.
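
A sweep over a small heap might be sketched like this. The heap is modelled as a vector of cells, and the inUse flag standing in for "not in the available memory pool" is an assumption made for illustration.

```cpp
#include <cstddef>
#include <vector>

struct Cell {
    bool marked = false;
    bool inUse = true;     // false once the cell is back in the available pool
};

// Visits every cell; returns the number of cells reclaimed.
std::size_t sweep(std::vector<Cell>& heap) {
    std::size_t reclaimed = 0;
    for (Cell& cell : heap) {
        if (!cell.inUse) continue;   // already available
        if (cell.marked) {
            cell.marked = false;     // reset the mark for the next collection
        } else {
            cell.inUse = false;      // unmarked means unreachable: reclaim it
            ++reclaimed;
        }
    }
    return reclaimed;
}
```

If cells 0, 2 and 4 of a five-cell heap are marked, the sweep reclaims cells 1 and 3, and the surviving cells are left scattered at positions 0, 2 and 4.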

26
Compaction

Compaction involves copying data to one section
of the computer's memory.
As our data is likely to involve linked data
structures, we need to maintain the pointers to
the nodes even when their location changes.

[Figure: heap diagram with nodes labelled A, B and C]

27
Compaction

compact()
    lo = bottom of heap
    hi = top of the heap
    while (lo < hi)
        while lo is marked
            lo++
        while hi is not marked
            hi--
        unmark cell hi
        cell lo = cell hi
        tail(hi--) = lo    // forwarding address
    lo = the bottom of heap
    while (lo < hi)
        if lo is not atom and head(lo) > hi
            head(lo) = tail(head(lo))
        if lo is not atom and tail(lo) > hi
            tail(lo) = tail(tail(lo))
        lo++
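
A runnable variant of this two-pointer compaction is sketched below. It follows the same idea (move marked cells from the top into free slots at the bottom, leave a forwarding address in the old cell's tail, then fix up links) but it is written independently of the pseudocode. The heap is modelled as a vector, links are indices with -1 for "none", and moved cells keep their marks. All of these are assumptions made for illustration.

```cpp
#include <vector>

struct Cell {
    bool marked = false;
    bool atom = true;
    int head = -1;         // index of another cell, or -1
    int tail = -1;
};

// Moves marked cells to the low end of the heap and returns how many there are.
int compact(std::vector<Cell>& heap) {
    int lo = 0;
    int hi = static_cast<int>(heap.size()) - 1;
    while (true) {
        while (lo < hi && heap[lo].marked) ++lo;    // find a free slot from the bottom
        while (lo < hi && !heap[hi].marked) --hi;   // find a live cell from the top
        if (lo >= hi) break;
        heap[lo] = heap[hi];                        // move the live cell down
        heap[hi].marked = false;
        heap[hi].tail = lo;                         // forwarding address
        ++lo;
        --hi;
    }
    int live = lo;
    if (live < static_cast<int>(heap.size()) && heap[live].marked) ++live;
    for (int i = 0; i < live; ++i) {                // second pass: fix up links
        if (heap[i].atom) continue;
        if (heap[i].head >= live) heap[i].head = heap[heap[i].head].tail;
        if (heap[i].tail >= live) heap[i].tail = heap[heap[i].tail].tail;
    }
    return live;
}
```

Any root pointers held outside the heap need the same treatment: a root that referred to a cell at or above the returned boundary must be replaced by the forwarding address stored in that cell's tail.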

28
Incremental Garbage Collection

The Mark and Sweep method of garbage collection
is called automatically when the available memory
resources become insufficient.
When it is called the program is likely to pause
while the algorithm runs.
In real-time systems this is unacceptable, so
another approach can be considered.
The alternative approach is incremental garbage
collection.

29
Incremental Garbage Collection

In incremental garbage collection the collection
phase is interleaved with the program.
Here the program is called a mutator, as it can
change the data the garbage collector is tidying.
One approach, similar to mark and sweep, is
to intermittently copy n items from a fromspace
to a tospace, two semispaces in the computer's
memory.
The next time, the roles of the two spaces are
switched.
Consider: what are the pros and cons of
incremental collection vs mark and sweep?
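
To make the copying approach more concrete, the sketch below copies reachable cells from a fromspace to a tospace a few at a time. It is only the collector's half of the story: it ignores the mutator entirely, whereas a real incremental collector also has to make sure the running program never sees a stale fromspace link. The class name, the Cell layout and the use of vector indices as links are assumptions made for illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Links are indices into the cell's own semispace; -1 means none.
struct Cell {
    bool atom = true;
    int head = -1;
    int tail = -1;
    int forward = -1;      // new index in tospace once the cell has been copied
};

class SemispaceCopier {
public:
    explicit SemispaceCopier(std::vector<Cell> fromspace) : from_(std::move(fromspace)) {}

    // Copies a root cell into tospace and returns its new index.
    int copyRoot(int root) { return copy(root); }

    // Scans up to n copied cells, copying whatever they still reference in fromspace.
    // Returns true once every copied cell has been scanned.
    bool step(std::size_t n) {
        for (; n > 0 && scan_ < to_.size(); --n, ++scan_) {
            if (to_[scan_].atom) continue;
            int newHead = copy(to_[scan_].head);   // copy() may grow to_, so index again
            to_[scan_].head = newHead;
            int newTail = copy(to_[scan_].tail);
            to_[scan_].tail = newTail;
        }
        return scan_ == to_.size();
    }

    // Switches the roles of the two semispaces for the next collection.
    void flip() {
        from_ = std::move(to_);
        to_.clear();
        scan_ = 0;
    }

    const std::vector<Cell>& tospace() const { return to_; }
    const std::vector<Cell>& fromspace() const { return from_; }

private:
    int copy(int index) {
        if (index < 0) return -1;
        if (from_[index].forward >= 0) return from_[index].forward;  // already copied
        to_.push_back(from_[index]);
        from_[index].forward = static_cast<int>(to_.size()) - 1;
        return from_[index].forward;
    }

    std::vector<Cell> from_;
    std::vector<Cell> to_;
    std::size_t scan_ = 0;
};
```

Each call to step(n) does a bounded amount of work, so the program can run between calls. Cells that are never reached are simply left behind in the fromspace, and the surviving cells end up packed together in the tospace, so no separate compaction phase is needed.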