Title: Amortized Analysis of Algorithms Adnan YAZICI Dept. of Computer Engineering Middle East Technical Univ. Ankara - TURKEY
- Worst-case analysis is sometimes overly pessimistic.
- Amortized analysis of an algorithm involves computing the maximum total cost of all the operations performed on the various data structures.
- An amortized cost applies to each operation, even when there are several types of operations in the sequence.
- In amortized analysis, the time required to perform a sequence of data structure operations is averaged over all the operations performed. That is, the large cost of one operation is spread out (amortized) over many operations, most of which are less expensive.
- Therefore, amortized analysis can be used to show that the average cost of an operation is small when averaged over a sequence of operations, even though a single operation in the sequence might be very expensive.
- Amortized time analysis often gives a more accurate bound than analyzing the worst case of each operation separately.
- Such situations arise fairly often in connection with dynamic sets and their associated operations.
- Example: the time needed to get a cup of coffee in a common coffee room. Once in a while, you have to start a fresh brew when you find the pot empty. Getting coffee is quick in the amortized sense, since a long time is required only after several cups have been obtained quickly.
- Operations:
  - get a cup of coffee (quick)
  - brew a fresh pot (time consuming)
- Amortized analysis differs from average-case analysis in that probability is not involved in amortized analysis.
- Rather than taking the average over all possible inputs, which requires an assumption on the probability distribution of instances, in amortized analysis we take the average over successive calls.
- In amortized analysis the times taken by the various calls are highly dependent, whereas in average-case analysis we implicitly assume that each call is independent of the others.
- Suppose we have an ADT and we want to analyze its operations using amortized time analysis. Amortized time analysis is based on the following equation, which applies to each individual operation of this ADT:
  amortized cost = actual cost + accounting cost
- The creative part is to design a system of accounting costs for individual operations that achieves two goals:
  1. In any legal sequence of operations, beginning from the creation of the ADT object being analyzed, the sum of the accounting costs is nonnegative.
  2. Although the actual cost may fluctuate widely from one individual operation to the next, it is feasible to analyze the amortized cost of each operation.
- If these two goals are achieved, then the total amortized cost of a sequence of operations (always starting from the creation of the ADT object) is an upper bound on the total actual cost.
- Intuitively, the sum of the accounting costs is like a savings account.
- The main idea for designing a system of accounting costs is that normal individual operations should have a positive accounting cost, while the unusually expensive individual operations receive a negative accounting cost.
- Working out how big to make the positive charges often requires creativity, and may involve a degree of trial and error to arrive at an amount that is reasonably small, yet large enough to prevent the accounting balance from going negative.
- There are three common techniques used in amortized analysis:
  - The aggregate method
  - The accounting trick (accounting method)
  - The potential function method
- Aggregate method
- We show that a sequence of n operations takes worst-case time T(n) in total. In the worst case, the average cost, or amortized cost, per operation is therefore T(n) / n.
- In the aggregate method, all operations have the same amortized cost.
- The other two methods, the accounting trick and the potential function method, may assign different amortized costs to different types of operations.
- Example: stack operations
  - Push(S, x) pushes object x onto stack S.
  - Pop(S) pops the top of stack S and returns the popped object.
  - Multipop(S, k) removes the k top objects of stack S (or all of them, if S holds fewer than k objects).
- The action of Multipop on a stack S is as follows:

  Multipop(S, k)
      while not STACK-EMPTY(S) and k ≠ 0
          do POP(S)
             k ← k − 1

- The top 4 objects are popped by Multipop(S, 4); the result is shown in the second column.

  initial stack      after Multipop(S, 4)
  top -> 23
         34
         14
         10
         22          22
         50          50
- The worst-case cost of a Multipop operation in the sequence is O(n), hence a sequence of n operations costs $O(n^2)$ (since we may have O(n) Multipop operations costing O(n) each, the stack size being at most n).
- Although this analysis is correct, it is not tight.
- Using the aggregate method of amortized analysis, we can obtain a tighter upper bound that considers the entire sequence of n operations.
- In fact, although a single Multipop operation can be expensive, any sequence of n Push, Pop, and Multipop operations on an initially empty stack can cost at most O(n). Why?
- Because each object can be popped at most once for each time it is pushed. Therefore, the number of times that Pop can be called on a nonempty stack, including calls within Multipop, is at most the number of Push operations, which is at most n. For any value of n, any sequence of n Push, Pop, and Multipop operations takes a total of O(n) time.
- The amortized cost of an operation is the average: O(n) / n = O(1).
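The aggregate argument can be checked empirically. The following is a minimal sketch, assuming a unit cost per element pushed or popped; the names `CountingStack` and `run_sequence` are hypothetical and not part of the original slides.

```python
class CountingStack:
    """Stack that records the actual cost: one unit per element pushed or popped."""

    def __init__(self):
        self.items = []
        self.cost = 0  # total actual cost so far

    def push(self, x):
        self.items.append(x)
        self.cost += 1

    def pop(self):
        self.cost += 1
        return self.items.pop()

    def multipop(self, k):
        # Same loop as Multipop(S, k): stop when the stack is empty or k reaches 0.
        while self.items and k != 0:
            self.pop()
            k -= 1


def run_sequence(n):
    """Run n operations: every 10th one is a Multipop that empties the stack, the rest are pushes."""
    s = CountingStack()
    for i in range(1, n + 1):
        if i % 10 == 0:
            s.multipop(len(s.items))
        else:
            s.push(i)
    return s.cost
```

Although individual Multipop calls in this sequence cost 9 units each, the total stays below 2n, because every pop is paid for by an earlier push.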
- Accounting trick
- Different charges are assigned to different operations. Some operations are charged more or less than they actually cost.
- When an operation's amortized cost exceeds its actual cost, the difference is assigned to specific objects in the data structure as credit.
- Credit can be used later on to help pay for operations whose amortized cost is less than their actual cost.
- One must choose the amortized costs of operations carefully. The total credit in the data structure should never become negative; otherwise the total amortized cost would not be an upper bound on the total actual cost.
- Example 1: stack operations
- The actual costs of the operations are:
  - Push: 1
  - Pop: 1
  - Multipop: min(k, s)
  where k is the argument supplied to Multipop and s is the stack size when it is called.
- We assign the following amortized costs:
  - Push: 2
  - Pop: 0
  - Multipop: 0
- Here all three amortized costs are O(1), although in general the amortized costs of the operations under consideration may differ asymptotically.
- We shall now show that we can pay for any sequence of stack operations by charging the amortized costs.
- For a Push operation we pay the actual cost of the push, 1 token, and are left with a credit of 1 token out of the 2 tokens charged, which we store with the pushed object (the "plate" on top of the stack).
- When we execute a Pop operation, we charge the operation nothing and pay its actual cost using the credit stored in the stack. Thus, by charging the Push operation a little bit more, we need not charge the Pop operation anything.
- We need not charge the Multipop operation anything either. We have always charged at least enough up front to pay for the Multipop operations.
- Thus, for any sequence of n Push, Pop, and Multipop operations, the total amortized cost is an upper bound on the total actual cost. Since the total amortized cost is O(n), so is the total actual cost.
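The token argument can be replayed mechanically. The sketch below is illustrative only: the function name `run_with_credits` and the tuple encoding of operations are assumptions, and a Pop on an empty stack is treated as a free no-op.

```python
def run_with_credits(ops):
    """Replay stack operations, charging Push 2, Pop 0, Multipop 0.

    ops is a list of ("push",), ("pop",) or ("multipop", k) tuples.
    Returns (total_amortized, total_actual, lowest_credit).
    """
    size = 0      # current number of objects on the stack
    credit = 0    # tokens currently stored on the stacked objects
    lowest = 0    # lowest credit balance seen so far
    amortized = actual = 0
    for op in ops:
        if op[0] == "push":
            charge, cost = 2, 1
            size += 1
        elif op[0] == "pop":
            charge, cost = 0, min(1, size)      # assumption: popping an empty stack costs 0
            size -= cost
        else:
            charge, cost = 0, min(op[1], size)  # actual cost min(k, s)
            size -= cost
        amortized += charge
        actual += cost
        credit += charge - cost
        lowest = min(lowest, credit)
    return amortized, actual, lowest
```

In this scheme the credit always equals the number of objects on the stack, so it never goes negative and the total charged is at least the total actual cost.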
- Example 2: accounting scheme for a stack with array doubling
- Say the actual cost of a push or pop is 1 when no resizing of the array occurs, and
- the actual cost of a push is 1 + nt, for some constant t, if it involves doubling the array size from n to 2n and copying n elements over to the new array.
- So the worst-case actual time for a push is $\Theta(n)$. However, the amortized analysis gives a more accurate picture.
- The accounting costs are:
  - for a push that does not require array doubling: 2t
  - for a push that requires doubling the array from n to 2n: −nt + 2t
  - for a pop: 0
- The coefficient of 2 in the accounting costs is chosen to be large enough that, from the time the stack is created, the sum of the accounting costs can never be negative. To see this informally: when the account balance (the net sum of accounting costs) has grown to 2nt and the first doubling occurs (from size n to 2n), the first negative charge reduces it to nt + 2t. Therefore, this is a valid accounting scheme for the Stack ADT.
- With some experimentation we can convince ourselves that any coefficient less than 2 will lead to eventual bankruptcy in the worst case.
- For a push that doubles the array: amortized cost = actual cost + accounting cost = (1 + nt) + (−nt + 2t) = 1 + 2t.
- With this accounting scheme, the amortized cost of each individual push operation is 1 + 2t, whether it causes array doubling or not, and the amortized cost of each pop operation is 1. Thus we can say that both push and pop run in worst-case amortized time that is in $\Theta(1)$.
- More complicated data structures often require more complicated accounting schemes, which require more creativity to think up.
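The "experimentation" with the coefficient can be sketched in a few lines. This is a hedged illustration, not part of the original slides: the function `min_balance`, its parameters, and the choice of an initial capacity of 1 are assumptions.

```python
def min_balance(num_pushes, coefficient, t=1, initial_capacity=1):
    """Worst case of pushes only: return the lowest account balance ever reached.

    A push without doubling has accounting cost coefficient*t; a push that
    doubles the array from n to 2n has accounting cost -n*t + coefficient*t.
    """
    capacity, size = initial_capacity, 0
    balance = 0
    lowest = 0
    for _ in range(num_pushes):
        if size == capacity:
            balance += -capacity * t + coefficient * t
            capacity *= 2
        else:
            balance += coefficient * t
        size += 1
        lowest = min(lowest, balance)
    return lowest
```

With coefficient 2 the balance never drops below zero in this simulation, while coefficients 1 and 1.5 both go negative within the first few doublings.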
- The potential function method
- The potential is associated with the data structure as a whole rather than with specific objects within the data structure.
- The potential method works as follows:
  - We start with an initial data structure $D_0$ on which n operations are performed.
  - For each i = 1, 2, ..., n, we let $c_i$ be the actual cost of the ith operation and $D_i$ be the data structure that results after applying the ith operation to data structure $D_{i-1}$.
- A potential function $\Phi$ maps each data structure $D_i$ to a real number.
- $\Phi(D_i)$ is the potential associated with data structure $D_i$.
- The amortized cost $ac_i$ of the ith operation with respect to potential function $\Phi$ is defined by
  $ac_i = c_i + \Phi(D_i) - \Phi(D_{i-1})$.
- The amortized cost of each operation is therefore its actual cost ($c_i$) plus the increase in potential ($\Phi(D_i) - \Phi(D_{i-1})$) caused by the ith operation.
- So the total amortized cost of the n operations is
  $\sum_{1 \le i \le n} ac_i = \sum_{1 \le i \le n} (c_i + \Phi(D_i) - \Phi(D_{i-1})) = \sum_{1 \le i \le n} c_i + \Phi(D_n) - \Phi(D_0)$.
- Here we used a telescoping series: for any sequence $a_0, a_1, \ldots, a_n$,
  $\sum_{1 \le k \le n} (a_k - a_{k-1}) = a_n - a_0$.
- $\sum_{1 \le i \le n} ac_i = \sum_{1 \le i \le n} c_i + \Phi(D_n) - \Phi(D_0)$.
- If we can define a potential function $\Phi$ so that $\Phi(D_n) \ge \Phi(D_0)$, then the total amortized cost, $\sum_{1 \le i \le n} ac_i$, is an upper bound on the total actual cost needed to perform a sequence of operations.
- It is often convenient to define $\Phi(D_0)$ to be 0 and then show that $\Phi(D_i) \ge 0$ for all i.
- The challenge in applying this technique is to figure out the proper potential function.
- Different potential functions may yield different amortized costs yet still be upper bounds on the actual costs.
- Example: suppose that the process to be analysed modifies a database and that its efficiency each time it is called depends on the current state of that database. We associate with the database a notion of cleanliness, known as its potential function.
- Formally, we introduce the following parameters:
  - $\Phi$: an integer-valued potential function of the state of the database. Larger values of $\Phi$ correspond to dirtier states.
  - $\Phi_0$: the value of $\Phi$ on the initial state; it represents our standard of cleanliness.
  - $\Phi_i$: the value of $\Phi$ on the database after the ith call on the process.
  - $c_i$: the actual time needed by that call.
  - $ac_i$: the amortized time, which is the actual time required to carry out the ith call on the process plus the increase in potential caused by that call.
- So the amortized time taken by that call is
  $ac_i = c_i + \Phi_i - \Phi_{i-1}$.
- Let $T_n$ denote the total time required for the first n calls on the process, and denote the total amortized time by $aT_n$:
  $aT_n = \sum_{1 \le i \le n} ac_i = \sum_{1 \le i \le n} (c_i + \Phi_i - \Phi_{i-1})$
  $= \sum_{1 \le i \le n} c_i + \sum_{1 \le i \le n} \Phi_i - \sum_{1 \le i \le n} \Phi_{i-1}$
  $= T_n + \Phi_n + \Phi_{n-1} + \cdots + \Phi_1 - \Phi_{n-1} - \cdots - \Phi_1 - \Phi_0$
  $= T_n + \Phi_n - \Phi_0$.
- Therefore, $aT_n = T_n + (\Phi_n - \Phi_0)$.
- The significance of this is that $T_n \le aT_n$ holds for all n provided $\Phi_n$ never becomes smaller than $\Phi_0$. In other words, the total amortized time is always an upper bound on the total actual time needed to perform a sequence of operations, as long as the database is never allowed to become cleaner than it was initially.
- This shows that overcleaning can be harmful!
- This approach is interesting when the actual time varies significantly from one call to the next, whereas the amortized time is nearly invariant.
- Example: stack operations
- We define $\Phi$ on a stack to be the number of objects in the stack.
- The stack $D_i$ that results after the ith operation has nonnegative potential, since the number of objects in the stack is never negative. Thus,
  $\Phi(D_i) \ge 0 = \Phi(D_0)$.
- The total amortized cost of n operations w.r.t. $\Phi$ therefore represents an upper bound on the actual cost.
- The amortized costs of the various stack operations are as follows.
- If the ith operation on a stack containing s objects is a Push operation, then the potential difference is
  $\Phi(D_i) - \Phi(D_{i-1}) = (s + 1) - s = 1$.
- The amortized cost of this Push operation is
  $ac_i = c_i + \Phi(D_i) - \Phi(D_{i-1}) = 1 + 1 = 2$.
- If the ith operation is a Pop on a nonempty stack, one object is popped off the stack. The actual cost of the Pop operation is 1, and the potential difference is
  $\Phi(D_i) - \Phi(D_{i-1}) = -1$.
- Thus, the amortized cost of this Pop operation is
  $ac_i = c_i + \Phi(D_i) - \Phi(D_{i-1}) = 1 - 1 = 0$.
- Suppose that the ith operation on the stack is Multipop(S, k) and that $k' = \min(k, s)$ objects are popped off the stack. The actual cost of the operation is $k'$, and the potential difference is
  $\Phi(D_i) - \Phi(D_{i-1}) = -k'$.
- Thus, the amortized cost of this Multipop operation is
  $ac_i = c_i + \Phi(D_i) - \Phi(D_{i-1}) = k' - k' = 0$.
- Therefore, the amortized cost of each of the three operations is O(1), and thus the total amortized cost of a sequence of n operations is O(n).
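The three amortized costs can be recomputed directly from the definition. A minimal sketch, assuming the potential is the stack size; `stack_amortized_costs` and the tuple encoding of operations are hypothetical names, and a Pop on an empty stack is treated as a free no-op.

```python
def stack_amortized_costs(ops):
    """Return c_i + Phi(D_i) - Phi(D_{i-1}) for each operation, with Phi = stack size.

    ops is a list of ("push",), ("pop",) or ("multipop", k) tuples.
    """
    size = 0  # Phi(D_0) = 0 for the initially empty stack
    result = []
    for op in ops:
        before = size
        if op[0] == "push":
            cost = 1
            size += 1
        elif op[0] == "pop":
            cost = min(1, size)       # assumption: popping an empty stack costs 0
            size -= cost
        else:
            cost = min(op[1], size)   # k' = min(k, s)
            size -= cost
        result.append(cost + size - before)
    return result
```

For example, three pushes, a pop, a Multipop with k = 5 and a final push give amortized costs 2, 2, 2, 0, 0, 2.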
- Problem: how large should a static hash table be?
- Problem: what if we don't know the proper size in advance?
- Goal: make the table as small as possible, but large enough so that it won't overflow (or otherwise become inefficient).
- Idea: whenever the table overflows, grow it by allocating a new, larger table. Move all items from the old table into the new one, and free the storage for the old table.
- Solution: dynamic tables.
- Worst-case analysis
- Consider a sequence of n insertions. The worst-case time to execute one insertion is $\Theta(n)$. Therefore, the worst-case time for n insertions is $n \cdot \Theta(n) = \Theta(n^2)$.
- WRONG! In fact, the worst-case cost for n insertions is only $\Theta(n) \ll \Theta(n^2)$.
- Let's see why.
- Example: dynamic tables
- Assume that T is an object representing the table.
- The field table[T] contains a pointer to the block of storage representing the table.
- The field num[T] contains the number of items in the table.
- The field size[T] is the total number of slots in the table.
- Initially, the table is empty: num[T] = size[T] = 0.
- Dynamic tables
- Table insertion: if only insertions are performed, the load factor of a table is always at least ½, thus the amount of wasted space never exceeds half the total space in the table.

  Table-Insert(T, x)
      if size[T] = 0
          then allocate table[T] with 1 slot
               size[T] ← 1
      if num[T] = size[T]
          then allocate new-table with 2 · size[T] slots
               insert all items in table[T] into new-table
               free table[T]
               table[T] ← new-table
               size[T] ← 2 · size[T]
      insert x into table[T]
      num[T] ← num[T] + 1
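A runnable version of Table-Insert makes the cost of a whole sequence easy to measure. The sketch below is one possible rendering in Python; the class name `DynamicTable` and the `total_cost` counter (one unit per elementary insertion, including each copy during an expansion) are assumptions added for illustration.

```python
class DynamicTable:
    """Insert-only dynamic table following Table-Insert, counting elementary insertions."""

    def __init__(self):
        self.table = []      # plays the role of table[T]
        self.num = 0         # num[T]
        self.size = 0        # size[T]
        self.total_cost = 0  # actual cost so far

    def insert(self, x):
        if self.size == 0:
            self.table = [None]
            self.size = 1
        if self.num == self.size:
            new_table = [None] * (2 * self.size)
            for i in range(self.num):   # move every item into the new table
                new_table[i] = self.table[i]
                self.total_cost += 1
            self.table = new_table      # the old block is reclaimed by the garbage collector
            self.size *= 2
        self.table[self.num] = x
        self.num += 1
        self.total_cost += 1
```

Inserting 1000 items this way triggers expansions at sizes 1, 2, 4, ..., 512, so the copies add up to 1023 and the total cost is 2023, below 3n.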
- To use the potential function method to analyze a sequence of n Table-Insert operations, we start by defining a potential function $\Phi$ that is 0 immediately after an expansion, but builds to the table size by the time the table is full, so that the next expansion can be paid for by the potential.
- The potential function $\Phi(T) = 2 \cdot num[T] - size[T]$ is one possibility.
- Immediately after an expansion, we have num[T] = size[T]/2, and thus $\Phi(T) = 0$ (as desired).
- Immediately before an expansion, we have num[T] = size[T], and thus $\Phi(T) = num[T]$, so the potential can pay for an expansion if an item is inserted (as desired).
- The initial value of the potential is 0, and since the table is always at least half full, num[T] ≥ size[T]/2, which implies that $\Phi(T)$ is always nonnegative. Thus, the sum of the amortized costs of n Table-Insert operations is an upper bound on the sum of the actual costs (as desired).
- If the ith Table-Insert operation does not trigger an expansion, then $size_i = size_{i-1}$ and the amortized cost of the operation is
  $ac_i = c_i + \Phi_i - \Phi_{i-1}$
  $= 1 + (2\,num_i - size_i) - (2\,num_{i-1} - size_{i-1})$
  $= 1 + (2\,num_i - size_i) - (2(num_i - 1) - size_i)$
  $= 3$.
- If the ith Table-Insert operation does trigger an expansion, then $size_i / 2 = size_{i-1} = num_i - 1$ and the amortized cost of the operation is
  $ac_i = c_i + \Phi_i - \Phi_{i-1}$
  $= num_i + (2\,num_i - size_i) - (2\,num_{i-1} - size_{i-1})$
  $= num_i + (2\,num_i - (2\,num_i - 2)) - (2(num_i - 1) - (num_i - 1))$
  $= num_i + 2 - (num_i - 1)$
  $= 3$.
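Both cases can be checked numerically. A minimal sketch, assuming the actual cost of an insertion is 1 plus the number of items copied; the function name `insert_amortized_costs` is hypothetical, and only num and size are tracked rather than the table contents.

```python
def insert_amortized_costs(n):
    """Amortized cost of each of n Table-Insert calls under Phi = 2*num - size."""
    num = size = 0
    phi_prev = 0                 # potential of the empty table
    costs = []
    for _ in range(n):
        actual = 1               # inserting the new item itself
        if size == 0:
            size = 1             # first allocation, nothing to copy
        elif num == size:
            actual += num        # expansion: copy all existing items
            size *= 2
        num += 1
        phi = 2 * num - size
        costs.append(actual + phi - phi_prev)
        phi_prev = phi
    return costs
```

In this simulation the very first insertion has amortized cost 2 (the potential rises from 0 to 1) and every later insertion has amortized cost exactly 3, whether or not it expands the table.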
- Table expansion and contraction
- The natural strategy is to double the table size when an item is inserted into a full table and to halve it when a deletion would leave the table less than ½ full. With this strategy an expansion can be followed immediately by a contraction, and so on; a sequence of n such operations costs $\Theta(n^2)$, so the amortized cost of an operation is $\Theta(n)$.
- The improvement on this strategy is to allow the load factor of the table to drop below ½.
- The load factor, denoted $\alpha(T)$, is the number of items stored in the table divided by the size (number of slots) of the table, that is,
  $\alpha(T) = num[T] / size[T]$.
- Specifically, we continue to double the table size when an item is inserted into a full table, but halve the table size when a deletion causes the table to become less than ¼ full rather than ½ full as before.
- We can now use the potential method to analyze the cost of a sequence of n Table-Insert and Table-Delete operations.
- We start by defining a potential function $\Phi$ that is 0 immediately after an expansion or contraction and builds as the load factor increases to 1 or decreases to ¼.
- We use the potential function
  $\Phi(T) = 2 \cdot num[T] - size[T]$ if $\alpha(T) \ge 1/2$,
  $\Phi(T) = size[T]/2 - num[T]$ if $\alpha(T) < 1/2$.
- Recall the potential function:
  $\Phi(T) = 2 \cdot num[T] - size[T]$ if $\alpha(T) \ge 1/2$,
  $\Phi(T) = size[T]/2 - num[T]$ if $\alpha(T) < 1/2$.
- Observe that when the load factor is ½, the potential is 0, since num[T] = size[T]/2 (as desired).
- When $\alpha(T)$ is 1, we have num[T] = size[T], which implies $\Phi(T) = num[T]$, thus the potential can pay for an expansion if an item is inserted (as desired).
- When the load factor is ¼, we have size[T] = 4 · num[T], which implies $\Phi(T) = num[T]$, thus the potential can pay for a contraction if an item is deleted (as desired).
- Observe that the potential of an empty table is 0 and the potential is never negative. Thus, the total amortized cost of a sequence of operations w.r.t. $\Phi$ is an upper bound on their actual cost (as desired).
[Figure: how the potential behaves for a sequence of operations.]
- Initially, $num_0 = 0$, $size_0 = 0$, $\alpha_0 = 1$, and $\Phi_0 = 0$.
- We start with the case in which the ith operation is Table-Insert.
- If $\alpha_{i-1} \ge 1/2$, the analysis is identical to that for table expansion before: whether the table expands or not, the amortized cost $ac_i$ of the Table-Insert operation is at most 3.
- If $\alpha_{i-1} < 1/2$, the table cannot expand as a result of the operation, since expansion occurs only when $\alpha_{i-1} = 1$. If $\alpha_i < 1/2$ as well, then the amortized cost of the ith operation is
  $ac_i = c_i + \Phi_i - \Phi_{i-1}$
  $= 1 + (size_i/2 - num_i) - (size_{i-1}/2 - num_{i-1})$
  $= 1 + (size_i/2 - num_i) - (size_i/2 - (num_i - 1))$
  $= 0$,
  since $size_i = size_{i-1}$ and $num_{i-1} = num_i - 1$.
- If $\alpha_{i-1} < 1/2$ but $\alpha_i \ge 1/2$, then
  $ac_i = c_i + \Phi_i - \Phi_{i-1}$
  $= 1 + (2\,num_i - size_i) - (size_{i-1}/2 - num_{i-1})$
  $= 1 + (2(num_{i-1} + 1) - size_{i-1}) - (size_{i-1}/2 - num_{i-1})$
  $= 3\,num_{i-1} - \tfrac{3}{2}\,size_{i-1} + 3$
  $= 3\,\alpha_{i-1}\,size_{i-1} - \tfrac{3}{2}\,size_{i-1} + 3$
  $< \tfrac{3}{2}\,size_{i-1} - \tfrac{3}{2}\,size_{i-1} + 3 = 3$,
  since $size_i = size_{i-1}$, $num_{i-1} + 1 = num_i$, and $\alpha_{i-1} = num_{i-1} / size_{i-1}$.
- Thus, the amortized cost of a Table-Insert operation is at most 3.
- We now turn to the case in which the ith operation is Table-Delete.
- In this case, $num_i = num_{i-1} - 1$. If $\alpha_{i-1} < 1/2$, then we must consider whether the Table-Delete operation causes a contraction.
- If it does not, then $size_i = size_{i-1}$ and the amortized cost of the operation is
  $ac_i = c_i + \Phi_i - \Phi_{i-1}$
  $= 1 + (size_i/2 - num_i) - (size_{i-1}/2 - num_{i-1})$
  $= 1 + (size_i/2 - num_i) - (size_i/2 - (num_i + 1))$
  $= 2$.
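- If $\alpha_{i-1} < 1/2$ and the Table-Delete operation does trigger a contraction, then the actual cost is $c_i = num_{i-1} = num_i + 1$ (one deletion plus moving $num_i$ items), and $size_i / 2 = size_{i-1} / 4 = num_{i-1} = num_i + 1$, since $size_{i-1} = 2\,size_i$ and $num_i = num_{i-1} - 1$. The amortized cost is
  $ac_i = c_i + \Phi_i - \Phi_{i-1}$
  $= (num_i + 1) + (size_i/2 - num_i) - (size_{i-1}/2 - num_{i-1})$
  $= (num_i + 1) + ((num_i + 1) - num_i) - ((2\,num_i + 2) - (num_i + 1))$
  $= 1$.
- When the ith operation is a Table-Delete and $\alpha_{i-1} \ge 1/2$, the amortized cost is also bounded above by a constant. No contraction occurs, so $size_i = size_{i-1}$. If $\alpha_i \ge 1/2$ as well, then $ac_i = 1 + (2\,num_i - size_i) - (2\,num_{i-1} - size_{i-1}) = 1 - 2 = -1$. If $\alpha_i < 1/2$, then $num_{i-1} = size_{i-1}/2$ and
  $ac_i = 1 + (size_i/2 - num_i) - (2\,num_{i-1} - size_{i-1}) = 1 + 1 - 0 = 2$.
- Thus, the amortized cost of a Table-Delete operation is at most 2.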
- In summary, since the amortized cost of each operation, delete or insert, is bounded above by a constant, the actual time for any sequence of n operations on a dynamic table is O(n).
- In the classic worst-case analysis, the bound obtained for a sequence of n operations on a dynamic table was $O(n^2) \gg O(n)$.
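The constant bounds can be checked by simulating num and size under the piecewise potential. This is a hedged sketch rather than the slides' own procedure: the names `potential` and `run_table_ops` are hypothetical, the table is assumed to be freed (size 0) when it becomes empty, a contraction is assumed to cost one unit per remaining item moved, and deletions are assumed never to be issued on an empty table.

```python
def potential(num, size):
    """Phi = 2*num - size if the load factor is >= 1/2, else size/2 - num."""
    if size == 0:
        return 0
    if 2 * num >= size:
        return 2 * num - size
    return size // 2 - num       # size is even here (a power of two >= 2)


def run_table_ops(ops):
    """ops is a string of 'i' (insert) and 'd' (delete).

    Returns a list of (operation, actual cost, amortized cost) triples.
    """
    num = size = 0
    phi_prev = 0
    out = []
    for op in ops:
        actual = 1
        if op == "i":
            if size == 0:
                size = 1
            elif num == size:
                actual += num            # expansion: copy all items
                size *= 2
            num += 1
        else:
            num -= 1
            if num == 0:
                size = 0                 # assumption: an empty table is freed
            elif 4 * num < size:
                actual += num            # contraction: move the remaining items
                size //= 2
        phi = potential(num, size)
        out.append((op, actual, actual + phi - phi_prev))
        phi_prev = phi
    return out
```

On a sequence mixing long runs of insertions and deletions with rapid alternation, every insertion has amortized cost at most 3 and every deletion at most 2, and the amortized total bounds the actual total from above.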
Amortized Analysis of KMP
- Example: the Knuth-Morris-Pratt (KMP) string matching algorithm, analyzed using amortized analysis.
- Potential function: $\Phi$ is the number of characters in the pattern P that are currently matched to text characters.
- $\Phi(D_0) = 0$: the initial potential is 0, since there are no matched characters.
- $\Phi(D_i) \ge 0$: we can never have a negative number of pattern characters matched.
- $ac_i = c_i + \Phi_i - \Phi_{i-1}$, that is, amortized cost = true cost + change in potential.
- Case 1: P[q+1] = T[i].
  - $c_i = 1$: we do one comparison.
  - $\Phi(D_i) - \Phi(D_{i-1}) = (q + 1) - q = 1$: the number of matched characters is increased by 1.
  - $ac_i = c_i + \Phi(D_i) - \Phi(D_{i-1}) = 1 + (q + 1) - q = 2$.
- Case 2: P[q+1] ≠ T[i], and after shifting P to the right r times there is a match.
  - $c_i = r + 1$: one comparison per shift, plus the final successful comparison.
  - $\Phi(D_i) - \Phi(D_{i-1}) = (q + 1 - x) - q$: we had q characters matched; now, at i, we have q + 1 − x characters matched. Here x is the number of previously matched characters given up by the shifts; the q − x characters that stay matched are not compared again (check the KMP algorithm).
  - $ac_i = c_i + \Phi(D_i) - \Phi(D_{i-1}) = (r + 1) + (q + 1 - x) - q = r - x + 2$.
  - Since each shift gives up at least one matched character, $r \le x$, and therefore $ac_i \le 2$.
- Case 3: P[q+1] ≠ T[i], and we shift P to the right r times without ever finding a match.
  - $c_i = r$: one comparison per shift.
  - $\Phi(D_i) - \Phi(D_{i-1}) = 0 - q$: previously we had q characters matched; now, at i, we have no character matched.
  - $ac_i = c_i + \Phi(D_i) - \Phi(D_{i-1}) = r + 0 - q \le 0$, since $r \le q$. (Counting one more comparison for the final failed test at the start of the pattern adds at most 1 and leaves $ac_i \le 2$.)
- In all three cases $ac_i \le 2$, so the amortized cost of the ith operation is O(1).
- For a text string n characters long, the total cost is O(n) (in fact $\Theta(n)$, since every text character is compared at least once).
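The bound of two comparisons per text character can be observed by instrumenting a standard KMP matcher. The following sketch is an illustration under assumptions: the function names, the 0-based indexing, and the decision to count only text-versus-pattern comparisons in the matching phase (not the preprocessing of the pattern) are choices made here, and the pattern is assumed to be non-empty.

```python
def failure_function(pattern):
    """pi[q] = length of the longest proper prefix of pattern[:q] that is also a suffix of it."""
    pi = [0] * (len(pattern) + 1)
    k = 0
    for q in range(2, len(pattern) + 1):
        while k > 0 and pattern[k] != pattern[q - 1]:
            k = pi[k]
        if pattern[k] == pattern[q - 1]:
            k += 1
        pi[q] = k
    return pi


def kmp_search(text, pattern):
    """Return (start positions of matches, number of character comparisons made while scanning)."""
    pi = failure_function(pattern)
    m = len(pattern)
    matches, comparisons = [], 0
    q = 0                          # the potential: characters currently matched
    for i, ch in enumerate(text):
        while True:
            comparisons += 1
            if pattern[q] == ch:   # match: potential grows by 1
                q += 1
                break
            if q == 0:             # mismatch with nothing matched: move on
                break
            q = pi[q]              # shift the pattern: potential drops
        if q == m:
            matches.append(i - m + 1)
            q = pi[q]
    return matches, comparisons
```

For instance, searching for "abab" in "abababca" finds matches at positions 0 and 2 using 9 comparisons, within the bound of 2n = 16.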
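Amortized Analysis of Binary Heap
- Potential function: $\Phi$ is based on the number of comparisons, which is bounded by the height of the binary heap. For a heap $D_i$ holding n elements, the sums below use $\Phi(D_i) = \sum_{x=1}^{n} \lg x$.
- $\Phi(D_0) = 0 \le \Phi(D_i)$: $\Phi$ is never negative. The total amortized cost of n operations w.r.t. $\Phi$ therefore represents an upper bound on the actual cost.
- INSERT (the heap grows from n − 1 to n elements):
  $ac_i = c_i + \Phi(D_i) - \Phi(D_{i-1}) = \lg(n-1) + \sum_{x=1}^{n} \lg x - \sum_{x=1}^{n-1} \lg x = \lg(n-1) + \lg n = \Theta(\lg n)$.
- EXTRACT-MIN (the heap shrinks from n to n − 1 elements):
  $ac_i = c_i + \Phi(D_i) - \Phi(D_{i-1}) = \lg n + \sum_{x=1}^{n-1} \lg x - \sum_{x=1}^{n} \lg x = \lg n + (-\lg n) = 0$, which is O(1).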
Conclusions
Amortized costs can provide a clean abstraction
of data-structure performance. Any of the
analysis methods can be used when an amortized
analysis is called for, but each method has some
situations where it is arguably the simplest.
Different schemes may work for assigning
amortized costs in the accounting method, or
potentials in the potential method, sometimes
yielding radically different bounds.