Title: Priority Queues and Heapsort (9.1-9.4)
1. Priority Queues and Heapsort (9.1-9.4)
- Priority queues are used for many purposes, such as job scheduling, shortest path, file compression
- Recall the definition of a Priority Queue
- operations: insert(), delete_max()
- also: max(), change_priority(), join()
- How would I sort a list, using a priority queue?
- for (i = 0; i < n; i++) insert(A[i]);
- for (i = 0; i < n; i++) cout << delete_max();
- How would I implement a priority queue?
- How fast a sorting algorithm would your implementation yield?
- Can we do better?
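The two loops above can be tried out with the standard library. This is only a sketch: std::priority_queue stands in for the slide's insert() and delete_max(), and the function name pqSortDescending is made up for illustration.

```cpp
#include <queue>
#include <vector>

// Sort by inserting everything into a max-priority queue and then
// repeatedly removing the maximum, as in the two loops on the slide.
// std::priority_queue is an assumption standing in for insert()/delete_max().
std::vector<int> pqSortDescending(const std::vector<int>& input) {
    std::priority_queue<int> pq;          // max-heap by default
    for (int x : input) pq.push(x);       // n inserts
    std::vector<int> out;
    out.reserve(input.size());
    while (!pq.empty()) {                 // n delete_max operations
        out.push_back(pq.top());
        pq.pop();
    }
    return out;                           // largest element first
}
```

With a heap underneath, each of the 2n operations costs O(log n), so the sort runs in O(n log n). With an unsorted array underneath, the same loops would amount to selection sort.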
2. Priority Queue Implementations
                    insert   max    delete   change priority   join
- sorted array      n        1      n        n                 n
- unsorted array    1        n      1        1                 n
- heap              lg n     lg n   lg n     lg n              n
- binomial queue    lg n     lg n   lg n     lg n              lg n
- (best)            1        lg n   lg n     1                 1
3. Heaps
- How can we build a data structure to do this?
- Hints
- we want to find the largest element quickly
- we want to be able to remove an element quickly
- Tree of some sort?
- Heap
- a complete binary tree (every level full except possibly the last, which is filled from the left)
- each element is at least as large as its children
- (note: this is not a BST!)
- How to delete the maximum?
- How to add a number to a heap?
- How to build a heap out of a list of numbers?
4. Insert
- Implicit representation: XTOGSMNAERAI
- (children of i at 2i and 2i+1)
- How would I do an insert()?
- add to the end of the array
- repeat: if larger than parent, swap
XTOGSMNAERIP -> XTOGSPNAERIM -> XTPGSONAERIM
template <class Item>
void insert(Item a[], Item newItem, int items)
{
  // a[1..items] is a heap; the new item goes in slot items+1
  int n = items + 1;
  a[n] = newItem;
  while (n > 1 && a[n/2] < a[n])
  {
    exch(a[n], a[n/2]);
    n /= 2;
  }
}
Runtime?
Θ(log n)
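To replay the insertion trace from this slide, here is a minimal sketch that stores the heap in a std::string. The storage choice and the name heapInsert are assumptions for illustration. Index 0 is an unused placeholder so that the children of i sit at 2i and 2i+1.

```cpp
#include <string>
#include <utility>

// Heap stored 1-based in a std::string: heap[0] is an unused placeholder,
// so the parent of index n is n / 2. (Illustrative helper, not the slide's code.)
void heapInsert(std::string& heap, char item) {
    heap.push_back(item);                       // add to the end of the array
    std::size_t n = heap.size() - 1;            // index of the new item
    while (n > 1 && heap[n / 2] < heap[n]) {    // larger than its parent?
        std::swap(heap[n], heap[n / 2]);        // swap upward
        n /= 2;
    }
}
```

Starting from " XTOGSMNAERI" (note the leading placeholder) and inserting 'P' performs the two swaps shown in the trace and leaves " XTPGSONAERIM".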
5. DeleteMax
- How would I delete X?
- Move last element to root
- If smaller than either child,
- swap with larger child
XTOGSMNAERAI -> ITOGSMNAERAX -> TIOGSMNAERAX -> TSOGIMNAERAX -> TSOGRMNAEIAX
template <class Item>
void reHeapify(Item a[], int items)
{
  int n = 1;
  while (2*n <= items)
  {
    int j = 2*n;
    if (j < items && a[j] < a[j+1]) j++;   // j is the larger child
    if (a[n] >= a[j]) break;
    exch(a[n], a[j]);
    n = j;
  }
}
template <class Item>
Item DeleteMax(Item a[], int items)
{
  exch(a[1], a[items--]);   // move the maximum to the end, shrink the heap
  reHeapify(a, items);
  return a[items+1];
}
Runtime?
Θ(log n)
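The deletion trace above can be checked with a small sketch. It assumes the same std::string storage with an unused slot at index 0. The names siftDown and deleteMax are illustrative, and siftDown takes a start index so it can be reused from any node.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Restore heap order below index n within heap[1..items] (heap[0] is unused).
void siftDown(std::string& heap, std::size_t n, std::size_t items) {
    while (2 * n <= items) {
        std::size_t j = 2 * n;                         // left child
        if (j < items && heap[j] < heap[j + 1]) ++j;   // pick the larger child
        if (heap[n] >= heap[j]) break;                 // heap order already holds
        std::swap(heap[n], heap[j]);
        n = j;
    }
}

// Remove and return the maximum; the stored heap shrinks by one element.
char deleteMax(std::string& heap) {
    std::size_t items = heap.size() - 1;
    assert(items >= 1);                    // the heap must not be empty
    std::swap(heap[1], heap[items]);       // move last element to the root
    char max = heap.back();
    heap.pop_back();
    siftDown(heap, 1, items - 1);
    return max;
}
```

Applied to " XTOGSMNAERAI", this returns 'X' and leaves " TSOGRMNAEIA", which is the final state of the trace without the removed X.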
6. BuildHeap (top down)
- Given an array, e.g. ASORTINGEXAMPLE, how do I make it a heap?
- Top-down
- for (i = 2; i <= items; i++)
- insert(a, a[i], i-1);
- Runtime
- Θ(n log n)
- Can we do better?
7. BuildHeap (bottom up)
- Suppose we use the reHeapify() function instead and work bottom-up.
- for (i = items/2; i >= 1; i--)
- reHeapify(a, i, items);   (reHeapify started at node i instead of the root)
ASORTINGEX -> ASORXINGET -> AXORSINGET -> AXORTINGES -> XAORTINGES -> XTORAINGES -> XTORSINGEA
Runtime? 1+1+1+1+2+2+...+4+... = n/4 + 2(n/8) + 3(n/16) + 4(n/32) + ...
= n(1/4 + 2/8 + 3/16 + 4/32 + ...) = n * 1
Θ(n)!
Top-down was Θ(n log n); bottom-up is Θ(n)!
cool!
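A sketch of the bottom-up construction follows, again on a std::string with an unused slot at index 0. The names siftDown and buildHeapBottomUp are illustrative; siftDown is the re-heapify step generalized to start at an arbitrary node.

```cpp
#include <string>
#include <utility>

// Restore heap order below index n within heap[1..items] (heap[0] is unused).
void siftDown(std::string& heap, std::size_t n, std::size_t items) {
    while (2 * n <= items) {
        std::size_t j = 2 * n;
        if (j < items && heap[j] < heap[j + 1]) ++j;   // larger child
        if (heap[n] >= heap[j]) break;
        std::swap(heap[n], heap[j]);
        n = j;
    }
}

// Bottom-up build: sift down every internal node, last one first.
void buildHeapBottomUp(std::string& heap) {
    if (heap.size() < 2) return;                 // nothing stored after the placeholder
    std::size_t items = heap.size() - 1;
    for (std::size_t i = items / 2; i >= 1; --i)
        siftDown(heap, i, items);
}
```

On " ASORTINGEX" this ends in " XTORSINGEA", matching the last state of the trace. The nodes near the bottom are numerous but sift down only a level or two, which is where the linear bound comes from.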
8. Heapsort
- BuildHeap()
- for (i = 1; i <= n; i++) DeleteMax();
- Runtime?
- Θ(n log n)
- Almost competitive with quicksort
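The same two phases can be written with the standard library's heap algorithms. In this sketch std::make_heap plays the role of BuildHeap() and each std::pop_heap call plays the role of one DeleteMax(). The function name heapsort is chosen here for illustration.

```cpp
#include <algorithm>
#include <vector>

// In-place heapsort on top of the standard heap algorithms.
void heapsort(std::vector<int>& v) {
    std::make_heap(v.begin(), v.end());          // build a max-heap in linear time
    for (auto end = v.end(); end != v.begin(); --end)
        std::pop_heap(v.begin(), end);           // moves the current max to end - 1
}
```

Each pop_heap parks the current maximum just past the shrinking heap, so the vector ends up in ascending order with no extra storage.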
9. Priority Queue
- Operations: insert(), max(), deleteMax()
- Could implement with heap
- Runtime for each operation?
- insert(), deleteMax(): O(log n)
- max(): O(1)
10. Example Application
- Suppose you have a text, abracadabra. Want to compress it.
- How many bits required?
- at 3 bits per letter, 33 bits.
- Can we do better?
- How about variable length codes?
- In order to be able to decode the file again, we would need a prefix code: no code is the prefix of another.
- How do we make a prefix code that compresses the text?
11. Huffman Coding
- Note: Put the letters at the leaves of a binary tree. Left=0, Right=1. Voila! A prefix code.
- Huffman coding: an optimal prefix code
- Algorithm: use a priority queue.
- insert all letters according to frequency
- if there is only one tree left, done.
- else, a=deleteMin(); b=deleteMin();
- make tree t out of a and b with weight a.weight() + b.weight();
- insert(t)
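The algorithm above could be sketched as follows. The node layout (an index-based tree in a vector), the use of std::priority_queue with std::greater as the min-queue, and the names HuffNode, huffmanCodeLengths and encodedBits are all assumptions made for this example.

```cpp
#include <functional>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

struct HuffNode {
    long weight;
    int left;      // index of the left child, or -1 for a leaf
    int right;     // index of the right child, or -1 for a leaf
    char symbol;   // meaningful only for leaves
};

// Builds the Huffman tree and returns the code length (leaf depth) per letter.
// A text with a single distinct letter gets length 0 in this sketch.
std::map<char, int> huffmanCodeLengths(const std::string& text) {
    std::map<char, long> freq;
    for (char c : text) ++freq[c];

    std::vector<HuffNode> nodes;
    typedef std::pair<long, int> Entry;          // (weight, node index)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry> > pq;
    for (const auto& kv : freq) {                // insert all letters by frequency
        nodes.push_back({kv.second, -1, -1, kv.first});
        pq.push(Entry(kv.second, static_cast<int>(nodes.size()) - 1));
    }
    while (pq.size() > 1) {                      // more than one tree left
        Entry a = pq.top(); pq.pop();            // a = deleteMin()
        Entry b = pq.top(); pq.pop();            // b = deleteMin()
        nodes.push_back({a.first + b.first, a.second, b.second, '\0'});
        pq.push(Entry(a.first + b.first, static_cast<int>(nodes.size()) - 1));
    }

    std::map<char, int> lengths;
    if (pq.empty()) return lengths;
    std::vector<std::pair<int, int> > stack;     // (node index, depth)
    stack.push_back(std::make_pair(pq.top().second, 0));
    while (!stack.empty()) {
        std::pair<int, int> cur = stack.back();
        stack.pop_back();
        const HuffNode& n = nodes[cur.first];
        if (n.left < 0) { lengths[n.symbol] = cur.second; continue; }
        stack.push_back(std::make_pair(n.left, cur.second + 1));
        stack.push_back(std::make_pair(n.right, cur.second + 1));
    }
    return lengths;
}

// Total number of bits needed to encode the text with its own Huffman code.
long encodedBits(const std::string& text) {
    std::map<char, int> lengths = huffmanCodeLengths(text);
    long bits = 0;
    for (char c : text) bits += lengths[c];
    return bits;
}
```

For abracadabra this gives 23 bits in total. Ties between equal weights may be broken differently than in a hand-built tree, so individual code lengths can differ while the total stays the same.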
12. Huffman coding example
- abracadabra frequencies
- a: 5, b: 2, c: 1, d: 1, r: 2
- Huffman code
- a: 0, b: 100, c: 1010, d: 1011, r: 11
- bits: 5*1 + 2*3 + 1*4 + 1*4 + 2*2 = 23
- Finite automaton to decode: Θ(n)
- Time to encode?
- Compute frequencies: O(n)
- Build heap: O(1) assuming alphabet has constant size
- Encode: O(n)
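The encoding step is a table lookup per letter. This sketch hard-codes the code table from this slide; the function name encodeAbracadabra is made up for illustration.

```cpp
#include <map>
#include <string>

// Encode a text over the letters a, b, c, d, r with the slide's code table.
std::string encodeAbracadabra(const std::string& text) {
    static const std::map<char, std::string> code = {
        {'a', "0"}, {'b', "100"}, {'c', "1010"}, {'d', "1011"}, {'r', "11"}};
    std::string bits;
    for (char c : text)
        bits += code.at(c);   // at() throws std::out_of_range for any other letter
    return bits;
}
```

Encoding "abracadabra" yields the 23-character bit string 01001101010010110100110, in agreement with the bit count above.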
13. Huffman coding summary
- Huffman coding is very frequently used
- (You use it every time you watch HDTV or listen to mp3, for example)
- Text files often compress to 60% of original size
- In real life, Huffman coding is usually used in conjunction with a modeling algorithm
- E.g. jpeg compression: DCT, quantization, and Huffman coding
- Text compression: dictionary + Huffman coding
14. Finite Automata and Regular Expressions
- How can I decode some Huffman-encoded text efficiently?
- (hand-design a dfa to recognize)
- Or how can I find all instances of aardvark, aaardvark, aaaardvark, etc. or zyzzyva, zyzzzyva, zyzzzzyva, etc. in Microsoft Word? Unix? (grep)
- All words with 2 or more As or Zs?
- Important topic: regular expressions and finite automata.
- theoretician: regular expressions are grammars that define regular languages
- programmer: compact patterns for matching and replacing
15. DFA for abracadabra
- Huffman code: A=0, B=100, C=1010, D=1011, R=11
- DFA
- state  read  out  new state
- 0      0     A    0
- 0      1          1
- 1      0          2
- 1      1     R    0
- 2      0     B    0
- 2      1          3
- 3      0     C    0
- 3      1     D    0
- (Actually, this looks just like the original tree, doesn't it?)
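The transition table above translates directly into a table-driven decoder. In this sketch the struct Transition and the function decodeAbracadabra are illustrative names, and the input is assumed to contain only the characters '0' and '1'.

```cpp
#include <string>

struct Transition {
    int next;    // new state
    char out;    // letter written on this transition, or '\0' for none
};

// Decode a bit string with the 4-state DFA from the table.
std::string decodeAbracadabra(const std::string& bits) {
    static const Transition table[4][2] = {
        {{0, 'A'}, {1, '\0'}},    // state 0: read 0 -> A, read 1 -> state 1
        {{2, '\0'}, {0, 'R'}},    // state 1: read 0 -> state 2, read 1 -> R
        {{0, 'B'}, {3, '\0'}},    // state 2: read 0 -> B, read 1 -> state 3
        {{0, 'C'}, {0, 'D'}},     // state 3: read 0 -> C, read 1 -> D
    };
    std::string out;
    int state = 0;
    for (char bit : bits) {
        const Transition& t = table[state][bit - '0'];   // assumes '0' or '1'
        if (t.out != '\0') out.push_back(t.out);
        state = t.next;
    }
    return out;
}
```

Feeding it 01001101010010110100110 gives back ABRACADABRA, using one table lookup per input bit.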
16. Regular Expressions
- Regular expressions are one of
- a literal character
- a (regular expression) in parentheses
- a concatenation of two R.E.s
- the alternation (or) of two R.E.s, denoted |
- the closure of an R.E., denoted * (i.e. 0 or more occurrences)
- Regular expressions define regular languages
- Examples
- abracadabra
- abra(cadabra)* = {abra, abracadabra, abracadabracadabra, ...}
- (a*b | ac)d
- (a(ab)b)
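These examples can be tried with std::regex. The operators appear to have been lost from the slide text, so the patterns used in the usage note below, abra(cadabra)* and (a*b|ac)d, are assumed readings of the examples. The helper name matchesWhole is made up.

```cpp
#include <regex>
#include <string>

// True if the whole text belongs to the language of the pattern.
// Uses std::regex's default ECMAScript syntax.
bool matchesWhole(const std::string& pattern, const std::string& text) {
    return std::regex_match(text, std::regex(pattern));
}
```

Under these readings, matchesWhole("abra(cadabra)*", "abracadabracadabra") is true and "abracad" is rejected. For (a*b|ac)d, the strings "aabd" and "acd" match while "abcd" does not.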
17. RE Variants
- Different programming languages, text editors, etc. use different syntaxes.
- Perl regexps
- . any character
- [1-4a-c] any of 1, 2, 3, 4, a, b, c
- [^abc] any character other than a, b, or c
- | alternation (or)
- * 0 or more occurrences (maximal)
- + 1 or more occurrences
- *?, +? same, but use minimal matching
- ? 0 or 1 occurrences
- \1, \2 back references
- \s any white-space character
- \w any word character ([a-zA-Z0-9_])
- \d any digit ([0-9])
18. RE Examples (perl syntax)
- s/s/th/g
- s/\s+/ /g
- s/aa/aa/
- s/(\w+)\s+(\w+)/$2 $1/
- s/(\w+?)\s+(\w+?)/$2 $1/
- m/<p([^>]*)class\s*=\s*(["'])(.*?)\2([^>]*)>/
- `date` =~ m/(\d+):(\d+):(\d+)/; $hours = $1; $minutes = $2; $seconds = $3;
- if (`ls` =~ m/index.htm/)
- if (`cat myname.txt` =~ m/Joe Smith/)
- stat program
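For comparison, here is a rough C++ counterpart of the word-swapping substitution, assuming the Perl pattern is read as (\w+)\s+(\w+) with replacement $2 $1. The function name swapFirstTwoWords is illustrative. std::regex's default ECMAScript syntax also writes group references as $1 and $2 in the replacement string.

```cpp
#include <regex>
#include <string>

// Swap the first two whitespace-separated words. Like a Perl s///
// without the g flag, only the first match is replaced.
std::string swapFirstTwoWords(const std::string& text) {
    static const std::regex pattern("(\\w+)\\s+(\\w+)");
    return std::regex_replace(text, pattern, "$2 $1",
                              std::regex_constants::format_first_only);
}
```

For example, "hello world again" becomes "world hello again", and the text after the first match is copied through unchanged.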
19. Finite Automata
- Finite automata: machines that recognize regular languages.
- Deterministic Finite Automaton (DFA)
- a set of states including a start state and one or more accepting states
- a transition function: given current state and input letter, what's the new state?
- Non-deterministic Finite Automaton (NDFA)
- like a DFA, but there may be
- more than one transition out of a state on the same letter (Pick the right one non-deterministically, i.e. via lucky guess!)
- epsilon-transitions, i.e. optional transitions on no input letter
20. RE -> NDFA
- Given a Regular Expression, how can I build a DFA?
- Work bottom up.
- Letter
- Concatenation
- Or
- Closure
21. RE -> NDFA Ex
- Construct an NDFA for the RE (A*B | AC)D
- A
- A*
- A*B
- A*B | AC
- (A*B | AC)D
22. NDFA -> DFA
- Keep track of the set of states you are in.
- On each new input letter, compute the new set of states you could be in.
- The set of states for the DFA is the power set of the NDFA states.
- I.e. up to 2^n states, where there were n in the NDFA.
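Tracking the set of current states can be sketched with a bitmask, one bit per NDFA state. The five-state machine below is a hand-built, hypothetical NDFA without epsilon-transitions for (A*B|AC)D, which is itself an assumed reading of the example expression. The names step and accepts are illustrative.

```cpp
#include <cstdint>
#include <string>

// Hypothetical NDFA for (A*B|AC)D:
//   0 --A--> 1 and 2 (the nondeterministic choice), 0 --B--> 3
//   1 --A--> 1, 1 --B--> 3      (still inside A*B)
//   2 --C--> 3                  (inside AC)
//   3 --D--> 4                  (4 is the accepting state)
// A set of states is a bitmask: bit s is set if the NDFA could be in state s.
std::uint32_t step(std::uint32_t states, char c) {
    std::uint32_t next = 0;
    for (int s = 0; s < 5; ++s) {
        if (!(states & (1u << s))) continue;
        switch (s) {
            case 0:
                if (c == 'A') next |= (1u << 1) | (1u << 2);
                if (c == 'B') next |= 1u << 3;
                break;
            case 1:
                if (c == 'A') next |= 1u << 1;
                if (c == 'B') next |= 1u << 3;
                break;
            case 2:
                if (c == 'C') next |= 1u << 3;
                break;
            case 3:
                if (c == 'D') next |= 1u << 4;
                break;
            default:
                break;            // state 4 has no outgoing transitions
        }
    }
    return next;
}

bool accepts(const std::string& input) {
    std::uint32_t states = 1u << 0;                  // start in the set {0}
    for (char c : input) states = step(states, c);   // one set update per letter
    return (states & (1u << 4)) != 0;                // could we be in state 4?
}
```

Each distinct bitmask that shows up is one state of the equivalent DFA. With 5 NDFA states there are at most 2^5 of them, though only a handful are reachable here.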
23. Recognizing Regular Languages
- Suppose your language is given by a DFA. How to recognize?
- Build a table. One row for every (state, input letter) pair. Give resulting state.
- For each letter of input string, compute new state
- When done, check whether the last state is an accepting state.
- Runtime?
- O(n), where n is the number of input letters
- Another approach: use a C program to simulate NDFA with backtracking. Less space, more time. (perl, egrep vs. fgrep)
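The table-driven approach can be sketched as follows. The six-state table was worked out by hand for the expression (A*B|AC)D, an assumed reading of the earlier example, over the alphabet A to D. State 5 is a dead state, and the name dfaAccepts is made up for this example.

```cpp
#include <string>

// Table-driven DFA recognizer for (A*B|AC)D over the letters A..D.
bool dfaAccepts(const std::string& input) {
    // Rows are states 0..5, columns are the letters A, B, C, D.
    static const int next[6][4] = {
        {1, 3, 5, 5},   // 0: start
        {2, 3, 3, 5},   // 1: seen one A (could be A*B or AC)
        {2, 3, 5, 5},   // 2: seen two or more As
        {5, 5, 5, 4},   // 3: seen A*B or AC, waiting for D
        {5, 5, 5, 5},   // 4: accepting, no further input allowed
        {5, 5, 5, 5},   // 5: dead state
    };
    int state = 0;
    for (char c : input) {
        if (c < 'A' || c > 'D') return false;   // letter outside the alphabet
        state = next[state][c - 'A'];           // one table lookup per letter
    }
    return state == 4;                          // is the last state accepting?
}
```

The loop does constant work per input letter, which gives the O(n) bound. The price is the table, whose size grows with the number of DFA states.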
24. Examples
- Unix grep
- Perl
- $input =~ s/two?o/2/
- $input =~ s|<link[^>]*>\s*||gs
- $input =~ s|\s*\@font-face\s*{.*?}||gs
- $input =~ s|\s*mso-[^>"]*"|"|gis
- $input =~ s/([^ ]+) ([^ ]+)/$2 $1/
- $input =~ m/[0-9]+\.?[0-9]*|\.[0-9]+/
- ($word1,$word2,$rest) =
- ($foo =~ m/^ *([^ ]+) +([^ ]+) +(.*)$/)
- $input =~ s|<span[^>]*>\s*<br\s*clear="?all[^>]*>\s*</span>|<br clear="all"/>|gis