Overview and History - PowerPoint PPT Presentation

About This Presentation
Title:

Overview and History

Description:

CSC 421: Algorithm Design & Analysis, Spring 2013. Greedy algorithms; examples: optimal change, job scheduling; Prim's algorithm (minimal spanning tree)

Slides: 24
Provided by: DaveR181
Learn more at: http://www.dave-reed.com
Transcript and Presenter's Notes

Title: Overview and History


1
CSC 421: Algorithm Design & Analysis, Spring 2013
  • Greedy algorithms
  • examples: optimal change, job scheduling
  • Prim's algorithm (minimal spanning tree)
  • Dijkstra's algorithm (shortest path)
  • Huffman codes (data compression)
  • applicability

2
Greedy algorithms
  • the greedy approach to problem solving involves
    making a sequence of choices/actions, each of
    which simply looks best at the moment
  • local view: choose the locally optimal option
  • hopefully, a sequence of locally optimal
    solutions leads to a globally optimal solution
  • example: optimal change
  • given a monetary amount, make change using the
    fewest coins possible
  • amount = 16¢: which coins?
  • amount = 96¢: which coins?

3
Example: greedy change
  • while the amount remaining is not 0
  • select the largest coin that is ≤ the amount
    remaining
  • add a coin of that type to the change
  • subtract the value of that coin from the amount
    remaining
  • e.g., 96¢ = 50¢ + 25¢ + 10¢ + 10¢ + 1¢
  • will this greedy algorithm always yield the
    optimal solution?
  • for U.S. currency, the answer is YES
  • for arbitrary coin sets, the answer is NO
  • suppose the U.S. Treasury added a 12¢ coin
  • GREEDY: 16¢ = 12¢ + 1¢ + 1¢ + 1¢ + 1¢ (5 coins)
  • OPTIMAL: 16¢ = 10¢ + 5¢ + 1¢ (3 coins)
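
The loop above can be sketched in Python as follows. This is only an illustration: the function name greedy_change and the list US_COINS are made up for this example, amounts are assumed to be whole cents, and the coin set is assumed to include a 1¢ coin so that the loop always finishes.

```python
def greedy_change(amount, coins):
    """Repeatedly take the largest coin that does not exceed the amount remaining."""
    change = []
    for coin in sorted(coins, reverse=True):
        while coin <= amount:
            change.append(coin)   # add a coin of that type to the change
            amount -= coin        # subtract its value from the amount remaining
    return change

US_COINS = [50, 25, 10, 5, 1]     # assumed coin values, in cents

print(greedy_change(96, US_COINS))         # [50, 25, 10, 10, 1]
print(greedy_change(16, US_COINS + [12]))  # [12, 1, 1, 1, 1] (5 coins, not optimal)
print(greedy_change(16, US_COINS))         # [10, 5, 1]
```

With the hypothetical 12¢ coin the greedy result uses 5 coins, while the 3-coin answer 10¢ + 5¢ + 1¢ is still available, which is the failure case described on the slide.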

4
Example: job scheduling
  • suppose you have a collection of jobs to execute
    and know their lengths
  • want to schedule the jobs so as to minimize
    waiting time
  • Job 1: 5 minutes
  • Job 2: 10 minutes
  • Job 3: 4 minutes
  • Schedule 1-2-3: 0 + 5 + 15 = 20 minutes waiting
  • Schedule 3-2-1: 0 + 4 + 14 = 18 minutes waiting
  • Schedule 3-1-2: 0 + 4 + 9 = 13 minutes waiting
  • GREEDY ALGORITHM: do the shortest job first
  • i.e., while there are still jobs to execute,
    schedule the shortest remaining job

does the greedy approach guarantee the optimal
schedule? efficiency?
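
As a rough check of the waiting times in this example, here is a small Python sketch. The helper names total_waiting_time and shortest_job_first are invented for illustration; "waiting time" is taken to mean the sum of the times each job waits before it starts, which matches the totals on the slide.

```python
def total_waiting_time(lengths):
    """Sum, over all jobs, of the time each job waits before it starts."""
    waited, clock = 0, 0
    for length in lengths:
        waited += clock   # this job waits for every job scheduled before it
        clock += length
    return waited

def shortest_job_first(jobs):
    """jobs maps job name -> length; returns the names ordered by length."""
    return sorted(jobs, key=jobs.get)

jobs = {1: 5, 2: 10, 3: 4}                          # job lengths from the slide
order = shortest_job_first(jobs)
print(order)                                        # [3, 1, 2]
print(total_waiting_time([jobs[j] for j in order])) # 13
print(total_waiting_time([5, 10, 4]))               # 20 (schedule 1-2-3)
```

In this sketch the greedy schedule is just a sort, so it costs O(N log N) for N jobs.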
5
Application: minimal spanning tree
  • consider the problem of finding a minimal
    spanning tree of a graph
  • a spanning tree of a graph G is a tree (no
    cycles) made up of all the vertices and a subset
    of the edges of G
  • a minimal spanning tree for a weighted graph G is
    a spanning tree with minimal total weight
  • minimal spanning trees arise in many real-world
    applications
  • e.g., wiring a network of computers; connecting
    rural houses with roads

spanning tree? minimal spanning tree?
example from http://compprog.wordpress.com/
6
Prim's algorithm
  • to find a minimal spanning tree (MST)
  • select any vertex as the root of the tree
  • repeatedly, until all vertices have been added
  • find the lowest weight edge with exactly one
    vertex in the tree
  • select that edge and vertex and add to the tree
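
A direct, unoptimized Python sketch of these steps is shown below. The function name prim_mst and the four-vertex example graph are assumptions made for illustration (the graph is not the one pictured in the slides), and the graph is assumed to be connected.

```python
def prim_mst(vertices, edges, root):
    """edges is a list of (u, v, weight) for an undirected, connected graph."""
    in_tree = {root}
    tree_edges = []
    while len(in_tree) < len(vertices):
        # eligible edges have exactly one endpoint already in the tree
        candidates = [(w, u, v) for (u, v, w) in edges
                      if (u in in_tree) != (v in in_tree)]
        w, u, v = min(candidates)        # lowest weight eligible edge
        tree_edges.append((u, v, w))
        in_tree.update((u, v))           # adds whichever endpoint was missing
    return tree_edges

# hypothetical example graph
VERTICES = ["A", "B", "C", "D"]
EDGES = [("A", "B", 1), ("B", "C", 2), ("A", "C", 3), ("C", "D", 4), ("B", "D", 5)]

mst = prim_mst(VERTICES, EDGES, "A")
print(mst)                          # [('A', 'B', 1), ('B', 'C', 2), ('C', 'D', 4)]
print(sum(w for _, _, w in mst))    # 7
```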

7
Prim's algorithm

minimal spanning tree? is it unique?
8
Correctness of Prim's algorithm
  • Proof (by induction): Each subtree T_1, T_2, ...,
    T_V in Prim's algorithm is contained in a MST.
    Thus, T_V is a MST.
  • BASE CASE: T_1 contains a single vertex, so is
    contained in a MST.
  • ASSUME: T_1, ..., T_(i-1) are contained in a MST.
  • STEP: Must show T_i is contained in a MST.
  • Assume the opposite, that T_i is not contained in
    any MST.
  • Let e_i be the new edge (i.e., minimum weight edge
    with exactly one vertex in T_(i-1)).
  • Let M be a MST that contains T_(i-1). Since we
    assumed T_i is not part of any MST, e_i is not in
    M, so adding e_i to M will yield a cycle.
  • That cycle must contain another edge with exactly
    one vertex in T_(i-1).
  • Replacing that edge with e_i yields a spanning
    tree, and since e_i had the minimal weight of any
    edge with exactly one vertex in T_(i-1), the new
    tree weighs no more than M, so it is a MST.
  • Thus, T_i is contained in a MST → CONTRADICTION!

9
Efficiency of Prim's algorithm
  • brute force (i.e., adjacency matrix)
  • simple (conservative) analysis
  • for each vertex, must select the least weight
    edge → O(V × E)
  • more careful analysis
  • note that the number of eligible edges is
    shrinking as the tree grows
  • Σ (V + deg(v_i)) = O(V² + E) = O(V²)
  • smarter implementation
  • use a priority queue (min-heap) to store
    vertices, along with minimal weight edge
  • to select each vertex, remove from PQ → V ×
    O(log V) = O(V log V)
  • to update each adjacent vertex after removal (at
    most once per edge)
  • E × O(log V) = O(E log V)
  • overall efficiency is O((E + V) log V)
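
One way to get a heap-based version in Python is sketched below using the standard heapq module. Note that this is a "lazy" variant: it pushes candidate edges onto the heap and skips stale entries, rather than storing vertices and updating their keys as the slide describes, so the heap can hold up to E entries. The function name prim_mst_weight and the example graph are assumptions for illustration only.

```python
import heapq

def prim_mst_weight(adj, root):
    """adj maps vertex -> list of (neighbor, weight); returns total MST weight."""
    visited = set()
    heap = [(0, root)]          # (weight of edge into vertex, vertex)
    total = 0
    while heap and len(visited) < len(adj):
        weight, vertex = heapq.heappop(heap)
        if vertex in visited:
            continue            # stale entry: vertex was already added via a cheaper edge
        visited.add(vertex)
        total += weight
        for nbr, w in adj[vertex]:
            if nbr not in visited:
                heapq.heappush(heap, (w, nbr))
    return total

# hypothetical undirected example graph, given as adjacency lists
ADJ = {
    "A": [("B", 1), ("C", 3)],
    "B": [("A", 1), ("C", 2), ("D", 5)],
    "C": [("A", 3), ("B", 2), ("D", 4)],
    "D": [("B", 5), ("C", 4)],
}

print(prim_mst_weight(ADJ, "A"))   # 7
```

Each edge is pushed at most twice and each push or pop costs O(log E) = O(log V), so the sketch stays within the O((E + V) log V) bound stated above.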

10
Application: shortest path
  • consider the general problem of finding the
    shortest path between two nodes in a graph
  • flight planning and word ladder are examples of
    this problem
  • in these cases, edges have uniform cost
    (shortest path = fewest edges)
  • if we allow non-uniform edges, want to find
    lowest cost/shortest distance path

Redville → Purpleville ?
example from http://www.algolist.com/Dijkstra's_algorithm
11
Modified BFS solution
  • we could modify the BFS approach to take cost
    into account
  • instead of adding each newly expanded path to the
    end (i.e., queue), add in order of path cost
    (i.e., priority queue)
  • contents of the priority queue after each
    expansion (path: cost)
  • [Redville]: 0
  • [Redville, Blueville]: 5,
    [Redville, Orangeville]: 8,
    [Redville, Greenville]: 10
  • [Redville, Orangeville]: 8,
    [Redville, Blueville, Greenville]: 8,
    [Redville, Greenville]: 10,
    [Redville, Blueville, Purpleville]: 12
  • [Redville, Blueville, Greenville]: 8,
    [Redville, Greenville]: 10,
    [Redville, Orangeville, Purpleville]: 10,
    [Redville, Blueville, Purpleville]: 12

note: as before, requires lots of memory to store
all the paths. HOW MANY?
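
A minimal Python sketch of this cost-ordered search follows. The edge weights in GRAPH are inferred from the path costs listed above (for example, Blueville to Greenville = 8 - 5 = 3) and the roads are assumed to be two-way; the original map image is not part of this transcript. The function name cheapest_path is hypothetical.

```python
import heapq

# edge weights inferred from the path costs in the slide; assumed undirected
GRAPH = {
    "Redville":    {"Blueville": 5, "Orangeville": 8, "Greenville": 10},
    "Blueville":   {"Redville": 5, "Greenville": 3, "Purpleville": 7},
    "Orangeville": {"Redville": 8, "Purpleville": 2},
    "Greenville":  {"Redville": 10, "Blueville": 3},
    "Purpleville": {"Blueville": 7, "Orangeville": 2},
}

def cheapest_path(graph, start, goal):
    """Expand whole paths in order of cost; returns (cost, path) or None."""
    frontier = [(0, [start])]                 # priority queue of (cost, path)
    while frontier:
        cost, path = heapq.heappop(frontier)  # cheapest path found so far
        node = path[-1]
        if node == goal:
            return cost, path
        for nbr, weight in graph[node].items():
            if nbr not in path:               # do not revisit a node within one path
                heapq.heappush(frontier, (cost + weight, path + [nbr]))
    return None

print(cheapest_path(GRAPH, "Redville", "Purpleville"))
# (10, ['Redville', 'Orangeville', 'Purpleville'])
```

Ties between equal-cost paths are broken here by comparing the path lists, so the expansion order may differ slightly from the one shown on the slide, but the cost of the returned path is the same.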
12
Dijkstra's algorithm
  • alternatively, there is a straightforward greedy
    algorithm for shortest path
  • Dijkstra's algorithm
  1. Begin with the start node. Set its value to 0 and
    the value of all other nodes to infinity. Mark
    all nodes as unvisited.
  2. For each unvisited node that is adjacent to the
    current node:
      • If (value of current node + value of edge) <
        (value of adjacent node), change the value of
        the adjacent node to this value.
      • Otherwise leave the value as is.
  3. Set the current node to visited.
  4. If unvisited nodes remain, select the one with
    smallest value and go to step 2.
  5. If there are no unvisited nodes, then DONE.
  • this algorithm is O(N²), requires only O(N)
    additional storage
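
The steps above can be sketched in Python roughly as follows. This is an illustration, not the course's implementation: the function name dijkstra is chosen here, the graph is assumed to be a dict of dicts with non-negative edge weights, and the example weights are inferred from the Redville example in these slides.

```python
import math

def dijkstra(graph, start):
    """Returns a dict mapping each node to the cost of its cheapest path from start."""
    value = {node: math.inf for node in graph}   # step 1: all values start at infinity
    value[start] = 0
    unvisited = set(graph)
    while unvisited:                             # steps 4 and 5
        current = min(unvisited, key=value.get)  # unvisited node with smallest value
        for nbr, weight in graph[current].items():          # step 2
            if nbr in unvisited and value[current] + weight < value[nbr]:
                value[nbr] = value[current] + weight
        unvisited.remove(current)                # step 3
    return value

# edge weights inferred from the slides' example; assumed undirected
GRAPH = {
    "Redville":    {"Blueville": 5, "Orangeville": 8, "Greenville": 10},
    "Blueville":   {"Redville": 5, "Greenville": 3, "Purpleville": 7},
    "Orangeville": {"Redville": 8, "Purpleville": 2},
    "Greenville":  {"Redville": 10, "Blueville": 3},
    "Purpleville": {"Blueville": 7, "Orangeville": 2},
}

print(dijkstra(GRAPH, "Redville")["Purpleville"])   # 10
```

Selecting the minimum by scanning the unvisited set is what makes this version O(N²); the only extra storage is the value table and the unvisited set, both O(N).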

13
Dijkstra's algorithm example
  • suppose we want to find the shortest path from
    Redville to Purpleville
  1. Begin with the start node. Set its value to 0 and
    the value of all other nodes to infinity. Mark
    all nodes as unvisited.
  2. For each unvisited node that is adjacent to the
    current node:
      • If (value of current node + value of edge) <
        (value of adjacent node), change the value of
        the adjacent node to this value.
      • Otherwise leave the value as is.
  3. Set the current node to visited.

14
Dijkstra's algorithm example cont.
  4. If unvisited nodes remain, select the one with
    smallest value and go to step 2.
  • Blueville: set Greenville to 8 and Purpleville to
    12; mark as visited.
  • Greenville: no unvisited neighbors; mark as
    visited.
  4. If unvisited nodes remain, select the one with
    smallest value and go to step 2.
  • Orangeville: set Purpleville to 10; mark as
    visited.
  5. If there are no unvisited nodes, then DONE.

With all nodes labeled, can easily construct the
shortest path. HOW?
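
One possible answer, sketched in Python: walk backward from the destination, at each node stepping to a neighbor whose label plus the connecting edge weight equals the current node's label. The function name reconstruct_path is hypothetical, the labels are the final values from this example, and the edge weights are inferred from the slides; the sketch assumes positive edge weights and a reachable destination.

```python
def reconstruct_path(graph, value, start, goal):
    """Trace the shortest path backward from goal using the final node labels."""
    path = [goal]
    while path[-1] != start:
        node = path[-1]
        # a predecessor on a shortest path satisfies value[prev] + weight == value[node]
        prev = next(n for n, w in graph[node].items()
                    if value[n] + w == value[node])
        path.append(prev)
    return path[::-1]

# edge weights inferred from the slides' example; assumed undirected
GRAPH = {
    "Redville":    {"Blueville": 5, "Orangeville": 8, "Greenville": 10},
    "Blueville":   {"Redville": 5, "Greenville": 3, "Purpleville": 7},
    "Orangeville": {"Redville": 8, "Purpleville": 2},
    "Greenville":  {"Redville": 10, "Blueville": 3},
    "Purpleville": {"Blueville": 7, "Orangeville": 2},
}
# final labels from the example
LABELS = {"Redville": 0, "Blueville": 5, "Orangeville": 8,
          "Greenville": 8, "Purpleville": 10}

print(reconstruct_path(GRAPH, LABELS, "Redville", "Purpleville"))
# ['Redville', 'Orangeville', 'Purpleville']
```

An alternative design is to record each node's predecessor whenever its value is lowered, which avoids the backward search at the cost of one more O(N) table.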
15
Correctness & efficiency of Dijkstra's algorithm
  • analysis of Dijkstra's algorithm is similar to
    Prim's algorithm
  • can show that each greedy selection is safe,
    leads to shortest path
  • brute force (i.e., adjacency matrix) approach
  • for each vertex, need to select shortest edge →
    O(V × E)
  • or, more carefully, Σ (V + deg(v_i)) = O(V² + E)
    = O(V²)
  • smarter implementation
  • use a priority queue (min-heap) to store
    vertices, along with minimal weight edge
  • to select each vertex, remove from PQ → V ×
    O(log V) = O(V log V)
  • to update each adjacent vertex after removal →
    E × O(log V) = O(E log V)
  • overall efficiency is O((E + V) log V)

16
Another application: data compression
  • in a multimedia world, document sizes continue to
    increase
  • a 6 megapixel digital picture is 2-4 MB
  • an MP3 song is 3-6 MB
  • a full-length MPEG movie is 800 MB
  • storing multimedia files can take up a lot of
    disk space
  • perhaps more importantly, downloading multimedia
    requires significant bandwidth
  • it could be a lot worse!
  • image/sound/video formats rely heavily on data
    compression to limit file size; e.g., if no
    compression, 6 megapixels × 3 bytes/pixel = 18
    MB
  • the JPEG format provides 10:1 to 20:1 compression
    without visible loss

17
Audio, video, text compression
  • audio & video compression algorithms rely on
    domain-specific tricks
  • lossless image formats (GIF, PNG) recognize
    repeating patterns (e.g. a sequence of white
    pixels) and store as a group
  • lossy image formats (e.g., JPG) round pixel values
    and combine close values
  • video formats (MPEG, AVI) take advantage of the
    fact that little changes from one frame to next,
    so store initial frame and changes in subsequent
    frames
  • lossy audio formats (e.g., MP3) remove sound out of
    hearing range, overlapping noises
  • what about text files?
  • in the absence of domain-specific knowledge,
    can't do better than a fixed-width code
  • e.g., ASCII code uses 8 bits for each character
  • '0' = 00110000    'A' = 01000001    'a' = 01100001
  • '1' = 00110001    'B' = 01000010    'b' = 01100010
  • '2' = 00110010    'C' = 01000011    'c' = 01100011
  • . . .

18
Fixed- vs. variable-width codes
  • suppose we had a document that contained only the
    letters a-f
  • with a fixed-width code, would need 3 bits for
    each character
  • a = 000    d = 011
  • b = 001    e = 100
  • c = 010    f = 101
  • if the document contained 100 characters, 100 × 3
    = 300 bits required
  • however, suppose we knew the distribution of
    letters in the document
  • a: 45, b: 13, c: 12, d: 16, e: 9, f: 5
  • can customize a variable-width code, optimized
    for that specific file
  • a = 0      d = 111
  • b = 101    e = 1101
  • c = 100    f = 1100
  • requires only 45×1 + 13×3 + 12×3 + 16×3 + 9×4 +
    5×4 = 224 bits
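
The bit counts for the two codes can be checked with a few lines of Python. The dictionary and function names here (FREQ, FIXED, VARIABLE, encoded_bits) are made up for this illustration; the frequencies and codes are the ones listed on the slide.

```python
FREQ = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}

FIXED = {"a": "000", "b": "001", "c": "010", "d": "011", "e": "100", "f": "101"}
VARIABLE = {"a": "0", "b": "101", "c": "100", "d": "111", "e": "1101", "f": "1100"}

def encoded_bits(freq, code):
    """Total bits needed: each letter's count times the length of its code word."""
    return sum(count * len(code[letter]) for letter, count in freq.items())

print(encoded_bits(FREQ, FIXED))      # 300
print(encoded_bits(FREQ, VARIABLE))   # 224
```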

19
Huffman codes
  • Huffman compression is a technique for
    constructing an optimal variable-length code
    for text
  • optimal in that it represents a specific file
    using the fewest bits
  • (among all symbol-for-symbol codes)
  • Huffman codes are prefix codes
  • no individual code is a prefix of any other code
  • a = 0      d = 111
  • b = 101    e = 1101
  • c = 100    f = 1100
  • this makes decompression unambiguous:
    1010111110001001101
  • note: since the code is specific to a particular
    file, it must be stored along with the compressed
    file in order to allow for eventual decompression
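
To see why the prefix property makes decoding unambiguous, here is a small Python sketch that reads bits left to right and emits a letter as soon as the bits read so far match a code word. The names CODE and decode are hypothetical; the code table and bit string are the ones from this slide.

```python
CODE = {"a": "0", "b": "101", "c": "100", "d": "111", "e": "1101", "f": "1100"}

def decode(bits, code):
    """Decode a bit string produced with a prefix code."""
    lookup = {word: letter for letter, word in code.items()}
    letters, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:        # no code word is a prefix of another,
            letters.append(lookup[current])   # so the first match is the only match
            current = ""
    if current:
        raise ValueError("bit string ends in the middle of a code word")
    return "".join(letters)

print(decode("1010111110001001101", CODE))   # badface
```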

20
Huffman trees
  • to construct a Huffman code for a specific file,
    utilize a greedy algorithm to construct a Huffman
    tree
  • process the file and count the frequency for each
    letter in the file
  • create a single-node tree for each letter,
    labeled with its frequency
  • repeatedly,
  • pick the two trees with smallest root values
  • combine these two trees into a single tree whose
    root is labeled with the sum of the two subtree
    frequencies
  • when only one tree remains, can extract the codes
    from the Huffman tree by following edges from
    root to each leaf (left edge = 0, right edge = 1)
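
A compact Python sketch of this construction, using the standard heapq module as the priority queue, is given below. The function name huffman_codes and the tree representation (a letter for a leaf, a (left, right) tuple for an internal node) are choices made for this illustration; the sketch assumes a non-empty frequency table, and by convention here the first tree removed becomes the left child.

```python
import heapq
from itertools import count

def huffman_codes(freq):
    """freq maps letter -> frequency; returns a dict letter -> code string."""
    tiebreak = count()   # keeps heap entries comparable when frequencies tie
    heap = [(f, next(tiebreak), letter) for letter, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)     # two trees with smallest root values
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    _, _, root = heap[0]

    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):           # internal node: (left, right)
            walk(tree[0], prefix + "0")       # left edge = 0
            walk(tree[1], prefix + "1")       # right edge = 1
        else:
            codes[tree] = prefix or "0"       # single-letter file still gets one bit
    walk(root, "")
    return codes

FREQ = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}
print(huffman_codes(FREQ))
```

For the frequencies used in these slides, this left/right convention happens to reproduce the code table shown earlier (a = 0, b = 101, c = 100, d = 111, e = 1101, f = 1100); a different convention would give different bit patterns with the same code lengths.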

21
Huffman tree construction (cont.)
the code corresponding to each letter can be read
by following the edges from the root: left edge =
0, right edge = 1
  • a = 0      d = 111
  • b = 101    e = 1101
  • c = 100    f = 1100
22
Huffman code compression
  • note that at each step, need to pick the two
    trees with smallest root values
  • perfect application for a priority queue
    (min-heap)
  • store each single-node tree in a priority queue
    (PQ): O(N log N)
  • repeatedly, O(N) times:
  • remove the two min-value trees from the PQ:
    O(log N)
  • combine into a new tree with sum at root and
    insert back into PQ: O(log N)
  • total efficiency: O(N log N)
  • while designed for compressing text, it is
    interesting to note that Huffman codes are used
    in a variety of applications
  • the last step in the JPEG algorithm, after
    image-specific techniques are applied, is to
    compress the resulting file using a Huffman code
  • similarly, Huffman codes are used to compress
    frames in MPEG (MP4)

23
Greed is good?
  • IMPORTANT: the greedy approach is not applicable
    to all problems
  • but when applicable, it is very effective (no
    planning or coordination necessary)
  • GREEDY approach for N-Queens: start with the
    first row; find a valid position in the current
    row; place a queen in that position, then move on
    to the next row





since queen placements are not independent, local
choices do not necessarily lead to a global
solution. GREEDY does not work; need a more
holistic approach
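
The failure is easy to reproduce with a short Python sketch. The function name greedy_queens is hypothetical, and "find a valid position" is interpreted here as "take the first column that is not attacked by a queen already placed", which is one reasonable reading of the slide.

```python
def greedy_queens(n):
    """Row by row, place a queen in the first safe column; never reconsider.

    Returns the list of chosen columns, or None if some row has no safe column.
    """
    cols = []                                   # cols[r] is the queen's column in row r
    for row in range(n):
        for col in range(n):
            safe = all(col != c and abs(col - c) != row - r
                       for r, c in enumerate(cols))
            if safe:
                cols.append(col)
                break
        else:
            return None                         # stuck: every column in this row is attacked
    return cols

print(greedy_queens(4))   # None, although 4-Queens has solutions such as [1, 3, 0, 2]
print(greedy_queens(8))   # None
```

The early choices are each locally valid, but they leave a later row with no safe column, and the greedy approach has no way to go back and revise them.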