Title: Trees (Ch. 9.2), Longin Jan Latecki, Temple University, based on slides by Simon Langley and Shang-Hua Teng
1 Trees (Ch. 9.2)
Longin Jan Latecki, Temple University
based on slides by Simon Langley and Shang-Hua Teng
2 Basic Data Structures - Trees
- Informal: a tree is a structure that looks like a real tree (upside down)
- Formal: a tree is a connected graph with no cycles.
3 Trees - Terminology
[Figure: a tree of size 7 and height 2 with nodes x, b, e, m, c, d, a, annotated with the terms root, subtree, value, nodes, and leaf]
- Every node must have its value(s)
- A non-leaf node has subtree(s)
- A non-root node has a single parent node
4 Types of Tree
Binary Tree
Each node has at most 2 sub-trees
m-ary Trees
Each node has at most m sub-trees
5 Binary Search Trees
- A binary search tree
- is a binary tree.
- if a node has value N, all values in its left
sub-tree are less than or equal to N, and all
values in its right sub-tree are greater than N.
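The definition above can be checked mechanically. The following is a minimal sketch, assuming a hypothetical `Node` class with `value`, `left`, and `right` fields (these names are not from the slides). It passes down the range that every value in a subtree must fall into.

```python
class Node:
    """Hypothetical binary tree node; None stands for an empty subtree."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def is_bst(node, low=None, high=None):
    """Return True if every value v in this subtree satisfies low < v <= high
    and the binary search tree property holds at every node."""
    if node is None:
        return True
    if low is not None and node.value <= low:
        return False
    if high is not None and node.value > high:
        return False
    # Left sub-tree: values <= node.value. Right sub-tree: values > node.value.
    return (is_bst(node.left, low, node.value)
            and is_bst(node.right, node.value, high))

good = Node(5, Node(3, Node(1), Node(4)), Node(8))
bad = Node(5, Node(3, Node(1), Node(6)), Node(8))  # 6 sits in the left sub-tree of 5
print(is_bst(good), is_bst(bad))
```

Comparing each node only with its direct children would not be enough: in `bad`, the node 6 is a valid right child of 3 but still violates the property with respect to the root 5.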
6 This is NOT a binary search tree
[Figure: example tree, not preserved]
7 This is a binary search tree
[Figure: example tree, not preserved]
8 Searching a binary search tree
- search(t, s)
-   if (t is an empty tree) return null
-   if (s == label(t))
-     return t
-   if (s < label(t))
-     return search(t's left tree, s)
-   else
-     return search(t's right tree, s)
Time per level: O(1)
Number of levels: h
Total: O(h)
9 Searching a binary search tree
- search(t, s)
-   while (t != null)
-     if (s == label(t)) return t
-     if (s < label(t))
-       t = leftSubTree(t)
-     else
-       t = rightSubTree(t)
-   return null
Time per level: O(1)
Number of levels: h
Total: O(h)
10 Here's another function that does the same (we search for label s):
- TreeSearch(t, s)
-   while (t != NULL and s != label[t])
-     if (s < label[t])
-       t = left[t]
-     else
-       t = right[t]
-   return t
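The search procedures above could be written in Python roughly as follows. This is a sketch, assuming a hypothetical `Node` class (with `value`, `left`, `right`) and using `None` for an empty tree. Both versions do O(1) work per level and visit at most one node per level, so they run in O(h) time.

```python
class Node:
    """Hypothetical binary tree node; None stands for an empty subtree."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def search_recursive(t, s):
    """Return the node labelled s, or None if s is not in the tree."""
    if t is None:
        return None
    if s == t.value:
        return t
    if s < t.value:
        return search_recursive(t.left, s)
    return search_recursive(t.right, s)

def search_iterative(t, s):
    """Same result as search_recursive, written as a loop."""
    while t is not None and s != t.value:
        t = t.left if s < t.value else t.right
    return t

root = Node(8, Node(3, Node(1), Node(6)), Node(10))
print(search_recursive(root, 6).value)   # found
print(search_iterative(root, 7))         # not found
```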
11 Insertion in a binary search tree: we need to search before we insert
[Figure: inserting 6 and then 11 into a binary search tree; a new value is always inserted as a leaf]
Time complexity: O(height_of_tree)
This is O(log n) if the tree is balanced, where n = size of the tree.
12 Insertion
- insertInOrder(t, s)
-   if (t is an empty tree) // insert here
-     return a new tree node with value s
-   else if (s <= label(t)) // equal values go left, as in the definition
-     t.left = insertInOrder(t.left, s)
-   else
-     t.right = insertInOrder(t.right, s)
-   return t
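A runnable sketch of the insertion procedure, assuming a hypothetical `Node` class and the helper functions `height` and `in_order`, which are added here only for illustration. Equal values are sent to the left sub-tree so that the result matches the definition of a binary search tree given earlier. Height is counted in edges, so a single node has height 0.

```python
class Node:
    """Hypothetical binary tree node; None stands for an empty subtree."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def insert_in_order(t, s):
    """Insert s as a new leaf and return the (possibly new) root."""
    if t is None:                      # empty tree: insert here
        return Node(s)
    if s <= t.value:
        t.left = insert_in_order(t.left, s)
    else:
        t.right = insert_in_order(t.right, s)
    return t

def height(t):
    """Number of edges on the longest root-to-leaf path (-1 if empty)."""
    return -1 if t is None else 1 + max(height(t.left), height(t.right))

def in_order(t):
    """Values in left-root-right order; sorted for a binary search tree."""
    return [] if t is None else in_order(t.left) + [t.value] + in_order(t.right)

def build(values):
    root = None
    for v in values:
        root = insert_in_order(root, v)
    return root

balanced = build([50, 30, 70, 20, 40, 60, 80])
chain = build([10, 20, 30, 40, 50, 60, 70])
print(height(balanced), height(chain))
print(in_order(balanced))
```

The same seven-element tree has height 2 or height 6 depending on the insertion order, which is why the O(log n) bound only holds when the tree is balanced.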
13 Comparison: Insertion in an ordered list
- insertInOrder(list, s)
-   loop1: search from the beginning of the list, looking for an item > s
-   loop2: shift the remaining list to its right, starting from the end of the list
-   insert s
[Figure: inserting 6 into an ordered list; the larger items are shifted one position to the right to make room]
Time complexity? O(n), where n = size of the list
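The two loops above can be sketched in Python as follows. The function name and the use of a Python list as the underlying array are assumptions for illustration. In practice the standard `bisect.insort` does the same job, but the explicit loops make the O(n) cost visible.

```python
def insert_in_order(items, s):
    """Insert s into the sorted list items, keeping it sorted. O(n) time."""
    # loop1: scan from the beginning for the first item greater than s
    i = 0
    while i < len(items) and items[i] <= s:
        i += 1
    # Grow the list by one slot; the appended value is a placeholder
    # that gets overwritten unless s belongs at the very end.
    items.append(s)
    # loop2: shift items[i:] one position right, starting from the end
    j = len(items) - 1
    while j > i:
        items[j] = items[j - 1]
        j -= 1
    items[i] = s
    return items

print(insert_in_order([2, 3, 4, 5, 7, 8, 9], 6))
```

Even if the position is found quickly, shifting can touch every remaining element, so the worst case stays linear, unlike insertion into a balanced binary search tree.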
14 Try it!!
- Build binary search trees for the following input sequences:
- 7, 4, 2, 6, 1, 3, 5, 7
- 7, 1, 2, 3, 4, 5, 6, 7
- 7, 4, 2, 1, 7, 3, 6, 5
- 1, 2, 3, 4, 5, 6, 7, 8
- 8, 7, 6, 5, 4, 3, 2, 1
15 Data Compression
- Suppose we have a 3 GB character data file that we wish to include in an email.
- Suppose the file only contains the 26 letters {a, ..., z}.
- Suppose each letter c in {a, ..., z} occurs with frequency f(c).
- Suppose we encode each letter by a binary code.
- If we use a fixed-length code, we need 5 bits for each character.
- The resulting message length is $5 \sum_{c} f(c)$ bits.
- Can we do better?
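The figure of 5 bits follows from needing at least 26 distinct codewords. A small sketch of the arithmetic is given below; the function name is hypothetical, and the total assumes that the 3 GB file holds roughly 3 × 10^9 one-byte characters, which the slides do not state explicitly.

```python
def fixed_length_bits(alphabet_size):
    """Smallest k such that 2**k >= alphabet_size (at least 1 bit)."""
    return max(1, (alphabet_size - 1).bit_length())

bits_per_char = fixed_length_bits(26)
num_chars = 3 * 10**9            # assumption: 3 GB of one-byte characters
total_bits = bits_per_char * num_chars

print(bits_per_char)             # bits per letter for a 26-letter alphabet
print(total_bits)                # total message length in bits
```

Since 2^4 = 16 < 26 <= 32 = 2^5, four bits are too few and five suffice. A fixed-length code ignores the frequencies entirely, which is the slack that a variable-length code can exploit.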
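16 Data Compression: A Smaller Example
- Suppose the file only has 6 letters {a, b, c, d, e, f} with frequencies [frequency table not preserved]
- Fixed length: 3G = 3,000,000,000 bits
- Variable length: [value not preserved]
[Table: fixed-length and variable-length codewords for each letter, not preserved]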
17 How to decode?
- At first it is not obvious how decoding will
happen, but this is possible if we use prefix
codes
18 Prefix Codes
- No encoding of a character can be the prefix of the longer encoding of another character
- We could not encode t as 01 and x as 01101, since 01 is a prefix of 01101
- By using a binary tree representation we generate prefix codes with letters as leaves
19 Decoding prefix codes
- Follow the tree until you reach a leaf, output its letter, and then repeat from the root!
- A message can be decoded uniquely!
20 Prefix codes allow easy decoding
Decode 11111011100 (decoded so far, followed by the remaining bits):
s    1011100
sa   11100
san  0
sane
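The decoding trace above can be reproduced with a short sketch. The code tree from the slide is not preserved, so the codewords below (e = 0, a = 10, n = 1110, s = 1111) are inferred from the trace and should be treated as an assumption; the original tree may contain further letters. Instead of walking a tree, the sketch looks up the bits read so far in a dictionary, which is equivalent for a prefix code because no codeword is a prefix of another.

```python
def decode(bits, code):
    """Decode a bit string using a prefix code given as {symbol: codeword}."""
    inverse = {word: symbol for symbol, word in code.items()}
    decoded = []
    current = ""
    for bit in bits:
        current += bit
        if current in inverse:       # reached a leaf: emit and restart
            decoded.append(inverse[current])
            current = ""
    if current:
        raise ValueError("input ends in the middle of a codeword")
    return "".join(decoded)

# Codewords inferred from the trace on the slide (an assumption).
code = {"e": "0", "a": "10", "n": "1110", "s": "1111"}
print(decode("11111011100", code))
```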
21 Some Properties
- Prefix codes allow easy decoding
- An optimal code is always represented by a full binary tree (a tree where every internal node has two children)
- For C leaves there are C-1 internal nodes
- The number of bits to encode a file is
  $B(T) = \sum_{c} f(c) \cdot \mathrm{length}_T(c)$
  where f(c) is the frequency of c and length_T(c) is the tree depth of c, which corresponds to the code length of c
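The cost formula is a frequency-weighted sum of code lengths, which a few lines of Python can evaluate. The frequencies and codewords below are illustrative values for a six-letter alphabet (in arbitrary units) and are not taken from the slides, whose table is not preserved.

```python
def encoded_length(freq, code):
    """Total bits: sum over characters c of f(c) * length of c's codeword."""
    return sum(freq[c] * len(code[c]) for c in freq)

# Illustrative frequencies (assumed, not from the slides).
freq = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}

fixed = {"a": "000", "b": "001", "c": "010",
         "d": "011", "e": "100", "f": "101"}
variable = {"a": "0", "b": "101", "c": "100",
            "d": "111", "e": "1101", "f": "1100"}

print(encoded_length(freq, fixed))
print(encoded_length(freq, variable))
```

The variable-length code wins because the most frequent letter gets the shortest codeword, even though two rare letters now need 4 bits instead of 3.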
22 Optimal Prefix Coding Problem
- Given is a set of n letters (c_1, ..., c_n) with frequencies (f_1, ..., f_n).
- Construct a full binary tree T to define a prefix code that minimizes the average code length
23 Greedy Algorithms
- Many optimization problems can be solved using a greedy approach
- The basic principle is that locally optimal decisions may be used to build an optimal solution
- But the greedy approach does not lead to an optimal solution for all problems
- The key is knowing which problems will work with this approach and which will not
- We study
- The problem of generating Huffman codes
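To make the caveat concrete, here is a small sketch of a problem where the greedy choice can fail. Coin change is not covered in these slides and is used here purely as an illustration; the function name and coin systems are assumptions.

```python
def greedy_coins(amount, denominations):
    """Repeatedly take the largest coin that still fits (the greedy choice)."""
    coins = []
    for d in sorted(denominations, reverse=True):
        while amount >= d:
            coins.append(d)
            amount -= d
    return coins

# With coins {1, 5, 10, 25} the greedy choice happens to be optimal here.
print(greedy_coins(30, [1, 5, 10, 25]))
# With coins {1, 3, 4} it is not: greedy uses 3 coins, but 3 + 3 uses only 2.
print(greedy_coins(6, [1, 3, 4]))
```

Huffman coding belongs to the first kind of problem: the locally best merge can be proven to lead to a globally optimal code.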
24 Greedy algorithms
- A greedy algorithm always makes the choice that looks best at the moment
- My everyday examples
- Driving in Los Angeles, NY, or Boston for that matter
- Playing cards
- Investing in stocks
- Choosing a university
- The hope: a locally optimal choice will lead to a globally optimal solution
- For some problems, it works
- Greedy algorithms tend to be easier to code
25 David Huffman's idea
- A term paper at MIT
- Build the tree (code) bottom-up in a greedy fashion
Each tree has a weight in its root and symbols as its leaves. We start with a forest of one-vertex trees representing the input symbols. We repeatedly merge the two trees whose sum of weights is minimal until we have only one tree.
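The merging procedure described above can be sketched with a priority queue from Python's standard `heapq` module. The representation is an assumption made for brevity: a leaf is a symbol string, an internal node is a pair `(left, right)`, and a running counter breaks ties between equal weights. The frequencies are illustrative and not taken from the slides.

```python
import heapq
from itertools import count

def huffman_code(freq):
    """Build a prefix code {symbol: codeword} by greedy bottom-up merging."""
    tiebreak = count()  # keeps heap entries comparable when weights are equal
    # Forest of one-vertex trees: (weight, tiebreak, tree)
    heap = [(w, next(tiebreak), sym) for sym, w in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two trees with the smallest weights.
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (t1, t2)))
    _, _, tree = heap[0]

    code = {}
    def walk(node, prefix):
        if isinstance(node, tuple):          # internal node: 0 = left, 1 = right
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                                # leaf: record the codeword
            code[node] = prefix or "0"       # single-symbol alphabet gets "0"
    walk(tree, "")
    return code

# Illustrative frequencies (assumed, not from the slides).
freq = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}
code = huffman_code(freq)
total = sum(freq[c] * len(code[c]) for c in freq)
print(code)
print(total)
```

Each of the n - 1 merges costs O(log n) heap work, so building the tree takes O(n log n) time. The exact codewords depend on how left and right are assigned, but the total encoded length does not.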
26-30 Building the Encoding Tree
[Figures: step-by-step construction of the encoding tree, not preserved]