Title: JETT 2005 Session 5: Algorithms, Efficiency, Hashing and Hashtables
1. JETT 2005 Session 5: Algorithms, Efficiency, Hashing and Hashtables
2. Today's buzzwords
- Algorithm
  - A strategy to solve a problem
  - A systematic approach that describes the solution process
- Complexity and Efficiency
  - How does an algorithm scale when the input size grows?
  - How do two algorithms solving the same problem compare?
- Big Oh Notation
  - A theoretical measure for the execution of an algorithm, in terms of the problem size n, usually the number of items.
- Hashing function
  - A function that takes an object and generates a number (or some form of an address) for the location where the object should be placed
- Hash Table
  - A data structure that stores items in designated places, using hashing functions to speed up key-based search
3. Algorithms
- Nearly every practical problem can be solved if given enough computing power
  - But it doesn't hurt to use as little as you can
- Problems do not have to be boring number-crunching stuff
  - Solutions are almost always elegant
- A good algorithm uses fewer computing resources (less memory, less CPU time)
  - Strangely, it may not always result in more lines of code!
4. So let's solve a fun problem!
- How would you take the mouse to the cheese?
5. An algorithm to solve this problem
- Step 1: Read in the maze and populate a 2D array of Rooms.
- Step 2: Create an instance of a stack.
- Step 3: Place the Room/Location corresponding to the mouse position on the stack.
- Step 4: Peek at the top room of the stack. If this room is not visited, do the following:
  - Mark it as visited. If the room location is the same as the cheese location, then you are done. If not, find all the rooms that are accessible from this top room and are not visited (you can use a specific order or a random order; it does not matter).
  - Push all the neighboring unvisited rooms onto the stack.
- Step 5: If the top room is already visited, pop it off the stack. Continue with Step 4.
- Step 6: If the stack becomes empty, guess what? There is no solution to the problem.
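The stack-based strategy above can be sketched in a few lines of Python. This is only an illustrative sketch, not the session's own code: the function name `solve_maze` and the maze encoding (a list of strings with `M` for the mouse, `C` for the cheese, and `#` for walls) are assumptions made for this example, and the maze is assumed to contain exactly one mouse.

```python
def solve_maze(maze):
    """Return True if the mouse 'M' can reach the cheese 'C'.

    maze is a list of equal-length strings; '#' marks a wall
    (an encoding assumed for this sketch only).
    """
    rows, cols = len(maze), len(maze[0])

    # Read the maze and locate the mouse and the cheese.
    mouse = cheese = None
    for r in range(rows):
        for c in range(cols):
            if maze[r][c] == "M":
                mouse = (r, c)
            elif maze[r][c] == "C":
                cheese = (r, c)

    visited = set()
    stack = [mouse]                  # the stack starts with the mouse's room
    while stack:                     # an empty stack means there is no solution
        room = stack[-1]             # peek at the top room
        if room in visited:
            stack.pop()              # already visited: pop it and peek again
            continue
        visited.add(room)            # mark the room as visited
        if room == cheese:
            return True
        r, c = room
        # Push every accessible, unvisited neighbor (up, down, left, right).
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and maze[nr][nc] != "#" and (nr, nc) not in visited:
                stack.append((nr, nc))
    return False


print(solve_maze(["M.#",
                  "..#",
                  "#.C"]))
```

A Python list serves as the stack here: `append` pushes, `pop` removes the top, and `stack[-1]` peeks.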
6. A data structure to solve these problems: Game trees
[Figure: a game tree leading from the Start node to the goal node]
7. Algorithms to solve the problems
- Easiest: Depth First search
  - Simplest, guaranteed to find a solution if one exists
  - Easily implemented with a stack, as we saw
  - Inefficient: a lot of backtracking, and it will potentially visit all nodes!
- Pruning trees
  - Each node is given a weight based on how close it is to the goal
  - The idea is to pick the node with the highest weight
  - Creating the tree is more difficult
  - May not always find the best solution
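One possible reading of the weighting idea is greedy best-first search, sketched below for a grid maze. Everything specific here is an assumption for illustration: the name `greedy_search`, the `#` wall encoding, and the use of Manhattan distance to the goal as the weight (a smaller distance standing in for a higher weight). Like the approach described above, it is not guaranteed to find the best path.

```python
import heapq


def greedy_search(maze, start, goal):
    """Greedy best-first search on a grid.

    Returns the number of rooms visited up to and including the goal,
    or None if the goal cannot be reached. '#' marks a wall (assumed).
    """
    def distance(room):
        # Manhattan distance to the goal: smaller means "closer".
        return abs(room[0] - goal[0]) + abs(room[1] - goal[1])

    rows, cols = len(maze), len(maze[0])
    frontier = [(distance(start), start)]   # priority queue ordered by distance
    visited = set()
    rooms_visited = 0
    while frontier:
        _, room = heapq.heappop(frontier)   # always take the closest-looking room
        if room in visited:
            continue
        visited.add(room)
        rooms_visited += 1
        if room == goal:
            return rooms_visited
        r, c = room
        for nxt in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            nr, nc = nxt
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and maze[nr][nc] != "#" and nxt not in visited:
                heapq.heappush(frontier, (distance(nxt), nxt))
    return None


print(greedy_search(["...", "...", "..."], (0, 0), (2, 2)))
```

On an open grid the search heads almost straight for the goal instead of wandering, which is the saving that the weighting is meant to buy.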
8. Complexity and Efficiency of Algorithms
- Complexity is not
  - How hard the algorithm is to implement
  - How many lines of code it takes to implement it
- Complexity is
  - How many operations the algorithm performs to solve the problem
  - How many resources it takes to solve the problem
- A program is more complex (less efficient) if
  - It performs more operations
  - It takes more time and memory
  - It scales worse as the input size grows
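To make "number of operations" concrete, the sketch below counts the comparisons made by two ways of finding an item in a sorted list. The function names are hypothetical and chosen for this example; the two searches themselves are the standard linear scan and binary search.

```python
def linear_search_steps(items, target):
    """Count the comparisons made by a left-to-right scan."""
    steps = 0
    for item in items:
        steps += 1
        if item == target:
            break
    return steps


def binary_search_steps(items, target):
    """Count the probes made by binary search on a sorted list."""
    steps = 0
    low, high = 0, len(items) - 1
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if items[mid] == target:
            break
        if items[mid] < target:
            low = mid + 1       # target can only be in the upper half
        else:
            high = mid - 1      # target can only be in the lower half
    return steps


# Search for the last element as the input grows by a factor of 100.
for n in (1_000, 100_000):
    data = list(range(n))
    print(n, linear_search_steps(data, n - 1), binary_search_steps(data, n - 1))
```

The linear scan's count grows in step with n, while the binary search's count grows only by a handful of probes, even though both functions are about the same length in code.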
9. How is Efficiency Measured?
- Big Oh Notation
  - A theoretical measure of the execution of an algorithm, usually the time or memory needed, given the problem size n, which is usually the number of items.
  - Informally, saying some function f(n) = O(g(n)) means that, once n is large enough, f(n) is less than some constant multiple of g(n).
  - The notation is read, "f of n is big oh of g of n".
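The informal statement can be written out as the usual textbook definition, followed by a small worked example. The constants c and n_0 and the sample function are introduced here for illustration only.

```latex
f(n) = O(g(n))
\iff
\exists\, c > 0,\ \exists\, n_0 \ \text{such that}\ f(n) \le c \cdot g(n) \ \text{for all } n \ge n_0 .

% Worked example: f(n) = 3n^2 + 5n is O(n^2).
% For n >= 5 we have 5n <= n^2, so
3n^2 + 5n \;\le\; 3n^2 + n^2 \;=\; 4n^2 \quad \text{for all } n \ge 5,
% which satisfies the definition with c = 4 and n_0 = 5.
```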
10. So what does that mean?
- Two algorithms with the same efficiency in Big Oh notation will scale the same way
  - There may be a slight difference in actual running times.
  - For example, Bubblesort and Insertionsort are both O(n^2), but Insertionsort is typically faster on an input of a given size
  - However, as the input size grows, both algorithms show the same increase in runtime
- Provides a means to compare algorithms
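A rough way to see this is to count comparisons rather than measure time. The sketch below uses textbook versions of the two sorts; the function names and the choice of random test data are assumptions for this example.

```python
import random


def bubble_sort_comparisons(items):
    """Sort a copy with plain bubble sort; return the comparison count."""
    a = list(items)
    comparisons = 0
    for end in range(len(a) - 1, 0, -1):
        for i in range(end):
            comparisons += 1
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return comparisons


def insertion_sort_comparisons(items):
    """Sort a copy with insertion sort; return the comparison count."""
    a = list(items)
    comparisons = 0
    for i in range(1, len(a)):
        j = i
        while j > 0:
            comparisons += 1
            if a[j - 1] <= a[j]:
                break                      # element is already in place
            a[j - 1], a[j] = a[j], a[j - 1]
            j -= 1
    return comparisons


random.seed(0)  # fixed seed so repeated runs give the same data
for n in (200, 400, 800):
    data = [random.random() for _ in range(n)]
    print(n, bubble_sort_comparisons(data), insertion_sort_comparisons(data))
```

Insertion sort should report fewer comparisons at every size, yet each doubling of n roughly quadruples both counts, which is the n^2 behavior that Big Oh captures.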
11. Hash Table: the most efficient data structure for Collections
- Start with an array that holds the hash table.
- Use a hash function to take a key and map it to some index in the array. This function will generally map several different keys to the same index.
- If the desired record is in the location given by the index, then we're finished; otherwise we must use some method to resolve the collision that may have occurred between two records wanting to go to the same location.
- This process is called hashing. To use hashing we must
  - find good hash functions
  - determine how to resolve collisions
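As a minimal sketch of a hash function mapping keys to array indices, consider the toy function below. The name `hash_index`, the table size of 11, and the "sum of character codes" rule are all assumptions picked to make collisions easy to see; this is deliberately not a good hash function.

```python
TABLE_SIZE = 11  # assumed small table size, for illustration only


def hash_index(key, table_size=TABLE_SIZE):
    """Toy hash: sum of the character codes, reduced modulo the table size."""
    return sum(ord(ch) for ch in key) % table_size


for key in ("listen", "silent", "cheese"):
    print(key, "->", hash_index(key))
```

Because "listen" and "silent" contain the same letters, they sum to the same value and land on the same index: two different keys, one location, which is exactly the collision that must be resolved.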
12. Collision Resolution with Open Addressing
- Linear Probing
  - Linear probing starts with the hash address and searches sequentially for the target key or an empty position. The array should be considered circular, so that when the last location is reached, the search proceeds to the first location of the array.
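A minimal sketch of linear probing follows. The class name `LinearProbingTable`, the fixed capacity, and the use of Python's built-in `hash` as the hash function are assumptions for this example; deletion and resizing are left out to keep it short.

```python
class LinearProbingTable:
    """Fixed-size open-addressing table using linear probing."""

    def __init__(self, capacity=8):
        # Each slot is either None (empty) or a (key, value) pair.
        self.slots = [None] * capacity

    def _probe(self, key):
        """Yield indices from the hash address onward, wrapping around once."""
        capacity = len(self.slots)
        start = hash(key) % capacity
        for step in range(capacity):
            yield (start + step) % capacity   # circular array

    def put(self, key, value):
        for index in self._probe(key):
            slot = self.slots[index]
            if slot is None or slot[0] == key:
                self.slots[index] = (key, value)
                return
        raise RuntimeError("table is full")

    def get(self, key):
        for index in self._probe(key):
            slot = self.slots[index]
            if slot is None:
                break                         # empty slot: key is not present
            if slot[0] == key:
                return slot[1]
        raise KeyError(key)


table = LinearProbingTable(8)
for key in (7, 15, 23):       # all three hash to index 7 in a table of 8
    table.put(key, str(key))
print(table.slots)
```

In the usage lines, the second and third keys find index 7 taken and wrap around to the start of the array.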
13. Collision Resolution with Open Addressing (Contd.)
- Quadratic Probing
  - If there is a collision at hash address h, quadratic probing goes to locations h+1, h+4, h+9, ..., that is, to locations h + i^2 for i = 1, 2, ...
- Other Methods
  - Key-dependent increments
  - Random probing
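The quadratic probe sequence can be sketched as a one-line function. The name `quadratic_probe_sequence` and its parameters are hypothetical, and reducing each location modulo the table size (so the array is again treated as circular) is an assumption of this sketch.

```python
def quadratic_probe_sequence(h, table_size, attempts):
    """Return the first `attempts` locations tried, starting at address h."""
    # i = 0 is the hash address itself; i = 1, 2, 3 give h+1, h+4, h+9.
    return [(h + i * i) % table_size for i in range(attempts)]


print(quadratic_probe_sequence(3, 11, 4))
```

Compared with linear probing, the growing gaps between probes spread colliding keys further apart in the array.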
14. Chained Hash Tables
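Only the title of this slide survives here, so the following is a generic sketch of the usual chaining idea rather than a reconstruction of the slide: each array position holds a list (a "chain") of all the records that hash to it. The class name `ChainedTable` and the use of Python's built-in `hash` are assumptions.

```python
class ChainedTable:
    """Hash table that resolves collisions by keeping a chain per bucket."""

    def __init__(self, buckets=8):
        self.buckets = [[] for _ in range(buckets)]

    def _chain(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        chain = self._chain(key)
        for i, (existing_key, _) in enumerate(chain):
            if existing_key == key:
                chain[i] = (key, value)   # key already present: replace value
                return
        chain.append((key, value))        # new key: extend the chain

    def get(self, key):
        for existing_key, value in self._chain(key):
            if existing_key == key:
                return value
        raise KeyError(key)


table = ChainedTable(8)
table.put(1, "one")
table.put(9, "nine")      # 1 and 9 share a bucket in a table of 8
print(table.buckets[1])
```

Unlike open addressing, a collision never forces a record into a different array position; it only makes one chain longer.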
15. Birthday Surprise: How Are Collisions Possible?
- If 24 or more randomly chosen people are in a room, what is the probability that two of them have the same birthday?
  - Better than even: about 54% for 24 people, and higher for larger groups.
- For hashing, the birthday surprise says that for any problem of reasonable size, collisions will almost certainly occur.
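The probability can be checked with a short calculation: multiply together the chances that each new person misses all earlier birthdays, then subtract from 1. The function name `shared_birthday_probability` is hypothetical, and the sketch assumes 365 equally likely birthdays.

```python
def shared_birthday_probability(people, days=365):
    """Probability that at least two of `people` share a birthday."""
    p_all_distinct = 1.0
    for i in range(people):
        # Person i must avoid the i birthdays already taken.
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


for group in (10, 24, 40):
    print(group, round(shared_birthday_probability(group), 3))
```

The same function applies to hashing by letting `days` be the number of table slots and `people` the number of keys, for example `shared_birthday_probability(100, days=10_000)`.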
16. So how efficient are hash tables?
          Ordered Arrays   Binary Trees (balanced)   Hash Tables (average case)
Find      O(log n)         O(log n)                  O(1)
Insert    O(n)             O(log n)                  O(1)
Delete    O(n)             O(log n)                  O(1)
17. Moral of the Story
- Data Structures provide the vehicle for problem solving
- The algorithm is the route from the source to the destination
- Efficiency is the time you take to make the trip!
  - An interstate is more efficient than a state route.