Hash Tables - PowerPoint PPT Presentation

Slides: 25
Provided by: toshi155

1
Hash Tables
2
Hash Function
A hash table stores and retrieves data items using a hash function:
key → index (the address for storage)
A perfect hash function produces a unique integer
for every item.
But it may be too slow to compute.
So we often settle for less than perfect hash
functions.
3
Hash Function Examples
Depending on what we want to store, a key can be hashed from:
- the last few digits of the ISU ID
- the last few digits of the SSN
- the sum of the char values in the student's name (computed on the first 10 chars)
- the letters in a VIN, converted to integers
- the letters in a license plate, converted to integers
4
Choice of Hash Function
Should distribute keys uniformly into slots.
Should be unaffected by any patterns in the data.
Ex. Suppose keys are in the range [0, 9999], and there are 100 slots.
Consider the hash function h(k) = ⌊k / 100⌋.
If you are given numbers between 0 and 99 to hash, they will all end up in the same slot, 0!
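The effect can be checked by counting how many distinct slots a set of keys lands in. The sketch below does this for h(k) = ⌊k / 100⌋ and, for contrast, for h(k) = k mod 100. The function names are illustrative, and k mod 100 is an alternative chosen here for comparison, not one given on the slide.

```cpp
#include <set>

// h(k) = floor(k / 100): the hash function from the example above.
int hashByDivide(int k) { return k / 100; }

// h(k) = k mod 100: a contrasting choice (an assumption for illustration).
int hashByMod(int k) { return k % 100; }

// Count how many distinct slots the keys lo..hi land in under hash function h.
template <typename Hash>
int slotsUsed(int lo, int hi, Hash h)
{
    std::set<int> slots;
    for (int k = lo; k <= hi; k++)
        slots.insert(h(k));
    return static_cast<int>(slots.size());
}
```

For the keys 0 to 99, slotsUsed(0, 99, hashByDivide) reports a single slot, while slotsUsed(0, 99, hashByMod) reports 100.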
5
Hash Function Division Method
h(k) = k mod m
Ex. 2000 character strings, 8 bits per char.
[Figure labels: "r in ASCII", "p in ASCII"]
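One way to apply the division method to character strings is to read the string as a base-256 number (8 bits per char) and reduce it modulo m as it is built up. This is a minimal sketch under that assumption. The name divisionHash is hypothetical, and the caller must supply a table size m greater than 0.

```cpp
#include <string>

// Division method: h(k) = k mod m, where the string is read as a
// base-256 number (8 bits per char). Horner's rule keeps the running
// value below m, so the full number is never built.
unsigned int divisionHash(const std::string& key, unsigned int m)
{
    unsigned long long h = 0;
    for (unsigned char c : key)
        h = (h * 256 + c) % m;
    return static_cast<unsigned int>(h);
}
```

For instance, with an illustrative table size of 701, the key "ab" is the number 97 * 256 + 98 = 24930, and 24930 mod 701 = 395.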
6
Multiplication Method
Choose a constant A with 0 < A < 1, but not too close to 0 or 1.
Choose m as some power of 2.
For a key k, let α be the fractional part of kA.
h(k) = greatest integer less than or equal to mα
Ex. m = 8 and 7-bit words, in binary:
    A  = .1011001
    k  = 1101011
    kA = 1001010.0110011
1. Take the fractional part of kA. 2. Discard the rest. 3. Shift it to the left (3 bits, since m = 8). 4. Take the shifted-out bits.
    h(k) = 011 (binary) = 3
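The same steps can be carried out in integer arithmetic. In this sketch, A is passed as the integer a = A * 2^w (89 for the 7-bit example, since .1011001 in binary is 89/128) and the table has m = 2^p slots. The function name and parameter layout are assumptions made for illustration, and the code assumes p <= w < 64.

```cpp
// Multiplication method with m = 2^p slots and w-bit words.
// A is represented as the integer a = A * 2^w, so k * a is kA shifted
// left by w bits: its low w bits are the fractional part of kA.
unsigned int multiplicationHash(unsigned int k, unsigned int a,
                                unsigned int w, unsigned int p)
{
    unsigned long long product = static_cast<unsigned long long>(k) * a;
    unsigned long long fraction = product & ((1ULL << w) - 1); // low w bits
    return static_cast<unsigned int>(fraction >> (w - p));     // top p bits of the fraction
}
```

With k = 107 (1101011 in binary), a = 89, w = 7 and p = 3, the product is 9523, its low 7 bits are 51 (0110011), and the top 3 of those bits give 3.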
7
Function Objects
Solution: treat a function as an object, defined by a class referred to as a function object type.
A function object is an object that behaves like a function.
It is defined by overloading operator() as a member function.
8
Function Object Example
    template <typename T>
    class greaterThan
    {
    public:
        bool operator() (const T& x, const T& y) const
        { return x > y; }
    };

    greaterThan<int> f;   // function object f compares two integers
    int a, b;
    cin >> a >> b;
    if (f(a,b))           // evaluated as f.operator()(a,b)
        cout << a << " > " << b << endl;
    else                  // a may also equal b here
        cout << a << " <= " << b << endl;
9
Anonymous Function Object
greaterThan<int>() is an anonymous object of greaterThan<int> type.
greaterThan<int>()(x, y) evaluates x > y.
    string strA = "walk", strB = "crawl";
    if (greaterThan<string>()(strA, strB))
        cout << strA << " > " << strB << endl;
10
General Purpose Sorting
Modify insertion sort to sort in either ascending or descending order, or by any comparison function.
    template <typename T, typename Compare>
    void insertionSort(vector<T>& v, Compare comp)
    {
        int i, j, n = v.size();
        T temp;
        for (i = 1; i < n; i++) {
            j = i;
            temp = v[i];
            while (j > 0 && comp(temp, v[j-1])) {
                v[j] = v[j-1];
                j--;
            }
            v[j] = temp;
        }
    }
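To see the comparison parameter at work, the sort can be called with the standard function objects std::less and std::greater. The sketch below repeats the sort so that it compiles on its own. Using the standard library comparators in place of the slides' greaterThan is a choice made here for brevity.

```cpp
#include <functional>
#include <vector>

template <typename T, typename Compare>
void insertionSort(std::vector<T>& v, Compare comp)
{
    int n = static_cast<int>(v.size());
    for (int i = 1; i < n; i++)
    {
        int j = i;
        T temp = v[i];
        // shift every element that must come after temp one slot to the right
        while (j > 0 && comp(temp, v[j-1]))
        {
            v[j] = v[j-1];
            j--;
        }
        v[j] = temp;
    }
}
```

Calling insertionSort(v, std::less<int>()) sorts a vector of ints in ascending order, and insertionSort(v, std::greater<int>()) sorts it in descending order. Both comparators are passed as anonymous function objects.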
11
Integer Hash Functions
Identity hash function:
    class hFintID
    {
    public:
        unsigned int operator()(int item) const
        { return (unsigned int) item; }
    };

    hFintID hf;
    hf(35) = 35
    // index for a serial number in a 10000-element table
    hf(0682401) % 10000 = 0682401 % 10000 = 2401
But conflicts can arise:
    hf(9732001) % 10000 = hf(1362001) % 10000 = hf(8572001) % 10000 = 2001
12
The Midsquare Technique
This hash function takes an integer and then performs the following:
    class hFint
    {
    public:
        unsigned int operator() (int item) const
        {
            unsigned int value = (unsigned int) item;
            value *= value;         // square the value
            value /= 256;           // discard the lowest 8 bits
            return value % 65536;   // return item in range 0 to 65535
        }
    };

    hFint hf;
    // hf(9732001) = 51491    51491 % 10000 = 1491
    // hf(1362001) = 26281    26281 % 10000 = 6281
    // hf(8572001) = 44732    44732 % 10000 = 4732
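The figures above can be reproduced with a short sketch that places the identity and midsquare functions side by side. The free-function names are illustrative. std::uint32_t is used on the assumption that the slides' unsigned int is 32 bits wide, so the square wraps modulo 2^32.

```cpp
#include <cstdint>

// Identity hash: the key itself.
std::uint32_t identityHash(std::uint32_t item) { return item; }

// Midsquare hash: square the key (wrapping modulo 2^32), drop the lowest
// 8 bits, and keep the next 16 bits.
std::uint32_t midsquareHash(std::uint32_t item)
{
    std::uint32_t value = item;
    value *= value;         // unsigned overflow wraps around, by design
    value /= 256;           // discard the lowest 8 bits
    return value % 65536;   // result is in the range 0 to 65535
}
```

The three serial numbers 9732001, 1362001 and 8572001 all reduce to index 2001 under identityHash(x) % 10000, but midsquareHash spreads them over indices 1491, 6281 and 4732.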
13
String Hash Function
Function hFstring
Ex. key = "and"
    n = (0 * 8) + 97 = 97        // ASCII value of a is 97
    n = (97 * 8) + 110 = 886     // ASCII value of n is 110
    n = (886 * 8) + 100 = 7188   // ASCII value of d is 100
14
Implementation of hFstring
    class hFstring
    {
    public:
        unsigned int operator()(const string& item) const
        {
            unsigned int prime = 2049982463;
            int n = 0, i;
            for (i = 0; i < item.length(); i++)
                n = n*8 + item[i];
            return n > 0 ? (n % prime) : (-n % prime);
        }
    };

    hFstring hfStr;
    hfStr("and") = 7188
    hfStr("multiplication") = 950233562
    hfStr("algorithm") = 1885049517
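The short-key results can be checked with a standalone variant. This sketch accumulates in unsigned int instead of int, an assumption made here so that overflow on long keys wraps in a well-defined way. Because of that change, results for long keys such as "multiplication" may differ from the values listed above. The name stringHash is illustrative.

```cpp
#include <string>

// Variant of hFstring that accumulates in unsigned arithmetic
// (an assumption made here so that overflow wraps predictably).
unsigned int stringHash(const std::string& item)
{
    const unsigned int prime = 2049982463u;
    unsigned int n = 0;
    for (std::string::size_type i = 0; i < item.length(); i++)
        n = n * 8 + static_cast<unsigned char>(item[i]);
    return n % prime;
}
```

For "and" the running value goes 97, 886, 7188, matching the worked example, and "algorithm" stays below the prime, giving 1885049517.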
15
Regular Hash Table
Takes the form of an array or vector.
[Figure: array with Sally stored in one slot]
- Compute the hash value of an item to be stored.
- Mod (%) the value by the hash table size.
- The result is an index of the array, where you store the item or look for it.
Ex. Suppose Sally hashes to 15 and the table size is 7.
Store at the location 15 % 7 = 1.
16
Resolving Collisions
Collision happens when an item to be stored
computes to the same hash index as an already
existing item.
Resolving strategy: linear probe open addressing
- Search down the array until an empty slot is found.
- Store the item there.
Works well if the table is large relative to the number of items to store.
But performance can degrade as the ratio of items to table size approaches 1.
Ex. Given the table
[Figure: hash table containing Jack]
Now add Jack, which hashes to 22 → 22 % 7 = 1.
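A linear probe insertion might look like the sketch below. It assumes that an empty string marks an empty slot and that the search wraps around to the start of the array when it reaches the end. Neither detail is stated on the slide, and the function name is hypothetical.

```cpp
#include <string>
#include <vector>

// Open-addressing table of names; an empty string marks an empty slot.
// Returns the index where the item was stored, or -1 if the table is full.
int linearProbeInsert(std::vector<std::string>& table,
                      const std::string& item, unsigned int hashValue)
{
    int size = static_cast<int>(table.size());
    if (size == 0)
        return -1;
    int start = static_cast<int>(hashValue % table.size());
    for (int step = 0; step < size; step++)
    {
        int index = (start + step) % size;   // wrap around at the end
        if (table[index].empty())
        {
            table[index] = item;
            return index;
        }
    }
    return -1;   // every slot is occupied
}
```

In a 7-slot table, inserting Sally with hash value 15 stores her at index 1. Inserting Jack with hash value 22 then finds index 1 occupied and stores him at index 2.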
17
Chained Hash Table
An array of pointers to linked lists where the items are stored.
[Figure: Sally stored in a linked list attached to an array slot]
Typically, the array size is 30-50% larger than the maximum number of data items.
18
Chaining
- Add to the linked list stored at the hash index.
Hashing takes O(1) time on average if
- the data is well analyzed
- the hash function and table size are set to minimize collisions.
19
Operations on Chained Hash Table
[Figure: hash function h maps a key k from the space of keys to slot h(k) of hash table T, which holds a chain of length L]
Insertion: O(1)
Search: O(L)
Deletion: O(L) if singly linked list, O(1) if doubly linked list
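These costs can be seen in a minimal chained table of ints. The sketch uses std::list (a doubly linked list) for each chain and the identity function as the hash. The class name and interface are hypothetical and much smaller than a full hash class. It assumes at least one bucket, and insert does not check for duplicates.

```cpp
#include <algorithm>
#include <cstddef>
#include <list>
#include <vector>

// Minimal chained hash table of ints: one linked list per bucket.
class ChainedIntTable
{
public:
    explicit ChainedIntTable(int nbuckets) : bucket(nbuckets) {}

    // Insertion: O(1), push onto the front of the chain.
    void insert(int item) { bucket[index(item)].push_front(item); }

    // Search: O(L), walk the chain of length L.
    bool contains(int item) const
    {
        const std::list<int>& chain = bucket[index(item)];
        return std::find(chain.begin(), chain.end(), item) != chain.end();
    }

    // Deletion by value: O(L) to locate the item, then O(1) to unlink it.
    bool erase(int item)
    {
        std::list<int>& chain = bucket[index(item)];
        std::list<int>::iterator pos = std::find(chain.begin(), chain.end(), item);
        if (pos == chain.end())
            return false;
        chain.erase(pos);
        return true;
    }

private:
    // identity hash reduced modulo the number of buckets
    std::size_t index(int item) const
    { return static_cast<unsigned int>(item) % bucket.size(); }

    std::vector<std::list<int> > bucket;
};
```

With 7 buckets, the keys 15 and 22 both land in bucket 1 and share one chain, so a search for either walks at most two nodes.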
20
The hash Class
    template <typename T, typename HashFunc>
    class hash
    {
    public:
        #include "d_hiter.h"   // hash table iterator nested classes

        hash(int nbuckets, const HashFunc& hfunc = HashFunc());
            // constructor specifying the number of buckets in the hash table
            // and the hash function

        hash(T *first, T *last, int nbuckets, const HashFunc& hfunc = HashFunc());
            // constructor with arguments including a pointer range
            // [first, last) of values to insert, the number of
            // buckets in the hash table, and the hash function

        bool empty() const;   // is the hash table empty?
        int size() const;     // return number of elements in the hash table
21
contd
        iterator find(const T& item);
        const_iterator find(const T& item) const;
            // return an iterator pointing at item if it is in the
            // table; otherwise, return end()

        pair<iterator,bool> insert(const T& item);
        int erase(const T& item);
        void erase(iterator pos);
        void erase(iterator first, iterator last);

        iterator begin();
        const_iterator begin() const;
        iterator end();
        const_iterator end() const;

    private:
        int numBuckets;
        vector<list<T> > bucket;
        HashFunc hf;
        int hashtableSize;
    };
22
Examples
    hash<int, hFintID> hInt(23);
        // declare a hash table that stores integer values in 23
        // buckets, using the identity function

    string strArr[] = {"a", "more", "bucket", "hash", "table", "class"};
    int strArrSize = sizeof(strArr) / sizeof(string);
    hash<string, hFstring> hString(strArr, strArr + strArrSize, 101);
        // a hash table that consists of 101 buckets holding strings.
23
Employee Records
    class employee
    {
    public:
        employee(const string& snum, double sal) : ssn(snum), salary(sal) {}
        friend class hFemp;   // hash function object type
    private:
        string ssn;
        double salary;
    };

    class hFemp
    {
    public:
        unsigned int operator() (const employee& item) const
        { return hFstring()(item.ssn); }   // call a temporary hFstring object
    };

    hash<employee, hFemp> hEmp(157);
        // a hash table with 157 buckets to store employee records
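The way a hash function object type plugs into a container can be tried out with the standard library. In this sketch std::unordered_set stands in for the slides' hash class, which is not reproduced here. The operator== on employee (two records match if their SSNs match), the getSalary accessor, and the unsigned arithmetic in hFstring are assumptions added so that the example compiles on its own.

```cpp
#include <string>
#include <unordered_set>

// String hash in the style of hFstring, using unsigned arithmetic.
class hFstring
{
public:
    unsigned int operator()(const std::string& item) const
    {
        const unsigned int prime = 2049982463u;
        unsigned int n = 0;
        for (std::string::size_type i = 0; i < item.length(); i++)
            n = n * 8 + static_cast<unsigned char>(item[i]);
        return n % prime;
    }
};

class employee
{
public:
    employee(const std::string& snum, double sal) : ssn(snum), salary(sal) {}

    // two records are the same employee if their SSNs match (assumption)
    bool operator==(const employee& rhs) const { return ssn == rhs.ssn; }
    double getSalary() const { return salary; }

    friend class hFemp;   // hash function object type
private:
    std::string ssn;
    double salary;
};

class hFemp
{
public:
    unsigned int operator()(const employee& item) const
    { return hFstring()(item.ssn); }   // hash the SSN with a temporary hFstring
};

// std::unordered_set stands in for the slides' hash class here.
typedef std::unordered_set<employee, hFemp> EmployeeTable;
```

Declaring EmployeeTable hEmp(157) requests at least 157 buckets. Inserting two records with the same SSN keeps only the first, since both hash to the same bucket and compare equal.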