FPGA Co-Processor Enhanced Ant Colony Systems Data Mining

Provided by: klabsOrgm5
Learn more at: http://www.klabs.org
1
FPGA Co-Processor Enhanced Ant Colony Systems
Data Mining
  • Jason Isaacs and Simon Y. Foo
  • Machine Intelligence Laboratory
  • FAMU-FSU College of Engineering
  • Department of Electrical and Computer Engineering

2
Presentation Outline
  • Introduction
  • Significance of Research
  • Concise Background on ACS
  • Summary of Data Mining focused on Clustering
  • Discussion of ACS-based Data Mining
  • FPGA Co-processor Enhancement
  • Conclusions
  • Future Work

3
Project Goal: to design and implement an Ant
Colony Systems toolbox for non-combinatorial
problem solving. This toolbox will comprise both
hardware- and software-based solutions.
4
Ant Colony Systems Project Overview
  • This work aims at advancing fundamental research
    in Ant Colony Systems.
  • The major objectives of this project are:
  • Develop a set of behavior models
  • Design ACS algorithms for solutions to
    non-combinatorial problems
  • Analyze algorithms for hardware implementations
  • Implement FPGA modules (current stage)
  • Incorporate all modules into a cohesive toolbox

5
Introduction to Ant Colony Systems
  • Ants are model organisms for bio-simulations due
    to both their relative individual simplicity and
    their complex group behaviors.
  • Colonies have evolved means for collectively
    performing tasks that are far beyond the
    capacities of individual ants. They do so without
    direct communication or centralized control:
    stigmergy.
  • Previous research: our use of simulated ants to
    generate random numbers proved a novel
    application for ACS.
  • Prior to 1992, ACS was used exclusively to study
    real ant behavior.
  • However, in the last decade, beginning with Marco
    Dorigo's 1992 PhD dissertation, Optimization,
    Learning and Natural Algorithms, which modeled the
    way real ants solve problems using pheromones,
    ant colony simulations have provided solutions to
    a variety of NP-hard combinatorial optimization
    problems.

6
ACS Application Area: Data Mining
  • Ant colony real-world behaviors applicable to
    data mining:
  • Ant Foraging
  • Cemetery Organization and Brood Sorting
  • Division of Labor and Task Allocation
  • Self-organization and Templates
  • Co-operative Transport
  • Nest Building

7
Cemetery Organization and Brood Sorting
8
Ant Colony Nest Examples
9
Flowchart for the ACS Data Mining System
10
Knowledge Discovery and Data Mining
  • What is Data Mining?
  • Discovery of useful summaries of data
  • Also, Data Mining refers to a collection of
    techniques for extracting interesting
    relationships and knowledge hidden in data.
  • It is best described as the nontrivial process
    of identifying valid, novel, potentially useful,
    and ultimately understandable patterns in data.
    (Fayyad et al., 1996)

11
Knowledge Discovery in Databases
12
Typical Tasks in Data Mining
  • Classification
  • Prediction
  • Clustering
  • Association Analysis
  • Summarization

13
Clustering
  • What is Clustering?
  • Given points in some space, often a
    high-dimensional space, group the points into a
    small number of clusters, each cluster consisting
    of points that are near in some sense.

14
The k-Means Algorithm
  • k-means picks k cluster centroids and assigns
    points to the clusters by picking the closest
    centroid to the point in question. As points are
    assigned to clusters, the centroid of the cluster
    may migrate.
  • Consider a very simple example of five points in
    two dimensions. Suppose we assign the points 1, 2,
    3, 4, and 5 in that order, with k = 2. Then
    points 1 and 2 are assigned to the two clusters,
    and become their centroids for the moment.
  • When we consider point 3, suppose it is closer to
    1, so 3 joins the cluster of 1, whose centroid
    moves to the point indicated as a. Suppose that
    when we assign 4, we find that 4 is closer to 2
    than to a, so 4 joins 2 in its cluster, whose
    center thus moves to b. Finally, 5 is closer to a
    than to b, so it joins the cluster {1, 3}, whose
    centroid moves to c.
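
The walk-through above can be sketched in a few lines of Python. This is only an illustration: the five coordinates are hypothetical (the slide's figure is not in the transcript), and `sequential_kmeans` is a name chosen here, not one from the presentation.

```python
import math

def sequential_kmeans(points, k):
    """Assign points in order; the first k points seed the k clusters."""
    centroids = [list(p) for p in points[:k]]
    counts = [1] * k
    labels = list(range(k))
    for p in points[k:]:
        # Pick the closest current centroid for this point.
        j = min(range(k), key=lambda c: math.dist(p, centroids[c]))
        counts[j] += 1
        # Incremental mean: the centroid migrates toward the new member.
        centroids[j] = [m + (x - m) / counts[j] for m, x in zip(centroids[j], p)]
        labels.append(j)
    return labels, centroids

# Hypothetical points 1..5 chosen so that 3 and 5 join point 1, and 4 joins point 2.
points = [(0.0, 0.0), (4.0, 4.0), (1.0, 0.0), (4.0, 5.0), (0.0, 1.0)]
labels, centroids = sequential_kmeans(points, k=2)
print(labels, centroids)
```

With these made-up points the first cluster's centroid moves twice (the slide's a, then c) and the second cluster's centroid moves once (the slide's b).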

15
The k-Means Algorithm
Having located the centroids of the k clusters,
we can reassign all points, since some points
that were assigned early may actually wind up
closer to another centroid, as the centroids move
about. If we are not sure of k, we can try
different values of k until we find the smallest
k such that increasing k does not much decrease
the average distance of points to their
centroids.
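
A rough sketch of that search over k follows. It assumes a plain batch k-means seeded with the first k points and a relative-improvement threshold `min_gain`; both are choices made for this example, since the slide does not say how "does not much decrease" is measured.

```python
import math

def lloyd(points, k, iterations=20):
    """Batch k-means seeded with the first k points; returns centroids and mean distance."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            groups[j].append(p)
        # Recompute each centroid; an empty cluster keeps its previous centroid.
        centroids = [[sum(col) / len(g) for col in zip(*g)] if g else centroids[j]
                     for j, g in enumerate(groups)]
    mean_dist = sum(min(math.dist(p, c) for c in centroids) for p in points) / len(points)
    return centroids, mean_dist

def smallest_good_k(points, max_k, min_gain=0.1):
    """Smallest k for which going to k + 1 improves the mean distance by less than min_gain."""
    _, prev = lloyd(points, 1)
    for k in range(1, max_k):
        _, nxt = lloyd(points, k + 1)
        if prev == 0 or (prev - nxt) / prev < min_gain:
            return k
        prev = nxt
    return max_k

# Two well-separated groups of three points each (illustrative data).
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(smallest_good_k(data, max_k=4))
```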
16
ACS Notation and Heuristics
$E = \{O_1, \dots, O_n\}$: the set of n data or objects
collected. $O_i = (v_1, \dots, v_k)$: each object is a
vector of k numerical attributes. Vector
similarity is measured by Euclidean distance (other
measures can be used: Minkowski, Hamming, or
Mahalanobis). $D_{max} = \max D(O_i, O_j)$, where
$O_i, O_j \in E$.
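
As a small illustration of this notation, the sketch below computes the maximum pairwise distance over a set of objects with a pluggable distance function. The function names are chosen for this example. Mahalanobis distance is left out because it needs a covariance estimate.

```python
from itertools import combinations

def minkowski(a, b, p=2):
    """Minkowski distance of order p; p = 2 gives the Euclidean distance."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def hamming(a, b):
    """Number of attribute positions in which the two vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def d_max(objects, dist=minkowski):
    """Maximum distance over all pairs of objects in the set."""
    return max(dist(a, b) for a, b in combinations(objects, 2))

E = [(0, 0), (3, 4), (1, 1)]
print(d_max(E))
```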
17
ACS Notation and Heuristics
  • The 2-D search area is an m × m grid. In general
    it must satisfy $m^2 \ge n$, but experiments have
    shown that $m^2 = 4n$ provides good results.
  • A heap/pile H is considered to be a collection of
    two or more objects. This collection is located
    on a given single cell rather than just spatially
    connected. This limitation prevents overlaps.
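
A one-line helper makes the grid sizing concrete. It assumes the rule of thumb of about four cells per object and rounds the side length up to the next integer; the function name is illustrative.

```python
import math

def grid_side(n, cells_per_object=4):
    """Smallest integer m with m * m >= cells_per_object * n (for n >= 1)."""
    return math.isqrt(cells_per_object * n - 1) + 1

print(grid_side(150), grid_side(178))
```

For example, a 150-object data set would get a 25 × 25 grid under this assumption.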

Spatial pattern cluster
Single-cell ranked cluster
18
ACS Distance Measures
  • Dmax is the maximum distance between two objects
    of H
  • Ocenter is the center of mass of all objects in
    H (not necessarily a real object)
  • Odissim is the most dissimilar object in H, i.e.
    the one that maximizes the distance to Ocenter
  • Dmean is the mean distance between the objects of
    H and the center of mass Ocenter
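
The four heap measures can be computed together, as in the sketch below. It assumes Euclidean distance and takes Odissim to be the object farthest from the center of mass. That reading is an assumption, since the maximized expression is missing from the transcript.

```python
import math
from itertools import combinations

def heap_stats(heap):
    """Return Dmax, Ocenter, Odissim and Dmean for a heap of two or more vectors."""
    n = len(heap)
    center = tuple(sum(col) / n for col in zip(*heap))        # center of mass
    d_max = max(math.dist(a, b) for a, b in combinations(heap, 2))
    dissim = max(heap, key=lambda o: math.dist(o, center))    # assumed definition
    d_mean = sum(math.dist(o, center) for o in heap) / n
    return {"d_max": d_max, "center": center, "dissim": dissim, "d_mean": d_mean}

stats = heap_stats([(0, 0), (2, 0), (4, 0), (10, 0)])
print(stats)
```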

19
ACS Unsupervised Learning and Clustering Algorithm
  • Initialize the ant positions randomly
  • Repeat
  • For each ant i Do
  • Move ant i
  • If ant i does not carry any object Then look at
    the 8-cell neighborhood and pick up an object
    according to the pick-up algorithm
  • Else (ant i is already carrying an object O) look
    at the 8-cell neighborhood and drop O according to
    the drop-off algorithm
  • Until stopping criterion
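
The loop above might be sketched as follows. Several details are assumptions made only to keep the example runnable: the grid is a dictionary from cell to list of objects, ants move one cell at random on a wrap-around grid, the stopping criterion is a fixed step count, and `simple_pick_up` / `simple_drop_off` are coin-flip placeholders rather than the pick-up and drop-off rules of the presentation.

```python
import random

MOVES = [(-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0)]

def neighbours(pos, m):
    x, y = pos
    return [((x + dx) % m, (y + dy) % m) for dx, dy in MOVES]

def simple_pick_up(grid, pos, m, rng):
    """Placeholder rule: take an object from a non-empty neighbour cell half the time."""
    for cell in neighbours(pos, m):
        if grid.get(cell) and rng.random() < 0.5:
            obj = grid[cell].pop()
            if not grid[cell]:
                del grid[cell]
            return obj
    return None

def simple_drop_off(grid, pos, m, obj, rng):
    """Placeholder rule: drop the carried object on a neighbour cell half the time."""
    for cell in neighbours(pos, m):
        if rng.random() < 0.5:
            grid.setdefault(cell, []).append(obj)
            return True
    return False

def run_ants(grid, m, n_ants, pick_up, drop_off, steps, seed=0):
    rng = random.Random(seed)
    ants = [{"pos": (rng.randrange(m), rng.randrange(m)), "load": None}
            for _ in range(n_ants)]                      # random initial positions
    for _ in range(steps):                               # Repeat ... Until
        for ant in ants:                                 # For each ant i
            dx, dy = rng.choice(MOVES)                   # Move ant i
            x, y = ant["pos"]
            ant["pos"] = ((x + dx) % m, (y + dy) % m)
            if ant["load"] is None:
                ant["load"] = pick_up(grid, ant["pos"], m, rng)
            elif drop_off(grid, ant["pos"], m, ant["load"], rng):
                ant["load"] = None
    return ants

grid = {(i, i): [("obj", i)] for i in range(10)}
ants = run_ants(grid, 10, 5, simple_pick_up, simple_drop_off, steps=50)
on_grid = sum(len(heap) for heap in grid.values())
carried = sum(ant["load"] is not None for ant in ants)
print(on_grid, carried)
```

Passing the two rules in as functions keeps the loop independent of the particular probabilities and thresholds.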

20
ACS Data Mining Algorithm: Top Level
  • Load Database
  • Data Compression
  • Object Clustering
  • Clustering of Similar Groups
  • Reevaluate Objects in Groups

21
ACS Data Mining Algorithm: Top Level
  • Load Database
  • Select Compression Method
  • Wavelets
  • Principal Component Analysis
  • None
  • Repeat for Max_Iterations1: Object Clustering
  • Ants Redistribute Objects
  • K-means
  • Repeat for Max_Iterations2: Clustering of
    Similar Groups
  • Ants Redistribute Piles (Clusters) of Objects
  • K-means
  • Repeat for Max_Iterations3: Reevaluate Objects
    in Groups
  • Ants Redistribute Objects in Clusters with a
    Probability based on the Least Similar Object's
    Distance from the Mean of the Cluster
  • K-means

22
ACS Object Pick-up Algorithm
  • Label 8-cell neighborhood as unexplored
  • Repeat
  • Consider the next unexplored cell c around ant i
    in the following order: cell 1 is NW, cell 2 is
    N, cell 3 is NE, and so on, where N is the
    direction the ant is facing.
  • If c is not empty Then do one of the following
  • If c contains a single object O, Then load O with
    probability Pload, Else
  • If c contains a heap of two objects, Then remove
    one of the two with a probability Pdestroy, Else
  • If c contains a heap H of more than 2 objects,
    Then remove the most dissimilar object Odissim(H)
    from H provided that
  • Label c as explored
  • Until all 8 cells have been explored or one
    object has been loaded
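
A hedged sketch of this pick-up rule follows. The condition after "provided that" is not in the transcript, so the test used here (distance of the most dissimilar object to the heap center, divided by the heap's mean distance, exceeding a hypothetical threshold `t_remove`) is an assumption. So are the clockwise order of cells 4 to 8 and the default probabilities.

```python
import math
import random

# Compass offsets clockwise from north, with +y pointing north (an assumption).
COMPASS = [(0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1)]

def ordered_neighbourhood(pos, heading):
    """Cells in visiting order: front-left, front, front-right, then clockwise."""
    x, y = pos
    return [(x + dx, y + dy)
            for dx, dy in (COMPASS[(heading - 1 + s) % 8] for s in range(8))]

def take(grid, cell, obj):
    grid[cell].remove(obj)
    if not grid[cell]:
        del grid[cell]
    return obj

def pick_up(grid, pos, heading, rng, p_load=0.5, p_destroy=0.5, t_remove=1.5):
    for cell in ordered_neighbourhood(pos, heading):     # each cell explored once
        heap = grid.get(cell)
        if not heap:
            continue
        if len(heap) == 1:
            if rng.random() < p_load:
                return take(grid, cell, heap[0])
        elif len(heap) == 2:
            if rng.random() < p_destroy:
                return take(grid, cell, rng.choice(heap))
        else:
            center = tuple(sum(c) / len(heap) for c in zip(*heap))
            d_mean = sum(math.dist(o, center) for o in heap) / len(heap)
            dissim = max(heap, key=lambda o: math.dist(o, center))
            # Assumed removal condition; t_remove is a hypothetical parameter.
            if d_mean > 0 and math.dist(dissim, center) / d_mean > t_remove:
                return take(grid, cell, dissim)
    return None

grid = {(0, 1): [(0.0,), (0.1,), (0.2,), (5.0,)]}
picked = pick_up(grid, (0, 0), heading=0, rng=random.Random(0))
print(picked, grid)
```

Here the outlier (5.0,) is removed from the heap north of the ant, because it lies twice as far from the heap center as the average member.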

23
ACS Object Drop-off Algorithm
  • Label 8-cell neighborhood as unexplored
  • Repeat
  • Consider the next unexplored cell c around ant i
    in the following order: cell 1 is NW, cell 2 is
    N, cell 3 is NE, and so on, where N is the
    direction the ant is facing.
  • If c is empty Then drop O in cell with a
    probability Pdrop, Else
  • If c contains a single object O, Then drop O to
    create a heap H provided that
    Else
  • If c contains a heap H, Then drop O on H provided
    that
  • Label c as explored
  • Until all 8 cells have been explored or carried
    object has been dropped

24
Parameter Table
25
K-means Algorithm
  • Take as input the partition P of the data set
    found by the ants, in the form of k heaps
    $H_1, \dots, H_k$
  • Repeat
  • Compute $O_{center}(H_1), \dots, O_{center}(H_k)$
  • Remove all objects from the heaps
  • For each object $O_i \in E$
  • Let $H_j$, $j \in \{1, \dots, k\}$, be the heap whose
    center is the closest to $O_i$
  • Assign $O_i$ to $H_j$
  • Compute the resulting new partition
    $P = \{H_1, \dots, H_k\}$ by removing all empty clusters
  • Until stopping criterion
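
The refinement step can be sketched as below. The stopping criterion is not specified on the slide, so this version stops when the partition no longer changes or after `max_iterations`, both assumptions. Euclidean distance is assumed as well.

```python
import math

def kmeans_from_heaps(heaps, max_iterations=100):
    """Refine a partition (list of non-empty heaps of vectors) found by the ants."""
    objects = [o for heap in heaps for o in heap]
    for _ in range(max_iterations):
        # Centers of mass of the current heaps.
        centers = [tuple(sum(c) / len(h) for c in zip(*h)) for h in heaps]
        new_heaps = [[] for _ in centers]                 # remove all objects
        for o in objects:
            j = min(range(len(centers)), key=lambda c: math.dist(o, centers[c]))
            new_heaps[j].append(o)                        # assign to closest center
        new_heaps = [h for h in new_heaps if h]           # drop empty clusters
        if new_heaps == heaps:                            # assumed stopping criterion
            break
        heaps = new_heaps
    return heaps

# Two badly mixed heaps that k-means untangles (illustrative data).
refined = kmeans_from_heaps([[(0.0,), (10.0,)], [(1.0,), (11.0,)]])
print(refined)
```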

26
Benchmark Databases
  • The following public domain data sets were
    obtained from the UCI (University of California
    at Irvine) - Machine Learning Repository. These
    have been used extensively for classification
    tasks using different paradigms. The main
    characteristics of each of these domains are
    described in the three slides.

27
Tested Databases
  • Golf
  • Very simple database, 4 attributes, 2 classes
  • Balloons
  • The influence of prior knowledge on concept
    acquisition, 4 data sets, 4 attributes, 2 classes
  • Wine
  • Well behaved class structure, 178 instances, 13
    attributes, 3 classes
  • Hepatitis
  • Poorly distributed database, 155 instances, 19
    attributes, 2 classes
  • Iris (plant)
  • Very popular database, 150 instances, 4
    attributes, 3 classes.
  • Wisconsin Breast Cancer
  • High dimensional database, 198 instances, 32
    attributes, 2 classes

28
Golf Data Results
[Tables: given data, numerical equivalent, normalized data]
29
Golf Data Results
[Figure: number in cluster versus position of cluster for objects 1-14; clusters labeled Don't Play, Play, Don't Play; one cluster marked ERROR]
30
Golf Data Results
[Figure: number in cluster versus position of cluster for objects 1-14; clusters labeled Play, Don't Play, Don't Play; no errors]
31
Wine Database
Data are the results of a chemical analysis of
wines grown in the same region in Italy but
derived from three different cultivars.
  • Attribute Information
  • 1. Alcohol
  • 2. Malic acid
  • 3. Ash
  • 4. Alcalinity of ash
  • 5. Magnesium
  • 6. Total phenols
  • 7. Flavanoids
  • 8. Nonflavanoid phenols
  • 9. Proanthocyanins
  • 10. Color intensity
  • 11. Hue
  • 12. OD280/OD315 of diluted wines
  • 13. Proline

Number of instances: class 1: 59, class 2: 71,
class 3: 48

  • Error: 0.050562
  • 5 class 1 mislabeled as class 2
  • 3 class 2 mislabeled as class 3
  • 1 class 3 mislabeled as class 2

32
Iris (Plant) Database
This is perhaps the best known database to be
found in the pattern recognition literature.
  • Attribute Information
  • 1. sepal length in cm
  • 2. sepal width in cm
  • 3. petal length in cm
  • 4. petal width in cm

Number of instances: 150 (50 in each of three
classes): Iris Setosa, Iris Versicolour,
Iris Virginica
Errors: 0.047 (4 mislabeled as type 2, 3 mislabeled
as type 3)
Errors: 0.04 (2 mislabeled as type 3, 4 mislabeled
as type 2)
33
ACS DM Optimization of Parameters
  • Number of Total Iterations
  • Compression Method (PCA, Wavelet, None)
  • Cluster Method
  • Objects Only
  • Objects and Groups of Objects
  • Objects, Groups, then Objects again
  • Number of Ants
  • K-Means Iterations
  • Distance Measure (Euclidean, Minkowski, Hamming,
    or Mahalanobis)
  • Others (RNG, Ants Movement Distance, Ant Carrying
    Capacity)

34
ACS DM Object Grouping Only
35
ACS DM Object and Cluster Grouping Only
36
ACS DM Object, Cluster, and Object
37
Why Move to Hardware?
  • For such large datasets the ACS classifier
    performs remarkably well. However,
  • Speed of classification is very limited in
    software.
  • The computational bottlenecks lie in the number
    of multiplies and adds that must be performed for
    each object. In addition, the requirement of a
    square root for each distance measurement adds
    complexity.

38
Target Hardware: Avnet's Virtex-II Pro Board
  • Uses Virtex-II Pro XC2VP20
  • Many options for I/O
  • 32-bit PCI bus has data throughput of over 100 MB
    per second

39
ACS-DM System Top-Level HW
40
ACS-DM Hardware Design
41
K-Means Distance Calculator with CORDIC Square
Root
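
For readers unfamiliar with the technique, the sketch below shows one common way to compute a square root with CORDIC: hyperbolic vectoring mode using only shift-and-add style updates. It is a floating-point illustration, not the design on this slide. The iteration count, the input range (roughly 0.03 to 2.3 without prescaling) and the function name are assumptions of this example.

```python
import math

def cordic_sqrt(v, iterations=24):
    """Square root by hyperbolic CORDIC in vectoring mode (illustrative, float math)."""
    if v <= 0:
        raise ValueError("v must be positive")
    # With x = v + 1/4 and y = v - 1/4, x*x - y*y equals v.
    x, y = v + 0.25, v - 0.25
    gain = 1.0
    i, repeat_at, done = 1, 4, 0
    while done < iterations:
        # Hyperbolic CORDIC repeats iterations 4, 13, 40, ... to converge.
        for _ in range(2 if i == repeat_at else 1):
            t = 2.0 ** -i                      # a right shift by i bits in hardware
            d = -1.0 if y >= 0 else 1.0        # rotate so that y is driven to zero
            x, y = x + d * y * t, y + d * x * t
            gain *= math.sqrt(1.0 - t * t)     # constant, precomputed in hardware
            done += 1
        if i == repeat_at:
            repeat_at = 3 * repeat_at + 1
        i += 1
    return x / gain

for value in (0.1, 0.5, 1.0, 2.0):
    print(value, cordic_sqrt(value), math.sqrt(value))
```

Each micro-rotation scales x*x - y*y by a known factor, so once y reaches zero x holds the square root times a fixed gain. A hardware version would fold that gain into a constant multiply and prescale the input into the convergence range.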
42
Device Utilization Summary
  • Selected Device: 2vp20ff896-6
  • Number of Slices: 6600 out of 9280 (71%)
  • Number of Slice Flip Flops: 8312 out of 18560 (44%)
  • Number of 4-input LUTs: 7661 out of 18560 (41%)
  • Number of bonded IOBs: 266 out of 556 (48%)
  • Number of BRAMs: 3 out of 88 (3%)
  • Number of MULT18X18s: 8 out of 88 (9%)
  • Number of GCLKs: 1 out of 16 (6%)

  • TIMING REPORT
  • Clock Information
  • Clock signal: clk; clock buffer (FF name): BUFGP;
    load: 1419
  • Timing Summary
  • Minimum period: 16.499 ns (maximum frequency:
    60.611 MHz)

The CORDIC Sqrt data path is the greatest
bottleneck, causing the long clock period.
43
Hardware Euclidean Distance Result
V1: 0.83812 0.01964 0.68128 0.37948 0.8318 0.50281 0.70947 0.42889 0.30462 0.18965 0.19343 0.68222 0.30276 0.54167 0.15087 0.6979 0.37837 0.86001 0.85366 0.59356
V2: 0.49655 0.89977 0.82163 0.64491 0.81797 0.66023 0.34197 0.28973 0.34119 0.53408 0.72711 0.30929 0.8385 0.56807 0.37041 0.70274 0.54657 0.44488 0.69457 0.62131
  • Result from Matlab: 1.5058
  • Result from hardware: 1.5172
  • Vectors are Fix 8_7 on input
  • Then, after the add: Fix 9_7
  • Then, after the multiply: Fix 18_14
  • Then, after the accumulate: Fix 20_14
  • Then, after the CORDIC Sqrt: Fix 42_36
  • Error is present in round-off and the CORDIC Sqrt
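
The floating-point reference can be reproduced from the two vectors on this slide, and the effect of quantizing the inputs to 7 fractional bits can be estimated in software. The sketch below assumes round-to-nearest quantization and uses an exact square root, so it models only the input round-off, not the CORDIC stage or the hardware's actual rounding mode.

```python
import math

V1 = [0.83812, 0.01964, 0.68128, 0.37948, 0.8318, 0.50281, 0.70947, 0.42889,
      0.30462, 0.18965, 0.19343, 0.68222, 0.30276, 0.54167, 0.15087, 0.6979,
      0.37837, 0.86001, 0.85366, 0.59356]
V2 = [0.49655, 0.89977, 0.82163, 0.64491, 0.81797, 0.66023, 0.34197, 0.28973,
      0.34119, 0.53408, 0.72711, 0.30929, 0.8385, 0.56807, 0.37041, 0.70274,
      0.54657, 0.44488, 0.69457, 0.62131]

def quantize(x, frac_bits):
    """Round x to the nearest multiple of 2**-frac_bits (assumed rounding mode)."""
    scale = 1 << frac_bits
    return round(x * scale) / scale

def quantized_distance(a, b, in_frac=7, acc_frac=14):
    """Euclidean distance with inputs held to in_frac fractional bits."""
    acc = 0.0
    for x, y in zip(a, b):
        d = quantize(x, in_frac) - quantize(y, in_frac)   # fits 7 fractional bits
        acc += quantize(d * d, acc_frac)                  # fits 14 fractional bits
    return math.sqrt(acc)                                 # exact sqrt, not CORDIC

reference = math.dist(V1, V2)
approx = quantized_distance(V1, V2)
print(round(reference, 4), approx)
```

Comparing `approx` with `reference` separates the input round-off from the square-root stage, which may help when tracking down where the hardware result drifts.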

44
Ant Colony Actions Movement
CARNG is a simple 32-bit rule 30 cellular automaton
random number generator that is user-initialized
for reproducibility
[Block diagram: RNG Ant(1), RNG Ant(2), ..., RNG Ant(N); Ant Move-Direction Filter; Current Location Data; Ant Colony Data (Current Location, Last Location, Have Data Status); Ant Change Location; New Location Data]
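
A 32-bit rule 30 generator of the kind named on this slide can be sketched as follows. The wrap-around boundary, the class name `CARNG` as used here, and the use of the three low bits as a move direction are assumptions of this example, not details taken from the hardware design.

```python
MASK = 0xFFFFFFFF

def rule30_step(state):
    """One rule 30 update of a 32-cell ring: new = left XOR (center OR right)."""
    left = ((state >> 1) | (state << 31)) & MASK    # cell i sees cell i + 1
    right = ((state << 1) | (state >> 31)) & MASK   # cell i sees cell i - 1
    return left ^ (state | right)

class CARNG:
    """User-seeded rule 30 generator; the same seed gives the same sequence."""
    def __init__(self, seed):
        if seed & MASK == 0:
            raise ValueError("seed must be non-zero: the all-zero state is fixed")
        self.state = seed & MASK

    def next_direction(self):
        # Taking the low three bits as one of 8 directions is an assumption.
        self.state = rule30_step(self.state)
        return self.state & 0b111

rng_a, rng_b = CARNG(0x12345678), CARNG(0x12345678)
seq_a = [rng_a.next_direction() for _ in range(16)]
seq_b = [rng_b.next_direction() for _ in range(16)]
print(seq_a)
```

Seeding each ant's generator with a different user-chosen value would give every ant its own repeatable stream.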
45
Pheromone Trail Result from Hardware
Co-simulation
A single ant is simulated for clarity; darker
red marks the most recent position
46
Ant Colony Actions Object Load/Drop
[Block diagram: Were Probabilities and Thresholds Met?; Enable Drop/Load (Y/N); Current Location; Current Location Carried Status; Object Information; Carried Status; Ant Change Have Data Status; Current Have Data Status; Ant Colony Data (Current Location, Last Location, Have Data Status); New Have Data Status]
47
ACS DM Hardware Storage Requirements
  • Preprocessed Data (Number of Objects × Vector
    Length, 8- to 32-bit fixed-point)
  • Object Vectors
  • Object Locations
  • Object Status
  • Parameter Values (16 × 32-bit fixed-point)
  • Probabilities
  • Thresholds
  • Limits
  • Max Distance (1 × 32-bit fixed-point)
  • Groups (Number of Objects × Number of Groups,
    1-bit, and 3 × Number of Groups, 8-bit)
  • Members
  • Means (Object Vector Length × 32-bit fixed-point)
  • Locations
  • Ant Locations and Have-Object Status (Number of
    Ants × 8-bit, plus 1-bit status)

49
PCI Bridge
50
Block Diagram
  • Virtex-II Pro is focal point.
  • Spartan acts as bridge to PCI
  • On Board Memory
  • 32 MB SDRAM
  • 2 MB SRAM
  • 16 MB FLASH
  • 128 MB DDR SDRAM
  • 64 MB Compact Flash
  • Ethernet
  • RS232
  • 4 AvBus Connectors
  • 2 PMC Connectors

51
Conclusions/Future Work
  • Continue to design the ACS Data Mining System
  • Implement an improved Memory Manager
  • Correct Errors associated with Round-off and the
    CORDIC Sqrt.
  • Implement the Group Clustering Algorithm
  • Optimize the PC/FPGA interfacing to create our
    own low-cost integrated system.
  • Our problems currently reside in the PCI
    interface design shipped with the Avnet
    Development Board. We are working hard to resolve
    this issue, but in the end we may have to
    consider another board. Also shown in
    presentation P248.
  • We also need to improve the speed. 60 MHz is too
    slow.
  • Optimize data throughput and the calculating
    efficiency of the distance metric algorithm,
    i.e., consider a multi-stage pipeline or employ
    more look-up tables.
  • The ultimate goal is to demonstrate the ability
    of ACS algorithms to perform as well as other
    well-known techniques while allowing for
    computational speed-up utilizing FPGAs as
    co-processors.