Title: Research Activities at Center for Applied Vision and Imaging Sciences and Florida State Vision Group, Florida State University
1Research Activities at Center for Applied Vision and Imaging Sciences and Florida State Vision Group, Florida State University
- Xiuwen Liu
- Department of Computer Science
- Florida State University
- http://cavis.fsu.edu http://fsvision.fsu.edu
2Research Statement
- My research goal is to create machines that can see with human-like performance
- This seems a trivial problem, as each of us can do this without any effort
- Computer + Camera = A Seeing Machine?
3Visual Pathway
4Visual Illusion
5Outline
- Motivations
- Some applications of computer vision and pattern recognition techniques
- Some of the research projects
- Related Courses
- Contact information
6Computer Vision Applications
- No hands across America
- Sponsored by Delco Electronics, AssistWare Technology, and Carnegie Mellon University
- Navlab 5 drove from Pittsburgh, PA to San Diego, CA, using the RALPH computer program
- The trip was 2849 miles, of which 2797 miles (98.2%) were driven automatically with no hands
7Computer Vision Applications continued
8Computer Vision Applications continued
10Human-Computer Interactions
11Sign Language Recognition
12CyberKnife
13CyberKnife Cont.
14Image-Guided Neurosurgery
15Intelligent Transportation Systems
http://dfwtraffic.dot.state.tx.us/dal-cam-nf.asp
16Computer Vision Applications cont.
- Military applications
- Automated target recognition
17Computer Vision Applications continued
18Biometrics cont.
Iris code can achieve zero false acceptance
19Computer Vision in Sports
- How was the yellow created?
20Generic Image Modeling
- How can we characterize all these images
perceptually?
21Spectral Histogram Representation
- Spectral histogram
- Given a bank of filters $F^{(\alpha)}$, $\alpha = 1, \ldots, K$, a spectral histogram is defined as the marginal distribution of filter responses
22Spectral Histogram Representation - continued
- Choice of filters
- Laplacian of Gaussian filters
- Gabor filters
- Gradient filters
- Intensity filter
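The definition above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the kernels below (a 1 x 1 intensity filter, two central-difference gradients, and a discrete Laplacian standing in for a Laplacian of Gaussian), the bin count, and the value range are all chosen for the example and are not the filters used in the slides.

```python
import numpy as np

def filter_response(image, kernel):
    """Valid-mode 2-D correlation of image with kernel, using NumPy only."""
    windows = np.lib.stride_tricks.sliding_window_view(image, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)

def spectral_histogram(image, kernels, bins=8, value_range=(-1.0, 1.0)):
    """Concatenate the normalized histograms of all filter responses."""
    parts = []
    for kernel in kernels:
        response = filter_response(image, kernel)
        counts, _ = np.histogram(response, bins=bins, range=value_range)
        parts.append(counts / counts.sum())
    return np.concatenate(parts)

# Illustrative filter bank (assumed kernels, not the slides' exact filters).
intensity = np.array([[1.0]])
grad_x = np.array([[-0.5, 0.0, 0.5]])
grad_y = grad_x.T
laplacian = np.array([[0.0, 0.25, 0.0],
                      [0.25, -1.0, 0.25],
                      [0.0, 0.25, 0.0]])
bank = [intensity, grad_x, grad_y, laplacian]

rng = np.random.default_rng(0)
image = rng.random((32, 32))   # synthetic image with values in [0, 1)
hist = spectral_histogram(image, bank)
print(hist.shape)              # one block of `bins` values per filter
```

Each filter contributes one normalized histogram, so the representation discards where a response occurs and keeps only how often each response value occurs.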
23Spectral Histogram Representation - continued
24Texture Synthesis Examples - continued
Observed image
Synthesized image
- An image with periodic structures
25Object Synthesis Examples - continued
26Performance Comparison
27Face Detection Based On Spectral Representations
- Face detection is to detect all instances of faces in a given image
- Each image window is represented by its spectral histogram
- A support vector machine is trained on training faces
- Then the trained support vector machine is used to classify each image window in an input image
- More results at http://fsvision.fsu.edu/face-detection
28Face detection - continued
29Face detection - continued
31Face detection - continued
32Rotation Invariant Face Detection
33Rotation Invariant Face Detection - continued
34Linear Representations
- Linear representations are widely used in appearance-based object recognition and other applications
- Simple to implement and analyze
- Efficient to compute
- Effective for many applications
35Standard Linear Representations
- Principal Component Analysis
- Designed to minimize the reconstruction error on the training set
- Obtained by calculating eigenvectors of the covariance matrix
- Fisher Discriminant Analysis
- Designed to maximize the separation between means of each class
- Obtained by solving a generalized eigenvalue problem
- Independent Component Analysis
- Designed to maximize the statistical independence among coefficients along different directions
- Obtained by solving an optimization problem with some objective function such as mutual information, negentropy, ...
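As a concrete instance of the first representation, the following is a minimal PCA sketch: it takes the leading eigenvectors of the sample covariance matrix and measures the reconstruction error on the training set. The synthetic data and the helper names `pca_basis` and `reconstruction_error` are assumptions made for the example.

```python
import numpy as np

def pca_basis(X, d):
    """Return the mean and the top-d covariance eigenvectors of X (rows are samples)."""
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:d]        # indices of the d largest
    return mean, eigvecs[:, order]

def reconstruction_error(X, mean, U):
    """Mean squared error after projecting onto span(U) and mapping back."""
    coeffs = (X - mean) @ U
    recon = coeffs @ U.T + mean
    return float(np.mean(np.sum((X - recon) ** 2, axis=1)))

rng = np.random.default_rng(0)
# Synthetic training set: 200 samples, 10 dimensions with decreasing spread.
X = rng.normal(size=(200, 10)) * np.linspace(3.0, 0.1, 10)

mean, U2 = pca_basis(X, 2)
_, U5 = pca_basis(X, 5)
err2 = reconstruction_error(X, mean, U2)
err5 = reconstruction_error(X, mean, U5)
print(err2, err5)   # keeping more components cannot increase the error
```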
36Standard Linear Representations - continued
- Standard linear representations are suboptimal for recognition applications
- Evidence in the literature
- A toy example
- Standard representations give the worst recognition performance
- Optimal component analysis
37Performance Measure - continued
- Suppose there are C classes to be recognized
- Each class has $k_{train}$ training images
- Each class also has $k_{cross}$ cross-validation images
- We used $h(x) = 1 / (1 + \exp(-2 \beta x))$
38Performance Measure - continued
- F(U) depends on the span of U but is invariant to change of basis
- In other words, F(U) = F(UO) for any orthonormal matrix O
- The search space of F(U) is the set of all the subspaces, which is known as the Grassmann manifold
- It is not a flat vector space, and gradient flow must take the underlying geometry of the manifold into account
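The invariance can be checked numerically. The sketch below assumes that F depends on U only through Euclidean distances between projected images (as in nearest-neighbor recognition); the slides do not spell this out, so treat it as an illustration rather than the definition of F.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 8, 3

# U: n x d with orthonormal columns; O: d x d orthogonal (both random).
U, _ = np.linalg.qr(rng.normal(size=(n, d)))
O, _ = np.linalg.qr(rng.normal(size=(d, d)))

# Two arbitrary "images" in R^n.
x, y = rng.normal(size=n), rng.normal(size=n)

# Distance between the projected images under the basis U and under U @ O.
dist_U = np.linalg.norm(U.T @ x - U.T @ y)
dist_UO = np.linalg.norm((U @ O).T @ x - (U @ O).T @ y)
print(np.isclose(dist_U, dist_UO))
```

Because O only rotates the coefficient vectors, every pairwise distance is preserved, so any performance measure built from those distances takes the same value at U and at UO.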
39Deterministic Gradient Flow - continued
- Gradient at J (first d columns of n x n
identity matrix)
40Deterministic Gradient Flow - continued
- Gradient at U: compute Q such that QU = J
- Deterministic gradient flow on Grassmann manifold
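One way to obtain a matrix Q with QU = J is to complete U to an orthonormal basis of the whole space and transpose it. The helper name `orthogonal_map_to_J` is hypothetical, and this construction is a sketch of one valid choice of Q, not necessarily the one used in the original work.

```python
import numpy as np

def orthogonal_map_to_J(U):
    """Return an orthogonal n x n matrix Q such that Q @ U is the first d columns of I_n."""
    n, d = U.shape
    # Complete QR: the last n - d columns span the orthogonal complement of span(U).
    full_q, _ = np.linalg.qr(U, mode="complete")
    # Orthonormal basis of R^n whose first d columns are exactly U.
    basis = np.hstack([U, full_q[:, d:]])
    return basis.T

rng = np.random.default_rng(0)
n, d = 6, 2
U, _ = np.linalg.qr(rng.normal(size=(n, d)))   # random n x d orthonormal basis
Q = orthogonal_map_to_J(U)
J = np.eye(n)[:, :d]
print(np.allclose(Q @ U, J), np.allclose(Q @ Q.T, np.eye(n)))
```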
41Stochastic Gradient and Updating Rules
- Stochastic gradient is obtained by adding a stochastic component
- Discrete updating rules
42MCMC Simulated Annealing Optimization Algorithm
- Let $X_0$ be any initial condition and set $t = 0$
- Calculate the gradient matrix $A(X_t)$
- Generate $d(n-d)$ independent realizations of the $w_{ij}$'s
- Compute Y (a candidate for $X_{t+1}$) according to the updating rules
- Compute $F(Y)$ and $F(X_t)$ and set $dF = F(Y) - F(X_t)$
- Set $X_{t+1} = Y$ with probability $\min\{\exp(dF / D_t), 1\}$
- Set $D_{t+1} = D_t / \gamma$ and set $t = t + 1$
- Repeat from the gradient calculation step
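The acceptance rule and cooling schedule above can be sketched as follows. Two things are assumptions made to keep the example short: the proposal is a random perturbation followed by re-orthonormalization (not the gradient-plus-noise update on the Grassmann manifold), and the objective F is a toy function rather than a recognition performance measure.

```python
import math
import numpy as np

def anneal_subspace(F, X0, steps=200, temp0=1.0, gamma=1.01, noise=0.1, seed=0):
    """Metropolis-style annealing that tries to maximize F over n x d orthonormal bases."""
    rng = np.random.default_rng(seed)
    X, temp = X0, temp0
    best_X, best_F = X0, F(X0)
    for _ in range(steps):
        # Proposal (simplified): perturb, then re-orthonormalize with QR.
        Y, _ = np.linalg.qr(X + noise * rng.normal(size=X.shape))
        dF = F(Y) - F(X)
        # Accept with probability min{exp(dF / temp), 1}.
        if dF >= 0 or rng.random() < math.exp(dF / temp):
            X = Y
            if F(X) > best_F:
                best_X, best_F = X, F(X)
        temp /= gamma   # cooling: D_{t+1} = D_t / gamma
    return best_X, best_F

# Toy objective (an assumption): variance captured by the subspace, which is
# unchanged by any change of basis within span(U).
S = np.diag([5.0, 3.0, 1.0, 0.5, 0.1])

def F(U):
    return float(np.trace(U.T @ S @ U))

X0 = np.eye(5)[:, 3:]    # start from the two weakest directions
X_best, F_best = anneal_subspace(F, X0)
print(F(X0), F_best)
```

Early on, the high temperature lets the chain accept moves that lower F; as the temperature is divided by gamma at every step, the rule approaches greedy ascent.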
43ORL Face Dataset
44Performance Comparison
45Performance Comparison cont.
46Brain Curve Classification
47Brain Curve Classification cont.
48Real-time Scene Interpretation
- Object detection and recognition problem
- Given a set of images, find regions in these images which contain instances of relevant objects
- Here the number of relevant objects is assumed to be large
- For example, the system should be able to handle 30,000 different kinds of objects, an estimate of the human brain's capacity for basic-level visual categorization (I. Biederman, Psychological Review, vol. 94, pp. 115-147, 1987)
49Global Monitoring Through High-resolution
Satellite Images
50Problem Statement for Scene Interpretation
- Object detection and recognition problem
- Given a set of images, find regions in these images which contain instances of relevant objects
- Here the number of relevant objects is assumed to be large
- For example, the system should be able to handle 30,000 different kinds of objects, an estimate of the human capacity for basic-level visual categorization (I. Biederman, Psychological Review, vol. 94, pp. 115-147, 1987)
- Goal
- Develop a system that can achieve real-time detection and recognition for images of size 640 x 480 with high accuracy
- Say, at a frame rate of 15 frames per second
51Existing Approaches
- Fast methods but low accuracy
- One can, for example, classify one pixel at a time
- However, it is difficult to identify airplanes with high accuracy this way, because of the high rates of false positives and false negatives
52Existing Approaches cont.
- Fast methods but low accuracy
- One can, for example, classify one pixel at a time
- However, it is difficult to identify airplanes with high accuracy
- Methods with good accuracy but slow
- One can in theory use deformable template matching to locate instances of airplanes
- It may need several hours to process one image
53Proposed Framework
54Specifications and Requirements
- We want to detect and recognize at least 30,000 object classes in images
- At four different scales
- Using exhaustive search of local windows; that is, we do not assume segmentation or other pre-processing
- If we assume objects are in some (e.g. 21 x 21) windows, this means that there will be many (18,432,000) local windows per second to be classified/processed
- We want to do this on a 3.6 GHz Dell Precision workstation with an estimated performance of 28,665.4 MIPS
- This leaves about 1555 instructions to process each 21 x 21 local window
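The window count and the per-window budget follow from the stated goal of 640 x 480 images at 15 frames per second. The short calculation below assumes one local window per pixel position at each of the four scales and ignores image borders.

```python
width, height = 640, 480     # image size from the stated goal
frames_per_second = 15       # target frame rate from the stated goal
scales = 4                   # number of scales searched

# One local window per pixel position, per scale, per frame.
windows_per_second = width * height * frames_per_second * scales

mips = 28_665.4              # estimated workstation performance
budget = mips * 1e6 / windows_per_second

print(windows_per_second)    # 18432000
print(round(budget, 1))      # 1555.2
```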
55Requirements cont.
- To achieve the specifications, we need two critical components
- A classifier that can reduce the average classification time effectively
- Note that on average we have 1555 instructions per window; if we can process 90% of those windows using only 100 instructions per window, we have on average 14,650 instructions for each of the remaining 10% of local windows
- Features that can discriminate a large number of objects and can be computed using a few instructions
- Do such features exist?
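The 14,650 figure is a weighted-average calculation: whatever the cheap windows do not spend is left for the expensive ones. A small check, using integer percentages to avoid rounding noise (the variable names are illustrative):

```python
average_budget = 1555   # instructions per window, on average
fast_percent = 90       # share of windows handled by the cheap path
fast_cost = 100         # instructions spent on each of those windows

slow_percent = 100 - fast_percent
# Solve: fast_percent * fast_cost + slow_percent * slow_budget = 100 * average_budget
slow_budget = (100 * average_budget - fast_percent * fast_cost) / slow_percent
print(slow_budget)      # 14650.0
```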
56Topological Local Spectral Histograms
- We introduce a new class of features, which we call TLSH features
- They are defined relative to a chosen set of filters
- For a given filter, a TLSH feature is defined as a histogram of a local window of the filtered image
- One bin of the histogram is given by
57Topological Local Spectral Histogram Example
Convolution is implemented using FPGAs
58Local Spectral Histogram Features
59Field Programmable Gate Arrays
- Two primary methods for computation
- Hard-wired Application Specific Integrated Circuit (ASIC)
- Software-programmed microprocessors
- New approach
- Programmable hardware
- Field Programmable Gate Arrays (FPGAs) represent a breakthrough in computing technology
- Especially for intrinsically parallel applications
60µP/ ASIC / FPGA Comparison Summary
µP | ASIC | FPGA
Programmable (flexible) | Fixed design functionality (inflexible) | Programmable (flexible)
Relatively slow, serial computation | Very fast, highly parallelized computation | Fast, parallel computation
Floating and fixed point | Fixed point / floating | Fixed point / floating
Relatively inexpensive design cycle (software) | Expensive design cycle (requires chip design) | Relatively inexpensive design cycle
Limited bandwidth | Very high bandwidth | Near-ASIC bandwidth
Standard high-level languages (C/C++ or assembly) | Hardware description language for design/simulation (VHDL / Verilog) | Hardware description language for design/simulation (VHDL / Verilog)
61Hardware vs. Software
- Sum = 0.0
- i = 0
- while (i < L)
- tmp = x(i) * h(i)
- Sum = Sum + tmp
- i = i + 1
- end
A typical software implementation takes 4L instructions to compute one convolution
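The loop can be written out in Python with an explicit operation counter to show where the 4L figure comes from. Counting one "instruction" each for the comparison, multiply, add, and increment is a simplification of what a real processor executes; the function name is illustrative.

```python
def convolve_at_one_position(x, h):
    """Multiply-accumulate over L taps, counting four operations per iteration."""
    total = 0.0
    ops = 0
    i = 0
    while i < len(h):          # 1: loop comparison
        tmp = x[i] * h[i]      # 2: multiply
        total = total + tmp    # 3: accumulate
        i = i + 1              # 4: increment
        ops += 4
    return total, ops

x = [1.0, 2.0, 3.0, 4.0]
h = [0.5, 0.25, 0.25, 0.5]
total, ops = convolve_at_one_position(x, h)
print(total, ops)              # 3.75 16
```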
62Hardware vs. Software
- A custom hardware implementation
Multiply/accumulate performed in parallel; can be done in one clock cycle
63Convolution Timing Diagram
[Timing diagram: clock and convolution start signal; all nine response values finish together, with 9 new response values every 7 clock cycles]
64Topological Local Spectral Histograms cont.
- Why TLSH features?
- They provide a very rich set of over-complete features
- For example, with 22 filters there are 1,173,942 different TLSH features within a 21 x 21 region, considering different windows and different filters
- TLSH features are more effective than the Haar features used by Viola and Jones (P. Viola and M. Jones, International Journal of Computer Vision, vol. 57, pp. 137-154, 2004)
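One way to arrive at the 1,173,942 figure is to count every axis-aligned rectangular sub-window of the 21 x 21 region once per filter. That reading is an assumption, but it reproduces the number exactly:

```python
def count_subwindows(side):
    """Number of axis-aligned rectangular sub-windows of a side x side pixel grid."""
    spans = side * (side + 1) // 2   # choices of (first, last) pixel along one axis
    return spans * spans

num_filters = 22
features = num_filters * count_subwindows(21)
print(features)   # 1173942
```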
65ORL Face Dataset
66Comparison Between Haar and TLSH Features
67COIL Dataset
68Comparison Between Haar and TLSH Features
69Texture Dataset
70Comparison Between Haar and TLSH Features
71Mixed Dataset
72Comparison Between Haar and TLSH Features
73Comparison Between Haar and TLSH Features
74Classifier
- To achieve the specification, we also need a classifier that takes only a few instructions to make a decision on average
- At the same time, we need to achieve high accuracy
- We propose to use a look-up table tree classifier
- I.e., a decision tree classifier where each node is implemented by a look-up table
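A minimal sketch of the idea, in which each node quantizes one feature value and uses it as an index into a table whose entries are either child nodes or class labels. The node layout, the `quantize` helper, and the example labels are all hypothetical; the slides do not specify how the tables are built or indexed.

```python
from dataclasses import dataclass
from typing import List, Union

@dataclass
class LutNode:
    feature_index: int                     # which feature this node reads
    table: List[Union["LutNode", str]]     # one entry per quantized feature value

def quantize(value, levels):
    """Map a feature value in [0, 1) to an integer table index."""
    return min(int(value * levels), levels - 1)

def classify(node, features):
    """Follow table look-ups from the root until a leaf label is reached."""
    while isinstance(node, LutNode):
        index = quantize(features[node.feature_index], len(node.table))
        node = node.table[index]
    return node

# Hypothetical two-level tree over two features.
child = LutNode(feature_index=1, table=["background", "face"])
root = LutNode(feature_index=0, table=["background", child, child, "texture"])

print(classify(root, [0.10, 0.9]))   # background
print(classify(root, [0.40, 0.9]))   # face
print(classify(root, [0.95, 0.2]))   # texture
```

Each decision costs one quantization and one table read, which is what keeps the average number of instructions per window small when most windows exit near the root.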
75Look-up Table Tree Classifier
76Look-up Table Tree Classifier
77An Example Path in a Decision Tree
78Constructing Look-up Table Decision Tree
- Joint optimization of clustering, TLSH features, and optimal linear projections
- We want to maximize the separations between marginal distributions of different clusters
- We can do the optimization iteratively
- We can do clustering first, using current TLSH features and projections, to maximize the separations
- We can find optimal TLSH features given linear projections
- Then we can find optimal linear projections given updated TLSH features
79Performance Comparison
RCT: Rapid Classification Tree, implemented by Keith Haynes
80Detection and Recognition
81Detection and Recognition
82Shape Theory
- We want to quantify the difference between two shapes in a principled way
- We do this by constructing a shape space and then using the geodesic distance between two shapes on the shape manifold as the metric
83Shape Clustering
84Shape Clustering
85 Clustering Dendrogram
86Sulcal Curves
- Sulcal curves are important for characterizing
brain functions
87Sulcal Curves
- Sulcal curves are important for characterizing
brain functions
88Clustering of Sulcal Curves
90Modeling Mathematical Abilities and Disabilities
- As it is possible to acquire detailed surfaces of the human brain, one may ask how characteristics of the brain structure affect mathematical abilities and disabilities
- The U.S. Department of Education wants to know so that they can understand and find solutions to the mathematical problems young children have
Corpus callosum examples of young children without mathematical disabilities (a) and with (b)
91SurfaVision: A Surface-based Vision System
- One of the challenges is how to build a machine vision system that is robust
- This has proven to be very difficult after several decades of computer vision research
- We may now have a solution for applications in an indoor environment
92Multi-Camera Multi-Projector Scanning
93Surface Parametrization
94Geodesic Interpolation Between Surfaces
95Robust Visual Inference
- With a common domain for surface representations,
we can pose the visual inference in the Bayesian
framework by building probability models
96Human-Robot Collaborative Interaction
- The goal is to let robots be aware of the positions, poses, expressions, moods, and other factors of the humans so that robots can interact with humans collaboratively
In collaboration with Prof. Emmanuel Collins at the College of Engineering
97Automated 3D Phenotype Measurement
- The central problem in biology is to understand the relationship between genotype and phenotype
- With the availability of genomes of humans and model organisms, the central problem becomes how to measure phenotype at a large scale
983D Urban Models
100Courses
- Most Relevant Courses
- CAP 5638 Pattern Recognition
- CAP 5415 Principles and Algorithms of Computer Vision
- CAP 6417 Theoretical Foundations of Computer Vision
- STA 5106 Computational Methods in Statistics I
- STA 5107 Computational Methods in Statistics II
- Seminars and advanced studies
- Related Courses
- CAP 5615 Artificial Neural Networks
- CAP 5600 Artificial Intelligence
- CAP 5xxx Machine Learning
101Funding of the Group
- National Science Foundation
- DMS
- CISE IIS
- FRG
- ACT
- CCF
- NGA: National Geospatial-Intelligence Agency
- Army Research Office
- DURIP
- Research grant
- Companies
- Next Century and others under negotiation
102Summary
- CAVIS group and FSvision group offer interesting research topics/projects
- Efficient representations for generic images
- Real-time detection and recognition
- Computational models for object recognition and image classification
- Medical image analysis
- Motion/video sequence analysis and modeling
- They are challenging
- They are interesting
- They are exciting
103Contact Information
- Name: Xiuwen Liu
- Web sites: http://cavis.fsu.edu
- http://fsvision.fsu.edu
- http://www.cs.fsu.edu/liux
- Email: liux_at_cs.fsu.edu
- Offices: LOV 166 and 118 North Woodward Ave.
- Phones: 644-0050 and 645-2257
104Thank you! Any questions?