Title: Decision Making and Reasoning with Uncertain Image and Sensor Data
1Decision Making and Reasoning with Uncertain
Image and Sensor Data
- Pramod K Varshney
- Kishan G Mehrotra
- Chilukuri K Mohan
2Main Themes
- Decentralized decision-making
- Multiple uncertain information streams
- Dynamically changing environments
- Algorithms for realistic battlefield scenarios
3What is the agent's current location?
What activities are other agents involved in?
What is the likelihood of damage at various
locations?
What would be the safest paths to a goal/exit
zone?
4Main Contributions
- Scenario recognition from video sequences
- Improved activity recognition with audio-video information
- Development of new algorithms for path planning in a battlefield
- Formulation of path planning as a multi-objective optimization problem
- Development of a new multi-objective evolutionary algorithm
51. Scenario Recognition and Classification
- Event recognition and scene analysis with real
time visual and audio information
6Problem Formulation
- Detect moving objects and classify activities
- Identify sounds indicative of specific events
- Quantify uncertainty in activity classification
- Develop an enhanced scene representation by integrating audio and visual information
- Related work
71.1.Video Component
- Goal: To detect and track moving objects and classify activity in real time
- Input: Real-time video stream
- Output: Detected moving object and activity classification
8Video Processing Pipeline (contd.)
- Goal: Recognition of a moving object's activities from a sequence of images (video)
- Low Level Processing: Filtering, Detection, Tracking, Feature Extraction
- High Level Processing: Frame Classification, Scenario Recognition
- Data flow: Sequence of Frames -> Extracted Features -> Extracted Scenarios
9Video Processing Pipeline
Real time Video Acquisition
Detection
Tracking
Scene Description Generator
Feature extraction
Classification
Visualization
10Features Extracted
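- Aspect Ratio (AR) = d / (a + b + c)
- Relative Upper Density (RUD) = a / (a + b + c)
- Relative Middle Density (RMD) = b / (a + b + c)
- Relative Lower Density (RLD) = c / (a + b + c)
- Velocity and centroid
(Figure: bounding box of the moving object divided into upper, middle, and lower bands, labeled a, b, and c.)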
11Video Feature Analysis Example
Feature Walking Bending
AR 0.2 0.3
RUD 0.3 0.2
RMD 0.4 0.5
RLD 0.3 0.3
Figure 1
Figure 2
12Classification Algorithms Used for Activity
Detection
- Multi-module back-propagation neural network
- Inductive Decision Tree Learning (C5) algorithm
- Control Chart Approach
- Bayesian networks
13Visualization of Activity with Uncertainty Measure
- Example activities shown here: sitting, bending, and standing
- Uncertainty is calculated from the classifier output for each event
- The blue pointer indicates the level of certainty in the classifier decision
14Control Chart Approach for Video Activity
Classification
- A control chart indicates the variation in the values of some feature over time, with graphical depiction of the upper and lower control limits for that feature.
- High level detection with control charts
- Identification of each activity.
- Recognition of when the activity begins and ends.
15Control Chart Example (with Upper and Lower
Control Limits for each activity)
161.2.Why Audio?
17Role of Audio Component
- Obtain information which may not be acquired visually
- Provide additional comprehensive information enriching the scene context
- Due to the large number of potential sounds to identify, the scope of the problem is very broad
18Audio Processing
- Goal: To detect and classify sounds indicative of specific events
- Input: A sample of sound in real time
- Output: Detected class of specific sound
- Example: sound samples indicate specific objects/events such as explosions and vehicles
19What's New?
- Fusion of audio and video for surveillance and scene analysis
- New audio features
- Spectrum shape modeling coefficients
20Audio Processing Pipeline
Audio acquisition
Linear predictive coding /Cepstral
coefficients
Histogram Features
Spectral Features
Relative Band Energies
Choose features
Multi Module back-propagation Neural Networks
21Audio Features
- Amplitude Histogram Features (width, symmetry,
skewness and kurtosis calculated on a histogram
of a 3 second clip)
- Spectral Centroid and Zero Crossing Rate
- Relative Band energies
- Linear Predictive Coding Coefficients
- Cepstral Coefficients
- Spectrum shape modeling coefficients
22Audio Enhanced Visual Processing
Fusion
Video Processing and Classification
Visualization
Video Acquisition
Uncertainty
Audio Processing and Classification
Description Generation
Sound Acquisition
23Audio Visual Classes
- 3 classes of video events
- Sitting
- Standing
- Bending
- 4 classes of sound events are considered
- Silence
- Clear Speech
- Babble or Speech in noise
- Alarm sounds (smoke detector class)
24Prototype Demonstration
25Experimental Results - Video
- Sub-scenario recognition accuracy of Control
Chart approach
Video Number of Frames Number of sub-scenarios Number of recognized sub-scenarios
1 823 11 10
2 512 6 6
3 701 12 12
4 514 9 10
26Experimental Results - Video
- We used 4 different video sequences. Of a total of 2250 feature vectors, 1072 were used for training and the remaining 1478 vectors were used for testing.
- Classification accuracy (%) using different methods
- Neural Network (back-propagation) 91.34
- Decision Tree (C5) 92.86
- Naive Bayesian Network 89.61
- Control Chart 95.70
27Experimental Results - Audio
- In this 4-class problem, we obtain a classification accuracy of 92% on recorded data (off-line classification)
- 75% for real-time classification in the laboratory acoustic environment
- Acoustics of each environment can be different, leading to misclassifications
- Characteristics of the recording equipment
281.3.Representation Scheme
- Audio and visual processing yields information about scene context
- Need for a representation scheme for the acquired audio-video information
- Generation of a document containing audio-visual information, which can be further processed
29XML Based Description
- We chose an XML-based representation
- Widely accepted standard for information exchange
- More comprehensive forms such as XML Schema will be used for representation
- MPEG standards use XML-based audio-visual content management
- Semi-structured, allowing for addition of user-defined data and information
- An XML-based representation allows for standardization, flexibility, and extensibility
- Automatic generation of XML-based description
- Descriptor gives the state of the observed scenario over a certain time period
30Example Descriptor
Header
Moving object Features and activity class
Complete descriptor
31 Descriptor Utility
- The combined audio-visual descriptor can serve as a base for
- Data mining for unusual events or correlation between events and activities
- Building case libraries of interesting scenarios or for particular cases
- Audio-visual fusion and visualization
32Discussion
- We have shown the feasibility of activity recognition using combined video and audio information.
- Future work: integration, extension, elaboration
- Next section (path planning): after activity recognition, the battlefield decision-maker must act.
332. Personnel Movement Planning in a Battlefield
- Path computation algorithms for risk minimization
342.1 Path Planning in a Battlefield
- Goal: To determine (escape) paths for personnel in a battlefield
- Input: A node-weighted graph with each node representing a geographical location of a battlefield, whose weight corresponds to the associated risk.
- Quality Measure: The quality of an escape path is determined by the cumulative risk of the path
35Problem Formulation
- A path P is a non-cyclic sequence $(L_1, L_2, \ldots, L_n)$, where $L_1$ is the initial location of personnel, $L_n$ is a target or exit point, and each $L_i$ is adjacent to $L_{i+1}$ in the graph.
- Determine escape paths which maximize path quality Q(P), defined as
- $Q(P) = \sum_{i=1}^{n} \log(1 - \mathrm{risk}(L_i))$
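The path quality measure and an exact search over it can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the graph is an adjacency dictionary, that every node risk is strictly below 1, and that maximizing Q(P) is equivalent to minimizing the sum of the non-negative node costs -log(1 - risk), which is what makes uniform cost search applicable. The names `path_quality` and `safest_path` and the toy graph are hypothetical.

```python
import heapq
import math


def path_quality(path, risk):
    """Q(P): sum of log(1 - risk) over the nodes of the path (always <= 0)."""
    return sum(math.log(1.0 - risk[node]) for node in path)


def safest_path(graph, risk, source, targets):
    """Uniform cost search with node cost -log(1 - risk(node)).

    Returns the path from source to any target that maximizes Q(P),
    or None if no target is reachable.
    """
    def cost(node):
        return -math.log(1.0 - risk[node])

    best = {source: cost(source)}
    heap = [(best[source], source, [source])]
    while heap:
        total, node, path = heapq.heappop(heap)
        if total > best.get(node, math.inf):
            continue  # stale queue entry
        if node in targets:
            return path
        for nxt in graph[node]:
            new_total = total + cost(nxt)
            if new_total < best.get(nxt, math.inf):
                best[nxt] = new_total
                heapq.heappush(heap, (new_total, nxt, path + [nxt]))
    return None


# Hypothetical 4-node example: two routes from S to the exit T.
graph = {"S": ["A", "B"], "A": ["S", "T"], "B": ["S", "T"], "T": ["A", "B"]}
risk = {"S": 0.0, "A": 0.5, "B": 0.1, "T": 0.0}
route = safest_path(graph, risk, "S", {"T"})
```

Because the logarithm turns the product of per-node survival probabilities into a sum, the route with the highest Q(P) is also the one with the highest probability of avoiding damage at every node, assuming node risks are independent.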
36Modeling Risks
- We define risk as the probability of occurrence of a high level of damage to personnel traversing a path
- Two probabilistic risk models
- Gaussian Distribution: models risks due to specific events such as explosions and chemical threats
- Beta Distribution: models risks due to the distribution of events through the entire geographical region
37Modeling Risks with Gaussian Distribution
38Algorithms for Path Planning
- Uniform Cost Search (Dijkstra's algorithm), which finds the optimal solution
- Simulated Annealing
- Evolution Strategies (ES)
- (µ+1) ES
- Stochastic ES
- Evolutionary Quenching Strategy (EQS)
39Evolution Strategies
- Initialize population
- Generate offspring at each iteration from a population of size µ
- Replacement strategy
- (µ+1) ES: deterministic replacement; only offspring of higher quality are accepted
- Stochastic ES: probability of replacement is equal to min{1, Q(offspring)/Q(parent)}
40Key Principle of EQS
- An evolution strategy which accepts solutions of lower quality with a probability that decreases as the number of iterations increases (annealing principle)
- Ensures escape from local optima during early stages of the algorithm
- Emphasizes convergence to the optimal solution at later stages of the algorithm
41Optimal Route Planning for Battlefield Risk
Minimization
Goal
Source
Source
42Optimal Route Planning for Battlefield Risk
Minimization (Contd.)
Goal
Source
High risk
Moderate risk
Low risk
Risk free
43Simulation Results
- The algorithms were simulated on a 100x100 grid with 15 target nodes on the periphery of the grid.
- In all instances of the problem, EQS approximates the optimal solution, outperforming Simulated Annealing and variants of ES.
- EQS and other variants of ES require relatively little computational time (21 seconds) compared to uniform cost search (470 seconds).
44Performance Comparison of Different Algorithms
with a Gaussian Distribution for Risk Values
452.2 Multi-Objective Path Planning
- In a battlefield, a path can be evaluated with respect to different objectives.
- Some crucial aspects of a path to be considered are
- Cumulative risk
- Length of the path
- Reward associated with the target node
46Multi-objective Evolutionary Algorithms
- Goal: To discover a set of non-dominated solutions with significant diversity
- Evolutionary algorithms are best suited for multi-objective optimization since they simultaneously explore multiple solutions
47Multi-objective Evolutionary Algorithms (Contd.)
- We have implemented three multi-objective evolutionary algorithms for the path planning problem
- Pareto Archived Evolution Strategy (PAES): J.D. Knowles and D.W. Corne, "On Metrics for Comparing Non-dominated Sets," in Proc. IEEE Congress on Evolutionary Computation (CEC'02), pp. 711-716, 2002.
- Non-dominated Sorting Genetic Algorithm (NSGA II): K. Deb, S. Agarwal, A. Pratap, and T. Meyarivan, "A fast and elitist multi-objective genetic algorithm: NSGA II," in Proc. Parallel Problem Solving from Nature VI, pp. 849-858, 2000.
- Evolutionary Multi-objective Crowding Algorithm (EMOCA)
48Evolutionary Multi-objective Crowding Algorithm
(EMOCA)
- EMOCA considers crowding density in data space for path planning
- Mating opportunities are given to better-quality as well as substantially different individuals
- A stochastic acceptance criterion is used, which depends on the crowding density difference between parent and offspring
EMOCA Main steps
49Multi-objective Problem Scenario
Goal-1 Goal-2
Goal-3
Source
High risk
Moderate risk
Low risk
Risk free
50Multi-objective Problem Scenario (contd.)
- Paths are evaluated with respect to three different measures: risk, path length, and reward
- Difficult tradeoffs exist; for example, should personnel follow a more risky path to increase the probability of finding a greater reward?
51Illustrating Mutually Non Dominating Paths
P1 goal1 P2 goal2
goal3
P3
source
High risk
Moderate risk
Low risk
Risk free
52Path Quality with respect to Different Measures
Path Risk Path length Reward
P1 0.7 9 0.2
P2 0.2 14 0.5
P3 0.7 12 1
53Best Choice of Path
W-risk W-path length W-reward Best path
Low High Low P1
High Low Low P2
Low Low High P3
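The qualitative weight table can be reproduced with a simple weighted-sum scalarization over the three paths P1, P2, and P3 listed earlier. This is only a sketch: the slides do not state how the weights are combined, so the cost function, the length normalization, the assumption that reward lies in [0, 1], and the numeric stand-ins `LOW = 0.1` and `HIGH = 0.8` are all hypothetical.

```python
# (risk, path length, reward) for each path, taken from the table of path qualities.
PATHS = {
    "P1": (0.7, 9, 0.2),
    "P2": (0.2, 14, 0.5),
    "P3": (0.7, 12, 1.0),
}

LOW, HIGH = 0.1, 0.8  # hypothetical numeric values for "Low" and "High" weights


def best_path(paths, w_risk, w_length, w_reward):
    """Return the name of the path with the smallest weighted cost.

    Cost = w_risk * risk + w_length * (length / longest length)
           + w_reward * (1 - reward), so a high reward lowers the cost.
    """
    longest = max(length for _, length, _ in paths.values())

    def cost(name):
        risk, length, reward = paths[name]
        return w_risk * risk + w_length * length / longest + w_reward * (1.0 - reward)

    return min(paths, key=cost)


choice_short = best_path(PATHS, LOW, HIGH, LOW)   # path length matters most
choice_safe = best_path(PATHS, HIGH, LOW, LOW)    # risk matters most
choice_reward = best_path(PATHS, LOW, LOW, HIGH)  # reward matters most
```

With these stand-in weights the three calls select P1, P2, and P3 respectively, matching the rows of the table.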
54Performance Comparison
- We used a well-known metric, the C metric, for performance comparison. Smaller values of the C metric indicate better performance.
- We also obtained C metric values over multiple trials, comparing the solutions obtained by different algorithms in each trial.
55Simulation Results
- EMOCA outperforms NSGA II and PAES for results obtained over 100 trials
- EMOCA obtains more non-dominated solutions and has lower C metric values than the other algorithms.
- The results indicate that EMOCA performs best for the path planning application
56C-metrics for Various Pair-wise Algorithm
Comparisons
Algorithm1 Algorithm2 C(Algorithm2, Algorithm1)
EMOCA(without crossover) PAES 0.15
EMOCA(with crossover) PAES 0.00
EMOCA (with crossover) NSGA II 0.06
57Discussion
- Efficient algorithms for risk minimization
- Near-optimal solutions
- Modeled path planning as a multi-objective optimization problem
- Developed a new algorithm (EMOCA) outperforming state-of-the-art multi-objective evolutionary algorithms
58Future Work
- Develop multi-objective evolutionary algorithms for other battlefield applications such as wireless sensor networks employed in surveillance systems
- Develop algorithms for dynamic path planning
- Multiple object detection and tracking, and work on a multi-camera platform
- Develop a comprehensive library of recognizable sounds to provide richer context information
- New methodologies for audio-visual fusion
- Integration with VGIS
59Mutation
- The mutation step consists of replacing a randomly chosen edge of the path by another sub-path between the same nodes.
- In mutating the path a → b → c → d → e, a randomly chosen edge of the path, say c → d, is replaced by an alternate sub-path c → f → h → d, yielding a → b → c → f → h → d → e.
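The mutation step can be sketched as below. The slides do not say how the alternate sub-path is found, so this illustration assumes a breadth-first search that skips the edge being replaced and avoids every other node already on the path (to keep the result non-cyclic). The function names, the adjacency-dictionary graph, and the `FixedChoice` stub used to force the edge c → d are all hypothetical.

```python
import random
from collections import deque


def alternate_subpath(graph, u, v, forbidden):
    """Shortest detour from u to v that does not use the direct edge u-v
    and does not visit any node in `forbidden`. Returns None if none exists."""
    queue = deque([[u]])
    seen = {u}
    while queue:
        partial = queue.popleft()
        for nxt in graph[partial[-1]]:
            if partial[-1] == u and nxt == v:
                continue  # this is the edge being replaced
            if nxt in seen or nxt in forbidden:
                continue
            if nxt == v:
                return partial + [v]
            seen.add(nxt)
            queue.append(partial + [nxt])
    return None


def mutate(path, graph, rng=random):
    """Replace one randomly chosen edge of `path` by an alternate sub-path.
    If no detour exists for the chosen edge, the path is returned unchanged."""
    i = rng.randrange(len(path) - 1)
    u, v = path[i], path[i + 1]
    detour = alternate_subpath(graph, u, v, set(path) - {u, v})
    if detour is None:
        return list(path)
    return path[:i] + detour + path[i + 2:]


class FixedChoice:
    """Stand-in for a random generator that always picks the same edge index."""
    def __init__(self, index):
        self.index = index

    def randrange(self, n):
        return self.index


# Graph containing the example path a-b-c-d-e plus the detour c-f-h-d.
graph = {
    "a": ["b"], "b": ["a", "c"], "c": ["b", "d", "f"], "d": ["c", "e", "h"],
    "e": ["d"], "f": ["c", "h"], "h": ["f", "d"],
}
mutated = mutate(["a", "b", "c", "d", "e"], graph, rng=FixedChoice(2))
```

Forcing the third edge (c → d) reproduces the example from the slide; in normal use the default `random` module picks the edge.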
60Simulated Annealing- main steps
- Initialize population: straight-line shortest paths from source node to target node
- Mutation of parent to produce offspring
- Stochastic replacement with probability $1 - e^{(Q(\text{offspring}) - Q(\text{parent}))/\text{temperature}}$
62Multi-objective Optimization- Preliminaries
- The solution to a multi-objective optimization problem is a set of non-dominated vectors.
- A solution vector x dominates a solution vector y ($x \gg y$) if and only if
- $\forall i \in \{1, \ldots, m\}: f_i(x) \ge f_i(y)$, and
- $\exists j \in \{1, \ldots, m\}: f_j(x) > f_j(y)$
- where m is the number of objectives. x and y are mutually non-dominating if neither dominates the other.
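A dominance check along these lines can be written in a few lines. The sketch below assumes every objective is to be maximized, so risk and path length are negated before comparison; the sample values are the three paths P1, P2, and P3 from the path-quality table, and `P4` is a hypothetical extra path added only to show a dominated case.

```python
def dominates(x, y):
    """True if x is at least as good as y in every objective and strictly
    better in at least one (all objectives are maximized)."""
    return (all(a >= b for a, b in zip(x, y))
            and any(a > b for a, b in zip(x, y)))


def non_dominated(solutions):
    """Names of the solutions that no other solution dominates."""
    return [name for name, x in solutions.items()
            if not any(dominates(y, x)
                       for other, y in solutions.items() if other != name)]


def as_objectives(risk, length, reward):
    """Convert (risk, length, reward) to a maximization vector."""
    return (-risk, -length, reward)


paths = {
    "P1": as_objectives(0.7, 9, 0.2),
    "P2": as_objectives(0.2, 14, 0.5),
    "P3": as_objectives(0.7, 12, 1.0),
    "P4": as_objectives(0.8, 15, 0.1),  # hypothetical, worse than P1 everywhere
}
front = non_dominated(paths)
```

P1, P2, and P3 each win on a different objective, so none dominates another, while the hypothetical P4 is dominated and drops out of the front.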
63EMOCA- Main Steps
- Initialize
- Generate mating population
- Generate offspring by crossover and mutation
- Create a new pool consisting of some parents and some offspring
- Trim new pool to generate the population of the next iteration
64Crossover
- Two Point Path Crossover operator (2PTPX), which is less disruptive and preserves a major portion of the parent paths.
- Consider two parent paths S → N1 → N3 → E1 and S → N2 → N4 → E2, where N1 and N2 are at least four path lengths away from E1 and E2, and nodes N3 and N4 are a few edges away from N1 and N2, respectively. The crossover operator then generates the offspring S → N1 → N4 → E2 and S → N2 → N3 → E1.
65Pareto Archived Evolution Strategy (PAES)
- Uses a local search strategy and maintains an archive of non-dominated solutions.
- Parent is mutated to produce offspring
- If offspring dominates parent, it is accepted
- If offspring and parent are non-dominated, then the acceptance decision is based on the squeeze factor of the solutions.
66Non-dominated Sorting Genetic Algorithm(NSGA II)
- Generates an offspring population of size N from a mating population of size N by crossover and mutation
- Uses binary tournament to select mating pairs
- A non-dominated sorting on the combined population (parent + offspring) is used to obtain the mating population for the next iteration
67Crowding density
- Data space crowding density is defined as $\delta(P) = L/E$, where L is the number of paths in the current population passing through each edge of path P, and E is the total number of edges in path P.
- A relatively low value of $\delta(P)$ indicates that path P does not share many edges with other paths in the population, giving it a relatively high diversity rank.
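One way to read the crowding density definition is sketched below. The interpretation is an assumption: L is taken as the count, summed over the edges of P, of the other paths in the population that use the same directed edge, and P itself is not counted. The helper names and the sample population are hypothetical.

```python
def path_edges(path):
    """Directed edges of a path given as a list of nodes."""
    return list(zip(path, path[1:]))


def crowding_density(path, population):
    """L / E for `path`: shared-edge count divided by the number of edges.

    L counts, for every edge of `path`, how many other paths in the
    population contain that edge; E is the number of edges in `path`.
    """
    own_edges = path_edges(path)
    if not own_edges:
        raise ValueError("path must contain at least one edge")
    other_edge_sets = [set(path_edges(p)) for p in population if p is not path]
    shared = sum(1 for edge in own_edges
                 for edges in other_edge_sets if edge in edges)
    return shared / len(own_edges)


# Hypothetical population of three paths.
p1 = ["a", "b", "c", "d"]
p2 = ["a", "b", "e", "d"]
p3 = ["x", "b", "c", "d"]
population = [p1, p2, p3]
densities = [crowding_density(p, population) for p in population]
```

Here p2 shares only one of its three edges with the rest of the population, so it has the lowest density and would receive the best diversity rank of the three.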
68Salient features of EQS
- The acceptance probability of EQS depends on ?
where ?((c(1-c)i)/?)-?, i is the current
iteration , ? is the maximum number of
iterations, c and ? are algorithm parameters. - During initial stages of the algorithm, when i0,
?c/?-?, and the probability of acceptance is
high. During later stages of the algorithm when
i approaches ? , - ?c/?(1-c)-?, and the probability of
accepting the offspring is relatively low.
69Trimming New pool
- The new pool is sorted based on the primary criterion of non-domination rank and the secondary criterion of diversity rank
- The new population will consist of the first N elements of the sorted list containing solutions grouped into different fronts $F_1, F_2, \ldots, F_n$, where elements of $F_{i+1}$ are dominated only by elements in $F_1, F_2, \ldots, F_i$.
70New Pool Generation
- The offspring is compared with one of the parents to form the new pool. There are three possible cases:
- Case 1: If the offspring dominates the parent, then the offspring is added to the new pool.
- Case 2: If dominated by the parent, the offspring is added to the new pool with probability $1 - \exp(\delta(\text{offspring}) - \delta(\text{parent}))$.
- Case 3: Otherwise, if the offspring has a lower crowding density than the parent, then it is added to the new pool; else the parent is added to the new pool.
71Mating Population Generation
- Binary tournament selection is iterated to create the mating pool
- In each step, two randomly chosen members of the current population are compared
- The tournament to determine who enters the mating population is won by the solution with the lower total rank, the sum of its non-domination rank and diversity rank
72Squeeze factor
- The squeeze factor of a candidate solution is the number of archive elements located in the same cell of the objective function space, assuming that this space is a finite hypercube divided into $(2^d)^m$ equal-sized, non-overlapping hypercubes.
73C-metric
- C metric calculates the fraction of
solutions in one non-dominated set that are
dominated by the non-dominated solutions of the
other set.
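The C metric as described here can be sketched directly. The code assumes all objectives are maximized and uses strict Pareto dominance, following the wording of the slide; the function names and the two small solution sets are hypothetical.

```python
def dominates(x, y):
    """True if x is no worse than y everywhere and strictly better somewhere
    (all objectives are maximized)."""
    return (all(a >= b for a, b in zip(x, y))
            and any(a > b for a, b in zip(x, y)))


def c_metric(set_a, set_b):
    """C(A, B): fraction of solutions in B dominated by at least one solution in A."""
    if not set_b:
        raise ValueError("set_b must not be empty")
    covered = sum(1 for y in set_b if any(dominates(x, y) for x in set_a))
    return covered / len(set_b)


# Hypothetical non-dominated sets from two algorithms (two objectives each).
front_a = [(3, 3), (1, 4)]
front_b = [(2, 2), (0, 5), (1, 1), (4, 0)]
c_ab = c_metric(front_a, front_b)  # share of B's solutions beaten by A
c_ba = c_metric(front_b, front_a)  # share of A's solutions beaten by B
```

The metric is not symmetric, which is why the comparison table reports C(Algorithm2, Algorithm1) explicitly: a low value means few of Algorithm1's solutions are dominated by Algorithm2's.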
74Significance of audio features
- Histogram features
- Features calculated on histogram
- Width
- Symmetry
- Skewness
- Kurtosis
- Clear voice has an asymmetric, broad histogram
- Voice in noise has a narrower histogram and is more symmetric
- Useful in detecting modulations in sound
75Other sound environments
- We conducted experiments
- To classify the following environments
- Air conditioned rooms
- Construction site
- Factory
- Rail tunnel
- Warehouse
- To distinguish between types of power tools in a construction setting
- Drills
- Hammers
- Generators
- Compressors
- Electric motors
76Significance of audio features (contd)
- Spectral Centroid and Zero Crossing Rate model the spectral distribution and the dominant frequency (pitch) of sound
- Relative Band Energies measure the energy in several spectral bands. Speech mostly contains energy in the band below 1 kHz, whereas alarms might have a different distribution
- LPC coefficients and Cepstral coefficients give a direct indication of the sampled sound in the time and quefrency domains, respectively
77Complete XML descriptor
78Related Work
- Interpretation system of dynamic scenes (INRIA, France, 2003)
- Robust, Online Event Detection and Classification for Video Monitoring (Cornell University)
- Video Surveillance and Monitoring (Carnegie Mellon University, 2000)
- Work dealing with situational context learning, such as Computational Auditory Scene Analysis, Wearable Audio Computing at MIT (2003), and the Technology for Enabling Awareness (TEA) project (2000)
79Low Level Processing of video
- Moving Object Detection
- Background Subtraction
- Luminance Contrast Method
- Background/Template Updating
- Moving Object Tracking
- Dynamic Template
- Infinite Impulse Response (IIR)
- Feature Extraction
- Bounding box is identified, and useful features
extracted from it
80Uncertainty computation
Module 1standing
0.987
0.9063
Module 2 standing
0.01
0.0092
0.092
Module 3 sitting
0.0845
81Spectral shape coefficients
- Divide the spectrum into 5 bands
- Do a linear regression to find best-fit lines for the spectral envelope in each band
- Slopes of these lines give the coefficients
- Inspired by the Kates coefficients
- Indicate shape of spectrum
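The band-wise regression can be sketched in plain Python. The slides do not give the band edges or the units of the envelope, so this illustration assumes the input is an already computed spectral envelope (one value per frequency bin), that the 5 bands have equal width in bins, and that the slope is measured per bin; `band_slopes` is a hypothetical name.

```python
def least_squares_slope(xs, ys):
    """Slope of the least-squares line through the points (xs[i], ys[i])."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def band_slopes(envelope, n_bands=5):
    """Split the envelope into n_bands equal-width bands and return the
    slope of the best-fit line in each band."""
    n = len(envelope)
    if n < 2 * n_bands:
        raise ValueError("need at least two bins per band")
    slopes = []
    for band in range(n_bands):
        lo = band * n // n_bands
        hi = (band + 1) * n // n_bands
        slopes.append(least_squares_slope(list(range(lo, hi)), envelope[lo:hi]))
    return slopes


# Hypothetical envelope: rises for the first 20 bins, then is flat, then falls.
envelope = ([float(i) for i in range(20)]
            + [20.0] * 10
            + [20.0 - 2.0 * i for i in range(20)])
coefficients = band_slopes(envelope)
```

Each coefficient summarizes whether the spectrum is rising, flat, or falling within its band, which is the sense in which the coefficients indicate the shape of the spectrum.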
82Frame based classification
- The mean values and standard deviations are computed for each feature $f_k$ and for each class $c_i$ to be discriminated, using the available training data
- For each class $c_i$, the upper and lower bounds associated with the control chart are obtained
- $\mathrm{upperBound}(f_k, c_i) = \mathrm{mean}(f_k, c_i) + \lambda_{f_k, c_i} \cdot \mathrm{standardDeviation}(f_k, c_i)$
- $\mathrm{lowerBound}(f_k, c_i) = \mathrm{mean}(f_k, c_i) - \lambda_{f_k, c_i} \cdot \mathrm{standardDeviation}(f_k, c_i)$
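The bound computation can be sketched as follows. For simplicity this illustration uses a single multiplier `width` for every feature and class, whereas the slide allows the multiplier to depend on both; the training values, the function names, and the choice of the sample standard deviation are assumptions made for the example.

```python
from statistics import mean, stdev


def control_limits(training, width=2.0):
    """Compute (lower, upper) control limits per class and feature.

    `training` maps class name -> feature name -> list of training values.
    Limits are mean -/+ width * sample standard deviation.
    """
    limits = {}
    for cls, features in training.items():
        limits[cls] = {}
        for feature, values in features.items():
            centre = mean(values)
            spread = width * stdev(values)
            limits[cls][feature] = (centre - spread, centre + spread)
    return limits


def classes_within_limits(value, feature, limits):
    """Classes whose control limits for `feature` contain `value`."""
    return [cls for cls, features in limits.items()
            if features[feature][0] <= value <= features[feature][1]]


# Hypothetical RUD training values for two activities.
training = {
    "standing": {"RUD": [0.28, 0.30, 0.32]},
    "bending": {"RUD": [0.18, 0.20, 0.22]},
}
limits = control_limits(training)
matches = classes_within_limits(0.31, "RUD", limits)
```

A frame whose feature value falls inside exactly one class's limits gets that class as its single-feature decision; values that fall inside no limits, or inside several, need a further rule.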
83Decision in Classification
- Final classification uses the majority rule.
- For instance, if (standing, standing, standing, bending) is the vector representing single-feature based classification for each of the four features, the final conclusion is standing.
- Ties are broken by giving priority to one feature
- A tie between standing and bending is broken in favor of standing if the value of the RUD feature for the candidate object is closer to mean(RUD, Standing) than to mean(RUD, Bending).
- A tie between standing and sitting is broken by AR.
- A tie between sitting and bending is broken by RLD.
84Recognition of Sub-Scenario
- If c (> 0) consecutive decisions at times t, (t-1), ..., (t-c+1) are all different from the decision made at time (t-c), then we conclude that a new sub-scenario commenced at time (t-c+1).
- Otherwise, we attribute the differences to noise and image quality, and presume that the sub-scenario has not changed.
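The sub-scenario rule can be sketched as a single pass over the per-frame decisions. The slide does not say which label the new sub-scenario receives when the c differing decisions disagree among themselves, so this illustration assumes the most frequent label in that window; the function name and the sample sequence are hypothetical.

```python
from collections import Counter


def segment_sub_scenarios(decisions, c):
    """Return a list of (start_index, label) pairs, one per sub-scenario.

    A new sub-scenario starts once c consecutive decisions all differ from
    the label of the current sub-scenario; shorter runs are treated as noise.
    """
    if c <= 0:
        raise ValueError("c must be positive")
    if not decisions:
        return []
    current = decisions[0]
    segments = [(0, current)]
    differing = 0
    for t in range(1, len(decisions)):
        differing = differing + 1 if decisions[t] != current else 0
        if differing == c:
            start = t - c + 1
            window = decisions[start:t + 1]
            current = Counter(window).most_common(1)[0][0]
            segments.append((start, current))
            differing = 0
    return segments


# Hypothetical frame decisions: one noisy "bending" frame, then a real change.
frames = ["standing"] * 5 + ["bending"] + ["standing"] * 3 + ["sitting"] * 4
segments = segment_sub_scenarios(frames, c=3)
```

The isolated "bending" frame is shorter than c and is ignored as noise, while the run of "sitting" frames is long enough to open a new sub-scenario at the frame where it began.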
85Video features
- Features derived from the moving object used for activity detection are:
- Aspect ratio
- Velocity
- Relative densities of pixels in upper, middle, and lower bands of bounding box
- Coordinates of centroid of bounding box