Title: Evolutionary Feature Extraction for SAR Air-to-Ground ATR: A Statistical Approach
1. Evolutionary Feature Extraction for SAR Air-to-Ground Moving Target Recognition: A Statistical Approach
Evolving Hardware
Dr. Janusz Starzyk, Ohio University
2. Neural Network Data Classification
- Concept of Logic Brain
- Random learning data generation
- Multiple space classification of data
- Feature function extraction
- Dynamic selectivity strategy
- Training procedure for data identification
- FPGA implementation for fast training process
3. Neural Network Data Classification
Abdulqadir Alaqeeli and Jing Pang
- Concept of Logic Brain
- Threshold setup converts the analog world to digital
- A Logic Brain is possible based on an artificial neural network
- Random learning data generation
- Gaussian-distributed random multi-dimensional data generation
- Half of the data sets are prepared for the learning procedure
- The other half is used later for the testing procedure
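The data-generation scheme above can be sketched as follows (the dimension, sample counts, means, and covariances are illustrative assumptions, not the values used in the original experiments):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two classes of Gaussian-distributed multi-dimensional samples
# (illustrative parameters, not the original experiment's values).
dim, n = 8, 200
class1 = rng.multivariate_normal(np.zeros(dim), np.eye(dim), n)
class2 = rng.multivariate_normal(np.full(dim, 1.5), np.eye(dim), n)

# Half of each data set is prepared for the learning procedure;
# the other half is held out for testing.
learn1, test1 = class1[:n // 2], class1[n // 2:]
learn2, test2 = class2[:n // 2], class2[n // 2:]
```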
4. Neural Network Data Classification
- Multiple space classification of data
- Each space can be represented by a set of minimum base vectors
- Feature function extraction and dynamic selection strategy
- Conditional entropy extracts the information in each subspace
- Different combinations of base vectors compose redundant sets of new subspaces (the expansion strategy)
- Minimum function selection (the shrinking strategy)
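A minimal sketch of the entropy-driven shrinking step, assuming each candidate subspace produced by expansion carries an estimated probability of correct classification (the subspace labels and probabilities below are hypothetical):

```python
import numpy as np

def cond_entropy(p_correct):
    """Two-class conditional entropy of a decision in a subspace,
    computed from the probability of correct classification there."""
    p = np.clip(p_correct, 1e-12, 1 - 1e-12)
    return float(-(p * np.log2(p) + (1 - p) * np.log2(1 - p)))

def shrink(candidates, k):
    """'Shrinking': from the redundant candidate subspaces produced
    by expansion, keep the k with the lowest conditional entropy,
    i.e. the most information.  `candidates` maps a subspace label
    to its estimated probability of correct classification."""
    return sorted(candidates, key=lambda s: cond_entropy(candidates[s]))[:k]

# Hypothetical candidate subspaces and their P(correct) estimates.
best = shrink({"v1": 0.91, "v1+v2": 0.97, "v2": 0.74, "v2+v3": 0.88}, 2)
```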
5. Neural Network Data Classification
- FPGA implementation for a fast training process
- Learning results are saved on board
- Testing data sets are generated on board and sent through the artificial neural network generated on board to test the successful classification rate
- The results are displayed on board
- Promising applications
- Especially useful for feature extraction from large data sets
- Catastrophic circuit fault detection
6. Information Index Background
- A priori class probabilities are known
- Entropy measure based on conditional probabilities
7. Information Index Background
- P1 and P2 are a priori class probabilities
- P1w and P2w are conditional probabilities of correct classification for each class
- P12w and P21w are conditional probabilities of misclassification given a test signal
- P1w, P2w, P12w, and P21w are calculated using Bayesian estimates of their probability density functions
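A 1-D sketch of how these four conditional probabilities follow from the class densities by direct numerical integration (the two Gaussian densities and the equal priors are illustrative assumptions, not the original data):

```python
import numpy as np

P1, P2 = 0.5, 0.5  # a priori class probabilities (illustrative)

def pdf1(x):  # class-1 density: standard normal (illustrative)
    return np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)

def pdf2(x):  # class-2 density: normal with mean 2 (illustrative)
    return np.exp(-0.5 * (x - 2)**2) / np.sqrt(2 * np.pi)

x = np.linspace(-8, 10, 20001)
dx = x[1] - x[0]

# Bayes decision region: assign x to class 1 where P1*pdf1 > P2*pdf2.
region1 = P1 * pdf1(x) > P2 * pdf2(x)

# Conditional probabilities of correct and incorrect classification.
P1w = np.sum(pdf1(x)[region1]) * dx    # class 1 classified as class 1
P12w = np.sum(pdf1(x)[~region1]) * dx  # class 1 misclassified as 2
P2w = np.sum(pdf2(x)[~region1]) * dx   # class 2 classified as class 2
P21w = np.sum(pdf2(x)[region1]) * dx   # class 2 misclassified as 1
```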
8. Information Index Background
- Probability density functions of P1w, P2w, P12w, and P21w
9. Direct Integration
10. Monte Carlo Integration (plot: pdf)
11. Information Index: Probability Density Functions (plot: P2w)
12. Information Index: Weighted pdfs (plot: P2w)
13. Information Index: Monte Carlo Integration
- To integrate the probability density function:
- generate random points xi distributed with pdf1
- weight the generated points by the appropriate density ratio
- estimate the conditional probability P1w from the weighted sample
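A minimal Monte Carlo sketch of this estimate, reusing the same illustrative two-Gaussian example as above: P1w needs no weighting, while P21w reuses the same pdf1 draws with importance weights pdf2(x)/pdf1(x):

```python
import numpy as np

rng = np.random.default_rng(1)

def pdf1(x):  # class-1 density: standard normal (illustrative)
    return np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)

def pdf2(x):  # class-2 density: normal with mean 2 (illustrative)
    return np.exp(-0.5 * (x - 2)**2) / np.sqrt(2 * np.pi)

# Random points xi generated with pdf1 (standard normal here).
xi = rng.standard_normal(100_000)

# With equal priors, class 1 wins wherever pdf1(x) > pdf2(x).
correct = pdf1(xi) > pdf2(xi)

# P1w is just the mean of the indicator under pdf1 -- no weights.
P1w_mc = correct.mean()

# P21w integrates pdf2 over the class-1 region; weighting the same
# pdf1 draws by pdf2(x)/pdf1(x) (importance sampling) reuses them.
P21w_mc = (correct * pdf2(xi) / pdf1(xi)).mean()
```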
14. Information Index and Probability of Misclassification
15. Standard Deviation of Information in MC Simulation
16. Normalized Standard Deviation of Information
17. Information Index Status
- MIIFS was generalized to continuous distributions
- An N-dimensional information index was developed
- Efficient N-dimensional integration was used
- Information error analysis was performed
- The information index can be used with non-Gaussian distributions
- For small training sets and a low information index, the information error is larger than the information
18. Optimum Transformation Background
- Principal Component Analysis (PCA) based on Mahalanobis distance suffers from scaling
- PCA assumes Gaussian distributions and estimates covariance matrices and mean values
- PCA is sensitive to outliers
- Wavelets provide a compact data representation and improve recognition
- The improvement shows no statistically significant difference in recognition across different wavelets
- Hence the need for a specialized transformation
19. Optimum Transformation: Haar Wavelet
20. Optimum Transformation: Haar Wavelet
- Repeat the average-and-difference step log2(n) times
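The average-and-difference recursion can be sketched as follows (using the plain averaging convention; scaling factors differ between Haar conventions):

```python
import numpy as np

def haar(signal):
    """Full Haar decomposition of a length-2^k signal: at each level,
    replace the current approximation by pairwise averages and keep
    the pairwise half-differences as detail coefficients."""
    a = np.asarray(signal, dtype=float)
    details = []
    while len(a) > 1:
        avg = (a[0::2] + a[1::2]) / 2
        diff = (a[0::2] - a[1::2]) / 2
        details.append(diff)
        a = avg
    # Final approximation first, then details from coarse to fine.
    return np.concatenate([a] + details[::-1])
```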
21. Optimum Transformation: Haar Wavelet
22. Optimum Transformation: Haar Wavelet
- Matrix interpretation: b = W a, where W is the Haar transformation matrix
23. Optimum Transformation: Haar Wavelet
- Matrix interpretation for the class of signals: B = W A, where A is the (n x m) input signal matrix
- Selection of the n best coefficients is performed using the information index: Bs1 = S1 W A, where S1 is an (n x n log2(n)) selection matrix
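A small sketch of the matrix view: one Haar step W applied to an (n x m) class matrix A, followed by a selection matrix S1 (here a simple row selector standing in for the information-index ranking):

```python
import numpy as np

def haar_matrix(n):
    """Single Haar step as an (n x n) matrix: the top half of the
    rows takes pairwise averages, the bottom half pairwise
    half-differences."""
    W = np.zeros((n, n))
    for i in range(n // 2):
        W[i, 2 * i] = W[i, 2 * i + 1] = 0.5
        W[n // 2 + i, 2 * i], W[n // 2 + i, 2 * i + 1] = 0.5, -0.5
    return W

n = 4
W = haar_matrix(n)
A = np.arange(8.0).reshape(n, 2)  # (n x m) input signal matrix, m = 2
B = W @ A                         # transform the whole signal class at once

# S1 selects the best coefficients; keeping the first two rows is a
# placeholder for the information-index ranking described above.
S1 = np.eye(n)[:2]
Bs1 = S1 @ W @ A
```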
24. Optimum Transformation: Evolutionary Iterations
- Iterating on the selected result: Bs2 = S2 W Bs1, where S2 is a selection matrix, or Bs2 = S2 W S1 W A
- After k iterations: Bsk = Sk W ... S2 W S1 W A
- So the optimized transformation matrix T = Sk W ... S2 W S1 W can be obtained from the Haar wavelet
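The composition T = Sk W ... S2 W S1 W can be sketched with diagonal 0/1 selection masks standing in for the information-index-driven selections (the mask and the 4-point W below are illustrative):

```python
import numpy as np

def evolve_transform(W, selections):
    """Compose T = Sk W ... S2 W S1 W from a single wavelet step W
    and a sequence of selection matrices S1..Sk (stand-ins here for
    the information-index-driven selections in the text)."""
    T = np.eye(W.shape[1])
    for S in selections:
        T = S @ W @ T
    return T

# Toy 4-point Haar step (averages on top, half-differences below).
W = np.array([[0.5, 0.5, 0.0, 0.0],
              [0.0, 0.0, 0.5, 0.5],
              [0.5, -0.5, 0.0, 0.0],
              [0.0, 0.0, 0.5, -0.5]])

S1 = np.diag([1.0, 1.0, 1.0, 0.0])  # drop one coefficient (illustrative)
T = evolve_transform(W, [S1, np.eye(4)])
# B = T @ A then applies the evolved transform to a whole signal class.
```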
25. Optimum Transformation: Evolutionary Iterations
- Learning with the evolved features
26. Optimum Transformation: Evolutionary Iterations
- Waveform interpretation of the rows of T
27. Optimum Transformation: Evolutionary Iterations
- Mean values and the evolved transformation
(Figure: original signals and the evolved transformation; Signal Value vs. Bin Index, bins 0-140)
28. Two-Class Training
- Training on HRR signals: 17° depression-angle profiles of BMP2 and BTR60
29. Wavelet-Based Reconfigurable FPGA for Classification
(Block diagram: m 8-bit samples from a time window feed the Haar-Wavelet Transform; k 8-bit coefficients feed the N.N., and the input signal is recognized. Note: k < m.)
30. Block Diagram of the Parallel Architecture
31. Simplified Block Diagram of the Serial Architecture
(First the blue path, second the green)
32. RAM-Based Wavelet
33. The Processing Element
(Diagram: the processing element datapath with a worked add/subtract example on sample values)
34. Results for One Iteration of the Haar Wavelet
- For 8 samples:
- Parallel arch.: 120 CLBs, 128 IOBs, 58 ns
- Serial arch.: 98 CLBs, 72 IOBs, 148 ns
- The parallel architecture wins for a larger number of samples
- For 16 samples:
- Parallel arch.: 320 CLBs, 256 IOBs, 233 ns
- RAM-based arch.: 136 CLBs, 16 IOBs, 1 µs
- The RAM-based architecture wins, since 1 µs is not so slow
- These values increase very fast as the number of samples increases, and the delay becomes much higher
35. Reconfigurable Haar-Wavelet-Based Architecture
(Diagram: data flows through a chain of processing elements, PE → PE → PE → PE)
37. Test Results
- Testing on HRR signals: 15° depression-angle profiles of BMP2 and BTR60
- With 15 features selected, correct classification is 69.3% for BMP2 data and 82.6% for BTR60
- Comparable results in the SHARP confusion matrix: 56.7% for BMP2 data and 67% for BTR60
38. Problem Issues
- BTR60 signals at 17° and 15° depression angles do not have compatible statistical distributions
39. Problem Issues
- BMP2 and BTR60 signal distributions are not Gaussian
40. Work Completed
- Information index and its properties
- Multidimensional MC integration
- Information as a measure of learning quality
- Information error
- Wavelets and their effect on pattern recognition
- Haar wavelet as a linear matrix operator
- Evolution of the Haar wavelet
- Statistical support for classification
41. Recommendations and Future Work
- Training data must represent a statistical sample of all signals, not a hand-picked subset
- Probability density functions will be approximated using a parametric or NN approach
- The information measure will be extended to k-class problems
- Training and testing will be performed on 12-class data
- Dynamic clustering will prepare the decision tree structure
- A hybrid, evolutionary classifier will be developed