Outline - PowerPoint PPT Presentation

Slides: 107
Provided by: xiuwe
Learn more at: http://www.cs.fsu.edu

Transcript and Presenter's Notes

Title: Outline


1
Outline
  • Syllabus
  • Introduction
  • Computational Paradigms for Vision
  • Appearance-based computer vision
  • Physics-based computer vision

2
Class Materials
  • In this class most of the time we will discuss
    papers from the literature
  • At the beginning I will give a general
    introduction based on chapters from different
    books
  • There is no required textbook for this class

3
Vision
  • Vision
  • The process of acquiring knowledge about
    environmental objects and events by extracting
    information from the light they emit or reflect
  • Vision is a very complicated process, involving
    different processes such as memory
  • Vision is the most useful source of information,
    as about 50% of the human brain is devoted to
    visual processing

4
Vision cont.
  • Vision has been studied from many different
    perspectives
  • Computational vision
  • Emphasis on approaches that are biologically
    plausible
  • Computer vision
  • Emphasis on algorithms to solve particular
    problems
  • Statistical vision
  • Emphasis on developing and analyzing mathematical
    and statistical models

5
Darwin X
Source: New Scientist
6
Computer Vision
  • Computer vision tries to automate the vision
    process by building devices that simulate the
    human vision process
  • Note that devices that solve only part of the
    problem can still be very useful

7
Motivation Examples
  • Computer vision techniques can provide novel
    opportunities and improve performance of existing
    systems (sometimes significantly)
  • Hopefully the following examples will convince you

8
Human Computer Interfaces
  • Mouse gestures
  • Allow one to control programs more easily by
    drawing commands with the mouse
  • Some of the 80 gestures recognized by StrokeIt
    (http://www.tcbmi.com/strokeit/)

9
Mouse Gestures
  • In Photoshop, for example, you can
  • In a web browser, you can

10
Human-Computer Interactions
11
3D Hand Mouse
12
HandiEye
13
Sign Language Recognition
14
ALVINN
15
(No Transcript)
16
RALPH
17
Applications continued
18
DARPA Grand Challenge
  • http://www.darpa.mil/grandchallenge/gcorg/index.html

19
DARPA Grand Challenge
20
Introduction cont.
  • Honda ASIMO
  • http://world.honda.com/ASIMO/

21
Automated Map Updating
22
Automated Map Updating
23
3D Urban Models
24
Image-Guided Neurosurgery
25
Intracardiac Surgical Planning
26
Medical Image Analysis
27
Detection and Recognition
28
Detection and Recognition of Text in Natural Scenes
29
Detection and Recognition of Text in Natural Scenes
30
Text Detection and Recognition in Images and Videos
31
Driver Monitoring System
32
Face Recognition
http://www.a4vision.com
33
Intelligent Transportation Systems
http://dfwtraffic.dot.state.tx.us/dal-cam-nf.asp
34
Handwritten Address Interpretation System
  • HWAI - http://www.cedar.buffalo.edu/HWAI/
  • The HWAI (Handwritten Address Interpretation)
    System was developed at the Center of Excellence
    for Document Analysis and Recognition (CEDAR) at
    University at Buffalo, The State University of
    New York. It resulted from many years of research
    at CEDAR on the problems of Address Block
    location, Handwritten Digit/Character/Word
    Recognition, Database Compression, Information
    Retrieval, Real-Time Image Processing, and
    Loosely-Coupled Multiprocessing.
  • The following presentation is based on the
    demonstration pages at HWAI

35
Handwritten Address Interpretation System cont.
  • Step 1: Digitization

36
Handwritten Address Interpretation System Cont.
  • Step 2: Address Block Location

37
Handwritten Address Interpretation System Cont.
  • Step 3: Address Extraction

38
Handwritten Address Interpretation System Cont.
  • Step 4: Binarization

39
Handwritten Address Interpretation System Cont.
  • Step 5: Line Separation

40
Handwritten Address Interpretation System Cont.
  • Step 6: Address Parsing

41
Handwritten Address Interpretation System Cont.
  • Step 7: Recognition
  • (a) State Abbreviation Recognition

42
Handwritten Address Interpretation System Cont.
  • Step 7: Recognition
  • (b) ZIP Code Recognition

43
Handwritten Address Interpretation System Cont.
  • Step 7: Recognition
  • (c) Street Number Recognition

44
Handwritten Address Interpretation System Cont.
  • Step 8: Street Name Recognition

45
Handwritten Address Interpretation System Cont.
  • Step 9: Delivery Point Codes

46
Handwritten Address Interpretation System Cont.
  • Step 10: Bar Coding

47
Military Applications
  • Unmanned Aerial Vehicles

48
Automated Global Monitoring
49
Approaches to Computer Vision
  • Vision is a complicated computational process
  • Try to simulate the human vision system
  • Try to build mathematical formulations of the
    environment (to be perceived) and then perform
    inference
  • Try to invent approximate but efficient short
    cuts to the general vision problem

50
Neuroanatomy of the Brain
51
Visual Pathway
52
Visual Pathway Diagram
53
Eye-Camera Analogy
  • The eye is much like a camera
  • Both form an upside-down image by admitting light
    through a variable-sized opening and focusing it
    on a two-dimensional surface using a transparent
    lens

54
Functions of Different Cells
55
Nobel Prize Winning Experiments
56
Nobel Prize Winning Experiments
57
Nobel Prize Winning Experiments cont.
58
Nobel Prize Winning Experiments cont.
59
Simple Cells in the Visual Cortex
60
Simple Cells
  • rectangular shaped receptive fields
  • segregated ON and OFF zones
  • respond to a bright or dark bar
  • represent a restricted region in the visual field
  • respond best to a specific orientation
  • non-optimally oriented stimuli will be
    ineffective in stimulating the neuron

61
Complex Cells
  • larger receptive field than simple cells
  • orientation tuned
  • ON and OFF zones are mixed in the receptive field
  • respond well to a moving bar
  • direction selective

62
Hyper-complex Cells
  • receptive field is selective for the length of
    the stimulus
  • similar to complex cell receptive fields
    (orientation and direction selective)
  • selective for features of shape such as length
    and width of the bar of light.

63
Brain Imaging
64
Psychophysical Studies
  • Determination of the relationship between the
    magnitude of a sensation and the magnitude of the
    stimulus that gave rise to that perceptual
    sensation
  • By studying the perception of different stimuli,
    one can infer what happens in the visual
    system

65
Contrast sensitivity function
66
Single Channel or Multiple Channels
67
Neural Spatial Frequency Channels
  • Neural receptive fields are tuned to the spatial
    frequency of the stimulus
  • There seems to be a range of neural spatial
    frequency channels, each tuned to a different
    spatial frequency
  • A spatial frequency channel can be adapted

68
Vision as an Inverse Problem
  • 2-D images are generated by projecting 3-D world
    onto an image plane under certain lighting
    conditions and view angles
  • The images are a function of the 3-D object
    surfaces and their surface properties
  • Vision essentially needs to solve an inverse
    problem
  • Roughly the inverse of computer graphics

69
An Example
70
Physics-based Computer Vision
  • This naturally leads to the physics-based
    computer vision
  • One needs to build computational models for image
    formation process (computer graphics)
  • One needs to build representations of objects
  • Which includes surface and texture (color map)
  • Vision is essentially an algorithm to recover the
    three-dimensional model underlying a given
    image
  • A widely accepted framework is Bayesian inference

71
Face Recognition based on a 3D Model
72
Face Recognition based on a 3D Model cont.
73
Face Recognition based on a 3D Model cont.
74
Face Recognition based on a 3D Model cont.
75
Appearance-based Computer Vision
  • A different approach is to try to utilize the
    resulting 2-D images directly
  • Each image is treated as a matrix
  • One tries to make decisions based on the images
    without building explicit 3-D models
  • Note that here computer vision is an application
    of pattern recognition algorithms

76
Face detection using spectral histograms
  • The problem is to detect faces in images

77
Face detection using spectral histograms cont.
Preprocessing
78
Face detection using spectral histograms cont.
79
Face detection using spectral histograms cont.
80
Face detection using spectral histograms cont.
81
Face detection using spectral histograms cont.
82
Rotation-invariant face detection
83
Face detection using spectral histograms cont.
84
Object Detection and Recognition
  • Object detection and recognition problem
  • Given a set of images, find regions in these
    images which contain instances of relevant
    objects
  • Here the number of relevant objects is assumed to
    be large
  • For example, the system should be able to handle
    30,000 different kinds of objects, an estimate of
    the human capacity for basic-level visual
    categorization
  • Goal
  • Develop a system that achieves real-time
    detection and recognition for images of size 720
    x 480
  • At a frame rate of 30 frames per second (which is
    the NTSC standard video stream)
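
The real-time goal translates into a processing budget that a few lines of arithmetic make concrete. The sketch below uses only the frame size and frame rate stated above; the per-pixel figure assumes every pixel is visited once, which is a simplification.

```python
WIDTH, HEIGHT = 720, 480
FRAMES_PER_SECOND = 30

pixels_per_frame = WIDTH * HEIGHT
pixels_per_second = pixels_per_frame * FRAMES_PER_SECOND
frame_budget_ms = 1000.0 / FRAMES_PER_SECOND
pixel_budget_ns = 1e9 / pixels_per_second

print(f"pixels per frame:  {pixels_per_frame:,}")
print(f"pixels per second: {pixels_per_second:,}")
print(f"time per frame:    {frame_budget_ms:.1f} ms")
print(f"time per pixel:    {pixel_budget_ns:.1f} ns")
```

With roughly 33 ms per frame and under 100 ns per pixel, both the features and the classifier have to be very cheap to evaluate.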

85
A Framework
86
Requirements
  • To achieve real-time detection and recognition,
    we need two critical components
  • A classifier that can reduce the average
    classification time effectively
  • Features that can discriminate a large number of
    objects and can be computed using a few
    instructions

87
Lookup Table Decision Trees
  • We use local spectral histogram features that are
    computed using histogram integral images
  • We build a decision tree by clustering
  • At each node, we reduce the dimension to a small
    number, i.e., no more than 5 for detection and
    recognition applications
  • We can approximate the decision from any of the
    classifiers using a lookup table

88
Local spectral histogram features
89
Comparison of LSH and Haar features
90
Look-up Table Decision Trees
This requires clustering and we just use some
standard methods
91
An example path of a decision tree
92
Real-time detection and recognition cont.
93
Optimal component analysis
  • Linear representations are widely used in
    appearance-based object recognition applications
  • Simple to implement and analyze
  • Efficient to compute
  • Effective for many applications

94
Standard linear representations
  • Principal Component Analysis
  • Designed to minimize the reconstruction error on
    the training set
  • Fisher Discriminant Analysis
  • Designed to maximize the separation between means
    of each class
  • Independent Component Analysis
  • Designed to maximize the statistical independence
    among coefficients along different directions
  • A toy example
  • Standard representations give the worst
    recognition performance

95
Optimal Component Analysis
  • Derive a performance function that is related to
    the recognition performance
  • Formulate the problem of finding optimal
    representations as an optimization one on the
    Grassmann manifold
  • Use an MCMC stochastic gradient algorithm for
    optimization

96
Performance Measure - continued
  • Suppose there are C classes to be recognized
  • Each class has k_train training images
  • Each class has k_cross cross-validation images

97
Performance Measure - continued
  • F(U) depends on the span of U but is invariant to
    change of basis
  • In other words, F(U) = F(UO) for any orthonormal
    matrix O
  • The search space of F(U) is the set of all the
    subspaces, which is known as the Grassmann
    manifold
  • It is not a flat vector space and gradient flow
    must take the underlying geometry of the manifold
    into account

98
Kernel optimal component analysis
99
Kernel optimal component analysis
100
Kernel optimal component analysis
101
Kernel function and kernel parameter learning
102
Subset of a face dataset for visualization
103
Evolution of OCA learning
104
Performance comparison
105
Performance comparison on a full face dataset
106
Summary
  • Computer vision as an information processing
    process is very complex
  • A fundamental approach to vision is analysis by
    synthesis
  • Which involves building 3D models
  • A popular short cut is appearance-based computer
    vision
  • Where an object is approximated by views under
    different conditions
  • We will start with the appearance-based approach