Multimedia Information Systems - PowerPoint PPT Presentation

Slides: 52
Provided by: Owne652
Transcript and Presenter's Notes

1
Multimedia Information Systems
  • Basics of Multimedia Data Capture-II
  • Elementary Pattern Recognition

2
Announcements
  • Accounts Ready
  • DirectX 8.1 SDK has been installed on 5 stations,
    the five westernmost machines, stations KA-KD! (Yippee)
  • Deadline: 11:59 pm, February 13th, 2004

3
Text Capture
  • Input Devices
  • Keyboard etc
  • Scanners and OCR

4
Keyboard
  • Keyboards are the easiest form of text entry.
  • Based on ASCII, the American Standard Code for
    Information Interchange.
  • An outgrowth of typewriters designed for the
    visually impaired.

5
Optical Character Recognition
  • Huge libraries of data are available in books,
    journals etc.
  • To make this data available electronically is a
    big challenge
  • The ability to recognize letters and digits is
    fundamental to interpreting text
  • For a computer, though, a character is just another
    image.

6
What is OCR?
  • Optical Character Recognition (OCR) is the problem
    of automatically recognizing raster images as
    letters, digits, or other symbols.
  • This is a tough but very useful problem.

7
Anatomy of OCR
[Diagram: OCR pipeline]
Document → Scanner → Document image
→ Orientation Detector and Corrector → Document image
→ Image Segmentation unit → Black & White Pixels
→ Line Segmentation unit → Individual Lines
→ Glyph Segmentation unit → Individual character
→ (Database Of Templates) → Recognized character

8
Add-ons
[Diagram] OCR → "8ecause" → Spell-Checker → "Because"
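
The spell-checker add-on can be illustrated with a minimal sketch. It assumes a small word list (the `vocabulary` below is made up for illustration) and uses the standard library's `difflib.get_close_matches` as a stand-in for a real spell-checker; the slide does not say which method the course had in mind.

```python
import difflib

def correct(word, vocabulary, cutoff=0.6):
    """Return the closest vocabulary word, or the word unchanged if none is close."""
    matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else word

# Hypothetical word list, for illustration only.
vocabulary = ["Because", "Become", "Before", "Cause"]

fixed = correct("8ecause", vocabulary)  # OCR confused "B" with "8"
print(fixed)
```

A word with no close match is passed through unchanged, so the post-processing step never invents text.
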
9
Other Problems
  • Font
  • Bold, Italic
  • Interlaced Graphics and Images
  • Hand Written Characters

10
Simple OCR
  • Bi-Level image
  • Text only
  • Single Font
  • The problem is to recognize characters and words
    to make a text file.
  • The standard character set has 128 characters.

11
Steps
  • Scan the image
  • Segment it into black and white pixels
  • Segment into lines
  • Segment into glyphs
  • Compare glyphs with templates of glyphs of 128
    characters
  • Recognize the incoming character

12
Template Formation
  • By template we mean a blueprint, a die of what a
    character looks like.
  • To make templates we generally take a database of
    samples and form an average idea of what the
    character is going to look like.
  • For example, take a sample of 4 images of "a" and
    average them to get the template of "a".
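
Template formation by averaging might look like the sketch below. It assumes every sample is a bi-level image of the same size, stored as a list of rows with 1 for black and 0 for white; the 0.5 cutoff used to turn the average back into a bi-level template is an assumption, not something the slides specify.

```python
def average_template(samples):
    """Pixel-wise average of equally sized bi-level images (1 = black)."""
    rows, cols, n = len(samples[0]), len(samples[0][0]), len(samples)
    return [[sum(s[r][c] for s in samples) / n for c in range(cols)]
            for r in range(rows)]

def binarize_template(template, cutoff=0.5):
    """Keep a pixel black if at least `cutoff` of the samples had it black."""
    return [[1 if v >= cutoff else 0 for v in row] for row in template]

# Four tiny made-up samples of the same character.
samples = [
    [[1, 1, 0], [1, 0, 0]],
    [[1, 1, 0], [1, 1, 0]],
    [[1, 0, 0], [1, 0, 0]],
    [[1, 1, 1], [1, 0, 0]],
]
avg = average_template(samples)
template = binarize_template(avg)
print(avg)
print(template)
```
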

13
Sample Training Data
14
Line Generation
  • The image is already bi-level, so no image
    segmentation is necessary.
  • To segment the document into lines, take each row
    and add up the number of black pixels in it
    (horizontal projection).
  • Roughly, if the number of black pixels in a row is
    around zero, then a line has just ended.
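
The horizontal projection can be sketched as follows, assuming the page is a list of rows with 1 for black and 0 for white. The `min_black` parameter is a hypothetical knob for "around zero"; with the default of 1, a row counts as empty only when it has no black pixels at all.

```python
def horizontal_projection(image):
    """Number of black pixels in each row."""
    return [sum(row) for row in image]

def segment_lines(image, min_black=1):
    """Return (start_row, end_row) spans of text lines, end exclusive."""
    lines, start = [], None
    for r, count in enumerate(horizontal_projection(image)):
        if count >= min_black and start is None:
            start = r                    # a text line begins
        elif count < min_black and start is not None:
            lines.append((start, r))     # an (almost) empty row ends the line
            start = None
    if start is not None:                # line runs to the bottom of the page
        lines.append((start, len(image)))
    return lines

page = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
line_spans = segment_lines(page)
print(line_spans)
```
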

15
Glyph Generation
  • For each given line
  • For each column
  • Take vertical projections
  • If (vertical projection in a column = 0)
  • Glyph end or beginning
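
A minimal sketch of the same idea applied to columns, assuming one text line stored as a list of rows with 1 for black and 0 for white. A column with a zero vertical projection is treated as a gap between glyphs.

```python
def vertical_projection(line):
    """Number of black pixels in each column of a text line."""
    return [sum(col) for col in zip(*line)]

def segment_glyphs(line):
    """Return (start_col, end_col) spans of glyphs, end exclusive."""
    spans, start = [], None
    for c, count in enumerate(vertical_projection(line)):
        if count > 0 and start is None:
            start = c                    # glyph begins
        elif count == 0 and start is not None:
            spans.append((start, c))     # empty column: glyph ends
            start = None
    if start is not None:
        spans.append((start, len(line[0])))
    return spans

text_line = [
    [1, 1, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 1, 1],
]
glyph_spans = segment_glyphs(text_line)
print(glyph_spans)
```
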

16
Template Matching of Glyphs
  • Template matching amounts to pixel-by-pixel
    matching of the incoming glyph against the 128
    templates.
  • A simple measure of match would be (number of
    black pixels that match) - (number of black pixels
    that don't).
  • White pixels aren't counted.
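
One way to read this measure is sketched below. It assumes the glyph and templates have already been scaled to the same size, and it counts a mismatch whenever a pixel is black in exactly one of the two images; that reading of "black pixels that don't match" is an assumption. The two templates are made up for illustration.

```python
def match_score(glyph, template):
    """Matching black pixels minus mismatching black pixels; white-on-white is ignored."""
    matches = mismatches = 0
    for g_row, t_row in zip(glyph, template):
        for g, t in zip(g_row, t_row):
            if g == 1 and t == 1:
                matches += 1
            elif g == 1 or t == 1:       # black in only one of the two images
                mismatches += 1
    return matches - mismatches

def best_template(glyph, templates):
    """Name of the template with the highest match score."""
    return max(templates, key=lambda name: match_score(glyph, templates[name]))

templates = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
}
glyph = [[0, 1, 0], [0, 1, 0], [0, 1, 1]]   # an "I" with one noisy pixel
print(match_score(glyph, templates["I"]), match_score(glyph, templates["L"]))
print(best_template(glyph, templates))
```
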

17
Example
Template Glyphs
Input glyph
18
OCR on Scanned Images
  • Scanned images contain many problems.
  • Images are grayscale rather than bi-level
  • Noise
  • Orientation

19
Typical Scanned Image
20
Thresholding
  • We need to convert the grayscale image into a
    binary image.
  • To do this we set a threshold value. Any pixel
    above this threshold value is colored white, while
    the others are made black.
  • We can also change the levels of quantization, a
    more effective technique.
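
A minimal thresholding sketch, assuming 8-bit gray levels (0 = darkest, 255 = brightest) and the convention 1 = black, 0 = white in the output. The default threshold of 128 is an arbitrary illustrative choice.

```python
def threshold(gray, t=128):
    """Pixels above t become white (0); all others become black (1)."""
    return [[0 if v > t else 1 for v in row] for row in gray]

gray = [
    [250, 30, 128],
    [129, 0, 200],
]
binary = threshold(gray)
print(binary)
```

Note that a pixel exactly at the threshold is made black, following the "above this threshold is white" rule.
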

21
Adobe Photoshop demos
22
Noise
  • Acquire multiple images of the same source
  • Average the gray level of each pixel in the image
    to reduce noise
  • Acquire multiple images, threshold them, then take
    only pixels that are black.
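
Both ideas can be sketched in a few lines, assuming the scans are perfectly aligned and the same size. The second function assumes "take only pixels that are black" means black in every thresholded scan; the slide does not spell this out.

```python
def average_frames(frames):
    """Pixel-wise mean gray level over several scans of the same page."""
    rows, cols, n = len(frames[0]), len(frames[0][0]), len(frames)
    return [[sum(f[r][c] for f in frames) / n for c in range(cols)]
            for r in range(rows)]

def black_in_every_frame(binary_frames):
    """Keep a pixel black (1) only if it is black in all thresholded scans."""
    rows, cols = len(binary_frames[0]), len(binary_frames[0][0])
    return [[1 if all(f[r][c] for f in binary_frames) else 0
             for c in range(cols)] for r in range(rows)]

gray_frames = [[[100, 200]], [[110, 180]], [[90, 220]]]
averaged = average_frames(gray_frames)

binary_frames = [[[1, 0, 1]], [[1, 1, 1]], [[1, 0, 0]]]
stable = black_in_every_frame(binary_frames)
print(averaged, stable)
```
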

23
Noise-II
  • Use a Median Filter
  • Removes noise
  • Preserves detail
  • Slow
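
A plain median filter can be sketched as below. It assumes a grayscale image stored as a list of rows and a square window clipped at the image border; `statistics.median_low` is used so the output is always one of the original gray levels. The nested loops also show why the filter is slow: every output pixel requires collecting and ordering a whole window.

```python
from statistics import median_low

def median_filter(gray, radius=1):
    """Replace each pixel by the median of its (2*radius+1)^2 neighbourhood."""
    rows, cols = len(gray), len(gray[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [gray[rr][cc]
                      for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                      for cc in range(max(0, c - radius), min(cols, c + radius + 1))]
            out_row.append(median_low(window))
        out.append(out_row)
    return out

noisy = [
    [10, 10, 10],
    [10, 255, 10],   # a single bright noise pixel
    [10, 10, 10],
]
filtered = median_filter(noisy)
print(filtered)
```
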

24
Isolating Individual Glyphs
  • Glyphs may appear connected due to noise or
    undersampling.
  • Connected Glyphs pose a major problem.
  • False Division is also possible
  • The character "m" may be falsely split and
    recognized as "r" and "n".

25
Isolating Individual Glyphs
26
Correct Division
27
Matching Templates
  • It's a very tough problem, and pattern recognition
    techniques come in handy.
  • A measure of similarity between the templates and
    a glyph needs to do two things:
  • Maximize the similarity between the glyph and the
    correct template
  • Maximize the dissimilarity between the glyph and
    the other 127 templates

28
Classification
  • This is a standard classification problem.
  • Given data and its spread we need to find
    boundaries between different classes.

29
A good measure
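  • Normalized Match Index (NMI)
  • $\mathrm{NMI} = \dfrac{M^{+} - M^{-}}{M^{+} + M^{-}}$,
    where $M^{+}$ is the number of pixels that match
    and $M^{-}$ is the number that don't
  • Value always lies between 1 and -1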
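
A sketch of the normalized match index, taking M+ as the number of black pixels that match and M- as the number of pixels that are black in only one of the two images (an assumed reading, carried over from the simple match measure). Returning 0.0 when both images are entirely white is a convention chosen here to avoid dividing by zero.

```python
def normalized_match_index(glyph, template):
    """NMI = (M+ - M-) / (M+ + M-), in the range [-1, 1]."""
    m_plus = m_minus = 0
    for g_row, t_row in zip(glyph, template):
        for g, t in zip(g_row, t_row):
            if g == 1 and t == 1:
                m_plus += 1
            elif g == 1 or t == 1:
                m_minus += 1
    total = m_plus + m_minus
    if total == 0:
        return 0.0                      # both images blank: assumed convention
    return (m_plus - m_minus) / total

a = [[1, 1], [0, 0]]
b = [[0, 0], [1, 1]]
c = [[1, 0], [0, 1]]
print(normalized_match_index(a, a))    # identical images
print(normalized_match_index(a, b))    # no black pixel in common
print(normalized_match_index(a, c))    # one match, two mismatches
```

Unlike the raw difference, the index does not grow with glyph size, so scores for large and small characters are comparable.
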

30
Template Matching-II
  • To speed up template matching, we can take a
    subsample of the glyph and match that first.
  • If the match is good, then match the other
    subsamples; otherwise, don't.
  • This approach has some problems too.
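
The speed-up might be sketched as a coarse-then-full comparison. The sketch assumes the subsample is every second row and column, scores it with a normalized match index, and only runs the full comparison when the coarse score reaches a cutoff; `step`, `cutoff`, and the two templates are illustrative assumptions, not values from the slides.

```python
def nmi(glyph, template, step=1):
    """Normalized match index over every `step`-th row and column."""
    plus = minus = 0
    for r in range(0, len(glyph), step):
        for c in range(0, len(glyph[0]), step):
            g, t = glyph[r][c], template[r][c]
            if g and t:
                plus += 1
            elif g or t:
                minus += 1
    return (plus - minus) / (plus + minus) if plus + minus else 0.0

def coarse_to_fine_match(glyph, templates, step=2, cutoff=0.0):
    """Return (best template name, number of full comparisons performed)."""
    best_name, best_score, full_matches = None, -2.0, 0
    for name, template in templates.items():
        if nmi(glyph, template, step) < cutoff:
            continue                    # poor coarse match: skip the full comparison
        full_matches += 1
        score = nmi(glyph, template)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, full_matches

templates = {
    "I": [[0, 1, 1, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 1, 1, 0]],
    "-": [[0, 0, 0, 0], [1, 1, 1, 1], [1, 1, 1, 1], [0, 0, 0, 0]],
}
glyph = [[0, 1, 1, 0], [0, 1, 1, 0], [0, 1, 1, 1], [0, 1, 1, 0]]
result = coarse_to_fine_match(glyph, templates)
print(result)
```
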

31
Statistical Matching
  • Pixel to Pixel match clearly has its limitations.
  • We need to move beyond just pixels to do
    analysis.
  • This again brings us to classic pattern
    recognition.

32
Feature
  • A feature is a useful measurement.
  • Usefulness depends on the problem to be solved.
  • For example, a useful feature for a classification
    problem is a measurement that maximizes the
    separation between classes and also the
    similarity to the right match.

33
Feature-II
  • Features can be combined to form a large feature
    vector.
  • If N is the number of elements in the feature
    vector, then we have an N-dimensional space.
  • To match a glyph, you calculate the Euclidean
    distance between the feature vectors of the 128
    templates and the feature vector of the glyph.
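
Matching in feature space can be sketched as a nearest-neighbour search. The three features used here (fraction of black pixels, number of holes, height-to-width ratio) and all the numbers are made up for illustration; in practice the features would usually be rescaled so that no single one dominates the distance.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_template(glyph_features, template_features):
    """Name of the template whose feature vector is closest to the glyph's."""
    return min(template_features,
               key=lambda name: euclidean(glyph_features, template_features[name]))

# Hypothetical feature vectors: (black fraction, holes, height/width).
template_features = {
    "o": (0.40, 1, 1.0),
    "l": (0.30, 0, 3.0),
    "B": (0.55, 2, 1.5),
}
glyph_features = (0.42, 1, 1.1)
label = nearest_template(glyph_features, template_features)
print(label)
```
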

34
An Example Feature
35
Other Possible Features
  • Slopes
  • Holes
  • Signature: distance from the centroid of a glyph
    to the boundary, measured from 0 to 360 degrees.
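
A rough sketch of the signature feature follows. It assumes a bi-level glyph (1 = black) and approximates the distance to the boundary by the farthest black pixel in each angular bin, with bin k centred on k * 360 / bins degrees; a real implementation would trace the actual contour.

```python
import math

def centroid(glyph):
    """Mean (row, column) position of the black pixels."""
    pts = [(r, c) for r, row in enumerate(glyph) for c, v in enumerate(row) if v]
    n = len(pts)
    return sum(r for r, _ in pts) / n, sum(c for _, c in pts) / n

def signature(glyph, bins=8):
    """Largest centroid-to-black-pixel distance in each angular bin."""
    cr, cc = centroid(glyph)
    sig = [0.0] * bins
    for r, row in enumerate(glyph):
        for c, v in enumerate(row):
            if not v:
                continue
            dy, dx = cr - r, c - cc     # flip rows so angles grow counter-clockwise
            dist = math.hypot(dx, dy)
            if dist == 0:
                continue                # the centroid itself has no direction
            angle = math.degrees(math.atan2(dy, dx)) % 360
            b = int(round(angle / (360 / bins))) % bins
            sig[b] = max(sig[b], dist)
    return sig

plus_sign = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
]
sig = signature(plus_sign)
print(sig)
```
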

36
Orientation
  • Skew: distortion, especially from a true value or
    symmetrical form.
  • Characterized by the skew angle.
  • If the skew angle is known:
  • If it is small, horizontal projection will work.
  • If it is large, we need to rotate the image.

37
Estimation of Rotation
  • For characters with no descenders (all except g,
    j, p, q and y), the bottoms of the characters in
    each line are collinear.
  • If the bounding box of each character is found,
    then the centers of the bottoms of the bounding
    boxes should all be collinear.
  • This is the algorithm used by Baird (1987).
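
The collinearity idea can be illustrated with a simplified stand-in: fit a least-squares line through the bottom-center points of the bounding boxes on one text line and read the skew angle off its slope. This is not Baird's published procedure, only a sketch of the underlying observation; the box format `(left, top, right, bottom)` and the sample boxes are assumptions.

```python
import math

def bottom_centers(boxes):
    """Bottom-center point of each (left, top, right, bottom) box; y grows downward."""
    return [((left + right) / 2, bottom) for left, top, right, bottom in boxes]

def estimate_skew_degrees(points):
    """Angle of the least-squares line through the points, in degrees."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return math.degrees(math.atan2(sxy, sxx))

# Made-up boxes whose bottoms lie on a line with slope 0.1.
boxes = [(5, 80, 15, 101), (25, 82, 35, 103), (45, 84, 55, 105), (65, 86, 75, 107)]
angle = estimate_skew_degrees(bottom_centers(boxes))
print(round(angle, 2))
```

Because y grows downward in image coordinates, a positive angle here means the baseline drops toward the right. Characters with descenders would pull the fit off and should be excluded or handled with a robust fit.
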

38
Baird Algorithm
39
BookReader @ CUbiC
40
Haptic Data
  • Skin sensation is essential for many manipulation
    and exploration tasks.
  • Tactile sensing is also the basis of complex
    perceptual tasks like medical palpation, where
    physicians locate hidden anatomical structures
    and evaluate tissue properties using their hands.
  • We have tactile display and tactile sensing

41
Tactile Display
  • The Sony PS2 and Xbox both have force feedback in
    their controllers.
  • DirectX can be used to generate force feedback
    with connected devices.
  • Tactile pin arrays convey visual information
    to the blind.
  • Vibrotactile displays convey auditory information
    to the hearing impaired.
  • Teleoperation.

42
Display types
  • Vibrations can relay information about phenomena
    like surface texture, slip, impact, and puncture.
  • A vibrator for each finger or region of skin may
    be adequate.
  • Small-scale shape or pressure distribution
    information is much more difficult to convey.
  • One option is an array of closely spaced pins that
    can be individually raised and lowered against the
    fingertip to approximate the desired shape.
  • Thermal display is a relatively new area of
    research.
  • Thermal perceptions are based on a combination of
    thermal conductivity, thermal capacity, and
    temperature.

43
Force Feedback
  • Many force-feedback devices use the same hardware
    as a sound card for haptic data generation and
    capture.
  • TouchWare Desktop (Immersion Corporation)
  • Haptic Mouse

44
Gaming System
45
Haptic Data Capture
  • The most common way of capturing haptic data is
    through gloves.
  • Haptic gloves are an adjunct to the DataGlove,
    which measures finger angles and other
    information.

46
DataGlove
  • 150 frames a second
  • Angle resolution 0.5 degrees

47
Basic Touch
  • 5 pressure sensors on the fingers and one on the
    palm.
  • Vibrational frequency: 0-125 Hz
  • Vibrational amplitude: 1.2 N peak-to-peak @ 125
    Hz (max)

48
Full Fingers Feedback and Capture
  • Maximum continuous force: 12 N per finger
  • Force resolution: 12-bit

49
Fingers and Palm
  • Costs 100K!

50
Haptics
  • The new buzzword in medical applications,
    teleoperation, and at NASA.
  • Has widespread use, but the technology still needs
    to advance further.
  • A wide research area; you can pursue it.

51
Resources
  • J. R. Parker, Algorithms for Image Processing and
    Computer Vision
  • Immersion Corporation, www.immersion.com