Visual Information Systems presentation

Title: Visual Information Systems

1
Visual Information Systems

visual information retrieval

2
Computational steps for visual retrieval systems

image processing (colour, texture etc)
human perception and computer perception
(computer vision)
Sensory gap
feature definition and extraction
low-level and high-level
content, semantics, and concepts
small scale and large scale
knowledge domain, knowledge elicitation,
knowledge discovery and management
Similarity measure, learn from feedback, and
dynamic indexing
Databases and system architecture
Evaluation, not just system performance, but
insights for the future

3
VIR and Traditional Database?

A traditional SQL database has as its basic
element data items in a relation
select name
from employee, project
where employee.deptnumber = 25 AND
project.number = 100
databases exploit known structures and relations
DBMS retrieval is not probabilistic
How different from the WWW?
And from traditional IR?
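
To make the exact, non-probabilistic nature of DBMS retrieval concrete, the sketch below runs a query of the same shape against an in-memory SQLite database. The table layouts, the sample rows, and the equality comparisons in the WHERE clause are assumptions made for this illustration; they are not taken from the slides.

```python
import sqlite3

# Hypothetical schema and rows, invented for this example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (name TEXT, deptnumber INTEGER);
    CREATE TABLE project  (number INTEGER, deptnumber INTEGER);
    INSERT INTO employee VALUES ('Ann', 25), ('Bob', 30), ('Cy', 25);
    INSERT INTO project  VALUES (100, 25), (200, 30);
""")

# Every row either satisfies the predicate or it does not:
# there is no score, no ranking, and no notion of "nearly matching".
rows = conn.execute(
    """
    SELECT name
    FROM employee, project
    WHERE employee.deptnumber = 25 AND project.number = 100
    ORDER BY name
    """
).fetchall()
names = [row[0] for row in rows]
print(names)
conn.close()
```

Bob is excluded outright because his department number differs; a retrieval system based on similarity would instead assign him a low score.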

4
VIR and Traditional IR systems?

IR systems can be considered the precursors to
VIR
The basic unit of an IR system is a document and
the focus is on textual retrieval
exact matching - Boolean, text pattern searching
inexact matching - probabilistic, vector space,
clustering
Visual information has its own characteristics
that traditional IR cannot handle.
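
The difference between exact and inexact matching can be sketched in a few lines. The example below contrasts a Boolean AND query with a vector space ranking based on cosine similarity over raw term counts. The three documents and the function names are hypothetical, and real systems would use weighted terms rather than raw counts.

```python
import math
from collections import Counter

# Hypothetical toy collection.
docs = {
    "d1": "red sunset over the sea",
    "d2": "red car on the road",
    "d3": "sunset and sea birds",
}

def boolean_and(query, docs):
    """Exact matching: a document matches only if it contains every query term."""
    terms = set(query.split())
    return sorted(d for d, text in docs.items() if terms <= set(text.split()))

def cosine(a, b):
    """Cosine similarity between the term-count vectors of two strings."""
    va, vb = Counter(a.split()), Counter(b.split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * \
           math.sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0

def vector_rank(query, docs):
    """Inexact matching: rank every document with a non-zero similarity."""
    scored = [(cosine(query, text), d) for d, text in docs.items()]
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [d for score, d in ordered if score > 0]

exact = boolean_and("red sunset", docs)
ranked = vector_rank("red sunset", docs)
print(exact, ranked)
```

The Boolean query returns only the document containing both terms, while the vector space model also returns partial matches, ordered by similarity.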

5
Recap IR: What's IR?

Motivation
the larger the holdings of the archive, the more
useful it is
however, it is harder to find what you want
IR is all about finding what you want when what
you want is buried in a mass of what you don't
want

6
from Lesk, http://community.bellcore.com/lesk/columbia/session2/
7
Simple IR Model
User
Query: Boolean, Vector
Feedback
Results
Post-Processing: Ranking, Clustering, Weighting
Pre-Processing: Stemming, Thesaurus, Signature
Searching: Boolean, Vector
Storage: Flat Files, Inverted Files, Signature Files, PAT Trees
Collection Processing: Stemming, Stoplist
Stuff
8
Recap IR: Precision and Recall

Precision
ratio of the number of relevant documents
retrieved over the total number of documents
retrieved
how much extra stuff did you get?
Recall
ratio of relevant documents retrieved for a
given query over the number of relevant documents
for that query in the database
how much did you miss?
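
Both ratios follow directly from set arithmetic. The sketch below computes them for a single query; the document identifiers and the function name are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: identifiers the system returned
    relevant:  identifiers that are actually relevant in the database
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)          # relevant documents retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical query: 4 documents returned, 5 relevant documents exist.
p, r = precision_recall(
    retrieved=["d1", "d2", "d3", "d4"],
    relevant=["d1", "d3", "d5", "d6", "d7"],
)
print(p, r)
```

Here two of the four returned documents are relevant (precision 2/4), and two of the five relevant documents were found (recall 2/5).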

9
Recap IR: Text Retrieval

The most popular approach is to extract keywords
from each text document in the database to form
the indices of the document.
The keyword extraction process may be divided
into three major steps: stopword removal,
stemming and word weighting.
Stopword removal drops very common words such as
"a", "an" and "the".
Stemming removes the suffix and prefix of each
word.
Word weighting estimates the importance of each
word.
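
The three steps can be sketched as a small indexing pipeline. Everything below is an illustrative assumption: the stopword list is deliberately tiny, the stemmer only strips two suffixes (a real system would use something like the Porter algorithm), and TF-IDF is one common choice of word weighting that the slides do not name.

```python
import math
from collections import Counter

STOPWORDS = {"a", "an", "and", "the", "of", "in", "on"}  # tiny illustrative list
SUFFIXES = ("ing", "s")  # crude stand-in for a real stemmer

def stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def keywords(text):
    """Step 1 (stopword removal) followed by step 2 (stemming)."""
    return [stem(w) for w in text.lower().split() if w not in STOPWORDS]

def tf_idf_index(docs):
    """Step 3 (word weighting): term frequency times log inverse document frequency."""
    tokens = {d: keywords(text) for d, text in docs.items()}
    df = Counter(term for toks in tokens.values() for term in set(toks))
    n = len(docs)
    return {
        d: {term: tf * math.log(n / df[term]) for term, tf in Counter(toks).items()}
        for d, toks in tokens.items()
    }

# Hypothetical three-document collection.
docs = {
    "d1": "the indexing of colour images",
    "d2": "an image and a video",
    "d3": "indexing the videos",
}
index = tf_idf_index(docs)
print(index["d1"])
```

In this collection "colour" occurs in only one document, so it receives a higher weight than "image" or "index", which each occur in two.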

10
Recap IR: Text Retrieval

The query goes through the same procedure.
Similarity matching is calculated from the
pre-computed weighting of the matched keywords.
All documents with a similarity value higher than
a certain threshold will be considered as
relevant documents and returned to the user.
These relevant documents may be ranked according
to the similarity values when presented to the
user. (Most web search engines do this.)
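
A minimal sketch of this matching step follows. It assumes the query terms have already been through stopword removal and stemming, and it uses a simple sum of matched keyword weights as the similarity value. The index contents, the threshold of 0.2, and the function name are hypothetical.

```python
# Hypothetical pre-computed keyword weights for three documents.
index = {
    "d1": {"sunset": 0.8, "sea": 0.5},
    "d2": {"car": 0.9, "road": 0.4},
    "d3": {"sunset": 0.3, "bird": 0.7},
}

def search(query_terms, index, threshold=0.2):
    """Score each document by the summed weights of its matched keywords,
    keep those above the threshold, and rank them best first."""
    scores = {}
    for doc, weights in index.items():
        score = sum(weights.get(term, 0.0) for term in query_terms)
        if score > threshold:
            scores[doc] = score
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

results = search(["sunset", "sea"], index)
for doc, score in results:
    print(doc, round(score, 2))
```

Document d2 shares no keyword with the query, so its score of zero falls below the threshold and it is not returned.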

11
Visual Information Retrieval: keyword approach

It is difficult for text to capture the
perceptual saliency of some visual features
Pictures cannot speak, but they are stronger than
words.
Text is not well suited for modelling perceptual
similarity.
Text descriptions are subjective.
What is needed in these cases is the use of a
more concrete description of visual content, one
more closely related to human perception, and a
new way of interaction that fully exploits human
perception capabilities.

12
Visual Information Retrieval: content-based
approach

Textual content: free text search

Image content: image features, shapes, color,
textures, spatial relationships

Video content: motion, image features, scene
composition, video semantics, audio, etc.
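
As one concrete instance of an image feature, the sketch below builds a coarse colour histogram and compares two images by histogram intersection. This is a common baseline rather than a method the slides prescribe. Images are represented as plain lists of (r, g, b) tuples so the example needs no imaging library, and the pixel data is synthetic.

```python
def colour_histogram(pixels, bins=4):
    """Normalised joint RGB histogram with `bins` levels per channel."""
    if not pixels:
        raise ValueError("image has no pixels")
    hist = [0.0] * (bins ** 3)
    step = 256 // bins
    for r, g, b in pixels:
        idx = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[idx] += 1
    return [count / len(pixels) for count in hist]

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical colour distributions."""
    return sum(min(a, b) for a, b in zip(h1, h2))

# Synthetic "images": orange sky over dark water, and a mostly green scene.
sunset = [(250, 120, 30)] * 60 + [(20, 20, 80)] * 40
sunrise = [(240, 110, 40)] * 50 + [(30, 30, 90)] * 50
forest = [(30, 140, 40)] * 90 + [(20, 20, 80)] * 10

h_sunset, h_sunrise, h_forest = map(colour_histogram, (sunset, sunrise, forest))
sim_close = histogram_intersection(h_sunset, h_sunrise)
sim_far = histogram_intersection(h_sunset, h_forest)
print(sim_close, sim_far)
```

The histogram ignores where colours occur in the image, which is why the slide also lists shapes, textures, and spatial relationships as separate features.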

13
Content-Based Image Retrieval

As happens during the maturation process of many
a discipline, after early successes in a few
applications, research is now concentrating on
deeper problems, challenging the hard problems at
the crossroads of the disciplines from which it
was born (Arnold 2000):
computer vision, databases, and information
retrieval.
Deeper analysis is needed and semantics is more
desirable: make use of domain knowledge.

14
Domain and Variability

A narrow domain has a limited and predictable
variability in all relevant aspects of its
appearance.
Semantics is well-defined, and unique.
A broad domain has an unlimited and unpredictable
variability in its appearance even for the same
semantic meaning.
Semantics is more ambiguous and partial.
More contextual information is needed.

15
Domain and Variability

The notions of broad and narrow domains are
helpful in characterizing patterns of use, in
selecting features, and in designing systems.
For narrow, specialized image domains, the gap
between features and their semantic
interpretation is usually smaller, so
domain-specific models may help.
In a broad image domain, the gap between the
feature description and the semantic
interpretation is generally wide;
the required number of computational variables
would be enormous.
Research issues raised

16
Research issues

How to handle variability?
Multiple processors and fusion process?
Inference engines?

17
Domain Knowledge

Laws of syntactic (literal) equality and
similarity define the relation between image
pixels or image features regardless of their
physical or perceptual causes.
Laws describing the human perception of equality
and similarity.
Physical laws describing equality and difference
of images under differences in sensing and object
surface properties. The physics of illumination,
surface reflection, and image formation have a
general effect on images.
Geometric and topological rules describe equality
and differences of patterns in space.
Category-based rules encode the characteristics
common to class z of the space of all notions Z.
Finally, man-made customs or man-related patterns
introduce rules of culture-based equality and
difference.

18
Difficulties in VIS

The sensory gap and the semantic gap

19
The Semantic Gap

A linguistic description is almost always
contextual, whereas an image may live by itself.
Associate higher-level semantics with data-driven
observables.
Labelling is seldom complete and is context sensitive,
and, in any case, there is a significant fraction
of requests whose semantics can't be captured by
labelling alone. Both methods will cover the
semantic gap only in isolated cases.
This works well in a narrow domain like I-Browse,
though it is not the perfect solution.

20
From broad domain to narrow domain

The challenge for image search engines on a broad
domain is to tailor the engine to the narrow
domain the user has in mind via specification,
examples, and interaction.

21
Bridging the Gap

New challenges in content-based retrieval are the
huge number of objects to search among, the
incomplete query specification, the incomplete
image description, and the variability of sensing
conditions and object states.
The aim of content-based retrieval systems must
be to provide maximum support in bridging the
semantic gap between the simplicity of available
visual features and the richness of the user
semantics.
The broader the domain, the more browsing or
search by association can be the right solution.
The narrower the domain, the more likely an
application of domain knowledge will succeed.

22
Video Retrieval

There are three major processes to prepare a
video for retrieval: video segmentation, index
extraction and keyframe extraction.
From another perspective, video retrieval could
be considered simpler than image retrieval since
video reveals its objects more easily as the
points corresponding to one object move together.
In addition, video has a linear timeline, as
important to the narrative structure of video as
it is in text.

23
Video Retrieval

Video segmentation divides the video into a
number of segments by detecting the camera
breaks.
Index extraction: manual indexing, image analysis,
computer vision, and object recognition.
Keyframe extraction selects representative
image frames from each video segment to represent
the segment. These keyframes may be used for
browsing and for presentation.
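
Segmentation and keyframe extraction can be sketched together. The example below assumes each frame has already been reduced to a normalised colour histogram, declares a camera break wherever consecutive histograms differ by more than a threshold, and picks the middle frame of each segment as its keyframe. The threshold, the middle-frame rule, and the synthetic frame data are all assumptions made for illustration.

```python
def l1_distance(h1, h2):
    """Sum of absolute bin differences between two histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def segment_video(frame_hists, threshold=0.5):
    """Split frames into (start, end) segments, end exclusive, at camera breaks."""
    if not frame_hists:
        return []
    segments, start = [], 0
    for i in range(1, len(frame_hists)):
        # A large jump between consecutive frames is treated as a camera break.
        if l1_distance(frame_hists[i - 1], frame_hists[i]) > threshold:
            segments.append((start, i))
            start = i
    segments.append((start, len(frame_hists)))
    return segments

def keyframes(segments):
    """Pick the middle frame index of each segment as its representative."""
    return [(start + end - 1) // 2 for start, end in segments]

# Synthetic video: three shots with clearly different colour distributions.
frames = (
    [[0.9, 0.1, 0.0]] * 4
    + [[0.1, 0.2, 0.7]] * 3
    + [[0.0, 0.9, 0.1]] * 5
)
shots = segment_video(frames)
keys = keyframes(shots)
print(shots, keys)
```

A fixed threshold only catches abrupt cuts; gradual transitions such as fades and dissolves need a comparison over a longer window of frames.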
