Affinity Hybrid Tree: An Indexing Technique for Content-Based Image Retrieval in Multimedia Databases

1
Affinity Hybrid Tree: An Indexing Technique for
Content-Based Image Retrieval in Multimedia
Databases
  • Kasturi Chatterjee and Shu-Ching Chen
  • Distributed Multimedia Information System
    Laboratory
  • School of Computing and Information Sciences
  • Florida International University, Miami, FL
    33199, USA

2
Outline
  • Motivation
  • Need for indexing in multimedia databases
  • Need for high-level image relationships
  • Need for embedding high-level
    relationships in the index structure
  • Literature Review
  • Multidimensional index structures
  • Indexing mechanisms supporting CBIR and
    RF
  • Affinity relationships
  • Affinity Hybrid Tree (AH-Tree)
  • Proposed structure
  • Characteristics
  • Similarity queries
  • Experimental Analysis
  • Conclusion and Future Work

3
Motivation
  • Need for indexing in multimedia databases
  • Popularity of multimedia presentation and
    storage
  • Emphasizes the requirement for efficient
    multimedia storage and retrieval
  • Indexing is an integral part of designing
    a database system to reduce computation
    overhead and optimize retrieval
  • Multimedia data (e.g., image)
    representation differs from traditional
    data; it is generally in the form of
    multi-dimensional feature vectors
  • Index structures should handle high
    dimensionality efficiently: the higher the
    dimension, the better the multimedia
    representation and the more satisfactory the
    retrieval results
  • Thus, we need a specialized index structure and
    retrieval mechanism, different from traditional
    indexing, to handle the above concerns

4
Motivation
  • Need for high-level image relationships
  • Content-Based Image Retrieval (CBIR) with
    Relevance Feedback (RF) is a popular image
    retrieval mechanism
  • CBIR incorporates high-level image
    relationships in the retrieval method with the
    help of RF to capture the user's similarity concept
  • The more accurately the user's similarity
    concept is interpreted, the better the
    relevance of the query results

5
Motivation
  • Need for embedding high-level relationships in the
    index structure
  • An index structure is required to aid a
    retrieval mechanism in terms of computational
    efficiency, while high-level image relationships
    improve the quality and relevance of the
    retrieval result
  • Hence, efficient multimedia storage that
    supports a retrieval mechanism such as CBIR
    requires an index structure that supports
    high-level image relationships

6
Literature Review
  • Multi-dimensional index structures fall into two
    categories: feature based and distance based
  • Feature-based index structures project an image
    as a feature vector in a feature
    space and index the space,
    e.g., KDB-tree and R-tree
  • Distance-based index structures are built
    based on the distances or similarities
    between two data objects,
    e.g., M-tree, vp-tree
  • Each category can be further subdivided into
    data partitioned (DP) and space partitioned (SP)
  • A DP-based index structure consists of bounding
    regions (BRs) arranged in a (spatial)
    containment hierarchy, e.g., R-tree, X-tree
  • An SP-based index structure consists of space
    recursively partitioned into
    mutually disjoint subspaces, e.g.,
    KDB-tree, vp-tree

7
Literature Review
  • Indexing mechanisms supporting CBIR and RF
  • None of the discussed index mechanisms captures
    the high-level relationship as it is, without
    attempting to translate it into its low-level
    equivalent [4], [6], [11]
  • Capturing the user's similarity concept
    in the form of feature-level closeness is
    often error-prone and/or imposes a heavy burden on
    the end user (which is not desirable)
  • Affinity relationships
  • A way to capture the user's similarity
    measure in a CBIR paradigm
  • A parameter of the Markov Model Mediator
    proposed in [16], whose main idea is that the more
    frequently two images are accessed together, the more
    related they are and the higher their affinity
    value
  • The relative affinity measurement
    between two images m and n is calculated from
    the following quantities

use(m,k): usage pattern of image m w.r.t. query
image q_k per time period
access(k): access frequency of query q_k per
time period
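
The affinity formula itself is not preserved in this transcript. The following is a minimal sketch, assuming the form in which the usage patterns of the two images are multiplied and weighted by the access frequency of each query, then summed over all queries. The function name, the 0/1 usage encoding, and the toy data are hypothetical and are not taken from the slides.

```python
def relative_affinity(use, access, m, n):
    """Assumed form: sum over queries k of use[m][k] * use[n][k] * access[k]."""
    return sum(use[m][k] * use[n][k] * access[k] for k in range(len(access)))

# Toy data (hypothetical): 3 images, 2 queries.
# use[i][k] is 1 if image i was accessed under query q_k, else 0.
use = [
    [1, 1],  # image 0
    [1, 0],  # image 1
    [0, 1],  # image 2
]
access = [3, 5]  # access frequency of q_0 and q_1 per time period

aff_01 = relative_affinity(use, access, 0, 1)  # co-accessed only under q_0
aff_02 = relative_affinity(use, access, 0, 2)  # co-accessed only under q_1
aff_12 = relative_affinity(use, access, 1, 2)  # never accessed together
```

Under this assumed form, images that are never accessed together get an affinity of zero, which matches the stated idea that co-access frequency drives the affinity value.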
8
Literature Review
  • Limitations of existing works
  • None of the existing multidimensional
    indexing structures can incorporate the
    high-level image relationship efficiently in its
    framework
  • Feature-based index structures cannot
    incorporate the high-level image relationship
    because Spatial Access Methods require
    the distances between objects to be
    strictly related to the object position in a
    low-dimensional vector space [13], etc.
  • Thus, image object similarity needs to be
    translated along different
    dimensions, which becomes problematic
  • Distance-based index structures can
    incorporate the high-level image relationship
    as it is, but not efficiently, since the number
    of object-pair distance calculations is huge
  • Thus, it negates the very essence of indexing,
    which aims at keeping the computation
    overhead as low as possible

9
Affinity Hybrid Tree (AH-Tree)
  • Proposed Structure
  • Solves the two limitations discussed by
    combining feature-based and distance-based
    index mechanisms into one hybrid structure
  • The feature-based index mechanism filters the
    feature space and reduces the number of distance
    computations to be performed, which
    reduces computational overhead
  • The distance-based index mechanism
    incorporates the high-level image
    relationship as it is, without translating
    it into its low-level equivalent, which
    increases retrieved image relevance by capturing
    the user concept as it is
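
As a rough illustration of why filtering the feature space first can cut distance computations, the sketch below counts object pairs with and without partitioning. It is only a crude upper bound: the split into 100 equal-sized subspaces is a hypothetical choice, and a real metric tree computes far fewer distances than all pairs.

```python
from math import comb

n_images = 10_000      # database size reported in the experimental setup
n_subspaces = 100      # hypothetical: equal-sized feature subspaces

# All object pairs when every image lives in one metric space.
pairs_unfiltered = comb(n_images, 2)

# Pairs when distances are only computed inside each subspace.
per_subspace = n_images // n_subspaces
pairs_filtered = n_subspaces * comb(per_subspace, 2)

reduction = pairs_unfiltered / pairs_filtered
```

Under these assumptions the pair count drops by roughly two orders of magnitude, which is the intuition behind placing a feature-based filter in front of the distance-based index.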
10
Affinity Hybrid Tree (AH-Tree)
  • Building AH-Tree
  • 1. Build the space index by feeding data points
  • 2. For each data node of the space index,
    check whether the number of data points is
    greater than a threshold
  • 3. If yes, merge the data nodes; build a
    distance-based index for each data node. Each
    data node consists of the root of the
    corresponding distance-based index tree

11
Affinity Hybrid Tree (AH-Tree)
  • Characteristics
  • Incorporation of affinity value: the
    affinity value between the image objects is
    incorporated after the AH-Tree is built,
    during the queries, for pruning the tree
  • Why after the tree is built and not during
    construction?
  • AH-Tree uses a metric distance function
    to calculate the (dis)similarity among the
    image objects in the distance-based index
    structure
  • A distance function is metric if it
    satisfies
  • (a) symmetry
  • (b) positivity
  • (c) the triangle inequality

12
Affinity Hybrid Tree (AH-Tree)
  • Why after the tree is built and not during
    construction? (cont.)
  • To satisfy the triangle
    inequality, the affinity value cannot
    be incorporated in the tree during building:
    that would require the affinity values
    between the image objects, which are used to scale the
    similarity measurement, to be equal, which is
    clearly not possible.
  • Lemma 1: The affinity relationship cannot be
    involved while constructing the metric tree, as the
    search space would no longer be metric.
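
A small numeric check can make the argument concrete. The sketch below assumes, purely for illustration, that affinity scales a Euclidean distance by division; the slides do not state the exact scaling, so `scaled` and the affinity values are hypothetical.

```python
import math

def scaled(a, b, aff):
    """Hypothetical affinity-scaled dissimilarity: distance divided by affinity."""
    return math.dist(a, b) / aff

x, y, z = (0.0, 0.0), (1.0, 0.0), (2.0, 0.0)

# Plain Euclidean distance satisfies the triangle inequality.
plain_ok = math.dist(x, z) <= math.dist(x, y) + math.dist(y, z)

# Unequal affinities: x and z are weakly related, the other pairs strongly.
d_xz = scaled(x, z, aff=0.5)   # 2 / 0.5
d_xy = scaled(x, y, aff=2.0)   # 1 / 2.0
d_yz = scaled(y, z, aff=2.0)   # 1 / 2.0
scaled_ok = d_xz <= d_xy + d_yz

# With one common affinity for every pair, the inequality survives.
equal_ok = scaled(x, z, 2.0) <= scaled(x, y, 2.0) + scaled(y, z, 2.0)
```

With unequal affinities the scaled values no longer obey the triangle inequality, while a single common scale factor preserves it, which is the situation the lemma describes.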

13
Affinity Hybrid Tree (AH-Tree)
  • Characteristics
  • Promotion of affinity value
  • Since the affinity
    value is included in the tree after
    construction, a technique must be defined
    to promote the affinity value
    from the leaves through the intermediate parent nodes
    up to the root
  • The affinity value is promoted as follows
  • 1. During each query Q, the affinity value of
    the leaves with respect to the query object is
    derived
  • 2. The affinity value of a leaf node along with
    its siblings is used to calculate the affinity
    value of their parent (intermediate node)
  • 3. Step 2 is repeated until the affinity value
    of the root is determined
  • The affinity value thus promoted and
    distributed to each node of the tree is used
    during retrieval for the similarity query Q

14
Affinity Hybrid Tree (AH-Tree)
  • Promotion of affinity value (cont.)
  • The promotion technique is illustrated
    as follows
  • 1. Let Na and Nb be the leaf nodes of the
    distance-based index structure with
    image objects Oa and Ob respectively, and let Nr
    be their parent
  • 2. Let aff(a,q) and aff(b,q) be the
    pre-computed affinity values between the query object
    and a and b, respectively
  • 3. Thus, the affinity of the parent Nr is equal
    to max(aff(a,q), aff(b,q))
15
Affinity Hybrid Tree (AH-Tree)
  • Promotion of affinity value (cont.)
  • The defined promotion technique ensures two
    important criteria
  • Avoiding false dismissal: the promotion of the
    maximum value ensures that if any of the children
    possesses an affinity value greater than the
    required affinity, the parent is traversed and
    included in the query path
  • Avoiding unnecessary traversal: the parent node
    is discarded from the traversal path if none
    of its children has an affinity value greater
    than the required affinity
  • Thus, the affinity promotion technique
    implemented has two advantages
  • It enables us to embed high-level
    relationships in the metric space, thus
    capturing the user's concept better and
    giving better query results
  • It avoids unnecessary traversal, which
    further aids in reducing the
    computation overhead
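
The two criteria can be checked on a toy tree. The sketch below is an assumption-laden illustration: the dictionary-based nodes, the function names, and the use of "greater than or equal" as the threshold test are hypothetical choices, not details from the slides.

```python
def promote(node, aff_to_query):
    """Store in each node the maximum leaf affinity found in its subtree."""
    if "obj" in node:
        node["aff"] = aff_to_query[node["obj"]]
    else:
        node["aff"] = max(promote(c, aff_to_query) for c in node["children"])
    return node["aff"]

def search(node, required, visited):
    """Return leaf objects meeting the required affinity, recording visited nodes."""
    visited.append(node["name"])
    if "obj" in node:
        return [node["obj"]]
    found = []
    for child in node["children"]:
        if child["aff"] >= required:      # otherwise the whole subtree is skipped
            found += search(child, required, visited)
    return found

def leaf(name, obj):
    return {"name": name, "obj": obj}

left = {"name": "L", "children": [leaf("a", "Oa"), leaf("b", "Ob")]}
right = {"name": "R", "children": [leaf("c", "Oc"), leaf("d", "Od")]}
root = {"name": "root", "children": [left, right]}

aff_q = {"Oa": 0.2, "Ob": 0.7, "Oc": 0.1, "Od": 0.3}   # hypothetical values
promote(root, aff_q)

visited = []
hits = search(root, 0.5, visited) if root["aff"] >= 0.5 else []

# Brute-force answer, to confirm that nothing was falsely dismissed.
expected = [obj for obj, a in aff_q.items() if a >= 0.5]
```

Here the right subtree is never entered because its promoted maximum is below the threshold, yet the result matches the brute-force scan over all leaves.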
16
Affinity Hybrid Tree (AH-Tree)
  • Similarity Queries
  • The query is represented as a collection of features
  • Once the feature vector is obtained, the AH-Tree is
    traversed from the root to the feature subspaces
  • Once the appropriate feature subspaces are obtained,
    the corresponding metric spaces are merged
  • The affinity value is promoted from the leaves to the root
  • The image objects returned are those whose
  • (i) distances from the queried
    object satisfy the similarity
    measurement requirement and
  • (ii) affinity values with the
    queried object are greater than or equal
    to the supplied affinity value
  • k-NN search returns the top k objects most
    similar to a query image
17
Affinity Hybrid Tree (AH-Tree)
  • Range Queries
  • 1. Both search range and search radius are
    supplied with the query
  • 2. Search the feature space to get subspaces
    overlapping with the query object or
    falling within the specified range
  • 3. Merge neighboring feature spaces to increase the
    metric search space
  • 4. Affinity promotion
  • 5. Similarities of routing objects are evaluated
    against the query object with respect to the
    search radius. If satisfied, they are evaluated with
    respect to the supplied affinity value

18
Affinity Hybrid Tree (AH-Tree)
  • Range Queries (cont.)
  • 6. If both are satisfied, the metric search is
    iterated for the sub-tree of the
    routing object
  • Additional Characteristics
  • The search radius (range) is often difficult for
    the naïve user to specify correctly. To avoid
    this problem, a parallel result queue is
    maintained, consisting of objects that satisfy
    the affinity check even if the similarity
    check fails. The queue is presented to the user
    if he/she is not satisfied with the query
    result. This gives higher priority to
    high-level image relationships over low-level
    representations.
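
The two checks and the parallel result queue can be sketched over a flat candidate set. This is a simplification under stated assumptions: it skips the tree traversal entirely, uses Euclidean distance as the similarity measure, and all names and values are hypothetical.

```python
import math

def range_query(query, candidates, aff, radius, min_aff):
    """Return (results, fallback): results pass both checks, fallback holds
    objects that pass the affinity check but fail the similarity check."""
    results, fallback = [], []
    for obj_id, vec in candidates.items():
        close = math.dist(query, vec) <= radius
        related = aff[obj_id] >= min_aff
        if close and related:
            results.append(obj_id)
        elif related:
            fallback.append(obj_id)     # parallel result queue
    return results, fallback

candidates = {"A": (1.0, 0.0), "B": (5.0, 0.0), "C": (0.5, 0.5), "D": (9.0, 9.0)}
aff = {"A": 0.9, "B": 0.8, "C": 0.1, "D": 0.05}   # hypothetical affinities

results, fallback = range_query((0.0, 0.0), candidates, aff,
                                radius=2.0, min_aff=0.5)
```

Object B lies outside the radius but is strongly related to the query, so it lands in the fallback queue that would be shown if the user is not satisfied with the main result.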

19
Affinity Hybrid Tree (AH-Tree)
  • k-NN Search
  • 1. The feature space uses a branch-and-bound
    [8] technique to perform the k-NN search
  • 2. Performs an ordered depth-first search
    in the feature space
  • 3. Determines the k nearest subspaces of a
    given query point: at each non-leaf node, metric
    bounds are calculated between
    the query point and all its
    MBRs and stored in an ordered
    list; the list is pruned depending
    on similarity measures; on
    reaching data nodes, the nearest distance
    is updated and iteration continues
    until the k nearest subspaces
    are obtained
  • 4. The metric trees
    corresponding to each feature space are
    combined in an ordered fashion to refine
    the query result and increase the metric
    search space

20
Affinity Hybrid Tree (AH-Tree)
  • k-NN Search (cont.)
  • 5. Metric space search is the same as discussed
    in the range search method, except that both the
    search radius and the affinity value become
    dynamic here
  • The search radius and affinity value are made
    dynamic by setting them to the distance and affinity
    value between Q and the current k-th nearest
    neighbor, respectively, and storing all non-leaf
    nodes satisfying the similarity measurement in a
    priority queue
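
The dynamic search radius can be sketched with a bounded heap. This is a minimal sketch over a flat candidate set with Euclidean distance; it covers only the shrinking radius and leaves out the dynamic affinity threshold, because the slides do not say how the two are combined. All names and data are hypothetical.

```python
import heapq
import math

def knn(query, candidates, k):
    """Keep the k nearest objects seen so far; the search radius shrinks to
    the distance of the current k-th nearest neighbor."""
    best = []              # max-heap via negated distance: (-dist, obj_id)
    radius = math.inf      # dynamic search radius
    for obj_id, vec in candidates.items():
        d = math.dist(query, vec)
        if d >= radius:
            continue                       # cannot improve the current top k
        heapq.heappush(best, (-d, obj_id))
        if len(best) > k:
            heapq.heappop(best)            # drop the farthest of the k + 1
        if len(best) == k:
            radius = -best[0][0]           # distance of the k-th neighbor
    return [obj for _, obj in sorted((-nd, obj) for nd, obj in best)]

candidates = {"A": (3.0, 0.0), "B": (1.0, 0.0), "C": (2.0, 0.0),
              "D": (5.0, 0.0), "E": (0.5, 0.0)}
nearest = knn((0.0, 0.0), candidates, k=2)
```

In this run the radius tightens each time a closer object displaces the current k-th neighbor, so object D is rejected without entering the heap.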
21
Experimental Analysis
  • Experimental Setup
  • The application was built using C in a
    Linux environment
  • A node size of 4 KB was used
  • The image database has 10,000 images in 72 semantic
    categories
  • The feature matrix was developed
    from the color information of each image in
    HSV color space
  • A 10,000 x 10,000
    affinity matrix was re-computed from the
    training set and used to capture user perception
  • Computation Overhead: computation overhead is
    expressed with the following
  • a) I/O cost
  • b) CPU cost (number of distance
    computations)
  • The AH-Tree is compared with the performance of the
    M-Tree, which has the potential of introducing
    the high-level image relationship in its index
    structure
  • The AH-Tree is not compared with any space-based
    index structures, since they are incapable of
    embedding the high-level relationship as it is into
    their index structures
22
Experimental Analysis
  • I/O Cost: the space filtering mechanism reduces
    the number of image objects in the metric space,
    which affects the I/O cost
  • Tree Construction
  • Range Query

23
Experimental Analysis
  • I/O Cost
  • k-NN Search

24
Experimental Analysis
  • CPU Cost: the space filtering mechanism reduces
    the number of image objects in the metric space,
    which reduces the number of similarity computations,
    thus reducing CPU cost drastically
  • Tree Construction
  • Range Query

25
Experimental Analysis
  • CPU Cost
  • k-NN Search

26
Experimental Analysis
  • Accuracy
  • accuracy is defined as the percentage of the
    retrieved images that are semantically
    related to the query image
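
The accuracy definition can be sketched as follows, assuming that "semantically related" means sharing the query image's semantic category label. That reading, along with the function name and sample labels, is a hypothetical choice and is not stated in the slides.

```python
def accuracy(retrieved, category, query_id):
    """Percentage of retrieved images in the same category as the query."""
    if not retrieved:
        return 0.0
    hits = sum(1 for img in retrieved if category[img] == category[query_id])
    return 100.0 * hits / len(retrieved)

# Hypothetical labels for a query image and four retrieved images.
category = {"q": "beach", "i1": "beach", "i2": "beach",
            "i3": "forest", "i4": "beach"}
score = accuracy(["i1", "i2", "i3", "i4"], category, "q")
```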

Such a stark difference in accuracy is clearly
due to the introduction of the high-level
relationship in the AH-Tree, which captures the
user's concept of similarity better than
relying only upon the distance measurement
27
Conclusion
  • AH-Tree clearly outperforms existing
    multi-dimensional indexing structures in terms of
    computation overhead and accuracy of query
    results in the content-based image retrieval
    paradigm
  • The introduction of the feature space filtering
    technique has a great effect on these drastic
    improvements
  • Using the metric space to introduce the high-level
    image relationship as it is helps to produce such
    high-accuracy retrieval results
  • AH-Tree is flexible in introducing any kind of
    high-level image relationship
  • The AH-Tree, to the best of our knowledge, is the
    first attempt at combining feature space and
    metric space to produce a hybrid structure
    capable of meeting the two major goals of an
    index structure supporting multimedia objects
  • Efficient query retrieval
    with reduced CPU and I/O cost
  • Relevant query results with
    a satisfactory accuracy
    measurement

28
Future Work
  • Introducing query refinement mechanisms
  • Integrating some data mining techniques to
    calculate affinity relationships on the fly
  • Developing a unified seamless index structure
    supporting all kinds of multimedia objects and
    retrieval

29
Questions
30
References
  • [1] S. Berchtold, D. A. Keim, and H. Kriegel. The
    X-tree: An index structure for high-dimensional
    data. In Proceedings of the 22nd International
    Conference on Very Large Databases, pages 28-39,
    Bombay, India, September 1996.
  • [2] K. Chakrabarti. Hybrid tree code.
    http://www.ics.uci.edu/~kaushik/research/htree.html,
    2005.
  • [3] K. Chakrabarti and S. Mehrotra. The hybrid
    tree: An index structure for high-dimensional
    feature spaces. In Proceedings of the IEEE
    International Conference on Data Engineering,
    pages 440-447, Sydney, Australia, March 1999.
  • [4] K. Chakrabarti, K. Porkaew, M. Ortega, and S.
    Mehrotra. Evaluating refined queries in top-k
    retrieval systems. IEEE Transactions on Knowledge
    and Data Engineering (TKDE), 16(2):256-270,
    February 2004.
  • [5] P. Ciaccia, M. Patella, and P. Zezula.
    M-tree: An efficient access method for similarity
    search in metric spaces. In Proceedings of the
    23rd VLDB International Conference, pages
    426-435, Athens, Greece, August 1997.
  • [6] R. Fagin. Fuzzy queries in multimedia
    database systems. In PODS '98: Proceedings of the
    Seventeenth ACM SIGACT-SIGMOD-SIGART Symposium on
    Principles of Database Systems, pages 1-10,
    Seattle, Washington, United States, June 1998.
  • [7] D. Greene. An implementation and performance
    analysis of spatial data access methods. In
    Proceedings of ICDE, pages 606-615, Los Angeles,
    California, United States, February 1989.
  • [8] A. Guttman. R-trees: A dynamic index
    structure for spatial searching. In Proceedings
    of the 1984 ACM SIGMOD International Conference
    on Management of Data, pages 47-57, Boston,
    Massachusetts, United States, June 1984.
  • [9] R. Krishnapuram, S. Medasani, J. Hwan, C. Y.
    Sik, and R. Balasubramaniam. Content based image
    retrieval based on fuzzy approach. IEEE
    Transactions on Knowledge and Data Engineering
    (TKDE), 16(10):1185-1199, 2004.
  • [10] D. B. Lomet and B. Salzberg. The hB-tree: A
    multiattribute indexing method with good
    guaranteed performance. ACM Transactions on
    Database Systems, 15(4):625-658, 1990.
  • [11] A. Motro. VAGUE: A user interface to
    relational databases that permits vague queries.
    ACM Transactions on Office Information Systems,
    6(3):187-214, 1988.

31
References
  • [12] M. Patella. M-tree code.
    http://www-db.deis.unibo.it/Mtree, 2005.
  • [13] J. Robinson. The
    K-D-B-tree: A search structure for large
    multidimensional dynamic indexes. In Proceedings
    of the 1981 ACM SIGMOD International Conference
    on Management of Data, pages 10-18, Ann Arbor,
    Michigan, United States, April 1981.
  • [14] N. Roussopoulos, S. Kelley, and F. Vincent.
    Nearest neighbor queries. In Proceedings of the
    1995 ACM SIGMOD International Conference on
    Management of Data, pages 71-79, San Jose,
    California, United States, May 1995.
  • [15] Y. Rui, T. Huang, and S. Mehrotra. Content-based
    image retrieval with relevance feedback in
    MARS. In Proceedings of International Conference
    on Image Processing, pages 815-818, Santa
    Barbara, California, United States, October 1997.
  • [16] M.-L. Shyu, S.-C. Chen, M. Chen, C. Zhang,
    and C.-M. Shu. MMM: A stochastic mechanism for
    image database queries. In Proceedings of the
    IEEE Fifth International Symposium on Multimedia
    Software Engineering (MSE2003), pages 188-195,
    Taichung, Taiwan, ROC, December 2003.
  • [17] D. A. White and R. Jain. Similarity indexing
    with the SS-tree. In Proceedings of the 12th
    International Conference on Data Engineering,
    pages 516-523, New Orleans, LA, United States,
    February 1996.
  • [18] P. N. Yianilos. Data structures and
    algorithms for nearest neighbor search in general
    metric spaces. In Proceedings of the 3rd Annual
    ACM-SIAM Symposium on Discrete Algorithms, pages
    311-321, Philadelphia, PA, United States, January
    1993.
  • [19] P. Zezula, P. Ciaccia, and F. Rabitti.
    M-tree: A dynamic index for similarity queries in
    multimedia databases. In Technical Report 7,
    HERMES ESPRIT LTR Projects, 1996.