Autonomous Ontology Extraction Using Hierarchical UNSO Hypercube Graph

Provided by: Yosi

1
Autonomous Ontology Extraction Using Hierarchical
UNSO Hypercube Graph
  • Yosi Ben-Asher
  • Shlomo Berkovsky
  • Yaniv Eytani

2
Outline
  • Data Integration
  • UNSpecified Ontologies
  • Hierarchical UNSO
  • Autonomous Maintenance Operations
  • Ontology Extraction

3
Data Integration: Motivation
  • Consider two E-Commerce ads:
  • "Selling red BMW car, 2000, good condition,
    10,000 miles"
  • "Wanna buy second hand sports car with leather
    seats"
  • The descriptions are different
  • Yet they might refer to the same object

A smart mechanism for integrating heterogeneous
data is needed
4
Data Integration - Ontology
  • Ontology captures the semantic relationships
    between objects
  • Definition
  • A shared formalization of a conceptualization of
    a domain

5
Semantic Data Management: HyperCup
Schlosser et al.
  • Implemented ontology-based data management over
    a hypercube topology
  • Objects are mapped to the hypercube graph using
    a predefined ontology
  • Efficient broadcast and search
  • Achieved within a number of steps logarithmic
    in N (the number of users)
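The logarithmic bound can be illustrated with a generic binary hypercube. This is a minimal sketch of textbook dimension-by-dimension flooding, not the protocol of Schlosser et al.; the function names are hypothetical.

```python
def neighbors(node, dims):
    """Neighbors of a node in a binary hypercube: flip exactly one bit."""
    return [node ^ (1 << d) for d in range(dims)]


def broadcast_rounds(source, dims):
    """Flood a message along one dimension per round.

    Returns the number of rounds needed until every node is informed.
    """
    informed = {source}
    rounds = 0
    for d in range(dims):
        # Every informed node forwards the message across dimension d.
        informed |= {n ^ (1 << d) for n in informed}
        rounds += 1
    if len(informed) != 2 ** dims:
        raise RuntimeError("broadcast did not reach every node")
    return rounds


# A hypercube with N = 2**dims nodes is covered in dims = log2(N) rounds.
print(broadcast_rounds(source=0, dims=4))
```

With 4 dimensions there are 16 nodes, and the set of informed nodes doubles in each round.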

6
Unspecified Ontology (UNSO)
  • UNSO generalizes HyperCup by defining the notion
    of UNSpecified Ontology
  • An ontology which is not specified
  • i.e., there is no master ontology; each user
    specifies his own ontology
  • UNSO acts as a classification method
  • Similar objects that are mapped to the UNSO graph
    end up in close proximity
  • i.e., in nearby vertices (nodes) in the hypercube
    topology

7
UNSO: Unspecified Descriptions
  • Each object is described as a vector of
    (prop:val) pairs
  • "Selling red BMW car, 2005, good condition, 10000
    miles"
  • becomes
  • product:car, manufacturer:BMW, color:red,
    condition:good, production_year:2005,
    mileage:10000
  • Each vector is projected on the UNSO graph

8
UNSO: Mapping Description Vectors
  • Descriptions are mapped using hashing
  • Different props are hashed to different
    hypercube dimensions
  • vals are hashed to numeric values within the
    respective dimension
  • How to handle ambiguity?
  • Similar patterns for object descriptions are
    assumed
  • Common sense
  • Zipf's law
  • props and vals undergo simple standardization
    using a lexical reference system (e.g., WordNet)
  • e.g., car, auto and automobile are treated as
    the same term
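The hashing step above can be sketched as follows. The slides do not name a hash function or a collision policy, so the SHA-256 hash, the tiny `SYNONYMS` table (a stand-in for a WordNet lookup), and the parameters `num_dims` and `dim_size` are all assumptions made for illustration.

```python
import hashlib

# Assumption: a toy synonym table standing in for a lexical reference system.
SYNONYMS = {"auto": "car", "automobile": "car"}


def stable_hash(text):
    """Deterministic hash (Python's built-in hash() is salted per process)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def normalize(term):
    """Lower-case a term and collapse known synonyms to one canonical form."""
    term = term.strip().lower()
    return SYNONYMS.get(term, term)


def map_description(description, num_dims, dim_size):
    """Map {prop: val} pairs to {dimension index: coordinate}.

    Two props may hash to the same dimension; this sketch simply lets the
    later pair overwrite the earlier one.
    """
    coords = {}
    for prop, val in description.items():
        dim = stable_hash(normalize(prop)) % num_dims
        coords[dim] = stable_hash(normalize(val)) % dim_size
    return coords


ad = {"product": "car", "manufacturer": "BMW", "color": "red"}
print(map_description(ad, num_dims=16, dim_size=4))
```

Because terms are normalized before hashing, `product:auto` and `product:car` land on the same coordinate.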

9
UNSO: A Mapping Example
  • Consider a simple ontology for the cars domain
  • Color: dark = 0, bright = 1
  • Gearbox: automatic = 0, manual = 1
  • Size: small = 0, large = 1
  • A "white manual Mini-Minor"
  • is mapped to the vector (1,1,0)

[Figure: 3-dimensional hypercube with axes color, gearbox and size, spanning (0,0,0) to (1,1,1)]
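The example above can be reproduced with a small lookup table. This is a sketch only: treating "white" as a bright color follows the slide's result, and the Hamming distance is used here as one plausible way to express proximity between hypercube nodes.

```python
# The three binary dimensions from the slide, in the order (color, gearbox, size).
ONTOLOGY = {
    "color": {"dark": 0, "bright": 1},
    "gearbox": {"automatic": 0, "manual": 1},
    "size": {"small": 0, "large": 1},
}


def to_vector(obj):
    """Map an object description to its hypercube coordinates."""
    return tuple(ONTOLOGY[prop][obj[prop]] for prop in ONTOLOGY)


def hamming(a, b):
    """Number of dimensions in which two nodes differ."""
    return sum(x != y for x, y in zip(a, b))


# White manual Mini-Minor: bright color, manual gearbox, small size.
mini = to_vector({"color": "bright", "gearbox": "manual", "size": "small"})
van = to_vector({"color": "bright", "gearbox": "manual", "size": "large"})
print(mini, van, hamming(mini, van))
```

Objects that differ in a single property sit on adjacent nodes, which is the locality property UNSO relies on.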
10
UNSO: Multi-Layered Hypercube
  • MLH-UNSO is composed of nodes that recursively
    contain hypercubes
  • Description vectors are partitioned between the
    different layers
  • Example for 3-dimensional hypercubes
  • Full vector (0,1,1,0,0,1)
  • Higher layer (0,1,1)
  • Lower layer (0,0,1)
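The partitioning of a description vector between layers can be sketched as a simple split into fixed-size chunks. The function name `split_layers` and the fixed number of dimensions per layer are assumptions for illustration.

```python
def split_layers(vector, dims_per_layer):
    """Split a description vector into per-layer coordinates, highest layer first."""
    if len(vector) % dims_per_layer != 0:
        raise ValueError("vector length must be a multiple of dims_per_layer")
    return [
        tuple(vector[i:i + dims_per_layer])
        for i in range(0, len(vector), dims_per_layer)
    ]


higher, lower = split_layers((0, 1, 1, 0, 0, 1), dims_per_layer=3)
print(higher, lower)
```

The first chunk selects a node in the higher-layer hypercube, and the second chunk selects a node inside the hypercube that this node contains.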
11
UNSO: Pros and Cons
  • Pros
  • No predefined ontology
  • Unlimited range of props and vals
  • Locality: similar objects are mapped to adjacent
    locations
  • Cons
  • All the props are of the same importance
  • Unweighted, flat ontology
  • Sparse and unbalanced distribution in the graph
  • Query processing complexity is fixed
  • Inefficient processing of typical search queries

12
HUNSO: Hierarchical UNSO
  • Some props are more significant than others
  • More significant: manufacturer, color,
    production_year
  • Less significant: seats_type, wheels_type
  • The significance of props is assumed to be
    correlated with the statistical frequency of
    their
  • Appearance in object descriptions
  • Appearance in search query descriptions
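Ranking props by frequency can be sketched with a counter over descriptions and queries. The equal weighting of the two sources and the toy records below are assumptions; the slides do not specify how the two frequencies are combined.

```python
from collections import Counter


def rank_props(descriptions, queries):
    """Order props from most to least frequent across descriptions and queries."""
    counts = Counter()
    for record in list(descriptions) + list(queries):
        counts.update(record.keys())
    return [prop for prop, _ in counts.most_common()]


# Invented toy data, only to exercise the function.
descriptions = [
    {"manufacturer": "BMW", "color": "red", "seats_type": "leather"},
    {"manufacturer": "Mini", "color": "white"},
    {"manufacturer": "BMW"},
]
queries = [{"manufacturer": "BMW"}]

print(rank_props(descriptions, queries))
```

The head of the ranking would supply the dimensions of the higher layers and the tail those of the lower layers.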

13
Real-life data (130 E-Commerce ads)
14
HUNSO - Layers
  • Statistical frequencies of the properties are
    collected over time
  • For example, during user interactions
  • Significant properties are located in the higher
    layers, insignificant in the lower ones
  • Advantages
  • Variable-length search operation
  • Fast processing of popular queries
  • Allows for self-management and load-balancing
  • Autonomous ontology extraction

15
HUNSO: Autonomous Maintenance
  • The ontology evolves
  • New types of objects, props and vals appear
  • Old types become obsolete
  • To maintain a dense, ordered structure in the
    hypercube, we use three autonomous operations
  • EXPAND: convert a dense node into a lower-layer
    hypercube
  • SHRINK: convert a sparse hypercube into a
    higher-layer node
  • SWAP: exchange properties between 2 adjacent
    layers
  • Needed when the significance order of the
    properties becomes inconsistent

16
HUNSO: EXPAND a Node
  • Performed when
  • A node gets overloaded with object descriptions
  • No single property dominates
  • i.e., all props appear with roughly the same
    frequency
  • Basic stages
  • Choose the K most frequent properties to form
    the lower-layer hypercube dimensions
  • Remap the descriptions from the single node in
    the higher-layer hypercube to the nodes of the
    new lower-layer hypercube
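The two basic stages of EXPAND can be sketched as below. This is an illustration under assumptions, not the authors' implementation: the trigger condition is left out, a description missing a chosen prop is sent to coordinate 0, and `coord`, `expand`, `used_props` and `dim_size` are hypothetical names.

```python
import hashlib
from collections import Counter, defaultdict


def coord(val, dim_size=2):
    """Hash a value to a coordinate; a missing value maps to 0 (assumption)."""
    if val is None:
        return 0
    digest = hashlib.sha256(str(val).lower().encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % dim_size


def expand(node_descriptions, k, used_props=frozenset()):
    """Turn one overloaded node into a k-dimensional lower-layer hypercube.

    Returns (dims, cube), where cube maps a coordinate tuple to the
    descriptions remapped onto that lower-layer node.
    """
    # Stage 1: pick the k most frequent props not already used higher up.
    counts = Counter(
        prop for desc in node_descriptions for prop in desc
        if prop not in used_props
    )
    dims = [prop for prop, _ in counts.most_common(k)]
    # Stage 2: remap every description onto the new hypercube.
    cube = defaultdict(list)
    for desc in node_descriptions:
        cube[tuple(coord(desc.get(prop)) for prop in dims)].append(desc)
    return dims, dict(cube)


node = [
    {"manufacturer": "BMW", "color": "red", "gearbox": "manual"},
    {"manufacturer": "BMW", "color": "blue", "gearbox": "manual"},
    {"manufacturer": "BMW", "color": "red", "seats_type": "leather"},
]
dims, cube = expand(node, k=2, used_props={"manufacturer"})
print(dims, {key: len(items) for key, items in cube.items()})
```

No description is lost in the remapping; the node's content is only redistributed over the new lower-layer nodes.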

17
HUNSO: EXPAND (for K = 3)
[Figure: the hypercube before and after EXPAND]
18
HUNSO: SHRINK a Node
  • Performed when
  • A lower-layer hypercube is sparse
  • Basic stages
  • Remap the descriptions from the nodes of the
    lower-layer hypercube to a single node of the
    higher-layer hypercube
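SHRINK is the inverse remapping, sketched below. The sparsity test (fraction of occupied nodes under a `min_fill` threshold) is an assumption, since the slides do not define "sparse"; the cube is represented as a dictionary from coordinate tuples to lists of descriptions.

```python
def is_sparse(cube, num_dims, dim_size=2, min_fill=0.25):
    """True if fewer than min_fill of the hypercube's nodes hold descriptions."""
    occupied = sum(1 for items in cube.values() if items)
    return occupied / (dim_size ** num_dims) < min_fill


def shrink(cube):
    """Collapse a lower-layer hypercube back into one list of descriptions."""
    return [desc for key in sorted(cube) for desc in cube[key]]


# Invented toy cube: 3 dimensions (8 nodes), only one node occupied.
cube = {(0, 1, 0): [{"color": "red"}, {"color": "blue"}]}
if is_sparse(cube, num_dims=3):
    node = shrink(cube)
    print(node)
```

After the call, all descriptions live in the single higher-layer node again and the lower-layer hypercube can be discarded.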

19
HUNSO Splitting (example)
[Figure: the hypercube before and after the operation]
20
HUNSO: SWAP
  • Performed when
  • Lower hypercubes get overloaded with object
    descriptions
  • The significance order of props is inconsistent
    between layers
  • A prop in the lower layer appears more frequently
    than a prop in the higher layer
  • Basic stages
  • Find the dominating property in all lower-layer
    hypercubes
  • Find the least frequent property in the
    higher-layer hypercube
  • Swap the respective dimensions between the layers
  • Remap the descriptions from the nodes of the
    lower-layer hypercubes to the higher-layer
    hypercube and vice versa
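The dimension exchange in SWAP can be sketched as follows. For brevity the sketch treats the lower layer as one list of props with aggregated frequencies and omits the final remapping of descriptions; the function name `swap` and the frequency table are hypothetical.

```python
def swap(higher, lower, freq):
    """Exchange the dominating lower-layer prop with the weakest higher-layer prop.

    Returns new (higher, lower) lists; the layers are left unchanged when the
    significance order is already consistent.
    """
    dominating = max(lower, key=lambda p: freq.get(p, 0))
    weakest = min(higher, key=lambda p: freq.get(p, 0))
    if freq.get(dominating, 0) <= freq.get(weakest, 0):
        return list(higher), list(lower)
    new_higher = [dominating if p == weakest else p for p in higher]
    new_lower = [weakest if p == dominating else p for p in lower]
    return new_higher, new_lower


# Invented frequencies: color has become more frequent than seats_type.
freq = {"manufacturer": 90, "seats_type": 10, "color": 70, "wheels_type": 5}
higher, lower = swap(["manufacturer", "seats_type"], ["color", "wheels_type"], freq)
print(higher, lower)
```

In a full implementation, the descriptions stored under both layers would then be re-projected onto the exchanged dimensions, as the last stage above describes.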

21
HUNSO SWAP (example)
22
Ontology Extraction
  • Object properties for each domain are ordered
    in a hierarchical structure
  • Lower layers represent a specification of higher
    layers
  • Consider "red 2000 BMW" vs. "red 2000 BMW,
    leather seats, manual, good condition"
  • Domain ontology constantly evolves
  • No data engineering by human experts is
    required
  • Flexible ontologies, sensitive to dynamic changes
    in the objects

HUNSO automatically captures the commonalities in
the underlying object descriptions
23
Future Work
  • Using NLP tools to recognize term affinity
  • Calculate weights and distances between the
    properties and particular values
  • Ranking results
  • Failed match, k nearest neighbors
  • Using world facts from external knowledge
    repositories

24
Q & A
Thank You!