Title: Service Discovery and Semantic Overlay Network Creation in DBGlobe
1Service Discovery and Semantic Overlay Network
Creation in DBGlobe
4th DBGlobe Meeting, Paris, June 23, 2003
2Outline
- Part 1 Service Discovery
- Part 2 Semantic Overlay Networks
- Part 3 Ontologies
4System Architecture
- PMOs are attached to Cell Administration Servers (CASs)
- CASs are responsible for the service discovery process
- XML data or XML-based service descriptions
5Service Discovery
- Each CAS maintains data summaries (e.g. Bloom
Filters) to assist query routing
[Figure: CASs A, B and C, with summaries SumB and SumC]
6Multi-level Bloom Filters
- Hash-based indices that extend Bloom filters to support the evaluation of path queries.
- Two approaches, Breadth and Depth Bloom filters, that rely on different ways of hashing an XML tree.
- Compact structures
- False positives can occur
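The breadth idea can be illustrated with a minimal Python sketch. It assumes one plain Bloom filter per level of the XML tree and treats a path query as a sequence of element names that must appear on consecutive levels. The class and function names, the filter size, the salted SHA-256 hashing, and the `(name, children)` tree encoding are all assumptions made for illustration; the slides do not specify them.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter over strings; size m and hash count k are illustrative."""

    def __init__(self, m=64, k=3):
        self.m, self.k = m, k
        self.bits = [0] * m

    def _positions(self, item):
        # Derive k bit positions by salting SHA-256 with the hash index (an assumption).
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits[p] = 1

    def __contains__(self, item):
        return all(self.bits[p] for p in self._positions(item))


def build_breadth_filter(tree, m=64, k=3):
    """Return one Bloom filter per level of a (name, children) tree."""
    levels = []
    frontier = [tree]
    while frontier:
        bf = BloomFilter(m, k)
        next_frontier = []
        for name, children in frontier:
            bf.add(name)
            next_frontier.extend(children)
        levels.append(bf)
        frontier = next_frontier
    return levels


def may_match_path(levels, path):
    """True if the path elements may appear on consecutive levels.

    False means the path is definitely absent; True may be a false positive.
    """
    n = len(path)
    for start in range(len(levels) - n + 1):
        if all(path[j] in levels[start + j] for j in range(n)):
            return True
    return False


# Hypothetical service description used only for this example.
tree = ("device", [("printer", [("color", [])]), ("camera", [("resolution", [])])])
levels = build_breadth_filter(tree)
print(may_match_path(levels, ["device", "printer", "color"]))  # True: the path exists
```

Because each level is hashed independently, a query such as device/camera/color would also be reported as a possible match even though no such path exists. This is the kind of false positive the slide refers to.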
7Performance Results
- Multi-level Bloom filters outperform simple Bloom filters, in terms of false positives, when evaluating path queries.
- With filters sized at 2% of the total size of the data, multi-level Bloom filters evaluate path queries with a false-positive ratio below 3%.
- Breadth Blooms work better than Depth Blooms.
- Depth Blooms require more space but are suitable for special types of queries.
8Filters Distribution
- Peers are organized into hierarchies connected through a main channel
- Each server maintains:
  - a local filter
  - a merged filter of the filters in its sub-tree
  - if it is a root peer (connected to the main channel), a merged filter for every other root peer
9Distribution Hierarchical Organization
[Figure: node C holds its local filter, a merged filter of E, F, G and H, and root filters for A, B and D]
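The hierarchical organization can be sketched in Python as follows. The sketch assumes that filters are equal-length bit lists, that a merged filter is the bitwise OR of the filters below a server, and that a query has already been hashed into the same bit positions. The `Server` class, the function names, and the routing order are hypothetical, chosen only to illustrate how merged filters prune the search.

```python
def merge(filters):
    """Bitwise OR of equal-length bit lists: the Bloom filter of the union."""
    merged = [0] * len(filters[0])
    for f in filters:
        merged = [a | b for a, b in zip(merged, f)]
    return merged


def may_contain(filter_bits, query_bits):
    """True if every bit set in the query is also set in the filter."""
    return all(f >= q for f, q in zip(filter_bits, query_bits))


class Server:
    def __init__(self, name, local_filter):
        self.name = name
        self.local = local_filter
        self.children = []

    def merged(self):
        """Merged filter of everything below this server (recomputed for brevity)."""
        below = [[0] * len(self.local)]
        for child in self.children:
            below += [child.local, child.merged()]
        return merge(below)

    def route(self, query_bits):
        """Names of servers in this sub-tree whose local filter matches the query."""
        hits = [self.name] if may_contain(self.local, query_bits) else []
        if may_contain(self.merged(), query_bits):  # otherwise prune the sub-tree
            for child in self.children:
                hits.extend(child.route(query_bits))
        return hits


def discover(root, other_roots, query_bits):
    """Search the root's own hierarchy, then every other root whose filter matches."""
    hits = root.route(query_bits)
    for other in other_roots:
        whole = merge([other.local, other.merged()])
        if may_contain(whole, query_bits):
            hits.extend(other.route(query_bits))
    return hits


# Tiny 4-bit example: root C with children E and F, and another root A.
c, e, f = Server("C", [1, 0, 0, 0]), Server("E", [0, 1, 0, 0]), Server("F", [0, 0, 1, 0])
c.children = [e, f]
a = Server("A", [0, 0, 0, 1])
print(discover(c, [a], [0, 1, 0, 0]))  # ['E']
print(discover(c, [a], [0, 0, 0, 1]))  # ['A']
```

A real deployment would store and incrementally update the merged filters rather than recompute them on every query; the recomputation here only keeps the sketch short.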
10Outline
- Part 1 Service Discovery
- Part 2 Semantic Overlay Networks
- Part 3 Ontologies
11Content-based organization
- Group peers together according to their content
- Use filter similarity rather than data similarity, for efficiency
- When a peer joins the system, it broadcasts its local summary and attaches to the most similar peer available
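The join step can be sketched as a simple selection over the summaries returned by the broadcast. In this Python sketch the function name `choose_attachment`, the dictionary of peer summaries, and the stand-in similarity function (a count of equal bit positions) are assumptions for illustration; any filter similarity measure could be plugged in.

```python
def equal_bits(b, c):
    """Stand-in similarity: number of positions where two bit lists agree."""
    return sum(1 for x, y in zip(b, c) if x == y)


def choose_attachment(new_summary, peer_summaries, similarity=equal_bits):
    """Return the id of the peer whose summary is most similar to new_summary."""
    return max(
        peer_summaries,
        key=lambda peer_id: similarity(new_summary, peer_summaries[peer_id]),
    )


# Hypothetical summaries collected in response to the joining peer's broadcast.
peers = {"A": [1, 0, 0, 0], "B": [0, 1, 1, 0], "C": [1, 1, 1, 1]}
print(choose_attachment([1, 0, 0, 1], peers))  # A
```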
12Bloom Filter Similarity
- Nodes are organized according to Bloom filter similarity
- The similarity measure is based on the Manhattan distance metric.
- Let B and C be two filters of size m:
- $d(B, C) = |B_1 - C_1| + |B_2 - C_2| + \dots + |B_m - C_m|$
- $\text{similarity}(B, C) = m - d(B, C)$
13Bloom Filter Similarity (contd)
B = 1 0 0 1 1 0 0 1
C = 0 1 1 0 1 0 0 1
$\text{similarity}(B, C) = 8 - (1 + 1 + 1 + 1 + 0 + 0 + 0 + 0) = 4$
For multi-level Bloom filters, similarity is defined as the sum of the similarities of each pair of corresponding levels.
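The measure is easy to check in code. The following Python sketch computes the Manhattan distance and the resulting similarity for the two filters on this slide, and extends it to multi-level filters by summing over corresponding levels. The function names and the list-of-lists representation of a multi-level filter are assumptions for illustration.

```python
def manhattan_distance(b, c):
    """Sum of absolute bitwise differences between two equal-size filters."""
    if len(b) != len(c):
        raise ValueError("filters must have the same size")
    return sum(abs(x - y) for x, y in zip(b, c))


def similarity(b, c):
    """similarity(B, C) = m - d(B, C), where m is the filter size."""
    return len(b) - manhattan_distance(b, c)


def multilevel_similarity(levels_b, levels_c):
    """Sum of the similarities of each pair of corresponding levels."""
    return sum(similarity(b, c) for b, c in zip(levels_b, levels_c))


B = [1, 0, 0, 1, 1, 0, 0, 1]
C = [0, 1, 1, 0, 1, 0, 0, 1]
print(similarity(B, C))  # 4
```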
14Performance Results
- The content-based organization is much more efficient than the proximity-based organization at finding all the results for a query.
- Both perform similarly in discovering the first result.
- The content-based organization outperforms the proximity-based one when only a few nodes satisfy a given query.
15Current work
- A peer can belong to more than one hierarchy.
- Self-organization: tuning of the system's predefined threshold.
- Service discovery:
  - Locate the right cluster (hierarchy)
  - Find the peer in the hierarchy
16Outline
- Part 1 Service Discovery
- Part 2 Semantic Overlay Networks
- Part 3 Ontologies
17Ontologies
Ontologies are hierarchies, so they can be summarized by multi-level Bloom filters.
18Ontologies (contd)
- The main issue: how to locate the matching hierarchy (cluster)?
- Just check every root peer.
- Can we use a global ontology to route us to the matching hierarchy more efficiently?
19Thank you