Direct and Indirect Matching of Schema Elements for Data Integration on the Web

Slides: 18
Provided by: Urs65
Learn more at: https://www.deg.byu.edu

1
Direct and Indirect Matching of Schema Elements
for Data Integration on the Web
  • Li Xu
  • Data Extraction Group
  • Brigham Young University
  • Sponsored by NSF

2
Schema Matching
Figure: car schemas to be matched. Element labels: Car, Year, Make, Model, Make Model, Color, Style, Body Type, Feature, Cost, Phone.

3
Mapping
  • Direct Matches
  • Indirect Matches
  • Union
  • Selection
  • Composition
  • Decomposition
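
The four indirect-match operations listed above can be sketched on toy car records. This is only an illustration under stated assumptions, not the presenter's implementation: the function names, the record layout, the DayPhone and EveningPhone attributes, and the whitespace rule used for decomposition are all hypothetical.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


def union(rows: List[Row], sources: List[str]) -> List[str]:
    """Union: one target attribute draws its values from several source attributes."""
    return [row[name] for row in rows for name in sources if row.get(name)]


def selection(values: List[str], keep: Callable[[str], bool]) -> List[str]:
    """Selection: a target attribute matches only a subset of a source attribute's values."""
    return [v for v in values if keep(v)]


def composition(rows: List[Row], parts: List[str], sep: str = " ") -> List[str]:
    """Composition: concatenate several source attributes into one target attribute."""
    return [sep.join(row[p] for p in parts) for row in rows]


def decomposition(values: List[str], parts: List[str]) -> List[Row]:
    """Decomposition: split one source attribute into several target attributes.

    Assumption: parts are separated by whitespace; the last part absorbs any remainder.
    """
    return [dict(zip(parts, v.split(None, len(parts) - 1))) for v in values]


# Hypothetical sample data.
cars = [
    {"Make": "Ford", "Model": "Mustang", "DayPhone": "555-0100", "EveningPhone": "555-0101"},
    {"Make": "Acura", "Model": "Legend", "DayPhone": "555-0102", "EveningPhone": ""},
]
features = ["red", "sedan", "blue", "air conditioning"]
known_colors = {"red", "blue"}

phones = union(cars, ["DayPhone", "EveningPhone"])
colors = selection(features, lambda v: v in known_colors)
make_models = composition(cars, ["Make", "Model"])
split_back = decomposition(make_models, ["Make", "Model"])
```

Here composition and decomposition are inverses of each other on the Make and Model columns, which is why the same pair of schemas can need either one depending on which side is the target.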

4
Union and Selection
Figure: car schemas. Element labels: Car, Year, Make, Model, Make Model, Color, Style, Body Type, Feature, Cost, Phone.

5
Composition and Decomposition
Figure: car schemas. Element labels: Car, Year, Make, Model, Make Model, Color, Style, Body Type, Feature, Cost, Phone.

6
Matching Techniques
  • Terminological Relationships
  • Value Characteristics
  • Expected Data Values
  • Structure

7
Terminological Relationships
  • WordNet
  • Machine-Learned Rules
  • Example (Make, Brand)

  • Number of different common hypernym roots of A and B
  • Sum of the number of senses of A and B
  • Sum of the distances of A and B to a common hypernym
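
The three WordNet-based features can be sketched on a tiny hand-made hypernym graph. Everything in this sketch is an assumption for illustration: the sense identifiers and hypernym links are invented (they are not real WordNet data), and "a common hypernym" is read here as the nearest one over all sense pairs.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical senses and hypernym links; None marks a root.
SENSES: Dict[str, List[str]] = {
    "make": ["make#1", "make#2"],
    "brand": ["brand#1", "brand#2"],
}
HYPERNYM: Dict[str, Optional[str]] = {
    "make#1": "kind", "make#2": "act",
    "brand#1": "kind", "brand#2": "name",
    "kind": "category", "category": "entity", "name": "entity",
    "act": "event", "entity": None, "event": None,
}


def ancestors(sense: str) -> Dict[str, int]:
    """Map each node on the hypernym chain (including the sense itself) to its distance."""
    out: Dict[str, int] = {}
    node: Optional[str] = sense
    dist = 0
    while node is not None:
        out[node] = dist
        node = HYPERNYM.get(node)
        dist += 1
    return out


def roots(word: str) -> Set[str]:
    """Roots reachable from any sense of the word (the farthest node of each chain)."""
    result = set()
    for sense in SENSES[word]:
        chain = ancestors(sense)
        result.add(max(chain, key=chain.get))
    return result


def features(a: str, b: str) -> Tuple[int, int, Optional[int]]:
    """Return (common hypernym roots, total senses, distance sum to nearest common hypernym)."""
    common_roots = len(roots(a) & roots(b))
    total_senses = len(SENSES[a]) + len(SENSES[b])
    best: Optional[int] = None
    for sa in SENSES[a]:
        anc_a = ancestors(sa)
        for sb in SENSES[b]:
            anc_b = ancestors(sb)
            for node in anc_a.keys() & anc_b.keys():
                d = anc_a[node] + anc_b[node]
                best = d if best is None else min(best, d)
    return common_roots, total_senses, best


make_brand = features("make", "brand")
```

In this toy graph the pair (make, brand) shares one root, has four senses in total, and meets at "kind" with a distance sum of 2. A learned rule would take such a triple as input; the rule itself is not given on the slide.
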
8
Value Characteristics
  • Machine Learning
  • Features [LC94]
    • String length, numeric ratio, space ratio
    • Mean, variance, coefficient of variation, standard deviation
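
The value characteristics named above can be computed for a column of sample values as follows. This is a rough sketch, not the feature extractor used in the work: the function name is hypothetical, and it assumes the mean, variance, coefficient of variation, and standard deviation are taken over string lengths, which the slide does not specify.

```python
import statistics
from typing import Dict, List


def value_characteristics(values: List[str]) -> Dict[str, float]:
    """Summarize a non-empty column of string values with simple numeric features."""
    lengths = [len(v) for v in values]
    total_chars = sum(lengths)
    digits = sum(ch.isdigit() for v in values for ch in v)
    spaces = sum(ch.isspace() for v in values for ch in v)
    mean = statistics.mean(lengths)
    std_dev = statistics.pstdev(lengths)  # population standard deviation
    return {
        "mean_length": mean,
        "variance": statistics.pvariance(lengths),
        "std_dev": std_dev,
        "coeff_variation": std_dev / mean if mean else 0.0,
        "numeric_ratio": digits / total_chars if total_chars else 0.0,
        "space_ratio": spaces / total_chars if total_chars else 0.0,
    }


sample = value_characteristics(["Ford Mustang", "Ford Taurus", "Ford F150"])
```

Two columns whose feature vectors are close are candidates for a match; how closeness is learned or thresholded is left open here.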

9
Expected Values
  • Application Concepts
  • Data Frames
    • CarMake
      • ford
      • honda
    • CarModel
      • accord
      • mustang
      • taurus

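Figure: source and target value columns, with the application concept each one is recognized as:
  • Make Model: Ford Mustang, Ford Taurus, Ford F150 (CarMake . CarModel)
  • Brand: Acura, Audi, BMW (CarMake)
  • Model: Legend, Mustang, A4 (CarModel)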
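
Matching by expected values can be sketched with regular-expression recognizers built from the lexicon entries on this slide. The recognizer table, the function names, and the 0.5 threshold are assumptions made for illustration; a real data frame would carry a much larger lexicon and richer patterns.

```python
import re
from typing import List, Optional

# Lexicon entries taken from the slide; deliberately incomplete.
CAR_MAKE = r"(?:ford|honda)"
CAR_MODEL = r"(?:accord|mustang|taurus)"

RECOGNIZERS = {
    "CarMake": re.compile(CAR_MAKE, re.IGNORECASE),
    "CarModel": re.compile(CAR_MODEL, re.IGNORECASE),
    # Concatenation of the two concepts, as in "CarMake . CarModel".
    "CarMake . CarModel": re.compile(CAR_MAKE + r"\s+" + CAR_MODEL, re.IGNORECASE),
}


def hit_ratio(concept: str, values: List[str]) -> float:
    """Fraction of values that the concept's recognizer matches in full."""
    pattern = RECOGNIZERS[concept]
    hits = sum(1 for v in values if pattern.fullmatch(v.strip()))
    return hits / len(values)


def best_concept(values: List[str], threshold: float = 0.5) -> Optional[str]:
    """Concept with the highest hit ratio, or None if it stays below the threshold."""
    concept = max(RECOGNIZERS, key=lambda c: hit_ratio(c, values))
    return concept if hit_ratio(concept, values) >= threshold else None


make_model = ["Ford Mustang", "Ford Taurus", "Ford F150"]
brand = ["Acura", "Audi", "BMW"]

make_model_concept = best_concept(make_model)
brand_concept = best_concept(brand)
```

With this small lexicon the Make Model column is recognized as the concatenation CarMake . CarModel (two of three values hit, since "F150" is not listed), which signals a composition or decomposition match. The Brand column gets no concept because none of its values are in the lexicon.
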
10
Structure
Figure: source and target purchase-order schema trees. Element labels: PO, POShipTo, POBillTo, POLines, Count, Item, Line, Qty, UoM, PurchaseOrder, DeliverTo, InvoiceTo, Items, Address, ItemCount, ItemNumber, Quantity, UnitOfMeasure, City, Street.

11
Structure (Cont.)
Figure: source and target purchase-order schema trees (continued). Element labels: PO, POShipTo, POBillTo, POLines, Count, Item, Line, Qty, UoM, PurchaseOrder, DeliverTo, InvoiceTo, Items, Address, ItemNumber, Quantity, UnitOfMeasure, City, Street.

12
Structure (Cont.)
Figure: source and target purchase-order schema trees (continued). Element labels: PO, POShipTo, POBillTo, POLines, Count, Item, Line, Qty, UoM, PurchaseOrder, DeliverTo, InvoiceTo, Items, ItemNumber, Quantity, UnitOfMeasure, City, Street.

13
Structure (Cont.)
Figure: source and target purchase-order schema trees (continued). Element labels: PO, POShipTo, POBillTo, POLines, Count, Item, Line, Qty, UoM, PurchaseOrder, DeliverTo, InvoiceTo, Items, ItemNumber, Quantity, UnitOfMeasure, City, Street.

14
Structure (Cont.)
Figure: source and target purchase-order schema trees (continued). Element labels: PO, POShipTo, POBillTo, POLines, Count, Item, Line, Qty, UoM, PurchaseOrder, DeliverTo, InvoiceTo, Items, ItemNumber, Quantity, UnitOfMeasure, City, Street.

15
Experiments
  • Methodology
  • Measures
    • Precision
    • Recall
    • F-measure

16
Results
Application (Number of Schemas)  Precision (%)  Recall (%)  F (%)  Correct  False Positive  False Negative
Course Schedule (5)              98             93          96     119      2               9
Faculty Member (5)               100            100         100    140      0               0
Real Estate (5)                  92             96          94     235      20              10
Indirect matches: 94% (precision, recall, and F-measure)
  • Data borrowed from Univ. of Washington
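
The slides do not spell out the formulas for the three measures. Assuming the standard definitions (precision = correct / (correct + false positives), recall = correct / (correct + false negatives), F = harmonic mean of the two), the following sketch recomputes the percentages from the counts in the table. The function name is hypothetical.

```python
from typing import Dict, Tuple


def precision_recall_f(correct: int, false_pos: int, false_neg: int) -> Tuple[float, float, float]:
    """Standard precision, recall, and F-measure (harmonic mean) from match counts."""
    precision = correct / (correct + false_pos)
    recall = correct / (correct + false_neg)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# (correct, false positive, false negative) counts from the results table.
COUNTS: Dict[str, Tuple[int, int, int]] = {
    "Course Schedule": (119, 2, 9),
    "Faculty Member": (140, 0, 0),
    "Real Estate": (235, 20, 10),
}

# Percentages rounded to whole numbers, as reported in the table.
RESULTS = {
    name: tuple(round(100 * x) for x in precision_recall_f(*counts))
    for name, counts in COUNTS.items()
}

for name, (p, r, f) in RESULTS.items():
    print(f"{name}: precision {p}%, recall {r}%, F {f}%")
```

Under these definitions the rounded values agree with the reported table for all three applications.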

17
Contributions
  • Direct Matches
  • Indirect Matches
  • Expected values
  • Structure
  • High Precision and High Recall