Moscow State University Faculty of Computational Mathematics and Cybernetics Integrating Information from Global Systems: Knowledge Representation and Reasoning in the Context Interchange System June 20, 2005 Stuart Madnick (smadnick@mit.edu)

1
Moscow State University
Faculty of Computational Mathematics and Cybernetics
Integrating Information from Global Systems: Knowledge Representation and Reasoning in the Context Interchange System
June 20, 2005
Stuart Madnick (smadnick@mit.edu)
MASSACHUSETTS INSTITUTE OF TECHNOLOGY, SLOAN SCHOOL OF MANAGEMENT, INFORMATION TECHNOLOGIES GROUP
2
Characteristics of Global Services
  • Large number of sources
  • Online travel services
  • Comparison shopping services
  • Diverse user needs
  • Increasing usability with personalization
  • Cannot establish a single data standard
  • Must get semantics right
  • Adaptability, extensibility, scalability

3
Comparison Shopper: www.mysimon.com
4
Regional Comparison Shoppers
US, Sweden, France, UK
5
Motivating Example
  • Global Online Comparison Shopping
  • Different semantic assumptions in data
  • Compare prices in the context of any source
    chosen by the user
  • Many vendor sources in different countries
  • Example: 270 potential different contexts

Semantic aspect    Number of distinctions
Currency           10 different currencies (e.g., US$, UKP, JPY, Russian Ruble)
Scale factor       3 different scale factors: 1, 1K, 1M
Price definition   3 different definitions: base, base+tax, base+tax+SH
Date format        3 different formats: mm/dd/yyyy, dd-mm-yyyy, yyyy-mm-dd
  • Need many conversions - 159,600 of them!
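
The counts on this slide can be checked with a few lines of arithmetic. The sketch below assumes that a context is one choice per semantic aspect and that a brute-force approach needs one conversion for every ordered pair of distinct contexts; the function name is illustrative only.

```python
from math import prod

# Number of distinctions per semantic aspect, taken from the table above.
distinctions = {
    "currency": 10,
    "scale_factor": 3,
    "price_definition": 3,
    "date_format": 3,
}

def pairwise_conversions(n_contexts: int) -> int:
    """One directed conversion for every ordered pair of distinct contexts."""
    return n_contexts * (n_contexts - 1)

n_contexts = prod(distinctions.values())
print(n_contexts)                        # 270
print(pairwise_conversions(n_contexts))  # 72630
print(pairwise_conversions(400))         # 159600
```

Under this counting, 270 contexts give 72,630 pairwise conversions. The slide's figure of 159,600 equals 400 × 399, so it may have been computed for a larger set of distinctions than the table lists.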

6
Desired Properties
  • Adaptability
  • Capability of accommodating changes in sources
  • Extensibility
  • Easy to add/remove sources
  • Scalability
  • Effort of enabling interoperation w.r.t. the number
    of sources and the size of the ontology
  • Performance w.r.t. the number of sources and the size of
    each source (query optimization issue)
  • Flexibility = Adaptability + Extensibility

7
Interoperate: hard-wired approaches
(a) BFS approach: brute force between pair-wise sources
(b) BFC approach: brute force between contexts
(c) Internal standard approach: adopting a standard
[Figure: six numbered sources (1-6) connected according to each approach; in (c) every source connects to the internal standard.]
context_a: currency = KRW, scaleFactor = 1000, kind = base, format = yyyy-mm-dd
context_b: currency = TRL, scaleFactor = 1e6, kind = base+tax, format = dd-mm-yyyy
context_c: currency = USD, scaleFactor = 1, kind = base+tax+SH, format = mm/dd/yyyy
8
COntext INterchange (COIN) Project
[Figure: trusted agents sit between sources (databases, web pages) and receivers (applications, browsers).]
  • CONTEXT MEDIATION: automatic conflict detection and conversion; derived data; source selection; source attribution
  • INPUT PROCESSING: automatic web wrapping; semi-structured text; multi-source query plan and execution
  • OUTPUT PROCESSING: ODBC driver; web publishing
  • APPLICATIONS: financial services, electronic commerce, asset visibility, in-transit visibility
9
Key COIN Technologies
  • Web Wrapper
    • Extract selected information from the web (HTML & XML)
    • Allows the web to be treated as a large relational SQL database
    • Handles dynamic web sites, cookies, login, etc.
    • Performs SQL Joins & Unions involving DBs & Web sources
  • Context Mediator
    • Resolve semantic (meaning) differences
    • Enable meaningful aggregation & comparison

10
Context: Multiple Perspectives . . . old lady or young lady?
11
Role of Context
[Figure: the same date written as 05-06-07, 06-05-07, and 07-06-05; which reading is meant?]
  • CONTEXT VARIATIONS
    • GEOGRAPHIC (US vs. UK)
    • FUNCTIONAL (CASH MGMT vs. LOANS)
    • ORGANIZATIONAL (CITIBANK vs. CHASE)

Data: databases, web data, e-mail
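
To make the ambiguity concrete, the following sketch parses the string 05-06-07 under three date conventions. The convention labels are assumptions chosen for illustration; the slide only shows the three digit patterns.

```python
from datetime import datetime

raw = "05-06-07"

# Three plausible conventions for a two-digit, dash-separated date (assumed labels).
formats = {
    "US (mm-dd-yy)": "%m-%d-%y",
    "UK (dd-mm-yy)": "%d-%m-%y",
    "ISO-like (yy-mm-dd)": "%y-%m-%d",
}

# Parse the same text under each convention and normalize to ISO 8601.
readings = {
    label: datetime.strptime(raw, fmt).date().isoformat()
    for label, fmt in formats.items()
}

for label, iso in readings.items():
    print(f"{label}: {iso}")
# US (mm-dd-yy): 2007-05-06
# UK (dd-mm-yy): 2007-06-05
# ISO-like (yy-mm-dd): 2005-06-07
```

The same seven characters yield three different calendar dates, which is why the receiver needs to know the source's context before comparing values.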
12
Types of Context
Type              Example                                    Temporal
Representational  Currency: one vs. another;                 Francs before 2000, another currency thereafter
                  scale factor: 1 vs. 1000
Ontological       Revenue: includes vs. excludes interest    Revenue: excludes interest before 1994, includes thereafter
13
The 1999 Overture
Unit-of-measure mixup tied to loss of $125 Million Mars Orbiter
NASA's Mars Climate Orbiter was lost because engineers did not make a simple conversion from English units to metric, an embarrassing lapse that sent the $125 million craft off course. . . . The navigators (JPL) assumed metric units of force per second, or newtons. In fact, the numbers were in pounds of force per second as supplied by Lockheed Martin (the contractor).
Source: Kathy Sawyer, Boston Globe, October 1, 1999, page 1.
14
COntext INterchange (COIN) Approach
15
COIN Conceptual Model
(Ontology)
16
Ontology and Conversion Function
context_a: currency = KRW, scaleFactor = 1000, kind = base, format = yyyy.mm.dd
context_b: currency = TRL, scaleFactor = 1e6, kind = base+tax, format = dd-mm-yyyy
context_c: currency = USD, scaleFactor = 1, kind = base+tax+SH, format = mm/dd/yyyy
context_d: is_a context_b, scaleFactor = 1e3
context_e: is_a context_d, format = yyyy-mm-dd
context_f: is_a context_c, kind = base+tax
Example source: src_turkey(Product, Vendor, QuoteDate, Price)
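
The is_a relationships on this slide can be read as inheritance with overrides. The sketch below is one way to resolve a context's modifiers and list where two contexts differ. The dictionary layout and the function names resolve and differences are assumptions for illustration, not COIN's actual representation; the kind values are written as base+tax and base+tax+SH.

```python
# Context definitions copied from the slide; the dict layout is an assumption.
CONTEXTS = {
    "context_a": {"currency": "KRW", "scaleFactor": 1_000, "kind": "base", "format": "yyyy.mm.dd"},
    "context_b": {"currency": "TRL", "scaleFactor": 1_000_000, "kind": "base+tax", "format": "dd-mm-yyyy"},
    "context_c": {"currency": "USD", "scaleFactor": 1, "kind": "base+tax+SH", "format": "mm/dd/yyyy"},
    "context_d": {"is_a": "context_b", "scaleFactor": 1_000},
    "context_e": {"is_a": "context_d", "format": "yyyy-mm-dd"},
    "context_f": {"is_a": "context_c", "kind": "base+tax"},
}

def resolve(name):
    """Return all modifier values of a context, following is_a links to parents."""
    ctx = CONTEXTS[name]
    parent = resolve(ctx["is_a"]) if "is_a" in ctx else {}
    own = {k: v for k, v in ctx.items() if k != "is_a"}
    return {**parent, **own}  # a context's own values override inherited ones

def differences(source, receiver):
    """Modifiers whose values differ, as (source value, receiver value) pairs."""
    s, r = resolve(source), resolve(receiver)
    return {m: (s[m], r[m]) for m in s if s[m] != r[m]}

print(differences("context_b", "context_e"))
# {'scaleFactor': (1000000, 1000), 'format': ('dd-mm-yyyy', 'yyyy-mm-dd')}
```

Only the scale factor and the date format differ between context_b and context_e, so only those two conversions would be relevant for a query from one to the other.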
17
Demo Same Context
No semantic differences
Meaningful data returned
18
Compose only relevant conversions (b → e)
19
Auto-reconciliation for auxiliary source (b → f)
20
Detection and Explication (b → a)
21
Mediated Query (b → a)
22
Flexibility and Scalability
  • Not flexible: need to update/add many conversion programs
  • Flexible: update the declarative knowledge base
  • Why can other approaches not fully benefit from
    general-purpose conversions?
    • The decision whether to invoke a conversion is made
      inside the conversion program

23
How COIN Scales
  • Semantic differences cannot be standardized away
  • Must be flexible and scalable
  • Component conversions are defined for each
    modifier
  • Overall conversions are automatically composed by
    abductive reasoning engine
  • Composition via symbolic equation solver and a
    shortest path algorithm
  • Inheritance enabled
  • COIN is a good solution
  • Modularization, declarativeness
  • Automatic composition of necessary conversions
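
The idea of defining one component conversion per modifier and composing the overall conversion automatically can be sketched as follows. This is a simplified illustration that uses a plain loop over differing modifiers; it is not the abductive reasoning engine or symbolic equation solver named above. The exchange rates and all function names are hypothetical.

```python
# Hypothetical exchange rates (target units per source unit); not from the slides.
RATES = {("TRL", "USD"): 0.000001, ("USD", "TRL"): 1_000_000.0}

# One component conversion per modifier:
# f(value, source_modifier_value, receiver_modifier_value) -> converted value.
COMPONENTS = {
    "currency": lambda v, s, r: v * RATES[(s, r)],
    "scaleFactor": lambda v, s, r: v * s / r,
}

def compose(source_ctx, receiver_ctx):
    """Build one overall conversion from the components whose modifier values differ."""
    steps = [
        (m, source_ctx[m], receiver_ctx[m])
        for m in COMPONENTS
        if source_ctx[m] != receiver_ctx[m]
    ]

    def convert(value):
        for modifier, s, r in steps:
            value = COMPONENTS[modifier](value, s, r)
        return value

    return convert, [m for m, _, _ in steps]

context_b = {"currency": "TRL", "scaleFactor": 1_000_000}
context_d = {"currency": "TRL", "scaleFactor": 1_000}

convert, used = compose(context_b, context_d)
print(used)        # ['scaleFactor']
print(convert(3))  # 3000.0
```

Adding a new context then means declaring its modifier values, not writing a new conversion program for every pair of contexts.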

24
The 1805 Overture
In 1805, the Austrian and Russian Emperors agreed to join forces against Napoleon. The Russians promised that their forces would be in the field in Bavaria by Oct. 20. The Austrian staff planned its campaign based on that date in the Gregorian calendar. Russia, however, still used the ancient Julian calendar, which lagged 10 days behind. The calendar difference allowed Napoleon to surround Austrian General Mack's army at Ulm and force its surrender on Oct. 21, well before the Russian forces could reach him, ultimately setting the stage for Austerlitz.
Source: David Chandler, The Campaigns of Napoleon, New York: MacMillan, 1966, pg. 390.
25
Summary
  • Tremendous opportunity to gather and integrate
    information from many diverse sources
  • But need to overcome many context challenges
  • Context-type metadata plays a critical role
  • COIN technology can be an important aid for
    semantically meaningful information
    integration
    • Scalable
    • Extensible
    • Application Domain Merging
    • Reuse and extension of ontologies and contexts

References: http://web.mit.edu/smadnick/www/wp/CISL-Sloan%20WP%20spreadsheet.htm
26
EXTRA SLIDES
27
Another Context Example (Basis for Demo)
[Figure: application users and systems reach three financial sources and an exchange-rate web source through Context Mediation Services and Wrapper Services.]

Source       Company Name        Net Income     Sales
Datastream   DAIMLER-BENZ        614,995        97,736,992
WorldScope   DAIMLER-BENZ AG     346,577        56,268,168
Disclosure   DAIMLER BENZ CORP   615,000,000    97,737,000,000

OANDA Web Server (DEM-USD exchange rate): 1.00 German Mark = 0.58 US Dollar as of 12/31/93
28
Some Context Differences
Context Definitions
29
Domain Model
  • Some currency context possibilities
    • Currency is stated explicitly as part of the record
    • Currency not stated, but the same for all (e.g., US$)
    • Currency not stated or constant, but inferred by country

Legend: Inheritance, Attribute, Modifier
30
COIN System Architecture
[Figure: COIN system architecture in three groups: CLIENT PROCESSES, MEDIATOR PROCESSES, and SERVER PROCESSES.
Components shown: Web Client, ODBC-compliant Apps (e.g., Microsoft Excel), ODBC-Driver, WWW Gateway (cgi-scripts), HTTPD-Daemons, COIN Repository, SQL Compiler, Context Mediator, Optimizer, Executioner, Data Store for Intermediate Results, Wrapper, Web-site.
Data flow labels: SQL Query, Datalog Query, Mediated Query, Optimized Query Plan, Results.]
31
System Demonstration
Single Source Queries with Mediation
  • Q6. Scenario: Using Context Interchange, you can
    look at the Disclosure data using the Datastream
    context.
  • Query: Find out from Disclosure what the Net Income
    for DAIMLER-BENZ was. Use the Datastream context.
  • Capabilities demonstrated: ability to perform scale
    factor conversion, date format conversion, and
    company name conversion.
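
A minimal sketch of the three conversions named in this scenario, applied to a single record. The scale factors and date formats attributed to each source are assumptions made for illustration, as is the sample date; the company names and the net income figure come from the three-source example in the extra slides. The function name mediate is hypothetical.

```python
from datetime import datetime

# Assumed context metadata; the slides name the conversions but not these exact values.
DISCLOSURE = {"scale": 1, "date_format": "%m/%d/%y"}     # American style, "/"
DATASTREAM = {"scale": 1000, "date_format": "%d-%m-%y"}  # European style, "-"

# Company-name mapping from Disclosure naming to Datastream naming.
NAME_MAP = {"DAIMLER BENZ CORP": "DAIMLER-BENZ"}

def mediate(record, source, receiver):
    """Rewrite one record from the source context into the receiver context."""
    date = datetime.strptime(record["date"], source["date_format"])
    return {
        "company": NAME_MAP.get(record["company"], record["company"]),
        "net_income": record["net_income"] * source["scale"] / receiver["scale"],
        "date": date.strftime(receiver["date_format"]),
    }

disclosure_row = {"company": "DAIMLER BENZ CORP", "net_income": 615_000_000, "date": "12/31/93"}
result = mediate(disclosure_row, DISCLOSURE, DATASTREAM)
print(result)
# {'company': 'DAIMLER-BENZ', 'net_income': 615000.0, 'date': '31-12-93'}
```

The data still comes from Disclosure, but the receiver sees it with Datastream's naming, scale, and date style.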

32
Demonstration @ context2.mit.edu
Source
Context
33
Context Metadata (Partial)
34
Conflict Detection and Mediation
Mediated Query in Datalog
Date convert, scale factor convert, name convert
35
Mediated SQL Query Result
Mediated SQL Query
Adjust scale factor
Date format conversion
Name conversion
Final results from Disclosure but in Datastream
context
36
More Complex Example (4 sources: DB & Web)
Databases
Web source
select WorldcAF.TOTAL_ASSETS, DiscAF.NET_SALES, DiscAF.NET_INCOME,
       DStreamAF.TOTAL_EXTRAORD_ITEMS_PRE_TAX, quotes.Last
from WorldcAF, DiscAF, DStreamAF, quotes
where WorldcAF.COMPANY_NAME = "DAIMLER-BENZ AG"
  and DStreamAF.AS_OF_DATE = "01/05/94"
  and WorldcAF.COMPANY_NAME = DStreamAF.NAME
  and WorldcAF.COMPANY_NAME = DiscAF.COMPANY_NAME
  and WorldcAF.COMPANY_NAME = quotes.Cname
37
Conflict Table (1st part)
38
Conflict Table (2nd part)
39
Generated SQL (1st Part)
select worldcaf.total_assets, discaf.net_sales,
       ((discaf.net_income*0.001)*olsen.rate),
       (dstreamaf2.total_extraord_items_pre_tax*olsen2.rate),
       quotes.Last
from (select date1, 'European Style -', '01/05/94', 'American Style /'
      from datexform
      where format1='European Style -' and date2='01/05/94'
        and format2='American Style /') datexform,
     (select dt_names, 'DAIMLER-BENZ AG'
      from name_map_dt_ws
      where ws_names='DAIMLER-BENZ AG') name_map_dt_ws,
     (select ds_names, 'DAIMLER-BENZ AG'
      from name_map_ds_ws
      where ws_names='DAIMLER-BENZ AG') name_map_ds_ws,
     (select 'DAIMLER-BENZ AG', ticker, exc
      from ticker_lookup2
      where comp_name='DAIMLER-BENZ AG') ticker_lookup2,
     (select 'DAIMLER-BENZ AG', latest_annual_financial_date,
             current_outstanding_shares, net_income, sales,
             total_assets, country_of_incorp
      from worldcaf
      where company_name='DAIMLER-BENZ AG') worldcaf,
     (select country, currency
      from currencytypes
      where currency <> 'USD') currencytypes,
     (select exchanged, 'USD', rate, date
      from olsen
      where expressed='USD') olsen,
     (select company_name, latest_annual_data,
             current_shares_outstanding, net_income, net_sales,
             total_assets, location_of_incorp
      from discaf) discaf,
40
Generated SQL (Continued - Partial)
     (select as_of_date, name, total_sales, total_extraord_items_pre_tax,
             earned_for_ordinary, currency
      from dstreamaf) dstreamaf,
     (select as_of_date, name, total_sales, total_extraord_items_pre_tax,
             earned_for_ordinary, currency
      from dstreamaf) dstreamaf2,
     (select char3_currency, char2_currency
      from currency_map
      where char3_currency <> 'USD') currency_map,
     (select country, currency
      from currencytypes
      where currency <> 'USD') currencytypes2,
     (select exchanged, 'USD', rate, '01/05/94'
      from olsen
      where expressed='USD' and date='01/05/94') olsen2,
     (select Cname, Last
      from quotes) quotes
where currencytypes.country = discaf.location_of_incorp
  and currencytypes.currency = olsen.exchanged
  and dstreamaf.currency = dstreamaf2.currency
  and dstreamaf2.currency = currency_map.char2_currency
  and olsen.date = discaf.latest_annual_data
  and currency_map.char3_currency = currencytypes2.currency
  and currencytypes2.currency = olsen2.exchanged
  and name_map_dt_ws.dt_names = dstreamaf2.name
  and name_map_ds_ws.ds_names = discaf.company_name
  and ticker_lookup2.ticker = quotes.Cname
  and datexform.date1 = dstreamaf2.as_of_date
  and currencytypes.currency <> 'USD'
  and currency_map.char3_currency <> 'USD'
union
select worldcaf2.total_assets, discaf2.net_sales,
       ((discaf2.net_income*0.001)*olsen3.rate),
       dstreamaf4.total_extraord_items_pre_tax, quotes2.Last
from (select date1, 'European Style -', '01/05/94', 'American Style /'
      from datexform
      where format1='European Style -' and date2='01/05/94'
        and format2='American Style /') datexform2,
     (select dt_names, 'DAIMLER-BENZ AG'
      from name_map_dt_ws
      where ws_names='DAIMLER-BENZ AG') name_map_dt_ws2,
     (select ds_names, 'DAIMLER-BENZ AG'
      from name_map_ds_ws
      where ws_names='DAIMLER-BENZ AG') name_map_ds_ws2,
     (select 'DAIMLER-BENZ AG', ticker, exc
      from ticker_lookup2
      where comp_name='DAIMLER-BENZ AG') ticker_lookup22,
     (select 'DAIMLER-BENZ AG', latest_annual_financial_date,
             current_outstanding_shares, net_income, sales,
             total_assets, country_of_incorp
      from worldcaf
      where company_name='DAIMLER-BENZ AG') worldcaf2,
     (select country, currency
      from currencytypes
      where currency <> 'USD') currencytypes3,
     (select exchanged, 'USD', rate, date
      from olsen
      where expressed='USD') olsen3,
     (select company_name, latest_annual_data,
             current_shares_outstanding, net_income, net_sales,
             total_assets, location_of_incorp
      from discaf) discaf2,
     (select as_of_date, name, total_sales, total_extraord_items_pre_tax,
             earned_for_ordinary, currency
      from dstreamaf) dstreamaf3,
     (select 'USD', char2_currency
      from currency_map
      where char3_currency='USD') currency_map2,
     etc.
41
Final Result
42
Execution Trace (1st Part - Partials)
Parallel Execution
. . .
Retrieving data From Web source
43
Execution Trace (Continued - Partials)
. . .
Stock price returned From Web source
Another Web source used (for currency conversion)
. . .
44
Appendix: Sample Applications
  • Airfare, Car Rental and Merged Travel
  • Weather
  • Global Price Comparison
  • Airfare Aggregation
  • Disaster Relief
  • TASC Financial Example
  • Web Services Demo
  • Corporate Householding

45
Appendix: COIN Web-Wrapper Technology
User or Program (via SQL Query):
Select Edgar.Net_income From Edgar Where Edgar.Ticker=intc and Edgar.Form=10-Q
[Figure: the Web Wrapper Generator uses a web page spec file to connect the HTML side to the SQL side.]
Data record returned: Ticker = INTC, Net Income = 1,983
Spec file contains Schema, Navigation rules, and Extraction rules.
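
As a rough illustration of how a spec file with a schema, navigation rules, and extraction rules could drive a wrapper, the sketch below applies regular-expression extraction rules to an HTML fragment. The spec layout, the example.com URL template, the HTML markup, and the function name wrap are all hypothetical; the actual COIN spec file format is not shown in the slides.

```python
import re

# Hypothetical spec: schema, a navigation rule (URL template), and extraction rules (regexes).
SPEC = {
    "schema": ["Ticker", "Net_income"],
    "navigation": "http://example.com/filings?ticker={ticker}&form={form}",
    "extraction": {
        "Ticker": r'<td class="ticker">([^<]+)</td>',
        "Net_income": r'<td class="net-income">([^<]+)</td>',
    },
}

def wrap(html, spec):
    """Apply each extraction rule to the page and return one record in schema order."""
    record = {}
    for column in spec["schema"]:
        match = re.search(spec["extraction"][column], html)
        record[column] = match.group(1).strip() if match else None
    return record

# Stand-in for a fetched page; a real wrapper would follow the navigation rule first.
page = '<tr><td class="ticker">INTC</td><td class="net-income">1,983</td></tr>'
record = wrap(page, SPEC)
print(record)
# {'Ticker': 'INTC', 'Net_income': '1,983'}
```

Keeping the rules in a declarative spec means a change in page layout requires editing the spec, not the wrapper code.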