Title: RONDO: A Programming Platform for Generic Model Management
1 RONDO: A Programming Platform for Generic Model Management
- Sergey Melnik, University of Leipzig, Germany
- Erhard Rahm, University of Leipzig, Germany
- Philip A. Bernstein, Microsoft Research, Redmond, WA
Presented by Ali Riza KONAN, Aycan YALÇIN
2 Outline
- Introduction
- Generic Model Management
- Motivating Scenario
- Conceptual Structures
- Operators
- Implementation
- Prototype
- Related Work
- Conclusion
4 Introduction: Problems
- Large amount of programming needed to develop metadata-intensive applications
- Lack of a common programming platform
- Lack of good infrastructure
  - Metadata is stored in files, not in databases
  - Tool-specific infrastructure
5 Introduction: Goals
- Reduce the amount of programming
- Generalize the transformation operations
  - e.g. relational DB → XML Schema
- Handle metadata management in a generic fashion
- Simplify the development of applications
6 Introduction: Proposal
- Define and develop a set of algebraic operators to manipulate metadata in large chunks
8 Generic Model Management: Properties
- Not limited to a specific language or application domain
- Uses high-level algebraic operators to manipulate models and mappings
  - such as Match, Merge, Compose
- Applies operators to models and mappings as a whole
10 Motivating Scenario
Goal: solve such tasks at a high level of abstraction using a concise, generic script
11 Motivating Scenario (contd)
- Steps of change propagation
  - Detect the changes introduced in s2
  - Remove from d1 the images of the elements deleted from s1
  - Merge the XML schema counterparts of the columns added or renamed in s1 into d1 to obtain d2
- During these steps, the intervention of a human engineer may be required.
12 Motivating Scenario (contd)
- It is assumed that a translation tool is available as an operator (SQL2XSD) that takes a relational schema as input and produces as output an XML schema and a mapping between the original and converted schema elements.
- In this scenario, c is obtained by using this tool
  - <c, s2_c> = SQL2XSD(s2)
13 Motivating Scenario (contd)
- Rectangles → schemas
- Arcs → mappings between schemas
14 Motivating Scenario (contd)
- NOTICE
  - The above script is not limited to propagating changes from relational schemas to XML schemas.
  - Reverse propagation is also possible.
16 Conceptual Structures
- Models
- Morphisms
- Selectors
17 Conceptual Structures: Models
- Model as DAG (Directed Acyclic Graph)
  - Nodes → model elements (relations and attributes in relational schemas, type definitions in XML schemas, clauses of SQL statements)
  - Each element is uniquely identified by an object identifier (OID)
  - Edge <s, p, o>: s = source node, p = edge label, o = target node
  - In the graph, ovals denote OIDs and rectangles denote literals
  - Used in the prototype to visualize models
18 Conceptual Structures: Models (contd)
- Model as Relation
  - M(S: OID, P: OID, O: OID ∪ Literal, N: Integer)
  - N is an optional attribute used for ordering; S, P, O form a unique key
  - Used for internal computations
  - SQL query for node ordering
    - SELECT M.O FROM M WHERE M.S = <src node> AND M.P = <edge label> ORDER BY M.N
  - In the example, M.S = a1 and M.P = column
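The relational view and its ordering query can be mimicked in a few lines of Python. This is only an illustrative sketch: the `Edge` tuple, the sample OIDs (`a1` to `a4`) and the helper `ordered_targets` are assumptions made for the example, not part of the prototype.

```python
from typing import NamedTuple, Optional, Union

class Edge(NamedTuple):
    s: str                    # source node OID
    p: str                    # edge label OID
    o: Union[str, int]        # target OID or literal
    n: Optional[int] = None   # optional ordering attribute

# Hypothetical model: node a1 has three ordered columns and a name literal.
M = {
    Edge("a1", "column", "a3", 2),
    Edge("a1", "column", "a2", 1),
    Edge("a1", "column", "a4", 3),
    Edge("a1", "name", "PRODUCTS"),
}

def ordered_targets(model, src, label):
    """Mirror of: SELECT M.O FROM M WHERE M.S = src AND M.P = label ORDER BY M.N."""
    rows = [e for e in model if e.s == src and e.p == label]
    # Rows without an ordering value are placed last.
    rows.sort(key=lambda e: (e.n is None, e.n if e.n is not None else 0))
    return [e.o for e in rows]

columns = ordered_targets(M, "a1", "column")
```

Storing the model as a set of tuples keeps the (S, P, O) uniqueness of the relation as long as N is fixed per edge.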
19 Conceptual Structures: Morphisms
- A binary relation over two (possibly overlapping) sets of OIDs, i.e., a set of pairs <l, r> drawn from OID × OID
- Carries no semantics about the transformation of instances
  - e.g. no SQL WHERE-clause
- Additional properties may be attached to <l, r> pairs
  - e.g. a similarity value between nodes
20 Conceptual Structures: Morphisms (contd)
- Advantages
  - can represent a mapping between different kinds of models (e.g. relational and XML)
  - can always be inverted and composed (SQL views cannot)
  - easy to implement and manipulate
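Because a morphism is just a set of OID pairs, inversion and composition reduce to ordinary set operations. The sketch below assumes morphisms are Python sets of `(l, r)` tuples; the function names and the sample OIDs are illustrative only.

```python
def invert(morphism):
    """Swap the two sides of every pair."""
    return {(r, l) for (l, r) in morphism}

def compose(first, second):
    """Relational composition: follow `first`, then `second`."""
    return {(l, r2) for (l, r) in first for (l2, r2) in second if r == l2}

# Hypothetical morphisms between three schemas s1, d1 and d2.
s1_d1 = {("s1.cust", "d1.Customer"), ("s1.name", "d1.Name")}
d1_d2 = {("d1.Customer", "d2.Client")}

s1_d2 = compose(s1_d1, d1_d2)
```

Both operations are always defined, whatever the two models look like, which is the property the slide contrasts with SQL views.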
21 Conceptual Structures: Selectors
- A set of node identifiers from a model
- Can be viewed as a relation with a single attribute, S(V: OID), where V is a unique key
23 Operators
- Primitive Operators
- Derived Operators
- Complex Operators
24 Operators: Primitive Operators
Notation: m = model, s = selector, map = morphism
25 Operators: Primitive Operators (contd)
- Other Primitive Operators
  - Set operators
    - Union (∪)
    - Difference (-)
    - Intersection (∩)
  - All operator, All(m): returns a selector that contains those nodes of m that denote the model elements of the model's meta-model
  - Copy operator, Copy(m, s): creates a copy of a model in which the selected node IDs are replaced by new, uniquely created IDs
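A minimal sketch of the Copy operator, assuming a model is a set of `(s, p, o)` edges and a selector is a set of OIDs. The `copy:N` naming scheme and the returned old-to-new dictionary are assumptions made for illustration; the slide only states that selected IDs are replaced by fresh ones.

```python
import itertools

_fresh_ids = itertools.count(1)   # hypothetical source of unique IDs

def copy_model(model, selector):
    """Return a copy of `model` in which every selected OID gets a new ID.

    Simplification: a literal equal to a selected OID string would also be
    renamed; a real implementation would distinguish OIDs from literals.
    """
    renamed = {oid: f"copy:{next(_fresh_ids)}" for oid in sorted(selector)}
    copied = {(renamed.get(s, s), p, renamed.get(o, o)) for (s, p, o) in model}
    return copied, renamed

model = {("t", "column", "c"), ("t", "name", "T")}
copied, renamed = copy_model(model, {"t"})
```

Selectors themselves need no extra code in this representation: union, difference and intersection are Python's `|`, `-` and `&` on sets.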
26 Operators: Derived Operators
- Functional combinations of other operators
  - Range(map)
    - return Domain(Invert(map))
  - RestrictRange(map, selector)
    - return Invert(RestrictDomain(Invert(map), selector))
  - Traverse(selector, map)
    - return Range(RestrictDomain(map, selector))
  - Restrict(map, m1, m2)
    - return RestrictRange(RestrictDomain(map, All(m1)), All(m2))
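The derived operators above translate almost literally into code once the primitives exist. In this sketch the primitives `domain`, `invert`, `restrict_domain` and `all_nodes` are assumed set-based stand-ins (morphisms as sets of pairs, models as sets of `(s, p, o)` edges); `all_nodes` simply collects every node and ignores the meta-model check that All(m) performs.

```python
# Assumed primitives (set-based stand-ins, not the prototype's code).
def domain(mp):
    return {l for (l, _) in mp}

def invert(mp):
    return {(r, l) for (l, r) in mp}

def restrict_domain(mp, selector):
    return {(l, r) for (l, r) in mp if l in selector}

def all_nodes(model):
    return {s for (s, _, _) in model} | {o for (_, _, o) in model}

# Derived operators, written exactly as on the slide.
def range_(mp):
    return domain(invert(mp))

def restrict_range(mp, selector):
    return invert(restrict_domain(invert(mp), selector))

def traverse(selector, mp):
    return range_(restrict_domain(mp, selector))

def restrict(mp, m1, m2):
    return restrict_range(restrict_domain(mp, all_nodes(m1)), all_nodes(m2))

# Hypothetical models and a morphism with two dangling pairs.
m1 = {("t1", "column", "c1")}
m2 = {("e1", "child", "e2")}
mp = {("t1", "e1"), ("c1", "e2"), ("zz", "e2"), ("c1", "yy")}

clean = restrict(mp, m1, m2)
```

Here `restrict` drops the pairs that mention `zz` or `yy`, since those nodes belong to neither model.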
27 Operators: Complex Operators
- Extract and Delete
- Match
- Merge
28 Operators: Complex Operators (Extract and Delete)
- <m', m'_m> = Extract(m, s)
  - (m = well-formed model, s = selector)
- m' satisfies the following properties
  - 1. contains all selected nodes
  - 2. is well-formed
  - 3. is equally or less expressive than m
  - 4. is the minimal model that satisfies 1-3
29 Operators: Complex Operators (Extract and Delete, contd)
- Algorithm
  - 1. Create a closure of m
  - 2. Assign s' = s (s' is temporary)
  - 3. For each x in s', extend s' to satisfy conditions 2 and 3
  - 4. Repeat step 3 until a fixpoint is reached
  - 5. t = Subgraph(m, s')
  - 6. Obtain a cover of t
  - 7. Return Copy(t, All(t))
- Deleting is an extraction of the unselected portion
  - operator Delete(m, s)
    - return Extract(m, All(m) - s)
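The fixpoint at the heart of the algorithm, and Delete as its complement, can be sketched as follows. This is a deliberately reduced version: the closure, cover and Copy steps are omitted, only the extracted model is returned (not the morphism back to m), and the language-specific rule that extends the selector is passed in as a hypothetical callback `required_by`.

```python
def all_nodes(model):
    return {s for (s, _, _) in model} | {o for (_, _, o) in model}

def subgraph(model, selector):
    """Keep only the edges whose endpoints are both selected."""
    return {(s, p, o) for (s, p, o) in model if s in selector and o in selector}

def extract(model, selector, required_by):
    selected = set(selector)                 # step 2: s' = s
    while True:                              # steps 3-4: iterate to a fixpoint
        extra = set()
        for x in selected:
            extra |= required_by(model, x)
        if extra <= selected:
            break
        selected |= extra
    return subgraph(model, selected)         # step 5

def delete(model, selector, required_by):
    return extract(model, all_nodes(model) - set(selector), required_by)

# Hypothetical well-formedness rule: a column needs its owning table.
def owning_table(model, node):
    return {s for (s, p, o) in model if o == node and p == "column"}

model = {("t", "column", "c1"), ("t", "column", "c2")}
extracted = extract(model, {"c1"}, owning_table)
remaining = delete(model, {"c2"}, owning_table)
```

Selecting only `c1` pulls in table `t`, because a column without its table would not be well-formed under the assumed rule.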
30 Operators: Complex Operators (Match)
- m1_m2 = Match(m1, m2)
- Uncovers how two models correspond to each other
- Takes two graphs as input
- Requires information that is not present in the schemas
  - → cannot be fully automated
31 Operators: Complex Operators (Merge)
- <m, m1_m, m2_m> = Merge(m1, m2, map)
- Used to combine two models into one
- Input models m1 and m2 are well-formed
- Should produce a well-formed model m
32 Operators: Complex Operators (Merge, contd)
- m satisfies
  - At least as expressive as each of the input models
  - Minimal
- map (a morphism) describes the elements of m1 and m2 that are equivalent or should be merged into a single element in m
- mi_m: counterparts of the elements of mi in the merged model m
34 Implementation
- Extract and Delete
- Match
- Merge
35 Implementation: Extract and Delete
- Primary key constraints: PID, DID
- Referential constraint: between PRODUCTS.PID and O_DETAILS.PID
- Constraints: c1, c2, c3
36 Implementation: Match
- Implemented using the SF (Similarity Flooding) algorithm, a graph matching algorithm
- The attribute Sim is added to the morphism as a third attribute to hold the similarity value of each pair of nodes
- operator Match(m1, m2, seed)
  - multimap = SFJoin(m1, m2, seed)
  - multimap = Restrict(multimap, m1, m2)
  - map = FilterBest(multimap)
  - return <map, multimap>
- The seed is obtained by the NGramMatch algorithm, which computes similarities of literals in m1 and m2
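To give a feel for how a seed could be computed from literals, here is a small n-gram similarity sketch. The slides do not define NGramMatch, so the padded trigrams, the Dice coefficient and the names `ngram_similarity` and `ngram_seed` are all assumptions chosen for illustration.

```python
def ngrams(text, n=3):
    """Set of character n-grams of `text`, padded so short strings still match."""
    pad = "#" * (n - 1)
    padded = f"{pad}{text.lower()}{pad}"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}

def ngram_similarity(a, b, n=3):
    """Dice coefficient over n-gram sets, a value between 0 and 1."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga or not gb:
        return 0.0
    return 2 * len(ga & gb) / (len(ga) + len(gb))

def ngram_seed(literals1, literals2, threshold=0.0):
    """Weighted binary relation {(l, r): sim} for pairs above the threshold."""
    seed = {}
    for l in literals1:
        for r in literals2:
            sim = ngram_similarity(l, r)
            if sim > threshold:
                seed[(l, r)] = sim
    return seed

seed = ngram_seed(["CustName"], ["CustomerName", "xyz"])
```

The resulting dictionary has the shape of a morphism with an added Sim attribute: one similarity value per pair.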
37 Implementation: Match (contd)
- SF Algorithm
  - Inputs: two graphs (m1, m2) and a seed
    - seed (a weighted binary relation): a set of initial similarity values between the nodes of the graphs
    - <l, r>: a pair in the seed
    - Each <l, r> carries a sim value between 0 and 1
  - Output: a weighted binary relation
  - Propagates the initial similarity of nodes to the surrounding nodes (using the intuition that neighbors of similar nodes are similar)
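The propagation idea can be sketched as a short fixpoint-style iteration. This is a simplified illustration, not the SF algorithm as implemented in the prototype: node pairs are linked when both models contain an edge with the same label, each pair spreads its similarity evenly over its linked pairs (an assumed weighting), and values are normalized by the maximum after each round. The function name and the iteration count are assumptions.

```python
from collections import defaultdict

def similarity_flooding(m1, m2, seed, iterations=10):
    """Simplified similarity propagation over two edge sets.

    m1, m2: sets of (s, p, o) edges; seed: {(l, r): sim} with sim in [0, 1].
    """
    # Pairs (a, x) and (b, y) are neighbors if a -p-> b in m1 and x -p-> y in m2.
    neighbors = defaultdict(set)
    for (a, p, b) in m1:
        for (x, q, y) in m2:
            if p == q:
                neighbors[(a, x)].add((b, y))
                neighbors[(b, y)].add((a, x))

    pairs = set(neighbors) | set(seed)
    sigma = {pair: seed.get(pair, 0.0) for pair in pairs}

    for _ in range(iterations):
        raised = {}
        for pair in pairs:
            incoming = sum(sigma[n] / len(neighbors[n])
                           for n in neighbors.get(pair, ()))
            raised[pair] = seed.get(pair, 0.0) + sigma[pair] + incoming
        top = max(raised.values(), default=0.0)
        sigma = {pair: (value / top if top > 0 else 0.0)
                 for pair, value in raised.items()}
    return sigma

# Hypothetical input: only the two columns are known to be similar.
m1 = {("t1", "column", "c1")}
m2 = {("e1", "column", "f1")}
result = similarity_flooding(m1, m2, {("c1", "f1"): 1.0})
```

Although only the column pair is seeded, the table pair `("t1", "e1")` ends up with a positive similarity because its neighbors are similar.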
38 Implementation: Merge
- To implement the Merge operator, an algorithm called GraphMerge was developed
- The algorithm consists of three conceptual steps
  - Node renaming: nodes at the blunt ends are renamed to their targets at the sharp ends
  - Graph union: a set union of the two sets of edges
  - Conflict resolution: for models that are not well-formed (the costliest step, since it requires human feedback)
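The first two steps, node renaming and graph union, fit in a few lines. The sketch assumes that every pair of the morphism is oriented as `(source, target)`, that each source has a single target, and that models are sets of `(s, p, o)` edges; conflict resolution is left out entirely.

```python
def graph_merge(m1, m2, mapping):
    """Rename mapped source nodes to their targets, then union the edge sets.

    Assumption: `mapping` holds (source, target) pairs, one target per source.
    Conflict resolution (step 3) is not attempted here.
    """
    rename = dict(mapping)

    def renamed(node):
        return rename.get(node, node)

    def apply(model):
        return {(renamed(s), p, renamed(o)) for (s, p, o) in model}

    return apply(m1) | apply(m2)

# Hypothetical input: y is merged into x, and x2 into z1.
m1 = {("x", "attr", "x2"), ("x", "name", "CUSTOMER")}
m2 = {("y", "attr", "z1")}
merged = graph_merge(m1, m2, {("y", "x"), ("x2", "z1")})
```

After renaming, the two `attr` edges coincide, so the set union keeps a single copy of the shared edge.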
40 Implementation: Merge (contd)
- Morphism map: (x, y, ), (x1, y2, ), (x1, y2, )
- The target element is kept and the source element is discarded, e.g. x is kept and y is discarded
41 Implementation: Merge (contd)
- In the merged graph, node z1, which represents the CUST attribute, has now become an attribute of two different relations. To solve this kind of conflict, a heuristic was developed
  - Track the origin of each edge in the merged graph
  - Assign a tag to each edge
    - Source node of map: -
    - Target node of map: +
    - Neither of the two: o
  - e.g. <x, z1>, obtained by renaming from <x, x2>, is tagged with +-, since x is a target node and x2 is a source node
  - Eliminate the edge(s) that cause the conflict by considering the priority
43 Implementation: Merge (contd)
Merge Algorithm (notation)
- G: merged model
- s: all source nodes of map
- m1_G: morphism between m1 and G
- m2_G: morphism between m2 and G
- m: copy of G
46 Prototype
- Interpreter
  - central component
  - executes scripts
  - can be run from the command line or invoked by external applications and tools
  - orchestrates the data flow
47 Prototype
- Operators are implemented as
  - a native implementation
    - e.g. ReadSQLDDL, WriteSQLDDL, ReadDb, WriteDb
    - all primitive operators
    - GUI operators like EditMap, EditSelector
    - operators SFJoin and GraphMerge
    - schema translation / conversion operators
  - scripts
    - e.g. alias ReadSQLDDL <Java class name>
    - Range, Match, Merge
48 Prototype (contd)
- NOTICE
  - The specifications of commonly used native or derived operators can be grouped in a single script and reused in other scripts via include statements
49 Prototype (contd)
- Facilities of the interpreter
  - debugging
    - e.g. examining the execution traces
  - flexible handling of input and output
    - e.g. operators that return more than one output parameter
50 Prototype (contd)
- Currently supported modeling languages are
- SQL DDL
- XML Schema
- RDF Schema
- SQL views
- UML
51 Prototype (contd)
- To introduce a new modelling language
  - Step 1: Provide import/export operators
  - Step 2: Implement callbacks for the operators All, Extract and GraphMerge
52 Prototype (contd)
Code Breakdown (LOC)
54 Related Work
- In previous work, schemas were represented as graphs having is-a, has-a, and functional dependency relationships.
- In this approach, the graphs are syntactic structures. Morphisms are used as schema correspondences. Selectors are introduced for the first time.
55 Related Work (contd)
- One of the surprises of the present work is how much leverage one can get out of simple morphisms.
- To minimize the amount of manual post-processing in schema matching, a structural matcher is used instead of machine learning.
- Each converter is implemented as a custom, non-generic operator.
56 Related Work (contd)
- The operators presented are mostly syntactic,
just like the conceptual structures, and are
expressed as graph transformations.
58 Conclusions
- The main conclusions are the following
  - One can solve practical problems using the model management operators
  - The solutions require a relatively small amount of code
  - One can get very far using a relatively weak representation for models and mappings
59 THANKS FOR YOUR VALUABLE TIME