RONDO: A Programming Platform for Generic Model Management

1
RONDO: A Programming Platform for Generic
Model Management
  • Sergey Melnik, University of Leipzig,
    Germany
  • Erhard Rahm, University of Leipzig,
    Germany
  • Philip A. Bernstein, Microsoft Research,
    Redmond, WA

Presented by Ali Riza KONAN, Aycan YALÇIN
2
Outline
  • Introduction
  • Generic Model Management
  • Motivating Scenario
  • Conceptual Structures
  • Operators
  • Implementation
  • Prototype
  • Related Work
  • Conclusions

3
  • Introduction

4
Introduction: Problems
  • Amount of programming for development of
    metadata-intensive applications
  • Lack of a common programming platform
  • Lack of good infrastructure
      - Storing metadata in files, not in DBs
      - Tool-specific infrastructure

5
Introduction: Goals
  • Reduce the amount of programming
  • Generalize the transformation operations
  • e.g. Relational DB → XML Schema
  • Handling metadata management in a generic
    fashion
  • Simplifying the development of applications

6
Introduction: Proposal
  • Defining and developing a set of algebraic
    operators to manipulate metadata in large chunks

7
  • Generic Model Management

8
Generic Model Management: Properties
  • Not limited to a specific language or application
    domain
  • Using high-level algebraic operators (such as
    Match, Merge, Compose) to manipulate models and
    mappings
  • Applying operators to models and mappings as a
    whole

9
  • Motivating Scenario

10
Motivating Scenario
To solve such tasks at a high level of
abstraction using a concise generic script
11
Motivating Scenario (contd)
  • Steps of change propagation
  • Detect the changes introduced in s2
  • Remove d1 images of the deleted elements in s1
  • Merge XML schema counterparts of added and
    renamed columns in s1 into d1 to obtain d2
  • During these steps, intervention by a human
    engineer may be required.

12
Motivating Scenario (contd)
  • It is assumed that a translation tool is
    available as an operator (SQL2XSD) which takes as
    input a relational schema and produces as output
    an XML schema and a mapping between the original
    and converted schema elements.
  • In this scenario, c is obtained by using this
    tool
  • <c, s2_c> = SQL2XSD(s2)

13
Motivating Scenario (contd)
Rectangles → schemas; arcs → mappings between
schemas
14
Motivating Scenario (contd)
  • NOTICE
  • The above script is not limited to propagating
    changes from relational schemas to XML schemas.
  • Reverse propagation is also possible.

15
  • Conceptual Structures

16
Conceptual Structures
  • Models
  • Morphisms
  • Selectors

17
Conceptual Structures: Models
  • Model as DAG (Directed Acyclic Graph)
  • The nodes denote model elements (relations and
    attributes in relational schemas, type
    definitions in XML schemas, clauses of SQL
    statements)
  • Each element is uniquely identified by an object
    identifier (OID)
  • Each edge is a triple <s, p, o>: s = source node,
    p = edge label, o = target node
  • In the graph, ovals denote OIDs and rectangles
    denote literals.
  • Used in the prototype to visualize models

18
Conceptual Structures: Models (contd)
  • Model as Relation
  • M(S: OID, P: OID, O: OID ∪ Literal, N: Integer)
  • Here, N is an optional attribute used for ordering,
    and S, P, O form a unique key.
  • Used for internal computations
  • SQL query for node ordering
  • SELECT M.O FROM M WHERE M.S = <source node>
    AND M.P = <edge label> ORDER BY M.N
  • In the example, M.S = a1 and M.P = column.
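
The relational view of a model and the ordering query can be tried out with a small sketch. It uses Python's built-in sqlite3 module; the sample rows (a1 with columns a2, a3, a4 and a name literal) and the helper name ordered_targets are made up for illustration and are not taken from the presentation.

```python
import sqlite3

# Hypothetical sample model: node a1 has three "column" edges with an
# explicit order (N) and one unordered "name" edge to a literal.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE M (S TEXT, P TEXT, O TEXT, N INTEGER, PRIMARY KEY (S, P, O))"
)
conn.executemany(
    "INSERT INTO M VALUES (?, ?, ?, ?)",
    [
        ("a1", "column", "a3", 2),
        ("a1", "column", "a2", 1),
        ("a1", "column", "a4", 3),
        ("a1", "name", "PRODUCTS", None),
    ],
)


def ordered_targets(conn, src, label):
    """Return the targets of all `label` edges leaving `src`, ordered by N."""
    rows = conn.execute(
        "SELECT M.O FROM M WHERE M.S = ? AND M.P = ? ORDER BY M.N",
        (src, label),
    )
    return [o for (o,) in rows]


columns = ordered_targets(conn, "a1", "column")
print(columns)
```

The composite primary key on (S, P, O) mirrors the unique-key statement on the slide, while N stays nullable because it is optional.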

19
Conceptual Structures: Morphisms
  • A binary relation over two (possibly overlapping)
    sets of OIDs, i.e., a set of pairs <l, r> drawn
    from OID × OID
  • No semantics about the transformation of
    instances
  • e.g. no SQL WHERE clause
  • Additional properties may be added to <l, r> pairs
  • e.g. a similarity value between nodes

20
Conceptual Structures: Morphisms (contd)
  • Advantages
  • can represent a mapping between different kinds
    of models (e.g. relational and XML)
  • can always be inverted and composed (SQL views
    cannot)
  • can be implemented and manipulated easily
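
Because a morphism is only a set of OID pairs, inversion and composition reduce to plain set operations. The following is a minimal sketch under that reading; the function names and the sample OIDs (s.PID, d.ProductId, ui.idField) are hypothetical.

```python
def invert(morphism):
    """Swap the left and right side of every pair."""
    return {(r, l) for (l, r) in morphism}


def compose(first, second):
    """Relational composition: follow `first`, then `second`."""
    return {
        (l, r2)
        for (l, r1) in first
        for (l2, r2) in second
        if r1 == l2
    }


# Hypothetical morphisms: relational schema -> XML schema -> a third model.
sql_to_xsd = {("s.PID", "d.ProductId"), ("s.NAME", "d.ProductName")}
xsd_to_ui = {("d.ProductId", "ui.idField")}

sql_to_ui = compose(sql_to_xsd, xsd_to_ui)
print(sql_to_ui)
print(invert(sql_to_xsd))
```

Both operations are defined for any pair of morphisms, which is the property the slide contrasts with SQL views.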

21
Conceptual Structures: Selectors
  • a set of node identifiers from a model
  • can be viewed as a relation with a single
    attribute, S(V: OID), where V is a unique key.

22
  • Operators

23
Operators
  • Primitive Operators
  • Derived Operators
  • Complex Operators

24
Operators: Primitive Operators
m = model, s = selector, map = morphism
25
Operators: Primitive Operators (contd)
  • Other Primitive Operators
  • Set Operators
  • Union (∪)
  • Difference (-)
  • Intersection (∩)
  • All Operator (All(m)): returns a selector that
    contains those nodes of m that denote the model
    elements of the model's meta-model
  • Copy Operator (Copy(m, s)): creates a copy of a
    model in which selected node IDs are replaced by
    new, uniquely created IDs
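
A rough sketch of the Copy idea, assuming a model is held as a set of <s, p, o> triples. The name copy_model, the use of UUIDs as fresh identifiers, and returning the old-to-new ID table alongside the copy are all assumptions made for illustration.

```python
import uuid


def copy_model(model, selector):
    """Copy `model`, replacing every selected node ID with a fresh one.

    Returns the copied triples and the old-ID -> new-ID table.
    """
    fresh = {node: uuid.uuid4().hex for node in selector}
    copied = {
        (fresh.get(s, s), p, fresh.get(o, o))
        for (s, p, o) in model
    }
    return copied, fresh


# Hypothetical model: one table node with a column and a name literal.
model = {("t1", "column", "c1"), ("t1", "name", "PRODUCTS")}
copied, fresh = copy_model(model, {"t1", "c1"})
print(copied)
```

Nodes outside the selector (here the literal PRODUCTS) keep their identity, so the copy shares them with the original.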

26
Operators: Derived Operators
  • Functional combinations of other operators
  • Range(map)
      return Domain(Invert(map))
  • RestrictRange(map, selector)
      return Invert(RestrictDomain(Invert(map), selector))
  • Traverse(selector, map)
      return Range(RestrictDomain(map, selector))
  • Restrict(map, m1, m2)
      return RestrictRange(RestrictDomain(map, All(m1)), All(m2))
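
The derived operators above translate almost literally into code once the primitives exist. In this sketch the primitives Domain, Invert and RestrictDomain are given their usual relational meaning, and All is simplified to "every node that occurs in a triple"; both are assumptions, since the slide with the primitive definitions is not part of the transcript.

```python
# Assumed primitives (morphism = set of (l, r) pairs, selector = set of OIDs).
def domain(mapping):
    return {l for (l, _) in mapping}


def invert(mapping):
    return {(r, l) for (l, r) in mapping}


def restrict_domain(mapping, selector):
    return {(l, r) for (l, r) in mapping if l in selector}


def all_nodes(model):
    """Simplified All(m): every node appearing in an <s, p, o> triple."""
    return {s for (s, _, _) in model} | {o for (_, _, o) in model}


# Derived operators, composed exactly as on the slide.
def range_of(mapping):
    return domain(invert(mapping))


def restrict_range(mapping, selector):
    return invert(restrict_domain(invert(mapping), selector))


def traverse(selector, mapping):
    return range_of(restrict_domain(mapping, selector))


def restrict(mapping, m1, m2):
    return restrict_range(restrict_domain(mapping, all_nodes(m1)), all_nodes(m2))


# Hypothetical models and a mapping with two dangling pairs.
m1 = {("t1", "column", "c1"), ("t1", "column", "c2")}
m2 = {("e1", "attr", "x1")}
mapping = {("c1", "x1"), ("c2", "zz"), ("q", "x1")}
print(restrict(mapping, m1, m2))
```

Restrict drops every pair whose left side is not in m1 or whose right side is not in m2, which is how it is later used to clean up the result of SFJoin.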

27
Operators: Complex Operators
  • Extract & Delete
  • Match
  • Merge

28
Operators: Complex Operators (Extract & Delete)
  • <m', m_m'> = Extract(m, s)
  • (m = well-formed model, s = selector)
  • m' satisfies the following properties
  • 1. contains all selected nodes
  • 2. is well-formed
  • 3. is equally or less expressive than m
  • 4. is the minimal model that satisfies 1-3

29
Operators: Complex Operators (Extract & Delete contd)
  • Algorithm
  • 1. Create a closure of m (?m)
  • 2. Assign s' = s (s' is temporary)
  • 3. For each x in s', extend s' to satisfy
    conditions 2 and 3
  • 4. Repeat step 3 until a fixpoint is reached
  • 5. t = Subgraph(m, s')
  • 6. Obtain a cover of t (?t)
  • 7. Return Copy(t, All(t))
  • Deleting is an extraction of the unselected
    portion,
  • operator Delete(m, s)
  • return Extract(m, All(m)-s)
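
The fixpoint part of the algorithm (steps 2 to 5) and the definition of Delete can be sketched as follows. This is a simplification under stated assumptions: the closure, cover and final Copy steps are omitted, and the language-specific conditions 2 and 3 are stood in for by a caller-supplied function, here the hypothetical rule parents, which says a node needs every node that points at it.

```python
def all_nodes(model):
    """Simplified All(m): every node appearing in an <s, p, o> triple."""
    return {s for (s, _, _) in model} | {o for (_, _, o) in model}


def extract(model, selector, required):
    """Grow the selection to a fixpoint, then return the induced subgraph."""
    selected = set(selector)                 # temporary copy of the selector
    while True:
        extension = set()
        for node in selected:
            extension |= required(model, node)
        if extension <= selected:            # nothing new: fixpoint reached
            break
        selected |= extension
    return {(s, p, o) for (s, p, o) in model if s in selected and o in selected}


def delete(model, selector, required):
    """Deleting is extracting the unselected portion."""
    return extract(model, all_nodes(model) - set(selector), required)


def parents(model, node):
    """Hypothetical rule: a node requires all nodes with an edge to it."""
    return {s for (s, _, o) in model if o == node}


model = {
    ("db", "table", "t1"), ("t1", "column", "c1"), ("t1", "column", "c2"),
    ("db", "table", "t2"), ("t2", "column", "c3"),
}
print(extract(model, {"c1"}, parents))
print(delete(model, {"t2", "c3"}, parents))
```

Selecting only column c1 pulls in its table and the database node, which is the kind of extension the well-formedness condition is meant to force.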

30
Operators: Complex Operators (Match)
  • <m1_m2> = Match(m1, m2)
  • Uncovers how two models correspond to each other
  • Takes two graphs as input
  • Requires information that is not present in the
    schemas
  • → Cannot be fully automated

31
Operators: Complex Operators (Merge)
  • <m, m1_m, m2_m> = Merge(m1, m2, map)
  • Used to combine two models into one
  • Input models m1 and m2 are well-formed
  • Should produce a well-formed model m

32
Operators: Complex Operators (Merge contd)
  • m satisfies
  • At least as expressive as each of the
    input models
  • Minimal
  • map (morphism) describes the elements of m1 and
    m2 that are equivalent or should be merged into a
    single element in m
  • mi_m: counterparts of the elements of mi in the
    merged model m

33
  • Implementation

34
Implementation
  • Extract & Delete
  • Match
  • Merge

35
Implementation: Extract & Delete
  • Primary key constraints: PID, DID
  • Referential constraint between PRODUCTS.PID
    and O_DETAILS.PID
  • Constraints: c1, c2, c3

36
Implementation: Match
  • Implemented using the SF (Similarity Flooding)
    algorithm, a graph matching algorithm
  • The attribute Sim is added to the morphism as a
    third attribute to hold the similarity value of
    each pair of nodes
  • operator Match(m1, m2, seed)
      multimap = SFJoin(m1, m2, seed)
      multimap = Restrict(multimap, m1, m2)
      map = FilterBest(multimap)
      return <map, multimap>
  • Seed is obtained by the NGramMatch algorithm,
    which computes similarities of literals in m1
    and m2
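
To make the seed and filtering steps concrete, here is a small stand-in for the two ends of the Match pipeline. It is not the prototype's NGramMatch or FilterBest: the seed uses a trigram Dice coefficient over element names, and the filter simply keeps the highest-scoring partner for each left node. Both choices, and all names in the sample data, are assumptions.

```python
def ngrams(text, n=3):
    """Set of character n-grams of the lower-cased, '#'-padded text."""
    padded = "#" * (n - 1) + text.lower() + "#" * (n - 1)
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def ngram_similarity(a, b, n=3):
    """Dice coefficient over n-gram sets, a value between 0 and 1."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    return 2 * len(ga & gb) / (len(ga) + len(gb))


def ngram_seed(names1, names2):
    """Weighted morphism {(l, r, sim)} for all name pairs with sim > 0."""
    seed = set()
    for id1, name1 in names1.items():
        for id2, name2 in names2.items():
            sim = ngram_similarity(name1, name2)
            if sim > 0:
                seed.add((id1, id2, sim))
    return seed


def filter_best(multimap):
    """Keep, for each left node, the pair with the highest similarity."""
    best = {}
    for l, r, sim in multimap:
        if l not in best or sim > best[l][1]:
            best[l] = (r, sim)
    return {(l, r, sim) for l, (r, sim) in best.items()}


# Hypothetical node names from a relational schema and an XML schema.
names1 = {"c1": "CustomerName", "c2": "OrderDate"}
names2 = {"x1": "custName", "x2": "orderDate"}
result = filter_best(ngram_seed(names1, names2))
print(sorted(result))
```

In the full operator the seed would first be refined by SFJoin and Restrict before filtering; the sketch skips that middle step.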

37
Implementation: Match (contd)
  • SF Algorithm
  • Inputs: two graphs (m1, m2) and a seed
  • seed (a weighted binary relation): a set of
    initial similarity values between the nodes of
    the graphs
  • <l, r>: a pair in seed
  • Each <l, r> carries a sim value between 0 and 1
  • Output: a weighted binary relation
  • Propagates the initial similarity of nodes to the
    surrounding nodes (using the intuition that
    neighbors of similar nodes are similar)
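
The propagation idea can be illustrated with a heavily simplified variant. In this sketch two node pairs are treated as neighbors whenever the two graphs contain equally labeled edges between them, every neighbor contributes with weight 1, and values are rescaled by the maximum after each round. The real algorithm uses weighted propagation and a convergence test, so this only illustrates the intuition.

```python
from collections import defaultdict


def similarity_flooding(m1, m2, seed, iterations=10):
    """Propagate seed similarities between pairs linked by equal edge labels."""
    neighbors = defaultdict(set)
    for s1, p1, o1 in m1:
        for s2, p2, o2 in m2:
            if p1 == p2:
                # The source pair and the target pair support each other.
                neighbors[(s1, s2)].add((o1, o2))
                neighbors[(o1, o2)].add((s1, s2))

    pairs = set(neighbors) | set(seed)
    sim = {pair: seed.get(pair, 0.0) for pair in pairs}
    for _ in range(iterations):
        nxt = {
            pair: seed.get(pair, 0.0) + sim[pair]
            + sum(sim[q] for q in neighbors.get(pair, ()))
            for pair in pairs
        }
        top = max(nxt.values(), default=0.0)
        if top == 0.0:
            break
        sim = {pair: value / top for pair, value in nxt.items()}  # keep in [0, 1]
    return sim


# Hypothetical graphs: a table with two parts vs. an XML element with two parts.
m1 = {("t1", "column", "c1"), ("t1", "key", "c2")}
m2 = {("e1", "column", "x1"), ("e1", "key", "x2")}
sim = similarity_flooding(m1, m2, {("t1", "e1"): 1.0})
print(sim)
```

Only the pair (t1, e1) is seeded, yet (c1, x1) and (c2, x2) end up with a positive score because they are reached through equally labeled edges.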

38
Implementation: Merge
  • To implement the Merge operator, an algorithm
    called GraphMerge was developed
  • The algorithm consists of three conceptual steps
  • Node renaming: nodes at the blunt ends are
    renamed to their targets at the sharp ends
  • Graph union: a set union of the two sets of
    edges
  • Conflict resolution: for models that are not
    well-formed (the costliest step, since it
    requires human feedback)
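
The first two steps, node renaming and graph union, are mechanical and can be sketched in a few lines. The sketch assumes models are sets of <s, p, o> triples and that map is a set of (source, target) pairs with at most one target per source; conflict resolution is left out entirely, and the function name is hypothetical.

```python
def rename_and_union(m1, m2, mapping):
    """Rename every mapped source node to its target, then union the edges."""
    rename = {source: target for (source, target) in mapping}

    def renamed(model):
        return {
            (rename.get(s, s), p, rename.get(o, o))
            for (s, p, o) in model
        }

    return renamed(m1) | renamed(m2)


# Hypothetical input: t2/c2 in the second model correspond to t1/c1 in the first.
m1 = {("t1", "column", "c1")}
m2 = {("t2", "column", "c2"), ("t2", "column", "c3")}
mapping = {("t2", "t1"), ("c2", "c1")}
merged = rename_and_union(m1, m2, mapping)
print(merged)
```

Because the result is a set of edges, edges that become identical after renaming collapse automatically; anything that leaves the graph ill-formed would still need the third step.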

39
Implementation: Merge (contd)
40
Implementation: Merge (contd)
Morphism map: (x,y, ), (x1,y2, ), (x1,y2, )
Target element is kept and source element is
discarded, e.g. x will be kept and y will be
discarded.
41
Implementation: Merge (contd)
  • In the merged graph, node z1, which represents the
    CUST attribute, has now become an attribute of two
    different relations. To solve this kind of
    conflict, a heuristic was developed
  • Track the origin of each edge in the merged graph
  • Assign a tag to each edge
  • Source node of map: -
  • Target node of map: +
  • Neither of the two: o
  • e.g. <x, z1>, obtained by renaming from <x, x2>, is
    tagged with +-, since x is a target node and x2 is
    a source node
  • Eliminate the edge(s) that cause the conflict by
    considering the priority

42
Implementation: Merge (contd)
43
Implementation: Merge (contd)
Merge Algorithm
  • G: merged model
  • s: all source nodes of map
  • m1_G: morphism between m1 and G
  • m2_G: morphism between m2 and G
  • m: copy of G
44
  • Prototype

45
Prototype
46
Prototype
  • Interpreter
  • central component
  • executes scripts
  • can be run from command line
  • or invoked by external applications and tools
  • orchestrates the data flow

47
Prototype
  • Operators
  • a native implementation
  • e.g. ReadSQLDDL, WriteSQLDDL
  • ReadDb, WriteDb
  • all primitive operators
  • GUI operators like EditMap,
    EditSelector
  • operators SFJoin and GraphMerge
  • schema translation/conversion
    operators
  • scripts
  • e.g. alias ReadSQLDDL <Java class name>
  • Range, Match, Merge

48
Prototype (contd)
  • NOTICE
  • The specification of the commonly used native
    or derived operators can be grouped in a single
    script and utilized in other scripts using
    include statements

49
Prototype (contd)
  • Facilities of the interpreter
  • debugging
  • e.g. examine the execution traces
  • flexible handling of input and output
  • e.g. operators having more than one parameter
    as output

50
Prototype (contd)
  • Currently supported features are
  • SQL DDL
  • XML Schema
  • RDF Schema
  • SQL views
  • UML

51
Prototype (contd)
  • To introduce a new modelling language
  • Step 1 Provide import/export operators
  • Step 2 Implement callbacks for operators
  • All, Extract and GraphMerge

52
Prototype (contd)
Code Breakdown (LOC)
53
  • Related Work

54
Related Work
  • In previous work, schemas were represented as
    graphs having is-a, has-a, functional dependency
    relationships.
  • In this approach, the graphs are syntactic
    structures. Morphisms are used as schema
    correspondences. Selectors are introduced
    for the first time.

55
Related Work (contd)
  • One of the surprises of the present work is how
    much leverage one can get out of simple
    morphisms.
  • For minimizing the amount of manual
    post-processing in schema matching, the
    structural matcher is used instead of machine
    learning.
  • Each converter is implemented as a custom,
    non-generic operator.

56
Related Work (contd)
  • The operators presented are mostly syntactic,
    just like the conceptual structures, and are
    expressed as graph transformations.

57
  • Conclusions

58
Conclusions
  • The main conclusions are the following
  • One can solve practical problems using the model
    management operators
  • The solutions require a relatively small amount
    of code
  • One can get very far using a relatively weak
    representation for models and mappings

59
  • THANKS FOR YOUR VALUABLE TIME