Title: CASCON 2006 Model Fusion Workshop An Industry Perspective
1CASCON 2006 Model Fusion Workshop: An Industry Perspective
Kim Letkeman
Development Lead, Advanced Model Driven Development Application Analysis
IBM Rational Software
2Agenda
- Introduction
- An Industry Perspective
- Challenges when Modeling in a Team Environment
- Summary
4What is Model Management?
- From the Model Fusion web page:
- Models are often constructed and manipulated in distributed teams, each modeler working on a partial view of the overall system.
- Coordinating this effort, maintaining consistency, and merging the various views is a major challenge.
- The tools and techniques used to address these issues are all a part of model management.
5Model Management Key Issues
- Terminology - modelers and tools vary in the definition and application of the terms that they use.
- Overlap - different models may refer to the same concepts in the requirements or design of a system; these overlapping concepts may be presented differently in each model; the models may also contradict one another.
- Ambiguity - modeling languages have ambiguities in their semantics, even when well-defined meta-models are available. This will cause problems if vendors and modelers make different assumptions.
- Evolution - models may evolve through a number of different versions, so that model transformations may need to be recomputed if the source models are updated.
6Method Flow
- The way we model has a huge influence on the problems we choose to solve.
- For example, if your only driver is team development, then you will work only on a merge solution.
- But if your drivers expand to include all of the processes inherent in MDA, then you have to solve a more general fusion issue.
7Ad-hoc Modeling
Projects usually start here
- A team of modelers jumps on the task with fervor, and ends up creating several models that must be combined to form the enterprise model set going forward.
- Chief characteristic is ad-hoc development with no version control.
- High risk of model corruption and confusion; a very difficult task if done manually.
[Figure: models 1 to 4 combined through three fusion steps]
Manual fusion steps add risk; prefer assisted fusion.
8Team Modeling
Projects might go here
- Any development environment where people collaborate and work together to build a software system is a team environment.
- Chief characteristic is parallel development.
- High risk of accidental erasure of artifacts if version control is manual and appropriate procedures are not followed or discipline is not exercised.
[Figure: versions 1, 2 and 3, with parallel versions 1a and 1b]
No fusion step, v2 is overwritten, all changes lost!
9Controlled Modeling Scenario
Projects should go here!
- ClearCase and CVS automatically detect the need
for a merge!
Fusion step preserves changes in all versions.
10Model Driven Architecture
- A modeling approach whereby the system is specified in a platform-independent way, and thus is separated from the idiosyncrasies of any specific implementation platform.
- Most importantly, a language- and vendor-neutral architecture is created.
- MDA is formally defined by the OMG.
11MDA Layering and Flow
Model Fusion is needed here.
And here.
And here.
12Fusion at every transformation
- At every transformation stage, you can iterate. Change the source model, and you must transform again.
- But the target model may already have custom content.
- There is therefore a fusion step at the end of each transformation step, which must preserve the custom content in the transformation target.
[Figure labels: Abstract Model, Transformation, Concrete Model, Fusion, Concrete Model]
Fusion step should be automatic (silent) if possible, to allow builders to run the process.
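The fusion step after a repeated transformation can be sketched as follows. This is only an illustration under simplifying assumptions, not RSA's implementation: a model is a flat dictionary from element name to value, and the hypothetical `previously_generated` set records which entries the last transformation run produced, so everything else in the target is treated as custom content.

```python
def fuse(regenerated: dict, existing: dict, previously_generated: set) -> dict:
    """Fuse a freshly transformed target with the existing target model.

    Entries that came from the previous transformation run are replaced by
    the new output (or dropped if the transformation no longer emits them).
    Entries that were added by hand in the target are carried over.
    """
    custom = {k: v for k, v in existing.items() if k not in previously_generated}
    fused = dict(regenerated)
    # In this sketch a custom entry wins if it collides with a generated one.
    fused.update(custom)
    return fused


# Hypothetical target model: "Order.auditNote" was added by hand.
previously_generated = {"Order.id", "Order.total"}
existing = {"Order.id": "int", "Order.total": "float", "Order.auditNote": "str"}

# The source changed: id is now long, total is gone, customer is new.
regenerated = {"Order.id": "long", "Order.customer": "str"}

fused = fuse(regenerated, existing, previously_generated)
print(fused)
```

Because no user decision is needed here, a step of this kind can run silently inside a build. A real fusion would also have to handle renamed or moved custom content.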
13Agenda
- Introduction
- An Industry Perspective
- Challenges when Modeling in a Team Environment
- Summary
14Fundamental Project Drivers
- It's about the bottom line.
- Sales are directly impacted by how users perceive our solutions. Costs skyrocket when issues arise on customer sites and problems are escalated to the development team. So our solution has to work every time and in every situation.
- And the customer.
- Although the complexity of fusion is fascinating on its own, we must make it possible for users to work in parallel on models of any size and to use version control systems such as ClearCase and CVS to manage their artifacts.
- Thus, we have no choice but to divide our attention into several key areas of focus.
15Key Focus Areas
- Integration
- We need integrations with Eclipse (Compare With), the various flavors of ClearCase (cleartool command line, project explorer, version tree, history view, original Eclipse plugin (SCM), Eclipse remote client (CCRC)), and CVS.
- Back-end
- We need fast delta generation and conflict analysis.
- Front-end
- We need a GUI that will make it possible to depict any combination of deltas without flooding the user with extreme complexity.
- We choose to advance the state of the art with visual compare in the UML diagrams themselves.
And all of these solutions must be generic so we can address all meta-models.
16Other Project Drivers
- TIME!!!
- There is never enough.
- People, Equipment
- Ditto.
- Skills, Experience
- Is there ever enough?
17Our Approach to Model Management Issues
- Terminology - we defined our own.
- Overlap - we recommend certain model partitioning techniques. We also support the MDA hierarchical levels (CIM to PIM to PSM to code) through our transformation technology. Thus, we encourage appropriate segregation of concerns.
- Ambiguity - we use the UML. We are in a very good position to take the most appropriate direction, with access to the leading experts on the UML specification and the open source embodiments of same.
- Evolution - we support an environment where transformation and evolution are a continuous process. Each new version of a model can be transformed again while preserving the unique contents of the transformation target.
18Terminology
- At the least, any model fusion project must create a similar glossary in order for team members to be able to communicate.
- The following terms define the way the RSA team thinks and are important for understanding RSA's fundamental behaviors.
19Identity
- A unique moniker for an element. An element ID must be unique within all models that may interact within an organization.
- Example: Class1 is created with identity set to 1. Another Class1 is later created with identity set to 2. These look like the same class, but they are not, because each has a unique and separate identity.
- Two copies of an element with different names are the same element if they share the same identity.
- Identity is permanent and immutable. An element retains the same identity throughout its entire life.
20Name
- The moniker by which an element is generally known. Names are not required to be unique within a model or even a package.
- Example: Class1 is created with identity set to 1. Another Class1 is later created with identity set to 2. These have the same name and would be considered to be the same class upon casual inspection of the model.
- Names are impermanent and mutable. A rename will not change an element's identity, but the user will see it as a different element.
21Qualified Name
- The full path to an element by name. For example, a class could have the qualified name com.ibm.package1.class1.
- A second class with a different identity could have the same qualified name.
- A class with the same name but in a different package will have a different qualified name, and the two can therefore be distinguished from each other.
- It is possible to use qualified names as the identity. However, a rename or a move (refactor) would then be seen as a delete and an add, not a move.
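The three terms above (identity, name, qualified name) can be illustrated with a small sketch. The `Element` class and its fields are hypothetical and exist only for this example; they are not RSA or EMF types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    identity: str   # permanent and unique; never changes
    name: str       # mutable in practice; need not be unique
    package: str    # path of the containing package

    @property
    def qualified_name(self) -> str:
        return f"{self.package}.{self.name}"


def same_by_identity(x: Element, y: Element) -> bool:
    return x.identity == y.identity


def same_by_qualified_name(x: Element, y: Element) -> bool:
    return x.qualified_name == y.qualified_name


first = Element("1", "Class1", "com.ibm.package1")
second = Element("2", "Class1", "com.ibm.package1")    # looks the same, is not
renamed = Element("1", "Order", "com.ibm.package1")    # first, after a rename

print(same_by_identity(first, second))         # different elements
print(same_by_qualified_name(first, second))   # indistinguishable by name
print(same_by_identity(first, renamed))        # still the same element
print(same_by_qualified_name(first, renamed))  # a name-based match loses it
```

The last comparison shows why using the qualified name as the identity turns a rename into a delete and an add.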
22Delta / Difference
- Any change between two elements that have the same identity in two different versions of the model.
- Must be directional, for example left minus right, new minus old.
- Add Delta - an element was added to the left (newer) model.
- Delete Delta - an element was removed from the left (newer) model.
- Change Delta - a structural feature has a different value.
- Move Delta - an element has a different parent in the left (newer) model.
- Reorder Delta - special case of move delta: an element occupies a different position in a list (e.g. a parameter changes position in a signature).
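The delta kinds listed above can be sketched as a directional comparison (new minus old) of two model versions matched by identity. This is a minimal illustration, assuming a model is a dictionary from identity to a hypothetical `Node` record holding parent, name, and list position. It is not the delta engine described in this talk.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    parent: Optional[str]  # identity of the containing element
    name: str              # stands in for any structural feature
    index: int             # position in the parent's child list


def deltas(new: dict, old: dict) -> list:
    """Return (kind, identity) pairs describing new minus old."""
    out = []
    out += [("add", i) for i in new.keys() - old.keys()]
    out += [("delete", i) for i in old.keys() - new.keys()]
    for i in new.keys() & old.keys():
        n, o = new[i], old[i]
        if n.parent != o.parent:
            out.append(("move", i))
        elif n.index != o.index:
            out.append(("reorder", i))
        if n.name != o.name:
            out.append(("change", i))
    return sorted(out)


old = {
    "p": Node(None, "pkg", 0),
    "c1": Node("p", "Class1", 0),
    "c2": Node("p", "Class2", 1),
    "c3": Node("p", "Class3", 2),
    "x": Node("c2", "x", 0),
    "y": Node("c2", "y", 1),
}
new = {
    "p": Node(None, "pkg", 0),
    "q": Node(None, "other", 1),       # added package
    "c1": Node("q", "Class1", 0),      # moved from p to q
    "c2": Node("p", "Renamed", 1),     # renamed
    "x": Node("c2", "x", 1),           # parameters swapped
    "y": Node("c2", "y", 0),
}                                      # c3 deleted

result = deltas(new, old)
print(result)
```

Note that the sketch compares raw list positions, so deleting one sibling would also shift the positions of the others. A real engine would need to filter out such secondary reorders.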
23Compare and Merge
- Comparing or combining two models by identity.
- Two-way merge compares them directly, while three-way compares each against a common ancestor to generate delta lists and then compares those to generate a conflict list.
- Elements that have the same identity on each side of a comparison are matched to each other for delta generation, regardless of name or content.
- Used to combine generations of the same model.
- Can detect all deltas: add, change, delete, move, reorder.
- Primary tool for parallel development use cases.
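The three-way procedure described above can be sketched as follows: generate one delta list per contributor against the common ancestor, then compare the two lists to find conflicts. This is a simplified illustration, assuming each model is a dictionary from identity to a single value (here the name). The function names are hypothetical.

```python
def diff(version: dict, ancestor: dict) -> dict:
    """Deltas of version minus ancestor, keyed by identity.

    The stored value is the new value; None marks a deletion.
    """
    out = {}
    for ident in version.keys() | ancestor.keys():
        before, after = ancestor.get(ident), version.get(ident)
        if before != after:
            out[ident] = after
    return out


def three_way_merge(ancestor: dict, local: dict, remote: dict):
    local_d, remote_d = diff(local, ancestor), diff(remote, ancestor)
    merged, conflicts = dict(ancestor), []
    for ident in sorted(local_d.keys() | remote_d.keys()):
        in_local, in_remote = ident in local_d, ident in remote_d
        if in_local and in_remote and local_d[ident] != remote_d[ident]:
            conflicts.append(ident)   # both sides changed it differently
            continue                  # leave the ancestor value in place
        value = local_d[ident] if in_local else remote_d[ident]
        if value is None:
            merged.pop(ident, None)
        else:
            merged[ident] = value
    return merged, conflicts


ancestor = {"1": "Class1", "2": "Class2", "3": "Class3"}
local = {"1": "Order", "2": "Class2", "3": "Class3", "4": "Invoice"}
remote = {"1": "Customer", "2": "Class2"}

merged, conflicts = three_way_merge(ancestor, local, remote)
print(merged, conflicts)
```

Non-conflicting deltas from both sides (the addition of 4 and the deletion of 3) are applied automatically. Only element 1, renamed differently on each side, is left for the user to resolve.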
24Structural Compare and Fuse
- Comparing or combining two models by qualified name.
- Two-way comparison only. Looks for structural differences.
- Unnamed elements (e.g. notation) are skipped as they do not (by definition) participate in structure.
- Elements that have the same qualified name on each side of a comparison are matched to each other for delta generation, regardless of identity.
- The matching could be manually overridden to realign the model structure based on user input.
- Used to combine models with different identities. Primary tool for ad-hoc modeling and transformation fusion stage.
- Can detect a subset of deltas: add, change, delete. Move can be detected with manual match override.
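Matching by qualified name, including the manual override, can be sketched as below. This is an illustration only, assuming each model is a dictionary from qualified name to a single feature value. The `overrides` argument (left qualified name mapped to the right qualified name it should be matched with) is a hypothetical stand-in for user input.

```python
def structural_compare(left: dict, right: dict, overrides=None) -> list:
    """Two-way comparison, left minus right, matched by qualified name."""
    overrides = overrides or {}
    matched_right = set()
    out = []
    for qname in sorted(left):
        partner = overrides.get(qname, qname)
        if partner not in right:
            out.append(("add", qname))
            continue
        matched_right.add(partner)
        if partner != qname:
            # Only a manual match can reveal that the path changed.
            out.append(("move", qname))
        if left[qname] != right[partner]:
            out.append(("change", qname))
    for qname in sorted(right.keys() - matched_right):
        out.append(("delete", qname))
    return out


right = {"pkg.Class1": "abstract", "pkg.Class2": "concrete"}
left = {"pkg.Class1": "concrete", "pkg.Renamed": "concrete"}

automatic = structural_compare(left, right)
manual = structural_compare(left, right, {"pkg.Renamed": "pkg.Class2"})
print(automatic)
print(manual)
```

Without the override, the renamed class appears as an unrelated add and delete. With it, the pair is realigned and reported as a single move.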
25Rational Software Development Platform
- Shipped 6.0 in December 2004.
- Full ClearCase and CVS parallel development support.
- Full Eclipse integration for EMF / UML compare and merge.
- Merge by identity only, no fusion by name.
- Open beta for version 7.0 imminent.
- Full model fusion support.
- Our solutions are proprietary and protected by patents.
26Agenda
- Introduction
- An Industry Perspective
- Challenges when Modeling in a Team Environment
- Summary
27There are no perfect solutions
- The following challenges are examples that can be addressed, but in many cases the solution brings along a new problem.
- The key point is to realize that there are no perfect solutions, only compromises that lean towards your preferred behavior.
28Performance
- What happens when you add a 10 MB package into a model and a merge results?
- A very long merge! We've seen 24 hours in the distant past.
- If you generate nested add deltas (this specific package generated many thousands of deltas), then you are asking your delta and conflict engines to do a lot more work.
- Collapsing containers to a single add delta shrinks the delta count, which improves all aspects of performance, but there are issues with this choice as well.
- It is impossible to create a relationship between one of the now-hidden add deltas and another delta somewhere else.
- It is impossible to reference any now-hidden addition, as the whole delta is opaque.
Research opportunity?
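Collapsing nested add deltas into a single container delta can be sketched as follows. This is a minimal illustration, assuming the set of added elements and a hypothetical `parent_of` map are already known. It is not the approach shipped in any product.

```python
def collapse_adds(added: set, parent_of: dict) -> set:
    """Keep only the add deltas whose parent was not itself added."""
    return {e for e in added if parent_of.get(e) not in added}


# Hypothetical containment: a new package with two classes and an attribute.
parent_of = {
    "pkg": "root",
    "Class1": "pkg",
    "attr1": "Class1",
    "Class2": "pkg",
}
added = {"pkg", "Class1", "attr1", "Class2"}

collapsed = collapse_adds(added, parent_of)
print(collapsed)
```

Four deltas shrink to one. The cost is the one named above: `Class1` and `attr1` no longer exist as separate deltas, so nothing else can refer to them.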
29Performance
- In delta generation, every element in one model must be compared against every element in another model. And then conflict analysis compares every delta in both lists.
- These are quadratic algorithms if implemented with brute force.
- Even finding an element by its ID can be expensive; the cost is linear.
- How about indexing, which can help reduce quadratic algorithms to linear and linear algorithms to logarithmic?
- It works, but there is a memory penalty.
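The effect of indexing can be sketched as follows. This is an illustration under simple assumptions: elements are dictionaries with a hypothetical "id" key, and the index is a Python dictionary, which is hash-based and gives constant-time lookups on average. A sorted or tree-based index would give the logarithmic lookups mentioned above.

```python
def find_linear(elements: list, ident: str):
    """Brute-force lookup: scans the whole list in the worst case."""
    for element in elements:
        if element["id"] == ident:
            return element
    return None


def build_index(elements: list) -> dict:
    """One extra dictionary entry per element: this is the memory penalty."""
    return {element["id"]: element for element in elements}


def match_by_identity(left: list, right: list) -> list:
    """Pair up elements in one pass over left, instead of left x right scans."""
    right_index = build_index(right)
    return [
        (element, right_index[element["id"]])
        for element in left
        if element["id"] in right_index
    ]


left = [{"id": "1", "name": "A"}, {"id": "2", "name": "B"}]
right = [{"id": "2", "name": "B2"}, {"id": "3", "name": "C"}]

pairs = match_by_identity(left, right)
print(pairs)
```

Matching with `find_linear` inside a loop over `left` would be the quadratic brute-force version. With the index, the work is roughly proportional to the combined size of the two models.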
30Scalability
- Customers expect to be able to run very sophisticated software on relatively slow and under-resourced machines.
- Yet we have customers with models that top 100 MB on disk, which expands to 300-400 MB in RAM.
- And merging models requires that four models be in memory at some point: ancestor, remote (last one checked into the repository), local (current model attempting to check in), and merged (final result of merge).
- Total could be up to 1.6 GB for the heap - Java cannot even do that yet!
- The answer is model partitioning. Big user education issue!
- http://www-128.ibm.com/developerworks/rational/library/05/802_comp3/
- http://www-128.ibm.com/developerworks/rational/library/5816.html
Research opportunity?
31Corruption
- What happens when a user blindly accepts related conflict resolutions from different contributors? That is, he or she resolves conflict A from local and conflict B from remote?
- If these conflicts are created by the same gesture, then the result is not going to be anything that either user did. That is, no user gesture is represented by the result.
- Worse, it is possible that the tooling is not expecting the mix of two gestures and will fail to render it, or even to open the model.
- One solution is to treat related conflicts atomically. All must resolve to one side or the other.
- The penalty is a loss of flexibility - eventually many deltas are forced in one direction by accepting a single delta.
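Atomic treatment of related conflicts can be sketched as follows. This is a minimal illustration, assuming the related conflicts have already been grouped. The `groups` list and the `accept` function are hypothetical names for this example.

```python
def accept(conflict_id: str, side: str, groups: list, resolutions: dict) -> dict:
    """Record a resolution and force every related conflict to the same side."""
    for group in groups:
        if conflict_id in group:
            for member in group:
                resolutions[member] = side
            return resolutions
    resolutions[conflict_id] = side   # unrelated conflict: resolve it alone
    return resolutions


# Conflicts A and B stem from the same gesture; C is independent.
groups = [{"A", "B"}]
resolutions = {}

accept("A", "local", groups, resolutions)
accept("C", "remote", groups, resolutions)
print(resolutions)
```

Accepting A from local silently decides B as well, so the mixed result (A from local, B from remote) cannot occur. That is also the loss of flexibility: the user never gets to choose B separately.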
32Data Loss
- Two packages are refactored. Package A becomes package B's parent. Package B becomes package A's parent in a parallel contributor.
- There is no obvious conflict here, because each is getting a new parent.
- But if both deltas are accepted, then they form a circle and become detached from the model. Bye-bye data.
- This conflict gets more interesting when there are other packages in between.
- A conflict strategy that looks for these issues comes with a sizeable performance penalty.
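The circular-containment problem can be sketched with a simple check on the merged parent map. This is an illustration only, assuming containment is a dictionary from element to parent (with `None` for the root). The function name is hypothetical.

```python
def is_detached(parent_of: dict, start: str) -> bool:
    """Walk up from start; True if the chain loops instead of reaching a root."""
    seen = set()
    node = start
    while node is not None:
        if node in seen:
            return True
        seen.add(node)
        node = parent_of.get(node)
    return False


ancestor = {"root": None, "A": "root", "B": "root"}
local = {**ancestor, "B": "A"}     # local: A becomes B's parent
remote = {**ancestor, "A": "B"}    # remote: B becomes A's parent

# Accepting both move deltas gives each package the other as its parent.
merged = {**ancestor, "B": local["B"], "A": remote["A"]}

print(is_detached(local, "B"))
print(is_detached(remote, "A"))
print(is_detached(merged, "A"))
```

Each contributor is valid on its own, and only the combination forms a circle. Running a walk like this for every accepted move costs time proportional to the containment depth, which hints at where the performance penalty comes from.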
33Usability
- How do you display thousands of deltas, each potentially relevant to a different diagram, without overwhelming the user?
- How do you display conflicts that render in different ways? For example, a package deletion (rendered in a tree) conflicts with notational changes on a contained diagram (rendered in a diagram)?
- How do you handle a conflict between two multi-line text fields? If you don't allow merges at the field level, you force one user to throw away the field edits.
- All of these require complex GUIs, and that very complexity somewhat diminishes their value.
Research opportunity?
34User Inexperience
- What happens if a modeling team never synchronizes their workspaces to the repository?
- Every check-in results in a merge. Every merge gets larger as the distance between the common ancestor model and every new generation of the model grows.
- Seems obvious, yet we have seen it. A huge performance issue that is easily fixed by synchronizing the workspace with the repository now and again. Another user education issue.
35Parallel Projects
- A set of 100 models that are edited in parallel can result in many merges per day.
- So what happens when you have parallel sets of models for individual projects?
- ClearCase allows parallel streams to cross-merge using the diffmerge tool.
- And then what happens when each is independently upgraded and the conversion creates enough new identities that the model differences rise from a dozen to many thousands, all of which are false?
- Not a problem if the requirement for identity is removed.
Research opportunity?
36Meta-model Mismatch
- By definition, meta-models must be identical to be compared.
- Yet, the existence of custom profiles guarantees that a model will periodically have its meta-model changed. And that model version can participate in a merge. In fact, it is possible in a volatile environment to have three different meta-models as inputs to a merge.
- We always aborted these merges, forcing users to go through a lot of pain to upgrade the contributors in a temporary space and merge them outside the SCM system.
- We now have a partial solution, but it cannot handle all use cases. For example, it is very difficult to handle the situation where profiles are applied at different levels in the hierarchy.
Probably research opportunity.
37DEMO
38Agenda
- Introduction
- An Industry Perspective
- Challenges when Modeling in a Team Environment
- Summary
39Model Merging
- RSA has state-of-the-art model merge capability (at least, for a shipping product).
- Fully integrated with Eclipse, CVS and ClearCase.
- Supports visual merge of notation.
- Can be extended to support domain-specific meta-models and generic EMF models.
- Explicit GMF notation support - we have a sample logic diagram client.
- Version 7 has an API that can be used to support merging of any EMF meta-model.
40Best Practices for Model and Source Management
- Use a Source Code Management (SCM) system.
- Version control everything, including 3rd party binaries that are a part of your build and packaging.
- Use an SCM system that supports automated merge detection.
- Much higher productivity when constant need for coordination is replaced by automated detection.
- Individuals should update to the latest source very frequently.
- Prevents unnecessary merge sessions.
- Practice strong ownership - separate responsibilities to avoid unnecessary conflicts.
- Important for code and for models.