Software Merging - PowerPoint PPT Presentation

1
Software Merging
  • An Overview
  • Dr. Tom Mens
  • Programming Technology Lab
  • Vrije Universiteit Brussel
  • Course OOSE.RC
  • EMOOSE 1999-2000

2
Problem Statement
  • Collaborative Software Development
  • many different software developers working
    simultaneously on the same software
  • parallel changes are made to the same code
  • need to combine these changes
  • Software merging
  • automated tool support for combining parallel
    changes
  • detect inconsistencies or unexpected interactions
    between parallel changes
  • provide support for resolving these
    inconsistencies

3
Context
  • Merge tools are usually part of a configuration
    management or version management system
  • Definition
  • Software configuration management (SCM) is the
    discipline of managing and controlling change in
    the evolution of software systems
  • IEEE Standard 1042, 1987
  • Examples
  • Revision Control System (RCS) [Tichy 1985]
  • Concurrent Versions System (CVS) [Berliner 1990]
  • Perforce (www.perforce.com)
  • ClearCase [Leblang 1994]
  • Adele [Estublier et al. 1994]

4
Version Terminology

5
SCM
  • Traditionally, SCM was seen purely as a
    management discipline
  • [Bersoff et al. 1980]
  • Nowadays, it is also treated as a software
    development support discipline
  • provides automated help to reduce complexity of
    making changes to large-scale software systems
  • SCM is necessary in all phases of software
    life-cycle

6
SCM Concepts
  • Configuration item
  • a self-contained software artefact whose
    evolution needs to be tracked and controlled
  • some items can be composite, consisting of other
    items
  • Version
  • identifies the state of a configuration item at a
    well-defined point in time
  • each state has a unique version number
  • variants are versions that are intended to
    coexist
  • e.g., Mac/Windows/Unix variant of software
    application
  • e.g., Light/Standard/Professional edition of
    software application
  • a promotion is a version available to other
    developers
  • promotions are stored in a workspace (or dynamic
    library)
  • a release is a version available to clients or
    users
  • releases are stored in a repository (or static
    library)

7
SCM Concepts ctd.
  • Configuration
  • a version of a composite configuration item,
    containing a consistent set of other
    configuration item versions
  • Change request
  • formal request for modifying a configuration item
  • Baseline
  • Formally reviewed and agreed on configuration
    item that can only be changed through a change
    request
  • Branch
  • concurrent development path requiring independent
    SCM
  • different branches can be reconciled by merging
    their versions

See [Bruegge & Dutoit 2000].

8
Exercise 1
  • SCM systems such as RCS and CVS use file names
    and their paths to identify configuration items.
  • Explain why this feature prevents the
    configuration management of composite
    configuration items, even in the presence of
    labels.

9
SCM Activities
  • Configuration item identification
  • each item has unique version number
  • Status accounting
  • record status of individual components, work
    products and change requests
  • Build management
  • enable automatic rebuilding of system when new
    versions of components are created
  • minimise amount of recompilation
  • Process management
  • implement change policy
  • e.g. only syntactically correct code can be part
    of a version; builds should be made every week;
    relevant developers should be notified about
    new versions that have been created
  • Promotion management

10
SCM Activities ctd.
  • Release management
  • creation of releases is decided at management
    level, based on marketing and quality control
    advice
  • creation of releases includes
  • updating user manual (documentation)
  • ensuring there are no inconsistencies
  • validate completeness and quality
  • Change management
  • ensure consistency with project goals during
    changes
  • different steps
  • request a change
  • assess request against project goals
  • may include cost analysis and impact analysis
  • accept or reject request
  • plan accepted change, prioritise, and assign to
    developer
  • audit implemented change (quality control)

11
SCM Activities ctd.
  • Branch management
  • merging is needed to coordinate overlapping or
    interacting parallel changes
  • detect and resolve conflicts between overlapping
    changes
  • heuristics to minimise merge conflicts
  • anticipate where overlapping changes can occur
  • merge frequently to identify overlaps early
  • communicate likely conflicts to relevant
    developers
  • minimise changes in the main branch, and make
    important changes in separate development
    branches
  • minimise number of branches
  • Variant management
  • variants are needed when
  • software operates on different platforms
    (different OS or hardware)
  • software is delivered in variants with different
    levels of functionality
  • variants can be dealt with by
  • Different teams for each variant --> reduced
    complexity / increased redundancy
  • Single project with variant-specific code

12
Roles in SCM
  • Configuration manager
  • identifies configuration items
  • defines procedures for creating promotions and
    releases
  • Change control board member
  • approves or rejects change requests
  • assesses the changes and plans accepted changes
  • Developer
  • implements change requests
  • creates promotions
  • resolves merge conflicts
  • Auditor
  • ensures quality of changes
  • selects and evaluates promotions for a release
  • ensures consistency and completeness of a release

13
Storing subsequent versions
  • Alternatives for storing subsequent versions of a
    software artefact
  • storing all versions integrally
  • using deltas, i.e., store differences only
  • forward deltas: record the original version in full
    and apply deltas to produce newer versions
  • e.g. SCCS [Rochkind 1975]
  • backward deltas: record the latest version in full
    and apply deltas to produce older versions
  • e.g. RCS [Tichy 1985]
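
The two delta schemes can be sketched in a few lines of Python. This is only an illustration, not how SCCS or RCS are implemented: the helper names make_delta, apply_delta and checkout, and the three sample versions, are assumptions made for this example. A delta is modelled as a list of line-range replacements computed with the standard difflib module.

```python
from difflib import SequenceMatcher

def make_delta(old, new):
    """Return the edit operations that turn list `old` into list `new`."""
    ops = []
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Replace old[i1:i2] by the stored lines new[j1:j2].
            ops.append((i1, i2, new[j1:j2]))
    return ops

def apply_delta(old, delta):
    """Apply a delta produced by make_delta to `old`."""
    result = list(old)
    # Apply from the end so that earlier indices stay valid.
    for i1, i2, lines in reversed(delta):
        result[i1:i2] = lines
    return result

def checkout(store, steps):
    """Rebuild a version by applying the first `steps` deltas to the full copy."""
    version, deltas = store
    for delta in deltas[:steps]:
        version = apply_delta(version, delta)
    return version

v1 = ["a", "b", "c"]
v2 = ["a", "B", "c"]
v3 = ["a", "B", "c", "d"]

# Forward deltas: keep v1 whole, plus the deltas v1->v2 and v2->v3.
forward_store = (v1, [make_delta(v1, v2), make_delta(v2, v3)])
# Backward deltas: keep v3 whole, plus the deltas v3->v2 and v2->v1.
backward_store = (v3, [make_delta(v3, v2), make_delta(v2, v1)])
```

In this sketch, checkout(forward_store, 0) returns the oldest version without applying any delta, while checkout(backward_store, 0) returns the newest one.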

14
Exercise 2
  • Most version control systems use backward deltas
    rather than forward deltas to store subsequent
    revisions of the same version.
  • Explain why this is the case.

15
Kinds of Merging
  1. 2-way vs 3-way merging
  2. reuse versus evolution
  3. textual, syntactic or semantic merging
  4. state-based vs change-based

16
a) 2-way vs 3-way merging

17
b) reuse vs evolution
  • merging is necessary
  • when an object-oriented framework is being
    customised by a framework user, while it is also
    evolved by the framework developer
  • Cf. reuse contracts
  • When two parallel changes to the same software
    artifact need to be combined

18
c) textual, syntactic, semantic
  • textual merging
  • Considers software artefacts as pure text files
    (or, alternatively, binary files)
  • syntactic merging
  • Use more structured information of software
    artefacts (e.g. trees or graphs)
  • semantic merging
  • Use behavioural information about software
    artefacts

19
text-based merging
  • Different levels of granularity
  • Line-based merging takes lines as primitive
    building blocks
  • E.g. Unix diff
  • Using single characters as building blocks is too
    inefficient for practical use
  • More efficient (two-way) approaches for merging
    binary files
  • E.g. bdiff [Tichy 1984] and vdelta
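
To make the granularity trade-off concrete, here is a minimal Python sketch (not taken from the slides) that compares the same edit at line level and at character level with the standard difflib module. The helper name changed_units and the sample lines are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

def changed_units(old, new):
    """List the non-equal regions between two sequences (lines or characters)."""
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    return [(tag, old[i1:i2], new[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]

old_lines = ["total = price * qty", "print(total)"]
new_lines = ["total = price * quantity", "print(total)"]

# Line granularity: the whole first line counts as changed.
line_changes = changed_units(old_lines, new_lines)

# Character granularity: only the differing characters inside that line are reported.
char_changes = changed_units(old_lines[0], new_lines[0])
```

The line-level comparison works on two units per version, whereas the character-level comparison has to align every character, which hints at why finer granularity costs more on large files.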

20
Exercise 3
  • CVS uses a simple line-based merge rule to
    identify merge conflicts: there is a conflict if
    the same line was changed in both revisions. If
    no such line exists, no conflict is generated and
    the merge is performed automatically.
  • a) Explain why this approach fails to detect
    certain types of conflicts. Provide an
    illustrative example of both a syntactic and a
    semantic conflict that goes undetected.
  • b) Conversely, try to find an example where the
    approach reports a conflict although there isn't
    one.

21
Exercise 3 Solution a.1
Original:
    function F(a,b)
Revision 1 (add function call):
    function F(a,b)
    x = F(1,2)
Revision 2 (add third argument):
    function F(a,b,c)
Merged result:
    function F(a,b,c)
    x = F(1,2)
Syntactic conflict! Function called with wrong
number of arguments.
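
The line-based rule from Exercise 3 can be sketched in Python and run on this example. This is a deliberately simplified illustration and not the CVS implementation: it assumes all three versions have the same number of lines (an empty string stands in for the line that revision 1 fills), and the function name merge_lines is hypothetical. Identical changes on both sides are treated as non-conflicting, which is a design choice of this sketch.

```python
def merge_lines(base, rev1, rev2):
    """Line-based three-way merge over versions with equal line counts."""
    merged, conflicts = [], []
    for i, (b, l, r) in enumerate(zip(base, rev1, rev2)):
        if l == b:
            merged.append(r)       # only rev2 (or nobody) changed this line
        elif r == b or l == r:
            merged.append(l)       # only rev1 changed it, or both agree
        else:
            merged.append(b)       # both changed it differently: keep base, report
            conflicts.append(i)
    return merged, conflicts

base = ["function F(a,b)", ""]
rev1 = ["function F(a,b)", "x = F(1,2)"]   # add function call
rev2 = ["function F(a,b,c)", ""]           # add third argument

merged, conflicts = merge_lines(base, rev1, rev2)
```

Here conflicts stays empty because the two revisions touch different lines, yet merged contains a call to F with two arguments next to a three-parameter definition.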
22
Exercise 3 Solution a.2
Original:
    circumference(r) = 2*π*r
    area(r) = π*r*r
Revision 1:
    circumference(r) = 2*area(r)/r
    area(r) = π*r*r
Revision 2:
    circumference(r) = 2*π*r
    area(r) = circumference(r)*r/2
Merged result:
    circumference(r) = 2*area(r)/r
    area(r) = circumference(r)*r/2
Semantic conflict! Unexpected infinite recursion
after merge.
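
The same situation can be reproduced in Python. The sketch below is an illustration only; the suffixes _1, _2 and _m are naming assumptions used to keep the two revisions and the merged result side by side in one file.

```python
import math

# Revision 1: circumference is redefined in terms of area.
def circumference_1(r):
    return 2 * area_1(r) / r

def area_1(r):
    return math.pi * r * r

# Revision 2: area is redefined in terms of circumference.
def circumference_2(r):
    return 2 * math.pi * r

def area_2(r):
    return circumference_2(r) * r / 2

# Merged result: circumference from revision 1, area from revision 2.
# Each function now delegates to the other, so neither ever terminates.
def circumference_m(r):
    return 2 * area_m(r) / r

def area_m(r):
    return circumference_m(r) * r / 2

def merged_call_fails(r=1.0):
    """Return True if evaluating the merged program exhausts the call stack."""
    try:
        area_m(r)
    except RecursionError:
        return True
    return False
```

Each revision on its own computes the expected values, and a line-based merge sees two changes to different lines, so the problem only shows up when the merged code is executed.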
23
syntactic merging
  • Based on parse trees
  • essentially models is-part-of relation between
    software entities
  • Examples
  • [Westfechtel 1991]
  • domain-independent approach
  • [Asklund 1994]
  • Cdiff [Grass 1992]
  • for parse trees of C programs

24
syntactic merging
  • Based on graphs
  • More flexible than trees
  • also models relations like invokes, calls,
    uses, accesses, defines, ...
  • Examples
  • [Rho et al. 1998]
  • Reuse contracts
  • [Steyaert et al. 1996]: essentially method calls
  • [Mens 2000]: domain-independent formalism

25
semantic merging
  • Finding all possible semantic conflicts is an
    undecidable problem in general
  • Conservative approaches provide a safe
    approximation
  • No false negatives: all semantic conflicts are
    detected
  • E.g. [Horwitz et al. 1989], [Binkley et al. 1995]
  • Lightweight approaches only consider part of the
    semantics
  • Can give rise to false positives and false
    negatives
  • Possible approaches
  • using predicates: pre/postconditions, invariants,
    obligations, exceptions
  • [Hoare 1969], [Perry 1987]
  • using algebraic specifications
  • Larch [Guttag et al. 1985]
  • ...

26
d) state-based vs change-based
  • state-based merging
  • only uses information in original version and its
    revisions
  • change-based merging
  • explicitly documents the changes that have been
    made to the versions
  • extensional change-based versioning annotates the
    changes inside the version
  • e.g. [Asklund 1994]
  • intensional change-based versioning describes the
    changes separately from the versions, in terms of
    the operations or transformations that have been
    used.
  • e.g. EPOS [Gulla et al. 1991]

27
Exercise 4
  • Explain why intensional change-based merging is
    more general or more expressive than state-based
    merging.
  • Also give an example of a conflict that can be
    detected with change-based merging, but not with
    state-based merging.

28
Exercise 4 Solution
  • 1. changes can be separated from the versions to
    which they are applied.
  • a) In this way, the same changes can be applied
    more than once, for example to parallel versions
    of the software under development.
  • b) It also becomes very straightforward to
    implement a multiple undo/redo mechanism. For
    undo, perform the last applied operations in the
    opposite direction. For redo, simply reapply the
    operations.

29
Exercise 4 Solution ctd.
  • 2. improves conflict detection and conflict
    resolution
  • efficiently detect more conflicts (conflict table)
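
A conflict table can be pictured as a lookup from pairs of operation kinds to a rule that decides whether two concrete operations interfere. The Python sketch below is a hypothetical illustration: it borrows the AddEdge and Rename operations used in the Exercise 5 solution, and the three rules are assumptions made for this example, not the tables published in [Feather 1989] or [Steyaert et al. 1996].

```python
from collections import namedtuple

AddEdge = namedtuple("AddEdge", "name src dst")
Rename = namedtuple("Rename", "old new")

# One entry per pair of operation kinds: a predicate saying when they clash.
CONFLICT_TABLE = {
    # Two different edges added under the same name.
    (AddEdge, AddEdge): lambda p, q: p.name == q.name and p != q,
    # An edge attached to a node that the other developer renames.
    (AddEdge, Rename): lambda p, q: q.old in (p.src, p.dst),
    # Same node renamed differently, or different nodes renamed to one name.
    (Rename, Rename): lambda p, q: (p.old == q.old) != (p.new == q.new),
}

def conflicts(p, q):
    """Look up the rule for this pair of operations, in either order."""
    rule = CONFLICT_TABLE.get((type(p), type(q)))
    if rule is not None:
        return rule(p, q)
    rule = CONFLICT_TABLE.get((type(q), type(p)))
    return rule(q, p) if rule is not None else False

def detect(changes1, changes2):
    """Compare every operation of one developer with every operation of the other."""
    return [(p, q) for p in changes1 for q in changes2 if conflicts(p, q)]
```

Because the check only inspects the recorded operations, it never has to compare the resulting versions themselves; for example, detect([AddEdge("e", "a", "b")], [Rename("b", "c")]) flags the pair as order-dependent.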

30
Two Definitions of Merging
  • Two parallel modifications M1 and M2 of the same
    software artifact can be merged if
  • 1. They can be serialised in any order (M1 then M2,
    and M2 then M1), and both serialisations lead to
    the same result
  • 2. They can be serialised in at least one order
    (M1 then M2, or M2 then M1).

31
Exercise 5
  • a) Give an example of a situation that can be
    merged by making use of definition 2, but not by
    means of definition 1.
  • b) Give an example where the merge according to
    definition 2 leads to a counter-intuitive result.

32
Exercise 5 Solution (a)
  • Can be merged according to def. 2. First apply
    AddEdge(e,a,b), then perform Rename(b,c).
  • Cannot be merged according to def. 1. If we first
    apply Rename(b,c), we cannot apply AddEdge(e,a,b)
    anymore.
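
Both definitions can be checked mechanically on this example. The Python sketch below is a minimal illustration under stated assumptions: a graph is modelled as a pair (set of node names, dict of edges), and the preconditions of AddEdge and Rename (endpoints must exist, the new name must be unused) are assumptions of this sketch rather than definitions from the slides.

```python
class NotApplicable(Exception):
    """Raised when an operation's precondition does not hold."""

def add_edge(e, a, b):
    def op(graph):
        nodes, edges = graph
        if a not in nodes or b not in nodes or e in edges:
            raise NotApplicable(f"AddEdge({e},{a},{b})")
        return nodes, {**edges, e: (a, b)}
    return op

def rename(old, new):
    def op(graph):
        nodes, edges = graph
        if old not in nodes or new in nodes:
            raise NotApplicable(f"Rename({old},{new})")
        def swap(n):
            return new if n == old else n
        return (frozenset(swap(n) for n in nodes),
                {e: (swap(s), swap(t)) for e, (s, t) in edges.items()})
    return op

def serialise(graph, first, second):
    """Apply two operations in sequence; None if either is not applicable."""
    try:
        return second(first(graph))
    except NotApplicable:
        return None

def mergeable_def1(graph, m1, m2):
    """Definition 1: both orders work and give the same result."""
    r12, r21 = serialise(graph, m1, m2), serialise(graph, m2, m1)
    return r12 is not None and r12 == r21

def mergeable_def2(graph, m1, m2):
    """Definition 2: at least one order works."""
    return (serialise(graph, m1, m2) is not None
            or serialise(graph, m2, m1) is not None)

base = (frozenset({"a", "b"}), {})
m1, m2 = add_edge("e", "a", "b"), rename("b", "c")
```

With these inputs, mergeable_def2(base, m1, m2) holds while mergeable_def1(base, m1, m2) does not, because renaming b first removes the endpoint that AddEdge needs.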

33
Exercise 5 Solution (b)
  • ???

34
Other Merge Issues
  1. Domain-independence
  2. Scalability
  3. Degree of formality
  4. Level of granularity
  5. Resolving conflicts
  6. Minimising conflicts

35
1) domain-independence
  • Most approaches are restricted to a particular
    programming language
  • Cdiff restricted to C
  • Rational Rose Visual Differencing restricted to
    UML
  • Domain-independent approaches
  • [Westfechtel 1991], using parse trees
  • [Mens 1999], using graphs

36
2) Scalability
  • Text-based merge tools are not scalable
  • changes to multiple lines simultaneously lead to
    conflicts for each line involved
  • For operation-based merging
  • Define composite transformations in terms of more
    primitive ones
  • Gives higher-level view of the evolution
  • Ignore some basic conflicts when they appear as
    part of a composite transformation

37
3) Degree of formality
  • Ad-hoc
  • E.g. Line-based merge tools
  • Lightweight approach
  • Using conflict tables
  • [Feather 1989], [Steyaert et al. 1996]
  • Using graph rewriting
  • [Mens 1999]: confluency, pushout property,
    parallel and sequential independence
  • Completely formal techniques
  • [Berzins 1994]
  • Denotational semantics and Brouwerian algebras
  • [Horwitz et al. 1989], [Binkley et al. 1995]
  • program dependence graphs and program slicing

38
4) Level of granularity
  • text-based merge tools
  • line-based
  • block-based
  • character-based

39
5) Resolving Conflicts
  • Use default conflict resolution strategies
  • Cf. [Asklund 1994]

40
6) Minimising Conflicts
  • Small changes can have large impact
  • A simple change can give rise to conflicts
    throughout the entire code
  • Exercise 6: Try to find a number of different
    ways to reduce the number of detected conflicts
    to a manageable number.

41
Exercise 6 Solution
  • Using information hiding techniques to localise
    effect of changes
  • Ignore temporary inconsistencies that are part of
    a large evolution step
  • Use fine-grained revision control, where changes
    are as small as possible
  • Keep parallel developers aware of each other's
    changes
  • Only perform local merges
  • Intraprocedural merging [Jackson & Ladd 1994]

42
Useful Algorithms
  • Redundancy removal
  • reduces number of detected conflicts
  • reduces space, increases speed, increases
    understandability
  • Normalisation
  • Canonical form
  • Reconstruction
  • Reconstruct transformation given base version and
    revised version only

43
Classify existing approaches
Considered approach            ...   ...   Reuse contracts [Mens 2000]
2-way / 3-way                              3-way
text / syntactic / semantic                Syntactic (uses typed graphs); light semantics
state / change                             change-based (uses RC operations)
Domain (in)dependence                      independent of considered domain
44
Assignment
  • Classify a number of approaches according to the
    given criteria and answer the following questions
  • Is the approach 2-way or 3-way?
  • Is the approach textual, syntactic or semantic?
  • Be as precise as possible: Is it line-based?
    Which kind of syntactic or semantic software
    artefacts does it address? Which kind of
    semantics? (Conservative/light)
  • Is the approach state-based or change-based?
  • If change-based, is it extensional or
    intensional?
  • If intensional, which operations or
    transformations are available? Is it scalable to
    composite transformations?
  • Is the approach domain-independent or
    domain-specific?
  • If domain-specific, can the technique be
    generalised to more domain-independent artefacts?
    Why (not)?
  • Does the approach have a formal foundation?
  • Which? What are the benefits of this?
  • How are conflicts detected?
  • Are there any typical or special features of the
    approach?

45
Example reuse contract approach
  • 3-way
  • Syntactic approach: specialisation interfaces in
    [Steyaert et al. 1996], collaboration diagrams in
    [Lucas 1997], graphs in [Mens 1999]
  • Light semantics ...
  • Change-based merging
  • Primitive transformations are Extension,
    Refinement, Cancellation, Coarsening
  • Composite transformations can be defined
  • [Steyaert et al. 1996] and [Lucas 1997] are
    domain-specific
  • class inheritance hierarchies and collaborating
    classes, respectively
  • Make use of a conflict table
  • [Mens 1999] presents domain-independent formalism
    based on graph rewriting
  • Gives a formal characterisation of merge
    conflicts
  • Special feature: originally designed for reuse
    versus evolution conflicts

46
Assignment ctd.
  • Discuss 3 approaches from the following list
  • [Feather 1989]
  • Unix diff and diff3 utilities, Emacs emerge tool,
    bdiff, vdelta, Sun's filemerge tool [Adams et al.
    1986]
  • [Westfechtel 1991], [Asklund 1994], Cdiff [Grass
    1992]
  • Rational Rose Visual Differencing tool
  • SCCS [Rochkind 1975], DSEE [Leblang et al. 1984],
    RCS [Tichy 1985]
  • Commercial configuration management tools:
    ClearCase [Leblang et al. 1988], [Leblang 1994],
    Adele [Estublier et al. 1994]
  • [Horwitz et al. 1989], [Binkley et al. 1995],
    Semantic Diff [Jackson et al. 1994], [Berzins
    1994]
  • [Lie et al. 1989], [Lippe et al. 1992]

47
References
  • [Adams et al. 1986] E. Adams, W. Gramlich, S.
    Muchnick, S. Tirfing. SunPro: engineering a
    practical program development environment. Proc.
    Int. Workshop on Advanced Programming
    Environments. LNCS 244: 86-96, Springer-Verlag,
    1986
  • [Asklund 1994] U. Asklund. Identifying conflicts
    during structural merge. Proc. Nordic Workshop on
    Programming Environment Research 94, pp.
    231-242, Lund University, 1994
  • [Berliner 1990] B. Berliner. CVS II:
    parallelizing software development. Proc. USENIX
    Conf., pp. 22-26, 1990
  • [Bersoff et al. 1980] E. H. Bersoff, V. D.
    Henderson, S. G. Siegel. Software configuration
    management: an investment in product integrity.
    Prentice Hall, 1980
  • [Berzins 1994] V. Berzins. Software merge:
    semantics of combining changes to programs. ACM
    Transactions on Programming Languages and
    Systems, 16(6): 1875-1903, ACM Press, 1994
  • [Binkley et al. 1995] D. Binkley, S. Horwitz, T.
    Reps. Program integration for languages with
    procedure calls. ACM Transactions on Software
    Engineering and Methodology, 4(1): 3-35, ACM
    Press, 1995
  • [Estublier et al. 1994] J. Estublier, R.
    Casallas. The Adele configuration manager. In
    Configuration management trends in software.
    John Wiley & Sons, 1994
  • [Feather 1989] M. Feather. Detecting interference
    when merging specification evolutions. ???, pp.
    169-176, ACM Press, 1989
  • [Grass 1992] J. E. Grass. Cdiff: A syntax
    directed Diff for C programs. Proc. USENIX C
    Conf., pp. 181-193, 1992

48
References ctd.
  • [Gulla et al. 1991] B. Gulla, E.-A. Karlsson, D.
    Yeh. Change-oriented version descriptions in
    EPOS. Software Engineering Journal, 6(6): 378-386,
    1991
  • [Horwitz et al. 1989] S. Horwitz, J. Prins, T.
    Reps. Integrating non-interfering versions of
    programs. ACM Transactions on Programming
    Languages and Systems, 11(3): 345-387, ACM Press,
    1989
  • [Jackson et al. 1994] D. Jackson, D. A. Ladd.
    Semantic Diff: A tool for summarizing the effects
    of modifications. Int. Conf. on Software
    Maintenance. IEEE Press, 1994
  • [Leblang et al. 1984] D. Leblang, R. Chase.
    Computer-aided software engineering in a
    distributed workstation environment.
    SIGPLAN/SIGSOFT Software Engineering Symposium on
    Practical Software Development Environments. ACM
    SIGPLAN Notices, pp. 104-112, ACM Press, 1984
  • [Leblang et al. 1988] D. Leblang, R. Chase, H.
    Spilke. Increasing productivity with a parallel
    configuration manager. Proc. Int. Workshop on
    Software Version and Configuration Control, pp.
    21-38, Teubner-Verlag, 1988
  • [Leblang 1994] D. Leblang. The CM challenge:
    configuration management that works. In
    Configuration management trends in software.
    John Wiley & Sons, 1994
  • [Lie et al. 1989] A. Lie, R. Conradi, T.
    Didriksen, E.-A. Karlsson. Change-oriented
    versioning in a software engineering database.
    Proc. 2nd Int. Workshop on Software Configuration
    Management, ACM SIGSOFT Software Engineering
    Notes, 17: 56-65, ACM Press, October 1989

49
References ctd.
  • [Lippe et al. 1992] E. Lippe, N. van Oosterom.
    Operation-based merging. Proc. 5th ACM SIGSOFT
    Symposium on Software Development Environments.
    ACM SIGSOFT Software Engineering Notes, 17(5):
    78-87, ACM Press, 1992
  • [Mens 1999] T. Mens. A Formal Foundation for
    Object-Oriented Software Evolution. PhD
    Dissertation, Vrije Universiteit Brussel,
    Belgium, September 1999
  • [Mens 2000] T. Mens. Conditional graph rewriting
    as a domain-independent formalism for software
    evolution. Proc. Int. AGTIVE 99 Conference,
    LNCS, Springer-Verlag, 2000
  • [Rho et al. 1998] J. Rho, C. Wu. An efficient
    version model of software diagrams. Proc. 5th
    Asia-Pacific Conf. Software Engineering, pp.
    236-243, 1998
  • [Rochkind 1975] M. Rochkind. The source code
    control system. IEEE Transactions on Software
    Engineering, 1(4): 364-370, IEEE Press, December
    1975
  • [Tichy 1985] W. Tichy. RCS: a system for version
    control. Software Practice and Experience, 15(7):
    637-654, 1985
  • [Westfechtel 1991] B. Westfechtel.
    Structure-oriented merging of revisions of
    software documents. Proc. 3rd Int. Workshop on
    Software Configuration Management, pp. 68-79, ACM
    Press, 1991