Title: Refactoring Erlang Programs


1
Refactoring Erlang Programs
  • Huiqing Li
  • Simon Thompson
  • University of Kent

2
Overview
  • What is refactoring?
  • Examples
  • The process of refactoring
  • Tool building and infrastructure
  • What is in Wrangler: demo
  • Latest advances: data, processes, Erlide.

3
Introducing refactoring
4
Soft-ware
  • There's no single correct design
  • different options for different situations.
  • Maintain flexibility as the system evolves.

5
Refactoring
  • Refactoring means changing the design or
    structure of a program without changing its
    behaviour.

Refactor
Modify
6
Examples
7
Generalisation
Generalisation and renaming
%% Original
-module(test).
-export([f/1]).
add_one([H|T]) ->
    [H+1 | add_one(T)];
add_one([]) -> [].
f(X) -> add_one(X).

%% After generalisation: the increment becomes parameter N
-module(test).
-export([f/1]).
add_one(N, [H|T]) ->
    [H+N | add_one(N,T)];
add_one(N, []) -> [].
f(X) -> add_one(1, X).

%% After renaming add_one to add_int
-module(test).
-export([f/1]).
add_int(N, [H|T]) ->
    [H+N | add_int(N,T)];
add_int(N, []) -> [].
f(X) -> add_int(1, X).
8
Generalisation
%% Original
-export([printList/1]).
printList([H|T]) ->
    io:format("~p\n", [H]),
    printList(T);
printList([]) -> true.

printList([1,2,3])

%% After generalisation over the printing action
-export([printList/2]).
printList(F, [H|T]) ->
    F(H),
    printList(F, T);
printList(F, []) -> true.

printList(
    fun(H) ->
        io:format("~p\n", [H])
    end,
    [1,2,3]).

9
Generalisation
%% Original
-export([printList/1]).
printList([H|T]) ->
    io:format("~p\n", [H]),
    printList(T);
printList([]) -> true.

%% After generalisation, keeping the printList/1 interface
-export([printList/1]).
printList(F, [H|T]) ->
    F(H),
    printList(F, T);
printList(F, []) -> true.
printList(L) ->
    printList(
        fun(H) ->
            io:format("~p\n", [H]) end,
        L).

10
Asynchronous to synchronous
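%% Asynchronous: client, then server clause
Pid ! {self(), msg}

{Parent, msg} ->
    body

%% Synchronous: client waits for an acknowledgement
Pid ! {self(), msg},
receive {Pid, ok} -> ok end

{Parent, msg} ->
    Parent ! {self(), ok},
    body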
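A minimal runnable sketch of the synchronous version, under stated assumptions: the module name sync_demo, the functions start/0, call/1 and loop/0, and the one-second timeout are all illustrative and do not come from the slides.

```erlang
-module(sync_demo).
-export([start/0, call/1, loop/0]).

%% Spawn the server process (hypothetical helper).
start() ->
    spawn(?MODULE, loop, []).

%% Synchronous client: send the request, then block until the
%% server acknowledges it.
call(Pid) ->
    Pid ! {self(), msg},
    receive
        {Pid, ok} -> ok
    after 1000 ->
        timeout   % assumption: give up after one second
    end.

%% Server: acknowledge first, then carry on with the body.
loop() ->
    receive
        {Parent, msg} ->
            Parent ! {self(), ok},
            %% ... the body would run here ...
            loop()
    end.
```

Matching on {Pid, ok} rather than on a bare ok means the client only accepts an acknowledgement from the server it actually called.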

11
Refactoring
12
Refactoring = Transformation + Condition
  • Transformation
  • Ensure change at all those points needed.
  • Ensure change at only those points needed.
  • Condition
  • Is the refactoring applicable?
  • Will it preserve the semantics of the module? the
    program?

13
Transformations
full
stop
one
14
Condition > Transformation
  • Renaming an identifier
  • "The existing binding structure should not be
    affected. No binding for the new name may
    intervene between the binding of the old name and
    any of its uses, since the renamed identifier
    would be captured by the renaming. Conversely,
    the binding to be renamed must not intervene
    between bindings and uses of the new name."
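A small sketch of what the renaming condition guards against in Erlang. The module rename_demo and both function names are hypothetical; the second function shows the result a naive tool would produce if it renamed Y to X without checking the condition.

```erlang
-module(rename_demo).
-export([ok_version/1, captured_version/1]).

%% Original: X and Y are distinct variables.
ok_version(X) ->
    Y = X + 1,
    {X, Y}.

%% Naive rename of Y to X. X is already bound, so "X = X + 1" is no
%% longer a binding but a match against the old value, and it fails
%% with a badmatch error for any integer argument.
captured_version(X) ->
    X = X + 1,
    {X, X}.
```

The module still compiles after the bad rename, which is why the condition has to be checked by the tool rather than left to the compiler.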

15
Which refactoring exactly?
  • Generalise f by making 23 a parameter of f
      f(X) ->
          Con = 23,
          g(X) + Con + 23.
  • This one occurrence?
  • All occurrences (in the body)?
  • Some of the occurrences to be selected.
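The choices above lead to different programs. The sketch below assumes the operators lost from the slide are additions and supplies a hypothetical g/1; the module name gen_choice and the names f_one/2 and f_all/2 are illustrative only.

```erlang
-module(gen_choice).
-export([f/1, f_one/2, f_all/2]).

%% Hypothetical helper; the slide does not define g/1.
g(X) -> X * 2.

%% Original function (operators assumed to be +).
f(X) ->
    Con = 23,
    g(X) + Con + 23.

%% Generalising only the literal in the final expression.
f_one(N, X) ->
    Con = 23,
    g(X) + Con + N.

%% Generalising every occurrence of 23 in the body.
f_all(N, X) ->
    Con = N,
    g(X) + Con + N.
```

Both variants agree with f/1 when called with 23, but they differ for any other argument, so the tool has to ask the user which occurrences are meant.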

16
Compensate or crash?
%% Compensate: keep the old name as a wrapper
-export([oldFun/1,
         newFun/1]).
oldFun(L) ->
    newFun(L).
newFun(L) ->
    ... .

or

%% Crash: remaining calls to oldFun/1 fail
-export([newFun/1]).
newFun(L) ->
    ... .
17
Refactoring tools
18
Tool support
  • Bureaucratic and diffuse.
  • Tedious and error prone.
  • Semantics: scopes, types, modules, ...
  • Undo/redo
  • Enhanced creativity

19
Semantic analysis
  • Binding structure
  • Dynamic atom creation, multiple binding
    occurrences, pattern semantics etc.
  • Module structure and projects
  • No explicit projects for Erlang; cf. Erlide /
    Emacs.
  • Type and effect information
  • Need effect information for e.g. generalisation.

20
Erlang refactoring challenges
  • Multiple binding occurrences of variables.
  • Indirect function call or function spawn:
    apply(lists, rev, [[a,b,c]])
  • Multiple arities mean multiple functions: rev/1
  • Concurrency
  • Refactoring within a design library: OTP.
  • Side-effects.
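A short sketch of two of these challenges, using a hypothetical module dyn_demo with its own rev/1 and rev/2 (not the functions named on the slide). It shows same-named functions of different arity and a call whose target atom is built at run time.

```erlang
-module(dyn_demo).
-export([rev/1, rev/2, call_by_name/1]).

%% rev/1 and rev/2 share a name but are different functions.
rev(L) -> rev(L, []).

rev([H|T], Acc) -> rev(T, [H|Acc]);
rev([], Acc) -> Acc.

%% The function name is manufactured at run time, so a tool that
%% renames rev/1 cannot find this call by looking for the atom rev.
call_by_name(L) ->
    Name = list_to_atom("r" ++ "ev"),
    apply(?MODULE, Name, [L]).
```

Renaming rev/1 here would leave call_by_name/1 compiling cleanly and failing only when it is run.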

21
Static vs dynamic
  • Aim to check conditions statically.
  • Static analysis tools possible, but some aspects
    intractable, e.g. dynamically manufactured atoms.
  • Conservative vs liberal.
  • Compensation?

22
Architecture of Wrangler
23
Wrangler in Emacs
24
Wrangler in Emacs
25
Wrangler refactorings
  • Rename variable/function/module
  • Generalise function definition
  • Move a function definition to another (new)
    module
  • Function extraction
  • Fold expression against function
  • Expression search
  • Detect duplicate code
  • Tuple function parameters
  • From tuple to record

26
Wrangler demo
27
(No Transcript)
28
Tool building
29
Wrangler and RefactorErl
Wrangler
  • Lightweight.
  • Better integration with interactive tools (e.g.
    Emacs).
  • Undo/redo external?
  • Ease of implementing conditions.
RefactorErl
  • Higher entry cost.
  • Better for a series of refactorings on a large
    project.
  • Transaction support.
  • Ease of implementing transformations.

30
Integration with IDEs
  • Back to the future? Programmers' preference for
    Emacs and gVim ...
  • ... though some IDE interest: Eclipse,
    NetBeans.
  • Issue of integration with multiple IDEs: building
    common interfaces.

31
Integration with tools
  • Test data sets and test generation.
  • Makefiles, etc.
  • Working with macros, e.g. QuickCheck uses Erlang
    macros in a particular idiom.

32
APIs: programmer / user
  • API in Erlang to support user-programmed
    refactorings
  • declarative, straightforward and complete
  • but relatively low-level.
  • Higher-level combining forms?
  • OK for transformations, but need a separate
    condition language.

33
Verification and validation
  • Possible to write formal proofs of correctness
  • check conditions and transformations
  • different levels of abstraction
  • possibly name binding and substitution for
    renaming etc.
  • more abstract formulation for e.g. data type
    changes.
  • Use of Quviq QuickCheck to verify refactorings in
    Wrangler.
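The idea of testing a refactoring can be sketched without QuickCheck itself. The following is a hand-rolled stand-in, not the QuickCheck API: it compares the add_one function from the generalisation example with its generalised form on random integer lists. The module name equiv_demo and check/1 are hypothetical.

```erlang
-module(equiv_demo).
-export([check/1]).

%% Before generalisation.
add_one([H|T]) -> [H + 1 | add_one(T)];
add_one([]) -> [].

%% After generalisation and renaming.
add_int(N, [H|T]) -> [H + N | add_int(N, T)];
add_int(_N, []) -> [].

%% Run the comparison on Runs random lists of 1 to 10 integers.
check(Runs) ->
    lists:all(
      fun(_) ->
              Len = rand:uniform(10),
              L = [rand:uniform(100) || _ <- lists:seq(1, Len)],
              add_one(L) =:= add_int(1, L)
      end,
      lists:seq(1, Runs)).
```

A property-based tool adds input generators and shrinking of failing cases on top of this basic compare-before-and-after loop.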

34
Clone detection
35
The Wrangler Clone Detector
  • Uses syntactic and static semantic information.
  • Syntactically well-formed code fragments
  • identical after consistent renaming of
    variables,
  • with variations in literals, layout and
    comments.
  • Integrated within the refactoring environment.

36
The Wrangler Clone Detector
  • Makes use of both the token stream and the
    annotated AST.
  • Token-based approaches:
  • efficient;
  • report non-syntactic clones.
  • AST-based approaches:
  • report syntactic clones;
  • checking for consistent renaming is easier.

37
The Wrangler Clone Detector
Source Files -> Tokenisation -> Token Stream
  -> Normalisation -> Normalised Token Stream
  -> Suffix Tree Construction -> Suffix tree
38
The Wrangler Clone Detector
Source Files -> Tokenisation -> Token Stream
  -> Normalisation -> Normalised Token Stream
  -> Suffix Tree Construction -> Suffix tree
  -> Clone Collector -> Initial Clones
  -> Clone Filter -> Filtered Initial Clones
  -> Clone Decomposition -> Syntactic Clones
Source Files -> Parsing and Static Analysis -> Annotated ASTs
39
The Wrangler Clone Detector
Source Files -> Tokenisation -> Token Stream
  -> Normalisation -> Normalised Token Stream
  -> Suffix Tree Construction -> Suffix tree
  -> Clone Collector -> Initial Clones
  -> Clone Filter -> Filtered Initial Clones
  -> Clone Decomposition -> Syntactic Clones
  -> Consistent Renaming Checking -> Clones to report
Source Files -> Parsing and Static Analysis -> Annotated ASTs
40
The Wrangler Clone Detector
Source Files -> Tokenisation -> Token Stream
  -> Normalisation -> Normalised Token Stream
  -> Suffix Tree Construction -> Suffix tree
  -> Clone Collector -> Initial Clones
  -> Clone Filter -> Filtered Initial Clones
  -> Clone Decomposition -> Syntactic Clones
  -> Consistent Renaming Checking -> Clones to report
  -> Formatting -> Reported Code Clones
Source Files -> Parsing and Static Analysis -> Annotated ASTs
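A minimal sketch of the Tokenisation and Normalisation stages only, assuming the standard erl_scan module is used for scanning. This is not Wrangler's implementation: the module norm_demo, the placeholder atoms var and lit, and the choice of which tokens to abstract are all assumptions.

```erlang
-module(norm_demo).
-export([normalise/1, same_shape/2]).

%% Tokenise a source string and abstract away variable names and
%% literal values. erl_scan:string/1 already drops layout and comments.
normalise(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    [norm(T) || T <- Tokens].

norm({var, _, _})     -> var;
norm({integer, _, _}) -> lit;
norm({float, _, _})   -> lit;
norm({string, _, _})  -> lit;
norm({char, _, _})    -> lit;
norm({atom, _, A})    -> {atom, A};
norm({Symbol, _})     -> Symbol;             % punctuation, keywords
norm(Other)           -> element(1, Other).  % any other token kind

%% Two fragments are clone candidates if their normalised streams match.
same_shape(A, B) ->
    normalise(A) =:= normalise(B).
```

Because every variable becomes the same placeholder, same_shape/2 also accepts "f(X, Y) -> X." and "f(X, Y) -> Y." as a match, which is the kind of false positive the separate consistent renaming check is there to reject.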
41
Clone detection demo
42
(No Transcript)
43
(No Transcript)
44
(No Transcript)
45
Support for clone removal
  • Refactorings to support clone removal.
  • Function extraction.
  • Generalise a function definition.
  • Fold against a function definition.
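How the three refactorings combine to remove a clone can be sketched as follows. The module clone_demo and all function names are hypothetical; the "2" suffixed functions stand for the code after refactoring and sit next to the originals only so that the two can be compared.

```erlang
-module(clone_demo).
-export([total_price/1, total_weight/1,
         total_price2/1, total_weight2/1]).

%% Two clones: identical up to the literal element index.
total_price(Items)  -> lists:sum([element(2, I) || I <- Items]).
total_weight(Items) -> lists:sum([element(3, I) || I <- Items]).

%% Step 1 and 2: extract the shared expression as a function and
%% generalise it over the literal that varies.
total(N, Items) -> lists:sum([element(N, I) || I <- Items]).

%% Step 3: fold each clone against the new definition.
total_price2(Items)  -> total(2, Items).
total_weight2(Items) -> total(3, Items).
```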

46
Case studies
  • Applied the clone detector to Wrangler itself
    with threshold values of 30 and 2.
  • 36 final clone classes were reported: 12 are
    across modules, and 3 are duplicated function
    definitions.
  • Without syntactic checking and consistent
    variable renaming checking, 191 would have been
    reported.
  • Applied to a third-party code base (32k LOC, 89
    modules): 109 clone classes reported.

47
Data-oriented refactorings
48
Tupling parameters
%% Before: gcd takes two parameters
-module(tup1).
-export([gcd/2]).
gcd(X,Y) ->
    if X>Y ->
           gcd(X-Y,Y);
       Y>X ->
           gcd(Y-X,X);
       true ->
           X
    end.

%% After tupling the 2 parameters
-module(tup1).
-export([gcd/1]).
gcd({X,Y}) ->
    if X>Y -> gcd({X-Y,Y});
       Y>X -> gcd({Y-X,X});
       true -> X
    end.
49
Introduce records
%% Before: tuples
-module(rec1).
g({A, B}) ->
    A + B.
h(X, Y) ->
    g({X, X}),
    g(Y).

%% After introducing record rec with fields f1 and f2
-module(rec1).
-record(rec, {f1, f2}).
g(#rec{f1=A, f2=B}) -> A + B.
h(X, Y) ->
    g(#rec{f1=X, f2=X}),
    g(#rec{f1=element(1,Y), f2=element(2,Y)}).
50
Introduce records in a project
  • Need to replace other expressions
  • Replace tuples with record
  • Record update expression
  • Record access expression
  • Chase dependencies across functions
  • and across modules.
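The kinds of expression that need replacing can be illustrated with a small sketch. It reuses the record rec with fields f1 and f2 from the previous slide; the module rec_demo and its function names are hypothetical, and the comments show the tuple form each expression would replace.

```erlang
-module(rec_demo).
-export([new/2, sum/1, set_f1/2, to_tuple/1]).

-record(rec, {f1, f2}).

%% Tuple construction becomes record construction: was {A, B}.
new(A, B) -> #rec{f1 = A, f2 = B}.

%% Record access expressions: was element(1, T) + element(2, T).
sum(R) -> R#rec.f1 + R#rec.f2.

%% Record update expression: was setelement(1, T, V).
set_f1(R, V) -> R#rec{f1 = V}.

%% Conversion back to a tuple, for callers that are not refactored.
to_tuple(#rec{f1 = A, f2 = B}) -> {A, B}.
```

Every function that receives or returns the value has to change in step, which is why the dependencies must be chased across functions and modules.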

51
Refactoring and Concurrency
52
Wrangler and processes
  • Refactorings which address processes
  • Register a process.
  • Rename a registered process.
  • From function to process.
  • Add tags to messages sent / received.
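A sketch of what code might look like after two of these refactorings, registering a process and adding tags to messages. The module proc_demo, the registered name demo_server and the tags request and reply are assumptions for illustration, not output from Wrangler.

```erlang
-module(proc_demo).
-export([start/0, ping/0, loop/0]).

%% "Register a process": clients address the server by name, so the
%% pid no longer has to be passed around.
start() ->
    Pid = spawn(?MODULE, loop, []),
    true = register(demo_server, Pid),
    Pid.

%% "Add tags to messages": request and reply carry distinguishing atoms.
ping() ->
    demo_server ! {request, self(), ping},
    receive
        {reply, pong} -> pong
    after 1000 ->
        timeout   % assumption: give up after one second
    end.

loop() ->
    receive
        {request, From, ping} ->
            From ! {reply, pong},
            loop()
    end.
```

Both changes touch the sender and the receiver together, which is where the implicit communication structure makes the analysis hard.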

53
Challenges to implementation
  • Data gathering is a challenge because
  • Processes are syntactically implicit.
  • Pid to process links are implicit.
  • Communication structure is implicit.
  • Side effects.

54
Underlying analysis
  • Analyses include
  • Annotation of the AST, using call graph.
  • Forward program slicing.
  • Backwards program slicing.

55
Wrangler and Erlide
56
Wrangler and Erlide
  • Erlide is an Eclipse plugin for Erlang.
  • Distribution simplified.
  • Integration with the edit undo history.
  • Notion of project.
  • Refactoring API in the Eclipse LTK.
  • Ongoing support for Erlide from Ericsson.

57
(No Transcript)
58
(No Transcript)
59
Issues on integration
  • LTK has a fixed workflow for interactions.
  • New file vs set of diffs as representation.
  • Fold and generalise interaction pattern.
  • Cannot support rename / create file.
  • Other refactorings involve search: a different
    API.

60
Conclusions
61
Future work
  • Concurrency: continue work.
  • Refactoring within a design library: OTP.
  • Working with Erlang Training and Consulting.
  • Continue integration with Eclipse and other IDEs.
  • Test and property refactoring in
    .
  • Clone detection: fuller integration.

62
Acknowledgements
  • Wrangler development funded by EPSRC.
  • The developers of syntax-tools, distel and
    Erlide.
  • George Orosz and Melinda Toth.
  • Zoltan Horvath and the RefactorErl group at
    Eotvos Lorand Univ., Budapest.

63
http//projects.cs.kent.ac.uk