Title: How To Address Rapidly Changing Data Representations in an Evolving Scientific Domain Using Aspect-oriented Programming Techniques
1. How To Address Rapidly Changing Data
Representations in an Evolving Scientific Domain
Using Aspect-oriented Programming Techniques
Overview of Bioinformatics at NEU.
- Karl Lieberherr (lieber_at_ccs.neu.edu)
- College of Computer and Information Science
- Northeastern University
- Boston
2. Motivation
- From "Computational Challenges in Structural and
Functional Genomics" by J. Head-Gordon, IBM
Systems Journal, Vol. 40, No. 2, 2001.
3. Some Quotes From Head-Gordon.
- Although warehousing techniques are as vital in
the sciences as in business, functional warehouses
tailored for specific scientific needs are few and
far between.
- A key technical reason for this discrepancy is
that our understanding of the concepts being
explored in an evolving scientific domain changes
constantly, leading to rapid changes in data
representation.
4. Some Quotes From Head-Gordon (Refinement).
- ... evolving scientific domain changes constantly,
leading to rapid changes in data representation.
- Not only data representations but also interfaces
need protection against change.
- Examples: additional or modified fields or
arguments; additional or modified types.
5. More Quotes From Head-Gordon.
- When the format of source data changes, the
warehouse must be updated to read that source or
it will not function properly. The bulk of these
modifications involve extremely tedious,
low-level translation and integration tasks that
typically require the full attention of both
database and domain experts. Given the lack of
the ability to automate this work, warehouse
maintenance costs are prohibitive, and warehouse
up-times severely restricted.
6. Protect Against Changes.
- Protection against changes in data representation
and interfaces. The traditional technique,
information hiding, protects well against changes
in data representation, but it does not help with
changes to interfaces.
- We need more than information hiding to protect
against interface changes: restriction through
shy programming, also called Adaptive Programming
(AP).
[Diagram: Implementation, Interface, Client.
Information hiding separates Implementation from
Interface; shy programming separates Interface
from Client.]
7. Problem with Information Hiding
- Shy Programming builds on the observation that
traditional black-box composition is not
restrictive enough. We use the slogan
"information hiding is not hiding enough."
Black-box composition isolates the implementation
from the interface, but does not decouple the
interface from its clients.
8. Cover unimportant parts of the interface
- To permit interfaces to evolve, self-discipline
is required to refrain from programming
extensively against the interface. Certain parts
of the interface are best treated as if they were
covered.
[Diagram: Implementation, Interface, Client.
Information hiding separates Implementation from
Interface; shy programming separates Interface
from Client.]
9. Shy Programming = Adaptive Programming
- This disciplined programming is referred to as
shy programming. Shy programming lets the
program recover from (or adapt to) interface
changes. Shy programming is also called Adaptive
Programming (AP). This is similar to the shyness
metaphor in the Law of Demeter (LoD): structure
evolves over time, so communicate with only a
subset of the visible objects.
10. Decoupling of Interface
- We summarize the commonalities and differences
between black-box composition and Shy Programming
in two principles.
- Black-Box Principle: the representation of
objects can be changed without affecting clients.
- Shy-Programming Principle: the interface of
objects can be changed within certain parameters
without affecting clients.
- It is important to notice that the
Shy-Programming Principle builds on top of the
Black-Box Principle.
11. Manager Metaphor.
Want to learn about organizing bioinformatics
knowledge.
- A manager M is managing a set of group leaders G,
each one managing a set of workers W. We consider
issues related to informing M and requesting
information from M. We use this example to
illustrate three points.
- Micromanager: no information restriction.
- Shyness helps information restriction.
- Complex requests help information restriction
and optimization.
12. Manager Metaphor.
- Micromanager: no information restriction.
- If the manager is a micromanager (a manager who
wants to know about and rely on all the details
of the workers' projects), the managing approach
is brittle: whenever the details of one of the
workers' projects change, the manager needs to be
notified.
13. Manager Metaphor.
- Micromanager: no information restriction
(continued).
- An object-oriented program written in the usual
way corresponds to the manager who likes to
micromanage. It is full of detailed knowledge of
the class graph. An alternative way of
formulating the same idea is to observe that it
is good when the workers are shy. A shy worker
shares only minimal, high-level information
with the group leader. This prevents a
brittle situation where the group leaders and
the manager rely on too much detail.
14. Manager Metaphor.
- Shyness helps information restriction.
- It is good for the workers to be shy and to talk
only to their group leader, not to the manager
directly. (Shyness has two facets: talk only to a
few friends AND share minimal information with
them. Here we use the first facet, while in the
previous point we used the second facet.) The
group leader abstracts the information from
the workers and passes only the abstract
information on to the manager. This prevents the
manager from micromanaging. This variant can be
viewed as an application of the Law of Demeter
(LoD), which states that an object should talk
only to closely related objects. The closely
related object for a worker is the group leader,
not the manager.
15. Manager Metaphor.
- Shyness helps information restriction
(continued).
- The motivation is that when things change at the
worker level, the manager does not necessarily
have to be informed. The group leader is
informed and decides whether the information
needs to be passed up.
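The manager metaphor can be sketched in plain Java. This is only an illustration under stated assumptions: the class names `Worker`, `GroupLeader` and `Manager`, the method names, and the "half of the tasks done" rule are all hypothetical and do not come from the slides. The point is that the manager talks only to group leaders, and each worker shares only a high-level status.

```java
import java.util.ArrayList;
import java.util.List;

/** A worker's project; its internal details are free to change. */
final class Worker {
    private final int tasksDone;
    private final int tasksTotal;

    Worker(int tasksDone, int tasksTotal) {
        this.tasksDone = tasksDone;
        this.tasksTotal = tasksTotal;
    }

    /** Shy: shares one high-level fact (assumed rule: at least half done). */
    boolean onTrack() {
        return tasksDone * 2 >= tasksTotal;
    }
}

/** The group leader abstracts worker information before passing it up. */
final class GroupLeader {
    private final List<Worker> workers = new ArrayList<>();

    void add(Worker w) {
        workers.add(w);
    }

    /** Only this summary is visible to the manager. */
    boolean groupOnTrack() {
        for (Worker w : workers) {
            if (!w.onTrack()) {
                return false;
            }
        }
        return true;
    }
}

/** The manager talks only to group leaders, never to workers. */
final class Manager {
    private final List<GroupLeader> leaders = new ArrayList<>();

    void add(GroupLeader g) {
        leaders.add(g);
    }

    int groupsOnTrack() {
        int count = 0;
        for (GroupLeader g : leaders) {
            if (g.groupOnTrack()) {
                count++;
            }
        }
        return count;
    }
}
```

If `Worker` later tracks hours instead of tasks, only `onTrack()` changes; `Manager` is shielded because it never saw the task fields.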
16. Manager Metaphor.
- Complex requests help information restriction
and optimization.
- The manager does not want to be bothered by many
simple requests from the many workers. Instead,
the manager prefers to get a complex request from
time to time from a group leader. The complex
request lets the manager see all the requests as
a whole and optimize the overall result. That
would not be possible if simple requests came
one by one and had to be satisfied immediately,
before the totality of all simple requests is
seen.
17. Manager Metaphor.
- Complex requests help information restriction
and optimization (continued).
- The same point applies to programming: instead of
sending an object a lot of individual data access
requests, it is better to send one complex
request that can be treated as a whole and
optimized accordingly.
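A minimal Java sketch of the difference between many simple requests and one complex request. The `RecordStore` class, its methods, and the round-trip counter are hypothetical and exist only for illustration. The complex request sees all ids at once, so it can drop duplicates and answer in a single simulated round trip.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory store that counts simulated round trips. */
final class RecordStore {
    private final Map<String, String> data = new HashMap<>();
    private int roundTrips = 0;

    void put(String id, String value) {
        data.put(id, value);
    }

    /** Simple request: one round trip per id, no chance to optimize. */
    String fetch(String id) {
        roundTrips++;
        return data.get(id);
    }

    /** Complex request: all ids are seen as a whole and served in one round trip. */
    Map<String, String> fetchAll(List<String> ids) {
        roundTrips++;
        Set<String> unique = new LinkedHashSet<>(ids); // duplicates are looked up once
        Map<String, String> result = new HashMap<>();
        for (String id : unique) {
            result.put(id, data.get(id));
        }
        return result;
    }

    int roundTrips() {
        return roundTrips;
    }
}
```

Fetching the ids `a`, `b`, `a` one by one costs three round trips in this model; `fetchAll` costs one and looks up `a` only once.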
18. Aspect-oriented Programming (AOP).
- AOP is programming with aspects. An aspect is a
complex request to modify the execution of a
program. It may expose a large interface. This can
be implemented efficiently by inserting code into
the program at compile time. An aspect should
be shy with respect to the program it modifies.
19. AOSD: not every concern fits into a component
(crosscutting)
Goal: find new component structures that
encapsulate rich concerns
20. A Reusable Aspect.
abstract public aspect RemoteExceptionLogging {
    // Abstract: concrete sub-aspects say where to log.
    abstract pointcut logPoint();

    // "log" is assumed to be a PrintStream declared elsewhere.
    after() throwing (RemoteException e): logPoint() {
        log.println("Remote call failed in: " +
                    thisJoinPoint.toString() +
                    "(" + e + ").");
    }
}

public aspect MyRMILogging extends RemoteExceptionLogging {
    pointcut logPoint():
        call(* RegistryServer.*.*(..)) ||
        call(private * RMIMessageBrokerImpl.*.*(..));
}
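21. Good Aspects Are Shy.
abstract aspect CapabilityChecking {
    // Calls to the service, exposing the caller.
    pointcut invocations(Caller c):
        this(c) && call(void Service.doService(String));

    // Calls that make a worker do a task, exposing the worker.
    pointcut workPoints(Worker w):
        target(w) && call(void Worker.doTask(Task));

    // Work done in the control flow of a service invocation.
    pointcut perCallerWork(Caller c, Worker w):
        cflow(invocations(c)) && workPoints(w);

    before (Caller c, Worker w): perCallerWork(c, w) {
        w.checkCapabilities(c);
    }
}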
22. Lessons From Manager Metaphor.
- Information hiding does not hide enough.
Information hiding makes all public interfaces
available and (Micromanager) makes the point that
only an abstraction of those interfaces should be
visible at higher levels.
23. Lessons From Manager Metaphor (Continued).
- In Shy Programming, only high-level information
about the class or call graph is visible at the
(shy) programming level. This shields the
program from many changes to the class or call
graph, in the same way as the manager is shielded
from many of the changes in the workers'
projects. The role of the group leader is played
by the glue code that maps high-level information
to low-level information and vice versa. Shy
Programming is graph-shy.
24. Application to Bioinformatics Knowledge
- Need shy programming and shy knowledge
representation techniques for Bioinformatics.
- Need domain-specific languages to define function
in a structure-shy way.
25. Another Good Example of AOP.
Find all persons waiting at any bus stop on a bus
route.
OO solution: one method for each red class.
[Class diagram: BusRoute --busStops--> BusStopList
--0..*--> BusStop --waiting--> PersonList --0..*-->
Person; BusRoute --buses--> BusList --0..*--> Bus
--passengers--> PersonList.]
26. Traversal Strategy.
Find all persons waiting at any bus stop on a bus
route.
from BusRoute through BusStop to Person
A complex request
[Class diagram: BusRoute --busStops--> BusStopList
--0..*--> BusStop --waiting--> PersonList --0..*-->
Person; BusRoute --buses--> BusList --0..*--> Bus
--passengers--> PersonList.]
27. Robustness of Strategy.
Find all persons waiting at any bus stop on a bus
route.
from BusRoute through BusStop to Person
Complex request is class-graph shy
[Class diagram: BusRoute --villages--> VillageList
--0..*--> Village --busStops--> BusStopList
--0..*--> BusStop --waiting--> PersonList --0..*-->
Person; BusRoute --buses--> BusList --0..*--> Bus
--passengers--> PersonList.]
28. Writing Aspect-oriented Programs With Strategies.
String WPStrategy = "from BusRoute through BusStop to Person";

class BusRoute {
    int countWaitingPersons() {
        Integer result = (Integer)
            Main.cg.traverse(this, WPStrategy,
                new Visitor() {
                    int r;
                    // Called for every Person reached by the traversal.
                    public void before(Person host) { r++; }
                    public void start() { r = 0; }
                    public Object getReturnValue() {
                        return Integer.valueOf(r);
                    }
                });
        return result.intValue();
    }
}
A complex request
Complex request plays role of manager
Complex request is class-graph shy
29. Writing Aspect-Oriented Programs With Strategies.
String WPStrategy = "from BusRoute through BusStop to Person";

// Prepare current class graph
Main.cg = new ClassGraph();
int r = aBusRoute.countWaitingPersons();
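To make the robustness claim concrete without relying on any traversal library, here is a self-contained reflection-based sketch of the idea behind "from BusRoute through BusStop to Person". It is not the implementation used on the slides: `ShyTraversal`, `VillageBusRoute`, and the field layout are assumptions made for illustration, and the sketch only follows object fields and `Collection` elements (no arrays or maps).

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class Person { }
class BusStop { List<Person> waiting = new ArrayList<>(); }
class Bus { List<Person> passengers = new ArrayList<>(); }
class Village { List<BusStop> busStops = new ArrayList<>(); }

/** Original class graph: the route refers to its bus stops directly. */
class BusRoute {
    List<BusStop> busStops = new ArrayList<>();
    List<Bus> buses = new ArrayList<>();
}

/** Evolved class graph (hypothetical name): bus stops are grouped into villages. */
class VillageBusRoute {
    List<Village> villages = new ArrayList<>();
    List<Bus> buses = new ArrayList<>();
}

final class ShyTraversal {
    /** Counts distinct target objects reachable from source on a path through a "through" object. */
    static int count(Object source, Class<?> through, Class<?> target) {
        Set<Object> seenBefore = Collections.newSetFromMap(new IdentityHashMap<>());
        Set<Object> seenAfter = Collections.newSetFromMap(new IdentityHashMap<>());
        Set<Object> hits = Collections.newSetFromMap(new IdentityHashMap<>());
        visit(source, false, through, target, seenBefore, seenAfter, hits);
        return hits.size();
    }

    private static void visit(Object obj, boolean passed, Class<?> through, Class<?> target,
                              Set<Object> seenBefore, Set<Object> seenAfter, Set<Object> hits) {
        if (obj == null) {
            return;
        }
        if (through.isInstance(obj)) {
            passed = true;
        }
        if (!(passed ? seenAfter : seenBefore).add(obj)) {
            return; // already explored in this state, also guards against cycles
        }
        if (passed && target.isInstance(obj)) {
            hits.add(obj);
        }
        if (obj instanceof Collection) {
            for (Object element : (Collection<?>) obj) {
                visit(element, passed, through, target, seenBefore, seenAfter, hits);
            }
            return;
        }
        if (obj.getClass().getName().startsWith("java.")) {
            return; // do not reflect into JDK classes such as String
        }
        for (Class<?> c = obj.getClass(); c != null && c != Object.class; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers()) || f.getType().isPrimitive()) {
                    continue;
                }
                try {
                    f.setAccessible(true);
                    visit(f.get(obj), passed, through, target, seenBefore, seenAfter, hits);
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
    }
}
```

The call `ShyTraversal.count(route, BusStop.class, Person.class)` names only the three classes in the strategy, so it works unchanged for both `BusRoute` and `VillageBusRoute`. Passengers on a bus are not counted because no `BusStop` lies on the path to them.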
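30. ObjectGraph in UML Notation.
[Object graph: Route1:BusRoute has buses (a BusList
containing Bus15:Bus, with passengers in a
PersonList) and busStops (a BusStopList containing
CentralSquare:BusStop, with waiting persons in a
PersonList). Person objects: Joan:Person,
Paul:Person, Seema:Person, Eric:Person.]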
31. ObjectGraphSlice.
[Object graph slice, drawn over the same objects:
Route1:BusRoute, BusList, BusStopList, Bus15:Bus,
CentralSquare:BusStop, two PersonList objects,
Joan:Person, Paul:Person, Seema:Person,
Eric:Person.]
32. Summary So Far.
- Aspect-oriented software development helps to
create software that is:
- More flexible: supports easy adaptation to
rapidly changing interfaces.
- Easier to understand and also shorter.
- Supportive of the Shy Programming Principle.
33. Institute for Complex Scientific Software
- Institute Home Page
- http://www.icss.neu.edu/
34. What?
- Problem driving institute
- Complexity of building software systems to enable
scientific research
- Objective
- Develop general methodologies for
building complex scientific software using latest
computer science research
35. Goals.
Applications
Scientific Software Solutions
The Institute
New Methodologies
Computer Science
36. Applicable Computer Science Research.
- Aspect-Oriented Software Development
- Software Components
- Parallelism
- Domain Specific Languages
- Visualization
- Knowledge-Based Support Systems
37. Three Testbeds.
- THEMATICS (M. Ondrechen; protein function from
structure; high external visibility)
- Proc. Nat. Academy of Science publication
- Featured in popular scientific magazines:
Nature, American Chemical Society, Science Daily
- Subsurface Sensing and Imaging (many Institute
participants from this area)
- Parallel Geant4 (CERN; Cooperman, Reucroft and
Swain; particle-matter interaction; million-line
program)
38. Some Other Faculty Highlights.
- Valentin Ilyin.
- Protein structure analysis: novel structural
alignment method which produces high quality
alignments.
- Visual analytical bioinformatics interface
(Friend).
- Roger Giese.
- The long term goal is to learn whether the
measurement of DNA adducts in people can help to
individualize cancer prevention, analogous to the
measurement of cholesterol as a biomarker for
risk of a heart attack.
39. Some Other Faculty Highlights.
- Bob Futrelle.
- I'm particularly interested in the relations
between bio-ontologies and text and diagrams.
40. Conclusions
- Northeastern University and the Institute for
Complex Scientific Software create knowledge of
significant interest to bioinformatics.
- Aspect-Oriented Software Development is a useful
technology for the rapidly evolving area of
bioinformatics.
41. The End
42. PathSet Algorithm
- We have developed an efficient graph search
algorithm that solves the following problem.
- Input:
- Graph G1 = (V1, E1) with source s and target t.
- Graph G2 = (V2, E2) where V1 is a subset of V2.
- Question: Does G2 contain a path that is an
expansion of a path in G1 from s to t? (The
algorithm works even if s and t are sets of
nodes.)
43. Explanation.
- Given a path p, a path p' is called an expansion
of p if p' can be obtained by inserting one or more
elements between elements of p.
- More generally, we can find a third graph that
succinctly represents all possible such paths in
G2.
- Do you see applications of such an algorithm in
biology?
44. Motivation.
- G1 is a small graph that lists important
nodes.
- G2 is a large graph in which we want to
recognize paths that are expansions of paths in
the small graph.
- Expansions of paths may contain additional nodes
that are noise nodes.
45. Notes
- There is such an expansion path in G2 iff the
traversal graph of G1 and G2 is not empty.
- G1 may have exponentially many paths from s to t.
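The decision problem can be illustrated with a short Java sketch. This is not the PathSet algorithm and it does not build the traversal graph; it is a simple polynomial-time check under two assumptions of mine: paths may repeat nodes, and a G1 edge (u, v) may be expanded by zero or more inserted nodes. The class `ExpansionCheck` and the adjacency-map encoding are hypothetical. The check keeps a G1 edge only if G2 can get from u to v, then searches the remaining G1 edges from s to t, so it never enumerates the possibly exponential number of G1 paths.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class ExpansionCheck {
    /** Successors of a node; nodes without outgoing edges may be absent from the map. */
    private static Set<String> successors(Map<String, Set<String>> g, String node) {
        Set<String> out = g.get(node);
        return out == null ? Collections.<String>emptySet() : out;
    }

    /** Nodes reachable from start by following one or more edges of g. */
    static Set<String> reachable(Map<String, Set<String>> g, String start) {
        Set<String> seen = new HashSet<>();
        Deque<String> work = new ArrayDeque<>(successors(g, start));
        while (!work.isEmpty()) {
            String node = work.pop();
            if (seen.add(node)) {
                work.addAll(successors(g, node));
            }
        }
        return seen;
    }

    /** True if g2 contains a path that expands some g1 path from s to t. */
    static boolean hasExpansion(Map<String, Set<String>> g1, Map<String, Set<String>> g2,
                                String s, String t) {
        Set<String> seen = new HashSet<>();
        Deque<String> work = new ArrayDeque<>();
        seen.add(s);
        work.push(s);
        while (!work.isEmpty()) {
            String u = work.pop();
            if (u.equals(t)) {
                return true;
            }
            Set<String> viaG2 = reachable(g2, u);
            for (String v : successors(g1, u)) {
                // Follow the G1 edge (u, v) only if G2 connects u to v,
                // possibly through additional "noise" nodes.
                if (viaG2.contains(v) && seen.add(v)) {
                    work.push(v);
                }
            }
        }
        return false;
    }
}
```

For example, with G1 = s -> a -> t and G2 = s -> x -> a -> y -> t, the G2 path is an expansion of the G1 path with noise nodes x and y, and `hasExpansion` returns true.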
46. Topic Switch.
47. Lessons From Manager Metaphor (Continued).
- AOP is related to (Micromanager) through the
observation that aspects should be loosely
coupled to the base programs they modify. The
aspect should not be brittle with respect to the
detailed calling structure of the base program, in
the same way as the manager should not rely on
the details of the workers' projects. There is an
intermediary, called glue code, that maps the
aspect to the detailed usage context. AOP is
call-graph shy.