How To Address Rapidly Changing Data Representations in an Evolving Scientific Domain Using Aspect-oriented Programming Techniques

Institute Home Page: http://www.icss.neu.edu/ 3/7/2003



1
How To Address Rapidly Changing Data
Representations in an Evolving Scientific Domain
Using Aspect-oriented Programming Techniques
Overview of Bioinformatics at NEU.
  • Karl Lieberherr (lieber_at_ccs.neu.edu)
  • College of Computer and Information Science
  • Northeastern University
  • Boston

2
Motivation
  • From "Computational Challenges in Structural and
    Functional Genomics" by T. Head-Gordon and J. C. Wooley,
    IBM Systems Journal, Vol. 40, No. 2, 2001.

3
Some Quotes From Head-Gordon.
  • Although warehousing techniques are as vital in
    the sciences as in business, functional
    warehouses tailored for specific scientific needs
    are few and far between.
  • A key technical reason for this discrepancy is
    that our understanding of the concepts being
    explored in an evolving scientific domain changes
    constantly, leading to rapid changes in data
    representation.

4
Some Quotes From Head-Gordon (Refinement).
  • "... evolving scientific domain changes constantly,
    leading to rapid changes in data representation."
  • Not only data representations change: we also
    need protection against changes in interfaces.
  • Examples: additional or modified fields or
    arguments; additional or modified types.

5
More Quotes From Head-Gordon.
  • When the format of source data changes, the
    warehouse must be updated to read that source or
    it will not function properly. The bulk of these
    modifications involve extremely tedious,
    low-level translation and integration tasks that
    typically require the full attention of both
    database and domain experts. Given the lack of
    the ability to automate this work, warehouse
    maintenance costs are prohibitive, and warehouse
    up-times severely restricted.

6
Protect Against Changes.
  • We need protection against changes in both data
    representation and interfaces. The traditional
    technique, information hiding, protects against
    changes in data representation but does not help
    with changes to interfaces.
  • Protecting against interface changes needs more
    than information hiding: restriction through shy
    programming, also called Adaptive Programming
    (AP).

[Figure labels: Implementation, Interface, Client; Information Hiding, Shy Programming]
7
Problem with Information Hiding
  • Shy Programming builds on the observation that
    traditional black-box composition is not
    restrictive enough. We use the slogan
    "information hiding is not hiding enough."
    Black-box composition isolates the implementation
    from the interface, but does not decouple the
    interface from its clients.

8
Cover unimportant parts of the interface
  • To permit interfaces to evolve, self-discipline
    is required: clients should avoid programming
    extensively against the interface. Certain parts
    of the interface are best treated as if they
    were covered.

[Figure labels: Implementation, Interface, Client; Information Hiding, Shy Programming]
9
Shy Programming: Adaptive Programming
  • This disciplined programming is referred to as
    shy programming. Shy programming lets the
    program recover from (or adapt to) interface
    changes. Shy programming is also called Adaptive
    Programming (AP). This is similar to the shyness
    metaphor in the Law of Demeter (LoD): structure
    evolves over time, so communicate with just a
    subset of the visible objects.

10
Decoupling of Interface
  • We summarize the commonalities and differences
    between black-box composition and Shy Programming
    into two principles.
  • Black-box Principle: the representation of
    objects can be changed without affecting clients.
  • Shy-Programming Principle: the interface of
    objects can be changed within certain parameters
    without affecting clients.
  • It is important to notice that the
    Shy-Programming Principle builds on top of the
    Black-Box principle.
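
A minimal Java sketch of the two principles, under stated assumptions: the class names `SequenceRecord` and `GcContent` are hypothetical and do not come from the presentation. The private field illustrates the Black-box Principle. The client illustrates the shy idea only in a small way: it relies on two narrow queries, so other parts of the interface can be added or modified without touching it.

```java
// Hypothetical record type; all names here are illustrative only.
final class SequenceRecord {
    // Hidden representation (Black-box Principle): this could become a
    // char[] or a compressed form later without affecting any client.
    private final String bases;

    SequenceRecord(String bases) { this.bases = bases; }

    int length() { return bases.length(); }

    // Narrow, high-level query that shy clients rely on.
    long count(char base) {
        return bases.chars().filter(c -> c == base).count();
    }
}

// Shy client: it uses only length() and count(), so new or modified
// members elsewhere in SequenceRecord's interface do not affect it.
final class GcContent {
    static double of(SequenceRecord r) {
        if (r.length() == 0) return 0.0;
        return (double) (r.count('G') + r.count('C')) / r.length();
    }
}
```

A call such as `GcContent.of(new SequenceRecord("GGCA"))` computes the fraction of G and C bases without knowing how the bases are stored.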

11
Manager Metaphor.
Want to learn about organizing bioinformatics
knowledge.
  • A manager M is managing a set of group leaders G,
    each one managing a set of workers W. We consider
    issues related to informing M and requesting
    information from M. We use this example to
    illustrate three points.
  • Micromanager: no information restriction.
  • Shyness helps information restriction.
  • Complex requests help information restriction
    and optimization.

12
Manager Metaphor.
  • Micromanager: no information restriction.
  • If the manager is a micromanager (a manager who
    wants to know about and rely on all the details
    of the workers' projects), the managing approach
    is brittle because whenever a detail of one of
    the workers' projects changes, the manager needs
    to be notified.

13
Manager Metaphor.
  • Micromanager: no information restriction
    (continued).
  • An object-oriented program written in the usual
    way corresponds to the manager that likes to
    micromanage. It is full of detailed knowledge of
    the class graph. An alternative way of
    formulating the same idea is to observe that it
    is good when the workers are shy. A shy worker
    will only share minimal, high-level information
    with the group leader. And this will prevent a
    brittle situation where the group leaders and
    manager rely on too much detail.

14
Manager Metaphor.
  • Shyness helps information restriction
  • It is good for the workers to be shy and only
    talk to their group leader and not to the manager
    directly. (Shyness has two facets: talk only to a
    few friends AND share minimal information with
    them. Here we use the first facet while in the
    previous point we used the second facet.) The
    group leader will abstract the information from
    the workers and only pass on the abstract
    information to the manager. This will prevent the
    manager from micromanaging. This variant can be
    viewed as an application of the Law of Demeter
    (LoD) which states that an object should talk
    only to closely related objects. The closely
    related object for a worker is the group leader
    and not the manager.

15
Manager Metaphor.
  • Shyness helps information restriction
    (continued).
  • The motivation is that when things change at the
    worker level, the manager does not have to be
    informed necessarily. The group leader will be
    informed and will decide whether the information
    needs to be passed up.
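
The metaphor can be sketched in Java as follows. This is an illustration under assumptions, not code from the presentation: the classes `Worker`, `GroupLeader`, and `Manager`, and the "three open tasks" threshold, are invented for the example. Each object talks only to its direct collaborators, and each level passes up only an abstraction.

```java
import java.util.List;

final class Worker {
    private final int openTasks; // low-level detail that may change shape

    Worker(int openTasks) { this.openTasks = openTasks; }

    // A shy worker shares only minimal, high-level information.
    boolean onTrack() { return openTasks <= 3; } // threshold is an assumption
}

final class GroupLeader {
    private final List<Worker> workers;

    GroupLeader(List<Worker> workers) { this.workers = workers; }

    // The group leader abstracts worker details before passing them up.
    boolean groupOnTrack() {
        return workers.stream().allMatch(Worker::onTrack);
    }
}

final class Manager {
    private final List<GroupLeader> leaders;

    Manager(List<GroupLeader> leaders) { this.leaders = leaders; }

    // The manager never reaches through a leader to a worker (Law of Demeter).
    long groupsNeedingAttention() {
        return leaders.stream().filter(g -> !g.groupOnTrack()).count();
    }
}
```

If `Worker` later replaces `openTasks` with a richer schedule, only `Worker.onTrack()` changes; `GroupLeader` and `Manager` are shielded.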

[Figure label: shielded]
16
Manager Metaphor.
  • Complex requests help information restriction
    and optimization.
  • The manager does not want to be bothered by many
    simple requests from the many workers. Instead
    the manager prefers to get a complex request from
    time to time from a group leader. The complex
    request offers the manager the possibility to see
    all the requests as a whole and to optimize the
    overall result which would not be possible if
    simple requests come one by one and need to be
    satisfied immediately before the totality of all
    simple requests is seen.

17
Manager Metaphor.
  • Complex requests help information restriction
    and optimization (continued).
  • The same point applies to programming: instead of
    sending an object a lot of individual data access
    requests, it is better to send one complex
    request that can be treated as a whole and
    optimized accordingly.
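
A small Java sketch of the difference between many simple requests and one complex request. The `SequenceStore` class and its `lookups` counter are hypothetical, and deduplication stands in for whatever optimization a real store might apply once it sees the whole request.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;

final class SequenceStore {
    private final Map<String, String> byId = new HashMap<>();
    int lookups = 0; // counts accesses to the backing map, for illustration only

    void put(String id, String sequence) { byId.put(id, sequence); }

    // Simple request: each call is answered on its own.
    String get(String id) {
        lookups++;
        return byId.get(id);
    }

    // Complex request: the store sees all ids at once and can optimize,
    // here by looking up each distinct id only once.
    Map<String, String> getAll(Collection<String> ids) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String id : new LinkedHashSet<>(ids)) {
            lookups++;
            result.put(id, byId.get(id));
        }
        return result;
    }
}
```

Asking for the ids `a, b, a` one by one costs three lookups in this sketch, while a single `getAll` call for the same ids costs two.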

18
Aspect-oriented Programming (AOP).
  • AOP is programming with aspects. An aspect is a
    complex request to modify the execution of a
    program. May expose a large interface. This can
    be implemented efficiently by inserting code at
    compile time into the program. An aspect should
    be shy with respect to the program it modifies.

19
AOSD: not every concern fits into a component
(crosscutting).
Goal: find new component structures that
encapsulate rich concerns.
20
A Reusable Aspect.
abstract public aspect RemoteExceptionLogging {

  abstract pointcut logPoint();

  after() throwing (RemoteException e): logPoint() {
    log.println("Remote call failed in: " +
                thisJoinPoint.toString() +
                "(" + e + ").");
  }
}

public aspect MyRMILogging extends RemoteExceptionLogging {
  pointcut logPoint():
    call(* RegistryServer.*.*(..)) ||
    call(private * RMIMessageBrokerImpl.*.*(..));
}
21
Good Aspects Are Shy.
abstract aspect CapabilityChecking {

  pointcut invocations(Caller c):
    this(c) && call(void Service.doService(String));

  pointcut workPoints(Worker w):
    target(w) && call(void Worker.doTask(Task));

  pointcut perCallerWork(Caller c, Worker w):
    cflow(invocations(c)) && workPoints(w);

  before (Caller c, Worker w): perCallerWork(c, w) {
    w.checkCapabilities(c);
  }
}
22
Lessons From Manager Metaphor.
  • Information hiding does not hide enough.
    Information hiding makes all public interfaces
    available and (Micromanager) makes the point that
    only an abstraction of those interfaces should be
    visible at higher levels.

23
Lessons From Manager Metaphor (Continued).
  • In Shy Programming, only high-level information
    about the class or call graph is visible at the
    (shy) programming level and this shields the
    program from many changes to the class or call
    graph in the same way as the manager is shielded
    from many of the changes in the workers'
    projects. The role of the group leader is played
    by the glue code that maps high-level information
    to low-level information and vice-versa. Shy
    Programming is graph-shy.

24
Application to Bioinformatics Knowledge
  • Need shy programming and shy knowledge
    representation techniques for Bioinformatics.
  • Need domain-specific languages to define function
    in a structure-shy way.

25
Another Good Example of AOP.
Find all persons waiting at any bus stop on a bus
route.
OO solution: one method for each red class.
[Class diagram: BusRoute has busStops (BusStopList, 0..* BusStop) and buses (BusList, 0..* Bus); BusStop has waiting (PersonList, 0..* Person); Bus has passengers (PersonList).]
26
Traversal Strategy.
Find all persons waiting at any bus stop on a bus
route.
Strategy: from BusRoute through BusStop to Person
A complex request.
[Class diagram: BusRoute has busStops (BusStopList, 0..* BusStop) and buses (BusList, 0..* Bus); BusStop has waiting (PersonList, 0..* Person); Bus has passengers (PersonList).]
27
Robustness of Strategy.
Find all persons waiting at any bus stop on a bus
route.
Strategy: from BusRoute through BusStop to Person
The complex request is class-graph shy.
[Class diagram: BusRoute has villages (VillageList, 0..* Village) and buses (BusList, 0..* Bus); Village has busStops (BusStopList, 0..* BusStop); BusStop has waiting (PersonList, 0..* Person); Bus has passengers (PersonList).]
28
Writing Aspect-oriented Programs With Strategies.
String WPStrategy = "from BusRoute through BusStop to Person";

class BusRoute {
  int countWaitingPersons() {
    Integer result = (Integer)
      Main.cg.traverse(this, WPStrategy,
        new Visitor() {
          int r;
          public void before(Person host) { r++; }
          public void start() { r = 0; }
          public Object getReturnValue() {
            return new Integer(r);
          }
        });
    return result.intValue();
  }
}

A complex request.
Complex request plays role of manager.
Complex request is class-graph shy.
29
Writing Aspect-Oriented Programs With Strategies.
String WPStrategy = "from BusRoute through BusStop to Person";

// Prepare current class graph
Main.cg = new ClassGraph();
int r = aBusRoute.countWaitingPersons();
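
To make the idea of a class-graph-shy request concrete without any library, here is a self-contained Java sketch. It is a simplified reflective walk over the object graph, not the implementation behind `ClassGraph.traverse`; the class `ShyTraversal`, its methods, and the domain classes are assumptions written for this example. It visits every `Person` that is reachable from a root along a path passing through a `BusStop`, without naming any field or list class in between.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical domain classes mirroring the bus-route class graph.
final class Person { final String name; Person(String name) { this.name = name; } }
final class BusStop { final List<Person> waiting = new ArrayList<>(); }
final class Bus { final List<Person> passengers = new ArrayList<>(); }
final class BusRoute {
    final List<BusStop> busStops = new ArrayList<>();
    final List<Bus> buses = new ArrayList<>();
}
// A later structural change: bus stops grouped by village.
final class Village { final List<BusStop> busStops = new ArrayList<>(); }

final class ShyTraversal {
    private final Class<?> through;
    // Objects already handled before / after passing a `through` object.
    private final Set<Object> seenBefore = Collections.newSetFromMap(new IdentityHashMap<>());
    private final Set<Object> seenAfter = Collections.newSetFromMap(new IdentityHashMap<>());

    private ShyTraversal(Class<?> through) { this.through = through; }

    /** Visits each `target` object reachable from `root` on a path through a `through` object. */
    static <T> void traverse(Object root, Class<?> through, Class<T> target, Consumer<T> visitor) {
        new ShyTraversal(through).walk(root, false, target, visitor);
    }

    private <T> void walk(Object o, boolean passed, Class<T> target, Consumer<T> visitor) {
        if (o == null) return;
        boolean nowPassed = passed || through.isInstance(o);
        if (!(nowPassed ? seenAfter : seenBefore).add(o)) return; // handled in this state
        if (nowPassed && target.isInstance(o)) visitor.accept(target.cast(o));
        if (o instanceof Iterable<?>) { // collections: visit elements, not internals
            for (Object e : (Iterable<?>) o) walk(e, nowPassed, target, visitor);
            return;
        }
        // Follow non-static reference fields; never reflect into JDK classes.
        for (Class<?> c = o.getClass(); c != null && !c.getName().startsWith("java."); c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers()) || f.getType().isPrimitive()) continue;
                try {
                    f.setAccessible(true);
                    walk(f.get(o), nowPassed, target, visitor);
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
    }

    // "from <root> through BusStop to Person", counting the persons visited.
    static int countWaitingPersons(Object root) {
        int[] count = {0};
        traverse(root, BusStop.class, Person.class, p -> count[0]++);
        return count[0];
    }
}
```

The same `countWaitingPersons` call works whether the root is a `BusRoute` or a `Village`, and bus passengers are skipped because their paths do not pass through a `BusStop`. A real adaptive-programming tool would derive the traversal from the class graph rather than probing every field at run time.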
30
ObjectGraph in UML Notation.
[Object graph: Route1:BusRoute has buses (a BusList containing Bus15:Bus, with passengers in a PersonList) and busStops (a BusStopList containing CentralSquare:BusStop, with waiting in a PersonList); Person objects: Joan, Paul, Seema, Eric.]
31
ObjectGraphSlice.
[Object graph slice over the same objects: Route1:BusRoute, BusList, Bus15:Bus, BusStopList, CentralSquare:BusStop, two PersonList objects; Person objects: Joan, Paul, Seema, Eric.]
32
Summary So Far.
  • Aspect-oriented software development helps to
    create software that is
  • More flexible: supports easy adaptation to
    rapidly changing interfaces.
  • Easier to understand and also shorter.
  • Supports the Shy Programming Principle.

33
Institute for Complex Scientific Software
  • Institute Home Page
  • http://www.icss.neu.edu/

34
What?
  • Problem driving institute
  • Complexity of building software systems to enable
    scientific research
  • Objective
  • Develop general methodologies for
    building complex scientific software using latest
    computer science research

35
Goals.
[Diagram labels: Applications, Scientific Software Solutions, The Institute, New Methodologies, Computer Science]
36
Applicable Computer Science Research.
  • Aspect-Oriented Software Development
  • Software Components
  • Parallelism
  • Domain Specific Languages
  • Visualization
  • Knowledge-Based Support Systems

37
Three Testbeds.
  • THEMATICS (M. Ondrechen): protein function from
    structure; high external visibility
  • Proc. Nat. Academy of Science publication
  • Featured in popular scientific magazines:
    Nature, American Chemical Society, Science Daily
  • Subsurface Sensing and Imaging (many Institute
    participants from this area)
  • Parallel Geant4 (CERN; Cooperman, Reucroft and
    Swain): particle-matter interaction; million-line
    program

38
Some Other Faculty Highlights.
  • Valentin Ilyin.
  • Protein structure analysis: novel structural
    alignment method which produces high quality
    alignments.
  • Visual analytical bioinformatics interface
    (Friend).
  • Roger Giese.
  • The long term goal is to learn whether the
    measurement of DNA adducts in people can help to
    individualize cancer prevention, analogous to the
    measurement of cholesterol as a biomarker for
    risk of a heart attack.

39
Some Other Faculty Highlights.
  • Bob Futrelle.
  • I'm particularly interested in the relations
    between bio-ontologies and text and diagrams.

40
Conclusions
  • Northeastern University and the Institute for
    Complex Scientific Software create knowledge of
    significant interest to bioinformatics.
  • Aspect-Oriented Software Development is a useful
    technology for the rapidly evolving area of
    bioinformatics.

41
The End
42
PathSet Algorithm
  • We have developed an efficient graph search
    algorithm that solves the following problem
  • Input
  • Graph G1 = (V1, E1) with source s and target t.
  • Graph G2 = (V2, E2) where V1 is a subset of V2.
  • Question: Does G2 contain a path that is an
    expansion of a path in G1 from s to t? (The
    algorithm works even if s and t are sets of
    nodes.)

43
Explanation.
  • Given a path p, a path p' is called an expansion,
    if p' can be obtained by inserting one or more
    elements between elements of p.
  • More generally, we can find a third graph that
    succinctly represents all possible such paths in
    G2.
  • Do you see applications of such an algorithm in
    biology?

44
Motivation.
  • G1 is a small graph that lists important
    nodes.
  • G2 is a large graph in which we want to
    recognize paths that are expansions of paths in
    the small graph.
  • Expansions of paths may contain additional nodes
    that are noise nodes.

45
Notes
  • There is a path in G2 iff the traversal graph of
    G1 and G2 is not empty.
  • G1 may have exponentially many paths from s to t.
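
The decision problem can be sketched in Java as follows. This is not the PathSet algorithm itself and it does not build the traversal graph; it is a simplified check under stated assumptions: graphs are adjacency maps over node names, s and t are single distinct nodes, paths may repeat nodes, inserted nodes may be any nodes of G2, and an edge of G1 that also exists in G2 counts as realized with nothing inserted. The class `ExpansionCheck` and its methods are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class ExpansionCheck {
    /** Nodes reachable from `start` in `g` by a path of at least one edge. */
    static Set<String> reachable(Map<String, Set<String>> g, String start) {
        Set<String> seen = new HashSet<>();
        Deque<String> todo = new ArrayDeque<>(g.getOrDefault(start, Set.of()));
        while (!todo.isEmpty()) {
            String n = todo.pop();
            if (seen.add(n)) todo.addAll(g.getOrDefault(n, Set.of()));
        }
        return seen;
    }

    /** True if some path from s to t in g1 has an expansion in g2. */
    static boolean hasExpansion(Map<String, Set<String>> g1,
                                Map<String, Set<String>> g2,
                                String s, String t) {
        // Keep only the g1 edges (u, v) that g2 can realize as a path from u to v.
        Map<String, Set<String>> realizable = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : g1.entrySet()) {
            Set<String> kept = new HashSet<>(e.getValue());
            kept.retainAll(reachable(g2, e.getKey()));
            realizable.put(e.getKey(), kept);
        }
        // An s-to-t path over realizable edges concatenates into a path in g2.
        return reachable(realizable, s).contains(t);
    }
}
```

The sketch never enumerates the paths of G1, which matters because there may be exponentially many; it runs one reachability search in G2 per node of G1 and one search in the filtered G1.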

46
Topic Switch.
47
Lessons From Manager Metaphor (Continued).
  • AOP is related to (Micromanager) through the
    observation that aspects should be loosely
    coupled to the base programs they modify. The
    aspect should not be brittle with respect to the
    detailed calling structure of the base program in
    the same way as the manager should not rely on
    the details of the workers' projects. There is an
    intermediary, called glue code, that maps the
    aspect to the detailed usage context. AOP is
    call-graph shy.