A Proof of Concept: Provenance in a Service Oriented Architecture

1
A Proof of Concept: Provenance in a Service Oriented Architecture
  • Liming Chen, Victor Tan,
  • Fenglian Xu, Alexis Biller,
  • Paul Groth, Simon Miles,
  • John Ibbotson, Michael Luck
  • and Luc Moreau

2
Purpose
  • Asking questions about the provenance of
    something, i.e. the process by which it came to
    be as it is, is essential in many domains
  • We are working with bioinformaticians, medics,
    aerospace engineers and physicists, and have
    found a wide range of questions they wish to ask
  • A simple example application can:
      • Clarify the requirements on software that helps
        answer those questions
      • Be used to explain the issues involved to
        non-domain experts
      • Be extended in controlled ways to explore issues
        that arise in real applications

3
EU Provenance and PASOA
  • Recent work of the EU Provenance project:
      • Developed a logical architecture for software to
        aid answering provenance-related questions, along
        with other research on security, scalability and
        user tool support
      • Now being applied to two project applications:
        organ transplant management (UPC, Spain) and
        aerospace engineering (DLR, Germany)
      • The logical architecture document should be
        released next week; keep an eye on
        www.gridprovenance.org
  • Recent work of the PASOA project:
      • Has focused on e-Science applications and has
        gathered requirements and developed protocols
        and software
      • EU Provenance used PASOA software for the work
        described in this talk
      • PASOA will be discussed in the following two
        presentations

4
Outline
  • The example application
  • Asking provenance-related questions
  • The example as a service-oriented process
  • Recording documentation of a process
  • What does the example show us?
  • What are the limits of the example?
  • Conclusions

5
The example application
6
Baking a Victoria Sponge
  • INGREDIENTS
      • 110g (4oz) Butter
      • 110g (4oz) Caster Sugar
      • 110g (4oz) Self-raising Flour
      • 2 Eggs
      • Vanilla Essence or 1 tsp Grated Lemon Rind
  • RECIPE
      • Preheat oven to 190°C / 375°F / Gas 5. Whisk
        together the butter and sugar until light and
        creamy. Add the beaten eggs gradually with a
        little of the flour. Fold in the remaining
        sieved flour and add the flavouring. Divide
        equally between two 15cm (6 inch) sandwich tins.
        Bake for 20 - 25 minutes. Turn out on to a wire
        rack to cool.
  • This is not such a contrived example!

www.thefoody.com
7
  • Take 20g sugar and 20g butter
  • Whisk them together
  • Get mixture 1

8
  • Take 2 eggs
  • Beat the eggs for 2 minutes
  • Mix the beaten eggs with mixture 1
  • Obtain mixture 2

9
  • Take 100g flour together with mixture 2
  • Fold to obtain mixture 3

10
  • Set baking temperature to 180°C
  • Set baking time to 30 min
  • Put mixture 3 into oven
  • Obtain a cake

11
We then set a time for baking
cake
12
After Baking
  • Some questions can be asked after baking a cake
  • Answers to the questions can be found if we
    record details of the baking process during its
    execution
  • The details of the baking process are what we
    call the provenance of a cake

13
What went wrong? Questions
  • Did we follow the recipe accurately?
  • Did we use the correct ingredients at the right
    time?
  • Did we provide the correct quantities? Correct
    units?
  • Did we perform actions for the right duration?
  • We need to keep a record of all actions performed
    with all their parameters (such as the number of
    eggs used)
  • Organ transplant example: Did the medics follow
    the correct procedure?
  • Bioinformatics example: Did I analyse an amino
    acid sequence using tools that actually only
    apply to nucleotide sequences?
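
The "did we follow the recipe?" questions above can be sketched as a comparison between the recipe and the recorded actions. This is only an illustration: the dictionary layout and the names `RECIPE`, `recorded` and `deviations` are assumptions made for this sketch, not part of the project's software. The quantities come from the step-by-step slides.

```python
# Hypothetical recipe: expected parameters for each action
# (values taken from the step-by-step slides).
RECIPE = {
    "Whisk": {"butter_g": 20, "sugar_g": 20},
    "BeatMix": {"eggs": 2, "beating_time_min": 2},
    "Fold": {"flour_g": 100},
    "OvenBake": {"temperature_c": 180, "baking_time_min": 30},
}


def deviations(recipe, recorded):
    """Return (action, parameter, expected, actual) for every mismatch."""
    found = []
    for action, expected in recipe.items():
        actual = recorded.get(action)
        if actual is None:
            # The action was never recorded at all.
            found.append((action, None, expected, None))
            continue
        for name, value in expected.items():
            if actual.get(name) != value:
                found.append((action, name, value, actual.get(name)))
    return found


# What was actually recorded during one (made-up) baking run.
recorded = {
    "Whisk": {"butter_g": 20, "sugar_g": 20},
    "BeatMix": {"eggs": 3, "beating_time_min": 2},  # one egg too many
    "Fold": {"flour_g": 100},
    "OvenBake": {"temperature_c": 180, "baking_time_min": 30},
}

for problem in deviations(RECIPE, recorded):
    print(problem)
```

The check only works if every action was recorded with all its parameters, which is the requirement this slide states.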

14
What went wrong? Questions
  • Other factors can affect the baking process:
      • The amount of flour required varies with altitude
      • The oven is broken and bakes at a different
        temperature
  • We need to know the internal state of the
    different entities participating in the baking
    process (such as actual oven temperature or oven
    altitude)
  • Organ transplant example: By what criteria did a
    team decide to accept or reject an organ?
  • Bioinformatics example: What script was used by
    the services to perform each stage of the
    experiment?
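
A minimal sketch of recording internal state, assuming an oven service that logs its actual temperature next to the requested one. The `Oven` class, its `calibration_offset_c` field and the `temperature_faults` helper are hypothetical names chosen for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Oven:
    """Hypothetical oven service that documents its own internal state."""

    calibration_offset_c: float = 0.0  # a broken oven runs hot or cold
    state_log: list = field(default_factory=list)

    def bake(self, requested_c, minutes):
        actual_c = requested_c + self.calibration_offset_c
        # Record the internal state alongside what was requested.
        self.state_log.append(
            {"requested_c": requested_c, "actual_c": actual_c, "minutes": minutes}
        )
        return "Cake"


def temperature_faults(state_log, tolerance_c=5.0):
    """Return the log entries where the oven missed the requested temperature."""
    return [
        entry
        for entry in state_log
        if abs(entry["actual_c"] - entry["requested_c"]) > tolerance_c
    ]


broken = Oven(calibration_offset_c=-25.0)
broken.bake(180, 30)
print(temperature_faults(broken.state_log))
```

Without the `actual_c` entry, the record would show a correct request of 180°C and give no hint that the oven was at fault.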

15
Process Analysis Questions
  • Did we use the same amounts of ingredients for
    baking cake 1 and cake 2, or the same
    proportions?
  • What was the longest step in the execution of a
    recipe?
  • Why did we not finish the process? Where did we
    stop?
  • The process that led to a given cake should be
    delimited and analysable
  • Organ transplant example: Which patient's death
    led to the organ now being transplanted?
  • Bioinformatics example: What samples led to the
    final analysis result?
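
Two of the process analysis questions above can be sketched as small queries over recorded data. The data layout is an assumption made for illustration: steps are `(name, start_minute, end_minute)` tuples, ingredient amounts are positive whole numbers of grams, and the timings of the whisk and fold steps are invented.

```python
from fractions import Fraction


def longest_step(steps):
    """Return the name of the step with the greatest duration."""
    return max(steps, key=lambda step: step[2] - step[1])[0]


def same_proportions(a, b):
    """True if two ingredient dicts (positive integer grams) share one ratio."""
    if a.keys() != b.keys():
        return False
    # Exact rational arithmetic avoids floating-point comparison problems.
    ratios = {Fraction(a[name], b[name]) for name in a}
    return len(ratios) == 1


# Beating (2 min) and baking (30 min) follow the slides; other timings are made up.
steps = [("Whisk", 0, 3), ("BeatMix", 3, 5), ("Fold", 5, 7), ("OvenBake", 7, 37)]
cake_1 = {"butter": 20, "sugar": 20, "flour": 100}
cake_2 = {"butter": 40, "sugar": 40, "flour": 200}

print(longest_step(steps))
print(same_proportions(cake_1, cake_2))
```

Both queries assume the process that led to each cake can be delimited, so that only the steps and ingredients of that one cake are passed in.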

16
What Did Parties Do? Questions
  • Did the baker follow the user's instructions
    (regardless of any claim from the baker)?
  • Did each step of the baking process follow the
    user's instructions? Did they receive the correct
    instructions?
  • Did they follow the received instructions?
  • All entities should document their own view of a
    process, because views may differ
  • Organ transplant example: Were there differing
    opinions on the suitability of an organ for
    transplant?
  • Bioinformatics example: I claim I used a database
    in my experiments whose license allows me to
    patent my results; does the database owner
    confirm this?
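
One way to picture why each party records its own view: if the user and the baker both document the same interaction, a client can compare the two records. The sketch below assumes each view is a dictionary keyed by a shared interaction identifier. The identifier `bake-1` and the function name `discrepancies` are invented for this example.

```python
def discrepancies(sender_view, receiver_view):
    """Compare two parties' records of the same interactions."""
    issues = []
    for key in sorted(sender_view.keys() | receiver_view.keys()):
        sent = sender_view.get(key)
        received = receiver_view.get(key)
        if sent != received:
            issues.append((key, sent, received))
    return issues


# The user documents what they asked for; the baker documents what they received.
user_view = {"bake-1": {"temperature_c": 180, "baking_time_min": 30}}
baker_view = {"bake-1": {"temperature_c": 160, "baking_time_min": 30}}

for issue in discrepancies(user_view, baker_view):
    print(issue)
```

A mismatch does not say which party is right, only that their accounts differ and need further investigation.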

17
Implementation
  • We implemented the application as a set of Web
    Services, and then implemented clients that
    answered the provenance-related questions by
    querying the provenance store
  • This involved mapping the scenario onto a
    service-oriented architecture

18
Service-Oriented Process
19
Recording
  • After baking, the provenance store contains a
    trace of the different activities that were
    involved in the production of a cake
  • The provenance of a cake is the documentation of
    the process that led to that cake
  • Trace in the Provenance Store:
      • Baker (Sugar, Flour, Beating Time, Temperature
      • Whisk (Butter, Sugar)
      • WhiskReturn (Mixture 1)
      • BeatMix (Mixture 1, Eggs, Beating Time)
      • BeatMixReturn (Mixture 2)
      • Fold (Flour, Mixture 2)
      • FoldReturn (Mixture 3)
      • OvenBake (Mixture 3, Temperature, Baking Time)
      • OvenBakeReturn (Cake)
      • BakerReturn (Cake)
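
A minimal sketch of how such a trace could be produced, assuming every service invocation is wrapped so that both the request and the response are recorded. The in-memory `ProvenanceStore`, the `call` helper and the stand-in services are assumptions for illustration, not the PASOA software used in the talk. The message names follow the slide, but the full parameter list of `Baker` is a guess, since the slide text is cut off.

```python
from functools import partial


class ProvenanceStore:
    """In-memory stand-in for a provenance store (hypothetical interface)."""

    def __init__(self):
        self.trace = []

    def record(self, message, *contents):
        self.trace.append((message, contents))


def call(store, name, service, *args):
    """Invoke a service and document both the request and the response."""
    store.record(name, *args)
    result = service(*args)
    store.record(name + "Return", result)
    return result


# Stand-in services: each just returns a label for what it produces.
def whisk(butter, sugar):
    return "Mixture 1"


def beat_mix(mixture, eggs, beating_time):
    return "Mixture 2"


def fold(flour, mixture):
    return "Mixture 3"


def oven_bake(mixture, temperature, baking_time):
    return "Cake"


def baker(store, butter, sugar, eggs, flour, beating_time, temperature, baking_time):
    # Parameter list is an assumption; the slide's Baker message is truncated.
    m1 = call(store, "Whisk", whisk, butter, sugar)
    m2 = call(store, "BeatMix", beat_mix, m1, eggs, beating_time)
    m3 = call(store, "Fold", fold, flour, m2)
    return call(store, "OvenBake", oven_bake, m3, temperature, baking_time)


store = ProvenanceStore()
cake = call(store, "Baker", partial(baker, store),
            "20g butter", "20g sugar", "2 eggs", "100g flour",
            "2 min", "180C", "30 min")

for message, contents in store.trace:
    print(message, contents)
```

Wrapping the invocation rather than the service body keeps the recording independent of how each service is implemented.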
20
What we have learnt
21
Process Documentation and Provenance
  • We distinguish:
      • process documentation (the documentation recorded
        into a provenance store about a process)
      • provenance (the information retrieved from a
        provenance store about a process)
  • This is because we have found there to be
    different requirements on each

[Diagram labels: Process documentation, Processing, Provenance]
22
Process documentation
  • Should allow questions about the provenance of
    entities to be answered
  • Should follow a consistent, application-independent
    structure so that independent parties can
    record documentation that is easily combined
      • e.g. oven may be owned by someone other than the
        user, but their documentation is combined to
        answer whether the requested temperature was used
  • Should state exactly what those recording it know
    to have happened, and not confuse it with what
    they guessed or inferred had happened
      • e.g. baker states that it put the cake in the
        oven, not that the cake was successfully baked,
        because the oven may have been broken
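
The last point can be illustrated with a small sketch in which the baker records only first-hand observations. The `Assertion` record and the event names are hypothetical and chosen for this example only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    asserter: str  # who recorded the statement
    event: str     # what they observed first-hand
    subject: str   # what the event was about


def baker_assertions(mixture, result):
    """What the baker can state first-hand about the oven step."""
    return [
        Assertion("baker", "put_in_oven", mixture),
        Assertion("baker", "took_out_of_oven", result),
        # Deliberately no "baked_successfully" assertion: the baker cannot
        # observe that, because the oven may have been broken.
    ]


for assertion in baker_assertions("Mixture 3", "Cake"):
    print(assertion)
```

Keeping each record attributed to its asserter also makes it possible to combine documentation from independent parties later.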

23
Provenance
  • Should give the client asking for the provenance
    of something control over the scope of the answer
      • e.g. whether the process that produced the flour
        is included in the provenance of the cake
  • Should be/provide the information relevant to
    answering a client's/user's questions (not swamp
    them with detail)
      • e.g. report how much flour was used rather than
        giving the XML structure sent between application
        components
  • May (in order to achieve the above) include
    inferred information
      • e.g. infer from baker putting mixture in oven and
        getting cake out that the cake was successfully
        baked from the mixture
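
Scope control can be sketched as a traversal over "was derived from" links in which the client supplies a predicate that decides how far back to go. The `DERIVED_FROM` table and the `provenance` function are assumptions for illustration. The `Wheat` entry is invented to stand for the upstream process that produced the flour.

```python
# Hypothetical "was derived from" links recovered from process documentation.
DERIVED_FROM = {
    "Cake": ["Mixture 3"],
    "Mixture 3": ["Flour", "Mixture 2"],
    "Mixture 2": ["Eggs", "Mixture 1"],
    "Mixture 1": ["Butter", "Sugar"],
    "Flour": ["Wheat"],  # upstream of the bakery (invented for this sketch)
}


def provenance(item, links, in_scope=lambda source: True):
    """Collect everything `item` was derived from, skipping out-of-scope sources."""
    seen = []
    stack = [item]
    while stack:
        current = stack.pop()
        for source in links.get(current, []):
            if source not in seen and in_scope(source):
                seen.append(source)
                stack.append(source)
    return seen


# Full answer, including how the flour came to be.
full = provenance("Cake", DERIVED_FROM)

# Scoped answer: the client is not interested in what happened before the bakery.
scoped = provenance("Cake", DERIVED_FROM, in_scope=lambda source: source != "Wheat")

print(sorted(full))
print(sorted(scoped))
```

Passing the scope as a predicate leaves the decision with the client, so the same documentation can answer both narrow and broad questions.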

24
Provenance architectures
  • Should allow different parties to record
    independent documentation if they want to
      • e.g. user and baker can record independently,
        allowing discrepancies to be noticed
  • Should have no dependence on any one workflow
    engine/language, and no requirement for
    (explicit) workflows to be used at all
      • e.g. our example application was written in Java,
        and baking in reality follows a plan in someone's
        head
  • Should be independent of any one product of a
    process: it should not be necessary to store
    process documentation with any one result of a
    process
      • e.g. the provenance of the cake, the provenance
        of the ingredients and the provenance of the
        intermediate mixtures overlap, so the
        documentation cannot be claimed to belong to any
        one of them

25
Limitations and Strengths
  • The current example has limitations:
      • The physical world is treated as if it mapped
        directly to the electronic world. How does a
        baker record documentation in a provenance store
        Web Service? Through a GUI? What if the GUI goes
        wrong or they use the GUI wrongly: do we still
        have sound process documentation?
      • None of the objects in the process have
        constituent parts that we may want to
        independently find the provenance of
      • It assumes a single provenance store that every
        service happily submits documentation to
  • ...but the strength of the example is that it can
    be simply extended to remove these limitations

26
Conclusions
  • The simple example allows us to determine the
    requirements on software to record process
    documentation and make it available to users
  • We have used it as a testbed, extending it to
    explore other aspects of provenance (along with
    other applications)
  • It is rich enough to continue extending to
    mirror, in a controlled way, issues discovered in
    the future

27
EU Provenance Partners
  • IBM United Kingdom Limited
  • University of Southampton
  • University of Wales, Cardiff
  • Deutsches Zentrum für Luft- und Raumfahrt e.V.
  • Universitat Politècnica de Catalunya
  • Magyar Tudományos Akadémia Számítástechnikai és
    Automatizálási Kutató Intézet