Privacy issues in integrating R environment in scientific workflows

1
Privacy issues in integrating R environment in
scientific workflows
  • Dr. Zhiming Zhao
  • University of Amsterdam
  • Virtual Laboratory for e-Science

Privacy issues in integrating Legacy Experiment
Environment to Scientific Workflows
Zhiming Zhao, Dmitry A. Vasunin, Adianto Wibisono,
Adam Belloum, Cees de Laat, Pieter Adriaans,
Bob Hertzberger

2
Outline
  • Scientific experiments and R
  • Problem description
  • Optional solutions
  • Experimental results
  • Summarizing discussion
  • Future work

3
Scientific experiments and support systems
  • In such scenarios:
      • Existing experiment environments, such as R,
        are widely used by domain scientists
      • Human-in-the-loop computing is important for
        testing and validating prototypes
      • Scientific workflows are used to manage
        different processes and the experiment lifecycle

4
R and workflow support in VL-e
  • R provides rich functionality for statistics and
    data visualisation, and has been used as an
    important experimental environment in
    bio-sciences.
  • R needs scientific workflow support
      • Accessing different e-Science resources
      • Being coordinated with the other components in
        a large scale experiment
  • E-Science workflows in certain domains also need
    R
      • Reuse the advanced results from legacy systems
      • Support experiments developed on legacy systems
  • Workflow support in VL-e
      • Four systems are recommended
      • Taverna, Kepler and VLAM have support for R
      • A generic solution is under construction

5
R in scientific workflows: current solutions
  • Three types of solutions
  • Local: local installation of R, through the
    command line interface of R
      • Simple configuration
      • Performance bottleneck
  • Web Service: SOAP to pass R scripts and objects
      • Standard interface, distributed computing
      • High latency
  • TCP socket: socket interface (RServe)
      • Distributed computing
      • Maintains state
      • Poor security
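
As a rough illustration of the first (local) option, a workflow component can start a locally installed R through its command line front end. The sketch below is an assumption about how such a wrapper might look, not the mechanism used by Taverna, Kepler or VLAM. The helper names `build_rscript_command` and `run_r_expression` are hypothetical, and the code assumes the `Rscript` executable is on the PATH.

```python
import subprocess


def build_rscript_command(expression):
    """Return the argument list for evaluating one R expression locally.

    `Rscript -e` evaluates the expression; `--vanilla` skips saved
    workspaces and user profiles so separate runs do not share state.
    """
    return ["Rscript", "--vanilla", "-e", expression]


def run_r_expression(expression, timeout_s=60):
    """Run the expression in a fresh R process and return its stdout.

    Assumption: `Rscript` is installed and on the PATH (not checked here).
    Each call starts a new R process; that start-up cost is one plausible
    reason for the performance bottleneck listed for the local option.
    """
    result = subprocess.run(
        build_rscript_command(expression),
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=True,
    )
    return result.stdout
```

With R installed, `run_r_expression("cat(mean(c(1, 2, 3)))")` would return whatever R prints for that expression. No state survives between calls, which is the main contrast with the socket-based RServe option.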

6
Typical scenario of RServe and requirements on
privacy
  • Different levels of privacy issues
  • Data level
      • Intermediate results must not be seen by
        other users
  • Communication level: graphical display
      • Remote X display and interaction between
        multiple users

[Figure: diagram with the components WF1, WF2, R and Display]

7
Problem description and desired solution
  • Problem description
      • Most legacy experiment environments do not
        have strong security management
      • Workflow systems provide integration without
        considering security issues
      • The deployment of a remote environment is
        required to be secure
  • Desired solution
      • Use existing technologies
      • Provide solutions to privacy issues at the
        workflow level, preferably in a transparent way

8
Experiments
  • Review optional solutions
  • Investigate the overhead of security enhancement
    on the workflow execution
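
The overhead in question can be expressed as the fractional slowdown of a secured run relative to an unsecured baseline. The following is a minimal sketch of such a measurement, not the procedure used in the presentation. The function names are hypothetical, and `task` stands for any callable that executes one workflow step.

```python
import time


def mean_runtime(task, repeats=5, clock=time.perf_counter):
    """Average wall-clock time of calling `task()` `repeats` times.

    `clock` is injectable so the helper can be checked with a fake timer.
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    start = clock()
    for _ in range(repeats):
        task()
    return (clock() - start) / repeats


def relative_overhead(baseline_s, secured_s):
    """Fractional slowdown of the secured run relative to the baseline.

    0.25 means the secured configuration took 25% longer.
    """
    if baseline_s <= 0:
        raise ValueError("baseline runtime must be positive")
    return (secured_s - baseline_s) / baseline_s
```

Running the same workflow step once without and once with the security enhancement, and passing both mean runtimes to `relative_overhead`, gives a single number that can be compared across configurations.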

9
Different configurations and their level of
security
10
An experiment: Taverna, RServe and security tunnel
  • Experiment
      • Adding security enhancement in Taverna
      • Protect the data channels between Taverna and
        RServe
  • Overhead
      • Setting up security tunnels
      • Runtime data transfer
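
The slides do not say which tunnelling technology was used. One common way to protect a plain TCP channel such as the one to RServe is SSH local port forwarding, sketched below under that assumption. The helper names and the host `r.example.org` are hypothetical, 6311 is Rserve's default port, and key-based SSH login is assumed.

```python
import subprocess

RSERVE_DEFAULT_PORT = 6311  # Rserve's default TCP port


def build_tunnel_command(remote_host, local_port=RSERVE_DEFAULT_PORT,
                         rserve_port=RSERVE_DEFAULT_PORT):
    """Return an `ssh` argument list that forwards a local port to Rserve.

    -N: do not run a remote command, only forward ports.
    -L: listen on 127.0.0.1:local_port and forward each connection to
        127.0.0.1:rserve_port as seen from the remote host.
    ExitOnForwardFailure=yes makes ssh exit if the forward cannot be set up.
    """
    forward = f"127.0.0.1:{local_port}:127.0.0.1:{rserve_port}"
    return ["ssh", "-N", "-o", "ExitOnForwardFailure=yes",
            "-L", forward, remote_host]


def open_tunnel(remote_host, **ports):
    """Start the tunnel as a background process and return its handle.

    Assumption: non-interactive (key-based) SSH login to `remote_host`.
    The caller is responsible for terminating the returned process.
    """
    return subprocess.Popen(build_tunnel_command(remote_host, **ports))
```

With a tunnel opened by `open_tunnel("r.example.org", local_port=16311)`, the workflow client would connect to 127.0.0.1:16311 instead of the remote host. The two overhead items on the slide then map to the time needed to establish the SSH session and the encryption cost on every transfer.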

11
Summarizing discussion
  • Integrating existing experiment environments with
    workflow systems is important for rapid
    prototyping
  • Privacy protection is demanded by both users and
    the e-Science infrastructure, and can be viewed as
    a generic issue when integrating a legacy
    component that supports user interaction into a
    workflow
  • Privacy protection can be achieved to a certain
    level by customizing the workflow execution
  • Security-enhanced workflow execution does not
    necessarily incur a high penalty on execution

12
Future work
  • In the VL-e project, we are developing a
    bus-style generic solution for different workflow
    systems
  • Take data privacy into account when realizing
    interoperability between different workflow
    systems

13
Activities
  • International workshop on Workflow systems in
    e-Science, organized by Zhiming Zhao and Adam
    Belloum, in the context of ICCS: 2006 at Reading
    University, 2007 in Beijing, China.
  • Proceedings are published in LNCS, Springer-Verlag.
  • A special issue will be published in the
    Scientific Programming journal.
  • http://staff.science.uva.nl/zhiming/iccs-wses
  • Workshop on Scientific workflows and industrial
    workflow standards in e-Science, organized by
    Adam Belloum and Zhiming Zhao, in the context of
    the IEEE e-Science and Grid Computing conference
    in Amsterdam, December 2006.
  • Pegasus, Dr. Ewa Deelman (Department of Computer
    Science, University of Southern California)
  • BPEL, Dr. Dieter König (IBM Research, Germany
    Development Laboratory)
  • Kepler, Dr. Bertram Ludäscher (Department of
    Computer Science, University of California, Davis)
  • Taverna, Prof. Peter Rice (European
    Bioinformatics Institute)
  • WS and Semantic issues, Dr. Steve Ross-Talbot
    (CEO and co-founder of Pi4 Technologies)
  • Triana, Dr. Ian J. Taylor (Department of Computer
    Science, Cardiff University)
  • http://staff.science.uva.nl/adam/workshop/VL-e-workshop.htm