GLOBUS PLUG-IN FOR WINGS WORKFLOW ENGINE

Slides: 18
Provided by: mig782


1
GLOBUS PLUG-IN FOR WINGS WORKFLOW ENGINE
  • Elizabeth Martí
  • ITACA
  • Universidad Politécnica de Valencia
  • emarti_at_itaca.upv.es

2
Introduction
  • Take advantage of two concepts: Workflow and Grid.
  • Workflow provides the automation of the
    processes.
  • Grid makes possible the development of
    high-performance computing systems using
    heterogeneous geographically distributed
    resources with multiple administrative domains.
  • A Grid workflow can be defined as the composition
    of grid application services which execute on
    heterogeneous and distributed resources in a
    well-defined order to accomplish a specific goal
    (Rajkumar Buyya).

3
Motivation
  • Many different workflow initiatives have
    appeared:
      • Askalon, Karajan, Kepler, K-WfGrid, Taverna,
        Triana, etc.
  • They lack some important characteristics:
      • Multi-grid capability.
      • Easy extensibility to new middleware.
      • etc.
  • WINGS provides new features focusing on
    high-level definition, multi-grid and
    extensibility capabilities.
  • The most significant features of WINGS are:
      • Expressiveness to capture the specifics of
        grid computing.
      • Flow control structures.
      • Support for simple, light operations.
      • The ability to deal with different grid
        middlewares and versions.

4
WINGS Concepts
  • It is based on four concepts to model a
    workflow:
  • Data sources: Communication points to
    interchange data among the different executions
    of the workflow.
  • Activities: Abstractions of tasks to be run on
    the Grid. They describe the functionality of the
    tasks and are defined by:
      • The input and output parameters
        (simple/structured types).
      • The list of deployments that provides the
        specifics for each grid middleware.
  • Executions: Specific instances of an activity.
    The engine is in charge of selecting among the
    different deployments defined for each activity,
    according to where it is going to be run.
  • Operations: Simple executions performed by the
    workflow runtime to pre- or post-process the
    information available in the Data Sources, for
    use by the next tasks.
      • Examples: arithmetic and reduction
        operations, string search operations, field
        extraction operations, split or merge file
        operations, etc.
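
The four concepts above can be pictured as a small data model. This is only an illustrative sketch: the class names, field names, and executable paths are assumptions made for the example and are not taken from the WINGS implementation or its XML schema.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class DataSource:
    """Communication point holding data exchanged between executions."""
    name: str
    values: List[str] = field(default_factory=list)


@dataclass
class Deployment:
    """Middleware-specific details of one installation of an activity."""
    middleware: str   # e.g. "globus" or "fura"
    executable: str   # hypothetical path, for illustration only


@dataclass
class Activity:
    """Abstract task: parameters plus one deployment per middleware."""
    name: str
    inputs: List[str]
    outputs: List[str]
    deployments: List[Deployment]

    def deployment_for(self, middleware: str) -> Deployment:
        for deployment in self.deployments:
            if deployment.middleware == middleware:
                return deployment
        raise LookupError(f"{self.name} has no {middleware} deployment")


@dataclass
class Execution:
    """Concrete instance of an activity bound to a target middleware."""
    activity: Activity
    middleware: str

    def deployment(self) -> Deployment:
        # The engine picks the deployment matching where the task will run.
        return self.activity.deployment_for(self.middleware)


@dataclass
class Operation:
    """Light pre/post-processing step run by the workflow runtime itself."""
    name: str
    func: Callable[[List[str]], List[str]]

    def apply(self, source: DataSource) -> DataSource:
        return DataSource(f"{source.name}.{self.name}", self.func(source.values))


# Hypothetical activity with one deployment per middleware.
rigid = Activity(
    name="rigid",
    inputs=["images"],
    outputs=["aligned"],
    deployments=[
        Deployment("fura", "/opt/fura/rigid"),
        Deployment("globus", "/opt/globus/rigid"),
    ],
)
chosen = Execution(rigid, "globus").deployment()

# A field-extraction operation applied to a data source.
first_field = Operation("first_field", lambda rows: [r.split(",")[0] for r in rows])
extracted = first_field.apply(DataSource("table", ["a,1", "b,2"]))
```

Keeping deployments as a list on the activity mirrors the idea that one abstract task can be run on several middlewares, with the choice deferred until an execution is created.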

5
Wings Engine
  • It considers a pure data flow language where a
    workflow is a sequence of:
      • DS → Execution or Operation → DS
  • This simplifies the workflow description and
    understanding, and also increases the
    expressiveness.
  • It is in charge of providing the functionality
    defined in the XML file, creating an environment
    to launch concurrent jobs.
  • A key issue in a multi-grid environment is the
    movement of files among the different resources
    of consecutive tasks, so the runtime (RT) tries
    to:
      • Reduce the number of data transfers.
      • Deal with different physical file storage
        systems.
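
In a pure data flow model, a step can be launched as soon as every data source it reads is available. The sketch below illustrates that idea only; the `Step` type, the `run_order` function, and the step and data source names are hypothetical and do not describe how the WINGS runtime is actually written.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Step(NamedTuple):
    """An execution or operation reading and writing named data sources."""
    name: str
    inputs: Tuple[str, ...]
    outputs: Tuple[str, ...]


def run_order(steps: Iterable[Step], available: Iterable[str]) -> List[List[str]]:
    """Group steps into batches; steps within a batch could run concurrently."""
    ready_data = set(available)
    pending = list(steps)
    batches: List[List[str]] = []
    while pending:
        # A step is ready when all of its input data sources exist.
        ready = [s for s in pending if set(s.inputs) <= ready_data]
        if not ready:
            missing = sorted({i for s in pending for i in s.inputs} - ready_data)
            raise ValueError(f"no step can run; missing data sources: {missing}")
        batches.append([s.name for s in ready])
        for s in ready:
            ready_data.update(s.outputs)
        pending = [s for s in pending if s not in ready]
    return batches


# Hypothetical workflow: two independent steps feed a third one.
workflow = [
    Step("merge", ("left_out", "right_out"), ("result",)),
    Step("left", ("raw",), ("left_out",)),
    Step("right", ("raw",), ("right_out",)),
]
batches = run_order(workflow, available=["raw"])
```

Here `left` and `right` end up in the same batch because both depend only on `raw`, while `merge` waits for both of their outputs.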

6
WINGS Architecture
  • (Architecture diagram not preserved; surviving
    label: WINGS Core Engine)

7
Execution scheme
  • Core Engine
      • Performs the logic and control operations.
      • Prepares and selects the tasks ready to be
        launched and the data to use in each
        execution.
  • Plug-ins
      • In charge of actually performing the file
        transfers and all the operations needed to
        complete the execution.

8
Middleware Plugins
  • Functionality is extended just by implementing a
    plug-in and adding it to the system.
  • In the first version of the workflow engine, the
    Fura middleware plug-in was developed.
  • Now a Globus Toolkit plug-in has been implemented
    to enable multi-grid tests.
  • Globus was selected because of the large number
    of current infrastructures that use it as the
    underlying grid middleware (EGEE, EELA, etc.).
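
One common way to get this kind of extensibility is a shared plug-in interface plus a registry keyed by middleware name. The following is a minimal sketch under that assumption; `MiddlewarePlugin`, `PluginRegistry`, and `RecordingPlugin` are hypothetical names, and the stand-in plug-in only records calls instead of talking to Fura or Globus.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class MiddlewarePlugin(ABC):
    """Operations the core engine delegates to a middleware plug-in."""

    @abstractmethod
    def stage_in(self, files: List[str]) -> None:
        """Copy input files to the execution resource."""

    @abstractmethod
    def run(self, executable: str, arguments: List[str]) -> None:
        """Launch the task on the middleware."""

    @abstractmethod
    def stage_out(self, pattern: str) -> None:
        """Retrieve the output files matching a pattern."""


class PluginRegistry:
    """Maps a middleware name to the plug-in that handles it."""

    def __init__(self) -> None:
        self._plugins: Dict[str, MiddlewarePlugin] = {}

    def register(self, name: str, plugin: MiddlewarePlugin) -> None:
        self._plugins[name] = plugin

    def get(self, name: str) -> MiddlewarePlugin:
        try:
            return self._plugins[name]
        except KeyError:
            raise LookupError(f"no plug-in registered for {name!r}") from None


class RecordingPlugin(MiddlewarePlugin):
    """Stand-in plug-in that records calls instead of contacting a grid."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def stage_in(self, files: List[str]) -> None:
        self.calls.append(f"stage_in {len(files)} file(s)")

    def run(self, executable: str, arguments: List[str]) -> None:
        self.calls.append(f"run {executable} {' '.join(arguments)}".rstrip())

    def stage_out(self, pattern: str) -> None:
        self.calls.append(f"stage_out {pattern}")


registry = PluginRegistry()
registry.register("fura", RecordingPlugin())
registry.register("globus", RecordingPlugin())

# The core engine only needs the middleware name chosen for an execution.
plugin = registry.get("globus")
plugin.stage_in(["study1.img"])
plugin.run("coreg", ["study1.img"])
plugin.stage_out("*.img")
```

With this shape, supporting another middleware means writing one more subclass and registering it; the engine code that calls `registry.get(...)` does not change.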

9
Globus Plug-in
  • Step 1: Prepare the activity.
      • The workflow model is defined in the XML
        file.
      • Create a valid proxy (Proxy store).
      • Define the execution environment of the task
        (Globus, Fura, ...).
      • Create a working directory on the execution
        host (GridFTP).
      • Create an execution directory (GridFTP).
      • Copy the executable to the execution host
        (third-party copy with UrlCopy).
      • If necessary, copy auxiliary data used by the
        executable (libraries, jar files, ...).

10
GLOBUS PLUG-IN
  • Step 2: Prepare the initial data.
      • Obtain the information about the input data
        (XML file) and store it (input parameters
        matrix).
      • Obtain the number of microtasks (combinations
        of inputs).
      • Create an input directory.
      • Copy the input data to the input directory
        (UrlCopy).
      • Create an output directory.
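
The slide does not say how input combinations are formed. Assuming a "combination of inputs" means one value chosen per input parameter (a Cartesian product over the input parameters matrix), the microtasks could be enumerated as in this sketch; `build_microtasks` and the parameter names are hypothetical.

```python
from itertools import product
from typing import Dict, List


def build_microtasks(input_matrix: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Return one microtask per combination of input values.

    Assumption: every parameter contributes exactly one value to each
    microtask, so the count is the product of the list lengths.
    """
    names = list(input_matrix)
    value_lists = [input_matrix[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


# Hypothetical input parameters matrix: three studies against one reference.
matrix = {
    "study": ["s1.img", "s2.img", "s3.img"],
    "reference": ["base.img"],
}
microtasks = build_microtasks(matrix)
```

In this example the three studies and the single reference give three microtasks, each of which could then be launched as a separate job.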

11
GLOBUS PLUG-IN
  • Step 3: Execute the task.
      • Define the RSL file for the task.
          • Executable, arguments, working directory,
            etc.
      • Create a GRAM job for each RSL file.
      • Launch the job (batch mode).
      • Parallel execution of microtasks.
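
As a rough illustration of this step, the sketch below builds an RSL string from an executable, its arguments, and a working directory, then hands each RSL to a submission function from a thread pool. The attribute names (`executable`, `arguments`, `directory`, `stdout`, `stderr`) are standard pre-WS GRAM RSL attributes; everything else, including `build_rsl`, `launch_all`, the paths, and the `submit` callable, is an assumption for the example and is not the plug-in's code. No Globus API is called here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Sequence


def rsl_quote(value: str) -> str:
    """Quote an RSL literal; an embedded double quote is written twice."""
    return '"' + value.replace('"', '""') + '"'


def build_rsl(executable: str, arguments: Sequence[str], directory: str,
              stdout: str = "stdout.txt", stderr: str = "stderr.txt") -> str:
    """Return a conjunction (&) of RSL relations describing one job."""
    relations = [
        ("executable", [executable]),
        ("arguments", list(arguments)),
        ("directory", [directory]),
        ("stdout", [stdout]),
        ("stderr", [stderr]),
    ]
    parts = []
    for attribute, values in relations:
        if values:  # skip empty relations such as a job with no arguments
            parts.append(f"({attribute}={' '.join(rsl_quote(v) for v in values)})")
    return "&" + "".join(parts)


def launch_all(rsl_documents: Iterable[str], submit: Callable[[str], str],
               max_workers: int = 4) -> List[str]:
    """Submit every RSL concurrently; results keep the input order.

    `submit` is a hypothetical callable standing in for whatever creates
    and launches a GRAM job from an RSL string.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(submit, rsl_documents))


# One RSL per microtask (hypothetical paths and file names).
rsls = [
    build_rsl("/work/bin/coreg", [study, "base.img"], "/work/exec")
    for study in ("s1.img", "s2.img")
]
job_ids = launch_all(rsls, submit=lambda rsl: f"job-{len(rsl)}")
```

Generating one RSL per microtask keeps each job description independent, which is what allows the microtasks to be submitted in parallel.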

12
GLOBUS PLUG-IN
  • Step 4: Get output data.
      • Get output data from the output directory.
          • Use of wildcards to filter files.
      • Create a replica of the results in a
        specified location.
          • Path specified in the data source
            definition.
      • Clean intermediate data.
          • Implementation of a function to
            recursively delete directories.
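
The two helper ideas in this step, wildcard filtering and recursive directory deletion, can be illustrated on a local filesystem. This is a sketch of the logic only: the real plug-in works against remote directories over GridFTP, which this example does not attempt to reproduce, and both function names are hypothetical.

```python
import fnmatch
from pathlib import Path
from typing import Iterable, List


def filter_outputs(names: Iterable[str], pattern: str) -> List[str]:
    """Keep only the file names matching a shell-style wildcard pattern."""
    return [name for name in names if fnmatch.fnmatchcase(name, pattern)]


def delete_recursively(path: Path) -> None:
    """Delete a file, or a directory together with everything below it."""
    if path.is_dir() and not path.is_symlink():
        # Empty the directory depth-first before removing it.
        for child in path.iterdir():
            delete_recursively(child)
        path.rmdir()
    else:
        path.unlink()


# Select only the image files among the outputs of a task.
selected = filter_outputs(["s1.img", "s1.log", "s2.img"], "*.img")
```

Deleting children before their parent matters when the underlying protocol can only remove empty directories, which is why the cleanup is written as an explicit recursion here.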

13
Use Case
  • A biomedical application representing the
    execution of a medical image co-registration
    process (rigid and elastic).
  • The co-registration processes compare all the
    images with the base study, aligning the voxels
    of each study so that it is as similar as
    possible to the reference image.
  • The input data are dynamic series of 3D magnetic
    resonance images acquired after the injection of
    a contrast bolus in the area of the abdomen, to
    study the perfusion of the liver.
  • The set is composed of 5 studies with 12 slices
    each.

14
Use Case
  • Biomedical Application
  • The workflow is composed of three steps:
      • Rigid co-registration.
      • Elastic co-registration (the most
        CPU-consuming step).
      • A process to transpose the N studies (with K
        slices each) resulting from the
        co-registration into K studies with N slices.
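
The third step is a plain transposition of a studies-by-slices grid. A minimal sketch, assuming each study is a list of slice identifiers of equal length (the function name and the identifier format are hypothetical):

```python
from typing import List, Sequence


def transpose_studies(studies: Sequence[Sequence[str]]) -> List[List[str]]:
    """Turn N studies of K slices into K studies of N slices."""
    if len({len(study) for study in studies}) > 1:
        raise ValueError("all studies must have the same number of slices")
    # zip(*studies) groups together the k-th slice of every study.
    return [list(same_position) for same_position in zip(*studies)]


# The use case dimensions: N = 5 studies with K = 12 slices each.
studies = [[f"study{n}_slice{k}" for k in range(12)] for n in range(5)]
transposed = transpose_studies(studies)
```

After the transposition, `transposed[k][n]` holds the slice that was at `studies[n][k]`, so each new study follows one slice position across the whole dynamic series.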

15
Use Case
  • Execution Times
      • F1: AMD Opteron 2.4 GHz with 1 GB of RAM
        (Fura Agent)
      • F2: AMD Opteron 2.2 GHz with 1 GB of RAM
        (Fura Agent)
      • GN: Pentium Xeon 2 GHz with 512 MB of RAM
        (Globus Node)
      • Gigabit Ethernet network

16
Conclusions
  • We have analyzed previous work; some systems
    have good features but do not fit our needs.
  • WINGS has been designed in a modular way, so
    that new components can be added to the system
    through plug-ins.
  • We have implemented a Globus plug-in oriented to
    GT middleware.
  • Currently Fura, Globus Toolkit (pre-WS services),
    and sub-workflow execution plug-ins have been
    developed, making it possible to launch
    cross-middleware tests with the two specified
    grid systems.

17
  • Thanks for your attention!