Beyond Workflows - DOE Cloud Computing Paradigm and the SDM Role and Future

Slides: 17
Provided by: your182
Learn more at: https://sdm.lbl.gov

1
Beyond Workflows - DOE Cloud Computing Paradigm
and the SDM Role and Future
  • Mladen A. Vouk, Nagiza Samatova, Paul Breimyer,
    Pierre Mouallem, Mei Nagappan, and the whole SPA
    team (list available separately)
  • Scientific Data Management Center, Scientific
    Process Automation Group
  • NC State University, Raleigh, NC 27695

2
Overview
  • Scientific workflow technology: a success story
    from the past 7 years in the SDM center (a
    technology used in production or otherwise by
    application people). Developed components:
    workflows, provenance, dashboard, other.
  • DOE SDM Cloud: vision for the future of the SDM
    center. Integration of components: intelligent
    analytics and social networks, component-based
    cloud, integrated services (service-oriented
    architecture).
  • Sustainable science: long-term approach for the
    survival of SDM center technology (beyond SciDAC
    and longer). Integration of research,
    engineering, transfer of technology,
    partnerships, results (ROI, TOC).

3
Scientific Process Automation
  • A key differentiating element of a successful
    information technology (IT) is its ability to
    become a true, valuable, and economical
    contributor to cyberinfrastructure.
  • An IT-assisted workflow represents a series of
    structured activities and computations that arise
    in information-assisted problem solving.
  • Scientific process automation principles, as well
    as production-level pilots, are SDM's key
    contribution over the last 7 years (Smoky
    Mountains retreat).
  • From NC State: numerous publications, 3 graduated
    PhD and 4 MS-with-thesis students, several in
    progress, several generations of software.

4
Environment
[Figure: Analytics; Computations; Control Panels (Dashboard), Display; Networking, Local/Remote Cloud Services; Orchestration (Kepler); Data, Databases, Provenance, Storage]
5
Workflow Framework
Control Plane (light data flows)
Provenance, Tracking, and Meta-Data (DBs and Portals)
Kepler
Execution Plane (heavy lifting: computations and flows)
Synchronous or Asynchronous
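
The split between the two planes can be illustrated with a minimal Python sketch. It assumes nothing about Kepler itself: `heavy_step`, `run_workflow`, and the record fields are hypothetical names chosen only for illustration. The control plane passes small descriptors and keeps provenance records, while the execution plane runs the actual computation, either synchronously or asynchronously through a thread pool.

```python
from concurrent.futures import ThreadPoolExecutor


def heavy_step(n):
    """Execution plane: stands in for a heavy computation (hypothetical)."""
    return sum(i * i for i in range(n))


def run_workflow(sizes, asynchronous=True):
    """Control plane: dispatch steps and keep light provenance records."""
    if asynchronous:
        # Submit every step first, then wait for results in submission order.
        with ThreadPoolExecutor(max_workers=4) as pool:
            futures = [pool.submit(heavy_step, n) for n in sizes]
            results = [f.result() for f in futures]
    else:
        # Run each step to completion before starting the next one.
        results = [heavy_step(n) for n in sizes]

    # Only small meta-data records travel on the control plane.
    return [
        {"step": "heavy_step", "input": n, "output": value}
        for n, value in zip(sizes, results)
    ]


if __name__ == "__main__":
    for record in run_workflow([10, 100, 1000]):
        print(record)
```

Both modes produce the same provenance records; only the scheduling of the heavy work differs.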
6
Actor/Process in a Broader Sense
Out
In
Network/Cloud
bsub < code_run
where code_run is a script:
    #!/bin/csh
    #BSUB -W 5
    #BSUB -n 100
    #BSUB -o /share/vouk/WFLOW/code.out.%J
    #BSUB -e /share/vouk/WFLOW/code.err.%J
    #BSUB -J codevouk
    source /usr/local/lsf/conf/cshrc.lsf
    mpiexec ./code
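
A workflow actor that wraps such a batch job mostly has to produce the script text and hand it to the scheduler. The following is a minimal sketch under stated assumptions: `lsf_script` and `submit` are hypothetical helper names, the `%J` job-ID placeholder is standard LSF syntax and is assumed to be what the `.J` suffix on the slide stood for, and `submit` only works on a host where the `bsub` command is installed.

```python
import subprocess


def lsf_script(command, cores, wall_minutes, job_name, out_dir):
    """Return the text of an LSF batch script for one actor invocation."""
    lines = [
        "#!/bin/csh",
        f"#BSUB -W {wall_minutes}",          # wall-clock limit in minutes
        f"#BSUB -n {cores}",                 # number of cores requested
        f"#BSUB -o {out_dir}/code.out.%J",   # stdout file, %J = job ID
        f"#BSUB -e {out_dir}/code.err.%J",   # stderr file
        f"#BSUB -J {job_name}",              # job name
        "source /usr/local/lsf/conf/cshrc.lsf",
        command,
    ]
    return "\n".join(lines) + "\n"


def submit(script_text):
    """Equivalent of `bsub < code_run`: feed the script to bsub on stdin."""
    return subprocess.run(
        ["bsub"], input=script_text, text=True, capture_output=True
    )


if __name__ == "__main__":
    print(lsf_script("mpiexec ./code", 100, 5, "codevouk", "/share/vouk/WFLOW"))
```

Generating the script as text keeps the actor testable without a scheduler; only `submit` touches the cluster.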
7
Modular Framework
Trust
Storage
Supercomputers Analytics Nodes
Kepler
Data Store
Access
Rec API
Disp API
Dash
Management API
Orchestration
Meta-Data about Processes, Data, Workflows, System, Apps, Environment
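
One way to picture the meta-data store behind these APIs is a small in-memory sketch. It assumes that "Rec API" means a recording interface and "Disp API" a display interface, which the slide does not spell out; the class `MetaDataStore` and its methods are hypothetical names, not part of Kepler or any SDM software.

```python
import time


class MetaDataStore:
    """In-memory stand-in for the shared meta-data store (hypothetical)."""

    def __init__(self):
        self._records = []

    def record(self, kind, name, **details):
        """Recording side: append one meta-data entry and return it."""
        entry = {"kind": kind, "name": name, "time": time.time(), **details}
        self._records.append(entry)
        return entry

    def display(self, kind=None):
        """Display side: return entries, optionally filtered by kind."""
        return [r for r in self._records if kind is None or r["kind"] == kind]


if __name__ == "__main__":
    store = MetaDataStore()
    store.record("process", "code_run", cores=100)
    store.record("data", "code.out")
    for entry in store.display("process"):
        print(entry)
```

A dashboard would read through `display`, while the orchestration layer would write through `record`, so neither needs to know about the other.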
8
Read More
  • Singh M.P. and M.A. Vouk, "Network Computing," in
    John G. Webster (editor), Encyclopedia of
    Electrical and Electronics Engineering, John
    Wiley & Sons, New York, Vol. 14, pp. 114-132,
    1999
  • S. Klasky, M. Beck, V. Bhat, E. Feibush, B. Ludäscher,
    M. Parashar, A. Shoshani, D. Silver and M. Vouk,
    "Data management on the fusion computational
    pipeline," SciDAC 2005, Journal of Physics:
    Conference Series 16 (2005), 510-520,
    doi:10.1088/1742-6596/16/1/070
  • Ilkay Altintas, Oscar Barney, Zhengang Cheng,
    Terence Critchlow, Bertram Ludaescher, Steve
    Parker, Arie Shoshani and Mladen Vouk,
    "Accelerating the scientific exploration process
    with scientific workflows," SciDAC 2006, Journal
    of Physics: Conference Series 46 (2006), 468-478,
    doi:10.1088/1742-6596/46/1/065
  • M. A. Vouk, I. Altintas, R. Barreto, J. Blondin,
    Z. Cheng, T. Critchlow, A. Khan, S. Klasky, J.
    Ligon, B. Ludaescher, P. A. Mouallem, S. Parker,
    N. Podhorszki, A. Shoshani, C. Silva,
    "Automation of Network-Based Scientific
    Workflows," Proc. of the IFIP WoCo 9 on
    Grid-based Problem Solving Environments:
    Implications for Development and Deployment of
    Numerical Software, IFIP WG 2.5 on Numerical
    Software, Prescott, AZ, 2006, printed in IFIP,
    Vol 239, "Grid-Based Problem Solving
    Environments," eds. Gaffney PW and Pool JCT
    (Boston: Springer), pp. 35-61, 2007
  • Klasky, S., Barreto, R., Kahn, A., Parashar, M.,
    Podhorszki, N., Parker, S., Silver, D., Vouk,
    M.A., "Collaborative visualization spaces for
    petascale simulations," Proceedings of the CTS
    2008 - International Symposium on Collaborative
    Technologies and Systems, pp. 203-211,
    doi:10.1109/CTS.2008.4543933, 10-23
    May 2008
  • More: http://sdm.ncsu.edu

9
DOE Cloud
  • Cloud computing builds on decades of research
    in virtualization, distributed computing, utility
    computing, grids, and more recently networking,
    web and software services.
  • It implies a seamless, service-oriented, and
    component-based architecture: an integrated and
    orchestrated suite of on-demand functions
    delivered to an end-user through composition of
    both loosely and tightly coupled functions or
    services, often network-based.
  • Characteristics: reduced information technology
    overhead for the end-user, service orchestration,
    virtualization of resources, great flexibility,
    reduced total cost of ownership. It comes in
    different flavors.
  • Intelligent Analytics and Knowledge-Creating
    Social Networks, Component-based Clouds,
    Seamless/Integrated Services
  • Necessary in the context of peta- and exa-scale
    science, data, etc.

10
"Analytics Cloud"
Knowledge creation: Integration, Social Networking,
Provenance, Tracking, and Meta-Data (DBs and Portals)
Workflow control plane
Concept-driven Analytics
W/F Engine
W/F Generation Wizard
Synchronous and Asynchronous Services
Run-time Manager and Scheduler
Execution Plane - heavy-duty in-cloud Computations,
Flows, and Services
Analytics-Enabled Resources
Supercomputers
Clusters
Active Storage
Other cloud devices
11
Components
  • Reusability (elements can be re-used in other
    workflows)
  • Substitutability (alternative implementations are
    easy to insert, very precisely specified
    interfaces are available, run-time component
    replacement mechanisms exist, there is ability to
    verify and validate substitutions, etc.)
  • Extensibility and scalability (ability to readily
    extend the system component pool and to scale it,
    increase capabilities of individual components,
    have an extensible and scalable architecture that
    can automatically discover new functionalities
    and resources, etc.)
  • Customizability (ability to customize generic
    features to the needs of a particular scientific
    domain and problem)
  • Composability (easy construction of more complex
    functional solutions using basic components,
    reasoning about such compositions, etc.)
  • Other characteristics are also very important:
  • Reliability and availability of the components
    and services
  • Cost: the cost of the services, total cost of
    ownership, economy of scale
  • Security and privacy, and so on.
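
Several of these properties can be shown together in a small Python sketch. It is an illustration under stated assumptions, not the SDM implementation: the `Component` interface, the `Scale`, `Sort`, and `ReverseSort` components, and the `Pipeline.substitute` method are all hypothetical names. The interface stands for the "precisely specified interfaces", `substitute` for run-time replacement with validation, and nesting one pipeline inside another for composability.

```python
from typing import Callable, List, Protocol


class Component(Protocol):
    """Interface every component must satisfy (hypothetical)."""

    def run(self, data: List[float]) -> List[float]: ...


class Scale:
    """Reusable component: multiply every value by a fixed factor."""

    def __init__(self, factor: float):
        self.factor = factor

    def run(self, data):
        return [x * self.factor for x in data]


class Sort:
    def run(self, data):
        return sorted(data)


class ReverseSort:
    """Alternative implementation that can replace Sort at run time."""

    def run(self, data):
        return sorted(data, reverse=True)


class Pipeline:
    """Composability: a pipeline of components is itself a component."""

    def __init__(self, *stages: Component):
        self.stages = list(stages)

    def run(self, data):
        for stage in self.stages:
            data = stage.run(data)
        return data

    def substitute(self, index: int, replacement: Component,
                   check: Callable[[Component], bool]) -> None:
        """Swap one stage, but only if the replacement passes validation."""
        if not check(replacement):
            raise ValueError("replacement failed validation")
        self.stages[index] = replacement


if __name__ == "__main__":
    inner = Pipeline(Scale(2.0), Sort())
    print(inner.run([3, 1, 2]))
    inner.substitute(1, ReverseSort(), lambda c: len(c.run([1, 2])) == 2)
    print(Pipeline(inner, Scale(0.5)).run([3, 1, 2]))
```

The validation callback is deliberately simple here; a real system would check the replacement against the interface contract and representative inputs before accepting it.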

12
Example Meta-Data Framework
Storage
Supercomputers Analytics
Kepler?
Other. ..
Dash
Custom Web
Orchestration
13
Fault-Tolerance: Clouds of Clouds
Master DB (replicated)
14
User Categories
  • Developers (10)
  • Service Authors (100 to 1,000)
  • Service Integrators (100 to 10,000)
  • End-users (1,000 - ?)

15
Read More
  • Sam Averitt, Michael Bugaev, Aaron Peeler, Henry
    Shaffer, Eric Sills, Sarah Stein, Josh Thompson,
    Mladen Vouk, "Virtual Computing Laboratory (VCL),"
    in the Proceedings of the International
    Conference on Virtual Computing Initiative, May
    7-8, 2007, IBM Corp., Research Triangle Park, NC,
    pp. 1-16.
  • Mladen Vouk, Sam Averitt, Michael Bugaev, Andy
    Kurth, Aaron Peeler, Andy Rindos, Henry Shaffer,
    Eric Sills, Sarah Stein, Josh Thompson,
    "Powered by VCL - Using Virtual Computing
    Laboratory (VCL) Technology to Power Cloud
    Computing," in the Prelim. Proceedings
    of the 2nd International Conference on Virtual
    Computing Initiative, 15-16 May 2008, RTP, NC,
    pp. 1-10, final version to be available through
    the ACM Digital Library
  • Mladen A. Vouk, "Cloud Computing - Issues,
    Research and Implementations," ITI08, to appear
    in IEEE Digital Library
  • Google for "cloud computing"
  • Other ...

16
Sustainable Science
  • A long-term approach for the survival of SDM
    center technology (beyond SciDAC and longer)
  • Research
  • Engineering
  • Transfer of technology
  • Partnerships with scientists
  • Operational open-source tools
  • Visible results (agreed-upon ROI, and an
    accounting of TOC)