Using HTC grid infrastructures: practical experiences from the eMinerals project

1
Using HTC grid infrastructures: practical experiences from the eMinerals project
Mark Calleja (proxy for Martin Dove)
University of Cambridge
2
Our view of eScience
Computing grids
Collaborative grids
Data grids
3
Science beyond the lab book
  • Management of too many tasks
  • Management of the resultant data deluge
  • Sharing the information content with
    collaborators
  • Maintaining accuracy and verification

4
Rock-salt structure of BaCO3
Note disordered positions of oxygen atoms
5
BaCO3 lattice parameters
Molecular dynamics simulations on the NGS
6
Usable HTC grid tools
  • Easy-to-use tools
  • Easy access to resources and data
  • Enabling me to achieve much more than before

Can I run my jobs before breakfast?
7
Useful tools for HTC grids
  • Use standard tools and interfaces, e.g. Globus,
    Condor
  • Heterogeneous resources for heterogeneous
    applications
  • Metascheduling
  • Integrated data grid
  • Give as much control as possible to the user
  • The key is in the user interface

8
Globus is used (a) to provide user authentication via digital certificates and (b) as job submission middleware
Our data grid is based on the San Diego Storage Resource Broker (SRB)
The application server provides databases and server capabilities for the SRB, metadata tools, and the job submission tool
Researcher
9
Job submission process
  • Central role of the data grid for data staging and
    data archiving
  • Desktop job submission
  • Automatic metadata collection
  • Wrapped up in our RMCS tool

10
Researcher
4. Job runs on grid compute resources
Application server
11
RMCS input file
Executable = ossia2004
pathToExe = /home/bob.eminerals/OSSIA2004
preferredMachineList = lv1.nw-grid.ac.uk-serial dl1.nw-grid.ac.uk-serial
jobType = performance
numOfProcs = 1
Output = trans.out
Sdir = /home/bob.eminerals/RMCSdemo
Sget =
Sput =
GetEnvMetadata = true
RDesc = Test sweep of temperature using ossia
RDatasetID = 263
AgentXdefault = trans.xml
AgentX = Energy,trans.xmlPropertyList.Propertytitle'Energy'.value
AgentX = OrderParameter,trans.xmlModule.Propertytitle'Order parameter'.value
AgentX = HeatCapacity,trans.xmlModule.Propertytitle'Heat capacity'.value
AgentX = Susceptibility,trans.xmlModule.Propertytitle'Susceptibility'.value
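The input file is a flat list of directives, some of which (such as AgentX) may appear more than once. A minimal sketch of reading such a file in Python, assuming one `key = value` directive per line in the style of a Condor submit file; the function name `parse_rmcs_input`, the `#` comment syntax and the parsing rules are illustrative assumptions, not the RMCS implementation:

```python
from collections import defaultdict


def parse_rmcs_input(text):
    """Parse 'key = value' lines into a dict mapping each key to a list of values."""
    directives = defaultdict(list)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments (comment syntax is assumed)
        # Split on the first '=' only, so values may themselves contain '='.
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a 'key = value' line: {raw!r}")
        directives[key.strip()].append(value.strip())
    return dict(directives)


# A few directives taken from the slide; the rest are omitted for brevity.
SAMPLE = """\
Executable = ossia2004
numOfProcs = 1
Output = trans.out
RDatasetID = 263
AgentXdefault = trans.xml
"""

config = parse_rmcs_input(SAMPLE)
print(config["Executable"])  # ['ossia2004']
```

Every key maps to a list so that repeated directives keep all of their values in file order.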
12
RMCS architecture
Client layer: shell tools, GUI
Server layer: API, database, job control
Grid resources for computing and data
13
RMCS shell interface
RMCS shell commands interact with the RMCS server via web services, which removes the need for complicated middleware installation and is firewall-friendly.
Examples of commands:
  • rmcs_submit: submit a job
  • rmcs_status: how is the job doing?
  • rmcs_cancel: kill the job
  • rmcs_remove: remove from status listing

14
RMCS GUI interface
15
Parameter sweeps
We have Perl programs that
  • implement bulk file upload to the SRB or other
    data grid
  • generate a set of RMCS input files
  • submit all the RMCS jobs

Bulk job creation and submission is a one-command
procedure
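The project's own sweep tools are Perl programs. As a rough illustration of the second step only (generating a set of input files), here is a Python sketch. It assumes a hypothetical layout in which each sweep point gets its own SRB directory and the jobs differ only in `Sdir`; the template, the `T_<value>` directory naming and the `.rmcs` file suffix are all assumptions made for this example:

```python
import tempfile
from pathlib import Path
from string import Template

# Hypothetical template: only Sdir differs between the jobs in the sweep.
TEMPLATE = Template(
    "Executable = ossia2004\n"
    "Output = trans.out\n"
    "Sdir = $base/T_$label\n"
    "RDesc = Test sweep of temperature using ossia\n"
)


def write_sweep(temperatures, base, out_dir):
    """Write one input file per temperature and return the paths in order."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for t in temperatures:
        label = f"{t:.2f}"  # fixed-width label keeps file names sortable
        path = out_dir / f"sweep_T_{label}.rmcs"
        path.write_text(TEMPLATE.substitute(base=base, label=label))
        paths.append(path)
    return paths


with tempfile.TemporaryDirectory() as tmp:
    files = write_sweep([0.10, 0.20, 0.30], "/home/bob.eminerals/RMCSdemo", tmp)
    print([p.name for p in files])
    first_job = files[0].read_text()
```

Each generated file would then be handed to rmcs_submit in a loop, which is what makes bulk submission a single command from the user's point of view.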
16
Data and information
?
17
Data representation: XML
Chemical Markup Language
<?xml version="1.0" encoding="UTF-8"?>
<cml convention="FoX_wcml-2.0" fileId="cis1.cml" version="2.4" xmlns="http://www.xml-cml.org/schema">
  <metadataList name="Metadata">
    <metadata name="Code name" content="ossia"/>
    <metadata name="Code version date" content="January 8, 2007, v2007.3"/>
    ...
  </metadataList>
  <module title="Initial System" dictRef="emin:initialModule">
    <parameterList>
      <parameter dictRef="ossia:temperature" name="Temperature">
        <scalar dataType="xsd:double" units="cmlUnits:eV">1.000000000000e-1</scalar>
      </parameter>
      <parameter dictRef="ossia:NumberOfSteps" name="Number of steps">
        <scalar dataType="xsd:integer" units="units:countable">10000000</scalar>
      </parameter>
      ...
    </parameterList>
  </module>
  ...
  <module title="Finalization" dictRef="emin:finalModule">
    <propertyList>
      <property dictRef="ossia:Energy" title="Energy">
        <scalar dataType="xsd:double" units="cmlUnits:eV">2.052516362912e-1</scalar>
      </property>
      ...
    </propertyList>
  </module>
</cml>
Capturing audit metadata
Capturing initial parameters
Capturing computed properties
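The three kinds of captured information (audit metadata, initial parameters, computed properties) can each be read back with a standard XML parser. A minimal sketch using Python's `xml.etree.ElementTree`, run against a cut-down copy of the CML document on this slide; the helper name `read_scalars` is an assumption made for this example and is not one of the project's tools:

```python
import xml.etree.ElementTree as ET

CML_NS = {"cml": "http://www.xml-cml.org/schema"}

# Cut-down copy of the slide's document: one item of each kind.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<cml xmlns="http://www.xml-cml.org/schema">
  <metadataList name="Metadata">
    <metadata name="Code name" content="ossia"/>
  </metadataList>
  <module title="Initial System">
    <parameterList>
      <parameter name="Temperature">
        <scalar dataType="xsd:double" units="cmlUnits:eV">1.000000000000e-1</scalar>
      </parameter>
    </parameterList>
  </module>
  <module title="Finalization">
    <propertyList>
      <property title="Energy">
        <scalar dataType="xsd:double" units="cmlUnits:eV">2.052516362912e-1</scalar>
      </property>
    </propertyList>
  </module>
</cml>
"""


def read_scalars(root, container, item, label_attr):
    """Map each item's label to (value, units) taken from its child <scalar>."""
    result = {}
    for node in root.iterfind(f".//cml:{container}/cml:{item}", CML_NS):
        scalar = node.find("cml:scalar", CML_NS)
        if scalar is not None:
            result[node.get(label_attr)] = (float(scalar.text), scalar.get("units"))
    return result


root = ET.fromstring(SAMPLE.encode("utf-8"))
metadata = {m.get("name"): m.get("content")
            for m in root.iterfind(".//cml:metadata", CML_NS)}
parameters = read_scalars(root, "parameterList", "parameter", "name")
properties = read_scalars(root, "propertyList", "property", "title")
print(metadata, parameters, properties)
```

Because every value carries its own name and units, the reader needs no knowledge of the simulation code's output layout.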
18
XML and Fortran
  • Most of our simulation codes are written in
    Fortran, which has little support for XML
  • Thus we have written a set of XML libraries for
    Fortran, called FoX, to make writing XML easy
  • We have XML-ised a number of simulation codes,
    including SIESTA, CASTEP, DL_POLY and GULP
  • We have also developed an XML-aware interface to
    the SRB called TobysSRB
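FoX is a Fortran library and its API is not shown on these slides. To illustrate the idea of a simulation code emitting a self-describing property through a small helper instead of printing a bare number, here is a rough Python analogue built on `xml.etree.ElementTree`; the helper `add_property` is hypothetical and does not mirror FoX's actual interface:

```python
import xml.etree.ElementTree as ET

CML = "http://www.xml-cml.org/schema"


def add_property(parent, title, value, units):
    """Append <property title=...><scalar units=...>value</scalar></property>."""
    prop = ET.SubElement(parent, f"{{{CML}}}property", {"title": title})
    scalar = ET.SubElement(
        prop, f"{{{CML}}}scalar", {"dataType": "xsd:double", "units": units}
    )
    scalar.text = f"{value:.12e}"  # Python's exponent format, not Fortran's
    return prop


# Serialise CML elements under a default namespace rather than an ns0: prefix.
ET.register_namespace("", CML)

root = ET.Element(f"{{{CML}}}cml")
module = ET.SubElement(root, f"{{{CML}}}module", {"title": "Finalization"})
property_list = ET.SubElement(module, f"{{{CML}}}propertyList")
add_property(property_list, "Energy", 0.2052516362912, "cmlUnits:eV")

document = ET.tostring(root, encoding="unicode")
print(document)
```

The calling code states only what the quantity is, its value and its units; the helper takes care of the markup.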

19
What XML gives us
  • Simulation code output that is self-describing
    (no more mere lists of numbers!)
  • Data files can be transformed to give
    user-centric and information-centric
    representations, including plotted data
  • Easy to extract key information,
    essential for large combinatorial studies
  • Enables automatic capture of metadata, and
    metadata is essential for managing data

20
XML → metadata
  • RMCS automatically harvests metadata from our
    output XML files
  • We have developed a new set of tools to access
    the metadata database (RCommands)
  • We use metadata for locating data and datasets
    created by our colleagues
  • We also use metadata for extracting core
    information from data, useful for analysing
    combinatorial studies

21
RCommands and metadata
Metadata are associated with a hierarchy of
studies, datasets and data objects, both as
descriptions and as name/value pairs.
Examples of commands:
  • Rls: list metadata items
  • Rget: get metadata
  • Rannotate: add metadata
  • Rgem: extract metadata from all data objects
    within a dataset
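The hierarchy of studies, datasets and data objects, each carrying a description and name/value pairs, can be pictured with a small in-memory model. The Python sketch below is an illustration only: the `Node` class and `gather` function are hypothetical and say nothing about how the RCommands or the metadata database are implemented. The first data object uses the temperature and energy values from the CML slide; the second is a made-up sweep point with no energy recorded yet.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A study, dataset or data object with a description and name/value pairs."""
    name: str
    description: str = ""
    metadata: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

    def annotate(self, key, value):
        """Attach one name/value pair (the role Rannotate plays)."""
        self.metadata[key] = value

    def add(self, child):
        """Attach a child node and return it for further use."""
        self.children.append(child)
        return child


def gather(dataset, keys):
    """Collect chosen metadata from every data object in a dataset (cf. Rgem)."""
    return [
        (obj.name, *[obj.metadata.get(k) for k in keys])
        for obj in dataset.children
    ]


study = Node("study", "Hypothetical study")
dataset = study.add(Node("sweep", "Test sweep of temperature using ossia"))

run1 = dataset.add(Node("run_1"))
run1.annotate("Temperature", 0.1)
run1.annotate("Energy", 0.2052516362912)

run2 = dataset.add(Node("run_2"))  # hypothetical point, energy not yet harvested
run2.annotate("Temperature", 0.2)

table = gather(dataset, ["Temperature", "Energy"])
print(table)
```

Gathering the same keys across a whole dataset yields one row per job, which is the form needed to analyse a combinatorial study.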

22
Researcher A
Researcher B
23
Summary
  • eMinerals toolset empowers the scientist users in
    their use of HTC grid resources
  • Tools work from our personal computers with easy
    installation
  • Integrates compute, data and collaborative
    components

24
Credits
Cambridge: Kat Austen, Richard Bruin, Mark Calleja, Gen-Tau Chiang, Ian Frame, Peter Murray-Rust, Toby White, Andrew Walker
STFC: Kerstin Kleese van Dam, Phil Couch, Tom Mortimer-Jones, Rik Tyer
Bath: Corrine Arrouvel, Arnaud Marmier, Steve Parker
Funded by NERC