Title: A JSDL Application Repository Portal for Heterogeneous Grids and the NGS
1. A JSDL Application Repository Portal for Heterogeneous Grids and the NGS
- David Meredith
- NGS Operations,
- e-Science Centre,
- Daresbury Laboratory, UK
- d.j.meredith_at_dl.ac.uk
2. NGS Applications Repository Portal/Portlet: Core Functionality
- A JSDL Repository
- Search/browse for JSDL (personal and shared) by category of interest (e.g. bioinformatics, chemistry, tutorials/examples). Select, load, modify, save.
- JSDL documents can be pre-configured and published by domain experts / resource administrators (users benefit from sharing expertise, artefacts and configuration captured in JSDL).
- Community formation around a best practice approach (OGF).
- JSDL GUI Editor for authoring, validating, sharing, uploading app descriptions.
- Grid Operations: File Staging, Application Submission, Monitoring (run either out-of-the-box, or modify/tweak as required).
- Generic: designed to be extensible, can extend to support different Grid middleware technologies and data staging protocols.
3. JSDL
Ali Anjomshoaa, Fred Brisard, Michel Drescher, Donal K. Fellows, William Lee, An Ly, Steve McGough, Darren Pulsipher, Andreas Savva, Chris Smith
- JSDL 1.0 is an OGF recommendation.
- JSDL 1.0 is published as GFD-R-P.56: http://www.ggf.org/gf/docs/?final
- An XML Schema language for describing the requirements of computational jobs for submission to Grids.
- Is agnostic of middleware: no dependencies on Globus, WSRF, gLite (means the portal can be generic and not tied to any particular set of Grid technologies).
- GGF / OGF Standard.
- JSDL documents can be validated against the JSDL and JSDL POSIX XSD schemas to ensure their correctness.
<jsdl:Application>
  <jsdl:ApplicationName>gnuplot</jsdl:ApplicationName>
  <jsdl-posix:POSIXApplication>
    <jsdl-posix:Executable>
      /usr/local/bin/gnuplot
    </jsdl-posix:Executable>
    <jsdl-posix:Argument>control.txt</jsdl-posix:Argument>
    <jsdl-posix:Input>input.dat</jsdl-posix:Input>
    <jsdl-posix:Output>output1.png</jsdl-posix:Output>
  </jsdl-posix:POSIXApplication>
</jsdl:Application>
<jsdl:Resources>
...
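To illustrate how a portal might read such a description, here is a minimal Python sketch that parses the gnuplot excerpt and pulls out the application name, executable and I/O settings. The namespace URIs are those of the JSDL 1.0 specification; the enclosing `JobDefinition`/`JobDescription` wrapper, the function name `summarise_application` and the use of Python itself are assumptions made for illustration, not part of the portal.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as defined by the JSDL 1.0 specification (GFD-R-P.56).
NS = {
    "jsdl": "http://schemas.ggf.org/jsdl/2005/11/jsdl",
    "posix": "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix",
}

# The gnuplot excerpt from the slide, wrapped in the enclosing elements
# (assumed here) so that it forms a complete, parseable document.
JSDL_DOC = """
<jsdl:JobDefinition
    xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl"
    xmlns:jsdl-posix="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix">
  <jsdl:JobDescription>
    <jsdl:Application>
      <jsdl:ApplicationName>gnuplot</jsdl:ApplicationName>
      <jsdl-posix:POSIXApplication>
        <jsdl-posix:Executable>
          /usr/local/bin/gnuplot
        </jsdl-posix:Executable>
        <jsdl-posix:Argument>control.txt</jsdl-posix:Argument>
        <jsdl-posix:Input>input.dat</jsdl-posix:Input>
        <jsdl-posix:Output>output1.png</jsdl-posix:Output>
      </jsdl-posix:POSIXApplication>
    </jsdl:Application>
  </jsdl:JobDescription>
</jsdl:JobDefinition>
"""


def summarise_application(jsdl_text):
    """Return the main POSIXApplication settings of a JSDL document as a dict."""
    root = ET.fromstring(jsdl_text)
    app = root.find(".//jsdl:Application", NS)
    posix = app.find("posix:POSIXApplication", NS)

    def text_of(parent, path):
        # Missing optional elements are reported as None.
        node = parent.find(path, NS)
        return node.text.strip() if node is not None and node.text else None

    return {
        "name": text_of(app, "jsdl:ApplicationName"),
        "executable": text_of(posix, "posix:Executable"),
        "arguments": [a.text.strip() for a in posix.findall("posix:Argument", NS)],
        "stdin": text_of(posix, "posix:Input"),
        "stdout": text_of(posix, "posix:Output"),
        "stderr": text_of(posix, "posix:Error"),
    }


summary = summarise_application(JSDL_DOC)
print(summary)
```

Full validation against the JSDL and JSDL-POSIX XSD files would need a schema-aware XML library and local copies of the schemas, so it is left out of this sketch.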
4. Grid Heterogeneity
- Different middleware adopt different formats for describing applications and their associated resources (JDL, RSL), and for their subsequent execution on a Grid.
- A number of different data storage resources are also relevant for the management and transfer of data, e.g. GsiFTP, SRB, SRM, WebDAV, (S)FTP.
5. Grid A: Globus RSL (Resource Specification Language)
  (executable=$(GLOBUSRUN_GASS_URL)/home/ngs0153/cpi)
  (arguments=30 fileA)
  (jobType=mpi)
  (environment=(NGSMODULES mpich-gm/1.2.5..10-intel8.1:intel/fce/9.1.032)
               (TMP /tmp))
  (count=4)
  (hostCount=8)
  (minMemory=512)
  (maxWallTime=3)
  (directory=/home/ngs0153)
  (stdin=/home/ngs0153/cpi.in)
  (stdout=/home/ngs0153/cpi.out)
  (stderr=/home/ngs0153/cpi.err)
Grid B: gLite JDL (Job Description Language)
  Type = "Job";
  JobType = "Normal";
  RetryCount = 3;
  Executable = "/home/ngs0153/cpi";
  Arguments = "30 fileA";
  VirtualOrganisation = "myGridVOproject";
  StdInput = "cpi.in";
  StdOutput = "cpi.out";
  StdError = "cpi.err";
  InputSandbox = {"gsiftp://grid-data.rl.ac.uk:2811/home/ngs0153/cpi",
                  "gsiftp://grid-data2.dl.ac.uk:2811/myhome/fileA"};
  InputSandboxDestFileName = {"cpi", "fileA"};
  OutputSandbox = {"cpi.out"};
  OutputSandboxDestURI = {"gsiftp://mygridhome.dl.ac.uk:2811/myhome"};
  DeleteOnTermination = {"fileA"};
  Environment = {"NGSMODULES=mpich-gm/1.2.5..10-intel8.1:intel/fce/9.1.032",
                 "TMP=/tmp"};
  Requirements = ( other.GlueCEInfoLRMSType == "PBS" )
      && ( member( GlueCEInfoHostName,
                   {"grid-data.rl.ac.uk:2119", "mygrid-resource.dl.ac.uk:2119"} ) )
      && ( GlueHostProcessorModel == "Intel" );
  Rank = -other.GlueCEStateEstimatedResponseTime;
6. Catering for Grid Heterogeneity
- Middleware-specific dependencies are added at run time: the JSDL is converted into the middleware-specific format (e.g. RSL).
- Add middleware-specific parameters, e.g. RSL JobType (cater for this in JSDL using XML Schema extensions in place of the <xsd:any> placeholder elements).
- The portal database has to accommodate all middleware variations.
GT2 RSL extension XML schema
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns="http://www.ggf.org/namespaces/2004/11/jsdl-rsl-1.0.xsd"
    targetNamespace="http://www.ggf.org/namespaces/2004/11/jsdl-rsl-1.0.xsd"
    elementFormDefault="qualified">
  <xsd:element name="jobType" type="jobType"/>
  <xsd:element name="gramMyJob" type="gramMyJob"/>
  <xsd:element name="dryRun" type="boolean" default="no"/>
  <xsd:element name="save_state" type="boolean" default="no"/>
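The run-time conversion described on this slide can be sketched as follows: a small Python function reads the JSDL-POSIX elements and emits a GT2-style RSL string, with the middleware-specific `jobType` passed in separately. This is only an illustration under stated assumptions: the function name `jsdl_to_rsl`, the element-to-attribute mapping and the lack of RSL quoting for values containing special characters are choices made here, not the portal's actual converter.

```python
import xml.etree.ElementTree as ET

NS = {
    "jsdl": "http://schemas.ggf.org/jsdl/2005/11/jsdl",
    "posix": "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix",
}

# A JSDL fragment describing the same "cpi" job as the RSL on slide 5.
JSDL_APP = """
<jsdl:Application
    xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl"
    xmlns:posix="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix">
  <posix:POSIXApplication>
    <posix:Executable>/home/ngs0153/cpi</posix:Executable>
    <posix:Argument>30</posix:Argument>
    <posix:Argument>fileA</posix:Argument>
    <posix:Input>cpi.in</posix:Input>
    <posix:Output>cpi.out</posix:Output>
    <posix:Error>cpi.err</posix:Error>
    <posix:WorkingDirectory>/home/ngs0153</posix:WorkingDirectory>
    <posix:Environment name="TMP">/tmp</posix:Environment>
  </posix:POSIXApplication>
</jsdl:Application>
"""


def jsdl_to_rsl(jsdl_text, job_type=None):
    """Translate a JSDL POSIXApplication into a GT2-style RSL string (sketch)."""
    root = ET.fromstring(jsdl_text)
    posix = root.find(".//posix:POSIXApplication", NS)
    if posix is None:
        raise ValueError("no POSIXApplication element found")

    def text_of(path):
        node = posix.find(path, NS)
        return node.text.strip() if node is not None and node.text else None

    relations = [("executable", text_of("posix:Executable"))]

    # RSL takes the arguments as one space-separated value.
    args = [a.text.strip() for a in posix.findall("posix:Argument", NS) if a.text]
    if args:
        relations.append(("arguments", " ".join(args)))

    # Each environment variable becomes a (NAME value) pair.
    env = [(e.get("name"), (e.text or "").strip())
           for e in posix.findall("posix:Environment", NS)]
    if env:
        relations.append(("environment", "".join(f"({n} {v})" for n, v in env)))

    # Assumed one-to-one mapping of the remaining POSIX elements.
    for rsl_name, path in (("directory", "posix:WorkingDirectory"),
                           ("stdin", "posix:Input"),
                           ("stdout", "posix:Output"),
                           ("stderr", "posix:Error")):
        relations.append((rsl_name, text_of(path)))

    # Middleware-specific parameter with no JSDL-POSIX equivalent.
    if job_type:
        relations.append(("jobType", job_type))

    return "&" + "".join(f"({k}={v})" for k, v in relations if v)


rsl = jsdl_to_rsl(JSDL_APP, job_type="mpi")
print(rsl)
```

Keeping `jobType` as a separate parameter mirrors the idea above that such values live in a schema extension rather than in core JSDL, so a JDL converter could ignore it.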
7. Applications Repository
The portal is open: public JSDL documents can be browsed without logging in, and the JSDL editor is free to use. Login is required to browse personal applications, save and submit jobs, and interact with Grid resources. List jobs, read job descriptions and load a job to initialise the Active Job. Changes to the parameters in the GUI update and validate the JSDL template automatically.
8. My Job Detail
- Input fields are pre-configured / filled out.
- Fields are taken from the JSDL and JSDL-POSIX extension schemas.
- POSIXApplication is a JSDL extension. It defines standard POSIX elements:
- stdin, stdout, stderr
- Working directory
- Command line arguments
- Environment variables
<POSIXApplication>
  <Executable ... />
  <Input ... />?
  <Output ... />?
  <Error ... />?
  <WorkingDirectory ... />?
</POSIXApplication>
9. Environment Variables
<jsdl1:Environment name="TMP">/tmp</jsdl1:Environment>
<jsdl1:Environment name="NGSMODULES">envVarValue1</jsdl1:Environment>
..
10. Command Line Arguments
Paste and parse command line arguments (space and/or line separated values).
<jsdl1:Argument>fasta34</jsdl1:Argument>
<jsdl1:Argument>-H</jsdl1:Argument>
<jsdl1:Argument>humanDNA2.input</jsdl1:Argument>
<jsdl1:Argument>/var/data/bioinformatics/..</jsdl1:Argument>
<jsdl1:Argument>S</jsdl1:Argument>
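The "paste and parse" step can be sketched as below: a pasted command line is split on spaces and line breaks, and each token becomes one `Argument` element. This is a minimal Python illustration; the function name `arguments_to_jsdl` and the use of `shlex` (which also honours shell-style quoting) are assumptions, and the portal's own parser may behave differently.

```python
import shlex
import xml.etree.ElementTree as ET

# JSDL-POSIX namespace from the JSDL 1.0 specification; the "jsdl1" prefix
# follows the slide.
POSIX_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"
ET.register_namespace("jsdl1", POSIX_NS)


def arguments_to_jsdl(pasted):
    """Split pasted text on whitespace and build one Argument element per token."""
    elements = []
    for token in shlex.split(pasted):
        element = ET.Element(f"{{{POSIX_NS}}}Argument")
        element.text = token
        elements.append(element)
    return elements


# Space- and line-separated input, as pasted into the editor.
pasted = "fasta34 -H humanDNA2.input\n/var/data/bioinformatics/.. S"
arguments = arguments_to_jsdl(pasted)
for element in arguments:
    print(ET.tostring(element, encoding="unicode"))
```

Because each element is serialised on its own here, every printed line carries its own `xmlns:jsdl1` declaration; inside a full JSDL document the declaration would normally appear once on the root element.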
11. Named File Systems
Named file systems are used to declare mount points on the consuming system. File system names are referenced throughout the portal (and the JSDL doc) in place of the mount points themselves, so a change to a file system's mount point is picked up automatically throughout the portal/JSDL. They are used when specifying path info, e.g. locations of files/dirs, stage data locations etc.
<jsdl:FileSystem name="WORKINGDIR">
  <jsdl:MountPoint>/home/ngs0024/myScratchDir</jsdl:MountPoint>
</jsdl:FileSystem>
<jsdl:FileSystem name="DataDir">
  <jsdl:MountPoint>/home/ngs0024/myDataDir</jsdl:MountPoint>
</jsdl:FileSystem>
<jsdl-posix:Output filesystemName="WORKINGDIR">fasta.out</jsdl-posix:Output>
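The substitution of named file systems can be illustrated with a short Python sketch: the mount points are read into a lookup table once, and every path is resolved through that table, so changing one mount point changes every path that refers to it. The helper names `mount_points` and `resolve`, and the replacement mount point `/scratch/ngs0024`, are hypothetical.

```python
import posixpath
import xml.etree.ElementTree as ET

JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
NS = {"jsdl": JSDL_NS}

# The two file systems from the slide, wrapped in a Resources element.
RESOURCES = """
<jsdl:Resources xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl">
  <jsdl:FileSystem name="WORKINGDIR">
    <jsdl:MountPoint>/home/ngs0024/myScratchDir</jsdl:MountPoint>
  </jsdl:FileSystem>
  <jsdl:FileSystem name="DataDir">
    <jsdl:MountPoint>/home/ngs0024/myDataDir</jsdl:MountPoint>
  </jsdl:FileSystem>
</jsdl:Resources>
"""


def mount_points(resources_xml):
    """Map each FileSystem name to its declared mount point."""
    root = ET.fromstring(resources_xml)
    table = {}
    for fs in root.iter(f"{{{JSDL_NS}}}FileSystem"):
        mount = fs.findtext("jsdl:MountPoint", default="", namespaces=NS)
        table[fs.get("name")] = mount.strip()
    return table


def resolve(table, filesystem_name, relative_path):
    """Resolve a path given relative to a named file system."""
    return posixpath.join(table[filesystem_name], relative_path)


table = mount_points(RESOURCES)
before = resolve(table, "WORKINGDIR", "fasta.out")

# Changing the mount point once updates every path that refers to it.
table["WORKINGDIR"] = "/scratch/ngs0024"  # hypothetical new location
after = resolve(table, "WORKINGDIR", "fasta.out")
print(before, after)
```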
12. Stage Data
List of data from across the Grid that should be copied to the consuming system. Before the job: source URI. After the job: target URI. JSDL does not mandate the protocol / URI format. Data is staged relative to named file systems.
<jsdl:DataStaging>
  <jsdl:FileName>Mg.psf</jsdl:FileName>
  <jsdl:FilesystemName>WORKINGDIR</jsdl:FilesystemName>
  <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
  <jsdl:DeleteOnTermination>false</jsdl:DeleteOnTermination>
  <jsdl:Source>
    <jsdl:URI>gsiftp://ngs.rl.ac.uk:2811/apps/Siesta_mpi/</jsdl:URI>
  </jsdl:Source>
</jsdl:DataStaging>
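As a rough illustration of the before/after distinction, the Python sketch below walks the `DataStaging` elements of a job description and sorts them into stage-in transfers (those with a `Source`) and stage-out transfers (those with a `Target`). The function name `staging_plan`, the second staging entry and its `example.org` target URI are hypothetical and exist only to exercise the stage-out branch.

```python
import xml.etree.ElementTree as ET

JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
NS = {"jsdl": JSDL_NS}

JOB = """
<jsdl:JobDescription xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl">
  <jsdl:DataStaging>
    <jsdl:FileName>Mg.psf</jsdl:FileName>
    <jsdl:FilesystemName>WORKINGDIR</jsdl:FilesystemName>
    <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
    <jsdl:DeleteOnTermination>false</jsdl:DeleteOnTermination>
    <jsdl:Source>
      <jsdl:URI>gsiftp://ngs.rl.ac.uk:2811/apps/Siesta_mpi/</jsdl:URI>
    </jsdl:Source>
  </jsdl:DataStaging>
  <!-- Hypothetical stage-out entry, not taken from the slides. -->
  <jsdl:DataStaging>
    <jsdl:FileName>result.out</jsdl:FileName>
    <jsdl:FilesystemName>WORKINGDIR</jsdl:FilesystemName>
    <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
    <jsdl:Target>
      <jsdl:URI>gsiftp://example.org:2811/results/result.out</jsdl:URI>
    </jsdl:Target>
  </jsdl:DataStaging>
</jsdl:JobDescription>
"""


def staging_plan(jsdl_text):
    """Split DataStaging entries into pre-job and post-job transfers."""
    root = ET.fromstring(jsdl_text)
    stage_in, stage_out = [], []
    for ds in root.iter(f"{{{JSDL_NS}}}DataStaging"):
        name = ds.findtext("jsdl:FileName", default="", namespaces=NS).strip()
        fs = ds.findtext("jsdl:FilesystemName", default="", namespaces=NS).strip()
        source = ds.findtext("jsdl:Source/jsdl:URI", default="", namespaces=NS).strip()
        target = ds.findtext("jsdl:Target/jsdl:URI", default="", namespaces=NS).strip()
        if source:  # copy from the source URI before the job runs
            stage_in.append((source, fs, name))
        if target:  # copy to the target URI after the job finishes
            stage_out.append((fs, name, target))
    return stage_in, stage_out


stage_in, stage_out = staging_plan(JOB)
print("before job:", stage_in)
print("after job:", stage_out)
```

The URIs are treated as opaque strings, in keeping with the point that JSDL does not mandate a protocol; choosing a transfer client per scheme would be a separate step.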
13. Candidate Hosts
Candidate Hosts: resources that can be used to run the given application. The candidate host list can contain personal hosts and default hosts (available to all users). In future, resource broker (RB) matchmaking will be used to select the execution host from the candidate hosts.
<jsdl:CandidateHosts>
  <jsdl:HostName>ngs.rl.ac.uk:2119</jsdl:HostName>
  <jsdl:HostName>clyde.dl.ac.uk:2119</jsdl:HostName>
</jsdl:CandidateHosts>
14. Browse Host / Data Transfer
- File and recursive directory transfers between hosts
- File and directory operations
- Actions for updating application
15. Technical
- JSF v1.1 (JavaServer Faces) GUI.
- JSR-168 compliant. Vanilla JSF (core spec) is JSR-168 compliant, so it can be hosted as a Web application or as a portlet within institutional portals (JSF extensions can be problematic).
- Spring v2.0 for managing objects in an n-tier server application (highly recommended, adds J2EE features to non-J2EE apps, e.g. Tomcat/Jetty apps).
- Declarative transaction demarcation (akin to EJB 3 session beans).
- Data source management (e.g. JPA PersistenceContext, Hibernate Session).
- Propagation of the data source across DAOs / session façades during long-running transactions.
- C3P0 pooled database connections.
- JPA (Java Persistence API) for ORM (object-relational mapping). Hibernate 3.2 for the domain model (could use Kodo, TopLink, Apache OpenJPA).
- CoG Kit for the Globus API.
- Object / XML data binding framework: XMLBeans / JAXB.
16. CURRENT
- Staging from more Data Grid Web protocols (SRB). Browsing / file operations with different data storage resources. Staging across different protocols adds complexity (buffering required).
TODO
- Parametric jobs (a parametric JSDL extension schema defines parametric variables, functions and ranges for modifying the JSDL doc on each iteration).
- Middleware extensions, e.g. gLite resource broker, JSDL conversion to JDL (aim to use SAGA).
- Integrate the OMII WHIP artefact sharing framework (gather and bundle remote resources / artefacts together into a self-contained application bundle, e.g. executable for a particular OS, src, input files, data files).
- Support Roles / VOs (for artefact sharing, not just public / personal).
- Shibboleth enable.
- Describe more apps using the NGS Uniform Execution Environment (UEE): a standard way to describe the same application across different (NGS) resources, i.e. a consistent JSDL description with multiple candidate hosts for the same app.
- Improvements / refinements (AJAXify).
17. Please come and find me at the NGS Stand Demo on the OMII booth (2.00pm Wed) https://portal.ngs.ac.uk
18. Summary
- Please contact NGS to request more hosted applications.
- JSDL Repository: https://portal.ngs.ac.uk
- Search/browse for JSDL (personal and shared) by category of interest (e.g. bioinformatics, chemistry, tutorials/examples). Select, load, save application (run either out-of-the-box, or modify/tweak as required).
- JSDL documents can be pre-configured and published by domain experts / resource admins (users benefit from sharing expertise and artefacts captured in JSDL).
- Community formation around a best practice approach (JSDL is an OGF recommendation).
- JSDL GUI Editor for authoring, validating, sharing, uploading app descriptions.
- Grid Operations: File Staging, Application Submission, Monitoring.
- Generic and not tied to any particular set of Grid technologies. Extend to support more middleware and staging protocols.