1
Lecture Grids and Markup Languages
  • Gregor von Laszewski
  • Argonne National Laboratory
  • and
  • University of Chicago
  • gregor@mcs.anl.gov
  • http://www.mcs.anl.gov/~gregor

2
Outline
  • Gestalt of the Grid
  • State of the Grid
  • Example for a production Grid
  • Markup Languages
  • Example Query

3
Gestalt of the Grid
  • We start the discussion with a famous picture
    used in early psychology experiments.
  • If we examine the drawing in detail, it will be
    rather difficult to decide what the different
    components represent in each of the
    interpretations. Although hat, feather, and ear
    are identifiable in the figure, one's
    interpretation (Is it an old woman or a young
    girl?) is based instead on perceptual evidence.
  • This figure should remind us to be open to
    individual perceptions about Grids and to be
    aware of the multifaceted aspects that constitute
    the Gestalt of the Grid.

4
Motivation: Perform Collaborative Multiscale
Science
[Figure labels: sensors, scientists, compute and storage facilities, consumer; measure, calculate, collaborate, deliver; observations, model, prediction, feedback]
  • von Laszewski, et al. Gestalt of the Grid,
    http://www.mcs.anl.gov/~gregor/

5
The motivating experiment at ANL
[Figure labels: Scientist, Grid, Virtual Lecture Room, Advanced Photon Source, Electronic Library and Databases]
6
Grid: an evolving term (1)
  • Kleinrock, 1969
  • "We will probably see the spread of computer
    utilities, which, like present electric and
    telephone utilities, will service individual
    homes and offices across the country."
  • 1990s: prior to using the term Grid
  • Catlett, pre-1996: metacomputer
  • Foster, 1996: networked supercomputing environment
  • von Laszewski, 1996: integration of knowledge
    resources (data and humans) into the networked
  • 1999: The Grid Book
  • "A computational Grid is a hardware and software
    infrastructure that provides dependable,
    consistent, pervasive, and inexpensive access to
    high-end computational capabilities."
  • Limits the definition to hardware and software
    infrastructure

7
Grid: an evolving term (2)
  • 2000, von Laszewski: the Grid approach
  • We define the Grid approach as a general concept
    and idea to promote a vision for sophisticated
    international scientific and business-oriented
    collaborations.
  • A Grid is the infrastructure that makes the Grid
    approach a reality.
  • A production Grid is a shared computing
    infrastructure of hardware, software, and
    knowledge resources that allows the coordinated
    resource sharing and problem solving in dynamic,
    multi-institutional virtual organizations to
    enable sophisticated international scientific and
    business-oriented collaborations
  • An ad hoc Grid provides a production Grid that
    addresses management issues related to sporadic,
    ad hoc, and time-limited interactions and
    collaborations including the instantiation and
    management of the production Grid itself.

8
Grid
  • Building a collaborative environment to share
    resources
  • Provide the users with an impression of a
    persistent infrastructure
  • Virtualize the concept of a resource
  • Virtualize the concept of groups sharing the

9
History of Globus and CoG at ANL
10
Management Challenge
  • User requirements result in a variety of complex
    challenges
  • They will keep us busy for quite a while
  • We should not expect the solution to be here
    tomorrow, or that it was here yesterday

11
Grid Management Aspects
  • Information
  • Security

12
Subset of Grid related Security Concepts
Single Sign-on
Authorization
Secure communication through encryption and
non-repudiation
Authentication
Access control through authentication and
authorization
Community authorization
Secure Execution
Delegation
13
Grid computing must address integration challenge
14
Grid deployments and software releases
15
Grid Computing is more than middleware
  • Grid computing must be seamlessly integrated in
    commodity technologies to be effective

16
Evolution invariant architectures
  • Longevity is bound to evolution invariant
    architectures

17
Visual Interfaces / Grid faces
  • Education needs easy access to lower the barrier

18
Rapid Prototyping Job Submission
/* Callback invoked whenever the job changes state. */
static void
globus_l_globusrun_gram_callback_func(void *user_arg, char *job_contact,
                                      int state, int errorcode)
{
    globus_i_globusrun_gram_monitor_t *monitor;

    monitor = (globus_i_globusrun_gram_monitor_t *) user_arg;

    globus_mutex_lock(&monitor->mutex);
    monitor->job_state = state;
    switch (state)
    {
    case GLOBUS_GRAM_PROTOCOL_JOB_STATE_PENDING:
        break;  /* body not shown on the slide */
    case GLOBUS_GRAM_PROTOCOL_JOB_STATE_FAILED:
        if (monitor->verbose)
            globus_libc_printf("GLOBUS_GRAM_PROTOCOL_JOB_STATE_FAILED\n");
        monitor->done = GLOBUS_TRUE;
        break;
    case GLOBUS_GRAM_PROTOCOL_JOB_STATE_DONE:
        if (monitor->verbose)
            globus_libc_printf("GLOBUS_GRAM_PROTOCOL_JOB_STATE_DONE\n");
        monitor->done = GLOBUS_TRUE;
        break;
    }
    /* Wake the submitting thread so it can re-check monitor->done. */
    globus_cond_signal(&monitor->cond);
    globus_mutex_unlock(&monitor->mutex);
}

/* Submit a job and block until the callback reports a terminal state. */
static int
globus_l_globusrun_gramrun(char *request_string,
                           unsigned long options,
                           char *rm_contact)
{
    char *callback_contact = GLOBUS_NULL;
    char *job_contact = GLOBUS_NULL;
    globus_i_globusrun_gram_monitor_t monitor;
    int err;

    monitor.done = GLOBUS_FALSE;
    monitor.verbose = verbose;

    globus_mutex_init(&monitor.mutex, GLOBUS_NULL);
    globus_cond_init(&monitor.cond, GLOBUS_NULL);

    err = globus_module_activate(GLOBUS_GRAM_CLIENT_MODULE);
    if (err != GLOBUS_SUCCESS) { /* error handling not shown on the slide */ }

    err = globus_gram_client_callback_allow(
        globus_l_globusrun_gram_callback_func,
        (void *) &monitor,
        &callback_contact);
    if (err != GLOBUS_SUCCESS) { /* error handling not shown on the slide */ }

    err = globus_gram_client_job_request(rm_contact,
                                         request_string,
                                         GLOBUS_GRAM_PROTOCOL_JOB_STATE_ALL,
                                         callback_contact,
                                         &job_contact);
    if (err != GLOBUS_SUCCESS) { /* error handling not shown on the slide */ }

    /* Sleep on the condition variable until the callback sets done. */
    globus_mutex_lock(&monitor.mutex);
    while (!monitor.done)
    {
        globus_cond_wait(&monitor.cond, &monitor.mutex);
    }
    globus_mutex_unlock(&monitor.mutex);

    globus_gram_client_callback_disallow(callback_contact);
    globus_free(callback_contact);

    globus_mutex_destroy(&monitor.mutex);
    globus_cond_destroy(&monitor.cond);

    return err;
}
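
The core of the listing above is a wait-for-callback pattern: the submitter sleeps on a condition variable until a state callback marks the job as finished. The following is a minimal sketch of that pattern in Python, using only the standard library. The `Monitor` class, the `fake_job` thread, and the state strings are illustrative assumptions standing in for the remote job service; none of them are Globus API names.

```python
import threading


class Monitor:
    """Mirrors the slide's monitor struct: a done flag guarded by a condition."""

    def __init__(self):
        # threading.Condition bundles the mutex and the condition variable.
        self.cond = threading.Condition()
        self.done = False
        self.job_state = None


def callback(monitor, state):
    """Record the new state and wake the waiter; terminal states set done."""
    with monitor.cond:
        monitor.job_state = state
        if state in ("FAILED", "DONE"):
            monitor.done = True
        monitor.cond.notify()


def fake_job(monitor):
    """Hypothetical stand-in for the remote job: reports three state changes."""
    for state in ("PENDING", "ACTIVE", "DONE"):
        callback(monitor, state)


def run_and_wait():
    """Start the job, then block until the callback reports a terminal state."""
    monitor = Monitor()
    worker = threading.Thread(target=fake_job, args=(monitor,))
    worker.start()
    with monitor.cond:
        while not monitor.done:  # the loop guards against spurious wakeups
            monitor.cond.wait()
    worker.join()
    return monitor.job_state


final_state = run_and_wait()
print(final_state)
```

The `while not monitor.done` loop plays the same role as the `while(!monitor.done)` loop in the C listing: the waiter re-checks the flag each time it wakes.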
19
Scientific workflows
<project>
  <include file="cogkit.xml"/>
  <execute executable="/bin/climate"
           host="hot.mcs.anl.gov"
           provider="GT4"/>
  <echo message="Job completed"/>
</project>
  • Lesson we seem to learn
  • Kepler and Taverna are complex
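
Because the workflow on this slide is plain XML, its task list can be read with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` to list the tasks in document order. It only inspects the document and does not execute anything; `list_tasks` is a hypothetical helper for illustration, not part of the CoG Kit.

```python
import xml.etree.ElementTree as ET

WORKFLOW = """
<project>
  <include file="cogkit.xml"/>
  <execute executable="/bin/climate"
           host="hot.mcs.anl.gov"
           provider="GT4"/>
  <echo message="Job completed"/>
</project>
"""


def list_tasks(xml_text):
    """Return (tag, attributes) pairs for each child of <project>, in order."""
    root = ET.fromstring(xml_text)
    return [(child.tag, dict(child.attrib)) for child in root]


tasks = list_tasks(WORKFLOW)
for tag, attrs in tasks:
    print(tag, attrs)
```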

20
Education
  • Tutorial and slide material available for Globus
  • They contain a portion
  • We found that for beginners the entry curve is
    steep
  • CoG Kit entry curve is relatively low
  • Authentication, job submission, file transfer
    (ssh-like)
  • Used successfully in REU and SULI projects
    (undergrads)
  • Viz/GUIs get students interested
  • Educational dichotomy
  • We do want to use the Grid but do not want, or
    do not have the time, to learn about it

21
References
  • Globus
  • http://www.globus.org
  • CoG Kits
  • http://www.cogkit.org
  • Portals
  • http://www.ogce.org
  • Papers
  • http://www.mcs.anl.gov/~gregor
  • The Grid-Idea and Its Evolution, Gregor von
    Laszewski, accepted for publication in the
    Journal of Information Technology,
    http://www.mcs.anl.gov/~gregor/papers/vonLaszewski-grid-idea.pdf
  • Biography
  • Gregor von Laszewski is a scientist at Argonne
    National Laboratory and a fellow of the
    Computation Institute at the University of Chicago.
    He received a Master's degree in 1990 from the
    University of Bonn, Germany, and a Ph.D. in
    computer science in 1996 from Syracuse University.
    He has been involved in Grid computing since the
    term was coined. His current research interests are
    in the areas of Grid computing, Grid workflows, and
    Grid user interfaces. He is the principal
    investigator of the Java Commodity Grid Kit, which
    provides a basis for many Grid-related projects.

23
Why do we need the Grid today? Changing Nature
of Work
IT must adapt to this new reality
24
Approach: Bridging the Application-Resource Gap
[Figure labels: User Application (three instances), Database, Specialized resource, Computers, Storage]
25
GT4 Web Services
Custom Services
Custom WSRF Services
GT4 WSRF Web Services
Registry Admin
GT4 Container (e.g., Apache Axis)
WS-A, WSRF, WS-Notification
WSDL, SOAP, WS-Security
26
GT4 Services Include
  • Data
  • GridFTP: file access and movement
  • Reliable File Transfer
  • Replica Location Service
  • Data Access and Integration: database access
  • Computation
  • GRAM: reliable job submission
  • Workspace: virtual machine deployment
  • Security
  • Credential repository, authorization,
  • many others

27
Used to Create Powerful Systems, e.g., the Cancer
Bioinformatics Grid
Functions
Management
Metadata Management
ID Resolution
Schema Management
Workflow
Security
Resource Management
Service Registry
Service
Service Description
Grid Communication Protocol
Transport
Spans 60 NIH cancer centers across the U.S.
Slide credit Peter Covitz, National Institutes
of Health
28
A Stateful Odyssey
Tell of the storm-tossed man, O Muse, who
wandered long (Homer)
  • A simple goal
  • Web Services conventions for manipulating state
  • A hopeful departure
  • OGSI: Open Grid Services Infrastructure
  • Some detours en route
  • WSRF: WS-Resource Framework
  • WS-Transfer and friends
  • Home at last?
  • WS-ResourceTransfer, WS-Eventing, etc.

"And the end of all our exploring / Will be to
arrive where we started / And know the place for
the first time" (T. S. Eliot)
29
Stateful Odyssey: Practical Implications
  • GT4 supports WSRF today
  • Mechanisms have proved incredibly useful in many
    different contexts
  • A large user community
  • We will incorporate support for
  • final WSRF/WS-Notification specs
  • WS-RT and friends (when specs mature)
  • If and when justified by user demand
  • We will ensure backward compatibility
  • Via a single service with multiple interfaces

30
Other Standards
  • Data
  • GridFTP
  • Data Access and Integration (DAIS)
  • Replica location (in progress)
  • Security
  • WS-Security, SAML: included in GT4
  • XACML: included in GT4
  • SAML-2: awaiting contribution of code
  • Job submission
  • JSDL: alpha implementation available
  • BES: when the BES specification is completed

32
A Production Grid
33
The TeraGrid
The world's largest collection of supercomputers
34
TeraGrid: A High-Level View
User Facilities Support: Help desk/Portal and
ASTA
Grid Software and Environment Deployment: CTSS
Authorization, Accounting, and Authentication: TG
Allocation and Accounting
Grid Monitoring and Information Systems: MDS4 and
Inca
35
TeraGrid: Allocation and Accounting
36
TeraGrid Allocation
  • Researchers request an allocation of resources
    through a formal process
  • The process is similar to submitting an
    NSF grant proposal
  • There are eligibility requirements
  • US faculty member or researcher at a non-profit
    organization
  • Principal investigator submits a CV
  • More
  • Description of research, requirements, etc.
  • Proposal is peer reviewed by allocation
    committees
  • DAC: Development Allocation Committee
  • MRAC: Medium Resource Allocation Committee
  • LRAC: Large Resource Allocation Committee

37
Authentication, Authorization, and Accounting
  • TG authentication and authorization is automatic
  • User accounts are created when an allocation is
    granted
  • Resources can be accessed through
  • ssh via password or ssh keys
  • Grid access via GSI mechanisms (grid-mapfile,
    proxies)
  • Accounts are created across TG sites for users in
    the allocation
  • Accounting system is oriented towards TG
    Allocation Service Units (ASU)
  • Accounting system is well defined and monitored
    closely
  • Each TG site is responsible for its own
    accounting

38
TeraGrid: Monitoring and Validation
39
TeraGrid and MDS4
  • Information providers
  • Collect information from various sources
  • Local batch system: Torque, PBS
  • Cluster monitoring: Ganglia, Clumon
  • Emit XML in a standard schema (attribute-value
    pairs)
  • Information is collected into a local Index service
  • Global TG-wide Index collector with WebMDS

40
Inca: TeraGrid Monitoring
  • Inca is a framework for the automated
    testing, benchmarking, and monitoring of Grid
    resources
  • Periodic scheduling of information gathering
  • Collects and archives site status information
  • Site validation and verification
  • Checks site services deployment
  • Checks software stack environment
  • Inca can also collect site performance measurements

41
TeraGrid: Grid Middleware and Software Environment
42
The TeraGrid Environment
  • SoftEnv: all software on TG can be accessed via
    keys defined in $HOME/.soft
  • SoftEnv system is user configurable
  • Environment can also be accessed at run time for
    WS GRAM jobs

You will be interacting with SoftEnv during the
exercises later today
43
TeraGrid Software CTSS
  • CTSS: Coordinated TeraGrid Software and Services
  • A suite of software packages that includes the Globus
    Toolkit, Condor-G, MyProxy, and OpenSSH
  • Installed at every TG site

44
TeraGrid User Facility Support
  • The TeraGrid Help desk: help@teragrid.org
  • Central location for user support
  • Routing of trouble tickets
  • TeraGrid portal
  • User's view of TG
  • Resources
  • Allocations
  • Access to Docs!

45
TeraGrid's ASTA Program
  • Advanced Support for TeraGrid Applications
  • Helps application scientists with TG resources
  • Associates one or more TG staff with application
    scientists
  • Sustained effort
  • A minimum of 25% FTE
  • Goal
  • Maximize effectiveness
  • of application software
  • TeraGrid resources

46
Topics Not Covered
  • Managed Storage
  • Grid Scheduling
  • More

47
Managing Storage
  • Problems
  • No real good way to control the movement of files
    into and out of site
  • Data is staged by fork processes!
  • Anyone with access to the site can submit such a
    request and swamp the server
  • There is also no space allocation control
  • A grid user can dump files of any size on a
    resource
  • If users do not clean up, sys admins have to
    intervene

These can easily overwhelm a resource
48
Managing Storage
  • A solution: SRM (Storage Resource Manager)
  • Grid-enabled interface to put data on a site
  • Provides scheduling of data transfer requests
  • Provides reservation of storage space
  • Technologies in the OSG pipeline
  • dCache/SRM (disk cache with SRM)
  • Provided by DESY and FNAL
  • SE(s) available to OSG as a service from the
    USCMS VO
  • DRM (Disk Resource Manager)
  • Provided by LBL
  • Can be added on top of a normal UNIX file system

> globus-url-copy srm://ufdcache.phys.ufl.edu/cms/foo.rfz \
    gsiftp://cit.caltech.edu/data/bar.rfz
49
Grid Scheduling
  • The problem: with job submission this still
    happens!

Why do I have to do this by hand? @?>@?
[Figure labels: User Interface (VDT Client), Grid Site A, Grid Site B, Grid Site X]
50
Grid Scheduling
  • Possible Solutions
  • Sphinx (GriPhyN, UF)
  • Work flow based dynamic planning (late binding)
  • Policy based scheduling
  • More details: ask Laukik
  • Pegasus (GriPhyN, ISI/UC)
  • DAGMan-based planner and Grid scheduling (early
    binding)
  • More details in Work Flow
  • Resource Broker (LCG)
  • Match maker based Grid scheduling
  • Employed by application running on LCG Grid
    resources

51
Much Much More is Needed
  • Continue the hardening of middleware and other
    software components
  • Continue the process of federating with other
    Grids
  • OSG with TeraGrid
  • OSG with LHC/EGEE, NorduGrid
  • Continue to synchronize the Monitoring and
    Information Service Infrastructure
  • Improve documentation

52
Conclude with a simple example
  • Log on to a User Interface
  • Get your grid proxy (log on to the grid):
    grid-proxy-init
  • Check OSG MIS clients
  • To get the list of available sites (depends on your
    VO affiliation)
  • To discover site-specific information needed by
    your job, i.e.,
  • Available services: hostnames, port numbers
  • Tactical storage locations: app, data, tmp,
    wntmp
  • Install your application binaries at selected sites
  • Submit your jobs to selected sites via Condor-G
  • Check OSG MIS clients to see if jobs have
    completed
  • Do something like this:
    if [ "$?" -eq 0 ]; then
      echo "Have a coffee (beer, margarita)"
    else
      echo "It's going to be a long night"
    fi

53
To learn more
  • The Open Science Grid top-level page
  • http://www.opensciencegrid.org
  • The TeraGrid top-level page
  • http://www.teragrid.org
  • The TeraGrid portal
  • https://portal.teragrid.org/gridsphere/gridsphere
  • The Globus website
  • http://www.globus.org
  • The iVDGL website
  • http://www.ivdgl.org
  • The GriPhyN website
  • http://www.griphyn.org

54
Data Transfers @ the TG
  • GridFTP is available at all sites
  • Provides
  • GSI on control and data channels
  • Parallel streams
  • Third-party transfers
  • Striped transfers
  • Each TG site has one or more dedicated
    GridFTP-enabled servers
  • TeraGrid sites are equipped with various GridFTP
    clients
  • globus-url-copy
  • Standard Globus GridFTP client (see lectures)
  • uberftp
  • Interactive GridFTP client; supports GSI
    authentication and parallel file transfers
  • tgcp
  • Wrapper for globus-url-copy (optimized TCP buffer
    sizes, parallel streams)
  • Interfaced with RFT (Reliable File Transfer),
    performs third-party transfers and makes sure files
    get to their destination (see lectures)

56
How can markup languages help?
  • Pro
  • Standardization
  • Language neutral
  • Some languages have good support through classes
  • Framework neutral (mostly)
  • Hip/Fashionable
  • Con
  • Mostly not human readable
  • Binary data is not easy to encode
  • Parsing large documents needs some thought
  • Pull parser vs. document parser
  • Programming can be tedious
  • Is there really a standard?
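
The "pull parser vs. document parser" point in the list above can be illustrated with Python's standard library. A document parser builds the whole tree in memory before anything is inspected, while a pull-style parser hands over elements as they complete so they can be discarded immediately. The sketch below counts elements both ways. The `<index>` and `<entry>` names and the synthetic document are assumptions made up for the illustration.

```python
import io
import xml.etree.ElementTree as ET


def make_document(n):
    """Build a synthetic XML document with n <entry> elements."""
    entries = "".join(f'<entry id="{i}"/>' for i in range(n))
    return f"<index>{entries}</index>".encode("utf-8")


def count_with_document_parser(data):
    """Document parser: the full tree exists in memory before counting."""
    root = ET.fromstring(data)
    return len(root.findall("entry"))


def count_with_pull_parser(data):
    """Pull-style parser: handle each element as it ends, then discard it."""
    count = 0
    for _event, elem in ET.iterparse(io.BytesIO(data), events=("end",)):
        if elem.tag == "entry":
            count += 1
            elem.clear()  # drop the element's contents once it is processed
    return count


data = make_document(1000)
print(count_with_document_parser(data), count_with_pull_parser(data))
```

Both functions return the same count; the difference is how much of the document must be held in memory at once, which matters for large index dumps.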

57
  • Standardization
  • Language Independence
  • User unfriendliness
  • Service Description
  • Information services
  • Job submission
  • Configuration
  • YAML vs XML
  • http://www.kuro5hin.org/story/2004/10/29/14225/062

58
YAML vs XML
XML:
<user id="gregor" computer="sunny.mcs.anl.gov">
  <firstname>Gregor</firstname>
  <lastname>von Laszewski</lastname>
  <department>Argonne</department>
  <phone>630 252 2000</phone>
  <addresses>
    <address>gregor@mcs.anl.gov</address>
    <address>laszewski@gmail.com</address>
  </addresses>
</user>

YAML:
user:
  id: gregor
  computer: sunny.mcs.anl.gov
  firstname: Gregor
  lastname: von Laszewski
  phone: 630 252 2000
  addresses:
    - address: gregor@mcs.anl.gov
    - address: laszewski@gmail.com
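
To make the comparison concrete, the XML form of the user record can be loaded into the same kind of native dictionary that a YAML loader would produce directly. This is a minimal sketch using only Python's standard library; `user_to_dict` is a hypothetical helper written for this one record shape, not a general XML-to-dictionary converter.

```python
import xml.etree.ElementTree as ET

USER_XML = """
<user id="gregor" computer="sunny.mcs.anl.gov">
  <firstname>Gregor</firstname>
  <lastname>von Laszewski</lastname>
  <department>Argonne</department>
  <phone>630 252 2000</phone>
  <addresses>
    <address>gregor@mcs.anl.gov</address>
    <address>laszewski@gmail.com</address>
  </addresses>
</user>
"""


def user_to_dict(xml_text):
    """Map attributes and child elements of <user> onto one flat record."""
    root = ET.fromstring(xml_text)
    record = dict(root.attrib)  # id and computer are XML attributes
    for child in root:
        if child.tag == "addresses":
            record["addresses"] = [a.text.strip() for a in child.findall("address")]
        else:
            record[child.tag] = child.text.strip()
    return {root.tag: record}


user = user_to_dict(USER_XML)
print(user)
```

Note that the XML version forces a decision between attributes and child elements, which the helper has to flatten by hand; the YAML version has only keys.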

59
YAML
  • YAML is a machine parsable data serialization
    format
  • documents are very readable by humans.
  • interacts well with scripting languages (Perl,
    Ruby, Python, ...).
  • uses host languages' native data structures.
  • has a consistent information model.
  • enables stream-based processing.
  • is expressive and extensible.
  • is easy to implement.
  • Features
  • Structure is shown through indentation
  • Sequence items are denoted by a dash
  • Key value pairs within a map are separated by a
    colon.
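
The three features listed above (indentation for structure, a dash for sequence items, a colon between key and value) are enough to write a toy reader. The sketch below handles only that tiny subset: nested maps, sequences of plain scalars, and string values. It is an illustration of the rules, not a YAML implementation, and the sample document is a simplified variant of the user record in which each address is a plain scalar.

```python
def parse_block(lines, start, indent):
    """Parse consecutive lines at one indent level; return (value, next_index)."""
    # A block whose first line starts with a dash is a sequence, else a map.
    is_seq = lines[start][1].startswith("- ")
    result = [] if is_seq else {}
    i = start
    while i < len(lines) and lines[i][0] == indent:
        text = lines[i][1]
        if is_seq:
            result.append(text[2:].strip())
            i += 1
            continue
        key, _, value = text.partition(":")  # split at the first colon only
        key, value = key.strip(), value.strip()
        has_child = i + 1 < len(lines) and lines[i + 1][0] > indent
        if value or not has_child:
            result[key] = value or None
            i += 1
        else:
            # "key:" followed by deeper indentation opens a nested block.
            result[key], i = parse_block(lines, i + 1, lines[i + 1][0])
    return result, i


def parse_mini_yaml(text):
    """Parse the indentation/dash/colon subset described in the lead-in."""
    lines = [(len(l) - len(l.lstrip(" ")), l.strip())
             for l in text.splitlines() if l.strip()]
    value, _ = parse_block(lines, 0, lines[0][0])
    return value


DOC = """\
user:
  id: gregor
  computer: sunny.mcs.anl.gov
  addresses:
    - gregor@mcs.anl.gov
    - laszewski@gmail.com
"""

parsed = parse_mini_yaml(DOC)
print(parsed)
```

All scalars come back as strings here; a real YAML loader would also resolve types, quoting, and the many other constructs this sketch ignores.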

60
YAML
kern:
  ostype: Darwin
  osrelease: 8.7.1
  osrevision: 199506
  version: "Darwin Kernel Version 8.7.1: Wed Jun 7 16:19:56"
  maxproc: 532
  maxfiles: 12288
  argmax: 262144
  securelevel: 1
  hostname: sunny.mcs.anl.gov
  hostid: 0
  clockrate:
    hz: 100
    tick: 10000
    profhz: 100
    stathz: 100
  posix1version: 200112
  ngroups: 16

61
Conventions
  • sysctl -a on Mac OS X

kern.ostype = Darwin
kern.osrelease = 8.7.1
kern.osrevision = 199506
kern.version = Darwin Kernel Version 8.7.1: Wed Jun 7 16:19:56
kern.maxproc = 532
kern.maxfiles = 12288
kern.argmax = 262144
kern.securelevel = 1
kern.hostname = lapi-56.mcs.anl.gov
kern.hostid = 0
kern.clockrate = { hz = 100, tick = 10000, profhz = 100, stathz = 100 }
kern.posix1version = 200112
kern.ngroups = 16
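
The dotted names in this output carry the same hierarchy that the YAML version expresses through indentation. The sketch below converts one into the other, assuming lines of the form `name = value` (the separator is an assumption; some sysctl versions print `name: value` instead). The helper names are hypothetical, and structured values such as `kern.clockrate` are left out of the sample because they would need their own parsing.

```python
def sysctl_to_tree(text):
    """Turn 'a.b.c = value' lines into nested dicts keyed by the dotted path."""
    tree = {}
    for line in text.splitlines():
        if "=" not in line:
            continue
        name, _, value = line.partition("=")
        parts = name.strip().split(".")
        node = tree
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value.strip()
    return tree


def to_yaml_like(node, indent=0):
    """Render nested dicts with YAML-style indentation (not a full emitter)."""
    out = []
    for key, value in node.items():
        if isinstance(value, dict):
            out.append(" " * indent + f"{key}:")
            out.extend(to_yaml_like(value, indent + 2))
        else:
            out.append(" " * indent + f"{key}: {value}")
    return out


SAMPLE = """\
kern.ostype = Darwin
kern.osrelease = 8.7.1
kern.maxproc = 532
kern.hostname = sunny.mcs.anl.gov
"""

tree = sysctl_to_tree(SAMPLE)
rendered = to_yaml_like(tree)
print("\n".join(rendered))
```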
62
LDIF
dn: cn=The Postmaster,dc=example,dc=com
objectClass: organizationalRole
cn: The Postmaster
...
  • Unique identifier
  • Definition of object classes
  • http://tools.ietf.org/html/rfc2849

63
LDIF
dn:cn=Barbara Jensen, ou=Product Development, dc=airius, dc=com
objectclass:top
objectclass:person
objectclass:organizationalPerson
cn:Barbara Jensen
cn:Barbara J Jensen
cn:Babs Jensen
sn:Jensen
uid:bjensen
telephonenumber:+1 408 555 1212
description:Babs is a big sailing fan, and travels extensively in sea
 rch of perfect sailing conditions.
title:Product Manager, Rod and Reel Division
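
Two LDIF details in this record are easy to miss: an attribute such as `cn` may appear several times, and a long value may be folded onto a continuation line that starts with a single space. The following is a minimal Python sketch of a reader that handles both for a single record. `parse_ldif_record` is a hypothetical helper, and base64 values (`attr:: ...`), comments, and change records are deliberately not handled.

```python
def parse_ldif_record(text):
    """Parse one LDIF record into {attribute: [values]}, unfolding long lines."""
    # RFC 2849 folding: a line that starts with one space continues the
    # previous line, with that leading space removed.
    unfolded = []
    for line in text.splitlines():
        if line.startswith(" ") and unfolded:
            unfolded[-1] += line[1:]
        elif line.strip():
            unfolded.append(line)
    record = {}
    for line in unfolded:
        attr, _, value = line.partition(":")  # split at the first colon only
        record.setdefault(attr.lower(), []).append(value.strip())
    return record


RECORD = """\
dn: cn=Barbara Jensen, ou=Product Development, dc=airius, dc=com
objectclass: top
objectclass: person
cn: Barbara Jensen
cn: Babs Jensen
description: Babs is a big sailing fan, and travels extensively in sea
 rch of perfect sailing conditions.
"""

entry = parse_ldif_record(RECORD)
print(entry["cn"])
print(entry["description"][0])
```

Every attribute maps to a list, which keeps multi-valued attributes and single-valued ones uniform for the caller.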

64
LDIF
kern.ostype: Darwin
kern.osrelease: 8.7.1
kern.osrevision: 199506
kern.version: Darwin Kernel Version 8.7.1: Wed Jun 7 16:19:56
kern.maxproc: 532
kern.maxfiles: 12288
kern.argmax: 262144
kern.securelevel: 1
kern.hostname: lapi-56.mcs.anl.gov
kern.hostid: 0
kern.clockrate: hz = 100, tick = 10000, profhz = 100, stathz = 100
kern.posix1version: 200112
kern.ngroups: 16

66
JSON vs XML
JSON:
{"menu": {
  "id": "file",
  "value": "File",
  "popup": {
    "menuitem": [
      {"value": "New", "onclick": "CreateNewDoc()"},
      {"value": "Open", "onclick": "OpenDoc()"},
      {"value": "Close", "onclick": "CloseDoc()"}
    ]
  }
}}

XML:
<menu id="file" value="File">
  <popup>
    <menuitem value="New" onclick="CreateNewDoc()" />
    <menuitem value="Open" onclick="OpenDoc()" />
    <menuitem value="Close" onclick="CloseDoc()" />
  </popup>
</menu>
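
The two notations on this slide describe the same menu, so one can be generated from the other. The sketch below loads the JSON form with Python's standard `json` module and rebuilds the XML form with `xml.etree.ElementTree`. `menu_to_xml` is a hypothetical helper that knows this one document shape; it is not a general JSON-to-XML mapping, which would need conventions for attributes versus child elements.

```python
import json
import xml.etree.ElementTree as ET

MENU_JSON = """
{"menu": {
  "id": "file",
  "value": "File",
  "popup": {
    "menuitem": [
      {"value": "New", "onclick": "CreateNewDoc()"},
      {"value": "Open", "onclick": "OpenDoc()"},
      {"value": "Close", "onclick": "CloseDoc()"}
    ]
  }
}}
"""


def menu_to_xml(json_text):
    """Rebuild the <menu> element tree from the JSON form of the same data."""
    menu = json.loads(json_text)["menu"]
    root = ET.Element("menu", id=menu["id"], value=menu["value"])
    popup = ET.SubElement(root, "popup")
    for item in menu["popup"]["menuitem"]:
        ET.SubElement(popup, "menuitem",
                      value=item["value"], onclick=item["onclick"])
    return root


menu_root = menu_to_xml(MENU_JSON)
print(ET.tostring(menu_root, encoding="unicode"))
```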

67
Some simple Examples
  • wsrf-query -s https://127.0.0.1:8443/wsrf/services/DefaultIndexService \
        "count(//*[local-name()='Entry'])"
  • wsrf-query -s https://127.0.0.1:8443/wsrf/services/DefaultIndexService \
        "number(//*[local-name()='GLUECE']/glue:ComputingElement/glue:State/@glue:FreeCPUs)=0"
  • wsrf-query -s http://localhost:8080/wsrf/services/ContainerRegistryService \
        "/*/*/*/*[local-name()='Address']"
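
The queries above select elements by `local-name()`, that is, by tag name with the namespace ignored. The same idea can be tried offline in Python: the sketch below counts `Entry` elements and collects `Address` values from a small sample document. The sample XML, its namespace URI, and the host names are invented for the illustration and are not real index service output. Python's `ElementTree` does not evaluate XPath functions such as `count()` or `local-name()`, so the match is written out by hand.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for an index service response.
SAMPLE = """
<ns0:IndexRP xmlns:ns0="http://example.org/index">
  <ns0:Entry><ns0:Address>https://host-a.example.org:8443/svc</ns0:Address></ns0:Entry>
  <ns0:Entry><ns0:Address>https://host-b.example.org:8443/svc</ns0:Address></ns0:Entry>
</ns0:IndexRP>
"""


def local_name(tag):
    """Strip ElementTree's '{namespace}' prefix, like XPath local-name()."""
    return tag.rsplit("}", 1)[-1]


def find_by_local_name(root, name):
    """Return every element in the tree whose local name matches."""
    return [el for el in root.iter() if local_name(el.tag) == name]


index_root = ET.fromstring(SAMPLE)
entry_count = len(find_by_local_name(index_root, "Entry"))
addresses = [el.text for el in find_by_local_name(index_root, "Address")]
print(entry_count, addresses)
```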

68
THE END
72
Outline
  • Introduction to Grid Computing
  • Basic networking, security and other definitions
  • Very basic web services
  • Hardware components and Grids
  • Introduction to Grid middleware components
  • Security
  • Job management
  • Data management
  • Information