NPACI Grid: Using Grid Software to Enhance NPACI

Transcript and Presenter's Notes
1
NPACI Grid: Using Grid Software to Enhance NPACI Applications and Systems at the University of Michigan
Tutorial 10: Abhijit Bose, Ken MacInnis, Brian Wickman
NPACI All Hands Meeting, March 18-20, 2003
2
Agenda
  • Current Grid Resources at Michigan
  • How to use Grid Services (user-centric view)
  • NPACI Grid Application Showcase: High-Energy Physics (DZero/SAM-Grid)
  • Setting up Grid Services
  • Demonstration

3
Current NPACI Grid Resources at Michigan (March 2003)

[Architecture diagram. Components on the UM network: the SAM-Grid Gateway/D0 Gatekeeper and the MGRID Gateway/Globus Gatekeeper (NMI), each NFS-mounting a RAID server (/d0raid, 750 GB-1 TB); SAM Station Servers, SAM Stagers, the SAM FSS, Calibration DB Servers, and a Local Naming Service; ADSM storage (/adsm2, 72 TB) reached over the WAN; a Kx509 client; and a Globus-PBS job manager in front of Hypnos, a 256-cpu AMD Athlon 2000MP cluster with 1 GB/cpu and 1 TB scratch (PBS/Maui scheduler). Interconnect is Gigabit Ethernet.]
4
Grid Resources at Michigan (2003)
  • Integrate the rest of the resources with the Grid:
  • 100-cpu AMD cluster (morpheus) being expanded
  • 177-cpu IBM SP2, 24-cpu IBM Nighthawk
  • Other college and dept.-level clusters as part of MGRID
  • University-wide KCA (prototype already up and running, policies under development)
  • University-wide Kx509 and NMI deployment
  • Possible NPACI/CAC 64-bit cluster acquisition in 3Q 2003

5
Grid Resources at Michigan (2003)
  • We are also looking at:
  • AFS clients on Grid-enabled clusters: unified view of user directories
  • Scalability and feasibility of using the GPFS parallel file system for cluster I/O
  • DZero/NFSv4 integration to provide sandboxing to local file systems and fine-grained access control
  • Many other activities via the MGRID initiative (attend Wednesday's talk, Parallel Session 2: Grid Experiences)

6
Steps in Using a Grid-enabled Resource (http://www.npaci.edu/globus)
  • 1. Get NPACI allocations and accounts on Grid-enabled systems
  • 2. Get appropriate certificates (more later)
  • 3. Set the appropriate environment (can be in your .cshrc or other shell rc file); e.g. on chi.grid.umich.edu do:
        export GLOBUS_LOCATION=/usr/grid
        source /usr/grid/etc/globus-user-env.sh
  • 4. Create an RSL resource request/submission script (more later)
  • 5. Submit the RSL script to the appropriate Globus job manager via globusrun -o -r <resource> -f <rsl-file> (see the sketch below)
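Put together, a session on chi.grid.umich.edu might look like the following minimal sketch (the RSL file name req.rsl is illustrative, and grid-proxy-init assumes you already hold a certificate; on a Kerberos/KX.509 site kxlist -p plays the same role, as shown later):

    export GLOBUS_LOCATION=/usr/grid
    source $GLOBUS_LOCATION/etc/globus-user-env.sh   # step 3: set up the Globus environment
    grid-proxy-init                                  # create a short-lived proxy credential
    globusrun -o -r chi/jobmanager-pbs -f req.rsl    # step 5: submit and stream output back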

7
Authentication and Certificates

Globus GSI uses public key cryptography and digital signatures
(http://www.globus.org/security/overview.html)

Primary motivations:
- the need for secure communication among the elements of a Grid
- support for security across organizational boundaries (distributed security entities rather than a centrally managed entity)
- support for single sign-on for users, especially to authenticate and authorize users to multiple resources

You need some basic knowledge of Globus credentials, KX.509, Kerberos, etc. This is very important when requesting access to multiple resources spanning different administrative domains. This talk will not cover the details of Globus GSI.
8
Globus Certificates
  • Central to GSI are certificates: users and Grid services are identified by their certificates (identification and authentication)
  • Four information fields in each certificate:
  • - subject: name of the user or the service it identifies
  • - public key of the user or the service
  • - identity of the Certificate Authority (CA) that signed the certificate, certifying the user or the service
  • - a digital signature (e.g. using MD5) of the above CA
  • Since the CA certifies the link between the subject name and the public key, the CA must be trusted (e.g. a Grid administrator decides which CAs to trust).
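All four fields can be read from a certificate file with the stock openssl tool; a dump with exactly the shape shown on the next slide can be produced like this (the file path is illustrative):

    # print every field of an X.509 certificate, omitting the base64 body
    openssl x509 -in $HOME/.globus/usercert.pem -text -noout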

9
Example of a Certificate (using CA@chi.grid.umich.edu), X.509 Format

Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 1 (0x1)
        Signature Algorithm: md5WithRSAEncryption
        Issuer: O=Grid, OU=GlobusTest, OU=simpleCA-chi.engin.umich.edu, CN=Globus Simple CA
        Validity
            Not Before: Feb 19 14:11:48 2003 GMT
            Not After : Feb 19 14:11:48 2004 GMT
        Subject: O=Grid, OU=GlobusTest, OU=simpleCA-chi.engin.umich.edu, OU=engin.umich.edu, CN=Ken MacInnis
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
            RSA Public Key: (1024 bit)
                Modulus (1024 bit):
                    00:d1:74:6c:9d:55:ac:15:58:eb:26:c2:bc:27:fc:
                    61:64:f7:bd:0c:c6:95:a1:f4:74:57:07:38:f0:a1:
                    74:95:75:b6:a3:e4:67:67:0b:94:0d:43:02:70:8e:
                    13:7b:05:13:ca:46:83:59:08:fa:67:35:15:1b:33:
                    43:e1:25:0c:9e:c7:ea:66:97:cf:32:9c:23:17:df:
                    4c:3c:36:1d:17:1c:7f:31:e8:fc:ac:6c:05:7e:58:
                    79:f2:c4:5d:08:cb:20:38:0f:a4:2d:2c:9d:83:e5:
                    4c:81:d1:a8:9c:f5:97:8b:1d:ea:f0:bc:47:4e:8b:
                    10:e3:ce:46:8a:ee:b3:ff:e1
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            Netscape Cert Type:
                SSL Client, SSL Server, S/MIME, Object Signing
10
Example of a Certificate (using CA@chi.grid.umich.edu), X.509 Format (continued)

    Signature Algorithm: md5WithRSAEncryption
        86:f9:bb:a0:23:f0:fd:e6:fa:99:91:70:a9:e8:42:d4:b2:16:
        2a:a8:32:f4:01:09:6d:e6:08:63:21:75:58:f4:7b:9d:94:be:
        d0:3d:00:e3:e9:a9:11:a4:93:62:f3:73:e6:00:4f:e5:62:d7:
        8a:25:da:f9:8e:f2:e8:3e:82:9e:14:c5:b8:a8:25:b7:0a:7c:
        5e:97:0f:29:2b:67:d4:5b:1e:4d:b6:5f:d6:90:f4:e0:1c:ee:
        40:2d:b3:81:f4:dc:a1:e8:3e:97:02:6b:3b:55:af:35:79:b1:
        8a:96:76:84:a9:09:07:02:70:8a:e4:6c:69:a1:23:16:ac:62:
        3d:fc
-----BEGIN CERTIFICATE-----
MIICbTCCAdagAwIBAgIBATANBgkqhkiG9w0BAQQFADBmMQ0wCwYDVQQKEwRHcmlk
MRMwEQYDVQQLEwpHbG9idXNUZXN0MSUwIwYDVQQLExxzaW1wbGVDQS1jaGkuZW5n
aW4udW1pY2guZWR1MRkwFwYDVQQDExBHbG9idXMgU2ltcGxlIENBMB4XDTAzMDIx
OTE0MTE0OFoXDTA0MDIxOTE0MTE0OFowfDENMAsGA1UEChMER3JpZDETMBEGA1UE
CxMKR2xvYnVzVGVzdDElMCMGA1UECxMcc2ltcGxlQ0EtY2hpLmVuZ2luLnVtaWNo
LmVkdTEYMBYGA1UECxMPZW5naW4udW1pY2guZWR1MRUwEwYDVQQDEwxLZW4gTWFj
..
-----END CERTIFICATE-----

One can have multiple certificates for access to different resources (stored in the $HOME/.globus directory):

total 24
drwxrwxr-x  6 kmacinni kmacinni 4096 Feb 27 16:23 .
drwxrwxr-x  4 kmacinni kmacinni 4096 Mar  3 11:29 ..
drwxrwxr-x  2 kmacinni kmacinni 4096 Feb 19 14:18 chi-simpleCA
drwxrwxr-x  2 kmacinni kmacinni 4096 Feb 27 16:24 doe-energy
drwxrwxr-x  2 kmacinni kmacinni 4096 Feb 19 14:13 globus
drwxrwxr-x  2 kmacinni kmacinni 4096 Feb 19 14:19 ncsa
11
Kerberos Network Authentication System

[Diagram: Alice (user) exchanges messages 1-4 with the Kerberos Key Distribution Center (KDC)/TGS and message 5 with Bob (service).]

Login Phase (once per session):
1. Alice -> KDC: "I am Alice"
2. KDC -> Alice: TGT = {Alice, TGS, K_A,TGS}K_TGS, {T, K_A,TGS}K_A

Accessing Services (every time a new/current kerberized service is requested):
3. Alice -> TGS: Alice, Bob, TGT, {T}K_A,TGS
4. TGS -> Alice: TKT = {Alice, Bob, K_A,B}K_B, {T, K_A,B}K_A,TGS
5. Alice -> Bob: "I am Alice", TKT, {T}K_A,B

TGS: Ticket Granting Service (often the same entity as the KDC)
K_A: shared key between Alice and the KDC (derived from Alice's password upon login)
K_A,TGS: session key for Alice and the TGS
K_TGS: shared key between the KDC and the TGS
K_A,B: session key for Alice and Bob
T: timestamp to prevent replay attacks (requires synchronized clocks)
12
KX.509 Certificates
  • The story so far: Alice has a Kerberos ticket on the workstation she is logged into. But Globus uses X.509 certificates; how does Alice use Globus-enabled services?
  • KX.509, developed at CITI, University of Michigan, is a Kerberized client program (resides on Alice's workstation) that:
  • - generates an X.509 certificate and a private key based on the existing Kerberos ticket
  • - stores both, normally, in the same Kerberos ticket cache (most often in the /tmp directory)
  • - destroys the temporary key when the Kerberos ticket expires
  • Therefore, by adopting KX.509, a Kerberos-based organization can deploy and use Globus-enabled services without changing its security infrastructure. Kerberos is the most widely deployed network authentication system currently in use.

13
Kerberos -> KX.509 -> Globus Proxy Certificate Creation
  • Steps (on chi.grid.umich.edu):
  • (1) Obtain and cache a Kerberos 5 Ticket Granting Ticket (TGT):
        abose@chi abose$ kinit abose
        Password for abose@GRID.UMICH.EDU:
        abose@chi abose$ ls -al /tmp | grep abose
        -rw------- 1 abose abose 483 Mar 18 00:34 krb5cc_108355_PIfNIZ
  • (2) Obtain an X.509 certificate from the KCA and store it in /tmp as well:
        abose@chi abose$ kx509
  • (3) Convert the X.509 certificate to a Globus proxy certificate and cache it:
        abose@chi abose$ kxlist -p
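To check what is now cached, the standard Kerberos and Globus query tools can be used (a small sketch; both commands are part of the respective toolkits, output omitted here):

    abose@chi abose$ klist             # show the cached Kerberos TGT and service tickets
    abose@chi abose$ grid-proxy-info   # show subject, issuer, and remaining lifetime of the proxy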

14
Kerberos -> KX.509 -> Globus Proxy Certificate Creation (continued)

(You can use either kxlist -p or grid-proxy-init to generate the Globus proxy certificate.)

Content of the certificate:
Service kx509/certificate
issuer: /C=US/ST=Michigan/L=Ann Arbor/O=University of Michigan/CN=MGrid Test KCA
subject: /C=US/ST=Michigan/L=Ann Arbor/O=University of Michigan/OU=MGrid Test KCA/CN=abose/0.9.2342.19200300.100.1.1=abose/Email=abose@GRID.UMICH.EDU
serial=34 hash=8ca5c718

Note for Grid administrators: you need to add the subject line from above and the username on the host to the grid-map file ($GLOBUS_LOCATION/etc/grid-mapfile, or any location specified in the Globus gatekeeper configuration file, $GLOBUS_LOCATION/etc/globus-gatekeeper.conf).
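A grid-mapfile entry is a single line pairing the quoted certificate subject DN with a local account name; based on the subject above it would look like this sketch:

    "/C=US/ST=Michigan/L=Ann Arbor/O=University of Michigan/OU=MGrid Test KCA/CN=abose/0.9.2342.19200300.100.1.1=abose/Email=abose@GRID.UMICH.EDU" abose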
15
Globus Resource Specification Language (RSL) Basics
  • Use RSL to specify the resources you need at the time of submission:
  •   globusrun -o -r chi/jobmanager-pbs -f req.rsl
  •   (-r resource name, -f RSL filename)
  • Good to know some of the basic syntax for (attribute=value) pairs:
  •   '&' indicates a single resource request to the Globus Resource Allocation Manager (GRAM): a conjunction of pairs
  •   '+' indicates a request for multiple resources (coallocation)
  •   each request introduces a new variable scope
16
Globus Resource Specification Language (RSL) Basics
  • Variables defined in one clause of a multi-request are not visible to the other clauses (a multi-request sketch follows at the end of this slide)
  • RSL tokens:
  •   none of the following may appear in an unquoted literal: '+' (plus), '&' (ampersand), '|' (pipe), '(' (left paren), ')' (right paren), '=' (equal), '>' (right angle), '!' (exclamation), '"' (double quote), ''' (apostrophe), '^' (caret), '#' (pound), and '$' (dollar)
  • Common RSL tokens:
  •   arguments, count, directory, executable, jobType, environment, maxTime, maxWallTime, gramMyjob, maxCpuTime, stdin, stdout, stderr, queue, project, dryRun, maxMemory, minMemory, hostCount
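The slides only show single-resource requests; a '+' multi-request might look like the following hedged sketch (hostnames, paths, and counts are illustrative, not from the original talk):

    +( &(resourceManagerContact="chi.grid.umich.edu/jobmanager-pbs")
        (executable="/home/abose/test.exe")(count=2) )
     ( &(resourceManagerContact="morpheus.engin.umich.edu/jobmanager-pbs")
        (executable="/home/abose/test.exe")(count=4) )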

17
RSL continued

Example (single resource for now):

globusrun -o -r chi/jobmanager-pbs '&
  (executable="/home/abose/test.exe")
  (host_count=2)
  (count=4)
  (arguments="-t 100 -f out.dat")
  (email_address="abose@umich.edu")
  (queue="cac")
  (pbs_stagein="morpheus:/home/abose/test.exe")
  (pbs_stageout="morpheus:/home/abose/out.dat")
  (pbs_stdout="/tmp/stdout")
  (pbs_stderr="/tmp/stderr")
  (maxwalltime=10)(jobtype="mpi")'

"get test.exe from morpheus and run it on hypnos" - submitted via the Globus gatekeeper on chi using the PBS job manager
18
RSL continued

Example: Resulting PBS Submission Script on Hypnos

#!/bin/sh
# PBS batch job script built by Globus job manager
#PBS -S /bin/sh
#PBS -M abose@umich.edu
#PBS -m n
#PBS -q cac
#PBS -W stagein=/home/abose/test.exe@morpheus.engin.umich.edu:/home/abose/test.exe
#PBS -W stageout=/home/abose/out.dat@morpheus.engin.umich.edu:/home/abose/out.dat
#PBS -l walltime=10:00
#PBS -o hypnos:/tmp/stdout
#PBS -e hypnos:/tmp/stderr
#PBS -l nodes=2
#PBS -v X509_USER_PROXY=/home/abose/.globus/.gass_cache/local/md5/1c/fd/d3/753b9028dfec2ddd6df84cd06c/md5/0a/4b/1d/599dac54863d650c2531cb92fc/data,GLOBUS_LOCATION=/usr/grid,GLOBUS_GRAM_JOB_CONTACT=https://chi.grid.umich.edu:58963/575/1047861360/,GLOBUS_GRAM_MYJOB_CONTACT=URLx-nexus://chi.grid.umich.edu:58964/,HOME=/home/abose,LOGNAME=abose,LD_LIBRARY_PATH=
# Change to directory requested by user
cd /home/abose
/usr/gmpi.pgi/bin/mpirun -np 4 /home/abose/test.exe -t 100 -f out.dat
19
An Application Grid Domain using NPACI resources: DZero/SAM-Grid Deployment at Michigan

Timelines:
Planning meetings (CAC and Fermilab): Sep-Oct 2002
Demonstration of SAM-Grid at SC2002: Nov 2002
Deployment/site customization: Dec 2002 - Mar 2003
Target NPACI allocation/production runs at Michigan: April 2003
(plus site visits; students spent part of their time at Fermilab)

Slides courtesy of Jianming Qian, Univ. of Michigan, and Lee Lueking, FNAL
20
The D0 Collaboration
  • 500 Physicists
  • 72 institutions
  • 18 Countries

21
(No Transcript)
22
Scale of Challenges

Computing: sufficient CPU cycles and storage, good network bandwidth
Software: efficient reconstruction program, realistic simulation, easy data access, ...

[Trigger pipeline: 2 MHz -> Level-1 -> 5 kHz -> Level-2 -> 1 kHz -> Level-3 -> 50 Hz]

  • The luminosity-dependent physics menu leads to an approximately constant Level-3 output rate
  • With a combined duty factor of 50%, we are writing at 25 Hz DC, corresponding to 800 million events a year (25 events/s over a ~3x10^7-second year)

23
Data Path

[Diagram: raw data from the detector and Monte Carlo (MC) from offsite farms flow through the Data Handling System, which serves both the Fermilab farm and offsite analysis.]
24
Major Software
  • Trigger algorithms (Level-2/Level-3) and simulation
  •   Filtering through firmware programming and/or (partial) reconstruction, event building, and monitoring. Simulating Level-1 hardware and wrapping Level-2 and Level-3 code for offline simulation.
  • Data management and access
  •   SAM (Sequential Access to data via Meta-data) is used to catalog data files produced by the experiment and provides distributed data storage and access to production and analysis sites.
  • Event reconstruction
  •   Reconstructing all physics object candidates, producing Data Summary Tapes (DST) and Thumbnails (TMB) for further analyses.
  • Physics and detector simulation
  •   Off-the-shelf event generators to simulate physics processes, and home-grown Geant-based (slow) and parameterized (fast) programs to simulate detector responses.

25
Computing Architecture

[Diagram: Central Data Storage connects to dØmino, the Central Analysis Backend (CAB), and Remote Analysis.]
26
Storage and Disk Needs
For two-year running:

Storage: store all officially produced and user-derived datasets at Fermilab; robotic tape system -> 1.5 PB
Disk: all data and some MC TMBs are disk resident, plus sufficient disk cache for user files -> 40 TB at analysis centers
27
Analysis CPU Needs
CPU needs are estimated based on the layered analysis approach:
  • DST based
  •   Resource intensive; limited to physics, object ID, and detector groups
  •   Example: insufficient TMB information, improved algorithms, bug fixes, ...
  • TMB based
  •   Medium resources required; expected to be done mostly by subgroups
  •   Example: creating derived datasets, direct analyses on TMB, ...
  • Derived datasets
  •   Done daily by individuals on their desktops and/or laptops
  •   Example: Root-tree level analyses, selection optimization, ...

The CPU need is about 4 THz for a data sample of two-year running
28
Analysis Computing
RAC (Regional Analysis Center)
  • dØmino and its backend (CAB) at the Fermilab Computing Center, provided and managed by the Fermilab Computing Division
  •   dØmino is a cluster of SGI O2000 CPUs; it provides limited CPU power but large disk caches and high-performance I/O
  •   CAB is a 160-node dual 1.8 GHz AMD CPU Linux farm on the dØmino backend; it should provide the majority of analysis computing at Fermilab
  • CluEDØ at DØ
  •   over 200 Linux desktop PCs from collaborating institutions, managed by volunteers from the collaboration
  • CAB and CluEDØ are expected to provide half of the estimated analysis CPU needs; the remainder is to be provided by regional analysis centers
29
Overview of SAM (Sequential Access to data via Meta-Data)

[Diagram. Shared globally: Database Server(s) (central database), Name Server, Global Resource Manager(s), Log Server. Shared locally: Station 1..n Servers and Mass Storage System(s). Arrows indicate control and data flow.]
30
Components of a SAM Station
[Diagram of a SAM station: File Storage Clients and a File Storage Server, a Station Cache Manager, Project Managers, Producers/Consumers, and File Stager(s), with cache disk and temp disk on the worker nodes; data-flow and control paths connect to the MSS or other stations.]
31
SAM Deployment
  • The success of the SAM data handling system is the first step towards utilizing offsite computing resources
  • SAM stations deployed at collaborating institutions provide easy data storage and access

[Map; only the most active sites are shown]
32
(No Transcript)
33
SAM-GRID
SAM-GRID is a Particle Physics Data Grid project. It integrates Job and Information Management (JIM) with the SAM data management system. A first version of SAM-GRID was successfully demonstrated at SuperComputing 2002 in Baltimore. SAM-GRID could be an important job management tool for our offsite analysis efforts.
34
JIM v1 Deployment Plan to Achieve the April 1
Milestone
  • Lee Lueking, Igor Terekhov, Gabriele Garzoglio
  • Fermilab

35
Objectives of SAMGrid
  • Bring standard grid technologies (Globus and Condor) to the Run II experiments.
  • Enable globally distributed computing for D0 and CDF.
  • JIM complements SAM by adding job management and monitoring to data handling.
  • Together, JIM + SAM = SAMGrid

36
Principal Functionality
  • Enable all authorized users to use off-Fermi-site computing resources
  • Provide a standard interface for all job submission and monitoring
  • Jobs can be: 1. Analysis, 2. Reconstruction, 3. Monte Carlo, 4. Generic (vanilla)
  • JIM v1 features:
  •   Submission to the SAM station of the user's choice
  •   Automatic selection of a SAM station based on the amount of input data cached at each station
  •   Web-based monitoring

37
Job Management
User Interface
User Interface
Submission Client
Submission Client
Match Making Service
Match Making Service
Broker
Queuing System
Queuing System
Information Collector
Information Collector
JOB
Data Handling System
Data Handling System
Data Handling System
Data Handling System
Execution Site 1
Execution Site n
Computing Element
Computing Element
Computing Element
Storage Element
Storage Element
Storage Element
Storage Element
Storage Element
Grid Sensors
Grid Sensors
Grid Sensors
Grid Sensors
Computing Element
38
Site Requirements
  • Linux i386 hardware architecture machines
  • SAM station with working "sam submit"
  • For MC clusters, mc_runjob installed
  • Submission and execution sites continuously run SAM and JIM servers, with auto-restart procedures provided.
  • Execution sites can configure their Grid users and batch queues to avoid self-inflicted DoS, e.g. all users mapped to one user "d0grid" with limited resources.
  • Firewalls must have specific ports open for incoming connections to SAM and JIM. Client hosts may include FNAL and all submission sites.
  • Execution sites trust grid credentials used by D0, including DOE Science Grid, FNAL KCA, and others by agreement. To use the FNAL KCA, a Kerberos client must be installed at the submission site.

39
JIM V1 Deployment
  • A site can join SAM-Grid with any combination of services:
  •   Monitoring, Execution, Submission
  • April 1, 2003: expect 5 initial sites for SAMGrid deployment, and 20 submission sites.
  • May 1, 2003: a few additional sites, depending on the success and use of the initial deployment.
  • Summer 2003: continue to add execution and submission sites. Hope to grow to dozens of execution and hundreds of submission sites.
  • CAB is a powerful resource at FNAL, but...
  •   Globus software is not well supported on IRIX (the CAB station server runs on d0mino).
  •   The FNAL computer security team restricts Grid jobs to in situ executables, or KCA certificates for user-supplied executables.

40
SAMGrid Dependencies
41
Expectations from D0
  • By March 15: need 2 volunteers to help set up beta sites and conduct submission tests.
  • We expect runjob to be interfaced with the JIM v1 release to run MC jobs.
  • In the early stages, it may require ½ FTE at each site to deploy and to help troubleshoot and fix problems.
  • Initial deployment expectations:
  •   GridKa: analysis site
  •   Imperial College and Lancaster: MC sites
  •   U. Michigan (NPACI): reconstruction center
  • Second round of deployments: Lyon (ccin2p3), Manchester, MSU, Princeton, UTA.
  • Others include NIKHEF and Prague; need to understand EDG/LCG implications.

42
How Do We Spell Success?
  • Figures of merit:
  • Number of jobs successfully started in remote execution batch queues.
  •   If a job crashes, it's beyond our control.
  •   There may be issues related to data delivery that could be included as a special failure mode.
  • How effectively CPU is used at remote sites.
  •   We may change the scheduling algorithm for job submission and/or tune queue configuration at sites; this requires the cooperation of participating sites.
  • Ease of use, and how much work gets done on the Grid.