1
FermiGrid
  • Eileen Berman
  • (for the FermiGrid Team)
  • Fermilab
  • Work supported by the U.S. Department of Energy
    under contract No. DE-AC02-07CH11359.

2
Overview
  • What Is FermiGrid?
  • Running a Job on FermiGrid
  • Job Accounting
  • Operational Experience
  • The Future
  • Conclusion

3
What is FermiGrid?
  • FermiGrid brings previously disjoint resources
    into a common infrastructure where the user
    communities retain control of their resources
    and also allow negotiation for opportunistic
    sharing.

4
What is FermiGrid?
  • The Fermilab campus Grid and Grid portal.
  • The site globus gateway.
  • Accepts jobs from external (to Fermilab) sources
    and forwards the jobs onto internal clusters.
  • A set of common services to support the campus
    Grid and interface to Open Science Grid (OSG) /
    LHC Computing Grid (LCG)
  • VOMS, VOMRS, GUMS, SAZ, MyProxy, Squid, Gratia
    Accounting, etc.
  • A forum for promoting stakeholder
    interoperability and resource sharing within
    Fermilab
  • CMS, CDF, D0
  • ktev, miniboone, minos, mipp, etc.

5
Software Stack
  • Baseline
  • Scientific Linux 3.0.x, 4.x, 5.0
  • OSG 0.6.0 (VDT 1.6.1, Globus Toolkit 4, WS-Gram,
    Pre-WS Gram)
  • Additional Components
  • VO Management Service - VOMS
  • VO Membership Registration Service - VOMRS
  • Grid User Mapping Service - GUMS
  • Site AuthoriZation Service - SAZ
  • Job forwarding job manager - jobmanager-cemon
  • Credential storage - MyProxy
  • Web proxy cache - Squid
  • Auditing - syslog-ng
  • Accounting - Gratia
  • Virtualization - Xen

6
Hardware
  • FermiGrid core systems
  • Dual 3.6 GHz Intel Xeon
  • 4 GB of PC2100 DDR RAM
  • 2 mirrored 10K rpm SCSI system disks
  • A single gigabit Ethernet connection to the LAN
  • Cluster Worker Nodes
  • Heterogeneous commodity computers

7
Compute Elements and Worker Nodes
[Diagram: the FermiGrid site gateway forwards jobs to the CMS, CDF, D0, and
GP (General Purpose) gatekeepers and their worker-node clusters; the CMS
worker nodes are operated by CMS. Legend: WN = Worker Node, VM = Virtual
Machine. Cluster sizes shown: 220 WNs / 600 VMs, 580 WNs / 1400 VMs,
1140 WNs / 6000 VMs, and 1200 WNs / 4800 VMs.]
8
Batch Systems
  • Condor
  • This is the dominant batch system at Fermilab.
  • Is in use on the GP, CDF and CMS tier 1 Grid
    Clusters.
  • PBS
  • Is in use on the D0 farms and D0 Grid Clusters,
    and on LQCD.
  • It gives us an alternative and/or competitor to
    Condor...
  • Sun Grid Engine
  • No direct experience at Fermilab - yet.
  • We will be commissioning a small batch system
    test cluster with SGE later this year.

9
Storage
  • BlueArc NFS Shared File Systems
  • Purchased 24 TBytes (raw) / 14 TBytes (formatted
    RAID 6)
  • NFS server appliance (fiber channel)
  • Failover supported
  • Currently mounted on (most) FermiGrid clusters
  • Temporary storage for job data, applications
  • Public dCache (FNAL_FERMIGRID_SE)
  • 7 TBytes of storage accessible via Storage
    Resource Manager.
  • Access to permanent storage tape robots
  • For FY07
  • Plan to closely monitor needs, user requests and
    utilization.
  • Do not currently expect to request additional
    storage until FY08.

10
Virtual Organizations
  • FermiGrid supports many diverse Virtual
    Organizations
  • "a group of individuals or institutions who
    share the computing resources of a grid for a
    common goal." (Wikipedia)
  • often equates to an experiment

11
Virtual Organizations
  • Resource Providers (sites) or Grids establish
    trust relationships with a VO
  • Contracts between Resource Providers and VOs
    govern resource usage policies (site and VO
    policies)
  • A VO Manager is designated by a VO to
    validate/approve new VO members
  • A VO's member structure may include groups,
    subgroups and/or roles into which it divides its
    members according to their responsibilities and
    tasks
  • Allows Grid level scalability for membership
    registration, site account management, etc.

12
VOs Hosted at Fermilab
13
Job Submission
14
Job Submission
  • Obtain a certificate from an OSG recognized
    Certificate Authority
  • DOE Grids, Fermilab Kerberos Certificate
    Authority
  • All IGTF recognized CAs
  • Register as part of a VO (Once) (1)
  • VO Membership Registration Service (VOMRS).
  • Used by individuals to apply for membership in a
    VO, VO subgroup, and/or Role.
  • Workflow manages the approval or rejection of the
    individual's application.
  • Push VOMRS configuration into VOMS server (2)
  • VO Management Service.
  • Authority for DNs in VO-authorized groups
    and/or roles (Fully Qualified Attribute Name,
    FQAN).

15
Job Submission
  • Periodic synchronization of VOMS information by
    Site GUMS Service (3)
  • Grid User Mapping Service.
  • List of VOMS servers to query is configured by
    the system administrator.
  • Maps DN/VO extended attributes to a site-specific
    UID/GID.

16
Job Submission
  • User obtains VOMS proxy certificate (4)
  • Contains X.509 grid certificate and VOMS
    attributes (Roles, Groups, VO)
  • e.g., voms-proxy-init -voms
    fermilab:/fermilab/usminos/Role=softadmin
  • Single initial certificate used when joining
    multiple VOs
  • User submits job (5)
  • condor_submit, globus-job-run, globus-job-submit,
    etc.
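  • A minimal end-to-end sketch of these two steps (the
    gateway host name, jobmanager string, and file names
    below are illustrative assumptions, not taken from
    these slides):
      # 1. obtain a VOMS proxy for the fermilab VO
      voms-proxy-init -voms fermilab
      # 2. myjob.sub - a minimal Condor-G submit description file
      universe      = grid
      grid_resource = gt2 fermigrid1.fnal.gov/jobmanager-cemon
      executable    = myjob.sh
      output        = myjob.out
      error         = myjob.err
      log           = myjob.log
      queue
      # 3. submit the job through the site gateway
      condor_submit myjob.sub
      # or run a one-off command directly through the gatekeeper
      globus-job-run fermigrid1.fnal.gov /bin/hostname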

17
Job Submission
  • Callout to Site Authorization Service (SAZ) (6)
  • Site Whitelist/Blacklist Service
  • Returns Allow/Deny
  • Registration automatic at 1st grid job (stores
    DN, VO, Role, CA) (default-accept)
  • voms-proxy-init vs. grid-proxy-init
  • By default, allows access to users who present a
    proxy certificate generated from voms-proxy-init
    (see the sketch at the end of this list)
  • On a case-by-case basis, users who present a
    proxy certificate generated from grid-proxy-init
    can be allowed access manually
  • Allows the Fermilab security authorities to
    impose a site-wide grid access policy
  • Allows for rapid temporary suspension of a DN,
    VO, Role, Group, or CA during incident
    investigation
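  • A small illustration of the default policy above (a
    sketch; the VO name is just an example):
      # admitted by default - the proxy carries VOMS attributes
      voms-proxy-init -voms fermilab
      # not admitted by default - a plain Globus proxy without VOMS
      # attributes; such DNs must be allowed manually, case by case
      grid-proxy-init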

18
Job Submission
  • Callout to Grid User Mapping Service (GUMS) (7)
  • Returns local UID or null
  • Many-to-One, One-to-One, One-to-Self Mapping
    Types
  • The mapping scheme that is used at a Site is a
    decision of the site administrators based on the
    request of the VO using the site. At Fermilab,
    the usCMS VO has requested that users be mapped
    one-to-one, while the D-Zero VO is using the
    many-to-one mapping.
  • Implemented as a Java Servlet and run in Tomcat.
  • Developed at Brookhaven National Laboratory (BNL)

19
GUMS Mapping Types
  • many-to-one
  • All members of the same VO having the same
    extended key attributes are mapped to a single
    local UID.
  • In OSG the many-to-one mapping is an acceptable
    usage case.
  • With many-to-one mappings it is possible to
    obtain the proxy of another user running on the
    same batch system.
  • Currently, the OSG prevents this through
    policies set in the Acceptable Usage Policy (AUP)
    document that all OSG users must electronically
    sign when they register with their VO.
  • one-to-one
  • A set of pool accounts is created on the batch
    system and GUMS maps a user DN with extended key
    attributes to a single, specific pool account.
  • The mapping is maintained in the GUMS database and
    reused for each subsequent mapping request for that
    user / attributes / site combination (see the
    illustration at the end of this list).
  • Multiple users in the same VO cannot obtain
    another user's proxy certificate, nor can they
    accidentally delete or overwrite another user's
    data.
  • Easier to trace a rogue job in the queue to the
    originating user.
  • The site GUMS administrator must monitor the
    usage of the pool accounts, and the system
    administrators must make sure that there are
    enough pool accounts available on each worker
    node (WN) in the batch system.
  • one-to-self
  • A user is mapped to their own local UID on the
    batch system.
  • Used at Brookhaven National Laboratory (BNL).
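  • A schematic, grid-mapfile style illustration of the
    three mapping types (all DNs and account names below
    are hypothetical):
      # many-to-one: every member of the VO maps to one group account
      "/DC=org/DC=doegrids/OU=People/CN=Alice Example" dzero
      "/DC=org/DC=doegrids/OU=People/CN=Bob Example"   dzero
      # one-to-one: each member maps to a distinct pool account
      "/DC=org/DC=doegrids/OU=People/CN=Alice Example" uscms0001
      "/DC=org/DC=doegrids/OU=People/CN=Bob Example"   uscms0002
      # one-to-self: each member maps to their own local account
      "/DC=org/DC=doegrids/OU=People/CN=Alice Example" alice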

20
Job Submission
  • Job matched to suitable cluster (8)
  • Resource Selection Service (ReSS) uses the Condor
    Match-making System to match the requirements
    specified by the job against information gathered
    by the gLite CEMon information system
  • Jobs are matched against the various resources
    available at the point in time that the job was
    submitted
  • Must have submitted the job to the fermigrid1
    central gatekeeper
  • By default, a user's job will be matched against
    clusters which support the user's VO and have at
    least one free slot available
  • Users have the ability to add additional
    conditions to this requirements attribute, using
    the attribute named "gluerequirements" in the
    Condor submit file, specified in terms of Glue
    Schema attributes (see the example below).
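  • For example, adding a line of the following form to
    the Condor submit file (a sketch - the attribute
    value uses standard Glue Schema 1.x names, but the
    exact syntax expected by the gateway is an
    assumption here) would restrict matching to clusters
    advertising at least 2 GB of memory per worker node:
      +gluerequirements = "GlueHostMainMemoryRAMSize >= 2048"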

21
Pilot Jobs
  • Pilot (glide-in) Jobs (9)
  • Place holder jobs in the batch queue.
  • Call home when they land on a WN to get the
    real User Job
  • The owner of the Pilot Job may be authorized to
    run but the owner of the User Job may not be
  • glexec - a suexec derivative (from Apache)
  • glexec is a "mini gatekeeper".
  • Uses the proxy certificate of the owner of the
    User Job and checks it against SAZ (6) and GUMS
    (7)
  • Execs the job as the correct local UID,
    protecting the Pilot Job (10)
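  • Conceptually, the hand-off on the worker node looks
    something like the sketch below (the
    environment-variable interface shown is an
    assumption based on later glexec releases, not a
    detail from these slides):
      # running as the Pilot Job owner on the WN:
      export GLEXEC_CLIENT_CERT=/path/to/user_proxy   # proxy of the User Job owner
      # glexec verifies the user proxy against SAZ and GUMS, then
      # execs the payload under that user's local UID
      glexec /path/to/user_job.sh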

22
gLExec
  • Joint development by NIKHEF (David Groep / Gerben
    Venekamp / Oscar Koeroo) and Fermilab (Dan Yocum
    / Igor Sfiligoi).
  • glexec allows a site to implement the same level
    of authentication, authorization and accounting
    for glide-in jobs on the WN as for jobs which are
    presented to the globus gatekeeper of the CE.
  • Allows a VO to prioritize jobs to its own
    policies without sacrificing security.
  • Based on the LCAS/LCMAPS infrastructure (from
    EGEE) and integrated with GUMS/SAZ via plugin
    infrastructure for OSG.
  • glexec is currently deployed on several clusters
    at Fermilab.
  • glexec, with the GUMS plugin, is scheduled to be
    part of the next VDT release.

23
Accessing Storage
  • Mass Storage access
  • Access to the disk cache (dCache) is available via
    GridFTP or the Storage Resource Manager (SRM) (8)
    (see the examples at the end of this list)
  • Dynamic space allocation
  • File management functionality
  • dCache, developed in collaboration with DESY, is
    one of the leading, recognized large-scale,
    high-performance storage solutions for the LHC and
    other High Energy Physics experiments
  • Uses gPLAZMA to interface to GUMS (7)
  • Authorized access to tape storage
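  • Illustrative transfer commands (the SRM/GridFTP
    endpoint and paths below are placeholders, not the
    actual FNAL_FERMIGRID_SE addresses):
      # copy a local file into the public dCache via GridFTP
      globus-url-copy file:///local/data.root \
          gsiftp://se.example.fnal.gov/pnfs/fnal.gov/data/fermigrid/data.root
      # the same transfer through the SRM interface
      srmcp file:////local/data.root \
          srm://se.example.fnal.gov:8443/pnfs/fnal.gov/data/fermigrid/data.root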

24
Gratia Accounting
  • Robust, scalable, and accurate grid accounting
    service for OSG.
  • Interoperates with EGEE accounting system.
  • Utilizes a "store locally and forward" model to
    a central collector (repository).
  • This assures that accounting data is not lost.
  • Sites can operate a local site Gratia collector
    which summarizes site data and then forwards the
    summaries to the OSG-wide Gratia collector.
  • Gratia probes interface between the Globus
    Gatekeeper and the site specific batch system to
    capture the accounting data.
  • Condor, PBS, SGE, glexec, dCache, etc.
  • psacct-probe captures system level information
  • Running since September 2006: 9.3 million records,
    a current accretion rate of 90K batch records/day,
    30 GB of data (no storage records yet... soon)
  • Effort now turning toward understanding the data
    being collected.
  • Generation of standard dashboard accounting
    plots.
  • Supporting ad-hoc queries against the Gratia
    accounting data

25
Operational Experience
  • It works... well!
  • Typical week in February, 2007
  • Over 110,000 jobs submitted by 18 VOs to 5
    separate clusters.
  • On one day
  • A peak of about 450,000 GUMS mappings occurred
    (about 5 Hz)
  • System load was nominal at 1-3 on the 1, 5, and 15
    minute averages

26
User Experience on FermiGrid/OSG
  • VOs on OSG using Fermilab resources
  • Fermilab operates as a universal donor of
    opportunistic cycles to OSG VOs.
  • Gratia accounting data show that 10 of the Grid
    systems at Fermilab are being used
    opportunistically.
  • But there are VOs which run into problems with
    our job forwarding site gateway and handling the
    heterogeneous cluster configurations.
  • There are also VOs which desire significant
    amounts of resources beyond what we are able to
    opportunistically contribute
  • VOs from other Grids using OSG/Fermilab
    resources.
  • We participate in the GIN (Grid Interoperability
    Now) efforts.
  • Most recently - Individuals from PRAGMA have been
    able to make successful use of Fermilab Grid
    computing resources.

27
Unique VOs on FermiGrid
28
CPU Time per Organization on FermiGrid
[Plot; vertical axis: CPU Hours]
29
SAZ Operational Experience
  • Monitoring of the operational Site AuthoriZation
    Service (SAZ) client and server appears to
    indicate that SAZ will be able to scale to the
    expected number of calls/day necessary to support
    the full deployment at Fermilab: (all CEs + WNs) x
    (number of jobs/day).

30
Site AuthoriZation Service (SAZ)
31
GUMS Scaling
32
The Future
  • FermiGrid-High Availability Upgrades
  • Linux-HA
  • Active-Active or Active-Standby depending on the
    service.
  • XEN
  • Virtualization of services.
  • These technologies will be used for
  • FermiGrid-HA Site Globus Gatekeeper
  • Including Web Services GRAM.
  • FermiGrid-HA Services
  • VOMS, GUMS, SAZ, etc.
  • FermiGrid-HA vobox/edge services
  • FermiGrid-HA grid development platforms

33
Auditing Grid Service
  • Allow assessment of overall security condition
    across sites and VOs
  • Provides forensic analysis tools for security
    investigations
  • Complementary to the site specific security
    processes
  • Currently in design phase

34
Conclusion
  • FermiGrid is a complex, robust, and scalable grid
    gateway system built using commodity hardware and
    community provided software.
  • 15,000 jobs submitted per day
  • 400,000 user mappings per day
  • Movement of hundreds of terabytes to/from mass
    storage
  • FermiGrid Web Site / Additional Documentation
  • http://fermigrid.fnal.gov/

35
Conclusion
  • Questions??

36
FermiGrid Strategy
  • Strategy
  • In order to better serve the entire program of
    Fermilab, the Computing Division has undertaken
    the strategy of placing all of its production
    resources in a Grid "meta-facility"
    infrastructure called FermiGrid.
  • This strategy is designed to allow Fermilab
  • to ensure that the large experiments that
    currently have dedicated resources have first
    priority usage of those resources that are
    purchased on their behalf.
  • to allow opportunistic use of these dedicated
    resources, as well as other shared Farm and
    Analysis resources, by various Virtual
    Organizations (VO's) that participate in the
    Fermilab experimental program and by certain VOs
    that use the Open Science Grid (OSG).
  • to optimise use of resources at Fermilab.
  • to provide a coherent way of putting Fermilab on
    the Open Science Grid.
  • to save some effort and resources by implementing
    certain shared services and approaches.
  • to work together more coherently to move all of
    our applications and services to run on the Grid.
  • to better handle a transition from Run II to LHC
    in a time of shrinking budgets and possibly
    shrinking resources for Run II worldwide.
  • to fully support Open Science Grid and the LHC
    Computing Grid and gain positive benefit from
    this emerging infrastructure in the US and
    Europe.

37
Timeline and Effort
  • The FermiGrid concept started in mid CY2004 and
    the FermiGrid strategy was formally announced in
    late CY2004.
  • The initial hardware was ordered and delivered in
    early CY2005.
  • The Initial core services (Globus Gateway, VOMS
    and GUMS) based on OSG 0.2.1 were commissioned on
    April 1, 2005.
  • We missed the Ides of March, so we chose April
    Fools' Day.
  • Our first site gateway which used Condor-G
    matchmaking and MyProxy was commissioned in the
    fall of CY2005.
  • Job forwarding based on work by GridX1 in Canada
    (http://www.gridx1.ca).
  • Users were required to store a copy of their
    delegated grid proxy in our MyProxy repository
    prior to using the job forwarding gateway.
  • OSG 0.4 was deployed across FermiGrid in late
    January-February 2006.
  • Followed quickly by OSG 0.4.1 in March 2006.
  • The Site AuthoriZation (SAZ) service was
    commissioned on October 2, 2006.
  • Provides site wide whitelist and blacklist
    capability.
  • Can make decision based on any of DN, VO, Role,
    and CA.
  • Currently operate in a default accept mode
    (providing that the presented proxy was generated
    using voms-proxy-init).
  • The glexec pilot job glide-in service was
    commissioned on November 1, 2006.
  • Provides authorization and accounting trail for
    Condor glide-in jobs.
  • The latest version of the site job forwarding
    gateway (jobmanager-cemon) was commissioned in
    November 2006
  • Eliminated the need to utilize MyProxy via the
    "accept limited" option on the gatekeeper.
  • Based on CEMon and OSG RESS, Condor Matchmaking.
  • Periodic hold and release functions were added in
    March 2007.

38
Authorization / Authentication
  • DOEgrids Certificate Authority
  • Long lived (1 year) certificates.
  • Heavy weight process (from the perspective of the
    typical user).
  • Fermilab Kerberos Certificate Authority
  • Service run on Fermilab Kerberos Domain
    Controllers.
  • Most Fermilab personnel already have Kerberos
    accounts for single sign on.
  • Lighter weight process than DOEgrids (from the
    user's perspective).
  • Short lived (1-7 day) certificates
  • kinit -n -r7d -l26h (obtain a renewable Kerberos ticket)
  • kx509 (obtain a short-lived X.509 certificate from the Kerberos CA)
  • kxlist -p (write the certificate to the standard grid proxy location)
  • voms-proxy-init -noregen -voms fermilab:/fermilab
    -valid 168:0 (add VOMS attributes to the proxy)
  • submit grid job
  • Support cron jobs through kcroninit
  • /usr/krb5/bin/kcron <script>
  • /DC=gov/DC=fnal/O=Fermilab/OU=Robots/CN=cron/CN=Keith
    Chadwick/UID=chadwick
  • TAGPMA IGTF

39
Certificates / Mappings per Day
40
GUMS Mappings - CMS