1
Putting Existing Farms on the Testbed
  • Manchester DZero/Atlas and BaBar farms are
    available via the Testbed.
  • Done with a handful of modifications to the
    Testbed site and to the existing farms.
  • This talk describes what we did and how you can
    do it too...

Andrew McNab - Manchester HEP - 17 September 2002
2
Farms at Manchester HEP
  • BaBar: 80 x 0.8GHz
  • GridFarm: 16 x 1.0GHz
  • DZero / Atlas: 60 x 1.5GHz
3
The problem
  • We want to make existing farms available on the
    Testbed.
  • But we don't want to massively reconfigure/reinstall
    farms:
      • they're in production so need to be kept stable
      • they are already configured the way their owners
        need
  • We might want to keep reinstalling as EDG
    software is updated:
      • this is labour intensive unless we install from
        scratch with LCFG install
      • don't want to have to make many manual changes to
        CE etc every time we install/upgrade
  • Solution that has been mentioned several times is
    to have a standard EDG Testbed Site as a front
    end to the Existing Farm
  • So want to find the minimal set of changes to
    Farm and Testbed Site that will put the Farm on
    the Testbed.

4
Standard Testbed Site
  • All elements installed from LCFG server
  • Computing Element shares /home directories by
    NFS
  • Storage Element shares /flatfiles with data by
    NFS
  • PBS Server on CE talks to PBS on Worker Nodes.

[Diagram: an LCFG server installs the CE, the SE and three WNs.
The CE runs the PBS Server and exports /home; the SE exports
/flatfiles; each WN is a PBS Node.]
5
What we want
[Diagram: the Grid Farm / Testbed Site (LCFG, CE with PBS Server
and /home, SE with /flatfiles, WNs as PBS Nodes) beside the BaBar
or DZero/Atlas Farm with its own PBS Server and PBS Nodes; the
two sides are linked by qsub.]
6
Reconfigure Existing Farm
  • PBS Server must allow access from CE, but only
    for the right users.
  • Add CE to list of valid job submission clients
    (eg in hosts.equiv)
  • Create special queue (bfq or dfq) for Testbed
    jobs.
  • Limit queues so desired pool of accounts (eg
    atlas001 etc) can submit jobs to the bfq/dfq but
    other queues/pools forbidden.
  • PBS Nodes need access to pool accounts, home
    directories on CE, and /flatfiles area on SE.
  • If already using NFS automount, then easy to add
    /home on CE and /flatfiles on SE (eg as
    /nfs/gf-home and /nfs/gf-flatfiles)
  • Add pool accounts to /etc/passwd (or NIS)
  • Make symbolic links in /home to automount CE
    /home directories.
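
A minimal sketch of the queue restriction described above, assuming the farm's PBS Server is configured through qmgr. The helper names (`pool_accounts`, `qmgr_commands`) and the pool size are illustrative, not taken from the Manchester setup; the attribute names (`queue_type`, `acl_user_enable`, `acl_users`, `enabled`, `started`) are the usual OpenPBS/Torque ones and should be checked against the installed PBS version. The script only prints directives; it does not run qmgr.

```python
def pool_accounts(prefix, count):
    """Return pool account names such as atlas001, atlas002, ..."""
    return [f"{prefix}{n:03d}" for n in range(1, count + 1)]


def qmgr_commands(queue, users):
    """Build qmgr directives for an execution queue limited to `users`."""
    cmds = [
        f"create queue {queue} queue_type=execution",
        f"set queue {queue} acl_user_enable = True",
    ]
    for i, user in enumerate(users):
        # The first user replaces the ACL, later users are appended to it.
        op = "=" if i == 0 else "+="
        cmds.append(f"set queue {queue} acl_users {op} {user}")
    cmds += [
        f"set queue {queue} enabled = True",
        f"set queue {queue} started = True",
    ]
    return cmds


if __name__ == "__main__":
    # Hypothetical pool of 3 accounts for the dfq queue.
    for line in qmgr_commands("dfq", pool_accounts("atlas", 3)):
        print(line)
```

The printed lines could be reviewed by hand and then fed to qmgr on the farm's PBS Server.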

7
Software on PBS Nodes
  • For current EDG job submissions to work, need to
    install globus-url-copy RPMs on PBS Nodes.
  • PBS Nodes currently need to make outgoing
    gridftp connections to Resource Broker.
  • GridFTP possible with NAT, but difficult.
  • Other middleware RPMs will be needed if also
    intending to manipulate SE and RC during jobs.
  • For use with EDG Testbed, should also install
    relevant application RPMs
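
One way to confirm that a PBS Node has the client software it needs is to check that the commands are on the PATH. The sketch below is an illustration only: the helper name `missing_tools` is made up here, and the tool list contains just `globus-url-copy` because that is the only command the slide names.

```python
import shutil


def missing_tools(tools):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(["globus-url-copy"])
    if missing:
        print("missing on this node:", ", ".join(missing))
    else:
        print("all required tools found")
```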

8
Changes to Testbed Site
  • Have attempted to minimise changes:
      • easier to document and support
      • easier to maintain as EDG software changes
  • Basic philosophy: modify EDG scripts to make
    remote qsub and qstat calls to PBS Server
    machines on the farms.
  • Only need to edit 3 scripts on the CE:
      • /opt/globus/libexec/globus-script-pbs-queue
      • /opt/edg/info/mds/sbin/skel/ce-globus.skel
      • /opt/edg/info/mds/bin/ce-pbs
  • Create grid-mapfile and ce-static.ldif for each
    queue.
  • Include farm queue and PBS nodes in LCFG
    site-cfg.h
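
The slides do not say how the edited scripts address the remote PBS Server. One possibility, sketched here as an assumption, is the standard PBS `queue@server` destination syntax accepted by `qsub -q` and `qstat -Q`. The function names and the server hostname `pbs.farm.example` are hypothetical; the sketch only builds the argument lists and does not execute them.

```python
def remote_qsub_argv(queue, server, script_path):
    """Argument list to submit `script_path` to `queue` on a remote PBS Server."""
    return ["qsub", "-q", f"{queue}@{server}", script_path]


def remote_qstat_argv(queue, server):
    """Argument list to query the status of `queue` on a remote PBS Server."""
    return ["qstat", "-Q", f"{queue}@{server}"]


if __name__ == "__main__":
    # Hypothetical server name and job script path.
    print(" ".join(remote_qsub_argv("dfq", "pbs.farm.example", "/tmp/job.sh")))
    print(" ".join(remote_qstat_argv("dfq", "pbs.farm.example")))
```

Building the argument list separately from running it keeps the change to each EDG script small: only the place where the command line is assembled needs to know about the remote server.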

9
New behaviour
  • Modified ce-pbs queries PBS Server using remote
    qstat
  • Publishes edited grid-mapfile listing only the
    right users.
  • Jobs can be submitted using Resource Broker,
    based on published information.
  • When received by CE, globus-script-pbs-queue
    submits job to remote PBS Server
  • EDG Globus jobmanager on CE monitors job status
    via remote qstat and transmits to Logging as
    normal.
  • Job runs on PBS Node with access to pool account
    /home
  • Job completes and returns files to RB via gridftp
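
The per-queue grid-mapfile could be produced by filtering the site-wide one. The sketch below assumes the usual grid-mapfile layout of a quoted certificate DN followed by a local account or a pool name with a leading dot (for example `.atlas`). The function name, the DNs and the pool names are made up for illustration.

```python
import shlex


def filter_gridmap(text, allowed):
    """Keep only grid-mapfile entries whose mapped account is in `allowed`."""
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(stripped)  # handles the quoted DN
        if len(parts) != 2:
            continue  # skip malformed entries
        dn, account = parts
        if account in allowed:
            kept.append(f'"{dn}" {account}')
    return "\n".join(kept)


if __name__ == "__main__":
    # Illustrative entries only; real DNs come from the site grid-mapfile.
    sample = (
        '"/O=Grid/CN=Alice Example" .atlas\n'
        '"/O=Grid/CN=Bob Example" .babar\n'
    )
    print(filter_gridmap(sample, {".atlas"}))
```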

10
Example logs
  • Three jobmanagers visible to GridPP MDS and RB:
      • gf18.hep.man.ac.uk:2119/jobmanager-pbs-gfq (Grid
        Farm/Testbed)
      • gf18.hep.man.ac.uk:2119/jobmanager-pbs-dfq
        (DZero/Atlas farm)
      • gf18.hep.man.ac.uk:2119/jobmanager-pbs-bfq (BaBar
        farm)
  • Different operating system, grid-mapfile lists of
    users etc for each queue.
  • Can submit job to RB and have it matchmake the
    requirements, including dynamic properties like
    free nodes.
  • Example log shows submitting a job from UI at RAL
    via RB at IC, which decides which farm at
    Manchester matches and sends the job there.
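
To make the naming scheme concrete, here is a small sketch that splits a jobmanager contact string into its parts, assuming the form host:port/jobmanager-LRMS-queue with the gatekeeper on port 2119. The `Contact` type and `parse_contact` function are illustrative helpers, not part of EDG or Globus.

```python
from typing import NamedTuple


class Contact(NamedTuple):
    host: str
    port: int
    jobmanager: str
    queue: str


def parse_contact(contact):
    """Split 'host:port/jobmanager-LRMS-queue' into its components."""
    hostport, _, service = contact.partition("/")
    host, _, port = hostport.partition(":")
    if not (host and port.isdigit() and service.startswith("jobmanager-")):
        raise ValueError(f"unrecognised contact string: {contact!r}")
    # "jobmanager-pbs-dfq" -> batch system "pbs", queue "dfq"
    _, lrms, queue = service.split("-", 2)
    return Contact(host, int(port), f"jobmanager-{lrms}", queue)


if __name__ == "__main__":
    print(parse_contact("gf18.hep.man.ac.uk:2119/jobmanager-pbs-dfq"))
```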

11
Applying this to other sites
  • This recipe is being written up for
    http://www.gridpp.ac.uk/tb-support/
  • With current EDG release, the PBS Nodes need
    outgoing direct internet access (not NAT).
  • You need to be able to make minor changes to PBS
    Server permissions, NFS mounts etc as described.
  • You should have some (3?) dedicated Testbed
    machines, or add it to an existing GridPP/EDG
    Testbed setup.
  • We use Microdirect.co.uk boxes:
    1.5GHz/256MB/40GB for 250 each.
  • If you don't use an EDG-supported batch system
    (PBS etc), you need to modify ce-pbs and
    globus-script-pbs- scripts to use your job
    submission commands.

12
Summary
  • It's not at all difficult to access existing PBS
    farms via an EDG Testbed site:
      • include CE and SE in NFS and PBS configuration
        of farm
      • include pool accounts in farm's passwd file
      • enforce security by account pools
  • Only need to modify a handful of files on the
    Testbed CE.
  • Should be relatively straightforward to apply
    this to other batch queue systems even if you
    don't use PBS.
  • We've demonstrated putting our 150 nodes (around
    1 GHz) on the current Testbed and submitting jobs
    via GridPP RB
  • You can too.
