Title: RAL Tier A Status


1
RAL Tier A Status
  • Tim Adye
  • Rutherford Appleton Laboratory
  • BaBar UK Collaboration Meeting
  • Liverpool
  • 11th April 2003

2
BaBar Batch CPU Use at RAL
3
BaBar Batch Users at RAL (running at least one
non-trivial job each week)
4
Kanga Disk Saga
  • In December we had filled up all 20 TB at RAL
  • Freed up some space by deleting (most) old
    Series-8 data and started importing the backlog
  • A minor upgrade of our old data server, csfsun02,
    on 19 Feb caused a major loss of data
  • Recovered:
      • 1.3 TB scavenged from csfsun02 disks
      • 1.4 TB re-imported from SLAC disk
      • 0.3 TB restored from SLAC HPSS
  • Halfway through the recovery, we discovered that
    csfsun02 was still bad.
  • All data migrated to borrowed servers.
  • All Kanga data restored and up-to-date with SLAC
    production on 28 March.

5
Security Incident
  • SucKIT Linux root exploit has been spreading
    throughout the HEP community
  • An infected machine records all passwords typed
    on that machine
  • Includes passwords used to connect to other
    machines
  • ssh is included; fortunately, klog is not
  • It's not unlikely that CSF passwords have been
    compromised via another system
  • To protect CSF from further attack, all passwords
    that have been used recently were reset on Tuesday
  • Users contacted by phone and post
  • I can give you your new password today

6
Linux Upgrade
  • Nearly all machines at RAL now run RedHat 7.2
  • Exceptions are
  • babar-old.gridpp.rl.ac.uk front-end (AKA csfc)
  • Will be switched off next week
  • babarbuild batch queue
  • RH72 batch workers can run RH6 jobs, but RH72
    machines can't build code in release analysis-13
    and before, so either
  • Upgrade to analysis-13b or later, or
  • Use the babarbuild queue to compile and link, then
    run in the normal queues
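
The build-then-run route can be sketched as two submissions. This is only an illustration: the script name build.sh is hypothetical, the -q option is the standard qsub queue selector (which bbrbsub is assumed to pass through), and the helper prints each command instead of running it so that it also works away from CSF.

```shell
# Dry-run helper: prints the bbrbsub command instead of executing it.
# Set DRY_RUN=0 on a machine where bbrbsub is installed to submit for real.
DRY_RUN=${DRY_RUN:-1}

submit() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "bbrbsub $*"
    else
        bbrbsub "$@"
    fi
}

# Step 1: compile and link on the babarbuild queue.
# build.sh is a hypothetical script wrapping the usual build commands.
submit -q babarbuild build.sh

# Step 2: run the resulting executable in the normal queues (no -q option).
submit BetaApp myAnalysis.tcl
```

The second submission only makes sense once the build job has finished, so in practice the two steps would be issued separately.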

7
CSF Batch System
  • Much work behind the scenes
  • Reliability and optimising queuing algorithms
  • Use bbrbsub to submit, e.g.
  • bbrbsub -l cput=01:00:00 BetaApp myAnalysis.tcl
  • bbrbsub is a wrapper for qsub, so you can use
    qsub options (see man qsub)
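
As a small illustration of passing a qsub option through bbrbsub, the sketch below builds a CPU-time request. It assumes the limit uses the PBS hh:mm:ss form (cput=01:00:00 for one hour); the helper name cput_limit is made up for this example, and the command is printed rather than run.

```shell
# Convert a CPU-time limit in whole minutes to the hh:mm:ss form used by
# PBS resource requests such as "-l cput=01:00:00".
# cput_limit is a hypothetical helper, not part of bbrbsub or PBS.
cput_limit() {
    minutes=$1
    printf '%02d:%02d:00\n' $((minutes / 60)) $((minutes % 60))
}

# Print (rather than run) the submission command, so the sketch also works
# on machines where bbrbsub is not installed.
echo "bbrbsub -l cput=$(cput_limit 90) BetaApp myAnalysis.tcl"
```

Other qsub options can be added to the command line in the same way.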

8
Recently Planned Improvements 1 (Since November)
  • Install dedicated import-export machines
  • Fast (Gigabit) network connection
  • Special firewall rules to allow scp, bbftp, bbcp,
    etc.
  • Two new RH72 Linux machines
  • csfmove01.rl.ac.uk for exports
  • AFS authentication improvements
  • PBS token passing and renewal
  • integrated login (AFS token on login, like SLAC)
  • Not yet implemented

9
Recently Planned Improvements 2 (Since November)
  • Objectivity support
  • Works now for private federations, but no data
    import
  • First step will be to provide Objy conditions
    database access
  • Objy conditions snapshot installed by Tim
    Barrass
  • Then we lost our Objy server, csfsun02
  • Upgrade Suns to Solaris 8 and integrate into PBS
  • 4 x 4-CPU Solaris 8 systems now available in
    babarsol queue, e.g.
  • bbrbsub -q babarsol job.sh

10
Recently Planned Improvements 3 (Since November)
  • Support Grid generic accounts, so special RAL
    user registration is no longer necessary
  • Users without an entry in the grid-mapfile will
    be assigned to babar001, babar002, ..., babar050
  • The pool account will forever more be bound to
    that certificate DN, so you will always run under
    the same babar0NN
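
The permanent binding of a certificate DN to a pool account can be sketched as a lookup in a small mapping file. This is an illustration only, not how the RAL service is implemented: the file pool-map.txt, its "account, tab, DN" layout, and the function name pool_account are all assumptions, and the sketch does no locking, so it is not safe for concurrent use.

```shell
# Map a certificate DN to a pool account and keep the binding permanent.
# POOL_MAP is a hypothetical file of "account<TAB>DN" lines, one per binding.
POOL_MAP=${POOL_MAP:-./pool-map.txt}
POOL_SIZE=50   # babar001 .. babar050

pool_account() {
    dn=$1
    touch "$POOL_MAP"
    # Reuse the existing binding if this DN has been seen before.
    acct=$(awk -F '\t' -v dn="$dn" '$2 == dn { print $1; exit }' "$POOL_MAP")
    if [ -z "$acct" ]; then
        # Otherwise hand out the next unused account, if any remain.
        used=$(( $(wc -l < "$POOL_MAP") ))
        if [ "$used" -ge "$POOL_SIZE" ]; then
            echo "pool exhausted" >&2
            return 1
        fi
        acct=$(printf 'babar%03d' $((used + 1)))
        printf '%s\t%s\n' "$acct" "$dn" >> "$POOL_MAP"
    fi
    echo "$acct"
}
```

Calling pool_account twice with the same DN returns the same babar0NN both times, which is the behaviour described above; a new DN takes the next free account.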

11
Support
  • For help, post to RAL Tier A HyperNews forum
    or
  • contact Emmanuel Olaiya (at SLAC) or me (at RAL)