1
LBNL/NERSC/PDSF Site Report for HEPiX, SLAC, Oct 7,
1999, by Thomas Davis (tadavis@lbl.gov)
2
LBL/NERSC/PDSF
  • NERSC (National Energy Research Scientific
    Computing)
  • NERSC and HENP
  • What is PDSF?
  • Staffing
  • Groups/Projects
  • Hardware
  • Software

3
LBL/NERSC/PDSF
  • National Energy Research Scientific Computing
  • Largest non-classified compute facility in the
    U.S.
  • 2,000 active users (1,000 heavy users)
  • HENP @ NERSC
  • Comp. Sci. Grand Challenge, Clipper, LDRD
  • Hardware: PDSF, HPSS, T3E, SP2, Viz Lab, Network
  • Nuclear: STAR, PHENIX, E895, E896, E871, NA49,
    etc.
  • HEP: ATLAS, BaBar, CDF, D0, etc.
  • HPSS: PROBE
  • LBL: ESnet site, ESnet hub

4
NERSC
  • HPSS development center
  • PROBE project - to be detailed later.
  • PDSF
  • Access to the T3E and SP2 supercomputers.
  • Equipment moving offsite to the Oakland data
    center in the next year.

5
NERSC FY00
6
PDSF
  • Parallel Distributed Systems Facility
  • Specialized hardware resource for use by HENP
  • ATLAS, BaBar, CDF, D0, E895, E871, E896, PHENIX,
    STAR, astrophysics, etc. (17 experiments/groups,
    >200 users)
  • NERSC/MICS operating funds
  • 2 FTE, 3 part-time staff.
  • FY1999 upgrade funds
  • Added 4 more terabytes of disk space and about 50
    more CPUs
  • LSF (Load Sharing Facility) - 158 single-CPU
    licenses; a usage sketch follows this list
  • Renewed support contract (ouch!)
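
  A rough usage sketch for LSF on a cluster like this; the queue
  name "pdsf" and the job script are hypothetical, not taken from
  the slides:

    # Submit a batch job to a (hypothetical) queue, capturing output
    bsub -q pdsf -o job.out ./run_analysis
    # Show the status of your jobs
    bjobs
    # List the configured queues
    bqueues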

7
Staffing
  • 1 FTE Project Lead/Admin
  • 1 FTE Admin
  • ½ FTE High Energy Physics, ½ FTE ATLAS
  • 2 × ¼ FTE Nuclear Physics support

8
Projects/Groups
  • STAR
  • ATLAS
  • E895
  • E896
  • E871
  • CDF

9
Compute Nodes
  • 16 ea, PII/400
  • 256 MB RAM
  • 11 GB HD
  • 100 Mb/s full-duplex Ethernet
  • 6 ea, dual PII/400
  • 256 MB RAM
  • 11 GB HD
  • 100 Mb/s full-duplex Ethernet

10
Compute Nodes
  • 8 ea, dual PII/333
  • 256 MB RAM
  • 9 GB HD
  • 100 Mb/s full-duplex Ethernet
  • 1 ea, Sun E-450
  • 4 × 248 MHz CPUs
  • 3 GB RAM (up from 1 GB)
  • 1,296 GB of disk (up from 405 GB)
  • 4 × 100 Mb/s EtherChannel/Sun Trunking

11
Compute Nodes
  • Upgraded 6 of the old dual-CPU systems to dual
    PIII/450s; the old CPUs were moved into 6 other
    systems to make them dual-capable.
  • Added 18 new compute nodes
  • 2U rack case, Intel Nightshade system board
  • Dual PIII/450
  • 256 MB of RAM, 11 GB IDE drive, 3.5" floppy
  • Fast Ethernet, full duplex
  • Serial console

12
Hardware/Data Vaults
  • 7 ea, PII/400
  • 100 Mb/s full-duplex Ethernet
  • 256 MB RAM
  • 64 GB of disk space
  • 62 GB available via NFS (export sketch after this
    list)
  • Built using 4 × 16.7 GB IBM UDMA drives
  • Running Linux v2.2.x
  • Red Hat v5.1
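
  A minimal sketch of how one vault's 62 GB might be served with
  the kernel NFS server; the mount point /dv01 and the host
  pattern are assumptions for illustration:

    # /etc/exports on a data vault (hypothetical path and hosts)
    /dv01    pdsf*.nersc.gov(rw)
    # Re-export after editing (kernel NFS server)
    /usr/sbin/exportfs -a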

13
Data Vaults
  • 1 ea, PII/400
  • 2 × 100 Mb/s full-duplex EtherChannel
  • 256 MB RAM
  • 32 GB disk space (4 × 9 GB Quantum Fireball UDMA)

14
Data Vaults
  • 3 new data vaults
  • Dual-CPU PIII/450
  • 256 MB of RAM
  • Intel 440GX system board
  • RAIDZONE 15 × 37 GB rack (550 GB of disk space)
  • 2 × 100 Mb/s Fast Ethernet bonded into a Cisco
    5500 (bonding sketch after this list)
  • Serial console setup; no keyboard/video/mouse
  • Running Linux
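
  A minimal sketch of Linux 2.2-era channel bonding using the
  bonding driver's ifenslave tool; the address and interface names
  are illustrative, and the Cisco 5500 port pair must be set up as
  a matching EtherChannel group:

    # Load the bonding driver and bring up the virtual interface
    insmod bonding
    ifconfig bond0 192.168.1.20 netmask 255.255.255.0 up
    # Enslave both Fast Ethernet ports to bond0
    ifenslave bond0 eth0
    ifenslave bond0 eth1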

15
Other Disk Vault
  • Added 2 "old"-style data vaults for E871, to run
    an AIT tape library and provide disk space for the
    project.
  • Built using 4 × 37 GB (148 GB total) IBM Deskstar
    UltraDMA drives on an Intel Nightshade system
    board.
  • 256 MB of RAM
  • Single-CPU PIII/450

16
Master Server
  • Dual-CPU PIII/450, Intel 440GX system board
  • Built for fault tolerance: dual power supplies,
    small UPS (700 VA)
  • RAIDZONE 15 × 37 GB (550 GB) disk rack
  • Contains home and shared (backed-up) directories
  • 256 MB of RAM
  • Channel-bonded Fast Ethernet
  • Serial console master (32-port Comtrol RocketPort
    PCI card, Purdue console software); node-side
    setup sketched after this list
  • NIS, home directory, LSF master, and FLEXlm
    server
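
  On the managed nodes, a serial console of that era was typically
  wired up roughly as below, assuming ttyS0 at 9600 baud (the
  actual port and speed are not stated in the slides):

    # /etc/lilo.conf - send kernel and boot messages to the serial port
    append="console=ttyS0,9600n8"
    # /etc/inittab - also run a login getty on the serial line
    S0:2345:respawn:/sbin/agetty 9600 ttyS0 vt100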

17
Hardware Totals
  • Will have 4.2 TB of disk space online by the end
    of Oct 1999.
  • Of this 4.2 TB, 2.22 TB is hot-swappable.
  • 104 CPUs for compute purposes:
  • 13 PII/266
  • 16 PII/350
  • 28 PII/400
  • 42 PIII/450
  • 5 UltraSPARC

18
System Overview
19
HPSS
  • Current size of NERSC HPSS
  • 4 independent systems
  • 10 tape robots (46,000 tape cartridges total)
  • 60 tape drives
  • 14 control/mover nodes
  • 2.5 TB disk (metadata cache)
  • Current usage of NERSC HPSS
  • 68 TB of data (48 TB user, 20 TB backup)
  • 1.7 TB of I/O per day
  • 1 backup system (writes 300 GB/day)
  • 2 user systems (writes/reads 700 GB/day each)

20
Software
  • Compilers
  • KAI
  • PGI
  • EGCS/GCC (2.7.2, 1.0.2, 1.1.1, 1.1.2, 2.95.1)
  • All Sun compilers fully licensed (no demos)
  • Objectivity/MySQL
  • ROOT (root4star)
  • PAW
  • STAF

21
Software - Linux
  • Red Hat v5.1, soon to be migrated to Red Hat v6.0
    (just started evaluating v6.1)
  • MOSIX 0.93.3 installed on starlx01-starlx08
  • Linux kernel v2.2.12
  • Large swap space - 2 GB on each Linux box
  • Using kernel-based NFS servers
  • ssh v1 for everything where possible
  • NERSC mandates that no clear-text passwords be
    used where possible; telnet to be totally disabled
    by Dec 8, 1999 (inetd sketch after this list)
  • KRB5 v1.1-beta installed on the PDSF cluster
    (Linux only)
  • AFS via NFS only; uses a large AFS cache on
    starsu00 (5 GB)
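
  Meeting the telnet mandate on a Red Hat box of that vintage
  meant commenting the service out of inetd's configuration and
  reloading inetd, roughly:

    # /etc/inetd.conf - comment out the telnet service line
    #telnet stream  tcp     nowait  root    /usr/sbin/tcpd  in.telnetd
    # Make inetd re-read its configuration
    killall -HUP inetd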