1
PC Farms at CERN
  • Frédéric Hemmer
  • CERN-IT/PDP

2
Disclaimer
  • This talk covers farms which involve CERN's
    computer center.
  • There are other farms in strict online
    environments, or private farms in buildings.

3
Overview
  • Offline farms
  • Linux farms
  • NT farms
  • Issues
  • PC Technology & Performance
  • Online farms & quasi-online farms
  • Cost of ownership
  • Conclusions

4
Linux Farms - Nomad
  • Proof of concept in Summer 97
  • Straight NQS port
  • SHIFT SW client port
  • CERNLIB port
  • NOMAD observed quasi-linear scaling with clock
    frequency compared to Alphas!
  • I.e. Alpha @ 266 MHz ≈ PII @ 266 MHz
  • Now 17 dual PCs, 3 types of motherboard

5
Linux Farms - NA49
  • NA49 had already deployed a private PC farm on
    their premises
  • Requested a new farm in order to benefit from the
    computer center infrastructure (people and
    equipment) in 1H98
  • Trivial deployment, running with NQS
  • Most PCs are branded PCs (HP)
  • Now completely off RISC for CPU
  • 18 duals @ 300→400 MHz

6
NA49 Analysis - data access
[Diagram: data flows from the experiment (10-12 TB/month, one month per year) over HiPPI and 100BaseT; ~600 GB per run; manual feed of 100 GB cartridges into a SONY DMS.]
7
Linux Farms - NA48
  • NA48 was using the QSW CS/2 (128 proc.)
  • CS/2 overload → investigated PCs in late 97
  • Installation of 12 dual machines in 1Q98, and more
    ...

8
Linux Issues
  • EEPro100B driver crashes on MP machines
  • AFS support (MP)
  • NFS support (MP)
  • Commercial software
  • Manufacturer support for Linux
  • Very few Linux experts

9
NT offline Farms
  • PCSF
  • Simulation facility, but ...
  • COMPASS
  • Evaluating & benchmarking technology

10
PCSF - Overview
  • Configuration
  • Applications
  • Data access
  • Specific work & solutions
  • Key issues
  • Conclusions

11
PCSF - Goals
  • Make PC&NT a standard option for Physics Data
    Processing, starting with simulation
  • Establish a minimal management model for NT
    farms
  • Address scalability issues
  • Gain Windows NT experience

12
PCSF Milestones
  • Joined RD47 in Autumn 96
  • Price inquiry issued in 12/96
  • Hardware delivered 4/97
  • Ready to use 6/97
  • RD47 report 10/97
  • Expansion 5/98

13
PCSF Configuration (1)
  • Server running NT 4.0 Server SP3
  • 1 dual-capable PPro @ 200 MHz, 96 MB, with 9 GB
    data disk (mirrored). LSF central queues.
  • Server running NT Terminal Server Beta 2
  • 1 dual PPro @ 200 MHz, 128 MB, with 4 GB data
    disk. Runs IIS 3.0 and is accessible from outside
    CERN. It also hosts the ASPs for Web access.
  • Servers running NT 4.0 Workstation SP3
  • 9 dual PPros @ 200 MHz, 64 MB, 24 GB
  • 25 dual PIIs @ 300 MHz, 128 MB, 24 GB
  • All equipped with boot PROMs

14
PCSF Configuration (2)
  • Machines interconnected with four 3Com 3000
    100BaseT switches
  • Display/keyboard/mouse connected to a Raritan
    multiplexor
  • PC Duo for remote admin access
    – there were problems with other products
  • All running LSF 3.0
    – LSF 3.2 does not work, and support is weak
  • Completely integrated with NICE

15
Applications on PCSF
  • ATLAS DICE simulation
  • NA45 1996 reconstruction
  • CMS reconstruction with Objectivity being tested
  • LHCB simulation code ready
  • ATLAS reconstruction being ported
  • ATLAS/Marseille event filter prototype &
    scalability tests

16
Data access
[Diagram: data access via RFIO to a Unix tape server, driven by the stagexxx commands.]
17
ATLAS Level 3 DAQ
[Diagram: readout buffers (1 GB/s) feed the processor farm, which writes to storage at 100 MB/s.]
18
ATLAS Event Filter
  • Testbed for evaluating algorithms & sizing
  • Architecture & simulation studies
  • Monitoring, system management, feedback, etc.
  • Interface prototypes (SFI, SFO)
  • Timescale: prototype -1 (i.e. end 98)
  • Status: sizing of an initial farm

19
PCSF Usage
21
Specific work so far
  • Installation (Remote Boot, Winstall, NICE
    replicas, Install Server)
  • User codes, CERNLIB, SHIFT
  • Job Starter
  • PC MGR
  • WNTS
  • Web Interface

22
Installation
  • Disk cloning & change SID
    – fastest method, but not very automated
  • Remote boot
  • Remote boot install procedures with virtual disk
  • Use unattended setup; installs Winstall and other
    things
  • Third-party packages installed through Winstall
    – boot prom support on some hardware

23
Porting
  • Porting code from Unix to NT is usually easy
    (NA45 code ported in 1 week)
  • Porting a production environment from Unix to NT
    is usually difficult (shell scripts)
  • Porting the build environment is difficult; it is
    better to use native tools (Dev Studio)
    – mixing Unix and NT build environments, revision
      control, etc.

24
Jobstarter
  • Initially inherited from the Unix LSF CERN
    JobStarter
  • Rewritten in C, using PcMgrSvc for drive mapping
  • Checks execution preconditions
  • Cleans up on normal and abnormal job end
  • Kills popup dialog windows (see the sketch below)
    – e.g. Excel & WinZip in batch
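
A minimal Win32 sketch of the popup-killing step, assuming any visible standard dialog on a batch node is unwanted; this is illustrative, not the actual PCSF JobStarter code:

```c
/* Sketch only: close stray top-level dialog windows (Excel, WinZip
 * popups, ...) so they cannot block an unattended batch job. */
#include <windows.h>
#include <stdio.h>
#include <string.h>

static BOOL CALLBACK close_dialogs(HWND hwnd, LPARAM unused)
{
    char cls[64];
    (void)unused;
    /* Standard Windows dialog boxes use the "#32770" window class. */
    if (IsWindowVisible(hwnd) &&
        GetClassNameA(hwnd, cls, sizeof cls) &&
        strcmp(cls, "#32770") == 0) {
        char title[128] = "";
        GetWindowTextA(hwnd, title, sizeof title);
        printf("closing popup: %s\n", title);
        PostMessageA(hwnd, WM_CLOSE, 0, 0);   /* ask it to close */
    }
    return TRUE;                              /* keep enumerating */
}

int main(void)
{
    /* A real job starter would run this periodically while the job
     * executes; one pass is enough to show the mechanism. */
    EnumWindows(close_dialogs, 0);
    return 0;
}
```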

25
PcMgrSvc/Ctl
  • Checks:
    – status of monitored processes/services
    – amount of scratch space
    – drive mapping(s)
  • Maps/unmaps drives
  • Syncs with time servers
  • Generates alarms on request
  • Gets all parameters from the registry (see the
    sketch below)
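
A minimal Win32 sketch of two of the mechanisms above, reading parameters from the registry and mapping a network drive; the registry key, value name, drive letter and share are hypothetical stand-ins, not the real PcMgrSvc settings (link against mpr.lib):

```c
#include <windows.h>
#include <winnetwk.h>
#include <stdio.h>

int main(void)
{
    HKEY key;
    DWORD scratch_min = 0, size = sizeof scratch_min;

    /* 1. Parameters from the registry (hypothetical key/value). */
    if (RegOpenKeyExA(HKEY_LOCAL_MACHINE,
                      "SOFTWARE\\CERN\\PcMgrSvc", 0, KEY_READ,
                      &key) == ERROR_SUCCESS) {
        RegQueryValueExA(key, "MinScratchMB", NULL, NULL,
                         (LPBYTE)&scratch_min, &size);
        RegCloseKey(key);
    }
    printf("required scratch space: %lu MB\n", scratch_min);

    /* 2. Map a drive letter to a share (hypothetical server/share). */
    NETRESOURCEA nr = {0};
    nr.dwType       = RESOURCETYPE_DISK;
    nr.lpLocalName  = "S:";
    nr.lpRemoteName = "\\\\pcsfsrv\\scratch";
    DWORD rc = WNetAddConnection2A(&nr, NULL, NULL, 0);
    printf("drive mapping %s (rc=%lu)\n",
           rc == NO_ERROR ? "ok" : "failed", rc);
    return 0;
}
```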

26
Web Interface
  • As a solution to:
    – remote access from outside CERN
    – access from non-NT hosts
  • Implemented as ASPs with VB
  • Requires IIS on the server

27
Web Interface - authentication
28
Web Interface - Overview
29
Web Interface - bjobs
30
Web Interface - bjobs result
31
Windows NT Terminal Server
32
Next Steps
  • Finish and understand remote boot issues
  • Complete remote boot - remote install
  • AFS Integration
  • Build up resilience
  • Investigate how to use the new WfM, DMI, PXE,
    ACPI, etc. initiatives
  • Investigate whether WSH is an alternative
  • Investigate NT's I/O capabilities

33
Key Issues
  • AFS access
  • LSF support
  • Boot proms, equipment interoperability
  • CODE reintegration (Physics & CERNLIB)
  • Think Windows
  • Scalability & management (home-grown solutions vs.
    commercial apps)
  • Remote & external access

34
PC with NT
  • PC&NT has proven to work in a batch environment,
    and is now an option for Physics Data Processing
  • Farm management is less of a concern after having
    built a few tools (alternatives would be SMS or
    TNG), but some work is still needed
  • Scalability has started to be addressed, but the
    relatively small number of nodes does not help
    here
  • Considerable NT experience has been gained

35
Issues so far
  • Linux
    – EEPro100B MP support
    – commercial software
    – manufacturer support
    – very few local Linux experts
  • NT
    – AFS access
    – LSF support
    – Think Windows
    – remote and external access
  • PC
    – interoperability (card/motherboard combinations)
    – remote boot support

36
PC Technology evolution in 97
  • Pentium Pro → Pentium II
    – 50% raw performance increase
    – but 50% cache performance reduction
  • SEC → new motherboards
  • 440 FX → 440 LX (SDRAM, AGP)
  • Recent MBs → embedded SCSI, Enet, VGA
  • 100 Mbit Enet switches standard, 1000 Mbit
    arriving

37
PC Technology evolution in 98
  • Pentium II @ 300 MHz → Pentium Xeon @ 450 MHz
    – MP support
    – 50% cache performance increase
  • Slot 2 → new motherboards
  • 440 LX → 440 BX, 450 NX (100 MHz, EDO)
  • Recent MBs → no longer available through Intel,
    TYAN
  • 1000 Mbit/s Enet switches standard, >> 1000
    Mbit/s arriving

38
Racking evolution
[Photos: racking in 1997 vs. 1998.]
39
At the back ...
40
Console multiplexors
41
Fast Ethernet switches (Sep. 98)
42
Fast Ethernet Switches (Oct. 98)
43
At the back of Fast Ethernet Switches (Oct. 98)
44
Gigabit Ethernet Switches
45
Network performance Results
  • PCs interconnected through a 100BaseT 3Com 3000
    switch (see the sketch below)
  • Repeated with other H/W
  • Half-duplex behavior
  • Block size does not matter
  • Linux uses less CPU than NT
    – good unidirectional performance
    – disappointing CPU consumption on NT
    – disappointing bi-directional performance
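
The slides do not name the measurement tool; for reference, here is a minimal sketch of the kind of point-to-point test behind such numbers, assuming a plain one-way TCP stream (the port, block size and transfer size are arbitrary choices; real tests would also sweep block sizes and run streams in both directions):

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

#define PORT  5001                    /* arbitrary test port */
#define BLOCK (64 * 1024)             /* arbitrary block size */
#define TOTAL (256L * 1024 * 1024)    /* bytes to send */

static double now(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec + tv.tv_usec / 1e6;
}

int main(int argc, char **argv)
{
    static char buf[BLOCK];
    struct sockaddr_in a = {0};
    a.sin_family = AF_INET;
    a.sin_port = htons(PORT);

    if (argc < 2) {
        fprintf(stderr, "usage: %s -r | <receiver-ip>\n", argv[0]);
        return 1;
    }
    if (strcmp(argv[1], "-r") == 0) {             /* receiver side */
        int ls = socket(AF_INET, SOCK_STREAM, 0);
        a.sin_addr.s_addr = htonl(INADDR_ANY);
        bind(ls, (struct sockaddr *)&a, sizeof a);
        listen(ls, 1);
        int s = accept(ls, NULL, NULL);
        while (read(s, buf, sizeof buf) > 0)      /* just drain */
            ;
        return 0;
    }
    /* Sender side: stream TOTAL bytes and report the rate. */
    int s = socket(AF_INET, SOCK_STREAM, 0);
    inet_pton(AF_INET, argv[1], &a.sin_addr);
    if (connect(s, (struct sockaddr *)&a, sizeof a) < 0) {
        perror("connect");
        return 1;
    }
    double t0 = now();
    for (long sent = 0; sent < TOTAL; sent += BLOCK)
        write(s, buf, BLOCK);
    close(s);   /* timing is approximate: data may still be in flight */
    printf("%.1f MB/s\n", TOTAL / (1024.0 * 1024.0) / (now() - t0));
    return 0;
}
```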

46
PC to PC Network performance
47
Network performance issues
  • Unexplained 0.5 MB/s observed with some eepro100
    versions on PCRD hardware, but OK on PCSF
  • Recent DEC E'net boards with chipsets > 21140 give
    poor performance on Linux
  • Surprising PC/Alpha results

48
PC/Alpha Network performance
49
PC High Performance Networking
  • HiPPI (5/98)
    – PII, 300 MHz, 440LX, SDRAM, Roadrunner, to an
      SGI O2000, 4 CPUs, IRIX 6.4
    – transmit: 50 MB/s
    – receive: 50 MB/s (53 MB/s with SMP)
  • Gigabit Ethernet (10/98)
    – PII, 400 MHz, 440 BX, 100 MHz SDRAM, PCI 32/33,
      Tigon I
    – 1500 bytes/packet: 28 MB/s, 40% CPU
    – 9000 bytes/packet: 90 MB/s, 90% CPU

50
Disk performance
  • PCs connected to a SEAGATE ST19171W using two
    Adaptec 2940UW controllers (see the sketch below)
  • NT needs a lot of tuning (default behavior is to
    swap data out!)
  • Block size, BIOS settings, EDO/FPM do not matter
    – poor performance
    – Windows NT even worse
    – memory bandwidth is suspected
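
A minimal sketch of a sequential-read throughput test of this kind, assuming POSIX I/O; the device path is a placeholder, and the file or device read should be much larger than RAM so the page cache does not flatter the result:

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <unistd.h>

#define BLOCK (1024 * 1024)            /* 1 MB reads (placeholder) */
#define LIMIT (1024L * 1024 * 1024)    /* stop after 1 GB */

static double now(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec + tv.tv_usec / 1e6;
}

int main(int argc, char **argv)
{
    /* Path is a placeholder: a raw device or a very large file. */
    const char *path = argc > 1 ? argv[1] : "/dev/sda";
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror(path);
        return 1;
    }
    char *buf = malloc(BLOCK);
    long total = 0;
    ssize_t n;
    double t0 = now();
    while (total < LIMIT && (n = read(fd, buf, BLOCK)) > 0)
        total += n;                     /* sequential reads */
    printf("%.1f MB/s\n", total / (1024.0 * 1024.0) / (now() - t0));
    free(buf);
    close(fd);
    return 0;
}
```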

51
Disk performance
  • Striping has no effect:
    – 1 stream, 2 stripes: 21 MB/s (22 max)
    – 1 stream, 3 stripes: 21 MB/s (33 max)

52
Disk performance issues
  • Memory bandwidth suspected
  • Need to test with LX/SDRAM and BX/SDRAM @ 100 MHz
  • RISC PCI does not support a variety of boards
  • Combined disk/network performance is even worse:
    5-6 MB/s on Linux

53
Memory bandwidth (lmbench)
54
Memory bandwidth (lmbench)
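
The two lmbench charts are not reproduced in this transcript. For orientation, here is a minimal copy-bandwidth sketch in the spirit of lmbench's bw_mem benchmark (not lmbench itself): it times repeated copies of a buffer larger than the caches.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>

#define MB   (1024 * 1024)
#define SIZE (8 * MB)      /* well beyond a 1998-era L2 cache */
#define REPS 50

static double now(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec + tv.tv_usec / 1e6;
}

int main(void)
{
    char *src = malloc(SIZE), *dst = malloc(SIZE);
    if (!src || !dst)
        return 1;
    memset(src, 1, SIZE);             /* touch pages before timing */
    memset(dst, 0, SIZE);
    double t0 = now();
    for (int i = 0; i < REPS; i++)
        memcpy(dst, src, SIZE);
    printf("copy bandwidth: %.0f MB/s\n",
           (double)REPS * SIZE / MB / (now() - t0));
    free(src);
    free(dst);
    return 0;
}
```
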
55
Technology issues
  • Technology evolves too fast (processors,
    chipsets, memory, motherboards, networking, ...)
  • Changing environment/interoperability issues
  • Hard to maintain (obsolescence)
  • New NICs, drivers
  • Measurements are valid for only a few months
    – difficult to establish stable environments
  • Wide variety of solutions
    – some combinations work, others do not
  • Local suppliers cannot help to solve problems

56
PC Performance summary
  • CPU performance: fine
  • Network performance:
    – some configurations do not work
    – some configurations can saturate Fast Ethernet
    – recent tests show excellent performance
  • Memory performance:
    – now better than low-end RISC
  • Disk performance: disappointing
    – Linux better than NT

57
Online and quasi-online farms
  • NA48 Data Recording
  • NA45 Data Recording in Objectivity

58
NA48 Central Data Recording
[Diagram: NA48 central data recording — sub-detector VME crates → event builder / online PC farm (FDDI, Fast Ethernet); XLNT Gbit and Fast Ethernet links; 7 km Gigabit Ethernet via a 3Com 9300 and a GigaRouter (HiPPI, FDDI) to a SUN E450 with 500 GB disk space, the offline PC farm, and the CS/2 with 2.5 TB disk space.]
59
NA48 Data Recording in 98
  • May → September 1998
  • Raw data on tape:
    – 68 TB (1450 tapes, mainly 50 GB tapes)
  • 12.5 TB selected reconstructed data
  • Total with 97 data: 96 TB
  • Average data rate: 18 MB/s (peaks @ 23 MB/s)
  • The CDR system can do 40-50 MB/s; the limitation
    is the CPU time available
  • Data recorded as files (4 million)

60
NA48 On Line Farm
  • 11 sub-detector PCs (dual PII-266, 128 MB)
  • 8 event-building PCs (dual PII-266, 128 MB, 18 GB
    SCSI)
  • 4 CDR routing PCs (dual PII-266, 64 MB, FDDI)
  • All running Linux
  • Software event building in the interburst gap
  • Optional software filter (tags data)
  • Send data to the computer center (local disk
    buffers: 144 GB, 2 hours)
  • L3 filtering and tape writing on the CS/2

61
NA48 Plans for 1999
[Diagram: planned 1999 layout — sub-detector VME crates → event builder; 7 km Gigabit Ethernet and Fast Ethernet via a 3Com 9300 (HiPPI) to 4 SUN E450s with 4.5 TB disk space; a combined on/offline PC farm.]
62
NA45 Data Recording
[Diagram: NA45 data recording — sub-detector VME crates (SCI) → event builder / online PC farm; Fast Ethernet via 3Com 3900 switches; 7 km Gigabit Ethernet via a 3Com 9300 (HiPPI) to 2 SUN E450s with 500 GB disk space; links to the NA48 chain and to PCSF.]
63
NA45 Raw Data recording in Objectivity
  • October 98 – November 98
  • Estimated bandwidth: 15 MB/s
  • Processes translate the raw data format into
    Objectivity
  • Database files (1.5 GB) are closed, then written
    to tape
  • Steering done using a set of Perl scripts on the
    disk servers
  • Online filtering/reconstruction/calibration
    possible
  • Farm is running Windows NT
  • Reconstruction can use PCSF

64
Current & Future Data Rates at CERN
65
Summary
  • Online PC farms are being used to record data at
    sensible rates (Linux)
  • Offline PC farms are being used for
    reconstruction/filtering/analysis (Linux/NT)
  • There is still a lot to do on scalable farm
    management, global steering, CDR monitoring, etc.

66
PC Total Cost of Ownership
  • Software not included
  • Installation labor not included
  • Assumes a 3-year lifetime

67
DEC 8400 (12-Way) Cost of Ownership
  • Software & SW maintenance not included
  • Assumes a 5-year lifetime

68
General Conclusions (1)
  • PCs are now used in online, quasi-online and
    offline environments
  • The offline is now part of the online
  • I/O is still done using RISC/Unix, but recent MP
    Xeons may change this

69
General Conclusions (2)
  • PC technology is moving very fast
    – good for performance
    – not so good for stability and interoperability
    – not so good for understanding issues
  • The general management of large farms is not
    solved, but ...
  • A number of initiatives/standards/tools may help
    us here: WfM, DMI, PXE, ACPI, SMS, TNG, etc.

70
General Conclusions (3)
  • Linux vs. NT: the battle is over
    – choose the one suitable for your application
  • NT can be used
  • Linux is usable (and offers more performance)
  • PC real costs are usually not well understood