Grid Enabled Analysis for Particle Physics
1
Grid Enabled Analysis for Particle Physics
  • Julian Bunn, Caltech
  • (Conrad Steenberg, Frank van Lingen, Michael Thomas)
  • Visit of Lt. General Syed Shujaat Hussain (NUST)
  • November 2003

2
Particle Physics Computing Challenges
Petabytes, Petaflops, Global Collaborations
  • Geographical dispersion: people and resources
  • Complexity: the detector and the LHC environment
  • Scale: tens of Petabytes per year of data

5000 physicists, 250 institutes, 60 countries
New forms of distributed systems: Data Grids
3
Data Grid Hierarchy
CERN/Outside resource ratio ~1:2; Tier0 : (sum of Tier1s) : (sum of Tier2s) ~ 1:1:1
  • Online System: experiment data at ~1 PByte/sec, filtered to 100-1500 MBytes/sec into Tier 0
  • Tier 0 (CERN): ~1M SI95, ~1 EB disk, tape robot, HPSS
  • Tier 1 (10-100 Gbps links): FNAL (~200k SI95, 600 TB), IN2P3 Center, INFN Center, RAL Center
  • Tier 2 (10 Gbps links)
  • Tier 3 (10 Gbps links): institutes at ~1 TIPS each; physicists work on analysis channels, each institute having ~10 physicists working on one or more channels
  • Tier 4 (0.1-10 Gbps links): physics data cache, workstations
4
Caltech Tier2 (at CACR)
Caltech Tier2 user community: there are currently 57 registered users from Caltech (HEP, CACR, CS), CERN, UFL, UC Davis, UCR, UCSD, UCLA, UWM, FNAL, ANL and Romania.
Caltech Tier2 uses of the 3 clusters:
  • PG: participating in CMS Pre-Challenge Production
  • IGT: US CMS simulation, reconstruction, analysis and calibration
  • DGT: develop and test CMS grid software
Cluster 1, Production Grid (PG): 1 4U front-end node with dual P4 Xeon 2.4 GHz CPUs, 2 GB RAM and 1.5 TB disk; 22 1U compute nodes with dual P4 Xeon 2.8/2.4 GHz CPUs, 1 GB RAM and 1.5 TB storage.
Cluster 2, Integration Grid Testbed (IGT): 1 7U dual P3 1 GHz front-end server with 2 GB RAM; 20 2U dual P3 800 MHz, 512 MB compute nodes; 1 3 TB RAID array.
Cluster 3, Development Grid Testbed (DGT): 5-node AMD Athlon-based cluster; 2 1U dual P4 Xeon 2.4 GHz/1 GB MonALISA servers; 1 1U dual P4 Xeon 2.4 GHz/1 GB network server; 1 Sun Ultra-250 network data server.
Gigabit connection to public and private networks.
5
COJAC: CMS ORCA Java Analysis Component
(Java3D, Objectivity, JNI, Web Services)
  • Object Collections
  • Multiplatform, Light Client
  • Interface to OODBMS

Demonstrated between Caltech, Rio de Janeiro and Chile in 2002
6
Grid Enabled Analysis Architecture
  • Clients: ROOT, laptop, browser, PDA, desktop
  • Clarens organizes peer groups and super peer groups
  • Services: DAGs, CMS Apps, File Transfer, MonALISA, others
  • Components from GriPhyN, iVDGL, Globus etc., plus Caltech/CMS developments
7
GAE - Grid Analysis Environment
  • The development of a physics analysis environment
    that integrates with the Grid systems, is where
    the real Grid Challenge lies
  • To be used by a large diverse community
  • 100s - 1000s of tasks with different technical
    demands
  • Needs priorities
  • Needs security
  • How much automation is possible? How much is
    desirable?
  • GAE is a key to success or failure for
    physics Grid applications
  • Where the physics gets done
  • Where the Grid End-to-End Services and the Grid
    Application Software Layers get built
  • Where we learn how to collaborate remotely to do
    physics

8
GAE Tools Clarens
  • Our emphasis is on accommodating existing analysis
    tools in our CAIGEE architecture
  • To facilitate this, we use the Clarens
    Dataserver
  • Clarens is server software that makes datasets
    and services available to clients in a suitable
    lingua franca
  • Clients initially Grid-authenticate with a
    Clarens server, and then are able to make use of
    a wide set of data and analysis services on offer
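The Clarens interaction pattern described above (Grid-authenticate first, then invoke named data and analysis services) can be sketched in Python. This is a hypothetical in-process model, not the real Clarens API: the class and method names are illustrative, and a plain DN check stands in for the actual X.509 certificate handshake.

```python
# Hypothetical in-process sketch of the Clarens pattern: a client first
# authenticates (a DN check standing in for the real Grid certificate
# handshake), then invokes named services through one RPC-style entry point.

class ClarensDataserver:
    def __init__(self, accepted_dns):
        self.accepted_dns = set(accepted_dns)   # DNs the server accepts
        self.services = {}                      # service name -> callable
        self.sessions = set()                   # opaque session tokens

    def register(self, name, func):
        """Expose a callable as a named data/analysis service."""
        self.services[name] = func

    def authenticate(self, dn):
        """Stand-in for Grid authentication; returns a session token."""
        if dn not in self.accepted_dns:
            raise PermissionError("unknown DN: %s" % dn)
        token = "session-%d" % len(self.sessions)
        self.sessions.add(token)
        return token

    def call(self, token, service, *args):
        """Dispatch a call from an authenticated client."""
        if token not in self.sessions:
            raise PermissionError("not authenticated")
        return self.services[service](*args)

server = ClarensDataserver(["/O=Grid/CN=Julian Bunn"])
server.register("echo", lambda x: x)
server.register("list_datasets", lambda: ["dataset_a", "dataset_b"])

tok = server.authenticate("/O=Grid/CN=Julian Bunn")
print(server.call(tok, "list_datasets"))
```

The point of the pattern is that every service, whatever it does, is reached through the same authenticated entry point, which is what lets heterogeneous clients (ROOT, browsers, PDAs) share one lingua franca.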

9
Interconnected System
  • Try to make sense of the Alphabet soup
  • Service/functionality oriented view
  • Providers
  • Clients
  • Both
  • Middleware/information providers

10
Connecting components
  • Client/server view based on resource abundance
  • Middleware/IP helps organize resources in a
    resource-scarce environment
  • G, OGS, Tomcat, MonALISA, web server
  • Metadata catalogs (some missing)
  • Needed: make a (more) consistent and unified
    environment without resorting to X scripting
    languages as glue
  • Interact with network-enabled components from
    the most sensible environment for the task
  • Not only client/server, but between all
    components

11
Enable higher level services
  • Reduce the development impedance for higher level
    services to function properly
  • E.g. MonALISA uses modules to monitor via
    'ping', SNMP, Ganglia etc., but provides
    aggregated information through a single remote API
    (SOAP)
  • Reduce manual interaction
  • Counter-example VO management
  • Obtain certificates from X CAs via LDAP
  • Store in VO LDAP server, create VO
  • Extract structure using different tool, using
    config file to create new config file (gridmap)
    used by middleware (Globus gatekeeper)
  • Site admins must maintain separate copies of
    gridmap files for different clusters/servers
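The gridmap maintenance problem above (site admins hand-maintaining per-cluster copies of the mapping used by the Globus gatekeeper) suggests generating the file from a single VO listing instead. A minimal sketch, assuming the conventional Globus grid-mapfile format of a quoted certificate DN followed by a local account name; the function name is hypothetical:

```python
# Hypothetical sketch: derive a Globus-style grid-mapfile from a single
# VO listing, so per-cluster copies can be regenerated automatically
# rather than hand-maintained.  Each output line maps a quoted
# certificate DN to a local Unix account.

def make_gridmap(vo_members):
    """vo_members: list of (distinguished_name, local_account) pairs."""
    lines = []
    for dn, account in vo_members:
        lines.append('"%s" %s' % (dn, account))
    return "\n".join(lines) + "\n"

members = [
    ("/O=Grid/OU=Caltech/CN=Julian Bunn", "jjb"),
    ("/O=Grid/OU=Caltech/CN=Conrad Steenberg", "conrad"),
]
print(make_gridmap(members))
```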

12
Security and Virtual Organization
  • Authentication via X509 certificates
  • Verifies certificate chain up to a list of
    accepted Certificate Authority certificates
  • Client identified internally by the certificate
    distinguished name (DN) uniqueness ensured by
    CA
  • Authorization done using an internal VO
  • VO consists of a hierarchy of groups and users
  • Does not need to store client certificates, uses
    DNs
  • VO data stored in DB

(Diagram: VO hierarchy. A super-admin group (DN1, DN2, ..., specified in the server setup file) can create groups and add users to admin groups; each admin group can add users to its group; each group N lists its member DNs.)
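The VO scheme on this slide (a hierarchy of groups whose members are certificate DNs, with super-admins creating groups and admin groups controlling membership) can be sketched as follows. This is an illustrative model, not the Clarens VO implementation; class and group names are hypothetical.

```python
# Hypothetical sketch of the VO scheme: groups hold member DNs;
# super-admin DNs (from the server setup file) create groups; members
# of a group's admin group may add users to that group.

class VirtualOrg:
    def __init__(self, super_admins):
        self.super_admins = set(super_admins)   # DNs from server setup file
        self.groups = {}                        # group name -> set of member DNs

    def create_group(self, actor_dn, group):
        if actor_dn not in self.super_admins:
            raise PermissionError("only super-admins create groups")
        self.groups[group] = set()

    def add_member(self, actor_dn, group, new_dn):
        # Super-admins, or members of the group's admin group, may add users.
        admins = self.groups.get(group + ".admin", set())
        if actor_dn not in self.super_admins and actor_dn not in admins:
            raise PermissionError("not authorized to add users")
        self.groups[group].add(new_dn)

    def is_member(self, group, dn):
        return dn in self.groups.get(group, set())

vo = VirtualOrg(super_admins=["/O=Grid/CN=Root Admin"])
vo.create_group("/O=Grid/CN=Root Admin", "cms")
vo.create_group("/O=Grid/CN=Root Admin", "cms.admin")
vo.add_member("/O=Grid/CN=Root Admin", "cms.admin", "/O=Grid/CN=Site Admin")
vo.add_member("/O=Grid/CN=Site Admin", "cms", "/O=Grid/CN=New User")
```

Because authorization is keyed on the certificate DN (whose uniqueness the CA guarantees), no client certificates need to be stored, matching the bullet above.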
13
CAIGEE architecture II
14
Interactive Analysis
  • Use Clarens as RPC layer
  • Python as scripting language already used
  • Multithreaded analysis job listens to RPCs
  • Use Condor/PBS/LSF as scheduler to start and kill
    jobs.

(Diagram: the client connects via Clarens to the head node, where the scheduler starts jobs; each farm node runs an analysis process behind its own Clarens/Pclarens layer.)
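The "multithreaded analysis job listens to RPCs" idea on this slide can be sketched in Python: the event loop runs in a worker thread while a small control surface (standing in for Clarens RPC calls, and for the kill that Condor/PBS/LSF would issue) can query or stop it. Class and method names are illustrative, not the actual GAE code.

```python
# Hypothetical sketch: an analysis loop runs in a worker thread while an
# RPC-style control surface (start/status/kill) manages it, mimicking a
# Clarens-controlled job under a batch scheduler.

import threading
import time

class AnalysisJob:
    def __init__(self):
        self.events_processed = 0
        self.running = False
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._loop)

    def _loop(self):
        self.running = True
        while not self._stop.is_set():
            self.events_processed += 1   # stand-in for real event processing
            time.sleep(0.001)
        self.running = False

    # --- RPC-style control surface ---
    def start(self):
        self._thread.start()

    def status(self):
        return {"running": self.running, "events": self.events_processed}

    def kill(self):
        self._stop.set()
        self._thread.join()

job = AnalysisJob()
job.start()
time.sleep(0.05)     # let some "events" be processed
job.kill()
print(job.status())
```

The same status/kill calls could equally be exposed remotely through the Clarens RPC layer, which is the point of putting them behind a narrow method interface.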
15
Current work
  • WSDL interface descriptions; resource/data
    discovery
  • Integration with CMS analysis tools
  • POOL RL catalog interface
  • NorduGrid RL catalog interface (Atlas)
  • BOSS job submission (INFN)
  • Java version of server
  • Sphinx job scheduling
  • Chimera virtual data system
  • OGSI compatibility
  • Monitoring integration via MonaLisa

16
GAE Collaboration Desktop Example
  • Four-screen Analysis Desktop: 4 flat panels, 5120
    x 1024
  • Driven by a single server and single graphics
    card
  • Allows simultaneous work on
  • Traditional analysis tools (e.g. ROOT)
  • Software development
  • Event displays
  • MonALISA monitoring displays and other Grid views
  • Job-progress Views
  • Persistent collaboration (VRVS shared windows)
  • Online event or detector monitoring
  • Web browsing, email

17
GAE Tools PDA Client
  • A handheld GAE client: the fruits of collaboration
    between NUST and Caltech
  • Software is Java Analysis Studio (JAS) ported to
    the Pocket PC 2002 OS
  • Hardware is any Pocket PC 2002 device (But we use
    HP/Compaq iPAQ devices)
  • This tool includes Grid authentication/security
    components

18
Grid-Enabled Analysis Prototypes
Collaboration Analysis Desktop
COJAC (via Web Services)