Title: Growing consensus that Grid services is the right concept for building the computing grids;
1Grid services based architectures
Some buzz words, sorry
- Growing consensus that Grid services is the right
concept for building the computing grids - Recent ARDA work has provoked quite a lot of
interest - In experiments
- SC2 GTA group
- EGEE
- Personal opinion - this the right concept that
arrived at the right moment - Experiments need practical systems
- EDG is not capable to provide one
- Need for pragmatic, scalable solution without
having to start from scratch.
2Tentative ARDA architecture
Job
Auditing
Provenance
Information
Service
1
2
Authentication
3
Authorisation
User
Interface
API
6
4
Accounting
Metadata
DB Proxy
Catalogue
14
Grid
5
Monitoring
13
12
File
7
Catalogue
Workload
Management
10
9
Package
Data
Manager
8
Management
Discuss more these parts in the following
11
Computing
15
Element
Job
Monitor
Storage
Element
3Metadata catalogue (Bookkeeping database) (1)
- LHCb Bookkeeping
- Very flexible schema
- Storing objects (jobs, qualities, others ?)
- Available as a services (XML-RPC interface)
- Basic schema is not efficient for generic
queries - need to build predefined views (nightly ?)
- views fit the query by the web form, what
about generic queries ? - data is not available immediately after
production
Needs further development, thinking, searching
4Metadata catalogue (Bookkeeping database) (2)
- Possible evolution keep strong points, work on
weaker ones - Introduce hierarchical structure
- HEPCAL recommendation for eventual DMC
- AliEn experience
- Better study sharing parameters between thejob
and file objects - Other possible ideas.
This is a critical area worth investigation
! But (see next slide)
5Metadata catalogue (Bookkeeping database) (3)
- Man power problems
- For development, but also for maintenance
- We need the best possible solution
- Evaluate other solutions
- AliEn, DMC eventually
- Contribute to the DMC development
- LHCb Bookkeeping as a standard service
- Replaceable if necessary
- Fair test of other solutions.
6Metadata catalogue (Bookkeeping database) (4)
- Some work has started in Marseille
- AliEn FileCatalogue installed and populated by
the information from the LHCb Bookkeeping - Some query efficiencies measurements done
- The results are not yet conclusive
- Clearly fast if the search in the hierarchy of
directories - Not so fast if more tags are included in the
query - Very poor machine used in CPPM not fair to
compare to the CERN Oracle server. - The work started to provide single interface to
both AliEn FileCatalogue and LHCb Bookkeeping - How to continue
- CERN group possibilities to contribute ?
- CPPM group will continue to follow this line, but
resources are limited - Collaboration with other projects is essential
7File Catalogue (Replica database) (1)
- The LHCb Bookkeeping was not conceived with the
replica management in mind added later - File Catalog needed for many purposes
- Data
- Software distribution
- Temporary files (job logs, stdout, stderr, etc)
- Input/Output sandboxes
- Etc, etc
- Absolutely necessary for DC2004
- File Catalog must provide controlled access to
its data (private group, user directories)
In fact we need a full analogue of a distributed
file system
8File Catalogue (Replica database) (2)
- We should look around for possible solutions
- Existing ones (AliEn, RLS)
- Will have Grid services wrapping soon
- Will comply with the ARDA architecture
eventually - Large development teams behind (RLS, EGEE ?)
- This should be coupled with the whole range of
the data management tools - Browsers
- Data transfers, both scheduled and on demand
- I/O API (POOL, user interface).
This is a huge enterprise, and we should rely on
using one of the available systems
9File Catalogue (Replica database) (3)
- Suggestion is to start with the deployment of the
AliEn FileCatalogue and data management tools - Partly done
- Pythonify the AliEn API
- This will allow developing GANGA and other
applicationplugins - Should be easy as the C API (almost) exist.
- Should be interfaced with the DIRAC workload
management (see below) - Who ? CPPM group, others are very welcome
- Where ? Install the server at CERN.
- Follow the evolution of the File Catalogue Grid
services (RLS team will not yield easily !)
This is a huge enterprise, we should rely on
using one of the available systems
10Workload management (1)
- The present production service is OK for the
simulation production tasks - We need more
- Data Reprocessing in production (planned)
- User analysis (spurious)
- Flexible policies
- Quotas
- Accounting
- Flexible job optimizations (splitting, input
prefetching, output merging, etc) - Flexible job preparation (UI) tools
- Various job monitors (web portals, GANGA plugins,
report generators, etc) - Job interactivity
11Workload management (2)
- Possibilities to choose from
- Develop the existing service
- Use another existing service
- Start developing the new one.
- Suggestion a mixture of all of choices
- Start developing the new workload management
service using existing agents based
infrastructure and borrowing some ideas from the
AliEn workload management - Already started actually (V. Garonne)
- First prototype expected next week
- Will also try OGSI wrapper for it (Ian
Stokes-Rees) - Keep the existing service as jobs provider for
the new one.
12Workload management architecture
GANGA
Workload management
Production service
Job Receiver
Optimizer 1
Optimizer 1
Optimizer 1
Command line UI
Job DB
Site
Agent 1
CE 1
Job queue
Match Maker
Agent 2
CE 2
Agent 3
CE 3
13Workload management (3)
- Technology
- JDL job description
- Condor Classad library for matchmaking
- MySQL for Job DB and Job Queues
- SOAP (OGSI) external interface
- SOAP and/or Jabber internal interfaces
- Python as development language
- Linux as deployment platform.
- Dependencies
- File catalog and Data management tools
- Input/Output sandboxes
- CE
- DIRAC CE
- EDG CE wrapper.
14Conclusions
- Most experiment dependant services are to be
developed within the DIRAC project - MetaCatalog (Job Metadata Catalog)
- Workload management (with experiment specific
policies and optimizations) - Can be eventually our contribution to the common
pool of services. - Get other services from the emerging Grid
services market - Security/Authentication/Authorization,
FileCatalog, DataMgmt, SE, CE, Information, - Aim at having DC2004 done with the new (ARDA)
services based architecture - Should be ready for deployment January 2004