Configuration Monitoring Tool for Large Scale Distributed Computing

1

Configuration Monitoring Tool for Large Scale
Distributed Computing
Y. Wu (1), G. Graham (1), X. Lu (2), A. Afaq (1), B.J. Kim (3),
and I. Fisk (1)
(1) Fermi National Accelerator Laboratory
(2) University of Iowa
(3) University of Florida
2
Outline
  • Introduction to CMS computing
  • Why a configuration monitoring tool
  • Design consideration and approach
  • Configuration monitoring tool architecture and
    components
  • Current status of the configuration monitoring
    tool
  • Future development plan and summary

3
Introduction to CMS computing
  • The CMS (Compact Muon Solenoid) experiment, which
    will run at the Large Hadron Collider (LHC), is
    expected to have the following computing
    characteristics:
  • - It will produce petabytes of data
  • - It will need very large scale distributed
    computing systems to analyze the data
  • - Grid computing will likely be used to meet
    much of its offline data analysis needs
  • - The computing systems used in CMS data
    analysis will be heterogeneous and dynamic

4
CMS Data Grid Hierarchy
[Figure: tiered CMS data grid, from the Online System down to
Tier 4. The recoverable labels are listed below.]
  • Units: 1 TIPS = 25,000 SpecInt95; PC (2000) = 20 SpecInt95
  • Online System: fed at PBytes/sec. One bunch crossing
    per 25 nsecs, 100 triggers per second, each event is
    1 MByte in size
  • Tier 0+1: fed from the Online System at 100 MBytes/sec
  • Tier 1: FNAL, France, Italy, and UK Regional Centers,
    fed at 0.6-2.5 Gbits/sec or by air freight
  • Tier 2: fed at 2.4 Gbits/sec
  • Tier 3: institutes, each with a physics data cache, fed
    at 622 Mbits/sec. Physicists work on analysis channels.
    Each institute has 10 physicists working on one or
    more channels. Data for these channels should be
    cached by the institute server
  • Tier 4: fed at 100 - 1000 Mbits/sec
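
As a rough consistency check on the numbers in this hierarchy, the trigger rate and event size quoted for the Online System imply the 100 MBytes/sec link that leaves it. The short sketch below does that arithmetic. The figure of 1e7 seconds of data taking per year is an assumption added here for illustration (it does not appear on the slide), so the yearly volume is only indicative.

```python
# Back-of-the-envelope check of the data rates quoted on the slide.
TRIGGERS_PER_SECOND = 100   # from the slide
EVENT_SIZE_MBYTES = 1       # from the slide

# Assumption (not on the slide): about 1e7 seconds of data taking per year.
ASSUMED_LIVE_SECONDS_PER_YEAR = 10_000_000


def raw_rate_mbytes_per_sec(triggers_per_second, event_size_mbytes):
    """Data rate leaving the online system, in MBytes/sec."""
    return triggers_per_second * event_size_mbytes


def yearly_volume_pbytes(rate_mbytes_per_sec, live_seconds):
    """Raw data volume per year in PBytes, using 1 PByte = 1e9 MBytes."""
    return rate_mbytes_per_sec * live_seconds / 1e9


rate = raw_rate_mbytes_per_sec(TRIGGERS_PER_SECOND, EVENT_SIZE_MBYTES)
volume = yearly_volume_pbytes(rate, ASSUMED_LIVE_SECONDS_PER_YEAR)
print(f"{rate} MBytes/sec, about {volume:g} PBytes of raw data per year")
```

Under that assumed running time, the raw data alone is on the petabyte scale per year, which is in line with the "petabytes of data" expectation stated in the introduction.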
5
Why A Configuration Monitoring Tool?
  • To meet the CMS distributed computing challenges,
    we need a monitoring system to track and query
    site configuration information for large-scale
    distributed CMS applications
  • A few selected use cases:
  • - Job generators, e.g., MOP, need to know
    the configuration of a computing resource
    (e.g., CMS software location, scratch area, etc.)
    to generate and submit jobs
  • - A general user needs to know what kinds of
    services are available within an organization
    (e.g., USCMS) and their corresponding
    configurations, e.g., gatekeeper port number and
    available job managers
  • - Users also want to know the status of
    services: critical services need to be available
    even before job submission

6
Design Consideration and Approach
  • The goal of the configuration monitoring system is
    to meet the needs of CMS production and user
    analysis across the US CMS resources
  • The following features are desirable (based on a
    user survey):
  • - The information in the configuration
    monitoring system should be highly available
  • - Historical configuration information
    should be archived and retrievable
  • - The configuration information should
    be available only to authorized users and/or
    groups
  • Use as many existing tools as possible

7
Design Consideration and Approach (2)
  • The Globus Toolkit and the Tomcat servlet container
    are chosen as the building blocks for the
    configuration monitoring tool
  • A relational database server is used to store the
    configuration information. This has the advantage
    of logging the information for future queries
  • The Grid Security Infrastructure (GSI), together
    with the EDG Java Security package, is used for
    secure authentication and transparent access to
    the configuration information across the USCMS
    grid

8
Design Consideration and Approach (3)
  • A layered structure is used to develop the whole
    system. Its advantage is that a layer can be
    replaced without interfering with other layers.
    Tentatively, the system is divided into the
    following layers:
  • Site info provider layer
  • - The module in this layer is deployed
    at each computing resource. It collects and
    publishes resource configuration info.
  • Configuration Database Server layer
  • - It tracks the hosts and services to be
    monitored, and stores all the collected
    configuration info.
  • Tomcat service layer
  • - Through Tomcat, a user can view the info
    in a web browser and/or query the info in
    the database through a web service
  • User Interface
  • - User interfaces are provided for the
    convenience of users

9
A Prototype Architecture
[Figure: prototype architecture. Components shown: Tomcat
Server, VOMS, Configuration Database Server (MySQL), and three
Site Info Providers. Two links are labeled "query".]
10
Site Information Provider Layer
  • This layer is responsible for collecting and
    publishing site configuration information at each
    resource. It accomplishes the task through
    Globus MDS with our own information provider and
    the standard GLUE (Grid Laboratory Uniform
    Environment) schema
  • The information provider can publish
    information from the following sources:
  • - Configuration information in a text
    file
  • - Output from a user command
  • - Special scripts, written as plug-ins,
    that generate other configuration information
  • The resource configuration info published in MDS
    can be queried directly using standard Globus
    commands or through a set of client scripts
    provided by the configuration monitoring tool.
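
To make the two simplest sources concrete, here is a minimal Python sketch of an information provider that reads settings from a text file, adds the output of a command, and prints the result as LDIF-style text, the kind of output MDS2-style providers emit. It is not the authors' provider: the file layout, the attribute names (`software_dir`, `scratch_dir`, `command_output`), and the DN are hypothetical, and real GLUE schema attribute names are not used.

```python
import re
import subprocess
import sys
import tempfile


def read_config_file(path):
    """Read 'name value' or 'name=value' lines, skipping blanks and '#' comments."""
    attrs = {}
    with open(path) as handle:
        for raw in handle:
            line = raw.strip()
            if not line or line.startswith("#"):
                continue
            parts = re.split(r"\s*=\s*|\s+", line, maxsplit=1)
            attrs[parts[0]] = parts[1] if len(parts) > 1 else ""
    return attrs


def run_command(argv):
    """Return the stripped standard output of a command given as an argument list."""
    done = subprocess.run(argv, capture_output=True, text=True, check=True)
    return done.stdout.strip()


def to_ldif(dn, attrs):
    """Render one entry as LDIF-style text: a 'dn:' line, then sorted attributes."""
    lines = ["dn: " + dn]
    lines += [f"{name}: {value}" for name, value in sorted(attrs.items())]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical site settings written to a temporary file for the demo.
    with tempfile.NamedTemporaryFile("w", suffix=".conf", delete=False) as tmp:
        tmp.write("# demo settings\nsoftware_dir /opt/cms\nscratch_dir=/tmp/scratch\n")
    attrs = read_config_file(tmp.name)
    # Any user command can be a source; a trivial one stands in here.
    attrs["command_output"] = run_command([sys.executable, "-c", "print('demo')"])
    sys.stdout.write(to_ldif("host=example-site, o=grid", attrs))
```

A plug-in source would fit the same shape: any callable that returns a dictionary of attributes can be merged into `attrs` before the entry is rendered.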

11
Configuration Database Server Layer
  • The database server layer consists of a
    relational database server and cron job scripts
    that track and update the information in the
    database. It is the core component of the whole
    configuration monitoring architecture
  • - Provides a mechanism for controlling which
    hosts and services are monitored
  • - Tracks the availability of the services
    within a Virtual Organization (VO); some
    services are supposed to be available all the
    time
  • - Archives the collected configuration
    information for later use
  • Currently, we are using MySQL, an open source
    product, as the relational database server. It
    meets our current needs while the number of hosts
    and services to be monitored is relatively small.
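
The slides do not say how availability is tested. One simple possibility, sketched below in Python as an assumption rather than a description of the actual cron scripts, is to attempt a TCP connection to each monitored service port and record "up" or "down". The function names and the shape of the service list are hypothetical.

```python
import socket


def tcp_probe(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(services, probe=tcp_probe):
    """Map each (name, host, port) entry to 'up' or 'down' using the given probe."""
    status = {}
    for name, host, port in services:
        status[name] = "up" if probe(host, port) else "down"
    return status
```

A cron job could call `check_services([("gatekeeper", "gatekeeper.example.org", 2119)])` at each interval and store the result; the host name is a placeholder, and 2119 is the conventional Globus gatekeeper port. Passing a different `probe` keeps the bookkeeping testable without network access.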

12
Configuration Database Server Layer (2)
  • The configuration information in the database is
    collected through site information providers and
    updated at a scheduled interval by the
    cron job scripts
  • Old configuration information is archived, and
    the database is updated only when a resource
    configuration changes. In other words: no change
    in information, no update!
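
The "no change, no update" rule can be sketched as follows. This is a minimal illustration, not the tool's actual schema or scripts: it uses Python's built-in `sqlite3` as a stand-in for MySQL, and the table name, column names, and digest-based comparison are all assumptions made for the example.

```python
import hashlib
import json
import sqlite3
import time

# Hypothetical history table; each row is one archived configuration snapshot.
SCHEMA = """
CREATE TABLE IF NOT EXISTS config_history (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    host TEXT NOT NULL,
    collected_at REAL NOT NULL,
    digest TEXT NOT NULL,
    config_json TEXT NOT NULL
)
"""


def digest_of(config):
    """Return (sha256 hex digest, canonical JSON text) for a configuration dict."""
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest(), canonical


def record_if_changed(conn, host, config, now=None):
    """Insert a new snapshot only if it differs from the host's latest one."""
    digest, canonical = digest_of(config)
    row = conn.execute(
        "SELECT digest FROM config_history WHERE host = ? ORDER BY id DESC LIMIT 1",
        (host,),
    ).fetchone()
    if row is not None and row[0] == digest:
        return False  # no change in information, no update
    conn.execute(
        "INSERT INTO config_history (host, collected_at, digest, config_json) "
        "VALUES (?, ?, ?, ?)",
        (host, time.time() if now is None else now, digest, canonical),
    )
    conn.commit()
    return True


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    print(record_if_changed(conn, "site-a", {"scratch_dir": "/tmp"}))
    print(record_if_changed(conn, "site-a", {"scratch_dir": "/tmp"}))
```

Because earlier rows are never overwritten, the table doubles as the archive: every row is a timestamped snapshot of a configuration that was in effect from that time until the next row for the same host.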

13
Tomcat Service Layer
  • Tomcat plays an important role in our
    configuration monitoring system.
  • Tomcat servlet technology is used to provide a
    web interface through which users can:
  • - Browse the available hosts
  • - Browse the available services and their
    configurations
  • - Make a specific query on a host and/or
    service
  • The same technology is used to authorize a
    person to perform administration tasks
    securely:
  • - Update the resources/services to be
    monitored
  • - Reset the availability of services

14
Tomcat Service Layer (2)
  • In the future, we plan to provide a web service
    through Tomcat for both users and administrators
  • - Users will be able to query the information in
    the configuration database through command-line
    scripts. This will include the available
    resources, services, and their configuration info
    in the central database (or databases). Still, a
    user who wants the newest information has
    to retrieve it directly from a
    local info provider.
  • - Administrators will be able to update their
    site information through the web service
    mechanism, e.g., when a service must be shut down
    immediately.

15
Web Interface Screenshot (1)
16
Web Interface Screenshot (2)
17
Web Interface Screenshot (3)
18
Security features of the Configuration Monitoring
System
  • Keeping the configuration information
    accessible only to authorized users is one of
    our top priorities
  • As the site info provider is part of the Globus
    MDS, it has the same security mechanism as the
    standard Globus Toolkit
  • The web interface enforces strong authentication
    and authorization using digital certificates.
    This requires the client web browser to be able to
  • - manage client certificates
  • - perform SSL mutual authentication

19
Security features of the Configuration Monitoring
System (2)
  • On the server side, all the web pages and
    servlets are put behind an authorization servlet
    filter. Currently, we are using a filter
    package developed by EDG.
  • The authorization filter examines every incoming
    request and tries to extract the client
    certificate from the request. It then passes the
    extracted client DN to an authorization manager
    for verification. If the authorization manager
    can verify the client DN, it gives the user
    permission to view the web info; otherwise, it
    terminates the request and informs the user that
    authorization failed.
  • Currently, the authorization manager is
    configured to examine a standard grid-mapfile to
    see whether the requesting user's DN can be found
    in it.
  • Furthermore, the user DN entries in the
    grid-mapfile are extracted from a VOMS (Virtual
    Organization Membership Service) server
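
The grid-mapfile lookup at the heart of this check can be sketched in a few lines. The actual filter is the EDG Java package running inside Tomcat; the Python below only illustrates the lookup logic, assuming the usual grid-mapfile layout of a quoted DN followed by a local account name. The function names and the example DNs are hypothetical.

```python
import shlex


def load_grid_mapfile(text):
    """Parse grid-mapfile text into {DN: local account}, skipping '#' comments."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = shlex.split(line)  # honors the quotes around DNs with spaces
        if len(parts) >= 2:
            mapping[parts[0]] = parts[1]
    return mapping


def is_authorized(client_dn, mapping):
    """Allow the request only if a client DN was extracted and is listed."""
    return client_dn is not None and client_dn in mapping


if __name__ == "__main__":
    # Hypothetical entry; real entries would be generated from the VOMS server.
    sample = '"/DC=org/DC=example/OU=People/CN=Jane Doe" uscms01\n'
    allowed = load_grid_mapfile(sample)
    print(is_authorized("/DC=org/DC=example/OU=People/CN=Jane Doe", allowed))
    print(is_authorized("/DC=org/DC=example/OU=People/CN=Unknown User", allowed))
```

A request that arrives without a client certificate yields no DN, so passing `None` is rejected in the same way as an unlisted DN.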

20
Security features of the Configuration Monitoring
System (3)
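[Figure: authorization flow. A User Request (DN, etc.) reaches
the Servlet filter in Tomcat, which asks the Authorization
Manager whether the user is authorized. The Authorization
Manager consults the Grid-mapfile, whose entries come from
VOMS. Authorized requests reach the Configuration Info
(.html, .jsp, servlets).]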
21
The Current Status of Configuration Monitoring
Tool
  • We have finished the initial development of the
    major components of the configuration monitoring
    tool and tested it using USCMS grid resources
  • The information provided by the configuration
    monitoring tool has been used in the USCMS
    distributed Monte Carlo production, its first
    customer (details on the next slide)
  • Other applications, such as GridServ, under
    development at the University of Florida, have
    also shown interest in using the info published
    by the configuration monitoring tool.
  • - More info on this can be found at
  • https://gdsuf.phys.ufl.edu:8443/gridmon/admin/gridserv/dpeclient

22
The Current Status of Configuration Monitoring
Tool (2)
  • MOP is a system for distributing CMS production
    jobs over the distributed grid environment.
    Currently, it is the main production system used
    in the USCMS grid testbeds.
  • In order to generate and submit MOP jobs, the MOP
    job submitter needs to know a set of parameters
    for each remote site intended to run jobs, for
    example:
    MOP_MAX_JOBS                              100
    MOP_REMOTE_JOB_MANAGER_FOR_RUN            garlic.hep.wisc.edu/jobmanager
    MOP_REMOTE_JOB_MANAGER_FOR_STAGE_IN       garlic.hep.wisc.edu/jobmanager
    MOP_REMOTE_JOB_MANAGER_FOR_STAGE_OUT      garlic.hep.wisc.edu/jobmanager
    MOP_REMOTE_JOB_MANAGER_FOR_PUBLISH        garlic.hep.wisc.edu/jobmanager
    MOP_REMOTE_JOB_MANAGER_FOR_CLEANUP        garlic.hep.wisc.edu/jobmanager
    MOP_REMOTE_RUNTIME_AREA                   /afs/hep.wisc.edu/grid3/shared-tmp
    MOP_EXPORT_DIR                            /afs/hep.wisc.edu/grid3/shared-tmp
    MOP_REMOTE_VDT_LOCATION                   /data/grid/GRID3/
    MOP_REMOTE_DAR_ROOT                       /afs/hep.wisc.edu/grid3/app/uscms01
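
Since a job generator depends on every one of these parameters, it could check a site's published set for completeness before generating anything. The Python sketch below illustrates that idea. It is not part of MOP: the function name and the treatment of every listed parameter as required are assumptions, while the parameter names and example values are the ones shown on the slide.

```python
# Parameter names as listed on the slide; treating all of them as required
# is an assumption made for this illustration.
REQUIRED_MOP_PARAMETERS = (
    "MOP_MAX_JOBS",
    "MOP_REMOTE_JOB_MANAGER_FOR_RUN",
    "MOP_REMOTE_JOB_MANAGER_FOR_STAGE_IN",
    "MOP_REMOTE_JOB_MANAGER_FOR_STAGE_OUT",
    "MOP_REMOTE_JOB_MANAGER_FOR_PUBLISH",
    "MOP_REMOTE_JOB_MANAGER_FOR_CLEANUP",
    "MOP_REMOTE_RUNTIME_AREA",
    "MOP_EXPORT_DIR",
    "MOP_REMOTE_VDT_LOCATION",
    "MOP_REMOTE_DAR_ROOT",
)


def missing_parameters(site_config):
    """Return the required parameter names that are absent or empty."""
    return [
        name
        for name in REQUIRED_MOP_PARAMETERS
        if not str(site_config.get(name, "")).strip()
    ]


if __name__ == "__main__":
    job_manager = "garlic.hep.wisc.edu/jobmanager"
    site = {
        "MOP_MAX_JOBS": "100",
        "MOP_REMOTE_JOB_MANAGER_FOR_RUN": job_manager,
        "MOP_REMOTE_JOB_MANAGER_FOR_STAGE_IN": job_manager,
        "MOP_REMOTE_JOB_MANAGER_FOR_STAGE_OUT": job_manager,
        "MOP_REMOTE_JOB_MANAGER_FOR_PUBLISH": job_manager,
        "MOP_REMOTE_JOB_MANAGER_FOR_CLEANUP": job_manager,
        "MOP_REMOTE_RUNTIME_AREA": "/afs/hep.wisc.edu/grid3/shared-tmp",
        "MOP_EXPORT_DIR": "/afs/hep.wisc.edu/grid3/shared-tmp",
        "MOP_REMOTE_VDT_LOCATION": "/data/grid/GRID3/",
        "MOP_REMOTE_DAR_ROOT": "/afs/hep.wisc.edu/grid3/app/uscms01",
    }
    print(missing_parameters(site))
```

An empty result means the site can be used for job generation; any names returned point to settings the site still needs to publish.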

23
The Current Status of Configuration Monitoring
Tool (3)
  • Before using the configuration monitoring tool
  • - Remote system administrators had to mail
    this information to the person who generated MOP
    jobs, who would then put it into a
    configuration file.
  • - This model was very prone to failure. If
    a site configuration changed, jobs were at risk
    of failing even before they were submitted, for
    example when a system administrator forgot to
    mail the new information or a MOP user forgot to
    check e-mail and update the submitter-side file.
  • After using the tool
  • - The site system administrator only
    needs to modify a local copy of the configuration
    file. The configuration monitoring tool takes
    care of the rest.

24
Future development
  • We think further development is needed in the
    following areas:
  • Provide web services through Tomcat to query
    and/or update the info in the database
  • Collect more resource configuration information
    from other monitoring tools, such as
    MonALISA, Ganglia, etc.
  • Provide a web interface to view history data
  • - History data is now archived in the database
    with timestamps. We need an interface to
    view it.

25
Summary
  • A configuration monitoring tool has been
    developed on top of Globus technology and web
    services to allow users/sites to publish site
    configuration info, archive the collected info,
    and query it
  • The Grid Security Infrastructure, together with
    EDG Java Security packages, is used for secure
    authentication and transparent access to the
    configuration information across the USCMS grid
  • The configuration monitoring tool has been
    installed on the USCMS Grid testbeds and tested
    with USCMS grid production jobs
  • Further improvements have been identified and
    will be available in the near future