OMIS Approach to Grid Application Monitoring

1
OMIS Approach to Grid Application Monitoring
  • Bartosz Balis
  • Marian Bubak
  • Wlodzimierz Funika
  • Roland Wismueller

2
AGENDA
  • Introduction
  • Monitoring architecture
  • sensors (local monitors, application monitors)
  • service managers
  • Performance
  • efficient data gathering
  • scalability of grid-scale monitoring
  • Producer / consumer communication protocol
  • Comparison to DATAGRID
  • Experience
  • Conclusion

3
Introduction
  • Need for monitoring applications
  • improve performance
  • localize bugs
  • For these purposes specialized tools are needed
  • debuggers, performance analyzers, visualizers,
    etc.
  • Tools composed of two modules
  • user interface
  • monitoring module

4
Introduction (contd)
  • Main issues of monitoring on Grid
  • scale of Grid enormous
  • many applications, many users, high distribution,
    high heterogeneity
  • simply porting existing environments not
    sufficient!
  • A solution
  • underlying universal monitoring system
  • well defined interface to tools
  • Experience with OMIS / OCM: PVM → MPI, port of
    tools
  • next step: move to Grid?

5
Monitoring architecture
  • Compliance with GMA (Grid Monitoring
    Architecture)
  • producer / consumer model
  • Sensors: producers of performance data
  • Tools: consumers of the data
  • Direct communication between producers and
    consumers
  • Producers located via e.g. a directory service
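The producer / consumer model above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Directory` and `Sensor` classes, their methods, and the node name are hypothetical and are not taken from GMA or OMIS. The point it shows is that the directory is used only to find a producer, while the data itself flows directly from producer to consumer.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Directory:
    """Hypothetical directory service: maps a node name to its producer."""
    entries: Dict[str, "Sensor"] = field(default_factory=dict)

    def publish(self, node: str, sensor: "Sensor") -> None:
        self.entries[node] = sensor

    def lookup(self, node: str) -> "Sensor":
        return self.entries[node]


@dataclass
class Sensor:
    """Producer of performance data for one node (hypothetical)."""
    node: str
    samples: List[float] = field(default_factory=list)

    def query(self) -> List[float]:
        # The consumer talks to the producer directly; the directory
        # is involved only in locating it.
        return list(self.samples)


directory = Directory()
sensor = Sensor(node="node-a", samples=[0.5, 0.7])
directory.publish(sensor.node, sensor)

# A tool (consumer) locates the producer, then queries it directly.
producer = directory.lookup("node-a")
data = producer.query()
```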

6
Sensors
  • Collect performance data from applications
  • Two types of sensors
  • local monitors (process sensors)
  • application monitors

7
Sensors (contd)
  • Local monitors
  • one per node
  • collect data only from processes on this node
  • publish themselves in the directory service
  • Application monitors
  • embedded parts of applications
  • collect data on various events, e.g. function
    calls
  • may improve efficiency and portability
  • interact with local monitors

8
Monitoring Architecture

9
Service managers
  • Tool ↔ local monitors: one consumer, multiple
    producers
  • Intermediate entity: service manager
  • handles requests coming from a tool
  • splits them into sub-requests for local monitors
  • collects replies from local monitors
  • assembles them into a single reply for the tool
  • Both producer (of data for tools) and consumer
    (of data from local monitors)
  • Offers the functionality of local monitors but on
    a per-application basis
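The split / collect / assemble cycle of a service manager can be sketched as follows. All names here (`LocalMonitor`, `ServiceManager`, `handle`, the CPU-time metric and the process IDs) are assumptions made for illustration; real OMIS requests are considerably richer than a list of process IDs.

```python
from typing import Dict, List


class LocalMonitor:
    """Hypothetical local monitor: knows only the processes on its node."""

    def __init__(self, node: str, cpu_time: Dict[int, float]) -> None:
        self.node = node
        self.cpu_time = cpu_time  # pid -> CPU seconds (made-up metric)

    def handle(self, pids: List[int]) -> Dict[int, float]:
        return {pid: self.cpu_time[pid] for pid in pids}


class ServiceManager:
    """Consumer of local-monitor data, producer of data for the tool."""

    def __init__(self, monitors: List[LocalMonitor]) -> None:
        self.monitors = monitors

    def handle(self, pids: List[int]) -> Dict[int, float]:
        reply: Dict[int, float] = {}
        for monitor in self.monitors:
            # Split: each local monitor receives only its own processes.
            local = [pid for pid in pids if pid in monitor.cpu_time]
            if local:
                # Collect the partial reply and merge it into one reply.
                reply.update(monitor.handle(local))
        return reply


manager = ServiceManager([
    LocalMonitor("node-a", {101: 1.5, 102: 0.25}),
    LocalMonitor("node-b", {201: 3.0}),
])

# The tool sends one request and gets back one assembled reply.
reply = manager.handle([101, 201])
```

From the tool's point of view the service manager looks like a single producer, which is why it can offer the local monitors' functionality on a per-application basis.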

10
Application Monitors
  • Part of the monitoring system embedded in the
    application processes
  • have access to the application address space!
  • Many possible usages
  • efficient data gathering and storing
  • may take over some of the local monitors' tasks
  • may be used to dynamically load monitoring
    extensions
  • even more for multithreaded applications

11
Application Monitors: debugging example
  • A debugger wants to access a process address
    space
  • Standard system mechanisms: ptrace, /proc
  • /proc more powerful yet platform-dependent
  • synchronous control
  • Via application monitors → request from the
    debugger to access the data
  • portable, asynchronous
  • question: how to ensure that application monitors
    are not corrupted by the application?

12
Performance
  • Efficient data gathering
  • data production much more frequent than retrieval
  • frequency and time of access difficult to
    predict
  • Scalability
  • grid-scale monitoring system
  • distributed vs. centralized

13
Efficient data gathering
  • Local storing
  • performance data first stored locally, in the
    context of application processes
  • on request, passed to local monitors
  • saves communication and context switches between
    application and local monitor processes
  • Efficient data structures
  • performance data initially preprocessed
  • summarized information stored in e.g. counters
    and integrators
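A minimal sketch of local storing with counters and integrators follows. It assumes that a counter counts events and that an integrator accumulates the time integral of a piecewise-constant quantity (here, "is the process waiting?"). The class names, method names, and timestamps are illustrative, not the OCM implementation.

```python
class Counter:
    """Counts events, e.g. calls to an instrumented function."""

    def __init__(self) -> None:
        self.value = 0

    def increment(self, amount: int = 1) -> None:
        self.value += amount


class Integrator:
    """Accumulates the time integral of a piecewise-constant quantity."""

    def __init__(self, start_time: float = 0.0) -> None:
        self.total = 0.0
        self.level = 0.0
        self.last_time = start_time

    def set_level(self, now: float, level: float) -> None:
        # Add the area under the previous level before switching.
        self.total += self.level * (now - self.last_time)
        self.level = level
        self.last_time = now

    def read(self, now: float) -> float:
        return self.total + self.level * (now - self.last_time)


calls = Counter()
waiting = Integrator()

# Events are summarized locally, inside the application process...
waiting.set_level(now=1.0, level=1.0)   # starts waiting at t = 1
calls.increment()
waiting.set_level(now=3.0, level=0.0)   # stops waiting at t = 3
calls.increment()

# ...and only the summary is handed over when a request arrives.
snapshot = {"calls": calls.value, "time_waiting": waiting.read(now=5.0)}
```

Each event costs a few arithmetic operations and no communication, which fits the observation that data is produced far more often than it is retrieved.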

14
Scalability
  • Decentralization → multiple service managers
    instead of one
  • Possible approaches
  • fixed number of service managers, each
    responsible for part of the system
  • one service manager starting for every monitored
    application

15
Fixed number of SMs

16
One SM per application

17
Scalability (contd)
  • In the first approach
  • tighter cooperation between service managers
    will be necessary
  • In the second approach
  • local monitors must have the ability to serve
    multiple service managers
  • service managers locate local monitors via
    directory service
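The second approach (one service manager per application) can be sketched as below. The directory is modelled as a plain dictionary, and `LocalMonitor`, `attach`, `start_service_manager`, and the node and application names are all hypothetical. The sketch only shows why a local monitor must be able to serve several service managers at once: applications may share a node.

```python
from typing import Dict, List, Set


class LocalMonitor:
    """Hypothetical local monitor that several managers can attach to."""

    def __init__(self, node: str) -> None:
        self.node = node
        self.attached: Set[str] = set()

    def attach(self, manager_id: str) -> None:
        self.attached.add(manager_id)


# Hypothetical directory service: node name -> local monitor.
directory: Dict[str, LocalMonitor] = {
    name: LocalMonitor(name) for name in ("node-a", "node-b", "node-c")
}


def start_service_manager(app: str, nodes: List[str]) -> List[LocalMonitor]:
    """Start one service manager for `app` and attach it to its nodes."""
    # Local monitors are located through the directory service.
    monitors = [directory[node] for node in nodes]
    for monitor in monitors:
        monitor.attach(f"sm-{app}")
    return monitors


start_service_manager("app1", ["node-a", "node-b"])
start_service_manager("app2", ["node-b", "node-c"])

# node-b runs processes of both applications, so its local monitor
# now serves two service managers.
shared = directory["node-b"].attached
```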

18
Communication protocol
  • Based on the OMIS specification
  • OMIS: On-line Monitoring Interface Specification
  • specification of a universal interface between
    tools and a monitoring system
  • supports various types of tools
  • allows for easy extension
  • Necessary Grid-specific extensions (e.g. for
    authentication)

19
Comparison to DATAGRID
  • Monitoring approach
  • DG: (semi-)on-line
  • CG: on-line
  • Architecture
  • DG: centralized/distributed (local monitors and
    one main monitor)
  • CG: distributed (local monitors and multiple
    service managers)

20
Comparison to DATAGRID (contd)
  • Data collection
  • DG: local storing with trace buffering or
    counters
  • CG: local storing with preprocessing (counters,
    integrators)
  • Communication protocol
  • DG: not specified
  • CG: OMIS

21
Experience
  • OMIS-based monitoring system for clusters of
    workstations: OCM
  • OMIS-based tools: PATOP (performance analysis),
    DETOP (debugging), others
  • Local storing and efficient data structures
    (counters and integrators) proved to be very
    efficient
  • full monitoring overhead of about 4%
  • The instrumentation techniques used induce
    zero overhead when monitoring is inactive

22
Summary
  • Demand for accurate data from monitoring tools
  • Monitoring data handling: production /
    consumption
  • A general scheme of monitoring compliant with GMA
  • Need for an advanced monitoring infrastructure
  • Concepts of OMIS will be extended to fit Grid