SC2000 Tutorial - PowerPoint PPT Presentation

1 / 70
About This Presentation
Title:

SC2000 Tutorial

Description:

When building distributed systems, we often observe unexpectedly low performance ... the disks or network adapters on either the sending or receiving host ... – PowerPoint PPT presentation

Number of Views:70
Avg rating:3.0/5.0
Slides: 71
Provided by: briant2
Category:

less

Transcript and Presenter's Notes

Title: SC2000 Tutorial


1

Tutorial on Using NetLogger Toolkit version 2.x
Dan Gunter Brian L. Tierney
Distributed Systems Department Lawrence Berkeley
National Laboratory
2
Overview
  • The Problem
  • When building distributed systems, we often
    observe unexpectedly low performance
  • the reasons for which are usually not obvious
  • The bottlenecks can be in any of the following
    components
  • the applications
  • the operating systems
  • the disks or network adapters on either the
    sending or receiving host
  • the network switches and routers, and so on
  • The Solution
  • Highly instrumented systems with precision timing
    information and analysis tools

3
Bottleneck Analysis
  • Distributed system users and developers often
    assume the problem is network congestion
  • This is often not true
  • In our experience tuning distributed
    applications, performance problems are due to
  • network problems 40
  • host problems 20
  • application design problems/bugs 40
  • 50 client , 50 server
  • Therefore it is equally important to instrument
    the applications

4
NetLogger Toolkit
  • We have developed the NetLogger Toolkit (short
    for Networked Application Logger), which
    includes
  • tools to make it easy for distributed
    applications to log interesting events at every
    critical point
  • tools for host and network monitoring
  • The approach is novel in that it combines
    network, host, and application-level monitoring
    to provide a complete view of the entire system.
  • This has proven invaluable for
  • isolating and correcting performance bottlenecks
  • debugging distributed applications

5
NetLogger Components
  • NetLogger Toolkit contains the following
    components
  • NetLogger message format
  • NetLogger client library (C, Java, Python)
  • NetLogger visualization tools
  • NetLogger host/network monitoring tools
  • Source code (BSD-like open source) and binaries
    are available at
  • http//www-didc.lbl.gov/NetLogger/
  • Additional critical component for distributed
    applications
  • NTP (Network Time Protocol) or GPS host clock is
    required to synchronize the clocks of all systems

6
NetLogger Methodology
  • NetLogger is both a methodology for analyzing
    distributed systems, and a set of tools to help
    implement the methodology.
  • You can use the NetLogger methodology without
    using any of the LBNL provided tools.
  • The NetLogger methodology consists of the
    following
  • All components must be instrumented to produce
    monitoring These components include application
    software, middleware, operating system, and
    networks. The more components that are
    instrumented the better.
  • All monitoring events must use a common format
    and common set of attributes and a globally
    synchronized timestamp
  • Log all of the following events Entering and
    exiting any program or software component, and
    begin/end of all IO (disk and network)
  • Collect all log data in a central location
  • Use event correlation and visualization tools to
    analyze the monitoring event logs

7
When to use NetLogger
  • When you want to
  • do performance/bottleneck analysis on distributed
    applications
  • determine which hardware components to upgrade to
    alleviate bottlenecks
  • do real-time or post-mortem analysis of
    applications
  • correlate application performance with system
    information (ie TCP retransmission's)
  • works best with applications where you can follow
    a specific item (data block, message, object)
    through the system

8
When NOT to use NetLogger
  • Analyzing massively parallel programs (e.g. MPI)
  • Current visualization tools dont scale beyond
    tracking about 20 types of events at a time
  • However, NetLogger is useful for watch a few
    nodes, and for collecting MPI log files from all
    the nodes
  • Analyzing many very short events
  • system will become overwhelmed if too many events
  • we typically use NetLogger to monitor events that
    take gt .5 ms
  • e.g probably dont want to use to instrument the
    UNIX kernel

9
NetLogger Client Library
  • There are 2 versions of the client library
  • Netlogger-lite only supports NetLogger format
    (single .c and .h file)
  • Recommended for most use-cases
  • C, Python (native or via SWIG), and Java (native
    only)
  • NetLogger-full supports all formats, reconnect,
    backup URLs, and more
  • C, Python (via SWIG), Java (via SWIG)

10
NetLogger Event Format
  • NetLogger supports the following message formats
  • NetLogger format native ASCII format, easy to
    read and parse, enforces data model
  • NetLogger binary for applications that generate
    1000s of events / second
  • ULM for compatibility with NetLogger 1.0
  • XML if you want to run very slow -)

11
NetLogger Data Model
  • NetLogger message format is designed to help
    enforce not just a common format, but also a
    common data model
  • Why bother with a data model?
  • Makes it easier to write generic consumers of the
    data for analysis, visualization
  • Relational databases need as much structure as
    they can get
  • Elegance if we can describe a general principle,
    we should!

12
NetLogger Message Format (1)
  • Each NetLogger message is called an event
  • The event is subdivided into fields
  • Each field has three parts, typecode, key, and
    value
  • t DATE 2004-04-15T213601.425059
  • Typecodes
  • d Double (64 bits)
  • i Integer (32 bits)
  • l Long integer (64 bits)
  • s String (up to 4K)
  • t Timestamp (ISO8601 date format)

13
NetLogger Message Format (2)
  • The following fields are generated automatically
    by NetLogger (all APIs)
  • DATE The time that this event occurred.
  • LVL The logging level of this event.
  • Fatal, Error, Warn, Info, Debug, Debug1, Debug2,
    Debug3, or User. The default level is Info.
  • EVNT This field is the name of the event.
  • Any number of additional fields are allowed
  • Well, technically, 128 total is the limit

14
NetLogger Message Format (3)
  • Sample NetLogger event
  • t DATE 2004-04-15T213601.425059
  • s LVL Info
  • s HOST foo.lbl.gov
  • s PRGM test_program
  • s EVNT SendData
  • i Send.Size 49332
  • This says program named test_program on host
    foo.lbl.gov performed event named SendData with a
    size of 49332 bytes, on April 15 at 936 PM
    (UTC).
  • User-defined data elements are used to store
    information about the logged event - for example
  • EVNT SendData / Send.Size 49332
  • the number of bytes of data sent
  • EVNT netstat.retranssegs / NS.RTS 2
  • the number of TCP retransmits since the previous
    event

15
NetLogger APIs
  • NetLogger Toolkit includes application libraries
    for generating NetLogger messages
  • Can send log messages to
  • file
  • host/port (netlogd)
  • syslogd
  • C, Java, and Python APIs are currently supported
  • Compatible with standard Java and Python logging
    packages
  • Perl, TCL, etc.
  • Easy to generate compatible format using string
    print functions

16
NetLogger C API
17
C API deployment
  • Designed to be lightweight
  • One .c and one .h file
  • To build from source, also need
    netlogger_stdint.h for portability
  • lt 3000 lines of code, total
  • Can just copy into your source code go

18
NetLogger C API Design
Format a modules event at a given logging level
for a given target type
Format an event belonging to a module to a
stream using write() with variable arguments
public interface exists for each layer
Abstraction
Format a stream of records
Format a record into a string
19
Basic C API concepts
  • Module
  • Name for a globally accessible output
    destination. Multiple modules may be active at
    any one time.
  • Logging levels
  • Each event has a level, 0..9 are reserved, but
    any posint is OK (appears as NL_USER in output)
  • Each module has a level, meaning only events at
    this level of detail or lower levels of details
    (henceforth just lower levels) will be written
  • Target types
  • application, dbg, path, node or any
    user
  • Determines additional auto-generated fields, such
    as file name and line number for dbg
  • Determines required fields, such as SRC and DST
    for path

20
C API Example
  • // create module named dbg
  • NL_logger_module("dbg", NULL, NL_LVL_INFO,NL_TYPE_
    DBG, "")
  • // write a loop.start event to the dbg module
  • NL_debug("dbg","loop.start","starti stopi", 0,
    10)
  • for ( ctr0 ctr lt 10 ctr )
  • // wrap do_something() with some log events
  • NL_debug1("dbg","do.something.start","inde
    xi", ctr)
  • do_something()
  • NL_debug1("dbg","do.something.end","index
    i", ctr)
  • // write a loop.end event
  • NL_debug("dbg","loop.end","starti stopi", 0,
    10)
  • // close all log modules
  • NL_logger_del()

21
Sample output (1)
  • If you compile and run this example, you will get
    no output!
  • Reasons
  • Destination for logging was specified as NULL
  • Message level (debug) is more detailed than the
    default message level for the module
    (NL_LVL_INFO)
  • Both of these can be changed without even
    re-compiling the application

gcc simple_example.c ./a.out huh?!
22
C API, Dynamic Log Locations / Levels
  • The environment variable NL_DEST provides a
    default location for logging
  • NL_DESTx-netlog//some.remote.host15001
  • Changing log levels on the fly is very useful for
    debugging
  • Very simple to do this with C API
  • NLCONFIGLEVEL/tmp/level
  • echo 3 gt /tmp/level
  • You can even do per-module levels
  • NLCONFIGLEVEL/tmp/level
  • echo 5 dbg\n3 gt /tmp/level

23
C API Example output (2)
  • Applying what we now know to the example

gcc simple_example.c export
NLCONFIGLEVEL/tmp/dbglevel echo 6 gt
/tmp/dbglevel export NL_DEST- stdout
./a.out t DATE 2004-09-29T224238.759067 s LVL
Debug s HOST 131.243.2.143 s TGT dbg s EVNT
loop.start s FILE example.c i LINE 9 s FN
main i start 0 i stop 10 t DATE
2004-09-29T224238.759268
s LVL Debug1 s HOST 131.243.2.143 s TGT dbg s
EVNT do.something.start s FILE example.c i
LINE 11 s FN main i index 0 t DATE
2004-09-29T224238.759326 s LVL Debug1 s HOST
131.243.2.143 s TGT dbg s EVNT
do.something.end s FILE example.c i LINE 13 s
FN main i index 0
etc..
24
C API Performance
  • To /dev/null
  • 250,000 events/sec
  • To a file in /tmp
  • 95,000 events/sec
  • These results were with the highest API layer.
    Using the next layer down gives about a 10
    speedup.
  • These tests were on a faster machine than the
    Python and Java ones

25
NetLogger Python API
26
Python API Design
Python stdlib logging module
wraps
nllogging.py
nllite.so C library
calls
nllite.py
calls (optionally)
27
Sample usage of nllogging.py
  • import logging
  • import sys
  • import gov.lbl.dsd.netlogger.nllogging as \
    nllogging
  • create a logging 'channel'
  • logger logging.getLogger('sample')
  • proxy Handler class w/ NLHandlerProxy
  • logger.addHandler(nllogging.NLHandlerProxy(logging
    .StreamHandler))
  • write some messages
  • logger.warn("this log message is ignored")
  • for i in range(10)
  • logger.warn("hi","foo"i,"bar"i/2.0))

28
Sample usage of nllite.py
  • import gov.lbl.dsd.netlogger.nllite as \ nllite
  • get file object for NetLogger URL
  • ofile nllite.urlfile("-")
  • open log stream
  • log nllite.LogOutputStream(ofile)
  • write some messages
  • for i in range(10) log.write("hi","foo"i,"ba
    r"i/2.0,
  • levelnllite.Level.WARN)

29
Python API Performance
logging to a file in /tmp
nllite.py API
nllogging.py API
Normal usage
pure Python
800
7000
25000
n/a
Python / C with SWIG
High-perf usage
Units events/second
30
Choosing a Python API
  • Use the nllogging API if you
  • Want compatibility with existing instrumentation
  • Dont log more than a few dozen events a second
    from any one process
  • Want to use the log4j-style configuration files
    built in to this interface
  • Use the nllite API if you
  • Are sometimes logging hundreds of events a second
    from a single process
  • Dont need log4j config files
  • Want to write over a socket directly
  • Need the C librarys dynamic levels

31
NetLogger Java API
32
Java API Design Goals
  • Interoperate with log4j and commons logging APIs
  • Logging calls should not take more than one or
    two lines of code
  • Efficiency

33
Java API Usage
  • Java code
  • public static logSomething(void)
  • private static final org.apache.commons.logging.L
    og log org.apache.commons.logging.LogFactory.ge
    tLog(nl")
  • for (int i 0 i lt 10 i)
  • log.debug(new LogMessage("dbg", "my.event").
    add("index", i))
  • Log4j.properties
  • Send nl' messages to the NETLOGGER appender
  • log4j.logger.nlDEBUG, NETLOGGER
  • Append to file
  • log4j.appender.NETLOGGERorg.apache.log4j.FileAppe
    nder
  • log4j.appender.NETLOGGER.File/tmp/foo1

34
Java API Performance
Format-only
Write to /tmp
New message every call
Normal case
45,000
90,000
Reuse message object
800,000
80,000
HotSpot test?
Unitsevents/sec
35
Java API Deployment
netlogd (collect log files)
WAN
Log4J properties file
nlforward (send files to net)
LogMessage.java (nllite.jar)
disk
Will talk about this part next
APP
36
Using NetLogger for Distributed Applications
37
Recommended approach
netlogd (collect log files)
WAN
node
N
38
Why not just use a socket?
  • Distributed communication is unreliable
  • Putting reconnect and buffering logic into the
    instrumentation library is not a good idea
  • Bloats and complicates the code
  • Cleanup is a nightmare
  • Who deals with the file if the app crashes?
  • Separating it out has many advantages
  • Writing log files will have almost no impact on
    application
  • No chance that network congestion will cause
    application to block
  • Logged events will eventually get sent to
    netlogd, even if program exits or crashes

39
nlforward
  • Scan a directory for files matching some glob or
    regex pattern, and forward them to a NetLogger
    URL (file, socket)
  • Scan interval is configurable to reduce load
  • Can optionally age out files that havent been
    modified in some time period
  • If you want, can even remove them!
  • Example
  • nlforward -e 600 -s 5 ./'.log' -p fwd.state
    x-netlog//some.host14380
  • Forward all files ending in .log to
    some.host14380, scanning every 5 sec for
    changes and timing out files after 10 min.

40
netlogd
  • Use netlogd to collect NetLogger messages at a
    central host
  • use to avoid the need to sort/merge several log
    files from several places

Network 3
Network 2
Network 1
netlogd
NetLogger data
41
NetLogger Event Life Lines
42
Event ID
  • In order to associate a group of events into a
    lifeline, you must assign an event ID to each
    NetLogger event
  • Sample Event Ids
  • file name
  • block ID
  • frame ID
  • user name
  • host name
  • combination of the above
  • etc.

43
Sample NetLogger Use with Event IDs
  • include nl_log.h
  • main()
  • int i0, size 65536, num_blocks 100
  • NL_logger_module(foo",NULL, NL_LVL_INFO,
    NL_TYPE_APP, "")
  • for ( i0 i lt num_blocks i )
  • / read block of data)
  • NL_write(INFO, foo",read.start","indexi
    read.sizei", i, size)
  • read_data(i)
  • NL_write(INFO, foo",read.end","indexi
    read.sizei", i, size)
  • / process block of data /
  • NL_write(INFO, foo",process.start","indexi",i
    ,)
  • process_data(i)
  • NL_write(INFO, foo",read.start","indexi
    read.sizei", i, size)
  • / write results /
  • NL_write(INFO, foo",write.start","indexi
    write.sizei", i, size)
  • send_data(i)
  • NL_write(INFO, foo",write.start","indexi
    write.sizei", i, size)

44
NetLogger Host/Network Tools
  • Wrapped UNIX network and OS monitoring tools to
    log interesting events using the same log
    format
  • netstat (TCP retransmissions, etc.)
  • vmstat (system load, available memory, etc.)
  • iostat (disk activity)
  • ping
  • These tools have been wrapped with python
    programs which
  • parse the output of the system utility
  • build NetLogger messages containing the results
  • Also have a NetLogger wrapper for Ganglia

45
Sample NetLogger System Monitoring Tool
  • Example nl_vmstat -t 60 -d 5000 -m 2
    logger.lbl.gov
  • program will run vmstat every 5 seconds for 1
    hour, and send the results to netlogd on host
    logger.lbl.gov
  • Generates the following information
  • CPU usage by User
  • CPU usage by System

46
NetLogger Visualization Tools
  • Exploratory, interactive analysis of the log data
    has proven to be the most important means of
    identifying problems
  • this is provided by nlv (NetLogger Visualization)
  • nlv functionality
  • can display several types of NetLogger events at
    once
  • user configurable which events to plot, and the
    type of plot to draw (lifeline, load-line, or
    point)
  • play, pause, rewind, slow motion, zoom in/out,
    and so on
  • nlv can be run post-mortem or in real-time
  • real-time mode done by reading the output of
    netlogd as it is being written

47
NLV Graph Types
  • nlv supports graphing of points, load-lines,
    and lifelines

48
NLV Screenshot
Menu bar
Title
Scale for load-line/ points
Events
Zoom box
Time axis
Max window size
Legend
Window size
Playback speed
Zoom-box actions
Summary line
Playback controls
Zoom window controls
You are here
49
NLV Configuration
  • NLV is very flexible, with many options settable
    in the configuration file.
  • Format
  • eventset /-eventset_name
  • type ltline,point,loadgt
  • id list of ULM field names used to determine
    which NetLogger messages get grouped into the
    same graph primitive
  • group list of ULM field names which will be
    mapped to the same color
  • val field_name min_val max_val
  • annotate list of field names to display in
    with annotate option
  • events list of all event IDs in this
    lifeline
  • Each nlv graph object needs to be defined by an
    eventset
  • Events and event-sets both use "" and "-" to
    indicate default (i.e. on startup) visibility

50
Example NLV Configuration
  • display vmstat info as a loadline
  • eventset VMSTAT
  • type load
  • loadline constructed from messages with the
    same HOST and NL.EVNT
  • id HOST NL.EVNT
  • messages with the same HOST get the same color
  • group HOST
  • list of NL.EVNT values in this set_
  • events VMSTAT_SYS_TIME VMSTAT_USER_TIME
  • display netstat TCP retransmits as a point
  • eventset NETSTAT
  • type point
  • truncate values outside the range 0 to 999
  • val NS.VAL 0.0 999.0
  • point constructed from messages from the same
    HOST and PROG
  • id HOST PROG
  • messages with the same HOST get the same color

51
Example NLV Configuration
  • display server data as a lifeline
  • eventset SERVER_READ
  • type line
  • lifeline constructed from messages from the
    same client and server
  • id CLIENT_HOST DPSS.SERV
  • messages with the same DPSS.SERV get the same
    color
  • group DPSS.SERV
  • events
  • APP_SENT
  • DPSS_SERV_IN
  • DPSS_START_READ
  • DPSS_END_READ
  • DPSS_START_WRITE
  • APP_RECEIVE

52
How to Instrument Your Application
  • Youll probably want to add a NetLogger event to
    the following places in your distributed
    application
  • before and after all disk I/O
  • before and after all network I/O
  • entering and leaving each distributed component
  • before and after any significant computation
  • e.g. an FFT operation
  • before and after any significant graphics call
  • e.g. certain CPU intensive OpenGL calls
  • This is usually an iterative process
  • add more NetLogger events as you zero in on the
    bottleneck

53
Does NetLogger affect application performance?
  • Only if you use it incorrectly or log too much
  • There are several things to be careful of when
    doing this type of monitoring
  • If logging to disk, dont log to a nfs mounted
    disk
  • best to log to /tmp, which may actually be RAM
    (Solaris)
  • Dont send log messages to a slow (i.e. 10BT) or
    congested network, as youll just make it worse
  • log to a local file instead
  • Sample NetLoggerWrite Performance 100000
    calls/sec
  • can make 1000 NetLoggerWrite calls / sec and only
    effect your application by 1

54
NetLogger Case Studies
55
Example HPSS Storage Manager Application
  • NetLogger was used to test and verify the results
    of a Storage Access Coordination System (STACS)
    by LBNLs Data Management Group
  • STACS is designed to optimize the use of a disk
    cache with an HPSS Mass Storage system, and tries
    to minimize tape mount requests by clustering
    related data on the same tape
  • NetLogger was used to look at
  • per-query latencies
  • to show that subsequent fetches of spatially
    clustered data "hit" in the cache.
  • (http//gizmo.lbl.gov/sm/)

56
STACS Instrumentation Points
Monitoring Points A) request arrives at HPSS B)
start transfer from tape C) tape transfer
finished D) file available to client E) file
retrieved by client F) file released by client
57
NLV for STACS Tracking File Requests
58
Tracking Files and System Performance
59
Example Parallel Data Block Server
  • The Distributed Parallel Storage Server (DPSS)
  • provides high-speed parallel access to remote
    data
  • Unique features of the DPSS
  • On a high-speed network, can actually access
    remote data faster that from a local disk
  • 70 MB/sec (DPSS) vs 22 MB/sec (local disk)
  • Only need to send parts of the file currently
    required over the network
  • e.g. client may only need 100 MB from a 2 GB
    data set
  • analogous to http model
  • NetLogger was used for performance tuning and
    debugging of the DPSS

60
DPSS Cache Architecture
61
NetLogger Results for the DPSS
62
NetLogger Results for the DPSS
63
NetLogger Results for the DPSS over a WAN
64
Example NLV of DPSS with a HENP client
65
Example Babar data analysis 2 nodes with
Objectivity Error
66
Example Matisse Project
67
Example Combined Host and Application Monitoring
68
Future directions
  • Data queries
  • Data grouping
  • Discovery queries
  • Statistical transformations

Subject, Operation, Parameters
Metadata ref Data
Monitoring service
Metadata refs
Host, network sensors
Instrumentation
Metadata can be inline or by reference
69
Getting NetLogger
  • Source code and binaries are available at
  • http//dsd.lbl.gov/NetLogger/Toolkit/
  • Code should run on all Unix platforms
  • Tested under Linux, Solaris, FreeBSD, and Mac OSX
  • Python and Java code should even run under
    (gasp!) Windows
  • Were not Windows developers, so some tweaks may
    be necessary

70
For More Information
  • Email
  • Dan Gunter, dkgunter_at_lbl.gov
  • Brian Tierney, bltierney_at_lbl.gov
  • http//dsd.lbl.gov/netlogger-toolkit/
  • New distribution combining netlogger-full and
    netlogger-lite
  • Subversion repo. on TO-DO list!
  • http//dsd.lbl.gov/nlv/
  • NetLogger Visualization tool
  • http//dsd.lbl.gov/NetLogger/
  • Older netlogger-full web pages, still lots of
    useful stuff
Write a Comment
User Comments (0)
About PowerShow.com