IBM T.J. Watson Research Center - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: IBM T.J. Watson Research Center


1
Deconstructing SPECweb99
  • Erich Nahum

IBM T.J. Watson Research Center
www.research.ibm.com/people/n/nahum
nahum@us.ibm.com
2
Talk Overview
  • Workload Generators
  • SPECweb99
  • Methodology
  • Results
  • Summary and Conclusions

3
Why Workload Generators?
  • Allows stress-testing and bug-finding
  • Gives us some idea of server capacity
  • Gives us a scientific process for comparing
    approaches
  • e.g., server models, gigabit adaptors, OS
    implementations
  • Assumption is that a difference in the testbed
    translates to some difference in the real world
  • Allows the performance debugging cycle

The Performance Debugging Cycle (diagram, with stages: Measure, Reproduce, Find Problem, Fix and/or Improve)
4
How does Workload Generation Work?
  • Many clients, one server
  • match asymmetry of Internet
  • Server is populated with some kind of synthetic
    content
  • Simulated clients produce requests for server
  • Master process to control clients, aggregate
    results
  • Goal is to measure server
  • not the client or network
  • Must be robust to conditions
  • e.g., if server keeps sending 404 not found, will
    clients notice?
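As a concrete illustration of the points above, here is a minimal closed-loop simulated client in Python. The host, port, and path are placeholders, and real generators layer timing, think times, and master/client coordination on top of a loop like this; counting non-200 responses is one way to stay "robust to conditions" such as a server that keeps returning 404s.

```python
import http.client

def simulated_client(host, port, path="/", n_requests=100):
    """Minimal closed-loop simulated client: issue a request, wait for
    the complete response, repeat. Non-200 responses are counted so a
    misbehaving server (e.g. a stream of 404s) is noticed rather than
    measured as healthy throughput. Host, port, and path stand in for
    a real testbed's synthetic content."""
    ok = errors = 0
    for _ in range(n_requests):
        conn = http.client.HTTPConnection(host, port, timeout=5)
        try:
            conn.request("GET", path)
            resp = conn.getresponse()
            resp.read()            # drain the body: measure full transfers
            if resp.status == 200:
                ok += 1
            else:
                errors += 1
        finally:
            conn.close()
    return ok, errors
```

A master process would launch many of these clients and aggregate the `(ok, errors)` counts.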

(Diagram: simulated clients send Requests to the server; the server returns Responses)
5
Problems with Workload Generators
  • Only as good as our understanding of the traffic
  • Traffic may change over time
  • generators must too
  • May not be representative
  • e.g., are file size distributions from IBM.com
    similar to mine?
  • May be ignoring important factors
  • e.g., browser behavior, WAN conditions, modem
    connectivity
  • Still, useful for diagnosing and treating
    problems

6
What Server Workload Generators Exist?
  • Many. In order of publication:
  • WebStone (SGI)
  • SPECweb96 (SPEC)
  • Scalable Client (Rice Univ.)
  • SURGE (Boston Univ.)
  • httperf (HP Labs)
  • SPECweb99 (SPEC)
  • TPC-W (TPC)
  • WaspClient (IBM)
  • WAGON (IBM)
  • Not to mention those for proxies (e.g. polygraph)
  • Focus of this talk: SPECweb99

7
Why SPECweb99?
  • Has become the de-facto standard used in
    industry
  • 141 submissions in 3 years on the SPEC web site
  • Hardware: Compaq, Dell, Fujitsu, HP, IBM, Sun
  • OSes: AIX, HPUX, Linux, Solaris, Windows NT
  • Servers: Apache, IIS, Netscape, Tux, Zeus
  • Used within corporations for performance,
    testing, and marketing
  • E.g., within IBM, used by AIX, Linux, and 390
    groups
  • Begs the question: how realistic is it?

8
Server Workload Characterization
  • Over the years, many observations have been made
    about Web server behavior
  • Request methods
  • Response codes
  • Document Popularity
  • Document Sizes
  • Transfer Sizes
  • Protocol use
  • Inter-arrival times
  • How well does SPECweb99 capture these
    characteristics?

9
History: SPECweb96
  • SPEC: Systems Performance Evaluation Consortium
  • Non-profit group with many benchmarks (CPU, FS)
  • Pay for membership, get source code
  • First attempt to get somewhat representative
  • Based on logs from NCSA, HP, Hal Computers
  • 4 classes of files
  • Poisson distribution within each class

10
SPECweb96 (cont)
  • Notion of scaling versus load
  • number of directories in data set =
    sqrt(throughput/5) × 10, so data-set size doubles
    as expected throughput quadruples
  • requests spread evenly across all application
    directories
  • Process based WG
  • Clients talk to master via RPCs
  • Does only GETs, no keep-alive
  • www.spec.org/osg/web96
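That scaling rule can be sketched as follows; the constants are taken from the slide as I read them, not re-checked against the official SPECweb96 documentation:

```python
import math

def specweb96_directories(target_throughput):
    """Dataset scaling as given on the slide: directories =
    sqrt(throughput / 5) * 10. Because the directory count grows with
    the square root of load, the data set doubles when expected
    throughput quadruples."""
    return int(math.ceil(10 * math.sqrt(target_throughput / 5.0)))

print(specweb96_directories(500))   # 100
print(specweb96_directories(2000))  # 200, i.e. 4x load -> 2x data set
```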

11
Evolution: SPECweb99
  • In response to people "gaming" the benchmark, it
    now includes rules
  • IP maximum segment lifetime (MSL) must be at
    least 60 seconds
  • Link-layer maximum transmission unit (MTU) must
    not be larger than 1460 bytes (Ethernet frame
    size)
  • Dynamic content may not be cached
  • not clear that this is followed
  • Servers must log requests.
  • W3C common log format is sufficient but not
    mandatory.
  • Resulting workload must be within 10% of target
  • Error rate must be below 1%
  • Metric has changed
  • now "number of simultaneous conforming
    connections"; the rate of a conforming connection
    must be greater than 320 Kbps

12
SPECweb99 (cont)
  • Directory size has changed
  • 25 + ((400000/122000) × simultaneous conns) /
    5.0
  • Improved HTTP 1.0/1.1 support
  • Keep-alive requests (client closes after N
    requests)
  • Cookies
  • Back-end notion of user demographics
  • Used for ad rotation
  • Request includes user_id and last_ad
  • Request breakdown
  • 70.00% static GET
  • 12.45% dynamic GET
  • 12.60% dynamic GET with custom ad rotation
  • 04.80% dynamic POST
  • 00.15% dynamic GET calling CGI code

13
SPECweb99 (cont)
  • Other breakdowns
  • 30% HTTP 1.0 with no keep-alive or persistence
  • 70% HTTP 1.1 with keep-alive to "model"
    persistence
  • still has 4 classes of file size with Poisson
    distribution
  • supports Zipf popularity
  • Client implementation details
  • Master-client communication uses sockets
  • Code includes sample Perl code for CGI
  • Client configurable to use threads or processes
  • Much more info on setup, debugging, tuning
  • All results posted to web page,
  • including configuration and back-end code
  • www.spec.org/osg/web99

14
Methodology
  • Take a log from a large-scale SPECweb99 run
  • Take a number of available server logs
  • For each characteristic discussed in the
    literature
  • Show what SPECweb99 does
  • Compare to results from the literature
  • Compare to results from a set of sample server
    logs
  • Render judgment on how well SPECweb99 does

15
Sample Logs for Illustration
We'll use statistics generated from these logs as
examples.
16
Talk Overview
  • Workload Generators
  • SPECweb99
  • Methodology
  • Results
  • Summary and Conclusions

17
Request Methods
  • AW96, AW00, PQ00, KR01: majority are GETs, few
    POSTs
  • SPECweb99: no HEAD requests, too many POSTs

18
Response Codes
  • AW96, AW00, PQ00, KR01: most are 200s, next
    304s
  • SPECweb99 doesn't capture anything but 200 OK

19
Resource Popularity
  • p(r) = C/r^alpha (alpha = 1: true Zipf; others:
    "Zipf-like")
  • Consistent with CBC95, AW96, CB96, PQ00, KR01
  • SPECweb99 does a good job here with alpha = 1
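A minimal sketch of that popularity model; the document count and the inverse-CDF sampler are illustrative, not SPECweb99's actual implementation:

```python
import random

def zipf_weights(n_docs, alpha=1.0):
    """Zipf-like popularity: p(r) = C / r**alpha, with C chosen so
    the weights over ranks 1..n_docs sum to 1."""
    raw = [1.0 / r ** alpha for r in range(1, n_docs + 1)]
    c = sum(raw)
    return [w / c for w in raw]

def sample_rank(weights, rng):
    """Draw a 1-based document rank by inverting the CDF."""
    u, acc = rng.random(), 0.0
    for rank, w in enumerate(weights, start=1):
        acc += w
        if u < acc:
            return rank
    return len(weights)

w = zipf_weights(1000)        # alpha = 1: "true" Zipf
# Under true Zipf, rank 1 is exactly twice as popular as rank 2.
print(round(w[0] / w[1], 6))  # 2.0
```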

20
Resource (File) Sizes
  • Lognormal body, consistent with results from
    AW96, CB96, KR01.
  • SPECweb99 curve is sparse, 4 distinct regions

21
Tails of the File Size
  • AW96, CB96: sizes have Pareto tail; Downey01:
    sizes are lognormal
  • SPECweb99 tail only goes to 900 KB (vs 10 MB for
    others)

22
Response (Transfer) Sizes
  • Lognormal body, consistent with CBC95, AW96,
    CB96, KR01
  • SPECweb99 doesn't capture zero-byte transfers
    (304s)

23
Transfer Sizes w/o 304s
  • When 304s are removed, SPECweb99 is much closer

24
Tails of the Transfer Size
  • SPECweb99 tail is neither lognormal nor Pareto
  • Again, max transfer is only 900 KB

25
Inter-Arrival Times
  • Literature gives exponential distribution for
    session arrivals
  • KR01: request inter-arrivals are Pareto
  • Here we look at request inter-arrivals

26
Tails of Inter-Arrival Times
  • SPECweb99 has a Pareto tail
  • Not all others do, but may be due to truncation
  • (e.g. log duration of only one day)
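The Pareto inter-arrival model quoted from KR01 can be sampled by inverting its CDF. The shape a = 1.5 follows that fit; the scale k (in seconds) is an illustrative assumption, not a value from the paper:

```python
import random

def pareto_interarrivals(n, shape=1.5, scale=0.001, seed=42):
    """Pareto-distributed request inter-arrival times via inverse CDF:
    F(x) = 1 - (k/x)**a  =>  x = k / U**(1/a) for U in (0, 1].
    Every sample is at least the scale k; the heavy tail produces
    occasional very long gaps between requests."""
    rng = random.Random(seed)
    return [scale / (1.0 - rng.random()) ** (1.0 / shape) for _ in range(n)]

gaps = pareto_interarrivals(10000)
```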

27
HTTP Version
  • Over time, more and more requests are served
    using 1.1
  • But SPECweb99 is much higher than any other log
  • Literature doesn't look at this, so no judgments

28
Summary and Conclusions
  • SPECweb99 has a mixed record depending on
    characteristic
  • Methods: OK
  • Response codes: bad
  • Document popularity: good
  • File sizes: OK to bad
  • Transfer sizes: bad
  • Inter-arrival times: good
  • Main problems are
  • Needs to capture conditional GETs with
    If-Modified-Since (IMS) for 304s
  • Better file size distribution (smoother, larger)

29
Future Work
  • Several possibilities for future work
  • Compare logs with SURGE
  • More detail on HTTP 1.1 (requires better workload
    characterization, e.g. packet traces)
  • Dynamic content (e.g., TPC-W) (again, requires
    workload characterization)
  • Latter 2 will not be easy due to privacy,
    competitive concerns

30
Probability
  • Graph shows 3 distributions with average 2.
  • Note: average ≠ median in some cases!
  • Different distributions have different weight
    in tail.

31
Important Distributions
  • Some Frequently-Seen Distributions
  • Normal
  • (mean mu, variance sigma^2)
  • Lognormal
  • (x > 0; sigma > 0)
  • Exponential
  • (x > 0)
  • Pareto
  • (x > k; shape a, scale k)
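The slide names the distributions and their parameters but the formulas were lost in transcription; for reference, the standard probability density functions under the usual parameterizations are:

```latex
% Normal (mean \mu, variance \sigma^2):
f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}
% Lognormal (x > 0, \sigma > 0):
f(x) = \frac{1}{x\sigma\sqrt{2\pi}}\, e^{-\frac{(\ln x-\mu)^2}{2\sigma^2}}
% Exponential (rate \lambda, x > 0):
f(x) = \lambda e^{-\lambda x}
% Pareto (shape a, scale k, x \ge k):
f(x) = \frac{a\,k^a}{x^{a+1}}
```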

32
Probability Refresher
  • Lots of variability in workloads
  • Use probability distributions to express
  • Want to consider many factors
  • Some terminology/jargon
  • Mean: average of samples
  • Median: half are bigger, half are smaller
  • Percentiles: dump samples into N bins
  • (median is 50th percentile number)
  • Heavy-tailed:
  • P[X > x] ~ x^(-a) as x -> infinity

33
Session Inter-Arrivals
  • Inter-arrival time between successive requests
  • "Think time"
  • difference between user requests vs. ALL requests
  • partly depends on definition of the session
    boundary
  • CB96 variability across multiple timescales,
    "self-similarity", average load very different
    from peak or heavy load
  • SCJO01: log-normal, 90% less than 1 minute
  • AW96: independent and exponentially distributed
  • KR01: session arrivals follow Poisson
    distribution, but requests follow Pareto with
    a = 1.5

34
Protocol Support
  • IBM.com 2001 logs
  • Show roughly 53% of client requests are 1.1
  • KA01 study
  • 92% of servers claim to support 1.1 (as of Sep
    '00)
  • Only 31% actually do; most fail to comply with
    the spec
  • SCJO01 show
  • Avg. 6.5 requests per persistent connection
  • 65% have 2 connections per page, rest more
  • 40-50% of objects downloaded by persistent
    connections

Appears that we are in the middle of a slow
transition to 1.1
35
WebStone
  • The original workload generator from SGI in 1995
  • Process-based workload generator, implemented in
    C
  • Clients talk to master via sockets
  • Configurable client machines, client
    processes, run time
  • Measured several metrics: avg/max connect time,
    response time, throughput rate (bits/sec),
    pages, files
  • 1.0 only does GETs, CGI support added in 2.0
  • Static requests, 5 different file sizes

www.mindcraft.com/webstone
36
SURGE
  • Scalable URL Reference GEnerator
  • Barford and Crovella at Boston University CS Dept.
  • Much more worried about representativeness;
    captures:
  • server file size distributions,
  • request size distribution,
  • relative file popularity
  • embedded file references
  • temporal locality of reference
  • idle periods ("think times") of users
  • Process/thread based WG

37
SURGE (cont)
  • Notion of user-equivalent
  • statistical model of a user
  • active off time (between URLS),
  • inactive off time (between pages)
  • Captures various levels of burstiness
  • Not validated; shows that the load generated is
    different from SPECweb96 and has more burstiness
    in terms of CPU and active connections
  • www.cs.wisc.edu/pb

38
S-Client
  • Almost all workload generators are closed-loop
  • client submits a request, waits for server, maybe
    thinks for some time, repeat as necessary
  • Problem with the closed-loop approach
  • client can't generate requests faster than the
    server can respond
  • limits the generated load to the capacity of the
    server
  • in the real world, arrivals don't depend on
    server state
  • i.e., real users have no idea about load on the
    server when they click on a site, although
    successive clicks may have this property
  • in particular, can't overload the server
  • s-client tries to be open-loop
  • by generating connections at a particular rate
  • independent of server load/capacity
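The open-loop idea above can be sketched in Python. This is an illustration of the technique, not s-client itself (which is event-driven C using select() over many descriptors); the rate, timeout, and bookkeeping here are assumptions:

```python
import errno
import select
import socket
import time

def open_loop_connects(host, port, rate_per_sec, duration_sec, timeout=0.5):
    """Open-loop load sketch: start non-blocking connect()s on a fixed
    schedule, independent of how the server is keeping up. Connects that
    don't complete within `timeout` are abandoned so the arrival rate
    stays fixed even when the server is overloaded."""
    interval = 1.0 / rate_per_sec
    deadline = time.monotonic() + duration_sec
    next_launch = time.monotonic()
    pending, done, failed = [], 0, 0
    while time.monotonic() < deadline:
        now = time.monotonic()
        # Launch on schedule, not in response to server progress.
        if now >= next_launch:
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            s.setblocking(False)
            err = s.connect_ex((host, port))   # returns immediately
            if err in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
                pending.append((s, now + timeout))
            else:
                s.close()
                failed += 1
            next_launch += interval
        # A socket becomes writable once its connect() has finished.
        still_pending = []
        for s, expires in pending:
            _, writable, _ = select.select([], [s], [], 0)
            if writable:
                s.close()                      # a real client would send HTTP here
                done += 1
            elif time.monotonic() > expires:
                s.close()                      # give up; keep the rate fixed
                failed += 1
            else:
                still_pending.append((s, expires))
        pending = still_pending
    for s, _ in pending:
        s.close()
    return done, failed
```

Note how the launch schedule (`next_launch += interval`) never consults the pending list: that independence from server state is what makes the loop open rather than closed.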

39
S-Client (cont)
  • How is s-client open-loop?
  • connecting asynchronously at a particular rate
  • using non-blocking connect() socket call
  • Did the connect complete within a particular time?
  • if yes, continue normally.
  • if not, socket is closed and new connect
    initiated.
  • Other details
  • uses single-address space event-driven model like
    Flash
  • calls select() on large numbers of file
    descriptors
  • can generate large loads
  • Problems
  • client capacity is still limited by active FDs
  • arrival is a TCP connect, not an HTTP request
  • www.cs.rice.edu/CS/Systems/Web-measurement

40
TPC-W
  • From the Transaction Processing Performance
    Council (TPC)
  • More known for database workloads like TPC-D
  • Metrics include dollars/transaction (unlike SPEC)
  • Provides specification, not source
  • Meant to capture a large e-commerce site
  • Models online bookstore
  • web serving, searching, browsing, shopping carts
  • online transaction processing (OLTP)
  • decision support (DSS)
  • secure purchasing (SSL), best sellers, new
    products
  • customer registration, administrative updates
  • Has notion of scaling per user
  • 5 MB of DB tables per user
  • 1 KB per shopping item, 25 KB per item in static
    images

41
TPC-W (cont)
  • Remote browser emulator (RBE)
  • emulates a single user
  • send HTTP request, parse it, wait (think time),
    repeat
  • Metrics
  • WIPS: shopping
  • WIPSb: browsing
  • WIPSo: ordering
  • Setups tend to be very large
  • multiple image servers, application servers, load
    balancer
  • DB back end (typically SMP)
  • Example: IBM 12-way SMP w/DB2, 9 PCs w/IIS, 1M
  • www.tpc.org/tpcw