Tales from the Lab: Experiences and Methodology

1
Tales from the Lab: Experiences and Methodology
  • Demand Technology User Group
  • December 5, 2005
  • Ellen Friedman
  • SRM Associates, Ltd

2
Testing in the Lab
  • Experiences of a consultant
  • Taming the Wild West
  • Bringing order to Chaos
  • HOW?
  • Methodology: Capacity Planning, SPE, Load Testing
  • Discipline
  • Checklists/Procedures
  • What happens when procedures aren't followed
  • Detective Work

3
Agenda
  • Introduction
  • Software Performance Engineering and Benefits of
    Testing
  • Back to Basics
  • Workload Characterization/Forecasting Capacity
    Planning
  • Building the Test Labs
  • Testing Considerations
  • Scripts and test execution
  • Some Examples
  • Documenting the test plan and reporting results
  • Summary

4
Software Performance Engineering
  • Performance engineering is the process by which
    new applications (software) are tested and tuned
    with the intent of realizing the required
    performance.
  • Benefit
  • Identify problems early in the application life
    cycle
  • Manage Risk
  • Facilitates the identification and correction of
    bottlenecks to
  • Minimize end to end response time
  • Maximize application performance

5
Should we bother to Test??
WE CAN'T PLAN FOR WHAT WE DON'T KNOW
6
What do we need to achieve?
  • Scalability
  • Predictable scaling of software/hardware
    architecture
  • Do we have capacity to meet resource
    requirements?
  • How many users will the system handle before we
    need to upgrade or add web servers/app servers?
  • Stability
  • Ability to achieve results under unexpected loads
    and conditions
  • Performance vs. Cost
  • Achieving SLAs while minimizing cost

7
Testing throughout the application lifecycle
The cost of fixing a problem late in development
is extremely high.
8
What is a Performance Test Lab?
A facility to proactively assess the satisfactory
delivery of service to users prior to system
implementation or roll-out: a "test drive"
capability.
9
Lab: What is it Good For?
  • Before you deploy the application- create an
    environment that simulates the production
    environment
  • Use this environment to reflect the conditions of
    the target production environment

10
Testing Plan
  • Evaluate system: SLAs, workload characterization,
    volumes
  • Develop scripts and test strategy: obtain tools
    and methodology, build scripts
  • Execute baseline tests: run the tests in the lab
    and obtain a baseline
  • Validate baseline: ensure that test scripts
    adequately represent the production environment
  • Run controlled benchmarks
  • Analyze results
  • Report findings
11
Evaluate System: Workload Characterization
  • Identify Critical Business Functions
  • Define Corresponding System Workloads/Transactions
  • Map business workloads to system transactions
  • Identify flow of transactions through the system
  • Identify current and expected future volume
  • Determine resource requirements for
    business-based workloads at all architectural
    tiers
  • Web server, Applications server, Database server
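
As a rough illustration of the last point, here is a minimal Python sketch of how per-transaction resource profiles and business volumes roll up into CPU demand at each tier. The transaction names, CPU costs, and volumes are made-up assumptions for illustration, not measured profiles from this presentation.

```python
# Sketch: roll business volumes and per-transaction resource profiles up to
# CPU demand per tier. All names and numbers below are illustrative.
cpu_sec_per_txn = {                      # CPU seconds consumed per transaction, per tier
    "track_package": {"web": 0.010, "app": 0.030, "db": 0.020},
    "create_label":  {"web": 0.015, "app": 0.050, "db": 0.040},
}
peak_txn_per_hour = {"track_package": 40_000, "create_label": 12_000}

for tier in ("web", "app", "db"):
    # CPU seconds demanded per hour on this tier, summed over all transactions.
    demand = sum(peak_txn_per_hour[t] * cpu_sec_per_txn[t][tier]
                 for t in peak_txn_per_hour)
    utilization = demand / 3600          # fraction of one CPU kept busy at peak
    print(f"{tier:4s} tier: {demand:8.0f} CPU-sec/hour "
          f"= {utilization:.0%} of one CPU at peak")
```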

12
Evaluate System: Workload Forecasting
  • Define key volume indicators
  • What are the drivers for volume and/or resource
    usage for the system?
  • Examples
  • Banking: checks processed
  • Insurance: claims processed
  • Financial: trades processed
  • Shipping: packages processed

13
Workload Forecasting: Historical Review
  • Does the business have a set peak?
  • December for retail, and shipping
  • Peak/Average Ratio: 20% or 30% higher?
  • Volume vs. Resource Usage
  • Larger centers require greater computing
    resources
  • Need to determine scaling of hardware/software
    resources as a function of volume
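
A minimal sketch of the historical-review arithmetic: deriving a peak-to-average ratio from monthly volumes and projecting a planning volume. All numbers, and the 10% growth assumption, are illustrative and do not come from the presentation.

```python
# Sketch: peak-to-average ratio and a simple planning-volume projection.
monthly_packages = {
    "Jan": 410_000, "Feb": 395_000, "Mar": 430_000, "Apr": 425_000,
    "May": 440_000, "Jun": 450_000, "Jul": 445_000, "Aug": 455_000,
    "Sep": 470_000, "Oct": 490_000, "Nov": 540_000, "Dec": 620_000,
}

average = sum(monthly_packages.values()) / len(monthly_packages)
peak_month, peak = max(monthly_packages.items(), key=lambda kv: kv[1])
peak_to_average = peak / average

# Plan capacity for next year's peak, assuming volume grows 10% year over year.
growth_rate = 0.10
planning_volume = peak * (1 + growth_rate)

print(f"Average monthly volume : {average:,.0f}")
print(f"Peak month             : {peak_month} ({peak:,.0f})")
print(f"Peak/average ratio     : {peak_to_average:.2f}")
print(f"Planning volume (peak + {growth_rate:.0%} growth): {planning_volume:,.0f}")
```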

14
Volume vs. Response Time
[Chart; volume scale in 1,000 PPH]
15
Service Level Considerations
  • e-Business System: Tracking System for Package
    Inquiries
  • WHERE IS MY PACKAGE?
  • Call center handles real-time customer inquiries
  • SLA: caller cannot be put on hold >3 minutes
  • 90% of all calls should be cleared on first
    contact
  • Responsiveness to customer needs
  • Web-interface for customers
  • Page load time and query resolution <6-8 seconds

16
Lab can be used throughout the Application
Lifecycle
  • Testing throughout the Application Life Cycle
  • Planning
  • Design/ coding
  • Development/testing/UAT
  • Production Deployment
  • Post-production change management
  • Optimization (performance and volume testing)
  • Labs reduce risk to your production environment
  • Solid testing leads to cleaner implementations!

17
How many Labs? Where to put them
  • Locations for testing vary with technical,
    business, and political contexts. The following
    factors influence the decisions you make about
    your test environment:
  • Your testing methodology
  • Features and components you will test
  • PEOPLE, MONEY, Location
  • Personnel who will perform the testing
  • Size, location, and structure of your application
    project teams.
  • Size of your budget.
  • Availability of physical space.
  • Location of testers.
  • Use of the labs after deployment.

18
Types of Labs and their Purpose
  • Application unit testing
  • Hardware or software incompatibilities
  • Design flaws
  • Performance issues
  • Systems integration testing lab
  • User Acceptance Testing (UAT)
  • Application compatibility
  • Operational or deployment inefficiencies
  • Windows 2003 features
  • Network infrastructure compatibility
  • Interoperability with other network operating
    systems
  • Hardware compatibility
  • Tools (OS, third-party, or custom)
  • Volume testing lab
  • Performance and capacity planning
  • Baseline traffic patterns
  • traffic volumes without user activity
  • Certification Lab
  • Installation and configuration documentation
  • Administrative procedures and documentation
  • Production rollout (processes, scripts, files,
    and back-out plans)

19
Testing Concepts 101
  • Define the problem- Test Objectives
  • Limit the scope
  • Establish metrics analysis methodology
  • Tools/analysis
  • Establish the environment
  • Design the test bed
  • Simulate the key business functions
  • Develop scripts and their frequency of execution

20
Testing Process 101
  • Ensure that Lab mimics production (H/W, S/W,
    Workload/business functions being tested)
  • Test measurement tools and develop analysis tools
  • ARM the application
  • Instrumentation to provide end to end response
    time
  • Instrumentation to provide business metrics to
    correlate
  • Execute controlled test
  • Single variable manipulation
  • Ensure repeatability
  • Analyze data; repeat if required (e.g., tune the
    system)
  • Extrapolate
  • Document Test set-up and results
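
The "ARM the application" bullets call for instrumentation that captures end-to-end response time together with business metrics for correlation. Below is a minimal Python stand-in for that idea; it is not the ARM API itself, and the function and field names are illustrative assumptions.

```python
# Sketch: a minimal stand-in for ARM-style instrumentation - time a business
# transaction end to end and tag it with business metrics for later correlation.
import time
from contextlib import contextmanager

measurements = []                       # in a real test this would go to a log or database

@contextmanager
def measure(transaction, **business_tags):
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000
        measurements.append({"transaction": transaction,
                             "elapsed_ms": elapsed_ms, **business_tags})

# Usage: wrap the business function exercised by the test script.
with measure("scan_package", packages=1, center="LAB-1"):
    time.sleep(0.05)                    # placeholder for the real work

print(measurements[-1])
```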

21
Developing the script
  • Meet with the Business Team, Applications Team to
    understand the workload.
  • What is typical? What is most resource intensive?
  • Determine the appropriate mix of work
  • Typical navigation and screen flow
  • % of time each screen is accessed by the user
  • Number of users to test with, number of different
    accounts to use (other factors impacting the
    representativeness of the test)
  • Include cases to test resource intensive
    activities and functions
  • Include cases where the user may abandon the
    session because response time is too long
  • Test for time-outs
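
One way to turn the observed mix ("% of time each screen is accessed") into script steps is weighted random selection. A small sketch follows, with hypothetical screen names and weights that stand in for the real workload mix.

```python
# Sketch: build a user session whose screen frequencies follow the observed mix.
import random

screen_mix = {            # screen -> share of user activity (illustrative)
    "track_package": 0.55,
    "rate_quote":    0.20,
    "create_label":  0.15,
    "account_admin": 0.10,
}

def build_session(num_steps, seed=None):
    """Draw a sequence of screens weighted by the observed mix."""
    rng = random.Random(seed)
    screens = list(screen_mix)
    weights = list(screen_mix.values())
    return rng.choices(screens, weights=weights, k=num_steps)

# One simulated 20-step session; a load tool would run many of these
# with different accounts and test data.
print(build_session(20, seed=42))
```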

22
Load Testing Parameters
  • Simulating Volume and distribution of arrival
    rate
  • Hourly volume: distribution is not uniform;
    bursty arrival rate
  • Web sessions are only about 3 minutes long
  • When is traffic heaviest?
  • How long does the user spend at the site?
  • Need to vary the number of users started over the
    hour/User Think Time
  • Package shipping example: different from a web
    site; more predictable
  • Arrival rate highest in first hour
  • Limited by capacity of site to load the
    packages/speed of belts etc.
  • Package scanning is partly automated but still
    has human involvement
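
A sketch of generating a non-uniform, front-loaded arrival schedule for one test hour, as described above. The hourly volume and the per-slice shares are assumptions for illustration; a real schedule would be driven by production arrival data.

```python
# Sketch: build arrival offsets (seconds into the hour) for a bursty workload.
import random

HOURLY_VOLUME = 6_000                       # sessions to start this hour (assumed)
# Share of arrivals per 10-minute slice (front-loaded, as in the shipping example).
slice_shares = [0.30, 0.20, 0.15, 0.15, 0.10, 0.10]

def arrival_times(seed=None):
    """Return sorted arrival offsets for every session in the hour."""
    rng = random.Random(seed)
    times = []
    for i, share in enumerate(slice_shares):
        start = i * 600                     # slice start, in seconds
        count = round(HOURLY_VOLUME * share)
        times.extend(start + rng.uniform(0, 600) for _ in range(count))
    return sorted(times)

schedule = arrival_times(seed=1)
print(f"{len(schedule)} arrivals; first at {schedule[0]:.1f}s, last at {schedule[-1]:.1f}s")
```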

23
X read bytes/second over time
How long should the test run?
Note the reduction in read bytes/sec over time.
The test run is four hours here!
Need to reach steady state!
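
A simple way to decide whether the run has reached steady state is to compare the means of successive measurement windows. A sketch follows; the window size and tolerance are arbitrary assumptions.

```python
# Sketch: steady-state check on a measured series (e.g. read bytes/second
# sampled once a minute).
def reached_steady_state(samples, window=30, tolerance=0.05):
    """True if the last two windows' means differ by less than `tolerance`."""
    if len(samples) < 2 * window:
        return False                      # not enough data yet
    prev = sum(samples[-2 * window:-window]) / window
    last = sum(samples[-window:]) / window
    return abs(last - prev) / prev <= tolerance

# Usage with made-up numbers: a series that drifts down and then flattens out.
series = [9_000 - 40 * min(i, 100) for i in range(240)]
print(reached_steady_state(series))       # True once the curve flattens
```
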
24
Creating the Test Environment in the Lab
  • Creating the data/database
  • Copy database from production and subset it
  • Manually key/Edit some of the data
  • Create image copy of system for use in each run
  • Verifying the test conditions
  • Utilize ghost imaging or software such as
    PowerQuest or LiveState to save the database
    and system state between test runs
  • May need to also verify configuration settings
    that aren't saved in the image copy (see the
    sketch after this list)
  • Make sure that you are simulating the correct
    conditions (End of Day/Beginning of Day/Normal
    production flow)
  • Scripting the key business functions
  • Vary the test data as part of scripting
  • Vary users/accounts/pathing
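
For the configuration settings that the image copy does not capture, a pre-flight check can compare actual settings against the checklist. A sketch with placeholder setting names and values; the real list would come from the test plan's pre-flight procedures.

```python
# Sketch: verify settings not captured by the image copy before a run starts.
expected = {                              # placeholder checklist entries
    "sql_max_server_memory_mb": 3072,
    "replication_enabled": True,
    "recovery_model": "SIMPLE",
}

def verify(actual):
    """Return the settings that differ from the pre-flight checklist."""
    return {k: (expected[k], actual.get(k)) for k in expected
            if actual.get(k) != expected[k]}

# Example with one deliberate mismatch.
mismatches = verify({"sql_max_server_memory_mb": 3072,
                     "replication_enabled": False,
                     "recovery_model": "SIMPLE"})
print(mismatches or "Pre-flight configuration check passed")
```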

25
What type of staff do we need?
  • Programmers
  • Korn Shell Programmers
  • Mercury Mavens?

26
Establish Metrics Analysis Methodology
  • Based on the testing objectives, what data do we
    need to collect and measure?
  • CPU, Memory, I/O, network, response time
  • What tools do we need for measurement?
  • Do not over-measure
  • Don't risk over-sampling and incurring high
    overhead
  • Create a Template to use for comparison between
    test runs

27
Build a Template for Comparison
  • Before vs. After Comparison of Test Cases
  • Collect the performance data- Metrics
  • CPU Processor Metrics
  • System, User and Total Processor Utilization
  • Memory
  • Available bytes, Page reads/second, Page
    Ins/second, Virtual/Real bytes
  • Network
  • Bytes sent/received, Packets sent/received per
    NIC
  • Disk
  • Reads and Writes/second, Read and Write
    bytes/second, Seconds/Read, Seconds/Write, Disk
    utilization
  • Process: SQL Server (2 instances)
  • CPU
  • Working set size
  • Read/Write bytes per second
  • Database: SQL
  • Database Reads/Writes per instance, Stored
    Procedure Timings
  • Log Bytes flushed per database
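
A sketch of the comparison template itself: averaging the same counters from two runs and reporting the delta. It assumes each run's counters were exported to a CSV with 'counter' and 'value' columns, which is an illustrative format, not one prescribed by the presentation.

```python
# Sketch: before/after comparison of counter averages from two test runs.
import csv
from collections import defaultdict

def summarize(path):
    """Average every counter in one run's CSV export."""
    totals, counts = defaultdict(float), defaultdict(int)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            totals[row["counter"]] += float(row["value"])
            counts[row["counter"]] += 1
    return {c: totals[c] / counts[c] for c in totals}

def compare(baseline_csv, change_csv):
    base, change = summarize(baseline_csv), summarize(change_csv)
    for counter in sorted(base):
        if counter in change and base[counter]:
            delta = (change[counter] - base[counter]) / base[counter] * 100
            print(f"{counter:45s} {base[counter]:12.1f} "
                  f"{change[counter]:12.1f} {delta:+7.1f}%")

# compare("baseline_run.csv", "separate_server_run.csv")   # hypothetical file names
```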

28
CASE STUDY
  • Packaging/Shipping System
  • Many centers throughout the country
  • Same Applications
  • Same Hardware
  • Testing in the lab is required to identify
    bottlenecks and optimize performance
  • SLA not being met in some larger centers
  • Suspect Database Performance

29
Case Study Configuration Architecture
  • Database Server
  • Runs 2 Instances of SQL (Main, Reporting)
  • Databases are configured on the X drives
  • TempDB and Logs are configured on D drive

30
Scanning the package on the Belt
If the SLA is not met, packages aren't processed
automatically; additional manual work is required
to handle exceptions.
31
Case Study Hardware
  • Database Server- DB 1
  • G3 (2.4 GHz) with 4 GB memory
  • Raid 10 Configuration
  • Internal
  • 1 C/D logically partitioned
  • External (10 slots)
  • 2 X drives- mirrored
  • 2 Y drives- mirrored

Database 1
Database Servers
  • Application Server
  • G3 (2.4 GHz) with 3 GB memory
  • 2 Internal Drives (C/D)
  • Database Server- DB 2
  • G3 (2.4 GHz) with 4 GB memory
  • Internal
  • 1 C/D logically partitioned
  • 2 X mirrored drives

Database 2
Application Server
32
Case Study Software and OS
  • Windows 2000
  • SQL Server 2000
  • 2 Database Instances
  • Reporting
  • Main Instance- Multiple Databases
  • Replication of Main Instance to Reporting
    Instance on the same server
  • Main Instance and Reporting Instance share same
    drives

33
Case Study: When do we test in the Lab?
  • Hardware Changes
  • OS Changes
  • Software patch level changes to main suite of
    applications
  • Major application changes
  • Changes to other applications which coexist with
    primary application suite.

34
Checklists and Forms
  • Test Objectives
  • Application Groups must identify
  • Specific application version to be tested as well
    as those of other co-dependent applications
  • Database set-up to process the data
  • Special data
  • Workstation set-up
  • Volume: induction rate/flow (arrival rate)
  • Workflow and percentages
  • Scripts/percentage/flow rate

35
Case Study Hardware Checklist
36
Sign-offs on Procedures/Pre-flight
  • Who?
  • Applications team
  • Lab group
  • Systems groups
  • Network
  • Distributed Systems
  • Database
  • Performance

37
Script Development: Data Collected from Production
Systems
  • Applications to include for testing and to be
    used to determine resource profiles for key
    transactions and business functions
  • Volumes to test with
  • Database conditions including database size,
    database state requirements (e.g. end of day
    conditions)
  • Application workflow- based on operational
    characteristics in various centers
  • Job and queue dependencies
  • Requirements for specific data feeds to include

38
Case Study Developing a Script
  • Major business functions for labeling and
    shipping
  • Verifying the name and address of the item to be
    shipped
  • Interfaces to other systems and uses algorithms
    for parsing names/addresses
  • Route planning- interface with OR systems to
    optimize routing
  • Scanning the package information (local
    operation)
  • Determining the type of shipment
    (freight/letter/overnight small package) for
    shipping the item, and the appropriate route
  • Sorting the packages according to type of
    shipment
  • Printing the smart labels
  • how/where to load the package
  • Tracking the package

39
Case Study: Performance Testing in the Lab
  • Production Analysis indicated
  • Insufficient memory to support database storage
    requirements
  • Resulting in increased I/O processing
  • OPTIONS
  • Add memory
  • Not feasible; requires an OS upgrade to address
    more than 4 GB of storage with Windows 2000
    Standard Edition
  • Make the I/O faster- faster drives or more drives
  • Spread the I/O across multiple drives (external
    disk storage is expandable up to 10 slots
    available)
  • Separate the database usage across 2 sets of
    physical drives
  • Split the database across multiple servers (2
    database servers)
  • Easier upgrade than an OS change
  • Change the database design (Expected in 1Q2006,
    testing now)

40
Planning: Testing out the configuration options
  • Test out each of the options and provide a
    recommendation
  • SLA: 99% of packages must complete their
    processing in under 500 milliseconds
  • Each option was evaluated based on its relative
    ability to satisfy the SLA criteria.
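
Evaluating each option against the SLA reduces to a percentile check on the per-package timings. A sketch follows; the timing samples and option names are made up for illustration.

```python
# Sketch: check each configuration option against the "99% under 500 ms" SLA.
def percentile(values, pct):
    """Nearest-rank percentile; good enough for SLA reporting."""
    ordered = sorted(values)
    rank = max(1, round(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def meets_sla(timings_ms, pct=99, limit_ms=500):
    return percentile(timings_ms, pct) < limit_ms

# Illustrative comparison of two options (made-up timings, in ms).
options = {
    "baseline (shared drives)": [120, 180, 240, 310, 460, 520, 610, 700],
    "separate database server": [100, 140, 190, 220, 280, 330, 380, 450],
}
for name, timings in options.items():
    p99 = percentile(timings, 99)
    print(f"{name:28s} p99={p99:4d} ms  SLA met: {meets_sla(timings)}")
```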

41
Validating the baseline
Taming the West!
If you can't measure it, you can't manage it!
(CMG slogan)
42
Case Study: What are we measuring?
  • End to End Response Time (percentiles, average)
  • SQL Stored Procedure Timings (percentiles,
    average)
  • SQL Trace information summarized for each stored
    procedure for a period of time
  • Perfmon: System, Process, SQL (average, max)
  • CPU, Memory, Disk
  • Process: Memory, Disk, Processor
  • SQL: Database activity, checkpoints, buffer hits,
    etc.
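
Summarizing SQL trace output per stored procedure can be done offline once the trace is exported. The sketch below assumes a CSV export with 'object_name', 'duration_ms', and 'reads' columns; those column names are assumptions, not the actual trace schema.

```python
# Sketch: per-stored-procedure summary (calls, average, p90, reads) from a
# trace exported to CSV.
import csv
from collections import defaultdict

def summarize_trace(path):
    durations, reads = defaultdict(list), defaultdict(int)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            proc = row["object_name"]
            durations[proc].append(float(row["duration_ms"]))
            reads[proc] += int(row["reads"])
    for proc, times in sorted(durations.items()):
        times.sort()
        p90 = times[max(0, int(0.9 * len(times)) - 1)]
        avg = sum(times) / len(times)
        print(f"{proc:30s} calls={len(times):6d} avg={avg:8.1f} ms "
              f"p90={p90:8.1f} ms reads={reads[proc]:10d}")

# summarize_trace("sql_trace_export.csv")   # hypothetical file name
```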

43
Validating the Baseline
  • Data from two production systems was obtained to
    produce:
  • Test database from multiple application systems
  • Database states were obtained, system
    inter-dependencies were satisfied, and application
    configuration files were put in place
  • Baseline test was executed- Multiple Iterations
  • Performance measurements from two other systems
    were collected and compared against baseline
    execution
  • Results were compared
  • Database and scripts were modified to better
    reflect production conditions

44
Story Creating a new Environment
  • A series of performance tests were conducted in
    Green Environment to evaluate I/O performance
  • To be reviewed in presentation on Thursday 12-8.
  • The Green Environment was required for another
    project, so we moved to a new Red Environment
  • Data created from a different source (2 different
    production environments)
  • Simulating high volume
  • What happened?
  • Different page densities
  • Different distribution of package delivery dates
  • Different database size for critical database
  • Red was much fatter!

45
Analysis to evaluate new Baseline
  • Compare I/O activity for Green and Red
  • Metrics
  • End to End Response Time
  • SQL Stored Procedure Timings
  • SQL Activity
  • Database Page Reads/Writes overall and for each
    database
  • (X drive containing database)
  • Log Bytes Flushed per second (each database)-
  • D-drive (logs)
  • SQL Read and Write bytes/second
  • SQL reads and writes are overall, so they include
    database I/O and log activity
  • Disk Activity
  • Overall Drive D/X Read/Write bytes/second

46
Comparing Overall Response Time: Red vs. Green and
Separate Server
Green and Red tests with 2 mirrored pairs of X
drives are baselines
Results of baselines should be comparable!!!
47
Comparison of Green and Red Environments (X drive
database)
Read activity 16% higher; write activity 38% higher
48
Comparison of Green and Red Environments (D drive
Tempdb/logs)
Read activity 1% higher; write activity 13% higher
I/O activity is approximately the same on the D drive
49
Comparison of I/O Load: SQL Activity, Green vs. Red
Increase in reads in Red is due to the Main instance;
increase in writes in Red is caused by both instances
50
I/O Load Change, Main Instance: Separate Server
vs. Baseline
Read activity is reduced by 43% with the separate
server
51
Differences between Red and Green
  • D Drive activity is approximately the same
  • TempDB and logging
  • X Drive activity is increased in Red environment
  • Most of differences are due to an increase in
    Reads on X drive for Main Instance
  • Implies that the database was much fatter
  • Confirm this by reviewing Page reads/Page Writes
    per database from SQL statistics
  • Review database sizes (unfortunately we didn't
    have this data, so we inferred it based on I/O
    data and SQL trace data)
  • SQL trace data showed more Page Reads for key
    databases

52
Red Environment Comparing Three Days
  • Background
  • Several large databases
  • Main: UOWIS, PAS
  • Reporting: Adhoc, UW1, Distribution
  • 4-1: Replication turned off for UW1 database
  • 4-4: Replication on for UW1 database
  • 4-8: Separate server for UOWIS, replication
    turned on for UW1
  • Expectations
  • 4-1 will perform better than 4-4 and reduce I/O
    significantly
  • Expect significant reduction in Reporting
    Database I/O
  • 4-8: separate server will separate out the
    critical database
  • Expect same amount of work performed as 4-4 but a
    reduction in Read Activity for UOWIS because data
    will now be in memory

53
Reviewing Log Write Activity
Note: no log bytes, i.e., no replication, for the
UW1 database on 4-4
54
Red: Comparing Three Days of Database Disk
Activity
Note: 4-8 UOWIS results are for the separate server.
Increase in work performed on 4-8 vs. 4-4
55
Comparing Database Reads/Writes: Main Instance
56
Comparing Database Reads/Writes: Reporting Instance
Total page reads for the reporting instance should
remain constant. Why did it increase on 4-8?
57
Where are the differences on the two days?
Note: the differences are in stored procedure total
(logical) reads for Data Cap Summary and Belt
Summary Reports (not main functionality)
58
What have we uncovered about test differences?
  • Processor usage approximately the same
  • Amount of Write Activity per instance is the same
  • Reviewed log bytes/flushed for each instance
  • Reporting instance performed more I/O- more reads
  • Additional report jobs were executed on 4-8 and
    not on 4-4
  • Reports run 4 times per hour (every 15 minutes),
    causing bursts in I/O activity
  • When the UOWIS database is on the same server
    (sharing the same drives as other Main Instance
    and Reporting Instance work), response time is
    higher
  • Response Time is directly related to physical
    reads and physical disk read performance
  • Spreading the I/O across more drives and/or
    providing more memory for the critical database
    instance improves performance

59
Testing Summary
  • Need to create and follow a test plan which
    outlines
  • All pre-flight procedures
  • Confirm that environment is ready to go
  • Validate baselines
  • Run tests in organized fashion following the plan
  • Do a sanity check!
  • Do the results make sense?
  • Otherwise, search for the truth; don't bury the
    results

60
Measurement Summary
  • The nature of performance data is that it is
    long-tailed
  • Averages aren't representative
  • Get percentiles
  • Need to understand the variability of tests
    conducted
  • Run the same test multiple times to obtain a
    baseline
  • Helps you iron out your procedures
  • Can get a measure of variability of test case so
    that you can determine if the change you are
    testing is significant
  • If the variability experienced between your base
    test runs is small, that is good: you have
    repeatability
  • If the variability is large
  • You need to make sure that any change you make
    shows an even greater change
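
A sketch of the variability check described above: quantify the run-to-run spread of the baseline and compare the observed change against it. The numbers are placeholders standing in for repeated runs of the same test.

```python
# Sketch: is a measured change larger than the baseline's run-to-run noise?
from statistics import mean, stdev

baseline_p95_ms = [430.0, 445.0, 438.0]        # same test, run three times (made up)
candidate_p95_ms = 385.0                       # run with the proposed change (made up)

base_mean = mean(baseline_p95_ms)
cv = stdev(baseline_p95_ms) / base_mean        # coefficient of variation
change = abs(candidate_p95_ms - base_mean) / base_mean

print(f"Baseline mean p95 : {base_mean:.1f} ms (run-to-run CV {cv:.1%})")
print(f"Observed change   : {change:.1%}")
print("Change exceeds baseline variability" if change > cv else
      "Change is within baseline noise - rerun or collect more data")
```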

61
Reporting the Test Results: Template
  • Executive Summary
  • Graphs of results- e.g., end to end response time
  • Scalability of solution
  • Overall findings
  • Background
  • Hardware/OS/Applications
  • Scripts
  • Analysis of Results
  • System and application performance
  • Decomposition of response time
  • Web tier, Application, Database
  • Drill down again for details as necessary e.g.,
    database metrics
  • Next steps

62
Summary
  • Can't always simulate everything; do the best
    you can.
  • Implement the change in production and go back to
    the lab to understand why it matched or didn't
  • When you discover a problem,
  • Apply what you've learned
  • Make necessary changes to procedures,
    documentation, methodology- in the lab and
    recommend changes for outside the lab
  • Improve the process; don't just bury or hide the
    flaws!
  • Result: better testing and smoother
    implementations

63
Questions?????????
  • Contact Info
  • Ellen Friedman
  • SRM Associates, Ltd
  • ellen@srmassoc.com
  • 516-433-1817
  • Part II. To be presented at CMG Conference
  • Thursday 9:15-10:15
  • Session 512
  • Measuring Performance in the Lab
  • A Windows Case Study