Title: IBM WebSphere and POWER7: Powering performance for Smarter Planet Solutions
1Best practices around Dynacache
Jonathan Marshall WebSphere Technical
Professional marshalj_at_uk.ibm.com
2Acknowledgements
- Based on presentations by
- Geoff Tindall WebSphere level 2 support
- Rama Boggarapu WebSphere level 2 support
3Agenda
- Why do I need caching?
- What is Dynamic Cache?
- A peek under the covers of WebSphere Application Server
- Replicating the Dynamic Cache
- Monitoring the Cache
- Troubleshooting Tips
- Where do we go from here?
4Why Cache?
A cache allows you to get stuff faster and helps you avoid doing something over and over again (which may be redundant and may not make sense)
(far away)
(near)
(happy)
5Why Cache?
- Improving the performance
- Tuning the application?
- Tuning the environment?
- Tuning the backend resources?
- Not doing something at all?
6Introducing Dynamic Caching Services (dynacache)
7Enabling Dynamic Cache
8What does it cache?
- Each CacheEntry has a CacheId and CacheData
- A CacheEntry also carries other information such as dependencyIds, timeout information and replication type.
- The CacheMonitor.ear application can be used to view most of the information about a CacheEntry.
9Where does it cache data? The Cache Instance
- An application uses a cache instance to store, retrieve, and share data objects within the dynamic cache.
- Two types of custom cache instances can be configured
- Servlet Cache
- Cachespec.xml
- Object Cache
- API based cache, com.ibm.websphere.cache.DistributedMap
- Three methods to create custom cache instance
- Administrative console (under Resources)
- Properties file (cacheinstances.properties)
- Resource reference in web.xml
10Dynamic Cache Configuration for Default Cache
Instance
11Disk Offload
- Implementation inherited from IBM Research
- Disk cache size can be controlled in terms of space on the disk and number of entries.
- LRU & size-based disk eviction algorithms can be configured to specify the criteria for eviction.
- 3 performance modes (HIGH, CUSTOM & LOW), depending on the JVM free heap space available.
- High: buffers all disk metadata in memory.
- Custom/Balanced: buffers some/most disk metadata in memory.
- Low: buffers NO metadata in memory. Note this will be deprecated in v8.
- FlushToDiskOnStop provides for the cache to be persisted.
- One of the strongest features of Dynacache. Differentiates it from open source solutions as well as eXtreme Scale.
12Disk Offload
- Configure a new cache instance
- Resources > Cache Instances > Servlet Cache Instance > New
- A JNDI name must be given to the cache instance, which will be used to refer back to the instance in cachespec.xml.
- Check Enable disk offload and specify a disk offload location, cache size and cache entry limits.
13How does it choose what to cache?
- Cachespec.xml is a deployable XML cache policy file that contains caching rules
- Placed in WEB-INF directory
- Specify the cache-instance name where the cache rules are to be applied
- What is to be cached?
- command/servlet/webservice/JAXRPCClient/static/portlet
- How to identify an item in cache?
- Cache-Id
- Where to cache?
- Memory only/ Disk
- When to invalidate?
- Timebased/invalidation rules
- How to handle dependencies?
- Dependency rules for invalidation
14Cache ID
- A cacheID can be composed of the following
- Request parameters and attributes
- The URI used to invoke the servlet or JSP
- Session information
- Cookies
- Pathinfo and servlet path
- HTTP header and request method
- Servlet/JSP result caching can be used to cache either whole pages or fragments.
- Example
- /reference/Time.jsp?format=analog
- Web Service getStockPrice but NOT setStockPrice
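The idea of composing a cache ID from request components can be illustrated with a small Python sketch. This is not Dynacache code: `build_cache_id`, the rule tuples, and the `:name=value` separator are assumptions made for illustration only.

```python
def build_cache_id(uri, params, rules):
    """Return a cache ID string, or None if the request is not cacheable.

    rules is a list of (name, required, allowed_value) tuples. This shape is
    a hypothetical stand-in for cache-id components, not the Dynacache model.
    """
    parts = [uri]
    for name, required, allowed_value in rules:
        value = params.get(name)
        if value is None:
            if required:
                return None   # a required component is missing
            continue          # optional component: left out of the ID
        if allowed_value is not None and value != allowed_value:
            return None       # component present but value does not match
        parts.append(f"{name}={value}")
    return ":".join(parts)


# One required request parameter, any value accepted.
rules = [("format", True, None)]
analog = build_cache_id("/reference/Time.jsp", {"format": "analog"}, rules)
digital = build_cache_id("/reference/Time.jsp", {"format": "digital"}, rules)
missing = build_cache_id("/reference/Time.jsp", {}, rules)
```

Requests that differ in a component value get different IDs, and a request lacking a required component is simply not cached.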
15Example cachespec.xml (1/4) Define the entry
<cache-entry>
  <name>/newscontroller</name>
  <class>servlet</class>
</cache-entry>
http://publib.boulder.ibm.com/infocenter/wasinfo/v7r0/index.jsp?topic=/com.ibm.websphere.nd.doc/info/ae/ae/welc6tech_dyn_dev.html
16Example cachespec.xml (2/4) Define the cache id
<cache-entry>
  <name>/newscontroller</name>
  <class>servlet</class>
  <cache-id>
    <component id="action" type="parameter">
      <value>view</value>
      <required>true</required>
    </component>
    <component id="category" type="parameter">
      <required>true</required>
    </component>
    <component id="layout" type="session">
      <required>false</required>
    </component>
  </cache-id>
</cache-entry>
17Example cachespec.xml (3/4) Define the
dependencies
<cache-entry>
  <name>/newscontroller</name>
  <class>servlet</class>
  <cache-id>
    <component id="action" type="parameter">
      <value>view</value>
      <required>true</required>
    </component>
    <component id="category" type="parameter">
      <required>true</required>
    </component>
    <component id="layout" type="session">
      <required>false</required>
    </component>
  </cache-id>
  <dependency-id>category
    <component id="category" type="parameter">
      <required>true</required>
    </component>
  </dependency-id>
</cache-entry>
18Example cachespec.xml (4/4) Define the
invalidation
<cache-entry>
  <name>/newscontroller</name>
  ...
  <invalidation>category
    <component id="action" type="parameter" ignore-value="true">
      <value>update</value>
      <required>true</required>
    </component>
    <component id="category" type="parameter">
      <required>true</required>
    </component>
  </invalidation>
</cache-entry>
19Fragment Caching
- The content of A.jsp is composed of its own content plus 3 JSPs (fragments): B.jsp, C.jsp and D.jsp
- Often a mix of static and dynamic
- consume-subfragments tells the cache to store fragments
- <exclude> allows fragments to explicitly not be cached
20Going under the covers Data Replication Service
21Data Replication Service - DRS
- Data replication service (DRS) is an internal WebSphere Application Server component that replicates data.
- Transport for sending data from one managed server to another.
- Uses HAManager and DCS data stack frameworks to accomplish replication.
- Has the notion of replicas
- Data transfer channel can be encrypted. In practice no-one does this.
- Benefits
- Used by DynaCache, HTTPSession, Stateful Session Beans and SIP
- Challenges
- DRS bootstrap will be expensive if aggressive replication occurs during startup.
- DCS uses a star topology for synchrony, which can result in scalability bottlenecks.
22High Availability Manager
Data Replication Services (DRS)
- New feature in V6
- Collection of services
- Highly Available Singleton Services
- Low-level Replication Abstraction
- Shared State Bulletin Board
- Several runtime features depend on HA Manager services
- HA Manager runs in every single JVM in the cell
- HA Manager services are only available between JVMs that are part of the same core group
23HA Core Group Message Links
(Diagram: JVMs in a core group, each linked to every other JVM)
- N² - N links
- N=2, 2 links
- N=3, 6 links
- N=4, 12 links
- N=5, 20 links
- N=6, 30 links
- With each additional JVM
- Quadratically more messages flowing
- More connections per JVM
- More memory per JVM to hold messages
Optimizations to reduce message frequency and size are available in WAS 6.1
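The link counts on this slide follow directly from the N² - N formula. A short Python sketch (the function name `core_group_links` is ours, not a WebSphere API) reproduces the table and shows how quickly the number grows for larger core groups:

```python
def core_group_links(n):
    """Number of directed JVM-to-JVM links in a fully connected core group."""
    return n * n - n


# Reproduce the slide's table for N = 2..6.
links = [core_group_links(n) for n in range(2, 7)]
for n, count in zip(range(2, 7), links):
    print(f"N={n}: {count} links")

# A larger core group, for comparison.
links_for_50 = core_group_links(50)
```

Going from 6 to 50 JVMs multiplies the JVM count by roughly 8 but the link count by more than 80, which is why core group size matters.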
24DRS Important Tuning
- DRS shares the HA Manager with other services
- Under load, you may see large numbers of any of the following Distribution and Consistency Services (DCS) congestion messages in your SystemOut.log file:
- DCSV1051W, a high severity congestion event for outgoing messages
- DCSV1052W, a medium severity congestion event for outgoing messages
- DCSV1054W, a medium severity congestion event for incoming messages
- Tuning directions are given in the InfoCenter http://publib.boulder.ibm.com/infocenter/wasinfo/v6r1/index.jsp?topic=/com.ibm.websphere.nd.doc/info/ae/ae/trun_ha_cfg_replication.html
25DRS Important Tuning
- Consider increasing the size of the default thread pool
- In larger configurations, the Default thread pool size should be increased.
- Doubling the thread pool size to 40 will likely be sufficient.
- However, when the number of application servers in a replication domain is greater than 10 and the number of replication domain consumers in each application server exceeds 2, it may need to be increased further.
- This should keep the DCS traffic moving and avoid the timeouts.
- The out-of-the-box transport buffer size may also be insufficient
- For all the appservers doing replication,
- Click on Servers -> Application Servers -> <Your appserver> -> Core group service
- Change the transport buffer size to 100MB (or larger in more extreme loads). This changes the RMM buffer size.
- We also recommend changing the IBM_CS_DATASTACK_MEG memory config parameter
- Servers --> Core Groups --> Core group settings --> then the settings for your core group (e.g. DefaultCoreGroup). Choose "Additional Properties" to specify a custom property. The key is "IBM_CS_DATASTACK_MEG" with the value in MB. Default value is 50.
26Cache Replication
27Replicating data across servers in a cluster
- So far we have only dealt with the scenario where data is cached per JVM
- The Data Replication Service replicates data throughout the cell for various functions
- Dynacache
- HTTP Session
- Stateful Session Beans
- The scope of replication, the Replication Domain, allows multiple deployment patterns
- Client/Server
- Client-only/Server-only
- Creating replication domains http://publib.boulder.ibm.com/infocenter/wasinfo/v6r1/index.jsp?topic=/com.ibm.websphere.nd.doc/info/ae/ae/trun_drs_replication.html
28Replication Topologies Peer to peer
- For dynacache and HTTP session replication
29Replication Topologies Client/Server
- For HTTP session replication only
30Replication Domain configuration
- Single replica
- Limits replicated data to a single backup copy within the Replication Domain
- Reduces the total amount of redundant data held throughout the domain
- E.g. HTTP Session replication
- Entire Domain
- Stores a backup copy within each DRS instance in the Replication Domain
- Maximises the redundancy of the data held throughout the domain
- E.g. Dynamic cache needs to use this
- Specify
- Stores a specified number of backup copies within the Replication Domain
31Configuring Replication for Default Cache Instance
32Replication Types
- In all of the following modes, invalidations are sent across servers for data consistency.
- PUSH mode - cacheId and cache data are sent to all servers in the cluster
- PULL mode (Not recommended) - A server requests the data from other servers in the cluster when it is not present in its own cache
- PUSH/PULL mode - Only the cacheId is sent to all servers in the cluster. When a server needs an entry in the Push/Pull table, it requests it from the server that has the copy.
- Not Shared - Cache data is not shared across servers, but invalidation events are sent to all servers.
- Recommendation
- Try PUSH and move to PUSH/PULL or even Not Shared if struggling to replicate
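To make the difference between the sharing modes concrete, here is a toy in-process simulation in Python. It is only a sketch of the behaviour described on this slide: the `Server` class and the `put`, `get` and `invalidate` helpers are hypothetical, there is no real networking, and the not-recommended PULL mode is left out.

```python
class Server:
    def __init__(self, name):
        self.name = name
        self.cache = {}       # cache ID -> data held locally
        self.push_pull = {}   # cache ID -> server known to hold the data


def put(cluster, owner, cache_id, data, mode):
    """Store an entry on `owner` and share it according to `mode`."""
    owner.cache[cache_id] = data
    for peer in cluster:
        if peer is owner:
            continue
        if mode == "PUSH":
            peer.cache[cache_id] = data        # ID and data go everywhere
        elif mode == "PUSH_PULL":
            peer.push_pull[cache_id] = owner   # only the ID is announced
        # NOT_SHARED: nothing is sent when an entry is created


def get(server, cache_id):
    """Return local data, pulling from the owner if only the ID is known."""
    if cache_id in server.cache:
        return server.cache[cache_id]
    owner = server.push_pull.get(cache_id)
    if owner is not None and cache_id in owner.cache:
        server.cache[cache_id] = owner.cache[cache_id]
        return server.cache[cache_id]
    return None


def invalidate(cluster, cache_id):
    """Invalidations reach every server in all modes."""
    for server in cluster:
        server.cache.pop(cache_id, None)
        server.push_pull.pop(cache_id, None)


a, b = Server("a"), Server("b")
cluster = [a, b]

put(cluster, a, "news", "<html>news</html>", "PUSH_PULL")
held_before_pull = "news" in b.cache   # b only knows the ID so far
pulled = get(b, "news")                # b fetches the data from a on demand

put(cluster, a, "private", "<html>mine</html>", "NOT_SHARED")
not_shared = get(b, "private")         # b never learns about this entry

invalidate(cluster, "news")
after_invalidation = get(b, "news")
```

The sketch shows why PUSH/PULL reduces replication traffic: data only crosses between servers when a peer actually asks for it.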
33Monitoring the Cache
34Monitoring the Cache The CacheMonitor
Application
- Shipped with WebSphere application server install
in the installableApps dir
35Monitoring the Cache
- Cache contents can also be displayed.
- In our example, the cacheID is
- /referenceWEB/Time.jsp:requestType=GET
36Disk Offload
37Extended Cache Monitor on developerWorks
- http://www.ibm.com/developerworks/websphere/downloads/cache_monitor.html
- Provides two functions that are not available with the cache monitor
- Monitor and manage contents of object cache
- Monitor cache statistics for cache instances across all members of the cluster.
- Enhancements made for the cache monitor in WebSphere Application Server V7
- look at the push-pull table associated with a cache instance
- search memory contents, disk contents, and push-pull table using a regular expression
- compare cache instances
38Cache Statistics Collector and Visualizer
http://www.alphaworks.ibm.com/tech/cacheviz/
Statistics from the DynaCache JMX MBean provide insight into the state, health, performance, composition & efficiency of the cache.
Collect cache statistics using the DynaCacheStatisticsCSV.py wsadmin jython script.
Outputs statistics in a CSV file.
Statistics can be charted with Microsoft Excel or OpenOffice SpreadSheet.
39Performance Troubleshooting
40Dynamic Cache Common Sense Tips
- WebSphere Application Server uses JVM memory to cache objects. Therefore, know how much memory can be allocated for the cache and set the cache size to the proper value.
- Increase the priority of cache entries that are expensive to regenerate.
- Modify timeout of entries so that they stay in memory as long as they are valid.
- If the estimated total size of all cached objects is bigger than the available memory, you can enable the disk offload option.
- Increase the cache size if memory allows.
41Essential Dynacache Tuning
- Read the following document for cache replication tuning
- http://www-01.ibm.com/support/docview.wss?uid=swg27006431
- Set the following Dynacache JVM custom properties
- Name: com.ibm.ws.cache.CacheConfig.ignoreValueInInvalidationEvent Value: true
- Name: com.ibm.ws.cache.CacheConfig.filterTimeOutInvalidation Value: true
- Name: com.ibm.ws.cache.CacheConfig.filterLRUInvalidation Value: true
- Name: com.ibm.ws.cache.CacheConfig.cacheEntryWindow Value: 50
- Name: com.ibm.ws.cache.CacheConfig.cacheInvalidateEntryWindow Value: 50
- Name: com.ibm.ws.cache.CacheConfig.useServerClassLoader Value: true
- Assume a replication type of NOT_SHARED
- Read the following Technote for Portal Server cache replication issues
- http://www-01.ibm.com/support/docview.wss?uid=swg21322640
- Read the following Technote for Commerce Server replication issues
- http://www-01.ibm.com/support/docview.wss?uid=swg21358672
42Objects placed in cache are not replicated
- Make sure the replication instance is launched (see SystemOut.log below)
- Test with PUSH mode
- If you are using PUSH_PULL mode, only the CacheId will be pushed. Cache data will be pulled when needed
- The following messages in the SystemOut.log file show that the Dynamic cache instance is initialized and the replication service is launched. Notice the small delay: if the application looks for replicated data during the delay, it may not find it
2008-07-10 18:56:01:828 EST 0000002c CacheServiceI I DYNA1001I: WebSphere Dynamic Cache instance named /cache/instance_one initialized successfully.
2008-07-10 18:56:02:828 EST 00000042 DRSBootstrapM A CWWDR0001I: Replication instance launched /cache/instance_one
- See the following Technote for more information
- http://www-01.ibm.com/support/docview.wss?uid=swg21313480
43Dynacache Congestion
- Signs of Congestion
- (1) DCSV1051W/DCSV1052W: DCS Stack DefaultCoreGroup.MyCluster at Member CustomServer1: Raised a high severity congestion event for outgoing messages. Internal details are Total stored bytes: 67701476, Red mark is 41943040, Yellow mark is 37748736, Green mark is 8388608.
- (2) HMGR0152W: CPU Starvation detected. Current thread scheduling delay is 109 seconds.
- Congestion normally occurs when there is a lot of data stored in cache and a member joins a HAManager view, or if there are a lot of invalidation events across servers.
- Workarounds
- Use the NOT_SHARED or PUSH_PULL replication mode
- Filter out unnecessary cache invalidation events by configuring custom properties
- Name: com.ibm.ws.cache.CacheConfig.filterTimeOutInvalidation Value: true
- Name: com.ibm.ws.cache.CacheConfig.filterLRUInvalidation Value: true
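When scanning a large SystemOut.log for congestion, it can help to pull the byte counts out of the DCS message and compare them with the marks. The Python sketch below parses the "Internal details" portion shown on this slide. The exact punctuation of the real message may differ, so the pattern tolerates an optional colon. The mapping of marks to "high" and "medium" is an assumption for illustration, not documented DCS behaviour.

```python
import re

# Matches the "Internal details" tail of a DCS congestion message.
MARKS = re.compile(
    r"Total stored bytes:?\s*(?P<stored>\d+),\s*"
    r"Red mark is\s*(?P<red>\d+),\s*"
    r"Yellow mark is\s*(?P<yellow>\d+),\s*"
    r"Green mark is\s*(?P<green>\d+)"
)


def congestion_level(line):
    """Classify a log line as 'high', 'medium' or 'normal'; None if no match."""
    match = MARKS.search(line)
    if match is None:
        return None
    stored = int(match.group("stored"))
    if stored >= int(match.group("red")):
        return "high"      # assumption: above the red mark
    if stored >= int(match.group("yellow")):
        return "medium"    # assumption: between yellow and red
    return "normal"


sample = (
    "Raised a high severity congestion event for outgoing messages. "
    "Internal details are Total stored bytes 67701476, Red mark is 41943040, "
    "Yellow mark is 37748736, Green mark is 8388608."
)
level = congestion_level(sample)
unrelated = congestion_level("HMGR0152W CPU Starvation detected.")
```

For the sample from the slide, the stored bytes are well above the red mark, which is consistent with the "high severity" wording in the message.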
44Dynacache Controlling DRS message size
- When processing BatchUpdates, the DRS message could be huge, which can lead to fragmentation issues and OutOfMemory errors.
- The following custom properties, introduced in PK32201 and PK35284, help to control the size of DRS messages
- com.ibm.ws.cache.CacheConfig.cachePercentageWindow = 2
- com.ibm.ws.cache.CacheConfig.cacheEntryWindow = 50
- com.ibm.ws.cache.CacheConfig.batchUpdateMilliseconds = 1 sec
- com.ibm.ws.cache.CacheConfig.cacheInvalidateEntryWindow = 50
- com.ibm.ws.cache.CacheConfig.cacheInvalidatePercentWindow = 2
45 46Dynamic Cache Next Generation
47WebSphere eXtreme Scale
- WebSphere eXtreme Scale provides distributed object caching essential for elastic scalability and next-generation cloud environments. It processes massive volumes of transactions with extreme efficiency and linear scalability.
- The WebSphere eXtreme Scale dynamic cache provider
- It uses cheaper system memory instead of SAN or storage solutions
- Provides a scalable replicated cache with a configurable number of replicas. This eliminates the need to broadcast data everywhere with DRS.
- Scales elastically. Additional WebSphere eXtreme Scale containers can be added at runtime. WebSphere eXtreme Scale automatically redistributes data partitions as new containers are added to the grid.
- Provides better caching qualities of service and control than the default cache provider.
- Uses the same runtime monitoring and administration tools as the classic dynamic cache.
48WebSphere eXtreme Scale Example with Commerce
WXS Grid 18Gb
- Benefits
- With WXS, we offload the dynacache data store to the WXS grid
- WCS estate needs 25% less memory for dynacache, potentially reducing the WCS estate
- Performance improvement from not needing disk I/O
- Disk not needed: cost savings
- WCS throughput improvement through reduced chatter between JVMs and less GC overhead
- WXS can now provide a much larger in-memory cache if desired by adding more JVMs (disk is often constrained by size and contention limits)
- Replica WXS JVM gives the cache resilience
49Summary
50Summary
- Dynamic Cache is a core service provided by WebSphere Application Server that can provide significant benefits to application developers for
- Web content: Servlets/JSPs, AJAX requests, Web Services
- DistributedMap API
- Java command objects
- Dynamic Cache provides a comprehensive monitoring support infrastructure
- Higher performance environments will need to do some tuning to ensure Dynamic Cache performs at its best
51Reference
- IBM Redbooks
- WebSphere Application Server V6 Scalability and Performance Handbook
- http://www.redbooks.ibm.com/abstracts/SG246392.html
- Mastering DynaCache in WebSphere Commerce
- http://www.redbooks.ibm.com/abstracts/sg247393.html
- WebSphere Dynamic Cache: Improving J2EE application performance
- http://www.research.ibm.com/journal/sj/432/bakalova.pdf
52Backup
53Cache Invalidation
- It is essential that timely invalidations of cached content take place for the integrity of the website.
- Mechanisms for invalidation are
- timeout or inactivity directives within cachespec.xml
- group-based invalidation mechanism through dependency IDs.
- Programmatic invalidation via the cache API 'com.ibm.websphere.cache.
- The CacheMonitor
54Data Oriented
Session management
Elastic DynaCache
DataPower XC10 Appliance
Web side cache
- Drop-in cache solution optimized and hardened for
data oriented scenarios - High density, low footprint improves datacenter
efficiency
Petabyte analytics
Data buffer
eXtreme Scale
Event Processing
- Ultimate flexibility across a broad range of
caching scenarios - In-memory capabilities for application oriented
scenarios
Worldwide cache
In-memory OLTP
In-memory SOA
Application Oriented
Elastic caching for linear scalability High
availability data replication Simplified
management, monitoring and administration
55Caching Rules in cachespec.xml
- Consider the simple JSP, Display.jsp.
- The request to Display.jsp returns itself, as the
parent, plus the included fragment Time.jsp.
56Caching Rules in cachespec.xml
- consume-subfragments
- The consume-subfragments property tells the cache not to stop saving content when it includes a child servlet. The parent entry will include all the content from all fragments in its cache entry, resulting in one big cache entry.
- Use the <exclude> element to tell the cache to stop consuming for the excluded fragment and instead create a placeholder for the include or forward. For example, exclude Time.jsp from the consume-subfragments, as follows:
57Caching Rules in cachespec.xml
<cache>
  <cache-entry>
    <class>servlet</class>
    <name>/Display.jsp</name>
    <property name="consume-subfragments">true
      <exclude>/Time.jsp</exclude>
    </property>
    <cache-id>
      <timeout>30</timeout>
    </cache-id>
  </cache-entry>
</cache>
58Caching Rules in cachespec.xml
- do-not-consume
- As discussed, the consume-subfragments property tells the cache to save all content including child fragments.
- A fragment can be excluded from the rules of the parent by using the do-not-consume property.
- In this cachespec, Time.jsp is labeled do-not-consume and can specify its own timeout or other rules.
<cache>
  <cache-entry>
    <class>servlet</class>
    <name>/Display.jsp</name>
    <property name="consume-subfragments">true</property>
    <cache-id>
      <timeout>30</timeout>
    </cache-id>
  </cache-entry>
  <cache-entry>
    <class>servlet</class>
    <name>/Time.jsp</name>
    <property name="do-not-consume">true</property>
    <cache-id>
      <timeout>10</timeout>
    </cache-id>
  </cache-entry>
</cache>
59Caching Rules in cachespec.xml
- Inactivity.
- While the timeout directive dictates how long content can remain in cache before being refreshed, the inactivity directive can cause a refresh prior to the timeout when a page is not used frequently.
<cache-id>
  <timeout>600</timeout>
  <inactivity>30</inactivity>
</cache-id>
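The interplay of the two directives can be sketched in a few lines of Python. This is a simplified model, not Dynacache's implementation: `is_expired` is a hypothetical helper, all times are in seconds, and whether the real boundaries are inclusive is an assumption.

```python
def is_expired(created_at, last_access, now, timeout, inactivity):
    """True if the entry should be refreshed at time `now` (all in seconds)."""
    if now - created_at >= timeout:
        return True   # absolute lifetime reached
    if now - last_access >= inactivity:
        return True   # not requested recently enough
    return False


# Using the values from the slide: timeout 600, inactivity 30.
still_fresh = is_expired(created_at=0, last_access=100, now=120,
                         timeout=600, inactivity=30)
idle_too_long = is_expired(created_at=0, last_access=100, now=131,
                           timeout=600, inactivity=30)
lifetime_over = is_expired(created_at=0, last_access=590, now=600,
                           timeout=600, inactivity=30)
```

A busy page therefore lives for the full timeout, while a rarely requested page is dropped as soon as it has been idle for the inactivity period.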
60Dependency IDs
- Dynamic Cache provides a group-based invalidation mechanism through dependency IDs.
- A dependency ID identifies a cache entry's dependency on certain factors, such that when those factors change they trigger an invalidation of all the cache entries that share this dependency.
- An example of such a dependency is the invalidation of a page which lists customer names (Customer.jsp). Cached entries for this list should be invalidated when a customer is added to or removed from the list.
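Group-based invalidation amounts to keeping a reverse index from dependency ID to cache IDs. The Python sketch below illustrates the idea. The `DependencyCache` class, the `customerList` dependency ID and the cache IDs are made up for this example and are not the Dynacache API.

```python
from collections import defaultdict


class DependencyCache:
    def __init__(self):
        self.entries = {}                       # cache ID -> data
        self.by_dependency = defaultdict(set)   # dependency ID -> cache IDs

    def put(self, cache_id, data, dependency_ids=()):
        self.entries[cache_id] = data
        for dep in dependency_ids:
            self.by_dependency[dep].add(cache_id)

    def invalidate_dependency(self, dep):
        """Remove every entry registered under `dep`; return how many went."""
        removed = 0
        for cache_id in self.by_dependency.pop(dep, set()):
            if cache_id in self.entries:
                del self.entries[cache_id]
                removed += 1
        return removed


cache = DependencyCache()
cache.put("/Customer.jsp:page=1", "<html>A-M</html>", ["customerList"])
cache.put("/Customer.jsp:page=2", "<html>N-Z</html>", ["customerList"])
cache.put("/Time.jsp", "<html>12:00</html>")

# A customer was added: every page of the list must be regenerated.
removed = cache.invalidate_dependency("customerList")
```

One invalidation call removes both cached pages of the customer list and leaves unrelated entries alone.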
61Troubleshooting Dynamic Cache
- Dynamic Cache Trace
- Use the WebSphere trace facility to review key trace points and verify expected caching behavior.
- Enabling Trace
- Dynamic Cache issues can be traced using the following trace specification
- Dynacache replication is disabled
- *=info:com.ibm.ws.cache.*=all
- Dynacache replication is enabled
- *=info:com.ibm.ws.cache.*=all:com.ibm.ws.drs.*=all
- For information regarding trace settings please refer to the WebSphere information center or see this link
- http://www-1.ibm.com/support/docview.wss?rs=180&uid=swg21254706
62Troubleshooting Dynamic Cache
- Binding to cache instance
- ResourceMgrIm I WSVR0049I: Binding services/cache/diskoffload as services/cache/diskoffload
- First request
- CacheHook 3 handleServlet absoluteUri /referenceWEB/Display.jsp
- EntryInfo 3 set id=/referenceWEB/Display.jsp:name=Fred:requestType=GET
- FragmentCompo 3 setConsumeSubfragments /Display.jsp consumeSubfragments=true
- FragmentCompo 3 setDoNotConsume /Display.jsp doNotConsume=false
- CacheStatisti 3 CACHE Cache Miss /referenceWEB/Display.jsp:name=Fred:requestType=GET
- CacheHook 3 CACHE MISS id /referenceWEB/Display.jsp:name=Fred:requestType=GET
- CacheHook 3 servicing /referenceWEB/Display.jsp:name=Fred:requestType=GET
- FragmentCompo 3 saveCachedData uri=/Display.jsp
- Next request
- CacheStatisti 3 CACHE Local Cache Hit /referenceWEB/Display.jsp:name=Fred:requestType=GET
- CacheHook 3 CACHE HIT id=/referenceWEB/Display.jsp:name=Fred:requestType=GET
- Invalidation due to cache timeout
- Cache > internalInvalidateById() cacheName=baseCache id=/referenceWEB/Display.jsp:name=Fred:requestType=GET Entry
- Cache < internalInvalidateById() id=/referenceWEB/Display.jsp:name=Fred:requestType=GET Exit
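Once a trace like the one above has been collected, a quick way to check whether caching behaves as expected is to count hits and misses. The Python sketch below does this by plain substring matching on the CacheStatisti lines. The exact trace layout varies by release, so the matching strings and the sample lines are assumptions based on this slide.

```python
def hit_ratio(trace_lines):
    """Return hits / (hits + misses) from CacheStatisti lines, or None."""
    hits = misses = 0
    for line in trace_lines:
        if "CacheStatisti" not in line:
            continue   # other components repeat the same event
        if "Cache Hit" in line:
            hits += 1
        elif "Cache Miss" in line:
            misses += 1
    total = hits + misses
    return hits / total if total else None


sample = [
    "CacheStatisti 3 CACHE Cache Miss /referenceWEB/Display.jsp:name=Fred:requestType=GET",
    "CacheHook 3 CACHE MISS id /referenceWEB/Display.jsp:name=Fred:requestType=GET",
    "CacheStatisti 3 CACHE Local Cache Hit /referenceWEB/Display.jsp:name=Fred:requestType=GET",
]
ratio = hit_ratio(sample)
```

Only the CacheStatisti lines are counted so that the matching CacheHook lines for the same request are not counted twice.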