Title: CS193H: High Performance Web Sites Lecture 8: Rule 4
1 CS193H High Performance Web Sites, Lecture 8
Rule 4: Gzip Components
- Steve Souders
- Google
- souders_at_cs.stanford.edu
2 Announcements
- Web 100 Performance Profile (round 1) class project has been graded; contact Aravind if you want to know your grade
3 Compression (encoding)
Request without Accept-Encoding:
GET /v-app/scripts/107652916-dom.common.js HTTP/1.1
Host: www.blogger.com
User-Agent: Mozilla/5.0 () Gecko/2008070208 Firefox/3.0.1
Request with Accept-Encoding:
GET /v-app/scripts/107652916-dom.common.js HTTP/1.1
Host: www.blogger.com
User-Agent: Mozilla/5.0 () Gecko/2008070208 Firefox/3.0.1
Accept-Encoding: gzip,deflate
Uncompressed response:
HTTP/1.1 200 OK
Content-Type: application/x-javascript
Last-Modified: Mon, 22 Sep 2008 21:14:35 GMT
Content-Length: 6230
function d(s) ...
Gzipped response:
HTTP/1.1 200 OK
Content-Type: application/x-javascript
Last-Modified: Mon, 22 Sep 2008 21:14:35 GMT
Content-Length: 2066
Content-Encoding: gzip
XmoÛHþ\ÿFÖvãwØoq...
- typically reduces size by 70%
- (6230 - 2066) / 6230 = 67%
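The savings arithmetic above can be reproduced in a few lines of Python. This is a minimal sketch, not the measurement method used in the lecture; the helper name `savings_percent` and the sample payload are assumptions made for illustration.

```python
import gzip


def savings_percent(original_size: int, compressed_size: int) -> float:
    """Return the size reduction as a percentage of the original size."""
    return (original_size - compressed_size) / original_size * 100


# Figures taken from the two Content-Length headers on this slide.
print(round(savings_percent(6230, 2066)))  # 67

# Hypothetical payload: repetitive text such as JavaScript compresses well.
sample = b"function d(s) { return document.getElementById(s); }\n" * 100
compressed = gzip.compress(sample)
print(len(sample), len(compressed))
```

The exact ratio for the sample payload depends on its content, so no figure is claimed for it here.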
4 Gzip vs. Deflate
                    Gzip   Gzip     Deflate  Deflate
            Size    Size   Savings  Size     Savings
Script      3.3K    1.1K   67%      1.1K     66%
Script      39.7K   14.5K  64%      16.6K    58%
Stylesheet  1.0K    0.4K   56%      0.5K     52%
Stylesheet  14.1K   3.7K   73%      4.7K     67%
- gzip (default settings) compresses more
5 Pros and Cons
- Pro
- smaller transfer size
- Con
- CPU cycles on client and server
- Don't compress resources < 1K
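One reason behind the size guideline is that the gzip format adds a fixed header and trailer, so a very small body can come out larger than it went in. A rough sketch of a size check follows; the helper name `should_compress` and the 1024-byte reading of "1K" are assumptions, not part of the lecture.

```python
import gzip

MIN_COMPRESS_BYTES = 1024  # assumed reading of the "1K" guideline


def should_compress(body: bytes, min_size: int = MIN_COMPRESS_BYTES) -> bool:
    """Skip compression for bodies below the size threshold."""
    return len(body) >= min_size


tiny = b"ok"
# The gzip container adds a header and trailer, so tiny inputs grow.
print(len(tiny), len(gzip.compress(tiny)))
print(should_compress(tiny))         # False
print(should_compress(b"x" * 2048))  # True
```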
6 Gzip configuration
- Apache 1.3: mod_gzip
- mod_gzip_item_include file \.html
- mod_gzip_item_include mime text/html
- mod_gzip_item_include file \.js
- mod_gzip_item_include mime application/x-javascript
- mod_gzip_item_include file \.css
- mod_gzip_item_include mime text/css
- Apache 2.x: mod_deflate
- AddOutputFilterByType DEFLATE text/html text/css application/x-javascript
- control compression level: DeflateCompressionLevel
- http://httpd.apache.org/docs/2.0/mod/mod_deflate.html
7 Gzip: not just for HTML
HTML Scripts Stylesheets
amazon.com x
aol.com x some some
cnn.com
ebay.com x
froogle.google.com x x x
msn.com x deflate deflate
myspace.com x x x
wikipedia.org x x x
yahoo.com x x x
youtube.com x some some
HTML Scripts Stylesheets
aol.com x x x
ebay.com x some
facebook.com x x x
google.com/search x x na
search.live.com/results x x x
msn.com x x x
myspace.com x x x
en.wikipedia.org/wiki x some some
yahoo.com x x x
youtube.com x x x
gzip scripts, stylesheets, XML, JSON (not images, Flash, PDF)
October 2008
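The rule of thumb on this slide (compress text-based responses, skip formats that are already compressed) could be expressed as a content-type check. The sketch below is illustrative only: the function name `is_compressible` and the exact set of MIME types are assumptions, and a real server would normally express this in its configuration instead.

```python
# Assumed list for illustration; a real server configuration may differ.
COMPRESSIBLE_TYPES = {
    "text/html",
    "text/css",
    "application/x-javascript",
    "text/xml",
    "application/json",
}


def is_compressible(content_type: str) -> bool:
    """Return True for text-based types that benefit from gzip."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in COMPRESSIBLE_TYPES


print(is_compressible("text/html; charset=utf-8"))  # True
print(is_compressible("image/png"))                 # False
print(is_compressible("application/pdf"))           # False
```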
8 Edge Case: Proxies
Proxy
Origin Server
1 GET main.js Accept-Encoding: gzip
2 GET main.js Accept-Encoding: gzip
3 main.js Content-Encoding: gzip
4 main.js Content-Encoding: gzip
5 main.js Content-Encoding: gzip
6 GET main.js (no Accept-Encoding)
7 main.js Content-Encoding: gzip
proxies may serve gzipped content to browsers that don't support it, and vice versa
9 Edge Case: Proxies w/ Vary
Proxy
Origin Server
1 GET main.js Accept-Encoding: gzip
2 GET main.js Accept-Encoding: gzip
3 main.js Content-Encoding: gzip Vary: Accept-Encoding
4 main.js Content-Encoding: gzip
Accept-Encoding: gzip
5 main.js Content-Encoding: gzip
6 GET main.js (no Accept-Encoding)
7 GET main.js (no Accept-Encoding)
8 main.js Vary: Accept-Encoding
9 main.js Accept-Encoding
10 main.js (no gzip)
11 GET main.js Accept-Encoding: gzip
12 main.js Content-Encoding: gzip
13 GET main.js (no Accept-Encoding)
14 main.js (no gzip)
add Vary: Accept-Encoding
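The effect of the Vary header on a proxy cache can be sketched as a cache whose key includes the request headers named in Vary. This is a toy model under stated assumptions, not how any particular proxy is implemented: `ProxyCache`, `origin`, and `fetch` are hypothetical names, and header values are compared as raw strings (so "gzip" and "gzip,deflate" would count as different variants).

```python
class ProxyCache:
    """Toy cache that honors the Vary response header (illustration only)."""

    def __init__(self):
        self._entries = {}      # (url, variant key) -> response dict
        self._vary_fields = {}  # url -> request header names listed in Vary

    def _key(self, url, request_headers):
        fields = self._vary_fields.get(url, ())
        variant = tuple(request_headers.get(name, "") for name in fields)
        return (url, variant)

    def lookup(self, url, request_headers):
        return self._entries.get(self._key(url, request_headers))

    def store(self, url, request_headers, response):
        vary = response["headers"].get("Vary", "")
        self._vary_fields[url] = tuple(
            name.strip() for name in vary.split(",") if name.strip()
        )
        self._entries[self._key(url, request_headers)] = response


def origin(url, request_headers):
    """Hypothetical origin server: gzips only when the client accepts it."""
    headers = {"Vary": "Accept-Encoding"}
    if "gzip" in request_headers.get("Accept-Encoding", ""):
        headers["Content-Encoding"] = "gzip"
    return {"headers": headers}


def fetch(cache, url, request_headers):
    """Serve from the cache when a matching variant exists, else ask origin."""
    response = cache.lookup(url, request_headers)
    if response is None:
        response = origin(url, request_headers)
        cache.store(url, request_headers, response)
    return response


cache = ProxyCache()
gzip_browser = {"Accept-Encoding": "gzip"}
plain_browser = {}  # sends no Accept-Encoding header

first = fetch(cache, "main.js", gzip_browser)
second = fetch(cache, "main.js", plain_browser)
print(first["headers"].get("Content-Encoding"))   # gzip
print(second["headers"].get("Content-Encoding"))  # None
```

Without the Vary header, the cache key in this model collapses to the URL alone, and the second browser would receive the gzipped copy cached for the first.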
10 Edge Case: Bad Browsers
- < 1% of browsers have problems with gzip
- IE 5.5
- http://support.microsoft.com/default.aspx?scid=kb;en-us;Q313712
- IE 6.0
- http://support.microsoft.com/default.aspx?scid=kb;en-us;Q31249
- Netscape 3.x, 4.x
- http://www.schroepl.net/projekte/mod_gzip/browser.htm
- User-Agent white list for gzip
- Apache 1.3
- mod_gzip_item_include reqheader "User-Agent: MSIE [6-9]"
- mod_gzip_item_include reqheader "User-Agent: Mozilla/[5-9]"
- Apache 2.0
- BrowserMatch "MSIE [6-9]" gzip
- BrowserMatch "Mozilla/[5-9]" gzip
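The white-list idea can be tried out against sample User-Agent strings. The sketch below assumes the two patterns on this slide are regular expressions with character classes (`MSIE [6-9]` and `Mozilla/[5-9]`); the function name `client_gets_gzip` and the sample User-Agent strings are illustrative assumptions, not taken from the lecture.

```python
import re

# Assumed patterns mirroring the User-Agent white list on this slide.
GZIP_WHITELIST = [re.compile(r"MSIE [6-9]"), re.compile(r"Mozilla/[5-9]")]


def client_gets_gzip(user_agent: str, accept_encoding: str) -> bool:
    """Gzip only if the client asks for it and is on the white list."""
    if "gzip" not in accept_encoding:
        return False
    return any(pattern.search(user_agent) for pattern in GZIP_WHITELIST)


firefox3 = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.1"
ie6 = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
ie55 = "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
netscape4 = "Mozilla/4.7 [en] (WinNT; I)"

for ua in (firefox3, ie6, ie55, netscape4):
    print(client_gets_gzip(ua, "gzip,deflate"), ua)
```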
11 Edge Case: Bad Browsers
- (cont'd)
- proxies could mix up responses
- give cached response from useragent1 to useragent2
- could add Vary: User-Agent
- so many possibilities, defeats proxy caching
- better to add Cache-Control: private
- downside: disables all proxy caches
- is it a serious problem?
- hard to diagnose; problem getting smaller
12 Edge Case: ETags
- what happens when proxy makes Conditional GET requests?
- Last-Modified date for gzipped vs. ungzipped is different => If-Modified-Since works fine
- ETag is the same in Apache for gzipped and ungzipped => If-None-Match succeeds, proxy could give browser mismatched content
- remove ETags! (Rule 13)
- http://issues.apache.org/bugzilla/show_bug.cgi?id=39727
13 Edge Case: ETags present
Proxy
Origin Server
1 GET main.js Accept-Encoding: gzip
2 GET main.js Accept-Encoding: gzip
3 main.js
- Content-Encoding: gzip
- Cache-Control: max-age=0
- ETag: "de158-e58-c7ee4140"
4 main.js
- Content-Encoding: gzip
- Cache-Control: max-age=0
- ETag: "de158-e58-c7ee4140"
5 main.js Content-Encoding: gzip
6 GET main.js (no Accept-Encoding)
7 GET main.js
- If-None-Match: "de158-e58-c7ee4140"
8 304 Not Modified
9 main.js Content-Encoding: gzip
proxy gives browser mismatched content
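The sequence on this slide can be replayed as a small simulation. It is a toy model under stated assumptions: `origin_conditional_get` and `proxy_revalidate` are hypothetical names, the origin is assumed to send the same ETag for the gzipped and ungzipped copies, and the proxy is assumed to ignore Accept-Encoding when revalidating.

```python
def origin_conditional_get(request_headers, current_etag):
    """Hypothetical origin: answer 304 when the validator matches."""
    if request_headers.get("If-None-Match") == current_etag:
        return {"status": 304, "headers": {}}
    headers = {"ETag": current_etag}
    if "gzip" in request_headers.get("Accept-Encoding", ""):
        headers["Content-Encoding"] = "gzip"
    return {"status": 200, "headers": headers}


def proxy_revalidate(cached, browser_headers, current_etag):
    """Toy proxy: revalidate the cached copy and reuse it on a 304."""
    request = dict(browser_headers)
    request["If-None-Match"] = cached["headers"]["ETag"]
    reply = origin_conditional_get(request, current_etag)
    return cached if reply["status"] == 304 else reply


ETAG = '"de158-e58-c7ee4140"'  # same value for gzipped and ungzipped copies

# Steps 1-5: a gzip-capable browser causes the proxy to cache the gzipped copy.
cached = origin_conditional_get({"Accept-Encoding": "gzip"}, ETAG)

# Steps 6-9: a browser without gzip support asks for the same URL.
served = proxy_revalidate(cached, {}, ETAG)
print(served["headers"].get("Content-Encoding"))  # gzip (mismatched content)
```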
14 Edge Case: ETags removed
Proxy
Origin Server
1 GET main.js Accept-Encoding: gzip
2 GET main.js Accept-Encoding: gzip
3 main.js
- Content-Encoding: gzip
- Cache-Control: max-age=0
- Last-Modified: Thu, 21 Aug 2008 23:53:57 GMT
4 main.js
- Content-Encoding: gzip
- Cache-Control: max-age=0
- Last-Modified: Thu, 21 Aug 2008 23:53:57 GMT
5 main.js Content-Encoding: gzip
6 GET main.js (no Accept-Encoding)
7 GET main.js
- If-Modified-Since: Thu, 21 Aug 2008 23:53:57 GMT
8 main.js
- Cache-Control: max-age=0
- Last-Modified: Fri, 22 Aug 2008 09:43:15 GMT
9 main.js Cache-Control: max-age=0 Last-Modified: Fri, 22 Aug 2008 09:43:15 GMT
10 main.js (no gzip)
removing ETags avoids the problem
15 Edge Case: Fixes
Vary: Accept-Encoding   Cache-Control: private   ETag
aol.com x
ebay.com x x x (IIS)
facebook.com x
google.com/search x
search.live.com/results x x (IIS)
msn.com x (IIS)
myspace.com x x (Apa)
en.wikipedia.org/wiki x (Apa)
yahoo.com x
youtube.com x some
Vary: User-Agent not used
October 2008
16 Homework
- "Improving Top Site" class project
- add improvements for Rule 4
- measure improvements using Hammerhead
- record results in your personal Web 100 sheet
- read Chapter 5 of HPWS for 10/17
17 Questions
- How much are file sizes typically reduced by using gzip compression?
- What types of resources (images, scripts, etc.) should not be compressed?
- For the resource types that should be compressed, should they always be compressed?
- How do you prevent proxies from serving gzipped resources to browsers that don't support gzip?
- How can ETags cause proxies to serve mismatched content to browsers?