Dr. Mike Lowndes, - PowerPoint PPT Presentation


Transcript and Presenter's Notes

1
Lies, Damn lies and Web Statistics
IWMW 2005: Whose Web Is It Anyway?
  • Dr. Mike Lowndes,
  • Interactive Media Manager,
  • Natural History Museum, London
  • Houses 350 permanent scientific staff, plus
    postgraduate students: one of the largest UK
    research institutes in the natural sciences.

2
Contents
  • Why bother?
  • Issues with web logs
  • Issues with analytic tools
  • Browser tracking
  • Comparison between approaches
  • Known issues with browser tracking
  • Nedstat input and findings from Newcastle
    University

3
Why bother?
  • Web log analysis is currently the main method
    used to quantify web site usage for reporting.
  • Results are used by the government as performance
    indicators for institutional websites.
  • Not accurate or meaningful most of the time: no
    good for absolute measurement of usage.
  • Can be used for:
      • Trend analysis
      • Content preferences
      • ROI estimation
      • Checking and fixing your site
      • Understanding users' behaviour
      • Testing assumed pathways

4
Issues with server logs
  • Dynamic IP
      • Many users using the same IP number over time.
      • Same user assigned many IP numbers over time.
  • Proxies
      • Several or many users behind one IP number.
  • Caches (can be in proxies)
      • Commonly requested files are cached closer to
        the users.
      • Can form the top 20-50 hosts accessing sites.
  • Robots and spiders
      • Few visits but lots of hits.
      • Analytic packages cannot keep up to date with
        all of them for exclusion.
  • Syndication
      • RSS feeds generate huge logs, but are not read
        by humans initially.
      • Click-through configuration.
  • Reporting by analysis tools
      • Often weekly or monthly reports; real-time is
        very labour/server intensive.
      • Reports often complex and techy.
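
Two of the distortions listed above (robots inflating hits, and several users hidden behind one IP number) can be illustrated with a short log-parsing sketch. It is an illustration only, not how Summary or WebTrends work internally. It assumes the server writes the Combined Log Format; the robot marker list and the sample log lines are hypothetical.

```python
import re
from collections import Counter

# Combined Log Format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical and deliberately incomplete: no such list stays up to date.
ROBOT_MARKERS = ("bot", "crawler", "spider", "slurp")


def is_robot(agent: str) -> bool:
    """Guess whether a user-agent string belongs to a robot."""
    agent = agent.lower()
    return any(marker in agent for marker in ROBOT_MARKERS)


def summarise(lines):
    """Return (human_hits, robot_hits, distinct_hosts_among_human_hits)."""
    human_hits = 0
    robot_hits = 0
    hosts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip malformed lines
        if is_robot(match.group("agent")):
            robot_hits += 1
            continue
        human_hits += 1
        hosts[match.group("host")] += 1
    return human_hits, robot_hits, len(hosts)


# Hypothetical sample: two different browsers behind one proxy IP, plus a robot.
SAMPLE_LOG = [
    '192.0.2.10 - - [06/Jul/2005:10:00:00 +0100] "GET /visiting/ HTTP/1.1" 200 5120 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
    '192.0.2.10 - - [06/Jul/2005:10:00:05 +0100] "GET /visiting/map.html HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (Macintosh) Safari/412"',
    '198.51.100.7 - - [06/Jul/2005:10:01:00 +0100] "GET /robots.txt HTTP/1.0" 200 64 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '198.51.100.7 - - [06/Jul/2005:10:01:01 +0100] "GET /visiting/ HTTP/1.0" 200 5120 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
]

print(summarise(SAMPLE_LOG))  # (2, 2, 1)
```

In this sample, half of the hits come from a robot, and the two human requests come from different browsers but count as a single host, which is the masking effect of a proxy.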

5
Issues with log analysis tools
  • WebTrends vs Summary.net
      1. Natural History Museum: Summary SP
         (summary.net) Version 4.2.1, unregistered demo,
         default configuration
      2. UKOLN (Bath): WebTrends (www.webtrends.com)
         Version 5, default configuration
  • Both tools were applied to the same log file.
  • Default configurations do not remove robots.
  • Note: WebTrends documentation is not clear on this
    point.

6
Measurement discrepancies
7
Comparison between tools
  • Not a single measurement was identical.
  • Most measurements were within 5%
  • Visit duration measurement widely different, and
    can depend on configuration. Possible bug in
    WebTrends version 5.
  • Page view measurements were quite different.
  • Results broadly similar but direct comparisons,
    especially of Page Views, are not really
    justified.
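
One plausible reason visit counts and durations depend on configuration is the session timeout: the idle gap after which a tool starts a new visit. The sketch below is a generic illustration, not the algorithm of either tool compared here, and the request times are hypothetical.

```python
from datetime import datetime, timedelta


def sessionise(timestamps, timeout_minutes):
    """Split one visitor's request times into visits.

    A new visit starts whenever the gap since the previous request
    exceeds the timeout. Returns a list of (start, end) pairs.
    """
    timeout = timedelta(minutes=timeout_minutes)
    visits = []
    for stamp in sorted(timestamps):
        if visits and stamp - visits[-1][1] <= timeout:
            visits[-1] = (visits[-1][0], stamp)  # extend the current visit
        else:
            visits.append((stamp, stamp))  # start a new visit
    return visits


def mean_duration_seconds(visits):
    """Average visit length; a single-request visit counts as 0 seconds."""
    total = sum((end - start).total_seconds() for start, end in visits)
    return total / len(visits)


# Hypothetical request times (minutes after 10:00) for a single visitor.
base = datetime(2005, 7, 6, 10, 0, 0)
requests = [base + timedelta(minutes=m) for m in (0, 2, 5, 25, 27, 70)]

for timeout in (10, 30):
    visits = sessionise(requests, timeout)
    print(timeout, len(visits), mean_duration_seconds(visits))
# 10 3 140.0
# 30 2 810.0
```

The same six requests give three visits averaging 140 seconds with a 10-minute timeout, and two visits averaging 810 seconds with a 30-minute timeout, so figures from tools with different defaults are not directly comparable.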

8
Browser tracking
  • Does it have fewer inaccuracies and distortions?
  • Is it easier on the web team?
  • Is it affordable?
  • Does it give us more information / better
    information?

9
Browser tracking
  • Requires code to be added to pages
  • Uses an image, sourced from the tracking website.
    Also uses JavaScript and cookies for gathering
    extended and repeat-visit information.
  • Usually hosted services.
  • Provides near real-time tracking.
  • Few of the issues distorting logs affect these
    measurements (according to the blurb).
  • Main players: Nedstat, Nielsen//NetRatings,
    WebSideStory
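
The image-plus-cookie mechanism can be sketched from the tracking service's side. This is a minimal illustration under stated assumptions, not how Nedstat, Nielsen//NetRatings or WebSideStory implement it: each tagged page is assumed to request a tiny image from the tracking host, and the cookie name `visitor_id` and the function `handle_beacon` are hypothetical.

```python
import base64
import uuid
from http.cookies import SimpleCookie

# A 1x1 transparent GIF, returned as the tracking image.
PIXEL_GIF = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)

COOKIE_NAME = "visitor_id"  # hypothetical cookie name


def handle_beacon(cookie_header):
    """Handle one image request from a tagged page.

    Returns (visitor_id, is_repeat, response_headers, body).
    """
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    if COOKIE_NAME in cookie:
        visitor_id = cookie[COOKIE_NAME].value  # returning browser
        is_repeat = True
    else:
        visitor_id = uuid.uuid4().hex  # first time we see this browser
        is_repeat = False

    reply = SimpleCookie()
    reply[COOKIE_NAME] = visitor_id
    reply[COOKIE_NAME]["max-age"] = 365 * 24 * 3600
    reply[COOKIE_NAME]["path"] = "/"
    headers = [
        ("Content-Type", "image/gif"),
        # Ask browsers and proxies not to cache, so every view reaches us.
        ("Cache-Control", "no-store"),
        ("Set-Cookie", reply[COOKIE_NAME].OutputString()),
    ]
    return visitor_id, is_repeat, headers, PIXEL_GIF


first_id, first_repeat, _, _ = handle_beacon("")
again_id, again_repeat, _, _ = handle_beacon(f"{COOKIE_NAME}={first_id}")
print(first_repeat, again_repeat, first_id == again_id)  # False True True
```

Because the image is requested by the browser itself and marked as uncacheable, proxies and caches hide fewer page views. If the browser does not return the cookie, every request looks like a new visitor.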

10
Comparison between tools
  • Summary SP vs Nielsen//NetRatings
  • Run on one section of a site over a month.
  • "Visiting" section of the Natural History Museum
    site: small but popular and easily tagged.

11
Results 1: visits and visitors
12
Results 2: pages viewed
13
Results 3: country
  • Depends on the quality of the geographical IP
    database, not the mode of tracking?

14
Conclusions regarding traditional log analysis
  • Assuming browser tracking is more accurate:
  • We have fewer visit sessions than we thought, but
    more visitors.
      • Fewer visits (sessions), possibly due to robot
        exclusion.
      • More visitors (unique users), possibly due to
        the masking effect of proxies/caches and browser
        caches.
  • Visit duration is much shorter than thought,
    possibly due to robots/spiders and cache
    updating.
  • Country information is roughly accurate so long
    as a geographical lookup is used.
  • Activity of popular pages, which are often
    cached, will be underestimated

15
Browser tracking advantages
  • Almost real-time analysis, incremental data.
  • Better repeat user tracking and individual
    pathway analysis.
  • Configurable, graphical reports for non-techies
  • A techie still needs to configure those reports,
    however, as an understanding of web analytics is
    required.
  • Cut our monthly staff time down from 1.5 days to
    1 hour
  • Appear to be more accurate in describing the
    activity of real people, but we would like to see
    some independent research.

16
Issues with browser tracking
  • Setup is not trivial: you need to add code to
    every page.
  • Multiple server / ownership issues.
  • Does not always work (or get full user details)
    if JavaScript is turned off or cookies are
    disallowed.
  • Does not work with text-only browsers.
  • Unknown compatibility with PDAs, mobiles etc.
  • Questions
      • Would we get different results with different
        hosted services?
      • ABCE industry standards for measurement
      • Are cookies often deleted unless the user is
        confident in the source? This would affect the
        measurement of repeat visitors and behaviour.
  • Political issues
      • Issues with external hosting of institutional
        data.
      • Security of personal data with external
        hosting, e.g. measurements of student and staff
        use of a VLE.

17
Next steps
  • Many private sector and public sector sites have
    already moved to browser tracking.
  • About 6 National Museums are currently discussing
    hosted browser tracking.
  • 5 universities are currently involved in a trial
    of Nedstat.

18
Thank you