Title: Loglinear and Multidimensional Scaling Models of Web-based Information Spaces
1Loglinear and Multidimensional Scaling Models of
Web-based Information Spaces
- René F. Reitsma,
- Schwartz School of Business and Information Systems
- St. Francis Xavier University
- Antigonish, Nova Scotia
- Barbara P. Buttenfield
- Dept. of Geography, University of Colorado, Boulder
2Loglinear and Multidimensional Scaling Models of
Web-based Information Spaces
- Cartography of the Internet
- Alexandria Digital Library (ADL)
- Research questions
- Modeling ADL transactions
- Finding stable patterns: loglinear models
- Maps from traffic data: multidimensional scaling
- Interpretation of results
- Discussion and new work
3Cartography of the Internet
- Is there such a thing as information/cyber space?
- We navigate it so.
- Does it have dimensions?
- How many?
- Is it orthogonal?
- What is its metric (how is distance measured?)
- Can we map it?
4Alexandria Digital Library (ADL) http://www.alexandria.ucsb.edu
- Comprehensive library services for
- acquisition
- cataloging
- browsing
- retrieval
- collection maintenance
- distributed data archives of
- library components spread across the Internet
- delivery on wide-area networks
- geographically referenced information
5Alexandria Digital Library (Cont'd)
- Holdings include
- Subset of UCSB Map and Imagery Library (> 2,000,000)
- Additional data sets coming on-line
- Functions
- Browsing and retrieving maps and imagery
- Spatial data sets, catalog and gazetteer
- Tutorials, on-line help and general reference
information
6Research Questions
- Can we conceptualize ADL (and other web sites) as an information space?
- If so, what does that space look like? Can we (re)construct this space?
- Can we use ADL transactions to reconstruct that space?
- Do transaction patterns (and hence the ADL 'space') change over time, e.g., as a consequence of changes to the user-interface?
- Do different user groups navigate different spaces? Why?
7ADL 'Rooms'
8ADL Transactions
((User X, Room P, Time T), (User X, Room Q, Time T+i))
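A minimal sketch of how such transactions could be tallied into an origin-destination table, assuming visit records arrive as (user, room, time) tuples and that every pair of consecutive visits by the same user counts as one transaction. The function name, the record layout, and the sample data are hypothetical; the slides do not describe the actual ADL log format.

```python
from collections import defaultdict

def count_transactions(visits):
    """Count room-to-room moves from (user, room, time) visit records.

    Two consecutive visits by the same user form one transaction
    from the earlier room (origin) to the later room (destination).
    """
    by_user = defaultdict(list)
    for user, room, time in visits:
        by_user[user].append((time, room))

    counts = defaultdict(int)  # (origin room, destination room) -> frequency
    for events in by_user.values():
        events.sort()  # order each user's visits by time
        for (_, origin), (_, dest) in zip(events, events[1:]):
            counts[(origin, dest)] += 1
    return dict(counts)

# Hypothetical visit records, for illustration only.
visits = [
    ("x", "home", 1), ("x", "catalog", 2), ("x", "map", 3),
    ("y", "home", 1), ("y", "catalog", 4),
]
transactions = count_transactions(visits)
```

The resulting dictionary is a sparse form of the transaction matrix: rows are origin rooms, columns are destination rooms, cells are frequencies.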
9Loglinear Models of ADL Transactions
- Transaction matrix as contingency table
- Spatial interaction: Willekens (1983), Willekens and Baydar (1983, 1983a), Aufhauser and Fischer (1985)
- Social mobility: DiPrete (1990), Miller and Hayes (1990)
- Science citation: Everett and Pecotich (1991)
10Loglinear Modeling (Concept)
- $\chi^2$ analysis
- - Expected $F_{ij} = P(\text{row } i \cap \text{column } j) \times F$
- $= (F_i / F) \times (F_j / F) \times F$
- - Observed $F_{ij} = \text{Expected } F_{ij} \times \text{Interaction}$
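The decomposition on this slide can be illustrated with a short sketch: the expected count in a cell under independence is the row total times the column total divided by the grand total, and the interaction is the ratio of observed to expected. The function name and the 2 x 2 table are hypothetical, and this shows only the simple ratio, not the fitted loglinear parameters reported in the study.

```python
def interaction_ratios(table):
    """Return observed/expected ratios for a table of counts.

    Expected F_ij = (F_i / F) * (F_j / F) * F, i.e. the count implied
    by independent rows and columns. Assumes no empty row or column.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    ratios = []
    for i, row in enumerate(table):
        ratio_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            ratio_row.append(observed / expected)
        ratios.append(ratio_row)
    return ratios

# Hypothetical 2 x 2 transaction table (rows: origin, columns: destination).
table = [[30, 10], [10, 30]]
ratios = interaction_ratios(table)
```

A ratio above 1 marks a cell with more traffic than independence predicts; a ratio below 1 marks less.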
11Loglinear Modeling (Cont'd)
- ?OD as statistically stable indicator of traffic intensity
12Loglinear Models of ADL Transactions (Cont'd)
- Do transaction patterns change over time?
13ADL Loglinear Results (All users, n = 175,606)
14Spatial Interpretation of Transactions
- Don't know the space's dimensionality nor its metric.
- Don't know the location of the pages/rooms in the space.
- Assumption: distance is inversely related to interaction.
- Multidimensional scaling
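To illustrate how coordinates can be recovered from pairwise distances alone, here is a sketch of classical (Torgerson) metric scaling in plain Python. It is an assumption chosen for brevity: the slides do not say which MDS variant was used, nor how interaction was converted to distance (a reciprocal would be one option). The function name and the example distances are hypothetical.

```python
import math

def classical_mds(dist, dims=2, iterations=500):
    """Embed points so that Euclidean distances approximate `dist`.

    Classical scaling: double-centre the squared distances, then take
    the leading eigenvectors by power iteration with deflation.
    `dist` is assumed to be a symmetric matrix with a zero diagonal.
    """
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]

    def matvec(m, v):
        return [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]

    coords = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        start = [float(i + 1) for i in range(n)]  # arbitrary non-constant start
        length = math.sqrt(sum(x * x for x in start))
        v = [x / length for x in start]
        for _ in range(iterations):
            w = matvec(b, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break  # nothing left to extract
            v = [x / norm for x in w]
        eigenvalue = sum(x * y for x, y in zip(v, matvec(b, v)))
        if eigenvalue <= 1e-9:
            break  # remaining dimensions carry no positive variance
        scale = math.sqrt(eigenvalue)
        for i in range(n):
            coords[i][d] = scale * v[i]
        # Deflate so the next pass finds the next eigenvector.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigenvalue * v[i] * v[j]
    return coords

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Hypothetical distances between three rooms (a 3-4-5 triangle).
distances = [[0, 3, 4], [3, 0, 5], [4, 5, 0]]
layout = classical_mds(distances)
```

The recovered layout is unique only up to rotation, reflection, and translation, which is why the dimensions of an MDS solution have to be interpreted after the fact.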
15?OD as Distance Estimate
16MDS Scaling results
17Dimensional Interpretation
- Dimension 1: user- vs. system-directed
- Dimension 2: topical vs. administrative
- Dimension 3: actions vs. learning
- Dimension 4: time on task
18Research Questions
- Can we (re)construct information space?
- Consider the statistically stable transactions as 'trips' or 'moves' made through n-dimensional space.
- Apply MDS based on ?OD - distance assumption.
- Does the space (transaction patterns) change over time, e.g., as a consequence of changes to the user-interface?
- - not significant
19Research Questions (Cont'd)
- Do different user groups navigate different spaces?
- Librarians (untrained)
- Panelists (trained)
20Discussion
- So far: inductive → deductive follow-up
- Current: transactions → stable patterns → theory
- Needed: theory → predict patterns → testing
- LLM implies averaging: important signatures can get lost in surrounding noise
- From pattern to path.
- Room 'extensions': from points to n-dimensional volumes.
21ADL vs. Most Web Sites
- ADL transaction:
- ((User X, Page P, Time T), (User X, Page Q, Time T+i))
- ADL had definite entry point: sessions could be uniquely identified.
- Most web sites: no entry point → no constraint on i.
- ((User X, Page P, Mon 9:00 AM), (User X, Page Q, Wed 4:00 PM))
- ???
- Need correction.
22Server Log Correction
- Assumption: The less two consecutive visits are separated in time, the more likely they are to represent a transaction.
- P(transaction) = f(time)
- Problem: How to estimate f?
- Proposal: Estimate f empirically
- For each candidate transaction i,j, record its duration.
- For all candidate transactions i,j, fit a probability density function.
- Weigh each candidate transaction i,j with its probability.
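The proposal can be sketched as follows, under loudly stated assumptions: the density is taken to be exponential (the slides do not name a functional form), it is fitted by the mean observed gap, and each candidate's weight is the fitted density rescaled so that a zero gap gets weight 1. The function name and the sample candidates are hypothetical.

```python
import math
from collections import defaultdict

def weighted_transactions(candidates):
    """Weigh candidate transactions by a fitted exponential decay in time.

    `candidates` holds (origin page, destination page, gap) tuples, where
    gap is the time between two consecutive visits by the same user.
    Assumes at least one candidate with a non-zero gap.
    """
    gaps = [gap for _, _, gap in candidates]
    mean_gap = sum(gaps) / len(gaps)  # maximum-likelihood exponential scale

    weights = defaultdict(float)  # (origin, destination) -> weighted count
    for origin, dest, gap in candidates:
        # Assumed weight: exponential density rescaled to 1 at gap = 0.
        weights[(origin, dest)] += math.exp(-gap / mean_gap)
    return dict(weights)

# Hypothetical candidates with gaps in minutes, for illustration only.
candidates = [("P", "Q", 1.0), ("P", "Q", 2.0), ("P", "R", 3000.0)]
weights = weighted_transactions(candidates)
```

The weighted counts can then stand in for raw frequencies in the transaction matrix, so that a pair of visits days apart contributes far less than a pair minutes apart.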