Aruna Balasubramanian, Yun Zhou, W Bruce Croft, - PowerPoint PPT Presentation

Transcript and Presenter's Notes

1
Web Search From a Bus
  • Aruna Balasubramanian, Yun Zhou, W Bruce Croft,
  • Brian N Levine and Arun Venkataramani
  • Department of Computer Science,
  • University of Massachusetts, Amherst

2
Why web search from a bus?
  • Open access points are commonly available
  • Intermittent Internet connectivity from vehicles is
    possible
      • no subscription cost
      • useful when no other connectivity is available
  • Web search is the 2nd most common web activity
    (survey by pewinternet.org)

3
Connectivity characteristics of testbeds
Goal: Build web search that works in the presence of
frequent disconnections and short connectivity
durations
4
Web search process
[Figure: <your favorite search engine>, with steps labeled
"Retrieving web", "Retrieving images", "Retrieving"]
5
Adapting to vehicular network
6
Why challenging?
  • Interactive
      • several exchanges between user and search engine
        are needed
  • Results are imprecise
      • a response may not be relevant
      • relevance is difficult to measure

Thedu
  • Proxy architecture: sustain interaction
  • IR contribution: increase the usefulness of returned
    responses
7
Thedu proxy
  • Sits between the vehicle and the search engine
  • When the proxy receives a query request from a vehicle, it
      • retrieves URLs and snippets
      • prefetches URL contents, including images
      • stores responses and maintains state
  • When the vehicle connects to the proxy, it
      • downloads pending responses
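
The proxy behavior listed above can be sketched roughly as follows. This is a minimal illustration, not Thedu's implementation: the class name `ProxySketch`, the `search` and `fetch` callables, and the per-contact `budget` are all hypothetical names introduced here.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass
class Response:
    query: str
    url: str
    snippet: str
    content: bytes = b""  # prefetched page body (images would be stored similarly)


class ProxySketch:
    """Toy stand-in for a proxy that queues responses until a vehicle reconnects."""

    def __init__(self, search, fetch):
        self.search = search  # hypothetical callable: query -> list of (url, snippet)
        self.fetch = fetch    # hypothetical callable: url -> bytes
        self.pending = defaultdict(deque)  # state kept per vehicle

    def on_query(self, vehicle_id, query):
        # Retrieve URLs and snippets, prefetch the contents, and store the results.
        for url, snippet in self.search(query):
            self.pending[vehicle_id].append(
                Response(query, url, snippet, self.fetch(url)))

    def on_connect(self, vehicle_id, budget):
        # Hand over at most `budget` pending responses during this contact.
        queue = self.pending[vehicle_id]
        sent = []
        while queue and len(sent) < budget:
            sent.append(queue.popleft())
        return sent
```

Responses that do not fit in one contact stay queued, so a later `on_connect` call for the same vehicle picks up where the previous one stopped.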

8
Client and proxy architecture
[Figure: client and proxy architecture. Components:
server-side proxy, client-side vehicle, search engine, and
the user's web interface. Labeled steps: new queries,
queries for vehicle, store query, fetch URL/images,
prioritize response, response bundles, process response,
responses. Client and proxy are linked by intermittent
connectivity.]
9
How to prioritize?
  • Search engines use relevance scores to rank
    responses
      • scores are not comparable across queries
  • Even if a response is relevant, it may not be useful
      • the query "chants 2007" needs only one response
  • Thedu
      • normalizes relevance scores: comparable across
        queries
      • classifies query type: captures user intent

http://www.netlab.hut.fi/chants-2007/
10
Query-Type classification
  • Query-type classification
      • Homepage query: "cnn", "chants 2007"
      • Non-homepage query: "Harry potter review"
  • Thedu classifies using the URL, snippet, and title
    fields
  • E.g., "chants 2007" on Google
      • <url> http://www.netlab.hut.fi/chants-2007
        </url>
      • <snippet> Welcome to the home page of the ACM
        MobiCom workshop on Challenged Networks (CHANTS
        2007). </snippet>
      • <title> chants workshop </title>
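
The slide names the three fields used for classification but not the features or the classifier. As a rough illustration only, a hand-written heuristic over the same fields might look like the sketch below. The function name, the three voting rules, and the two-vote cutoff are assumptions made here, not Thedu's classifier.

```python
from urllib.parse import urlparse


def looks_like_homepage_query(query, url, snippet, title):
    """Illustrative heuristic over the URL, snippet, and title of the top result."""
    terms = query.lower().split()
    parsed = urlparse(url)
    path_depth = len([part for part in parsed.path.split("/") if part])
    text = f"{title} {snippet}".lower()

    votes = 0
    if path_depth <= 1:  # assumed rule: a short path suggests a site root
        votes += 1
    if "home page" in text or "homepage" in text:  # assumed rule
        votes += 1
    location = (parsed.netloc + parsed.path).lower()
    if any(term in location for term in terms):  # assumed rule: query term in URL
        votes += 1
    return votes >= 2  # assumed cutoff


print(looks_like_homepage_query(
    "chants 2007",
    "http://www.netlab.hut.fi/chants-2007",
    "Welcome to the home page of the ACM MobiCom workshop on "
    "Challenged Networks (CHANTS 2007).",
    "chants workshop",
))
```

On the slide's example all three rules fire, so the query is treated as a homepage query. A learned classifier over the same fields would replace the hand-set rules.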

11
Relevance score normalization
  • Modified language model framework
  • D: document, Q: query, C: collection
  • Normalized score
  • Kullback-Leibler divergence (distance between Q
    and D)

[Equation annotations: probability of a word occurring in
the collection; probability of a word occurring in the
document]
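
The transcript does not preserve the normalized-score formula, so the sketch below illustrates only the Kullback-Leibler term named on the slide, KL(Q || D) = sum over words w of P(w|Q) log(P(w|Q) / P(w|D)). The maximum-likelihood word models, the Jelinek-Mercer smoothing with the collection model, and the value `lam=0.5` are assumptions chosen for the example, not details taken from Thedu.

```python
import math
from collections import Counter


def language_model(tokens):
    """Maximum-likelihood unigram model: P(w) = count(w) / total."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def smoothed(word, doc_model, coll_model, lam=0.5):
    # Assumed Jelinek-Mercer smoothing: mix the document and collection
    # probabilities so that P(w|D) is nonzero for any word in the collection.
    return lam * doc_model.get(word, 0.0) + (1 - lam) * coll_model.get(word, 0.0)


def kl_divergence(query_model, doc_model, coll_model, lam=0.5):
    """KL(Q || D) with a smoothed document model; lower means a closer match.

    Every query word must occur in the collection, otherwise the smoothed
    probability is zero and the division fails.
    """
    return sum(
        p_q * math.log(p_q / smoothed(word, doc_model, coll_model, lam))
        for word, p_q in query_model.items()
    )


doc_a = "chants workshop home page chants 2007".split()
doc_b = "harry potter movie review".split()
collection = language_model(doc_a + doc_b)
query = language_model("chants 2007".split())

kl_a = kl_divergence(query, language_model(doc_a), collection)
kl_b = kl_divergence(query, language_model(doc_b), collection)
print(kl_a < kl_b)  # the on-topic document is closer to the query
```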
12
Thedu protocol
  • 1. Sort responses in order of normalized score
  • 2. For each response r for query q:
      • 2a. Update [expression not captured in transcript]
      • 2b. If q is a homepage query and [condition not
        captured in transcript], do not send
      • 2c. Else send the response to the vehicle

Quantities annotated on the slide:
  • expected relevance of all responses sent for a
    query q
  • probability that r is relevant for q
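
Because the update rule and the homepage condition are missing from the transcript, the following is only one plausible reading of the protocol. It assumes that the expected relevance for a query grows as 1 - (1 - E)(1 - p) with each response sent, that a homepage query stops receiving responses once E reaches a threshold, and that the check happens before the update. The function name, the update formula, and the threshold value are all assumptions.

```python
from collections import defaultdict


def prioritize(responses, is_homepage, threshold=0.9):
    """Return the (query, response_id) pairs to send, in sending order.

    responses: list of (query, response_id, p_relevant), where p_relevant is
    a normalized score in [0, 1] read as the probability that the response
    is relevant for its query.
    """
    expected = defaultdict(float)  # expected relevance already sent, per query
    to_send = []
    # Step 1: sort all responses, across queries, by normalized score.
    for query, response_id, p in sorted(
            responses, key=lambda item: item[2], reverse=True):
        # Step 2b (assumed condition): a homepage query needs no more
        # responses once enough expected relevance has been delivered.
        if is_homepage(query) and expected[query] >= threshold:
            continue
        # Step 2c: send the response.
        to_send.append((query, response_id))
        # Step 2a (assumed update): chance that at least one sent response is relevant.
        expected[query] = 1 - (1 - expected[query]) * (1 - p)
    return to_send


responses = [
    ("chants 2007", "a", 0.95),
    ("chants 2007", "b", 0.60),
    ("harry potter review", "c", 0.70),
    ("harry potter review", "d", 0.50),
]
order = prioritize(responses, is_homepage=lambda q: q == "chants 2007")
print(order)
```

In this example the second "chants 2007" response is skipped, which frees the short contact window for the non-homepage query.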
13
Evaluation goals
  • What is the delay in getting search results?
  • How many results were relevant to the user?

14
Evaluation Tools
  • DieselNet
  • Indri search engine
  • TREC (Text Retrieval Conference)
      • Predefined web data collection (10G)
      • Predefined set of queries (100 homepage, 50
        content)
      • Relevance judgments (which documents are relevant
        for a query)

Thedu's query-type classifier accuracy: 88%
15
Deployment on DieselNet
16
Thedu vs Proxy-less server
  • Thedu
      • March 26 to March 30
      • Bundles responses
      • Returns responses in prioritized order
      • Maintains state
  • Proxy-less server
      • April 30 to May 5
      • Bundles responses
      • Returns responses in FIFO order
      • No state

17
Connectivity duration
Mean connection duration: 35 sec
Mean disconnection duration: 8 min
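
As a back-of-the-envelope reading of these two means, the short calculation below estimates the fraction of time a bus is connected. It assumes connected and disconnected periods simply alternate with the stated mean lengths; the variable names are introduced here for illustration.

```python
mean_connected_s = 35          # mean connection duration, in seconds
mean_disconnected_s = 8 * 60   # mean disconnection duration, in seconds

# Long-run share of time spent connected under the alternating-period assumption.
fraction_connected = mean_connected_s / (mean_connected_s + mean_disconnected_s)
print(f"{fraction_connected:.1%}")  # prints 6.8%
```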
18
Thedu vs Proxy-less architecture
[Figure: plot comparing "Thedu" and "Stateless proxy"]
19
Delay until first relevant response
20
Extending Thedu
  • Can we use connectivity among buses to improve
    throughput?
  • Are we limited to academic search engines?
      • Convince commercial search providers to provide
        relevance scores
      • Or, assign scores based on ranking
  • Are users really happy with the search results and
    delay?
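
The slide does not say how scores would be assigned from a ranking. One simple possibility, shown here purely as an assumption, is a reciprocal-rank score; the function name is hypothetical.

```python
def scores_from_ranking(ranked_urls):
    """Assign each result a score of 1/rank (assumed reciprocal-rank mapping)."""
    return {url: 1.0 / rank for rank, url in enumerate(ranked_urls, start=1)}


scores = scores_from_ranking(["a", "b", "c"])
print(scores["a"], scores["b"])  # prints 1.0 0.5
```

Scores of this kind depend only on position, so the top result of every query receives the same score no matter how well it matches.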

traces.cs.umass.edu
21
Simulation Results
22
Inter-meeting times