Title: Semantic Query Caching in Mobile Environments


1
Semantic Query Caching in Mobile Environments
  • By Jekkin Shah
  • Advisor: Dr. Konstantinos Kalpakis

2
Semantic Query Caching in Mobile Environments
  • Introduction
  • Motivation
  • Contribution
  • Concept of Semantic Caching
  • Issues involved in semantic caching
  • System Architecture
  • Prototype and Experiments
  • Conclusion and further work

3
Introduction
  • Separate lines of work and progress in:
  • Geographic Information System (GIS)
  • Global Positioning System (GPS)
  • Wireless Technology
  • Handheld devices
  • Convergence to Mobile Geographic Information
    System (mobile GIS)
  • Rapid growth in mobile GIS applications in all
    walks of life
  • Emphasis on spatial data, its storage, retrieval
    and manipulation

4
Convergence
[Figure: GIS, GPS, wireless and handheld technologies converging into mobile GIS]
5
Growing List of Applications
  • Car navigation systems
  • Emergency services
  • Real time stock quotes
  • Field services
  • Real time tracking and routing of shipments
  • Environmental surveys
  • and the list is growing rapidly

6
Semantic Query Caching in Mobile Environments
  • Introduction
  • Motivation
  • Contribution
  • Concept of Semantic Caching
  • Issues involved in semantic caching
  • System Architecture
  • Prototype and Experiments
  • Conclusion and further work

7
Motivation
  • Hungry! Let's find a nearby restaurant.
  • Query Q1:
  • FIND restaurants WHERE location nearby

Found 37 matches
8
Example 1 (cont.)
  • Wait! We also need some gas.
  • Let's see if we can find a gas station near
    McDonald's.
  • Query Q2:
  • FIND McDonalds WHERE gas Station nearby

Found 2 matches
9
Shouldn't we speed up the process?
  • Query Q1 is in the local cache
  • Query Q1 subsumes query Q2
  • Why do we need to execute query Q2 from scratch?
  • We need a technique to determine and extract Q2
    from Q1
  • Unfortunately, traditional techniques like page
    caching do not provide much help in this case

[Figure: queries Q1 and Q2]
10
A new approach: Semantic Caching
  • Along with the query results, also store the
    queries in the cache
  • Use these queries (query descriptors) to
    determine if and how a new query can be answered
    from the cache
  • Check whether the required data is present in the cache
  • Extract the data from the cache
  • Add, remove and merge data by performing the
    corresponding operations on query descriptors
  • Manage the cache by managing the query descriptors
  • Think of query descriptors as intelligent pointer
    references that implicitly contain some
    information about the data they refer to
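
The idea of answering a new query from a cached query descriptor can be sketched in Python. This is only an illustration, assuming that every query is an axis-aligned rectangle over (x, y) locations and that result rows are dicts with "x" and "y" keys; the names `Descriptor` and `SemanticCache` are hypothetical and are not taken from the thesis prototype.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptor:
    """Hypothetical query descriptor: a rectangular region with closed bounds."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, other):
        """True if `other` lies entirely inside this region (this query subsumes it)."""
        return (self.x_min <= other.x_min and other.x_max <= self.x_max
                and self.y_min <= other.y_min and other.y_max <= self.y_max)

    def matches(self, row):
        """True if a result row falls inside this region."""
        return (self.x_min <= row["x"] <= self.x_max
                and self.y_min <= row["y"] <= self.y_max)


class SemanticCache:
    def __init__(self):
        self.entries = []  # list of (descriptor, rows) pairs

    def put(self, descriptor, rows):
        """Store the query descriptor together with its results."""
        self.entries.append((descriptor, list(rows)))

    def get(self, descriptor):
        """Return the rows for `descriptor` if a cached query subsumes it, else None."""
        for cached, rows in self.entries:
            if cached.contains(descriptor):
                # Extract the answer by filtering the cached result.
                return [r for r in rows if descriptor.matches(r)]
        return None
```

In this sketch a miss returns `None`; partially overlapping queries are not handled here.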

11
Problems with traditional caching
  • Pointer references do not contain any implicit
    information
  • Q1 → p1, p2, p3, p4, p5, p6
  • Q2 → p7, p8, p9, p10, p11, p12
  • Q3 → all the pages
  • Space constraints will make it difficult to store
    all the pages in cache.

[Figure: pages p1 to p6 and p7 to p12, with data3]
12
Semantic Query Caching in Mobile Environments
  • Introduction
  • Motivation
  • Contribution
  • Concept of Semantic Caching
  • Issues involved in semantic caching
  • System Architecture
  • Prototype and Experiments
  • Conclusion and further work

13
Contribution
  • An architecture for Semantic Caching in mobile
    environments
  • A system prototype as a proof of concept, with
    the following building blocks:
  • Query parser and validator
  • A Solver for determining query satisfiability
  • An Executor for processing partial and remainder
    queries
  • A Cache manager for efficiently managing the
    cache
  • A cache replacement algorithm
  • Techniques for query processing

14
Semantic Query Caching in Mobile Environments
  • Introduction
  • Motivation
  • Contribution
  • Concept of Semantic Caching
  • Issues involved in semantic caching
  • System Architecture
  • Prototype and Experiments
  • Conclusion and further work

15
Issues in semantic caching
  • The idea of semantic caching is straightforward
    (store query descriptors along with their
    results), but the issues involved are much harder
  • Simple concept, difficult implementation
  • Issues:
  • 1. We need to decide whether the answer is present
    in the cache
  • 2. If it is present, do we have sufficient
    information to extract it?

16
Answering Queries from Cache
Is the result of Q3 present in (Q1 ∪ Q2)?
17
Solving the implication problem
  • Let T = {Q1, Q2} be a set of query descriptors
    already in cache
  • We need to show that Q ⇒ T
  • We show that ¬(Q ⇒ T) is FALSE
  • ¬(Q ⇒ T)
  • ≡ ¬(¬Q ∨ T)
  • ≡ Q ∧ ¬(T)
  • ≡ Q ∧ ¬(T1 ∨ T2 ∨ T3 ∨ T4)
  • ≡ Q ∧ ¬(T1) ∧ ¬(T2) ∧ ¬(T3) ∧ ¬(T4)
  • This is the primary technique used in our thesis.
  • The algorithm is adopted from [LY85].
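
A small Python sketch of the same test, under strong simplifying assumptions: each predicate is a conjunction of half-open range conditions, written as a dict from attribute name to (low, high). This is not the [LY85] algorithm; it is a brute-force stand-in that looks for a witness point satisfying Q and none of the cached predicates. The function names are hypothetical.

```python
from itertools import product


def covers(pred, point):
    """True if `point` satisfies every range condition of `pred`.
    A predicate on an attribute the point lacks is treated as not covering
    (a conservative choice that can only cause a cache miss)."""
    return all(a in point and lo <= point[a] < hi for a, (lo, hi) in pred.items())


def implied_by_cache(q, cached):
    """True if Q implies (T1 or ... or Tn), i.e. Q and not T1 ... and not Tn
    has no solution."""
    attrs = sorted(q)
    cuts = {}
    for a in attrs:
        lo, hi = q[a]
        points = {lo}
        for t in cached:
            for bound in t.get(a, ()):
                if lo < bound < hi:
                    points.add(bound)  # cached bounds split Q into cells
        cuts[a] = sorted(points)
    # Every cached predicate is constant on each cell, so testing the lower
    # corner of each cell is enough.
    for corner in product(*(cuts[a] for a in attrs)):
        point = dict(zip(attrs, corner))
        if not any(covers(t, point) for t in cached):
            return False  # witness: satisfies Q but no cached predicate
    return True
```

The number of cells grows multiplicatively with the number of attributes and cached bounds, which is one way to see why the amount of work can grow quickly.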

18
Solving the implication problem (Cont.)
  • Exponential growth in the number of equations to
    be solved.
  • Solution
  • Clustering based on Signatures
  • Signature created by taking into account the
    predicate attributes present in the query
  • Restriction on the number of clusters created
  • Signature used in indexing the query descriptors

[Figure: query descriptor clusters with attribute signatures (A, B) and (X, D)]
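
A possible reading of signature-based clustering, sketched in Python. The slides do not say how the number of clusters is restricted, so the catch-all overflow cluster used here is purely an assumption, as are the names `signature`, `SignatureIndex` and `OVERFLOW`.

```python
from collections import defaultdict

OVERFLOW = ("*",)  # hypothetical catch-all cluster used once the cap is reached


def signature(predicate):
    """Signature = sorted tuple of the attribute names used in the predicate."""
    return tuple(sorted(predicate))


class SignatureIndex:
    def __init__(self, max_clusters):
        self.max_clusters = max_clusters
        self.clusters = defaultdict(list)  # signature -> descriptor names

    def add(self, name, predicate):
        sig = signature(predicate)
        # Assumption: new signatures beyond the cap share one overflow cluster.
        if sig not in self.clusters and len(self.clusters) >= self.max_clusters:
            sig = OVERFLOW
        self.clusters[sig].append(name)

    def candidates(self, predicate):
        """Descriptors worth passing to the solver for this query."""
        sig = signature(predicate)
        return self.clusters.get(sig, []) + self.clusters.get(OVERFLOW, [])
```

Only the descriptors returned by `candidates` need to be considered by the solver, which keeps the number of terms in the implication test small.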
19
Data Extraction problem

Can we extract Data3?
[Figure: Data1, Data2 and Data3]
We fetch attribute C from the remote source and take
a Cartesian product with the data already present
in the cache
20
Answering Partial Queries
  • What happens if Q ⇒ T is FALSE?
  • There may be a non-empty intersection between
    Q and T
  • Answer (Q ∧ T) locally (partial match)
  • Send (Q ∧ ¬T) to the server (remainder query)

[Figure: query Q and cached regions T1 and T2]
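
For rectangular queries, the split into a locally answerable part and a remainder can be sketched with rectangle intersection and subtraction. This Python example assumes half-open rectangles written as (x1, y1, x2, y2); the function names are hypothetical and the decomposition into strips is one of several valid choices.

```python
def intersect(a, b):
    """Overlap of two rectangles, or None if they do not overlap."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    return (x1, y1, x2, y2) if x1 < x2 and y1 < y2 else None


def subtract(r, s):
    """Parts of rectangle r not covered by s, as disjoint rectangles."""
    i = intersect(r, s)
    if i is None:
        return [r]
    x1, y1, x2, y2 = r
    ix1, iy1, ix2, iy2 = i
    pieces = []
    if x1 < ix1:
        pieces.append((x1, y1, ix1, y2))    # strip left of the overlap
    if ix2 < x2:
        pieces.append((ix2, y1, x2, y2))    # strip right of the overlap
    if y1 < iy1:
        pieces.append((ix1, y1, ix2, iy1))  # strip below the overlap
    if iy2 < y2:
        pieces.append((ix1, iy2, ix2, y2))  # strip above the overlap
    return pieces


def split_query(q, cached):
    """Return (probe, remainder): regions answered locally and regions to send
    to the server. Probe regions may overlap each other if cached regions do."""
    probe = []
    remainder = [q]
    for t in cached:
        overlap = intersect(q, t)
        if overlap is not None:
            probe.append(overlap)
        remainder = [piece for r in remainder for piece in subtract(r, t)]
    return probe, remainder
```

An empty remainder means the query is fully answered from the cache.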
21
Semantic Query Caching in Mobile Environments
  • Introduction
  • Motivation
  • Contribution
  • Concept of Semantic Caching
  • Issues involved in semantic caching
  • System Architecture
  • Prototype and Experiments
  • Conclusion and further work

22
Semantic Caching Architecture
[Figure: architecture diagram with components: query parser and validator, solver (query implication), executor, cache manager, local cache, remote db; input: query; output: results]
23
Cache Structure
  • The local cache is implemented as relational
    database structures
  • Query descriptors are stored in one table, indexed
    by their signatures
  • The corresponding query results (data) are stored
    in another table
  • An auxiliary table associates each query
    descriptor with its corresponding data
  • The cache manager interacts with the query
    descriptor table
  • Manipulation of data is achieved through the
    manipulation of query descriptors
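
The three-table layout described above could look roughly like this in SQLite, driven from Python. The table and column names (`descriptor`, `result`, `descriptor_result`, `payload`) are assumptions made for illustration; the slides do not give the actual schema.

```python
import sqlite3


def create_cache(conn):
    """Create the three hypothetical cache tables."""
    conn.executescript("""
        CREATE TABLE descriptor (
            id INTEGER PRIMARY KEY,
            signature TEXT NOT NULL,
            predicate TEXT NOT NULL
        );
        CREATE INDEX descriptor_by_signature ON descriptor(signature);
        CREATE TABLE result (
            id INTEGER PRIMARY KEY,
            payload TEXT NOT NULL
        );
        CREATE TABLE descriptor_result (
            descriptor_id INTEGER NOT NULL REFERENCES descriptor(id),
            result_id INTEGER NOT NULL REFERENCES result(id),
            PRIMARY KEY (descriptor_id, result_id)
        );
    """)


def add_region(conn, signature, predicate, rows):
    """Insert one query descriptor plus its result rows; return the descriptor id."""
    did = conn.execute(
        "INSERT INTO descriptor (signature, predicate) VALUES (?, ?)",
        (signature, predicate),
    ).lastrowid
    for payload in rows:
        rid = conn.execute(
            "INSERT INTO result (payload) VALUES (?)", (payload,)
        ).lastrowid
        conn.execute("INSERT INTO descriptor_result VALUES (?, ?)", (did, rid))
    return did


def rows_for(conn, descriptor_id):
    """Fetch the cached data associated with one query descriptor."""
    cur = conn.execute(
        "SELECT r.payload FROM result r "
        "JOIN descriptor_result dr ON dr.result_id = r.id "
        "WHERE dr.descriptor_id = ? ORDER BY r.id",
        (descriptor_id,),
    )
    return [row[0] for row in cur]
```

Keeping the association in its own table lets several descriptors point at the same result rows.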

24
Cache Operations and Management
  • Cache Manager
  • Replacement module
  • Replacement: determines what needs to be cached
    and what can be purged
  • Management module
  • Addition: the granularity of addition is a semantic
    region
  • Deletion: removal of a region, though not
    necessarily leading to the removal of data
  • Merge: to simplify query processing, two or more
    regions can be merged
  • Decomposition: a very large region can be
    decomposed for efficiency reasons
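
One way to picture why deleting a region need not delete data: if two regions share rows, the rows survive as long as some region still refers to them. The Python sketch below is an assumption about how that could work, with a hypothetical `RegionStore` class; decomposition is omitted.

```python
class RegionStore:
    """Hypothetical in-memory store: regions refer to rows by id."""

    def __init__(self):
        self.regions = {}  # region name -> set of row ids
        self.rows = {}     # row id -> row data

    def add(self, name, rows):
        """Add a semantic region; `rows` maps row id to row data."""
        self.regions[name] = set(rows)
        self.rows.update(rows)

    def delete(self, name):
        """Remove a region; drop only rows no other region still refers to."""
        ids = self.regions.pop(name)
        still_used = set().union(*self.regions.values())
        for row_id in ids - still_used:
            del self.rows[row_id]

    def merge(self, first, second, merged_name):
        """Replace two regions by one region covering the rows of both."""
        merged = self.regions.pop(first) | self.regions.pop(second)
        self.regions[merged_name] = merged
```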

25
Cache Replacement
  • Theory and Assumptions
  • What is the performance metric ?
  • Conventional caching schemes optimize one or more
    of the following parameters with the goal of
    improving the performance
  • Hit ratio
  • Response time
  • Data transmission time
  • Due to the dynamics of our application domain,
    none of these parameters truly reflects the
    performance of our applications

26
Theory and Assumptions (Cont.)
  • Cache hit rate: how do we define hit rate?
  • One: at least one data record is obtained from the
    cache
  • All: all data records are obtained from the local
    cache
  • Mid: 50% of the data records are satisfied from the
    local cache
  • Response time
  • Partially answered queries make it difficult to
    accurately define the response time
  • Data transmission time
  • Depends heavily on actual network
    parameters such as latency and bandwidth
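
The three hit-rate definitions can give quite different numbers for the same workload. The Python sketch below assumes each query is summarized as (records served from cache, total records) and reads "Mid" as "at least 50%"; both the input format and that reading are assumptions.

```python
def hit_rates(queries):
    """Fraction of queries counted as hits under the One, All and Mid definitions.

    `queries` is a list of (records_from_cache, records_total) pairs.
    """
    if not queries:
        raise ValueError("need at least one query")
    n = len(queries)
    one = sum(1 for cached, total in queries if cached >= 1)
    all_ = sum(1 for cached, total in queries if total > 0 and cached == total)
    mid = sum(1 for cached, total in queries if total > 0 and 2 * cached >= total)
    return {"one": one / n, "all": all_ / n, "mid": mid / n}
```

For example, four queries that obtain 0, 3, 5 and 10 of their 10 records from the cache count as 75% hits under One, 50% under Mid and 25% under All.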

27
Theory and Assumptions (Cont.)
  • Mobile environments: premium on bandwidth
  • Our goal: to minimize the cost of servicing the
    requests that cannot be answered from the local
    cache
  • Cost is measured in terms of time
  • The performance metric is byte hit rate (BHR)
  • Ratio of the actual amount of data served from the
    local cache to the amount of data transferred from
    the remote source
  • Assumptions:
  • Negligible query execution time
  • Uniform latency and bandwidth across the network
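
Taking the definition on this slide literally, BHR over a trace of queries can be computed as below. The trace format, a list of (bytes from cache, bytes from remote) pairs, and the handling of a trace with no remote traffic are assumptions of this sketch.

```python
def byte_hit_rate(trace):
    """Ratio of bytes served from the local cache to bytes fetched remotely.

    `trace` is a list of (bytes_from_cache, bytes_from_remote) pairs, one per query.
    """
    cached = sum(c for c, _ in trace)
    remote = sum(r for _, r in trace)
    if remote == 0:
        # Assumption: nothing fetched remotely means an unbounded ratio,
        # unless nothing was served at all.
        return float("inf") if cached else 0.0
    return cached / remote
```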

28
Replacement Algorithm
  • Guiding Action Selection function (GAS) to assign
    a value to each semantic region
  • GAS value = a (s f b)
  • s: size of data transferred from the remote
    source
  • f: frequency of access of the query
  • a, b: domain-specific parameters
  • a: freshness count of each query
  • b: 1/Sd, where Sd is the distance between the
    current location of the moving object and the
    location of the query
  • Using the GAS function, the value of each semantic
    region is calculated
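
The operators between s, f and b did not survive in this transcript. The Python sketch below assumes the terms are multiplied, which is only one possible reading; the function and parameter names are hypothetical.

```python
def gas_value(size, frequency, freshness, distance):
    """GAS value of one semantic region, ASSUMING a * (s * f * b).

    size      -- s, size of data transferred from the remote source
    frequency -- f, frequency of access of the query
    freshness -- a, freshness count of the query
    distance  -- Sd, distance from the moving object to the query location
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    b = 1.0 / distance  # b = 1/Sd, so nearby regions score higher
    return freshness * (size * frequency * b)
```

Under this reading, a region that is large, frequently used, fresh and close to the moving object gets the highest value and is the last candidate for replacement.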

29
Replacement Algorithm (Cont.)
  • For each query in cache we have:
  • GAS value (Vi)
  • Weight (Wi)
  • We also have a limit on the total size of the
    cache (W) and on the total number of queries
    (K) that can be admitted
  • Problem definition
  • Given a set of rectangles, each with a weight and a
    value, choose at most K rectangles that give the
    maximum value, provided the total weight does not
    exceed W
  • The problem can be formulated as the 0-1 knapsack
    problem with an additional cardinality constraint
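
A textbook dynamic program for the 0-1 knapsack problem with a cardinality constraint, sketched in Python. It assumes integer weights and is not necessarily the method used in the thesis; `choose_regions` is a hypothetical name.

```python
def choose_regions(items, max_weight, max_count):
    """Pick at most `max_count` items with total weight <= `max_weight`
    and maximum total value.

    `items` is a list of (value, weight) pairs with non-negative integer weights.
    Returns (best_value, indices_of_chosen_items).
    """
    # best[k][w] = (value, chosen) using at most k items and weight at most w
    best = [[(0, ())] * (max_weight + 1) for _ in range(max_count + 1)]
    for i, (value, weight) in enumerate(items):
        # Go through k in decreasing order so row k-1 still excludes item i,
        # which guarantees each item is used at most once.
        for k in range(max_count, 0, -1):
            for w in range(max_weight, weight - 1, -1):
                prev_value, prev_chosen = best[k - 1][w - weight]
                if prev_value + value > best[k][w][0]:
                    best[k][w] = (prev_value + value, prev_chosen + (i,))
    return best[max_count][max_weight]
```

The table has (K + 1) × (W + 1) entries, so the running time is proportional to the number of items times K times W.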

30
Semantic Query Caching in Mobile Environments
  • Introduction
  • Motivation
  • Contribution
  • Concept of Semantic Caching
  • Issues involved in semantic caching
  • System Architecture
  • Prototype and Experiments
  • Conclusion and further work

31
Experiments (Setup)
  • Requirements
  • Workload (datasets and queries)
  • Modeling the behavior of the moving object
  • Query execution guidelines
  • Real datasets
  • Hard to obtain
  • Complexity in processing due to complex
    structures of spatial objects
  • Synthetic dataset generator
  • Easily generated
  • Various parameters can be controlled

32
Workload
  • Query load selection
  • Tables
  • Restaurants: LocX, LocY, Name, ID, tables, City,
    Zip
  • Gas Stations: LocX, LocY, Name, ID, Low, Mid,
    High
  • Query specifications
  • Rectangular queries (select and project only)
  • Number of queries issued per trip: 20-70
  • Type of queries: location aware, location
    dependent and non-location related
  • Frequency of issuance: selected randomly, ranging
    from 5 ms to 100 ms
  • Overlap rate: 10-25%

33
Experiments (Moving Object)
  • Behavior of Moving Object
  • Generating Spatio-Temporal Dataset (GSTD) [PT00]
  • Moves in a 2D space
  • Static points and regions called infrastructure
    emulate real life objects like buildings, rivers,
    roads etc.
  • Trajectories are generated using specific
    guidelines
  • Initial statistical distribution of
    infrastructure objects
  • Source and destination location
  • Speed of moving object
  • Direction of motion
  • Duration of journey

34
Query Execution Guidelines
  • Controllable parameters
  • Type of queries
  • Location dependent, Location aware, Non-location
    related
  • Frequency of query issuance
  • Selectivity of chosen queries
  • Query overlap rate
  • Parameters are chosen in a variety of
    combinations
  • Random
  • Gaussian distribution
  • Skewed distribution

35
Results
  • Cache size vs. hit rate (NEW vs. m-LRU)
  • The NEW replacement scheme performs roughly on par
    with the modified LRU (m-LRU) replacement scheme
  • BHR increases up to 70% as the cache size is
    progressively increased

36
Results
  • Hit rates vs. number of queries (NEW scheme)
  • Increasing the number of queries in the system
    does not substantially increase the hit rates.
  • Byte hit rate is nearly equal to the Mid hit
    rate

37
Semantic Query Caching in Mobile Environments
  • Introduction
  • Motivation
  • Contribution
  • Concept of Semantic Caching
  • Issues involved in semantic caching
  • System Architecture
  • Prototype and Experiments
  • Conclusion and further work

38
Conclusion
  • No assumption made on Spatial Locality of
    Reference
  • Query descriptors act as Intelligent References
  • Can support Content Based Reasoning
  • Ability to take advantage of Schema Knowledge
  • Page / Tuple caching schemes do not scale well in
    our GIS domain
  • Reasons
  • Unintelligent pointer references
  • Questionable assumption of Spatial Locality of
    Reference
  • Inability to take advantage of Semantic Overlaps

39
Advantages of Semantic Caching
  • Benefits of Semantic Caching
  • Leverages semantic locality found in typical
    mobile GIS applications
  • Adapts dynamically to the patterns of user
    queries rather than caching static clusters of
    tuples
  • Minimizes cost of cache lookup due to compact
    representation of query descriptors
  • Capable of providing partial and/or approximate
    answers to queries quickly

40
Conclusion (Cont.)
  • Shortcomings of Semantic Caching
  • Complicated cache management schemes
  • Too restrictive: the solver can process only
    simple types of queries
  • Captures the semantics of the query and not the
    result objects. Hence, fails to utilize cached
    objects when the semantics of the query do not
    match

41
Conclusion (Cont.)
  • Future work: lots of things
  • Make the solver more general to handle different
    types of queries
  • Make the caching scheme flexible enough to
    capture the semantics of the query descriptors as
    well as the result objects
  • Simpler cache management
  • Ability to share cache with peers