AIMS: An Immersidata Management System - PowerPoint PPT Presentation

About This Presentation
Title:

AIMS: An Immersidata Management System

Description:

Integrated Media Systems Center. University of Southern California. Los Angeles, CA 90089-0781 ... Fire Fighter Training System (Georgia Tech) Planetary ... – PowerPoint PPT presentation

Learn more at: http://ilab.usc.edu

Transcript and Presenter's Notes

Title: AIMS: An Immersidata Management System


1
AIMS: An Immersidata Management System
  • Cyrus Shahabi
  • Computer Science Department
  • Integrated Media Systems Center
  • University of Southern California
  • Los Angeles, CA 90089-0781
  • shahabi@usc.edu
  • http://infolab.usc.edu

2
Outline
  • Definitions and Motivating Applications
  • Immersive Data Types (focus: immersidata)
  • AIMS Architecture
  • Subsystems: Acquisition, Storage, Querying
  • Current Status (demo, if time permits)
  • Conclusion and Future Work

3
Immersive Environments
  • Immersive Environments allow a user to become
    immersed within an augmented or virtual reality
    environment in order to interact with people,
    objects, places, and databases.
  • Examples
  • Office of the Future (UNC)
  • Fire Fighter Training System (Georgia Tech)
  • Planetary Exploration (JPL)
  • Physical/Occupational Therapy System (Haifa
    Univ.)
  • Virtual Classroom and Office (USC IMSC)
  • Haptic Museum (USC IMSC)
  • MRE: Mission Rehearsal Exercise (USC ICT)

4
Thesis (1)
  • It is absolutely critical to understand the data
    generated by and for immersive environments
  • For example, from the data acquired from a user's
    interactions with an immersive environment (i.e.,
    immersidata), we can learn about the user's
    behavior to:
  • Study human factor issues
  • Measure the effectiveness of the environment
  • Customize the information delivery
  • Identify pitfalls in the system
  • Better understand the user's intentions
  • Improve the system performance
  • For immersive and multimedia community!
  • For database community:
  • Immersive sensors are the user interfaces of the
    future; as a research community, we should study
    the data they generate or we will miss the boat.

5
Example Immersive Sensor Data Streams
<Si, x, y, z, t, v>
6
Application (1): Immersive Sensor Pattern
Recognition, On-Line Query & Analysis
Recognition System
DB of Labeled Patterns
Immersive environment
7

Application (1): American Sign Language (ASL) as
well-defined patterns
1. User makes ASL signs w/ a glove
2. Sensor values sampled over time
3. Semantic description of hand
4. ASL signs recognized
Acquisition Module
Spatio-Temporal (moving sensors) Query Evaluation
  • Recognition modules
  • SVD
  • Bayesian Classifiers
  • Neural Net
8
Application (1): ASL On-Line Q&A
  • On-Line query and analysis challenges
  • A hand sign is composed of a sequence of data
    samples across multiple sensor streams
  • A sequence for one sign has no fixed length
    (i.e., we can't tell when one sign ends and the
    next starts!)
  • An example statement in American Sign Language
    (ASL)

shoes
I
yellow
  • Two problems (a chicken-and-egg problem) with
    interdependent solutions should be addressed
  • Isolate signs
  • Recognize the isolated sign

9
Application (2): Immersive Classroom, Off-Line
Query & Analysis
  • Study attention performance for Normal and
    ADHD-Diagnosed Children
  • A classroom as a virtual environment (virtual
    students, a virtual teacher, desks, a blackboard,
    a window to the playground, doors)
  • Presence of distracters
  • Paper airplane
  • Ambient classroom noise
  • Students walking
  • Cars passing outside, visible through the window

10
Application (2): IC Off-Line Q&A
  • User, wearing an HMD, is immersed in the class
  • Trackers monitor body movements and stream data
    to the database
  • Task: pressing a button when a particular letter
    pattern is seen on the virtual blackboard (e.g.,
    AX)

Displayed Characters
DB
Head sensor data
Arm sensor data
Leg sensor data
Mouse Clicks
Distracters
11
Application (2): IC Off-Line Q&A
  • Off-line query and analysis
  • Range-sum queries
  • Sum of body movements
  • Average reaction time to the patterns
  • Number of correct hits
  • Classification and clustering
  • Use a classification technique to differentiate
    between normal and ADHD-diagnosed subjects (e.g.,
    SVM)
  • Distinguishing hyperactive kids from normal by
    automatically analyzing tracker data: major
    impact in psychotherapy, able to discriminate and
    specify diagnosis in a manner not possible using
    existing traditional methods
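
The range-sum style measures listed on this slide can be sketched as follows. This is only an illustration under assumed data layouts: tracker samples as `(t, x, y, z)` tuples, stimulus and click times as plain floats, and a hypothetical `max_delay` window for counting a click as a correct hit. None of these names come from the slides.

```python
import math

def movement_sum(samples, t_start, t_end):
    """Sum of body movement: total distance travelled by one tracker
    between t_start and t_end. `samples` is a time-sorted list of
    (t, x, y, z) tuples (an assumed layout)."""
    pts = [s for s in samples if t_start <= s[0] <= t_end]
    return sum(math.dist(a[1:], b[1:]) for a, b in zip(pts, pts[1:]))

def reaction_stats(stimuli, clicks, max_delay=1.0):
    """Number of correct hits and average reaction time.
    A stimulus counts as hit if a click follows it within `max_delay`
    seconds (a hypothetical threshold)."""
    delays = []
    for s in stimuli:
        hit = next((c for c in clicks if s <= c <= s + max_delay), None)
        if hit is not None:
            delays.append(hit - s)
    avg = sum(delays) / len(delays) if delays else None
    return len(delays), avg
```

Both functions scan the raw records; a system such as AIMS would instead answer the same aggregates from stored wavelet coefficients.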

12
Thesis (2)
  • Immersive applications in training and simulation
    domains share common data storage and analysis
    requirements (i.e., dealing w/ sensor data
    streams, aka immersidata)
  • Hence, instead of building customized systems for
    the acquisition, storage and querying needs of
    each immersive application, one can design a
    general-purpose system addressing many of the
    shared requirements

13
Common Data Components of Immersive Environments
[ACM-ITP02]
  • User (subject(s))
  • Virtual Space
  • Actor Objects
  • Mission (task objective)
  • Immersive Data Types
  • Conventional Data: user data
  • Spatio-Temporal Data: immersive space/time data
  • Immersidata: Sensor Data Streams

14
Focus: Immersidata [MIS99]
  • Data acquired from a user's interaction with the
    immersive environment
  • Subject body positions
  • Subject recognized gestures
  • Can be analyzed to learn about the user's behavior
  • Specifications
  • Multidimensional: <si, x, y, z, t, v>
  • Spatio-Temporal
  • Continuous Data Streams (CDS)
  • Potentially large in size and bandwidth
    requirements
  • Noisy

<s1, x1, y1, z1, h1, p1, r1, t1>, ..., <sn, xn, yn, zn, hn, pn, rn, tn>, ...
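
As a rough illustration, one sample of such a stream could be modeled as a small record type. The field meanings are assumptions: the slides only give the letters, and `h`, `p`, `r` are read here as heading, pitch, and roll.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator

@dataclass(frozen=True)
class Sample:
    """One immersidata tuple <s, x, y, z, h, p, r, t>."""
    s: int      # sensor id
    x: float    # position
    y: float
    z: float
    h: float    # assumed: heading
    p: float    # assumed: pitch
    r: float    # assumed: roll
    t: float    # timestamp

def by_sensor(stream: Iterable[Sample], sensor_id: int) -> Iterator[Sample]:
    """Lazily select the samples of one sensor from a mixed stream."""
    return (sample for sample in stream if sample.s == sensor_id)
```

Because `by_sensor` is a generator expression, it works on an unbounded stream without buffering it.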
15
AIMS: An Immersidata Management System
Sensor Data Streams
1. Acquisition module
  • DWPT basis selection for each dimension
  • Transformation
2. Storage module
  • Wavelets packing into disk blocks or DB BLOBs
  • Immersidata storage (file system / OR-DBMS)
3. User interaction module
  • Application-specific GUI
  • Pattern isolation heuristic
  • Pattern matching: SVD-based measure
4. Query & analysis module
  • ProPolyne web services
16
Challenges of AIMS Subsystems
  • Acquisition [SIGMETRICS01, ICME02]
  • Data should be filtered and transformed (similar
    to signals)
  • Database-friendly signal processing techniques
    are required
  • Storage [SIGMOD03?]
  • Physical level of storage system should be
    designed to store transformed data (e.g., wavelet
    coefficients)
  • Block allocation strategies considering query
    patterns
  • Offline Query and Analysis [EDBT02, PODS02]
  • Approximate, progressive, and efficient
    polynomial analytical queries on large amounts of
    multidimensional data
  • Online Query and Analysis [MMM03]
  • Common challenges with querying continuous data
    streams
  • Real-time pattern recognition on aggregation of
    multiple data streams that are incrementally
    completing
  • Data from all streams together form the
    meaningful data

17
1. Acquisition Module
Approaches
  • INPUT: Multidimensional streams
  • OUTPUT: Wavelet coefficients
  • Receives multidimensional sensor streams
  • In real time, selects a different basis per
    dimension (optimally) from the DWPT (Discrete
    Wavelet Packet Transform) library
  • Applies multidimensional transformation to data
    (generates multi-resolution representations of
    data)
  • NOTE: no compression is applied; no data is
    lost by this process

18
2. Storage Module
Approaches
  • INPUT: Wavelet coefficients
  • OUTPUT: disk blocks, metadata records
  • Optimally packs related wavelet coefficients into
    disk blocks (to reduce future I/O cost) and stores
    them in the file system or within an OR-DBMS
  • Records the corresponding disk-block information
    in the DBMS (Database Management System) for
    future queries

19
Optimal Disk Placement for Wavelet Data:
Dependency Graph (Haar wavelets)
20
Optimal Disk Placement for Wavelet Data: Tiling -
Blocking (Haar wavelets)
21
3. User Interaction Module
Approaches
  • INPUT: Camera/speech/tracker/immersive-sensor
  • OUTPUT: application commands and queries;
    user profile/state and application context
  • Receives data from various input devices (beyond
    keyboard and mouse) used by the user (e.g., for
    data visualization purposes)
  • Understands the set of requested actions (SVD,
    mutual information)
  • Translates actions to application-specific
    commands and/or database queries (takes
    user profile and context into account)
  • Also stores a history of the user's interactions
    to be mined off-line and/or on-line to extract
    user state/behavior and application context to
    facilitate future interactions by the same user
    (e.g., personalization/customization)

22
4. Query & Analysis Module
Approaches
  • INPUT: Range and point queries
  • OUTPUT: Aggregate values/Integrated events
  • Transforms queries into the same wavelet
    domain as the data
  • Performs queries efficiently (and perhaps
    approximately or progressively) in the wavelet
    domain
  • Displays the correct resolution/granularity of
    aggregate value(s) and/or events to the user
    based on user profile (e.g., tolerable latency
    time) and/or system requirements and/or data
    availability
  • An event is tagged with space (e.g., latitude,
    longitude and altitude), time, and a bag of
    attributes

23
AIMS Main Theme: Data Manipulation, Query &
Analysis in the WAVELET Domain
  • Main idea/distinction: storage is cheap and
    queries are ad hoc, so let's keep all the wavelet
    coefficients! (no data compression)
  • Intuition: at data population time, we don't
    know which coefficients are more/less important
  • Different from the signal-processing objective of
    reconstructing the entire signal as well as
    possible
  • This has been observed by Garofalakis & Gibbons
    [SIGMOD02], but they proposed other ways to drop
    coefficients assuming a uniform workload
  • Opportunity: at query time, however, we know
    what is important to the pending query

24
AIMS Main Theme: Q&A of Wavelets
  • Define range-sum query as dot product of query
    vector and data vector (also observed by Gilbert
    et al. [VLDB2001], but without query
    transformation)
  • Offline: multidimensional wavelet transform of
    data
  • At query time: lazy wavelet transform of
    query vector (very fast)
  • Dot product of query and data vectors in the
    transformed domain → exact result
  • Choose high-energy query coefficients only → fast
    approximate result (90% accuracy by retrieving
    < 10% of data)
  • Choose query coefficients in order of energy →
    progressive result
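
The steps on this slide can be sketched in one dimension as follows. This is a simplified illustration, not the ProPolyne implementation: it uses an orthonormal Haar transform (scaling by 1/√2 so that dot products are preserved), and it transforms the query with the full transform rather than the lazy wavelet transform named above. The function names are made up for the example.

```python
import math

def haar(v):
    """Orthonormal Haar transform of a list whose length is a power of two."""
    out = list(v)
    n = len(out)
    s = math.sqrt(2.0)
    while n > 1:
        half = n // 2
        avg = [(out[2 * i] + out[2 * i + 1]) / s for i in range(half)]
        det = [(out[2 * i] - out[2 * i + 1]) / s for i in range(half)]
        out[:n] = avg + det   # summaries first, details after
        n = half              # recurse on the summaries only
    return out

def progressive_range_sum(data, lo, hi):
    """Estimates of sum(data[lo..hi]), refined one query coefficient at a time."""
    query = [1.0 if lo <= i <= hi else 0.0 for i in range(len(data))]
    d_hat, q_hat = haar(data), haar(query)
    # Visit nonzero query coefficients in decreasing order of energy.
    order = sorted((i for i, c in enumerate(q_hat) if abs(c) > 1e-12),
                   key=lambda i: -abs(q_hat[i]))
    total, estimates = 0.0, []
    for i in order:
        total += q_hat[i] * d_hat[i]   # only d_hat[i] must be fetched
        estimates.append(total)
    return estimates
```

The last estimate is the exact range sum (up to floating-point rounding); stopping earlier gives an approximate answer, and reporting every intermediate value gives a progressive one.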

25
Progressive Evaluation of Vector Queries
26
Current Status: ProPolyne Demonstration
27
AIMS with a Twist!
<x, y, z, t, value> Remote Sensor Data
Streams <lat, long, altitude, t, temperature>
1. Acquisition module
  • DWPT basis selection for each dimension
  • Transformation
2. Storage module
  • Wavelets packing into disk blocks or DB BLOBs
  • Sensor Data storage (file system / DBMS)
3. User interaction module
  • Application-specific GUI
  • Pattern isolation heuristic
  • Pattern matching: SVD-based measure
4. Query & analysis module
  • ProPolyne web services
28
Conclusion and Future Work
  • A new application domain, immersive applications,
    and one of its data sets, immersidata, were
    introduced
  • Database challenges involved in managing
    immersidata were discussed
  • Some direct adoption of the typical database
    research techniques (e.g., OLAP)
  • Some modifications/extensions of the current
    research contributions (e.g., in the area of data
    streams) that are not applicable immediately
  • The design of AIMS, an innovative data systems
    architecture, was reported
  • Future Work
  • I/O efficient ways for Wavelet transformation and
    incremental update
  • Hybrid sorting of both data and query
    coefficients
  • Prototypical implementation of an end-to-end
    application using AIMS
  • Performance evaluation

29
Application (3): Physical/Occupational Therapy,
Both On-Line and Off-Line Q&A
  • Rehabilitation research using virtual
    environments and gaming technologies
  • Enables individuals with severe physical
    disabilities to use their residual motor
    abilities in more efficient and less fatiguing
    ways
  • Patient watches her video projected onto a 2-D
    virtual environment
  • Video cameras track body movements
  • Animated target characters are manipulated within
    the environment
  • Patient is asked to hit the targets to score
    more points
  • Potential data analysis tasks
  • Offline analysis of user performance in order to
    find specific motor disabilities
  • Online analysis of body movements to add more
    targets in the directions which need more
    exercises

30
  • Thanks!

31
Haptic Data Acquisition [SIGMETRICS01]
  • Temporal aspect: at what rate should the values
    of sensors be sampled?
  • Trade-off between accuracy and bandwidth
    utilization
  • Fixed Sampling
  • Sampling at a constant rate; max value of speed
    is a function of system speed and/or haptic glove
  • Group Sampling
  • Intuitive grouping of sensors; different sampling
    rate for each group
  • Adaptive Sampling
  • Dynamic sampling within a window of the session;
    every sensor is sampled at an individual optimal
    rate

32
ProPolyne Features
  • Measure can be any polynomial on any
    combination of attributes
  • Can support COUNT, SUM, AVERAGE
  • Also supports Covariance, Kurtosis, etc.
  • All using one set of pre-computed aggregates
  • Independent of how well the data set can be
    compressed/approximated by wavelets
  • Because: we show range-sum queries can always
    be approximated well by wavelets (not always HAAR
    though!)
  • Low update cost: O(log^d N)
  • Can be used for exact, approximate and
    progressive range-sum query evaluation
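
To see why one family of polynomial range-sums is enough for these measures, here is a small sketch. It scans rows directly, so it only illustrates the algebra, not how ProPolyne evaluates the sums from precomputed wavelet aggregates. The helper names and the tuple layout are assumptions for the example.

```python
def range_sum(rows, in_range, f):
    """Polynomial range-sum: sum of f(row) over the rows inside the range."""
    return sum(f(r) for r in rows if in_range(r))

def average(rows, in_range, a):
    """AVERAGE of attribute a = SUM(a) / COUNT."""
    n = range_sum(rows, in_range, lambda r: 1)           # COUNT: f = 1
    if n == 0:
        raise ValueError("empty range")
    return range_sum(rows, in_range, lambda r: r[a]) / n  # SUM: f = x_a

def covariance(rows, in_range, a, b):
    """Covariance of attributes a and b from three degree <= 2 range-sums."""
    n = range_sum(rows, in_range, lambda r: 1)
    if n == 0:
        raise ValueError("empty range")
    s_a = range_sum(rows, in_range, lambda r: r[a])
    s_b = range_sum(rows, in_range, lambda r: r[b])
    s_ab = range_sum(rows, in_range, lambda r: r[a] * r[b])
    return s_ab / n - (s_a / n) * (s_b / n)
```

Higher moments such as kurtosis follow the same pattern with higher-degree polynomials.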

33
Polynomial Range-Sum Queries
  • Polynomial range-sum queries: Q(R, f, I)
  • I is a finite instance of schema F
  • R ⊆ Dom(F) is the range
  • f: Dom(F) → ℝ is a polynomial of degree d

34
Polynomial Range-Sum Queries as Vector Queries
  • The data frequency distribution of I is the
    function D_I: Dom(F) → Z that maps a
    point x to the number of times it occurs in I
  • To emphasize the fact that a query is an
    operator on the data frequency distribution, we
    write
  • Example: D(25,50) = D(28,55) = D(57,120) = 1 and
    D(x) = 0 otherwise.
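
A minimal sketch of a query acting on the data frequency distribution, assuming the usual reading of a range-sum: the sum over all points x of f(x), counted D(x) times, restricted to x in R. The example points are interpreted here as (age, salary) pairs, which is an assumption; the function names are hypothetical.

```python
from collections import Counter

def frequency_distribution(instance):
    """D_I: maps each point to the number of times it occurs in I."""
    return Counter(instance)

def vector_query(dist, in_range, f):
    """Range-sum as an operator on the distribution:
    sum over x of f(x) * [x in R] * D(x)."""
    return sum(f(x) * count for x, count in dist.items() if in_range(x))

D = frequency_distribution([(25, 50), (28, 55), (57, 120)])
young = lambda x: 25 <= x[0] <= 40                 # range R on the first attribute
count = vector_query(D, young, lambda x: 1)         # COUNT
total = vector_query(D, young, lambda x: x[1])      # SUM of the second attribute
```

Viewed this way, the query is a vector (f restricted to R) and the answer is its dot product with the data vector D.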

35
Overview of Wavelets
H operator: computes a local average of array a
at every other point to produce an array of
summary coefficients Ha. Example (Haar):
h = [1/2, 1/2]
G operator: measures how much values in the array
a vary inside each of the summarized blocks to
compute an array of detail coefficients Ga
(aka wavelet coefficients of a). Example (Haar):
g = [1/2, -1/2]
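
The two operators can be sketched for the Haar case as below. The sign convention for G (first element minus second) and the function names are assumptions made for the example.

```python
def H(a):
    """Summary coefficients: average of each adjacent pair (h = [1/2, 1/2])."""
    return [(a[i] + a[i + 1]) / 2 for i in range(0, len(a), 2)]

def G(a):
    """Detail coefficients: half the difference of each pair (g = [1/2, -1/2])."""
    return [(a[i] - a[i + 1]) / 2 for i in range(0, len(a), 2)]

def haar_decompose(a):
    """Apply G and H repeatedly (length must be a power of two).
    Returns the final summary value and the detail arrays, finest first."""
    details = []
    while len(a) > 1:
        details.append(G(a))
        a = H(a)
    return a[0], details
```

For `a = [4, 2, 6, 8]`, `H(a)` is `[3.0, 7.0]` and `G(a)` is `[1.0, -1.0]`; the original array can be rebuilt from the summaries and details, so nothing is lost.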
36
Naive Evaluation of Vector Queries Using Wavelets
  • Hence, vector queries can be computed in the
    wavelet-transformed space as
  • Algorithm
  • Off-line transformation of data vector (or data
    distribution function, i.e., D, to be exact)
  • O(|I| l^d log^d N) for sparse data, O(|I|) = O(N^d)
    for dense data
  • Transform the query vector at submission
  • O(N^d)!
  • Sum up the products of the corresponding elements
    of data and query vectors
  • Retrieving elements of data vector: O(N^d)!

37
Fast Evaluation of Vector Queries Using Wavelets
  • Main intuitions
  • query vector can be transformed quickly because
    most of the coefficients are known in advance
  • Transformed query vector has a large number of
    negligible (e.g., zero) values (independent on
    how well data can be approximated by wavelet)
  • Example: Haar filter, COUNT function on R = [5,12]
    on the domain of integers from 0 to 15

GH^3 a
H^4 a
At each step, you know the zeros
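
The sparsity claim can be checked on the slide's own example with a short sketch. It uses the full Haar decomposition with filters h = [1/2, 1/2] and g = [1/2, -1/2] and simply counts nonzero coefficients; it does not implement the lazy transform itself, and the function names are made up.

```python
def haar_levels(a):
    """Repeatedly apply the Haar G and H operators to a power-of-two array.
    Returns (final summary value, list of detail arrays, finest first)."""
    details = []
    while len(a) > 1:
        pairs = [(a[i], a[i + 1]) for i in range(0, len(a), 2)]
        details.append([(x - y) / 2 for x, y in pairs])
        a = [(x + y) / 2 for x, y in pairs]
    return a[0], details

def count_query_nonzeros(lo, hi, n):
    """Number of nonzero Haar coefficients of the COUNT query on [lo, hi]
    over the integer domain 0..n-1."""
    query = [1.0 if lo <= i <= hi else 0.0 for i in range(n)]
    summary, details = haar_levels(query)
    nonzero = sum(1 for level in details for c in level if c != 0.0)
    return nonzero + (1 if summary != 0.0 else 0)
```

A detail coefficient is nonzero only where a pair straddles a boundary of the range, so each level contributes at most two nonzeros; the count grows with log N rather than N.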
38
Exact Evaluation of Vector Queries
Query: SUM(salary) when (25 < age < 40) and (55k
< salary < 150k)
# of Wavelet Coefficients: 837
# of Nonzero Coordinates: 4380
39
Approximate Evaluation of Vector Queries
40
Optimal Disk Placement for Wavelet Data
  • The goal is to efficiently store wavelet
    coefficients
  • Efficiently means fast access to stored data, low
    I/O complexity, little disk access
  • How to achieve this: create a principle of
    locality of reference
  • Designed for wavelet overlap queries, but can be
    extended for polynomial range-sum queries over
    multidimensional data
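
One way to picture locality of reference for Haar coefficients is to tile the coefficient tree into small subtrees and store each subtree in one block. The sketch below is an assumed scheme for illustration only, not necessarily the placement used in AIMS. It numbers detail coefficients heap-style (root is 1, children of i are 2i and 2i+1); `levels_per_block` is a hypothetical parameter.

```python
def block_of(i, levels_per_block):
    """Block id for detail coefficient i >= 1: the index of the root of
    the subtree tile (of height `levels_per_block`) that contains i."""
    depth = i.bit_length() - 1                            # depth of i in the tree
    tile_top = (depth // levels_per_block) * levels_per_block
    return i >> (depth - tile_top)                        # ancestor at the tile's top level

def blocks_on_path(i, levels_per_block):
    """Distinct blocks touched when reading i and all of its ancestors,
    which is what reconstructing a single data value requires."""
    blocks = []
    while i >= 1:
        b = block_of(i, levels_per_block)
        if b not in blocks:
            blocks.append(b)
        i //= 2
    return blocks
```

With `levels_per_block = 2`, coefficients 1, 2, 3 share block 1 and coefficients 4, 8, 9 share block 4, so reading coefficient 9 and its three ancestors touches two blocks instead of four.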

41
Optimal Disk Placement for Wavelet Data: Discrete
Wavelet Transform
42
SVD Background
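  • The idea of SVD is based on the following theorem
    of linear algebra
  • If matrix A ∈ R^(m×n) has rank k, then there exist
    column-orthonormal matrices U and V such that
    A = U Σ V^T, where U ∈ R^(m×k) and V ∈ R^(n×k),
    and Σ = diag(σ1, ..., σk) is a diagonal matrix
    such that σ1 ≥ σ2 ≥ ... ≥ σk > 0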

43
Weighted-Sum SVD
  • Each data sequence can be represented as a
    matrix, where the columns (r) are the sensors and
    hence their number is fixed
  • The similarity metric of two data sequences is
    defined on square matrices
  • Multiplying each matrix by its transpose gives
    such a square (r × r) matrix and eliminates the
    effect of the two matrices having different
    numbers of rows (i.e., the time dimension)

44
Weighted-Sum SVD
45
Weighted-Sum SVD
46
The Ridge-Climbing Heuristic
  • Procedure
  • Compute the accumulated similarity values (ASVs)
    between the input sequence and all vocabulary
    sequences
  • Keep track of all ASVs
  • For each vocabulary sequence, check whether the
    ASV is monotonically increasing, and whether a
    maximum is reached
  • If yes, put this vocabulary into the candidates
    pool
  • Choose the vocabulary from the candidates pool
    with the largest maximum value
  • Isolate the recognized stream

47
The Ridge-Climbing Heuristic
Assume the database has only three vocabulary
sequences: like, yellow, and I.
Input sequence