DataScope: A Database Content Visualization Tool based on Ranking Queries - PowerPoint PPT Presentation

Provided by: csU70
Learn more at: http://www.cs.uiuc.edu

Title: DataScope: A Database Content Visualization Tool based on Ranking Queries


1
DataScope: A Database Content Visualization Tool based on Ranking Queries
  • CS511 Course Project
  • Tianyi Wu
  • Dec 08, 2006

2
DataScope
  • Motivation
  • Contributions
  • Demonstration
  • Architecture
  • Design & Implementation
  • Future Work

3
DataScope
  • Motivation
  • Contributions
  • Demonstration
  • Architecture
  • Design & Implementation
  • Future Work

4
Motivation
  • Existing database systems
  • SQL query-based
  • Form-based
  • Limited user interface
  • Inconvenient to browse data
  • Existing database visualization
  • Polaris (Stanford)
  • Stolte et al. Query, analysis, and visualization
    of hierarchically structured data using Polaris.
    KDD 2002.
  • DIVE-ON (U of Alberta)
  • Ammoura et al. Towards a novel OLAP interface for
    distributed data warehouses. DaWaK 2001.
  • Maryland
  • Various projects and tools
  • http://www.cs.umd.edu/hcil/research/visualization.shtml

5
Some screenshots (Polaris)
6
Some screenshots
7
Some screenshots (Maryland)
8
Limitations of Existing Work
  • Particular domains
  • Spatial-temporal
  • Time-series
  • Predefined schemas
  • Fixed visual representation
  • Statistical charts (e.g., scatterplots)

9
DataScope
  • Motivation
  • Contributions
  • Demonstration
  • Architecture
  • Design & Implementation
  • Future Work

10
Goal
  • Visualize databases like Google Maps!
  • Content-based
  • Explorative, easy-to-use
  • Dragging (pan), drilling (zoom)…
  • Domain independent
  • Web-based interface
  • Fast query processing

11
Challenges
  • Layout
  • Not well-defined and well-understood
  • Maps: longitude, latitude
  • How to position objects on screen?
  • User preferences on what to see
  • Different users have different preferences
  • Even the same user may have different preferences
    based on the context of the query
  • Data is often associated with multiple
    hierarchies or semantic links
  • Powerful query engine

12
Contribution
  • Interface design
  • Principles which can address the above challenges
  • A system prototype
  • Efficient implementation
  • Ranking-Cube (Xin et al., VLDB06)
  • Ranking Aggregation

13
DataScope
  • Motivation
  • Contributions
  • Demonstration
  • Architecture
  • Design & Implementation
  • Future Work

14
About the Demo
  • Not fully functional yet
  • Ajax vs. PHP
  • Demonstrate important concepts
  • Ranking
  • Efficiency
  • Customization
  • Datasets
  • Real: DBLP (20,387 extracted database-related
    entries)
  • Synthetic: store database

15
DataScope
  • Motivation
  • Contributions
  • Demonstration
  • Architecture
  • Design & Implementation
  • Future Work

16
System Architecture
17
DataScope
  • Motivation
  • Contributions
  • Demonstration
  • Architecture
  • Design & Implementation
  • Future Work

18
DataScope Overview
19
DataScope Overview
20
DataScope Overview
  • Design principles
  • Structured dimensions
  • Ordering of attribute values
  • Selection of comprehensive layout
  • Quick selection
  • Display rich information
  • Easy customization
  • Implementation
  • Linear ranking functions with arbitrary
    selections
  • Ranking on aggregation

21
Design principles
  • Structured dimensions
  • View data in multi-resolution
  • Roll-up and drill-down
  • Can be automatically generated
  • Numeric attributes
  • Age, salary, price
  • Categorical attributes
  • Milk → dairy products → food
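
To make roll-up and drill-down concrete, here is a minimal sketch of how structured dimensions might be represented. The parent table, the bucket widths, and the function names are all hypothetical and chosen for illustration; the slides do not say how DataScope stores or generates its hierarchies.

```python
# Hypothetical categorical hierarchy (child -> parent); illustration only.
CATEGORY_PARENT = {
    "milk": "dairy products",
    "cheese": "dairy products",
    "dairy products": "food",
    "bread": "bakery",
    "bakery": "food",
}


def roll_up(value, levels=1):
    """Climb `levels` steps up the hierarchy, stopping at the root."""
    for _ in range(levels):
        if value not in CATEGORY_PARENT:
            break
        value = CATEGORY_PARENT[value]
    return value


def numeric_bucket(value, width):
    """Return the half-open range [lo, lo + width) that contains `value`."""
    lo = (value // width) * width
    return (lo, lo + width)


def numeric_hierarchy(value, widths=(10, 50, 100)):
    """Generate progressively coarser ranges for one numeric value."""
    return [numeric_bucket(value, w) for w in widths]


print(roll_up("milk"))        # one level up
print(roll_up("milk", 2))     # two levels up
print(numeric_hierarchy(37))  # an age of 37 at three resolutions
```

Drilling down is the inverse lookup: from a range or category back to the finer values it contains.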

22
Design principles
  • Ordering of attribute values
  • How to order values along X/Y axis?
  • Ascending/Descending
  • Alphabetic order (e.g. AAAI, CIKM…)
  • Numeric order (e.g. 2001, 2002…)
  • Independent of any ranking function
  • The order of a value is not determined by its
    score

23
Design principles
  • Selection of comprehensive layout
  • Initial layout
  • High-level, familiar to most users
  • Map: US map
  • DBLP: (AI, Theory, System) × (80s, 90s, 00s)
  • Subsequent layout
  • Can be changed according to different data
  • Customizable

24
Design principles
  • Quick selection
  • Dragging
  • Scrolling the mouse wheel to roll-up/drill-down
    or zoom in/out
  • Push constraint easily
  • Context menu

25
Design principles
  • Display rich information
  • Top-k (as in Google Maps)
  • K-representative items
  • Outliers
  • Display primitives
  • Color, size (e.g. big cities have big font), etc.
  • Searching

26
Design principles
  • Easy customization
  • Users can freely define their own layout
  • X = location, Y = year
  • Adjust the resolution
  • More/less objects on screen
  • Customize ranking function
  • e.g., rank houses by 0.7 × price + 0.3 × size
  • Selection: database conferences and 2003-2006
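
A user-defined linear ranking function combined with a selection can be sketched as follows. This is an illustrative brute-force version, not the prototype's code: the `top_k` helper, the sample houses, the assumption that attributes are normalized to [0, 1], and the choice that lower scores rank first are all assumptions made here.

```python
def top_k(rows, weights, selection=None, k=3, descending=False):
    """Rank rows by a weighted sum of attributes after applying a selection.

    weights:   {attribute: weight}, e.g. {"price": 0.7, "size": 0.3}
    selection: {attribute: set of allowed values}; None means no filter
    """
    selection = selection or {}

    def matches(row):
        return all(row[a] in allowed for a, allowed in selection.items())

    def score(row):
        return sum(w * row[a] for a, w in weights.items())

    candidates = [r for r in rows if matches(r)]
    return sorted(candidates, key=score, reverse=descending)[:k]


# Hypothetical data; price and size are assumed to be normalized to [0, 1].
houses = [
    {"id": "h1", "city": "Urbana", "price": 0.40, "size": 0.70},
    {"id": "h2", "city": "Urbana", "price": 0.90, "size": 0.20},
    {"id": "h3", "city": "Chicago", "price": 0.10, "size": 0.10},
    {"id": "h4", "city": "Urbana", "price": 0.20, "size": 0.30},
]

result = top_k(houses, {"price": 0.7, "size": 0.3},
               selection={"city": {"Urbana"}}, k=2)
print([h["id"] for h in result])
```

Scanning and sorting every matching row is fine for a sketch but scales poorly, which is the motivation for an index-based approach in the implementation.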

27
Implementation
  • Ranking-Cube
  • Xin et al. Answering top-k queries with
    multi-dimensional selections (VLDB06)
  • Linear ranking functions
  • Arbitrary selections
  • Methods
  • Partition the data space and store blocks
  • Progressively retrieve the most promising blocks
    for each query
  • Data fragments
  • Partial materialization to deal with high
    dimensionality
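
The "partition, then progressively retrieve" idea can be illustrated with a deliberately simplified sketch. It is not the Ranking-Cube data structure from Xin et al.: it assumes two ranking attributes normalized to [0, 1], a uniform grid, non-negative weights, a query that asks for the k lowest scores, and no selection conditions. All function names are hypothetical.

```python
import heapq
from collections import defaultdict


def build_blocks(points, cells_per_dim):
    """Partition [0, 1]^2 into a uniform grid and store each point in its block."""
    blocks = defaultdict(list)
    for p in points:
        key = tuple(min(int(x * cells_per_dim), cells_per_dim - 1) for x in p)
        blocks[key].append(p)
    return blocks


def top_k_min(blocks, cells_per_dim, weights, k):
    """Return (the k lowest-scoring points, the number of blocks read)."""
    def score(p):
        return sum(w * x for w, x in zip(weights, p))

    # With non-negative weights, the score of a block's lower corner is a
    # lower bound on the score of every point stored in that block.
    frontier = [(score([c / cells_per_dim for c in key]), key) for key in blocks]
    heapq.heapify(frontier)

    best = []  # min-heap of (-score, point); best[0] is the current k-th best
    blocks_read = 0
    while frontier:
        bound, key = heapq.heappop(frontier)
        if len(best) == k and -best[0][0] <= bound:
            break  # no unread block can improve on the current top-k
        blocks_read += 1
        for p in blocks[key]:
            item = (-score(p), p)
            if len(best) < k:
                heapq.heappush(best, item)
            elif item > best[0]:
                heapq.heapreplace(best, item)
    return [p for _, p in sorted(best, reverse=True)], blocks_read


points = [(0.05, 0.10), (0.15, 0.05), (0.60, 0.70),
          (0.90, 0.20), (0.30, 0.80), (0.55, 0.55)]
blocks = build_blocks(points, cells_per_dim=4)
answer, blocks_read = top_k_min(blocks, 4, weights=(0.7, 0.3), k=2)
print(answer, blocks_read)
```

In this example the most promising block already contains two points whose scores are below the lower bound of every other block, so the search stops after reading a single block.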

28
Ranking on Aggregation
  • Example
  • Given a relation (conference, year, author,
    paper)
  • Query
  • SELECT top k COUNT(author)
  • FROM R
  • GROUP BY conference, year
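
The query above is written in slide shorthand. One runnable reading of it, using Python's built-in `sqlite3` module, is sketched below; the `ORDER BY ... LIMIT` form, the tie-breaking order, and the sample rows are assumptions added for illustration.

```python
import sqlite3

# Hypothetical rows of R(conference, year, author, paper).
ROWS = [
    ("CIKM", 2003, "a1", "p1"),
    ("CIKM", 2003, "a2", "p1"),
    ("CIKM", 2003, "a3", "p2"),
    ("AAAI", 2001, "a1", "p3"),
    ("AAAI", 2001, "a4", "p3"),
    ("AAAI", 2002, "a5", "p4"),
]


def top_k_groups(rows, k):
    """Return the k (conference, year) groups with the largest author counts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE R (conference TEXT, year INTEGER, author TEXT, paper TEXT)"
    )
    conn.executemany("INSERT INTO R VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT conference, year, COUNT(author) AS n
        FROM R
        GROUP BY conference, year
        ORDER BY n DESC, conference, year
        LIMIT ?
    """
    result = conn.execute(query, (k,)).fetchall()
    conn.close()
    return result


print(top_k_groups(ROWS, 2))
```

`COUNT(author)` counts one per row, so an author with several papers in the same group is counted several times; `COUNT(DISTINCT author)` would count each author once.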

29
Ranking on Aggregation
  • Method
  • Materialization for all possible cuboids
  • Algorithm
  • Input: aggregation dimension D, ranking
    dimensions R, concept hierarchies H.
  • Output: a set of ranking fragments S
  • 1) For each possible group-by of R and H
  • 2) Compute aggregation for each value in D
  • 3) Compute ranking fragments for D
  • 4) S ← S ∪ (ranking fragments for D)
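
The materialization loop can be sketched in a much-simplified form. Here a "ranking fragment" is reduced to a list of groups sorted by their aggregate, the aggregate is a plain row count, and hierarchies are given as functions that map a base value to a coarser one. These simplifications and the `materialize` helper are assumptions for illustration; the slides do not specify the actual fragment layout.

```python
from collections import Counter
from itertools import combinations, product


def materialize(rows, ranking_dims, hierarchies):
    """Precompute a sorted list of (group, count) for every group-by.

    hierarchies: {dimension: {level name: function from base value to level}}
    Returns {group-by spec: [(group key, count), ...]}, largest count first.
    """
    # Each dimension can be grouped at its base level or at any coarser level.
    views = {
        d: [(d, lambda v: v)]
        + [(f"{d}:{name}", fn) for name, fn in hierarchies.get(d, {}).items()]
        for d in ranking_dims
    }
    fragments = {}
    for size in range(1, len(ranking_dims) + 1):
        for dims in combinations(ranking_dims, size):          # step 1
            for choice in product(*(views[d] for d in dims)):
                counts = Counter(                               # step 2
                    tuple(fn(row[d]) for d, (_, fn) in zip(dims, choice))
                    for row in rows
                )
                spec = tuple(name for name, _ in choice)
                fragments[spec] = sorted(                       # steps 3 and 4
                    counts.items(), key=lambda kv: (-kv[1], kv[0])
                )
    return fragments


# Hypothetical data: one row per (conference, year, author) record.
rows = [
    {"conference": "CIKM", "year": 2003, "author": "a1"},
    {"conference": "CIKM", "year": 2003, "author": "a2"},
    {"conference": "CIKM", "year": 2004, "author": "a3"},
    {"conference": "AAAI", "year": 2001, "author": "a1"},
    {"conference": "AAAI", "year": 1999, "author": "a4"},
]
hierarchies = {"year": {"decade": lambda y: y // 10 * 10}}
fragments = materialize(rows, ["conference", "year"], hierarchies)
print(fragments[("conference", "year:decade")])
```

With everything precomputed, a top-k aggregation query becomes a prefix read of the matching list. The cost is storage that grows with the number of dimension and level combinations.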

30
DataScope
  • Motivation
  • Contributions
  • Demonstration
  • Architecture
  • Design & Implementation
  • Future Work

31
Conclusion
  • DataScope
  • Extend the current prototype to support mapping
    operations and multiple sessions
  • Improve design principles which can lead to a
    more effective interface
  • Support various ranking queries efficiently

32
Future work
  • Interface
  • Improve the initial system prototype
  • Support the full set of operations
  • Support easy customization
  • Implementation
  • Rich research issues
  • Ranking objects based on user feedback
  • Retrieving the most relevant objects in keyword
    search
  • Multiple types of ranking queries

33
Thank you!
  • Any questions?