Chapter 7: Data Warehousing - PowerPoint PPT Presentation

1 / 17
About This Presentation
Title:

Chapter 7: Data Warehousing

Description:

Fit cube and roll-up operators in SQL. Define new aggregate functions for cubes ... Honda. 15. black. 20. red. 2001. 12. black. 10. red. 2000. Ford. Sales ... – PowerPoint PPT presentation

Number of Views:152
Avg rating:3.0/5.0
Slides: 18
Provided by: tee7
Category:

less

Transcript and Presenter's Notes

Title: Chapter 7: Data Warehousing


1
  • Chapter 7 Data Warehousing
  • Title Data Cube - A Relational Aggregation
    Operator Generalizing Group-By, Cross-Tab, and
    Sub-Totals
  • Authors J. Gray, S. Chaudhuri, A. Bosworth, A.
    Layman, D. Reichart, M. Venkatrao, F. Pellow, and
    H. Pirahesh
  • (Microsoft and IBM Research Centers)
  • Reviewers G10
  • Kuo-Wei Hsu Amanuel Godefa

2
Data Cube - A Relational Aggregation Operator
Generalizing Group-By, Cross-Tab, and Sub-Totals
  • Outline
  • Motivation and Background
  • Problem Statement
  • Contributions
  • Key Concepts
  • Methodology
  • Assumptions
  • Rewrite Today

3
Motivation and Background
  • Need of better tool or operator
  • achieve analysis on large and complex datasets,
    e.g., data warehousing
  • HISTOGRAM
  • Statistical summary of groups and subgroups
    sub-totals totals
  • rollup drilldown reports

4
Problem Statement
  • Given
  • Multi-dimensional data or datasets.
  • Find
  • Summation/aggregation functions over one or more
    dimensions.
  • Objective
  • Provide a new operators
  • Description
  • SQL on decision making on multi-dimensional data
  • Relates sub-totals and totals for drill-down and
    roll-ups reports
  • And Shortcomings of GROUP BY

5
Contributions
  • Enhance traditional relational DBs
    multi-dimensional data
  • ability to support operations over N-dimensional
  • Improve 2D flat files (SQL tables) model with an
    N-dimensional problem

6
Contributions
  • Fit cube and roll-up operators in SQL
  • Define new aggregate functions for cubes
  • efficient techniques to compute cube
  • Improve SQL Standard.

7
Key Concepts
  • How to obtain the data cube / how to implement
    CUBE and ROLLUP operations
  • Recursive algorithm / recursively computes
    aggregates with referencing sub-aggregates
  • Categorize of aggregate functions
  • Distributive (e.g. count, min, max, sum,)
  • Algebraic (e.g. average, variance,)
  • Holistic (e.g. median, mode)

8
Aggregate FunctionDistributive( F ) if there
exists a G such that F( Xi,j ) G ( F ( Xi,j )
) where i1,n j1,..m
  • e.g.1 Max()
  • 2 Sum()

9
Aggregate Function.Algebraic( F ) if there
exists a G1(),,Gp() and H() such that F( Xi,j )
H(Gq( F ( Xi,j ) ) i1,n j1,..m
q1pe.g. Average
10
Aggregation up to 3-Dimension
From Data Cube- fig 3
11
Example
12
Example
  • SELECT Model, Year,
  • SUM (sales) AS YTD_Sales
  • FROM autoSales
  • GROUP BY Model, Year
  • WITH CUBE

13
CUBE
SELECT Model, Year, Color, Sum(Sales) AS
SalesFROM SalesWHERE Model in Ford,
ChevyAND Year BETWEEN 1990 AND 1992GROUP BY
CUBE Model, Year, Color
Example CUBE Operator Source Data CUBE
14
Validation Methodology
  • Theoretical Analysis,
  • e.g., the three types of aggregate functions
  • Practical Examples
  • Case Studies,
  • e.g., Microsoft SQL server

15
Assumptions
  • Relational data model
  • The size of the data
  • Normalized tables
  • Multi - Dimensional

16
To rewrite the paper
  • Memory size doesnt need to be focused
  • utilize memory without partitioning datasets
    (participating tables)
  • Warehousing in a distributed environment.
  • Heterogeneous DBMSs
  • normalized locally but unnormalized globally
  • ROLLUP and CUBE are SQL extensions
  • SQL 6.5 and Oracle 8i (above).

17
References
  • (T7.3) J. Gray, S. Chaudhuri, A. Bosworth, A.
    Layman, D. Reichart and M. Venkatrao, DataCube A
    Relational Aggregation Operator Generalizing
    Group-by, Cross-tab and Sub-Totals, Data Mining
    and Knowledge Discovery 1(1), 1997, 29-53.
  • SQL Server 2005 Books Online, Microsoft
    Corporation
  • Lecture note from Prof. S. Shekharwww.cs.umn.edu/
    research/shashi-group/fbkof_datacube.ppt
Write a Comment
User Comments (0)
About PowerShow.com