Adv. DBMS Data Warehouse CSC5301 - PowerPoint PPT Presentation

About This Presentation
Title:

Adv. DBMS Data Warehouse CSC5301

Description:

'Subject-oriented, integrated, time-variant, nonvolatile collection of data in ... Seasonal adaptions, e.g. when to produce how many skis, bikinis, convertibles, ... – PowerPoint PPT presentation

Number of Views:285
Avg rating:3.0/5.0
Slides: 16
Provided by: university73
Category:

less

Transcript and Presenter's Notes

Title: Adv. DBMS Data Warehouse CSC5301


1
Adv. DBMSData WarehouseCSC5301
  • Review
  • Hachim Haddouti

2
Do You Remember?
DSS
Data cube
MD
OLTP
RollUp
drill down
Slice/dice
MD
ROLAP
Star schema
MOLAP
Data mining
Data extraction
Fact table
3
Data Warehouses
  • Subject-oriented, integrated, time-variant,
    nonvolatile collection of data in support of
    managements decision-making process Inmon (AP
    analytical processing is missing)
  • Used for analysis of existing data
  • Resolves performance issues suffered by
    operational RDBMSs and OLTPs

4
Sizing DW?
  • Mining of mobile phone calls
  • (Caller, Callee, Time, Duration, Geogr.
    Location) 100 B/tuple
  • In Germany
  • 107 users 10 calls/(dayuser) 100 B/call
  • 1010 B/day 31012 B/year 3 TB/year
  • Scanning data at 107 B/s takes
  • 31012/107 3105 s gt 3 days

5
Data Warehouse Architecture
6
Data model
  • MD Model
  • better performance
  • Better data organisation
  • Better visualization
  • Business queries (why, what if)
  • ER Model
  • a disaster for querying a huge amount of data
    (time)
  • not understandable for users and they can not be
    navigated usefully by DBMS software.
  • hard to visualize many possible connections
    between tables,
  • To avoid redundancy

7
Typical DWH Analyses/Queries
  • What are the consequences of new orders for
    production capacity w.r. to investment,
    personnel, maintenance, extra hours, ...
  • Seasonal adaptions, e.g. when to produce how many
    skis, bikinis, convertibles, ...
  • Influence of external financing on profits

8
  • Operations
  • aggregation
  • slice
  • dice (cube)
  • rollup to coarser level
  • drill down to more detailed level
  • grouping
  • sorting

9
Data Cube Representation
10
Steps to build a DWH
  • Acquisition of data
  • Data cleansing
  • Storage
  • Processing AP
  • Maintenance, ...
  • Not possible with classical DB-technology alone

11
On-Line Analytical Processing
  • OLTP (online transaction processing) for
    operational data of enterprise, e.g. in
    relational DBMS, IMS, SAP/R3, ...
  • DSS Decision Support System to store
    data/information for strategic management
    decisions aggregations, summaries, etc.
  • Optimized to work with data warehouses
  • Used to answer questions
  • Allows users to perceive data as a
    multidimensional data cube
  • Data mining

12
OLTP versus OLAP
  • Thematic focus
  • OLTP many small transactions (microscopic view
    of business processes, individual steps at lowest
    level, single order, delivery)
  • OLAP finances in general, personnel in general,
    ...
  • OLAP requires integration and unification of many
    detailed data into big picture
  • Time orientation
  • Durability data extracted once, no updates

13
Technical Comparison OLTP vs OLAP
  • OLTP high rate of updates, several thousand t/s
  • OLAP read only transactions, very complex, DWH
    is loaded at certain time intervals, e.g. after
    the end of the month, quarter
  • Compute intensive
  • Special systems with new access methods, e.g.
    multidimensional data organization and access
    methods
  • Special OLAP systems necessary to offload OLTP
    systems

14
ROLAP and MOLAP
  • Solution 1 ROLAP relational online analytical
    processing, built on top of relational DBS,
    additional middleware or client front end (star
    schema)
  • Solution 2 MOLAP multidimensional online
    analytical processing
  • new model
  • new data organizations
  • new algorithms
  • new query languages
  • new optimization techniques

15
  • DW Review
  • degenerate dimension
  • big dimensions
  • hierarchies
  • snow falcking
  • Slowly changing dimensions
  • dirty dimensions
  • Hetegrogeneous prodcuts (core and custom)
  • Factless Fact table
Write a Comment
User Comments (0)
About PowerShow.com