Chapter 15 Data Warehousing, OLAP, and Data Mining - PowerPoint PPT Presentation

Loading...

PPT – Chapter 15 Data Warehousing, OLAP, and Data Mining PowerPoint presentation | free to download - id: 90e05-ZmZjO



Loading


The Adobe Flash plugin is needed to view this content

Get the plugin now

View by Category
About This Presentation
Title:

Chapter 15 Data Warehousing, OLAP, and Data Mining

Description:

Data, data, data...everywhere! Information...that's ... is a smaller, more focused Data Warehouse a mini-warehouse. ... a maximum daily withdrawal of $200 ... – PowerPoint PPT presentation

Number of Views:1206
Avg rating:3.0/5.0
Slides: 39
Provided by: ronn161
Learn more at: http://www.data-miners.com
Category:

less

Write a Comment
User Comments (0)
Transcript and Presenter's Notes

Title: Chapter 15 Data Warehousing, OLAP, and Data Mining


1
Chapter 15 Data Warehousing, OLAP, and Data Mining
2
Introduction
  • Data, data, data…everywhere!
  • Information…thats another story!
  • Especially, the right information _at_ the right
    time!
  • Data warehousings goal is to make the right
    information available _at_ the right time
  • Data warehousing is a data store (eg., a database
    of some sort) and a process for bringing together
    disparate data from throughout an organization
    for decision-support purposes

3
Introduction
  • Data warehouses are natural allies for data
    mining (work together well)
  • Data mining can help fulfill some of the goal of
    data warehouses right information _at_ the right
    time
  • Relational database management systems (RDBMS),
    such as Oracle, DB2, Sybase, Informix, Focus, SQL
    Server, etc. are often used for data warehousing

4
Definitions of a Data Warehouse
A subject-oriented, integrated, time-variant and
non-volatile collection of data in support of
management's decision making process
1.
- W.H. Inmon
A copy of transaction data, specifically
structured for query and analysis
2.
- Ralph Kimball
5
Data Warehouse
  • For organizational learning to take place, data
    from many sources must be gathered together and
    organized in a consistent and useful way hence,
    Data Warehousing (DW)
  • DW allows an organization (enterprise) to
    remember what it has noticed about its data
  • Data Mining techniques make use of the data in a
    Data Warehouse

6
Data Warehouse
Enterprise Database
Customers
Orders
Transactions
Vendors
Etc…
Etc…
  • Data Miners
  • Farmers they know
  • Explorers - unpredictable

Copied, organized summarized
Data Warehouse
Data Mining
7
Data Warehouse
  • A data warehouse is a copy of transaction data
    specifically structured for querying, analysis,
    reporting, and more rigorous data mining
  • Note that the data warehouse contains a copy of
    the transactions which are not updated or changed
    later by the transaction system
  • Also note that this data is specially structured,
    and may have been transformed when it was copied
    into the data warehouse

8
Data Mart
  • A Data Mart is a smaller, more focused Data
    Warehouse a mini-warehouse.
  • A Data Mart typically reflects the business rules
    of a specific business unit within an enterprise.

9
Data Warehouse to Data Mart
Decision Support Information
Data Warehouse
Decision Support Information
Decision Support Information
10
Generic Architecture of Data
(synonym) Transaction data
11
Transaction (Operational) Data
  • Operational (production) systems create (massive
    number of) transactions, such as sales,
    purchases, deposits, withdrawals, returns,
    refunds, phone calls, toll roads, web site
    hits, etc…
  • Transactions are the base level of data the raw
    material for understanding customer behavior
  • Unfortunately, operational systems change due to
    changing business needs
  • Fortunately, operational systems can usually be
    changed to support changing business needs
  • Data warehousing strategies need to be aware of
    operational system changes

12
Operational Summary Data
Summaries are for a specific time period and
utilize the transaction data for that time period
Other Examples???
13
Decision Support Summary Data
  • The data that are used to help make decisions
    about the business
  • Financial Data, such as
  • Income Statements (Profit Loss)
  • Balance Sheets (Assets Liabilities Net Worth)
  • Sales summaries
  • Other examples???
  • Data warehouses maintain this type of data,
    however financial data of record (for audit
    purposes) usually comes from databases and not
    the data warehouse (confusing???)
  • Generally, it is a bad idea to use the same
    system for analytic and operational purposes

14
Database Schema
  • Database schema defines the structure of data,
    not the values of the data (e.g., first name,
    last name structure Ron Norman values of the
    data)
  • In RDBMS
  • Columns fields attributes (A,B,C)
  • Rows records tuples (1-7)

15
Logical Physical Database Schema
  • Describes data in a way that is familiar to
    business users
  • Describes the data the way it will be stored in
    an RDBMS which might be different than the way
    the logical shows it

16
Metadata
  • General definition Data about data !!!
  • Examples
  • A librarys card catalog (metadata) describes
    publications (data)
  • A file system maintains permissions (metadata)
    about files (data)
  • A form of system documentation including
  • Values legally allowed in a field (e.g., AZ, CA,
    OR, UT, WA, etc.)
  • Description of the contents of each field (e.g.,
    start date)
  • Date when data were loaded
  • Indication of currency of the data (last updated)
  • Mappings between systems (e.g., A.this B.that)
  • Invaluable, otherwise have to research to find it

17
Business Rules
  • Highest level of abstraction from operational
    (transaction) data
  • Describes why relationships exist and how they
    are applied
  • Examples
  • Need to have 3 forms of ID for credit
  • Only allow a maximum daily withdrawal of 200
  • After the 3rd log-in attempt, lock the log-in
    screen
  • Accept no bills larger than 20
  • Others???

18
General Architecture for Data Warehousing
  • Source systems
  • Extraction, (Clean), Transformation, Load (ETL)
  • Central repository
  • Metadata repository
  • Data marts
  • Operational feedback
  • End users (business)

19
Where does OLAP fit in?
20
OLAP Overview
  • Interactive, exploratory analysis of
    multidimensional data to discover patterns

21
OLAP Architecture
22
Server Options
  • Single processor
  • Symmetric multiprocessor (SMP)
  • Massively parallel processor (MPP)

23
OLAP Server Options
  • ROLAP (Relational)
  • MOLAP (Multidimensional)
  • HOLAP (Hybrid)

24
OLAP Online Analytical Processing
  • A definition
  • Data representation is in the form of a CUBE
  • OLAP goes beyond SQL with its analysis
    capabilities
  • Key feature of OLAP Relevant multi-dimensional
    views such as products, time, geography

25
OLAP Cube - 1
26
OLAP Cube - 2
27
OLAP Cube - 3
  • Star Structure (quite common)

28
OLAP Cube - 4
The Cube
29
OLAP Cube - 5
Three- Dimensional Cube Display
30
OLAP Cube - 6
Six- Dimensional Cube
31
Rotation (Pivot Table)
32
Drill Down
33
OLAP Examples
  • http//perso.wanadoo.fr/bernard.lupin/english/exam
    ple.htm
  • Excel Pivot Table example (similar to OLAP cube)

34
Sample of OLAP products
Just a snippet from http//www.olapreport.com/Prod
uctsIndex.htm not an endorsement
35
Data Mining versus OLAP
36
Data Mining versus OLAP
  • OLAP - Online Analytical Processing
  • Provides you with a very good view of what is
    happening, but can not predict what will happen
    in the future or why it is happening

37
Results of Data Mining Include
  • Forecasting what may happen in the future
  • Classifying people or things into groups by
    recognizing patterns
  • Clustering people or things into groups based on
    their attributes
  • Associating what events are likely to occur
    together
  • Sequencing what events are likely to lead to
    later events

38
End of Chapter 15
About PowerShow.com