Building the Data Warehouse Chapter 2 Team 1 - PowerPoint PPT Presentation

Provided by: cltAs
1
Building the Data Warehouse: Chapter 2, Team 1
  • Jim McGinnis
  • Kris Williams
  • Matt Crook

2
Building the Data Warehouse: Chapter 2, Team 1
  • The Data Warehouse Environment
  • The Structure of the Data Warehouse
  • The data warehouse is subject oriented
  • Classical operational systems are organized around
    applications
  • Registration, Emergency, Nursing, Management
  • Subject areas of a data warehouse may be product,
    order, raw goods, bill of material, SKU, sale,
    vendor
  • Each company will have a unique set of subjects

3
Building the Data Warehouse: Chapter 2
  • The Structure of the Data Warehouse (cont.)
  • The data warehouse is integrated
  • Integration is the most important aspect
  • Data has a single physical corporate image
  • As data is input into the data warehouse, it is
    converted, formatted, re-sequenced, and summarized
  • Inconsistent encodings from the source
    applications are undone (M/F, I/O, Y/N)
  • Data decisions made by previous designers are
    corrected
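
The conversion step on this slide can be sketched in a few lines. This is only an illustration, not the chapter's method: the field names (customer_id, gender, active) and the mapping tables are assumptions chosen to mirror the M/F and Y/N examples above.

```python
# Hypothetical source encodings; every name and mapping here is an
# illustrative assumption, not something defined in the chapter.
GENDER_MAP = {"m": "M", "f": "F", "male": "M", "female": "F"}
FLAG_MAP = {"y": True, "n": False, "yes": True, "no": False}


def normalize(value, mapping):
    """Return the warehouse encoding for a source value, or raise if unknown."""
    key = str(value).strip().lower()
    if key not in mapping:
        raise ValueError(f"unmapped source value: {value!r}")
    return mapping[key]


def integrate(record):
    """Convert one source record (a dict) to the single warehouse format."""
    return {
        "customer_id": int(record["customer_id"]),
        "gender": normalize(record["gender"], GENDER_MAP),
        "active": normalize(record["active"], FLAG_MAP),
    }


# Two applications that encode the same facts differently.
app_a = {"customer_id": "17", "gender": "male", "active": "Y"}
app_b = {"customer_id": 18, "gender": "F", "active": "no"}
warehouse_rows = [integrate(r) for r in (app_a, app_b)]
```

Raising on an unmapped value, rather than passing it through, is one way to keep a new inconsistency from slipping into the single corporate image unnoticed.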

4
Building the Data Warehouse: Chapter 2
  • The data warehouse is nonvolatile
  • Operational data is regularly accessed,
    manipulated, and updated as a matter of
    course
  • Data warehouse data
  • Loaded, accessed, but not updated
  • Does not change over time

5
Building the Data Warehouse: Chapter 2
  • Data warehouse data is time variant
  • Every unit of data is accurate as of some point in
    time
  • Records are time stamped
  • Records have a transaction date
  • Operational databases contain current-value data
    valid as of the moment of access
  • Data warehouse data is nothing more than a
    sophisticated series of snapshots, each taken at
    one moment in time
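
A minimal sketch of the snapshot idea, combining the nonvolatile and time-variant properties. The class and field names (Snapshot, SnapshotStore, account_id, balance) are hypothetical. Records are only ever appended, each carries its own date, and a query asks for the value as of a given moment.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)  # frozen: a loaded snapshot cannot be modified
class Snapshot:
    account_id: int
    balance: float
    as_of: date  # the moment at which this record was accurate


class SnapshotStore:
    """Append-only store: data is loaded and accessed, never updated."""

    def __init__(self):
        self._rows = []

    def load(self, snapshot):
        self._rows.append(snapshot)

    def as_of(self, account_id, when):
        """Return the latest snapshot taken on or before `when`, or None."""
        candidates = [
            s for s in self._rows
            if s.account_id == account_id and s.as_of <= when
        ]
        return max(candidates, key=lambda s: s.as_of, default=None)


store = SnapshotStore()
store.load(Snapshot(1, 100.0, date(2024, 1, 31)))
store.load(Snapshot(1, 250.0, date(2024, 2, 29)))
```

An operational system would overwrite the balance in place. Here both values survive, so a question about mid-February still gets the January snapshot.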

6
Building the Data Warehouse: Chapter 2
  • The Structure of the Data Warehouse
  • Levels of detail in the data warehouse
  • Older level of detail
  • Current level of detail
  • Level of lightly summarized data
  • Level of highly summarized data

7
Building the Data Warehouse: Chapter 2
  • Subject orientation: the data warehouse is oriented
    to the major subject areas of the corporation
  • A subject area is a series of related tables,
    related by a common key (such as customer ID),
    that may reside on different media
  • Typical areas, each with a hospital counterpart
  • Customer: Patient ID
  • Product: Diagnosis
  • Transaction or activity: CT Scan
  • Claim: Treatment
  • Account: Insurance

8
Building the Data Warehouse: Chapter 2
  • Data Locations
  • Different areas of storage for the same warehouse
  • Two most popular media for data
  • DASD (Direct Access Storage Device)
  • Does not have to be accessed through a serial file
  • Magnetic tape
  • Serial file storage
  • Other popular media
  • Fiche: storage for detailed records that never need
    to be reproduced in electronic media (patient
    records older than twenty years)
  • Optical disk: cheap, easy to use, and stores
    large amounts of data

9
Questions
  • Jim
  • What are the four aspects of data warehousing from
    Chapter 2?

10
Day 1-Day n Phenomenon
  • Day 1: there are many legacy systems,
    essentially doing operational, transactional
    processing.
  • Day 2: the first few tables of the first subject
    area of the data warehouse are populated.
  • Day 3: more of the data warehouse is populated
    and more users begin to come.

11
  • Day 4: some of the data that resided in the
    operational environment is now properly placed in
    the data warehouse. The warehouse is discovered as
    a source for analytical processing.
  • Day 5: departmental databases (data marts or
    OLAP) start to blossom.
  • Day 6: more departmental systems join in.
  • Day n: the architecture is fully developed.

12
Granularity
  • It is the single most important aspect of design
    of a data warehouse.
  • It refers to the level of detail of the units of
    data in the data warehouse.
  • The more detail there is, the lower the level of
    granularity.
  • The less detail there is, the higher the level of
    granularity.

13
  • Granularity is a major design issue in the data
    warehouse environment because it profoundly
    affects the volume of data that resides in the
    data warehouse and the type of query that can be
    answered. The volume of data in a warehouse is
    traded off against the level of detail of a
    query.
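
The trade-off can be made concrete with a small sketch. The sales rows and the summarize helper below are hypothetical. Raising the level of granularity (one row per customer per month instead of one row per sale) shrinks the data but removes the ability to answer questions about individual sales.

```python
from collections import defaultdict

# Hypothetical low-granularity (high-detail) data: one row per sale.
detail = [
    {"customer": "A", "month": "2024-01", "amount": 20.0},
    {"customer": "A", "month": "2024-01", "amount": 35.0},
    {"customer": "A", "month": "2024-02", "amount": 10.0},
    {"customer": "B", "month": "2024-01", "amount": 50.0},
]


def summarize(rows):
    """Raise the granularity: one row per (customer, month)."""
    totals = defaultdict(lambda: {"count": 0, "total": 0.0})
    for row in rows:
        key = (row["customer"], row["month"])
        totals[key]["count"] += 1
        totals[key]["total"] += row["amount"]
    return dict(totals)


summary = summarize(detail)

# The summary holds fewer rows than the detail and can still answer
# "how much did customer A spend in January?". It can no longer answer
# "was there a single sale of exactly 35.00?"; only the detail can.
```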

14
Benefits of Granularity
  • Being able to look at data in many ways. (using
    the same data to satisfy the needs of different
    departments)
  • Flexibility (altering the look of the data by
    department)
  • History of activities and events across the
    corporation.
  • Future benefits related to changes

15
Kris's Question
  • What is the single most important aspect of
    designing a data warehouse and what does it refer
    to?

16
Dual Levels of Granularity
  • Exists when a company maintains more than one
    warehouse
  • Two types of data
  • Lightly Summarized Data
  • True Archival Detail Data

17
Lightly Summarized Data
  • Detail Data is summarized to a small extent
  • Summarized into fields that are likely to be used
    for DSS analysis
  • Significantly less data volume than true
    archival data
  • Limited level of detail

18
True Archival Level of Data
  • All detail data is stored
  • Large amounts of data
  • Storage media
  • Magnetic tapes
  • Bulk storage medium

19
Reasons for Dual Levels of Granularity
  • Only 5% or less of the time will a request
    require the use of true archival data
  • A designer can add more fields at the Lightly
    Summarized data level as needed
  • Reduces query times
  • Increases efficiencies
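
One way to picture how the two levels work together is a small router that sends a request to the lightly summarized level whenever that level holds every field the request needs, and falls back to true archival data otherwise. The field names and the choose_level function are assumptions for illustration only.

```python
# Hypothetical set of fields kept at the lightly summarized level.
SUMMARY_FIELDS = {"customer", "month", "total", "count"}


def choose_level(requested_fields):
    """Pick the cheapest level that can answer the request."""
    if set(requested_fields) <= SUMMARY_FIELDS:
        return "lightly_summarized"
    return "true_archival"


# Most requests stay on the small, fast level.
common_request = choose_level(["customer", "total"])
# A request for a field kept only in detail needs the archival level.
rare_request = choose_level(["customer", "amount"])
```

If a field keeps forcing requests down to the archival level, adding it to SUMMARY_FIELDS corresponds to the designer adding a field at the lightly summarized level as needed.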

20
  • A single level of data should only be attempted
    when a shop has a relatively small amount of data
    in the data warehouse.

21
Living Sample Database
  • Hybrid form of the data warehouse
  • Refers to a subset of either the True Archival
    data or Lightly Summarized data
  • Needs to be periodically refreshed

22
Limitations
  • Not a general-purpose database
  • Random selection of data
  • Introduces a degree of error into the analysis

23
Advantages
  • Good for statistical analysis and looking for
    trends
  • Resources required for analysis are reduced
  • Iterative processes can be used to reduce the
    error (due to sampling) in calculations and
    analysis
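
A rough sketch of a living sample, assuming simple random selection and a numeric measure. The function names, the population, and the sample sizes are all hypothetical. It shows the sampling error that such a database introduces and how a larger sample reduces it.

```python
import random
import statistics


def living_sample(population, size, seed):
    """Draw a random subset; re-run periodically to refresh the sample."""
    rng = random.Random(seed)
    return rng.sample(population, size)


def estimate_mean(sample):
    """Return the sample mean and its standard error."""
    mean = statistics.fmean(sample)
    std_error = statistics.stdev(sample) / len(sample) ** 0.5
    return mean, std_error


# Stand-in for a large warehouse table: 10,000 numeric values.
population = list(range(1, 10001))

small = living_sample(population, 100, seed=1)
large = living_sample(population, 1600, seed=1)

small_mean, small_err = estimate_mean(small)
large_mean, large_err = estimate_mean(large)
```

Both estimates are approximations of the true mean. The larger sample costs more to analyze but carries a smaller standard error, which is the kind of trade an analyst can iterate on.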

24
Matt's Question
  • Name and describe the two types of data in a Dual
    Level Granularity system.