ISQS 6339, Business Intelligence Creating Data Marts - PowerPoint PPT Presentation

1 / 29
About This Presentation
Title:

ISQS 6339, Business Intelligence Creating Data Marts

Description:

How does a data mart look like in SQL Server 2005. A Cube. There are a number of approaches ... Typically flags or indictors that describe or categorize the ... – PowerPoint PPT presentation

Number of Views:198
Avg rating:3.0/5.0
Slides: 30
Provided by: zlinB
Category:

less

Transcript and Presenter's Notes

Title: ISQS 6339, Business Intelligence Creating Data Marts


1
ISQS 6339, Business IntelligenceCreating Data
Marts
  • Zhangxi Lin
  • Texas Tech University

2
Why need Data Mart?
  • Data mart complements the centralized data
    warehousing based on UDM model, for the
    situations where UDM cannot be used
  • Legacy databases
  • Data are from nondatabase sources
  • No physical connection the centralized data
    warehouse
  • Data are not clean

3
Data Mart Structures
  • Fact tables
  • Measures
  • Dimension tables
  • Dimensions and Hierarchies
  • Attributes (or columns)
  • Dimensional modeling Stars and Snowflakes

4
The Three Fact Table Types
  • Transaction fact table
  • Periodic snapshot fact table
  • Cumulative performance over specific time
    intervals valuable to combine data across
    several business processes in the value chain.
  • Accumulating snapshot fact table
  • Constantly updated over time.

4
5
Measures
  • A numeric quantity expressing some of the
    organizations performance. The information
    represented by this quantity is used to support
    or evaluate the decision making and performance
    of the organization.
  • A measure is also called a fact
  • The table holding measure information is called
    as a fact table

6
Attributes
  • An additional piece of information pertaining to
    a dimension member that is not the unique
    identifier or the description of the member.
  • Attributes can be used to more fully describe
    dimension members

7
Slowly Changing Dimensions
  • The attribute values in a dimension may change
    over time, which are critical to understand the
    dynamics of the business. The ability to track
    the changes of facts over time is critical to a
    DW/BI system.
  • Employees changed their departments
  • Home moving (16.8 American moved per year) zip
    code changes possible.
  • The dimensions that have changeable attribute
    values slowly changing dimensions (SCDs)
  • Type 1 SCD overwrites the existing attribute
    value with a new value. You dont care about
    keeping track of historical values
  • Type 2 SCD change tracking ETL process creates
    a new row in the dimension table to capture the
    new values of the changed item
  • Type 3 SCD Similar to Type 2 SCD but only track
    current state and the original state two
    additional attribute SCD Start Date, SCD Initial
    Value
  • Ask business users how they will use the
    information to determine which SCDs are of type 1
    and which are of type 2.

7
8
Surrogate Key
  • A surrogate key has a unique value assigned to
    each row in the dimension. It becomes primary key
    of the dimension table and is used to join the
    dimension to the associated foreign key field in
    the fact table
  • Benefits of surrogate keys
  • Protect the DW/BI system from changes in the
    source system
  • Allow the DW/BI system to integrate data from
    multiple source system
  • Enable developers to add rows to dimensions that
    do not exist in the source system
  • Provide the means for tacking changes in
    dimension
  • Are efficient in the relational database and
    analysis services
  • Surrogate keys are created when doing data
    warehousing. They are new from the keys in
    original database
  • They are also called meaningless keys, substitute
    keys, non-natural keys, artificial keys
  • Specifically, surrogate keys are used in slowly
    changed dimensions (SCD) management

8
9
Aggregate Dimensions
  • Situation data at different levels of
    granularities
  • Two resolutions
  • Removing a dimension
  • Rolling up a dimensions hierarchy and provide a
    new, shrunken dimension at the aggregate level

9
10
Many-to-many or Multivalued Dimensions
  • Relationship between a dimension table and fact
    table is called one-to-many one row in the
    dimension table may join to many rows in the fact
    table.
  • Many-to-many or Multivalued Dimensions are
    referred to as there are more than one row in a
    dimension table joining to multiple rows in a
    fact table
  • Bridge table supports many-to-many relationship
  • fact-dimension
  • dimension-dimension.

10
11
Hierarchies
  • Meaningful, standard ways to group the data
    within a dimension
  • Variable-depth hierarchies
  • Frequently changing hierarchies
  • Examples of hierarchy in a dimension
  • Address street, city, state, country
  • Organization section, division, branch, region
  • Time year, quarter, month, date

11
12
Heterogeneous Products
  • Several products with differentiated attributes
  • Problem sharing one dimension or use different
    dimension?
  • Resolutions
  • One family-oriented dimension with core fact and
    product tables plus specific information for each
    line of product

12
13
Junk Dimensions
  • Also called miscellaneous or mystery dimensions
  • They are miscellaneous attributes that dont
    belong to any existing dimension.
  • Typically flags or indictors that describe or
    categorize the transaction in some way.
  • Contents are often important
  • Four alternatives for dealing with them
  • Leave them in the fact table
  • Create a separate dimension for each attribute
  • Omit them
  • Group them into a single junk dimension

13
14
Degenerate Dimensions
  • No description of its own
  • No joining to an actual dimension table
  • No attributes
  • Example transaction ID

14
15
Exercise 1 Walk through data warehousing process
  • Learning Objectives
  • To gain a general impression how to use SQL
    Server 2008 to implement a data mart
  • Tasks
  • Create your database with SSMS, named as
    ISQS6339_lastname
  • Import data from Commrex_2011.xls
  • Use SSMS to create a ERD diagram
  • Create a SSAS project using BIDS
  • Define data source, data source view, and cube
  • Deliverable
  • One-page printout of the screenshot of the cube
    diagram

15
16
Maximum Miniatures Manufacturing Designing Data
Mart
  • General business needs
  • To analyze the statistics available from the
    manufacturing automation systems. The VP would
    like an interactive analysis tool, rather than
    printed reports, for the analysis.
  • The manufacturing automation system controls all
    the machines to create figurines
  • Filling a mold with the raw material
  • Aiding the hardening of this materials
  • Removal from the mod when hardening is complete
  • Computerized painting of the figurines
  • Curing the paint if necessary

17
Maximum Miniatures Manufacturing Creating Data
Mart
  • Specific Business Needs
  • Analyzing the following numbers
  • Dollar value of products sold
  • Number of products sold
  • Sale tax charged on products sold
  • Shipping charged on products sold
  • These numbers should be viewable by
  • Store
  • Sales Promotion
  • Product
  • Day, Month, Quarter, and Year
  • Customer
  • Sales Person

18
Data Requirements
  • Number of accepted products by batch by product
    by machines by day
  • Number of rejected products by batch by product
    by machines by day
  • Elapsed time for molding and hardening by product
    by machine by day
  • Elapsed time for painting and curing by curing
    type by product by machine by day
  • Product rolls up into product subtype, which
    rolls up into product type
  • Machine rolls up into machine type, which rolls
    up into country
  • Day rolls up into month, which rolls up into
    quarter, which rolls up into year
  • The information should be able to be filtered by
    machine manufacturer and purchase date of the
    machine

19
Business Need of Sales
  • The VP of sales for Max Min, Inc. would like to
    analyze sales information. This information is
    collected by three OLTP systems the Order
    Processing System, the Point of Sale (POS)
    system, and the MaxMin.com Online system.
  • To analyze the following numbers
  • Dollar value of products sold
  • Number of products sold
  • Sales tax charged on product sold
  • Shipping charged on product sold
  • These number should be viewable by store, sales
    promotion, product, time, customer, sales person

20
Snowflake Schema of the Data Mart
Manufacturingfact
DimBatch
DimMachine
DimProduct
DimMachineType
DimPlant
DimProductSubType
DimMaterial
DimCountry
DimProductType
21
Exercise 2 Creating a data mart with SSMS
  • Learning Objectives
  • How to design a dimensional model
  • How to create a data mart with SSMS
  • How to create a cube for a data mart.
  • Tasks
  • Manually create the fact table and DimProduct
    table using SSMS (see the detailed information
    from file DW_MMM.PDF in the shared directory
    under \Docs)
  • Import remaining tables from oredb.lin.mmm.empty
  • Define the primary keys of tables and the
    relationships among them
  • Create a cube
  • The primary key of the fact table is composed of
    three foreign keys plus one time dimension key
    ProductCode, BatchNumber, MachineNumber, and
    DateOfManufacture.
  • Deliverable
  • The printout of the screenshot of the cube
    structure and the success of the deployment

22
Hints for Deploying the OLAP Cube
  • Due to the security restrictions, you need to
  • Double click the entry in Data Source
  • Type in your eRaider login information in the
    Impersonation Information panel
  • Change the server to OREDB

22
23
The screenshot of impersonation information
24
The properties of the project
After this step you can proceed to deploy the cube
25
Exercise 3 Creating a data mart with SSAS
  • Learning Objectives
  • Learn an alternative way to create a data mart
  • How to deploy a data mart
  • Tasks
  • Create data mart MaxMinSalesDM with BIDS from a
    cube template
  • Deploy the data mart
  • Deliverable
  • The printout of the screenshot of the cube
    structure and the success of the deployment

26
(No Transcript)
27
(No Transcript)
28
Cube structure
29
Data Mart Tables
Write a Comment
User Comments (0)
About PowerShow.com