The Data Warehouse Toolkit - PowerPoint PPT Presentation

1 / 41
About This Presentation
Title:

The Data Warehouse Toolkit

Description:

10/3/09. Evan ??. 1. The Data Warehouse Toolkit. Chapter 6 CRM. ????? ???(Evan Lin) ... not give us an excuse to dodge the challenge of creating conformed dimension, ... – PowerPoint PPT presentation

Number of Views:230
Avg rating:3.0/5.0
Slides: 42
Provided by: Eva961
Category:
Tags: com | data | dodge | toolkit | warehouse

less

Transcript and Presenter's Notes

Title: The Data Warehouse Toolkit


1
The Data Warehouse Toolkit
  • Chapter 6 CRM
  • ????? ???(Evan Lin)
  • MIS Dept., Yuan Ze Univ., Taiwan
  • http//evan.scuec.net
  • s917725_at_mail.yzu.edu.tw

2
What is CRM
  • Customer relationship management was the
    buzzword,design and develop customer-centric
    dimensional models to better understand their
    customers behavior.
  • CRM has emerged as mission-critical business
    strategy that many as essential to company
    survival.

3
Discuss following Concept
  • CRM overview, including its operational and
    analytic roles.
  • Customer name and address parsing,along with
    international consideration
  • Common customer dimension attribute, such as
    dates segmentation attribute,and aggregation
    facts
  • Dimension outrigger for large cluster of
    low-cardinality attribute.

4
Discuss following Concept..
  • Minidimensions for attribute browsing and change
    tracking in large dimensions, as well as
    variable-width attribute sets.
  • Implications of using type 2 slowing change
    dimension technique on dimension counts
  • Behavior study groups to track a set of customers
    that exhibit common characteristics or behaviors
  • Commercial customer hierarchy consideration,
    including both fixed and variable depth..
  • Combine customer to multiple data sources
  • Analyzing customer across multiple business
    processes

5
CRM Overview
  • The goal CRM is maximum relationships with your
    customers over their lifetime.
  • CRM promises significant return for organization
    that embrace it in term it in term of both
    increased revenue and greater operational
    efficiencies
  • Switch to a customer-driven perspective can lead
    to increased sales effectiveness and closure
    rates, revenue growth, enhanced sales
    productivity at reduced cost,improved customer
    profitability margins, higher customer
    satisfaction , and increased customer retention.

6
CRM Overview..
  • View customer variable depend on
  • Product line,business unit, business function ,or
    geographic location.
  • In many cases.the existing business processes for
    customer interactions have evolved over time as
    operational or organization work-around.
  • The resulting patchwork set of customer-related
    processes is often clumsy at best.
  • Since it is human nature to resist change, it
    comes as no surprise that people-related issues
    often challenge

7
Operational and Analytic CRM
  • It could be said that CRM suffers from a split
    personality syndrome because it address both
    Operational and Analytic requirement.
  • Effective CRM relies on the collection of data at
    every interaction we have with a customer and
    then the leveraging if that breadth of data
    through analysis.

8
Operational CRM
  • CRM call for synchronization of customer-facing
    processes.
  • Data is created on operational side of the CRM
    equation, we obviously need to store and analyze
    the historical metrics resulting from our
    customer interaction and transaction systems.

9
Analytic CRM
  • Analytic CRM is enabled via accurate, integrated,
    and accessible customer data in warehouse.
  • We are able to measure the effectiveness of
    decisions made in the past in order to optimize
    future interactions.
  • Customer data can be leveraged to better identify
    up-sell and cross-sell opportunities, pinpoint
    inefficiencies, generate demand, and improve
    retention.

10
  • The model output can translate into specific
    proactive or reactive tactics recommended for the
    next point of customer contact, such as the
    appropriate next product offer or anti-attrition
    response.
  • The model results also are retained in the data
    warehouse for subsequent analysis.

11
  • Information must feed back to the operational Web
    site or call center systems on a more real-time
    basis. Like ODS(Operational Data Store)
  • The ODS generally does not require the breadth or
    depth of customer information available in the DW

12
Packaged CRM
  • In response to urgent need of business for CRM,
    project teams may be wrestling with BUY versus
    BUILD decision.
  • In the long term BUILD is better,but it take
    longer require more resources, potentially high
    cost.
  • Buying a packaged application will deliver a
    practically ready-to-go solution,but it may not
    focus on the integration and interface issues
    needed for it to function in the large IT
    content.

13
Packaged CRM ..
  • Buying packaged, regardless of its application
    breadth, does not give us an excuse to dodge the
    challenge of creating conformed dimension,
    including customer dimension.
  • Any organization needs is another data stovepipe,
    so be certain to integrate any packaged solution
    properly.

14
Customer Dimension
  • The conformed customer dimension is a critical
    element for effective CRM.
  • A well-maintained, well-deployed conforming
    customer dimension is the cornerstone of sound
    customer-centric analysis.

15
Customer Dimension ..
  • Customer dimension is typically the most change
    dimension in data warehouse.
  • In large organization customer dimension can be
  • Extremely deep(with million rows)
  • Extremely wide(with dozens or hundreds of
    attributes)
  • Subject to rather rapid change
  • To further complicate matters, the customer
    dimension often represents an amalgamation(??) of
    data from multiple internal and external source
    system.

16
Name and Address Parsing
  • Regardless of whether we are dealing with
    individual human being or commercial entities, we
    typically capture
  • Many designer use a liberal design for names and
    address.. Likely Name-1, Name-2..
  • Unfortunately, these catchall columns are
    virtually worthless when it comes to better
    understanding and segmenting the customer base.

17
Name and Address Parsing..
  • About NAME column is too far to limit..
  • Hard to handling salutation(??),title ,suffix
  • Can not identify what person first name is or how
    she should be address in a personalized greeting.
  • We would find multiple customers listed in single
    name field.
  • We also additional descriptive information in the
    name field..(UGMA?Trustee)

18
Address Problem
  • May contain enough room for any address, but
    there is no discipline(?????) imposed by the
    columns that will guarantee conformance with
    postal regulation(????), address matching or
    latitude/longitude identification.
  • Attribute has not standardized..
  • RD ? Road
  • Ste ? State..
  • Commercial customer typically have multiple
    address, such as physical and shipping
    addresseach of these would follow much the same
    logic as the address structure we just develop..

19
International Name and Address consideration
  • Customer geographic attributes become more
    complicated if we are dealing with customers from
    multiple countries.
  • In addition to name and address parsing we
    discussed earlier
  • Universal representation
  • Should be consistent from country to country.
  • Cultural correctness
  • Appropriate salutation(???????)in
    e-mail,telephone, letter
  • Differences in addresses
  • Idiosyncrasies(??) such as presenting the
    destination city.

20
Other Common Attribute
  • Date
  • We often find date in customer dimension, such as
    date of first purchase,date of last purchase..

21
Other Common Attribute..
  • Aggregated facts as attribute
  • User often are interested in constraining the
    customer dimension based on aggregated
    performance metrics..
  • Filter on all customers who spent over a certain
    dollar amount during last year,
  • While there are query usability and performance
    advance to storing these attribute but Staging
    processes to ensure that attributes accurate(??),
    up-to-date , and consistent with actual fact
    row.

22
Dimension Outriggers for a Low-Cardinality
Attribute Set
  • Snowflake the low-cardinality columns in the
    dimension have been removed to separate
    normalized tables that then link back into
    original dimension table.
  • Not recommend more complex and low performance.
  • Sometime build a dimension outriggers that has
    appearance of a snowflake table.

23
  • Several factor to bend our non-snowflake rule
  • Demographic data(??) is available at
    significantly (????) difference grain than
    primary dimension..
  • The data is administered and loaded at difference
    times than the rest(???) of the data in the
    customer dimension..
  • Dimension outriggers are permissible(????),but
    they should be the exception rather than the
    rule. A red warning flag should go up if your
    design is riddled(???) with outriggers you may
    have succumbed(??) to the temptation(???) to
    overly(???) normalize design.

24
Large Changing Customer Dimension
  • In Chapter 4 the Type 2 cant do with large
    changing dimension..
  • Huge customer dimensions are even more likely to
    change than moderately sized dimensions. We call
    it rapidly changing monster dimension
  • To rescue(??) to address both browsing-performance
    and change-tracking challenge is to break off
    frequently analyzed or frequently changing
    attribute into a separate dimension, referred to
    as a minidimension.

25
Minidimension..
  • We include two foreign keys related to customer
    the regular customer dimension key and
    minidimension demographic key..
  • This design delivers browsing and constrain
    performance benefits by providing a smaller point
    of entry to the fact..
  • Queries can avoid the huge customer dimension
    table altogether(??)unless attribute from that
    table are constrained
  • When the demographic key participates as foreign
    key in the fact table, another benefit is that
    fact table serves to capture the demographic
    profile changes..
  • Example in table 6.3 in page 155..(like Type 2
    response in SCDs)

26
  • Customer dimension are unique in that customer
    attributes frequently are queries independently
    of the fact table..
  • Exhow many female customers live in Dade Country
    by age bracket(?????)..
  • Use demographic data to link through the fact
    table,the most recently value of demographic key
    also can exist as foreign key on customer
    dimension table. We refer to demographic table as
    a customer dimension outrigger
  • The minidimension terminology refers to when the
    demographics key is part of the fact table
    composite key if the demographic key is a
    foreign key in customer dimension, we refer to it
    as an outrigger..

27
  • Type 1 if you embed most recent demographic key
    in customer dimension..
  • Type 2 if you want to track all change, but
    avoid to rapidly changing monster dimension
    problem..
  • Use demographic data,through either the
    minidimension or outrigger, can deliver more
    functionality and complexity than some users can
    handle..

28
  • The demographic dimension itself cannot be
    allowed to grow too large.
  • We would build a second demographics dimension
  • Minidimension growth while also clustering like
    attributes together for a more intuitive user
    presentation
  • They are potentially sourced from two different
    data providers with difference update
    frequencies..

29
  • The best approach for efficiently browsing and
    tracking changes of key attributes in really huge
    dimension is to break off one or more
    minidimension from the dimension table, each
    consisting of small clumps(?) of attributes that
    have been administered to have a limited number
    of values.

30
VariableWidth Attribute Set
  • Minidimension can be create to handle a variable
    number of customer attributes..
  • When using external prospect(??) lists, we often
    are permitted only a one-time use of the list and
    dont have the legal right to store the prospect
    information internally.
  • We could store the prospect and customers
    together in a single contact dimension,however,
    there is a significant(???) disparity(???)
    between the numbers of attributes for prospective
    versus customer contacts.

31
  • We may know only a handful of identification and
    location attributes about our prospects.
  • Database platform such as Oracle support
    variable-width attribute set..

32
Implications of Type 2 Customer Dimension Changes
  • Type 2 remain the predominant technique for
    tracking change in customer dimension with less
    than 100,000 rows
  • Evan we have large customer dimension we need
    still use type 2 response to handle very slowly
    changing attribute left behind in the customer
    dimension ..
  • Must avoid over-counting use COUNT DISTINCT
  • Most recent row indicator in the customer
    dimension is also helpful to do count based on
    the most up-to-date descriptive values for
    customers

33
Customer Behavior Study Groups
  • With customer analysis, simple queries,such as
    how much have we sold to customers in this
    geographic area in the past year,rapidly evolve
    to more complex inquiries.
  • The secret to building complex behavioral study
    group queries is to capture the keys of customers
    or products whose behavior you tracking. You then
    use the captured keys to constrain other fact
    tables without having to return the original
    behavior analysis..

34
Commercial Customer Hierarchies
  • One of most challenging aspects of dealing with
    commercial customers is modeling their internal
    organizational hierarchies.
  • Nested
  • May change frequently
  • We will talk about two approaches to handling
    customer hierarchies
  • Fixed-Depth Hierarchies
  • Variable-Depth Hierarchies

35
Fixed-Depth Hierarchies
  • It is uncommonly that customer dimension is
    highly predictable with fixed number of levels
  • Easy to compute , easy to sum to the sum..

36
Variable-Depth Hierarchies
  • Representing an arbitrary (???),ragged (???)
    organization hierarchies difficult task in a
    relational environment

37
Variable-Depth Hierarchies..
  • In computer science approach to handling this
    unpredictable hierarchies using recursive parent
    customer key pointer on each customer dimension
    row..
  • GROUP BY can not follow the recursive tree point
    downward an summarized and additive fact in a
    companion fact table such as revenue in an
    organization.
  • Oracles CONNECT BY can navigate a recursive
    point, but cannot be used in the same SQL
    statement as a join..

38
  • Instead of using a recursive pointer we insert a
    bridge table between the customer dimension and
    fact tables..
  • The bridge table has been called a helper or
    associative table in the past.
  • The number of rows in the bridge table typically
    is several times large than the number of rows in
    customer dimension..

39
Combining Multiple Source Of Customer Data
  • Conform customer dimension is a distillation of
    data from several operational system and possibly
    outside sources.
  • Unique customer has multiple identifiers in
    multiple operational touch-point system.
  • There is no secret weapon for tracking this data
    consolidation.

40
Analysis Customer Data from Multiple Business
Processes
  • Be very careful when simultaneously joining a
    single dimension table to two fact tables of
    different cardinality in many cases,relational
    systems will return the wrong answer. A similar
    problem arises when joining two fact tables of
    different granularity together directly.
  • If use are frequently combining data from
    multiple business processes,then an additional
    fact table can be constructed that combines the
    data once into a second-level,consolidated fact
    table rather than relying on users to combine the
    data consistently and accurately on their own..

41
Summary
  • Overview of CRM basics
  • Design issue..
  • Name ,address parsing..
  • Dimension outrigger
  • Low-cardinality attribute
  • Customer dimension has million of rows
  • Minidimension with variable-width attributes
  • Customer behavioral study group dimension
  • Customer hierarchies
  • Fixed and variable depth hierarchies
  • Potential downfalls of querying across the fact
    tables .
Write a Comment
User Comments (0)
About PowerShow.com