Retail Data Warehouses: Implementation and Technology - PowerPoint PPT Presentation

Slides: 84
Provided by: lpk6
Learn more at: http://www.terry.uga.edu
Transcript and Presenter's Notes
1
Retail Data Warehouses: Implementation and
Technology
Scott E. Gnau, Managing Partner, Teradata
Solutions Group
2
Today's Agenda
  • Data Warehouse Concepts
  • Data Warehouse Technology
  • Architecture
  • Framework
  • Parallelism
  • Technical Differentiators
  • Top 5 Implementation Pitfalls
  • NCR Retail Data Warehouse Applications
  • Real World Examples

4
Data Warehouse Definition
Data Warehousing is a process, not a product
It is a technique to properly assemble and manage
data from various sources to answer business
questions not previously known or possible
5
What Distinguishes Teradata @ctive Warehouse?
Centralized DW
  • Enterprise view: single version of the truth
  • Organizationally consistent data
Powerful Analytics
  • Accurate, high-quality information
  • Valuable insight
DW-leveraged Interaction
  • Drive optimal actions
  • Operational analysis that drives business
    operations
6
Information Evolution In a Data Warehouse
Environment
8
The Game is Different
Is your database designed for transactions or
complex business questions?
Data Warehouse Summit
Choose an architecture that meets your business
goals...
...not an architecture that meets your database
capabilities!
OLTP Base Camp
9
Definition: Independent Data Mart
An independent data mart is a data warehouse
organized by subject area or user group and
sourced from operational and external systems.
Sales
10
Multi Data Mart Evolution -- The Decision
Point!
Companies that elect to pursue a Distributed Data
Mart strategy begin to build Independent Data
Marts for each functional subject area or user
group.
Sales
Inventory
11
Data Warehouse and Data Marts
Source: Gartner, Kevin Strange, October '98
12
Centralized, Active DW - The Solution
Active Data Warehouses
Subject oriented, not project oriented
SUBJECT AREAS
Sales
Inventory
Marketing
13
Definition: Active Data Warehouse
An active data warehouse is a centralized store
of detail and summary data from all relevant
sources allowing for ad hoc discovery and drill
down analysis from multiple user groups.
Finance
Marketing
14
Definition: Dependent Data Mart
A Dependent Data Mart is a data warehouse
organized by subject area or user group
and sourced from the enterprise data warehouse.
Finance
Marketing
Sales
15
Independent Data Marts
IT Users
Operational Data
Data Transformation
Data Marts
Business Users
16
Hub and Spoke or Federated Warehouse
IT Users
Operational Data
Data Transformation
Data Warehouse Hub
Data Replication
Data Mart Spokes
Business Users
17
Gartner on Hub and Spoke
  • Challenges of "Hub and Spoke" Data Warehouse
    Topology (Kevin Strange)
  • The "hub and spoke" trend in data warehouse
    topology circumvents weaknesses in DBMS
    technology and fails to deliver benefits to
    justify notable cost increases.
  • This recent trend is due to the challenge for
    most DBMS products to support complex data models
    and concurrent query workloads.
  • The creation of a plethora of data marts to
    alleviate the strain of the DW DBMS will lead
    only to increased implementation costs, a loss of
    opportunity to support dynamic strategic decision
    processes and a requirement for increased support
    without obtaining value that justifies the costs.
    Data marts are not necessarily a negative part of
    a DW implementation, but they should not replace
    direct DW access by users and should be used
    where appropriate, based on application
    requirements.
  • Bottom Line: Enterprises should not let the
    hype of yet another alternative topology take
    away from a comprehensive DBMS evaluation and
    selection that will provide support for current
    and future user concurrency requirements.

18
Active, Enterprise Data Warehouse
IT Users
Operational Data
Data Transformation
Enterprise Warehouse Management
Business Users
19
Single Database, Multiple Uses
Enterprise Warehouse Management
Data Transformation Layer
Detail, Normalized Data Layer
Logical Data Mart Layer
Business Users
20
The Best of All Worlds
IT Users
Operational Data
Data Transformation
Enterprise Warehouse Management, Logical Data
Marts
Replication
Physical Data Mart or Departmental Warehouse
Business Users
21
Active Data Warehouse Phases
Sales Data
Inventory
Customer
Vendor
22
Active Warehouse Phase I
The Business User
Business Value
IT Development
Sales Data
Information Infrastructure
The IT Provider
23
Active Warehouse Phase II
The Business User
Business Value
IT Development
Sales Data
Inventory
Information Infrastructure
The IT Provider
24
Active Warehouse Phase III: Realize Exponential
Growth in ROI!
The Business User
Business Value
IT Development
Sales Data
Inventory
Vendor
Information Infrastructure
The IT Provider
25
Active Warehouse Phase IV: Know Your Customers!
The Business User
Business Value
IT Development
Sales Data
Customer
Inventory
Vendor
Information Infrastructure
The IT Provider
26
Active Warehouse Phase N: Know What Your
Customers Buy!
The Business User
Business Value
IT Development
Sales Data
Customer
Marketing
Inventory
Vendor
Information Infrastructure
The IT Provider
27
Active Warehouse, Ready to Mine!
Data Mining
Sales Data
Customer
Marketing
Inventory
Vendor
Information Infrastructure
28
Parallel Processing, the Impact
  • How long does it take to read a terabyte of data?
  • 1.2 days, serially (Information Week article on
    VLDB)
  • Parallel processing can speed this up:
  • 0.6 days with 2 parallel tasks
  • Less than 18 minutes with 100 parallel tasks,
    provided that:
  • Software has even distribution of tasks
  • Hardware can sustain I/O levels
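
The arithmetic behind these figures can be sketched as below. This is a minimal illustration that assumes ideal scaling (perfectly even task distribution and no I/O bottleneck, the two provisos listed above); the function name `read_time_minutes` is hypothetical and not from the presentation.

```python
# Figure from the slide: 1.2 days to read one terabyte serially.
SERIAL_DAYS = 1.2
MINUTES_PER_DAY = 24 * 60


def read_time_minutes(parallel_tasks: int, serial_days: float = SERIAL_DAYS) -> float:
    """Ideal read time in minutes when the scan is split evenly across tasks.

    Assumes perfectly even task distribution and hardware that can
    sustain the combined I/O rate.
    """
    if parallel_tasks < 1:
        raise ValueError("parallel_tasks must be at least 1")
    return serial_days * MINUTES_PER_DAY / parallel_tasks


if __name__ == "__main__":
    for tasks in (1, 2, 100):
        minutes = read_time_minutes(tasks)
        days = minutes / MINUTES_PER_DAY
        print(f"{tasks:>3} task(s): {minutes:8.1f} minutes ({days:.3f} days)")
```

If either proviso fails (skewed tasks or saturated I/O), the real time will sit somewhere between the ideal figure and the serial one.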

29
Parallel Query Speed-up
  • Formula for the effect of parallelization on a
    query:
  • Total Query Time = (Parallel % × Total Time) /
    Units of Parallelism + (Non-Parallel % × Total
    Time)
  • Example: What effect will a parallel DBMS have
    on a query that takes 100 minutes in a
    non-parallel system,
  • with 55% vs. 98% of the query parallelized, and
  • at 10 and 20 units of parallelism?

30
Parallel Query Results
  • DBMS A: 55% of the query is parallelized
  • At 20 units of parallelism:
    (0.55 × 100) / 20 + (0.45 × 100) = 47.75 minutes
  • At 10 units of parallelism:
    (0.55 × 100) / 10 + (0.45 × 100) = 50.5 minutes
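
The slide's calculation is an instance of what is usually called Amdahl's law. The sketch below reproduces the two DBMS A results and also evaluates the 98% case that the example names; the function name `parallel_query_minutes` and its parameter names are illustrative assumptions, not part of the presentation.

```python
def parallel_query_minutes(total_minutes: float,
                           parallel_fraction: float,
                           units: int) -> float:
    """Query time when only part of the work can run in parallel.

    The parallel share of the work is divided across `units`;
    the remaining share still runs serially.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if units < 1:
        raise ValueError("units must be at least 1")
    parallel_part = parallel_fraction * total_minutes / units
    serial_part = (1.0 - parallel_fraction) * total_minutes
    return parallel_part + serial_part


if __name__ == "__main__":
    # The slide's example: a 100-minute query on a non-parallel system.
    for fraction in (0.55, 0.98):
        for units in (10, 20):
            minutes = parallel_query_minutes(100, fraction, units)
            print(f"{fraction:.0%} parallel, {units} units: {minutes:.2f} minutes")
```

Doubling the units of parallelism barely helps when 45% of the query is serial, because the serial share dominates the total.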
31
Latest Gartner DW Evaluation
  • New Gartner Group Study evaluates 11 DW Platforms
  • Spans NT, UNIX, AS/400, mainframes
  • Compares companies, DB and platforms
  • Technology
  • Market momentum
  • Business Practice
  • Database
  • Shows importance of DBMS in selection
  • Strong results for MPP
  • The proven excellence of NCR's Teradata database
    returned the highest possible score for NCR
  • Source: GartnerGroup Data Warehouse Application
    Server Evaluation Model
  • Full report: www.ncr.com/teradata

32
Gartner Warehousing Scores
Database: As the ultimate DW category, the
Database category helped propel the NCR WorldMark
on Teradata beyond the reach of Oracle and DB2.
(August 1999)
33
Gartner Database Warehousing Scores
Max Score 75
34
Oracle and DB2 OS/390 (SMP)
DBMS Kernel
Locking
Logging
Multi-Task Control
Global Systems/Storage Area
Global Buffer Pools
UoP (×15)
Data Partition
35
Oracle OPS and IBM DB2 Sysplex
Locking, Communication and Task Control
Interconnect
36
DB2 UDB EEE
Communication and Task Control Interconnect
37
Teradata (SMP)
38
Teradata MPP
BYNET SWITCH
PE V-Proc (×4)
BYNET Connect (×2)
AMP (×8)
M-T, Lock, Log, BPool, I/O (×8 each)
39
Technical Differentiators
  • Where other databases fall short
  • Data Distribution
  • Data Management and Reorganization
  • Optimizer Intelligence
  • Degree of Parallelism
  • Database Utilities

"Teradata has a low overhead for support, and
the flexibility of getting to the atomic data in
a very efficient manner."
Jerry Hill, Western Digital
40
Technical Differentiator 1: Data Distribution
For optimum performance, data should be
distributed randomly and equally to minimize
access contention.
  • To get around it, many vendors use range
    distribution which creates intensive maintenance
    tasks for the DBA.

How should I partition the data?
Where do I have data contention?
How are users accessing the data?
How large should I make the partitions?
41
Technical Differentiator 1: Data Distribution - A
Better Way to Distribute Data
Teradata uses hash partitioning and distribution
to randomly and evenly distribute data across all
nodes for balanced performance.
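
The idea of hash distribution can be sketched as follows. This is a minimal illustration only: MD5 is used because it ships with Python and spreads keys evenly, and it is an assumption for the example, not Teradata's actual hashing function. The names `unit_for_key` and `distribute` and the `order-N` keys are hypothetical.

```python
import hashlib
from collections import Counter


def unit_for_key(key: str, units: int) -> int:
    """Map a row key to a unit of parallelism by hashing it.

    MD5 is a stand-in chosen for the example; it is NOT the
    hashing function Teradata uses.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % units


def distribute(keys, units: int) -> Counter:
    """Count how many rows land on each unit of parallelism."""
    return Counter(unit_for_key(key, units) for key in keys)


if __name__ == "__main__":
    # Hypothetical sequential order numbers: under range distribution the
    # newest rows would pile up in one partition; hashing scatters them.
    keys = [f"order-{n}" for n in range(80_000)]
    counts = distribute(keys, units=8)
    for unit in sorted(counts):
        print(f"unit {unit}: {counts[unit]} rows")
```

Because placement depends only on the key's hash, no one has to choose partition boundaries or rebalance them as the data grows.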
42
Technical Differentiator 2: Data Management
  • Adding, updating and deleting data affects manual
    data distribution schemes thereby reducing query
    performance and requiring reorganization.

43
Technical Differentiator 2: Data Management - A
Better Way to Manage Data
  • Teradata's automatic hash distribution eliminates
    costly data maintenance tasks, freeing up precious
    DBA resources.
  • As a result, strategic business data is more
    accessible to the users.

DBA: "I'll be right over to help you design that
new application!"
44
Technical Differentiator 3: Optimizer
Intelligence
  • To maximize throughput and minimize resource
    contention, the optimizer must know about system
    configuration, available units of parallelism and
    data demographics

45
Technical Differentiator 3: Optimizer
Intelligence - Teradata Parallel Awareness Enablers
  • Teradata's optimizer provides unequalled ad hoc
    and complex query performance
  • Fully Parallel
  • Cost-based
  • Full look-ahead (all sub-steps)
  • Dynamic application to all queries
  • Auto-applies time-saving structures (Temp
    tables, Join Indexes)
  • No HINTS required

46
Technical Differentiator 4: Degree of Parallelism
  • The greater the number of tasks processed in
    parallel, the better the system performance
  • Many products are called parallel, but they
    only perform some tasks in parallel

Other DBs
Teradata is Always Parallel!
Teradata
47
Technical Differentiator 5: Database Utilities
  • Data Warehouses require robust utilities to load
    and manage data because
  • The amount and frequency of data coming into the
    warehouse will grow exponentially
  • Loads, extracts, inserts, updates, and deletes
    will occur on a regular basis
  • Batch windows shrink as companies grow to span
    multiple time zones
  • Users want their data fast!

48
Technical Differentiator 5: Database Utilities -
Other RDBMS Typical Data Load Steps
If load fails, start load job over!
49
Technical Differentiator 5: Database Utilities -
Teradata Utilities are Robust and Mature
  • Teradata utilities are fully parallel
  • Teradata utilities have checkpoint restart
    capability
  • Data loads directly from the source into the
    database
  • no manual data partitioning!
  • no file splitting!
  • no intermediary file transfers!
  • no separate data conversion step!

parallel in
parallel out
Teradata Warehouse
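
Checkpoint restart, as opposed to "if load fails, start load job over", can be illustrated with a generic sketch. This is not how any Teradata utility is implemented: the list `sink` stands in for a database table, the JSON checkpoint file is an assumption made for the example, and `fail_at` is a hypothetical hook that simulates a crash.

```python
import json
import os
import tempfile


def load_with_checkpoint(rows, sink, checkpoint_path, batch_size=1000, fail_at=None):
    """Append rows to sink in batches, recording progress after each batch.

    On restart, rows before the recorded offset are skipped, so a failed
    job resumes where it stopped instead of starting over.
    """
    start = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as fh:
            start = json.load(fh)["next_row"]

    for batch_start in range(start, len(rows), batch_size):
        batch = rows[batch_start:batch_start + batch_size]
        if fail_at is not None and batch_start <= fail_at < batch_start + len(batch):
            raise RuntimeError(f"simulated failure at row {fail_at}")
        sink.extend(batch)  # stand-in for the real database write
        # A real loader would commit the batch and the checkpoint atomically.
        with open(checkpoint_path, "w") as fh:
            json.dump({"next_row": batch_start + len(batch)}, fh)

    if os.path.exists(checkpoint_path):
        os.remove(checkpoint_path)  # job finished; clear the checkpoint
    return len(sink)


if __name__ == "__main__":
    rows = list(range(10_000))
    table = []
    with tempfile.TemporaryDirectory() as tmp:
        ckpt = os.path.join(tmp, "load.ckpt")
        try:
            load_with_checkpoint(rows, table, ckpt, fail_at=4_500)
        except RuntimeError as exc:
            print(f"{exc}; {len(table)} rows already loaded")
        load_with_checkpoint(rows, table, ckpt)  # restart resumes from the checkpoint
        print(f"{len(table)} rows loaded, duplicates: {len(table) - len(set(table))}")
```

The restart picks up at the first uncommitted batch, so no row is loaded twice and none is skipped.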
50
Teradata Data Distribution and Layout
  • Every Unit of Parallelism gets equal part of each
    table via hashing function
  • Every node layout is independent and completely
    dynamic

52
The Top 5 Data Warehouse Implementation Pitfalls
Take notes, you're about to be warned!
53
Pitfall 1: The Soviet 5 Year Plan
Data Warehouse Modeling Design Issue
54
Pitfall 1: The Soviet 5 Year Plan
  • Beware the All-At-Once Top Down Corporate Data
    Model.
  • Attempt to model the entire enterprise, all at
    once.
  • Try to model every possible entity and attribute
    for every possible business area.
  • Run by centralized, corporate-level modeling
    organization.
  • Develop entire model over long period of time . .
    . then propagate to the masses.

55
Pitfall 1: The Soviet 5 Year Plan
  • Problem: Like painting the Golden Gate Bridge .
    . . by the time you're finished, you have to
    start over.
  • Never seem to get started with anything tangible.
  • Difficult (if not impossible) to have specific
    enough business area knowledge, up-front, from
    one vantage point.
  • Very real risk of ivory tower irrelevance . . .
    beautiful model which everyone ignores.

56
Pitfall 2: Leverage Lockout
Data Warehouse Modeling Design Issue
57
Pitfall 2: Leverage Lockout
  • Many So-Called Data Warehouses Are Really Just
    Large Fixed-Purpose Report Generators.
  • Goal is to leverage your Data Warehouse - serve
    many users and cross functional purposes, known
    and unknown.
  • The Most Important Application/Query Topic Is
    the One No One Has Thought of Yet.

58
Pitfall 2: Leverage Lockout
59
Pitfall 2: Leverage Lockout
  • Lessons Learned and Recommendations
  • Data Warehouse <> Data Mart - The essence of a
    Data Warehouse: you don't already know what
    you're going to ask it.
  • Applications and Processes come and go - Avoid a
    fixed application/process-centric design.
  • Data Warehouse is inherently cross-functional -
    Maximize leverage to support cross functional
    queries/applications.
  • Make the Data Warehouse design independent,
    flexible, and atomic. It should be blind to
    current processes and applications.

60
Pitfall 2: Leverage Lockout
  • Lessons Learned and Recommendations (continued)
  • Expect several key obstacles to building in
    leverage:
  • Functional Fixation - "This is how our current
    processes work. Optimize the data model to
    reflect that."
  • Existing Data - New structures may require new
    data not readily available (not captured,
    generated, external, etc.).
  • Integrated Applications - Integrated
    process/application/data model stove-pipe
    approach.
  • Timeframe Pressures - "Wasting time on modeling
    that does not directly address our current
    needs."

61
Pitfall 3: Transformation Trauma
  • The Scope of Effort Required to Source,
    Condition, and Load Legacy/External Data into a
    Data Warehouse is Usually Seriously
    Underestimated.
  • Tendency to view scope of the effort in terms of
    number and complexity of tables in the Data
    Warehouse itself.
  • Tendency to assume legacy/external files look
    like Data Warehouse tables.

62
Pitfall 3: Transformation Trauma
  • Lessons Learned and Recommendations
  • Plan to do significant data discovery before
    scoping and undertaking a transformation effort.
  • Expect to do significant mapping work - from
    legacy/external file bytes to Data Warehouse
    table columns. Plan to build a transformation
    metadata repository.
  • Transformation tools can help, but they are not a
    silver bullet. You must train them first.
  • Data Warehouse quality must be production
    quality, not "close enough." Reconcile and
    document source data anomalies.

63
Pitfall 4: One-Size-Fits-All Interface
User Interface Issue
64
Pitfall 4: One-Size-Fits-All Interface
  • Lessons Learned and Recommendations
  • Expect to implement various interface tools over
    time, targeted toward a broad range of users.
  • If the Data Warehouse is designed well (see
    Pitfall 3), various interfaces can be phased in
    and out without impact.

65
Pitfall 5: Big Dog Buy-In
User Interface Issue
66
Pitfall 5: Big Dog Buy-In
  • Lessons Learned and Recommendations
  • Do Yourself a Favor - Be Political.
  • Early priority - Provide custom tailored
    front-end/application for senior sponsoring
    executives
  • Let them see first hand the strategic value of
    the Data Warehouse.
  • Sell the sizzle
  • Color Graphics.
  • Charting, Mapping.
  • Data Visualization.
  • Executive Drill-Down.

67
Pitfall 5: Big Dog Buy-In
  • Lessons Learned and Recommendations (continued)
  • Recommend Customer Demographic data, Competitive
    Positioning data, etc.
  • Plan up front to incorporate this information
    into the Data Warehouse from internal and
    external sources early in the project.
  • Place an appropriately tailored interface into
    the hands of executives.
  • Publicize findings and success stories.
  • A Data Warehouse is meant to be a strategic
    resource. Make its strategic value evident
    early on so it won't have to fight for its life.

69
Teradata Solutions for Retail
Business Intelligence
Demand Chain Management
Customer Relationship Management
Analysis
Demand Forecasting
Modeling
Teradata CRM Solutions
Teradata retailDecisions
Contribution Ranking
Assortment Analysis Family
Seasonal Profile
Replenishment
Automatic Profile Tuning
Optimization
Personalization
Communication
Promotions Forecasting
Promotion Analysis Family
Promotion Management 1
Promotion Management 2
Interaction
Exception Monitor
Customer Analysis 1
Customer Analysis Family
Long Range Forecasting
Product Allocation
Store Operations 1
Store Operations Family
Web-Store
Intelligent E-Analysis
Intelligent E-Commerce Family
Intelligent E-Referral
Intelligent Cross-Channel
Price, Item Vendor Management (TCI)
Loyalty Marketing
Perpetual Inventory (TCI)
MicroStrategy 7.0
Loyalty Marketing (VRMS Market Expert XR)
Collaboration - CPFR (Syncra Ct)
Teradata Retail Logical Data Model
Teradata Data Warehouse
70
Retail Business Intelligence Examples
71
Business Questions
  • How does a given item perform in a mailer?
  • How many items were purchased by those mailed?
  • How many mailers were sent?
  • How often was only the featured item purchased?
  • What other items were most frequently purchased
    in the same transaction?
  • What was the profitability of the featured item?
  • What was the profitability of the transaction
    containing the featured item?

72
Key Item Analysis: 32" Sony PIP Color TV
  • Event: Greatest SONY Event Ever
  • Selling Price: $1099.99
  • Location: Front Cover - Lower Left Corner
  • Mailers Sent: 4.8 Million
  • Units Sold: 206
  • Total Item Sales Dollars: $217,081
  • Average Selling Price Per Item: $1,054
  • Total Transactions Sales Dollars: $280,926
  • Total Margin Dollars: $64,584
  • Margin Percent: 23%
  • Average Margin Dollars per Transaction: $313
  • Number of Transactions Overall: 206
  • Single Item Transactions Overall: 108 (52%)
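
The ratio figures on this slide follow from its raw counts and dollar totals. The sketch below recomputes them; the function name `key_item_metrics` and the dictionary keys are illustrative assumptions, and the formulas are the conventional definitions rather than anything the presentation states. The slide's average margin per transaction (313) looks truncated rather than rounded, since the computed value is about 313.5.

```python
def key_item_metrics(units_sold, item_sales, transaction_sales,
                     margin_dollars, transactions, single_item_transactions):
    """Derive the slide's ratio metrics from its raw counts and dollar totals."""
    return {
        "avg_selling_price": item_sales / units_sold,
        "margin_percent": 100 * margin_dollars / transaction_sales,
        "avg_margin_per_transaction": margin_dollars / transactions,
        "single_item_share_percent": 100 * single_item_transactions / transactions,
    }


# Raw figures as shown on the slide.
SLIDE_FIGURES = dict(
    units_sold=206,
    item_sales=217_081,
    transaction_sales=280_926,
    margin_dollars=64_584,
    transactions=206,
    single_item_transactions=108,
)

if __name__ == "__main__":
    for name, value in key_item_metrics(**SLIDE_FIGURES).items():
        print(f"{name}: {value:,.2f}")
```

Margin percent is taken against total transaction sales, not item sales, which is why add-on items sold at a loss pull it down.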
73
Key Item Analysis: 32" Sony PIP Color TV
  • Of the 206 units sold, 98 customers purchased
    additional items with the 32" TV.
  • 60 customers, or 63%, also bought a TV stand or
    rack. There were 4 different types of stands
    sold with the TV.
  • 31 customers, or 38%, bought the 32" Sony stand,
    which was sold at a total loss of $280.18.
  • 20 customers, or 25%, purchased a VCR when they
    bought the TV.

74
Key Item Analysis: 32" Sony PIP Color TV
  • Actions to increase multiple item purchases
  • Train sales associates in cross selling TV stands
    or racks with the TVs
  • Ensure the stores display the stands near the
    featured TV
  • Feature the TV on promotion on one or more
    models of stands
  • Promote but do not discount TV stands or racks in
    the mailer
  • Eliminate the markdown on the SONY TV stand to
    sell at a profit

75
Retail Demand Chain Examples
76
Why do you need a Forecasting Package?
Before (Actual)
  • Replenishment: one option for all
    auto-replenished SKUs
  • Excessive safety stock to compensate for lack of
    forecasting accuracy
  • In-house developed forecast accuracy: 30% to 50%
  • $1.23B in inventory
After (Target)
  • Replenishment: many options depending on the
    type of SKU (Safety Stock, Weeks of Supply, Pull
    Replenishment, Manually replenish)
  • Safety stock reduced significantly thanks to
    improvements in forecast accuracy
  • Forecasting package accuracy: > 80%
  • $1.06B in inventory
Inventory Savings: $170M
Based on 1999 Inventory
77
Why do you need a Forecasting Package?
Actual Inventory Turns without Forecasting
Package
  • COGS: 5.964M; OH: 1.232M
  • Yearly Avg. Turns: 4.85
Target Inventory Turns with Forecasting Package
  • COGS: 5.964M; OH: 1.062M
  • Yearly Avg. Turns: 5.62
Based on 1999 COGS and Inventory data
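
Inventory turns are conventionally computed as cost of goods sold divided by inventory on hand. The sketch below applies that definition to the slide's figures; the function name `inventory_turns` is an assumption. Because turns are a ratio, the unit label cancels as long as both figures use the same one. The simple ratio gives roughly 4.84 for the actual case, slightly under the slide's 4.85, so the slide may have averaged inventory differently over the year.

```python
def inventory_turns(cogs: float, on_hand: float) -> float:
    """Annual inventory turns: cost of goods sold divided by inventory on hand."""
    if on_hand <= 0:
        raise ValueError("on_hand inventory must be positive")
    return cogs / on_hand


# Figures from the slide, both in the same (unspecified here) currency unit.
COGS = 5.964
ON_HAND_ACTUAL = 1.232
ON_HAND_TARGET = 1.062

if __name__ == "__main__":
    actual = inventory_turns(COGS, ON_HAND_ACTUAL)
    target = inventory_turns(COGS, ON_HAND_TARGET)
    savings = ON_HAND_ACTUAL - ON_HAND_TARGET
    print(f"actual turns: {actual:.2f}")
    print(f"target turns: {target:.2f}")
    print(f"inventory reduction: {savings:.3f}")
```

Holding COGS constant, every reduction in on-hand inventory raises turns, which is how better forecast accuracy shows up in this metric.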
78
Retail Customer Relationship Management Examples
79
NCR's Retail eCRM Solutions
80
NCR's Retail eCRM Solutions
CAMPAIGN MANAGEMENT
eCRM Analysis
Operational eCRM
CAMPAIGN MGMT
LOYALTY PROGRAMS
LOYALTY PROGRAMS
web logs
POS data
EC/PERSONALIZATION retailDecisions Intelligent
e-Commerce
data warehouse
TRANSACTION-ENABLED ALERTS
Other source systems
82
Ten Reasons Why Teradata is the Leader in Data
Warehousing
  • Business
  • Product Maturity Limits Risk
  • Reference Accounts and Track Record
  • Demonstrated Quickest Time to Solution
  • Lowest Administration Cost of Any Database
  • Most Complete Support Infrastructure
  • Technical
  • Effortless Scalability
  • Sustains High Levels of User Concurrency
  • Complex and Ad Hoc Query Performance w/Low
    Overhead
  • Fast, Fail-Safe Extract and Load Utilities
  • Seamless Connectivity and Mainframe Integration

83
Thank You!