Data Warehouse Architecture - PowerPoint PPT Presentation


PPT – Data Warehouse Architecture PowerPoint presentation | free to view - id: 24bf88-ZDc1Z


The Adobe Flash plugin is needed to view this content

Get the plugin now

View by Category
About This Presentation

Data Warehouse Architecture


Present a Data Warehouse Architectural Framework. Information Systems Architecture. Information Systems Architecture is the process of making the key choices that ... – PowerPoint PPT presentation

Number of Views:507
Avg rating:3.0/5.0
Slides: 24
Provided by: neilf8


Write a Comment
User Comments (0)
Transcript and Presenter's Notes

Title: Data Warehouse Architecture

Data Warehouse Architecture
  • Lecture 1

Lecture Objectives
  • Define Data Warehouse Architecture
  • Define Data Warehouse and Data Mart
  • Discuss The Great Debate
  • Present a Data Warehouse Architectural Framework

Information Systems Architecture
  • Information Systems Architecture is the process
    of making the key choices that are essential to
    the development of an information system.
    Architecture includes
  • Guiding Principles
  • Approaches/philosophies
  • Logical representations of a system
  • Hardware/Operating System
  • Computing model client/server vs traditional vs
  • Tools and technologies
  • It is key, when making these choices that they
  • Requirements driven
  • Take into consideration operational, technical
    and financial feasibility
  • Made within an architectural framework

Architecture Drivers
  • There are a lot of Drivers of Architecture

Architecture to Design to Implementation
Looking at the Work of Developing a System as a
set of Choices, Architecture can be Described as
Highest Level Choices
Business Strategy Line of Code / Process
Step Employee / Computer
Removal of Choices
How is Architecture Different from Design?
  • Its not Architecture can be considered
    high-level design
  • Architecture includes those aspects of the design
    that are essential to the information system
  • Architecture Example
  • Users must be able to self-serve (guiding
  • We will use a hub and spoke design where data
    will be placed in a central data warehouse, then
    be propagated to one or more data marts.
  • We will normalize data in the central warehouse
    and use a dimensional design in the data marts
  • We will use Oracle 8i as our DBMS (technical

Architecture vs Design
  • Not Architecture
  • The Order subject area will be composed of the
    following tables order_fact, customer_dim,
    product_dim and time_dim
  • The customer_dim table will have the following

The Value of Architecture
  • Communication
  • To business sponsors, and business users
  • Between members of the project team
  • Planning
  • Cross Check for Project Plan
  • Ensure that all important components of the data
    warehouse are accounted for
  • Flexibility and Growth
  • Thinking about overall architecture will reduce
    risk associated with the success of the data
  • Learning
  • Productivity and Reuse

Whats different about DW Architecture?
  • Transaction processing systems growth is
    (relatively) predictable
  • Example
  • A company uses SAP for order processing
  • They are opening a new retail store
  • They predict (based on experience) 2000
    transactions per week
  • To process this volume, we need 3 workstations to
    capture the transactions
  • Peak time each day is 11-2 when 50 of
    transactions occur

Whats Different About Data Warehouse
  • Success drives explosive growth
  • More users
  • More (complex) queries
  • More data
  • Performance is unpredictable
  • Unpredictable queries
  • Unpredictable use patterns

Data Warehouse
The Great Data Warehouse Architecture Debate
  • Bill Inmon The enterprise data warehouse
  • Ralph Kimball data marts
  • The compromise Hub and Spoke or Federated

If you build it, They will come
What is a Data Mart?
  • A data mart is a collection of subject areas
    organized for decision support based on the
    specific needs of a given user group.
  • Each mart may widely different from others (as we
    will see)
  • Typically, data marts are built on the
    dimensional data model
  • Facts things that the organization wants to
    measure revenue, orders, shipments, purchases,
  • Dimensions the means by which the organization
    wants to analyze the measures (facts) by
    customer, by time, by product BY ANY

What is a Data Mart?
  • There are two kinds of data marts--dependent and
  • A dependent data mart is one whose source is a
    data warehouse.
  • An independent data mart is one whose source is
    the legacy applications environment. All
    dependent data marts are fed by the same
    source--the data warehouse. Each independent data
    mart is fed uniquely and separately by the legacy
    applications environment.
  • Dependent data marts are architecturally and
    structurally sound.
  • Independent data marts have a number of
    significant issues

Data Warehouse Definition 3
  • Provides the infrastructure to feed data marts,
    which supply data to users.
  • Responsible for
  • Acquiring data
  • Cleaning data
  • Transforming data
  • Managing granular data
  • Managing ALL data required for a specific subject
  • Can be store data in a number of ways
  • Flat files
  • Other technologies (compression)

Data Warehouse Definition 3
  • Ensures that
  • Impact on source systems is minimized
  • Consistent data definitions across data marts
  • Conformed dimensions!!
  • Appropriate history is maintained

Architectural Framework
  • Three Views
  • Data (content of warehouse)
  • Technical (functions to be performed by the
  • Infrastructure (Hardware, Communications)
  • Four levels of detail for each view
  • Business Requirements
  • Architecture Models
  • Detailed Models
  • Implementation

Data Architecture
  • Business Requirements
  • What information do we need to make better
    business decisions?
  • What data assets are available?
  • Architecture Models and Documents - The
    Dimensional Model
  • What are the major entities (the facts and
    dimensions) that make up this information?
  • How do they relate to each other?
  • How should these entities be structured?
  • Detailed Models and Specs - The Logical and
    Physical Models
  • What are the individual elements, their
    definitions, domains, and rules for derivation?
  • What are the sources and how do they map to the
  • Implementation
  • Create and document the databases, indexes,
    backup procedures, etc.

Technical Architecture Back Room
  • Business Requirements and Audit
  • How will we get at the data, transform it, and
    make it available to our users?
  • How is this done today?
  • Architecture Models and Documents
  • What are the specific capabilities needed to get
    the data into a usable form in the desired
    locations at the appropriate times?
  • What are the major data stores and where should
    they be located?
  • Detailed Models and Specs
  • What standards and products provide the needed
  • How will we hook them together?
  • What are our development standards for code
    management., naming, etc.?
  • Implementation
  • Write the extracts and loads.
  • Automate the process.
  • Document the process.

Technical Architecture Front Room
  • Business Requirements and Audit
  • What are the major business issues we face?
  • How will we measure these issues?
  • How do we want to analyze the data?
  • Architecture Models and Documents
  • What will users need to get the information out
    in a usable form?
  • What major classes of analysis and reporting do
    we need to provide?
  • What are the priorities?
  • Detailed Models and Specs
  • What tools do we select to provide capabilities
  • What are the specifics for the report templates?
  • Who needs them?
  • How often?
  • How do we distribute them?

Technical Architecture Front Room
  • Implementation
  • Implement the reporting and analysis environment.
  • Build the initial report set.
  • Train the users.
  • Document.

  • Business Requirements and Audit
  • What hardware and network capabilities do we need
    to be successful?
  • What do we currently have in place?
  • Architecture Models and Documents
  • Where is the data coming from and going to?
  • Do we have enough calculation and storage
  • What are the specific capabilities we are
    counting on?
  • Do they exist?
  • Who is responsible for them?
  • Detailed Models and Specs
  • How do we interact with these capabilities?
  • What are the system utilities, calls, APIs, etc.?
  • Implementation
  • Install and test new infrastructure components.
  • Connect the sources to the targets to the
  • Document.

Technical Data Warehouse Framework
Technical Data Warehouse Framework
Metadata Management
Back Room
Data Staging Services
Front Room
Client/ Server
Re-engd. Data
Extracted Data
Data Marts
Web Server
Workflow and System Management