Enterprise Data Warehouse A Technical Perspective - PowerPoint PPT Presentation

1 / 84
About This Presentation
Title:

Enterprise Data Warehouse A Technical Perspective

Description:

Load new Core New 'current' record from Staging. Development Lifecycle. Post ETL. Measures ... Optionally used for job execution with 'dependency management' ... – PowerPoint PPT presentation

Number of Views:351
Avg rating:3.0/5.0
Slides: 85
Provided by: hom4364
Category:

less

Transcript and Presenter's Notes

Title: Enterprise Data Warehouse A Technical Perspective


1
Enterprise Data Warehouse A Technical
Perspective
  • Tony Dalwood
  • Information Architecture Management
  • University of South Australia

2
IT Structure
  • ISTS Information Strategy Technology Services
  • Information Strategy
  • Corporate Information Systems
  • E-Business
  • Information Architecture Management
  • Technical Services
  • Customer Services
  • Network Services
  • Systems Infrastructure

3
Information Architecture Management (IAM)
  • Merger of DBA team Information Integration team
    in Feb 2006
  • IAM manages
  • Corporate System Databases (3 DBAs)
  • Operational Data Store Management
  • Middle Tier Apps
  • Student Portal (myUniSA)
  • Staff Portal (UniSAinfo)
  • UniSAinfo Reporting
  • EDW

4
Project Governance
  • Steering Group
  • Includes Directors of ISTS, Planning and
    Assurance Services (PAS), Student Academic
    Services (SAS), Finance
  • Sponsors Group
  • Director of Planning Assurance Services
  • Dep. Director Information Strategy
  • Business Project Manager
  • Technical Project Manager
  • Reference Group
  • Senior Officers from PAS, HR, Research, SAS,
    Finance

5
Project Governance
  • Project Team
  • Business Project Manager (PAS)
  • Technical Project Manager (ISTS)
  • Design Architect/Dev Team Leader (ISTS)
  • Business Analyst (x1.5) (PAS)
  • Data Quality Manager (0.5) (PAS)
  • Developers (x3 variant) (ISTS)

6
EDW Project Milestones
  • Aug 2004 - Business Case submitted by Planning
    Assurance Services (PAS) and ISTS to extend
    current reporting environment to an EDW (150K)
  • Feb 2005 Project Commenced
  • Feb-July 2005 Data Gathering Workshops
  • Sep-Dec 2005 Technical Research Proof of
    Concept (0.5 IT Resource)
  • Jan-Feb 2006 External Consultancy (1 IT
    Resource)
  • May 2006 First Star Schema complete (Research
    Publications) (4 IT Resources)
  • July 2006 Three more Star Schemas complete
    (Research Income, AVCC Data, Research Staff
    Supervision) (4 IT Resources)
  • August 2006 First Soft Production Release
    (2.5 IT Resources)
  • Beyond Student Data Finance Data (min 2 IT
    Resources)
  • NB IT Resource not including part time Tech
    Project Manager

7
Project Goals
8
By-Products of an EDW Project
  • Data Discovery
  • What data do we have
  • How data is used and maintained
  • What is the quality of the data
  • How data can be utilised by more of the
    organisation
  • Enhanced Collaboration
  • Intra and Inter communication between business
    units, system owners and IT

9
Technical Project Plan
  • Warehousing Research
  • Proof of Concept exercise
  • External Assistance
  • Implementation of an Architecture
  • Development Standards Procedures
  • Build Implementation of Stage 1
  • Review

10
Proof of Concept
  • Validate Warehouse research findings
  • Proof of Concept covered the following topics
  • Project methodology
  • Technical architecture
  • Design methodology
  • ETL methodology
  • MetaData options
  • Data Quality approach
  • Security implementation options

11
Project Methodology
12
Technical Architecture
  • Inputs into Architecture
  • Business Goals
  • Existing Reporting Environments
  • Technology
  • Time
  • Resources/Skills

13
(No Transcript)
14
Data Flow Architecture
15
Design Methodology
  • Dimensional Modelling chosen as the design
    philosophy
  • Star Schemas/Snowflakes
  • Facts
  • Dimensions
  • Measures
  • Bridges
  • History Retention for Slowly Changing Dimensions
  • Warehouse records are versioned i.e. never
    deleted or overwritten.
  • Views to identify current records

16
Transformation of Design - Source
17
Transformation of Design - Target
18
ETL Methodology
  • Scripts Vs Tool decision
  • Tool chosen for following reasons
  • Already licensed for Oracle Internet Developer
    Suite that includes Oracle Warehouse Builder
  • Oracle Database environment
  • Oracle technical skills
  • Visibility of Development Environment
  • Auto technical Meta Data generation
  • Auto and accessible code generation using PL/SQL
  • Ability to include custom code
  • Integration with Oracle database and related
    Oracle technology
  • Framework for Beginners
  • Difficult to evaluate other products without
    expertise
  • Smarts Effort into Modelling and Design ETL
    should be a no brainer

19
MetaData
  • Data about Data
  • Oracle Warehouse Builder provides technical
    metadata
  • Business MetaData facility currently restricted
    to documentation and Cognos catalogs
  • Evaluation of MetaData methods to be reviewed at
    the completion of Stage 1 development

20
Data Quality
  • Pre-ETL
  • Technical profile to ensure physical design has
    mapped appropriate data elements
  • Business profile of source data to identify data
    attributes e.g. data type, patterns, nulls, min,
    max, outlies
  • ETL
  • Transform to conformed data sets
  • Foreign Key checks
  • Reporting of anomolies
  • Post ETL
  • Final Business profile to validate
    transformations of data

21
Security
  • Security options implemented are
  • Database Layer
  • Oracle roles to grant or deny access to database
    objects based on Business rules
  • Oracle views for granular data security where
    appropriate
  • User Layer
  • Access to end user Cognos catalogues/cubes
    controlled via Cognos security mechanisms and
    filesystem access

22
Development Lifecycle
  • Business Requirements
  • Design Process
  • Logical Design
  • Physical Design
  • Data Mapping
  • Data Profiling

23
Development Lifecycle
  • Design Build ETL Objects Processes
  • Extraction routines
  • Diff routines
  • Tag records as Inserts, Updates or Deletes
  • Build Staging tables
  • Build Target warehouse tables

24
Standard ETL Process
  • Scheduled Extract/Diff process runs to populate a
    Diff table in the Staging Area
  • ETL process then performs a standard set of steps
  • Load Staging from Diff table
  • Stamp Staging record according to Diff type (U, D
    or I)
  • Updated Record Tag staging record as new
    version of core record
  • Deleted Record Tag staging record Retired
    record in warehouse
  • Inserted Record Tag staging record to be new
    record (version 1)
  • Update Core End date existing current record
  • Load new Core New current record from Staging

25
Development Lifecycle
  • Post ETL
  • Measures
  • Summary data
  • Process Flows to execute ETL
  • Security views
  • End User Layer e.g. Catalogues

26
ETL Auditing
  • When did a process last run
  • How long did it run for
  • Did it Succeed, Fail or produce Warnings
  • How many records did it alter or insert
  • What were the data exceptions

27
UniSA EDW Toolset
  • Oracle Database
  • Oracle Warehouse Builder
  • Oracle Workflow
  • Oracle Enterprise Manager
  • Datiris Data profiler
  • Cognos Impromptu/Powerplay
  • Whiteboard and lots of A3 Paper!!!

28
Oracle Database
  • Options assisting Warehouse implementation
  • External tables
  • Materialised Views
  • Query Rewrite
  • Bitmap indexes
  • Partitioning
  • Star Query optimizer options

29
Oracle Warehouse Builder
  • Provides the design and development environment
    and framework for the build and deployment of
    Warehouse objects and transformation processes
  • Consists of Design Repository and Runtime
    components

30
(No Transcript)
31
Oracle Workflow
  • Optionally used for job execution with
    dependency management
  • Exists as an optional install with RDBMS
  • Run as Client/Server or HTTP browser based
    application
  • Workflow engine is a service on the warehouse
    database server administered by a workflow schema

32
Oracle Enterprise Manager
  • Optionally used as the scheduling option for
    submitting and monitoring Warehouse builder
    processes or workflows
  • Base OEM comes with RDBMS
  • Optionally run as standalone install or
    Management Server mode using a web console

33
Cognos 7.3 Reporting Suite
  • Catalogues
  • Report Developer access layer
  • Impromptu
  • Reporting capability
  • Powerplay
  • Multi-dimensional analysis
  • Upfront
  • Web interface

34
  • Oracle Warehouse Builder Demonstration

35
(No Transcript)
36
(No Transcript)
37
(No Transcript)
38
(No Transcript)
39
(No Transcript)
40
(No Transcript)
41
(No Transcript)
42
(No Transcript)
43
(No Transcript)
44
(No Transcript)
45
(No Transcript)
46
(No Transcript)
47
(No Transcript)
48
(No Transcript)
49
(No Transcript)
50
(No Transcript)
51
(No Transcript)
52
(No Transcript)
53
(No Transcript)
54
(No Transcript)
55
(No Transcript)
56
(No Transcript)
57
(No Transcript)
58
(No Transcript)
59
(No Transcript)
60
(No Transcript)
61
(No Transcript)
62
(No Transcript)
63
(No Transcript)
64
(No Transcript)
65
(No Transcript)
66
(No Transcript)
67
(No Transcript)
68
(No Transcript)
69
(No Transcript)
70
(No Transcript)
71
(No Transcript)
72
(No Transcript)
73
(No Transcript)
74
(No Transcript)
75
(No Transcript)
76
(No Transcript)
77
(No Transcript)
78
(No Transcript)
79
(No Transcript)
80
(No Transcript)
81
OWB 10g Release 2 - Paris
  • New Features
  • Design Tool
  • Graphic Interface Improvements
  • Built in Slowly Changing Dimension property
  • Data Profiling/Quality utilities
  • Better Integrated Workflow Engine
  • Job Scheduling within OWB via OEM

82
Project Review
  • Sanity Check on whole process, architecture,
    methodology
  • Business Technical
  • Evaluate ROI
  • Quantify metrics on time to deliver
  • Proposed Future phases
  • Usage Statistics
  • Hardware adequacy capacity

83
Useful Technical References
  • Links
  • Oracle Business Intelligence Technical Sites
  • http//www.oracle.com/solutions/business_intellige
    nce/index.html
  • http//www.oracle.com/technology/tech/bi/index.htm
    l
  • Rittman Blog
  • http//www.rittman.net/
  • Kimball Tips
  • http//www.kimballgroup.com/html/designtips.html
  • Texts
  • Oracle 9iRel2 Data Warehousing - Hobbs
  • Kimball Texts
  • The Data Warehouse Lifecycle Toolkit
  • The Data Warehouse Lifecycle Toolkit
  • The Data Warehouse ETL Toolkit

84
Questions ?
Write a Comment
User Comments (0)
About PowerShow.com