City of Charlotte Data Warehousing and Business Intelligence and Building Mashups By Example by Rattapoom Tuchinda, Pedro Szekely, and Craig A. Knoblock - PowerPoint PPT Presentation

1 / 36
About This Presentation
Title:

City of Charlotte Data Warehousing and Business Intelligence and Building Mashups By Example by Rattapoom Tuchinda, Pedro Szekely, and Craig A. Knoblock

Description:

City of Charlotte Data Warehousing and Business Intelligence and Building Mashups By Example by Rattapoom Tuchinda, Pedro Szekely, and Craig A. Knoblock – PowerPoint PPT presentation

Number of Views:43
Avg rating:3.0/5.0
Slides: 37
Provided by: coitwebU5
Category:

less

Transcript and Presenter's Notes

Title: City of Charlotte Data Warehousing and Business Intelligence and Building Mashups By Example by Rattapoom Tuchinda, Pedro Szekely, and Craig A. Knoblock


1
City of Charlotte Data Warehousing and Business
Intelligenceand Building Mashups By Example by
Rattapoom Tuchinda, Pedro Szekely, and Craig A.
Knoblock
by Doris Phillips 6010 Data Integration UNC
Charlotte
2
Business Decision Making
Which?
(Raper, J., Building a Data Warehouse in a
Heterogeneous Tool Environment)
3
Overview
  • Describe current data warehousing/business
    intelligence methods at the City of Charlotte
  • Present Building Mashups By Example byRattapoom
    Tuchinda, Pedro Szekely, and Craig A. Knoblock
  • Compare data warehousing techniques to data
    integration methods for mashups

4
Focus on Processes Common to Data Warehousing and
Data Integration
  • Data Retrieval
  • Data Modeling
  • Data Cleaning
  • Data Integration
  • Common processes but not the same!

5
Data Warehousing at the City of Charlotte
6
DISCLAIMER
7
City of CharlotteSimplified KBU Organization
Chart
(Raper, J., Building a Data Warehouse in a
Heterogeneous Tool Environment)
8
Sample Data Sources
  • Accounts Payable
  • Accounts Receivable
  • Asset Center
  • Emerald
  • Faster
  • General Ledger
  • Kronos
  • Hansen
  • PeopleSoft
  • Remedy
  • Unisys Helpdesk
  • Utility Billing System

9
HUB and Spoke Design
SOURCES
TARGET DATA MARTS
(Raper, J., Building a Data Warehouse in a
Heterogeneous Tool Environment)
10
Data Vault Components
  • Hubs - Key Topics
  • Links Relationships
  • Satellites Details
  • Auxiliaries Support
  • Implemented as Relational Tables
  • Diagramed Using Distinctive Shapes

(Raper, J., Phillips, D., User-Managed Metadata
Oracle Application Express Meets Oracle Warehouse
Builder 10.2)
11
Hub Tables
  • Hub records contain the logical business and
    physical keys to the business data and its
    context.
  • Relatively Stable over Time
  • Keys
  • Primary Key Surrogate ID
  • Natural Key Unique Logical Key Fields

(Raper, J., Phillips, D., User-Managed Metadata
Oracle Application Express Meets Oracle Warehouse
Builder 10.2)
12
Link Tables
  • Link records provide information related to
    relationships between Hubs and Links.
  • The information in link records can and does
    change over time.
  • May have one or more Satellite Type records
  • Keys
  • Primary Key Surrogate ID
  • Natural Key Unique Logical Key Fields
  • Foreign Key HUB or LNK SIDs

(Raper, J., Phillips, D., User-Managed Metadata
Oracle Application Express Meets Oracle Warehouse
Builder 10.2)
13
Satellite Tables
  • Satellite records provide the structure to hold
    the context or descriptive type information from
    operational systems.
  • Maintain these changes over time.
  • Related Directly to Hub or Link Records
  • Keys
  • Primary Key Surrogate ID
  • Foreign Key HUB or LNK SIDs

(Raper, J., Phillips, D., User-Managed Metadata
Oracle Application Express Meets Oracle Warehouse
Builder 10.2)
14
Auxiliary Tables
  • Auxiliary records contain a variety of cross
    reference and lookup descriptions tied to logical
    business keys.
  • Standalone Support Tables Not Directly Linked to
    HUB, SAT, or LNK tables.
  • Types of Auxiliaries
  • Lookups
  • Cross References
  • External Tables
  • ETL Work Tables
  • Keys
  • Primary Key is NK Unique Logical Key Fields
  • May have FK

(Raper, J., Phillips, D., User-Managed Metadata
Oracle Application Express Meets Oracle Warehouse
Builder 10.2)
15
Generic Data Vault Schema
(Raper, J., Phillips, D., User-Managed Metadata
Oracle Application Express Meets Oracle Warehouse
Builder 10.2)
16
Oracle Warehouse Builder MappingNew Code Capture
(Raper, J., Phillips, D., User-Managed Metadata)
17
Oracle Warehouse Builder Mapping New Code Capture
Detail
(Raper, J., Phillips, D., User-Managed Metadata)
18
Data Cleaning
  • Convert to UPPER case
  • Trim blank spaces
  • Replace NULL within UNK or UNKNOWN
  • Check for valid values using lookups
  • Remove duplicates

19
Metadata Capture - Cross Reference Organization
Codes APEX Application
(Raper, J., Phillips, D., User-Managed Metadata)
20
Data Warehouse Approach
  • Extracted data from multiple heterogeneous
    sources
  • Converted to Data Vault Architecture
  • Cleaned data and transformed into desired format
  • Combined data from multiple sources
  • Data provided to users for reporting and data
    visualizations

21
Building Mashups By Example
  • byRattapoom Tuchinda, Pedro Szekely, and Craig
    A. Knoblock

22
Overview
  • Mashup A web application that integrates data
    from multiple web sources to provide a unique
    service
  • Goal Create a mashup building framework where
    an average Internet user with no programming
    experience can build Mashups easily

23
Current Solutions
  • Widget Paradigm
  • Current Solutions involve selecting, customizing,
    and connecting widgets together
  • Disadvantages
  • As number of widgets gets large, locating the
    right widget becomes confusing and time consuming
  • Connecting widgets required understanding
    programming concepts

24
Widgets Yahoo Pipes
(Tuchinda, et. al, p. 140)
25
Microsoft Popfly
  • United States Information Widget - EDIT
  • United States Information Widget - RUN

26
Mashup Building Process
  • Data Retrieval
  • Extracting data from web pages into a structured
    source (table or XML)
  • Source Modeling
  • Process of assigning the attribute name for each
    data column
  • Data Cleaning
  • Required to fix misspellings and transform
    extracted data into the appropriate format
  • Data Integration
  • Specifies how to combine two or more data sources
    together

27
Karma Solution
The left window is an embedded web browser. The
top right window contains a table that a user
would interact with. The lower right window shows
options that the user can select to get into
different modes of operation. (Tuchinda, et. al,
p. 140)
28
Data Retrieval Karma Table
  • User Selects Value from Page, List
    Automatically Populated
  • User Selects Address for Value, List
    Automatically Populated

(Tuchinda, et. al, p. 140-141)
29
Source Modeling - Attributes
  • Karma Attributes
  • Example
  • Karma compares extracted data with existing data
    in its repository
  • Automatically populates some attributes
  • User specifying the correct attribute
  • Users search existing attributes in data
    repository

30
Data Cleaning
  • Users specify what data to clean
  • Karma tries to pick up desired transformations
    and populate the remaining columns

31
Data Integration
  • Karma
  • Analyzes attributes and data to determine
    possible join conditions
  • Suggests existing data sources in the repository
    that can be linked to the new data in the table

32
Data Integration
  • Karma Problems
  • Locating the related sources from the repository
  • Figuring out the query to combine the new source
    and existing valid sources

33
Data Integration
  • Karma Solution
  • Uses table constraints
  • Uses programming methods and procedures induced
    from user interaction

34
Mashup Building Approach
  • Combines most problem areas in Mashup building
    into a unified interactive framework that
    requires no widgets
  • Allows users with no programming background to
    easily create Mashups by example

35
Conclusions
  • Data Warehouse
  • Data Integration / Mashup
  • Data from enterprise applications and maintained
    sources
  • Historical data for trending and analysis
  • Extract, Transform, and Load
  • Often Strategic information
  • Data from web that may not be maintained or may
    contain errors
  • Real time data for current information
  • Extract, Model, and Clean
  • Often tactical information

36
Resources
  • Linstedt, D. http//www.danlinstedt.com/AboutDV.ph
    p. Nov. 9, 2008.
  • Linstedt, D., Garziano, K., Hultgren, H., May
    2008. The New Business Supermodel The Business
    of Data Vault Modeling.
  • Raper, Jim., Dec. 2004. Building a Data Warehouse
    in a Heterogeneous Tool Environment.
  • Raper, Jim., Dec. 2004. From Source to Loading
    Dock with Oracle Warehouse Builder
  • Raper, Jim., Phillips, Doris., 2006.
    User-Managed Metadata Oracle Application
    Express Meets Oracle Warehouse Builder 10.2
  • Tuchinda, R., Szekely, Pl., Knoblock, C., 2008.
    ACM., Building Mashups by Example.
Write a Comment
User Comments (0)
About PowerShow.com