Trends In Data Warehousing - PowerPoint PPT Presentation

About This Presentation
Title:

Trends In Data Warehousing

Description:

Walmart vs. Amazon.com. Walmart is the US company most quoted for the ... Walmart filed lawsuit against Amazon.com for its unlawful way of pirating its DW ... – PowerPoint PPT presentation

Number of Views:1358
Avg rating:3.0/5.0
Slides: 42
Provided by: thomas850
Category:

less

Transcript and Presenter's Notes

Title: Trends In Data Warehousing


1
Chapter 3
Data Warehouse Fundamentals
  • Trends In Data Warehousing

Paul K Chen
1
2
Data Warehousing is Becoming Mainstream
  • In the early stages, four significant factors
    drove many
  • companies to move into data warehousing
  • Fierce competition
  • Government deregulation
  • Need to revamp internal processes
  • Imperative for customized marketing

3
Walmart vs. Amazon.com
  • Walmart is the US company most quoted for the
    successful application deployment of Data
    Warehousing technology.
  • Walmart filed lawsuit against Amazon.com for its
    unlawful way of pirating its DW technology by
    hiring away its DA personnel by offering hefty
    stock option to these people.

4
Significant Factors
  • These significant factors reflect the new trends
    in data
  • warehousing
  • Multiple Data Types
  • Data Visualization
  • Parallel Processing
  • Query Tools
  • Browser Tools
  • Data Fusion
  • Multidimensional Analysis
  • Agent Technology
  • E-Business- ERP, KM, CRM

5
Decision Making and Data Warehousing
  • A data warehouse is the data, processes, tools,
    and facilities to manage and deliver complete,
    timely, accurate, and understandable business
    information to authorized individuals for
    effective decision making.
  • Structured Data
  • Includes traditional relational databases
  • Typically internal and enterprise-owned
  • Predetermined
  • Unstructured Data
  • Includes articles, reports, images, and videos
  • Utilizes external data and expert opinion
  • Ad hoc

3
6
Decision Making and Data Warehousing
  • Management Systems
  • Extend relational databases to store and support
    multimedia
  • User-defined types (UDT) and functions (UDF) in
    SQL-3
  • Specialized Servers
  • Used for data which is incompatible with
    relational databases (e.g., Streaming video
    servers)
  • Objects may be linked to a relational database
  • Search Engines
  • Query by Image Content (shape, color, texture,
    etc)
  • Text retrieval on free-text documents
  • Audio and video searching

7
Decision Making and Data Warehousing
  • The trend is toward unstructured data and ad
    hoc warehouses.

Trend toward multimedia.
4
8
Types of Decision Support Tools
  • Data Inquiry
  • A request for a set of data based on some search
    criteria
  • Data Interpretation
  • Manipulation and visualization of a set of data
    (statistical analysis)
  • Multidimensional Analysis (OLAP)
  • n-dimensional spreadsheet analysis
  • Information Discovery
  • Pattern recognition, trends
  • Browsers
  • Search metadata catalogs
  • Search information object lists
  • Launch analysis tools

5
9
File-based Processing
6
10
Types of Decision Support Tools
  • Trend toward utilization of the Web,
    facilitated by Java.

7
11
Data Warehouse Architectures
  • Single Level
  • Decision support tools access operational data
    directly
  • Feasible only with clean data
  • Valid for unstructured data
  • Two Level Reconciled
  • Scrubbed operational data supporting ad hoc
    queries
  • Two Level Derived
  • Summarized data
  • Three Level
  • Maintains both scrubbed operational data, and
    summarized data.

10
12
Data Warehouse Architectures
  • Trend toward multidimensional data.

11
13
Data Stores and Access Enablers
  • Specialized Multidimensional Databases
  • Data is peregrinated and loaded into
    multidimensional databases
  • Long loading times but quick response
  • Relational-like Stores
  • Indexing is used to proved pseudo-multidimensional
    functionality
  • Relational Data Stores
  • An extra semantic layer generates
    multidimensional data on the fly
  • Hybrids
  • Details are stored in a traditional relational
    format
  • A subset is cached in a multidimensional data
    structure

12
14
Database Management System (DBMS)
13
15
Data Stores and Access Enablers
  • Trend toward multidimensional data.

14
16
Metadata
  • Integrated Components
  • All components (sources, stores, etc) use a
    common metadata repository to maintain their
    metadata
  • Standardized Metadata Interchange
  • Components keep their own metadata
  • Components use a common interchange information
    model and syntax to share metadata
  • Synchronized Metadata Interchange
  • Metadata changes are updated automatically across
    all components
  • Building of Business Metadata
  • Manually entered, free-text, plain language
    descriptions

17
Metadata
  • Trend toward better metadata, exchanged between
    systems.

18
Middleware - Gluing the Warehouse Together
  • Definition software that shields users and
    developers from differences in services and
    resources used by applications
  • Data warehouses often have heterogeneous
    databases, operating systems, networks, hardware,
    applications

19
Business Issues for Middleware
  • Role of middleware
  • Assist developer in data extraction/transformation
    and populating DW
  • Assist business user in accessing DW
  • Therefore needed at different points in life
    cycle
  • Types
  • Copy management data extraction,
    transformation, replication, and propagation
  • Gateways DB and independent gateways
  • Program-to program RPCs, TP monitors, ORBs
  • Message-oriented

20
Data Quality
  • Preprocessing Ownership
  • Source application owners know their data
  • Warehouse owners still must integrate the entire
    system
  • Automated Preprocessing Tools
  • Specialized packages
  • Generalized tools using pattern processing,
    lexical analysis, and statistical matching to
    reconcile a wide range of data sources
  • Custom programming
  • Reliability and Credibility of External Data
  • Quality ratings
  • Posted statistical meta-information (sample size,
    randomness, etc)

15
21
Data Quality
Trend toward better understanding of data
quality.
16
22
Significant Trends- Multiple Data Types
Image
Spatial

Structured Numeric
Video
Data Warehouse Repository
Structured Text
17
Audio
Unstructured Documents
23
Significant Trends- Data Visualization
  • More Chart Types-Pie chart, scatter plot
  • Interactive Visualization
  • Chart Manipulation
  • Drill Down

24
Significant Trends- Parallel Processing
  • Aims to solve decision-support problems using
    multiple nodes working on the same problem.
  • Performs many database operations simultaneously,
    splitting individual tasks into smaller parts so
    that tasks can be spread across multiple
    processors.
  • Parallel DBMSs must be capable of running
    parallel queries, parallel data loading, table
    scanning, and data archiving, and back up.

25
Significant Trends- Parallel Processing
  • Shared memory architecture (SMP)
  • All the servers share all the data
  • Shared nothing architecture (MPP)
  • Each server has its own partition of data

26
Significant Trends- Query Tools, Browse Tools
  • Flexible Presentation online results and report
    generator
  • Aggregate Awareness
  • Crossing Subject Areas
  • Multiple Heterogeneous Sources
  • Integration
  • Overcoming SQL Limitations
  • Data Fusion

27
Significant Trends- Integrating ERP and Data
Warehouse
  • Option 1 Companies implement the data warehouse
    solutions of the ERP vendor with the currently
    available functionality and await the
    enhancements.
  • Option 2 Companies implement customized data
    warehouse and use third-party tools to extract
    data from the ERP datasets. Retrieving and
    loading data from the proprietary ERP datasets
    is not easy.
  • Option 3 It is a hybrid approach that combines
    the functionalities provided by the vendors data
    warehouse with additional functionalities from
    third-party tools.

28
Significant Trends- Integrating KM and Data
Warehouse
Whats KM?
  • It is a systematic process for capturing,
    integrating, organizing, and communicating
    knowledge accumulated by employees.
  • It is a vehicle to share corporate knowledge so
    that employees may be more more effective and be
    productive in their work.
  • A knowledge management system must store all such
    knowledge in a knowledge repository.

29
Significant Trends- Integrating KM and Data
Warehouse
A specific corporate scenario
  • Sales have dropped in the South region.
  • Your marketing VP is able to discern this from
    your data warehouse by running some queries and
    doing some preliminary analysis. If he or she has
    access to a document prepared by an analyst
    explaining why the sales are low and suggesting
    remedial action.
  • Knowledge must be linked to the sales result to
    provide context to the sales numbers from the
    data warehouse.

30
Significant Trends- Integrating KM and Data
Warehouse
An airplane sales scenario The following
information is essential For a successful pitch
for airplane sales.
  • Model configuration
  • Production schedule (Delivery schedule)
  • Part replacement
  • Warranty
  • Knowledge obtained from the knowledge
    management
  • system can provide context to the information
    received from
  • the data warehouse to understand the story
    behind the above
  • information.

31
Summary of Trends
  • Ad Hoc Questions
  • Multidimensional Analysis (OLAP)
  • Web-Enabled Data Warehouse
  • Multimedia
  • Middleware
  • Metadata Interchange
  • Integrating ERP with Data Warehouse
  • Integrating KM with Data Warehouse

32
Complete E-Business Suite A Review
ERP
EAI
Marketing
Sales
Projects
Financial Services
One Database
Order Mgt
Procurement
Human Resources
Customer Relationship(CRM)
Manufacturing
Supply Chain (SCM)


33
Information System Categories
34
Information System Categories
35
Data Warehouse ERP
  • ERP Enterprise Resource Planning
  • A software solution that addresses
    enterprise needs taking the process view of an
    organization to meet the
  • organization goals tightly integrating all
    the functions
  • of an organization.
  • -- It integrates all the departments and
    functions across
  • a company into a single computer system
    that can serve all those different
    departments particular
  • needs.

36
WHY ERP?
  • Business
  • Customer satisfaction
  • Business development new areas, products
    and services
  • Ability to face competition
  • Efficient processes required for companys
    growth
  • IT
  • Present software does not met business needs.
  • Legacy systems difficult to maintain
  • Obsolete hardware/software difficult to
    maintain

37
How ERP?
  • ERP Combines various department systems into a
    single, integrated software program that runs
    off a single database so that the various
    departments can more easily share information and
    communicate with each other.
  • The best part of ERP is the way in which it
    improves the order fulfillment process that is
    taking the customer order and process it into an
    invoice and revenue.
  • It doesnt handle the front-end that is handled
    by CRM (Customer Relationship Management).

38
How ERP?(contd)
  • When a customer service representative enters a
    customer order to an ERP system, he has all the
    information necessary to complete the order such
    as customers credit rating and order history
    from the finance module, the companys inventory
    levels from the warehouse module and the shipping
    docks trucking schedule from the logistics
    module.
  • How its being done It integrates the financial
    information and customer order information . It
    does so by integrating the following
  • Database
  • Application
  • Interfaces
  • Tools
  • BPR

39
How ERP? (contd)
  • It standardizes and speeds up the manufacturing
    process. This saves time, increases productivity
    and reduces head count.
  • It reduces the inventory. Due to the information
    available about all the orders it helps to
    maintain the right level of stock and smoothes
    the manufacturing process.

40
Data Warehouse EAI
  • What is EAI? EAI refers to Enterprise Application
    Integration. EAI is the merging of applications
    and data from various new and legacy systems
    within a business. Various means are employed to
    accomplish EAI, including middleware, in order to
    unify IT resources, maximize new ERP investments,
    diminish errors and get everyone on the same
    page. EAI enables companies to link their
    existing software applications with each other
    and with portals. EAI provide the ability to get
    their applications to exchange critical data. EAI
    is usually close to the top of any CIO's list of
    concerns. There are different approaches to EAI.
    Some rely on linking specific applications with
    tailored code, but most rely on generic
    solutions, typically called middleware. XML,
    combined with SOAP and UDDI is a kind of
    middleware.

41
E-Business
  • Trend toward better understanding as well as
    consolidation of internal processes and data
  • Trend toward web-enabled data warehouse.
Write a Comment
User Comments (0)
About PowerShow.com