Data Warehouse - PowerPoint PPT Presentation

View by Category
About This Presentation

Data Warehouse


What is data warehouse – PowerPoint PPT presentation

Number of Views:64
Slides: 16
Provided by: himanchalt


Write a Comment
User Comments (0)
Transcript and Presenter's Notes

Title: Data Warehouse

Presentation on Data warehouse
Made by Himanchal Tiwari
Question What is data
warehouse? Answer Data warehouse is powerful
database model. That Users ability to quickly
analyze large and multi dimensional set.
  • Data in the warehouse must have a strong analytic
    characteristics, i.e.
  • Subject Oriented Data
  • Integrated Data
  • Time Referenced Data
  • Non-Volatile Data .

  • (1)Subject Oriented Data -
  • Data in the warehouse is group by subject rather
    than by activity.
  • Data in database is organized by activities e.g.
    Payroll processing, Shipping Processing, Loan
  • It Helps reducing response time of queries.
  • (2)Integrated Data -
  • Integration means de-duplicating information and
    merging it from many sources into one consistent
  • Integrating data is one of the most happing
    activity that happens in the DW as it reduce
    response time of queries.
  • E.g.
  • Employee Table
  • Account Table
  • Integrated Data
  • (3)Time Referenced Data -
  • Data is always is referred its time value
  • (4)Non-Volatile Data-
  • Non-Volatile Data helps users to dig into deep
    history and In DW is non changeable.

Explain Operational Vs.
Informational System
  • Operational System
  • O.S. as the name implies are the system are the
    system that help everyday operational of
  • For e.g. inventory, or order entry,
    manufacturing, payroll and account etc.
  • They are backbone of any business operational.
  • Most organization cannot function without O.S.
    and data maintained by the system.
  • They focus single area and hence have limited
  • Informational System
  • (i) For function like planning, forecasting,
    managing, marketing, engineering.
  • (ii) The above function requires knowledge based
    system like informational.
  • (iii) I.S. spans large area by grouping together
    several single areas.
  • (iv) I.S. deal with analyzing data and making
  • (v) I.S. have different scope and need large
    amount of operational.

Framework of DW Architecture of Data Warehouse
  • Source System.
  • Source data transport layer.
  • Data quality Control and data profiling layer.
  • Metadata management layer.
  • Data integration layer.
  • Data processing layer.
  • End user reporting layer.

  • 1. Source system-
  • For the processing of business transection,
    operational data exit.
  • But because of limited focus, operational
    database cannot be used to access data for
    informational purpose.
  • Many operational system are 10-15 year old and
    hence the data and data accessing technology is
  • Goal of data warehousing is to free the data
    locked in O.S.
  • Information in outside database obtained may be
    demographic, econometric of competitive.
  • This information super highway required more data
  • 2. Source data transport layer-
  • Data Transport Layer of DWA does data trafficking
  • It represents tools and process involved in
    transporting data from source system to
  • FTP is extensively used for data transmission.
  • (3) Data quality control and data Profiling layer
  • Incomplete and inaccurate data causes DW failures
  • DW does not generate data on its own ,rather
    depends on source system.
  • (4) Metadata management Layer.
  • For Fully Function Warehouse, it is necessary top
    have verity of metadata.
  • Metadata is info or data about data.
  • End users should be able to access data without
    having known where the data reside in the DW.

  • (5)Data integration layer .
  • This layer integrate data obtained from various
    source system.
  • Lot of formatting cleansing activities happen
    in the layer.
  • This layer consists of heavy tools job control
    procedure to keep the data warehouse up to date.
  • (6) Data Processing layer.
  • DW is where dimensionally modeled data obtained
    from different operational database resides
    analytical purposes.
  • Data in the warehouse is stored in a form that it
    is easy to access highly flexible.
  • (7) End user Reporting layer.
  • Success of DW depends on access of valuable
  • Based on business needs, there are different type
    of reporting architectures.
  • Since most of report queries are analytical
    there is tight integration between DW and
    reporting architecture.

  • Explain the Development of DW
  • Developing DW is an IT Project That requires
    careful planning, requirement definition , design
    prototype and implementation.
  • Following are the step involved in the
    development of warehouse.
  • Developing strategy.
  • Evolving DWA.
  • Designing Data warehouses.
  • Managing Data warehouses.

Data Warehouse Options Key factors that need to
be considered ? Scope of the data warehouse ?
Data redundancy ? Type of end-user Scope The
scope of a data warehouse may be as broad as all
the informational data for the entire enterprise
from the beginning of time, or it may be as
narrow as a personal data warehouse for a
single. manager for a single year. There is
nothing that makes one of these more of a data
warehouse than another. In practice, the broader
the scope, the more valuable the warehouse is to
the enterprise and the more expensive and time
consuming it is to create and maintain.
Data Redundancy There are essentially three
levels of data redundancy that enterprises should
think about when considering their data warehouse
options ? Virtual or point-to-point data
warehouses ? Central data warehouses ?
Distributed data warehouses Virtual or
point-to-point Data Warehouses A virtual or
point-to-point data warehousing strategy means
that end-users are allowed to get at operational
databases directly, using whatever tools are
enabled to the data access network. Virtual
data warehouses often provide a starting point
for organizations to learn what end-users are
really looking for. Central Data Warehouses The
central data warehouse is a single physical
database that contains all data for a specific
functional area, department, division, or
enterprise. Such warehouses are often selected
where there is a common need for informational
data and there are large numbers of end-users
already connected to a central computer or
network. A central data warehouse may contain
records for any specific period of time and
usually, contains information from multiple
operational systems.
Goals In order to provide information with which
users can increase profitability, gain
competitive advantage or make better business
decisions, the data warehouse must meet the
following goals Provide Easy Access to Corporate
Data ? The user access tools must be easy to
use, so that accessing warehouse data will be
simple and intuitive ? Access should be graphic
so that it will be easier for users to understand
and spot trends or area of interest ? Access
should be manageable by end users of data
warehouse, for these are predominantly business
analysts, mid-level managers, and CEOs who seldom
possess technical acumen to get the data but are
highly qualified to analyze the information. They
must easily get answers to their questions and
ask new questions, all without getting the IT
team involved ? The process of getting and
analyzing data must be fast. Questions leapfrog,
so you must get answers fast. The very nature of
data analysis is that not all questions are known
beforehand. This involves a lot of unpredictable,
ad-hoc inquiry during a typical data analysis
session. In an analytic environment, the user is
directly interacting with the facts to find
patterns, clues or problem areas, rather than
looking at a printed report. The answers have to
be delivered fast before users lose their train
of thought Provide Clean and Reliable Data for
Analysis ? For consistent analysis, the data
environment must be stable. Now that the users
can create their own reports, they must have
consistent data. One department doing an analysis
must get the same result as any other. ? Source
conflicts must be resolved. The source for
warehouse data is the transactional system and it
is frequently seen that these systems often have
data stored in several different applications in
different formats. A customer ID may be
represented as 100044629 in one system and
44629 in another. A straight load of
information for this company into the data
warehouse would result in two customers being
represented with transactions being listed and no
supporting details. ? Historical analysis must
be possible, so that data can be analyzed cross a
span of time.
Thanks Watch my Presentation