Title: Basic%20Concepts%20of%20Datawarehousing%20An%20Overview
1Basic Concepts of Datawarehousing An Overview
2How to answer these Business Queries?
What is the sales distribution region wise?
3DSS
Decision Support Systems (DSS) are interactive
computer-based systems intended to help decision
makers utilize data and models to identify and
solve problems and make decisions. Data
Warehouse is the foundation of DSS process. It is
a Strategy and a Process for Staging Corporate
Data.
Enable users to get a Business View of the
data Facilitate Data based Decision Making that
would drive and improve the Business Discover
Hidden Trends
4Driving Forces for DSS
Business Speed
Reform
Customers
RESULT
COMPETITION
Technology
5Scenario without DSS
- Unavailability of Tools and Techniques for
acquisition of data from various sources for
answering business questions and making
decisions, in earlier days - Intensive efforts in data formatting than data
analysis - Static and inflexible report generation
- Time-lag in accessing the information at central
place
6OLTP v/s DSS Environment
- OLTP Environment
- get data IN
- large volumes of simple transaction queries
- continuous data changes
- low processing time
- mode of processing
- transaction details
- data inconsistency
- mostly current data
- DSS Environment
- get information OUT
- small number of diverse queries
- periodic updates only
- high processing time
- mode of discovery
- subject oriented - summaries
- data consistency
- historical data is relevant
7OLTP v/s DSS Environment
- OLTP Environment
- high concurrent usage
- highly normalized data structure
- static applications
- automates routines
- DSS Environment
- low concurrent usage
- fewer tables, but more columns per table
- dynamic applications
- facilitates creativity
8Benefits for Business User
- Flexible Information Access
- High Availability
- Ease of Use
- Quality Completeness of Data
- Focus on Information Processing
- Information Base for Knowledge Discovery
9 Available line of technology
- Advances in dbms technology
- Data warehousing
- On-line analytical processing
- Data mining
10Datawarehouse
- Data warehouses store large volumes of data which
are frequently used by DSS.It is maintained
separately from the organizations operational
databases - Data warehouse is subject-oriented, integrated,
time-variant, and nonvolatile collection of data - Subject-oriented Contains information regarding
objects of interest for decision support Sales
by region, by product, etc. - Itegrated Data are typically extracted from
multiple, heterogeneous data sources (e.g., from
sales, inventory, billing DBs etc.). - Time-variant Contain historical data, longer
horizon than operational system. - Nonvolatile Data is not (or rarely) directly
- updated.
11Datawarehouse
- Is the enabling technology that facilitates
improved business decision-making - Its a process, not a product
- A technique for assembling and managing a wide
variety of data from multiple operational systems
for decision support and analytical processing - Its a journey not a destination...
12DW Components
Data Mart Population
Aggregation Summarization
Transformation
Knowledge Discovery
Metadata Layer
13Operational Process
- Data extraction
- Data Cleansing and Transformation
- Data Load and refresh
- Build derived data and views
- Service queries
- Administer the warehouse
14Extraction Process ( Data Capturing )
Data Capturing Process
15Extraction Process (Data Transmission )
16Cleansing Process
17Transformation Process
Operational Data Store
18Summarization Process
19Metadata
- Data about Data
- Used to maintain Datawarehouse
- Control data
- Static
- Roles, permissions, naming standards, source
system names, - Locations, target names, transformation and
mapping rules - Dynamic
- Scheduling, scripts, load statistics, space
usage, - Backup statistics
- Business data
- Business rules,Who validates data,Who
controls,How they validate
20DW Components/Tools
- Extraction/transformation/load tool (family of
tools including data modeling tool, extraction
tool, Meta data repository, and DW administration
tools) - Meta data exchange architecture (API used to
integrate all components of DW with central Meta
data) - Target databases (relational, multidimensional,
hybrid) - Data access and analysis tools for end users
- Database servers, operating systems, networks
21DW Tools