Data Warehousing and Decision Support - PowerPoint PPT Presentation

1 / 25
About This Presentation
Title:

Data Warehousing and Decision Support

Description:

Add up the number of Hogs reported in a survey by state, county, race, sex ... If a specific question in a Hog survey is used in an Environmental survey, a new ... – PowerPoint PPT presentation

Number of Views:18
Avg rating:3.0/5.0
Slides: 26
Provided by: USD786
Category:

less

Transcript and Presenter's Notes

Title: Data Warehousing and Decision Support


1
Data Warehousingand Decision Support
  • Mickey Yost
  • United States Department of Agriculture
  • National Agricultural Statistics Service
  • ASC 99
  • September 24, 1999

2
Metadata Standards
The most visible aspects of the data warehouse
and the ones by which the project will be judged
are the portions that rise above the surface the
schema, metadata, and the desktop data access
tools you provide. Managing the Data Warehouse
(1997) Inmon, Welch, and Glassey
3
Data Warehousing
  • Uses a Dimensional Model

4
The Simplest Dimensional Model or Star Schema is
characterized by a long and narrow central Fact
table surrounded by 6 to 12 Dimension tables.
5
In the NASS model, the Fact table primarily
contains the Survey Responses reported by
respondents in a survey or census.
6
The dimension tables store and manage the
information that describes the facts stored in
the Fact table, and are each joined to that table.
7
The Varname dimension table, for example,
contains the question text used in a sample
survey or census. The response to a particular
question is not in this table.
8
Likewise, the Reporter dimension table contains
the names, addresses, and demographics of the
individuals who were interviewed in a survey or
census.
9
The Survey Dimension Table merely contains one
row for each unique survey or census being
tracked. For example 1997 CENSUS OF
AGRICULTURE OR 1999 SEPTEMBER HOG SURVEY
10
The Location dimension table merely contains one
row for each State and County in the U. S.
11
Dimensional database models are primarily
designed to allow for AD HOC data analysis by the
dimensional attributes being tracked in the model.
12
  • Questions that may be posed to this database
    include
  • Show me the entire reporting history for a single
    reporter.
  • Add up the number of Hogs reported in a survey by
    state, county, race, sex and age.

13
Data Warehousing
  • Uses a Dimensional Model
  • Tracks the Facts

14
Unlike transaction system databases that replace
specific data in a specific field, the Data
Warehouse accumulates Facts over time.
15
In the NASS Data Warehouse, the Facts are the
individual Survey Responses from 1997 to the
present covering multiple surveys, a census, 3600
questions, and 1.8 million respondents.
16
Data Warehousing
  • Uses a Dimensional Model
  • Tracks the Facts
  • Tracks the Metadata

17
Metadata is defined as the data about the data.
18
For our purposes, metadata describes the facts
stored in the dimension tables, and is primarily
intended for the business end user.
19
One of the main responsibilities of the Data
Warehouse is to correctly represent prior
history. Ralph Kimball That is done by adding
rows to the dimension tables.
20
Slowly changing dimensional attributes, and their
relationship to the facts they describe, are
managed by the addition of new rows when any of
the key attributes change.
21
For example, a key attribute in the Varname table
is the Varname Survey Name. If a specific
question in a Hog survey is used in an
Environmental survey, a new row is added for that
survey.
22
Not only is the survey source of the variable
maintained, but all of the original attributes
are also maintained.
23
This has created a golden opportunity to
standardize all variable names and their
definitions. This is accomplished by using the
Master Varname and Master Varname Description
columns in the Varname table.
24
Master Master Varname
Varname Original Varname Description
Survey Name Varname CCRNXXHV CORN ALL
AG SURVEY CCRNXXHV HARVESTED
HOG SURVEY CCRNXXHV ACRES
COUNTY ESTIMATES C133 COUNTY
ESTIMATES C202 COUNTY ESTIMATES C203
COUNTY ESTIMATES C531 COUNTY
ESTIMATES C538 COUNTY ESTIMATES C543
COUNTY ESTIMATES CNAHCURR COUNTY
ESTIMATES CYDGCHRV COUNTY
ESTIMATES HARVGRN COUNTY
ESTIMATES IC321 CENSUS K67
25
  • Standardization benefits
  • Browsing and selection of variables is much
    easer.
  • Difference between original variable and master
    variable reduced over time.
  • Survey and Census specifications integrated.
Write a Comment
User Comments (0)
About PowerShow.com