Title: Delivering Business Intelligence With Microsoft SQL Server 2005 - Part 2
1Delivering Business Intelligence With Microsoft
SQL Server 2005 - Part 2
2Summary of Topics
- Topic 1 What Is Data Mining?
- Topic 2 SQL Server Data Mining Architecture Overview
- Topic 3 Tasks Accomplished by Data Mining
- Topic 4 Steps for Data Mining
- Topic 5 Data Mining Algorithms
- Topic 6 Data Mining Structure Model
3Topic 1 What Is Data Mining?
- Data mining works in a manner similar to our
topographic map. It makes connections within the
data that may not be readily apparent to the
human observer. These connections make the data
easier to understand. Using this new
understanding, we can efficiently analyze the
data and set direction, so our organizations do
not tumble over a cliff.
4Making Connections
5Making Connections -- Planning a route using a
topographic map
6Making Predictions
- Data mining provides us with a similar
capability. We can use the patterns found in our
data to make predictions of what will happen
next. Data mining may find patterns in the way
our clients make use of our services. Based on
these patterns, we can predict which clients may
need additional services in the future. Data
mining may find patterns in the valid data
entered for a particular screen. Based on these
patterns, we can predict whether a newly entered
set of data is accurate or contains a data entry
error.
7Basic Data Mining Terminology
- The data mining industry uses a lot of terms
fairly loosely, so it's helpful for us to define
a few of these terms early on. This is not an
exhaustive list of data mining terms, only the
relevant ones for our discussion.
- Algorithm: The programmatic technique used to
identify the relationships or patterns in the data.
- Model: The definition of the relationship
identified by the algorithm, which generally
takes the form of a set of rules, a decision
tree, a set of equations, or a set of associations.
- Case: The collection of attributes and
relationships (variables) that are associated
with an individual object, usually a customer.
The case is also known as an observation.
8Basic Data Mining Terminology
- Case set: A group of cases that share the same
attributes. Think of a case set as a table with
one row per unique object (like customer). It's
possible to have a nested case set when one row
in the parent table, like customer, joins to
multiple rows in the nested table, like
purchases. The case set is also known as an
observation set.
- Dependent variable(s) (or predicted attribute or
predict column): The variable the algorithm will
build a model to predict or classify.
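The difference between a flat case set and a nested case set can be pictured with plain data structures. Below is a minimal Python sketch; all field names (CustomerID, Age, Purchases, and so on) are illustrative, not taken from the slides:

```python
# Flat case set: one row per unique object (here, customer).
flat_case_set = [
    {"CustomerID": 1, "Age": 42, "YearsRenting": 3},
    {"CustomerID": 2, "Age": 55, "YearsRenting": 1},
]

# Nested case set: one parent row (customer) joined to
# multiple rows in the nested table (purchases).
nested_case_set = [
    {
        "CustomerID": 1,
        "Age": 42,
        "Purchases": [
            {"Product": "Model Kit", "Amount": 24.99},
            {"Product": "Paint Set", "Amount": 12.50},
        ],
    },
]
```

In Analysis Services, the nested table relationship is declared in the mining structure rather than embedded in the rows, but the shape of the data is the same.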
9Basic Data Mining Terminology
- Independent variable(s) (or predictive attribute
or input column): The variable with descriptive
information used to build the model. The
algorithm creates a model that uses combinations
of independent variables to define a grouping or
predict the dependent variable.
10Basic Data Mining Terminology
- Discrete or Continuous Variables: Numeric columns
can contain continuous or discrete values. A
column in the Employee table called Salary that
contains the actual salary values is a continuous
variable. You can add a column to the table
during data preparation called SalaryRange,
containing integers to represent encoded salary
ranges (1 = 0 to 25,000; 2 = between 25,000
and 50,000; and so on). This is a discrete
numeric column. Early data mining and statistical
analysis tools required the conversion of strings
to numeric values like the encoded salary ranges.
Most tools, including most of the SQL Server data
mining algorithms, allow the use of character
descriptions as discrete values. The string "0 to
25,000" is easier to understand than the number
1. Discrete variables are also known as
categorical. This distinction between discrete
and continuous is important to the underlying
algorithms in data mining, although its
significance is less obvious to those of us who
are not statisticians.
11Basic Data Mining Terminology
- Regression: A statistical technique that creates
a best-fit formula based on a dataset. The
formula can be used to predict values based on
new input variables. In linear regression, the
formula is the equation for a line.
- Deviation: A measure of how well the regression
formula fits the actual values in the dataset
from which it was created.
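As a concrete sketch of these two terms, the following Python snippet (purely illustrative, not from the slides) computes a least-squares best-fit line and then measures its root-mean-square deviation from the data it was fit to:

```python
def linear_fit(points):
    """Least-squares best-fit line y = a*x + b for (x, y) pairs."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

def deviation(points, a, b):
    """Root-mean-square deviation of the fitted line from the data."""
    n = len(points)
    return (sum((y - (a * x + b)) ** 2 for x, y in points) / n) ** 0.5

data = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 8.0)]
a, b = linear_fit(data)
dev = deviation(data, a, b)
```

A small deviation means the regression formula tracks the actual values closely; a large one means the formula is a poor summary of the dataset.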
12Basic Data Mining Terminology
- Mining Structure: A Microsoft data mining term
used as a name for the definition of a case set
in Analysis Services. The mining structure is
essentially a metadata layer on top of a data
source view that includes additional data
mining-related flags and column properties, like
the field that identifies a column as input,
predict, both, or ignore. A mining structure can
be used as the basis for multiple mining models.
- Mining Model: The specific application of an
algorithm to a particular mining structure. You
can build several mining models with different
parameters or different algorithms from the same
mining structure.
13Topic 2 SQL Server Data Mining Architecture
Overview
14SQL Server Data Mining Architecture Overview
- Point A in the Architecture Figure shows how data
mining models are built using the dimensional
engine, leveraging its ability to load data and
quickly perform the base statistical calculations
like sums, averages, and counts. The data mining
server can also easily pull case data from both
relational and Analysis Services databases, as seen
at point B in the Architecture Figure.
15SQL Server Data Mining Architecture Overview
- Point C in Architecture Figure shows how the
typical developer will first experience data
mining by creating a BI Studio Analysis Services
project and then using the Data Mining Wizard to
create a new data mining structure and an initial
data mining model. The mining structure is a new
construct that provides a metadata layer allowing
several mining models to work with the same input
data. Each mining model in a mining structure can
have different algorithms and parameters. The
wizard provides model building guidance with auto
selection and adjustment of variables based on
the algorithm selected. The wizard also helps you
create case sets, including complex, nested
queries.
16Build, Deploy, and Process
- Most of the functions in the Data Mining Designer
work with the actual model as it exists in
Analysis Services. This means once the wizard is
complete, the developer must build and deploy the
project (which includes processing the model
cubes) before any more progress can be made.
Building the project writes the metadata out to
project files in the development environment. The
actual model does not come into being until the
project is deployed to an Analysis Services
instance. At that point, BI Studio creates a
database for the project in Analysis Services. It
writes out the model structure metadata and the
definition of each model. Finally, it creates a
cube for each model and processes the models,
inserting the training data so the algorithm can
calculate the rules, correlations, and other
relationships. Until the project is deployed and
a model is processed, it cannot be viewed in the
viewers.
18Accessing the Mining Models
- As you see at Point D in the Architecture Figure,
Data Mining eXtensions (DMX) to the SQL language is
at the core of all the Microsoft data mining APIs.
As the name suggests, DMX is an extension to SQL
designed to create, train, modify, and query data
mining models. DMX was introduced with SQL Server
2000 as part of the OLE DB for Data Mining APIs.
It has been enhanced in SQL Server 2005 with
additional options for data sources and more
flexible SELECT functionality. An easy way to
begin learning about DMX is to use the Mining
Model Prediction tab in the Data Mining Designer
and examine the syntax it generates for DMX
queries. The code can be copied to a DMX query
window in SQL Studio for further exploration.
Although DMX is an extension to SQL, queries are
submitted to the Analysis Services server; that's
where the data mining services are.
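To give a feel for the syntax the Mining Model Prediction tab generates, here is a sketch of a DMX prediction query held in a Python string. The model, data source, and table names (CreditRiskModel, MyDataSource, NewCustomers) are hypothetical; PREDICTION JOIN, OPENQUERY, and PredictProbability are standard DMX constructs:

```python
# Hypothetical DMX prediction query of the kind the Data Mining
# Designer generates; copy this style of statement into a DMX
# query window to experiment.
dmx_query = """
SELECT
  t.CustomerID,
  [CreditRiskModel].[CreditRisk],
  PredictProbability([CreditRiskModel].[CreditRisk]) AS RiskProbability
FROM [CreditRiskModel]
PREDICTION JOIN
  OPENQUERY([MyDataSource],
    'SELECT CustomerID, CompanySize, AnnualRevenue FROM NewCustomers') AS t
ON  [CreditRiskModel].[CompanySize]   = t.CompanySize
AND [CreditRiskModel].[AnnualRevenue] = t.AnnualRevenue
"""
```

The PREDICTION JOIN matches each new case against the trained model, and the SELECT list mixes source columns with predicted columns and prediction functions.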
19Accessing the Mining Models
- When the development environment submits its DMX
commands and queries to Analysis Services, it
uses the XML for Analysis (XMLA) APIs. In fact,
because Analysis Services is a native XMLA
server, the data mining models are available to
any application as a web service by using SOAP
protocols. It is still possible to access the
server with old OLE DB APIs, or ADO and ADO.NET.
20Integration Services and Data Mining
- Integration Services can play a major role in the
data mining process, as shown in the
Architecture Figure. Many of the standard
transforms used for data cleansing and data
integration are particularly valuable for
building the training and test case sets. Besides
the obvious transforms, like Data Conversion and
Derived Column, transforms like Percentage
Sampling, Row Sampling, Conditional Split,
Lookup, and Merge Join are powerful components
the data miner can use to build a set of packages
to prepare the case sets for the data mining
process. This is shown at point E in the
Architecture Figure. In addition to the standard
tasks, there are two Integration Services tasks
that directly interact with the data mining
models, shown at point F in the Architecture Figure.
21Integration Services and Data Mining
- The Data Mining Model Training destination task
is the target for a training set used to train
(or re-train) an existing mining model. This
capability is perfect for the ongoing re-training
required to keep certain mining models
current (recommendation models, for example). The
Data Mining Query task is specifically designed
to do prediction joins against a model in the IS
pipeline, once the model has been trained and
tested. This could be used to add scores to the
Customer table or identify significant data
anomalies during the nightly ETL process. It
could also be used in a real-time mode to flag
transactions that were potentially fraudulent.
22Additional Features
- There are several additional features that
will be important for certain applications. Most
of these can be found at point G in the
Architecture Figure and are listed briefly here:
- Extensibility: Microsoft has provided a set of
COM APIs that allow developers to integrate
additional data mining algorithms into the data
mining engine. They can integrate custom viewers
into the data mining designer as well. Someone
could even create a new viewer for an existing
Microsoft algorithm.
- Analysis Management Objects (AMO): AMO is a new
API for managing the creation and maintenance of
data mining objects, including creating,
processing, backing up, restoring, and securing.
23Additional Features
- Stored procedures and user-defined functions: A
developer can create what are essentially stored
procedures or user-defined functions and load
them as managed assemblies into Analysis
Services. This allows clients to work with large
mining models through the intermediate layer of
the server-based managed assembly.
- Text mining: It is possible to do some
interesting data mining on unstructured text
data, like the text in HTML files in a set of web
directories, or even text fields in a database.
For example, use the Integration Services Term
Extraction transformation to build a dictionary
of terms found in the text files. Then use data
mining to create classification rules to
categorize the documents according to the terms
they contain. This is essentially a data mining
application, not a new data mining algorithm, but
it has value in certain areas.
24Architecture Summary
- The previous slides show how data mining fits
into the overall SQL Server 2005 architecture,
and how well data mining has been
integrated into the SQL Server environment. It
should be clear by now that data mining is a
serious element of the BI toolset. Data mining
and application developers will have a lot more
to learn before they are proficient with all the
data mining components. Fortunately, Microsoft's
documentation is heavily weighted toward the
development community. Additional help should be
easy to find. A good place to start is a web site
called www.sqlserverdatamining.com, maintained by
the SQL Server data mining development team.
25Topic 3 Tasks Accomplished by Data Mining
- Classification
- Regression
- Segmentation
- Association
- Sequence Analysis
- Probability Predictions
26Classification
- We may need to classify a potential customer as
someone who is likely to need our services or
someone who is not. If we know how to classify an
individual, an entity, or a thing, we can make
more intelligent decisions as we deal with that
individual, entity, or thing.
27Example of Classification using Tree Induction
- Customer renting property > 2 years?
  - No: Rent property
  - Yes: Customer age > 45?
    - No: Rent property
    - Yes: Buy property
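The induced tree on this slide can be read directly as nested if-tests. The following minimal Python sketch applies those rules to a single case; the field names (years_renting, age) are illustrative only:

```python
def classify(customer):
    """Apply the slide's decision tree to one customer case.

    Rule 1: renting <= 2 years -> Rent property.
    Rule 2: renting > 2 years and age <= 45 -> Rent property.
    Rule 3: renting > 2 years and age > 45 -> Buy property.
    """
    if customer["years_renting"] <= 2:
        return "Rent property"
    if customer["age"] <= 45:
        return "Rent property"
    return "Buy property"
```

A trained Microsoft Decision Trees model is, in essence, a machine-generated version of exactly this kind of rule cascade, with the split points chosen from the training data.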
28Example of Classification using Neural Induction
29Regression
- Classification is used to predict the value for a
discrete attribute, meaning an attribute that has
one of a set number of distinct values.
Regression, on the other hand, is used to predict
a continuous value.
- To predict a continuous value, regression looks
at trends likely to continue and repeat over time.
- Like classification, regression also looks at
relationships between the value being predicted
and other continuous values available in the data.
30Regression - Value Prediction
- Linear regression attempts to fit a straight line
through a plot of the data, such that the line is
the best representation of the average of all
observations at that point in the plot.
- The problem is that the technique only works well
with linear data and is sensitive to the presence
of outliers (i.e., data values that do not
conform to the expected norm).
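The outlier sensitivity is easy to demonstrate. In this illustrative Python sketch (not from the slides), a single wild value drags the least-squares slope far away from the trend of the other points:

```python
def slope(points):
    """Least-squares slope of the best-fit line through (x, y) pairs."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    return (n * sxy - sx * sy) / (n * sxx - sx * sx)

clean = [(1, 1.0), (2, 2.0), (3, 3.0), (4, 4.0), (5, 5.0)]
with_outlier = clean[:-1] + [(5, 25.0)]  # one wildly wrong value

print(slope(clean))         # 1.0
print(slope(with_outlier))  # 5.0 -- one outlier quintuples the slope
```

This is why outlier screening during data preparation matters before handing data to a regression algorithm.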
31Segmentation
- Segmentation is the "divide and conquer" approach
to data analysis. Segmentation divides data into
groups with similar characteristics. We can then
analyze the characteristics of each group for
insights.
- Applications of database segmentation include
customer profiling, direct marketing, and cross
selling.
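Segmentation algorithms group cases without being told the groups in advance. The following Python sketch is a deliberately minimal one-dimensional k-means, just to show the assign-then-recenter idea; real clustering algorithms (including Microsoft Clustering) work over many attributes at once, and the sample ages are invented:

```python
def kmeans_1d(values, centers, iterations=10):
    """Minimal 1-D k-means: assign each value to its nearest center,
    then move each center to the mean of its assigned group."""
    for _ in range(iterations):
        groups = {c: [] for c in centers}
        for v in values:
            nearest = min(centers, key=lambda c: abs(v - c))
            groups[nearest].append(v)
        centers = [sum(g) / len(g) for g in groups.values() if g]
    return sorted(centers)

ages = [22, 25, 27, 44, 47, 49, 71, 73]
segments = kmeans_1d(ages, centers=[20, 50, 80])
```

The three final centers summarize three age segments that were never labeled in the input; that is the "unforeseen groups" idea behind segmentation.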
32Example of Database Segmentation using a Scatter
plot
33Association
- Association requires that we have some type of
grouping in our data. Multiple items could be
grouped together in a single sales transaction.
Multiple services could be provided to a single
family unit.
- Classification, segmentation, and association can
sometimes run together. They seem similar. The
difference is this: With classification, we pick
the attribute that defines the grouping and the
algorithm determines what most distinguishes
those groupings. With segmentation, the algorithm
itself creates the grouping, based on what it
determines as distinguishing characteristics.
With association, the underlying data already
contains some type of grouping. The algorithm
makes determinations about items likely to be in
the group, rather than attributes they are likely
to have in common.
34Sequence Analysis
- Sequence analysis is often applied to navigation
on a website. A user is on Page A. The user is
then most likely to navigate to Page B, with a
somewhat smaller probability of navigating to
Page C.
- Sequence analysis finds patterns between events
such that the presence of one set of items is
followed by another set of items in a database of
events over a period of time.
- For example, it can be used to understand
long-term customer buying behaviour.
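The page-navigation example above boils down to estimating transition probabilities from observed sequences. This illustrative Python sketch (page names and sessions invented) counts the transitions in a small web log and turns them into probabilities:

```python
from collections import Counter, defaultdict

def transition_probabilities(sessions):
    """Estimate P(next page | current page) from observed sequences."""
    counts = defaultdict(Counter)
    for session in sessions:
        for current, nxt in zip(session, session[1:]):
            counts[current][nxt] += 1
    return {
        page: {nxt: n / sum(following.values())
               for nxt, n in following.items()}
        for page, following in counts.items()
    }

logs = [
    ["A", "B", "C"],
    ["A", "B"],
    ["A", "C"],
]
probs = transition_probabilities(logs)
# From Page A, B is observed twice as often as C.
```

The Microsoft Sequence Clustering algorithm builds on this same idea of state transitions, additionally grouping cases with similar sequences into clusters.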
35Probability Predictions
- Probability prediction mathematically analyzes
what has occurred in the past and determines what
is most probable to occur if present trends continue.
36Topic 4 Steps For Data Mining
- Problem Definition
- Data Preparation
- Training
- Validation
- Deploy
37Problem Definition
- The first step for a data mining solution is to
define the problem we are trying to solve. Data
mining is great at finding patterns and
relationships that are not apparent to human
eyes. However, it cannot solve problems unless we
point that analytical power in a specific
direction. - We need to define the business goal we are
working toward. Perhaps we want to do a better
job at identifying perspective customers who are
a bad credit risk. Perhaps we want to identify
customers who are likely to switch to our
competitor's products or services. Perhaps we
want to predict future sales to put in place the
capacity to meet those needs.
38Problem Definition
- Once the business goal is defined, we determine
what we need data mining to do for us. We select
which of the five data mining tasks we are
looking to accomplish:
- Classification
- Regression
- Segmentation
- Association
- Sequence Analysis
- Our choice of task influences our choice of data
mining algorithm, to be discussed in Topic 5.
39Data Preparation
- When we learned about OLAP cube design, one of
our main concerns was where the data to populate
the cube was going to come from.
- With data mining, we have the same concern: where
are we going to get the data to be fed into our
data mining algorithms?
40Training
- With algorithms selected and our data in place,
we begin the process of working with the data
mining model. You can think of a data mining
model as the combination of one or more data
mining algorithms with some real, live data. The
model applies the algorithms to the data and
creates the classifications, associations,
regression formulas, and so forth that we use to
solve our business problem.
41Training
- This step is known as training the data mining
model. We provide our data to the data mining
algorithms, so they can be trained on a
particular aspect of our organization. They
"learn" what is unique about our data, as well as
the operations and entities, customers, products,
and sales transactions that produced that data.
42Validation
- We spoke previously in this chapter about
providing a data mining algorithm with past data,
so it could find the classifications,
associations, and so forth contained within. In
actuality, we often divide our past data into two
sets. One set of data is used to train the data
model. The second set is used to validate the
model after it is trained. Let's return to our
credit risk example for an illustration.
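The train/validate division described above can be sketched in a few lines of Python. This is illustrative only; Analysis Services and Integration Services provide their own splitting mechanisms (such as the Percentage Sampling transform):

```python
import random

def split_cases(cases, train_fraction=0.7, seed=42):
    """Shuffle the case set and split it into a training subset
    and a validation subset (a common 70/30 split)."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)  # deterministic for the demo
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

cases = list(range(100))
train, validate = split_cases(cases)
```

The model is trained only on the first subset; the second subset, which the model has never seen, is used to check how well its predictions hold up.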
43Deploy
- Once we test the data mining model and determine
which data mining algorithm gives us the most
accurate information, we can put it to work. We
place the model where it can be accessed by the
front-end tools that will put it to work. We use
the Data Mining Extensions (DMX) query language
to feed new records to our algorithm and have it
generate information.
44Topic 5 Data Mining Algorithms
- SQL Server 2005 provides us with seven data
mining algorithms. Most of these algorithms
perform several different tasks. Having a
detailed understanding of the inner workings of
each algorithm is unnecessary. Instead, this
section provides a brief explanation of each to
give you some background information. More
important is the knowledge of what each can be
used to accomplish.
45Data Mining Algorithms
- Microsoft Decision Trees
- Microsoft Naïve Bayes
- Microsoft Clustering
- Microsoft Association
- Microsoft Sequence Clustering
- Microsoft Time Series
- Microsoft Neural Network
46 5.1 Microsoft Decision Trees
- Function: The Microsoft Decision Trees
algorithm is one of the easiest algorithms to
understand because it creates a tree structure
during its training process. (You probably
already guessed that from the name.) The tree
structure is then used to provide predictions and
analysis.
47The Microsoft Decision Trees Algorithm
48Tasks
- The main purpose of the Microsoft Decision Trees
algorithm is Classification.
- It can also be used for Regression and Association.
49 5.2 Microsoft Naïve Bayes
- Microsoft's SQL Product Manager, Donald Farmer,
claims that because there is a Naïve Bayes
algorithm, there must be a "deeply cynical" Bayes
algorithm out there somewhere in data mining
land. I guess it is needed to bring balance to
data mining's version of the "force." We will try
not to be too naïve as we explore the benefits
and shortcomings of this algorithm.
50Function
- The Naïve Bayes algorithm looks at each attribute
of the entity in question and determines how that
attribute, on its own, affects the attribute we
are looking to predict. Figure 5.2 shows a Naïve
Bayes algorithm being used to predict whether a
customer is a good credit risk. One by one, the
Naïve Bayes algorithm takes a single attribute of
a customer (size of company, annual revenue, and
so forth) and looks at the training data to
determine its effect on credit risk.
51Figure 5.2 The Naïve Bayes algorithm
52Function
- In our diagram, 57% of companies with a size
attribute of small are bad credit risks. Only 14%
of companies with a size attribute of large are
bad credit risks. In this particular example, it
looks pretty cut and dried: we should never
extend credit to small companies and we should
always extend credit to large companies.
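The per-attribute counting behind those percentages can be sketched directly. In this illustrative Python snippet (the training rows and attribute names are invented, not Figure 5.2's actual data), each class is scored by its prior probability times the likelihood of each observed attribute value, which is the "naive" independence assumption:

```python
def naive_bayes_class_probs(training, attributes):
    """Score each class for a new case: prior P(class) times
    P(attr=value | class) for each attribute, then normalize."""
    classes = {row["risk"] for row in training}
    scores = {}
    for c in classes:
        rows = [r for r in training if r["risk"] == c]
        score = len(rows) / len(training)          # prior P(class)
        for attr, value in attributes.items():
            matching = sum(1 for r in rows if r[attr] == value)
            score *= matching / len(rows)          # P(attr=value | class)
        scores[c] = score
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}

training = [
    {"size": "small", "risk": "bad"},
    {"size": "small", "risk": "bad"},
    {"size": "small", "risk": "good"},
    {"size": "large", "risk": "good"},
    {"size": "large", "risk": "good"},
    {"size": "large", "risk": "bad"},
]
probs = naive_bayes_class_probs(training, {"size": "small"})
```

With these invented rows, a small company scores a two-thirds probability of being a bad risk; adding more attributes just multiplies in more per-attribute likelihoods, each considered on its own.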
53Tasks
- The Naïve Bayes algorithm can only be used for
Classification.
54 5.3 Microsoft Clustering
- The Microsoft Clustering algorithm builds
clusters of entities as it processes the training
data set. This is shown in Figure 5.3. Once the
clusters are created, the algorithm analyzes the
makeup of each cluster. It looks at the values of
each attribute for the entities in the cluster.
- Cluster analysis lets the data miner assemble
data into unforeseen groups containing similar
characteristics. Also known as "segmentation,"
this type of data mining is probably the most
widely used.
55Figure 5.3 The Microsoft Clustering Algorithm
56Tasks
- The main purpose of the Microsoft Clustering
algorithm is Segmentation.
- It can also be used for Regression and
Classification.
- Classification techniques include Tree Induction
and Neural Induction (see the examples that follow).
57Database Segmentation
- The aim is to partition a database into an
unknown number of segments, or clusters, of
similar records.
- It uses unsupervised learning to discover
homogeneous sub-populations in a database to
improve the accuracy of the profiles.
58Example of Database Segmentation using a Scatter
plot
59Example of Classification using Tree Induction
60Example of Classification using Neural Induction
61 5.4 Microsoft Association
- The Microsoft Association algorithm creates its
own sets of entities, and then determines how
often those sets occur in the test data set. This
is shown in Figure 5.4. In this set, we are
looking at groupings of products purchased
together in a single purchase transaction. To
simplify our example, we are only looking at
products in the World War II product subtype.
62Figure 5.4 The Microsoft Association algorithm
63Tasks
- The Microsoft Association algorithm can only be
used for Association.
64 5.5 Microsoft Sequence Clustering
- The Microsoft Sequence Clustering algorithm
examines the test data set to identify
transitions from one state to another. The test
data set contains data, such as a web log showing
navigation from one page to another or perhaps
routing and approval data showing the path taken
to approve each request. The algorithm uses the
test data set to determine, as a ratio, how many
times each of the possible paths is taken.
65Tasks
- The main purpose of the Microsoft Sequence
Clustering algorithm is Sequence Analysis and
Segmentation.
- It can also be used for Regression and
Classification.
66 5.6 Microsoft Time Series
- The Microsoft Time Series algorithm starts with
time-related data in the test data set. In Figure
5.6, this is sales data for each month. To
simplify things, we are only looking at data for
two products in our example.
67Figure 5.6 The Microsoft Time Series algorithm
68Tasks
- The Microsoft Time Series algorithm can only be
used for Regression.
69 5.7 Microsoft Neural Network
- Neural networks were developed in the 1960s to
model the way human neurons function. Microsoft
has created the Microsoft Neural Network
algorithm, so we can use neural networks for such
mundane activities as predicting product sales.
Of course, predicting product sales might not
seem so mundane if your future employment is
dependent on being correct.
70Tasks
- The main purpose of the Microsoft Neural Network
algorithm is:
- Classification
- Regression
71Topic 6 Data Mining Structure Model
- The data mining structure enables us to define
the data we will use, either from a relational
database or from an OLAP cube, as the source for
the mining. It also lets us define one or more
data mining algorithms to apply to that data.
72Data Columns
- The first step in doing any mining is to select
the data being mined. We select a data source:
either relational data or an OLAP cube. Once this
is done, we need to select a table if this is a
relational data source, or a dimension if this is
an OLAP data source. Finally, we must select the
table columns or dimension attributes to be used
as the data columns for our data mining.
73Data Columns
- Although an initial table or dimension must be
selected for data mining, it is possible to
select additional related tables or dimensions
for inclusion. These are known as nested tables.
If a relational database is being used, the
initial table and the nested table must be linked
by a foreign key relationship (or a chain of
foreign key relationships). If a cube is being
used, the initial dimension and the nested
dimension must be linked through the fact table.
74Data Mining Model
- The data mining model combines the data columns
with a data mining algorithm. Additionally, we
must determine how each data column should be
used by the data mining algorithm. This process
determines how the data mining algorithm
functions and what it predicts.
75Data Column Usage
- Each column in the data mining structure must
have one of the following usages:
- Key: The key is the unique identifier for a
table or a dimension. The key is used only to
identify records or members; it is not used by
the data mining algorithm to predict values.
- Input: Input columns are used by the data mining
algorithm when making a prediction. The input
values are the values that influence the
prediction made by the algorithm.
76Data Column Usage
- Predict: A predict column is a data column whose
value is being predicted by the data mining
algorithm. This column can also be used as an
input column. If data column A and data column B
are both predict, then data column A will serve
as an input when the algorithm is predicting data
column B, and vice versa.
77Data Column Usage
- Predict Only: A predict-only column is a data
column whose value is being predicted by the data
mining algorithm. This column cannot be used as
an input column when predicting other columns.
- Ignore: This data column is not used by the data
mining algorithm.
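The interaction between these usages can be made concrete with a small Python sketch. The column names and usage assignments below are illustrative, not from the slides; the point is which columns become available as inputs when each target is predicted:

```python
# Hypothetical usage flags for a mining structure's columns.
USAGE = {
    "CustomerID":    "key",
    "CompanySize":   "input",
    "AnnualRevenue": "input",
    "CreditRisk":    "predict",
    "ChurnRisk":     "predict",
    "Notes":         "ignore",
}

def inputs_for(target):
    """Columns usable as inputs when predicting `target`:
    every 'input' column, plus any other 'predict' column.
    'key', 'ignore', and 'predict_only' columns never qualify."""
    return sorted(
        col for col, usage in USAGE.items()
        if col != target and usage in ("input", "predict")
    )

print(inputs_for("CreditRisk"))
```

Because CreditRisk and ChurnRisk are both predict columns, each serves as an input when the other is being predicted, exactly as the slide describes.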
78Using Cube Slice As a Training Set
- Recall that data must be divided into training
and testing data sets. The cube slice enables us
to specify an expression that divides the cube
into two parts. The portion of the data that
satisfies the expression is fed into the data
mining algorithm for training. The rest of the
data becomes the testing data set.
79Defining a Data Mining Model
- Steps
- Define the business need
- Select the data mining algorithms
- Determine the data source
80Mining Model Viewer
- Now that we have several data mining models
created and trained, let's look at how we begin
to analyze the information. We do this by using
the data mining model viewers, which are found on
the Mining Model Viewer tab on the Data Mining
Design tab.
- Each data mining algorithm has its own set of
viewers, enabling us to examine the trained data
mining algorithm. These viewers present the
mining information graphically for ease of
understanding. If desired, we can also use the
Microsoft Mining Content Viewer to look at the
raw data underlying the graphical presentations.
81Mining Model Viewer
- Microsoft Tree Viewer
- Microsoft Mining Content Viewer: used to see the
detailed data underlying the diagrams
82The SQL Server Management Studio Windows
- Registered Servers Window
- Object Explorer Window
- Summary Window
- Query Windows
83Query Windows
- One of the functions of the SQL Server Management
Studio is the creation and execution of queries.
These queries may be Transact-SQL for relational
databases, MDX or XMLA for Analysis Services
databases, DMX for data mining, or SQL Mobile for
hand-held devices. Each type of query has its own
specific type of query window. We discuss how to
use these query windows throughout the rest of
this book.
84 MDX Definition
- The Multidimensional Expressions (MDX) language
provides the programming language for OLAP cube
navigation and over 50 mathematical functions for
calculating cube measures. It can be called and
used in the relational database language of
Transact-SQL.
85DMX Definition
- The Data Mining Extensions (DMX) language is a
new language in SQL Server 2005, which provides
the commands to easily set up data mining. It can
be called and used in the relational database
language of Transact-SQL.
86XMLA Definition
- XML for Analysis Services (XMLA) is an open,
XML-based standard protocol for interacting with
Microsoft SQL Server 2005 Analysis Services data
over an HTTP connection, such as an intranet or
the Internet. XMLA uses the Simple Object Access
Protocol (SOAP).
87Mining Accuracy Chart
- Before we put a data mining model to use in a
production environment, we need to ensure the
model is making predictions with the desired
accuracy. This is done using the Mining Accuracy
Chart tab in the Business Intelligence
Development Studio. The Mining Accuracy Chart tab
provides three tools for determining the accuracy
of our mining models: the Lift Chart, the Profit
Chart, and the Classification Matrix.
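Of the three tools, the Classification Matrix is the simplest to picture: it counts how often each actual value was predicted as each possible value. This illustrative Python sketch (the actual/predicted labels are invented) builds one:

```python
from collections import Counter

def classification_matrix(actual, predicted):
    """Count (actual, predicted) pairs: a classification (confusion)
    matrix comparing model output against known test values."""
    return Counter(zip(actual, predicted))

actual    = ["good", "good", "bad", "bad", "good"]
predicted = ["good", "bad",  "bad", "good", "good"]
matrix = classification_matrix(actual, predicted)
```

Counts on the diagonal (actual equals predicted) are correct classifications; off-diagonal counts show exactly which kinds of mistakes the model makes, which is often more useful than a single accuracy number.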
88Deployment
- As you see in Figure 6.1, the deployment step in
the Business Dimensional Lifecycle is where the
three parallel development tracks come back
together. This is the great unveiling of the
DW/BI system to the business community. The
quality of this first impression will strongly
influence the acceptance of the system, and you
get only one shot at it. Like any big event,
there are a lot of details that must fall into
place in order for the show to be successful.
89Figure 6.1 The Deployment step in the Business
Development Lifecycle
90Deployment
- What needs to be tested in the DW/BI system
before deployment?
- How and where should you perform that testing?
- What's the easiest way to deploy the completely
new DW/BI system, and how can you safely test and
deploy changes to an existing system?
- Why is DW/BI documentation important, and what
documentation does your team need to write?
- What kind of support and training will your users
need?