Chapter 17 Preparing Data for Mining

Slides: 15

Provided by: ronn161


1
Chapter 17: Preparing Data for Mining
2
Introduction

Just as manufacturing and refining transform raw
materials into finished products, raw data must
be transformed before it can be used for data
mining.
ECTL (extraction, clean, transform, load) is
the process/methodology for preparing data for
data mining.
The goal: the ideal DM environment (Ch 16).
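
The four ECTL steps can be sketched as a small pipeline. This is a minimal illustration, not the chapter's own implementation: the CSV sample, the column names, the `sales` table, and the choice to drop records with a missing sales value are all assumptions made for the example.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; the data and column names are assumptions.
RAW_CSV = """customer_id,product,sales
101,A,20.50
101,B, 5.00
102,A,
103,C,12.25
"""

def extract(text):
    """Extract: read raw records from the source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))

def clean(rows):
    """Clean: strip whitespace and drop records with a missing sales value."""
    cleaned = []
    for row in rows:
        row = {key: value.strip() for key, value in row.items()}
        if row["sales"]:
            cleaned.append(row)
    return cleaned

def transform(rows):
    """Transform: convert text fields to numeric types."""
    return [(int(r["customer_id"]), r["product"], float(r["sales"])) for r in rows]

def load(records, conn):
    """Load: write the prepared records into a database table."""
    conn.execute("CREATE TABLE sales (customer_id INTEGER, product TEXT, sales REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(clean(extract(RAW_CSV))), conn)
loaded = conn.execute(
    "SELECT customer_id, product, sales FROM sales ORDER BY customer_id, product"
).fetchall()
```

Dropping incomplete records is only one possible cleaning rule; a real project might impute or flag missing values instead.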

3
What the Data Should Look Like

All data mining algorithms want their input in
tabular form (rows and columns), as in a
spreadsheet or database table.

4
What the Data Should Look Like

Each row represents one customer, along with
whatever might be useful for data mining
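
Source systems usually hold many records per customer, so reaching one row per customer means summarizing them. The sketch below shows one way this could be done; the transaction tuples, the product list, and the `sales_X` / `quantity_X` column naming are hypothetical.

```python
from collections import defaultdict

# Hypothetical transaction records: (customer_id, product, sales, quantity).
transactions = [
    (101, "A", 20.0, 2),
    (101, "B", 5.0, 1),
    (101, "A", 10.0, 1),
    (102, "C", 7.5, 3),
]
PRODUCTS = ["A", "B", "C"]

def build_customer_rows(transactions, products):
    """Summarize many transactions into exactly one row per customer."""
    def empty_row():
        # Every customer gets the same columns, initialized to zero.
        return {f"{kind}_{p}": 0 for p in products for kind in ("sales", "quantity")}

    rows = defaultdict(empty_row)
    for customer_id, product, sales, quantity in transactions:
        rows[customer_id][f"sales_{product}"] += sales
        rows[customer_id][f"quantity_{product}"] += quantity
    return [{"customer_id": cid, **cols} for cid, cols in sorted(rows.items())]

customer_rows = build_customer_rows(transactions, PRODUCTS)
```

Each customer ends up with the same set of columns, which is what keeps the result tabular even when a customer never bought a given product.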
5
What the Data Should Look Like

The columns:
Contain data that describe aspects of the
customer (e.g., sales and quantity for each of
products A, B, and C)
Contain the results of calculations, referred to
as derived variables (e.g., total sales)
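
A derived variable such as total sales is computed from columns already in the row. The following is a small sketch; the sample rows and the `sales_` prefix convention are assumptions for illustration only.

```python
# Hypothetical customer rows; column names are assumptions.
customers = [
    {"customer_id": 101, "sales_A": 30.0, "sales_B": 5.0, "sales_C": 0.0},
    {"customer_id": 102, "sales_A": 0.0, "sales_B": 0.0, "sales_C": 7.5},
]

def add_total_sales(row):
    """Return a copy of the row with a derived total_sales column."""
    total = sum(value for name, value in row.items() if name.startswith("sales_"))
    return {**row, "total_sales": total}

customers_with_totals = [add_total_sales(row) for row in customers]
```

Returning a copy rather than modifying the row in place keeps the original columns available for other derived variables.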

6
What the Data Should Look Like

7
What the Data Should Look Like

Columns have important model roles in data
mining:
Input columns: input into the model
Target column(s): used only for predictive
models; the values are created by the algorithm
Ignored columns: not used in a particular data
mining analysis
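
The three roles can be made explicit by tagging each column before handing data to a modeling tool. This is only a sketch: the role table, the column names, and the `churned` target are hypothetical and do not come from the chapter.

```python
# Hypothetical role assignment; column names and the target are assumptions.
ROLES = {
    "customer_id": "ignored",
    "sales_A": "input",
    "sales_B": "input",
    "total_sales": "input",
    "churned": "target",
}

rows = [
    {"customer_id": 101, "sales_A": 30.0, "sales_B": 5.0, "total_sales": 35.0, "churned": 0},
    {"customer_id": 102, "sales_A": 0.0, "sales_B": 2.0, "total_sales": 2.0, "churned": 1},
]

def split_by_role(rows, roles):
    """Separate rows into input values and target values; ignored columns are dropped."""
    input_cols = [col for col, role in roles.items() if role == "input"]
    target_cols = [col for col, role in roles.items() if role == "target"]
    inputs = [[row[col] for col in input_cols] for row in rows]
    targets = [[row[col] for col in target_cols] for row in rows]
    return input_cols, inputs, target_cols, targets

input_cols, X, target_cols, y = split_by_role(rows, ROLES)
```

Keeping roles in one table makes it easy to reuse the same data with a different target or a different set of ignored columns in another analysis.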

8
What the Data Should Look Like

9
What the Data Should Look Like

10
Constructing the Customer Signature
11
Typical Customer Model
3 different definitions of Customer
12
The Dark Side of Data

13
Conclusion

There is a lot to think about and act on when
preparing data for mining.
Remember: a process/methodology is needed, which
includes ECTL (extraction, clean, transform,
load).
Remember: Data Mining Group skills are needed.

14
End of Chapter 17
