Data Mining in SQL Server 2000 and Yukon

1
Data Mining in SQL Server 2000 and Yukon
  • Richard Lees
  • EasternMining_at_Hotmail.com
  • RichardLees.com.au

2
Agenda
  • What isn't Data Mining
  • Demo
  • What is Data Mining
  • Demo
  • Create a data mine
  • 4 ways to view data mine
  • What's Coming in Yukon
  • Demo
  • Questions (throughout)

3
Which Questions are Data Mining?
  • Who are our biggest customers?
  • What are customers buying with cigars?
  • What are the customer retention levels of our
    branches?
  • Which customers have bought olives, feta cheese
    but no ciabatta bread?
  • Which regions have the highest male/female ratio
    of single twenty-somethings?
  • Which region has the lowest customer retention
    level, and who are its lost customers?

4
Demonstration
  • Ad hoc query
  • Drill through to details
  • Business Intelligence tool

5
History of OLAP and Data Mining
  • 19xx: Custom data mining available to the
    Fortune 100; SAS and SPSS offer data mining
    tools to those who can afford them
  • 1993: Codd defined 12 rules for OLAP
  • 1998: Microsoft SQL 7, OLAP v1
  • 1999: OLAP on the Web, ThinSlicer, many others
  • 2000: Microsoft SQL 2000, OLAP v2, Data Mining,
    English Query
  • Future: Data Mining V2, SQL 2005, BI Tools

6
Sample Data I Will be Using
  • Wellington Libraries Loan DB
  • We wanted sample data for data mining
  • They were just writing off a data warehouse
    project
  • The experts had spent 12 months trying to
    import data!
  • How could Microsoft help us?
  • The data are in IBM databases!

7
What is Data Mining?
Data mining is the use of powerful software
tools to discover significant traits or
relationships in databases or data warehouses.
It is often used to predict future events.
  • It exploits statistical algorithms such as
    decision trees, clustering, sequence clustering,
    association, naïve Bayes, neural networks and
    time series
  • Once the knowledge is extracted, it can be used to
  • Discover traits and relationships
  • Predict values of other cases
8
OLAP versus Data Mining
  • OLAP
  • Is about fast ad hoc querying
  • Analysis by dimensions and measures
  • Gives precise answers
  • Data Mining
  • May use an RDBMS or OLAP source
  • Is about discovering and predicting
  • Gives imprecise answers
  • OLAP is not a prerequisite for data mining, but
    it almost always comes first (like learning to
    ride a bike before driving a car)
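
The contrast above can be sketched in a few lines of Python. This is only an illustration under assumed data: the fact rows, the roll_up and naive_forecast helpers, and the figures are all hypothetical, and the forecast is a deliberately crude stand-in for a real mining model. The roll-up returns an exact total; the forecast returns an estimate that may well be wrong.

```python
from collections import defaultdict

# Hypothetical fact rows (branch, year, loans); not real library figures.
FACTS = [
    ("Central", 2002, 120),
    ("Central", 2003, 150),
    ("Suburb", 2002, 80),
    ("Suburb", 2003, 70),
]


def roll_up(rows, dimension):
    """OLAP-style query: sum the loans measure along one dimension.

    dimension is a column index: 0 for branch, 1 for year.
    """
    totals = defaultdict(int)
    for row in rows:
        totals[row[dimension]] += row[2]
    return dict(totals)


def naive_forecast(rows, branch):
    """Crude prediction: extend the last year-on-year change by one year."""
    ordered = sorted(rows, key=lambda r: r[1])
    history = [loans for b, _, loans in ordered if b == branch]
    return history[-1] + (history[-1] - history[-2])


loans_by_branch = roll_up(FACTS, 0)                    # precise answer
central_next_year = naive_forecast(FACTS, "Central")   # imprecise estimate
```

The first result is a fact about the data; the second is a guess about data that does not exist yet, which is the sense in which data mining "gives imprecise answers".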
9
Clusters
[Chart: clusters plotted against Annual Income and Age]
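
As a rough illustration of how clusters on age and annual income might be found, here is a minimal k-means sketch in plain Python. It is an assumption for teaching purposes, not the clustering algorithm that SQL Server ships: the customer points, the starting centroids, and the k_means helper are all hypothetical.

```python
from math import dist


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
        for p in points
    ]


def k_means(points, centroids, iterations=10):
    """Alternate assignment and centroid update until nothing moves."""
    centroids = list(centroids)
    for _ in range(iterations):
        labels = assign(points, centroids)
        updated = []
        for c, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                # New centroid is the mean of its members on each axis.
                updated.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster where it was
        if updated == centroids:
            break
        centroids = updated
    return assign(points, centroids), centroids


# Hypothetical customers: (age, annual income in thousands).
customers = [(22, 25), (25, 30), (27, 28), (52, 90), (55, 95), (60, 88)]
labels, centroids = k_means(customers, centroids=[customers[0], customers[-1]])
```

With this toy data the younger, lower-income customers fall into one cluster and the older, higher-income customers into the other. In practice the two axes would usually be rescaled first so that neither dominates the distance.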
10
Library Clusters
11
Decision Trees
  • Input data
  • About cases
  • Discovering relationships
  • Predicting outcomes
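
One common way a decision tree "discovers relationships" is to pick, at each node, the input attribute that best separates the outcomes. The sketch below uses entropy-based information gain in the style of ID3. That choice is an assumption made for illustration; the slides do not say which split score the Microsoft algorithm uses, and the cases and attribute names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of outcome labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(cases, attribute, target):
    """Drop in entropy of the target after splitting on one attribute."""
    base = entropy([c[target] for c in cases])
    remainder = 0.0
    for value in {c[attribute] for c in cases}:
        subset = [c[target] for c in cases if c[attribute] == value]
        remainder += len(subset) / len(cases) * entropy(subset)
    return base - remainder


def best_split(cases, attributes, target):
    """Attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(cases, a, target))


# Hypothetical loan cases: does the borrower return items late?
cases = [
    {"age_group": "child", "branch": "central", "late": "no"},
    {"age_group": "child", "branch": "suburb", "late": "no"},
    {"age_group": "adult", "branch": "central", "late": "yes"},
    {"age_group": "adult", "branch": "suburb", "late": "yes"},
    {"age_group": "adult", "branch": "central", "late": "yes"},
    {"age_group": "child", "branch": "suburb", "late": "no"},
]
root = best_split(cases, ["age_group", "branch"], "late")
```

Here age_group separates the outcomes completely while branch barely helps, so age_group becomes the root split. A full tree would repeat the same choice on each resulting subset.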

12
Data Mining
  • Demo with real data
  • Build a data mine
  • View data mine
  • Browse dependencies
  • Browse decision trees
  • Query using MDX
  • Query using ThinMiner
  • Batch update
  • Uses of Data Mining
  • Risk assessment
  • Claim likelihood
  • Customer profitability predictions
  • Fraud detection
  • Treatment efficacy
  • Product suggestions
  • Web shopping
  • Call centre tool

13
Successful Data Mining Projects
  • Two additional Critical Success Factors
  • Discover something interesting
  • Profit from discovery
  • For example
  • ComputerFleet
  • (Localhost)

14
What's Coming in Yukon
Decision Trees
Confusion Matrix
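
A confusion matrix counts how often each actual outcome was predicted as each possible outcome. The following is a minimal sketch in Python with hypothetical outcomes; it shows the idea only and is not how the Yukon viewer computes or displays it.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count (actual, predicted) pairs across all test cases."""
    return Counter(zip(actual, predicted))


def accuracy(matrix):
    """Share of cases where the prediction matched the actual outcome."""
    correct = sum(n for (a, p), n in matrix.items() if a == p)
    return correct / sum(matrix.values())


# Hypothetical test set: actual outcomes and a model's predictions.
actual = ["yes", "yes", "no", "no", "yes", "no"]
predicted = ["yes", "no", "no", "no", "yes", "yes"]

matrix = confusion_matrix(actual, predicted)
overall = accuracy(matrix)
```

The off-diagonal cells, such as ("yes", "no"), show which kind of mistake the model makes, which a single accuracy figure hides.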
15
Naïve Bayes
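
Naïve Bayes scores each possible outcome by multiplying its prior frequency by the frequency of each input value within that outcome, treating the inputs as independent. Below is a small sketch for categorical inputs with add-one (Laplace) smoothing. The smoothing choice, the helper names, and the training cases are assumptions for illustration, not details of the Yukon implementation.

```python
from collections import Counter, defaultdict


def train(cases, target):
    """Count outcomes, and input values within each outcome."""
    class_counts = Counter(c[target] for c in cases)
    value_counts = defaultdict(Counter)  # (attribute, outcome) -> value counts
    values_seen = defaultdict(set)       # attribute -> distinct values
    for c in cases:
        for attribute, value in c.items():
            if attribute != target:
                value_counts[(attribute, c[target])][value] += 1
                values_seen[attribute].add(value)
    return class_counts, value_counts, values_seen


def predict(model, case):
    """Return a normalised probability for each outcome."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    scores = {}
    for outcome, n in class_counts.items():
        score = n / total  # prior
        for attribute, value in case.items():
            # Add-one smoothing so an unseen value never zeroes the score.
            seen = value_counts[(attribute, outcome)][value]
            score *= (seen + 1) / (n + len(values_seen[attribute]))
        scores[outcome] = score
    norm = sum(scores.values())
    return {outcome: s / norm for outcome, s in scores.items()}


# Hypothetical training cases: does the borrower return items late?
cases = [
    {"age_group": "child", "branch": "central", "late": "no"},
    {"age_group": "child", "branch": "suburb", "late": "no"},
    {"age_group": "adult", "branch": "central", "late": "yes"},
    {"age_group": "adult", "branch": "suburb", "late": "yes"},
    {"age_group": "adult", "branch": "central", "late": "yes"},
    {"age_group": "child", "branch": "suburb", "late": "no"},
]
model = train(cases, "late")
probabilities = predict(model, {"age_group": "adult", "branch": "central"})
```

For an adult borrowing at the central branch, the "yes" outcome comes out far more probable than "no" on this toy data.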
16
Demonstration
  • Yukon
  • Development
  • New algorithms
  • Lift chart
  • Profit curve
  • Query tool
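
A lift chart shows how many of the real positive cases are captured when only the top-scored fraction of the population is targeted, compared with picking at random. The sketch below computes the points of such a curve in plain Python. The scores, outcomes, and the lift_curve helper are hypothetical and say nothing about how the Yukon chart is built.

```python
def lift_curve(scores, actual, steps=4):
    """Points (fraction targeted, fraction of positives captured).

    Cases are ranked by model score, highest first; actual holds 1 for a
    positive case and 0 otherwise.
    """
    ranked = [
        a for _, a in sorted(zip(scores, actual), key=lambda pair: pair[0], reverse=True)
    ]
    total_positives = sum(ranked)
    curve = []
    for i in range(1, steps + 1):
        cutoff = round(len(ranked) * i / steps)
        captured = sum(ranked[:cutoff]) / total_positives
        curve.append((i / steps, captured))
    return curve


# Hypothetical model scores and actual outcomes for eight customers.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
actual = [1, 1, 0, 1, 0, 0, 1, 0]

curve = lift_curve(scores, actual)
# Lift at the first point: captured share divided by targeted share.
lift_at_first_quarter = curve[0][1] / curve[0][0]
```

Targeting the top quarter by score captures half of the positives here, twice what random selection would give. A profit curve can be built from the same ranking by attaching a cost per contact and a revenue per positive.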

17
Questions
References
Microsoft Research http://Research.Microsoft.com/research/pubs
Richard Lees EasternMining_at_Hotmail.com http://RichardLees.com.au