Data Mining Tools - PowerPoint PPT Presentation

About This Presentation
Title:

Data Mining Tools

Description:

First generation data mining tool. Most widely used 'decision tree' ... Works with popular query and reporting, spreadsheet, statistical and OLAP & ROLAP tools. ...

Slides: 25
Provided by: Wes75
Learn more at: http://www.terry.uga.edu
Category:
Tags: data | mining | tools

Transcript and Presenter's Notes

Title: Data Mining Tools


1
Data Mining Tools
KnowledgeSeeker 4.5
  • Alexandra Bahl
  • Ray Bichard
  • Wes Griffin
  • Kitty Roberts

2
Presentation Outline
  • Overview of Data Mining
  • Introduction to KnowledgeSeeker
  • Major Competitors
  • Current Applications
  • Introduction to Key Terms
  • Interactive Demonstration
  • Summary
  • Questions & Answers

Wes Griffin, Ray Bichard, Kitty Roberts, Alex Bahl
3
What is Data Mining?
  • Data mining is a continuous, iterative process.
    It involves the use of software, sound
    methodology, and human creativity to achieve new
    insight through the exploration of data to
    uncover patterns, relationships, anomalies and
    dependencies.

4
Data Mining History
  • Data Collection and Database Creation (1960s)
  • Database Management Systems (1970s to early 1980s)
  • Advanced DBMS (mid-1980s to present)
  • Web-based DBMS (1990 to present)
  • Data Warehousing (DW) and Data Mining (late 1980s to present)
  • New Generation Integrated Information Systems
5
Data Mining Architecture
  • Graphical User Interface
  • Pattern Evaluation
  • Knowledge Base
  • Data Mining Engine
  • Database or DW server
  • Data Cleaning, Data Integration, Filtering
  • Database
  • Data Warehouse
6
What is KnowledgeSeeker?
  • A data analysis, data mining package
  • Enables users to quickly analyze and understand
    the relationships between variables in a data
    set.
  • First generation data mining tool
  • Most widely used decision tree data mining
    analytical tool
  • Price per copy: 4,750.00 USD

7
What is KnowledgeSeeker?
  • Produced by ANGOSS Software Corporation, which
    focuses solely on data mining software.
  • ANGOSS offers training and consulting services
  • ANGOSS produces data mining add-ins that accept
    data from all major databases
  • Works with popular query and reporting,
    spreadsheet, statistical, and OLAP & ROLAP tools.

8
The KnowledgeSeeker Process
  • Define business goal
  • Prepare the data
  • Analyze the data

9
The KnowledgeSeeker Process
  • Define business goal
  • What question needs to be answered?
  • What type of analysis will be performed?
  • What functionalities does the business require?

10
The KnowledgeSeeker Process
  • Prepare the data
  • Consider the various factors that could influence
    the outcome.
  • Examine database to identify those data fields
    which provide measurements of potential
    dependencies.
  • Create subset of the database containing only
    those data fields.
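
The "create a subset" step above can be sketched in a few lines of Python. This is only an illustration of the idea, not how KnowledgeSeeker itself prepares data; the function name select_fields, the sample records, and every field name are hypothetical.

```python
from typing import Dict, Iterable, List


def select_fields(records: Iterable[Dict[str, object]],
                  fields: List[str]) -> List[Dict[str, object]]:
    """Return a copy of each record restricted to the named fields."""
    return [{name: record[name] for name in fields} for record in records]


# Hypothetical source records; the field names are invented for illustration.
patients = [
    {"id": 1, "age": 66, "gender": "male", "smoker": "regular",
     "phone": "555-0100", "blood_pressure": "high"},
    {"id": 2, "age": 41, "gender": "female", "smoker": "never",
     "phone": "555-0101", "blood_pressure": "normal"},
]

# Keep the outcome field plus the fields that might influence it,
# and drop identifiers and contact details that carry no signal.
analysis_fields = ["age", "gender", "smoker", "blood_pressure"]
subset = select_fields(patients, analysis_fields)
print(subset[0])
```

Working from a copy leaves the source database untouched, so the field list can be revised and the subset rebuilt as the analysis evolves.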

11
The KnowledgeSeeker Process
  • Analyze the data
  • Automatically scans all the fields in the data
    set, summarizes the statistically significant
    patterns and relationships among the fields, and
    displays the result as a graphical decision tree,
    or as a knowledge base of rules.
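
One common way to find "statistically significant" relationships between a field and an outcome is a chi-square test on their contingency table. The sketch below ranks candidate fields by the Pearson chi-square statistic. It is an assumption about the general technique, not a description of KnowledgeSeeker's internal algorithm, and the records and field names are invented.

```python
from collections import Counter
from typing import Dict, List, Tuple

Record = Dict[str, str]


def chi_square(records: List[Record], field: str, target: str) -> float:
    """Pearson chi-square statistic for the field-by-target contingency table."""
    observed = Counter((r[field], r[target]) for r in records)
    row_totals = Counter(r[field] for r in records)
    col_totals = Counter(r[target] for r in records)
    n = len(records)
    stat = 0.0
    for row, row_n in row_totals.items():
        for col, col_n in col_totals.items():
            expected = row_n * col_n / n  # count expected if field and target were independent
            stat += (observed[(row, col)] - expected) ** 2 / expected
    return stat


def rank_fields(records: List[Record], fields: List[str],
                target: str) -> List[Tuple[str, float]]:
    """Score every candidate field against the target, strongest first."""
    scores = [(f, chi_square(records, f, target)) for f in fields]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: smoking tracks blood pressure exactly, gender does not.
records = [
    {"smoker": "yes", "gender": "m", "bp": "high"},
    {"smoker": "yes", "gender": "f", "bp": "high"},
    {"smoker": "yes", "gender": "m", "bp": "high"},
    {"smoker": "yes", "gender": "f", "bp": "high"},
    {"smoker": "no", "gender": "m", "bp": "normal"},
    {"smoker": "no", "gender": "f", "bp": "normal"},
    {"smoker": "no", "gender": "m", "bp": "normal"},
    {"smoker": "no", "gender": "f", "bp": "normal"},
]

ranked = rank_fields(records, ["gender", "smoker"], "bp")
print(ranked)
```

The raw statistic is only comparable across fields with the same number of categories. A fuller treatment would convert each statistic to a p-value using its degrees of freedom before ranking.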

12
KnowledgeSeeker Pulsepoints
  • ADVANTAGES
  • Easy to use
  • Powerful
  • Scalability
  • Flexibility
  • DISADVANTAGES
  • Less than impressive GUI

13
Major Competitors
14
Major Competitors
15
Current Applications
  • Manufacturing
  • Used by the R.R. Donnelley & Sons commercial
    printing company to improve process control, cut
    costs and increase productivity.
  • Used extensively by Hewlett Packard in their
    United States manufacturing plants as a process
    control tool both to analyze factors impacting
    product quality as well as to generate rules for
    production control systems.

16
Current Applications
  • Auditing
  • Used by the IRS to combat fraud, reduce risk,
    and increase collection rates.
  • Finance
  • Used by the Canadian Imperial Bank of Commerce
    (CIBC) to create models for fraud detection and
    risk management.

17
Current Applications
  • CRM
  • Telephony
  • Used by US West to reduce churning and increase
    customer loyalty for a new voice messaging
    technology.

18
Current Applications
  • Marketing
  • Used by the Washington Post to improve their
    direct mail targeting and to conduct survey
    analysis.
  • Health Care
  • Used by the Oxford Transplant Center to
    discover factors affecting transplant survival
    rates.
  • Used by the University of Rochester Cancer
    Center to study the effect of anxiety on
    chemotherapy-related nausea.

19
More Customers
20
Introduction to Key Terms
  • Dependent / Independent variables
  • Root node / nodes
  • Decision tree
  • Splits
  • Clustering
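
Most of these terms can be tied together with a small data structure. In the sketch below, each node stores the distribution of the dependent variable, the independent variable it splits on, and one child per split value; walking from the root node to each leaf yields one rule. The class and function names are hypothetical and the tree is built by hand for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """One node of a decision tree over a dependent variable."""
    distribution: Dict[str, int]        # counts of the dependent variable at this node
    split_field: Optional[str] = None   # independent variable used to split; None for a leaf
    children: Dict[str, "Node"] = field(default_factory=dict)  # split value -> child node


def rules(node: Node, conditions: Tuple[str, ...] = ()) -> List[str]:
    """Walk from the root to each leaf and emit one IF/THEN rule per leaf."""
    if not node.children:
        majority = max(node.distribution, key=node.distribution.get)
        premise = " AND ".join(conditions) or "TRUE"
        return [f"IF {premise} THEN {majority}"]
    out: List[str] = []
    for value, child in node.children.items():
        out.extend(rules(child, conditions + (f"{node.split_field} = {value}",)))
    return out


# Hand-built example: the root node is split once on a hypothetical "smoker" field.
root = Node(
    distribution={"high": 4, "normal": 4},
    split_field="smoker",
    children={
        "yes": Node(distribution={"high": 4, "normal": 0}),
        "no": Node(distribution={"high": 0, "normal": 4}),
    },
)

tree_rules = rules(root)
for rule in tree_rules:
    print(rule)
```

The same structure supports both views of the result: drawn node by node it is a decision tree, and flattened by rules() it is a knowledge base of rules.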

21
Interactive Demo
22
Questions
  • What percentage of people in the test group with
    these characteristics have high blood pressure:
    a 66-year-old male regular smoker with low to
    moderate salt consumption?
  • Do the risk levels change for a male with the
    same characteristics who quit smoking? What are
    the percentages?
  • If you are a 2% milk drinker, how many factors
    are still interesting?
  • Knowing that salt consumption and smoking habits
    are interesting factors, which one has a stronger
    correlation to blood pressure levels?
  • Grow an automatic tree. Is gender an interesting
    factor for a 55-year-old regular smoker who does
    not eat cheese?
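
Questions of the "what percentage of people with these characteristics" kind reduce to filtering the records on a profile and counting outcomes within the matching group. The sketch below shows that calculation. The records are invented for illustration and are not the demo data set, and the function and field names are hypothetical.

```python
from typing import Dict, List, Optional

Record = Dict[str, object]


def percent_with_outcome(records: List[Record], conditions: Record,
                         target: str, outcome: object) -> Optional[float]:
    """Percentage of records matching all conditions whose target equals outcome."""
    matching = [r for r in records
                if all(r[key] == value for key, value in conditions.items())]
    if not matching:
        return None  # no one in the data fits this profile
    hits = sum(1 for r in matching if r[target] == outcome)
    return 100.0 * hits / len(matching)


def person(smoking: str, bp: str) -> Record:
    """Build one invented 66-year-old male record with low to moderate salt intake."""
    return {"age": 66, "gender": "male", "smoking": smoking,
            "salt": "low-moderate", "bp": bp}


# Invented records, not the demonstration data.
people = [
    person("regular", "high"),
    person("regular", "high"),
    person("regular", "high"),
    person("regular", "normal"),
    person("quit", "high"),
    person("quit", "normal"),
    {"age": 40, "gender": "female", "smoking": "never", "salt": "high", "bp": "normal"},
]

profile = {"age": 66, "gender": "male", "smoking": "regular", "salt": "low-moderate"}
regular_pct = percent_with_outcome(people, profile, "bp", "high")
quit_pct = percent_with_outcome(people, {**profile, "smoking": "quit"}, "bp", "high")
print(regular_pct, quit_pct)
```

Changing one value in the profile, as in the second call, mirrors the follow-up question about a person with the same characteristics who quit smoking.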

23
Summary
  • Data mining has evolved into knowledge
    discovery
  • KnowledgeSeeker provides rapid data analysis
  • KnowledgeSeeker is flexible and inexpensive
  • KnowledgeSeeker is easy to use

24
Any Questions?