Business Data Solution Using Clustering, Linear Programming, and Neural Net

About This Presentation

Slides: 30

Provided by: Subha3

Learn more at: https://utminers.utep.edu

Transcript and Presenter's Notes

1
Business Data Solution Using Clustering, Linear
Programming, and Neural Net

A presentation to
El Paso del Norte Software Association
Somnath (Shom) Mukhopadhyay
Information and Decision Sciences Department
The University of Texas at El Paso
August 27th 2003

2
Outline of Presentation

Data Mining Definition
Introduction of Neural Net
- Physiological flavor
- General framework
- Classes of PDP models
- Sigma-PI units
- Conclusion

3
Outline of Presentation (Continued)

Examples of real-world application problems
Organization of theoretical concepts
- Three methods used for classification
- A new LP based method for classification
problem.
- Application to a fictitious problem with four
classes.
- Comparing LP method results with the results
from a neural network method
- Q&A

4
Data Mining - definition

- Exploring relationships in large amounts of data
- Should generalize
- Should be empirically validated
Examples
- Customer Relationship Management (CRM)
- Credit Scoring
- Clinical decision support

5
PDP Models and Brain

Physiological Flavor
Representation and Learning in PDP models
Origins of PDP
- Jackson (1869) and Luria (1966)
- Hebb (1950)
- Rosenblatt (1959)
- Grossberg (1970)
- Rumelhart (1977)

6
General Framework for PDP

A set of processing units
A state of activation
An output function of each unit
A pattern of connectivity among units
A propagation rule
An activation rule
A learning rule
An operating environment
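
As a rough illustration of how these components fit together, the sketch below wires a few linear threshold units to a Hebbian learning rule. The function names, the threshold, the learning rate, and the weights are all illustrative assumptions and do not come from the presentation.

```python
# Minimal sketch of the PDP framework components.
# All names and numeric values here are illustrative assumptions.

def output(activation):
    """Output function: here the identity, so a unit emits its activation."""
    return activation

def propagate(weights, outputs):
    """Propagation rule: net input to each unit is a weighted sum of sender outputs."""
    return [sum(w * o for w, o in zip(row, outputs)) for row in weights]

def activate(net, threshold=0.5):
    """Activation rule: linear threshold unit, 1 if net input exceeds the threshold."""
    return [1.0 if n > threshold else 0.0 for n in net]

def hebbian_update(weights, pre, post, rate=0.1):
    """Learning rule: strengthen a connection when sender and receiver are both active."""
    return [[w + rate * post_i * pre_j for w, pre_j in zip(row, pre)]
            for row, post_i in zip(weights, post)]

# Processing units: 3 input units feeding 2 receiving units.
# Pattern of connectivity: weights[i][j] connects input unit j to receiving unit i.
weights = [[0.4, 0.3, 0.0],
           [0.0, 0.2, 0.1]]

# Operating environment: one input pattern presented to the network.
state = [1.0, 1.0, 0.0]  # state of activation of the input units

net = propagate(weights, [output(a) for a in state])
new_state = activate(net)
weights = hebbian_update(weights, state, new_state)
```

Only the first receiving unit crosses the threshold for this input, so only its incoming connections from active input units are strengthened.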

7
The Basic Components of a PDP system
8
Classes of PDP models

Simple Linear Models
Linear Threshold Units
Brain State in a Box (BSB) by J. A. Anderson
Thermodynamic models
Grossberg
Connectionist modeling

9
Sigma-PI Units
10
A few real-world applications of interest to
organizations and individuals

Breast cancer detection
Heart disease diagnosis
Enemy submarine detection
Mortgage delinquency prediction
Stock market prediction
Japanese Character recognition and conversion

11
(No Transcript)
12
What is classification?

Identification of a set of certain mutually
exclusive classes
Identify a set of meaningful attributes that
discriminate among the classes
Illustrations
Using a meaningful set of attributes, can we
differentiate between frequent and infrequent
occurrence?

13
Decision Boundaries of a typical classification
problem
14
Three Methods for Classification

Identifying decision boundaries for each class
region
Linear discriminant (Glover et al., 1988)
Linear programming (Roy and Mukhopadhyay, 1991)
Neural Networks (Rumelhart, 1986)

15
A new LP based method for classification problem

Step 1. Identify and discard outliers using
Clustering
Step 2. Form decision boundaries for each class
region by using LP
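
Step 1 could be sketched as follows. The presentation does not say which clustering algorithm is used, so this example assumes a simple distance-threshold grouping and treats any cluster smaller than `min_size` as outliers. The names `cluster`, `discard_outliers`, `radius`, and `min_size` are hypothetical.

```python
from math import dist

def cluster(points, radius):
    """Group points so that points within `radius` of a cluster member join that cluster."""
    clusters = []
    for p in points:
        # Find every existing cluster that has a member within `radius` of p.
        near = [c for c in clusters if any(dist(p, q) <= radius for q in c)]
        merged = [p]
        for c in near:
            merged.extend(c)
            clusters.remove(c)
        clusters.append(merged)
    return clusters

def discard_outliers(points, radius, min_size):
    """Keep only points that belong to clusters with at least `min_size` members."""
    kept = []
    for c in cluster(points, radius):
        if len(c) >= min_size:
            kept.extend(c)
    return kept

# Made-up patterns of one class: four close together and one far away.
class_patterns = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (0.4, 0.4), (5.0, 5.0)]
kept = discard_outliers(class_patterns, radius=1.0, min_size=2)
```

The isolated pattern forms a cluster of its own and is dropped before any boundary is fitted in Step 2.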

16
Step 2: Form Decision Boundaries

Development of Boundary Functions
Use convex functions to calibrate the boundary.

One example function:
$f(x) = \sum_i a_i X_i + \sum_i b_i X_i^2 + \sum_i \sum_j c_{ij} X_i X_j + d$, where $j = i + 1$
17
Step 2: Form Decision Boundaries (Contd.)

One instance of the general function:

$f_A(x) = a_1 X_1 + a_2 X_2 + b_1 X_1^2 + b_2 X_2^2 + d$

18
Step 2: Form Decision Boundaries (Contd.)

LP formulation of the previous problem instance

Minimize $e$
s.t. $f_A(x_1) \ge e$, ..., $f_A(x_8) \ge e$
$f_A(x_9) \le -e$, ..., $f_A(x_{18}) \le -e$
$e \ge$ a small positive constant.

Minimize $e$
s.t. $a_2 + b_2 + d \ge e$ for pattern $x_1$
$a_1 + b_1 + d \ge e$ for pattern $x_2$
$-a_2 + b_2 + d \ge e$ for pattern $x_3$
$-a_1 + b_1 + d \ge e$ for pattern $x_4$
...
$a_1 + a_2 + b_1 + b_2 + d \le -e$ for pattern $x_{15}$
$a_1 - a_2 + b_1 + b_2 + d \le -e$ for pattern $x_{16}$
$-a_1 - a_2 + b_1 + b_2 + d \le -e$ for pattern $x_{17}$
$-a_1 + a_2 + b_1 + b_2 + d \le -e$ for pattern $x_{18}$
$e \ge$ a small positive constant.
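
To make the link between patterns and constraints explicit, the sketch below shows how substituting one pattern into fA(x) yields one linear constraint on the unknowns a1, a2, b1, b2, d. The pattern coordinates are an assumption: they are inferred from the coefficient signs in the constraints above and are not listed on the slide. Patterns x5 through x14 are not shown on the slide, so they are left out.

```python
# Coordinates inferred from the constraints above (an assumption, not given in the slide).
inside = {"x1": (0, 1), "x2": (1, 0), "x3": (0, -1), "x4": (-1, 0)}
outside = {"x15": (1, 1), "x16": (1, -1), "x17": (-1, -1), "x18": (-1, 1)}

VARIABLES = ("a1", "a2", "b1", "b2", "d")

def coefficient_row(x):
    """Coefficients that multiply (a1, a2, b1, b2, d) in fA(x) for one pattern."""
    x1, x2 = x
    return (x1, x2, x1 ** 2, x2 ** 2, 1)

def format_constraint(x, is_inside):
    """Write the constraint for one pattern as text, dropping zero terms."""
    terms = []
    for coef, var in zip(coefficient_row(x), VARIABLES):
        if coef == 0:
            continue
        sign = "-" if coef < 0 else "+"
        size = "" if abs(coef) == 1 else f"{abs(coef)} "
        terms.append(f"{sign} {size}{var}")
    lhs = " ".join(terms)
    if lhs.startswith("+ "):
        lhs = lhs[2:]
    rhs = ">= e" if is_inside else "<= -e"
    return f"{lhs} {rhs}"

for name, x in inside.items():
    print(name, ":", format_constraint(x, True))
for name, x in outside.items():
    print(name, ":", format_constraint(x, False))
```

Because the pattern values are known numbers, every constraint is linear in the coefficients even though the boundary function is quadratic in X1 and X2. That is what allows the boundary to be found with LP.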
19
Step 2: Form Decision Boundaries (Contd.)

Solution of this LP formulation gives decision
boundaries.

Specifically we get $a_1 = 0$, $a_2 = 0$, $b_1 = -1$, $b_2 = -1$, $d = 1 + e$.
Therefore, the boundary function
$f_A(x) = a_1 X_1 + a_2 X_2 + b_1 X_1^2 + b_2 X_2^2 + d$
translates into
$f_A(x) = 1 - X_1^2 - X_2^2 + e$
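
As a quick check, the stated solution can be substituted back into the constraints. This sketch assumes a value of 0.1 for the small positive constant e (`EPS` in the code) and uses pattern coordinates inferred from the LP constraints, since the slides do not list them.

```python
EPS = 0.1   # assumed value for the small positive constant e
TOL = 1e-12  # tolerance for floating-point rounding

def f_A(x, a1=0.0, a2=0.0, b1=-1.0, b2=-1.0, d=1.0 + EPS):
    """Boundary function with the coefficients reported as the LP solution."""
    x1, x2 = x
    return a1 * x1 + a2 * x2 + b1 * x1 ** 2 + b2 * x2 ** 2 + d

# Coordinates inferred from the LP constraints (an assumption).
inside = [(0, 1), (1, 0), (0, -1), (-1, 0)]      # x1..x4, need fA >= e
outside = [(1, 1), (1, -1), (-1, -1), (-1, 1)]   # x15..x18, need fA <= -e

inside_ok = all(f_A(x) >= EPS - TOL for x in inside)
outside_ok = all(f_A(x) <= -EPS + TOL for x in outside)
print(inside_ok, outside_ok)
```

For these patterns the inside constraints hold with equality, and the outside constraints reduce to -1 + e <= -e, which holds whenever e is at most 0.5.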
20
Step 2: Form Decision Boundaries (Contd.)

Putting this result into picture we have the
following decision boundary

21
Step 2: Form Multiple Decision Boundaries

A class does not have to be neatly packed within
one boundary.
For problems requiring multiple decision
boundaries, the algorithm can find multiple
disjoint regions for the same class. For
example, a class called "corner seats" in a
soccer stadium is scattered across four disjoint
regions.
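
One way to picture a class made of several disjoint regions is to treat the class as the union of its boundary functions: a point belongs to the class if at least one function is non-negative there. The stadium dimensions, the disc-shaped corner regions, and the names `make_disc` and `in_class` below are hypothetical, chosen only to mirror the corner seats example.

```python
def make_disc(cx, cy, r):
    """Boundary function that is >= 0 inside a disc of radius r centered at (cx, cy)."""
    return lambda x: r ** 2 - (x[0] - cx) ** 2 - (x[1] - cy) ** 2

# Hypothetical 100 x 60 stadium with one disc-shaped region at each corner.
corner_masks = [make_disc(cx, cy, 10) for cx in (0, 100) for cy in (0, 60)]

def in_class(x, masks):
    """A point is in the class if any one of its boundary functions is non-negative."""
    return any(f(x) >= 0 for f in masks)

print(in_class((3, 4), corner_masks))    # near the (0, 0) corner
print(in_class((97, 58), corner_masks))  # near the (100, 60) corner
print(in_class((50, 30), corner_masks))  # middle of the stadium
```

Each disc has the same quadratic form as the single boundary function fitted earlier, so the four regions could in principle come from four separate LP solutions.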

22
An example of a decision space of a fictitious
problem (It has four classes A, B, C, D)
23
Decision Boundary Identification Process for
Class D only
24
Six Decision Boundaries found for Class B
25
Constructing MLP from masks
Masking functions put on a network to exploit parallelism.
26
Neural Networks Method for Classification

Neural networks
- develop non-linear functions to associate inputs
with outputs
- make no assumptions about the distribution of data
- handle missing data well (graceful degradation)
Supervised neural networks
Estimating and testing the model
- Construct a training sample and a holdout sample
- Estimate model parameters using the training sample
- Test the estimated model's classification ability
using the holdout sample
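
The estimate-then-test procedure can be sketched as below. A single perceptron unit stands in for the neural network here, which is a simplification and not the model used in the presentation. The data are synthetic, and the names `split`, `train_perceptron`, and `accuracy` are hypothetical.

```python
import random

def split(samples, holdout_fraction, seed=0):
    """Shuffle the samples and return (training sample, holdout sample)."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_holdout = int(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]

def predict(model, x):
    """Threshold the weighted sum of the two inputs."""
    w, b = model
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

def train_perceptron(training, epochs=20, rate=0.1):
    """Estimate weights with the perceptron rule, using the training sample only."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, label in training:
            err = label - predict((w, b), x)
            w[0] += rate * err * x[0]
            w[1] += rate * err * x[1]
            b += rate * err
    return w, b

def accuracy(model, samples):
    """Fraction of samples whose predicted class matches the label."""
    return sum(predict(model, x) == label for x, label in samples) / len(samples)

# Synthetic two-class data: class 1 when x1 + x2 > 0 (made up for illustration).
rng = random.Random(1)
data = []
for _ in range(100):
    x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
    data.append((x, 1 if x[0] + x[1] > 0 else 0))

training, holdout = split(data, holdout_fraction=0.3)
model = train_perceptron(training)
holdout_accuracy = accuracy(model, holdout)
print(holdout_accuracy)
```

The holdout sample plays no part in estimating the weights, so its accuracy is a fairer indication of how the model generalizes than accuracy on the training sample.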

27
Comparison between LP and NN performance for
three real-world problems

28
Future Research

- Autonomous Learning
learn without outside interventions
does class dependent feature selection
derives simple if-then type classification rules
that humans can understand
develops non-linear functions to associate inputs
with outputs
