Business Data Solution Using Clustering, Linear Programming, and Neural Net - PowerPoint PPT Presentation

Slides: 30
Provided by: Subha3

Transcript and Presenter's Notes


1
Business Data Solution Using Clustering, Linear
Programming, and Neural Net
  • A presentation to
  • El Paso del Norte Software Association
  • Somnath (Shom) Mukhopadhyay
  • Information and Decision Sciences Department
  • The University of Texas at El Paso
  • August 27th 2003

2
Outline of Presentation
  • Data Mining Definition
  • Introduction of Neural Net
  • - Physiological flavor
  • - General framework
  • - Classes of PDP models
  • - Sigma-PI units
  • - Conclusion

3
Outline of Presentation (Continued)
  • Examples of real-world application problems
  • Organization of theoretical concepts
  • - Three methods used for classification
  • - A new LP based method for classification
    problem.
  • - Application to a fictitious problem with four
    classes.
  • - Comparing LP method results with the results
    from a neural network method
  • - Q&A

4
Data Mining - definition
  • - Exploring relationships in large amounts of data
  • - Should generalize
  • - Should be empirically validated
  • Examples
  • - Customer Relationship Management (CRM)
  • - Credit Scoring
  • - Clinical decision support

5
PDP Models and Brain
  • Physiological Flavor
  • Representation and Learning in PDP models
  • Origins of PDP
  • - Jackson (1869) and Luria (1966)
  • - Hebb (1950)
  • - Rosenblatt (1959)
  • - Grossberg (1970)
  • - Rumelhart (1977)

6
General Framework for PDP
  • A set of processing units
  • A state of activation
  • An output function of each unit
  • A pattern of connectivity among units
  • A propagation rule
  • An activation rule
  • A learning rule
  • An operating environment
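
As a rough illustration of how these components fit together, the sketch below wires two sending units to two receiving units. The specific choices are assumptions made for illustration and are not taken from the slides: the output function is the identity, the propagation rule is a weighted sum, the activation rule is logistic, and the learning rule is a simple Hebbian update.

```python
import math

def logistic(net):
    """Activation rule (assumed): squash the net input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-net))

def propagate(weights, outputs):
    """Propagation rule (assumed): net input is the weighted sum of incoming
    outputs. weights[j][i] is the connection strength from unit i to unit j."""
    return [sum(w * o for w, o in zip(row, outputs)) for row in weights]

def hebbian_update(weights, pre, post, rate=0.1):
    """Learning rule (assumed Hebbian): change w[j][i] by rate * post[j] * pre[i]."""
    return [[w + rate * post[j] * pre[i] for i, w in enumerate(row)]
            for j, row in enumerate(weights)]

# Operating environment: the outputs of the two sending units.
inputs = [1.0, 0.0]
# Pattern of connectivity among the units.
weights = [[0.5, -0.5], [0.0, 1.0]]

net = propagate(weights, inputs)           # [0.5, 0.0]
activations = [logistic(n) for n in net]   # state of activation
new_weights = hebbian_update(weights, inputs, activations)
```

Only the connections from the active sending unit change here, because the Hebbian term is zero whenever the sending unit's output is zero.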

7
The Basic Components of a PDP system
8
Classes of PDP models
  • Simple Linear Models
  • Linear Threshold Units
  • Brain State in a Box (BSB) by J. A. Anderson
  • Thermodynamic models
  • Grossberg
  • Connectionist modeling

9
Sigma-PI Units
10
A few real-world applications of interest to
organizations and individuals
  • Breast cancer detection
  • Heart disease diagnosis
  • Enemy submarine detection
  • Mortgage delinquency prediction
  • Stock market prediction
  • Japanese Character recognition and conversion

11
(No Transcript)
12
What is classification?
  • Identification of a set of certain mutually
    exclusive classes
  • Identify a set of meaningful attributes that
    discriminate among the classes
  • Illustrations
  • Using a meaningful set of attributes, can we
    differentiate between frequent and infrequent
    occurrence?

13
Decision Boundaries of a typical classification
problem
14
Three Methods for Classification
  • Identifying decision boundaries for each class
    region
  • Linear discriminant (Glover et al., 1988)
  • Linear programming (Roy and Mukhopadhyay, 1991)
  • Neural Networks (Rumelhart, 1986)

15
A new LP based method for classification problem
  • Step 1. Identify and discard outliers using
    Clustering
  • Step 2. Form decision boundaries for each class
    region by using LP

16
Step 2 Form Decision Boundaries
  • Development of Boundary Functions
  • Use convex functions to calibrate the boundary.

One example function: $f(x) = \sum_i a_i X_i + \sum_i b_i X_i^2 + \sum_i \sum_j c_{ij} X_i X_j + d$, where $j = i + 1$
17
Step 2 Form Decision Boundaries (Contd.)
  • One instance of the general function.

$f_A(x) = a_1 X_1 + a_2 X_2 + b_1 X_1^2 + b_2 X_2^2 + d$
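
A boundary function of this kind (linear terms, squared terms, optional cross-product terms and a constant) can be evaluated with a few lines of code. The sketch below is illustrative only: the name `boundary_value`, the 0-based indexing, and the dictionary used to hold the cross-term coefficients are assumptions, and the numbers are made up.

```python
def boundary_value(x, a, b, c, d):
    """Evaluate sum(a_i x_i) + sum(b_i x_i^2) + sum(c_ij x_i x_j) + d.

    `c` maps an index pair (i, j) to the coefficient of x_i * x_j.
    Pass an empty dict for a function with no cross terms."""
    linear = sum(ai * xi for ai, xi in zip(a, x))
    squares = sum(bi * xi * xi for bi, xi in zip(b, x))
    cross = sum(cij * x[i] * x[j] for (i, j), cij in c.items())
    return linear + squares + cross + d

# General form with one cross term (made-up coefficients).
value = boundary_value(x=(2.0, 3.0), a=(1.0, -1.0), b=(0.5, 0.0),
                       c={(0, 1): 2.0}, d=-4.0)

# Two-variable instance with no cross terms (made-up coefficients).
instance = boundary_value(x=(2.0, 3.0), a=(1.0, -1.0), b=(0.5, 0.0),
                          c={}, d=-4.0)
```

Because the function is linear in the coefficients, fixing the pattern `x` turns each evaluation into a linear expression in the unknowns, which is what makes an LP formulation possible.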

18
Step 2 Form Decision Boundaries (Contd.)
  • LP formulation of the previous problem instance

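Minimize $e$ subject to
$f_A(x_1) \ge e, \dots, f_A(x_8) \ge e$,
$f_A(x_9) \le -e, \dots, f_A(x_{18}) \le -e$,
$e \ge$ a small positive constant.

Written out for the individual patterns:

Minimize $e$ subject to
$a_2 + b_2 + d \ge e$ for pattern $x_1$
$a_1 + b_1 + d \ge e$ for pattern $x_2$
$-a_2 + b_2 + d \ge e$ for pattern $x_3$
$-a_1 + b_1 + d \ge e$ for pattern $x_4$
...
$a_1 + a_2 + b_1 + b_2 + d \le -e$ for pattern $x_{15}$
$a_1 - a_2 + b_1 + b_2 + d \le -e$ for pattern $x_{16}$
$-a_1 - a_2 + b_1 + b_2 + d \le -e$ for pattern $x_{17}$
$-a_1 + a_2 + b_1 + b_2 + d \le -e$ for pattern $x_{18}$
$e \ge$ a small positive constant.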
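
Each constraint comes from substituting one training pattern into the boundary function, which leaves a linear expression in the unknowns a1, a2, b1, b2 and d. The sketch below shows that substitution. The pattern coordinates are inferred from the constraints listed on the slide rather than stated there, and the helper names are hypothetical.

```python
def constraint_row(pattern):
    """Coefficients of (a1, a2, b1, b2, d) after substituting a pattern
    (X1, X2) into a1*X1 + a2*X2 + b1*X1^2 + b2*X2^2 + d."""
    x1, x2 = pattern
    return (x1, x2, x1 * x1, x2 * x2, 1)

def format_constraint(pattern, in_class):
    """Render the constraint as text: '>= e' for patterns inside the class,
    '<= -e' for patterns outside it."""
    names = ("a1", "a2", "b1", "b2", "d")
    terms = []
    for coeff, name in zip(constraint_row(pattern), names):
        if coeff == 0:
            continue  # the unknown drops out of this constraint
        sign = "-" if coeff < 0 else "+"
        size = "" if abs(coeff) == 1 else f"{abs(coeff)}*"
        terms.append(f"{sign} {size}{name}")
    lhs = " ".join(terms).lstrip("+ ")
    return f"{lhs} {'>= e' if in_class else '<= -e'}"

# Coordinates inferred from the slide's constraints (an assumption).
inside = {"x1": (0, 1), "x2": (1, 0), "x3": (0, -1), "x4": (-1, 0)}
outside = {"x15": (1, 1), "x16": (1, -1), "x17": (-1, -1), "x18": (-1, 1)}

lines = [f"{format_constraint(p, True)} for pattern {k}" for k, p in inside.items()]
lines += [f"{format_constraint(p, False)} for pattern {k}" for k, p in outside.items()]
```

The rows returned by `constraint_row` are the form an LP solver would need as its inequality matrix, with one extra column for $e$.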
19
Step 2 Form Decision Boundaries (Contd.)
  • Solution of this LP formulation gives decision
    boundaries.

Specifically, we get $a_1 = 0$, $a_2 = 0$, $b_1 = -1$, $b_2 = -1$, $d = 1 + e$. Therefore, the boundary function $f_A(x) = a_1 X_1 + a_2 X_2 + b_1 X_1^2 + b_2 X_2^2 + d$ translates into $f_A(x) = 1 - X_1^2 - X_2^2 + e$.
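
This solution can be checked by substituting it back into the constraints. The sketch below takes a1 = 0, a2 = 0, b1 = -1, b2 = -1 and d = 1 + e, and tests the eight patterns whose constraints are spelled out in the formulation. The pattern coordinates are inferred from those constraints, and the value chosen for e is an arbitrary assumption. Exact fractions are used to avoid rounding effects at the boundary.

```python
from fractions import Fraction

def f_a(x1, x2, a1, a2, b1, b2, d):
    """Two-variable boundary function with linear and squared terms."""
    return a1 * x1 + a2 * x2 + b1 * x1**2 + b2 * x2**2 + d

e = Fraction(1, 100)  # assumed "small positive constant"
coeffs = dict(a1=0, a2=0, b1=-1, b2=-1, d=1 + e)

# Coordinates inferred from the listed constraints (an assumption).
inside = [(0, 1), (1, 0), (0, -1), (-1, 0)]      # patterns x1 to x4
outside = [(1, 1), (1, -1), (-1, -1), (-1, 1)]   # patterns x15 to x18

# Class A patterns must satisfy f_A >= e; the others must satisfy f_A <= -e.
inside_ok = all(f_a(x1, x2, **coeffs) >= e for x1, x2 in inside)
outside_ok = all(f_a(x1, x2, **coeffs) <= -e for x1, x2 in outside)
```

With these values the four class A patterns sit exactly at $f_A = e$, so their constraints are tight, while the four outside patterns evaluate to $e - 1$.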
20
Step 2 Form Decision Boundaries (Contd.)
  • Plotting this result gives the following decision
    boundary.

21
Step 2 Form Multiple Decision Boundaries
  • A class does not have to be neatly packed within
    one boundary.
  • For problems requiring multiple decision
    boundaries, the algorithm can find multiple
    disjoint regions for the same class. For
    example, a class called "corner seats" in a
    soccer stadium is scattered across four disjoint
    regions.
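
One way to picture a class made of several disjoint regions is to give each region its own boundary function and to treat a point as a member of the class when any one of them is non-negative. The sketch below applies that idea to the corner seats example. The stadium dimensions, the circular shape of each region and the radius are invented for illustration.

```python
def in_disc(point, center, radius):
    """Boundary function of one region: non-negative inside a disc."""
    (x, y), (cx, cy) = point, center
    return radius**2 - (x - cx)**2 - (y - cy)**2 >= 0

# Hypothetical 100 x 60 stadium with one region per corner.
CORNERS = [(0, 0), (100, 0), (0, 60), (100, 60)]

def is_corner_seat(point, radius=10):
    """A point belongs to the class if it falls inside any of the regions."""
    return any(in_disc(point, corner, radius) for corner in CORNERS)

near_origin = is_corner_seat((3, 4))       # inside the first region
near_far_end = is_corner_seat((95, 58))    # inside the fourth region
midfield = is_corner_seat((50, 30))        # inside none of the regions
```

The class itself is the union of the four regions, so no single convex boundary has to enclose all of it.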

22
An example of a decision space of a fictitious
problem (It has four classes A, B, C, D)
23
Decision Boundary Identification Process for
Class D only
24
Six Decision Boundaries found for Class B
25
Constructing MLP from masks
  • Masking functions put on a network to exploit
    parallelism.
26
Neural Networks Method for Classification
  • Neural networks
  • - develop non-linear functions to associate inputs
    with outputs
  • - make no assumptions about the distribution of data
  • - handle missing data well (graceful degradation)
  • Supervised neural networks
  • - Estimating and testing the model
  • - Construct a training sample and a holdout sample
  • - Estimate model parameters using the training sample
  • - Test the estimated model's classification ability
    using the holdout sample
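
The training and holdout procedure can be sketched independently of the model being estimated. In the example below the data set is synthetic, the split fraction and seed are arbitrary, and the classifier is a fixed stand-in rule rather than a trained network, so the estimation step itself is not shown. All names are hypothetical.

```python
import random

def split_sample(patterns, holdout_fraction=0.3, seed=0):
    """Shuffle the labelled patterns and return (training, holdout) lists."""
    shuffled = list(patterns)
    random.Random(seed).shuffle(shuffled)
    n_holdout = int(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]

def accuracy(classify, sample):
    """Fraction of (features, label) pairs that `classify` labels correctly."""
    correct = sum(1 for features, label in sample if classify(features) == label)
    return correct / len(sample)

# Synthetic data: grid points labelled "A" inside the unit circle.
data = [((x / 4, y / 4), "A" if x * x + y * y <= 16 else "other")
        for x in range(-8, 9) for y in range(-8, 9)]

training, holdout = split_sample(data)

def classify(point):
    """Stand-in for an estimated model (here, the rule that generated the labels)."""
    return "A" if 1 - point[0] ** 2 - point[1] ** 2 >= 0 else "other"

holdout_accuracy = accuracy(classify, holdout)
```

In practice the model parameters would be estimated from `training` only, and `holdout_accuracy` would then indicate how well the estimated model generalizes to patterns it has not seen.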

27
Comparison between LP and NN performance for
three real-world problems
 
 
28
Future Research
  • - Autonomous Learning
  • learns without outside intervention
  • does class-dependent feature selection
  • derives simple if-then type classification rules
    that humans can understand
  • develops non-linear functions to associate inputs
    with outputs

29
Q&A
  • Thank you.