A Mixed-Integer Programming Approach to Customer Segmentation Problem - PowerPoint PPT Presentation
1
A Mixed-Integer Programming Approach to
Customer Segmentation Problem
  • Burcu Saglam, F.Sibel Salman, Metin Türkay
  • {bsaglam,ssalman,mturkay}@ku.edu.tr
  • Dept. of Industrial Engineering
  • Serpil Sayin
  • ssayin_at_ku.edu.tr
  • Dept. of Business Administration
  • June 20, 2004
  • ESI 2004, METU, Ankara

2
Koç University, Istanbul
www.ku.edu.tr www.eng.ku.edu.tr
3
Outline
  • Introduction
  • Clustering Problem
  • Clustering Approaches
  • Motivation of the Study
  • Proposed Model
  • Illustrative Example
  • Evaluation in Real World Scenario
  • Conclusions and Future Work

4
Introduction
  • This study presents a new mathematical
    programming-based segmentation model applied to a
    digital platform company's customer database

5
Digiturk
  • Private digital platform
  • Seeking opportunities in customer relationship
    management, such as one-to-one marketing

6
Digiturk
  • Pay-Per-View Services
  • Vision halls
  • Football matches
  • Erotic channels
  • Interactive Events
  • Banking
  • TV Games, etc...
  • Products
  • Standard package
  • Sports package
  • Cinema package
  • Super package
  • Mega package

7
Why Data Mining, Segmentation?
  • Analysis of large data collections, huge
    databases
  • Understanding needs, desires, and expectations of
    the customers
  • Grouping the ongoing and potential customers
  • Hidden patterns and knowledge within the data
  • Segmentation is applied when there is a need to
    partition the instances into natural groups

8
Clustering Analysis
  • A data mining technique developed for the purpose
    of identifying groups of entities that are
    similar to each other with respect to certain
    characteristics
  • Dividing heterogeneous sets of data into smaller
    and homogeneous ones
  • Evaluating the result and performance of a
    supervised learning model
  • Analyzing the set of input attributes
  • Determining outliers

9
Clustering Problem
  • Given a data set with n data items in
    m-dimensions
  • Partition the data into k clusters
  • In an optimization setting, an objective function
    can be defined such as the minimization of the
    sum of 1-norm distances between each data point
    and the center of the cluster which it belongs to
    (Bradley et al., 1997)
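The 1-norm objective mentioned above can be sketched in a few lines of Python. This is an illustrative sketch of the objective value only (not the optimization itself); the data, centers, and assignment below are made up for the example:

```python
def one_norm_objective(points, centers, assignment):
    """Sum of 1-norm (Manhattan) distances between each data point
    and the center of the cluster it is assigned to."""
    total = 0.0
    for p, c_idx in zip(points, assignment):
        center = centers[c_idx]
        total += sum(abs(a - b) for a, b in zip(p, center))
    return total

# Toy data: n = 4 points in m = 2 dimensions, k = 2 clusters.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0)]
centers = [(0.5, 0.0), (5.5, 5.0)]
assignment = [0, 0, 1, 1]  # cluster index for each point
print(one_norm_objective(points, centers, assignment))  # 2.0
```

In an optimization setting, both `centers` and `assignment` would be decision variables and this quantity the objective to minimize.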

10
Main Considerations
  • The term similarity
  • Exclusive, overlapping, probabilistic or fuzzy
    clusters
  • Iterative or non-iterative
  • Hierarchical or non-hierarchical
  • Distance-based, probability-based approaches,
    graph theoretic methods, continuous-discrete
    optimization, ...

11
Analytical Clustering Methods
  • Hierarchical - number of clusters is not assumed
    to be known a priori
  • Divisive and agglomerative methods
  • Once an assignment is made, it is irrevocable
  • Well-known example: the BIRCH algorithm

12
Analytical Clustering Methods
  • Non-hierarchical - number of clusters is known a
    priori
  • Initially, the data is divided into k partitions,
    each representing a cluster
  • Two main decisions
  • Selection of the initial cluster centroids
  • Assignment of the instances to clusters
  • Sensitive to initial partitions
  • Too many local minima
  • K-Means, K-Medoids, CLARANS, etc.

13
Classical K-Means
  • Iterative distance-based
  • Works in numeric domains
  • Partitions instances into disjoint clusters
  • Two steps
  • Assignment
  • Updating the cluster centers
  • Works well when the candidate clusters are of
    approximately equal size
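The two alternating steps above can be sketched as follows. This is a minimal stdlib-only illustration of classical K-Means, not the exact implementation evaluated in the study; the sample data is made up:

```python
import random

def k_means(points, k, iters=100, seed=0):
    """Classical K-Means: alternate assignment and center update."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # random initial centroids (a key sensitivity)
    for _ in range(iters):
        # Step 1: assign each point to its nearest center (squared Euclidean).
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[j].append(p)
        # Step 2: update each center to the mean of its cluster.
        new_centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[j]
            for j, cl in enumerate(clusters)
        ]
        if new_centers == centers:  # converged (a local optimum)
            break
        centers = new_centers
    return centers, clusters

centers, clusters = k_means([(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)], 2)
print(sorted(centers))  # [(0.0, 0.5), (10.0, 10.5)]
```

Changing `seed` changes the initial centroids, which is exactly the initialization sensitivity discussed on the next slide.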

14
Shortcomings of K-Means
  • The solution is a local minimum
  • Convergence to a local optimum is proven
  • Sensitivity to initially selected cluster centers
  • Worst-case time complexity is stated to be
    exponential
  • To find the global minimum, the algorithm has to
    be repeated several times
  • Impossible to interpret which attributes are
    significant

15
Motivation of the Study
  • Considering the limitations of existing
    clustering approaches and algorithms, an exact
    non-hierarchical distance-based clustering
    algorithm is proposed

16
Proposed Approach
  • Given a data set of n data items in m-dimensions
  • Aim is to find the optimum partitioning of the
    data set into k exclusive clusters
  • Objective function: minimization of the maximum
    diameter of the generated clusters
  • Number of clusters is known a priori

17
MIP-Max Model
(Model formulation shown as an equation image in the
original slides: minimize the maximum cluster diameter,
subject to cluster-assignment constraints)
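The formulation itself was an image in the original slides. A standard min-max-diameter formulation that is consistent with the O(kn) variables and O(kn²) constraints cited on the next slide would look as follows; this is a hedged reconstruction, not necessarily the authors' exact model. Here x_ij = 1 iff item i is assigned to cluster j, and d_ii' is the distance between items i and i':

```latex
\begin{align*}
\min \quad & D_{\max} \\
\text{s.t.} \quad & \sum_{j=1}^{k} x_{ij} = 1 && i = 1,\dots,n \\
& D_{\max} \ge d_{ii'}\,(x_{ij} + x_{i'j} - 1) && 1 \le i < i' \le n,\; j = 1,\dots,k \\
& x_{ij} \in \{0,1\}
\end{align*}
```

The second constraint forces D_max to be at least the distance between any two items placed in the same cluster, so minimizing D_max minimizes the largest cluster diameter.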
18
MIP-Max
  • O(kn) variables and O(kn²) constraints
  • Non-hierarchical
  • Not iterative
  • No need for an initial solution
  • Global optimum

19
Illustrative Example
20
Comparisons with the Results of the K-Means
21
Comparisons with the Results of the K-Means
22
Evaluation in Real World Scenario
  • Data set includes demographic and transactional
    information
  • Each row represents a unique customer
  • 18 real-valued and categorical attributes

23
Experiments with MIP-Max Model
k = 2
CPU times are reported for a computer with a
Pentium IV processor at 2.56 GHz and 1GB memory.
24
Comparison of MIP-Max with K-Means and
Interpretations
  • The results of the approach are compared using a
    3-cluster solution on 100 data items
  • The MIP-Max model grouped 39 instances in the
    first cluster, 34 in the second, and 27 in the
    third
  • K-Means generated clusters 1, 2, and 3 with 59,
    22, and 19 instances, respectively
  • Interpretations based on predictiveness score

25
Predictiveness Score
  • Given class C and attribute A with values v1,
    v2, ..., vn, the attribute-value predictiveness
    score for vi is defined as the probability that
    an instance resides in C given that the instance
    has value vi for A
  • Between-class measure
  • Defined for categorical attributes; most of our
    attributes are in nominal form
  • An attribute has distinguishing power in one
    cluster if its predictiveness scores are higher
    than 75%
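The definition above, P(instance in C | attribute A has value v), can be computed directly from counts. A minimal sketch, where the attribute name `"package"`, the sample rows, and the cluster labels are all hypothetical stand-ins for the study's real data:

```python
def predictiveness_score(rows, cluster, attr, value):
    """P(instance belongs to `cluster` | instance has `attr` == `value`)."""
    with_value = [r for r in rows if r[attr] == value]
    if not with_value:
        return 0.0  # value never occurs; score undefined, treated as 0
    in_cluster = [r for r in with_value if r["cluster"] == cluster]
    return len(in_cluster) / len(with_value)

# Toy customer rows (hypothetical attribute and cluster labels).
rows = [
    {"package": "sports", "cluster": 1},
    {"package": "sports", "cluster": 1},
    {"package": "sports", "cluster": 2},
    {"package": "cinema", "cluster": 2},
]
print(predictiveness_score(rows, 1, "package", "sports"))  # 2/3 ≈ 0.667
```

Under the 75% threshold stated above, "sports" would not yet count as distinguishing for cluster 1 in this toy data, since its score is only about 67%.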

26
Predictiveness Score
27
Conclusions and Future Work
  • The sensitivity of K-Means to the initial
    solution is analyzed
  • The interpretation of the MIP-Max clusters is
    more meaningful than that of K-Means (e.g.,
    sports-package subscribers are grouped together)
  • MIP-Max is significantly better than K-Means in
    terms of quality and stability of the solutions
  • Future work
  • Improvement of run times
  • Determination of the number of clusters

28
Thanks! Questions are welcome.
29
Clustering Approaches
  • Hierarchical and non-hierarchical clustering
    methods
  • Classical K-Means
  • Cobweb
  • Clarans
  • Birch
  • Advantages and disadvantages
  • Motivation of the study

30
COBWEB
  • Conceptual clustering technique
  • Forms a hierarchy to capture knowledge
  • Deals only with categorical (nominal) data
  • Cluster quality is measured by Category Utility
  • This measure is expensive to compute
  • Instance ordering has an impact on the resulting
    clustering

31
CLARANS
  • A type of K-Medoids algorithm; differs by its
    randomized partial search strategy
  • The clustering problem is represented as a graph
  • Limitations
  • Convergence to a definite local minimum
  • Efficiency considerations

32
BIRCH
  • Motivated by the fact that available memory is
    limited
  • One iteration ends with a successful clustering
  • Applicable to large data sets
  • But sensitive to parameter settings

33
Experimental Results of MIP-Max
34
Comparisons with the Results of the K-Means