Title: Machine Learning Techniques Applied to CRM
 1Machine Learning Techniques Applied to CRM
- PUBLIC: PrUning and BuiLding Integrated Classifier
 
Presented by Soumen Sengupta 
 2Customer Relationship Management
- CRM is a process that manages the relationship between a company and its customers.
 - Improves customer profitability
 - Increases ROI
 - Integrates data mining with marketing
 - Relies on efficient algorithms for prediction
 
  3Data Mining applied to CRM
- Customer Segmentation
 - Customer Profiling
 - Customer Acquisition
 - Cross-selling / Up-selling
 - Customer Retention
 
  4Customer Segmentation and Customer Profiling
- Customer Segmentation: a method that lets companies know who their customers are, how they differ from each other, and how they should be treated.
    - e.g. RFM (Recency, Frequency, Monetary)
    - Demographic Segmentation
    - Psychographic Segmentation
    - Targeted Segmentation
 - Customer Profiling: describing a customer by attributes such as age, income, and lifestyle. Different marketing media are applied to different segments.
 
  5CRM functions contd.
- Customer Acquisition: acquiring new customers by turning a group of potential customers into actual customers. Customer responses to marketing campaigns are analyzed.
  6Customer Retention
- Customer Retention: a predictive model is built to identify customers who are likely to churn, e.g. attrition in the cellular telephone industry.
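Figure: decision tree for predicting churn in the telecommunication industry, adapted from BES97.
Node and branch labels: Phone Technology (Old / New), Customer Lifetime (> 2.3 years / < 2.3 years), Age (< 35 / > 35).
Leaf labels: C = Churner, NC = Non-churner. Leaf counts shown in the figure: 20,0; 5,40; 5,10; 20,0.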
 7Cross Selling
- Cross-selling is the process of offering new products to existing customers.
    - Modeling individual customer behavior
    - Scoring the models
    - Optimizing the scores
  
  8Machine Learning Techniques
- Decision Tree 
 - Artificial Neural Networks 
 - Bayesian Classifier 
 - Genetic Algorithms 
 - Rule-based analysis, among others
 
  9Decision Tree: An operation overview
- Select the splitting criterion (information gain, gain ratio, Gini index, chi-square test)
 - Apply recursive partitioning until all the examples (training data) are classified or the attributes are exhausted
 - Prune the tree
 - Test the tree
 
Workflow: Feature Extraction -> Build phase (train the model) -> Prune phase (handle overfitting of the data) -> Test the model
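
The build phase above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the presenter's implementation: it uses information gain as the splitting criterion, handles categorical attributes only, and omits the prune phase. The helper names (`build_tree`, `best_attribute`, `classify`) and the toy churn records are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Class entropy of a list of labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_attribute(rows, labels, attributes):
    """Pick the attribute with the highest information gain."""
    def gain(attr):
        groups = {}
        for row, label in zip(rows, labels):
            groups.setdefault(row[attr], []).append(label)
        remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
        return entropy(labels) - remainder
    return max(attributes, key=gain)

def build_tree(rows, labels, attributes):
    """Recursive partitioning: stop when a node is pure or attributes run out."""
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]   # leaf: majority class
    attr = best_attribute(rows, labels, attributes)
    remaining = [a for a in attributes if a != attr]
    children = {}
    for value in set(row[attr] for row in rows):
        subset = [(r, l) for r, l in zip(rows, labels) if r[attr] == value]
        sub_rows, sub_labels = zip(*subset)
        children[value] = build_tree(list(sub_rows), list(sub_labels), remaining)
    return (attr, children)                            # internal node

def classify(tree, row):
    """Follow branches until a leaf (a plain class label) is reached."""
    while isinstance(tree, tuple):
        attr, children = tree
        tree = children[row[attr]]
    return tree

# Toy data (made up for illustration): C = churner, NC = non-churner.
rows = [
    {"phone": "old", "age": "<35"},
    {"phone": "old", "age": ">35"},
    {"phone": "new", "age": "<35"},
    {"phone": "new", "age": ">35"},
]
labels = ["C", "C", "C", "NC"]
tree = build_tree(rows, labels, ["phone", "age"])
print(tree)
print([classify(tree, r) for r in rows])
```

Testing here is done on the training records only, which is enough to show the mechanics; a real test phase would use held-out data.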
 10An example of Decision Tree for Credit Screening
Figure: example decision tree for credit screening.
Node labels: Work Class, Capital Gain, Income, Credit History, Education.
Branch labels: Self-employed not inc / Private firm; Satisfactory / Not Satisfactory; > 50K / < 50K; Good / Not Good; Bachelors.
Leaf labels: Yes / No.
 11PUBLIC: An efficient decision tree classifier
- A decision tree algorithm that integrates pruning into the building phase
 - Produces trees that are smaller in size
 - Is computationally efficient
 - More accurate for larger datasets
 - Splitting criterion: Information Gain
   $Gain(X) = Info(Tree) - \sum_{j=1}^{n} \frac{|S_j|}{|S|} \log_2 \frac{|S_j|}{|S|}$
   where the $S_j$ are the subsets for the various classes
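
As a small worked example of the splitting criterion, the sketch below computes information gain in its standard textbook form: the class entropy of the records before the split minus the size-weighted class entropy of the partitions after the split. Reading the slide's compact formula this way is an assumption, and the function names and toy records are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Info(S): expected number of bits needed to identify the class of a record."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Gain(X) = Info(S) - sum over values v of |S_v| / |S| * Info(S_v)."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    n = len(labels)
    remainder = sum(len(p) / n * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder

# Toy records (made up): the attribute separates the two classes perfectly,
# so the gain equals the full entropy of the labels (1 bit).
rows = [{"phone": "old"}, {"phone": "old"}, {"phone": "new"}, {"phone": "new"}]
labels = ["C", "C", "NC", "NC"]
print(information_gain(rows, labels, "phone"))
```

An attribute that leaves every partition as mixed as the parent would score a gain of 0, which is why the criterion prefers attributes that produce purer subsets.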
 
  12Pruning
- Occam's Razor: the simplest hypothesis is usually the best one.
 - Pruning is used to avoid overfitting
 - Improves accuracy, speed, and memory requirements
 - Produces a much smaller tree
 - Constraints
    - Size
    - Inaccuracy (misclassification)
 
  13Pruning the Tree and MDL
- Pre-pruning
    - Stop growing the tree when its size reaches a given number of nodes or the cost limit is reached
 - Post-pruning
    - Cross-validation
    - Pessimistic pruning
    - Minimum-error-based pruning
    - MDL
 
  14MDL applied to Decision Trees
- MDL principle: the best tree is the one that can represent the classes of the records with the fewest number of bits.
 - A subtree S is pruned if the cost of encoding the records in S is less than the cost of encoding the subtree plus the cost of encoding the records in each leaf.
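
The pruning rule above can be illustrated with a bottom-up pass over a binary tree. This is a hedged sketch, not the published algorithm: the `Node` class, the `prune` function, and the simplified `simple_record_cost` (which keeps only the sum of n_i * log2(n / n_i) and drops the other terms of the full record cost) are assumptions made for illustration, as are the class counts and split costs in the example.

```python
import math
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Node:
    class_counts: List[int]            # records per class that reach this node
    split_cost: float = 0.0            # bits needed to describe the split test
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def simple_record_cost(counts):
    """Simplified stand-in for C(S): sum of n_i * log2(n / n_i) only."""
    n = sum(counts)
    return sum(c * math.log2(n / c) for c in counts if c > 0)

def prune(node, record_cost=simple_record_cost):
    """Prune bottom-up and return the cost in bits of the resulting subtree."""
    leaf_cost = record_cost(node.class_counts) + 1     # 1 bit for the node type
    if node.left is None:
        return leaf_cost
    subtree_cost = (1 + node.split_cost
                    + prune(node.left, record_cost)
                    + prune(node.right, record_cost))
    if leaf_cost < subtree_cost:
        node.left = node.right = None                  # collapse into a leaf
        return leaf_cost
    return subtree_cost

# A split that separates the classes cleanly is kept ...
useful = Node([10, 10], 3.0, Node([10, 0]), Node([0, 10]))
print(prune(useful), useful.left is not None)

# ... while a split that barely changes the class mix is pruned away.
useless = Node([10, 1], 3.0, Node([5, 1]), Node([5, 0]))
print(prune(useless), useless.left is None)
```

Passing the record cost in as a parameter keeps the sketch independent of any particular cost formula.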
 
  15MDL Costs 
- Cost of encoding the records
 - Cost of encoding the tree
    - Cost of encoding the structure of the tree (1 bit per node: 1 for an internal node, 0 for a leaf)
    - Cost of encoding each split (Csplit)
    - Cost of encoding the classes of the records in the leaves
 
  16Cost of encoding the records
- Let there be a set S containing n records and k classes, and let $n_i$ be the number of records belonging to class i. The cost function C(S) for encoding the records is as follows:
   $C(S) = \sum_{i} n_i \log \frac{n}{n_i} + \frac{k-1}{2} \log \frac{n}{2} + \log \frac{\pi^{k/2}}{\Gamma(k/2)}$
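
The cost function can be evaluated directly. The sketch below assumes base-2 logarithms (so the result is in bits) and skips classes with zero records in the first sum; the function name `record_cost` is hypothetical.

```python
import math

def record_cost(class_counts):
    """C(S) for a list of per-class record counts n_1..n_k (n = their sum)."""
    n = sum(class_counts)
    k = len(class_counts)
    # Sum of n_i * log2(n / n_i); empty classes contribute nothing.
    data_term = sum(c * math.log2(n / c) for c in class_counts if c > 0)
    # (k - 1) / 2 * log2(n / 2)
    model_term = (k - 1) / 2 * math.log2(n / 2)
    # log2(pi^(k/2) / Gamma(k/2)), computed through lgamma for numerical stability.
    const_term = (k / 2 * math.log(math.pi) - math.lgamma(k / 2)) / math.log(2)
    return data_term + model_term + const_term

mixed = record_cost([5, 5])    # two classes, evenly mixed
pure = record_cost([10, 0])    # same size, a single class present
print(mixed, pure)
```

A pure set is cheaper to encode than an evenly mixed set of the same size, which is what makes C(S) usable as a measure of how well a leaf separates the classes.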
 
  17Split Costs of a subtree
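- Cost of encoding the tree rooted at node N (N as a leaf)
    - $C(S) + 1$
 - Cost of a subtree with s splits
    - $2s + 1 + s \log a + \sum_{i=s+2}^{k} n_i$
    - where s is the number of splits
    - a is the number of attributes
    - k is the total number of classes
    - $n_i$ represents the number of records belonging to class i
 - After each additional split
    - the cost increases by $2 + \log a$
    - the cost decreases by $n_{s+2}$
 - An additional split (e.g. going from 1 split to 2 splits) lowers the cost when $2 + \log a < n_{s+2}$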
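
The subtree cost with s splits can be turned into a small calculation. The sketch below assumes base-2 logarithms and that the classes are ordered by decreasing record count (n_1 >= n_2 >= ... >= n_k), so that each extra split removes the largest remaining class from the sum; the function names and the example counts are hypothetical.

```python
import math

def subtree_lower_bound(class_counts, num_attributes, s):
    """2s + 1 + s*log2(a) + sum of n_i for i = s+2..k, with n_i sorted descending."""
    counts = sorted(class_counts, reverse=True)
    # 1-indexed classes s+2..k correspond to the 0-indexed slice counts[s + 1:].
    return 2 * s + 1 + s * math.log2(num_attributes) + sum(counts[s + 1:])

def best_lower_bound(class_counts, num_attributes):
    """Add splits while one more split saves more bits than it costs."""
    counts = sorted(class_counts, reverse=True)
    k = len(counts)
    s = 1
    cost = subtree_lower_bound(counts, num_attributes, s)
    split_bits = 2 + math.log2(num_attributes)
    # counts[s + 1] is n_{s+2}: the records no longer misclassified after one more split.
    while s + 1 < k and counts[s + 1] > split_bits:
        cost += split_bits - counts[s + 1]
        s += 1
    return s, cost

# Example with made-up counts: 4 classes, 4 attributes (log2(4) = 2 bits).
print(best_lower_bound([50, 30, 20, 2], 4))
```

In this example a second split pays off because 2 + log2(4) = 4 is less than n_3 = 20, while a third split does not because n_4 = 2 is smaller than 4. During the build phase, a bound of this kind can be compared against C(S) + 1 to decide whether a node is worth expanding at all.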
 18Pruning Algorithms compared
- The Diabetes dataset has been used; the results are quoted from MRA98
 - MDL produces a tree that is much smaller than those of other pruning algorithms
 - Its error rates are slightly better, although the execution times might be a little slower
 - Doesn't need extra data for pruning
 
  19Advantages of PUBLIC 
- Easy to interpret
 - Fast and doesn't require too much training data
 - Rules can be generated and ranked according to confidence and support
 - Doesn't require extra data for training
 - Avoids overfitting by using MDL for post-pruning
 - Reduces the I/O overhead and improves performance
 - Improves the accuracy as well
 
  20References 
- BeST00: Alex Berson, Stephen Smith, Kurt Thearling. Building Data Mining Applications for CRM. McGraw-Hill, 2000.
 - MRA95: Manish Mehta, Jorma Rissanen, Rakesh Agrawal. MDL-based Decision Tree Pruning. IBM Almaden Research Center, 1995.
 - RS98: Rajeev Rastogi, Kyuseok Shim. PUBLIC: A Decision Tree Classifier that Integrates Building and Pruning.
 - BR01: Catherine Bounsaythip, Esa Rinta-Runsala. Overview of Data Mining for Customer Behavior Modeling, 2001.
 - BeS1997: A. Berson and S. J. Smith. Data Warehousing, Data Mining and OLAP. McGraw-Hill, 1997.