Customer Relationship Management A Databased Approach - PowerPoint PPT Presentation

Slides: 45
Provided by: sbs83
Transcript and Presenter's Notes


1
Customer Relationship Management: A Databased
Approach
  • V. Kumar
  • Werner J. Reinartz
  • Instructor's Presentation Slides

2
Chapter Ten
  • Data Mining

3
Topics Discussed
  • Applications of Data Mining
  • Involvement of the three main groups
    participating in a data-mining project
  • Overview of the Data Mining Process
  • CRM at Work: Credite Est and Yapi Kredi

4
Applications of Data Mining
  • Reducing churn with the help of predictive
    models, which enable early identification of
    those customers likely to stop doing business
    with the company
  • Increasing customer profitability by identifying
    customers with a high growth potential
  • Reducing marketing costs by more selective
    targeting

5
Overview of the Data Mining Process
  • (Re)define business objectives: define objectives
    and expectations; define measurement of success
  • Get raw data: extract descriptive and
    transactional data; check quality
  • Identify relevant variables: roll up data; create
    analytical variables; enhance analytical data;
    select relevant variables
  • Gain customer insight: train predictive models;
    compare models; select models
  • Act: deploy models; monitor performance; enhance
    models
  • Learn

6
Timeframe of Data Mining Methodology
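[Figure: timeframe of the phases (Re)define Business
Objectives, Get Raw Data, Identify Relevant Variables
and Gain Customer Insight; values not recoverable]
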
7
Extent of Involvement of The Three Main Groups
Participating in a Data-Mining Project
8
Involvement of Business, Data Mining and IT
Resources in a Typical Data Mining Project
  • Data mining group:
  • Understand the business objectives and support
    the business group to refine and sometimes
    correct the scope and expectations
  • Most active during the variable selection and
    modeling phase
  • Share the obtained customer insights with the
    business group
  • IT resources:
  • Required for the sourcing and extraction of the
    data used for modeling
  • Business group:
  • Involved in checking the plausibility and
    soundness of the solution in business terms
  • Takes the lead in deploying the new insights into
    corporate action such as a call center or direct
    mail campaign

9
Manipulations to Data Set
  • Column manipulations: transformation, derivation,
    elimination
  • Row manipulations: aggregation, change detection,
    missing value detection, outlier detection

10
Data Preparation
  • For modeling, incoming data is sampled and split
    into several streams:
  • Train set: used to build the models
  • Test set: used for out-of-sample tests of the
    model quality and to select the final model
    candidate
  • Scoring data: used for model-based prediction;
    large compared to the other data sets

11
Define Business Objectives
  • Modeling of expected customer potential, in
    order to target acquisition of customers who
    will be profitable over the whole lifetime of
    the business relationship
  • Distinguish between customers with a target
    variable equal to zero and customers with a
    target variable equal to one
  • Establish likelihood threshold levels above
    which the business group thinks a prospect
    should be included in the marketing campaign

12
Define Business Objectives (contd.)
  • Define the set of business or selection rules for
    the campaign (e.g., the customers that should
    be excluded from or included in the target
    groups)
  • Define the details of project execution,
    specifying the start and delivery dates of the
    data mining process and the responsible
    resources for each task
  • Define the chosen experimental setup for the
    campaign
  • Define a cost/revenue matrix describing how the
    business mechanics will work in the supported
    campaign and how it will impact the data mining
    process
  • Establish the criteria for evaluating the success
    of the campaign
  • Find a benchmark to compare against: results
    obtained in the past for the same or similar
    campaign setups using traditional targeting
    methods rather than predictive models

13
Cost/Revenue Matrix
  • Will have an impact on the choice of model
    parameters such as the cut-off point for the
    selected model scores
  • It will also give business users an immediately
    interpretable table

14
Cost/Revenue Matrix
Cost/Revenue matrix | In reality prospect did not purchase | In reality prospect did purchase
Model predicts prospect will not purchase (not contacted) | Cost: 0; 1st year revenue: 0; Total: 0 | Lost business opportunity of 895
Model predicts prospect will purchase (contacted) | Cost: -5; 1st year revenue: 0; Total: -5 | Cost: -5 - 100; 1st year revenue: 1000; Total: 895
  • Assuming the average cost per call is 5, each
    positive responder (purchaser) will generate
    additional cost due to:
  • the administration work required to register him
    as a new customer
  • the cost of the delivered phone handset (say,
    100)
  • Customers who respond positively will generate
    average revenue of 1000 per year
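
The payoffs in the matrix imply a break-even score for contacting a prospect. The following is a minimal Python sketch, not the authors' method. It assumes the model score can be read as a purchase probability (an assumption, not stated on the slide) and it ignores the administration cost, which the slide does not quantify. All names are illustrative.

```python
# Payoffs per prospect, taken from the cost/revenue matrix on this slide.
CALL_COST = 5              # average cost per call
HANDSET_COST = 100         # cost of the delivered handset
FIRST_YEAR_REVENUE = 1000  # average first-year revenue of a purchaser

PAYOFF_CONTACT_BUYER = FIRST_YEAR_REVENUE - CALL_COST - HANDSET_COST  # 895
PAYOFF_CONTACT_NON_BUYER = -CALL_COST                                 # -5


def expected_profit_if_contacted(p_purchase: float) -> float:
    """Expected payoff of calling a prospect with purchase probability p."""
    return (p_purchase * PAYOFF_CONTACT_BUYER
            + (1 - p_purchase) * PAYOFF_CONTACT_NON_BUYER)


def break_even_score() -> float:
    """Score at which the expected payoff of a call is exactly zero.

    Solves p * 895 + (1 - p) * (-5) = 0 for p.
    """
    return -PAYOFF_CONTACT_NON_BUYER / (
        PAYOFF_CONTACT_BUYER - PAYOFF_CONTACT_NON_BUYER
    )


cutoff = break_even_score()                      # 5 / 900
profit_at_10_percent = expected_profit_if_contacted(0.10)
```

Under these assumptions, any prospect whose score exceeds `cutoff` has a positive expected payoff, which is one way the matrix can inform the choice of cut-off point.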

15
Get Raw Data
  • Identify, extract and consolidate raw data in a
    database (often called Analytical Data Mart)
  • Check the quality of the analytical raw data:
    technical checks as well as ensuring that the
    data makes sense in the given business context

16
Get Raw Data (contd.)
  • Step 1: Looking for Data Sources
  • Mixed top-down and bottom-up process, driven by
    business requirements (top) and technical
    restrictions (bottom)
  • Step 2: Loading the Data
  • Define how the data will be imported into the
    data mining environment
  • Step 3: Checking Data Quality
  • Technical aspects of the data: primary keys,
    duplicate records, missing values
  • Business context: realistic data
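
The technical and business-context checks listed above can be sketched in a few lines of Python. This is an illustrative sketch only; the field names (`customer_id`, `age`, `balance`), the sample rows and the plausibility range are hypothetical and not taken from the slides.

```python
from collections import Counter


def quality_report(rows, key, required_fields):
    """Technical checks: duplicate or missing keys, missing values per field."""
    key_counts = Counter(row.get(key) for row in rows)
    return {
        "duplicate_keys": sorted(
            k for k, n in key_counts.items() if k is not None and n > 1
        ),
        "missing_keys": key_counts.get(None, 0),
        "missing_by_field": {
            field: sum(1 for row in rows if row.get(field) in (None, ""))
            for field in required_fields
        },
    }


def implausible_rows(rows, field, low, high):
    """Business-context check: values outside a realistic range."""
    return [
        row for row in rows
        if row.get(field) is not None and not (low <= row[field] <= high)
    ]


# Hypothetical extract from an analytical data mart.
rows = [
    {"customer_id": 1, "age": 34, "balance": 1200.0},
    {"customer_id": 2, "age": None, "balance": 50.0},
    {"customer_id": 2, "age": 41, "balance": 75.0},     # duplicate key
    {"customer_id": 3, "age": 212, "balance": None},    # unrealistic age
]

report = quality_report(rows, "customer_id", ["age", "balance"])
suspicious = implausible_rows(rows, "age", 0, 120)
```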

17
Step 1: Looking for Data Sources
  • Data warehouse infrastructures with advanced data
    cleansing processes can help ensure that you are
    working with high-quality data
  • Build a (simple) relational data model onto which
    the source data will be mapped

18
Step 2: Loading the Data
  • Define further query restrictions, prepared by
    IT teams, for execution at pre-defined time
    windows in batch mode
  • Deliver extracted data to the data mining
    environment in a pre-defined format
  • Process the data further and use it to fill the
    previously defined data model in the data mining
    environment as part of the ETL process
    (Extract-Transform-Load)

19
Step 3: Checking Data Quality
  • Assess and understand limitations of data
    resulting from its inherent quality (good or bad)
    aspects
  • Create an analytical database as the basis for
    subsequent analyses
  • Carry out preliminary data quality assessment
  • To assure an acceptable level of quality of the
    delivered data
  • To ensure that the data mining team has a clear
    understanding of how to interpret the data in
    business terms
  • Data miners have to carry out some basic data
    interpretation and aggregation exercises

20
Identify Relevant Predictive Variables
  • Step 1: Create Analytical Customer View
    (Flattening the Data)
  • Step 2: Create Analytical Variables
  • Step 3: Select Predictive Variables

21
Step 1: Create Analytical Customer View
(Flattening the Data)
  • Individual customer constitutes an observational
    unit for data analysis and predictive modeling
  • All data pertaining to an individual customer is
    contained in one observation (row, record)
  • Individual columns (variables, fields) represent
    the conditions at specific points in time or a
    summary over a whole period
  • Definition of the target (dependent) variable:
    values should be generated for all customers and
    added to the existing data tables
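
As an illustration of flattening, the sketch below rolls a transaction table up to one row per customer and appends a 0/1 target. It is a Python sketch under assumed inputs; the field names, the summary columns and the `purchasers` set are hypothetical.

```python
from collections import defaultdict
from datetime import date


def flatten(transactions, purchasers):
    """Roll transactions up to one observation (row) per customer."""
    per_customer = defaultdict(
        lambda: {"n_transactions": 0, "total_amount": 0.0, "last_date": None}
    )
    for t in transactions:
        row = per_customer[t["customer_id"]]
        row["n_transactions"] += 1            # summary over the whole period
        row["total_amount"] += t["amount"]
        if row["last_date"] is None or t["date"] > row["last_date"]:
            row["last_date"] = t["date"]      # condition at a point in time
    # Add the target variable for every customer in the view.
    return [
        {"customer_id": cid, **agg, "target": 1 if cid in purchasers else 0}
        for cid, agg in sorted(per_customer.items())
    ]


# Hypothetical transactional data.
transactions = [
    {"customer_id": 7, "amount": 100.0, "date": date(2005, 1, 10)},
    {"customer_id": 7, "amount": 50.0, "date": date(2005, 3, 2)},
    {"customer_id": 9, "amount": 20.0, "date": date(2005, 2, 14)},
]
customer_view = flatten(transactions, purchasers={7})
```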

22
Step 2: Create Analytical Variables
  • Introduce additional variables derived from the
    original ones
  • When needed, transform variables to get new and
    more predictive variables
  • Increase normality of variable distributions to
    help the predictive model training process
  • Missing value management is key for enhancing the
    quality of the analytical data set
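
A small Python sketch of these ideas follows. The specific choices are assumptions made for illustration, not prescriptions from the slides: a log transform to reduce skew, median imputation with a missing-value flag, and a derived ratio variable. The field names are hypothetical.

```python
import math
from statistics import median


def add_analytical_variables(rows):
    """Derive, transform and impute variables on a flattened customer view."""
    known_incomes = [r["income"] for r in rows if r["income"] is not None]
    fill_value = median(known_incomes)   # assumed imputation rule
    enriched = []
    for r in rows:
        income_missing = r["income"] is None
        income = fill_value if income_missing else r["income"]
        enriched.append({
            **r,
            "income": income,
            # Keep the fact that the value was missing as its own variable.
            "income_missing": int(income_missing),
            # Log transform of a right-skewed, non-negative variable.
            "log_income": math.log1p(income),
            # Derived variable built from two original ones.
            "balance_per_product": (
                r["balance"] / r["n_products"] if r["n_products"] else 0.0
            ),
        })
    return enriched


rows = [
    {"customer_id": 1, "income": 1000, "balance": 300.0, "n_products": 3},
    {"customer_id": 2, "income": None, "balance": 80.0, "n_products": 0},
    {"customer_id": 3, "income": 3000, "balance": 900.0, "n_products": 2},
]
enriched = add_analytical_variables(rows)
```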

23
Step 3: Select Predictive Variables
  • Inspect the descriptive statistics of all
    univariate distributions associated to all
    available variables
  • Exclude those variables:
  • which take on only one value (i.e. the variable
    is a constant)
  • with mostly missing values
  • directly or indirectly identifying an individual
    customer
  • showing collinearities
  • showing very little correlation with the target
    variable
  • containing personal identifiers
  • Define a threshold missing value level above
    which the field would be excluded from
    further analysis (e.g. more than 95% missing
    values)
  • Check if all variables have been mapped to the
    appropriate data types
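
Three of the exclusion rules above (constants, mostly missing values, very little correlation with the target) can be sketched in Python as follows. The minimum correlation of 0.1 and the sample columns are hypothetical; the collinearity and identifier checks are left out of this sketch.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy) if vx and vy else 0.0


def select_variables(columns, target, max_missing_share=0.95, min_abs_corr=0.1):
    """Return the names of the columns that survive the exclusion rules."""
    kept = []
    for name, values in columns.items():
        present = [(v, t) for v, t in zip(values, target) if v is not None]
        missing_share = 1 - len(present) / len(values)
        if missing_share > max_missing_share:
            continue                      # mostly missing
        if len({v for v, _ in present}) <= 1:
            continue                      # constant
        xs, ys = zip(*present)
        if abs(pearson(xs, ys)) < min_abs_corr:
            continue                      # very little correlation
        kept.append(name)
    return kept


target = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "constant_flag": [1] * 8,
    "mostly_missing": [None] * 7 + [3],
    "noise": [1, 2, 1, 2, 1, 2, 1, 2],
    "balance": [10, 20, 15, 5, 80, 90, 70, 85],
}
kept = select_variables(columns, target, max_missing_share=0.5)
```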

24
Gain Customer Insight
  • Step 1: Preparing Data Samples
  • Step 2: Predictive Modeling
  • Step 3: Select Model

25
Step 1: Preparing Data Samples
  • Analyze if sufficient data is available to obtain
    statistically significant results
  • If enough data is available, split the data into
    two samples:
  • the train set, to fit the models
  • the test set, to check the model's performance on
    observations that have not been used to build it
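
A minimal Python sketch of such a split is shown below. The 70/30 proportion and the fixed seed are assumptions for illustration; the slides do not specify a split ratio.

```python
import random


def train_test_split(rows, test_share=0.3, seed=42):
    """Shuffle the rows reproducibly and split them into train and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_share))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical customer view with 100 customers.
customers = [{"customer_id": i} for i in range(100)]
train_set, test_set = train_test_split(customers)
```

Because the two lists are slices of the same shuffled list, no customer appears in both samples.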

26
Step 2: Predictive Modeling
  • Two steps:
  • The rules (or linear/non-linear analytical
    models) are built based on a training set
  • These rules are then applied to a new dataset to
    generate the answers needed for the campaign
  • Guidelines:
  • Distinguish between the types of predictive
    models obtained through different modeling
    paradigms: supervised and unsupervised modeling
  • Find the right relationships between the variables
    describing the customers to predict their
    likelihood of group membership (purchaser or
    non-purchaser); this is referred to as scoring
    (e.g. a score between 0 and 1)
  • Apply unsupervised modeling where group
    membership is not known beforehand
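
The two steps (build on a training set, then apply to new data) can be illustrated with a tiny supervised model. The sketch below fits a logistic regression by plain gradient descent in Python. The fitting method, the learning rate, the single feature and the data are all assumptions chosen for brevity, not the procedure used in the book.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, learning_rate=0.1, epochs=2000):
    """Step 1: build the model on the training set (batch gradient descent)."""
    n_features = len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in zip(X, y):
            z = bias + sum(w * f for w, f in zip(weights, features))
            error = sigmoid(z) - label
            for j, f in enumerate(features):
                grad_w[j] += error * f
            grad_b += error
        weights = [w - learning_rate * g / len(X)
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(X)
    return weights, bias


def score(weights, bias, features):
    """Step 2: apply the model to a new customer; returns a score in (0, 1)."""
    return sigmoid(bias + sum(w * f for w, f in zip(weights, features)))


# Hypothetical training data: one feature (number of products held)
# and a 0/1 target (non-purchaser / purchaser).
X_train = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y_train = [0, 0, 0, 1, 1, 1]
weights, bias = fit_logistic(X_train, y_train)

low_score = score(weights, bias, [0.0])
high_score = score(weights, bias, [5.0])
```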

27
Step 3: Select Model
  • Compare the relative quality of prediction by
    comparing the respective misclassification rates
    obtained on the test set
  • [Figure: example of a misclassification error
    rate (confusion matrix) for a classification
    neural network; values not recoverable]
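
Since the original confusion matrix figure is not available, here is a small Python sketch of how a confusion matrix and a misclassification rate can be computed on a test set and used to compare two candidate models. The labels and predictions are hypothetical.

```python
def confusion_matrix(actual, predicted):
    """Count true/false positives and negatives for 0/1 labels."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p == 1 and a == 1:
            counts["tp"] += 1
        elif p == 1 and a == 0:
            counts["fp"] += 1
        elif p == 0 and a == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


def misclassification_rate(counts):
    """Share of test observations that the model classified wrongly."""
    return (counts["fp"] + counts["fn"]) / sum(counts.values())


# Hypothetical test-set labels and predictions of two candidate models.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
model_a = [1, 0, 0, 1, 0, 1, 1, 0]
model_b = [1, 1, 0, 0, 0, 1, 1, 0]

rate_a = misclassification_rate(confusion_matrix(actual, model_a))
rate_b = misclassification_rate(confusion_matrix(actual, model_b))
best = "model_a" if rate_a <= rate_b else "model_b"
```
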
28
Act
  • Step 1: Deliver Results to Operational Systems
  • Step 2: Archive Results
  • Step 3: Learn
29
Step 1: Deliver Results to Operational Systems
  • Apply the selected model to the entire customer
    base
  • Prepare score data set containing the most recent
    information for each customer with the variables
    required by the model
  • The obtained score value for each customer and
    the defined threshold value will determine
    whether the corresponding customer qualifies to
    participate in the campaign
  • When delivering results to the operational
    systems, provide the necessary customer identifiers
    to unambiguously link the model's score
    information to the correct customer
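
A short Python sketch of the selection step follows: each scored customer is compared with the threshold, and the customer identifier travels with the score so that the operational system can link them. The threshold value and the records are hypothetical.

```python
def select_for_campaign(scored_customers, threshold):
    """Keep customers whose score reaches the threshold, best scores first."""
    selected = [c for c in scored_customers if c["score"] >= threshold]
    return sorted(selected, key=lambda c: c["score"], reverse=True)


# Hypothetical score data set: identifier plus model score per customer.
scored_customers = [
    {"customer_id": "C001", "score": 0.12},
    {"customer_id": "C002", "score": 0.81},
    {"customer_id": "C003", "score": 0.47},
    {"customer_id": "C004", "score": 0.66},
]
target_list = select_for_campaign(scored_customers, threshold=0.5)
target_ids = [c["customer_id"] for c in target_list]
```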

30
Step 2: Archive Results
  • Each data mining project will produce a huge
    amount of information including
  • raw data used
  • transformations for each variable
  • formulas for creating derived variables
  • train, test and score data sets
  • target variable calculation
  • models and their parameterizations
  • score threshold levels
  • final customer target selections
  • Useful to preserve especially if the same model
    is used to score different data sets obtained at
    different times

31
Step 3: Learn
  • Referred to as closing the loop
  • Obtain the facts describing performance of data
    mining project and business impact
  • Obtained by monitoring campaign performance while
    it is running and from final campaign performance
    analysis after the campaign has ended
  • Detect when a model has to be re-trained

32
CRM at Work: Credite Est
  • Regional mid-tier bank in France; use of data
    mining in marketing
  • Uses a segmentation scheme based on behavioral
    characteristics (e.g. product ownership) and an
    activity-based-costing system to identify
    individual customer-level contribution margin
  • Project:
  • Business goal: to acquire new prospects
  • Objective: to identify the characteristics of
    profitable customers in Credite Est's mass-market
    segment to efficiently target similar profiles in
    the prospect pool

33
Credite Est (contd.)
  • Get Raw Data
  • Response variable for current customers is
    customer contribution margin
  • Customers sorted by operating contribution and
    profile of the top 20% of customers noted
  • Transaction information on prospects purchased
    and then appended to individual records of
    existing customers
  • Identify relevant variables
  • To find the profile that best characterizes
    high-value clients, which is subsequently applied
    to the prospects' information
  • Model attempts to predict customer operating
    margin as dependent variable with geodemographic
    information as independent variables
  • Credite Est appended a total of 65 variables to
    existing customer records

34
Credite Est (contd.)
  • Select Predictive Variables
  • All variables that were appended had almost 50%
    missing data
  • Assessing whether any of the missing data could
    be meaningfully replaced improved the overall
    rate of missing values from 42% to 21%
  • Investigation of univariate statistics (means,
    standard deviations, frequencies, outliers) for
    all variables brought a reduction in variables
    from 65 to 54
  • Calculation of all bivariate correlations (or
    mean analyses in the case of categorical
    variables) of the existing independent variables
    with the dependent variable (customer value)
  • Data evaluation process resulted in a total of 17
    variables that had a reasonable correlation with
    the dependent variable. These were retained for
    the next step, the response model

35
Credite Est (contd.)
  • Gain Customer Insight
  • Use logistic regression to classify the dependent
    variable as 0/1, the goal being to either target
    or not target a certain individual in the
    prospect pool
  • Theory-based elimination of variables that are
    highly collinear
  • The model classified 75.5% of cases correctly in
    the estimation sample and 69.8% in the holdout
    sample, roughly 20% higher than based on chance
    alone
  • Result was deemed successful and it was decided
    to utilize this model for a prospecting campaign

36
Credite Est (contd.)
  • Act
  • Final model was rolled out in sequential fashion
    to target prospect audience
  • Credite Est purchased addresses from list brokers
    that had non-missing values for at least 3 out of
    the 5 variables in the final model
  • The prospects were scored with the model and then
    ranked by likelihood of being a high value
    customer
  • Objective was to assess the receptivity of the
    two samples of customers for respective products
  • Result: Both target mailings were significantly
    more successful than the baseline scenario

37
CRM at Work: Yapi Kredi Predictive-Model-Based
Cross-Sell Campaign
  • Challenge: to continue YAPI KREDI's development
    as the fastest growing retail bank in Turkey
  • Capabilities required
  • Advanced analytical customer segmentation
  • Segment specific offering of product bundles
  • Conversion of customers to more profitable
    segments via targeted campaigns using advanced
    CRM tools such as predictive modeling
  • Project plan
  • To carry out a set of pilot projects for
    cross-selling of consumer banking products
  • A reduced selection of target customers with a
    high propensity to positively respond would be
    included in a multi-channel, two-step campaign

38
Yapi Kredi - Define Business Objectives
  • YAPI KREDI's B-type mutual funds, characterized
    by
  • Being low risk investment instruments based on
    fixed income securities
  • Easily purchased via the ATM, Web, and Telephone
    channels
  • Offer to two customer groups
  • Customers already having invested into B-type
    mutual funds to stimulate an increase of the
    assets
  • Customers not yet owning any B-type fund to help
    increase product ratio and attract new money

39
Yapi Kredi - Define Business Objectives (contd.)
  • Communication channels: two-channel approach
  • Campaign sizing: contact 3000 customers through
    branch-based outbound calls and active marketing
    during customer branch visits
  • Campaign: two-step
  • Customers were first contacted with the B-type
    mutual fund offer
  • Positive responders received a follow-up call if
    they had not purchased within one week of their
    initial positive response
  • Evaluation of results: based on response and
    purchase rates by contact channel (branch or call
    center)

40
Yapi Kredi - Get Raw Data and Identify Relevant
Variables
  • Get Raw Data
  • Data mart with data extracted from more than 50
    source system tables
  • About 20 database tables were produced, taking up
    30 gigabytes of disk space for the initial project
    phase
  • Identify Relevant Variables - customer attributes
    describing
  • Demographics
  • Product Ownership
  • Product Usage
  • Channel usage
  • Assets
  • Liabilities
  • Profitability

41
Yapi Kredi - Gain Customer Insight
  • Based on six months of historical customer data,
    five different predictive models were developed
  • Best model: logistic regression
  • Yielding a lift value of 29 and a cumulative
    response rate of 14 for the top customer
    percentile
  • Reaches 2.9 times more responders for the top
    customer percentile than a random selection of
    the same size
  • A set of 4200 customers with the highest
    propensity to purchase was selected as the target
    group for the pilot campaign
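
For readers unfamiliar with lift, the Python sketch below uses one common definition: the response rate among the top-scored share of customers divided by the overall response rate. The scores and responses are hypothetical and are not the Yapi Kredi data.

```python
def lift_at_top(scores_and_responses, top_share):
    """Response rate in the top-scored share divided by the overall rate.

    Assumes at least one responder overall (otherwise the ratio is undefined).
    """
    ranked = sorted(scores_and_responses, key=lambda pair: pair[0], reverse=True)
    n_top = max(1, int(len(ranked) * top_share))
    top_rate = sum(response for _, response in ranked[:n_top]) / n_top
    overall_rate = sum(response for _, response in ranked) / len(ranked)
    return top_rate / overall_rate


# Hypothetical (score, responded) pairs for ten customers.
scored = [
    (0.9, 1), (0.8, 1), (0.7, 0), (0.6, 0), (0.5, 1),
    (0.4, 0), (0.3, 0), (0.2, 0), (0.1, 0), (0.05, 0),
]
lift_top_20 = lift_at_top(scored, top_share=0.2)
```

A lift above 1 means the model-based selection reaches more responders than a random selection of the same size.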

42
Yapi Kredi - Act
  • A subset of 3000 customers was assigned to the 16
    branches holding the responsibility for the
    respective relationships
  • The remaining 1200 customers were assigned to the
    call center
  • The target list with the corresponding channel
    assignment was made available to the campaign
    management system

43
Yapi Kredi - Result
  • Result
  • Impressive response rates of 6.5% and 12.2% were
    obtained with the branch-based part of the
    campaign and the call-center-based part of the
    campaign, respectively
  • The pilot campaign acquired more than 1 million
    into B-type mutual funds

44
Summary
  • Data Mining can assist in selecting the right
    target customers or in identifying previously
    unknown customers with similar behavior and needs
  • A good target list is likely to increase purchase
    rates, and have a positive impact on revenue
  • In the context of CRM, the individual customer is
    often the central object analyzed by means of
    data mining methods
  • A complete data mining process comprises
    assessing and specifying the business objectives,
    data sourcing, transformation and creation of
    analytical variables, and building analytical
    models using techniques such as logistic
    regression and neural networks, scoring customers
    and obtaining feedback from the field
  • Learning and refining the data mining process is
    the key to success