1
Credit Card Fraud Classification
Modern Approaches and Practices
  • May 4, 2005

ENEE 752 Spring 2005
Gavin Rosenbush
2
Introduction
  • The credit industry is extremely large and
    ubiquitous
  • Almost everyone has at least one line of credit
  • Credit institutions handle millions of
    transactions per day
  • Each transaction is checked by the credit
    institution and is denied or approved

3
Introduction
  • Why do we care about fraud?
  • Fraud is paid for by consumers
  • Banks raise rates to vendors and consumers
  • Vendors raise prices
  • In the end, buyers take responsibility

4
Domain Specification
  • What is credit card fraud?
  • Generic definition: unauthorized use of someone's
    credit card for making purchases
  • There are many cases/situations that are
    fraudulent
  • Well-known: person's physical card stolen
  • Information unknowingly sniffed
  • Social Engineering
  • Corporate databases compromised
  • Many points of failure in the system
  • Physical card not required for fraud

5
Domain Specification
  • What is credit card fraud?
  • Because of all these cases, detecting fraud at
    the moment information is stolen is difficult
  • Fraud must be detected when transactions occur
  • Combined with the volume of daily transactions,
    machine learning becomes necessary for this
    classification

6
System Description
  • Information to be deduced
  • Two-class classification: fraudulent or legitimate
  • System should output information to assist in
    this decision
  • Different algorithms have different output
    conventions
  • Probability that transaction is fraudulent
  • Probability that transaction is legitimate

7
System Description
  • What cannot be deduced
  • Machine learning algorithms are not able to
    provide a reason why a certain decision was made
  • Not required in many other problems, but would be
    useful in this problem: customers will ask why
    actions occurred
  • A person with domain expertise may be able to
    look at the case and provide a reason

8
System Constraints
  • What makes this problem unique?
  • Skewed class distribution
  • The difference between the percentages of
    fraudulent and legitimate transactions is very high
  • Non-uniform cost function
  • The cost of incorrectly or correctly classifying
    a transaction is not uniform; it depends on the
    cost of the transaction and the customer's history
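
The non-uniform cost point can be illustrated with a small expected-cost rule. This is a minimal sketch, not the method of any system discussed in the talk: the function name `should_flag`, the fixed `review_cost`, and the assumption that a missed fraud costs exactly the transaction amount are all hypothetical. A real system would presumably also fold in the customer's history.

```python
def should_flag(p_fraud: float, amount: float, review_cost: float = 5.0) -> bool:
    """Flag a transaction when the expected loss of approving exceeds the cost of flagging.

    Assumptions (illustrative only): approving a fraudulent transaction loses
    `amount`; flagging any transaction costs a fixed `review_cost`.
    """
    if not 0.0 <= p_fraud <= 1.0:
        raise ValueError("p_fraud must be a probability")
    expected_loss_if_approved = p_fraud * amount
    return expected_loss_if_approved > review_cost


# Same fraud probability, different amounts, different decisions.
small = should_flag(p_fraud=0.02, amount=40.0)     # expected loss well below review cost
large = should_flag(p_fraud=0.02, amount=2000.0)   # expected loss well above review cost
```

With these made-up numbers the 40-unit purchase is approved and the 2000-unit purchase is flagged, even though the classifier assigned both the same fraud probability.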

9
System Constraints
  • What makes this problem unique?
  • Overlapping data
  • Legitimate transactions can appear fraudulent
    and vice versa
  • Speed demand is high
  • Customers don't want to wait more than a few
    seconds
  • False-positive rate
  • FP rate is very important: customers don't want a
    hassle
  • Data sharing is not encouraged
  • Customer privacy
  • Trade secrets

10
Research Constraints
  • What made this difficult to research?
  • Little information on in-use systems
  • Fortunately, one was studied
  • Training data kept secret
  • Features kept secret

11
System Analysis
  • Introduction
  • Researched the following
  • Karl Tuyls' comparison of neural networks and
    Bayesian networks
  • Minerva: Spain's neural-network-based system for
    classifying VISA transactions

12
System Analysis
  • Common Elements
  • Two main system types
  • By-owner system
  • History of the card owner stored and compared
    against the current transaction
  • Large data storage requirement
  • By-operation system
  • History of operations stored and compared over a
    fixed time window
  • Smaller storage requirement, faster processing
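
The by-operation idea of keeping history only over a fixed time window can be sketched as follows. This is an assumption-laden illustration, not Minerva's design: the class name `OperationWindow`, the `(timestamp, amount)` record shape, and the one-hour window are all hypothetical, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque


class OperationWindow:
    """Keep only the operations seen in the last `window_seconds` seconds."""

    def __init__(self, window_seconds: float):
        self.window_seconds = window_seconds
        self._ops = deque()  # (timestamp, amount) pairs, oldest first

    def add(self, timestamp: float, amount: float) -> None:
        # Assumes timestamps are non-decreasing across calls.
        self._ops.append((timestamp, amount))
        self._evict(timestamp)

    def _evict(self, now: float) -> None:
        # Drop operations that have fallen out of the fixed time window.
        while self._ops and now - self._ops[0][0] > self.window_seconds:
            self._ops.popleft()

    def summary(self) -> tuple:
        """Return (count, total amount) for the operations still in the window."""
        return len(self._ops), sum(a for _, a in self._ops)


window = OperationWindow(window_seconds=3600)
window.add(0, 20.0)
window.add(1800, 35.0)
window.add(4000, 500.0)   # the operation at t=0 is now older than one hour
count, total = window.summary()
```

Because old operations are evicted as new ones arrive, storage is bounded by the traffic inside one window rather than by each owner's full history, which is the trade-off the slide describes.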

13
System Analysis
  • Common Elements
  • Input Features
  • Not specified due to privacy
  • Hypothesis: looked at a paper on credit
    application fraud
  • Features are characteristics of the current
    transaction
  • Most likely features (among others)
  • Cost of transaction
  • Time of day
  • Location
  • Store name, type

14
Artificial Neural Networks
  • Applied to credit fraud
  • Most widely used method
  • In-use system: Minerva
  • Handles 60% of VISA traffic in Spain
  • 75% of Spain's credit card institutions
  • More than 1.2 million transactions per day
  • By-operation system

15
Artificial Neural Networks
  • Applied to credit fraud
  • Network Structure
  • 1 input layer
  • 1-2 hidden layers, 3-6 nodes each
  • 1 output layer
  • Activation Functions
  • Sigmoid most widely used
  • Hyperbolic tangent function also used, with
    comparable results
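
A network of the shape described above (one input layer, one small hidden layer, one output node, sigmoid activations) can be sketched in a few lines. This is only an illustration: the four input features, the five hidden nodes, and the random untrained weights are assumptions, so the resulting score has no meaning beyond showing the data flow.

```python
import math
import random


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def make_layer(n_in: int, n_out: int, rng: random.Random) -> list:
    """One weight row per output node; the last entry of each row is the bias."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in + 1)] for _ in range(n_out)]


def forward(layers: list, features: list) -> float:
    """Propagate one feature vector through every layer, sigmoid at each node."""
    activations = features
    for layer in layers:
        activations = [
            sigmoid(sum(w * a for w, a in zip(row[:-1], activations)) + row[-1])
            for row in layer
        ]
    return activations[0]  # single output node


rng = random.Random(0)
# 4 inputs -> 5 hidden nodes -> 1 output; hidden size picked from the 3-6 node range.
network = [make_layer(4, 5, rng), make_layer(5, 1, rng)]
score = forward(network, [0.2, -1.0, 0.5, 0.0])
```

Swapping `sigmoid` for `math.tanh` gives the hyperbolic tangent variant; the output would then lie in (-1, 1) instead of (0, 1).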

16
Artificial Neural Networks
  • Applied to credit fraud
  • Pre-processing data improves ANN performance
  • Restricting input features to those that are most
    relevant
  • Requires domain expertise
  • Normalization of input features
  • De-correlation of input features to eliminate
    unnecessary features
  • ANNs with preprocessing produced the best
    results
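
Two of the preprocessing steps listed above, normalization and removal of redundant correlated features, can be sketched as follows. The talk does not say which de-correlation method was used; this example assumes a simple correlation filter, and the function names, the 0.95 threshold, and the toy data are all hypothetical.

```python
import math


def zscore_columns(rows: list) -> list:
    """Rescale every feature column to zero mean and unit variance."""
    columns = list(zip(*rows))
    scaled = []
    for col in columns:
        mean = sum(col) / len(col)
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / len(col))
        scaled.append([(v - mean) / std if std > 0 else 0.0 for v in col])
    return [list(r) for r in zip(*scaled)]


def correlation(xs: list, ys: list) -> float:
    """Pearson correlation of two z-scored columns (mean 0, variance 1)."""
    return sum(x * y for x, y in zip(xs, ys)) / len(xs)


def drop_correlated(rows: list, threshold: float = 0.95) -> list:
    """Return indices of columns to keep, skipping near-duplicates of kept ones."""
    columns = list(zip(*zscore_columns(rows)))
    kept = []
    for i, col in enumerate(columns):
        if all(abs(correlation(col, columns[j])) < threshold for j in kept):
            kept.append(i)
    return kept


# Column 1 is column 0 in different units, so it carries no new information.
data = [
    [10.0, 1000.0, 3.0],
    [20.0, 2000.0, 1.0],
    [30.0, 3000.0, 4.0],
    [40.0, 4000.0, 1.0],
]
kept = drop_correlated(data)
```

Here the second column is dropped because it is perfectly correlated with the first, while the third column is kept.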

17
Bayesian Belief Networks
  • Applied to credit fraud
  • Comparable results
  • Several difficulties in comparison
  • Fewer features used
  • Different training, test data used
  • Different network structures tried
  • Evaluated using STAGE algorithm

18
Data Sharing using Meta-Learning
  • Applied to credit fraud
  • JAM: Java Agents for Meta-learning
  • Distributed computing model at the OS level
  • Allows each institution to have its own local
    classifier
  • Java agents work with local classifiers at each
    site to share learned information
  • Secured environment allows learned information
    to be shared without divulging personal or
    proprietary info
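
The core idea, sharing classifiers rather than data, can be sketched as below. This is a Python stand-in and not JAM's actual Java API: the two "bank" rules, the feature names, and the weighted-average combiner are all hypothetical. A true meta-learner would train the combiner on the local classifiers' outputs instead of using fixed weights.

```python
from typing import Callable, Dict, List

Transaction = Dict[str, float]
Classifier = Callable[[Transaction], float]  # returns an estimated fraud probability


def bank_a(tx: Transaction) -> float:
    # Hypothetical local rule learned at site A: large amounts look risky.
    return 0.9 if tx["amount"] > 1000 else 0.1


def bank_b(tx: Transaction) -> float:
    # Hypothetical local rule learned at site B: late-night activity looks risky.
    return 0.8 if tx["hour"] < 5 else 0.2


def combine(classifiers: List[Classifier], weights: List[float], tx: Transaction) -> float:
    """Weighted average of the local classifiers' outputs."""
    total = sum(weights)
    return sum(w * clf(tx) for clf, w in zip(classifiers, weights)) / total


tx = {"amount": 1500.0, "hour": 3.0}
score = combine([bank_a, bank_b], [1.0, 1.0], tx)
```

Only the classifier functions cross institutional boundaries here; neither site's training transactions are exposed to the other.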

19
System Comparisons
  • System constraints are key in comparisons
  • True positive rate
  • False positive rate
  • Speed: both learning and classifying new examples
  • Storage space requirements
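
The two rates used in the comparison can be computed from labelled predictions as in this minimal sketch. The function name `rates` and the toy labels are illustrative assumptions; 1 stands for fraud and 0 for legitimate.

```python
def rates(labels: list, predictions: list) -> tuple:
    """Return (true positive rate, false positive rate); 1 = fraud, 0 = legitimate."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)  # fraud caught
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)  # fraud missed
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)  # legitimate flagged
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)  # legitimate approved
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return tpr, fpr


# Skewed toy data: 2 fraudulent and 8 legitimate transactions.
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
predictions = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
tpr, fpr = rates(labels, predictions)
```

On this toy data half of the fraud is caught (TPR 0.5) and one in eight legitimate transactions is flagged (FPR 0.125). With a skewed class distribution, even a small FPR can mean more false alarms than true detections in absolute numbers.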

20
Results
  • ANNs
  • Best results with pre-processing
  • Training: 120K transactions; test: 100K
    transactions
  • Overfitting occurs after 90 epochs
  • At +/- 10% false positive, 60% true positive
  • At +/- 15% false positive, 70% true positive
  • Minerva
  • 1:11 false-to-true-positive rate at best
  • 1:2 false-to-true-positive rate at worst

21
Results
  • ANNs
  • Execution time: fast
  • Minerva reported at 60 ms, dominated by
    disk access
  • Training time: slow, about 2 hours

22
Results
  • BBNs
  • Comparable performance ratios to ANNs
  • Execution time: slow
  • Training time: fast, about 20 minutes

23
Conclusion
  • ANNs win
  • Neural networks are more widely used because
    speed is critical and performance is comparable

24
Future Work
  • There is a lot of work still to be done in the
    field
  • Information sharing (JAM) should assist greatly
  • Pruning algorithms for removing unnecessary
    perceptrons
  • Support Vector Machines have not been considered

25
  • Questions
  • Possible Answers