Loading...

PPT – Managing the Business Risk of Fraud using Sampling and Data Mining PowerPoint presentation | free to download - id: 3bb937-MWVmM

The Adobe Flash plugin is needed to view this content

Managing the Business Risk of Fraud using

Sampling and Data Mining

Mike Blakley

Presented to

Fall 2009

PWC Global Survey Nov, 2009 Economic crime in

a downturn

- Sharp rise in accounting fraud over the past 12

months - Accounting fraud had grown to 38 percent of the

economic crimes in 2009 - Employees face increased pressures to
- meet performance targets
- keep their jobs
- keep access to funding

Survey findings

- Greater risk of fraud due to increased incentives

or pressures - More opportunities to commit fraud, partially due

to reductions in internal finance staff - While companies are expecting more fraud, they

have not done much - People who look for fraud are more likely to find

it

Session objectives

- Understand the framework for managing the

business risk of fraud - Plan, perform and explain statistical sampling in

audits - Reduce audit costs using data mining, sequential

sampling and other sampling techniques - Apply SAS 56, the new SAS suite and the revised

(2007) Yellow Book. - Run, hands-on, the most productive analytic

technique (regression analysis). - Use data mining to introduce greater efficiency

into the audit process, without losing

effectiveness.

Session agenda - 1

- Introduction and the Process for Managing the

Business Risk of Fraud - Introductions All Around
- Course Objectives
- Framework of risk management for fraud
- Fundamentals of data mining
- Data mining The Engine That Drives analysis
- Analytics and Regression
- Sources of Analytics Data
- Basic and Intermediate ARTs
- SAS 56
- IIA Practice Advisory 2320
- The Yellow Book (2007 revision)
- The Guide Managing the Business Risk of Fraud

Session Agenda (contd) Sampling refresher

- Sampling
- The sampling process
- Sampling methods
- RAT-STATS
- Random Numbers
- Determining Sample Size
- Case Study
- Attribute sampling
- Variable Sampling
- Case study
- Stratified Sampling
- Obtaining and Interpreting the results
- Other Sampling Approaches
- DCAA Audit Package
- Sequential Sampling
- Overview of the process
- Attribute Sampling
- Variable Sampling

Session Agenda (contd) Linear regression as

an audit tool

- Regression Analysis
- Overview
- Terms
- Statistical basis
- Charting Regression Seeing Is Believing
- Plotting Data
- Inserting a Trend line
- Statistical Intervals
- Confidence Intervals
- Prediction Intervals
- Calculation of Statistical Confidence Bounds
- Case Study - Wake County Schools Bus Maintenance

Session Agenda (contd) Data mining, or How to

test 100

- Overview
- Statistical Basis
- Data Conversion and Extraction
- Data mining objectives
- Classification
- Trends
- Identification of extremes
- Major types of data analysis
- Numeric
- Date
- Text

Session Agenda (contd) Excel as an Analytics

tool

- Macros
- Tools Data Analysis
- The Macro facility
- Adding a little class to your audit
- VBA friend or foe

Handout (CD)

- CD with articles and software
- PowerPoint presentation
- More info at www.ezrstats.com

Cockroach theory of auditing

- If you spot one roach.

Cockroach theory of auditing

- There are probably 30 more that you dont see

Statistics based roach hunting

Many frauds coulda/woulda/shoulda been detected

with analytics

Overview

- Fraud patterns detectable with digital analysis
- Basis for digital analysis approach
- Usage examples
- Continuous monitoring
- Business analytics

The Why and How

Objective 1

- Three brief examples
- ACFE/IIA/AICPA Guidance Paper
- Practice Advisory 2320-1
- Auditors Top 10
- Process Overview
- Who, What, Why, When Where

Example 1 Wake County Transportation Fraud

Objective 1a

- Supplier Kickback School Bus parts
- 5 million
- Jail sentences
- Period of years

Too little too late

Objective 1a

- Understaffed internal audit
- Software not used
- Data on multiple platforms
- Transaction volumes large

Preventable

Objective 1a

- Need structured, objective approach
- Let the data talk to you
- Need efficient and effective approach

Regression Analysis

Objective 1

- Stepwise to find relationships
- Forwards
- Backwards
- Intervals
- Confidence
- Prediction

Data outliers

Objective 1

- Sometimes an out and out Liar
- But how do you detect it?

Data Outliers

Objective 1

- Plot transportation costs vs. number of buses
- Drill down on costs
- Preventive maintenance
- Fuel
- Inspection

Scatter plot with prediction and confidence

intervals

Cost of six types of AIDS drugs

Example 2

Objective 1a

Medicare HIV Infusion Costs

Objective 1

- CMS Report for 2005
- South Florida - 2.2 Billion
- Rest of the country combined - .1 Billion

Pareto Chart

Objective 1

Typical Prescription Patterns

Objective 1a

Example 2

Prescriptions by Dr. X

Example 2

Objective 1a

Off-label use

Example 2

Objective 1a

- Serostim
- Treat wasting syndrome, side effect of AIDS, OR
- Used by body builders for recreational purposes
- One physician prescribed 11.5 million worth (12

of the entire state)

Revenue trends

Example 3

Objective 1a

Dental Billings

Example 3

Objective 1a

Guidance Paper

Objective 1b

- A proposed implementation approach
- Managing the Business Risk of Fraud A Practical

Guide http//tinyurl.com/3ldfza - Five Principles
- Fraud Detection
- Coordinated Investigation Approach

Managing the Business Risk of Fraud A Practical

Guide

Objective 1b

- ACFE, IIA and AICPA Exposure draft issued

11/2007, final 5/2008 - Section 4 Fraud Detection

Guidance Paper

- Five Sections
- Fraud Risk Governance
- Fraud Risk Assessment
- Fraud Prevention
- Fraud Detection
- Fraud Investigation and corrective action

Risk Governance

- Fraud risk management program
- Written policy managements expectations

regarding managing fraud risk

Risk Assessment

- Periodic review and assessment of potential

schemes and events - Need to mitigate risk

Fraud Prevention

- Establish prevention techniques
- Mitigate possible impact on the organization

Fraud Detection

- Establish detection techniques for fraud
- Back stop where preventive measures fail, or
- Unmitigated risks are realized

Fraud Investigation and Corrective Action

- Reporting process to solicit input on fraud
- Coordinated approach to investigation
- Use of corrective action

60 Minutes World of Trouble

- 2/15/09 Scott Pelley
- Fraud Risk Governance one grand wink-wink,

nod-nod - Fraud Risk Assessment - categorically false
- Fraud Prevention my husband passed away
- Fraud Detection - We didn't know? Never saw one.
- Fraud Investigation and corrective action -

Pick-A-Payment losses 36 billion

Section 4 Fraud Detection

Objective 1b

- Detective Controls
- Process Controls
- Anonymous Reporting
- Internal Auditing
- Proactive Fraud Detection

Proactive Fraud Detection

Objective 1b

- Data Analysis to identify
- Anomalies
- Trends
- Risk indicators

Fraud Detective Controls

- Operate in the background
- Not evident in everyday business environment
- These techniques usually
- Occur in ordinary course of business
- Corroboration using external information
- Automatically communicate deficiencies
- Use results to enhance other controls

Examples of detective controls

- Whistleblower hot-lines (DHHS and OSA have them)
- Process controls (Medicaid audits and edits)
- Proactive fraud detection procedures
- Data analysis
- Continuous monitoring
- Benfords Law

Specific Examples Cited

Objective 1b

- Journal entries suspicious transactions
- Identification of relationships
- Benfords Law
- Continuous monitoring

Data Analysis enhances ability to detect fraud

Objective 1b

- Identify hidden relationships
- Identify suspicious transactions
- Assess effectiveness of internal controls
- Monitor fraud threats
- Analyze millions of transactions

Continuous Monitoring of Fraud Detection

- Organization should develop ongoing monitoring

and measurements - Establish measurement criteria (and communicate

to Board) - Measurable criteria include

Measurable Criteria number of

- fraud allegations
- fraud investigations resolved
- Employees attending annual ethics course
- Whistle blower allegations
- Messages supporting ethical behavior delivered by

executives - Vendors signing ethical behavior standards

Management ownership of each technique implemented

- Each process owner should
- Evaluate effectiveness of technique regularly
- Adjust technique as required
- Document adjustments
- Report modifications needed for techniques which

become less effective

Practice Advisory 2320-1 Analysis and Evaluation

- International standards for the professional

practice of Internal Auditing - Analytical audit procedures
- Efficient and effective
- Useful in detecting
- Differences that are not expected
- Potential errors
- Potential irregularities

Analytical Audit Procedures

- May include
- Study of relationships
- Comparison of amounts with similar information in

the organization - Comparison of amounts with similar information in

the industry

Analytical audit procedures

- Performed using monetary amounts, physical

quantities, ratios or percentages - Ratio, trend and regression analysis
- Period to period comparisons
- Auditors should use analytical audit procedures

in planning the engagement

Factors to consider

- Significance of the area being audited
- Assessment of risk
- Adequacy of system of internal control
- Availability and reliability of information
- Extent to which procedures provide support for

engagement results

Peeling the Onion

Objective 1c

Fraud Pattern Detection

Objective 1d

Digital Analysis (5W)

Objective 1e

A little about the basics of digital analysis.

- Who
- What
- Why
- Where
- When

Who Uses Digital Analysis

Objective 1e

- Traditionally, IT specialists
- With appropriate tools, audit generalists (CAATs)
- Growing trend of business analytics
- Essential component of continuous monitoring

What - Digital Analysis

Objective 1e

- Using software to
- Classify
- Quantify
- Compare
- Both numeric and non-numeric data

How - Assessing fraud risk

Objective 1e

- Basis is quantification
- Software can do the leg work
- Statistical measures of difference
- Chi square
- Kolmogorov-Smirnov
- D-statistic
- Specific approaches

Why - Advantages

Objective 1e

- Automated process
- Handle large data populations
- Objective, quantifiable metrics
- Can be part of continuous monitoring
- Can produce useful business analytics
- 100 testing is possible
- Quantify risk
- Repeatable process

Why - Disadvantages

Objective 1e

- Costly (time and software costs)
- Learning curve
- Requires specialized knowledge

When to Use Digital Analysis

Objective 1e

- Traditional intermittent (one off)
- Trend is to use it as often as possible
- Continuous monitoring
- Scheduled processing

Where Is It Applicable?

Objective 1e

- Any organization with data in digital format, and

especially if - Volumes are large
- Data structures are complex
- Potential for fraud exists

Disadvantages of digital analysis

- Cost
- Software
- Training
- Skills not widely available
- Time consuming
- Development costs
- Testing resources

Objective 1 Summarized

Objective 1

- Three brief examples
- CFE Guidance Paper
- Top 10 Metrics
- Process Overview
- Who, What, Why, When Where

Objective 1 - Summarized

- Understand the framework for managing the

business risk of fraud - Plan, perform and explain statistical sampling in

audits - Reduce audit costs using data mining, sequential

sampling and other sampling techniques - Apply SAS 56, the new SAS suite and the revised

(2007) Yellow Book. - Run, hands-on, the most productive analytic

technique (regression analysis). - Use data mining to introduce greater efficiency

into the audit process, without losing

effectiveness.

Next is plan, perform

Statistical Sampling

- Brief History / Timeline
- Overview
- Attribute Sampling Compliance
- Variable Sampling Numeric Estimates

History of Sampling

- Basis is two laws/theorems of probability
- Law of Large Numbers
- Central Limit Theorem

Law of large numbers

Time Line - LLN

- Indian mathematician Bramagupta 600 AD
- Italian mathematician Cardon 1500s
- Statement without proof that empirical statistics

improve with more trials

Time line LLN (continued)

- Jacob Bernoulli first to prove in 1713
- Foundation for central limit theorem

Central limit theorem

Classic measure

Mean of a sufficiently large number of random

samples will be approximately normally

distributed.

The traditional explanation

Central Limit Theorem

- See it in action today
- Any population
- Large number of samples
- Average is normally distributed

History of Central Limit Theorem

- French mathematician Abraham de Moivre
- 1733 approximate distribution from tossing coin

(heads/tails) - Ho hum reaction
- French Mathematician LaPlace expanded it
- Ho hum reaction

History of CLT (contd)

- Russian mathematician Lyapunov
- Proof in 1901
- Same reaction

Industrial revolution

Manufacturing

Engineering

Excitement!

Students T

William Gosset - 1908

Guinness Brewery

SAS 39

- Effective June, 1983
- Exposure draft for revision in 2009

Attribute sampling

- Buonaccorsi (1987)
- Refined calculations
- Few software packages use it

Overview

- Sample size calculations
- Attribute sampling
- Variable sampling
- Random number generators

Sample size calculation

- Its a guess
- Every package different answer
- Need to know the population
- But thats why youre taking a sample!

Attribute Sampling Using RAT-STATS

- Unrestricted populations

Session Objectives

- Understand what is attribute sampling and when to

use it - Understand unrestricted populations
- Overview of the process using RAT-STATS
- Understand the formula behind the computations

Attribute sampling

- Attribute
- Compliance testing
- Signatures on approval documents, attachment of

supporting documentation, etc.

Statistical approach

- Recommended
- Economical
- Efficient
- Requires determination of a sample size

Overview of process

- Determine the sampling objective
- Confidence
- Precision
- Determine required sample size
- Identify samples to be selected based upon random

numbers - Pull the sample and test
- Compute the sampling results (i.e. estimate of

range)

How this is done in RAT-STATS

- The sampling parameters are first developed by

the auditor - RAT-STATS is used to compute sample size
- RAT-STATS used to generate random numbers
- Pull the sample and test
- Enter results in RAT-STATS to compute estimates

Step 1 Develop sampling parameters

- Size of population
- Expected error rate
- Required confidence
- Required precision

Step 2 Obtain the random numbers

- Done by entering info into RAT-STATS
- Output can be a variety of sources
- Text File
- Excel
- Microsoft Access
- Print File

Step 3 Pull the sample

- Each random number selected corresponds with an

item - Put the selected item on a separate schedule

Step 4 - Test each selected item

- Generally requires reviewing documents

Step 5 Compute the results

- Enter summary information into RAT-STATS
- Output can be in a variety of formats
- Excel
- Microsoft Access
- Text File
- Print File
- Printer

Thats It!

- Now well see an actual demo using the RAT-STATS

software - Excel population of 5,000 invoices
- Results of test of attributes stored in the

worksheet

Variable Sampling Using RAT-STATS

- Unrestricted populations

Session Objectives

- Understand what variable sampling is and when to

use it - Understand unrestricted populations
- Overview of the process using RAT-STATS
- Understand the formula behind the computations

Variable sampling

- Variable
- Estimating account balances
- Estimating transaction totals

Statistical approach

- Recommended
- Economical
- Efficient
- Requires determination of a sample size

Overview of process

- Determine the sampling objective
- Confidence
- Precision
- Determine required sample size
- Identify samples to be selected based upon random

numbers - Pull the sample and test
- Compute the sampling results (i.e. estimate of

range)

How this is done in RAT-STATS

- The sampling parameters are first developed by

the auditor - RAT-STATS is used to compute sample size
- RAT-STATS used to generate random numbers
- Pull the sample and test
- Enter results in RAT-STATS to compute estimates

Step 1 Develop sampling parameters

- Probe sample
- Statistical measure
- Excel formula

Step 1 Develop sampling parameters

- Size of population
- Average value
- Standard deviation

Step 2 Obtain the random numbers

- Done by entering info into RAT-STATS
- Output can be a variety of sources
- Text File
- Excel
- Microsoft Access
- Print File

Step 3 Pull the sample

- Each random number selected corresponds with an

item - Put the selected item on a separate schedule

Step 4 - Test each selected item

- Generally requires reviewing documents
- Example data contains both examined and

audited value.

Step 5 Compute the results

- Enter summary information into RAT-STATS
- Output can be in a variety of formats
- Excel
- Microsoft Access
- Text File
- Print File
- Printer

Thats It!

- Now well see an actual demo using the RAT-STATS

software - Excel population of 5,000 invoices
- Audited values stored in the worksheet

Attribute Sampling Using RAT-STATS

- Stratified populations

Session Objectives

- Understand what is stratification and when to use

it - Overview of the process using RAT-STATS

Stratified sampling

- Strata
- Homogenous
- More efficient in some instances

Overview of process

- Separation into strata
- Determine the sampling objective
- Confidence
- Precision
- Determine required sample size
- Identify samples to be selected based upon random

numbers - Pull the sample and test
- Compute the sampling results (i.e. estimate of

range)

How this is done in RAT-STATS

- The sampling parameters are first developed by

the auditor - RAT-STATS is used to compute sample size
- RAT-STATS used to generate random numbers
- Pull the sample and test
- Enter results in RAT-STATS to compute estimates

Step 1 Develop sampling parameters

- Size of population
- Expected error rate
- Required confidence
- Required precision

Step 2 Obtain the random numbers

- Done by entering info into RAT-STATS
- Output can be a variety of sources
- Text File
- Excel
- Microsoft Access
- Print File

Step 3 Pull the sample

- Each random number selected corresponds with an

item - Put the selected item on a separate schedule

Step 4 - Test each selected item

- Generally requires reviewing documents

Step 5 Compute the results

- Enter summary information into RAT-STATS
- Output can be in a variety of formats
- Excel
- Microsoft Access
- Text File
- Print File
- Printer

Thats It!

- Now well see an actual demo using the RAT-STATS

software - Excel population of 5,000 invoices
- Results of test of attributes stored in the

worksheet

Variable Sampling Using RAT-STATS

- Stratified populations

Session Objectives

- Understand what stratified sampling is and when

to use it - Populations benefiting from stratified sampling
- Overview of the process using RAT-STATS
- Understand the formula behind the computations

Stratified variable sampling

- Stratified
- Variable
- Estimating amounts
- Narrower standard deviation

Overview of process

- Determine the sampling objective
- Confidence
- Precision
- Determine required sample size
- Identify samples to be selected based upon random

numbers - Pull the sample and test
- Compute the sampling results (i.e. estimate of

range)

How this is done in RAT-STATS

- The sampling parameters are first developed by

the auditor - RAT-STATS is used to compute sample size
- RAT-STATS used to generate random numbers
- Pull the sample and test
- Enter results in RAT-STATS to compute estimates

Step 1 Develop sampling parameters

- Probe sample
- Statistical measure
- Excel formula

Step 1 Develop sampling parameters

- Number of strata
- Size of population
- Average value
- Standard deviation

Step 2 Obtain the random numbers

- Done by entering info into RAT-STATS
- Multi-stage random numbers
- Output can be a variety of sources
- Text File
- Excel
- Microsoft Access
- Print File

Step 3 Pull the sample

- Each random number selected corresponds with an

item in a strata - Put the selected item on a separate schedule

Step 4 - Test each selected item

- Generally requires reviewing documents
- Example data contains both examined and

audited value.

Step 5 Compute the results

- Enter summary information into RAT-STATS
- Output can be in a variety of formats
- Excel
- Microsoft Access
- Text File
- Print File
- Printer

Thats It!

- Now well see an actual demo using the RAT-STATS

software - Excel population of 5,000 invoices
- Divided into three strata
- Audited values stored in the worksheet

Objective 2 - Summarized

- Understand the framework for managing the

business risk of fraud - Plan, perform and explain statistical sampling in

audits - Reduce audit costs using data mining, sequential

sampling and other sampling techniques - Apply SAS 56, the new SAS suite and the revised

(2007) Yellow Book. - Run, hands-on, the most productive analytic

technique (regression analysis). - Use data mining to introduce greater efficiency

into the audit process, without losing

effectiveness.

Next is cost reduction

Techniques for cost reduction

- Optimize sample size (most bang for the buck)
- Skip sampling review 100 of transactions using

computer assisted audit techniques (CAATs)

Sample optimization

- Sequential sampling

University of Hawaii

- Banana aphids

Sequential sampling

- Banana aphids

100 test using CAATs

- Provides complete coverage
- Best practice
- Basis for continuous monitoring
- Repeatable process

Objective 3 - Summarized

- Understand the framework for managing the

business risk of fraud - Plan, perform and explain statistical sampling in

audits - Reduce audit costs using data mining, sequential

sampling and other sampling techniques - Apply SAS 56, the new SAS suite and the revised

(2007) Yellow Book. - Run, hands-on, the most productive analytic

technique (regression analysis). - Use data mining to introduce greater efficiency

into the audit process, without losing

effectiveness.

Next is Yellow Book and SAS 56

Yellow book standards

Standards regarding statistical sampling and IT

General standards

- 3.43 Technical Knowledge and competence
- The staff assigned to conduct an audit or

attestation engagement under GAGAS must

collectively possess the technical knowledge,

skills, and experience necessary to be competent

for the type of work being performed before

beginning work on that assignment. - The staff assigned to a GAGAS audit or

attestation - engagement should collectively possess

Stat sampling and IT

- Skills appropriate for the work being performed.

For example, staff or specialist skills in - (1) statistical sampling if the work involves use

of statistical sampling - (2) information technology

SAS 56 Analytical procedures

- Requires use of analytic review procedures for
- Audit planning
- Overall review stages

SAS 56 Analytical Review

- Encourages use of analytical review
- Provides guidance

A wide variety of analytical procedures may be

useful for this purpose.

Objective 4 - Summarized

- Understand the framework for managing the

business risk of fraud - Plan, perform and explain statistical sampling in

audits - Reduce audit costs using data mining, sequential

sampling and other sampling techniques - Apply SAS 56, the new SAS suite and the revised

(2007) Yellow Book. - Run, hands-on, the most productive analytic

technique (regression analysis). - Use data mining to introduce greater efficiency

into the audit process, without losing

effectiveness.

Next is linear regression

Next Metric

- Outliers
- Stratification
- Day of Week
- Round Numbers
- Made Up Numbers
- Market basket
- Trends
- Gaps
- Duplicates
- Dates

Trend Busters

7 - Trends

Does the pattern make sense?

Trend Busters

7 Trends

- Linear regression
- Sales are up, but cost of goods sold is down
- Spikes

Purpose / Type of Errors

7 Trends

- Identify trend lines, slopes, etc.
- Correlate trends
- Identify anomalies
- Key punch errors where amount is order of

magnitude

Linear Regression

7 Trends

- Test relationships (e.g. invoice amount and sales

tax) - Perform multi-variable analysis

How is it done?

7 Trends

- Estimate linear trends using best fit
- Measure variability (standard errors)
- Measure slope
- Sort descending by slope, variability, etc.

Trend Lines by Account - Example Results

7 Trends

Generally the trend is gently sloping up, but two

accounts (43870 and 54630) are different.

Scatter plot with prediction and confidence

intervals

Objective 5 - Summarized

- Understand the framework for managing the

business risk of fraud - Plan, perform and explain statistical sampling in

audits - Reduce audit costs using data mining, sequential

sampling and other sampling techniques - Apply SAS 56, the new SAS suite and the revised

(2007) Yellow Book. - Run, hands-on, the most productive analytic

technique (regression analysis). - Use data mining to introduce greater efficiency

into the audit process, without losing

effectiveness.

Next is data mining

Basis for Pattern Detection

Objective 6

- Analytical review
- Isolate the significant few
- Detection of errors
- Quantified approach

Understanding the Basis

Objective 2

- Quantified Approach
- Population vs. Groups
- Measuring the Difference
- Stat 101 Counts, Totals, Chi Square and K-S
- The metrics used

Quantified Approach

Objective 2a

- Based on measureable differences
- Population vs. Group
- Shotgun technique

Detection of Fraud Characteristics

Objective 2a

- Something is different than expected

Fraud patterns

Objective 2b

- Common theme something is different
- Groups
- Group pattern is different than overall population

Measurement Basis

Objective 2c

- Transaction counts
- Transaction amounts

A few words about statistics (the s word)

Objective 2d

- Detailed knowledge of statistics not necessary
- Software packages do the number-crunching
- Statistics used only to highlight potential

errors/frauds - Not used for quantification

How is digital analysis done?

Objective 2d

- Comparison of group with population as a whole
- Can be based on either counts or amounts
- Difference is measured
- Groups can then be ranked using a selected

measure - High difference possible error/fraud

Demo in Excel of the process

- Based roughly on the Wake County Transportation

fraud - Illustrates how the process works, using Excel

Histograms

Objective 2d

- Attributes tallied and categorized into bins
- Counts or sums of amounts

Two histograms obtained

Objective 2d

- Population and group

Compute Cumulative Amount for each

Objective 2d

Are the histograms different?

Objective 2d

- Two statistical measures of difference
- Chi Squared (counts)
- K-S (distribution)
- Both yield a difference metric

Chi Squared

Objective 2d

- Classic test on data in a table
- Answers the question are the rows/columns

different - Some limitations on when it can be applied

Chi Squared

Objective 2d

- Table of Counts
- Degrees of Freedom
- Chi Squared Value
- P-statistic
- Computationally intensive

Kolmogorov-Smirnov

Objective 2d

- Two Russian mathematicians
- Comparison of distributions
- Metric is the d-statistic

How is K-S test done?

Objective 2d

- Four step process
- For each cluster element determine percentage
- Then calculate cumulative percentage
- Compare the differences in cumulative percentages
- Identify the largest difference

Kolmogorov-Smirnov

Objective 2d - KS

Classification by metrics

Objective 2e

- Stratification
- Day of week
- Happens on holiday
- Round numbers
- Variability
- Benfords Law
- Trend lines
- Relationships (market basket)
- Gaps
- Duplicates

Auditors Top 10 Metrics

Objective e

- Outliers / Variability
- Stratification
- Day of Week
- Round Numbers
- Made Up Numbers
- Market basket
- Trends
- Gaps
- Duplicates
- Dates

Understanding the Basis

Objective 2

- Quantified Approach
- Population vs. Groups
- Measuring the Difference
- Stat 101 Counts, Totals, Chi Square and K-S
- The metrics used

Objective 2 - Summarized

- Understand why and how
- Understand statistical basis for quantifying

differences - Identify ten general tools and techniques
- Understand examples done using Excel
- How pattern detection fits in

Next are the metrics

Its that time!

Session Break!

The Top 10 Metrics

Objective 3

- Overview
- Explain Each Metric
- Examples of what it can detect
- How to assess results

Trapping anomalies

Objective 3

Fraud Pattern Detection

Objective 3

Outliers / Variability

1 - Outliers

Outliers are amounts which are significantly

different from the rest of the population

Outliers / Variability

1 - Outliers

- Charting (visual)
- Software to analyze z-scores
- Top and Bottom 10, 20 etc.
- High and low variability (coefficient of

variation)

Drill down to the group level

1 - Outliers

- Basic statistics
- Minimum, maximum and average
- Variability
- Sort by statistic of interest
- Variability (coefficient of variation)
- Maximum, etc.

Example Results

1 - Outliers

Two providers (3478421 and 2356721) had

significantly more variability in the amounts of

their claims than all the rest.

Next Metric

- Outliers
- Stratification
- Day of Week
- Round Numbers
- Made Up Numbers
- Market basket
- Trends
- Gaps
- Duplicates
- Dates

Unusual stratification patterns

2 - Stratification

Do you know how your data looks?

Stratification - How

2 - Stratification

- Charting (visual)
- Chi Squared
- Kolmogorov-Smirnov
- By groups

Purpose / types of errors

2 Stratification

- Transactions out of the ordinary
- Up-coding insurance claims
- Skewed groupings
- Based on either count or amount

The process?

2 Stratification

- Stratify the entire population into bins

specified by auditor - Same stratification on each group (e.g. vendor)
- Compare the group tested to the population
- Obtain measure of difference for each group
- Sort descending on difference measure

Units of Service Stratified - Example Results

2 Stratification

Two providers (2735211 and 4562134) are shown to

be much different from the overall population (as

measured by Chi Square).

Next Metric

- Outliers
- Stratification
- Day of Week
- Round Numbers
- Made Up Numbers
- Market basket
- Trends
- Gaps
- Duplicates
- Dates

Day of Week

3 Day of Week

- Activity on weekdays
- Activity on weekends
- Peak activity mid to late week

Purpose / Type of Errors

3 Day of Week

- Identify unusually high/low activity on one or

more days of week - Dentist who only handled Medicaid on Tuesday
- Office is empty on Friday

How it is done?

- Programmatically check entire population
- Obtain counts and sums by day of week (1-7)
- Prepare histogram
- For each group do the same procedure
- Compare the two histograms
- Sort descending by metric (chi square/d-stat)

Day of Week - Example Results

3 Day of Week

Provider 2735211 only provided service for

Medicaid on Tuesdays. Provider 4562134 was

closed on Thursdays and Fridays.

Next Metric

- Outliers
- Stratification
- Day of Week
- Round Numbers
- Made Up Numbers
- Market basket
- Trends
- Gaps
- Duplicates
- Dates

Round Numbers

4 Round Numbers

Its about.

Estimates!

Purpose / Type of Errors

4 Round Numbers

- Isolate estimates
- Highlight account numbers in journal entries with

round numbers - Split purchases (under the radar)
- Which groups have the most estimates

Round numbers

4 Round Numbers

- Classify population amounts
- 1,375.23 is not round
- 5,000 is a round number type 3 (3 zeros)
- 10,200 is a round number type 2 (2 zeros)
- Quantify expected vs. actual (d-statistic)
- Generally represents an estimate
- Journal entries

Round Numbers in Journal Entries - Example Results

4 Round Numbers

Two accounts, 2735211 and 4562134 have

significantly more round number postings than any

other posting account in the journal entries.

Next Metric

- Outliers
- Stratification
- Day of Week
- Round Numbers
- Made Up Numbers
- Market basket
- Trends
- Gaps
- Duplicates
- Dates

Made up Numbers

5 Made up numbers

Curb stoning

Imaginary numbers

Benfords Law

What can be detected

5 Made Up Numbers

- Made up numbers e.g. falsified inventory

counts, tax return schedules

Benfords Law using Excel

5 Made Up Numbers

- Basic formula is log(1(1/N))
- Workbook with formulae available at

http//tinyurl.com/4vmcfs - Obtain leading digits using Left function, e.g.

left(Cell,1)

Made up numbers

5 Made Up Numbers

- Benfords Law
- Check Chi Square and d-statistic
- First 1,2,3 digits
- Last 1,2 digits
- Second digit
- Sources for more info

How is it done?

5 Made Up Numbers

- Decide type of test (first 1-3 digits, last 1-2

digit etc) - For each group, count number of observations for

each digit pattern - Prepare histogram
- Based on total count, compute expected values
- For the group, compute Chi Square and d-stat
- Sort descending by metric (chi square/d-stat)

Invoice Amounts tested with Benfords law -

Example Results

5 Made Up Numbers

During tests of invoices by store, two stores,

324 and 563 have significantly more differences

than any other store as measured by Benfords

Law.

Next Metric

- Outliers
- Stratification
- Day of Week
- Round Numbers
- Made Up Numbers
- Market basket
- Trends
- Gaps
- Duplicates
- Dates

Market Basket

6 Market Basket

- Medical Ping ponging
- Pattern associations
- Apriori program
- References at end of slides
- Apriori Latin a (from) priori (former)
- Deduction from the known

Purpose / Type of Errors

6 Market basket

- Unexpected patterns and associations
- Based on market basket concept
- Unusual combinations of diagnosis code on medical

insurance claim

Market Basket

6 Market basket

- JE Accounts
- JE Approvals
- Credit card fraud in Japan taxi and ATM

How is it done?

6 Market basket

- First, identify groups, e.g. all medical

providers for a patient - Next, for each provider, assign a unique integer

value - Create a text file containing the values
- Run apriori analysis

Apriori outputs

6 Market basket

- For each unique value, probability of other

values - If you see Dr. Jones, you will also see Dr. Smith

(80 probability) - If you see a JE to account ABC, there will also

an entry to account XYZ (30)

Next Metric

- Outliers
- Stratification
- Day of Week
- Round Numbers
- Made Up Numbers
- Market basket
- Trends
- Gaps
- Duplicates
- Dates

Numeric Sequence Gaps

8 - Gaps

Whats there is interesting, whats not there is

critical

Purpose / Type of Errors

8 Gaps

- Missing documents (sales, cash, etc.)
- Inventory losses (missing receiving reports)
- Items that walked off

How is it done?

8 Gaps

- Check any sequence of numbers supposed to be

complete, e.g. - Cash receipts
- Sales slips
- Purchase orders

Gaps Using Excel

8 Gaps

- Excel sort and check
- Excel formula
- Sequential numbers and dates

Gap Testing - Example Results

8 Gaps

Four check numbers are missing.

Next Metric

- Outliers
- Stratification
- Day of Week
- Round Numbers
- Made Up Numbers
- Market basket
- Trends
- Gaps
- Duplicates
- Dates

Duplicates

9 - Duplicates

Why is there more than one?

Same, Same, Same, and

Same, Same, Different

Two types of (related) tests

9 Duplicates

- Same items same vendor, same invoice number,

same invoice date, same amount - Different items same employee name, same city,

different social security number

Duplicate Payments

9 - Duplicates

- High payback area
- Fuzzy logic
- Overriding software controls

Fuzzy matching with software

9 - Duplicates

- Levenshtein distance
- Soundex
- Like clause in SQL
- Regular expression testing in SQL
- Vendor/employee situations

Russian physicist

How is it done?

9 - Duplicates

- First, sort file in sequence for testing
- Compare items in consecutive rows
- Extract exceptions for follow-up

Possible Duplicates - Example Results

9 - Duplicates

Five invoices may be duplicates.

Next Metric

- Outliers
- Stratification
- Day of Week
- Round Numbers
- Made Up Numbers
- Market basket
- Trends
- Gaps
- Duplicates
- Dates

Date Checking

10 - Dates

If were closed, why is there

Adjusting journal entry?

Receiving report? Payment issued?

Holiday Date Testing

10 Dates

- Red Flag indicator

Date Testing challenges

10 Dates

- Difficult to determine
- Floating holidays Friday, Saturday, Sunday,

Monday

Typical audit areas

10 Dates

- Journal entries
- Employee expense reports
- Business telephone calls
- Invoices
- Receiving reports
- Purchase orders

Determination of Dates

10 Dates

- Transactions when business is closed
- Federal Office of Budget Management
- An excellent fraud indicator in some cases

Holiday Date Testing

10 Dates

- Identifying holiday dates
- Error prone
- Tedious
- U.S. only

Federal Holidays

10 Dates

- Established by Law
- Ten dates
- Specific date (unless weekend), OR
- Floating holiday

Federal Holiday Schedule

10 Dates

- Office of Personnel Management
- Example of specific date Independence Day, July

4th (unless weekend) - Example of floating date Martin Luther Kings

birthday (3rd Monday in January) - Floating Thanksgiving 4th Thursday in November

How it is done?

10 Dates

- Programmatically count holidays for entire

population - For each group, count holidays
- Compare the two histograms (group and population)
- Sort descending by metric (chi square/d-stat)

Holiday Counts - Example Results

10 Dates

Two employees (10245 and 32325) were off the

chart in terms of expense amounts incurred on a

Federal Holiday.

The Top 10 Metrics

Objective 3

- Overview
- Explain Each Metric
- Examples of what it can detect
- How to assess results

Objective 3 - Summarized

- Understand why and how
- Understand statistical basis for quantifying

differences - Identify ten general tools and techniques
- Understand examples done using Excel
- How pattern detection fits in

Next using Excel

Use of Excel

Objective 4

- Built-in functions
- Add-ins
- Macros
- Database access

Excel templates

Objective 4

- Variety of tests
- Round numbers
- Benfords Law
- Outliers
- Etc.

Excel Univariate statistics

Objective 4

- Work with Ranges
- sum, average, stdevp
- largest(Range,1), smallest(Range,1)
- min, max, count
- Tools Data Analysis Descriptive Statistics

Excel Histograms

Objective 4

- Tools Data Analysis Histogram
- Bin Range
- Data Range

Excel Gaps testing

Objective 4

- Sort by sequential value
- if(thiscell-lastcell ltgt 1,thiscell-lastcell,0)
- Copy/paste special
- Sort

Detecting duplicates with Excel

Objective 4

- Sort by sort values
- if testing
- if(and(thiscelllastcell, etc.))

Performing audit tests with macros

Objective 4

- Repeatable process
- Audit standardization
- Learning curve
- Streamlining of tests
- More efficient and effective
- Examples - http//ezrstats.com/Macros/home.html

Using database audit software

Objective 4

- Many built-in functions right off the shelf

with SQL - Control totals
- Exception identification
- Drill down
- Quantification
- June 2008 article in the EDP Audit Control

Journal (EDPACS) SQL as an audit tool - http//ezrstats.com/doc/SQL_As_An_Audit_Tool.pdf

Use of Excel

Objective 4

- Built-in functions
- Add-ins
- Macros
- Database access

Objective 4 - Summarized

- Understand why and how
- Understand statistical basis for quantifying

differences - Identify ten general tools and techniques
- Understand examples done using Excel
- How Pattern Detection fits in

Next Fit

How Pattern Detection Fits In

Objective 5

- Business Analytics
- Fraud Pattern Detection
- Continuous monitoring

Where does Fraud Pattern Detection fit in?

Objective 5

Right in the middle

- Business Analytics
- Fraud Pattern Detection
- Continuous fraud pattern detection
- Continuous Monitoring

Business Analytics

Objective 5

- Fraud analytics -gt business analytics
- Business analytics -gt fraud analytics

Role in Continuous Monitoring (CM)

Objective 5

- Fraud analytics can feed (CM)
- Continuous fraud pattern detection
- Use output from CM to tune fraud pattern detection

Objective 6 - Summarized

- Understand the framework for managing the

business risk of fraud - Plan, perform and explain statistical sampling in

audits - Reduce audit costs using data mining, sequential

sampling and other sampling techniques - Apply SAS 56, the new SAS suite and the revised

(2007) Yellow Book. - Run, hands-on, the most productive analytic

technique (regression analysis). - Use data mining to introduce greater efficiency

into the audit process, without losing

effectiveness.

Links for more information

- Kolmogorov-Smirnov
- http//tinyurl.com/y49sec
- Benfords Law http//tinyurl.com/3qapzu
- Chi Square tests http//tinyurl.com/43nkdh
- Continuous monitoring http//tinyurl.com/3pltdl

Market Basket

- Apriori testing for ping ponging
- Temple University http//tinyurl.com/5vax7r
- Apriori program (open source)

http//tinyurl.com/5qehd5 - Article Medical ping ponging

http//tinyurl.com/5pzbh4

Excel macros used in auditing

- Excel as an audit software http//tinyurl.com/6h3y

e7 - Selected macros - http//ezrstats.com/Macros/home.

html - Spreadsheets forever - http//tinyurl.com/5ppl7t

Questions?

Contact info

- Phone (919)-219-1622
- E-mail Mike.Blakley_at_ezrstats.com
- Blog http//blog.ezrstats.com