The Relevance of Human Factors to the management of spreadsheet risk

1
The Relevance of Human Factors to the management
of spreadsheet risk
  • INFORMS 2005 San Francisco
  • Simon Thorne and David Ball
  • UWIC

2
A brief overview of spreadsheet errors
  • Three well-accepted sources of spreadsheet error
    are:
    • Logical errors
    • Mechanical errors
    • Domain Knowledge errors
  • More recently recognised Human Factor errors:
    • Human Mechanical Error (well researched)
    • Cognitive and psychological (much newer)

3
Human Factors
  • Cognitive Load
  • Overconfidence
  • Base Error Rate
  • Working memory limit
  • State space searching
  • Bias: Optimism bias, Hypothesis fixation,
    Confirmation bias
  • Question: are spreadsheet errors a Human-Computer
    mismatch?

4
Comparing Humans and Computers (conventional)
5
Comparing Humans and Computers (conventional)
  • Humans are good at giving real-world examples,
    weak at generating formulae
  • Computers are good at manipulating mathematics
    (ALU), weak at generating real-world examples

6
Comparing Humans and Computers (conventional)
7
Spreadsheet development methods sympathetic to
Human Factors
  • In order to reduce spreadsheet errors, novel
    methods of interaction are being researched.
  • One example (at UWIC) of such research is the
    exploitation of Machine Learning Techniques with
    Example Driven Modelling
  • But this raises the question of whether the same
    or corresponding Human Factors will apply.

8
Example Driven Modelling
  • User provides examples of attribute
    classifications of a given problem
  • This example data set is then used to generalise
    the problem
  • Basic premise

9
EDM in Practice
  • To demonstrate how this process works the
    following slides will run through an example
  • This will run from the construction of the data
    set to the results gained from the network

10
The Example Problem
  • Credit Risk Decision Support System (DSS)
    implemented in a spreadsheet (Gross et al., 2006)
  • Used to assess the creditworthiness of companies
  • Model with three classifications: Accept, Further
    Enquire and Reject
  • Based upon several contributing variables, such
    as:
  • Current year's sales
  • Previous debt balances
  • Net worth
  • A number of risk class indexes

11
Risk analysis model in detail
12
Extracting the example data set
  • The first task is extracting the example data set
    from the example problem
  • The easiest way to do this is to examine the rules
    that satisfy or reject classifications in the
    model
  • Once the parameters of the model have been
    defined, an example data set can be constructed
    around true and false attribute classifications
    for a particular rule
  • The example data set appears as a set of values
    needed to satisfy or decline classifications in
    the model

13
Example rule
  • To satisfy Class 1, it is necessary that Variable
    3 (Net worth) is greater than or equal to 50,000
    AND Variable 4 (DB Credit Index) is greater than
    or equal to 2 AND Variable 5 (DB Paydex index)
    is greater than or equal to 70 AND Variable 6
    (DB Stress class index) is equal to 1.
  • So the next step is to think up values that
    satisfy or decline that rule
  • i.e. provide one example with 50,000 Net Worth
    (satisfies the rule) and one with 45,000 Net
    Worth (declines it)
  • The end product will be a set of input variables
    with values assigned, each labelled with the
    appropriate classification
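
The Class 1 rule and the "think up values" step can be sketched in Python. This is only an illustration under assumptions: the `Company` type, its field names and the specific example values (other than the 50,000 and 45,000 Net Worth cases) are hypothetical and do not come from the original spreadsheet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Company:
    """Hypothetical container for Variables 3 to 6 of the example rule."""
    net_worth: float    # Variable 3 (Net worth)
    credit_index: int   # Variable 4 (DB Credit Index)
    paydex_index: int   # Variable 5 (DB Paydex index)
    stress_class: int   # Variable 6 (DB Stress class index)


def satisfies_class_1(c: Company) -> bool:
    """Class 1 rule exactly as stated on the slide."""
    return (
        c.net_worth >= 50_000
        and c.credit_index >= 2
        and c.paydex_index >= 70
        and c.stress_class == 1
    )


# Hand-picked examples: one that satisfies the rule, then one
# declining example per condition (values are illustrative only).
examples = [
    Company(50_000, 2, 70, 1),  # on every boundary: satisfies
    Company(45_000, 2, 70, 1),  # net worth too low: declines
    Company(60_000, 1, 80, 1),  # credit index too low: declines
    Company(60_000, 3, 65, 1),  # paydex index too low: declines
    Company(60_000, 3, 80, 2),  # stress class not 1: declines
]

# The example data set: input values paired with their classification.
data_set = [(c, satisfies_class_1(c)) for c in examples]
```

In Example Driven Modelling the user would supply the labels directly rather than compute them from a rule; the rule is used here only to keep the labels consistent with the slide.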

14
Data set: Choose easy examples
15
Feeding the data set into the Neural Network
  • Once this has been completed, the data set is fed
    into a Neural Network
  • The network then learns the model based upon
    the parameters provided in the data set
  • The Neural Network uses the Backpropagation
    learning rule with Genetic Optimisation of the
    input space
  • Training set, test set and universe
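
One way to read "training set, test set and universe" is sketched below. It assumes that the universe is the full set of input combinations under consideration, and that the training set and the blind test set are disjoint samples drawn from it. The candidate values, set sizes and seed are hypothetical.

```python
import itertools
import random

# Hypothetical candidate values for Variables 3 to 6 (illustrative only).
NET_WORTH = [45_000, 50_000, 60_000]
CREDIT_INDEX = [1, 2, 3]
PAYDEX_INDEX = [65, 70, 80]
STRESS_CLASS = [1, 2]

# Universe: every combination of the candidate values (3 * 3 * 3 * 2 cases).
universe = list(
    itertools.product(NET_WORTH, CREDIT_INDEX, PAYDEX_INDEX, STRESS_CLASS)
)

rng = random.Random(0)  # fixed seed so the split is reproducible

# Training set: cases the network is allowed to learn from.
training_set = rng.sample(universe, 20)

# Test set: drawn only from cases the network has not seen.
remaining = [case for case in universe if case not in training_set]
test_set = rng.sample(remaining, 10)
```

Keeping the two samples disjoint is what makes the later test "blind".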

16
Neural Networks
  • Neural Networks (NN) are an artificial emulation
    of the biological structures of neurons in the
    brain
  • Like the brain, NN are designed to learn from
    experience
  • NN are a connectionist approach in machine
    learning
  • They have the ability to test and refine
    classifications based upon input provided by the
    user.
  • This allows NN to generalise real-world problems
    given some input data to learn from

17
Artificial Neuron
18
Neural Networks
  • A neural network consists of artificial neurons
    arranged in a connected network.
  • They consist of:
  • Input signals
  • A set of weights that describe connection
    strengths
  • An output that depends on the weighted input
    being greater than a certain threshold
  • A learning rule that specifies how to adjust
    the weights for a given input/output pair
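
The four ingredients listed above can be illustrated with a single threshold neuron. The sketch uses the classic perceptron learning rule, which is a simpler stand-in and not the Backpropagation rule used in the presentation. The function names and the logical AND training data are assumptions chosen for illustration.

```python
def neuron_output(inputs, weights, threshold):
    """Fire (1) when the weighted sum of the inputs exceeds the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total > threshold else 0


def train(samples, n_inputs, rate=1.0, epochs=100):
    """Perceptron learning rule: nudge the weights after each wrong output."""
    weights = [0.0] * n_inputs
    threshold = 0.0
    for _ in range(epochs):
        mistakes = 0
        for inputs, target in samples:
            error = target - neuron_output(inputs, weights, threshold)
            if error != 0:
                mistakes += 1
                # Strengthen or weaken each connection in proportion
                # to its input, and move the threshold the other way.
                weights = [w + rate * error * x for w, x in zip(weights, inputs)]
                threshold -= rate * error
        if mistakes == 0:  # every input/output pair is reproduced
            break
    return weights, threshold


# Input/output pairs for logical AND, a linearly separable toy problem.
and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, threshold = train(and_samples, n_inputs=2)
```

A single neuron of this kind can only learn linearly separable patterns, which is one reason multi-layer networks are used in practice.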

19
Neural Networks
  • The example values are passed through the network
    to give an output which is then compared to the
    training set provided by the user
  • The network then adjusts the weights in an
    attempt to mimic the input/output pattern of the
    training set
  • This process is repeated until the network
    reaches some predetermined level of accuracy
  • The network has then generalised the problem
  • This can then be verified using the blind test set

20
The generalised model
  • Once the model has learnt the problem, it outputs
    the Mean Squared Error (MSE) of the network
  • Chi squared
  • This value indicates how well the model has
    learnt the task and hence how well it will
    perform in testing
  • Blind testing is the best absolute test, i.e.
    passing unseen data through the network and
    checking the classifications it gives on that
    basis
  • So the trained network is given new examples and is
    assessed on how well it classifies those examples
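
The two measures mentioned above can be written down in a few lines. The sketch assumes the MSE is the average squared difference between target and output, and that blind testing reports the fraction of unseen examples classified correctly. The stand-in classifier and the unseen examples are hypothetical, not results from the presentation.

```python
def mean_squared_error(targets, outputs):
    """Average squared difference between desired and actual outputs."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)


def blind_test_accuracy(classify, unseen):
    """Fraction of unseen (inputs, label) examples classified correctly."""
    correct = sum(1 for inputs, label in unseen if classify(inputs) == label)
    return correct / len(unseen)


# Hypothetical stand-in for a trained model: it has learnt a Net Worth
# cut-off of 48,000 instead of the true 50,000.
def stand_in_classifier(net_worth):
    return net_worth >= 48_000


# Hypothetical unseen examples, labelled with the true 50,000 rule.
unseen_examples = [
    (30_000, False),
    (49_000, False),  # the stand-in model gets this one wrong
    (52_000, True),
    (80_000, True),
]

training_error = mean_squared_error([1, 0, 1, 0], [0.9, 0.2, 0.8, 0.1])
accuracy = blind_test_accuracy(stand_in_classifier, unseen_examples)
```

The 49,000 case shows why blind testing matters: a model can have a low training error and still misplace a boundary that only unseen examples reveal.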

21
Results
Acceptable value range
22
Conclusions
  • This method can be applied to certain business
    problems (logic models)
  • In the presented example, the accuracy of the
    network was more than satisfactory
  • The actual process of generating examples,
    although different, is easier than generating the
    equivalent formulae

23
The equivalent equation
  • Producing the formulae would eventually lead to
    this
  • IF(NOT(OR(K3:M3)), "Reject", IF(OR(AND(K3,
    NOT(OR(L3:M3))), AND(OR(L3:M3), NOT(K3))),
    "Further Evaluate", "Accept"))
  • This final equation is based upon the logic test
    of three others.
  • OR(K5, K20) (Class 3)
  • NOT(OR(K3:K4, K6:K19, K21)) (Class 2)
  • AND(OR(K5, K20), NOT(OR(K3:K4, K6:K19, K21)))
    (Class 1)
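
For comparison, the final formula can be transcribed into Python. The transcription assumes that cell references such as K3M3 denote the range K3:M3 (the transcript appears to have dropped punctuation) and that K3, L3 and M3 each hold a TRUE/FALSE test result. The parameter names `k3`, `l3` and `m3` simply mirror the cells.

```python
def classify(k3: bool, l3: bool, m3: bool) -> str:
    """Python reading of the nested IF formula on the slide."""
    # IF(NOT(OR(K3:M3)), "Reject", ...)
    if not (k3 or l3 or m3):
        return "Reject"
    # IF(OR(AND(K3, NOT(OR(L3:M3))), AND(OR(L3:M3), NOT(K3))), ...)
    if (k3 and not (l3 or m3)) or ((l3 or m3) and not k3):
        return "Further Evaluate"
    return "Accept"


# Every combination of the three test results, for inspection.
truth_table = {
    (k3, l3, m3): classify(k3, l3, m3)
    for k3 in (False, True)
    for l3 in (False, True)
    for m3 in (False, True)
}
```

Under this reading, "Accept" requires K3 to be true together with at least one of L3 or M3.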

24
Some challenges
  • Understanding the problem is paramount
  • Rubbish in, Rubbish out
  • Domain Knowledge is the biggest challenge
  • This method may be subject to new Human Factor
    related problems
  • Base Error Rate (BER)?

25
Thank you
  • Any questions?
  • sthorne_at_uwic.ac.uk