Database Systems - PowerPoint PPT Presentation

1 / 12
About This Presentation
Title:

Database Systems

Description:

Michael Lang, National University of Ireland, Galway. 1. Data Quality & Spreadsheet Errors ... history - with the possible exceptions of handguns and tequila. ... – PowerPoint PPT presentation

Number of Views:43
Avg rating:3.0/5.0
Slides: 13
Provided by: michae570
Category:

less

Transcript and Presenter's Notes

Title: Database Systems


1
Data Quality Spreadsheet Errors
2
Data Quality
  • "A computer lets you make more mistakes faster
    than any invention in human history - with the
    possible exceptions of handguns and tequila."
  • - Mitch Ratliffe

3
Beware the Weakest Link
  • In assuring data integrity / information quality,
    a database is only as reliable as its weakest
    link
  • Microsoft Excel spreadsheets are often used to
    gather data from employees, for import into
    enterprise databases
  • What happens when you give power to users who are
    not trained in error-detection the ability to
    introduce errors !!!
  • Legislation and regulations
  • Ireland Data Protection Act 1988-2003
    (personal data)
  • UK Customs Excise can enforce compliance
    (SpACE - Spreadsheet Auditing)
  • US FDA 21 CFR Part 11 (see www.labcompliance.com
    )
  • US Sarbanes-Oxley Act 2002 (financial
    reporting)
  • US Data Quality Act 2001 (federal agencies)
  • US Health Insurance Portability and
    Accountability Act 1996

4
Beware the Weakest Link
  • Spreadsheets are often used for what-if
    analysis in decision support systems (DSS)
  • They may retrieve input from a number of
    databases and run them through a model e.g.
    economic forecasting, production planning, etc.
  • but the weakest link might be the spreadsheet
    because a single error in a spreadsheet will
    subvert all the controls in all the systems
    feeding into it !!!

5
Spreadsheet Errors
  • In December 1994, an accountant at Fidelity
    Magellan Fund, while transferring data from one
    spreadsheet to another, omitted the minus sign on
    a net loss of US 1.3 billion and incorrectly
    treated it as a gain on the new spreadsheet.
    Nobody noticed the mistake until it was too late
    to warn shareholders. Instead of the 4.32
    dividend promised by the company, there was no
    dividend.
  • Ross, J. A. (1996) Spreadsheet Risk, Harvard
    Business Review, 74(5) Sep/Oct pp. 10-12
  • In May 2003, TransAlta Corporation (an
    electricity company) lost Canada 24m through a
    clerical error in pasting into a Microsoft
    Excel spreadsheet. The spreadsheet was used to
    submit bids for contracts. The rules governing
    tenders specified that, once submitted, bids
    couldn't be retracted. TransAlta didn't spot the
    error until they were notified that their bids
    were successful.

6
Spreadsheet Errors
  • In October 2003, about 2 weeks after releasing
    their third quarter earnings figures, Fannie Mae
    (a mortgage funds company) had to re-state their
    unrealised gains by US 1.2 billion. They
    explained this was because of "honest mistakes
    made in a spreadsheet used in the implementation
    of a new accounting standard.
  • A single wrong figure on a spreadsheet forced
    Credit Suisse to markdown its profits by SFR 200m
    (86m). The error came in the German subsidiary
    of the banks Winterthur arm fourth-quarter
    income was lowered 16.7 to SFR 1 billion
  • London Evening Standard, March 26, 2004
  • A Fortune 500 firm used discounted cash flows in
    a spreadsheet for evaluating investment
    proposals. The discount rate and discounting
    formula had been set up many years ago by an
    employee who had since left. They were not
    documented. Although the prime rate rose from 8
    to over 20 between 1973 and 1981, the
    spreadsheet was kept at 8.
  • Dhebar, A. (1993) Managing the Quality of
    Quantitative Analysis, Sloan Management Review,
    34 (2), pp. 69-75

7
Spreadsheets are Error-Prone!
  • Studies by Ray Panko reveal that, on average, 30
    of spreadsheets contain errors, including
    incorrect data and incorrect formulae
  • Research by Coopers Lybrand in spreadsheets
    with over 150 rows, over 90 contained at least
    one significant mistake in a formula
  • Study by the UK Customs and Excise only 11 of
    audited spreadsheets contained errors, but those
    that did were at variance by amounts ranging from
    hundreds to millions of pounds sterling

8
Categorisation of Errors Impact
  • Critical
  • Material error could compromise a government, a
    regulator, a financial market, or other
    significant public entity and cause a breach of
    the law and/or fiduciary duty. May place those
    responsible at significant risk of criminal
    and/or civil legal proceedings and/or
    disciplinary action.
  • Key
  • Material error could cause significant business
    impact in terms of incorrectly stated assets,
    liabilities, costs, revenues, profits or taxation
    etc. May place those responsible at risk of
    adverse publicity and at risk of civil
    proceedings for negligence or breach of duty
    and/or internal disciplinary action,
  • Important
  • Material error could cause significant impact on
    the individual in terms of job performance and
    career progression without directly, greatly,
    immediately or irreversibly affecting business or
    the organisation.
  • Store Retrieve
  • Spreadsheets used as databases, with few issues
    other than data correctness and information
    security and where the impact of error is low.
  • Expired
  • Spreadsheets over three years old no longer
    required in the active management of the
    business, but may be required to be archived by
    statute or good practice. Present impact of error
    is low.
  • Personal
  • Other spreadsheets used by the individual in the
    day-to-day performance of their duties, where the
    impact of material error is low.

9
Common Causes of Errors
  • Alphanumeric cell names (e.g. A1, C23) are not
    easy to remember. Named variables (e.g.
    student_id, avg_mark), as in programming
    languages, would be better.
  • Spreadsheets do not check for cells that are used
    but never set, or set but never used.
  • Spreadsheets do not check for data type errors
    such as multiplying money by money, or adding
    money to interest rates.
  • Unlike most programming languages, which have
    comment statements, there is nowhere obvious to
    put large amounts of program documentation
  • To reuse code, you need to cut and paste to move
    it from one sheet to another. This is tedious and
    error-prone e.g. you may forget to change
    absolute cell references.
  • The structure of large models is not obvious.
    Some models are very big e.g. Personal Computer
    World (May 1995) cites an oil business model with
    80 linked worksheets totalling 120 MB.
    Spreadsheets are easy to start with. They tempt
    users to build ever bigger and bigger models, but
    don't provide the structuring tools these models
    need.

10
Common Causes of Errors
  • Lack of input control
  • The data validation features in Microsoft Excel
    are basic. In reality, few users use them. Many
    spreadsheets contain very simple errors e.g.
    cells that look like dates and numbers but are
    actually text strings.
  • Over-complexity
  • Few spreadsheet-based systems are planned.
    They often start as a single worksheet and just
    grow. Acceptance that a spreadsheet is
    unmanageable might not come until the user tries
    to add yet another layer of complexity. In many
    cases there is already an accumulation of
    unnoticed mistakes and by this stage, the
    spreadsheet may well have become a business
    critical application.
  • Linked workbooks
  • A proliferation of workbook links are usually a
    sign that a spreadsheet is already too complex.
    Keep it simple! If possible, use the data query
    feature to share data between worksheets /
    spreadsheets.
  • Repetition
  • If you need to repeatedly cut, copy and paste
    data it's only a matter of time before you make a
    mistake. Automate regular tasks with Excel VBA
    or macros. Better still, store your data in a
    database and use the data query feature to import
    the required data.
  • Re-keying
  • It is a statement of the obvious that re-keying
    data is an opportunity for error. Excel can link
    automatically to databases, web sources and
    enterprise data in your accounting, payroll and
    HR systems. If you must re-key, be sure to use
    check totals.

11
Error Testing Validation
  • Useful features in Microsoft Excel include
  • Data Validation
  • Tools Error Checking
  • Tools Formula Auditing
  • Tools Options Error Checking
  • Tools Protection
  • Format Cells Protection
  • Format Conditional Formatting
  • Edit Go To Special
  • CTRL (to show all formulae)

12
Further Reading Links
  • Spreadsheet Research (SSR) Website Ray Panko
  • http//panko.cba.hawaii.edu/SSR/
  • European Spreadsheet Risks Interest Group
  • http//www.eusprig.org
  • Spreadsheet Engineering Ltd. (Grenville Croll)
  • http//www.spreadsheetrisks.com
  • Louise Pryor
  • http//www.louisepryor.com/show.do?pagearticles
  • O'Beirne (2005) Spreadsheet Check and Control
    Detect and Prevent Errors, Systems Publishing,
    ISBN 190540400X
  • http//www.sysmod.com/sslinks.htm
  • http//www.spreadsheetauditing.com
  • http//www.spreadsheetdetective.com
  • http//www.spreadsheetinnovations.com
Write a Comment
User Comments (0)
About PowerShow.com