Chapter 8: CAATTs for Data Extraction and Analysis - PowerPoint PPT Presentation

View by Category
About This Presentation
Title:

Chapter 8: CAATTs for Data Extraction and Analysis

Description:

GAS can be used to audit the data currently being stored in most file structures and formats ... ACL is a proprietary version of GAS. Leader in the industry ... – PowerPoint PPT presentation

Number of Views:1722
Avg rating:3.0/5.0
Date added: 30 May 2020
Slides: 30
Provided by: tommiesi
Learn more at: http://www.swlearning.com
Category:

less

Write a Comment
User Comments (0)
Transcript and Presenter's Notes

Title: Chapter 8: CAATTs for Data Extraction and Analysis


1
Chapter 8CAATTs for Data Extraction and Analysis
  • IT Auditing Assurance, 2e, Hall Singleton

2
DATA STRUCTURES
  • Organization
  • Access method

3
AccessNon-IndexMethods
HashingPointers
INDEX File
DATA File
AccessIndex Methods
Data Organization
SEQUENTIAL ISAM RANDOM
SEQUENTIAL RANDOM
4
FILE PROCESSING OPERATIONS
  1. Retrieve a record by key
  2. Insert a record
  3. Update a record
  4. Read a file
  5. Find next record
  6. Scan a file
  7. Delete a record

Individual Records
Table 8-1
5
DATA STRUCTURES
  • Flat file structures
  • Sequential structure Figure 8-1
  • All records in contiguous storage spaces in
    specified sequence (key field)
  • Sequential files are simple easy to process
  • Application reads from beginning in sequence
  • If only small portion of file being processed,
    inefficient method
  • Does not permit accessing a record directly
  • Efficient 4, 5 sometimes 3
  • Inefficient 1, 2, 6, 7 usually 3

6
DATA STRUCTURES
  • Flat file structures
  • Indexed structure
  • In addition to data file, separate index file
  • Contains physical address in data file of each
    indexed record

7
DATA STRUCTURES
  • Flat file structures
  • Indexed random file Figure 8-2
  • Records are created without regard to physical
    proximity to other related records
  • Physical organization of index file itself may be
    sequential or random
  • Random indexes are easier to maintain, sequential
    more difficult
  • Advantage over sequential rapid searches
  • Other advantages processing individual records,
    efficient usage of disk storage
  • Efficient 1, 2, 3, 7
  • Inefficient 4

8
DATA STRUCTURES
  • Flat file structures
  • Indexed Sequential Access Method (ISAM) Figure
    8-3
  • Large files, routine batch processing
  • Moderate degree of individual record processing
  • Used for files across cylinders
  • Uses number of indexes, with summarized content
  • Access time for single record is slower than
    Indexed Sequential or Indexed Random
  • Disadvantage does not perform record insertions
    efficiently requires physical relocation of all
    records beyond that point SOS
  • Has 3 physical components indexes, prime data
    storage area, overflow area Figure 8-4
  • Might have to search index, prime data area, and
    overflow area slowing down access time
  • Integrating overflow records into prime data
    area, then reconstructing indexes reorganizes
    ISAM files
  • Very Efficient 4, 5, 6
  • Moderately Efficient 1, 3
  • Inefficient 2, 7

9
Random
DBMS etc.
Legacy systems
ISAM
Legacy systems
Sequential
1980
1990
1960
1970
EVOLUTION OF ORG./ACCESS METHODS
10
Efficient
ISAM
Random
Sequential
Inefficient
Access entire files
Access single records
11
HASHING STRUCTURE
  • Employs algorithm to convert primary key into
    physical record storage address Figure 8-5
  • No separate index necessary
  • Advantage access speed
  • Disadvantage
  • Inefficient use of storage
  • Different keys may create same address
  • Efficient 1, 2, 3, 6
  • Inefficient 4, 5, 7

12
POINTER STRUCTURE
  • Stores the address (pointer) of related record in
    a field with each data record Figure 8-6
  • Records stored randomly
  • Pointers provide connections b/w records
  • Pointers may also provide links of records b/w
    files Figure 8-7
  • Types of pointers Figure 8-8
  • Physical address actual disk storage location
  • Advantage Access speed
  • Disadvantage if related record moves, pointer
    must be changed w/o logical reference, a
    pointer could be lost causing referenced record
    to be lost
  • Relative address relative position in the file
    (135th)
  • Must be manipulated to convert to physical
    address
  • Logical address primary key of related record
  • Key value is converted by hashing to physical
    address
  • Efficient 1, 2, 3, 6
  • Inefficient 4, 5, 7

13
DATABASE STRUCTURES
  • Hierarchical network structures Figure 8-9
  • Uses explicit linkages b/w records to establish
    relationship
  • Figure 8-9 is MN example
  • Relational structure
  • Uses implicit linkages b/w records to establish
    relationship foreign keys / primary keys

14
Relational Database table rows and columns










15
Relational Records Foreign Keys in one record
establishes relationships to related records in
other files.
CUSTOMERS
INVOICES
INVENTORY
16
DATABASE STRUCTURES
  • Relational structure
  • User views
  • Data a particular user needs to achieve his/her
    assigned tasks
  • A single view, or view without user input, leads
    to problems in meeting the diverse needs of the
    enterprise
  • Trend today capture data in sufficient detail
    and diversity to sustain multiple user views
  • User views MUST be consolidated into a single
    logical view or schema
  • Data in the logical view MUST be normalized

17
DATABASE STRUCTURES
  • Relational structure
  • Creating views
  • Designing output reports, documents, and input
    screens needed by users or groups
  • Physical documents help designer understand
    relationships among the data
  • 3 user views Table 8-2, Figure 8-12, Table 8-3
  • Then apply normalization principles to the
    conceptual user views to design the database
    tables

18
DATABASE STRUCTURES
  • Relational structure
  • Importance of data normalization
  • Critical to success of DBMS
  • Effective design in grouping data
  • Several levels 1NF, 2NF, 3NF, etc.
  • Un-normalized data suffers from
  • Insertion anomalies
  • Deletion anomalies
  • Update anomalies
  • One or more of these anomalies will exist in
    tables lt 3NF

19
DATABASE STRUCTURES
  • Relational structure
  • Normalization process
  • Un-normalized data Table 8-4
  • Eliminates the 3 anomalies if
  • All non-key attributes are dependent on the
    primary key
  • There are no partial dependencies (on part of the
    primary key)
  • There are no transitive dependencies non-key
    attributes are not dependent on other non-key
    attributes
  • Split tables are linked via embedded foreign
    keys
  • Normalized database tables examples Figures
    8-13, 8-14

20
DATABASE STRUCTURES
  • Relational structure
  • Creating physical tables
  • Created on paper so far
  • Then create physical files and populate data
  • Physical views can be produced from DBMS
  • Query function
  • Allows users to create customized lists from
    database
  • Users stipulate, using English-like commands,
    which tables, records, fields, filtering criteria
    needed to produce the desired list
  • Result is virtual table derived from actual
    database tables
  • SQL
  • SELECT, FROM, WHERE Figure 8-16
  • De facto standard query language

21
DATABASE STRUCTURES
  • Relational structure
  • Auditors and data normalization
  • Database normalization is a technical matter that
    is usually the responsibility of systems
    professionals.
  • The subject has implications for internal control
    that make it the concern of auditors also.
  • Most auditors will never be responsible for
    normalizing an organizations databases they
    should have an understanding of the process and
    be able to determine whether a table is properly
    normalized.
  • In order to extract data from tables to perform
    audit procedures, the auditor first needs to know
    how the data are structured.

22
EMBEDDED AUDIT MODULE
  • Identify important transactions live while they
    are being processed and extract them Figure
    8-18
  • Examples
  • Errors
  • Fraud
  • Compliance
  • SAS 78, SAS 94, SAS 99 / S-OX

23
EMBEDDED AUDIT MODULE
  • Disadvantages
  • Operational efficiency can decrease
    performance, especially if testing is extensive
  • Verifying EAM integrity - such as environments
    with a high level of program maintenance
  • Status increasing need, demand, and usage of
    COA/EAM/CA

24
GENERALIZED AUDIT SOFTWARE
  • Brief history
  • Most widely used CAATT Figure 8-19
  • Usages include
  • Footing and balancing entire files or selected
    data items (e.g., extending inventory)
  • Selecting and reporting detail data
  • Selecting stratified statistical samples from
    data files
  • Formatting results into audit reports (auto work
    papers!)
  • Printing confirmations
  • Screening / filtering data
  • Comparing multiple files for differences
  • Recalculating values in data

25
GENERALIZED AUDIT SOFTWARE
  • Popular because
  • GAS software is easy to use and requires little
    computer background
  • Many products are platform independent, works on
    mainframes and PCs
  • Auditors can perform tests independently of IT
    staff
  • GAS can be used to audit the data currently being
    stored in most file structures and formats

26
GENERALIZED AUDIT SOFTWARE
  • Simple structures Figure 8-19
  • Complex structures Figures 8-20, 8-21
  • Auditing issues
  • Auditor must sometime rely on IT personnel to
    produce files/data
  • Risk that data integrity is compromised by
    extraction procedures
  • Auditors skilled in programming better prepared
    to avoid these pitfalls

27
ACL
  • ACL is a proprietary version of GAS
  • Leader in the industry
  • Designed as an auditor-friendly meta-language
    (i.e., contains commonly used auditor tests)
  • Access to data generally easy with ODBC interface

28
ACL
  • See ACL tutorial 1
  • Input file definition
  • Customizing a view Figure 8-23
  • Filtering data Figures 8-24 thru 8-27
  • Stratifying data Figure 8-28
  • Statistical analysis

29
Chapter 8CAATTs for Data Extraction and
Analysis
  • IT Auditing Assurance, 2e, Hall Singleton
About PowerShow.com