SURVEY SOFTWARE DESIGN - PowerPoint PPT Presentation

About This Presentation
Title:

SURVEY SOFTWARE DESIGN

Description:

... are an asset (singers, actors, public speakers, disc jockeys, etc. ... Goal: Remove as much 'dead air' from each file as possible -- particularly for 'fills' ... – PowerPoint PPT presentation

Slides: 130
Provided by: DaveCo78
Category:
Tags: design | software | survey | dead | for | speaker | the

Transcript and Presenter's Notes

Title: SURVEY SOFTWARE DESIGN


1
SURVEY SOFTWARE DESIGN
  • Jacob Bournazian
  • May 2000
  • Bangladesh Ministry of Energy and Mineral
    Resources

2
Typical Survey System
(Diagram: Respondent Data; Name, address Code; Data Entry; Data Files; Est., Var., Supp., Code; Desk Top Code; Publications; Edit, Impute, Code; Energy Data)
3
Criteria and Goals for a Survey Processing System
  • What are the goals for the system?
  • Design Quality
  • Data Integrity, Accessibility, Timeliness
  • What criteria will be applied?
  • E.g. edit criteria
  • What are the costs to develop and maintain the
    system?

4
Developing Automated Collection Systems
  • CASIC (Computer Assisted Survey Information
    Collection) is an all inclusive term for several
    automated data collection systems
  • The technology and supporting software have been
    developing and expanding rapidly for the past 20
    years
  • Grew out of the need to handle large surveys,
    reduce the cost of data collection, speed up
    data collection, editing, and processing, and
    improve response rates in mail surveys.

5
Types of (CASIC) Computer-Assisted Survey
Information Collection Systems
  • CATI (Computer-Assisted Telephone Interviewing)
  • Interviewers generally cluster at one or more
    central locations and contact respondents by
    telephone. The interviewer reads questions
    displayed by a computer and enters the answers
    into the computer.
  • E.g., retail price surveys for gasoline and
    diesel fuels, used to measure the price per gallon
    paid by the average U.S. consumer

6
Types of CASIC systems continued
  • CAPI (Computer-Assisted Personal Interviewing)
  • Interviewers go to the respondent's home or
    office with a laptop PC, read the questions
    from the computer, and record the answers into it.
  • E.g., Consumer Price Index survey used to measure
    the rate of inflation in the U.S.

7
Types of CASIC systems continued
  • CASI (Computer-Assisted Self Interviewing)
  • Techniques include
  • PDE (Prepared Data Entry): Respondents use a PC or
    terminal themselves to fill out the survey
    questionnaire interactively
  • TDE (Touchtone Data Entry): Respondents answer
    computer-generated questions by pressing buttons
    on a telephone
  • VRE (Voice Recognition Entry): Respondents answer
    questions by speaking directly into a telephone

8
EIA's CASIC-related Efforts
  • End Use Consumption Surveys
  • PC Electronic Data Reporting Option
  • Electronic Filing Pilots
  • Weekly CATI Surveys
  • CATI Frame Update Test

9
End Use Surveys
  • Surveys at end users--households and commercial
    buildings
  • Between 5,000 and 6,000 respondents
  • Detailed Questionnaires
  • 30-45 minute interviews
  • Changed mode from CAPI to CATI for Commercial
    Buildings Energy Consumption Survey in 1999

10
PC Electronic Data Reporting Option (PEDRO)
  • Windows-based Application
  • 11 monthly surveys
  • 5 weekly surveys
  • Runs on Respondent's Machine
  • Includes Edit Capability
  • Data Transmitted Directly to EIA via Modem or over
    the Internet
  • Pretty Good Privacy (PGP) software for encryption

11
Electric Power Survey Electronic Filing Pilots
  • Using 3 Monthly Electric Power Surveys
  • Companies Send Encrypted Data via Internet
    Directly to Servers Outside of EIA's Firewall
  • Each company receives a password
  • Read Information From Previous Submissions, Bring
    in Blank Forms, and File Current Report
  • Also Testing Filing Data via E-mail

12
Weekly CATI Surveys
  • Retail Gasoline Prices (approx. 1000 outlet
    prices)
  • Retail Diesel Fuel Prices (350 outlet prices)
  • Data Collected on Monday morning
  • Publish retail price on Monday afternoon
  • regional and US pricing on a weekly basis
    expanding coverage to 5 cities, 5 states,

13
Petroleum Product Sales Survey Frame
Construction
  • Quadrennial Survey of Approx. 22,000 Companies
  • Tested CATI versus Mail Updates, using Split
    Sample
  • Conducted survey in 1999

14
Petroleum Product Sales Frame Test Results
  • CATI
  • Had Higher Response Rates
  • Reports Were Filed Sooner
  • Resulted in Cleaner Data
  • Cost Less

15
Background of the EIA-863
  • 22,000 companies
  • mail survey, every 3 years
  • starting with 1998 reference: major budget
    reduction, survey moved to every 4 years, other steps
    necessary to reduce cost
  • reduce cost by 1) less manual review in list
    construction, 2) fewer edit failures, 3) higher
    initial response rates

16
Edit Profile
  • 40-50% of companies fail 1) the control data quality
    (CDQ) edit or 2) the volume data quality (VDQ) edit
  • High rate of edit failures, and high initial
    nonresponse result in extensive phone follow-up
  • Potential area of cost savings

17
(No Transcript)
18
Response Rates
  Group          CATI (%)   MAIL (%)   P-value
  TOTAL           91.20      87.72      0.056
  One State       89.33      86.33      0.262
  Two States      93.33      88.67      0.159
  Three States    93.22      90.00      0.372
  Zero Items      73.75      66.25      0.304
  One Item        96.47      87.21      0.027
  Two Items       91.82      91.82      1.000
  Three Items     95.83      93.75      0.519
  Four Items      93.81      92.86      0.513
  Five or more    93.00      90.00      0.449

19
(No Transcript)
20
Average Costs as Response Originally Designated
                   MAIL    CATI    DIFFERENCE
  Pilot Survey     6.20    5.67    0.53
  Full Survey      5.56    5.20    0.36
  • p value of .0002
  • Break-even point to recover programming
    costs --> 14,000 respondents
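The break-even figure is a fixed cost divided by a per-response saving. The slide does not state the programming cost, so the sketch below backs out the amount implied by the 14,000-respondent break-even and the full-survey difference of 0.36 per response. The function name and the use of integer hundredths are choices made for this illustration only.

```python
def break_even_responses(fixed_cost_hundredths: int, saving_hundredths: int) -> int:
    """Smallest number of responses whose cumulative saving covers the fixed cost.

    Amounts are integers in hundredths of the slide's cost unit, which keeps
    the arithmetic exact.
    """
    if saving_hundredths <= 0:
        raise ValueError("saving per response must be positive")
    # Ceiling division without floating point.
    return -(-fixed_cost_hundredths // saving_hundredths)


# Per-response costs from the slide, in hundredths.
FULL_SAVING = 556 - 520    # MAIL 5.56 minus CATI 5.20 -> 0.36
PILOT_SAVING = 620 - 567   # MAIL 6.20 minus CATI 5.67 -> 0.53

# ASSUMPTION: the programming cost is not given in the slides; this is the
# value implied by a 14,000-respondent break-even at 0.36 per response.
IMPLIED_PROGRAMMING_COST = 14_000 * FULL_SAVING

print(break_even_responses(IMPLIED_PROGRAMMING_COST, FULL_SAVING))
print(break_even_responses(IMPLIED_PROGRAMMING_COST, PILOT_SAVING))
```

Under the same implied cost, the larger pilot-survey saving would reach break-even with fewer respondents, which is why the smaller full-survey saving matters.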

21
Summary
  • CATI-designated respondents had higher response rates,
    reported sooner, and had cleaner data
  • 31% of CATI-designated respondents are likely to choose to
    report by mail
  • High proportion on the full survey of one-state
    and/or two-item responses (lowest cost savings) and
    high percentage reporting by mail --> savings of
    only .36/response --> break-even of 14,000

22
The Blaise Family of Software: http://www.westat.com/blaise
23
Computer-Assisted Survey Execution System
(CASES): http://socrates.berkeley.edu:7500/casesforcal2.html
24
SAWTOOTH'S Ci2CATI for Windows: http://www.sawtooth.com
25
SNAP survey software: http://www.mercator.co.uk/products.htm
26
The Survey System: http://www.surveysystem.com
27
Surveycraft: http://www.surveycraft.com/Products.html
28
Ronin's Results for Research: http://www.ronin.com/rforrprod.htm
29
TOUCHTONE DATA ENTRY (TDE)
  • Current Employment Statistics Program
  • Bureau of Labor Statistics, US Dept of Labor
  • Monthly survey of employment, payroll, and hours
  • Sample of 390,000 business establishments
  • Publish data two and a half weeks from collection
  • Provides key economic indicators
  • Employment by industry, state, and area
  • Average hourly earnings
  • Average weekly hours

30
Automated Collection Methods
  • CES offers a variety of automated reporting
    methods
  • New automated collection mode: Year
  • Computer Assisted Telephone Interview: 1984
  • Touchtone Data Entry (TDE): 1986
  • Voice Recognition: 1988
  • Electronic Data Interchange (EDI): 1994
  • FAX: 1995
  • World Wide Web: 1996

31
Distribution of CES Sample by Reporting Method
32
Data Collection CATI to TDE
  • Most respondents report via Computer Assisted
    Telephone Interviewing (CATI) for five months
  • CATI Interviewers use this opportunity to
  • Solidify CES reporting relationship
  • Ensure data quality
  • Serve as a resource for information
  • Prepare respondent for TDE self-reporting
  • After five months, respondents are requested to
    begin reporting by TDE, and are sent an
    introductory package in the mail
  • If the respondent does not wish to use TDE, they
    are offered alternate reporting methods

33
TDE Methodology
  • Respondents receive a monthly advance reminder
    FAX or postcard, according to prompt code
  • Respondents call a toll-free number and enter
    data using the keypad of their telephone
  • As the monthly deadline approaches, delinquent
    respondents receive a nonresponse prompt (NRP) by
    FAX or telephone, according to prompt code
  • A second, "Last Chance" NRP message may be sent
    on the morning of the deadline
  • Questions/concerns are addressed by a Help Desk
    staff
  • Data are extracted daily and uploaded to a main
    server
  • Registry information updates are sent to the
    States weekly

34
One-Point TDE Features
  • TDE collection for 40 states in one office
  • 120 phone lines for data collection
  • 48 lines for outbound FAXes
  • Postcard generation for prompts
  • NRP telephone prompts made at calling centers
  • Back-up site with 48 phone lines and 24 FAX lines
  • Help Desk

35
Example of Advance Notice FAX
36
One Point TDE Help Desk Staff
  • On duty 7:00 AM to 8:00 PM EST to answer questions
  • Adept at answering questions and dealing with
    reluctance
  • Use a specially designed Help Desk system that
    aids in customer service
  • Help Desk staff have the ability to
  • Collect data
  • Update registry information
  • Print and mail respondent packages
  • FAX respondent packages

37
Special Uses of TDE System
  • In addition to data collection, the TDE system
    may be modified to contain
  • Contact update questions
  • FAX availability questions
  • FAX prompts may be customized by industry,
    month, and establishment
  • Tips on touchtone reporting
  • Rounding instructions
  • Past due reports needed
  • Special messages

38
Data Collection Cost Components
39
Summary
  • A viable collection mode for short, numeric
    surveys
  • Convenience of self-response
  • Maximizes response within cost limitations
  • Requires ongoing nonresponse follow-up to
    maintain high response rates

40
Audio Computer Assisted Survey Interviewing
  • Automating data collection on Self administered
    surveys

41
Benefits of self administered surveys
  • Enhanced privacy for the respondent
  • Reduction in interviewer effects
  • Greater control given to the respondent
  • In Health surveys, there is a higher reporting of
    sensitive behaviors including
  • smoking, drinking, drug use
  • sexual activities
  • abortion

42
Benefits of Computerized self administered survey
interviewing (CASI)
  • Routing controlled by the computer
  • More complex questionnaires possible
  • Questions cannot be skipped inadvertently
  • Out-of-range responses are eliminated
  • Customized wordings possible
  • Questions presented in the same order to all
    respondents
  • Visual aids incorporated directly into the
    instrument
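Computer-controlled routing and range checking, as listed above, can be sketched as follows. The three questions, their ranges, and the names `Question`, `QUESTIONS`, and `run_interview` are hypothetical and are not taken from any of the survey packages named in these slides.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Question:
    text: str
    low: int    # smallest valid answer
    high: int   # largest valid answer
    # Given the answer, return the id of the next question (None ends the interview).
    next_id: Callable[[int], Optional[str]]


# Hypothetical instrument: Q2 is asked only when Q1 is answered "Yes".
QUESTIONS: Dict[str, Question] = {
    "Q1": Question("Have you ever smoked? Yes, press 1. No, press 2.", 1, 2,
                   lambda a: "Q2" if a == 1 else "Q3"),
    "Q2": Question("How old were you when you first smoked?", 5, 99,
                   lambda a: "Q3"),
    "Q3": Question("How many people live in your household?", 1, 20,
                   lambda a: None),
}


def run_interview(answers: Dict[str, int]) -> Dict[str, int]:
    """Follow the routing from Q1 and record only the questions actually asked."""
    recorded: Dict[str, int] = {}
    qid: Optional[str] = "Q1"
    while qid is not None:
        question = QUESTIONS[qid]
        answer = answers[qid]
        # Out-of-range answers are rejected; a real instrument would re-ask.
        if not question.low <= answer <= question.high:
            raise ValueError(f"{qid}: {answer} is outside {question.low}-{question.high}")
        recorded[qid] = answer
        qid = question.next_id(answer)
    return recorded


print(run_interview({"Q1": 2, "Q2": 15, "Q3": 4}))
```

Because the routing function decides the next question, a respondent who answers "No" to Q1 never sees Q2, and no question on the path can be skipped inadvertently.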

43
Audio Computer-Assisted Self-Interviewing (ACASI)
  • Above and beyond CASI, ACASI allows for
  • no requirement of literacy
  • fully standardized question presentation
  • fully private administration

44
Issues for Audio Computer Assisted Surveys
  • What to record
  • Length of the ACASI survey
  • Implications for question wording
  • Choice of a voice
  • Preparing the audio files
  • Testing the audio
  • Translations
  • Loading the laptops

45
What to Record...
  • Some instruments have audio for the questions but
    not for the response categories
  • Remember why you're including the audio component
  • If response categories are read, include
    instruction for how to enter an answer
  • Yes, press 1
  • No, press 2

46
How Long is Too Long?
  • Consider the length for a variety of reading
    skills
  • With respondents in control of the interview,
    there is less an interviewer can do to either
    slow a respondent down or speed him/her up
  • Some evidence that respondents listen to less of
    the audio the further they get into the
    questionnaire
  • To date, no evidence that there is an upper limit
    on the length of an ACASI instrument
  • Interviewer reactions somewhat mixed

47
Question Construction
  • Consider impact of fills on the audio component
    of ACASI
  • The computer recorded that you were [AGE 1st USE]
    the first time you used marijuana. Earlier, you
    told the interviewer that your date of birth was
    [MONTH DAY YEAR]. That would make you [AGE],
    which is [YEARS] younger than the first time
    you used marijuana. That is not possible. Which
    answer is correct?
  • I was [AGE 1st USE] the first time I used
    marijuana
  • I am [AGE] years old
  • Neither answer is correct
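The consistency check behind this question can be sketched as below. It returns the values that would be substituted into the fills (AGE 1st USE, AGE, YEARS) when the reported age at first use exceeds the age computed from the date of birth. The function names and the dictionary keys are illustrative assumptions, not part of any ACASI package.

```python
from datetime import date
from typing import Dict, Optional


def age_on(birth: date, today: date) -> int:
    """Completed years of age on a given day."""
    years = today.year - birth.year
    if (today.month, today.day) < (birth.month, birth.day):
        years -= 1  # birthday has not occurred yet this year
    return years


def check_first_use(age_first_use: int, birth: date, today: date) -> Optional[Dict[str, int]]:
    """Return None when the answers agree, otherwise the fill values for the follow-up."""
    current_age = age_on(birth, today)
    if age_first_use <= current_age:
        return None
    return {
        "AGE 1st USE": age_first_use,
        "AGE": current_age,
        "YEARS": age_first_use - current_age,
    }


print(check_first_use(25, date(1980, 6, 15), date(2000, 5, 1)))
```

Each fill value needs its own audio clip spliced into the recorded question, which is the cost that fills add to the audio component.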

48
Question Construction (cont'd)
  • Revised
  • The answer to the last question and an earlier
    question disagree. Which answer is correct?
  • I was AGE 1st USE the first time I used
    marijuana
  • I am AGE years old
  • Neither answer is correct
  • Open-ended responses cannot be used as fills
  • Consider the impact of tailoring questions on the
    audio recording.
  • One question per screen

49
Choosing a Voice for ACASI
  • Is the voice important, and if so what type of
    voice is best?
  • Turner, et al. found no difference in data when
    male or female voice was used
  • Pleasant is good, but promoting honest and
    accurate reporting is better!
  • Laboratory testing at RTI indicates respondents
    do attribute varying degrees of integrity to a
    voice and have preferences for particular voices

50
Practical Considerations in Choosing a Voice
  • Consider the topic of the study
  • For longitudinal or ongoing surveys, determine
    long term availability
  • Don't select someone who has other competing
    priorities on the project
  • Familiarity with interviewing is an asset
  • Professional voices are an asset (singers,
    actors, public speakers, disc jockeys, etc.)

51
Preparing the Audio Files
  • Allow sufficient time, but don't start until
    you've finalized the question wording!
  • Determine the optimal recording level
  • Trade-off between quality and size of the files
  • Assign unique file name to each audio file
  • Prepare scripts for recording
  • Options for recording audio
  • Digitizing the audio files
  • Goal: Remove as much 'dead air' from each file
    as possible -- particularly for 'fills'

52
Testing the Audio
  • On-the-spot testing
  • Goal: Identify mistakes in wording or
    pronunciation at the time of recording
  • Database testing
  • Goal: Identify mistakes in wording or
    pronunciation as well as audio files that are
    missing
  • Integrated file testing
  • Goal: Verify audio and screen text match;
    identify unusual intonation and long pauses
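The "missing audio files" part of database testing can be sketched as a comparison between the question list and the recorded files. The naming rule used here (one `.wav` file per question id) is an assumption for the example; the slides only say that each audio file gets a unique name.

```python
from typing import Dict, Iterable, List


def find_audio_problems(question_ids: Iterable[str],
                        audio_files: Iterable[str]) -> Dict[str, List[str]]:
    """Compare expected audio file names with the files actually present."""
    # ASSUMPTION: each question id maps to exactly one file named <id>.wav.
    expected = {f"{qid}.wav" for qid in question_ids}
    present = set(audio_files)
    return {
        "missing": sorted(expected - present),   # question has no recording
        "orphaned": sorted(present - expected),  # recording matches no question
    }


report = find_audio_problems(
    ["Q1", "Q2", "Q3"],
    ["Q1.wav", "Q3.wav", "Q9.wav"],
)
print(report)
```

A check like this only finds missing or stray files. Wording and pronunciation mistakes still require listening to each recording.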

53
Translations
  • Re-visit space constraints
  • each translation will at least double the amount
    of space needed for the audio
  • subject/object agreement and degree of formality
  • Select voice
  • Don't start until the translation is finalized!
  • (Don't translate until the English is finalized)

54
Loading the Laptops
  • Allow sufficient time to load
  • Experience with conducting health surveys
  • 1 hour interview / ACASI portion runs 40 - 45
    minutes
  • English and Spanish, 2 CDs needed
  • Use of Head phones
  • Hygiene issues
  • Laptop speakers

55
In Summary
  • ACASI does improve reporting of sensitive data,
    but the devil is in the details
  • Allow sufficient time for
  • testing
  • translations
  • loading

56
(No Transcript)
57
COOL EDIT PRO: www.syntrillium.com
58
Automating a processing system
(Diagram: Respondent Data; Data Entry (label repeated seven times); Name, address Code; Data Files; Est., Var., Supp., Code; Desk Top Code; Publications; Edit, Impute, Code; Energy Data)
59
Building a generalized processing system to
process all surveys
60
Developing an Automated Collection and Processing
System
  • What are the goals for the system?
  • Design Quality
  • Data Integrity, Accessibility, Timeliness
  • What criteria will be applied?
  • E.g. edit criteria
  • What are the costs to develop and maintain the
    system?

61
Standard Economic Processing System (StEPS)
  • U.S. Bureau of Census
  • Developed to process over 100 economic surveys in
    the areas of retail, wholesale, service industries,
    manufacturing, and construction.
  • Written entirely in SAS and operates in a UNIX
    environment.

62
Objectives for StEPS
  • Reduce resources required for system maintenance
  • Standardize survey procedures used in data
    analysis
  • Provide greater staffing flexibility for analysts
    and programmers to process different surveys by
    providing a processing system common to all
    surveys.
  • Make all survey data available to all users, both
    survey analysts and survey managers

63
Objectives for StEPS - continued
  • Provide a common structure to make it easier to
    implement improvements for all surveys in the
    system.
  • Improve timeliness for new surveys by eliminating
    analyst retraining and the development of custom
    survey processing software.
  • Minimize the need for system maintenance

64
StEPS Generalized Design features
  • Design a set of standard data structures that
    remain the same, regardless of the survey and its
    data
  • Use parameters (stored in general data
    structures) to drive the survey-specific
    processing requirements.
  • Standardize field names and possible values for
    similar concepts (control data)
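The idea of parameters driving survey-specific processing can be sketched with one generic edit routine and a parameter table. StEPS itself is written in SAS; this Python version, its survey names, item names, and limits are all hypothetical and only illustrate the design principle.

```python
from typing import Dict, List, Tuple

# Hypothetical parameter table: the same structure for every survey.
EDIT_PARAMETERS = [
    {"survey": "RETAIL", "item": "SALES", "min": 0, "max": 5_000_000},
    {"survey": "RETAIL", "item": "EMPLOYEES", "min": 1, "max": 10_000},
    {"survey": "FUEL", "item": "GALLONS", "min": 0, "max": 250_000},
]


def run_range_edits(survey: str, records: List[Dict],
                    parameters: List[Dict] = EDIT_PARAMETERS) -> List[Tuple[str, str, float]]:
    """Apply every range edit defined for one survey; return (ID, item, value) failures."""
    # The code never names a specific item: the limits come from the table.
    limits = {p["item"]: (p["min"], p["max"])
              for p in parameters if p["survey"] == survey}
    failures = []
    for record in records:
        for item, (low, high) in limits.items():
            value = record.get(item)
            if value is not None and not low <= value <= high:
                failures.append((record["ID"], item, value))
    return failures


retail_records = [
    {"ID": "A1", "SALES": 1200, "EMPLOYEES": 0},
    {"ID": "A2", "SALES": -5, "EMPLOYEES": 12},
]
print(run_range_edits("RETAIL", retail_records))
```

Adding a survey or changing a limit then means editing rows in the parameter table rather than writing new processing code.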

65
StEPS Design features (cont)
  • Interactive SAS/AF screens permit users to
  • Specify parameters
  • Select processing options
  • Review and change data
  • Monitor batch processing
  • Configuration to individual surveys via
    specification files and processing scripts
  • Does not include Frames development, sample
    selection, actual data collection and
    dissemination

66
Microdata
Interactive StEPS
Job submission
Data dictionaries
Batch StEPS
Script Files (SAS Macros)
Listing requests
Macrodata
67
StEPS Modules
  • USER SETUP
  • Allows the user to select a survey and stat
    period to process, set up default printers, or
    change the font size of the screen display

68
StEPS Modules (continued)
  • SURVEY SPECIFICATIONS
  • Allows users to set parameters for various
    processing activities (i.e. edits, derived items,
    imputation, estimation, and outliers); define
    data dictionaries; define the survey-specific
    line displayed in the Data Review and Correction
    screens

69
StEPS Modules (continued)
  • COLLECTION ACTIVITIES
  • Allows users to perform activities associated
    with data collection, including creating the
    label files, logging in respondents, and submitting
    the reported data through batch jobs

70
StEPS Modules (continued)
  • REVIEW AND CORRECTION
  • Allows users to view and edit survey data
  • Review all item data for a specified ID
  • Review ID data for a specified item
  • Review historical data for selected items
  • View item totals for specific stat periods
  • View different versions of the data, including
    reported, edited, adjusted, weighted-adjusted
  • View current-to-prior ratios (between 2 stat
    periods) for a specified item

71
StEPS Modules (continued)
  • TOOLS
  • Provides users with the capability to
  • Analyze data files by accessing SAS tools
    (SAS/ASSIST, Insight, EIS)
  • Download data to the PC
  • Query survey data sets
  • Request various listings to review survey data

72
StEPS Modules (continued)
  • RUN
  • Allows users to run processes as defined in
    survey specifications module. Such processes
    include edits, estimation, imputation, and
    derived items.
  • VIEW RESULTS
  • Allows users to view the results from the run
    processes.

73
StEPS Modules (continued)
  • MIS
  • Provides management information reports,
    including response rates, imputation rates, and
    edit summaries.

74
Standard Data Structures for every survey
  • Use of dictionary data sets
  • Control-data dictionary - information about
    variables (numeric or character) describing
    processing options for individual reporting units
  • Item-data dictionary - information about numeric
    variables containing data from questionnaire or
    from other sources
  • e.g. annual textile sales, amount of fuel used,

75
Use standard Libnames
  • Libnames point to different physical locations or
    directories listed in a data set called
    CENTRAL.SURVEYS. Once a user selects a survey, a
    libname called SURVLIB is set up.
  • CENTRAL.SURVEYS data set
  • SURVEY   Char  Survey identifier
  • SURVNME  Char  Survey name
  • SURVDIR  Char  Directory of top-level survey
    info (libname SURVLIB)

76
Use of Standard Libnames cont.
  • Once a user selects a survey and SURVLIB is set up, a data
    set SURVLIB.VSTATPS is opened.
  • The user selects the statistical period of data to access
    from a list of stat periods available for that particular
    survey.
  • StEPS is able to treat any stat period as the base (or
    current) stat period and any stat period other than the
    base as it relates to the base.
77
SURVLIB.VSTATPS data set
  • SURVEY Char Survey Identifier
  • SURVNME Char Survey name
  • STATP Char Statistical Period
  • DATASDIR Char Stat period specific data
  • PARMSDIR Char Specific parameter
  • SPRGDIR Char Directory for survey- specific
    programs

78
Use of Standard Libnames cont.
  • Non-stat period related libnames DATALIB and
    PARMLIB are then set up. The base stat period is
    assigned the following libnames: DATA00 and
    PARM00.
  • This design sets up standard libnames that simply
    point to different physical locations based on
    what is stored in these data sets, regardless of
    what survey or stat period is used.
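The lookup described on these slides can be sketched as two small tables and a function that resolves the standard names. StEPS does this with SAS libnames; the Python structures, the survey id, the stat periods, and every directory path below are made up for illustration. DATALIB and PARMLIB are left out because the slides do not say where they point.

```python
from typing import Dict, List

# Hypothetical stand-in for the central survey data set.
CENTRAL_SURVEYS: List[Dict[str, str]] = [
    {"SURVEY": "S01", "SURVNME": "Example survey", "SURVDIR": "/steps/s01"},
]

# Hypothetical stand-in for SURVLIB.VSTATPS, keyed by survey.
VSTATPS: Dict[str, List[Dict[str, str]]] = {
    "S01": [
        {"STATP": "2000M04", "DATASDIR": "/steps/s01/2000m04/data",
         "PARMSDIR": "/steps/s01/2000m04/parms"},
        {"STATP": "2000M03", "DATASDIR": "/steps/s01/2000m03/data",
         "PARMSDIR": "/steps/s01/2000m03/parms"},
    ],
}


def assign_libnames(survey: str, base_statp: str) -> Dict[str, str]:
    """Map the standard names to the directories for one survey and base stat period."""
    survdir = next(r["SURVDIR"] for r in CENTRAL_SURVEYS if r["SURVEY"] == survey)
    period = next(r for r in VSTATPS[survey] if r["STATP"] == base_statp)
    return {
        "SURVLIB": survdir,
        "DATA00": period["DATASDIR"],  # data for the base stat period
        "PARM00": period["PARMSDIR"],  # parameters for the base stat period
    }


print(assign_libnames("S01", "2000M04"))
```

Downstream code refers only to SURVLIB, DATA00, and PARM00, so it runs unchanged whichever survey or stat period is selected.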

79
Microdata Storage (Survey Parameters or
specifications)
  • Control-data files - (name and address info)
  • A record with a set of standard variables plus
    control-type variables specific to a survey
  • Master control file
  • Stat-period control file
  • Item-data files (numeric info for each ID)
  • A record for each item in the survey, along with
    various processing fields (weighted, corrected)
  • Skinny file - separate record for each ID/item
  • Fat file - all data relating to an ID
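The skinny and fat layouts hold the same item data in different shapes: one record per ID/item pair versus one record per ID. A minimal sketch of converting between them follows, with field names (`ID`, `ITEM`, `VALUE`) and item names chosen for the example rather than taken from StEPS.

```python
from typing import Dict, List


def skinny_to_fat(skinny: List[Dict]) -> List[Dict]:
    """Collapse one-record-per-ID/item rows into one record per ID."""
    fat: Dict[str, Dict] = {}
    for rec in skinny:
        fat.setdefault(rec["ID"], {"ID": rec["ID"]})[rec["ITEM"]] = rec["VALUE"]
    return list(fat.values())


def fat_to_skinny(fat: List[Dict]) -> List[Dict]:
    """Expand one-record-per-ID rows into one record per ID/item."""
    return [
        {"ID": row["ID"], "ITEM": item, "VALUE": value}
        for row in fat
        for item, value in row.items()
        if item != "ID"
    ]


skinny_file = [
    {"ID": "001", "ITEM": "SALES", "VALUE": 1200},
    {"ID": "001", "ITEM": "FUEL", "VALUE": 35},
    {"ID": "002", "ITEM": "SALES", "VALUE": 800},
]
fat_file = skinny_to_fat(skinny_file)
print(fat_file)
```

The skinny layout never changes shape when a survey adds items, while the fat layout is convenient when all data for one ID is reviewed together.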

80
Elements of StEPS
  • SAS data sets
  • Data dictionaries
  • Micro/macro data
  • Processing parameters
  • SAS/AF screens for specifying parameters,
    submitting batch jobs, and requesting results
    listings
  • SAS macros and estimation scripts for batch
    calculations

81
EFFORTS AT EIA FOR BUILDING A COMMON DATA
COLLECTION AND PROCESSING SYSTEM
  • COMMON COLLECTION AND PROCESSING SYSTEM (CCAPS)
    being developed to handle up to 70 different
    energy surveys
  • In pilot stage
  • Development of a Master Universe Database as a
    frame file for all companies reporting on an EIA
    survey across all energy sectors

82
COMMON COLLECTION AND PROCESSING SYSTEM
(CCAPS) Objectives
  • Minimize costs for maintenance, long term
    development, and user operations
  • Common database storage system to minimize data
    redundancy, common tools for data management and
    access
  • Consolidate all EIA name and address systems into
    one comprehensive system -- Master Universe
    Database (MUD)

83
Common Collection and Processing System
(CCAPS) Data Entry
  • Initialize new data collection period
  • Data collection
  • No change in data collection instruments
  • allows for multiple collection modes within a
    survey
  • Data input to CCAPS
  • Direct entry by keystroke or fast key
  • Import electronic data file (PEDRO, e.g.)

84
Common Collection and Processing System
(CCAPS) Data Processing
  • Perform all data edits
  • Set flags when edit fails
  • Provide mechanism for flag resolution
  • Maintain historical log of all data changes by
    cell
  • Provide reports to facilitate/evaluate edits

85
2 Levels of Automated Edits
  • FIRST LEVEL edits are performed on data when the
    data are first entered from the survey and saved
    into the data base.
  • SECOND LEVEL edits are performed using the
    current week/month aggregate data.

86
First Level Automated Edits
  • Include summing across cells within a form
  • Utilize previous historical information,
    inclusive of imputed values on that respondent
    for previous period (t-1), (t-2), (t-n).
  • Can make use of previous aggregate information
    for that data cell for previous periods (t-1),
    (t-n).
  • Occurs during the Edit and Save process for
    entering data.
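A first-level edit of this kind might look like the sketch below: one check that component cells sum to the reported total, and one comparison of each cell with the same respondent's value for period (t-1). The cell names, the 50 percent threshold, and the function signature are assumptions for the example and are not CCAPS specifications.

```python
from typing import Dict, List, Tuple


def first_level_edits(cells: Dict[str, float],
                      total_cell: str,
                      component_cells: List[str],
                      previous: Dict[str, float],
                      max_change: float = 0.5) -> List[Tuple[str, str]]:
    """Run within-form edits when a form is saved; return (cell, message) flags."""
    flags = []
    # Edit 1: sum across cells within the form.
    if sum(cells[c] for c in component_cells) != cells[total_cell]:
        flags.append((total_cell, "components do not sum to total"))
    # Edit 2: compare each cell with the respondent's value for period t-1.
    for name, value in cells.items():
        prior = previous.get(name)
        if prior:  # skip cells with no usable history (missing or zero)
            if abs(value - prior) / abs(prior) > max_change:
                flags.append((name, "change from period t-1 exceeds threshold"))
    return flags


current_form = {"regular": 100, "premium": 40, "total": 150}
prior_form = {"regular": 95, "premium": 10, "total": 105}
flags = first_level_edits(current_form, "total", ["regular", "premium"], prior_form)
print(flags)
```

Both checks need only the form being saved and stored history, which is why they can run during the Edit and Save step.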

87
Second Level Automated Edits
  • Requires an aggregation of current period (t)'s
    data.
  • Any aggregation of the current period's data may be
    used, not just the publication aggregation level.
  • E.g., a rule based on the change in a respondent's
    market share from period (t-1) to period (t).
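The market-share rule can be sketched as follows. It needs the total for the current period before any respondent can be checked, which is what makes it a second-level edit. The 5-percentage-point threshold and the respondent data are invented for the example.

```python
from typing import Dict, List


def market_shares(volumes: Dict[str, float]) -> Dict[str, float]:
    """Each respondent's share of the aggregate volume for one period."""
    total = sum(volumes.values())
    return {rid: volume / total for rid, volume in volumes.items()}


def share_change_flags(current: Dict[str, float],
                       previous: Dict[str, float],
                       threshold: float = 0.05) -> List[str]:
    """Flag respondents whose market share moved by more than the threshold."""
    cur = market_shares(current)
    prev = market_shares(previous)
    return sorted(
        rid for rid in cur
        if rid in prev and abs(cur[rid] - prev[rid]) > threshold
    )


period_t_minus_1 = {"A": 500, "B": 300, "C": 200}
period_t = {"A": 510, "B": 290, "C": 320}
print(share_change_flags(period_t, period_t_minus_1))
```

Respondent C's share rises from 20 percent to about 29 percent, so only C is flagged. A and B shift by less than the threshold even though their shares also move when C's volume changes.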

88
Editing Goals and Criteria
  • Need the ability to view data and edit flags and
    correct errors online after the Edit and Save
    process, as well as after 2nd level edits.
  • Performance measures are required to measure both
    the quality of the data and the editing process
    itself.

89
Performance Measures for Evaluating the Editing
Process
  • Counts of fatal, critical, and warning flags by
    type of edit
  • Calculation of data failure rate by type of edit
  • Count of critical and warning failures that
    result in changed data; calculation of hit rates
    by type of edit
  • Aggregate performance measures: percent of cells
    flagged and changed; percent of volume changed.
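These measures can be computed from a log of edit flags. The sketch below assumes that the failure rate is flags divided by cells checked and that the hit rate is the share of flags that led to a data change. The slides name these measures but do not define them, so both definitions, and the log layout, are assumptions.

```python
from collections import defaultdict
from typing import Dict, List


def edit_performance(flag_log: List[Dict], cells_checked: Dict[str, int]) -> Dict[str, Dict]:
    """Summarize flags by type of edit.

    flag_log entries: {"edit": name, "severity": level, "changed": bool}
    cells_checked: number of cells each edit was applied to
    """
    counts = defaultdict(lambda: {"flags": 0, "changed": 0})
    for flag in flag_log:
        counts[flag["edit"]]["flags"] += 1
        counts[flag["edit"]]["changed"] += int(flag["changed"])
    return {
        edit: {
            "flags": c["flags"],
            # ASSUMED definition: flags raised per cell checked.
            "failure_rate": c["flags"] / cells_checked[edit],
            # ASSUMED definition: share of flags that resulted in changed data.
            "hit_rate": c["changed"] / c["flags"],
        }
        for edit, c in counts.items()
    }


log = [
    {"edit": "range", "severity": "warning", "changed": False},
    {"edit": "range", "severity": "warning", "changed": True},
    {"edit": "range", "severity": "critical", "changed": False},
    {"edit": "range", "severity": "warning", "changed": False},
    {"edit": "sum", "severity": "fatal", "changed": True},
]
summary = edit_performance(log, {"range": 200, "sum": 50})
print(summary)
```

A low hit rate for an edit suggests it flags many correctly reported values, which is the over-identification problem that motivates tuning edit rules.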

90
Common Collection and Processing System
(CCAPS) Data Processing
  • Perform imputations, estimations and aggregations
  • Data suppression
  • Reports for final data evaluation and drafts of
    publication reports
  • Export files to publication system
  • Export files to other composite publication
    systems (analysis, Monthly Energy Review)

91
Form Selection Screen
92
Sample Form
93
Cell Flags/History
94
Comments
95
Reports Selection Screen
96
Features Summary
  • Common form screen structure
  • Respondent data, form data, comments
  • Right click
  • Edits, history, flags
  • Allows more than 1 form at a time
  • more than 1 respondent
  • more than 1 historical period
  • Common screen management functions
  • Comments, Imputations

97
The Need for Developing a Master Survey Frame
Database
  • Need to standardize information on Company name
    and addresses, Ids, etc., to allow transfer of
    information across survey systems.
  • Avoids confusion and inconsistencies of name and
    address information contained in the various
    survey systems.

98
The Role of a Master Frame Survey Database for a
Common Survey Processing System
  • The Master Universe Database contains all
    information concerning the businesses in the
    survey processing system.
  • Stores all name and address information, the
    energy sector and activity they are in, contact
    persons, business affiliations.
  • Contains historical records about companies no
    longer in business, past affiliations, and
    reporting history.

99
MUD Filter/Search Screen
100
Economic Unit Tab
101
Contacts
102
Relationships
103
Attributes
104
Comments
105
CCAPS Architecture Goal
Code
Data Publications Analyses
Respondent data
Form
Energy data
106
Advanced Architectural Features
  • Cell based approach
  • Layered Approach
  • Dynamic Formulas
  • Custom Controls

107
Cell Approach - design system around data cells
rather than the form
(Diagram: Old Approach (Publication based) versus New Approach (Cell Based); labels include Forms 1-6, Tables 1-4, and a Cell Table)
108
Single Cell Approach
  • Survey Form Structure Stored As Data
  • Cell Tracks
  • Data
  • Flags
  • History
  • Imputations
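A cell that tracks its own data, flags, history, and imputation status could be modeled as in the sketch below. The field names and the `set_value` method are illustrative and do not describe the actual CCAPS schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Cell:
    form: str          # e.g. "EIA-x"
    respondent: str
    period: str
    name: str
    value: Optional[float] = None
    imputed: bool = False
    flags: List[str] = field(default_factory=list)
    # Each history entry: (old value, new value, reason for the change).
    history: List[Tuple[Optional[float], Optional[float], str]] = field(default_factory=list)

    def set_value(self, new_value: Optional[float], reason: str, imputed: bool = False) -> None:
        """Change the value and log the change, so every cell keeps its own audit trail."""
        self.history.append((self.value, new_value, reason))
        self.value = new_value
        self.imputed = imputed


cell = Cell(form="EIA-x", respondent="R001", period="2000-05", name="total_sales")
cell.set_value(120.0, "reported")
cell.flags.append("failed range edit")
cell.set_value(100.0, "imputed from prior period", imputed=True)
print(cell.value, cell.imputed, cell.history)
```

Because the form layout itself is stored as data, a form becomes a collection of such cells, and edits, history, and flags work the same way for every survey.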

109
Single Cell Data Management
Historical Values
Past Comments
Form EIA-x
Edit History
Dimensional Attributes
Location
Time
Resource
Other
110
Layered Approach
Data bases
111
Dynamic Formulas
  • Formula Types: Edit, Impute, Aggregate, Estimate
  • Data Driven Features
  • Assignable to cells
  • Formula arguments are in database
  • Arguments are modifiable by the user
  • Formulas are applicable to cell groups
  • Formula application rules are determined at
    runtime

112
Custom Controls
  • Common Custom Controls: Edit Grid, Check Box, Buttons
  • Common Code: Load, Edit, Save, Display Historical Data, Flags, Comments
  • Common Features: Color, Font
113
Interactive Graphical Editing capability
  • Median editing cost is 40% of survey cost for
    economic surveys
  • Over-identification of potential errors
  • extensive manual review and man-hours
  • unknown/little impact on survey results
  • increased respondent burden
  • risk of missing true errors
  • Capitalize on efficiency of technology while
    acknowledging subject matter specialists' expertise

114
Combines Features of other Graphical Editing
Systems
  • From BLS: a top-down approach; displays an
    anomaly map similar to ARIES
  • Summarizes anomalies at aggregate levels
    through color
  • Indicates relationship of aggregates
  • Provides drill-down capability to lower-level
    aggregates and respondent-level data

115
Combines Features of other Graphical Editing
Systems
  • From Census: EDA techniques
  • Box-whisker graphs of change with query and
    drill-down capability
  • Time series and scatter graphs
  • From Statistics Sweden: Windows application
  • Anticipate client/server solution, object
    oriented, integrated with data processing
  • Graphics enable more effective editing rules,
    limits/thresholds or other parameters

116
Combines Features of other Graphical Editing
Systems
  • From Federal Reserve Bd: PowerBuilder
  • Quicker development time, driven by iterative
    user feedback/requirements
  • Incorporate visualization techniques: 2 or 3
    colors/4 shades enhances perception; regression
    line on scatter graph; select and regraph to
    uncluster data

117
(No Transcript)
118
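SCORE FUNCTION: Marginal Change, Respondent's
Contribution to Total Change
For price aggregate,
$$P_{k,t} = \frac{\sum_i w_{i,t} P_{i,k,t} V_{i,k,t}}{\sum_i (w_{i,t} V_{i,k,t})}$$
A respondent i's contribution is
$$MP_i = \frac{w_{i,t} P_{i,k,t} V_{i,k,t}}{\sum (w_{i,t} V_{i,k,t})} - \frac{w_{i,t-1} P_{i,k,t-1} V_{i,k,t-1}}{\sum (w_{i,t-1} V_{i,k,t-1})}$$
$$MV_i = \frac{w_{i,t} V_{i,k,t}}{\sum w_{i,t}} - \frac{w_{i,t-1} V_{i,k,t-1}}{\sum w_{i,t-1}}$$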
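The price score can be sketched numerically. Each respondent's term is its weighted price times volume divided by the weighted volume total for the period, and the score is the current-period term minus the prior-period term. The respondent data and function names below are invented for the example. Summed over respondents, the scores equal the change in the price aggregate, which the code checks.

```python
from typing import Dict, Tuple

# respondent id -> (weight w, price P, volume V) for one product k and one period
Period = Dict[str, Tuple[float, float, float]]


def price_terms(data: Period) -> Dict[str, float]:
    """Each respondent's share of the weighted-average price for one period."""
    denominator = sum(w * v for w, _, v in data.values())
    return {rid: w * p * v / denominator for rid, (w, p, v) in data.items()}


def price_contributions(current: Period, previous: Period) -> Dict[str, float]:
    """Score per respondent: current-period term minus prior-period term."""
    cur, prev = price_terms(current), price_terms(previous)
    return {rid: cur.get(rid, 0.0) - prev.get(rid, 0.0)
            for rid in set(cur) | set(prev)}


previous = {"A": (1.0, 1.45, 100.0), "B": (2.0, 1.42, 50.0)}
current = {"A": (1.0, 1.50, 100.0), "B": (2.0, 1.40, 50.0)}

scores = price_contributions(current, previous)
total_change = sum(price_terms(current).values()) - sum(price_terms(previous).values())

# Rank respondents by the size of their contribution, largest first.
for rid in sorted(scores, key=lambda r: abs(scores[r]), reverse=True):
    print(rid, round(scores[rid], 4))
print("total change:", round(total_change, 4))
```

Ranking by absolute score directs review to the respondents that moved the aggregate most, instead of to every value that fails an edit.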
119
(No Transcript)
120
(No Transcript)
121
(No Transcript)
122
(No Transcript)
123



124
(No Transcript)
125
(No Transcript)
126
Increase
Increases
Decrease
127
Summary on Graphical Editing
  • Fast method of identifying outliers
  • Allows you to test your editing rules and
    potential to test alternatives
  • Reduces the amount of correctly reported data
    failing the edits
  • Reduces the time and resources for validating data

128
http://www.fas.harvard.edu/stats/survey-soft/survey-soft.html (Packages)
129
http://www.analyse-it.com/info/genmod.htm