Title: Web-Based Surveys for Corporate Information Gathering: A Bias-Reducing Design Framework
1. Web-Based Surveys for Corporate Information Gathering: A Bias-Reducing Design Framework
- Jake Burkey
- Washington State University, Pullman
- William L. Kuechler
- University of Nevada, Reno
2. Motivation for Use of Web-Based Surveys
- Relatively low cost overall.
- Sunk costs are greatest share.
- Server space, software, technical training of personnel.
- Marginal costs near zero.
- Data processing for an additional observation.
- Use of existing resources may reduce sunk cost.
- Existing excess server capacity.
- Personnel with development and administration skills.
- Dynamic interaction with respondent.
- Condition questions on prior responses.
- Present unique questions.
- Metadata readily available.
3. Concerns Specific to Web-Based Surveys
- Formatting and Technical.
- Different browser makes and versions.
- Formatting and layout may differ.
- Inconsistent support for dynamic and interactive scripts.
- Sampling.
- Coverage error related to surveys of general population.
- Not significant in organizational context.
- Other Administrative.
- Data loss from program bugs or server failure.
- Security and privacy of transaction.
4. Web Survey Types
- Nonprobability Surveys.
- These do not frame a representative sample.
- Data cannot be used to characterize a population or make predictions through statistical inference.
- Bias not relevant in this context.
- Entertainment polls.
- Common single-question opinion polls.
- Unrestricted self-selected surveys.
- Volunteer opt-in panels.
5. Web Survey Types
- Probability-based Surveys.
- Administered as census or probability sample.
- Allows the use of statistical inference.
- Bias reduces the accuracy of statistical inference.
- Intercept surveys.
- Used by e-commerce sites to sample customer population.
- List-based samples.
- Organizational e-mail lists.
- Web option in mixed-mode surveys.
- Comparatively high response rate.
- Prerecruited panels.
- Used by large polling firms.
- Representative sample is constructed through weighting.
6. Bias in Web Surveys
- Coverage and Sampling Error.
- Expresses inaccuracy of sample moments as estimators of the true moments of the population distribution.
- This is the familiar margin of error in polls.
- Contributes to imprecise and inaccurate estimators.
- Occurs because Internet user population not representative of the general population.
- Technology adoption not uniform or complete.
- Access to Web more uniform in organizational setting.
- Not present in a census of all members of a population.
- Generally feasible for smaller organizations.
- Nonresponse and measurement errors may still be present.
- In larger organizations, stratified sampling may increase precision and reduce cost.
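The stratified sampling point can be illustrated with a minimal sketch. It assumes proportional allocation (each stratum receives a share of the sample equal to its share of the population), which is only one of several allocation schemes and is not stated in the slides. The function names and the idea of using departments as strata are hypothetical.

```python
import random


def proportional_allocation(strata_sizes, total_sample):
    """Split total_sample across strata in proportion to stratum size."""
    population = sum(strata_sizes.values())
    raw = {name: total_sample * size / population
           for name, size in strata_sizes.items()}
    # Round down first, then hand out the leftover units to the strata
    # with the largest fractional remainders so the total is exact.
    alloc = {name: int(value) for name, value in raw.items()}
    shortfall = total_sample - sum(alloc.values())
    by_remainder = sorted(raw, key=lambda name: raw[name] - alloc[name],
                          reverse=True)
    for name in by_remainder[:shortfall]:
        alloc[name] += 1
    return alloc


def draw_stratified_sample(members_by_stratum, total_sample, seed=None):
    """Draw a simple random sample without replacement inside each stratum."""
    rng = random.Random(seed)
    sizes = {name: len(members)
             for name, members in members_by_stratum.items()}
    alloc = proportional_allocation(sizes, total_sample)
    return {name: rng.sample(members, alloc[name])
            for name, members in members_by_stratum.items()}
```

For an organization with 500, 300 and 200 employees in three departments, a sample of 100 would be allocated as 50, 30 and 20 invitations under this assumption.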
7. Bias in Web Surveys
- Measurement Error.
- Question wording and order.
- Address in content and language stage of development.
- Not specific to Web surveys.
- Layout of Web page and form controls.
- Questionnaire layout should be invariant to browser resizing.
- Orientation, response order and banking in response scales all introduce some amount of bias.
- Respondent interaction with interviewer.
- People interact with Web pages much as they interact with humans.
- Design and format of Web interface has a personality that can introduce bias.
8. Bias in Web Surveys
- Nonresponse Error: Item Nonresponse.
- Item in an observation unit is missing a response.
- Contributing factors.
- Question content or language.
- Confusing measurement scale or instructions.
- May replace item using auxiliary regression or imputation.
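As a minimal sketch of the imputation idea, the function below fills missing item responses with the mean of the observed responses for that item. Mean imputation is chosen here only because it is the simplest variant; it understates the variance of the item, and the slides do not say which imputation method the authors have in mind. The function name and the use of `None` to mark a missing response are assumptions.

```python
from statistics import mean


def impute_item_mean(responses):
    """Replace missing responses (None) with the mean of observed ones."""
    observed = [r for r in responses if r is not None]
    if not observed:
        raise ValueError("no observed responses to impute from")
    fill = mean(observed)
    return [fill if r is None else r for r in responses]
```

A regression-based alternative would predict the missing item from other items answered by the same respondent rather than from the item mean.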
9. Bias in Web Surveys
- Nonresponse Error: Unit Nonresponse.
- An entire observation unit is missing.
- Contributing factors.
- Preferences for survey mode.
- Anonymity very important for organizational surveys.
- Ease of use.
- Time to completion.
- May compensate for unit nonresponse with weighting adjustments.
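One common form of weighting adjustment is the weighting-class method: respondents in each class are weighted by the ratio of units sampled to units that responded in that class. The sketch below assumes that method and assumes respondents and nonrespondents within a class are similar; the class labels and function names are hypothetical.

```python
def nonresponse_weights(sampled, responded):
    """Weight for each class = units sampled / units that responded."""
    weights = {}
    for cls, n_sampled in sampled.items():
        n_resp = responded.get(cls, 0)
        if n_resp == 0:
            raise ValueError(
                f"no respondents in class {cls!r}; merge it with another class")
        weights[cls] = n_sampled / n_resp
    return weights


def weighted_mean(values_by_class, weights):
    """Mean of all responses, each weighted by its class weight."""
    total = sum(weights[cls] * value
                for cls, values in values_by_class.items()
                for value in values)
    weight_sum = sum(weights[cls] * len(values)
                     for cls, values in values_by_class.items())
    return total / weight_sum
```

If 10 of 20 sampled managers and 20 of 80 sampled staff respond, each manager response carries weight 2 and each staff response weight 4, which restores the 20:80 balance of the original sample.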
10. How to Measure Bias
- Statistical Validity Testing.
- Pearson chi-square statistic.
- Only assumption is that observations are independent and identically distributed (i.i.d.).
- Two-sample t statistic to compare means.
- Independent samples and approximate Normal distribution.
- F statistic to compare variances.
- Independent samples and Normal distribution.
- Very sensitive to departures from underlying assumptions.
- Wilcoxon rank sum test (Mann-Whitney U test).
- Non-parametric test.
- Not dependent on assumption of underlying distribution.
- Additional tests for symmetry or normality if desired.
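The Pearson chi-square statistic from the list above can be computed directly from a contingency table, for example survey mode (rows) by response category (columns). This is a minimal sketch that returns only the statistic and its degrees of freedom; the p-value would still have to be looked up against the chi-square distribution. The function name is hypothetical, and the code assumes every row and column contains at least one observation.

```python
def pearson_chi_square(table):
    """Return (statistic, degrees of freedom) for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column classifications are independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof
```

A statistic near zero indicates that the response distribution is about the same across modes; a large value relative to the chi-square distribution with `dof` degrees of freedom suggests a mode effect.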
11. Web Survey Development Framework
- Bias reduction through planned design.
- List of questions.
- Content and language.
- Format.
- Visual layout.
- Measurement.
- Design and function of form controls.
- Administration.
- Sampling.
- Programming.
- Data management.
12. Design Framework - Format
- Limit page colors, fonts and graphics.
- Consider CSS for programming layout.
- Standardize across different client-side configurations.
- Browser options may override formatting instructions.
- Different browsers may render pages differently.
- Monitor size and screen resolution may affect display.
- Flash or Java will enforce standard presentation.
- May need software plug-in.
- Need broadband connection.
- More feasible in organizational context.
13. Design Framework - Format
- Focus movement through page with visual anchors.
- Size, color and position of text elements.
- Contrasting blocks of background color.
- Single page vs. multiple page questionnaire.
- Dillman cited as recommending single page design.
- Currently using multi-page design, to collect metadata.
- We recommend single page unless:
- the questionnaire is lengthy or contains many response fields, or
- question-specific metadata must be collected.
14. Design Framework - Format
- Skip pattern compliance.
- DHTML on client side.
- Fast, low-bandwidth, and invisible to the user.
- Cross-browser compatibility is still an issue.
- Windows XP Service Pack 2 may disable active scripting.
- Server-side scripting.
- Most appropriate with multi-page questionnaire.
- Slower, but with dependable implementation.
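Server-side skip-pattern enforcement in a multi-page questionnaire can be sketched as a lookup that decides which page to serve next. The page names, the rule table and the function name below are all hypothetical, and the sketch is written in Python for illustration rather than in the server languages named elsewhere in these slides.

```python
# Hypothetical page sequence for a short multi-page questionnaire.
PAGE_ORDER = ["q1_uses_intranet", "q2_intranet_frequency",
              "q3_satisfaction", "done"]

# Hypothetical skip rules: (current page, answer) -> page to jump to.
SKIP_RULES = {("q1_uses_intranet", "no"): "q3_satisfaction"}


def next_page(current_page, answer,
              page_order=PAGE_ORDER, skip_rules=SKIP_RULES):
    """Return the page to serve after the respondent submits current_page."""
    target = skip_rules.get((current_page, answer))
    if target is not None:
        return target  # a skip rule applies, so jump ahead
    position = page_order.index(current_page)
    # Otherwise advance one page, staying on the final page once reached.
    return page_order[min(position + 1, len(page_order) - 1)]
```

Because the decision is made on the server, the respondent never sees the skipped page, and the behavior does not depend on browser scripting support.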
15. Design Framework - Format
- Minimize Questionnaire Size and Time to Completion.
- Check completion time from a web terminal, not locally.
- Presentation of Instructions.
- Blocks of text for general instructions.
- Question wording as instruction.
- Additional information in pop-up windows.
- Available in document at point of need.
- DHTML boxes vs. new browser windows.
- May introduce bias. Effect has not been researched.
- Framework evaluation survey results.
- Web mode produced fewer errors and omissions.
- Due to pop-up instructions?
16. Design Framework - Measurement
- Textboxes.
- Response not constrained to pre-determined choice set.
- Length of space communicates expectation, may induce bias.
- May require validation script.
- Drop-down menus.
- Will limit potential responses to the set provided.
- Set may be large without impacting page layout.
- Response choices are hidden and ordered, may induce bias.
17. Design Framework - Measurement
- Radio button lists.
- Likert response scales.
- Only one button in a set may be selected.
- Question should be presented with no choice pre-selected.
- Note: object is not instantiated until a button is selected.
- Checkboxes.
- Allow multiple selections from a set of choices.
- Question should be presented with no choice pre-selected.
- Object exists regardless of selection status.
18. Design Framework - Measurement
- Custom Controls.
- ASP.NET Server Controls, User Controls and Custom Controls.
- HTML form controls programmable in server-side code.
- Extended or unique control interface or functionality.
- May generate client-side code at run-time.
- .NET processor generates cross-browser compatible code.
- DHTML (JavaScript) Controls.
- Extended or unique HTML control interface or functionality.
- Code execution entirely on client.
- Cross-browser compatibility must be addressed explicitly.
19. Design Framework - Administration
- Development and deployment models.
- Third-party polling firm.
- Outside firm does all or most of design and administration.
- May strengthen perception of anonymity of responses.
- Highest cost, least control over process.
- Web-based polling site.
- Survey design completed in-house.
- Greater developmental control, potentially less cost.
- Outside firm handles web deployment and data collection.
- Maintains perception of anonymity.
- In-house.
- Greatest control over process, potential cost reduction.
- Personnel needs determined by survey complexity.
20. Design Framework - Administration
- Web survey software.
- Pre-programmed software packages, often free or low-cost.
- May incur training costs for new software.
- Package may offer limited questionnaire options.
- May require additional programming.
- Programming various server and client systems.
- ASP, ASP.NET, PHP, CGI/Perl, on server.
- JavaScript, Java, Flash, on client.
- Server license may be costly, client licenses are free.
- May be able to leverage programmers in-house.
21. Design Framework - Administration
- User Validation.
- Probability survey must restrict access to selected respondents.
- Web form or querystring or both.
- Access code, or combination of user ID and access code.
- Databases.
- Many free or available as part of business software suite.
- Access, MySQL, PostgreSQL, DB2, Cloudscape, limited version of SQL Server, and several others.
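The user ID plus access code approach can be sketched as follows. The sketch assumes codes are generated randomly per respondent and checked on the server before the questionnaire is served; the function names and the in-memory dictionary standing in for a database table are hypothetical.

```python
import hmac
import secrets


def issue_access_codes(user_ids, nbytes=8):
    """Generate one random, URL-safe access code per selected respondent."""
    return {user_id: secrets.token_urlsafe(nbytes) for user_id in user_ids}


def is_authorized(user_id, access_code, issued_codes):
    """Check a submitted user ID and access code against the issued codes."""
    expected = issued_codes.get(user_id)
    if expected is None:
        return False  # respondent is not in the selected sample
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), access_code.encode())
```

The same check applies whether the values arrive through a login form or in the query string of an e-mailed link. Restricting each code to a single completed response would need an additional flag per respondent, which this sketch omits.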
22. Design Framework - Administration
- Data Validation.
- Relatively easy to implement in Web surveys.
- May be used to prompt a respondent if item missing.
- Forcing a response may cause abandonment of questionnaire.
- Ensure complete data set, with no nulls.
- Write a default value to database if item missing.
- Prevent data type and SQL errors.
- Cast input to correct type before writing to database.
- Allow valid entries only as arguments to an SQL query.
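The three validation points above (default value for a missing item, casting to the correct type, and passing only valid entries to SQL) can be sketched together. SQLite is used here purely as a stand-in for the databases the slides list; the table layout, the sentinel value of -1, the 1 to 5 rating range and the function names are all assumptions.

```python
import sqlite3

MISSING = -1  # hypothetical default written when an item is missing or invalid

SCHEMA = """
CREATE TABLE IF NOT EXISTS responses (
    respondent_id TEXT NOT NULL,
    question_id   TEXT NOT NULL,
    rating        INTEGER NOT NULL
)
"""


def clean_rating(raw, low=1, high=5, default=MISSING):
    """Cast raw form input to an integer rating, or return the default."""
    try:
        value = int(str(raw).strip())
    except (TypeError, ValueError):
        return default
    return value if low <= value <= high else default


def save_response(conn, respondent_id, question_id, raw_value):
    """Store one validated item using a parameterized query."""
    # Placeholders keep user input out of the SQL text itself.
    conn.execute(
        "INSERT INTO responses (respondent_id, question_id, rating) "
        "VALUES (?, ?, ?)",
        (respondent_id, question_id, clean_rating(raw_value)),
    )
    conn.commit()
```

With this arrangement a blank field or a string such as `1; DROP TABLE responses` is stored as the default value rather than causing a type error or reaching the database as executable SQL.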
23. Development and Administration Cases
- Dennis and Gambhir (2000).
- Programmed application using AOLServer and Sybase.
- Couper, et al. (2001).
- Used ScyWeb survey program.
- Heerwegh and Loosveldt (2002).
- Programmed PHP server script.
- Francis, et al. (2000).
- Wrote Java program, hosted on Linux server.
- Crawford, et al. (2001).
- Programmed Cold Fusion server script, with Access database.
- Our framework evaluation survey.
- Programmed ASP server script, with Access database.
24. Pilot Testing
- Performance and Format Testing.
- Server.
- Load-test under heavier load than expected.
- Important regardless of whether survey is administered in-house or by outside firm.
- Network.
- If served over company intranet, check permissions.
- If served to Internet, check download time over phone line.
- HTML form input controls and client-side scripts.
- Test for proper function.
- Test for cross-browser compatibility.
- Questionnaire.
- Examine distribution of responses by question.
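Examining the distribution of pilot responses by question can be as simple as tabulating answer counts per item. The sketch below assumes each pilot response is a dictionary mapping question IDs to answers, with `None` marking a skipped item; that record layout and the function name are hypothetical.

```python
from collections import Counter


def response_distribution(records):
    """Count how often each answer occurs for each question."""
    distribution = {}
    for record in records:
        for question, answer in record.items():
            distribution.setdefault(question, Counter())[answer] += 1
    return distribution
```

Items where nearly all pilot respondents choose the same answer, or where the `None` count is high, are candidates for rewording before full deployment.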
25. Summary
- Gains from Web surveys.
- Cost reduction.
- Marginal cost approaches zero.
- Automation reduces data entry and data cleaning.
- May leverage existing server capacity, software and personnel.
- Bias reduction.
- Reduced measurement error and item nonresponse.
- Versatility.
- Relatively easy to design and distribute.
- Data can be delivered to any database, anywhere on the Internet.
26. Summary
- Challenges of Web surveys.
- Sample selection for surveys of general public.
- Should not impact organizational surveys.
- Potential time cost to respondent.
- Some have suggested Web survey completion may require more time than other modes.
- Results of our survey and others suggest time difference may be eliminated by careful design of user interface.
- Organizational users more likely to be familiar with Web forms.
27. Summary
- Opportunities for Web Survey Research.
- Nonresponse error.
- Perception of transactional security.
- Eleanor Singer studied perceptions of data privacy.
- Measurement error.
- Custom HTML form controls.
- Dynamic instructional help.
- Consumer surplus from Web and Mail surveys.
- Respondent incurs time cost in completing survey.
- Value of time is continuously variable.
- Web surveys allow response at time when cost is least.