Title: The Experiences of Web Based Data Collection from Enterprises in Finland
1 The Experiences of Web Based Data Collection from Enterprises in Finland
- August 9th, 2006, JSM, Seattle, USA
2 Introduction - Strategies and Methods
- Statistics Finland's strategy for EDR
  - To offer an electronic option in all data collections by 2007 (excluding person statistics)
  - It is the respondent's choice whether to use it or not
- Data collection methods
  - About 97% of data are derived from administrative registers
  - About 3% are from direct data collection (paper forms, machine-readable data / primary EDI, EDR, interviews by CATI/CAPI systems, mainly Blaise)
- Business data collections
  - About 50 surveys (excluding collections with fewer than 30 respondents)
  - 45 Web (Internet form) collections in use
3 Background - Data Collection and Infrastructure
- Traditionally high response rates (in both annual and sub-annual business surveys)
  - Up to more than 99%, persistent staff
- Good relations with data providers
  - Experienced staff, continuous personal contacts, high level of trust
- High level of Internet use
  - Almost every enterprise has access to the Internet (10+ employees: 98%; 100+ employees: 100%)
  - Business surveys are directed to the largest enterprises
- Positive atmosphere for using the Internet with the government
  - Respondents are even enthusiastic about using the Internet
  - "It's fun to fill in web forms instead of paper ones!"
4 Background - Three Generations of In-house EDR Solutions
- 1st generation: Building cost index, 2001
  - Built using Microsoft Windows DNA (Distributed Internet Application Architecture)
- 2nd generation: 7 EDR solutions, 2002-2005
  - VB.NET
- 3rd generation: 23 EDR solutions, 2005-2006
  - XCola
- 11 EDR solutions made by an outside service provider, 1997-2006
- Pilot in integrated data collection (tourism statistics)
5 Technical Information - XCola in a Nutshell
- A generic application for Web surveys
- Processes XML questionnaires and transforms them into Web applications
- Supports client-side and server-side validations
- Executed on the server side; does not require any installation on the respondent's side
- Works on every modern browser
- New questionnaires can be implemented in just hours
- Main developer: Mr. Toni Räikkönen, toni.raikkonen_at_stat.fi
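
The transformation described on this slide, from an XML questionnaire to a web form with checks on both the client and the server, might be sketched roughly as follows. XCola's actual schema and code are not shown in the presentation, so the element names (`questionnaire`, `question`), the attributes (`id`, `label`, `type`, `min`, `max`) and the functions `render_form` and `validate` are all hypothetical.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical questionnaire definition; this is not XCola's real schema.
QUESTIONNAIRE_XML = """
<questionnaire id="sales" title="Sale Inquiry">
  <question id="turnover" label="Turnover (EUR 1000)" type="int" min="0"/>
  <question id="staff" label="Number of employees" type="int" min="0" max="100000"/>
  <question id="comment" label="Comments" type="text"/>
</questionnaire>
"""


def render_form(xml_text):
    """Transform the XML questionnaire into a plain HTML form."""
    root = ET.fromstring(xml_text)
    out = [
        f'<form method="post" action="/submit/{html.escape(root.get("id"))}">',
        f'<h1>{html.escape(root.get("title"))}</h1>',
    ]
    for q in root.iter("question"):
        qid = html.escape(q.get("id"))
        attrs = [f'name="{qid}"', f'id="{qid}"']
        if q.get("type") == "int":
            attrs.append('type="number"')
            # min/max become HTML attributes, so the browser checks them too.
            for bound in ("min", "max"):
                if q.get(bound) is not None:
                    attrs.append(f'{bound}="{html.escape(q.get(bound))}"')
        else:
            attrs.append('type="text"')
        out.append(f'<label for="{qid}">{html.escape(q.get("label"))}</label>')
        out.append(f'<input {" ".join(attrs)}>')
    out.append('<button type="submit">Send</button></form>')
    return "\n".join(out)


def validate(xml_text, answers):
    """Server-side check of submitted answers; returns {question id: error}."""
    errors = {}
    for q in ET.fromstring(xml_text).iter("question"):
        if q.get("type") != "int":
            continue
        qid = q.get("id")
        try:
            value = int(answers.get(qid, "").strip())
        except ValueError:
            errors[qid] = "must be a whole number"
            continue
        if q.get("min") is not None and value < int(q.get("min")):
            errors[qid] = f"must be at least {q.get('min')}"
        elif q.get("max") is not None and value > int(q.get("max")):
            errors[qid] = f"must be at most {q.get('max')}"
    return errors


print(render_form(QUESTIONNAIRE_XML))
print(validate(QUESTIONNAIRE_XML, {"turnover": "-5", "staff": "abc"}))
```

In this sketch a new survey only needs a new XML file, which is one plausible reading of how questionnaires could be implemented "in just hours"; the server-side check repeats the browser-side limits because client-side checks can be bypassed.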
6 Benefits - Summary of Main Benefits
- Simplifying data collection process
- Reducing need for human resources
- Reducing other data collection costs
- Improving the quality of collected data
- Decreasing non-response
- Speeding up the data accumulation
- Reducing response burden
- Enabling direct individual feedback for respondents
- Enabling browsing of previously submitted data
- Assuring a high level of data security
Benefit areas: cost-efficiency, accuracy, timeliness, data provider relations
7 Achieved Cost-Efficiency - 2nd Generation
- Four second-generation solutions have been in production for 3 years (3300 respondents per month plus 800 per quarter)
- On average, over 40% of the work in the data collection phase has been saved (2 person years)
- The amount of ground mail has been reduced by 65% (0.5 person years)
- The number of reminders sent has gone down by half
- Mass e-mailer for all kinds of collections
- The investment is paid off in about a year
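
The payback claim can be checked with simple arithmetic. This sketch assumes that the savings quoted on this slide are annual, and that the investment is the roughly 2.5 person years reported for the first and second phases on the costs slide; the presentation does not state either link explicitly.

```python
# Figures from the slides; treating the savings as annual is an assumption.
work_saved_py = 2.0   # data collection work saved (person years)
mail_saved_py = 0.5   # ground mail handling saved (person years)
investment_py = 2.5   # first and second phases in total (person years)

annual_saving_py = work_saved_py + mail_saved_py
payback_years = investment_py / annual_saving_py

print(f"Annual saving: {annual_saving_py} person years")
print(f"Payback time: {payback_years:.1f} years")
```

Under these assumptions the payback time comes out at 1.0 years, in line with the "about a year" stated on the slide.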
8 Cost-Efficiency Continues to Improve - 3rd Generation
- Common framework (one engine) for similar systems
  → An effective build-up
- Simple method for transferring data between collection and production databases
- Only one application to maintain and support
- Support and development knowledge is easier to acquire and spread
- Reducing the need for human resources
  → As manual handling diminishes, it can be replaced by more rewarding tasks
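
A transfer between a collection database and a production database could look something like the sketch below. The slides do not describe the actual mechanism or database products, so SQLite, the `responses` table, its columns and the function `transfer_validated` are all assumptions made for illustration.

```python
import sqlite3


def transfer_validated(collection, production):
    """Copy validated, not yet transferred responses to production."""
    rows = collection.execute(
        "SELECT id, respondent_id, period, value FROM responses "
        "WHERE validated = 1 AND transferred = 0"
    ).fetchall()
    with production:  # commits on success, rolls back on error
        production.executemany(
            "INSERT INTO responses (respondent_id, period, value) VALUES (?, ?, ?)",
            [row[1:] for row in rows],
        )
    with collection:  # mark rows only after production has accepted them
        collection.executemany(
            "UPDATE responses SET transferred = 1 WHERE id = ?",
            [(row[0],) for row in rows],
        )
    return len(rows)


# Hypothetical in-memory databases standing in for the two environments.
collection = sqlite3.connect(":memory:")
production = sqlite3.connect(":memory:")
collection.execute(
    "CREATE TABLE responses (id INTEGER PRIMARY KEY, respondent_id TEXT, "
    "period TEXT, value INTEGER, validated INTEGER, transferred INTEGER DEFAULT 0)"
)
production.execute(
    "CREATE TABLE responses (respondent_id TEXT, period TEXT, value INTEGER)"
)
collection.executemany(
    "INSERT INTO responses (respondent_id, period, value, validated) VALUES (?, ?, ?, ?)",
    [("A1", "2006-01", 120, 1), ("B2", "2006-01", 75, 0), ("C3", "2006-01", 300, 1)],
)

moved = transfer_validated(collection, production)
print(moved)
```

Marking rows as transferred makes the routine safe to run repeatedly: a second call moves nothing until new validated responses arrive.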
9 An Example - Working Hours Used in Data Collection and Validation in Sale Inquiry
[Figure: working hours per year, 2001-2005]
10 Accuracy and Timeliness
- The data received are of better quality: 25% fewer errors (both annual and sub-annual surveys)
- Response rates have remained at a high level
- The average response time of monthly surveys has decreased
  - in the best case by 8-10 days, or 30%
- The number of reminders sent has decreased substantially
  - in the best case by 50% (from 1000 to 500 in just 4 months)
- The share of respondents using the EDR solution has in most cases reached a high level
  - sub-annual surveys: > 60% (in the best case 85%)
  - annual surveys: 30% (in the best case 75%)
11 An Example - Sale Inquiry: Accumulation of Data, 01/2002 - 01/2006
[Figure: accumulation of responses]
12 Data Provider Relations
- Perceived response burden has gone down
  - E-mail informs respondents of the survey and reminds them to answer
  - The questionnaire is always available and fast to fill in
  - Option to fill in the questionnaire in separate sessions
- Good questionnaire design
  - Helpful validity checks: no additional inquiries
  - Contextual on-line help
  - Support for several languages
- Individually tailored feedback
- Access to all previously submitted data and pre-filled questionnaires
13 High Level of Data Security
- Data security audit by an outside consultant
- All traffic on the Internet is SSL-encrypted
- An authentication/authorisation process is always required
  - New user IDs and passwords every year
  - User IDs and passwords are initially sent in a letter
    - Only one of them can be sent by e-mail
    - The other one must always be sent in a letter or given over the telephone
- Only a limited number of our staff have access to user IDs and passwords (usually two persons per survey)
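
The credential rules on this slide can be expressed as a small check. This is only a sketch: the ID format, the password length, and the names `new_credentials` and `plan_is_allowed` are hypothetical, and the slides do not say how Statistics Finland actually generates or distributes credentials.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits
ALLOWED_CHANNELS = {"letter", "telephone", "email"}


def new_credentials(survey, year):
    """Generate a fresh user ID and password for one respondent and year."""
    user_id = f"{survey}{year}-{secrets.token_hex(4)}"  # hypothetical format
    password = "".join(secrets.choice(ALPHABET) for _ in range(12))
    return {"user_id": user_id, "password": password}


def plan_is_allowed(plan):
    """Rule from the slide: user ID and password never both travel by e-mail."""
    channels = (plan["user_id"], plan["password"])
    return set(channels) <= ALLOWED_CHANNELS and channels.count("email") <= 1


creds = new_credentials("sales", 2006)
print(creds["user_id"])

print(plan_is_allowed({"user_id": "letter", "password": "letter"}))
print(plan_is_allowed({"user_id": "email", "password": "telephone"}))
print(plan_is_allowed({"user_id": "email", "password": "email"}))
```

The `secrets` module is used instead of `random` because it draws from a cryptographically secure source; splitting the two parts across channels means that intercepting a single message is not enough to log in.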
14 An Example - Sale Inquiry: Change in Response Media, 12/2001 - 12/2005
[Figure: responses by response medium]
15 Costs - Investment and Maintenance
- The costs have dropped by 60-70% during the last few years
- Average investment cost per new EDR solution (today)
  - An outside service provider was EUR 5000
  - In-house solution (XCola): less than 150 hours of work
- Maintenance costs of an EDR solution per year (today)
  - An outside service provider was EUR 1000
  - In-house solution (XCola): less than 50 hours of work
- During the first and second phases the total resource input was about 2.5 person years (learning by doing)
  - Included the development of a secure communication environment
  - Included the implementation of 7 solutions
16 An Example - Work Done in Development and Maintenance of an EDR Solution (Sale Inquiry)
Includes hours used in the development of infrastructure
[Figure: working hours per year, 2002-2005]
17 Challenges - In-house Development and Maintenance
- The development of surveys can be very fast if the IT personnel have good skills in XML and related techniques
  - At the moment the number of very skilled survey developers is limited
- The whole production environment around XCola is not yet finished
  - Somewhat dependent on certain named persons
- The statistics departments typically have a lot of requirements for the surveys
  - Some minor development in XCola is needed all the time
18 Pilot - Integrated Data Collection (Tourism)
- Data are delivered directly from hotel management systems into our database
  - No manual work needed (except to initiate the transfer)
  - After reception, the data are submitted to the standard validation process
- Software vendors implement a module for the hotel's management software using Statistics Finland's definitions for data and service interface
  - Implemented using a typical B2B integration technique: XML Web Services
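
The exchange in the pilot can be illustrated with a minimal round trip: a vendor-side module serialises the data as XML, and the receiving side parses and checks it before it enters the standard validation process. The message layout (`accommodationReport`, `nights`, `hotelId`, `country`) and both function names are invented for this sketch; the real interface follows Statistics Finland's own definitions, which are not given in the slides, and a real deployment would wrap such a message in a web service call.

```python
import xml.etree.ElementTree as ET


def build_message(hotel_id, month, nights_by_country):
    """Vendor side: serialise monthly overnight stays as XML."""
    root = ET.Element("accommodationReport", hotelId=hotel_id, month=month)
    for country, nights in sorted(nights_by_country.items()):
        ET.SubElement(root, "nights", country=country).text = str(nights)
    return ET.tostring(root, encoding="unicode")


def receive_message(xml_text):
    """Receiving side: parse the message and run basic checks."""
    root = ET.fromstring(xml_text)
    if root.tag != "accommodationReport" or not root.get("hotelId"):
        raise ValueError("not a valid accommodation report")
    record = {"hotel_id": root.get("hotelId"), "month": root.get("month"), "nights": {}}
    for item in root.findall("nights"):
        nights = int(item.text)
        if nights < 0:
            raise ValueError("overnight stays cannot be negative")
        record["nights"][item.get("country")] = nights
    return record


message = build_message("H001", "2006-07", {"FI": 410, "SE": 95})
print(message)
print(receive_message(message))
```

Because both sides agree on one message definition, each software vendor only has to implement the export once, and no manual re-keying is needed on either side.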
19 Near Future - Productisation and Integration
- More integrated data collections?
  - Co-operation with management system providers
- Project for productisation of XCola (since June 2006)
  - Already completed (XCola v. 3.1)
    - Developer's manual, finalised administration tools
    - Routines for transfers between collection and production databases
    - XCola version for outside evaluation has been built
  - Under development
    - Graphical editor for building questionnaires and links to metadata
- Project for co-ordination of business surveys
  - In the future, more co-ordinated surveys instead of many independent surveys targeted towards businesses