Research Paper
Methods for the Design
and Administration of
Web-based Surveys
TITUS K. L. SCHLEYER, DMD, PHD, JANE L. FORREST, RDH, EDD
Abstract: This paper describes the design, development, and administration of a Web-
based survey to determine the use of the Internet in clinical practice by 450 dental professionals.
The survey blended principles of a controlled mail survey with data collection through a Web-
based database application. The survey was implemented as a series of simple HTML pages and
tested with a wide variety of operating environments. The response rate was 74.2 percent.
Eighty-four percent of the participants completed the Web-based survey, and 16 percent used
e-mail or fax. Problems identified during survey administration included incompatibilities/
technical problems, usability problems, and a programming error. The cost of the Web-based
survey was 38 percent less than that of an equivalent mail survey. A general formula for
calculating breakeven points between electronic and hardcopy surveys is presented. Web-based
surveys can significantly reduce turnaround time and cost compared with mail surveys and may
enhance survey item completion rates.
J Am Med Inform Assoc. 2000;7:416–425.
The Web-based survey described in this article was
designed to investigate the use of the Internet in clin-
ical practice by 450 dental professionals. The results
of the survey itself have been published previously.1
This paper describes the design and implementation
of the survey in detail to assist other researchers who
are considering using Web-based surveys. From a re-
view of the background literature and our own ex-
periences, we present issues in sampling for electronic
surveys; survey design, programming, testing, and
administration; potential problems and pitfalls; and
cost comparisons between electronic and hardcopy
surveys. We developed several general breakeven cal-
culations based on cost, provided all other variables
are equal, to help researchers choose between elec-
tronic and traditional mail surveys.
Affiliations of the authors: Temple University School of Den-
tistry, Philadelphia, Pennsylvania (TKLS); University of South-
ern California School of Dentistry, Los Angeles, California (JLF).
Correspondence and reprints: Titus K. L. Schleyer, DMD, PhD,
Department of Dental Informatics, Temple University School of
Dentistry, 3223 N. Broad Street, TU 600-00, Philadelphia, PA
19140; e-mail: <di@dental.temple.edu>.
This work was supported in part by grant T15-LM07059 from
the National Library of Medicine/National Institute of Dental
and Craniofacial Research.
Received for publication: 9/7/99; accepted for publication:
1/31/00.
Background
Several recent publications have reported use of the
Internet to conduct survey research.2–9
Investigators in
the fields of medicine, psychology, sociology, den-
tistry, and veterinary medicine are recruiting partici-
pants for their research studies by targeting specific
search engines, newsgroups, and Web sites. Partici-
pants often answer surveys by returning a completed
form by e-mail or by entering their responses directly
on a Web site. Commonly cited advantages include
easy access, instant distribution, and reduced costs. In
addition, the Internet allows questionnaires and sur-
veys to reach a worldwide population with minimum
cost and time. Researchers can contact rare and hid-
den populations that are often geographically dis-
persed,3
as well as patient populations different from
those typically seen in the clinical or hospital set-
ting.2,10
Other reported benefits relate to graphical and inter-
active design on the Web. Ideally, HTML survey forms
enhance data collection, compared with conventional
surveys, because of their use of color, innovative
screen designs, question formatting, and other fea-
tures not available with paper questionnaires. They
can prohibit multiple or blank responses by not allow-
ing the participant to continue on or to submit the
survey without first correcting the response error. This
feature is somewhat controversial, because there may
be legitimate reasons for not answering questions, and
responses such as "don't know" or "prefer not to answer" force an answer when participation and question response are supposed to be voluntary.11
Regard-
less of one’s view on this issue, the program can
provide cues to make sure the respondent does not
inadvertently skip a question. In addition, coding er-
rors and data entry mistakes are reduced or elimi-
nated while compilation of results can be automated.12
Finally, online forms can help minimize costs, facili-
tate rapid return of information by participants, and
allow timely dissemination of results by investiga-
tors.13
Several examples show how the Internet is used for
survey research. Physicians in Germany developed a
Web-based patient information system about atopic
eczema to attract patients to the Web site and Internet
survey.2
The purpose of the survey was to explore the
relations between atopic stigmata and its symptoms,
predisposing factors, patient demographics, and as-
sociations with other diseases. As an incentive to fill
out the survey, an atopy score was calculated and pre-
sented to the participant upon completion. Approxi-
mately 240 subjects complete the survey each month.
Healthy Web surfers serve as controls.2
In another study, researchers at Columbia University
explored the properties of a new measure of sexual
orientation by monitoring network traffic on an intra-
net over a two-week period and collecting all postings
to two newsgroups related to their topic of study.3
From the formulated list of e-mail addresses, 360 sub-
jects were randomly selected. Subjects were notified
of their selection, and those who consented to partic-
ipate were e-mailed a survey. Of the participants who
were contacted, 66.1 percent provided their consent to
participate and 56.4 percent of that group returned
completed surveys.3
Veterinarians conducted research via e-mail and Web
pages to investigate causes of dog death in small vet-
erinary practices.4
In this study, 25 veterinarians sub-
mitted case material. On the basis of analysis by re-
gion and school attended, the investigators found that
participants were representative of the veterinarian
population in the United States.
Nursing researchers have found the Internet a valu-
able vehicle for collecting data from cancer survivors.7
In this study, three cancer-related newsgroups were
used to distribute the Cancer Survivors Survey Ques-
tionnaire. This method proved useful for collecting
preliminary data, which are often needed to demon-
strate the feasibility of conducting a large-scale study
and for determining adequate sample size.
Theoretically, conducting research over the Internet
has many benefits. However, survey experts and re-
searchers warn that the current online population is
not representative of the general population in the
United States. Estimates of computer ownership and
e-mail access vary depending on how data were gathered, e.g., face-to-face or via telephone, and how they are reported, e.g., household computer ownership vs. "access to" computers.14,15
For example, in 48,000 face-to-
face interviews conducted in 1997, 37 percent of
households in the United States reported owning a
computer, 19 percent reported online access, and 17
percent reported e-mail access. In comparison,
through telephone polls, 67 percent reported having
access to a computer and 31 percent had an e-mail
address.
While access to e-mail and the Internet grows daily, a
‘‘digital divide’’ exists among age and racial groups,
income levels, and geographic settings.14
Ensuring
that each potential respondent has an equal chance of
being selected to participate poses a major challenge
in conducting a scientifically sound survey. This is es-
pecially true in the health sciences, where electronic
access to specific provider or patient groups cannot
easily be obtained.9
Currently, not all health profes-
sional associations or licensing boards collect e-mail
addresses, nor is it possible to estimate the number of
individuals with the particular health state of interest
who have access to computers and the Internet.7
How-
ever, rigorous sample selection procedures must be
followed if results are to be generalized to a popula-
tion and sources of coverage and sampling error are
to be kept to a minimum.11,16
Unfortunately, the sampling procedures reported in
many electronic surveys reflect unknown sam-
ples.2,3,5,13
When subjects are recruited by targeting
newsgroups or search engines, it is nearly impossible
to determine the distribution of the sample popula-
tion. These survey procedures should be used only
when sampling and self-selection biases can be toler-
ated.
Another concern unique to conducting electronic sur-
veys is the variation of the level of computer literacy
among respondents and the capabilities of their com-
puters. Internet users tend to be highly educated
white men between the ages of 26 and 30 years.13
Even
so, their experience responding to online question-
naires may be limited. Thus, Web-based surveys need
to have clear directions on how to perform each
needed skill, e.g., how to enter answers with a drop-
down box or erase responses from a check box,11 so
that responding to the questionnaire does not become
a frustrating experience.
Providing specific instructions will assist respondents
in accurately completing and returning the survey,
provided their computer is capable of receiving it in
the first place. Differences among computers, such as
their processing power, memory, connection speeds,
and browsers, potentially negate some of the benefits
purported for using the Web. For example, the use of
graphics and animation may increase the attractive-
ness and novelty of participating. However, advanced
Web programming features, such as Java, JavaScript,
DHTML, or XML either may be incompatible with
certain browsers or may cause them to respond slowly
or crash. In The Influence of Plain vs. Fancy Designs on
Response Rates for Web Surveys, Dillman et al.17
showed
that such features can actually lower response rates.
In this study, a plain questionnaire obtained a higher
response rate than one that used tables and colors.
The plain design also was more likely to be fully com-
pleted in a shorter period of time.
Dillman et al. proposed three criteria and 11 support-
ing principles for designing respondent-friendly Web
questionnaires, some of which were used to guide the
development of the study presented in this article.11
These criteria include:
■ Take into account the inability of some respondents to receive and easily respond to Web questionnaires with advanced programming features, because of equipment, browser, and/or transmission limitations.
■ Take into account both the logic of how computers operate and the logic of how people expect questionnaires to operate.
■ Take into account the likelihood that a Web questionnaire will be used in mixed-mode survey situations.
The next section describes the purpose of the survey
described in this article, how the sample was selected,
and how the survey was designed, pilot tested, and
administered.
Survey Development and Administration
The Study
The Web-based survey described in this article was
designed to investigate the use of the Internet in clin-
ical practice by 450 dental professionals. There were
three primary reasons for choosing a Web-based sur-
vey method. First, the survey population used e-mail,
since all participants subscribed to an Internet discus-
sion list. Use of e-mail is not an absolute indicator of
Web use; however, since discussions often referenced
Web sites, it seemed likely that the majority of indi-
viduals used the Web. The survey results confirmed
this assumption. Second, because of an imposed dead-
line, survey development, implementation, and data
analysis had to be completed within eight weeks,
which made it impossible to conduct a traditional
mail survey. Finally, funds or other resources for the
production of a hardcopy survey, postage, and data
entry were not available.
Sample Selection
A random sample of dentists could not be selected
because no comprehensive list of dentists with e-mail
addresses was available. Consequently, the largest dis-
cussion list for general dentistry (Internet Dental Fo-
rum) was identified. Selection of the discussion list
permitted identification of the total population and
controlled follow-up with nonrespondents, blending
a methodologically sound approach with a new
method of collecting data. The investigators believed
that selection of this convenience sample, although
not representative of all dentists with Internet access,
was more appropriate than soliciting volunteers from
general sites with unknown populations. Dr. D.
Dodell, list owner of the Internet Dental Forum and
member of the project team, made the list of e-mail
addresses available. Institutional Review Board ap-
proval for this survey was not sought, since the
project was exempt under 45 CFR §46.101(b)(2).
Survey Design
A 22-question survey instrument with a total of 102
discrete answers was developed. Rather than being
presented on a single, lengthy Web page, questions
were grouped on 18 sequential screens, for two rea-
sons. First, sequential screens kept transmission time
to a minimum and avoided potential server time-outs
for respondents with slow modem connections (33
kbps and below). Second, the use of sequential
screens allowed questions to be displayed completely
and prevented the need for participants to scroll
through pages and potentially get lost.
Figure 1 illustrates some of the design features of the
survey. All screens were designed to display fully at
a screen resolution of 800 × 600 pixels. Most screens
contained a single question. The top of the screen dis-
played a static 6-KB JPEG banner with a small picture
(which emphasized the clinical aspect of the survey)
and the title of the survey. Each question was dis-
played in bold. List boxes, radio buttons, and check
boxes provided answers for close-ended questions.
Text fields were available for answers to open-ended questions.

[Figure 1. Sample screen of the survey.]

Formatting the answers in a table ensured
consistency of layout for different browsers, operating
systems, and window sizes. The lower left corner of
each screen indicated the participant’s relative posi-
tion in the total number of survey screens (in gray).
The lower right corner contained buttons to clear the
current screen and to move to the next screen. The
total file size of each screen averaged about 9 KB.
Several published recommendations and findings for
designing survey screens were followed.11,17,18
The
small file size minimized download time. Formatting
clearly differentiated questions and answers and de-
emphasized secondary screen elements. Consistent
layout reduced the number of required cognitive ad-
justments and allowed participants to concentrate on
answering the questions. The ‘‘next’’ button, com-
bined with the relative screen indicator, encouraged a
page-turning rhythm that resembled completing a
hardcopy survey.
To minimize incompatibilities with browsers, survey
pages were compliant with HTML 3.0. Neither
JavaScript, which is often employed to validate entry
fields on the Web, nor Java, ActiveX controls, and
other advanced Web programming concepts were
used for the reasons cited above.
The survey was programmed in PL/SQL on an Oracle
8 database server. Programming took approximately
35 hours. Code review and testing added another
eight hours. Since the code was going to be used only
once, the programmer neither optimized the code for
performance and maintenance nor added detailed
comments. The total length of the program was 2,471
lines. During the code review, the program was re-
viewed line by line. In the testing phase, each question
was answered and the corresponding entry checked
in the database. Initially, the survey was programmed
to validate every screen (e.g., to check that all fields
were filled out, that the zip code was formatted cor-
rectly). This feature was deactivated on the basis of
the results of the pilot test. A second program also
was developed to send survey messages to all partic-
ipants. This program took three hours to develop and
test and totaled 895 lines.
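Because the original PL/SQL code is not reproduced in the paper, the deactivated per-screen validation can only be illustrated. The following minimal Python sketch shows the general idea; the field names and the zip-code pattern are hypothetical:

```python
import re

def validate_screen(fields):
    """Return a list of error messages for one survey screen.

    Illustrative sketch only: the survey's validation was written in
    PL/SQL and was deactivated after pilot testing. Field names and
    the zip-code pattern here are hypothetical.
    """
    errors = []
    # Check that every field on the screen was filled out.
    for name, value in fields.items():
        if not value.strip():
            errors.append("Field '%s' was left blank." % name)
    # Check that the zip code, if present, is formatted correctly.
    zip_code = fields.get("zip", "")
    if zip_code and not re.fullmatch(r"\d{5}(-\d{4})?", zip_code):
        errors.append("Zip code must look like 12345 or 12345-6789.")
    return errors
```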
Pilot Testing
After programming, the survey was pilot-tested in-
house and with several remote participants. The pro-
gram was tested with two different browsers
(Netscape Communicator versions 4.0 and 4.5 and In-
ternet Explorer version 4.0), three operating systems
(Windows NT 4.0, Windows 95, and Macintosh OS
7.5), two types of Internet access (high-speed local
area network and modem dial-up line), and three dif-
ferent Internet service providers. The pilot test did not
uncover any technical problems. However, the word-
ing on some questions was slightly modified, and the
validation for required input fields was dropped. Sev-
420 SCHLEYER, FORREST, Web-based Survey Design and Administration
eral pilot-testers felt that requiring entries in all fields
was too restrictive, especially when they felt that a
question did not apply to them personally or when
the response choices did not exactly match their ex-
pectations.
Survey Administration
Next, a list of participants’ e-mail addresses was gen-
erated from the list of subscribers to the discussion
list. This list was imported into Microsoft Excel and a
unique, random four-character survey ID for each
participant was generated. The IDs were composed of
letters and numbers. Participants entered their IDs to
authenticate themselves and access the survey. The list
was then imported from the Excel file into the Oracle
database.
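The paper does not show the ID-generation step itself. A minimal Python sketch of the approach might look as follows; note that excluding visually ambiguous characters (0/O, 1/l) is our suggestion, prompted by the login problems reported below, and was evidently not part of the original procedure:

```python
import random
import string

# Alphanumeric alphabet. Dropping visually ambiguous characters
# (0/O, 1/I/l) is a suggested improvement, not part of the original
# survey, whose IDs led to 0-vs-O and 1-vs-l confusion.
ALPHABET = "".join(c for c in string.ascii_uppercase + string.digits
                   if c not in "0O1IL")

def assign_survey_ids(addresses, length=4):
    """Map each e-mail address to a unique, random survey ID."""
    ids = set()
    while len(ids) < len(addresses):
        ids.add("".join(random.choice(ALPHABET) for _ in range(length)))
    return dict(zip(addresses, ids))
```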
The main survey page (for introduction and login)
was hosted on the discussion list server (idf.stat.com)
rather than the Oracle server (heracles.dental.temple.edu). Although this method avoided confusion
about the origin of the survey, it prevented the inves-
tigators from providing a URL that would have di-
rectly logged participants into the Oracle server. To
begin data collection and identify any significant
problems, a survey message was initially sent to 47
participants. The messages originated on the Oracle
server were sent through an SMTP mailer program
that spoofed* an e-mail address on the server hosting
the discussion list. Participants received a personal
message stating the purpose of the survey, who was
conducting it, the estimated time required to complete
it, the URL of the survey, the survey ID, and who to
contact with questions. The message also instructed them to return duplicate e-mail messages with the subject line "DUPLICATE." Once authenticated through their survey ID, participants could begin answering questions.

*Spoofing is a common technique to fool hardware and software in networked environments. Spoofing an e-mail address, for instance, makes a message appear to be sent from someone other than the actual sender.
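The actual bulk mailer was a custom PL/SQL program. Purely as an illustration of the mechanics, a functionally similar sketch using Python's standard smtplib is shown below; the host name, subject line, and message text are placeholders:

```python
import smtplib
from email.message import EmailMessage

SMTP_HOST = "localhost"        # placeholder for the sending (Oracle) server
FROM_ADDR = "survey@stat.com"  # "spoofed" From address on the list server

def send_invitations(participants, survey_url):
    """Send a personalized invitation to each (address, survey_id) pair."""
    with smtplib.SMTP(SMTP_HOST) as smtp:
        for address, survey_id in participants:
            msg = EmailMessage()
            msg["Subject"] = "Survey: the Internet in clinical practice"
            msg["From"] = FROM_ADDR   # appears to come from the list server
            msg["To"] = address
            msg.set_content(
                "Please complete our survey at %s\n"
                "Your personal survey ID is: %s\n"
                "If you received this message more than once, please reply "
                "with the subject line DUPLICATE." % (survey_url, survey_id)
            )
            smtp.send_message(msg)
```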
The group of individuals who responded to the first
mailing did not report any problems. However, sev-
eral problems were identified when the survey was
mailed to the remaining 403 individuals:
■ Several individuals stated that they entered their
survey ID, pushed submit, and then could not pro-
ceed with the survey because of an error message.
Unfortunately, the source of this problem could not
be tracked down. Most of these participants used
America Online as their Internet service provider.
Consequently, an ASCII copy of the survey was
mounted on the home page to provide an alterna-
tive method of answering the survey. Participants
were advised to use this method of replying if the
Web form failed. At the same time, participants
who had had problems were provided with an ex-
planation. A copy of the survey was included in the
reply. Completed surveys received through e-mail
or fax were entered by hand into the database later.
■ Some international users with slow modem con-
nections reported that they received a server time-
out when trying to answer the survey. As a result,
the timeout period for client responses to the server
was increased from 60 seconds to five minutes.
■ When typing in their survey ID, several respondents mistook the digit "0" for the letter "O" and the digit "1" for the letter "l," and vice versa. Thus,
the server rejected their ID. Since a complete URL
could not be provided for respondents to click on
to access the survey, and since most participants ob-
viously did not copy and paste their survey IDs
into the field, respondents were advised by e-mail
of the correct way of entering their survey ID.
■ Several users were not aware that they could in-
clude the text of the original message in their reply
or paste the survey from the Web page into an e-
mail message. Instead, they printed the survey and
returned the completed hardcopy by fax. Although
this was not a major problem, it delayed the entry
of approximately ten surveys into the database.
■ After receiving approximately 130 surveys, the re-
sponses to one question revealed that participants
seemed to choose only two of the four responses on
a Likert scale. A review of the program revealed an
error that stored answers incorrectly. The error was
corrected, and the incorrectly stored answers were
discarded.
Three additional messages were sent to non-respon-
dents during the following two weeks. Figure 2 shows
the dates and times of the messages, the distribution
of responses, and the cumulative response rate.
The response rate for surveys entered via the Web was
32.9 percent—144 of 438 (adjusted) participants—
after the initial mailing. The first follow-up mailing
resulted in receipt of 76 surveys and brought the total
response up to 50.2 percent. The second follow-up
mailing raised the response rate to 57.1 percent, and
the third raised it to 64.4 percent (30 and 32 responses,
respectively). The 52 surveys returned by e-mail or fax
were entered by hand and increased the final response
rate to 74.2 percent (334 of 438 participants).

[Figure 2. Number of responses received (left scale) and cumulative response rate (right scale, in percent) by date. The graph includes surveys entered through the Web only and indicates when survey mailings were sent.]

We sent
a total of 1,132 e-mail messages to participants of this
survey. To increase the response rate, the survey was
directly included in the e-mail message in the second
and third follow-up mailings. In addition, while the
initial messages and the first follow-up messages were
sent from a generic e-mail account (survey@stat.com),
the subsequent follow-up messages were sent from
the listowner’s account directly, with a personal re-
quest for a response to the survey.
The next section compares the costs of this survey
with costs if the survey had been administered by
mail. General breakeven equations for the sample size
using Web-based vs. mail surveys are presented. The
section concludes with a comparison of characteristics
of Web and e-mail/fax respondents.
Cost and Response Pattern Analysis
Costs
Costs were calculated to assess the cost-effectiveness
of the Web-based survey method for planning future
surveys. Table 1 shows the costs for the Web-based
survey compared to the costs of an equivalent mail
survey. The comparison excludes costs that are the
same regardless of the survey methodology, such as
design of the survey instrument and pilot-testing.
Also excluded is the cost of obtaining the mailing list.
As Table 1 shows, the total cost of the Web survey
was $1,916, comprising the costs for programming
and testing the survey, programming the bulk mailer
program, and performing limited manual data entry.
If all respondents had completed the Web-based sur-
vey successfully, the cost would have dropped by
$206. Costs for an equivalent mail survey are calcu-
lated both for a non-anonymous survey (our case) and
an anonymous survey. The two alternatives differ in
their cost of preparing and mailing surveys. In non-anonymous surveys, follow-up surveys after the initial mailing are prepared for and sent to nonrespondents only.
Anonymous surveys require that all participants re-
ceive the initial and all follow-up mailings and are
thus somewhat more expensive. We present different
breakeven calculations (see below) for these two op-
tions.
If our survey had been administered as a mail survey,
its total cost would have been $3,092, including the
cost of preparing 1,132 mailings, mailing costs, post-
age for returned mailings, and data entry. The Web survey was 38 percent cheaper than the equivalent
(non-anonymous) mail survey. The figures used for
the calculations in Table 1 represent local costs for a
mail survey of 450 individuals. In other settings, costs
might differ on the basis of factors such as sample
size, reproduction costs, study requirements, pro-
gramming costs, and data entry costs. Costs arising
from handling technical problems for the Web-based
survey (e.g., responding to user questions) were dis-
regarded, since the required time was minimal and an
improved design could have avoided most of those
problems.
Table 1. Cost Comparison of a Web-based Survey and Non-anonymous and Anonymous Mail Surveys

Web-based survey:
- Survey preparation: programming (35 hrs × $30), $1,050; testing and code review (8 hrs × $60), $480
- Distribution and return: bulk mailer program (3 hrs × $60), $180; sending e-mail messages, $0
- Data entry: entry by participant, $0; manual entry of 52 surveys (52 × 6 min × $0.66), $206
- Total: $1,916

Non-anonymous mail survey (1,132 mailings):
- Survey preparation: printing/duplication of a 7-page survey and the cover letter ($0.08/page), $724; envelopes at $0.055/envelope, $62; business reply envelopes at $0.033/envelope, $37
- Distribution and return: mailing of surveys at $0.65/envelope ($0.55 postage + $0.10 envelope stuffing), $736; return postage on 334 surveys ($0.63/survey), $210
- Data entry: entry by investigator/staff member (334 × 6 min × $0.66), $1,323
- Total: $3,092

Anonymous mail survey (1,800 mailings — 4 × 450 participants):
- Survey preparation: printing/duplication, $1,152; envelopes, $99; business reply envelopes, $59
- Distribution and return: mailing of surveys, $1,170; return postage on 334 surveys, $210
- Data entry: entry by investigator/staff member, $1,323
- Total: $4,013

NOTE: The calculations for the non-anonymous survey use actual response rates. The calculations for the anonymous survey assume that each participant receives the mailing four times, whether they replied or not. Personnel costs include fringe benefits (30%).

As Table 1 shows, the cost of a Web-based survey is independent of the sample size, whereas the costs of a mail survey vary with the initial sample size as well
as with the incremental and the total response rate.
Avoiding manual entry of completed surveys gener-
ated significant cost savings, a fact that has not been
lost on transaction-intensive industries such as air-
lines and banks.
To assist others in choosing between Web-based and
mail surveys on the basis of cost (assuming that all
other variables are equal), we use a standard break-
even calculation19
to determine the sample size for
which both types of surveys cost exactly the same.
This point is the threshold for which conducting one
type of survey becomes more cost-effective than the
other.
Equations 1 and 2 in Table 2 are used to calculate the
breakeven point for non-anonymous and anonymous
surveys, respectively, when the incremental and/or
the final response rates can be estimated. When that
is not possible, equations 3 and 4 can be used to cal-
culate lower and upper bounds for the breakeven
point.
For the described survey, n would have been 245 un-
der the idealistic assumption that all respondents suc-
cessfully answered the survey through the Web. Even if the costs of entering the surveys manually are included, the breakeven point rises only to 274. Thus,
with a sample size of approximately 275 or below, a
mail survey would have been more economical than
a Web-based survey.
Equation 1 makes two assumptions of practical sig-
nificance. First, it assumes that exactly as many hard-
copy surveys are prepared as needed. In reality, this
is rarely possible. Equation 1 thus will often reflect
slightly lower costs for a mail survey than are achiev-
able, slanting the comparison in favor of mail surveys.
Second, it assumes that incremental and final re-
sponse rates can be estimated. When this is not pos-
sible, we can calculate a range for the breakeven point
by approximating the extreme values of equation 1
through equations 3 and 4. Using our costs, the lower
bound for the breakeven point would have been 190
and the upper bound 347. This means that for a sam-
ple size of 189 or less, a mail survey would have been
more economical, and with a sample size of 348 or
more, a Web-based survey. In between, the cost advantage would have depended on the actual incremental and final response rates.

Table 2. Equations to Determine Breakeven Sample Size

If incremental and/or final response rates can be estimated:

Non-anonymous survey (repeat mailings are sent to nonrespondents only):

$$n = \frac{c_{PR}}{\left(m - \sum_{i=1}^{m-1} r_i\right)(c_D + c_M) + r_m\,c_E} \quad (1)$$

Anonymous survey (repeat mailings are sent to all participants):

$$n = \frac{c_{PR}}{m\,(c_D + c_M) + r_m\,c_E} \quad (2)$$

If incremental and/or final response rates cannot be estimated (both types of survey):

Lower bound:

$$n_l = \frac{c_{PR}}{m\,(c_D + c_M) + c_E} \quad (3)$$

Upper bound:

$$n_u = \frac{c_{PR}}{m\,(c_D + c_M)} \quad (4)$$

NOTE: If incremental or final response rates cannot be estimated, equations 3 and 4 can be used to determine the boundaries for the breakeven point. The equations assume that the denominator is not zero and that the Web-based survey is entered successfully by each respondent. In addition, equations 1 and 2 assume that the cumulative response rate never reaches 100% before the last mailing. The variables are as follows: n, breakeven point; n_l, n_u, lower and upper bounds for the breakeven point; m, number of mailings; r_1, r_2, ..., r_m, cumulative response rate after the 1st, 2nd, ..., mth mailing (r_m is the final response rate); c_PR, cost of programming the survey; c_D, cost of preparation per survey (duplication, envelopes, etc.); c_M, cost of mailing per survey; c_E, cost of receipt and data entry per survey.

Equations 3 and 4
thus provide a useful heuristic for determining a
range for the breakeven point if basic costs for the two
survey methods are known.
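To make the calculation concrete, the following Python sketch reproduces the breakeven figures reported above from the unit costs in Table 1, using m = 4 mailings, cumulative response rates of 32.9, 50.2, and 57.1 percent after the first three mailings, a final response rate of 74.2 percent, and programming costs of $1,710 (all-Web case) or $1,916 (including manual entry):

```python
def breakeven_nonanonymous(c_pr, c_d, c_m, c_e, rates):
    """Equation 1; rates holds the cumulative response rates r_1 ... r_m."""
    m, r_m = len(rates), rates[-1]
    return c_pr / ((m - sum(rates[:-1])) * (c_d + c_m) + r_m * c_e)

def breakeven_bounds(c_pr, c_d, c_m, c_e, m):
    """Equations 3 and 4: lower and upper bounds for the breakeven point."""
    return (c_pr / (m * (c_d + c_m) + c_e),
            c_pr / (m * (c_d + c_m)))

# Unit costs per survey, taken from Table 1 (dollars):
c_d = 0.64 + 0.055 + 0.033   # printing (8 pages x $0.08) + two envelopes
c_m = 0.65                   # postage and envelope stuffing
c_e = 0.63 + 6 * 0.66        # return postage + 6 minutes of data entry
rates = [0.329, 0.502, 0.571, 0.742]  # cumulative response after each mailing

print(breakeven_nonanonymous(1710, c_d, c_m, c_e, rates))  # 244.8 -> 245 (all-Web)
print(breakeven_nonanonymous(1916, c_d, c_m, c_e, rates))  # 274.3 -> 274
print(breakeven_bounds(1916, c_d, c_m, c_e, m=4))          # (189.7, 347.6); cf. 190 and 347
```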
Comparison of Web and
E-mail/Hardcopy Responses
The goal of using a Web-based survey was to have all
respondents complete the questionnaire using the
Web. However, we had to provide an alternative
method to avoid converting individuals with techni-
cal or user problems into non-respondents. Once we
allowed participants to answer the survey by e-mail
or fax, some may have chosen one of these methods
based on personal preference or convenience. The
data were reviewed to discern potential patterns that
might distinguish Web from hardcopy respondents.
Sixteen percent of the 334 respondents returned the
completed survey by e-mail or fax. Although no dif-
ferentiation was made between e-mail and fax re-
sponses, the majority were received by e-mail. Amer-
ica Online users submitted 31 percent of the e-mail/
fax responses and 12 percent of the Web responses.
E-mail messages about technical problems were re-
ceived most frequently from America Online users.
Thus, at least some of the technical problems were
due to incompatibilities with America Online. One
reason that some AOL users were successful in sub-
mitting the survey through the Web may have been
their use of different versions of the AOL client soft-
ware.
Chi-squared tests (P = 0.05) were used to test for independence between the type of response (either Web or e-mail/fax) and the following variables: top-level domain of the participant (either com, net, or other); self-reported computer experience ("not at all comfortable," "not very comfortable," "comfortable," "very comfortable"); self-reported years of Internet experience (1, 2, 3, 4, 5, 6, >6); and the number of fields left empty on the survey (<21, 21–25, 26–30). Self-reported computer experience showed a significant relationship with the type of response (χ² = 10.3; df = 2; P = 0.006), indicating that respondents more comfortable with computers tended to be more successful in completing the Web survey. Respondents answering the Web survey also tended to complete more fields on the survey (χ² = 37.3; df = 2; P = 0.001).
The other two variables showed no relationship to the
type of response. Thus, two hypotheses for future
studies could be that successful completion of Web
surveys is dependent on computer experience and
that respondents to Web surveys complete more ques-
tionnaire items than e-mail/hardcopy respondents.
Conclusion
Several authors have proposed guidelines on how to
conduct Web-based surveys.11,13,17,18,20
However, few
papers report the use of this method, and none de-
scribe its procedural aspects in detail.2–5
One primary
reason that the Web is not used frequently for large-
scale or general surveys may be that Web access is not
a given. Although 19 percent of households in the
United States reported being online in 1997, not all
members of each household may use the computer.21
Even in a recent study, the first step in the survey was
to find out whether participants used the Internet or
not.17
Although consumer research companies are be-
ginning to tap AOL’s 17 million users for market re-
search,22
it has not been proved that AOL’s users are
representative of the U.S. population at large. Until
the e-mail address becomes as widely used as the
postal address, large survey populations cannot be
surveyed using the Web or e-mail alone.
As previously mentioned, some authors advocate pub-
lishing Web surveys through newsgroups, indexes, and
search engines.13
However, the resulting selection bias
makes the results less valid and generalizable. Where
defined populations are accessible through the Internet,
a Web-based survey can be an effective method of
gathering data using rigorous survey methodologies.
However, even a relatively large group of Internet
users may still represent a convenience sample that al-
lows generalization to only that group.
Several authors have made recommendations for
Web-survey design.11,18
On the basis of our experi-
ences in this case study, some additional recommen-
dations can be made:
■ Consider Web-based surveys to be software development projects. Unless an off-the-shelf survey applica-
tion23,24
is used, several tools must be integrated
(such as HTML forms and PERL scripts). Some-
times survey applications must be custom-pro-
grammed. A survey application should be tested
thoroughly to reduce the number of software de-
fects and incompatibilities. Depending on the sam-
ple population, testing variables may include op-
erating systems, browsers, Internet service
providers, and Internet connection types. A system-
atic approach to identifying all variables and testing
can reduce the chance of failure.
■ Usability and survey design should reflect the character-
istics of the sample population and its environment. If
the computer literacy of the sample population is
not known, the survey should be easy to complete
in as few steps as possible. In this case study, simple
usability issues became major problems for some
participants. Among participants who have a high
degree of computer literacy, usability may be less of an issue. Likewise, surveys designed for unknown
computing environments should use the lowest
common technical denominator. In contrast, a sur-
vey application could use advanced programming
techniques that match the capabilities of standard-
ized computing environments on an intranet.
■ Pilot-test with a sufficiently large random sample of the
sample population. It is often impossible to test all
environments for a survey before its release. Thus,
it is very important to pilot-test with a sufficiently
large random sample of users. This will, it is hoped,
account for computer literacy levels and the range
of computer configurations.
■ Scrutinize early returns earlier. Scrutinizing early re-
turns is important in mail surveys.25
The potentially
immediate response to electronic surveys makes
this recommendation even more important. As this
example has shown, close monitoring of early re-
turns can identify problems that were not caught
during the pilot test, from technical issues to pro-
gram bugs. Immediate resolution of such problems
is required to prevent an unnecessarily large por-
tion of non-respondents or increased measurement
error.
This case study has shown that surveys administered
through the Web can, compared with mail surveys,
potentially lower costs, reduce survey administration
overhead, and collect survey data quickly and effi-
ciently. However, it also confirms that a number of
variables have the potential to influence survey re-
sponse and measurement negatively, such as incom-
patibility with the target computing environment, sur-
vey usability, computer literacy of participants, and
program defects. In this case study, respondents to the
Web survey tended to complete more questionnaire
items than respondents who used e-mail or fax. Be-
cause of its potential impact on the quality of data
collection, this finding should be validated in future
studies.
The authors thank Dr. R. Kenney, Dr. D. Dodell, and N. Dovgy
for their help in conducting this project, Dr. Sorin Straja for his
help with the statistical analysis, Ms. Syrene Miller for her as-
sistance in the literature review, and the reviewers for their
helpful suggestions and comments.
References
1. Schleyer T, Forrest J, Kenney R, Dodell D, Dovgy N. Is the
Internet useful in clinical practice? J Am Dent Assoc. 1999;
130:1502–11.
2. Eysenbach G, Diepgen TL. Epidemiological data can be
gathered with World Wide Web. BMJ. 1998;316(7124):72.
3. Sell RL. Research and the Internet: an e-mail survey of sex-
ual orientation. Am J Public Health. 1997;87(2):297.
4. Gobar GM, Case JT, Kass PH. Program for surveillance of
causes of death of dogs, using the Internet to survey small
animal veterinarians. J Am Vet Med Assoc. 1998;213(2):251–
6.
5. Schleyer T, Spallek H, Torres-Urquidy MH. A profile of cur-
rent Internet users in dentistry. J Am Dent Assoc. 1998;129:
1748–53.
6. Lakeman R. Using the Internet for data collection in nursing
research. Comput Nurs. 1997;15(5):269–75.
7. Fawcett J, Buhle EL Jr. Using the Internet for data collection:
an innovative electronic strategy. Comput Nurs. 1995;13(6):
273–9.
8. Swoboda WJ, Mühlberger N, Weitkunat R, Schneeweiß S.
Internet surveys by direct mailing. Soc Sci Comput Rev.
1997;15(3):242–55.
9. Hilsden RJ, Meddings JB, Verhoef MJ. Complementary and
alternative medicine use by patients with inflammatory
bowel disease: an Internet survey. Can J Gastroenterol. 1999;
13(4):327–32.
10. Soetikno RM, Mrad R, Pao V, Lenert LA. Quality-of-life re-
search on the Internet: feasibility and potential biases in pa-
tients with ulcerative colitis. J Am Med Inform Assoc. 1997;
4(6):426–35.
11. Dillman D, Tortora RL, Bowker D. Principles for construct-
ing Web surveys. Presented at the Joint Meetings of the
American Statistical Association; Dallas, Texas; August
1998.
12. Clark R, Maynard M. Research methodology: using online
technology for secondary analysis of survey research data
—‘‘act globally, think locally.’’ Soc Sci Comput Rev. 1998;
16(1):58–71.
13. Houston JD, Fiore DC. Online medical surveys: using the
Internet as a research tool. MD Comput. 1998;15(2):116–20.
14. National Telecommunications and Information Administra-
tion. Falling through the Net: new data on the digital di-
vide. Available at: http://www.ntia.doc.gov/ntiahome/
net2/falling.html. Accessed Dec 8, 1999.
15. Intelliquest. Latest IntelliQuest survey reports 62 million
American adults access the Internet/online services. Avail-
able at: http://www.intelliquest.com/press/release41.asp.
Accessed Dec 8, 1999.
16. Groves R. Survey errors and survey costs. New York: John
Wiley, 1989.
17. Dillman D, Tortora RL, Conradt J, Bowker D. Influence of
plain vs. fancy design on response rates for Web surveys.
In: Proceedings of the Joint Statistical Meetings, Survey
Methods Section. Alexandria, Va: American Statistical As-
sociation, 1998.
18. Turner JL, Turner DB. Using the Internet to perform survey
research. Syllabus. 1999;12(Jan):55–6.
19. Weygandt J, Kieso D, Kell W. Accounting Principles. 2nd
ed. New York: John Wiley, 1990.
20. Turner JL, Turner DB. Using the Internet to perform survey
research. Syllabus. 1998;12(Nov/Dec):58–61.
21. Stets D. Who's using computers? Philadelphia Inquirer. Nov 19, 1995: sect D3.
22. Kranhold K. Foote Cone turns to AOL for online polls. Wall
Street Journal. Jul 28, 1999:sect B1.
23. Senecio Software Inc. Online and disk-by-mail surveys.
Available at: http://www.senecio.com/. Accessed Dec 8,
1999.
24. Perseus Development Corporation. A survey software pack-
age for conducting Web surveys. Available at: http://
www.perseus.com/. Accessed Dec 8, 1999.
25. Dillman DA. Mail and Telephone Surveys. New York: John
Wiley, 1978.

 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 

Recently uploaded (20)

Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 

persed,3 as well as patient populations different from those typically seen in the clinical or hospital setting.2,10

Other reported benefits relate to graphical and interactive design on the Web. Ideally, HTML survey forms enhance data collection, compared with conventional surveys, because of their use of color, innovative screen designs, question formatting, and other features not available with paper questionnaires. They can prohibit multiple or blank responses by not allowing the participant to continue or to submit the survey without first correcting the response error. This feature is somewhat controversial, because there may be legitimate reasons for not answering questions, and responses such as "don't know" or "prefer not to answer" force an answer when participation and question response are supposed to be voluntary.11 Regardless of one's view on this issue, the program can provide cues to make sure the respondent does not inadvertently skip a question. In addition, coding errors and data entry mistakes are reduced or eliminated, while compilation of results can be automated.12 Finally, online forms can help minimize costs, facilitate rapid return of information by participants, and allow timely dissemination of results by investigators.13

Several examples show how the Internet is used for survey research. Physicians in Germany developed a Web-based patient information system about atopic eczema to attract patients to the Web site and Internet survey.2 The purpose of the survey was to explore the relations between atopic stigmata and their symptoms, predisposing factors, patient demographics, and associations with other diseases. As an incentive to fill out the survey, an atopy score was calculated and presented to the participant upon completion. Approximately 240 subjects complete the survey each month. Healthy Web surfers serve as controls.2

In another study, researchers at Columbia University explored the properties of a new measure of sexual orientation by monitoring network traffic on an intranet over a two-week period and collecting all postings to two newsgroups related to their topic of study.3 From the formulated list of e-mail addresses, 360 subjects were randomly selected. Subjects were notified of their selection, and those who consented to participate were e-mailed a survey. Of the participants who were contacted, 66.1 percent provided their consent to participate, and 56.4 percent of that group returned completed surveys.3

Veterinarians conducted research via e-mail and Web pages to investigate causes of dog death in small veterinary practices.4 In this study, 25 veterinarians submitted case material. On the basis of analysis by region and school attended, the investigators found that participants were representative of the veterinarian population in the United States.

Nursing researchers have found the Internet a valuable vehicle for collecting data from cancer survivors.7 In this study, three cancer-related newsgroups were used to distribute the Cancer Survivors Survey Questionnaire. This method proved useful for collecting preliminary data, which are often needed to demonstrate the feasibility of conducting a large-scale study and for determining adequate sample size.

Theoretically, conducting research over the Internet has many benefits. However, survey experts and researchers warn that the current online population is not representative of the general population in the United States. Estimates of computer ownership and e-mail access vary depending on how data were gathered, e.g., face-to-face or via telephone, and how they are reported, e.g., household computer ownership vs. "access to" computers.14,15 For example, in 48,000 face-to-face interviews conducted in 1997, 37 percent of households in the United States reported owning a computer, 19 percent reported online access, and 17 percent reported e-mail access. In comparison, through telephone polls, 67 percent reported having access to a computer and 31 percent had an e-mail address. While access to e-mail and the Internet grows daily, a "digital divide" exists among age and racial groups, income levels, and geographic settings.14

Ensuring that each potential respondent has an equal chance of being selected to participate poses a major challenge in conducting a scientifically sound survey. This is especially true in the health sciences, where electronic access to specific provider or patient groups cannot easily be obtained.9 Currently, not all health professional associations or licensing boards collect e-mail addresses, nor is it possible to estimate the number of individuals with the particular health state of interest who have access to computers and the Internet.7 However, rigorous sample selection procedures must be followed if results are to be generalized to a population and sources of coverage and sampling error are to be kept to a minimum.11,16

Unfortunately, the sampling procedures reported in many electronic surveys reflect unknown samples.2,3,5,13 When subjects are recruited by targeting newsgroups or search engines, it is nearly impossible to determine the distribution of the sample population. These survey procedures should be used only when sampling and self-selection biases can be tolerated.

Another concern unique to conducting electronic surveys is the variation in the level of computer literacy among respondents and in the capabilities of their computers. Internet users tend to be highly educated white men between the ages of 26 and 30 years.13 Even so, their experience responding to online questionnaires may be limited. Thus, Web-based surveys need clear directions on how to perform each needed skill, e.g., how to enter answers with a drop-down box or erase responses from a check box,11 so that responding to the questionnaire does not become a frustrating experience.
Providing specific instructions will assist respondents in accurately completing and returning the survey, provided their computer is capable of receiving it in the first place. Differences among computers, such as their processing power, memory, connection speeds, and browsers, potentially negate some of the benefits purported for using the Web. For example, the use of graphics and animation may increase the attractiveness and novelty of participating. However, advanced Web programming features, such as Java, JavaScript, DHTML, or XML, either may be incompatible with certain browsers or may cause them to respond slowly or crash. In "The Influence of Plain vs. Fancy Designs on Response Rates for Web Surveys," Dillman et al.17 showed that such features can actually lower response rates. In that study, a plain questionnaire obtained a higher response rate than one that used tables and colors. The plain design also was more likely to be fully completed in a shorter period of time.

Dillman et al. proposed three criteria and 11 supporting principles for designing respondent-friendly Web questionnaires, some of which were used to guide the development of the study presented in this article.11 These criteria include:

- Take into account the inability of some respondents to receive and respond to Web questionnaires with advanced programming features because of equipment, browser, and/or transmission limitations.

- Take into account both the logic of how computers operate and the logic of how people expect questionnaires to operate.

- Take into account the likelihood that a Web questionnaire will be used in mixed-mode survey situations.

The next section describes the purpose of the survey described in this article, how the sample was selected, and how the survey was designed, pilot-tested, and administered.

Survey Development and Administration

The Study

The Web-based survey described in this article was designed to investigate the use of the Internet in clinical practice by 450 dental professionals. There were three primary reasons for choosing a Web-based survey method. First, the survey population used e-mail, since all participants subscribed to an Internet discussion list. Use of e-mail is not an absolute indicator of Web use; however, since discussions often referenced Web sites, it seemed likely that the majority of individuals used the Web. The survey results confirmed this assumption. Second, because of an imposed deadline, survey development, implementation, and data analysis had to be completed within eight weeks, which made it impossible to conduct a traditional mail survey. Finally, funds or other resources for the production of a hardcopy survey, postage, and data entry were not available.

Sample Selection

A random sample of dentists could not be selected because no comprehensive list of dentists with e-mail addresses was available. Consequently, the largest discussion list for general dentistry (Internet Dental Forum) was identified. Selection of the discussion list permitted identification of the total population and controlled follow-up with nonrespondents, blending a methodologically sound approach with a new method of collecting data. The investigators believed that selection of this convenience sample, although not representative of all dentists with Internet access, was more appropriate than soliciting volunteers from general sites with unknown populations.

Dr. D. Dodell, list owner of the Internet Dental Forum and member of the project team, made the list of e-mail addresses available. Institutional Review Board approval for this survey was not sought, since the project was exempt under CFR §46.101 (b) (2).

Survey Design

A 22-question survey instrument with a total of 102 discrete answers was developed. Rather than being presented on a single, lengthy Web page, questions were grouped on 18 sequential screens, for two reasons. First, sequential screens kept transmission time to a minimum and avoided potential server time-outs for respondents with slow modem connections (33 Kbps and below). Second, the use of sequential screens allowed questions to be displayed completely and prevented the need for participants to scroll through pages and potentially get lost.

Figure 1 (a sample screen of the survey) illustrates some of the design features. All screens were designed to display fully at a screen resolution of 800 × 600 pixels. Most screens contained a single question. The top of the screen displayed a static 6 KB JPEG banner with a small picture (which emphasized the clinical aspect of the survey) and the title of the survey. Each question was displayed in bold. List boxes, radio buttons, and check boxes provided answers for close-ended questions. Text fields were available for answers to open-ended questions. Formatting the answers in a table ensured consistency of layout for different browsers, operating systems, and window sizes. The lower left corner of each screen indicated the participant's relative position in the total number of survey screens (in gray). The lower right corner contained buttons to clear the current screen and to move to the next screen. The total file size of each screen averaged about 9 KB.

Several published recommendations and findings for designing survey screens were followed.11,17,18 The small file size minimized download time. Formatting clearly differentiated questions and answers and de-emphasized secondary screen elements. Consistent layout reduced the number of required cognitive adjustments and allowed participants to concentrate on answering the questions. The "next" button, combined with the relative screen indicator, encouraged a page-turning rhythm that resembled completing a hardcopy survey. To minimize incompatibilities with browsers, survey pages were compliant with HTML 3.0. Neither JavaScript, which is often employed to validate entry fields on the Web, nor Java, ActiveX controls, and other advanced Web programming concepts were used, for the reasons cited above.

The survey was programmed in PL/SQL on an Oracle 8 database server. Programming took approximately 35 hours; code review and testing added another eight hours. Since the code was going to be used only once, the programmer neither optimized the code for performance and maintenance nor added detailed comments. The total length of the program was 2,471 lines. During the code review, the program was reviewed line by line. In the testing phase, each question was answered and the corresponding entry checked in the database. Initially, the survey was programmed to validate every screen (e.g., to check that all fields were filled out and that the zip code was formatted correctly); this feature was deactivated on the basis of the results of the pilot test. A second program was developed to send survey messages to all participants; it took three hours to develop and test and totaled 895 lines.
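The survey pages themselves were generated by PL/SQL procedures on the Oracle server, and no page source is reproduced in this paper. Purely as an illustration, the following Python sketch emits one screen with the layout described above: HTML 3.x-era markup, the question in bold, tabulated radio-button answers, a gray screen counter, and Clear/Next buttons. The function name, field names, and sample question are our assumptions, not the original implementation.

```python
# Illustrative sketch only: the actual survey was generated by PL/SQL
# procedures on an Oracle 8 server. This reproduces the described page
# layout (no JavaScript, table-based answers, ~9 KB per screen) for one
# hypothetical radio-button question.

def make_screen(screen_no, total_screens, question, choices, survey_id):
    """Render one survey screen; names and markup are hypothetical."""
    rows = "\n".join(
        f'<tr><td><input type="radio" name="q{screen_no}" value="{i}"></td>'
        f"<td>{text}</td></tr>"
        for i, text in enumerate(choices, start=1)
    )
    return f"""<html><head><title>Survey</title></head><body>
<img src="banner.jpg" alt="Survey banner">
<form method="post" action="/survey/screen{screen_no + 1}">
<input type="hidden" name="id" value="{survey_id}">
<p><b>{question}</b></p>
<table>{rows}</table>
<p><font color="gray">Screen {screen_no} of {total_screens}</font>
<input type="reset" value="Clear">
<input type="submit" value="Next"></p>
</form></body></html>"""

print(make_screen(3, 18, "How comfortable are you with computers?",
                  ["Not at all comfortable", "Not very comfortable",
                   "Comfortable", "Very comfortable"], "A7K2"))
```

Keeping each generated page small and free of scripting, as sketched here, mirrors the compatibility constraints the authors describe.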
Pilot Testing

After programming, the survey was pilot-tested in-house and with several remote participants. The program was tested with two different browsers (Netscape Communicator versions 4.0 and 4.5 and Internet Explorer version 4.0), three operating systems (Windows NT 4.0, Windows 95, and Macintosh OS 7.5), two types of Internet access (high-speed local area network and modem dial-up line), and three different Internet service providers. The pilot test did not uncover any technical problems. However, the wording of some questions was slightly modified, and the validation for required input fields was dropped.
Several pilot-testers felt that requiring entries in all fields was too restrictive, especially when they felt that a question did not apply to them personally or when the response choices did not exactly match their expectations.

Survey Administration

Next, a list of participants' e-mail addresses was generated from the list of subscribers to the discussion list. This list was imported into Microsoft Excel, and a unique, random four-character survey ID, composed of letters and numbers, was generated for each participant. Participants entered their IDs to authenticate themselves and access the survey. The list was then imported from the Excel file into the Oracle database.

The main survey page (for introduction and login) was hosted on the discussion list server (idf.stat.com) rather than the Oracle server (heracles.dental.temple.edu). Although this method avoided confusion about the origin of the survey, it prevented the investigators from providing a URL that would have directly logged participants into the Oracle server.

To begin data collection and identify any significant problems, a survey message was initially sent to 47 participants. The messages originated on the Oracle server and were sent through an SMTP mailer program that spoofed an e-mail address on the server hosting the discussion list. (Spoofing is a common technique to fool hardware and software in networked environments; spoofing an e-mail address, for instance, makes a message appear to have been sent by someone other than the actual sender.) Participants received a personal message stating the purpose of the survey, who was conducting it, the estimated time required to complete it, the URL of the survey, the survey ID, and whom to contact with questions. The message also instructed them to return duplicate e-mail messages with the subject line "DUPLICATE." Once authenticated through their survey ID, participants could begin answering questions.
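The bulk mailer itself was a PL/SQL program and is not reproduced in the paper. As a sketch only, the following Python fragment shows the general idea of sending an invitation whose From header carries an address on the list server; the SMTP host name and message text are assumptions, and only the survey@stat.com address comes from the paper.

```python
# Sketch of the bulk-mailer idea: the From header is set to an account on
# the list server so that messages appear to come from a familiar source.
# Not the original program (which was written in PL/SQL); host and text
# are illustrative assumptions.
import smtplib
from email.message import EmailMessage

def send_invitation(to_addr, survey_id, smtp_host="localhost"):
    msg = EmailMessage()
    msg["From"] = "survey@stat.com"   # account on the list server
    msg["To"] = to_addr
    msg["Subject"] = "Internet use in clinical practice: survey"
    msg.set_content(
        f"Your survey ID is {survey_id}.\n"
        "Please log in at the survey page announced on the list."
    )
    with smtplib.SMTP(smtp_host) as server:
        server.send_message(msg)
```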
The group of individuals who responded to the first mailing did not report any problems. However, several problems were identified when the survey was mailed to the remaining 403 individuals:

- Several individuals stated that they entered their survey ID, pushed submit, and then could not proceed with the survey because of an error message. The source of this problem could not be tracked down; most of these participants used America Online as their Internet service provider. Consequently, an ASCII copy of the survey was mounted on the home page to provide an alternative method of answering the survey, and participants were advised to use it if the Web form failed. At the same time, participants who had had problems were provided with an explanation, and a copy of the survey was included in the reply. Completed surveys received through e-mail or fax were entered by hand into the database later.

- Some international users with slow modem connections reported that they received a server time-out when trying to answer the survey. As a result, the time-out period for client responses to the server was increased from 60 seconds to five minutes.

- When typing in their survey ID, several respondents mistook the digit "0" for the letter "O," and the digit "1" for the letter "l," and vice versa; the server therefore rejected their IDs. Since a complete URL could not be provided for respondents to click on to access the survey, and since most participants evidently did not copy and paste their survey IDs into the field, respondents were advised by e-mail of the correct way of entering their survey ID. (An ID scheme that avoids such ambiguous characters is sketched below.)

- Several users were not aware that they could include the text of the original message in their reply or paste the survey from the Web page into an e-mail message. Instead, they printed the survey and returned the completed hardcopy by fax. Although this was not a major problem, it delayed the entry of approximately ten surveys into the database.

- After approximately 130 surveys had been received, the responses to one question revealed that participants seemed to choose only two of the four responses on a Likert scale. A review of the program revealed an error that stored answers incorrectly. The error was corrected, and the incorrectly stored answers were discarded.

Three additional messages were sent to nonrespondents during the following two weeks. Figure 2 shows the dates of the mailings, the distribution of responses, and the cumulative response rate. The response rate for surveys entered via the Web was 32.9 percent (144 of 438 adjusted participants) after the initial mailing. The first follow-up mailing resulted in receipt of 76 surveys and brought the total response up to 50.2 percent. The second follow-up mailing raised the response rate to 57.1 percent, and the third raised it to 64.4 percent (30 and 32 responses, respectively). The 52 surveys returned by e-mail or fax were entered by hand and increased the final response rate to 74.2 percent (334 of 438 participants).
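The 0/O and 1/l confusion noted above can be avoided at ID-generation time by drawing IDs from an alphabet that excludes the ambiguous glyphs. The original IDs were generated in Microsoft Excel; the sketch below is an assumed alternative, not the authors' method (the secrets module and the exact exclusion set are our choices).

```python
# Sketch: generate unique four-character survey IDs from an alphabet that
# omits the glyphs respondents confused (0/O, 1/l/I). Assumed method, not
# the original Excel-based one.
import secrets
import string

AMBIGUOUS = set("0O1lI")
ALPHABET = [c for c in string.ascii_uppercase + string.digits
            if c not in AMBIGUOUS]

def make_ids(n, length=4):
    ids = set()
    while len(ids) < n:          # loop until n unique IDs exist
        ids.add("".join(secrets.choice(ALPHABET) for _ in range(length)))
    return sorted(ids)

print(make_ids(5))               # e.g. ['3FKQ', '7TWX', ...]
```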
Figure 2. Number of responses received (left scale) and cumulative response rate (right scale, in percent) by date. The graph includes surveys entered through the Web only and indicates when survey mailings were sent.

We sent a total of 1,132 e-mail messages to participants of this survey. To increase the response rate, the survey was directly included in the e-mail message in the second and third follow-up mailings. In addition, while the initial messages and the first follow-up messages were sent from a generic e-mail account (survey@stat.com), the subsequent follow-up messages were sent from the list owner's account directly, with a personal request for a response to the survey.

The next section compares the costs of this survey with the costs that would have been incurred had the survey been administered by mail. General breakeven equations for the sample size using Web-based vs. mail surveys are presented. The section concludes with a comparison of characteristics of Web and e-mail/fax respondents.

Cost and Response Pattern Analysis

Costs

Costs were calculated to assess the cost-effectiveness of the Web-based survey method for planning future surveys. Table 1 shows the costs for the Web-based survey compared with the costs of an equivalent mail survey. The comparison excludes costs that are the same regardless of the survey methodology, such as design of the survey instrument and pilot-testing. Also excluded is the cost of obtaining the mailing list.

Table 1. Cost Comparison of a Web-based Survey and Non-anonymous and Anonymous Mail Surveys

Web-based survey:
  Survey preparation
    Survey programming (35 hrs × $30): $1,050
    Testing and code review (8 hrs × $60): $480
  Distribution and return
    Bulk mailer program (3 hrs × $60): $180
    Sending e-mail messages: $0
  Data entry
    Entry by participant: $0
    Manual entry of 52 surveys (52 × 6 min × $0.66/min): $206
  Total: $1,916

Mail survey, non-anonymous (1,132 mailings):
  Survey preparation
    Printing/duplication of a 7-page survey and cover letter ($0.08/page): $724
    Envelopes ($0.055/envelope): $62
    Business reply envelopes ($0.033/envelope): $37
  Distribution and return
    Mailing of surveys at $0.65/envelope ($0.55 postage + $0.10 envelope stuffing): $736
    Return postage on 334 surveys ($0.63/survey): $210
  Data entry
    Entry by investigator/staff member (334 × 6 min × $0.66/min): $1,323
  Total: $3,092

Mail survey, anonymous (1,800 mailings; 4 × 450 participants):
  Survey preparation
    Printing/duplication ($0.08/page): $1,152
    Envelopes ($0.055/envelope): $99
    Business reply envelopes ($0.033/envelope): $59
  Distribution and return
    Mailing of surveys at $0.65/envelope: $1,170
    Return postage on 334 surveys ($0.63/survey): $210
  Data entry
    Entry by investigator/staff member (334 × 6 min × $0.66/min): $1,323
  Total: $4,013

NOTE: The calculations for the non-anonymous survey use actual response rates. The calculations for the anonymous survey assume that each participant receives the mailing four times, whether they replied or not. Personnel costs include fringe benefits (30%).

As Table 1 shows, the total cost of the Web survey was $1,916, comprising the costs of programming and testing the survey, programming the bulk mailer program, and performing limited manual data entry. If all respondents had completed the Web-based survey successfully, the cost would have dropped by $206. Costs for an equivalent mail survey are calculated both for a non-anonymous survey (our case) and for an anonymous survey. The two alternatives differ in the cost of preparing and mailing surveys: in non-anonymous surveys, follow-up mailings are prepared for and sent to nonrespondents only, whereas anonymous surveys require that all participants receive the initial and all follow-up mailings and are thus somewhat more expensive. We present different breakeven calculations (see below) for these two options.

If our survey had been administered as a mail survey, its total cost would have been $3,092, including the cost of preparing 1,132 mailings, mailing costs, postage for returned mailings, and data entry. The Web survey was thus 38 percent cheaper than the equivalent (non-anonymous) mail survey. The figures used for the calculations in Table 1 represent local costs for a mail survey of 450 individuals. In other settings, costs might differ on the basis of factors such as sample size, reproduction costs, study requirements, programming costs, and data entry costs. Costs arising from handling technical problems with the Web-based survey (e.g., responding to user questions) were disregarded, since the required time was minimal and an improved design could have avoided most of those problems.

As Table 1 shows, the cost of a Web-based survey is independent of the sample size, whereas the cost of a mail survey varies with the initial sample size as well as with the incremental and the total response rates. Avoiding manual entry of completed surveys generated significant cost savings, a fact that has not been lost on transaction-intensive industries such as airlines and banks.

To assist others in choosing between Web-based and mail surveys on the basis of cost (assuming that all other variables are equal), we use a standard breakeven calculation19 to determine the sample size at which both types of survey cost exactly the same. This point is the threshold beyond which conducting one type of survey becomes more cost-effective than the other. Equations 1 and 2 in Table 2 are used to calculate the breakeven point for non-anonymous and anonymous surveys, respectively, when the incremental and/or the final response rates can be estimated. When that is not possible, equations 3 and 4 can be used to calculate lower and upper bounds for the breakeven point.

For the described survey, n would have been 245 under the idealistic assumption that all respondents successfully answered the survey through the Web. Even if the costs of entering the surveys manually are included, the breakeven point rises only to 274. Thus, with a sample size of approximately 275 or below, a mail survey would have been more economical than a Web-based survey.

Equation 1 makes two assumptions of practical significance. First, it assumes that exactly as many hardcopy surveys are prepared as needed. In reality, this is rarely possible; equation 1 thus will often reflect slightly lower costs for a mail survey than are achievable, slanting the comparison in favor of mail surveys. Second, it assumes that incremental and final response rates can be estimated. When this is not possible, we can calculate a range for the breakeven point by approximating the extreme values of equation 1 through equations 3 and 4. Using our costs, the lower bound for the breakeven point would have been 190 and the upper bound 347. This means that for a sample size of 189 or less, a mail survey would have been more economical, and for a sample size of 348 or more, a Web-based survey. In between, the cost advantage would have depended on the actual incremental and final response rates. Equations 3 and 4 thus provide a useful heuristic for determining a range for the breakeven point if basic costs for the two survey methods are known.
Table 2. Equations to Determine the Breakeven Sample Size

When incremental and/or final response rates can be estimated:

Non-anonymous survey (repeat mailings are sent to nonrespondents only):

$n = \dfrac{c_{PR}}{\left(m - \sum_{i=1}^{m-1} r_i\right)(c_D + c_M) + r_m c_E}$   (1)

Anonymous survey (repeat mailings are sent to all participants):

$n = \dfrac{c_{PR}}{m(c_D + c_M) + r_m c_E}$   (2)

When incremental and/or final response rates cannot be estimated (both types of survey):

Lower bound: $n_l = \dfrac{c_{PR}}{m(c_D + c_M) + c_E}$   (3)

Upper bound: $n_u = \dfrac{c_{PR}}{m(c_D + c_M)}$   (4)

NOTE: If incremental or final response rates cannot be estimated, equations 3 and 4 can be used to determine the boundaries for the breakeven point. The equations assume that the denominator is not zero and that the Web-based survey is completed successfully by each respondent. In addition, equations 1 and 2 assume that the cumulative response rate never reaches 100% before the last mailing. The variables are as follows: $n$, breakeven point; $n_l$, $n_u$, lower and upper bounds for the breakeven point; $m$, number of mailings; $r_1, r_2, \ldots, r_m$, cumulative response rates after the 1st, 2nd, ..., $m$th mailing ($r_m$ is the final response rate); $c_{PR}$, cost of programming the survey; $c_D$, cost of preparation per survey (duplication, envelopes, etc.); $c_M$, cost of mailing per survey; $c_E$, cost of receipt and data entry per survey.
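To make the breakeven arithmetic concrete, the sketch below evaluates equations 1, 3, and 4 with the per-survey unit costs underlying Table 1 and the cumulative response rates reported earlier; it reproduces the breakeven points of about 245 (274 when manual entry of the 52 e-mail/fax surveys is included) and the bounds of roughly 190 and 347. The cost figures come from Table 1; the function names are ours.

```python
# Breakeven sample sizes from Table 2, using the unit costs behind Table 1.
# c_pr: fixed cost of the electronic survey; c_d/c_m: preparation/mailing
# cost per hardcopy survey; c_e: receipt and data entry cost per return.

def breakeven_nonanon(c_pr, m, rates, c_d, c_m, c_e):
    """Equation 1: repeat mailings go to nonrespondents only.
    rates = cumulative response rates r_1..r_m (r_m is the final rate)."""
    return c_pr / ((m - sum(rates[:-1])) * (c_d + c_m) + rates[-1] * c_e)

def breakeven_bounds(c_pr, m, c_d, c_m, c_e):
    """Equations 3 and 4: bounds when response rates cannot be estimated."""
    return c_pr / (m * (c_d + c_m) + c_e), c_pr / (m * (c_d + c_m))

# Per-survey costs: printing $0.64 + envelope $0.055 + reply envelope
# $0.033 = c_d; postage/stuffing $0.65 = c_m; return postage $0.63 +
# data entry 6 min x $0.66 = c_e.
c_d, c_m, c_e = 0.64 + 0.055 + 0.033, 0.65, 0.63 + 6 * 0.66
rates = [0.329, 0.502, 0.571, 0.742]   # cumulative rates; final incl. e-mail/fax

print(round(breakeven_nonanon(1710, 4, rates, c_d, c_m, c_e)))  # 245 (ideal)
print(round(breakeven_nonanon(1916, 4, rates, c_d, c_m, c_e)))  # 274
lo, hi = breakeven_bounds(1916, 4, c_d, c_m, c_e)
print(round(lo, 1), round(hi, 1))   # ~189.7 and ~347.6 (paper: 190 and 347)
```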
Comparison of Web and E-mail/Hardcopy Responses

The goal of using a Web-based survey was to have all respondents complete the questionnaire using the Web. However, we had to provide an alternative method to avoid converting individuals with technical or user problems into nonrespondents. Once we allowed participants to answer the survey by e-mail or fax, some may have chosen one of these methods on the basis of personal preference or convenience. The data were reviewed to discern potential patterns that might distinguish Web from hardcopy respondents.

Sixteen percent of the 334 respondents returned the completed survey by e-mail or fax. Although no differentiation was made between e-mail and fax responses, the majority were received by e-mail. America Online users submitted 31 percent of the e-mail/fax responses and 12 percent of the Web responses. E-mail messages about technical problems were received most frequently from America Online users; thus, at least some of the technical problems were due to incompatibilities with America Online. One reason that some AOL users were nevertheless successful in submitting the survey through the Web may have been their use of different versions of the AOL client software.

Chi-squared tests (P = 0.05) were used to test for independence between the type of response (Web or e-mail/fax) and the following variables: top-level domain of the participant (com, net, or other); self-reported computer experience ("not at all comfortable," "not very comfortable," "comfortable," "very comfortable"); self-reported years of Internet experience (1, 2, 3, 4, 5, 6, >6); and the number of fields left empty on the survey (<21, 21–25, 26–30). Self-reported computer experience showed a significant relationship with the type of response (χ2 = 10.3; df = 2; P = 0.006), indicating that respondents more comfortable with computers tended to be more successful in completing the Web survey. Respondents answering the Web survey also tended to complete more fields on the survey (χ2 = 37.3; df = 2; P = 0.001). The other two variables showed no relationship to the type of response. Thus, two hypotheses for future studies could be that successful completion of Web surveys depends on computer experience and that respondents to Web surveys complete more questionnaire items than e-mail/hardcopy respondents.
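The underlying contingency tables were not published, so the cell counts in the following sketch are hypothetical; it shows the form of the test (scipy's chi2_contingency) and then checks that the reported statistic of 10.3 with df = 2 indeed corresponds to P ≈ 0.006.

```python
# The paper tests response type (Web vs. e-mail/fax) against respondent
# characteristics. Cell counts below are hypothetical, since the tables
# were not published; the final line verifies the reported result.
from scipy.stats import chi2, chi2_contingency

# Hypothetical 2 x 3 table: response type x comfort level (collapsed).
observed = [[30, 120, 130],    # Web respondents
            [15,  25,  12]]    # e-mail/fax respondents
stat, p, df, expected = chi2_contingency(observed)
print(f"chi2 = {stat:.1f}, df = {df}, P = {p:.3f}")

# Reported: chi2 = 10.3 with df = 2 gives P ~ 0.006, as published.
print(f"P = {chi2.sf(10.3, 2):.3f}")   # 0.006
```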
Conclusion

Several authors have proposed guidelines on how to conduct Web-based surveys.11,13,17,18,20 However, few papers report the use of this method, and none describe its procedural aspects in detail.2–5 One primary reason that the Web is not used frequently for large-scale or general surveys may be that Web access is not a given. Although 19 percent of households in the United States reported being online in 1997, not all members of each household may use the computer.21 Even in a recent study, the first step in the survey was to find out whether participants used the Internet at all.17 Although consumer research companies are beginning to tap AOL's 17 million users for market research,22 it has not been proved that AOL's users are representative of the U.S. population at large. Until the e-mail address becomes as widely used as the postal address, large survey populations cannot be surveyed using the Web or e-mail alone.

As previously mentioned, some authors advocate publishing Web surveys through newsgroups, indexes, and search engines.13 However, the resulting selection bias makes the results less valid and generalizable. Where defined populations are accessible through the Internet, a Web-based survey can be an effective method of gathering data using rigorous survey methodologies. However, even a relatively large group of Internet users may still represent a convenience sample that allows generalization to only that group.

Several authors have made recommendations for Web-survey design.11,18 On the basis of our experiences in this case study, some additional recommendations can be made:

- Consider Web-based surveys software development projects. Unless an off-the-shelf survey application23,24 is used, several tools must be integrated (such as HTML forms and Perl scripts), and sometimes survey applications must be custom-programmed. A survey application should be tested thoroughly to reduce the number of software defects and incompatibilities. Depending on the sample population, testing variables may include operating systems, browsers, Internet service providers, and Internet connection types. A systematic approach to identifying all variables and testing them can reduce the chance of failure.

- Usability and survey design should reflect the characteristics of the sample population and its environment. If the computer literacy of the sample population is not known, the survey should be easy to complete in as few steps as possible. In this case study, simple usability issues became major problems for some participants. Among participants who have a high degree of computer literacy, usability may be less of an issue. Likewise, surveys designed for unknown computing environments should use the lowest common technical denominator. In contrast, a survey application could use advanced programming techniques that match the capabilities of standardized computing environments on an intranet.

- Pilot-test with a sufficiently large random sample of the sample population. It is often impossible to test all environments for a survey before its release. Thus, it is very important to pilot-test with a sufficiently large random sample of users. This will, it is hoped, account for computer literacy levels and the range of computer configurations.

- Scrutinize early returns earlier. Scrutinizing early returns is important in mail surveys.25 The potentially immediate response to electronic surveys makes this recommendation even more important. As this example has shown, close monitoring of early returns can identify problems that were not caught during the pilot test, from technical issues to program bugs.
Immediate resolution of such problems is required to prevent an unnecessarily large portion of nonrespondents or increased measurement error.

This case study has shown that surveys administered through the Web can, compared with mail surveys, potentially lower costs, reduce survey administration overhead, and collect survey data quickly and efficiently. However, it also confirms that a number of variables have the potential to influence survey response and measurement negatively, such as incompatibility with the target computing environment, survey usability, computer literacy of participants, and program defects. In this case study, respondents to the Web survey tended to complete more questionnaire items than respondents who used e-mail or fax. Because of its potential impact on the quality of data collection, this finding should be validated in future studies.

The authors thank Dr. R. Kenney, Dr. D. Dodell, and N. Dovgy for their help in conducting this project, Dr. Sorin Straja for his help with the statistical analysis, Ms. Syrene Miller for her assistance in the literature review, and the reviewers for their helpful suggestions and comments.

References

1. Schleyer T, Forrest J, Kenney R, Dodell D, Dovgy N. Is the Internet useful in clinical practice? J Am Dent Assoc. 1999;130:1502–11.
2. Eysenbach G, Diepgen TL. Epidemiological data can be gathered with World Wide Web. BMJ. 1998;316(7124):72.
3. Sell RL. Research and the Internet: an e-mail survey of sexual orientation. Am J Public Health. 1997;87(2):297.
4. Gobar GM, Case JT, Kass PH. Program for surveillance of causes of death of dogs, using the Internet to survey small animal veterinarians. J Am Vet Med Assoc. 1998;213(2):251–6.
5. Schleyer T, Spallek H, Torres-Urquidy MH. A profile of current Internet users in dentistry. J Am Dent Assoc. 1998;129:1748–53.
6. Lakeman R. Using the Internet for data collection in nursing research. Comput Nurs. 1997;15(5):269–75.
7. Fawcett J, Buhle EL Jr. Using the Internet for data collection: an innovative electronic strategy. Comput Nurs. 1995;13(6):273–9.
8. Swoboda WJ, Mühlberger N, Weitkunat R, Schneeweiß S. Internet surveys by direct mailing. Soc Sci Comput Rev. 1997;15(3):242–55.
9. Hilsden RJ, Meddings JB, Verhoef MJ. Complementary and alternative medicine use by patients with inflammatory bowel disease: an Internet survey. Can J Gastroenterol. 1999;13(4):327–32.
10. Soetikno RM, Mrad R, Pao V, Lenert LA. Quality-of-life research on the Internet: feasibility and potential biases in patients with ulcerative colitis. J Am Med Inform Assoc. 1997;4(6):426–35.
11. Dillman D, Tortora RL, Bowker D. Principles for constructing Web surveys. Presented at the Joint Meetings of the American Statistical Association; Dallas, Texas; August 1998.
12. Clark R, Maynard M. Research methodology: using online technology for secondary analysis of survey research data—"act globally, think locally." Soc Sci Comput Rev. 1998;16(1):58–71.
13. Houston JD, Fiore DC. Online medical surveys: using the Internet as a research tool. MD Comput. 1998;15(2):116–20.
14. National Telecommunications and Information Administration. Falling through the Net: new data on the digital divide. Available at: http://www.ntia.doc.gov/ntiahome/net2/falling.html. Accessed Dec 8, 1999.
15. IntelliQuest. Latest IntelliQuest survey reports 62 million American adults access the Internet/online services. Available at: http://www.intelliquest.com/press/release41.asp. Accessed Dec 8, 1999.
16. Groves R. Survey Errors and Survey Costs. New York: John Wiley, 1989.
17. Dillman D, Tortora RL, Conradt J, Bowker D. Influence of plain vs. fancy design on response rates for Web surveys. In: Proceedings of the Joint Statistical Meetings, Survey Methods Section. Alexandria, Va: American Statistical Association, 1998.
18. Turner JL, Turner DB. Using the Internet to perform survey research. Syllabus. 1999;12(Jan):55–6.
19. Weygandt J, Kieso D, Kell W. Accounting Principles. 2nd ed. New York: John Wiley, 1990.
20. Turner JL, Turner DB. Using the Internet to perform survey research. Syllabus. 1998;12(Nov/Dec):58–61.
21. Stets D. Who's using computers? Philadelphia Inquirer. Nov 19, 1995: sect D3.
22. Kranhold K. Foote Cone turns to AOL for online polls. Wall Street Journal. Jul 28, 1999: sect B1.
23. Senecio Software Inc. Online and disk-by-mail surveys. Available at: http://www.senecio.com/. Accessed Dec 8, 1999.
24. Perseus Development Corporation. A survey software package for conducting Web surveys. Available at: http://www.perseus.com/. Accessed Dec 8, 1999.
25. Dillman DA. Mail and Telephone Surveys. New York: John Wiley, 1978.