On the ecological validity of a password study

O F A PAS S W O R D S T U D Y
ECOLOGICAL VALIDITY
On the
Alexandria Farar

WHAT IS ECOLOGICAL VALIDITY?
Definitions:
http://www.thefreedictionary.com/ecological
The relationship between organisms
and their environment.

Definitions:
How well a study can be related to
or reflects everyday, real life.
http://holah.co.uk/page/ecologicalvalidity/

Definitions:
http://www.alleydog.com/glossary/definition.php?term=Ecological%20Validity
http://study.com/academy/lesson/ecological-validity-in-psychology-definition-lesson-quiz.html


Experimental
Control
Ecological
Validity~

MOTIVATION
• Problems with Ecological Validity in Password Studies
• Complex & Difficult to Quantify
• Hard to Study ~ Lack of “Ground Truth”

BACKGROUND
• Studies on Password Security & Usability
Real Leaked / Stolen PasswordsUser Studies
Real World DataControlled

BACKGROUND
• Studies on Password Security & Usability
Types of User Studies
Online Surveys
• Increase sample size & diversity
Laboratory Studies
• Not in natural environment
• Aware of being studied
Pen & Paper-based

METHODOLOGY
Study Design
 Five unique passwords stored
 Asymmetric cryptography
 Password Decryption
 Five University-wide services
 IDM
 Email
 Wifi
 Campus Login
 Single Sign-on (SSO)
 Anonymized dump of decrypted passwords
Password
Policy
IDM
Identity Management
• Study design mirrored
enrollment process
• Able to compare study
passwords to real passwords
• Student role-play
• Informed consent

METHODOLOGY
• Analysis conducted offline without demographic information
• Account information never revealed for real or study
passwords
• Results of password data analysis linked to demographic data
• Results shared with Privacy Officer before publication
Study Design

METHODOLOGY
• Study Design
Do passwords generated by participants asked to role play
a scenario in which they have to create a password for
fictitious accounts resemble their real passwords?
Do participants behave so differently because of the study
that the results of the study should not be used to make
Inferences about their real behavior?

METHODOLOGY
• Study Design
• University password policy required strong passwords
• Independent Variable:
• Openly mentioned study was about passwords
vs. obfuscation
Within-subjects Between-subjects
• Real vs. Study Passwords • Lab vs. online study
• Password priming vs. not
Conditions

METHODOLOGY
• Study Design
ROLE
PLAY
Enroll in University
Register for Services
Mirrored University Password Policies
• A password’s minimal length is 8 characters;
its maximum length is 16 characters.
• Password characters are split into four
different groups: Upper and lower case
alphabetical characters, special characters
,.:;!?#%$@+-/_><=()[]{}* and digits.
Passwords that are shorter than 12 characters
must include characters from three of the four
described character groups. Passwords that
are 12 characters or longer must only include
characters from two of the four described
character groups.
• Neither the student’s first/last name nor the
student’s ID number may be part of a
password.
• Users must use different passwords for all
accounts.
University Password Policy

METHODOLOGY
Study Design
• 16,500 students invited via email
• Two-part online study creating
online accounts
• 15-20 minute questionnaire
• Second part two days after first
• Raffle 3x100 Euro Amazon
vouchers
• Two introductory texts
(prime, non-prime)
• Prime – important for passwords
to be available
• Participants created accounts as
normal
• Act as if passwords created were
real passwords
• Must login to accounts two days
later to complete
• Redirected to survey after creating
accounts

METHODOLOGY
Study Design
• 740 students invited via email
• 68 attended
• Same rules as online study
• Lab environment, PC and
supervised
• First part completed in lab
• Second completed at home
• Incentive – 20 Euros
• Opportunity to ask questions
• Assistance with technical

METHODOLOGY
• Password Analysis
Manual Scoring
Categorized participants based on
How similar the metrics study passwords
compared to real ones
User behavior considered
Example:
Study: “PwdIDM11.”, “PwdMail11.”,
“PwdWifi11.”,
“PwdPC11.”
Real: ‘B0ru$$ia09”, “16.Januar”, “(aus- tralien)”,
“314159Pi”

METHODOLOGY
Metrics
User
Behavior
Names
Dictionary Word
Dates
Simple/complex
Numbers
Mixed Case /
Random String
Upper/Lower
Case string
Special
Characters
Keyboard Pattern
L33T Speak
*Partial Listing

METHODOLOGY
Null
Single
FullSystem
Derogatory
Each subject’s password set was
assigned to one of these
categories.
Passwords preprocessed and
appended with “kb” to passwords
containing a keyboard pattern.

METHODOLOGY

METHODOLOGY
47% - Agreed
9.3% - Disagreed
43.5% - 2 Scores
Agreeing

METHODOLOGY
Full System
Realistic
Similar
Passwords
Single
Unrealistic
Null /
Derogatory
Inconsistent
Password
Composition
Password
length
#Uppercase
#Lowercase
#digits
# spec
char
Entropy
NIST
Entropy
For every password from real accounts and online / lab study
Password strength – John the Ripper, Entropy

RESULTS
583 online
63 lab
Age 17-55
35.8%
Female
385 online
33 lab
primed
16.3% IT
90.7%
Internet
• Participants

RESULTS
18.1 Online Accts
Medium IT
Expertise
6.5%
No Forgotten
Password
79.6%
Forgotten 2x
17.4% Account
Abuse
63.2%
Use 2-3
Passwords
14.9%
Different
Password
• Participants

RESULTS
• Scoring Evaluation
Hypothesis:
Category Full participants would have the highest correlation of password
composition values between their two password sets of all categories.
• Expected a weaker correlation for category Single and Category System
participants
• No correlation for category Null and Derogatory participants.

RESULTS
• Found highly significant and strong correlations for participants in score category Full
and mostly significant correlations in categories Single and System.
• No correlation when the entire set of study passwords was analyzed as a whole.
• No correlation for the categories Null and Derogatory.

RESULTS
Legitimate Categories Regardless of Condition:
(Online, lab, primed, non-primed)
Single, Full, System Participants - behave more realistically in our study than category Null
and Derogatory participants, with category Full participants showing the strongest correlation.
26.5% of our participants even used at least one of their real passwords in the study.
No difference between those conditions with respect to our categorization; it is possible to
compare the differences in password behavior solely on the category irrespective of the
condition.
Scoring consistent: Participants classified to behave consistently between real and study
pass-words by our scoring system did compose their passwords consistently.
Those behaving inconsistently according to our classification produced independent sets of
passwords.

RESULTS
• Evaluation
Password
Sets
Full
46.2% (298)
Single
18.8 % (121)
System
5.1 % (33)
Null - 28.5 %
(184)
Derogatory
(1.4%)

EVALUATION
• Online vs. Lab Study
More participants fell into the helpful categories Single, Full and System
compared to our online study (Table 3).
Priming – Null hypothesis that no difference in behavior could not be
Rejected with p=0.4698.

RESULTS
Self-reported Values in predicting inconsistent study behavior
Asked participants if they behaved differently
• Different behavior = fewer counts in Full, Single and System; Higher counts in Null
and Derogatory
• Participants who changed their usual behavior for the study obtained significantly
fewer ratings in categories Full, System and Single, and more in Null and Derogatory
than participants who did not self-report
• Participants who said that they use individual passwords for each account also
scored significantly more frequently in categories Null and Derogatory when
participating online
Some reasons for deviation include Distrust, Policy and Lazy.

RESULTS
• Consenters
88.6% online 95.6% lab
Compare
real
passwords
Different
Password
Strategies
Use indiv.
password
per account

REVIEWS
• Because of the content and the impacts mentioned above, the topic of the paper presents
a novelty, important new knowledge and fits the requirements of the call for paper of this
conference (see http://cups.cs.cmu.edu/soups/2013/cfp.html). Additionaly, the paper treats
the impact of organizational policy or procurement decisions and tuches the same topics as
failed usable security experiments, with the focus on the lessons learned from them.
Furthermore, the must-have criteria, that the work should relate to usability or human
factors and either privacy or security, is fulfilled. The length of the paper does not violate
the rules.
• positive aspects:
- comparison of lab to online and real-world behavior delivers wide coverage.
- compact, very informative evaluation display across multiple aspects of password
studies, including some interesting results (realistic passwords in lab environment).
• negative aspects:
- Password conditions were pretty strict, requiring relatively save passwords from the
get go.
- Perhaps unbalanced set of data between online and offline study, however, this is
a general problem of the two types of studies.

On the ecological validity of a password study

Recommended

Recommended

More Related Content

Similar to On the ecological validity of a password study

Similar to On the ecological validity of a password study (20)

Recently uploaded

Recently uploaded (19)

On the ecological validity of a password study

Editor's Notes