7. @infoxiao
COMBINING THE BEST OF BOTH WORLDS
                          Virtual Reality    Crowdsourcing    Crowdsourced VR
                          (as of 2017)                        (this work)
Manipulation Realism      + High             - Low            + High
Measurement Granularity   + High             - Low            + High
Participant Diversity     - Low              + High           + High
Reproducibility           - Low              + High           + High
9. @infoxiao
RESEARCH QUESTIONS
RQ1: Are there VR-eligible reachable workers?
RQ2: What is a good user flow?
RQ3: What types of experiment manipulations can we deliver remotely?
RQ4: What are the limitations and challenges?
10. @infoxiao
CONTRIBUTIONS
1. Validated a VR-eligible panel of 242 workers
2. Implemented a user flow between desktop and VR
3. Replicated three previous studies remotely, each with a different experiment manipulation
4. Limitations and challenges
Source code and data are available at:
bit.ly/VRCrowdExperiments
12. @infoxiao
CONSTRUCTING VR-READY PANEL
Please take a picture of your device with the last 4 digits of your worker ID handwritten on a piece of paper in view.
14. @infoxiao
(MORE) DIVERSE PARTICIPANTS
N = 242
Race: White 70%, Asian 14%, Black 6%, Other 10%
Gender: Male 61%, Female* 39%
Area: Suburban 52%, Urban 30%, Rural 18%
Country: U.S. 90%, Other 10%
Education: High school or below 14%, Some / 2-year college 29%, Bachelor's 29%, Master's and above 13%
Income: Below $30K 21%, $30K - $80K 57%, Above $80K 22%
Age: 18 - 78 (Median 32)
* One worker self-identified as “other”.
16. @infoxiao
DESIGN GOALS OF USER FLOW
Usable: Direct workers to participate in VR and complete survey on desktop
Web-Based: Allows for online data collection and thus remote participation
Low Technical Barrier: Allows for easier replication and adaptation
18. @infoxiao
FLOW
1. Worker accepts task through Amazon Mechanical Turk
   -> VR Web App URL
2. Worker completes experiment in VR
   -> Verification Code 1
3. Worker answers survey in Qualtrics
   -> Verification Code 2
4. Experimenter approves payment
21. @infoxiao
MODELS OF ILLUSIONS IN VR
Place Illusion: A user’s feeling of being transported into the rendered environment
Embodiment Illusion: A user’s feeling of experiencing the virtual world through an avatar
Plausibility Illusion: A user’s feeling that events happening in the virtual world are real
Gonzalez-Franco, M., & Lanier, J. (2017). Model of Illusions and Virtual Reality. Frontiers in Psychology, 8, 1125.
22. @infoxiao
MODELS OF ILLUSIONS IN VR
Place Illusion: A user’s feeling of being transported into the rendered environment
  Study 1: Restorative Effects of Virtual Environments [Valtchanov et al. 2010], N = 22 (original) vs. 55 (ours)
Embodiment Illusion: A user’s feeling of experiencing the virtual world through an avatar
  Study 2: Proteus Effect [Yee and Bailenson, 2007], N = 50 (original) vs. 59 (ours)
Plausibility Illusion: A user’s feeling that events happening in the virtual world are real
  Study 3: Drawing Power of Crowds [Milgram et al. 1969], N = 1,424 (original) vs. 56 (ours)
Gonzalez-Franco, M., & Lanier, J. (2017). Model of Illusions and Virtual Reality. Frontiers in Psychology, 8, 1125.
25. @infoxiao
STUDY 2: PROTEUS EFFECT
Van Der Heide, Brandon, et al. "The Proteus effect in dyadic communication: Examining the effect of avatar appearance in computer-mediated dyadic interaction." Communication Research 40.6 (2013): 838-860.
27. @infoxiao
STUDY 3: DRAWING POWER OF CROWDS
Milgram, S., Bickman, L., & Berkowitz, L. (1969). Note on the drawing power of crowds of different size. Journal of Personality and Social Psychology, 13(2), 79.
N = 1,424
34. @infoxiao
GAZE* DISTRIBUTION
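[Figure: heatmaps of gaze distribution by size of stimulus crowd; color scale 0% to 100%; legend: participant, male avatar, female avatar]
Zero (N = 15): front 60%, back 8%, sides 14% and 18%
Low (N = 15): front 54%, back 12%, sides 12% and 22%
Medium (N = 13): front 47%, back 13%, sides 16% and 24%
High (N = 13): front 47%, back 14%, sides 16% and 23%
*** statistically significant
* Head rotation used as a proxy for gaze.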
35. @infoxiao
RECAP
Place Illusion: A user’s feeling of being transported into the rendered environment
  Study 1: Restorative Effects of Virtual Environments [Valtchanov et al. 2010], N = 22 (original) vs. 55 (ours)
Embodiment Illusion: A user’s feeling of experiencing the virtual world through an avatar
  Study 2: Proteus Effect [Yee and Bailenson, 2007], N = 50 (original) vs. 59 (ours)
Plausibility Illusion: A user’s feeling that events happening in the virtual world are real
  Study 3: Drawing Power of Crowds [Milgram et al. 1969], N = 1,424 (original) vs. 56 (ours)
Source code and data are available at:
bit.ly/VRCrowdExperiments
36. @infoxiao
REPRODUCIBILITY
Cheaper
• $8 per task (study participation)
• $1,374 for 201 participants (50% reduction, estimating $3,600 for the same in-lab studies)
• No equipment cost
Faster
• 201 study participations within a week
• Compared to an estimated three weeks (67% reduction, estimating 10 in-lab studies per day)
• ~20 min per task
Easier
• Pure web-based
• JavaScript easy to learn
• Open sourced
Source code and data are available at:
bit.ly/VRCrowdExperiments
38. @infoxiao
THE (LONG-DUE) PROMISE OF VIRTUAL REALITY
“… replications, or at least near-perfect replications, become quite possible.”
“… allow for more representative sampling. Whole experiments can be carried out concurrently in multiple laboratories via networked collaboratories.”
“... provide a compelling sense of personal, social, and environmental presence for users, while allowing the investigator near-perfect control over the experimental environment and actions within it.”
Blascovich, J., Loomis, J., Beall, A. C., Swinth, K. R., Hoyt, C. L., & Bailenson, J. N. (2002). Immersive virtual environment technology as a methodological tool for social psychology. Psychological Inquiry, 13(2), 103-124.
39. @infoxiao
LIMITATION & CHALLENGES
Size Constraint
• N = 242
• Workers by device: Samsung Gear VR 144, Google Cardboard 46, HTC Vive 18, Sony Playstation 18
Device Constraint
• Device battery exhaust
• Device overheating
• Limitation on measurement
• Switching between desktop and VR not smooth
Physical Environment
• 98% at home - safety is important
• 89% seated, 81% have space to walk around
41. @infoxiao
THANK YOU
Xiao Ma[1,2], Megan Cackett[2], Leslie Park[2], Eric Chien[1,2], Mor Naaman[1,2]
[1] Social Technologies Lab, Cornell Tech
[2] Cornell University
We thank the crowdworkers who participated in our studies, and Oculus for providing the equipment used for developing the studies. We thank Dan Goldstein, Jake Hofman and Sid Suri for early feedback and direction.
Source code and data are available at:
bit.ly/VRCrowdExperiments
This work is partially supported by Oath through the Connected Experiences Lab at Cornell Tech.
Editor's Notes
Hi. I’m Xiao, a fifth-year PhD student at Cornell Tech.
I’m happy to present this work, done with my colleagues at Cornell, Megan, Leslie, and Eric, and my advisor Mor Naaman.
I’m going to tell you a bit about Virtual Reality and Crowdsourcing.
This work was first published at The Web Conference 2018 (WWW), one of the premier academic conferences on the future direction of web technologies. You can find the full paper on my website, or a Medium write-up on Twitter.
Now just out of curiosity, by show of hands, how many of you have experienced virtual reality?
As you have probably experienced, or heard,
virtual reality is an immersive technology that can generate highly realistic simulated experiences.
This paper is about combining the power of virtual reality with that of crowdsourcing.
We believe that we have run the world’s first three crowdsourced VR experiments.
In this talk, I am going to tell you why this is important, how we did it, and how you can run your own crowdsourced experiments going forward.
To understand why it is important to combine virtual reality with crowdsourcing, we drew this comparison table for both areas of research, focusing on four things:
manipulation realism,
measurement granularity,
participant diversity,
and reproducibility.
Virtual reality is strong in delivering highly realistic experiences, making it suitable for strong experiment manipulation.
In addition, many VR devices are mobile, and the built-in sensors in these devices provide high granularity for measuring things such as head orientation.
But at the same time, current virtual reality research lacks participant diversity, relying heavily on college students as participants. Finally, it is very hard to replicate a VR study, making VR research low in reproducibility.
These last two weaknesses, participant diversity and reproducibility, are actually the very strengths of crowdsourcing, a vibrant area of research that many of you in the room have contributed to.
The other area of research is crowdsourcing, again a very rich area over the past decade.
In particular, our work builds on Mason and Suri, 2012, [CLICK] who first articulated the benefits of conducting behavioral experiments using crowdsourcing.
The virtual lab concept was inspired by recent work by Mao et al., [CLICK] a longitudinal, synchronized virtual experiment that studied trust. It is our hope that we begin to bring these two areas of research together through this work.
By making VR experiments crowdsourced, we can have the best of both worlds, meaning…
the ability to run reproducible online experiments with high experimental realism and fine-grained measurement, combining desktop-based surveys with mobile-based interactions, as well as reaching more diverse participants.
This is why crowdsourced VR is important.
Now the next question is, [CLICK] how?
(… how.)
How do we combine VR and crowdsourcing to create an effective paradigm for conducting experiments on the web?
More concretely…
(More concretely…,) we ask four research questions.
First, are there workers out there who have their own VR devices and are willing to participate in our experiments?
Second, once we find them, how do we direct them to do tasks in VR?
Third, what types of experiment manipulations can we deliver remotely in VR, without being in a lab and maybe without even being online at the same time as the participant?
And lastly, what are the limitations and challenges that we still face? We addressed these questions in our work. Here are our main contributions:
(here are our main contributions…)
First, we validated a VR-ready panel on Mechanical Turk, and we surveyed that population.
Second, we designed and implemented an effective user flow to direct workers to participate in VR and come back to desktop for exit surveys and payment.
Third, we replicated three previous studies, some originally run in VR and some not, each successfully delivering a different kind of experiment manipulation.
Last but not least, we got a deeper understanding of the limitations and challenges of crowdsourced VR, and we open-sourced our data and code for easy replication and adaptation, so that you can run new experiments, which we are quite excited to see.
I will use this slide as the outline for the rest of my talk, and will go into each point in more detail.
So first, validating a VR-ready panel.
To see if there are any crowdworkers who have their own VR devices and are willing to participate in our experiments, we designed a screening task and posted it through Amazon Mechanical Turk. [CLICK]
We invited crowdworkers who own a VR device from a list of approved devices to participate in a survey.
In the survey, to prove that they have access to the VR device, we asked each worker to take a picture of the device together with a piece of paper, in view, with the last four digits of their Turker ID handwritten on it.
We accepted the qualifying ones; [CLICK]
and rejected the ones that appeared to be scamming us. [CLICK]
As you can see, it’s not very difficult to tell them apart.
We received two hundred forty-two valid submissions in total.
In the same survey, we also asked workers about their demographics, which we report here.
I will give you a second to read the graphs. [PAUSE]
There are two key takeaways from this slide.
First, we observe that this population does not appear to differ significantly from the overall Turker population tracked in the literature over the years.
Second, this VR panel is more diverse than the usual college student population, which is promising for addressing one of the key weaknesses of current VR research, participant diversity.
Next, now that we have a potential population of crowdworker participants that is relatively diverse, can we design a good user flow to direct them to complete VR tasks?
There are several design goals for this user flow:
First, it should at least be usable.
Second, it needs to be web-based to allow for remote participation.
Third, ideally, the technical barrier for implementing such a user flow should be relatively low, to make replication and adaptation easy.
To achieve these design goals, we chose the following technical platform:
a JavaScript-based UI framework for VR on the client side, and Node.js for the server.
In other words, everything is in JavaScript and can easily be picked up and deployed by a web developer without special knowledge of game engines such as Unity, which is usually required to develop VR applications.
Again, code and data are open-sourced for replication and adaptation.
Here is how the flow works. A worker accepts a task through Amazon Mechanical Turk. [CLICK]
The task contains a URL for the VR web app, and the worker opens the URL to complete the VR part in their own headset. [CLICK] At the end of the VR experience, the participant receives the first verification code in VR, which unlocks a desktop-based survey. [CLICK]
After they finish the survey, the worker receives a second verification code, which unlocks the payment on Mechanical Turk. The task is then complete.
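The two-code handoff described above could be sketched in Node.js as follows. This is only an illustration under assumptions: the talk does not say how the verification codes are generated, so the HMAC scheme, the `SECRET` value, and the `makeCode` and `checkCode` helpers are all hypothetical, not the released implementation.

```javascript
const crypto = require("crypto");

// Hypothetical server-side secret (an assumption for this sketch);
// a real deployment would load it from private configuration.
const SECRET = "replace-with-a-private-secret";

// Derive a short code bound to one worker and one stage.
// Stage "vr" stands for Verification Code 1, "survey" for Code 2.
function makeCode(workerId, stage) {
  return crypto
    .createHmac("sha256", SECRET)
    .update(`${workerId}:${stage}`)
    .digest("hex")
    .slice(0, 8)
    .toUpperCase();
}

// Check a code typed in by the worker (case and surrounding spaces ignored).
function checkCode(workerId, stage, submitted) {
  const expected = Buffer.from(makeCode(workerId, stage));
  const given = Buffer.from(String(submitted).trim().toUpperCase());
  // timingSafeEqual throws on unequal lengths, so compare lengths first.
  return expected.length === given.length &&
    crypto.timingSafeEqual(expected, given);
}
```

In this sketch the VR web app would display `makeCode(workerId, "vr")` at the end of the experience, the survey would accept it via `checkCode`, and the same pattern with `"survey"` would gate payment approval. Binding each code to the worker ID means one worker's code cannot be reused by another.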
Now we have the participants. We also have the flow.
Let’s run some experiments.
(7:00min estimate)
When we were choosing experiments to run, we wanted to be very strategic about the first few experiments for this paper. First of all, we wanted to run replication studies rather than brand new experiments, because we wanted some baseline results to compare against rather than creating something in a vacuum.
Second, there are thousands of VR studies out there. How do we choose smartly to maximize the demonstration power of these few studies? To do that, we relied on a framework of models of illusions in VR as a map.
This framework states that VR’s capacity can be broken down into three main types of illusions:
place illusion, embodiment illusion, and plausibility illusion. The definitions of the illusions are here on the right if you want to take a second to read. [PAUSE]
Based on this framework of illusions, we chose three studies to replicate. Some in VR, some not, but one for each type of illusion.
For place illusion, we chose a 2010 study on the restorative effects of virtual environments.
For embodiment illusion, we chose one of the most iconic and well-cited works in VR, the Proteus effect, on how one’s virtual representation of self can influence one’s behavior.
For plausibility illusion, we chose a seminal study by Milgram et al. in 1969 on the drawing power of crowds.
Because of limited time, I will show you demo videos of all three studies, but go into depth on the results only for Study 3. You can of course refer to our paper for details.
I am about to show you the VR portion of the first study, the restorative effects of virtual nature environments. In this study, we first show participants a thriller video to increase their psychological stress, and then play clips of nature to see if we can decrease their negative affect and increase their positive affect.
Each video is 2 minutes long; I have trimmed them here and omitted the audio. [CLICK]
The dot here shows the center of the participant’s gaze location.
Moving on to the second study, the Proteus effect. Again, this is a very iconic and well-cited work in VR, investigating how a virtualized self can impact one’s behavior.
In one of the experiments, the paper shows that participants play more dominantly in a negotiation game of splitting money when assigned a taller avatar than a shorter one.
There have been mixed results when other researchers tried to replicate the Proteus effect.
We did our own version, but did not observe an effect. Here is the video.
And then, finally, the Milgram study.
In 1969, Milgram and his students conducted a study on the street, using New York pedestrians as participants.
They hired actors as a stimulus crowd to stand in the middle of the sidewalk and look up at the sky. Then they measured the percentage of pedestrians who looked up, or stopped.
(Target time: 11min)
Here is a graph showing the percentage of people who looked up and stopped on the y-axis, and the size of the stimulus crowd on the x-axis. As you can see, as the size of the stimulus crowd increased, the percentage of passersby who were influenced in their behavior also increased. More of them looked up and stopped. I’m going to show you how we replicated this study in VR, using avatars as the stimulus crowd.
One caveat is that, per feedback from early pilot user studies, we had the stimulus crowd looking to the back rather than looking up.
[CLICK] There are ten avatars, half male and half female, who act as the stimulus crowd.
Some of them look toward the back of the participant; how many varies by condition.
The participant was instructed to freely explore the space to look for an object, which always appeared in the last 10 seconds of a 3-minute time limit.
During the 3 minutes, we logged the participant’s head rotation five times per second as our dependent variable.
To simplify our analysis, we focus on only one axis of head orientation, [CLICK] which is yaw, looking left to right. Here is some notation for our illustration of the results for this rotation.
You are looking from the top down at the participant. [CLICK] The default field of view of a VR headset is 100 degrees, which is this zone to the front. [CLICK] Based on this, we divide the space into four zones. [CLICK]
[Gestures..] Zone 1 to the front, 2 to the left, 3 to the back, and 4 to the right. Later we will plot heat maps of gaze distribution based on these four zones.
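One way to turn a logged yaw angle into one of the four zones is sketched below. Several things here are assumptions rather than details from the talk: yaw is taken in degrees with 0 meaning straight ahead and positive values meaning a turn to the left, the back zone is given the same 100-degree width as the front zone, and `normalizeYaw` and `zoneForYaw` are hypothetical helper names. Only the 100-degree front zone comes from the talk.

```javascript
// Map any angle in degrees to the range (-180, 180].
function normalizeYaw(deg) {
  const a = ((deg % 360) + 360) % 360; // now in [0, 360)
  return a > 180 ? a - 360 : a;
}

// Classify a yaw angle into zones 1 (front), 2 (left), 3 (back), 4 (right).
function zoneForYaw(deg) {
  const a = normalizeYaw(deg);
  const abs = Math.abs(a);
  if (abs <= 50) return 1;  // front: 100-degree field of view, +/- 50 degrees
  if (abs >= 130) return 3; // back: assumed 100-degree zone centered behind
  return a > 0 ? 2 : 4;     // remaining angles: left if positive, else right
}
```

With these boundaries, `zoneForYaw(0)` falls in zone 1, `zoneForYaw(90)` in zone 2, `zoneForYaw(180)` in zone 3, and `zoneForYaw(-90)` in zone 4. A different split of the non-front angles would only change the two threshold constants.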
We had a between-subjects design with four experiment conditions: zero, low, medium, and high, indicating the number of avatars who face toward the back of the participant.
For the zero condition, [CLICK] all avatars are looking in the same direction as the participant. The arrow indicates the direction the avatar is facing.
For the low condition, we had 2 randomly selected avatars [CLICK] facing to the back; 4 for the medium condition [CLICK], and 8 for the high condition [CLICK].
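The random selection of back-facing avatars per condition can be sketched as below. The counts (0, 2, 4, 8 out of 10 avatars) come from the talk; the names `BACK_FACING`, `AVATAR_COUNT`, and `assignFacing`, and the use of a Fisher-Yates shuffle, are assumptions made for illustration.

```javascript
// Number of avatars facing the back in each condition (from the talk).
const BACK_FACING = { zero: 0, low: 2, medium: 4, high: 8 };
const AVATAR_COUNT = 10;

// Returns an array of AVATAR_COUNT booleans; true means that avatar
// faces toward the back. `random` is injectable so runs can be seeded.
function assignFacing(condition, random = Math.random) {
  const k = BACK_FACING[condition];
  if (k === undefined) throw new Error(`Unknown condition: ${condition}`);

  // Fisher-Yates shuffle of avatar indices 0..9.
  const order = Array.from({ length: AVATAR_COUNT }, (_, i) => i);
  for (let i = order.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [order[i], order[j]] = [order[j], order[i]];
  }

  // The first k shuffled indices are the back-facing avatars.
  const back = new Set(order.slice(0, k));
  return Array.from({ length: AVATAR_COUNT }, (_, i) => back.has(i));
}
```

Calling `assignFacing("medium")` yields ten flags of which exactly four are true, with the positions varying from call to call.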
We received 56 valid submissions in total, which were distributed roughly evenly across conditions.
Based on how we divided the space into four zones earlier,
we can now plot the head orientation distribution, averaged across all participants in the same condition, as a proxy for their gaze distribution.
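The averaging step can be sketched as follows, assuming each participant's log has already been reduced to a sequence of zone numbers (1 to 4), one per head-rotation sample. The helper names `zoneShares` and `averageShares` are hypothetical, and averaging per-participant shares (rather than pooling all samples) is an assumption about how the heatmap values were computed.

```javascript
const ZONES = [1, 2, 3, 4];

// Fraction of one participant's samples that fall in each zone.
function zoneShares(samples) {
  const counts = { 1: 0, 2: 0, 3: 0, 4: 0 };
  for (const z of samples) counts[z] += 1;
  const shares = {};
  for (const z of ZONES) {
    shares[z] = samples.length ? counts[z] / samples.length : 0;
  }
  return shares;
}

// Mean of the per-participant shares within one condition.
function averageShares(participants) {
  const avg = { 1: 0, 2: 0, 3: 0, 4: 0 };
  for (const samples of participants) {
    const s = zoneShares(samples);
    for (const z of ZONES) avg[z] += s[z] / participants.length;
  }
  return avg;
}
```

For example, with two participants whose samples are `[1, 1, 2, 3]` and `[1, 1, 1, 1]`, `averageShares` gives 0.75 for zone 1, 0.125 each for zones 2 and 3, and 0 for zone 4.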
Here is a gaze distribution heatmap for the zero condition.
As you can see, participants spend most of the time looking to the front, and the least time looking to the back, which is not surprising and is a nice validation.
For the other conditions, we plot the same heatmap.
Here is the interesting part.
As the size of the stimulus crowd increases, participants spend less time looking to the front, and more time exploring other areas. In fact, if we compare the percentage of time spent in zone 1, which is to the front, we find significant differences [CLICK] between the Medium and Zero conditions, and between the High and Zero conditions. This replicates Milgram’s finding of the drawing power of crowds with a much smaller number of participants, as we were able to observe the participants in a much more granular way.
So there you go, our three replication studies, done with crowdsourcing and remotely, for the first time, in VR.
(Target time: 15min)
In addition, our cost analysis showed that we were able to make VR experimentation 50% cheaper, 67% faster, and easier to implement.
We are also among the first to open-source data and code for VR research, which further contributes to higher reproducibility.
Our work is not without limitations.
The main challenge we still face is the size of the participant pool, which limits the number of experiments that we can run. You can refer to the paper for more discussion of the limitations.
But for now, I want to close with the paper that served as the most direct inspiration for this work, Blascovich et al., 2002.
(Target time 16:00min).
(2002 Blascovich…,) Sixteen years ago, Blascovich et al. imagined a world where VR could be used to run replicable experiments with diverse participants. It was not until now that we finally made it work for the first time.
Nonetheless, our work still has key limitations.
Although we were able to get a few hundred participants, the scope of the experiments is still limited.
We also face issues with device performance, such as battery exhaustion and overheating.
Finally, there are constraints from the physical environment people participate in, and we need to consider important safety issues.
Some of these issues will be mitigated with newer devices, and we hope the research and industry communities continue to build on our work and strive towards a shared vision of
combining VR and crowdsourcing to run scalable and reproducible VR studies.
And with that,
I conclude my talk, and thank my collaborators again, as well as Oath for supporting this work.
Thanks to Crowdsourcing Week for having me, and feel free to reach out.
Thank you very much for your attention.