1. Affective State Prediction Based on Semi-Supervised Learning from Smartphone Touch Data
CHI 2020
Rafael Wampfler, Severin Klingler, Barbara Solenthaler, Victor R. Schinazi, Markus Gross
Hyunwook Lee
2020.09.10
2. Contents
• Paper Overview
• Introduction
• Related Work
• Method
• Experiment
• Results
• Application
• Limitation
4. Introduction
• Basic Concepts
Affective states: the experience of feeling an underlying emotional state
• Affective states vary along three dimensions: valence, arousal, and motivational intensity (a.k.a. dominance)
• Valence: the intrinsic positivity or negativity of an event, object, or situation
• Arousal: the state of being activated, either physiologically or psychologically
• Motivational intensity: the strength of the tendency to approach a positive situation or to move away from a negative one
[Figure: valence, arousal, and dominance dimensions]
9. Introduction: Challenges
• Invasive setups
- Most affective state prediction systems rely on biosensor data (e.g. heart rate), which is invasive and potentially costly
• Data collection
- Prediction quality inherently depends on the quality and amount of data
- User studies are expensive to run, so labeled data is scarce
• Manual feature extraction
- Relies on domain knowledge
- Does not leverage the larger amount of available unlabeled data
All three challenges stem from the data.
Proposed solution: use touch data and semi-supervised learning!
11. Related Work: Data
Multimodal Data [LiKamWa et al., 2013]
- Data collected in a user study with 32 participants over two months
- Data consisting of social interactions, browser history, apps, and GPS
Touch Data [Gao et al., 2012]
- Data collected in a user study with 15 participants
- Data collected while participants played a game
- Data consisting of the pressure and speed of touches
This work uses chat conversations and touch data in a much larger experiment with 70 participants.
[LiKamWa et al., 2013] Robert LiKamWa, Yunxin Liu, Nicholas D. Lane, and Lin Zhong. 2013. MoodScope: building a mood sensor from smartphone usage patterns. In Proceeding of the 11th annual international conference on Mobile systems, applications, and services (MobiSys '13). Association for Computing Machinery, New York, NY, USA, 389–402.
[Gao et al., 2012] Yuan Gao, Nadia Bianchi-Berthouze, and Hongying Meng. 2012. What Does Touch Tell Us about Emotions in Touchscreen-Based Gameplay? ACM Trans. Comput.-Hum. Interact. 19, 4, Article 31 (December 2012), 30 pages.
12. Method: Overview
Heat Map Representation
• Captures the spatial distribution of the measurements
• Extracted from touch data using a sliding window
Variational Autoencoder
• Most of the data is unlabeled
• Learns and extracts latent features from the heat maps
Classification
• Classifies affective states from the extracted features
• Aggregates the heat maps by stacking their latent spaces
13. Method: Heat Map Representation
• Sliding window with a window size of 3 minutes
• Down-Down: typing speed between two consecutive touch-down events
• Up-Down: typing speed between a touch-up event and the subsequent touch-down event
[Figure: pressure, Down-Down, and Up-Down heat maps; color scale from low to high]
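The slide gives only the window size and the three heat map types, so the following is a minimal sketch of how such maps could be built, not the authors' implementation. The grid resolution, the normalized coordinates, the dictionary keys, the step between windows, and the choice to place each interval at the position of the later touch are all assumptions. Time intervals in seconds stand in for the "typing speed" named on the slide.

```python
GRID_ROWS, GRID_COLS = 4, 10   # assumed grid resolution (not given on the slides)
WINDOW_S = 180.0               # 3-minute window, as stated on the slide


def cell(x, y):
    """Map normalized screen coordinates in [0, 1] to a (row, col) grid cell."""
    row = min(int(y * GRID_ROWS), GRID_ROWS - 1)
    col = min(int(x * GRID_COLS), GRID_COLS - 1)
    return row, col


def mean_map(samples):
    """Average the values falling into each cell; cells without touches stay 0.0."""
    total = [[0.0] * GRID_COLS for _ in range(GRID_ROWS)]
    count = [[0] * GRID_COLS for _ in range(GRID_ROWS)]
    for x, y, value in samples:
        r, c = cell(x, y)
        total[r][c] += value
        count[r][c] += 1
    return [[t / n if n else 0.0 for t, n in zip(t_row, n_row)]
            for t_row, n_row in zip(total, count)]


def window_heat_maps(touches, start):
    """Build the three maps for touches whose touch-down lies in [start, start + 3 min).

    touches: time-sorted list of dicts with the hypothetical keys
    "down", "up" (timestamps in seconds), "x", "y", and "pressure".
    """
    win = [t for t in touches if start <= t["down"] < start + WINDOW_S]
    pairs = list(zip(win, win[1:]))  # consecutive touches inside the window
    pressure = mean_map((t["x"], t["y"], t["pressure"]) for t in win)
    # Assumption: each interval is stored at the position of the later touch.
    down_down = mean_map((b["x"], b["y"], b["down"] - a["down"]) for a, b in pairs)
    up_down = mean_map((b["x"], b["y"], b["down"] - a["up"]) for a, b in pairs)
    return pressure, down_down, up_down


def sliding_windows(touches, step_s=60.0):
    """Yield the heat maps of successive windows; the step size is an assumption."""
    if not touches:
        return
    start, last = touches[0]["down"], touches[-1]["down"]
    while start <= last:
        yield window_heat_maps(touches, start)
        start += step_s
```

Each window yields three small images of the keyboard area, which is what lets convolutional layers be applied to typing behavior.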
14. Method: Variational Autoencoder
• Train a variational autoencoder (VAE) to make use of the unlabeled data
• Learning objective: reconstruct the input
• Encoder: convolutional layers; decoder: deconvolutional layers
• During training, the encoder learns a latent vector (high-dimensional heat map → low-dimensional embedding)
• The decoder reconstructs heat maps from samples drawn from the distribution over the latent space
• Note that the VAE encoder learns a normal distribution over the latent features
The trained encoder is reused for classification!
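The two ingredients named above, sampling from the learned normal distribution and reconstructing the input, can be illustrated with the standard VAE objective. This is a sketch of the generic formulation on flattened vectors, not the network from the paper. The squared-error reconstruction term, the `beta` weight, and all function names are assumptions.

```python
import math
import random


def reparameterize(mu, log_var, rng=random):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), one draw per latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL divergence between N(mu, sigma^2) and N(0, 1), summed over dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


def reconstruction_error(x, x_hat):
    """Sum of squared errors between a flattened heat map and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def vae_loss(x, x_hat, mu, log_var, beta=1.0):
    """Reconstruction term plus the (optionally weighted) KL regularizer."""
    return reconstruction_error(x, x_hat) + beta * kl_to_standard_normal(mu, log_var)
```

The loss needs no labels, which is why all recorded touch data can be used to train the encoder before any self-reports are involved.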
15. Method: Classification
• Use the pretrained encoder and fine-tune the entire network for classification
• One autoencoder is trained for each of the three heat map types, and their latent spaces are stacked
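The stacking step can be sketched as concatenating the latent vectors from the three encoders and passing the result to a classification head. The encoders here are placeholder callables, and the linear layer with softmax is an assumed stand-in for the head; fine-tuning is not shown. The three levels (low, medium, high) are the ones reported on the results slide.

```python
import math

HEAT_MAP_TYPES = ("pressure", "down_down", "up_down")
LEVELS = ("low", "medium", "high")


def stack_latents(heat_maps, encoders):
    """Encode each heat map with its own encoder and concatenate the latent vectors."""
    stacked = []
    for name in HEAT_MAP_TYPES:
        stacked.extend(encoders[name](heat_maps[name]))
    return stacked


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features, weights, biases):
    """Linear layer plus softmax; returns the most likely level and all probabilities."""
    scores = [sum(w * f for w, f in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    probs = softmax(scores)
    return LEVELS[probs.index(max(probs))], probs
```

Keeping one encoder per heat map type means each latent space can specialize, while the classifier still sees all three at once.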
16. Experiment: Overview
• Goal
To validate their pipeline for the prediction of affective states based on smartphone touch data
To collect the touch data and ground truth for the prediction
• Method: simple chat conversations and self-reports at regular intervals for the ground truth
• Overview
Participants: 70 participants from ETH Zurich, with equal numbers of male and female participants
• 20 different departments; age 18-31 (mean = 23, SD = 2.7)
• All participants are fluent in English: their level is either "proficient" (C2) or "advanced" (C1) according to the Common European Framework of Reference for Languages
Apparatus
• For chat conversations: Huawei P9+ with Android 7.0; the keyboard software was Gboard with auto-correction and spell-checker disabled
• For self-reports: Huawei MediaPad M2 tablet
Procedure
• 1) Pre-experiment
• Survey: participants complete a survey measuring mental health and personality traits before the experiment
• Introduction: participants are given an oral overview of the experiment
• Small talk: participants answer 6 basic questions, watch a nature video for relaxation, and type well-known pangrams as baseline touch input
• 2) Main Task
• Chat conversation: participants engage in four different Skype conversations with four different contacts: shocking, exciting, rude, and confusing
• Self-reporting: every 90 seconds, participants fill in a self-report
21. Experiment: Self-Reporting
[Figure: Affective States Report (SAE); Stress and Emotion Level Report]
Affective States During Conversation
• In Exciting, Confusing Conversation
• Valence & Dominance increase
• Arousal drops
• In Rude, Shocking Conversation
• Valence & Dominance decrease
• Arousal increases
• The average duration of the rude and confusing conversations is shorter than that of the exciting and shocking ones
• The rude and shocking conversations seemed to be more intense than the exciting and confusing ones
22. Results: Affective State Prediction
Valence
• Down-Down speed is the best predictor
• Low and high levels are most often confused with the medium level
Arousal
• Pressure is the best predictor
• The medium level is most often confused with the low level
Dominance
• All heat maps perform equally well
• Similar to arousal, but medium dominance is more often confused with high dominance
23. Results: Others
• Basic Emotion and Stress Prediction
Anger: 87% (0.84 AUC), 15.7% of data
Happiness: 81% (0.88 AUC), 33.1% of data
Stress: 92% (0.80 AUC), 9.3% of data
Sadness: 84% (0.87 AUC), 21.0% of data
Surprise: 84% (0.76 AUC), 16.0% of data
• Affective Sequence Analysis
• Meaning of the sequence length:
• If 0, all data points are considered
• Otherwise (1, 2, or 3), only data points with at least that many preceding data points with the same label are considered
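The selection rule for the sequence analysis can be sketched as a filter over a time-ordered label sequence. This sketch assumes that "preceding" means immediately preceding and consecutive within one participant's sequence; the function name and the example labels are hypothetical.

```python
def points_with_history(labels, n):
    """Return indices of points preceded by at least n consecutive same-label points.

    With n = 0 every index is returned, matching the "all data points" case.
    """
    if n < 0:
        raise ValueError("sequence length must be non-negative")
    keep = []
    run = 0  # number of immediately preceding points that share the current label
    for i, label in enumerate(labels):
        run = run + 1 if i > 0 and label == labels[i - 1] else 0
        if run >= n:
            keep.append(i)
    return keep


labels = ["low", "low", "low", "high", "high", "low"]
for n in range(4):
    print(n, points_with_history(labels, n))
```

Longer required histories keep only points inside stable stretches of one affective state, so fewer points remain as the sequence length grows.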
24. Application: Self-Awareness
• Can be used in therapeutic chatbots such as Woebot
• With this approach, the bot can improve its mood prediction accuracy
• Can be used as a self-awareness tool
• With a simple pie chart, users can see their current status
25. Limitation
• In experiment…
All experiments were conducted in a laboratory, so it is unclear whether the approach works equally well outside
Participants consisted only of bachelor's and master's students
Self-reporting every 90 seconds may itself change participants' moods
Only heat maps are used for the prediction
For the heat maps, especially the two speed heat maps, it is questionable whether treating them as one bundle is right: they are highly individual-dependent, so predictions may be wrong or too generalized