Wearables are embracing AI, transforming the way we live, act, learn, and behave as social human beings. However, realising this pop-sci rhetoric is nothing short of an enormous, multifaceted challenge. In this talk, I will explore the systems and algorithmic challenges of modelling behaviour in this augmented-human era. In particular, I will discuss how an "Earable" can be used as a multi-sensory computational platform to learn and infer human behaviour and to design ultra-personal connected services.
3. Cognitive Assistant - A Seamless Extension of Inner Human Cognition
- 24/7 contextual assistant strengthening willpower
- Safety and adherence assistive guidance
4. Help us to communicate better
Help us to sleep better
Help us to focus better
Help us to remember and recall better
5. Behavioural UX
- Cross-device interactions
- Spans across space and time
- Ultra-personalised
Accessing everything. Controlling everything. Understanding everything.
Sensing + understanding you and the world around you.
7. AI-Assisted Quantified Enterprise
Implication: People and Space Analytics. Location is the key context; social signals can be extracted from location traces.
Case study: Web Summit 2015 @ Dublin, the largest tech conference on the planet. 40K+ attendees from 134 countries in a ±6,000 sq. m venue: startups, entrepreneurs, investors, and more.
Lessons
- Long-term feedback
- Actionable feedback
- Community-driven feedback
- Privacy plays a critical role in users' decision-making process
- Form factor needs a primary, established purpose for sustainable engagement
Understand, quantify, and radically transform how people interact, feel, collaborate, and work together in the real enterprise for personal, group, and wider organisational efficiency.
ACM UbiComp 2015, 2016, ICMI 2016, MobileHCI 2016
8. Lessons
- Actionable and long-term feedback at the right moment is key to sustainable engagement
- Battery performance is critically important
- Privacy plays a critical role in users' decision-making process
- Form factor needs a primary, established purpose for sustainable engagement
12. Earables: the most personal device yet
- With immediate and subtle interaction
- Unique placement for robust sensing
- Intimate and privacy-preserving
- With an established purpose
- Aesthetically beautiful
- Ergonomically comfortable
[Diagram: Sense → Learn → Act loop; sensors feed AI/ML models]
13. eSense Earable
[Figure: Signal-to-noise ratio (SNR) of eSense compared to a smartphone and a smartwatch for motion and audio sensing]
- CSR processor, flash memory
- 45 mAh Li-Po battery, contact charging
- Speaker, microphone, push button, multi-colour LED
- 6-axis IMU sensor
- Bluetooth/BLE
- Size: 18 x 18 x 20 mm; weight: 20 g
IEEE Pervasive 2018
15. eSense Earable
HEAD GESTURE: detection of basic head gestures (nodding and shaking) with IMU signals.
SIGNAL BEHAVIOUR: cleaner signals from the earbuds due to their unique placement.
[Figure: gyroscope and accelerometer traces while nodding and shaking]
MULTIMODAL MODEL: statistical features from the accelerometer and gyroscope, combined and fed to a nearest-neighbour classifier over the classes nodding, shaking, and other (a code sketch follows below).
PERFORMANCE: over 90% accuracy with the accelerometer only; the gesture set can be further expanded to tilting, turning, and more.
[Figure: F1 scores of the fusion, accelerometer, and gyroscope classifiers, ranging from 0.85 to 0.93]
ACM WearSys 2018
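As a minimal sketch of the pipeline above, the following assumes per-window summary statistics over synchronised accelerometer and gyroscope data and a k-nearest-neighbour classifier; the window length, exact feature set, and k are illustrative assumptions, not the published configuration.

```python
# Hypothetical sketch: statistical IMU features + nearest-neighbour classifier
# for nodding/shaking/other. Feature set and k are assumptions.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def statistical_features(window: np.ndarray) -> np.ndarray:
    """Per-axis summary statistics for one IMU window of shape (samples, axes)."""
    return np.concatenate([
        window.mean(axis=0),
        window.std(axis=0),
        window.min(axis=0),
        window.max(axis=0),
    ])

def featurise(accel: np.ndarray, gyro: np.ndarray) -> np.ndarray:
    """Combined feature vector from accelerometer and gyroscope windows."""
    return np.concatenate([statistical_features(accel), statistical_features(gyro)])

# X_train: one row per labelled window; y_train: {"nodding", "shaking", "other"}
clf = KNeighborsClassifier(n_neighbors=3)
# clf.fit(X_train, y_train)
# clf.predict(featurise(accel_win, gyro_win)[None, :])
```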
16. eSense Earable
PHYSICAL ACTIVITY: detection of basic activities with IMU signals: stationary, walking, stepping up, and stepping down.
SIGNAL BEHAVIOUR: cleaner signals from the earbuds due to small head movements.
[Figure: accelerometer and gyroscope traces while walking]
MULTIMODAL MODEL: statistical features from the accelerometer and gyroscope, combined and fed to a nearest-neighbour classifier over the classes stationary, walking, stepping up, stepping down, and other (a windowing sketch follows below).
PERFORMANCE: over 90% accuracy with the accelerometer alone; more robust to placement than a watch or phone.
[Figure: average F1 scores of the fusion, accelerometer, and gyroscope classifiers, ranging from 0.62 to 0.96]
ACM WearSys 2018
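Continuous activity detection works over a sliding window of the IMU stream; the sketch below assumes a 50 Hz sampling rate and 2 s windows with 50% overlap, which are illustrative values rather than the published settings.

```python
# Minimal sketch of sliding-window segmentation for a continuous IMU stream.
# Sampling rate, window length, and overlap are assumptions.
import numpy as np

def sliding_windows(signal: np.ndarray, fs: int = 50,
                    win_s: float = 2.0, overlap: float = 0.5):
    """Yield fixed-length windows over a (samples, axes) IMU signal."""
    win = int(win_s * fs)
    step = int(win * (1.0 - overlap))
    for start in range(0, len(signal) - win + 1, step):
        yield signal[start:start + win]

# Each window is featurised and classified as stationary / walking /
# stepping up / stepping down / other, as in the pipeline above.
```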
17. eSense Earable
DIET: detection of dietary activities (drinking and chewing) with IMU and audio signals.
SIGNAL BEHAVIOUR: cleaner signals from the earbuds due to small head movements.
[Figure: audio spectrogram when chewing; gyroscope data when drinking]
MULTIMODAL MODEL: MFCCs from the microphone plus statistical features from the accelerometer and gyroscope, combined and fed to a random forest over the classes drinking, chewing, and other (sketched below).
PERFORMANCE: 78% accuracy for the fusion classifier even with simple features; it outperforms the single-sensor classifiers.
[Figure: F1 scores for chewing and drinking across the fusion, audio-only, and IMU-only classifiers, ranging from 0.21 to 0.78]
ACM MobiSys 2018
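A hedged sketch of the fusion classifier named above: MFCCs from the in-ear microphone concatenated with simple IMU statistics and fed to a random forest. The feature dimensions, sampling rate, and forest size are assumptions.

```python
# Sketch: audio (MFCC) + IMU statistical features -> random forest.
# Parameter choices are assumptions, not the published configuration.
import numpy as np
import librosa
from sklearn.ensemble import RandomForestClassifier

def audio_features(audio: np.ndarray, sr: int = 16000) -> np.ndarray:
    """Mean MFCC vector over one audio window."""
    mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=13)
    return mfcc.mean(axis=1)

def imu_features(imu: np.ndarray) -> np.ndarray:
    """Per-axis statistics over one IMU window of shape (samples, axes)."""
    return np.concatenate([imu.mean(axis=0), imu.std(axis=0)])

def fused_features(audio: np.ndarray, imu: np.ndarray, sr: int = 16000) -> np.ndarray:
    return np.concatenate([audio_features(audio, sr), imu_features(imu)])

clf = RandomForestClassifier(n_estimators=100)
# clf.fit(X_train, y_train)  # labels: {"chewing", "drinking", "other"}
```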
18. eSense Earable
HEART RATE: capable of detecting heart rate from the in-ear microphone.
SIGNAL BEHAVIOUR: the amplified sound of heartbeats is easily captured due to the earbud's placement.
[Figure: raw in-ear microphone signal following the ECG pattern]
SIGNAL PROCESSING: microphone → low-pass filter → amplifier → peak detector → heart rate. Simple filtering and peak detection are enough for reliable detection (a sketch follows below).
PERFORMANCE: average error of 2.4 BPM (ours: 81.6 BPM vs. ground truth 84.0 BPM).
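The filter-and-peak-detect chain above translates almost directly into code. The sketch below assumes a 1 kHz microphone stream, a sub-50 Hz heart-sound band, and a minimum beat spacing of 0.4 s; all of these are assumptions, not the talk's parameters.

```python
# Minimal sketch of the heart-rate pipeline: low-pass filter, normalisation,
# peak detection, BPM from mean peak interval. Cutoff and thresholds assumed.
import numpy as np
from scipy.signal import butter, filtfilt, find_peaks

def heart_rate_bpm(mic: np.ndarray, fs: int = 1000) -> float:
    """Estimate heart rate from an in-ear microphone segment."""
    # Low-pass filter to isolate low-frequency heart sounds.
    b, a = butter(4, 50, btype="low", fs=fs)
    filtered = filtfilt(b, a, mic)
    # Normalise (stand-in for the amplifier stage), then detect peaks
    # at least 0.4 s apart (i.e., below ~150 BPM).
    env = np.abs(filtered) / (np.abs(filtered).max() + 1e-9)
    peaks, _ = find_peaks(env, distance=int(0.4 * fs), height=0.3)
    if len(peaks) < 2:
        return float("nan")
    mean_interval = np.diff(peaks).mean() / fs  # seconds per beat
    return 60.0 / mean_interval
```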
19. eSense Earable
CONVERSATION: detection of speech segments using the IMU and a simple, lightweight classifier.
SIGNAL BEHAVIOUR: cleaner phase response from the IMU for detecting speech segments.
MULTIMODAL MODEL: statistical features from the accelerometer and gyroscope, combined and fed to an SVM over the classes speaking and non-speaking.
PERFORMANCE: 85% accuracy in speaking detection with inertial sensors only; much more robust to ambient noise (e.g., a nearby person speaking); enables an energy-efficient trigger for the more expensive microphone (sketched below).
[Figure: F1 scores of the audio, IMU, and combined classifiers, from 0.65 to 0.88, with a ~20% gain annotated]
ACM WellComp 2018
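The energy-efficient trigger mentioned above can be expressed as a tiny gating function: the cheap IMU classifier runs continuously and the costly audio path wakes only on detected speech. The function and argument names below are hypothetical, illustrating the control flow only.

```python
# Sketch of IMU-gated audio processing. `capture_audio` and `audio_model`
# are hypothetical callables standing in for the microphone path.
def gated_audio_inference(imu_label: str, capture_audio, audio_model):
    """Duty-cycle the microphone: run the audio model only on IMU-detected speech."""
    if imu_label != "speaking":
        return None                      # keep the costly audio path asleep
    return audio_model(capture_audio())  # wake microphone only for speech
```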
20. eSense Earable
FACIAL EXPRESSION: fusion of IMU and audio signals with an SVM followed by HMM smoothing.
SIGNAL BEHAVIOUR: cleaner phase response from the IMU for detecting facial expressions.
[Figure: gyroscope data against camera ground truth, showing stationary, pull-up, movement, and pull-down phases]
MULTIMODAL MODEL: MFCCs from the microphone plus statistical features from the accelerometer and gyroscope, with feature selection, an SVM over intermediate states, and HMM smoothing into the classes laugh, smile, frown, and other (smoothing sketched below).
PERFORMANCE: 70-80% F1 score with statistical features; high user variability for the 'smiling' expression.
[Figure: per-class F1 scores for other, smile, laugh, and frown, ranging from 0.61 to 0.81]
ACM AH 2019
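The HMM smoothing step suppresses frame-level flicker in the SVM's outputs. Below is a self-contained Viterbi sketch over per-frame class posteriors with a "sticky" transition matrix; the stickiness value and uniform off-diagonal transitions are assumptions, not the trained HMM.

```python
# Sketch: Viterbi smoothing of frame-level SVM posteriors with a sticky
# transition matrix. All matrix values are assumptions.
import numpy as np

def viterbi_smooth(probs: np.ndarray, stay: float = 0.9) -> np.ndarray:
    """probs: (frames, classes) posteriors; returns smoothed label indices."""
    n, k = probs.shape
    trans = np.full((k, k), (1 - stay) / (k - 1))
    np.fill_diagonal(trans, stay)
    log_p = np.log(probs + 1e-9)
    log_t = np.log(trans)
    score = np.zeros((n, k))
    back = np.zeros((n, k), dtype=int)
    score[0] = log_p[0]
    for t in range(1, n):
        cand = score[t - 1][:, None] + log_t   # (prev_state, cur_state)
        back[t] = cand.argmax(axis=0)
        score[t] = cand.max(axis=0) + log_p[t]
    path = np.zeros(n, dtype=int)
    path[-1] = score[-1].argmax()
    for t in range(n - 2, -1, -1):
        path[t] = back[t + 1, path[t + 1]]
    return path   # e.g., indices into ["laugh", "smile", "frown", "other"]
```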
21. Situation-Aware Conversational Agent
KEY OBJECTIVE: bringing cognition to conversational agents to radically transform their ability to assist and augment humans.
KEY APPLICATIONS: customer experience, conversational commerce, digital health, entertainment, education, home automation, and lifestyle.
KEY INNOVATION:
- AI-assisted software platform to understand emotion and situation at personal scale.
- AI-as-a-Service that enables conversational agents to become situation-aware and dynamically adjust their conversation style, tone, and volume in response to the user's emotional, social-activity, and environmental context (a rule sketch follows below).
Awareness dimensions: emotion awareness, sociality awareness, activity awareness, real-time adaptation.
KEY NUMBERS: 97.8% recognition accuracy; 1.2 s recognition latency; 2.48 s end-to-end latency.
ACM ACII 2019
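To make the adaptation concrete, here is a hedged sketch of how recognised context might map to speech parameters. The `Context` fields, rules, and thresholds are illustrative assumptions, not the system's actual policy.

```python
# Illustrative context-to-speech adaptation rules. Fields, thresholds, and
# rule outcomes are assumptions for exposition only.
from dataclasses import dataclass

@dataclass
class Context:
    emotion: str            # e.g. "stressed", "neutral", "happy"
    in_conversation: bool   # sociality awareness
    ambient_noise_db: float # environment dynamics

def speech_params(ctx: Context) -> dict:
    """Choose conversation style, tone, and volume from the current context."""
    if ctx.in_conversation:
        return {"style": "defer", "volume": 0.0}   # stay silent, queue the reply
    volume = 0.8 if ctx.ambient_noise_db > 70 else 0.5
    tone = "calm" if ctx.emotion == "stressed" else "neutral"
    return {"style": "brief", "tone": tone, "volume": volume}
```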
22. 360° Wellbeing Management and Cognitive Augmentation
Implication: People and Space Analytics, Stress and Happiness Analytics, Physical Social Network.
Key Objective: understand, quantify, and radically transform how people interact, feel, collaborate, and work together in the real enterprise for personal, group, and wider organisational efficiency.
End-to-End Architecture
APP:
- Audio and motion sensor processing
- Speech and one-touch interactions
- HD-quality music
- Speech recognition and speech synthesis
- Notification management
- Context processing and BLE localisation
- External service interaction
- Conversational agent
- Selective rule engines
Inference Engine for Realtime Context Awareness:
- Audio → conversational activity, environment dynamics, emotion
- Motion → head gesture, physical activity
- Location → face-to-face interaction
AI models consume MFCCs, statistical features, and BLE RSS derived from the accelerometer, gyroscope, BLE radio, and microphone.
CONTEXT PRIMITIVES: heart rate; emotion and stress; eating and drinking; conversation; ambient environment; stationary / walking / on-transport; head gesture; placement; social interaction; proxemic interaction.
CONFIGURATION: sampling rate and duty cycle per primitive, plus packet interval for BLE primitives (a configuration sketch follows below).
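A minimal sketch of per-primitive sensing configuration as described above; the field names mirror the slide (sampling rate, duty cycle, packet interval), but the class shape and default values are assumptions.

```python
# Sketch of context-primitive configuration. Defaults are assumptions.
from dataclasses import dataclass

@dataclass
class SensorConfig:
    sampling_rate_hz: int = 50     # sensor sampling rate
    duty_cycle: float = 0.25       # fraction of time the sensor is active
    packet_interval_ms: int = 100  # BLE packet interval (BLE primitives only)

# Hypothetical registry mapping context primitives to their configuration.
CONTEXT_PRIMITIVES = {
    "head_gesture": SensorConfig(sampling_rate_hz=50, duty_cycle=1.0),
    "conversation": SensorConfig(sampling_rate_hz=16000, duty_cycle=0.1),
    "proxemic_interaction": SensorConfig(sampling_rate_hz=1, packet_interval_ms=500),
}
```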
23. Interaction with people, places, and things on the go
- Feedback on physical and mental wellbeing
- Feedback on collaboration and social behaviour
- Personalised recommendations on wellbeing
28. Design for Multiplicity - Cognitive Orchestration
Multiple devices offer more, better, and longer learning opportunities, at the expense of significant complexity.
CHALLENGE 1: How to select, combine, and compose devices to contextually construct a dynamic sensing pipeline for the highest QoS?
29. COGNITIVE ORCHESTRATION
Multi-Device Sensory AI Systems: select and orchestrate the best devices for the task at hand, maximising accuracy and minimising energy.
- Learning the runtime sensing quality of multiple devices using a Siamese neural network (sketched below)
- Predicting the best inference path, addressing device and usage variability
- Eliminating redundant computation
Results: 2x accuracy gain at the expense of 13 mW of energy; 4x energy gain, inversely proportional to the number of devices.
Evaluated on motion-based physical activity detection and audio-prosody-based emotion detection.
SenSys 2019
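A hedged PyTorch sketch of the Siamese idea named above: a shared encoder embeds feature windows from two candidate devices, and a head scores which device currently offers the better sensing quality. The architecture and layer sizes are assumptions, not the published design.

```python
# Sketch of a Siamese network for pairwise device-quality comparison.
# Architecture and dimensions are assumptions.
import torch
import torch.nn as nn

class QualityEncoder(nn.Module):
    def __init__(self, in_dim: int = 64, embed_dim: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 32), nn.ReLU(),
            nn.Linear(32, embed_dim),
        )

    def forward(self, x):
        return self.net(x)

class SiameseQuality(nn.Module):
    """Scores which of two devices offers the better inference path."""
    def __init__(self):
        super().__init__()
        self.encoder = QualityEncoder()   # shared weights for both branches
        self.head = nn.Linear(16, 1)

    def forward(self, x_a, x_b):
        diff = self.encoder(x_a) - self.encoder(x_b)
        return torch.sigmoid(self.head(diff))  # P(device A beats device B)
```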
30. Design for Robustness - Cognitive Translation
Every single execution environment (sensor, device, OS, user) is different.
CHALLENGE 2: How to build robust sensory systems for 100 billion AI devices (some of which are not invented yet)? Guarantee that a model maintains its functional behaviour across heterogeneous conditions.
- Environment-to-environment translation
- Device-to-device translation
- OS-to-OS translation
- Sensor-to-sensor translation
31. COGNITIVE TRANSLATION
Robust and Future-Proof Sensory AI Systems: sensory models that work irrespective of how and where the sensor data is collected.
- Generative models for domain adaptation and domain generalisation
- A new model architecture built on CycleGAN principles for learning domain-translation functions (a minimal sketch follows below)
Results: recover up to 90% of the accuracy lost due to device variability using 15 minutes of unlabelled data.
[Figure, Case 1: accuracy loss and recovery under audio device variability (iPhone, S8, Mic2Mic). Case 2: accuracy loss and recovery under motion user variability (thigh, chest, Accel2Accel)]
IPSN 2018, IPSN 2019
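To illustrate the cycle-consistency principle behind the translation models, here is a minimal sketch: two 1D generators map signals between domains (e.g., microphone A and microphone B) and are penalised when a round trip fails to reproduce the input. The generator architecture is an assumption, and the full method also trains adversarial discriminators, which are omitted here.

```python
# Sketch of the cycle-consistency loss for unpaired 1D signal translation.
# Architecture and loss weighting are assumptions; discriminators omitted.
import torch
import torch.nn as nn

def make_generator(channels: int = 1) -> nn.Module:
    return nn.Sequential(
        nn.Conv1d(channels, 16, kernel_size=9, padding=4), nn.ReLU(),
        nn.Conv1d(16, channels, kernel_size=9, padding=4),
    )

g_ab, g_ba = make_generator(), make_generator()  # domain A -> B and B -> A

def cycle_loss(x_a: torch.Tensor, x_b: torch.Tensor) -> torch.Tensor:
    """L1 round-trip loss: A -> B -> A and B -> A -> B should reconstruct."""
    l1 = nn.L1Loss()
    return l1(g_ba(g_ab(x_a)), x_a) + l1(g_ab(g_ba(x_b)), x_b)
```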
32. Qualitative insights need to shape the system's runtime behaviour.
CHALLENGE 3: How to extend and shape the system's behaviour at different phases in a personalised way? Turn user interaction into learning parameters.
33. COGNITIVE EXTENSION
Privacy-Preserving and Personalised Extension of Sensory AI Systems: extend a sensory system's abilities in a personalised, user-defined way using on-device continual learning.
[Diagram: models for running, cycling, and swimming added incrementally over time via on-device continual learning]
- Semi-supervised Bayesian continual learning (the combined loss is sketched below)
- Small and imperfectly labelled supervised datasets
- Rich approximate posteriors with uncertainty estimates
- Training combines an upper-bounded KL loss between the approximate posterior and the prior, a cross-entropy loss on labelled data (x, y), and a JSD consistency loss across augmentations (x', x'') of unlabelled data
Results:
- 90% accuracy across multiple learning periods for extension
- Only 10% of the data is retained
- Labelled-data reduction of 80%
[Figure: continual-learning accuracy for motion tasks (other, walking, sitting, walking upstairs, walking downstairs, standing, laying) over five periods; per-period counts of retained vs. new samples]
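A hedged sketch of the combined loss named on the slide: cross-entropy on the small labelled set, a KL term tying the new model's predictions to the previous period's (prior) model, and a JSD consistency term across augmentations of unlabelled data. The equal weighting and the simplification of the Bayesian posterior to point predictions are assumptions.

```python
# Sketch of a semi-supervised continual-learning objective:
# CE (labelled) + KL to prior model (unlabelled) + JSD consistency (augmented).
import torch
import torch.nn.functional as F

def jsd(p: torch.Tensor, q: torch.Tensor) -> torch.Tensor:
    """Jensen-Shannon divergence between two batches of distributions."""
    m = 0.5 * (p + q)
    return 0.5 * (F.kl_div(m.log(), p, reduction="batchmean")
                  + F.kl_div(m.log(), q, reduction="batchmean"))

def continual_loss(model, prior_model, x_lab, y_lab, x_unlab, x_aug):
    ce = F.cross_entropy(model(x_lab), y_lab)
    # Keep the new model close to the previous period's predictive distribution.
    with torch.no_grad():
        prior_probs = F.softmax(prior_model(x_unlab), dim=-1)
    kl = F.kl_div(F.log_softmax(model(x_unlab), dim=-1),
                  prior_probs, reduction="batchmean")
    # Consistency between unlabelled samples and their augmentations.
    consistency = jsd(F.softmax(model(x_unlab), dim=-1),
                      F.softmax(model(x_aug), dim=-1))
    return ce + kl + consistency
```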
34. Design for Efficiency (and Privacy) - Cognitive Efficiency
Inference performance, privacy protection, and energy awareness: scale down cloud-scale algorithms to run locally on devices.
CHALLENGE 4: Where will we find the next 10x gain?
35. COGNITIVE EFFICIENCY
A Privacy-Preserving Software Accelerator for Sensory AI Systems, targeting inference performance, privacy protection, and energy awareness.
- Online model compression: compress deep neural networks with negligible degradation in accuracy; factorisation reduces memory and computational requirements (a factorisation sketch follows below).
- Dynamic model fusion: simultaneous execution of multiple models through parallelisation of parameter-heavy and computation-heavy layers; 1.5x gains in overall execution time with runtime model fusion.
- Optimal resource allocation: reduce the energy footprint of neural networks and allocate an optimal set of resources at runtime.
IPSN 2016, SenSys 2016, MobiSys 2017, IEEE Pervasive 2017
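As a minimal illustration of factorisation-based compression, the sketch below replaces one linear layer with two thin ones via a truncated SVD, cutting parameters from out·in to rank·(out+in). The rank choice is an assumption, and real deployments typically fine-tune after compression.

```python
# Sketch: low-rank factorisation of a linear layer via truncated SVD.
# Rank is an assumption; fine-tuning after compression is omitted.
import torch
import torch.nn as nn

def factorise_linear(layer: nn.Linear, rank: int) -> nn.Sequential:
    """Replace W (out x in) with two thin matrices of the given rank."""
    U, S, Vh = torch.linalg.svd(layer.weight.data, full_matrices=False)
    first = nn.Linear(layer.in_features, rank, bias=False)
    second = nn.Linear(rank, layer.out_features, bias=layer.bias is not None)
    # W ~= U[:, :r] @ diag(S[:r]) @ Vh[:r]; split across the two layers.
    first.weight.data = (torch.diag(S[:rank]) @ Vh[:rank]).contiguous()
    second.weight.data = U[:, :rank].contiguous()
    if layer.bias is not None:
        second.bias.data = layer.bias.data.clone()
    return nn.Sequential(first, second)
```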
36. Design needs to shape the understanding ability of IoT systems.
From recognition to understanding: {Design}-enabled understanding (comfort, memorable, conversation).
CHALLENGE 5: How to define the learning targets based on UX, and not on literals, towards a universal understanding model?
37. Intelligibility
Engage users and keep them informed about the system's behaviour.
CHALLENGE 6: How to embed intelligibility in a sensory system's behaviour? Answer the WHY?
38. Design for AI Failure
Design needs to guide the failover strategy of AI-assisted wearables.
CHALLENGE 7: How to guide the intelligibility of sensory systems in dealing with failure, and in deciding when to engage the human for the right UX?