SW Data Meetup talk - Tom Diethe - 28 March 2017
Dr Tom Diethe, Research Fellow for the £15 million SPHERE Interdisciplinary Research Collaboration (IRC) at the University of Bristol, will introduce the SPHERE project, which is designing a platform for eHealth in a smart-home context. This platform is currently being deployed into homes throughout Bristol. You may have seen the SPHERE House featured on the BBC's 'Joy of Data' documentary or in the wonderful Aardman animated overview of the project.
This talk will focus on the Data Science and Machine Learning challenges and opportunities of SPHERE. Tom will discuss the implications for such an eHealth system in terms of the quantification and management of uncertainty for automated decision making in health care, gathering the necessary data to train Machine Learning models, and the importance of calibration in such systems, particularly in light of the differing operational contexts that will be encountered.
Due to well-known demographic challenges, traditional regimes of health-care are in need of re-examination. Many countries are experiencing the effects of an ageing population, which, coupled with a rise in chronic health conditions, is expediting a shift towards the management of a wide variety of health-related issues in the home. In this context, advances in Ambient Assisted Living are providing resources to improve the experience of patients, as well as informing necessary interventions from relatives, carers and health-care professionals.
SPHERE has developed a number of different sensors that will combine to build a picture of how we live in our homes. This information can then be used to spot issues that might indicate a medical or well-being problem. The technology could help in the following ways:
• Characterise the sedentary behaviour that is linked to so many conditions.
• Detect correlations between factors such as diet and sleep.
• Measure changes in movement, posture and patterns of movement over months.
• Analyse eating behaviour - including whether people are taking prescribed medication.
• Detect periods of depression or anxiety and intervene using a computer-based therapy.
• Predict falls and detect strokes so that help may be summoned.
Further details of SPHERE can be found at www.irc-sphere.ac.uk
Dr. Diethe is a Research Fellow on the SPHERE project at the University of Bristol, where he specialises in probabilistic methods for machine learning and data fusion, including time-series modelling of human behaviour, unsupervised, active, and transfer learning approaches. He has a Ph.D. in Machine Learning applied to multivariate signal processing from UCL, and was employed by Microsoft Research Cambridge where he co-authored a book titled “Model-Based Machine Learning”, an early access online version of which is available at http://www.mbmlbook.com. Contact him at tom.diethe@bristol.ac.uk
1. Data fusion and mining in SPHERE
Challenges & Opportunities for Machine Learning
Tom Diethe
Intelligent Systems Laboratory
University of Bristol
2. 21st Century Healthcare Challenges
UK: 1.4 million people aged over 85, rising to 3.6 million by 2035
Japan: will have the oldest population in human history by 2050 (52 yrs)
China: a retired population larger than Europe
Ageing populations living with long-term health conditions: obesity, diabetes, depression, heart disease, dementia …
Technological solution to fill the gap between expectations and reality of healthcare
3. Environmental
Temperature, light level,
humidity, air quality
Water & electricity consumption
Video
emotion, gait, activity, interaction
Wearables
activity, sleep, etc.
Contextual information
medical history, demographics
Feedback
medical practitioner, users
4. Use Cases
• Clinician with information need
• Hip injury example – how is their gait, are they walking better?
• Causal relations – are there any common patterns of behaviour that lead to health issues?
• Information back to the user – better health-enhancing choices
• Early warning system
• Disease progress/treatment effectiveness
• Many more …
8. Prediction and Modelling
Who is in the house
What are they doing
When are activities happening
Where are these happening
Why does this matter?
9. Is this Big Data?
• Sensors
• Heterogeneous
• Noisy/intermittent
• Different spatial/temporal resolutions
• Velocity ✔ Variety ✔ Volume ?
10. What’s Important?
• Quantification of uncertainty
• Transparent models
• Online Learning: models must adapt to changing habits
• How to incorporate medical history
• Daily/Weekly/Seasonal patterns
• Personalisation
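The "online learning" and "quantification of uncertainty" bullets can be illustrated with a minimal sketch. It assumes a single binary habit (for example, whether an activity occurred in a given hour) tracked by a Beta-Bernoulli model whose pseudo-counts decay by a hypothetical `forget` factor. The class name, the parameter and the toy data are assumptions made for illustration; this is not the model used in SPHERE.

```python
class ForgettingBetaBernoulli:
    """Beta-Bernoulli estimate with exponential forgetting (illustrative only)."""

    def __init__(self, alpha=1.0, beta=1.0, forget=0.9):
        self.alpha = alpha    # pseudo-count of "habit observed"
        self.beta = beta      # pseudo-count of "habit not observed"
        self.forget = forget  # hypothetical decay factor; 1.0 means never forget

    def update(self, observed):
        # Decay all accumulated evidence (including the prior), then add the new observation.
        self.alpha = self.forget * self.alpha + (1.0 if observed else 0.0)
        self.beta = self.forget * self.beta + (0.0 if observed else 1.0)

    def mean(self):
        return self.alpha / (self.alpha + self.beta)

    def variance(self):
        a, b = self.alpha, self.beta
        return a * b / ((a + b) ** 2 * (a + b + 1.0))


def track(observations, forget):
    """Return the running posterior mean after each observation."""
    model = ForgettingBetaBernoulli(forget=forget)
    means = []
    for observed in observations:
        model.update(observed)
        means.append(model.mean())
    return means


# Toy data: the habit holds for 30 steps, then stops for 30 steps.
habit = [True] * 30 + [False] * 30
adaptive = track(habit, forget=0.9)  # forgets old evidence
static = track(habit, forget=1.0)    # weighs all evidence equally
```

With forgetting, the estimate follows the change in habit; without it, the estimate settles at the long-run average, and the variance shrinks even though the behaviour has changed.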
11. Model-Based Machine Learning
• Model uncertainty using probabilities
• frequency
• belief
• Model contains variables and factors
1. Build a model
2. Incorporate observations
3. Perform inference over “latent” variables
12. What is a Model?
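• A simulator programme
var random = new Random();
// NextDouble() is uniform on [0, 1), so "> 0.5" gives a fair coin
bool A = random.NextDouble() > 0.5;
bool B = random.NextDouble() > 0.5;
bool C = A & B;
[Figure: factor graph in which A and B feed an & factor whose output is C]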
• A set of assumptions
1. A coin has an equal chance of landing on heads or tails
2. Coin tosses are independent
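The two-coin model is small enough to run the "build a model, incorporate observations, perform inference" steps by brute force. The sketch below is in Python rather than Infer.NET and uses exhaustive enumeration, which is an assumption made for illustration; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def joint():
    """Yield (a, b, c, probability) for every outcome of the two-coin model."""
    half = Fraction(1, 2)  # assumption 1: each coin is fair
    for a, b in product([True, False], repeat=2):
        # Assumption 2: tosses are independent, so probabilities multiply.
        yield a, b, (a and b), half * half


def posterior_a_given_c(c_observed):
    """P(A = heads | C = c_observed), summing the joint over consistent outcomes."""
    evidence = sum(p for a, b, c, p in joint() if c == c_observed)
    numerator = sum(p for a, b, c, p in joint() if c == c_observed and a)
    return numerator / evidence


# Marginal probability that both coins are heads.
p_c_true = sum(p for a, b, c, p in joint() if c)
# Observing C = False lowers the belief that A was heads below the prior of 1/2.
p_a_given_not_c = posterior_a_given_c(False)
```

Enumeration only works for tiny discrete models; the point is that observing C changes the belief about the latent variable A.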
18. SPHERE House Script
<<<<Downstairs>>>>
Living Room
Enter the room and close the door behind you
Stand facing the mirror and jump twice.
Turn light on
Go to window and open and close the curtains
Take off shoes
Artificial activities … repeat 5 times with 3 seconds between each
stand to bend
bend to stand
stand to kneel
kneel to stand
stand to sit
sit to lie (back)
lie (back) to lie (side) (on sofa)
lie (side) to lie (back)
lie (back) to sit
sit to stand
cough
Turn light off
19. SPHERE Challenge
• Task: predict posture and ambulation labels given the sensor data
• Accelerometer, RGB-D and environmental data
20. SPHERE Challenge
• Data was collected from a script in the SPHERE house
• 10 participants, ~20-30 minutes per script
• Even split between training and testing data
• Test data split into short 10-30s sequences
https://www.drivendata.org/competitions/42/senior-data-science-safe-aging-with-sphere
bit.ly/sphere-challenge
21. Targets
• Each sequence was annotated at least twice
• Not all annotators will agree all of the time
• Start/end time of annotations may not be aligned
• Actual label assigned to a time interval may not agree
• Task: predict mean annotation on a per-second basis; the targets are probabilistic
• Also provided localisation annotations (in training only)
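To make "mean annotation" concrete, the sketch below averages per-second labels from several annotators into probabilistic targets and scores predictions with an unweighted Brier score. The class names and toy labels are hypothetical. The slide does not state the challenge's metric, so the Brier score here is an assumption, chosen as one common way to score probabilistic predictions.

```python
from collections import Counter


def mean_annotation(labels_per_annotator, classes):
    """Turn per-second labels from several annotators into probabilistic targets.

    labels_per_annotator: one equal-length list of labels per annotator,
    with one label per second. Returns one dict per second that maps each
    class to the fraction of annotators who chose it.
    """
    n_annotators = len(labels_per_annotator)
    targets = []
    for labels_at_t in zip(*labels_per_annotator):
        counts = Counter(labels_at_t)
        targets.append({c: counts[c] / n_annotators for c in classes})
    return targets


def brier_score(targets, predictions, classes):
    """Mean over seconds of the squared error summed over classes (unweighted)."""
    total = 0.0
    for target, prediction in zip(targets, predictions):
        total += sum((prediction[c] - target[c]) ** 2 for c in classes)
    return total / len(targets)


classes = ["sit", "stand", "walk"]          # hypothetical label set
annotator_1 = ["sit", "sit", "stand", "walk"]
annotator_2 = ["sit", "stand", "stand", "walk"]
targets = mean_annotation([annotator_1, annotator_2], classes)

# A prediction that is confidently "sit" at every second.
always_sit = [{"sit": 1.0, "stand": 0.0, "walk": 0.0}] * len(targets)
score = brier_score(targets, always_sit, classes)
```

At the second where the annotators disagree, the target is 0.5 for "sit" and 0.5 for "stand", so no hard label can score perfectly there.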
28. Goal: Activity Recognition in Smart Homes
Deployment context differs from learning context (home/resident)
TRANSFER LEARNING
Labels costly and time-consuming to acquire
ACTIVE LEARNING
30. Method
• Extension of the “Bayes Point Machine”
• Additional layer of hierarchy:
• model “shared” and “individual” weights
• can smoothly evolve from generic to personalised predictions
• Implemented using Infer.NET
• http://research.microsoft.com/en-us/um/cambridge/projects/infernet/
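The idea of "shared" and "individual" weights can be illustrated with a much simpler stand-in than the hierarchical Bayes Point Machine: a single Gaussian weight whose prior comes from the shared (population) level and whose posterior is updated with one resident's observations. This conjugate Gaussian sketch, its function name and its toy numbers are assumptions for illustration, not the method from the talk.

```python
def personalised_weight(shared_mean, shared_var, observations, noise_var):
    """Posterior over one individual's weight under a Gaussian hierarchy.

    Prior:      w_individual ~ Normal(shared_mean, shared_var)
    Likelihood: y ~ Normal(w_individual, noise_var) for each observation y
    Returns (posterior_mean, posterior_variance).
    """
    prior_precision = 1.0 / shared_var
    data_precision = len(observations) / noise_var
    posterior_var = 1.0 / (prior_precision + data_precision)
    posterior_mean = posterior_var * (
        prior_precision * shared_mean + sum(observations) / noise_var
    )
    return posterior_mean, posterior_var


# Hypothetical numbers: the population weight is 0.0, this resident's is near 2.0.
generic = personalised_weight(0.0, 1.0, [], 1.0)
after_one = personalised_weight(0.0, 1.0, [2.0], 1.0)
after_nine = personalised_weight(0.0, 1.0, [2.0] * 9, 1.0)
```

With no personal data the prediction is the generic one; each new observation pulls the weight towards the individual and reduces the uncertainty, which is the smooth generic-to-personalised transition described on the slide.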
31. Accelerometer Data
• Source: 30 subjects, Smartphone, 50Hz, Video annotations
• Target: 14 subjects, MotionNode, 100Hz, Observer annotations
• Classes: Walking upstairs vs. Walking downstairs
• Features: 48 features based on the ‘body’ acceleration signal
https://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones
http://sipi.usc.edu/HAD/
33. Summary
• Bayesian framework very appealing
• Transfer boosts initial accuracy to 70%
• Active Learning -> ~5 instances to personalise
Diethe, T., Twomey, N. and Flach, P., 2016. Active transfer learning for activity recognition. ESANN.
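The active learning step can be sketched as uncertainty sampling: ask the resident to label the instance the current model is least sure about. The slide does not state the query criterion, so predictive entropy is used here as an assumption, and the function names and probabilities are hypothetical.

```python
import math


def predictive_entropy(p):
    """Entropy in bits of a binary prediction with P(class 1) = p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def most_uncertain(probabilities, already_labelled=()):
    """Index of the unlabelled instance with the highest predictive entropy."""
    candidates = [i for i in range(len(probabilities)) if i not in already_labelled]
    return max(candidates, key=lambda i: predictive_entropy(probabilities[i]))


# Hypothetical predicted probabilities of "walking upstairs" for four windows.
probabilities = [0.95, 0.52, 0.10, 0.70]
first_query = most_uncertain(probabilities)
second_query = most_uncertain(probabilities, already_labelled={first_query})
```

In a full loop the model would be retrained after each label, so the probabilities (and the next query) would change between rounds.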
35. Unsupervised learning of sensor topologies
• signal processing and information-theoretic techniques
• learn an adjacency matrix
• enables us to determine combinations of sensors useful for classification
• Experiments using CASAS data: http://ailab.wsu.edu/casas/datasets/
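As a rough illustration of learning an adjacency matrix with information-theoretic tools, the sketch below links two sensors when the mutual information between their binary activation streams exceeds a threshold. This is a simplified stand-in, not the procedure from the paper; the threshold, sensor names and toy streams are assumptions.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Mutual information in bits between two equal-length discrete sequences."""
    n = len(x)
    joint_counts = Counter(zip(x, y))
    x_counts = Counter(x)
    y_counts = Counter(y)
    mi = 0.0
    for (a, b), n_ab in joint_counts.items():
        p_ab = n_ab / n
        mi += p_ab * math.log2(p_ab / ((x_counts[a] / n) * (y_counts[b] / n)))
    return mi


def adjacency_matrix(streams, threshold):
    """Symmetric 0/1 matrix linking sensors whose streams share enough information."""
    k = len(streams)
    adjacency = [[0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            if mutual_information(streams[i], streams[j]) >= threshold:
                adjacency[i][j] = adjacency[j][i] = 1
    return adjacency


# Toy binary motion-sensor streams (hypothetical): hall and kitchen fire together,
# while the bedroom sensor is statistically independent of the hall sensor.
hall = [0, 1, 1, 0, 0, 1, 0, 1]
kitchen = [0, 1, 1, 0, 0, 1, 0, 1]
bedroom = [0, 0, 1, 1, 0, 0, 1, 1]
adjacency = adjacency_matrix([hall, kitchen, bedroom], threshold=0.5)
```

Real sensor streams would need time-lagged comparisons and far more data before such an estimate is reliable.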
36. Modelling of CASAS datasets
• Experiments based on dataset 11 (Kyoto Daily Life 2010)
• Existing methods:
• Naïve Bayes, HMM, CRF (segmented data) Krishnan & Cook 2012
• SVM, Decision trees (streaming data) Cook 2012
• SOTA: ~80-90% accuracy in controlled environments, some transfer learning
• Two approaches:
• Undirected models: Further work using Linear Chain CRFs
• Directed models: Online Bayesian classifiers
39. Results
• 5-10% boost in classification performance
• Can help
• when transferring to new sensor configuration
• disambiguating multiple residents
Twomey, N., Diethe, T., Craddock, I. and Flach, P., 2016. Unsupervised learning of sensor topologies for improving activity recognition in smart environments. Neurocomputing.
45. Opportunities
• True house-to-house transfer learning
• How well does this all work with multiple residents?
• What happens when houses or people change/move?
• Complex activities
• Sleep
• Real medical applications
Build a model, which, if we’re taking the Bayesian approach, is a joint distribution over the relevant variables. This can be represented as a graph.
Incorporate observations.
Run inference, which again in the Bayesian approach means computing the distributions over the desired variables.
In real-time applications we iterate over the 2nd & 3rd steps, and extend the model as required.
----- Meeting Notes (09/09/2014 14:37) -----
- Coming to the end of year 1, working on setting things up.
Names of the team members.
- We're hiring. Visits. Collaboration.
Who wants this? Nobody wants it for themselves, but everyone wants it for someone else
Cut out 1 slide
Clinician with information need
Hip injury example – how is their gait, are they walking better?
Causal relations – what causes illness
Information back to the user – better health-enhancing choices
Early warning system
Disease progress/treatment effectiveness
Embed video
New “House” graphic to replace existing & WP slide
Differing temporal/spatial resolutions