1
Machine Learning
for Sensor Data Analytics
Shiran Golan
Application Engineer
shirang@systematics.co.il
2
Outline
• Machine Learning Overview
• Machine Learning Types
• Machine Learning Workflow and Challenges
• Example - Human Activity Classification
• Summary
3
Machine Learning
Machine learning is a type of artificial intelligence (AI) that provides
computers with the ability to learn without being explicitly programmed.
4
Machine learning uses data and produces a program to perform a task
Task: Human Activity Detection
Standard Approach (Computer Program)
• Hand-written program:
  If X_acc > 0.5 then “SITTING”
  If Y_acc < 4 and Z_acc > 5 then “STANDING”
  …
• Formula or equation:
  $Y_{activity} = \beta_1 X_{acc} + \beta_2 Y_{acc} + \beta_3 Z_{acc} + \dots$
Machine Learning Approach (Machine Learning)
• $\mathit{model} = \langle \textbf{Machine Learning Algorithm} \rangle(\mathit{sensor\_data}, \mathit{activity})$
• $\mathit{model}$: Inputs → Outputs
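The contrast on this slide can be sketched in a few lines of Python. The two thresholds are taken from the slide; everything else is an assumption made for illustration: the "UNKNOWN" fallback, the function names, the made-up readings, and the choice of a nearest-centroid learner as a simple stand-in for the machine learning algorithm.

```python
import math

def classify_by_rules(x_acc, y_acc, z_acc):
    """Standard approach: hand-written thresholds (taken from the slide)."""
    if x_acc > 0.5:
        return "SITTING"
    if y_acc < 4 and z_acc > 5:
        return "STANDING"
    return "UNKNOWN"  # hypothetical fallback, not on the slide

def train_nearest_centroid(sensor_data, activity):
    """Machine learning approach: model = algorithm(sensor_data, activity)."""
    sums, counts = {}, {}
    for sample, label in zip(sensor_data, activity):
        acc = sums.setdefault(label, [0.0] * len(sample))
        for i, value in enumerate(sample):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    # One centroid (mean reading) per activity
    centroids = {label: [s / counts[label] for s in acc]
                 for label, acc in sums.items()}

    def model(sample):
        # model: inputs -> outputs (label of the closest centroid)
        return min(centroids, key=lambda label: math.dist(sample, centroids[label]))
    return model

# Made-up (x_acc, y_acc, z_acc) readings, for illustration only
sensor_data = [(0.9, 1.0, 0.2), (0.8, 1.2, 0.1), (0.1, 2.0, 9.0), (0.2, 2.5, 9.5)]
activity = ["SITTING", "SITTING", "STANDING", "STANDING"]
model = train_nearest_centroid(sensor_data, activity)
```

In the first function a person writes the decision logic; in the second the logic is derived from labeled examples, so new data can change the model without anyone rewriting rules.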
5
Consider Machine Learning When…
• Problem is too complex for hand-written rules or equations
– Examples: Speech Recognition, Object Recognition, Engine Health Monitoring
– Because algorithms can learn complex non-linear relationships
• Program needs to adapt with changing data
– Examples: Weather Forecasting, Energy Load Forecasting, Stock Market Prediction
– Because algorithms can update as more data becomes available
• Program needs to scale
– Examples: IoT Analytics, Taxi Availability, Airline Flight Delays
– Because algorithms can learn efficiently from very large data sets
6
Machine Learning is Everywhere
• Stock Prediction
• Speech Recognition
• Image Recognition
• Medical Diagnosis
• Data Analytics
• Energy
• Robotics
• and more…
7
Sensor Data Analytics & IoT Workflows
9
Different Types of Learning
Machine Learning
• Supervised Learning: develop a predictive model based on both input and output data
– Classification: output is a choice between classes, e.g. (True, False) or (Red, Blue, Green)
– Regression: output is a real number (temperature, stock prices)
• Unsupervised Learning: group and interpret data based only on input data
– Clustering: discover a good internal representation; learn a low-dimensional representation
10
Unsupervised Learning - Clustering
• What is clustering?
– Segment data into groups, based on data similarity
• Why use clustering?
– Identify outliers
– Resulting groups may be the matter of interest
• How is clustering done?
– It is an iterative process
– Can be achieved by various algorithms
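To make the "iterative process" concrete, here is a minimal k-means sketch in plain Python. It is one of the "various algorithms" and not necessarily the one behind the slide's figure; the sample points and the choice to start the centroids at the first k points are assumptions made to keep the example short and deterministic.

```python
import math

def k_means(points, k, iterations=20):
    """Plain k-means; centroids start at the first k points (a simplification)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(axis) / len(members) for axis in zip(*members)]
    return labels, centroids

# Made-up 2-D points forming two visibly separate groups
points = [(0.0, 0.1), (0.1, 0.0), (0.05, 0.05), (0.5, 0.9), (0.6, 1.0), (0.55, 0.95)]
labels, centroids = k_means(points, k=2)
```

No labels are supplied: the groups come only from similarity between the points, which is what makes this unsupervised.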
11
Clustering
• k-Means, Fuzzy C-Means
• Hierarchical Clustering
• Neural Networks
• Gaussian Mixture Models
• Hidden Markov Models
• Nearest Neighbors
12
Supervised Learning - Classification
• What is classification?
– Predicting the best group for each point
– “Learns” from labeled observations
– Uses input features
• Why use classification?
– Accurately group data never seen before
• How is classification done?
– Can use several algorithms to build a predictive model
– Good training data is critical
[Figure: plot of data points labeled Group1 through Group8]
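As a small illustration of learning from labeled observations, the sketch below uses a k-nearest-neighbor classifier, which is only one of the several algorithms the slide alludes to. The training points, the group names, and the function name are made up for the example.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Label a new point by majority vote among its k nearest labeled points."""
    nearest = sorted(zip(train_points, train_labels),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Made-up labeled observations (input features -> group)
train_points = [(0.0, 0.1), (0.1, 0.2), (0.05, 0.0), (0.5, 0.9), (0.6, 0.8), (0.55, 1.0)]
train_labels = ["Group1", "Group1", "Group1", "Group2", "Group2", "Group2"]

# A point never seen before is assigned to the best-matching group
prediction = knn_predict(train_points, train_labels, (0.02, 0.05))
```

The quality of the prediction depends entirely on the labeled examples, which is why good training data is critical.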
13
Supervised Learning - Regression
• What is regression?
– “Classification, but for continuous variables”
• Why use regression?
– Predict a continuous response for a new observation
• How is regression done?
– Linear, non-linear, parametric or non-parametric
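The simplest case, linear and parametric, can be sketched as an ordinary least-squares line fit. The data and function names below are invented for illustration; the points lie exactly on y = 2x + 1 so the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one input variable)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Continuous response for a new observation."""
    slope, intercept = model
    return slope * x + intercept

# Made-up data lying exactly on y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
model = fit_line(xs, ys)
```

Unlike a classifier, the output is a number on a continuous scale, here `predict(model, 4.0)`.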
14
Supervised Learning
Regression
• Linear Regression
• Non-linear Reg. (GLM, Logistic)
• Generalized Linear Models
• Gaussian Process Regression
• Support Vector Machines
• Regression Trees and Ensembles
• Neural Networks
Classification
• Nearest Neighbor
• Discriminant Analysis
• Naive Bayes
• Support Vector Machines
• Boosted and Bagged Trees
• Logistic Regression
15
Supervised Learning Workflow
Train: Iterate till you find the best model
LOAD DATA → PREPROCESS DATA (summary statistics, PCA, filters, cluster analysis) → SUPERVISED LEARNING (classification, regression) → MODEL
Predict: Integrate trained models into applications
NEW DATA → PREPROCESS DATA (summary statistics, PCA, filters, cluster analysis) → MODEL → PREDICTION
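The two phases of this workflow can be sketched as follows. The point of the example is that the preprocessing fitted during training is reused unchanged on new data. The standardization step, the 1-nearest-neighbor model, the data, and all names are assumptions chosen to keep the sketch short.

```python
import math
import statistics

def fit_scaler(rows):
    """Summary statistics (mean, standard deviation) per column of the training data."""
    columns = list(zip(*rows))
    means = [statistics.mean(c) for c in columns]
    stds = [statistics.pstdev(c) or 1.0 for c in columns]  # avoid dividing by zero
    return means, stds

def preprocess(row, scaler):
    """Standardize one row with the statistics learned from the training data."""
    means, stds = scaler
    return [(v - m) / s for v, m, s in zip(row, means, stds)]

def train(rows, labels):
    """Train: load data -> preprocess -> supervised learning -> model."""
    scaler = fit_scaler(rows)
    examples = [(preprocess(r, scaler), lab) for r, lab in zip(rows, labels)]
    return scaler, examples

def predict(model, new_row):
    """Predict: new data -> same preprocessing -> model -> prediction."""
    scaler, examples = model
    x = preprocess(new_row, scaler)
    return min(examples, key=lambda ex: math.dist(ex[0], x))[1]

# Made-up data with two features on very different scales
rows = [(1.0, 200.0), (1.2, 220.0), (3.0, 900.0), (3.2, 950.0)]
labels = ["low", "low", "high", "high"]
model = train(rows, labels)
```

Keeping the scaler inside the returned model is a design choice: it makes it hard to apply different preprocessing at prediction time by accident.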
17
Machine Learning Workflow
1. Access and Explore Data
– Files
– Databases
– Sensors
2. Preprocess Data
– Working with Messy Data
– Data Reduction/Transformation
– Feature Extraction
3. Develop Predictive Models
– Model Creation, e.g. Machine Learning
– Parameter Optimization
– Model Validation
4. Integrate Analytics with Systems
– Desktop Apps
– Enterprise Scale Systems
– Embedded Devices and Hardware
18
Machine Learning Apps
• Classification Learner App
• Regression Learner App
19
Data Import and Cross-Validation Setup
1. Data import and cross-validation setup
20
Data Exploration and Feature Selection
1. Data import and cross-validation setup
2. Data exploration and feature selection
21
Train Multiple Models
1. Data import and cross-validation setup
2. Data exploration and feature selection
3. Train multiple models
22
Model Comparison and Assessment
1. Data import and cross-validation setup
2. Data exploration and feature selection
3. Train multiple models
4. Model comparison and assessment
23
Share Model or Automate Process
1. Data import and cross-validation setup
2. Data exploration and feature selection
3. Train multiple models
4. Model comparison and assessment
5. Share model or automate process
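Steps 1, 3 and 4 can also be scripted. The sketch below is an illustration in plain Python, not the apps' implementation: it sets up 3-fold cross-validation, trains two candidate models on each fold, and compares their validation accuracy. The two model types, the data, and all names are assumptions.

```python
import math

def fit_nearest_neighbor(training):
    """Candidate model 1: label of the single closest training example."""
    return lambda x: min(training, key=lambda ex: math.dist(ex[0], x))[1]

def fit_nearest_centroid(training):
    """Candidate model 2: label of the closest class centroid."""
    groups = {}
    for point, label in training:
        groups.setdefault(label, []).append(point)
    centroids = {lab: [sum(axis) / len(pts) for axis in zip(*pts)]
                 for lab, pts in groups.items()}
    return lambda x: min(centroids, key=lambda lab: math.dist(centroids[lab], x))

def cross_val_accuracy(examples, fit, folds=3):
    """k-fold cross-validation: every fold is held out once for validation."""
    correct = 0
    for f in range(folds):
        held_out = examples[f::folds]
        training = [ex for i, ex in enumerate(examples) if i % folds != f]
        predict = fit(training)
        correct += sum(predict(x) == y for x, y in held_out)
    return correct / len(examples)

# Made-up labeled examples: (features, label)
examples = [((0.0, 0.1), "A"), ((0.5, 0.9), "B"), ((0.1, 0.0), "A"),
            ((0.6, 1.0), "B"), ((0.05, 0.05), "A"), ((0.55, 0.95), "B")]
candidates = {"nearest neighbor": fit_nearest_neighbor,
              "nearest centroid": fit_nearest_centroid}
scores = {name: cross_val_accuracy(examples, fit) for name, fit in candidates.items()}
best_model = max(scores, key=scores.get)
```

Because every example is validated by a model that never saw it during training, the scores estimate how each candidate would behave on new data.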
24
Fine-tuning Model Parameters
• Bayesian optimization
• Why?
– Manual parameter selection is tedious and may result in suboptimal performance
• When?
– When training a model with one or more parameters that influence the fit
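To show what automating parameter selection looks like, the sketch below tunes one parameter with a plain grid search. This is deliberately a simpler technique than the Bayesian optimization named on the slide: it tries every candidate value, whereas Bayesian optimization uses earlier results to decide which value to try next. The ridge-style model, the data, and the candidate values are all invented for illustration.

```python
def fit_slope(xs, ys, lam):
    """Ridge-regularized slope for y = slope * x; lam is the tunable parameter."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def validation_error(slope, xs, ys):
    """Mean squared error of the fitted slope on held-out data."""
    return sum((slope * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def grid_search(train, valid, candidates):
    """Try every candidate value and keep the one with the lowest validation error."""
    results = {lam: validation_error(fit_slope(*train, lam), *valid)
               for lam in candidates}
    best = min(results, key=results.get)
    return best, results

train = ([1.0, 2.0, 3.0], [2.1, 3.9, 6.2])  # made-up training data
valid = ([4.0, 5.0], [8.0, 10.0])           # made-up validation data
best_lam, results = grid_search(train, valid, candidates=[0.0, 0.1, 1.0, 10.0])
```

Either way, the parameter is chosen by measured validation performance instead of by hand.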
26
MATLAB Strengths for Machine Learning
Challenge: Data diversity
Solution: Extensive data support
– Work with signal, image, financial, textual, and other formats
Challenge: Lack of domain tools
Solution: High-quality libraries
– Industry-standard algorithms for finance, statistics, signal processing, image processing, and more
Challenge: Time consuming
Solution: Interactive, app-driven workflows
– Focus on machine learning, not programming
– Select the best model and easily fine-tune model parameters
Challenge: Platform diversity
Solution: Run analytics anywhere
– Code generation for embedded targets
– Deploy to a broad range of enterprise system architectures
Flexible architecture for customized workflows
Complete machine learning platform
28
Example: Human Activity Classification
Feature Extraction → Classification
Dataset courtesy of:
Davide Anguita, Alessandro Ghio, Luca Oneto, Xavier Parra and Jorge L. Reyes-Ortiz.
Human Activity Recognition on Smartphones using a Multiclass Hardware-Friendly Support Vector Machine.
International Workshop of Ambient Assisted Living (IWAAL 2012). Vitoria-Gasteiz, Spain. Dec 2012
http://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones
29
Human Activity Classification - Challenges
• Work with wirelessly transmitted data
– Need to get the sensor data into your analytics environment
• Requires domain-specific knowledge
– Signal processing knowledge
– Machine learning knowledge
• No simple solution
– Need to test different algorithms and ideas quickly
30
Human Activity Classification - Solution
• MATLAB connects to DAQ interfaces and sensors directly, e.g.
– Android Sensor Support
– iPhone and iPad Sensor Support
• Uses only core built-in signal processing algorithms
• 66 high-quality features extracted with only 54 lines of code!
• Automatically train and compare a selection of different models using apps
• Visualisation and automation
• Easy path to production
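The feature-extraction step can be illustrated with a short Python sketch. It does not reproduce the 66 features or the MATLAB code mentioned above; it only shows the general idea of turning windows of raw accelerometer samples into a few summary features. The feature set, window size, sample values, and function names are assumptions.

```python
import math
import statistics

def window_features(window):
    """Summary features for one window of single-axis accelerometer samples."""
    return {
        "mean": statistics.mean(window),
        "std": statistics.pstdev(window),
        "rms": math.sqrt(sum(v * v for v in window) / len(window)),
        "range": max(window) - min(window),
    }

def extract_features(signal, window_size):
    """Split the signal into non-overlapping windows and featurize each one."""
    return [window_features(signal[i:i + window_size])
            for i in range(0, len(signal) - window_size + 1, window_size)]

# Made-up samples: an oscillating window followed by a constant one
signal = [0.0, 2.0, 0.0, 2.0, 5.0, 5.0, 5.0, 5.0]
features = extract_features(signal, window_size=4)
```

Each window becomes one row of features, and those rows are what a classifier is trained on in place of the raw samples.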
32
Key Takeaways
• Sensor data analytics made easy
– You don’t need to be a domain expert
– If you are a domain expert, you can explore ideas faster
• Direct access to sensors and many other data sources
• Easy paths to production, e.g. automatic C code generation
• Integrated workflow from a single environment
33
Additional Resources
MATLAB documentation
Systematics Website
Systematics Courses
