SlideShare a Scribd company logo
Machine Learning for IoT
Unpacking the Blackbox
Ivelin Andreev
Sponsors
About me
• Software Architect @
o 15 years professional experience
o .NET Web Development MCPD
• External Expert Horizon 2020
• External Expert Eurostars-Eureka & IFD
• Business Interests
o Web Development, SOA, Integration
o Security & Performance Optimization
o IoT, Computer Intelligence
• Contact
o ivelin.andreev@icb.bg
o www.linkedin.com/in/ivelin
o www.slideshare.net/ivoandreev
Agenda
Machine Learning
• Azure ML and Competitors
• Choosing the right algorithm
• Reasons your ML model may fail
• Algorithm performance
• Deep neural networks
• ML on-premises
• Demo
Real World Business Cases
• Predictive maintenance
• Is it likely that the machine will break
based on sensor readings?
• Forest fire prediction
• Is it likely that a fire will start
based on environment parameters?
• Automated teller machine replenishment
• How to optimize ATM replenishment
based on historical data for transactions?
• Spear phishing identification
• Is a web site likely to be phishing
based on URL and HTTP response?
ASPires
Distributed IoT Platform
Funded by
EUROPEAN CIVIL PROTECTION
AND HUMANITARIAN AID
OPERATIONS
2016/PREV/03 (ASPIRES)
ML is not Black Magic
Supervised Learning
• Majority of practical ML uses supervised learning
• Mapping function approximated from past experience
o Regression f(X) = Y, Y is a real number
o Classification f(X) = Y, Y is a category label
• Training
o Labeled positive and negative examples
o From unseen input, predict corresponding output
o Learning until acceptable performance is achieved
Unsupervised Learning
• Discover hidden relations and learn about the data
o Clustering f(X) = [X1,…, Xk], k disjoint subsets
o Association f(Xi, Xj) = R, relation
• Training
o All examples are positive
o No labeling / No teacher
o No single correct answer
• Practical usage
o Derive groups, not explicitly labeled
o Market basket analysis (association among items)
Machine Learning with Microsoft
Azure
• Primary goal: makes deployable and
scalable web services from the ML modules.
• Though experience for creating ML models is
great, it is not intended to be a place to
create and export models
Main ML Market Players
Azure ML BigML Amazon ML Google
Prediction
IBM Watson ML
Flexibility High High Low Low Low
Usability High Med High Low High
Training time Low Low High Med High
Accuracy (AUC) High High High Med High
Cloud/
On-premises
+/- +/+ +/- +/- +/-
Algorithms Classification
Regression
Clustering
Anomaly detect
Recommendations
Classification
Regression
Clustering
Anomaly
Recommend
Classification
Regression
Classification
Regression
Semantic mining
Hypothesis rank
Regression
Customizations Model parameters
R-script
Evaluation support
Own models
C#, R, Node.js
Few parameters
Pricing?
Azure ML BigML Amazon ML Google
Prediction
IBM Watson
ML
Model Building $1.00 / h
$10.00 / user / month
$30 - $10’000
/ month
$0.42 / h $0.002 / MB $0.45 / h
$10.00 / service
Retraining
(per 1000)
$0.50 - N/A $0.05 -
Prediction
(per 1000)
$0.50 - $0.10 $0.50 $0.50
Compute
(per hour)
$2.00 - $0.42 - -
Free Usage 1’000 / month
2h compute
Dataset Size
Max 16MB
N/A 10’000/month 5’000 / month
5h compute
Notes Private
deployment
$55’000 / year
Shutdown
April 30, 2018
Cloud ML Engine
(TensorFlow)
1. Dataset
Azure ML Flow
2. Training Experiment
3. Predictive Experiment
4. Publish Web Service
5. Retrain Model
Which ML Algorithm shall I Use?
Do you need Math to work with ML?
Answer: It is not required, but would definitely help
Data Science is what is really necessary
Interdisciplinary field about processes and methods for extracting relations and knowledge from data
Working around Math
• Select the right algorithm
• Use cheat-sheets instead
• Choose parameter settings
• Experiment
• Use parameter-sweep
• Identify underfitting and overfitting
• Compare training and validation scores
I selected an appropriate
ML algorithm but…
why my ML model fails?
ML Decision Tree
Is it an ML task?
No, use another
solution
Is it correct ML
scenario?
No, try another
scenario
Is suitable model
identified?
Do you have
enough data?
Is the model overly
complicated?
Correct features
used?
Performed feature
engineering?
Proper evaluation
metrics?
Good evaluation
set?
Diagnose Steps (part 1)
1. Is it an ML task? Are you sure ML is the best solution?
• Hard: X is independent of Y: X <Name, Age, Income>, Height=?
• Easy: X is a set with limited variations. Configure Y=F(X)
2. Appropriate ML scenario?
• Supervised learning (classification, regression, anomaly detection)
• Unsupervised learning (clustering, pattern learning)
3. Appropriate model?
• Data size (small data -> linear model, large data -> consider non-linear)
• Sparse data (require normalization to perform better)
• Imbalanced data (special treatment of the minority class required)
• Data quality (noise and missing values require loss function – i.e. L2)
4. Enough training data?
• Investigate how precision improves with more data
5. Model overly complicated
• Start simple first, increase complexity and evaluate performance
• Avoid overfitting to training set
6. Feature quality
• Have you identified all useful features?
• Use domain knowledge of an expert to start
• Include any feature that could be found and investigate model performance
7. Feature engineering
• The best strategy to improve performance and reveal important input
• Encode features, normalize [0:1], combine features
8. Combine models
• If multiple models have similar performance there is a chance of improvement
• Use one model for one subset of data and another model for the other
Diagnose Steps (part 2)
Diagnose Steps (part 3)
9. Model Validation
• Use appropriate performance indicator (Accuracy, Precision, Recall, F1, etc.)
• How well does the model describe data? (AUC)
• Data typically divided into Training and Validation
• Evaluated accuracy on disjoint dataset (other than training dataset)
• Tune model hyper parameters (i.e. number of iterations)
Appropriate Algorithms are
Determined by Data
Types of Algorithms
Linear Algorithms
• Classification - classes separated by straight line
• Support Vector Machine – wide gap instead of line
• Regression – linear relation between variables and label
Non-Linear Algorithms
• Decision Trees and Jungles - divide space into regions
• Neural Networks – complex and irregular boundaries
Special Algorithms
• Ordinal Regression – ranked values (i.e. race)
• Poisson - discrete distribution (i.e. count of events)
• Bayesian – normal distribution of errors (bell curve)
• Binary classification outcomes {negative; positive}
• ROC Curve
o TP Rate = True Positives / All Positives
o FP Rate = False Positives / All Negatives
• Example
• AUC (Area Under Curve)
o KPI for model performance and model comparison
o 0.5 = Random prediction, 1 = Perfect match
• For multiclass – average from all RoC curves
Model Performance (Classification)
TP Rate FP Rate 1-FP Rate
5 0.56 0.99 0.01
7 0.78 0.81 0.19
9 0.91 0.42 0.58
• Probability Threshold
o Cost of one error could be much higher that cost of other
o (i.e. Spam filter – it is more expensive to miss a real mail)
• Accuracy
o For symmetric 50/50 data
• Precision
o (i.e. 1000 devices, 6 fails, 8 predicted, 5 true failures)
o Correct positives (i.e. 5/8 = 0.625, FP are expensive)
• Recall
o Correctly predicted positives (i.e. 5/6=0.83, FN are expensive)
• F1 (balanced error cost)
o Balanced cost of Precision/Recall
Threshold Selection (Binary)
Model Performance (Regression)
• Coefficient of Determination (R2)
o Single numeric KPI – how well data fits model
o R2>0.6 – good, R2>0.8 – very good R2=1 – perfect
• Mean Absolute Error / Root Mean Squared Error
o Deviation of the estimates from the observed values
o Compare model errors measure in the SAME units
• Relative Absolute Error / Relative Squared Error
o % deviation from real value
o Compare model errors measure in the DIFFERENT units
Neural Networks &
Deep Learning
• Neural-Networks are considered
universal function approximators
• They can compute and learn
any function
Neural Network Architecture
• Nodes, organized in layers, with weighted connections
o Acyclic directed graph
• Layers
o Input (1), Output (1)
o Shallow - 1 hidden layer
o Deep – multiple hidden layers
• MS NET# language
• Define DNN layers
• Bundles (connections)
• Activation functions
Artificial Neuron Activation
• Calculates weighted sum of inputs
• Adds bias (function shift)
• Decides whether it shall be activated
Natural Questions
• Why do we have so many?
• Why some work better than other?
• Which one to use ?
Activation Functions (AF)
Goal
• Convert input -> output signal
• Output signal is input to next layer
• Approximate target function faster
Samples
• ReLu, PReLu – good to start with
• TanH and Sigmoid – outdated
• Softmax – output layer, classification
• Linear func – output layer, regression
Image Recognition with DNN
• Convolution
• Non-Linearity (i.e. ReLU)
• Downsampling
• Classification
Microsoft Cognitive Toolkit
• The Microsoft way to run ML on-premises
• Toolkit for learning and evaluating DNN
o Production-ready
o Open source (GitHub since Jan 2016)
o Cross platform (Windows, Linux, Docker)
o Highly optimized for multiple GPUs/machines
• Definition in C++, Python, C# (beta), BrainScript
• Convolutional NN are extremely good for feature extraction
Azure ML
DEMO
Key ML Modules
Building blocks for creating experiments
Split Data
Split data into training and verification sets
Train Model
Trains model from untrained model and dataset
Number of arguments = algorithm flexibility
Score Model
Confirm model results against data set
Evaluate Model
Get key quality factors of the results or evaluate against another model
Tune Model
Sweep model parameters for optimal settings
Takeaways
Machine Learning
Cortana Intelligence Gallery (3700+ Sample Azure ML Projects)
Evaluating model performance
Azure ML documentation (full)
Azure ML video guidelines
ML algorithms by use case
Artificial Neural Networks
Intuitive Explanation of Convolutional Neural Networks
Biginners guide to Convolutional Neural Networks
Understanding activation functions
Activation functions, which is better
Sponsors

More Related Content

What's hot

MLlib and Machine Learning on Spark
MLlib and Machine Learning on SparkMLlib and Machine Learning on Spark
MLlib and Machine Learning on Spark
Petr Zapletal
 
Tensors Are All You Need: Faster Inference with Hummingbird
Tensors Are All You Need: Faster Inference with HummingbirdTensors Are All You Need: Faster Inference with Hummingbird
Tensors Are All You Need: Faster Inference with Hummingbird
Databricks
 
Dato Keynote
Dato KeynoteDato Keynote
Dato Keynote
Turi, Inc.
 
Best Practices for Hyperparameter Tuning with MLflow
Best Practices for Hyperparameter Tuning with MLflowBest Practices for Hyperparameter Tuning with MLflow
Best Practices for Hyperparameter Tuning with MLflow
Databricks
 
Deep Learning Introduction - WeCloudData
Deep Learning Introduction - WeCloudDataDeep Learning Introduction - WeCloudData
Deep Learning Introduction - WeCloudData
WeCloudData
 
Feature Engineering - Getting most out of data for predictive models - TDC 2017
Feature Engineering - Getting most out of data for predictive models - TDC 2017Feature Engineering - Getting most out of data for predictive models - TDC 2017
Feature Engineering - Getting most out of data for predictive models - TDC 2017
Gabriel Moreira
 
Building Powerful and Intelligent Applications with Azure Machine Learning
Building Powerful and Intelligent Applications with Azure Machine LearningBuilding Powerful and Intelligent Applications with Azure Machine Learning
Building Powerful and Intelligent Applications with Azure Machine Learning
David Walker, CSM,CSD,MCP,MCAD,MCSD,MVP
 
Online Machine Learning: introduction and examples
Online Machine Learning:  introduction and examplesOnline Machine Learning:  introduction and examples
Online Machine Learning: introduction and examples
Felipe
 
Machine Learning: An introduction โดย รศ.ดร.สุรพงค์ เอื้อวัฒนามงคล
Machine Learning: An introduction โดย รศ.ดร.สุรพงค์  เอื้อวัฒนามงคลMachine Learning: An introduction โดย รศ.ดร.สุรพงค์  เอื้อวัฒนามงคล
Machine Learning: An introduction โดย รศ.ดร.สุรพงค์ เอื้อวัฒนามงคล
BAINIDA
 
Machine Learning for (JVM) Developers
Machine Learning for (JVM) DevelopersMachine Learning for (JVM) Developers
Machine Learning for (JVM) Developers
Mateusz Dymczyk
 
Data exploration validation and sanitization
Data exploration validation and sanitizationData exploration validation and sanitization
Data exploration validation and sanitization
Venkata Reddy Konasani
 
QCon Rio - Machine Learning for Everyone
QCon Rio - Machine Learning for EveryoneQCon Rio - Machine Learning for Everyone
QCon Rio - Machine Learning for Everyone
Dhiana Deva
 
Apache Spark Machine Learning
Apache Spark Machine LearningApache Spark Machine Learning
Apache Spark Machine Learning
Carol McDonald
 
Overview of Machine Learning and Feature Engineering
Overview of Machine Learning and Feature EngineeringOverview of Machine Learning and Feature Engineering
Overview of Machine Learning and Feature Engineering
Turi, Inc.
 
Machine Learning Pipelines
Machine Learning PipelinesMachine Learning Pipelines
Machine Learning Pipelines
jeykottalam
 
Introduction to Machine Learning with Spark
Introduction to Machine Learning with SparkIntroduction to Machine Learning with Spark
Introduction to Machine Learning with Spark
datamantra
 
Production ready big ml workflows from zero to hero daniel marcous @ waze
Production ready big ml workflows from zero to hero daniel marcous @ wazeProduction ready big ml workflows from zero to hero daniel marcous @ waze
Production ready big ml workflows from zero to hero daniel marcous @ waze
Ido Shilon
 
Top 10 Data Science Practitioner Pitfalls
Top 10 Data Science Practitioner PitfallsTop 10 Data Science Practitioner Pitfalls
Top 10 Data Science Practitioner Pitfalls
Sri Ambati
 
Feature Engineering
Feature EngineeringFeature Engineering
Feature Engineering
HJ van Veen
 
Intro_to_ML
Intro_to_MLIntro_to_ML
Intro_to_ML
Marko Velic
 

What's hot (20)

MLlib and Machine Learning on Spark
MLlib and Machine Learning on SparkMLlib and Machine Learning on Spark
MLlib and Machine Learning on Spark
 
Tensors Are All You Need: Faster Inference with Hummingbird
Tensors Are All You Need: Faster Inference with HummingbirdTensors Are All You Need: Faster Inference with Hummingbird
Tensors Are All You Need: Faster Inference with Hummingbird
 
Dato Keynote
Dato KeynoteDato Keynote
Dato Keynote
 
Best Practices for Hyperparameter Tuning with MLflow
Best Practices for Hyperparameter Tuning with MLflowBest Practices for Hyperparameter Tuning with MLflow
Best Practices for Hyperparameter Tuning with MLflow
 
Deep Learning Introduction - WeCloudData
Deep Learning Introduction - WeCloudDataDeep Learning Introduction - WeCloudData
Deep Learning Introduction - WeCloudData
 
Feature Engineering - Getting most out of data for predictive models - TDC 2017
Feature Engineering - Getting most out of data for predictive models - TDC 2017Feature Engineering - Getting most out of data for predictive models - TDC 2017
Feature Engineering - Getting most out of data for predictive models - TDC 2017
 
Building Powerful and Intelligent Applications with Azure Machine Learning
Building Powerful and Intelligent Applications with Azure Machine LearningBuilding Powerful and Intelligent Applications with Azure Machine Learning
Building Powerful and Intelligent Applications with Azure Machine Learning
 
Online Machine Learning: introduction and examples
Online Machine Learning:  introduction and examplesOnline Machine Learning:  introduction and examples
Online Machine Learning: introduction and examples
 
Machine Learning: An introduction โดย รศ.ดร.สุรพงค์ เอื้อวัฒนามงคล
Machine Learning: An introduction โดย รศ.ดร.สุรพงค์  เอื้อวัฒนามงคลMachine Learning: An introduction โดย รศ.ดร.สุรพงค์  เอื้อวัฒนามงคล
Machine Learning: An introduction โดย รศ.ดร.สุรพงค์ เอื้อวัฒนามงคล
 
Machine Learning for (JVM) Developers
Machine Learning for (JVM) DevelopersMachine Learning for (JVM) Developers
Machine Learning for (JVM) Developers
 
Data exploration validation and sanitization
Data exploration validation and sanitizationData exploration validation and sanitization
Data exploration validation and sanitization
 
QCon Rio - Machine Learning for Everyone
QCon Rio - Machine Learning for EveryoneQCon Rio - Machine Learning for Everyone
QCon Rio - Machine Learning for Everyone
 
Apache Spark Machine Learning
Apache Spark Machine LearningApache Spark Machine Learning
Apache Spark Machine Learning
 
Overview of Machine Learning and Feature Engineering
Overview of Machine Learning and Feature EngineeringOverview of Machine Learning and Feature Engineering
Overview of Machine Learning and Feature Engineering
 
Machine Learning Pipelines
Machine Learning PipelinesMachine Learning Pipelines
Machine Learning Pipelines
 
Introduction to Machine Learning with Spark
Introduction to Machine Learning with SparkIntroduction to Machine Learning with Spark
Introduction to Machine Learning with Spark
 
Production ready big ml workflows from zero to hero daniel marcous @ waze
Production ready big ml workflows from zero to hero daniel marcous @ wazeProduction ready big ml workflows from zero to hero daniel marcous @ waze
Production ready big ml workflows from zero to hero daniel marcous @ waze
 
Top 10 Data Science Practitioner Pitfalls
Top 10 Data Science Practitioner PitfallsTop 10 Data Science Practitioner Pitfalls
Top 10 Data Science Practitioner Pitfalls
 
Feature Engineering
Feature EngineeringFeature Engineering
Feature Engineering
 
Intro_to_ML
Intro_to_MLIntro_to_ML
Intro_to_ML
 

Similar to Machine learning for IoT - unpacking the blackbox

The Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it WorkThe Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it Work
Ivo Andreev
 
The Data Science Process - Do we need it and how to apply?
The Data Science Process - Do we need it and how to apply?The Data Science Process - Do we need it and how to apply?
The Data Science Process - Do we need it and how to apply?
Ivo Andreev
 
Cutting Edge Computer Vision for Everyone
Cutting Edge Computer Vision for EveryoneCutting Edge Computer Vision for Everyone
Cutting Edge Computer Vision for Everyone
Ivo Andreev
 
NLP Classifier Models & Metrics
NLP Classifier Models & MetricsNLP Classifier Models & Metrics
NLP Classifier Models & Metrics
Sanghamitra Deb
 
NLP and Deep Learning for non_experts
NLP and Deep Learning for non_expertsNLP and Deep Learning for non_experts
NLP and Deep Learning for non_experts
Sanghamitra Deb
 
Deep learning
Deep learningDeep learning
Deep learning
Ratnakar Pandey
 
AI and Deep Learning
AI and Deep Learning AI and Deep Learning
AI and Deep Learning
Subrat Panda, PhD
 
Deep learning - a primer
Deep learning - a primerDeep learning - a primer
Deep learning - a primer
Uwe Friedrichsen
 
Deep learning - a primer
Deep learning - a primerDeep learning - a primer
Deep learning - a primer
Shirin Elsinghorst
 
Introduction to deep learning
Introduction to deep learningIntroduction to deep learning
Introduction to deep learning
Vishwas Lele
 
Machine learning with R
Machine learning with RMachine learning with R
Machine learning with R
Maarten Smeets
 
Intro to Machine Learning by Microsoft Ventures
Intro to Machine Learning by Microsoft VenturesIntro to Machine Learning by Microsoft Ventures
Intro to Machine Learning by Microsoft Ventures
microsoftventures
 
Connected Components Labeling
Connected Components LabelingConnected Components Labeling
Connected Components Labeling
Hemanth Kumar Mantri
 
Machine Learning Foundations for Professional Managers
Machine Learning Foundations for Professional ManagersMachine Learning Foundations for Professional Managers
Machine Learning Foundations for Professional Managers
Albert Y. C. Chen
 
MXNet Workshop
MXNet WorkshopMXNet Workshop
MXNet Workshop
Amazon Web Services
 
The Art of Intelligence – Introduction Machine Learning for Oracle profession...
The Art of Intelligence – Introduction Machine Learning for Oracle profession...The Art of Intelligence – Introduction Machine Learning for Oracle profession...
The Art of Intelligence – Introduction Machine Learning for Oracle profession...
Lucas Jellema
 
Machine Learning, Deep Learning and Data Analysis Introduction
Machine Learning, Deep Learning and Data Analysis IntroductionMachine Learning, Deep Learning and Data Analysis Introduction
Machine Learning, Deep Learning and Data Analysis Introduction
Te-Yen Liu
 
Deep Learning Made Easy with Deep Features
Deep Learning Made Easy with Deep FeaturesDeep Learning Made Easy with Deep Features
Deep Learning Made Easy with Deep Features
Turi, Inc.
 
Machine learning
Machine learningMachine learning
Machine learning
Vikas Sinha
 
Creativity and Curiosity - The Trial and Error of Data Science
Creativity and Curiosity - The Trial and Error of Data ScienceCreativity and Curiosity - The Trial and Error of Data Science
Creativity and Curiosity - The Trial and Error of Data Science
DamianMingle
 

Similar to Machine learning for IoT - unpacking the blackbox (20)

The Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it WorkThe Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it Work
 
The Data Science Process - Do we need it and how to apply?
The Data Science Process - Do we need it and how to apply?The Data Science Process - Do we need it and how to apply?
The Data Science Process - Do we need it and how to apply?
 
Cutting Edge Computer Vision for Everyone
Cutting Edge Computer Vision for EveryoneCutting Edge Computer Vision for Everyone
Cutting Edge Computer Vision for Everyone
 
NLP Classifier Models & Metrics
NLP Classifier Models & MetricsNLP Classifier Models & Metrics
NLP Classifier Models & Metrics
 
NLP and Deep Learning for non_experts
NLP and Deep Learning for non_expertsNLP and Deep Learning for non_experts
NLP and Deep Learning for non_experts
 
Deep learning
Deep learningDeep learning
Deep learning
 
AI and Deep Learning
AI and Deep Learning AI and Deep Learning
AI and Deep Learning
 
Deep learning - a primer
Deep learning - a primerDeep learning - a primer
Deep learning - a primer
 
Deep learning - a primer
Deep learning - a primerDeep learning - a primer
Deep learning - a primer
 
Introduction to deep learning
Introduction to deep learningIntroduction to deep learning
Introduction to deep learning
 
Machine learning with R
Machine learning with RMachine learning with R
Machine learning with R
 
Intro to Machine Learning by Microsoft Ventures
Intro to Machine Learning by Microsoft VenturesIntro to Machine Learning by Microsoft Ventures
Intro to Machine Learning by Microsoft Ventures
 
Connected Components Labeling
Connected Components LabelingConnected Components Labeling
Connected Components Labeling
 
Machine Learning Foundations for Professional Managers
Machine Learning Foundations for Professional ManagersMachine Learning Foundations for Professional Managers
Machine Learning Foundations for Professional Managers
 
MXNet Workshop
MXNet WorkshopMXNet Workshop
MXNet Workshop
 
The Art of Intelligence – Introduction Machine Learning for Oracle profession...
The Art of Intelligence – Introduction Machine Learning for Oracle profession...The Art of Intelligence – Introduction Machine Learning for Oracle profession...
The Art of Intelligence – Introduction Machine Learning for Oracle profession...
 
Machine Learning, Deep Learning and Data Analysis Introduction
Machine Learning, Deep Learning and Data Analysis IntroductionMachine Learning, Deep Learning and Data Analysis Introduction
Machine Learning, Deep Learning and Data Analysis Introduction
 
Deep Learning Made Easy with Deep Features
Deep Learning Made Easy with Deep FeaturesDeep Learning Made Easy with Deep Features
Deep Learning Made Easy with Deep Features
 
Machine learning
Machine learningMachine learning
Machine learning
 
Creativity and Curiosity - The Trial and Error of Data Science
Creativity and Curiosity - The Trial and Error of Data ScienceCreativity and Curiosity - The Trial and Error of Data Science
Creativity and Curiosity - The Trial and Error of Data Science
 

More from Ivo Andreev

Cybersecurity and Generative AI - for Good and Bad vol.2
Cybersecurity and Generative AI - for Good and Bad vol.2Cybersecurity and Generative AI - for Good and Bad vol.2
Cybersecurity and Generative AI - for Good and Bad vol.2
Ivo Andreev
 
Architecting AI Solutions in Azure for Business
Architecting AI Solutions in Azure for BusinessArchitecting AI Solutions in Azure for Business
Architecting AI Solutions in Azure for Business
Ivo Andreev
 
Cybersecurity Challenges with Generative AI - for Good and Bad
Cybersecurity Challenges with Generative AI - for Good and BadCybersecurity Challenges with Generative AI - for Good and Bad
Cybersecurity Challenges with Generative AI - for Good and Bad
Ivo Andreev
 
JS-Experts - Cybersecurity for Generative AI
JS-Experts - Cybersecurity for Generative AIJS-Experts - Cybersecurity for Generative AI
JS-Experts - Cybersecurity for Generative AI
Ivo Andreev
 
How do OpenAI GPT Models Work - Misconceptions and Tips for Developers
How do OpenAI GPT Models Work - Misconceptions and Tips for DevelopersHow do OpenAI GPT Models Work - Misconceptions and Tips for Developers
How do OpenAI GPT Models Work - Misconceptions and Tips for Developers
Ivo Andreev
 
OpenAI GPT in Depth - Questions and Misconceptions
OpenAI GPT in Depth - Questions and MisconceptionsOpenAI GPT in Depth - Questions and Misconceptions
OpenAI GPT in Depth - Questions and Misconceptions
Ivo Andreev
 
Collecting and Analysing Spaceborn Data
Collecting and Analysing Spaceborn DataCollecting and Analysing Spaceborn Data
Collecting and Analysing Spaceborn Data
Ivo Andreev
 
Collecting and Analysing Satellite Data with Azure Orbital
Collecting and Analysing Satellite Data with Azure OrbitalCollecting and Analysing Satellite Data with Azure Orbital
Collecting and Analysing Satellite Data with Azure Orbital
Ivo Andreev
 
Language Studio and Custom Models
Language Studio and Custom ModelsLanguage Studio and Custom Models
Language Studio and Custom Models
Ivo Andreev
 
CosmosDB for IoT Scenarios
CosmosDB for IoT ScenariosCosmosDB for IoT Scenarios
CosmosDB for IoT Scenarios
Ivo Andreev
 
Forecasting time series powerful and simple
Forecasting time series powerful and simpleForecasting time series powerful and simple
Forecasting time series powerful and simple
Ivo Andreev
 
Constrained Optimization with Genetic Algorithms and Project Bonsai
Constrained Optimization with Genetic Algorithms and Project BonsaiConstrained Optimization with Genetic Algorithms and Project Bonsai
Constrained Optimization with Genetic Algorithms and Project Bonsai
Ivo Andreev
 
Azure security guidelines for developers
Azure security guidelines for developers Azure security guidelines for developers
Azure security guidelines for developers
Ivo Andreev
 
Autonomous Machines with Project Bonsai
Autonomous Machines with Project BonsaiAutonomous Machines with Project Bonsai
Autonomous Machines with Project Bonsai
Ivo Andreev
 
Global azure virtual 2021 - Azure Lighthouse
Global azure virtual 2021 - Azure LighthouseGlobal azure virtual 2021 - Azure Lighthouse
Global azure virtual 2021 - Azure Lighthouse
Ivo Andreev
 
Flux QL - Nexgen Management of Time Series Inspired by JS
Flux QL - Nexgen Management of Time Series Inspired by JSFlux QL - Nexgen Management of Time Series Inspired by JS
Flux QL - Nexgen Management of Time Series Inspired by JS
Ivo Andreev
 
Azure architecture design patterns - proven solutions to common challenges
Azure architecture design patterns - proven solutions to common challengesAzure architecture design patterns - proven solutions to common challenges
Azure architecture design patterns - proven solutions to common challenges
Ivo Andreev
 
Industrial IoT on Azure
Industrial IoT on AzureIndustrial IoT on Azure
Industrial IoT on Azure
Ivo Andreev
 
Flying a Drone with JavaScript and Computer Vision
Flying a Drone with JavaScript and Computer VisionFlying a Drone with JavaScript and Computer Vision
Flying a Drone with JavaScript and Computer Vision
Ivo Andreev
 
ML with Power BI for Business and Pros
ML with Power BI for Business and ProsML with Power BI for Business and Pros
ML with Power BI for Business and Pros
Ivo Andreev
 

More from Ivo Andreev (20)

Cybersecurity and Generative AI - for Good and Bad vol.2
Cybersecurity and Generative AI - for Good and Bad vol.2Cybersecurity and Generative AI - for Good and Bad vol.2
Cybersecurity and Generative AI - for Good and Bad vol.2
 
Architecting AI Solutions in Azure for Business
Architecting AI Solutions in Azure for BusinessArchitecting AI Solutions in Azure for Business
Architecting AI Solutions in Azure for Business
 
Cybersecurity Challenges with Generative AI - for Good and Bad
Cybersecurity Challenges with Generative AI - for Good and BadCybersecurity Challenges with Generative AI - for Good and Bad
Cybersecurity Challenges with Generative AI - for Good and Bad
 
JS-Experts - Cybersecurity for Generative AI
JS-Experts - Cybersecurity for Generative AIJS-Experts - Cybersecurity for Generative AI
JS-Experts - Cybersecurity for Generative AI
 
How do OpenAI GPT Models Work - Misconceptions and Tips for Developers
How do OpenAI GPT Models Work - Misconceptions and Tips for DevelopersHow do OpenAI GPT Models Work - Misconceptions and Tips for Developers
How do OpenAI GPT Models Work - Misconceptions and Tips for Developers
 
OpenAI GPT in Depth - Questions and Misconceptions
OpenAI GPT in Depth - Questions and MisconceptionsOpenAI GPT in Depth - Questions and Misconceptions
OpenAI GPT in Depth - Questions and Misconceptions
 
Collecting and Analysing Spaceborn Data
Collecting and Analysing Spaceborn DataCollecting and Analysing Spaceborn Data
Collecting and Analysing Spaceborn Data
 
Collecting and Analysing Satellite Data with Azure Orbital
Collecting and Analysing Satellite Data with Azure OrbitalCollecting and Analysing Satellite Data with Azure Orbital
Collecting and Analysing Satellite Data with Azure Orbital
 
Language Studio and Custom Models
Language Studio and Custom ModelsLanguage Studio and Custom Models
Language Studio and Custom Models
 
CosmosDB for IoT Scenarios
CosmosDB for IoT ScenariosCosmosDB for IoT Scenarios
CosmosDB for IoT Scenarios
 
Forecasting time series powerful and simple
Forecasting time series powerful and simpleForecasting time series powerful and simple
Forecasting time series powerful and simple
 
Constrained Optimization with Genetic Algorithms and Project Bonsai
Constrained Optimization with Genetic Algorithms and Project BonsaiConstrained Optimization with Genetic Algorithms and Project Bonsai
Constrained Optimization with Genetic Algorithms and Project Bonsai
 
Azure security guidelines for developers
Azure security guidelines for developers Azure security guidelines for developers
Azure security guidelines for developers
 
Autonomous Machines with Project Bonsai
Autonomous Machines with Project BonsaiAutonomous Machines with Project Bonsai
Autonomous Machines with Project Bonsai
 
Global azure virtual 2021 - Azure Lighthouse
Global azure virtual 2021 - Azure LighthouseGlobal azure virtual 2021 - Azure Lighthouse
Global azure virtual 2021 - Azure Lighthouse
 
Flux QL - Nexgen Management of Time Series Inspired by JS
Flux QL - Nexgen Management of Time Series Inspired by JSFlux QL - Nexgen Management of Time Series Inspired by JS
Flux QL - Nexgen Management of Time Series Inspired by JS
 
Azure architecture design patterns - proven solutions to common challenges
Azure architecture design patterns - proven solutions to common challengesAzure architecture design patterns - proven solutions to common challenges
Azure architecture design patterns - proven solutions to common challenges
 
Industrial IoT on Azure
Industrial IoT on AzureIndustrial IoT on Azure
Industrial IoT on Azure
 
Flying a Drone with JavaScript and Computer Vision
Flying a Drone with JavaScript and Computer VisionFlying a Drone with JavaScript and Computer Vision
Flying a Drone with JavaScript and Computer Vision
 
ML with Power BI for Business and Pros
ML with Power BI for Business and ProsML with Power BI for Business and Pros
ML with Power BI for Business and Pros
 

Recently uploaded

Econ3060_Screen Time and Success_ final_GroupProject.pdf
Econ3060_Screen Time and Success_ final_GroupProject.pdfEcon3060_Screen Time and Success_ final_GroupProject.pdf
Econ3060_Screen Time and Success_ final_GroupProject.pdf
blueshagoo1
 
Drownings spike from May to August in children
Drownings spike from May to August in childrenDrownings spike from May to August in children
Drownings spike from May to August in children
Bisnar Chase Personal Injury Attorneys
 
Overview IFM June 2024 Consumer Confidence INDEX Report.pdf
Overview IFM June 2024 Consumer Confidence INDEX Report.pdfOverview IFM June 2024 Consumer Confidence INDEX Report.pdf
Overview IFM June 2024 Consumer Confidence INDEX Report.pdf
nhutnguyen355078
 
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
eoxhsaa
 
DSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelinesDSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelines
Timothy Spann
 
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
nyvan3
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
bmucuha
 
一比一原版加拿大麦吉尔大学毕业证(mcgill毕业证书)如何办理
一比一原版加拿大麦吉尔大学毕业证(mcgill毕业证书)如何办理一比一原版加拿大麦吉尔大学毕业证(mcgill毕业证书)如何办理
一比一原版加拿大麦吉尔大学毕业证(mcgill毕业证书)如何办理
agdhot
 
Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024
ElizabethGarrettChri
 
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
aguty
 
Digital Marketing Performance Marketing Sample .pdf
Digital Marketing Performance Marketing  Sample .pdfDigital Marketing Performance Marketing  Sample .pdf
Digital Marketing Performance Marketing Sample .pdf
Vineet
 
一比一原版马来西亚博特拉大学毕业证(upm毕业证)如何办理
一比一原版马来西亚博特拉大学毕业证(upm毕业证)如何办理一比一原版马来西亚博特拉大学毕业证(upm毕业证)如何办理
一比一原版马来西亚博特拉大学毕业证(upm毕业证)如何办理
eudsoh
 
一比一原版(uom毕业证书)曼彻斯特大学毕业证如何办理
一比一原版(uom毕业证书)曼彻斯特大学毕业证如何办理一比一原版(uom毕业证书)曼彻斯特大学毕业证如何办理
一比一原版(uom毕业证书)曼彻斯特大学毕业证如何办理
osoyvvf
 
社内勉強会資料_Hallucination of LLMs               .
社内勉強会資料_Hallucination of LLMs               .社内勉強会資料_Hallucination of LLMs               .
社内勉強会資料_Hallucination of LLMs               .
NABLAS株式会社
 
一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理
一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理
一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理
ywqeos
 
一比一原版(UofT毕业证)多伦多大学毕业证如何办理
一比一原版(UofT毕业证)多伦多大学毕业证如何办理一比一原版(UofT毕业证)多伦多大学毕业证如何办理
一比一原版(UofT毕业证)多伦多大学毕业证如何办理
exukyp
 
Experts live - Improving user adoption with AI
Experts live - Improving user adoption with AIExperts live - Improving user adoption with AI
Experts live - Improving user adoption with AI
jitskeb
 
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
ihavuls
 
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
Timothy Spann
 
How To Control IO Usage using Resource Manager
How To Control IO Usage using Resource ManagerHow To Control IO Usage using Resource Manager
How To Control IO Usage using Resource Manager
Alireza Kamrani
 

Recently uploaded (20)

Econ3060_Screen Time and Success_ final_GroupProject.pdf
Econ3060_Screen Time and Success_ final_GroupProject.pdfEcon3060_Screen Time and Success_ final_GroupProject.pdf
Econ3060_Screen Time and Success_ final_GroupProject.pdf
 
Drownings spike from May to August in children
Drownings spike from May to August in childrenDrownings spike from May to August in children
Drownings spike from May to August in children
 
Overview IFM June 2024 Consumer Confidence INDEX Report.pdf
Overview IFM June 2024 Consumer Confidence INDEX Report.pdfOverview IFM June 2024 Consumer Confidence INDEX Report.pdf
Overview IFM June 2024 Consumer Confidence INDEX Report.pdf
 
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
 
DSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelinesDSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelines
 
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
 
一比一原版加拿大麦吉尔大学毕业证(mcgill毕业证书)如何办理
一比一原版加拿大麦吉尔大学毕业证(mcgill毕业证书)如何办理一比一原版加拿大麦吉尔大学毕业证(mcgill毕业证书)如何办理
一比一原版加拿大麦吉尔大学毕业证(mcgill毕业证书)如何办理
 
Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024
 
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
 
Digital Marketing Performance Marketing Sample .pdf
Digital Marketing Performance Marketing  Sample .pdfDigital Marketing Performance Marketing  Sample .pdf
Digital Marketing Performance Marketing Sample .pdf
 
一比一原版马来西亚博特拉大学毕业证(upm毕业证)如何办理
一比一原版马来西亚博特拉大学毕业证(upm毕业证)如何办理一比一原版马来西亚博特拉大学毕业证(upm毕业证)如何办理
一比一原版马来西亚博特拉大学毕业证(upm毕业证)如何办理
 
一比一原版(uom毕业证书)曼彻斯特大学毕业证如何办理
一比一原版(uom毕业证书)曼彻斯特大学毕业证如何办理一比一原版(uom毕业证书)曼彻斯特大学毕业证如何办理
一比一原版(uom毕业证书)曼彻斯特大学毕业证如何办理
 
社内勉強会資料_Hallucination of LLMs               .
社内勉強会資料_Hallucination of LLMs               .社内勉強会資料_Hallucination of LLMs               .
社内勉強会資料_Hallucination of LLMs               .
 
一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理
一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理
一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理
 
一比一原版(UofT毕业证)多伦多大学毕业证如何办理
一比一原版(UofT毕业证)多伦多大学毕业证如何办理一比一原版(UofT毕业证)多伦多大学毕业证如何办理
一比一原版(UofT毕业证)多伦多大学毕业证如何办理
 
Experts live - Improving user adoption with AI
Experts live - Improving user adoption with AIExperts live - Improving user adoption with AI
Experts live - Improving user adoption with AI
 
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
 
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
 
How To Control IO Usage using Resource Manager
How To Control IO Usage using Resource ManagerHow To Control IO Usage using Resource Manager
How To Control IO Usage using Resource Manager
 

Machine learning for IoT - unpacking the blackbox

  • 1. Machine Learning for IoT Unpacking the Blackbox Ivelin Andreev
  • 3. About me • Software Architect @ o 15 years professional experience o .NET Web Development MCPD • External Expert Horizon 2020 • External Expert Eurostars-Eureka & IFD • Business Interests o Web Development, SOA, Integration o Security & Performance Optimization o IoT, Computer Intelligence • Contact o ivelin.andreev@icb.bg o www.linkedin.com/in/ivelin o www.slideshare.net/ivoandreev
  • 4. Agenda Machine Learning • Azure ML and Competitors • Choosing the right algorithm • Reasons your ML model may fail • Algorithm performance • Deep neural networks • ML on-premises • Demo
  • 5. Real World Business Cases • Predictive maintenance • Is it likely that the machine will break based on sensor readings? • Forest fire prediction • Is it likely that a fire will start based on environment parameters? • Automated teller machine replenishment • How to optimize ATM replenishment based on historical data for transactions? • Spear phishing identification • Is a web site likely to be phishing based on URL and HTTP response?
  • 6. ASPires Distributed IoT Platform Funded by EUROPEAN CIVIL PROTECTION AND HUMANITARIAN AID OPERATIONS 2016/PREV/03 (ASPIRES)
  • 7. ML is not Black Magic
  • 8. Supervised Learning • Majority of practical ML uses supervised learning • Mapping function approximated from past experience o Regression f(X) = Y, Y is a real number o Classification f(X) = Y, Y is a category label • Training o Labeled positive and negative examples o From unseen input, predict corresponding output o Learning until acceptable performance is achieved
  • 9. Unsupervised Learning • Discover hidden relations and learn about the data o Clustering f(X) = [X1,…, Xk], k disjoint subsets o Association f(Xi, Xj) = R, relation • Training o All examples are positive o No labeling / No teacher o No single correct answer • Practical usage o Derive groups, not explicitly labeled o Market basket analysis (association among items)
  • 10. Machine Learning with Microsoft Azure • Primary goal: makes deployable and scalable web services from the ML modules. • Though experience for creating ML models is great, it is not intended to be a place to create and export models
  • 11. Main ML Market Players Azure ML BigML Amazon ML Google Prediction IBM Watson ML Flexibility High High Low Low Low Usability High Med High Low High Training time Low Low High Med High Accuracy (AUC) High High High Med High Cloud/ On-premises +/- +/+ +/- +/- +/- Algorithms Classification Regression Clustering Anomaly detect Recommendations Classification Regression Clustering Anomaly Recommend Classification Regression Classification Regression Semantic mining Hypothesis rank Regression Customizations Model parameters R-script Evaluation support Own models C#, R, Node.js Few parameters
  • 12. Pricing? Azure ML BigML Amazon ML Google Prediction IBM Watson ML Model Building $1.00 / h $10.00 / user / month $30 - $10’000 / month $0.42 / h $0.002 / MB $0.45 / h $10.00 / service Retraining (per 1000) $0.50 - N/A $0.05 - Prediction (per 1000) $0.50 - $0.10 $0.50 $0.50 Compute (per hour) $2.00 - $0.42 - - Free Usage 1’000 / month 2h compute Dataset Size Max 16MB N/A 10’000/month 5’000 / month 5h compute Notes Private deployment $55’000 / year Shutdown April 30, 2018 Cloud ML Engine (TensorFlow)
  • 13. 1. Dataset Azure ML Flow 2. Training Experiment 3. Predictive Experiment 4. Publish Web Service 5. Retrain Model
  • 14. Which ML Algorithm shall I Use?
  • 15. Do you need Math to work with ML? Answer: It is not required, but would definitely help Data Science is what is really necessary Interdisciplinary field about processes and methods for extracting relations and knowledge from data Working around Math • Select the right algorithm • Use cheat-sheets instead • Choose parameter settings • Experiment • Use parameter-sweep • Identify underfitting and overfitting • Compare training and validation scores
  • 16. I selected an appropriate ML algorithm but… why my ML model fails?
  • 17. ML Decision Tree Is it an ML task? No, use another solution Is it correct ML scenario? No, try another scenario Is suitable model identified? Do you have enough data? Is the model overly complicated? Correct features used? Performed feature engineering? Proper evaluation metrics? Good evaluation set?
  • 18. Diagnose Steps (part 1) 1. Is it an ML task? Are you sure ML is the best solution? • Hard: X is independent of Y: X <Name, Age, Income>, Height=? • Easy: X is a set with limited variations. Configure Y=F(X) 2. Appropriate ML scenario? • Supervised learning (classification, regression, anomaly detection) • Unsupervised learning (clustering, pattern learning) 3. Appropriate model? • Data size (small data -> linear model, large data -> consider non-linear) • Sparse data (require normalization to perform better) • Imbalanced data (special treatment of the minority class required) • Data quality (noise and missing values require loss function – i.e. L2) 4. Enough training data? • Investigate how precision improves with more data
  • 19. 5. Model overly complicated • Start simple first, increase complexity and evaluate performance • Avoid overfitting to training set 6. Feature quality • Have you identified all useful features? • Use domain knowledge of an expert to start • Include any feature that could be found and investigate model performance 7. Feature engineering • The best strategy to improve performance and reveal important input • Encode features, normalize [0:1], combine features 8. Combine models • If multiple models have similar performance there is a chance of improvement • Use one model for one subset of data and another model for the other Diagnose Steps (part 2)
  • 20. Diagnose Steps (part 3) 9. Model Validation • Use appropriate performance indicator (Accuracy, Precision, Recall, F1, etc.) • How well does the model describe data? (AUC) • Data typically divided into Training and Validation • Evaluated accuracy on disjoint dataset (other than training dataset) • Tune model hyper parameters (i.e. number of iterations)
  • 22. Types of Algorithms Linear Algorithms • Classification - classes separated by straight line • Support Vector Machine – wide gap instead of line • Regression – linear relation between variables and label Non-Linear Algorithms • Decision Trees and Jungles - divide space into regions • Neural Networks – complex and irregular boundaries Special Algorithms • Ordinal Regression – ranked values (i.e. race) • Poisson - discrete distribution (i.e. count of events) • Bayesian – normal distribution of errors (bell curve)
  • 23. • Binary classification outcomes {negative; positive} • ROC Curve o TP Rate = True Positives / All Positives o FP Rate = False Positives / All Negatives • Example • AUC (Area Under Curve) o KPI for model performance and model comparison o 0.5 = Random prediction, 1 = Perfect match • For multiclass – average from all RoC curves Model Performance (Classification) TP Rate FP Rate 1-FP Rate 5 0.56 0.99 0.01 7 0.78 0.81 0.19 9 0.91 0.42 0.58
  • 24. • Probability Threshold o Cost of one error could be much higher that cost of other o (i.e. Spam filter – it is more expensive to miss a real mail) • Accuracy o For symmetric 50/50 data • Precision o (i.e. 1000 devices, 6 fails, 8 predicted, 5 true failures) o Correct positives (i.e. 5/8 = 0.625, FP are expensive) • Recall o Correctly predicted positives (i.e. 5/6=0.83, FN are expensive) • F1 (balanced error cost) o Balanced cost of Precision/Recall Threshold Selection (Binary)
  • 25. Model Performance (Regression) • Coefficient of Determination (R2) o Single numeric KPI – how well data fits model o R2>0.6 – good, R2>0.8 – very good R2=1 – perfect • Mean Absolute Error / Root Mean Squared Error o Deviation of the estimates from the observed values o Compare model errors measure in the SAME units • Relative Absolute Error / Relative Squared Error o % deviation from real value o Compare model errors measure in the DIFFERENT units
  • 26. Neural Networks & Deep Learning • Neural-Networks are considered universal function approximators • They can compute and learn any function
  • 27. Neural Network Architecture • Nodes, organized in layers, with weighted connections o Acyclic directed graph • Layers o Input (1), Output (1) o Shallow - 1 hidden layer o Deep – multiple hidden layers • MS NET# language • Define DNN layers • Bundles (connections) • Activation functions
  • 28. Artificial Neuron Activation • Calculates weighted sum of inputs • Adds bias (function shift) • Decides whether it shall be activated Natural Questions • Why do we have so many? • Why some work better than other? • Which one to use ?
  • 29. Activation Functions (AF) Goal • Convert input -> output signal • Output signal is input to next layer • Approximate target function faster Samples • ReLu, PReLu – good to start with • TanH and Sigmoid – outdated • Softmax – output layer, classification • Linear func – output layer, regression
  • 30. Image Recognition with DNN • Convolution • Non-Linearity (i.e. ReLU) • Downsampling • Classification
  • 31. Microsoft Cognitive Toolkit • The Microsoft way to run ML on-premises • Toolkit for learning and evaluating DNN o Production-ready o Open source (GitHub since Jan 2016) o Cross platform (Windows, Linux, Docker) o Highly optimized for multiple GPUs/machines • Definition in C++, Python, C# (beta), BrainScript • Convolutional NN are extremely good for feature extraction
  • 33. Key ML Modules Building blocks for creating experiments Split Data Split data into training and verification sets Train Model Trains model from untrained model and dataset Number of arguments = algorithm flexibility Score Model Confirm model results against data set Evaluate Model Get key quality factors of the results or evaluate against another model Tune Model Sweep model parameters for optimal settings
  • 34. Takeaways Machine Learning Cortana Intelligence Gallery (3700+ Sample Azure ML Projects) Evaluating model performance Azure ML documentation (full) Azure ML video guidelines ML algorithms by use case Artificial Neural Networks Intuitive Explanation of Convolutional Neural Networks Biginners guide to Convolutional Neural Networks Understanding activation functions Activation functions, which is better