PL SQLDay Machine Learning- Hands on ML.NET.pptx

Machine Learning: Hands on ML.NET
Luis Beltrán
SQLDay 2021

Luis Beltrán
• Researcher - Tomas Bata University in Zlín, Czech Republic.
• Lecturer - Tecnológico Nacional de México en Celaya,
Mexico.
• Xamarin, Azure and Artificial Intelligence
@darkicebeam
luis@luisbeltran.mx

AGENDA
• Machine Learning
• ML.NET
• ML Workflow with ML.NET
  • Data
  • Train
  • Evaluate
  • Save model
  • Consume model
• Deep Learning
• MLOps

Objectives
• Build, train, evaluate, and consume machine learning algorithms in your .NET apps
using ML.NET.
• Understand how TensorFlow (and ONNX) models can be integrated into a pipeline
for deep learning.
• Set up model lifecycle automation using MLOps.

Artificial Intelligence
The ability of a computer to perform tasks
commonly associated with intelligent beings
(reason, discover meaning, generalize, learn
from past experience)
• Typically starts as rule or
logic-based system
• Traditional AI techniques
can be difficult to scale

Machine Learning
Getting computers to make predictions
without being explicitly programmed
• Computers find patterns in
data and learn from
experience to act on new
data
• Used to solve problems
that are difficult or
impossible to solve with
rules-based programming

Machine Learning
Bread
or
not bread?
Bread
Not bread

Artificial Intelligence vs. Machine Learning
Rule-based Artificial Intelligence: Rules + Data → Answers
Machine Learning: Data + Answers → Rules

Deep learning
Deep Learning
Subset of ML based on
artificial neural networks
which imitate the way the
human brain learns, thinks,
and processes data.
• Neural networks form
many layers
• Scenarios include image
classification, object
detection, speech
recognition, NLP

AI + ML + Deep learning
Deep Learning is a subset of Machine Learning, which is a subset of Artificial Intelligence

Mapping business problems to ML Tasks
What problem are you looking to solve?
• Find outliers: Anomaly detection
• Predict a number: Regression, Forecasting
• Find relationships: Clustering
• Categorize items: Binary classification, Multiclass classification, Image classification
• Make suggestions: Recommendation

Machine Learning Workflow
• Model Training: Get the data → Prepare the data → Train → Evaluate
• Model Consumption: Deploy → Inferencing

Get and prepare the data
Data Source Pipeline Environment Data exploration

Prepare the data
Data exploration Visualization
Data cleaning

Train the model
Features: Previous Grade (A-F), Hours Studied. Label / Target: Pass.
Previous Grade (A-F)   Hours Studied   Pass
B                      5               Y
D                      2               N
A                      20              Y
Model: F(PreviousGrade, HoursStudied) = Pass

Evaluate
Evaluation
Metrics
Explainability Training effort

Deploy
Model files Deployment targets

Automated Machine Learning
• Model Training: Get the data → Prepare the data → Train → Evaluate
• Model Consumption: Deploy → Inferencing

Machine learning landscape
External

ONNX

At Microsoft
Azure Cognitive Services Azure Machine Learning
WinML

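At Microsoft
• ML.NET
  • Training custom models: Yes
  • Model consumption: Yes (ML.NET, TensorFlow, ONNX)
  • Requires ML knowledge: No
• Azure Cognitive Services
  • Training custom models: Limited to some services
  • Model consumption: Yes (consume via API/SDK)
  • Requires ML knowledge: No
• Azure ML
  • Training custom models: Yes
  • Model consumption: Yes (register models & consume via web service)
  • Requires ML knowledge: Somewhat
• WinML
  • Training custom models: No
  • Model consumption: Yes (ONNX)
  • Requires ML knowledge: No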

An open source and cross-platform
machine learning framework for .NET
Windows Linux macOS

Built for .NET Developers
• Can use existing C# and F# skills to integrate ML into .NET apps
• Data science & ML experience not required

ML.NET Tooling + AutoML
ML.NET API
(Microsoft.ML)
AutoML.NET API
(Microsoft.ML.AutoML)
Model Builder ML.NET CLI

Model Builder & ML.NET CLI
• Easily build custom ML models with AutoML
• Generates code for training and consumption
• Model Builder
  • Currently in Visual Studio only (ships with VS 16.6)
  • Integration with Azure ML (image classification)
• ML.NET CLI
  • Cross platform

Supported ML tasks in ML.NET
Classification Regression Image classification
Anomaly detection
Forecasting
Object detection
Clustering Recommendation
Ranking

Integration with other ML tech @ Microsoft
Azure Cognitive Services (Custom Vision)
• Train: Image classification or object detection
• Export: To ONNX
• Consume: In .NET app using ML.NET

Azure Machine Learning (Azure ML + Model Builder in VS)
• Start in Model Builder: Choose Scenario, Training Environment, & Data
• Train: Using Azure AutoML
• Consume: In .NET app using Model Builder & ML.NET

WinML
• Train: Using ML.NET
• Export: To ONNX
• Consume: With WinML in Windows Desktop Apps

When should you use ML.NET?
When you…
• Want to stay in .NET ecosystem for Machine Learning
• Don’t want to worry about low-level complexities of ML
• Want to train a custom model
• Want to consume a pre-trained model

ML Workflow with ML.NET

Problem to solve: Taxi fare prediction
• We will create a web app that lets users enter data about a taxi trip and returns how much they will pay (taxi fare).
• Regression task (value prediction scenario)
• App details:
  • Train regression model in .NET Core console app with given dataset
  • Consume model in ASP.NET Core web app

ML.NET NuGet Packages
• Most scenarios:
  • Microsoft.ML
• Forecasting & anomaly detection scenarios:
  • Microsoft.ML.TimeSeries
• Recommendation scenario:
  • Microsoft.ML.Recommender
• Database loader:
  • System.Data.SqlClient
• Consuming ONNX models:
  • Microsoft.ML.OnnxTransformer (+ Microsoft.ML.ImageAnalytics for object detection)
• Consuming TensorFlow models:
  • Microsoft.ML.TensorFlow + SciSharp.TensorFlow.Redist (+ Microsoft.ML.ImageAnalytics for image classification)
• Train custom image classification models:
  • Microsoft.ML.Vision + Microsoft.ML.ImageAnalytics + SciSharp.TensorFlow.Redist

MLContext
• MLContext = starting point for all ML.NET operations
• Provides ways to create components for
  • Data preparation
  • Feature engineering
  • Training
  • Prediction
  • Model evaluation
  • Logging
  • Execution control
  • Seeding

Task
1. Add the Microsoft.ML NuGet package to your console project
2. Initialize a new MLContext in your console app

IDataView
Data in ML.NET is represented as an IDataView, which is:
• High-dimensional
• Lazy + memory efficient
• Immutable

Data schema
• DataViewSchema = Data schema of IDataView = set of columns, their names, types, & other annotations
• Before loading data, you must define what the schema of the data will look like (column names & column types)
• Use class definitions to define IDataView (IDV) schemas
Dataset + class definition of schema → IDataView
Dataset:
Label             SepalLength   SepalWidth   PetalLength   PetalWidth
Iris-setosa       5.1           3.5          1.4           0.2
Iris-versicolor   7.0           3.2          4.7           1.4
Iris-setosa       4.9           3.0          1.5           0.1
…

Data loaders & sources
File loaders
• Load data from sources like text, binary, and image files to IDV
• Can load from single or multiple files
• Supported:
  • Text: .csv, .tsv, .txt
  • Images: .png, .jpg, .bmp
Database loaders
• Load data directly from a relational database and train on it
• Supports:
  • SQL Server, Azure SQL Database, Oracle, SQLite, PostgreSQL, Progress, IBM DB2, + many more
Other sources
• Load from Enumerable (in-memory collections)
• Supports:
  • JSON/XML
  • Everything else

Task
1. Create class for Model Input based on the provided taxi trip dataset
2. Load data from file to IDataView

Preparing your data
Filter data Convert data types Normalize the data
Split data Feature engineering

IEstimator and ITransformer
IEstimator ITransformer

Data transforms
Normalization
• Min-Max
• Binning
• Mean variance
Missing Values
• Indicate
• Replace
Column Mapping
• Concatenate
• Copy columns
• Drop columns
Type Conversion
• Convert type
• Map value to key
• Hash
Text Transforms
• Featurize text
• Remove stop words
• N-grams
• Word bags
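A minimal sketch of what Min-Max normalization does, written in plain Python rather than with the ML.NET API. Each value is rescaled linearly so that the column minimum maps to 0 and the maximum to 1. The function name and the rule for constant columns are assumptions made for illustration, and ML.NET's own normalizer may differ in detail. The sample values are the Hours Studied column from the training example.

```python
def min_max_normalize(values):
    """Rescale values linearly so the smallest maps to 0.0 and the largest to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column carries no information, so map it to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


hours_studied = [5, 2, 20]
print(min_max_normalize(hours_studied))
```

After normalization, features measured on very different scales contribute comparably to training.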

Algorithms / Trainers
Trainer = Algorithm + Task
Example: Stochastic Dual Coordinate Ascent (SDCA)
• Binary classification: SdcaNonCalibratedBinaryTrainer
• Multi-class classification: SdcaNonCalibratedMulticlassTrainer
• Regression: SdcaRegressionTrainer

Training pipeline
IEstimatorChain = Collection of Data Transforms + Algorithms
IDataView → IEstimatorChain (Drop columns → Normalize → Naïve Bayes algorithm) → Model

Fit() the model
The pipeline is executed when the Fit() method is called
ITransformer model = pipeline.Fit(trainingData);

Task
1. Split data into train and test datasets
2. Add data transformations to the pipeline
3. Choose an algorithm and add to the pipeline
4. Train the model

Evaluation metrics
ML task: most common evaluation metric (what to look for)
• Classification: Accuracy (binary), Micro-Accuracy (multi-class). The closer to 1.0, the better the quality.
• Regression: R-Squared. The closer to 1.0, the better the quality.
• Recommendation: R-Squared. The closer to 1.0, the better the quality.
• Clustering: Average Distance. Values closer to 0 are better.
• Ranking: Discounted Cumulative Gain. Higher values are better.
• Anomaly detection: Area Under ROC Curve. Values closer to 1 are better.
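To make "closer to 1.0 is better" concrete for regression, here is a short plain-Python sketch of the textbook R-Squared formula (1 minus the ratio of residual to total sum of squares). It is not ML.NET code, and the fare values are invented for illustration.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all actual values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical taxi fares (actual) and model outputs (predicted).
actual_fares = [10.0, 20.0, 30.0]
predicted_fares = [12.0, 18.0, 33.0]
print(r_squared(actual_fares, predicted_fares))
```

A perfect model scores 1.0, and a model that always predicts the mean of the actual values scores 0.0.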

Underfitting & Overfitting a model
Underfitting
Model is too simple and can’t capture the underlying trend of the data
To prevent:
• Remove noise from data
• Try different algorithms
Overfitting
Model doesn’t generalize well from training data to unseen data
To prevent:
• More training data
• Remove features
• Cross validation

Cross validation
• Training and model evaluation technique
• Splits the data into n partitions (folds) and trains multiple models on different combinations of these partitions
• Improves robustness by holding out data from the training process
Partition 1 | Partition 2 | Partition 3 | Partition 4 | Partition 5
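The partitioning behind cross validation can be sketched in a few lines of plain Python (this is not the ML.NET API). Each fold takes one partition as the held-out test set and trains on the rest. The function name is hypothetical, and rows are split in order without shuffling to keep the sketch short.

```python
def k_fold_indices(n_rows, n_folds):
    """Yield (train_indices, test_indices) once per fold."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_rows // n_folds + (1 if i < n_rows % n_folds else 0)
                  for i in range(n_folds)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_rows))
        yield train, test
        start += size


for fold, (train, test) in enumerate(k_fold_indices(n_rows=10, n_folds=5), start=1):
    print(f"Partition {fold}: held out rows {test}, training on {len(train)} rows")
```

Averaging the evaluation metric over the five held-out partitions gives a more robust estimate than a single train/test split.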

Model explainability
• Global and local explanations
  • Global = entire model (What features does the model give more importance to?)
  • Local = individual predictions (Why was Bob rejected for a loan?)
• Techniques:
Permutation Feature Importance (PFI)
• Used for Classification and Regression models
• Shuffles data one feature at a time and calculates how much the performance metric of interest decreases; the larger the change, the more important the feature
Feature Contribution Calculation (FCC)
• Used for Classification and Regression models
• Shows which features are most influential for a model’s prediction on a particular and individual data sample
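A rough plain-Python sketch of the PFI idea described above, not the ML.NET implementation. It shuffles one feature column at a time and records how much the error grows. Mean squared error stands in for "the performance metric of interest", so a larger increase means a more important feature. The toy model and data are invented for illustration.

```python
import random


def mean_squared_error(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def permutation_importance(predict, rows, labels, n_features, seed=0):
    """Return, per feature, the increase in MSE after shuffling that feature."""
    rng = random.Random(seed)
    baseline = mean_squared_error(labels, [predict(r) for r in rows])
    importances = []
    for j in range(n_features):
        column = [r[j] for r in rows]
        rng.shuffle(column)  # break the link between feature j and the label
        shuffled_rows = [r[:j] + [column[i]] + r[j + 1:] for i, r in enumerate(rows)]
        score = mean_squared_error(labels, [predict(r) for r in shuffled_rows])
        importances.append(score - baseline)
    return importances


# Hypothetical model: the prediction depends only on feature 0 and ignores feature 1.
def toy_model(row):
    return 3.0 * row[0]


rows = [[1.0, 7.0], [2.0, 3.0], [3.0, 9.0], [4.0, 1.0], [5.0, 5.0], [6.0, 2.0]]
labels = [toy_model(r) for r in rows]
print(permutation_importance(toy_model, rows, labels, n_features=2))
```

Shuffling the ignored feature leaves the error unchanged, so its importance is zero, which matches the intuition on the slide.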

Improving your model
• Provide more training data
• Filter missing values and outliers
• Select different features
• Choose a different algorithm
• Tune algorithm hyperparameters
• Cross validation

Tooling + AutoML
• Use AutoML to speed up the experimentation process
• Use Model Builder in VS or cross-platform ML.NET CLI
ML task: tooling (local / Azure AutoML)
• Text-based classification: Model Builder, CLI (Local)
• Value prediction (Regression): Model Builder, CLI (Local)
• Image classification: Model Builder (Local + Azure)
• Recommendation: Model Builder, CLI (Local)

Task
1. Evaluate your model and print out the metrics
2. Optional: Try training with different algorithms to see if your
evaluation metrics change

ML.NET Model
ML.NET
Model
=
MLModel.zip
Serialized zip file which contains
data schemas, data transforms,
and algorithms

Task
1. Save your model

How to consume model in ML.NET
1. Create model output schema
Task: model output
• Binary classification
  • Predicted Label: Class predicted by model (true or false)
  • Score: Positive score = true, negative score = false
  • Probability: Probability of having true as label
• Multiclass classification
  • Predicted Label: Class predicted by model
  • Score (vector): Scores of all classes; highest score = predicted label
• Regression
  • Score: Predicted value
• Recommendation
  • Score: Predicted rating
• Clustering
  • Predicted Label: Closest cluster’s index predicted by model
  • Score: Distances of data point to clusters’ centroids
• Ranking
  • Score: Predicted rank
• Anomaly detection
  • <Alert (Boolean), Raw Score, P-value (likelihood of anomaly)>
  • OR Predicted Label: Anomaly vs. not anomaly predicted by model; Score: Likelihood of anomaly
• Forecasting
  • Forecasting values
  • Confidence lower bounds
  • Confidence upper bounds
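How the Score relates to the Predicted Label for the two classification tasks can be shown with a tiny plain-Python sketch. It is illustrative only: the function names are hypothetical, and treating a score of exactly zero as false is an assumption, since the slide covers only positive and negative scores.

```python
def binary_predicted_label(score):
    """Positive score = true, negative score = false (zero treated as false here)."""
    return score > 0


def multiclass_predicted_label(scores, class_names):
    """The class with the highest score is the predicted label."""
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return class_names[best_index]


print(binary_predicted_label(1.7))
print(multiclass_predicted_label(
    [0.1, 0.7, 0.2],
    ["Iris-setosa", "Iris-versicolor", "Iris-virginica"],
))
```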

2. Load your model

3. Choose one of the below:
Make single predictions: Prediction Engine
• Prediction Engine = convenience API for making single predictions
Make single predictions scalable: Prediction Engine Pool
• Prediction Engine is not thread-safe
• Use dependency injection + Prediction Engine Pool in multi-threaded apps (e.g. web apps and services)
• Creates ObjectPool of PredictionEngine objects for application use
Make batch predictions: Transform
• Takes in data, applies the transformations (such as making predictions), and outputs the data
• Can load unknown data into IDataView, use Transform to predict, receive IDataView of predicted values, and use GetColumn to get the Prediction column
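The object-pool idea behind Prediction Engine Pool can be illustrated with a plain-Python analogy. Each thread borrows an engine, uses it exclusively, and returns it, so no engine is ever shared by two threads at once. The classes below are hypothetical stand-ins rather than ML.NET types, the fare formula is invented, and this simplified pool blocks when empty, which the real pool may not do.

```python
import queue
import threading


class PredictionEngine:
    """Stand-in for a predictor that must not be used by two threads at once."""

    def __init__(self, rate_per_km):
        self.rate_per_km = rate_per_km

    def predict(self, distance_km):
        return 2.5 + self.rate_per_km * distance_km  # hypothetical fare formula


class EnginePool:
    """Hands out engines one at a time so each is used by a single thread."""

    def __init__(self, factory, size):
        self._engines = queue.Queue()
        for _ in range(size):
            self._engines.put(factory())

    def predict(self, distance_km):
        engine = self._engines.get()  # blocks until an engine is free
        try:
            return engine.predict(distance_km)
        finally:
            self._engines.put(engine)  # always return the engine to the pool


pool = EnginePool(lambda: PredictionEngine(rate_per_km=1.5), size=2)
results = {}


def worker(distance_km):
    results[distance_km] = pool.predict(distance_km)


threads = [threading.Thread(target=worker, args=(d,)) for d in (1, 2, 4, 8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(results.items()))
```

Four requests are served by two engines here, which is the same trade-off a web app makes when many requests share a small pool.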

Model deployment
Desktop Web Mobile

Task
1. Load the model from a file to the web app
2. Create a Prediction Engine
3. Use the model and prediction engine to make predictions on new
sample data (e.g. consume the model)

What are Neural Networks?

Deep Learning
• Deep learning is a subfield of Machine Learning
concerned with algorithms inspired by the structure
and function of the brain called artificial neural
networks.
• It is exceptionally effective in discovering patterns.
• Algorithms learn through a multi-layered hierarchy.
• If you supply the system with tons of information, it
will begin to understand and respond in helpful
ways.

Deep learning has a built-in, automatic, multi-stage feature learning process that learns rich hierarchical representations (i.e. features).
Low-level features → Mid-level features → High-level features → Trainable classifier → Output (e.g. exterior, interior)

• Image
  Pixel → Edge → Texture → Motif → Part → Object
• Text
  Character → Word → Word-group → Clause → Sentence → Story
• Each module in Deep Learning transforms its input representation into a higher-level one, in a way similar to the human cortex.
Input → Low-level features → Mid-level features → High-level features → Trainable classifier → Output

Convolutional Layers
Input Image: 20 × 20 matrix of pixel intensities between 0 and 1 (values omitted)
Filter:
0  1  0
1 -4  1
0  1  0
Input Image → Filter → Convolved Image

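Convolution
Input Image:
a b c d
e f g h
i j k l
m n o p
Filter:
w1 w2
w3 w4
Convolved Image (Feature Map): h1 h2
$h_1 = f(a \cdot w_1 + b \cdot w_2 + e \cdot w_3 + f \cdot w_4)$
$h_2 = f(b \cdot w_1 + c \cdot w_2 + f \cdot w_3 + g \cdot w_4)$
Here the outer f(·) is the activation function, while the f inside the sum is the pixel value from the input image.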
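The h1 and h2 formulas can be checked with a small plain-Python sketch that slides a filter over an image one position at a time and applies an activation function to each weighted sum. ReLU is an assumption here, since the slide does not say which activation f is, and the pixel and weight values are invented. As in the slide's formulas, the filter is applied without flipping.

```python
def relu(x):
    return max(0.0, x)


def convolve2d(image, kernel, activation=relu):
    """Slide the kernel over the image (stride 1, no padding) and apply the activation."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    feature_map = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = sum(image[r + i][c + j] * kernel[i][j]
                        for i in range(k_rows) for j in range(k_cols))
            row.append(activation(total))
        feature_map.append(row)
    return feature_map


# Hypothetical values for pixels a..p and weights w1..w4.
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
kernel = [[1, 2],
          [3, 4]]
feature_map = convolve2d(image, kernel)
print(feature_map[0][0], feature_map[0][1])  # h1 and h2
```

With these numbers h1 = 1·1 + 2·2 + 5·3 + 6·4 = 44 and h2 = 2·1 + 3·2 + 6·3 + 7·4 = 54. The same function works for larger filters, such as a 3 × 3 one.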

Lower Level to More Complex Features
Input Image → Filter 1 (w1 w2 / w3 w4) → Layer 1 Feature Map → Filter 2 (w5 w6 / w7 w8) → Layer 2 Feature Map

Pooling
• Max pooling: reports the maximum output within a rectangular neighborhood.
• Average pooling: reports the average output of a rectangular neighborhood.
MaxPool with 2×2 filter with stride of 2
Input Matrix:
1 3 5 3
4 2 3 1
3 1 1 3
0 1 0 4
Output Matrix:
4 5
3 4
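The pooling example can be reproduced with a few lines of plain Python. This is a sketch with hypothetical function names: the window moves in steps of the stride, and an aggregate function (max or average) reduces each window to one number.

```python
def pool2d(matrix, size=2, stride=2, aggregate=max):
    """Reduce each size x size window, stepping by stride, to a single value."""
    output = []
    for r in range(0, len(matrix) - size + 1, stride):
        row = []
        for c in range(0, len(matrix[0]) - size + 1, stride):
            window = [matrix[r + i][c + j] for i in range(size) for j in range(size)]
            row.append(aggregate(window))
        output.append(row)
    return output


def average(values):
    return sum(values) / len(values)


input_matrix = [[1, 3, 5, 3],
                [4, 2, 3, 1],
                [3, 1, 1, 3],
                [0, 1, 0, 4]]
print(pool2d(input_matrix))                     # [[4, 5], [3, 4]], the slide's output matrix
print(pool2d(input_matrix, aggregate=average))  # [[2.5, 3.0], [1.25, 2.0]]
```

Both variants shrink the 4 × 4 input to 2 × 2. Max pooling keeps the strongest response in each window, while average pooling smooths it.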

Convolutional Neural Network
Feature Extraction Architecture
• Convolutional layers (filters per layer): 64, 64, 128, 128, 256, 256, 256, 512, 512, 512, 512, 512, 512, interleaved with max pooling layers
• Fully Connected Layers
• Output Vector: Living Room, Bed Room, Kitchen, Bathroom, Outdoor

Training Deep Learning
Models in ML.NET
Architectures
• MobileNet
• Inception
• ResNet

Deep learning in ML.NET
Model Training: Image Classification API
• Train custom image classification models via Image Classification API
• Uses transfer learning
• Built on TensorFlow.NET
• Can use local GPU for training
Model Training: Model Builder in VS
• Train custom image classification models
• Can train locally or in Azure (Azure ML)
Model Consumption: ML.NET API
• Consume pre-trained TensorFlow and ONNX models

Task
• Local image training with Image Classification API

Solution structure
• Input Model
• NuGet Packages
• Output Model (Prediction)

Main libraries
Paths
PrepareSet:
Loading input images for training,
validation, and testing

Display information to the Console
Input data (images)

Dataset: http://www.laurencemoroney.com/rock-paper-scissors-dataset/

Main program
Loading data for
supervised learning
(images include tags)
Training and Validation sets
Load pipeline:
Images loaded in memory
Training options:
ImageClassificationTrainer
chosen, based on the
InceptionV3 architecture
Training pipeline:
Trying to predict a
category
Both pipelines are combined

Perform training
Model precision is validated
using validation dataset
Model Metrics calculated
Test the classification model using the new images
Prepare new images for validation
Export the model
Consume the model

ConsumingModel
Load a previously trained
classification model and prepare test
images that were not used before in
the training and validation stages
ClassifyImages: Test the model with new images

Results
Training
Features discovery
Precision
Learning Rate
Cross-Entropy

Validation
Precision metrics
Model validation results:
Image
Actual damage class
Prediction

Image classification results
ML Model
exported as zip file

You can load a pre-trained TensorFlow model to integrate it in an
ML.NET pipeline:

Task
Analyze sentiment of movie reviews using a pre-trained TensorFlow model in ML.NET

What is MLOps?
Machine Learning Operations (MLOps) applies DevOps principles & practices (e.g. continuous integration, delivery, and deployment) to the ML process

Task
Build a project that trains, tests, and deploys a model.

Thank you for your attention
luis@luisbeltran.mx
