Machine learning (using Accord.NET and FSharp)
Machine-learning with Accord.NET: Wine-quality
This example is based on the Wine Quality dataset from the
University of California Irvine Machine Learning Repository:
https://archive.ics.uci.edu/ml/datasets/Wine+Quality
Machine-Learning
• Current cloud providers (Microsoft, Amazon, Google, …) have an interest in selling computing power as APIs.
• Machine learning takes a lot of computing power.
So they have an interest in making it the next buzzword.
They made "cloud" a buzzword, so they can do it again,
e.g. https://azure.microsoft.com/en-in/services/machine-learning/
However, this time we won't use any cloud APIs, but an open-source tool called Accord.NET.
A case for machine learning
• We have some existing sample data.
• We want to estimate a variable that is known in the sample data but not in real life.
• We expect that real results will follow the sample data.
Randomize the order of the sample data rows and split them into two parts:
1) Training set
• Used to train (fit) the model.
2) Model evaluation set
• Used to verify that the model also works on data outside the training samples.
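The shuffle-and-split step above can be sketched in a few lines of Python. This is only an illustration: the 80/20 ratio, the fixed seed and the name `split_rows` are assumptions, not something the slides specify.

```python
import random

def split_rows(rows, train_fraction=0.8, seed=42):
    """Shuffle a copy of the rows and return (training, evaluation) lists."""
    shuffled = list(rows)                  # leave the caller's data untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

rows = list(range(10))  # stand-in for the dataset rows
training, evaluation = split_rows(rows)
print(len(training), len(evaluation))
```

Every row ends up in exactly one of the two sets, so the evaluation set really is data the model has not seen during training.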
A sample dataset:
Wine quality
The dataset contains only measured parameters of wines and a quality score from 0 to 10 voted by people:
https://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-red.csv
Can we estimate the quality of an unlisted wine based on the features we know?
Two example rows from the dataset:
Feature                 Wine 1    Wine 2
fixed acidity           7.4       7.3
volatile acidity        0.7       0.65
citric acid             0         0
residual sugar          1.9       1.2
chlorides               0.076     0.065
free sulfur dioxide     11        15
total sulfur dioxide    34        21
density                 0.9978    0.9946
pH                      3.51      3.39
sulphates               0.56      0.47
alcohol                 9.4       10
quality                 5         7
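As a rough sketch, the rows can be read into feature lists and quality labels like this. It assumes the CSV is semicolon-delimited with a quoted header row; the embedded text holds only the two wines shown on this slide, and `load_wines` is a made-up helper name.

```python
import csv
import io

# Assumption: same layout as winequality-red.csv; only the two slide rows are included.
SAMPLE = '''"fixed acidity";"volatile acidity";"citric acid";"residual sugar";"chlorides";"free sulfur dioxide";"total sulfur dioxide";"density";"pH";"sulphates";"alcohol";"quality"
7.4;0.7;0;1.9;0.076;11;34;0.9978;3.51;0.56;9.4;5
7.3;0.65;0;1.2;0.065;15;21;0.9946;3.39;0.47;10;7
'''

def load_wines(text):
    """Return (header, features, qualities) parsed from semicolon-delimited text."""
    reader = csv.reader(io.StringIO(text), delimiter=";")
    header = next(reader)
    features, qualities = [], []
    for row in reader:
        values = [float(v) for v in row]
        features.append(values[:-1])       # the 11 measured parameters
        qualities.append(int(values[-1]))  # the voted quality score
    return header, features, qualities

header, features, qualities = load_wines(SAMPLE)
print(header[-1], qualities)
```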
(Linear) Regression
Creating a linear regression over
one feature is relatively simple.
y = k x + b
The dataset has a large number of wines
with different alcohol levels and qualities.
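A minimal sketch of fitting y = k x + b with ordinary least squares. The alcohol/quality pairs below are made up for illustration (they are not rows from the dataset), and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = k*x + b over one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k = sxy / sxx            # slope
    b = mean_y - k * mean_x  # intercept
    return k, b

# Hypothetical data, chosen to lie exactly on a line.
alcohol = [9.0, 10.0, 11.0, 12.0]
quality = [5.0, 5.5, 6.0, 6.5]
k, b = fit_line(alcohol, quality)
print(k, b)
```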
But the dataset also has 10 other
features, so how do we make a
regression over all 11 variables
combined? Does it take forever?
y = k1 x1 + k2 x2 + … + kn xn + b
Original picture from: http://brandewinder.com/2016/08/06/gradient-boosting-part-1/
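It does not have to take forever: with n features, least squares reduces to solving an (n+1) x (n+1) linear system, the so-called normal equations. The sketch below does this in plain Python. It is a generic illustration, not a description of how Accord.NET solves the problem internally; the toy data and the names `solve` and `fit_linear` are assumptions.

```python
def solve(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting.
    Raises ZeroDivisionError if the system is singular."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_linear(rows, ys):
    """Least-squares fit of y = k1*x1 + ... + kn*xn + b."""
    design = [list(r) + [1.0] for r in rows]  # the trailing 1.0 carries b
    cols = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(cols)]
           for i in range(cols)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(cols)]
    *ks, b = solve(xtx, xty)
    return ks, b

# Hypothetical data generated from y = 2*x1 - 1*x2 + 3.
rows = [(1, 0), (0, 1), (1, 1), (2, 1), (1, 3)]
ys = [5, 2, 4, 6, 2]
ks, b = fit_linear(rows, ys)
print(ks, b)
```

For the wine data the same function would receive 11 values per row, giving a 12 x 12 system.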
Accord.NET cancer example
Age Smokes Had cancer
55 0 FALSE
28 0 FALSE
65 1 FALSE
46 0 TRUE
86 1 TRUE
56 1 TRUE
85 0 FALSE
33 0 FALSE
21 1 FALSE
42 1 TRUE
Feature   Odds ratio
Age       1.02
Smoking   5.86
Calculation:
y(x0, x1) = 0.0206451183100222*x0 + 1.76788931343272*x1 - 2.45774643623285
where x0 = Age and x1 = Smokes.
Decide()
http://fssnip.net/7Sz
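The odds ratios on this slide are the exponentials of the two coefficients, which is what a logistic regression model gives. The sketch below reuses the slide's coefficients to show that relationship and a yes/no decision. The 0.5 threshold and the function names are assumptions made for illustration; this is not Accord.NET code.

```python
import math

# Coefficients copied from the slide.
AGE_COEF = 0.0206451183100222
SMOKES_COEF = 1.76788931343272
INTERCEPT = -2.45774643623285

def score(age, smokes):
    """Linear part y(x0, x1) from the slide."""
    return AGE_COEF * age + SMOKES_COEF * smokes + INTERCEPT

def probability(age, smokes):
    """Logistic function applied to the linear score."""
    return 1.0 / (1.0 + math.exp(-score(age, smokes)))

def decide(age, smokes, threshold=0.5):
    """True when the estimated probability reaches the (assumed) threshold."""
    return probability(age, smokes) >= threshold

odds_ratios = {"Age": math.exp(AGE_COEF), "Smoking": math.exp(SMOKES_COEF)}
print({name: round(value, 2) for name, value in odds_ratios.items()})
print(decide(86, 1), decide(28, 0))
```

Read this way, each extra year of age multiplies the odds by about 1.02, and smoking multiplies them by about 5.86.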
Decision trees
Instead of combining slopes, combine simple
feature-condition stumps.
Estimate a few (discrete) categories based on a
combination of decision nodes.
What method should I choose?
http://scikit-learn.org/stable/_static/ml_map.png
Figure: a tree with two decision nodes, pH > 3.5 and Alcohol > 10.6.
Manual example and theory:
http://brandewinder.com/2016/08/06/gradient-boosting-part-1/
http://fssnip.net/7Tz
The figure has just 2 stumps, but real-life AI can
generate huge trees.
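A single stump is one threshold on one feature. The sketch below finds the threshold that minimises squared error, which is one common way to choose a split. The alcohol/quality values are made up so that the best split lands on the figure's "Alcohol > 10.6" condition, and `best_stump` is a hypothetical helper, not part of Accord.NET.

```python
def best_stump(xs, ys):
    """Return (threshold, mean_if_x_le_threshold, mean_if_x_gt_threshold)
    for the split with the lowest total squared error."""
    def sse(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    candidates = sorted(set(xs))[:-1]  # splitting at the maximum leaves one side empty
    if not candidates:
        raise ValueError("need at least two distinct feature values")
    best = None
    for t in candidates:
        low = [y for x, y in zip(xs, ys) if x <= t]
        high = [y for x, y in zip(xs, ys) if x > t]
        err = sse(low) + sse(high)
        if best is None or err < best[0]:
            best = (err, t, sum(low) / len(low), sum(high) / len(high))
    return best[1:]

# Hypothetical data, not taken from the dataset.
alcohol = [9.4, 9.8, 10.0, 10.6, 11.2, 12.0]
quality = [5, 5, 5, 5, 7, 7]
threshold, low_mean, high_mean = best_stump(alcohol, quality)
print(threshold, low_mean, high_mean)
```

A tree is built by applying the same search again inside each side of the split, for example adding a pH condition under one branch.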
Use-case: Quality for our event’s wine from Alko
https://www.alko.fi/tuotteet/455518/Frontera-Cabernet-Sauvignon-2016-hanapakkaus
Data from Alko analysis laboratory, wine entry L2BIBS34016:
In Finnish                     In English
Alk-%             12,01        Alcohol            12.01 %
Sokeri            3,5 g/l      Sugar              3.5 g/l
Haihtuvat hapot   0,5 g/l      Volatile acidity   0.5 g/l
Kokonaisrikki     96 mg/l      Total sulfur       96 mg/l
Vapaa rikki       36 mg/l      Free sulfur        36 mg/l
Sitruunahappo     0,045 g/l    Citric acid        0.045 g/l
• This wine is from Chile, while the sample data is from Portugal (vinho verde wines), so our algorithm has to be able to
work outside the dataset.
• Parameter mismatch:
1) Convert parameters,
2) Remove the parameter from the learning process
→ Measure the error and the effect on model quality
We don't have      Mean
Fixed acidity      8.32 g/l
Chlorides          0.087 g/l
Density            0.9967 g/cm³
pH                 3.31
Sulphates          0.66 g/l

Extra data         Known
Total acids        4.62 g/l
Extract            29.7
Density            "medium"
Grape              Cabernet Sauvignon
(Alko provided the data I asked for by email.)
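One way to turn the Alko measurements into a model input is to parse the Finnish decimal commas and fill the missing parameters with the dataset means listed above. The sketch below does that. The mapping of Alko's fields onto dataset feature names (for example "Sokeri" to "residual sugar") is an assumption, and the two sources may not measure acidity in exactly the same way.

```python
# Feature order of the wine-quality dataset (quality excluded).
FEATURES = [
    "fixed acidity", "volatile acidity", "citric acid", "residual sugar",
    "chlorides", "free sulfur dioxide", "total sulfur dioxide", "density",
    "pH", "sulphates", "alcohol",
]

# Means from the slide, used for the parameters Alko did not report.
DATASET_MEANS = {
    "fixed acidity": 8.32, "chlorides": 0.087, "density": 0.9967,
    "pH": 3.31, "sulphates": 0.66,
}

# Alko laboratory values as printed; the dataset names are an assumed mapping.
ALKO_RAW = {
    "alcohol": "12,01",
    "residual sugar": "3,5 g/l",
    "volatile acidity": "0,5 g/l",
    "total sulfur dioxide": "96 mg/l",
    "free sulfur dioxide": "36 mg/l",
    "citric acid": "0,045 g/l",
}

def parse_finnish_number(text):
    """'0,045 g/l' -> 0.045 (drop the unit, swap the decimal comma)."""
    return float(text.split()[0].replace(",", "."))

def build_feature_vector(raw, means):
    """Known values first, dataset mean as the fallback for missing ones."""
    known = {name: parse_finnish_number(value) for name, value in raw.items()}
    return [known[name] if name in known else means[name] for name in FEATURES]

vector = build_feature_vector(ALKO_RAW, DATASET_MEANS)
print(vector)
```

The alternative from the list above, removing the mismatched parameters from training, avoids guessing values but gives the model less information; measuring the evaluation error both ways shows which costs less.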
