Ruby Shrestha
THE ABC OF IMPLEMENTING
SUPERVISED ML WITH
PYTHON
| MACHINE LEARNING |
A computer program is said to 'learn' from experience E with respect to some
class of tasks T and performance measure P, if its performance at tasks in T, as
measured by P, improves with experience E.
(source: Wiki)
• Task (T): recognizing and classifying handwritten words within images
• Performance measure (P): percent of words correctly classified
• Training experience (E): a database of handwritten words with given
classifications
OFFICIAL DEFINITION OF
MACHINE LEARNING (ML)
• Sample: any item to classify, for example a picture, a document, a row of a database or CSV
file, or an audio or video clip
• Training Set: the set of data from which the system develops a predictive relationship
• Testing Set: the set of data for which the system makes predictions, used to check the
trained system
• Features: distinct traits that describe each item in a quantitative manner
• Feature Vector: n-dimensional vector of quantitative features
• Label Vector: 1-dimensional vector of label values / classes, one for each row of the
feature vector (in case of supervised learning)
• Feature Extraction: preparation of the Feature Vector and Label Vector
TERMINOLOGIES
EXAMPLE
● Given above is the training set.
● Petal width, petal length, sepal width and sepal length are the features from which we can
create the feature vector.
● Species name is the label from which we can create the label vector.
● Each row is one sample.
● A number of samples form the training set, used to train the machine. Similarly, another group
of similar samples forms the testing set, used to test the accuracy.
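The terminology above can be sketched with the iris dataset that ships with scikit-learn. This is only an illustration: the slides do not say where their table comes from, so `load_iris`, the 80/20 split and `random_state=0` are assumptions made here.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Assumption: use scikit-learn's bundled iris data as a stand-in for the table
iris = load_iris()

X = iris.data    # feature vectors: sepal length/width, petal length/width
y = iris.target  # label vector: species encoded as 0, 1, 2

# Hold back 20% of the samples as the testing set (the split ratio is an assumption)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

print(iris.feature_names)
print(X_train.shape, X_test.shape)
```

Each row of `X` is one sample, and the matching entry of `y` is its label.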
• Supervised Learning
• Unsupervised Learning
• Reinforcement Learning
TYPES OF MACHINE LEARNING
TYPES OF SUPERVISED ML
Regression Classification
Fig: Linear Regression
Source: http://ci.columbia.edu
Fig: 3- Class Leaf Species Classification
Source: https://astrobites.org
WORKFLOW OF SUPERVISED ML
Source: NLTK
• Download and install Python (on Ubuntu: sudo apt-get install python3)
• Any IDE, for example PyCharm (https://itsfoss.com/install-pycharm-ubuntu/)
• Important Python libraries to install (using pip):
✓ Numpy: n-dimensional array creation and array related functionalities (pip install numpy)
✓ Scipy: scientific operations (linear algebra, integration, signal and image processing) (pip install scipy)
✓ Matplotlib: plotting figures (pip install matplotlib)
✓ Pandas: high level data manipulation (groupby, merge, join, time series data manipulation) (pip install pandas)
✓ Sklearn: Machine Learning algorithms (pip install scikit-learn)
• Topic Related Dataset
PREREQUISITES
Usual Method
a = 83;
b = -2;
c = a + b;
A Different Contemporary Approach
“MACHINE LEARNING”
MAKE A MACHINE LEARN TO ADD
Instruction   Meaning                          Machine code
LDA A         Load operand from location A     0010 0001 0000 0100
ADD B         Add operand from location B      0001 0001 0000 0101
STA C         Store sum in location C          0011 0001 0000 0110
1. Import required libraries using import statement.
• Example: for importing scikit-learn,
• import sklearn
• import sklearn as sk
• from sklearn.linear_model import LinearRegression
ADDING TWO NUMBERS BASED ON ML
2. Load the dataset.
The dataset is a 6 x 3 array.
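A minimal sketch of this step, assuming the dataset is typed directly into a NumPy array. The values below are hypothetical (the original data is not shown); only the 6 x 3 shape and the examples 2+3=5 and 1+5=6 come from the slides.

```python
import numpy as np

# Hypothetical dataset: each row is [first number, second number, their sum]
dataset = np.array([
    [2, 3, 5],
    [1, 5, 6],
    [4, 4, 8],
    [7, 2, 9],
    [3, 6, 9],
    [10, 5, 15],
])

print(dataset.shape)  # (6, 3)
```

In practice the same array could come from a CSV file, for example through pandas.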
3. Create Feature Set and Label Set.
Feature Set: 6 x 2 array (the two numbers to add)
Label Set: 1-d array (their sums)
Data summarization and visualization, using functions of matplotlib or another suitable
module, would normally follow. Here, it is not required.
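One way to split a 6 x 3 dataset into a feature set and a label set is plain NumPy slicing. The data values and the names `X` and `y` are assumptions for illustration.

```python
import numpy as np

# Hypothetical dataset: each row is [first number, second number, their sum]
dataset = np.array([
    [2, 3, 5],
    [1, 5, 6],
    [4, 4, 8],
    [7, 2, 9],
    [3, 6, 9],
    [10, 5, 15],
])

X = dataset[:, :2]  # Feature Set: first two columns, shape (6, 2)
y = dataset[:, 2]   # Label Set: last column, shape (6,)

print(X.shape, y.shape)  # (6, 2) (6,)
```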
4. Choose an appropriate Machine Learning algorithm
According to docs.microsoft.com:
It depends on:
• size, quality, and nature of the data.
• what you want to do with the answer.
• how the math of the algorithm was translated into instructions for the computer you are using.
• how much time you have.
• complexity of the problem at hand.
“Even the most experienced data scientists can't tell which algorithm will
perform best before trying them; however, they can certainly give a strong
hypothesis.”
5. Create an instance of the chosen ML algorithm.
How does it work?
Given: x1 and x2 are the two numbers to add and y is the result. In Linear Regression:
y = a*x1 + b*x2 + c
For simple addition, the coefficients we are looking for are:
a = 1, b = 1 and c = 0
But here we want the AI to figure them out by itself. So, we just feed it some
examples (2+3=5, 1+5=6).
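The equation above can be checked by hand before any learning takes place. In this sketch, the helper `linear_model` is a hypothetical name introduced only for illustration; `LinearRegression` is the scikit-learn class named in step 1.

```python
from sklearn.linear_model import LinearRegression


def linear_model(x1, x2, a, b, c):
    """Evaluate y = a*x1 + b*x2 + c for given coefficients."""
    return a * x1 + b * x2 + c


# With the coefficients we hope the machine will find, the equation is plain addition
print(linear_model(2, 3, a=1, b=1, c=0))  # 5
print(linear_model(1, 5, a=1, b=1, c=0))  # 6

# Step 5 itself: create an (as yet untrained) instance of the chosen algorithm
model = LinearRegression()
```

At this point `model` holds no coefficients. They only appear after fitting.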
6. Fit the linear regression model with the training data.
The fit function finds the coefficients required to create a trained model.
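A minimal sketch of the fitting step, assuming a small hypothetical training set of six addition examples. After fitting, `coef_` holds a and b, and `intercept_` holds c.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training data: each row of X is [x1, x2], each label is x1 + x2
X = np.array([[2, 3], [1, 5], [4, 4], [7, 2], [3, 6], [10, 5]])
y = np.array([5, 6, 8, 9, 9, 15])

model = LinearRegression()
model.fit(X, y)

# The learned values should be very close to a = 1, b = 1, c = 0,
# up to floating-point rounding
print("a, b =", model.coef_)
print("c    =", model.intercept_)
```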
7. We’re almost done now ☺ Now we just need to test the trained system
by asking it for the sums of pairs of test numbers.
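Testing can be sketched with the model's `predict` method. The training data and the test pairs below are hypothetical. The pair 83 and -2 echoes the "usual method" example from earlier in the slides.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training data: each row of X is [x1, x2], each label is x1 + x2
X_train = np.array([[2, 3], [1, 5], [4, 4], [7, 2], [3, 6], [10, 5]])
y_train = np.array([5, 6, 8, 9, 9, 15])

model = LinearRegression()
model.fit(X_train, y_train)

# Pairs the model has never seen
X_test = np.array([[83, -2], [20, 22]])
predictions = model.predict(X_test)

# Expect values very close to 81 and 42 (floats, not exact integers)
print(predictions)
```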
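8. Finally, if interested, we can find the accuracy of the system using a
mathematical calculation (accuracy = correct / total) or the accuracy_score
function of the sklearn.metrics module. Linear regression returns floating-point
values, so the predictions are rounded first; accuracy_score expects discrete labels.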
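Both ways of computing accuracy can be sketched as follows, under the assumption that predictions are rounded to the nearest integer before they are compared with the true sums. The training data and test pairs are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import accuracy_score

# Hypothetical training data: each row of X is [x1, x2], each label is x1 + x2
X_train = np.array([[2, 3], [1, 5], [4, 4], [7, 2], [3, 6], [10, 5]])
y_train = np.array([5, 6, 8, 9, 9, 15])
model = LinearRegression().fit(X_train, y_train)

# Hypothetical test pairs with their true sums
X_test = np.array([[83, -2], [20, 22], [0, 7], [15, 30]])
y_test = np.array([81, 42, 7, 45])

# Round the float predictions so they can be compared as discrete values
y_pred = np.rint(model.predict(X_test)).astype(int)

# Manual calculation: accuracy = correct / total
manual_accuracy = np.sum(y_pred == y_test) / len(y_test)

# Same figure from scikit-learn
sk_accuracy = accuracy_score(y_test, y_pred)

print(manual_accuracy, sk_accuracy)
```

For regression problems in general, error measures such as mean squared error are the more usual choice. Accuracy works here only because the targets are whole numbers.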
1. Import required libraries using import statement.
2. Load the dataset.
3. Create Feature Set and Label Set.
4. Choose an appropriate Machine Learning algorithm.
5. Create an instance of the chosen ML algorithm.
6. Fit the model with the training data.
7. Test the system for results.
8. Finally, if interested, find the accuracy of the system using a mathematical calculation
(accuracy = correct / total) or the accuracy_score function of the sklearn.metrics
module (after rounding the predictions).
THE OVERALL STEPS NOW
Supervised Machine Learning
Classification Problem
IN THE NEXT PRESENTATION SESSION
THANK YOU
