MLOPINION APP: SENTIMENT ANALYSIS
MADE SIMPLE
Ajmal Dookhan
9th March, 2020
Snapchat stock loses $1.3 billion after Kylie Jenner tweet.
Some players were able to benefit from the social media turmoil, The Big Short-style, and made $163 million.
All of that fuss was based on one tweet with negative sentiment.
What would you do if you were a major investor in Snapchat?
That’s the power of sentiment analysis of online mentions.
SENTIMENT ANALYSIS: WHAT IS IT?
“The practice of applying Natural Language
Processing and Text Analysis techniques to identify
and extract subjective information from a piece of
text”
NATURAL LANGUAGE PROCESSING
➤ The process of analyzing unstructured text to extract relevant
information
➤ It determines if an expression is positive, negative or neutral
and to what degree
➤ It is an emerging field that attempts to analyze and measure
human emotions and convert them into hard facts.
MLOPINION APP
➤ An ASP.NET Core web application
➤ Based on Microsoft's machine learning platform (ML.NET)
MLOPINION APP - CODE WORKFLOW
• Collect and load training data into an IDataView object
• Specify a pipeline of operations to extract features and
apply a machine learning algorithm
• Train a model by calling Fit() on the pipeline
• Evaluate the model and iterate to improve
• Save the model into binary format, for use in an
application
• Load the model back into an ITransformer object
• Make predictions by
calling CreatePredictionEngine.Predict()
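The steps above can be walked through end to end in a small, language-neutral sketch. This is not ML.NET code: it is plain Python using only the standard library, with a naive Bayes bag-of-words classifier swapped in as the learning algorithm, and all function names and training rows are made up for illustration. Each step maps loosely onto the workflow (load data, extract features, fit, evaluate, save to binary, load back, predict).

```python
import math
import pickle
from collections import Counter

# Hypothetical, made-up training rows: (text, label), 1 = positive, 0 = negative.
TRAIN = [
    ("great food and friendly staff", 1),
    ("loved the service", 1),
    ("really great experience", 1),
    ("terrible food and rude staff", 0),
    ("hated the slow service", 0),
    ("really terrible experience", 0),
]


def featurize(text):
    """Feature extraction step: a lower-cased bag of words."""
    return text.lower().split()


def fit(rows):
    """Training step: count words per class (multinomial naive Bayes)."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter()
    for text, label in rows:
        class_counts[label] += 1
        word_counts[label].update(featurize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return {"words": word_counts, "classes": class_counts, "vocab": vocab}


def predict(model, text):
    """Prediction step: return the label with the higher log-probability."""
    total = sum(model["classes"].values())
    vocab_size = len(model["vocab"])
    scores = {}
    for label in (0, 1):
        counts = model["words"][label]
        denom = sum(counts.values()) + vocab_size  # Laplace smoothing
        score = math.log(model["classes"][label] / total)
        for word in featurize(text):
            if word in model["vocab"]:  # unseen words are ignored
                score += math.log((counts[word] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


model = fit(TRAIN)  # train

# Evaluate. Scoring on the training rows keeps the sketch short;
# a real evaluation would use held-out test data.
train_accuracy = sum(predict(model, t) == y for t, y in TRAIN) / len(TRAIN)

blob = pickle.dumps(model)     # save the model in a binary format
restored = pickle.loads(blob)  # load the model back

print(train_accuracy)
print(predict(restored, "great service"))
print(predict(restored, "terrible staff"))
```

In ML.NET the same roles are played by the IDataView, the pipeline, Fit(), and the prediction engine named in the list above; the sketch only shows how the stages hand data to one another.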
WHAT IS ML.NET?
➤ ML.NET is a free, open-source, cross-platform machine
learning framework for the .NET developer platform.
➤ ML.NET allows you to train, build, and ship custom machine
learning models using C# or F# for a variety of ML scenarios.
➤ ML.NET includes features like automated machine learning
(AutoML) and tools like ML.NET CLI and ML.NET Model
Builder, which make integrating machine learning into your
applications even easier.
MLOPINION APP - HOW I BUILT IT
➤ I created the web app
➤ I added the ML.NET Model Builder extension
MLOPINION APP - ADDING THE DATA SOURCE
https://raw.githubusercontent.com/dotnet/machinelearning/master/test/data/wikipedia-detox-250-line-data.tsv
The file is a 250-line sample of an annotated dataset of 1 million crowd-sourced
annotations that cover 100k talk page diffs (with 10 judgements per diff) for
personal attacks, aggression, and toxicity.
MLOPINION APP - ADDING THE DATA
MLOPINION APP - TRAINING
MLOPINION APP - EVALUATING
MLOPINION APP - EVALUATING THE MODEL
MLOPINION APP - GENERATING THE CODE
MLOPINION APP - ADDING MICROSOFT.EXTENSIONS.ML TO THE WEB APP
MLOPINION APP - ADDING REFERENCES TO THE MODEL IN THE WEB APP
MLOPINION APP - BUILDING AND RUNNING THE APP
NEXT: DOTNET MLNET CLI
• Prepare your data for the selected machine learning task
• Run the 'mlnet auto-train' command from the CLI
• Review the quality metric results
• Understand the generated C# code to use the model in your application
• Explore the generated C# code that was used to train the model
The ML.NET CLI automates model generation for .NET developers.
The ML.NET CLI simplifies the Machine Learning process using automated machine learning (AutoML).
Source: Microsoft
ML.NET
Currently, the ML tasks supported by the ML.NET CLI are:
• Binary classification
• Multiclass classification
• Regression
ML.NET CLI WITH AUTO-TRAIN
mlnet auto-train --task binary-classification --dataset "yelp_labelled.txt" --label-column-index 1 --has-header false --max-exploration-time 30 --name mLOpinionApp
Increasing the exploration time from 30 to 60 seconds gave higher accuracy.
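To make the two data-related flags in the command concrete, here is a minimal Python sketch of how a file with that shape could be read. It assumes the dataset is tab-separated with the sentence in column 0 and a 0/1 label in column 1; the sample rows and the `load_rows` helper are made up for illustration and are not part of the ML.NET CLI.

```python
import csv
import io

# Made-up sample rows in the assumed layout: sentence, TAB, label; no header row.
SAMPLE = "The soup was lovely.\t1\nThe waiter ignored us.\t0\n"

LABEL_COLUMN_INDEX = 1  # mirrors --label-column-index 1
HAS_HEADER = False      # mirrors --has-header false


def load_rows(text, label_index, has_header):
    """Split tab-separated lines into (features, label) pairs."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = list(reader)
    if has_header:
        rows = rows[1:]  # skip the column names
    pairs = []
    for row in rows:
        label = int(row[label_index])
        features = [value for i, value in enumerate(row) if i != label_index]
        pairs.append((features, label))
    return pairs


for features, label in load_rows(SAMPLE, LABEL_COLUMN_INDEX, HAS_HEADER):
    print(label, features)
```

With a zero-based label index of 1, every other column (here only the sentence) is treated as input to the model.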
EVALUATION METRICS

Metric: Accuracy
Description: Accuracy is the proportion of correct predictions on a
test data set. It is the ratio of the number of correct
predictions to the total number of input samples. It
works well if there is a similar number of samples
belonging to each class.
Look for: The closer to 1.00, the better. But exactly 1.00 indicates
an issue (commonly: label/target leakage, over-fitting, or
testing with training data). When the test data is
unbalanced (where most of the instances belong to one
of the classes), the dataset is small, or scores approach
0.00 or 1.00, then accuracy doesn’t really capture the
effectiveness of a classifier and you need to check
additional metrics.

Metric: AUC
Description: aucROC, or area under the curve, measures the area
under the curve created by sweeping the true positive
rate vs. the false positive rate.
Look for: The closer to 1.00, the better. It should be greater than
0.50 for a model to be acceptable. A model with an AUC of
0.50 or less is worthless.

Metric: AUCPR
Description: aucPR, or area under the curve of a Precision-Recall
curve, is a useful measure of prediction success when the
classes are imbalanced (highly skewed datasets).
Look for: The closer to 1.00, the better. High scores close to 1.00
show that the classifier is returning accurate results
(high precision), as well as returning a majority of all
positive results (high recall).

Metric: F1-score
Description: The F1 score is also known as the balanced F-score or F-measure.
It's the harmonic mean of precision and recall. The F1
score is helpful when you want to seek a balance
between precision and recall.
Look for: The closer to 1.00, the better. An F1 score reaches its
best value at 1.00 and worst score at 0.00. It tells you
how precise your classifier is.

Source: Microsoft
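A small Python sketch can show why accuracy alone misleads on unbalanced data, as the table warns. It is not ML.NET code; the labels, predictions, scores, and function names below are made up for illustration. Accuracy and F1 are computed from confusion-matrix counts, and AUC is computed through its rank interpretation: the chance that a randomly chosen positive is scored above a randomly chosen negative, with ties counted as half. AUCPR is left out to keep the sketch short.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for 0/1 labels and 0/1 predictions."""
    tp = fp = tn = fn = 0
    for truth, guess in zip(y_true, y_pred):
        if truth == 1 and guess == 1:
            tp += 1
        elif truth == 0 and guess == 1:
            fp += 1
        elif truth == 0 and guess == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def accuracy(y_true, y_pred):
    """Correct predictions divided by all predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN)."""
    tp, fp, _, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up unbalanced test set: nine negatives, one positive,
# and a classifier that always answers "negative".
y_true = [0] * 9 + [1]
y_pred = [0] * 10
print(accuracy(y_true, y_pred))  # 0.9, looks good
print(f1_score(y_true, y_pred))  # 0.0, the positive class is never found

# Made-up scores for a four-sample test set.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The always-negative classifier reaches 90% accuracy while its F1 score is zero, which is the situation in which the table recommends checking additional metrics.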
SUMMING IT ALL UP
ML.NET gives you the ability to add machine learning to .NET
applications, in either online or offline scenarios.
The app focuses on making automatic predictions using the data
available rather than needing to be explicitly programmed.
Developers can train a machine learning model or reuse an existing
model from a third party and run it offline in any environment.
Developers do not need a background in data science to use
the framework but will still need the help of data scientists in
building the model and in analysing the results.

Sentiment Analysis for .NET Developers using ML.NET
