3. IPULLRANK.COM @ IPULLRANK
Agenda
Machine Learning
Doomsday
ML vs DL vs AI?
Marketing Use Cases
Models & Use Cases
Tools For Marketers
Wrapping Up
Real World Examples
17. Apparently Machine Learning Can Write Copy for You
Natural Language Generation, a subfield of artificial intelligence, has made the concept of content spinning far more viable; it has already been used for sports recaps and financial reports.
19. Improved Google Translate from Scratch in 2 Months
A team rebuilt the broken Google Translate using Machine Learning and within 2 months it was already as good as the version that had taken years to build.
20. AI Is Gonna Steal Your Job?
One of the most common fears about artificial intelligence in middle America is that robots will replace humans in their jobs.
22. The real fear of machine learning and artificial intelligence should be its ability to reflect and amplify our biases and the lack of diversity of the people creating it.
30. AI Comprises Many Disciplines
Deep Learning is a subset of Machine Learning, which is itself a subset of Artificial Intelligence.
AI has many branches, of which machine learning is a core branch that we can execute today.
31. Artificial Intelligence as it is represented in sci-fi is “general” artificial intelligence. What we have achieved so far is “narrow” artificial intelligence.
32. Types of Artificial Intelligence Explained Using “The Lawnmower Man”
Narrow Artificial Intelligence: Machines that can do a specific task or series of tasks exceedingly well and very efficiently.
General Artificial Intelligence: A machine that is as smart as a human in that it can take in new situations and make decisions.
Artificial Superintelligence: A machine that is potentially orders of magnitude smarter than a human in all categories simultaneously.
33. Experts Disagree on When General Intelligence Will Happen
The primary issue keeping this from happening is computing power.
37. Ok. So, What Is Machine Learning?
“Machine learning is a type of artificial intelligence that provides computers with the ability to learn without being explicitly programmed.”
39. Supervised Learning
The machine looks for patterns that match the labeled data you provide, then classifies new data based on those patterns.
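Supervised learning can be sketched in a few lines with scikit-learn (a library assumed here, not one named in the deck); the feature names and labels are made-up marketing data for illustration:

```python
# Minimal supervised learning sketch: the classifier learns patterns
# from labeled examples, then classifies new, unseen data.
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical labeled data: [weekly_visits, pages_per_visit] -> label
X_train = [[1, 2], [2, 1], [8, 6], [9, 7], [2, 2], [10, 5]]
y_train = ["browser", "browser", "buyer", "buyer", "browser", "buyer"]

clf = KNeighborsClassifier(n_neighbors=3)
clf.fit(X_train, y_train)               # learn from the labeled data

# Classify new visitors the model has never seen
preds = clf.predict([[9, 6], [1, 1]])
```

A nearest-neighbors model is used only because it makes "match the labeled data" literal: each new point is labeled like its closest labeled examples.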
41. Reinforcement Learning
With reinforcement learning, the model learns by trial and error: it takes actions, receives rewards or penalties for them, and updates itself so that its performance continually improves.
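The trial-and-error loop can be sketched with an epsilon-greedy bandit in plain Python. The two "ads" and their click-through rates are invented for illustration; the agent only sees rewards, never the true rates:

```python
import random

random.seed(42)

# Hypothetical scenario: two ad variants with unknown click-through rates.
# The agent learns which ad to show by trial and error (epsilon-greedy bandit).
true_ctr = [0.05, 0.15]          # hidden from the agent
estimates = [0.0, 0.0]           # the agent's learned value of each ad
counts = [0, 0]

for step in range(5000):
    if random.random() < 0.1:                          # explore 10% of the time
        ad = random.randrange(2)
    else:                                              # otherwise exploit the best estimate
        ad = 0 if estimates[0] >= estimates[1] else 1
    reward = 1 if random.random() < true_ctr[ad] else 0    # a click is the reward
    counts[ad] += 1
    estimates[ad] += (reward - estimates[ad]) / counts[ad] # running average update

best_ad = estimates.index(max(estimates))
```

After a few thousand trials the agent's estimates converge toward the true click-through rates and it settles on the better ad.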
42. And Deep Learning?
“Deep Learning is a subfield of machine learning concerned with algorithms inspired by the structure and function of the brain called artificial neural networks.”
44. Machine Learning vs. Statistics
Machine learning learns from data without relying on rules-based programming; statistical modeling identifies relationships in the form of mathematical equations.
45. All Values vs. Linear Representation
Machine learning examines all potential values based on probability, whereas statistics looks for a linear function to describe the trend.
46. Machine Learning is the “Growth Hacking” of the Statistics World
In some ways machine learning and statistics are so similar that many statisticians feel machine learning is simply a rebranding of what they already do, much like “growth hacking” is a rebranding of marketing.
47. The Machine Learning Process
Get & prepare your data: You identify and clean your dataset in preparation for solving the machine learning problem.
Choose your model & train your classifier: You choose the algorithm or model that you believe will yield the best results, then run it in order to train your classifier.
Score and evaluate: You score the accuracy and precision of the classifier and test it against other algorithms to see what performs best.
Predict or identify outcomes: Once you are happy with the results, you use the classifier moving forward to draw conclusions about new data.
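Those four steps can be sketched end to end with scikit-learn (an assumed tool, not one named on the slide), using its built-in Iris dataset as stand-in data:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# 1. Get & prepare your data
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# 2. Choose your model & train your classifier
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# 3. Score and evaluate
accuracy = accuracy_score(y_test, clf.predict(X_test))

# 4. Predict outcomes for new data
new_flower = [[5.1, 3.5, 1.4, 0.2]]    # sepal/petal measurements
prediction = clf.predict(new_flower)
```

The held-out test set in step 3 is what lets you compare this model fairly against other algorithms before committing to one.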
48. Car Rental Example
This is an example of how you could predict the demand for cars at a car rental company. It follows the same framework.
49. Training Chatbots
Training chatbots is similar to training ML classifiers: you take a knowledge base, run it through NLP, and then tune it with regard to conversations.
65. Shit, Google Doesn’t Even Know How RankBrain Works
Yet, the world’s greatest search engine has deployed it to production.
66. Remember this?
A team rebuilt the broken Google Translate using Machine Learning and within 2 months it was already as good as the version that had taken years to build.
67. Look Closer
A team rebuilt the broken Google Translate using Machine Learning and within 2 months it was already as good as the version that had taken years to build.
84. The Methodology is the Machine Learning Part
We took all available domain-level link features for the Searchmetrics losers and winners and figured out (via 5-fold cross validation, random forest, and lasso) which ones correlated best with the results, then used that model to re-rank the Inc. 500. (I probably shoulda asked Marcus for more data, but whatever.)
85. Methodology behind the Vector Report
We broke it into two types of machine learning questions: classification, and logistic regression to predict the probability of continued visibility in Organic Search.
Goal: identify SEO winners and losers and predict a site’s performance in SEO.
Classification: Random Forest, Gradient Boosting Machine, Support Vector Machine
Logistic Regression: Regularization
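The model families named on this slide can be compared side by side with scikit-learn. The Vector Report data itself is not available here, so a synthetic stand-in for "winner/loser" link features is generated for illustration:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for domain-level link features, labeled winner/loser
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

models = {
    "Random Forest": RandomForestClassifier(random_state=0),
    "Gradient Boosting Machine": GradientBoostingClassifier(random_state=0),
    "Support Vector Machine": SVC(),
    "Logistic Regression (L2 regularized)": LogisticRegression(max_iter=1000),
}

# Mean 5-fold cross-validation accuracy for each candidate model
scores = {name: cross_val_score(m, X, y, cv=5).mean() for name, m in models.items()}
```

Scoring every candidate with the same cross-validation splits is what makes the comparison fair; the best scorer becomes the classifier you carry forward.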
98. K-Fold Cross Validation is a Guess and Check
Try out a model, then validate it with k-fold cross validation: split the data into k folds, train on k−1 of them, test on the held-out fold, and rotate until every fold has served as the test set.
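The "guess and check" loop can be written out explicitly with scikit-learn's KFold (Iris again as stand-in data):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
kf = KFold(n_splits=5, shuffle=True, random_state=0)

fold_scores = []
for train_idx, test_idx in kf.split(X):
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X[train_idx], y[train_idx])                      # train on 4 folds
    fold_scores.append(clf.score(X[test_idx], y[test_idx]))  # check on the 5th

# The average over all 5 rotations is the cross-validated accuracy
mean_accuracy = sum(fold_scores) / len(fold_scores)
```

Because every row is tested exactly once, the averaged score is a much less optimistic estimate than accuracy on the training data.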
99. How to Choose a Machine Learning Model
https://docs.microsoft.com/en-us/azure/machine-learning/machine-learning-algorithm-choice
106. yHat Science Ops
yHat allows you to deploy machine learning models as REST APIs that can then be integrated with your site like any other API.
107. Beeswax Bidder-as-a-Service
Beeswax allows you to set up custom models to run your Display RTB campaigns.
108. Those are tools that allow marketers to take control with a data scientist.
109. mTurk - Labeling Data for Supervised Learning
112. MonkeyLearn & Orange
We will primarily talk about MonkeyLearn and Orange as two tools marketers can use to do machine learning right now.
114. Exploratory Data Analysis
Exploratory Data Analysis helps identify general patterns in the data and serves as an initial exploration of correlations.
This SEO question can be translated into two types of machine learning questions: (1) classification, which can be used to identify SEO winners and losers, and (2) logistic regression, which can predict the probability of being an SEO winner.
A snapshot shows the scatterplot of the Iris data set, with the coloring matching the class attribute.
For continuous attributes, the attribute values are displayed as a function graph.
Classification Tree is a simple classification algorithm that splits the data into nodes by class purity.
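A classification tree can be fit in a few lines with scikit-learn (the Orange widget wraps the same idea); the tree chooses splits that make each node as class-pure as possible:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# A shallow tree keeps the node structure human-readable
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X, y)

depth = tree.get_depth()
train_accuracy = tree.score(X, y)
```

Capping the depth is a simple guard against the tree memorizing the training data instead of learning general splits.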
We removed stopwords and punctuation to find frequencies for meaningful words only.
Word Cloud displays the tokens in the corpus, their size denoting the frequency of the word in the corpus. Words are listed by their frequency (weight) in the widget. The widget outputs documents containing selected tokens from the word cloud.
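The stopword/punctuation cleanup and frequency counting behind a word cloud can be sketched in plain Python; the stopword list and sample sentence here are illustrative, not the widget's actual defaults:

```python
import string
from collections import Counter

# Illustrative stopword list (real tools ship much longer ones)
stopwords = {"the", "a", "of", "and", "to", "in", "is", "for"}

text = "The rankings of the winners, and the rankings of the losers."

# Strip punctuation, lowercase, drop stopwords
tokens = [w.strip(string.punctuation).lower() for w in text.split()]
meaningful = [w for w in tokens if w and w not in stopwords]

# Frequencies of meaningful words only -> word-cloud weights
frequencies = Counter(meaningful)
```

Only the surviving tokens get weighted, which is why "rankings" would render larger than "winners" or "losers" in the resulting cloud.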
The Similarity Hashing widget computes similarity hashes for the given corpus, allowing the user to find duplicates, plagiarism, or textual borrowing in the corpus. The output is a table of 64 binary features (predefined in the SimHash widget), which correspond to a 64-bit hash. We then computed similarities in the text by sending Similarity Hashing to Distances, selected Euclidean row distances, and sent the output to Hierarchical Clustering. We can see that we have some similar documents, so we can select and inspect them in Corpus Viewer.
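A minimal SimHash can be written in plain Python to show the idea (the Orange widget's exact hashing scheme is not reproduced here; MD5 is an arbitrary stand-in hash): similar documents produce hashes that differ in only a few bits, so a small Hamming distance flags near-duplicates.

```python
import hashlib

def simhash(text, bits=64):
    """64-bit SimHash: each token votes +1/-1 per bit position."""
    weights = [0] * bits
    for token in text.lower().split():
        h = int(hashlib.md5(token.encode()).hexdigest(), 16)
        for i in range(bits):
            weights[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(bits) if weights[i] > 0)

def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")

doc1 = "machine learning helps marketers predict organic search visibility"
doc2 = "machine learning helps marketers predict organic search rankings"
doc3 = "completely unrelated sentence about cooking pasta at home"

d_similar = hamming(simhash(doc1), simhash(doc2))
d_different = hamming(simhash(doc1), simhash(doc3))
```

These pairwise bit distances are exactly the kind of distance matrix you would then feed into hierarchical clustering to group near-duplicate documents.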
Ridge regression generally yields better predictions than the OLS solution through a better compromise between bias and variance. Its main drawback is that all predictors are kept in the model, so it is not very interesting if you seek a parsimonious model or want to apply some kind of feature selection.
To achieve sparsity, the lasso is more appropriate, but it will not necessarily yield good results in the presence of high collinearity (it has been observed that if predictors are highly correlated, the prediction performance of the lasso is dominated by ridge regression). A second problem with the L1 penalty is that the lasso solution is not uniquely determined when the number of variables is greater than the number of subjects (this is not the case for ridge regression). The last drawback of the lasso is that it tends to select only one variable among a group of predictors with high pairwise correlations. In such cases there are alternatives like the group lasso (which achieves shrinkage on blocks of covariates, so that some blocks of regression coefficients are exactly zero) or the fused lasso.
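The ridge-keeps-everything vs. lasso-zeroes-things-out contrast is easy to demonstrate with scikit-learn on synthetic data where only two of ten features actually matter (the data and penalty strengths are illustrative choices):

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
# Only the first two features truly drive the response
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=100)

ridge = Ridge(alpha=1.0).fit(X, y)   # L2 penalty: shrinks, never zeroes
lasso = Lasso(alpha=0.5).fit(X, y)   # L1 penalty: shrinks some coefs to exactly 0

ridge_nonzero = int(np.sum(ridge.coef_ != 0))
lasso_nonzero = int(np.sum(lasso.coef_ != 0))
```

Ridge keeps all ten coefficients nonzero, while the lasso discards the irrelevant ones, which is exactly the parsimony/feature-selection trade-off described above.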
Naïve Bayes misclassified 14 documents from the adult corpus as children’s.
We randomly split the data into two subsets. The larger subset, containing 80% of the data instances, is sent to SVM and logistic regression so they can produce the corresponding classifiers. The classifiers are then sent to Predictions, along with the remaining 20% of the data. Predictions shows how these examples are classified.
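That 80/20 workflow maps directly onto scikit-learn (Iris standing in for the corpus used in the Orange demo): train SVM and logistic regression on 80% of the rows, then classify the held-out 20%.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# 80% for training, 20% held out for the "Predictions" step
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

svm = SVC().fit(X_train, y_train)
logreg = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Classify the held-out 20% with both trained classifiers
svm_preds = svm.predict(X_test)
logreg_preds = logreg.predict(X_test)
```

Because both classifiers see the exact same split, their predictions on the held-out rows can be compared head to head.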