Understanding Basics of Machine Learning
Pranav Ainavolu
Microsoft MVP | Senior Developer at Realpage
@a_pranav | http://pranavon.net/
Agenda
1) data science
2) prediction
3) process
4) models
5) AzureML
data science
• key word: “science”
• try stuff
• it (might not | won’t) work the first time
• this might work… → question
• wikipedia time → research
• I have an idea → hypothesis
• try it out → experiment
• did this even work? → analysis
• time for a better idea → conclusion
machine learning
• finding (and exploiting) patterns in data
• replacing “human writing code” with
“human supplying data”
• system figures out what the person wants
based on examples
• need to abstract from “training” examples
to “test” examples
• most central issue in ML: generalization
machine learning
• split into two (ish) areas
• supervised learning
  • predicting the future
  • learn from past examples to predict the future
• unsupervised learning
  • understanding the past
  • making sense of data
  • learning the structure of data
  • compressing data for consumption
neat applications
neat applications
• spam catchers
• ocr (optical character recognition)
• natural language processing
• machine translation
• biology
• medicine
• robotics (autonomous systems)
• etc…
prediction
making decisions
making decisions
• what kinds of decisions are we making?
  • binary classification
    • yes/no, 1/0, male/female
  • multi-class classification
    • {A, B, C, D, F} (Grade), {1, 2, 3, 4} (Class), {teacher, student, secretary}
  • regression
    • a number between 0 and 100, a real value (sketched below)
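As a minimal sketch, the three kinds of labels look like this in Python (the values are made up for illustration):

```python
# illustrative label vectors for each kind of decision (made-up values)
y_binary = [1, 0, 0, 1]                  # yes/no, 1/0
y_multiclass = ["A", "B", "F", "C"]      # one grade out of {A, B, C, D, F}
y_regression = [12.5, 40.0, 87.3, 99.9]  # a real value, e.g. between 0 and 100
```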
process
data → clean → transform → maths → model → predict
data
Class   | Outlook  | Temp | Windy
--------|----------|------|------
Play    | Sunny    | Low  | Yes
No Play | Sunny    | High | Yes
No Play | Sunny    | High | No
Play    | Overcast | Low  | Yes
Play    | Overcast | High | No
Play    | Overcast | Low  | No
No Play | Rainy    | Low  | Yes
Play    | Rainy    | Low  | No
?       | Sunny    | Low  | No
label (y): play / no play
features: outlook, temp, windy
values (x): [Sunny, Low, Yes]
A labeled dataset is a collection of (x, y) pairs.
Given a new x, how do we predict y?
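As a sketch, the table above as a labeled dataset in Python (the model that answers the question comes later in the process):

```python
# each x is [outlook, temp, windy]; each y is the label
X = [
    ["Sunny",    "Low",  "Yes"],
    ["Sunny",    "High", "Yes"],
    ["Sunny",    "High", "No"],
    ["Overcast", "Low",  "Yes"],
    ["Overcast", "High", "No"],
    ["Overcast", "Low",  "No"],
    ["Rainy",    "Low",  "Yes"],
    ["Rainy",    "Low",  "No"],
]
y = ["Play", "No Play", "No Play", "Play", "Play", "Play", "No Play", "Play"]

x_new = ["Sunny", "Low", "No"]  # the "?" row: given this x, predict y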
clean / transform / maths
Class   | Outlook      | Temp   | Windy
--------|--------------|--------|-------
Play    | Sunny        | Lowest | Yes
No Play | ?            | High   | Yes
No Play | Sunny        | High   | KindOf
Play    | Overcast     | ?      | Yes
Play    | Turtle Cloud | High   | No
Play    | Overcast     | ?      | No
No Play | Rainy        | Low    | 28%
Play    | Rainy        | Low    | No
?       | Sunny        | Low    | No
need to clean up data
need to convert to model-able form (linear algebra); a cleanup sketch follows below
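A sketch of the cleanup in pandas; which values count as valid, and how to impute the rest, are judgment calls:

```python
import pandas as pd

# the messy table from the slide (values like "Turtle Cloud" and "28%" are noise)
df = pd.DataFrame(
    [["Play", "Sunny", "Lowest", "Yes"],
     ["No Play", None, "High", "Yes"],
     ["No Play", "Sunny", "High", "KindOf"],
     ["Play", "Overcast", None, "Yes"],
     ["Play", "Turtle Cloud", "High", "No"],
     ["Play", "Overcast", None, "No"],
     ["No Play", "Rainy", "Low", "28%"],
     ["Play", "Rainy", "Low", "No"]],
    columns=["Class", "Outlook", "Temp", "Windy"])

# clean: keep only known category values, everything else becomes NaN
df["Outlook"] = df["Outlook"].where(df["Outlook"].isin(["Sunny", "Overcast", "Rainy"]))
df["Temp"] = df["Temp"].where(df["Temp"].isin(["Low", "High"]))
df["Windy"] = df["Windy"].where(df["Windy"].isin(["Yes", "No"]))

# impute missing values with the most common value in each column
df = df.fillna(df.mode().iloc[0])

# transform: one-hot encode so the model sees numbers (linear algebra)
X = pd.get_dummies(df[["Outlook", "Temp", "Windy"]])
y = (df["Class"] == "Play").astype(int)
```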
yak shaving
Any apparently useless activity which, by allowing you to overcome intermediate difficulties, allows you to solve a larger problem.
“I was doing a bit of yak shaving this morning, and it looks like it might have paid off.”
http://en.wiktionary.org/wiki/yak_shaving
clean / transform / maths
Class   | Outlook  | Temp | Windy
--------|----------|------|------
Play    | Sunny    | Low  | Yes
No Play | Sunny    | High | Yes
No Play | Sunny    | High | No
Play    | Overcast | Low  | Yes
Play    | Overcast | High | No
Play    | Overcast | Low  | No
No Play | Rainy    | Low  | Yes
Play    | Rainy    | Low  | No
?       | Sunny    | Low  | No
need to clean up data
need to convert to model-able form (linear algebra)
model
Class   | Outlook  | Temp | Windy
--------|----------|------|------
Play    | Sunny    | Low  | Yes
No Play | Sunny    | High | Yes
No Play | Sunny    | High | No
Play    | Overcast | Low  | Yes
Play    | Overcast | High | No
Play    | Overcast | Low  | No
No Play | Rainy    | Low  | Yes
Play    | Rainy    | Low  | No
?       | Sunny    | Low  | No
predict
Class | Outlook | Temp | Windy
?     | Sunny   | Low  | No
→ PLAY!!!
models
how do we build them?
linear classifiers
• in order to classify things properly we need:
  • a way to mathematically represent examples
  • a way to separate classes (yes/no)
• “decision boundary”
• excel example
• graph example
linear classifiers
• dot product of vectors
  • [ 3, 4 ] ● [ 1, 2 ] = (3 × 1) + (4 × 2) = 11
  • a ● b = | a | × | b | cos θ
  • when does this equal 0?
• why would this be useful?
  • the decision boundary can be represented using a single vector
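A quick NumPy check of both claims (the boundary vector w below is made up):

```python
import numpy as np

a = np.array([3, 4])
b = np.array([1, 2])
print(np.dot(a, b))  # 11 = (3 * 1) + (4 * 2)

# the dot product is 0 exactly when the vectors are perpendicular (cos θ = 0)
w = np.array([1, -1])                # a decision boundary vector
print(np.dot(w, np.array([2, 2])))   # 0: the point lies on the boundary
print(np.dot(w, np.array([3, 1])))   # > 0: one side (class 1)
print(np.dot(w, np.array([1, 3])))   # < 0: other side (class 2)
```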
perceptron
…and other linear models
linear classifiers
• Frank Rosenblatt, Cornell 1957
  • let’s make a line (by using a single vector)
  • take the dot product between the line and the new point
  • > 0 → belongs to class 1
  • < 0 → belongs to class 2
  • == 0 → we don’t know (flip a coin)
  • for each example, if we make a mistake, move the line (sketched below)
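A minimal sketch of that algorithm in NumPy, assuming labels are encoded as ±1 (the toy data is made up):

```python
import numpy as np

def perceptron(X, y, epochs=20):
    """Rosenblatt's update: if we make a mistake, move the line toward it.
    X: (n, d) array of points, y: labels in {-1, +1}."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for x_i, y_i in zip(X, y):
            if y_i * (np.dot(w, x_i) + b) <= 0:  # mistake (or on the line)
                w += y_i * x_i                   # move the line
                b += y_i
    return w, b

# toy linearly separable data
X = np.array([[2.0, 1.0], [3.0, 2.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w, b = perceptron(X, y)
print(np.sign(X @ w + b))  # should match y
```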
perceptron
point demo
perceptron
what if….
kernel methods
models
kernel methods
adding all squares and pairwise products of $n$ features gives $2n + \binom{n}{2} = 2n + \frac{n(n-1)}{2}$ features…. (e.g., $n = 100$ raw features already becomes 5,150)
perceptron
• minimize mistakes by moving w

$$\operatorname*{arg\,min}_{(\boldsymbol{w},\, b)} \; \frac{1}{2}\lVert\boldsymbol{w}\rVert^2 \quad \text{subject to:} \quad y_i\,(\boldsymbol{w}\cdot\boldsymbol{x}_i - b) \ge 1$$
perceptron
• eventually this becomes an optimization problem

$$L(\alpha) = \sum_{i=1}^{n} \alpha_i \;-\; \frac{1}{2}\sum_{i,j} \alpha_i \alpha_j y_i y_j\, \boldsymbol{x}_i^{T}\boldsymbol{x}_j$$

subject to:

$$\alpha_i \ge 0, \qquad \sum_{i=1}^{n} \alpha_i y_i = 0$$
perceptron
• eventually this becomes an optimization problem

$$L(\alpha) = \sum_{i=1}^{n} \alpha_i \;-\; \frac{1}{2}\sum_{i,j} \alpha_i \alpha_j y_i y_j\, k(\boldsymbol{x}_i, \boldsymbol{x}_j)$$

subject to:

$$\alpha_i \ge 0, \qquad \sum_{i=1}^{n} \alpha_i y_i = 0$$

(the dot product $\boldsymbol{x}_i^{T}\boldsymbol{x}_j$ has been replaced by a kernel $k$)
perceptron (reminder)
• Frank Rosenblatt, Cornell 1957
  • let’s make a line (by using a single vector)
  • take the dot product between the line and the new point
  • > 0 → belongs to class 1
  • < 0 → belongs to class 2
  • == 0 → we don’t know (flip a coin)
  • for each example, if we make a mistake, move the line
kernel (one weird trick….)
• store the dot products in a table

$$K = \begin{bmatrix} \boldsymbol{x}_0^{T}\boldsymbol{x}_0 & \cdots & \boldsymbol{x}_0^{T}\boldsymbol{x}_j \\ \vdots & \ddots & \vdots \\ \boldsymbol{x}_i^{T}\boldsymbol{x}_0 & \cdots & \boldsymbol{x}_i^{T}\boldsymbol{x}_j \end{bmatrix}$$

• call it the “kernel matrix” (and the substitution the “kernel trick”)
• project into any space and still learn a linear model
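A sketch of that table in NumPy (points made up). Since the dual problem above only ever reads these pairwise products, any kernel function k can fill the table instead and the learner stays the same:

```python
import numpy as np

X = np.array([[2.0, 1.0], [3.0, 2.0], [-1.0, -2.0], [-2.0, -1.0]])

# kernel matrix: every pairwise dot product, stored once in an (n, n) table
K = X @ X.T  # K[i, j] == X[i] . X[j]
print(K.shape)
```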
support vector machines
• this method is the basis for SVMs
• returns a small set of support vectors (≪ n) used to make the decision
• essentially changes the space to make the data separable
kernels
• polynomial kernel

$$K(\boldsymbol{x}, \boldsymbol{y}) = (\boldsymbol{x}^{T}\boldsymbol{y} + c)^{d}$$

• RBF kernel

$$K(\boldsymbol{x}, \boldsymbol{y}) = \exp\!\left(-\frac{\lVert\boldsymbol{x} - \boldsymbol{y}\rVert_2^2}{2\sigma^2}\right)$$
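Both kernels as short NumPy functions (the parameter defaults c, d, sigma are illustrative):

```python
import numpy as np

def polynomial_kernel(x, y, c=1.0, d=2):
    # K(x, y) = (x . y + c)^d
    return (np.dot(x, y) + c) ** d

def rbf_kernel(x, y, sigma=1.0):
    # K(x, y) = exp(-||x - y||^2 / (2 sigma^2))
    return np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))
```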
what if….
neural networks
models
neural networks
neural networks
[diagram: a small network; hidden units h1, h2, h3 and a bias unit B1 (each hidden unit a linear method plus a nonlinearity) combine into a single “Play?” output]
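A sketch of the forward pass that diagram describes, with made-up weights (sigmoid is one common choice of nonlinearity):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# three hidden units h1..h3, each a small linear model plus a nonlinearity,
# combined by one more linear model into "Play?" (weights are illustrative)
W1 = np.random.randn(3, 4)  # hidden layer: 3 units over 4 input features
b1 = np.zeros(3)            # the bias unit B1
w2 = np.random.randn(3)     # output layer weights
b2 = 0.0

x = np.array([1.0, 0.0, 1.0, 0.0])       # one encoded example
h = sigmoid(W1 @ x + b1)                 # h1, h2, h3
play_probability = sigmoid(w2 @ h + b2)  # Play?
print(play_probability)
```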
decision trees
models
decision trees
Class   | Outlook  | Temp | Windy
--------|----------|------|------
Play    | Sunny    | Low  | Yes
No Play | Sunny    | High | Yes
No Play | Sunny    | High | No
Play    | Overcast | Low  | Yes
Play    | Overcast | High | No
Play    | Overcast | Low  | No
No Play | Rainy    | Low  | Yes
Play    | Rainy    | Low  | No
?       | Sunny    | Low  | No
decision trees
• how should the computer split?
  • information gain (with entropy)
  • entropy measures how disorganized your answer is.
  • information gain says: if I separate the answer by the values in a particular column, does the answer become *more* organized?
decision trees
• calculating information gain:
  • $H(y)$ – how messy is the answer?
  • $H(y \mid a)$ – how messy is the answer if we know $a$?

$$IG(y, a) = H(y) - H(y \mid a), \qquad a \in \mathrm{Attr}(x)$$
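A sketch of both quantities in plain Python, run on the weather dataset from earlier:

```python
import math
from collections import Counter

def entropy(labels):
    """H(y): how messy the answer is."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, attribute_values):
    """IG(y, a) = H(y) - H(y | a): how much messiness knowing a removes."""
    n = len(labels)
    groups = {}
    for label, value in zip(labels, attribute_values):
        groups.setdefault(value, []).append(label)
    h_given_a = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - h_given_a

y = ["Play", "No Play", "No Play", "Play", "Play", "Play", "No Play", "Play"]
outlook = ["Sunny", "Sunny", "Sunny", "Overcast", "Overcast", "Overcast", "Rainy", "Rainy"]
print(information_gain(y, outlook))  # higher gain => better column to split on
```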
decision trees
demo
do they work?
testing
how well is it doing?
• train: use 80% of the labeled data
• test: use the held-out 20%
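A sketch with scikit-learn, assuming X and y are the encoded features and labels from earlier:

```python
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# hold out 20% of the labeled data; the model never sees it during training
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

model = DecisionTreeClassifier().fit(X_train, y_train)
print(model.score(X_test, y_test))  # accuracy on the held-out 20%
```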
AzureML
putting it all together
process reminder (same on Azure)
data → clean → transform → maths → model → predict
experiments
putting it all together
confusion matrix

                | Truth: true    | Truth: false
Guess: positive | true positive  | false positive
Guess: negative | false negative | true negative

$$\mathit{precision} = \frac{tp}{tp + fp} \qquad \mathit{recall} = \frac{tp}{tp + fn} \qquad \mathit{accuracy} = \frac{tp + tn}{tp + tn + fp + fn}$$
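The three formulas in plain Python, with made-up counts as a worked example:

```python
def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

# e.g. 40 true positives, 10 false positives, 5 false negatives, 45 true negatives
print(precision(40, 10))        # 0.8
print(recall(40, 5))            # ~0.889
print(accuracy(40, 45, 10, 5))  # 0.85
```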
Thank you!
