Carolina AI Meetup Nov 2018

MACHINE LEARNING
PRINCIPLES & ALGORITHMS

OUTLINE
• What is Machine Learning?
• Applications in Machine Learning
• (The Machine Learning) Model
• Machine Learning Models in Action
• Training Data
• Model / Data Considerations
• Models
• Decision Tree
• Random Forest
• Clustering
• Linear Models
• Support Vector Machines (SVM)
• Artificial Neural Networks
• Deep Learning (CNN)
• Reinforcement Learning

WHAT IS MACHINE LEARNING?
“Field of study that gives computers the ability to learn without being explicitly programmed.”
- Arthur Samuel, 1959
“A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.”
- Tom Mitchell, 1998

APPLICATIONS IN THE MODERN WORLD
• Optical Character Recognition
• Recommendation Engines
• Facial Recognition
• Autonomous Vehicles
• Personal Assistants / Chat Bots

MODEL
A REPRESENTATION OF A REAL-WORLD PROCESS
• Water Cycle
• Evolution
• Neuron (McCulloch & Pitts Model, 1943)

MACHINE LEARNING MODELS IN ACTION
• Training: (Old) Data + Expert Knowledge → Untrained Model → Trained Model
• Use: New Data → Trained Model → Info? Prediction? Decision?

TRAINING DATA
Feature = Variable = Predictor; Objective; Measurement
Height (in)  Weight (lb)  Color        Claws retract  Class
11.2         10.1         black        yes            cat
23.1         45.2         black/white  no             dog
13.0         20.1         black/white  yes            cat
9.7          7.2          white        yes            cat
…            …            …            …              …

TRAINING DATA
Feature = Variable = Predictor; Objective; Measurement
Height (in)  Weight (lb)  Color        Claws retract  Class
11.2         10.1         black        yes
23.1         45.2         black/white  no             dog
13.0         20.1         black/white  yes
9.7          7.2          white        yes            cat
…            …            …            …              …

TESTING DATA (NO PEEKING!)
Training and testing sets must ALWAYS be disjoint
• Cross-validation
• Leave-one-out
• OOB (Out-of-bag for ensembles)
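
The disjointness rule and cross-validation can be sketched in a few lines. This is a minimal illustration, not a library implementation; the helper name `k_fold_indices` and the sample counts are assumptions made for the example. Leave-one-out is the special case where k equals the number of samples.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k roughly equal folds
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

splits = list(k_fold_indices(n_samples=10, k=5))
for train_idx, test_idx in splits:
    # Training and testing sets never share a sample.
    assert set(train_idx).isdisjoint(test_idx)

# Leave-one-out: one fold per sample.
leave_one_out = list(k_fold_indices(n_samples=4, k=4))
```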

MODEL/DATA CONSIDERATIONS
(RELEVANT TO MODEL SELECTION)
Each model can/cannot handle certain data characteristics / analysis needs
• Supervised vs. Unsupervised data?
• Class Imbalance (200 cats vs. 3 dogs)
• 2-class vs. Multiclass (say 200 cats, 146 dogs, 25 sugar gliders, 5 platypuses)
• Scale issues (see Distance-based Clustering; Normalization / Standardization)
• Feature Type (Categorical, Continuous, etc.)
• Dimensionality (# of features / measurements)
• Cost Sensitivity (Miss / False Alarm – can the model adjust?)
• Propensity to Overtrain (fitting to noise – see Bias vs. Variance)?
• Need to estimate uncertainty?
• Ability to adapt to changing conditions (parameters)?
• Robustness to sparse data (parameter estimation)?
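
As one concrete case from the list above, scale issues are often handled by standardization (z-scores), so that a feature measured in pounds does not dominate one measured in inches in a distance calculation. A minimal sketch, assuming the height and weight columns from the training-data table; the function name `standardize` is illustrative.

```python
from statistics import mean, pstdev

def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

heights = [11.2, 23.1, 13.0, 9.7]   # inches, from the training-data table
weights = [10.1, 45.2, 20.1, 7.2]   # pounds

z_heights = standardize(heights)
z_weights = standardize(weights)
```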

DECISION TREE
1) At each node, a question is asked about a specific feature
2) The answer directs data left/right
3) Decision trees must be pruned to prevent overtraining
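
Steps 1 and 2 can be sketched with a small hand-built tree. The tree below is hypothetical: its questions and the 15 lb threshold are chosen for illustration and were not learned from data.

```python
# Hypothetical hand-built tree; the thresholds are illustrative, not learned.
tree = {
    "feature": "claws_retract", "test": lambda v: v == "yes",
    "yes": {"label": "cat"},
    "no": {
        "feature": "weight_lb", "test": lambda v: v < 15.0,
        "yes": {"label": "cat"},
        "no": {"label": "dog"},
    },
}

def classify(node, sample):
    """Walk from the root to a leaf, answering one question per node."""
    while "label" not in node:
        answer = node["test"](sample[node["feature"]])
        node = node["yes"] if answer else node["no"]
    return node["label"]

print(classify(tree, {"claws_retract": "no", "weight_lb": 45.2}))  # dog
```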

RANDOM FOREST
Random Forest is an ENSEMBLE of Decision Trees

RANDOM FOREST
Node Splits (Training)
• Bagging (resampled data for each tree)
• “Best” univariate split on random subspace (subset of all features)
• Gini Impurity
• Leaf nodes are class homogeneous
Leo Breiman
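
The training ingredients above can be sketched as follows. This is a simplified illustration, not Breiman's reference implementation: the function names and the dictionary row layout are assumptions, and only a single split is searched rather than a full tree.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity: 1 - sum(p_c^2); 0 means the node is class homogeneous."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def bootstrap(rows, rng):
    """Bagging: resample the rows with replacement, same size as the original."""
    return [rng.choice(rows) for _ in rows]

def best_threshold(rows, feature):
    """Best univariate split on one feature by weighted Gini impurity."""
    best_score, best_t = float("inf"), None
    for t in sorted({r[feature] for r in rows}):
        left = [r["cls"] for r in rows if r[feature] <= t]
        right = [r["cls"] for r in rows if r[feature] > t]
        if not left or not right:
            continue
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if score < best_score:
            best_score, best_t = score, t
    return best_score, best_t

rows = [  # height (in), weight (lb), class: from the training-data table
    {"height": 11.2, "weight": 10.1, "cls": "cat"},
    {"height": 23.1, "weight": 45.2, "cls": "dog"},
    {"height": 13.0, "weight": 20.1, "cls": "cat"},
    {"height": 9.7, "weight": 7.2, "cls": "cat"},
]
rng = random.Random(0)
sample = bootstrap(rows, rng)                      # data for one tree
subspace = rng.sample(["height", "weight"], k=1)   # random subset of features
print(best_threshold(rows, "weight"))  # (0.0, 20.1)
```

On the full table, splitting weight at 20.1 lb leaves both children class homogeneous, which is why the weighted impurity is 0.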

RANDOM FOREST
Leo Breiman
Classification
1) Samples propagate through each tree
2) Tree “votes” for a class based on leaf node
3) Final decision based on class conditional probability
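
A minimal sketch of steps 2 and 3, assuming each tree has already produced its vote. The vote list and the name `forest_predict` are hypothetical; the class probability here is simply the fraction of trees voting for each class.

```python
from collections import Counter

def forest_predict(tree_votes):
    """Turn per-tree votes into class probabilities and a final decision."""
    counts = Counter(tree_votes)
    probs = {cls: n / len(tree_votes) for cls, n in counts.items()}
    decision = max(probs, key=probs.get)
    return decision, probs

votes = ["cat", "cat", "dog", "cat", "dog"]  # hypothetical votes from 5 trees
decision, probs = forest_predict(votes)
print(decision, probs)  # cat {'cat': 0.6, 'dog': 0.4}
```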

CLUSTERING
Key Variants
K-means: point-to-cluster mean distance
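
A minimal sketch of k-means (Lloyd's algorithm) using the point-to-cluster-mean distance. The points and starting centers are made up for illustration; a practical version would also choose starting centers carefully and stop on convergence.

```python
import math

def k_means(points, centers, iterations=10):
    """Lloyd's algorithm: assign points to the nearest mean, then update means."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [
            tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
    return centers, clusters

points = [(1.0, 1.0), (1.0, 2.0), (9.0, 9.0), (10.0, 9.0)]   # hypothetical data
centers, clusters = k_means(points, centers=[(0.0, 0.0), (10.0, 10.0)])
print(centers)  # [(1.0, 1.5), (9.5, 9.0)]
```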

CLUSTERING
Key Variants
Mean-Shift: hill-climbing to max density

CLUSTERING
Key Variants
DBSCAN: epsilon neighborhood
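
A compact sketch of DBSCAN built around the epsilon neighborhood. A point with at least `min_pts` neighbors within `eps` (counting itself) is treated as a core point, and clusters grow outward from core points. The data and parameter values are assumptions for illustration.

```python
import math

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    def neighbors(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1           # noise for now; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
    return labels

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
print(dbscan(points, eps=1.5, min_pts=3))  # [0, 0, 0, 1, 1, 1, -1]
```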

CLUSTERING
Key Variants
Gaussian Mixture Models: Gaussian assumption

CLUSTERING
Key Variants
Hierarchical Clustering

LINEAR MODELS
• Linear Discriminant Analysis: (Ronald) Fisher’s LDA
• Simple Linear Regression
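
Simple linear regression has a closed-form least-squares solution, sketched here. The data points are made up so that the fitted line is exactly y = 2x + 1; the function name `fit_line` is illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

slope, intercept = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(slope, intercept)  # 2.0 1.0
```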

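SUPPORT VECTOR MACHINES (SVM)
Maps linearly nonseparable data to a higher dimension, where a separating hyperplane may exist
Kernel trick makes this mapping more efficient: inner products in the higher dimension are computed directly from the original features
Also (for training): sub-gradient descent, coordinate descent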
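
The kernel trick can be checked numerically for one simple case. The sketch below assumes a degree-2 polynomial kernel on 2-D inputs, chosen only because its explicit feature map is short enough to write out. Both routes give the same inner product, but the kernel never builds the higher-dimensional vectors.

```python
import math

def phi(x):
    """Explicit map from 2-D to 3-D for the degree-2 polynomial kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def poly_kernel(x, z):
    """Same inner product, computed without ever building phi(x)."""
    return dot(x, z) ** 2

x, z = (1.0, 2.0), (3.0, 4.0)          # hypothetical points
explicit = dot(phi(x), phi(z))         # map first, then take the inner product
implicit = poly_kernel(x, z)           # kernel trick: stay in 2-D
assert math.isclose(explicit, implicit)
```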

SUPPORT VECTOR MACHINES (SVM)
Support vectors in the feature space are used for classification
Support vectors are the training points closest to the decision boundary, i.e. the most difficult points to classify

ARTIFICIAL NEURAL NETWORKS
Recall the original model of the neuron…

Input Layer → Hidden Layers → Output Layer
Feedforward (forward processing)
• Each arrow represents a weight
• Hidden & output nodes “process” input values/weights
Backpropagation (of errors)
• Allow specification of desired output
• Minimize loss function
Figure labels: w11, w12, w41, f31, f41, f51
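
Feedforward and backpropagation can be sketched on a tiny network with two inputs, two hidden nodes, and one output. Everything specific here is an assumption for illustration: sigmoid activations, no bias terms, a squared-error loss, and made-up starting weights and training sample.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, w_out):
    """Feedforward: each node applies sigmoid to a weighted sum of its inputs."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, output

def train_step(x, target, w_hidden, w_out, lr=0.5):
    """One backpropagation step on the loss 0.5 * (output - target)^2."""
    hidden, output = forward(x, w_hidden, w_out)
    # Error signal at the output node (chain rule through the sigmoid).
    delta_out = (output - target) * output * (1.0 - output)
    # Propagate the error back to each hidden node through its outgoing weight.
    delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(w_out, hidden)]
    new_w_out = [w - lr * delta_out * h for w, h in zip(w_out, hidden)]
    new_w_hidden = [
        [w - lr * d * xi for w, xi in zip(row, x)]
        for row, d in zip(w_hidden, delta_hidden)
    ]
    return new_w_hidden, new_w_out

x, target = [1.0, 0.5], 1.0                                # hypothetical sample
w_hidden, w_out = [[0.1, -0.2], [0.3, 0.4]], [0.2, -0.1]   # hypothetical weights
loss_before = 0.5 * (forward(x, w_hidden, w_out)[1] - target) ** 2
for _ in range(100):
    w_hidden, w_out = train_step(x, target, w_hidden, w_out)
loss_after = 0.5 * (forward(x, w_hidden, w_out)[1] - target) ** 2
print(loss_before, loss_after)  # the loss shrinks as the weights are adjusted
```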

DEEP LEARNING (CONVOLUTIONAL NN)
From the Latin convolvere, “to convolve” means to roll together
We convolve an image with multiple kernels (filters) at each layer

Each layer of the network learns different features of the image
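
What it means to pass a kernel over an image can be sketched directly. The function below slides the kernel without flipping it (strictly, cross-correlation), which is how most CNN libraries define their convolution layers. The image and the edge-detecting kernel are made up for illustration; in a CNN the kernel values are learned.

```python
def convolve2d(image, kernel):
    """'Valid' sliding-window filter (cross-correlation, no kernel flip)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
edge_kernel = [[-1, 1], [-1, 1]]   # hypothetical filter: responds to dark-to-light steps
print(convolve2d(image, edge_kernel))  # [[0, 2, 0], [0, 2, 0]]
```

The output is largest exactly where the image steps from 0 to 1, so this one filter acts as a vertical-edge feature.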

REINFORCEMENT LEARNING
A reward-driven approach for a machine to “self-learn”
• At each step, the agent takes an action based on environment state
• The agent receives a reward based upon the new state (post-action)
• The agent’s goal is to maximize its reward

Donald Michie creates MENACE, 1963 (Machine Educable Noughts And Crosses Engine)
MENACE learned to play tic-tac-toe using stacks of matchboxes

Q(uality)-Learning – value-based; environment may be unknown
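
A minimal sketch of tabular Q-learning. The environment is a hypothetical five-state corridor with a reward only at the right end; the learning rate, discount, and exploration rate are illustrative choices. The agent never reads the `step` function's rules; it only sees the state and reward that come back.

```python
import random

N_STATES, GOAL = 5, 4          # hypothetical corridor: states 0..4, reward at 4
ACTIONS = (-1, +1)             # step left or right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2

def step(state, action):
    """Environment: returns (new_state, reward); its rules are hidden from the agent."""
    new_state = min(max(state + action, 0), GOAL)
    return new_state, (1.0 if new_state == GOAL else 0.0)

def train(episodes=300, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            if rng.random() < EPSILON:
                action = rng.choice(ACTIONS)          # explore
            else:                                     # exploit, random tie-break
                best = max(q[(state, a)] for a in ACTIONS)
                action = rng.choice([a for a in ACTIONS if q[(state, a)] == best])
            new_state, reward = step(state, action)
            # Q-learning update: move Q toward reward + discounted best next value.
            target = reward + GAMMA * max(q[(new_state, a)] for a in ACTIONS)
            q[(state, action)] += ALPHA * (target - q[(state, action)])
            state = new_state
    return q

q = train()
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)]
print(policy)  # [1, 1, 1, 1]: always step right, toward the reward
```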

Google’s DeepMind AI learns to walk
