ML basics.pptx

Outline
Machine Learning
Types of Machine Learning Problems
Steps to Solve a Machine Learning Problem

What is a Cat?
Occlusion, diversity, deformation, lighting variations

What is Machine Learning?
Introduction to Machine Learning
The subfield of computer science that “gives computers the ability to learn
without being explicitly programmed”.
(Arthur Samuel, 1959)
“A computer program is said to learn from experience E with respect to
some class of tasks T and performance measure P if its performance at
tasks in T, as measured by P, improves with experience E.”
(Tom Mitchell, 1997)
Using data for answering questions
Training, then predicting

The Big Data Era
Data
• Data already available everywhere
• Low storage costs: everyone has several GBs for “free”
Devices
• Hardware more powerful and cheaper than ever before
• Everyone has a computer fully packed with sensors:
  • GPS
  • Cameras
  • Microphones
• Permanently connected to the Internet
Services
• Cloud Computing:
  • Online storage
  • Infrastructure as a Service
• User applications:
  • YouTube
  • Gmail
  • Facebook
  • Twitter

Types of Machine Learning Problems
Supervised
Unsupervised
Reinforcement

Supervised
Learn through examples for which we know the desired output (what we want to predict).
• Is this a cat or a dog?
• Are these emails spam or not?
• Predict the market value of houses, given the square meters, number of rooms, neighborhood, etc.

Supervised
• Classification: output is a discrete variable (e.g., cat/dog)
• Regression: output is continuous (e.g., price, temperature)
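A minimal sketch of the two kinds of supervised output. The data, the 1-nearest-neighbor rule, and the least-squares line are illustrative assumptions, not taken from the slides.

```python
# Toy data invented for illustration; not from the slides.

def nearest_neighbor_classify(train, query):
    """Classification: return the discrete label of the closest training point."""
    closest = min(train, key=lambda pair: abs(pair[0] - query))
    return closest[1]

def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Classification: weight in kg -> "cat" or "dog" (hypothetical animals).
animals = [(3.5, "cat"), (4.2, "cat"), (20.0, "dog"), (30.0, "dog")]
label = nearest_neighbor_classify(animals, 5.0)  # a discrete answer

# Regression: square meters -> price (hypothetical houses).
sizes = [50.0, 80.0, 100.0]
prices = [100000.0, 160000.0, 200000.0]
slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 120.0 + intercept  # a continuous answer
```

The classifier can only return one of the labels it has seen, while the regression line can return any number.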

Unsupervised
There is no desired output. Learn something about the data: latent relationships.
• I have photos and want to put them in 20 groups.
• I want to find anomalies in the credit card usage patterns of my customers.

Unsupervised
Useful for learning structure in the data (clustering), finding hidden correlations, reducing dimensionality, etc.
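Clustering can be sketched with a plain k-means loop. The points are invented, and taking the first k points as initial centroids is a simplifying assumption (real implementations usually initialize randomly or with k-means++).

```python
def kmeans(points, k, iterations=10):
    """Plain k-means on 2-D points; no labels are used anywhere."""
    # Simplifying assumption: the first k points serve as initial centroids.
    centroids = list(points[:k])
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, (x, y) in enumerate(points):
            distances = [(x - cx) ** 2 + (y - cy) ** 2 for cx, cy in centroids]
            assignments[i] = distances.index(min(distances))
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, assignments

# Hypothetical 2-D data with two visible groups.
points = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (8.2, 7.9), (0.9, 1.1), (7.8, 8.1)]
centroids, assignments = kmeans(points, k=2)
```

The algorithm discovers the two groups from the positions alone; the cluster numbers it returns carry no meaning beyond "same group" or "different group".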

Reinforcement
An agent interacts with an environment and watches the result of the interaction.
The environment gives feedback via a positive or negative reward signal.
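A minimal sketch of this loop, using tabular Q-learning. The corridor environment, the reward values, and all parameter values are assumptions made up for illustration.

```python
import random

def train_q_table(n_states=5, episodes=200, alpha=0.5, gamma=0.9,
                  epsilon=0.2, seed=0):
    """Tabular Q-learning on a hypothetical corridor.

    The agent starts at state 0; reaching the last state ends the episode.
    """
    rng = random.Random(seed)
    moves = (-1, +1)  # action 0 = left, action 1 = right
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            # Explore with probability epsilon (or on ties), else act greedily.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            # The environment responds with a new state and a reward signal.
            next_state = min(max(state + moves[action], 0), goal)
            reward = 1.0 if next_state == goal else -0.01
            # Q-learning update: move the estimate toward reward + discounted best future value.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q = train_q_table()
# Greedy policy for every non-goal state.
policy = ["right" if right > left else "left" for left, right in q[:-1]]
```

Nobody tells the agent which action is correct. It only sees the reward, and the preference for moving right emerges from repeated interaction.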

Steps to Solve a Machine Learning Problem
1. Data Gathering: collect data from various sources
2. Data Preprocessing: clean data to have homogeneity
3. Feature Engineering: making your data more useful
4. Algorithm Selection & Training: selecting the right machine learning model
5. Making Predictions: evaluate the model

Data Gathering
Might depend on human work
• Manual labeling for supervised learning.
• Domain knowledge. Maybe even experts.
May come for free, or “sort of”
• E.g., Machine Translation.
The more the better: some algorithms need large amounts of data to be
useful (e.g., neural networks).
The quantity and quality of the data largely determine the model's accuracy.

Data Preprocessing
Is there anything wrong with the data?
• Missing values
• Outliers
• Bad encoding (for text)
• Wrongly-labeled examples
• Biased data
  • Do I have many more samples of one class than the rest?
Need to fix/remove data?
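Some of these checks can be sketched in a few lines. The records, the median fill, and the outlier threshold of 5 median absolute deviations are assumptions chosen for illustration; the right choices depend on the data.

```python
from collections import Counter
from statistics import median

# Hypothetical records: (measured value, label); None marks a missing value.
records = [(25, "spam"), (None, "ham"), (31, "ham"),
           (29, "ham"), (400, "ham"), (27, "ham")]

# Missing values: fill with the median of the observed values (one common choice).
observed = [value for value, _ in records if value is not None]
fill = median(observed)
values = [value if value is not None else fill for value, _ in records]

# Outliers: flag values far from the median, measured in units of the
# median absolute deviation (MAD). The factor 5 is an arbitrary assumption.
center = median(values)
mad = median([abs(v - center) for v in values])
outliers = [v for v in values if abs(v - center) > 5 * mad]

# Class balance: do I have many more samples of one class than the rest?
class_counts = Counter(label for _, label in records)
```

Here the check flags 400 as an outlier and shows five "ham" records against one "spam" record. Whether to fix or remove such data remains a judgment call.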

Feature Engineering
What is a feature?
A feature is an individual measurable property of a phenomenon being observed.
Our inputs are represented by a set of features.
To classify spam email, features could be:
• Number of words that have been ch4ng3d like this.
• Language of the email (0=English, 1=Spanish)
• Number of emojis
Example:
“Buy ch34p drugs from the ph4rm4cy now :) :) :)” → feature engineering → (2, 0, 3)
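The spam example can be sketched as a small function. The function name, the rule for an "altered" word (mixing letters and digits), and counting only the ":)" emoticon are assumptions. Language detection is out of scope, so the language code is passed in by hand.

```python
import re

def extract_features(text, language_code=0):
    """Return (altered_word_count, language_code, emoji_count) for one email.

    language_code follows the slide's convention: 0 = English, 1 = Spanish.
    """
    words = text.split()
    # Assumption: a word is "altered" if it mixes letters and digits, e.g. "ch34p".
    altered = sum(
        1 for w in words
        if re.search(r"[A-Za-z]", w) and re.search(r"\d", w)
    )
    # Assumption: only the ":)" emoticon is counted.
    emojis = text.count(":)")
    return (altered, language_code, emojis)

features = extract_features("Buy ch34p drugs from the ph4rm4cy now :) :) :)")
```

For the sample message this yields (2, 0, 3), matching the vector on the slide.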

Feature Engineering
Extract more information from existing data, not adding “new” data per se
• Making it more useful
• With good features, most algorithms can learn faster
It can be an art
• Requires thought and knowledge of the data
Two steps:
• Variable transformation (e.g., dates into weekdays, normalizing)
• Feature creation (e.g., n-grams for texts, whether a word is capitalized to
detect names, etc.)
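Both steps can be sketched with small helpers. All function names are hypothetical, and min-max scaling is only one of several ways to normalize.

```python
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# --- Variable transformation ---

def date_to_weekday(d):
    """Turn a date into its weekday name."""
    return WEEKDAYS[d.weekday()]

def min_max_normalize(values):
    """Rescale values to the range [0, 1]; assumes they are not all equal."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]

# --- Feature creation ---

def word_ngrams(text, n):
    """All runs of n consecutive words in the text."""
    words = text.split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

def is_capitalized(word):
    """A crude hint that a word might be a name."""
    return word[:1].isupper()

weekday = date_to_weekday(date(2024, 1, 1))
scaled = min_max_normalize([50, 75, 100])
bigrams = word_ngrams("buy cheap drugs now", 2)
```

None of these helpers adds new data. Each one re-expresses existing data in a form an algorithm can use more easily.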

Algorithm Selection & Training
Supervised
• Linear classifier
• Naive Bayes
• Support Vector Machines (SVM)
• Decision Tree
• Random Forests
• k-Nearest Neighbors
• Neural Networks (Deep learning)
Unsupervised
• PCA
• t-SNE
• k-means
• DBSCAN
Reinforcement
• SARSA–λ
• Q-Learning

Algorithm Selection & Training
Goal of training: making the correct prediction as often as possible
• Incremental improvement: predict, then adjust
• Use of metrics for evaluating performance and comparing solutions
• Hyperparameter tuning: more an art than a science
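The predict-and-adjust cycle can be sketched with a perceptron, one simple linear classifier. The data is invented, and the epoch count and learning rate are hyperparameters picked by hand for this toy case.

```python
def predict(weights, bias, x):
    """Linear classifier: 1 if the weighted sum is positive, else 0."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else 0

def train_perceptron(samples, labels, epochs=20, learning_rate=1.0):
    """Repeat predict -> adjust over the training data."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            # Predict with the current parameters.
            error = y - predict(weights, bias, x)
            # Adjust: parameters change only when the prediction was wrong.
            weights = [w + learning_rate * error * xi
                       for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias

def accuracy(weights, bias, samples, labels):
    """Metric: fraction of samples predicted correctly."""
    correct = sum(predict(weights, bias, x) == y
                  for x, y in zip(samples, labels))
    return correct / len(samples)

# Hypothetical data: (altered words, emojis) -> 1 for spam, 0 for not spam.
samples = [(2, 3), (3, 1), (0, 0), (0, 1)]
labels = [1, 1, 0, 0]
weights, bias = train_perceptron(samples, labels)
train_accuracy = accuracy(weights, bias, samples, labels)
```

Accuracy measured on the training data itself is optimistic. Comparing solutions fairly requires held-out data that the model did not see during training.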

Making Predictions
Training Phase
Samples → Feature extraction → Features + Labels → Machine Learning model
Prediction Phase
Input → Feature extraction → Features → Trained classifier → Label
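The two phases can be sketched end to end. The feature extractor, the nearest-centroid classifier, and the messages are all assumptions chosen to keep the example short.

```python
def extract_features(text):
    """Hypothetical extractor: (exclamation marks, digit characters)."""
    return (text.count("!"), sum(ch.isdigit() for ch in text))

class NearestCentroidClassifier:
    """Stores one mean feature vector per label; predicts the closest one."""

    def fit(self, features, labels):
        sums, counts = {}, {}
        for vector, label in zip(features, labels):
            total = sums.setdefault(label, [0.0] * len(vector))
            for i, value in enumerate(vector):
                total[i] += value
            counts[label] = counts.get(label, 0) + 1
        self.centroids = {
            label: [s / counts[label] for s in total]
            for label, total in sums.items()
        }
        return self

    def predict(self, vector):
        def squared_distance(centroid):
            return sum((a - b) ** 2 for a, b in zip(vector, centroid))
        return min(self.centroids,
                   key=lambda label: squared_distance(self.centroids[label]))

# Training phase: samples -> feature extraction -> features + labels -> model.
samples = ["Win 1000 dollars now!!!", "Cheap pills 4 u!!",
           "Meeting moved to Monday", "See you at lunch"]
labels = ["spam", "spam", "ham", "ham"]
model = NearestCentroidClassifier().fit(
    [extract_features(s) for s in samples], labels)

# Prediction phase: input -> feature extraction -> features -> trained classifier -> label.
predicted = model.predict(extract_features("Claim your 500 prize!!!"))
```

The same feature extraction runs in both phases. If the prediction phase computed features differently, the trained model would receive inputs it never learned from.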

Summary
• Machine Learning is the intelligent use of data to answer questions
• Enabled by an exponential increase in computing power and data
availability
• Three big types of problems: supervised, unsupervised, reinforcement
• 5 steps to every machine learning solution:
1. Data Gathering
2. Data Preprocessing
3. Feature Engineering
4. Algorithm Selection & Training
5. Making Predictions
