Machine Learning
What is it?
● Grew out of work in Artificial Intelligence (AI)
● 1959 Arthur Samuel – Machine Learning:
● “Field of study that gives computers the ability to
learn without being explicitly programmed.”
● 1998 Tom Mitchell – Well posed learning
problem:
● “A computer program is said to 'learn' from
experience 'E' with respect to some task 'T' and
some performance measure 'P', if its performance
on 'T', as measured by 'P', improves with
experience 'E'.”
What is it?
● Example:
● Email program
(experience)
– 'E' – watches you label emails as spam/not spam
(task)
– 'T' – classifies emails as spam/not spam
(performance)
– 'P' – fraction of emails correctly classified as spam/not
spam
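The performance measure 'P' in this example can be sketched in a few lines of Python. The function name `fraction_correct` and the sample labels are illustrative assumptions, not part of the original slides.

```python
def fraction_correct(predicted, actual):
    """Performance measure P: fraction of emails classified correctly."""
    if len(predicted) != len(actual):
        raise ValueError("label lists must have the same length")
    if not actual:
        raise ValueError("need at least one labeled email")
    matches = sum(1 for p, a in zip(predicted, actual) if p == a)
    return matches / len(actual)

# Hypothetical data: labels given by the user (experience E)
# and labels produced by the program (task T).
actual = ["spam", "ham", "spam", "ham"]
predicted = ["spam", "ham", "ham", "ham"]
accuracy = fraction_correct(predicted, actual)
```

The program "learns" in Mitchell's sense if this number tends to rise as it sees more labeled emails.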
What is it?
● Solves complicated, underspecified problems
● Some problems can't be solved directly by software
● Instead of writing a program for each problem:
● Collect samples of correct input->output
● Use algorithm to create a program to do the same
● Program handles new cases (other than those in
the training data), retrain if new data
● Massive amounts of data + computation is
cheaper than developing software
http://technocalifornia.blogspot.com/2012/07/more-data-or-better-models.html
Problems for Machine Learning
● Pattern recognition
● Objects in real scenes
● Computer vision – facial identities / expressions
● Speech recognition
– Sample sounds
– Partition phonemes
– Decoding – extract meaning, NLP
● Natural language
Problems for Machine Learning
● Recognizing anomalies
● Unusual sequences
– Credit / phone fraud
– SPAM / HAM
● Sensor readings
– Power plant operation and health
– Detect when actions are required
Problems for Machine Learning
● Prediction
● Stock price movements (time sequence)
● Currency exchange rates
● Risk analytics
● Sentiment analysis
● Click throughs (web traffic)
● Preferences
– Netflix, Amazon, Pandora, web ad targeting, etc.
Problems for Machine Learning
● Information Retrieval (database mining)
● Genomics
● News/Twitter data feeds
● Archived data
● Web clicks
● Medical records
● Find similar, summarize groups of material
Learning - Supervised
● Predict output given the input, train using inputs
with known outputs
● Regression – target is a real number, goal is to
be 'close'
● Classification – target is a class label: binary
(yes/no) or multi-class (one of many)
Learning – Unsupervised
● Older texts explicitly exclude this from being
learning!
● Discover good internal representation of input
● Difficult to determine what the goal is
● Create a representation that can be used in
subsequent supervised learning?
● Dimensionality reduction (PCA) can be used for
compression or to simplify analysis
● Provide an economical high dimensional
representation (binary features, real features –
single largest parameter)
Learning – Reinforcement
● Select action to maximize payoff
● Maximize expected sum of future rewards
● Not every action results in a payoff
● Apply discounting to minimize effect of far future on
present decisions
● Difficult – payoffs are delayed, critical decision
points unknown, scalar payoff contains little
information
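As a rough illustration of discounting, the sketch below computes a discounted sum of future rewards. The discount factor name `gamma` and the reward list are assumptions chosen for the example; the slides do not name a specific formula.

```python
def discounted_return(rewards, gamma):
    """Sum of future rewards, the reward t steps ahead weighted by gamma**t."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must be in [0, 1]")
    total = 0.0
    # Work backward: G_t = r_t + gamma * G_{t+1}
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total

# Hypothetical episode: the payoff arrives only after the third action.
delayed = discounted_return([0.0, 0.0, 1.0], gamma=0.5)
undiscounted = discounted_return([1.0, 1.0, 1.0], gamma=1.0)
```

A smaller `gamma` shrinks the contribution of rewards that lie far in the future, which is the effect the slide describes.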
Learning – Reinforcement
● Planning
● Choice of actions by anticipating outcomes
● Actions and planning can be interleaved
(incomplete knowledge)
– Warehouse, dock management, Route
planning/replanning
● Multiple simultaneous agents planning
independently
– Emergency responders
– http://www.aiai.ed.ac.uk/project/i-globe/resources/2007-03-06-Iglobe/2007-03-06-Iglobe-Demo.avi
Learning – Data
● Training data [ ~60% - 80% ]
● Inputs (with correct response for supervised)
● Validation data [ ~20% ]
● Tune the model iteratively: train, check against this
set, adjust, and repeat until results stop improving
● Test data [ ~10% - 20% ]
● Not used until training and validation are complete –
measure performance with this data set
Learning – Data
● Partition randomly
● For time series data, use random subsequences
● Training and test data should be from same
population
● If feature selection or model tuning required
(e.g. PCA parameter mapping) then the tuning
must be done for each training set
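A random partition into training, validation and test sets might look like the following sketch. The 60/20/20 fractions are one choice within the ranges given above, and `split_dataset` and its `seed` parameter are hypothetical names. It assumes independent samples; time series data would need the subsequence approach instead.

```python
import random

def split_dataset(samples, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle the samples, then cut them into training, validation and test lists."""
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave room for a test set")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)      # seeded so the split is repeatable
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]          # whatever remains is the test set
    return train, val, test

train, val, test = split_dataset(range(100))
```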
Learning – Training
● One iteration for each set of input data in the
training data set
● Start with random parameters
● Randomize input data during training
● Calculate model parameters for each input
● Use previous parameter values to calculate
next values using new training input
Learning – Bias and Variance
● Bias – error from the algorithm's simplifying assumptions
● High bias – underfit
● More training data does not help
● Variance – sensitivity to fluctuations in data
● High variance – overfit
● More training data likely to help
● Irreducible error - noise
Learning – (Cross) Validation
● Validation
● Holdout data for tuning model with new data
● Evaluate model using holdout as test set
● Cross validation
● generating models with different holdouts to avoid
overfitting
● n-fold - divide data into n chunks and train n times,
treating a different chunk as the holdout each time
(leave-one-out – same with chunk size of 1)
● Random subsampling – approaches leave-p-out
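The n-fold scheme can be sketched as a generator that yields one (training, holdout) pair per chunk. The name `n_fold_splits` is illustrative, and the sketch assumes the samples have already been shuffled.

```python
def n_fold_splits(samples, n):
    """Yield (training, holdout) pairs, using each of n chunks as the holdout once."""
    samples = list(samples)
    if not 2 <= n <= len(samples):
        raise ValueError("n must be between 2 and the number of samples")
    for i in range(n):
        holdout = samples[i::n]                # every n-th sample, starting at offset i
        training = [s for j, s in enumerate(samples) if j % n != i]
        yield training, holdout

folds = list(n_fold_splits(range(6), 3))
```

Setting `n` equal to the number of samples gives leave-one-out.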
Learning - Improvements
● Things to do when the error is too high
● Get more training data (high variance)
● Try smaller sets of features (high variance)
● Try getting additional features (high bias)
● Add polynomial features (high bias)
● Decrease smoothing parameter λ (high bias)
● Increase smoothing parameter λ (high variance)
Learning – Testing
● Reserve set of data [~10% - 20% ]
● Evaluate model performance with the test set
● Make no further model changes
● Performance evaluation
● Supervised learning – compare predictions with
known results
● Predictions of unsupervised model when results
can be known – even if not used in training
Training - Gradient Descent
● Find minimum of a cost / performance metric
Training – Gradient Descent
● Convex cost function (e.g. squared error for linear regression)
● Well behaved
● Single global minimum, easily reached
Training – Gradient Descent
● Complex cost functions
● Not well behaved
● Global minimum, many local minima
Training – Gradient Descent
● Convergence speed and stability controlled by
learning rate (step size) α
● Low α – slow but stable convergence
● High α – faster, but may overshoot or diverge
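The effect of α (commonly called the learning rate) can be seen in a small sketch. The cost function J(x) = (x - 3)^2 and the three α values are assumptions picked for illustration.

```python
def gradient_descent(gradient, start, alpha, steps):
    """Repeatedly step against the gradient, scaled by alpha."""
    x = start
    for _ in range(steps):
        x = x - alpha * gradient(x)
    return x

def grad(x):
    # Gradient of the convex cost J(x) = (x - 3)^2, whose minimum is at x = 3
    return 2.0 * (x - 3.0)

slow = gradient_descent(grad, start=0.0, alpha=0.01, steps=50)     # still far from 3
good = gradient_descent(grad, start=0.0, alpha=0.1, steps=50)      # close to 3
diverged = gradient_descent(grad, start=0.0, alpha=1.1, steps=50)  # oscillates away from 3
```

For this cost, each step multiplies the distance to the minimum by (1 - 2α), so any α above 1 makes the distance grow instead of shrink.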
Training – k-means
● Cluster data into k different groups (unsupervised)
● Start with k random points
● Group data with the closest point
● Move the points to the centroid of the data for that
point
● Terminate when the points no longer move (or
move only a small amount)
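The steps above can be sketched in plain Python. The function names, the `seed`, `max_iter` and `tol` parameters, and the sample points are all assumptions for illustration. An empty group keeps its previous centroid, which is one of several possible conventions.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, k, seed=0, max_iter=100, tol=1e-12):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)          # start with k random data points
    groups = []
    for _ in range(max_iter):
        # Group each data point with its closest centroid
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            groups[nearest].append(p)
        # Move each centroid to the mean of its group
        new_centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for c, g in zip(centroids, groups)
        ]
        moved = max(sq_dist(c, n) for c, n in zip(centroids, new_centroids))
        centroids = new_centroids
        if moved <= tol:                       # terminate when the points stop moving
            break
    return centroids, groups

points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, groups = k_means(points, k=2)
```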
Training – k-nn
● k nearest neighbors determine classification of
each element in data
● Skewed data can result in homogeneous result
● Use weighting to avoid this
● Training – store the training data
● For each data point to be predicted
● Locate the nearest k other points
– Use any consistent distance metric – Lp norms (Euclidean,
Manhattan, maximum single-coordinate distance)
● Assign the majority class of those nearest points
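A minimal sketch of the prediction step, assuming Euclidean distance and unweighted voting. The name `knn_predict` and the training points are made up for the example.

```python
import math
from collections import Counter

def knn_predict(training, query, k=3):
    """training: list of (point, label) pairs. Returns the majority label of the k nearest points."""
    if not 1 <= k <= len(training):
        raise ValueError("k must be between 1 and the number of training points")
    # "Training" is just storing the data; the work happens at prediction time
    nearest = sorted(training, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

training = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
            ((5, 5), "b"), ((5, 6), "b")]
near_origin = knn_predict(training, (0.2, 0.2), k=3)
near_five = knn_predict(training, (5, 5.4), k=3)
```

In the second query the three nearest points are two "b" and one "a", so the majority vote is "b" even though class "a" has more training points overall.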
Types of Machine Learning
● Regressions
● Neural Networks
● Dimensionality reduction
● Support Vector Machines (SVM)
● Principal Component Analysis (PCA)
● Clustering
● Classification
● Probabilistic – Bayes, Markov
● ...others...
Regression
● Single / Multiple variable
● Linear / Logistic
● Regularization (smoothing) – helps to avoid
overfitting
Regression – Equations
● Linear regression
hypothesis function
● Logistic regression
hypothesis function
● Regularized linear
regression cost
function
● Regularized logistic
regression cost
function
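The equations themselves did not survive on this slide. The standard textbook forms of the four functions are sketched below; the notation (parameters θ, m training examples, n features, regularization parameter λ) is an assumption and may differ from the original slide.

```latex
% Linear regression hypothesis
h_\theta(x) = \theta^{T} x

% Logistic regression hypothesis
h_\theta(x) = \frac{1}{1 + e^{-\theta^{T} x}}

% Regularized linear regression cost
J(\theta) = \frac{1}{2m} \left[ \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^{2}
            + \lambda \sum_{j=1}^{n} \theta_j^{2} \right]

% Regularized logistic regression cost
J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)})
            + \left( 1 - y^{(i)} \right) \log \left( 1 - h_\theta(x^{(i)}) \right) \right]
            + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^{2}
```

In both cost functions the regularization sum starts at j = 1, so the intercept term θ_0 is conventionally left unpenalized.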
Neural Networks - Representation
● Nodes – compared to neurons, many inputs,
one output
● Transfer characteristic – logistic function
● Input from left, output to right
● Layers
– Input layer, driven by numeric input values
– Output layer, provides numeric output values (or
thresholded for classification output)
– Hidden layers between input and output – no discernable
meaning for their values
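The node and layer structure described above can be sketched as a forward pass through a small network. The weights, biases and layer sizes here are arbitrary placeholders, not values from the slides.

```python
import math

def logistic(z):
    """Logistic transfer characteristic, mapping any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def node_output(inputs, weights, bias):
    """One node: many inputs, one output."""
    return logistic(sum(w * x for w, x in zip(weights, inputs)) + bias)

def forward(inputs, layers):
    """layers: list of layers, each a list of (weights, bias) nodes, ordered input to output."""
    activations = list(inputs)
    for layer in layers:
        activations = [node_output(activations, w, b) for w, b in layer]
    return activations

# Hypothetical network: 2 inputs, a hidden layer of 2 nodes, 1 output node
hidden = [([0.5, -0.5], 0.0), ([1.0, 1.0], -1.0)]
output = [([1.0, -1.0], 0.0)]
result = forward([1.0, 2.0], [hidden, output])

# With all-zero weights every node outputs logistic(0) = 0.5
zero_net = [[([0.0, 0.0], 0.0), ([0.0, 0.0], 0.0)], [([0.0, 0.0], 0.0)]]
flat = forward([1.0, 2.0], zero_net)
```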
Neural Networks – Learning
● Learns using gradient descent
● Forward propagation – start at inputs, compute the
activations of each successive layer
● Backward propagation – start at outputs, propagate the
error backward and adjust parameters to reduce it
Neural Networks - Learning
● OCR training set
● what does the number '2' look like when
handwritten?
Neural Networks - Learning
● Neural Network parameters are not simply
interpretable
Support Vector Machines
● Supervised learning classification and
regression algorithm
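● Cocktail Party Problem (an unsupervised source separation
task; the one-liner below uses singular value decomposition,
SVD, not an SVM)
● Many speakers, many sensors (microphones)
● Separate the sources from the mixed inputs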
[W,s,v] = svd((repmat(sum(x.*x,1),size(x,1),1).*x)*x');
Principal Component Analysis
● Unsupervised learning
● Finds basis vectors for data
● Direction of largest variance is the 'principal' component
● Center each attribute on mean for visualization,
not for prediction models
● Normalized to same range to provide
comparable contributions from each factor
Classification
● Logistic partitioning - data
Classification
● Logistic partitioning – classification boundary
Classification
● Logistic partitioning – overfit boundary
Classification
● Logistic partitioning – underfit boundary
Classification - Performance
● Receiver Operating Characteristic (ROC)
● Plots true positive rate against false positive rate
● Perfect predictions indicated in upper left corner
● Up and to the left means better
● Diagonal from lower left to upper right indicates
performance equivalent to random guessing
Classification - Performance
● Area Under the Curve (AUC)
● ROC chart with curves applied
● Classifications based on thresholds for continuous
random variables
● Curve is parametric plot with the threshold as the
varying parameter
● AUC is a scalar summary of predictive value
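The parametric construction of the ROC curve and its area can be sketched as follows. The function names, the 0/1 label encoding and the example scores are assumptions. The area uses the trapezoid rule.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")
    points = [(0.0, 0.0)]
    # Lowering the threshold past each distinct score admits more predicted positives
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the curve by the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

scores = [0.9, 0.8, 0.7, 0.3]
perfect = auc(roc_points(scores, [1, 1, 0, 0]))   # positives all ranked first
mixed = auc(roc_points(scores, [1, 0, 1, 0]))     # one positive ranked below a negative
```

A classifier that ranks every positive above every negative reaches the upper left corner and an AUC of 1. Random guessing follows the diagonal, with an AUC near 0.5.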
Natural Language Processing
● Text processing
● Modeling
● Generative models – generate observed data from
hidden parameters
– N-gram, Naive Bayes, HSMM, CFG
● Discriminative models – estimate probability of
hidden parameters from observed data
– Regressions, maximum entropy, conditional random
fields, support vector machines, neural networks
NLP - Language Modeling
● Probability of sequences of words (fragments,
sentences)
● Markov assumption
● Product of each element probability conditional on
small preceding sequence
– N-grams: bigrams: single preceding word, trigrams: two
preceding words
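A bigram model under the Markov assumption can be sketched with simple counts. The tiny corpus, the `<s>` start marker and the function names are assumptions. The estimate is unsmoothed, so any unseen bigram gives probability zero.

```python
from collections import Counter

def train_bigrams(sentences):
    """Count bigrams and how often each word appears as the preceding word."""
    contexts, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.split()
        contexts.update(words[:-1])
        bigrams.update(zip(words, words[1:]))
    return contexts, bigrams

def sequence_probability(sentence, contexts, bigrams):
    """Product of P(word | previous word) over the sentence."""
    words = ["<s>"] + sentence.split()
    prob = 1.0
    for prev, word in zip(words, words[1:]):
        if contexts[prev] == 0:
            return 0.0                         # previous word never seen as a context
        prob *= bigrams[(prev, word)] / contexts[prev]
    return prob

contexts, bigrams = train_bigrams(["the cat sat", "the dog sat"])
seen = sequence_probability("the cat sat", contexts, bigrams)
unseen = sequence_probability("the bird sat", contexts, bigrams)
```

A trigram model would condition on the two preceding words in the same way, at the cost of sparser counts.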
NLP - Information Extraction
● Find and understand relevant parts of texts
● Gather information from many sources
● Produce structured representation
● Relations, knowledge base
● Resource Description Framework (RDF)
● Retrieval
● Finding unstructured material in a large collection
● Web/email search, knowledge bases, legal data,
health data, etc.
NLP - Performance

Machine learning

  • 2. What is it? ● Grew out of work in Artificial Intelligence (AI) ● 1959 Arthur Samuel – Machine Learning: ● „Field of study that gives computers the ability to learn without being explicitly programmed.” ● 1998 Tom Mitchell – Well posed learning problem: ● „A computer program is said to 'learn' from experience 'E' with respect to some task 'T' and some performance measure 'P', if its performance on 'T', as measured by 'P', improves with experience 'E'.”
  • 3. What is it? ● Example: ● Email program (experience) – 'E' – watches you label emails as spam/not spam (task) – 'T' – classifies emails as spam/not spam (performance) – 'P' – fraction of emails correctly classified as spam/not spam
  • 4. What is it? ● Solves complicated, underspecified problems ● Some problems can't be solved directly by hand-written software ● Instead of writing a program for each problem: ● Collect samples of correct input->output ● Use an algorithm to create a program that does the same ● Program handles new cases (other than those in the training data); retrain when new data arrives ● Massive amounts of data + computation can be cheaper than developing software http://technocalifornia.blogspot.com/2012/07/more-data-or-better-models.html
  • 5. Problems for Machine Learning ● Pattern recognition ● Objects in real scenes ● Computer vision – facial identities / expressions ● Speech recognition – Sample sounds – Partition phonemes – Decoding – extract meaning, NLP ● Natural language
  • 6. Problems for Machine Learning ● Recognizing anomalies ● Unusual sequences – Credit / phone fraud – SPAM / HAM ● Sensor readings – Power plant operation and health – Detect when actions are required
  • 7. Problems for Machine Learning ● Prediction ● Stock price movements (time sequence) ● Currency exchange rates ● Risk analytics ● Sentiment analysis ● Click-throughs (web traffic) ● Preferences – Netflix, Amazon, Pandora, web ad targeting, etc.
  • 8. Problems for Machine Learning ● Information Retrieval (database mining) ● Genomics ● News/Twitter data feeds ● Archived data ● Web clicks ● Medical records ● Find similar, summarize groups of material
  • 9. Learning - Supervised ● Predict output given the input, train using inputs with known outputs ● Regression – target is a real number, goal is to be 'close' ● Classification – target is a class label: binary (yes/no) or multi-class (one of many)
  • 10. Learning – Unsupervised ● Older texts explicitly exclude this from being learning! ● Discover good internal representation of input ● Difficult to determine what the goal is ● Create a representation that can be used in subsequent supervised learning? ● Dimensionality reduction (PCA) can be used for compression or to simplify analysis ● Provide an economical high dimensional representation (binary features, real features – single largest parameter)
  • 11. Learning – Reinforcement ● Select action to maximize payoff ● Maximize expected sum of future rewards ● Not every action results in a payoff ● Apply discounting to reduce the effect of the far future on present decisions ● Difficult – payoffs are delayed, critical decision points are unknown, a scalar payoff contains little information
  • 12. Learning – Reinforcement ● Planning ● Choice of actions by anticipating outcomes ● Actions and planning can be interleaved (incomplete knowledge) – Warehouse, dock management, Route planning/replanning ● Multiple simultaneous agents planning independently – Emergency responders – http://www.aiai.ed.ac.uk/project/i-globe/resources/2007-03-06-Iglobe/2007-03-06-Iglobe-Demo.avi
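The discounting mentioned on slide 11 can be illustrated with a short sketch. It assumes a fixed discount factor, here called `gamma` (a name not used in the slides), and a known list of rewards; a real agent would have to estimate the expected value rather than sum a known sequence.

```python
def discounted_return(rewards, gamma):
    """Sum of rewards where the reward at step t is weighted by gamma ** t."""
    total = 0.0
    # Work backwards so each step needs only one multiply and one add.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# A payoff that arrives two steps in the future counts for less today.
print(discounted_return([0.0, 0.0, 1.0], gamma=0.5))  # 0.25
```

With `gamma` close to 1 the far future matters almost as much as the present; with `gamma` close to 0 only immediate rewards count.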
  • 13. Learning – Data ● Training data [ ~60% - 80% ] ● Inputs (with correct response for supervised) ● Validation data [ ~20% ] ● Converge by training on multiple sets of data, improving each time ● Test data [ ~10% - 20% ] ● Not used until training and validation are complete – measure performance with this data set
  • 14. Learning – Data ● Partition randomly ● For time series data, use random subsequences ● Training and test data should be from the same population ● If feature selection or model tuning is required (e.g. PCA parameter mapping), then the tuning must be done for each training set
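A random partition along the lines of slides 13 and 14 might look like the sketch below. The function name `split_data`, the 60/20/20 fractions and the fixed seed are illustrative assumptions. The sketch treats samples as independent, so it does not cover the time series case, where whole subsequences would need to be kept together.

```python
import random


def split_data(samples, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle the samples, then cut them into train, validation and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains is held back for testing
    return train, val, test


train, val, test = split_data(range(100))
print(len(train), len(val), len(test))
```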
  • 15. Learning – Training ● One iteration for each set of input data in the training data set ● Start with random parameters ● Randomize input data during training ● Calculate model parameters for each input ● Use previous parameter values to calculate next values using new training input
  • 16. Learning – Bias and Variance ● Bias – error from the algorithm's simplifying assumptions ● High bias – underfit ● More training data does not help ● Variance – sensitivity to fluctuations in the training data ● High variance – overfit ● More training data likely to help ● Irreducible error – noise
  • 17. Learning – Bias and Variance
  • 18. Learning – (Cross) Validation ● Validation ● Holdout data for tuning model with new data ● Evaluate model using holdout as test set ● Cross validation ● Generate models with different holdouts to avoid overfitting ● n-fold – divide data into n chunks and train n times, treating a different chunk as the holdout each time (leave-one-out – same with chunk size of 1) ● Random subsampling – approaches leave-p-out
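The n-fold procedure on slide 18 can be sketched as an index generator. The helper name `k_fold_indices` is hypothetical, and the sketch assumes the data has already been shuffled, since it cuts the folds in order.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train_indices, holdout_indices) once per fold."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        holdout = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, holdout
        start += size


for train, holdout in k_fold_indices(6, 3):
    print("holdout:", holdout, "train:", train)
```

Setting `n_folds` equal to `n_samples` gives leave-one-out, where each holdout is a single sample.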
  • 19. Learning - Improvements ● Things to do when the error is too high ● Get more training data (high variance) ● Try smaller sets of features (high variance) ● Try getting additional features (high bias) ● Add polynomial features (high bias) ● Decrease smoothing parameter λ (high bias) ● Increase smoothing parameter λ (high variance)
  • 20. Learning – Testing ● Reserve set of data [~10% - 20% ] ● Evaluate model performance with the test set ● Make no further model changes ● Performance evaluation ● Supervised learning – compare predictions with known results ● Predictions of unsupervised model when results can be known – even if not used in training
  • 21. Training - Gradient Descent ● Find minimum of a cost / performance metric
  • 22. Training – Gradient Descent ● Convex cost function (e.g. squared error for linear regression) ● Well behaved ● Single global minimum, easily reached
  • 23. Training – Gradient Descent ● Complex cost functions ● Not well behaved ● Global minimum, many local minima
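  • 24. Training – Gradient Descent ● Convergence speed and stability controlled by the step size (learning rate) parameter α ● Low α – stable but slow convergence ● High α – fast, but can overshoot the minimum or diverge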
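The effect of α on gradient descent can be seen on a one-dimensional example. The cost J(x) = (x - 3)^2 and the three α values below are assumptions chosen for illustration, not taken from the slides.

```python
def gradient_descent(grad, x0, alpha, steps):
    """Repeatedly step against the gradient, scaled by alpha."""
    x = x0
    for _ in range(steps):
        x = x - alpha * grad(x)
    return x


def grad(x):
    """Gradient of the cost J(x) = (x - 3) ** 2, which has its minimum at x = 3."""
    return 2.0 * (x - 3.0)


print(gradient_descent(grad, 0.0, alpha=0.001, steps=100))  # low alpha: still far from 3
print(gradient_descent(grad, 0.0, alpha=0.1, steps=100))    # moderate alpha: very close to 3
print(gradient_descent(grad, 0.0, alpha=1.1, steps=100))    # high alpha: overshoots and diverges
```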
  • 25. Training – k-means ● Classify data into k different groups ● Start with k random points ● Group data with the closest point ● Move the points to the centroid of the data for that point ● Terminate when the points no longer move (or move only a small amount)
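The k-means steps listed on slide 25 can be sketched as follows. Two choices here are assumptions rather than part of the slide: the starting points are drawn from the data itself, and the loop stops only when the centroids stop moving exactly (a tolerance would also work).

```python
import math
import random


def mean_point(cluster):
    """Centroid: the coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(cluster) for coords in zip(*cluster))


def k_means(points, k, seed=0, max_iter=100):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start with k random data points
    clusters = []
    for _ in range(max_iter):
        # Group every point with its closest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its group (keep it if the group is empty).
        new_centroids = [mean_point(c) if c else centroids[i]
                         for i, c in enumerate(clusters)]
        if new_centroids == centroids:  # nothing moved, so we are done
            break
        centroids = new_centroids
    return centroids, clusters


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = k_means(points, k=2)
print(sorted(centroids))
```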
  • 27. Training – k-nn ● k nearest neighbors determine classification of each element in data ● Skewed data can result in homogeneous result ● Use weighting to avoid this ● Training – store the training data ● For each data point to be predicted ● Locate the nearest k other points – Use any consistent distance metric – Lp norms (Euclidean, Manhattan distances, maximum single direction) ● Assign the majority class of those nearest points
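A minimal k-nn classifier following slide 27 is sketched below. It assumes Euclidean distance and unweighted majority voting, so it does not include the weighting the slide suggests for skewed data. The function name `knn_predict` and the sample data are illustrative.

```python
import math
from collections import Counter


def knn_predict(train, query, k):
    """train is a list of (point, label) pairs; 'training' is just storing it."""
    # Locate the k stored points closest to the query.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Assign the majority class among those neighbors.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]
print(knn_predict(train, (0.5, 0.5), k=3))  # a
print(knn_predict(train, (5.2, 5.1), k=3))  # b
```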
  • 29. Types of Machine Learning ● Regressions ● Neural Networks ● Dimensionality reduction ● Support Vector Machines (SVM) ● Principal Component Analysis (PCA) ● Clustering ● Classification ● Probabilistic – Bayes, Markov ● ...others...
  • 30. Regression ● Single / Multiple variable ● Linear / Logistic ● Regularization (smoothing) – helps to avoid overfitting
  • 31. Regression – Equations ● Linear regression hypothesis function ● Logistic regression hypothesis function ● Regularized linear regression cost function ● Regularized logistic regression cost function
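The equations named on slide 31 are not present in this transcript. For reference, the common textbook forms are sketched below. The notation is an assumption and may differ from the original slide: θ is the parameter vector, m the number of training examples, n the number of features, λ the regularization (smoothing) parameter, and (x^{(i)}, y^{(i)}) the i-th training example. The bias term θ_0 is left out of the penalty, as is conventional.

```latex
% Linear regression hypothesis
h_\theta(x) = \theta^{T} x

% Logistic regression hypothesis
h_\theta(x) = \frac{1}{1 + e^{-\theta^{T} x}}

% Regularized linear regression cost
J(\theta) = \frac{1}{2m} \left[ \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^{2}
            + \lambda \sum_{j=1}^{n} \theta_j^{2} \right]

% Regularized logistic regression cost
J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)})
            + \left( 1 - y^{(i)} \right) \log \left( 1 - h_\theta(x^{(i)}) \right) \right]
            + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^{2}
```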
  • 32. Neural Networks - Representation ● Nodes – compared to neurons, many inputs, one output ● Transfer characteristic – logistic function ● Input from left, output to right ● Layers – Input layer, driven by numeric input values – Output layer, provides numeric output values (or thresholded for classification output) – Hidden layers between input and output – no discernible meaning for their values
  • 33. Neural Networks - Representation
  • 34. Neural Networks – Learning ● Learns using gradient descent ● Forward propagation – start at inputs, compute the outputs of each next stage ● Backward propagation – start at outputs, propagate the error backward and adjust parameters to produce the desired output
  • 35. Neural Networks - Learning ● OCR training set ● what does the number '2' look like when handwritten?
  • 36. Neural Networks - Learning ● Neural Network parameters are not simply interpretable
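The forward and backward passes from slide 34 can be shown on the smallest possible case: a single node with a logistic transfer function, trained by gradient descent. This is a sketch of one node, not a full multi-layer network, and the OR data set, the zero starting weights, `alpha` and `epochs` are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Logistic transfer function."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, inputs):
    """Forward pass: weighted sum of the inputs pushed through the logistic function."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train_neuron(samples, alpha=0.5, epochs=2000):
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            out = predict(weights, bias, inputs)  # forward: compute the output
            # Backward: with a cross-entropy cost, the gradient with respect to
            # the weighted sum is simply (output - target).
            error = out - target
            weights = [w - alpha * error * x for w, x in zip(weights, inputs)]
            bias -= alpha * error
    return weights, bias


# Logical OR is linearly separable, so one node is enough to learn it.
or_samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train_neuron(or_samples)
for inputs, target in or_samples:
    print(inputs, target, round(predict(weights, bias, inputs), 3))
```

In a network with hidden layers, the same error signal is passed backward layer by layer using the chain rule.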
  • 37. Support Vector Machines ● Supervised learning classification and regression algorithm ● Cocktail Party Problem ● Many speakers, many sensors (microphones) ● Separate the individual sources from the mixed inputs ● Octave one-liner based on singular value decomposition (svd): [W,s,v] = svd((repmat(sum(x.*x,1),size(x,1),1).*x)*x');
  • 38. Principal Component Analysis ● Unsupervised learning ● Finds basis vectors for data ● The basis vector with the largest variance is the 'principal' component ● Center each attribute on mean for visualization, not for prediction models ● Normalize to the same range to provide comparable contributions from each factor
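Finding the first basis vector from slide 38 can be sketched without any libraries by running power iteration on the covariance matrix. This is one of several ways to compute it and is an assumption here; the slides do not say which method is used. The sketch centers the data but does not rescale it, and it assumes the starting vector of all ones is not orthogonal to the answer.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return a unit vector along the direction of largest variance."""
    n, d = len(rows), len(rows[0])
    # Center each attribute on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration: multiplying by cov repeatedly pulls v toward the
    # eigenvector with the largest eigenvalue.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Points on the line y = 2x vary only along the direction (1, 2).
print(first_principal_component([(1, 2), (2, 4), (3, 6), (4, 8)]))
```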
  • 40. Classification ● Logistic partitioning – classification boundary
  • 47. Classification - Performance ● Receiver Operating Characteristic (ROC) ● Plots true positive rate against false positive rate to locate classification performance ● Perfect predictions indicated in upper left corner ● Up and to the left means better ● Diagonal from lower left to upper right indicates performance equivalent to random guessing
  • 48. Classification - Performance ● Receiver Operating Characteristic (ROC)
  • 49. Classification - Performance ● Area Under the Curve (AUC) ● ROC chart with curves applied ● Classifications based on thresholds for continuous random variables ● Curve is parametric plot with the threshold as the varying parameter ● AUC is a scalar summary of predictive value
  • 50. Classification - Performance ● Area Under the Curve (AUC)
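The ROC curve and AUC from slides 47 to 50 can be computed directly from classifier scores. The sketch below assumes binary labels coded as 1 (positive) and 0 (negative), with at least one of each, and it integrates with the trapezoid rule. The function names are illustrative.

```python
def roc_points(scores, labels):
    """(false positive rate, true positive rate) pairs as the threshold drops."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        tp += label
        fp += 1 - label
        # Emit one point per distinct threshold, so tied scores move together.
        if i == len(ranked) - 1 or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


scores = [0.9, 0.8, 0.7, 0.6]
print(auc(roc_points(scores, [1, 1, 0, 0])))  # 1.0, a perfect ranking
print(auc(roc_points(scores, [1, 0, 1, 0])))  # 0.75
```

An AUC of 0.5 corresponds to the diagonal, which is the performance of random guessing.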
  • 51. Natural Language Processing ● Text processing ● Modeling ● Generative models – generate observed data from hidden parameters – N-gram, Naive Bayes, HSMM, CFG ● Discriminative models – estimate probability of hidden parameters from observed data – Regressions, maximum entropy, conditional random fields, support vector machines, neural networks
  • 52. NLP - Language Modeling ● Probability of sequences of words (fragments, sentences) ● Markov assumption ● Product of each element's probability, conditional on a small preceding sequence – N-grams – bigrams: single preceding word; trigrams: two preceding words
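The bigram case of the Markov assumption on slide 52 can be sketched with simple counts. The start and end markers `<s>` and `</s>` and the two-sentence corpus are assumptions for illustration. No smoothing is applied, so any unseen word pair gives a probability of zero.

```python
from collections import Counter


def train_bigrams(sentences):
    """Count each word as a context and each adjacent word pair."""
    context_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        context_counts.update(words[:-1])  # every word that has a successor
        pair_counts.update(zip(words, words[1:]))
    return context_counts, pair_counts


def sentence_probability(sentence, context_counts, pair_counts):
    """Product of P(word | previous word) over the sentence."""
    words = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, word in zip(words, words[1:]):
        if context_counts[prev] == 0:
            return 0.0  # the context was never seen in training
        prob *= pair_counts[(prev, word)] / context_counts[prev]
    return prob


contexts, pairs = train_bigrams(["the cat sat", "the dog sat"])
print(sentence_probability("the cat sat", contexts, pairs))  # 0.5
print(sentence_probability("the cat ran", contexts, pairs))  # 0.0
```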
  • 53. NLP - Information Extraction ● Find and understand relevant parts of texts ● Gather information from many sources ● Produce structured representation ● Relations, knowledge base ● Resource Description Framework (RDF) ● Retrieval ● Finding unstructured material in a large collection ● Web/email search, knowledge bases, legal data, health data, etc.