ML 101
SAMEER MAHAJAN
ML Paradigm Shift
• Traditional programming: input + f → output
• ML: input / data / features (x) + output / labels (y) → f
• Diagram labels: Insight, Intelligence, Model
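To make the shift concrete, here is a minimal sketch (not from the slides). In traditional programming f is written by hand; in ML, f is recovered from example (x, y) pairs. The toy data, the function names and the choice of a closed-form least-squares fit are all assumptions made for illustration.

```python
# Traditional programming: a person writes f, then applies it to inputs.
def handwritten_f(x):
    return 1 + 2 * x

# ML: we only have example inputs and outputs, and we learn f from them.
# Toy data (assumed for illustration), generated by y = 1 + 2 * x.
xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]

def learn_line(xs, ys):
    """Fit y = w0 + w1 * x by ordinary least squares (closed form)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w1 = sxy / sxx
    w0 = mean_y - w1 * mean_x
    return w0, w1

w0, w1 = learn_line(xs, ys)

def learned_f(x):
    return w0 + w1 * x

print(handwritten_f(10), learned_f(10))
```

On this noise-free toy data the learned f agrees with the handwritten one; on real data it would only approximate the underlying relationship.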
Types of problems and solutions
• Regression: real valued output, predicting house prices
• Classification: product reviews
• Clustering: unsupervised learning, document retrieval
• Recommender systems: product recommendation
• Deep learning: neural networks
• Time series: forecasting
• NLP / NLU: GPT-3
• Image Recognition – Computer Vision
• Acoustics
Regression
• Predicting house prices
• Linear regression
• Logistic regression (despite the name, used for classification)
• Multivariate
• Polynomial Regression
• Ridge
• Lasso
Linear regression
y = f(x) = w0 + w1 * x
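Cost / Loss Function
• RSS(w0, w1) = residual sum of squares = sum over all data points of (y - (w0 + w1 * x))^2
• Plot: RSS(w0, w1) as a surface over w0 and w1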
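As a small illustration of the cost function, the sketch below computes the residual sum of squares for a candidate line y = w0 + w1 * x. The function name `rss` and the toy data are assumptions chosen for the example, not taken from the slides.

```python
def rss(w0, w1, xs, ys):
    """Residual sum of squares of the line y = w0 + w1 * x on the data."""
    return sum((y - (w0 + w1 * x)) ** 2 for x, y in zip(xs, ys))

# Toy data (assumed), lying exactly on y = 1 + 2 * x.
xs = [1, 2, 3]
ys = [3, 5, 7]

print(rss(1, 2, xs, ys))  # perfect fit → 0
print(rss(0, 2, xs, ys))  # intercept off by 1 on each of the 3 points → 3
```

Fitting a linear regression amounts to finding the (w0, w1) pair that makes this value as small as possible.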
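Gradient Descent
• Plot: y against x, with iterates (x_0, y_0), (x_1, y_1), (x_2, y_2), … stepping towards the minimum (x_min, y_min)
• Gradient / slope at point (x_n, y_n) = dy/dx evaluated at x_n
• Update: x_n = x_(n-1) - alpha * (dy/dx evaluated at x_(n-1))
• alpha = learning rate; step = x_n - x_(n-1) = -alpha * gradient
• At the minimum, gradient = 0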
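The iteration can be sketched in a few lines: each step moves x against the gradient by alpha times the slope, x_n = x_(n-1) - alpha * gradient. The example function y = (x - 3)^2, the starting point and the learning rate are assumptions picked so the behaviour is easy to follow.

```python
def gradient_descent(grad, x0, alpha=0.1, steps=100):
    """Repeat x_n = x_(n-1) - alpha * grad(x_(n-1)) and return every iterate."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] - alpha * grad(xs[-1]))
    return xs

# Assumed example: y = (x - 3)^2, so dy/dx = 2 * (x - 3) and the minimum is at x = 3.
path = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(path[:3], path[-1])
```

The steps shrink as the gradient approaches zero near the minimum. For this particular function, a learning rate above 1 would make the iterates overshoot and diverge.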
Tools & Technologies
• Jupyter Notebook
• Python
• NumPy
• pandas
• Matplotlib
• scikit-learn, TensorFlow, PyTorch
Solving Linear Regression
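import pandas as pd
import matplotlib.pyplot as plt
from sklearn import linear_model
from sklearn.model_selection import train_test_split

# Load the house sales data
sales = pd.read_csv('seattle_house_sales.csv')
# Assumption: the slide does not show how `train` was built; an 80/20 split is used here
train, test = train_test_split(sales, test_size=0.2, random_state=0)

# Fit price as a linear function of living area (a single feature)
X_train = train['sqft_living'].values.reshape(-1, 1)
y_train = train['price'].values.reshape(-1, 1)
regr = linear_model.LinearRegression()
sqft_model = regr.fit(X_train, y_train)

# Plot the training points (black) and the fitted line (blue)
plt.scatter(train['sqft_living'], train['price'], color='black')
plt.plot(train['sqft_living'], sqft_model.predict(X_train), color='blue', linewidth=3)
plt.xticks(())
plt.yticks(())
plt.show()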
Classification
• Sentiment analysis
• Analyze restaurant reviews
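A minimal sketch of sentiment classification, assuming a tiny hand-made set of labelled reviews: each review becomes a bag-of-words vector and a logistic model is trained with batch gradient descent. The data, names and hyperparameters are illustrative assumptions. In practice a library such as scikit-learn would normally be used instead of hand-written training code.

```python
import math

# Toy labelled reviews (assumed): 1 = positive, 0 = negative.
reviews = [
    ("great food", 1),
    ("great service", 1),
    ("terrible food", 0),
    ("terrible service", 0),
]

vocab = sorted({word for text, _ in reviews for word in text.split()})

def featurize(text):
    """Bag-of-words counts over the training vocabulary."""
    words = text.split()
    return [words.count(v) for v in vocab]

def predict_proba(weights, bias, text):
    """Probability that the review is positive under a logistic model."""
    score = bias + sum(w * x for w, x in zip(weights, featurize(text)))
    return 1.0 / (1.0 + math.exp(-score))

# Train with batch gradient descent on the logistic (log) loss.
weights, bias, alpha = [0.0] * len(vocab), 0.0, 0.5
for _ in range(200):
    grad_w, grad_b = [0.0] * len(vocab), 0.0
    for text, label in reviews:
        error = predict_proba(weights, bias, text) - label
        for j, x in enumerate(featurize(text)):
            grad_w[j] += error * x
        grad_b += error
    weights = [w - alpha * g for w, g in zip(weights, grad_w)]
    bias -= alpha * grad_b

# "pizza" is not in the vocabulary, so only the sentiment word counts.
print(predict_proba(weights, bias, "great pizza"))
print(predict_proba(weights, bias, "terrible pizza"))
```

The model learns a positive weight for "great" and a negative one for "terrible", while words that appear equally in both classes ("food", "service") stay close to zero.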
Clustering
• Unsupervised learning
• Document retrieval
Number of occurrences of a word in a document
Document Vectors
• Nij = number of occurrences of word j in document i
            word1  word2  word3  …  wordn
Document1 ( N11    N12    N13    …  N1n )
Document2 ( N21    N22    N23    …  N2n )
…
Documentm ( Nm1    Nm2    Nm3    …  Nmn )
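The document vectors above can be built with plain word counts. This is a sketch on three made-up documents; the variable names and the whitespace tokenization are assumptions, and real pipelines usually add lowercasing, stop-word removal or TF-IDF weighting.

```python
from collections import Counter

# Toy documents (assumed for illustration).
documents = [
    "the cat sat on the mat",
    "the dog sat",
    "cat and dog",
]

# Vocabulary: every distinct word, in a fixed (sorted) column order.
vocab = sorted({word for doc in documents for word in doc.split()})

def to_vector(doc):
    """Row of the document-term matrix: one count per vocabulary word."""
    counts = Counter(doc.split())
    return [counts[word] for word in vocab]

vectors = [to_vector(doc) for doc in documents]

print(vocab)
for row in vectors:
    print(row)
```

Each row corresponds to one document and each column to one word. Document retrieval and clustering can then compare rows, for example with a distance or similarity measure.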
Recommenders
• Netflix movie recommendations based on user ratings
• Song recommender based on user listen count
• Facebook friend recommender
• Popularity based: not personalized
• Classification based: features may not be available
• Co-occurrence based: who bought this also bought…
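A co-occurrence recommender of the "who bought this also bought" kind can be sketched by counting how often two items appear in the same basket. The purchase data and the helper name `also_bought` are hypothetical.

```python
from collections import defaultdict
from itertools import permutations

# Toy purchase histories (assumed): one set of items per user.
baskets = [
    {"phone", "case", "charger"},
    {"phone", "case"},
    {"phone", "charger"},
    {"laptop", "charger"},
]

# co_counts[a][b] = number of users who bought both a and b.
co_counts = defaultdict(lambda: defaultdict(int))
for basket in baskets:
    for a, b in permutations(basket, 2):
        co_counts[a][b] += 1

def also_bought(item, top_n=2):
    """Items most often bought together with `item`, most frequent first."""
    ranked = sorted(co_counts[item].items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]

print(also_bought("phone"))
```

Raw counts favour popular items, so practical systems usually normalize them (for example by how often each item is bought overall) before ranking.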
Time series - fbprophet (packaged as prophet since version 1.0)
• Semi-supervised
• Time as feature
• Data as y
• Components
  • Trend: upward / downward
  • Seasonality: e.g. day of the week
  • Cycle: e.g. every 5 years
  • Noise
  • Usually a combination of the above components
• Forecasting
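To illustrate how a series combines components, the sketch below builds a synthetic daily series from a linear trend plus a weekly pattern, then separates them again with a centered 7-day moving average. This is a hand-rolled additive decomposition on assumed data, not how Prophet itself fits its model.

```python
# Synthetic daily series (assumed): linear upward trend plus a weekly pattern.
weekly_pattern = [3, 1, -2, 0, -1, -3, 2]  # sums to zero over one week
series = [2 * t + weekly_pattern[t % 7] for t in range(28)]

def centered_moving_average(values, window=7):
    """Average over a full week centered on each day; None near the edges."""
    half = window // 2
    out = [None] * len(values)
    for t in range(half, len(values) - half):
        out[t] = sum(values[t - half : t + half + 1]) / window
    return out

trend = centered_moving_average(series)

# Whatever is left after removing the trend is the seasonal part
# (this toy series has no noise or longer cycle).
seasonal = [s - m if m is not None else None for s, m in zip(series, trend)]

print(trend[3:6])      # → [6.0, 8.0, 10.0]
print(seasonal[7:14])  # → [3.0, 1.0, -2.0, 0.0, -1.0, -3.0, 2.0]
```

Averaging over exactly one week cancels the weekly pattern, which is why the moving average recovers the trend 2 * t here. With real data the remainder would also contain noise.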
Future
• Acoustics - Speech recognition
• Video processing
• Robotics
• AlphaGo Zero
• Self-driving cars
Challenges
• Model selection
• Feature engineering
• Scaling
• Data
• Model
• Special architectures
• Parallel processing
• GPUs
Next steps – Online courses
• https://github.com/sameermahajan/MLWorkshop
• Coursera
  • Machine learning specialization
  • Machine learning by Andrew Ng, Stanford
  • Deep learning specialization
• Udemy
  • Machine Learning A to Z
  • Deep Learning A to Z
• Udacity
  • Machine Learning Engineer
  • Deep Learning Foundation Nanodegree Program
Next steps - contd
• Online competitions
  • Kaggle
• Online datasets to play with
  • https://www.kaggle.com/datasets
  • http://mldata.org/repository/data/
  • http://archive.ics.uci.edu/ml/index.php
  • http://deeplearning.net/datasets/
  • https://deeplearning4j.org/opendata
  • https://catalog.data.gov/dataset
• Formulate your own problem, gather data, model, evaluate and keep refining it further
Q&A

Introduction to Machine Learning
