S H I W A N I
G U P T A
M A C H I N E
L E A R N I N G
1
S Y L L A B U S
Introduction to Machine Learning (1) 6
Machine Learning terminology, Types of Machine Learning, Issues in Machine Learning, Application of Machine
Learning, Steps in developing ML application, How to choose the right algorithm
Data Preprocessing (3) 10
Data Cleaning (missing value, outlier), Exploratory Data Analysis (descriptive statistics, Visualization), Feature
Engineering (Data Transformation (encoding, skew, scale), Feature selection)
Supervised Learning with Regression (1) 5
Simple Linear, Multiple Linear, Polynomial, Overfit/Underfit, Regularization, Evaluation Metric, Use case
Supervised Learning with Classification (3) 12
k Nearest Neighbor, Logistic Regression, Linear SVM, Kernels, Decision Tree (CART), Issues in DT learning,
Ensembles (Bagging – Random Forest, Boosting – Gradient Boost), Evaluation metric, Use case
Optimization Techniques (2) 6
Model Selection techniques (Cross Validation), Gradient Descent Algorithm, Grid Search method, Model Evaluation
technique (Bias, Variance)
Unsupervised Learning with clustering and Reinforcement Learning (2) 6
k Means algorithm, Dimensionality Reduction, Use case, Elements of Reinforcement Learning, Temporal Difference
Learning, Online Learning, Use case
2
M O D U L E 1 ( 6 H O U R )
• Machine Learning terminology
• Types of Machine Learning
• Issues in Machine Learning
• Application of Machine Learning
• Steps in developing ML application
• How to choose the right algorithm
3
S / W A N D H / W R E Q U I R E M E N T
16+ GB RAM, 4+ CORES, SSD storage, Amazon AWS, MS Azure, Google cloud
Python Data Science S/W stack (pip, conda)
NumPy – Linear Algebra
Pandas – Data read / process
Scikit-Learn – ML algo
Matplotlib – Visualization
Seaborn – more aesthetically pleasing
Plotly – interactive visualization library
t-SNE – high dimensional visualization
statsmodels – statistical models
SciPy – optimization
Tkinter – GUI lib for python
PyTorch – open source framework
Keras – high level API and open source framework
TensorFlow - open source framework
Theano – multidim array manipulation
NLTK – human language data
BeautifulSoup – navigating webpage
Bokeh – interactive visualizations
TextBlob – process textual data
SHAP – SHapley Additive exPlanations
xAI – eXplainable AI
•IDE – Spyder, Jupyter notebook, PyCharm, Google Colab
4
PROJECT
5
PROJECT
6
import numpy as np
np.array([1, 2, 3])        # rank-1 array
b.shape                    # (rows, cols)
a[:2, 1:3]                 # first 2 rows, columns 1 and 2
x.dtype                    # data type, e.g. int64, float64
np.reshape(v, (3, 1)) * w  # reshape v into a column, then broadcast against w
PROJECT
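import pandas as pd
pd.read_csv('data.csv')    # load a CSV file into a DataFrame
pd.DataFrame(mydataset)    # build a DataFrame from a dict or list
df.head(10)                # first 10 rows
df.tail()                  # last 5 rows
df.dropna()                # drop rows with missing values
df.corr()                  # pairwise correlation between columns
df.plot()                  # plot the columns (uses Matplotlib)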
P R E R E Q U I S I T E S
• Probability and Statistics (random variables, probability distributions, statistics: mean,
median, mode, variance, standard deviation, covariance, Bayes' theorem,
entropy)
• Linear Algebra (matrices, vectors, tensors, eigenvalues,
eigenvectors)
• Calculus (functions, derivatives of single variable and
multivariate functions)
• Python language
• Structured thinking, communication and problem solving
• Business understanding
7
W H Y I S M L G E T T I N G A T T E N T I O N R E C E N T L Y
This development is driven by a few underlying forces:
• The amount of data generated is increasing significantly with reduction in the cost of
sensors
• The cost of storing this data has reduced significantly
• The cost of computing has come down significantly
• Cloud has democratized compute for the masses
8
FUTURE
M L V S A U T O M A T I O N
• If you are thinking that machine learning is nothing but a new name for automation – you
would be wrong. Most of the automation which has happened in the last few decades has
been rule-driven automation. For example – automating flows in our mailbox needs us to
define the rules. These rules act in the same manner every time.
• On the other hand, machine learning helps machines learn from past data and change their
decisions/performance accordingly. Spam detection in our mailboxes is driven by machine
learning. Hence, it continues to evolve with time.
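The contrast between a fixed rule and a learned filter can be sketched in a few lines. This is only an illustration, not something from the slides: the messages, the function names and the simple word-scoring scheme are all assumptions chosen for brevity.

```python
from collections import Counter

def rule_based_is_spam(message):
    """Rule-driven automation: a fixed rule that someone wrote by hand."""
    return "lottery" in message.lower()

def train_word_scores(examples):
    """Learn word scores from labelled data: +1 per spam use, -1 per non-spam use."""
    scores = Counter()
    for text, is_spam in examples:
        for word in set(text.lower().split()):
            scores[word] += 1 if is_spam else -1
    return scores

def learned_is_spam(message, scores):
    """Flag a message when its words lean towards spam overall."""
    return sum(scores[w] for w in set(message.lower().split())) > 0

# Hypothetical training data (label True = spam).
training = [
    ("win a free prize now", True),
    ("free prize waiting claim now", True),
    ("meeting agenda for monday", False),
    ("lunch on monday", False),
]
scores = train_word_scores(training)

new_message = "claim your free prize"
print(rule_based_is_spam(new_message))       # the fixed rule only knows "lottery"
print(learned_is_spam(new_message, scores))  # decision derived from the examples
```

Adding new labelled messages and retraining changes the learned filter's behaviour without anyone editing a rule, which is the sense in which it evolves with time.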
9
PROJECT
D E F I N I T I O N
“A computer program is said to learn from experience E with respect to some class of tasks T and performance
measure P, if its performance at tasks in T, as measured by P, improves with experience E” - Tom Mitchell
“Machine learning enables a machine to automatically learn from data, improve performance from experiences,
and predict things without being explicitly programmed.”
“Machine learning is a subfield of artificial intelligence, which enables machines to learn from past data or
experiences without being explicitly programmed.”
“Science of getting computers to act without explicit programming” - Arthur Samuel
10
EXAM
S C I E N C E O F T E A C H I N G M A C H I N E S H O W T O L E A R N B Y S E L F
Eg. the task of mopping and cleaning the floor.
• When a human does the task – the quality of outcome would vary. The human would get exhausted / bored after a few hours of
work. The human would also get sick at times. Depending on the place – it could also be hazardous or risky for a human.
• Machines can do high frequency repetitive tasks with high accuracy without getting tired. On the other hand, if we can teach
machines to detect whether the floor needs cleaning and mopping and how much cleaning is required based on the condition of
the floor and the type of the floor, machines would be far better in doing the same job. They can go on to do that job without
getting tired or sick!
• This is what Machine Learning aims to do - enable machines to learn on their own.
In order to answer questions like:
• Whether the floor needs cleaning and mopping?
• How long does the floor need to be cleaned?
• Machines need a way to think and this is precisely where machine learning models help. The machines capture data from the
environment and feed it to the machine learning model. The model then uses this data to predict whether the floor needs cleaning
and, if so, for how long.
11
H O W D O M A C H I N E S L E A R N
• Tasks difficult for humans can be very simple for machines. e.g. multiplying very large numbers.
• Tasks which look simple to humans can be very difficult for machines!
• You only need to demonstrate cleaning and mopping to a human a few times before they can perform it on
their own.
• But, that is not the case with machines. We need to collect a lot of data along with the desired outcomes in
order to teach machines to perform specific tasks.
• This is where machine learning comes into play. Machine Learning would help the machine understand the
kind of cleaning, the intensity of cleaning, and duration of cleaning based on the conditions and nature of the
floor.
12
T O O L S
Language
• R
• Python
• SAS
• Julia
• Java
• Scala
Database
• SQL
• Oracle
• Hadoop
Visualisation
• D3.js
• Tableau
• QlikView
13
FUTURE
T E R M I N O L O G Y
• Dataset (training, validation, testing)
• .csv file
• Structured vs unstructured data
• predictor, target, explanatory, independent, dependent, response variable
• Instance
• Features (numerical, discrete, categorical, ordinal, nominal)
• Model
• Hypothesis
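A minimal sketch of the first few terms, assuming a made-up housing dataset: each instance pairs a tuple of features (predictors) with a target, and the dataset is divided into training, validation and test sets. The function name, the 60:20:20 proportions and the data are illustrative assumptions.

```python
import random

def train_val_test_split(rows, val_frac=0.2, test_frac=0.2, shuffle=True, seed=0):
    """Split a list of instances into training, validation and test sets."""
    rows = list(rows)
    if shuffle:
        # Shuffle a copy reproducibly; pass shuffle=False for ordered (time series) data.
        random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test_frac)
    n_val = int(len(rows) * val_frac)
    n_train = len(rows) - n_val - n_test
    train = rows[:n_train]
    val = rows[n_train:n_train + n_val]
    test = rows[n_train + n_val:]
    return train, val, test

# Each instance is (features, target): features = (area_sqft, bedrooms), target = price.
dataset = [((500 + 50 * i, 1 + i % 4), 100 + 10 * i) for i in range(10)]

train, val, test = train_val_test_split(dataset)
print(len(train), len(val), len(test))  # 6 2 2
```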
14
PROJECT
F E A T U R E S
15
PROJECT
T Y P E S
16
EXAM
T Y P E S
• Supervised Learning – labelled (binary and multi class)
• Classification – discrete response eg. LoR, NB, kNN,
SVM, DT, RF, GBM, XGB, NN
Eg. spam filtering, waste classification
• Regression – continuous response eg. LR, SVR, DTR,
RFR
Eg. changes in temperature, stock price prediction
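The difference between the two supervised tasks can be sketched with two tiny models written from scratch: a least-squares line (continuous response) and a 1-nearest-neighbour classifier (discrete response). The data and function names are invented for illustration; in practice a library such as Scikit-Learn would normally be used.

```python
def fit_simple_linear_regression(xs, ys):
    """Least-squares fit of y = slope * x + intercept (regression)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict_1nn(train_points, train_labels, query):
    """Return the label of the closest training point (classification)."""
    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    nearest = min(range(len(train_points)),
                  key=lambda i: sq_dist(train_points[i], query))
    return train_labels[nearest]

# Regression: hypothetical hours studied -> exam score.
hours = [1, 2, 3, 4]
score = [52, 54, 56, 58]
slope, intercept = fit_simple_linear_regression(hours, score)
predicted_score = slope * 5 + intercept
print(predicted_score)  # 60.0

# Classification: hypothetical 2-feature messages labelled ham or spam.
points = [(1, 1), (2, 1), (8, 9), (9, 8)]
labels = ["ham", "ham", "spam", "spam"]
predicted_label = predict_1nn(points, labels, (7, 8))
print(predicted_label)  # spam
```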
17
EXAM
T Y P E S
• Unsupervised Learning - unlabelled
• Clustering eg. k means, hierarchical, NN
Eg. customer segmentation, city planning, cell phone tower for optimal signal reception
• Association eg. Apriori
Eg. diaper and beer, bread and milk
• Dimensionality Reduction eg. PCA, SVD
Eg. MNIST data (70000X784), face recognition (698X4096)
• Anomaly Detection eg. kNN, kMeans
Eg. Fraud detection, fault detection, outlier detection
• Semi supervised learning
• Speech Analysis, Web content classification, Google Expander
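As an illustration of the clustering entry above, here is a small from-scratch sketch of k-means (Lloyd's algorithm) on 2-D points. No labels are used. The points and the starting centroids are assumptions made for the example; real implementations usually pick starting centroids randomly or with k-means++.

```python
def k_means(points, centroids, n_iter=10):
    """Lloyd's algorithm for 2-D points, starting from the given centroids."""
    for _ in range(n_iter):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            dists = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append((sum(p[0] for p in cluster) / len(cluster),
                                      sum(p[1] for p in cluster) / len(cluster)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, clusters

# Hypothetical customers described by two features.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = k_means(points, centroids=[(1, 1), (8, 8)])
print(centroids)
print(clusters)
```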
18
EXAM
T Y P E S
• Reinforcement Learning maximise cumulative reward eg. Q-Learning, SARSA, DQN
Eg. robotic dog, Tic Tac Toe
• Neural Network eg. recognise dog
• Deep Learning eg. chat bot, real time bidding, recommender system
• Natural Language Processing eg. Lemmatisation, Stemming
Eg. customer service complaints, virtual assistant
• Computer Vision eg. Canny edge detection, Haar Cascade classifier
Eg. skin cancer diagnosis, detect real time traffic, guided surgery
• Evolutionary Learning (GA, Optimisation algorithms)
Eg. Super Mario
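To illustrate the reinforcement learning entry above, the following is a minimal sketch of tabular Q-learning. The environment is an assumption invented for the example: a five-cell corridor where the agent starts in cell 0 and receives a reward of 1 on reaching cell 4. The learning rate, discount factor and episode count are arbitrary illustrative choices.

```python
import random

N_STATES = 5                  # corridor cells 0..4; cell 4 is the goal
ACTIONS = ["left", "right"]

def step(state, action):
    """Deterministic environment: reward 1 only when the goal cell is reached."""
    nxt = max(state - 1, 0) if action == "left" else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)  # purely random exploration
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            # Move the estimate towards reward + discounted best future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
    return q

q = q_learning()
# Greedy policy: in each non-goal cell, pick the action with the higher Q-value.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)  # ['right', 'right', 'right', 'right']
```

Because Q-learning is off-policy, the agent can explore at random and still learn the values of the greedy policy, which here is to keep moving right towards the reward.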
19
EXAM
I S S U E S I N M A C H I N E L E A R N I N G
• What are the existing algorithms for learning?
• When will an algorithm converge?
• Which algorithms perform best for which kinds of problems?
• How much data is sufficient? eg. training to classify cat and dog
• Non representative training data e.g. Exit poll during elections
• Poor quality of data eg. Outliers, Missing values
• How many features required? Irrelevant features
• Overfitting training data
• Underfitting training data
• Computation power? eg. GPU and TPU for ML and DL
• Interpretability of model? eg. Why bank declined loan for customer
• How to improve learning?
• Optimization vs Generalization?
• New and better algorithms required
• Need for more data scientists
20
EXAM
P R O J E C T I D E A S ( 4 0 )
• Fraud detection
• Predict low oxygen level during surgery
• Recognise CVD factors
• Movie recommendation (Netflix)
• Marketing and Sales
• Weather prediction
• Traffic Prediction (Uber ATG)
• Loan defaulting prediction
• Handwriting recognition
• Sentiment analysis
• Human activity recognition
• Sports predictor
• Big Mart Sales prediction
• Fake news detection
• Disease prediction
• Stock market analysis
• Amazon Alexa
• Search Engine Optimization
• Auto-tagging and Friend suggestion (Facebook)
• Swiggy and Uber Eats
• House price prediction
• Market Analysis
• Handwritten digit recognition
• Equipment failure prediction
• Prospective insurance buyer
• Google News
• Video Surveillance
• Movie Ticket pricing system
• Object Detection
21
PROJECT
M L U S E C A S E I N S M A R T P H O N E S
• From the voice assistant that sets your alarm and finds you the best restaurants to the simple
use case of unlocking your phone via facial recognition – Machine Learning is truly
embedded in our favourite devices.
• Voice Assistants
• Smartphone Cameras
• App Store and Play Store Recommendations
• Face Unlock
22
EXAM
M L U S E C A S E I N T R A N S P O R T A T I O N
• The application of machine learning in the transport industry has gone to an entirely different
level in the last decade. This coincides with the rise of ride-hailing apps like Uber, Lyft, Ola,
etc. These companies use machine learning throughout their many products, from planning
optimal routes to deciding prices for the rides we take. So, let’s look at a few popular use
cases in transportation which use machine learning heavily.
• Dynamic Pricing in Travel
• Transporting and Commuting - Uber
• Google Maps
23
EXAM
M L U S E C A S E I N W E B S E R V I C E S
• We interact with certain applications every day multiple times. What we perhaps did not
realize until recently, most of these applications work thanks to the power and flexibility of
Machine Learning.
• Email Filtering
• Google Search
• Google Translate
• Facebook and LinkedIn Recommendations
24
EXAM
M L U S E C A S E I N S A L E S A N D M A R K E T I N G
• Top companies in the world are using Machine Learning to transform their strategies from top
to bottom. The two most impacted functions? Marketing and Sales!
• These days if you’re working in the Marketing and Sales field, you need to know at least one
Business Intelligence tool (like Tableau or Power BI). Additionally, marketers are expected to
know how to leverage Machine Learning in their day-to-day role to increase brand
awareness, improve the bottom line, etc.
• Recommendation Engine
• Personalized Marketing
• Customer Support (Chatbots)
25
EXAM
M L U S E C A S E I N F I N A N C I A L D O M A I N
• Most of the jobs in Machine Learning are geared towards the financial domain. And that
makes sense! This is the ultimate numbers field. A lot of banking institutions till recently used
to lean on Logistic Regression (a simple machine learning algorithm) to crunch these
numbers.
• Fraud Detection
• Personalized Banking
26
EXAM
S T E P S I N B U I L D I N G A M L A P P L I C A T I O N
• Frame the business problem as an ML problem
• What is the main objective? What are we trying to predict?
• What are the target features?
• What is the input data? Is it available?
• What kind of problem are we facing? Binary classification? Clustering?
• What is the expected improvement?
• Define performance metric
• Regression problems use certain evaluation metrics such as Mean Squared Error (MSE).
• Classification problems use evaluation metrics such as Precision, Accuracy and Recall.
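The metrics named above can be written out directly from their definitions. This is a plain-Python sketch with made-up labels and predictions; the function names are illustrative, and libraries such as Scikit-Learn provide equivalent, more complete implementations.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions (regression)."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision and recall for a binary classification problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),                 # share of correct predictions
        "precision": tp / (tp + fp) if tp + fp else 0.0,  # predicted positives that are right
        "recall": tp / (tp + fn) if tp + fn else 0.0,     # actual positives that were found
    }

# Regression example (hypothetical values).
mse = mean_squared_error([3, -0.5, 2, 7], [2.5, 0.0, 2, 8])
print(mse)  # 0.375

# Classification example: 3 true positives, 1 false positive, 1 false negative.
metrics = classification_metrics([1, 0, 1, 1, 0, 1], [1, 0, 0, 1, 1, 1])
print(metrics)
```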
27
EXAM
S T E P S I N B U I L D I N G A M L A P P L I C A T I O N
• Gathering Data
• RSS feed, web scraping, API
• Generating Hypothesis
• Can our outputs be predicted given the inputs?
• Is our available data informative enough to learn the relationship between the inputs and the outputs?
• Exploratory Data Analysis (Visualisation for outlier)
• Data Preparation and cleaning (Missing Value)
• Delete the features or samples that contain missing values
• Missing value imputation
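The two ways of handling missing values listed above, deletion and imputation, can be sketched as follows. The records and function names are invented for illustration, missing entries are represented as `None`, and mean imputation is only one of several possible strategies.

```python
from statistics import mean

def drop_incomplete(rows):
    """Deletion: keep only the samples that have no missing value."""
    return [r for r in rows if None not in r.values()]

def impute_mean(rows, column):
    """Imputation: replace missing entries of one column with the column mean."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = mean(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]

# Hypothetical records; the second one is missing its age.
rows = [
    {"age": 25, "income": 40},
    {"age": None, "income": 55},
    {"age": 35, "income": 70},
]

cleaned = drop_incomplete(rows)
imputed = impute_mean(rows, "age")
print(len(cleaned))       # 2
print(imputed[1]["age"])  # 30
```

Deletion is simple but discards the income value of the incomplete sample; imputation keeps all three samples at the cost of introducing an estimated value.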
28
EXAM
S T E P S I N B U I L D I N G A M L A P P L I C A T I O N
• Feature Engineering (Encoding, Transformation)
• Mapping Ordinal features
• Encoding Nominal class labels
• Normalization, Standardization
• Define benchmark / baseline model (kNN, NB)
• Choose model
• Train/build Model (train:validation:test)
• Shuffle for classification
• For weather prediction, stock price prediction etc. data should not be shuffled, as the sequence of data is a crucial feature.
• Evaluate Model for Optimal Hyperparameters (cross validation)
• Tune Model (Grid search, Randomized search)
• Model testing and Deployment for prediction
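The feature engineering steps at the top of this list can be sketched in plain Python: mapping an ordinal feature to integers, one-hot encoding a nominal feature, and rescaling a numerical feature by normalization and standardization. The features, category order and values are assumptions made up for the example.

```python
from statistics import mean, pstdev

# Ordinal feature: the categories have a natural order, so map them to integers.
SIZE_ORDER = {"S": 0, "M": 1, "L": 2}

def one_hot(values):
    """Nominal feature: one 0/1 column per category, with no implied order."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values], categories

def min_max_normalize(values):
    """Normalization: rescale to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Standardization: rescale to zero mean and unit standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

sizes = ["M", "L", "S"]
colours = ["red", "green", "red"]
prices = [10, 20, 30]

size_codes = [SIZE_ORDER[s] for s in sizes]
colour_columns, colour_names = one_hot(colours)
normalized = min_max_normalize(prices)
standardized = standardize(prices)

print(size_codes)                    # [1, 2, 0]
print(colour_names, colour_columns)  # ['green', 'red'] [[0, 1], [1, 0], [0, 1]]
print(normalized)                    # [0.0, 0.5, 1.0]
print(standardized)
```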
29
EXAM
C H O I C E O F R I G H T A L G O R I T H M
30
EXAM
S T E P S F O R S E L E C T I N G R I G H T M L A L G O
• Understand your Data
• Type of data will decide algorithm
• Algo will decide no. of samples
Eg. NB will work with categorical data and is not sensitive to missing data
• Stats and Visualization to know your data
• Percentile helps to identify outlier, median to identify central tendency
• Box plot (outlier), Histogram (spread), Scatter plot (bivariate relationship)
• Clean data w.r.t Missing value
• Feature Engineering
• Encoding
• Feature creation
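The use of percentiles and the median mentioned above can be sketched with the standard library. The example applies the common 1.5 × IQR rule, which is also the rule a box plot typically uses for its whiskers; the data values and the function name are illustrative assumptions.

```python
from statistics import median, quantiles

def iqr_outliers(values, k=1.5):
    """Flag values lying more than k * IQR outside the first or third quartile."""
    q1, _, q3 = quantiles(values, n=4)  # 25th, 50th and 75th percentiles
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical measurements with one suspicious reading.
readings = [12, 13, 14, 15, 15, 16, 17, 18, 95]

outliers = iqr_outliers(readings)
centre = median(readings)
print(outliers)  # [95]
print(centre)    # 15
```

The median stays at 15 despite the extreme reading, which is why it is preferred over the mean as a measure of central tendency when outliers are present.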
31
EXAM
S T E P S F O R S E L E C T I N G R I G H T M L A L G O
• Categorize the problem
• By I/P (supervised, unsupervised)
• By O/P (regression, classification, clustering, anomaly detection)
• Understand constraints (data storage capacity, real time applications, fast learning)
• Look for available algorithms (business goals met?, preprocessing required?, accuracy?, explainability?,
speed?, scalable?)
• Try each, assess and compare
• Optimize
• Evaluate performance
• Repeat if required
32
EXAM
C H O I C E O F M O D E L ( U S E C A S E )
• Linear Regression: unstable with redundant features
Eg. Sales prediction, Time for commuting
• Logistic Regression: not blackbox, works with correlated features
Eg. Fraud detection, Customer churn prediction
• Decision Tree: can handle outliers but can overfit and take a lot of memory
Eg. Bank loan defaulter, Investment decision
• SVM: memory intensive, hard to interpret and difficult to tune
Eg. Text classification, Handwritten character recognition
• NB: less training data required, low memory requirement, faster
Eg. Sentiment analysis, Recommender systems
• RF: works well with large data and high dimension
Eg. Predict loan defaulters, Predict patients for high risk
• NN: resource and memory intensive
Eg. Object Recognition, Natural Language Translation
• K-means: grouping but no. of groups unknown
Eg. Customer Segmentation, Crime locality identification
• PCA: dimensionality reduction
Eg. MNIST digits
33
PROJECT
C H O I C E O F M E T R I C
• Regression
• Mean Squared Error (MSE), Root MSE (RMSE)
• Mean Absolute Error (MAE) if outliers are present
• R-squared (R2)
• Classification
• Accuracy, LogLoss, ROC-AUC, Precision, Recall
• Kappa score, MCC
• Unsupervised
• Mutual Information
• Rand index
• Reinforcement Learning
• Dispersion across Time
• Risk across Time
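For the regression metrics in this list, a short sketch shows why Mean Absolute Error is suggested when outliers are present: a single large error inflates the root mean squared error much more than the MAE. The numbers and function names below are made up for illustration.

```python
import math

def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def root_mean_squared_error(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r_squared(y_true, y_pred):
    """1 - (residual sum of squares / total sum of squares around the mean)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

y_true = [2, 4, 6, 8]
close_fit = [2.5, 3.5, 6.5, 7.5]  # every prediction is off by 0.5
with_outlier = [2, 4, 6, 18]      # three exact predictions, one off by 10

print(mean_absolute_error(y_true, close_fit),
      root_mean_squared_error(y_true, close_fit))     # 0.5 0.5
print(mean_absolute_error(y_true, with_outlier),
      root_mean_squared_error(y_true, with_outlier))  # 2.5 5.0
print(r_squared(y_true, close_fit))
```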
34
PROJECT
P R O J E C T L A B O R I E N T A T I O N
Installing Anaconda and Python
Step 1: Download Anaconda Python: www.anaconda.com/distribution/
Step 2: Install Anaconda Python (Python 3.7 version): double-click the ".exe" file of Anaconda
Step 3: Open Anaconda Navigator: use Anaconda Navigator to launch a Python IDE such as Spyder or Jupyter Notebook
Step 4: Close the Spyder/Jupyter Notebook IDE.
https://colab.research.google.com
https://github.com
35
PROJECT
P R O J E C T T A S K L I S T
Study tool for implementation
Project title and Course identification
Choose data (Understand Domain and data)
Perform EDA
Perform Feature Engineering
Choose model
Train and Validate model
Tune Hyperparameters
Test and Evaluate model
Prepare Report
Prepare Technical Paper
Present Case Study
36
PROJECT
E X P E C T A T I O N S
Case Study Presentation
Mini Project
Technical Paper
Report
Competition (Inhouse, Online)
37
PROJECT
C A S E S T U D Y T I T L E S ( 3 1 )
MNIST
MS-COCO
ImageNet
CIFAR
IMDB Reviews
WordNet
Twitter Sentiment Analysis
BreastCancer Wisconsin
BBC News
Wheat seeds
Amazon Reviews
Facial Image
Spam SMS
YouTube
Chars74K
WineQuality
IrisFlowers
LabelMe
HotpotQA
Ionosphere
Xview
US Census
Boston House Price
BankNote authentication
PIMA Indian Diabetes
BBC Sport
Titanic
Santander Product Recommendation
Sonar
Swedish Auto Insurance
Abalone
38
PROJECT
B O O K S A N D D A T A S E T R E S O U R C E S
• https://www.kaggle.com/datasets
• https://archive.ics.uci.edu/ml/index.php
• https://registry.opendata.aws/
• https://toolbox.google.com/datasetsearch
• https://msropendata.com/
• https://github.com/awesomedata/awesome-public-datasets
• Indian Government dataset
• US Government Dataset
• Northern Ireland Public Sector Datasets
• European Union Open Data Portal
• https://scikit-learn.org/stable/datasets/index.html
• https://data.world
• http://archive.ics.uci.edu/ml/datasets
• https://www.ehdp.com/vitalnet/datasets.htm
• https://www.data.gov/health/
• “Python Machine Learning”, Sebastian Raschka, Packt
publishing
• “Machine Learning In Action”, Peter Harrington,
DreamTech Press
• “Introduction to Machine Learning” Ethem Alpaydın,
MIT Press
• “Machine Learning” Tom M. Mitchell, McGraw Hill
• “Machine Learning - An Algorithmic Perspective”
Stephen Marsland, CRC Press
• “Machine Learning ― A Probabilistic Perspective”
Kevin P. Murphy, MIT Press
• “Pattern Recognition and Machine Learning”,
Christopher M. Bishop, Springer
• “Elements of Statistical Learning” Trevor Hastie,
Robert Tibshirani, Jerome Friedman, Springer
39
L E A R N I N G R E S O U R C E S
• https://www.analyticsvidhya.com
• https://towardsdatascience.com
• https://analyticsindiamag.com
• https://machinelearningmastery.com
• https://www.datacamp.com
• https://www.superdatascience.com
• https://www.elitedatascience.com
• https://medium.com
• Siraj Raval youtube channel
• https://mlcontests.com
• https://www.datasciencechallenge.net
• https://www.machinehack.com
• https://www.hackerearth.com
• www.kaggle.com/competitions
• www.smartindiahackathon.gov.in
• www.datahack.analyticsvidhya.com
• www.daretocompete.com
• https://github.com
40
W H Y ?
41
S U M M A R Y ( S U M M A T I V E A S S E S S M E N T )
• Examine steps in developing Machine Learning application with respect to your mini project. [10]
• Review the issues in Machine Learning. [10]
• State applicable use case for each ML algorithm. [10]
• Examine Applications of AI. [10]
• Illustrate steps for selecting right ML algorithm. [10]
• Define ML and differentiate between Supervised, Unsupervised and Reinforcement learning with the help of suitable examples. [10]
• Explain ML w.r.t. identifying Tasks, Experience and Performance measure (Tom Mitchell). [10]
• designing a checkers learning problem
• designing a handwriting recognition learning problem
• designing a Robot driving learning problem
• Illustrate with example how Supervised learning can be used in handling loan defaulters. [10]
• Explain Supervised Learning with neat diagram. [10]
42
EXAM
Q U E R I E S
?
T H A N K
Y O U
43

04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一ffjhghh
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改atducpo
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Sapana Sha
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Delhi Call girls
 

Recently uploaded (20)

Predicting Employee Churn: A Data-Driven Approach Project Presentation
Predicting Employee Churn: A Data-Driven Approach Project PresentationPredicting Employee Churn: A Data-Driven Approach Project Presentation
Predicting Employee Churn: A Data-Driven Approach Project Presentation
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts Service
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Digi Khata Problem along complete plan.pptx
Digi Khata Problem along complete plan.pptxDigi Khata Problem along complete plan.pptx
Digi Khata Problem along complete plan.pptx
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 

ML MODULE 1_slideshare.pdf

  • 1. SHIWANI GUPTA, MACHINE LEARNING
  • 2. SYLLABUS
      • Introduction to Machine Learning (1) 6: Machine Learning terminology, Types of Machine Learning, Issues in Machine Learning, Application of Machine Learning, Steps in developing ML application, How to choose the right algorithm
      • Data Preprocessing (3) 10: Data Cleaning (missing value, outlier), Exploratory Data Analysis (descriptive statistics, Visualization), Feature Engineering (Data Transformation (encoding, skew, scale), Feature selection)
      • Supervised Learning with Regression (1) 5: Simple Linear, Multiple Linear, Polynomial, Overfit/Underfit, Regularization, Evaluation Metric, Use case
      • Supervised Learning with Classification (3) 12: k Nearest Neighbor, Logistic Regression, Linear SVM, Kernels, Decision Tree (CART), Issues in DT learning, Ensembles (Bagging: Random Forest, Boosting: Gradient Boost), Evaluation metric, Use case
      • Optimization Techniques (2) 6: Model Selection techniques (Cross Validation), Gradient Descent Algorithm, Grid Search method, Model Evaluation technique (Bias, Variance)
      • Unsupervised Learning with clustering and Reinforcement Learning (2) 6: k Means algorithm, Dimensionality Reduction, Use case, Elements of Reinforcement Learning, Temporal Difference Learning, Online Learning, Use case
  • 3. MODULE 1 (6 HOUR)
      • Machine Learning terminology
      • Types of Machine Learning
      • Issues in Machine Learning
      • Application of Machine Learning
      • Steps in developing ML application
      • How to choose the right algorithm
  • 4. S/W AND H/W REQUIREMENT [PROJECT]
      • Hardware: 16+ GB RAM, 4+ cores, SSD storage; or cloud (Amazon AWS, MS Azure, Google Cloud)
      • Python Data Science S/W stack (pip, conda)
      • NumPy: linear algebra
      • Pandas: data reading and processing
      • Scikit-Learn: ML algorithms
      • Matplotlib: visualization
      • Seaborn: more aesthetically pleasing visualization
      • Plotly: interactive visualization library
      • t-SNE: high-dimensional visualization
      • statsmodels: statistical models
      • SciPy: optimization
      • Tkinter: GUI library for Python
      • PyTorch: open source framework
      • Keras: high-level API and open source framework
      • TensorFlow: open source framework
      • Theano: multidimensional array manipulation
      • NLTK: human language data
      • BeautifulSoup: navigating web pages
      • Bokeh: interactive visualizations
      • TextBlob: processing textual data
      • SHAP: SHapley Additive exPlanations
      • xAI: eXplainable AI
      • IDE: Spyder, Jupyter Notebook, PyCharm, Google Colab
  • 6. [PROJECT]
      NumPy:
        np.array([1, 2, 3])          # rank-1 array
        b.shape                      # (rows, columns)
        a[:2, 1:3]                   # first 2 rows, columns 1 and 2
        x.dtype                      # data type, e.g. int64, float64
        np.reshape(v, (3, 1)) * w    # reshape v into a column, then multiply by w
      pandas:
        pd.read_csv('data.csv')
        pd.DataFrame(mydataset)
        df.head(10)
        df.tail()
        df.dropna()
        df.corr()
        df.plot()
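The one-line snippets on this slide can be combined into a runnable sketch. The array contents, the in-memory CSV text that stands in for 'data.csv', and the column names hours and score are assumptions made for illustration; none of them come from the slides.

```python
import io

import numpy as np
import pandas as pd

# --- NumPy ---
x = np.array([1, 2, 3])              # rank-1 array
b = np.array([[1, 2, 3],
              [4, 5, 6]])
print(b.shape)                       # (rows, columns) of b

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12]])
sub = a[:2, 1:3]                     # first 2 rows, columns 1 and 2
print(sub)
print(x.dtype)                       # integer dtype; exact width depends on platform

v = np.array([1, 2, 3])
w = np.array([4, 5])
outer = np.reshape(v, (3, 1)) * w    # (3, 1) broadcast against (2,) gives (3, 2)
print(outer)

# --- pandas ---
# Hypothetical CSV content; a real project would call pd.read_csv('data.csv').
csv_text = "hours,score\n1,52\n2,58\n3,\n4,71\n"
df = pd.read_csv(io.StringIO(csv_text))
print(df.head(10))                   # first rows (at most 10)
print(df.tail())                     # last rows (at most 5)

clean = df.dropna()                  # drop rows containing a missing value
corr = df.corr()                     # pairwise correlation of numeric columns
print(corr)
# df.plot() would chart the columns; it needs Matplotlib, so it is not called here.
```

Note that attribute names are lowercase in NumPy (`shape`, `dtype`); `b.Shape` or `x.Dtype` would raise an AttributeError.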
  • 7. PREREQUISITES
      • Probability and Statistics (random variables, probability distributions, statistics: mean, median, mode, variance, standard deviation, covariance, Bayes' theorem, entropy)
      • Linear Algebra (matrices, vectors, tensors, eigenvalues, eigenvectors)
      • Calculus (functions, derivatives of single-variable and multivariate functions)
      • Python language
      • Structured thinking, communication and problem solving
      • Business understanding
  • 8. WHY IS ML GETTING ATTENTION RECENTLY [FUTURE]
      This development is driven by a few underlying forces:
      • The amount of data generated is increasing significantly as the cost of sensors falls
      • The cost of storing this data has reduced significantly
      • The cost of computing has come down significantly
      • Cloud has democratized compute for the masses
  • 9. ML VS AUTOMATION [PROJECT]
      • Machine learning is not just a new name for automation. Most of the automation of the last few decades has been rule-driven. For example, automating flows in our mailbox requires us to define the rules, and these rules act in the same manner every time.
      • Machine learning, on the other hand, helps machines learn from past data and change their decisions and performance accordingly. Spam detection in our mailboxes is driven by machine learning, so it continues to evolve with time.
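The contrast above can be sketched in a few lines of Python: a fixed hand-written rule next to a filter whose behaviour is derived from labelled examples. The messages, the word-scoring scheme and all function names are hypothetical and chosen only for illustration; real spam filters are considerably more sophisticated.

```python
from collections import Counter

# Hypothetical labelled messages (1 = spam, 0 = not spam), invented for this sketch.
TRAIN = [
    ("win a free prize now", 1),
    ("free entry claim your prize", 1),
    ("meeting moved to monday", 0),
    ("lunch on monday", 0),
]


def rule_based_is_spam(message):
    """Rule-driven automation: one fixed rule that acts the same way every time."""
    return "free" in message.lower().split()


def learn_word_scores(examples):
    """Score each word: +1 per spam message containing it, -1 per non-spam message."""
    scores = Counter()
    for text, label in examples:
        for word in set(text.lower().split()):
            scores[word] += 1 if label == 1 else -1
    return scores


def learned_is_spam(message, scores):
    """Flag a message whose words were seen more often in spam than in non-spam."""
    return sum(scores[word] for word in message.lower().split()) > 0


scores = learn_word_scores(TRAIN)
new_message = "claim your prize today"
print(rule_based_is_spam(new_message))       # the fixed rule misses it: no "free"
print(learned_is_spam(new_message, scores))  # the learned scores flag it
```

Retraining `learn_word_scores` on newer labelled messages changes the filter's decisions without anyone rewriting a rule, which is the sense in which the learned filter evolves with time.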
  • 10. DEFINITION [EXAM]
      • “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E” - Tom Mitchell
      • “Machine learning enables a machine to automatically learn from data, improve performance from experiences, and predict things without being explicitly programmed.”
      • “Machine learning is a subfield of artificial intelligence, which enables machines to learn from past data or experiences without being explicitly programmed.”
      • “Science of getting computers to act without explicit programming” - Arthur Samuel
  • 11. SCIENCE OF TEACHING MACHINES HOW TO LEARN BY SELF
      Example: the task of mopping and cleaning the floor.
      • When a human does the task, the quality of the outcome varies. The human gets exhausted or bored after a few hours of work and also gets sick at times. Depending on the place, the task can also be hazardous or risky for a human.
      • Machines can do high-frequency repetitive tasks with high accuracy without getting tired. If we can also teach machines to detect whether the floor needs cleaning and mopping, and how much cleaning is required based on the condition and type of the floor, machines would be far better at the same job. They can keep doing it without getting tired or sick.
      • This is what Machine Learning aims to do: enable machines to learn on their own, in order to answer questions like:
          • Does the floor need cleaning and mopping?
          • For how long does the floor need to be cleaned?
      • Machines need a way to think, and this is precisely where machine learning models help. The machine captures data from the environment and feeds it to the machine learning model. The model then uses this data to predict whether the floor needs cleaning and for how long.
  • 12. HOW DO MACHINES LEARN
      • Tasks difficult for humans can be very simple for machines, e.g. multiplying very large numbers.
      • Tasks which look simple to humans can be very difficult for machines.
      • You only need to demonstrate cleaning and mopping to a human a few times before they can perform it on their own.
      • That is not the case with machines. We need to collect a lot of data along with the desired outcomes in order to teach machines to perform specific tasks.
      • This is where machine learning comes into play. Machine Learning helps the machine understand the kind, intensity and duration of cleaning based on the condition and nature of the floor.
  • 13. TOOLS [FUTURE]
      • Language: R, Python, SAS, Julia, Java, Scala
      • Database: SQL, Oracle, Hadoop
      • Visualisation: D3.js, Tableau, QlikView
  • 14. TERMINOLOGY [PROJECT]
      • Dataset (training, validation, testing)
      • .csv file
      • Structured vs unstructured data
      • Predictor, target, explanatory, independent, dependent, response variable
      • Instance
      • Features (numerical, discrete, categorical, ordinal, nominal)
      • Model
      • Hypothesis
  • 15. FEATURES [PROJECT]
  • 16. TYPES [EXAM]
  • 17. TYPES [EXAM]
      • Supervised Learning: labelled data (binary and multi-class)
          • Classification: discrete response, e.g. LoR, NB, kNN, SVM, DT, RF, GBM, XGB, NN. Examples: spam filtering, waste classification
          • Regression: continuous response, e.g. LR, SVR, DTR, RFR. Examples: changes in temperature, stock price prediction
  • 18. TYPES [EXAM]
      • Unsupervised Learning: unlabelled data
          • Clustering, e.g. k means, hierarchical, NN. Examples: customer segmentation, city planning, cell phone tower placement for optimal signal reception
          • Association, e.g. Apriori. Examples: diaper and beer, bread and milk
          • Dimensionality Reduction, e.g. PCA, SVD. Examples: MNIST data (70000 × 784), face recognition (698 × 4096)
          • Anomaly Detection, e.g. kNN, kMeans. Examples: fraud detection, fault detection, outlier detection
      • Semi-supervised learning
          • Examples: speech analysis, web content classification, Google Expander
  • 19. TYPES [EXAM]
      • Reinforcement Learning: maximise cumulative reward, e.g. Q-Learning, SARSA, DQN. Examples: robotic dog, Tic Tac Toe
      • Neural Network. Example: recognise a dog
      • Deep Learning. Examples: chat bot, real-time bidding, recommender system
      • Natural Language Processing, e.g. lemmatisation, stemming. Examples: customer service complaints, virtual assistant
      • Computer Vision, e.g. Canny edge detection, Haar Cascade classifier. Examples: skin cancer diagnosis, real-time traffic detection, guided surgery
      • Evolutionary Learning (GA, optimisation algorithms). Example: Super Mario
  • 20. ISSUES IN MACHINE LEARNING [EXAM]
      • What algorithms exist for learning?
      • When will an algorithm converge?
      • Which algorithms perform best for which kinds of problems?
      • How much data is sufficient? e.g. training to classify cat and dog
      • Non-representative training data, e.g. exit polls during elections
      • Poor quality of data, e.g. outliers, missing values
      • How many features are required? Irrelevant features
      • Overfitting the training data
      • Underfitting the training data
      • Computation power, e.g. GPU and TPU for ML and DL
      • Interpretability of the model, e.g. why the bank declined a customer's loan
      • How to improve learning?
      • Optimization vs generalization
      • New and better algorithms required
      • Need for more data scientists
  • 21. PROJECT IDEAS (40) [PROJECT]
      Fraud detection; Predict low oxygen level during surgery; Recognise CVD factors; Movie recommendation (Netflix); Marketing and Sales; Weather prediction; Traffic Prediction (Uber ATG); Loan defaulting prediction; Handwriting recognition; Sentiment analysis; Human activity recognition; Sports predictor; Big Mart Sales prediction; Fake news detection; Disease prediction; Stock market analysis; Amazon Alexa; Search Engine Optimization; Auto-tagging and Friend suggestion (Facebook); Swiggy and Uber Eats; House price prediction; Market Analysis; Handwritten digit recognition; Equipment failure prediction; Prospective insurance buyer; Google News; Video Surveillance; Movie Ticket pricing system; Object Detection
  • 22. ML USE CASE IN SMARTPHONES [EXAM]
      From the voice assistant that sets your alarm and finds you the best restaurants to the simple use case of unlocking your phone via facial recognition, Machine Learning is truly embedded in our favourite devices.
      • Voice Assistants
      • Smartphone Cameras
      • App Store and Play Store Recommendations
      • Face Unlock
  • 23. ML USE CASE IN TRANSPORTATION [EXAM]
      The application of machine learning in the transport industry has gone to an entirely different level in the last decade. This coincides with the rise of ride-hailing apps like Uber, Lyft and Ola. These companies use machine learning throughout their products, from planning optimal routes to deciding prices for the rides we take. A few popular use cases in transportation that use machine learning heavily:
      • Dynamic Pricing in Travel
      • Transporting and Commuting (Uber)
      • Google Maps
  • 24. ML USE CASE IN WEB SERVICES [EXAM]
      We interact with certain applications multiple times every day. What we perhaps did not realize until recently is that most of these applications work thanks to the power and flexibility of Machine Learning.
      • Email Filtering
      • Google Search
      • Google Translate
      • Facebook and LinkedIn Recommendations
  • 25. ML USE CASE IN SALES AND MARKETING [EXAM]
      • Top companies in the world are using Machine Learning to transform their strategies from top to bottom. The two most impacted functions are Marketing and Sales.
      • If you work in Marketing and Sales these days, you need to know at least one Business Intelligence tool (like Tableau or Power BI). Additionally, marketers are expected to know how to leverage Machine Learning in their day-to-day role to increase brand awareness, improve the bottom line, etc.
      • Recommendation Engine
      • Personalized Marketing
      • Customer Support (Chatbots)
  • 26. ML USE CASE IN FINANCIAL DOMAIN [EXAM]
      Most of the jobs in Machine Learning are geared towards the financial domain, which makes sense because this is the ultimate numbers field. Until recently, many banking institutions relied on Logistic Regression (a simple machine learning algorithm) to crunch these numbers.
      • Fraud Detection
      • Personalized Banking
  • 27. STEPS IN BUILDING A ML APPLICATION [EXAM]
      • Frame the business problem as an ML problem
          • What is the main objective? What are we trying to predict?
          • What are the target features?
          • What is the input data? Is it available?
          • What kind of problem are we facing? Binary classification? Clustering?
          • What is the expected improvement?
      • Define the performance metric
          • Regression problems use evaluation metrics such as Mean Squared Error (MSE).
          • Classification problems use evaluation metrics such as Precision, Accuracy and Recall.
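The metrics named on this slide can be written out directly from their standard definitions. The sketch below uses plain Python; the function names and the small label and value lists are made up for illustration.

```python
def mean_squared_error(y_true, y_pred):
    """MSE: average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def classification_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # of predicted positives, how many are right
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # of actual positives, how many were found
    return accuracy, precision, recall


# Regression example (hypothetical values)
mse = mean_squared_error([3.0, 5.0, 2.0], [2.5, 5.0, 4.0])
print(mse)

# Classification example (hypothetical labels: 1 = positive class)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
accuracy, precision, recall = classification_metrics(y_true, y_pred)
print(accuracy, precision, recall)
```

In this example there are 3 true positives, 1 false positive and 1 false negative, so accuracy, precision and recall all come out at 0.75; on imbalanced data the three usually diverge, which is why the metric is fixed before modelling starts.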
  • 28. STEPS IN BUILDING A ML APPLICATION [EXAM]
      • Gathering Data
          • RSS feed, web scraping, API
      • Generating Hypothesis
          • Can our outputs be predicted given the inputs?
          • Is our available data informative enough to learn the relationship between the inputs and the outputs?
      • Exploratory Data Analysis (visualisation for outliers)
      • Data Preparation and Cleaning (missing values)
          • Delete the affected features or samples
          • Missing value imputation
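The two ways of handling missing values listed above, deletion and imputation, can be sketched as follows. The records, the column names and the choice of mean imputation are assumptions for illustration; the slides do not say which imputation strategy is intended.

```python
from statistics import mean

# Hypothetical records; None marks a missing value.
rows = [
    {"age": 25, "income": 30000},
    {"age": None, "income": 42000},
    {"age": 31, "income": None},
    {"age": 40, "income": 58000},
]


def drop_incomplete(records):
    """Deletion: keep only the samples that have no missing value."""
    return [r for r in records if all(v is not None for v in r.values())]


def impute_mean(records):
    """Imputation: replace each missing value with the mean of its column."""
    columns = list(records[0].keys())
    fill = {c: mean(r[c] for r in records if r[c] is not None) for c in columns}
    return [
        {c: (r[c] if r[c] is not None else fill[c]) for c in columns}
        for r in records
    ]


dropped = drop_incomplete(rows)
imputed = impute_mean(rows)
print(len(dropped), "rows kept after deletion")
print(imputed[1]["age"], "imputed age for the second row")
```

Deletion halves this tiny dataset, while imputation keeps every sample at the cost of inventing values; with pandas the same two options correspond to `df.dropna()` and `df.fillna(df.mean())`.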
  • 29. STEPS IN BUILDING A ML APPLICATION [EXAM]
      • Feature Engineering (encoding, transformation)
          • Mapping ordinal features
          • Encoding nominal class labels
          • Normalization, standardization
      • Define benchmark / baseline model (kNN, NB)
      • Choose model
      • Train/build model (train : validation : test)
          • Shuffle for classification
          • For weather prediction, stock price prediction, etc., the data should not be shuffled, as the sequence of the data is a crucial feature.
      • Evaluate model for optimal hyperparameters (cross validation)
      • Tune model (grid search, randomized search)
      • Model testing and deployment for prediction
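A minimal sketch of the feature engineering and splitting steps above, in plain Python. The size ordering, the 60:20:20 split ratio, the seed and all function names are assumptions chosen for illustration; the slides do not fix any of them.

```python
import random
from statistics import mean, pstdev

SIZE_ORDER = {"S": 0, "M": 1, "L": 2}   # hypothetical ordinal feature: S < M < L


def encode_ordinal(values, order):
    """Map an ordinal feature to integers that preserve its order."""
    return [order[v] for v in values]


def encode_labels(labels):
    """Map nominal class labels to integer ids (ids carry no order)."""
    ids = {label: i for i, label in enumerate(sorted(set(labels)))}
    return [ids[label] for label in labels]


def normalize(values):
    """Min-max normalization to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Standardization to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def split(samples, train=0.6, validation=0.2, shuffle=True, seed=0):
    """Split into train/validation/test; set shuffle=False for sequential data."""
    samples = list(samples)
    if shuffle:
        random.Random(seed).shuffle(samples)
    n_train = int(len(samples) * train)
    n_val = int(len(samples) * validation)
    return (samples[:n_train],
            samples[n_train:n_train + n_val],
            samples[n_train + n_val:])


print(encode_ordinal(["M", "S", "L"], SIZE_ORDER))
print(encode_labels(["spam", "ham", "spam"]))
print(normalize([2, 4, 6]))
print(standardize([2, 4, 6]))

days = list(range(10))                       # stand-in for a daily time series
train, val, test = split(days, shuffle=False)
print(train, val, test)                      # chronological order is preserved
shuffled_train, _, _ = split(days, shuffle=True)
```

With `shuffle=False` the test set holds the most recent observations, so the model is always evaluated on data that comes after everything it was trained on.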
  • 30. CHOICE OF RIGHT ALGORITHM [EXAM]
  • 31. STEPS FOR SELECTING RIGHT ML ALGO [EXAM]
      • Understand your data
          • The type of data will decide the algorithm
          • The algorithm will decide the number of samples needed. Example: NB works with categorical data and is not sensitive to missing data
      • Statistics and visualization to know your data
          • Percentiles help to identify outliers; the median identifies central tendency
          • Box plot (outliers), histogram (spread), scatter plot (bivariate relationship)
      • Clean data with respect to missing values
      • Feature Engineering
          • Encoding
          • Feature creation
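One common way to use percentiles for outlier detection is the interquartile-range (IQR) rule, which is also what a box plot draws. The sketch below assumes that rule with the conventional factor of 1.5; the slides only say that percentiles help identify outliers, and the data values are invented.

```python
from statistics import mean, median, quantiles


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the first/third quartile."""
    q1, _, q3 = quantiles(values, n=4)   # 25th, 50th and 75th percentiles
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical commute times in minutes, with one extreme value.
commute_minutes = [12, 14, 15, 15, 16, 17, 18, 19, 95]

outliers = iqr_outliers(commute_minutes)
print("outliers:", outliers)
print("median:", median(commute_minutes))
print("mean:", mean(commute_minutes))
```

The single extreme value pulls the mean well above the median, which is why the median is the safer measure of central tendency before outliers have been handled.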
  • 32. STEPS FOR SELECTING RIGHT ML ALGO [EXAM]
      • Categorize the problem
          • By input (supervised, unsupervised)
          • By output (regression, classification, clustering, anomaly detection)
      • Understand constraints (data storage capacity, real-time applications, fast learning)
      • Look for available algorithms (business goals met? preprocessing required? accuracy? explainability? speed? scalable?)
      • Try each, assess and compare
      • Optimize
      • Evaluate performance
      • Repeat if required
  • 33. CHOICE OF MODEL (USE CASE) [PROJECT]
      • Linear Regression: unstable with redundant features. Examples: sales prediction, time for commuting
      • Logistic Regression: not a black box, works with correlated features. Examples: fraud detection, customer churn prediction
      • Decision Tree: can handle outliers but overfits and takes large memory. Examples: bank loan defaulter, investment decision
      • SVM: memory intensive, hard to interpret and difficult to tune. Examples: text classification, handwritten character recognition
      • NB: less training data required, low memory requirement, faster. Examples: sentiment analysis, recommender systems
      • RF: works well with large data and high dimension. Examples: predict loan defaulters, predict high-risk patients
      • NN: resource and memory intensive. Examples: object recognition, natural language translation
      • K-means: grouping, but the number of groups is unknown. Examples: customer segmentation, crime locality identification
      • PCA: dimensionality reduction. Example: MNIST digits
  • 34. CHOICE OF METRIC [PROJECT]
      • Regression
          • Mean Square Error, Root MSE, R-squared (R²)
          • Mean Absolute Error if outliers are present
      • Classification
          • Accuracy, LogLoss, ROC-AUC, Precision, Recall
          • Kappa score, MCC
      • Unsupervised
          • Mutual Information
          • Rand index
      • Reinforcement Learning
          • Dispersion across Time
          • Risk across Time
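The advice to prefer Mean Absolute Error when outliers are present can be checked with a small worked example. The sketch below implements the regression metrics from their standard definitions; the target and prediction values are invented, with one deliberately bad prediction.

```python
from math import sqrt


def mae(y_true, y_pred):
    """Mean Absolute Error: average magnitude of the errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean Square Error: average squared error, so large errors dominate."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root MSE: square root of MSE, in the same units as the target."""
    return sqrt(mse(y_true, y_pred))


def r_squared(y_true, y_pred):
    """R-squared: 1 - (residual sum of squares / total sum of squares)."""
    y_bar = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


y_true = [10, 12, 14, 16, 18]
good_pred = [11, 12, 13, 16, 18]      # two errors of size 1
outlier_pred = [11, 12, 13, 16, 38]   # same, plus one error of size 20

for name, pred in [("good", good_pred), ("with outlier", outlier_pred)]:
    print(name, mae(y_true, pred), mse(y_true, pred), rmse(y_true, pred))
print("R-squared (good):", r_squared(y_true, good_pred))
```

The single bad prediction raises MAE from 0.4 to 4.4 but MSE from 0.4 to 80.4, because squaring magnifies the one large error; this is the reason MAE is the more robust choice when outliers are expected.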
  • 35. PROJECT LAB ORIENTATION [PROJECT]
      Installing Anaconda and Python
      • Step 1: Download Anaconda Python: www.anaconda.com/distribution/
      • Step 2: Install Anaconda Python (Python 3.7 version): double-click the ".exe" file of Anaconda
      • Step 3: Open Anaconda Navigator: use Anaconda Navigator to launch a Python IDE such as Spyder or Jupyter Notebook
      • Step 4: Close the Spyder/Jupyter Notebook IDE.
      • https://colab.research.google.com
      • https://github.com
  • 36. PROJECT TASK LIST [PROJECT]
      • Study tool for implementation
      • Project title and Course identification
      • Choose data (understand domain and data)
      • Perform EDA
      • Perform Feature Engineering
      • Choose model
      • Train and validate model
      • Tune hyperparameters
      • Test and evaluate model
      • Prepare Report
      • Prepare Technical Paper
      • Present Case Study
  • 37. EXPECTATIONS [PROJECT]
      • Case Study Presentation
      • Mini Project
      • Technical Paper
      • Report
      • Competition (Inhouse, Online)
  • 38. CASE STUDY TITLES (31) [PROJECT]
      MNIST; MS-COCO; ImageNet; CIFAR; IMDB Reviews; WordNet; Twitter Sentiment Analysis; BreastCancer Wisconsin; BBC News; Wheat seeds; Amazon Reviews; Facial Image; Spam SMS; YouTube; Chars74K; WineQuality; IrisFlowers; LabelMe; HoTPotQA; Ionosphere; Xview; US Census; Boston House Price; BankNote authentication; PIMA Indian Diabetes; BBC Sport; Titanic; Santander Product Recommendation; Sonar; Swedish Auto Insurance; Abalone
  • 39. BOOKS AND DATASET RESOURCES
      Datasets:
      • https://www.kaggle.com/datasets
      • https://archive.ics.uci.edu/ml/index.php
      • https://registry.opendata.aws/
      • https://toolbox.google.com/datasetsearch
      • https://msropendata.com/
      • https://github.com/awesomedata/awesome-public-datasets
      • Indian Government dataset
      • US Government Dataset
      • Northern Ireland Public Sector Datasets
      • European Union Open Data Portal
      • https://scikit-learn.org/stable/datasets/index.html
      • https://data.world
      • http://archive.ics.uci.edu/ml/datasets
      • https://www.ehdp.com/vitalnet/datasets.htm
      • https://www.data.gov/health/
      Books:
      • “Python Machine Learning”, Sebastian Raschka, Packt Publishing
      • “Machine Learning in Action”, Peter Harrington, DreamTech Press
      • “Introduction to Machine Learning”, Ethem Alpaydın, MIT Press
      • “Machine Learning”, Tom M. Mitchell, McGraw Hill
      • “Machine Learning: An Algorithmic Perspective”, Stephen Marsland, CRC Press
      • “Machine Learning: A Probabilistic Perspective”, Kevin P. Murphy, MIT Press
      • “Pattern Recognition and Machine Learning”, Christopher M. Bishop, Springer
      • “Elements of Statistical Learning”, Trevor Hastie, Robert Tibshirani, Jerome Friedman, Springer
  • 40. LEARNING RESOURCES
      • https://www.analyticsvidhya.com
      • https://towardsdatascience.com
      • https://analyticsindiamag.com
      • https://machinelearningmastery.com
      • https://www.datacamp.com
      • https://www.superdatascience.com
      • https://www.elitedatascience.com
      • https://medium.com
      • Siraj Raval YouTube channel
      • https://mlcontests.com
      • https://www.datasciencechallenge.net
      • https://www.machinehack.com
      • https://www.hackerearth.com
      • www.kaggle.com/competitions
      • www.smartindiahackathon.gov.in
      • www.datahack.analyticsvidhya.com
      • www.daretocompete.com
      • https://github.com
  • 41. WHY?
  • 42. SUMMARY (SUMMATIVE ASSESSMENT) [EXAM]
      • Examine steps in developing a Machine Learning application with respect to your mini project. [10]
      • Review the issues in Machine Learning. [10]
      • State an applicable use case for each ML algorithm. [10]
      • Examine applications of AI. [10]
      • Illustrate steps for selecting the right ML algorithm. [10]
      • Define ML and differentiate between Supervised, Unsupervised and Reinforcement learning with the help of suitable examples. [10]
      • Explain ML w.r.t. identifying Tasks, Experience and Performance measure (Tom Mitchell). [10]
          • designing a checkers learning problem
          • designing a handwriting recognition learning problem
          • designing a robot driving learning problem
      • Illustrate with an example how Supervised learning can be used in handling loan defaulters. [10]
      • Explain Supervised Learning with a neat diagram. [10]
  • 43. QUERIES? THANK YOU