S H I W A N I
G U P T A
M A C H I N E
L E A R N I N G
1
S Y L L A B U S
Introduction to Machine Learning (1) 6
Machine Learning terminology, Types of Machine Learning, Issues in Machine Learning, Application of Machine
Learning, Steps in developing ML application, How to choose the right algorithm
Data Preprocessing (3) 10
Data Cleaning (missing value, outlier), Exploratory Data Analysis (descriptive statistics, Visualization), Feature
Engineering (Data Transformation (encoding, skew, scale), Feature selection)
Supervised Learning with Regression (1) 5
Simple Linear, Multiple Linear, Polynomial, Overfit/Underfit, Regularization, Evaluation Metric, Use case
Supervised Learning with Classification (3) 12
k Nearest Neighbor, Logistic Regression, Linear SVM, Kernels, Decision Tree (CART), Issues in DT learning,
Ensembles (Bagging – Random Forest, Boosting – Gradient Boost), Evaluation metric, Use case
Optimization Techniques (2) 6
Model Selection techniques (Cross Validation), Gradient Descent Algorithm, Grid Search method, Model Evaluation
technique (Bias, Variance)
Unsupervised Learning with clustering and Reinforcement Learning (2) 6
k Means algorithm, Dimensionality Reduction, Use case, Elements of Reinforcement Learning, Temporal Difference
Learning, Online Learning, Use case
2
M O D U L E 1 ( 6 H O U R )
• Machine Learning terminology
• Types of Machine Learning
• Issues in Machine Learning
• Application of Machine Learning
• Steps in developing ML application
• How to choose the right algorithm
3
S / W A N D H / W R E Q U I R E M E N T
16+ GB RAM, 4+ CORES, SSD storage, Amazon AWS, MS Azure, Google cloud
Python Data Science S/W stack (pip, conda)
NumPy – Linear Algebra
Pandas – Data read / process
Scikit-Learn – ML algo
Matplotlib – Visualization
Seaborn – more aesthetically pleasing
Plotly – interactive visualization library
t-SNE – high dimensional visualization
statsmodels – statistical models
SciPy – optimization
Tkinter – GUI lib for python
PyTorch – open source framework
Keras – high level API and open source framework
TensorFlow - open source framework
Theano – multidim array manipulation
NLTK – human language data
BeautifulSoup – navigating webpage
Bokeh – interactive visualizations
TextBlob – process textual data
SHAP – SHapley Additive exPlanations
xAI – eXplainable AI
• IDE – Spyder, Jupyter Notebook, PyCharm, Google Colab
4
PROJECT
5
PROJECT
6
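import numpy as np
x = np.array([1, 2, 3])  # rank-1 array
a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])  # rank-2 array (illustrative values)
a.shape  # (rows, columns)
a[:2, 1:3]  # first 2 rows, columns 1 and 2
x.dtype  # data type, e.g. int64, float64
v, w = np.array([1, 2, 3]), np.array([4, 5])  # illustrative vectors
np.reshape(v, (3, 1)) * w  # column vector times row vector, broadcast to shape (3, 2)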
PROJECT
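import pandas as pd
df = pd.read_csv('data.csv')  # load a CSV file into a DataFrame
mydataset = {'x': [1.0, 2.0, None], 'y': [4.0, 5.0, 6.0]}  # illustrative values
df = pd.DataFrame(mydataset)  # or build a DataFrame from a dict of columns
df.head(10)  # first 10 rows
df.tail()  # last 5 rows
df.dropna()  # drop rows with missing values
df.corr()  # pairwise correlation (columns must be numeric)
df.plot()  # line plot of the numeric columns (needs Matplotlib)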
P R E R E Q U I S I T E S
• Probability and Statistics (r.v., prob distrib, statistic – mean,
median, mode, variance, s.d., covariance, Bayes' theorem,
entropy)
• Linear Algebra (matrix, vector, tensors, eigenvalue,
eigenvector)
• Calculus (functions, derivatives of single variable and
multivariate functions)
• Python language
• Structured thinking, communication and problem solving
• Business understanding
7
W H Y I S M L G E T T I N G A T T E N T I O N R E C E N T L Y
This development is driven by a few underlying forces:
• The amount of data generated is increasing significantly with reduction in the cost of
sensors
• The cost of storing this data has reduced significantly
• The cost of computing has come down significantly
• Cloud has democratized compute for the masses
8
FUTURE
M L V S A U T O M A T I O N
• If you are thinking that machine learning is nothing but a new name for automation – you
would be wrong. Most of the automation which has happened in the last few decades has
been rule-driven automation. For example – automating flows in our mailbox needs us to
define the rules. These rules act in the same manner every time.
• On the other hand, machine learning helps machines learn from past data and change their
decisions/performance accordingly. Spam detection in our mailboxes is driven by machine
learning. Hence, it continues to evolve with time.
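The contrast can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the messages, the fixed keyword rule and the word-count scoring are all made up here and are not how a real mail provider filters spam. The rule gives the same answer every time, while the learned filter changes its decision once it is retrained on newer data.

```python
from collections import Counter

def rule_based_is_spam(message):
    """Rule-driven automation: a fixed, hand-written rule."""
    return "lottery" in message.lower()

def train_word_scores(labelled_messages):
    """Count how often each word appears in spam versus non-spam messages."""
    spam_counts, ham_counts = Counter(), Counter()
    for text, is_spam in labelled_messages:
        target = spam_counts if is_spam else ham_counts
        target.update(text.lower().split())
    return spam_counts, ham_counts

def learned_is_spam(message, spam_counts, ham_counts):
    """Flag a message whose words were seen more often in spam than in non-spam."""
    words = message.lower().split()
    spam_score = sum(spam_counts[w] for w in words)
    ham_score = sum(ham_counts[w] for w in words)
    return spam_score > ham_score

# Hypothetical past data, followed by newer data containing a new kind of spam.
old_data = [
    ("win the lottery now", True),
    ("lottery prize waiting", True),
    ("meeting at noon", False),
    ("project report attached", False),
]
new_data = old_data + [
    ("claim your crypto bonus", True),
    ("crypto bonus inside", True),
]

message = "crypto bonus for you"
rule_verdict = rule_based_is_spam(message)  # the rule never adapts
verdict_before = learned_is_spam(message, *train_word_scores(old_data))
verdict_after = learned_is_spam(message, *train_word_scores(new_data))
print(rule_verdict, verdict_before, verdict_after)
```

The rule misses the new message in both cases; the learned filter misses it before retraining and catches it afterwards.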
9
PROJECT
D E F I N I T I O N
“A computer program is said to learn from experience E with respect to some class of tasks T and performance
measure P, if its performance at tasks in T, as measured by P, improves with experience E” - Tom Mitchell
“Machine learning enables a machine to automatically learn from data, improve performance from experiences,
and predict things without being explicitly programmed.”
“Machine learning is a subfield of artificial intelligence, which enables machines to learn from past data or
experiences without being explicitly programmed.”
“Science of getting computers to act without explicit programming” - commonly attributed to Arthur Samuel
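Tom Mitchell's definition can be made concrete with a toy learner. The sketch below is an assumption-laden illustration, not part of the original slides: the task T is deciding whether a number is at least 50, the experience E is a hypothetical stream of labelled numbers, and the performance measure P is accuracy on a fixed test set. The learner simply places a threshold midway between the largest negative and the smallest positive example it has seen.

```python
def learn_threshold(examples):
    """Experience E: labelled (value, label) pairs. Returns a decision threshold."""
    largest_negative = max(x for x, label in examples if label == 0)
    smallest_positive = min(x for x, label in examples if label == 1)
    return (largest_negative + smallest_positive) / 2

def accuracy(threshold, test_set):
    """Performance measure P: fraction of test points classified correctly."""
    correct = sum((x >= threshold) == bool(label) for x, label in test_set)
    return correct / len(test_set)

# Task T: decide whether a number is at least 50 (the rule is hidden from the learner).
test_set = [(x, int(x >= 50)) for x in range(100)]
stream = [(5, 0), (60, 1), (20, 0), (55, 1), (40, 0), (52, 1), (48, 0), (50, 1)]

accuracies = []
for n in (2, 4, 6, 8):  # growing amounts of experience
    threshold = learn_threshold(stream[:n])
    accuracies.append(accuracy(threshold, test_set))
print(accuracies)
```

With this particular stream the accuracy rises from 0.83 to 0.99 as more examples are seen, which is exactly "performance at T, as measured by P, improves with experience E".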
10
EXAM
S C I E N C E O F T E A C H I N G M A C H I N E S H O W T O L E A R N B Y S E L F
Eg. the task of mopping and cleaning the floor.
• When a human does the task – the quality of outcome would vary. The human would get exhausted / bored after a few hours of
work. The human would also get sick at times. Depending on the place – it could also be hazardous or risky for a human.
• Machines can do high frequency repetitive tasks with high accuracy without getting tired. On the other hand, if we can teach
machines to detect whether the floor needs cleaning and mopping and how much cleaning is required based on the condition of
the floor and the type of the floor, machines would be far better in doing the same job. They can go on to do that job without
getting tired or sick!
• This is what Machine Learning aims to do - enable machines to learn on their own.
• To do this, a machine must answer questions like:
• Does the floor need cleaning and mopping?
• For how long does the floor need to be cleaned?
• Machines need a way to think, and this is precisely where machine learning models help. The machine captures data from the
environment and feeds it to the machine learning model. The model then uses this data to predict whether the floor needs cleaning
and, if so, for how long.
11
H O W D O M A C H I N E S L E A R N
• Tasks difficult for humans can be very simple for machines. e.g. multiplying very large numbers.
• Tasks which look simple to humans can be very difficult for machines!
• You only need to demonstrate cleaning and mopping to a human a few times before they can perform it on
their own.
• But, that is not the case with machines. We need to collect a lot of data along with the desired outcomes in
order to teach machines to perform specific tasks.
• This is where machine learning comes into play. Machine Learning would help the machine understand the
kind of cleaning, the intensity of cleaning, and duration of cleaning based on the conditions and nature of the
floor.
12
T O O L S
Language
• R
• Python
• SAS
• Julia
• Java
• Scala
Database
• SQL
• Oracle
• Hadoop
Visualisation
• D3.js
• Tableau
• QlikView
13
FUTURE
T E R M I N O L O G Y
• Dataset (training, validation, testing)
• .csv file
• Structured vs unstructured data
• predictor, target, explanatory, independent, dependent, response variable
• Instance
• Features (numerical, discrete, categorical, ordinal, nominal)
• Model
• Hypothesis
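The terms above can be tied to a small in-memory dataset. This is a minimal sketch with made-up values; the column names ("area", "rooms", "price") and the 60/20/20 split are assumptions chosen only for illustration.

```python
import random

# Each dict is one instance; "area" and "rooms" are features (predictors),
# "price" is the target (response). All values are made up for illustration.
dataset = [
    {"area": 50 + 5 * i, "rooms": 1 + i % 4, "price": 100 + 12 * i}
    for i in range(10)
]

features = [[row["area"], row["rooms"]] for row in dataset]  # X
target = [row["price"] for row in dataset]                   # y

def split_indices(n, train=0.6, validation=0.2, seed=0):
    """Shuffle row indices and cut them into train / validation / test parts."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = round(n * train)
    n_val = round(n * validation)
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])

train_idx, val_idx, test_idx = split_indices(len(dataset))
print(len(train_idx), len(val_idx), len(test_idx))
```

The model is fitted on the training rows, compared and tuned on the validation rows, and judged once on the test rows.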
14
PROJECT
F E A T U R E S
15
PROJECT
T Y P E S
16
EXAM
T Y P E S
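• Supervised Learning – labelled data (binary and multi-class)
• Classification – discrete response, e.g. logistic regression (LoR), naive Bayes (NB), kNN,
SVM, decision tree (DT), random forest (RF), GBM, XGBoost (XGB), neural network (NN)
Eg. spam filtering, waste classification
• Regression – continuous response, e.g. linear regression (LR), SVR, decision tree regression (DTR),
random forest regression (RFR)
Eg. changes in temperature, stock price prediction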
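The difference between the two supervised settings can be sketched from scratch. The data below is made up, and the two learners (ordinary least squares for a straight line, and a 1-nearest-neighbour classifier) are just simple stand-ins for the regression and classification algorithms named above.

```python
def fit_line(xs, ys):
    """Simple linear regression by ordinary least squares: y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def nearest_neighbour_label(train_points, train_labels, query):
    """1-nearest-neighbour classification using squared Euclidean distance."""
    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    best = min(range(len(train_points)), key=lambda i: sq_dist(train_points[i], query))
    return train_labels[best]

# Regression: continuous response (made-up data lying on y = 2x + 1).
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])

# Classification: discrete response (two made-up features -> "spam" / "ham").
points = [(0.9, 0.8), (0.8, 0.9), (0.1, 0.2), (0.2, 0.1)]
labels = ["spam", "spam", "ham", "ham"]
predicted = nearest_neighbour_label(points, labels, (0.7, 0.95))
print(slope, intercept, predicted)
```

The regression returns numbers (slope 2 and intercept 1 for this data), while the classifier returns one of a fixed set of labels.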
17
EXAM
T Y P E S
• Unsupervised Learning - unlabelled
• Clustering eg. k means, hierarchical, NN
Eg. customer segmentation, city planning, cell phone tower for optimal signal reception
• Association eg. Apriori
Eg. diaper and beer, bread and milk
• Dimensionality Reduction eg. PCA, SVD
Eg. MNIST data (70000 × 784), face recognition (698 × 4096)
• Anomaly Detection eg. kNN, kMeans
Eg. Fraud detection, fault detection, outlier detection
• Semi supervised learning
• Speech Analysis, Web content classification, Google Expander
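Clustering needs no labels at all. As a rough sketch, here is k-means (Lloyd's algorithm) on one-dimensional data; the customer spend values and the choice of seeding the centroids with the first two points are assumptions made only so the example is small and repeatable.

```python
def k_means_1d(points, centroids, max_iterations=100):
    """Lloyd's algorithm on 1-D data: assign each point to the nearest centroid, then re-average."""
    centroids = list(centroids)
    for _ in range(max_iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        updated = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if updated == centroids:  # centroids stopped moving
            break
        centroids = updated
    return centroids, clusters

# Made-up monthly spend of six customers; the first two points seed the centroids.
spend = [1, 2, 3, 10, 11, 12]
centroids, clusters = k_means_1d(spend, centroids=spend[:2])
print(centroids, clusters)
```

For this data the algorithm settles on two segments, low spenders around 2 and high spenders around 11, without ever being told which customer belongs where.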
18
EXAM
T Y P E S
• Reinforcement Learning – maximise cumulative reward, e.g. Q-Learning, SARSA, DQN
Eg. robotic dog, Tic Tac Toe
• Neural Network eg. recognise dog
• Deep Learning eg. chat bot, real time bidding, recommender system
• Natural Language Processing eg. Lemmatisation, Stemming
Eg. customer service complaints, virtual assistant
• Computer Vision eg. Canny edge detection, Haar Cascade classifier
Eg. skin cancer diagnosis, detect real time traffic, guided surgery
• Evolutionary Learning (GA, Optimisation algorithms)
Eg. Super Mario
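To show what "maximise cumulative reward" means in practice, here is a minimal tabular Q-learning sketch. Everything about the environment is an assumption for illustration: a corridor of four states, a reward of 1 for reaching the last state, purely random exploration, and hand-picked values for the learning rate and discount factor.

```python
import random

N_STATES = 4          # states 0..3 in a corridor; state 3 is the goal
ACTIONS = (-1, +1)    # step left or step right
ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor (illustrative values)

def step(state, action):
    """Deterministic toy environment: reward 1 only when the goal is reached."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

rng = random.Random(0)
q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for _ in range(300):                      # episodes
    state = 0
    for _ in range(100):                  # cap on steps per episode
        action = rng.choice(ACTIONS)      # purely random exploration
        next_state, reward = step(state, action)
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        # Q-learning update: move the estimate toward reward + discounted best future value.
        q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
        state = next_state
        if state == N_STATES - 1:
            break

# Greedy policy: in each non-goal state pick the action with the highest Q-value.
greedy_policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(greedy_policy)
```

Although the agent only ever acts randomly while learning, the greedy policy read off the Q-table ends up stepping right, toward the reward, in every state.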
19
EXAM
I S S U E S I N M A C H I N E L E A R N I N G
• What algorithms exist for learning?
• When will an algorithm converge?
• Which algorithms perform best for which kinds of problems?
• How much data is sufficient? e.g. training to classify cats and dogs
• Non-representative training data, e.g. exit polls during elections
• Poor quality of data, e.g. outliers, missing values
• How many features are required? Irrelevant features
• Overfitting the training data
• Underfitting the training data
• Computation power? e.g. GPU and TPU for ML and DL
• Interpretability of the model? e.g. why a bank declined a customer's loan
• How to improve learning?
• Optimization vs generalization?
• New and better algorithms required
• Need for more data scientists
20
EXAM
P R O J E C T I D E A S ( 4 0 )
• Fraud detection
• Predict low oxygen level during surgery
• Recognise CVD factors
• Movie recommendation (Netflix)
• Marketing and Sales
• Weather prediction
• Traffic Prediction (Uber ATG)
• Loan defaulting prediction
• Handwriting recognition
• Sentiment analysis
• Human activity recognition
• Sports predictor
• Big Mart Sales prediction
• Fake news detection
• Disease prediction
• Stock market analysis
• Amazon Alexa
• Search Engine Optimization
• Auto-tagging and Friend suggestion (Facebook)
• Swiggy and Uber Eats
• House price prediction
• Market Analysis
• Handwritten digit recognition
• Equipment failure prediction
• Prospective insurance buyer
• Google News
• Video Surveillance
• Movie Ticket pricing system
• Object Detection
21
PROJECT
M L U S E C A S E I N S M A R T P H O N E S
• From the voice assistant that sets your alarm and finds you the best restaurants to the simple
use case of unlocking your phone via facial recognition – Machine Learning is truly
embedded in our favourite devices.
• Voice Assistants
• Smartphone Cameras
• App Store and Play Store Recommendations
• Face Unlock
22
EXAM
M L U S E C A S E I N T R A N S P O R T A T I O N
• The application of machine learning in the transport industry has gone to an entirely different
level in the last decade. This coincides with the rise of ride-hailing apps like Uber, Lyft, Ola,
etc. These companies use machine learning throughout their many products, from planning
optimal routes to deciding prices for the rides we take. So, let’s look at a few popular use
cases in transportation which use machine learning heavily.
• Dynamic Pricing in Travel
• Transporting and Commuting - Uber
• Google Maps
23
EXAM
M L U S E C A S E I N W E B S E R V I C E S
• We interact with certain applications every day multiple times. What we perhaps did not
realize until recently, most of these applications work thanks to the power and flexibility of
Machine Learning.
• Email Filtering
• Google Search
• Google Translate
• Facebook and LinkedIn Recommendations
24
EXAM
M L U S E C A S E I N S A L E S A N D M A R K E T I N G
• Top companies in the world are using Machine Learning to transform their strategies from top
to bottom. The two most impacted functions? Marketing and Sales!
• These days if you’re working in the Marketing and Sales field, you need to know at least one
Business Intelligence tool (like Tableau or Power BI). Additionally, marketers are expected to
know how to leverage Machine Learning in their day-to-day role to increase brand
awareness, improve the bottom line, etc.
• Recommendation Engine
• Personalized Marketing
• Customer Support (Chatbots)
25
EXAM
M L U S E C A S E I N F I N A N C I A L D O M A I N
• Most of the jobs in Machine Learning are geared towards the financial domain. And that
makes sense! This is the ultimate numbers field. A lot of banking institutions till recently used
to lean on Logistic Regression (a simple machine learning algorithm) to crunch these
numbers.
• Fraud Detection
• Personalized Banking
26
EXAM
S T E P S I N B U I L D I N G A M L A P P L I C A T I O N
• Frame the business problem as an ML problem
• What is the main objective? What are we trying to predict?
• What are the target features?
• What is the input data? Is it available?
• What kind of problem are we facing? Binary classification? Clustering?
• What is the expected improvement?
• Define performance metric
• Regression problems use evaluation metrics such as Mean Squared Error (MSE).
• Classification problems use evaluation metrics such as Precision, Accuracy and Recall.
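The metrics named above are short formulas. The sketch below computes them from scratch on made-up predictions; the function names are hypothetical helpers written for this example, not a library API.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision and recall for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

mse = mean_squared_error([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
accuracy, precision, recall = classification_metrics(
    y_true=[1, 0, 1, 1, 0, 0, 1, 0],
    y_pred=[1, 0, 0, 1, 1, 0, 1, 0],
)
print(mse, accuracy, precision, recall)
```

Precision asks "of everything predicted positive, how much was right?", recall asks "of everything actually positive, how much was found?"; which one matters more depends on the business problem framed in the first step.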
27
EXAM
S T E P S I N B U I L D I N G A M L A P P L I C A T I O N
• Gathering Data
• RSS feed, web scraping, API
• Generating Hypothesis
• Can our outputs be predicted given the inputs?
• Is our available data informative enough to learn the relationship between the inputs and the outputs?
• Exploratory Data Analysis (Visualisation for outlier)
• Data Preparation and cleaning (Missing Value)
• Delete irrelevant info or samples
• Missing value imputation
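The two cleaning options for missing values can be sketched as follows. This is a minimal illustration with made-up values in which `None` marks a missing entry; mean imputation is only one of several possible strategies.

```python
from statistics import mean

def impute_mean(column):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in column]

def drop_incomplete(rows):
    """Alternative strategy: discard any sample that has a missing value."""
    return [row for row in rows if None not in row]

ages = [22, None, 30, 26, None]
imputed_ages = impute_mean(ages)

rows = [(22, 50000), (None, 42000), (30, None), (26, 61000)]
complete_rows = drop_incomplete(rows)
print(imputed_ages, complete_rows)
```

Deleting is simple but throws data away (half of the rows here); imputing keeps every sample at the cost of inventing plausible values.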
28
EXAM
S T E P S I N B U I L D I N G A M L A P P L I C A T I O N
• Feature Engineering (Encoding, Transformation)
• Mapping Ordinal features
• Encoding Nominal class labels
• Normalization, Standardization
• Define benchmark / baseline model (kNN, NB)
• Choose model
• Train/build Model (train:validation:test)
• Shuffle for classification
• For weather prediction, stock price prediction etc. data should not be shuffled, as the sequence of data is a crucial feature.
• Evaluate Model for Optimal Hyperparameters (cross validation)
• Tune Model (Grid search, Randomized search)
• Model testing and Deployment for prediction
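The evaluate-and-tune steps above can be sketched without any library. The example assumes a made-up one-dimensional dataset and a hand-written kNN classifier, and it searches a small grid of candidate values for the number of neighbours using 4-fold cross-validation. Folds are taken by interleaving rows rather than shuffling; for time-ordered data such as weather or stock prices, splits that preserve the time order would be needed instead.

```python
from collections import Counter

def knn_predict(train, query, k):
    """Majority vote among the k nearest training points (1-D feature)."""
    nearest = sorted(train, key=lambda item: abs(item[0] - query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def cross_val_accuracy(data, k_neighbours, n_folds=4):
    """Average validation accuracy over n_folds interleaved folds."""
    scores = []
    for fold in range(n_folds):
        validation = data[fold::n_folds]
        train = [item for i, item in enumerate(data) if i % n_folds != fold]
        hits = sum(knn_predict(train, x, k_neighbours) == y for x, y in validation)
        scores.append(hits / len(validation))
    return sum(scores) / n_folds

# Made-up 1-D data: small values are class "a", large values are class "b",
# plus one mislabelled point (4.0, "b") acting as noise.
data = [(1.0, "a"), (8.0, "b"), (2.0, "a"), (9.0, "b"),
        (3.0, "a"), (10.0, "b"), (4.0, "b"), (11.0, "b"),
        (1.5, "a"), (8.5, "b"), (2.5, "a"), (9.5, "b")]

# Grid search: evaluate every candidate hyperparameter value with cross-validation.
grid = [1, 3, 5]
results = {k: cross_val_accuracy(data, k) for k in grid}
best_k = max(results, key=results.get)
print(results, best_k)
```

The selected value would then be used to retrain on all training data before the final check on the held-out test set.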
29
EXAM
C H O I C E O F R I G H T A L G O R I T H M
30
EXAM
S T E P S F O R S E L E C T I N G R I G H T M L A L G O
• Understand your Data
• Type of data will decide algorithm
• Algo will decide no. of samples
Eg. NB will work with categorical data and is not sensitive to missing data
• Stats and Visualization to know your data
• Percentile helps to identify outlier, median to identify central tendency
• Box plot (outlier), Histogram (spread), Scatter plot (bivariate relationship)
• Clean data w.r.t Missing value
• Feature Engineering
• Encoding
• Feature creation
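A few of the steps above, sketched with the standard library only. The data is made up, and the 1.5 × IQR rule is one common convention for flagging outliers from percentiles, assumed here for illustration.

```python
from statistics import median, quantiles

def iqr_outliers(values):
    """Flag values more than 1.5 * IQR outside the first and third quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in values if v < low or v > high]

def one_hot(values):
    """Encode a nominal feature as one 0/1 column per category."""
    categories = sorted(set(values))
    return [[int(v == c) for c in categories] for v in values]

def min_max_scale(values):
    """Rescale a numeric feature to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

ages = [12, 13, 14, 15, 16, 17, 18, 95]
outliers = iqr_outliers(ages)   # percentiles identify the outlier
centre = median(ages)           # median describes the central tendency

size_order = {"S": 0, "M": 1, "L": 2}             # ordinal feature: order matters
sizes = [size_order[s] for s in ["M", "S", "L"]]
colours = one_hot(["red", "green", "red"])        # nominal feature: no order
scaled = min_max_scale([10, 20, 30])
print(outliers, centre, sizes, colours, scaled)
```

The median (15.5) is barely affected by the value 95, which is why it is preferred over the mean when outliers are present.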
31
EXAM
S T E P S F O R S E L E C T I N G R I G H T M L A L G O
• Categorize the problem
• By I/P (supervised, unsupervised)
• By O/P (regression, classification, clustering, anomaly detection)
• Understand constraints (data storage capacity, real time applications, fast learning)
• Look for available algorithms (business goals met?, preprocessing required?, accuracy?, explainability?,
speed?, scalable?)
• Try each, assess and compare
• Optimize
• Evaluate performance
• Repeat if required
32
EXAM
C H O I C E O F M O D E L ( U S E C A S E )
• Linear Regression: unstable with redundant (collinear) features
Eg. Sales prediction, Time for commuting
• Logistic Regression: not blackbox, works with correlated features
Eg. Fraud detection, Customer churn prediction
• Decision Tree: can handle outliers but tends to overfit and can take a lot of memory
Eg. Bank loan defaulter, Investment decision
• SVM: memory intensive, hard to interpret and difficult to tune
Eg. Text classification, Handwritten character recognition
• NB: less training data required, low memory requirement, faster
Eg. Sentiment analysis, Recommender systems
• RF: works well with large data and high dimension
Eg. Predict loan defaulters, Predict patients for high risk
• NN: resource and memory intensive
Eg. Object Recognition, Natural Language Translation
• K-means: groups unlabelled data; the number of groups k is not known beforehand and must be chosen
Eg. Customer Segmentation, Crime locality identification
• PCA: dimensionality reduction
Eg. MNIST digits
33
PROJECT
C H O I C E O F M E T R I C
• Regression
• Mean Squared Error (MSE), Root MSE (RMSE), R-squared (R²)
• Mean Absolute Error (MAE) if outliers are present
• Classification
• Accuracy, LogLoss, ROC-AUC, Precision, Recall
• Kappa score, MCC
• Unsupervised
• Mutual Information
• Rand index
• Reinforcement Learning
• Dispersion across Time
• Risk across Time
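Why Mean Absolute Error is suggested when outliers are present can be seen on a tiny made-up example. The helper functions below are hypothetical and written from the textbook formulas.

```python
from math import sqrt

def mae(y_true, y_pred):
    """Mean Absolute Error: average size of the errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root Mean Squared Error: squaring makes large errors dominate."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r_squared(y_true, y_pred):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

y_true = [10.0, 12.0, 14.0, 16.0]
small_errors = [11.0, 11.0, 15.0, 15.0]  # every prediction is off by 1
one_outlier = [10.0, 12.0, 14.0, 26.0]   # three exact predictions, one off by 10

print(mae(y_true, small_errors), rmse(y_true, small_errors))
print(mae(y_true, one_outlier), rmse(y_true, one_outlier))
print(r_squared(y_true, small_errors))
```

With uniformly small errors MAE and RMSE agree (both 1.0); with a single large error RMSE (5.0) is twice the MAE (2.5), so RMSE-based model selection is driven by that one point.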
34
PROJECT
P R O J E C T L A B O R I E N T A T I O N
Installing Anaconda and Python
Step 1: Download Anaconda Python: www.anaconda.com/distribution/
Step 2: Install Anaconda Python (Python 3.7 version): double-click the ".exe" file of
Anaconda
Step 3: Open Anaconda Navigator: use Anaconda Navigator to launch a Python IDE such as
Spyder or Jupyter Notebook
Step 4: Close the Spyder/Jupyter Notebook IDE.
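After installation, a quick check from Spyder or a Jupyter cell can confirm that the main packages are visible to Python. This is a small sketch; the list of import names is an assumption based on the software stack listed earlier in this deck and can be extended as needed.

```python
from importlib.util import find_spec

# Import names of some packages from the stack listed earlier in this deck.
PACKAGES = ["numpy", "pandas", "sklearn", "matplotlib", "seaborn", "scipy"]

def installed_packages(names):
    """Map each import name to True if it can be found in the current environment."""
    return {name: find_spec(name) is not None for name in names}

status = installed_packages(PACKAGES)
for name, available in status.items():
    print(f"{name:12s} {'ok' if available else 'missing'}")
```

Anything reported as missing can be installed with conda or pip, the two installers named in the software requirements.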
https://colab.research.google.com
https://github.com
35
PROJECT
P R O J E C T T A S K L I S T
Study tool for implementation
Project title and Course identification
Choose data (Understand Domain and data)
Perform EDA
Perform Feature Engineering
Choose model
Train and Validate model
Tune Hyperparameters
Test and Evaluate model
Prepare Report
Prepare Technical Paper
Present Case Study
36
PROJECT
E X P E C T A T I O N S
Case Study Presentation
Mini Project
Technical Paper
Report
Competition (Inhouse, Online)
37
PROJECT
C A S E S T U D Y T I T L E S ( 3 1 )
MNIST
MS-COCO
ImageNet
CIFAR
IMDB Reviews
WordNet
Twitter Sentiment Analysis
BreastCancer Wisconsin
BBC News
Wheat seeds
Amazon Reviews
Facial Image
Spam SMS
YouTube
Chars74K
WineQuality
IrisFlowers
LabelMe
HoTPotQA
Ionosphere
Xview
US Census
Boston House Price
BankNote authentication
PIMA Indian Diabetes
BBC Sport
Titanic
Santander Product Recommendation
Sonar
Swedish Auto Insurance
Abalone
38
PROJECT
B O O K S A N D D A T A S E T R E S O U R C E S
• https://www.kaggle.com/datasets
• https://archive.ics.uci.edu/ml/index.php
• https://registry.opendata.aws/
• https://toolbox.google.com/datasetsearch
• https://msropendata.com/
• https://github.com/awesomedata/awesome-public-datasets
• Indian Government dataset
• US Government Dataset
• Northern Ireland Public Sector Datasets
• European Union Open Data Portal
• https://scikit-learn.org/stable/datasets/index.html
• https://data.world
• http://archive.ics.uci.edu/ml/datasets
• https://www.ehdp.com/vitalnet/datasets.htm
• https://www.data.gov/health/
• “Python Machine Learning”, Sebastian Raschka, Packt
publishing
• “Machine Learning In Action”, Peter Harrington,
DreamTech Press
• “Introduction to Machine Learning” Ethem Alpaydın,
MIT Press
• “Machine Learning” Tom M. Mitchell, McGraw Hill
• “Machine Learning - An Algorithmic Perspective”
Stephen Marsland, CRC Press
• “Machine Learning ― A Probabilistic Perspective”
Kevin P. Murphy, MIT Press
• “Pattern Recognition and Machine Learning”,
Christopher M. Bishop, Springer
• “Elements of Statistical Learning” Trevor Hastie,
Robert Tibshirani, Jerome Friedman, Springer
39
L E A R N I N G R E S O U R C E S
• https://www.analyticsvidhya.com
• https://towardsdatascience.com
• https://analyticsindiamag.com
• https://machinelearningmastery.com
• https://www.datacamp.com
• https://www.superdatascience.com
• https://www.elitedatascience.com
• https://medium.com
• Siraj Raval YouTube channel
• https://mlcontests.com
• https://www.datasciencechallenge.net
• https://www.machinehack.com
• https://www.hackerearth.com
• www.kaggle.com/competitions
• www.smartindiahackathon.gov.in
• www.datahack.analyticsvidhya.com
• www.daretocompete.com
• https://github.com
40
W H Y ?
41
S U M M A R Y ( S U M M A T I V E A S S E S S M E N T )
• Examine steps in developing Machine Learning application with respect to your mini project. [10]
• Review the issues in Machine Learning. [10]
• State applicable use case for each ML algorithm. [10]
• Examine Applications of AI. [10]
• Illustrate steps for selecting right ML algorithm. [10]
• Define ML and differentiate between Supervised, Unsupervised and Reinforcement learning with the help of suitable examples. [10]
• Explain ML w.r.t. identifying Tasks, Experience and Performance measure (Tom Mitchell). [10]
• designing a checkers learning problem
• designing a handwriting recognition learning problem
• designing a Robot driving learning problem
• Illustrate with example how Supervised learning can be used in handling loan defaulters. [10]
• Explain Supervised Learning with neat diagram. [10]
42
EXAM
Q U E R I E S
?
T H A N K
Y O U
43

More Related Content

Similar to ML MODULE 1_slideshare.pdf

UKSG 2024 - Demystifying AI - Evaluating future uses and limits in library co...
UKSG 2024 - Demystifying AI - Evaluating future uses and limits in library co...UKSG 2024 - Demystifying AI - Evaluating future uses and limits in library co...
UKSG 2024 - Demystifying AI - Evaluating future uses and limits in library co...
UKSG: connecting the knowledge community
 
Engineering Intelligent Systems using Machine Learning
Engineering Intelligent Systems using Machine Learning Engineering Intelligent Systems using Machine Learning
Engineering Intelligent Systems using Machine Learning
Saurabh Kaushik
 
intro to ML by the way m toh phasee movie Punjabi
intro to ML by the way m toh phasee movie Punjabiintro to ML by the way m toh phasee movie Punjabi
intro to ML by the way m toh phasee movie Punjabi
botvillain45
 
AI Presentation 1
AI Presentation 1AI Presentation 1
AI Presentation 1
Mustafa Kuğu
 
chapter1-introduction1.ppt
chapter1-introduction1.pptchapter1-introduction1.ppt
chapter1-introduction1.ppt
SeshuSrinivas2
 
The Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it WorkThe Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it Work
Ivo Andreev
 
Useful Techniques in Artificial Intelligence
Useful Techniques in Artificial IntelligenceUseful Techniques in Artificial Intelligence
Useful Techniques in Artificial Intelligence
Ila Group
 
Artificial intelligence
Artificial intelligenceArtificial intelligence
Artificial intelligence
Gangeshwar Krishnamurthy
 
Machine learning a developer's perspective
Machine learning   a developer's perspectiveMachine learning   a developer's perspective
Machine learning a developer's perspective
Rupak Chakraborty
 
AutoML lectures (ACDL 2019)
AutoML lectures (ACDL 2019)AutoML lectures (ACDL 2019)
AutoML lectures (ACDL 2019)
Joaquin Vanschoren
 
The Machine Learning Workflow with Azure
The Machine Learning Workflow with AzureThe Machine Learning Workflow with Azure
The Machine Learning Workflow with Azure
Ivo Andreev
 
Lecture 5 machine learning updated
Lecture 5   machine learning updatedLecture 5   machine learning updated
Lecture 5 machine learning updated
Vajira Thambawita
 
Machine learning introduction to unit 1.ppt
Machine learning introduction to unit 1.pptMachine learning introduction to unit 1.ppt
Machine learning introduction to unit 1.ppt
ShivaShiva783981
 
Essential concepts for machine learning
Essential concepts for machine learning Essential concepts for machine learning
Essential concepts for machine learning
pyingkodi maran
 
The Art of Intelligence – A Practical Introduction Machine Learning for Oracl...
The Art of Intelligence – A Practical Introduction Machine Learning for Oracl...The Art of Intelligence – A Practical Introduction Machine Learning for Oracl...
The Art of Intelligence – A Practical Introduction Machine Learning for Oracl...
Lucas Jellema
 
Deep learning
Deep learningDeep learning
Deep learning
AnimaSinghDhabal
 
Machine learning for IoT - unpacking the blackbox
Machine learning for IoT - unpacking the blackboxMachine learning for IoT - unpacking the blackbox
Machine learning for IoT - unpacking the blackbox
Ivo Andreev
 
OpenML data@Sheffield
OpenML data@SheffieldOpenML data@Sheffield
OpenML data@Sheffield
Joaquin Vanschoren
 
MACHINE LEARNING PRESENTATION (ARTIFICIAL INTELLIGENCE)
MACHINE LEARNING PRESENTATION (ARTIFICIAL INTELLIGENCE)MACHINE LEARNING PRESENTATION (ARTIFICIAL INTELLIGENCE)
MACHINE LEARNING PRESENTATION (ARTIFICIAL INTELLIGENCE)
MAHIRA
 
Deep learning with tensorflow
Deep learning with tensorflowDeep learning with tensorflow
Deep learning with tensorflow
Charmi Chokshi
 

Similar to ML MODULE 1_slideshare.pdf (20)

UKSG 2024 - Demystifying AI - Evaluating future uses and limits in library co...
UKSG 2024 - Demystifying AI - Evaluating future uses and limits in library co...UKSG 2024 - Demystifying AI - Evaluating future uses and limits in library co...
UKSG 2024 - Demystifying AI - Evaluating future uses and limits in library co...
 
Engineering Intelligent Systems using Machine Learning
Engineering Intelligent Systems using Machine Learning Engineering Intelligent Systems using Machine Learning
Engineering Intelligent Systems using Machine Learning
 
intro to ML by the way m toh phasee movie Punjabi
intro to ML by the way m toh phasee movie Punjabiintro to ML by the way m toh phasee movie Punjabi
intro to ML by the way m toh phasee movie Punjabi
 
AI Presentation 1
AI Presentation 1AI Presentation 1
AI Presentation 1
 
chapter1-introduction1.ppt
chapter1-introduction1.pptchapter1-introduction1.ppt
chapter1-introduction1.ppt
 
The Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it WorkThe Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it Work
 
Useful Techniques in Artificial Intelligence
Useful Techniques in Artificial IntelligenceUseful Techniques in Artificial Intelligence
Useful Techniques in Artificial Intelligence
 
Artificial intelligence
Artificial intelligenceArtificial intelligence
Artificial intelligence
 
Machine learning a developer's perspective
Machine learning   a developer's perspectiveMachine learning   a developer's perspective
Machine learning a developer's perspective
 
AutoML lectures (ACDL 2019)
AutoML lectures (ACDL 2019)AutoML lectures (ACDL 2019)
AutoML lectures (ACDL 2019)
 
The Machine Learning Workflow with Azure
The Machine Learning Workflow with AzureThe Machine Learning Workflow with Azure
The Machine Learning Workflow with Azure
 
Lecture 5 machine learning updated
Lecture 5   machine learning updatedLecture 5   machine learning updated
Lecture 5 machine learning updated
 
Machine learning introduction to unit 1.ppt
Machine learning introduction to unit 1.pptMachine learning introduction to unit 1.ppt
Machine learning introduction to unit 1.ppt
 
Essential concepts for machine learning
Essential concepts for machine learning Essential concepts for machine learning
Essential concepts for machine learning
 
The Art of Intelligence – A Practical Introduction Machine Learning for Oracl...
The Art of Intelligence – A Practical Introduction Machine Learning for Oracl...The Art of Intelligence – A Practical Introduction Machine Learning for Oracl...
The Art of Intelligence – A Practical Introduction Machine Learning for Oracl...
 
Deep learning
Deep learningDeep learning
Deep learning
 
Machine learning for IoT - unpacking the blackbox
Machine learning for IoT - unpacking the blackboxMachine learning for IoT - unpacking the blackbox
Machine learning for IoT - unpacking the blackbox
 
OpenML data@Sheffield
OpenML data@SheffieldOpenML data@Sheffield
OpenML data@Sheffield
 
MACHINE LEARNING PRESENTATION (ARTIFICIAL INTELLIGENCE)
MACHINE LEARNING PRESENTATION (ARTIFICIAL INTELLIGENCE)MACHINE LEARNING PRESENTATION (ARTIFICIAL INTELLIGENCE)
MACHINE LEARNING PRESENTATION (ARTIFICIAL INTELLIGENCE)
 
Deep learning with tensorflow
Deep learning with tensorflowDeep learning with tensorflow
Deep learning with tensorflow
 

More from Shiwani Gupta

ML MODULE 6.pdf
ML MODULE 6.pdfML MODULE 6.pdf
ML MODULE 6.pdf
Shiwani Gupta
 
ML MODULE 5.pdf
ML MODULE 5.pdfML MODULE 5.pdf
ML MODULE 5.pdf
Shiwani Gupta
 
ML MODULE 4.pdf
ML MODULE 4.pdfML MODULE 4.pdf
ML MODULE 4.pdf
Shiwani Gupta
 
module6_stringmatchingalgorithm_2022.pdf
module6_stringmatchingalgorithm_2022.pdfmodule6_stringmatchingalgorithm_2022.pdf
module6_stringmatchingalgorithm_2022.pdf
Shiwani Gupta
 
module5_backtrackingnbranchnbound_2022.pdf
module5_backtrackingnbranchnbound_2022.pdfmodule5_backtrackingnbranchnbound_2022.pdf
module5_backtrackingnbranchnbound_2022.pdf
Shiwani Gupta
 
module4_dynamic programming_2022.pdf
module4_dynamic programming_2022.pdfmodule4_dynamic programming_2022.pdf
module4_dynamic programming_2022.pdf
Shiwani Gupta
 
module3_Greedymethod_2022.pdf
module3_Greedymethod_2022.pdfmodule3_Greedymethod_2022.pdf
module3_Greedymethod_2022.pdf
Shiwani Gupta
 
module2_dIVIDEncONQUER_2022.pdf
module2_dIVIDEncONQUER_2022.pdfmodule2_dIVIDEncONQUER_2022.pdf
module2_dIVIDEncONQUER_2022.pdf
Shiwani Gupta
 
module1_Introductiontoalgorithms_2022.pdf
module1_Introductiontoalgorithms_2022.pdfmodule1_Introductiontoalgorithms_2022.pdf
module1_Introductiontoalgorithms_2022.pdf
Shiwani Gupta
 
ML MODULE 2.pdf
ML MODULE 2.pdfML MODULE 2.pdf
ML MODULE 2.pdf
Shiwani Gupta
 
ML Module 3.pdf
ML Module 3.pdfML Module 3.pdf
ML Module 3.pdf
Shiwani Gupta
 
Problem formulation
Problem formulationProblem formulation
Problem formulation
Shiwani Gupta
 
Simplex method
Simplex methodSimplex method
Simplex method
Shiwani Gupta
 
Functionsandpigeonholeprinciple
FunctionsandpigeonholeprincipleFunctionsandpigeonholeprinciple
Functionsandpigeonholeprinciple
Shiwani Gupta
 
Relations
RelationsRelations
Relations
Shiwani Gupta
 
Logic
LogicLogic
Set theory
Set theorySet theory
Set theory
Shiwani Gupta
 
Uncertain knowledge and reasoning
Uncertain knowledge and reasoningUncertain knowledge and reasoning
Uncertain knowledge and reasoning
Shiwani Gupta
 
Introduction to ai
Introduction to aiIntroduction to ai
Introduction to ai
Shiwani Gupta
 
Planning Agent
Planning AgentPlanning Agent
Planning Agent
Shiwani Gupta
 

More from Shiwani Gupta (20)

ML MODULE 6.pdf
ML MODULE 6.pdfML MODULE 6.pdf
ML MODULE 6.pdf
 
ML MODULE 5.pdf
ML MODULE 5.pdfML MODULE 5.pdf
ML MODULE 5.pdf
 
ML MODULE 4.pdf
ML MODULE 4.pdfML MODULE 4.pdf
ML MODULE 4.pdf
 
module6_stringmatchingalgorithm_2022.pdf
module6_stringmatchingalgorithm_2022.pdfmodule6_stringmatchingalgorithm_2022.pdf
module6_stringmatchingalgorithm_2022.pdf
 
module5_backtrackingnbranchnbound_2022.pdf
module5_backtrackingnbranchnbound_2022.pdfmodule5_backtrackingnbranchnbound_2022.pdf
module5_backtrackingnbranchnbound_2022.pdf
 
module4_dynamic programming_2022.pdf
module4_dynamic programming_2022.pdfmodule4_dynamic programming_2022.pdf
module4_dynamic programming_2022.pdf
 
module3_Greedymethod_2022.pdf
module3_Greedymethod_2022.pdfmodule3_Greedymethod_2022.pdf
module3_Greedymethod_2022.pdf
 
module2_dIVIDEncONQUER_2022.pdf
module2_dIVIDEncONQUER_2022.pdfmodule2_dIVIDEncONQUER_2022.pdf
module2_dIVIDEncONQUER_2022.pdf
 
module1_Introductiontoalgorithms_2022.pdf
module1_Introductiontoalgorithms_2022.pdfmodule1_Introductiontoalgorithms_2022.pdf
module1_Introductiontoalgorithms_2022.pdf
 
ML MODULE 2.pdf
ML MODULE 2.pdfML MODULE 2.pdf
ML MODULE 2.pdf
 
ML Module 3.pdf
ML Module 3.pdfML Module 3.pdf
ML Module 3.pdf
 
Problem formulation
Problem formulationProblem formulation
Problem formulation
 
Simplex method
Simplex methodSimplex method
Simplex method
 
Functionsandpigeonholeprinciple
FunctionsandpigeonholeprincipleFunctionsandpigeonholeprinciple
Functionsandpigeonholeprinciple
 
Relations
RelationsRelations
Relations
 
Logic
LogicLogic
Logic
 
Set theory
Set theorySet theory
Set theory
 
Uncertain knowledge and reasoning
Uncertain knowledge and reasoningUncertain knowledge and reasoning
Uncertain knowledge and reasoning
 
Introduction to ai
Introduction to aiIntroduction to ai
Introduction to ai
 
Planning Agent
Planning AgentPlanning Agent
Planning Agent
 

ML MODULE 1_slideshare.pdf

  • 1. SHIWANI GUPTA, MACHINE LEARNING
  • 2. SYLLABUS
      • Introduction to Machine Learning (1) 6: Machine Learning terminology, Types of Machine Learning, Issues in Machine Learning, Application of Machine Learning, Steps in developing ML application, How to choose the right algorithm
      • Data Preprocessing (3) 10: Data Cleaning (missing value, outlier), Exploratory Data Analysis (descriptive statistics, Visualization), Feature Engineering (Data Transformation (encoding, skew, scale), Feature selection)
      • Supervised Learning with Regression (1) 5: Simple Linear, Multiple Linear, Polynomial, Overfit/Underfit, Regularization, Evaluation Metric, Use case
      • Supervised Learning with Classification (3) 12: k Nearest Neighbor, Logistic Regression, Linear SVM, Kernels, Decision Tree (CART), Issues in DT learning, Ensembles (Bagging – Random Forest, Boosting – Gradient Boost), Evaluation metric, Use case
      • Optimization Techniques (2) 6: Model Selection techniques (Cross Validation), Gradient Descent Algorithm, Grid Search method, Model Evaluation technique (Bias, Variance)
      • Unsupervised Learning with clustering and Reinforcement Learning (2) 6: k Means algorithm, Dimensionality Reduction, Use case, Elements of Reinforcement Learning, Temporal Difference Learning, Online Learning, Use case
  • 3. MODULE 1 (6 HOUR)
      • Machine Learning terminology
      • Types of Machine Learning
      • Issues in Machine Learning
      • Application of Machine Learning
      • Steps in developing ML application
      • How to choose the right algorithm
  • 4. S/W AND H/W REQUIREMENT (PROJECT)
      • Hardware: 16+ GB RAM, 4+ cores, SSD storage, Amazon AWS, MS Azure, Google Cloud
      • Python Data Science S/W stack (pip, conda):
          NumPy – linear algebra
          Pandas – data read / process
          Scikit-Learn – ML algorithms
          Matplotlib – visualization
          Seaborn – more aesthetically pleasing visualization
          Plotly – interactive visualization library
          tsne – high-dimensional visualization
          statsmodels – statistical models
          SciPy – optimization
          Tkinter – GUI library for Python
          PyTorch – open source framework
          Keras – high-level API and open source framework
          TensorFlow – open source framework
          Theano – multidimensional array manipulation
          NLTK – human language data
          BeautifulSoup – navigating web pages
          Bokeh – interactive visualizations
          TextBlob – processing textual data
          SHAP – SHapley Additive exPlanations
          xAI – eXplainable AI
      • IDE: Spyder, Jupyter Notebook, PyCharm, Google Colab
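  • 6. (PROJECT)
      import numpy as np
      np.array([1, 2, 3])          # rank-1 array
      b.shape                      # (rows, columns)
      a[:2, 1:3]                   # first 2 rows, columns 1 and 2
      x.dtype                      # data type, e.g. int64, float64
      np.reshape(v, (3, 1)) * w    # reshape v into a column, then multiply by w with broadcasting

      import pandas as pd
      pd.read_csv('data.csv')      # load a CSV file into a DataFrame
      pd.DataFrame(mydataset)      # build a DataFrame from an existing dataset (e.g. a dict)
      df.head(10)                  # first 10 rows
      df.tail()                    # last 5 rows
      df.dropna()                  # drop rows with missing values
      df.corr()                    # pairwise correlation between columns
      df.plot()                    # quick line plot (uses Matplotlib)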
  • 7. PREREQUISITES
      • Probability and Statistics (random variables, probability distributions, statistics: mean, median, mode, variance, standard deviation, covariance, Bayes' theorem, entropy)
      • Linear Algebra (matrices, vectors, tensors, eigenvalues, eigenvectors)
      • Calculus (functions, derivatives of single-variable and multivariate functions)
      • Python language
      • Structured thinking, communication and problem solving
      • Business understanding
  • 8. WHY IS ML GETTING ATTENTION RECENTLY (FUTURE)
    This development is driven by a few underlying forces:
      • The amount of data generated is increasing significantly as the cost of sensors falls
      • The cost of storing this data has reduced significantly
      • The cost of computing has come down significantly
      • Cloud has democratized compute for the masses
  • 9. ML VS AUTOMATION (PROJECT)
      • If you think machine learning is just a new name for automation, you would be wrong. Most of the automation of the last few decades has been rule-driven. For example, automating flows in our mailbox requires us to define the rules, and these rules act in the same manner every time.
      • Machine learning, on the other hand, helps machines learn from past data and change their decisions/performance accordingly. Spam detection in our mailboxes is driven by machine learning, so it continues to evolve with time.
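The contrast between a fixed rule and a learned filter can be sketched in a few lines. This is a toy illustration only: the messages, the `lottery` rule and the word-counting "learner" are all hypothetical and far simpler than a real spam filter.

```python
from collections import Counter

def rule_based_is_spam(message):
    # Hand-written rule: it behaves the same way every time.
    return "lottery" in message.lower()

def learn_spam_words(labelled_messages):
    # Count in how many spam / non-spam messages each word appears.
    spam_counts, ham_counts = Counter(), Counter()
    for text, is_spam in labelled_messages:
        target = spam_counts if is_spam else ham_counts
        target.update(set(text.lower().split()))
    # Keep the words that were seen more often in spam than in non-spam.
    return {word for word, count in spam_counts.items() if count > ham_counts[word]}

def learned_is_spam(message, spam_words):
    return any(word in spam_words for word in message.lower().split())

# Hypothetical past data: (message, was it spam?)
history = [
    ("win the lottery now", True),
    ("cheap pills offer", True),
    ("meeting at noon", False),
    ("lunch offer from the canteen", False),
]
spam_words = learn_spam_words(history)

new_message = "cheap pills available"
rule_result = rule_based_is_spam(new_message)              # the rule never saw "pills"
learned_result = learned_is_spam(new_message, spam_words)  # learned from past data
```

Adding more labelled messages to `history` changes `spam_words` without touching the code, which is the sense in which the learned filter keeps evolving.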
  • 10. DEFINITION (EXAM)
      • “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E” - Tom Mitchell
      • “Machine learning enables a machine to automatically learn from data, improve performance from experiences, and predict things without being explicitly programmed.”
      • “Machine learning is a subfield of artificial intelligence, which enables machines to learn from past data or experiences without being explicitly programmed.”
      • “Science of getting computers to act without explicit programming” - Arthur Samuel
  • 11. SCIENCE OF TEACHING MACHINES HOW TO LEARN BY SELF
    E.g. the task of mopping and cleaning the floor.
      • When a human does the task, the quality of the outcome varies. The human gets exhausted or bored after a few hours of work and also gets sick at times. Depending on the place, the task could also be hazardous or risky for a human.
      • Machines can do high-frequency repetitive tasks with high accuracy without getting tired. If we can also teach machines to detect whether the floor needs cleaning and mopping, and how much cleaning is required based on the condition and type of the floor, machines would be far better at the same job. They can keep doing it without getting tired or sick!
      • This is what Machine Learning aims to do: enable machines to learn on their own, in order to answer questions like:
          • Does the floor need cleaning and mopping?
          • How long does the floor need to be cleaned?
      • Machines need a way to think, and this is precisely where machine learning models help. The machine captures data from the environment and feeds it to the machine learning model. The model then uses this data to predict whether the floor needs cleaning and for how long.
  • 12. HOW DO MACHINES LEARN
      • Tasks difficult for humans can be very simple for machines, e.g. multiplying very large numbers.
      • Tasks which look simple to humans can be very difficult for machines!
      • You only need to demonstrate cleaning and mopping to a human a few times before they can perform it on their own.
      • That is not the case with machines. We need to collect a lot of data along with the desired outcomes in order to teach machines to perform specific tasks.
      • This is where machine learning comes into play. Machine Learning helps the machine understand the kind, intensity and duration of cleaning based on the condition and nature of the floor.
  • 13. TOOLS (FUTURE)
      • Language: R, Python, SAS, Julia, Java, Scala
      • Database: SQL, Oracle, Hadoop
      • Visualisation: D3.js, Tableau, QlikView
  • 14. TERMINOLOGY (PROJECT)
      • Dataset (training, validation, testing)
      • .csv file
      • Structured vs unstructured data
      • predictor, target, explanatory, independent, dependent, response variable
      • Instance
      • Features (numerical, discrete, categorical, ordinal, nominal)
      • Model
      • Hypothesis
  • 15. FEATURES (PROJECT)
  • 16. TYPES (EXAM)
  • 17. TYPES (EXAM)
      • Supervised Learning: labelled data (binary and multi-class)
          • Classification: discrete response. Algorithms: LoR, NB, kNN, SVM, DT, RF, GBM, XGB, NN. Examples: spam filtering, waste classification
          • Regression: continuous response. Algorithms: LR, SVR, DTR, RFR. Examples: changes in temperature, stock price prediction
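A minimal sketch of the two supervised settings, assuming scikit-learn is installed. The six data points and both targets are made up for illustration: the classifier predicts a discrete label, the regressor a continuous value.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsClassifier

# One numeric feature for six labelled instances (hypothetical data).
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])

# Classification: the response is a discrete class (0 or 1).
y_class = np.array([0, 0, 0, 1, 1, 1])
clf = KNeighborsClassifier(n_neighbors=1).fit(X, y_class)
predicted_label = int(clf.predict([[5.5]])[0])

# Regression: the response is continuous (here exactly y = 2x + 1).
y_value = 2.0 * X.ravel() + 1.0
reg = LinearRegression().fit(X, y_value)
predicted_value = float(reg.predict([[7.0]])[0])
```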
  • 18. TYPES (EXAM)
      • Unsupervised Learning: unlabelled data
          • Clustering. Algorithms: k means, hierarchical, NN. Examples: customer segmentation, city planning, cell phone tower placement for optimal signal reception
          • Association. Algorithms: Apriori. Examples: diaper and beer, bread and milk
          • Dimensionality Reduction. Algorithms: PCA, SVD. Examples: MNIST data (70000 x 784), face recognition (698 x 4096)
          • Anomaly Detection. Algorithms: kNN, kMeans. Examples: fraud detection, fault detection, outlier detection
      • Semi-supervised learning
          • Examples: speech analysis, web content classification, Google Expander
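Clustering and dimensionality reduction can be sketched as follows, assuming scikit-learn is installed. The six 2-D points are hypothetical and form two obvious groups; no labels are given to either algorithm.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# Unlabelled data: two well-separated groups of points (made-up values).
points = np.array([[1.0, 1.1], [0.9, 1.0], [1.1, 0.9],
                   [8.0, 8.1], [7.9, 8.0], [8.1, 7.9]])

# Clustering: k-means assigns each point to one of k groups.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
labels = kmeans.labels_

# Dimensionality reduction: PCA projects the 2-D points onto 1 component.
pca = PCA(n_components=1)
reduced = pca.fit_transform(points)
explained = pca.explained_variance_ratio_[0]  # share of variance kept
```

Which group is numbered 0 and which 1 is arbitrary; only the grouping itself is meaningful.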
  • 19. TYPES (EXAM)
      • Reinforcement Learning: maximise cumulative reward. Algorithms: Q-Learning, SARSA, DQN. Examples: robotic dog, Tic Tac Toe
      • Neural Network. Example: recognise a dog
      • Deep Learning. Examples: chat bot, real-time bidding, recommender system
      • Natural Language Processing. Techniques: lemmatisation, stemming. Examples: customer service complaints, virtual assistant
      • Computer Vision. Techniques: Canny edge detection, Haar Cascade classifier. Examples: skin cancer diagnosis, real-time traffic detection, guided surgery
      • Evolutionary Learning (GA, optimisation algorithms). Example: Super Mario
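To make "maximise cumulative reward" concrete, here is a minimal tabular Q-learning sketch. The environment is an assumption made up for illustration: a corridor of five states where the agent starts at state 0 and receives a reward of 1 on reaching state 4. For simplicity the agent explores with purely random actions; Q-learning is off-policy, so it can still estimate the values of the greedy policy.

```python
import numpy as np

n_states, n_actions = 5, 2       # states 0..4; actions: 0 = left, 1 = right
goal = n_states - 1
alpha, gamma = 0.5, 0.9          # learning rate and discount factor
rng = np.random.default_rng(0)
Q = np.zeros((n_states, n_actions))

def step(state, action):
    # Hypothetical corridor: move one cell, stay inside the bounds.
    next_state = min(state + 1, goal) if action == 1 else max(state - 1, 0)
    reward = 1.0 if next_state == goal else 0.0
    return next_state, reward, next_state == goal

for _ in range(500):             # episodes
    state, done = 0, False
    while not done:
        action = int(rng.integers(n_actions))   # explore uniformly at random
        next_state, reward, done = step(state, action)
        # Q-learning update: bootstrap from the best action in the next state.
        target = reward if done else reward + gamma * Q[next_state].max()
        Q[state, action] += alpha * (target - Q[state, action])
        state = next_state

# Greedy policy for the non-terminal states (1 means "move right").
greedy_policy = Q[:goal].argmax(axis=1)
```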
  • 20. ISSUES IN MACHINE LEARNING (EXAM)
      • What are the existing algorithms for learning?
      • When will an algorithm converge?
      • Which algorithms perform best for which kinds of problems?
      • How much data is sufficient? e.g. training to classify cat and dog
      • Non-representative training data, e.g. exit polls during elections
      • Poor quality of data, e.g. outliers, missing values
      • How many features are required? Irrelevant features
      • Overfitting training data
      • Underfitting training data
      • Computation power? e.g. GPU and TPU for ML and DL
      • Interpretability of the model? e.g. why a bank declined a loan for a customer
      • How to improve learning?
      • Optimization vs generalization?
      • New and better algorithms required
      • Need for more data scientists
  • 21. PROJECT IDEAS (40) (PROJECT)
      • Fraud detection • Predict low oxygen level during surgery • Recognise CVD factors • Movie recommendation (Netflix) • Marketing and Sales • Weather prediction • Traffic Prediction (Uber ATG) • Loan defaulting prediction • Handwriting recognition • Sentiment analysis • Human activity recognition • Sports predictor • Big Mart Sales prediction • Fake news detection • Disease prediction • Stock market analysis • Amazon Alexa • Search Engine Optimization • Auto-tagging and Friend suggestion (Facebook) • Swiggy and Uber Eats • House price prediction • Market Analysis • Handwritten digit recognition • Equipment failure prediction • Prospective insurance buyer • Google News • Video Surveillance • Movie Ticket pricing system • Object Detection
  • 22. ML USE CASE IN SMARTPHONES (EXAM)
    From the voice assistant that sets your alarm and finds you the best restaurants to the simple use case of unlocking your phone via facial recognition, Machine Learning is truly embedded in our favourite devices.
      • Voice Assistants
      • Smartphone Cameras
      • App Store and Play Store Recommendations
      • Face Unlock
  • 23. ML USE CASE IN TRANSPORTATION (EXAM)
    The application of machine learning in the transport industry has gone to an entirely different level in the last decade. This coincides with the rise of ride-hailing apps like Uber, Lyft, Ola, etc. These companies use machine learning throughout their many products, from planning optimal routes to deciding prices for the rides we take. A few popular use cases in transportation that use machine learning heavily:
      • Dynamic Pricing in Travel
      • Transporting and Commuting - Uber
      • Google Maps
  • 24. ML USE CASE IN WEB SERVICES (EXAM)
    We interact with certain applications multiple times every day. What we perhaps did not realize until recently is that most of these applications work thanks to the power and flexibility of Machine Learning.
      • Email Filtering
      • Google Search
      • Google Translate
      • Facebook and LinkedIn Recommendations
  • 25. ML USE CASE IN SALES AND MARKETING (EXAM)
    Top companies in the world are using Machine Learning to transform their strategies from top to bottom. The two most impacted functions? Marketing and Sales!
    These days, if you work in Marketing and Sales, you need to know at least one Business Intelligence tool (like Tableau or Power BI). Additionally, marketers are expected to know how to leverage Machine Learning in their day-to-day role to increase brand awareness, improve the bottom line, etc.
      • Recommendation Engine
      • Personalized Marketing
      • Customer Support (Chatbots)
  • 26. ML USE CASE IN FINANCIAL DOMAIN (EXAM)
    Most of the jobs in Machine Learning are geared towards the financial domain. And that makes sense! This is the ultimate numbers field. Until recently, many banking institutions leaned on Logistic Regression (a simple machine learning algorithm) to crunch these numbers.
      • Fraud Detection
      • Personalized Banking
  • 27. STEPS IN BUILDING A ML APPLICATION (EXAM)
      • Frame the business problem as an ML problem
          • What is the main objective? What are we trying to predict?
          • What are the target features?
          • What is the input data? Is it available?
          • What kind of problem are we facing? Binary classification? Clustering?
          • What is the expected improvement?
      • Define performance metric
          • Regression problems use evaluation metrics such as Mean Squared Error (MSE).
          • Classification problems use evaluation metrics such as Precision, Accuracy and Recall.
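The metrics named above can be computed by hand, as in this small sketch. The function names and the example values are illustrative assumptions; scikit-learn's `sklearn.metrics` module provides equivalent ready-made functions.

```python
def mean_squared_error(actual, predicted):
    # Average of the squared differences between actual and predicted values.
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def classification_scores(actual, predicted):
    # Count the four outcomes of a binary classifier (1 = positive class).
    pairs = list(zip(actual, predicted))
    tp = sum(a == 1 and p == 1 for a, p in pairs)
    tn = sum(a == 0 and p == 0 for a, p in pairs)
    fp = sum(a == 0 and p == 1 for a, p in pairs)
    fn = sum(a == 1 and p == 0 for a, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Regression example (made-up values).
mse = mean_squared_error([3.0, 5.0, 2.0], [2.5, 5.0, 4.0])

# Classification example (made-up labels).
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1, 0, 0]
accuracy, precision, recall = classification_scores(y_true, y_pred)
```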
  • 28. STEPS IN BUILDING A ML APPLICATION (EXAM)
      • Gathering Data
          • RSS feed, web scraping, API
      • Generating Hypothesis
          • Can our outputs be predicted given the inputs?
          • Is our available data informative enough to learn the relationship between the inputs and the outputs?
      • Exploratory Data Analysis (visualisation for outliers)
      • Data Preparation and cleaning (missing values)
          • Delete the affected features or samples
          • Missing value imputation
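The two ways of handling missing values listed above can be sketched with pandas. The column names and values are hypothetical, and median imputation is only one of several possible strategies.

```python
import numpy as np
import pandas as pd

# Small table with one missing value in each column (made-up data).
df = pd.DataFrame({
    "age": [25.0, np.nan, 40.0, 35.0],
    "income": [50.0, 60.0, np.nan, 80.0],
})

# Option 1: delete every sample (row) that has a missing value.
dropped = df.dropna()

# Option 2: impute each missing value with the median of its column.
imputed = df.fillna(df.median())
```

Dropping rows is simple but throws data away; here it halves the table, while imputation keeps all four samples.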
  • 29. STEPS IN BUILDING A ML APPLICATION (EXAM)
      • Feature Engineering (Encoding, Transformation)
          • Mapping ordinal features
          • Encoding nominal class labels
          • Normalization, Standardization
      • Define benchmark / baseline model (kNN, NB)
      • Choose model
      • Train/build model (train:validation:test)
          • Shuffle for classification
          • For weather prediction, stock price prediction, etc., the data should not be shuffled, because the order of the observations carries information.
      • Evaluate model for optimal hyperparameters (cross validation)
      • Tune model (Grid search, Randomized search)
      • Model testing and deployment for prediction
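A minimal pandas sketch of the feature engineering and splitting steps above. The table, the `size_order` mapping and the 75:25 split are assumptions chosen for illustration; the split keeps the row order, as one would for time-ordered data.

```python
import pandas as pd

# Made-up data: one ordinal, one nominal and one numeric feature.
df = pd.DataFrame({
    "size": ["S", "M", "L", "M"],
    "colour": ["red", "blue", "red", "green"],
    "price": [10.0, 20.0, 30.0, 40.0],
})

# Ordinal feature: map categories to integers that preserve their order.
size_order = {"S": 0, "M": 1, "L": 2}
df["size"] = df["size"].map(size_order)

# Nominal feature: one-hot encode, since the categories have no order.
df = pd.get_dummies(df, columns=["colour"])

# Normalization (min-max scaling to [0, 1]) and standardization (mean 0, std 1).
price = df["price"]
df["price_norm"] = (price - price.min()) / (price.max() - price.min())
df["price_std"] = (price - price.mean()) / price.std(ddof=0)

# Chronological split without shuffling: the first 75% of rows train the model.
split = int(len(df) * 0.75)
train, test = df.iloc[:split], df.iloc[split:]
```

In practice the scaling statistics should be computed on the training rows only and then applied to the test rows; they are computed on the full table here only to keep the sketch short.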
  • 30. CHOICE OF RIGHT ALGORITHM (EXAM)
  • 31. STEPS FOR SELECTING RIGHT ML ALGO (EXAM)
      • Understand your data
          • Type of data will decide the algorithm
          • The algorithm will decide the number of samples needed
          • E.g. NB works with categorical data and is not sensitive to missing data
      • Stats and visualization to know your data
          • Percentiles help to identify outliers, the median to identify central tendency
          • Box plot (outliers), Histogram (spread), Scatter plot (bivariate relationship)
      • Clean data w.r.t. missing values
      • Feature Engineering
          • Encoding
          • Feature creation
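How percentiles flag outliers can be sketched with NumPy. The values are made up, and the 1.5 x IQR rule used here is the common box-plot convention, assumed for illustration rather than taken from the slides.

```python
import numpy as np

# Made-up measurements with one suspicious value.
values = np.array([10.0, 12.0, 11.0, 13.0, 12.0, 11.0, 95.0])

# Central tendency: the median is barely affected by the extreme value.
median = float(np.median(values))

# Percentile-based outlier rule (the whiskers of a box plot).
q1, q3 = np.percentile(values, [25, 75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = values[(values < lower) | (values > upper)]
```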
  • 32. STEPS FOR SELECTING RIGHT ML ALGO (EXAM)
      • Categorize the problem
          • By input (supervised, unsupervised)
          • By output (regression, classification, clustering, anomaly detection)
      • Understand constraints (data storage capacity, real-time applications, fast learning)
      • Look for available algorithms (business goals met? preprocessing required? accuracy? explainability? speed? scalable?)
      • Try each, assess and compare
      • Optimize
      • Evaluate performance
      • Repeat if required
  • 33. CHOICE OF MODEL (USE CASE) (PROJECT)
      • Linear Regression: unstable with redundant features. E.g. sales prediction, time for commuting
      • Logistic Regression: not a black box, works with correlated features. E.g. fraud detection, customer churn prediction
      • Decision Tree: can handle outliers but overfits and takes large memory. E.g. bank loan defaulter, investment decision
      • SVM: memory intensive, hard to interpret and difficult to tune. E.g. text classification, handwritten character recognition
      • NB: less training data required, low memory requirement, faster. E.g. sentiment analysis, recommender systems
      • RF: works well with large data and high dimension. E.g. predict loan defaulters, predict high-risk patients
      • NN: resource and memory intensive. E.g. object recognition, natural language translation
      • K-means: grouping when the number of groups is unknown. E.g. customer segmentation, crime locality identification
      • PCA: dimensionality reduction. E.g. MNIST digits
  • 34. CHOICE OF METRIC (PROJECT)
      • Regression
          • Mean Square Error, Root MSE, R-squared
          • Mean Absolute Error if outliers
          • R2
      • Classification
          • Accuracy, LogLoss, ROC-AUC, Precision Recall
          • Kappa score, MCC
      • Unsupervised
          • Mutual Information
          • RAND index
      • Reinforcement Learning
          • Dispersion across Time
          • Risk across Time
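Two of the choices above can be illustrated with a short sketch: why MAE is preferred over RMSE when outliers are present, and why Kappa and MCC say more than accuracy on imbalanced classes. All data values and function names are hypothetical.

```python
import math

def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mcc(tp, tn, fp, fn):
    # Matthews correlation coefficient from the confusion-matrix counts.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def cohen_kappa(tp, tn, fp, fn):
    # Agreement beyond what chance alone would produce.
    n = tp + tn + fp + fn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return (observed - expected) / (1 - expected)

# Regression with one outlier (last value): squaring lets it dominate RMSE.
values_true = [10, 12, 11, 13, 100]
values_pred = [11, 11, 12, 12, 20]
mae_score = mae(values_true, values_pred)
rmse_score = rmse(values_true, values_pred)

# Imbalanced classification: 2 positives among 10 samples.
tp, fn, fp, tn = 1, 1, 1, 7
accuracy = (tp + tn) / (tp + tn + fp + fn)
mcc_score = mcc(tp, tn, fp, fn)
kappa_score = cohen_kappa(tp, tn, fp, fn)
```

In this example accuracy looks high (0.8) even though only one of the two positives is found, while MCC and Kappa are both much lower.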
  • 35. PROJECT LAB ORIENTATION (PROJECT)
    Installing Anaconda and Python
      Step 1: Download Anaconda Python: www.anaconda.com/distribution/
      Step 2: Install Anaconda Python (Python 3.7 version): double-click the ".exe" file of Anaconda
      Step 3: Open Anaconda Navigator: use Anaconda Navigator to launch a Python IDE such as Spyder or Jupyter Notebook
      Step 4: Close the Spyder/Jupyter Notebook IDE.
    https://colab.research.google.com
    https://github.com
  • 36. PROJECT TASK LIST (PROJECT)
      • Study tool for implementation
      • Project title and Course identification
      • Choose data (Understand Domain and data)
      • Perform EDA
      • Perform Feature Engineering
      • Choose model
      • Train and Validate model
      • Tune Hyperparameters
      • Test and Evaluate model
      • Prepare Report
      • Prepare Technical Paper
      • Present Case Study
  • 37. EXPECTATIONS (PROJECT)
      • Case Study Presentation
      • Mini Project
      • Technical Paper
      • Report
      • Competition (Inhouse, Online)
  • 38. CASE STUDY TITLES (31) (PROJECT)
    MNIST, MS-COCO, ImageNet, CIFAR, IMDB Reviews, WordNet, Twitter Sentiment Analysis, BreastCancer Wisconsin, BBC News, Wheat seeds, Amazon Reviews, Facial Image, Spam SMS, YouTube, Chars74K, WineQuality, IrisFlowers, LabelMe, HotpotQA, Ionosphere, xView, US Census, Boston House Price, BankNote authentication, PIMA Indian Diabetes, BBC Sport, Titanic, Santander Product Recommendation, Sonar, Swedish Auto Insurance, Abalone
  • 39. BOOKS AND DATASET RESOURCES
      • https://www.kaggle.com/datasets
      • https://archive.ics.uci.edu/ml/index.php
      • https://registry.opendata.aws/
      • https://toolbox.google.com/datasetsearch
      • https://msropendata.com/
      • https://github.com/awesomedata/awesome-public-datasets
      • Indian Government dataset
      • US Government Dataset
      • Northern Ireland Public Sector Datasets
      • European Union Open Data Portal
      • https://scikit-learn.org/stable/datasets/index.html
      • https://data.world
      • http://archive.ics.uci.edu/ml/datasets
      • https://www.ehdp.com/vitalnet/datasets.htm
      • https://www.data.gov/health/
      • “Python Machine Learning”, Sebastian Raschka, Packt Publishing
      • “Machine Learning in Action”, Peter Harrington, DreamTech Press
      • “Introduction to Machine Learning”, Ethem Alpaydın, MIT Press
      • “Machine Learning”, Tom M. Mitchell, McGraw Hill
      • “Machine Learning: An Algorithmic Perspective”, Stephen Marsland, CRC Press
      • “Machine Learning: A Probabilistic Perspective”, Kevin P. Murphy, MIT Press
      • “Pattern Recognition and Machine Learning”, Christopher M. Bishop, Springer
      • “Elements of Statistical Learning”, Trevor Hastie, Robert Tibshirani, Jerome Friedman, Springer
  • 40. LEARNING RESOURCES
      • https://www.analyticsvidhya.com
      • https://towardsdatascience.com
      • https://analyticsindiamag.com
      • https://machinelearningmastery.com
      • https://www.datacamp.com
      • https://www.superdatascience.com
      • https://www.elitedatascience.com
      • https://medium.com
      • Siraj Raval YouTube channel
      • https://mlcontests.com
      • https://www.datasciencechallenge.net
      • https://www.machinehack.com
      • https://www.hackerearth.com
      • www.kaggle.com/competitions
      • www.smartindiahackathon.gov.in
      • www.datahack.analyticsvidhya.com
      • www.daretocompete.com
      • https://github.com
  • 41. WHY?
  • 42. SUMMARY (SUMMATIVE ASSESSMENT) (EXAM)
      • Examine steps in developing Machine Learning application with respect to your mini project. [10]
      • Review the issues in Machine Learning. [10]
      • State applicable use case for each ML algorithm. [10]
      • Examine Applications of AI. [10]
      • Illustrate steps for selecting right ML algorithm. [10]
      • Define ML and differentiate between Supervised, Unsupervised and Reinforcement learning with the help of suitable examples. [10]
      • Explain ML w.r.t. identifying Tasks, Experience and Performance measure (Tom Mitchell). [10]
          • designing a checkers learning problem
          • designing a handwriting recognition learning problem
          • designing a robot driving learning problem
      • Illustrate with example how Supervised learning can be used in handling loan defaulters. [10]
      • Explain Supervised Learning with neat diagram. [10]
  • 43. QUERIES? THANK YOU