Introduction to Machine Learning
by Shiva Dasharathi
Terminology
Machine Learning: in simple terms, a set of techniques for learning patterns from data
- These techniques rest on statistical assumptions about the data
- Conceptually, these techniques can be applied to many forms of data
- Machine learning models are built on training data (a proportion of the actual data) and are then used to predict patterns in unseen data
Statistical Model: the outcome of a machine learning process is an entity, or model, often called a statistical model
Terminology cont.
Feature (or Dimension): Feature, Dimension, Variable, Attribute and Property all refer to a characteristic of the data
Ex: {age, height, gender} are features of a User
Training Data (60-80%): sampled data used for building the model
Test Data (20%): sampled data used for testing the model
Validation Data (20%): sampled data (unseen) used for validating the model
- The percentages are rules of thumb; the three shares must add up to 100% (e.g. 60/20/20)
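The split into training, validation and test data can be sketched in a few lines. This is a minimal illustration, not the author's procedure: the function name `split_dataset`, the 60/20/20 shares, the fixed seed and the made-up `users` records are all assumptions chosen for the example.

```python
import random

def split_dataset(rows, train=0.6, validation=0.2, seed=0):
    """Shuffle rows and split them into train, validation and test lists.

    Whatever remains after the train and validation shares becomes the test set.
    """
    if train <= 0 or validation < 0 or train + validation >= 1:
        raise ValueError("train and validation shares must leave room for a test set")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # random sampling, reproducible via the seed
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * validation)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

# Hypothetical User records with the features age, height and gender.
users = [{"age": 20 + i, "height": 150 + i, "gender": "F" if i % 2 else "M"}
         for i in range(50)]
train_set, validation_set, test_set = split_dataset(users)
print(len(train_set), len(validation_set), len(test_set))
```

Shuffling before slicing is what makes each subset a random sample; without it, any ordering in the source data would leak into the split.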
Terminology cont.
Vector: a multi-dimensional representation of a data point
- each row in a data matrix is a vector
Similarity: a measure of how close two data points are in the vector space model
Ex: Euclidean distance, cosine similarity, etc.
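As a small sketch of the two measures named above, treating each data point as a plain list of numbers (the function names and the sample vectors are illustrative assumptions):

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

u = [3.0, 4.0]
v = [6.0, 8.0]  # same direction as u, twice the length
print(euclidean_distance(u, v))  # 5.0
print(cosine_similarity(u, v))   # 1.0
```

The two points are 5 units apart yet have cosine similarity 1, which shows that Euclidean distance is sensitive to magnitude while cosine similarity only compares direction.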
Terminology cont.
Supervised Learning: modeling techniques used when labeled data is available
Ex: Customer Segmentation using Classification
Unsupervised Learning: modeling techniques used when no labeled data is available; they rely on the natural structure of the data
Ex: Customer Segmentation using Clustering
Dimensionality Reduction: techniques that reduce M dimensions to N dimensions, where M > N,
- such that the N dimensions still explain most of the variation in the data,
- so that computation and interpretation become easier.
Terminology cont.
Overfitting: if a model is tuned too closely to the training data, it won't be able to predict unseen data accurately. This situation is called overfitting.
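One way to see overfitting, sketched under assumptions: a 1-nearest-neighbour classifier memorises every training label, including noisy ones, while a 3-nearest-neighbour vote smooths them out. The data set, the "true rule" and the helper names below are invented for illustration.

```python
from collections import Counter

def knn_predict(train, x, k):
    """Majority label among the k training points nearest to x."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def accuracy(train, rows, k):
    """Share of rows whose label the k-NN model predicts correctly."""
    hits = sum(knn_predict(train, x, k) == label for x, label in rows)
    return hits / len(rows)

# True rule: label 1 when x > 5.5. The labels at x=3 and x=8 are deliberate noise.
train = [(1, 0), (2, 0), (3, 1), (4, 0), (5, 0),
         (6, 1), (7, 1), (8, 0), (9, 1), (10, 1)]
# Unseen points, labelled by the true rule.
unseen = [(1.5, 0), (2.6, 0), (3.4, 0), (7.6, 1), (8.4, 1), (9.5, 1)]

for k in (1, 3):
    print(k, accuracy(train, train, k), accuracy(train, unseen, k))
```

With k=1 the model scores perfectly on the training data but poorly on the unseen points, because it reproduces the two noisy labels; with k=3 the training score drops and the unseen score rises.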
Model building cycle
Typical steps of a model building cycle include, but are not limited to:
1. Data collection: collecting data from sources
2. Data cleaning: dealing with missing values, etc.
3. Pre-processing: outliers & transformations
4. Random sampling: train, test and validation sets
5. Model building: an iterative process
   1. Feature selection: selecting the subset of features that best explains the data
   2. Validation: finalize model summaries
   3. Model selection: model comparison & final model
6. Model deployment: for predicting unseen data
7. Feedback & model improvement
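The cycle above can be sketched end to end on toy data. Everything here is an assumption made for illustration: the synthetic data (roughly y = 3x + 2), the two candidate models, the 60/20/20 split and the choice of mean squared error as the comparison metric. Pre-processing, deployment and feedback are left out to keep the sketch short.

```python
import random

def fit_mean(rows):
    """Baseline model: always predict the mean of y."""
    mean_y = sum(y for _, y in rows) / len(rows)
    return lambda x: mean_y

def fit_line(rows):
    """Least-squares line y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return lambda x: slope * x + (mean_y - slope * mean_x)

def mse(model, rows):
    """Mean squared error of a model on (x, y) rows."""
    return sum((model(x) - y) ** 2 for x, y in rows) / len(rows)

# 1. Data collection (synthetic): y is roughly 3x + 2, with two missing values.
raw = [(float(i), 3.0 * i + 2.0 + ((i * 7) % 5 - 2) * 0.1) for i in range(30)]
raw[4] = (4.0, None)
raw[17] = (None, 53.0)

# 2. Data cleaning: drop rows with missing values.
clean = [(x, y) for x, y in raw if x is not None and y is not None]

# 4. Random sampling: 60% train, 20% validation, the rest test.
random.Random(0).shuffle(clean)
n_train, n_val = round(0.6 * len(clean)), round(0.2 * len(clean))
train = clean[:n_train]
validation = clean[n_train:n_train + n_val]
test = clean[n_train + n_val:]

# 5. Model building and selection: fit each candidate, compare on validation data.
candidates = {"mean": fit_mean, "line": fit_line}
fitted = {name: fit(train) for name, fit in candidates.items()}
best_name = min(fitted, key=lambda name: mse(fitted[name], validation))

# Final check on data that played no part in fitting or selection.
print(best_name, mse(fitted[best_name], test))
```

Keeping the test rows out of both fitting and model comparison is the design choice that makes the final error an honest estimate for unseen data.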
Supervised Learning
A few supervised learning methods to explore
1. Multiple Linear Regression
2. Logistic Regression
3. Decision Trees
   1. CART
   2. CHAID
4. Ensemble Methods
   1. Bagging
   2. Boosting
5. KNN (k-nearest neighbors)
6. Naïve Bayes
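As a taste of one method from the list, here is a minimal sketch of the core step of a CART-style classification tree: choosing the single threshold on one feature that minimises the weighted Gini impurity. It is a one-level "stump", not a full tree, and the customer data and function names are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(xs, ys):
    """Return (threshold, weighted_gini) of the best single split on one feature."""
    pairs = sorted(zip(xs, ys))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal feature values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for x, y in pairs if x <= threshold]
        right = [y for x, y in pairs if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical labeled data: customer age and whether they bought a product.
ages = [22, 25, 31, 38, 45, 52, 58, 63]
bought = ["no", "no", "no", "no", "yes", "yes", "yes", "yes"]
threshold, impurity = best_split(ages, bought)
print(threshold, impurity)  # 41.5 0.0
```

A full tree would apply the same search recursively to each side of the split until a stopping rule is met.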
Unsupervised learning
A few unsupervised learning methods to explore
1. K-means clustering
2. Hierarchical clustering
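K-means can be sketched as two alternating steps: assign each point to its nearest centroid, then move each centroid to the mean of its points. The version below is a bare-bones illustration with assumptions of its own: it seeds the centroids with the first k points (real implementations usually pick them more carefully), runs a fixed number of iterations, and uses made-up 2-D points.

```python
import math

def kmeans(points, k, iterations=10):
    """Plain k-means (Lloyd's algorithm), seeded with the first k points."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, assignment

# Unlabeled points forming two visible groups.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
centroids, assignment = kmeans(points, k=2)
print(assignment)  # [0, 1, 0, 1, 0, 1]
print(centroids)
```

No labels are supplied anywhere; the grouping comes purely from the distances between points.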
Similarity measures
A few similarity measures to explore
1. Euclidean distance
2. Cosine similarity
3. Pearson correlation
4. Jaccard similarity
5. Tanimoto distance
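Two of the listed measures behave quite differently from plain distances, so a short sketch may help: Pearson correlation compares how two numeric vectors vary together, and Jaccard similarity compares two sets. The function names, the sample inputs and the convention that two empty sets have similarity 1.0 are assumptions for this example.

```python
import math

def pearson_correlation(a, b):
    """Pearson correlation of two equal-length numeric vectors, in [-1, 1]."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_a * var_b)

def jaccard_similarity(a, b):
    """Size of the intersection divided by size of the union of two sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention chosen for this sketch
    return len(a & b) / len(a | b)

print(pearson_correlation([1, 2, 3, 4], [2, 4, 6, 8]))          # 1.0
print(jaccard_similarity({"a", "b", "c"}, {"b", "c", "d"}))      # 0.5
```

Pearson suits ratings or measurements on different scales, while Jaccard suits presence/absence data such as the set of items a user has bought.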
Dimensionality reduction
A few dimensionality reduction methods to explore
- PCA (Principal Component Analysis)
- Factor Analysis
- SVD (Singular Value Decomposition)
Dealing with sparsity
- Min Hashing
- LSH (Locality-Sensitive Hashing)
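To make the idea of reducing M dimensions to N concrete, the sketch below finds the first principal component of a small 2-D data set and projects the points onto it (M = 2, N = 1). It is a simplified stand-in for PCA: power iteration on the covariance matrix is used here instead of a full eigendecomposition or SVD so that the example needs no external library, and the data and function names are invented.

```python
import math

def covariance_matrix(rows):
    """Sample covariance matrix and mean-centered copy of the rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    return cov, centered

def first_principal_component(rows, iterations=100):
    """Return (direction, explained variance ratio, 1-D scores)."""
    cov, centered = covariance_matrix(rows)
    d = len(cov)
    v = [1.0] + [0.0] * (d - 1)  # start vector; assumed not orthogonal to the answer
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            raise ValueError("data has no variance along the start vector")
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    total_variance = sum(cov[i][i] for i in range(d))
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, eigenvalue / total_variance, scores

data = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0)]
direction, explained, scores = first_principal_component(data)
print(direction, explained, scores)
```

For this data the single retained dimension points along the diagonal and keeps about 80% of the total variance, which is the sense in which the reduced representation "explains most variation".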
Recommendations
Collaborative Filtering
- Item based
- User based
- Slope One
Challenges
- cold start problem
- curse of dimensionality
- outliers
Frequent items / association rules
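Of the collaborative filtering variants listed, Slope One is compact enough to sketch in full. The version below is the weighted form: for every item the user has rated, it adds the average rating difference between the target item and that item, weighted by how many users rated both. The ratings table and function name are illustrative assumptions.

```python
def slope_one_predict(ratings, user, target_item):
    """Weighted Slope One prediction of `user`'s rating for `target_item`."""
    numerator, denominator = 0.0, 0
    for item, user_rating in ratings[user].items():
        if item == target_item:
            continue
        # Rating differences from users who rated both items.
        diffs = [r[target_item] - r[item] for r in ratings.values()
                 if target_item in r and item in r]
        if not diffs:
            continue
        average_deviation = sum(diffs) / len(diffs)
        numerator += (user_rating + average_deviation) * len(diffs)
        denominator += len(diffs)
    if denominator == 0:
        return None  # cold start: nothing to base a prediction on
    return numerator / denominator

# Hypothetical user -> {item: rating} table.
ratings = {
    "John": {"A": 5, "B": 3, "C": 2},
    "Mark": {"A": 3, "B": 4},
    "Lucy": {"B": 2, "C": 5},
    "Newcomer": {},
}
print(slope_one_predict(ratings, "Lucy", "A"))      # about 4.33
print(slope_one_predict(ratings, "Newcomer", "A"))  # None
```

The second call shows the cold start problem in miniature: a user with no ratings gives the method nothing to work from.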
Text Mining
1. NLP approach (building language-dependent models)
2. Machine Learning approach: documents are converted into a vector space model, and machine learning techniques are applied to them to solve problems.
Vector space model
- documents => data points
- words in the documents => features
Feature, document pairs
- <feature, document, TF*IDF>
- TF = normalized Term Frequency
- IDF = Inverse Document Frequency
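The <feature, document, TF*IDF> triples can be computed as follows. This is a sketch under assumptions: TF is taken as the word count divided by the document length, IDF as log(N / document frequency), which is one common variant among several, and tokenization is a plain lowercase whitespace split. The sample documents are made up.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return {(word, doc_id): tf * idf} for a dict of doc_id -> text."""
    tokenized = {doc_id: text.lower().split() for doc_id, text in documents.items()}
    n_docs = len(tokenized)
    # Document frequency: in how many documents each word appears.
    document_frequency = Counter()
    for words in tokenized.values():
        document_frequency.update(set(words))
    weights = {}
    for doc_id, words in tokenized.items():
        for word, count in Counter(words).items():
            tf = count / len(words)                        # normalized term frequency
            idf = math.log(n_docs / document_frequency[word])  # inverse document frequency
            weights[(word, doc_id)] = tf * idf
    return weights

docs = {
    "d1": "data mining finds patterns",
    "d2": "text mining uses data",
    "d3": "patterns in text data",
}
weights = tf_idf(docs)
for (word, doc_id), weight in sorted(weights.items()):
    print(word, doc_id, round(weight, 3))
```

A word that occurs in every document, such as "data" here, gets weight 0, while a word unique to one document gets the highest weight; this is how TF*IDF pushes distinctive words forward as features.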
Thank you!

