Machine Learning Basics
(For 1st Year)
Contents
INTRODUCTION
1. Data Acquisition to KDD (Knowledge Discovery in Databases), Importance of data mining (overview of the process)
2. Different forms of data – Unstructured data (text, audio, video), Structured data (CSV files), Semi-structured data (log files, XML files)
3. Applications associated – Product Recommendation System (text data), Cancer prediction based on health records data (text data), Self-driving cars (video data), etc.
STRUCTURED DATA
1. Knowing your data (Exploratory Data Analysis)
2. Preprocessing
3. Concept of training & test sets
4. Supervised Learning – Classification (Decision Trees, kNN, SVM) & Regression (Linear Regression, MLR)
5. Unsupervised Learning – Clustering (K-means)
6. Model Evaluation
Data Acquisition to KDD, Importance of data mining (overview of the process)
A Little Background
Different forms of data
• Unstructured data – text, audio, video
• Structured data (CSV files)
• Semi-structured data (log files, XML files)
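The difference between structured and semi-structured data can be sketched with Python's standard library. The sample text, the `student` records, and the function names below are made up for illustration and are not part of the slides. The point is only that a CSV file has the same columns in every row, while an XML file carries its structure in tags and may omit fields.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Made-up sample data, for illustration only.
CSV_TEXT = "name,age\nAsha,19\nRavi,20\n"
XML_TEXT = (
    "<students>"
    "<student name='Asha'><age>19</age></student>"
    "<student name='Ravi'/>"
    "</students>"
)

def read_structured(text):
    """Structured data: every row has the same named columns."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]

def read_semi_structured(text):
    """Semi-structured data: tags give structure, but fields may be missing."""
    root = ET.fromstring(text)
    records = []
    for node in root.findall("student"):
        age = node.findtext("age")  # None when the <age> element is absent
        records.append({"name": node.get("name"), "age": age})
    return records

rows = read_structured(CSV_TEXT)
records = read_semi_structured(XML_TEXT)
print(rows)
print(records)
```

Unstructured data (free text, audio, video) has no such fields at all, so it usually needs a feature-extraction step before it can be placed in a table like `rows`.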
Applications which use ML
• Product Recommendation System (text data)
• Cancer prediction based on health records data (text data)
• Self-driving cars (video data), etc.
STRUCTURED DATA
Knowing your data
• Nominal attributes
• Binary attributes
• Ordinal attributes
• Numeric attributes
• Discrete attributes (e.g., zip codes, profession, or a set of words); sometimes represented as integer variables
• Note: binary attributes are a special case of discrete attributes
• Continuous attributes
• Have real numbers as attribute values
• E.g., temperature, height, or weight
• Basic statistical descriptions of data
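One practical reason to know the attribute type is that it decides which summary is meaningful. The following is a minimal sketch, assuming a hypothetical helper `central_value` and made-up sample columns. It uses the mode for nominal and binary attributes, the middle value by rank for ordinal attributes, and the mean for numeric attributes.

```python
from statistics import mean, mode

# Hypothetical ordering for an ordinal attribute (an assumption for illustration).
SIZE_ORDER = ["small", "medium", "large"]

def central_value(values, kind, order=None):
    """Return a sensible 'typical value' for one attribute, by attribute type."""
    if kind in ("nominal", "binary"):
        # Categories have no order or distance, so only the mode is meaningful.
        return mode(values)
    if kind == "ordinal":
        # Ordered categories: sort by rank and take the middle (lower median).
        ranked = sorted(values, key=order.index)
        return ranked[(len(ranked) - 1) // 2]
    if kind == "numeric":
        # Numeric values support arithmetic, so the mean is defined.
        return mean(values)
    raise ValueError(f"unknown attribute kind: {kind}")

colours = ["red", "blue", "red", "green"]                # nominal
smoker = [0, 1, 0, 0]                                    # binary
sizes = ["large", "small", "medium", "small", "large"]   # ordinal
heights_cm = [160.0, 172.5, 181.0]                       # numeric (continuous)

print(central_value(colours, "nominal"))
print(central_value(smoker, "binary"))
print(central_value(sizes, "ordinal", order=SIZE_ORDER))
print(central_value(heights_cm, "numeric"))
```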
Knowing your Data & Preprocessing
• Preprocessing steps:
• Data Cleaning
• Data Integration
• Data Reduction
• Data Transformation and Data Discretization
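Three of the steps listed above can be sketched on a single numeric column. This is an illustrative sketch, not the deck's own method: the function names and the `ages` data are assumptions, cleaning is shown as mean imputation, transformation as min-max scaling, and discretization as equal-width binning. Data integration and data reduction are not shown.

```python
from statistics import mean

def clean(values):
    """Data cleaning: replace missing entries (None) with the mean of the known values."""
    known = [v for v in values if v is not None]
    fill = mean(known)
    return [fill if v is None else v for v in values]

def min_max_scale(values):
    """Data transformation: rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def discretize(scaled_values, n_bins):
    """Data discretization: equal-width bins over values already scaled to [0, 1]."""
    # The value 1.0 would land in bin n_bins, so clamp it into the last bin.
    return [min(int(v * n_bins), n_bins - 1) for v in scaled_values]

ages = [20, None, 40, 60]          # made-up column with one missing value
cleaned = clean(ages)              # the missing value becomes 40
scaled = min_max_scale(cleaned)    # [0.0, 0.5, 0.5, 1.0]
bins = discretize(scaled, 2)       # [0, 1, 1, 1]
print(cleaned, scaled, bins)
```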
Basic statistical descriptions of data
• Measuring central tendency: mean, median, mode
• Measuring dispersion of data: range, quartiles, variance, standard deviation, interquartile range
• Graphic displays: box plot…
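The measures named above can be computed with Python's `statistics` module. The `data` list is made up for illustration. Quartile conventions differ between textbooks and tools; this sketch uses the module's default ("exclusive") method, so other tools may report slightly different quartiles on other data. The five-number summary at the end is what a box plot draws.

```python
from statistics import mean, median, mode, quantiles, stdev, variance

data = [2, 4, 4, 5, 7, 9, 11]  # made-up sample, already sorted for readability

# Central tendency
data_mean = mean(data)       # 6
data_median = median(data)   # 5
data_mode = mode(data)       # 4

# Dispersion
data_range = max(data) - min(data)    # 9
q1, q2, q3 = quantiles(data, n=4)     # 4.0, 5.0, 9.0 with the default method
iqr = q3 - q1                         # 5.0
sample_variance = variance(data)      # 10 (sample variance, divides by n - 1)
sample_std = stdev(data)              # square root of the sample variance, about 3.16

# Five-number summary: the values a box plot displays
five_number_summary = (min(data), q1, data_median, q3, max(data))
print(five_number_summary)
```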
Concept of training & test sets
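The slide gives only the heading, so the following is a minimal sketch of the usual idea: shuffle the rows, hold some of them back as a test set, and fit the model only on the remaining training set. The function name, the 25% test fraction, and the fixed seed are assumptions chosen for illustration.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and split it into (train, test) lists."""
    rng = random.Random(seed)   # fixed seed so the split is reproducible
    shuffled = rows[:]          # copy, so the caller's list is left unchanged
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

rows = list(range(8))           # stand-in for 8 labelled examples
train, test = train_test_split(rows)
print(len(train), len(test))
```

Keeping the test rows out of training is what makes the later model-evaluation step an estimate of performance on unseen data rather than on memorised examples.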
Machine Learning Basics for Beginners

Editor's Notes

  1. Measuring Data Similarity and Dissimilarity