SlideShare a Scribd company logo
1 of 9
Data Science: Concepts and Practice
Course slides
Data Science: Concepts and
Practice
Authors : Vijay Kotu & Bala Deshpande
Publisher : Morgan Kaufmann
Course Book Course Software
www.rapidminer.com
Free Download
1. Introduction
What is Data Science
Models
Types of Data Science
Tasks Description Algorithms Examples
Classification Predict if a data point belongs to
one of predefined classes. The
prediction will be based on
learning from known data set.
Decision Trees, Neural
networks, Bayesian
models, Induction rules, K
nearest neighbors
Assigning voters into known buckets by
political parties eg: soccer moms.
Bucketing new customers into one of
known customer groups.
Regression Predict the numeric target label of
a data point. The prediction will
be based on learning from known
data set.
Linear regression, Logistic
regression
Predicting unemployment rate for next
year. Estimating insurance premium.
Anomaly detection Predict if a data point is an outlier
compared to other data points in
the data set.
Distance based, Density
based, LOF
Fraud transaction detection in credit
cards. Network intrusion detection.
Time series Predict if the value of the target
variable for future time frame
based on history values.
Exponential smoothing,
ARIMA, regression
Sales forecasting, production
forecasting, virtually any growth
phenomenon that needs to be
extrapolated
Clustering Identify natural clusters within the
data set based on inherit
properties within the data set.
K means, density based
clustering - DBSCAN
Finding customer segments in a
company based on transaction, web
and customer call data.
Association analysis Identify relationships within an
itemset based on transaction
data.
FP Growth, Apriori Find cross selling opportunities for a
retailor based on transaction purchase
history.
Course
outline
Data Science
Process
Data Exploration
Model Evaluation
Classification
Decision Trees
Rule Induction
k-Nearest Neighbors
Naïve Bayesian
Artificial Neural Networks
Support Vector Machines
Ensemble Learners
Regression
Linear Regression
Logistic Regression
Association Analysis
Apriori
FP-Growth
Clustering
k-Means
DBSCAN
Self-Organizing Maps
Text Mining
Time Series Forecasting
Anomaly Detection
Feature Selection
Process Basics
Core Algorithms
Common Applications

More Related Content

Similar to 01. Introduction.pptx

222111Organization N.docx
222111Organization N.docx222111Organization N.docx
222111Organization N.docxRAJU852744
 
The Science Behind the Data_ A Deep Dive into Data Science.pdf
The Science Behind the Data_ A Deep Dive into Data Science.pdfThe Science Behind the Data_ A Deep Dive into Data Science.pdf
The Science Behind the Data_ A Deep Dive into Data Science.pdfkhushnuma khan
 
Introduction To Data Mining
Introduction To Data MiningIntroduction To Data Mining
Introduction To Data Miningdataminers.ir
 
Introduction To Data Mining
Introduction To Data Mining   Introduction To Data Mining
Introduction To Data Mining Phi Jack
 
De-Mystefying Predictive Analytics
De-Mystefying Predictive AnalyticsDe-Mystefying Predictive Analytics
De-Mystefying Predictive AnalyticsGalit Shmueli
 
Defense nikhil khullar
Defense nikhil khullarDefense nikhil khullar
Defense nikhil khullarNikhil Khullar
 
Data mining techniques and dss
Data mining techniques and dssData mining techniques and dss
Data mining techniques and dssNiyitegekabilly
 
NEAREST NEIGHBOUR CLUSTER ANALYSIS.pptx
NEAREST NEIGHBOUR CLUSTER ANALYSIS.pptxNEAREST NEIGHBOUR CLUSTER ANALYSIS.pptx
NEAREST NEIGHBOUR CLUSTER ANALYSIS.pptxagniva pradhan
 
Data scientist roadmap
Data scientist roadmapData scientist roadmap
Data scientist roadmapSonu Kumar
 
KDD, Data Mining, Data Science_I.pptx
KDD, Data Mining, Data Science_I.pptxKDD, Data Mining, Data Science_I.pptx
KDD, Data Mining, Data Science_I.pptxYogeshGairola2
 
FDA_SAKEC2018.pptx
FDA_SAKEC2018.pptxFDA_SAKEC2018.pptx
FDA_SAKEC2018.pptxmviji
 
Barga Data Science lecture 2
Barga Data Science lecture 2Barga Data Science lecture 2
Barga Data Science lecture 2Roger Barga
 
Business analytics and data mining
Business analytics and data miningBusiness analytics and data mining
Business analytics and data miningTony Nguyen
 
Business analytics and data mining
Business analytics and data miningBusiness analytics and data mining
Business analytics and data miningHoang Nguyen
 
Business analytics and data mining
Business analytics and data miningBusiness analytics and data mining
Business analytics and data miningLuis Goldster
 
Business analytics and data mining
Business analytics and data miningBusiness analytics and data mining
Business analytics and data miningJames Wong
 
Business analytics and data mining
Business analytics and data miningBusiness analytics and data mining
Business analytics and data miningHarry Potter
 
Business analytics and data mining
Business analytics and data miningBusiness analytics and data mining
Business analytics and data miningYoung Alista
 

Similar to 01. Introduction.pptx (20)

Data mining
Data miningData mining
Data mining
 
222111Organization N.docx
222111Organization N.docx222111Organization N.docx
222111Organization N.docx
 
The Science Behind the Data_ A Deep Dive into Data Science.pdf
The Science Behind the Data_ A Deep Dive into Data Science.pdfThe Science Behind the Data_ A Deep Dive into Data Science.pdf
The Science Behind the Data_ A Deep Dive into Data Science.pdf
 
Introduction To Data Mining
Introduction To Data MiningIntroduction To Data Mining
Introduction To Data Mining
 
Introduction To Data Mining
Introduction To Data Mining   Introduction To Data Mining
Introduction To Data Mining
 
De-Mystefying Predictive Analytics
De-Mystefying Predictive AnalyticsDe-Mystefying Predictive Analytics
De-Mystefying Predictive Analytics
 
Defense nikhil khullar
Defense nikhil khullarDefense nikhil khullar
Defense nikhil khullar
 
Data mining techniques and dss
Data mining techniques and dssData mining techniques and dss
Data mining techniques and dss
 
Risk mgmt-analysis-wp-326822
Risk mgmt-analysis-wp-326822Risk mgmt-analysis-wp-326822
Risk mgmt-analysis-wp-326822
 
NEAREST NEIGHBOUR CLUSTER ANALYSIS.pptx
NEAREST NEIGHBOUR CLUSTER ANALYSIS.pptxNEAREST NEIGHBOUR CLUSTER ANALYSIS.pptx
NEAREST NEIGHBOUR CLUSTER ANALYSIS.pptx
 
Data scientist roadmap
Data scientist roadmapData scientist roadmap
Data scientist roadmap
 
KDD, Data Mining, Data Science_I.pptx
KDD, Data Mining, Data Science_I.pptxKDD, Data Mining, Data Science_I.pptx
KDD, Data Mining, Data Science_I.pptx
 
FDA_SAKEC2018.pptx
FDA_SAKEC2018.pptxFDA_SAKEC2018.pptx
FDA_SAKEC2018.pptx
 
Barga Data Science lecture 2
Barga Data Science lecture 2Barga Data Science lecture 2
Barga Data Science lecture 2
 
Business analytics and data mining
Business analytics and data miningBusiness analytics and data mining
Business analytics and data mining
 
Business analytics and data mining
Business analytics and data miningBusiness analytics and data mining
Business analytics and data mining
 
Business analytics and data mining
Business analytics and data miningBusiness analytics and data mining
Business analytics and data mining
 
Business analytics and data mining
Business analytics and data miningBusiness analytics and data mining
Business analytics and data mining
 
Business analytics and data mining
Business analytics and data miningBusiness analytics and data mining
Business analytics and data mining
 
Business analytics and data mining
Business analytics and data miningBusiness analytics and data mining
Business analytics and data mining
 

Recently uploaded

Seven tools of quality control.slideshare
Seven tools of quality control.slideshareSeven tools of quality control.slideshare
Seven tools of quality control.slideshareraiaryan448
 
How to Transform Clinical Trial Management with Advanced Data Analytics
How to Transform Clinical Trial Management with Advanced Data AnalyticsHow to Transform Clinical Trial Management with Advanced Data Analytics
How to Transform Clinical Trial Management with Advanced Data AnalyticsBrainSell Technologies
 
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...Valters Lauzums
 
Formulas dax para power bI de microsoft.pdf
Formulas dax para power bI de microsoft.pdfFormulas dax para power bI de microsoft.pdf
Formulas dax para power bI de microsoft.pdfRobertoOcampo24
 
如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证acoha1
 
如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证
如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证
如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证zifhagzkk
 
一比一原版麦考瑞大学毕业证成绩单如何办理
一比一原版麦考瑞大学毕业证成绩单如何办理一比一原版麦考瑞大学毕业证成绩单如何办理
一比一原版麦考瑞大学毕业证成绩单如何办理cyebo
 
Predictive Precipitation: Advanced Rain Forecasting Techniques
Predictive Precipitation: Advanced Rain Forecasting TechniquesPredictive Precipitation: Advanced Rain Forecasting Techniques
Predictive Precipitation: Advanced Rain Forecasting TechniquesBoston Institute of Analytics
 
如何办理英国卡迪夫大学毕业证(Cardiff毕业证书)成绩单留信学历认证
如何办理英国卡迪夫大学毕业证(Cardiff毕业证书)成绩单留信学历认证如何办理英国卡迪夫大学毕业证(Cardiff毕业证书)成绩单留信学历认证
如何办理英国卡迪夫大学毕业证(Cardiff毕业证书)成绩单留信学历认证ju0dztxtn
 
edited gordis ebook sixth edition david d.pdf
edited gordis ebook sixth edition david d.pdfedited gordis ebook sixth edition david d.pdf
edited gordis ebook sixth edition david d.pdfgreat91
 
1:1原版定制利物浦大学毕业证(Liverpool毕业证)成绩单学位证书留信学历认证
1:1原版定制利物浦大学毕业证(Liverpool毕业证)成绩单学位证书留信学历认证1:1原版定制利物浦大学毕业证(Liverpool毕业证)成绩单学位证书留信学历认证
1:1原版定制利物浦大学毕业证(Liverpool毕业证)成绩单学位证书留信学历认证ppy8zfkfm
 
NOAM AAUG Adobe Summit 2024: Summit Slam Dunks
NOAM AAUG Adobe Summit 2024: Summit Slam DunksNOAM AAUG Adobe Summit 2024: Summit Slam Dunks
NOAM AAUG Adobe Summit 2024: Summit Slam Dunksgmuir1066
 
2024 Q2 Orange County (CA) Tableau User Group Meeting
2024 Q2 Orange County (CA) Tableau User Group Meeting2024 Q2 Orange County (CA) Tableau User Group Meeting
2024 Q2 Orange County (CA) Tableau User Group MeetingAlison Pitt
 
Abortion Clinic in Randfontein +27791653574 Randfontein WhatsApp Abortion Cli...
Abortion Clinic in Randfontein +27791653574 Randfontein WhatsApp Abortion Cli...Abortion Clinic in Randfontein +27791653574 Randfontein WhatsApp Abortion Cli...
Abortion Clinic in Randfontein +27791653574 Randfontein WhatsApp Abortion Cli...mikehavy0
 
Atlantic Grupa Case Study (Mintec Data AI)
Atlantic Grupa Case Study (Mintec Data AI)Atlantic Grupa Case Study (Mintec Data AI)
Atlantic Grupa Case Study (Mintec Data AI)Jon Hansen
 
一比一原版阿德莱德大学毕业证成绩单如何办理
一比一原版阿德莱德大学毕业证成绩单如何办理一比一原版阿德莱德大学毕业证成绩单如何办理
一比一原版阿德莱德大学毕业证成绩单如何办理pyhepag
 
一比一原版西悉尼大学毕业证成绩单如何办理
一比一原版西悉尼大学毕业证成绩单如何办理一比一原版西悉尼大学毕业证成绩单如何办理
一比一原版西悉尼大学毕业证成绩单如何办理pyhepag
 
MATERI MANAJEMEN OF PENYAKIT TETANUS.ppt
MATERI  MANAJEMEN OF PENYAKIT TETANUS.pptMATERI  MANAJEMEN OF PENYAKIT TETANUS.ppt
MATERI MANAJEMEN OF PENYAKIT TETANUS.pptRachmaGhifari
 
社内勉強会資料_Object Recognition as Next Token Prediction
社内勉強会資料_Object Recognition as Next Token Prediction社内勉強会資料_Object Recognition as Next Token Prediction
社内勉強会資料_Object Recognition as Next Token PredictionNABLAS株式会社
 

Recently uploaded (20)

Seven tools of quality control.slideshare
Seven tools of quality control.slideshareSeven tools of quality control.slideshare
Seven tools of quality control.slideshare
 
How to Transform Clinical Trial Management with Advanced Data Analytics
How to Transform Clinical Trial Management with Advanced Data AnalyticsHow to Transform Clinical Trial Management with Advanced Data Analytics
How to Transform Clinical Trial Management with Advanced Data Analytics
 
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
 
Formulas dax para power bI de microsoft.pdf
Formulas dax para power bI de microsoft.pdfFormulas dax para power bI de microsoft.pdf
Formulas dax para power bI de microsoft.pdf
 
如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证
 
如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证
如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证
如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证
 
一比一原版麦考瑞大学毕业证成绩单如何办理
一比一原版麦考瑞大学毕业证成绩单如何办理一比一原版麦考瑞大学毕业证成绩单如何办理
一比一原版麦考瑞大学毕业证成绩单如何办理
 
123.docx. .
123.docx.                                 .123.docx.                                 .
123.docx. .
 
Predictive Precipitation: Advanced Rain Forecasting Techniques
Predictive Precipitation: Advanced Rain Forecasting TechniquesPredictive Precipitation: Advanced Rain Forecasting Techniques
Predictive Precipitation: Advanced Rain Forecasting Techniques
 
如何办理英国卡迪夫大学毕业证(Cardiff毕业证书)成绩单留信学历认证
如何办理英国卡迪夫大学毕业证(Cardiff毕业证书)成绩单留信学历认证如何办理英国卡迪夫大学毕业证(Cardiff毕业证书)成绩单留信学历认证
如何办理英国卡迪夫大学毕业证(Cardiff毕业证书)成绩单留信学历认证
 
edited gordis ebook sixth edition david d.pdf
edited gordis ebook sixth edition david d.pdfedited gordis ebook sixth edition david d.pdf
edited gordis ebook sixth edition david d.pdf
 
1:1原版定制利物浦大学毕业证(Liverpool毕业证)成绩单学位证书留信学历认证
1:1原版定制利物浦大学毕业证(Liverpool毕业证)成绩单学位证书留信学历认证1:1原版定制利物浦大学毕业证(Liverpool毕业证)成绩单学位证书留信学历认证
1:1原版定制利物浦大学毕业证(Liverpool毕业证)成绩单学位证书留信学历认证
 
NOAM AAUG Adobe Summit 2024: Summit Slam Dunks
NOAM AAUG Adobe Summit 2024: Summit Slam DunksNOAM AAUG Adobe Summit 2024: Summit Slam Dunks
NOAM AAUG Adobe Summit 2024: Summit Slam Dunks
 
2024 Q2 Orange County (CA) Tableau User Group Meeting
2024 Q2 Orange County (CA) Tableau User Group Meeting2024 Q2 Orange County (CA) Tableau User Group Meeting
2024 Q2 Orange County (CA) Tableau User Group Meeting
 
Abortion Clinic in Randfontein +27791653574 Randfontein WhatsApp Abortion Cli...
Abortion Clinic in Randfontein +27791653574 Randfontein WhatsApp Abortion Cli...Abortion Clinic in Randfontein +27791653574 Randfontein WhatsApp Abortion Cli...
Abortion Clinic in Randfontein +27791653574 Randfontein WhatsApp Abortion Cli...
 
Atlantic Grupa Case Study (Mintec Data AI)
Atlantic Grupa Case Study (Mintec Data AI)Atlantic Grupa Case Study (Mintec Data AI)
Atlantic Grupa Case Study (Mintec Data AI)
 
一比一原版阿德莱德大学毕业证成绩单如何办理
一比一原版阿德莱德大学毕业证成绩单如何办理一比一原版阿德莱德大学毕业证成绩单如何办理
一比一原版阿德莱德大学毕业证成绩单如何办理
 
一比一原版西悉尼大学毕业证成绩单如何办理
一比一原版西悉尼大学毕业证成绩单如何办理一比一原版西悉尼大学毕业证成绩单如何办理
一比一原版西悉尼大学毕业证成绩单如何办理
 
MATERI MANAJEMEN OF PENYAKIT TETANUS.ppt
MATERI  MANAJEMEN OF PENYAKIT TETANUS.pptMATERI  MANAJEMEN OF PENYAKIT TETANUS.ppt
MATERI MANAJEMEN OF PENYAKIT TETANUS.ppt
 
社内勉強会資料_Object Recognition as Next Token Prediction
社内勉強会資料_Object Recognition as Next Token Prediction社内勉強会資料_Object Recognition as Next Token Prediction
社内勉強会資料_Object Recognition as Next Token Prediction
 

01. Introduction.pptx

  • 1. Data Science: Concepts and Practice Course slides
  • 2. Data Science: Concepts and Practice Authors : Vijay Kotu & Bala Deshpande Publisher : Morgan Kaufmann Course Book Course Software www.rapidminer.com Free Download
  • 4. What is Data Science
  • 6.
  • 7. Types of Data Science
  • 8. Tasks Description Algorithms Examples Classification Predict if a data point belongs to one of predefined classes. The prediction will be based on learning from known data set. Decision Trees, Neural networks, Bayesian models, Induction rules, K nearest neighbors Assigning voters into known buckets by political parties eg: soccer moms. Bucketing new customers into one of known customer groups. Regression Predict the numeric target label of a data point. The prediction will be based on learning from known data set. Linear regression, Logistic regression Predicting unemployment rate for next year. Estimating insurance premium. Anomaly detection Predict if a data point is an outlier compared to other data points in the data set. Distance based, Density based, LOF Fraud transaction detection in credit cards. Network intrusion detection. Time series Predict if the value of the target variable for future time frame based on history values. Exponential smoothing, ARIMA, regression Sales forecasting, production forecasting, virtually any growth phenomenon that needs to be extrapolated Clustering Identify natural clusters within the data set based on inherit properties within the data set. K means, density based clustering - DBSCAN Finding customer segments in a company based on transaction, web and customer call data. Association analysis Identify relationships within an itemset based on transaction data. FP Growth, Apriori Find cross selling opportunities for a retailor based on transaction purchase history.
  • 9. Course outline Data Science Process Data Exploration Model Evaluation Classification Decision Trees Rule Induction k-Nearest Neighbors Naïve Bayesian Artificial Neural Networks Support Vector Machines Ensemble Learners Regression Linear Regression Logistic Regression Association Analysis Apriori FP-Growth Clustering k-Means DBSCAN Self-Organizing Maps Text Mining Time Series Forecasting Anomaly Detection Feature Selection Process Basics Core Algorithms Common Applications