DATA SCIENCE
NIMIT JAIN
(252101141)
AGENDA
• INTRODUCTION
• PYTHON FOR DATA SCIENCE
• UNDERSTANDING THE STATISTICS FOR DATA SCIENCE
• PREDICTIVE MODELING AND MACHINE LEARNING
DATA SCIENCE =
DATA+SCIENCE
The field of bringing out insights from data
using scientific techniques is called data
science.
TERMINOLOGIES USED IN DATA SCIENCE
MIS/Reporting
Detective Analysis
Dashboarding
Predictive Modelling/Machine Learning
Big Data
Forecasting
Business Intelligence
FORECASTING – The process of predicting or estimating the future based on past and present data.
Predictive Modelling – Used to make more granular predictions, such as "Which customers are likely to buy a product next month?", so that the business can act accordingly.
Machine Learning – A method of teaching machines to learn from data and improve their predictions on their own.
Detective Analysis – Analysing past data only; no future outcome or forecast is produced.
PYTHON FOR DATA SCIENCE
• Operators
• Variables and variable naming conventions
• Data types in Python
• Conditional statements
• Looping statements
• Functions
• Libraries in Python
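The language basics listed above can be seen together in a few lines. This is only an illustrative sketch; the names `describe_number`, `item_prices` and `tax_rate`, and all the values, are made up for the example.

```python
def describe_number(n):
    """Function + conditional statement: classify an integer as even or odd."""
    if n % 2 == 0:          # % is the modulo operator, == is a comparison operator
        return "even"
    return "odd"

# Variables follow the lowercase_with_underscores naming convention (PEP 8).
item_prices = [120, 75, 310]   # data type: list of int
tax_rate = 0.18                # data type: float (hypothetical value)
labels = {}                    # data type: dict

# Looping statement: visit every price in the list.
for price in item_prices:
    labels[price] = describe_number(price)

# Arithmetic operators combined with the built-in sum() function.
total_with_tax = sum(item_prices) * (1 + tax_rate)
print(labels, total_with_tax)
```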
LIBRARIES USED
NumPy, SciPy, Pandas, Matplotlib
PANDASPanel Data System
Pandas is an open source, BSD-licensed library.
High-performance, easy-to-use data structures.
Provides data analysis and data manipulation
tools (reshaping, merging, sorting, slicing,
aggregation etc.)
Allow handling missing data.
Reading different varieties of Data
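A short sketch of the features listed above (missing data, merging, aggregation, sorting). The two tables and their column names are hypothetical and exist only for illustration.

```python
import pandas as pd

# Hypothetical sales records; None marks a missing value.
sales = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "region": ["North", "South", "North", "South"],
    "amount": [250.0, None, 100.0, 300.0],
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "name": ["Asha", "Ravi", "Meera", "John"],
})

# Handling missing data: fill the missing amount with the column mean.
sales["amount"] = sales["amount"].fillna(sales["amount"].mean())

# Merging: join the two tables on their shared key column.
merged = sales.merge(customers, on="customer_id")

# Aggregation and sorting: total amount per region, largest first.
totals = merged.groupby("region")["amount"].sum().sort_values(ascending=False)
print(totals)
```

Filling with the mean is only one possible choice; dropping the row with `dropna()` is another common option.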
NUMPY
Introduces objects for multidimensional arrays and matrices, as well as functions that make it easy to perform advanced mathematical and statistical operations on those objects.
Provides vectorization of mathematical operations on arrays and matrices, which significantly improves performance.
Many other Python libraries are built on NumPy.
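A minimal sketch of what vectorization looks like in practice, using a small made-up matrix: the arithmetic and the statistics are applied to whole arrays without writing an explicit loop.

```python
import numpy as np

# A 2-D array (matrix) of hypothetical measurements: 3 rows, 2 columns.
data = np.array([[1.0, 2.0],
                 [3.0, 4.0],
                 [5.0, 6.0]])

# Vectorized arithmetic: applied to every element at once, no Python loop.
scaled = data * 10 + 1

# Statistical operations along an axis (axis=0 means "per column").
col_means = data.mean(axis=0)
col_std = data.std(axis=0)

# Matrix product of the transpose with the original (2x3 @ 3x2 -> 2x2).
gram = data.T @ data
print(scaled, col_means, col_std, gram, sep="\n")
```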
MATPLOTLIB
A Python 2D plotting library that produces publication-quality figures in a variety of hardcopy formats.
Offers a set of functionalities similar to those of MATLAB.
• Line plots, scatter plots, bar charts, histograms, pie charts, etc.
Relatively low-level; some effort is needed to create advanced visualizations.
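As an illustration of two of the plot types mentioned above, the sketch below draws a line plot and a histogram side by side and writes the figure as a PNG. The data values are invented, and the non-interactive `Agg` backend is chosen here only so the example runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt

# Hypothetical data for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]

fig, (ax_line, ax_hist) = plt.subplots(1, 2, figsize=(8, 3))

# Line plot with point markers and labelled axes.
ax_line.plot(x, y, marker="o")
ax_line.set_xlabel("x")
ax_line.set_ylabel("y")
ax_line.set_title("Line plot")

# Histogram of the y values, grouped into 3 bins.
ax_hist.hist(y, bins=3)
ax_hist.set_title("Histogram")

fig.tight_layout()

# Save as PNG into an in-memory buffer; passing a file path works the same way.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
```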
SCIPY
A collection of algorithms for linear algebra, differential equations, numerical integration, optimization, statistics and more.
Part of the SciPy Stack.
Built on NumPy.
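Three of the areas listed above (numerical integration, optimization, linear algebra) can be tried with one call each. The functions and the small linear system below are arbitrary examples picked so the exact answers are easy to check by hand.

```python
import numpy as np
from scipy import integrate, linalg, optimize

# Numerical integration: integral of x^2 from 0 to 3 (exact value is 9).
area, abs_error = integrate.quad(lambda x: x ** 2, 0, 3)

# Optimization: minimum of (x - 2)^2 + 1, which lies at x = 2.
result = optimize.minimize_scalar(lambda x: (x - 2) ** 2 + 1)

# Linear algebra: solve the system 3a + b = 9, a + 2b = 8 (a = 2, b = 3).
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
solution = linalg.solve(A, b)

print(area, result.x, solution)
```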
STATISTICAL
USE
DESCRIPTIVE STATISTICS
Descriptive statistics describe, show, and summarize the basic features of a dataset found in a given study, presented in a summary that describes the data sample and its measurements. They help analysts understand the data better.
FREQUENCY DISTRIBUTION
A table that displays the frequency of the various outcomes in a sample.
MEASURES OF CENTRAL TENDENCY
• MEAN
• MEDIAN
• MODE
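A frequency distribution and the three measures of central tendency can be computed with Python's standard library alone. The sample below is hypothetical (items bought by ten customers).

```python
from collections import Counter
import statistics

# Hypothetical sample: number of items bought by ten customers.
sample = [2, 3, 3, 5, 2, 3, 4, 5, 3, 2]

# Frequency distribution: how often each outcome occurs, sorted by outcome.
frequency = dict(sorted(Counter(sample).items()))

mean = statistics.mean(sample)      # arithmetic average
median = statistics.median(sample)  # middle value of the sorted sample
mode = statistics.mode(sample)      # most frequent value

print(frequency, mean, median, mode)
```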
MEASURES OF VARIABILITY
• RANGE
• STANDARD DEVIATION
• VARIANCE
UNIVARIATE DESCRIPTIVE STATISTICS
Statistics that focus on only one variable at a time.
BIVARIATE DESCRIPTIVE STATISTICS
o SCATTERPLOT
o CONTINGENCY TABLE
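The measures of variability and a simple contingency table can be sketched as follows. The data is invented, and the population formulas (`pvariance`, `pstdev`) are used here as an assumption; `variance` and `stdev` give the sample versions.

```python
from collections import Counter
import statistics

# Univariate: hypothetical daily sales figures.
values = [4, 8, 6, 5, 7]

value_range = max(values) - min(values)
variance = statistics.pvariance(values)  # population variance
std_dev = statistics.pstdev(values)      # population standard deviation

# Bivariate: contingency table counting (region, purchased) combinations.
pairs = [
    ("North", "yes"), ("North", "no"), ("South", "yes"),
    ("North", "yes"), ("South", "no"), ("South", "no"),
]
contingency = Counter(pairs)

print(value_range, variance, std_dev)
for (region, purchased), count in sorted(contingency.items()):
    print(region, purchased, count)
```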
INFERENTIAL STATISTICS
Statistical methods that deduce the characteristics of a larger population from a small but representative sample.
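One common form of inference is estimating a population mean from a sample, together with a confidence interval. The sketch below simulates a population (an assumption made purely so the example is self-contained) and uses the normal approximation for a 95% interval.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 simulated heights in cm.
population = [random.gauss(170, 10) for _ in range(10_000)]

# Draw a small random sample from the population.
sample = random.sample(population, 100)

# Point estimate and standard error of the mean.
sample_mean = statistics.mean(sample)
standard_error = statistics.stdev(sample) / math.sqrt(len(sample))

# 95% confidence interval using the normal approximation (z is about 1.96).
z = statistics.NormalDist().inv_cdf(0.975)
ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

print(f"estimate: {sample_mean:.2f}, 95% CI: ({ci_low:.2f}, {ci_high:.2f})")
print(f"population mean: {statistics.mean(population):.2f}")
```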
REGRESSION ANALYSIS
• LINEAR REGRESSION
• NOMINAL REGRESSION
• LOGISTIC REGRESSION
• ORDINAL REGRESSION
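Of the regression types listed, simple linear regression is the easiest to show by hand. The sketch below fits a line by ordinary least squares using the closed-form formulas; the data points and the helper name `predict` are hypothetical.

```python
# Hypothetical data: advertising spend (x) and resulting sales (y).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.9, 5.1, 7.0, 9.2, 10.8]

x_mean = sum(x) / len(x)
y_mean = sum(y) / len(y)

# Ordinary least squares for one predictor:
# slope = sum((x - x_mean) * (y - y_mean)) / sum((x - x_mean)^2)
s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
s_xx = sum((xi - x_mean) ** 2 for xi in x)
slope = s_xy / s_xx
intercept = y_mean - slope * x_mean

def predict(new_x):
    """Predict y for a new x using the fitted line."""
    return intercept + slope * new_x

print(slope, intercept, predict(6.0))
```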
PREDICTIVE MODELING
Using past data and other attributes to predict future outcomes.
TYPES OF PREDICTIVE MODELS
SUPERVISED LEARNING
UNSUPERVISED LEARNING
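The difference between the two types can be sketched in plain Python. In the supervised case the examples carry labels; in the unsupervised case only the values are given and groups must be discovered. The 1-nearest-neighbour rule and the tiny 1-D k-means below are chosen only as simple stand-ins, and all data and function names are hypothetical.

```python
# Supervised learning: labelled examples (feature, label) are given.
labelled = [(1.0, "low"), (1.5, "low"), (8.0, "high"), (9.0, "high")]

def predict_1nn(x):
    """Return the label of the closest labelled example (1-nearest neighbour)."""
    nearest = min(labelled, key=lambda pair: abs(pair[0] - x))
    return nearest[1]

# Unsupervised learning: only features, no labels; find 2 groups.
points = [1.0, 1.5, 2.0, 8.0, 9.0, 10.0]

def two_means(data, iterations=10):
    """Tiny 1-D k-means with k=2; assumes the data has at least two distinct values."""
    centres = [min(data), max(data)]
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for p in data:
            # Assign each point to the closer centre.
            idx = 0 if abs(p - centres[0]) <= abs(p - centres[1]) else 1
            clusters[idx].append(p)
        # Move each centre to the mean of its cluster.
        centres = [sum(c) / len(c) for c in clusters]
    return centres, clusters

centres, clusters = two_means(points)
print(predict_1nn(2.0), predict_1nn(7.0), centres, clusters)
```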
• PROBLEM DEFINITION – This initial phase of a data mining project focuses on understanding the project objectives and requirements.
• HYPOTHESIS GENERATION – Helps in comprehending the business problem: by examining in depth the various factors that affect the target variable, we get a much better idea of which major factors matter for solving the problem.
• DATA EXTRACTION – The process of obtaining data from a database or SaaS platform.
• DATA EXPLORATION – An approach similar to initial data analysis, in which a data analyst uses visual exploration to understand what is in a dataset and the characteristics of the dataset.
• MODEL DEPLOYMENT – In data science, deployment refers to applying a model to new data to make predictions.
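The extraction and exploration stages above can be sketched with the standard library. An in-memory SQLite table stands in for the database, and the table name, columns and rows are invented for the example; exploration here is a numeric summary rather than a visual one.

```python
import sqlite3
import statistics

# DATA EXTRACTION: pull rows from a database (an in-memory SQLite table here).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("A", 120.0), ("B", 80.0), ("A", 200.0), ("C", 50.0)],
)
rows = conn.execute("SELECT customer, amount FROM orders").fetchall()
conn.close()

# DATA EXPLORATION: a basic summary of what the dataset contains.
amounts = [amount for _, amount in rows]
summary = {
    "rows": len(rows),
    "customers": len({customer for customer, _ in rows}),
    "min": min(amounts),
    "max": max(amounts),
    "mean": statistics.mean(amounts),
}
print(summary)
```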
THANK YOU
