Data science has become an essential business tool. With access to
incredible amounts of data—thanks to advanced computing and the
“Internet of things”—companies are now able to measure every aspect
of their operations in granular detail.
Introduction
There are no shortcuts for data exploration. If you believe that machine
learning can sail you away from every data storm, trust me, it won’t.
At some point you will find yourself struggling to improve your model’s
accuracy. In such situations, data exploration techniques will come to
your rescue.
Steps of Data Exploration and Preparation
Below are the steps involved in understanding, cleaning and preparing
your data for building a predictive model:
Variable Identification
Univariate Analysis
Bi-variate Analysis
Missing values treatment
Outlier treatment
Variable transformation
Variable creation
Variable Identification
First, identify the predictor (input) and target (output) variables.
Next, identify the data type and category of each variable.
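As a minimal sketch of this step, the snippet below labels each column of a small table with a role (predictor or target) and a rough category (continuous or categorical). The records, the column names and the choice of `bought` as the target are hypothetical and only serve the illustration; real data would also need checks for dates, IDs and numeric codes that are really categories.

```python
# Hypothetical toy records; column names are illustrative, not from the slides.
rows = [
    {"age": 34, "gender": "F", "income": 52000.0, "bought": "yes"},
    {"age": 45, "gender": "M", "income": 61000.0, "bought": "no"},
    {"age": 29, "gender": "F", "income": 48000.0, "bought": "yes"},
]
TARGET = "bought"  # assumed target (output) variable


def identify_variables(rows, target):
    """Label each column with its role and a rough data-type category."""
    summary = {}
    for name in rows[0]:
        values = [r[name] for r in rows]
        # bool is a subclass of int, so exclude it from the numeric check.
        numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in values
        )
        summary[name] = {
            "role": "target" if name == target else "predictor",
            "category": "continuous" if numeric else "categorical",
        }
    return summary


variables = identify_variables(rows, TARGET)
for name, info in variables.items():
    print(name, info)
```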
Univariate Analysis
At this stage, we explore variables one by one. The method used for
univariate analysis depends on whether the variable is categorical or
continuous.
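One possible way to sketch this split: summarise a continuous variable with measures of central tendency and spread, and a categorical variable with a frequency table. The function names and sample values below are assumptions made for the example, not part of the slides.

```python
from collections import Counter
import statistics


def describe_continuous(values):
    """Central tendency and spread for a continuous variable."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


def describe_categorical(values):
    """Count and share of each level of a categorical variable."""
    counts = Counter(values)
    total = len(values)
    return {level: (n, n / total) for level, n in counts.items()}


# Hypothetical sample values.
ages = [23, 25, 31, 35, 41, 52]
genders = ["F", "M", "F", "F", "M", "F"]

age_summary = describe_continuous(ages)
gender_summary = describe_categorical(genders)
print(age_summary)
print(gender_summary)
```

In practice these summaries are usually paired with a histogram or box plot for continuous variables and a bar chart for categorical ones.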
Bi-variate Analysis
Bi-variate analysis examines the relationship between two variables.
Here, we look for association or disassociation between the variables
at a pre-defined significance level.
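The sketch below illustrates two common bi-variate checks, neither of which is named in the slides: the Pearson correlation coefficient for two continuous variables, and the chi-square statistic for two categorical variables. The data are invented, and the 0.05 significance level with its critical value of 3.841 (one degree of freedom) is an assumed choice for the example.

```python
import math
from collections import Counter


def pearson_r(xs, ys):
    """Pearson correlation between two continuous variables."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def chi_square(a, b):
    """Chi-square statistic for the two-way table of two categorical variables."""
    n = len(a)
    observed = Counter(zip(a, b))
    row_totals, col_totals = Counter(a), Counter(b)
    stat = 0.0
    for r in row_totals:
        for c in col_totals:
            expected = row_totals[r] * col_totals[c] / n
            stat += (observed[(r, c)] - expected) ** 2 / expected
    return stat


# Hypothetical continuous pair: hours studied vs. test score.
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 70]
r = pearson_r(hours, score)

# Hypothetical categorical pair: gender vs. purchase decision.
gender = ["F"] * 10 + ["M"] * 10
bought = ["yes"] * 8 + ["no"] * 2 + ["yes"] * 3 + ["no"] * 7
stat = chi_square(gender, bought)

CRITICAL_0_05_DF1 = 3.841  # chi-square critical value, 1 degree of freedom
print(f"r = {r:.3f}")
print(f"chi-square = {stat:.3f}, associated: {stat > CRITICAL_0_05_DF1}")
```

With samples this small the chi-square approximation is rough, so the result should be read as an illustration of the procedure rather than a reliable test.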
Missing Value Treatment
Missing data in the training data set can reduce the power or fit of a
model, or can lead to a biased model, because the behavior of the
variables and their relationships with one another have not been
analysed correctly. This can lead to wrong predictions or classifications.
We looked at the importance of treating missing values in a dataset.
Now, let’s identify the reasons these missing values occur. They may
arise at two stages:
Data Extraction
Data Collection
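Whatever the stage at which the gaps arise, a first practical step is to count them per variable and then decide on a treatment. The sketch below counts missing values and fills them with the mean (for a continuous column) or the mode (for a categorical column). Mean and mode imputation are assumptions chosen for illustration, as the slides do not prescribe a method, and the records are hypothetical.

```python
import statistics

# Hypothetical records; None marks a missing value.
rows = [
    {"age": 34, "city": "Pune"},
    {"age": None, "city": "Delhi"},
    {"age": 29, "city": None},
    {"age": 45, "city": "Delhi"},
]


def missing_counts(rows):
    """Number of missing values per column."""
    return {col: sum(r[col] is None for r in rows) for col in rows[0]}


def impute(rows, col, strategy):
    """Return new rows with missing values in `col` filled by mean or mode."""
    present = [r[col] for r in rows if r[col] is not None]
    if strategy == "mean":
        fill = statistics.mean(present)
    else:
        fill = statistics.mode(present)
    return [{**r, col: fill if r[col] is None else r[col]} for r in rows]


counts = missing_counts(rows)
filled = impute(rows, "age", "mean")
filled = impute(filled, "city", "mode")
print(counts)
print(filled)
```

Simple imputation like this shrinks the variance of the filled column, so it is worth comparing model results with and without the imputed rows.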
Outlier treatment
“Outlier” is a term commonly used by analysts and data scientists.
Outliers need close attention; otherwise they can result in wildly wrong
estimates. Outliers can be of two types: univariate and multivariate.
Outliers can drastically change the results of data analysis and
statistical modeling:
They increase the error variance and reduce the power of statistical tests.
If the outliers are non-randomly distributed, they can decrease normality.
They can bias or influence estimates that may be of substantive interest.
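To make these effects concrete, the sketch below flags univariate outliers with the 1.5 × IQR rule and compares the mean and standard deviation with and without them. The IQR rule is a common convention assumed here, not something the slides specify, and the data are invented.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Values lying more than k * IQR outside the first or third quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical measurements with one extreme value.
data = [12, 13, 14, 15, 15, 16, 17, 18, 95]

outliers = iqr_outliers(data)
clean = [v for v in data if v not in outliers]

print("outliers:", outliers)
print("mean with / without:", statistics.mean(data), statistics.mean(clean))
print("stdev with / without:", statistics.stdev(data), statistics.stdev(clean))
```

A single extreme value pulls the mean upward and inflates the standard deviation, which is the increase in error variance described above. Multivariate outliers need a different approach, such as a distance measure across several variables, which this sketch does not cover.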
Working of Data Analysis
A working knowledge of data science can help leaders turn analytics into
genuine insight. It can also save them from making decisions based on
faulty assumptions. “When analytics goes bad,”
How can leaders learn to distinguish between good and bad analytics?
It all starts with understanding the data-generation process. You cannot
judge the quality of the analytics if you don’t have a very clear idea of
where the data came from.
Guide to data analytics
