QQ Plot
Introduction
Engineers and scientists work with data; without it, they cannot draw any
conclusions. Data is now generated every day from every aspect of our lives.
Some data are random and some are biased, often because of the data
collection process.
One very important aspect of data is its distribution. The collected data
may be normally distributed or far from normal. It may be skewed to either
side or follow a multimodal pattern, and it may be discrete or continuous.
For continuous data, a normal distribution brings many advantages over other
distributions.
Many inferential statistical procedures assume that the data are normally
distributed. A bell-shaped curve is easy to describe with its mean and
standard deviation.
Why Q-Q plot?
Because the normal distribution is so important, we need to check whether the
collected data are normal.
Here, we will demonstrate how the Q-Q plot can be used to check the normality
or skewness of data. Q stands for quantile, so a Q-Q plot is a
quantile-quantile plot.
There are also several statistical tests for normality, such as the
Kolmogorov–Smirnov test and the Shapiro–Wilk test.
However, it is difficult to judge normality just by looking at a table of
numbers, which is why a graphical check such as the Q-Q plot is useful.
Brief explanation?
We now know that a Q-Q plot is a quantile-quantile plot, but what is a
quantile in the first place?
When the data are sorted, the 50th percentile (the 0.5 quantile) is the point
with 50% of the data below it and 50% above it. That is the median.
Likewise, only 1% of the data falls below the 1st percentile and 99% lies
above it. The 25th, 50th and 75th percentiles are also known as quartiles.
There are three quartiles in a dataset.
Q1 = first quartile = 25th percentile
Q2 = second quartile = 50th percentile = median
Q3 = third quartile = 75th percentile
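The three quartiles can be computed with Python's standard library. This is a minimal sketch: the nine-value sample is hypothetical, and the "inclusive" interpolation method is only one of several conventions for locating cut points.

```python
from statistics import median, quantiles

# Hypothetical sample; sorting makes the "below/above" reading explicit.
data = sorted([7, 1, 9, 3, 5, 2, 8, 4, 6])

# n=4 asks for the three cut points that split the data into four parts.
# method="inclusive" treats the sample as the whole population (an assumption).
q1, q2, q3 = quantiles(data, n=4, method="inclusive")

print(f"Q1 = {q1}, Q2 = {q2}, Q3 = {q3}")
print(f"median = {median(data)}")
```

For the values 1 to 9 this gives Q1 = 3, Q2 = 5 and Q3 = 7, and Q2 coincides with the median. Other interpolation conventions can give slightly different cut points on small samples.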
Quantiles expressed as percentages are also called percentiles.
A typical Q-Q plot is shown below. Let’s explain this plot, which looks very
much like a straight line.
Axes
The x-axis of a Q-Q plot represents the quantiles of
the standard normal distribution.
Let’s say we have normally distributed data and we want to
standardize it. Standardizing means subtracting the mean
from each data point and dividing the result by the standard
deviation.
The result is known as the z-score. Let’s sort those
z-scores and plot again. The plot below shows that the
x-axis is now centered at 0 and extends up to 3.
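The standardizing and sorting steps above can be sketched with the standard library alone. The function name `qq_points`, the synthetic sample, and the plotting positions (i - 0.5) / n are assumptions chosen for illustration; other plotting-position formulas are in common use.

```python
import random
from statistics import NormalDist, mean, stdev


def qq_points(sample):
    """Return (theoretical, observed) quantile pairs for a normal Q-Q plot."""
    n = len(sample)
    mu, sigma = mean(sample), stdev(sample)
    # Standardize each point to a z-score, then sort: the observed quantiles.
    observed = sorted((x - mu) / sigma for x in sample)
    # Matching quantiles of the standard normal distribution (x-axis values).
    # The plotting positions (i - 0.5) / n are one convention, assumed here.
    std_normal = NormalDist()
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return theoretical, observed


random.seed(0)
sample = [random.gauss(50, 10) for _ in range(200)]  # synthetic normal data
theoretical, observed = qq_points(sample)
```

Plotting `observed` against `theoretical` as a scatter plot gives the Q-Q plot. Points lying close to a straight line suggest the data are roughly normal, while systematic curvature at the ends points to skewness or heavy tails.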
Network Graph: Motivation
Wouldn’t it be nice if you could visualize the connections in your data
using an interactive network graph like the one below?
Network Graph: Pyvis
What is Pyvis?
Pyvis is a Python library that allows you to create interactive network graphs
in a few lines of code.
To install pyvis, type:
pip install pyvis
What is a Tree map?
A tree map is a chart that visualizes categorical, preferably hierarchical,
data as a set of nested rectangles.
In hierarchical data, the categories or items share parent-child
relationships within an overall tree structure.
The simplest example of this type of data structure is a company, where
all individuals and their designations within teams are grouped under
one entity, i.e., the company itself.
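The parent-child structure described above can be pictured as a nested dictionary. In this sketch the departments, teams and headcounts are all hypothetical. The helper `total` shows how each parent's value is the sum of its children, which is the part-to-whole property a tree map relies on.

```python
# Hypothetical company hierarchy: department -> team -> headcount.
company = {
    "Engineering": {"Backend": 12, "Frontend": 8},
    "Sales": {"Domestic": 5, "International": 7},
}


def total(node):
    """Return the value of a node: a leaf's own value or the sum of its children."""
    if isinstance(node, dict):
        return sum(total(child) for child in node.values())
    return node


for department, teams in company.items():
    print(department, total(teams))
print("Company", total(company))
```

In a tree map, the company would be the outer rectangle, each department a rectangle inside it, and each team a rectangle inside its department.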
When to use a Tree map?
These are some key points to consider before using tree maps for
visualization:
– Tree maps work well when there is a clear part-to-whole relationship
among the categories present in the data.
– Hierarchical data is needed, meaning the data can be arranged
in branches and sub-branches.
– The focus is not on precise comparisons between categories but rather on
spotting the key factors, trends or patterns.
Benefits of using a Tree map
– Space-efficient: A large amount of hierarchical data can be visualized
in a small space.
– Easier to read: Compared to a circular multi-level pie chart, the tree
map is easier to read due to its linear visual appearance.
– Quickly spot patterns: Since each group is represented by a rectangle
whose area is always proportional to its value, trends and patterns
(similarities and anomalies) are quickly visible in tree maps.
Real-world use cases for Tree map Charts?
1. Displaying region-wise customer complaints about a product
Suppose there are 10 different types of complaints (denoted C1 to C10)
about a product, and the company wants to visualize which complaints are
relevant to each region. A tree map could be used in such a case. It makes
it easy to see which types of user complaints are specific to each
region.
2. Showcasing category-wise product availability, such as mobile phones
Let us assume that there are four categories of mobile phones with the
following market shares: Low-end (up to 10,000 INR, 15%), Mid-Range
(10,000 to 25,000 INR, 55%), Premium (above 25,000 to 50,000 INR, 25%), and
Top-end (above 50,000 INR, 10%).
From this tree map, we can gauge that there is greater demand and a larger
market for Mid-Range phones, while few phones are available in the
Top-end category.
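To illustrate how rectangle area tracks value, here is a minimal single-level layout in the slice-and-dice style, not a full nested tree map. The function name `slice_layout` and the 100 by 100 canvas are assumptions. Because the four shares as quoted add up to 105, the sketch divides by their sum, so only the relative sizes matter.

```python
def slice_layout(values, x=0.0, y=0.0, width=100.0, height=100.0):
    """Split one rectangle into vertical strips with area proportional to value.

    Returns {label: (x, y, width, height)} for each strip.
    """
    total = sum(values.values())
    rects = {}
    for label, value in values.items():
        strip_width = width * value / total  # equal heights, so width ~ area
        rects[label] = (x, y, strip_width, height)
        x += strip_width  # the next strip starts where this one ends
    return rects


# Market shares from the mobile phone example above.
shares = {"Low-end": 15, "Mid-Range": 55, "Premium": 25, "Top-end": 10}
rects = slice_layout(shares)

for label, (rx, ry, rw, rh) in rects.items():
    print(f"{label:10s} x={rx:6.2f} width={rw:6.2f} area={rw * rh:8.2f}")
```

The Mid-Range strip is the widest and the Top-end strip the narrowest, which is the visual cue the slide describes. A nested tree map would apply the same split again inside each rectangle, usually alternating between vertical and horizontal cuts.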
3. Exploring customer segmentation for a product
Companies selling apparel or personal products usually segment their
customers by age. This way they can categorize their products and
product variants separately for each age group.
With such a tree map, the company could decide whether to launch
more products for particular customer segments based on the distribution.
Challenges associated with a Tree map
Tree maps also come with a set of limitations, as outlined below:
– Tree maps built with a large number of data points on a single level can
be hard to read and to print for reporting purposes.
– Sometimes, additional sorting might be required to understand the data
better. However, all the rectangles are automatically ordered within the parent
node by area.
– With too many categories and colors to represent them, the tree map
becomes overwhelming for the reader.
– Tree maps become ineffective for datasets with balanced trees, i.e., when
items are of similar value. In these cases, the main purpose of a tree map,
highlighting the largest item in a given category, cannot be achieved.
Dendrograms in Python
A dendrogram is a diagram that depicts a tree.
Plotly's create_dendrogram figure factory performs hierarchical clustering on
the data and draws the resulting tree.
Distances between clusters are represented by the values on the tree depth axis.
Dendrogram plots are often used in computational biology to depict gene
or sample grouping, occasionally in the margins of heatmaps.
Hierarchical clustering produces dendrograms as an output. Many people
claim that dendrograms of this type may be used to determine the number of
clusters.
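The link between hierarchical clustering, merge distances and the number of clusters can be sketched as follows. This example uses SciPy's `scipy.cluster.hierarchy` functions rather than the figure factory mentioned above. The six 2-D points are hypothetical, and Ward linkage is one choice among several.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage

# Hypothetical data: two well-separated groups of three points each.
points = np.array([
    [1.0, 1.0], [1.2, 0.9], [0.9, 1.1],
    [8.0, 8.0], [8.2, 7.9], [7.9, 8.1],
])

# Each row of Z records one merge: the two clusters joined, the distance
# at which they were joined, and the size of the new cluster.
Z = linkage(points, method="ward")

# no_plot=True returns the tree layout without drawing it.
tree = dendrogram(Z, no_plot=True)
print("leaf order:", tree["ivl"])

# Cutting the tree so that at most two clusters remain.
labels = fcluster(Z, t=2, criterion="maxclust")
print("cluster labels:", labels)
print("merge distances:", Z[:, 2])
```

The last merge distance is much larger than the earlier ones, and a large jump like this is what people look for when reading the number of clusters off a dendrogram. Calling `dendrogram(Z)` without `no_plot=True` draws the tree when matplotlib is available.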
Wholesale Customer Segmentation Problem using Hierarchical Clustering
We will be working on a wholesale customer segmentation problem. The
dataset can be downloaded from the UCI Machine Learning Repository, where
it is hosted. The aim of this problem is to segment the clients of a
wholesale distributor based on their annual spending on diverse product
categories, such as milk and grocery. The data also record attributes such
as each client's region.
Scree Plot in Python
How to Create a Scree Plot in Python
Principal components analysis (PCA) is an unsupervised machine learning
technique that finds principal components (linear combinations of the
predictor variables) that explain a large portion of the variation in a dataset.
When we perform PCA, we’re interested in understanding what percentage
of the total variation in the dataset can be explained by each principal
component.
One of the easiest ways to visualize the percentage of variation explained
by each principal component is to create a scree plot.
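The percentage of variation behind a scree plot can be computed directly with NumPy, as in this sketch. The synthetic three-column dataset and the function names `explained_variance_ratio` and `plot_scree` are assumptions for illustration. The plotting helper needs matplotlib and is defined here but not called.

```python
import numpy as np


def explained_variance_ratio(X):
    """Return the fraction of total variance explained by each component."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    # Singular values of the centered data give the component variances:
    # variance_k = s_k**2 / (n - 1), already in descending order.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2 / (len(X) - 1)
    return variances / variances.sum()


def plot_scree(ratios, path="scree.png"):
    """Save a scree plot: component number against variance explained (%)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    components = np.arange(1, len(ratios) + 1)
    fig, ax = plt.subplots()
    ax.plot(components, ratios * 100, "o-")
    ax.set_xticks(components)
    ax.set_xlabel("Principal component")
    ax.set_ylabel("Variance explained (%)")
    ax.set_title("Scree plot")
    fig.savefig(path)
    plt.close(fig)


# Synthetic data: two strongly correlated columns plus one independent column.
rng = np.random.default_rng(0)
base = rng.normal(size=(100, 1))
X = np.hstack([
    base,
    2 * base + rng.normal(scale=0.1, size=(100, 1)),
    rng.normal(size=(100, 1)),
])

ratios = explained_variance_ratio(X)
print(np.round(ratios * 100, 1))
```

Calling `plot_scree(ratios)` saves the plot to a file. Because two of the columns are nearly redundant, the first component should account for most of the variance, and the curve drops sharply after it. Note that the columns are centered but not scaled here. Standardizing each column first is a common alternative when the variables are in different units.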

More Related Content

Similar to QQ Plot.pptx

Research methodology-Research Report
Research methodology-Research ReportResearch methodology-Research Report
Research methodology-Research ReportDrMAlagupriyasafiq
 
Research Methodology-Data Processing
Research Methodology-Data ProcessingResearch Methodology-Data Processing
Research Methodology-Data ProcessingDrMAlagupriyasafiq
 
Cancer genomics first look
Cancer genomics first lookCancer genomics first look
Cancer genomics first lookLinu George
 
Descriptive Statistics
Descriptive StatisticsDescriptive Statistics
Descriptive StatisticsCIToolkit
 
Visual Analytics in Big Data
Visual Analytics in Big DataVisual Analytics in Big Data
Visual Analytics in Big DataSaurabh Shanbhag
 
Write a Mission Statement 1. What are your most important .docx
Write a Mission Statement 1. What are your most important .docxWrite a Mission Statement 1. What are your most important .docx
Write a Mission Statement 1. What are your most important .docxedgar6wallace88877
 
Graphical Analysis
Graphical AnalysisGraphical Analysis
Graphical AnalysisCIToolkit
 
Data Science - Part V - Decision Trees & Random Forests
Data Science - Part V - Decision Trees & Random Forests Data Science - Part V - Decision Trees & Random Forests
Data Science - Part V - Decision Trees & Random Forests Derek Kane
 
Issues in data mining Patterns Online Analytical Processing
Issues in data mining  Patterns Online Analytical ProcessingIssues in data mining  Patterns Online Analytical Processing
Issues in data mining Patterns Online Analytical ProcessingShivarkarSandip
 
Data and Information Visualization Part 1part 1.pptx
Data and Information Visualization Part 1part 1.pptxData and Information Visualization Part 1part 1.pptx
Data and Information Visualization Part 1part 1.pptxLamees EL- Ghazoly
 
KIT-601 Lecture Notes-UNIT-2.pdf
KIT-601 Lecture Notes-UNIT-2.pdfKIT-601 Lecture Notes-UNIT-2.pdf
KIT-601 Lecture Notes-UNIT-2.pdfDr. Radhey Shyam
 
Data Science & AI Road Map by Python & Computer science tutor in Malaysia
Data Science  & AI Road Map by Python & Computer science tutor in MalaysiaData Science  & AI Road Map by Python & Computer science tutor in Malaysia
Data Science & AI Road Map by Python & Computer science tutor in MalaysiaAhmed Elmalla
 
Introduction to Data Science.pptx
Introduction to Data Science.pptxIntroduction to Data Science.pptx
Introduction to Data Science.pptxVrishit Saraswat
 
Introduction to Data Science.pptx
Introduction to Data Science.pptxIntroduction to Data Science.pptx
Introduction to Data Science.pptxDr.Shweta
 
Data Mining StepsProblem Definition Market AnalysisC
Data Mining StepsProblem Definition Market AnalysisCData Mining StepsProblem Definition Market AnalysisC
Data Mining StepsProblem Definition Market AnalysisCsharondabriggs
 
Data analysis.pptx
Data analysis.pptxData analysis.pptx
Data analysis.pptxMDPiasKhan
 
B409 W11 Sas Collaborative Stats Guide V4.2
B409 W11 Sas Collaborative Stats Guide V4.2B409 W11 Sas Collaborative Stats Guide V4.2
B409 W11 Sas Collaborative Stats Guide V4.2marshalkalra
 

Similar to QQ Plot.pptx (20)

Research methodology-Research Report
Research methodology-Research ReportResearch methodology-Research Report
Research methodology-Research Report
 
Research Methodology-Data Processing
Research Methodology-Data ProcessingResearch Methodology-Data Processing
Research Methodology-Data Processing
 
Cancer genomics first look
Cancer genomics first lookCancer genomics first look
Cancer genomics first look
 
Descriptive Statistics
Descriptive StatisticsDescriptive Statistics
Descriptive Statistics
 
Visual Analytics in Big Data
Visual Analytics in Big DataVisual Analytics in Big Data
Visual Analytics in Big Data
 
Write a Mission Statement 1. What are your most important .docx
Write a Mission Statement 1. What are your most important .docxWrite a Mission Statement 1. What are your most important .docx
Write a Mission Statement 1. What are your most important .docx
 
Predictive modeling
Predictive modelingPredictive modeling
Predictive modeling
 
Graphical Analysis
Graphical AnalysisGraphical Analysis
Graphical Analysis
 
Data Science - Part V - Decision Trees & Random Forests
Data Science - Part V - Decision Trees & Random Forests Data Science - Part V - Decision Trees & Random Forests
Data Science - Part V - Decision Trees & Random Forests
 
Issues in data mining Patterns Online Analytical Processing
Issues in data mining  Patterns Online Analytical ProcessingIssues in data mining  Patterns Online Analytical Processing
Issues in data mining Patterns Online Analytical Processing
 
Top Machine Learning Algorithms Used By AI Professionals ARTiBA.pdf
Top Machine Learning Algorithms Used By AI Professionals ARTiBA.pdfTop Machine Learning Algorithms Used By AI Professionals ARTiBA.pdf
Top Machine Learning Algorithms Used By AI Professionals ARTiBA.pdf
 
Data and Information Visualization Part 1part 1.pptx
Data and Information Visualization Part 1part 1.pptxData and Information Visualization Part 1part 1.pptx
Data and Information Visualization Part 1part 1.pptx
 
KIT-601 Lecture Notes-UNIT-2.pdf
KIT-601 Lecture Notes-UNIT-2.pdfKIT-601 Lecture Notes-UNIT-2.pdf
KIT-601 Lecture Notes-UNIT-2.pdf
 
Data Science & AI Road Map by Python & Computer science tutor in Malaysia
Data Science  & AI Road Map by Python & Computer science tutor in MalaysiaData Science  & AI Road Map by Python & Computer science tutor in Malaysia
Data Science & AI Road Map by Python & Computer science tutor in Malaysia
 
Introduction to Data Science.pptx
Introduction to Data Science.pptxIntroduction to Data Science.pptx
Introduction to Data Science.pptx
 
Introduction to Data Science.pptx
Introduction to Data Science.pptxIntroduction to Data Science.pptx
Introduction to Data Science.pptx
 
Data Mining StepsProblem Definition Market AnalysisC
Data Mining StepsProblem Definition Market AnalysisCData Mining StepsProblem Definition Market AnalysisC
Data Mining StepsProblem Definition Market AnalysisC
 
unit 1.pptx
unit 1.pptxunit 1.pptx
unit 1.pptx
 
Data analysis.pptx
Data analysis.pptxData analysis.pptx
Data analysis.pptx
 
B409 W11 Sas Collaborative Stats Guide V4.2
B409 W11 Sas Collaborative Stats Guide V4.2B409 W11 Sas Collaborative Stats Guide V4.2
B409 W11 Sas Collaborative Stats Guide V4.2
 

More from Rahul Borate

PigHive presentation and hive impor.pptx
PigHive presentation and hive impor.pptxPigHive presentation and hive impor.pptx
PigHive presentation and hive impor.pptxRahul Borate
 
Unit 4_Introduction to Server Farms.pptx
Unit 4_Introduction to Server Farms.pptxUnit 4_Introduction to Server Farms.pptx
Unit 4_Introduction to Server Farms.pptxRahul Borate
 
Unit 3_Data Center Design in storage.pptx
Unit  3_Data Center Design in storage.pptxUnit  3_Data Center Design in storage.pptx
Unit 3_Data Center Design in storage.pptxRahul Borate
 
Fundamentals of storage Unit III Backup and Recovery.ppt
Fundamentals of storage Unit III Backup and Recovery.pptFundamentals of storage Unit III Backup and Recovery.ppt
Fundamentals of storage Unit III Backup and Recovery.pptRahul Borate
 
UNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxUNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxRahul Borate
 
Confusion Matrix.pptx
Confusion Matrix.pptxConfusion Matrix.pptx
Confusion Matrix.pptxRahul Borate
 
Unit 4 SVM and AVR.ppt
Unit 4 SVM and AVR.pptUnit 4 SVM and AVR.ppt
Unit 4 SVM and AVR.pptRahul Borate
 
Unit I Fundamentals of Cloud Computing.pptx
Unit I Fundamentals of Cloud Computing.pptxUnit I Fundamentals of Cloud Computing.pptx
Unit I Fundamentals of Cloud Computing.pptxRahul Borate
 
Unit II Cloud Delivery Models.pptx
Unit II Cloud Delivery Models.pptxUnit II Cloud Delivery Models.pptx
Unit II Cloud Delivery Models.pptxRahul Borate
 
Module III MachineLearningSparkML.pptx
Module III MachineLearningSparkML.pptxModule III MachineLearningSparkML.pptx
Module III MachineLearningSparkML.pptxRahul Borate
 
Unit II Real Time Data Processing tools.pptx
Unit II Real Time Data Processing tools.pptxUnit II Real Time Data Processing tools.pptx
Unit II Real Time Data Processing tools.pptxRahul Borate
 
2.2 Logit and Probit.pptx
2.2 Logit and Probit.pptx2.2 Logit and Probit.pptx
2.2 Logit and Probit.pptxRahul Borate
 
UNIT I Streaming Data & Architectures.pptx
UNIT I Streaming Data & Architectures.pptxUNIT I Streaming Data & Architectures.pptx
UNIT I Streaming Data & Architectures.pptxRahul Borate
 
UNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxUNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxRahul Borate
 
Practice_Exercises_Files_and_Exceptions.pptx
Practice_Exercises_Files_and_Exceptions.pptxPractice_Exercises_Files_and_Exceptions.pptx
Practice_Exercises_Files_and_Exceptions.pptxRahul Borate
 
Practice_Exercises_Data_Structures.pptx
Practice_Exercises_Data_Structures.pptxPractice_Exercises_Data_Structures.pptx
Practice_Exercises_Data_Structures.pptxRahul Borate
 
Practice_Exercises_Control_Flow.pptx
Practice_Exercises_Control_Flow.pptxPractice_Exercises_Control_Flow.pptx
Practice_Exercises_Control_Flow.pptxRahul Borate
 

More from Rahul Borate (20)

PigHive presentation and hive impor.pptx
PigHive presentation and hive impor.pptxPigHive presentation and hive impor.pptx
PigHive presentation and hive impor.pptx
 
Unit 4_Introduction to Server Farms.pptx
Unit 4_Introduction to Server Farms.pptxUnit 4_Introduction to Server Farms.pptx
Unit 4_Introduction to Server Farms.pptx
 
Unit 3_Data Center Design in storage.pptx
Unit  3_Data Center Design in storage.pptxUnit  3_Data Center Design in storage.pptx
Unit 3_Data Center Design in storage.pptx
 
Fundamentals of storage Unit III Backup and Recovery.ppt
Fundamentals of storage Unit III Backup and Recovery.pptFundamentals of storage Unit III Backup and Recovery.ppt
Fundamentals of storage Unit III Backup and Recovery.ppt
 
UNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxUNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptx
 
Confusion Matrix.pptx
Confusion Matrix.pptxConfusion Matrix.pptx
Confusion Matrix.pptx
 
Unit 4 SVM and AVR.ppt
Unit 4 SVM and AVR.pptUnit 4 SVM and AVR.ppt
Unit 4 SVM and AVR.ppt
 
Unit I Fundamentals of Cloud Computing.pptx
Unit I Fundamentals of Cloud Computing.pptxUnit I Fundamentals of Cloud Computing.pptx
Unit I Fundamentals of Cloud Computing.pptx
 
Unit II Cloud Delivery Models.pptx
Unit II Cloud Delivery Models.pptxUnit II Cloud Delivery Models.pptx
Unit II Cloud Delivery Models.pptx
 
EDA.pptx
EDA.pptxEDA.pptx
EDA.pptx
 
Module III MachineLearningSparkML.pptx
Module III MachineLearningSparkML.pptxModule III MachineLearningSparkML.pptx
Module III MachineLearningSparkML.pptx
 
Unit II Real Time Data Processing tools.pptx
Unit II Real Time Data Processing tools.pptxUnit II Real Time Data Processing tools.pptx
Unit II Real Time Data Processing tools.pptx
 
2.2 Logit and Probit.pptx
2.2 Logit and Probit.pptx2.2 Logit and Probit.pptx
2.2 Logit and Probit.pptx
 
UNIT I Streaming Data & Architectures.pptx
UNIT I Streaming Data & Architectures.pptxUNIT I Streaming Data & Architectures.pptx
UNIT I Streaming Data & Architectures.pptx
 
UNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxUNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptx
 
Practice_Exercises_Files_and_Exceptions.pptx
Practice_Exercises_Files_and_Exceptions.pptxPractice_Exercises_Files_and_Exceptions.pptx
Practice_Exercises_Files_and_Exceptions.pptx
 
Practice_Exercises_Data_Structures.pptx
Practice_Exercises_Data_Structures.pptxPractice_Exercises_Data_Structures.pptx
Practice_Exercises_Data_Structures.pptx
 
Practice_Exercises_Control_Flow.pptx
Practice_Exercises_Control_Flow.pptxPractice_Exercises_Control_Flow.pptx
Practice_Exercises_Control_Flow.pptx
 
blog creation.pdf
blog creation.pdfblog creation.pdf
blog creation.pdf
 
Chapter I.pptx
Chapter I.pptxChapter I.pptx
Chapter I.pptx
 

Recently uploaded

CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfAsst.prof M.Gokilavani
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSKurinjimalarL3
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxwendy cai
 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSCAESB
 
chaitra-1.pptx fake news detection using machine learning
chaitra-1.pptx  fake news detection using machine learningchaitra-1.pptx  fake news detection using machine learning
chaitra-1.pptx fake news detection using machine learningmisbanausheenparvam
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxDeepakSakkari2
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝soniya singh
 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxvipinkmenon1
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...VICTOR MAESTRE RAMIREZ
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidNikhilNagaraju
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVRajaP95
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile servicerehmti665
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escortsranjana rawat
 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxPoojaBan
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.eptoze12
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionDr.Costas Sachpazis
 

Recently uploaded (20)

CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptx
 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentation
 
chaitra-1.pptx fake news detection using machine learning
chaitra-1.pptx  fake news detection using machine learningchaitra-1.pptx  fake news detection using machine learning
chaitra-1.pptx fake news detection using machine learning
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
 
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Serviceyoung call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptx
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfid
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptx
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
 

QQ Plot.pptx

  • 2. Introduction Engineers and scientists work with data. Without data, they are not able to draw any conclusion. Now is the era of creation of data everyday from every aspects of our lives. Some data are random and some are biased. Some may suffer from bias because of the data collection process. One very important aspect of data is the distribution profile. The collected data may have normal distribution or may be far from normal. It may also be skewed one either side or may follow multimodal pattern. It may be discrete or may be continuous. For continuous data, normal distribution brings a whole lot of advantages compared to its counterparts. Various inferential statistical process assume that the distribution is normal. A bell-shaped curve is easy to describe with mean and standard deviation.
  • 4. Why Q-Q plot? Since normal distribution is of so much importance, we need to check if the collected data is normal or not. Here, we will demonstrate the Q-Q plot to check the normality of skewness of data. Q stands for quantile and therefore, Q-Q plot represents quantile-quantile plot. To determine the normality, there are also several statistical tests out there such as the Kolmogorov–Smirnov test and the Shapiro– Wilk test. However, it is difficult to do so by looking at the table.
  • 5. Brief explanation? We now know Q-Q plot is quantile-quantile plot but what is quantile at the first place? When the whole data is sorted, 50th quantile means 50% of the data falls below that point and 50% of the data falls above that point. That is the median point. When we say 1st quantile, only 1% of the data falls below that point and 99% is above that. 25th and 75th quantile points are also known as quartiles. There are three quartiles is the dataset. Q1 = first quartile = 25th quantile Q2 = second quartile = 50th quantile = median Q3 = third quartile = 75th quantile
  • 6. Brief explanation? Quantile are sometimes called percentile. A typical Q-Q plot is sown below. Let’s explain this plot which seems pretty much a straight line. Axes The x-axis of a Q-Q plot represents the quantiles of standard normal distribution. Let’s say we have a normal data and we want to standardize it. Standardizing means subtracting mean from each data point and dividing it by standard deviation. The resultant is also known as z-score. Let’s sort those z-scores and plot again. The plot below shows that the x-axis is now centered at 0 and extended up to 3
  • 7. Network Graph: Motivation Wouldn’t it be nice if you could visualize their connections using an interactive network graph like below?
  • 8. Network Graph: Pyvis What is Pyvis? Pyvis is a Python library that allows you to create interactive network graphs in a few lines of code. To install pyvis, type: pip install pyvis
  • 9. What is a Tree map? A tree map is a chart that visualizes categorical, preferably hierarchical, data as a set of nested rectangles. In hierarchical data, the categories or items share parent-child relationships in an overall tree structure. The simplest example of this type of data structure is a company, where all individuals and their designations within teams can be grouped under one entity, i.e., the company itself.
  • 10. When to use a Tree map? These are some key points to consider before using tree maps for visualization. Tree maps work well when there is a clear ‘part-to-whole’ relationship among multiple categories present in the data. Hierarchical data is needed, meaning that the data can be arranged in branches and sub-branches. The focus is not on precise comparisons between categories but rather on spotting key factors, trends or patterns.
  • 11. Benefits of using a Tree map Space constraint: A large amount of hierarchical data can be visualized in a small space. Easier to read: Compared to a circular multi-level pie chart, the tree map is easier to read due to its linear visual appearance. Quickly spot patterns: Since each group is represented by a rectangle whose area is always proportional to its value, trends and patterns (similarities and anomalies) are quickly visible in tree maps.
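To make the "area proportional to value" property concrete, here is a minimal single-level layout sketch. The function `slice_layout` and the complaint counts are hypothetical and written only for this illustration; real tree map libraries use more elaborate layouts and handle nested levels.

```python
def slice_layout(values, width, height):
    """Split a width x height rectangle into vertical strips, one per value.

    Returns {label: (x, y, w, h)}; each strip's area is proportional
    to its value. This is the simplest possible scheme, not the layout
    used by any particular charting library.
    """
    total = sum(values.values())
    rects, x = {}, 0.0
    # Largest first, mirroring the ordering by area within a parent node.
    for label, value in sorted(values.items(), key=lambda kv: -kv[1]):
        w = width * value / total
        rects[label] = (x, 0.0, w, height)
        x += w
    return rects

# Hypothetical complaint counts per region.
complaints = {"North": 40, "South": 30, "East": 20, "West": 10}
rects = slice_layout(complaints, width=100, height=50)

for label, (x, y, w, h) in rects.items():
    print(f"{label}: x={x:.0f}, width={w:.0f}, area={w * h:.0f}")
```

Because every strip has the same height, a region with twice the value gets a strip twice as wide, and therefore twice the area.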
  • 12. Real-world use cases for Tree map Charts? 1. Displaying region-wise customer complaints about a product Suppose there are 10 different types of complaints about a product (denoted C1 to C10) and the company wants to visualize which complaints are relevant to each region. In such a case a tree map can be used. Here, it can be clearly seen how different regions have specific types of user complaints.
  • 13. Real-world use cases for Tree map Charts? 2. Showcasing category-wise product availability like mobile phones Let us assume that there are four categories of mobile phones with their market share percentages, i.e., Low-end (up to 10,000 INR, 15%), Mid-Range (10,000 to 25,000 INR, 55%), Premium (above 25,000 to 50,000 INR, 25%), and Top-end (above 50,000 INR, 10%). From this tree map, we can gauge that there is a bigger demand and market for Mid-Range phones, while only a limited number of phones are available in the Top-end category.
  • 14. Real-world use cases for Tree map Charts? 3. Explore customer segmentation for a product Companies selling apparel or personal products usually segment their customers by age. This way they can categorize their products and product variants separately for each age group. In the case of this tree map, the company could decide whether to launch more products for particular customer segments based on the distribution.
  • 15. Challenges associated with a Tree map Tree maps also come with a set of limitations, as outlined below. – Tree maps built with a large number of data points on a single level can be hard to read and to print for reporting purposes. – Sometimes additional sorting might be required to understand the data better. However, all the rectangles are automatically ordered by area within the parent node. – With too many categories and colors to represent them, the tree map becomes overwhelming for the reader. – Tree maps become ineffective for datasets with balanced trees, i.e., when items are of similar value. In these cases, the main purpose of a tree map, highlighting the largest item in a given category, becomes impossible.
  • 16. Dendrograms in Python A dendrogram is a diagram that depicts a tree. Plotly's create_dendrogram figure factory performs hierarchical clustering on data and depicts the resulting tree. Distances between clusters are represented by the values on the tree depth axis. Dendrogram plots are often used in computational biology to depict the grouping of genes or samples, occasionally in the margins of heatmaps. Hierarchical clustering produces dendrograms as an output. Many people claim that dendrograms of this type may be used to determine the number of clusters.
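The sketch below shows where the heights on a dendrogram's depth axis come from. It is a hand-written single-linkage clustering in plain Python, not the create_dendrogram figure factory itself; the function name `single_linkage` and the five points are assumptions made for the illustration, and other linkage rules would give different heights.

```python
from math import dist

def single_linkage(points):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, distance)."""
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        # Replace the two merged clusters with their union.
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return merges

# Hypothetical 2-D observations: two tight pairs and one outlier.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
merges = single_linkage(points)

for left, right, height in merges:
    print(f"merge {left} + {right} at distance {height:.2f}")
```

Each merge distance is the height at which two branches join in the dendrogram. A large jump between consecutive heights is what people look for when reading a number of clusters off the plot.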
  • 17. Dendrograms in Python Wholesale Customer Segmentation Problem using Hierarchical Clustering We will be working on a wholesale customer segmentation problem. You can download the dataset using this link. The data is hosted on the UCI Machine Learning repository. The aim of this problem is to segment the clients of a wholesale distributor based on their annual spending on diverse product categories, like milk, grocery, region, etc.
  • 18. Scree Plot in Python How to Create a Scree Plot in Python Principal components analysis (PCA) is an unsupervised machine learning technique that finds principal components (linear combinations of the predictor variables) that explain a large portion of the variation in a dataset. When we perform PCA, we’re interested in understanding what percentage of the total variation in the dataset can be explained by each principal component. One of the easiest ways to visualize the percentage of variation explained by each principal component is to create a scree plot.
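As a rough sketch of the numbers behind a scree plot, the example below computes the share of variance explained by each principal component with NumPy. The random data and the variable names are assumptions for illustration only; a PCA implementation from a machine learning library would report the same ratios. The variance of each principal component equals an eigenvalue of the covariance matrix, so the ratios are the eigenvalues divided by their sum.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 200 rows, 4 predictors with very different spreads.
X = rng.normal(size=(200, 4)) * np.array([5.0, 2.0, 1.0, 0.5])

# Eigenvalues of the covariance matrix are the component variances.
# eigvalsh returns them in ascending order, so reverse to get largest first.
eigenvalues = np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]
explained_ratio = eigenvalues / eigenvalues.sum()

# A scree plot draws these ratios against the component number,
# for example with matplotlib:
#   plt.plot(range(1, len(explained_ratio) + 1), explained_ratio, "o-")
for k, ratio in enumerate(explained_ratio, start=1):
    print(f"PC{k}: {ratio:.1%} of total variance")
```

The point where the curve flattens out (the "elbow") is commonly used to decide how many components to keep.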