GGPLOT IN PYTHON
-By Sarah Masud
INTRODUCTION:
This presentation covers the basic functions of ggplot in Python. It is meant to help
beginners understand what the functions mean and how to use them.
SELF HELP: If you don't remember a function or wish to know more about it, you can bring up its
documentation in an IPython or Jupyter session by typing the function name followed by a ?. In the plain Python interpreter, call help() on the function instead.
EXAMPLE:
INPUT:
OUTPUT:
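A minimal sketch of the self-help tip, using only the standard library. The built-in len is just a stand-in for whichever function you want to look up; the name? form itself only works inside IPython or Jupyter.

```python
import inspect

# Stand-in target: any function or class works here, e.g. a ggplot geom.
target = len

# inspect.getdoc returns the cleaned-up docstring as a plain string.
doc_text = inspect.getdoc(target)
print(doc_text)

# help(target) prints the same information in a longer, formatted form.
```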
PREREQUISITES INSTALLED:
● pip
● easy_install
● matplotlib
● pandas
● numpy
● scipy
● statsmodels
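One way to confirm that the prerequisite libraries are present before installing ggplot is sketched below. It only checks the importable packages from the list above; pip and easy_install are command-line tools and are left out.

```python
import importlib.util

# Import names of the prerequisite packages listed above.
required = ["matplotlib", "pandas", "numpy", "scipy", "statsmodels"]

# find_spec returns None when a top-level package cannot be found.
missing = [name for name in required if importlib.util.find_spec(name) is None]

if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All prerequisites are installed.")
```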
INSTALL ggplot UNDER PYTHON:
FOR LINUX
Method 1: pip install ggplot
Method 2: pip install git+https://github.com/yhat/ggplot.git
SOURCE OF DATA:
The Big Diamonds data set is used throughout the presentation. You can
download the data set from:
https://github.com/SolomonMg/diamonds-data
HOW TO OBTAIN THE CSV:
METHOD 1: Read the RDA file in R and write it back out as CSV.
METHOD 2: Use rpy2.
WHAT IS ggplot?
The original ggplot2 was created by Hadley Wickham and provides an easy interface for generating state-of-the-art visualizations.
Originally written for R, its success led to a port for Python as well.
COMPONENTS OF ggplot:
● ggplot API- Used to implement the plots.
● Data- Takes data as data frames, as in pandas.
● Aesthetics- How columns of the data map to visual properties such as the axes, color and size.
● Layer- What information is drawn on top of the basic plot.
TIME TAKEN TO EXECUTE A FUNCTION:
Source: http://stackoverflow.com/questions/6786990/find-out-time-it-took-for-a-python-script-to-complete-execution
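The slide's own timing code is not in this transcript, so the following is only a sketch of one common approach: record a clock reading before and after the call. The helper name timed is hypothetical.

```python
import time

def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed

# Example: time the built-in sum over one million integers.
total, seconds = timed(sum, range(1_000_000))
print("sum =", total, "took", round(seconds, 4), "seconds")
```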
INPUT THE DATA:
Import necessary packages.
Read Data:
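A sketch of the import and read steps, assuming the data has already been exported to CSV. The rows below are made-up placeholders so the snippet runs on its own, and the file name diamonds.csv in the comment is hypothetical.

```python
import io
import pandas as pd

# Placeholder rows for illustration only; these are not real records from the data set.
sample_csv = io.StringIO(
    "carat,cut,color,clarity,price\n"
    "0.30,Ideal,E,VS1,600\n"
    "0.70,Good,H,SI1,2100\n"
    "1.01,V.Good,G,VS2,5500\n"
)

# With the real file you would pass its path instead, e.g. pd.read_csv("diamonds.csv").
diamonds = pd.read_csv(sample_csv)
print(diamonds.shape)
```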
EXPLORE THE DATA:
1. len()- Number of rows in the data set.
2. columns- The names of the columns.
WHAT DO THE COLUMNS CONTAIN:
1. Carat- Weight of the diamond (1 carat=0.2g)
2. Cut- Quality of cut
3. Color- Color of diamond (J-worst D-best)
4. Clarity- A measure of how clear the diamond is.
5. Cert- The level of certification granted.
6. x- Length in mm.
7. y- Breadth in mm.
8. z- Height in mm.
9. Measurement- Volume in terms of x*y*z.
10. Table- Width of top of diamond relative to widest point.
11. Depth- Numerically = (2*z) /(x+y)
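The derived columns follow directly from the formulas above. Here is a small worked example; the dimensions and weight are hypothetical values chosen only for illustration.

```python
# Hypothetical stone: dimensions in mm and weight in carats.
x, y, z = 6.0, 6.0, 3.6
carat = 0.8

volume = x * y * z            # "measurement": x*y*z, in cubic mm
depth = (2 * z) / (x + y)     # depth as defined in the column list
weight_grams = carat * 0.2    # 1 carat = 0.2 g

print(volume, depth, weight_grams)
```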
3. head()/tail()- Show the first few and last few rows, respectively.
NOTE: The data set contains both quantitative (numeric) and qualitative fields.
4. Random selection- View rows picked at random.
5. describe()- Gives summary statistics for the fields with numerical values.
NOTE: The means of x and y are approximately the same. Do diamonds have proportionate length and breadth?
7. unique()- Lists the distinct values (one or many) that appear in a column.
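The exploration steps can be sketched with pandas as follows. The small data frame holds placeholder rows, not real records, and sample() is one possible way to do the random selection (the slide's own method is not in the transcript).

```python
import pandas as pd

# Placeholder rows for illustration only; not real records from the data set.
diamonds = pd.DataFrame({
    "carat": [0.30, 0.70, 1.01, 0.52],
    "cut": ["Ideal", "Good", "Ideal", "V.Good"],
    "x": [4.3, 5.7, 6.4, 5.1],
    "y": [4.3, 5.6, 6.5, 5.2],
    "price": [600, 2100, 5500, 1500],
})

print(len(diamonds))                        # number of rows
print(list(diamonds.columns))               # column names
print(diamonds.head(2))                     # first rows
print(diamonds.tail(2))                     # last rows
print(diamonds.sample(2, random_state=0))   # rows picked at random
print(diamonds.describe())                  # summary statistics of numeric columns
print(diamonds["cut"].unique())             # distinct values in one column
```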
PREPARING DATA:
1. Check for null values.
2. Check for zero price values.
3. Obtain a clean data set by removing null values.
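A sketch of the three preparation steps with pandas, run on placeholder values. Note that dropna() removes only the null rows; a row with a zero price stays unless it is filtered out separately.

```python
import numpy as np
import pandas as pd

# Placeholder values for illustration only: one missing price and one zero price.
diamonds = pd.DataFrame({
    "carat": [0.30, 0.70, 1.01, 0.52],
    "price": [600.0, np.nan, 5500.0, 0.0],
})

null_counts = diamonds.isnull().sum()                # 1. nulls per column
zero_prices = int((diamonds["price"] == 0).sum())    # 2. rows with a zero price
clean = diamonds.dropna()                            # 3. drop rows containing nulls

print(null_counts)
print(zero_prices)
print(len(clean))
```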
EVALUATION OF DATA:
1. New Statistical Information:
2. Correlations
3. Plot of the density of diamonds
NOTE:
stats.linregress (from scipy) calculates the components of the line of best fit of the form y=mx+c, where m=slope and c=y-intercept. The r_value is the correlation coefficient, the p_value is the p-value of a test whose null hypothesis is that the slope is zero (a value near zero means the slope is statistically significant), while std_err is the standard error of the estimated slope.
ggplot(aes(x, y), data=dataset)- Gives us a blank coordinate system.
geom_point- Plots the data set on the blank plot.
scale_x_continuous/scale_y_continuous- Used to give the name and range of an axis.
geom_abline- Draws a line of the form y=mx+c.
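The line-of-best-fit step can be tried on its own with scipy. This sketch assumes the density plot regresses weight (carat) on volume, which the transcript does not state; the numbers are hypothetical and lie on the line carat = 0.006 * volume.

```python
from scipy import stats

# Hypothetical volume (cubic mm) and weight (carat) values, for illustration only.
volume = [50.0, 100.0, 150.0, 200.0]
carat = [0.3, 0.6, 0.9, 1.2]

# linregress returns slope m, intercept c, correlation coefficient, p-value and standard error.
slope, intercept, r_value, p_value, std_err = stats.linregress(volume, carat)
print(slope, intercept, r_value)
```

The slope and intercept obtained this way are the values that would be handed to geom_abline.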
OUTPUT:
HOW IT WORKS:
1. ggplot is invoked.
2. A blank coordinate system with labeled axes is set up.
3. The points are plotted.
4. The axes are redefined and cropped.
5. The line is drawn as another layer on top of the points.
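The five steps above might be written as one layered expression, roughly as below. This is a sketch that assumes the yhat ggplot package; the column names, axis labels, limits and the helper name density_plot are hypothetical, and argument names such as limits may differ between releases.

```python
import pandas as pd

def density_plot(df, slope, intercept):
    """Build the layered plot; requires the yhat ggplot package."""
    # Imported lazily so the rest of the snippet runs without ggplot installed.
    from ggplot import (ggplot, aes, geom_point, geom_abline,
                        scale_x_continuous, scale_y_continuous)
    return (
        ggplot(aes(x="volume", y="carat"), data=df)            # steps 1-2: coordinate system
        + geom_point()                                         # step 3: points
        + scale_x_continuous(name="Volume", limits=(0, 300))   # step 4: name and crop axes
        + scale_y_continuous(name="Carat", limits=(0, 2))
        + geom_abline(slope=slope, intercept=intercept)        # step 5: line on top
    )

# Placeholder values for illustration only.
df = pd.DataFrame({"volume": [50.0, 100.0, 150.0, 200.0],
                   "carat": [0.3, 0.6, 0.9, 1.2]})

# Usage, once ggplot is installed: print(density_plot(df, slope=0.006, intercept=0.0))
```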
PRICE EVALUATION:
Diamonds are expensive!
Let us try to map what factors make them so.
PRICE VS BREADTH
NOTE:
labs- Used to label the graph and the axes. xlab and ylab can also be used separately.
stat_smooth provides a mechanism to plot the line of regression and helps determine the relation among the variables.
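A sketch of a price-against-breadth plot with labels and a regression line, again assuming the yhat ggplot package. The helper name, the label text and the choice of method="lm" are assumptions, not taken from the slides.

```python
def price_vs_breadth(df):
    """Scatter plot of price against breadth (column y) with a fitted line."""
    # Imported lazily so defining the function does not require ggplot.
    from ggplot import ggplot, aes, geom_point, stat_smooth, labs
    return (
        ggplot(aes(x="y", y="price"), data=df)
        + geom_point()
        + stat_smooth(method="lm")    # linear regression line
        + labs(x="Breadth (mm)", y="Price", title="Price vs Breadth")
    )

# Usage, once ggplot is installed and diamonds is loaded: print(price_vs_breadth(diamonds))
```

The same function works for length or height by swapping the x column for x or z.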
PRICE VS LENGTH
PRICE VS HEIGHT
PRICE VS VOLUME
NOTE:
geom_jitter- Adds a small amount of random noise to the position of each point.
Over-plotting hides the number of points in each neighbourhood. We can also reduce this problem by making the points more transparent.
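Transparency is set through the alpha argument of the point layer. A sketch assuming the yhat ggplot package; the column name volume, the alpha value and the helper name are hypothetical.

```python
def price_vs_volume(df, alpha=0.2):
    """Scatter plot with semi-transparent points to reveal dense regions."""
    # Imported lazily so defining the function does not require ggplot.
    from ggplot import ggplot, aes, geom_point
    # Lower alpha means more transparent points; overlapping points appear darker.
    return ggplot(aes(x="volume", y="price"), data=df) + geom_point(alpha=alpha)

# Usage, once ggplot is installed and diamonds is loaded: print(price_vs_volume(diamonds))
```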
PRICE VS TABLE
NOTE:
A horizontal line of regression means that the value of f(x) can be calculated without much consideration of the value of x.
Thus, price is not considerably affected by table and can be estimated without taking table into account.
PRICE VS DEPTH
PRICE VS CARATS
NOTE:
A quadratic curve of regression signifies that the price depends on the value of carat. But is it only carat? Let's look more closely.
PRICE vs CARAT
NOTE:
Since the original price vs carat graph was not detailed enough, we narrow the scale down to a particular section. In the next slide we narrow it down further.
NOTE:
We see that the plot becomes vertical, i.e. for the same value of carat we have varying prices. Surely some other factor is controlling it.
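The same observation can be checked numerically by grouping on carat and looking at the spread of prices within each group. This is a sketch with pandas on placeholder values, not on the real data.

```python
import pandas as pd

# Placeholder rows for illustration only; not real records from the data set.
diamonds = pd.DataFrame({
    "carat": [1.0, 1.0, 1.0, 1.5, 1.5],
    "cut": ["Good", "V.Good", "Ideal", "Good", "Ideal"],
    "price": [4000, 5200, 6100, 9000, 11500],
})

# Minimum and maximum price for each carat value, and the gap between them.
spread = diamonds.groupby("carat")["price"].agg(["min", "max"])
spread["range"] = spread["max"] - spread["min"]
print(spread)
```

A large range at a single carat value is what shows up as a vertical streak in the plot.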
NOTE:
This plots the price with respect to the cut. We see that, for a given carat value, the quality of the cut changes the price.
Differentiate price vs carat with respect to cut.
Differentiate price vs carat with respect to color.
Differentiate price vs carat with respect to clarity.
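One way to differentiate the points by a factor is to map that factor to the color aesthetic. A sketch assuming the yhat ggplot package; the helper name price_vs_carat_by is hypothetical.

```python
def price_vs_carat_by(df, factor):
    """Price vs carat scatter plot, colored by a qualitative column."""
    # Imported lazily so defining the function does not require ggplot.
    from ggplot import ggplot, aes, geom_point
    # factor is a column name such as "cut", "color" or "clarity".
    return ggplot(aes(x="carat", y="price", color=factor), data=df) + geom_point()

# Usage, once ggplot is installed and diamonds is loaded:
# print(price_vs_carat_by(diamonds, "cut"))
```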
NOTE:
Facets- Show the same set of data split with respect to a given factor. This helps us determine which value of the factor affects f(x) the most.
FACETS
FACETS
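Faceting draws one panel per value of the chosen factor. A sketch assuming the yhat ggplot package and its facet_wrap function; the helper name is hypothetical.

```python
def price_vs_carat_faceted(df, factor):
    """Price vs carat scatter plot, one panel per value of a qualitative column."""
    # Imported lazily so defining the function does not require ggplot.
    from ggplot import ggplot, aes, geom_point, facet_wrap
    # factor is a column name such as "cut", "color" or "clarity".
    return ggplot(aes(x="carat", y="price"), data=df) + geom_point() + facet_wrap(factor)

# Usage, once ggplot is installed and diamonds is loaded:
# print(price_vs_carat_faceted(diamonds, "clarity"))
```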
This presentation is a part of the larger pool of learning resources provided by
DecisionStats.org
FURTHER SOURCES:
1. https://decisionstats.org
2. https://github.com/SolomonMg/diamonds-data
3. https://themessier.wordpress.com/2015/06/17/ggplot-in-python-part-1
4. http://nbviewer.ipython.org/gist/sara-02/d5a61234ef32e60bddda
5. http://nbviewer.ipython.org/gist/sara-02/d38da4a2023da169ac13
6. https://gist.github.com/sara-02/4eb520fd1b82521e8a11
CITATION:
1. http://ggplot.yhathq.com/
2. https://github.com/yhat/ggplot
FOR QUERIES:
info@decisionstats.org
sarahmasud02@gmail.com
THANK YOU
