SlideShare a Scribd company logo
Data Science
DataVisualization
01 Understanding Data Visualization 02 Base Graphics in R
03 Visualization with GGPLOT2
Agenda
Data Visualization
Data Visualization
Data visualization is basically presentation of data with graphics. It's a way to summarize your findings and display it in a
form that facilitates interpretation and can help in identifying patterns or trends
Importance of Data Visualization
Importance of Data Visualization
A picture is worth a 1000 words. Humans are easily attracted to visuals and mostof us preferto understand a particular
scenario through pictures instead of text
It is faster to recognize a result than to read a paragraph. When done well, visualizations explain complexideas simply.
Oftentimes charts tell the story much faster than a prose descriptionor a table presentation
Importance of Data Visualization
Retailer country
Order method
type
Retailer type Product line Product type Product Year Revenue Quantity Gross margin
United States Telephone Golf Shop
Personal
Accessories
Navigation Trail Master 2012 1095 3 0.34767123
United States Telephone Department Store
Camping
Equipment
Sleeping Bags Hibernator 2012 160103.2 1160 0.3769019
United States Telephone Department Store
Camping
Equipment
Sleeping Bags
Hibernator Self -
Inflating Mat
2012 66514.28 556 0.5440107
United States Telephone Department Store
Camping
Equipment
Sleeping Bags Hibernator Pad 2012 16205.73 411 0.51382196
United States Telephone Department Store
Camping
Equipment
Sleeping Bags Hibernator Pillow 2012 33520.42 2475 0.46298346
United States Fax Outdoors Shop
Camping
Equipment
Cooking Gear
TrailChef Water
Bag
2013 19418.52 3102 0.53194888
United States Fax Outdoors Shop
Camping
Equipment
Cooking Gear TrailChef Cook Set 2013 42304.32 794 0.34365616
United States Fax Outdoors Shop
Camping
Equipment
Cooking Gear
TrailChef Single
Flame
2013 52266.32 824 0.26880025
United States Telephone Department Store
Camping
Equipment
Packs
Canyon Mule
Journey Backpack
2012 235660.36 676 0.38805542
United States Telephone Department Store
Camping
Equipment
Packs
Canyon Mule
Cooler
2012 53822.16 1652 0.50890117
United States Web Golf Shop
Personal
Accessories
Watches Venue 2014 73949 1013 0.42858294
United States Web Golf Shop
Personal
Accessories
Watches Infinity 2014 157890.2 665 0.45986527
United States Web Golf Shop
Personal
Accessories
Watches Lux 2014 67265.2 396 0.48654044
After looking at the data
Is it possible to answer the following questions?
• Which order method type has highest revenue
?
• Which product line has highest quantity ?
Below is the sample data: Sales Products
Importance of Data Visualization
Afterdisplaying the data in the form of graphs . Lets try answering the questions
0
50
100
150
200
250
300
350
400
Camping
Equipment
Golf Equipment Mountaineering
Equipment
Product line
Outdoor
Protection
Personal
Accessories
Quantity
E-mail
Fax
Sales visit Telephone
Web
E-mail
Fax
Sales visit
Telephone
Web
Importance of Data Visualization
Afterdisplaying the data in the form of graphs . Lets try answering the questions
E-mail
Fax
Sales visit
Telephone
Web
E-mail
Fax
Sales visit
Telephone
Web
Web
Which ordermethod type has highest revenue ?
Web ordermethod type has highest revenue
among all the ordermethod type.
Moreover looking at the pie chart you can also
tell that which ordermethod type has lowest
revenue : Email.
Importance of Data Visualization
Afterdisplaying the data in the form of graphs . Lets try answering the questions
Which productline has highest quantity ?
0
50
100
150
300
250
200
350
400
Camping
Equipment
Golf Equipment Mountaineering
Equipment
Product line
Outdoor
Protection
Personal
Accessories
Quantity
From the graph it can be easily seen that
PersonalAccessorieshas highest number
among all the productline.
Base Graphics in R
Base Graphics in R
Base Graphics helps in making simple graphs
Bar Plot
Bar Plot
Bar Plots are suitable for showing comparisonbetweencumulative totals across several groups
Bar Plot
Making a simple Bar-plot
plot(customer_churn$Dependent
s)
Bar Plot
plot(customer_churn$Dependents, col="coral")
Adding color
Bar Plot
plot(customer_churn$Dependents,
col="coral",xlab="Dependents",
main="Distribution of Dependents")
Adding x-axis label & title
Bar Plot
plot(customer_churn$PhoneService,col="aquamarine4"
)
Bar-plot for ‘PhoneService’column
plot(customer_churn$Contract,col="palegreen4",
xlab="Contract",
main="Distribution of Contract")
Bar Plot
Bar-plot for ‘Contract’ column
Histogram
Histogram
Histogram is basically a plot that breaks the data into bins (or breaks) and shows frequencydistribution of these bins
Histogram
Making a simple histogram
hist(customer_churn$tenure)
Histogram
Adding Color
hist(customer_churn$tenure,col="olivedrab")
Histogram
Change the number of bins
hist(customer_churn$tenure,
col="olivedrab",
breaks=30)
Grammar of Graphics
Grammar of Graphics
Every form of communicationneeds to have grammar. Since, visualization is also a form of communication, it needs to
have a foundation of grammar
I am John Am John I
Components of Grammar of Graphics
Element Description
Data The data-set forwhich we wouldwantto plot a graph
Aesthetics The metrics onto whichwe plot our data
Geometry Visual elements to plot the data
Facet Groups by which wedivide the data
Visualization with ggplot2
Visualization with ggplot2
ggplot2 is a system for declaratively creating
graphics, based on the Grammar of Graphics. You
provide the data, tell ggplot2 how to map variables
to aesthetics,what graphical primitives to use, and it
takes care of the details
geom_hist()
geom_hist()
geom_hist()function helps in making histograms with ggplot2
geom_hist()
Build a histogram for ‘tenure’
column
ggplot(data=customer_churn,
aes(x=tenure))+geom_histogram()
geom_hist()
Change the number of bins
ggplot(data = customer_churn,
aes(x=tenure))+geom_histogram(bins = 50)
geom_hist()
Add fill color and bordercolor
ggplot(data = customer_churn, aes(x=tenure))+
geom_histogram(fill="palegreen4",col="green")
geom_hist()
Assign‘Partner’ to fill aesthetic
ggplot(data = customer_churn, aes(x=tenure,
fill=Partner))+geom_histogram()
geom_hist()
Set Position to be ‘identity’
ggplot(data = customer_churn,
aes(x=tenure, fill=Partner))+
geom_histogram(position = "identity")
geom_bar()
geom_bar()
geom_bar()function helps in making bar-plots with ggplot2
geom_bar()
Build a bar-plot for ‘Dependents’column
ggplot(data = customer_churn,
aes(x=Dependents))+geom_bar()
geom_bar()
Add fill color
ggplot(data = customer_churn,
aes(x=Dependents))+geom_bar(fill="chocolate")
geom_bar()
Assigning ‘DeviceProtection’column to
fill aesthetic
ggplot(data = customer_churn,
aes(x=Dependents,fill=DeviceProtection))
+geom_bar()
geom_bar()
Set the positionto ‘dodge’
ggplot(data = customer_churn,
aes(x=Dependents,fill=DeviceProtection))
+geom_bar(position=‘dodge’)
geom_point()
geom_point()
geom_point() function helps in making scatterplots with ggplot2.Ascatter plot helps in understanding how does one
variable change w.r.t another variable. It is used for two continuous values
geom_point()
Scatter-plotbetween ‘TotalCharges’
& ‘tenure’
ggplot(data = customer_churn,
aes(y=TotalCharges,x=tenure))+geom_point(
)
geom_point()
Adding color
ggplot(data = customer_churn,
aes(y=TotalCharges,x=tenure)) +
geom_point(col="slateblue3")
geom_point()
Map ‘Partner’ to col aesthetic
ggplot(data = customer_churn,
aes(y=TotalCharges,x=tenure, col=Partner)) +
geom_point()
geom_point()
Map ‘InternetService’ to col aesthetic
ggplot(data = customer_churn,
aes(y=TotalCharges,x=tenure,
col=InternetService)) +
geom_point()
geom_boxplot()
geom_boxplot()
geom_boxplot() function helps in making boxplots with ggplot2. Box Plot shows 5 statistically significant numbers- the
minimum, the 25th percentile, the median, the 75th percentile and the maximum. It is thus useful for visualizing the spread of
the data and deriving inferences accordingly
geom_boxplot()
Box-plot between ‘MonthlyCharges’ &
‘InternetService’
ggplot(data =
customer_churn,aes(y=MonthlyCharges,x=In
ternetService))+geom_boxplot()
geom_boxplot()
Add fill color
ggplot(data =
customer_churn,aes(y=MonthlyCharges,x=Inte
rnetService))+geom_boxplot(fill="violetred4")
Faceting the data
Faceting the data
facet_grid() is used to facet the data. Faceting is used when the plot is too chaotic and some variables have to be grouped
into different facets to have a better visualization
facet_grid()
Initial Graph
ggplot(data = customer_churn,
aes(x=tenure,fill=InternetService))+
geom_histogram()
After Faceting
ggplot(data = customer_churn,
aes(x=tenure,fill=InternetService))+
geom_histogram()+ facet_grid(~InternetService)
facet_grid()
Initial Graph
ggplot(data = customer_churn,
aes(y=TotalCharges,x=tenure, col=Contract))+
geom_point()+geom_smooth()
After Faceting
ggplot(data = customer_churn,
aes(y=TotalCharges,x=tenure, col=Contract))+
geom_point()+geom_smooth()+facet_grid(~Contract)
Theme Layer
Theme Layer
Theme layeris used to add themes to our plots
Theme Layer
Initial Graph
ggplot(data = customer_churn,aes(x=tenure))+
geom_histogram(fill="tomato3",
col="mediumaquamarine") -> g1
After Adding Title
g1+labs(title = "Distribution of tenure")->g2
Theme Layer
After Adding Title
g1+labs(title = "Distribution of tenure")->g2
After Adding Panel Background
g2+theme(panel.background =
element_rect(fill = "olivedrab3"))->g3
Theme Layer
After Adding Panel Background
g2+theme(panel.background =
element_rect(fill = "olivedrab3"))->g3
After Adding Plot Background
g3+theme(plot.background =
element_rect(fill = "palegreen4"))->g4
Theme Layer
After Adding Plot Background
g3+theme(plot.background =
element_rect(fill = "palegreen4"))->g4
After Changing Title
g4+theme(plot.title = element_text(hjust = 0.5, face="bold",
colour = "peachpuff"))
Quiz
Quiz
Which of these is the correctcode to make a bar-plot for the ‘TechSupport’column. The colorof the
bars should be ‘blue’ & the title of the plot should be ‘Distribution of Tech Support’
1. plot(customer_churn$TechSupport,col="blue", main="Distributionof TechSupport")
2. plot(customer_churn$TechSupport,fill="blue", main="Distribution of Tech Support")
3. plot(customer_churn$TechSupport,col="blue", title="Distributionof Tech Support")
4. plot(customer_churn$TechSupport,color="blue",title="Distributionof TechSupport")
Quiz
Which of these is the correctcode to make a histogram for the ‘tenure’column. The fill color of the bins
should be ‘azure’ & the number of bins should be 87
1. ggplot(data= customer_churn,aes(x=tenure,col='azure'))+geom_histogram(bins=87)
2. ggplot(data = customer_churn,aes(x=tenure))+geom_histogram(col="azure",bins=87)
3. ggplot(data= customer_churn,aes(x=tenure))+geom_histogram(fill="azure",bins=87)
4. ggplot(data= customer_churn,aes(x=tenure,fill='azure'))+geom_histogram(bins=87)
Quiz
Which of these is the correct code to make a bar-plot for the ‘OnlineBackup’ column. The color of the bars
should be determined by the ‘PhoneService’ column
1. ggplot(data= customer_churn,aes(fill=OnlineBackup,x=PhoneService))+geom_bar()
2. ggplot(data= customer_churn,aes(y=OnlineBackup,fill=PhoneService))+geom_bar()
3. ggplot(data= customer_churn,aes(x=OnlineBackup))+geom_bar(fill=PhoneService)
4. ggplot(data= customer_churn,aes(x=OnlineBackup,fill=PhoneService))+geom_bar()
Quiz
To which of these geometries can you add the facet_grid()?
1. geom_bar()
2. geom_histogram()
3. geom_point()
4. All of the above
ThankY
ou

More Related Content

Similar to 4.Data-Visualization.pptx

An introduction to Machine Learning
An introduction to Machine LearningAn introduction to Machine Learning
An introduction to Machine Learning
Julien SIMON
 
The Sum of our Parts: the Complete CARTO Journey [CARTO]
The Sum of our Parts: the Complete CARTO Journey [CARTO]The Sum of our Parts: the Complete CARTO Journey [CARTO]
The Sum of our Parts: the Complete CARTO Journey [CARTO]
CARTO
 
Real-Time Personalized Customer Experiences at Bonobos (RET203) - AWS re:Inve...
Real-Time Personalized Customer Experiences at Bonobos (RET203) - AWS re:Inve...Real-Time Personalized Customer Experiences at Bonobos (RET203) - AWS re:Inve...
Real-Time Personalized Customer Experiences at Bonobos (RET203) - AWS re:Inve...
Amazon Web Services
 
2019 WIA - Data-Driven Product Improvements
2019 WIA - Data-Driven Product Improvements2019 WIA - Data-Driven Product Improvements
2019 WIA - Data-Driven Product Improvements
Women in Analytics Conference
 
Customer Clustering for Retailer Marketing
Customer Clustering for Retailer MarketingCustomer Clustering for Retailer Marketing
Customer Clustering for Retailer Marketing
Jonathan Sedar
 
Implementing a data_science_project (Python Version)_part1
Implementing a data_science_project (Python Version)_part1Implementing a data_science_project (Python Version)_part1
Implementing a data_science_project (Python Version)_part1
Dr Sulaimon Afolabi
 
CARTO en 5 Pasos: del Dato a la Toma de Decisiones [CARTO]
CARTO en 5 Pasos: del Dato a la Toma de Decisiones [CARTO]CARTO en 5 Pasos: del Dato a la Toma de Decisiones [CARTO]
CARTO en 5 Pasos: del Dato a la Toma de Decisiones [CARTO]
CARTO
 
Google Analytics for Beginners - Training
Google Analytics for Beginners - TrainingGoogle Analytics for Beginners - Training
Google Analytics for Beginners - Training
Ruben Vezzoli
 
Ali upload
Ali uploadAli upload
Ali upload
Ali Zahraei, Ph.D
 
The Dynamic Language is not Enough
The Dynamic Language is not EnoughThe Dynamic Language is not Enough
The Dynamic Language is not Enough
Lukas Renggli
 
Visualizing your data in JavaScript
Visualizing your data in JavaScriptVisualizing your data in JavaScript
Visualizing your data in JavaScript
Mandi Cai
 
GraphTour - How to Build Next-Generation Solutions using Graph Databases
GraphTour - How to Build Next-Generation Solutions using Graph DatabasesGraphTour - How to Build Next-Generation Solutions using Graph Databases
GraphTour - How to Build Next-Generation Solutions using Graph Databases
Neo4j
 
Esteem chartgraph
Esteem chartgraphEsteem chartgraph
Esteem chartgraph
Tony Hirst
 
The R of War
The R of WarThe R of War
The R of War
Kevin Davis
 
Customer Clustering For Retail Marketing
Customer Clustering For Retail MarketingCustomer Clustering For Retail Marketing
Customer Clustering For Retail Marketing
Jonathan Sedar
 
Data Warehouses and Multi-Dimensional Data Analysis
Data Warehouses and Multi-Dimensional Data AnalysisData Warehouses and Multi-Dimensional Data Analysis
Data Warehouses and Multi-Dimensional Data Analysis
Raimonds Simanovskis
 
Introducing Amazon Machine Learning
Introducing Amazon Machine LearningIntroducing Amazon Machine Learning
Introducing Amazon Machine Learning
Amazon Web Services
 
Visual, Interactive, Predictive Analytics for Big Data
Visual, Interactive, Predictive Analytics for Big DataVisual, Interactive, Predictive Analytics for Big Data
Visual, Interactive, Predictive Analytics for Big Data
Arimo, Inc.
 
Viki Big Data Meetup 2013_10
Viki Big Data Meetup 2013_10Viki Big Data Meetup 2013_10
Viki Big Data Meetup 2013_10
ishanagrawal90
 
Dataiku productive application to production - pap is may 2015
Dataiku    productive application to production - pap is may 2015 Dataiku    productive application to production - pap is may 2015
Dataiku productive application to production - pap is may 2015
Dataiku
 

Similar to 4.Data-Visualization.pptx (20)

An introduction to Machine Learning
An introduction to Machine LearningAn introduction to Machine Learning
An introduction to Machine Learning
 
The Sum of our Parts: the Complete CARTO Journey [CARTO]
The Sum of our Parts: the Complete CARTO Journey [CARTO]The Sum of our Parts: the Complete CARTO Journey [CARTO]
The Sum of our Parts: the Complete CARTO Journey [CARTO]
 
Real-Time Personalized Customer Experiences at Bonobos (RET203) - AWS re:Inve...
Real-Time Personalized Customer Experiences at Bonobos (RET203) - AWS re:Inve...Real-Time Personalized Customer Experiences at Bonobos (RET203) - AWS re:Inve...
Real-Time Personalized Customer Experiences at Bonobos (RET203) - AWS re:Inve...
 
2019 WIA - Data-Driven Product Improvements
2019 WIA - Data-Driven Product Improvements2019 WIA - Data-Driven Product Improvements
2019 WIA - Data-Driven Product Improvements
 
Customer Clustering for Retailer Marketing
Customer Clustering for Retailer MarketingCustomer Clustering for Retailer Marketing
Customer Clustering for Retailer Marketing
 
Implementing a data_science_project (Python Version)_part1
Implementing a data_science_project (Python Version)_part1Implementing a data_science_project (Python Version)_part1
Implementing a data_science_project (Python Version)_part1
 
CARTO en 5 Pasos: del Dato a la Toma de Decisiones [CARTO]
CARTO en 5 Pasos: del Dato a la Toma de Decisiones [CARTO]CARTO en 5 Pasos: del Dato a la Toma de Decisiones [CARTO]
CARTO en 5 Pasos: del Dato a la Toma de Decisiones [CARTO]
 
Google Analytics for Beginners - Training
Google Analytics for Beginners - TrainingGoogle Analytics for Beginners - Training
Google Analytics for Beginners - Training
 
Ali upload
Ali uploadAli upload
Ali upload
 
The Dynamic Language is not Enough
The Dynamic Language is not EnoughThe Dynamic Language is not Enough
The Dynamic Language is not Enough
 
Visualizing your data in JavaScript
Visualizing your data in JavaScriptVisualizing your data in JavaScript
Visualizing your data in JavaScript
 
GraphTour - How to Build Next-Generation Solutions using Graph Databases
GraphTour - How to Build Next-Generation Solutions using Graph DatabasesGraphTour - How to Build Next-Generation Solutions using Graph Databases
GraphTour - How to Build Next-Generation Solutions using Graph Databases
 
Esteem chartgraph
Esteem chartgraphEsteem chartgraph
Esteem chartgraph
 
The R of War
The R of WarThe R of War
The R of War
 
Customer Clustering For Retail Marketing
Customer Clustering For Retail MarketingCustomer Clustering For Retail Marketing
Customer Clustering For Retail Marketing
 
Data Warehouses and Multi-Dimensional Data Analysis
Data Warehouses and Multi-Dimensional Data AnalysisData Warehouses and Multi-Dimensional Data Analysis
Data Warehouses and Multi-Dimensional Data Analysis
 
Introducing Amazon Machine Learning
Introducing Amazon Machine LearningIntroducing Amazon Machine Learning
Introducing Amazon Machine Learning
 
Visual, Interactive, Predictive Analytics for Big Data
Visual, Interactive, Predictive Analytics for Big DataVisual, Interactive, Predictive Analytics for Big Data
Visual, Interactive, Predictive Analytics for Big Data
 
Viki Big Data Meetup 2013_10
Viki Big Data Meetup 2013_10Viki Big Data Meetup 2013_10
Viki Big Data Meetup 2013_10
 
Dataiku productive application to production - pap is may 2015
Dataiku    productive application to production - pap is may 2015 Dataiku    productive application to production - pap is may 2015
Dataiku productive application to production - pap is may 2015
 

Recently uploaded

Call Girls Bangalore🔥7023059433🔥Best Profile Escorts in Bangalore Available 24/7
Call Girls Bangalore🔥7023059433🔥Best Profile Escorts in Bangalore Available 24/7Call Girls Bangalore🔥7023059433🔥Best Profile Escorts in Bangalore Available 24/7
Call Girls Bangalore🔥7023059433🔥Best Profile Escorts in Bangalore Available 24/7
manji sharman06
 
Ensuring Efficiency and Speed with Practical Solutions for Clinical Operations
Ensuring Efficiency and Speed with Practical Solutions for Clinical OperationsEnsuring Efficiency and Speed with Practical Solutions for Clinical Operations
Ensuring Efficiency and Speed with Practical Solutions for Clinical Operations
OnePlan Solutions
 
Operational ease MuleSoft and Salesforce Service Cloud Solution v1.0.pptx
Operational ease MuleSoft and Salesforce Service Cloud Solution v1.0.pptxOperational ease MuleSoft and Salesforce Service Cloud Solution v1.0.pptx
Operational ease MuleSoft and Salesforce Service Cloud Solution v1.0.pptx
sandeepmenon62
 
Strengthening Web Development with CommandBox 6: Seamless Transition and Scal...
Strengthening Web Development with CommandBox 6: Seamless Transition and Scal...Strengthening Web Development with CommandBox 6: Seamless Transition and Scal...
Strengthening Web Development with CommandBox 6: Seamless Transition and Scal...
Ortus Solutions, Corp
 
What’s New in VictoriaLogs - Q2 2024 Update
What’s New in VictoriaLogs - Q2 2024 UpdateWhat’s New in VictoriaLogs - Q2 2024 Update
What’s New in VictoriaLogs - Q2 2024 Update
VictoriaMetrics
 
How GenAI Can Improve Supplier Performance Management.pdf
How GenAI Can Improve Supplier Performance Management.pdfHow GenAI Can Improve Supplier Performance Management.pdf
How GenAI Can Improve Supplier Performance Management.pdf
Zycus
 
Flutter vs. React Native: A Detailed Comparison for App Development in 2024
Flutter vs. React Native: A Detailed Comparison for App Development in 2024Flutter vs. React Native: A Detailed Comparison for App Development in 2024
Flutter vs. React Native: A Detailed Comparison for App Development in 2024
dhavalvaghelanectarb
 
Boost Your Savings with These Money Management Apps
Boost Your Savings with These Money Management AppsBoost Your Savings with These Money Management Apps
Boost Your Savings with These Money Management Apps
Jhone kinadey
 
Computer Science & Engineering VI Sem- New Syllabus.pdf
Computer Science & Engineering VI Sem- New Syllabus.pdfComputer Science & Engineering VI Sem- New Syllabus.pdf
Computer Science & Engineering VI Sem- New Syllabus.pdf
chandangoswami40933
 
一比一原版(sdsu毕业证书)圣地亚哥州立大学毕业证如何办理
一比一原版(sdsu毕业证书)圣地亚哥州立大学毕业证如何办理一比一原版(sdsu毕业证书)圣地亚哥州立大学毕业证如何办理
一比一原版(sdsu毕业证书)圣地亚哥州立大学毕业证如何办理
kgyxske
 
Orca: Nocode Graphical Editor for Container Orchestration
Orca: Nocode Graphical Editor for Container OrchestrationOrca: Nocode Graphical Editor for Container Orchestration
Orca: Nocode Graphical Editor for Container Orchestration
Pedro J. Molina
 
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
Paul Brebner
 
ppt on the brain chip neuralink.pptx
ppt  on   the brain  chip neuralink.pptxppt  on   the brain  chip neuralink.pptx
ppt on the brain chip neuralink.pptx
Reetu63
 
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
gapen1
 
Enhanced Screen Flows UI/UX using SLDS with Tom Kitt
Enhanced Screen Flows UI/UX using SLDS with Tom KittEnhanced Screen Flows UI/UX using SLDS with Tom Kitt
Enhanced Screen Flows UI/UX using SLDS with Tom Kitt
Peter Caitens
 
Hyperledger Besu 빨리 따라하기 (Private Networks)
Hyperledger Besu 빨리 따라하기 (Private Networks)Hyperledger Besu 빨리 따라하기 (Private Networks)
Hyperledger Besu 빨리 따라하기 (Private Networks)
wonyong hwang
 
42 Ways to Generate Real Estate Leads - Sellxpert
42 Ways to Generate Real Estate Leads - Sellxpert42 Ways to Generate Real Estate Leads - Sellxpert
42 Ways to Generate Real Estate Leads - Sellxpert
vaishalijagtap12
 
Secure-by-Design Using Hardware and Software Protection for FDA Compliance
Secure-by-Design Using Hardware and Software Protection for FDA ComplianceSecure-by-Design Using Hardware and Software Protection for FDA Compliance
Secure-by-Design Using Hardware and Software Protection for FDA Compliance
ICS
 
The Rising Future of CPaaS in the Middle East 2024
The Rising Future of CPaaS in the Middle East 2024The Rising Future of CPaaS in the Middle East 2024
The Rising Future of CPaaS in the Middle East 2024
Yara Milbes
 

Recently uploaded (20)

Call Girls Bangalore🔥7023059433🔥Best Profile Escorts in Bangalore Available 24/7
Call Girls Bangalore🔥7023059433🔥Best Profile Escorts in Bangalore Available 24/7Call Girls Bangalore🔥7023059433🔥Best Profile Escorts in Bangalore Available 24/7
Call Girls Bangalore🔥7023059433🔥Best Profile Escorts in Bangalore Available 24/7
 
Ensuring Efficiency and Speed with Practical Solutions for Clinical Operations
Ensuring Efficiency and Speed with Practical Solutions for Clinical OperationsEnsuring Efficiency and Speed with Practical Solutions for Clinical Operations
Ensuring Efficiency and Speed with Practical Solutions for Clinical Operations
 
Operational ease MuleSoft and Salesforce Service Cloud Solution v1.0.pptx
Operational ease MuleSoft and Salesforce Service Cloud Solution v1.0.pptxOperational ease MuleSoft and Salesforce Service Cloud Solution v1.0.pptx
Operational ease MuleSoft and Salesforce Service Cloud Solution v1.0.pptx
 
Strengthening Web Development with CommandBox 6: Seamless Transition and Scal...
Strengthening Web Development with CommandBox 6: Seamless Transition and Scal...Strengthening Web Development with CommandBox 6: Seamless Transition and Scal...
Strengthening Web Development with CommandBox 6: Seamless Transition and Scal...
 
bgiolcb
bgiolcbbgiolcb
bgiolcb
 
What’s New in VictoriaLogs - Q2 2024 Update
What’s New in VictoriaLogs - Q2 2024 UpdateWhat’s New in VictoriaLogs - Q2 2024 Update
What’s New in VictoriaLogs - Q2 2024 Update
 
How GenAI Can Improve Supplier Performance Management.pdf
How GenAI Can Improve Supplier Performance Management.pdfHow GenAI Can Improve Supplier Performance Management.pdf
How GenAI Can Improve Supplier Performance Management.pdf
 
Flutter vs. React Native: A Detailed Comparison for App Development in 2024
Flutter vs. React Native: A Detailed Comparison for App Development in 2024Flutter vs. React Native: A Detailed Comparison for App Development in 2024
Flutter vs. React Native: A Detailed Comparison for App Development in 2024
 
Boost Your Savings with These Money Management Apps
Boost Your Savings with These Money Management AppsBoost Your Savings with These Money Management Apps
Boost Your Savings with These Money Management Apps
 
Computer Science & Engineering VI Sem- New Syllabus.pdf
Computer Science & Engineering VI Sem- New Syllabus.pdfComputer Science & Engineering VI Sem- New Syllabus.pdf
Computer Science & Engineering VI Sem- New Syllabus.pdf
 
一比一原版(sdsu毕业证书)圣地亚哥州立大学毕业证如何办理
一比一原版(sdsu毕业证书)圣地亚哥州立大学毕业证如何办理一比一原版(sdsu毕业证书)圣地亚哥州立大学毕业证如何办理
一比一原版(sdsu毕业证书)圣地亚哥州立大学毕业证如何办理
 
Orca: Nocode Graphical Editor for Container Orchestration
Orca: Nocode Graphical Editor for Container OrchestrationOrca: Nocode Graphical Editor for Container Orchestration
Orca: Nocode Graphical Editor for Container Orchestration
 
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
 
ppt on the brain chip neuralink.pptx
ppt  on   the brain  chip neuralink.pptxppt  on   the brain  chip neuralink.pptx
ppt on the brain chip neuralink.pptx
 
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
 
Enhanced Screen Flows UI/UX using SLDS with Tom Kitt
Enhanced Screen Flows UI/UX using SLDS with Tom KittEnhanced Screen Flows UI/UX using SLDS with Tom Kitt
Enhanced Screen Flows UI/UX using SLDS with Tom Kitt
 
Hyperledger Besu 빨리 따라하기 (Private Networks)
Hyperledger Besu 빨리 따라하기 (Private Networks)Hyperledger Besu 빨리 따라하기 (Private Networks)
Hyperledger Besu 빨리 따라하기 (Private Networks)
 
42 Ways to Generate Real Estate Leads - Sellxpert
42 Ways to Generate Real Estate Leads - Sellxpert42 Ways to Generate Real Estate Leads - Sellxpert
42 Ways to Generate Real Estate Leads - Sellxpert
 
Secure-by-Design Using Hardware and Software Protection for FDA Compliance
Secure-by-Design Using Hardware and Software Protection for FDA ComplianceSecure-by-Design Using Hardware and Software Protection for FDA Compliance
Secure-by-Design Using Hardware and Software Protection for FDA Compliance
 
The Rising Future of CPaaS in the Middle East 2024
The Rising Future of CPaaS in the Middle East 2024The Rising Future of CPaaS in the Middle East 2024
The Rising Future of CPaaS in the Middle East 2024
 

4.Data-Visualization.pptx

  • 2. 01 Understanding Data Visualization 02 Base Graphics in R 03 Visualization with GGPLOT2 Agenda
  • 4. Data Visualization Data visualization is basically presentation of data with graphics. It's a way to summarize your findings and display it in a form that facilitates interpretation and can help in identifying patterns or trends
  • 5. Importance of Data Visualization
  • 6. Importance of Data Visualization A picture is worth a 1000 words. Humans are easily attracted to visuals and mostof us preferto understand a particular scenario through pictures instead of text It is faster to recognize a result than to read a paragraph. When done well, visualizations explain complexideas simply. Oftentimes charts tell the story much faster than a prose descriptionor a table presentation
  • 7. Importance of Data Visualization Retailer country Order method type Retailer type Product line Product type Product Year Revenue Quantity Gross margin United States Telephone Golf Shop Personal Accessories Navigation Trail Master 2012 1095 3 0.34767123 United States Telephone Department Store Camping Equipment Sleeping Bags Hibernator 2012 160103.2 1160 0.3769019 United States Telephone Department Store Camping Equipment Sleeping Bags Hibernator Self - Inflating Mat 2012 66514.28 556 0.5440107 United States Telephone Department Store Camping Equipment Sleeping Bags Hibernator Pad 2012 16205.73 411 0.51382196 United States Telephone Department Store Camping Equipment Sleeping Bags Hibernator Pillow 2012 33520.42 2475 0.46298346 United States Fax Outdoors Shop Camping Equipment Cooking Gear TrailChef Water Bag 2013 19418.52 3102 0.53194888 United States Fax Outdoors Shop Camping Equipment Cooking Gear TrailChef Cook Set 2013 42304.32 794 0.34365616 United States Fax Outdoors Shop Camping Equipment Cooking Gear TrailChef Single Flame 2013 52266.32 824 0.26880025 United States Telephone Department Store Camping Equipment Packs Canyon Mule Journey Backpack 2012 235660.36 676 0.38805542 United States Telephone Department Store Camping Equipment Packs Canyon Mule Cooler 2012 53822.16 1652 0.50890117 United States Web Golf Shop Personal Accessories Watches Venue 2014 73949 1013 0.42858294 United States Web Golf Shop Personal Accessories Watches Infinity 2014 157890.2 665 0.45986527 United States Web Golf Shop Personal Accessories Watches Lux 2014 67265.2 396 0.48654044 After looking at the data Is it possible to answer the following questions? • Which order method type has highest revenue ? • Which product line has highest quantity ? Below is the sample data: Sales Products
  • 8. Importance of Data Visualization Afterdisplaying the data in the form of graphs . Lets try answering the questions 0 50 100 150 200 250 300 350 400 Camping Equipment Golf Equipment Mountaineering Equipment Product line Outdoor Protection Personal Accessories Quantity E-mail Fax Sales visit Telephone Web E-mail Fax Sales visit Telephone Web
  • 9. Importance of Data Visualization Afterdisplaying the data in the form of graphs . Lets try answering the questions E-mail Fax Sales visit Telephone Web E-mail Fax Sales visit Telephone Web Web Which ordermethod type has highest revenue ? Web ordermethod type has highest revenue among all the ordermethod type. Moreover looking at the pie chart you can also tell that which ordermethod type has lowest revenue : Email.
  • 10. Importance of Data Visualization Afterdisplaying the data in the form of graphs . Lets try answering the questions Which productline has highest quantity ? 0 50 100 150 300 250 200 350 400 Camping Equipment Golf Equipment Mountaineering Equipment Product line Outdoor Protection Personal Accessories Quantity From the graph it can be easily seen that PersonalAccessorieshas highest number among all the productline.
  • 12. Base Graphics in R Base Graphics helps in making simple graphs
  • 14. Bar Plot Bar Plots are suitable for showing comparisonbetweencumulative totals across several groups
  • 15. Bar Plot Making a simple Bar-plot plot(customer_churn$Dependent s)
  • 21. Histogram Histogram is basically a plot that breaks the data into bins (or breaks) and shows frequencydistribution of these bins
  • 22. Histogram Making a simple histogram hist(customer_churn$tenure)
  • 24. Histogram Change the number of bins hist(customer_churn$tenure, col="olivedrab", breaks=30)
  • 26. Grammar of Graphics Every form of communicationneeds to have grammar. Since, visualization is also a form of communication, it needs to have a foundation of grammar I am John Am John I
  • 27. Components of Grammar of Graphics Element Description Data The data-set forwhich we wouldwantto plot a graph Aesthetics The metrics onto whichwe plot our data Geometry Visual elements to plot the data Facet Groups by which wedivide the data
  • 29. Visualization with ggplot2 ggplot2 is a system for declaratively creating graphics, based on the Grammar of Graphics. You provide the data, tell ggplot2 how to map variables to aesthetics,what graphical primitives to use, and it takes care of the details
  • 31. geom_hist() geom_hist()function helps in making histograms with ggplot2
  • 32. geom_hist() Build a histogram for ‘tenure’ column ggplot(data=customer_churn, aes(x=tenure))+geom_histogram()
  • 33. geom_hist() Change the number of bins ggplot(data = customer_churn, aes(x=tenure))+geom_histogram(bins = 50)
  • 34. geom_hist() Add fill color and bordercolor ggplot(data = customer_churn, aes(x=tenure))+ geom_histogram(fill="palegreen4",col="green")
  • 35. geom_hist() Assign‘Partner’ to fill aesthetic ggplot(data = customer_churn, aes(x=tenure, fill=Partner))+geom_histogram()
  • 36. geom_hist() Set Position to be ‘identity’ ggplot(data = customer_churn, aes(x=tenure, fill=Partner))+ geom_histogram(position = "identity")
  • 38. geom_bar() geom_bar()function helps in making bar-plots with ggplot2
  • 39. geom_bar() Build a bar-plot for ‘Dependents’column ggplot(data = customer_churn, aes(x=Dependents))+geom_bar()
  • 40. geom_bar() Add fill color ggplot(data = customer_churn, aes(x=Dependents))+geom_bar(fill="chocolate")
  • 41. geom_bar() Assigning ‘DeviceProtection’column to fill aesthetic ggplot(data = customer_churn, aes(x=Dependents,fill=DeviceProtection)) +geom_bar()
  • 42. geom_bar() Set the positionto ‘dodge’ ggplot(data = customer_churn, aes(x=Dependents,fill=DeviceProtection)) +geom_bar(position=‘dodge’)
  • 44. geom_point() geom_point() function helps in making scatterplots with ggplot2.Ascatter plot helps in understanding how does one variable change w.r.t another variable. It is used for two continuous values
  • 45. geom_point() Scatter-plotbetween ‘TotalCharges’ & ‘tenure’ ggplot(data = customer_churn, aes(y=TotalCharges,x=tenure))+geom_point( )
  • 46. geom_point() Adding color ggplot(data = customer_churn, aes(y=TotalCharges,x=tenure)) + geom_point(col="slateblue3")
  • 47. geom_point() Map ‘Partner’ to col aesthetic ggplot(data = customer_churn, aes(y=TotalCharges,x=tenure, col=Partner)) + geom_point()
  • 48. geom_point() Map ‘InternetService’ to col aesthetic ggplot(data = customer_churn, aes(y=TotalCharges,x=tenure, col=InternetService)) + geom_point()
  • 50. geom_boxplot() geom_boxplot() function helps in making boxplots with ggplot2. Box Plot shows 5 statistically significant numbers- the minimum, the 25th percentile, the median, the 75th percentile and the maximum. It is thus useful for visualizing the spread of the data and deriving inferences accordingly
  • 51. geom_boxplot() Box-plot between ‘MonthlyCharges’ & ‘InternetService’ ggplot(data = customer_churn,aes(y=MonthlyCharges,x=In ternetService))+geom_boxplot()
  • 52. geom_boxplot() Add fill color ggplot(data = customer_churn,aes(y=MonthlyCharges,x=Inte rnetService))+geom_boxplot(fill="violetred4")
  • 54. Faceting the data facet_grid() is used to facet the data. Faceting is used when the plot is too chaotic and some variables have to be grouped into different facets to have a better visualization
  • 55. facet_grid() Initial Graph ggplot(data = customer_churn, aes(x=tenure,fill=InternetService))+ geom_histogram() After Faceting ggplot(data = customer_churn, aes(x=tenure,fill=InternetService))+ geom_histogram()+ facet_grid(~InternetService)
  • 56. facet_grid() Initial Graph ggplot(data = customer_churn, aes(y=TotalCharges,x=tenure, col=Contract))+ geom_point()+geom_smooth() After Faceting ggplot(data = customer_churn, aes(y=TotalCharges,x=tenure, col=Contract))+ geom_point()+geom_smooth()+facet_grid(~Contract)
  • 58. Theme Layer Theme layeris used to add themes to our plots
  • 59. Theme Layer Initial Graph ggplot(data = customer_churn,aes(x=tenure))+ geom_histogram(fill="tomato3", col="mediumaquamarine") -> g1 After Adding Title g1+labs(title = "Distribution of tenure")->g2
  • 60. Theme Layer After Adding Title g1+labs(title = "Distribution of tenure")->g2 After Adding Panel Background g2+theme(panel.background = element_rect(fill = "olivedrab3"))->g3
  • 61. Theme Layer After Adding Panel Background g2+theme(panel.background = element_rect(fill = "olivedrab3"))->g3 After Adding Plot Background g3+theme(plot.background = element_rect(fill = "palegreen4"))->g4
  • 62. Theme Layer After Adding Plot Background g3+theme(plot.background = element_rect(fill = "palegreen4"))->g4 After Changing Title g4+theme(plot.title = element_text(hjust = 0.5, face="bold", colour = "peachpuff"))
  • 63. Quiz
  • 64. Quiz Which of these is the correctcode to make a bar-plot for the ‘TechSupport’column. The colorof the bars should be ‘blue’ & the title of the plot should be ‘Distribution of Tech Support’ 1. plot(customer_churn$TechSupport,col="blue", main="Distributionof TechSupport") 2. plot(customer_churn$TechSupport,fill="blue", main="Distribution of Tech Support") 3. plot(customer_churn$TechSupport,col="blue", title="Distributionof Tech Support") 4. plot(customer_churn$TechSupport,color="blue",title="Distributionof TechSupport")
  • 65. Quiz Which of these is the correctcode to make a histogram for the ‘tenure’column. The fill color of the bins should be ‘azure’ & the number of bins should be 87 1. ggplot(data= customer_churn,aes(x=tenure,col='azure'))+geom_histogram(bins=87) 2. ggplot(data = customer_churn,aes(x=tenure))+geom_histogram(col="azure",bins=87) 3. ggplot(data= customer_churn,aes(x=tenure))+geom_histogram(fill="azure",bins=87) 4. ggplot(data= customer_churn,aes(x=tenure,fill='azure'))+geom_histogram(bins=87)
  • 66. Quiz Which of these is the correct code to make a bar-plot for the ‘OnlineBackup’ column. The color of the bars should be determined by the ‘PhoneService’ column 1. ggplot(data= customer_churn,aes(fill=OnlineBackup,x=PhoneService))+geom_bar() 2. ggplot(data= customer_churn,aes(y=OnlineBackup,fill=PhoneService))+geom_bar() 3. ggplot(data= customer_churn,aes(x=OnlineBackup))+geom_bar(fill=PhoneService) 4. ggplot(data= customer_churn,aes(x=OnlineBackup,fill=PhoneService))+geom_bar()
  • 67. Quiz To which of these geometries can you add the facet_grid()? 1. geom_bar() 2. geom_histogram() 3. geom_point() 4. All of the above