Data Visualization Experiments
Using Excel, Octave/MATLAB, R, and Python
atjahyanto@gmail.com
Line Graph
• Visualizes the value of something over time
• Open the data file “Crude Oil Prices.xls”
• Create the following chart
• Date vs Price
Bar Chart for Categorical Data
• Presents categorical data as rectangular bars whose heights or lengths are proportional to the values they represent
• Open the data file “Energy Drink Survey.xls”
• Create the following chart
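A bar chart starts from one count per category. As a minimal sketch in Python (standard library only), the snippet below tallies a small made-up list of survey answers and prints one text bar per category. The brand names and the `responses` list are hypothetical and are not taken from “Energy Drink Survey.xls”.

```python
from collections import Counter

# Hypothetical survey answers (not from the workshop data file)
responses = ["Brand A", "Brand B", "Brand A", "Brand C", "Brand A", "Brand B"]

# Count how often each category occurs
counts = Counter(responses)

# Print one bar per category; the bar length is proportional to the count
for category, count in sorted(counts.items()):
    print(f"{category:<8} {'#' * count} ({count})")
```

The same counts are what Excel, R, or matplotlib would draw as bar heights.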
Scatterplot
• Displays the relationship between two numerical variables
• Open the data file “House Sales.xls”
• Create the following chart
• Selling Price vs Lot Cost
Scatterplot
• Displays the relationship between two numerical variables
• Open the data file “Boston Housing.xls”
• Create the following chart
• Lstat vs Medv
%% Octave / MATLAB
clf;                       % clear the current figure
medv = [24 21.6 34.7 …
lstat= [4.98 9.14 4.03 …
x = lstat;
y = medv;
ukuran = 200;              % marker size
scatter (x, y, ukuran, 0, "filled");
Scatterplot with color added
• Example using Octave
clf;
x = [1, 2, 3, 4, 5, 6, 7];
y = [1.9, 1.76, 1.34, 1.67, 1.72, 1.89, 1.91];
warna = [1,1,1,2,2,3,3];   % color group for each point
ukuran = 200;              % marker size
scatter (x, y, ukuran, warna, "filled");
Scatterplot with color added
• Displays the relationship between two numerical variables
• Open the data file “Boston Housing.xls”
• Create the following chart
• Lstat vs Nox
clf;
medv = [24 21.6 34.7 …
lstat= [4.98 9.14 4.03 …
nox = [0.538 0.469 0.469 …
med = median(medv);
% Color index: 1 where medv is above the median, 0 otherwise
warna = ones(size(medv));
iwarna = medv <= med;
warna(iwarna) = 0;
x = lstat;
y = nox;
ukuran = 200;              % marker size
scatter (x, y, ukuran, warna, "filled");
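The coloring rule in the Octave code above is a median split: points whose medv is at or below the median get color 0, the rest get color 1. A rough Python equivalent of just that step is sketched below; the `medv` values are hypothetical placeholders, not the Boston Housing column.

```python
from statistics import median

# Hypothetical values standing in for the medv column
medv = [10.0, 20.0, 30.0, 40.0, 50.0, 25.0]

med = median(medv)

# 0 for values at or below the median, 1 for values above it
warna = [0 if value <= med else 1 for value in medv]
print(med, warna)
```

The resulting `warna` list can be passed as the color argument of a scatter function, as in the slides.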
Scatterplot using R
data(iris)
pairs(iris[1:4],main="Iris Data(red=setosa,green=versicolor,blue=virginica)",
pch=21, bg=c("red","green3","blue")[unclass(iris$Species)])
> summary(iris)
  Sepal.Length    Sepal.Width     Petal.Length    Petal.Width
 Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100
 1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300
 Median :5.800   Median :3.000   Median :4.350   Median :1.300
 Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199
 3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800
 Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500
       Species
 setosa    :50
 versicolor:50
 virginica :50
Scatterplot using Python
# data iris
import os
import pandas as pd
import matplotlib.pyplot as plt
# Set working directory and load data
os.chdir(r'C:\myworkspace\python')  # adjust to your own folder
# load dataset
irisdata = pd.read_csv('iris-UCI-header.csv')
iris = pd.read_csv('iris-UCI-header.csv')
# Create numeric classes for species (0,1,2)
iris.loc[iris['Species']=='Iris-virginica','Species'] = 0
iris.loc[iris['Species']=='Iris-versicolor','Species'] = 1
iris.loc[iris['Species']=='Iris-setosa','Species'] = 2
# Keep only virginica (0) and versicolor (1)
iris = iris[iris['Species']!=2]
X = iris['PetalLength'].values
Y = iris['PetalWidth'].values
# Color index per point
warna = iris['Species'].astype('uint8').values
# Make a scatter plot
plt.scatter(X, Y, c=warna, s=40, cmap=plt.cm.Spectral)
plt.title("IRIS DATA | Blue - Versicolor, Red - Virginica")
plt.xlabel('Petal Length')
plt.ylabel('Petal Width')
plt.show()
Scatterplot for all attributes using Python
# data iris
import os
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns  # pip install seaborn
# Set working directory and load data
os.chdir(r'C:\myworkspace\python')  # adjust to your own folder
# load dataset
irisdata = pd.read_csv('iris-UCI-header.csv')
iris = pd.read_csv('iris-UCI-header.csv')
# Create numeric classes for species (0,1,2)
iris.loc[iris['Species']=='Iris-virginica','Species'] = 0
iris.loc[iris['Species']=='Iris-versicolor','Species'] = 1
iris.loc[iris['Species']=='Iris-setosa','Species'] = 2
# Scatter plots of all pairs of attributes
# (the pairplot argument `size` was renamed to `height` in seaborn 0.9)
plt.close()
sns.pairplot(irisdata, hue='Species', height=2, diag_kind='kde')
plt.show()
Histogram
• Displays the distribution of the outcome variable
• Displays how many times each value (or range of values) occurs in a data set
• Open the data file “House Sales.xls”
• Create the following chart
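Behind every histogram is a simple binning step: each value is assigned to an interval and the values per interval are counted. The sketch below shows that step in plain Python. The `prices` list, the bin width of 50, and the starting edge of 100 are assumptions chosen for illustration, not values from “House Sales.xls”.

```python
# Hypothetical selling prices (in thousands); not from the workshop data file
prices = [120, 135, 150, 155, 160, 180, 210, 240, 245, 300]

bin_width = 50
start = 100  # left edge of the first bin

# Count how many values fall into each bin [edge, edge + bin_width)
counts = {}
for price in prices:
    edge = start + ((price - start) // bin_width) * bin_width
    counts[edge] = counts.get(edge, 0) + 1

# Print every bin, including empty ones, as a text bar
for edge in range(start, max(prices) + 1, bin_width):
    n = counts.get(edge, 0)
    print(f"{edge}-{edge + bin_width - 1}: {'#' * n} ({n})")
```

Changing `bin_width` changes the shape of the histogram, which is why plotting tools let you set the number of bins.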
Histogram using R
• Displays the distribution of the outcome variable
# try a histogram
HouseSales <- read.csv("House Sales.csv", header=TRUE)
hist(HouseSales$SellingPrice)
Histogram
• Displays the distribution of the outcome variable
• Displays how many times each value (or range of values) occurs in a data set
• Open the data file “Boston Housing.xls”
• Create the following chart
# try a histogram
BostonHousing <- read.csv("Boston Housing.csv", header=TRUE)
hist(BostonHousing$medv)
Histogram using Python
import os
import pandas as pd
import matplotlib.pyplot as plt
# Set working directory and load data
os.chdir(r'C:\myworkspace\python')  # adjust to your own folder
# load dataset
irisdata = pd.read_csv('iris-UCI-header.csv')
iris = pd.read_csv('iris-UCI-header.csv')
# Create numeric classes for species (0,1,2)
iris.loc[iris['Species']=='Iris-virginica','Species'] = 0
iris.loc[iris['Species']=='Iris-versicolor','Species'] = 1
iris.loc[iris['Species']=='Iris-setosa','Species'] = 2
# Histograms of distribution of input attributes
irisdata.hist()
his = plt.gcf()
his.set_size_inches(12, 6)
plt.show()
Boxplot
• Depicts groups of numerical data through their quartiles
• Useful for comparing subgroups
• Open the data file “President's Inn Guest Database.xls”
• Create the following chart
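The box in a boxplot spans the first and third quartiles, with a line at the median. As a small illustration, the sketch below computes those numbers with Python's `statistics.quantiles` and flags points beyond 1.5 × IQR, a common whisker convention that individual tools may implement slightly differently. The `values` list is hypothetical.

```python
from statistics import quantiles

# Hypothetical numeric sample; not from the workshop data file
values = [2, 4, 4, 5, 7, 8, 9, 11, 30]

# Quartiles using the "inclusive" method (sample treated as the full population)
q1, q2, q3 = quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Points beyond 1.5 * IQR from the box are usually drawn as outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in values if v < lower_fence or v > upper_fence]

print("min, Q1, median, Q3, max:", min(values), q1, q2, q3, max(values))
print("outliers:", outliers)
```

Different quartile methods (R's default, Excel's QUARTILE.INC and QUARTILE.EXC) can give slightly different box edges for small samples.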
Boxplot
• Depicts groups of numerical data through their quartiles
• Useful for comparing subgroups
• Open the data file “Boston Housing.xls”
• Create the following chart
Boxplot using R
• Depicts groups of numerical data through their quartiles
• Useful for comparing subgroups
# try a boxplot for the medv column
BostonHousing <- read.csv("Boston Housing.csv", header=TRUE)
boxplot(BostonHousing$medv)
# try side-by-side boxplots for the medv and lstat columns
BostonHousing <- read.csv("Boston Housing.csv", header=TRUE)
boxplot(BostonHousing$medv, BostonHousing$lstat,
        main = "Multiple boxplots for comparison",
        at = c(1,2), names = c("medv", "lstat"))
Boxplot using Python
import os
import pandas as pd
import matplotlib.pyplot as plt
# Set working directory and load data
os.chdir(r'C:\myworkspace\python')  # adjust to your own folder
# load dataset
irisdata = pd.read_csv('iris-UCI-header.csv')
iris = pd.read_csv('iris-UCI-header.csv')
# Create numeric classes for species (0,1,2)
iris.loc[iris['Species']=='Iris-virginica','Species'] = 0
iris.loc[iris['Species']=='Iris-versicolor','Species'] = 1
iris.loc[iris['Species']=='Iris-setosa','Species'] = 2
# Box and whisker plots (give an idea of the distribution of the input attributes)
irisdata.plot(kind='box', subplots=True, layout=(2, 2), sharex=False, sharey=False)
plt.show()
Heatmap
• Correlation matrix
• Highlights correlations between variables
• Open the data file “Boston Housing.xls”
• Create the following chart
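Each cell of a correlation matrix is the Pearson correlation between two columns. The sketch below computes such a matrix in plain Python for three small made-up columns; the column names `a`, `b`, `c` and their values are hypothetical and unrelated to “Boston Housing.xls”.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical columns; not from the workshop data file
columns = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.0],   # rises with a
    "c": [8.0, 6.0, 4.0, 2.0],   # falls as a rises
}

names = list(columns)
matrix = [[pearson(columns[r], columns[c]) for c in names] for r in names]
for name, row in zip(names, matrix):
    print(name, [round(v, 2) for v in row])
```

The matrix is symmetric with ones on the diagonal, which is why Excel's Correlation tool prints only the lower triangle.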
Load the Analysis ToolPak in Excel
• Click the File tab, click Options, and then click the Add-Ins category.
• If you're using Excel 2007, click the Microsoft Office Button, and then click Excel Options.
• In the Manage box, select Excel Add-ins and then click Go.
• If you're using Excel for Mac, in the file menu go to Tools > Excel Add-ins.
• In the Add-Ins box, check the Analysis ToolPak check box, and then click OK.
• If Analysis ToolPak is not listed in the Add-Ins available box, click Browse to
locate it.
• If you are prompted that the Analysis ToolPak is not currently installed on
your computer, click Yes to install it.
• Select Data > Data Analysis > Correlation
• Select the input range (in our case, columns F:Q)
• Check the box for “Labels in first row”
• Select the output location: either a new worksheet or a range in the current sheet
              crim        zn     indus      chas       nox        rm       age       dis       rad       tax   ptratio         b     lstat      medv
crim             1
zn        -0.20047         1
indus     0.406583  -0.53383         1
chas      -0.05589   -0.0427  0.062938         1
nox       0.420972   -0.5166  0.763651  0.091203         1
rm        -0.21925  0.311991  -0.39168  0.091251  -0.30219         1
age       0.352734  -0.56954  0.644779  0.086518   0.73147  -0.24026         1
dis       -0.37967  0.664408  -0.70803  -0.09918  -0.76923  0.205246  -0.74788         1
rad       0.625505  -0.31195  0.595129  -0.00737  0.611441  -0.20985  0.456022  -0.49459         1
tax       0.582764  -0.31456   0.72076  -0.03559  0.668023  -0.29205  0.506456  -0.53443  0.910228         1
ptratio   0.289946  -0.39168  0.383248  -0.12152  0.188933   -0.3555  0.261515  -0.23247  0.464741  0.460853         1
b         -0.38506   0.17552  -0.35698  0.048788  -0.38005  0.128069  -0.27353  0.291512  -0.44441  -0.44181  -0.17738         1
lstat     0.455621  -0.41299    0.6038  -0.05393  0.590879  -0.61381  0.602339    -0.497  0.488676  0.543993  0.374044  -0.36609         1
medv       -0.3883  0.360445  -0.48373   0.17526  -0.42732   0.69536  -0.37695  0.249929  -0.38163  -0.46854  -0.50779  0.333461  -0.73766         1
• Conditional formatting
• Color scales
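A color scale turns each cell value into a color by interpolating between fixed anchor colors. The sketch below illustrates the idea for correlations in the range -1 to 1. The anchor colors (blue, white, red) and the function names `lerp` and `color_for` are assumptions made for this example; Excel's built-in presets use their own colors and anchor rules.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1], rounded to an int."""
    return round(a + (b - a) * t)

def color_for(value, low=(0, 0, 255), mid=(255, 255, 255), high=(255, 0, 0)):
    """Map a correlation in [-1, 1] to an RGB tuple on a three-color scale."""
    value = max(-1.0, min(1.0, value))          # clamp to the valid range
    if value < 0:
        start, end, t = low, mid, value + 1.0   # -1 -> low, 0 -> mid
    else:
        start, end, t = mid, high, value        # 0 -> mid, 1 -> high
    return tuple(lerp(s, e, t) for s, e in zip(start, end))

for r in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(r, color_for(r))
```

With this mapping, strongly negative correlations appear blue, values near zero appear white, and strongly positive correlations appear red.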
Heatmap using R
bostonhousing <- read.csv("Boston Housing.csv", header=TRUE)
x <- as.matrix(bostonhousing)
xx <- cor(x)   # correlation matrix
my_palette <- colorRampPalette(c("red", "blue", "yellow"))(n = 256)
your_palette <- cm.colors(256)   # alternative palette
rc <- rainbow(nrow(xx), start = 0, end = .3)
cc <- rainbow(ncol(xx), start = 0, end = .3)
hv <- heatmap(xx, col = my_palette, scale = "column",
              RowSideColors = rc, ColSideColors = cc, margins = c(5,10),
              xlab = "xlabel", ylab = "ylabel",
              main = "heatmap")
utils::str(hv) # the two re-ordering index vectors
Heatmap using Python
# data iris
import os
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Set working directory and load data
os.chdir(r'C:\myworkspace\python')  # adjust to your own folder
# load dataset
irisdata = pd.read_csv('iris-UCI-header.csv')
iris = pd.read_csv('iris-UCI-header.csv')
# Create numeric classes for species (0,1,2)
iris.loc[iris['Species']=='Iris-virginica','Species'] = 0
iris.loc[iris['Species']=='Iris-versicolor','Species'] = 1
iris.loc[iris['Species']=='Iris-setosa','Species'] = 2
plt.figure(figsize=(7,5))
# Correlate the numeric columns only (Species is text in irisdata)
sns.heatmap(irisdata.select_dtypes('number').corr(), annot=True, cmap='RdYlGn_r')
plt.show()
Treemaps
• Shows the size of your data as areas: the larger the area, the more important the data.
• Open the data file “daftar-file.xls”
• Create the following chart
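The core idea of a treemap is that each rectangle's area is proportional to its value. The sketch below uses the simplest possible layout, a single row of slices, rather than the squarified layout that treemap libraries typically use. The file names, sizes, and canvas dimensions are hypothetical and not taken from “daftar-file.xls”.

```python
# Hypothetical file sizes in KB; not from the workshop data file
sizes = {"report.pdf": 400, "photo.jpg": 300, "notes.txt": 200, "todo.md": 100}

canvas_width, canvas_height = 100.0, 50.0
total = sum(sizes.values())

# Slice layout: every rectangle has full height and a width proportional to its value
rectangles = []
x = 0.0
for name, size in sorted(sizes.items(), key=lambda item: item[1], reverse=True):
    width = canvas_width * size / total
    rectangles.append((name, x, 0.0, width, canvas_height))
    x += width

for name, left, top, width, height in rectangles:
    share = width * height / (canvas_width * canvas_height)
    print(f"{name:<12} x={left:5.1f} w={width:5.1f} area share={share:.0%}")
```

A squarified layout places the same areas in rows and columns so the rectangles stay closer to squares, which makes them easier to compare by eye.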
Treemaps using R
• Shows the size of your data as areas: the larger the area, the more important the data.
• Open the data file “daftar-file.csv”
# install.packages("treemap")
library(treemap)
dataku <- read.csv("daftar-file.csv", header=TRUE)
treemap(dataku,
        index=c("subdir2", "namafile"),
        vSize="ukuran",
        vColor="ukuran",
        type="value",
        format.legend = list(scientific = FALSE, big.mark = " "))
Treemaps using R
• Shows the size of your data as areas: the larger the area, the more important the data.
library(treemap)
data(GNI2014)
treemap(GNI2014,
        index=c("continent", "iso3"),
        vSize="population",
        vColor="GNI",
        type="value",
        format.legend = list(scientific = FALSE, big.mark = " "))
iso3  country  continent      population  GNI
BMU   Bermuda  North America       67837  106140
NOR   Norway   Europe            4676305  103630
Treemap using Python
import os
import matplotlib
import matplotlib.pyplot as plt
import pandas as pd
import squarify  # pip install squarify
# Set working directory and load data
os.chdir(r'C:\myworkspace\python')  # adjust to your own folder
# load dataset
datax = pd.read_csv('daftar-file.csv')
# keep only the files whose size (ukuran) is greater than 25K
mydata = datax[datax["ukuran"]>25]
# Use matplotlib to scale the sizes between their min and max, then map that scale to colors.
norm = matplotlib.colors.Normalize(vmin=min(mydata.ukuran), vmax=max(mydata.ukuran))
colors = [matplotlib.cm.Blues(norm(value)) for value in mydata.ukuran]
# Create the plot and resize it.
fig = plt.gcf()
ax = fig.add_subplot()
fig.set_size_inches(16, 4.5)
# Use squarify to plot the data, label it and add colors. The alpha layer keeps the black labels readable.
squarify.plot(label=mydata.namafile, sizes=mydata.ukuran, color=colors, alpha=.6)
plt.title("My files larger than 25K", fontsize=23, fontweight="bold")
# Remove the axes and display the plot
plt.axis('off')
plt.show()
Descriptive statistics for Iris Dataset
Using Python
import pandas as pd
import os
# Set working directory and load data
os.chdir(r'C:\myworkspace\python')  # adjust to your own folder
# load dataset
irisdata = pd.read_csv('iris-UCI-header.csv')
iris = pd.read_csv('iris-UCI-header.csv')
#Create numeric classes for species (0,1,2)
iris.loc[iris['Species']=='Iris-virginica','Species']=0
iris.loc[iris['Species']=='Iris-versicolor','Species']=1
iris.loc[iris['Species']=='Iris-setosa','Species'] = 2
iris.shape
iris.describe()
irisdata.info()
irisdata.head()
irisdata[irisdata['Species']=='Iris-virginica'].describe()
irisdata.groupby('Species').size()

More Related Content

Similar to visualisasi data praktik pakai excel, py

Find Anything In Your APEX App - Fuzzy Search with Oracle Text
Find Anything In Your APEX App - Fuzzy Search with Oracle TextFind Anything In Your APEX App - Fuzzy Search with Oracle Text
Find Anything In Your APEX App - Fuzzy Search with Oracle TextCarsten Czarski
 
Cool bonsai cool - an introduction to ElasticSearch
Cool bonsai cool - an introduction to ElasticSearchCool bonsai cool - an introduction to ElasticSearch
Cool bonsai cool - an introduction to ElasticSearchclintongormley
 
Introduction - Using Stata
Introduction - Using StataIntroduction - Using Stata
Introduction - Using StataRyan Herzog
 
Sample Questions The following sample questions are not in.docx
Sample Questions The following sample questions are not in.docxSample Questions The following sample questions are not in.docx
Sample Questions The following sample questions are not in.docxtodd331
 
Gotcha! Ruby things that will come back to bite you.
Gotcha! Ruby things that will come back to bite you.Gotcha! Ruby things that will come back to bite you.
Gotcha! Ruby things that will come back to bite you.David Tollmyr
 
Deep Learning in Spark with BigDL by Petar Zecevic at Big Data Spain 2017
Deep Learning in Spark with BigDL by Petar Zecevic at Big Data Spain 2017Deep Learning in Spark with BigDL by Petar Zecevic at Big Data Spain 2017
Deep Learning in Spark with BigDL by Petar Zecevic at Big Data Spain 2017Big Data Spain
 
How Clean is your database? Data scrubbing for all skills sets
How Clean is your database? Data scrubbing for all skills setsHow Clean is your database? Data scrubbing for all skills sets
How Clean is your database? Data scrubbing for all skills setsChad Petrovay
 
MongoDB Chunks - Distribution, Splitting, and Merging
MongoDB Chunks - Distribution, Splitting, and MergingMongoDB Chunks - Distribution, Splitting, and Merging
MongoDB Chunks - Distribution, Splitting, and MergingJason Terpko
 
The Many Facets of Apache Solr - Yonik Seeley
The Many Facets of Apache Solr - Yonik SeeleyThe Many Facets of Apache Solr - Yonik Seeley
The Many Facets of Apache Solr - Yonik Seeleylucenerevolution
 
Python seaborn cheat_sheet
Python seaborn cheat_sheetPython seaborn cheat_sheet
Python seaborn cheat_sheetNishant Upadhyay
 
DataStax | Data Science with DataStax Enterprise (Brian Hess) | Cassandra Sum...
DataStax | Data Science with DataStax Enterprise (Brian Hess) | Cassandra Sum...DataStax | Data Science with DataStax Enterprise (Brian Hess) | Cassandra Sum...
DataStax | Data Science with DataStax Enterprise (Brian Hess) | Cassandra Sum...DataStax
 
Machine Learning with Python
Machine Learning with PythonMachine Learning with Python
Machine Learning with PythonAnkit Rathi
 
Regression and Classification with R
Regression and Classification with RRegression and Classification with R
Regression and Classification with RYanchang Zhao
 
Machine Learning with ML.NET and Azure - Andy Cross
Machine Learning with ML.NET and Azure - Andy CrossMachine Learning with ML.NET and Azure - Andy Cross
Machine Learning with ML.NET and Azure - Andy CrossAndrew Flatters
 
Fazendo mágica com ElasticSearch
Fazendo mágica com ElasticSearchFazendo mágica com ElasticSearch
Fazendo mágica com ElasticSearchPedro Franceschi
 
EDA tools and making sense of data.pdf
EDA tools and making sense of   data.pdfEDA tools and making sense of   data.pdf
EDA tools and making sense of data.pdf9wldv5h8n
 
Machine Learning with Azure
Machine Learning with AzureMachine Learning with Azure
Machine Learning with AzureBarbara Fusinska
 
DataCamp Cheat Sheets 4 Python Users (2020)
DataCamp Cheat Sheets 4 Python Users (2020)DataCamp Cheat Sheets 4 Python Users (2020)
DataCamp Cheat Sheets 4 Python Users (2020)EMRE AKCAOGLU
 
Hands on Mahout!
Hands on Mahout!Hands on Mahout!
Hands on Mahout!OSCON Byrum
 
ComputeFest 2012: Intro To R for Physical Sciences
ComputeFest 2012: Intro To R for Physical SciencesComputeFest 2012: Intro To R for Physical Sciences
ComputeFest 2012: Intro To R for Physical Sciencesalexstorer
 

Similar to visualisasi data praktik pakai excel, py (20)

Find Anything In Your APEX App - Fuzzy Search with Oracle Text
Find Anything In Your APEX App - Fuzzy Search with Oracle TextFind Anything In Your APEX App - Fuzzy Search with Oracle Text
Find Anything In Your APEX App - Fuzzy Search with Oracle Text
 
Cool bonsai cool - an introduction to ElasticSearch
Cool bonsai cool - an introduction to ElasticSearchCool bonsai cool - an introduction to ElasticSearch
Cool bonsai cool - an introduction to ElasticSearch
 
Introduction - Using Stata
Introduction - Using StataIntroduction - Using Stata
Introduction - Using Stata
 
Sample Questions The following sample questions are not in.docx
Sample Questions The following sample questions are not in.docxSample Questions The following sample questions are not in.docx
Sample Questions The following sample questions are not in.docx
 
Gotcha! Ruby things that will come back to bite you.
Gotcha! Ruby things that will come back to bite you.Gotcha! Ruby things that will come back to bite you.
Gotcha! Ruby things that will come back to bite you.
 
Deep Learning in Spark with BigDL by Petar Zecevic at Big Data Spain 2017
Deep Learning in Spark with BigDL by Petar Zecevic at Big Data Spain 2017Deep Learning in Spark with BigDL by Petar Zecevic at Big Data Spain 2017
Deep Learning in Spark with BigDL by Petar Zecevic at Big Data Spain 2017
 
How Clean is your database? Data scrubbing for all skills sets
How Clean is your database? Data scrubbing for all skills setsHow Clean is your database? Data scrubbing for all skills sets
How Clean is your database? Data scrubbing for all skills sets
 
MongoDB Chunks - Distribution, Splitting, and Merging
MongoDB Chunks - Distribution, Splitting, and MergingMongoDB Chunks - Distribution, Splitting, and Merging
MongoDB Chunks - Distribution, Splitting, and Merging
 
The Many Facets of Apache Solr - Yonik Seeley
The Many Facets of Apache Solr - Yonik SeeleyThe Many Facets of Apache Solr - Yonik Seeley
The Many Facets of Apache Solr - Yonik Seeley
 
Python seaborn cheat_sheet
Python seaborn cheat_sheetPython seaborn cheat_sheet
Python seaborn cheat_sheet
 
DataStax | Data Science with DataStax Enterprise (Brian Hess) | Cassandra Sum...
DataStax | Data Science with DataStax Enterprise (Brian Hess) | Cassandra Sum...DataStax | Data Science with DataStax Enterprise (Brian Hess) | Cassandra Sum...
DataStax | Data Science with DataStax Enterprise (Brian Hess) | Cassandra Sum...
 
Machine Learning with Python
Machine Learning with PythonMachine Learning with Python
Machine Learning with Python
 
Regression and Classification with R
Regression and Classification with RRegression and Classification with R
Regression and Classification with R
 
Machine Learning with ML.NET and Azure - Andy Cross
Machine Learning with ML.NET and Azure - Andy CrossMachine Learning with ML.NET and Azure - Andy Cross
Machine Learning with ML.NET and Azure - Andy Cross
 
Fazendo mágica com ElasticSearch
Fazendo mágica com ElasticSearchFazendo mágica com ElasticSearch
Fazendo mágica com ElasticSearch
 
EDA tools and making sense of data.pdf
EDA tools and making sense of   data.pdfEDA tools and making sense of   data.pdf
EDA tools and making sense of data.pdf
 
Machine Learning with Azure
Machine Learning with AzureMachine Learning with Azure
Machine Learning with Azure
 
DataCamp Cheat Sheets 4 Python Users (2020)
DataCamp Cheat Sheets 4 Python Users (2020)DataCamp Cheat Sheets 4 Python Users (2020)
DataCamp Cheat Sheets 4 Python Users (2020)
 
Hands on Mahout!
Hands on Mahout!Hands on Mahout!
Hands on Mahout!
 
ComputeFest 2012: Intro To R for Physical Sciences
ComputeFest 2012: Intro To R for Physical SciencesComputeFest 2012: Intro To R for Physical Sciences
ComputeFest 2012: Intro To R for Physical Sciences
 

Recently uploaded

Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendArshad QA
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
Clustering techniques data mining book ....
Clustering techniques data mining book ....Clustering techniques data mining book ....
Clustering techniques data mining book ....ShaimaaMohamedGalal
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
Active Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfActive Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfCionsystems
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 

Recently uploaded (20)

Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and Backend
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
Clustering techniques data mining book ....
Clustering techniques data mining book ....Clustering techniques data mining book ....
Clustering techniques data mining book ....
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Active Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfActive Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdf
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 

visualisasi data praktik pakai excel, py

  • 1. Percobaan Visualisasi Data Menggunakan EXCEL, Octave/Matlab, R, dan Python atjahyanto@gmail.com
  • 2. Linegraph • To visualize the value of something over time • Buka data “Crude Oil Prices.xls” • Buat grafik berikut ini • Date vs Price
  • 3. Bar Chart for Categorical Data • Presents categorical data with rectangular bars with heights or lengths proportional to the values that they represent. • Buka data “Energy Drink Survey.xls” • Buat grafik berikut ini
  • 4. Scatterplot • Displays relationship between two numerical variables • Buka data “House Sales.xls” • Buat grafik berikut ini • Selling Price vs Lot Cost
  • 5. Scatterplot • Displays relationship between two numerical variables • Buka data “Boston Housing.xls” • Buat grafik berikut ini • Lstat vs Medv %% Octave / Matlab clf; medv = [24 21.6 34.7 … lstat= [4.98 9.14 4.03 … x = lstat; y = medv; ukuran = 200; scatter (x, y, ukuran, 0, "filled");
  • 6. Scatterplot with color added • Contoh dengan menggunakan Octave clf; x = [1, 2, 3, 4, 5, 6, 7]; y = [1.9, 1.76, 1.34, 1.67, 1.72, 1.89, 1.91]; warna = [1,1,1,2,2,3,3]; ukuran = 200; scatter (x, y, ukuran, warna, "filled");
  • 7. Scatterplot with color added • Displays the relationship between two numerical variables • Open the data file “Boston Housing.xls” • Create the following chart • Lstat vs Nox
        clf;
        medv = [24 21.6 34.7 …
        lstat= [4.98 9.14 4.03 …
        nox = [0.538 0.469 0.469 …
        med = median(medv);
        warna = ones(size(medv));   % color 1 by default
        iwarna = medv <= med;       % points at or below the median of medv
        warna(iwarna) = 0;          % get color 0
        x = lstat;
        y = nox;
        ukuran = 200;               % marker size
        scatter (x, y, ukuran, warna, "filled");
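The median-split coloring used in the Octave code can be mirrored in Python with NumPy. This is only a sketch: it uses just the first three values of each column that are visible on the slide (the rest are elided there), the function names are hypothetical, and the plotting step assumes matplotlib is installed.

```python
import numpy as np

# Only the first three values shown on the slide; the full columns are elided there.
medv = np.array([24.0, 21.6, 34.7])
lstat = np.array([4.98, 9.14, 4.03])
nox = np.array([0.538, 0.469, 0.469])


def median_split_colors(values):
    """Return 0 where value <= median and 1 otherwise (same rule as the Octave code)."""
    med = np.median(values)
    return np.where(values <= med, 0, 1)


def plot_colored_scatter(x, y, colors, path="scatter.png"):
    """Scatter x against y, coloring each point by its class (needs matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display required
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(x, y, c=colors, s=200)
    ax.set_xlabel("lstat")
    ax.set_ylabel("nox")
    fig.savefig(path)
    plt.close(fig)


warna = median_split_colors(medv)
```

`plot_colored_scatter(lstat, nox, warna)` would then draw Lstat vs Nox with two colors, one for each half of the `medv` distribution.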
  • 8. Scatterplot using R
        data(iris)
        pairs(iris[1:4], main="Iris Data(red=setosa,green=versicolor,blue=virginica)",
              pch=21, bg=c("red","green3","blue")[unclass(iris$Species)])

        > summary(iris)
          Sepal.Length    Sepal.Width     Petal.Length    Petal.Width          Species
         Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100   setosa    :50
         1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300   versicolor:50
         Median :5.800   Median :3.000   Median :4.350   Median :1.300   virginica :50
         Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199
         3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800
         Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500
  • 9. Scatterplot using Python
        # iris data
        import os
        import pandas as pd
        import matplotlib.pyplot as plt

        # Set working directory and load data
        os.chdir('C:myworkspacepython')  # C:myworkspacepython
        irisdata = pd.read_csv('iris-UCI-header.csv')
        iris = pd.read_csv('iris-UCI-header.csv')

        # Create numeric classes for species (0,1,2)
        iris.loc[iris['Species']=='Iris-virginica','Species'] = 0
        iris.loc[iris['Species']=='Iris-versicolor','Species'] = 1
        iris.loc[iris['Species']=='Iris-setosa','Species'] = 2
        iris = iris[iris['Species']!=2]   # drop setosa, keep two classes

        X = iris['PetalLength'].values.T
        Y = iris['PetalWidth'].values.T
        warna = iris[['Species']].values.T   # class labels used as colors
        warna = warna.astype('uint8')

        # Make a scatter plot
        plt.scatter(X, Y, c=warna[0,:], s=40, cmap=plt.cm.Spectral)
        plt.title("IRIS DATA | Blue - Versicolor, Red - Virginica ")
        plt.xlabel('Petal Length')
        plt.ylabel('Petal Width')
        plt.show()
  • 10. Scatterplot for all attributes using Python
        # iris data
        import os
        import pandas as pd
        import matplotlib.pyplot as plt
        import seaborn as sns   # pip install seaborn

        # Set working directory and load data
        os.chdir('C:myworkspacepython')  # C:myworkspacepython
        irisdata = pd.read_csv('iris-UCI-header.csv')
        iris = pd.read_csv('iris-UCI-header.csv')

        # Create numeric classes for species (0,1,2)
        iris.loc[iris['Species']=='Iris-virginica','Species'] = 0
        iris.loc[iris['Species']=='Iris-versicolor','Species'] = 1
        iris.loc[iris['Species']=='Iris-setosa','Species'] = 2

        # Scatter plots of all pairs of attributes
        # (`height` replaces the `size` argument of older seaborn releases)
        plt.close()
        sns.pairplot(irisdata, hue='Species', height=2, diag_kind='kde')
        plt.show()
  • 11. Histogram • Displays the distribution of the outcome variable • Shows how many times each value (or range of values) occurs in a data set • Open the data file “House Sales.xls” • Create the following chart
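The "how many of each value" idea behind a histogram can be made concrete by computing the bin counts directly. The sketch below uses NumPy on a small invented sample (not values from “House Sales.xls”); the number of bins is an arbitrary choice for illustration.

```python
import numpy as np

# Invented sample values (illustrative only).
values = np.array([1, 2, 2, 3, 3, 3, 4, 4, 5])

# Split the range [min, max] into 4 equal-width bins and count the values in each.
# Every bin is half-open [left, right) except the last, which includes its right edge.
counts, edges = np.histogram(values, bins=4)

for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:.1f} to {right:.1f}: {count}")
```

The bar heights of the plotted histogram are exactly these counts; changing `bins` changes how the same data is grouped, so the shape of a histogram depends on that choice.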
  • 12. Histogram using R • Displays the distribution of the outcome variable
        # try a histogram
        HouseSales <- read.csv("House Sales.csv", header=TRUE)
        hist(HouseSales$SellingPrice)
  • 13. Histogram • Displays the distribution of the outcome variable • Shows how many times each value (or range of values) occurs in a data set • Open the data file “Boston Housing.xls” • Create the following chart
        # try a histogram
        BostonHousing <- read.csv("Boston Housing.csv", header=TRUE)
        hist(BostonHousing$medv)
  • 14. Histogram using Python
        import os
        import pandas as pd
        import matplotlib.pyplot as plt   # needed for plt.gcf() and plt.show()

        # Set working directory and load data
        os.chdir('C:myworkspacepython')  # C:myworkspacepython
        irisdata = pd.read_csv('iris-UCI-header.csv')
        iris = pd.read_csv('iris-UCI-header.csv')

        # Create numeric classes for species (0,1,2)
        iris.loc[iris['Species']=='Iris-virginica','Species'] = 0
        iris.loc[iris['Species']=='Iris-versicolor','Species'] = 1
        iris.loc[iris['Species']=='Iris-setosa','Species'] = 2

        # Histograms of the distribution of the input attributes
        irisdata.hist()
        his = plt.gcf()
        his.set_size_inches(12, 6)
        plt.show()
  • 15. Boxplot • Depicts groups of numerical data through their quartiles • Useful for comparing subgroups • Open the data file “President's Inn Guest Database.xls” • Create the following chart
  • 16. Boxplot • Depicts groups of numerical data through their quartiles • Useful for comparing subgroups • Open the data file “Boston Housing.xls” • Create the following chart
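The quantities a boxplot draws can be computed by hand, which helps when reading one. The sketch below uses NumPy on an invented sample and the common 1.5 × IQR rule for flagging outliers; the function name is hypothetical, and because tools interpolate quartiles differently, Excel or R may report slightly different quartile values for the same data.

```python
import numpy as np


def box_stats(values):
    """Return the quartiles, IQR-based fences, and outliers that a boxplot shows."""
    data = np.asarray(values, dtype=float)
    q1, median, q3 = np.percentile(data, [25, 50, 75])
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = data[(data < lower_fence) | (data > upper_fence)]
    return {
        "q1": float(q1),
        "median": float(median),
        "q3": float(q3),
        "lower_fence": float(lower_fence),
        "upper_fence": float(upper_fence),
        "outliers": outliers.tolist(),
    }


# Invented sample with one unusually large value (illustrative only).
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
stats = box_stats(sample)
```

The box spans `q1` to `q3`, the line inside it marks the median, and points beyond the fences are drawn individually as outliers.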
  • 17. Boxplot using R • Depicts groups of numerical data through their quartiles • Useful for comparing subgroups
        # try a boxplot of the medv column
        BostonHousing <- read.csv("Boston Housing.csv", header=TRUE)
        boxplot(BostonHousing$medv)

        # try boxplots of the medv and lstat columns side by side
        BostonHousing <- read.csv("Boston Housing.csv", header=TRUE)
        boxplot(BostonHousing$medv, BostonHousing$lstat,
                main = "Multiple boxplots for comparison",
                at = c(1,2),
                names = c("medv", "lstat"))
  • 18. Boxplot using Python
        import os
        import pandas as pd
        import matplotlib.pyplot as plt

        # Set working directory and load data
        os.chdir('C:myworkspacepython')  # C:myworkspacepython
        irisdata = pd.read_csv('iris-UCI-header.csv')
        iris = pd.read_csv('iris-UCI-header.csv')

        # Create numeric classes for species (0,1,2)
        iris.loc[iris['Species']=='Iris-virginica','Species'] = 0
        iris.loc[iris['Species']=='Iris-versicolor','Species'] = 1
        iris.loc[iris['Species']=='Iris-setosa','Species'] = 2

        # Box-and-whisker plots (give an idea of the distribution of the input attributes)
        irisdata.plot(kind='box', subplots=True, layout=(2, 2), sharex=False, sharey=False)
        plt.show()
  • 19. Heatmap • Correlation matrix • To highlight correlations • Open the data file “Boston Housing.xls” • Create the following chart
    Load the Analysis ToolPak in Excel
    • Click the File tab, click Options, and then click the Add-Ins category.
    • If you're using Excel 2007, click the Microsoft Office Button, and then click Excel Options.
    • In the Manage box, select Excel Add-ins and then click Go.
    • If you're using Excel for Mac, in the file menu go to Tools > Excel Add-ins.
    • In the Add-Ins box, check the Analysis ToolPak check box, and then click OK.
    • If Analysis ToolPak is not listed in the Add-Ins available box, click Browse to locate it.
    • If you are prompted that the Analysis ToolPak is not currently installed on your computer, click Yes to install it.
    Compute the correlation matrix
    • Select Data > Data Analysis > Correlation
    • Select the input range; in our case, columns F:Q
    • Check the box for “Labels in first row”
    • Select the output: either a new worksheet or a location in the current sheet
                  crim        zn     indus      chas       nox        rm       age       dis       rad       tax   ptratio         b     lstat      medv
    crim             1
    zn        -0,20047         1
    indus     0,406583  -0,53383         1
    chas      -0,05589   -0,0427  0,062938         1
    nox       0,420972   -0,5166  0,763651  0,091203         1
    rm        -0,21925  0,311991  -0,39168  0,091251  -0,30219         1
    age       0,352734  -0,56954  0,644779  0,086518   0,73147  -0,24026         1
    dis       -0,37967  0,664408  -0,70803  -0,09918  -0,76923  0,205246  -0,74788         1
    rad       0,625505  -0,31195  0,595129  -0,00737  0,611441  -0,20985  0,456022  -0,49459         1
    tax       0,582764  -0,31456   0,72076  -0,03559  0,668023  -0,29205  0,506456  -0,53443  0,910228         1
    ptratio   0,289946  -0,39168  0,383248  -0,12152  0,188933   -0,3555  0,261515  -0,23247  0,464741  0,460853         1
    b         -0,38506   0,17552  -0,35698  0,048788  -0,38005  0,128069  -0,27353  0,291512  -0,44441  -0,44181  -0,17738         1
    lstat     0,455621  -0,41299    0,6038  -0,05393  0,590879  -0,61381  0,602339    -0,497  0,488676  0,543993  0,374044  -0,36609         1
    medv       -0,3883  0,360445  -0,48373   0,17526  -0,42732   0,69536  -0,37695  0,249929  -0,38163  -0,46854  -0,50779  0,333461  -0,73766         1
    Turn the matrix into a heatmap
    • Conditional formatting • Color scales
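Each cell of the correlation matrix above is a Pearson correlation coefficient between two columns. As a sketch of what the Excel Correlation tool computes, the code below derives the coefficient from its definition and checks it against NumPy's `corrcoef`; the three short columns are invented for illustration (they are not Boston Housing data) and the function names are hypothetical.

```python
import numpy as np


def pearson(x, y):
    """Pearson r: covariance of x and y divided by the product of their spreads."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))


def correlation_matrix(columns):
    """Build the symmetric matrix of pairwise correlations for named columns."""
    names = list(columns)
    n = len(names)
    mat = np.eye(n)  # every column correlates perfectly with itself
    for i in range(n):
        for j in range(i):
            r = pearson(columns[names[i]], columns[names[j]])
            mat[i, j] = mat[j, i] = r
    return names, mat


# Invented columns: b rises with a, c falls as a rises (illustrative only).
sample = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10],
    "c": [5, 4, 3, 2, 1],
}
names, corr = correlation_matrix(sample)
```

Values near 1 or -1 indicate a strong linear relationship and values near 0 a weak one, which is what the color scale of the heatmap makes visible at a glance. Excel prints only the lower triangle because the matrix is symmetric.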
  • 20. Heatmap using R
        bostonhousing <- read.csv("Boston Housing.csv", header=TRUE)
        x <- as.matrix(bostonhousing)
        xx <- cor(x)   # correlation matrix
        my_palette <- colorRampPalette(c("red", "blue", "yellow"))(n = 256)
        your_palette <- cm.colors(256)
        rc <- rainbow(nrow(xx), start = 0, end = .3)
        cc <- rainbow(ncol(xx), start = 0, end = .3)
        hv <- heatmap(xx, col = my_palette, scale = "column",
                      RowSideColors = rc, ColSideColors = cc, margins = c(5,10),
                      xlab = "xlabel", ylab = "ylabel",
                      main = "heatmap")
        utils::str(hv)   # the two re-ordering index vectors
  • 21. Heatmap using Python
        # iris data
        import os
        import pandas as pd
        import matplotlib.pyplot as plt
        import seaborn as sns   # needed for sns.heatmap

        # Set working directory and load data
        os.chdir('C:myworkspacepython')  # C:myworkspacepython
        irisdata = pd.read_csv('iris-UCI-header.csv')
        iris = pd.read_csv('iris-UCI-header.csv')

        # Create numeric classes for species (0,1,2)
        iris.loc[iris['Species']=='Iris-virginica','Species'] = 0
        iris.loc[iris['Species']=='Iris-versicolor','Species'] = 1
        iris.loc[iris['Species']=='Iris-setosa','Species'] = 2

        # Correlate the numeric columns only; the text column 'Species' is dropped first
        plt.figure(figsize=(7,5))
        sns.heatmap(irisdata.drop(columns='Species').corr(), annot=True, cmap='RdYlGn_r')
        plt.show()
  • 22. Treemaps • Show the size of each data item as an area: the larger a rectangle, the larger (more important) the value it represents • Open the data file “daftar-file.xls” • Create the following chart
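The area rule of a treemap can be illustrated with a very small layout routine. The sketch below is a single-level "slice" layout written for illustration only: it cuts one rectangle into side-by-side strips whose areas are proportional to the given sizes. It is not the algorithm used by the R `treemap` package or by `squarify`, which arrange the rectangles to be closer to squares; the function name and the sizes are made up.

```python
def slice_layout(sizes, x=0.0, y=0.0, width=1.0, height=1.0):
    """Cut a rectangle into vertical strips with areas proportional to `sizes`."""
    total = sum(sizes)
    rects = []
    left = x
    for size in sizes:
        strip_width = width * size / total  # share of the total width
        rects.append({"x": left, "y": y, "dx": strip_width, "dy": height})
        left += strip_width
    return rects


# Invented file sizes (illustrative only).
sizes = [50, 30, 20]
rects = slice_layout(sizes, width=10.0, height=5.0)
areas = [r["dx"] * r["dy"] for r in rects]
```

With a 10 × 5 canvas the three strips get areas 25, 15, and 10, the same 50 : 30 : 20 proportions as the input sizes.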
  • 23. Treemaps using R • Show the size of each data item as an area: the larger a rectangle, the larger (more important) the value it represents • Open the data file “daftar-file.csv”
        # install.packages("treemap")
        library(treemap)
        dataku <- read.csv("daftar-file.csv", header=TRUE)
        treemap(dataku,
                index=c("subdir2", "namafile"),
                vSize="ukuran",
                vColor="ukuran",
                type="value",
                format.legend = list(scientific = FALSE, big.mark = " "))
  • 24. Treemaps using R • Show the size of each data item as an area: the larger a rectangle, the larger (more important) the value it represents
        library(treemap)
        data(GNI2014)
        treemap(GNI2014,
                index=c("continent", "iso3"),
                vSize="population",
                vColor="GNI",
                type="value",
                format.legend = list(scientific = FALSE, big.mark = " "))

        iso3  country  continent      population  GNI
        BMU   Bermuda  North America       67837  106140
        NOR   Norway   Europe            4676305  103630
  • 25. Treemap using Python
        import os
        import pandas as pd
        import matplotlib
        import matplotlib.pyplot as plt
        import squarify

        # Set working directory and load data
        os.chdir('C:myworkspacepython')  # C:myworkspacepython
        datax = pd.read_csv('daftar-file.csv')

        # Keep only files larger than 25K
        mydata = datax[datax["ukuran"]>25]

        # Use matplotlib to scale the file sizes between their min and max, then map that scale to colors
        norm = matplotlib.colors.Normalize(vmin=min(mydata.ukuran), vmax=max(mydata.ukuran))
        colors = [matplotlib.cm.Blues(norm(value)) for value in mydata.ukuran]

        # Create the plot and resize it
        fig = plt.gcf()
        ax = fig.add_subplot()
        fig.set_size_inches(16, 4.5)

        # Use squarify to plot the data, label it, and add colors; alpha keeps the black labels readable
        squarify.plot(label=mydata.namafile, sizes=mydata.ukuran, color=colors, alpha=.6)
        plt.title("My files larger than 25K", fontsize=23, fontweight="bold")

        # Remove the axes and display the plot
        plt.axis('off')
        plt.show()
  • 26. Descriptive statistics for Iris Dataset Using Python
        import os
        import pandas as pd

        # Set working directory and load data
        os.chdir('C:myworkspacepython')  # C:myworkspacepython
        irisdata = pd.read_csv('iris-UCI-header.csv')
        iris = pd.read_csv('iris-UCI-header.csv')

        # Create numeric classes for species (0,1,2)
        iris.loc[iris['Species']=='Iris-virginica','Species'] = 0
        iris.loc[iris['Species']=='Iris-versicolor','Species'] = 1
        iris.loc[iris['Species']=='Iris-setosa','Species'] = 2

        # print() makes the results visible when run as a script rather than interactively
        print(iris.shape)        # (rows, columns)
        print(iris.describe())   # summary statistics of the numeric columns
        irisdata.info()          # column types and non-null counts
        print(irisdata.head())   # first five rows
        print(irisdata[irisdata['Species']=='Iris-virginica'].describe())
        print(irisdata.groupby('Species').size())   # number of rows per species