Data Visualization
SEJI OH
JULY 20, 2018
CONTENTS
• What Happened to Napoleon’s troops?
– Minard’s plot
– Dataset
– Reproduction of the plot using R
• What Can We Do With Game Log Data?
– Visualizing StarCraft with R
JULY 20, 2018 ©SEJI OH PAGE 2
• Regression Analysis
– Dataset
– Simple Linear Regression Model
– Multiple Regression Model
References
What Happened to Napoleon’s troops?
• Minard’s plot [1]
• Dataset: Napoleon’s March [2]
• Let’s reproduce Minard’s plot in R with the ggplot2 package [3].
What Can We Do With Game Log Data?
• Visualizing StarCraft with R [4][5]
• Visualizing StarCraft with R [4][5]: colored by unitID
Regression Analysis
• Dataset: diamonds, in the package ggplot2
• The goal is to build a multiple regression model like this one. (Drawn with ggplot2)
• Dataset: diamonds in the package ggplot2
• The tidyverse package and its family of packages are used in this analysis.
• Holdout validation (a single train/test split) is applied.
• The data are randomly sampled into a training set (70%) and a test set (30%).
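The 70/30 holdout split described above can be sketched as follows. The slides work in R with the tidyverse; this is only a minimal stdlib Python illustration. The function name `holdout_split`, the fixed seed, and the 100 placeholder rows are assumptions made for the example, not part of the original analysis.

```python
import random

def holdout_split(rows, train_fraction=0.7, seed=42):
    """Shuffle a copy of `rows` and split it into (train, test)."""
    shuffled = list(rows)                  # leave the caller's data untouched
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Placeholder rows standing in for the records of the real dataset.
rows = list(range(100))
train, test = holdout_split(rows)
print(len(train), len(test))
```

Every row ends up in exactly one of the two sets, so the test set stays unseen while the model is fitted.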
• Check the principal components of the data.
• Draw a plot that shows the relationship between variables.
• Or calculate the Pearson correlation coefficient.
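As a rough illustration of the last bullet, the Pearson correlation coefficient is the covariance of two variables divided by the product of their standard deviations. The sketch below computes it by hand in Python; the `price` and `carat` values are made up for the example and are not rows from the diamonds data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centered products; the common 1/n factor cancels in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up values for illustration only.
price = [326.0, 1200.0, 2757.0, 5324.0, 9800.0]
carat = [0.23, 0.50, 0.70, 1.00, 1.50]
r = pearson_r(price, carat)
print(round(r, 3))
```

A value near +1 or -1 suggests a strong linear relationship; a value near 0 suggests little linear relationship.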
Simple linear model
• Independent variable: price
• Response variable: carat
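A simple linear model of this form, carat = b0 + b1 * price, can be fitted with the closed-form least-squares formulas. The following Python sketch assumes ordinary least squares and uses made-up data points; the helper name `fit_simple_ols` is hypothetical and the slides' own R code is not shown in the source.

```python
def fit_simple_ols(xs, ys):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Made-up (price, carat) values for illustration only.
price = [500.0, 1000.0, 2000.0, 4000.0, 8000.0]
carat = [0.30, 0.45, 0.70, 1.00, 1.50]
b0, b1 = fit_simple_ols(price, carat)
print(f"carat = {b0:.4f} + {b1:.6f} * price")
```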
Simple regression model
• A power transformation is applied.
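The slide does not say which variable is transformed or which exponent is used. As one possible reading, the sketch below raises the predictor to a power before fitting a straight line and compares the fit error for two exponents. The data are synthetic (carat is generated as 0.02 * sqrt(price)), and the helper names are hypothetical.

```python
import math

def fit_simple_ols(xs, ys):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def rmse_after_power(xs, ys, power):
    """Fit y = b0 + b1 * x**power and return the RMSE of that fit."""
    transformed = [x ** power for x in xs]
    b0, b1 = fit_simple_ols(transformed, ys)
    return rmse(ys, [b0 + b1 * t for t in transformed])

# Synthetic data: carat grows with the square root of price (an assumption).
price = [400.0, 900.0, 1600.0, 2500.0, 3600.0, 4900.0]
carat = [0.02 * math.sqrt(p) for p in price]

for power in (1.0, 0.5):
    print(power, rmse_after_power(price, carat, power))
```

On data that really follow a power law, the matching exponent straightens the relationship, so the linear fit leaves much smaller residuals than the untransformed fit.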
Multiple regression model
• Independent variables: price, x, y, z
• RMSE = 0.0843145
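For readers who want to see what sits behind these two bullets, here is a minimal Python sketch of a multiple least-squares fit (solved through the normal equations) and of the RMSE, the square root of the mean squared residual. It uses two made-up predictors instead of the four on the slide, and all function names are hypothetical; the slides themselves use R.

```python
import math

def solve_linear_system(a, b):
    """Solve a * coef = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        known = sum(m[i][j] * coef[j] for j in range(i + 1, n))
        coef[i] = (m[i][n] - known) / m[i][i]
    return coef

def fit_multiple_ols(rows, ys):
    """Return [intercept, b1, ..., bk] minimizing the squared error."""
    design = [[1.0] + list(row) for row in rows]  # prepend intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def predict(coef, row):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Made-up data with two predictors, generated as y = 1 + 2*a + 3*b.
rows = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
ys = [1.0 + 2.0 * a + 3.0 * b for a, b in rows]

coef = fit_multiple_ols(rows, ys)
fit_rmse = rmse(ys, [predict(coef, row) for row in rows])
print(coef, fit_rmse)
```

The same functions accept rows with four columns, such as price, x, y and z. Strongly correlated predictors can make the normal equations ill-conditioned, so a library routine is the safer choice on real data.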
Multiple regression model
• A normal-distribution predictor is applied.
• RMSE = 0.08431445
Multiple regression model
• A power transformation is applied.
• RMSE = 0.01722037
Multiple regression model
• Validate the model with the test set.
• RMSE = 0.01722037
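Validating on the test set means fitting on the training rows only and then measuring the error on rows the model has never seen. The Python sketch below shows that workflow on synthetic data with a single predictor; the data-generating line, the noise level, and the helper names are all assumptions made for the example.

```python
import math
import random

rng = random.Random(0)

# Synthetic (x, y) pairs standing in for the real data: y = 0.5 + 0.2 * x + noise.
data = []
for _ in range(100):
    x = rng.uniform(0.0, 10.0)
    data.append((x, 0.5 + 0.2 * x + rng.gauss(0.0, 0.05)))

# 70/30 holdout split.
rng.shuffle(data)
train, test = data[:70], data[70:]

def fit_simple_ols(pairs):
    """Return (intercept, slope) of the least-squares line through (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def rmse(pairs, intercept, slope):
    """RMSE of the line's predictions on the given (x, y) pairs."""
    return math.sqrt(sum((y - (intercept + slope * x)) ** 2 for x, y in pairs) / len(pairs))

intercept, slope = fit_simple_ols(train)    # fit on the training set only
train_rmse = rmse(train, intercept, slope)
test_rmse = rmse(test, intercept, slope)    # evaluate on unseen rows
print(f"train RMSE = {train_rmse:.4f}, test RMSE = {test_rmse:.4f}")
```

A test RMSE close to the training RMSE suggests that the model generalizes; a much larger test RMSE would point to overfitting.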
References
[1] Wikipedia, Charles Joseph Minard
https://en.wikipedia.org/wiki/Charles_Joseph_Minard
[2] The Grammar of Graphics, 2nd ed., Leland Wilkinson, SPSS Inc.
[3] A Layered Grammar of Graphics, Hadley Wickham
http://vita.had.co.nz/papers/layered-grammar.pdf
[4] Visualizing Professional StarCraft with R
https://towardsdatascience.com/visualizing-professional-starcraft-with-r-598b5e7a82ac
[5] StarCraftMining, GitHub
https://github.com/bgweber/StarCraftMining
