Recap
Getting data in R
Do it yourself!
Plotting using ggplot2
Examining data and importing data in R
Richard L. Zijdeman
May 29, 2015
Richard L. Zijdeman Examining data and importing data in R
1 Recap
2 Getting data in R
3 Do it yourself!
4 Plotting using ggplot2
Recap
The structure of objects
Store just about anything in R: numbers, sentences, datasets
These stored things are called objects
Study the structure of objects: str()
type of object
features of object
ships <- data.frame(year = c(1850, 1860, 1870, 1880),
                    inbound = c(215, 237, 237, NA),
                    outbound = c(212, 239, 260, 265))
Study the structure of the object “ships”
str(ships)
## 'data.frame': 4 obs. of 3 variables:
## $ year : num 1850 1860 1870 1880
## $ inbound : num 215 237 237 NA
## $ outbound: num 212 239 260 265
Characteristics of objects
Class: class()
Length: length()
Dimensions: dim()
class(ships)
## [1] "data.frame"
length(ships)
## [1] 3
dim(ships) # rows, columns
## [1] 4 3
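A small sketch of how these functions relate for a data.frame, reusing the ships example. The helpers nrow() and ncol() do not appear on the slide; they are base R functions added here for illustration.

```r
# Rebuild the example data.frame so the snippet runs on its own
ships <- data.frame(year = c(1850, 1860, 1870, 1880),
                    inbound = c(215, 237, 237, NA),
                    outbound = c(212, 239, 260, 265))

# A data.frame is a list of columns, so length() counts columns
length(ships) == ncol(ships)

# dim() returns the number of rows first, then the number of columns
dim(ships)[1] == nrow(ships)

# A single column is a plain vector: its length is the number of rows
length(ships$year)
```

This is why length(ships) gives 3 rather than 4: it counts variables, not observations.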
Closer inspection of data.frames
names of columns (variables): names()
top/bottom rows: head(), tail()
missing data: is.na()
names(ships)
## [1] "year" "inbound" "outbound"
is.na(ships)
## year inbound outbound
## [1,] FALSE FALSE FALSE
## [2,] FALSE FALSE FALSE
## [3,] FALSE FALSE FALSE
## [4,] FALSE TRUE FALSE
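The slide lists head() and tail() without showing them, so here is a minimal sketch on the same ships data. colSums() and complete.cases() are not mentioned in the slides; they are base R functions used here as one possible way to condense the is.na() matrix.

```r
# Rebuild the example data.frame so the snippet runs on its own
ships <- data.frame(year = c(1850, 1860, 1870, 1880),
                    inbound = c(215, 237, 237, NA),
                    outbound = c(212, 239, 260, 265))

head(ships, n = 2)  # first two rows
tail(ships, n = 1)  # last row

# Count missing values per column instead of reading the full TRUE/FALSE matrix
na_per_column <- colSums(is.na(ships))
na_per_column

# Keep only the rows without any missing value
complete_rows <- ships[complete.cases(ships), ]
complete_rows
```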
Summarizing data in data.frames
descriptive statistics: summary()
calculations: e.g. min(), mean(), sum()
frequency tables: table()
summary(ships)
## year inbound outbound
## Min. :1850 Min. :215.0 Min. :212.0
## 1st Qu.:1858 1st Qu.:226.0 1st Qu.:232.2
## Median :1865 Median :237.0 Median :249.5
## Mean :1865 Mean :229.7 Mean :244.0
## 3rd Qu.:1872 3rd Qu.:237.0 3rd Qu.:261.2
## Max. :1880 Max. :237.0 Max. :265.0
## NA's :1
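One point worth illustrating: summary() skips the missing value in inbound on its own, but min(), mean() and sum() return NA unless told otherwise. The sketch below uses the na.rm argument of these base R functions; the variable names are chosen here for illustration.

```r
# Rebuild the example data.frame so the snippet runs on its own
ships <- data.frame(year = c(1850, 1860, 1870, 1880),
                    inbound = c(215, 237, 237, NA),
                    outbound = c(212, 239, 260, 265))

# With a missing value present, the result is itself missing
mean(ships$inbound)

# na.rm = TRUE drops missing values before calculating
mean_inbound <- mean(ships$inbound, na.rm = TRUE)
mean_inbound

# Columns without missing values need no special treatment
sum_outbound <- sum(ships$outbound)
min_year <- min(ships$year)
```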
is.na(ships)
## year inbound outbound
## [1,] FALSE FALSE FALSE
## [2,] FALSE FALSE FALSE
## [3,] FALSE FALSE FALSE
## [4,] FALSE TRUE FALSE
table(is.na(ships))
##
## FALSE TRUE
## 11 1
Visualizing your data
Not just for analyses!
Data quality
representativeness
missing data
plot(ships)
[Figure: scatterplot matrix produced by plot(ships), with panels for year, inbound and outbound]
Getting data in R
Data already in R
The “datasets” package
very slim datasets
specific example data
To obtain a list of datasets, type:
library(help = "datasets")
To obtain information on a specific dataset, type:
help(swiss) # thus: help(name_of_dataset)
or to just see the data:
swiss
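A short sketch of exploring the built-in data without opening help pages. It assumes only base R and the datasets package that ships with it; the object name available is made up for this example.

```r
# data() returns an object describing the datasets in a package
available <- data(package = "datasets")
head(available$results[, "Item"])  # first few dataset names

# Look at the swiss data without printing all of it
head(swiss)  # first six rows
str(swiss)   # type and structure of each variable
dim(swiss)   # rows, columns
```

Typing just swiss prints the whole data.frame, which is fine for a dataset this small but quickly becomes unreadable for larger ones.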
Reading in data
Different functions for different files:
Base R: read.table() (read.csv())
foreign package: read.spss(), read.dta(), read.dbf()
openxlsx package: read.xlsx()
alternative packages:
xlsx (Java required)
gdata (Perl-based)
read.xlsx() from openxlsx package
xlsxFile: your file, including directory
sheet: name or index of the sheet
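A minimal round-trip sketch, assuming the openxlsx package is installed. Because the course file is not available here, the example first writes the ships data to a temporary .xlsx file with write.xlsx() and then reads it back; the temporary file and the name ships_xlsx are illustrative only.

```r
# Rebuild the example data.frame so the snippet runs on its own
ships <- data.frame(year = c(1850, 1860, 1870, 1880),
                    inbound = c(215, 237, 237, NA),
                    outbound = c(212, 239, 260, 265))

# Only run when openxlsx is available (assumption: it may not be installed)
if (requireNamespace("openxlsx", quietly = TRUE)) {
  tmp <- tempfile(fileext = ".xlsx")
  openxlsx::write.xlsx(ships, tmp)                    # write a one-sheet workbook
  ships_xlsx <- openxlsx::read.xlsx(tmp, sheet = 1)   # read the first sheet back
  str(ships_xlsx)
}
```

For a real file, replace tmp with the path to the workbook and set sheet to the sheet's name or position.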
read.csv()
file: your file, including directory
header: variable names or not?
sep: separator
read.csv default: “,”
read.csv2 default: “;”
skip: number of rows to skip
nrows: maximum number of rows to read
stringsAsFactors: convert text columns to factors?
encoding (e.g. “latin1” or “UTF-8”)
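To see these arguments in action, the sketch below writes a small semicolon-separated file to a temporary location and reads it back. The file contents, including the note on its first line, are invented for this example.

```r
# Create a hypothetical semicolon-separated file with one line of notes on top
tmp <- tempfile(fileext = ".csv")
writeLines(c("hypothetical export note (not data)",
             "year;inbound;outbound",
             "1850;215;212",
             "1860;237;239",
             "1870;237;260",
             "1880;;265"),
           tmp)

# skip the note, treat the next line as header, read only three data rows
first_rows <- read.csv(tmp, sep = ";", header = TRUE, skip = 1,
                       nrows = 3, stringsAsFactors = FALSE)
first_rows

# read.csv2() already assumes ";" as separator; the empty field becomes NA
all_rows <- read.csv2(tmp, skip = 1)
all_rows
```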
Do it yourself!
Read in the following files as data.frames:
HSN_basic.xlsx
check the data.frame: using dim(), length()
check the variables: using summary(), min(), table()
Repeat for HSN_marriages.csv:
read in only 100 lines
Plotting using ggplot2
ggplot2
Package by Hadley Wickham
Generic plotting for a great range of plots
ggplot2 website: http://ggplot2.org
excellent tutorial:
https://jofrhwld.github.io/avml2012/#Section_1.1
Building your graph
Each plot consists of multiple layers
Think of a canvas on which you ‘paint’
data layer
geometries layer
statistics layer
Data layer
data.frame and aesthetics
ggplot(data.frame, aes(x= ..., y = ...))
geometries layer
ggplot(..., aes(x= ..., y = ...)) +
geom_...() # e.g. geom_line
statistics layer
ggplot(..., aes(x= ..., y = ...)) +
geom_...() +
stat_...() # e.g. stat_smooth
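The three templates above can be filled in with a concrete case. This sketch assumes ggplot2 is installed and uses the built-in swiss dataset rather than the course data; the choice of Education and Fertility, and of a linear smoother, is only for illustration.

```r
library(ggplot2)

p <- ggplot(swiss, aes(x = Education, y = Fertility)) +  # data layer
  geom_point() +                                          # geometries layer
  stat_smooth(method = "lm", formula = y ~ x)             # statistics layer

# Each geom_...() or stat_...() call adds one layer to the plot object
length(p$layers)
```

Typing p at the console draws the plot. Storing it in an object first makes it easy to add further layers later.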
an example
Reading in the data
hmar <- read.csv("./../data/derived/HSN_marriages.csv",
                 stringsAsFactors = FALSE,
                 encoding = "latin1",
                 header = TRUE,
                 nrows = 100)
Plotting the data
install.packages("ggplot2") # package name in quotes; only needed once
library(ggplot2)
ggplot(hmar, aes(x = M_year, y = Age_bride)) +
  geom_point()
[Figure: scatter plot of Age_bride (y-axis) against M_year (x-axis)]
Improving the plot
Specify characteristics of the geom_layer
ggplot(hmar, aes(x = M_year, y = Age_bride)) +
  geom_point(colour = "blue", size = 3, shape = 18)
See http://www.cookbook-r.com/Graphs/Shapes_and_line_types/
Specify characteristics of the geom_layer
[Figure: scatter plot of Age_bride (y-axis) against M_year (x-axis), now with the blue, larger, diamond-shaped points specified above]
A PTE example
Does age at marriage depend on educational attainment?
To marry you need resources
the higher the attainment, the longer it takes to acquire resources
ergo: brides with higher educational attainment marry later in life
Not a statistical test, but let’s graph this
A request from yesterday
Can I plot labels?
ggplot(hmar, aes(x = M_year, y = Age_bride,
                 label = SIgn_bride)) +
  geom_text()
Yes you can!
Not really useful though...
[Figure: Age_bride (y-axis) against M_year (x-axis), each observation drawn as its SIgn_bride label, "a" or "h"]
Let’s try with colours...
ggplot(hmar, aes(x = M_year, y = Age_bride)) +
  geom_point(aes(colour = factor(SIgn_bride)),
             size = 3, shape = 18)
[Figure: Age_bride (y-axis) against M_year (x-axis), points coloured by factor(SIgn_bride), with legend entries "a" and "h"]
No real pattern, though...
Finalizing the graph
ggplot(hmar, aes(x = M_year, y = Age_bride)) +
  geom_point(aes(colour = factor(SIgn_bride)),
             size = 3,
             shape = 18) +
  # pass the labels to labs() as named arguments
  labs(title = "Age of marriage over time",
       x = "time (years since A.D.)",
       y = "age of bride (years)",
       colour = "Signature")
# here we use colour since legend shows colour
[Figure: "Age of marriage over time"; x-axis: time (years since A.D.), y-axis: age of bride (years), colour legend "Signature" with entries "a" and "h"]
Satisfied?
Actually not... the points are plotted on top of each other...
Solution: geom_jitter
ggplot(hmar, aes(x = M_year, y = Age_bride)) +
  geom_jitter(aes(colour = factor(SIgn_bride)),
              size = 3,
              shape = 18) +
  # pass the labels to labs() as named arguments
  labs(title = "Age of marriage over time",
       x = "time (years since A.D.)",
       y = "age of bride (years)",
       colour = "Signature")
# here we use colour since legend shows colour
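To make the effect of jittering visible without the HSN file, here is a sketch on invented data in which several marriages share the same year and age. The toy data.frame, the seed and the width and height values are assumptions chosen for illustration; layer_data() is a ggplot2 helper that returns the coordinates a layer actually draws.

```r
library(ggplot2)

# Hypothetical data: eight observations on only four distinct positions
toy <- data.frame(M_year = rep(c(1850, 1860), each = 4),
                  Age_bride = rep(c(25, 30), times = 4))

p_point <- ggplot(toy, aes(x = M_year, y = Age_bride)) +
  geom_point()

set.seed(1)  # make the random displacement reproducible
p_jitter <- ggplot(toy, aes(x = M_year, y = Age_bride)) +
  geom_jitter(width = 0.5, height = 0.5)  # small random shift in both directions

# Count how many distinct positions each plot draws
n_point <- nrow(unique(layer_data(p_point)[, c("x", "y")]))
n_jitter <- nrow(unique(layer_data(p_jitter)[, c("x", "y")]))
c(points = n_point, jittered = n_jitter)
```

Keep width and height small relative to the axis units: jittering trades exact positions for visibility, so large values can misrepresent the data.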
[Figure: "Age of marriage over time" with jittered points; x-axis: time (years since A.D.), y-axis: age of bride (years), colour legend "Signature" with entries "a" and "h"]
Final remarks on ggplot2
We have just scratched the surface of ggplot2
Build your graph slowly
start with the basics
add complexity step-wise
Now it’s your turn!
A small PTE project
Look at the variables in the HSN files
Think of a research question
Provide a general mechanism and hypothesis
Plot your results
Introduction into R for historians (part 3: examine and import data)

  • 1. Examining data and importing data in R. Richard L. Zijdeman, May 29, 2015. Sections: Recap; Getting data in R; Do it yourself!; Plotting using ggplot2.
  • 2. Outline: 1 Recap; 2 Getting data in R; 3 Do it yourself!; 4 Plotting using ggplot2.
  • 3. Recap
  • 4. The structure of objects. Store just about anything in R (numbers, sentences, datasets) in objects. Study the structure of objects with str(): type of object, features of object.
      ships <- data.frame(year = c(1850, 1860, 1870, 1880),
                          inbound = c(215, 237, 237, NA),
                          outbound = c(212, 239, 260, 265))
  • 5. Study the structure of object “ships”:
      str(ships)
      ## 'data.frame': 4 obs. of 3 variables:
      ##  $ year    : num 1850 1860 1870 1880
      ##  $ inbound : num 215 237 237 NA
      ##  $ outbound: num 212 239 260 265
  • 6. Characteristics of objects. Class: class(). Length: length(). Dimensions: dim().
      class(ships)
      ## [1] "data.frame"
      length(ships) # for a data.frame: number of columns
      ## [1] 3
      dim(ships) # rows, columns
      ## [1] 4 3
  • 7. Closer inspection of data.frames. Names of columns (variables): names(). Top/bottom rows: head(), tail(). Missing data: is.na().
      names(ships)
      ## [1] "year" "inbound" "outbound"
      is.na(ships)
      ##       year inbound outbound
      ## [1,] FALSE   FALSE    FALSE
      ## [2,] FALSE   FALSE    FALSE
      ## [3,] FALSE   FALSE    FALSE
      ## [4,] FALSE    TRUE    FALSE
  • 8. Summarizing data in data.frames. Descriptive statistics: summary(). Calculations: e.g. min(), mean(), sum(). Results in table format: table().
      summary(ships)
      ##       year         inbound         outbound
      ##  Min.   :1850   Min.   :215.0   Min.   :212.0
      ##  1st Qu.:1858   1st Qu.:226.0   1st Qu.:232.2
      ##  Median :1865   Median :237.0   Median :249.5
      ##  Mean   :1865   Mean   :229.7   Mean   :244.0
      ##  3rd Qu.:1872   3rd Qu.:237.0   3rd Qu.:261.2
      ##  Max.   :1880   Max.   :237.0   Max.   :265.0
      ##                 NA's   :1
  • 9. Counting missing values:
      is.na(ships)
      ##       year inbound outbound
      ## [1,] FALSE   FALSE    FALSE
      ## [2,] FALSE   FALSE    FALSE
      ## [3,] FALSE   FALSE    FALSE
      ## [4,] FALSE    TRUE    FALSE
      table(is.na(ships))
      ##
      ## FALSE  TRUE
      ##    11     1
  • 10. Visualizing your data. Not just for analyses! Also for checking data quality: representativeness, missing data.
  • 11. plot(ships) [Figure: scatterplot matrix of year, inbound and outbound]
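The recap slides can be run end to end. The sketch below rebuilds the ships data.frame from slide 4 and then adds two calls that do not appear on the slides, colSums(is.na()) and the na.rm argument of mean(), as one possible way of dealing with the missing value. It is meant as an illustration, not as the author's own code.

```r
# Rebuild the example data.frame from the slides
ships <- data.frame(year = c(1850, 1860, 1870, 1880),
                    inbound = c(215, 237, 237, NA),
                    outbound = c(212, 239, 260, 265))

str(ships)   # structure: 4 observations of 3 numeric variables
dim(ships)   # 4 rows, 3 columns

# Not on the slides: count the missing values per column
na_per_column <- colSums(is.na(ships))
na_per_column

# Not on the slides: mean() returns NA when a value is missing,
# unless missing values are dropped with na.rm = TRUE
mean(ships$inbound)
inbound_mean <- mean(ships$inbound, na.rm = TRUE)
inbound_mean
```

Counting per column shows directly that the single missing value sits in inbound, which table(is.na(ships)) alone does not reveal.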
  • 12. Getting data in R
  • 13. Data already in R. The “datasets” package: very slim datasets, specific example data. To obtain a list of datasets, type: library(help = "datasets"). To obtain information on a specific dataset, type: help(swiss) # thus: help(name_of_dataset). Or, to just see the data, type the name of the dataset: swiss
  • 14. Reading in data. Different functions for different files. Base R: read.table() (read.csv()). foreign package: read.spss(), read.dta(), read.dbf(). openxlsx package: read.xlsx(). Alternative packages: xlsx (Java required), gdata (Perl-based).
  • 15. read.xlsx() from the openxlsx package. file (argument xlsxFile): your file, including directory. sheet: name of sheet.
  • 16. read.csv(). file: your file, including directory. header: variable names or not? sep: separator (read.csv default: “,”; read.csv2 default: “;”). skip: number of rows to skip. nrows: total number of rows to read. stringsAsFactors: read text columns as factors or not? encoding (e.g. “latin1” or “UTF-8”).
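The read.csv() arguments listed above are easier to follow on a concrete file. The sketch below writes a small temporary file and reads it back; the file and its contents are invented for illustration (they reuse the ships numbers) and are not part of the HSN data used in the course.

```r
# Hypothetical file: one line of free text, a header, then four data
# rows separated by semicolons (content invented for illustration)
tmp <- tempfile(fileext = ".csv")
writeLines(c("exported on 2015-05-29",
             "year;inbound;outbound",
             "1850;215;212",
             "1860;237;239",
             "1870;237;260",
             "1880;;265"),
           tmp)

# read.csv() with an explicit separator: skip the first line and
# read only the first three data rows
first_rows <- read.csv(tmp, header = TRUE, sep = ";", skip = 1,
                       nrows = 3, stringsAsFactors = FALSE)
dim(first_rows)

# read.csv2() assumes ";" as separator, so sep can be omitted
all_rows <- read.csv2(tmp, skip = 1, stringsAsFactors = FALSE)
is.na(all_rows$inbound)   # the empty field is read as NA

unlink(tmp)   # remove the temporary file
```

Reading a limited number of rows with nrows is a cheap way to check separator, header and encoding before loading a large file in full.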
  • 17. Do it yourself!
  • 18. Read in the following files as data.frames: HSN_basic.xlsx. Check the data.frame using dim(), length(). Check the variables using summary(), min(), table(). Repeat for HSN_marriages.csv: read in only 100 lines.
  • 19. Plotting using ggplot2
  • 20. ggplot2. Package by Hadley Wickham. Generic plotting for a great range of plots.
      ggplot2 website: http://ggplot2.org
      excellent tutorial: https://jofrhwld.github.io/avml2012/#Section_1.1
  • 21. Building your graph. Each plot consists of multiple layers. Think of a canvas on which you ‘paint’: data layer, geometries layer, statistics layer.
  • 22. Data layer: data.frame and aesthetics
      ggplot(data.frame, aes(x = ..., y = ...))
      Geometries layer
      ggplot(..., aes(x = ..., y = ...)) + geom_...() # e.g. geom_line
      Statistics layer
      ggplot(..., aes(x = ..., y = ...)) + geom_...() + stat_...() # e.g. stat_smooth
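The three layers can be tried on the small ships data.frame from the recap before moving on to the HSN data. This is a minimal sketch, assuming ggplot2 is installed; the choice of geom_line(), geom_point() and a linear stat_smooth() is only an example, and the requireNamespace() guard is added here so the script also runs where the package is missing.

```r
# Example data.frame from the recap slides
ships <- data.frame(year = c(1850, 1860, 1870, 1880),
                    inbound = c(215, 237, 237, NA),
                    outbound = c(212, 239, 260, 265))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  # data layer: the data.frame plus the aesthetics (what goes on x and y)
  p <- ggplot(ships, aes(x = year, y = outbound))

  # geometries layer: how the data are drawn
  p <- p + geom_line() + geom_point()

  # statistics layer: a straight-line fit, without confidence band
  p <- p + stat_smooth(method = "lm", se = FALSE)

  print(p)
}
```

Storing the plot in an object and adding one layer at a time makes it easy to inspect the result after every step.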
  • 23. An example. Reading in the data:
      hmar <- read.csv("./../data/derived/HSN_marriages.csv",
                       stringsAsFactors = FALSE,
                       encoding = "latin1",
                       header = TRUE,
                       nrows = 100)
  • 24. Plotting the data
      install.packages("ggplot2") # package name must be quoted; only needed once
      library(ggplot2)
      ggplot(hmar, aes(x = M_year, y = Age_bride)) +
        geom_point()
  • 25. [Figure: scatter plot of Age_bride against M_year]
  • 26. Improving the plot. Specify characteristics of the geom_layer:
      ggplot(hmar, aes(x = M_year, y = Age_bride)) +
        geom_point(colour = "blue", size = 3, shape = 18)
      See http://www.cookbook-r.com/Graphs/Shapes_and_line_types/
  • 27. Specify characteristics of the geom_layer [Figure: scatter plot of Age_bride against M_year]
  • 28. A PTE example. Does age at marriage depend on educational attainment? To marry you need resources; the more attainment, the longer it takes to acquire resources; ergo: brides with educational attainment marry later in life. Not a statistical test, but let’s graph this.
  • 29. A request from yesterday: can I plot labels?
      ggplot(hmar, aes(x = M_year, y = Age_bride, label = SIgn_bride)) +
        geom_text()
  • 30. Yes you can! Not really useful though... [Figure: Age_bride against M_year, each observation drawn as its SIgn_bride label (a or h)]
  • 31. Let’s try with colours...
      ggplot(hmar, aes(x = M_year, y = Age_bride)) +
        geom_point(aes(colour = factor(SIgn_bride)), size = 3, shape = 18)
  • 32. [Figure: Age_bride against M_year, points coloured by factor(SIgn_bride) with levels a and h] No real pattern, though...
  • 33. Finalizing the graph
      ggplot(hmar, aes(x = M_year, y = Age_bride)) +
        geom_point(aes(colour = factor(SIgn_bride)), size = 3, shape = 18) +
        labs(title = "Age of marriage over time",
             x = "time (years since A.D.)",
             y = "age of bride (years)",
             colour = "Signature") # here we use colour since legend shows colour
  • 34. [Figure: “Age of marriage over time”; x axis: time (years since A.D.); y axis: age of bride (years); colour legend: Signature (a, h)]
  • 35. Satisfied?
  • 36. Actually not... the points are plotted on top of each other... Solution: geom_jitter
      ggplot(hmar, aes(x = M_year, y = Age_bride)) +
        geom_jitter(aes(colour = factor(SIgn_bride)), size = 3, shape = 18) +
        labs(title = "Age of marriage over time",
             x = "time (years since A.D.)",
             y = "age of bride (years)",
             colour = "Signature") # here we use colour since legend shows colour
  • 37. [Figure: “Age of marriage over time” with jittered points; x axis: time (years since A.D.); y axis: age of bride (years); colour legend: Signature (a, h)]
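Why jittering helps can be seen on a handful of rows. The sketch below uses a made-up data.frame that only borrows the column names M_year, Age_bride and SIgn_bride; the values are invented and not taken from the HSN files. The width and height arguments and the set.seed() call are choices made for this illustration, and the plot is only drawn if ggplot2 is installed.

```r
# Made-up values for illustration; NOT taken from the HSN files
toy <- data.frame(M_year = c(1850, 1850, 1850, 1860, 1860, 1860),
                  Age_bride = c(24, 24, 24, 30, 30, 30),
                  SIgn_bride = c("a", "h", "h", "a", "a", "h"))

# Six brides, but far fewer distinct (x, y) positions:
# with geom_point() the points would hide each other
n_rows <- nrow(toy)
n_positions <- nrow(unique(toy[, c("M_year", "Age_bride")]))
c(rows = n_rows, positions = n_positions)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  set.seed(1)   # jitter is random; fix the seed for a reproducible plot

  # geom_jitter() adds a small random shift to every point
  p_jitter <- ggplot(toy, aes(x = M_year, y = Age_bride)) +
    geom_jitter(aes(colour = factor(SIgn_bride)),
                width = 1, height = 0.5, size = 3, shape = 18) +
    labs(title = "Jittered points (toy data)",
         colour = "Signature")
  print(p_jitter)
}
```

Keep the jitter small relative to the axis units, since the shifted positions no longer show the exact year or age.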
  • 38. Final remarks on ggplot2. We have just scratched the surface of ggplot2. Build your graph slowly: start with the basics, add complexity step-wise. Now it’s your turn!
  • 39. A small PTE project. Look at the variables in the HSN files. Think of a research question. Provide a general mechanism and hypothesis. Plot your results.