Data Science
Exploratory data analysis
of 2017 US Employment data
using R – Use Case
Chetan Khanzode
Data Source
• The mission of the Bureau of Labor Statistics (BLS) is the collection, analysis, and
dissemination of essential economic information to support public and
private decision-making.
• Data from the Quarterly Census of Employment and Wages (QCEW) for the year 2017:
https://www.bls.gov/
• 3.5 million rows and 38 columns
Data Science Process
Source: data science cook book
R Packages Used
• library(data.table)
• library(plyr)
• library(dplyr)
• library(stringr)
• library(ggplot2)
• library(maps)
• library(bit64)
• library(RColorBrewer)
• library(choroplethr)
Import the data
Use the fread function from the data.table package, which is significantly faster than base R readers such as read.csv
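A minimal sketch of the import step, assuming a small stand-in file rather than the real QCEW download. The file contents, the temp path, and the name ann_sample are made up for illustration; only the column names follow the ones used later in the deck.

```r
library(data.table)

# Hypothetical stand-in for the QCEW annual file: a tiny CSV in a temp location
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "area_fips,own_code,industry_code,agglvl_code,annual_avg_emplvl,avg_annual_pay",
  "01001,5,11,74,120,41000",
  "01003,5,52,74,950,58000"
), tmp)

# Read the code columns as character so that leading zeros in FIPS codes survive
ann_sample <- fread(
  tmp,
  colClasses = list(character = c("area_fips", "industry_code"))
)

str(ann_sample)
```

Reading the code columns as character is a design choice for this sketch: it avoids having to re-pad the FIPS codes after import.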
Merge the data with the associated codes and titles
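The merge itself is not shown in the transcript. As a sketch under assumptions, attaching titles to codes could look like the following, using base merge on two small made-up tables; the table names emp and industry_lookup and all values are hypothetical.

```r
# Hypothetical employment rows keyed by industry code
emp <- data.frame(
  industry_code  = c("11", "52", "54"),
  avg_annual_pay = c(41000, 58000, 67000),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table mapping codes to readable titles
industry_lookup <- data.frame(
  industry_code  = c("11", "52", "54"),
  industry_title = c("Agriculture, forestry, fishing and hunting",
                     "Finance and insurance",
                     "Professional and technical services"),
  stringsAsFactors = FALSE
)

# all.x = TRUE keeps every employment row even if a code has no title
emp_titled <- merge(emp, industry_lookup, by = "industry_code", all.x = TRUE)
print(emp_titled)
```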
Map package data
• The purpose is to look at the geographical distribution of
wages across the US.
• The maps package has US maps at both the state and
county levels, and the data required to draw the
maps can be extracted.
• The employment data is then aligned with the map data
so that each value is drawn at the right
location on the map.
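The extraction step described above can be sketched as follows, assuming the maps and ggplot2 packages are installed. map_data is the ggplot2 function the deck uses later; state.fips is a dataset shipped with maps.

```r
library(ggplot2)
library(maps)

# Polygon outlines as plain data frames: one row per boundary point
state_df  <- map_data("state")
county_df <- map_data("county")

# Columns 5 and 6 hold the lower-case state and county names
names(state_df)
head(county_df)

# FIPS lookup table from the maps package (numeric fips, abbreviation, polygon name)
data("state.fips", package = "maps")
head(state.fips)
```

The names in these data frames are lower case, which is why they have to be transformed before they can be matched against the employment data.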
Map package data
state.fips$fips <- str_pad(state.fips$fips, width=2, pad="0", side='left')
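A small sketch of what the padding does, using made-up codes rather than the real table. FIPS state codes are two characters wide, so a numeric 1 has to become "01" before it can be matched against character codes; the sprintf line is an alternative from base R, not something the deck uses.

```r
library(stringr)

# Hypothetical numeric state codes as they might come out of a lookup table
fips_num <- c(1L, 6L, 12L, 48L)

# Left-pad with zeros to a fixed width of 2
fips_chr <- str_pad(as.character(fips_num), width = 2, side = "left", pad = "0")
print(fips_chr)

# Base R alternative giving the same strings
fips_base <- sprintf("%02d", fips_num)
print(identical(fips_chr, fips_base))
```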
Merge to main dataset
Sample of the main data frame after the merge
Geospatial data visualization
library(dplyr)
library(ggplot2)
library(RColorBrewer)

# MakeCap, comDiscretize and d.state are used below but are not defined on this slide.

# Polygon outlines for states and counties
state_df <- map_data('state')
county_df <- map_data('county')

# Rename columns 5 and 6 (region, subregion) to state and county
# and capitalize the names with MakeCap
transform_mapdata <- function(x){
  names(x)[5:6] <- c('state','county')
  for(u in c('state','county')){
    x[,u] <- sapply(x[,u], MakeCap)
  }
  return(x)
}
state_df <- transform_mapdata(state_df)
county_df <- transform_mapdata(county_df)

# County-level rows (agglvl_code 70) with pay and employment discretized.
# This must run before the county map is drawn.
# The functions filter, select and mutate are from dplyr.
d.cty <- filter(ann2017full, agglvl_code==70) %>%
  select(state, county, abb, avg_annual_pay,
         annual_avg_emplvl) %>%
  mutate(wage=comDiscretize(avg_annual_pay),
         empquantile=comDiscretize(annual_avg_emplvl))

# Average annual pay by county
chor <- left_join(county_df, d.cty)
ggplot(chor, aes(long, lat, group=group))+
  geom_polygon(aes(fill=wage))+
  geom_path(color='white', alpha=0.5, size=0.2)+
  geom_polygon(data=state_df, color='black', fill=NA)+
  scale_fill_brewer(palette='PuRd')+
  labs(x='', y='', fill='Avg Annual Pay by county')+
  theme(axis.text.x=element_blank(), axis.text.y=element_blank(),
        axis.ticks.x=element_blank(), axis.ticks.y=element_blank())

# Average annual pay by state
chor <- left_join(state_df, d.state)
ggplot(chor, aes(long, lat, group=group))+
  geom_polygon(aes(fill=wage))+
  geom_path(color='white', alpha=0.5, size=0.2)+
  geom_polygon(data=state_df, color='black', fill=NA)+
  scale_fill_brewer(palette='Spectral')+
  labs(x='', y='', fill='Avg Annual Pay By State')+
  theme(axis.text.x=element_blank(), axis.text.y=element_blank(),
        axis.ticks.x=element_blank(), axis.ticks.y=element_blank())
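The helpers MakeCap and comDiscretize are called in the code above but their definitions do not appear in the transcript. The following is only a guess at what such helpers might do, under two assumptions: that MakeCap capitalizes each word of a place name, and that comDiscretize turns a numeric column into a factor of quantile bins (scale_fill_brewer needs a discrete fill). The names make_cap and discretize_quantiles and the pay values are hypothetical.

```r
# Hypothetical stand-in for MakeCap: capitalize the first letter of each word
make_cap <- function(x) {
  if (is.na(x)) return(NA_character_)   # state-level map rows have no county name
  words <- strsplit(x, " ", fixed = TRUE)[[1]]
  paste0(toupper(substring(words, 1, 1)), substring(words, 2), collapse = " ")
}

# Hypothetical stand-in for comDiscretize: cut a numeric vector at its quantiles
discretize_quantiles <- function(x, n_bins = 5) {
  breaks <- quantile(x, probs = seq(0, 1, length.out = n_bins + 1), na.rm = TRUE)
  # unique() guards against repeated break points in heavily tied data
  cut(x, breaks = unique(breaks), include.lowest = TRUE, dig.lab = 6)
}

make_cap("new york")

# Made-up pay values, only to show the binning
pay <- c(28000, 35000, 41000, 47000, 52000, 60000, 75000, 90000, 120000, 150000)
wage_bin <- discretize_quantiles(pay)
table(wage_bin)
```

Quantile bins put roughly the same number of counties in each color class, which tends to keep a skewed variable such as pay from collapsing into one or two shades.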
Avg Annual Pay by County
Avg Annual Pay by State
Jobs by Industry - NAICS
# NAICS sectors: 11 = agriculture, forestry, fishing and hunting; 21 = mining;
# 54 = professional and technical services; 52 = finance and insurance
d.sectors <- filter(ann2017full, industry_code %in%
                      c(11,21,54,52),
                    own_code==5,       # Private sector
                    agglvl_code == 74  # county-level, by NAICS sector
                    ) %>%
  select(state, county, industry_code, own_code, agglvl_code,
         industry_title, own_title, avg_annual_pay,
         annual_avg_emplvl) %>%
  mutate(wage=comDiscretize(avg_annual_pay),
         emplevel=comDiscretize(annual_avg_emplvl))
d.sectors <- filter(d.sectors, !is.na(industry_code))

# One map panel per industry, colored by employment level
chor <- left_join(county_df, d.sectors)
ggplot(chor, aes(long, lat, group=group))+
  geom_polygon(aes(fill=emplevel))+
  geom_polygon(data=state_df, color='black', fill=NA)+
  scale_fill_brewer(palette='PuBu')+
  facet_wrap(~industry_title, ncol=2, as.table=TRUE)+
  labs(fill='Avg Employment Level', x='', y='')+
  theme(axis.text.x=element_blank(),
        axis.text.y=element_blank(),
        axis.ticks.x=element_blank(),
        axis.ticks.y=element_blank())
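The left_join calls above attach employment values to map polygons by matching on shared columns. A minimal sketch with made-up rows shows what happens to a county that has no employment record: it stays in the result with NA values, so it still gets drawn but without a fill class. The data frames map_part and pay_part and their values are hypothetical.

```r
library(dplyr)

# Hypothetical slice of map data: one boundary point per county
map_part <- data.frame(
  state  = c("Alabama", "Alabama", "Texas"),
  county = c("Autauga", "Baldwin", "Travis"),
  long   = c(-86.5, -87.7, -97.8),
  lat    = c(32.5, 30.7, 30.3),
  stringsAsFactors = FALSE
)

# Hypothetical employment data covering only two of the three counties
pay_part <- data.frame(
  state          = c("Alabama", "Alabama"),
  county         = c("Autauga", "Baldwin"),
  avg_annual_pay = c(41000, 45000),
  stringsAsFactors = FALSE
)

# Naming the keys explicitly avoids relying on whichever columns happen to overlap
joined <- left_join(map_part, pay_part, by = c("state", "county"))
print(joined)
```

In a faceted plot, rows with a missing facet variable end up in their own NA panel, so it can be worth checking how many polygons failed to match before plotting.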
Thank You
References
https://www.bls.gov/
https://cran.r-project.org/web/packages/data.table/vignettes/datatable-intro.html
https://www.rdocumentation.org/packages/plyr/versions/1.8.4
https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html
https://cran.r-project.org/web/packages/stringr/vignettes/stringr.html
https://www.statmethods.net/advgraphs/ggplot2.html
https://www.r-graph-gallery.com/map/
Practical data science book
