Since social media sites such as Twitter exploded
in popularity, many social studies have used
data mined from these sites.
Because tweets are posted spontaneously by users,
Twitter data can be truer to individuals’ expression
than traditional survey data.
MOTIVATION	
OBJECTIVES	
1. Gathering data (granularity issue):
•  Given: Date, GEOid, and Tweet Count data
(~6 million observations).
•  Finding survey data at the same granularity as
the Twitter data was hard.
•  Web-scraped ZCTA-level population and
median gross rent data from the American
Community Survey (ACS).
METHODS	
Do richer neighborhoods use Twitter more,
and is there a point at which the richest
neighborhoods use Twitter less?
RESULTS	
CONCLUSIONS	
Although the R² was only 0.03, the parameter
estimates had p-values that suggested
statistical significance, so we concluded that
wealthier neighborhoods use Twitter more,
while the wealthiest neighborhoods use
Twitter less.
The p-value for the model's F-statistic is also
significant, but the model itself explains only
3% of the variability in the data.
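A significant coefficient alongside a very low R² is common when the sample is large and the outcome is noisy. The sketch below illustrates this on synthetic data only; it is not the poster's model or data. It assumes a quadratic specification (tweets on rent, rent squared, and population), which is one way to express "wealthier neighborhoods tweet more, the wealthiest less"; the coefficients, units, and noise level are made up for illustration.

```python
import random

def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def r_squared(X, y, beta):
    """Share of the variance in y explained by the fitted values."""
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Synthetic neighborhoods (hypothetical units: rent in $1000s, population in 1000s).
rng = random.Random(0)
X, y = [], []
for _ in range(5000):
    rent = rng.uniform(0.5, 3.0)
    pop = rng.uniform(1.0, 50.0)
    noise = rng.gauss(0.0, 60.0)  # large noise, so R^2 stays small
    X.append([1.0, rent, rent ** 2, pop])
    y.append(50 + 40 * rent - 10 * rent ** 2 + 0.5 * pop + noise)

beta = ols(X, y)  # [intercept, rent, rent^2, population]
r2 = r_squared(X, y, beta)
# With a negative rent^2 coefficient, predicted tweets peak at this rent level.
turning_point = -beta[1] / (2 * beta[2])
print(beta, r2, turning_point)
```

A positive rent coefficient with a negative rent-squared coefficient gives the inverted-U shape; the low R² says only that rent and population leave most of the variation in tweet counts unexplained.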
Categorical Data Analysis
•  Gather data from external sources to make
use of the Twitter dataset.
•  Clean, merge, and compile data for
econometric analysis.
•  Convert Twitter GEOids so that they are
compatible with TIGER/Line Shapefiles and
Geographic Information Systems.
•  Analyze the data and create a model to predict
the number of tweets per location.
Dept. of Economics, Statistics, and Computer Science
Tom Jeon BS, BDIC ’17 (seeking 2016 summer internships)
Clean Twitter Data for TIGER/Line Shapefiles and Econometric Analysis
2. Cleaning data (differing GEOid code issue):
•  Twitter and ACS have different ways of
notating location in their datasets.
•  Converted the GEOid (location data) codes
accordingly, as shown on the right.
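The figure with the actual code formats is not part of this text, so the following is only a sketch of what such a conversion could look like. It assumes one source carries a long Census-style identifier ending in the ZCTA (for example "8600000US01002") and the other a bare ZCTA that may have lost its leading zeros; the function name `normalize_zcta` is hypothetical.

```python
import re

def normalize_zcta(geoid) -> str:
    """Reduce a GEOid-like value to a 5-digit, zero-padded ZCTA string.

    Assumed inputs (not taken from the poster): a long identifier whose
    last "US" is followed by the ZCTA, or a bare ZCTA stored as text or int.
    """
    text = str(geoid).strip()
    if "US" in text:
        # Keep only what follows the last "US" in the long form.
        text = text.rsplit("US", 1)[1]
    if not re.fullmatch(r"\d{1,5}", text):
        raise ValueError(f"cannot interpret {geoid!r} as a ZCTA")
    # Restore leading zeros dropped by numeric storage (1002 -> "01002").
    return text.zfill(5)

print(normalize_zcta("8600000US01002"), normalize_zcta(1002))
```

Mapping both datasets to the same string key keeps leading zeros intact, which matters for ZCTAs in the Northeast that begin with 0.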
3. Merging data (missingness and imputation issue):
•  Produced two final datasets.
•  Used the dplyr package.
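The poster did this step with dplyr in R; the sketch below is a plain-Python analogue of the same idea, not the author's code. It assumes the two final datasets are a complete-case version and a median-imputed version, which the text does not confirm. All field names and numbers are made up for illustration.

```python
from collections import defaultdict
from statistics import median

def total_tweets(rows):
    """Sum daily counts per location: (date, geoid, count) -> {geoid: total}."""
    totals = defaultdict(int)
    for _date, geoid, count in rows:
        totals[geoid] += count
    return dict(totals)

def left_join(totals, acs):
    """Attach ACS fields to every tweet location; None marks a missing value."""
    merged = []
    for geoid, tweets in sorted(totals.items()):
        info = acs.get(geoid, {})
        merged.append({
            "geoid": geoid,
            "tweets": tweets,
            "population": info.get("population"),
            "median_rent": info.get("median_rent"),
        })
    return merged

def impute_median(rows, field):
    """Return copies of rows with None in `field` replaced by the observed median."""
    fill = median(r[field] for r in rows if r[field] is not None)
    return [{**r, field: fill if r[field] is None else r[field]} for r in rows]

# Toy inputs (hypothetical values).
tweets = [
    ("2015-01-01", "01002", 12),
    ("2015-01-02", "01002", 8),
    ("2015-01-01", "01003", 5),
    ("2015-01-01", "01005", 3),
]
acs = {
    "01002": {"population": 30000, "median_rent": 1200},
    "01003": {"population": 11000, "median_rent": None},  # rent not reported
    # "01005" has no ACS row at all
}

merged = left_join(total_tweets(tweets), acs)
complete = [r for r in merged if None not in r.values()]  # drop rows with gaps
imputed = impute_median(impute_median(merged, "population"), "median_rent")
print(len(merged), len(complete), len(imputed))
```

Keeping both versions makes it possible to check whether the regression results depend on how missing ACS values were handled.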
Can median gross rent and population
predict the number of tweets?
WHAT I AM DOING NOW:
I want to learn as many analytical techniques as
possible, so I’ve been learning from experts in different
domains.
LDA Topic Modeling
Bioinformatics and Neural Data
Independent Study
supervised by
Brendan O’Connor

Data Science career mixer poster
