Data Science career mixer poster


Tom Jeon

MOTIVATION
Since social media sites such as Twitter exploded in popularity, many social science studies have used data mined from these sites. Because it is produced spontaneously by its users, Twitter data is truer to individuals’ expression than traditional survey data.
OBJECTIVES
1. Gathering data (granularity issue):
•  Started from a dataset of Date, GEOid, and Tweet Count (~6 million observations).
•  Finding survey data at the same granularity as the Twitter data was hard.
•  Web-scraped ZCTA-level population and median gross rent data from the American Community Survey (ACS).
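One way to bring dated tweet counts to the same per-location granularity as ACS figures is to total them per GEOid. The Python sketch below is only an illustration: the row layout and the function name `total_tweets_by_geoid` are hypothetical, and the poster does not say whether or how counts were aggregated over dates.

```python
from collections import defaultdict


def total_tweets_by_geoid(rows):
    """Sum tweet counts into one total per GEOid.

    rows: iterable of (date, geoid, tweet_count) tuples.
    This layout is an assumption made for illustration.
    """
    totals = defaultdict(int)
    for _date, geoid, count in rows:
        totals[geoid] += int(count)
    return dict(totals)


# Made-up sample rows, not values from the poster's dataset.
sample_rows = [
    ("2015-01-01", "01002", 5),
    ("2015-01-02", "01002", 7),
    ("2015-01-01", "01003", 2),
]
totals = total_tweets_by_geoid(sample_rows)
```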
METHODS
Do richer neighborhoods use Twitter more, and is there a point at which the richest neighborhoods use Twitter less?
RESULTS
CONCLUSIONS
Although R² was only 0.03, the parameter estimates had p-values that suggested statistical significance, so we concluded that wealthier neighborhoods use Twitter more, while the wealthiest neighborhoods use Twitter less.
The p-value for the model's F-statistic is also significant, but the model itself explains only 3% of the variability in the data.
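The poster text does not print the regression equation. One specification consistent with the stated conclusion is a quadratic in median gross rent; this is an assumption for illustration, not the author's confirmed model, and the symbols are introduced here:

```latex
\[
\text{tweets}_i = \beta_0 + \beta_1\,\text{rent}_i + \beta_2\,\text{rent}_i^{2}
                + \beta_3\,\text{pop}_i + \varepsilon_i
\]
\[
\frac{\partial\,\text{tweets}}{\partial\,\text{rent}}
  = \beta_1 + 2\beta_2\,\text{rent} = 0
\quad\Longrightarrow\quad
\text{rent}^{*} = -\frac{\beta_1}{2\beta_2}
\]
\[
R^{2} = 1 - \frac{\sum_i (y_i - \hat{y}_i)^{2}}{\sum_i (y_i - \bar{y})^{2}}
\]
```

Under this form, estimates with \(\beta_1 > 0\) and \(\beta_2 < 0\) imply that predicted tweet counts rise with rent up to \(\text{rent}^{*}\) and fall beyond it. An \(R^{2}\) of 0.03 means the residual sum of squares is still 97% of the total sum of squares, so significant coefficients here describe a weak average tendency rather than accurate per-location predictions.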
Categorical Data Analysis
•  Gather data from external sources to make use of the Twitter dataset.
•  Clean, merge, and compile data for econometric analysis.
•  Convert Twitter GEOids so that they are compatible with TIGER/Line Shapefiles and Geographic Information Systems.
•  Analyze the data and create a model to predict the number of tweets per location.
Dept. of Economics, Statistics, and Computer Science
Tom Jeon BS, BDIC ’17 (seeking 2016 summer internships)
Clean Twitter Data for TIGER/Line Shapefiles and Econometric Analysis
2. Cleaning data (differing GEOid code issue):
•  Twitter and the ACS use different notations for location in their datasets.
•  Converted the GEOid (location data) codes accordingly, as shown on the right of the poster.
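The conversion figure is not reproduced in this text, so the Python sketch below only illustrates the general idea. It assumes, hypothetically, that one source stores a long Census-style identifier (for example `8600000US01002`, where the digits after `US` are the ZCTA code) while the other stores the bare ZCTA, possibly with leading zeros lost. The poster does not state the exact formats, and `to_zcta5` is an illustrative name.

```python
def to_zcta5(geoid: str) -> str:
    """Normalize a location code to a zero-padded 5-digit ZCTA string.

    Assumption: a long-form code carries the ZCTA after a "US" marker;
    a short-form code is the ZCTA itself, possibly missing leading zeros.
    """
    code = str(geoid).strip()
    if "US" in code:
        # Keep only the part after the last "US" marker.
        code = code.rsplit("US", 1)[1]
    if not code.isdigit() or len(code) > 5:
        raise ValueError(f"unrecognized location code: {geoid!r}")
    # Restore leading zeros dropped by numeric parsing (1002 -> 01002).
    return code.zfill(5)
```

Normalizing both datasets to the same string key before joining avoids silent mismatches such as `1002` versus `01002`.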
3. Merging data (missingness and imputation issue):
•  Produced two final datasets.
•  Used the dplyr package (R).
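The poster names dplyr (an R package) for this step. To illustrate the same idea in a self-contained way, the Python sketch below performs a left join on a shared ZCTA key and fills missing rent values with the median of the observed ones. The field names, the sample records, and the choice of median imputation are all assumptions for illustration; the poster does not say which imputation method was used.

```python
from statistics import median


def left_join_with_imputation(tweets, acs):
    """Left-join tweet rows to ACS rows on 'zcta' and impute missing rent.

    tweets: list of dicts with keys 'zcta' and 'tweet_count' (hypothetical).
    acs:    dict mapping zcta -> {'population': ..., 'median_rent': ...}.
    """
    observed = [row["median_rent"] for row in acs.values()
                if row.get("median_rent") is not None]
    # Median of observed rents; None if nothing was observed at all.
    fill = median(observed) if observed else None
    merged = []
    for t in tweets:
        a = acs.get(t["zcta"], {})  # unmatched tweet rows are kept
        rent = a.get("median_rent")
        merged.append({
            "zcta": t["zcta"],
            "tweet_count": t["tweet_count"],
            "population": a.get("population"),
            "median_rent": rent if rent is not None else fill,
            "rent_imputed": rent is None,  # flag so imputed rows stay visible
        })
    return merged


# Made-up sample records, not values from the poster's datasets.
tweets = [
    {"zcta": "01002", "tweet_count": 12},
    {"zcta": "01003", "tweet_count": 40},
    {"zcta": "99999", "tweet_count": 3},
]
acs = {
    "01002": {"population": 30000, "median_rent": 1100},
    "01003": {"population": 11000, "median_rent": None},
    "01004": {"population": 5000, "median_rent": 900},
}
merged = left_join_with_imputation(tweets, acs)
```

Keeping a `rent_imputed` flag makes it possible to rerun the analysis with and without imputed rows and check whether the conclusions depend on them.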
Can median gross rent and population
predict the number of tweets?
WHAT I AM DOING NOW:
I want to learn as many analytical techniques as possible, so I’ve been learning from experts in different domains.
LDA Topic Modeling
Bioinformatics and Neural Data
Independent Study supervised by Brendan O’Connor
