Digital Demography
Bogdan State & Ingmar Weber
@bogdanstate @ingmarweber
https://sites.google.com/site/digitaldemography/
The Next Three and a Half Hours
09h00 – 10h30: Part I: Overview of Traditional Demography (Bogdan)
● Standard models
● Standard data sources
10h30 – 11h00: Coffee Break and Networking Opportunity
11h00 – 12h30: Part II: New Opportunities for Demography with Digital Data (Ingmar)
● Case studies about fertility, mortality and migration
● More about data sources
About Us: Bogdan
Sociology PhD (Stanford), focused on computational
sociology of social ties.
Currently: Graduate Student at Stanford (CS),
Data Scientist at Facebook.
Long-standing interest in migration research.
Articles on measurement of migration with big data,
focus on highly-skilled migration and on social
networks.
About Us: Ingmar
Research Director at QCRI.
Started working on demographics of web search
at Yahoo Research Barcelona (2009-2012).
Collaborating with Emilio Zagheni since 2010,
focusing on international migration.
Published seven articles on different aspects of
WWW and demographics.
Serving as ACM Distinguished Speaker,
http://www.dsp.acm.org/view_lecturer.cfm?lecturer_id=7123.
ACM financially supports travel expenses if you want to have me present at your
event.
Part II: New Opportunities for
Demography with Digital Data
The next 90 minutes
• 16 case studies, i.e. published peer-reviewed papers (~65 min)
- Breadth over depth
- Key idea over methodological details
- Organized by topic: fertility, mortality and migration
• Not-so-obvious data sets, in particular ad audience estimates (~15 min)
- How many Twitter users match criteria X?
• Where to from here and discussion (~10 min)
- What are you working on? How can we help you?
Fertility
Image from clipartfest.com
“Forecasting Births Using Google”
Francesco C. Billari, Francesco D’Amuri, Juri Marcucci
PAA Annual Meeting; 2013
http://paa2013.princeton.edu/papers/131393
Predict Monthly Fertility Rate
Does Google search intensity (GI) for “maternity”, “pregnancy” or “ovulation”
predict (with a lag) monthly birth rates?
New to Google Trends? Example: https://trends.google.com/trends/explore?geo=US&q=healthy%20diet
Looks somewhat promising
Also incorporate external factors
Model Performance
Fit an autoregressive–moving-average (ARMA) model
Encouraging results, but lots of models were tried. Potential risk of overfitting.
Correlation with birth rate. GI1 is the monthly average of the
Google index for ‘maternity’, GI2 is the monthly average of
the GI for ‘ovulation’, and GI3 is the monthly average of the
GI for ‘pregnancy’.
Error rates with and without Google Trends data.
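The modelling idea can be illustrated with a minimal sketch on synthetic data. This is not the authors' ARMA specification: it fits, by ordinary least squares, a first-order autoregression of the birth rate with and without a lagged search-intensity term, and compares in-sample errors. The 9-month lag, the synthetic series and all names are assumptions made for illustration.

```python
import numpy as np

def fit_rmse(y, columns):
    """OLS fit of y on an intercept plus the given columns; returns in-sample RMSE."""
    X = np.column_stack([np.ones(len(y))] + columns)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(np.sqrt(np.mean(resid ** 2)))

rng = np.random.default_rng(0)
n_months, lag = 120, 9                 # the 9-month lag is an assumption
gi = rng.normal(size=n_months)         # synthetic search-intensity index
births = np.zeros(n_months)
for t in range(lag, n_months):
    # Synthetic truth: births depend on last month's births and on GI `lag` months earlier
    births[t] = 0.5 * births[t - 1] + 0.8 * gi[t - lag] + 0.1 * rng.normal()

y = births[lag:]                       # birth rate at month t
y_prev = births[lag - 1:-1]            # birth rate at month t-1
gi_lagged = gi[:n_months - lag]        # search intensity at month t-lag

rmse_ar = fit_rmse(y, [y_prev])
rmse_ar_gi = fit_rmse(y, [y_prev, gi_lagged])
print(f"AR(1) only: {rmse_ar:.3f}   AR(1) + lagged GI: {rmse_ar_gi:.3f}")
```

An in-sample comparison like this always favours the larger model, which is exactly the overfitting concern: a fair comparison would score both models on held-out months.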
“Falsification” Test
Lots of things correlate, either by chance or due to a hidden factor
Temporal interest in “skiing” correlated with flu activity
Important: robust selection of key words
Used Google Correlate with 2004-2006 time series data to find most correlated
term. Turned out to be: “KXMB”
KXMB is a local affiliate of CBS (one of the major US commercial broadcast
television networks) for central and western North Dakota
Tested for predictive power. Got poor results (unlike for the authors' chosen terms).
“Fertility and its Meaning: Evidence from Search Behavior”
Jussi Ojala, Emilio Zagheni, Francesco C. Billari, Ingmar Weber
ICWSM; 2017
https://arxiv.org/abs/1703.03935
Study Goals
(i) detect evidence for different contexts surrounding different types of fertility;
Teen, low/high income, (un-)married, …
(ii) model regional variation across states for different fertility levels;
What distinguishes Alabama from California from New York?
(iii) track changes in fertility over time.
Train a model across space, predict across time.
Feature Discovery via Google Trends
Different Contexts of Fertility
Discover search terms correlated with different fertility rates across US states
https://www.google.com/trends/correlate/search?e=id:f7PU4mFDWV-&t=all
Remove terms with no conceivable link to sex, pregnancy or maternity
Predicting Spatial Variability
Performance of the regression models using
leave-one-out cross-validation. SMAPE is in [%], RMSE
values are multiplied by 1,000.
Use the previous terms to build models
predicting state-level fertility rates
All these models make predictions based on
linear combinations of search intensity
Goal: apply these spatial models across time
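A minimal sketch of the evaluation described above, assuming a plain least-squares linear model and synthetic state-level data. SMAPE has several definitions in the literature; the one below is an assumption, and the paper may use a different variant. All numbers and names are made up for illustration.

```python
import numpy as np

def smape(actual, predicted):
    """Symmetric mean absolute percentage error, in percent (one common variant)."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(100 * np.mean(2 * np.abs(predicted - actual)
                               / (np.abs(actual) + np.abs(predicted))))

def rmse(actual, predicted):
    """Root mean squared error."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))

def loo_predictions(X, y):
    """Leave-one-out: predict each row with a linear model fit on all other rows."""
    X = np.column_stack([np.ones(len(y)), X])
    preds = np.empty(len(y))
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        coef, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        preds[i] = X[i] @ coef
    return preds

rng = np.random.default_rng(1)
n_states, n_terms = 50, 3
search = rng.uniform(0, 1, size=(n_states, n_terms))   # synthetic search intensities
fertility = 0.06 + search @ np.array([0.01, -0.005, 0.02]) + rng.normal(0, 0.001, n_states)

preds = loo_predictions(search, fertility)
print(f"SMAPE = {smape(fertility, preds):.2f}%, RMSE x 1000 = {1000 * rmse(fertility, preds):.2f}")
```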
Learning Across Space, Predicting Across Time
Temporal trend when applying the “teen” model across
time. Values are rescaled to a maximum of 1.0.
Pearson r correlation across 2010-2015 when
using the spatial model to predict trends across
time.
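A minimal sketch of "train across space, predict across time", under stated assumptions: the coefficients stand in for a model already fit on state-level data, and the yearly search intensities and observed rates are invented numbers, not values from the paper.

```python
import numpy as np

def rescale_to_max_one(values):
    """Divide a series by its maximum so the largest value becomes 1.0."""
    values = np.asarray(values, dtype=float)
    return values / values.max()

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long series."""
    return float(np.corrcoef(x, y)[0, 1])

# Hypothetical spatial model: intercept plus weights for two search terms
coef = np.array([0.02, 0.5, -0.3])

# Hypothetical national search intensity for the same two terms, one row per year
yearly_search = np.array([
    [0.40, 0.20], [0.38, 0.22], [0.35, 0.25],
    [0.33, 0.27], [0.30, 0.30], [0.28, 0.31],
])
predicted = coef[0] + yearly_search @ coef[1:]

# Hypothetical observed rates for the same years
observed = np.array([34.0, 31.0, 29.0, 27.0, 24.0, 22.0])

trend = rescale_to_max_one(predicted)
r = pearson_r(predicted, observed)
print(trend.round(3), round(r, 3))
```

Pearson r is invariant to rescaling, so correlating the rescaled or the raw predictions gives the same value.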
“Seasonal Variation in Internet Keyword Searches: A Proxy
Assessment of Sex Mating Behaviors”
Patrick M. Markey, Charlotte N. Markey
Archives of Sexual Behavior; 2013
http://link.springer.com/article/10.1007/s10508-012-9996-5
Seasonality of Mating-Related Web Searches
Similar temporal patterns for searches
about (i) prostitution and (ii) dating sites
Births have a (weak) seasonal pattern
Can we detect seasonal mating interest?
“Measuring the impact of health policies using Internet search
patterns: the case of abortion”
Ben Y. Reis, John S. Brownstein
BMC Public Health; 2010
http://bmcpublichealth.biomedcentral.com/articles/10.1186/1471-2458-10-514
Searches for “abortion” vs. Abortion Rates
Recent data: https://www.google.com/trends/correlate/search?e=id:a-6K3jgcMLM&t=all#
The Impact of Policies on Search Behavior
“With regard to the abortion policies available for study, abortion search volume
was significantly higher in states having any of the following four restrictions:
(i) mandatory waiting period, (ii) mandatory counseling, (iii) mandatory parental
notification in the case of minors, and (iv) mandatory parental consent for minors.
Examining abortion availability, abortion search volume was significantly higher in
states where fewer than 10% of counties have providers.”
“These findings are consistent with published evidence that local restrictions on
abortion lead individuals to seek abortion services outside of their area.”
“#babyfever: Social and media influences on fertility desires”
Lora E. Adair, Gary L. Brase, Karen Akao, Mackenzie Jantsch
Personality and Individual Differences; 2014
http://www.sciencedirect.com/science/article/pii/S019188691400422X
#babyfever on Twitter
https://twitter.com/search?q=%23babyfever
Mortality
Image from clipartfest.com
“Data Mining of Online Genealogy Datasets for Revealing
Lifespan Patterns in Human Population”
Michael Fire, Yuval Elovici
ACM Transactions on Intelligent Systems and Technology; 2015
http://dl.acm.org/citation.cfm?doid=2753829.2700464
A Wiki Approach to Online Genealogy
Anonymized version available at:
http://proj.ise.bgu.ac.il/sns/wikitree.html
Lifespan in the US over the Last 350 Years
Goal: Predict Someone’s Lifespan
Born in the US and lived past 50: predict whether they lived past 80
“Quantitative analysis of population-scale family trees using
millions of relatives”
Joanna Kaplanis, Assaf Gordon, Mary Wahl, Michael Gershovits, Barak Markus,
Mona Sheikh, Melissa Gymrek, Gaurav Bhatia, Daniel G MacArthur, Alkes Price,
Yaniv Erlich
bioRxiv; 2017
http://biorxiv.org/content/early/2017/02/07/106427
Online Genealogy Data - Again
13 million people, after
cleaning, in a single pedigree
Small sample of mitochondria
and Y-STR haplotypes (not
discussed)
Also location information.
Cleaned, de-identified data
available at:
http://familinx.org/
Geographical Distribution of Data (Place of Birth)
Panels: pre-1800 and post-1800
Mortality and City Growth
Their model (red) validated against
previous models (Oeppen & Vaupel, black)
Mobility Over Time
And a lot more! Check out the paper.
Median migration distance in North American
born individuals as a function of time.
Red: mother-offspring,
blue: father-offspring,
black: marital radius.
Dots represent the data before smoothing.
“A New Source of Data for Public Health Surveillance:
Facebook Likes”
Steven Gittelman, Victor Lange, Carol A. G. Crawford, Catherine A. Okoro,
Eugene Lieb, Satvinder S. Dhingra, Elaine Trimarchi
Journal of Medical Internet Research; 2015
http://www.jmir.org/2015/4/e98/
Zip-Level “Like” Counts for Different Categories
Data from Facebook’s advertising API.
Details about current API later.
Predict County-Level Life Expectancy
Map zip codes to counties
Used 214 counties in
the continental USA
So what are the factors?
What are the Nine
Factors?
Examples:
Factor 2 is good for you
Factor 8 is bad for you
“A novel web informatics approach for automated
surveillance of cancer mortality trends”
Georgia Tourassi, Hong-Jun Yoon, Songhua Xu
Journal of Biomedical Informatics; 2016
http://www.sciencedirect.com/science/article/pii/S1532046416300181
Crawling Cancer-Related Obituaries
Use a web search engine to get seeds
for queries such as “breast cancer
obituary, New York”
Example
Then post-filter
Then lung vs. breast cancer
Then infer age and gender
Cancer Mortality Rates from Online Obituaries
Percent of lung cancer deaths per age
group based on SEER data and
obituaries for both genders.
Annual female breast cancer death rates based on
obituaries and on National Vital Statistics Report
(NVSR) for 2008–2012.
“Online obituaries are a reliable and valid source of mortality
data”
M. L. Soowamber, J. T. Granton, F. Bavaghar-Zaeimi, S. R. Johnson
Journal of Clinical Epidemiology; 2016
http://www.jclinepi.com/article/S0895-4356(16)30183-4/abstract
Let Me Google if My Patient Died …
Discharged patients might die at home without the hospital knowing
Leads to underestimates of mortality for procedures and diseases
Search patients’ first and last names in online obituaries
Not Covered in this Tutorial: Digital Mourning
“"We will never forget you [online]": an empirical investigation of post-mortem
myspace comments”; J. R. Brubaker, G. R. Hayes; 2011
“Death and mourning as sources of community participation in online social
networks: R.I.P. pages in Facebook”; A. E. Forman, R. Kern, G. Gil-Egui; 2012
“Does the internet change how we die and mourn? Overview and analysis.”; T.
Walter, R. Hourizi, W. Moncur, S. Pitsillides; 2012
“Beyond the Grave: Facebook as a Site for the Expansion of Death and
Mourning”; J. R. Brubaker, G. R. Hayes, P. Dourish; 2013
Migration
Image from clipartfest.com
(not Mobility)
Migration = (i) across countries, and (ii) long-term
Lots of work on mobility from Twitter and mobile phone call detail records (CDRs)
“You are where you e-mail: using e-mail data to estimate
international migration rates”
Emilio Zagheni, Ingmar Weber
WebSci; 2012
http://dl.acm.org/citation.cfm?doid=2380718.2380764
IP Address => Approximate Geolocation
Any online service you
frequently use knows
your coarse-grained
mobility pattern
We used anonymized
data from Yahoo
https://www.maxmind.com/en/geoip-demo
Data Collection
Large sample of anonymized Yahoo email metadata (date, hashed user ID, inferred
country), including self-reported birth year and gender
Sent email between September 2009 and June 2011, at least once a month
43 million users, half from the US
Migration: different modal country for [Sep 2009, Jun 2010] and [Jul 2010, Jun 2011]
Also obtained internet penetration for (country, age, gender) group
And migration data for European countries from Eurostat (for calibration)
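The migration definition above (a different modal country in the two periods) can be sketched as follows. The user records are invented, and tie-breaking between equally frequent countries is not specified on the slide; here a tie resolves to whichever country appears first, which is an assumption.

```python
from collections import Counter

def modal_country(countries):
    """Most frequent country in a list of per-message inferred countries (None if empty)."""
    if not countries:
        return None
    return Counter(countries).most_common(1)[0][0]

def is_migrant(period1, period2):
    """True if the modal country differs between the two observation periods."""
    mode1, mode2 = modal_country(period1), modal_country(period2)
    return mode1 is not None and mode2 is not None and mode1 != mode2

# Hypothetical users: (countries seen in period 1, countries seen in period 2)
users = {
    "u1": (["DE", "DE", "FR"], ["FR", "FR", "DE"]),   # modal country changes: migrant
    "u2": (["US", "US", "US"], ["US", "MX", "US"]),   # a short trip, not a migration
}
migrants = sorted(u for u, (p1, p2) in users.items() if is_migrant(p1, p2))
print(migrants)
```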
Internet => Young & Educated => More Mobile
Expect a particular type of selection bias:
Highly mobile people are early adopters for internet (and email) use
Introduce an ad-hoc correction factor (CF)
p_gac = internet penetration for gender g, age group a and country c
k = factor that controls the strength of the selection bias
Find appropriate k using calibration data for European countries
Results for the United States
Red line: after applying correction factor. Top of gray area: estimates from raw data.
The US doesn’t have good data on outgoing migration flows.
Only some data from the IRS on stocks of expats.
Sensitivity for Low Internet Penetration Countries
Red line: using k=20 for CF. Gray area: Using k between 5 and 35 for CF.
“Studying inter-national mobility through IP geolocation”
Bogdan State, Ingmar Weber, Emilio Zagheni
WSDM; 2013
http://dl.acm.org/citation.cfm?doid=2433396.2433432
Data Collection
Anonymized Yahoo log-in information, covering July 2011 to July 2012
Geolocated using IP address, using an average of 100 login events per user
~10^8 users, 97% in one country, 3% in two countries, 0.23% in more countries
Define migration: 2x 90 days in two countries (223 migrants after cleaning)
Use “outdated” (April 2012) self-declared country-of-residence to define the origin
Normalize out-edges for a given source country:
Given that I’m leaving country X, where do I go?
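Normalizing out-edges per source country amounts to estimating P(destination | origin). A minimal sketch with invented migration events:

```python
from collections import defaultdict

def conditional_flows(migrations):
    """Turn (origin, destination) events into P(destination | origin)."""
    counts = defaultdict(lambda: defaultdict(int))
    for origin, destination in migrations:
        counts[origin][destination] += 1
    probs = {}
    for origin, dests in counts.items():
        total = sum(dests.values())
        probs[origin] = {d: c / total for d, c in dests.items()}
    return probs

# Hypothetical events, one tuple per detected migrant
events = [("DE", "CH"), ("DE", "CH"), ("DE", "US"), ("DE", "AT"), ("MX", "US")]
flows = conditional_flows(events)
print(flows["DE"])
```

Because each origin's probabilities sum to 1, the result says nothing about how many people leave a country, only where they go once they leave.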
What Predicts Target of a Migration Event?
Visualization of Conditional Migration Flows
Black = origin, red = destination, solid lines = “no return”, dashed = some back-and-forth, dotted = pendular
“Inferring international and internal migration patterns from
Twitter data”
Emilio Zagheni, Venkata Rama Kiran Garimella, Ingmar Weber, Bogdan State
WWW; 2014
http://dl.acm.org/citation.cfm?doid=2567948.2576930
Data Collection
Used Twitter streaming API filter for geo-tagged tweets from OECD countries
Pick 3,000 users per country, get their tweets
Estimate out-migration and oversample countries where migration is rare
Get data for ~500K users
Activity thresholding: 3+ tweets in four-month windows, May 2011 -> April 2013
Left with ~15K users -> Small!
Estimated Out-Migration Rates
Difference-in-Differences
Out-migration rates clearly an overestimate
Non-representative user set
Selection bias is changing over time
Focus on between-country differences
Also see: “Demographic research with non-representative internet data”, Zagheni & Weber, 2015
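One way to read the difference-in-differences idea is sketched below: compare each country's change in estimated rate with the average change across all countries, so that a bias shared by all countries cancels. Subtracting the cross-country mean is an assumption about the exact form, and the rates are invented.

```python
def diff_in_diff(rates_t1, rates_t2):
    """Each country's change in rate minus the mean change across all countries."""
    changes = {c: rates_t2[c] - rates_t1[c] for c in rates_t1}
    mean_change = sum(changes.values()) / len(changes)
    return {c: change - mean_change for c, change in changes.items()}

# Hypothetical estimated out-migration rates in two periods
rates_2011 = {"A": 0.02, "B": 0.03, "C": 0.01}
rates_2012 = {"A": 0.05, "B": 0.04, "C": 0.02}

relative_change = diff_in_diff(rates_2011, rates_2012)
print({c: round(v, 4) for c, v in relative_change.items()})
```

All three raw rates rise, which could simply reflect a changing user base; only country A rises by more than the average.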
Results
(Soft) Validation: Ireland’s out-migration rate grew by 2.2% from 2011 to 2012, more than most
countries (Irish Central Statistics Office)
Mexico also sees a reduction in out-migration (Pew Research Center)
“Migration of Professionals to the U.S. - Evidence from
LinkedIn Data”
Bogdan State, Mario Rodriguez, Dirk Helbing, Emilio Zagheni
SocInfo; 2014
http://link.springer.com/chapter/10.1007%2F978-3-319-13734-6_37
Data Collection
Data for ~200 million LinkedIn Users
Complete with education level and city/country of education/job
No details about data cleaning/preprocessing included
Results
“From Migration Corridors to Clusters: The Value of Google+
Data for Migration Studies”
Johnnatan Messias, Fabricio Benevenuto, Ingmar Weber, Emilio Zagheni
ASONAM; 2016
http://ieeexplore.ieee.org/document/7752269/
Beyond Origin-Destination Migration Analysis
I’m a German citizen living in Qatar. So did I migrate from Germany to Qatar?
Yes, according to Qatari border control.
But: Germany (78->99), United Kingdom (99->03),
Germany (03->07), Switzerland (07->09),
Spain (09->12), Qatar (12->now)
Use the “places lived” on Google+
In 2012, no “currently”, just set of places
Get tuples of co-lived countries
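Extracting tuples of co-lived countries from unordered "places lived" sets can be sketched as below. The profiles are invented and the function name is hypothetical.

```python
from collections import Counter
from itertools import combinations

def country_tuples(places_lived, size):
    """All unordered country tuples of the given size from one user's places lived."""
    return list(combinations(sorted(set(places_lived)), size))

# Hypothetical profiles: each is an unordered set of countries, with no "currently" marker
profiles = [
    ["Germany", "United Kingdom", "Switzerland"],
    ["Germany", "Switzerland"],
    ["Germany", "United Kingdom", "Switzerland", "Spain"],
]
pair_freq = Counter(t for p in profiles for t in country_tuples(p, 2))
triple_freq = Counter(t for p in profiles for t in country_tuples(p, 3))
print(pair_freq[("Germany", "Switzerland")], triple_freq[("Germany", "Switzerland", "United Kingdom")])
```

Sorting each set first gives every tuple a canonical order, so (Germany, Switzerland) and (Switzerland, Germany) are counted together.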
Flows/Corridors vs. Tuples/Clusters
This is what border
control can obtain
(with directionality)
This is what the Google+ “places lived” provides
Expected Cluster Frequencies
Lots of migrant flows on (A,B), (A,C) and (B,C) => expect lots on (A,B,C)
“Expect” = rank clusters according to:
min(freqAB; freqAC; freqBC) * mean(freqAB; freqAC; freqBC)
Best performing ranking approximation (Kendall .565, Spearman .754)
Look at outliers and try to explain those
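The ranking score above, and the "expected rank minus actual rank" comparison used to find outliers, can be sketched with invented frequencies. Country labels A to D and all counts are hypothetical.

```python
def expected_score(freq_ab, freq_ac, freq_bc):
    """Score from the slide: min of the three pair frequencies times their mean."""
    freqs = (freq_ab, freq_ac, freq_bc)
    return min(freqs) * (sum(freqs) / 3)

def ranks(scores):
    """Rank keys by descending score; rank 1 is the highest score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {key: i + 1 for i, key in enumerate(ordered)}

# Hypothetical pair and triple frequencies
pair_freq = {("A", "B"): 100, ("A", "C"): 80, ("B", "C"): 60,
             ("A", "D"): 50, ("B", "D"): 40, ("C", "D"): 10}
triple_freq = {("A", "B", "C"): 5, ("A", "B", "D"): 20,
               ("A", "C", "D"): 2, ("B", "C", "D"): 1}

expected = {
    (a, b, c): expected_score(pair_freq[(a, b)], pair_freq[(a, c)], pair_freq[(b, c)])
    for (a, b, c) in triple_freq
}
expected_rank, actual_rank = ranks(expected), ranks(triple_freq)

# Positive difference: the cluster is more frequent than its pairs suggest
rank_diff = {t: expected_rank[t] - actual_rank[t] for t in triple_freq}
print(rank_diff)
```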
Outlier Frequencies
Look at “expected rank – actual rank”
Middle 20%: “close to expected”
Top 20%: “higher than expected”
Bottom 20%: “lower than expected”
Feature Analysis
More than expected:
(Spain, France, Italy)
(UAE, India, Singapore)
Less than expected:
(Brazil, Mexico, USA)
(Canada, China, UK)
Most discriminative features for 3-class distinction
Other Digital Mobility Data: Mobile Phone Data
Mostly used for studying mobility (within a country) rather than migration (across countries). Also
used for socio-economic estimates (such as income estimates).
See work by the following authors for examples (alphabetical order).
Joshua Blumenstock, https://scholar.google.com/citations?user=YpxRngIAAAAJ
Francesco Calabrese, https://scholar.google.com/citations?user=uoI2RgEAAAAJ
Nathan Eagle, https://scholar.google.com/scholar?q=author%3A%22nathan+eagle%22
Cesar Hidalgo, https://scholar.google.com/citations?user=xhCWdtMAAAAJ
Alex ‘Sandy’ Pentland, https://scholar.google.com/citations?user=P4nfoKYAAAAJ
Andrew Tatem, https://scholar.google.com/citations?user=wt8NpZgAAAAJ
More Data Sources
Ad Audience Estimates as Digital Census
Please consider citing this tutorial if you use these data sets and tools. See
the proceedings for citation details. Stay tuned for forthcoming work using this data.
Targeted Advertising as a Digital Census
All the Internet giants make money with targeted advertising
It’s in their commercial interest to “understand” their users
Rich data on both demographic and behavioral attributes
Usually not available for outside researchers, but …
Some aggregate “audience estimates” available for advertisers:
How many users/impressions match criteria X?
Supported by (at least) Facebook, Twitter, and Google
Facebook’s Advertising Reach Estimates
https://www.facebook.com/ads/manager/creation/creation/
https://developers.facebook.com/docs/marketing-api/buying-api/targeting/v2.8
Easy-to-Use Python code
https://github.com/maraujo/pySocialWatcher
Created by Matheus Araujo at QCRI
Contact me if you want to (i) know about important
details, and (ii) know what’s in the pipeline.
Sneak Preview: Estimating Stocks of Migrants
Joint work with Emilio Zagheni and Krishna Gummadi. Currently under review.
Twitter’s Advertising Reach Estimates
https://dev.twitter.com/ads/reference/1/get/accounts/%3Aaccount_id/reach_estimate
https://ads.twitter.com/login
Google’s Advertising Reach Estimates
https://support.google.com/adwords/answer/2475441?hl=en
https://developers.google.com/adwords/api/docs/guides/traffic-estimator-service
http://adwords.google.com/
Using Online Ads to Reach Migrants
So far we have only described ads as a passive data source, but they can also be used as an active
outreach channel. Examples below.
“Migrant Sampling Using Facebook Advertisements: A Case Study of Polish
Migrants in Four European Countries”; S. Pötzschke, M. Braun; 2016
“Using Internet to Recruit Immigrants with Language and Culture Barriers for
Tobacco and Alcohol Use Screening: A Study Among Brazilians”; B. H. Carlini, L.
Safioti, T. C. Rue, L. Miles; 2014
“Reaching and recruiting Turkish migrants for a clinical trial through Facebook: A
process evaluation”; B. Ü. Ince, P. Cuijpers, E. van 't Hof, H. Riper; 2014
Google Trends on Steroids
Google Trends does not provide demographic information
Get DMA-level demographic information (race, income, …)
Join with DMA-level Google Trends information
Can potentially give “average income of a web search query over time”
But often sparsity problems, with data only showing for bigger cities (=> bias)
See “The cost of racial animus on a black candidate: Evidence using Google
search data”, Seth Stephens-Davidowitz; Journal of Public Economics; 2014
Also: “Demographic information flows”, Ingmar Weber, Alejandro Jaimes; CIKM 2010
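The DMA-level join can be sketched as below, under stated assumptions: the DMA identifiers, search intensities and incomes are invented, and using the search index directly as a weight is itself a simplification, since regional Trends values are normalized rather than raw search volumes.

```python
def weighted_average_income(search_intensity, median_income):
    """Search-intensity-weighted mean income over DMAs present in both inputs."""
    shared = [d for d in search_intensity if d in median_income]
    total = sum(search_intensity[d] for d in shared)
    if total == 0:
        return None  # no usable signal, e.g. the query is too sparse everywhere
    return sum(search_intensity[d] * median_income[d] for d in shared) / total

# Hypothetical per-DMA inputs for a single query
intensity = {"DMA-1": 100, "DMA-2": 50, "DMA-3": 0}          # 0 = below reporting threshold
income = {"DMA-1": 60000, "DMA-2": 30000, "DMA-3": 45000}

print(weighted_average_income(intensity, income))
```

DMA-3 contributes nothing here, which mirrors the sparsity problem: regions where the query is not reported drop out of the average and bias it toward larger markets.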
Recall: Previously Mentioned Data Sources
Online genealogy projects
Online obituaries
Google Correlate (= upload your own data, discover correlated search terms)
Geotagged tweets
Others?
Baby announcements? Wedding invitations?
Enriching Your Data
Demographic Inference 101
Please consider citing this tutorial if you use these data sets and tools. See
the proceedings for citation details.
Demographic Inference – Name Dictionaries
First name gender dictionaries:
https://ideas.repec.org/c/wip/eccode/10.html
http://gender.io/
Contact me for dictionary in “International Gender Differences and Gaps in Online
Social Networks”
Ethnicity Dictionary:
https://www.census.gov/topics/population/genealogy/data/2010_surnames.html
Also see “Inferring Nationalities of Twitter Users and Studying Inter-National Linking”
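A minimal sketch of dictionary-based gender inference. The dictionary entries and confidence values below are toy values, not taken from any of the resources listed above, and the function name and threshold are hypothetical.

```python
def infer_gender(full_name, dictionary, min_confidence=0.9):
    """Look up the first token of a name; return a label only if it is confident enough."""
    tokens = full_name.strip().split()
    if not tokens:
        return None
    entry = dictionary.get(tokens[0].lower())
    if entry is None:
        return None          # name not in the dictionary
    gender, confidence = entry
    return gender if confidence >= min_confidence else None

# Toy dictionary: first name -> (most common gender, share of bearers with that gender)
toy_dictionary = {
    "maria": ("female", 0.99),
    "john": ("male", 0.99),
    "andrea": ("female", 0.60),   # ambiguous across countries, so left unlabeled
}

for name in ["Maria Rossi", "John Smith", "Andrea Bocelli", "Xyz Abc"]:
    print(name, "->", infer_gender(name, toy_dictionary))
```

Returning None for ambiguous or unknown names trades coverage for precision; the share of unlabeled users, and whether it differs across countries, is worth reporting alongside any result.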
Demographic Inference – Image-Based Inference
Face++ Cognitive Services
https://www.faceplusplus.com/face-detection/
Microsoft Cognitive Services
https://www.microsoft.com/cognitive-services/en-us/computer-vision-api
Demographic Inference – Build Your Training Data
FollowerWonk by Moz
https://moz.com/followerwonk/bio
https://moz.com/followerwonk/bio/?q=(38-yr%7C38-yrs%7C38%20years)%20old%0A%0A
Where to From Here?*
*Other than lunch
Image from user rculwellmins on Pinterest
Where to Go From Here
Slides and references, including unused ones, will be posted at:
https://sites.google.com/site/digitaldemography/
(Annual?) Workshop at ICWSM: Social Media and Demographic Research,
https://sites.google.com/site/smdrworkshop/
Forthcoming special collection on “Social Media and Demographic Research” for
Demographic Research, edited by E. Zagheni
(http://www.demographic-research.org/info/default.htm)
Organizations
IUSSP “Big Data and Population Processes” Panel,
http://iussp.org/en/panel/big-data-and-population-processes
- See their events
UN Global Pulse, http://unglobalpulse.org/
Data-Pop Alliance, http://datapopalliance.org/
Digital Demography email list at UW,
https://mailman12.u.washington.edu/mailman/listinfo/digital-demog
Questions, Comments, Thoughts?

More Related Content

What's hot

Digital 2022: Essential Snapchat Stats for Q3 2022 v01
Digital 2022: Essential Snapchat Stats for Q3 2022 v01Digital 2022: Essential Snapchat Stats for Q3 2022 v01
Digital 2022: Essential Snapchat Stats for Q3 2022 v01
DataReportal
 
Digital 2022 Saint Helena, Ascension and Tristan da Cunha (February 2022) v01
Digital 2022 Saint Helena, Ascension and Tristan da Cunha (February 2022) v01Digital 2022 Saint Helena, Ascension and Tristan da Cunha (February 2022) v01
Digital 2022 Saint Helena, Ascension and Tristan da Cunha (February 2022) v01
DataReportal
 
Data Management in R
Data Management in RData Management in R
Data Management in R
Sankhya_Analytics
 
Badanie polskich użytkowników Pinteresta
Badanie polskich użytkowników Pinteresta Badanie polskich użytkowników Pinteresta
Badanie polskich użytkowników Pinteresta
CEO Magazyn Polska
 
A little introduction to GIS and QGIS
A little introduction to GIS and QGIS A little introduction to GIS and QGIS
A little introduction to GIS and QGIS
Riccardo Rigon
 
Digital 2022 Trinidad and Tobago (February 2022) v01
Digital 2022 Trinidad and Tobago (February 2022) v01Digital 2022 Trinidad and Tobago (February 2022) v01
Digital 2022 Trinidad and Tobago (February 2022) v01
DataReportal
 
Digital 2022 Canada (February 2022) v02
Digital 2022 Canada (February 2022) v02Digital 2022 Canada (February 2022) v02
Digital 2022 Canada (February 2022) v02
DataReportal
 
Digital 2022 Sierra Leone (February 2022) v01
Digital 2022 Sierra Leone (February 2022) v01Digital 2022 Sierra Leone (February 2022) v01
Digital 2022 Sierra Leone (February 2022) v01
DataReportal
 
Σύντομη εισαγωγή στο MapReduce
Σύντομη εισαγωγή στο MapReduceΣύντομη εισαγωγή στο MapReduce
Σύντομη εισαγωγή στο MapReduce
Kostas Diamantaras
 
Digital 2022 Kazakhstan (February 2022) v01
Digital 2022 Kazakhstan (February 2022) v01Digital 2022 Kazakhstan (February 2022) v01
Digital 2022 Kazakhstan (February 2022) v01
DataReportal
 
مقدمة نظرية مختصرة عن نعريف - مكونات - تطبيقات واستخدامات نظم المعلومات الجغر...
مقدمة نظرية مختصرة عن نعريف - مكونات - تطبيقات واستخدامات نظم المعلومات الجغر...مقدمة نظرية مختصرة عن نعريف - مكونات - تطبيقات واستخدامات نظم المعلومات الجغر...
مقدمة نظرية مختصرة عن نعريف - مكونات - تطبيقات واستخدامات نظم المعلومات الجغر...
Soha Ahmed د. سهــى أحمد
 
Exploratory Spatial Analysis using GeoDa
Exploratory Spatial Analysis using GeoDaExploratory Spatial Analysis using GeoDa
Exploratory Spatial Analysis using GeoDa
MEASURE Evaluation
 
Arcgis training day_1
Arcgis training day_1Arcgis training day_1
Arcgis training day_1
yashasweesharma
 
Digital 2020 Global Digital Yearbook (January 2020) v01
Digital 2020 Global Digital Yearbook (January 2020) v01Digital 2020 Global Digital Yearbook (January 2020) v01
Digital 2020 Global Digital Yearbook (January 2020) v01
DataReportal
 
SAP S/4 HANA Technical assessment before migration
SAP S/4 HANA Technical assessment before migrationSAP S/4 HANA Technical assessment before migration
SAP S/4 HANA Technical assessment before migration
Марина Ковалёва
 
Digital 2022 Comoros (February 2022) v01
Digital 2022 Comoros (February 2022) v01Digital 2022 Comoros (February 2022) v01
Digital 2022 Comoros (February 2022) v01
DataReportal
 
SAP-desde-cero (1).pdf
SAP-desde-cero (1).pdfSAP-desde-cero (1).pdf
SAP-desde-cero (1).pdf
carlatapia22
 
Digital 2020 France (January 2020) v01
Digital 2020 France (January 2020) v01Digital 2020 France (January 2020) v01
Digital 2020 France (January 2020) v01
DataReportal
 
Air Travel Analytics in SAS
Air Travel Analytics in SASAir Travel Analytics in SAS
Air Travel Analytics in SASRohan Nanda
 

What's hot (20)

Digital 2022: Essential Snapchat Stats for Q3 2022 v01
Digital 2022: Essential Snapchat Stats for Q3 2022 v01Digital 2022: Essential Snapchat Stats for Q3 2022 v01
Digital 2022: Essential Snapchat Stats for Q3 2022 v01
 
Digital 2022 Saint Helena, Ascension and Tristan da Cunha (February 2022) v01
Digital 2022 Saint Helena, Ascension and Tristan da Cunha (February 2022) v01Digital 2022 Saint Helena, Ascension and Tristan da Cunha (February 2022) v01
Digital 2022 Saint Helena, Ascension and Tristan da Cunha (February 2022) v01
 
Data Management in R
Data Management in RData Management in R
Data Management in R
 
Badanie polskich użytkowników Pinteresta
Badanie polskich użytkowników Pinteresta Badanie polskich użytkowników Pinteresta
Badanie polskich użytkowników Pinteresta
 
A little introduction to GIS and QGIS
A little introduction to GIS and QGIS A little introduction to GIS and QGIS
A little introduction to GIS and QGIS
 
Digital 2022 Trinidad and Tobago (February 2022) v01
Digital 2022 Trinidad and Tobago (February 2022) v01Digital 2022 Trinidad and Tobago (February 2022) v01
Digital 2022 Trinidad and Tobago (February 2022) v01
 
Digital 2022 Canada (February 2022) v02
Digital 2022 Canada (February 2022) v02Digital 2022 Canada (February 2022) v02
Digital 2022 Canada (February 2022) v02
 
Digital 2022 Sierra Leone (February 2022) v01
Digital 2022 Sierra Leone (February 2022) v01Digital 2022 Sierra Leone (February 2022) v01
Digital 2022 Sierra Leone (February 2022) v01
 
Σύντομη εισαγωγή στο MapReduce
Σύντομη εισαγωγή στο MapReduceΣύντομη εισαγωγή στο MapReduce
Σύντομη εισαγωγή στο MapReduce
 
Digital 2022 Kazakhstan (February 2022) v01
Digital 2022 Kazakhstan (February 2022) v01Digital 2022 Kazakhstan (February 2022) v01
Digital 2022 Kazakhstan (February 2022) v01
 
مقدمة نظرية مختصرة عن نعريف - مكونات - تطبيقات واستخدامات نظم المعلومات الجغر...
مقدمة نظرية مختصرة عن نعريف - مكونات - تطبيقات واستخدامات نظم المعلومات الجغر...مقدمة نظرية مختصرة عن نعريف - مكونات - تطبيقات واستخدامات نظم المعلومات الجغر...
مقدمة نظرية مختصرة عن نعريف - مكونات - تطبيقات واستخدامات نظم المعلومات الجغر...
 
Exploratory Spatial Analysis using GeoDa
Exploratory Spatial Analysis using GeoDaExploratory Spatial Analysis using GeoDa
Exploratory Spatial Analysis using GeoDa
 
Arcgis training day_1
Arcgis training day_1Arcgis training day_1
Arcgis training day_1
 
Digital 2020 Global Digital Yearbook (January 2020) v01
Digital 2020 Global Digital Yearbook (January 2020) v01Digital 2020 Global Digital Yearbook (January 2020) v01
Digital 2020 Global Digital Yearbook (January 2020) v01
 
SAP S/4 HANA Technical assessment before migration
SAP S/4 HANA Technical assessment before migrationSAP S/4 HANA Technical assessment before migration
SAP S/4 HANA Technical assessment before migration
 
Erp final proj report slideshare
Erp final proj report slideshareErp final proj report slideshare
Erp final proj report slideshare
 
Digital 2022 Comoros (February 2022) v01
Digital 2022 Comoros (February 2022) v01Digital 2022 Comoros (February 2022) v01
Digital 2022 Comoros (February 2022) v01
 
SAP-desde-cero (1).pdf
SAP-desde-cero (1).pdfSAP-desde-cero (1).pdf
SAP-desde-cero (1).pdf
 
Digital 2020 France (January 2020) v01
Digital 2020 France (January 2020) v01Digital 2020 France (January 2020) v01
Digital 2020 France (January 2020) v01
 
Air Travel Analytics in SAS
Air Travel Analytics in SASAir Travel Analytics in SAS
Air Travel Analytics in SAS
 

Similar to Digital Demography - WWW'17 Tutorial - Part II

Fairness in Machine Learning @Codemotion
Fairness in Machine Learning @CodemotionFairness in Machine Learning @Codemotion
Fairness in Machine Learning @Codemotion
Azzurra Ragone
 
Not-so-obvious Online Data Sources for Demographic Research
Not-so-obvious Online Data Sources for Demographic ResearchNot-so-obvious Online Data Sources for Demographic Research
Not-so-obvious Online Data Sources for Demographic Research
Ingmar Weber
 
Nonprobability report-may-2016-final
Nonprobability report-may-2016-finalNonprobability report-may-2016-final
Nonprobability report-may-2016-final
SUMEET VERMA
 
APLIC 2014 - Social Observatories Coordinating Network
APLIC 2014 - Social Observatories Coordinating NetworkAPLIC 2014 - Social Observatories Coordinating Network
APLIC 2014 - Social Observatories Coordinating Network
APLICwebmaster
 
Eysenbach: Infodemiology and Infoveillance
Eysenbach: Infodemiology and InfoveillanceEysenbach: Infodemiology and Infoveillance
Eysenbach: Infodemiology and Infoveillance
Gunther Eysenbach
 
Role of data in precision oncology
Role of data in precision oncologyRole of data in precision oncology
Role of data in precision oncology
Warren Kibbe
 
Big data, Big prejudice: how algorithms can discriminate?
Big data, Big prejudice: how algorithms can discriminate?Big data, Big prejudice: how algorithms can discriminate?
Big data, Big prejudice: how algorithms can discriminate?
Sara_Hajian
 
IMPACTS OF JUCVENILE JUSTICE SYSTEM ON AFRICAN AMERICAN ADOLESCENT
IMPACTS OF JUCVENILE JUSTICE SYSTEM ON AFRICAN AMERICAN ADOLESCENTIMPACTS OF JUCVENILE JUSTICE SYSTEM ON AFRICAN AMERICAN ADOLESCENT
Digital Demography - WWW'17 Tutorial - Part II

  • 1. Digital Demography Bogdan State & Ingmar Weber @bogdanstate @ingmarweber https://sites.google.com/site/digitaldemography/
  • 2. The Next Three and a Half Hours 09:00 – 10h30: Part I: Overview of Traditional Demography (Bogdan) ● Standard models ● Standard data sources 10h30 – 11h00: Coffee Break and Network Opportunity 11h00 – 12h30: Part II: New Opportunities for Demography with Digital Data (Ingmar) ● Case studies about fertility, mortality and migration ● More about data sources
  • 3. About Us: Bogdan Sociology PhD (Stanford), focused on computational sociology of social ties. Currently: Graduate Student at Stanford (CS), Data Scientist at Facebook. Long-standing interest in migration research. Articles on measurement of migration with big data, focus on highly-skilled migration and on social networks.
  • 4. About Us: Ingmar Research Director at QCRI. Started working on demographics of web search at Yahoo Research Barcelona (2009-2012). Collaborating with Emilio Zagheni since 2010, focusing on international migration. Published seven articles on different aspects of WWW and demographics. Serving as ACM Distinguished Speaker, http://www.dsp.acm.org/view_lecturer.cfm?lecturer_id=7123. ACM financially supports travel expenses if you want to have me present at your event.
  • 5. Part II: New Opportunities for Demography with Digital Data
  • 6. The next 90 minutes • 16 case studies, i.e. published peer-reviewed papers (~65 min) - Breadth over depth - Key idea over methodological details - Organized by topic: fertility, mortality and migration • Not-so-obvious data sets, in particular ad audience estimates (~15 min) - How many Twitter users match criteria X? • Where to from here and discussion (~10 min) - What are you working on? How can we help you?
  • 8. “Forecasting Births Using Google” Francesco C. Billari, Francesco D’Amuri, Juri Marcucci PAA Annual Meeting; 2013 http://paa2013.princeton.edu/papers/131393
  • 9. Predict Monthly Fertility Rate Does Google search intensity (GI) for “maternity”, “pregnancy” or “ovulation” predict (with a lag) monthly birth rates? New to Google Trends? Example: https://trends.google.com/trends/explore?geo=US&q=healthy%20diet Looks somewhat promising Also incorporate external factors
  • 10. Model Performance Fit an autoregressive–moving-average (ARMA) model. Encouraging results, but many models were tried, so there is a potential risk of overfitting. Correlation with birth rate. GI1 is the monthly average of the Google index (GI) for ‘maternity’, GI2 is the monthly average of the GI for ‘ovulation’, and GI3 is the monthly average of the GI for ‘pregnancy’. Error rates with and without Google Trends data.
  • 11. “Falsification” Test Lots of things correlate, either by chance or due to a hidden factor. Example: temporal interest in “skiing” is correlated with flu activity. Important: robust selection of keywords. Used Google Correlate with 2004-2006 time series data to find the most correlated term. It turned out to be “KXMB”, a local CBS affiliate (CBS is one of the major US commercial broadcast TV networks) for central and western North Dakota. Tested its predictive power. Got poor results (unlike for their terms).
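The lagged relationship asked about on slide 9 (does search intensity predict monthly births "with a lag"?) can be illustrated with a small sketch. This is not the ARMA model from the paper. It only computes a Pearson correlation between a search index at month t and births at month t + lag, and picks the lag with the highest correlation. The function names and both series are made up for illustration; the birth series is deliberately constructed to follow the search series by three months.

```python
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def lagged_correlation(search, births, lag):
    """Correlate search intensity at month t with births at month t + lag."""
    if lag < 0 or lag >= len(search):
        raise ValueError("lag must be between 0 and len(search) - 1")
    return pearson(search[:len(search) - lag], births[lag:])


def best_lag(search, births, max_lag):
    """Return the lag in 0..max_lag with the highest correlation."""
    return max(range(max_lag + 1),
               key=lambda k: lagged_correlation(search, births, k))


# Synthetic monthly series (hypothetical numbers, not real data).
search_index = [5, 7, 6, 9, 12, 10, 8, 11, 15, 13, 9, 7, 10, 14, 12, 16]
# Births follow the search index with a 3-month delay by construction.
births = [110, 112, 111] + [100 + 2 * s for s in search_index[:-3]]

print(best_lag(search_index, births, max_lag=6))
```

As slide 11 warns, a high correlation at some lag is not evidence by itself: scanning many terms or many lags will always surface something, which is why the authors tested an unrelated but highly correlated term as a falsification check.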
  • 12. “Fertility and its Meaning: Evidence from Search Behavior” Jussi Ojala, Emilio Zagheni, Francesco C. Billari, Ingmar Weber ICWSM; 2017 https://arxiv.org/abs/1703.03935
  • 13. Study Goals (i) detect evidence for different contexts surrounding different types of fertility; Teen, low/high income, (un-)married, … (ii) model regional variation across states for different fertility levels; What distinguishes Alabama from California from New York? (iii) track temporal changes in fertility across time. Train a model across space, predict across time.
  • 14. Feature Discovery via Google Trends
  • 15. Different Contexts of Fertility Discover search terms correlated with different fertility rates across US states https://www.google.com/trends/correlate/search?e=id:f7PU4mFDWV-&t=all Remove terms with no conceivable link to sex, pregnancy or maternity
  • 16. Predicting Spatial Variability Performance of the regression models using leave-one-out cross-validation. SMAPE is in [%], RMSE values are multiplied by 1,000. Use the previous terms to build models predicting state-level fertility rates All these models make predictions based on linear combinations of search intensity Goal: apply these spatial models across time
  • 17. Learning Across Space, Predicting Across Time Temporal trend when applying the “teen” model across time. Values are rescaled to a maximum of 1.0. Pearson r correlation across 2010-2015 when using the spatial model to predict trends across time.
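The evaluation described on slides 16 and 17 (leave-one-out cross-validation, SMAPE, RMSE, and rescaling a trend to a maximum of 1.0) can be sketched as below. This is a minimal illustration under assumptions: it uses a single search-intensity feature and ordinary least squares, whereas the paper combines several terms, and it uses one common SMAPE definition (mean of |p - a| / ((|a| + |p|) / 2), in percent), which may differ from the variant the authors used. All names are hypothetical.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def loo_predictions(xs, ys):
    """Predict each state's rate from a model fitted on all other states."""
    preds = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(slope * xs[i] + intercept)
    return preds


def smape(actual, predicted):
    """Symmetric mean absolute percentage error, in percent."""
    total = sum(abs(p - a) / ((abs(a) + abs(p)) / 2)
                for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted))
                / len(actual))


def rescale_to_max_one(values):
    """Rescale a series so that its largest value becomes 1.0."""
    peak = max(values)
    return [v / peak for v in values]
```

A spatial model fitted this way can then be applied to yearly search intensities, and the rescaled predictions compared with the rescaled official series, which is the "train across space, predict across time" idea of slide 13.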
  • 18. “Seasonal Variation in Internet Keyword Searches: A Proxy Assessment of Sex Mating Behaviors” Patrick M. Markey, Charlotte N. Markey Archives of Sexual Behavior; 2013 http://link.springer.com/article/10.1007/s10508-012-9996-5
  • 19. Seasonality of Mating-Related Web Searches Similar temporal patterns for searches about (i) prostitution and (ii) dating sites Births have a (weak) seasonal pattern Can we detect seasonal mating interest?
  • 20. “Measuring the impact of health policies using Internet search patterns: the case of abortion” Ben Y. Reis, John S. Brownstein BMC Public Health; 2010 http://bmcpublichealth.biomedcentral.com/articles/10.1186/1471-2458-10-514
  • 21. Searches for “abortion” vs. Abortion Rates Recent data: https://www.google.com/trends/correlate/search?e=id:a-6K3jgcMLM&t=all#
  • 22. The Impact of Policies on Search Behavior “With regard to the abortion policies available for study, abortion search volume was significantly higher in states having any of the following four restrictions: (i) mandatory waiting period, (ii) mandatory counseling, (iii) mandatory parental notification in the case of minors, and (iv) mandatory parental consent for minors. Examining abortion availability, abortion search volume was significantly higher in states where fewer than 10% of counties have providers.” “These findings are consistent with published evidence that local restrictions on abortion lead individuals to seek abortion services outside of their area.”
  • 23. “#babyfever: Social and media influences on fertility desires” Lora E. Adair, Gary L. Brase, Karen Akao, Mackenzie Jantsch Personality and Individual Differences; 2014 http://www.sciencedirect.com/science/article/pii/S019188691400422X
  • 26. “Data Mining of Online Genealogy Datasets for Revealing Lifespan Patterns in Human Population” Michael Fire, Yuval Elovici ACM Transactions on Intelligent Systems and Technology; 2015 http://dl.acm.org/citation.cfm?doid=2753829.2700464
  • 27. A Wiki Approach to Online Genealogy Anonymized version available at: http://proj.ise.bgu.ac.il/sns/wikitree.html
  • 28. Lifespan in the US over the Last 350 Years
  • 29. Goal: Predict Someone’s Lifespan Born in US and >50, predict if >80
  • 30. “Quantitative analysis of population-scale family trees using millions of relatives” Joanna Kaplanis, Assaf Gordon, Mary Wahl, Michael Gershovits, Barak Markus, Mona Sheikh, Melissa Gymrek, Gaurav Bhatia, Daniel G MacArthur, Alkes Price, Yaniv Erlich bioRxiv; 2017 http://biorxiv.org/content/early/2017/02/07/106427
  • 31. Online Genealogy Data - Again 13 million people, after cleaning, in a single pedigree Small sample of mitochondria and Y-STR haplotypes (not discussed) Also location information. Cleaned, de-identified data available at: http://familinx.org/
  • 32. Geographical Distribution of Data (Place of Birth) Pre 1800 Post 1800
  • 33. Mortality and City Growth Their model (red) validated against previous models (Oeppen & Vaupel, black)
  • 34. Mobility Over Time And a lot more! Check out the paper. Median migration distance in North American born individuals as a function of time. Red: mother-offspring, blue: father-offspring, black: marital radius. Dots represent the data before smoothing.
  • 35. “A New Source of Data for Public Health Surveillance: Facebook Likes” Steven Gittelman, Victor Lange, Carol A. G. Crawford, Catherine A. Okoro, Eugene Lieb, Satvinder S. Dhingra, Elaine Trimarchi Journal of Medical Internet Research; 2015 http://www.jmir.org/2015/4/e98/
  • 36. Zip-Level “Like” Counts for Different Categories Data from Facebook’s advertising API. Details about current API later.
  • 37. Predict County-Level Life Expectancy Map zip codes to counties Used 214 counties in the continental USA So what are the factors?
  • 38. What are the Nine Factors? Examples: Factor 2 is good for you Factor 8 is bad for you
  • 39. “A novel web informatics approach for automated surveillance of cancer mortality trends” Georgia Tourassi, Hong-Jun Yoon, Songhua Xu Journal of Biomedical Informatics; 2016 http://www.sciencedirect.com/science/article/pii/S1532046416300181
  • 40. Crawling Cancer-Related Obituaries Use a web search engine to get seeds for queries such as “breast cancer obituary, New York” Example Then post-filter Then lung vs. breast cancer Then infer age and gender
  • 41. Cancer Mortality Rates from Online Obituaries Percent of lung cancer deaths per age group based on SEER data and obituaries for both genders. Annual female breast cancer death rates based on obituaries and on National Vital Statistics Report (NVSR) for 2008–2012.
  • 42. “Online obituaries are a reliable and valid source of mortality data” M. L. Soowamber, J. T. Granton, F. Bavaghar-Zaeimi, S. R. Johnson Journal of Clinical Epidemiology; 2016 http://www.jclinepi.com/article/S0895-4356(16)30183-4/abstract
  • 43. Let Me Google if My Patient Died … Discharged patients might die at home without the hospital knowing Leads to underestimates of mortality for procedures and diseases Search patients’ first and last names in online obituaries
  • 44. Not Covered in this Tutorial: Digital Mourning “"We will never forget you [online]": an empirical investigation of post-mortem myspace comments”; J. R. Brubaker, G. R. Hayes; 2011 “Death and mourning as sources of community participation in online social networks: R.I.P. pages in Facebook”; A. E. Forman, R. Kern, G. Gil-Egui; 2012 “Does the internet change how we die and mourn? Overview and analysis.”; T. Walter, R. Hourizi, W. Moncur, S. Pitsillides; 2012 “Beyond the Grave: Facebook as a Site for the Expansion of Death and Mourning”; J. R. Brubaker, G. R. Hayes, P. Dourish; 2013
  • 45. Migration (not Mobility) Image from clipartfest.com Migration = (i) across countries, and (ii) long-term Lots of work on mobility from Twitter/mobile phone CDR
  • 46. “You are where you e-mail: using e-mail data to estimate international migration rates” Emilio Zagheni, Ingmar Weber WebSci; 2012 http://dl.acm.org/citation.cfm?doid=2380718.2380764
  • 47. IP Address => Approximate Geolocation Any online service you frequently use knows your coarse-grained mobility pattern We used anonymized data from Yahoo https://www.maxmind.com/en/geoip-demo
  • 48. Data Collection Large sample of anonymized Yahoo email meta data (date, hashed user ID, inferred country), including self-reported birth year and gender Sent email between September 2009 and June 2011, at least once a month 43 million users, half from the US Migration: different modal country for [Sep 2009, Jun 2010] and [Jul 2010, Jun 2011] Also obtained internet penetration for (country, age, gender) group And migration data for European countries from Eurostat (for calibration)
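The migration definition on this slide (a different modal country in the two observation periods) can be sketched as follows. This is a minimal illustration, not the paper's pipeline: the function names, the `(date, country)` event format, and the toy data are assumptions.

```python
from collections import Counter
from datetime import date

# The two observation periods named on the slide.
PERIOD_1 = (date(2009, 9, 1), date(2010, 6, 30))
PERIOD_2 = (date(2010, 7, 1), date(2011, 6, 30))


def modal_country(events, start, end):
    """Return the most frequent inferred country among events in [start, end]."""
    counts = Counter(country for day, country in events if start <= day <= end)
    return counts.most_common(1)[0][0] if counts else None


def is_migrant(events, period_1=PERIOD_1, period_2=PERIOD_2):
    """Flag a user whose modal country differs between the two periods."""
    first = modal_country(events, *period_1)
    second = modal_country(events, *period_2)
    return first is not None and second is not None and first != second


# Toy (hypothetical) e-mail metadata for one hashed user ID.
events = [
    (date(2009, 10, 5), "DE"),
    (date(2010, 1, 12), "DE"),
    (date(2010, 3, 3), "FR"),
    (date(2010, 9, 9), "ES"),
    (date(2011, 2, 1), "ES"),
]
print(modal_country(events, *PERIOD_1), is_migrant(events))
```

A short trip (the single "FR" event) does not change the modal country, which is why the mode is a reasonable proxy for residence.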
  • 49. Internet => Young & Educated => More Mobile Expect a particular type of selection bias: Highly mobile people are early adopters of internet (and email) use Introduce an ad-hoc correction factor (CF) p_gac = internet penetration for gender g, age group a and country c k = factor that controls the strength of the selection bias Find appropriate k using calibration data for European countries
  • 50. Results for the United States Red line: after applying correction factor. Top of gray area: estimates from raw data. The US doesn’t have good data on outgoing migration flows. Only some data from IRS on stocks of expats.
  • 51. Sensitivity for Low Internet Penetration Countries Red line: using k=20 for CF. Gray area: Using k between 5 and 35 for CF.
  • 52. “Studying inter-national mobility through IP geolocation” Bogdan State, Ingmar Weber, Emilio Zagheni WSDM; 2013 http://dl.acm.org/citation.cfm?doid=2433396.2433432
  • 53. Data Collection Anonymized Yahoo log-in information, covering July 2011 to July 2012 Geolocated using IP address, using an average of 100 log-in events per user ~10^8 users, 97% in one country, 3% in two countries, 0.23% in more than two countries Define migration: 2x 90 days in two countries (223 migrants after cleaning) Use “outdated” (April 2012) self-declared country-of-residence to define the origin Normalize out-edges for a given source country: Given that I’m leaving country X, where do I go?
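Normalizing out-edges per source country turns raw flow counts into conditional probabilities ("given that I'm leaving X, where do I go?"). A minimal sketch, assuming flows are stored as a dictionary keyed by (origin, destination); the function name and the counts are made up for illustration.

```python
from collections import defaultdict


def conditional_flows(flow_counts):
    """Convert (origin, destination) -> count into P(destination | origin)."""
    totals = defaultdict(int)
    for (origin, _dest), n in flow_counts.items():
        totals[origin] += n
    return {
        (origin, dest): n / totals[origin]
        for (origin, dest), n in flow_counts.items()
        if totals[origin] > 0
    }


# Toy (hypothetical) migrant counts.
flows = {("MX", "US"): 30, ("MX", "ES"): 10, ("DE", "CH"): 5}
probs = conditional_flows(flows)
print(probs[("MX", "US")])
```

The normalization removes the dependence on how many users each origin country has in the sample, so only the choice of destination is compared across countries.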
  • 54. What Predicts Target of a Migration Event?
  • 55. Visualization of Conditional Migration Flows Black = origin, red = destination, solid lines = “no return”, dashed = some back-and-forth, dotted = pendular
  • 56. “Inferring international and internal migration patterns from Twitter data” Emilio Zagheni, Venkata Rama Kiran Garimella, Ingmar Weber, Bogdan State WWW; 2014 http://dl.acm.org/citation.cfm?doid=2567948.2576930
  • 57. Data Collection Used Twitter streaming API filter for geo-tagged tweets from OECD countries Pick 3,000 users per country, get their tweets Estimate out-migration and oversample countries where migration is rare Get data for ~500K users Activity thresholding: 3+ tweets in four-month windows, May 2011->April 2013 Left with ~15K users -> Small!
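The activity threshold can be sketched as below. The slide does not say whether the 3+ tweets are required in every four-month window; this sketch assumes they are, and the helper names and toy data are hypothetical.

```python
from collections import Counter
from datetime import date

START = date(2011, 5, 1)  # May 2011, start of the observation period
N_WINDOWS = 6             # May 2011 -> April 2013 = 24 months = 6 windows


def window_index(day):
    """Zero-based index of the four-month window containing `day`."""
    months = (day.year - START.year) * 12 + (day.month - START.month)
    return months // 4


def is_active(tweet_dates, min_tweets=3):
    """Assumption: a user must reach `min_tweets` in every window."""
    counts = Counter(window_index(d) for d in tweet_dates)
    return all(counts.get(w, 0) >= min_tweets for w in range(N_WINDOWS))


def toy_dates(per_window):
    """Build toy data: `per_window` tweets at the start of each window."""
    dates = []
    for w in range(N_WINDOWS):
        offset = 4 + 4 * w  # zero-based month offset from January 2011
        year, month = 2011 + offset // 12, offset % 12 + 1
        dates += [date(year, month, d) for d in range(1, per_window + 1)]
    return dates


print(is_active(toy_dates(3)), is_active(toy_dates(2)))
```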
  • 59. Difference-in-Differences Out-migration rates clearly an overestimate Non-representative user set Selection bias is changing over time Focus on between-country differences Also see: “Demographic research with non-representative internet data”, Zagheni & Weber, 2015
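One way to read the difference-in-differences idea: compare each country's change in estimated out-migration rate with the average change across all countries, so that a bias shared by all countries cancels. The sketch below is an illustration under that assumption, not the paper's exact estimator; the function name and the rates are invented.

```python
def diff_in_diff(rates_t1, rates_t2):
    """Change in each country's rate minus the mean change over all countries."""
    countries = sorted(set(rates_t1) & set(rates_t2))
    if not countries:
        return {}
    changes = {c: rates_t2[c] - rates_t1[c] for c in countries}
    mean_change = sum(changes.values()) / len(changes)
    return {c: changes[c] - mean_change for c in countries}


# Toy (hypothetical) estimated out-migration rates in two periods.
rates_2011 = {"A": 0.02, "B": 0.03, "C": 0.04}
rates_2012 = {"A": 0.05, "B": 0.04, "C": 0.05}
did = diff_in_diff(rates_2011, rates_2012)
print(max(did, key=did.get))
```

Even if every raw rate is an overestimate, the sign and relative size of these values remain interpretable as long as the bias moves roughly in parallel across countries.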
  • 60. Results (Soft) Validation: Ireland out-migration rate grew by 2.2% 2011 -> 2012, more than most countries (Irish Central Statistics Office) Mexico also sees a reduction in out-migration (Pew Research Center)
  • 61. “Migration of Professionals to the U.S. - Evidence from LinkedIn Data” Bogdan State, Mario Rodriguez, Dirk Helbing, Emilio Zagheni SocInfo; 2014 http://link.springer.com/chapter/10.1007%2F978-3-319-13734-6_37
  • 62. Data Collection Data for ~200 million LinkedIn Users Complete with education level and city/country of education/job No details about data cleaning/preprocessing included
  • 64. “From Migration Corridors to Clusters: The Value of Google+ Data for Migration Studies” Johnnatan Messias, Fabricio Benevenuto, Ingmar Weber, Emilio Zagheni ASONAM; 2016 http://ieeexplore.ieee.org/document/7752269/
  • 65. Beyond Origin-Destination Migration Analysis I’m a German citizen living in Qatar. So did I migrate from Germany to Qatar? Yes, according to Qatari border control. But: Germany (78->99), United Kingdom (99->03), Germany (03->07), Switzerland (07->09), Spain (09->12), Qatar (12->now) Use the “places lived” on Google+ In 2012, no “currently”, just set of places Get tuples of co-lived countries
  • 66. Flows/Corridors vs. Tuples/Clusters This is what border control can obtain (with directionality) This is what the Google+ “places lived” provides
  • 67. Expected Cluster Frequencies Lots of migrant flows on (A,B), (A,C) and (B,C) => expect lots on (A,B,C) “Expect” = rank clusters according to: min(freqAB; freqAC; freqBC) * mean(freqAB; freqAC; freqBC) Best performing ranking approximation (Kendall .565, Spearman .754) Look at outliers and try to explain those
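The ranking score on this slide, min(freqAB; freqAC; freqBC) * mean(freqAB; freqAC; freqBC), can be computed as in the sketch below. The data layout (pair frequencies keyed by `frozenset`) and the counts are assumptions for illustration.

```python
from itertools import combinations
from statistics import mean


def expected_score(triple, pair_freq):
    """min * mean of the three pairwise co-lived frequencies of a triple."""
    freqs = [pair_freq.get(frozenset(p), 0) for p in combinations(triple, 2)]
    return min(freqs) * mean(freqs)


# Toy (hypothetical) pairwise frequencies of co-lived countries.
pair_freq = {
    frozenset({"ES", "FR"}): 100,
    frozenset({"ES", "IT"}): 80,
    frozenset({"FR", "IT"}): 120,
    frozenset({"BR", "MX"}): 10,
    frozenset({"BR", "US"}): 300,
    frozenset({"MX", "US"}): 500,
}
triples = [("BR", "MX", "US"), ("ES", "FR", "IT")]
ranked = sorted(triples, key=lambda t: expected_score(t, pair_freq), reverse=True)
print(ranked)
```

The `min` term penalizes triples with one weak pair: (BR, MX, US) has two very strong pairs but ranks below (ES, FR, IT) because of its weak (BR, MX) link. The resulting ranking can then be compared with the observed triple frequencies using a rank correlation such as Kendall's tau.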
  • 68. Outlier Frequencies Look at “expected rank – actual rank” Middle 20%: “close to expected” Top 20%: “higher than expected” Low 20%: “lower than expected”
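The three outlier classes can be derived from the rank gap as sketched below. This assumes rank 1 is the most frequent cluster, so a large positive "expected rank – actual rank" means the cluster occurs more often than expected; the function name and toy ranks are hypothetical.

```python
def bucket_by_rank_gap(expected_rank, actual_rank, share=0.2):
    """Split clusters into top, middle and bottom `share` by rank gap."""
    gaps = {c: expected_rank[c] - actual_rank[c] for c in expected_rank}
    ordered = sorted(gaps, key=gaps.get, reverse=True)
    k = max(1, int(len(ordered) * share))
    mid_start = (len(ordered) - k) // 2
    return {
        "higher than expected": ordered[:k],
        "close to expected": ordered[mid_start:mid_start + k],
        "lower than expected": ordered[-k:],
    }


# Toy (hypothetical) ranks for five clusters.
expected = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}
actual = {"a": 4, "b": 2, "c": 5, "d": 1, "e": 3}
buckets = bucket_by_rank_gap(expected, actual)
print(buckets)
```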
  • 69. Feature Analysis More than expected: (Spain, France, Italy) (UAE, India, Singapore) Less than expected: (Brazil, Mexico, USA) (Canada, China, UK) Most discriminative features for 3-class distinction
  • 70. Other Digital Mobility Data: Mobile Phone Data Mostly used for studying mobility (within a country) rather than migration (across countries). Also used for socio-economic estimates (such as income estimates). See work by the following authors for examples (alphabetical order). Joshua Blumenstock, https://scholar.google.com/citations?user=YpxRngIAAAAJ Francesco Calabrese, https://scholar.google.com/citations?user=uoI2RgEAAAAJ Nathan Eagle, https://scholar.google.com/scholar?q=author%3A%22nathan+eagle%22 Cesar Hidalgo, https://scholar.google.com/citations?user=xhCWdtMAAAAJ Alex ‘Sandy’ Pentland, https://scholar.google.com/citations?user=P4nfoKYAAAAJ Andrew Tatem, https://scholar.google.com/citations?user=wt8NpZgAAAAJ
  • 71. More Data Sources Ad Audience Estimates as Digital Census Please consider citing this tutorial if you should use these data sets and tools. See the proceedings for citation details. Stay tuned for forthcoming work using this data.
  • 72. Targeted Advertising as a Digital Census All the Internet giants make money with targeted advertising It’s in their commercial interest to “understand” their users Rich data on both demographic and behavioral attributes Usually not available for outside researchers, but … Some aggregate “audience estimates” available for advertisers: How many users/impressions match criteria X? Supported by (at least) Facebook, Twitter, and Google
  • 73. Facebook’s Advertising Reach Estimates https://www.facebook.com/ads/manager/creation/creation/ https://developers.facebook.com/docs/marketing-api/buying-api/targeting/v2.8 Easy-to-Use Python code https://github.com/maraujo/pySocialWatcher Created by Matheus Araujo at QCRI Contact me if you want to (i) know about important details, and (ii) know what’s in the pipeline.
  • 74. Sneak Preview: Estimating Stocks of Migrants Joint work with Emilio Zagheni and Krishna Gummadi. Currently under review.
  • 75. Twitter’s Advertising Reach Estimates https://dev.twitter.com/ads/reference/1/get/accounts/%3Aaccount_id/reach_estimate https://ads.twitter.com/login
  • 76. Google’s Advertising Reach Estimates https://support.google.com/adwords/answer/2475441?hl=en https://developers.google.com/adwords/api/docs/guides/traffic-estimator-service http://adwords.google.com/
  • 77. Using Online Ads to Reach Migrants So far we only described their use as a passive data source. But ads can also be used as an active outreach channel. Examples below. “Migrant Sampling Using Facebook Advertisements: A Case Study of Polish Migrants in Four European Countries”; S. Pötzschke, M. Braun; 2016 “Using Internet to Recruit Immigrants with Language and Culture Barriers for Tobacco and Alcohol Use Screening: A Study Among Brazilians”; B. H. Carlini, L. Safioti, T. C. Rue, L. Miles; 2014 “Reaching and recruiting Turkish migrants for a clinical trial through Facebook: A process evaluation”; B. Ü. Ince, P. Cuijpers, E. van 't Hof, H. Riper; 2014
  • 78. Google Trends on Steroids Google Trends does not provide demographic information Get DMA-level demographic information (race, income, …) Join with DMA-level Google Trends information Can potentially give “average income of a web search query over time” But often sparsity problems, with data only showing for bigger cities (=> bias) See “The cost of racial animus on a black candidate: Evidence using Google search data”, Seth Stephens-Davidowitz; Journal of Public Economics; 2014 Also: “Demographic information flows”, Ingmar Weber, Alejandro Jaimes; CIKM 2010
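The DMA-level join can be sketched as a weighted average of a demographic attribute, with the query's search interest per DMA as the weight. This is a simplification: it treats the relative Google Trends index as if it were a volume and ignores DMA population. The function name and all numbers are made up.

```python
def query_average(trend_by_dma, attribute_by_dma):
    """Interest-weighted average of a DMA-level attribute for one query."""
    shared = [d for d in trend_by_dma if d in attribute_by_dma]
    total = sum(trend_by_dma[d] for d in shared)
    if total == 0:
        return None  # query too sparse: no DMA with both interest and data
    weighted = sum(trend_by_dma[d] * attribute_by_dma[d] for d in shared)
    return weighted / total


# Toy (hypothetical) search interest and median income per DMA.
trend = {"New York": 80, "Chicago": 20, "Smalltown": 0}
income = {"New York": 70000, "Chicago": 60000}
print(query_average(trend, income))
```

The sparsity problem from the slide shows up directly: DMAs with zero reported interest contribute nothing, so the average is pulled toward the bigger cities.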
  • 79. Recall: Previously Mentioned Data Sources Online genealogy projects Online obituaries Google Correlate (= upload your own data, discover correlated search terms) Geotagged tweets Others? Baby announcements? Wedding invitations?
  • 80. Enriching Your Data Demographic Inference 101 Please consider citing this tutorial if you should use these data sets and tools. See the proceedings for citation details.
  • 81. Demographic Inference – Name Dictionaries First name gender dictionaries: https://ideas.repec.org/c/wip/eccode/10.html http://gender.io/ Contact me for dictionary in “International Gender Differences and Gaps in Online Social Networks” Ethnicity Dictionary: https://www.census.gov/topics/population/genealogy/data/2010_surnames.html Also see “Inferring Nationalities of Twitter Users and Studying Inter-National Linking”
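Dictionary-based gender inference usually amounts to a lookup of the first token of a display name, with a confidence threshold for ambiguous names. A minimal sketch: the dictionary format (first name mapped to the share of female bearers), the threshold, and the entries are assumptions, not the format of the resources listed on this slide.

```python
def infer_gender(display_name, dictionary, min_share=0.9):
    """Return "female", "male", or None if unknown or too ambiguous."""
    tokens = display_name.strip().split()
    if not tokens:
        return None
    share_female = dictionary.get(tokens[0].lower())
    if share_female is None:
        return None
    if share_female >= min_share:
        return "female"
    if share_female <= 1 - min_share:
        return "male"
    return None


# Toy (hypothetical) dictionary: first name -> share of female bearers.
names = {"maria": 0.99, "john": 0.01, "andrea": 0.6}
print(infer_gender("Maria Lopez", names), infer_gender("Andrea Rossi", names))
```

Returning `None` for ambiguous names such as "Andrea" (whose typical gender differs by country) trades coverage for precision; country-specific dictionaries reduce this problem.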
  • 82. Demographic Inference – Image-Based Inference Face++ Cognitive Services https://www.faceplusplus.com/face-detection/ Microsoft Cognitive Services https://www.microsoft.com/cognitive-services/en-us/computer-vision-api
  • 83. Demographic Inference – Build Your Training Data FollowerWonk by Moz https://moz.com/followerwonk/bio https://moz.com/followerwonk/bio/?q=(38-yr%7C38-yrs%7C38%20years)%20old%0A%0A
  • 84. Where to From Here?* *Other than lunch Image from user rculwellmins on Pinterest
  • 85. Where to Go From Here Slides and references, including unused ones, will be posted at: https://sites.google.com/site/digitaldemography/ (Annual?) Workshop at ICWSM: Social Media and Demographic Research, https://sites.google.com/site/smdrworkshop/ Forthcoming special collection on “Social Media and Demographic Research” for Demographic Research, edited by E. Zagheni (http://www.demographic-research.org/info/default.htm)
  • 86. Organizations IUSSP “Big Data and Population Processes” Panel, http://iussp.org/en/panel/big-data-and-population-processes (see their events) UN Global Pulse, http://unglobalpulse.org/ Data-Pop Alliance, http://datapopalliance.org/ Digital Demography email list at UW, https://mailman12.u.washington.edu/mailman/listinfo/digital-demog