Correlated Impulses
Using Facebook Interests to Improve
Predictions of Crime Rates in Urban Areas
Masoomali Fatehkia, Dan O’Brien, Ingmar Weber
@ingmarweber
PLOS ONE 14(2): e0211350, February 2019
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0211350
The Team
Masoomali Fatehkia, QCRI
Daniel O’Brien, NEU
ADVERTISING AUDIENCE ESTIMATES
https://business.facebook.com/adsmanager/creation/
http://fb-doha.qcri.org
http://fb-nyc.qcri.org
Advertising Audience Estimates
+ Global reach with over 2 billion users
+ FB, LinkedIn, Google, Snapchat, IG, ...
+ Real-time estimates
+ Uses anonymous and aggregate data
+ Gender, age, location, country of origin, …
+ Non-traditional attributes such as interests
+ Accessible through APIs
Advertising Audience Estimates
- Black box on how attributes are inferred
- Usage patterns change over time
- The black box itself changes over time
- Only includes people who are online
- Somewhat coarse (smallest reported estimate was 20, now 1,000)
- Possibility to locate vulnerable populations
MODELING CRIME RATES
Background for Exploration
• Ecological theories for community-level variation (social
norms, economic inequalities, …)
• Individual level characteristics (self-control, risk taking, …)
• Could the spatial distribution of individual-level processes
explain the variation in crime between neighborhoods?
• Can we pick up the traits of perpetrators?
• Can we pick up the traits of victims?
Exploratory study
Data Sources
• 9 large cities (out of 25) with “usable” crime data
• Three data sources: (i) Facebook data, (ii) ACS
2015 data, and (iii) crime incident data
• ZIP codes ≈ ZCTA, except parks/office buildings
– ZCTA codes > 10k population in ACS 2015 (65x)
– FB population < 1.5 ACS population (41x)
• Resulted in 432 ZIP codes
• Bias in which ZIP codes remain (bigger/denser)
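The two ZIP-code filters above can be sketched as a simple predicate. This is a minimal illustration, not the authors' code: the `ZipRecord` type, its field names, and the sample values are hypothetical, and only the two thresholds (10k ACS population, Facebook estimate below 1.5 times the ACS population) come from the slide.

```python
from dataclasses import dataclass


@dataclass
class ZipRecord:
    # Hypothetical record layout; field names are assumptions for illustration.
    zcta: str
    acs_population: int  # ACS 2015 population estimate
    fb_population: int   # Facebook advertising audience estimate


def keep_zip(rec, min_acs_pop=10_000, max_fb_ratio=1.5):
    """Return True if the ZIP code passes both filters listed on the slide."""
    if rec.acs_population <= min_acs_pop:
        return False  # too small in ACS 2015
    # Drop areas where the Facebook estimate is implausibly large vs. ACS.
    return rec.fb_population < max_fb_ratio * rec.acs_population


# Made-up example values, not data from the study.
records = [
    ZipRecord("00001", 25_000, 21_000),  # passes both filters
    ZipRecord("00002", 4_000, 3_000),    # fails the population filter
    ZipRecord("00003", 12_000, 30_000),  # fails the FB/ACS ratio filter
]
kept = [r.zcta for r in records if keep_zip(r)]
print(kept)
```

The ratio filter is one plausible way to screen out areas such as office districts, where many Facebook users are present but few people live.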
Facebook Interests Data
• Online dating and relationship status
– m/f in different relationship statuses
– Planned: “open relationship” – too sparse
• Online gaming
– Action games, card games, FPS, …
• Music genres
– Hip hop, blues, electronic, country, …
• Movie genres
– Action, horror, comedy, …
ACS 2015 Demographic Data
% of population aged 15-19
% of population aged 18-24
Median age
% of population one race White
% of population one race Black or African-American
% of households on food stamp benefits
Median family income
% households with income > 150K
% households with income <= 25K
% of population 18-24 with bachelors or associates degree
% of population 18-24 with less than high school degree
% of population 25+ with less than high school degree
% of population 25+ with bachelors or higher degree
Variable groups: Age, Race, Income, Education
Crime Rate Data
• Use geo-coded incident data
• Aggregate by ZIP codes
• Standardize cities’ reporting to the National
Incident-Based Reporting System (NIBRS)
• Compute rates per 100k (using ACS’15 pop)
• Reported crime rates, not crime itself
• Policing strategies
https://www.gimletmedia.com/reply-all/127-the-crime-machine-part-i
https://www.gimletmedia.com/reply-all/128-the-crime-machine-part-ii
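The aggregation and rate computation described on this slide can be sketched as follows. The function name, the input layout (one ZIP code per geo-coded incident), and the sample numbers are assumptions made for illustration; only the per-100k normalization by ACS 2015 population comes from the slide.

```python
from collections import Counter


def crime_rates_per_100k(incident_zips, population_by_zip):
    """Count incidents per ZIP code and scale to a rate per 100,000 residents."""
    counts = Counter(incident_zips)
    return {
        zip_code: 100_000 * counts.get(zip_code, 0) / population
        for zip_code, population in population_by_zip.items()
        if population > 0  # skip ZIP codes without a usable population
    }


# Made-up example values, not data from the study.
incidents = ["00001", "00001", "00002", "00001"]
population = {"00001": 30_000, "00002": 20_000, "00003": 15_000}
rates = crime_rates_per_100k(incidents, population)
print(rates)
```

As the slide notes, the result measures reported crime, so differences in policing strategy flow directly into these rates.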
Data Availability
https://doi.org/10.7910/DVN/MGRADP
Predictive Performance Across Age/Gender
Regularized linear regression (LASSO) with FB-only features was used
to identify which age/gender group is most predictive
Performance is the mean absolute error (MAE) of the crime rate per 100k population
The “all aged 18+” group always gave the best cross-validation (CV) performance
All other experiments were done using only this (broad) subset
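To make the evaluation setup concrete, here is a minimal pure-Python sketch of LASSO fitted by coordinate descent and scored by cross-validated MAE. It is an illustration under stated assumptions, not the study's pipeline: the toy data, the fold assignment, the `alpha` values, and all function names are hypothetical, and the slides do not say which software or settings were used for this step.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the LASSO proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def predict(X, weights, intercept):
    return [intercept + sum(w * x for w, x in zip(weights, row)) for row in X]


def fit_lasso(X, y, alpha, n_sweeps=500):
    """Minimise (1/2n)*sum((y - Xw - b)^2) + alpha*sum(|w|) by coordinate descent."""
    n, p = len(X), len(X[0])
    weights, intercept = [0.0] * p, sum(y) / n
    for _ in range(n_sweeps):
        for j in range(p):
            preds = predict(X, weights, intercept)
            # Average product of feature j with the residual that ignores feature j.
            rho = sum(X[i][j] * (y[i] - preds[i] + weights[j] * X[i][j]) for i in range(n)) / n
            scale = sum(X[i][j] ** 2 for i in range(n)) / n
            weights[j] = soft_threshold(rho, alpha) / scale if scale > 0 else 0.0
        preds_no_intercept = predict(X, weights, 0.0)
        intercept = sum(y[i] - preds_no_intercept[i] for i in range(n)) / n
    return weights, intercept


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def cross_validated_mae(X, y, alpha, n_folds=3):
    """Average held-out MAE over folds assigned by index modulo n_folds."""
    fold_errors = []
    for fold in range(n_folds):
        train = [i for i in range(len(X)) if i % n_folds != fold]
        test = [i for i in range(len(X)) if i % n_folds == fold]
        weights, intercept = fit_lasso([X[i] for i in train], [y[i] for i in train], alpha)
        preds = predict([X[i] for i in test], weights, intercept)
        fold_errors.append(mean_absolute_error([y[i] for i in test], preds))
    return sum(fold_errors) / n_folds


# Toy data: the target depends only on the first feature.
X = [[float(i), float((7 * i) % 5)] for i in range(12)]
y = [2.0 * row[0] + 1.0 for row in X]
cv_error = cross_validated_mae(X, y, alpha=0.01)
heavy_weights, _ = fit_lasso(X, y, alpha=1000.0)
print(cv_error, heavy_weights)
```

With a small penalty the held-out error on this toy data is close to zero, while a very large penalty drives every weight to exactly zero. That sparsity is why LASSO is a natural choice for feature selection, and repeating `cross_validated_mae` once per age/gender feature set is one way to compare the groups.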
Factor Analysis
• Too many features are selected (up to 23),
even with LASSO
• Group features into “factors” for sparser, more
interpretable model
• Do a factor analysis for each of (i) relationship,
(ii) music, (iii) movie, and (iv) gaming features
Music-Related Factors
Factor analysis used R’s ‘factanal’ function (stats package) on the feature correlation matrix.
Hip-hop, soul and related
Model Performance
Each cell pair: adjusted R^2, then marginal gain over city dummies

           Demographics only    Facebook only       Demogr. + Facebook
           Adj. R^2   Gain      Adj. R^2   Gain     Adj. R^2   Gain
Assault    .639       .488      .604       .437     .656       .511
Burglary   .562       .083      .601       .163     .598       .157
Robbery    .558       .411      .528       .371     .581       .441

“Facebook only” not bad
“Demogr. + Facebook” is best
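For readers unfamiliar with the metric, adjusted R^2 penalizes plain R^2 for the number of predictors. The sketch below uses the standard textbook formula; the function names and example numbers are illustrative assumptions. The slide does not say how the marginal gain over city dummies was computed, so that quantity is not reproduced here.

```python
def r_squared(y_true, y_pred):
    """Plain coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_predictors):
    """Standard adjustment: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Made-up example: R^2 of 0.5 from 11 observations and 2 predictors.
example = adjusted_r_squared(0.5, n_samples=11, n_predictors=2)
print(example)
```

Because the penalty grows with the number of predictors, adjusted R^2 allows a fairer comparison between the smaller “Demographics only” model and the larger combined model than plain R^2 would.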
Modeling Assault Crime
Spatial Variation in Assault Crime
Discussion
• Based on “18+ all”: FB’s predictive power lies
less in the behaviors of particular individuals
and more in the overall behavioral ecology
• Hip hop: the only factor that remained in all
models. Indicates culture of crime or of
increased policing?
• “… correcting for demographics”: or just
unmodeled variation (re interaction terms)?
• What else?
Rock, Rap, or Reggaeton? Assessing Mexican Immigrants’
Cultural Assimilation Using Facebook Data
Today, Session 168, 1:00-2:30 PM, Brazos/206
See you in Doha for SocInfo’19!
Speakers include:
Francesco Billari, Emre Kiciman, Katy Börner, Yelena Mejova, Luca
Maria Aiello, Aniko Hannak, and Giovanni Luca Ciampaglia
Submit papers/abstracts by April 15, 2019
Submit tutorials/workshops by April 30, 2019
Thanks!
