Data Journalism at The Baltimore Banner
Nick Thieme
2/23/2023
Some of our work
Let’s dig in
• Choosing a story
• Questions
• Learning
• Contextualizing
Choosing this story
• “I think... if it is true that there are
as many minds as there are heads,
then there are as many kinds of
love as there are hearts.”
• In 2017, my house was almost sold
in tax sale
• Baltimore City’s property tax rate is nearly
double the second highest (~2.2 vs. ~1.4)
• Work on this done before:
– Abell Foundation
– MVLS
– Baltimore Brew
– But no comprehensive look at the data
What questions do we want to answer? What data is
needed?
• How much money do investors make off the property tax system?
• Who is affected?
• What is the geography of property tax?
• Baltimore City property tax sale data
– Records request in: 6/21
– Responsive documents: 8/29
– MPIA (Maryland Public Information Act) violation, one of many
• State land records
– Records request in: 8/8
– Responsive documents: 9/1
• Census data
Learning / data work
• SDAT records:
– 24 jurisdictions in Maryland (23 counties plus Baltimore City), one file from each.
– 277 columns
• Everything you could want about a property
• Location
• Value
• Owner transfers
• City of Baltimore Tax sale data
– 21 columns, most either uninformative or
incomplete
– Importantly, property address, bidder information,
lien information
Learning / data work (2)
• Need to match the SDAT data with the
property lien data
– Property addresses are maintained
differently in different places
– Block/lot/ward doesn’t work because of
multi-unit buildings
– One unit may be liened but another isn’t
• Use postmastr in R to standardize addresses
between SDAT and CoB
• Link shapefile information between joined
SDAT/CoB data and shapefiles with
block/lot/ward since footprint is the same for
all units in a building
• A lot of back and forth making sure the match
was complete
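
The matching step above can be sketched in a few lines. The talk used the postmastr package in R; the Python below is only an illustration of the same idea, not the Banner's code. The abbreviation tables, the "address" field name, and the helper names are all assumptions made for the example.

```python
import re

# Hypothetical abbreviation tables; a real standardizer needs far more entries.
SUFFIXES = {"STREET": "ST", "AVENUE": "AVE", "ROAD": "RD", "BOULEVARD": "BLVD"}
DIRECTIONS = {"NORTH": "N", "SOUTH": "S", "EAST": "E", "WEST": "W"}


def normalize_address(raw: str) -> str:
    """Upper-case, replace punctuation with spaces, and abbreviate common tokens."""
    cleaned = re.sub(r"[^A-Z0-9 ]", " ", raw.upper())
    tokens = cleaned.split()
    tokens = [SUFFIXES.get(t, DIRECTIONS.get(t, t)) for t in tokens]
    return " ".join(tokens)


def match_records(sdat_rows, lien_rows):
    """Join lien rows to SDAT rows on the normalized address.

    Returns (matched pairs, unmatched lien rows) so misses can be reviewed by hand.
    """
    sdat_by_address = {normalize_address(r["address"]): r for r in sdat_rows}
    matched, unmatched = [], []
    for lien in lien_rows:
        key = normalize_address(lien["address"])
        if key in sdat_by_address:
            matched.append((sdat_by_address[key], lien))
        else:
            unmatched.append(lien)
    return matched, unmatched
```

Returning the unmatched rows separately supports the "back and forth" described above: each round of review can add rules to the tables until the leftover list is small enough to resolve manually.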
Learning / data work (3)
• Investors make money in 3 ways
– Flipping homes
– Interest payments
– Attorney’s fees
• Two of the three are tractable
– Court records not granular/dependable enough
for attorney’s fees
• Flipping homes
– Find next sale after tax sale, take difference as
profit
– Most homes sold soon after tax sale
– Assessment values rarely change after tax sale
• Interest
– Interest accrues immediately after tax sale and
until redemption
– Different rates for homeowners and non-homeowners
– Errors in CoB data
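
The two tractable income streams reduce to simple arithmetic, sketched here in Python. This is a sketch under stated assumptions, not the analysis code: simple (non-compounding) interest is assumed, and the rates in RATES are placeholders, not figures from the talk or from statute.

```python
from datetime import date


def flip_profit(tax_sale_price: float, next_sale_price: float) -> float:
    """Profit on a flip: next recorded sale price minus the tax-sale price."""
    return next_sale_price - tax_sale_price


def accrued_interest(lien_amount: float, annual_rate: float,
                     sale_date: date, redemption_date: date) -> float:
    """Simple interest from the tax-sale date until redemption (assumed model)."""
    days = (redemption_date - sale_date).days
    if days < 0:
        raise ValueError("redemption date precedes tax sale date")
    return lien_amount * annual_rate * days / 365


# Placeholder annual rates, NOT taken from the talk; the real rates differ
# by owner-occupancy status and must be looked up.
RATES = {"owner_occupied": 0.12, "non_owner_occupied": 0.18}
```

Because the rate depends on owner-occupancy status, a misclassified record changes the interest computed for that lien, which is one reason errors in the city data matter.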
What did we learn?
• Enormous racial disparities in Baltimore tax sale
– 46% of buildings in SW Baltimore and 42% of Sandtown-Winchester liened
– The more white residents in a tract, the less likely homes
are to be liened
– Logistic mixed-effects GAM supports this
• $37m total in income off tax sale in 6 years
• $27m in flips
• $10m in interest
– $8m in non-owner-occupied
– $2m in owner-occupied
• Misclassification a huge issue
– Of 10,000 homes where owner lives at liened house, 6.6k
listed as non-owner-occupied
– Has implications for tax rate, foreclosure time, protections
Contextualizing
• Arnita Owens-Phillips almost lost her house through tax sale
– Lien purchased by Stonefield Investments
– Tangled title / “heirs property”
– Liens ballooned through interest and tax sale process
– Helped by legal aid fund
• Edmondson Community Center sold through tax sale
– $5,000 on $2,500 of liens
– Investor flipped center for $140,000
• Legal changes
– HPP
– Judicial in rem
– Changing misclassification
Choosing this story
• Reporter Jessica Calefati reached out
8/30/22 about Johns Hopkins creating a
new police force
• Pre-reporting already finished
– Hopkins paused the plans in 2020 for two
years
– Originally implemented because of a “crime
wave” on Hopkins campus
– Shapefiles of proposed jurisdictions
What questions do we want to answer? What
data is needed?
• What do crime trends in the proposed jurisdictions
look like?
• Do they depend on the campus? Year? Crime type?
• How do those trends fit with Hopkins’ stated rationale
for creating a police force?
• Crime data from Baltimore Police Department / Open
Baltimore
• Shapefiles of proposed jurisdiction
– Non-existent
– Need to be created from reports
• https://publicsafety.jhu.edu/assets/uploads/sites/9/2022/08/8_JHPD_PoliceDept_Maps-Homewood-8.4.22.pdf
• https://publicsafety.jhu.edu/assets/uploads/sites/9/2022/08/9_JHPD_PoliceDept_Maps-East-Baltimore-8.4.22.pdf
• https://publicsafety.jhu.edu/assets/uploads/sites/9/2022/08/7_JHPD_PoliceDept_Maps-Peabody-8.4.22.pdf
• QGIS by hand
Learning/data work
• BPD crime data
– Has geolocation information for
crimes
– Is victim-level, want incident level!
• Shapefiles in arbitrary coordinate
system
– Need to convert to WGS-84
• Group BPD data by crime type,
location, time, inside-outside,… to
reduce victim-level to incident level
• Use sf to convert shapefiles to right
CRS, and join with BPD data
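
The two steps above, collapsing victim rows to incidents and then keeping only incidents inside a jurisdiction, can be illustrated in plain Python. The talk did this with sf in R; the sketch below is not that code. It assumes the points and the polygon are already in the same coordinate system (so no CRS conversion is shown), and the column names are hypothetical.

```python
def to_incidents(victim_rows,
                 keys=("crime_type", "location", "datetime", "inside_outside")):
    """Collapse victim-level rows into one row per unique combination of keys."""
    incidents = {}
    for row in victim_rows:
        key = tuple(row[k] for k in keys)
        if key not in incidents:
            incident = {k: row[k] for k in keys}
            incident["lon"], incident["lat"] = row["lon"], row["lat"]
            incident["victims"] = 0
            incidents[key] = incident
        incidents[key]["victims"] += 1
    return list(incidents.values())


def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray extending right from the point cross this edge?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside
```

Keeping a victim count on each incident makes it possible to check the collapsed data against the original victim-level totals afterwards.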
Learning/data work (2)
• Care more about crime rates than raw crime numbers
• Need to combine with ACS data about populations
• Linear interpolation
– Not perfect, but can check whether resulting population counts in campus make sense
• A lot of checking
– Do the yearly victim numbers in jurisdictions agree with the BPD data? Sample and check
by hand
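
A rate calculation along these lines might look like the Python sketch below. The slide does not say exactly what is interpolated, so this assumes area-weighted apportionment: each census tract contributes population in proportion to the share of its area that falls inside the jurisdiction. The field names and the per-1,000 scaling are illustrative assumptions.

```python
def apportioned_population(tracts):
    """Sum each tract's population scaled by the share of its area inside the jurisdiction."""
    return sum(t["population"] * t["overlap_area"] / t["tract_area"] for t in tracts)


def rate_per_1000(incidents: int, population: float) -> float:
    """Incidents per 1,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 1000 * incidents / population
```

The estimate treats residents as evenly spread across each tract, which is why the resulting campus populations need a sanity check.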
What did we learn?
• Property crimes down across all
three campuses
• Violent crime steady or down
• No “crime wave” since plan stopped
Contextualizing
• “We are still besieged by violence, and it’s unacceptable” – Branville Bard Jr.,
Hopkins’ Vice President for Public Safety
• No other private universities in Maryland have private police forces
• Student and faculty opposition
• JHU police can only police strictly within the bounds of the jurisdiction
Defending your
work
• An essential part of data journalism is
defending your own work
• We create facts in a way journalists
typically do not
• Our work is only as good as the
science behind it
• When the science is sound but
challenged, we should respond
substantively to critiques
Choosing this story
• Experimenting with pure data stories
• NYT advanced visual journalism,
teaching readers a new visual
language
• COVID taught readers about models
and exponential growth
• Next decade of journalism will teach
readers a new statistical language
What questions do we want to answer? / What
data is needed?
• How did voting trends in Maryland and in Baltimore City / County change between 2018
and 2022?
• Did Black and white voters differ in their preference for Wes Moore?
• Did Black and white voters differ in their preference for cannabis legalization?
• Interesting answers to these questions require more granular data than county-level
results
– Need precinct-level voting results from MD State Board of Elections from 2022
– Precinct-level results from 2018 from Metric Geometry and Gerrymandering Group at MIT
– Precinct shapefiles from Maryland Department of Planning
• Census data
Learning / data work
• Precinct-level results come in XML form, requires parsing
with xml2 in R
• Precinct names in voting data and shapefile data often
differ, need to match
– No obvious way to do this, need to examine by hand
(needed to remove two 0’s in the middle of the names
in some cases, just one in others. Automatable, but
needs to be discovered manually)
– Much, much worse in Georgia
• Missing data from Montgomery County and Kent County
• This is the fun but dangerous part: data analysis
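
The parsing and name-matching steps above can be sketched as follows. The talk used xml2 in R; this Python version uses the standard library instead. The XML layout and precinct name formats below are hypothetical, since neither the real State Board of Elections schema nor the real naming pattern is shown in the talk. The normalization rule (drop leading zeros from each numeric part) is one plausible automation of the by-hand discovery described above.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical layout, invented for illustration only.
SAMPLE_XML = """
<results>
  <precinct name="01-001"><choice name="Moore" votes="120"/><choice name="Cox" votes="30"/></precinct>
  <precinct name="02-0010"><choice name="Moore" votes="80"/><choice name="Cox" votes="95"/></precinct>
</results>
"""


def parse_results(xml_text):
    """Return {precinct name: {candidate: votes}} from the hypothetical layout above."""
    root = ET.fromstring(xml_text.strip())
    return {
        p.get("name"): {c.get("name"): int(c.get("votes")) for c in p.findall("choice")}
        for p in root.findall("precinct")
    }


def precinct_key(name):
    """Strip leading zeros from each numeric part so '01-001' and '1-1' match."""
    parts = re.findall(r"\d+", name)
    return "-".join(str(int(part)) for part in parts)
```

Applying precinct_key to both the voting data and the shapefile attributes gives a shared join key; any precinct left unmatched after that still needs a manual look.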
What did we learn?
• Moore outperformed Jealous / Cox performed
worse than Hogan almost everywhere
• Precinct-level data lets us see:
– the incredible expansion of blue territory in the
west part of the County and City from 2018 to
2022
– The expansion of blue between D.C. and
Baltimore and the lightening of red support all
around the state
What did we learn? (2)
• Moore preference related to race
– <20% Black census tracts had fully mixed
opinions on Moore
– >80% Black nearly all for Moore
– Log relationship
• Strong majority of voters wanted legal weed
– More support in Black community
– Driven by:
• long left tail in white
• Less variation in Black
– Violin plot!
Thank you
Nick Thieme, The Baltimore Banner
Nick.thieme@thebaltimorebanner.com
703-850-0935
Twitter: @FurrierTranform

More Related Content

Similar to Data Journalism at The Baltimore Banner

No More Half Fast: Improving US Broadband Download Speed. Georgetown Universi...
No More Half Fast: Improving US Broadband Download Speed. Georgetown Universi...No More Half Fast: Improving US Broadband Download Speed. Georgetown Universi...
No More Half Fast: Improving US Broadband Download Speed. Georgetown Universi...Brittne Kakulla, Ph.D.
 
Feasibility Studies: Finley Engineerin
Feasibility Studies: Finley EngineerinFeasibility Studies: Finley Engineerin
Feasibility Studies: Finley EngineerinAnn Treacy
 
Diversion First Stakeholders Group Meeting: Sept. 17, 2018
Diversion First Stakeholders Group Meeting: Sept. 17, 2018Diversion First Stakeholders Group Meeting: Sept. 17, 2018
Diversion First Stakeholders Group Meeting: Sept. 17, 2018Fairfax County
 
Discoverable Client Issues Using Public Big Data
Discoverable Client Issues Using Public Big DataDiscoverable Client Issues Using Public Big Data
Discoverable Client Issues Using Public Big DataMatthew Stubenberg
 
Minnesota Busts the ‘Broadband Is Too Expensive’ Myth
Minnesota Busts the ‘Broadband Is Too Expensive’ MythMinnesota Busts the ‘Broadband Is Too Expensive’ Myth
Minnesota Busts the ‘Broadband Is Too Expensive’ MythAnn Treacy
 
Jack geller challenges of measuring broadband adoption
Jack geller   challenges of measuring broadband adoptionJack geller   challenges of measuring broadband adoption
Jack geller challenges of measuring broadband adoptionAnn Treacy
 
What's Old is New Again
What's Old is New AgainWhat's Old is New Again
What's Old is New AgainTom Blefko
 
Measuring ROI of Rural Broadband Investments: Stories of Success in Five Rura...
Measuring ROI of Rural Broadband Investments: Stories of Success in Five Rura...Measuring ROI of Rural Broadband Investments: Stories of Success in Five Rura...
Measuring ROI of Rural Broadband Investments: Stories of Success in Five Rura...Ann Treacy
 
HNA Slides Show and Tell 14.10.21
HNA Slides Show and Tell 14.10.21 HNA Slides Show and Tell 14.10.21
HNA Slides Show and Tell 14.10.21 PAS_Team
 
2014: NJ GMIS: Legal Issues Surrounding the Web and Social Media
2014: NJ GMIS: Legal Issues Surrounding the Web and Social Media2014: NJ GMIS: Legal Issues Surrounding the Web and Social Media
2014: NJ GMIS: Legal Issues Surrounding the Web and Social MediaCarol Spencer
 
Community Engagement for Complete Communities
Community Engagement for Complete CommunitiesCommunity Engagement for Complete Communities
Community Engagement for Complete CommunitiesRPO America
 
Guidance to-use-lending-data for local authorities
Guidance to-use-lending-data for local authoritiesGuidance to-use-lending-data for local authorities
Guidance to-use-lending-data for local authoritiesRichard Browne
 
Eye on the E-Citizen - Great numbers and perspective from 2002
Eye on the E-Citizen - Great numbers and perspective from 2002Eye on the E-Citizen - Great numbers and perspective from 2002
Eye on the E-Citizen - Great numbers and perspective from 2002Steven Clift
 
Homework gap presentation
Homework gap presentationHomework gap presentation
Homework gap presentationEducationNC
 
10/31/2019: 2020 Census
10/31/2019: 2020 Census10/31/2019: 2020 Census
10/31/2019: 2020 Censusprofcyclist
 
Canada's reforming of Social Welfare Programs
Canada's reforming of Social Welfare Programs Canada's reforming of Social Welfare Programs
Canada's reforming of Social Welfare Programs paul young cpa, cga
 
Week 10, budget approval and budget communication
Week 10, budget approval and budget communicationWeek 10, budget approval and budget communication
Week 10, budget approval and budget communicationwillshatcher
 
Class_5_Data_2018W_pptx.pptx
Class_5_Data_2018W_pptx.pptxClass_5_Data_2018W_pptx.pptx
Class_5_Data_2018W_pptx.pptxccaskumba
 

Similar to Data Journalism at The Baltimore Banner (20)

Boston
BostonBoston
Boston
 
No More Half Fast: Improving US Broadband Download Speed. Georgetown Universi...
No More Half Fast: Improving US Broadband Download Speed. Georgetown Universi...No More Half Fast: Improving US Broadband Download Speed. Georgetown Universi...
No More Half Fast: Improving US Broadband Download Speed. Georgetown Universi...
 
Feasibility Studies: Finley Engineerin
Feasibility Studies: Finley EngineerinFeasibility Studies: Finley Engineerin
Feasibility Studies: Finley Engineerin
 
Diversion First Stakeholders Group Meeting: Sept. 17, 2018
Diversion First Stakeholders Group Meeting: Sept. 17, 2018Diversion First Stakeholders Group Meeting: Sept. 17, 2018
Diversion First Stakeholders Group Meeting: Sept. 17, 2018
 
Discoverable Client Issues Using Public Big Data
Discoverable Client Issues Using Public Big DataDiscoverable Client Issues Using Public Big Data
Discoverable Client Issues Using Public Big Data
 
Minnesota Busts the ‘Broadband Is Too Expensive’ Myth
Minnesota Busts the ‘Broadband Is Too Expensive’ MythMinnesota Busts the ‘Broadband Is Too Expensive’ Myth
Minnesota Busts the ‘Broadband Is Too Expensive’ Myth
 
Jack geller challenges of measuring broadband adoption
Jack geller   challenges of measuring broadband adoptionJack geller   challenges of measuring broadband adoption
Jack geller challenges of measuring broadband adoption
 
What's Old is New Again
What's Old is New AgainWhat's Old is New Again
What's Old is New Again
 
Measuring ROI of Rural Broadband Investments: Stories of Success in Five Rura...
Measuring ROI of Rural Broadband Investments: Stories of Success in Five Rura...Measuring ROI of Rural Broadband Investments: Stories of Success in Five Rura...
Measuring ROI of Rural Broadband Investments: Stories of Success in Five Rura...
 
HNA Slides Show and Tell 14.10.21
HNA Slides Show and Tell 14.10.21 HNA Slides Show and Tell 14.10.21
HNA Slides Show and Tell 14.10.21
 
2014: NJ GMIS: Legal Issues Surrounding the Web and Social Media
2014: NJ GMIS: Legal Issues Surrounding the Web and Social Media2014: NJ GMIS: Legal Issues Surrounding the Web and Social Media
2014: NJ GMIS: Legal Issues Surrounding the Web and Social Media
 
Community Engagement for Complete Communities
Community Engagement for Complete CommunitiesCommunity Engagement for Complete Communities
Community Engagement for Complete Communities
 
Guidance to-use-lending-data for local authorities
Guidance to-use-lending-data for local authoritiesGuidance to-use-lending-data for local authorities
Guidance to-use-lending-data for local authorities
 
Eye on the E-Citizen - Great numbers and perspective from 2002
Eye on the E-Citizen - Great numbers and perspective from 2002Eye on the E-Citizen - Great numbers and perspective from 2002
Eye on the E-Citizen - Great numbers and perspective from 2002
 
Homework gap presentation
Homework gap presentationHomework gap presentation
Homework gap presentation
 
10/31/2019: 2020 Census
10/31/2019: 2020 Census10/31/2019: 2020 Census
10/31/2019: 2020 Census
 
Canada's reforming of Social Welfare Programs
Canada's reforming of Social Welfare Programs Canada's reforming of Social Welfare Programs
Canada's reforming of Social Welfare Programs
 
The BC Regional Experience
The BC Regional ExperienceThe BC Regional Experience
The BC Regional Experience
 
Week 10, budget approval and budget communication
Week 10, budget approval and budget communicationWeek 10, budget approval and budget communication
Week 10, budget approval and budget communication
 
Class_5_Data_2018W_pptx.pptx
Class_5_Data_2018W_pptx.pptxClass_5_Data_2018W_pptx.pptx
Class_5_Data_2018W_pptx.pptx
 

More from Data Works MD

Jolt’s Picks - Machine Learning and Major League Baseball Hit Streaks
Jolt’s Picks - Machine Learning and Major League Baseball Hit StreaksJolt’s Picks - Machine Learning and Major League Baseball Hit Streaks
Jolt’s Picks - Machine Learning and Major League Baseball Hit StreaksData Works MD
 
Introducing DataWave
Introducing DataWaveIntroducing DataWave
Introducing DataWaveData Works MD
 
Malware Detection, Enabled by Machine Learning
Malware Detection, Enabled by Machine LearningMalware Detection, Enabled by Machine Learning
Malware Detection, Enabled by Machine LearningData Works MD
 
Using AWS, Terraform, and Ansible to Automate Splunk at Scale
Using AWS, Terraform, and Ansible to Automate Splunk at ScaleUsing AWS, Terraform, and Ansible to Automate Splunk at Scale
Using AWS, Terraform, and Ansible to Automate Splunk at ScaleData Works MD
 
A Day in the Life of a Data Journalist
A Day in the Life of a Data JournalistA Day in the Life of a Data Journalist
A Day in the Life of a Data JournalistData Works MD
 
Robotics and Machine Learning: Working with NVIDIA Jetson Kits
Robotics and Machine Learning: Working with NVIDIA Jetson KitsRobotics and Machine Learning: Working with NVIDIA Jetson Kits
Robotics and Machine Learning: Working with NVIDIA Jetson KitsData Works MD
 
Connect Data and Devices with Apache NiFi
Connect Data and Devices with Apache NiFiConnect Data and Devices with Apache NiFi
Connect Data and Devices with Apache NiFiData Works MD
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine LearningData Works MD
 
Data in the City: Analytics and Civic Data in Baltimore
Data in the City: Analytics and Civic Data in BaltimoreData in the City: Analytics and Civic Data in Baltimore
Data in the City: Analytics and Civic Data in BaltimoreData Works MD
 
Exploring Correlation Between Sentiment of Environmental Tweets and the Stock...
Exploring Correlation Between Sentiment of Environmental Tweets and the Stock...Exploring Correlation Between Sentiment of Environmental Tweets and the Stock...
Exploring Correlation Between Sentiment of Environmental Tweets and the Stock...Data Works MD
 
Automated Software Requirements Labeling
Automated Software Requirements LabelingAutomated Software Requirements Labeling
Automated Software Requirements LabelingData Works MD
 
Introduction to Elasticsearch for Business Intelligence and Application Insights
Introduction to Elasticsearch for Business Intelligence and Application InsightsIntroduction to Elasticsearch for Business Intelligence and Application Insights
Introduction to Elasticsearch for Business Intelligence and Application InsightsData Works MD
 
An Asynchronous Distributed Deep Learning Based Intrusion Detection System fo...
An Asynchronous Distributed Deep Learning Based Intrusion Detection System fo...An Asynchronous Distributed Deep Learning Based Intrusion Detection System fo...
An Asynchronous Distributed Deep Learning Based Intrusion Detection System fo...Data Works MD
 
RAPIDS – Open GPU-accelerated Data Science
RAPIDS – Open GPU-accelerated Data ScienceRAPIDS – Open GPU-accelerated Data Science
RAPIDS – Open GPU-accelerated Data ScienceData Works MD
 
Two Algorithms for Weakly Supervised Denoising of EEG Data
Two Algorithms for Weakly Supervised Denoising of EEG DataTwo Algorithms for Weakly Supervised Denoising of EEG Data
Two Algorithms for Weakly Supervised Denoising of EEG DataData Works MD
 
Detecting Lateral Movement with a Compute-Intense Graph Kernel
Detecting Lateral Movement with a Compute-Intense Graph KernelDetecting Lateral Movement with a Compute-Intense Graph Kernel
Detecting Lateral Movement with a Compute-Intense Graph KernelData Works MD
 
Predictive Analytics and Neighborhood Health
Predictive Analytics and Neighborhood HealthPredictive Analytics and Neighborhood Health
Predictive Analytics and Neighborhood HealthData Works MD
 
Social Network Analysis Workshop
Social Network Analysis WorkshopSocial Network Analysis Workshop
Social Network Analysis WorkshopData Works MD
 

More from Data Works MD (18)

Jolt’s Picks - Machine Learning and Major League Baseball Hit Streaks
Jolt’s Picks - Machine Learning and Major League Baseball Hit StreaksJolt’s Picks - Machine Learning and Major League Baseball Hit Streaks
Jolt’s Picks - Machine Learning and Major League Baseball Hit Streaks
 
Introducing DataWave
Introducing DataWaveIntroducing DataWave
Introducing DataWave
 
Malware Detection, Enabled by Machine Learning
Malware Detection, Enabled by Machine LearningMalware Detection, Enabled by Machine Learning
Malware Detection, Enabled by Machine Learning
 
Using AWS, Terraform, and Ansible to Automate Splunk at Scale
Using AWS, Terraform, and Ansible to Automate Splunk at ScaleUsing AWS, Terraform, and Ansible to Automate Splunk at Scale
Using AWS, Terraform, and Ansible to Automate Splunk at Scale
 
A Day in the Life of a Data Journalist
A Day in the Life of a Data JournalistA Day in the Life of a Data Journalist
A Day in the Life of a Data Journalist
 
Robotics and Machine Learning: Working with NVIDIA Jetson Kits
Robotics and Machine Learning: Working with NVIDIA Jetson KitsRobotics and Machine Learning: Working with NVIDIA Jetson Kits
Robotics and Machine Learning: Working with NVIDIA Jetson Kits
 
Connect Data and Devices with Apache NiFi
Connect Data and Devices with Apache NiFiConnect Data and Devices with Apache NiFi
Connect Data and Devices with Apache NiFi
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
Data in the City: Analytics and Civic Data in Baltimore
Data in the City: Analytics and Civic Data in BaltimoreData in the City: Analytics and Civic Data in Baltimore
Data in the City: Analytics and Civic Data in Baltimore
 
Exploring Correlation Between Sentiment of Environmental Tweets and the Stock...
Exploring Correlation Between Sentiment of Environmental Tweets and the Stock...Exploring Correlation Between Sentiment of Environmental Tweets and the Stock...
Exploring Correlation Between Sentiment of Environmental Tweets and the Stock...
 
Automated Software Requirements Labeling
Automated Software Requirements LabelingAutomated Software Requirements Labeling
Automated Software Requirements Labeling
 
Introduction to Elasticsearch for Business Intelligence and Application Insights
Introduction to Elasticsearch for Business Intelligence and Application InsightsIntroduction to Elasticsearch for Business Intelligence and Application Insights
Introduction to Elasticsearch for Business Intelligence and Application Insights
 
An Asynchronous Distributed Deep Learning Based Intrusion Detection System fo...
An Asynchronous Distributed Deep Learning Based Intrusion Detection System fo...An Asynchronous Distributed Deep Learning Based Intrusion Detection System fo...
An Asynchronous Distributed Deep Learning Based Intrusion Detection System fo...
 
RAPIDS – Open GPU-accelerated Data Science
RAPIDS – Open GPU-accelerated Data ScienceRAPIDS – Open GPU-accelerated Data Science
RAPIDS – Open GPU-accelerated Data Science
 
Two Algorithms for Weakly Supervised Denoising of EEG Data
Two Algorithms for Weakly Supervised Denoising of EEG DataTwo Algorithms for Weakly Supervised Denoising of EEG Data
Two Algorithms for Weakly Supervised Denoising of EEG Data
 
Detecting Lateral Movement with a Compute-Intense Graph Kernel
Detecting Lateral Movement with a Compute-Intense Graph KernelDetecting Lateral Movement with a Compute-Intense Graph Kernel
Detecting Lateral Movement with a Compute-Intense Graph Kernel
 
Predictive Analytics and Neighborhood Health
Predictive Analytics and Neighborhood HealthPredictive Analytics and Neighborhood Health
Predictive Analytics and Neighborhood Health
 
Social Network Analysis Workshop
Social Network Analysis WorkshopSocial Network Analysis Workshop
Social Network Analysis Workshop
 

Recently uploaded

From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...Florian Roscheck
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一ffjhghh
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubaihf8803863
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改atducpo
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Predicting Employee Churn: A Data-Driven Approach Project Presentation
Predicting Employee Churn: A Data-Driven Approach Project PresentationPredicting Employee Churn: A Data-Driven Approach Project Presentation
Predicting Employee Churn: A Data-Driven Approach Project PresentationBoston Institute of Analytics
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystSamantha Rae Coolbeth
 
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...shivangimorya083
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptSonatrach
 
Aminabad Call Girl Agent 9548273370 , Call Girls Service Lucknow
Aminabad Call Girl Agent 9548273370 , Call Girls Service LucknowAminabad Call Girl Agent 9548273370 , Call Girls Service Lucknow
Aminabad Call Girl Agent 9548273370 , Call Girls Service Lucknowmakika9823
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 

Recently uploaded (20)

From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Data Journalism at The Baltimore Banner

  • 1. Data Journalism at The Baltimore Banner, Nick Thieme, 2/23/2023
  • 2. Some of our work
  • 3. Let’s dig in
      • Choosing a story
      • Questions
      • Learning
      • Contextualizing
  • 4. Choosing this story
      • “I think... if it is true that there are as many minds as there are heads, then there are as many kinds of love as there are hearts.”
      • In 2017, my house was almost sold in tax sale
      • Baltimore City’s property tax rate is nearly double the second highest (~2.2 vs. ~1.4)
      • Work on this done before:
          – Abell Foundation
          – MVLS
          – Baltimore Brew
          – No comprehensive data look
  • 5. What question do we want to answer? What data is needed?
      • How much money do investors make off the property tax system?
      • Who is affected?
      • What is the geography of property tax?
      • Baltimore City property tax sale data
          – Records request in: 6/21
          – Responsive documents: 8/29
          – MPIA violation, one of many
      • State land records
          – Records request in: 8/8
          – Responsive documents: 9/1
      • Census data
  • 6. Learning / data work
      • SDAT records:
          – 24 jurisdictions in Maryland (23 counties plus Baltimore City), one file from each
          – 277 columns
              • Everything you could want about a property
              • Location
              • Value
              • Owner transfers
      • City of Baltimore tax sale data
          – 21 columns, most either uninformative or incomplete
          – Importantly, property address, bidder information, lien information
  • 7. Learning / data work (2)
      • Need to match the SDAT data with the property lien data
          – Property addresses are maintained differently in different places
          – Block/lot/ward doesn’t work because of multi-unit buildings
          – One unit may be liened but another isn’t
      • Use postmastr in R to standardize addresses between SDAT and City of Baltimore (CoB) data
      • Link shapefile information between joined SDAT/CoB data and shapefiles with block/lot/ward, since the footprint is the same for all units in a building
      • A lot of back and forth making sure the match was complete
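The address-matching step on this slide can be illustrated with a small sketch. The talk used postmastr in R; the Python below is not that package's algorithm, only a minimal stand-in under stated assumptions. The abbreviation tables, the `address` field name, and both function names are hypothetical.

```python
import re

# Hypothetical abbreviation tables; real standardizers use far larger dictionaries.
SUFFIXES = {"STREET": "ST", "AVENUE": "AVE", "ROAD": "RD", "BOULEVARD": "BLVD"}
DIRECTIONS = {"NORTH": "N", "SOUTH": "S", "EAST": "E", "WEST": "W"}


def normalize_address(raw: str) -> str:
    """Uppercase, drop punctuation, and abbreviate common street words."""
    cleaned = re.sub(r"[^\w\s#]", " ", raw.upper())
    tokens = [SUFFIXES.get(t, DIRECTIONS.get(t, t)) for t in cleaned.split()]
    return " ".join(tokens)


def join_on_address(sdat_rows, lien_rows):
    """Join lien rows to SDAT rows on the normalized address.

    Returns (matched, unmatched) so the unmatched liens can be reviewed by hand.
    """
    sdat_by_addr = {normalize_address(r["address"]): r for r in sdat_rows}
    matched, unmatched = [], []
    for lien in lien_rows:
        key = normalize_address(lien["address"])
        if key in sdat_by_addr:
            matched.append({**sdat_by_addr[key], **lien, "address": key})
        else:
            unmatched.append(lien)
    return matched, unmatched
```

Returning the unmatched rows separately mirrors the "back and forth" on the slide: each round of rule changes can be judged by how many liens remain unmatched.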
  • 8. Learning / data work (3)
      • Investors make money in 3 ways
          – Flipping homes
          – Interest payments
          – Attorney’s fees
      • 2 of 3 tractable
          – Court records not granular/dependable enough for attorney’s fees
      • Flipping homes
          – Find next sale after tax sale, take difference as profit
          – Most homes sold soon after tax sale
          – Assessment values rarely change after tax sale
      • Interest
          – Interest accrues immediately after tax sale and until redemption
          – Different rates for homeowners and non-homeowners
          – Errors in CoB data
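A rough sketch of the two tractable calculations follows. It rests on assumptions the slide does not state: that flip profit is the next sale price minus the investor's acquisition cost, and that interest is simple rather than compounding. The function names and the rate passed in are placeholders, not Baltimore's actual figures.

```python
from datetime import date


def next_sale_after(sales, tax_sale_date):
    """Return the earliest (sale_date, price) strictly after the tax sale, or None."""
    later = [s for s in sales if s[0] > tax_sale_date]
    return min(later, key=lambda s: s[0]) if later else None


def flip_profit(sales, tax_sale_date, acquisition_cost):
    """Assumed definition: next sale price minus what the investor paid."""
    nxt = next_sale_after(sales, tax_sale_date)
    return None if nxt is None else nxt[1] - acquisition_cost


def simple_interest(lien_amount, annual_rate, tax_sale_date, redemption_date):
    """Simple interest accrued from the tax sale until redemption (assumed non-compounding)."""
    days = (redemption_date - tax_sale_date).days
    return lien_amount * annual_rate * days / 365
```

Because owner-occupied and non-owner-occupied properties carry different rates, `annual_rate` would be chosen per property, which is why misclassified occupancy changes the result.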
  • 9. What did we learn?
      • Enormous racial disparities in Baltimore tax sale
          – 46% of buildings in SW Baltimore and 42% of Sandtown-Winchester liened
          – The more white residents in a tract, the less likely homes are to be liened
          – Logistic mixed-effects GAM supports this
      • $37m total in income off tax sale in 6 years
      • $27m in flips
      • $10m in interest
          – $8m in non-owner-occupied
          – $2m in owner-occupied
      • Misclassification a huge issue
          – Of 10,000 homes where owner lives at liened house, 6.6k listed as non-owner-occupied
          – Has implications for tax rate, foreclosure time, protections
  • 10. Contextualizing
      • Arnita Owens-Phillips almost lost her house through tax sale
          – Lien purchased by Stonefield Investments
          – Tangled title / “heirs property”
          – Liens ballooned through interest and tax sale process
          – Helped by legal aid fund
      • Edmondson Community Center sold through tax sale
          – $5,000 on $2,500 of liens
          – Investor flipped center for $140,000
      • Legal changes
          – HPP
          – Judicial in rem
          – Changing misclassification
  • 11. Choosing this story
      • Reporter Jessica Calefati reached out 8/30/22 about Johns Hopkins creating a new police force
      • Pre-reporting already finished
          – Hopkins paused the plans in 2020 for two years
          – Originally implemented because of a “crime wave” on Hopkins campus
          – Shapefiles of proposed jurisdictions
  • 12. What questions do we want to answer? What data is needed?
      • What do crime trends in the proposed jurisdictions look like?
      • Do they depend on the campus? Year? Crime type?
      • How do those trends fit with Hopkins’ stated rationale for creating a police force?
      • Crime data from Baltimore Police Department / Open Baltimore
      • Shapefiles of proposed jurisdiction
          – Non-existent
          – Need to be created from reports
              • https://publicsafety.jhu.edu/assets/uploads/sites/9/2022/08/8_JHPD_PoliceDept_Maps-Homewood-8.4.22.pdf
              • https://publicsafety.jhu.edu/assets/uploads/sites/9/2022/08/9_JHPD_PoliceDept_Maps-East-Baltimore-8.4.22.pdf
              • https://publicsafety.jhu.edu/assets/uploads/sites/9/2022/08/7_JHPD_PoliceDept_Maps-Peabody-8.4.22.pdf
              • QGIS by hand
  • 13. Learning/data work
      • BPD crime data
          – Has geolocation information for crimes
          – Is victim-level, want incident-level!
      • Shapefiles in arbitrary coordinate system
          – Need to convert to WGS-84
      • Group BPD data by crime type, location, time, inside-outside,… to reduce victim-level to incident-level
      • Use sf to convert shapefiles to the right CRS, and join with BPD data
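The victim-to-incident reduction described here amounts to collapsing rows that share the same identifying fields. A minimal sketch, assuming hypothetical column names (`crime_type`, `location`, `datetime`, `inside_outside`) that may not match the actual BPD export:

```python
from collections import Counter

# Hypothetical column names; the real BPD export may label these differently.
INCIDENT_KEYS = ("crime_type", "location", "datetime", "inside_outside")


def to_incidents(victim_rows):
    """Collapse victim-level rows into one row per incident, keeping a victim count."""
    counts = Counter(tuple(row[k] for k in INCIDENT_KEYS) for row in victim_rows)
    return [dict(zip(INCIDENT_KEYS, key), victims=n) for key, n in counts.items()]
```

Keeping the `victims` count makes it possible to check later that incident totals still sum to the original victim totals.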
  • 14. Learning/data work (2)
      • Care more about crime rates than raw crime numbers
      • Need to combine with ACS data about populations
      • Linear interpolation
          – Not perfect, but can check whether resulting population counts in campus make sense
      • A lot of checking
          – Do the yearly victim numbers in jurisdictions agree with the BPD data? Sample and check by hand
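The slide does not say how the interpolation was carried out. One common reading, assumed here, is area-weighted apportionment: each census tract contributes population in proportion to the share of its area that falls inside the jurisdiction. The function names and the input layout are hypothetical.

```python
def interpolate_population(tracts):
    """Estimate jurisdiction population from overlapping tracts.

    tracts: iterable of (tract_population, tract_area, overlap_area),
    with both areas in the same units. Assumes population is spread
    evenly within each tract, which is the imperfect part.
    """
    return sum(pop * overlap / area for pop, area, overlap in tracts)


def rate_per_1000(incidents, population):
    """Crime rate per 1,000 residents."""
    return 1000 * incidents / population
```

The sanity check on the slide then becomes a comparison of `interpolate_population(...)` against what is independently known about how many people live on or near each campus.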
  • 15. What did we learn?
      • Property crimes down across all three campuses
      • Violent crime steady or down
      • No “crime wave” since plan stopped
  • 16. Contextualizing
      • “We are still besieged by violence, and it’s unacceptable”
          – Branville Bard Jr., Hopkins’ Vice President for Public Safety
      • No other private universities in Maryland have private police forces
      • Student and faculty opposition
      • JHU police can only police strictly within the bounds of the jurisdiction
  • 17. Defending your work
      • An essential part of data journalism is defending your own work
      • We create facts in a way journalists typically do not
      • Our work is only as good as the science behind it
      • When the science is sound but challenged, we should respond substantively to critiques
  • 18. Choosing this story
      • Experimenting with pure data stories
      • NYT advanced visual journalism, teaching readers a new visual language
      • COVID taught readers about models and exponential growth
      • Next decade of journalism will teach readers a new statistical language
  • 19. What questions do we want to answer? / What data is needed?
      • How did voting trends in Maryland and in Baltimore City / County change between 2018 and 2022?
      • Did Black and white voters differ in their preference for Wes Moore?
      • Did Black and white voters differ in their preference for cannabis legalization?
      • Interesting answers to these questions require more granular data than county-level results
          – Need precinct-level voting results from MD State Board of Elections from 2022
          – Precinct-level results from 2018 from Metric Geometry and Gerrymandering Group at MIT
          – Precinct shapefiles from Maryland Department of Planning
      • Census data
  • 20. Learning / data work
      • Precinct-level results come in XML form, which requires parsing with xml2 in R
      • Precinct names in voting data and shapefile data often differ, need to match
          – No obvious way to do this, need to examine by hand (needed to remove two 0’s in the middle of the names in some cases, just one in others. Automatable, but needs to be discovered manually)
          – Much, much worse in Georgia
      • Missing data from Montgomery County and Kent County
      • This is the fun but dangerous part: data analysis
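Once the zero-padding pattern has been found by hand, the fix itself is short. The sketch below assumes a hypothetical naming scheme in which precinct names are digit groups joined by a separator (for example `02-003` in one source and `2-3` in the other); the real Maryland names may differ.

```python
import re


def normalize_precinct(name: str) -> str:
    """Strip leading zeros from every run of digits, e.g. '02-003' -> '2-3'."""
    return re.sub(r"\d+", lambda m: str(int(m.group())), name.strip())


def unmatched_precincts(results_names, shapefile_names):
    """List precinct names from the results that still have no shapefile match."""
    known = {normalize_precinct(n) for n in shapefile_names}
    return sorted(n for n in results_names if normalize_precinct(n) not in known)
```

An empty list from `unmatched_precincts` is the signal that the rule discovered by hand covers every precinct; anything left over needs another look.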
  • 21. What did we learn?
      • Moore outperformed Jealous, and Cox performed worse than Hogan, almost everywhere
      • Precinct-level data lets us see:
          – The incredible expansion of blue territory in the west part of the County and City from 2018 to 2022
          – The expansion of blue between D.C. and Baltimore and the lightening of red support all around the state
  • 22. What did we learn? (2)
      • Moore preference related to race
          – <20% Black census tracts had fully mixed opinions on Moore
          – >80% Black nearly all for Moore
          – Log relationship
      • Strong majority of voters wanted legal weed
          – More support in Black community
          – Driven by:
              • Long left tail in white
              • Less variation in Black
          – Violin plot!
  • 24. Thank you
      Nick Thieme, The Baltimore Banner
      Nick.thieme@thebaltimorebanner.com
      703-850-0935
      Twitter: @FurrierTranform