2. "AI is fueling a 4th Industrial Revolution."
United Nations, UNESCO, March 2018
"The AI Revolution is happening now."
Forbes, August 2021
"AI is poised to disrupt entire industries."
Harvard Business Review, 2021
But, like every major evolution, some will get left behind . . .
3. non-profits
charities
schools
small businesses
government agencies
mission focused
small
cheap
focal accountability
short horizon
But they share concerns with major retailers:
• growing clients
• scaling services
• revenue, revenue, revenue
• relevance (competition)
4. Challenge 1: Budget
• open source
• non-profit licensing
• donations
• grants
Challenge 2: Talent
• “citizen data scientists”
• volunteers
• public challenges (Kaggle)
• collaboration tools
Challenge 3: Data
• industry association data
• data brokers
• social media
• open, public data
Challenge 4: Methods
• Agile
• Exploratory Analyses
• CRISP-DM
• SEMMA, KDD
5. Opportunity 1: Pretrained Models
• Image classification, like EfficientNet
• Natural language, like GPT-3, BERT, Hugging Face
• Predictors, like Gluon (MXNet), XGBoost
• Drive time, like ArcGIS, Google API
Opportunity 2: Transfer Learning from Pretrained Models
• Google’s MobileNet
• tensorflow, keras, pytorch
Opportunity 3: Cloud Computing
Opportunity 4: Auto ML
. . .
7. Mission
support high school youth in foster care to successfully transition
to self-sufficiency through higher education or other vocations
Problem
Which program locations are most effective?
What are the next best candidate universities?
Privacy laws limit identifiable data on children.
Data
Internal enrollment
Department of Education, University data
Department of HHS foster system data
Model
Volunteers through Catchafire used discriminant analysis based
upon race, ethnicity, gender, grade level, etc. They also ran
independence analysis, especially on gender and race.
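An independence analysis like the one mentioned above is often run as a chi-square test on a contingency table. The sketch below is one way to do that in plain Python; the function name and the counts are hypothetical and are not the program's data or the volunteers' actual code.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table.

    `table` is a list of rows of observed counts, e.g. groups by outcomes.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


# Hypothetical counts: rows are two groups, columns are (enrolled, not enrolled)
counts = [[30, 20], [25, 25]]
chi2, dof = chi_square_independence(counts)
print(f"chi2={chi2:.3f}, dof={dof}")
```

With one degree of freedom, a statistic below roughly 3.84 would not reject independence at the 5% level, so these made-up counts show no evidence that the outcome depends on the group.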
8. Mission
close the gap in public assistance to reduce hunger and poverty and
build pathways to economic mobility. Today, more than $80 billion in
food, financial aid, healthcare, and other assistance goes untapped.
Problem
Find the lowest level of geography for intervention or legal action
based upon difference between estimated needs and benefits issued.
Data
SNAP card issuance
FCC Affordable Connectivity and Housing
Bureau of Labor Statistics on Unemployment
CDC’s Social Vulnerability Indexes
Census American Community Survey for population and poverty
Model
Volunteers from DataKind proposed two networks, a predictor and a
regressor, applied to an integrated dataset to generate an expected
value of unmet need.
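One plausible reading of the two-model design above, offered as an assumption rather than DataKind's actual method, is that the predictor estimates the probability a household is in need and the regressor estimates the dollar amount if it is. Their product gives an expected need, and subtracting benefits issued gives the unmet need per geography. All field names and figures below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Geography:
    name: str
    households: int
    p_need: float           # assumed predictor output: probability of need
    amount_if_need: float   # assumed regressor output: dollars per needy household
    benefits_issued: float  # dollars actually issued in this geography


def unmet_need(g: Geography) -> float:
    """Expected need minus benefits issued, floored at zero."""
    expected_need = g.households * g.p_need * g.amount_if_need
    return max(expected_need - g.benefits_issued, 0.0)


# Made-up geographies for illustration only
areas = [
    Geography("Tract B", 2000, 0.10, 1500.0, 290_000.0),
    Geography("Tract A", 1000, 0.30, 2000.0, 450_000.0),
]

# Rank geographies so the largest estimated gap comes first
ranked = sorted(areas, key=unmet_need, reverse=True)
for g in ranked:
    print(g.name, round(unmet_need(g)))
```

Running the same ranking at several levels of geography (county, tract, block group) is one way to look for the lowest level at which the gap is still clearly visible.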
9. Mission
provide a safety net for older and disabled adults to help them
remain in their homes with dignity and strengthen food and
financial security for all community members in need of support.
Problem
ElderNET’s clients live near its physical locations, offices,
transportation and food pantries. Growth and new county-based
grants benefit from an ability to find, estimate and prioritize locations
with the greatest need.
Data
Client lists
Service volumes
Model
A datathon run by Philadelphia’s Datajawn and R-Ladies proposed a
“gravity” model prioritizing block group geographies with the
greatest need and need gap.
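A gravity model of this kind can be sketched as follows, assuming each service site's capacity is spread across block groups in proportion to need divided by distance raised to a decay power. The function, the decay exponent `beta`, and all numbers are illustrative assumptions, not the datathon's actual model.

```python
def gravity_allocation(needs, capacities, distances, beta=2.0):
    """Estimate how much service each block group receives.

    needs[i]        -- people in need in block group i
    capacities[j]   -- people site j can serve
    distances[i][j] -- distance from block group i to site j (must be > 0)
    beta            -- assumed distance-decay exponent
    """
    served = [0.0] * len(needs)
    for j, capacity in enumerate(capacities):
        # Pull of block group i on site j: larger need and shorter distance win
        weights = [needs[i] / distances[i][j] ** beta for i in range(len(needs))]
        total = sum(weights)
        for i, weight in enumerate(weights):
            served[i] += capacity * weight / total
    return served


# Two hypothetical block groups with equal need, one site, different distances
needs = [100.0, 100.0]
capacities = [90.0]
distances = [[1.0], [2.0]]

served = gravity_allocation(needs, capacities, distances)
gaps = [need - s for need, s in zip(needs, served)]
print(served, gaps)
```

In this toy case the farther block group receives less of the site's capacity, so it shows the larger need gap and would be prioritized first.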
10. Mission
provide policymakers with research to promote healthy child
development, and strong, nurturing families that are economically
secure.
Problem
Estimate the family support available in each geography, tracking
how state laws and budgets change with each legislative
session. Estimate benefit “cliffs” where households lose more
support than they gain with higher resources.
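A benefit cliff as described above can be detected numerically: step through a schedule of earnings and benefits, and flag any step where earnings plus benefits goes down. The schedule below is invented for illustration and is not taken from any state's rules.

```python
def find_cliffs(schedule):
    """Return (earnings_before, earnings_after, net_loss) for each cliff.

    `schedule` is a list of (earnings, benefits) pairs sorted by earnings.
    A cliff is a step up in earnings where total resources fall.
    """
    cliffs = []
    for (e0, b0), (e1, b1) in zip(schedule, schedule[1:]):
        before = e0 + b0
        after = e1 + b1
        if after < before:
            cliffs.append((e0, e1, before - after))
    return cliffs


# Hypothetical annual earnings and benefits, in dollars
schedule = [(20000, 9000), (25000, 7000), (30000, 1000), (35000, 0)]
cliffs = find_cliffs(schedule)
print(cliffs)
```

Here a raise from $25,000 to $30,000 costs $6,000 in benefits, so the household ends up $1,000 worse off at that step.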
Data
Census demographics, state human services reporting.
Model
Family Resource Simulator uses object databases to collect policy
documents. Its processing monitors changes to public policy
documents in select states.
11. Mission
bring Science, Technology, Engineering, and Math ("STEM")
education to girls in grades 5-8 from underserved communities.
Problem
JerseySTEM needs to find combinations of colleges (for teachers),
schools (locations & students) and sponsoring donors (for funding).
Data
Salesforce CRM for all interactions
Past enrollment, locations, donors, teachers
Travel time APIs
S&P CapitalIQ for companies and high wealth
Model
Relying on Catchafire volunteers for long-duration engagements,
provide targets of colleges, schools and sponsors for direct marketing
by channel.
12. Mission
The US Constitution (I § 8) says “Congress shall have power . . . To
promote the progress of science and useful arts [giving] inventors
the exclusive right to their … discoveries.”
Problem
Patent examiners judge whether a new patent is too near an existing
claim from text full of legal and scientific language. At the same
time, patent “trolls” look for idea vacuums for licensing or litigation.
Data
USPTO’s Yellow Book
Public word and sentence embeddings
Open source transformers from huggingface.co
Model
Hosting Kaggle competitions, the USPTO collected solutions that use
NLP transformers to detect phrase matching, infringement and
anomalous text.
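Phrase matching with embeddings usually comes down to comparing vectors, most often by cosine similarity. The sketch below uses tiny hand-written vectors as stand-ins for real sentence embeddings; the phrases and values are hypothetical, and a real pipeline would get its vectors from a transformer model rather than a hard-coded dictionary.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy 3-dimensional vectors standing in for sentence embeddings
anchor = [1.0, 0.0, 1.0]
candidates = {
    "close phrase": [0.9, 0.1, 1.0],
    "unrelated phrase": [0.0, 1.0, 0.0],
}

scores = {text: cosine_similarity(anchor, vec) for text, vec in candidates.items()}
best_match = max(scores, key=scores.get)
print(best_match, scores)
```

Scores near 1 suggest the candidate phrase is close to the existing claim, while scores near 0 suggest it is unrelated.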
13. Mission
prevent and alleviate human suffering in the face of emergencies
Problem
More than 360,000 home fires occur each year, seriously injuring
more than 13,000 people and causing over $7B in property damage.
On average, 7 people die every day in home fires.
Data
Red Cross data on fire response services
Census American Community Survey
TIGER/Line shapefiles
CDC Social Vulnerability Indexes
FEMA’s NFIRS dataset
Model
Internal data scientists developed ensemble models combining fire
propensity, fire intensity and smoke detector risk. ARC’s Home Fire
Campaign distributed more than 2.3 million smoke detectors in over
a million home visits.
15. Data
American Community Survey data.census.gov/
TIGER/Line shapefiles census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html
CDC Social Vulnerability Index atsdr.cdc.gov/placeandhealth/svi/index.html
Harvard’s Dataverse dataverse.harvard.edu/
NOAA’s Weather Datasets ncei.noaa.gov/products/climate-data-records
Federal Reserve Economic Data (FRED) fred.stlouisfed.org/
Yahoo Finance finance.yahoo.com/
U.S. Government’s open data data.gov/
BLS Consumer Expenditure Survey bls.gov/cex/
NORC General Social Survey norc.org/Research/Capabilities/Pages/gss-data-explorer.aspx
Notebooks
Kaggle (data and notebooks) kaggle.com/
Google’s Colab colab.research.google.com/
CoCalc cocalc.com/
16. Pretrained
Hugging Face Models & Datasets huggingface.co/
OpenCV Computer Vision opencv.org/
Always AI Computer Vision alwaysai.co/
Apache MXNet https://github.com/apache/mxnet
Volunteering, Meetups and Datathons
Catchafire catchafire.org/
DataKind datakind.org/
Red Cross Code4Good code4good.io/
Data meetups meetup.com/dataphilly/
Datajawn phillydatajawn.com/
Volunteer Match volunteermatch.org/
Open Data Philly opendataphilly.org/organization/volunteer
Data Science Solve for Good (DSSG) https://www.solveforgood.org/
Editor's Notes
“AI or Die” started as a robotics-specific phrase, but is now used to describe data strategies – with AI development, there is no second place. This is often true during tech booms – 20 years ago it was the internet itself.
Since the mid-2010s, the US military has used the phrase “AI Arms Race” to treat AI like a weapons capability – the tank that shoots second farthest is useless.
McKinsey coined the phrase “the War for Talent” initially for data analytical skills and more recently for AI & Machine Learning.
In financial services, I can attest that hiring has included rationale like:
“by the time we justify