Forecasting critical food
violations at restaurants
using open data
Nicole Donnelly
PyData DC
October 8, 2016
Hello!
Thank you!
Who are you?
Who am I?
Why am I here?
The Project
Replicate Chicago’s Food
Inspection Forecasting
project using Python and
data about DC.
Data, Ingest, Wrangle, Compute, Visualize, Report
Hypothesis
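Foodborne illness outbreaks affect millions of people annually. The city of Washington, DC, like most cities, has limited resources to inspect food establishments for critical violations that lead to these outbreaks. We can use machine learning to predict when a critical violation is likely to occur and prioritize inspections to catch these violations sooner, mitigating foodborne illness outbreaks and more effectively deploying limited resources.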
Instance: an inspection
Features: the data about
the instance
Prediction: will there be a
critical violation
Data
Weather
...
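One way to picture the instance, features, and prediction framing is as a small record type plus a helper that splits records into inputs and labels. This is only a sketch: the class name, field names, and sample values below are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Inspection:
    """One instance: a single inspection of one establishment."""
    establishment_id: str      # hypothetical identifier
    inspected_on: date
    features: dict             # data about the instance (names are made up)
    critical_violation: bool   # the outcome the model tries to predict


def to_xy(instances, feature_names):
    """Split instances into a feature matrix X and a label vector y."""
    X = [[inst.features.get(name, 0.0) for name in feature_names]
         for inst in instances]
    y = [int(inst.critical_violation) for inst in instances]
    return X, y


# Made-up sample instances for illustration only.
example = [
    Inspection("A1", date(2016, 1, 5),
               {"max_temp_f": 41.0, "review_rating": 3.5}, True),
    Inspection("B2", date(2016, 1, 6),
               {"max_temp_f": 38.0, "review_rating": 4.5}, False),
]
X, y = to_xy(example, ["max_temp_f", "review_rating"])
```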
Scraping
APIs
CSVs
Ingest
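Of the three ingestion routes listed, CSV files are the simplest to illustrate. The sketch below uses only the standard library; the column names and the sample rows are invented stand-ins, not the layout of any actual DC dataset.

```python
import csv
import io

# Made-up sample standing in for a downloaded CSV file.
SAMPLE_CSV = """establishment_id,inspection_date,critical_violations
A1,2016-01-05,2
B2,2016-01-06,0
"""


def read_inspections(handle):
    """Parse rows from an open CSV file into dictionaries with typed values."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append({
            "establishment_id": row["establishment_id"],
            "inspection_date": row["inspection_date"],
            # Counts arrive as text, so convert them before any arithmetic.
            "critical_violations": int(row["critical_violations"]),
        })
    return rows


rows = read_inspections(io.StringIO(SAMPLE_CSV))
```

The same function accepts a real file handle, for example one returned by `open(path, newline="")`.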
Clean the data
Create the instances
Come to terms with
features
Feature engineering
Wrangle
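Feature engineering here could mean deriving history-based features for each inspection. The sketch below is an assumption about what such features might look like, not the feature set used in the project; the dictionary keys are hypothetical.

```python
from datetime import date


def add_history_features(inspections):
    """Add per-establishment history features to each inspection.

    `inspections` is a list of dicts with hypothetical keys
    'establishment_id', 'date' (datetime.date) and 'critical' (bool).
    Only earlier inspections feed each row, so nothing leaks from the future.
    """
    last_seen = {}     # establishment -> date of its previous inspection
    prior_count = {}   # establishment -> critical violations seen so far
    out = []
    for insp in sorted(inspections, key=lambda r: r["date"]):
        eid = insp["establishment_id"]
        previous = last_seen.get(eid)
        row = dict(insp)
        row["days_since_last"] = (
            (insp["date"] - previous).days if previous else None
        )
        row["prior_critical"] = prior_count.get(eid, 0)
        out.append(row)
        # Update the history only after the row has been built.
        last_seen[eid] = insp["date"]
        prior_count[eid] = prior_count.get(eid, 0) + int(insp["critical"])
    return out


# Made-up inspections for one establishment.
history = add_history_features([
    {"establishment_id": "A1", "date": date(2016, 1, 5), "critical": True},
    {"establishment_id": "A1", "date": date(2016, 3, 5), "critical": False},
])
```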
Which estimator?
All of them
Compute
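"All of them" suggests fitting every candidate estimator on the same training split and comparing held-out scores. The sketch below shows that loop with two toy stand-in estimators written in plain Python; they are not the estimators used in the talk, and a real project would presumably swap in library models behind the same interface.

```python
def fit_majority(X, y):
    """Baseline: always predict the most common training label."""
    label = int(sum(y) * 2 >= len(y))
    return lambda row: label


def fit_threshold(X, y, column=0):
    """Toy rule: predict 1 when one feature exceeds its training mean."""
    mean = sum(row[column] for row in X) / len(X)
    return lambda row: int(row[column] > mean)


def accuracy(predict, X, y):
    """Fraction of rows where the prediction matches the label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


# Hypothetical registry of candidates; each maps (X, y) to a predictor.
ESTIMATORS = {"majority": fit_majority, "threshold": fit_threshold}


def compare(X_train, y_train, X_test, y_test):
    """Fit every candidate on the training split, score it on held-out data."""
    return {
        name: accuracy(fit(X_train, y_train), X_test, y_test)
        for name, fit in ESTIMATORS.items()
    }


# Made-up data: one feature (prior critical violations), binary label.
scores = compare([[0], [1], [3], [4]], [0, 0, 1, 1], [[0], [5]], [0, 1])
```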
Drumroll please...
Visualize
Results, out of sample data
The scores were not
great, but reprioritizing
the inspections using the
model confidence scores
yields results.
Report
11% more violations, 10 days sooner
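Reprioritizing by model confidence can be sketched as sorting the scheduled inspections by score and comparing how early violations surface under each ordering. The scores, outcomes, and the two-inspections-per-day capacity below are invented for illustration; this is not the evaluation code behind the reported figures.

```python
def mean_days_to_violation(order, has_violation, per_day):
    """Average 1-based day on which critical violations are found when
    inspections are done in `order`, `per_day` inspections each day."""
    days = [pos // per_day + 1
            for pos, idx in enumerate(order) if has_violation[idx]]
    return sum(days) / len(days)


def reprioritize(scores):
    """Indices of inspections sorted from highest to lowest model score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Made-up example: six scheduled inspections, two per day.
scores = [0.2, 0.9, 0.4, 0.7, 0.1, 0.8]
has_violation = [False, True, False, False, False, True]

original = list(range(len(scores)))   # the schedule as it stood
ranked = reprioritize(scores)         # the schedule ordered by confidence

days_gained = (mean_days_to_violation(original, has_violation, 2)
               - mean_days_to_violation(ranked, has_violation, 2))
```

Even a model with modest scores can move violations earlier in the schedule if its ranking is better than the original order, which is the point the slide makes.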
What now?
Build better dataset
Get more data
Get more input
Poor scores do not
mean failure, they are
just a starting point.
Thanks!
Nicole Donnelly
nicole@nicoledonnelly.me
@NicoleADonnelly
Github: nd1
PyDataDC- Forecasting critical food violations at restaurants using open data

This talk provides an end-to-end demonstration of how I replicated Chicago's Food Inspection Forecasting using Python and open data from Washington, DC. The content is targeted toward the novice data scientist and will discuss the practical aspects of planning and executing the project.

  1. Forecasting critical food violations at restaurants using open data. Nicole Donnelly, PyData DC, October 8, 2016
  2. Hello! Thank you!
  3. Who are you?
  4. Who am I?
  5. Why am I here?
  6. The Project: Replicate Chicago’s Food Inspection Forecasting project using Python and data about DC.
  7. Data, Ingest, Wrangle, Compute, Visualize, Report
  8. Data, Ingest, Wrangle, Compute, Visualize, Report
  9. Data, Ingest, Wrangle, Compute, Visualize, Report
  10. Hypothesis: Foodborne illness outbreaks affect millions of people annually. The city of Washington, DC, like most cities, has limited resources to inspect food establishments for critical violations that lead to these outbreaks. We can use machine learning to predict when a critical violation is likely to occur and prioritize inspections to catch these violations sooner, mitigating foodborne illness outbreaks and more effectively deploying limited resources.
  11. Instance: an inspection. Features: the data about the instance. Prediction: will there be a critical violation? Data: Weather DOH Inspections Crime ABRA DCRA Construction Rating Number of Reviews Category Non-emergency City Issues Places
  12. Ingest: Scraping, APIs, CSVs
  13. Wrangle: Clean the data. Create the instances. Come to terms with features. Feature engineering.
  14. Compute: Which estimator? All of them.
  15. Visualize: Drumroll please...
  16. Results, out of sample data
  17. Report: The scores were not great, but reprioritizing the inspections using the model confidence scores yields results: 11% more violations, 10 days sooner.
  18. What now? Build better dataset. Get more data. Get more input.
  19. Poor scores do not mean failure, they are just a starting point.
  20. Thanks! Nicole Donnelly, nicole@nicoledonnelly.me, @NicoleADonnelly, Github: nd1
