This document summarizes an analysis of NYC restaurant inspection data to identify violation trends and predict violations. The analysis found that (1) facility amenities and animal violations were the most common, (2) Manhattan and Queens had the most violations, and (3) spring had the most violations of any season. Unsupervised learning identified four clusters that differ in prevalence by season, grade, and location. Supervised models found that landmark restaurants generally performed better and that factors such as a low initial grade and a re-inspection increased the risk of a critical violation. Recommendations for consumers included being wary of certain cuisines and dining after spring; recommendations for restaurants included focusing on cleanliness and construction.
7. Data Attributes
477,000 rows
VIOLATION DETAILS
● Inspection Date
● Inspection Type
● Violation Code
● Critical Flag
● Grade (A, B, C)
● Scores
RESTAURANT DETAILS
● ID
● Restaurant Name
● Cuisine Description
● New York Boro
● Zip Code
8. Data Cleaning
1. REMOVED BAD DATA: Removed rows with inspection dates in the future.
2. SHRANK DATA SET: Reduced number of rows.
3. FIXED SPELLING & INCONSISTENCIES: Fixed spelling errors.
4. REPLACEMENT & FLAG CREATION: Replaced ‘Not Yet Graded’ with ‘N’. Broke ‘Inspection Type’ into 2 columns. Added Violation Categories, Inspection Categories, Seasonal Flags, and Landmark Flags.
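The cleaning steps on this slide could be sketched roughly as follows. This is a minimal illustration, not the authors' actual pipeline: the column names ("INSPECTION DATE", "GRADE", "INSPECTION TYPE"), the " / " separator inside the inspection type, the date format, and the month-to-season mapping are all assumptions made for the example.

```python
from datetime import date, datetime

# Assumed month-to-season mapping (meteorological seasons); the deck does
# not say how its seasonal flags were defined.
SEASON_BY_MONTH = {
    12: "Winter", 1: "Winter", 2: "Winter",
    3: "Spring", 4: "Spring", 5: "Spring",
    6: "Summer", 7: "Summer", 8: "Summer",
    9: "Fall", 10: "Fall", 11: "Fall",
}


def clean_rows(rows, today):
    """Apply the slide's cleaning steps to a list of row dicts."""
    cleaned = []
    for row in rows:
        inspected = datetime.strptime(row["INSPECTION DATE"], "%m/%d/%Y").date()
        # Remove bad data: skip rows whose inspection date is in the future.
        if inspected > today:
            continue
        out = dict(row)
        # Replacement: 'Not Yet Graded' becomes 'N'.
        if out["GRADE"] == "Not Yet Graded":
            out["GRADE"] = "N"
        # Break 'Inspection Type' into two columns (hypothetical names).
        program, _, kind = out.pop("INSPECTION TYPE").partition(" / ")
        out["INSPECTION PROGRAM"] = program.strip()
        out["INSPECTION KIND"] = kind.strip()
        # Seasonal flag derived from the inspection month.
        out["SEASON"] = SEASON_BY_MONTH[inspected.month]
        cleaned.append(out)
    return cleaned
```

Passing `today` explicitly keeps the future-date filter reproducible when the same extract is re-cleaned later.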
13. Restaurant Density vs. Percent Violations
Share of violations by borough:
● Manhattan: 39%
● Brooklyn: 24%
● Queens: 24% (2,321,580 population; 8,237 persons/sq. km; 24.07%)
● Bronx: 9% (1,438,159 population; 13,221 persons/sq. km; 9.48%)
● Staten Island: 3%
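A per-borough share of violations like the one on this slide can be computed by counting violation rows per borough. The sketch below assumes one row per violation and a hypothetical "BORO" key; it does not reproduce the slide's figures.

```python
from collections import Counter


def violation_share_by_boro(rows):
    """Return each borough's percentage of all violation rows."""
    counts = Counter(row["BORO"] for row in rows)
    total = sum(counts.values())
    return {boro: 100.0 * n / total for boro, n in counts.items()}
```

Dividing each borough's count by its number of restaurants instead of the overall total would give a rate that is comparable across boroughs of different size.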
15. Do inspection scores differ across boroughs?
Insight: There are no major differences in average restaurant scores despite differing borough wealth and popularity.
16. Inspection Type: How Do Scores Differ for Inspection Types?
Recommendation: Re-opening inspections have the lowest average scores. A separate process could be put in place for re-openings to ensure good scores.
17. Restaurant Grade Distribution
Takeaways:
● Hamburgers, Cafes and American food have the highest % of A grades.
● Indian food has the largest share of C grades.
Chart legend: Grade A, Grade B, Grade C
Source: What’s the safest food in New York City? - Data Diversions - tumblr.com [NYC Open Data]
25. Cluster Findings: What is the prevalence of violations by season?
Takeaways:
Cluster 1: Spring
Cluster 2: Summer
Cluster 3: Winter, highest Manhattan incidence
Cluster 4: Spring
All Clusters: American & Chinese food violations, Manhattan & Brooklyn, and Score are impactful on all clusters, especially 1 & 4
Other Findings: Staten Island is not impactful on any cluster
26. Cluster Findings: What is the prevalence of violations by grade?
Takeaways:
Cluster 1: C Grade, Food Temp, Flies/Food Refuse Violation, Mice
Cluster 2: A Grade
Cluster 3: A Grade, highest Manhattan incidence
Cluster 4: B Grade
All Clusters: Manhattan is impactful on all clusters
29. Focus Point: Chipotle
Answer: Yes...in STATEN ISLAND. No violations were detected in any Chipotle outlets there.
Top borough for violations at Chipotle outlets: MANHATTAN
32. Focus Point: Landmark Restaurants
Landmark
Restaurants:
- Famous
- Oldest
- Movie Scenes
- Favorites
33. Focus Point: Landmark Restaurants
Hypothesis Confirmed: Not Critical violations are more common for Landmark restaurants than for others.
34. Focus Point: Landmark Restaurants
Hypothesis Confirmed: Landmark restaurants have a higher percentage of A’s.
35. Focus Point: Landmark Restaurants
Finding: The second most common violation for landmark restaurants is not cleaning surfaces after each use.
Recommendation: Hire an employee who cleans while the chefs cook.
36. Focus Point: Landmark Restaurants
Hypothesis Not Supported: Violations, or the lack thereof, are not indicators of Landmark restaurants.
38. Part One: Decision Tree Model
VIOLATION PREDICTION --- Interpreting the Inspection Result
What kinds of restaurants are more likely to receive a critical violation?
Key: Create a CRITICAL_DUMMY from CRITICAL_FLAG; assign it Role “Target” and Level “Binary”.
Not Critical: Critical_Dummy = 0
Critical: Critical_Dummy = 1
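The binary target described on this slide, and the kind of split a decision tree evaluates on it, can be sketched as below. The deck does not say which split criterion its modeling tool used; Gini impurity is assumed here as a common choice, and the row keys ("CRITICAL_FLAG", "GRADE") and flag values ("Critical", "Not Critical") are illustrative assumptions.

```python
def critical_dummy(flag):
    """Binary target: 1 for 'Critical', 0 for anything else."""
    return 1 if flag.strip().lower() == "critical" else 0


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def split_gain(rows, feature, value):
    """Impurity reduction from splitting rows on feature == value."""
    labels = [critical_dummy(r["CRITICAL_FLAG"]) for r in rows]
    left = [y for r, y in zip(rows, labels) if r[feature] == value]
    right = [y for r, y in zip(rows, labels) if r[feature] != value]
    weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
    return gini(labels) - weighted
```

A tree learner repeats this comparison over every candidate feature and value, keeps the split with the largest gain, and then recurses into each branch.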
42. Findings (Two-Way):
Grade (1.0000): Restaurants with a grade below B are 68.17% likely to be judged a critical violation, compared with 48% for restaurants with Grade A.
Inspection_Type (0.4314): Restaurants with a low initial grade are more likely to be judged a critical violation during re-inspection, with a probability of nearly 70%.
BORO (0.1675): “BORO” does not appear to have much effect on critical violations. The probability of a critical judgment is around 52% for re-inspections with high initial grades in all boroughs.
43. Part Two: Logistic Regression
Outcome: Critical_Dummy
Variable Selection: Stepwise
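For readers unfamiliar with the model, a logistic regression on a binary outcome such as Critical_Dummy can be sketched in a few lines. This is only an illustration under stated assumptions: it uses a single hypothetical 0/1 predictor, fits by plain batch gradient descent, and does not implement the stepwise variable selection named on the slide.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real number to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit P(y = 1 | x) = sigmoid(b0 + b1 * x) by batch gradient descent."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(b0 + b1 * x) - y  # gradient of the log-loss
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1
```

A positive fitted `b1` means the predictor raises the estimated probability of a critical violation; `sigmoid(b0 + b1 * x)` gives that probability for a given `x`.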
45. Findings (Similar to Decision Tree):
Variables: Score, GRADE B, Inspection Type, GRADE C
Values: 0.0983, 0.0948, 0.0610, 0.1596
47. FOR THE HUNGRY CONSUMER
● Dine after Spring, since restaurants have been issued the most violations by that time.
● Be wary of Indian and Chinese restaurants in New York City.
● Don’t pay Manhattan prices; it does not have cleaner restaurants.
● If you want to eat at Chipotle, go to Staten Island.
48. FOR RESTAURANTS
● Hire a dedicated cleaner in high-volume landmark restaurants.
● Since Facility Amenities violations are the most common, construction is a critical stage. Do extensive research before contracting.
● Focus on cleanliness for the Spring season.
● Be sure to do well at re-inspection: you’ll either pass with flying colors or be severely penalized.
● Set a benchmark to be met before allowing re-opening.