Fall 2014
Analytics Project Presentation - Fall 2014
NYU Real Time and Big Data
Project: Rodent Baiting in NYC
Team: Sanchit Khandelwal, Rohit Shankar, Simran Kaur
Abstract
Analytic 1
•Find the factor that best predicts the occurrence of rodents in a particular area.
•Combine garbage (sanitation) and water leak complaints with rodent complaints to find whether rodent complaints increase after them.
Analytic 2
•Analyze the frequency of rodent complaints made in the city with respect to temperature ranges since 2012.
Analytic 3
•Estimate the rat population of the city. Are there really 8 million rats for 8 million New Yorkers? We test this urban myth.
Background
•NYC is infamous for its rodent problem.
•311 is a non-emergency helpline that provides access to different government services.
•It takes requests in the form of complaints, then tracks and manages them.
•The 311 complaints database is updated daily and is openly available.
•New York City Department of Health and Mental Hygiene (DHMH)
Motivation
•The aforementioned rodent problem.
•DHMH does not take well-planned preemptive action to control the rodent population.
•Problems are handled on a first-come, first-served basis.
•There is no official estimate of the number of rodents.
•DHMH can use our analytics to take preemptive actions that can help reduce or control the number of rodents.
Data Sources
<311 Rodent Complaint Database>
•Contains rodent complaints with details such as complaint timestamp, zip code and location type, for 2010 to Nov 2014.
•Size: 38 MB; Format: CSV
<311 Sanitation Complaint Database>
•Contains sanitation complaints with fields similar to the rodent database, for 2010 to Nov 2014.
•Size: 41 MB; Format: CSV
<311 Water Leak Database>
•Contains several kinds of water complaints (water leaking, standing water, hydrant overflow) along with timestamp, zip code etc., for 2010 to Nov 2014.
•Size: 30 MB; Format: CSV
Data Sources Contd.
<NCDC Weather Database>
•The National Climatic Data Center (NCDC) weather database for NYC contains fields such as max and min temperature, rainfall and wind speed for each day from 2012 to Nov 2014.
•Size: 1 MB; Format: CSV
Analytic 1: Sanitation, Water Factor
Design Diagram:
Figure 1: Sanitation/Water leak
Inputs: ‘311 Rodent complaints’ database and ‘311 Sanitation complaints’ database.
1) Data cleanup: extract the {date, zipcode} fields from each database.
2) PIG: join operation to get, for each sanitation date, all rodent dates along with zip codes (area).
3) MR1: for each sanitation date, count the rodent complaints 1 week prior (negative) and 1 week after (positive) the sanitation date, along with zip codes (area).
4) MR2: get the average number of negative and positive rodent complaints for each zip code (area).
5) Analysis of results.
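The Analytic 1 design (join, one-week windows, per-zip averages) can be sketched in plain Python. This is an illustration only, not the team's Pig/MapReduce code. The function names are hypothetical, joining on zip code is an assumption, and so is leaving same-day complaints out of both windows.

```python
from collections import defaultdict
from datetime import date, timedelta

WINDOW = timedelta(days=7)  # "1 week" on either side of a sanitation complaint


def window_counts(sanitation, rodents):
    """For each sanitation complaint, count rodent complaints in the same zip
    code during the week before (negative) and the week after (positive).

    Both inputs are lists of (date, zipcode) pairs, as left by the cleanup step.
    """
    rodent_dates = defaultdict(list)
    for day, zipcode in rodents:
        rodent_dates[zipcode].append(day)

    rows = []
    for s_day, zipcode in sanitation:
        days = rodent_dates[zipcode]
        # Assumption: complaints on the sanitation date itself fall in neither window.
        before = sum(1 for r in days if s_day - WINDOW <= r < s_day)
        after = sum(1 for r in days if s_day < r <= s_day + WINDOW)
        rows.append((zipcode, before, after))
    return rows


def average_by_zip(rows):
    """Average the negative and positive counts per zip code (the MR2 step)."""
    totals = defaultdict(lambda: [0, 0, 0])  # sum before, sum after, n complaints
    for zipcode, before, after in rows:
        entry = totals[zipcode]
        entry[0] += before
        entry[1] += after
        entry[2] += 1
    return {z: (b / n, a / n) for z, (b, a, n) in totals.items()}
```

The slides do not define how the "sanitation factor" is derived from these two averages, so the sketch stops at the averages themselves.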
Figure: Top 10 areas with highest sanitation factor (bar chart; series: Sanitation Factor; axis range 0 to 5)
Result
Areas where, when a sanitation complaint is received, preemptive rodent control action should be taken.
Figure: Top 10 areas least affected by sanitation complaint (bar chart; axis range -0.4 to 0)
Result
Areas where sanitation is not the cause for a rodent complaint.
Figure: Sanitation factors - comparison (pie chart; legend: Negative Sanitation Factor, Positive Sanitation Factor; values shown: 11.60%, 88.40%)
Result
In almost all cases, the number of rodent complaints in the week after a sanitation complaint is higher than in the week before.
Figure: Top 10 areas with highest water leak factor (bar chart; axis range 0 to 4.5)
Result
Areas where, when a water leak complaint is received, preemptive rodent control action should be taken.
Figure: Top 10 areas least affected by water leak complaint (bar chart; axis range -1.8 to 0; areas shown: Lower West Side, Chelsea & Clinton, Bronx Park and Fordham, Central Bronx, Upper East Side, Borough Park, Central Harlem, Upper East Side, Northwest Brooklyn, West Queens)
Result
Areas where a water leak is not the prime cause for a rodent complaint; other factors are more dominant.
Figure: Water Leak factors - comparison (pie chart; legend: Negative Water Leak Factor, Positive Water Leak Factor; values shown: 28.12%, 71.88%)
Result
In most cases, the number of rodent complaints in the week after a water leak complaint is higher than in the week before.
Analytic 2: Weather affecting rodent complaints
Aim: find the relation between rodent complaints and temperature.
Design Diagram:
Figure 3: Weather Analytic
Inputs: NCDC Weather database for NYC, 2012-14, and 311 Rodent Complaints database.
1) NCDC weather data: data cleanup and date formatting.
2) 311 rodent complaints: data cleanup and extracting 2012-14 data only.
3) MR1: date formatting.
4) Individual temperature values replaced by 5 °C interval ranges.
5) PIG: inner join to get the temperature range for each rodent complaint date.
6) MR2: aggregation of complaints based on temperature ranges.
7) Analysis of results.
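The binning, join and aggregation steps of the weather analytic can be sketched as follows. This is a minimal Python illustration, not the original Pig/MapReduce jobs. The function names are hypothetical. The slides do not say which daily temperature (max, min or mean) was binned, nor how values on a range boundary were assigned, so the half-open ranges here are an assumption.

```python
import math
from collections import Counter
from datetime import date

BIN_WIDTH = 5  # degrees Celsius, matching the 5 °C interval ranges in the design


def temp_range(temp_c):
    """Map a temperature to its 5 °C range, treated as half-open [low, low + 5)."""
    low = math.floor(temp_c / BIN_WIDTH) * BIN_WIDTH
    return (low, low + BIN_WIDTH)


def complaints_per_range(daily_temp, complaint_dates):
    """Inner-join complaint dates with daily temperatures, then count
    complaints per temperature range.

    daily_temp maps a date to that day's temperature in Celsius;
    complaint_dates holds one date per rodent complaint.
    """
    counts = Counter()
    for day in complaint_dates:
        if day in daily_temp:  # inner join: drop complaints with no weather record
            counts[temp_range(daily_temp[day])] += 1
    return counts
```

Using `math.floor` rather than integer truncation keeps negative temperatures in the correct range, which matters because the chart extends down to -15 °C.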
Figure: Number of complaints for each temperature range (in Celsius) (bar chart; series: Rodent Complaints; y-axis 0 to 20000; x-axis ranges: [-15, -10], [-10, -5], [-5, 0], [0, 5], [5, 10], [10, 15], [15, 20], [20, 25], [25, 30], [30, 35])
Result
1) When NYC experiences moderate temperatures (15 to 25 °C), the number of rodent complaints increases.
2) The results are consistent with scientific findings.
3) As we move from summer to winter (from the 25 to 30 °C range down to the 5 to 10 °C range), rodent complaints increase because rodents move indoors. Preemptive measures should be taken when fall ends and winter starts.
Analytic 3: Estimation of Rodent Population
Design Diagram:
Input: 311 Rodent Complaint Database for 5 years (2010-14).
1) PIG: calculate rodent complaints for each zip code for each year.
2) Calculate the average number of complaints per year: total number of complaints / 5. This assumes one rat lives 1 year.
3) Multiply the average by 50: each rat colony has around 50 rats, assuming each complaint is for a different colony.
4) Output: an overestimate of the number of rats in NYC.
5) Analysis of result.
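The estimation arithmetic is simple enough to write out directly. The sketch below is illustrative: the function name is hypothetical and the yearly counts in the usage lines are made-up placeholders, not values from the 311 database.

```python
COLONY_SIZE = 50  # rats per colony, the figure used in the design


def estimate_rats(complaints_per_year, colony_size=COLONY_SIZE):
    """Upper-bound estimate: average yearly complaints times colony size.

    Assumes, as the design does, that a rat lives about one year and that
    every complaint refers to a different colony.
    """
    average = sum(complaints_per_year.values()) / len(complaints_per_year)
    return average * colony_size


# Placeholder counts for illustration only (NOT real 311 figures).
example_counts = {2010: 10, 2011: 20, 2012: 30, 2013: 40, 2014: 40}
example_estimate = estimate_rats(example_counts)  # (140 / 5) * 50
```

Because every complaint is counted as a whole colony, the result is meant as a ceiling rather than a best guess.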
Figure: Top 10 areas with highest number of rodents (numbers are estimates) (bar chart; axis range 0 to 35000)
Figure: Top 10 areas with lowest number of rodents (numbers are estimates) (bar chart; axis range 0 to 180)
Figure: Top 10 areas with greatest percentage change in rodent population between 2010-2014 (bar chart; series: % change in rodent population; axis range 0 to 7)
Analysis of Results for Estimation of Rodent Population:
1) Scientific studies have shown that the life expectancy of a rodent in a city is 1 year.
2) Hence we computed the average number of rodent complaints for 1 year.
3) We take a big overestimate: each rodent call represents an entire colony (on average, rodents live in colonies of 40-50).
4) This gives approx. 1.2 million rats.
5) Sewer population (not that large) + 1.2 million = approx. 2 million, a generous overestimate.
6) This is still less than 8 million, so the urban myth is debunked.
Obstacles
•Change of analytic project: we had no access to the college data.
•NYC HPC Cluster: we encountered several problems and had to start over using the Cloudera VM.
•Each database had a date format entirely different from the others (formats sometimes differed even within a database).
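One common way to handle mixed date formats is to try a list of candidate formats in turn and normalize everything to ISO dates before joining. The sketch below shows that approach. It is not the team's cleanup code, and the format list is hypothetical, since the slides do not say which formats actually appeared.

```python
from datetime import datetime

# Hypothetical candidate formats; extend to match what the data really contains.
CANDIDATE_FORMATS = (
    "%m/%d/%Y %I:%M:%S %p",  # e.g. 11/03/2014 09:15:00 AM
    "%m/%d/%Y",              # e.g. 11/03/2014
    "%Y-%m-%d",              # e.g. 2014-11-03
    "%Y%m%d",                # e.g. 20141103
)


def normalize_date(raw):
    """Return the date as ISO 'YYYY-MM-DD', or None if no format matches."""
    text = raw.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # this format did not fit; try the next one
    return None
```

Returning `None` for unparseable values keeps bad rows visible so they can be counted or inspected instead of silently dropped.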
Conclusion
1) Sanitation problems and water leaks are a cause of increases in rodents in 85% of NYC areas.
2) Rodent complaints increase between 65 °F and 90 °F, which conforms to scientific findings.
3) The urban theory of “8 million rats for 8 million people” is debunked.
Acknowledgements
•NCDC for providing us with the weather database for NYC
•311 service of NYC for putting up their extensive databases online
•Prof. Suzanne Macintosh for her guidance and support during the
course of this project
