MACHINE LEARNING
ON
CHICAGO CRIME DATASET
FINAL PROJECT PROPOSAL
ADVANCE DATA SCIENCE & ARCHITECTURE
Team9:
- Aashri Tandon
- Pragati Shaw
- Sarthak Agarwal
Introduction to data
• The main idea behind this project is to perform geospatial analytics and machine learning on
the Chicago Crime dataset.
• This dataset reflects reported incidents of crime (with the exception of murders where data exists
for each victim) that occurred in the City of Chicago from 2001 to present. The data is extracted from
the Chicago Police Department's CLEAR (Citizen Law Enforcement Analysis and Reporting)
system and is available at the URL below.
– https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-present/ijzp-q8t2/data
• Dataset Size: 1.4 Gigabytes
• No. of records: ~6.3 million
• No. of columns: 22
Columns
ID: Unique identifier for the record.
Case Number: Chicago Police Department RD Number (Records Division Number).
Date: Date when the incident occurred.
Block: The partially redacted address where the incident occurred, placing it on the same block as the actual address.
IUCR: The Illinois Uniform Crime Reporting code.
Primary Type: The primary description of the IUCR code.
Description: The secondary description of the IUCR code, a subcategory of the primary description.
Location Description: Description of the location where the incident occurred.
Arrest: Indicates whether an arrest was made.
Domestic: Indicates whether the incident was domestic-related.
Beat: A beat is the smallest police geographic area.
District: Indicates the police district where the incident occurred.
Ward: The ward (City Council district) where the incident occurred.
Community Area: Indicates the community area where the incident occurred.
FBI Code: Indicates the crime classification as outlined in the FBI's National Incident-Based Reporting System.
X Coordinate: The x coordinate of the location where the incident occurred.
Y Coordinate: The y coordinate of the location where the incident occurred.
Year: Year the incident occurred.
Updated On: Date and time the record was last updated.
Latitude: The latitude of the location where the incident occurred.
Longitude: The longitude of the location where the incident occurred.
Location: The location where the incident occurred.
Diving Deep into the features
Problem Statement
• Our goal is to create a web application that gives its users insight into crime in Chicago and
its various aspects.
• Our application will contain:
– A search box or drop-down list where the user can select a district.
– Geospatial analysis using ArcGIS maps and visualizations embedded in the web app, which will
be dynamically updated to show the most interesting patterns or heat maps for that district.
– Statistical analysis and visualizations of historical data.
– Prediction of the date when the next crime will happen and its probability.
Part 1: Data Download & Preprocessing
• Data Download
– Write a Python script that automatically downloads the data from the website to a particular location.
https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-present/ijzp-q8t2/data
• Handle Missing Values
– Check the percentage of missing values and their frequency distribution. Then choose an appropriate
technique to handle the missing data.
• Feature Engineering
– Check for correlation in the data and eliminate or create features as needed. These features will be
selected with the machine learning component of the application in mind.
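The download and missing-value steps above could be sketched as follows. This is a minimal illustration using only the Python standard library, not the team's actual script. The function names `download_file` and `missing_percentages` are hypothetical, and the caller is assumed to pass in a direct CSV export link obtained from the dataset page.

```python
import csv
import shutil
import urllib.request
from pathlib import Path


def download_file(url, destination):
    """Stream the resource at `url` to `destination` and return the path.

    `url` is assumed to be a direct link to a CSV export of the dataset.
    """
    destination = Path(destination)
    destination.parent.mkdir(parents=True, exist_ok=True)
    # Copy in chunks so a file of about 1.4 GB is never held in memory at once.
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out:
        shutil.copyfileobj(response, out)
    return destination


def missing_percentages(csv_path):
    """Return {column name: percentage of rows whose value is empty}."""
    total = 0
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        missing = {name: 0 for name in (reader.fieldnames or [])}
        for row in reader:
            total += 1
            for name in missing:
                value = row.get(name)
                # Treat absent fields and blank strings as missing.
                if value is None or value.strip() == "":
                    missing[name] += 1
    if total == 0:
        return {name: 0.0 for name in missing}
    return {name: 100.0 * count / total for name, count in missing.items()}
```

Columns with a small share of missing values might simply have those rows dropped, while heavily incomplete columns may call for imputation or removal. Which threshold to use is a judgment call the proposal leaves open.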
Part 2: Geospatial Analysis
• Set up an ArcGIS account and integrate ArcPy, an ArcGIS Python site package that provides a
useful and productive way to perform geographic data analysis, data conversion, data
management, and map automation with Python.
• Load the data into ArcGIS and write scripts that produce the analyses most interesting to the end user.
• Some of the initial ideas are as follows, but they are subject to change as we work more with the
data and ArcGIS.
– What effects does a district with high criminal activity have on its neighbors?
– How has crime spread from 2001 to 2017, and what are its effects on demographics?
– Hot Spot Analysis of events or incidents.
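As a rough stand-in for the hot spot idea above, incidents can be binned into a latitude/longitude grid and the busiest cells listed. This is only a density count in plain Python. It is not the ArcGIS Hot Spot Analysis tool, which computes the Getis-Ord Gi* statistic, and it does not use ArcPy. The function names and the default cell size are assumptions chosen for illustration.

```python
import math
from collections import Counter


def grid_cell(latitude, longitude, cell_size_deg=0.01):
    """Map a coordinate to the (row, column) index of its grid cell."""
    return (
        math.floor(latitude / cell_size_deg),
        math.floor(longitude / cell_size_deg),
    )


def densest_cells(points, cell_size_deg=0.01, top_n=3):
    """Count incidents per grid cell and return the top_n busiest cells.

    `points` is an iterable of (latitude, longitude) pairs, for example
    taken from the Latitude and Longitude columns after dropping rows
    where either value is missing.
    """
    counts = Counter(grid_cell(lat, lon, cell_size_deg) for lat, lon in points)
    return counts.most_common(top_n)
```

A raw count like this says nothing about statistical significance, so it is best read as a quick sanity check before running a proper hot spot analysis.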
Part 3: Data Visualization
• Exploratory data analysis will serve two purposes. First, it will give us insights into the data;
second, it will let us display the analyses most beneficial to our end user in the web
application.
• We will do the following types of analysis:
– Perform univariate and bivariate data analysis to get insights about the data.
– Plot visualizations that answer questions such as:
• How has crime changed over the years?
• Which areas have evolved over the time span of 2001 to 2017?
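The question of how crime has changed over the years reduces to a count per year, which can then be plotted. A minimal sketch of that aggregation, assuming the values come from the Year column and using hypothetical helper names, might look like this:

```python
from collections import Counter


def incidents_per_year(years):
    """Return {year: incident count}, ordered by year."""
    counts = Counter(int(year) for year in years)
    return dict(sorted(counts.items()))


def year_over_year_change(counts_by_year):
    """Return {year: percent change in incidents versus the previous listed year}."""
    changes = {}
    previous = None
    for year, count in counts_by_year.items():
        if previous is not None and previous > 0:
            changes[year] = 100.0 * (count - previous) / previous
        previous = count
    return changes
```

The resulting dictionaries can be passed to any plotting library as a line or bar chart. Applying the same functions to the rows of a single district or community area gives a per-area trend.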
Part 4: Machine Learning
The machine learning engine in our application will have two parts:
1. Clustering: We will group the regions of Chicago into clusters based on police districts, which
will give 20 clusters.
2. Prediction: We will then build prediction models for each cluster that predict the date when
the next crime will happen and its probability.
– We will try different models, such as Linear Regression, Random Forest, and SVM, and choose the
best-performing one.
– The final model will be deployed in Azure, and a REST API will be created so it can be called from
the web application.
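Before fitting the models listed above, a simple baseline helps show what "date of the next crime and its probability" could mean for one cluster. The sketch below is not one of the proposed models. It assumes incidents in a district arrive at a constant average rate, so the gaps between them are exponentially distributed, and the probability of at least one incident within t days is 1 - exp(-t / mean gap). All function names are hypothetical.

```python
import math
from datetime import datetime, timedelta


def mean_gap_days(timestamps):
    """Average time between consecutive incidents, in days."""
    ordered = sorted(timestamps)
    if len(ordered) < 2:
        raise ValueError("need at least two incidents to estimate a gap")
    span = ordered[-1] - ordered[0]
    return span.total_seconds() / 86400.0 / (len(ordered) - 1)


def predict_next_incident(timestamps):
    """Expected time of the next incident: last incident plus the mean gap."""
    return max(timestamps) + timedelta(days=mean_gap_days(timestamps))


def probability_within(timestamps, horizon_days):
    """P(at least one incident within horizon_days), assuming exponential gaps."""
    gap = mean_gap_days(timestamps)
    if gap == 0:
        return 1.0
    return 1.0 - math.exp(-horizon_days / gap)


# Usage with three made-up incident times for a single district.
history = [datetime(2017, 1, 1), datetime(2017, 1, 3), datetime(2017, 1, 7)]
next_date = predict_next_incident(history)
chance_in_three_days = probability_within(history, 3)
```

Any regression model chosen later should at least beat this baseline on held-out data. If it does not, the extra features are not adding predictive value.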
System Architecture
Components: Docker, S3, Azure ML Studio, ArcGIS, REST API, Web Application
• Data loading and pre-processing will happen in a Docker image.
• Cleaned files will be loaded to S3.
• Cleaned files will be used to build the ML models and the ArcGIS visualizations.
• REST APIs will be created for the ML model and ArcGIS and called from the web application.
Tools
• Python – Data processing and Machine Learning.
• Docker – For easy distribution and submission.
• Java – Web application.
• Microsoft Azure ML Studio – Machine learning REST API
• ArcGIS – Geospatial analysis
Mockup
Thank You!

 
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
 
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
 

Chicago Crime Dataset Project Proposal

  • 1. MACHINE LEARNING ON CHICAGO CRIME DATASET. FINAL PROJECT PROPOSAL. ADVANCE DATA SCIENCE & ARCHITECTURE. Team 9: Aashri Tandon, Pragati Shaw, Sarthak Agarwal
  • 2. Introduction to data
      • The main idea behind this project is to perform geospatial analytics and machine learning on the Chicago Crime dataset.
      • This dataset reflects reported incidents of crime (with the exception of murders where data exists for each victim) that occurred in the City of Chicago from 2001 to present. Data is extracted from the Chicago Police Department's CLEAR (Citizen Law Enforcement Analysis and Reporting) system and is available at the URL below.
          – https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-present/ijzp-q8t2/data
      • Dataset size: 1.4 gigabytes
      • No. of records: ~6.3 million
      • No. of columns: 22
  • 3. Columns (Diving Deep into the features)
      ID: Unique identifier for the record.
      Case Number: Chicago Police Department RD Number (Records Division Number).
      Date: Date when the incident occurred.
      Block: The partially redacted address where the incident occurred, placing it on the same block as the actual address.
      IUCR: The Illinois Uniform Crime Reporting code.
      Primary Type: The primary description of the IUCR code.
      Description: The secondary description of the IUCR code, a subcategory of the primary description.
      Location Description: Description of the location where the incident occurred.
      Arrest: Indicates whether an arrest was made.
      Domestic: Indicates whether the incident was domestic-related.
      Beat: A beat is the smallest police geographic area.
      District: Indicates the police district where the incident occurred.
      Ward: The ward (City Council district) where the incident occurred.
      Community Area: Indicates the community area where the incident occurred.
      FBI Code: Indicates the crime classification as outlined in the FBI's National Incident-Based Reporting System.
      X Coordinate: The x coordinate of the location where the incident occurred.
      Y Coordinate: The y coordinate of the location where the incident occurred.
      Year: Year the incident occurred.
      Updated On: Date and time the record was last updated.
      Latitude: The latitude of the location where the incident occurred.
      Longitude: The longitude of the location where the incident occurred.
      Location: The location where the incident occurred.
  • 4. Problem Statement
      • Our goal is to create a web application that gives its users insight into crime in Chicago and its various aspects.
      • Our application will contain:
          – A search box or drop-down list where the user can select a district.
          – Geospatial analysis using ArcGIS maps and visualizations embedded in the web app, dynamically updated to show the most interesting patterns or heat maps for that district.
          – Statistical analysis and visualizations of historical data.
          – A prediction of the date when the next crime will happen, with its probability.
  • 5. Part 1: Data Download & Preprocessing
      • Data Download
          – Write a Python script that automatically downloads the data from the website to a particular location. https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-present/ijzp-q8t2/data
      • Handle Missing Values
          – Check the percentage of missing values and their frequency distribution. Then choose an appropriate technique to handle the missing data.
      • Feature Engineering
          – Check for correlation in the data and eliminate or create features as needed. These features will be selected with the machine learning component of the application in mind.
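A minimal sketch of the download and missing-value check described in Part 1, using only the Python standard library. The CSV export URL, the function names `download_csv` and `missing_percentages`, and the sample rows are assumptions made for illustration and are not part of the proposal; the export link should be confirmed on the data portal before use.

```python
import csv
import io
import shutil
import urllib.request
from collections import Counter

# Assumption: Socrata-style CSV export endpoint for this dataset.
# Verify the link on the data portal before relying on it.
CSV_URL = "https://data.cityofchicago.org/api/views/ijzp-q8t2/rows.csv?accessType=DOWNLOAD"


def download_csv(url: str, dest_path: str) -> None:
    """Stream the remote file to dest_path without holding it all in memory."""
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)


def missing_percentages(csv_file) -> dict:
    """Return {column: percent of rows with an empty value} for a CSV file object."""
    reader = csv.DictReader(csv_file)
    columns = reader.fieldnames or []
    missing = Counter()
    total = 0
    for row in reader:
        total += 1
        for column in columns:
            if not (row.get(column) or "").strip():
                missing[column] += 1
    if total == 0:
        return {column: 0.0 for column in columns}
    return {column: 100.0 * missing[column] / total for column in columns}


# Hypothetical sample rows standing in for the real 1.4 GB file.
SAMPLE = """ID,Date,Ward,Latitude
1,01/01/2017 12:00:00 AM,12,41.88
2,01/02/2017 01:30:00 PM,,41.90
3,01/03/2017 09:15:00 AM,7,
4,01/04/2017 11:45:00 PM,,41.75
"""

report = missing_percentages(io.StringIO(SAMPLE))
for column, percent in report.items():
    print(f"{column}: {percent:.1f}% missing")
```

For the real file, `download_csv(CSV_URL, "crimes.csv")` followed by `missing_percentages(open("crimes.csv", newline=""))` would follow the same pattern. The per-column percentages then inform whether to drop rows, drop columns, or impute.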
  • 6. Part 2: Geospatial Analysis
      • Set up an ArcGIS account and integrate ArcPy, an ArcGIS Python site package that provides a useful and productive way to perform geographic data analysis, data conversion, data management, and map automation with Python.
      • Load the data into ArcGIS and write scripts for the analyses that are most interesting to the end user.
      • Some of the initial ideas are as follows, but they are subject to change as we work more with the data and ArcGIS.
          – What effects does a district with high criminal activity have on its neighbors?
          – From 2001 to 2017, how has crime spread, and what are its effects on the demographics?
          – Hot spot analysis of events or incidents.
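To make the hot spot idea concrete without an ArcGIS license, the sketch below bins incident coordinates into a latitude/longitude grid and reports the densest cells. This is a plain density count, not the ArcGIS Hot Spot Analysis tool (which is based on the Getis-Ord Gi* statistic). The function names, the 0.01 degree cell size, and the sample points are assumptions for illustration.

```python
import math
from collections import Counter


def grid_cell(lat: float, lon: float, cell_deg: float = 0.01) -> tuple:
    """Map a coordinate to the integer index of the grid cell containing it."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def densest_cells(points, cell_deg: float = 0.01, top_n: int = 3) -> list:
    """Count incidents per grid cell and return the top_n (cell, count) pairs."""
    counts = Counter(grid_cell(lat, lon, cell_deg) for lat, lon in points)
    return counts.most_common(top_n)


# Hypothetical (Latitude, Longitude) pairs; real values come from the dataset.
SAMPLE_POINTS = [
    (41.881, -87.623),
    (41.884, -87.627),
    (41.886, -87.621),
    (41.751, -87.605),
]

for cell, count in densest_cells(SAMPLE_POINTS):
    print(cell, count)
```

A grid count like this is only a first look; it ignores neighborhood effects and statistical significance, which is what the ArcGIS tooling is meant to add.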
  • 7. Part 3: Data Visualization
      • Exploratory data analysis will serve two purposes. First, we will gain insights into the data; second, we will display in the web application the analyses that are most beneficial to our end user.
      • We will do the following types of analysis:
          – Perform univariate and bivariate data analysis to get insights about the data.
          – Plot data visualizations, e.g.:
              • How has crime changed over the years?
              • Which areas have evolved over the time span of 2001 to 2017?
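The question "How has crime changed over the years?" reduces to counting incidents per year before plotting. A minimal sketch follows, assuming the Date column uses the `MM/DD/YYYY hh:mm:ss AM/PM` layout; the format string, function name, and sample values are assumptions and should be checked against the downloaded file.

```python
from collections import Counter
from datetime import datetime

# Assumption: the Date column looks like "01/02/2017 01:30:00 PM".
DATE_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def incidents_per_year(date_strings) -> dict:
    """Return {year: number of incidents}, ordered by year."""
    years = Counter(
        datetime.strptime(value, DATE_FORMAT).year for value in date_strings
    )
    return dict(sorted(years.items()))


# Hypothetical sample values standing in for the Date column.
SAMPLE_DATES = [
    "03/15/2015 08:00:00 AM",
    "07/04/2016 11:59:00 PM",
    "12/31/2016 12:00:00 AM",
    "01/02/2017 01:30:00 PM",
]

for year, count in incidents_per_year(SAMPLE_DATES).items():
    print(year, count)
```

The resulting year-to-count mapping can be passed to any plotting library as the x and y values of a line or bar chart.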
  • 8. Part 4: Machine Learning
      The machine learning engine in our application will have two parts:
      1. Clustering: We will divide the regions of Chicago into clusters based on districts. This will result in 20 clusters.
      2. Prediction: We will then build prediction models for each cluster that predict the date when the next crime will happen, with its probability.
          – We will try different models such as Linear Regression, Random Forest, and SVM, and will choose the best prediction model.
          – The final model will be deployed in Azure, and a REST API will be created to be called from the web application.
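Before fitting the models named above, a simple per-district baseline can clarify what "next crime date and its probability" means in practice. The sketch below is not one of the proposed models: it groups incidents by district, uses the mean gap between incidents to estimate the next date, and assumes exponentially distributed gaps (a Poisson process) to get the probability of at least one incident within a time horizon. All function names and sample data are hypothetical.

```python
import math
from collections import defaultdict
from datetime import datetime, timedelta


def group_by_district(events) -> dict:
    """Group (district, timestamp) pairs into {district: [timestamps]}."""
    grouped = defaultdict(list)
    for district, timestamp in events:
        grouped[district].append(timestamp)
    return dict(grouped)


def mean_gap_days(timestamps) -> float:
    """Average number of days between consecutive incidents."""
    ordered = sorted(timestamps)
    if len(ordered) < 2:
        raise ValueError("need at least two incidents to estimate a gap")
    span_days = (ordered[-1] - ordered[0]).total_seconds() / 86400
    return span_days / (len(ordered) - 1)


def predict_next(timestamps) -> datetime:
    """Baseline estimate: last incident plus the mean gap."""
    return max(timestamps) + timedelta(days=mean_gap_days(timestamps))


def probability_within(timestamps, horizon_days: float) -> float:
    """P(at least one incident within horizon_days), assuming exponential gaps."""
    gap = mean_gap_days(timestamps)
    if gap == 0:
        return 1.0
    return 1.0 - math.exp(-horizon_days / gap)


# Hypothetical incidents for two districts.
SAMPLE_EVENTS = [
    ("011", datetime(2017, 1, 1)),
    ("011", datetime(2017, 1, 3)),
    ("011", datetime(2017, 1, 5)),
    ("007", datetime(2017, 1, 2)),
    ("007", datetime(2017, 1, 6)),
]

for district, stamps in group_by_district(SAMPLE_EVENTS).items():
    print(district, predict_next(stamps), round(probability_within(stamps, 2), 3))
```

A baseline like this gives the regression and tree-based models a reference score to beat when choosing the best prediction model.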
  • 9. System Architecture
      Components: Docker, S3, Azure ML Studio, ArcGIS, REST API, Web Application.
      • Data loading and pre-processing will happen in a Docker image.
      • Cleaned files will be loaded to S3.
      • Cleaned files will be used to build the ML models and the ArcGIS visualizations.
      • REST APIs will be created for the ML model and ArcGIS and called from the web application.
  • 10. Tools
      • Python: Data processing and machine learning.
      • Docker: For easy distribution and submission.
      • Java: Web application.
      • Microsoft Azure ML Studio: Machine learning REST API.
      • ArcGIS: Geospatial analysis.