Geolocation Data Analysis for Safe
Residence using HiveQL
TEAM: PRIYANKA KALE, PRIYAL MISTRY, HITESH JAGTAP
GUIDE: DR. JONGWOOK WOO
24th Annual Student Symposium, CSULA
26th February 2016
Table of Contents
1. Introduction
2. Big Data
3. Flowchart
4. Specifications
5. Implementation
6. Visualization
7. GitHub
8. Business Perspective
9. References
Introduction:
• Goal: to determine whether a location is safe by analyzing a large
crime dataset (1.3 GB) for the city of Chicago, IL, covering 2001 to
November 2015.
• This is a study of a real dataset published by the government of the
United States of America, using Big Data analytics and related tools.
• Query output is visualized using graphs and maps for better
interpretation.
Big Data
Volume
Complexity
Variety
Variability
Flowchart
Download Dataset
Upload data into HDFS
Trigger Hive Queries
Result Tables
Output visualization
Specifications
• Microsoft Azure
Hortonworks Sandbox:
1. Linux system
2. No. of nodes: 4
3. 8 cores
4. Size: 14 GB
Implementation
• Hue is a web application for browsing HDFS and working with Hive
and Cloudera Impala queries and MapReduce jobs.
Creation of tables in HCatalog:
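The table-creation step can be sketched in HiveQL roughly as follows. Only the table name crime and the four column names come from the queries in this deck. The column order, the four-column file layout, the header row, and the HDFS path /user/hue/crime/crimes.csv are assumptions made for illustration, not the project's actual schema.

```sql
-- Assumption: the uploaded CSV has been trimmed to these four columns,
-- in this order, with one header row. The real file layout may differ.
CREATE TABLE IF NOT EXISTS crime (
  address              STRING,
  iucr                 STRING,
  primary_type         STRING,
  location_description STRING
)
-- OpenCSVSerde handles quoted fields that contain commas;
-- it exposes every column as STRING.
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
STORED AS TEXTFILE
TBLPROPERTIES ('skip.header.line.count' = '1');

-- Hypothetical HDFS path; LOAD DATA INPATH moves the file into the table's directory.
LOAD DATA INPATH '/user/hue/crime/crimes.csv' INTO TABLE crime;
```

Once created this way, the table is visible to HCatalog and to the Hive queries that follow.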
Hive and Beeswax
Hive is an infrastructure built on top of Hadoop for data
summarization, query, and analysis.
Beeswax is an application for running Hive queries.
Processing in Beeswax:
• Total number and rank of each crime type:
select primary_type, count(iucr) as total_crimes,
rank() over (ORDER BY count(iucr) desc) as crime_rank
from crime group by primary_type limit 100;
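Older Hive releases may reject an aggregate such as count(iucr) placed directly inside the OVER clause. A subquery form that sidesteps the question is sketched below; the aliases total_crimes, crime_rank, and t are illustrative names, and the explicit ORDER BY is added so that LIMIT returns the top rows rather than an arbitrary 100.

```sql
-- Aggregate first, then rank the aggregated rows.
SELECT primary_type,
       total_crimes,
       rank() OVER (ORDER BY total_crimes DESC) AS crime_rank
FROM (
  SELECT primary_type, count(iucr) AS total_crimes
  FROM crime
  GROUP BY primary_type
) t
ORDER BY total_crimes DESC
LIMIT 100;
```

Crime types with the same count share a rank, and the next rank is skipped, which is the standard behavior of rank().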
Queries and Visualization
• Number of crimes per location type for a given area:
select location_description, count(iucr) as total_crimes from crime
where address = '008XX N MICHIGAN AVE'
group by location_description limit 100;
[Chart: "Total", vertical axis from 0 to 1200]
Final Outcome of Analysis:
CREATE TABLE UnsafeArea
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS RCFile
AS SELECT address, count(iucr) AS total_crimes,
rank() OVER (ORDER BY count(iucr) DESC) AS rank
FROM crime GROUP BY address;
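Once UnsafeArea exists, it could be queried along the following lines. This is a usage sketch rather than part of the original project; the address literal is the example used earlier in the deck, and the column named rank is quoted with backticks as a precaution because it shares its name with a function.

```sql
-- Ten addresses with the most recorded crimes.
SELECT address, total_crimes, `rank`
FROM UnsafeArea
ORDER BY total_crimes DESC
LIMIT 10;

-- Look up a single address to see where it falls in the ranking.
SELECT address, total_crimes, `rank`
FROM UnsafeArea
WHERE address = '008XX N MICHIGAN AVE';
```

A lower rank value means more recorded crimes at that address. Any cutoff for calling an address "safe" would be a separate choice that the deck does not specify.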
GitHub
URL: https://github.com/priya708/Project-520
Business Perspective
• Get better advertisement
• Predictive policing for police departments: the future of law
enforcement?
• Reducing Random Gunfire
• Connecting Burglaries and Code Violations
References
• https://catalog.data.gov
• https://cwiki.apache.org/confluence/display/Hive/Tutorial
• https://hortonworks.com/tutorials
THANK YOU
