Big (Geo) Data Science




Robert Cheetham
cheetham@azavea.com
   @rcheetham
Web/Mobile

Geospatial

UI/UX Design

High Performance Computing

R&D
B Corporation
   • Projects w/ Social Value
   • Summer of Maps
   • Pro Bono Program
   • Donate share of profits

Research-Driven
  • 10% Research Program
  • Academic Collaborations
  • Open Source
Spatial Temporal Forecasting
with Philadelphia Crime Data
How Phila PD uses Maps

 Customized Map Products




            Weekly CompStat Meetings




   Web Crime Analysis
INCT & PARS – main database sources
over 5,000 incidents daily, over 2 million annually



[Diagram labels: Complainant; Verizon 911; 911 Operator; Radio Dispatcher; CAD; Police Officer; District 48 Desk; Incident Report Completed by Officer; INCT; PARS; Daily download & Geocoding Routines; Maps distributed through Intranet, Printing, CompStat; District X; District Y; District Z]
The Context

1,500,000 people
7,000 police
1,000 civilian employees
2,000,000 new incidents / year
3 crime analysts
What we did

•   Weekly CompStat
•   Lots of maps
•   Automation of map creation
•   Web-based systems
… but what if we could…

 Accelerate the cycle
 Proactively notify
 Automate the process
Prototype
[Diagram labels: VB & MapObjects; ArcView; .ini file; Process Documentation; Shapefiles and GRIDs; MS SQL Server Crime Incidents Database]
… but there was a problem …
…it was crap …
… sort of.
We needed …

1. Better Statistics

2. Notification

3. Simplicity
Crime Analysis – What has happened?
   – Mapping (spatial / temporal densities)
   – Trending
   – Intelligence Dashboard
Early Warning – What is out of the ordinary?
   – Statistical & Threshold-based Hunches (data mining)
   – Alerting
Risk Forecasting – What is likely to happen next?
   – Near Repeat Pattern
   – Load Forecasting
Crime Analysis
Intelligence Dashboard
Crime Analysis
Early Warning
Early Warning

• Geographic Early Warning System
   – A system to alert staff of an unusual situation in a particular
     location
   – Ingests data sets to automatically “cook on” and only
     involves staff when a statistically unusual situation is found


[Diagram labels: Operational Databases; HunchLab Database; Geostatistical Engine; Alerting System]
Early Warning
What is a Hunch?

• A proposed hypothesis, saved into the system, and
  continually tested for validity
• Incident Attribute Requirements
   – Location (x, y)
   – Time (timestamp)
   – Classification
• Hunch Attributes
   – Location (area)
   – Time (recent / historic periods)
   – Classification
• Analyses
   – Statistical Hunch
   – Threshold Hunch
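
The two analyses can be illustrated with a minimal sketch. The slides do not say which statistical test HunchLab applies, so the Poisson comparison below is an assumption made for illustration, and every function name and the 0.05 significance level are hypothetical.

```python
import math

def threshold_hunch(recent_count, threshold):
    """Fire when the recent count reaches a user-chosen threshold."""
    return recent_count >= threshold

def poisson_upper_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected)."""
    below = sum(
        math.exp(-expected) * expected ** k / math.factorial(k)
        for k in range(observed)
    )
    return 1.0 - below

def statistical_hunch(recent_count, recent_days, historic_count, historic_days,
                      significance=0.05):
    """Fire when the recent count is unusually high for the historic rate.

    Assumption: incidents in the hunch area arrive as a Poisson process at
    the historic daily rate. This is one simple choice, not the source's.
    """
    expected = historic_count / historic_days * recent_days
    return poisson_upper_tail(recent_count, expected) < significance

# Hypothetical hunch: 15 incidents in the last 7 days, 365 in the prior year.
fires = statistical_hunch(15, 7, 365, 365)
```

A threshold hunch needs no history at all, which makes it easy to explain; a statistical hunch adapts to areas whose normal level of activity is high or low.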
Hunch Parameters: Location

•   Address & Radius
•   Precinct/County/Country
•   Custom Drawn Area
•   Mass Hunch
Hunch Parameters: Time

• Statistical Hunch
   – Recent Past
   – Historic Past
Hunch Parameters: Classification

• Category
• Time of Day
• Narrative
Hunch Helper
Email Alert
Hunch Details
Risk Forecasting
Predictive Analytics?

• Prediction vs. Forecasting
Near Repeat Pattern Analysis
Contagious Crime?

• Near repeat pattern analysis
      • “If one burglary occurs, how does the risk change nearby?”
What Do We Mean By Near Repeat?

• Repeat victimization
   – Incident at the same location at a later time (likely related)
• Near repeat victimization
   – Incident at a nearby location at a later time (likely related)

• Incident A (place, time) --> Incident B (place, time)
Near Repeat Pattern Analysis

• The goal:
   – Quantify short term risk due to near-repeat victimization
      • “If one burglary occurs, how does the risk of burglary for the
        neighbors change?”


• What we know:
   – Incident A (place, time) --> Incident B (place, time)
      • Distance between A and B
      • Timeframe between A and B


• What we need to know:
   – What distances/timeframes are not simply random?
Near Repeat Pattern Analysis

• The process
   –   Observe the pattern in historic data
   –   Simulate the pattern in randomized historic data
   –   Compare the observed pattern to the simulated patterns
   –   Apply the non-random pattern to new incidents

• An example
   – 180 days of burglaries in Division 6 of Philadelphia
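
The first three steps of the process can be sketched as a Knox-style permutation test. This is a minimal illustration, assuming incidents are (x, y, day) tuples and a single distance/time band; the function names and parameters are hypothetical, and the slides do not state that the Near Repeat Calculator is implemented this way.

```python
import math
import random
from itertools import combinations

def count_close_pairs(points, times, max_dist, max_days):
    """Count incident pairs that are close in both space and time."""
    n = 0
    for i, j in combinations(range(len(points)), 2):
        dx = points[i][0] - points[j][0]
        dy = points[i][1] - points[j][1]
        if math.hypot(dx, dy) <= max_dist and abs(times[i] - times[j]) <= max_days:
            n += 1
    return n

def near_repeat_test(incidents, max_dist, max_days, runs=199, seed=0):
    """Return (observed pair count, pseudo p-value) for one band.

    Shuffling the days across the fixed locations breaks any space-time
    link while keeping the spatial and temporal patterns themselves.
    """
    rng = random.Random(seed)
    points = [(x, y) for x, y, _ in incidents]
    times = [t for _, _, t in incidents]
    observed = count_close_pairs(points, times, max_dist, max_days)
    at_least_as_extreme = 0
    for _ in range(runs):
        shuffled = times[:]
        rng.shuffle(shuffled)
        if count_close_pairs(points, shuffled, max_dist, max_days) >= observed:
            at_least_as_extreme += 1
    return observed, (at_least_as_extreme + 1) / (runs + 1)
```

A small p-value for a band (for example, within one block and two weeks) suggests that pairs in that band occur more often than chance alone would produce. Applying the pattern to new incidents, the fourth step, is not covered here.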
Near Repeat Pattern Analysis

• How can you test your own data?
   – Near Repeat Calculator
      • http://www.temple.edu/cj/misc/nr/
• Papers
   – Near-Repeat Patterns in Philadelphia Shootings (2008)
      • One city block & two weeks after one shooting
           – 33% increase in likelihood of a second event




                                             Jerry Ratcliffe
                                           Temple University
Contagious Crime?
Workload Forecasting
Improving CompStat

• Workload forecasting
      • “Given the time of year, day of week, time of day and
        general trend, what counts of crimes should I expect?”
What Do We Mean By Load Forecasting?

 • Workload forecasting
         • Generating aggregate crime counts for a future timeframe
           using cyclical time series analysis



         Measure cyclical patterns + Identify non-cyclical trend --> Forecast expected count

bit.ly/gorrcrimeforecastingpaper
Load Forecasting

• Measure cyclical patterns
      • Take historic incidents (for example: last five years)
      • Generate multiplicative seasonal indices
          – For each time cycle:
              » time of year
              » day of week
              » time of day
          – Count incidents within each time unit (for example: Monday)
          – Calculate average per time unit if incidents were evenly
            distributed
          – Divide counts within each time unit by the calculated average to
            generate multiplicative indices
              » Index ~ 1 means at the average
              » Index > 1 means above average
              » Index < 1 means below average
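
The index calculation can be sketched for one cycle, day of week. This is a minimal illustration, assuming the input is simply a list of incident dates; the function name is hypothetical.

```python
from collections import Counter
from datetime import date, timedelta

def day_of_week_indices(incident_dates):
    """Multiplicative day-of-week indices (Monday = 0 ... Sunday = 6).

    Index = incidents on that weekday / the count that weekday would have
    if incidents were spread evenly over every day in the period.
    """
    start, end = min(incident_dates), max(incident_dates)
    # How many Mondays, Tuesdays, ... actually occur in the period.
    days_in_period = Counter(
        (start + timedelta(days=k)).weekday()
        for k in range((end - start).days + 1)
    )
    counts = Counter(d.weekday() for d in incident_dates)
    per_day_average = len(incident_dates) / sum(days_in_period.values())
    return {
        wd: counts[wd] / (per_day_average * days_in_period[wd])
        for wd in range(7) if days_in_period[wd]
    }

# Example: two full weeks, one incident per day plus one extra each Friday.
first_monday = date(2024, 1, 1)
sample = [first_monday + timedelta(days=k) for k in range(14)]
sample += [date(2024, 1, 5), date(2024, 1, 12)]
indices = day_of_week_indices(sample)  # Friday is about 1.75, other days about 0.875
```

The same pattern applies to time of year and time of day by swapping the weekday key for a month or hour key.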
Load Forecasting

• Identify non-cyclical trend
      • Take recent daily counts (for example: last year daily counts)
      • Remove cyclical trends by dividing by indices




      • Run a trending function on the new counts
          – Simple average
              » Last X Days
          – Smoothing function
              » Exponential smoothing
              » Holt’s linear exponential smoothing
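
The trending functions listed above can be sketched in a few lines. This is a minimal illustration with hypothetical function names; the smoothing parameters alpha and beta are assumptions that would need tuning, and the series must contain at least two values for the Holt variant.

```python
def deseasonalize(counts, indices):
    """Divide each count by the seasonal index for its time unit."""
    return [c / i for c, i in zip(counts, indices)]

def simple_average(series, last_n):
    """Average of the last `last_n` values."""
    window = series[-last_n:]
    return sum(window) / len(window)

def exponential_smoothing(series, alpha):
    """Return the final smoothed level (recent values weigh more)."""
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level

def holt_linear(series, alpha, beta):
    """Return the final (level, trend) from Holt's linear exponential smoothing."""
    level, trend = series[0], series[1] - series[0]
    for value in series[1:]:
        previous_level = level
        level = alpha * value + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return level, trend
```

Simple averaging and exponential smoothing return only a level; Holt's method also returns a slope, which is what allows a rising or falling projection.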
Load Forecasting

• Forecast expected count
      • Project trend into future timeframe
          – Always flat
              » Simple average
              » Exponential smoothing
          – Linear trend
              » Holt’s linear exponential smoothing
      • Multiply by seasonal indices to reseasonalize the data
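
Projection and reseasonalizing can be sketched as one step. This is a minimal illustration, assuming a deseasonalized level and trend are already available and that `future_indices` holds the seasonal index for each future time unit; the function name is hypothetical.

```python
def forecast_counts(level, trend, future_indices):
    """Project the deseasonalized level forward, then reseasonalize.

    trend = 0 gives the always-flat projection (simple average or
    exponential smoothing); a nonzero trend gives a linear projection
    of the kind Holt's method produces.
    """
    return [
        (level + trend * step) * index
        for step, index in enumerate(future_indices, start=1)
    ]

# Flat projection of 10 incidents per day onto a busy day and a quiet day.
flat = forecast_counts(10, 0, [1.5, 0.5])
```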
Improving CompStat
How Do We Know It’s Accurate?

• Testing
      • Generated forecasting techniques (examples)
            – Commonly Used
                » Average of last 30 days
                » Average of last 365 days
                » Last year’s count for the same time period
            – Advanced Combinations
                » Different cyclical indices (example: day of year vs. month of year)
                » Different levels of geographic aggregation for indices
                » Different trending functions
      • Scoring methodologies (examples)
            – Mean absolute percent error (with some enhancements)
            – Mean percent error
            – Mean squared error
      • Run thousands of forecasts through testing framework
      • Choose the right technique in the right situation
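
The three scoring methods can be sketched as follows. This is a minimal illustration with hypothetical function names; the sign convention (forecast minus actual) is an assumption, the unspecified "enhancements" to the absolute percent error are not reproduced, and the percent-based scores are undefined when an actual count is zero.

```python
def mean_percent_error(actual, forecast):
    """Signed average percent error; reveals consistent over- or under-forecasting."""
    return 100 * sum((f - a) / a for a, f in zip(actual, forecast)) / len(actual)

def mean_absolute_percent_error(actual, forecast):
    """Average size of the percent error, ignoring direction."""
    return 100 * sum(abs(f - a) / a for a, f in zip(actual, forecast)) / len(actual)

def mean_squared_error(actual, forecast):
    """Average squared error; penalizes large misses more heavily."""
    return sum((f - a) ** 2 for a, f in zip(actual, forecast)) / len(actual)

def best_technique(actual, forecasts_by_name, score=mean_absolute_percent_error):
    """Pick the technique whose forecasts score lowest on the chosen measure."""
    return min(forecasts_by_name, key=lambda name: score(actual, forecasts_by_name[name]))
```

Running every candidate technique through the same scoring functions is what makes it possible to choose a different technique for each crime type or geography.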
Ongoing Research
Research Topics

• Risk Forecasting
   – Load forecasting enhancements
      • Weather and special events




   – Combining short and long term risk forecasts (Temple)
      • Socioeconomic changes in neighborhoods
   – Risk Terrain Modeling (Rutgers)
      • Context of crime at the microplace
Research Topics
Research Topics

• Risk Forecasting
   – Offender Management
      • Prioritize offenders based upon statistical models using past
        behaviors
• Evaluation
   – Automate Randomized Controlled Trials
Data Processing for Big (Geo) Data
A Story
Robert’s Rules of Housing
Close to Center City: somewhat important
Walk to Grocery Store: vital
Nearby Restaurants: very important
Library: nice to have
Near a Park: somewhat important
Biking / walking distance from our work: very important
Biking distance to fencing: somewhat important
Your factors might include…
                      Child Care
                      Local School Rankings
                      Farmer's Market
                      Car Share
                      Public Transit
We stand on the shoulders of giants
Not a new idea … Design with Nature
Not a new idea … Dana Tomlin
Desktop GIS
Weighted Overlay


    (layer 1 x 5) + (layer 2 x 1) + (layer 3 x 3) + (layer 4 x 2) = weighted result
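
A weighted overlay is a cell-by-cell weighted sum of aligned raster layers. The sketch below is a minimal illustration, assuming every layer is a small 2D list with the same dimensions; the function name and the sample layers are hypothetical, and the weights 5, 1, 3 and 2 are the ones shown on the slide.

```python
def weighted_overlay(layers, weights):
    """Cell-by-cell weighted sum of equally sized 2D grids."""
    rows, cols = len(layers[0]), len(layers[0][0])
    return [
        [sum(w * layer[r][c] for layer, w in zip(layers, weights))
         for c in range(cols)]
        for r in range(rows)
    ]

# Four hypothetical 2x2 suitability layers (1 = factor present in the cell).
layers = [
    [[1, 0], [0, 1]],
    [[1, 1], [1, 1]],
    [[0, 1], [0, 0]],
    [[0, 0], [1, 0]],
]
result = weighted_overlay(layers, [5, 1, 3, 2])  # [[6, 4], [3, 6]]
```

Because each output cell depends only on the same cell in each input, changing a weight means recomputing the sum and nothing else, which suits an iterative, per-user workflow.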
Summary

      Geography-driven Decisions

      Iterative

      Individual

      Web [and Mobile]

      Growing data sets
Web Challenges
Web is different from the Desktop

  Lots of simultaneous users

  Stateless environment

  HTML+JS+CSS

  Users are less skilled

  Users are less patient
But wait … there’s a problem
 10 – 60 second calculation time

 Multiple simultaneous users …

 … that are impatient
Data Challenges
Big Data – Social Media
Big Data – Science
Big Data – Citizen Science
Big Data – Cities
Early Prototype
Specific Optimization Goals
 New Raster File Structure

 Distributed processing

 Binary messaging protocol
Optimization: File Format
 Limit data type and range

 1D arrays are fast to read/write

 Tiled

 Pyramids

 Azavea Raster Grid (ARG)
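
The first three ideas can be sketched with a flat, row-major array of small integers. This is a minimal illustration only: the class and method names are hypothetical, the signed 8-bit cell type is an assumed example of a limited data type, and nothing here describes the actual ARG file layout.

```python
from array import array

class FlatRaster:
    """A raster stored as one flat, row-major array of signed 8-bit cells."""

    def __init__(self, cols, rows, fill=0):
        self.cols, self.rows = cols, rows
        self.cells = array("b", [fill]) * (cols * rows)

    def get(self, col, row):
        return self.cells[row * self.cols + col]

    def set(self, col, row, value):
        self.cells[row * self.cols + col] = value

    def tile(self, col0, row0, width, height):
        """Copy a rectangular tile out as its own FlatRaster."""
        out = FlatRaster(width, height)
        for r in range(height):
            start = (row0 + r) * self.cols + col0
            out.cells[r * width:(r + 1) * width] = self.cells[start:start + width]
        return out

raster = FlatRaster(cols=4, rows=3)
raster.set(2, 1, 7)
corner = raster.tile(2, 1, 2, 2)
```

A flat array can be written to and read from disk in a single pass, and fixed-size tiles let a request touch only the part of the raster it needs.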
Optimization: Distributed Processing
 Parallelizable - Local Ops and Focal Ops

 Support multiple
  –   Threads
  –   Cores
  –   CPUs
  –   Machines


 Considered
  – Hadoop
  – Amazon Map Reduce
  – Beowulf
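
Focal operations parallelize because each output cell depends only on a small neighborhood of input cells, so bands of rows can be computed independently. The sketch below is a minimal illustration using Python threads; the function names are hypothetical, and it is not the Scala + Akka design mentioned later in the deck. In CPython, threads illustrate the decomposition but will not speed up pure-Python arithmetic.

```python
from concurrent.futures import ThreadPoolExecutor

def focal_sum_rows(grid, row_start, row_end):
    """3x3 neighborhood sum for rows [row_start, row_end).

    Cells on the edge of the grid sum only the neighbors that exist.
    """
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(row_start, row_end):
        out_row = []
        for c in range(cols):
            total = 0
            for rr in range(max(0, r - 1), min(rows, r + 2)):
                for cc in range(max(0, c - 1), min(cols, c + 2)):
                    total += grid[rr][cc]
            out_row.append(total)
        out.append(out_row)
    return out

def parallel_focal_sum(grid, workers=4):
    """Split the grid into row bands and compute each band independently."""
    rows = len(grid)
    band = max(1, -(-rows // workers))  # ceiling division
    starts = range(0, rows, band)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        bands = pool.map(
            lambda s: focal_sum_rows(grid, s, min(rows, s + band)), starts
        )
        return [row for part in bands for row in part]
```

Every band reads the shared input grid, including one row above and below its own range, and writes only its own output rows, so no locking is needed.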
Success!!
  Reduced from 10-60 seconds to <500 milliseconds
Optimizing one process sub-optimizes others
   Complex to configure and maintain
   Limited to one operation
   No interpolation
   No mixing
    – cell sizes
    – extents
    – projections
 etc.
 Broader set of functionality

 Both raster and vector

 Scala + Akka

 Open source
Faster is Different
Regional/State:   84 ms
National:         84 ms
Large Country:   115 ms
Continental:     271 ms
Planet:          1.2 – 2.0 s
Ongoing R&D
GPUs
GPU Results
  Re-wrote a few Map Algebra operations:
    Local
    Neighborhood
    Zonal
    Viewshed
    etc.
  15 – 120x
  Large grids
  Large kernels
New Spatial Operations
 Vector

 Neighborhood/Focal

 Spatial Statistics

 Integration
Urban Forest Ecosystem Modeling
Crime Analysis, Early Warning and Forecasting
Open Source Geoprocessing

       GDAL

       GeoServer

       PostGIS

      R

       GeoDa
Many Thanks!
© Photo used with permission from Alphafish, via Flickr.com
Big (Geo) Data Science

                 [We are hiring]


Robert Cheetham
cheetham@azavea.com
   @rcheetham

 
Data Science At Zillow
Data Science At ZillowData Science At Zillow
Data Science At Zillow
 
PPT.pptx
PPT.pptxPPT.pptx
PPT.pptx
 
Nye forskninsgresultater inden for geo-spatiale data af Christian S. Jensen, AAU
Nye forskninsgresultater inden for geo-spatiale data af Christian S. Jensen, AAUNye forskninsgresultater inden for geo-spatiale data af Christian S. Jensen, AAU
Nye forskninsgresultater inden for geo-spatiale data af Christian S. Jensen, AAU
 
A data driven approach for monitoring network events
A data driven approach for monitoring network eventsA data driven approach for monitoring network events
A data driven approach for monitoring network events
 
Cyber Analytics Applications for Data-Intensive Computing
Cyber Analytics Applications for Data-Intensive ComputingCyber Analytics Applications for Data-Intensive Computing
Cyber Analytics Applications for Data-Intensive Computing
 
Time series and forecasting from wikipedia
Time series and forecasting from wikipediaTime series and forecasting from wikipedia
Time series and forecasting from wikipedia
 

More from Azavea

Using New Tools to Analyze and Plan Your Urban Forest
Using New Tools to Analyze and Plan Your Urban Forest Using New Tools to Analyze and Plan Your Urban Forest
Using New Tools to Analyze and Plan Your Urban Forest
Azavea
 
7 misconceptions about predictive policing webinar
7 misconceptions about predictive policing webinar7 misconceptions about predictive policing webinar
7 misconceptions about predictive policing webinar
Azavea
 
Tracking Your Green Infrastructure
Tracking Your Green InfrastructureTracking Your Green Infrastructure
Tracking Your Green Infrastructure
Azavea
 
Growing Your Urban Forest: Using the OpenTreeMap Bulk Uploader
Growing Your Urban Forest: Using the OpenTreeMap Bulk UploaderGrowing Your Urban Forest: Using the OpenTreeMap Bulk Uploader
Growing Your Urban Forest: Using the OpenTreeMap Bulk Uploader
Azavea
 
November 12, 2014 Webinar: Hackers, Beer Geeks, and Arborly Love - Reaching o...
November 12, 2014 Webinar: Hackers, Beer Geeks, and Arborly Love - Reaching o...November 12, 2014 Webinar: Hackers, Beer Geeks, and Arborly Love - Reaching o...
November 12, 2014 Webinar: Hackers, Beer Geeks, and Arborly Love - Reaching o...
Azavea
 
Mobile Citizen Science
Mobile Citizen Science Mobile Citizen Science
Mobile Citizen Science
Azavea
 
Getting Started with OpenTreeMap Cloud
Getting Started with OpenTreeMap CloudGetting Started with OpenTreeMap Cloud
Getting Started with OpenTreeMap Cloud
Azavea
 
HunchLab 2.0 Getting Started
HunchLab 2.0 Getting StartedHunchLab 2.0 Getting Started
HunchLab 2.0 Getting Started
Azavea
 
Is it a Package or a Wrapper? Designing, Documenting, and Distributing a Pyth...
Is it a Package or a Wrapper? Designing, Documenting, and Distributing a Pyth...Is it a Package or a Wrapper? Designing, Documenting, and Distributing a Pyth...
Is it a Package or a Wrapper? Designing, Documenting, and Distributing a Pyth...
Azavea
 
Your New Partners: Understanding Civic Hackathons, Why You Should be Involved...
Your New Partners: Understanding Civic Hackathons, Why You Should be Involved...Your New Partners: Understanding Civic Hackathons, Why You Should be Involved...
Your New Partners: Understanding Civic Hackathons, Why You Should be Involved...
Azavea
 
Using Open Data and Citizen Science to Promote Citizen Engagement with Green ...
Using Open Data and Citizen Science to Promote Citizen Engagement with Green ...Using Open Data and Citizen Science to Promote Citizen Engagement with Green ...
Using Open Data and Citizen Science to Promote Citizen Engagement with Green ...
Azavea
 
HunchLab 2.0 Preview Webinar - Place
HunchLab 2.0 Preview Webinar - PlaceHunchLab 2.0 Preview Webinar - Place
HunchLab 2.0 Preview Webinar - Place
Azavea
 
Five Technology Trends Every Nonprofit Needs to Know
Five Technology Trends Every Nonprofit Needs to KnowFive Technology Trends Every Nonprofit Needs to Know
Five Technology Trends Every Nonprofit Needs to Know
Azavea
 
PhillyHistory.org - Tracking Metrics for a Digital Project
PhillyHistory.org - Tracking Metrics for a Digital ProjectPhillyHistory.org - Tracking Metrics for a Digital Project
PhillyHistory.org - Tracking Metrics for a Digital Project
Azavea
 
Fed Geo Day - Applying GeoTrellis at the US Army Corps
Fed Geo Day - Applying GeoTrellis at the US Army CorpsFed Geo Day - Applying GeoTrellis at the US Army Corps
Fed Geo Day - Applying GeoTrellis at the US Army Corps
Azavea
 
Fed Geo Day - GeoTrellis Intro
Fed Geo Day - GeoTrellis IntroFed Geo Day - GeoTrellis Intro
Fed Geo Day - GeoTrellis Intro
Azavea
 
Fed Geo Day 2013 - Azavea Intro
Fed Geo Day 2013 - Azavea Intro Fed Geo Day 2013 - Azavea Intro
Fed Geo Day 2013 - Azavea Intro
Azavea
 
Modeling Count-based Raster Data with ArcGIS and R
Modeling Count-based Raster Data with ArcGIS and RModeling Count-based Raster Data with ArcGIS and R
Modeling Count-based Raster Data with ArcGIS and R
Azavea
 
OpenTreeMap NCGIS
OpenTreeMap NCGISOpenTreeMap NCGIS
OpenTreeMap NCGISAzavea
 
OpenTreeMap Overview
OpenTreeMap OverviewOpenTreeMap Overview
OpenTreeMap Overview
Azavea
 

More from Azavea (20)

Using New Tools to Analyze and Plan Your Urban Forest
Using New Tools to Analyze and Plan Your Urban Forest Using New Tools to Analyze and Plan Your Urban Forest
Using New Tools to Analyze and Plan Your Urban Forest
 
7 misconceptions about predictive policing webinar
7 misconceptions about predictive policing webinar7 misconceptions about predictive policing webinar
7 misconceptions about predictive policing webinar
 
Tracking Your Green Infrastructure
Tracking Your Green InfrastructureTracking Your Green Infrastructure
Tracking Your Green Infrastructure
 
Growing Your Urban Forest: Using the OpenTreeMap Bulk Uploader
Growing Your Urban Forest: Using the OpenTreeMap Bulk UploaderGrowing Your Urban Forest: Using the OpenTreeMap Bulk Uploader
Growing Your Urban Forest: Using the OpenTreeMap Bulk Uploader
 
November 12, 2014 Webinar: Hackers, Beer Geeks, and Arborly Love - Reaching o...
November 12, 2014 Webinar: Hackers, Beer Geeks, and Arborly Love - Reaching o...November 12, 2014 Webinar: Hackers, Beer Geeks, and Arborly Love - Reaching o...
November 12, 2014 Webinar: Hackers, Beer Geeks, and Arborly Love - Reaching o...
 
Mobile Citizen Science
Mobile Citizen Science Mobile Citizen Science
Mobile Citizen Science
 
Getting Started with OpenTreeMap Cloud
Getting Started with OpenTreeMap CloudGetting Started with OpenTreeMap Cloud
Getting Started with OpenTreeMap Cloud
 
HunchLab 2.0 Getting Started
HunchLab 2.0 Getting StartedHunchLab 2.0 Getting Started
HunchLab 2.0 Getting Started
 
Is it a Package or a Wrapper? Designing, Documenting, and Distributing a Pyth...
Is it a Package or a Wrapper? Designing, Documenting, and Distributing a Pyth...Is it a Package or a Wrapper? Designing, Documenting, and Distributing a Pyth...
Is it a Package or a Wrapper? Designing, Documenting, and Distributing a Pyth...
 
Your New Partners: Understanding Civic Hackathons, Why You Should be Involved...
Your New Partners: Understanding Civic Hackathons, Why You Should be Involved...Your New Partners: Understanding Civic Hackathons, Why You Should be Involved...
Your New Partners: Understanding Civic Hackathons, Why You Should be Involved...
 
Using Open Data and Citizen Science to Promote Citizen Engagement with Green ...
Using Open Data and Citizen Science to Promote Citizen Engagement with Green ...Using Open Data and Citizen Science to Promote Citizen Engagement with Green ...
Using Open Data and Citizen Science to Promote Citizen Engagement with Green ...
 
HunchLab 2.0 Preview Webinar - Place
HunchLab 2.0 Preview Webinar - PlaceHunchLab 2.0 Preview Webinar - Place
HunchLab 2.0 Preview Webinar - Place
 
Five Technology Trends Every Nonprofit Needs to Know
Five Technology Trends Every Nonprofit Needs to KnowFive Technology Trends Every Nonprofit Needs to Know
Five Technology Trends Every Nonprofit Needs to Know
 
PhillyHistory.org - Tracking Metrics for a Digital Project
PhillyHistory.org - Tracking Metrics for a Digital ProjectPhillyHistory.org - Tracking Metrics for a Digital Project
PhillyHistory.org - Tracking Metrics for a Digital Project
 
Fed Geo Day - Applying GeoTrellis at the US Army Corps
Fed Geo Day - Applying GeoTrellis at the US Army CorpsFed Geo Day - Applying GeoTrellis at the US Army Corps
Fed Geo Day - Applying GeoTrellis at the US Army Corps
 
Fed Geo Day - GeoTrellis Intro
Fed Geo Day - GeoTrellis IntroFed Geo Day - GeoTrellis Intro
Fed Geo Day - GeoTrellis Intro
 
Fed Geo Day 2013 - Azavea Intro
Fed Geo Day 2013 - Azavea Intro Fed Geo Day 2013 - Azavea Intro
Fed Geo Day 2013 - Azavea Intro
 
Modeling Count-based Raster Data with ArcGIS and R
Modeling Count-based Raster Data with ArcGIS and RModeling Count-based Raster Data with ArcGIS and R
Modeling Count-based Raster Data with ArcGIS and R
 
OpenTreeMap NCGIS
OpenTreeMap NCGISOpenTreeMap NCGIS
OpenTreeMap NCGIS
 
OpenTreeMap Overview
OpenTreeMap OverviewOpenTreeMap Overview
OpenTreeMap Overview
 

Recently uploaded

FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
Ralf Eggert
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
Product School
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Product School
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
DianaGray10
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
Product School
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Product School
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Inflectra
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
Fwdays
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Product School
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
Bhaskar Mitra
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
CatarinaPereira64715
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 

Recently uploaded (20)

FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 

Data Philly Meetup - Big (Geo) Data

  • 1. Big (Geo) Data Science Robert Cheetham cheetham@azavea.com @rcheetham
  • 3. B Corporation • Projects w/ Social Value • Summer of Maps • Pro Bono Program • Donate share of profits Research-Driven • 10% Research Program • Academic Collaborations • Open Source
  • 4. Spatial Temporal Forecasting with Philadelphia Crime Data
  • 5. How Phila PD uses Maps Customized Map Products Weekly CompStat Meetings Web Crime Analysis
  • 6. INCT & PARS – main database sources: over 5,000 incidents daily, over 2 million annually. [Diagram labels: Complainant; Verizon 911; 911 Operator; Radio Dispatcher; CAD; Police Officer; Incident Report Completed by Officer; District 48 Desk; INCT; PARS; Daily download & Geocoding Routines; Maps distributed through Intranet, Printing, CompStat; District X, District Y, District Z]
  • 7. The Context 1,500,000 people 7,000 police 1,000 civilian employees 2,000,000 new incidents / year 3 crime analysts
  • 8. What we did • Weekly CompStat • Lots of maps • Automation of map creation • Web-based systems
  • 9. … but what if we could…  Accelerate the cycle  Proactively notify  Automate the process
  • 10. Prototype VB & MapObjects ArcView .ini file Process Documentation Shapefiles and GRIDs MS SQL Server Crime Incidents Database
  • 12. … but there was a problem …
  • 15. We needed …. 1. Better Statistics 2. Notification 3. Simplicity
  • 17. Crime Analysis – What has happened? – Mapping (spatial / temporal densities) – Trending – Intelligence Dashboard Early Warning – What is out of the ordinary? – Statistical & Threshold-based Hunches (data mining) – Alerting Risk Forecasting – What is likely to happen next? – Near Repeat Pattern – Load Forecasting
  • 18. Crime Analysis – Mapping (spatial / temporal densities) – Trending – Intelligence Dashboard Early Warning – Statistical & Threshold-based Hunches (data mining) – Alerting Risk Forecasting – Near Repeat Pattern – Load Forecasting
  • 23. Early Warning • Geographic Early Warning System – A system to alert staff of an unusual situation in a particular location – Ingests data sets to automatically “cook on” and only involves staff when a statistically unusual situation is found [Diagram labels: Operational Databases; HunchLab Database; Geostatistical Engine; Alerting System]
  • 25. What is a Hunch? • A proposed hypothesis, saved into the system, and continually tested for validity • Incident Attribute Requirements – Location (x, y) – Time (timestamp) – Classification • Hunch Attributes – Location (area) – Time (recent / historic periods) – Classification • Analyses – Statistical Hunch – Threshold Hunch
  • 26. Hunch Parameters: Location • Address & Radius • Precinct/County/Country • Custom Drawn Area • Mass Hunch
  • 27. Hunch Parameters: Time • Statistical Hunch – Recent Past – Historic Past
  • 28. Hunch Parameters: Classification • Category • Time of Day • Narrative
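The hunch described in slides 25 to 28 can be sketched as a small data structure plus two tests. This is only an illustration under stated assumptions: the class names, the circular "Address & Radius" area, the integer day counter, and the z-score comparison of the recent window against equally sized historic windows are all hypothetical choices, not the method HunchLab actually uses.

```python
from dataclasses import dataclass
from math import hypot
from statistics import mean, pstdev


@dataclass
class Incident:
    x: float
    y: float
    day: int          # days since an arbitrary start date (assumed time unit)
    category: str


@dataclass
class Hunch:
    center_x: float   # "Address & Radius" style location (assumption)
    center_y: float
    radius: float
    category: str
    recent_days: int    # length of the recent past
    historic_days: int  # length of the historic past before it

    def matches(self, incident):
        distance = hypot(incident.x - self.center_x, incident.y - self.center_y)
        return distance <= self.radius and incident.category == self.category

    def count(self, incidents, start_day, end_day):
        # Matching incidents with start_day <= day < end_day.
        return sum(1 for i in incidents
                   if self.matches(i) and start_day <= i.day < end_day)


def threshold_hunch(hunch, incidents, today, threshold):
    """True when the recent count reaches a fixed threshold."""
    recent = hunch.count(incidents, today - hunch.recent_days, today)
    return recent >= threshold


def statistical_hunch(hunch, incidents, today, z_cutoff=2.0):
    """True when the recent count is unusually high versus historic windows."""
    recent_start = today - hunch.recent_days
    historic_start = recent_start - hunch.historic_days
    recent = hunch.count(incidents, recent_start, today)
    baseline = []
    end = recent_start
    # Cut the historic period into windows as long as the recent period.
    while end - hunch.recent_days >= historic_start:
        baseline.append(hunch.count(incidents, end - hunch.recent_days, end))
        end -= hunch.recent_days
    if not baseline:
        raise ValueError("historic period must cover at least one recent-length window")
    mu, sigma = mean(baseline), pstdev(baseline)
    if sigma == 0:
        return recent > mu
    return (recent - mu) / sigma >= z_cutoff
```

Because a hunch is saved and "continually tested", a scheduler could call both functions each time new incidents are loaded and raise an alert only when one of them returns True.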
  • 35. Contagious Crime? • Near repeat pattern analysis • “If one burglary occurs, how does the risk change nearby?”
  • 36. What Do We Mean By Near Repeat? • Repeat victimization – Incident at the same location at a later time (likely related) • Near repeat victimization – Incident at a nearby location at a later time (likely related) • Incident A (place, time) --> Incident B (place, time)
  • 37. Near Repeat Pattern Analysis • The goal: – Quantify short term risk due to near-repeat victimization • “If one burglary occurs, how does the risk of burglary for the neighbors change?” • What we know: – Incident A (place, time) --> Incident B (place, time) • Distance between A and B • Timeframe between A and B • What we need to know: – What distances/timeframes are not simply random?
  • 38. Near Repeat Pattern Analysis • The process – Observe the pattern in historic data – Simulate the pattern in randomized historic data – Compare the observed pattern to the simulated patterns – Apply the non-random pattern to new incidents • An example – 180 days of burglaries in Division 6 of Philadelphia
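The observe / simulate / compare steps on slide 38 can be sketched as a Monte Carlo permutation test. The sketch assumes one particular randomization, shuffling the timestamps across the fixed locations (the idea behind the Knox test), and a single distance/time band; the function names and parameters are hypothetical, and the slides do not say exactly how the historic data was randomized.

```python
import random
from math import hypot


def count_close_pairs(points, times, max_dist, max_days):
    """Count incident pairs that are close in both space and time."""
    count = 0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dist = hypot(points[i][0] - points[j][0], points[i][1] - points[j][1])
            if dist <= max_dist and abs(times[i] - times[j]) <= max_days:
                count += 1
    return count


def near_repeat_test(points, times, max_dist, max_days, n_sim=199, seed=0):
    """Return (observed pairs, observed/expected ratio, pseudo p-value)."""
    rng = random.Random(seed)
    observed = count_close_pairs(points, times, max_dist, max_days)
    shuffled = list(times)
    total = 0
    at_least_as_many = 0
    for _ in range(n_sim):
        # Shuffling times breaks any space-time link but keeps both marginals.
        rng.shuffle(shuffled)
        simulated = count_close_pairs(points, shuffled, max_dist, max_days)
        total += simulated
        if simulated >= observed:
            at_least_as_many += 1
    expected = total / n_sim
    ratio = observed / expected if expected > 0 else float("inf")
    p_value = (at_least_as_many + 1) / (n_sim + 1)
    return observed, ratio, p_value
```

A ratio well above 1 with a small p-value suggests that the band is "not simply random"; repeating the test over several distance and time bands would give the pattern that is then applied to new incidents.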
  • 43. Near Repeat Pattern Analysis • How can you test your own data? – Near Repeat Calculator • http://www.temple.edu/cj/misc/nr/ • Papers – Near-Repeat Patterns in Philadelphia Shootings (2008) • One city block & two weeks after one shooting – 33% increase in likelihood of a second event Jerry Ratcliffe Temple University
  • 46. Improving CompStat • Workload forecasting • “Given the time of year, day of week, time of day and general trend, what counts of crimes should I expect?”
  • 47. What Do We Mean By Load Forecasting? • Workload forecasting • Generating aggregate crime counts for a future timeframe using cyclical time series analysis Measure cyclical patterns + Identify non-cyclical trend Forecast expected count bit.ly/gorrcrimeforecastingpaper
  • 48. Load Forecasting • Measure cyclical patterns • Take historic incidents (for example: last five years) • Generate multiplicative seasonal indices – For each time cycle: » time of year » day of week » time of day – Count incidents within each time unit (for example: Monday) – Calculate average per time unit if incidents were evenly distributed – Divide counts within each time unit by the calculated average to generate multiplicative indices » Index ~ 1 means at the average » Index > 1 means above average » Index < 1 means below average
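The index calculation on slide 48 can be written in a few lines. This is a minimal sketch that assumes every time unit covers the same amount of calendar time (true for days of the week over whole weeks, only approximately true for months); the function name and the weekday labels are illustrative.

```python
from collections import Counter


def seasonal_indices(incident_units, all_units):
    """Multiplicative index per time unit: unit count / evenly distributed count."""
    counts = Counter(incident_units)
    total = sum(counts[u] for u in all_units)
    if total == 0:
        raise ValueError("no incidents to build indices from")
    even_share = total / len(all_units)   # average per unit if evenly spread
    return {u: counts[u] / even_share for u in all_units}


weekdays = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
# One weekday label per historic incident (made-up illustration data).
labels = ["Mon", "Tue", "Wed", "Thu", "Thu", "Fri", "Fri", "Fri",
          "Sat", "Sat", "Sat", "Sat", "Sun", "Sun"]
indices = seasonal_indices(labels, weekdays)
```

With these made-up labels the even share is 2 incidents per weekday, so Saturday gets an index of 2.0 (above average), Thursday 1.0 (at the average) and Monday 0.5 (below average). The same function can be applied to time of year and time of day.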
  • 53. Load Forecasting • Identify non-cyclical trend • Take recent daily counts (for example: last year daily counts) • Remove cyclical trends by dividing by indices • Run a trending function on the new counts – Simple average » Last X Days – Smoothing function » Exponential smoothing » Holt’s linear exponential smoothing
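The trend step on slide 53 can be sketched as follows: divide each daily count by its seasonal index, then run one of the listed trending functions. The smoothing parameters `alpha` and `beta` and the initialization of Holt's method (first value as level, first difference as trend) are common textbook choices assumed here, not values taken from the slides.

```python
def deseasonalize(daily_counts, daily_indices):
    """Remove cyclical effects by dividing each count by its seasonal index."""
    return [count / index for count, index in zip(daily_counts, daily_indices)]


def simple_average(series, last_n):
    """Average of the last `last_n` values."""
    window = series[-last_n:]
    return sum(window) / len(window)


def exponential_smoothing(series, alpha):
    """Simple exponential smoothing; returns the final smoothed level."""
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


def holt_linear(series, alpha, beta):
    """Holt's linear exponential smoothing; returns (level, trend)."""
    level = series[0]
    trend = series[1] - series[0]
    for value in series[1:]:
        previous_level = level
        level = alpha * value + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return level, trend
```

The simple average and exponential smoothing return only a level, so they project a flat line; Holt's method also returns a per-day trend that can be projected forward.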
  • 54. Load Forecasting • Forecast expected count • Project trend into future timeframe – Always flat » Simple average » Exponential smoothing – Linear trend » Holt’s linear exponential smoothing • Multiply by seasonal indices to reseasonalize the data
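The projection and reseasonalizing step on slide 54 can be sketched as below. The function name and arguments are hypothetical; `level` and `trend` stand for whatever the chosen trending function produced on the deseasonalized counts, with `trend` set to 0 for the "always flat" methods.

```python
def forecast_counts(level, trend, future_indices):
    """Project the deseasonalized trend forward, then multiply by seasonal indices.

    future_indices[k] is the combined seasonal index of the (k+1)-th future period.
    """
    return [(level + trend * step) * index
            for step, index in enumerate(future_indices, start=1)]


# Flat projection (simple average or exponential smoothing): trend is 0.
flat_forecast = forecast_counts(level=10.0, trend=0.0, future_indices=[0.5, 2.0])

# Linear projection (Holt): the level grows by `trend` each period.
linear_forecast = forecast_counts(level=16.0, trend=2.0, future_indices=[1.0, 1.0])
```

When several cycles are in use (time of year, day of week, time of day), one option is to multiply their indices together to get the combined index for each future period; that combination rule is an assumption here.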
  • 55. Load Forecasting Measure cyclical patterns + Identify non-cyclical trend Forecast expected count bit.ly/gorrcrimeforecastingpaper
  • 57. How Do We Know It’s Accurate? • Testing • Generated forecasting techniques (examples) – Commonly Used » Average of last 30 days » Average of last 365 days » Last year’s count for the same time period – Advanced Combinations » Different cyclical indices (example: day of year vs. month of year) » Different levels of geographic aggregation for indices » Different trending functions • Scoring methodologies (examples) – Mean absolute percent error (with some enhancements) – Mean percent error – Mean squared error • Run thousands of forecasts through testing framework • Choose the right technique in the right situation
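The three scoring methodologies named on slide 57 can be sketched as plain functions. The slide does not describe its "enhancements" to mean absolute percent error, so the one used here (skipping periods whose actual count is zero, where a percent error is undefined) is an assumption, as is the sign convention of the mean percent error.

```python
def mean_absolute_percent_error(actual, forecast):
    """MAPE in percent; periods with an actual count of zero are skipped (assumption)."""
    errors = [abs(a - f) / abs(a) for a, f in zip(actual, forecast) if a != 0]
    return 100.0 * sum(errors) / len(errors)


def mean_percent_error(actual, forecast):
    """MPE in percent, signed as (actual - forecast) / actual; shows bias direction."""
    errors = [(a - f) / a for a, f in zip(actual, forecast) if a != 0]
    return 100.0 * sum(errors) / len(errors)


def mean_squared_error(actual, forecast):
    """MSE; penalizes large misses more heavily than small ones."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def best_technique(actual, forecasts_by_name, score=mean_absolute_percent_error):
    """Pick the technique name whose forecasts score lowest against the actuals."""
    return min(forecasts_by_name, key=lambda name: score(actual, forecasts_by_name[name]))
```

Running `best_technique` separately for each crime type and geography is one way to "choose the right technique in the right situation"; with a signed score such as the mean percent error, the absolute value should be compared instead.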
  • 59. Research Topics • Risk Forecasting – Load forecasting enhancements • Weather and special events – Combining short and long term risk forecasts (Temple) • Socioeconomic changes in neighborhoods – Risk Terrain Modeling (Rutgers) • Context of crime at the microplace
  • 61. Research Topics • Risk Forecasting – Offender Management • Prioritize offenders based upon statistical models using past behaviors • Evaluation – Automate Randomized Controlled Trials
  • 62. Data Processing for Big (Geo) Data
  • 64. Robert’s Rules of Housing Close to Center City  somewhat important Walk to Grocery Store  vital Nearby Restaurants  very important Library  nice to have Near a Park  somewhat important Biking / walking distance from our work  very important Biking distance to fencing  somewhat important
  • 65. Your factors might include…  Child Care  Local School Rankings  Farmer's Market  Car Share  Public Transit
  • 66. We stand on the shoulders of giants
  • 67. Not a new idea … Design with Nature
  • 68. Not a new idea … Dana Tomlin
  • 70. Weighted Overlay + + + x5 x1 x3 x2 =
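The weighted overlay on slide 70 is a cell-by-cell weighted sum of aligned raster layers. A minimal sketch follows, using the x5, x1, x3 and x2 weights shown on the slide; the tiny grids, their meaning, and the function name are made up for illustration, and the layers are assumed to share the same extent and cell size.

```python
def weighted_overlay(layers, weights):
    """Cell-by-cell weighted sum of equally sized grids (lists of rows)."""
    rows, cols = len(layers[0]), len(layers[0][0])
    for layer in layers:
        if len(layer) != rows or any(len(row) != cols for row in layer):
            raise ValueError("all layers must have the same dimensions")
    return [[sum(w * layer[r][c] for layer, w in zip(layers, weights))
             for c in range(cols)]
            for r in range(rows)]


# Four 1 x 2 factor layers with 1 = "factor present", 0 = "absent" (made-up data).
layers = [[[1, 0]], [[1, 1]], [[0, 1]], [[1, 0]]]
scores = weighted_overlay(layers, weights=[5, 1, 3, 2])
```

Each user can supply their own weights, which is why the result has to be recomputed per request rather than precomputed once.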
  • 71. Summary Geography-driven Decisions Iterative Individual Web [and Mobile] Growing data sets
  • 73. Web is different from the Desktop  Lots of simultaneous users  Stateless environment  HTML+JS+CSS  Users are less skilled  Users are less patient
  • 74. But wait … there’s a problem  10 – 60 second calculation time  Multiple simultaneous users …  … that are impatient
  • 76. Big Data – Social Media
  • 77. Big Data – Science
  • 78. Big Data – Citizen Science
  • 79. Big Data – Cities
  • 82. Specific Optimization Goals  New Raster File Structure  Distributed processing  Binary messaging protocol
  • 83. Optimization: File Format  Limit data type and range  1D arrays are fast to read/write  Tiled  Pyramids  Azavea Raster Grid (ARG)
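The file-format ideas on slide 83 (a limited data type, a 1D array, and tiles) can be illustrated with Python's `array` module. This is not the Azavea Raster Grid specification; the class, the signed-byte cell type and the tiling method are assumptions chosen only to show the row-major indexing idea.

```python
from array import array


class FlatRaster:
    """Raster stored as one row-major 1D array of signed bytes (-128..127)."""

    def __init__(self, cols, rows, fill=0):
        self.cols = cols
        self.rows = rows
        self.cells = array("b", [fill]) * (cols * rows)

    def get(self, col, row):
        return self.cells[row * self.cols + col]

    def set(self, col, row, value):
        self.cells[row * self.cols + col] = value

    def tile(self, tile_col, tile_row, tile_size):
        """Copy one square tile (clipped at the raster edge) into a new raster."""
        col0, row0 = tile_col * tile_size, tile_row * tile_size
        width = min(tile_size, self.cols - col0)
        height = min(tile_size, self.rows - row0)
        out = FlatRaster(width, height)
        for r in range(height):
            start = (row0 + r) * self.cols + col0
            # Each tile row is one contiguous slice of the 1D array.
            out.cells[r * width:(r + 1) * width] = self.cells[start:start + width]
        return out

    def to_bytes(self):
        """One byte per cell, ready to write to disk or send over the wire."""
        return self.cells.tobytes()
```

Restricting cells to one byte keeps reads and writes to a single contiguous copy, and tiles let a worker load only the part of the raster it needs.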
  • 84. Optimization: Distributed Processing  Parallelizable - Local Ops and Focal Ops  Support multiple – Threads – Cores – CPUs – Machines  Considered – Hadoop – Amazon Map Reduce – Beowulf
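Local operations are easy to parallelize because each output cell depends only on the cells at the same position in the inputs, so a raster can be cut into chunks that are processed independently. The sketch below shows that decomposition with a thread pool from the standard library; the function names are hypothetical, and in CPython the threads illustrate the structure rather than deliver a real speedup for pure-Python arithmetic. Focal operations split the same way, except that each chunk also needs a border of neighboring cells.

```python
from concurrent.futures import ThreadPoolExecutor


def local_add_chunk(a, b, start, stop):
    """Local op on one chunk: each output cell uses only the same cell of a and b."""
    return [a[i] + b[i] for i in range(start, stop)]


def parallel_local_add(a, b, workers=4):
    """Split two equally sized flat rasters into chunks and add them in parallel."""
    if len(a) != len(b):
        raise ValueError("rasters must have the same number of cells")
    n = len(a)
    chunk = max(1, -(-n // workers))          # ceiling division
    bounds = [(s, min(s + chunk, n)) for s in range(0, n, chunk)]
    result = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields chunk results in submission order, so cells stay aligned.
        for part in pool.map(lambda se: local_add_chunk(a, b, se[0], se[1]), bounds):
            result.extend(part)
    return result
```

The same chunk boundaries could be handed to processes or machines instead of threads, which is the step the slide's "Cores, CPUs, Machines" list points at.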
  • 85. Success!! Reduced from 10-60 seconds to <500 milliseconds
  • 86. Optimizing one process sub-optimizes others  Complex to configure and maintain  Limited to one operation  No interpolation  No mixing – cell sizes – extents – projections  etc.
  • 88.  Broader set of functionality  Both raster and vector  Scala + Akka  Open source
  • 101. Regional/State: 84 ms National: 84 ms Large Country: 115 ms Continental: 271 ms Planet: 1.2 – 2.0 s
  • 103. GPUs
  • 105. GPU Results  Re-wrote a few Map Algebra operations:  Local  Neighborhood  Zonal  Viewshed  etc.  15 – 120x  Large grids  Large kernels
  • 106. New Spatial Operations Vector Neighborhood/Focal Spatial Statistics Integration
  • 108. Crime Analysis, Early Warning and Forecasting
  • 109. Open Source Geoprocessing  GDAL  GeoServer  PostGIS  R  GeoDa
  • 110. Many Thanks! © Photo used with permission from Alphafish, via Flickr.com
  • 111. Big (Geo) Data Science [We are hiring] Robert Cheetham cheetham@azavea.com @rcheetham