Heatmaps & Geohash:
          Integration of Multi-Source Geospatial Data




Abe Usher, CIO
abe@thehumangeo.com
703.955.1540
@abeusher
                                         AKA Heatmaps are the Heat




INFORMATION INTO INSIGHT
Our Menu of Subtopics




A LITTLE HISTORY
 Why geospatial data?

WHY HEATMAPS? GEOHASH?
 Big data requires new approaches.

DATA GONE WILD
 A new organizing construct for information analysis.

KITCHEN MODEL
 New ways to combine internal data with external new media for
 maximum insight.

SPECIFIC EXAMPLES
 Concrete take-aways.

EASTER EGG
 Treats for making it through another presentation.




                                                                                                                2
What’s in it for you?




 Make custom heatmaps
 Three iron laws of geospatial data
  integration and analysis


                                                   4
Whoami?




People   Data   Beverages




                            5
Geohash




Geohash is a coordinate
transformation

that facilitates combining two
variables (latitude and longitude)
into a single (text) variable

that represents a bounding-box
containing the point of interest.
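
A minimal Python sketch of the transformation described above, assuming the standard public geohash scheme (alternating longitude/latitude bisection, five bits per base-32 character). The function name geohash_encode and the default precision of 5 are illustrative choices, not something the slides specify.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision=5):
    """Return the geohash of (lat, lon) with `precision` characters.

    Illustrative helper (name and default are assumptions): each character
    packs five bits, and bits alternate between longitude and latitude,
    halving the bounding box every time.
    """
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits, bit_count = 0, 0
    use_lon = True  # the first bit always refines longitude
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half of the box
        else:
            bits <<= 1
            rng[1] = mid  # keep the lower half of the box
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


# First Entity #1 point from the Rule #2 table later in the deck.
print(geohash_encode(40.998946, 28.9232))  # sxk94
```

The result is lowercase, which is the usual geohash convention; the Rule #2 tables write the same cells in uppercase. A longer precision yields a smaller bounding box, and every shorter prefix of a geohash names a larger box that contains it.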




                                               6
Heatmap


“Discrete & continuous methods of kernel density
estimation”




                                                   7
A rose by any other name


“Discrete & continuous methods of kernel density
estimation”




      Gaussian
      Quartic
      Exponential
      Triangular
      Uniform
      Epanechnikov
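
To make the kernel names above concrete, here is a rough Python sketch of each one as a weight function of normalized distance u = distance / bandwidth, plus a brute-force discrete density grid. Normalizing constants are deliberately omitted (a heatmap is usually rescaled to a color ramp anyway), and the names KERNELS, density_grid and bandwidth are hypothetical, not taken from any tool mentioned in this deck.

```python
import math

# Unnormalized kernel shapes; u is distance divided by bandwidth.
KERNELS = {
    "gaussian": lambda u: math.exp(-0.5 * u * u),
    "quartic": lambda u: (1 - u * u) ** 2 if abs(u) <= 1 else 0.0,
    "exponential": lambda u: math.exp(-abs(u)),
    "triangular": lambda u: 1 - abs(u) if abs(u) <= 1 else 0.0,
    "uniform": lambda u: 1.0 if abs(u) <= 1 else 0.0,
    "epanechnikov": lambda u: 1 - u * u if abs(u) <= 1 else 0.0,
}


def density_grid(points, width, height, bandwidth, kernel="quartic"):
    """Sum kernel weights from every point into a width x height grid.

    Points and bandwidth are in grid (pixel) units. This is an O(points *
    cells) sketch meant for small inputs only.
    """
    weight = KERNELS[kernel]
    grid = [[0.0] * width for _ in range(height)]
    for px, py in points:
        for y in range(height):
            for x in range(width):
                u = math.hypot(x - px, y - py) / bandwidth
                grid[y][x] += weight(u)
    return grid


# Two nearby points reinforce each other in the cells between them.
grid = density_grid([(2, 2), (3, 2)], width=6, height=5, bandwidth=2.0)
for row in grid:
    print(" ".join(f"{v:4.2f}" for v in row))
```

Kernels with bounded support (quartic, triangular, uniform, Epanechnikov) drop to zero beyond one bandwidth, while the Gaussian and exponential kernels never quite reach zero.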


                                                    8
A rose by any other name


“Discrete methods of kernel density estimation”




                                                    9
About


Big Data                 (Digital) Human Geography


  Predictive models of social drift &
   human behavior
  Streaming media analytics
  Micro-demographics




We’re hiring! info@thehumangeo.com
                                                       10
Why Heatmaps and Geohash?


 Too much data




                                         11
Why Heatmaps and Geohash?


 Too much data
 Trust in Internet data




                                            12
Why Heatmaps and Geohash?


 Too much data
 Trust in Internet data
 Heatmaps look cool




                                            13
Why Heatmaps and Geohash?


   Too much data
   Trust in Internet data
   Heatmaps look cool
   Geohash helps quantify
    data




                                             14
Why Heatmaps and Geohash?


                             Visual summary
   Too much data
   Trust in Internet data
   Heatmaps look cool
   Geohash helps quantify
    data
                     Quantitative methods




                                               15
Trust and Internet Information




Tracy Marrow aka “Ice-T”


                                                            16
Trust and Internet Information


                              “Game knows
                              game, baby.”




Tracy Marrow aka “Ice-T”


                                                            17
Trust and Internet Information


                              “If you have expert
                              knowledge, then
                              you are capable of
                              recognizing expert
                              knowledge.”

                              [paraphrased]

Tracy Marrow aka “Ice-T”


                                                            18
Trust and Internet Information




Can we actually trust this Internet stuff?


                                                19
Salami Slicing




  Salami slicing: a series of minor observations that add up to
  a larger observation that would be difficult to make all at once.
http://en.wikipedia.org/wiki/Salami_slicing
                                                                 23
Seven Layer GLT


1.    OpenStreetMap data
2.    Flickr
3.    Panoramio
4.    Geonames.org
5.    Twitter
6.    Wikimapia
7.    Foursquare

* GLT: Geospatial Lattice of Trusted Data
                                                         24
Seven Layer GLT


1.   OpenStreetMap data
2.   Flickr
3.   Panoramio
4.   Geonames.org
5.   Twitter
6.   Wikimapia
7.   Foursquare

Spatial Temporal User Finds From the Field (STUFFF)

                                                             25
Rule #1: Think in terms of aggregation


                                                   Twitter data




                                             Panoramio Tourist photos




                                                  Classified data
Twitter geohash ez420 – coffee shop
Panoramio geohash ez420 – Starbucks
Classified geohash ez420 – Wi-Fi




                                           Trust through aggregation




                                                                        26
Rule #1: Think in terms of aggregation


                                                      Twitter data
Twitter geohash ez420 – coffee shop
Panoramio geohash ez420 – Starbucks
Classified geohash ez420 – Wi-Fi
                                                Panoramio Tourist photos




Geohash creates simple string variables.             Classified data



Matching strings = super easy
Matching similar coordinates = impossible
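
Because the geohash is just a string, combining sources reduces to grouping on a dictionary key. The sketch below uses the three ez420 records from this slide plus one hypothetical extra record, and counts how many distinct sources report something in each cell. That count is only a simple stand-in for "trust through aggregation"; the variable names are illustrative.

```python
from collections import defaultdict

# (source, geohash, label) records; the first three come from the slide.
observations = [
    ("Twitter", "ez420", "coffee shop"),
    ("Panoramio", "ez420", "Starbucks"),
    ("Classified", "ez420", "Wifi"),
    ("Twitter", "ez421", "bus stop"),  # hypothetical record for contrast
]

# Group every observation under its geohash cell with a plain string key.
by_cell = defaultdict(list)
for source, cell, label in observations:
    by_cell[cell].append((source, label))

# Count independent sources per cell as a crude corroboration score.
corroboration = {
    cell: len({source for source, _ in records})
    for cell, records in by_cell.items()
}

for cell, records in by_cell.items():
    labels = ", ".join(label for _, label in records)
    print(cell, corroboration[cell], labels)
```

Doing the same join on raw latitude/longitude pairs would need a distance threshold and a spatial index; the string key avoids both at the cost of missing matches that straddle a cell boundary.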



                                              Trust through aggregation




                                                                           27
Rule #1: Think in terms of aggregation


                                                    Twitter data




                                              Panoramio Tourist photos



Use geohash to apply collaborative
filtering techniques to develop new
models of trust & data confidence.

                                                   Classified data




                                            Trust through aggregation




                                                                         28
Rule #2: Selectively throw away precision

Entity #1
   Latitude   Longitude
  40.998946    28.9232
  41.005164   28.973668
  41.018765   29.016412
  41.062268   29.030145

Entity #2
   Latitude   Longitude
  40.999100   28.92111
  41.018112   28.973991
  41.018880   29.016902
  41.062110   29.030122


                                                         29
Rule #2: Selectively throw away precision

Entity #1
   Latitude   Longitude    Geohash
  40.998946    28.9232      SXK94
  41.005164   28.973668     SXK97
  41.018765   29.016412     SXK9K
  41.062268   29.030145     SXK9S




                                                            30
Rule #2: Selectively throw away precision

Entity #1
   Latitude   Longitude   Geohash
  40.998946    28.9232     SXK94
  41.005164   28.973668    SXK97
  41.018765   29.016412    SXK9K
  41.062268   29.030145    SXK9S

Entity #2
   Latitude   Longitude   Geohash
  40.999100   28.92111     SXK94
  41.018112   28.973991    SXK97
  41.018880   29.016902    SXK9K
  41.062110   29.030122    SXK9S
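
A short Python check of what the tables above show, using the rows exactly as listed: no pair of raw coordinates is equal, yet every pair falls in the same five-character geohash cell. The variable names are illustrative, and the last line assumes only the general geohash property that a shorter prefix names a larger enclosing cell.

```python
# (latitude, longitude, geohash) rows copied from the tables above.
entity_1 = [
    (40.998946, 28.9232, "SXK94"),
    (41.005164, 28.973668, "SXK97"),
    (41.018765, 29.016412, "SXK9K"),
    (41.062268, 29.030145, "SXK9S"),
]
entity_2 = [
    (40.999100, 28.92111, "SXK94"),
    (41.018112, 28.973991, "SXK97"),
    (41.018880, 29.016902, "SXK9K"),
    (41.062110, 29.030122, "SXK9S"),
]

# Exact coordinate equality: how many paired rows match?
raw_matches = sum(a[:2] == b[:2] for a, b in zip(entity_1, entity_2))

# Geohash equality at five characters: how many paired rows match?
cell_matches = sum(a[2] == b[2] for a, b in zip(entity_1, entity_2))

# Throwing away one more character merges all eight points into one cell.
coarse_cells = {row[2][:4] for row in entity_1 + entity_2}

print(raw_matches, cell_matches, coarse_cells)
```

Choosing how many characters to keep is the "selective" part: too few and unrelated entities collide, too many and nearby observations stop matching.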


                                                         31
Kitchen Model for
  Spatial Analysis




                     32
Kitchen Model for
         Spatial Analysis



Chef




                            33
Kitchen Model for
                       Spatial Analysis



Chef   Ingredients




                                          34
Kitchen Model for
                             Spatial Analysis



Chef   Ingredients   Utensils




                                                35
Kitchen Model for
                             Spatial Analysis



Chef   Ingredients   Utensils      Presentation




                                                  36
Types of Heatmaps


Turnkey
 GeoCommons
 SpatialKey
 MapBox/TileMill
 ArcGIS Desktop
 QGIS

Custom
 Python
 R
 JavaScript                            38
Heatmap: Recipe One

“OSM Style”

Get Python http://python.org

Get the sethoscope library
http://www.sethoscope.net/heatmap/

Get data
http://bit.ly/geotweet_sc
https://dev.twitter.com/docs/streaming-api/methods#locations

Command line:
heatmap.py -g portland.gpx -o output.png --height 800 --osm


                                                              40
41
Heatmap: Recipe One

Stitch it together in an MP4 movie!


Get the CLI app:
http://ffmpeg.org/



Command line:

heatmap.py -g portland.gpx -o output.png --height 800 --osm -a

ffmpeg -i frame-%05d.png OSM_is_awesome.mp4



                                                                42
Heatmap: Recipe One


Props to

Seth Golub from Google
http://www.sethoscope.net/




                                                   43
Rule #3:
Beware of population effects




              http://xkcd.com/1138/
                                      44
Rule #3:
                                    Beware of population effects



Normalized value = Absolute value / Population estimate




                                                                   45
Rule #3:
Beware of population effects


[Figure: absolute values by area: 34, 72, 52, 22]


                               46
Rule #3:
                                    Beware of population effects


Normalized value = Absolute value / Population estimate

   Absolute value   Population estimate
         34                2,000
         72               16,000
         52               25,000
         22                2,000
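
Working the formula through with the four count and population pairs on this slide gives a different ranking from the raw counts. In the sketch below the region labels A to D and the per-1,000 scaling are assumptions added for readability; the slide gives only the numbers.

```python
# label: (absolute value, population estimate); labels are hypothetical.
regions = {
    "A": (34, 2_000),
    "B": (72, 16_000),
    "C": (52, 25_000),
    "D": (22, 2_000),
}

# Normalized value = absolute value / population, expressed per 1,000 people.
per_1000 = {
    name: count * 1000 / population
    for name, (count, population) in regions.items()
}

by_count = sorted(regions, key=lambda name: regions[name][0], reverse=True)
by_rate = sorted(per_1000, key=per_1000.get, reverse=True)

for name in by_rate:
    print(f"{name}: {regions[name][0]:>3} events, {per_1000[name]:.2f} per 1,000")
print("ranked by count:", by_count)
print("ranked by rate: ", by_rate)
```

The area with the largest raw count (72) has under a third of the rate of the area with 34 events, which is the population effect the rule warns about.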
                                                                   47
Conclusion

1. Think in terms of aggregation
2. Selectively throw away precision
3. Beware of population effects




                                                   48
Contact Us




HumanGeo NY                                   HumanGeo DC
1221 Avenue of the Americas                   2500 Wilson Boulevard
Suite 4200 | New York, NY 10020               Suite 310 | Arlington, VA 22201




info@thehumangeo.com              |   703.955.1540       |     www.thehumangeo.com


                                                                                        49
Easter Egg




http://bit.ly/geotweet_sc                50

Geohash: Integration of Disparate Geospatial Data

  • 1. Heatmaps & Geohash: Integration of Multi-Source Geospatial Data Abe Usher, CIO abe@thehumangeo.com 703.955.1540 @abeusher AKA Heatmaps are the Heat INFORMATION INTO INSIGHT
  • 2. Our Menu of Subtopics A LITTLE HISTORY WHY HEATMAPS? GEOHASH? DATA GONE WILD Big data requires new approaches. A new organizing construct for Why geospatial data? information analysis. KITCHEN MODEL SPECIFIC EXAMPLES EASTER EGG New ways to combine internal data Concrete take-aways. Treats for making it through another with external new media for presentation. maximum insight. 2
  • 3. What’s in it for you? 3
  • 4. What’s in it for you?  Make custom heatmaps  Three iron laws of geospatial data integration and analysis 4
  • 5. Whoami? People Data Beverages 5
  • 6. Geohash Geohash is a coordinate transformation that facilitates combining two variables (latitude and longitude) into a single(text) variable that represents a bounding-box containing the point of interest. 6
  • 7. Heatmap: “Discrete & continuous methods of kernel density estimation”
  • 8. A rose by any other name: “Discrete & continuous methods of kernel density estimation.” Kernels: Gaussian, Quartic, Exponential, Triangular, Uniform, Epanechnikov
  • 9. A rose by any other name: “Discrete methods of kernel density estimation”
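For readers who want the kernels named on slide 8 in concrete form, the sketch below writes each one as a function of the scaled distance u = (x - x_i) / h and plugs it into a one-dimensional kernel density estimate. The formulas are the common textbook definitions (the exponential kernel has more than one convention; the Laplace form is assumed here), and the names `KERNELS` and `kde`, along with the sample values, are illustrative rather than taken from the talk.

```python
import math

# Common textbook kernel definitions as functions of the scaled distance u.
KERNELS = {
    "gaussian":     lambda u: math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi),
    "quartic":      lambda u: (15 / 16) * (1 - u * u) ** 2 if abs(u) <= 1 else 0.0,
    "exponential":  lambda u: 0.5 * math.exp(-abs(u)),  # assumed Laplace form
    "triangular":   lambda u: 1 - abs(u) if abs(u) <= 1 else 0.0,
    "uniform":      lambda u: 0.5 if abs(u) <= 1 else 0.0,
    "epanechnikov": lambda u: 0.75 * (1 - u * u) if abs(u) <= 1 else 0.0,
}


def kde(x, samples, bandwidth, kernel="gaussian"):
    """Kernel density estimate at x: (1 / (n * h)) * sum of K((x - x_i) / h)."""
    k = KERNELS[kernel]
    total = sum(k((x - xi) / bandwidth) for xi in samples)
    return total / (len(samples) * bandwidth)


samples = [1.0, 1.2, 1.9, 4.0]  # hypothetical 1-D observations
for name in KERNELS:
    print(name, round(kde(1.5, samples, bandwidth=1.0, kernel=name), 4))
```

A discrete heatmap can be viewed as the same idea with counts accumulated per grid cell (for example, per geohash) instead of a kernel centred on each evaluation point.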
  • 10. About: Big Data (Digital) Human Geography. Predictive models of social drift & human behavior; Streaming media analytics; Micro-demographics. We’re hiring! info@thehumangeo.com
  • 11. Why Heatmaps and Geohash? Too much data
  • 12. Why Heatmaps and Geohash? Too much data; Trust in Internet data
  • 13. Why Heatmaps and Geohash? Too much data; Trust in Internet data; Heatmaps look cool
  • 14. Why Heatmaps and Geohash? Too much data; Trust in Internet data; Heatmaps look cool; Geohash helps quantify data
  • 15. Why Heatmaps and Geohash? Visual summary. Too much data; Trust in Internet data; Heatmaps look cool; Geohash helps quantify data. Quantitative methods.
  • 16. Trust and Internet Information: Tracy Marrow aka “Ice-T”
  • 17. Trust and Internet Information: “Game knows game, baby.” Tracy Marrow aka “Ice-T”
  • 18. Trust and Internet Information: “If you have expert knowledge, then you are capable of recognizing expert knowledge.” [paraphrased] Tracy Marrow aka “Ice-T”
  • 19. Trust and Internet Information: Can we actually trust this Internet stuff?
  • 20. Trust and Internet Information
  • 21. Trust and Internet Information
  • 22. Trust and Internet Information
  • 23. Salami Slicing: a series of minor observations that add up to a larger observation which would be difficult to perform all at once. http://en.wikipedia.org/wiki/Salami_slicing
  • 24. Seven Layer GLT*: 1. OpenStreetMap data; 2. Flickr; 3. Panoramio; 4. Geonames.org; 5. Twitter; 6. Wikimapia; 7. 4Square. * Geospatial Lattice of Trusted Data
  • 25. Seven Layer GLT: 1. OpenStreetMap data; 2. Flickr; 3. Panoramio; 4. Geonames.org; 5. Twitter; 6. Wikimapia; 7. 4Square. Spatial Temporal User Finds From the Field (STUFFF)
  • 26. Rule #1: Think in terms of aggregation. Sources: Twitter data; Panoramio tourist photos; Classified data. Twitter geohash ez420: coffee shop; Panoramio geohash ez420: Starbucks; Classified geohash ez420: Wifi. Trust through aggregation.
  • 27. Rule #1: Think in terms of aggregation. Sources: Twitter data; Panoramio tourist photos; Classified data. Twitter geohash ez420: coffee shop; Panoramio geohash ez420: Starbucks; Classified geohash ez420: Wifi. Geohash creates simple string variables. Matching strings = super easy. Matching similar coordinates = impossible. Trust through aggregation.
  • 28. Rule #1: Think in terms of aggregation. Sources: Twitter data; Panoramio tourist photos; Classified data. Use geohash to apply collaborative filtering techniques to develop new models of trust & data confidence. Trust through aggregation.
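The string-matching idea behind Rule #1 can be sketched as a group-by on the geohash key. The `ez420` rows below mirror slide 26; the `ez421` row, the source names as written, and the use of a simple source count as a confidence score are assumptions for illustration. This is a much simpler stand-in for the collaborative filtering that slide 28 mentions.

```python
from collections import defaultdict

# Hypothetical observations: (source, geohash, label). The ez420 rows mirror
# the slide; the ez421 row is invented to show an uncorroborated cell.
observations = [
    ("twitter",    "ez420", "coffee shop"),
    ("panoramio",  "ez420", "Starbucks"),
    ("classified", "ez420", "Wifi"),
    ("twitter",    "ez421", "parking lot"),
]


def aggregate_by_cell(rows):
    """Group labels by geohash cell and record which sources reported each."""
    cells = defaultdict(lambda: {"sources": set(), "labels": []})
    for source, cell, label in rows:
        cells[cell]["sources"].add(source)
        cells[cell]["labels"].append(label)
    return cells


cells = aggregate_by_cell(observations)
for cell, info in sorted(cells.items()):
    # The number of distinct sources serves as a crude confidence score.
    print(cell, len(info["sources"]), info["labels"])
```

Because the join key is an exact string, no distance threshold or spatial index is needed to line up the three sources.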
  • 29. Rule #2: Selectively throw away precision
      Entity #1
        Latitude     Longitude
        40.998946    28.9232
        41.005164    28.973668
        41.018765    29.016412
        41.062268    29.030145
      Entity #2
        Latitude     Longitude
        40.999100    28.92111
        41.018112    28.973991
        41.018880    29.016902
        41.062110    29.030122
  • 30. Rule #2: Selectively throw away precision
      Entity #1
        Latitude     Longitude    Geohash
        40.998946    28.9232      SXK94
        41.005164    28.973668    SXK97
        41.018765    29.016412    SXK9K
        41.062268    29.030145    SXK9S
  • 31. Rule #2: Selectively throw away precision
      Entity #1
        Latitude     Longitude    Geohash
        40.998946    28.9232      SXK94
        41.005164    28.973668    SXK97
        41.018765    29.016412    SXK9K
        41.062268    29.030145    SXK9S
      Entity #2
        Latitude     Longitude    Geohash
        40.999100    28.92111     SXK94
        41.018112    28.973991    SXK97
        41.018880    29.016902    SXK9K
        41.062110    29.030122    SXK9S
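The effect of Rule #2 can be checked directly on the slide 31 rows. The sketch below compares the two entities position by position, first on raw coordinates and then on the five-character geohash; the helper name `matches` is hypothetical, and the geohashes are lowercased, which is the more common convention.

```python
# Rows copied from slide 31: (latitude, longitude, geohash).
entity_1 = [
    (40.998946, 28.9232,   "sxk94"),
    (41.005164, 28.973668, "sxk97"),
    (41.018765, 29.016412, "sxk9k"),
    (41.062268, 29.030145, "sxk9s"),
]
entity_2 = [
    (40.999100, 28.92111,  "sxk94"),
    (41.018112, 28.973991, "sxk97"),
    (41.018880, 29.016902, "sxk9k"),
    (41.062110, 29.030122, "sxk9s"),
]


def matches(track_a, track_b, key):
    """Count positions where both tracks agree under the given key function."""
    return sum(key(a) == key(b) for a, b in zip(track_a, track_b))


exact = matches(entity_1, entity_2, key=lambda row: (row[0], row[1]))
by_cell = matches(entity_1, entity_2, key=lambda row: row[2])
# Dropping one more character widens every cell; all rows share one prefix.
coarse_cells = {row[2][:4] for row in entity_1 + entity_2}

print(exact, by_cell, coarse_cells)  # prints: 0 4 {'sxk9'}
```

No coordinate pair matches exactly, yet all four positions fall in the same cells, and truncating to four characters collapses both tracks into a single cell. Choosing the precision is therefore a choice of how much detail to discard.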
  • 32. Kitchen Model for Spatial Analysis
  • 33. Kitchen Model for Spatial Analysis: Chef
  • 34. Kitchen Model for Spatial Analysis: Chef, Ingredients
  • 35. Kitchen Model for Spatial Analysis: Chef, Ingredients, Utensils
  • 36. Kitchen Model for Spatial Analysis: Chef, Ingredients, Utensils, Presentation
  • 37. Kitchen Model for Spatial Analysis: Chef, Ingredients, Utensils, Presentation
  • 38. Types of Heatmaps. Turnkey: GeoCommons, SpatialKey, MapBox/TileMill, ArcGIS Desktop, QGIS. Custom: Python, R, JavaScript
  • 39. Types of Heatmaps. Turnkey: GeoCommons, SpatialKey, MapBox/TileMill, ArcGIS Desktop, QGIS. Custom: Python, R, JavaScript
  • 40. Heatmap: Recipe One, “OSM Style”. Get Python: http://python.org. Get the sethoscope library: http://www.sethoscope.net/heatmap/. Get data: http://bit.ly/geotweet_sc and https://dev.twitter.com/docs/streaming-api/methods#locations. Command line: heatmap.py -g portland.gpx -o output.png --height 800 --osm
  • 41.
  • 42. Heatmap: Recipe One. Stitch it together in an MP4 movie! Get the CLI app: http://ffmpeg.org/. Command line (two steps): heatmap.py -g portland.gpx -o output.png --height 800 --osm -a, then ffmpeg -i frame-%05d.png OSM_is_awesome.mp4
  • 43. Heatmap: Recipe One. Props to Seth Golub from Google: http://www.sethoscope.net/
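Recipe One assumes the points already exist as a GPX file (`portland.gpx` on slide 40). As a hedged sketch of that preparation step, the code below turns a list of (latitude, longitude) pairs into a GPX 1.1 track using only the Python standard library. The function name `points_to_gpx`, the `creator` value, and the sample coordinates are hypothetical, and it is an assumption here that heatmap.py reads the `trkpt` elements of such a file.

```python
import xml.etree.ElementTree as ET

GPX_NS = "http://www.topografix.com/GPX/1/1"


def points_to_gpx(points, creator="example-script"):
    """Build a GPX 1.1 document with one track holding the (lat, lon) points."""
    gpx = ET.Element("gpx", {"version": "1.1", "creator": creator, "xmlns": GPX_NS})
    trk = ET.SubElement(gpx, "trk")
    seg = ET.SubElement(trk, "trkseg")
    for lat, lon in points:
        # Each point becomes a <trkpt lat="..." lon="..."/> element.
        ET.SubElement(seg, "trkpt", {"lat": f"{lat:.6f}", "lon": f"{lon:.6f}"})
    return ET.tostring(gpx, encoding="unicode")


# Hypothetical points near Portland, Oregon (not data from the talk).
points = [(45.5231, -122.6765), (45.5202, -122.6742), (45.5155, -122.6793)]
gpx_text = points_to_gpx(points)
print(gpx_text)
```

Saving the printed text as `portland.gpx` gives a file in the shape that the `-g` option in the slide 40 command expects, under the assumption stated above.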
  • 44. Rule #3: Beware of population effects. http://xkcd.com/1138/
  • 45. Rule #3: Beware of population effects. Normalized value = Absolute value / Population estimate
  • 46. Rule #3: Beware of population effects. Absolute values: 34, 72, 52, 22
  • 47. Rule #3: Beware of population effects. Normalized value = Absolute value / Population estimate. Values: 34 / 2,000; 72 / 16,000; 52 / 25,000; 22 / 2,000
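The normalization on slides 45 to 47 is a single division, but it changes the ranking. The sketch below applies it to the slide 47 numbers; pairing each count with a population by slide order is an assumption, and the region labels A to D and the per-1,000 scaling are hypothetical choices for readability.

```python
# Counts and population estimates as paired on slide 47 (pairing by slide
# order is an assumption; the region names are hypothetical labels).
regions = {
    "A": (34, 2_000),
    "B": (72, 16_000),
    "C": (52, 25_000),
    "D": (22, 2_000),
}


def normalized(count, population, per=1_000):
    """Normalized value = absolute value / population estimate, scaled per `per` people."""
    return count / population * per


by_absolute = sorted(regions, key=lambda r: regions[r][0], reverse=True)
by_rate = sorted(regions, key=lambda r: normalized(*regions[r]), reverse=True)

for name in by_rate:
    count, pop = regions[name]
    print(name, count, pop, round(normalized(count, pop), 2))
print("absolute ranking:", by_absolute)
print("normalized ranking:", by_rate)
```

On raw counts the region with 72 observations leads; per 1,000 people, the region with 34 observations and a population of 2,000 leads instead, which is the population effect the xkcd comic on slide 44 pokes fun at.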
  • 48. Conclusion: 1. Think in terms of aggregation; 2. Selectively throw away precision; 3. Beware of population effects
  • 49. Contact Us. HumanGeo NY: 1221 Avenue of the Americas, Suite 4200 | New York, NY 10020. HumanGeo DC: 2500 Wilson Boulevard, Suite 310 | Arlington, VA 22201. info@thehumangeo.com | 703.955.1540 | www.thehumangeo.com