Big Data and Geospatial with HPCC Systems®
Powered by LexisNexis Risk Solutions
Ignacio Calvo
Greg McRandal
10/05/2016
Concepts in Geospatial
How to use them with HPCC
Use cases
@HPCCSystems
Definition
An approach to applying statistical analysis and other analytic techniques to data that has a geographical or spatial aspect.
Origin of Geospatial
John Snow’s original map (1854), using GIS to save lives. This map was used to determine that cholera was water-borne.
Understanding the data
Need to know:
• Format
• Projection / coordinate system
Formats: Vector vs Raster
[Figure panels: Vector, Raster]
What is a projection?
Projections are used to represent the world in ways we can process
• The Earth is round and maps are flat
• Physical maps
• Computer maps
Have I seen projections before?
• Gall-Peters vs Mercator vs Winkel tripel
• GPS (latitude/longitude)
• Google Maps
Projections
Two different projections representing the same place.
Projections to know about
WGS84
• Latitude and longitude
• Our best approximation of the world
• Not always the best for a specific region
• Not technically a projection
Mercator
• Many different ones, choose one based on your location
• Reduces the area it covers to a simple Cartesian plane
• Good near the central axis, bad far away from it:
  • Web Mercator covers the whole world – good near the equator, gets worse as you travel north or south
  • Irish National Grid – very good for Ireland, awful anywhere else.
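The "good near the equator, gets worse as you travel north or south" behaviour can be seen directly in the Web Mercator formulas. The following is a minimal sketch, assuming the standard spherical Web Mercator definition (EPSG:3857); the function names are illustrative and not part of HPCC Systems.

```python
import math

# Sphere radius used by Web Mercator (equal to the WGS84 semi-major axis), in metres.
EARTH_RADIUS_M = 6378137.0


def wgs84_to_web_mercator(lon_deg, lat_deg):
    """Project longitude/latitude in degrees to Web Mercator metres (x, y)."""
    lon = math.radians(lon_deg)
    lat = math.radians(lat_deg)
    x = EARTH_RADIUS_M * lon
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + lat / 2))
    return x, y


def mercator_scale_factor(lat_deg):
    """How much lengths are stretched at a latitude (1.0 at the equator)."""
    return 1.0 / math.cos(math.radians(lat_deg))


if __name__ == "__main__":
    for lat in (0, 30, 53, 60, 80):
        print(lat, round(mercator_scale_factor(lat), 3))
```

The scale factor grows with distance from the equator, which is why areas far north or south look larger than they are.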
Lies, damned lies, statistics… and maps!
*https://twitter.com/flashboy/status/641221733509373952
Projection woes:
A straight line in Mercator is not a straight line in WGS84
[Figure labels: Four points converted to WGS84; Where the lines should be]
Don’t re-project polygons!
This “solution” is only good enough for visuals, not for maths.
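A small numeric check makes the "straight line" problem concrete. This is a sketch under the same spherical Web Mercator assumption as EPSG:3857; the two sample points are arbitrary and the helper names are illustrative.

```python
import math

R = 6378137.0  # sphere radius used by Web Mercator, in metres


def to_mercator(lon_deg, lat_deg):
    """Longitude/latitude in degrees -> Web Mercator metres."""
    x = R * math.radians(lon_deg)
    y = R * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


def to_wgs84(x, y):
    """Web Mercator metres -> longitude/latitude in degrees."""
    lon = math.degrees(x / R)
    lat = math.degrees(2 * math.atan(math.exp(y / R)) - math.pi / 2)
    return lon, lat


def midpoint(p, q):
    """Midpoint of a straight segment in whatever plane p and q live in."""
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)


# Two arbitrary end points given as (longitude, latitude).
a = (-10.0, 50.0)
b = (10.0, 60.0)

# Midpoint of the segment drawn straight in longitude/latitude.
mid_lonlat = midpoint(a, b)

# Midpoint of the segment drawn straight in Mercator, converted back.
mid_via_mercator = to_wgs84(*midpoint(to_mercator(*a), to_mercator(*b)))

latitude_gap_deg = mid_via_mercator[1] - mid_lonlat[1]

if __name__ == "__main__":
    print(mid_lonlat, mid_via_mercator, latitude_gap_deg)
```

The end points agree in both systems, but the middle of the edge does not: for these two points the gap is roughly 0.3 degrees of latitude, so converting only the corner points moves the edge itself.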
Visuals don’t agree with maths: Wind and Hail.
[Figure panels: Web Mercator, WGS84]
Number one bug in Geospatial
*http://twcc.fr
Latitude = Y
Longitude = X
Remember: LatY, LonX
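One way to guard against swapped axes is to make the conversion explicit and range-check it. The sketch below is illustrative only; the function names are hypothetical.

```python
def looks_like_lat_lon(first, second):
    """True if (first, second) is a valid (latitude, longitude) pair in degrees."""
    return -90.0 <= first <= 90.0 and -180.0 <= second <= 180.0


def to_xy(lat, lon):
    """Return (x, y) = (longitude, latitude): LonX, LatY."""
    if not looks_like_lat_lon(lat, lon):
        raise ValueError(f"not a valid latitude/longitude pair: {lat}, {lon}")
    return lon, lat


if __name__ == "__main__":
    print(to_xy(53.3, -6.2))                # a point near Dublin -> (-6.2, 53.3)
    print(looks_like_lat_lon(120.0, 10.0))  # 120 cannot be a latitude
```

A range check only catches swaps where one value falls outside ±90. A swapped pair such as (-6.2, 53.3) still passes, so the axis order of each data source has to be known up front.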
Now I understand my data, what’s next?
Data → Ingest → Index → Query
Bringing Geospatial into HPCC
GOAL
Bring our geospatial processes
into the realm of Big Data
STEPS
Extend HPCC and ECL to support the following main capabilities:
• Spatial filtering of vector geometries
• Spatial operations using vector geometries
• Spatial reference projection and transformation
• Reading of compressed geo-raster files
Integration of open source libraries
Ingesting Vector Data
It’s a CSV file.
Id | Name           | Geometry                          | Projection | Value
1  | Alice’s place  | POINT (53.78925462 -6.08354321)   | 4326*      | €5,973,000
2  | Bob’s place    | POINT (-34.78925462 7.08354321)   | 4326       | €872,000
3  | Celine’s place | POINT (102.78925462 -6.08354321)  | 4326       | €9,324,000
* WGS84 (Lat/Lon)
1. Policy data
2. Geocode address
3. Peril tag
Data ready to ingest
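Reading such a CSV file comes down to splitting the columns and parsing the geometry text. The sketch below uses only the Python standard library and handles only `POINT` geometries; the sample rows follow the table above, while the record field names (`coords`, `srid`, `value_eur`) are assumptions made for illustration. In HPCC Systems this step would be done in ECL rather than Python.

```python
import csv
import io
import re

# Matches "POINT (a b)" and captures the two numbers in the order written.
POINT_RE = re.compile(
    r"^\s*POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)\s*$"
)

SAMPLE = """Id,Name,Geometry,Projection,Value
1,Alice's place,POINT (53.78925462 -6.08354321),4326,"€5,973,000"
2,Bob's place,POINT (-34.78925462 7.08354321),4326,"€872,000"
"""


def parse_rows(text):
    """Turn CSV text into a list of dictionaries with typed fields."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        match = POINT_RE.match(rec["Geometry"])
        if match is None:
            raise ValueError(f"unsupported geometry: {rec['Geometry']!r}")
        rows.append({
            "id": int(rec["Id"]),
            "name": rec["Name"],
            # Coordinates are kept in the order they appear in the file.
            "coords": (float(match.group(1)), float(match.group(2))),
            "srid": int(rec["Projection"]),
            "value_eur": int(rec["Value"].replace("€", "").replace(",", "")),
        })
    return rows


if __name__ == "__main__":
    for row in parse_rows(SAMPLE):
        print(row)
```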
Ingesting Vector Data
It’s a GML / XML file.
1. Shape data
2. Parse XPATH
3. Process and index
Data ready to query
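The "Parse XPATH" step can be sketched with Python's standard XML library. The `gml:` elements and namespace below are standard GML; the wrapping `zones` / `zone` elements and the sample coordinates are made up for illustration, and real files will have their own layout.

```python
import xml.etree.ElementTree as ET

GML_NS = {"gml": "http://www.opengis.net/gml"}

# Hypothetical sample document: one named zone with one polygon ring.
SAMPLE_GML = """<zones xmlns:gml="http://www.opengis.net/gml">
  <zone name="flood-a">
    <gml:Polygon>
      <gml:exterior>
        <gml:LinearRing>
          <gml:posList>0 0 0 10 10 10 10 0 0 0</gml:posList>
        </gml:LinearRing>
      </gml:exterior>
    </gml:Polygon>
  </zone>
</zones>"""


def extract_rings(xml_text):
    """Return (zone name, list of coordinate pairs) for every zone."""
    root = ET.fromstring(xml_text)
    result = []
    for zone in root.findall("zone"):
        # XPath-style lookup of the coordinate list anywhere below the zone.
        pos_list = zone.find(".//gml:posList", GML_NS)
        numbers = [float(token) for token in pos_list.text.split()]
        ring = list(zip(numbers[0::2], numbers[1::2]))
        result.append((zone.get("name"), ring))
    return result


if __name__ == "__main__":
    print(extract_rings(SAMPLE_GML))
```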
Indexing vector data
• Outline box: the biggest rectangle
• Boxes contain boxes
• The bottom boxes in the tree contain the actual geometries
• Here, 3 levels pictured
• Boxes can overlap (each entry is in only one box)
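The boxes in such a tree are plain bounding rectangles: a leaf box encloses its geometries and a parent box encloses its child boxes. A minimal sketch, with boxes stored as `(min_x, min_y, max_x, max_y)` tuples (a representation assumed here for illustration):

```python
def bbox_of_points(points):
    """Smallest axis-aligned box around a list of (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def bbox_union(boxes):
    """Smallest box that contains every box in the list."""
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


if __name__ == "__main__":
    leaf_a = bbox_of_points([(1, 1), (3, 2), (2, 4)])
    leaf_b = bbox_of_points([(6, 5), (8, 9)])
    print(leaf_a, leaf_b, bbox_union([leaf_a, leaf_b]))
```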
Querying vector data
Searching an R-Tree: e.g. Finding all buildings (points) inside a flood zone (polygon)
1. Does the query polygon overlap our box?
   N: Return empty list
   Y: Go to step 2
2. Is it a leaf node?
   Y: Return all nodes for verification
   N: Search our box’s children
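The search above can be sketched in a few lines. This is an illustration, not the HPCC Systems implementation: the tree is built by hand, the query polygon is represented by its bounding box during the tree walk (a simplification), and the final verification uses a basic ray-casting point-in-polygon test.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    box: tuple                                     # (min_x, min_y, max_x, max_y)
    children: list = field(default_factory=list)   # child nodes (internal node)
    entries: list = field(default_factory=list)    # (name, (x, y)) pairs (leaf)


def boxes_overlap(a, b):
    """True if two axis-aligned boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(node, query_box):
    """Return candidate entries whose leaf box overlaps the query box."""
    if not boxes_overlap(node.box, query_box):
        return []                       # no overlap: empty list
    if not node.children:
        return list(node.entries)       # leaf: return everything for verification
    found = []
    for child in node.children:         # otherwise search the children
        found.extend(search(child, query_box))
    return found


def point_in_polygon(point, ring):
    """Ray-casting test; ring is a list of (x, y) vertices, not closed."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def find_points_in_polygon(root, ring):
    """Tree search for candidates, then exact verification."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    query_box = (min(xs), min(ys), max(xs), max(ys))
    return [name for name, pt in search(root, query_box)
            if point_in_polygon(pt, ring)]


# Hand-built example: three leaves under one root, a triangular flood zone.
ROOT = Node(box=(1, 1, 30, 30), children=[
    Node(box=(1, 1, 3, 3), entries=[("a", (1, 1)), ("b", (3, 3))]),
    Node(box=(8, 8, 9, 9), entries=[("c", (8, 8))]),
    Node(box=(20, 20, 30, 30), entries=[("d", (25, 25))]),
])
FLOOD_ZONE = [(0, 0), (10, 0), (0, 10)]

if __name__ == "__main__":
    print(find_points_in_polygon(ROOT, FLOOD_ZONE))
```

In this example the tree walk returns three candidates and the verification step discards the one that lies inside the query's bounding box but outside the triangle. That is why the flowchart returns nodes "for verification" rather than as final answers.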
Ingesting Raster Data
It’s a raster / TIFF file. Bitmap image
1. Raster data
2. Tile and spray
3. Process and index
Data ready to query
Tiling divides raster images into
small manageable areas of known
dimensions.
These tiles have their own
metadata:
• Bounding box
• Grid position
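The tile metadata can be derived from the raster's size and georeferencing alone. The sketch below assumes a north-up raster with square pixels whose origin is the top-left corner; the parameter names and the dictionary layout are illustrative.

```python
def make_tiles(width_px, height_px, tile_size, origin_x, origin_y, pixel_size):
    """Split a raster into tiles and return grid position and bounding box of each.

    origin_x / origin_y are the map coordinates of the top-left corner;
    y decreases as the pixel row increases (north-up raster).
    """
    tiles = []
    for row in range(0, height_px, tile_size):
        for col in range(0, width_px, tile_size):
            # Tiles on the right and bottom edges may be smaller.
            w = min(tile_size, width_px - col)
            h = min(tile_size, height_px - row)
            min_x = origin_x + col * pixel_size
            max_y = origin_y - row * pixel_size
            tiles.append({
                "grid": (col // tile_size, row // tile_size),
                "bbox": (min_x, max_y - h * pixel_size,
                         min_x + w * pixel_size, max_y),
            })
    return tiles


if __name__ == "__main__":
    for tile in make_tiles(500, 300, 256, 0.0, 300.0, 1.0):
        print(tile)
```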
1. Figure out which grid position the
geometry needs
2. Extract the required pixel
3. Interrogate the pixel for its value
4. Interpret its value
5. Return to user
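The five steps above can be sketched as a lookup over tiled data. The tiles here are small in-memory lists and the legend mapping raw pixel values to meanings is invented for the example; the raster is assumed to be north-up with square pixels and a top-left origin.

```python
def locate_pixel(x, y, origin_x, origin_y, pixel_size, tile_size):
    """Step 1: map a coordinate to (grid position, pixel offset inside the tile)."""
    col = int((x - origin_x) // pixel_size)
    row = int((origin_y - y) // pixel_size)
    grid = (col // tile_size, row // tile_size)
    offset = (col % tile_size, row % tile_size)
    return grid, offset


def query_raster(tiles, legend, x, y, origin_x, origin_y, pixel_size, tile_size):
    """Steps 2 to 5: fetch the pixel, read its value, interpret it, return it."""
    grid, (px, py) = locate_pixel(x, y, origin_x, origin_y, pixel_size, tile_size)
    tile = tiles.get(grid)
    if tile is None:
        return None                   # coordinate falls outside the raster
    raw_value = tile[py][px]          # rows first, then columns
    return legend.get(raw_value, "unknown")


# Hypothetical 4x2 pixel raster split into two 2x2 tiles, keyed by grid position.
TILES = {
    (0, 0): [[0, 1],
             [1, 2]],
    (1, 0): [[2, 2],
             [0, 0]],
}
LEGEND = {0: "none", 1: "low", 2: "high"}

if __name__ == "__main__":
    print(query_raster(TILES, LEGEND, 2.5, 3.5, 0.0, 4.0, 1.0, 2))
    print(query_raster(TILES, LEGEND, 0.5, 2.5, 0.0, 4.0, 1.0, 2))
```

Because the grid position is computed arithmetically, only the one tile that holds the pixel has to be read.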
Bringing it all together
*Andrew Farrell, In pursuit of perils: Geo-spatial risk analysis through HPCC Systems
https://hpccsystems.com/resources/blog/afarrell/pursuit-perils-geo-spatial-risk-analysis-through-hpcc-systems
Add even more value
Why Geospatial with HPCC?
• Efficient parallel processing
• Ability to import libraries from different languages
• Good coverage of functions and spatial predicates
• Fast ingestion
• Support for different formats
• Sub-second queries
hpccsystems.com
