SlideShare a Scribd company logo
OSM data in MariaDB/MySQL
All the world in a few large tables
Well, almost ...

Hartmut Holzgraefe
SkySQL AB hartmut@skysql.com

February 1, 2014

Hartmut Holzgraefe (SkySQL)

OSM data in MariaDB/MySQL

February 1, 2014

1 / 30
Overview

1

MySQL, MariaDB and GIS

2

OpenStreetMap

3

osm2pgsql

4

Examples

5

The End ...

Hartmut Holzgraefe (SkySQL)

OSM data in MariaDB/MySQL

February 1, 2014

2 / 30
MySQL, MariaDB and GIS

1

MySQL, MariaDB and GIS
History
Current Status
Roadmap

2

OpenStreetMap

3

osm2pgsql

4

Examples

5

The End ...

Hartmut Holzgraefe (SkySQL)

OSM data in MariaDB/MySQL

February 1, 2014

3 / 30
History

First appeard in MySQL 4.1 (2004)
... with MBR relations only
Lab Release adds true spatial relations (200?)
MariaDB 5.3 GA with true spatial relations (2011)
MySQL 5.6 GA with true spatial relations (2013)
... to be continued ...

Hartmut Holzgraefe (SkySQL)

OSM data in MariaDB/MySQL

February 1, 2014

4 / 30
MBR is not enough

MBR CONTAINS()

Hartmut Holzgraefe (SkySQL)

True ST CONTAINS()

OSM data in MariaDB/MySQL

February 1, 2014

5 / 30
Current Status

Spatial relations work
... but the world is still flat (no projections)
OK for many use cases
... but be aware of gotchas like DISTANCE()
GEOMETRY types in all storage engines
... but only MyISAM has SPATIAL indexes

Hartmut Holzgraefe (SkySQL)

OSM data in MariaDB/MySQL

February 1, 2014

6 / 30
MariaDB 10.1 Roadmap

System Tables (spatial ref sys, geometry columns, ...)
Precision math calculations and storage
Coordinate transformations / projections
3rd coordinate (e.g. for altitude)
all spatial functions required by OGC
spatial aware optimizer

Hartmut Holzgraefe (SkySQL)

OSM data in MariaDB/MySQL

February 1, 2014

7 / 30
... and beyond

SPATIAL indexes in other storage engines
3D calculations
client side support for GIS transformations

Hartmut Holzgraefe (SkySQL)

OSM data in MariaDB/MySQL

February 1, 2014

8 / 30
Openstreetmap
1

MySQL, MariaDB and GIS

2

OpenStreetMap
Intro
Core Data Model
Data Access
Data Import

3

osm2pgsql

4

Examples

5

The End ...
Hartmut Holzgraefe (SkySQL)

OSM data in MariaDB/MySQL

February 1, 2014

9 / 30
OpenStreetMap History

founded in 2004 by Steve
Coast
data under open license
(CC-BY-SA first, now
ODBL)
1.5 million contributors
2 billion map nodes
200 million ways
2 million relations
almost 4 billion GPX points

Hartmut Holzgraefe (SkySQL)

OSM data in MariaDB/MySQL

February 1, 2014

10 / 30
Pretty Tiles

Hartmut Holzgraefe (SkySQL)

OSM data in MariaDB/MySQL

February 1, 2014

11 / 30
... and raw data

Raw map data can be used for
other things, too:
for routing
for coverage checks
for flight simulators
for science

Hartmut Holzgraefe (SkySQL)

OSM data in MariaDB/MySQL

February 1, 2014

12 / 30
Core Data Model

Just three simple things
Nodes (Points)
Ways
Relations

Hartmut Holzgraefe (SkySQL)

OSM data in MariaDB/MySQL

February 1, 2014

13 / 30
Nodes

Nodes describe a single point at a specific location using:
A numeric ID
Object version, Timestamp of last change, User
Node coordinates
Node attributes as key/value pairs

Hartmut Holzgraefe (SkySQL)

OSM data in MariaDB/MySQL

February 1, 2014

14 / 30
Ways

Ways form an open or closed line by connecting nodes, using:
A numeric ID
Object version, Timestamp of last change, User
An ordered list of node IDs
Way attributes as key/value pairs

Hartmut Holzgraefe (SkySQL)

OSM data in MariaDB/MySQL

February 1, 2014

15 / 30
Relations

Relations bundle objects to describe more complex relations, using:
A numeric ID
Object version, Timestamp of last change, User
Ordered lists of member nodes, ways and sub-relations
Optional member roles
Attributes as key/value pairs

Hartmut Holzgraefe (SkySQL)

OSM data in MariaDB/MySQL

February 1, 2014

16 / 30
Data Access

The main database is not exposed directly
Only one central instance, accessible via “The API”
API meant for editor applications only
Full data export once a week (“the planet”)
Plus daily, hourly, minutely diffs
Two file formats for planets:
.osm XML based, usually bz2 compressed (32GB packed, 400GB
unpacked)
.pbf compact binary format based on Google ProtoBuf (23GB)
Regional extracts available by 3rd parties, e.g. GeoFabrik.de

Hartmut Holzgraefe (SkySQL)

OSM data in MariaDB/MySQL

February 1, 2014

17 / 30
Data Import

The raw data is not really suitable for most purposes, esp. rendering
Several import/preprocessing tools provide more convenient schemas,
e.g. by
... only extracting certain attributes
... making a difference beween ways and areas
... resolving relations into simpler objects
Besides osm2pgsql that I’m about to talk about in a minute there are also
impOsm, ...

Hartmut Holzgraefe (SkySQL)

OSM data in MariaDB/MySQL

February 1, 2014

18 / 30
osm2pgsql
1

MySQL, MariaDB and GIS

2

OpenStreetMap

3

osm2pgsql
Block Diagram
Data Model Again
Adding MySQL Support
Performance

4

Examples

5

The End ...
Hartmut Holzgraefe (SkySQL)

OSM data in MariaDB/MySQL

February 1, 2014

19 / 30
osm2pgsql

osm2pgsql is a tool
written in C
with a small C++ part now
reads OSM data
preprocesses it
stores results in relational tables
originally in PostGIS only

Hartmut Holzgraefe (SkySQL)

OSM data in MariaDB/MySQL

February 1, 2014

20 / 30
Block Diagram

Figure : osm2pgsql block diagram
Hartmut Holzgraefe (SkySQL)

OSM data in MariaDB/MySQL

February 1, 2014

21 / 30
Data Model Again

prefix point for single node POIs
prefix line for linear 2D objects like roads, rivers, power lines ...
prefix roads a subset of the above, optimizer for rendering
prefix polygon objects covering an area: buildings, landuse,
administrative borders ...

Hartmut Holzgraefe (SkySQL)

OSM data in MariaDB/MySQL

February 1, 2014

22 / 30
Adding MySQL Support

turned out to be more tricky than thought
some core parts directly called PostgreSQL functions
a lot of general functionality was hidden in PostgreSQL specific
modules
MySQL output module works, could be faster though
MySQL middle layer is “code complete” ...
... but crashes while processing relations :(
so for now imports are limited by RAM size

Hartmut Holzgraefe (SkySQL)

OSM data in MariaDB/MySQL

February 1, 2014

23 / 30
Import performance

imports currently take about 4-5 times as long
... as we have no direct equivalent to COPY
osm2pgsql at less than 50% CPU only
... so switching to async API would be a 2x win already
... with multi-insert even more so
index building is faster ...
... but may not be once we get I/O bound
tables on disk are of similar size

Hartmut Holzgraefe (SkySQL)

OSM data in MariaDB/MySQL

February 1, 2014

24 / 30
Query performance
select
from
join
on
where
and

count(*)
nrw_point n
nrw_polygon p
st_contains(p.way,n.way)
p.name = ’Bielefeld’
n.amenity=’post_box’;

Data Set
Germany
Northrhine-Westfalia
same with better indexes

Hartmut Holzgraefe (SkySQL)

MySQL 5.5 (MBR)

MariaDB 5.5

PostGIS

15.8s
2.7s
0.2s

16.5s
3.1s
0.2s

?
6.1s
.04s

OSM data in MariaDB/MySQL

February 1, 2014

25 / 30
Examples

1

MySQL, MariaDB and GIS

2

OpenStreetMap

3

osm2pgsql

4

Examples

5

The End ...

Hartmut Holzgraefe (SkySQL)

OSM data in MariaDB/MySQL

February 1, 2014

26 / 30
The End ...

1

MySQL, MariaDB and GIS

2

OpenStreetMap

3

osm2pgsql

4

Examples

5

The End ...

Hartmut Holzgraefe (SkySQL)

OSM data in MariaDB/MySQL

February 1, 2014

27 / 30
References

Contact hartmut@skysql.com
MariaDB GIS https://mariadb.com/kb/en/gis-functionality/
MySQL GIS http://dev.mysql.com/doc/refman/5.6/en/spatialextensions.html
OpenStreetMap http://openstreetmap.org/
MapCompare http://mc.bbbike.org/mc/
RiverMap http://www.kompf.de/gps/rivermap.html
osm2pgsql https://wiki.openstreetmap.org/wiki/Osm2pgsql
My Code https://github.com/hholzgra/osm2pgsql/tree/devel-mysql
Table Files http://php-groupies.de/gis-data/ (soon)

Hartmut Holzgraefe (SkySQL)

OSM data in MariaDB/MySQL

February 1, 2014

28 / 30
Questions!

Hartmut Holzgraefe (SkySQL)

OSM data in MariaDB/MySQL

February 1, 2014

29 / 30
The End?
Or just the beginning?

Hartmut Holzgraefe (SkySQL)

OSM data in MariaDB/MySQL

February 1, 2014

30 / 30

More Related Content

What's hot

The State of the GeoServer project
The State of the GeoServer projectThe State of the GeoServer project
The State of the GeoServer project
GeoSolutions
 
What's new in GeoServer 2.2
What's new in GeoServer 2.2What's new in GeoServer 2.2
What's new in GeoServer 2.2
GeoSolutions
 
Using GeoServer for spatio-temporal data management with examples for MetOc a...
Using GeoServer for spatio-temporal data management with examples for MetOc a...Using GeoServer for spatio-temporal data management with examples for MetOc a...
Using GeoServer for spatio-temporal data management with examples for MetOc a...
GeoSolutions
 
Intro To PostGIS
Intro To PostGISIntro To PostGIS
Intro To PostGIS
mleslie
 
Why is postgis awesome?
Why is postgis awesome?Why is postgis awesome?
Why is postgis awesome?
Kasper Van Lombeek
 
G-Store: High-Performance Graph Store for Trillion-Edge Processing
G-Store: High-Performance Graph Store for Trillion-Edge ProcessingG-Store: High-Performance Graph Store for Trillion-Edge Processing
G-Store: High-Performance Graph Store for Trillion-Edge Processing
Pradeep Kumar
 
Performance features12102 doag_2014
Performance features12102 doag_2014Performance features12102 doag_2014
Performance features12102 doag_2014
Trivadis
 
State of GeoServer 2.14
State of GeoServer 2.14State of GeoServer 2.14
State of GeoServer 2.14
Jody Garnett
 
Introduction To PostGIS
Introduction To PostGISIntroduction To PostGIS
Introduction To PostGIS
mleslie
 
Java Image Processing for Geospatial Community
Java Image Processing for Geospatial CommunityJava Image Processing for Geospatial Community
Java Image Processing for Geospatial Community
Jody Garnett
 
GeoServer Orientation
GeoServer OrientationGeoServer Orientation
GeoServer Orientation
Jody Garnett
 
Wms Performance Tests Map Server Vs Geo Server
Wms Performance Tests Map Server Vs Geo ServerWms Performance Tests Map Server Vs Geo Server
Wms Performance Tests Map Server Vs Geo ServerDonnyV
 
Graph Regularised Hashing
Graph Regularised HashingGraph Regularised Hashing
Graph Regularised Hashing
Sean Moran
 
Accelerated Logistic Regression on GPU(s)
Accelerated Logistic Regression on GPU(s)Accelerated Logistic Regression on GPU(s)
Accelerated Logistic Regression on GPU(s)
RAHUL BHOJWANI
 
Overview of MassGIS Web Mapping Services
Overview of MassGIS Web Mapping ServicesOverview of MassGIS Web Mapping Services
Overview of MassGIS Web Mapping Services
aleda_freeman
 
Creating Stunning Maps in GeoServer: mastering SLD and CSS styles
Creating Stunning Maps in GeoServer: mastering SLD and CSS stylesCreating Stunning Maps in GeoServer: mastering SLD and CSS styles
Creating Stunning Maps in GeoServer: mastering SLD and CSS styles
GeoSolutions
 
Serving earth observation data with GeoServer: addressing real world requirem...
Serving earth observation data with GeoServer: addressing real world requirem...Serving earth observation data with GeoServer: addressing real world requirem...
Serving earth observation data with GeoServer: addressing real world requirem...
GeoSolutions
 
MongoDB + GeoServer
MongoDB + GeoServerMongoDB + GeoServer
MongoDB + GeoServer
MongoDB
 
Scaling PostreSQL with Stado
Scaling PostreSQL with StadoScaling PostreSQL with Stado
Scaling PostreSQL with StadoJim Mlodgenski
 

What's hot (19)

The State of the GeoServer project
The State of the GeoServer projectThe State of the GeoServer project
The State of the GeoServer project
 
What's new in GeoServer 2.2
What's new in GeoServer 2.2What's new in GeoServer 2.2
What's new in GeoServer 2.2
 
Using GeoServer for spatio-temporal data management with examples for MetOc a...
Using GeoServer for spatio-temporal data management with examples for MetOc a...Using GeoServer for spatio-temporal data management with examples for MetOc a...
Using GeoServer for spatio-temporal data management with examples for MetOc a...
 
Intro To PostGIS
Intro To PostGISIntro To PostGIS
Intro To PostGIS
 
Why is postgis awesome?
Why is postgis awesome?Why is postgis awesome?
Why is postgis awesome?
 
G-Store: High-Performance Graph Store for Trillion-Edge Processing
G-Store: High-Performance Graph Store for Trillion-Edge ProcessingG-Store: High-Performance Graph Store for Trillion-Edge Processing
G-Store: High-Performance Graph Store for Trillion-Edge Processing
 
Performance features12102 doag_2014
Performance features12102 doag_2014Performance features12102 doag_2014
Performance features12102 doag_2014
 
State of GeoServer 2.14
State of GeoServer 2.14State of GeoServer 2.14
State of GeoServer 2.14
 
Introduction To PostGIS
Introduction To PostGISIntroduction To PostGIS
Introduction To PostGIS
 
Java Image Processing for Geospatial Community
Java Image Processing for Geospatial CommunityJava Image Processing for Geospatial Community
Java Image Processing for Geospatial Community
 
GeoServer Orientation
GeoServer OrientationGeoServer Orientation
GeoServer Orientation
 
Wms Performance Tests Map Server Vs Geo Server
Wms Performance Tests Map Server Vs Geo ServerWms Performance Tests Map Server Vs Geo Server
Wms Performance Tests Map Server Vs Geo Server
 
Graph Regularised Hashing
Graph Regularised HashingGraph Regularised Hashing
Graph Regularised Hashing
 
Accelerated Logistic Regression on GPU(s)
Accelerated Logistic Regression on GPU(s)Accelerated Logistic Regression on GPU(s)
Accelerated Logistic Regression on GPU(s)
 
Overview of MassGIS Web Mapping Services
Overview of MassGIS Web Mapping ServicesOverview of MassGIS Web Mapping Services
Overview of MassGIS Web Mapping Services
 
Creating Stunning Maps in GeoServer: mastering SLD and CSS styles
Creating Stunning Maps in GeoServer: mastering SLD and CSS stylesCreating Stunning Maps in GeoServer: mastering SLD and CSS styles
Creating Stunning Maps in GeoServer: mastering SLD and CSS styles
 
Serving earth observation data with GeoServer: addressing real world requirem...
Serving earth observation data with GeoServer: addressing real world requirem...Serving earth observation data with GeoServer: addressing real world requirem...
Serving earth observation data with GeoServer: addressing real world requirem...
 
MongoDB + GeoServer
MongoDB + GeoServerMongoDB + GeoServer
MongoDB + GeoServer
 
Scaling PostreSQL with Stado
Scaling PostreSQL with StadoScaling PostreSQL with Stado
Scaling PostreSQL with Stado
 

Similar to OSM data in MariaDB / MySQL - All the world in a few large tables

Why Spark Is the Next Top (Compute) Model
Why Spark Is the Next Top (Compute) ModelWhy Spark Is the Next Top (Compute) Model
Why Spark Is the Next Top (Compute) Model
Dean Wampler
 
Getting Started with Hadoop
Getting Started with HadoopGetting Started with Hadoop
Getting Started with Hadoop
Josh Devins
 
Why Spark Is the Next Top (Compute) Model
Why Spark Is the Next Top (Compute) ModelWhy Spark Is the Next Top (Compute) Model
Why Spark Is the Next Top (Compute) Model
Dean Wampler
 
Report Hadoop Map Reduce
Report Hadoop Map ReduceReport Hadoop Map Reduce
Report Hadoop Map Reduce
Urvashi Kataria
 
Dynamic viz in the IPython Notebook
Dynamic viz in the IPython NotebookDynamic viz in the IPython Notebook
Dynamic viz in the IPython Notebook
Brianna Laugher
 
Spark the next top compute model
Spark   the next top compute modelSpark   the next top compute model
Spark the next top compute model
Dean Wampler
 
Unified Big Data Processing with Apache Spark (QCON 2014)
Unified Big Data Processing with Apache Spark (QCON 2014)Unified Big Data Processing with Apache Spark (QCON 2014)
Unified Big Data Processing with Apache Spark (QCON 2014)
Databricks
 
barplotv4.pdf
barplotv4.pdfbarplotv4.pdf
(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C...
(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C...(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C...
(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C...
Reynold Xin
 
Analyzing Data at Scale with Apache Spark
Analyzing Data at Scale with Apache SparkAnalyzing Data at Scale with Apache Spark
Analyzing Data at Scale with Apache Spark
Nicola Ferraro
 
Architecting and productionising data science applications at scale
Architecting and productionising data science applications at scaleArchitecting and productionising data science applications at scale
Architecting and productionising data science applications at scale
samthemonad
 
A gentle introduction to the world of BigData and Hadoop
A gentle introduction to the world of BigData and HadoopA gentle introduction to the world of BigData and Hadoop
A gentle introduction to the world of BigData and Hadoop
Stefano Paluello
 
OpenStreetMap in the age of Spark
OpenStreetMap in the age of SparkOpenStreetMap in the age of Spark
OpenStreetMap in the age of Spark
Adrian Bona
 
Hadoop with Lustre WhitePaper
Hadoop with Lustre WhitePaperHadoop with Lustre WhitePaper
Hadoop with Lustre WhitePaper
David Luan
 
Q4 2016 GeoTrellis Presentation
Q4 2016 GeoTrellis PresentationQ4 2016 GeoTrellis Presentation
Q4 2016 GeoTrellis Presentation
Rob Emanuele
 
Introduction to Apache Hadoop
Introduction to Apache HadoopIntroduction to Apache Hadoop
Introduction to Apache Hadoop
Steve Watt
 
Map Reduce Online
Map Reduce OnlineMap Reduce Online
Map Reduce Online
Hadoop User Group
 
Processing Geospatial Data At Scale @locationtech
Processing Geospatial Data At Scale @locationtechProcessing Geospatial Data At Scale @locationtech
Processing Geospatial Data At Scale @locationtech
Rob Emanuele
 
Meethadoop
MeethadoopMeethadoop
Meethadoop
IIIT-H
 
Fossgis 2013 GeoServer Presentation
Fossgis 2013 GeoServer PresentationFossgis 2013 GeoServer Presentation
Fossgis 2013 GeoServer Presentation
GeoSolutions
 

Similar to OSM data in MariaDB / MySQL - All the world in a few large tables (20)

Why Spark Is the Next Top (Compute) Model
Why Spark Is the Next Top (Compute) ModelWhy Spark Is the Next Top (Compute) Model
Why Spark Is the Next Top (Compute) Model
 
Getting Started with Hadoop
Getting Started with HadoopGetting Started with Hadoop
Getting Started with Hadoop
 
Why Spark Is the Next Top (Compute) Model
Why Spark Is the Next Top (Compute) ModelWhy Spark Is the Next Top (Compute) Model
Why Spark Is the Next Top (Compute) Model
 
Report Hadoop Map Reduce
Report Hadoop Map ReduceReport Hadoop Map Reduce
Report Hadoop Map Reduce
 
Dynamic viz in the IPython Notebook
Dynamic viz in the IPython NotebookDynamic viz in the IPython Notebook
Dynamic viz in the IPython Notebook
 
Spark the next top compute model
Spark   the next top compute modelSpark   the next top compute model
Spark the next top compute model
 
Unified Big Data Processing with Apache Spark (QCON 2014)
Unified Big Data Processing with Apache Spark (QCON 2014)Unified Big Data Processing with Apache Spark (QCON 2014)
Unified Big Data Processing with Apache Spark (QCON 2014)
 
barplotv4.pdf
barplotv4.pdfbarplotv4.pdf
barplotv4.pdf
 
(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C...
(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C...(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C...
(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C...
 
Analyzing Data at Scale with Apache Spark
Analyzing Data at Scale with Apache SparkAnalyzing Data at Scale with Apache Spark
Analyzing Data at Scale with Apache Spark
 
Architecting and productionising data science applications at scale
Architecting and productionising data science applications at scaleArchitecting and productionising data science applications at scale
Architecting and productionising data science applications at scale
 
A gentle introduction to the world of BigData and Hadoop
A gentle introduction to the world of BigData and HadoopA gentle introduction to the world of BigData and Hadoop
A gentle introduction to the world of BigData and Hadoop
 
OpenStreetMap in the age of Spark
OpenStreetMap in the age of SparkOpenStreetMap in the age of Spark
OpenStreetMap in the age of Spark
 
Hadoop with Lustre WhitePaper
Hadoop with Lustre WhitePaperHadoop with Lustre WhitePaper
Hadoop with Lustre WhitePaper
 
Q4 2016 GeoTrellis Presentation
Q4 2016 GeoTrellis PresentationQ4 2016 GeoTrellis Presentation
Q4 2016 GeoTrellis Presentation
 
Introduction to Apache Hadoop
Introduction to Apache HadoopIntroduction to Apache Hadoop
Introduction to Apache Hadoop
 
Map Reduce Online
Map Reduce OnlineMap Reduce Online
Map Reduce Online
 
Processing Geospatial Data At Scale @locationtech
Processing Geospatial Data At Scale @locationtechProcessing Geospatial Data At Scale @locationtech
Processing Geospatial Data At Scale @locationtech
 
Meethadoop
MeethadoopMeethadoop
Meethadoop
 
Fossgis 2013 GeoServer Presentation
Fossgis 2013 GeoServer PresentationFossgis 2013 GeoServer Presentation
Fossgis 2013 GeoServer Presentation
 

Recently uploaded

Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
nkrafacyberclub
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Aggregage
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Ramesh Iyer
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
Quantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIsQuantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIs
Vlad Stirbu
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
Peter Spielvogel
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
UiPathCommunity
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
Alison B. Lowndes
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
Pierluigi Pugliese
 

Recently uploaded (20)

Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
Quantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIsQuantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIs
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 

OSM data in MariaDB / MySQL - All the world in a few large tables

  • 1. OSM data in MariaDB/MySQL All the world in a few large tables Well, almost ... Hartmut Holzgraefe SkySQL AB hartmut@skysql.com February 1, 2014 Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 1 / 30
  • 2. Overview 1 MySQL, MariaDB and GIS 2 OpenStreetMap 3 osm2pgsql 4 Examples 5 The End ... Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 2 / 30
  • 3. MySQL, MariaDB and GIS 1 MySQL, MariaDB and GIS History Current Status Roadmap 2 OpenStreetMap 3 osm2pgsql 4 Examples 5 The End ... Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 3 / 30
  • 4. History First appeard in MySQL 4.1 (2004) ... with MBR relations only Lab Release adds true spatial relations (200?) MariaDB 5.3 GA with true spatial relations (2011) MySQL 5.6 GA with true spatial relations (2013) ... to be continued ... Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 4 / 30
  • 5. MBR is not enough MBR CONTAINS() Hartmut Holzgraefe (SkySQL) True ST CONTAINS() OSM data in MariaDB/MySQL February 1, 2014 5 / 30
  • 6. Current Status Spatial relations work ... but the world is still flat (no projections) OK for many use cases ... but be aware of gotchas like DISTANCE() GEOMETRY types in all storage engines ... but only MyISAM has SPATIAL indexes Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 6 / 30
  • 7. MariaDB 10.1 Roadmap System Tables (spatial ref sys, geometry columns, ...) Precision math calculations and storage Coordinate transformations / projections 3rd coordinate (e.g. for altitude) all spatial functions required by OGC spatial aware optimizer Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 7 / 30
  • 8. ... and beyond SPATIAL indexes in other storage engines 3D calculations client side support for GIS transformations Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 8 / 30
  • 9. Openstreetmap 1 MySQL, MariaDB and GIS 2 OpenStreetMap Intro Core Data Model Data Access Data Import 3 osm2pgsql 4 Examples 5 The End ... Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 9 / 30
  • 10. OpenStreetMap History founded in 2004 by Steve Coast data under open license (CC-BY-SA first, now ODBL) 1.5 million contributors 2 billion map nodes 200 million ways 2 million relations almost 4 billion GPX points Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 10 / 30
  • 11. Pretty Tiles Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 11 / 30
  • 12. ... and raw data Raw map data can be used for other things, too: for routing for coverage checks for flight simulators for science Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 12 / 30
  • 13. Core Data Model Just three simple things Nodes (Points) Ways Relations Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 13 / 30
  • 14. Nodes Nodes describe a single point at a specific location using: A numeric ID Object version, Timestamp of last change, User Node coordinates Node attributes as key/value pairs Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 14 / 30
  • 15. Ways Ways form an open or closed line by connecting nodes, using: A numeric ID Object version, Timestamp of last change, User An ordered list of node IDs Way attributes as key/value pairs Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 15 / 30
  • 16. Relations Relations bundle objects to describe more complex relations, using: A numeric ID Object version, Timestamp of last change, User Ordered lists of member nodes, ways and sub-relations Optional member roles Attributes as key/value pairs Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 16 / 30
  • 17. Data Access The main database is not exposed directly Only one central instance, accessible via “The API” API meant for editor applications only Full data export once a week (“the planet”) Plus daily, hourly, minutely diffs Two file formats for planets: .osm XML based, usually bz2 compressed (32GB packed, 400GB unpacked) .pbf compact binary format based on Google ProtoBuf (23GB) Regional extracts available by 3rd parties, e.g. GeoFabrik.de Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 17 / 30
  • 18. Data Import The raw data is not really suitable for most purposes, esp. rendering Several import/preprocessing tools provide more convenient schemas, e.g. by ... only extracting certain attributes ... making a difference beween ways and areas ... resolving relations into simpler objects Besides osm2pgsql that I’m about to talk about in a minute there are also impOsm, ... Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 18 / 30
  • 19. osm2pgsql 1 MySQL, MariaDB and GIS 2 OpenStreetMap 3 osm2pgsql Block Diagram Data Model Again Adding MySQL Support Performance 4 Examples 5 The End ... Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 19 / 30
  • 20. osm2pgsql osm2pgsql is a tool written in C with a small C++ part now reads OSM data preprocesses it stores results in relational tables originally in PostGIS only Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 20 / 30
  • 21. Block Diagram Figure : osm2pgsql block diagram Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 21 / 30
  • 22. Data Model Again prefix point for single node POIs prefix line for linear 2D objects like roads, rivers, power lines ... prefix roads a subset of the above, optimizer for rendering prefix polygon objects covering an area: buildings, landuse, administrative borders ... Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 22 / 30
  • 23. Adding MySQL Support turned out to be more tricky than thought some core parts directly called PostgreSQL functions a lot of general functionality was hidden in PostgreSQL specific modules MySQL output module works, could be faster though MySQL middle layer is “code complete” ... ... but crashes while processing relations :( so for now imports are limited by RAM size Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 23 / 30
  • 24. Import performance imports currently take about 4-5 times as long ... as we have no direct equivalent to COPY osm2pgsql at less than 50% CPU only ... so switching to async API would be a 2x win already ... with multi-insert even more so index building is faster ... ... but may not be once we get I/O bound tables on disk are of similar size Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 24 / 30
  • 25. Query performance select from join on where and count(*) nrw_point n nrw_polygon p st_contains(p.way,n.way) p.name = ’Bielefeld’ n.amenity=’post_box’; Data Set Germany Northrhine-Westfalia same with better indexes Hartmut Holzgraefe (SkySQL) MySQL 5.5 (MBR) MariaDB 5.5 PostGIS 15.8s 2.7s 0.2s 16.5s 3.1s 0.2s ? 6.1s .04s OSM data in MariaDB/MySQL February 1, 2014 25 / 30
  • 26. Examples 1 MySQL, MariaDB and GIS 2 OpenStreetMap 3 osm2pgsql 4 Examples 5 The End ... Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 26 / 30
  • 27. The End ... 1 MySQL, MariaDB and GIS 2 OpenStreetMap 3 osm2pgsql 4 Examples 5 The End ... Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 27 / 30
  • 28. References Contact hartmut@skysql.com MariaDB GIS https://mariadb.com/kb/en/gis-functionality/ MySQL GIS http://dev.mysql.com/doc/refman/5.6/en/spatialextensions.html OpenStreetMap http://openstreetmap.org/ MapCompare http://mc.bbbike.org/mc/ RiverMap http://www.kompf.de/gps/rivermap.html osm2pgsql https://wiki.openstreetmap.org/wiki/Osm2pgsql My Code https://github.com/hholzgra/osm2pgsql/tree/devel-mysql Table Files http://php-groupies.de/gis-data/ (soon) Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 28 / 30
  • 29. Questions! Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 29 / 30
  • 30. The End? Or just the beginning? Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 30 / 30