Shapefile: a curse
of today’s Geoinformatics
Jáchym Čepický
OpenGeoLabs s.r.o.
GIVS 2018
OGC GeoPackage – new
format for data exchange
Jáchym Čepický
OpenGeoLabs s.r.o.
FOSS4G-Europe 2018
(ESRI) Shapefile
https://www.esri.com/library/whitepapers/pdfs/shapefile.pdf
http://octogeo.cz
Raster → Vector
Layer “grass” of global landcover, vector,
approx. 3,000,000 vector features
2 days of calculation …
ERROR 1: Failure writing DBF record 31212233.
ERROR 1: Error in psSHP->sHooks.Fwrite() while writing object
to .shp file.
ERROR 1: Terminating translation prematurely after failed
http://switchfromshapefile.org
https://github.com/OpenGeoLabs/shapefilemustdie/
Shapefile - positives
● Well supported in all software packages.
● Although the format is proprietary, the spec is open.
● For many use cases, it is good enough.
● The internal index (*.shx) can speed up reading.
● It’s relatively efficient.
The dark side of Shapefile
1. Multiple-file based format
Multi-file format
● shp
● shx
● dbf
Multi-file format
● .shp
● .shx
● .dbf
● .prj
● .sbn and .sbx
● .fbn and .fbx
● .ain and .aih
● .ixs
● .mxs
● .atx
● .shp.xml
● .cpg
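Because one dataset is spread over this many files, copying or deleting it means tracking every sibling. A minimal sketch of collecting them, assuming all parts sit in the same directory and share the base name; `shapefile_parts` is a hypothetical helper, not part of any library:

```python
from pathlib import Path


def shapefile_parts(shp_path):
    """Return every file in the same directory that shares the dataset's base name."""
    shp = Path(shp_path)
    prefix = shp.stem + "."  # "roads.shp" -> "roads."
    # Matching on the name prefix also catches double extensions such as .shp.xml
    return sorted(
        p for p in shp.parent.iterdir()
        if p.is_file() and p.name.startswith(prefix)
    )
```

Calling `shapefile_parts("data/roads.shp")` (a made-up path) would list `roads.shp`, `roads.shx`, `roads.dbf` and whatever optional parts happen to exist next to them.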
2. Maximum length of an attribute name is 10 characters
RUIAN - register of land identifications and
addresses,
LandRegistry:
● Kod [Code]
● Nazev [Name]
● Nespravny [FalseName]
● ExistujeDigitalniMapa [DigitalMapExists]
● ObecKod [CityCode]
● PlatiOd [ValidSince]
● PlatiDo [ValidTill]
● IdTransakce [TransactionId]
● GlobalniIdNavrhuZmeny [GlobalChangeRequest]
● RizeniId [ProcedureId]
● MluvnickeCharakteristikyPad2 [GrammaticalCharacteristicsPad2]
● MluvnickeCharakteristikyPad3 [GrammaticalCharacteristicsPad3]
● MluvnickeCharakteristikyPad4 [GrammaticalCharacteristicsPad4]
● MluvnickeCharakteristikyPad6 [GrammaticalCharacteristicsPad6]
● MluvnickeCharakteristikyPad7 [GrammaticalCharacteristicsPad7]
● DatumVzniku [Origin]
2. At most 10 characters per attribute name
RUIAN - register of land identifications and
addresses,
LandRegistry:
● Kod
● Nazev
● Nespravny
● ExistujeDigitalniMapa → 21
● ObecKod
● PlatiOd
● PlatiDo
● IdTransakce → 11
● GlobalniIdNavrhuZmeny → 21
● RizeniId
● MluvnickeCharakteristikyPad2 → 28
● MluvnickeCharakteristikyPad3 → 28
● MluvnickeCharakteristikyPad4 → 28
● MluvnickeCharakteristikyPad6 → 28
● MluvnickeCharakteristikyPad7 → 28
● DatumVzniku → 11
2. At most 10 characters per attribute name
Warning 6: Normalized/laundered field name: 'ExistujeDigitalniMapa' to 'ExistujeDi'
Warning 6: Normalized/laundered field name: 'IdTransakce' to 'IdTransakc'
Warning 6: Normalized/laundered field name: 'GlobalniIdNavrhuZmeny' to 'GlobalniId'
Warning 6: Normalized/laundered field name: 'MluvnickeCharakteristikyPad2' to 'MluvnickeC'
Warning 6: Normalized/laundered field name: 'MluvnickeCharakteristikyPad3' to 'Mluvnick_1'
Warning 6: Normalized/laundered field name: 'MluvnickeCharakteristikyPad4' to 'Mluvnick_2'
Warning 6: Normalized/laundered field name: 'MluvnickeCharakteristikyPad6' to 'Mluvnick_3'
Warning 6: Normalized/laundered field name: 'MluvnickeCharakteristikyPad7' to 'Mluvnick_4'
Warning 6: Normalized/laundered field name: 'DatumVzniku' to 'DatumVznik'
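The warnings above follow a recognisable pattern: each name is cut to 10 characters, and a clash is resolved by keeping 8 characters and appending _1, _2, and so on. A minimal sketch that reproduces these particular results; it is an approximation written for illustration, not GDAL's actual laundering code, and `launder_field_names` is a hypothetical helper:

```python
def launder_field_names(names, max_len=10):
    """Truncate names to max_len characters and de-duplicate with _1, _2, ... suffixes."""
    used = set()
    result = {}
    for name in names:
        candidate = name[:max_len]
        counter = 1
        # On a clash, shorten the stem so that stem + suffix still fits in max_len
        while candidate.upper() in used:
            suffix = f"_{counter}"
            candidate = name[:max_len - len(suffix)] + suffix
            counter += 1
        used.add(candidate.upper())
        result[name] = candidate
    return result
```

The point of the sketch is that five distinct, meaningful source names collapse into `MluvnickeC` and `Mluvnick_1` to `Mluvnick_4`, so the original meaning can no longer be recovered from the Shapefile alone.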
3. At most 255 attributes per table
4. Flat data structure
5. Poor support for data types
+ Float
+ Integer
+ Date
+ Character string - 254 characters max.
- big integers
- boolean
- text
- blob
- ...
6. Unknown character set
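Without a reliable encoding declaration (the optional .cpg file is the usual hint), a reader has to guess how to decode attribute text. A small sketch of what a wrong guess does, assuming the data was written in Windows-1250; `decode_dbf_text` is a hypothetical helper and the sample string is only an illustration:

```python
def decode_dbf_text(raw, encoding):
    """Decode raw attribute bytes, replacing anything invalid in the given encoding."""
    return raw.decode(encoding, errors="replace")


# Bytes as a Central European (cp1250) writer would have stored them
raw = "Čepický".encode("cp1250")

right_guess = decode_dbf_text(raw, "cp1250")   # round-trips to the original text
wrong_guess = decode_dbf_text(raw, "latin-1")  # decodes without error, but to different text
```

The wrong guess raises no error, which is why mis-encoded attribute tables tend to go unnoticed until someone reads the values.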
7. Maximum file size is 2 GB.
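The attribute table can reach this ceiling well before the geometry does. A rough sketch of the arithmetic, assuming the common dBASE layout (32-byte header, 32 bytes per field descriptor, one terminator byte, one deletion-flag byte per record, one end-of-file byte); the function names are hypothetical and any field widths passed in are the caller's own:

```python
TWO_GB = 2**31  # the 2 GB limit, in bytes


def estimate_dbf_size(n_records, field_widths):
    """Estimate the .dbf size in bytes for fixed-width records."""
    header = 32 + 32 * len(field_widths) + 1  # file header + field descriptors + terminator
    record = 1 + sum(field_widths)            # deletion flag + fixed-width fields
    return header + n_records * record + 1    # trailing end-of-file marker


def exceeds_limit(n_records, field_widths, limit=TWO_GB):
    """True when the estimated .dbf would not fit under the limit."""
    return estimate_dbf_size(n_records, field_widths) > limit
```

Since every record has the same fixed width, a handful of wide text columns multiplied by a few million features is enough to cross the limit.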
8. Only one geometry type per file
9. Limited support for 3D
10. Inconsistent projection definition
11 sins of Shapefile - http://switchfromshapefile.org
Shapefile replacement?
Shapefile replacement
● Supports complex data structures
● Reasonable size
● Reasonably fast
● Maintained
● Open
Shapefile replacement
● OGC GeoPackage
● GeoJSON
● OGC GML
● SpatiaLite
● CSV
● OGC KML
SpatiaLite
● http://spatialite.org
● Builds on SQLite
○ File SQL database (almost complete SQL-92)
● OGC Simple Features
● No raster data support
OGC GML
● http://opengeospatial.org/standards/gml
● XML
● OGC
● INSPIRE
GeoJSON
● http://geojson.org
● JSON
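As a quick contrast with the Shapefile limits, a GeoJSON feature can carry long attribute names and booleans directly. A minimal sketch using only the standard `json` module; the coordinates and property values are made up for illustration, and `make_point_feature` is a hypothetical helper:

```python
import json


def make_point_feature(lon, lat, properties):
    """Build a GeoJSON Feature with a Point geometry (coordinates are lon, lat)."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": properties,
    }


# Illustrative values only; the attribute names are taken from the RUIAN example
feature = make_point_feature(
    14.42, 50.09,
    {"ExistujeDigitalniMapa": True, "GlobalniIdNavrhuZmeny": 12345},
)
text = json.dumps(feature)
```

Serialising and parsing the text again gives back the same names and types, with nothing truncated to 10 characters.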
OGC GeoPackage
● http://geopackage.org
● OGC Standard
● Built on SQLite
● Also supports rasters
● Official extensions
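Since a GeoPackage is an SQLite database, ordinary SQLite tooling can inspect it. A minimal sketch of one such check, assuming the `application_id` value "GPKG" (0x47504B47) that GeoPackage 1.2 and later write into the SQLite header; older versions use other values, so treat this as a hint rather than a validator. `looks_like_geopackage` is a hypothetical helper:

```python
import sqlite3

# Assumption: GeoPackage 1.2+ stores ASCII "GPKG" as the SQLite application_id
GPKG_APPLICATION_ID = 0x47504B47


def looks_like_geopackage(conn):
    """Return True when the connection's application_id matches the assumed GPKG marker."""
    (app_id,) = conn.execute("PRAGMA application_id").fetchone()
    return app_id == GPKG_APPLICATION_ID
```

A usage line might be `looks_like_geopackage(sqlite3.connect("data.gpkg"))`, where the file name is made up.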
SQLite
● http://sqlite.org
● File-based database
● SQL 92
● Dynamic typing: the data type is bound to the cell, not to the whole column
● Poor support for concurrent writes
● Full-text search
● Chrome, Opera, Safari, Firefox, Skype, Windows 10, ...
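The dynamic typing point is easy to see with Python's built-in `sqlite3` module. A minimal sketch, using a column declared without any type so that SQLite stores each value as given; `stored_types` is a hypothetical helper:

```python
import sqlite3


def stored_types(values):
    """Insert values into one untyped column and report the storage type of each cell."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE demo (v)")  # no declared type on purpose
    conn.executemany("INSERT INTO demo (v) VALUES (?)", [(v,) for v in values])
    rows = conn.execute("SELECT typeof(v) FROM demo ORDER BY rowid").fetchall()
    conn.close()
    return [row[0] for row in rows]


types = stored_types([1, 2.5, "abc", b"\x00", None])
```

Here a single column ends up holding an integer, a real, a text, a blob and a null, one type per cell.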
GeoPackage vs. Shapefile
http://switchfromshapefile.org/compare.html
GeoPackage vs. Shapefile - data size
GeoPackage vs. Shapefile - conclusion
● GeoPackage seems slightly more efficient
● It is somewhat slower for simple queries, but not critically so.
● It seems to be significantly faster for complex spatial and database queries.
Implementations of OGC GeoPackage
● QGIS
● GDAL
● ArcGIS
● FME
● Hexagon
● …
● http://www.geopackage.org/implementations.html
QGIS 3
Map Styling Information
https://github.com/pka/qgpkg/blob/master/qgis_geopackage_extension.md
Official extension
GeoPackage - summary
● http://geopackage.org
● OGC Standard
● Built on SQLite
● File-based
● Raster support
● Extensions
● Already widely used, even though you might not have noticed
Jáchym Čepický
jachym.cepicky@opengeolabs.cz
http://opengeolabs.cz
http://switchfromshapefile.org

Switch from shapefile