Revisiting the Representation of and Need for Raw Geometries on the Linked Da...Blake Regalia
Geospatial data on the Semantic Web historically stems from using point geometries to represent the geographic locations of places. As the practice evolved in the Semantic Web community, a demand for more complex geometries and geospatial query capabilities came about as a consequence of integrating traditional GIS and geo-data into the Linked Data cloud. However, recent projects have revealed that, in practice, these established techniques have major shortcomings that limit their storage, transmission, and query potential. In this position paper, we examine these shortcomings, propose to treat geometries similar to how other binary data are stored and referenced on the Semantic Web, namely by representing them as resources via URIs instead of RDF literals, and demonstrate the utility of precomputing topological relations rather than computing them on-demand by arguing that end users are most often interested in topology and not raw geometries.
Catálogo de muebles de salón comedor y dormitorios de matrimonio de la colección OCEAN V.3. de la firma Indufex en http://www.mueblipedia.com/es/coleccion/217/salones-y-dormitorios-ocean
digital oder tot? - How Digital Disruption is Driving Change in the CX LandscapeTorben Haagh
Digitale Disruption transformiert den CX-Bereich! Der Artikel „How Digital Disruption is Driving Change in the CX Landscape“ vom CX Network nimmt Handel, Banking, Automobil, mobile Zahlungen und Apps unter die Lupe, da hier jede digitale Disruption in Kauf genommen wird, um zu überleben!
Hören Sie auf der CEM-interactive 2016 in Berlin Best Practice Beispiele u.a. von Swiss Air, Loewe, Commerzbank, 1&1, Chefkoch und lesen Sie hier, wie BMW, Etsy, Fider, Pingit und Whatsapp Hürden bewältigt haben: http://bit.ly/CEM-DigitalDisruption-SlideShare
Audible bietet Hörbücher für alle Altersklassen an - aber kann das auch für alle zum Erlebnis werden? Welche CX-Technologien und -Strategien setzt das Unternehmen heute ein und was ist für die Zukunft geplant? Britta Backhaus, Head of Customer Development bei Audible, beantwortet u.a. diese Fragen im exklusiven Q&A, das Sie hier kostenlos lesen können: http://bit.ly/CEM-QnA-Slideshare
Revisiting the Representation of and Need for Raw Geometries on the Linked Da...Blake Regalia
Geospatial data on the Semantic Web historically stems from using point geometries to represent the geographic locations of places. As the practice evolved in the Semantic Web community, a demand for more complex geometries and geospatial query capabilities came about as a consequence of integrating traditional GIS and geo-data into the Linked Data cloud. However, recent projects have revealed that, in practice, these established techniques have major shortcomings that limit their storage, transmission, and query potential. In this position paper, we examine these shortcomings, propose to treat geometries similar to how other binary data are stored and referenced on the Semantic Web, namely by representing them as resources via URIs instead of RDF literals, and demonstrate the utility of precomputing topological relations rather than computing them on-demand by arguing that end users are most often interested in topology and not raw geometries.
Catálogo de muebles de salón comedor y dormitorios de matrimonio de la colección OCEAN V.3. de la firma Indufex en http://www.mueblipedia.com/es/coleccion/217/salones-y-dormitorios-ocean
digital oder tot? - How Digital Disruption is Driving Change in the CX LandscapeTorben Haagh
Digitale Disruption transformiert den CX-Bereich! Der Artikel „How Digital Disruption is Driving Change in the CX Landscape“ vom CX Network nimmt Handel, Banking, Automobil, mobile Zahlungen und Apps unter die Lupe, da hier jede digitale Disruption in Kauf genommen wird, um zu überleben!
Hören Sie auf der CEM-interactive 2016 in Berlin Best Practice Beispiele u.a. von Swiss Air, Loewe, Commerzbank, 1&1, Chefkoch und lesen Sie hier, wie BMW, Etsy, Fider, Pingit und Whatsapp Hürden bewältigt haben: http://bit.ly/CEM-DigitalDisruption-SlideShare
Audible bietet Hörbücher für alle Altersklassen an - aber kann das auch für alle zum Erlebnis werden? Welche CX-Technologien und -Strategien setzt das Unternehmen heute ein und was ist für die Zukunft geplant? Britta Backhaus, Head of Customer Development bei Audible, beantwortet u.a. diese Fragen im exklusiven Q&A, das Sie hier kostenlos lesen können: http://bit.ly/CEM-QnA-Slideshare
CX Network veranschaulicht den Unterschied von Multi-Channel vs. Omni-Channel, damit Sie die wachsenden Erwartungen Ihrer Kunden besser erfüllen können. Hier geht’s zum kostenlosen Download des gesamten Artikels:
http://bit.ly/CX-Channels-Slideshare
This is the helpful guideline released by Indian Standard IS 11769, Part 3 which specifies the Guidelines for Safe Usage of Products Containing Asbestos and the steps needed for Installation, Cleaning, Disposal of it. This deals primary with Asbestos or CAF Gaskets, Asbestos Insulation, Asbestos Millboard, Asbestos Brake Liners.
Capítulo 3:Protocolos y comunicaciones de red
Los estudiantes podrán hacer lo siguiente:
Explicar la forma en que se utilizan las reglas para facilitar la comunicación.
Explicar la función de los protocolos y de los organismos de estandarización para facilitar la interoperabilidad en las comunicaciones de red.
Explicar la forma en que los dispositivos de una LAN acceden a los recursos en una red de pequeña o mediana empresa.
after a brief intro about indexes present in PostgreSQL, the new feature in PostgreSQL 10 are shown: parallel index scans, hash persistence, BRIN autosummarization, unbalanced support (SP-GiST) for inet data.
PG Day'14 Russia, GIN — Stronger than ever in 9.4 and further, Александр Коро...pgdayrussia
Доклад был представлен на официальной российской конференции PG Day'14 Russia, посвященной вопросам разработки и эксплуатации PostgreSQL.
Доклад посвящен улучшениям в GIN-индексах в PostgreSQL 9.4 и далее, которые выводят GIN на новый уровень производительности и расширяемости. Наиболее важные улучшения:
Сжатие постинг-листов. Индексы становятся в среднем в 2 раза компактнее. При это не требуется никаких изменений со стороны opclass'ов. pg_upgrade поддерживается, индексы сжимаются "на лету".
Алгоритм быстрого сканирования GIN-индексов позволяет пропускать части больших постинг-деревьев при сканировании. Этот алгоритм кардинально улучшает скорость поиска для hstore и jsonb операторов, а также случай "частое_слово & редкое_слово" для полнотекстового поиска.
Хранение дополнительной информации в постинг-листах. Содержимое этой дополнительной информации зависит от конкретной разновидности GIN-индекса (определяется opclass'ом). Дополнительная информация может быть полезна при самых разных видах поиска: поиск по фразам, поиск по похожести массивов, обратный полнотекстовый поиск (поиск тех tsquery, которые подходят под tsvector), обратный поиск по регулярным выражением (поиск регулярных выражений, подходящих под строку), поиск по строковой "похожести" с использованием позиционных n-грам.
Ранжирование по индексу. Это улучшение позволяет возвращать результаты из индекса таким образом, как это определяет opclass. Наиболее важное применение — возвращение результатов полнотекстового поиска в порядке релевантности, кардинально снижающее загрузку IO. Но есть также и другие применения, такие как возврат массивов или строк в порядке их "похожести".
В докладе представлены результаты "бенчмарков" полнотектового поиска, использующие реальные наборы данных (6М и 15М документов) и реальные поисковые запросы, которые демонстрируют, что улучшенный полнотекстовый поиск PostgreSQL (со всеми накладными расходами ACID) может превосходить по скорости Sphinx.
Graph operations in Git version control systemJakub Narębski
Graph operations in Git version control system: how the performance was improved (for large repositories), how can it be further improved.
Git uses various clever methods for making operations on very large repositories faster, from bitmap indices for 'git fetch', to generation numbers (also known as topological levels) in the commit-graph file for commit graph traversal operations like 'git log --graph'. There are also other ideas that could be used to make those operations even faster.
CX Network veranschaulicht den Unterschied von Multi-Channel vs. Omni-Channel, damit Sie die wachsenden Erwartungen Ihrer Kunden besser erfüllen können. Hier geht’s zum kostenlosen Download des gesamten Artikels:
http://bit.ly/CX-Channels-Slideshare
This is the helpful guideline released by Indian Standard IS 11769, Part 3 which specifies the Guidelines for Safe Usage of Products Containing Asbestos and the steps needed for Installation, Cleaning, Disposal of it. This deals primary with Asbestos or CAF Gaskets, Asbestos Insulation, Asbestos Millboard, Asbestos Brake Liners.
Capítulo 3:Protocolos y comunicaciones de red
Los estudiantes podrán hacer lo siguiente:
Explicar la forma en que se utilizan las reglas para facilitar la comunicación.
Explicar la función de los protocolos y de los organismos de estandarización para facilitar la interoperabilidad en las comunicaciones de red.
Explicar la forma en que los dispositivos de una LAN acceden a los recursos en una red de pequeña o mediana empresa.
after a brief intro about indexes present in PostgreSQL, the new feature in PostgreSQL 10 are shown: parallel index scans, hash persistence, BRIN autosummarization, unbalanced support (SP-GiST) for inet data.
PG Day'14 Russia, GIN — Stronger than ever in 9.4 and further, Александр Коро...pgdayrussia
Доклад был представлен на официальной российской конференции PG Day'14 Russia, посвященной вопросам разработки и эксплуатации PostgreSQL.
Доклад посвящен улучшениям в GIN-индексах в PostgreSQL 9.4 и далее, которые выводят GIN на новый уровень производительности и расширяемости. Наиболее важные улучшения:
Сжатие постинг-листов. Индексы становятся в среднем в 2 раза компактнее. При это не требуется никаких изменений со стороны opclass'ов. pg_upgrade поддерживается, индексы сжимаются "на лету".
Алгоритм быстрого сканирования GIN-индексов позволяет пропускать части больших постинг-деревьев при сканировании. Этот алгоритм кардинально улучшает скорость поиска для hstore и jsonb операторов, а также случай "частое_слово & редкое_слово" для полнотекстового поиска.
Хранение дополнительной информации в постинг-листах. Содержимое этой дополнительной информации зависит от конкретной разновидности GIN-индекса (определяется opclass'ом). Дополнительная информация может быть полезна при самых разных видах поиска: поиск по фразам, поиск по похожести массивов, обратный полнотекстовый поиск (поиск тех tsquery, которые подходят под tsvector), обратный поиск по регулярным выражением (поиск регулярных выражений, подходящих под строку), поиск по строковой "похожести" с использованием позиционных n-грам.
Ранжирование по индексу. Это улучшение позволяет возвращать результаты из индекса таким образом, как это определяет opclass. Наиболее важное применение — возвращение результатов полнотекстового поиска в порядке релевантности, кардинально снижающее загрузку IO. Но есть также и другие применения, такие как возврат массивов или строк в порядке их "похожести".
В докладе представлены результаты "бенчмарков" полнотектового поиска, использующие реальные наборы данных (6М и 15М документов) и реальные поисковые запросы, которые демонстрируют, что улучшенный полнотекстовый поиск PostgreSQL (со всеми накладными расходами ACID) может превосходить по скорости Sphinx.
Graph operations in Git version control systemJakub Narębski
Graph operations in Git version control system: how the performance was improved (for large repositories), how can it be further improved.
Git uses various clever methods for making operations on very large repositories faster, from bitmap indices for 'git fetch', to generation numbers (also known as topological levels) in the commit-graph file for commit graph traversal operations like 'git log --graph'. There are also other ideas that could be used to make those operations even faster.
How can you use PosgreSQL as a schemaless (NoSQL) database? Here we cover our use case and highlight upcoming features in postgres 9.4 and its integration with Django 1.7
LSGAN - SIMPle(Simple Idea Meaningful Performance Level up)Hansol Kang
LSGAN은 기존의 GAN loss가 아닌 MSE loss를 사용하여, 더욱 realistic한 데이터를 생성함.
LSGAN 논문 리뷰 및 PyTorch 기반의 구현.
[참고]
Mao, Xudong, et al. "Least squares generative adversarial networks." Proceedings of the IEEE International Conference on Computer Vision. 2017.
Every application has to store and manage data that in one form or another has a temporal extent. For some years, database systems started integrating features that help with the management of such kinds of data. In this talk, we dive into the support that the Open Source database system PostgreSQL together with its ecosystem provides to facilitate the querying and processing of temporal data. We will also take a brief look at the projects we are working on at unibz in this context.
Raster Data In GeoServer And GeoTools: Achievements, Issues And Future Develo...GeoSolutions
The purpose of this presentation is to discuss the developments during last years in raster data support in GeoTools and GeoServer, and also to introduce and discuss future development directions.
Abstract: Iterative stencils represent the core computational kernel of many applications belonging to different domains, from scientific computing to finance. Given the complex dependencies and the low computation to memory access ratio, this kernels represent a challenging acceleration target on every architecture. This is especially true for FPGAs, whose direct hardware execution offers the possibility for high performance and power efficiency, but where the non-fixed architecture can lead to very large solutions spaces to be explored.
In this work, we build upon an FPGA-based acceleration methodology for iterative stencil algorithms previously presented, where we provide a dataflow architectural template that implements optimal on-chip buffering and is able to increase almost linearly in performance using a scaling technique denoted as iterations queuing. In particular, we propose a set of design improvements and we elaborate an accurate analytical performance model that can be used to support the exploration of the design space. Experimental results obtained implementing a set of benchmarks from different application domains on a Xilinx VC707 board show an average performance and power efficiency increase over the previous work of respectively around 22x and 8x, and a prediction error that is on average less than 1%.
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation for ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. Slowdown on the GPU is likely caused by a large submission of small workloads, and expected to be non-issue when the computation is performed on massive graphs.
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...John Andrews
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Adjusting primitives for graph : SHORT REPORT / NOTESSubhajit Sahu
Graph algorithms, like PageRank Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Data and AI
Round table discussion of vector databases, unstructured data, ai, big data, real-time, robots and Milvus.
A lively discussion with NJ Gen AI Meetup Lead, Prasad and Procure.FYI's Co-Found
1. 2ndQuadrant Italia Giuseppe Broccolo – giuseppe.broccolo@2ndquadrant.it
June, 14th
2016
BRIN indexes on
geospatial
databases
www.2ndquadrant.com
2. 2ndQuadrant Italia Giuseppe Broccolo – giuseppe.broccolo@2ndquadrant.it
~$ whoami~$ whoami
Giuseppe Broccolo, PhDGiuseppe Broccolo, PhD
PostgreSQL & PostGIS consultantPostgreSQL & PostGIS consultant
@giubro gbroccolo7
giuseppe.broccolo@2ndquadrant.it
gbroccolo gemini__81
3. 2ndQuadrant Italia Giuseppe Broccolo – giuseppe.broccolo@2ndquadrant.it
Indexes & PostgreSQLIndexes & PostgreSQL
• Binary structures on disc:
– Indexing organised in nodes into 8kB pages
– Nodes point to table rows
• Speed up data access: ~O(log N)
– High performance until the index can be contained in RAM
• Size: ~O(N)
4. 2ndQuadrant Italia Giuseppe Broccolo – giuseppe.broccolo@2ndquadrant.it
Geospatial indexes in PostgreSQLGeospatial indexes in PostgreSQL
• All datatypes can be indexed in principle...
– ...just define the needed Operator Class!
• Operators & support functions
– ...define how the data has to be stored into the index nodes
• Geospatial data:
– can be larger than 8kB
– Is generally a varlena
– a single node cannot be contained into a single page
5. 2ndQuadrant Italia Giuseppe Broccolo – giuseppe.broccolo@2ndquadrant.it
PostGIS datatypes & GiST indexingPostGIS datatypes & GiST indexing
In PostGIS (since v0.6) geometry/geography are
converted into their float-precision bounding boxes
gidx/box2df are then used as storage
datatype to populate the index’s nodes
6. 2ndQuadrant Italia Giuseppe Broccolo – giuseppe.broccolo@2ndquadrant.it
GIS & BigDataGIS & BigData
LiDAR surveys
Maps of entire countries
It’s relatively easy to have to work with terabytes of geospatial data…
...can indexes be contained in RAM?
7. 2ndQuadrant Italia Giuseppe Broccolo – giuseppe.broccolo@2ndquadrant.it
A new index: Block Range Indexing (PG 9.5)A new index: Block Range Indexing (PG 9.5)
CREATE INDEX idx ON table
USING brin(column) WITH (pages_per_range=10);
data must be physically sorted on disk!
index node → row
index node → block
less specific than GiST!
Really small!
8kB8kB8kB8kB
8kB8kB8kB8kB
8kB8kB8kB8kB
(S. Riggs, A. Herrera)
8. 2ndQuadrant Italia Giuseppe Broccolo – giuseppe.broccolo@2ndquadrant.it
BRIN support for PostGISBRIN support for PostGIS
https://github.com/dalibo/postgis/tree/for_upstream
Julien RouhaudRonan Dunklau me
9. 2ndQuadrant Italia Giuseppe Broccolo – giuseppe.broccolo@2ndquadrant.it
BRIN support for PostGISBRIN support for PostGIS
NO kNN SUPPORT!!
– Different OpClass for
• 2D (default), 3D, 4D geometry
• geography
• box2d/box3d
• cross-operators defined in the OpFamily
– Storage datatype: float-precision (such as in GiST)
• gidx (3D, 4D geometry)
• box2df (2D geometry, geography)
– Operators: &&, @, ~ (2D) - &&& (3D, 4D)
8kB8kB8kB8kB
8kB8kB8kB8kB
10. 2ndQuadrant Italia Giuseppe Broccolo – giuseppe.broccolo@2ndquadrant.it
11stst
example: “sorted”example: “sorted” geometrygeometry pointspoints
~10M, 3D, SRID=0, distribution range: 0÷100kunits
=# CREATE INDEX idx_gist ON points USING gist
-# (geom gist_geometry_ops_nd);
=# CREATE INDEX idx_brin_128 ON points USING brin
-# (geom brin_geometry_inclusion_ops_3d);
=# CREATE INDEX idx_brin_10 ON points USING brin
-# (geom brin_geometry_inclusion_ops_3d)
-# WITH (pages_per_range=10);
BRIN(128)/GiST 1/2000
BRIN(10)/GiST 1/1000
BRIN(128)/GiST 1/30
BRIN(10)/GiST 1/20
Sizes
Building
times
11. 2ndQuadrant Italia Giuseppe Broccolo – giuseppe.broccolo@2ndquadrant.it
11stst
example: “sorted”example: “sorted” geometrygeometry pointspoints
~10M, 3D, SRID=0, distribution range: 0÷100kunits
=# SELECT * FROM points
-# WHERE ‘BOX3D(10. 10. 10., 12., 12., 12.)’::box3d &&& geom;
BRIN(128)/GiST 50/1
BRIN(10)/GiST 6/1
BRIN(128)/SeqScan 1/60
BRIN(10)/SeqScan 1/350
Execution times
12. 2ndQuadrant Italia Giuseppe Broccolo – giuseppe.broccolo@2ndquadrant.it
22ndnd
example: “unsorted”example: “unsorted” geometrygeometry pointspoints
~10M, 3D, SRID=0, distribution range: 0÷100kunits
=# CREATE TABLE unsorted_points AS
-# SELECT * FROM points ORDER BY random();
=# SELECT * FROM unsorted_points
-# WHERE ‘BOX3D(10. 10. 10., 12., 12., 12.)’::box3d &&& geom;
BRIN(128)/GiST 2800/1
BRIN(10)/GiST 400/1
BRIN(128)/SeqScan 1/1
BRIN(10)/SeqScan 1/12
Execution times
14. 2ndQuadrant Italia Giuseppe Broccolo – giuseppe.broccolo@2ndquadrant.it
...what about...what about INSERTINSERT’s?’s?
INSERT INTO points
SELECT ST_MakePoint(a, a, a)
FROM generate_series(1, 1000) AS f(a);
GiST 6 times than insertions without index
BRIN (10) 3 times than insertions without index
BRIN (128) 2 times than insertions without index
...blocks could be “not sorted” anymore!
15. 2ndQuadrant Italia Giuseppe Broccolo – giuseppe.broccolo@2ndquadrant.it
Manage LiDAR
data with
PostgreSQL
is it possible?
Is the relational approach valid for point clouds?Is the relational approach valid for point clouds?
16. 2ndQuadrant Italia Giuseppe Broccolo – giuseppe.broccolo@2ndquadrant.it
PostgreSQL & LiDARPostgreSQL & LiDAR
• PostgreSQL is a relational DBMS with an extension for LiDAR data:
pg_pointcloud
• Part of the OpenGeo suite, completely compatible with PostGIS
– two new datatype: pcpoint, pcpatch (compressed set of points)
• N points (with all attributes from the survey) → 1 patch → 1 record
– compatible with PDAL drivers to import data directly from .las
32b 32b 32b 16b
https://github.com/pgpointcloud/pointcloud
17. 2ndQuadrant Italia Giuseppe Broccolo – giuseppe.broccolo@2ndquadrant.it
Relational approach to LiDAR data with PGRelational approach to LiDAR data with PG
• GiST indexing in PostGIS
• GiST performances:
– storage: 1TB RAID1, RAM 16GB, 8 CPU @3.3GHz,
PostgreSQL9.3
– index size ~O(table size)
– Index was used:
• up to ~300M points in bbox inclusion searches
• up to ~10M points in kNN searches
LiDAR size: ~O(109
÷1011
) → few % can be properly indexed!
http://www.slideshare.net/GiuseppeBroccolo/gbroccolo-foss4-geugeodbindex
18. 2ndQuadrant Italia Giuseppe Broccolo – giuseppe.broccolo@2ndquadrant.it
The LiDAR dataset: the ahn2 projectThe LiDAR dataset: the ahn2 project
• 3D point cloud, coverage: almost the whole Netherlands
– EPSG: 28992, ~8 points/m2
• 1.6TB, ~250G points in ~560M patches (compression: ~10x)
– PDAL driver – filter.chipper
• available RAM: 16GB
• the point structure:
X Y Z scan LAS time RGB chipper
32b 32b 32b 40b 16b 64b 48b 32b
the “indexed” part (can be converted to PostGIS datatype)
(thanks to Tom Van Tilburg)
19. 2ndQuadrant Italia Giuseppe Broccolo – giuseppe.broccolo@2ndquadrant.it
Typical searches on ahn2 -Typical searches on ahn2 - x3d_viewerx3d_viewer
• Intersection with a polygon (PostGIS)
– the part that lasts longer – need indexes here!
• Patch “explosion” + NN sorting (pg_PointCloud+PostGIS)
• constrained Delaunay Triangulation (SFCGAL)
20. 2ndQuadrant Italia Giuseppe Broccolo – giuseppe.broccolo@2ndquadrant.it
WITH patches AS (
SELECT patches FROM ahn2
WHERE patches && ST_GeomFromText(‘POLYGON(...)’)
), points AS (
SELECT ST_Explode(patches) AS points
FROM patches
), sorted_points AS (
SELECT points,
ST_DumpPoints(ST_GeomFromText(‘POLYGON(...)’))).geom AS poly_pt
FROM points ORDER BY points <#> poly_pt LIMIT 1;
), sel AS (
SELECT points FROM sorted_points
WHERE points && ST_GeomFromText(‘POLYGON(...)’)
)
SELECT ST_Dump(ST_Triangulate2DZ(ST_Collect(points))) FROM sel;
All in the DB & just with one query!All in the DB & just with one query!
21. 2ndQuadrant Italia Giuseppe Broccolo – giuseppe.broccolo@2ndquadrant.it
patches && polygons - GiST performancepatches && polygons - GiST performance
GiST 2.5 days
GiST 29GB
index building
index size
polygon
size
timing
~O(10m) ~20ms
~O(100m) ~60ms
~O(1Km) ~3s
~O(10km) hours
searches based on GiST
index not contained in
RAM anymore
(~5G points → ~3%)
22. 2ndQuadrant Italia Giuseppe Broccolo – giuseppe.broccolo@2ndquadrant.it
patches && polygons - BRIN performancepatches && polygons - BRIN performance
BRIN 4 h BRIN 15MB
index building
polygon
size
timing
~O(m) ~150s
searches based on BRIN
index size
23. 2ndQuadrant Italia Giuseppe Broccolo – giuseppe.broccolo@2ndquadrant.it
patches && polygons - BRIN performancepatches && polygons - BRIN performance
BRIN 5 h BRIN 14MB
index building
polygon
size
timing
~O(m) ~150s
searches based on BRIN
How data was inserted...
LASLASLASLASLASLAS
chipper
chipper
chipper
PDAL driver
(parallel processes)
ahn2
index size
24. 2ndQuadrant Italia Giuseppe Broccolo – giuseppe.broccolo@2ndquadrant.it
Spatial sorting of a LiDAR datasetSpatial sorting of a LiDAR dataset
CREATE INDEX patch_geohash ON ahn2
USING btree (ST_GeoHash(ST_Transform(Geometry(patch), 4326), 20));
CLUSTER ahn2 USING patch_geohash;
Morton order [http://geohash.org/]
Need more than 2X table size free space
Relatively fast!
25. 2ndQuadrant Italia Giuseppe Broccolo – giuseppe.broccolo@2ndquadrant.it
Spatial sorting of a LiDAR datasetSpatial sorting of a LiDAR dataset
CREATE TABLE ahn2_sorted ON ahn2 AS
SELECT * FROM ahn2
ORDER BY ST_GeoHash(ST_Transform(Geometry(patch), 4326), 10)
COLLATE “C”;
Morton order [http://geohash.org/]
Just need size for a second table
Slower than clustering using the index!
26. 2ndQuadrant Italia Giuseppe Broccolo – giuseppe.broccolo@2ndquadrant.it
Spatial sorting of a LiDAR datasetSpatial sorting of a LiDAR dataset
CREATE TABLE ahn2_sorted ON ahn2 AS
SELECT * FROM ahn2
ORDER BY ST_GeoHash(ST_Transform(Geometry(patch), 4326), 10)
COLLATE “C”;
Morton order [http://geohash.org/]
Just need size for a second table
Slower than clustering using the index!
45 hours
27. 2ndQuadrant Italia Giuseppe Broccolo – giuseppe.broccolo@2ndquadrant.it
patches && polygons – BRIN performancepatches && polygons – BRIN performance
polygon
size
BRIN
timing
GiST
timing
~O(10m) ~380ms ~20ms
~O(100m) ~400ms ~60ms
~O(1Km) ~2.7s ~3s
~O(10km) ~4.7s hours
• After sorting: 150s→380ms
• GiST X20 faster than BRIN [r~O(10m)]
– BRIN X200 faster than SeqScan
• BRIN X1000 faster than SeqScan [r~O(100m)]
•
• BRIN ~ GiST [r~O(1Km)]
• Just BRIN is used for searches r>1km
28. 2ndQuadrant Italia Giuseppe Broccolo – giuseppe.broccolo@2ndquadrant.it
is the drop in performance acceptable?is the drop in performance acceptable?
BRIN searches = x20 GiST searches
= x10÷x100 GiST searches
= x10÷x100 GiST searches
patch explosion
+
NN sorting
constrained
triangulation
29. 2ndQuadrant Italia Giuseppe Broccolo – giuseppe.broccolo@2ndquadrant.it
the future of the project...add kNN supportthe future of the project...add kNN support
8kB8kB8kB8kB
8kB8kB8kB8kB
8kB8kB8kB8kB
8kB8kB8kB8kB
8kB8kB8kB8kB
8kB8kB8kB8kB
8kB8kB8kB8kB
8kB8kB8kB8kB
k
get blocks gradually included within the k parameter, as already made
in GiST with single tuples
WIP…sponsorships are welcome!
ORDER BY geoms <-> point
LIMIT N
30. 2ndQuadrant Italia Giuseppe Broccolo – giuseppe.broccolo@2ndquadrant.it
ConclusionsConclusions
• BRINs can be successfully used in geo DB based on PostgreSQL
– totally support PostGIS datatype
• A patch almost ready for the next PostGIS release (2.3.0)
– easier indexes maintenance, low indexing time, low impact on concurrent INSERTs
– less specific than GiST...but not to much!
• Really small indexes!
– GiST performances drop as well as it cannot be totally contained in RAM
• Can be relational approach to LiDAR data valid with PostgreSQL?
– Yes, at least for bbox inclusion searches
– make sure that data has the same sequentiality of .las