The document discusses several models for representing hierarchical or tree-like data structures in relational databases, including adjacency list, closure table, path enumeration, nested set, and various extensions of the David Chandler model. It provides examples of creating tables and querying data for each model, and notes some advantages and limitations of each approach.
MySQL 8 introduces support for ANSI SQL recursive queries with common table expressions, a powerful method for working with recursive data references. Until now, MySQL application developers have had to use workarounds for hierarchical data relationships. It's time to write SQL queries in a more standardized way, and be compatible with other brands of SQL implementations. But as always, the bottom line is: how does it perform? This presentation will briefly describe how to use recursive queries, and then test the performance and scalability of those queries against other solutions for hierarchical queries.
Tree-like data relationships are common, but working with trees in SQL usually requires awkward recursive queries. This talk describes alternative solutions in SQL, including:
- Adjacency List
- Path Enumeration
- Nested Sets
- Closure Table
Code examples will show using these designs in PHP, and offer guidelines for choosing one design over another.
Trees In The Database - Advanced data structuresLorenzo Alberton
Storing tree structures in a bi-dimensional table has always been problematic. The simplest tree models are usually quite inefficient, while more complex ones aren't necessarily better. In this talk I briefly go through the most used models (adjacency list, materialized path, nested sets) and introduce some more advanced ones belonging to the nested intervals family (Farey algorithm, Continued Fractions, and other encodings). I describe the advantages and pitfalls of each model, some proprietary solutions (e.g. Oracle's CONNECT BY) and one of the SQL Standard's upcoming features, Common Table Expressions.
Presentation given at OSCON 2009 and PostgreSQL West 09. Describes SQL solutions to a selection of object-oriented problems:
- Extensibility
- Polymorphism
- Hierarchies
- Using ORM in MVC application architecture
These slides are excerpted from another presentation, "SQL Antipatterns Strike Back."
Designing an extensible, flexible schema that supports user customization is a common requirement, but it's easy to paint yourself into a corner.
Examples of extensible database requirements:
- A database that allows users to declare new fields on demand.
- Or an e-commerce catalog with many products, each with distinct attributes.
- Or a content management platform that supports extensions for custom data.
The solutions we use to meet these requirements is overly complex and the performance is terrible. How should we find the right balance between schema and schemaless database design?
I'll briefly cover the disadvantages of Entity-Attribute-Value (EAV), a problematic design that's an example of the antipattern called the Inner-Platform Effect, That is, modeling an attribute-management system on top of the RDBMS architecture, which already provides attributes through columns, data types, and constraints.
Then we'll discuss the pros and cons of alternative data modeling patterns, with respect to developer productivity, data integrity, storage efficiency and query performance, and ease of extensibility.
- Class Table Inheritance
- Serialized BLOB
- Inverted Indexing
Finally we'll show tools like pt-online-schema-change and new features of MySQL 5.6 that take the pain out of schema modifications.
MySQL 8 introduces support for ANSI SQL recursive queries with common table expressions, a powerful method for working with recursive data references. Until now, MySQL application developers have had to use workarounds for hierarchical data relationships. It's time to write SQL queries in a more standardized way, and be compatible with other brands of SQL implementations. But as always, the bottom line is: how does it perform? This presentation will briefly describe how to use recursive queries, and then test the performance and scalability of those queries against other solutions for hierarchical queries.
Tree-like data relationships are common, but working with trees in SQL usually requires awkward recursive queries. This talk describes alternative solutions in SQL, including:
- Adjacency List
- Path Enumeration
- Nested Sets
- Closure Table
Code examples will show using these designs in PHP, and offer guidelines for choosing one design over another.
Trees In The Database - Advanced data structuresLorenzo Alberton
Storing tree structures in a bi-dimensional table has always been problematic. The simplest tree models are usually quite inefficient, while more complex ones aren't necessarily better. In this talk I briefly go through the most used models (adjacency list, materialized path, nested sets) and introduce some more advanced ones belonging to the nested intervals family (Farey algorithm, Continued Fractions, and other encodings). I describe the advantages and pitfalls of each model, some proprietary solutions (e.g. Oracle's CONNECT BY) and one of the SQL Standard's upcoming features, Common Table Expressions.
Presentation given at OSCON 2009 and PostgreSQL West 09. Describes SQL solutions to a selection of object-oriented problems:
- Extensibility
- Polymorphism
- Hierarchies
- Using ORM in MVC application architecture
These slides are excerpted from another presentation, "SQL Antipatterns Strike Back."
Designing an extensible, flexible schema that supports user customization is a common requirement, but it's easy to paint yourself into a corner.
Examples of extensible database requirements:
- A database that allows users to declare new fields on demand.
- Or an e-commerce catalog with many products, each with distinct attributes.
- Or a content management platform that supports extensions for custom data.
The solutions we use to meet these requirements is overly complex and the performance is terrible. How should we find the right balance between schema and schemaless database design?
I'll briefly cover the disadvantages of Entity-Attribute-Value (EAV), a problematic design that's an example of the antipattern called the Inner-Platform Effect, That is, modeling an attribute-management system on top of the RDBMS architecture, which already provides attributes through columns, data types, and constraints.
Then we'll discuss the pros and cons of alternative data modeling patterns, with respect to developer productivity, data integrity, storage efficiency and query performance, and ease of extensibility.
- Class Table Inheritance
- Serialized BLOB
- Inverted Indexing
Finally we'll show tools like pt-online-schema-change and new features of MySQL 5.6 that take the pain out of schema modifications.
In this Meetup Victor Perepelitsky - R&D Technical Leader at LivePerson leading the 'Real Time Event Processing Platform' team , will talk about Java 8', 'Stream API', 'Lambda', and 'Method reference'.
Victor will clarify what functional programming is and how can you use java 8 in order to create better software.
Victor will also cover some pain points that Java 8 did not solve regarding functionality and see how you can work around it.
Graphs in the Database: Rdbms In The Social Networks AgeLorenzo Alberton
Despite the NoSQL movement trying to flag traditional databases as a dying breed, the RDBMS keeps evolving and adding new powerful weapons to its arsenal. In this talk we'll explore Common Table Expressions (SQL-99) and how SQL handles recursion, breaking the bi-dimensional barriers and paving the way to more complex data structures like trees and graphs, and how we can replicate features from social networks and recommendation systems. We'll also have a look at window functions (SQL:2003) and the advanced reporting features they make finally possible.
Let's get into several common types of queries that developers struggle with, showing SQL solutions, and then analyze them for optimal efficiency. I'll cover Exclusion Join, Random Selection, Greatest-Per-Group, Dynamic Pivot, and Relational Division.
A comparison of different solutions for full-text search in web applications using PostgreSQL and other technology. Presented at the PostgreSQL Conference West, in Seattle, October 2009.
Using MySQL without Maatkit is like taking a photo without removing the camera's lens cap. Professional MySQL experts use this toolkit to help keep complex MySQL installations running smoothly and efficiently. This session will show you practical ways to use Maatkit every day.
The JSON data type and functions that support it comprise one of the most interesting features introduced in MySQL 5.7 for application developers. But no feature is a Golden Hammer. We need to apply a little expertise to get the best of it, and avoid misusing it. I’ll show practical examples that work well with JSON, and other scenarios where conventional columns would perform better. Questions addressed in this presentation: How much space does JSON data use, compared to conventional data? What is the performance of querying JSON vs. conventional data? How do I create indexes for JSON data? What kind of data is best to store in JSON? How do I get the best of both worlds?
Les slides de ma présentation à Devoxx France 2017.
Introduite en Java 8, l'API Collector vit dans l'ombre de l'API Stream, ce qui est logique puisqu'un collecteur doit se connecter à un stream pour fonctionner. Le JDK est organisé de sorte que l'on utilise surtout les collectors sur étagère : groupingBy, counting et quelques autres. Ces deux éléments masquent non seulement le modèle de traitement de données des collectors, mais aussi sa puissance et ses performances.
Ces présentation parle des collectors qui existent et qu'il faut connaître, ceux que l'on peut créer, ceux dont on se doute que l'on peut les créer une fois que l'on comprend un peu les choses, et les autres, tant les possibilités offertes par cette API sont illimitées.
Clean Architecture helps you build applications that are Independent of Frameworks, UI, Database or any external agency. This talk will be an introduction to Clean Architecture principles and system design in Python. We will choose a sample application and walk through a simple use case of creating clean code.
What you will gain from this talk:
* Grow your application with confidence
* Build applications to be independent of infrastructure like Web Frameworks, Database etc.
* Avoid mixing of business logic across different layers of your application
* Test close to 100% of your core business logic
* Keep your tests fast and independent of infrastructure (DB, Web Layer etc.)
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
In this Meetup Victor Perepelitsky - R&D Technical Leader at LivePerson leading the 'Real Time Event Processing Platform' team , will talk about Java 8', 'Stream API', 'Lambda', and 'Method reference'.
Victor will clarify what functional programming is and how can you use java 8 in order to create better software.
Victor will also cover some pain points that Java 8 did not solve regarding functionality and see how you can work around it.
Graphs in the Database: Rdbms In The Social Networks AgeLorenzo Alberton
Despite the NoSQL movement trying to flag traditional databases as a dying breed, the RDBMS keeps evolving and adding new powerful weapons to its arsenal. In this talk we'll explore Common Table Expressions (SQL-99) and how SQL handles recursion, breaking the bi-dimensional barriers and paving the way to more complex data structures like trees and graphs, and how we can replicate features from social networks and recommendation systems. We'll also have a look at window functions (SQL:2003) and the advanced reporting features they make finally possible.
Let's get into several common types of queries that developers struggle with, showing SQL solutions, and then analyze them for optimal efficiency. I'll cover Exclusion Join, Random Selection, Greatest-Per-Group, Dynamic Pivot, and Relational Division.
A comparison of different solutions for full-text search in web applications using PostgreSQL and other technology. Presented at the PostgreSQL Conference West, in Seattle, October 2009.
Using MySQL without Maatkit is like taking a photo without removing the camera's lens cap. Professional MySQL experts use this toolkit to help keep complex MySQL installations running smoothly and efficiently. This session will show you practical ways to use Maatkit every day.
The JSON data type and functions that support it comprise one of the most interesting features introduced in MySQL 5.7 for application developers. But no feature is a Golden Hammer. We need to apply a little expertise to get the best of it, and avoid misusing it. I’ll show practical examples that work well with JSON, and other scenarios where conventional columns would perform better. Questions addressed in this presentation: How much space does JSON data use, compared to conventional data? What is the performance of querying JSON vs. conventional data? How do I create indexes for JSON data? What kind of data is best to store in JSON? How do I get the best of both worlds?
Les slides de ma présentation à Devoxx France 2017.
Introduite en Java 8, l'API Collector vit dans l'ombre de l'API Stream, ce qui est logique puisqu'un collecteur doit se connecter à un stream pour fonctionner. Le JDK est organisé de sorte que l'on utilise surtout les collectors sur étagère : groupingBy, counting et quelques autres. Ces deux éléments masquent non seulement le modèle de traitement de données des collectors, mais aussi sa puissance et ses performances.
Ces présentation parle des collectors qui existent et qu'il faut connaître, ceux que l'on peut créer, ceux dont on se doute que l'on peut les créer une fois que l'on comprend un peu les choses, et les autres, tant les possibilités offertes par cette API sont illimitées.
Clean Architecture helps you build applications that are Independent of Frameworks, UI, Database or any external agency. This talk will be an introduction to Clean Architecture principles and system design in Python. We will choose a sample application and walk through a simple use case of creating clean code.
What you will gain from this talk:
* Grow your application with confidence
* Build applications to be independent of infrastructure like Web Frameworks, Database etc.
* Avoid mixing of business logic across different layers of your application
* Test close to 100% of your core business logic
* Keep your tests fast and independent of infrastructure (DB, Web Layer etc.)
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if sometime changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
2. In most projects we have to deal with
some kind of trees or hierarchies!
Think about…
• Categories
• Organisation
• Threads
• Folders
• Components
•…
3. How can we store an retrieve trees and hierarchies
effenciently from relational databases?
4. Agenda
• Adjacency List Model
• Closure Table Model
• Path Enumeration Model
• David Chandler Model
• Modified David Chandler Model
• Nested Set Model
• Hyprid Model
• Frequent Insertion in Nested Sets
5. What is a „tree“?
A tree is a connected, undirected, acyclic grap.
A
B C
D E F G H
I
6. What is a „tree“?
A tree is a connected, undirected, acyclic grap.
A
B C
D E F G H
I X
7. What is a „tree“?
A tree is a connected, undirected, acyclic grap.
A
B C
D E F G H
I
8. What is a „tree“?
A tree is a connected, undirected, acyclic grap.
A
B C
D E F G H
I
9. Adjacency List Model
A
B C
D E F G H
I
title parent Creating the table:
CREATE TABLE nodes
A null (title VARCHAR(255) NOT NULL PRIMARY KEY,
B A parent VARCHAR(255));
C A Finding the root node:
D B SELECT *
FROM nodes
E B WHERE parent IS NULL;
F C
Finding the leaf nodes:
G C
SELECT node1.*
H C FROM nodes AS node1
LEFT JOIN node AS node2 ON node1.title = node2.parent
I F WHERE node2.title = ‘A’;
10. Adjacency List Model
UPDATE anomalies
A
B C
D E F G H
title parent I
A null UPDATE nodes
SET title = ‘X’
B A WHERE title = ‘C’;
C A
D B START TRANSACTION;
UPDATE nodes
E B SET title = ‘X’
WHERE title = ‘C’;
F C UPDATE nodes
G C SET parent = ‘X’
WHERE parent = ‘C’;
H C COMMIT;
I F
11. Adjacency List Model
INSERT anomalies
INSERT INTO nodes (title, parent) VALUES
(‘X’, ‘Y’), X Y
(‘Y’, ‘X’);
INSERT INTO nodes (title, parent) VALUES X
(‘X’, ‘X’);
Parent must exist:
ALTER TABLE nodes ADD CONSTRAINT parent_fk FOREIGN KEY (parent)
REFERENCES nodes (id);
Node can not be it‘s own parent:
CHECK (title <> parent)
Prevent double connections:
UNIQUE (title, parent)
Connected graph (edges = nodes – 1):
CHECK ((SELECT COUNT(*) FROM nodes) – 1
= SELECT COUNT(parent) FROM nodes;
One root only:
CHECK ((SELECT COUNT(*) FROM nodes WHERE parent IS NULL) = 1);
12. Adjacency List Model
DELETE anomalies
DELETE FROM nodes WHERE title = ‘C’;
A
B C
D E F G H
I
13. Adjacency List Model
DELETE anomalies
What happens with nodes F, G, H and I?
A
B ?
D E F G H
I
ALTER TABLE nodes ADD CONSTRAINT parent_fk FOREIGN KEY (parent)
REFERENCES nodes (id)
ON DELETE ...;
14. Adjacency List Model
DELETE anomalies
Solution 1: delete all children
A
B C
D E F G H
I
15. Adjacency List Model
DELETE anomalies
Solution 2: move children to the level of the parent
A
B F G H
D E I
16. Adjacency List Model
DELETE anomalies
Solution 3: replace parent with the first child
A
B F
D E I G H
17. Adjacency List Model
with self -references
A
B C
D E F G H
title parent
I
A A Creating the table:
CREATE TABLE nodes
B A (title VARCHAR(255) NOT NULL PRIMARY KEY,
parent VARCHAR(255) NOT NULL);
C A
D B
E B Avoids NULL in the parent column!
F C
G C
H C
I F
18. Path Enumeration Model
A
B C
D E F G H
I
title path Creating the table:
CREATE TABLE nodes
A A (title VARCHAR(255) NOT NULL PRIMARY KEY
B A/B -- don’t use a separater in primary key
CHECK (REPLACE (title, ‘/’, ‘’) = title,
C A/C path VARCHAR(1024) NOT NULL);
D A/B/D
E A/B/E
F A/C/F
G A/C/G
H A/C/H
I A/C/F/I
19. Closure Table Model
A
B C
D E F G H
ID parent Ancestor Descendant
1 A 1 2
2 B 1 4
3 C 1 5
4 D 2 4
5 E 2 5
6 F 1 3
7 G 1 6
8 H 1 7
1 8
20. David Chandler Model
A
B C
D E F G H
ID T L1 L2 L3 Creating the table:
CREATE TABLE nodes
1 A 1 0 0 (id INTEGER NOT NULL PRIMARY KEY
2 B 1 1 0 title VARCHAR(255) NOT NULL
L1 INTEGER NOT NULL,
3 C 1 2 0 L2 INTEGER NOT NULL,
L3 INTEGER NOT NULL);
4 D 1 1 1
5 E 1 1 2 Algorithm is patened by David Chandler!
6 F 1 2 1
7 G 1 2 2 Hierarchy level is fixed!
8 H 1 2 3
21. Modified David Chandler Model
A
B C
D E F G H
ID T L1 L2 L3 Creating the table:
CREATE TABLE nodes
1 A 1 0 0 (id INTEGER NOT NULL PRIMARY KEY
2 B 1 2 0 title VARCHAR(255) NOT NULL
L1 INTEGER NOT NULL,
3 C 1 3 0 L2 INTEGER NOT NULL,
L3 INTEGER NOT NULL);
4 D 1 2 4
5 E 1 2 5 Not sure about implecations of David
6 F 1 3 6 Chandlers patent…
7 G 1 3 7 Hierarchy level is fixed!
8 H 1 3 8
22. Nested Set Model
1 A 18
2 B 7 8 C 17
3 D 4 5 E 6 9 F 12 13 G 14 15 H 16
10 I 11
title lft rgt Creating the table:
CREATE TABLE nodes
A 1 18 (title VARCHAR(255) NOT NULL PRIMARY KEY,
B 2 7 lft INTEGER NOT NULL,
rgt INTEGER NOT NULL);
C 8 17 Finding the root node:
D 3 4 SELECT *
FROM nodes
E 5 6 WHERE lft = 1;
F 9 12
Finding the leaf nodes:
G 13 14
SELECT *
H 15 16 FROM nodes
WHERE lft = (MAX(rgt) – 1)
I 10 11
23. Nested Set Model
1 A 18
2 B 7 8 C 17
3 D 4 5 E 6 9 F 12 13 G 14 15 H 16
10 I 11
title lft rgt Finding subtrees:
A 1 18 SELECT * FROM nodes
WHERE lft BETWEEN lft AND rgt
B 2 7 ORDER BY lft ASC;
C 8 17 Finding path to a node:
D 3 4 SELECT * FROM nodes
WHERE lft < ? AND rgt > ?
E 5 6 ORDER BY lft ASC;
F 9 12
G 13 14
H 15 16
I 10 11
24. Hybrid Models
Combining models
Adjacency List
David Chandler
Path Enumeration
Nested Set
Closure Table
25. Nested Set Model
Problem: Tree must be reorganized on every insertion (table lock).
1 A 100
2 B 49 50 C 99
D E F G H
3 24 25 48 51 72 73 14 15 16
Possible solution: Avoid overhead with larger spreads and bigger gaps
26. Nested Set Model
What datatype shout I use for left and right values?
INTEGER
FLOAT / REAL / DOUBLE
NUMERIC / DECIMAL
27. Nested Set Model
with rational numbers
0/5 A 5/5
1/5 B 2/5 3/5 C 4/5
a b
a+(b-a)/3 a+2(b-a)/3
28. Nested Set Model
with rational numbers
0/5 A 5/5
1/5 B 2/5 3/5 C 4/5
16/25 D 17/25
T ln ld rn rd
A 0 5 5 5
Adding an aditional child node is alway possible.
B 1 5 2 5 No need to reorganize the table!
C 3 5 4 5
D 16 25 17 25
29. Nested Set Model
Binary Fraction
node X Y
1 1 1/2
1.1 1 3/4
1.1.1 1 7/8
1.1.1.1 1 15/16
1.1.2 7/8 13/16
1.1.3 13/16 25/32
1.2 3/4 5/8
1.2.1 3/4 11/16
1.3 5/8 9/16