Querying Hierarchical Data with CTE
ALEXANDER ALDEV
DBaaS Architect
MariaDB Corporation
Common Table Expressions
● SQL:1999 standard
● Temporary named result set
● Supported by: Oracle, MS SQL, PostgreSQL, SQLite, MySQL, …
● Available in MariaDB since 10.2
● Two types:
○ Non-recursive
○ Recursive
CTE Syntax
WITH engineers AS (
SELECT *
FROM employees
WHERE dept = 'engineering'
)
SELECT * FROM engineers ...
WITH keyword
CTE name
CTE body
use in query
Similar to Derived Queries
As a CTE:
WITH engineers AS (
SELECT *
FROM employees
WHERE dept = 'engineering'
)
SELECT * FROM engineers ...
As a derived query:
SELECT * FROM (
SELECT *
FROM employees
WHERE dept = 'engineering'
) engineers
...
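A minimal sketch of the equivalence, run against SQLite through Python's sqlite3 module so it is self-contained. The employees rows and column names are hypothetical; the slides do not show the table.

```python
import sqlite3

# Hypothetical sample data; the slides do not show the employees table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, dept TEXT, country TEXT);
    INSERT INTO employees VALUES
        ('Ana',  'engineering', 'BG'),
        ('Ben',  'sales',       'US'),
        ('Cara', 'engineering', 'FI');
""")

# The CTE form: the named result set is declared before the main query.
cte_rows = conn.execute("""
    WITH engineers AS (
        SELECT * FROM employees WHERE dept = 'engineering'
    )
    SELECT name FROM engineers ORDER BY name
""").fetchall()

# The derived-query form: the same subquery, inlined in the FROM clause.
derived_rows = conn.execute("""
    SELECT name FROM (
        SELECT * FROM employees WHERE dept = 'engineering'
    ) engineers
    ORDER BY name
""").fetchall()

print(cte_rows)      # [('Ana',), ('Cara',)]
print(derived_rows)  # [('Ana',), ('Cara',)]
```

Both forms return the same rows; the CTE only moves the subquery to the top and gives it a name.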
Use Case: Readability
WITH engineers AS (
SELECT *
FROM employees
WHERE dept = 'engineering'
),
eu_engineers AS (
SELECT * FROM engineers
WHERE country IN ('BG','FI','DE')
)
SELECT COUNT(*) FROM eu_engineers
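A runnable sketch of chained CTEs, again using SQLite via Python with hypothetical sample rows. The WITH keyword appears once, the CTEs are separated by commas, and a later CTE can read from an earlier one.

```python
import sqlite3

# Hypothetical sample data for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, dept TEXT, country TEXT);
    INSERT INTO employees VALUES
        ('Ana',  'engineering', 'BG'),
        ('Ben',  'engineering', 'US'),
        ('Cara', 'engineering', 'FI'),
        ('Dirk', 'sales',       'DE');
""")

eu_count = conn.execute("""
    WITH engineers AS (
        SELECT * FROM employees WHERE dept = 'engineering'
    ),
    eu_engineers AS (
        -- Builds on the CTE defined just above.
        SELECT * FROM engineers WHERE country IN ('BG', 'FI', 'DE')
    )
    SELECT COUNT(*) FROM eu_engineers
""").fetchone()[0]

print(eu_count)  # 2
```

Dirk is in an EU country but not in engineering, so only Ana and Cara are counted.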
Use Case: Multiple References to CTE
WITH engineers AS (
SELECT *
FROM employees
WHERE dept = 'engineering'
)
SELECT * FROM engineers e1
WHERE NOT EXISTS( SELECT 1 FROM engineers e2
WHERE e1.country = e2.country
AND e1.first = e2.first )
Use Case: Year-over-Year comparison
WITH annual_sales AS (
SELECT product, YEAR(invoice_date) AS year,
SUM( amount ) AS total
FROM invoices
GROUP BY 1, 2
)
SELECT y1.year, y1.product, y1.total, y2.total last_yr_total
FROM annual_sales y1
JOIN annual_sales y2
ON y1.year = y2.year + 1
AND y1.product = y2.product
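A sketch of the year-over-year query on hypothetical invoice rows, run on SQLite through Python. SQLite has no YEAR() function, so strftime stands in for it here; that substitution is an assumption of this sketch, not part of the slide.

```python
import sqlite3

# Hypothetical invoices; amounts are whole numbers to keep the output simple.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices (product TEXT, invoice_date TEXT, amount INTEGER);
    INSERT INTO invoices VALUES
        ('widget', '2022-03-01', 100),
        ('widget', '2022-09-15',  50),
        ('widget', '2023-02-10', 200),
        ('gadget', '2023-05-05',  70);
""")

rows = conn.execute("""
    WITH annual_sales AS (
        SELECT product,
               CAST(strftime('%Y', invoice_date) AS INTEGER) AS year,
               SUM(amount) AS total
        FROM invoices
        GROUP BY product, year
    )
    -- The CTE is referenced twice: once per year being compared.
    SELECT y1.year, y1.product, y1.total, y2.total AS last_yr_total
    FROM annual_sales y1
    JOIN annual_sales y2
      ON y1.year = y2.year + 1
     AND y1.product = y2.product
""").fetchall()

print(rows)  # [(2023, 'widget', 200, 150)]
```

Because this is an inner join, a product with no sales in the previous year (gadget here) does not appear in the result.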
Recursive CTE
● SQL does not handle hierarchical data well
● Trees in format (parent_id, child_id)
● Graphs in format (point 1, point 2)
● Recursive CTEs allow for hierarchical data to be handled
● Also useful for search algorithms where next iteration depends on previous
Recursive CTE Syntax
WITH RECURSIVE ancestors AS (
SELECT * FROM family
WHERE name = 'Alex'
UNION ALL
SELECT f.* FROM family f, ancestors
WHERE f.id = ancestors.parent_id
) SELECT * FROM ancestors;
RECURSIVE keyword
CTE anchor
recursive use
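A runnable sketch of an ancestor lookup on SQLite via Python. The family rows are hypothetical, and the sketch assumes parent_id holds the id of a row's parent, so walking up the tree means matching family.id against the parent_id values already collected. The depth column is an addition for illustration.

```python
import sqlite3

# Hypothetical family tree: Ivan -> Maria -> Alex -> Niki.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE family (id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER);
    INSERT INTO family VALUES
        (1, 'Ivan',  NULL),
        (2, 'Maria', 1),
        (3, 'Alex',  2),
        (4, 'Niki',  3);
""")

rows = conn.execute("""
    WITH RECURSIVE ancestors(id, name, parent_id, depth) AS (
        -- Anchor: the person we start from.
        SELECT id, name, parent_id, 0 FROM family WHERE name = 'Alex'
        UNION ALL
        -- Recursive part: the parent of each row found so far.
        SELECT f.id, f.name, f.parent_id, a.depth + 1
        FROM family f, ancestors a
        WHERE f.id = a.parent_id
    )
    SELECT name, depth FROM ancestors ORDER BY depth
""").fetchall()

print(rows)  # [('Alex', 0), ('Maria', 1), ('Ivan', 2)]
```

The recursion stops at Ivan because his parent_id is NULL, which matches no id. Niki, a child of Alex, is never reached because the join only walks upward.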
How Do CTEs Work?
● Base execution strategy is to materialize the CTE result
● Optimizations are possible for non-recursive CTEs
● Recursive CTEs use this flow:
1. Evaluate the anchor expression
2. Evaluate the recursive expression -> new data
3. Append the new data to the result
4. If the new data is not empty, go to step 2
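The four steps above can be modelled in a few lines of Python. This is a simplified illustration of UNION ALL semantics, assuming the recursive expression sees only the rows produced by the previous iteration; it is not how MariaDB implements it internally. The function and parameter names are hypothetical.

```python
def evaluate_recursive_cte(anchor, recursive_step):
    """Toy model of recursive CTE evaluation (UNION ALL semantics)."""
    result = list(anchor())               # 1. evaluate the anchor expression
    new_rows = list(result)
    while new_rows:                       # 4. repeat while new data is not empty
        new_rows = recursive_step(new_rows)   # 2. recursive expression -> new data
        result.extend(new_rows)               # 3. append new data to the result
    return result


# Usage: the equivalent of
#   WITH RECURSIVE cnt(n) AS (SELECT 1 UNION ALL SELECT n + 1 FROM cnt WHERE n < 5)
counted = evaluate_recursive_cte(
    anchor=lambda: [1],
    recursive_step=lambda rows: [n + 1 for n in rows if n < 5],
)
print(counted)  # [1, 2, 3, 4, 5]
```

If the recursive step never returns an empty list, the loop does not terminate, which is why a recursive CTE needs a condition that eventually filters out every row.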
Example: Sudoku Solver
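The solver shown in the talk is not part of the slide text. As a stand-in, here is a sketch adapted from the well-known recursive-CTE Sudoku solver in the SQLite documentation, run through Python. It is written for SQLite; MariaDB would likely need adjustments such as CONCAT() instead of || and DIV for integer division. The puzzle itself is just sample input.

```python
import sqlite3

# Sample puzzle, one string per row; '.' marks an empty cell.
PUZZLE = "".join([
    "53..7....", "6..195...", ".98....6.",
    "8...6...3", "4..8.3..1", "7...2...6",
    ".6....28.", "...419..5", "....8..79",
])

SOLVER = """
WITH RECURSIVE
  input(sud) AS (VALUES (?)),
  -- The digits '1'..'9' as text (z) and as position offsets (lp).
  digits(z, lp) AS (
    VALUES ('1', 1)
    UNION ALL
    SELECT CAST(lp + 1 AS TEXT), lp + 1 FROM digits WHERE lp < 9
  ),
  -- x holds partial boards (s) and the index of their first empty cell (ind).
  x(s, ind) AS (
    SELECT sud, instr(sud, '.') FROM input
    UNION ALL
    SELECT substr(x.s, 1, x.ind - 1) || d.z || substr(x.s, x.ind + 1),
           instr(substr(x.s, 1, x.ind - 1) || d.z || substr(x.s, x.ind + 1), '.')
    FROM x, digits AS d
    WHERE x.ind > 0
      AND NOT EXISTS (
        SELECT 1 FROM digits AS k
        WHERE d.z = substr(x.s, ((x.ind - 1) / 9) * 9 + k.lp, 1)            -- same row
           OR d.z = substr(x.s, ((x.ind - 1) % 9) + (k.lp - 1) * 9 + 1, 1)  -- same column
           OR d.z = substr(x.s, (((x.ind - 1) / 3) % 3) * 3
                                + ((x.ind - 1) / 27) * 27 + k.lp
                                + ((k.lp - 1) / 3) * 6, 1)                  -- same 3x3 box
      )
  )
SELECT s FROM x WHERE ind = 0
"""

conn = sqlite3.connect(":memory:")
solution = conn.execute(SOLVER, (PUZZLE,)).fetchone()[0]

# Print the solved board as nine rows of nine digits.
for r in range(9):
    print(solution[r * 9:(r + 1) * 9])
```

Each iteration fills the first empty cell of every partial board with every digit that does not conflict with its row, column, or box. Boards with no valid digit simply produce no new rows, which is how dead ends are pruned.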
Example: ETL Job Scheduler
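The scheduler from the talk is not included in the slide text. The following is a hypothetical sketch of one way a recursive CTE could order ETL jobs: each job is assigned the earliest "wave" in which all of its dependencies have finished. The jobs and deps tables, their columns, and the sample rows are all assumptions, and the query is run on SQLite via Python.

```python
import sqlite3

# Hypothetical job graph: report needs both extract and load.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jobs (name TEXT PRIMARY KEY);
    CREATE TABLE deps (job TEXT, depends_on TEXT);
    INSERT INTO jobs VALUES ('extract'), ('transform'), ('load'), ('report');
    INSERT INTO deps VALUES
        ('transform', 'extract'),
        ('load',      'transform'),
        ('report',    'load'),
        ('report',    'extract');
""")

schedule = conn.execute("""
    WITH RECURSIVE waves(job, wave) AS (
        -- Anchor: jobs with no dependencies can run immediately.
        SELECT name, 0 FROM jobs
        WHERE name NOT IN (SELECT job FROM deps)
        UNION ALL
        -- Recursive part: a job runs at least one wave after each dependency.
        SELECT d.job, w.wave + 1
        FROM deps d JOIN waves w ON d.depends_on = w.job
    )
    -- A job may be reached by several paths; the longest one decides.
    SELECT job, MAX(wave) AS run_wave
    FROM waves
    GROUP BY job
    ORDER BY run_wave, job
""").fetchall()

print(schedule)
# [('extract', 0), ('transform', 1), ('load', 2), ('report', 3)]
```

The sketch assumes the dependency graph has no cycles; with a cycle the recursion would never produce an empty step and the query would not terminate.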
THANK YOU!

Query hierarchical data the easy way, with CTEs


Editor's Notes

  • #2 Alex: 20 years of experience in DWH, analytics, and a couple of business functions; currently leading MariaDB’s DBaaS architecture.