Building a Hierarchical Data Model Using the Latest IBM Informix Features
Ajaykumar Gupte
gupte@us.ibm.com
Agenda
● Problem of querying hierarchical data
● Hierarchical data design
● "Connect By": keywords & pseudo columns
● Execution model
● Query transformation
Problem of querying hierarchical data
• A common technique for storing hierarchical data in relational tables is self-reference
– Employee-Manager
  • Employee table (key: empid)
  • Every employee has a manager (indicated by mgrid)
  • A manager is also an employee (with a valid empid)
– Shipment
  • Inbound shipment table (key: item_id)
  • Each item can belong to a package (key: package_id)
  • Every package is itself an item (with a valid item_id)
CREATE TABLE employee (
    empid   INTEGER NOT NULL PRIMARY KEY,
    name    VARCHAR(10),
    salary  DECIMAL(9, 2),
    mgrid   INTEGER);
CREATE TABLE inbound_shipment (
    shipment_id VARCHAR(50),
    item_id     VARCHAR(20),
    package_id  VARCHAR(20),
    .......
[Figure: shipment hierarchy. ship_CX2555 contains pallets (Pallet_BX505 and others); pallets contain boxes (box_C1255, box_C3524, box_C4000); boxes contain items (band_aid, A1_pharma, vicks, Tylenol).]
Characteristics/Limitations
■ Multi-step approach requiring complex application/SPL logic
■ Recursive self-join
■ Filtering, ordering, or grouping requires additional logic
■ Joining results with other tables becomes complex
■ Reuse by other applications requires
– understanding of the complex logic (data placement etc.)
– more customization
Using CONNECT BY to discover data hierarchy
SELECT level AS package_level, item_id, package_id
FROM inbound_shipment
START WITH item_id = 'pallet_BX505'      -- seed of recursion
CONNECT BY PRIOR item_id = package_id    -- condition of recursion
Results of CONNECT BY Query
package_level item_id package_id
1 pallet_BX505 ship_CX2555
2 box_C1255 pallet_BX505
3 band_aid_H10 box_C1255
3 band_aid_H12 box_C1255
3 A1_pharma_F23 box_C1255
3 A1_pharma_F33 box_C1255
Hierarchical view of data
[Figure: employee tree by empid. Level 1: 17; level 2: 15, 16; level 3: 10, 13, 11, 12, 14; level 4: 1, 2, 4, 3, 5, 6, 7, 8, 9. Node 16 is labeled Goyal.]
SELECT name, empid, mgrid
FROM emp
START WITH name = 'Goyal'
CONNECT BY PRIOR empid = mgrid
name      empid  mgrid
Goyal     16     17
Zander    11     16
McKeough  5      11
Barnes    6      11
Henry     12     16
O'Neil    7      12
Smith     8      12
Shoeman   9      12
Scott     14     16
Flow of Execution
SELECT name, empid, mgrid
FROM emp
START WITH name = 'Goyal'
CONNECT BY PRIOR empid = mgrid
[Figure: the employee tree from the previous slide beside a stack diagram labeled JOIN, PUSH and POP; the empids shown on the stack are 16, 11, 14, 6, 5, 9, 8, 7.]
Where is hierarchical data?
• Bill of materials
• Reporting structure
• Package tracking
• Inventory management
• Social media
• Date/time
• Geography / region
PRIOR
■ The unary operator PRIOR is used in the join filter to distinguish column references to the previous recursive step from column references to the base table.
■ A query without PRIOR can run forever or return only a single row.
package_level item_id package_id
1 pallet_BX505 ship_CX2555
2 box_C1255 pallet_BX505
3 band_aid_H10 box_C1255
3 band_aid_H12 box_C1255
3 A1_pharma_F23 box_C1255
3 A1_pharma_F33 box_C1255
2 box_C3524 pallet_BX505
3 vicks_CK215 box_C3524
3 vicks_CK315 box_C3524
3 vicks_CK324 box_C3524
SELECT level , item_id, package_id
FROM inbound_shipment
START WITH item_id = 'pallet_BX505'
CONNECT BY PRIOR
item_id = package_id
LEVEL
■ Pseudo column that tracks the level of a node in hierarchy starting with level 1
for the root node.
■ Can be used in CONNECT BY clause as a filter to limit the depth of hierarchy
package_level item_id package_id
1 pallet_BX505 ship_CX2555
2 box_C1255 pallet_BX505
2 box_C3524 pallet_BX505
2 box_C4520 pallet_BX505
2 box_C4000 pallet_BX505
5 row(s) retrieved.
SELECT level as package_level,
item_id, package_id
FROM inbound_shipment
where level < 3
START WITH item_id = 'pallet_BX505'
CONNECT BY PRIOR item_id =
package_id
NOCYCLE
■ By default, hierarchical queries return an error when they detect a cycle in the data
■ NOCYCLE lets the query return all rows by ignoring the row that causes the cycle
insert into inbound_shipment(item_id,package_id) values ("ship_CX2555",
"pallet_BX505");
package_level item_id package_id
1 pallet_BX505 ship_CX2555
26079: CONNECT BY query resulted in a loop/cycle.
Error in line 9
Near character position 37
SELECT level , item_id, package_id
FROM inbound_shipment
START WITH item_id = 'pallet_BX505'
CONNECT BY PRIOR
item_id = package_id
NOCYCLE Example
package_level item_id package_id
1 pallet_BX505 ship_CX2555
2 ship_CX2555 pallet_BX505
2 box_C1255 pallet_BX505
2 box_C3524 pallet_BX505
2 box_C4520 pallet_BX505
2 box_C4000 pallet_BX505
6 row(s) retrieved.
SELECT level as package_level, item_id, package_id
FROM inbound_shipment
where level < 3
START WITH item_id = 'pallet_BX505'
CONNECT BY NOCYCLE PRIOR item_id = package_id
CONNECT_BY_ISCYCLE
■ Pseudo column that returns 1 for a node that would result in a cycle and 0 otherwise
package_level item_id package_id connect_by_iscycle
1 pallet_BX505 ship_CX2555 0
2 ship_CX2555 pallet_BX505 1
2 box_C1255 pallet_BX505 0
2 box_C3524 pallet_BX505 0
2 box_C4520 pallet_BX505 0
2 box_C4000 pallet_BX505 0
6 row(s) retrieved.
SELECT level as package_level,
item_id, package_id ,
connect_by_iscycle
FROM inbound_shipment
where level < 3
START WITH item_id =
'pallet_BX505'
CONNECT BY NOCYCLE PRIOR
item_id = package_id
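The cycle handling described on the NOCYCLE and CONNECT_BY_ISCYCLE slides can be mimicked outside the database. The following is a minimal Python sketch, not the Informix implementation: the function name `walk`, the exception `CycleError` and the six-row data subset are assumptions made for illustration. A row is flagged when one of its children is already on the path from the root, and that child is either skipped (NOCYCLE) or reported as an error (default).

```python
class CycleError(Exception):
    """Stands in for Informix error 26079 in this sketch (hypothetical name)."""


# Subset of inbound_shipment as (item_id, package_id); the last row closes the loop.
ROWS = [
    ("pallet_BX505", "ship_CX2555"),
    ("box_C1255", "pallet_BX505"),
    ("box_C3524", "pallet_BX505"),
    ("box_C4520", "pallet_BX505"),
    ("box_C4000", "pallet_BX505"),
    ("ship_CX2555", "pallet_BX505"),
]


def walk(rows, start_item, nocycle=False):
    """Return (level, item_id, package_id, is_cycle) for rows reachable from start_item."""
    out = []

    def visit(row, level, ancestors):
        path = ancestors + [row]
        # CONNECT BY PRIOR item_id = package_id: children are packaged in this item.
        children = [r for r in rows if r[1] == row[0]]
        # A row is a cycle node when one of its children is already on the path.
        is_cycle = int(any(child in path for child in children))
        if is_cycle and not nocycle:
            raise CycleError("CONNECT BY query resulted in a loop/cycle")
        out.append((level, row[0], row[1], is_cycle))
        for child in children:
            if child not in path:  # NOCYCLE: do not follow the looping row
                visit(child, level + 1, path)

    for row in rows:
        if row[0] == start_item:  # START WITH item_id = start_item
            visit(row, 1, [])
    return out


for record in walk(ROWS, "pallet_BX505", nocycle=True):
    print(record)
```

With `nocycle=True` the sketch returns the same six rows as the slide, with the flag set only on the ship_CX2555 row; the row order may differ from the server's output. Without it, the call raises `CycleError`.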
CONNECT_BY_ISLEAF Example
package_level item_id package_id connect_by_isleaf
3 band_aid_H10 box_C1255 1
3 band_aid_H12 box_C1255 1
3 A1_pharma_F23 box_C1255 1
3 A1_pharma_F33 box_C1255 1
3 vicks_CK215 box_C3524 1
3 vicks_CK315 box_C3524 1
3 vicks_CK324 box_C3524 1
3 A1_pharma_T30 box_C3524 1
3 A1_pharma_T20 box_C3524 1
3 A1_pharma_T10 box_C3524 1
3 A1_pharma_415 box_C4520 1
3 A1_pharma_413 box_C4520 1
3 A1_pharma_329 box_C4520 1
3 A1_pharma_343 box_C4520 1
3 tylenol_BA341 box_C4000 1
3 tylenol_BA455 box_C4000 1
3 tylenol_BA570 box_C4000 1
3 tylenol_BA521 box_C4000 1
3 tylenol_BA520 box_C4000 1
3 tylenol_BA500 box_C4000 1
20 row(s) retrieved.
SELECT level as
package_level, item_id,
package_id ,
connect_by_isleaf
FROM inbound_shipment
where connect_by_isleaf = 1
START WITH item_id =
'pallet_BX505'
CONNECT BY NOCYCLE
PRIOR item_id = package_id
SYS_CONNECT_BY_PATH
■ Expression which is used to build a string representing a path from the root row
to current row.
■ >>--SYS_CONNECT_BY_PATH--(--string-expression1--,--string-expression2--)--><
path pallet_BX505
item_id pallet_BX505
package_id ship_CX2555
path pallet_BX505box_C1255
item_id box_C1255
package_id pallet_BX505
path pallet_BX505box_C3524
item_id box_C3524
package_id pallet_BX505
path pallet_BX505box_C4520
item_id box_C4520
package_id pallet_BX505
path pallet_BX505box_C4000
item_id box_C4000
package_id pallet_BX505
5 row(s) retrieved.
SELECT
sys_connect_by_path(item_id,"") as path ,
item_id, package_id
FROM inbound_shipment
where level < 3
START WITH item_id = 'pallet_BX505'
CONNECT BY PRIOR item_id = package_id
CONNECT_BY_ROOT
■ Unary operator that, for every row in the hierarchy, returns the value of the expression for the row's root ancestor
■ >>--CONNECT_BY_ROOT--expression----------------------------------><
root item_id package_id
pallet_BX505 pallet_BX505 ship_CX2555
pallet_BX505 box_C1255 pallet_BX505
pallet_BX505 box_C3524 pallet_BX505
pallet_BX505 box_C4520 pallet_BX505
pallet_BX505 box_C4000 pallet_BX505
5 row(s) retrieved.
SELECT
connect_by_root item_id as root,
item_id, package_id
FROM inbound_shipment
where level < 3
START WITH item_id =
'pallet_BX505'
CONNECT BY PRIOR item_id =
package_id
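To make the per-row values concrete, here is a small Python sketch of what SYS_CONNECT_BY_PATH, CONNECT_BY_ROOT and CONNECT_BY_ISLEAF compute during a traversal. It is an illustration under assumptions, not Informix code: the function `describe`, the five-row data subset and the "/" separator are hypothetical, and the data is assumed to be free of cycles.

```python
# Illustrative subset of inbound_shipment as (item_id, package_id).
ROWS = [
    ("pallet_BX505", "ship_CX2555"),
    ("box_C1255", "pallet_BX505"),
    ("box_C4000", "pallet_BX505"),
    ("band_aid_H10", "box_C1255"),
    ("tylenol_BA500", "box_C4000"),
]


def describe(rows, start_item, sep="/"):
    """Return (level, item_id, path, root, is_leaf) for rows reachable from start_item."""
    out = []

    def visit(row, level, parent_path, root):
        item_id = row[0]
        # Path: the parent's path plus separator plus this row's value.
        path = parent_path + sep + item_id
        children = [r for r in rows if r[1] == item_id]
        # Leaf: a row with no children in the hierarchy.
        out.append((level, item_id, path, root, int(not children)))
        for child in children:
            visit(child, level + 1, path, root)

    for row in rows:
        if row[0] == start_item:
            # Root: the value taken from the START WITH row, carried down unchanged.
            visit(row, 1, "", row[0])
    return out


for record in describe(ROWS, "pallet_BX505"):
    print(record)
```

Passing `sep=""` reproduces the concatenated paths shown on the SYS_CONNECT_BY_PATH slide, such as pallet_BX505box_C1255.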
SIBLINGS
■ Attribute of the ORDER BY clause that orders the siblings at every level of the hierarchy
■ Same semantics as ORDER BY, but applied to sibling rows
level item_id package_id
1 pallet_BX505 ship_CX2555
2 box_C1255 pallet_BX505
2 box_C3524 pallet_BX505
2 box_C4000 pallet_BX505
2 box_C4520 pallet_BX505
5 row(s) retrieved.
SELECT level, item_id,
package_id
FROM inbound_shipment
where level < 3
START WITH item_id =
'pallet_BX505'
CONNECT BY PRIOR item_id
= package_id
order SIBLINGS by item_id
Query rewrite & Execution model
• Query rewrite
SELECT level , item_id, package_id
FROM inbound_shipment
START WITH item_id = 'pallet_BX505'
CONNECT BY PRIOR
item_id = package_id
SELECT level , item_id, package_id FROM
( SELECT level, item_id, package_id
FROM inbound_shipment
WHERE item_id = 'pallet_BX505'
UNION ALL
SELECT level, ship.item_id , ship.package_id
FROM inbound_shipment ship, dtab
WHERE ship.package_id = dtab.item_id
)
AS dtab;
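The rewrite above has the shape of a recursive common table expression: a seed branch, UNION ALL, and a branch that joins the base table back to the derived table. The sketch below runs that shape with SQLite's WITH RECURSIVE through Python's sqlite3 module. This is a swapped-in engine, not Informix; the sample rows (including the hypothetical other_item row) and the column name lvl are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inbound_shipment (item_id TEXT, package_id TEXT)")
conn.executemany(
    "INSERT INTO inbound_shipment VALUES (?, ?)",
    [
        ("pallet_BX505", "ship_CX2555"),
        ("box_C1255", "pallet_BX505"),
        ("band_aid_H10", "box_C1255"),
        ("band_aid_H12", "box_C1255"),
        ("other_item", "other_pallet"),  # hypothetical row outside the hierarchy
    ],
)

query = """
WITH RECURSIVE dtab(lvl, item_id, package_id) AS (
    -- seed branch: the START WITH filter
    SELECT 1, item_id, package_id
    FROM inbound_shipment
    WHERE item_id = 'pallet_BX505'
    UNION ALL
    -- recursive branch: the CONNECT BY PRIOR item_id = package_id join
    SELECT dtab.lvl + 1, ship.item_id, ship.package_id
    FROM inbound_shipment AS ship
    JOIN dtab ON ship.package_id = dtab.item_id
)
SELECT lvl, item_id, package_id FROM dtab
"""

rows = conn.execute(query).fetchall()
for row in sorted(rows):
    print(row)
```

The rows are sorted before printing because SQLite does not promise the depth-first order that CONNECT BY output shows.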
Execution model of recursive queries in IDS
[Figure: operator tree with nodes TEMP TABLE (cycle or traversal), SCAN, JOIN, UNION ALL and SORT. Labels: top-level scan on derived table; CONNECT BY filters; ORDER SIBLINGS BY; scan of shipment table (two occurrences).]
sqexplain
QUERY:
SELECT level as package_level, item_id, package_id FROM inbound_shipment
START WITH item_id = 'pallet_BX505' CONNECT BY PRIOR item_id = package_id
Connect by Query Rewrite:
select x0.level ,x0.item_id ,x0.package_id from
(select x1.item_id ,x1.package_id ,x1.item_id ,1 ,1 ,0 from
"informix".inbound_shipment x1 where (x1.item_id = 'pallet_BX505' )
union all
select x2.item_id ,x2.package_id ,x2.item_id ,(level + 1 ) ::integer
,connect_by_isleaf ,dtab_30093_173_stkcol from "informix".inbound_shipment
x2 ,"informix".dtab_30093_173 x0 where (dtab_30093_173_p_item_id =
x2.package_id ) )
x0(item_id,package_id,dtab_30093_173_p_item_id,level,connect_by_isleaf,dtab_30093_173_stkcol)
[Callout: START WITH]
Estimated Cost: 1
Estimated # of Rows Returned: 5
1) informix.dtab_30093_173: COLLECTION SCAN
Subquery:
---------
Estimated Cost: 13
Estimated # of Rows Returned: 5
1) informix.inbound_shipment: SEQUENTIAL SCAN
Filters: informix.inbound_shipment.item_id = 'pallet_BX505'
Union Query:
------------
1) informix.dtab_30093_173: SEQUENTIAL SCAN
2) informix.inbound_shipment: SEQUENTIAL SCAN
DYNAMIC HASH JOIN (Build Outer)
Dynamic Hash Filters: informix.dtab_30093_173.dtab_30093_173_p_item_id =
informix.inbound_shipment.package_id
Query statistics:
Table map :
----------------------------
Internal name Table name
----------------------------
t1 dtab_30093_173
type table rows_prod time
-----------------------------------
clscan t1 25 00:00.00
CONNECT BY Restriction
• Multiple tables are not allowed
SELECT ship.item_id, ord.name
FROM inbound_shipment ship, orders ord
START WITH item_id = 'pallet_BX505'
CONNECT BY PRIOR item_id = package_id
WHERE ship.item_id = ord.item_id
Rewrite to
SELECT item_id, name
FROM (SELECT ship.item_id, ship.package_id, ord.name
      FROM inbound_shipment ship, orders ord
      WHERE ship.item_id = ord.item_id)
START WITH item_id = 'pallet_BX505'
CONNECT BY PRIOR item_id = package_id
Tree node traversal
[Figure: tree with root 10; children 20 and 30; 40 under 20 and 50 under 30.]
select * from t1;
c1 c2
10 0
20 10
30 10
40 20
50 30
20 50
6 row(s) retrieved.
select level, * from t1 start with c1 = 10 connect by prior c1 = c2;
level c1 c2
1 10 0
2 30 10
3 50 30
4 20 50
5 40 20
2 20 10
3 40 20
7 row(s) retrieved.
Paths traversed:
10--30--50--20--40
10--20--40
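The traversal order in this result can be reproduced with an explicit stack. The Python sketch below is only an illustration of a stack-driven depth-first walk over the t1 rows; the function name `connect_by` is hypothetical, and pushing children in table order is an assumption rather than documented server behavior. The sketch has no cycle detection, so it would not terminate on cyclic data.

```python
# Rows of t1 as (c1, c2), in the order returned by "select * from t1".
T1 = [(10, 0), (20, 10), (30, 10), (40, 20), (50, 30), (20, 50)]


def connect_by(rows, start_c1):
    """Depth-first walk driven by an explicit stack; returns (level, c1, c2) tuples."""
    # START WITH c1 = start_c1: seed rows enter the stack at level 1.
    stack = [(1, row) for row in rows if row[0] == start_c1]
    out = []
    while stack:
        level, (c1, c2) = stack.pop()  # POP the most recently pushed row
        out.append((level, c1, c2))
        # CONNECT BY PRIOR c1 = c2: join the popped row to its children and PUSH them.
        for child in rows:
            if child[1] == c1:
                stack.append((level + 1, child))
    return out


for record in connect_by(T1, 10):
    print(record)
```

Because the stack is last-in, first-out, row (30, 10) is expanded before (20, 10), and node 20 appears twice because it has two parents (10 and 50), which matches the seven rows above.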
Child to Parent Traversal
package_level item_id package_id
1 tylenol_BA500 box_C4000
2 box_C4000 pallet_BX505
3 pallet_BX505 ship_CX2555
3 row(s) retrieved.
SELECT level as package_level, item_id, package_id
FROM inbound_shipment
START WITH item_id = 'tylenol_BA500'
CONNECT BY PRIOR package_id = item_id
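Swapping the two sides of the PRIOR condition walks from a child up to its ancestors. A minimal Python sketch of that upward walk follows; the function name `ancestors` and the four-row data subset are assumptions, and it further assumes that each item_id appears in only one row.

```python
# Illustrative subset of inbound_shipment as (item_id, package_id).
ROWS = [
    ("pallet_BX505", "ship_CX2555"),
    ("box_C1255", "pallet_BX505"),
    ("box_C4000", "pallet_BX505"),
    ("tylenol_BA500", "box_C4000"),
]


def ancestors(rows, start_item):
    """Follow package_id links upward; returns (level, item_id, package_id) tuples."""
    package_of = {item_id: package_id for item_id, package_id in rows}
    out = []
    seen = set()  # guards against looping data in this sketch
    item, level = start_item, 1
    # CONNECT BY PRIOR package_id = item_id: the next row is the current row's package.
    while item in package_of and item not in seen:
        seen.add(item)
        out.append((level, item, package_of[item]))
        item = package_of[item]
        level += 1
    return out


for record in ancestors(ROWS, "tylenol_BA500"):
    print(record)
```

The walk stops at pallet_BX505 because ship_CX2555 has no row of its own in this subset, giving the same three rows as the slide.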
SEQUENCE NUMBER GENERATOR
SELECT level FROM sysmaster:sysdual CONNECT BY level <= 10    -- sysdual is a single-row table
Connect by Query Rewrite:
---------------------------
select x0.level from (select 1 ,1 ,0 from sysmaster:"informix".sysdual x1 union all select (level + 1 ) ::integer ,connect_by_isleaf
,dtab_27465_191_stkcol from sysmaster:"informix".sysdual x2 ,"informix".dtab_27465_191 x0 where ((level + 1 ) <= 10. ) )
x0(level,connect_by_isleaf,dtab_27465_191_stkcol)
1) informix.dtab_27465_191: COLLECTION SCAN
Subquery:
---------
Estimated Cost: 5
Estimated # of Rows Returned: 2
1) sysmaster:informix.sysdual: SEQUENTIAL SCAN
Union Query:
------------
1) informix.dtab_27465_191: SEQUENTIAL SCAN
Filters: informix.dtab_27465_191.level + 1 <= 10
2) sysmaster:informix.sysdual: SEQUENTIAL SCAN
NESTED LOOP JOIN
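The rewrite shown in this explain output seeds the level at 1 and keeps adding a row while level + 1 <= 10. The same seed-and-step pattern can be tried with a recursive common table expression in SQLite through Python's sqlite3 module. This is a stand-in engine, not Informix, and the column name lvl is an assumption chosen for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

query = """
WITH RECURSIVE seq(lvl) AS (
    SELECT 1                                      -- seed: the single starting row
    UNION ALL
    SELECT lvl + 1 FROM seq WHERE lvl + 1 <= 10   -- step: mirrors (level + 1) <= 10
)
SELECT lvl FROM seq
"""

numbers = [row[0] for row in conn.execute(query)]
print(numbers)
```

Changing the bound in the step condition changes how many numbers are generated.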
Performance Considerations
• Queries are recursive and involve repeated self-joins
• Use the PRIOR keyword; without it the query can run forever
• The temp dbspace is used for hierarchy traversal (stack) and cycle detection
• Configure DBSPACETEMP
Conclusion
• Simple queries for complex reporting
• Useful for single or multiple data tree structures
• Easy to map the path between two nodes/rows
Questions?
Ajaykumar Gupte
gupte@us.ibm.com
Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...
Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...
 
GraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph TechnologyGraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph Technology
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)
 
Vitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdfVitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdf
 
Using Xen Hypervisor for Functional Safety
Using Xen Hypervisor for Functional SafetyUsing Xen Hypervisor for Functional Safety
Using Xen Hypervisor for Functional Safety
 
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxTop Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
 
DDS-Security 1.2 - What's New? Stronger security for long-running systems
DDS-Security 1.2 - What's New? Stronger security for long-running systemsDDS-Security 1.2 - What's New? Stronger security for long-running systems
DDS-Security 1.2 - What's New? Stronger security for long-running systems
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
 
SWEBOK and Education at FUSE Okinawa 2024
SWEBOK and Education at FUSE Okinawa 2024SWEBOK and Education at FUSE Okinawa 2024
SWEBOK and Education at FUSE Okinawa 2024
 
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit ParisNeo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
 
Transform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR SolutionsTransform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR Solutions
 
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of CodeA Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
 
Microservice Teams - How the cloud changes the way we work
Microservice Teams - How the cloud changes the way we workMicroservice Teams - How the cloud changes the way we work
Microservice Teams - How the cloud changes the way we work
 
Artificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension FunctionsArtificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension Functions
 
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket ManagementUtilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
 
Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️
 

Building a Hierarchical Data Model Using the Latest IBM Informix Features

  • 1. Building a Hierarchical Data Model Using the Latest IBM Informix Features. Ajaykumar Gupte, gupte@us.ibm.com
  • 2. Agenda
      ● Problem of querying hierarchical data
      ● Hierarchical data design
      ● "Connect By": keywords & pseudo columns
      ● Execution model
      ● Query transformation
  • 3. Problem of querying hierarchical data
      • A common technique for storing hierarchical data in relational tables is self-reference
        – Employee-Manager
          • Employee table (key: empid)
          • Every employee has a manager (indicated by mgrid)
          • A manager is also an employee (with a valid empid)
        – Shipment
          • Inbound shipment table (key: item_id)
          • Each item can belong to a package (key: package_id)
          • Every package is itself an item (with a valid item_id)

      CREATE TABLE employee (
          empid  INTEGER NOT NULL PRIMARY KEY,
          name   VARCHAR(10),
          salary DECIMAL(9, 2),
          mgrid  INTEGER);

      CREATE TABLE inbound_shipment (
          shipment_id VARCHAR(50),
          item_id     VARCHAR(20),
          package_id  VARCHAR(20),
          .......
  • 4. [Figure: shipment hierarchy. ship_CX2555 at the top; Pallet_BX505 and further pallets ("Pallet xxx") below it; boxes box_C1255, box_C3524 and box_C4000 below the pallets; items (band_aid…, A1_pharma.., vicks_.., Tylenol..) at the bottom.]
  • 5. Characteristics/Limitations
      ■ Multi-step approach that requires complex application/SPL logic
      ■ Recursive self-join
      ■ Filtering, ordering and grouping require further additions
      ■ Joining the results with other tables becomes complex
      ■ Reuse by other applications requires
        – an understanding of the complex logic (data placement etc.)
        – more customization
  • 6. Using CONNECT BY to discover data hierarchy
      SELECT level AS package_level, item_id, package_id
      FROM inbound_shipment
      START WITH item_id = 'pallet_BX505'        -- seed of recursion
      CONNECT BY PRIOR item_id = package_id      -- condition of recursion
  • 7. Results of CONNECT BY Query
      package_level  item_id        package_id
      1              pallet_BX505   ship_CX2555
      2              box_C1255      pallet_BX505
      3              band_aid_H10   box_C1255
      3              band_aid_H12   box_C1255
      3              A1_pharma_F23  box_C1255
      3              A1_pharma_F33  box_C1255
  • 8. Hierarchical view of data
      [Figure: organization tree of empid values: 17 at the root; 15 and 16 below it; then 10, 13, 11, 12, 14; then 1 to 9 at the bottom. Node 16 is labeled "Goyal".]

      SELECT name, empid, mgrid
      FROM emp
      START WITH name = 'Goyal'
      CONNECT BY PRIOR empid = mgrid

      name      empid  mgrid
      Goyal     16     17
      Zander    11     16
      McKeough  5      11
      Barnes    6      11
      Henry     12     16
      O'Neil    7      12
      Smith     8      12
      Shoeman   9      12
      Scott     14     16
  • 9. Flow of Execution
      [Figure: the employee tree from the previous slide beside a stack with PUSH and POP operations feeding a JOIN; node ids shown around the stack: 16, 11, 12, 14, 5, 6, 7, 8, 9.]

      SELECT name, empid, mgrid
      FROM emp
      START WITH name = 'Goyal'
      CONNECT BY PRIOR empid = mgrid
  • 10. Where is hierarchical data?
      ■ Bill of materials
      ■ Reporting structure
      ■ Package tracking
      ■ Inventory management
      ■ Social media
      ■ Date/time
      ■ Geography/region
  • 11. PRIOR
      ■ The unary operator PRIOR is used in the join filter to distinguish column references to the previous recursive step from column references to the base table.
      ■ A query without PRIOR can run forever or return only a single row.

      SELECT level, item_id, package_id
      FROM inbound_shipment
      START WITH item_id = 'pallet_BX505'
      CONNECT BY PRIOR item_id = package_id

      package_level  item_id        package_id
      1              pallet_BX505   ship_CX2555
      2              box_C1255      pallet_BX505
      3              band_aid_H10   box_C1255
      3              band_aid_H12   box_C1255
      3              A1_pharma_F23  box_C1255
      3              A1_pharma_F33  box_C1255
      2              box_C3524      pallet_BX505
      3              vicks_CK215    box_C3524
      3              vicks_CK315    box_C3524
      3              vicks_CK324    box_C3524
  • 12. LEVEL
      ■ Pseudo column that tracks the level of a node in the hierarchy, starting with level 1 for the root node.
      ■ Can be used in the CONNECT BY clause as a filter to limit the depth of the hierarchy.

      SELECT level AS package_level, item_id, package_id
      FROM inbound_shipment
      WHERE level < 3
      START WITH item_id = 'pallet_BX505'
      CONNECT BY PRIOR item_id = package_id

      package_level  item_id       package_id
      1              pallet_BX505  ship_CX2555
      2              box_C1255     pallet_BX505
      2              box_C3524     pallet_BX505
      2              box_C4520     pallet_BX505
      2              box_C4000     pallet_BX505
      5 row(s) retrieved.
  • 13. NOCYCLE
      ■ By default, a hierarchical query returns an error when it detects a cycle in the data.
      ■ NOCYCLE lets the query return all rows by ignoring the row that causes the cycle.

      insert into inbound_shipment(item_id,package_id) values ("ship_CX2555", "pallet_BX505");

      SELECT level, item_id, package_id
      FROM inbound_shipment
      START WITH item_id = 'pallet_BX505'
      CONNECT BY PRIOR item_id = package_id

      package_level  item_id       package_id
      1              pallet_BX505  ship_CX2555
      26079: CONNECT BY query resulted in a loop/cycle.
      Error in line 9
      Near character position 37
  • 14. NOCYCLE Example
      SELECT level AS package_level, item_id, package_id
      FROM inbound_shipment
      WHERE level < 3
      START WITH item_id = 'pallet_BX505'
      CONNECT BY NOCYCLE PRIOR item_id = package_id

      package_level  item_id       package_id
      1              pallet_BX505  ship_CX2555
      2              ship_CX2555   pallet_BX505
      2              box_C1255     pallet_BX505
      2              box_C3524     pallet_BX505
      2              box_C4520     pallet_BX505
      2              box_C4000     pallet_BX505
      6 row(s) retrieved.
  • 15. CONNECT_BY_ISCYCLE
      ■ Identifies the nodes that would result in a cycle

      SELECT level AS package_level, item_id, package_id, connect_by_iscycle
      FROM inbound_shipment
      WHERE level < 3
      START WITH item_id = 'pallet_BX505'
      CONNECT BY NOCYCLE PRIOR item_id = package_id

      package_level  item_id       package_id    connect_by_iscycle
      1              pallet_BX505  ship_CX2555   0
      2              ship_CX2555   pallet_BX505  1
      2              box_C1255     pallet_BX505  0
      2              box_C3524     pallet_BX505  0
      2              box_C4520     pallet_BX505  0
      2              box_C4000     pallet_BX505  0
      6 row(s) retrieved.
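The NOCYCLE and CONNECT_BY_ISCYCLE behavior on slides 13 to 15 can be modeled outside the database. What follows is a minimal Python sketch, not Informix's implementation: `connect_by` and `ROWS` are hypothetical names, the rows are only the level 1 and level 2 rows from the slides, and the rule "a row is flagged when one of its children is already on the path from the root" is an assumption that reproduces the flags shown on slide 15.

```python
# Hypothetical subset of inbound_shipment as (item_id, package_id) pairs,
# including the extra row from slide 13 that closes the loop.
ROWS = [
    ("pallet_BX505", "ship_CX2555"),
    ("ship_CX2555", "pallet_BX505"),
    ("box_C1255", "pallet_BX505"),
    ("box_C3524", "pallet_BX505"),
    ("box_C4520", "pallet_BX505"),
    ("box_C4000", "pallet_BX505"),
]


def connect_by(rows, start_item, nocycle=False):
    """Walk parent -> child (PRIOR item_id = package_id), depth first.

    Returns (level, item_id, package_id, iscycle) tuples.
    """
    out = []

    def visit(row, level, ancestors):
        item_id, package_id = row
        path = ancestors | {item_id}
        children = [r for r in rows if r[1] == item_id]
        # A child that is already on the path would revisit an ancestor.
        looping = [r for r in children if r[0] in path]
        if looping and not nocycle:
            raise RuntimeError("CONNECT BY query resulted in a loop/cycle")
        out.append((level, item_id, package_id, 1 if looping else 0))
        for child in children:
            if child[0] not in path:  # NOCYCLE: skip the cycle-causing row
                visit(child, level + 1, path)

    for row in rows:
        if row[0] == start_item:
            visit(row, 1, frozenset())
    return out


result = connect_by(ROWS, "pallet_BX505", nocycle=True)
for row in result:
    print(row)
```

Calling `connect_by(ROWS, "pallet_BX505")` without `nocycle=True` raises an error in this sketch, which mirrors the default behavior on slide 13.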
  • 16. CONNECT_BY_ISLEAF Example
      SELECT level AS package_level, item_id, package_id, connect_by_isleaf
      FROM inbound_shipment
      WHERE connect_by_isleaf = 1
      START WITH item_id = 'pallet_BX505'
      CONNECT BY NOCYCLE PRIOR item_id = package_id

      package_level  item_id        package_id  connect_by_isleaf
      3              band_aid_H10   box_C1255   1
      3              band_aid_H12   box_C1255   1
      3              A1_pharma_F23  box_C1255   1
      3              A1_pharma_F33  box_C1255   1
      3              vicks_CK215    box_C3524   1
      3              vicks_CK315    box_C3524   1
      3              vicks_CK324    box_C3524   1
      3              A1_pharma_T30  box_C3524   1
      3              A1_pharma_T20  box_C3524   1
      3              A1_pharma_T10  box_C3524   1
      3              A1_pharma_415  box_C4520   1
      3              A1_pharma_413  box_C4520   1
      3              A1_pharma_329  box_C4520   1
      3              A1_pharma_343  box_C4520   1
      3              tylenol_BA341  box_C4000   1
      3              tylenol_BA455  box_C4000   1
      3              tylenol_BA570  box_C4000   1
      3              tylenol_BA521  box_C4000   1
      3              tylenol_BA520  box_C4000   1
      3              tylenol_BA500  box_C4000   1
      20 row(s) retrieved.
  • 17. SYS_CONNECT_BY_PATH
      ■ Expression used to build a string that represents the path from the root row to the current row.
      ■ >>--SYS_CONNECT_BY_PATH--(--string-expression1--,--string-expression2--)--><

      SELECT sys_connect_by_path(item_id,"") AS path, item_id, package_id
      FROM inbound_shipment
      WHERE level < 3
      START WITH item_id = 'pallet_BX505'
      CONNECT BY PRIOR item_id = package_id

      path        pallet_BX505
      item_id     pallet_BX505
      package_id  ship_CX2555

      path        pallet_BX505box_C1255
      item_id     box_C1255
      package_id  pallet_BX505

      path        pallet_BX505box_C3524
      item_id     box_C3524
      package_id  pallet_BX505

      path        pallet_BX505box_C4520
      item_id     box_C4520
      package_id  pallet_BX505

      path        pallet_BX505box_C4000
      item_id     box_C4000
      package_id  pallet_BX505

      5 row(s) retrieved.
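To make the path construction on slide 17 concrete, here is a small Python sketch of how such a path string can be accumulated during traversal. It is an illustration only: `build_paths` and `ROWS` are hypothetical names, and the rule that the separator is placed in front of every step is an assumption that is consistent with the slide's output for an empty separator.

```python
# Hypothetical subset of inbound_shipment as (item_id, package_id) pairs.
ROWS = [
    ("pallet_BX505", "ship_CX2555"),
    ("box_C1255", "pallet_BX505"),
    ("box_C3524", "pallet_BX505"),
    ("box_C4520", "pallet_BX505"),
    ("box_C4000", "pallet_BX505"),
]


def build_paths(rows, start_item, sep):
    """Return (path, item_id, package_id) for each row reachable from start_item.

    Assumes the data contains no cycle; otherwise the recursion would not end.
    """
    out = []

    def visit(row, prefix):
        item_id, package_id = row
        path = prefix + sep + item_id  # one separator in front of every step
        out.append((path, item_id, package_id))
        for child in rows:
            if child[1] == item_id:  # PRIOR item_id = package_id
                visit(child, path)

    for row in rows:
        if row[0] == start_item:
            visit(row, "")
    return out


for path, _, _ in build_paths(ROWS, "pallet_BX505", "/"):
    print(path)
```

With an empty separator the sketch yields the concatenated strings shown on the slide (for example `pallet_BX505box_C1255`); a visible separator such as `/` makes the individual steps easier to read.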
  • 18. CONNECT_BY_ROOT
      ■ Unary operator that, for every row in the hierarchy, returns the expression for the row's root ancestor
      ■ >>--CONNECT_BY_ROOT--expression----------------------------------><

      SELECT connect_by_root item_id AS root, item_id, package_id
      FROM inbound_shipment
      WHERE level < 3
      START WITH item_id = 'pallet_BX505'
      CONNECT BY PRIOR item_id = package_id

      root          item_id       package_id
      pallet_BX505  pallet_BX505  ship_CX2555
      pallet_BX505  box_C1255     pallet_BX505
      pallet_BX505  box_C3524     pallet_BX505
      pallet_BX505  box_C4520     pallet_BX505
      pallet_BX505  box_C4000     pallet_BX505
      5 row(s) retrieved.
  • 19. SIBLINGS
      ■ Attribute of the ORDER BY clause that orders the siblings at every level of the hierarchy
      ■ Same semantics as ORDER BY, but applied among sibling rows

      SELECT level, item_id, package_id
      FROM inbound_shipment
      WHERE level < 3
      START WITH item_id = 'pallet_BX505'
      CONNECT BY PRIOR item_id = package_id
      ORDER SIBLINGS BY item_id

      level  item_id       package_id
      1      pallet_BX505  ship_CX2555
      2      box_C1255     pallet_BX505
      2      box_C3524     pallet_BX505
      2      box_C4000     pallet_BX505
      2      box_C4520     pallet_BX505
      5 row(s) retrieved.
  • 20. Query rewrite & Execution model
      • Query rewrite

      SELECT level, item_id, package_id
      FROM inbound_shipment
      START WITH item_id = 'pallet_BX505'
      CONNECT BY PRIOR item_id = package_id

      is rewritten to:

      SELECT level, item_id, package_id
      FROM (
          SELECT level, item_id, package_id
          FROM inbound_shipment
          WHERE item_id = 'pallet_BX505'
          UNION ALL
          SELECT level, ship.item_id, ship.package_id
          FROM inbound_shipment ship, dtab
          WHERE ship.package_id = dtab.item_id
      ) AS dtab;
  • 21. Execution model of recursive queries in IDS
      [Figure: operator tree. Operator labels: TEMP TABLE, CYCLE OR TRAVERSAL, SCAN, JOIN, UNION ALL, SORT. Annotations: "Scan of shipment table" (twice), "order siblings by", "Connect by filters", "Top level scan on derived table".]
  • 22. sqexplain
      QUERY:
      SELECT level as package_level, item_id, package_id
      FROM inbound_shipment
      START WITH item_id = 'pallet_BX505'
      CONNECT BY PRIOR item_id = package_id

      Connect by Query Rewrite:
      select x0.level ,x0.item_id ,x0.package_id
      from (select x1.item_id ,x1.package_id ,x1.item_id ,1 ,1 ,0
            from "informix".inbound_shipment x1
            where (x1.item_id = 'pallet_BX505' )
            union all
            select x2.item_id ,x2.package_id ,x2.item_id ,(level + 1 ) ::integer ,connect_by_isleaf ,dtab_30093_173_stkcol
            from "informix".inbound_shipment x2 ,"informix".dtab_30093_173 x0
            where (dtab_30093_173_p_item_id = x2.package_id ) )
      X0 (item_id,package_id,dtab_30093_173_p_item_id,level,connect_by_isleaf,dtab_30093_173_stkcol)

      [Slide annotation: START WITH]
  • 23. Estimated Cost: 1
      Estimated # of Rows Returned: 5

        1) informix.dtab_30093_173: COLLECTION SCAN

          Subquery:
          ---------
          Estimated Cost: 13
          Estimated # of Rows Returned: 5

            1) informix.inbound_shipment: SEQUENTIAL SCAN
                Filters: informix.inbound_shipment.item_id = 'pallet_BX505'

          Union Query:
          ------------
            1) informix.dtab_30093_173: SEQUENTIAL SCAN
            2) informix.inbound_shipment: SEQUENTIAL SCAN

          DYNAMIC HASH JOIN (Build Outer)
            Dynamic Hash Filters: informix.dtab_30093_173.dtab_30093_173_p_item_id = informix.inbound_shipment.package_id

      Query statistics:
        Table map :
        ----------------------------
        Internal name     Table name
        ----------------------------
        t1                dtab_30093_173

        type     table  rows_prod  time
        -----------------------------------
        clscan   t1     25         00:00.00
  • 24. CONNECT BY Restriction
      ■ Multiple tables are not allowed

      SELECT ship.item_id, ord.name
      FROM inbound_shipment ship, orders ord
      START WITH item_id = 'pallet_BX505'
      CONNECT BY PRIOR item_id = package_id
      WHERE ship.item_id = ord.item_id

      Rewrite to:

      SELECT item_id, name
      FROM (SELECT ship.item_id, ship.package_id, ord.name
            FROM inbound_shipment ship, orders ord
            WHERE ship.item_id = ord.item_id)
      START WITH item_id = 'pallet_BX505'
      CONNECT BY PRIOR item_id = package_id
  • 25. Tree node traversal
      [Figure: tree with nodes 10, 20, 30, 40, 50]

      select * from t1;

      c1  c2
      10  0
      20  10
      30  10
      40  20
      50  30
      20  50
      6 row(s) retrieved.

      select level, * from t1 start with c1 = 10 connect by prior c1 = c2;

      level  c1  c2
      1      10  0
      2      30  10
      3      50  30
      4      20  50
      5      40  20
      2      20  10
      3      40  20
      7 row(s) retrieved.

      Paths: 10--30--50--20--40 and 10--20--40
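The traversal on slide 25, together with the PUSH/POP stack from the "Flow of Execution" slide, can be emulated in a few lines. This is a minimal Python sketch under stated assumptions, not the server's algorithm: `traverse` and `T1` are hypothetical names, children are pushed in table order and popped last-in first-out, and no cycle detection is performed. With the six rows of `t1` this ordering reproduces the seven result rows in the order shown on the slide.

```python
# Rows of t1 as (c1, c2), in the order returned by "select * from t1".
T1 = [(10, 0), (20, 10), (30, 10), (40, 20), (50, 30), (20, 50)]


def traverse(rows, start_c1):
    """Emulate: start with c1 = start_c1 connect by prior c1 = c2."""
    # Seed the stack with the START WITH rows at level 1.
    stack = [(1, row) for row in rows if row[0] == start_c1]
    stack.reverse()
    out = []
    while stack:
        level, (c1, c2) = stack.pop()  # POP the next node
        out.append((level, c1, c2))
        # JOIN: children are the rows whose c2 equals the popped row's c1.
        for child in rows:
            if child[1] == c1:
                stack.append((level + 1, child))  # PUSH
    return out


for level, c1, c2 in traverse(T1, 10):
    print(level, c1, c2)
```

Node 20 appears twice in the output because it is reachable both directly from 10 and through 50, which is why six table rows produce seven result rows.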
  • 26. Child to Parent Traversal
      SELECT level AS package_level, item_id, package_id
      FROM inbound_shipment
      START WITH item_id = 'tylenol_BA500'
      CONNECT BY PRIOR package_id = item_id

      package_level  item_id        package_id
      1              tylenol_BA500  box_C4000
      2              box_C4000      pallet_BX505
      3              pallet_BX505   ship_CX2555
      3 row(s) retrieved.
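Swapping the sides of PRIOR on slide 26 turns the query into a walk from a unit up to its enclosing packages. A minimal Python sketch of that upward walk follows; `walk_up` and `PACKAGE_OF` are hypothetical names, and the sketch assumes every item has at most one package and that the data contains no cycle.

```python
# Hypothetical item_id -> package_id mapping for the three rows on the slide.
PACKAGE_OF = {
    "tylenol_BA500": "box_C4000",
    "box_C4000": "pallet_BX505",
    "pallet_BX505": "ship_CX2555",
}


def walk_up(package_of, start_item):
    """Follow PRIOR package_id = item_id from a unit towards the shipment."""
    out = []
    level, item = 1, start_item
    while item in package_of:  # stop when no row exists for the item
        out.append((level, item, package_of[item]))
        item = package_of[item]
        level += 1
    return out


for row in walk_up(PACKAGE_OF, "tylenol_BA500"):
    print(row)
```

The walk stops at `ship_CX2555` because no row with that item_id exists in the mapping, which matches the three rows returned on the slide.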
  • 27. SEQUENCE NUMBER GENERATOR
      SELECT level
      FROM sysmaster:sysdual        -- single-row table
      CONNECT BY level <= 10

      Connect by Query Rewrite:
      ---------------------------
      select x0.level
      from (select 1 ,1 ,0
            from sysmaster:"informix".sysdual x1
            union all
            select (level + 1 ) ::integer ,connect_by_isleaf ,dtab_27465_191_stkcol
            from sysmaster:"informix".sysdual x2 ,"informix".dtab_27465_191 x0
            where ((level + 1 ) <= 10. ) )
      x0(level,connect_by_isleaf,dtab_27465_191_stkcol)

        1) informix.dtab_27465_191: COLLECTION SCAN

          Subquery:
          ---------
          Estimated Cost: 5
          Estimated # of Rows Returned: 2

            1) sysmaster:informix.sysdual: SEQUENTIAL SCAN

          Union Query:
          ------------
            1) informix.dtab_27465_191: SEQUENTIAL SCAN
                Filters: informix.dtab_27465_191.level + 1 <= 10
            2) sysmaster:informix.sysdual: SEQUENTIAL SCAN

          NESTED LOOP JOIN
  • 28. Performance Considerations
      • Queries are recursive and involve repeated self-joins
      • Use the PRIOR keyword, otherwise the query can run forever
      • Temp dbspace is used for hierarchy traversal (stack) and cycle detection
      • Configure DBSPACETEMP
  • 29. Conclusion
      • Simple queries for complex reporting
      • Useful for single or multiple data tree structures
      • Easy to map the path between two nodes/rows

Editor's Notes

  1. Employee-Manager
      • All employees reporting to "Goyal"
      • Entire organization chart for "Goyal"
      • All managers under Goyal with salary < $X
      • All non-manager employees under Goyal with salary < $Y
     Shipment
      • List all items from pallet #10
      • Which product units are inside pallet #10?
      • Find the pallet number of a unit (upc 456….)
      • Display all products from a pallet by scanning a single unit with upc (678….)
      • Count the number of boxes from a pallet by scanning a single unit with upc (567….)
      • Count the number of product units & boxes from a pallet by scanning a single unit with upc (567….)
  2. List all items/boxes from pallet "pallet_BX505"
      Step 1. Fetch the row from inbound_shipment where item_id = "pallet_BX505"
      Step 2. Materialize the result of step 1 into a TEMP table
      Step 3. Join the result of step 2 back to inbound_shipment such that item_id from step 2 = package_id (similar to a self-join)
      Step 4. Materialize the results of step 3 into a TEMP table
      Step 5. Repeat steps 3 and 4 until step 3 returns no data, i.e. the join produces no rows
  3. A hierarchical query operates on rows, which correspond to nodes within a logical structure of parent-child relationships. If parent rows have multiple children, sibling relationships exist among child rows of the same parent. These relationships might reflect, for example, the reporting structure among employees and managers within the divisions and management levels of an organization. Important: Hierarchical queries are most efficient for data sets in which parent-child dependencies in the table have the logical topology of a simple graph. If the self-referencing table includes more than one independent hierarchy for the same set of columns, or if any child row is also an ancestor of its parent, see also the section Dependency patterns that are not a simple graph.
  4. Pseudo column that returns 1 or 0 to indicate whether the row results in a cycle, i.e. whether joining the row back to the base table would produce a cycle. It identifies the nodes that would result in a cycle. It can be used only when the NOCYCLE attribute is used, and it cannot be used in the START WITH and CONNECT BY clauses.
  5. This pseudo column returns 1 or 0 depending on whether the node is a leaf node. A node is a leaf node if it has no children in the query result hierarchy (not in the actual data hierarchy). It cannot appear in the START WITH and CONNECT BY clauses.
  6. CONNECT BY queries are supported inside views and derived tables, inside subqueries, and in SPL routines (static and dynamic statements in SPL). CONNECT BY queries do not support joins in the FROM clause. The workaround is to rewrite the query so that the join is pushed down into a derived table in the FROM clause of the CONNECT BY query.
  7. Queries are optimized exactly like normal SQL queries. Access paths and join types are chosen based on available statistics. Subqueries with CONNECT BY are not flattened (merged into the parent query block). Views with CONNECT BY, and views referenced in the FROM clause of CONNECT BY queries, are always materialized.
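The multi-step approach listed in note 2 (fetch the seed row, materialize it, join back to the base table, repeat until the join is empty) is the application logic that CONNECT BY replaces. The following Python sketch walks through those steps on in-memory data; `expand_by_steps` and `ROWS` are hypothetical names, the rows are a small subset of the slide data, and lists stand in for the TEMP tables. Like the manual approach it models, the sketch has no cycle guard and would loop forever on cyclic data.

```python
# Hypothetical subset of inbound_shipment as (item_id, package_id) pairs.
ROWS = [
    ("pallet_BX505", "ship_CX2555"),
    ("box_C1255", "pallet_BX505"),
    ("box_C3524", "pallet_BX505"),
    ("band_aid_H10", "box_C1255"),
    ("vicks_CK215", "box_C3524"),
]


def expand_by_steps(rows, start_item):
    """Collect a pallet's contents one join step at a time."""
    # Step 1: fetch the seed row(s).
    current = [r for r in rows if r[0] == start_item]
    collected = []
    while current:  # Step 5: stop when the join returns no rows
        collected.extend(current)  # Steps 2 and 4: materialize the step result
        parents = {item_id for item_id, _ in current}
        # Step 3: join back to the base table on item_id = package_id.
        current = [r for r in rows if r[1] in parents]
    return collected


for item_id, package_id in expand_by_steps(ROWS, "pallet_BX505"):
    print(item_id, package_id)
```

Each loop iteration corresponds to one level of the hierarchy, so the rows come back level by level rather than in the depth-first order shown in the CONNECT BY results.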