The Story About The Migration

The story about the migration
Sushant Pandey Engineering Architect
Postgres Build 2020, Dec 9.
Alicja Kucharczyk EMEA Global Black Belt
OSS Data Tech Specialist

Overview
Oracle Exadata
to
Azure Database for PostgreSQL – Single Server, v11
4vCores, 20GB (Postgres) <–> 4vCores, 32GB (Oracle)

PoC Scope
2 packages from 1 schema
+ single objects and schemas from dependent packages, e.g. for
logging

Success Criteria
• Comparable performance to the existing Oracle Exadata instance
• Based on test script provided by customer which tests performance
of 2 packages.

Migration Tooling
Schema and Data migration

Results
 Schema and Data migration

Test Set
Test 1
Node 1
duration
Node 2
duration
[...]

Test 1 - Tree Traversal
Node Oracle
(ms)
Postgres
(ms)
Postgres vs. Oracle %
Node 1 1 16 6,25
Node 2
1 16 6,25
Node 3 0 0 100
Node 4 0 0 100
Node 5 0 0 100
Node 6 0 0 100
Node 7 2 0 200
Total
4 32 12,5

Test 2 – Cursor Free Execution
Node Oracle
(ms)
Postgres
(ms)
Node 1 738 62 1190
Node 2 738 47 1570
Node 3 1427 47 3036
Node 4 1424 63 2260
Node 5 1132 31 3651
Node 6 1154 31 3722
Node 7 2340 344 680
Node 8 2256 375 601
Node 9 2240 422 530
Node 10 2242 375 597
Total 15691 1797 873

Test 3 – Cursor Free Execution – datatype mapping scenario
Node Oracle
(ms)
Postgres
(ms)
Node 1 28 31 90
Node 2 27 16 168,75
Total 55 47 117

SYS_CONNECT_BY_PATH
valid only in hierarchical
queries. It returns the path
of a column value from root
to node, with column values
separated by char for each
row returned by CONNECT
BY condition.
SELECT LPAD(' ', 2*level-1)||SYS_CONNECT_BY_PATH(last_name, '/')
"Path"
FROM employees
START WITH last_name = 'Kochhar'
CONNECT BY PRIOR employee_id = manager_id;
Path
---------------------------------------------------------------
/Kochhar
/Kochhar/Greenberg
/Kochhar/Greenberg/Faviet
/Kochhar/Greenberg/Chen
/Kochhar/Greenberg/Sciarra
/Kochhar/Greenberg/Urman
/Kochhar/Greenberg/Popp
/Kochhar/Whalen
/Kochhar/Mavris
/Kochhar/Baer
/Kochhar/Higgins
/Kochhar/Higgins/Gietz
https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions164.htm

CREATE OR REPLACE TYPE FOO_TYPE IS
TABLE OF INTEGER;
CREATE OR REPLACE PROCEDURE SysConnectByPath
IS
bar_ids_tab FOO_TYPE := foo_type(111, 20, 3, 4, 5);
bar_ids_string VARCHAR(1000);
BEGIN
SELECT SUBSTR(SYS_CONNECT_BY_PATH(column_value, ','), 2) csv
INTO bar_ids_string
FROM (SELECT column_value,
ROW_NUMBER() OVER (ORDER BY column_value ) rn,
COUNT(*) OVER () cnt
FROM TABLE (bar_ids_tab))
WHERE rn = cnt
START WITH rn = 1
CONNECT BY rn = PRIOR rn + 1;
DBMS_OUTPUT.PUT_LINE('out ' || bar_ids_string);
END;
completed in 45 ms
out 3,4,5,20,111

ora2pg -i SysConnectByPath.sql -t PROCEDURE -c config/ora2pg.conf
CREATE OR REPLACE FUNCTION sysconnectbypath() RETURNS VOID AS
$body$
DECLARE
bar_ids_string VARCHAR(1000);WITH RECURSIVE cte AS (
BEGIN
SELECT SUBSTR(column_value, 2) CSV
INTO STRICT bar_ids_string
FROM TABLE(bar_ids_tab) alias6) alias7
WHERE rn = 1
UNION ALL
BEGIN
SELECT C.bar_ids_string || ',' || SUBSTR(column_value, 2) CSV
INTO STRICT bar_ids_string
FROM TABLE(bar_ids_tab) alias6) JOIN cte C ON (C.rn + 1 = alias7.rn)
) SELECT * FROM cte WHERE rn = cnt;
;
RAISE NOTICE 'out %', bar_ids_string;
END;
$body$
LANGUAGE PLPGSQL;

Let’s get rid of WITH RECURSIVE
DO
$$
DECLARE
bar_ids_tab NUMERIC[];
bar_ids_string TEXT;
BEGIN
bar_ids_tab := '{111, 20, 3, 4, 5}';
SELECT string_agg(x::TEXT, ',')
INTO bar_ids_string
FROM (
SELECT unnest(bar_ids_tab) AS x
ORDER BY x) a;
RAISE NOTICE '%', bar_ids_string;
END;
$$
[00000] 3,4,5,20,111
completed in 3 ms

Postgres WINS!
DO
$$
DECLARE
bar_ids_tab NUMERIC[];
bar_ids_string TEXT;
BEGIN
bar_ids_tab := '{111, 20, 3, 4, 5}';
SELECT string_agg(x::TEXT, ',')
INTO bar_ids_string
FROM (
SELECT unnest(bar_ids_tab) AS x
ORDER BY x) a;
RAISE NOTICE '%', bar_ids_string;
END;
$$
CREATE OR REPLACE TYPE FOO_TYPE IS
TABLE OF INTEGER;
CREATE OR REPLACE PROCEDURE SysConnectByPath
IS
bar_ids_string VARCHAR(1000);
BEGIN
SELECT SUBSTR(SYS_CONNECT_BY_PATH(column_value, ','), 2) csv
INTO bar_ids_string
FROM TABLE (bar_ids_tab))
WHERE rn = cnt
START WITH rn = 1
CONNECT BY rn = PRIOR rn + 1;
DBMS_OUTPUT.PUT_LINE('out ' || bar_ids_string);
END;
[00000] 3,4,5,20,111
completed in 3 ms
out 3,4,5,20,111
completed in 45 ms

BULK COLLECTS + CURSORS + FOR LOOPS

BULK COLLECT
With the BULK COLLECT clause, each of the preceding statements retrieves an
entire result set and stores it in one or more collection variables in a single
operation (which is more efficient than using a loop statement to retrieve one
result row at a time).
https://docs.oracle.com/database/121/LNPLS/tuning.htm#LNPLS891

Oracle snippet
CREATE OR REPLACE PROCEDURE BulkCollect(p_version IN NUMBER)
IS
CURSOR get_foo (p_version number) IS
SELECT * FROM secret_data
WHERE foo_id = p_version
ORDER BY bar_id;
TYPE my_type IS TABLE of get_foo%ROWTYPE INDEX BY pls_integer;
my_tab my_type;
some_string VARCHAR2(5000);
BEGIN
OPEN get_foo(1);
-- Process one level at a time.
LOOP
FETCH get_foo
BULK COLLECT INTO my_tab LIMIT 1000;
EXIT WHEN my_tab.COUNT = 0;
FOR indx IN 1 .. my_tab.COUNT
LOOP
some_string := '';
FOR prnt_indx IN REVERSE 2..indx - 1
LOOP
some_string := some_string || ', tab.' ||
TO_CHAR(my_tab(prnt_indx).my_flag);
END LOOP;
END LOOP;
END LOOP;
CLOSE get_foo;
END;
CALL BulkCollect(1)
completed in 41 ms

ora2pg -i BulkCollect.sql -t PROCEDURE –c config/ora2pg.conf
CREATE OR REPLACE FUNCTION bulkcollect(p_version BIGINT) RETURNS VOID AS
$body$
DECLARE
get_foo CURSOR (p_version BIGINT) FOR
SELECT *
FROM secret_data
ORDER BY bar_id;
TYPE MY_TYPE IS TABLE OF RECORD INDEX BY INTEGER;
my_tab MY_TYPE;
some_string VARCHAR(5000);
BEGIN
OPEN get_foo(1);
LOOP
FETCH get_foo BULK COLLECT INTO my_tab LIMIT 1000;
LOOP
some_string := '';
FOR prnt_indx IN REVERSE indx..2 - 1
LOOP
my_tab[prnt_indx].my_flag::VARCHAR;
END LOOP;
END LOOP;
END LOOP;
CLOSE get_foo;
END;
$body$
LANGUAGE PLPGSQL
;

We don’t like cursors, right?
CREATE OR REPLACE FUNCTION
get_foo(p_version INTEGER)
RETURNS SETOF SECRET_DATA AS
$body$
ORDER BY bar_id;
$body$
LANGUAGE SQL STABLE;
ORDER BY bar_id;

PostgreSQL snippet
CREATE PROCEDURE BulkCollect(p_version INT)
LANGUAGE plpgsql AS
$body$
DECLARE
some_string TEXT;
get_foo_row RECORD;
i INT;
BEGIN
DROP TABLE IF EXISTS temp_get_foo;
CREATE TEMPORARY TABLE temp_get_foo ON COMMIT DROP AS
SELECT row_number() OVER () as rnum, * FROM get_foo(p_version);
DELETE FROM temp_get_foo WHERE rnum = 1;
ANALYZE temp_get_foo;
i := 1;
FOR get_foo_row IN SELECT * FROM get_foo(p_version)
LOOP
SELECT 'tab.' || string_agg(my_flag, ', tab.' ORDER BY rnum DESC)
INTO some_string
FROM temp_get_foo
WHERE rnum < (
SELECT rnum
FROM temp_get_foo
WHERE bar_id = get_foo_row.bar_id)
AND rnum <> (SELECT max(rnum) FROM temp_get_foo);
i := i + 1;
END LOOP;
END;
$body$;
CALL BulkCollect(1)
completed in 14 ms

Postgres WINS!
completed in 14 ms completed in 41 ms
CREATE PROCEDURE BulkCollect(p_version INT)
LANGUAGE plpgsql AS
$body$
DECLARE
some_string TEXT;
get_foo_row RECORD;
i INT;
BEGIN
DROP TABLE IF EXISTS temp_get_foo;
CREATE TEMPORARY TABLE temp_get_foo ON COMMIT DROP AS
SELECT row_number() OVER () as rnum, * FROM get_foo(p_version);
DELETE FROM temp_get_foo WHERE rnum = 1;
ANALYZE temp_get_foo;
i := 1;
FOR get_foo_row IN SELECT * FROM get_foo(p_version)
LOOP
SELECT 'tab.' || string_agg(my_flag, ', tab.' ORDER BY rnum DESC)
INTO some_string
FROM temp_get_foo
WHERE rnum < (
SELECT rnum
FROM temp_get_foo
WHERE bar_id = get_foo_row.bar_id)
AND rnum <> (SELECT max(rnum) FROM temp_get_foo);
i := i + 1;
END LOOP;
END;
$body$;
CREATE OR REPLACE PROCEDURE BulkCollect(p_version IN NUMBER)
IS
ORDER BY bar_id;
TYPE my_type IS TABLE of get_foo%ROWTYPE INDEX BY pls_integer;
my_tab my_type;
some_string VARCHAR2(5000);
BEGIN
OPEN get_foo(1);
LOOP
FETCH get_foo
BULK COLLECT INTO my_tab LIMIT 1000;
LOOP
some_string := '';
FOR prnt_indx IN REVERSE 2..indx - 1
LOOP
TO_CHAR(my_tab(prnt_indx).my_flag);
END LOOP;
END LOOP;
END LOOP;
CLOSE get_foo;
END;

PostgreSQL snippet (another set replaced the cursor and used with dynamic queries)
completed in 31 ms

Cursor free execution – datatype mapping
SELECT c.m_h_n_d_id
FROM m_h e
JOIN m_h_l d ON d.m_h_v_id = e.current_m_h_v_id
JOIN m_h_n c ON c.M_H_V_ID = d.M_H_V_ID AND c.m_h_l_id = d.m_h_l_id
JOIN m_h_l_t g ON d.m_h_l_t_id = g.m_h_l_t_id
WHERE UPPER(g.m_h_l_t_c) = ('L')
AND e.M_H_ID = 71
I/O Timings: read=9706.450
-> Hash Join (cost=1.10..138954.78 rows=7446 width=25) (actual time=1447.797..4278.543 rows=13005 loops=3)
Hash Cond: ((c_1.m_h_v_id)::numeric = e.current_m_h_v_id)
Buffers: shared hit=6929 read=102289
-> Parallel Append (cost=0.00..131433.18 rows=1489208 width=20) (actual time=0.279..3902.550 rows=1191353 loops=3)
-> Parallel Seq Scan on m_h_n_sys_p2816 c_1 (cost=0.00..123976.91 rows=1489191 width=20) (actual time=0.277..3808.935 rows=1191353 loops=3)
-> Parallel Seq Scan on m_h_n_p0 c (cost=0.00..10.24 rows=24 width=20) (actual time=0.000..0.001 rows=0 loops=1)
-> Hash (cost=1.09..1.09 rows=1 width=5) (actual time=0.138..0.138 rows=1 loops=3)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
Buffers: shared hit=3
-> Seq Scan on m_h e (cost=0.00..1.09 rows=1 width=5) (actual time=0.117..0.119 rows=1 loops=3)
Filter: (m_h_id = '71'::numeric)
Rows Removed by Filter: 6
-> Append (cost=0.14..7247.57 rows=1272 width=20) (actual time=0.021..16.859 rows=38383 loops=1)
-> Index Scan using m_h_n_p0_m_h_v_id_m_h_l_id_idx on m_h_n_p0 c (cost=0.14..0.30 rows=1 width=20) (never executed)
Index Cond: (m_h_v_id = d_1.m_h_v_id)
Filter: (d_1.m_h_l_id = (m_h_l_id)::numeric)
-> Index Scan using m_h_n_sys_p2816_m_h_v_id_idx on m__n_sys_p2816 c_1 (cost=0.43..7240.91 rows=1271 width=20) (actual time=0.019..15.490 rows=38383 loops=1)
Index Cond: (m_h_v_id = d_1.m_h_v_id)
Filter: (d_1.m_h_l_id = (m_h_l_id)::numeric)
Rows Removed by Filter: 631
Incorrect datatype mapping captured in
EXECUTION PLAN – 9329 ms overall
Correct datatype mapping captured in
EXECUTION PLAN – 31 ms overall

Migrate for Performance
General Rules

General Rules
Existing code
Nested Loops
Dynamic Queries
Refcursor as output along with other output params
Replacements
Reduced loop nesting with improvements
SETOF Record as output
Improved logic for dynamic queries to make them performant

Our Azure Postgres service page—and our blog!
Azure Database for PostgreSQL
https://aka.ms/azure-postgres
Azure Postgres Blog
https://aka.ms/azure-postgres-blog

Migrations
https://aka.ms/postgres-migration-tutorial
Wealth of documentation resources, too
Azure Postgres Quickstart Docs
https://aka.ms/azure-postgres-quickstart
Azure Postgres Blog

Migrations
https://aka.ms/postgres-migration-tutorial
AskAzureDBforPostgreSQL@service.microsoft.com
Citus open source packages on GitHub—also, Email
https://aka.ms/citus
Azure Postgres Quickstart Docs
https://aka.ms/azure-postgres-quickstart
Azure Postgres Blog

© Copyright Microsoft Corporation. All rights reserved.
danke schön
dank u
merci
dziękuję
धन्यवाद
teşekkürler
thank you
grazie
gracias
tack
@StiepanTrofimo
@AzureDBPostgres
Alicja Kucharczyk
Sushant Pandey

The Story About The Migration

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to The Story About The Migration

Similar to The Story About The Migration (20)

More from EDB

More from EDB (20)

Recently uploaded

Recently uploaded (20)

The Story About The Migration