
[Pgday.Seoul 2021] 2. Porting Oracle UDF and Optimization


pgday.seoul 2021 Porting Oracle UDF and Optimization - 유재근


  1. https://2021.postgresconf.cn Porting From Oracle UDF and Optimization. 유재근 (mail: naivety1@naver.com)
  2. CONTENTS
     I. Function Optimization Golden Rule
     II. Test data
     III. PL/pgSQL function, SQL function
     V. PARALLEL UNSAFE, RESTRICTED, SAFE
     VI. STRICT
     IV. VOLATILE, STABLE, IMMUTABLE
     VI. Porting Examples
  3. I make PostgreSQL database faster and more reliable with sql tuning and data modeling
     Speaker
     • Oracle/MySQL/SQL Server DBA, 10 years
     • Data modeler, 5 years
     • PostgreSQL DBA, 3 years
     • 오라스코프 www.postgresdba.com
  4. Speaker
  5. OBJECTIVE
     • Understand the characteristics and performance of each kind of PostgreSQL function: SQL function vs. PL/pgSQL function; volatile, stable, immutable function; parallel unsafe, parallel safe
     • Identify the inefficiencies in existing Oracle functions and how to improve them
     • Understand how the PostgreSQL optimizer differs from Oracle's
     Notices
     • The speaker will move quickly and will not leave the audience time to analyze the inefficiencies in the example Oracle functions.
     • This material needs to be downloaded and worked through slowly afterwards to be understood.
     • This session is very tedious for anyone who does not write functions; listen to the explanation of the function categories, log out, and log back in for the next session.
  6. I. Function Optimization Golden Rule
     • Rule #1: Never create a function.
     • Rule #2: Never forget rule #1.
     • User-defined functions are black boxes to the planner (default COST 100).
     • It is difficult to reduce function call overhead.
  7. II. Test data
  8. II. Test data
     • Test data creation (1)
     CREATE TABLE employee ( empno numeric(5,0) NOT NULL, ename character varying(10), job character varying(9), mgr numeric(5,0), hiredate timestamp(0), sal numeric(7,2), comm numeric(7,2), deptno numeric(2,0), sido_nm character varying(100) );
     insert into employee
     select i, chr(65+mod(i,26))||i::text||'NM'
     , case when mod(i,10000)=0 then 'PRESIDENT' when mod(i,1000) = 0 then 'MANAGER' when mod(i,3)=0 then 'SALESMAN' when mod(i,3)=1 then 'ANALYST' when mod(i,3)=2 then 'CLERK' end as job
     , case when mod(i,10000)= 0 then null when mod(i,1000)= 1 then 10000 when i >= 9000 then 1000 else ceiling((i+1000)/1000)*1000 end as mgr
     , current_date - i
     , trunc(random() * 10000) as sal
     , trunc(random() * 10000) as com
     , mod(i,12)+1 as deptno
     , case when mod(i,3) = 0 then 'Jeonbuk' when mod(i,3) = 1 then 'Kangwon' else 'Chungnam' end as sido_nm
     from generate_series(1,10000) a(i);
     ALTER TABLE employee ADD CONSTRAINT employee_pk PRIMARY KEY (empno);
     select pg_relation_size('employee'); -- 0.9 MB

     drop table if exists customer;
     create table customer ( cust_no numeric not null, cust_nm character varying(100), register_date timestamp(0), register_dt varchar(8), cust_status_cd varchar(1), register_channel_cd varchar(1), cust_age numeric(3), active_yn varchar(1), sigungu_cd varchar(5), sido_cd varchar(2) );
     insert into customer
     select i, chr(65+mod(i,26))||i::text||'CUST_NM'
     , current_date - mod(i,10000)
     , to_char((current_date - mod(i,10000)),'yyyymmdd') as register_dt
     , mod(i,5)+1 as cust_status_cd
     , mod(i,3)+1 as register_channel_cd
     , trunc(random() * 100) +1 as age
     , case when mod(i,22) = 0 then 'N' else 'Y' end as active_yn
     , case when mod(i,1000) = 0 then '11007' when mod(i,1000) = 1 then '11006' when mod(i,1000) = 2 then '11005' when mod(i,1000) = 3 then '11004' when mod(i,1000) = 4 then '11003' when mod(i,1000) = 5 then '11002' else '11001' end as sigungu_cd
     , case when mod(i,3) = 0 then '01' when mod(i,3) = 1 then '02' when mod(i,3) = 2 then '03' end as sido_cd
     from generate_series(1,1000000) a(i);
     ALTER TABLE customer ADD CONSTRAINT customer_pk PRIMARY KEY (cust_no);
     CREATE INDEX CUSTOMER_X01 ON CUSTOMER(sigungu_cd, register_date, cust_nm);
     select * from pg_relation_size('customer'); -- 93 MB

     create table com_code ( group_cd varchar(10), cd varchar(10), cd_nm varchar(100));
     insert into com_code values ('G1','11001','SEOUL') ,('G1','11002','PUSAN') ,('G1','11003','INCHEON') ,('G1','11004','DAEGU') ,('G1','11005','JAEJU') ,('G1','11006','ULEUNG') ,('G1','11007','ETC');
     insert into com_code values ('G2','1','Infant') ,('G2','2','Child') ,('G2','3','Adolescent') ,('G2','4','Adult') ,('G2','5','Senior');
     insert into com_code values ('G3','01','Jeonbuk') ,('G3','02','Kangwon') ,('G3','03','Chungnam');
     alter table com_code add constraint com_code_pk primary key (group_cd, cd);
     select * from pg_relation_size('com_code'); -- 8 kB

     drop table if exists product;
     create table product ( prod_id varchar(10) not null, prod_nm varchar(100) not null, regist_empno numeric(5,0) );
     insert into product select 'prod'||i::text, 'prod_nm'||i::text, i*10 from generate_series(1,200) a(i);
     alter table product add constraint product_pk primary key (prod_id);
     select * from pg_relation_size('product'); -- 16 kB
  9. II. Test data
     • Test data creation (2)
     drop table if exists online_order;
     create table online_order ( ord_no numeric(10,0) not null, cust_no numeric not null, ord_date timestamp(0) not null, ord_dt varchar(8) not null, ord_status_cd varchar(1) not null, comment varchar(100) );
     insert into online_order
     select i, mod(i,1000000) as cust_no
     , current_date - mod(i,1000) as ord_date
     , to_char((current_date - mod(i,1000)),'yyyymmdd') as ord_dt
     , (mod(i,4) + 1) as ord_status_cd
     , lpad('x',100,'x')
     from generate_series(1,2000000,2) a(i);
     alter table online_order add constraint online_order_pk primary key (ord_no);
     CREATE INDEX ONLINE_ORDER_X01 ON ONLINE_ORDER(CUST_NO);
     select * from pg_relation_size('online_order'); -- 167 MB

     drop table if exists offline_order;
     create table offline_order ( ord_no numeric(10,0) not null, cust_no numeric not null, ord_date timestamp(0) not null, ord_dt varchar(8) not null, ord_status_cd varchar(1) not null, empno numeric(5,0), comment varchar(100) );
     insert into offline_order
     select i, mod(i,1000000) as cust_no
     , current_date - mod(i,1000) as ord_date
     , to_char((current_date - mod(i,1000)),'yyyymmdd') as ord_dt
     , (mod(i,4) + 1) as ord_status_cd
     , mod(i,10000) + 1 as empno
     , lpad('y',100,'y')
     from generate_series(2,2000000,2) a(i);
     alter table offline_order add constraint offline_order_pk primary key (ord_no);
     CREATE INDEX OFFLINE_ORDER_X01 ON OFFLINE_ORDER(CUST_NO);
     CREATE INDEX OFFLINE_ORDER_X02 ON OFFLINE_ORDER(EMPNO);
     select * from pg_relation_size('offline_order'); -- 174 MB

     create table ord_item ( ord_no numeric(10,0) not null, prod_id varchar(10) not null, unit_price numeric(10) not null, quantity numeric(10) not null, on_off_code varchar(2) not null, oder_comment varchar(100) );
     -- prod1 ~ prod200
     insert into ord_item
     select a.ord_no, 'prod'||(mod(ord_no,200)+1)::text as prod_id
     , trunc(100*random())+1 as unit_price
     , trunc(10*random())+1 as quantity
     , case when mod(ord_no,2)=0 then '01' else '02' end as on_off_code
     , lpad('c',100,'y')
     from (select ord_no from online_order union all select ord_no from offline_order ) a ;
     insert into ord_item
     select a.ord_no, 'prod'||(mod(ord_no,200)+2)::text as prod_id
     , trunc(100*random())+1 as unit_price
     , trunc(10*random())+1 as quantity
     , case when mod(ord_no,2)=0 then '01' else '02' end as on_off_code
     , lpad('d',100,'q')
     from (select ord_no from online_order union all select ord_no from offline_order ) a ;
     alter table ord_item add constraint ord_item_pk primary key(ord_no, prod_id);
     select * from ord_item where ord_no in (998,999);
     select pg_relation_size('ord_item'); -- 648 MB
  10. II. Test data
     • Test data creation (3)
     CREATE TABLE employee_hist ( operation varchar(10) NOT NULL, empno numeric(5,0) NOT NULL, ename character varying(10), sal numeric(7,2), update_date timestamp(0) );
     ALTER TABLE employee_hist ADD CONSTRAINT PK_E_HIST PRIMARY KEY(OPERATION, EMPNO);
     DROP TABLE IF EXISTS ERR_LOG;
     -- error logging table
     CREATE TABLE ERR_LOG ( ORA_ERR_NUMBER INT, ORA_ERR_MSG VARCHAR(2000), OPERATION VARCHAR(10), EMPNO NUMERIC(5,0), SAL NUMERIC(7,2), ERR_DATE TIMESTAMP(0) );
  11. III. PL/pgSQL vs. SQL function
     • Oracle function
     CREATE OR REPLACE FUNCTION F_GET_AGE_CATEGORY (
       P_AGE IN NUMBER -- age input
     ) RETURN varchar2 IS
       v_category VARCHAR2(100);
     BEGIN
       IF P_AGE <= 2 THEN v_category := 'Infant';
       ELSIF P_AGE <= 12 THEN v_category := 'Child';
       ELSIF P_AGE <= 19 THEN v_category := 'Adolescent';
       ELSIF P_AGE <= 65 THEN v_category := 'Adult';
       ELSE v_category := 'Senior';
       END IF;
       RETURN v_category ;
     EXCEPTION WHEN OTHERS THEN
       DBMS_OUTPUT.PUT_LINE(SQLCODE || ' ' || SQLERRM);
       RETURN NULL;
     END;
  12. III. PL/pgSQL vs. SQL function
     • Oracle function: F_GET_AGE_CATEGORY, identical to the previous slide
     • PostgreSQL function
     CREATE OR REPLACE FUNCTION age_category_pl (p_age numeric) RETURNS text language plpgsql AS $$
     BEGIN
       RETURN (case when p_age <= 2 then 'Infant' when p_age <= 12 then 'Child' when p_age <= 19 then 'Adolescent' when p_age <= 65 then 'Adult' else 'Senior' END);
     END;
     $$;
     CREATE OR REPLACE FUNCTION age_category (p_age numeric) RETURNS text language sql AS $$
       select case when p_age <= 2 then 'Infant' when p_age <= 12 then 'Child' when p_age <= 19 then 'Adolescent' when p_age <= 65 then 'Adult' else 'Senior' END;
     $$;
  13. III. PL/pgSQL vs. SQL function
     • PL/pgSQL function
     select age_category_pl(cust_age) from customer;
     QUERY PLAN
     Seq Scan on portal.customer (actual time=0.020..538.045 rows=1000000 loops=1)
       Output: age_category_pl(cust_age)
       Buffers: shared hit=11364
     Planning Time: 0.029 ms
     Execution Time: 566.112 ms
     • SQL function
     select age_category(cust_age) from customer;
     QUERY PLAN
     Seq Scan on portal.customer (actual time=0.007..223.952 rows=1000000 loops=1)
       Output: CASE WHEN (cust_age <= '2'::numeric) THEN 'Infant'::text WHEN (cust_age <= '12'::numeric) THEN 'Child'::text WHEN (cust_age <= '19'::numeric) THEN 'Adolescent'::text WHEN (cust_age <= '65'::numeric) THEN 'Adult'::text ELSE 'Senior'::text END
       Buffers: shared hit=11364
     Planning Time: 0.065 ms
     Execution Time: 250.567 ms
     ○ The Output line of the execution plan shows that the SQL function was inlined.
     ○ When a SQL function is inlined, no function call takes place, so elapsed time is shorter.
  14. III. PL/pgSQL vs. SQL function
     ○ Conditions for scalar function inlining
       - LANGUAGE SQL
       - not SECURITY DEFINER
       - not RETURNS SETOF (or RETURNS TABLE)
       - not RETURNS RECORD
       - no CTEs, no FROM clause, no reference to any table, none of GROUP BY, ORDER BY, LIMIT
       - returns exactly one column
     ○ Conditions for table function inlining
       - LANGUAGE SQL
       - declared STABLE or IMMUTABLE
       - RETURNS SETOF or RETURNS TABLE
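     Whether a given SQL function was actually inlined can be checked from the plan's Output line, as on the previous slide. The following is a minimal sketch; `demo_double` is a hypothetical helper that is not part of the original deck, and it assumes the `customer` test table created earlier.

     ```sql
     -- Hypothetical helper (not from the deck): a single SELECT expression,
     -- no FROM clause, one result column, LANGUAGE SQL.
     CREATE OR REPLACE FUNCTION demo_double(p numeric) RETURNS numeric
     LANGUAGE sql AS $$ SELECT p * 2 $$;

     -- If the planner inlines the call, the Output line shows the
     -- multiplication expression instead of demo_double(cust_age).
     EXPLAIN (VERBOSE, COSTS OFF)
     SELECT demo_double(cust_age) FROM customer;
     ```

     If the Output line still shows the function name, one of the conditions listed above is probably not met.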
  15. Function Optimization Silver Rule
     Rule #1: Never create a function.
     Rule #2: Never forget rule #1.
     ○ Prefer an inlinable SQL function over a PL/pgSQL function.
  16. III. PL/pgSQL vs. SQL function
     • Oracle function
     CREATE OR REPLACE FUNCTION F_GET_SIGNGU_NM (
       P_CD IN VARCHAR2 -- district (sigungu) code input
     ) RETURN VARCHAR2 IS
       v_signgu_nm VARCHAR2(100);
     BEGIN
       SELECT CD_NM INTO v_signgu_nm FROM COM_CODE WHERE GROUP_CD = 'G1' AND CD = P_CD;
       RETURN v_signgu_nm ;
     EXCEPTION WHEN OTHERS THEN
       DBMS_OUTPUT.PUT_LINE(SQLCODE || ' ' || SQLERRM);
       RETURN NULL;
     END;
  17. III. PL/pgSQL vs. SQL function
     • Oracle function: F_GET_SIGNGU_NM, identical to the previous slide
     • PostgreSQL function
     create or replace function f_get_signgu_nm_pl(p_cd varchar) returns varchar language plpgsql as $$
     declare
       v_signgu_nm VARCHAR(100);
     begin
       SELECT CD_NM INTO v_signgu_nm FROM COM_CODE WHERE GROUP_CD = 'G1' AND CD = p_cd; -- district code value passed in
       RETURN v_signgu_nm ;
     end;
     $$;
     create or replace function f_get_signgu_nm_sql(p_cd varchar) returns varchar language sql as $$
       SELECT CD_NM FROM COM_CODE WHERE GROUP_CD = 'G1' AND CD = p_cd; -- district code value passed in
     $$;
  18. III. PL/pgSQL vs. SQL function
     • PL/pgSQL function
     SELECT CUST_NM , f_get_signgu_nm_pl(SIGUNGU_CD) FROM CUSTOMER;
     QUERY PLAN
     Seq Scan on portal.customer (actual time=0.029..3605.273 rows=1000000 loops=1)
       Output: cust_nm, f_get_signgu_nm_pl(sigungu_cd)
       Buffers: shared hit=1011364
     Planning Time: 0.029 ms
     Execution Time: 3648.277 ms
     • SQL function
     SELECT CUST_NM , f_get_signgu_nm_sql(SIGUNGU_CD) FROM CUSTOMER;
     QUERY PLAN
     Seq Scan on portal.customer (actual time=0.167..2556.886 rows=1000000 loops=1)
       Output: cust_nm, f_get_signgu_nm_sql(sigungu_cd)
       Buffers: shared hit=1011364
     Planning Time: 0.069 ms
     Execution Time: 2600.038 ms
     ○ The SQL function was not inlined (it reads a table), but elapsed time still dropped by about 30%.
  19. III. Parallel Unsafe vs. Parallel Safe
     • PostgreSQL function
     create or replace function f_get_signgu_nm_pl_parallel(p_cd varchar) returns varchar language plpgsql parallel safe as $$
     declare
       v_signgu_nm VARCHAR(100);
     begin
       SELECT CD_NM INTO v_signgu_nm FROM COM_CODE WHERE GROUP_CD = 'G1' AND CD = p_cd; -- district code value passed in
       RETURN v_signgu_nm ;
     end;
     $$;
     create or replace function f_get_signgu_nm_sql_parallel(p_cd varchar) returns varchar language sql parallel safe as $$
       SELECT CD_NM FROM COM_CODE WHERE GROUP_CD = 'G1' AND CD = p_cd; -- district code value passed in
     $$;
  20. III. Parallel Unsafe vs. Parallel Safe
     • PL/pgSQL function (parallel unsafe)
     SELECT CUST_NM, f_get_signgu_nm_pl(SIGUNGU_CD) FROM CUSTOMER;
       Buffers: shared hit=1011364
     Planning Time: 0.029 ms
     Execution Time: 3648.277 ms
     • SQL function (parallel unsafe)
     SELECT CUST_NM, f_get_signgu_nm_sql(SIGUNGU_CD) FROM CUSTOMER;
     Planning Time: 0.069 ms
     Execution Time: 2600.038 ms
     • PL/pgSQL function (parallel safe)
     SELECT CUST_NM , f_get_signgu_nm_pl_parallel(SIGUNGU_CD) FROM CUSTOMER;
     Gather (actual time=2.486..1587.627 rows=1000000 loops=1)
       Output: cust_nm, (f_get_signgu_nm_pl_parallel(sigungu_cd))
       Workers Planned: 2
       Workers Launched: 2
       Buffers: shared hit=1012472
       -> Parallel Seq Scan on portal.customer (actual time=0.833..1497.364 rows=333333 loops=3)
            Output: cust_nm, f_get_signgu_nm_pl_parallel(sigungu_cd)
            Buffers: shared hit=1012472
            Worker 0: actual time=1.027..1534.903 rows=360536 loops=1 Buffers: shared hit=365187
            Worker 1: actual time=1.429..1536.441 rows=335151 loops=1 Buffers: shared hit=339514
     Planning Time: 0.034 ms
     Execution Time: 1629.886 ms
     • SQL function (parallel safe)
     SELECT CUST_NM , f_get_signgu_nm_sql_parallel(SIGUNGU_CD) FROM CUSTOMER;
     Gather (cost=1000.00..220697.42 rows=1000000 width=46) (actual time=2.578..1101.832 rows=1000000 loops=1)
       Output: cust_nm, (f_get_signgu_nm_sql_parallel(sigungu_cd))
       Workers Planned: 2
       Workers Launched: 2
       Buffers: shared hit=1012364
       -> Parallel Seq Scan on portal.customer (actual time=0.632..1015.088 rows=333333 loops=3)
            Output: cust_nm, f_get_signgu_nm_sql_parallel(sigungu_cd)
            Buffers: shared hit=1012364
            Worker 0: actual time=0.893..1047.825 rows=330968 loops=1 Buffers: shared hit=335229
            Worker 1: actual time=0.891..1047.471 rows=335591 loops=1 Buffers: shared hit=339905
     Planning Time: 0.059 ms
     Execution Time: 1137.549 ms
  21. Function Optimization Silver Rule
     Rule #1: Never create a function.
     Rule #2: Never forget rule #1.
     ○ Prefer an inlinable SQL function over a PL/pgSQL function.
     ○ Move from Parallel Unsafe -> Parallel Restricted -> Parallel Safe wherever possible.
     ○ Parallel Unsafe
       - INSERT / UPDATE / DELETE
       - uses a subtransaction, e.g. an exception block
       - uses a sequence
       - calls another parallel unsafe function (current_schema, current_user, etc.)
     ○ Parallel Restricted
       - the SQL containing the function can run in parallel, but only the leader process can execute the function
       - uses temporary tables, cursors, prepared statements, etc.
       - calls another parallel restricted function (random(), etc.)
       - InitPlan, SubPlan, etc. appear in the execution plan
     ○ Parallel Safe
       - can run in parallel mode
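     The parallel marking of a function can be changed and inspected without recreating it. The sketch below assumes the `f_get_signgu_nm_sql` function defined on an earlier slide; the list of built-in functions in the second query is only an illustrative selection.

     ```sql
     -- Declare an existing function parallel safe.
     -- Only do this if its body really satisfies the rules listed above.
     ALTER FUNCTION f_get_signgu_nm_sql(varchar) PARALLEL SAFE;

     -- Look up how some built-in functions are marked.
     -- proparallel: 'u' = unsafe, 'r' = restricted, 's' = safe
     SELECT proname, proparallel
     FROM pg_proc
     WHERE proname IN ('random', 'nextval', 'current_schema')
     ORDER BY proname;
     ```

     PostgreSQL does not verify the declaration against the function body, so a wrong PARALLEL SAFE label can produce errors or wrong results at run time.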
  22. IV. Called on null input vs. STRICT function
     • Oracle function: F_GET_SIGNGU_NM, the same function shown in section III
     CREATE OR REPLACE FUNCTION F_GET_SIGNGU_NM (
       P_CD IN VARCHAR2 -- district (sigungu) code input
     ) RETURN VARCHAR2 IS
       v_signgu_nm VARCHAR2(100);
     BEGIN
       SELECT CD_NM INTO v_signgu_nm FROM COM_CODE WHERE GROUP_CD = 'G1' AND CD = P_CD;
       RETURN v_signgu_nm ;
     EXCEPTION WHEN OTHERS THEN
       DBMS_OUTPUT.PUT_LINE(SQLCODE || ' ' || SQLERRM);
       RETURN NULL;
     END;
  23. IV. Called on null input vs. STRICT function
     • Oracle function: F_GET_SIGNGU_NM, identical to the previous slide
     • PostgreSQL function
     create or replace function f_get_signgu_nm_sql_parallel_strict (p_cd varchar) returns varchar language sql strict parallel safe as $$
       SELECT CD_NM FROM COM_CODE WHERE GROUP_CD = 'G1' AND CD = p_cd; -- district code value passed in
     $$;
  24. IV. Called on null input vs. STRICT function
     • Called on null input (default)
     SELECT CUST_NM , f_get_signgu_nm_sql_parallel(null) FROM CUSTOMER;
     QUERY PLAN
     Gather (actual time=1.649..1038.029 rows=1000000 loops=1)
       Output: cust_nm, (f_get_signgu_nm_sql_parallel(NULL::character varying))
       Workers Planned: 2
       Workers Launched: 2
       Buffers: shared hit=1012364
       -> Parallel Seq Scan on portal.customer (actual time=0.562..956.515 rows=333333 loops=3)
            Output: cust_nm, f_get_signgu_nm_sql_parallel(NULL::character varying)
            Buffers: shared hit=1012364
            Worker 0: actual time=0.787..987.412 rows=338847 loops=1 Buffers: shared hit=343198
            Worker 1: actual time=0.829..990.604 rows=332640 loops=1 Buffers: shared hit=336920
     Planning Time: 0.062 ms
     Execution Time: 1071.776 ms
     • STRICT
     SELECT CUST_NM , f_get_signgu_nm_sql_parallel_strict(null) FROM CUSTOMER;
     Seq Scan on portal.customer (actual time=0.007..87.368 rows=1000000 loops=1)
       Output: cust_nm, NULL::character varying
       Buffers: shared hit=11364
     Planning Time: 0.028 ms
     Execution Time: 112.834 ms
     ○ With the STRICT function no parallel operation took place, but the function was never executed at all: the plan outputs a NULL constant.
     ○ Be careful: a function that is supposed to return some value for a NULL input will always return NULL once it is declared STRICT.
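     The second caution can be reproduced with a small pair of functions. This is an illustrative sketch; `demo_nvl_name` and `demo_nvl_name_strict` are hypothetical names that do not appear in the deck.

     ```sql
     -- Hypothetical function that deliberately maps NULL to a default value.
     CREATE OR REPLACE FUNCTION demo_nvl_name(p_nm text) RETURNS text
     LANGUAGE sql AS $$ SELECT coalesce(p_nm, 'UNKNOWN') $$;

     -- Same body, but declared STRICT: the body is skipped for NULL input.
     CREATE OR REPLACE FUNCTION demo_nvl_name_strict(p_nm text) RETURNS text
     LANGUAGE sql STRICT AS $$ SELECT coalesce(p_nm, 'UNKNOWN') $$;

     -- The first column is 'UNKNOWN'; the second is NULL, because STRICT
     -- short-circuits the call before coalesce() is ever evaluated.
     SELECT demo_nvl_name(NULL) AS called_on_null,
            demo_nvl_name_strict(NULL) AS strict_result;
     ```

     When porting, it is therefore worth checking how each Oracle function treats NULL arguments before adding STRICT.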
  25. Function Optimization Silver Rule
     Rule #1: Never create a function.
     Rule #2: Never forget rule #1.
     ○ Prefer an inlinable SQL function over a PL/pgSQL function.
     ○ Move from Parallel Unsafe -> Parallel Restricted -> Parallel Safe wherever possible.
     ○ Use STRICT when a NULL input is certain to produce a NULL output.
  26. V. STABLE vs. IMMUTABLE function
     • Oracle function
     CREATE OR REPLACE FUNCTION F_GET_REGION_NM (
       p_flag IN VARCHAR2, -- district (sigungu) / province (sido) flag
       p_cust_no IN NUMBER
     ) RETURN VARCHAR2 IS
       v_err EXCEPTION;
       v_out_name VARCHAR2(100);
     BEGIN
       IF p_flag ='G1' THEN
         SELECT B.CD_NM -- return the district name
         INTO v_out_name
         FROM CUSTOMER A LEFT JOIN COM_CODE B ON (A.SIGUNGU_CD = B.CD AND A.CUST_NO = p_cust_no)
         WHERE B.group_cd = p_flag;
       ELSIF p_flag='G3' THEN
         SELECT B.CD_NM -- return the province name
         INTO v_out_name
         FROM CUSTOMER A LEFT JOIN COM_CODE B ON (A.SIDO_CD = B.CD AND A.CUST_NO = p_cust_no)
         WHERE B.group_cd = p_flag;
       ELSE
         RAISE v_err;
       END IF;
       RETURN v_out_name;
     EXCEPTION
       WHEN v_err THEN
         -- DBMS_OUTPUT.PUT_LINE('p_flag error: sigungu A, sido B')
         RETURN -1;
       WHEN others THEN
         RETURN -1;
     END;
     ○ Is there ever a case that really needs the EXCEPTION WHEN others THEN RETURN logic?
     ○ Avoid exception blocks where possible; in PostgreSQL they trigger a subtransaction.
  27. V. STABLE vs. IMMUTABLE function
     • STABLE function
     CREATE OR REPLACE FUNCTION F_GET_REGION_NM_STABLE( p_flag IN VARCHAR, p_cust_no IN NUMERIC) RETURNS VARCHAR LANGUAGE PLPGSQL STABLE PARALLEL SAFE AS $$
     DECLARE
       v_out_name VARCHAR(100);
     BEGIN
       IF p_flag ='G1' THEN
         SELECT B.CD_NM -- return the district name
         INTO v_out_name
         FROM CUSTOMER A LEFT JOIN COM_CODE B ON (A.SIGUNGU_CD = B.CD AND A.CUST_NO = p_cust_no)
         WHERE B.group_cd = p_flag;
       ELSIF p_flag='G3' THEN
         SELECT B.CD_NM -- return the province name
         INTO v_out_name
         FROM CUSTOMER A LEFT JOIN COM_CODE B ON (A.SIDO_CD = B.CD AND A.CUST_NO = p_cust_no)
         WHERE B.group_cd = p_flag;
       ELSE
         v_out_name := '-1';
       END IF;
       RETURN v_out_name;
     END;
     $$;
     ○ volatile (default): use when the function modifies the database or can return different results for the same argument values.
     ○ stable: use when the function returns the same result for the same argument values within a single statement.
     ○ immutable: typically no table access; the planner can evaluate the call at query planning time and use the result as a constant.
  28. V. STABLE vs. IMMUTABLE function
     • IMMUTABLE function
     CREATE OR REPLACE FUNCTION F_GET_REGION_NM_IMMUTABLE( p_flag IN VARCHAR, p_cust_no IN NUMERIC) RETURNS VARCHAR LANGUAGE PLPGSQL IMMUTABLE PARALLEL SAFE AS $$
     DECLARE
       v_out_name VARCHAR(100);
     BEGIN
       IF p_flag ='G1' THEN
         SELECT B.CD_NM -- return the district name
         INTO v_out_name
         FROM CUSTOMER A LEFT JOIN COM_CODE B ON (A.SIGUNGU_CD = B.CD AND A.CUST_NO = p_cust_no)
         WHERE B.group_cd = p_flag;
       ELSIF p_flag='G3' THEN
         SELECT B.CD_NM -- return the province name
         INTO v_out_name
         FROM CUSTOMER A LEFT JOIN COM_CODE B ON (A.SIDO_CD = B.CD AND A.CUST_NO = p_cust_no)
         WHERE B.group_cd = p_flag;
       ELSE
         v_out_name := '-1';
       END IF;
       RETURN v_out_name;
     END;
     $$;
  29. V. STABLE vs. IMMUTABLE function
     • STABLE function
     SELECT COUNT(*) FROM EMPLOYEE WHERE SIDO_NM = F_GET_REGION_NM_STABLE('G3',321);
     Seq Scan on employee (actual time=0.037..99.468 rows=3333 loops=1)
       Filter: ((sido_nm)::text = (f_get_region_nm_stable('G3'::character varying, '321'::numeric))::text)
       Rows Removed by Filter: 6667
       Buffers: shared hit=50110
     Planning: Buffers: shared hit=5
     Planning Time: 0.099 ms
     Execution Time: 99.668 ms
     • IMMUTABLE function
     SELECT COUNT(*) FROM EMPLOYEE WHERE SIDO_NM = F_GET_REGION_NM_IMMUTABLE('G3',321);
     Seq Scan on employee (actual time=0.004..1.064 rows=3333 loops=1)
       Filter: ((sido_nm)::text = 'Jeonbuk'::text)
       Rows Removed by Filter: 6667
       Buffers: shared hit=110
     Planning: Buffers: shared hit=5
     Planning Time: 0.100 ms
     Execution Time: 1.152 ms
     ○ The execution plan shows that the immutable function was executed only once (its result appears as a constant in the Filter).
     ○ To make the stable function perform as well as the immutable one, wrap the call in a subquery:
     SELECT COUNT(*) FROM EMPLOYEE WHERE SIDO_NM = (SELECT F_GET_REGION_NM_STABLE('G3',321));
     ○ The subquery behaves as parallel restricted, i.e. only the leader process executes the subquery.
  30. V. STABLE vs. IMMUTABLE function
     ○ If you are not sure whether a function is volatile, stable, or immutable, use volatile.
     ○ Check whether an internal function or UDF is strict, what its volatility is, and whether it is parallel safe:
     select proname, prolang, procost, prokind, proisstrict, provolatile, proparallel from pg_proc where proname like 'f%';
     ○ Even among internal functions, prefer an immutable function over a stable one.
  31. V. STABLE vs. IMMUTABLE function
     • STABLE function performance
     SELECT COUNT(*) FROM CUSTOMER WHERE REGISTER_DATE > TO_TIMESTAMP('20201201','YYYYMMDD');
     Finalize Aggregate (actual time=126.310..127.914 rows=1 loops=1)
       Buffers: shared hit=11364
       -> Gather (actual time=126.184..127.909 rows=3 loops=1)
            Workers Planned: 2
            Workers Launched: 2
            Buffers: shared hit=11364
            -> Partial Aggregate (actual time=123.276..123.277 rows=1 loops=3)
                 Buffers: shared hit=11364
                 -> Parallel Seq Scan on customer (actual time=1.532..122.762 rows=10700 loops=3)
                      Filter: (register_date > to_timestamp('20201201'::text, 'YYYYMMDD'::text))
                      Rows Removed by Filter: 322633
                      Buffers: shared hit=11364
     Planning Time: 0.070 ms
     Execution Time: 127.938 ms
     ○ TO_TIMESTAMP is executed 1 million times.
     • IMMUTABLE function performance
     SELECT COUNT(*) FROM CUSTOMER WHERE REGISTER_DATE > make_date(2020,12,1);
     Finalize Aggregate (actual time=28.726..30.411 rows=1 loops=1)
       Buffers: shared hit=11364
       -> Gather (actual time=28.586..30.405 rows=3 loops=1)
            Workers Planned: 2
            Workers Launched: 2
            Buffers: shared hit=11364
            -> Partial Aggregate (actual time=25.899..25.900 rows=1 loops=3)
                 Buffers: shared hit=11364
                 -> Parallel Seq Scan on customer (actual time=0.334..25.469 rows=10700 loops=3)
                      Filter: (register_date > '2020-12-01'::date)
                      Rows Removed by Filter: 322633
                      Buffers: shared hit=11364
     Planning Time: 0.055 ms
     Execution Time: 30.433 ms
     ○ With '20201201'::date the type conversion happens only once.
     ○ '20201201'::timestamp gives the same performance.
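     The volatility of the built-in functions compared above can be looked up in the catalog in the same way as for UDFs. A minimal sketch; the choice of function names in the filter is only illustrative.

     ```sql
     -- provolatile: 'i' = immutable, 's' = stable, 'v' = volatile
     -- Overloads are listed separately, so the argument list is included.
     SELECT p.proname,
            pg_get_function_identity_arguments(p.oid) AS args,
            p.provolatile
     FROM pg_proc p
     WHERE p.proname IN ('to_timestamp', 'to_date', 'make_date')
     ORDER BY p.proname, args;
     ```

     Checking this before choosing a conversion function in a WHERE clause helps avoid a per-row call like the one in the first plan.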
  32. Function Optimization Silver Rule
     Rule #1: Never create a function.
     Rule #2: Never forget rule #1.
     ○ Prefer an inlinable SQL function over a PL/pgSQL function.
     ○ Move from Parallel Unsafe (default) -> Parallel Restricted -> Parallel Safe wherever possible.
     ○ Use STRICT when a NULL input is certain to produce a NULL output.
     ○ Move from VOLATILE (default) -> STABLE -> IMMUTABLE wherever possible.
  33. Characteristics of the EXCEPTION block
     In Oracle, an exception block has nothing to do with transaction control; it is just another branch.
     In PostgreSQL, when an exception is raised, the entire block is rolled back and then the exception block runs.
     PostgreSQL does this with a subtransaction (savepoint ~ rollback), so that a transaction that hit a run-time error can keep going.
     If a block has an exception section, the transaction cannot end inside that block, i.e. commit/rollback cannot be used inside it:
     ERROR: cannot commit while a subtransaction is active
     -- Side effects of subtransactions
     https://postgres.ai/blog/20210831-postgresql-subtransactions-considered-harmful
     1. higher XID growth
     2. per-session cache overflow: performance degrades beyond 64 subtransactions per session
     3. performance degradation when combined with SELECT ... FOR UPDATE
     4. performance degradation on the standby when subtransactions are combined with a long-running transaction
     CREATE TABLE T1 (C1 INT);
     ALTER TABLE T1 ADD CONSTRAINT T1_PK PRIMARY KEY (C1);
     \set AUTOCOMMIT off
     do $$
     begin
       insert into t1 values(1),(2);
       insert into t1 values(2);
     exception when others then
       insert into t1 values(3);
     end;
     $$;
     select * from t1;
     Without the exception block, the following error occurs:
     ERROR: current transaction is aborted, commands ignored until end of transaction block
     An exception block is expensive because it creates a savepoint. If no exception can occur, or if the application only needs to know that an exception occurred, do not create an exception block.
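     For the duplicate-key case used in this example, one way to avoid the exception block entirely is `INSERT ... ON CONFLICT DO NOTHING`. This is a swapped-in technique rather than something shown on the slide, and it only covers unique-violation errors; the sketch assumes the `t1` table and primary key created above.

     ```sql
     DO $$
     BEGIN
       INSERT INTO t1 VALUES (1), (2) ON CONFLICT (c1) DO NOTHING;
       -- The duplicate row is skipped instead of raising an error,
       -- so no EXCEPTION section (and no subtransaction) is needed.
       INSERT INTO t1 VALUES (2) ON CONFLICT (c1) DO NOTHING;
       IF NOT FOUND THEN
         RAISE NOTICE 'row with c1 = 2 already existed';
       END IF;
     END;
     $$;
     ```

     Unlike the exception-block version, the rows inserted before the conflict are kept rather than rolled back.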
  34. Characteristics of the EXCEPTION block
     • Error when running a function declared parallel safe
     CREATE OR REPLACE FUNCTION F_GET_REGION_NM_EX ( p_flag IN VARCHAR, -- district (sigungu) / province (sido) flag
       p_cust_no IN NUMERIC) RETURNS VARCHAR LANGUAGE PLPGSQL STABLE PARALLEL SAFE STRICT AS $$
     DECLARE
       v_out_name VARCHAR(100);
     BEGIN
       IF p_flag ='G1' THEN
         SELECT B.CD_NM -- return the district name
         INTO v_out_name
         FROM CUSTOMER A LEFT JOIN COM_CODE B ON (A.SIGUNGU_CD = B.CD AND A.CUST_NO = p_cust_no)
         WHERE B.GROUP_CD = p_flag;
       ELSIF p_flag='G3' THEN
         SELECT B.CD_NM -- return the province name
         INTO v_out_name
         FROM CUSTOMER A LEFT JOIN COM_CODE B ON (A.SIDO_CD = B.CD AND A.CUST_NO = p_cust_no)
         WHERE B.GROUP_CD = p_flag;
       END IF;
       RETURN v_out_name;
     EXCEPTION WHEN OTHERS THEN
       raise notice '% %', SQLERRM, SQLSTATE;
       RETURN NULL;
     END;
     $$;
     SELECT CUST_NM, F_GET_REGION_NM_EX('G1',CUST_NO) FROM CUSTOMER;
     ERROR: cannot start subtransactions during a parallel operation
     CONTEXT: PL/pgSQL function f_get_region_nm_ex(character varying,numeric) line 4 during statement block entry
     SQL state: 25000
  35. IV. PORTING EXAMPLE (exception block)
     • Oracle function
     CREATE OR REPLACE FUNCTION F_FINAL_ORD_DT ( p_cust_no IN OFFLINE_ORDER.CUST_NO%TYPE ) RETURN VARCHAR2 IS
       v_final_ord_dt OFFLINE_ORDER.ORD_DT%TYPE;
     BEGIN
       SELECT MAX(ORD_DT) INTO v_final_ord_dt FROM OFFLINE_ORDER WHERE CUST_NO = p_cust_no;
       RETURN v_final_ord_dt;
     EXCEPTION WHEN OTHERS THEN
       RETURN NULL;
     END;
     • PostgreSQL function
     CREATE OR REPLACE FUNCTION F_FINAL_ORD_DT (p_cust_no numeric) RETURNS VARCHAR LANGUAGE SQL STABLE PARALLEL SAFE STRICT AS $$
       SELECT MAX(ORD_DT) FROM OFFLINE_ORDER WHERE CUST_NO = p_cust_no;
     $$;
     ○ Check whether the EXCEPTION block is really needed.
     ○ An EXCEPTION block cannot be implemented in a SQL function.
     ○ A function with an EXCEPTION block can only be created as parallel unsafe.
     ○ commit/rollback cannot be used.
     ○ Performance barrier
  36. IV. PORTING EXAMPLE (exception block)
     • Oracle function
     CREATE OR REPLACE FUNCTION F_SIDO_NM ( p_empno IN EMPLOYEE.EMPNO%TYPE ) RETURN VARCHAR2 IS
       v_sido_nm EMPLOYEE.SIDO_NM%TYPE;
     BEGIN
       SELECT SIDO_NM INTO v_sido_nm FROM EMPLOYEE WHERE EMPNO = p_empno;
       RETURN v_sido_nm;
     EXCEPTION WHEN NO_DATA_FOUND THEN
       v_sido_nm := '고객없음'; -- "no customer"
       RETURN v_sido_nm;
     END;
     ○ Is the EXCEPTION block really needed?
  37. IV. PORTING EXAMPLE (exception block)
     • PL/pgSQL function
     CREATE OR REPLACE FUNCTION F_SIDO_NM_PL( p_empno IN NUMERIC) RETURNS VARCHAR LANGUAGE PLPGSQL STABLE PARALLEL UNSAFE AS $$
     DECLARE
       v_sido_nm EMPLOYEE.SIDO_NM%TYPE;
     BEGIN
       SELECT SIDO_NM INTO STRICT v_sido_nm FROM EMPLOYEE WHERE EMPNO = p_empno;
       RETURN v_sido_nm;
     EXCEPTION WHEN NO_DATA_FOUND THEN
       v_sido_nm := '고객없음'; -- "no customer"
       RETURN v_sido_nm;
     END;
     $$;
     • SQL function
     CREATE OR REPLACE FUNCTION F_SIDO_NM( p_empno IN NUMERIC) RETURNS VARCHAR LANGUAGE SQL STABLE PARALLEL SAFE AS $$
       SELECT COALESCE(max(sido_nm),'고객없음') FROM EMPLOYEE WHERE EMPNO = p_empno;
     $$;
     ○ Writing STRICT after INTO makes the NO_DATA_FOUND and TOO_MANY_ROWS exceptions available when no row (or more than one row) is returned.
     ○ If the SIDO_NM column of the EMPLOYEE table is designed as NOT NULL, the function can be ported to a SQL function.
  38. IV. PORTING EXAMPLE (exception block)
     • PL/pgSQL function
     CREATE OR REPLACE FUNCTION F_SIDO_NM_NE( p_empno IN NUMERIC) RETURNS VARCHAR LANGUAGE PLPGSQL STABLE PARALLEL SAFE AS $$
     DECLARE
       v_sido_nm EMPLOYEE.SIDO_NM%TYPE;
     BEGIN
       SELECT SIDO_NM INTO v_sido_nm FROM EMPLOYEE WHERE EMPNO = p_empno;
       IF NOT FOUND THEN
         RETURN '고객없음'; -- "no customer"
       END IF;
       RETURN v_sido_nm;
     END;
     $$;
     ○ Removing the exception block makes it possible to declare the function parallel safe.
     ○ The FOUND variable is TRUE when one or more rows are returned.
  39. IV. PORTING EXAMPLE (exception block)
     • PL/pgSQL function with exception block
     SELECT F_SIDO_NM_PL(EMPNO) FROM OFFLINE_ORDER;
     Index Only Scan using offline_order_x02 on portal.offline_order (actual time=0.090..5815.471 rows=1000002 loops=1)
       Output: f_sido_nm_pl(empno)
       Heap Fetches: 613
       Buffers: shared hit=3002755
     Planning Time: 0.036 ms
     Execution Time: 5870.846 ms
     • SQL function
     SELECT F_SIDO_NM(EMPNO) FROM OFFLINE_ORDER;
     Gather (actual time=0.484..2725.507 rows=1000002 loops=1)
       Output: (f_sido_nm(empno))
       Workers Planned: 2
       Workers Launched: 2
       Buffers: shared hit=3003755
       -> Parallel Index Only Scan using offline_order_x02 on portal.offline_order (actual time=0.784..2639.281 rows=333334 loops=3)
            Output: f_sido_nm(empno)
            Heap Fetches: 613
            Buffers: shared hit=3003755
            Worker 0: actual time=1.075..2677.726 rows=334158 loops=1 Buffers: shared hit=1003887
            Worker 1: actual time=1.077..2676.948 rows=344494 loops=1 Buffers: shared hit=1034924
     Planning: Buffers: shared hit=13
     Planning Time: 0.111 ms
     Execution Time: 2769.172 ms
     • PL/pgSQL function without exception block
     SELECT F_SIDO_NM_NE(EMPNO) FROM OFFLINE_ORDER;
     Gather (actual time=0.457..2170.676 rows=1000002 loops=1)
       Output: (f_sido_nm_ne(empno))
       Workers Planned: 2
       Workers Launched: 2
       ...............
     Planning Time: 0.058 ms
     Execution Time: 2216.689 ms
     ○ Removing the exception block improves performance by about 20%, even for a parallel unsafe function.
     ○ The SQL function is slower because it uses COALESCE and max().
  40. Function Optimization Silver Rule
      Rule #1 : Never create a function.
      Rule #2 : Never forget rule #1.
      ○ Prefer an inlinable SQL function over a PL/pgSQL function
      ○ Parallel Unsafe (default) -> Parallel Restricted -> Parallel Safe: use the least restrictive label that is valid
      ○ Use STRICT when NULL input is certain to produce NULL output
      ○ VOLATILE (default) -> STABLE -> IMMUTABLE: use the strictest volatility label that is valid
      ○ Minimize the use of EXCEPTION blocks
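The labels in the list above are all declared in the function header. A minimal sketch, assuming a hypothetical helper f_ord_day that is not part of the talk; whether the planner actually inlines a particular SQL function depends on further conditions that this sketch does not try to cover:

```sql
-- Hypothetical helper (not from the slides): truncates a timestamp to a date.
-- The body is a single SELECT expression with no table access.
CREATE OR REPLACE FUNCTION f_ord_day(p_ord_date timestamp)
RETURNS date
LANGUAGE sql
IMMUTABLE        -- the same input always gives the same output
PARALLEL SAFE    -- may be executed inside parallel workers
AS $$
    SELECT p_ord_date::date
$$;

-- Check which labels were recorded in the catalog:
--   provolatile: i = immutable, s = stable, v = volatile
--   proparallel: s = safe, r = restricted, u = unsafe
SELECT proname, provolatile, proparallel, proisstrict
FROM pg_proc
WHERE proname = 'f_ord_day';
```

Querying pg_proc this way is a quick check that a ported function did not silently keep the VOLATILE and PARALLEL UNSAFE defaults.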
  41. IV. PORTING EXAMPLE (check for existence)
      • Oracle Function
      CREATE OR REPLACE FUNCTION F_CHECK_CUST_YN (P_CUST_NM IN VARCHAR2, P_ORD_DT IN VARCHAR2)
      -- takes a customer name and checks whether any order exists
      RETURN VARCHAR2
      IS
        v_chk_cnt NUMBER;
        v_chk_result VARCHAR2(1);
      BEGIN
        BEGIN
          SELECT COUNT(1) INTO v_chk_cnt
          FROM CUSTOMER A, ONLINE_ORDER B
          WHERE A.CUST_NO = B.CUST_NO
            AND A.CUST_NM = p_cust_nm
            AND B.ORD_DATE >= TO_DATE(P_ORD_DT,'YYYYMMDD')
            AND B.ORD_DATE < TO_DATE(P_ORD_DT,'YYYYMMDD') + 1
            AND ROWNUM = 1;
        EXCEPTION
          WHEN NO_DATA_FOUND THEN
            v_chk_result := 'N';
            RETURN v_chk_result;
          WHEN OTHERS THEN  -- P_ORD_DT is not a valid date
            v_chk_result := 'N';
            RETURN v_chk_result;
        END;
        IF v_chk_cnt > 0 THEN  -- return 'Y' if an order exists
          v_chk_result := 'Y';
        ELSE
          v_chk_result := 'N';  -- return 'N' if no order exists
        END IF;
        RETURN v_chk_result;
      END;
      /
      Sources of inefficiency
      ○ COUNT(1)
      ○ ROWNUM = 1
      ○ IF v_chk_cnt > 0
      ○ TO_DATE function
      ○ Use of an inner block
      ○ Can no_data_found ever be raised?
      ○ Can the EXCEPTION block be removed?
  42. IV. PORTING EXAMPLE (check for existence)
      • PL/pgSQL function with only the source code ported
      CREATE OR REPLACE FUNCTION F_CHECK_CUST_YN (P_CUST_NM IN VARCHAR, P_ORD_DT IN VARCHAR)
      RETURNS VARCHAR
      language plpgsql stable parallel unsafe
      AS $$
      DECLARE
        v_chk_cnt INT;
        v_chk_result VARCHAR(1);
      BEGIN
        BEGIN
          SELECT COUNT(1) INTO STRICT v_chk_cnt
          FROM CUSTOMER A, ONLINE_ORDER B
          WHERE A.CUST_NO = B.CUST_NO
            AND A.CUST_NM = p_cust_nm
            AND B.ORD_DATE >= TO_DATE(P_ORD_DT,'YYYYMMDD')
            AND B.ORD_DATE < TO_DATE(P_ORD_DT,'YYYYMMDD') + 1
          FETCH NEXT 1 ROWS ONLY;
        EXCEPTION
          WHEN NO_DATA_FOUND THEN
            v_chk_result := 'N';
            RETURN v_chk_result;
          WHEN OTHERS THEN
            v_chk_result := 'N';
            RETURN v_chk_result;
        END;
        IF v_chk_cnt > 0 THEN  -- return 'Y' if an order exists
          v_chk_result := 'Y';
        ELSE
          v_chk_result := 'N';  -- return 'N' if no order exists
        END IF;
        RETURN v_chk_result;
      END;
      $$
      Sources of inefficiency
      ○ COUNT(1)
      ○ FETCH NEXT 1 ROWS ONLY
      ○ IF v_chk_cnt > 0
      ○ TO_DATE function
      ○ Use of an inner block
      ○ Can no_data_found ever be raised?
      ○ Can the EXCEPTION block be removed?
  43. IV. PORTING EXAMPLE (check for existence)
      • Checking the performance of the SQL inside the function
      SELECT count(*)
      FROM CUSTOMER A, ONLINE_ORDER B
      WHERE A.CUST_NO = B.CUST_NO
        AND A.CUST_NM = 'L225795CUST_NM'
        AND B.ORD_DATE >= TO_DATE('20190815','YYYYMMDD')
        AND B.ORD_DATE < TO_DATE('20190815','YYYYMMDD') + 1
      FETCH NEXT 1 ROWS ONLY;
      Limit (actual time=180.208..182.522 rows=1 loops=1)
        -> Aggregate (actual time=180.207..182.520 rows=1 loops=1)
           -> Gather (actual time=0.638..182.514 rows=2 loops=1)
                Workers Planned: 2
                Workers Launched: 2
                -> Nested Loop (actual time=86.696..175.015 rows=1 loops=3)
                   -> Parallel Seq Scan on online_order b (actual time=0.387..170.769 rows=667 loops=3)
                        Filter: ((ord_date >= to_date('20190815'::text, 'YYYYMMDD'::text)) AND (ord_date < (to_date('20190815'::text, 'YYYYMMDD'::text) + 1)))
                        Rows Removed by Filter: 332667
                   -> Index Scan using customer_pk on customer a (actual time=0.006..0.006 rows=0 loops=2000)
                        Index Cond: (cust_no = b.cust_no)
                        Filter: ((cust_nm)::text = 'L225795CUST_NM'::text)
                        Rows Removed by Filter: 1
      Planning Time: 0.152 ms
      Execution Time: 182.549 ms
  44. IV. PORTING EXAMPLE (check for existence)
      • Performance after modifying the SQL inside the function (1)
      SELECT COUNT(*)
      WHERE EXISTS (SELECT 1
                    FROM CUSTOMER A, ONLINE_ORDER B
                    WHERE A.CUST_NO = B.CUST_NO
                      AND A.CUST_NM = 'L225795CUST_NM'
                      AND B.ORD_DATE >= TO_DATE('20190815','YYYYMMDD')
                      AND B.ORD_DATE < TO_DATE('20190815','YYYYMMDD') + 1);
      Aggregate (actual time=194.693..196.654 rows=1 loops=1)
        InitPlan 1 (returns $2)
          -> Gather (actual time=194.686..196.646 rows=1 loops=1)
               Workers Planned: 2
               Workers Launched: 2
               -> Nested Loop (actual time=134.338..192.142 rows=1 loops=3)
                  -> Parallel Seq Scan on online_order b (actual time=0.455..187.458 rows=667 loops=3)
                       Filter: ((ord_date >= to_date('20190815'::text, 'YYYYMMDD'::text)) AND (ord_date < (to_date('20190815'::text, 'YY~D'::text) + 1)))
                       Rows Removed by Filter: 332667
                  -> Index Scan using customer_pk on customer a (actual time=0.006..0.006 rows=0 loops=2000)
                       Index Cond: (cust_no = b.cust_no)
                       Filter: ((cust_nm)::text = 'L225795CUST_NM'::text)
                       Rows Removed by Filter: 1
        -> Result (actual time=194.689..194.689 rows=1 loops=1)
             One-Time Filter: $2
      Planning Time: 0.160 ms
      Execution Time: 196.677 ms
      An efficient execution plan is not produced during a parallel restricted operation.
  45. IV. PORTING EXAMPLE (check for existence)
      • Performance after modifying the SQL inside the function (2)
      SELECT COUNT(*)
      WHERE EXISTS (SELECT 1
                    FROM CUSTOMER A, ONLINE_ORDER B, current_schema()
                    WHERE A.CUST_NO = B.CUST_NO
                      AND A.CUST_NM = 'L225795CUST_NM'
                      AND B.ORD_DATE >= TO_DATE('20190815','YYYYMMDD')
                      AND B.ORD_DATE < TO_DATE('20190815','YYYYMMDD') + 1);
      Aggregate (actual time=0.023..0.024 rows=1 loops=1)
        InitPlan 1 (returns $1)
          -> Nested Loop (actual time=0.020..0.021 rows=1 loops=1)
             -> Nested Loop (actual time=0.017..0.017 rows=1 loops=1)
                -> Seq Scan on online_order b (actual time=0.007..0.007 rows=1 loops=1)
                     Filter: ((ord_date >= to_date('20190815'::text, 'YYYYMMDD'::text)) AND (ord_date < (to_date('20190815'::text, 'Y~D'::text) + 1)))
                     Rows Removed by Filter: 1
                -> Index Scan using customer_pk on customer a (actual time=0.009..0.009 rows=1 loops=1)
                     Index Cond: (cust_no = b.cust_no)
                     Filter: ((cust_nm)::text = 'L225795CUST_NM'::text)
             -> Function Scan on "current_schema" (actual time=0.003..0.003 rows=1 loops=1)
        -> Result (actual time=0.022..0.022 rows=1 loops=1)
             One-Time Filter: $1
      Planning Time: 0.170 ms
      Execution Time: 0.044 ms
      Parallel operation is not used.
      current_schema() is called only once.
  46. IV. PORTING EXAMPLE (check for existence)
      • Function after tuning
      CREATE OR REPLACE FUNCTION F_CHECK_CUST_YN_OPTIM (P_CUST_NM IN VARCHAR, P_ORD_DT IN VARCHAR)
      RETURNS VARCHAR
      language plpgsql parallel unsafe
      AS $$
      DECLARE
        v_chk_cnt INT;
        v_chk_result VARCHAR(1);
      BEGIN
        BEGIN
          SELECT COUNT(*) INTO STRICT v_chk_cnt
          WHERE EXISTS (SELECT 1
                        FROM CUSTOMER A, ONLINE_ORDER B, current_schema()
                        WHERE A.CUST_NO = B.CUST_NO
                          AND A.CUST_NM = p_cust_nm
                          AND B.ORD_DATE >= P_ORD_DT::timestamp
                          AND B.ORD_DATE < P_ORD_DT::timestamp + interval '1 day');
        EXCEPTION
          WHEN NO_DATA_FOUND THEN
            v_chk_result := 'N';
            RETURN v_chk_result;
          WHEN OTHERS THEN
            v_chk_result := 'N';
            RETURN v_chk_result;
        END;
        IF v_chk_cnt > 0 THEN  -- return 'Y' if an order exists
          v_chk_result := 'Y';
        ELSE
          v_chk_result := 'N';  -- return 'N' if no order exists
        END IF;
        RETURN v_chk_result;
      END;
      $$
      Improvements
      • count(1) -> count(*)
      • current_schema() added to prevent parallel operation (see the next slide for the reason)
      • The stable function TO_DATE is removed
      • p_ord_dt may not be a valid value, so the exception block is still needed
      count(*) never raises no_data_found.
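The last point on this slide can be checked in isolation. A minimal sketch, using generate_series(1, 0) as a stand-in for a query that matches no rows (an assumption for illustration, not part of the talk):

```sql
DO $$
DECLARE
    v_cnt int;
BEGIN
    -- generate_series(1, 0) yields no rows, standing in for "no matching order".
    SELECT count(*) INTO STRICT v_cnt
    FROM generate_series(1, 0);

    -- Reached without an exception: the aggregate returned one row holding 0.
    RAISE NOTICE 'v_cnt = %', v_cnt;
EXCEPTION
    WHEN NO_DATA_FOUND THEN
        RAISE NOTICE 'NO_DATA_FOUND was raised';
END
$$;
```

Because an aggregate without GROUP BY always returns exactly one row, INTO STRICT has nothing to complain about here and the NO_DATA_FOUND handler is never entered.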
  47. IV. PORTING EXAMPLE (check for existence)
      • Function performance before tuning
      SELECT F_CHECK_CUST_YN ('L225795CUST_NM','20190815');
      Result (actual time=163.996..163.996 rows=1 loops=1)
        Output: f_check_cust_yn('L225795CUST_NM'::character varying, '20190815'::character varying)
        Buffers: shared hit=12271 read=16148
      Planning Time: 0.014 ms
      Execution Time: 164.007 ms
      • Function performance after tuning
      SELECT F_CHECK_CUST_YN_OPTIM ('L225795CUST_NM','20190815');
      Result (actual time=0.494..0.494 rows=1 loops=1)
        Output: f_check_cust_yn_optim('L225795CUST_NM'::character varying, '20190815'::character varying)
        Buffers: shared hit=51
      Planning Time: 0.014 ms
      Execution Time: 0.504 ms
  48. IV. PORTING EXAMPLE (check for existence)
      • How to view the execution plan of SQL statements inside a function
      • Even when the function is marked parallel unsafe, the SQL inside it can still run in parallel mode
      CREATE OR REPLACE FUNCTION F_CHECK_CUST_YN_OPTIM (P_CUST_NM IN VARCHAR, P_ORD_DT IN VARCHAR)
      RETURNS VARCHAR
      language plpgsql stable parallel unsafe
      AS $$
      DECLARE
        v_chk_cnt INT;
        v_chk_result VARCHAR(1);
      BEGIN
        BEGIN
          SELECT COUNT(*) INTO STRICT v_chk_cnt
          WHERE EXISTS (SELECT 1
                        FROM CUSTOMER A, ONLINE_ORDER B
                        WHERE A.CUST_NO = B.CUST_NO
                          AND A.CUST_NM = p_cust_nm
                          AND B.ORD_DATE >= P_ORD_DT::timestamp
                          AND B.ORD_DATE < P_ORD_DT::timestamp + interval '1 day');
        EXCEPTION
        ...
      Run the commands below in the session that executes the function:
      load 'auto_explain';
      set auto_explain.log_min_duration=0;
      set auto_explain.log_analyze to true;
      set auto_explain.log_buffers to false;
      set auto_explain.log_timing to true;
      set auto_explain.log_verbose to true;
      set auto_explain.log_nested_statements to true;
  49. IV. PORTING EXAMPLE (check for existence)
      • How to view the execution plan of SQL statements inside a function
      • Even when the function is marked parallel unsafe, the SQL inside it can still run in parallel mode
      Aggregate (cost=36055.88..36055.89 rows=1 width=8) (actual time=0.286..295.796 rows=1 loops=1)
        Output: count(*)
        InitPlan 1 (returns $2)
          -> Gather (cost=1000.42..36055.87 rows=1 width=0) (actual time=0.281..295.790 rows=1 loops=1)
               Workers Planned: 2
               Workers Launched: 2
               -> Nested Loop (cost=0.42..35055.77 rows=1 width=0) (actual time=125.254..125.255 rows=1 loops=3)
                    Inner Unique: true
                    Worker 0: actual time=99.434..99.435 rows=1 loops=1
                    Worker 1: actual time=276.305..276.307 rows=0 loops=1
                  -> Parallel Seq Scan on portal.online_order b (cost=0.00..31866.86 rows=415 width=6) (actual time=1.962..121.375 rows=667 loops=3)
                       Output: b.ord_no, b.cust_no, b.ord_date, b.ord_dt, b.ord_status_cd, b.comment
                       Filter: ((b.ord_date >= ('20190815'::cstring)::date) AND (b.ord_date < (('20190815'::cstring)::date + '1 day'::interval)))
                       Rows Removed by Filter: 332637
                       Worker 0: actual time=4.605..95.948 rows=498 loops=1
                       Worker 1: actual time=1.271..268.166 rows=1501 loops=1
                  -> Index Scan using customer_pk on portal.customer a (cost=0.42..7.67 rows=1 width=6) (actual time=0.005..0.005 rows=0 loops=2000)
                       Output: a.cust_no, a.cust_nm, a.register_date, a.register_dt, a.cust_status_cd, a.register_channel_cd, a.cust_age, a.active_yn, a.sigungu_cd, a.sido_cd
                       Index Cond: (a.cust_no = b.cust_no)
                       Filter: ((a.cust_nm)::text = 'L225795CUST_NM'::text)
                       Rows Removed by Filter: 1
                       Worker 0: actual time=0.006..0.006 rows=0 loops=498
                       Worker 1: actual time=0.005..0.005 rows=0 loops=1501
        -> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.283..0.283 rows=1 loops=1)
             One-Time Filter: $2
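By default auto_explain writes its plans to the server log at LOG level. To see a plan like the one above directly in the client session, one option is to raise client_min_messages as well. A minimal sketch, assuming the session is allowed to LOAD the module and that the function from the previous slides exists:

```sql
LOAD 'auto_explain';
SET auto_explain.log_min_duration = 0;          -- log every statement
SET auto_explain.log_analyze = true;            -- include actual times and row counts
SET auto_explain.log_nested_statements = true;  -- include statements run inside functions
SET client_min_messages = log;                  -- send LOG-level messages to this session

SELECT F_CHECK_CUST_YN_OPTIM('L225795CUST_NM', '20190815');
```

All of these settings are session-local, so they disappear when the session ends and do not affect other connections.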
  50. IV. PORTING EXAMPLE (check for existence)
      • Setting optimizer control parameters inside a function
      CREATE OR REPLACE FUNCTION F_CHECK_CUST_YN_OPTIM (P_CUST_NM IN VARCHAR, P_ORD_DT IN VARCHAR)
      RETURNS VARCHAR
      language plpgsql stable parallel unsafe
      set max_parallel_workers_per_gather = 0
      AS $$
      DECLARE
        v_chk_cnt INT;
        v_chk_result VARCHAR(1);
      BEGIN
        BEGIN
          SELECT COUNT(*) INTO STRICT v_chk_cnt
          FROM CUSTOMER A, ONLINE_ORDER B --, current_schema()
          WHERE A.CUST_NO = B.CUST_NO
            AND A.CUST_NM = p_cust_nm
            AND B.ORD_DATE >= P_ORD_DT::timestamp
            AND B.ORD_DATE < P_ORD_DT::timestamp + interval '1 day'
          FETCH NEXT 1 ROWS ONLY;
        EXCEPTION
          WHEN NO_DATA_FOUND THEN
            v_chk_result := 'N';
            RETURN v_chk_result;
          WHEN OTHERS THEN
            v_chk_result := 'N';
            RETURN v_chk_result;
        END;
        IF v_chk_cnt > 0 THEN  -- return 'Y' if an order exists
          v_chk_result := 'Y';
        ELSE
          v_chk_result := 'N';  -- return 'N' if no order exists
        END IF;
        RETURN v_chk_result;
      END;
      $$
      If you would rather not use the internal function current_schema(), you can set optimizer control parameters in the function definition instead.
      The setting applies only inside the function, not to the SQL statement that calls it.
  51. IV. PORTING EXAMPLE (check for existence)
      • Simplified logic, inner block removed
      CREATE OR REPLACE FUNCTION F_CHECK_CUST_YN_OPT2 (P_CUST_NM IN VARCHAR, P_ORD_DT IN VARCHAR)
      RETURNS VARCHAR
      LANGUAGE PLPGSQL STABLE PARALLEL UNSAFE
      AS $$
      DECLARE
        v_chk_result VARCHAR(1);
      BEGIN
        IF EXISTS (
          SELECT current_schema()  --prevents parallel operation
          FROM CUSTOMER A, ONLINE_ORDER B
          WHERE A.CUST_NO = B.CUST_NO
            AND A.CUST_NM = p_cust_nm
            AND B.ORD_DATE >= P_ORD_DT::timestamp
            AND B.ORD_DATE < P_ORD_DT::timestamp + interval '1 day'
        ) THEN
          v_chk_result := 'Y';  -- return Y if an order exists
        ELSE
          v_chk_result := 'N';  -- return N if no order exists
        END IF;
        RETURN v_chk_result;
      EXCEPTION
        WHEN OTHERS THEN
          v_chk_result := 'N';
          RETURN v_chk_result;
      END;
      $$
  52. IV. PORTING EXAMPLE (check for existence)
      • Before removing v_chk_cnt
      SELECT F_CHECK_CUST_YN_OPTIM(CUST_NM,'20190815') FROM CUSTOMER LIMIT 100;
      Limit (actual time=332.772..34586.427 rows=100 loops=1)
        Output: (f_check_cust_yn_optim(cust_nm, '20190815'::character varying))
        Buffers: shared hit=2842502
        -> Seq Scan on portal.customer (actual time=332.770..34586.343 rows=100 loops=1)
             Output: f_check_cust_yn_optim(cust_nm, '20190815'::character varying)
             Buffers: shared hit=2842502
      Planning Time: 0.040 ms
      Execution Time: 34586.490 ms
      • After removing v_chk_cnt
      SELECT F_CHECK_CUST_YN_OPT2(CUST_NM,'20190815') FROM CUSTOMER LIMIT 100;
      Limit (actual time=353.044..34389.961 rows=100 loops=1)
        Output: (f_check_cust_yn_opt2(cust_nm, '20190815'::character varying))
        Buffers: shared hit=2842502
        -> Seq Scan on portal.customer (actual time=353.042..34389.873 rows=100 loops=1)
             Output: f_check_cust_yn_opt2(cust_nm, '20190815'::character varying)
             Buffers: shared hit=2842502
      Planning Time: 0.049 ms
      Execution Time: 34390.019 ms
      Over 100 function calls, the function with the simplified logic is consistently, if only slightly, faster.
  53. Function Optimization Silver Rule
      Rule #1 : Never create a function.
      Rule #2 : Never forget rule #1.
      ○ Prefer an inlinable SQL function over a PL/pgSQL function
      ○ Parallel Unsafe (default) -> Parallel Restricted -> Parallel Safe: use the least restrictive label that is valid
      ○ Use STRICT when NULL input is certain to produce NULL output
      ○ VOLATILE (default) -> STABLE -> IMMUTABLE: use the strictest volatility label that is valid
      ○ Minimize the use of EXCEPTION blocks
      ○ Logic that only checks for existence should read a single row and stop
  54. IV. PORTING EXAMPLE (skewed data)
      • Function before improvement
      CREATE OR REPLACE FUNCTION F_MAX_AGE(P_SIGUNGU_CD IN VARCHAR)
      RETURNS NUMERIC
      language plpgsql stable parallel unsafe   -- for testing purposes
      set max_parallel_workers_per_gather=0     -- for testing purposes
      AS $$
      DECLARE
        v_max NUMERIC;
      BEGIN
        SELECT MAX(CUST_AGE) INTO v_max
        FROM CUSTOMER
        WHERE SIGUNGU_CD = P_SIGUNGU_CD;
        RETURN v_max;
      END
      $$
      Data distribution
      sigungu_cd   count
      11001        994000
      11002        1000
      11003        1000
      11004        1000
      11005        1000
      11006        1000
      11007        1000
  55. IV. PORTING EXAMPLE (skewed data)
      • Repeated function execution and elapsed-time measurement
      SELECT F_MAX_AGE('11001');  -- Seoul
      Time: 181.463 ms   -- auto_explain shows a full scan
      SELECT F_MAX_AGE('11006');  -- Jeju
      Time: 1.512 ms     -- auto_explain shows a Bitmap Index Scan
      SELECT F_MAX_AGE('11001');  -- Seoul
      Time: 215.504 ms   -- auto_explain shows a Bitmap Index Scan
      Unlike Oracle, PostgreSQL always performs bind peeking for ordinary SQL statements.
      However, the execution plan of a user-defined function may be reused (though it is not always reused).
      -> Storing the execution plan of the SQL inside a function reduces parsing overhead.
      PostgreSQL Manual: Typically Plan Caching will happen only if the execution plan is not very sensitive to the values of the PL/pgSQL variables referenced in it.
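Besides the dynamic SQL approach shown on the following slides, PostgreSQL 12 and later offer the plan_cache_mode setting, which can be attached to a single function. This is not something the talk covers; it is a sketch of an alternative, assuming the F_MAX_AGE function defined on the previous slide and a server version that has the setting:

```sql
-- Assumes PostgreSQL 12 or later, where plan_cache_mode exists.
-- force_custom_plan asks the planner to build a plan for the actual
-- parameter values on every execution instead of reusing a generic plan.
ALTER FUNCTION F_MAX_AGE(varchar)
    SET plan_cache_mode = force_custom_plan;

-- Re-run the skewed and non-skewed inputs and compare the plans
-- reported by auto_explain.
SELECT F_MAX_AGE('11001');
SELECT F_MAX_AGE('11006');
SELECT F_MAX_AGE('11001');
```

The trade-off is the same one the slides describe for dynamic SQL: every call pays planning cost in exchange for a plan that fits the input value.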
  56. IV. PORTING EXAMPLE (skewed data)
      • Using dynamic SQL inside the function
      CREATE OR REPLACE FUNCTION F_MAX_AGE_DYN (P_SIGUNGU_CD IN VARCHAR)
      RETURNS NUMERIC
      language plpgsql stable strict parallel unsafe   -- for testing purposes
      set max_parallel_workers_per_gather=0            -- for testing purposes
      AS $body$
      DECLARE
        v_max numeric;
        v_sql text;
      BEGIN
        v_sql := $$SELECT MAX(CUST_AGE) FROM CUSTOMER WHERE SIGUNGU_CD =$$||quote_literal(P_SIGUNGU_CD)||$$$$;
        EXECUTE v_sql INTO v_max;
        RETURN v_max;
      END
      $body$
  57. IV. PORTING EXAMPLE (skewed data)
      • Repeated execution of the dynamic SQL function and elapsed-time measurement
      SELECT F_MAX_AGE_DYN('11001');  -- Seoul
      Time: 186.320 ms   -- auto_explain shows a full scan
      SELECT F_MAX_AGE_DYN('11006');  -- Jeju
      Time: 1.730 ms     -- auto_explain shows a Bitmap Index Scan
      SELECT F_MAX_AGE_DYN('11001');  -- Seoul
      Time: 184.611 ms   -- auto_explain shows a full scan
      Dynamic SQL inside a function goes through parse analysis on every call, so parsing overhead is incurred every time.
      If different input values need different execution plans, use dynamic SQL.
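The dynamic statement can also take its input through EXECUTE ... USING instead of string concatenation. According to the PL/pgSQL documentation, EXECUTE re-plans the command on each execution with a plan specific to the current parameter values. A minimal sketch, assuming the CUSTOMER table from the slides; the name F_MAX_AGE_DYN2 is hypothetical:

```sql
-- Hypothetical variant of F_MAX_AGE_DYN (name not from the slides).
CREATE OR REPLACE FUNCTION F_MAX_AGE_DYN2(P_SIGUNGU_CD IN VARCHAR)
RETURNS NUMERIC
LANGUAGE plpgsql STABLE STRICT
AS $body$
DECLARE
    v_max numeric;
BEGIN
    -- $1 is bound from the USING clause, so no quoting helper is needed.
    EXECUTE 'SELECT MAX(CUST_AGE) FROM CUSTOMER WHERE SIGUNGU_CD = $1'
       INTO v_max
      USING P_SIGUNGU_CD;
    RETURN v_max;
END
$body$;
```

Passing the value through USING keeps the command string constant and avoids having to remember quote_literal for every concatenated input.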
  58. Function Optimization Silver Rule
      Rule #1 : Never create a function.
      Rule #2 : Never forget rule #1.
      ○ Prefer an inlinable SQL function over a PL/pgSQL function
      ○ Parallel Unsafe (default) -> Parallel Restricted -> Parallel Safe: use the least restrictive label that is valid
      ○ Use STRICT when NULL input is certain to produce NULL output
      ○ VOLATILE (default) -> STABLE -> IMMUTABLE: use the strictest volatility label that is valid
      ○ Minimize the use of EXCEPTION blocks
      ○ Logic that only checks for existence should read a single row and stop
      ○ Use dynamic SQL when different function inputs need different execution plans
  59. IV. PORTING EXAMPLE (Autonomous Transaction)
      • Oracle Function
      CREATE OR REPLACE FUNCTION F_SINGO_NEXT_VAL (v_ymd IN VARCHAR2)
      RETURN VARCHAR2
      IS
        PRAGMA AUTONOMOUS_TRANSACTION;
        v_last_singo_no VARCHAR2(17);
      BEGIN
        UPDATE singo_no
        SET last_singo_no = 'S' || v_ymd || LPAD(TO_CHAR(TO_NUMBER( NVL(SUBSTR(last_singo_no,10,17),'0') ) +1),8,'0');
        -- if no row was updated, this is the first number issued, so INSERT
        IF SQL%ROWCOUNT = 0 THEN
          INSERT INTO singo_no VALUES ('S'||v_ymd||'00000001');
        END IF;
        -- get the issued number
        SELECT last_singo_no INTO v_last_singo_no FROM singo_no;
        COMMIT;  --transaction commit
        RETURN v_last_singo_no;
      END;
      The table that stores the latest report number (singo_no) is created in advance:
      create table singo_no ( last_singo_no varchar2(40) not null );
      Number format: 'S' + 'yyyymmdd' + 8-digit number, e.g. S2021102800000009
  60. IV. PORTING EXAMPLE (Autonomous Transaction)
      • Why use an autonomous transaction
      Steps in time order:
      transaction 1: transaction starts -> fetch report number -> report number fetched -> run other INSERT/UPDATE -> other INSERT/UPDATE complete, commit
      transaction 2: transaction starts -> fetch report number (waits because tx1 has not finished yet) -> report number fetched -> start other SELECT/INSERT/UPDATE
      TX2 is left waiting unnecessarily.
  61. IV. PORTING EXAMPLE (Autonomous Transaction)
      • PostgreSQL has no pragma autonomous_transaction feature.
      • A proactive, capable developer should persuade the business owner to drop features that PostgreSQL does not support from the business requirements.
  62. IV. PORTING EXAMPLE (Autonomous Transaction)
      • PostgreSQL function with the autonomous transaction removed
      create or replace function singo_next_val(v_ymd in varchar)
      returns varchar
      language sql volatile parallel unsafe
      as $$
        UPDATE singo_no
        SET last_singo_no = 'S'||v_ymd||LPAD(( COALESCE(SUBSTR(last_singo_no,10,17),'0')::numeric +1)::text,8,'0')
        RETURNING last_singo_no;
      $$
      create table singo_no ( last_singo_no varchar(40) not null);
      -- make autovacuum run as often as possible
      -- in Oracle the table always stays at 8k or less; in PostgreSQL the table grows
      alter table singo_no set (autovacuum_vacuum_scale_factor = 0.0);
      alter table singo_no set (autovacuum_vacuum_threshold = 10);
      alter table singo_no set (autovacuum_vacuum_insert_threshold = 10);
      -- insert the row below in advance so the first-number logic can be removed
      insert into singo_no values ('S2021010100000000');
  63. Function Optimization Silver Rule
      Rule #1 : Never create a function.
      Rule #2 : Never forget rule #1.
      ○ Prefer an inlinable SQL function over a PL/pgSQL function
      ○ Parallel Unsafe (default) -> Parallel Restricted -> Parallel Safe: use the least restrictive label that is valid
      ○ Use STRICT when NULL input is certain to produce NULL output
      ○ VOLATILE (default) -> STABLE -> IMMUTABLE: use the strictest volatility label that is valid
      ○ Minimize the use of EXCEPTION blocks
      ○ Logic that only checks for existence should read a single row and stop
      ○ Use dynamic SQL when different function inputs need different execution plans
      ○ For functionality PostgreSQL does not support, adjust the business requirements (BR)
  64. IV. PORTING EXAMPLE (Autonomous Transaction)
      • PostgreSQL function implementing an autonomous transaction with dblink
      CREATE EXTENSION dblink;  --dblink is a "contrib" extension
      create or replace function singo_next_val_at(v_ymd in varchar)
      returns varchar
      language sql volatile parallel safe
      as $body$  -- outer tag differs from the inner $$ so the quoting nests correctly
        SELECT *
        FROM dblink('host=/var/run/postgresql port=5432 user=scott dbname=analdb',
                    format($$UPDATE singo_no SET last_singo_no = 'S'||%L||LPAD(( COALESCE(SUBSTR(last_singo_no,10,17),'0')::numeric +1)::text,8,'0') RETURNING last_singo_no $$, v_ymd))
             as t1(last_singo_no varchar(17))
      $body$
      What if an autonomous transaction really must be used to improve concurrency?
      A dblink connection back to your own database is treated as a separate transaction.
  65. IV. PORTING EXAMPLE (Autonomous Transaction)
      • PostgreSQL function implementing an autonomous transaction with FDW
      create or replace function singo_next_val_at2(v_ymd in varchar)
      returns varchar
      language sql volatile parallel safe
      as $body$  -- outer tag differs from the inner $$ so the quoting nests correctly
        SELECT *
        FROM dblink('loopback_dblink',
                    format($$UPDATE singo_no SET last_singo_no = 'S'||%L||LPAD(( COALESCE(SUBSTR(last_singo_no,10,17),'0')::numeric +1)::text,8,'0') RETURNING last_singo_no $$, v_ymd))
             as t1(last_singo_no varchar(17))
      $body$
      CREATE EXTENSION IF NOT EXISTS postgres_fdw;
      -- note: the dblink_fdw wrapper used below is provided by the dblink extension
      CREATE SERVER loopback_dblink FOREIGN DATA WRAPPER dblink_fdw
        OPTIONS (hostaddr '127.0.0.1', dbname 'analdb');
      CREATE USER MAPPING FOR scott SERVER loopback_dblink
        OPTIONS (user 'scott', password 'tiger');
  66. IV. PORTING EXAMPLE (Autonomous Transaction)
      • Performance of the autonomous transaction implementations
      • Without autonomous transaction
      select singo_next_val ('20210101') from generate_series(1,100);
      Time: 8.879 ms
      • Autonomous transaction implemented with dblink
      select singo_next_val_at ('20210101') from generate_series(1,100);
      Time: 1233.183 ms (00:01.233)
      • Autonomous transaction implemented with FDW
      select singo_next_val_at2('20210101') from generate_series(1,100);
      Time: 1280.185 ms (00:01.280)
      Implementing an autonomous transaction with dblink or FDW degrades performance.
  67. IV. PORTING EXAMPLE (Autonomous Transaction)
      • Open the dblink within the session and keep it open
      create or replace function singo_next_val_at3(v_ymd in varchar)
      returns varchar
      language plpgsql volatile parallel safe
      as $body$
      DECLARE
        v_output varchar;
        v_cnt int;
      BEGIN
        SELECT count(*) INTO v_cnt
        FROM dblink_get_connections()
        WHERE dblink_get_connections @> '{dblink_self}';
        /* dblink_connect opens a persistent connection to a remote database */
        IF v_cnt = 0 THEN
          PERFORM dblink_connect('dblink_self','loopback_dblink');
        END IF;
        SELECT *
        FROM dblink('loopback_dblink',
                    format($$UPDATE singo_no SET last_singo_no = 'S'||%L||LPAD(( COALESCE(SUBSTR(last_singo_no,10,17),'0')::numeric +1)::text,8,'0') RETURNING last_singo_no $$, v_ymd))
             as t1(last_singo_no varchar(17))
        INTO v_output;
        RETURN v_output;
      END;
      $body$
      A link opened with dblink_connect is closed automatically when the session ends unless it is closed with dblink_disconnect.
  68. IV. PORTING EXAMPLE (Autonomous Transaction)
      • Performance of the autonomous transaction implementations
      • Without autonomous transaction
      select singo_next_val('20210101') from generate_series(1,100);
      Time: 8.879 ms
      • Autonomous transaction implemented with FDW
      select singo_next_val_at2('20210101') from generate_series(1,100);
      Time: 1280.185 ms (00:01.280)
      • Implementation that keeps the dblink open
      select singo_next_val_at3('20211028') from generate_series(1,100);
      Time: 1343.339 ms (00:01.343)
      Contrary to expectations, the function written to keep the dblink open performs worse.
      This appears to come from the performance difference between PL/pgSQL and SQL functions.
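One detail of singo_next_val_at3 may also contribute to this result. It opens the named connection dblink_self but then calls dblink() with the server name loopback_dblink. As the dblink documentation describes it, the first argument is looked up as a persistent connection name first, and otherwise treated as a connection string or foreign server, in which case a connection is made just for that call. The sketch below passes the connection name instead. It is unmeasured, and singo_next_val_at4 is a hypothetical name, not from the talk:

```sql
-- Hypothetical variant (not from the slides, not benchmarked).
-- Left at the default PARALLEL UNSAFE because it modifies data via dblink.
CREATE OR REPLACE FUNCTION singo_next_val_at4(v_ymd IN varchar)
RETURNS varchar
LANGUAGE plpgsql VOLATILE
AS $body$
DECLARE
    v_output varchar;
BEGIN
    -- Open the named connection once per session.
    -- dblink_get_connections() returns NULL when nothing is open.
    IF NOT ('dblink_self' = ANY (COALESCE(dblink_get_connections(), '{}'::text[]))) THEN
        PERFORM dblink_connect('dblink_self', 'loopback_dblink');
    END IF;

    -- Pass the connection name, not the server name, so the open
    -- connection is reused rather than a new one being made per call.
    SELECT t1.last_singo_no INTO v_output
    FROM dblink('dblink_self',
                format($$UPDATE singo_no
                            SET last_singo_no = 'S' || %L ||
                                LPAD((COALESCE(SUBSTR(last_singo_no, 10, 17), '0')::numeric + 1)::text, 8, '0')
                         RETURNING last_singo_no$$, v_ymd))
         AS t1(last_singo_no varchar(17));

    RETURN v_output;
END;
$body$;
```

Whether this closes the gap to the other variants would need to be measured on the same data; the sketch only removes the per-call connection setup as a possible factor.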
  69. Function Optimization Silver Rule
      Rule #1 : Never create a function.
      Rule #2 : Never forget rule #1.
      ○ Prefer an inlinable SQL function over a PL/pgSQL function
      ○ Parallel Unsafe (default) -> Parallel Restricted -> Parallel Safe: use the least restrictive label that is valid
      ○ Use STRICT when NULL input is certain to produce NULL output
      ○ VOLATILE (default) -> STABLE -> IMMUTABLE: use the strictest volatility label that is valid
      ○ Minimize the use of EXCEPTION blocks
      ○ Logic that only checks for existence should read a single row and stop
      ○ Use dynamic SQL when different function inputs need different execution plans
      ○ For functionality PostgreSQL does not support, adjust the business requirements (BR)
      ○ For autonomous transactions, use dblink or FDW
  70. IV. PORTING EXAMPLE (remote sequence)
      • Oracle Procedure
      CREATE OR REPLACE PROCEDURE P_SEND_SMS( p_ord_no NUMBER )
      IS
      BEGIN
        INSERT INTO SMS_LOG (sms_snd_no, sms_snd_tel_no, sms_content, reg_dt)
        SELECT sq_sms_snd.nextval@DL_SMSDB, '0212349999', 'test message', current_timestamp
        FROM ONLINE_ORDER
        WHERE ORD_NO = p_ord_no;
        COMMIT;
        DBMS_OUTPUT.PUT_LINE('Insert completed successfully');
      EXCEPTION
        WHEN OTHERS THEN
          DBMS_OUTPUT.PUT_LINE ('Error ord_no: '||p_ord_no);
          DBMS_OUTPUT.PUT_LINE ('ERRMSG:'||SQLERRM);
      END;
      The number is fetched from a sequence on the remote DB.
  71. IV. PORTING EXAMPLE (remote sequence)
      • PostgreSQL Procedure
      CREATE OR REPLACE PROCEDURE P_SEND_SMS_dblink( p_ord_no NUMERIC)
      LANGUAGE PLPGSQL
      AS $$
      DECLARE
      BEGIN
        INSERT INTO SMS_LOG(sms_snd_no, sms_snd_tel_no, sms_content, reg_dt)
        SELECT (SELECT sms_snd_no
                FROM dblink('dl_smsdb', 'SELECT NEXTVAL(''public.SQ_SMS_NO'')') as (sms_snd_no numeric) )
               ,'0212349999', 'test message', current_timestamp
        FROM ONLINE_ORDER
        WHERE ORD_NO = p_ord_no;
        RAISE NOTICE 'Insert completed successfully.';
        --COMMIT; --a transaction cannot be ended inside a block with exception handlers
      EXCEPTION
        WHEN OTHERS THEN
          RAISE NOTICE 'Error ord_no: %', p_ord_no;
          RAISE NOTICE '% %', SQLERRM, SQLSTATE;
      END;
      $$
      CREATE SERVER dl_smsdb FOREIGN DATA WRAPPER dblink_fdw
        OPTIONS (hostaddr '127.0.0.1', dbname 'analdb2');
      CREATE USER MAPPING FOR scott SERVER dl_smsdb
        OPTIONS (user 'scott', password 'tiger');
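The commented-out COMMIT above reflects the PL/pgSQL rule that a transaction cannot be ended inside a block that has exception handlers. If the COMMIT from the Oracle original is still wanted, one possible arrangement is to move the handler into an inner block and commit from the outer one. A minimal sketch, assuming the same tables and dl_smsdb server as the slide; the name P_SEND_SMS_commit is hypothetical:

```sql
-- Hypothetical variant (name not from the slides).
CREATE OR REPLACE PROCEDURE P_SEND_SMS_commit(p_ord_no NUMERIC)
LANGUAGE plpgsql
AS $$
BEGIN
    -- Inner block: carries the exception handler, so no COMMIT in here.
    BEGIN
        INSERT INTO SMS_LOG (sms_snd_no, sms_snd_tel_no, sms_content, reg_dt)
        SELECT (SELECT t.sms_snd_no
                FROM dblink('dl_smsdb', 'SELECT NEXTVAL(''public.SQ_SMS_NO'')')
                     AS t(sms_snd_no numeric)),
               '0212349999', 'test message', current_timestamp
        FROM ONLINE_ORDER
        WHERE ORD_NO = p_ord_no;
    EXCEPTION
        WHEN OTHERS THEN
            RAISE NOTICE 'Error ord_no: %', p_ord_no;
            RAISE NOTICE '% %', SQLERRM, SQLSTATE;
            RETURN;  -- leave the procedure without reaching the COMMIT
    END;

    -- Outer block has no exception handler, so the transaction may be ended
    -- here when the procedure is invoked with CALL outside an explicit
    -- transaction block.
    COMMIT;
    RAISE NOTICE 'Insert completed successfully.';
END;
$$;
```

The design choice is simply to keep error handling and transaction control in different blocks, which is the separation PL/pgSQL requires.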
  72. IV. PORTING EXAMPLE (table function)
      • Oracle Function
      CREATE OR REPLACE TYPE EMP_TYPE AS OBJECT ( ENAME VARCHAR2(100), MAX_ORD_DATE DATE )
      CREATE OR REPLACE TYPE EMP_TABLE AS TABLE OF EMP_TYPE;
      CREATE OR REPLACE FUNCTION F_MAX_ORD_DATE(P_DATE IN DATE)
      RETURN EMP_TABLE
      IS
        V_RSLT EMP_TABLE;
      BEGIN
        SELECT /*+ push_pred(b) */ A.ENAME, B.MAX_ORD_DATE
        FROM EMPLOYEE A,
             (SELECT EMPNO, MAX(ORD_DATE) AS MAX_ORD_DATE
              FROM OFFLINE_ORDER
              GROUP BY EMPNO ) B
        WHERE A.EMPNO = B.EMPNO
          AND A.HIREDATE >= TO_DATE(P_DATE,'YYYYMMDD');
        RETURN V_RSLT;
      EXCEPTION
        WHEN NO_DATA_FOUND THEN  -- the select returned no rows
          DBMS_OUTPUT.PUT_LINE('No data found on '||P_DATE);
      END F_MAX_ORD_DATE;
      Takes a date as input and returns the employees hired on or after that date together with each employee's latest order-processing date.
  73. IV. PORTING EXAMPLE (table function)
      • PL/pgSQL function with only the source code ported
      CREATE OR REPLACE FUNCTION F_MAX_ORD_DATE (P_DATE IN VARCHAR, OUT ENAME VARCHAR, OUT MAX_ORD_DATE TIMESTAMP)
      RETURNS SETOF RECORD
      LANGUAGE PLPGSQL PARALLEL SAFE
      AS $$
      BEGIN
        RETURN QUERY
        SELECT A.ENAME, B.MAX_ORD_DATE
        FROM EMPLOYEE A,
             (SELECT EMPNO, MAX(ORD_DATE) AS MAX_ORD_DATE
              FROM OFFLINE_ORDER
              GROUP BY EMPNO) B
        WHERE A.EMPNO = B.EMPNO
          AND A.HIREDATE >= P_DATE::TIMESTAMP;
        IF NOT FOUND THEN  -- the select did not return at least one row
          RAISE NOTICE 'No data found on %', p_date;
        END IF;
        RETURN;
      END;
      $$
      Using output parameters creates an anonymous composite type.
  74. IV. PORTING EXAMPLE (table function)
      • Execution plan of the inner SQL statement
      Hash Join (cost=29861.00..29974.01 rows=4 width=15)
        Hash Cond: (offline_order.empno = a.empno)
        -> Finalize HashAggregate (cost=29625.90..29675.85 rows=4995 width=13)
             Group Key: offline_order.empno
             -> Gather (cost=28527.00..29575.95 rows=9990 width=13)
                  Workers Planned: 2
                  -> Partial HashAggregate (cost=27527.00..27576.95 rows=4995 width=13)
                       Group Key: offline_order.empno
                       -> Parallel Seq Scan on offline_order (cost=0.00..25443.67 rows=416667 width=13)
        -> Hash (cost=235.00..235.00 rows=8 width=12)
             -> Seq Scan on employee a (cost=0.00..235.00 rows=8 width=12)
                  Filter: (hiredate >= '2021-10-21 00:00:00'::timestamp without time zone)
      The PostgreSQL optimizer does not support Oracle's Join Predicate Push Down.
      OFFLINE_ORDER was read with a full scan.
  75. IV. PORTING EXAMPLE (table function)
      • PL/pgSQL function with the SQL statement rewritten
      CREATE OR REPLACE FUNCTION F_MAX_ORD_DATE (P_DATE IN VARCHAR, OUT ENAME VARCHAR, OUT MAX_ORD_DATE TIMESTAMP)
      RETURNS SETOF RECORD
      LANGUAGE PLPGSQL PARALLEL SAFE
      AS $$
      BEGIN
        RETURN QUERY
        SELECT A.ENAME, B.MAX_ORD_DATE
        FROM EMPLOYEE A,
             LATERAL (SELECT EMPNO, MAX(ORD_DATE) AS MAX_ORD_DATE
                      FROM OFFLINE_ORDER B
                      WHERE A.EMPNO = B.EMPNO
                      GROUP BY EMPNO) B
        WHERE 1=1
          AND A.HIREDATE >= P_DATE::TIMESTAMP;
        IF NOT FOUND THEN  -- the select did not return at least one row
          RAISE NOTICE 'No data found on %', p_date;
        END IF;
        RETURN;
      END;
      $$
      A lateral join is used so that the planner performs a nested loop join.
  76. IV. PORTING EXAMPLE (table function)
      • Execution plan of the inner SQL statement
      Nested Loop (cost=5.97..7856.83 rows=1960 width=15)
        -> Seq Scan on employee a (cost=0.00..285.00 rows=10 width=12)
             Filter: (hiredate >= (CURRENT_TIMESTAMP - '10 days'::interval))
        -> GroupAggregate (cost=5.97..753.26 rows=196 width=13)
             Group Key: b.empno
             -> Bitmap Heap Scan on offline_order b (cost=5.97..750.30 rows=200 width=13)
                  Recheck Cond: (a.empno = empno)
                  -> Bitmap Index Scan on offline_order_x02 (cost=0.00..5.92 rows=200 width=0)
                       Index Cond: (empno = a.empno)
      The same execution plan as in Oracle is produced.
      The empno values that satisfy the condition are extracted from the employee table, the matching rows are then looked up in offline_order, and the group by is performed on them.
      However, this plan is also very inefficient when many empno values in the employee table satisfy the condition.
  77. IV. PORTING EXAMPLE (table function)
      Better when the input date is far in the past:
      SELECT A.ENAME, B.MAX_ORD_DATE
      FROM EMPLOYEE A,
           (SELECT EMPNO, MAX(ORD_DATE) AS MAX_ORD_DATE
            FROM OFFLINE_ORDER
            GROUP BY EMPNO ) B
      WHERE A.EMPNO = B.EMPNO
        AND A.HIREDATE >= P_DATE::TIMESTAMP
      Better when the input date is recent:
      SELECT A.ENAME, B.MAX_ORD_DATE
      FROM EMPLOYEE A,
           LATERAL (SELECT EMPNO, MAX(ORD_DATE) AS MAX_ORD_DATE
                    FROM OFFLINE_ORDER B
                    WHERE A.EMPNO = B.EMPNO
                    GROUP BY EMPNO ) B
      WHERE 1=1
        AND A.HIREDATE >= P_DATE::TIMESTAMP
  78. IV. PORTING EXAMPLE (table function)
      -- Nested Loop Join branch
      SELECT A.ENAME, B.MAX_ORD_DATE
      FROM EMPLOYEE A,
           LATERAL (SELECT EMPNO, MAX(ORD_DATE) AS MAX_ORD_DATE
                    FROM OFFLINE_ORDER B
                    WHERE A.EMPNO = B.EMPNO
                    GROUP BY EMPNO ) B
      WHERE 1=1
        AND A.HIREDATE >= P_DATE::TIMESTAMP
        AND CURRENT_DATE - P_DATE::DATE <= 30
      UNION ALL
      -- Hash Join branch
      SELECT A.ENAME, B.MAX_ORD_DATE
      FROM EMPLOYEE A,
           (SELECT EMPNO, MAX(ORD_DATE) AS MAX_ORD_DATE
            FROM OFFLINE_ORDER
            GROUP BY EMPNO ) B
      WHERE A.EMPNO = B.EMPNO
        AND A.HIREDATE >= P_DATE::TIMESTAMP
        AND CURRENT_DATE - P_DATE::DATE > 30
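The extra condition in each UNION ALL branch references no table column, so PostgreSQL can evaluate it once per execution rather than once per row; in EXPLAIN output such a condition usually appears as a One-Time Filter on a Result node, and a branch whose filter is false does not run its child scan. A self-contained sketch for inspecting this, with generate_series standing in for the real tables and a fixed date standing in for P_DATE (both are assumptions for illustration):

```sql
EXPLAIN (COSTS OFF)
SELECT g
FROM generate_series(1, 3) AS g
WHERE CURRENT_DATE - DATE '2021-10-28' <= 30      -- "recent date" branch
UNION ALL
SELECT g
FROM generate_series(1, 3) AS g
WHERE CURRENT_DATE - DATE '2021-10-28' > 30;      -- "old date" branch
```

Running the same statement with EXPLAIN (ANALYZE) shows which branch was actually executed for the given date.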
  79. IV. PORTING EXAMPLE (table function)
      • PL/pgSQL function with the SQL statement rewritten
      CREATE OR REPLACE FUNCTION F_MAX_ORD_DATE2 (P_DATE IN VARCHAR)
      RETURNS TABLE (ENAME VARCHAR, MAX_ORD_DATE TIMESTAMP)
      LANGUAGE PLPGSQL PARALLEL SAFE
      AS $$
      BEGIN
        RETURN QUERY
        -- tuned SQL
        SELECT A.ENAME, B.MAX_ORD_DATE
        FROM EMPLOYEE A,
             LATERAL (SELECT EMPNO, MAX(ORD_DATE) AS MAX_ORD_DATE
                      FROM OFFLINE_ORDER B
                      WHERE A.EMPNO = B.EMPNO
                      GROUP BY EMPNO ) B
        WHERE 1=1
          AND A.HIREDATE >= P_DATE::TIMESTAMP
          AND CURRENT_DATE - P_DATE::DATE <= 30
        UNION ALL
        SELECT A.ENAME, B.MAX_ORD_DATE
        FROM EMPLOYEE A,
             (SELECT EMPNO, MAX(ORD_DATE) AS MAX_ORD_DATE
              FROM OFFLINE_ORDER
              GROUP BY EMPNO ) B
        WHERE A.EMPNO = B.EMPNO
          AND A.HIREDATE >= P_DATE::TIMESTAMP
          AND CURRENT_DATE - P_DATE::DATE > 30;
        IF NOT FOUND THEN  -- the select did not return at least one row
          RAISE NOTICE 'No data found on %', p_date;
        END IF;
        RETURN;
      END;
      $$
      Instead of output parameters, the RETURNS TABLE clause declares the data types of the output table.
  80. 80. Function Optimization Silver Rule
      Rule #1 : Never create a function.
      Rule #2 : Never forget rule #1.
      ○ Prefer an inlinable SQL function over a PL/pgSQL function
      ○ Move from Parallel Unsafe (default) -> Parallel Restricted -> Parallel Safe wherever the function allows it
      ○ Use STRICT when a NULL input is certain to produce a NULL output
      ○ Move from VOLATILE (default) -> STABLE -> IMMUTABLE wherever the function allows it
      ○ Minimize the use of EXCEPTION blocks
      ○ Logic that only checks for existence should read a single row
      ○ Use dynamic SQL when different input values need different execution plans
      ○ For functionality PostgreSQL does not support, customize the BR
      ○ Use dblink or FDW for autonomous transactions
      ○ Use UNION ALL to control the execution plan according to the input value
  81. 81. IV. PORTING EXAMPLE (rownum)
      • Oracle Function
      ROWNUM is a pseudo column that Oracle assigns to rows as they are output.
        CREATE OR REPLACE FUNCTION F_EXTRACT_VAL (
            P_CUST_NO   IN NUMBER
          , P_RTN_GUBUN IN VARCHAR2          /* value to extract: 'ORDDATE', 'EMPNO' */
          , P_RNO       IN NUMBER DEFAULT 1  /* selects the 1st or the 2nd row */
        ) RETURN VARCHAR2
        IS
          v_rtn_val VARCHAR2(20);
        BEGIN
          BEGIN
            v_rtn_val := NULL;
            SELECT CASE WHEN p_rtn_gubun = 'ORDDATE' THEN TO_CHAR(ORD_DATE,'YYYYMMDD')
                        WHEN p_rtn_gubun = 'EMPNO' THEN TO_CHAR(EMPNO)
                        ELSE '0' END
              INTO v_rtn_val
            FROM ( SELECT DECODE(MOD(rno,2),1,1,2) AS rno
                        , ORD_NO, CUST_NO, ORD_DATE, EMPNO
                   FROM (SELECT ROWNUM AS rno
                              , ORD_NO, CUST_NO, ORD_DATE, EMPNO
                         FROM OFFLINE_ORDER A
                         WHERE CUST_NO = p_cust_no
                         ORDER BY ORD_NO )
                   WHERE ROWNUM <= 2 )
            WHERE RNO = p_rno;
          EXCEPTION
            WHEN NO_DATA_FOUND THEN v_rtn_val := NULL;
            WHEN OTHERS THEN v_rtn_val := NULL;
          END;
          RETURN v_rtn_val;
        END;
  82. 82. IV. PORTING EXAMPLE (rownum)
      • PostgreSQL Function
        CREATE OR REPLACE FUNCTION F_EXTRACT_VAL (
            P_CUST_NO   IN NUMERIC
          , P_RTN_GUBUN IN VARCHAR
          , P_RNO       IN NUMERIC DEFAULT 1 )
        RETURNS VARCHAR
        LANGUAGE SQL PARALLEL SAFE
        AS $BODY$
          SELECT MAX(CASE WHEN p_rtn_gubun = 'ORDDATE' THEN TO_CHAR(ORD_DATE,'YYYYMMDD')
                          WHEN p_rtn_gubun = 'EMPNO' THEN EMPNO::VARCHAR
                          ELSE '0' END)
          FROM (SELECT RNO, ORD_DATE, EMPNO
                FROM ( SELECT ORD_DATE, EMPNO
                            , ROW_NUMBER() OVER (ORDER BY ORD_NO) AS RNO
                       FROM OFFLINE_ORDER A
                       WHERE CUST_NO = p_cust_no
                       ORDER BY ORD_NO
                       FETCH NEXT 2 ROWS ONLY ) A
               ) A
          WHERE RNO = p_rno
        $BODY$
  83. 83. IV. PORTING EXAMPLE (rownum)
      • row_number() over() vs. a sequence
      Oracle query to port:
        SELECT (SELECT MAX(EMPNO) FROM EMPLOYEE B) + ROWNUM AS EMP_NO
             , CUST_NM
             , CUST_NO AS ORG_CUST_NO
             , 'CLERK'
             , CURRENT_TIMESTAMP
        FROM CUSTOMER A;
      CUSTOMER : size 97 MB, rows : 1,000,000
      On a large set, row_number() over() is slow.
      To use a sequence instead, it must be reset before every run, e.g. ALTER SEQUENCE ... RESTART WITH 1 CACHE 100000 (in PostgreSQL, START WITH alone does not reset the current value).
  84. 84. IV. PORTING EXAMPLE (rownum)
      • Using row_number() over()
        SELECT (SELECT MAX(EMPNO) FROM EMPLOYEE B) + row_number() over() AS EMP_NO
             , CUST_NM
             , CUST_NO AS ORG_CUST_NO
             , 'CLERK'
             , CURRENT_TIMESTAMP
        FROM CUSTOMER;

        WindowAgg (actual time=0.065..434.033 rows=1000000 loops=1)
          Output: ($1 + (row_number() OVER (?))::numeric), a.cust_nm, a.cust_no, 'CLERK'::text, CURRENT_TIMESTAMP
          Buffers: shared hit=792 read=10575
          InitPlan 2 (returns $1)
            -> Result (actual time=0.026..0.027 rows=1 loops=1)
                 Output: $0
                 Buffers: shared hit=3
                 InitPlan 1 (returns $0)
                   -> Limit (actual time=0.024..0.025 rows=1 loops=1)
                        Output: b.empno
                        Buffers: shared hit=3
                        -> Index Only Scan Backward using employee_pk on portal.employee b (actual time=0.023..0.023 rows=1 loops=1)
                             Output: b.empno
                             Index Cond: (b.empno IS NOT NULL)
                             Heap Fetches: 0
                             Buffers: shared hit=3
          -> Seq Scan on portal.customer a (actual time=0.029..100.348 rows=1000000 loops=1)
               Output: a.cust_nm, a.cust_no
               Buffers: shared hit=789 read=10575
        Planning Time: 0.087 ms
        Execution Time: 462.552 ms
  85. 85. IV. PORTING EXAMPLE (rownum)
      • Using a temporary sequence
        CREATE SEQUENCE star_id_seq;
        -- RESTART (not START) is what resets the current value before each run
        ALTER SEQUENCE star_id_seq RESTART WITH 1 CACHE 1000000;
        SELECT (SELECT MAX(EMPNO) FROM EMPLOYEE B) + nextval('star_id_seq') AS EMP_NO
             , CUST_NM
             , CUST_NO AS ORG_CUST_NO
             , 'CLERK'
             , CURRENT_TIMESTAMP
        FROM CUSTOMER;

        Seq Scan on portal.customer a (actual time=1.616..346.494 rows=1000000 loops=1)
          Output: ($1 + (nextval('star_id_seq'::regclass))::numeric), a.cust_nm, a.cust_no, 'CLERK'::text, CURRENT_TIMESTAMP
          Buffers: shared hit=825 read=10543
          InitPlan 2 (returns $1)
            -> Result (actual time=0.012..0.013 rows=1 loops=1)
                 Output: $0
                 Buffers: shared hit=3
                 InitPlan 1 (returns $0)
                   -> Limit (actual time=0.010..0.011 rows=1 loops=1)
                        Output: b.empno
                        Buffers: shared hit=3
                        -> Index Only Scan Backward using employee_pk on portal.employee b (actual time=0.009..0.010 rows=1 loops=1)
                             Output: b.empno
                             Index Cond: (b.empno IS NOT NULL)
                             Heap Fetches: 0
                             Buffers: shared hit=3
        Planning Time: 0.069 ms
        Execution Time: 377.825 ms
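A hedged side note on resetting a numbering sequence between runs: in PostgreSQL, `ALTER SEQUENCE ... START WITH` only changes the recorded start value, while `RESTART` is what resets the counter. The sketch below is not from the slides; the sequence name `tmp_rownum_seq` is hypothetical and `generate_series` stands in for the CUSTOMER table.

```sql
-- Hypothetical temporary sequence; it is dropped automatically at session end.
CREATE TEMPORARY SEQUENCE tmp_rownum_seq CACHE 100000;

-- Number five generated rows (a stand-in for scanning CUSTOMER).
SELECT nextval('tmp_rownum_seq') AS rn, g AS val
FROM generate_series(1, 5) AS g;

-- Reset the counter before the next run; START WITH alone would not do this.
ALTER SEQUENCE tmp_rownum_seq RESTART WITH 1;

-- The numbering begins again from the restart value.
SELECT nextval('tmp_rownum_seq') AS rn;
```

Unlike `row_number() over(order by ...)`, numbers handed out by `nextval` follow the order in which rows happen to be produced, so this only suits cases where any unique numbering is acceptable.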
  86. 86. Function Optimization Silver Rule
      Rule #1 : Never create a function.
      Rule #2 : Never forget rule #1.
      ○ Prefer an inlinable SQL function over a PL/pgSQL function
      ○ Move from Parallel Unsafe (default) -> Parallel Restricted -> Parallel Safe wherever the function allows it
      ○ Use STRICT when a NULL input is certain to produce a NULL output
      ○ Move from VOLATILE (default) -> STABLE -> IMMUTABLE wherever the function allows it
      ○ Minimize the use of EXCEPTION blocks
      ○ Logic that only checks for existence should read a single row
      ○ Use dynamic SQL when different input values need different execution plans
      ○ For functionality PostgreSQL does not support, customize the BR
      ○ Use dblink or FDW for autonomous transactions
      ○ Use UNION ALL to control the execution plan according to the input value
      ○ Replace ROWNUM used for numbering output rows with row_number() over()
  87. 87. IV. PORTING EXAMPLE (dbms_errlog)
      • Oracle Trigger
      DML errors raised against the EMPLOYEE_HIST table are stored in the ERR_LOG table.
        exec DBMS_ERRLOG.CREATE_ERROR_LOG('EMPLOYEE_HIST','ERR_LOG');

        CREATE OR REPLACE TRIGGER TR_EMP_HIST
        BEFORE INSERT OR UPDATE OR DELETE ON EMPLOYEE
        FOR EACH ROW
        DECLARE
        BEGIN
          IF INSERTING THEN
            INSERT INTO EMPLOYEE_HIST(operation,empno,ename, sal, update_date)
            VALUES('INSERT',:new.EMPNO, :new.ENAME, :new.SAL, SYSDATE)
            LOG ERRORS INTO ERR_LOG('INSERT') REJECT LIMIT UNLIMITED;
          ELSIF UPDATING THEN
            INSERT INTO EMPLOYEE_HIST(operation,empno,ename, sal, update_date)
            VALUES('UPDATE',:new.EMPNO, :new.ENAME, :new.SAL, SYSDATE)
            LOG ERRORS INTO ERR_LOG('UPDATE') REJECT LIMIT UNLIMITED;
          ELSIF DELETING THEN
            INSERT INTO EMPLOYEE_HIST(operation,empno,ename, sal, update_date)
            VALUES('DELETE',:old.EMPNO, :old.ENAME, :old.SAL, SYSDATE)
            LOG ERRORS INTO ERR_LOG('DELETE') REJECT LIMIT UNLIMITED;
          END IF;
        END;
  88. 88. IV. PORTING EXAMPLE (dbms_errlog)
      • PostgreSQL Trigger
      An exception block can be used to capture and store the SQLSTATE and SQLERRM values.
        CREATE OR REPLACE FUNCTION f_emp_hist()
        RETURNS TRIGGER
        LANGUAGE PLPGSQL
        AS $$
        DECLARE
          v_err_no   NUMERIC;       --SQLSTATE (a 5-character code; codes containing letters, e.g. 42P01, will not cast to NUMERIC)
          v_err_desc VARCHAR(100);  --SQLERRM
        BEGIN
          -- when insert or update NEW value inserted, when delete OLD value inserted
          -- OLD is NULL on INSERT and NEW is NULL on DELETE, so every column uses COALESCE
          INSERT INTO EMPLOYEE_HIST
          VALUES(TG_OP, COALESCE(NEW.EMPNO,OLD.EMPNO), COALESCE(NEW.ENAME,OLD.ENAME)
               , COALESCE(NEW.SAL,OLD.SAL), NOW());
          RETURN COALESCE(NEW, OLD);  --for an AFTER trigger, use RETURN NULL;
        EXCEPTION
          WHEN OTHERS THEN
            v_err_no := SQLSTATE;
            v_err_desc := SQLERRM;
            INSERT INTO ERR_LOG(ORA_ERR_NUMBER,ORA_ERR_MSG,OPERATION,EMPNO,SAL,ERR_DATE)
            VALUES (v_err_no, v_err_desc, TG_OP, COALESCE(NEW.EMPNO,OLD.EMPNO), COALESCE(NEW.SAL,OLD.SAL), NOW());
            RETURN NULL;
        END;
        $$
  89. 89. IV. PORTING EXAMPLE (dbms_errlog)
      • PostgreSQL Trigger
        CREATE TRIGGER TR_EMP_HIST
        BEFORE INSERT OR UPDATE OR DELETE ON EMPLOYEE
        FOR EACH ROW EXECUTE FUNCTION F_EMP_HIST();
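As a minimal sketch of the mechanism the trigger function relies on, the anonymous block below forces an error and reads `SQLSTATE` and `SQLERRM` inside the handler. It is only an illustration and touches no tables; a real handler would insert these values into an error-log table as the slide does.

```sql
DO $$
BEGIN
    -- Force a runtime error so that control jumps to the EXCEPTION block.
    PERFORM 1 / 0;
EXCEPTION
    WHEN OTHERS THEN
        -- SQLSTATE and SQLERRM are only available inside the handler.
        RAISE NOTICE 'SQLSTATE=%, SQLERRM=%', SQLSTATE, SQLERRM;
END;
$$;
```

Entering a block that has an EXCEPTION clause sets up a subtransaction, which is why the Silver Rule recommends keeping such blocks to a minimum.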
  90. 90. Function Optimization Silver Rule
      Rule #1 : Never create a function.
      Rule #2 : Never forget rule #1.
      ○ Prefer an inlinable SQL function over a PL/pgSQL function
      ○ Move from Parallel Unsafe (default) -> Parallel Restricted -> Parallel Safe wherever the function allows it
      ○ Use STRICT when a NULL input is certain to produce a NULL output
      ○ Move from VOLATILE (default) -> STABLE -> IMMUTABLE wherever the function allows it
      ○ Minimize the use of EXCEPTION blocks
      ○ Logic that only checks for existence should read a single row
      ○ Use dynamic SQL when different input values need different execution plans
      ○ For functionality PostgreSQL does not support, customize the BR
      ○ Use dblink or FDW for autonomous transactions
      ○ Use UNION ALL to control the execution plan according to the input value
      ○ Replace ROWNUM used for numbering output rows with row_number() over()
      ○ Replace DBMS_ERRLOG with an exception block that handles DML errors
  91. 91. IV. PORTING EXAMPLE (redundant logic)
      • Oracle Procedure
        CREATE OR REPLACE PROCEDURE P_BATCH_JOB(
            p_ename VARCHAR2
           ,p_sal   NUMBER )
        IS
          v_empno    EMPLOYEE.EMPNO%TYPE;
          v_ename    EMPLOYEE.ENAME%TYPE;
          v_sal      EMPLOYEE.SAL%TYPE;
          v_yyyymmdd VARCHAR2(8);
        BEGIN
          SELECT TO_CHAR(SYSDATE-1,'YYYYMMDD')
            INTO v_yyyymmdd
          FROM DUAL;
          SELECT SQ_EMP_NO.NEXTVAL  --assigned to a variable to pin the sequence number
            INTO v_empno
          FROM DUAL;
          INSERT INTO EMPLOYEE (EMPNO, ENAME, SAL)
          SELECT v_empno, p_ename, p_sal FROM DUAL;
          INSERT INTO EMPLOYEE_HIST (OPERATION, EMPNO, ENAME, SAL)
          SELECT 'INSERT', v_empno, v_ename, v_sal FROM DUAL;
          DBMS_OUTPUT.PUT_LINE('Insert completed successfully at '||v_yyyymmdd);
        END;
  92. 92. IV. PORTING EXAMPLE (redundant logic)
      • PL/pgSQL
        CREATE OR REPLACE PROCEDURE P_BATCH_JOB(
            p_ename VARCHAR
           ,p_sal   NUMERIC)
        LANGUAGE PLPGSQL
        AS $$
        DECLARE
          v_empno    EMPLOYEE.EMPNO%TYPE;
          v_ename    EMPLOYEE.ENAME%TYPE;
          v_sal      EMPLOYEE.SAL%TYPE;
          v_yyyymmdd VARCHAR(8);
        BEGIN
          v_yyyymmdd := TO_CHAR(NOW()-interval '1 day','YYYYMMDD');  --SELECT ~ INTO removed
          --You get a fixed sequence number via returning clause.
          INSERT INTO EMPLOYEE (EMPNO, ENAME, SAL)
          SELECT NEXTVAL('sq_emp_no'), p_ename, p_sal
          RETURNING EMPNO, ENAME, SAL INTO v_empno, v_ename, v_sal;
          INSERT INTO EMPLOYEE_HIST (OPERATION, EMPNO, ENAME, SAL)
          SELECT 'INSERT', v_empno, v_ename, v_sal;
          RAISE NOTICE 'Insert completed successfully at %.', v_yyyymmdd;
        END;
        $$
  93. 93. IV. PORTING EXAMPLE (redundant logic)
      • Performance comparison
      Direct assignment: 628 msec
        DO $BODY$
        DECLARE
          v_yyyymmdd VARCHAR(8);
        BEGIN
          FOR i IN 1 .. 1000000 LOOP
            v_yyyymmdd := TO_CHAR(NOW()-interval '1 day','YYYYMMDD');
          END LOOP;
        END;
        $BODY$;
      SELECT ~ INTO: 1861 msec, about 3 times slower
        DO $BODY$
        DECLARE
          v_yyyymmdd VARCHAR(8);
        BEGIN
          FOR i IN 1 .. 1000000 LOOP
            SELECT TO_CHAR(NOW()-interval '1 day','YYYYMMDD') INTO v_yyyymmdd;
          END LOOP;
        END;
        $BODY$;
  94. 94. IV. PORTING EXAMPLE (built-in function)
      • Oracle Function
      PostgreSQL counterpart of START WITH ~ CONNECT BY ~ : the WITH RECURSIVE clause
        CREATE OR REPLACE FUNCTION F_CONNECT_BY( P_GROUP_CD VARCHAR2 )
        RETURN VARCHAR2
        IS
          v_string VARCHAR2(2000);
        BEGIN
          SELECT SUBSTR(SYS_CONNECT_BY_PATH(CD_NM,','),2)
            INTO v_string
          FROM ( SELECT CD_NM
                      , ROW_NUMBER() OVER (ORDER BY CD_NM) RN
                      , COUNT(*) OVER () AS CNT
                 FROM COM_CODE
                 WHERE GROUP_CD = p_group_cd )
          WHERE RN = CNT                 --check condition
          START WITH RN = 1              --access condition
          CONNECT BY RN = PRIOR RN+1;    --join condition
          RETURN v_string;
        END;
      Output:
        DAEGU,ETC,INCHEON,JAEJU,PUSAN,SEOUL,ULEUNG
  95. 95. IV. PORTING EXAMPLE (built-in function)
      • PostgreSQL Function
        CREATE OR REPLACE FUNCTION F_CONNECT_BY( P_GROUP_CD VARCHAR )
        RETURNS VARCHAR
        LANGUAGE SQL PARALLEL SAFE STRICT
        AS $$
          SELECT STRING_AGG(CD_NM, ',' ORDER BY CD_NM)
          FROM COM_CODE
          WHERE GROUP_CD = p_group_cd;
        $$
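For comparison, a literal translation of the CONNECT BY logic into WITH RECURSIVE might look like the sketch below, which makes it easier to see how much the built-in STRING_AGG saves. This is an illustration only: the temporary table `com_code_demo`, its sample rows, and the group code 'SIDO' are assumptions, not part of the original schema.

```sql
-- Hypothetical stand-in for COM_CODE with a few sample rows.
CREATE TEMPORARY TABLE com_code_demo (group_cd varchar(10), cd_nm varchar(20));
INSERT INTO com_code_demo VALUES
    ('SIDO', 'SEOUL'), ('SIDO', 'PUSAN'), ('SIDO', 'DAEGU');

WITH RECURSIVE ordered AS (
    -- Same inline view as the Oracle version: rank the names and count them.
    SELECT cd_nm,
           ROW_NUMBER() OVER (ORDER BY cd_nm) AS rn,
           COUNT(*) OVER ()                   AS cnt
    FROM com_code_demo
    WHERE group_cd = 'SIDO'
), path AS (
    -- START WITH RN = 1
    SELECT rn, cnt, cd_nm::text AS str
    FROM ordered
    WHERE rn = 1
    UNION ALL
    -- CONNECT BY RN = PRIOR RN + 1
    SELECT o.rn, o.cnt, p.str || ',' || o.cd_nm
    FROM path p
    JOIN ordered o ON o.rn = p.rn + 1
)
-- WHERE RN = CNT keeps only the fully concatenated path.
SELECT str FROM path WHERE rn = cnt;

-- The built-in aggregate produces the same list in one step.
SELECT STRING_AGG(cd_nm, ',' ORDER BY cd_nm)
FROM com_code_demo
WHERE group_cd = 'SIDO';
```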
  96. 96. IV. PORTING EXAMPLE (merge into ~)
      • Oracle Procedure
      A complex CURSOR and LOOP are used for work that a single SQL statement can do.
        CREATE OR REPLACE PROCEDURE P_CONV_ORD ( v_cust_no IN ONLINE_ORDER.CUST_NO%TYPE )
        IS
          v_ord_no        ONLINE_ORDER.ORD_NO%TYPE;
          v_ord_date      ONLINE_ORDER.ORD_DATE%TYPE;
          v_ord_dt        ONLINE_ORDER.ORD_DT%TYPE;
          v_ord_status_cd ONLINE_ORDER.ORD_STATUS_CD%TYPE;
          CURSOR c1 IS
            SELECT ORD_NO, CUST_NO, ORD_DATE, ORD_DT, ORD_STATUS_CD
            FROM ONLINE_ORDER
            WHERE CUST_NO = v_cust_no;
        BEGIN
          p_rtn_cnt := 0;
          OPEN c1;
          LOOP
            FETCH c1 INTO v_ord_no, v_cust_no, v_ord_date, v_ord_dt, v_ord_status_cd;
            EXIT WHEN c1%NOTFOUND;
            p_rtn_cnt := p_rtn_cnt + 1;
            IF v_ord_status_cd = '2' THEN  --very inefficient: the filter is applied only after each row is fetched
              BEGIN
                INSERT INTO OFFLINE_ORDER
                VALUES (v_ord_no, v_cust_no, v_ord_date, v_ord_dt, v_ord_status_cd,-1);
              EXCEPTION
                WHEN dup_val_on_index THEN
                  UPDATE OFFLINE_ORDER
                  SET ORD_STATUS_CD = v_ord_status_cd
                  WHERE ORD_NO = v_ord_no;
              END;
            END IF;
          END LOOP;
          CLOSE c1;
          COMMIT;
        END;
  97. 97. IV. PORTING EXAMPLE (merge into ~)
      • PostgreSQL Procedure
      In PostgreSQL, a WITH clause can implement what Oracle's MERGE INTO ~ statement does. It performs better than the CURSOR and LOOP version.
        CREATE OR REPLACE PROCEDURE P_CONV_ORD( v_cust_no IN NUMERIC )
        LANGUAGE PLPGSQL
        AS $BODY$
        DECLARE
          v_ord_no        ONLINE_ORDER.ORD_NO%TYPE;
          v_ord_date      ONLINE_ORDER.ORD_DATE%TYPE;
          v_ord_dt        ONLINE_ORDER.ORD_DT%TYPE;
          v_ord_status_cd ONLINE_ORDER.ORD_STATUS_CD%TYPE;
        BEGIN
          WITH BASE AS (
            SELECT ORD_NO, CUST_NO, ORD_DATE, ORD_DT, ORD_STATUS_CD
            FROM ONLINE_ORDER
            WHERE CUST_NO = v_cust_no
              AND ORD_STATUS_CD = '2')
          ,UPSERT AS (
            UPDATE OFFLINE_ORDER A
            SET ORD_STATUS_CD = B.ord_status_cd
            FROM BASE B
            WHERE A.ORD_NO = B.ORD_NO
            RETURNING A.ORD_NO)
          INSERT INTO OFFLINE_ORDER
            (ORD_NO, CUST_NO, ORD_DATE, ORD_DT, ORD_STATUS_CD, EMPNO, COMMENT)
          SELECT B.ORD_NO, B.CUST_NO, B.ORD_DATE, B.ORD_DT, B.ORD_STATUS_CD, -1, NULL
          FROM BASE B
          WHERE NOT EXISTS (SELECT 1 FROM UPSERT C WHERE B.ORD_NO = C.ORD_NO);
          COMMIT;  --must be removed in environments that run with autocommit off
        END;
        $BODY$
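An alternative worth considering, not shown in the slides, is PostgreSQL's `INSERT ... ON CONFLICT DO UPDATE`. The sketch below assumes a unique or primary key on the order number, which the slides do not state; the temporary tables `online_order_demo` and `offline_order_demo` and their trimmed-down columns are hypothetical stand-ins.

```sql
-- Hypothetical, simplified stand-ins for ONLINE_ORDER and OFFLINE_ORDER.
CREATE TEMPORARY TABLE online_order_demo (
    ord_no numeric PRIMARY KEY, cust_no numeric, ord_status_cd varchar(1));
CREATE TEMPORARY TABLE offline_order_demo (
    ord_no numeric PRIMARY KEY, cust_no numeric, ord_status_cd varchar(1), empno numeric);

INSERT INTO online_order_demo VALUES (1, 100, '2'), (2, 100, '2'), (3, 100, '1');
INSERT INTO offline_order_demo VALUES (1, 100, '1', 7);

-- Insert new orders; where ord_no already exists, update the status instead.
INSERT INTO offline_order_demo (ord_no, cust_no, ord_status_cd, empno)
SELECT ord_no, cust_no, ord_status_cd, -1
FROM online_order_demo
WHERE cust_no = 100
  AND ord_status_cd = '2'
ON CONFLICT (ord_no) DO UPDATE
    SET ord_status_cd = EXCLUDED.ord_status_cd;

-- Order 1 is updated to status '2' (empno stays 7); order 2 is inserted with empno -1.
SELECT * FROM offline_order_demo ORDER BY ord_no;
```

The ON CONFLICT form requires the unique constraint, whereas the WITH-based form in the slide works without one, so which fits better depends on the target table.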
  98. 98. IV. PORTING EXAMPLE (check for existence)
      • Oracle Function
        CREATE OR REPLACE FUNCTION F_GET_ORD_STATUS (
            P_CUST_NO IN ONLINE_ORDER.CUST_NO%TYPE
          , P_ORD_DT  IN ONLINE_ORDER.ORD_DT%TYPE )
        RETURN VARCHAR2
        IS
          v_max_ord_no NUMBER;
          v_cnt        NUMBER(4);
          v_rtn        VARCHAR2(3);
        BEGIN
          BEGIN
            SELECT COUNT(*)  --check whether any order exists
              INTO v_cnt
            FROM ONLINE_ORDER
            WHERE CUST_NO = p_cust_no
              AND ORD_DT < p_ord_dt
              AND ROWNUM = 1;  --inefficient use of rownum
            IF v_cnt >= 1 THEN  --an order exists
              SELECT MAX(ORD_NO) INTO v_max_ord_no
              FROM ONLINE_ORDER
              WHERE CUST_NO = p_cust_no
              FOR UPDATE;  --acquire the lock
              UPDATE ONLINE_ORDER
              SET ORD_STATUS_CD = '4'  --update to '4'
              WHERE ORD_NO = v_max_ord_no;
              v_rtn := 'OK';
            ELSIF v_cnt = 0 THEN  --no order exists
              INSERT INTO ONLINE_ORDER( ORD_NO, CUST_NO, ORD_DATE, ORD_DT, ORD_STATUS_CD)
              VALUES ((SELECT MAX(ORD_NO)+1 FROM ONLINE_ORDER)
                     ,P_CUST_NO, SYSDATE
                     , TO_CHAR(SYSDATE,'YYYYMMDD'), '1');
              v_rtn := 'OK';
            ELSE
              v_rtn := 'ERR';
            END IF;
            IF v_rtn <> 'ERR' THEN  --the update/insert succeeded
              SELECT ORD_STATUS_CD  --return the order status code
                INTO v_rtn
              FROM ONLINE_ORDER
              WHERE CUST_NO = p_cust_no
                AND ORD_NO = (SELECT MAX(ORD_NO)
                              FROM ONLINE_ORDER
                              WHERE CUST_NO = p_cust_no);
            END IF;
          EXCEPTION
            WHEN OTHERS THEN
              ROLLBACK;  --PostgreSQL doesn't support this.
              v_rtn := 'ERR';
          END;
          COMMIT;
          RETURN v_rtn;
        END;
  99. 99. IV. PORTING EXAMPLE (check for existence)
      • PL/pgSQL function
      The four SQL statements in the Oracle function are implemented as a single WITH statement.
        CREATE OR REPLACE FUNCTION F_GET_ORD_STATUS (
            P_CUST_NO IN NUMERIC
           ,P_ORD_DT  IN VARCHAR )
        RETURNS VARCHAR
        language plpgsql VOLATILE parallel unsafe
        AS $$
        DECLARE
          v_rtn VARCHAR(3);
        BEGIN
          WITH BASE AS (  --current maximum ORD_NO
            SELECT MAX(ORD_NO) AS M_ORD_NO
            FROM ONLINE_ORDER )
          ,UPSERT AS (
            UPDATE ONLINE_ORDER B
            SET ORD_STATUS_CD = '4'  --update to '4'
            FROM (SELECT MAX(ORD_NO) AS MAX_ORD_NO
                  FROM ONLINE_ORDER
                  WHERE CUST_NO = P_CUST_NO
                    AND ORD_DT < P_ORD_DT ) A
            WHERE A.MAX_ORD_NO = B.ORD_NO
            RETURNING B.ORD_STATUS_CD )
          INSERT INTO ONLINE_ORDER  --insert a new row when no order exists
            (ORD_NO, CUST_NO, ORD_DATE, ORD_DT, ORD_STATUS_CD)
          SELECT A.M_ORD_NO+1  --current maximum ORD_NO + 1
                ,P_CUST_NO, CURRENT_TIMESTAMP
                , TO_CHAR(CURRENT_DATE,'YYYYMMDD'),'1'
          FROM BASE A
          WHERE NOT EXISTS (SELECT 1 FROM UPSERT C )
          RETURNING ORD_STATUS_CD INTO v_rtn;
          IF v_rtn IS NULL THEN  --the UPDATE succeeded, so the INSERT returned nothing
            SELECT ORD_STATUS_CD INTO v_rtn
            FROM (SELECT ORD_STATUS_CD, ORD_NO
                       , MAX(ORD_NO) OVER () AS MAX_NO
                  FROM ONLINE_ORDER
                  WHERE CUST_NO = P_CUST_NO ) A
            WHERE ORD_NO = MAX_NO;
          END IF;
          RETURN v_rtn;  --ord_status_cd value that was inserted/updated
        EXCEPTION
          WHEN OTHERS THEN
            v_rtn := 'ERR';
            RETURN v_rtn;
        END;
        $$
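When a function really does need a separate existence check, the Silver Rule of reading only one row can be followed with `EXISTS`, which stops at the first matching row instead of counting. A minimal sketch, assuming a hypothetical temporary table `order_exists_demo` in place of ONLINE_ORDER:

```sql
-- Hypothetical stand-in for ONLINE_ORDER.
CREATE TEMPORARY TABLE order_exists_demo (
    ord_no numeric, cust_no numeric, ord_dt varchar(8));
INSERT INTO order_exists_demo VALUES
    (1, 100, '20210101'), (2, 100, '20210301');

-- Plain SQL: EXISTS returns a boolean and needs only the first matching row.
SELECT EXISTS (SELECT 1
               FROM order_exists_demo
               WHERE cust_no = 100
                 AND ord_dt < '20210201') AS has_order;

-- The same test used as a PL/pgSQL condition, replacing COUNT(*) ... ROWNUM = 1.
DO $$
BEGIN
    IF EXISTS (SELECT 1
               FROM order_exists_demo
               WHERE cust_no = 100
                 AND ord_dt < '20210201') THEN
        RAISE NOTICE 'order found';
    ELSE
        RAISE NOTICE 'no order';
    END IF;
END;
$$;
```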
  100. 100. IV. PORTING EXAMPLE (cursor, loop)
      • Oracle Procedure
        CREATE OR REPLACE PROCEDURE P_UPDATE_COMMENT(p_ord_dt VARCHAR2 )
        IS
          v_comment ONLINE_ORDER.COMMENT%TYPE;
          CURSOR C1 IS
            SELECT * FROM ONLINE_ORDER
            WHERE ORD_DT = p_ord_dt
            FOR UPDATE;  --prevent other transactions from changing the data.
        BEGIN
          FOR rec IN C1 LOOP
            IF rec.ORD_STATUS_CD = '1' THEN v_comment := 'Received'; END IF;
            IF rec.ORD_STATUS_CD = '2' THEN v_comment := 'Confirmed'; END IF;
            IF rec.ORD_STATUS_CD = '3' THEN v_comment := 'Cancelled'; END IF;
            IF rec.ORD_STATUS_CD = '4' THEN v_comment := 'Ordered'; END IF;
            -- if the cursor holds 100 rows, the UPDATE runs 100 times
            UPDATE ONLINE_ORDER
            SET COMMENT = v_comment
            WHERE CURRENT OF C1;
          END LOOP;
          COMMIT;
        END;
  101. 101. IV. PORTING EXAMPLE (cursor, loop)
      • PostgreSQL Procedure
        CREATE OR REPLACE PROCEDURE P_UPDATE_COMMENT ( p_ord_dt VARCHAR )
        LANGUAGE SQL
        AS $$
          --the UPDATE that used to run once per row now runs once in total
          UPDATE ONLINE_ORDER
          SET COMMENT = CASE WHEN ORD_STATUS_CD = '1' THEN 'Received'
                             WHEN ORD_STATUS_CD = '2' THEN 'Confirmed'
                             WHEN ORD_STATUS_CD = '3' THEN 'Cancelled'
                             WHEN ORD_STATUS_CD = '4' THEN 'Ordered'
                        END
          WHERE ORD_DT = p_ord_dt;
          --COMMIT;  --the WAS runs with autocommit off, so COMMIT cannot be used here
        $$
  102. 102. IV. PORTING EXAMPLE (cursor, loop)
      • Oracle Procedure
      PostgreSQL 13 procedures do not support OUT mode parameters.
      The loop can be replaced by a join of OFFLINE_ORDER and ORD_ITEM.
        CREATE OR REPLACE PROCEDURE P_SUM_AMT(
            p_empno     IN  NUMBER
           ,v_amt_total OUT NUMBER)
        IS
          v_amount NUMBER;
          v_ord_no NUMBER;
          CURSOR C1 IS
            SELECT ORD_NO
            FROM OFFLINE_ORDER
            WHERE EMPNO = p_empno;
        BEGIN
          v_amt_total := 0;
          v_amount := 0;
          OPEN C1;
          LOOP
            FETCH C1 INTO v_ord_no;
            EXIT WHEN C1%NOTFOUND;
            SELECT NVL(UNIT_PRICE*QUANTITY,0)
              INTO v_amount
            FROM ORD_ITEM
            WHERE ORD_NO = v_ord_no;
            v_amt_total := v_amt_total + v_amount;
          END LOOP;
          CLOSE C1;
        END;
  103. 103. IV. PORTING EXAMPLE (cursor, loop)
      • PostgreSQL function (LANGUAGE SQL)
        CREATE OR REPLACE FUNCTION P_SUM_AMT (p_empno numeric)
        RETURNS NUMERIC
        LANGUAGE SQL PARALLEL SAFE
        AS $BODY$
          SELECT COALESCE(SUM(AMOUNT),0)
          FROM OFFLINE_ORDER A
             , LATERAL (SELECT UNIT_PRICE*QUANTITY AS AMOUNT
                        FROM ORD_ITEM
                        WHERE ORD_NO = A.ORD_NO) B
          WHERE 1=1
            AND A.EMPNO = p_empno;
        $BODY$
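The same total can also be written as an ordinary inner join, which is the replacement hinted at on the Oracle slide. The sketch below is an illustration under assumptions: the temporary tables `offline_order_sum_demo` and `ord_item_sum_demo`, their columns, and the sample rows are hypothetical.

```sql
-- Hypothetical stand-ins for OFFLINE_ORDER and ORD_ITEM.
CREATE TEMPORARY TABLE offline_order_sum_demo (ord_no numeric, empno numeric);
CREATE TEMPORARY TABLE ord_item_sum_demo (
    ord_no numeric, unit_price numeric, quantity numeric);

INSERT INTO offline_order_sum_demo VALUES (1, 7), (2, 7), (3, 8);
INSERT INTO ord_item_sum_demo VALUES (1, 10, 2), (2, 5, 3), (3, 100, 1);

-- One join plus one aggregate replaces the cursor loop.
-- For empno 7 this sums 10*2 + 5*3 = 35.
SELECT COALESCE(SUM(i.unit_price * i.quantity), 0) AS amt_total
FROM offline_order_sum_demo o
JOIN ord_item_sum_demo i ON i.ord_no = o.ord_no
WHERE o.empno = 7;
```

Whether the planner chooses a nested loop or a hash join here depends on the data, so it is worth comparing the plan with the LATERAL form using EXPLAIN.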
  104. 104. Function Optimization Silver Rule
      Rule #1 : Never create a function.
      Rule #2 : Never forget rule #1.
      ○ Prefer an inlinable SQL function over a PL/pgSQL function
      ○ Move from Parallel Unsafe (default) -> Parallel Restricted -> Parallel Safe wherever the function allows it
      ○ Use STRICT when a NULL input is certain to produce a NULL output
      ○ Move from VOLATILE (default) -> STABLE -> IMMUTABLE wherever the function allows it
      ○ Minimize the use of EXCEPTION blocks
      ○ Logic that only checks for existence should read a single row
      ○ Use dynamic SQL when different input values need different execution plans
      ○ For functionality PostgreSQL does not support, customize the BR
      ○ Use dblink or FDW for autonomous transactions
      ○ Use UNION ALL to control the execution plan according to the input value
      ○ Replace ROWNUM used for numbering output rows with row_number() over()
      ○ Replace DBMS_ERRLOG with an exception block that handles DML errors
      ○ Replace complex logic (cursor, loop, etc.) with SQL
  105. 105. References
      • PostgreSQL 9.6 Performance Story, Siyeon Kim
      • 프로젝트에서 강한 오라클 PL/SQL 프로그래밍, 비팬북스
      • PostgreSQL Query Optimization, APRESS
      • The Lost Art of plpgsql, https://www.youtube.com/watch?v=uAiofEikCSM
      • https://www.postgresql.org/docs/13/plpgsql-porting.html
      • https://scidb.tistory.com/entry/%EC%98%A4%EB%9D%BC%ED%81%B4%EC%97%90%EC%84%9C-isnumber-isdate-%ED%95%A8%EC%88%98-%EC%82%AC%EC%9A%A9%ED%95%98%EA%B8%B0
  106. 106. Addendum
      • UDF performance check
        set track_functions = 'all';
        SELECT * FROM pg_stat_user_functions;
      • Check a function's DDL
        SELECT pg_get_functiondef(oid) FROM pg_proc WHERE proname = 'f_sido_nm';
      • Check a trigger function's DDL
        SELECT pg_get_functiondef(tgfoid) FROM pg_catalog.pg_trigger WHERE tgname = 'tr_emp_hist';
      • List user-defined triggers
        SELECT * FROM pg_trigger WHERE tgisinternal=false;
  107. 107. All rights not reserved. Any part of this material may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without any permission from me. I will not be held liable for any damage caused or alleged to have been caused directly or indirectly by applying the skills described in this presentation. I would appreciate it if you could make your database server run faster with the technique in this material. THANK YOU
