MySQL
User Conference and Expo 2010



  Optimizing
Stored Routines

  1. MySQL User Conference and Expo 2010: Optimizing Stored Routines
     Blog: http://rpbouman.blogspot.com/
     twitter: @rolandbouman
  2. Welcome, thanks for attending!
     ● Roland Bouman; Leiden, Netherlands
     ● Ex MySQL AB, Sun Microsystems
     ● Web and BI Developer
     ● Co-author of “Pentaho Solutions”
     ● Blog: http://rpbouman.blogspot.com/
     ● Twitter: @rolandbouman
  3. Program
     ● Stored routine issues
     ● Variables and assignments
     ● Flow of control
     ● Cursor handling
     ● Summary
  5. Stored Routines: Definition
     ● Stored routines:
         – stored functions (SQL functions)
         – stored procedures
         – triggers
         – events
  6. Performance Issues
     ● SQL inside stored routines is still SQL, but there is:
         – invocation overhead
         – suboptimal computational performance
     ● Benchmarking method
         – BENCHMARK(1000000, expression)
         – Appropriate for computation speed
         – Evaluates the expression 1 million times
     ● MySQL 5.1.36, Windows
  7. Invocation overhead
     ● Plain expression (10 million iterations)
         mysql> SELECT BENCHMARK(10000000, 1);
         +------------------------+
         | benchmark(10000000, 1) |
         +------------------------+
         |                      0 |
         +------------------------+
         1 row in set (0.19 sec)
     ● Equivalent function (10 million iterations)
         mysql> CREATE FUNCTION f_one() RETURNS INT RETURN 1;
         mysql> SELECT BENCHMARK(10000000, f_one());
         +------------------------------+
         | benchmark(10000000, f_one()) |
         +------------------------------+
         |                            0 |
         +------------------------------+
         1 row in set (24.59 sec)
     ● Slowdown: about 130 times (24.59 / 0.19)
  8. Computation inefficiency
     ● Plain addition
         mysql> SELECT BENCHMARK(10000000, 1+1);
         +--------------------------+
         | benchmark(10000000, 1+1) |
         +--------------------------+
         |                        0 |
         +--------------------------+
         1 row in set (0.30 sec)
     ● Equivalent function
         mysql> CREATE FUNCTION f_one_plus_one() RETURNS INT RETURN 1+1;
         mysql> SELECT BENCHMARK(10000000, f_one_plus_one());
         +---------------------------------------+
         | benchmark(10000000, f_one_plus_one()) |
         +---------------------------------------+
         |                                     0 |
         +---------------------------------------+
         1 row in set (28.73 sec)
  9. Computation inefficiency
     ● Raw measurements
         plain expression   function           plain (sec)   function (sec)   ratio
         1                  f_one()            0.19          24.59            0.0077
         1+1                f_one_plus_one()   0.29          28.73            0.0101
     ● Correction for invocation overhead
         plain expression   function           plain (sec)   function (sec)   ratio
         1                  f_one()            0.00           0.00
         1+1                f_one_plus_one()   0.10           4.14            0.0242
     ● Slowdown about 40 times
         – after correction for invocation overhead
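The slowdown factors on these slides can be re-derived from the raw timings. A minimal sketch, using only the numbers reported above; the column aliases are illustrative and not part of the original slides:

```sql
-- Re-derive the slowdown factors from the reported timings (in seconds).
SELECT 24.59 / 0.19                    AS invocation_slowdown,   -- f_one() versus plain 1
       (28.73 - 24.59) / (0.29 - 0.19) AS computation_slowdown;  -- 1+1 after subtracting invocation overhead
```

The first value comes out at roughly 129 and the second at roughly 41, which matches the "130 times" and "about 40 times" quoted on the slides.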
  11. Types of Variables
      ● User-defined variables
          – session scope
          – runtime type
          SET @user_defined_variable := 'some value';
      ● Local variables
          – block scope
          – declared type
          BEGIN
              DECLARE v_local_variable VARCHAR(50);
              SET v_local_variable := 'some value';
              ...
          END;
  12. User-defined variable Benchmark
      ● Baseline
          CREATE FUNCTION f_variable_baseline()
          RETURNS INT
          BEGIN
              DECLARE a INT DEFAULT 1;
              RETURN a;
          END;
      ● Local variable
          CREATE FUNCTION f_local_variable()
          RETURNS INT
          BEGIN
              DECLARE a INT DEFAULT 1;
              SET a := 1;
              RETURN a;
          END;
      ● User-defined variable
          CREATE FUNCTION f_user_defined_variable()
          RETURNS INT
          BEGIN
              DECLARE a INT DEFAULT 1;
              SET @a := 1;
              RETURN a;
          END;
  13. User-defined variables
      ● User-defined variables about 5x slower
      (Bar chart: f_variable_baseline, f_local_variable, f_user_defined_variable)
                           baseline   local variable   user-defined variable
          measured         4.6        5.32             7.89
          minus baseline   0.0        0.72             3.29
          0.72 / 3.29 = 0.22
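A comparison like this one can be reproduced by timing each function with BENCHMARK(), as on the earlier slides. A sketch, assuming the three functions exist under the names shown in the chart; the iteration count of 1 million is taken from the benchmarking-method slide and may not be the count used for this particular chart:

```sql
-- Time each variant; the mysql client reports elapsed seconds per statement.
SELECT BENCHMARK(1000000, f_variable_baseline());
SELECT BENCHMARK(1000000, f_local_variable());
SELECT BENCHMARK(1000000, f_user_defined_variable());
```

Subtracting the baseline time from the other two isolates the cost of the assignment itself, which is how the "minus baseline" figures are obtained.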
  14. Assignments
      ● SET statement
          SET v_variable := 'some value';
      ● SELECT statement
          SELECT 'some value' INTO v_variable;
      ● DEFAULT clause
          BEGIN
              DECLARE v_local_variable VARCHAR(50) DEFAULT 'some value';
              ...
          END;
  15. Assignment Benchmarks
      ● SELECT INTO about 60% slower than SET
      ● SET about 40% slower than DEFAULT
      (Bar chart: default clause, set statement, select into statement)
                           baseline   DEFAULT   SET     SELECT
          measured         8.2        15.06     18.25   32.08
          minus baseline   0          6.86      10.05   23.88
          10.05 / 23.88 = 42.09%
           6.86 / 10.05 = 68.26%
  16. More about SELECT INTO
      ● Assigning from a SELECT...INTO statement:
          – ok if you're assigning from a real query
              SELECT COUNT(*)
              ,      user_id
              INTO   v_count
              ,      v_user_id
              FROM   t_users
          – not so much if you're assigning literals
              SELECT 1
              ,      'some value'
              INTO   v_number
              ,      v_string
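The guidance above can be put side by side in one routine. A minimal sketch, assuming the Sakila sample database is installed; the procedure name p_assign_demo is hypothetical and only serves to illustrate the three assignment forms:

```sql
DELIMITER //

-- Hypothetical demo procedure: one assignment style per use case.
CREATE PROCEDURE p_assign_demo()
BEGIN
    -- Literals: initialize in the declaration with DEFAULT.
    DECLARE v_number INT DEFAULT 1;
    DECLARE v_string VARCHAR(50) DEFAULT 'some value';
    DECLARE v_count  INT;

    -- Real query: SELECT ... INTO is appropriate here.
    SELECT COUNT(*) INTO v_count FROM sakila.rental;

    -- Later change of a simple value: SET rather than SELECT ... INTO.
    SET v_number := v_number + 1;

    SELECT v_number, v_string, v_count;
END//

DELIMITER ;
```

Calling `CALL p_assign_demo();` returns the three values in a single row.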
  17. Sample function: Sakila rental count
      CREATE FUNCTION f_assign_select_into(p_customer_id INT) RETURNS INT
      BEGIN
          DECLARE c INT;
          SELECT SQL_NO_CACHE COUNT(*)
          INTO   c
          FROM   sakila.rental
          WHERE  customer_id = p_customer_id;
          RETURN c;
      END;

      CREATE FUNCTION f_assign_select_set(p_customer_id INT) RETURNS INT
      BEGIN
          DECLARE c INT;
          SET c := (
              SELECT SQL_NO_CACHE COUNT(*)
              FROM   sakila.rental
              WHERE  customer_id = p_customer_id);
          RETURN c;
      END;

      CREATE FUNCTION f_noassign_select(p_customer_id INT) RETURNS INT
      BEGIN
          RETURN (
              SELECT SQL_NO_CACHE COUNT(*)
              FROM   sakila.rental
              WHERE  customer_id = p_customer_id);
      END;
  18. Sakila Rental count benchmark
      ● SET about 25% slower than SELECT INTO
      (Bar chart: select into, set subquery, return subquery)
          N        select into   set subquery   return subquery
          100000   7.00          9.06           8.75
  19. More on variables and assignments
      ● Match expression and variable data types
          – example: calculating Easter
          CREATE FUNCTION f_easter_int_nodiv(
              p_year INT
          ) RETURNS DATE
          BEGIN
              DECLARE a SMALLINT DEFAULT p_year % 19;
              DECLARE b SMALLINT DEFAULT FLOOR(p_year / 100);
              DECLARE c SMALLINT DEFAULT p_year % 100;
              DECLARE d SMALLINT DEFAULT FLOOR(b / 4);
              DECLARE e SMALLINT DEFAULT b % 4;
              DECLARE f SMALLINT DEFAULT FLOOR((b + 8) / 25);
              DECLARE g SMALLINT DEFAULT FLOOR((b - f + 1) / 3);
              DECLARE h SMALLINT DEFAULT (19*a + b - d - g + 15) % 30;
              DECLARE i SMALLINT DEFAULT FLOOR(c / 4);
              DECLARE k SMALLINT DEFAULT c % 4;
              DECLARE L SMALLINT DEFAULT (32 + 2*e + 2*i - h - k) % 7;
              DECLARE m SMALLINT DEFAULT FLOOR((a + 11*h + 22*L) / 451);
              DECLARE v100 SMALLINT DEFAULT h + L - 7*m + 114;
              RETURN STR_TO_DATE(
                  CONCAT(p_year, '-', v100 DIV 31, '-', (v100 % 31) + 1)
              ,   '%Y-%c-%e'
              );
          END;
  20. Matching expression and variable data types
      ● Multiple expressions of this form:
          DECLARE b SMALLINT DEFAULT FLOOR(p_year / 100);
      ● Divide and round down to the nearest integer
          – Alternative: using integer division (DIV)
          DECLARE b SMALLINT DEFAULT p_year DIV 100;
      ● 13x performance increase!
          – ...but: beware of negative values
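The caveat about negative values comes from a difference in rounding: FLOOR() rounds toward negative infinity, while DIV discards the fractional part. A small sketch that shows the two agreeing for a positive operand and differing for a negative one (the year 1987 and the column aliases are arbitrary illustrations):

```sql
SELECT FLOOR( 1987 / 100) AS floor_pos,   -- 19
        1987 DIV 100      AS div_pos,     -- 19
       FLOOR(-1987 / 100) AS floor_neg,   -- -20
       -1987 DIV 100      AS div_neg;     -- -19
```

For the Easter calculation the operands are never negative, so the substitution is safe there.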
  21. Improved Easter function:
      CREATE FUNCTION f_easter_int_nodiv(
          p_year INT
      ) RETURNS DATE
      BEGIN
          DECLARE a SMALLINT DEFAULT p_year % 19;
          DECLARE b SMALLINT DEFAULT p_year DIV 100;
          DECLARE c SMALLINT DEFAULT p_year % 100;
          DECLARE d SMALLINT DEFAULT b DIV 4;
          DECLARE e SMALLINT DEFAULT b % 4;
          DECLARE f SMALLINT DEFAULT (b + 8) DIV 25;
          DECLARE g SMALLINT DEFAULT (b - f + 1) DIV 3;
          DECLARE h SMALLINT DEFAULT (19*a + b - d - g + 15) % 30;
          DECLARE i SMALLINT DEFAULT c DIV 4;
          DECLARE k SMALLINT DEFAULT c % 4;
          DECLARE L SMALLINT DEFAULT (32 + 2*e + 2*i - h - k) % 7;
          DECLARE m SMALLINT DEFAULT (a + 11*h + 22*L) DIV 451;
          DECLARE v100 SMALLINT DEFAULT h + L - 7*m + 114;
          RETURN STR_TO_DATE(
              CONCAT(p_year, '-', v100 DIV 31, '-', (v100 % 31) + 1)
          ,   '%Y-%c-%e'
          );
      END;
      ● 30% faster than using FLOOR and /
      ● Also applicable to regular SQL
  22. Variable and assignment Summary
      ● Don't use user-defined variables
          – Use local variables instead
      ● If possible, use DEFAULT
          – If you don't, time is wasted
      ● Beware of SELECT INTO
          – Only use it for assigning values from queries
          – Use SET instead for assigning literals
      ● Match expression and variable data type
  24. Flow of Control
      ● Decisions, alternate code paths
      ● Plain SQL operators and functions:
          – IF(), CASE...END
          – IFNULL(), NULLIF(), COALESCE()
          – ELT(), FIELD(), FIND_IN_SET()
      ● Stored routine statements:
          – IF...END IF
          – CASE...END CASE
  25. Case operator vs Case statement
      CREATE FUNCTION f_case_operator(
          p_arg INT
      )
      RETURNS INT
      BEGIN
          DECLARE a CHAR(1);
          SET a := CASE p_arg
              WHEN 1 THEN 'a'
              WHEN 2 THEN 'b'
              WHEN 3 THEN 'c'
              WHEN 4 THEN 'd'
              WHEN 5 THEN 'e'
              WHEN 6 THEN 'f'
              WHEN 7 THEN 'g'
              WHEN 8 THEN 'h'
              WHEN 9 THEN 'i'
              ELSE NULL
          END;
          RETURN NULL;
      END;

      CREATE FUNCTION f_case_statement(
          p_arg INT
      )
      RETURNS INT
      BEGIN
          DECLARE a CHAR(1);
          CASE p_arg
              WHEN 1 THEN SET a := 'a';
              WHEN 2 THEN SET a := 'b';
              WHEN 3 THEN SET a := 'c';
              WHEN 4 THEN SET a := 'd';
              WHEN 5 THEN SET a := 'e';
              WHEN 6 THEN SET a := 'f';
              WHEN 7 THEN SET a := 'g';
              WHEN 8 THEN SET a := 'h';
              WHEN 9 THEN SET a := 'i';
              ELSE SET a := NULL;
          END CASE;
          RETURN NULL;
      END;
  26. Case operator vs Case statement
      ● linear slowdown of the CASE statement
      (Line chart: case operator and case statement, by argument 1 to 10)
          argument   case operator   case statement
          1          9.27            10.2
          2          9.31            11.55
          3          9.33            12.83
          4          9.33            14.14
          5          9.36            15.45
          6          9.38            16.75
          7          9.36            18.09
          8          9.36            19.41
          9          9.36            20.75
          10         9.05            24.83
  27. Flow of control summary
      ● Use conditional expressions if possible
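As an illustration of replacing a flow-of-control statement with a conditional expression, the nine-way lookup from the CASE benchmark can also be written with ELT(), one of the functions listed earlier. This is a sketch rather than one of the benchmarked functions; the name f_elt_lookup is hypothetical:

```sql
-- Hypothetical variant of the CASE lookup, expressed as a single expression.
-- ELT(n, s1, s2, ...) returns the n-th string, or NULL when n is out of range.
CREATE FUNCTION f_elt_lookup(p_arg INT)
RETURNS CHAR(1)
DETERMINISTIC
RETURN ELT(p_arg, 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i');
```

`SELECT f_elt_lookup(3);` returns 'c', and an argument outside 1 to 9 returns NULL, which mirrors the ELSE NULL branch of the CASE versions.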
  29. Cursor Handling
      ● Why do you need that cursor anyway?
      ● Only very few cases justify cursors
          – Data driven stored procedure calls
          – Data driven dynamic SQL
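Data-driven dynamic SQL is one of the cases where a cursor is hard to avoid, because each row produces a different statement to execute. A minimal sketch, not taken from the talk: the procedure name p_analyze_schema is hypothetical, it assumes table names contain no backticks, and it uses the LOOP pattern recommended later in this deck. PREPARE only accepts a user-defined variable or a literal, so @sql is needed here despite the general advice against user-defined variables.

```sql
DELIMITER //

-- Hypothetical example: run ANALYZE TABLE on every base table of a schema.
CREATE PROCEDURE p_analyze_schema(p_schema VARCHAR(64))
BEGIN
    DECLARE v_done  BOOL DEFAULT FALSE;
    DECLARE v_table VARCHAR(64);
    DECLARE csr CURSOR FOR
        SELECT table_name
        FROM   information_schema.tables
        WHERE  table_schema = p_schema
        AND    table_type   = 'BASE TABLE';
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET v_done := TRUE;

    OPEN csr;
    tables_loop: LOOP
        FETCH csr INTO v_table;
        IF v_done THEN
            CLOSE csr;
            LEAVE tables_loop;
        END IF;
        -- Build and run one statement per fetched row.
        SET @sql := CONCAT('ANALYZE TABLE `', p_schema, '`.`', v_table, '`');
        PREPARE stmt FROM @sql;
        EXECUTE stmt;
        DEALLOCATE PREPARE stmt;
    END LOOP;
END//

DELIMITER ;
```

A call such as `CALL p_analyze_schema('sakila');` would return one ANALYZE TABLE result set per table.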
  30. You need a cursor to do what?!
      CREATE FUNCTION f_film_categories(p_film_id INT)
      RETURNS VARCHAR(2048)
      BEGIN
          DECLARE v_done BOOL DEFAULT FALSE;
          DECLARE v_category VARCHAR(25);
          DECLARE v_categories VARCHAR(2048);
          DECLARE film_categories CURSOR FOR
              SELECT     c.name
              FROM       sakila.film_category fc
              INNER JOIN sakila.category c
              ON         fc.category_id = c.category_id
              WHERE      fc.film_id = p_film_id;
          DECLARE CONTINUE HANDLER FOR NOT FOUND
              SET v_done := TRUE;
          OPEN film_categories;
          categories_loop: LOOP
              FETCH film_categories INTO v_category;
              IF v_done THEN
                  CLOSE film_categories;
                  LEAVE categories_loop;
              END IF;
              SET v_categories := CONCAT_WS(
                  ',', v_categories, v_category
              );
          END LOOP;
          RETURN v_categories;
      END;

      SELECT    fc.film_id
      ,         GROUP_CONCAT(c.name)
      FROM      film_category fc
      LEFT JOIN category c
      ON        fc.category_id = c.category_id
      GROUP BY  fc.film_id

      (Bar chart, N=100000)
          group_concat   cursor
          15.34          29.57
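If a function with the same signature is still wanted, the cursor can be replaced by a single GROUP_CONCAT() subquery. A sketch, assuming the Sakila schema; the name f_film_categories_gc is hypothetical, and this is not necessarily the variant that was benchmarked:

```sql
-- Hypothetical cursor-free equivalent of f_film_categories.
CREATE FUNCTION f_film_categories_gc(p_film_id INT)
RETURNS VARCHAR(2048)
READS SQL DATA
RETURN (
    SELECT     GROUP_CONCAT(c.name)
    FROM       sakila.film_category fc
    INNER JOIN sakila.category c
    ON         fc.category_id = c.category_id
    WHERE      fc.film_id = p_film_id
);
```

Two points to keep in mind: GROUP_CONCAT() output is limited by the group_concat_max_len system variable (1024 by default), and the order of the names is unspecified unless an ORDER BY is added inside GROUP_CONCAT().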
  31. Cursor Looping: REPEAT, WHILE, LOOP
      ● Loop control
      ● What's inside the loop?
          – Treat nested cursor loops as suspicious
          – Be very wary of SQL statements inside the loop.
  32. Why to avoid cursor loops with REPEAT
      ● Always runs at least once
          – So what if the set is empty?
      ● Iteration before checking the loop condition
          – Always requires an additional explicit check inside the loop
      ● Loop control scattered:
          – Both at the top and bottom of the loop
  33. Why to avoid cursor loops with REPEAT
      BEGIN
          DECLARE v_done BOOL DEFAULT FALSE;
          DECLARE csr CURSOR FOR SELECT * FROM tab;
          DECLARE CONTINUE HANDLER FOR NOT FOUND
              SET v_done := TRUE;
          OPEN csr;
          -- Loop is entered without checking whether the result set is empty
          REPEAT
              FETCH csr INTO var1,...,varN;
              -- One positive and one negative check to see whether
              -- the result set is exhausted
              IF NOT v_done THEN
                  -- ... do stuff...
              END IF;
          UNTIL v_done END REPEAT;
          CLOSE csr;
      END;
  34. Why to avoid cursor loops with WHILE
      ● Slightly better than REPEAT
          – Only one check at the top of the loop
      ● Requires code duplication
          – One FETCH needed outside the loop
      ● Loop control still scattered
          – condition is checked at the top of the loop
          – FETCH required at the bottom
  35. Why to avoid cursor loops with WHILE
      BEGIN
          DECLARE v_has_rows BOOL DEFAULT TRUE;
          DECLARE csr CURSOR FOR SELECT * FROM tab;
          DECLARE CONTINUE HANDLER FOR NOT FOUND
              SET v_has_rows := FALSE;
          OPEN csr;
          -- FETCH required both outside the loop (just once) and inside it
          FETCH csr INTO var1,...,varN;
          WHILE v_has_rows DO
              -- ... do stuff...
              FETCH csr INTO var1,...,varN;
          END WHILE;
          CLOSE csr;
      END;
  36. Why to write cursor loops with LOOP
      ● No double checking (like in REPEAT)
      ● No code duplication (like in WHILE)
      ● All loop control code in one place
          – All at the top of the loop
  37. Why you should write cursor loops with LOOP
      BEGIN
          DECLARE v_done BOOL DEFAULT FALSE;
          DECLARE csr CURSOR FOR SELECT * FROM tab;
          DECLARE CONTINUE HANDLER FOR NOT FOUND
              SET v_done := TRUE;
          OPEN csr;
          my_loop: LOOP
              FETCH csr INTO var1,...,varN;
              IF v_done THEN
                  CLOSE csr;
                  LEAVE my_loop;
              END IF;
              -- ... do stuff...
          END LOOP;
      END;
  38. Cursor summary
      ● Avoid cursors if you can
          – Use GROUP_CONCAT for lists
          – Use joins, not nested cursors
          – Only for data driven dynamic SQL and stored procedure calls
      ● Use LOOP instead of REPEAT and WHILE
          – REPEAT requires double condition checking
          – WHILE requires code duplication
          – LOOP allows you to keep all loop control together
  40. Summary
      ● Variables
          – Use local rather than user-defined variables
      ● Assignments
          – Use DEFAULT and SET for simple values
          – Use SELECT INTO for queries
      ● Flow of Control
          – Use functions and operators rather than statements
      ● Cursors
          – Avoid if possible
          – Use LOOP, not REPEAT and WHILE