2. $ docker pull mysql
$ docker start some-mysql
$ docker run -it --link some-mysql:mysql --rm mysql sh -c \
    'exec mysql -h"$MYSQL_PORT_3306_TCP_ADDR" -P"$MYSQL_PORT_3306_TCP_PORT" -uroot -p"$MYSQL_ENV_MYSQL_ROOT_PASSWORD"'
mysql> CREATE DATABASE wf;
mysql> USE wf;
mysql> CREATE TABLE gain
(hotel VARCHAR(10), date DATE, sale INT);
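The slides do not show how the table was filled. A minimal sketch, assuming the nine rows that appear in the result set of the partition clause slide (before the later UPDATE statements):

```sql
-- Sample rows copied from the result sets shown in the slides
INSERT INTO gain (hotel, date, sale) VALUES
  ('Cavalieri', '2019-01-01', 250),
  ('Cavalieri', '2019-02-01', 262),
  ('Cavalieri', '2019-03-01', 269),
  ('J.K.Place', '2019-01-01', 460),
  ('J.K.Place', '2019-02-01', 450),
  ('J.K.Place', '2019-03-01', 475),
  ('Manfredi',  '2019-01-01', 400),
  ('Manfredi',  '2019-02-01', 319),
  ('Manfredi',  '2019-03-01', 341);
```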
6. Window Function
One of the biggest new features of MySQL 8
Meetup PUG
30 April 2019
#AperiTech
7. Index
● Window function history
● What it is
● Types of window function
● Logical flow
● Optimization
8. Who am I?
I’m Davide Dell’Erba
Full Stack Web Developer @
@delda80
github.com/delda
info@davidedellerba.it
9. Window function story
Windows and window functions were first introduced as an
amendment to SQL:1999.
Window functions were incorporated into the SQL:2003
version of standard SQL.
They were updated in SQL:2008.
The most recent expansion came in SQL:2016, the latest
version of the standard at the time of this talk.
10. Window function story
Over the years, almost all major database systems introduced this feature:
[Timeline figure: database systems placed on an axis from 2008 to 2017]
11. What is a “window”?
It is a set of rows defined by the OVER() clause
[Diagram: rows grouped into "Set one", "Set two" and "Set three", shown once for ORDER BY and once for PARTITION BY]
12. What is a “function”?
Function    Example
Ranking     ROW_NUMBER(), RANK(), DENSE_RANK(), NTILE()
Aggregate   MIN(), MAX(), AVG(), SUM(), COUNT(), STDEV(),
            STDEVP(), VAR(), VARP(), CHECKSUM_AGG(),
            COUNT_BIG()
Analytic    CUME_DIST(), LAG(), LEAD(), FIRST_VALUE(),
            LAST_VALUE(), PERCENT_RANK()
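Some aggregate names in the table (STDEV(), STDEVP(), VAR(), VARP(), CHECKSUM_AGG(), COUNT_BIG()) are SQL Server spellings; MySQL 8 uses names such as STDDEV_POP() and VAR_POP() instead. As a rough sketch, one function from each category applied to the gain table could look like this (the column aliases are illustrative, not from the slides):

```sql
SELECT hotel, date, sale,
       -- ranking: split all rows into 3 buckets by sale
       NTILE(3) OVER (ORDER BY sale) AS sale_bucket,
       -- aggregate: population standard deviation per hotel
       ROUND(STDDEV_POP(sale) OVER (PARTITION BY hotel), 2) AS hotel_stddev,
       -- analytic: earliest sale of each hotel
       FIRST_VALUE(sale) OVER (PARTITION BY hotel ORDER BY date) AS first_sale
FROM gain
ORDER BY hotel, date;
```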
13. So… What is a “window function”?
● It is a function that computes a value for a row using a window
● It operates on a window, which is a group of related rows.
● It returns a value for each row of the table.
● The return value is calculated using data from the rows of the window.
● This is a new concept: you can reach outside the current row.
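To make "reaching outside the current row" concrete, here is a minimal sketch on the gain table: LAG() reads the sale of the previous row in the same window. The aliases previous_sale and change_vs_previous are illustrative names, not from the slides.

```sql
SELECT hotel, date, sale,
       -- value taken from the previous row of the same hotel
       LAG(sale) OVER (PARTITION BY hotel ORDER BY date) AS previous_sale,
       -- difference between the current row and the previous one
       sale - LAG(sale) OVER (PARTITION BY hotel ORDER BY date) AS change_vs_previous
FROM gain
ORDER BY hotel, date;
```

The first row of each hotel has no previous row, so both extra columns are NULL there.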
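14. GROUP BY vs OVER()
-- GROUP BY collapses each group into a single row
SELECT
`key`,
SUM(value)
FROM `table`
GROUP BY `key`;
-- OVER() keeps every row and adds the windowed value to it
SELECT
`key`,
SUM(value) OVER (PARTITION BY `key`
ORDER BY date)
FROM `table`;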
16. Partition clause
mysql> SELECT hotel, date, sale, SUM(sale) OVER (PARTITION BY hotel) AS total
    FROM gain ORDER BY hotel, date;
+-----------+------------+------+-------+
| hotel     | date       | sale | total |
+-----------+------------+------+-------+
| Cavalieri | 2019-01-01 |  250 |   781 |
| Cavalieri | 2019-02-01 |  262 |   781 |
| Cavalieri | 2019-03-01 |  269 |   781 |
| J.K.Place | 2019-01-01 |  460 |  1385 |
| J.K.Place | 2019-02-01 |  450 |  1385 |
| J.K.Place | 2019-03-01 |  475 |  1385 |
| Manfredi  | 2019-01-01 |  400 |  1060 |
| Manfredi  | 2019-02-01 |  319 |  1060 |
| Manfredi  | 2019-03-01 |  341 |  1060 |
+-----------+------------+------+-------+
mysql> SELECT hotel, SUM(sale)
    FROM gain GROUP BY hotel;
+-----------+-----------+
| hotel     | SUM(sale) |
+-----------+-----------+
| Cavalieri |       781 |
| J.K.Place |      1385 |
| Manfredi  |      1060 |
+-----------+-----------+
17. Types of window function: partition clause
● specified by the PARTITION BY clause
● windows are separated by partition boundaries
● is supported by all window functions
mysql> SELECT hotel, date, sale, SUM(sale) OVER ( PARTITION BY hotel) AS total
FROM gain ORDER BY hotel, date;
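To see what the partition boundary changes, a small sketch that puts a partitioned window next to an unpartitioned one. An empty OVER () treats the whole result set as a single window; the aliases hotel_total and grand_total are illustrative.

```sql
SELECT hotel, date, sale,
       -- one window per hotel
       SUM(sale) OVER (PARTITION BY hotel) AS hotel_total,
       -- a single window covering every row
       SUM(sale) OVER () AS grand_total
FROM gain
ORDER BY hotel, date;
```

With the nine rows shown on the partition clause slide, grand_total would be 3226 on every row (781 + 1385 + 1060).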
18. Order by clause
mysql> SELECT hotel, date, sale,
    SUM(sale) OVER (PARTITION BY hotel ORDER BY date) AS partial,
    SUM(sale) OVER (PARTITION BY hotel) AS total
    FROM gain;
+-----------+------------+------+---------+-------+
| hotel     | date       | sale | partial | total |
+-----------+------------+------+---------+-------+
| Cavalieri | 2019-01-01 |  250 |     250 |   781 |
| Cavalieri | 2019-02-01 |  262 |     512 |   781 |
| Cavalieri | 2019-03-01 |  269 |     781 |   781 |
| J.K.Place | 2019-01-01 |  460 |     460 |  1385 |
| J.K.Place | 2019-02-01 |  450 |     910 |  1385 |
| J.K.Place | 2019-03-01 |  475 |    1385 |  1385 |
| Manfredi  | 2019-01-01 |  400 |     400 |  1060 |
| Manfredi  | 2019-02-01 |  319 |     719 |  1060 |
| Manfredi  | 2019-03-01 |  341 |    1060 |  1060 |
+-----------+------------+------+---------+-------+
19. Types of window function: order by clause
● specified by the ORDER BY clause
● defines the ordering of the rows within the set
● is supported by all window functions
mysql> SELECT hotel, date, sale, SUM(sale) OVER (PARTITION BY hotel ORDER BY
date) AS partial, SUM(sale) OVER (PARTITION BY hotel) AS total FROM gain;
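Why does adding ORDER BY turn SUM() into a running total? In MySQL 8, when a window has an ORDER BY and no frame clause, the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW; without ORDER BY the frame is the whole partition. A sketch that spells both frames out explicitly (the aliases are illustrative):

```sql
SELECT hotel, date, sale,
       -- implicit default frame: running total
       SUM(sale) OVER (PARTITION BY hotel ORDER BY date) AS partial_implicit,
       -- the same frame written out explicitly
       SUM(sale) OVER (PARTITION BY hotel ORDER BY date
                       RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS partial_explicit,
       -- frame widened to the whole partition: same value as without ORDER BY
       SUM(sale) OVER (PARTITION BY hotel ORDER BY date
                       ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS total
FROM gain;
```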
20. Queries...
mysql> SELECT hotel, date, sale,
    ROUND(AVG(sale) OVER (PARTITION BY hotel ORDER BY date), 2) AS average,
    SUM(sale) OVER (PARTITION BY hotel ORDER BY date) AS partial,
    SUM(sale) OVER (PARTITION BY hotel) AS total
    FROM gain;
+-----------+------------+------+---------+---------+-------+
| hotel     | date       | sale | average | partial | total |
+-----------+------------+------+---------+---------+-------+
| Cavalieri | 2019-01-01 |  250 |  250.00 |     250 |   781 |
| Cavalieri | 2019-02-01 |  262 |  256.00 |     512 |   781 |
| Cavalieri | 2019-03-01 |  269 |  260.33 |     781 |   781 |
| J.K.Place | 2019-01-01 |  460 |  460.00 |     460 |  1385 |
| J.K.Place | 2019-02-01 |  450 |  455.00 |     910 |  1385 |
| J.K.Place | 2019-03-01 |  475 |  461.67 |    1385 |  1385 |
| Manfredi  | 2019-01-01 |  400 |  400.00 |     400 |  1060 |
| Manfredi  | 2019-02-01 |  319 |  359.50 |     719 |  1060 |
| Manfredi  | 2019-03-01 |  341 |  353.33 |    1060 |  1060 |
+-----------+------------+------+---------+---------+-------+
21. Queries...
mysql> SELECT hotel, date, sale,
    ROUND(AVG(sale) OVER (PARTITION BY hotel ORDER BY date), 2) AS average,
    ROUND(sale - AVG(sale) OVER (PARTITION BY hotel ORDER BY date), 2) AS delta,
    SUM(sale) OVER (PARTITION BY hotel ORDER BY date) AS partial,
    SUM(sale) OVER (PARTITION BY hotel) AS total
    FROM gain;
+-----------+------------+------+---------+--------+---------+-------+
| hotel     | date       | sale | average | delta  | partial | total |
+-----------+------------+------+---------+--------+---------+-------+
| Cavalieri | 2019-01-01 |  250 |  250.00 |   0.00 |     250 |   781 |
| Cavalieri | 2019-02-01 |  262 |  256.00 |   6.00 |     512 |   781 |
| Cavalieri | 2019-03-01 |  269 |  260.33 |   8.67 |     781 |   781 |
| J.K.Place | 2019-01-01 |  460 |  460.00 |   0.00 |     460 |  1385 |
| J.K.Place | 2019-02-01 |  450 |  455.00 |  -5.00 |     910 |  1385 |
| J.K.Place | 2019-03-01 |  475 |  461.67 |  13.33 |    1385 |  1385 |
| Manfredi  | 2019-01-01 |  400 |  400.00 |   0.00 |     400 |  1060 |
| Manfredi  | 2019-02-01 |  319 |  359.50 | -40.50 |     719 |  1060 |
| Manfredi  | 2019-03-01 |  341 |  353.33 | -12.33 |    1060 |  1060 |
+-----------+------------+------+---------+--------+---------+-------+
22. Update table
mysql> UPDATE gain SET sale = 400 WHERE hotel = 'Manfredi';
mysql> UPDATE gain SET sale = 450 WHERE date = '2019-02-01';
mysql> SELECT * FROM gain;
+-----------+------------+------+
| hotel     | date       | sale |
+-----------+------------+------+
| Cavalieri | 2019-03-01 |  269 |
| J.K.Place | 2019-02-01 |  450 |
| Manfredi  | 2019-01-01 |  400 |
| Cavalieri | 2019-02-01 |  450 |
| Cavalieri | 2019-01-01 |  250 |
| Manfredi  | 2019-02-01 |  450 |
| J.K.Place | 2019-03-01 |  475 |
| J.K.Place | 2019-01-01 |  460 |
| Manfredi  | 2019-03-01 |  400 |
+-----------+------------+------+
23. Frame clause introduction
mysql> SELECT hotel, date, sale,
    COUNT(sale) OVER (ORDER BY sale DESC ROWS BETWEEN UNBOUNDED PRECEDING AND
    CURRENT ROW) AS 'order',
    RANK() OVER (ORDER BY sale DESC) AS ranking,
    DENSE_RANK() OVER (ORDER BY sale DESC) AS 'dense rank'
    FROM gain;
+-----------+------------+------+-------+---------+------------+
| hotel     | date       | sale | order | ranking | dense rank |
+-----------+------------+------+-------+---------+------------+
| J.K.Place | 2019-03-01 |  475 |     1 |       1 |          1 |
| J.K.Place | 2019-01-01 |  460 |     2 |       2 |          2 |
| J.K.Place | 2019-02-01 |  450 |     3 |       3 |          3 |
| Cavalieri | 2019-02-01 |  450 |     4 |       3 |          3 |
| Manfredi  | 2019-02-01 |  450 |     5 |       3 |          3 |
| Manfredi  | 2019-01-01 |  400 |     6 |       6 |          4 |
| Manfredi  | 2019-03-01 |  400 |     7 |       6 |          4 |
| Cavalieri | 2019-03-01 |  269 |     8 |       8 |          5 |
| Cavalieri | 2019-01-01 |  250 |     9 |       9 |          6 |
+-----------+------------+------+-------+---------+------------+
24. Frame clause
mysql> SELECT hotel, sale,
    COUNT(sale) OVER (ORDER BY sale ROWS BETWEEN UNBOUNDED PRECEDING AND
    CURRENT ROW) AS 'rows',
    COUNT(sale) OVER (ORDER BY sale RANGE BETWEEN UNBOUNDED PRECEDING AND
    CURRENT ROW) AS 'range' FROM gain;
+-----------+------+------+-------+
| hotel     | sale | rows | range |
+-----------+------+------+-------+
| Cavalieri |  250 |    1 |     1 |
| Cavalieri |  269 |    2 |     2 |
| Manfredi  |  400 |    3 |     4 |
| Manfredi  |  400 |    4 |     4 |
| J.K.Place |  450 |    5 |     7 |
| Cavalieri |  450 |    6 |     7 |
| Manfredi  |  450 |    7 |     7 |
| J.K.Place |  460 |    8 |     8 |
| J.K.Place |  475 |    9 |     9 |
+-----------+------+------+-------+
25. Frame clause
mysql> SELECT hotel, sale,
    COUNT(sale) OVER (ORDER BY sale ROWS BETWEEN UNBOUNDED PRECEDING AND
    CURRENT ROW) AS 'rows',
    COUNT(sale) OVER (ORDER BY sale ROWS BETWEEN 1 PRECEDING AND
    1 FOLLOWING) AS 'rows2',
    COUNT(sale) OVER (ORDER BY sale RANGE BETWEEN UNBOUNDED PRECEDING AND
    CURRENT ROW) AS 'range'
    FROM gain;
+-----------+------+------+-------+-------+
| hotel     | sale | rows | rows2 | range |
+-----------+------+------+-------+-------+
| Cavalieri |  250 |    1 |     2 |     1 |
| Cavalieri |  269 |    2 |     3 |     2 |
| Manfredi  |  400 |    3 |     3 |     4 |
| Manfredi  |  400 |    4 |     3 |     4 |
| J.K.Place |  450 |    5 |     3 |     7 |
| Cavalieri |  450 |    6 |     3 |     7 |
| Manfredi  |  450 |    7 |     3 |     7 |
| J.K.Place |  460 |    8 |     3 |     8 |
| J.K.Place |  475 |    9 |     2 |     9 |
+-----------+------+------+-------+-------+
26. Frame clause
● specified relative to the current row
● lets you say how far the set extends around it
● the relationship between row and frame is expressed
with ROWS or RANGE
mysql> SELECT hotel, sale, COUNT(sale) OVER (ORDER BY sale ROWS BETWEEN
UNBOUNDED PRECEDING AND CURRENT ROW) AS 'rows', COUNT(sale) OVER (ORDER BY
sale ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS 'rows2', COUNT(sale) OVER
(ORDER BY sale RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS 'range'
FROM gain;
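A common use of a bounded frame is a moving average. The sketch below is illustrative only (the aliases moving_avg and similar_sales are not from the slides): ROWS counts physical rows around the current one, while a numeric RANGE offset selects rows whose ORDER BY value lies within that distance of the current row's value.

```sql
SELECT hotel, date, sale,
       -- average over the previous, current and next row of the same hotel
       ROUND(AVG(sale) OVER (PARTITION BY hotel ORDER BY date
                             ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING), 2) AS moving_avg,
       -- number of rows whose sale is within 50 of the current sale
       COUNT(*) OVER (ORDER BY sale
                      RANGE BETWEEN 50 PRECEDING AND 50 FOLLOWING) AS similar_sales
FROM gain
ORDER BY hotel, date;
```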
27. Order please
Order  Clause                  Function
1      FROM (including JOINs)  Chooses and joins tables
2      WHERE                   Filters the base data
3      GROUP BY                Aggregates the base data
4      HAVING                  Filters the aggregated data
5      WINDOW FUNCTION         Performs calculations on subsets of the data
6      SELECT                  Returns the final data
9      ORDER BY                Sorts the final data
10     LIMIT + OFFSET          Limits the returned data to a row count
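One consequence of this order: window functions are evaluated after WHERE, GROUP BY and HAVING, so their result cannot be filtered in the WHERE clause of the same query. A common workaround, sketched here with a common table expression (the names ranked and rn are illustrative), is to compute the window function in an inner query and filter in the outer one:

```sql
-- Best sale of each hotel: rank inside the CTE, filter outside it
WITH ranked AS (
  SELECT hotel, date, sale,
         ROW_NUMBER() OVER (PARTITION BY hotel ORDER BY sale DESC) AS rn
  FROM gain
)
SELECT hotel, date, sale
FROM ranked
WHERE rn = 1
ORDER BY hotel;
```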
29. Order please
mysql> SELECT hotel, date, sale,
    RANK() OVER (ORDER BY hotel, date) AS 'rank' FROM gain ORDER BY date;
+-----------+------------+------+------+
| hotel     | date       | sale | rank |
+-----------+------------+------+------+
| Manfredi  | 2019-01-01 |  400 |    7 |
| Cavalieri | 2019-01-01 |  250 |    1 |
| J.K.Place | 2019-01-01 |  460 |    4 |
| Manfredi  | 2019-02-01 |  450 |    8 |
| Cavalieri | 2019-02-01 |  450 |    2 |
| J.K.Place | 2019-02-01 |  450 |    5 |
| J.K.Place | 2019-03-01 |  475 |    6 |
| Manfredi  | 2019-03-01 |  400 |    9 |
| Cavalieri | 2019-03-01 |  269 |    3 |
+-----------+------------+------+------+
30. Implicit and explicit window function
● A window can be implicit and unnamed:
SELECT SUM(sale) OVER (PARTITION BY sale)
FROM gain;
● A window can be named via the WINDOW clause:
SELECT SUM(sale) OVER (wf)
FROM gain
WINDOW wf AS (PARTITION BY sale);
31. Implicit and explicit window function
● A window can inherit from another window and add details:
SELECT hotel, date, sale,
SUM(sale) OVER (wf2) AS partial,
SUM(sale) OVER (wf1) AS total
FROM gain
WINDOW wf1 AS (PARTITION BY hotel),
wf2 AS (wf1 ORDER BY sale);
32. EXPLAIN WITH JSON
With the plain EXPLAIN command you cannot see how a window function is
executed;
if you type EXPLAIN FORMAT=JSON instead, the output shows the details you
need to optimize the query.
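A minimal sketch of the JSON variant on the gain table. The output is not reproduced here; in MySQL 8.0 it is expected to contain a "windowing" element describing each window, for example whether a temporary table or a filesort is used.

```sql
-- Ask for the JSON plan instead of the tabular one
EXPLAIN FORMAT=JSON
SELECT hotel, date, sale,
       SUM(sale) OVER (PARTITION BY hotel ORDER BY date) AS partial
FROM gain;
```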
33. EXPLAIN
mysql> EXPLAIN SELECT hotel, date, sale, SUM(sale) OVER() total FROM
    gain\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: gain
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 9
     filtered: 100.00
        Extra: NULL