One of the limiting factors of most timeseries databases is that, in order to get good read performance, they limit your ability to update data. That's fine if your data is an event stream, but if it's coming from pre-aggregated sources it might update past data — for example, data about online ad performance updated after click fraud is discovered. In this talk I'll show you how AdStage stores timeseries data in Postgres to allow fast reads and updates, using clever schema design and functions for speed.
23. TimescaleDB
-- Wide "one column per metric" schema: one row per (entity, day).
-- The "..." is slide elision standing in for metric_3 .. metric_99;
-- this snippet is illustrative, not executable as-is.
create table entity_date_metric (
entity integer not null,
ts date not null,
metric_1 numeric,
metric_2 numeric,
...
metric_100 numeric
);
-- TimescaleDB: convert the plain table into a hypertable partitioned
-- into chunks by the 'ts' time column (default chunk interval).
select create_hypertable('entity_date_metric', 'ts');
24. TimescaleDB
-- Benchmark read path: roll daily rows up to (entity, week) for ~1000
-- entities over one month. "metric_*" is slide shorthand for repeating
-- sum(...) across all 100 metric columns; "*list of ...*" is a placeholder.
select
entity, date_trunc('week', ts) ts_trunc,
sum(metric_*) ...
from entity_date_metric
where entity in (*list of 1000 random entities*)
-- NOTE(review): between is closed on both ends; acceptable for DATE
-- buckets, but a half-open range (>= / <) avoids boundary double-counting.
and ts between '2001-01-15' and '2001-02-15'
group by entity, ts_trunc;
●
27. TOASTy
Arrays
●
○
●
-- Array-layout variant: each row stores a metric as an array of daily
-- values (kept out-of-line by TOAST). The inner subquery explodes an
-- array slice back into one row per day, then the outer query rolls the
-- exploded rows up to (entity, week).
-- s, e, start, end, metric_* and "*list of ...*" are slide placeholders.
select
entity, date_trunc('week', day) ts_trunc,
sum(metric_*) ...
from (
select
entity,
-- Parallel unnests: Postgres zips multiple unnest() calls in the same
-- select list row-by-row, so each metric value is paired with its day.
-- Relies on the generated series and the slice having equal length.
unnest(array(select generate_series(s, e, '1 day'))) as day,
unnest(metric_*[start:end]) as metric_*,
...
from entity_date_metric
where entity in (*list of 1000 random entities*)
) as unnested_metrics
group by entity, ts_trunc;
29. TOASTy
Arrays
-- Sum all elements of a numeric array, counting NULL elements as 0.
-- Contract (unchanged from the loop version):
--   * NULL input  -> NULL (STRICT short-circuits before the body runs)
--   * empty array -> NULL (sum() over zero rows yields NULL, matching the
--                    original explicit array_length check)
--   * NULL items  -> treated as 0 via coalesce
-- IMMUTABLE + PARALLEL SAFE so the planner may inline and parallelize it.
-- Rewritten set-based in LANGUAGE sql (the quoted 'plpgsql' spelling is
-- deprecated, and the element-by-element loop was needless row-at-a-time
-- work). NOTE(review): unnest flattens all dimensions; intended use here
-- is 1-D daily-value arrays.
create or replace function metric_array_sum(
    input numeric[]
) returns numeric as $fun$
    select sum(coalesce(elem, 0))
    from unnest(input) as t(elem)
$fun$ language sql immutable strict parallel safe;
●
●
○
○
○