http://www.craigkerstiens.com
Hash based Range based
Overly excited engineers
• I saw this tech talk from a Google engineer
• Instagram did it early and look at them now
Management, 1 year later
• You’re not Google
• We didn’t ship any features during that time and we’re no more scalable
10 years ago, all this was true
1 in a million good possible outcomes
Sharding scales
CREATE TABLE visits_01 … (on node 1)
CREATE TABLE visits_02 … (on node 2)
CREATE TABLE visits_03 … (on node 1)
CREATE TABLE visits_04 … (on node 2)
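The four statements above alternate between two nodes. A minimal sketch of that placement rule, assuming shards are assigned round-robin; the function name `node_for_shard` and the `node_count` parameter are hypothetical and not from the talk:

```python
def node_for_shard(shard_number: int, node_count: int = 2) -> int:
    """Return the 1-based node that holds a 1-based shard number.

    Assumption: shards are placed round-robin, which matches the
    visits_01..visits_04 layout shown on the slide.
    """
    if shard_number < 1 or node_count < 1:
        raise ValueError("shard_number and node_count must be >= 1")
    return (shard_number - 1) % node_count + 1


if __name__ == "__main__":
    for shard in range(1, 5):
        print(f"visits_{shard:02d} -> node {node_for_shard(shard)}")
```

Keeping more shards than nodes is what lets a shard move to a new node later without re-splitting the data.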
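Account | page                           | occurred_at       | visitor_id
--------+--------------------------------+-------------------+--------------------
1       | https://www.craigkerstiens.com | 6/5/2018 12:34:00 | ab49e5-bc34-46d12
2       | http://www.facebook.com        | 6/5/2018 13:10:00 | ce52062-bc38-43d52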
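Hashed Account | page                           | occurred_at       | visitor_id
---------------+--------------------------------+-------------------+--------------------
46154          | https://www.craigkerstiens.com | 6/5/2018 12:34:00 | ab49e5-bc34-46d12
27193          | http://www.facebook.com        | 6/5/2018 13:10:00 | ce52062-bc38-43d52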
Hash ranges | Table data
------------+-----------
0-2047      | visits_01
2048-4095   | visits_02
…           | …
26624-28671 | visits_13
…           | …
63488-65535 | visits_32
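The range lookup can be sketched in a few lines. This assumes 32 equal ranges of 2048 hash values each and 1-based table numbering that follows the first two rows of the table (0-2047 is visits_01); the other labels on the slide are illustrative and may not follow the same arithmetic. The names `RANGE_WIDTH`, `TABLE_COUNT`, and `table_for_hash` are hypothetical.

```python
RANGE_WIDTH = 2048                       # hash values per range
TABLE_COUNT = 32                         # visits_01 .. visits_32
HASH_SPACE = RANGE_WIDTH * TABLE_COUNT   # 65536 possible hash values


def table_for_hash(hash_value: int) -> str:
    """Return the shard table whose hash range contains hash_value."""
    if not 0 <= hash_value < HASH_SPACE:
        raise ValueError(f"hash value must be in 0..{HASH_SPACE - 1}")
    index = hash_value // RANGE_WIDTH    # 0-based range index
    return f"visits_{index + 1:02d}"     # assumed 1-based table names


if __name__ == "__main__":
    for value in (0, 2047, 2048, 65535):
        print(value, "->", table_for_hash(value))
```

Because the ranges are contiguous and equal in width, the lookup is a single integer division rather than a search through the range table.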
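Hashed Account | page                           | occurred_at       | visitor_id         | Table
---------------+--------------------------------+-------------------+--------------------+-----------
46154          | https://www.craigkerstiens.com | 6/5/2018 12:34:00 | ab49e5-bc34-46d12  | visits_27
27193          | http://www.facebook.com        | 6/5/2018 13:10:00 | ce52062-bc38-43d52 | visits_13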
SELECT *
FROM visits
WHERE account_id = 1;
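Because the query filters on the shard key, it only needs to touch one shard. A rough sketch of that routing step follows. The hash function is an assumption (CRC32 truncated to 16 bits stands in for whatever the real system uses), and `hash_account` and `route_query` are hypothetical names.

```python
import zlib


def hash_account(account_id: int) -> int:
    """Map an account id to a 16-bit hash value (0..65535).

    Assumption: CRC32 truncated to 16 bits is only a stand-in for the
    hash function a real sharding layer would use.
    """
    return zlib.crc32(str(account_id).encode("utf-8")) & 0xFFFF


def route_query(account_id: int) -> str:
    """Rewrite the logical query against `visits` to hit a single shard."""
    shard = hash_account(account_id) // 2048 + 1   # 32 ranges of 2048 values
    return f"SELECT * FROM visits_{shard:02d} WHERE account_id = {int(account_id)};"


if __name__ == "__main__":
    print(route_query(1))
```

A query without the shard key in its WHERE clause has no such shortcut and would have to be sent to every shard.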
Postgres-XC
Developed by NTT and EnterpriseDB
Evolved from XC to XL, along with many other forks
• Focuses on staying as close as possible to the core Postgres experience
• Global transaction manager is involved in every transaction
• Transactions require multiple network round trips
• Data is sharded; each node has its own cache
• Scale-out read performance, on a single core at a time
APPLICATION
SELECT count(*)
FROM ads JOIN campaigns ON ads.company_id = campaigns.company_id
WHERE ads.designer_name = 'Isaac'
AND campaigns.company_id = 'Elly Co';
METADATA
COORDINATOR NODE
W1   W2   W3 … Wn
SELECT …
FROM ads_1001, campaigns_2001
…
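The flow in the diagram is scatter-gather: the coordinator sends a per-shard version of the query to the workers and adds up what comes back. Below is a small in-memory sketch of that idea, assuming each worker holds a co-located pair of `ads` and `campaigns` shards. All names and sample rows are hypothetical, and a real planner may prune to a single shard when the filter is on the distribution column; the sketch fans out to every worker for simplicity.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]
# One co-located (ads shard, campaigns shard) pair per worker.
Shard = Tuple[List[Row], List[Row]]

# Made-up sample data for illustration only.
SAMPLE_SHARDS: List[Shard] = [
    (  # worker 1, e.g. ads_1001 and campaigns_2001
        [{"company_id": "Elly Co", "designer_name": "Isaac"},
         {"company_id": "Elly Co", "designer_name": "Ada"}],
        [{"company_id": "Elly Co"}, {"company_id": "Elly Co"}],
    ),
    (  # worker 2
        [{"company_id": "Other Co", "designer_name": "Isaac"}],
        [{"company_id": "Other Co"}],
    ),
]


def worker_count(shard: Shard, designer: str, company: str) -> int:
    """Run the join and both filters on one worker's local shard pair."""
    ads, campaigns = shard
    return sum(
        1
        for ad in ads
        for campaign in campaigns
        if ad["company_id"] == campaign["company_id"]
        and ad["designer_name"] == designer
        and campaign["company_id"] == company
    )


def coordinator_count(shards: List[Shard], designer: str, company: str) -> int:
    """Fan the query out to every worker and sum the partial counts."""
    return sum(worker_count(shard, designer, company) for shard in shards)


if __name__ == "__main__":
    print(coordinator_count(SAMPLE_SHARDS, "Isaac", "Elly Co"))
```

The join never crosses workers here because both tables are sharded on `company_id`, so matching rows always live on the same node.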
CREATE TABLE github_events
(
event_id bigint,
event_type text,
event_public boolean,
repo_id bigint,
payload jsonb,
repo jsonb,
user_id bigint,
org jsonb,
created_at timestamp
);
Create a table
CREATE TABLE github_users
(
user_id bigint,
url text,
login text,
avatar_url text,
gravatar_id text,
display_login text
);
Create your table
Distribute your tables
SELECT create_distributed_table('github_events', 'user_id');
SELECT create_distributed_table('github_users', 'user_id');
# \dt
List of relations
Schema | Name | Type | Owner
--------+----------------------+-------+-------
public | github_events_102011 | table | citus
public | github_events_102015 | table | citus
public | github_events_102019 | table | citus
public | github_events_102023 | table | citus
public | github_events_102027 | table | citus
public | github_events_102031 | table | citus
public | github_events_102035 | table | citus
public | github_events_102039 | table | citus
public | github_users_102043 | table | citus
public | github_users_102047 | table | citus
public | github_users_102051 | table | citus
public | github_users_102055 | table | citus
public | github_users_102059 | table | citus
public | github_users_102063 | table | citus
public | github_users_102067 | table | citus
public | github_users_102071 | table | citus
(16 rows)
# \dt
List of relations
Schema | Name | Type | Owner
--------+----------------------+-------+-------
public | github_events_102009 | table | citus
public | github_events_102013 | table | citus
public | github_events_102017 | table | citus
public | github_events_102021 | table | citus
public | github_events_102025 | table | citus
public | github_events_102029 | table | citus
public | github_events_102033 | table | citus
public | github_events_102037 | table | citus
public | github_users_102041 | table | citus
public | github_users_102045 | table | citus
public | github_users_102049 | table | citus
public | github_users_102053 | table | citus
public | github_users_102057 | table | citus
public | github_users_102061 | table | citus
public | github_users_102065 | table | citus
public | github_users_102069 | table | citus
(16 rows)
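In both listings the shard IDs step by 4, which is what round-robin placement over four workers would produce. The sketch below reproduces the two listings under that reading. The values 32 shards per table, 4 workers, and the starting IDs 102008 and 102040 are assumptions inferred from the output, not figures stated in the talk; `place_shards` is a hypothetical helper.

```python
from typing import Dict, List


def place_shards(first_shard_id: int, shard_count: int, workers: int) -> Dict[int, List[int]]:
    """Assign consecutive shard ids to workers round-robin (0-based worker index)."""
    placement: Dict[int, List[int]] = {w: [] for w in range(workers)}
    for offset in range(shard_count):
        placement[offset % workers].append(first_shard_id + offset)
    return placement


# Assumed values, inferred from the \dt listings above.
EVENTS = place_shards(first_shard_id=102008, shard_count=32, workers=4)
USERS = place_shards(first_shard_id=102040, shard_count=32, workers=4)

if __name__ == "__main__":
    for worker in (3, 1):  # the two workers whose listings are shown
        names = [f"github_events_{s}" for s in EVENTS[worker]]
        names += [f"github_users_{s}" for s in USERS[worker]]
        print(f"worker {worker}: {len(names)} tables")
        for name in names:
            print("  ", name)
```

Under these assumptions each worker ends up with 8 shards of each table, 16 tables in total, matching the "(16 rows)" in each listing.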
SELECT count(*) from github_events;
count
--------
126245
(1 row)
Time: 177.491 ms
Querying our data
SELECT date_trunc('hour', created_at) AS hour,
sum((payload->>'distinct_size')::int) AS num_commits
FROM github_events
WHERE event_type = 'PushEvent'
GROUP BY hour
ORDER BY hour;
hour | num_commits
---------------------+-------------
2016-12-01 05:00:00 | 22160
2016-12-01 06:00:00 | 53562
2016-12-01 07:00:00 | 46540
2016-12-01 08:00:00 | 35002
(4 rows)
Time: 186.176 ms
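A grouped aggregate like this can be computed per shard and then combined, since a sum of partial sums is the total sum. A minimal sketch of the combine step follows. The per-shard numbers are made up for illustration, and `merge_partials` is a hypothetical name rather than anything from Citus.

```python
from collections import Counter
from typing import Dict, Iterable


def merge_partials(partials: Iterable[Dict[str, int]]) -> Dict[str, int]:
    """Add up per-shard {hour: num_commits} results and order them by hour."""
    totals: Counter = Counter()
    for partial in partials:
        totals.update(partial)   # Counter.update adds values for matching keys
    return dict(sorted(totals.items()))


if __name__ == "__main__":
    # Made-up partial results from two shards.
    shard_a = {"2016-12-01 05:00:00": 3, "2016-12-01 06:00:00": 5}
    shard_b = {"2016-12-01 05:00:00": 2, "2016-12-01 07:00:00": 1}
    for hour, commits in merge_partials([shard_a, shard_b]).items():
        print(hour, "|", commits)
```

Not every aggregate merges this simply; an average, for example, needs both a sum and a count from each shard.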
Querying our data
CREATE SCHEMA github;
CREATE TABLE github.events (
event_id bigint,
event_type text,
event_public boolean,
repo_id bigint,
payload jsonb,
repo jsonb,
actor jsonb,
org jsonb,
created_at timestamp
) PARTITION BY RANGE (created_at);
SELECT
partman.create_parent('github.events',
'created_at', 'native', 'hourly');
UPDATE partman.part_config SET
infinite_time_partitions = true;
List of relations
Schema | Name | Type | Owner
--------+-------------------------+----------+----------
public | events | table | citus
public | events_event_id_seq | sequence | citus
public | events_p2018_10_23_0900 | table | citus
public | events_p2018_10_23_0905 | table | citus
public | events_p2018_10_23_0910 | table | citus
public | events_p2018_10_23_0915 | table | citus
public | events_p2018_10_23_0920 | table | citus
public | events_p2018_10_23_0925 | table | citus
public | events_p2018_10_23_0930 | table | citus
public | events_p2018_10_23_0935 | table | citus
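The child table names encode the start of each time window. The sketch below shows how a timestamp could be mapped to such a name. The 5-minute interval is inferred from the listing (names five minutes apart), the naming pattern is copied from it, and `partition_name` is a hypothetical helper rather than a pg_partman function. It also assumes the interval divides a day evenly.

```python
from datetime import datetime, timedelta


def partition_name(ts: datetime, parent: str = "events", interval_minutes: int = 5) -> str:
    """Return the name of the child partition whose window contains ts."""
    minutes_since_midnight = ts.hour * 60 + ts.minute
    # Round down to the start of the enclosing interval.
    floored = minutes_since_midnight - minutes_since_midnight % interval_minutes
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    start = midnight + timedelta(minutes=floored)
    return f"{parent}_p{start:%Y_%m_%d_%H%M}"


if __name__ == "__main__":
    print(partition_name(datetime(2018, 10, 23, 9, 7, 30)))
```

With range partitioning on `created_at`, dropping old data becomes a matter of dropping the oldest child tables instead of deleting rows.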
Thanks!
Questions?
@craigkerstiens

The Future of Sharding
