3. 1/24/13
Cloud-Native: Flexible, Redundant, & Infinitely Scalable by
Nature.
What Cloud Translates To
• Small to Mid-Size Instances
• Lack of Control Over Physical Resources
• No Delay in Expanding
• Limited Storage Options
• Redundancy is Easy
• Redundancy is Necessary
Saturday, May 11, 13
What Does Your Data Look Like?
• Log Data - Classic OLTP
• By Nature, It’s Distributed
• Long- & Mid-Term Storage
• Eventual vs. Immediate Consistency
• Speed of Retrieval - Classic OLAP Warehouse
Requirements
• No Data Loss
• Ability to Pause, Promote, and Update Seamlessly
• Do Not be the Bottleneck
• Allow for Out of Sequence Events
• Restate and Correct Analytics
Distributed OLTP
Disparate event types can be sent to separate instances.
Within the same event type, parallel writes don’t require synchronous multi-masters.
Batch process writes with COPY.
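As a rough illustration of batched writes, the sketch below loads a whole file of events with PostgreSQL's COPY ... FROM rather than issuing one INSERT per event. The table definition, column names, and file path are assumptions made for the example, not details from the deck.

```sql
-- Hypothetical event table; column names are assumptions for illustration.
CREATE TABLE IF NOT EXISTS event (
    event_id bigint PRIMARY KEY,
    act_id   integer NOT NULL,
    created  timestamp NOT NULL
);

-- Load a whole batch in one statement instead of one INSERT per event.
-- The path is a placeholder; the file must be readable by the server process.
COPY event (event_id, act_id, created)
FROM '/var/spool/events/batch_0001.csv'
WITH (FORMAT csv);

-- From a client without server-side file access, psql's \copy streams
-- a local file through the same mechanism:
-- \copy event (event_id, act_id, created) FROM 'batch_0001.csv' WITH (FORMAT csv)
```

Because each writer loads its own batches, several writers can run this against different instances, or different tables, without coordinating with each other.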
Vertical Sharding: Inheritance Is Your Friend!
One parent per event, sharded by time, allows for efficient querying, individual backups, and general sanity.
Constraints: Timestamp not Date
Event ID (pkey)
Explain Analyze!
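One reading of these points: when the sharding column is a timestamp, write both the CHECK constraints and the query predicates in timestamp terms, give each child its own key or index (indexes and primary keys are not inherited), and confirm with EXPLAIN ANALYZE that the planner really skips the shards it should. The sketch below assumes a parent table named event with a created column, and two hand-made daily children; all of it is illustrative.

```sql
-- Hypothetical parent and two daily children, following the deck's naming.
CREATE TABLE event (
    event_id bigint,
    act_id   integer,
    created  timestamp NOT NULL
);

CREATE TABLE event_2013_01_24 (
    PRIMARY KEY (event_id),  -- keys are not inherited, so each child declares its own
    CONSTRAINT createdck CHECK (created >= '2013-01-24' AND created < '2013-01-25')
) INHERITS (event);

CREATE TABLE event_2013_01_25 (
    PRIMARY KEY (event_id),
    CONSTRAINT createdck CHECK (created >= '2013-01-25' AND created < '2013-01-26')
) INHERITS (event);

-- 'partition' is the default setting; set here only to make the dependency explicit.
SET constraint_exclusion = partition;

-- The predicate compares the timestamp column against timestamp bounds, so the
-- planner can use the CHECK constraints to rule out event_2013_01_25.
EXPLAIN ANALYZE
SELECT count(*)
FROM event
WHERE created >= timestamp '2013-01-24 00:00'
  AND created <  timestamp '2013-01-24 12:00';
```

If the plan still scans every child, the usual suspects are a cast or function wrapped around the column in the WHERE clause, or a CHECK constraint written against a different type than the column.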
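-- Create one child table of event per day, from startday through endday.
-- The function name and signature are illustrative; the loop body is the slide's.
CREATE OR REPLACE FUNCTION create_event_shards(startday date, endday date)
RETURNS void AS $$
DECLARE
  tomorrow date; tname text; index_name text;
BEGIN
  WHILE startday <= endday LOOP
    tomorrow := startday + interval '1 day';
    index_name := 'idx_name' || replace(startday::text, '-', '');
    tname := 'event_' || to_char(startday, 'YYYY_MM_DD');
    EXECUTE 'CREATE TABLE ' || quote_ident(tname) || '() INHERITS (event)';
    -- The CHECK constraint lets the planner skip shards outside the queried range.
    EXECUTE 'ALTER TABLE ' || quote_ident(tname) || ' ADD CONSTRAINT createdck'
         || ' CHECK (created >= ' || quote_literal(startday)
         || ' AND created < ' || quote_literal(tomorrow) || ')';
    EXECUTE 'CREATE INDEX ' || quote_ident(index_name) || ' ON '
         || quote_ident(tname) || ' (act_id)';
    startday := tomorrow;
  END LOOP;
END;
$$ LANGUAGE plpgsql;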
Close The Shard
-- Once a day has passed, pin its shard to the range of event IDs it holds.
-- tbl, smin, and smax are declared in the enclosing function (not shown).
FOR tbl IN
  SELECT t.table_name
  FROM information_schema.tables t
  WHERE LEFT(t.table_name, 6) = 'event_'
    -- skip shards that have already been closed
    AND t.table_name NOT IN (SELECT table_name
                             FROM information_schema.constraint_column_usage
                             WHERE constraint_name = 'idcheck')
    -- only shards whose day is over
    AND REPLACE(REPLACE(t.table_name, 'event_', ''), '_', '-')::DATE < NOW()::DATE
LOOP
  EXECUTE 'SELECT coalesce(min(event_id), 0), coalesce(max(event_id), 0) FROM '
       || quote_ident(tbl)
    INTO smin, smax;
  EXECUTE 'ALTER TABLE ' || quote_ident(tbl)
       || ' ADD CONSTRAINT idcheck CHECK (event_id BETWEEN '
       || smin || ' AND ' || smax || ')';
END LOOP;
Star & Snowflake OLAP
Because API calls can’t time out.
Partition-able on fact, account - what works for you.
Build Your Snowflake
Quartz-Triggered Calls, Three-Phase Process:
• Pull Aggregates
• Add/Update Base Fact
• Cascade Updates Through Dimensions
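The second and third phases could look roughly like the following, assuming the first phase has already landed pulled aggregates in a staging table. Every table and column name here is hypothetical, the daily rollup stands in for whatever coarser levels the snowflake actually has, and ON CONFLICT requires PostgreSQL 9.5 or later.

```sql
-- All table and column names below are hypothetical.
CREATE TABLE pulled_batch (account_id int, bucket_hour timestamp, event_count int);
CREATE TABLE fact_hourly  (account_id int, bucket_hour timestamp, event_count bigint,
                           PRIMARY KEY (account_id, bucket_hour));
CREATE TABLE rollup_daily (account_id int, bucket_day date, event_count bigint,
                           PRIMARY KEY (account_id, bucket_day));

BEGIN;

-- Phase 1 (pull aggregates) is assumed to have filled pulled_batch already.

-- Phase 2: add new rows to the base fact, or update rows that already exist.
INSERT INTO fact_hourly AS f (account_id, bucket_hour, event_count)
SELECT account_id, bucket_hour, sum(event_count)
FROM pulled_batch
GROUP BY account_id, bucket_hour
ON CONFLICT (account_id, bucket_hour)
DO UPDATE SET event_count = f.event_count + EXCLUDED.event_count;

-- Phase 3: cascade the change by recomputing every day touched by the batch.
INSERT INTO rollup_daily AS d (account_id, bucket_day, event_count)
SELECT f.account_id, f.bucket_hour::date, sum(f.event_count)
FROM fact_hourly f
WHERE (f.account_id, f.bucket_hour::date) IN
      (SELECT account_id, bucket_hour::date FROM pulled_batch)
GROUP BY f.account_id, f.bucket_hour::date
ON CONFLICT (account_id, bucket_day)
DO UPDATE SET event_count = EXCLUDED.event_count;

COMMIT;
```

Recomputing the affected days from the base fact, instead of adding deltas at every level, keeps the coarser levels correct even when a batch carries late or corrected events.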
Driven by Sequential ID
Position Lookup, Pull, & Update in a Single Transaction
Save State for Performance Analysis
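A minimal sketch of that pattern, under the assumption that the last processed ID is kept in a small bookkeeping table: the function looks up the position, determines the next ID range, advances the position, and logs the run, all inside the caller's transaction. The tables etl_position, etl_run_log, and staged_events, and the function pull_next_batch, are hypothetical names invented for this example.

```sql
-- Hypothetical bookkeeping tables; names and columns are assumptions.
CREATE TABLE etl_position (
    source  text PRIMARY KEY,
    last_id bigint NOT NULL
);
CREATE TABLE etl_run_log (
    source      text,
    from_id     bigint,
    to_id       bigint,
    rows_pulled bigint,
    started_at  timestamptz,
    finished_at timestamptz
);
CREATE TABLE staged_events (id bigint, payload text);  -- stand-in for the source data

CREATE OR REPLACE FUNCTION pull_next_batch(p_source text) RETURNS bigint AS $$
DECLARE
    v_from    bigint;
    v_to      bigint;
    v_rows    bigint;
    v_started timestamptz := clock_timestamp();
BEGIN
    -- Look up the position and lock its row so concurrent runs serialize.
    SELECT last_id INTO STRICT v_from
    FROM etl_position WHERE source = p_source FOR UPDATE;

    -- Find the ID range to pull; a real job would read the OLTP source here.
    SELECT coalesce(max(id), v_from), count(*) INTO v_to, v_rows
    FROM staged_events WHERE id > v_from;

    -- Advance the position and save the run's state in the same transaction.
    UPDATE etl_position SET last_id = v_to WHERE source = p_source;
    INSERT INTO etl_run_log
    VALUES (p_source, v_from, v_to, v_rows, v_started, clock_timestamp());

    RETURN v_rows;
END;
$$ LANGUAGE plpgsql;

-- Usage: seed the position once, then call the function on each scheduled run.
INSERT INTO etl_position VALUES ('events', 0);
SELECT pull_next_batch('events');
```

If anything fails midway, the position update rolls back with everything else, so the next run simply retries the same ID range.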
Aggregate on Pull
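-- Pull hourly aggregates from the source instance over dblink.
-- "some" and thing are placeholder columns; SOME is a reserved word, so it is quoted.
CREATE TABLE source_epoch AS
SELECT * FROM dblink(
  'dbname=src host=127.0.0.2 port=5432 user=rickastley password=never_gonna_give_you_up',
  'SELECT "some", thing, COUNT(id),
          DATE_TRUNC(''hour'', TO_TIMESTAMP(date)),
          MAX(id) AS id
     FROM events
    WHERE id > 50823461           -- last ID processed by the previous pull
    GROUP BY 1, 2, 4              -- the hour bucket must be grouped as well
    ORDER BY id'
) AS t("some" TEXT, thing INT, event_count INT, hour TIMESTAMP, id INT);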
The Defense
Bugs from the entire system will find their way through to your warehouse. Do not always trust your data.
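One defensive habit, sketched here with a hypothetical staging table, is to screen each pulled batch for rows that cannot be right before they reach the facts. The specific rules below are assumptions chosen for illustration; real checks depend on what the upstream systems can get wrong.

```sql
-- Hypothetical staging table holding a freshly pulled batch.
CREATE TABLE pulled_batch (account_id int, bucket_hour timestamp, event_count int);

-- Rows that fail basic sanity rules; an empty result means the batch passes them.
SELECT *
FROM pulled_batch
WHERE account_id IS NULL
   OR bucket_hour IS NULL
   OR event_count IS NULL
   OR event_count < 0
   OR bucket_hour > now() + interval '1 hour';  -- timestamps from the future
```

Rows caught this way can be parked in a quarantine table and restated later, once the upstream bug is fixed.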
What It All Looks Like
[Architecture diagram, built up over five slides. Labels: two Writers; OLTP instances for event types TypeA, TypeB, and TypeC; an OLAP warehouse (WH Type1); a Query Head. Later builds show several OLTP instances each for TypeA and TypeC, and the final build adds a second OLAP WH Type1 and a second Query Head.]