0
Developing polyglot persistence applications
Chris Richardson
Author of POJOs in Action
Founder of the original CloudFound...
@crichardson
Presentation goal
The benefits and drawbacks of polyglot
persistence
and
How to design applications that use t...
@crichardson
About Chris
@crichardson
(About Chris)
@crichardson
About Chris()
@crichardson
About Chris
@crichardson
About Chris
http://www.theregister.co.uk/2009/08/19/springsource_cloud_foundry/
@crichardson
Agenda
• Why polyglot persistence?
• Optimizing queries using Redis materialized views
• Synchronizing MySQL ...
@crichardson
FoodTo Go Architecture - 2006
Order taking
Restaurant
Management
MySQL
Database
Hungry
Consumer Restaurant
Su...
@crichardson
Solution: Spend Money
http://upload.wikimedia.org/wikipedia/commons/e/e5/Rising_Sun_Yacht.JPG
OR
http://www.t...
@crichardson
Solution: Use NoSQL
Benefits
• Higher performance
• Higher scalability
• Richer data-model
• Schema-less
Drawb...
@crichardson
Example NoSQL Databases
Database Key features
Cassandra
Extensible column store, very
scalable, distributed
M...
@crichardson
Redis: Fast and Scalable
Redis master
Redis slave Redis slave
File
system
ClientIn-memory
= FAST
Durable
Redi...
@crichardson
Redis: advanced key/value store
K1 V1
K2 V2
... ...
String SET, GET
List LPUSH, LPOP, ...
Set SADD, SMEMBERS,...
@crichardson
Sorted sets
myset
a
5.0
b
10.0
Key
Value
ScoreMembers are sorted
by score
@crichardson
Adding members to a sorted set
Redis Server
zadd myset 5.0 a myset
a
5.0
Key Score Value
@crichardson
Adding members to a sorted set
Redis Server
zadd myset 10.0 b myset
a
5.0
b
10.0
@crichardson
Adding members to a sorted set
Redis Server
zadd myset 1.0 c myset
a
5.0
b
10.0
c
1.0
@crichardson
Retrieving members by index range
Redis Server
zrange myset 0 1
myset
a
5.0
b
10.0
c
1.0
ac
Key
Start
Index
E...
@crichardson
Retrieving members by score
Redis Server
zrangebyscore myset 1 6
myset
a
5.0
b
10.0
c
1.0
ac
Key
Min
value
Ma...
@crichardson
Redis is great but there are tradeoffs
• No rich queries: PK-based access only
• Limited transaction model:
•...
@crichardson
The future is polyglot
IEEE Software Sept/October 2010 - Debasish Ghosh / Twitter @debasishg
e.g. Netflix
• RD...
@crichardson
Benefits
RDBMS
ACID transactions
Rich queries
...
+
NoSQL
Performance
Scalability
...
@crichardson
Drawbacks
Complexity
Development
Operational
@crichardson
Polyglot Pattern: Cache
Order taking
Restaurant
Management
MySQL
Database
CONSUMER RESTAURANT
OWNER
Redis
Cac...
@crichardson
Redis:Timeline
Polyglot Pattern: write to SQL and NoSQL
Follower 1
Follower 2
Follower 3 ...
LPUSHX
LTRIM
MyS...
@crichardson
Polyglot Pattern: replicate from MySQL to
NoSQL
Query
Service
Update
Service
MySQL
Database
Reader Writer
Red...
@crichardson
Agenda
• Why polyglot persistence?
• Optimizing queries using Redis materialized views
• Synchronizing MySQL ...
@crichardson
Finding available restaurants
Available restaurants =
Serve the zip code of the delivery address
AND
Are open...
@crichardson
Database schema
ID Name …
1 Ajanta
2 Montclair Eggshop
Restaurant_id zipcode
1 94707
1 94619
2 94611
2 94619
...
@crichardson
Finding available restaurants on Monday, 6.15pm for 94619 zipcode
Straightforward three-way join
select r.*
f...
@crichardson
How to scale this query?
• Query caching is usually ineffective
• MySQL Master/slave replication is straightf...
@crichardson
Query denormalized copy stored in Redis
Order taking
Restaurant
Management
MySQL
Database
CONSUMER
RESTAURANT...
@crichardson
BUT how to implement findAvailableRestaurants() with Redis?!
?
select r.*
from restaurant r
inner join restaur...
@crichardson
ZRANGEBYSCORE = Simple SQL Query
ZRANGEBYSCORE myset 1 6
select value
from sorted_set
where key = ‘myset’
and...
@crichardson
select r.*
from restaurant r
inner join restaurant_time_range tr
on r.id =tr.restaurant_id
inner join restaur...
@crichardson
Simplification #1: Denormalization
Restaurant_id Day_of_week Open_time Close_time Zip_code
1 Monday 1130 1430 ...
@crichardson
Simplification #2:Application filtering
SELECT restaurant_id, open_time
FROM time_range_zip_code
WHERE day_of_w...
@crichardson
Simplification #3: Eliminate multiple =’s with concatenation
SELECT restaurant_id, open_time
FROM time_range_z...
@crichardson
Simplification #4: Eliminate multiple SELECTVALUES with
concatenation
SELECT open_time_restaurant_id,
FROM tim...
@crichardson
zip_dow open_time_restaurant_id close_time
94707:Monday 1130_1 1430
94619:Monday 1130_1 1430
94707:Monday 173...
@crichardson
Finding restaurants that have not yet closed
ZRANGEBYSCORE 94619:Monday 1815 2359
1730 is before 1815 è Ajan...
@crichardson
NoSQL Denormalized representation for
each query
@crichardson
SorryTed!
http://en.wikipedia.org/wiki/Edgar_F._Codd
@crichardson
Agenda
• Why polyglot persistence?
• Optimizing queries using Redis materialized views
• Synchronizing MySQL ...
@crichardson
MySQL & Redis
need to be consistent
@crichardson
Two-Phase commit is not an option
• Redis does not support it
• Even if it did, 2PC is best avoided http://ww...
@crichardson
Atomic
Consistent
Isolated
Durable
Basically Available
Soft state
Eventually consistent
BASE:An Acid Alternat...
@crichardson
Updating Redis #FAIL
begin MySQL transaction
update MySQL
update Redis
rollback MySQL transaction
Redis has u...
@crichardson
Updating Redis reliably
Step 1 of 2
begin MySQL transaction
update MySQL
queue CRUD event in MySQL
commit tra...
@crichardson
Updating Redis reliably
Step 2 of 2
for each CRUD event in MySQL queue
get next CRUD event from MySQL queue
I...
@crichardson
Generating CRUD events
• Explicit code
• Hibernate event listener
• Service-layer aspect
• CQRS/Event-sourcin...
@crichardson
ID JSON processed?
ENTITY_CRUD_EVENT
EntityCrudEvent
Processor
Redis
Updater
apply(event)
Step 1 Step 2
INSER...
@crichardson
Representing restaurants in Redis
...JSON...
restaurant:lastSeenEventId:1
restaurant:1
55
94619:Monday [0700_...
@crichardson
Updating Redis
WATCH restaurant:lastSeenEventId:≪restaurantId≫
Optimistic
locking
Duplicate
detection
lastSee...
@crichardson
Summary
• Each SQL/NoSQL database = set of tradeoffs
• Polyglot persistence: leverage the strengths of SQL an...
@crichardson
@crichardson chris@chrisrichardson.net
http://plainoldobjects.com - code and slides
Upcoming SlideShare
Loading in...5
×

Developing polyglot persistence applications (oscon oscon2013)

1,158

Published on

NoSQL databases such as Redis, MongoDB and Cassandra are emerging as a compelling choice for many applications. They can simplify the persistence of complex data models and offer significantly better scalability and performance. However, using a NoSQL database means giving up the benefits of the relational model such as SQL, constraints and ACID transactions. For some applications, the solution is polyglot persistence: using SQL and NoSQL databases together.

In this talk, you will learn about the benefits and drawbacks of polyglot persistence and how to design applications that use this approach. We will explore the architecture and implementation of an example application that uses MySQL as the system of record and Redis as a very high-performance database that handles queries from the front-end. You will learn about mechanisms for maintaining consistency across the various databases.

Published in: Technology
0 Comments
2 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
1,158
On Slideshare
0
From Embeds
0
Number of Embeds
2
Actions
Shares
0
Downloads
23
Comments
0
Likes
2
Embeds 0
No embeds

No notes for slide

Transcript of "Developing polyglot persistence applications (oscon oscon2013)"

  1. 1. Developing polyglot persistence applications Chris Richardson Author of POJOs in Action Founder of the original CloudFoundry.com @crichardson chris@chrisrichardson.net http://plainoldobjects.com
  2. 2. @crichardson Presentation goal The benefits and drawbacks of polyglot persistence and How to design applications that use this approach
  3. 3. @crichardson About Chris
  4. 4. @crichardson (About Chris)
  5. 5. @crichardson About Chris()
  6. 6. @crichardson About Chris
  7. 7. @crichardson About Chris http://www.theregister.co.uk/2009/08/19/springsource_cloud_foundry/
  8. 8. @crichardson Agenda • Why polyglot persistence? • Optimizing queries using Redis materialized views • Synchronizing MySQL and Redis
  9. 9. @crichardson FoodTo Go Architecture - 2006 Order taking Restaurant Management MySQL Database Hungry Consumer Restaurant Success = inadequate
  10. 10. @crichardson Solution: Spend Money http://upload.wikimedia.org/wikipedia/commons/e/e5/Rising_Sun_Yacht.JPG OR http://www.trekbikes.com/us/en/bikes/road/race_performance/madone_5_series/madone_5_2/#
  11. 11. @crichardson Solution: Use NoSQL Benefits • Higher performance • Higher scalability • Richer data-model • Schema-less Drawbacks • Limited transactions • Limited querying • Relaxed consistency • Unconstrained data
  12. 12. @crichardson Example NoSQL Databases Database Key features Cassandra Extensible column store, very scalable, distributed MongoDB Document-oriented, fast, scalable Redis Key-value store, very fast http://nosql-database.org/ lists 122+ NoSQL databases
  13. 13. @crichardson Redis: Fast and Scalable Redis master Redis slave Redis slave File system ClientIn-memory = FAST Durable Redis master Redis slave Redis slave File system Consistent Hashing
  14. 14. @crichardson Redis: advanced key/value store K1 V1 K2 V2 ... ... String SET, GET List LPUSH, LPOP, ... Set SADD, SMEMBERS, .. Sorted Set Hashes ZADD, ZRANGE HSET, HGET
  15. 15. @crichardson Sorted sets myset a 5.0 b 10.0 Key Value ScoreMembers are sorted by score
  16. 16. @crichardson Adding members to a sorted set Redis Server zadd myset 5.0 a myset a 5.0 Key Score Value
  17. 17. @crichardson Adding members to a sorted set Redis Server zadd myset 10.0 b myset a 5.0 b 10.0
  18. 18. @crichardson Adding members to a sorted set Redis Server zadd myset 1.0 c myset a 5.0 b 10.0 c 1.0
  19. 19. @crichardson Retrieving members by index range Redis Server zrange myset 0 1 myset a 5.0 b 10.0 c 1.0 ac Key Start Index End Index
  20. 20. @crichardson Retrieving members by score Redis Server zrangebyscore myset 1 6 myset a 5.0 b 10.0 c 1.0 ac Key Min value Max value
  21. 21. @crichardson Redis is great but there are tradeoffs • No rich queries: PK-based access only • Limited transaction model: • Read first and then execute updates as batch • Difficult to compose code • Data must fit in memory • Single-threaded server: run multiple with client-side sharding • Missing features such as access control, ... Hope you don’t need ACID or rich queries one day
  22. 22. @crichardson The future is polyglot IEEE Software Sept/October 2010 - Debasish Ghosh / Twitter @debasishg e.g. Netflix • RDBMS • SimpleDB • Cassandra • Hadoop/Hbase
  23. 23. @crichardson Benefits RDBMS ACID transactions Rich queries ... + NoSQL Performance Scalability ...
  24. 24. @crichardson Drawbacks Complexity Development Operational
  25. 25. @crichardson Polyglot Pattern: Cache Order taking Restaurant Management MySQL Database CONSUMER RESTAURANT OWNER Redis Cache First Second
  26. 26. @crichardson Redis:Timeline Polyglot Pattern: write to SQL and NoSQL Follower 1 Follower 2 Follower 3 ... LPUSHX LTRIM MySQL Tweets Followers INSERT NEW TWEET Follows Avoids expensive MySQL joins
  27. 27. @crichardson Polyglot Pattern: replicate from MySQL to NoSQL Query Service Update Service MySQL Database Reader Writer Redis System of RecordMaterialized view query() update() replicate and denormalize
  28. 28. @crichardson Agenda • Why polyglot persistence? • Optimizing queries using Redis materialized views • Synchronizing MySQL and Redis
  29. 29. @crichardson Finding available restaurants Available restaurants = Serve the zip code of the delivery address AND Are open at the delivery time public interface AvailableRestaurantRepository { List<AvailableRestaurant> findAvailableRestaurants(Address deliveryAddress, Date deliveryTime); ... }
  30. 30. @crichardson Database schema ID Name … 1 Ajanta 2 Montclair Eggshop Restaurant_id zipcode 1 94707 1 94619 2 94611 2 94619 Restaurant_id dayOfWeek openTime closeTime 1 Monday 1130 1430 1 Monday 1730 2130 2 Tuesday 1130 … RESTAURANT table RESTAURANT_ZIPCODE table RESTAURANT_TIME_RANGE table
  31. 31. @crichardson Finding available restaurants on Monday, 6.15pm for 94619 zipcode Straightforward three-way join select r.* from restaurant r inner join restaurant_time_range tr on r.id =tr.restaurant_id inner join restaurant_zipcode sa on r.id = sa.restaurant_id where ’94619’ = sa.zip_code and tr.day_of_week=’monday’ and tr.openingtime <= 1815 and 1815 <= tr.closingtime
  32. 32. @crichardson How to scale this query? • Query caching is usually ineffective • MySQL Master/slave replication is straightforward BUT • Complexity of slaves • Risk of replication failing • Limitations of MySQL master/slave replication
  33. 33. @crichardson Query denormalized copy stored in Redis Order taking Restaurant Management MySQL Database CONSUMER RESTAURANT OWNER Redis System of RecordCopy findAvailable() update()
  34. 34. @crichardson BUT how to implement findAvailableRestaurants() with Redis?! ? select r.* from restaurant r inner join restaurant_time_range tr on r.id =tr.restaurant_id inner join restaurant_zipcode sa on r.id = sa.restaurant_id where ’94619’ = sa.zip_code and tr.day_of_week=’monday’ and tr.openingtime <= 1815 and 1815 <= tr.closingtime K1 V1 K2 V2 ... ...
  35. 35. @crichardson ZRANGEBYSCORE = Simple SQL Query ZRANGEBYSCORE myset 1 6 select value from sorted_set where key = ‘myset’ and score >= 1 and score <= 6 = key value score sorted_set
  36. 36. @crichardson select r.* from restaurant r inner join restaurant_time_range tr on r.id =tr.restaurant_id inner join restaurant_zipcode sa on r.id = sa.restaurant_id where ’94619’ = sa.zip_code and tr.day_of_week=’monday’ and tr.openingtime <= 1815 and 1815 <= tr.closingtime select value from sorted_set where key = ? and score >= ? and score <= ? ? How to transform the SELECT statement? We need to denormalize
  37. 37. @crichardson Simplification #1: Denormalization Restaurant_id Day_of_week Open_time Close_time Zip_code 1 Monday 1130 1430 94707 1 Monday 1130 1430 94619 1 Monday 1730 2130 94707 1 Monday 1730 2130 94619 2 Monday 0700 1430 94619 … SELECT restaurant_id FROM time_range_zip_code WHERE day_of_week = ‘Monday’ AND zip_code = 94619 AND 1815 < close_time AND open_time < 1815 Simpler query: § No joins § Two = and two <
  38. 38. @crichardson Simplification #2:Application filtering SELECT restaurant_id, open_time FROM time_range_zip_code WHERE day_of_week = ‘Monday’ AND zip_code = 94619 AND 1815 < close_time AND open_time < 1815 Even simpler query • No joins • Two = and one <
  39. 39. @crichardson Simplification #3: Eliminate multiple =’s with concatenation SELECT restaurant_id, open_time FROM time_range_zip_code WHERE zip_code_day_of_week = ‘94619:Monday’ AND 1815 < close_time Restaurant_id Zip_dow Open_time Close_time 1 94707:Monday 1130 1430 1 94619:Monday 1130 1430 1 94707:Monday 1730 2130 1 94619:Monday 1730 2130 2 94619:Monday 0700 1430 … key range
  40. 40. @crichardson Simplification #4: Eliminate multiple SELECTVALUES with concatenation SELECT open_time_restaurant_id, FROM time_range_zip_code WHERE zip_code_day_of_week = ‘94619:Monday’ AND 1815 < close_time zip_dow open_time_restaurant_id close_time 94707:Monday 1130_1 1430 94619:Monday 1130_1 1430 94707:Monday 1730_1 2130 94619:Monday 1730_1 2130 94619:Monday 0700_2 1430 ... ✔
  41. 41. @crichardson zip_dow open_time_restaurant_id close_time 94707:Monday 1130_1 1430 94619:Monday 1130_1 1430 94707:Monday 1730_1 2130 94619:Monday 1730_1 2130 94619:Monday 0700_2 1430 ... Sorted Set [ Entry:Score, …] [1130_1:1430, 1730_1:2130] [0700_2:1430, 1130_1:1430, 1730_1:2130] Using a Redis sorted set as an index Key 94619:Monday 94707:Monday 94619:Monday 0700_2 1430
  42. 42. @crichardson Finding restaurants that have not yet closed ZRANGEBYSCORE 94619:Monday 1815 2359 1730 is before 1815 è Ajanta is open Delivery timeDelivery zip and day Sorted Set [ Entry:Score, …] [1130_1:1430, 1730_1:2130] [0700_2:1430, 1130_1:1430, 1730_1:2130] Key 94619:Monday 94707:Monday {1730_1}
  43. 43. @crichardson NoSQL Denormalized representation for each query
  44. 44. @crichardson SorryTed! http://en.wikipedia.org/wiki/Edgar_F._Codd
  45. 45. @crichardson Agenda • Why polyglot persistence? • Optimizing queries using Redis materialized views • Synchronizing MySQL and Redis
  46. 46. @crichardson MySQL & Redis need to be consistent
  47. 47. @crichardson Two-Phase commit is not an option • Redis does not support it • Even if it did, 2PC is best avoided http://www.infoq.com/articles/ebay-scalability-best-practices
  48. 48. @crichardson Atomic Consistent Isolated Durable Basically Available Soft state Eventually consistent BASE:An Acid Alternative http://queue.acm.org/detail.cfm?id=1394128
  49. 49. @crichardson Updating Redis #FAIL begin MySQL transaction update MySQL update Redis rollback MySQL transaction Redis has update MySQL does not begin MySQL transaction update MySQL commit MySQL transaction <<system crashes>> update Redis MySQL has update Redis does not
  50. 50. @crichardson Updating Redis reliably Step 1 of 2 begin MySQL transaction update MySQL queue CRUD event in MySQL commit transaction ACID Event Id Operation: Create, Update, Delete New entity state, e.g. JSON
  51. 51. @crichardson Updating Redis reliably Step 2 of 2 for each CRUD event in MySQL queue get next CRUD event from MySQL queue If CRUD event is not duplicate then Update Redis (incl. eventId) end if begin MySQL transaction mark CRUD event as processed commit transaction
  52. 52. @crichardson Generating CRUD events • Explicit code • Hibernate event listener • Service-layer aspect • CQRS/Event-sourcing • Database triggers • Reading the MySQL binlog
  53. 53. @crichardson ID JSON processed? ENTITY_CRUD_EVENT EntityCrudEvent Processor Redis Updater apply(event) Step 1 Step 2 INSERT INTO ... Timer SELECT ... FROM ... Redis
  54. 54. @crichardson Representing restaurants in Redis ...JSON... restaurant:lastSeenEventId:1 restaurant:1 55 94619:Monday [0700_2:1430, 1130_1:1430, ...] 94619:Tuesday [0700_2:1430, 1130_1:1430, ...] duplicate detection
  55. 55. @crichardson Updating Redis WATCH restaurant:lastSeenEventId:≪restaurantId≫ Optimistic locking Duplicate detection lastSeenEventId = GET restaurant:lastSeenEventId:≪restaurantId≫ if (lastSeenEventId >= eventId) return; Transaction MULTI SET restaurant:lastSeenEventId:≪restaurantId≫ eventId ... SET restaurant:≪restaurantId≫ ≪restaurantJSON≫ ZADD ≪ZipCode≫:≪DayOfTheWeek≫ ... ... update the restaurant data... EXEC
  56. 56. @crichardson Summary • Each SQL/NoSQL database = set of tradeoffs • Polyglot persistence: leverage the strengths of SQL and NoSQL databases • Use Redis as a distributed cache • Store denormalized data in Redis for fast querying • Reliable database synchronization required
  57. 57. @crichardson @crichardson chris@chrisrichardson.net http://plainoldobjects.com - code and slides
  1. A particular slide catching your eye?

    Clipping is a handy way to collect important slides you want to go back to later.

×