Developing polyglot persistence applications (svcc, svcc2013)

1,246 views

Published on

NoSQL databases such as Redis, MongoDB and Cassandra are emerging as a compelling choice for many applications. They can simplify the persistence of complex data models and offer significantly better scalability and performance. However, using a NoSQL database means giving up the benefits of the relational model such as SQL, constraints and ACID transactions. For some applications, the solution is polyglot persistence: using SQL and NoSQL databases together. In this talk, you will learn about the benefits and drawbacks of polyglot persistence and how to design applications that use this approach. We will explore the architecture and implementation of an example application that uses MySQL as the system of record and Redis as a very high-performance database that handles queries from the front-end. You will learn about mechanisms for maintaining consistency across the various databases.

Published in: Technology
0 Comments
6 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
1,246
On SlideShare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
47
Comments
0
Likes
6
Embeds 0
No embeds

No notes for slide

Developing polyglot persistence applications (svcc, svcc2013)

  1. 1. DEVELOPING POLYGLOT PERSISTENCE APPLICATIONS Chris Richardson Author of POJOs in Action Founder of the original CloudFoundry.com @crichardson chris@chrisrichardson.net http://plainoldobjects.com
  2. 2. @crichardson Presentation goal The benefits and drawbacks of polyglot persistence and How to design applications that use this approach
  3. 3. @crichardson About Chris
  4. 4. @crichardson (About Chris)
  5. 5. @crichardson About Chris()
  6. 6. @crichardson About Chris
  7. 7. @crichardson About Chris http://www.theregister.co.uk/2009/08/19/springsource_cloud_foundry/
  8. 8. @crichardson About Chris
  9. 9. @crichardson Agenda • Why polyglot persistence? • Using Redis as a cache • Optimizing queries using Redis materialized views • Synchronizing MySQL and Redis • Tracking changes to entities
  10. 10. @crichardson Food to Go • Take-out food delivery service • “Launched” in 2006
  11. 11. @crichardson FoodTo Go Architecture Order taking Restaurant Management MySQL Database CONSUMER RESTAURANT OWNER
  12. 12. @crichardson Success Growth challenges • Increasing traffic • Increasing data volume • Distribute across a few data centers • Increasing domain model complexity
  13. 13. @crichardson Limitations of relational databases • Scalability • Distribution • Schema updates • O/R impedance mismatch • Handling semi-structured data
  14. 14. @crichardson Solution: Spend $$$ - high-end databases and servers http://www.powerandmotoryacht.com/megayachts/megayacht-musashi
  15. 15. @crichardson Solution: Spend $$$ - open- source stack + DevOps people http://www.trekbikes.com/us/en/bikes/road/race_performance/madone_5_series/madone_5_2/#
  16. 16. @crichardson Solution: Use NoSQL Benefits • Higher performance • Higher scalability • Richer data-model • Schema-less Drawbacks • Limited transactions • Limited querying • Relaxed consistency • Unconstrained data
  17. 17. @crichardson Example NoSQL Databases Database Key features Cassandra Extensible column store, very scalable, distributed MongoDB Document-oriented, fast, scalable Redis Key-value store, very fast http://nosql-database.org/ lists 122+ NoSQL databases
  18. 18. @crichardson Redis: Fast and Scalable Redis master Redis slave Redis slave File system Client In-memory = FAST Durable Redis master Redis slave Redis slave File system Consistent Hashing
  19. 19. @crichardson Redis: advanced key/value store K1 V1 K2 V2 ... ... String SET, GET List LPUSH, LPOP, ... Set SADD, SMEMBERS, .. Sorted Set Hashes ZADD, ZRANGE HSET, HGET
  20. 20. @crichardson Sorted sets myset a 5.0 b 10. Key Value ScoreMembers are sorted by score
  21. 21. @crichardson Adding members to a sorted set Redis Server zadd myset 5.0 a myset a 5.0 Key Score Value
  22. 22. @crichardson Adding members to a sorted set Redis Server zadd myset 10.0 b myset a 5.0 b 10.
  23. 23. @crichardson Adding members to a sorted set Redis Server zadd myset 1.0 c myset a 5.0 b 10. c 1.0
  24. 24. @crichardson Retrieving members by index range Redis Server zrange myset 0 1 myset a 5.0 b 10. c 1.0 ac Key Start Index End Index
  25. 25. @crichardson Retrieving members by score Redis Server zrangebyscore myset 1 6 myset a 5.0 b 10. c 1.0 ac Key Min value Max value
  26. 26. @crichardson Redis use cases • Replacement for Memcached • Session state • Cache of data retrieved from system of record (SOR) • Replica of SOR for queries needing high-performance • Handling tasks that overload an RDBMS • Hit counts - INCR • Most recent N items - LPUSH and LTRIM • Randomly selecting an item – SRANDMEMBER • Queuing – Lists with LPOP, RPUSH, …. • High score tables – Sorted sets and ZINCRBY • …
  27. 27. @crichardson The future is polyglot IEEE Software Sept/October 2010 - Debasish Ghosh / Twitter @debasishg e.g. Netflix • RDBMS • SimpleDB • Cassandra • Hadoop/Hbase
  28. 28. @crichardson Benefits RDBMS ACID transactions Rich queries ... + NoSQL Performance Scalability ...
  29. 29. @crichardson Drawbacks Complexity Development Operational
  30. 30. @crichardson Polyglot Pattern: Cache Order taking Restaurant Management MySQL Database CONSUMER RESTAURANT OWNER Redis Cache First Second
  31. 31. @crichardson Redis:Timeline Polyglot Pattern: write to SQL and NoSQL Follower 1 Follower 2 Follower 3 ... LPUSHX LTRIM MySQL Tweets Followers INSERT NEW TWEET Follows Avoids expensive MySQL joins
  32. 32. @crichardson Polyglot Pattern: replicate from MySQL to NoSQL Query Service Update Service MySQL Database Reader Writer Redis System of RecordMaterialized view query() update() replicate and denormalize
  33. 33. @crichardson Agenda • Why polyglot persistence? • Using Redis as a cache • Optimizing queries using Redis materialized views • Synchronizing MySQL and Redis • Tracking changes to entities
  34. 34. @crichardson Food to Go – Domain model (partial) class Restaurant { long id; String name; Set<String> serviceArea; Set<TimeRange> openingHours; List<MenuItem> menuItems; } class MenuItem { String name; double price; } class TimeRange { long id; int dayOfWeek; int openTime; int closeTime; }
  35. 35. @crichardson Database schema ID Name … 1 Ajanta 2 Montclair Eggshop Restaurant_id zipcode 1 94707 1 94619 2 94611 2 94619 Restaurant_id dayOfWeek openTime closeTime 1 Monday 1130 1430 1 Monday 1730 2130 2 Tuesday 1130 … RESTAURANT table RESTAURANT_ZIPCODE table RESTAURANT_TIME_RANGE table
  36. 36. @crichardson RestaurantRepository public interface RestaurantRepository { void addRestaurant(Restaurant restaurant); Restaurant findById(long id); ... } Food To Go will have scaling eventually issues
  37. 37. @crichardson Increase scalability by caching Order taking Restaurant Management MySQL Database CONSUMER RESTAURANT OWNER Cache
  38. 38. @crichardson Caching Options • Where: • Hibernate 2nd level cache • Explicit calls from application code • Caching aspect • Cache technologies: Ehcache, Memcached, Infinispan, ... Redis is also an option
  39. 39. @crichardson Using Redis as a cache • Spring 3.1 cache abstraction • Annotations specify which methods to cache • CacheManager - pluggable back-end cache • Spring Data for Redis • Simplifies the development of Redis applications • Provides RedisTemplate (analogous to JdbcTemplate) • Provides RedisCacheManager
  40. 40. @crichardson Using Spring 3.1 Caching @Service public class RestaurantManagementServiceImpl implements RestaurantManagementService { private final RestaurantRepository restaurantRepository; @Autowired public RestaurantManagementServiceImpl(RestaurantRepository restaurantRepository) { this.restaurantRepository = restaurantRepository; } @Override public void add(Restaurant restaurant) { restaurantRepository.add(restaurant); } @Override @Cacheable(value = "Restaurant") public Restaurant findById(int id) { return restaurantRepository.findRestaurant(id); } @Override @CacheEvict(value = "Restaurant", key="#restaurant.id") public void update(Restaurant restaurant) { restaurantRepository.update(restaurant); } Cache result Evict from cache
  41. 41. @crichardson Configuring the Redis Cache Manager <cache:annotation-driven /> <bean id="cacheManager" class="org.springframework.data.redis.cache.RedisCacheManager" > <constructor-arg ref="restaurantTemplate"/> </bean> Enables caching Specifies CacheManager implementation The RedisTemplate used to access Redis
  42. 42. @crichardson TimeRange MenuItem Domain object to key-value mapping? Restaurant TimeRange MenuItem ServiceArea K1 V1 K2 V2 ... ...
  43. 43. @crichardson RedisTemplate handles the mapping • Principal API provided by Spring Data to Redis • Analogous to JdbcTemplate • Encapsulates boilerplate code, e.g. connection management • Maps Java objects Redis byte[]’s
  44. 44. @crichardson Serializers: object byte[] • RedisTemplate has multiple serializers • DefaultSerializer - defaults to JdkSerializationRedisSerializer • KeySerializer • ValueSerializer • ...
  45. 45. @crichardson Serializing a Restaurant as JSON @Configuration public class RestaurantManagementRedisConfiguration { @Autowired private RestaurantObjectMapperFactory restaurantObjectMapperFactory; private JacksonJsonRedisSerializer<Restaurant> makeRestaurantJsonSerializer() { JacksonJsonRedisSerializer<Restaurant> serializer = new JacksonJsonRedisSerializer<Restaurant>(Restaurant.class); ... return serializer; } @Bean @Qualifier("Restaurant") public RedisTemplate<String, Restaurant> restaurantTemplate(RedisConnectionFactory factory) { RedisTemplate<String, Restaurant> template = new RedisTemplate<String, Restaurant>(); template.setConnectionFactory(factory); JacksonJsonRedisSerializer<Restaurant> jsonSerializer = makeRestaurantJsonSerializer(); template.setValueSerializer(jsonSerializer); return template; } } Serialize restaurants using Jackson JSON
  46. 46. @crichardson Caching with Redis Order taking Restaurant Management MySQL Database CONSUMER RESTAURANT OWNER Redis Cache First Second
  47. 47. @crichardson Storing restaurants in Redis ...JSON...restaurant:1
  48. 48. @crichardson Agenda • Why polyglot persistence? • Using Redis as a cache • Optimizing queries using Redis materialized views • Synchronizing MySQL and Redis • Tracking changes to entities
  49. 49. @crichardson Finding available restaurants Available restaurants = Serve the zip code of the delivery address AND Are open at the delivery time public interface AvailableRestaurantRepository { List<AvailableRestaurant> findAvailableRestaurants(Address deliveryAddress, Date deliveryTime); ... }
  50. 50. @crichardson Finding available restaurants on Monday, 6.15pm for 94619 zipcode Straightforward three-way join select r.* from restaurant r inner join restaurant_time_range tr on r.id =tr.restaurant_id inner join restaurant_zipcode sa on r.id = sa.restaurant_id where ’94619’ = sa.zip_code and tr.day_of_week=’monday’ and tr.openingtime <= 1815 and 1815 <= tr.closingtime
  51. 51. @crichardson How to scale queries?
  52. 52. @crichardson Option #1: Query caching • [ZipCode, DeliveryTime] ⇨ list of available restaurants BUT • Long tail queries • Update restaurant ⇨ Flush entire cache Ineffective
  53. 53. @crichardson Option #2: Master/Slave replication MySQL Master MySQL Slave 1 MySQL Slave 2 MySQL Slave N Writes Consistent reads Queries (Inconsistent reads)
  54. 54. @crichardson Master/Slave replication • Mostly straightforward BUT • Assumes that SQL query is efficient • Complexity of administration of slaves • Doesn’t scale writes
  55. 55. @crichardson Option #3: Redis materialized views Order taking Restaurant Management MySQL Database CONSUMER RESTAURANT OWNER CacheRedis System of RecordCopy findAvailable() update()
  56. 56. @crichardson BUT how to implement findAvailableRestaurants() with Redis?! ? select r.* from restaurant r inner join restaurant_time_range tr on r.id =tr.restaurant_id inner join restaurant_zipcode sa on r.id = sa.restaurant_id where ’94619’ = sa.zip_code and tr.day_of_week=’monday’ and tr.openingtime <= 1815 and 1815 <= tr.closingtime K1 V1 K2 V2 ... ...
  57. 57. @crichardson Solution: Build an index using sorted sets and ZRANGEBYSCORE ZRANGEBYSCORE myset 1 6 select value from sorted_set where key = ‘myset’ and score >= 1 and score <= 6 = key value score sorted_set
  58. 58. @crichardson select r.* from restaurant r inner join restaurant_time_range tr on r.id =tr.restaurant_id inner join restaurant_zipcode sa on r.id = sa.restaurant_id where ’94619’ = sa.zip_code and tr.day_of_week=’monday’ and tr.openingtime <= 1815 and 1815 <= tr.closingtime select value from sorted_set where key = ? and score >= ? and score <= ? ? How to transform the SELECT statement?
  59. 59. @crichardson We need to denormalize Think materialized view
  60. 60. @crichardson Simplification #1: Denormalization Restaurant_id Day_of_week Open_time Close_time Zip_code 1 Monday 1130 1430 94707 1 Monday 1130 1430 94619 1 Monday 1730 2130 94707 1 Monday 1730 2130 94619 2 Monday 0700 1430 94619 … SELECT restaurant_id FROM time_range_zip_code WHERE day_of_week = ‘Monday’ AND zip_code = 94619 AND 1815 < close_time AND open_time < 1815 Simpler query: § No joins § Two = and two <
  61. 61. @crichardson Simplification #2:Application filtering SELECT restaurant_id, open_time FROM time_range_zip_code WHERE day_of_week = ‘Monday’ AND zip_code = 94619 AND 1815 < close_time AND open_time < 1815 Even simpler query • No joins • Two = and one <
  62. 62. @crichardson Simplification #3: Eliminate multiple =’s with concatenation SELECT restaurant_id, open_time FROM time_range_zip_code WHERE zip_code_day_of_week = ‘94619:Monday’ AND 1815 < close_time Restaurant_id Zip_dow Open_time Close_time 1 94707:Monday 1130 1430 1 94619:Monday 1130 1430 1 94707:Monday 1730 2130 1 94619:Monday 1730 2130 2 94619:Monday 0700 1430 … key range
  63. 63. @crichardson Simplification #4: Eliminate multiple RETURN VALUES with concatenation SELECT open_time_restaurant_id, FROM time_range_zip_code WHERE zip_code_day_of_week = ‘94619:Monday’ AND 1815 < close_time zip_dow open_time_restaurant_id close_time 94707:Monday 1130_1 1430 94619:Monday 1130_1 1430 94707:Monday 1730_1 2130 94619:Monday 1730_1 2130 94619:Monday 0700_2 1430 ... ✔
  64. 64. @crichardson zip_dow open_time_restaurant_id close_time 94707:Monday 1130_1 1430 94619:Monday 1130_1 1430 94707:Monday 1730_1 2130 94619:Monday 1730_1 2130 94619:Monday 0700_2 1430 ... Sorted Set [ Entry:Score, …] [1130_1:1430, 1730_1:2130] [0700_2:1430, 1130_1:1430, 1730_1:2130] Using a Redis sorted set as an index Key 94619:Monday 94707:Monday 94619:Monday 0700_2 1430
  65. 65. @crichardson Querying with ZRANGEBYSCORE ZRANGEBYSCORE 94619:Monday 1815 2359 è {1730_1} 1730 is before 1815 è Ajanta is open Delivery timeDelivery zip and day Sorted Set [ Entry:Score, …] [1130_1:1430, 1730_1:2130] [0700_2:1430, 1130_1:1430, 1730_1:2130] Key 94619:Monday 94707:Monday
  66. 66. @crichardson Adding a Restaurant Text @Component public class AvailableRestaurantRepositoryImpl implements AvailableRestaurantRepository { @Override public void add(Restaurant restaurant) { addRestaurantDetails(restaurant); addAvailabilityIndexEntries(restaurant); } private void addRestaurantDetails(Restaurant restaurant) { restaurantTemplate.opsForValue().set(keyFormatter.key(restaurant.getId()), restaurant); } private void addAvailabilityIndexEntries(Restaurant restaurant) { for (TimeRange tr : restaurant.getOpeningHours()) { String indexValue = formatTrId(restaurant, tr); int dayOfWeek = tr.getDayOfWeek(); int closingTime = tr.getClosingTime(); for (String zipCode : restaurant.getServiceArea()) { redisTemplate.opsForZSet().add(closingTimesKey(zipCode, dayOfWeek), indexValue, closingTime); } } } key member score Store as JSON
  67. 67. @crichardson Finding available Restaurants @Component public class AvailableRestaurantRepositoryImpl implements AvailableRestaurantRepository { @Override public List<AvailableRestaurant> findAvailableRestaurants(Address deliveryAddress, Date deliveryTime) { String zipCode = deliveryAddress.getZip(); int dayOfWeek = DateTimeUtil.dayOfWeek(deliveryTime); int timeOfDay = DateTimeUtil.timeOfDay(deliveryTime); String closingTimesKey = closingTimesKey(zipCode, dayOfWeek); Set<String> trsClosingAfter = redisTemplate.opsForZSet().rangeByScore(closingTimesKey, timeOfDay, 2359); Set<String> restaurantIds = new HashSet<String>(); for (String tr : trsClosingAfter) { String[] values = tr.split("_"); if (Integer.parseInt(values[0]) <= timeOfDay) restaurantIds.add(values[1]); } Collection<String> keys = keyFormatter.keys(restaurantIds); return availableRestaurantTemplate.opsForValue().multiGet(keys); } Find those that close after Filter out those that open after Retrieve open restaurants
  68. 68. @crichardson Representing restaurants in Redis ...JSON...restaurant:1 94619:Monday [0700_2:1430, 1130_1:1430, ...] 94619:Tuesday [0700_2:1430, 1130_1:1430, ...]
  69. 69. @crichardson NoSQL Denormalized representation for each query
  70. 70. @crichardson SorryTed! http://en.wikipedia.org/wiki/Edgar_F._Codd
  71. 71. @crichardson Agenda • Why polyglot persistence? • Using Redis as a cache • Optimizing queries using Redis materialized views • Synchronizing MySQL and Redis • Tracking changes to entities
  72. 72. @crichardson MySQL & Redis need to be consistent
  73. 73. @crichardson Two-Phase commit is not an option • Redis does not support it • Even if it did, 2PC is best avoided http://www.infoq.com/articles/ebay-scalability-best-practices
  74. 74. @crichardson Atomic Consistent Isolated Durable Basically Available Soft state Eventually consistent BASE:An Acid Alternative http://queue.acm.org/detail.cfm?id=1394128
  75. 75. @crichardson Updating Redis #FAIL begin MySQL transaction update MySQL update Redis rollback MySQL transaction Redis has update MySQL does not begin MySQL transaction update MySQL commit MySQL transaction <<system crashes>> update Redis MySQL has update Redis does not
  76. 76. @crichardson Updating Redis reliably Step 1 of 2 begin MySQL transaction update MySQL queue CRUD event in MySQL commit transaction ACID Event Id Operation: Create, Update, Delete New entity state, e.g. JSON
  77. 77. @crichardson Updating Redis reliably Step 2 of 2 for each CRUD event in MySQL queue get next CRUD event from MySQL queue If CRUD event is not duplicate then Update Redis (incl. eventId) end if begin MySQL transaction mark CRUD event as processed commit transaction
  78. 78. @crichardson ID JSON processed? ENTITY_CRUD_EVENT EntityCrudEvent Processor Redis Updater apply(event) Step 1 Step 2 EntityCrudEvent Repository INSERT INTO ... Timer SELECT ... FROM ... Redis
  79. 79. @crichardson Updating Redis WATCH restaurant:lastSeenEventId:≪restaurantId≫ Optimistic locking Duplicate detection lastSeenEventId = GET restaurant:lastSeenEventId:≪restaurantId≫ if (lastSeenEventId >= eventId) return; Transaction MULTI SET restaurant:lastSeenEventId:≪restaurantId≫ eventId ... update the restaurant data... EXEC
  80. 80. @crichardson Representing restaurants in Redis restaurant:lastSeenEventId:1 55 duplicate detection ...JSON...restaurant:1 94619:Monday [0700_2:1430, 1130_1:1430, ...] 94619:Tuesday [0700_2:1430, 1130_1:1430, ...]
  81. 81. @crichardson Agenda • Why polyglot persistence? • Using Redis as a cache • Optimizing queries using Redis materialized views • Synchronizing MySQL and Redis • Tracking changes to entities
  82. 82. @crichardson How do we generate CRUD events?
  83. 83. @crichardson Change tracking options • Explicit code • Hibernate event listener • Service-layer aspect • CQRS/Event-sourcing
  84. 84. @crichardson ID JSON processed? ENTITY_CRUD_EVENT EntityCrudEvent Repository HibernateEvent Listener
  85. 85. @crichardson Hibernate event listener public class ChangeTrackingListener implements PostInsertEventListener, PostDeleteEventListener, PostUpdateEventListener { @Autowired private EntityCrudEventRepository entityCrudEventRepository; private void maybeTrackChange(Object entity, EntityCrudEventType eventType) { if (isTrackedEntity(entity)) { entityCrudEventRepository.add(new EntityCrudEvent(eventType, entity)); } } @Override public void onPostInsert(PostInsertEvent event) { Object entity = event.getEntity(); maybeTrackChange(entity, EntityCrudEventType.CREATE); } @Override public void onPostUpdate(PostUpdateEvent event) { Object entity = event.getEntity(); maybeTrackChange(entity, EntityCrudEventType.UPDATE); } @Override public void onPostDelete(PostDeleteEvent event) { Object entity = event.getEntity(); maybeTrackChange(entity, EntityCrudEventType.DELETE); }
  86. 86. @crichardson Summary • Each SQL/NoSQL database = set of tradeoffs • Polyglot persistence: leverage the strengths of SQL and NoSQL databases • Use Redis as a distributed cache • Store denormalized data in Redis for fast querying • Reliable database synchronization required
  87. 87. @crichardson Questions? @crichardson chris@chrisrichardson.net http://plainoldobjects.com

×