Developing polyglot persistence applications (gluecon 2013)
Upcoming SlideShare
Loading in...5
×
 

Developing polyglot persistence applications (gluecon 2013)

on

  • 3,952 views

This is 30 minute GlueCon 2013 version of a much longer talk. See http://plainoldobjects.com/presentations/developing-polyglot-persistence-applications/ for other versions and the example code.

This is 30 minute GlueCon 2013 version of a much longer talk. See http://plainoldobjects.com/presentations/developing-polyglot-persistence-applications/ for other versions and the example code.

NoSQL databases such as Redis, MongoDB and Cassandra are emerging as a compelling choice for many applications. They can simplify the persistence of complex data models and offer significantly better scalability and performance. However, using a NoSQL database means giving up the benefits of the relational model such as SQL, constraints and ACID transactions. For some applications, the solution is polyglot persistence: using SQL and NoSQL databases together.

In this talk, you will learn about the benefits and drawbacks of polyglot persistence and how to design applications that use this approach. We will explore the architecture and implementation of an example application that uses MySQL as the system of record and Redis as a very high-performance database that handles queries from the front-end. You will learn about mechanisms for maintaining consistency across the various databases.

Statistics

Views

Total Views
3,952
Views on SlideShare
1,310
Embed Views
2,642

Actions

Likes
1
Downloads
36
Comments
0

4 Embeds 2,642

http://plainoldobjects.com 2639
http://translate.googleusercontent.com 1
http://www.newsblur.com 1
http://feedly.com 1

Accessibility

Categories

Upload Details

Uploaded via as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

Developing polyglot persistence applications (gluecon 2013) Developing polyglot persistence applications (gluecon 2013) Presentation Transcript

  • Developing polyglot persistenceapplicationsChris RichardsonAuthor of POJOs in ActionFounder of the original CloudFoundry.com@crichardsonchris.richardson@springsource.comhttp://plainoldobjects.com
  • @crichardsonPresentation goalThe benefits and drawbacks ofpolyglot persistenceandHow to design applications thatuse this approach
  • @crichardsonAbout Chris
  • @crichardson(About Chris)
  • @crichardsonAbout Chris()
  • @crichardsonAbout Chris
  • @crichardsonAbout Chrishttp://www.theregister.co.uk/2009/08/19/springsource_cloud_foundry/
  • @crichardsonvmc push About-ChrisDeveloper Advocate forCloudFoundry.com
  • @crichardsonAgenda• Why polyglot persistence?• Optimizing queries using Redis materialized views• Synchronizing MySQL and Redis
  • @crichardsonFoodTo Go Architecture - 2006OrdertakingRestaurantManagementMySQLDatabaseHungryConsumer RestaurantSuccess=inadequate
  • @crichardsonSolution: Spend Moneyhttp://upload.wikimedia.org/wikipedia/commons/e/e5/Rising_Sun_Yacht.JPGORhttp://www.trekbikes.com/us/en/bikes/road/race_performance/madone_5_series/madone_5_2/#
  • @crichardsonSolution: Use NoSQLBenefits• Higher performance• Higher scalability• Richer data-model• Schema-lessDrawbacks• Limited transactions• Limited querying• Relaxed consistency• Unconstrained data
  • @crichardsonExample NoSQL DatabasesDatabase Key featuresCassandraExtensible columnstore, very scalable,distributedMongoDBDocument-oriented,fast, scalableRedisKey-value store, veryfasthttp://nosql-database.org/ lists 122+ NoSQLdatabases
  • @crichardsonRedis: Fast and ScalableRedismasterRedisslaveRedisslaveFilesystemClientIn-memory= FASTDurableRedismasterRedisslaveRedisslaveFilesystemConsistent Hashing
  • @crichardsonRedis: advanced key/value storeK1 V1K2 V2... ...String SET, GETList LPUSH, LPOP, ...Set SADD, SMEMBERS, ..Sorted SetHashesZADD, ZRANGEHSET, HGET
  • @crichardsonSorted setsmyseta5.0b10.KeyValueScoreMembers aresorted by score
  • @crichardsonAdding members to a sorted setRedis Serverzadd myset 5.0 a myseta5.0Key Score Value
  • @crichardsonAdding members to a sorted setRedis Serverzadd myset 10.0 b myseta5.0b10.
  • @crichardsonAdding members to a sorted setRedis Serverzadd myset 1.0 c myseta5.0b10.c1.0
  • @crichardsonRetrieving members by index rangeRedis Serverzrange myset 0 1myseta5.0b10.c1.0acKey StartIndexEndIndex
  • @crichardsonRetrieving members by scoreRedis Serverzrangebyscore myset 1 6myseta5.0b10.c1.0acKeyMinvalueMaxvalue
  • @crichardsonRedis is great but there aretradeoffs• No rich queries: PK-based access only• Limited transaction model:• Read first and then execute updates as batch• Difficult to compose code• Data must fit in memory• Single-threaded server: run multiple with client-side sharding• Missing features such as access control, ...Hope youdon’tneedACIDor richqueriesone day
  • @crichardsonThe future is polyglotIEEE Software Sept/October 2010 - Debasish Ghosh / Twitter @debasishge.g. Netflix• RDBMS• SimpleDB• Cassandra• Hadoop/Hbase
  • @crichardsonPolyglot Pattern: CacheOrdertakingRestaurantManagementMySQLDatabaseCONSUMERRESTAURANTOWNERRedisCacheFirst Second
  • @crichardsonRedis:TimelinePolyglot Pattern: write to SQL and NoSQLFollower 1Follower 2Follower 3 ...LPUSHXLTRIMMySQLTweetsFollowersINSERTNEWTWEETFollowsAvoidsexpensiveMySQL joins
  • @crichardsonPolyglot Pattern: replicate fromMySQL to NoSQLQueryServiceUpdateServiceMySQLDatabaseReader WriterRedisSystemofRecordMaterializedview query() update()replicateanddenormalize
  • @crichardsonAgenda• Why polyglot persistence?• Optimizing queries using Redis materialized views• Synchronizing MySQL and Redis
  • @crichardsonFinding available restaurantsAvailable restaurants =Serve the zip code of the delivery addressANDAre open at the delivery timepublic interface AvailableRestaurantRepository {List<AvailableRestaurant> findAvailableRestaurants(Address deliveryAddress, Date deliveryTime);...}
  • @crichardsonDatabase schemaID Name …1 Ajanta2 Montclair EggshopRestaurant_id zipcode1 947071 946192 946112 94619Restaurant_id dayOfWeek openTime closeTime1 Monday 1130 14301 Monday 1730 21302 Tuesday 1130 …RESTAURANT tableRESTAURANT_ZIPCODE tableRESTAURANT_TIME_RANGE table
  • @crichardsonFinding available restaurants on Monday, 6.15pmfor 94619 zipcodeStraightforward three-way joinselect r.*from restaurant rinner join restaurant_time_range tron r.id =tr.restaurant_idinner join restaurant_zipcode saon r.id = sa.restaurant_idwhere ’94619’ = sa.zip_codeand tr.day_of_week=’monday’and tr.openingtime <= 1815and 1815 <= tr.closingtime
  • @crichardsonHow to scale this query?• Query caching is usually ineffective• MySQL Master/slave replication is straightforward BUT• Complexity of slaves• Risk of replication failing• Limitations of MySQL master/slave replication
  • @crichardsonQuery denormalized copy storedin RedisOrdertakingRestaurantManagementMySQLDatabaseCONSUMERRESTAURANTOWNERRedisSystemofRecordCopyfindAvailable()update()
  • @crichardsonBUT how to implement findAvailableRestaurants()with Redis?!?select r.*from restaurant rinner join restaurant_time_range tron r.id =tr.restaurant_idinner join restaurant_zipcode saon r.id = sa.restaurant_idwhere ’94619’ = sa.zip_codeand tr.day_of_week=’monday’and tr.openingtime <= 1815and 1815 <= tr.closingtimeK1 V1K2 V2... ...
  • @crichardsonZRANGEBYSCORE = SimpleSQL QueryZRANGEBYSCORE myset 1 6select valuefrom sorted_setwhere key = ‘myset’and score >= 1and score <= 6=key value scoresorted_set
  • @crichardsonselect r.*from restaurant rinner join restaurant_time_range tron r.id =tr.restaurant_idinner join restaurant_zipcode saon r.id = sa.restaurant_idwhere ’94619’ = sa.zip_codeand tr.day_of_week=’monday’and tr.openingtime <= 1815and 1815 <= tr.closingtimeselect valuefrom sorted_setwhere key = ?and score >= ?and score <= ??How to transform the SELECTstatement?We need to denormalize
  • @crichardsonSimplification #1:DenormalizationRestaurant_id Day_of_week Open_time Close_time Zip_code1 Monday 1130 1430 947071 Monday 1130 1430 946191 Monday 1730 2130 947071 Monday 1730 2130 946192 Monday 0700 1430 94619…SELECT restaurant_idFROM time_range_zip_codeWHERE day_of_week = ‘Monday’AND zip_code = 94619AND 1815 < close_timeAND open_time < 1815Simpler query:§ No joins§ Two = and two <
  • @crichardsonSimplification #2:ApplicationfilteringSELECT restaurant_id, open_timeFROM time_range_zip_codeWHERE day_of_week = ‘Monday’AND zip_code = 94619AND 1815 < close_timeAND open_time < 1815Even simpler query• No joins• Two = and one <
  • @crichardsonSimplification #3: Eliminate multiple =’s withconcatenationSELECT restaurant_id, open_timeFROM time_range_zip_codeWHERE zip_code_day_of_week = ‘94619:Monday’AND 1815 < close_timeRestaurant_id Zip_dow Open_time Close_time1 94707:Monday 1130 14301 94619:Monday 1130 14301 94707:Monday 1730 21301 94619:Monday 1730 21302 94619:Monday 0700 1430…keyrange
  • @crichardsonSimplification #4: Eliminate multiple SELECTVALUES with concatenationSELECT open_time_restaurant_id,FROM time_range_zip_codeWHERE zip_code_day_of_week = ‘94619:Monday’AND 1815 < close_timezip_dow open_time_restaurant_id close_time94707:Monday 1130_1 143094619:Monday 1130_1 143094707:Monday 1730_1 213094619:Monday 1730_1 213094619:Monday 0700_2 1430...✔
  • @crichardsonzip_dow open_time_restaurant_id close_time94707:Monday 1130_1 143094619:Monday 1130_1 143094707:Monday 1730_1 213094619:Monday 1730_1 213094619:Monday 0700_2 1430...Sorted Set [ Entry:Score, …][1130_1:1430, 1730_1:2130][0700_2:1430, 1130_1:1430, 1730_1:2130]Using a Redis sorted set as an indexKey94619:Monday94707:Monday94619:Monday 0700_2 1430
  • @crichardsonFinding restaurants that have not yet closedZRANGEBYSCORE 94619:Monday 1815 2359è{1730_1}1730 is before 1815 è Ajanta is openDelivery timeDelivery zip and daySorted Set [ Entry:Score, …][1130_1:1430, 1730_1:2130][0700_2:1430, 1130_1:1430, 1730_1:2130]Key94619:Monday94707:Monday
  • @crichardsonNoSQL Denormalizedrepresentation for each query
  • @crichardsonSorryTed!http://en.wikipedia.org/wiki/Edgar_F._Codd
  • @crichardsonAgenda• Why polyglot persistence?• Optimizing queries using Redis materialized views• Synchronizing MySQL and Redis
  • @crichardsonMySQL & Redisneed to be consistent
  • @crichardsonTwo-Phase commit is not anoption• Redis does not support it• Even if it did, 2PC is best avoided http://www.infoq.com/articles/ebay-scalability-best-practices
  • @crichardsonAtomicConsistentIsolatedDurableBasically AvailableSoft stateEventually consistentBASE:An Acid Alternative http://queue.acm.org/detail.cfm?id=1394128
  • @crichardsonUpdating Redis #FAILbegin MySQL transactionupdate MySQLupdate Redisrollback MySQL transactionRedis has updateMySQL does notbegin MySQL transactionupdate MySQLcommit MySQL transaction<<system crashes>>update RedisMySQL has updateRedis does not
  • @crichardsonUpdating Redis reliablyStep 1 of 2begin MySQL transactionupdate MySQLqueue CRUD event in MySQLcommit transactionACIDEvent IdOperation: Create, Update, DeleteNew entity state, e.g. JSON
  • @crichardsonUpdating Redis reliablyStep 2 of 2for each CRUD event in MySQL queueget next CRUD event from MySQL queueIf CRUD event is not duplicate thenUpdate Redis (incl. eventId)end ifbegin MySQL transactionmark CRUD event as processedcommit transaction
  • @crichardsonGenerating CRUD events• Explicit code• Hibernate event listener• Service-layer aspect• CQRS/Event-sourcing• Database triggers• Reading the MySQL binlog
  • @crichardsonID JSON processed?ENTITY_CRUD_EVENTEntityCrudEventProcessorRedisUpdaterapply(event)Step 1 Step 2INSERT INTO ...TimerSELECT ... FROM ...Redis
  • @crichardsonRepresenting restaurants in Redis...JSON...restaurant:lastSeenEventId:1restaurant:15594619:Monday [0700_2:1430, 1130_1:1430, ...]94619:Tuesday [0700_2:1430, 1130_1:1430, ...]duplicatedetection
  • @crichardsonUpdating RedisWATCH restaurant:lastSeenEventId:≪restaurantId≫OptimisticlockingDuplicatedetectionlastSeenEventId = GET restaurant:lastSeenEventId:≪restaurantId≫if (lastSeenEventId >= eventId) return;TransactionMULTISET restaurant:lastSeenEventId:≪restaurantId≫ eventId...SET restaurant:≪restaurantId≫ ≪restaurantJSON≫ZADD ≪ZipCode≫:≪DayOfTheWeek≫ ...... update the restaurant data...EXEC
  • @crichardsonSummary• Each SQL/NoSQL database = set of tradeoffs• Polyglot persistence: leverage the strengths of SQL andNoSQL databases• Use Redis as a distributed cache• Store denormalized data in Redis for fast querying• Reliable database synchronization required
  • @crichardson@crichardson chris.richardson@springsource.comhttp://plainoldobjects.com - code and slides