Developing polyglot persistence applications (SpringOne China 2012)

  • Developing polyglot persistence applications
    Chris Richardson, author of POJOs in Action, founder of the original CloudFoundry.com
    @crichardson
    chris.richardson@springsource.com
    http://plainoldobjects.com
  • Presentation goal
    The benefits and drawbacks of polyglot persistence, and how to design applications that use this approach
  • About Chris
    http://www.theregister.co.uk/2009/08/19/springsource_cloud_foundry/
  • vmc push About-Chris
    Developer Advocate
    Sign up at http://cloudfoundry.com
  • Agenda
    • Why polyglot persistence?
    • Using Redis as a cache
    • Optimizing queries using Redis materialized views
    • Synchronizing MySQL and Redis
    • Tracking changes to entities
    • Using a modular asynchronous architecture
  • Food to Go
    • Take-out food delivery service
    • “Launched” in 2006
  • Food To Go architecture (diagram)
    Consumers use Order Taking and restaurant owners use Restaurant Management; both are backed by a single MySQL database
  • Success: growth challenges
    • Increasing traffic
    • Increasing data volume
    • Distribute across a few data centers
    • Increasing domain model complexity
  • Limitations of relational databases
    • Scalability
    • Distribution
    • Schema updates
    • O/R impedance mismatch
    • Handling semi-structured data
  • Solution: Spend money
    http://upload.wikimedia.org/wikipedia/commons/e/e5/Rising_Sun_Yacht.JPG
    OR
    http://www.trekbikes.com/us/en/bikes/road/race_performance/madone_5_series/madone_5_2/#
  • Solution: Use NoSQL
    Benefits                 Drawbacks
    • Higher performance     • Limited transactions
    • Higher scalability     • Limited querying
    • Richer data model      • Relaxed consistency
    • Schema-less            • Unconstrained data
  • Example NoSQL databases
    Database     Key features
    Cassandra    Extensible column store, very scalable, distributed
    Neo4j        Graph database
    MongoDB      Document-oriented, fast, scalable
    Redis        Key-value store, very fast
    http://nosql-database.org/ lists 122+ NoSQL databases
  • Redis
    • Advanced key-value store (K1 → V1, K2 → V2, ...)
    • Very fast, e.g. 100K reqs/sec
    • Optional persistence
    • Transactions with optimistic locking
    • Master-slave replication
    • Sharding using client-side consistent hashing
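The last bullet, client-side consistent hashing, can be sketched in plain Java. This is only an illustration under stated assumptions: the class `ConsistentHashRing`, the server names, the CRC32 hash and the number of virtual nodes are all hypothetical choices made here, not something the slides or any Redis client library prescribe.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

// Hypothetical client-side shard router; not taken from any Redis client library.
class ConsistentHashRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int virtualNodes;

    ConsistentHashRing(List<String> servers, int virtualNodes) {
        this.virtualNodes = virtualNodes;
        for (String server : servers) {
            addServer(server);
        }
    }

    // Each server is placed on the ring at several points to even out the load.
    void addServer(String server) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.putIfAbsent(hash(server + "#" + i), server);
        }
    }

    // Removes only the ring points that belong to this server.
    void removeServer(String server) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.remove(hash(server + "#" + i), server);
        }
    }

    // A key belongs to the first server point at or after its hash, wrapping around.
    String serverFor(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no servers configured");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(key));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    private static long hash(String value) {
        CRC32 crc = new CRC32();
        crc.update(value.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    public static void main(String[] args) {
        ConsistentHashRing ring =
            new ConsistentHashRing(Arrays.asList("redis-a", "redis-b", "redis-c"), 100);
        for (String key : Arrays.asList("restaurant:1", "restaurant:2", "94619:Monday")) {
            System.out.println(key + " -> " + ring.serverFor(key));
        }
    }
}
```

The point of the ring is that removing one server only remaps the keys that lived on it; keys on the remaining servers stay where they were.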
  • Sorted sets
    Key: myset; values with scores: a (5.0), b (10.0)
    Members are sorted by score
  • Adding members to a sorted set
      zadd myset 5.0 a        (key, score, value)
    Redis server: myset = [a (5.0)]
  • Adding members to a sorted set
      zadd myset 10.0 b
    Redis server: myset = [a (5.0), b (10.0)]
  • Adding members to a sorted set
      zadd myset 1.0 c
    Redis server: myset = [c (1.0), a (5.0), b (10.0)]
  • Retrieving members by index range
      zrange myset 0 1        (key, start index, end index)
    Redis server: myset = [c (1.0), a (5.0), b (10.0)]
    Returns: c, a
  • Retrieving members by score
      zrangebyscore myset 1 6        (key, min value, max value)
    Redis server: myset = [c (1.0), a (5.0), b (10.0)]
    Returns: c, a
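To make the sorted-set semantics concrete, here is a small in-memory model that replays the same commands. It is a sketch, not a Redis client: `SortedSetSketch` and its method names are hypothetical, it models a single key, and unlike the real ZRANGE it only accepts non-negative indexes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// In-memory stand-in for one Redis sorted set; illustrates the semantics only.
class SortedSetSketch {
    private final Map<String, Double> scores = new HashMap<>();
    // Order by score first, then by member name to break ties.
    private final Comparator<String> byScoreThenMember = (a, b) -> {
        int byScore = Double.compare(scores.get(a), scores.get(b));
        return byScore != 0 ? byScore : a.compareTo(b);
    };
    private final TreeSet<String> ordered = new TreeSet<>(byScoreThenMember);

    // ZADD: insert the member, or move it if it already exists with another score.
    void zadd(double score, String member) {
        if (scores.containsKey(member)) {
            ordered.remove(member); // must happen before the score changes
        }
        scores.put(member, score);
        ordered.add(member);
    }

    // ZRANGE with inclusive, non-negative indexes.
    List<String> zrange(int start, int stop) {
        List<String> all = new ArrayList<>(ordered);
        int end = Math.min(stop, all.size() - 1);
        if (start < 0 || start > end) {
            return new ArrayList<>();
        }
        return new ArrayList<>(all.subList(start, end + 1));
    }

    // ZRANGEBYSCORE with inclusive bounds.
    List<String> zrangeByScore(double min, double max) {
        List<String> result = new ArrayList<>();
        for (String member : ordered) {
            double score = scores.get(member);
            if (score >= min && score <= max) {
                result.add(member);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        SortedSetSketch myset = new SortedSetSketch();
        myset.zadd(5.0, "a");
        myset.zadd(10.0, "b");
        myset.zadd(1.0, "c");
        System.out.println(myset.zrange(0, 1));        // [c, a]
        System.out.println(myset.zrangeByScore(1, 6)); // [c, a]
    }
}
```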
  • Redis use cases
    • Replacement for Memcached
      • Session state
      • Cache of data retrieved from system of record (SOR)
    • Replica of SOR for queries needing high performance
    • Handling tasks that overload an RDBMS
      • Hit counts - INCR
      • Most recent N items - LPUSH and LTRIM
      • Randomly selecting an item - SRANDMEMBER
      • Queuing - lists with LPOP, RPUSH, ...
      • High score tables - sorted sets and ZINCRBY
      • …
  • Redis is great but there are tradeoffs
    • Low-level query language: PK-based access only
    • Limited transaction model:
      • Read first and then execute updates as batch
      • Difficult to compose code
    • Data must fit in memory
    • Single-threaded server: run multiple with client-side sharding
    • Missing features such as access control, ...
  • And don’t forget: an RDBMS is fine for many applications
  • The future is polyglot
    e.g. Netflix
    • RDBMS
    • SimpleDB
    • Cassandra
    • Hadoop/HBase
    IEEE Software Sept/October 2010 - Debasish Ghosh / Twitter @debasishg
  • Agenda: Using Redis as a cache
  • Increase scalability by caching (diagram)
    Order Taking (consumer) and Restaurant Management (restaurant owner) use a cache in front of the MySQL database
  • Caching options
    • Where:
      • Hibernate 2nd level cache
      • Explicit calls from application code
      • Caching aspect
    • Cache technologies: Ehcache, Memcached, Infinispan, ...
    Redis is also an option
  • Using Redis as a cache
    • Spring 3.1 cache abstraction
      • Annotations specify which methods to cache
      • CacheManager - pluggable back-end cache
    • Spring Data for Redis
      • Simplifies the development of Redis applications
      • Provides RedisTemplate (analogous to JdbcTemplate)
      • Provides RedisCacheManager
  • Using Spring 3.1 Caching
      import org.springframework.beans.factory.annotation.Autowired;
      import org.springframework.cache.annotation.CacheEvict;
      import org.springframework.cache.annotation.Cacheable;
      import org.springframework.stereotype.Service;

      @Service
      public class RestaurantManagementServiceImpl implements RestaurantManagementService {

        private final RestaurantRepository restaurantRepository;

        @Autowired
        public RestaurantManagementServiceImpl(RestaurantRepository restaurantRepository) {
          this.restaurantRepository = restaurantRepository;
        }

        @Override
        public void add(Restaurant restaurant) {
          restaurantRepository.add(restaurant);
        }

        // Cache result
        @Override
        @Cacheable(value = "Restaurant")
        public Restaurant findById(int id) {
          return restaurantRepository.findRestaurant(id);
        }

        // Evict from cache
        @Override
        @CacheEvict(value = "Restaurant", key = "#restaurant.id")
        public void update(Restaurant restaurant) {
          restaurantRepository.update(restaurant);
        }
      }
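Roughly speaking, @Cacheable checks the cache before invoking the method and stores the result, while @CacheEvict removes the entry for a key. The sketch below spells out that behaviour by hand. It makes several assumptions: `CacheAsideSketch` is a hypothetical helper, a ConcurrentHashMap stands in for the Redis-backed cache, and null results are simply not cached. Spring itself applies this logic through proxies rather than explicit calls.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical helper showing the cache-aside pattern behind the annotations.
class CacheAsideSketch<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>(); // stand-in for the Redis cache
    private final Function<K, V> loader;                       // stand-in for the repository call
    int loads = 0;                                             // counts trips to the system of record

    CacheAsideSketch(Function<K, V> loader) {
        this.loader = loader;
    }

    // Comparable to a @Cacheable method: return the cached value, else load and remember it.
    V find(K key) {
        V cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        loads++;
        V loaded = loader.apply(key);
        if (loaded != null) {
            cache.put(key, loaded);
        }
        return loaded;
    }

    // Comparable to a @CacheEvict method: drop the entry so the next read reloads it.
    void evict(K key) {
        cache.remove(key);
    }

    public static void main(String[] args) {
        CacheAsideSketch<Integer, String> restaurants =
            new CacheAsideSketch<>(id -> "restaurant-" + id);
        restaurants.find(1);
        restaurants.find(1);                   // served from the cache
        System.out.println(restaurants.loads); // 1
        restaurants.evict(1);
        restaurants.find(1);                   // loaded again after eviction
        System.out.println(restaurants.loads); // 2
    }
}
```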
  • Configuring the Redis Cache Manager
      <!-- Enables caching -->
      <cache:annotation-driven />

      <!-- Specifies the CacheManager implementation -->
      <bean id="cacheManager" class="org.springframework.data.redis.cache.RedisCacheManager">
        <!-- The RedisTemplate used to access Redis -->
        <constructor-arg ref="restaurantTemplate"/>
      </bean>
  • Domain object to key-value mapping? (diagram)
    How does a Restaurant with its TimeRanges, MenuItems and ServiceArea map onto key-value pairs (K1 → V1, K2 → V2, ...)?
  • RedisTemplate
    • Analogous to JdbcTemplate
    • Encapsulates boilerplate code, e.g. connection management
    • Maps between Java objects and Redis byte[]s
  • Serializers: object to and from byte[]
    • RedisTemplate has multiple serializers
    • DefaultSerializer - defaults to JdkSerializationRedisSerializer
    • KeySerializer
    • ValueSerializer
    • HashKeySerializer
    • HashValueSerializer
  • Serializing a Restaurant as JSON
      @Configuration
      public class RestaurantManagementRedisConfiguration {

        @Autowired
        private RestaurantObjectMapperFactory restaurantObjectMapperFactory;

        private JacksonJsonRedisSerializer<Restaurant> makeRestaurantJsonSerializer() {
          JacksonJsonRedisSerializer<Restaurant> serializer =
              new JacksonJsonRedisSerializer<Restaurant>(Restaurant.class);
          ...
          return serializer;
        }

        @Bean
        @Qualifier("Restaurant")
        public RedisTemplate<String, Restaurant> restaurantTemplate(RedisConnectionFactory factory) {
          RedisTemplate<String, Restaurant> template = new RedisTemplate<String, Restaurant>();
          template.setConnectionFactory(factory);
          // Serialize restaurants using Jackson JSON
          JacksonJsonRedisSerializer<Restaurant> jsonSerializer = makeRestaurantJsonSerializer();
          template.setValueSerializer(jsonSerializer);
          return template;
        }
      }
  • Caching with Redis (diagram)
    Order Taking (consumer) and Restaurant Management (restaurant owner) go to the Redis cache first and to the MySQL database second
  • Agenda: Optimizing queries using Redis materialized views
  • Finding available restaurants
    Available restaurants = restaurants that serve the zip code of the delivery address AND are open at the delivery time
      public interface AvailableRestaurantRepository {
        List<AvailableRestaurant> findAvailableRestaurants(Address deliveryAddress, Date deliveryTime);
        ...
      }
  • Food to Go - Domain model (partial)
      class Restaurant {
        long id;
        String name;
        Set<String> serviceArea;
        Set<TimeRange> openingHours;
        List<MenuItem> menuItems;
      }

      class TimeRange {
        long id;
        int dayOfWeek;
        int openTime;
        int closeTime;
      }

      class MenuItem {
        String name;
        double price;
      }
  • Database schema
    RESTAURANT table
      ID  Name               …
      1   Ajanta
      2   Montclair Eggshop
    RESTAURANT_ZIPCODE table
      Restaurant_id  zipcode
      1              94707
      1              94619
      2              94611
      2              94619
    RESTAURANT_TIME_RANGE table
      Restaurant_id  dayOfWeek  openTime  closeTime
      1              Monday     1130      1430
      1              Monday     1730      2130
      2              Tuesday    1130      …
  • Finding available restaurants on Monday, 6.15pm for 94619 zipcode
    Straightforward three-way join
      select r.*
      from restaurant r
        inner join restaurant_time_range tr on r.id = tr.restaurant_id
        inner join restaurant_zipcode sa on r.id = sa.restaurant_id
      where '94619' = sa.zip_code
        and tr.day_of_week = 'monday'
        and tr.openingtime <= 1815
        and 1815 <= tr.closingtime
  • How to scale queries?
  • Option #1: Query caching
    • [ZipCode, DeliveryTime] ⇨ list of available restaurants
    BUT
    • Long tail queries
    • Update restaurant ⇨ Flush entire cache
    Ineffective
  • Option #2: Master/Slave replication (diagram)
    Writes and consistent reads go to the MySQL master; queries (inconsistent reads) go to MySQL slaves 1 to N
  • Master/Slave replication
    • Mostly straightforward
    BUT
    • Assumes that SQL query is efficient
    • Complexity of administration of slaves
    • Doesn’t scale writes
  • Option #3: Redis materialized views (diagram)
    Restaurant Management (restaurant owner) calls update() on the system of record, the MySQL database; the data is copied to the Redis cache, which Order Taking (consumer) queries with findAvailable()
  • BUT how to implement findAvailableRestaurants() with Redis?!
    How do we get from this SQL query to key-value pairs (K1 → V1, K2 → V2, ...)?
      select r.*
      from restaurant r
        inner join restaurant_time_range tr on r.id = tr.restaurant_id
        inner join restaurant_zipcode sa on r.id = sa.restaurant_id
      where '94619' = sa.zip_code
        and tr.day_of_week = 'monday'
        and tr.openingtime <= 1815
        and 1815 <= tr.closingtime
  • Where we need to be
      ZRANGEBYSCORE myset 1 6
    is equivalent to this query against a sorted_set table with columns (key, value, score):
      select value, score
      from sorted_set
      where key = 'myset'
        and score >= 1
        and score <= 6
  • We need to denormalize
    Think materialized view
  • Simplification #1: Denormalization
    Restaurant_id  Day_of_week  Open_time  Close_time  Zip_code
    1              Monday       1130       1430        94707
    1              Monday       1130       1430        94619
    1              Monday       1730       2130        94707
    1              Monday       1730       2130        94619
    2              Monday       0700       1430        94619
    …
      SELECT restaurant_id
      FROM time_range_zip_code
      WHERE day_of_week = 'Monday'
        AND zip_code = 94619
        AND 1815 < close_time
        AND open_time < 1815
    Simpler query:
    • No joins
    • Two = and two <
  • Simplification #2: Application filtering
      SELECT restaurant_id, open_time
      FROM time_range_zip_code
      WHERE day_of_week = 'Monday'
        AND zip_code = 94619
        AND 1815 < close_time
    The open_time < 1815 condition is dropped from the query and applied by the application to the returned open_time values
    Even simpler query:
    • No joins
    • Two = and one <
  • Simplification #3: Eliminate multiple =’s with concatenation
    Restaurant_id  Zip_dow       Open_time  Close_time
    1              94707:Monday  1130       1430
    1              94619:Monday  1130       1430
    1              94707:Monday  1730       2130
    1              94619:Monday  1730       2130
    2              94619:Monday  0700       1430
    …
      SELECT restaurant_id, open_time
      FROM time_range_zip_code
      WHERE zip_code_day_of_week = '94619:Monday'    (key)
        AND 1815 < close_time                        (range)
  • Simplification #4: Eliminate multiple RETURN VALUES with concatenation
    zip_dow       open_time_restaurant_id  close_time
    94707:Monday  1130_1                   1430
    94619:Monday  1130_1                   1430
    94707:Monday  1730_1                   2130
    94619:Monday  1730_1                   2130
    94619:Monday  0700_2                   1430
    ...
      SELECT open_time_restaurant_id
      FROM time_range_zip_code
      WHERE zip_code_day_of_week = '94619:Monday'
        AND 1815 < close_time
    ✔
  • Using a Redis sorted set as an index
    zip_dow       open_time_restaurant_id  close_time
    94707:Monday  1130_1                   1430
    94619:Monday  1130_1                   1430
    94707:Monday  1730_1                   2130
    94619:Monday  1730_1                   2130
    94619:Monday  0700_2                   1430
    ...
    Key           Sorted Set [ Entry:Score, …]
    94619:Monday  [0700_2:1430, 1130_1:1430, 1730_1:2130]
    94707:Monday  [1130_1:1430, 1730_1:2130]
  • Querying with ZRANGEBYSCORE
    Key           Sorted Set [ Entry:Score, …]
    94619:Monday  [0700_2:1430, 1130_1:1430, 1730_1:2130]
    94707:Monday  [1130_1:1430, 1730_1:2130]
      ZRANGEBYSCORE 94619:Monday 1815 2359 ⇨ {1730_1}
    The key is the delivery zip and day; the minimum score is the delivery time
    1730 is before 1815 ⇨ Ajanta is open
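The index and query above can be replayed with a small in-memory model, using the rows from the denormalized table. This is a sketch under assumptions: `AvailabilityIndexSketch` and its method names are hypothetical, a nested map stands in for the Redis sorted sets, and the score filter is a plain loop rather than a ZRANGEBYSCORE call.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// In-memory stand-in for the Redis availability index; illustrates the data layout only.
class AvailabilityIndexSketch {
    // key "zip:day" -> (member "openTime_restaurantId" -> score closeTime)
    private final Map<String, Map<String, Integer>> index = new HashMap<>();

    void addOpeningHours(int restaurantId, String zipCode, String dayOfWeek,
                         int openTime, int closeTime) {
        String key = zipCode + ":" + dayOfWeek;
        String member = String.format("%04d_%d", openTime, restaurantId);
        index.computeIfAbsent(key, k -> new HashMap<>()).put(member, closeTime);
    }

    Set<Integer> findAvailable(String zipCode, String dayOfWeek, int deliveryTime) {
        Map<String, Integer> sortedSet =
            index.getOrDefault(zipCode + ":" + dayOfWeek, Collections.emptyMap());
        Set<Integer> restaurantIds = new TreeSet<>();
        for (Map.Entry<String, Integer> entry : sortedSet.entrySet()) {
            int closeTime = entry.getValue();
            if (closeTime < deliveryTime) {
                continue; // the score range deliveryTime..2359 excludes this entry
            }
            String[] parts = entry.getKey().split("_");
            int openTime = Integer.parseInt(parts[0]);
            if (openTime <= deliveryTime) { // application-side filter on opening time
                restaurantIds.add(Integer.parseInt(parts[1]));
            }
        }
        return restaurantIds;
    }

    public static void main(String[] args) {
        AvailabilityIndexSketch index = new AvailabilityIndexSketch();
        for (String zip : new String[] {"94707", "94619"}) { // restaurant 1 is Ajanta
            index.addOpeningHours(1, zip, "Monday", 1130, 1430);
            index.addOpeningHours(1, zip, "Monday", 1730, 2130);
        }
        index.addOpeningHours(2, "94619", "Monday", 700, 1430);
        System.out.println(index.findAvailable("94619", "Monday", 1815)); // [1]
        System.out.println(index.findAvailable("94619", "Monday", 1200)); // [1, 2]
    }
}
```

Keeping the opening time in the member and the closing time in the score is what lets one range query plus a cheap client-side filter replace the three-way join.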
  • Adding a Restaurant
      @Component
      public class AvailableRestaurantRepositoryImpl implements AvailableRestaurantRepository {

        @Override
        public void add(Restaurant restaurant) {
          addRestaurantDetails(restaurant);
          addAvailabilityIndexEntries(restaurant);
        }

        // Store as JSON
        private void addRestaurantDetails(Restaurant restaurant) {
          restaurantTemplate.opsForValue().set(keyFormatter.key(restaurant.getId()), restaurant);
        }

        private void addAvailabilityIndexEntries(Restaurant restaurant) {
          for (TimeRange tr : restaurant.getOpeningHours()) {
            String indexValue = formatTrId(restaurant, tr);
            int dayOfWeek = tr.getDayOfWeek();
            int closingTime = tr.getClosingTime();
            for (String zipCode : restaurant.getServiceArea()) {
              // arguments: key, member, score
              redisTemplate.opsForZSet().add(closingTimesKey(zipCode, dayOfWeek), indexValue, closingTime);
            }
          }
        }
      }
  • Finding available Restaurants
      @Component
      public class AvailableRestaurantRepositoryImpl implements AvailableRestaurantRepository {

        @Override
        public List<AvailableRestaurant> findAvailableRestaurants(Address deliveryAddress, Date deliveryTime) {
          String zipCode = deliveryAddress.getZip();
          int dayOfWeek = DateTimeUtil.dayOfWeek(deliveryTime);
          int timeOfDay = DateTimeUtil.timeOfDay(deliveryTime);
          String closingTimesKey = closingTimesKey(zipCode, dayOfWeek);

          // Find those that close after the delivery time
          Set<String> trsClosingAfter =
              redisTemplate.opsForZSet().rangeByScore(closingTimesKey, timeOfDay, 2359);

          // Filter out those that open after the delivery time
          Set<String> restaurantIds = new HashSet<String>();
          for (String tr : trsClosingAfter) {
            String[] values = tr.split("_");
            if (Integer.parseInt(values[0]) <= timeOfDay)
              restaurantIds.add(values[1]);
          }

          // Retrieve open restaurants
          Collection<String> keys = keyFormatter.keys(restaurantIds);
          return availableRestaurantTemplate.opsForValue().multiGet(keys);
        }
      }
  • Sorry Ted!
    http://en.wikipedia.org/wiki/Edgar_F._Codd
  • Agenda: Synchronizing MySQL and Redis
  • MySQL & Redis need to be consistent
  • Two-Phase commit is not an option
    • Redis does not support it
    • Even if it did, 2PC is best avoided
    http://www.infoq.com/articles/ebay-scalability-best-practices
  • ACID versus BASE
    ACID          BASE
    Atomic        Basically Available
    Consistent    Soft state
    Isolated      Eventually consistent
    Durable
    BASE: An Acid Alternative http://queue.acm.org/detail.cfm?id=1394128
  • Updating Redis #FAIL
    Attempt 1: update Redis inside the MySQL transaction
      begin MySQL transaction
        update MySQL
        update Redis
      rollback MySQL transaction
    Result: Redis has the update, MySQL does not
    Attempt 2: update Redis after the commit
      begin MySQL transaction
        update MySQL
      commit MySQL transaction
      <<system crashes>>
      update Redis
    Result: MySQL has the update, Redis does not
  • Updating Redis reliably: Step 1 of 2
      begin MySQL transaction
        update MySQL
        queue CRUD event in MySQL
      commit transaction
    Both writes happen in one ACID transaction. The CRUD event contains:
    • Event Id
    • Operation: Create, Update, Delete
    • New entity state, e.g. JSON
  • Updating Redis reliably: Step 2 of 2
      for each CRUD event in MySQL queue
        get next CRUD event from MySQL queue
        if CRUD event is not duplicate then
          update Redis (incl. eventId)
        end if
        begin MySQL transaction
          mark CRUD event as processed
        commit transaction
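The two steps can be sketched end to end with in-memory stand-ins. Everything named here is an assumption for illustration: `CrudEventQueueSketch`, its fields, and the `crashBeforeMarking` flag that simulates a crash between the Redis update and the "mark as processed" transaction. A real implementation would run step 1 inside a single MySQL transaction.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory model of the two-step update; maps and lists stand in for MySQL and Redis.
class CrudEventQueueSketch {
    static final class CrudEvent {
        final long id;
        final String operation;
        final String entityId;
        final String json;
        boolean processed;

        CrudEvent(long id, String operation, String entityId, String json) {
            this.id = id;
            this.operation = operation;
            this.entityId = entityId;
            this.json = json;
        }
    }

    final Map<String, String> restaurantTable = new HashMap<>(); // system of record
    final List<CrudEvent> eventTable = new ArrayList<>();        // event queue kept in MySQL
    final Map<String, String> redisView = new HashMap<>();       // denormalized copy
    final Map<String, Long> lastSeenEventId = new HashMap<>();   // stored alongside the copy
    int redisWrites = 0;
    private long nextEventId = 1;

    // Step 1: change the entity and queue the event (one transaction in the real system).
    void updateRestaurant(String restaurantId, String json) {
        restaurantTable.put(restaurantId, json);
        eventTable.add(new CrudEvent(nextEventId++, "UPDATE", restaurantId, json));
    }

    // Step 2: apply pending events; a redelivered event is detected by its id and skipped.
    void processPending(boolean crashBeforeMarking) {
        for (CrudEvent event : eventTable) {
            if (event.processed) {
                continue;
            }
            long seen = lastSeenEventId.getOrDefault(event.entityId, 0L);
            if (event.id > seen) {
                redisView.put(event.entityId, event.json);
                lastSeenEventId.put(event.entityId, event.id);
                redisWrites++;
            }
            if (crashBeforeMarking) {
                return; // simulated crash: the event stays unprocessed and is retried later
            }
            event.processed = true;
        }
    }

    public static void main(String[] args) {
        CrudEventQueueSketch sketch = new CrudEventQueueSketch();
        sketch.updateRestaurant("1", "{\"name\":\"Ajanta\"}");
        sketch.updateRestaurant("1", "{\"name\":\"Ajanta Restaurant\"}");
        sketch.processPending(true);  // applies event 1, then "crashes"
        sketch.processPending(false); // skips the duplicate, applies event 2
        System.out.println(sketch.redisWrites + " writes, view = " + sketch.redisView);
    }
}
```

Because the event is only marked as processed after the Redis update, a crash leads to redelivery rather than loss, which is why the duplicate check is needed.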
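  • Step 1 and Step 2 (diagram)
    Step 1: EntityCrudEventRepository runs INSERT INTO ... against the ENTITY_CRUD_EVENT table (columns: ID, JSON, processed?)
    Step 2: a Timer triggers EntityCrudEventProcessor, which runs SELECT ... FROM ... against ENTITY_CRUD_EVENT and calls apply(event) on RedisUpdater, which updates Redis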
  • Updating Redis
      WATCH restaurant:lastSeenEventId:≪restaurantId≫                      (optimistic locking)
      lastSeenEventId = GET restaurant:lastSeenEventId:≪restaurantId≫
      if (lastSeenEventId >= eventId) return;                              (duplicate detection)
      MULTI                                                                (transaction)
        SET restaurant:lastSeenEventId:≪restaurantId≫ eventId
        ... update the restaurant data...
      EXEC
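The WATCH / MULTI / EXEC sequence is a check-and-set: the queued writes are applied only if the watched key has not changed since it was read. The sketch below imitates that with a versioned in-memory store. It is an illustration under assumptions: `OptimisticStore`, `RedisUpdaterSketch`, the retry loop and the `restaurant:<id>` data key are hypothetical, and no Redis client is involved.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Versioned in-memory store that imitates WATCH / MULTI / EXEC for a single watched key.
class OptimisticStore {
    private final Map<String, String> data = new HashMap<>();
    private final Map<String, Long> versions = new HashMap<>();

    synchronized String get(String key) {
        return data.get(key);
    }

    // Plays the role of WATCH: remember the version seen before reading.
    synchronized long version(String key) {
        return versions.getOrDefault(key, 0L);
    }

    synchronized void set(String key, String value) {
        data.put(key, value);
        versions.merge(key, 1L, Long::sum);
    }

    // Plays the role of EXEC: apply all writes only if the watched key is unchanged.
    synchronized boolean execIfUnchanged(String watchedKey, long expectedVersion,
                                         Map<String, String> writes) {
        if (version(watchedKey) != expectedVersion) {
            return false;
        }
        writes.forEach(this::set);
        return true;
    }
}

class RedisUpdaterSketch {
    // Returns true if the event was applied, false if it was detected as a duplicate.
    static boolean applyEvent(OptimisticStore store, String restaurantId,
                              long eventId, String json) {
        String seenKey = "restaurant:lastSeenEventId:" + restaurantId;
        while (true) {
            long watchedVersion = store.version(seenKey);
            String lastSeen = store.get(seenKey);
            if (lastSeen != null && Long.parseLong(lastSeen) >= eventId) {
                return false; // duplicate detection
            }
            Map<String, String> writes = new LinkedHashMap<>();
            writes.put(seenKey, Long.toString(eventId));
            writes.put("restaurant:" + restaurantId, json); // hypothetical data key
            if (store.execIfUnchanged(seenKey, watchedVersion, writes)) {
                return true;
            }
            // Another client changed the watched key in the meantime: re-read and retry.
        }
    }

    public static void main(String[] args) {
        OptimisticStore store = new OptimisticStore();
        System.out.println(applyEvent(store, "1", 5, "{\"name\":\"Ajanta\"}")); // true
        System.out.println(applyEvent(store, "1", 5, "{\"name\":\"Ajanta\"}")); // false
    }
}
```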
  • Agenda: Tracking changes to entities
  • How do we generate CRUD events?
  • Change tracking options
    • Explicit code
    • Hibernate event listener
    • Service-layer aspect
    • CQRS/Event-sourcing
  • Hibernate event listener (diagram)
    Hibernate notifies an event listener, which uses EntityCrudEventRepository to write to the ENTITY_CRUD_EVENT table (columns: ID, JSON, processed?)
  • Hibernate event listener
      public class ChangeTrackingListener
          implements PostInsertEventListener, PostDeleteEventListener, PostUpdateEventListener {

        @Autowired
        private EntityCrudEventRepository entityCrudEventRepository;

        // Queue a CRUD event only for entity types that are tracked
        private void maybeTrackChange(Object entity, EntityCrudEventType eventType) {
          if (isTrackedEntity(entity)) {
            entityCrudEventRepository.add(new EntityCrudEvent(eventType, entity));
          }
        }

        @Override
        public void onPostInsert(PostInsertEvent event) {
          Object entity = event.getEntity();
          maybeTrackChange(entity, EntityCrudEventType.CREATE);
        }

        @Override
        public void onPostUpdate(PostUpdateEvent event) {
          Object entity = event.getEntity();
          maybeTrackChange(entity, EntityCrudEventType.UPDATE);
        }

        @Override
        public void onPostDelete(PostDeleteEvent event) {
          Object entity = event.getEntity();
          maybeTrackChange(entity, EntityCrudEventType.DELETE);
        }
      }
  • Agenda: Using a modular asynchronous architecture
  • Original architecture
    A single WAR containing Restaurant Management, ...
  • Drawbacks of this monolithic architecture
    WAR: Restaurant Management, ...
    • Obstacle to frequent deployments
    • Overloads IDE and web container
    • Obstacle to scaling development
    • Technology lock-in
  • Need a more modular architecture
  • Using a message broker
    Asynchronous is preferred
    JSON is fashionable but binary format is more efficient
  • Modular architecture (diagram)
    Components shown: Order Taking (consumer), Restaurant Management (restaurant owner), a Timer-driven Event Publisher, the MySQL database, RabbitMQ, Redis and the Redis cache
  • Benefits of a modular asynchronous architecture
    • Scales development: develop, deploy and scale each service independently
    • Redeploy UI frequently/independently
    • Improves fault isolation
    • Eliminates long-term commitment to a single technology stack
    • Message broker decouples producers and consumers
  • Step 2 of 2
      for each CRUD event in MySQL queue
        get next CRUD event from MySQL queue
        publish persistent message to RabbitMQ
        begin MySQL transaction
          mark CRUD event as processed
        commit transaction
  • Message flow (diagram)
    EntityCrudEventProcessor → RedisUpdater → Spring Integration glue code → RabbitMQ → AvailableRestaurantManagementService → Redis
  • RedisUpdater to AMQP
      <beans>
        <!-- Creates proxy -->
        <int:gateway id="redisUpdaterGateway"
                     service-interface="net...RedisUpdater"
                     default-request-channel="eventChannel" />

        <int:channel id="eventChannel"/>

        <int:object-to-json-transformer input-channel="eventChannel" output-channel="amqpOut"/>

        <int:channel id="amqpOut"/>

        <amqp:outbound-channel-adapter channel="amqpOut"
                                       amqp-template="rabbitTemplate"
                                       routing-key="crudEvents"
                                       exchange-name="crudEvents" />
      </beans>
  • AMQP to Available...Service
      <beans>
        <amqp:inbound-channel-adapter channel="inboundJsonEventsChannel"
                                      connection-factory="rabbitConnectionFactory"
                                      queue-names="crudEvents"/>

        <int:channel id="inboundJsonEventsChannel"/>

        <int:json-to-object-transformer input-channel="inboundJsonEventsChannel"
                                        type="net.chrisrichardson.foodToGo.common.JsonEntityCrudEvent"
                                        output-channel="inboundEventsChannel"/>

        <int:channel id="inboundEventsChannel"/>

        <!-- Invokes service -->
        <int:service-activator input-channel="inboundEventsChannel"
                               ref="availableRestaurantManagementServiceImpl"
                               method="processEvent"/>
      </beans>
  • Summary
    • Each SQL/NoSQL database = set of tradeoffs
    • Polyglot persistence: leverage the strengths of SQL and NoSQL databases
    • Use Redis as a distributed cache
    • Store denormalized data in Redis for fast querying
    • Reliable database synchronization required
  • Questions?
    @crichardson
    chris.richardson@springsource.com
    http://plainoldobjects.com
    Sign up for CloudFoundry.com
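  • Cloud Foundry Boot Camp
    Register an account at www.cloudfoundry.com and successfully upload an application; after noon on December 8 you can bring your account ID and application URL to the registration desk to collect a Cloud Foundry hoodie.
  • Win an iPhone 5
    Please do not leave before the end of the second day of the conference. Drop your completed feedback form into the raffle box at the registration desk to enter the iPhone 5 prize draw.
  • Birds of a Feather: meet the experts
    After their sessions, all speakers will be in the Zilan Hall (紫兰厅) to discuss questions from the sessions with attendees.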