Developing polyglot persistence
               applications
Chris Richardson,
Author of POJOs in Action, Founder of the original CloudFoundry.com
  @crichardson      chris.richardson@springsource.com    http://plainoldobjects.com
Presentation goal
The benefits and drawbacks of polyglot
             persistence
                  and
How to design applications that use this
              approach
About Chris
(About Chris)
About Chris()
About Chris
About Chris




http://www.theregister.co.uk/2009/08/19/springsource_cloud_foundry/
vmc push About-Chris

      Developer Advocate




Signup at http://cloudfoundry.com
Agenda

•   Why polyglot persistence?

•   Using Redis as a cache

•   Optimizing queries using Redis materialized views

•   Synchronizing MySQL and Redis

•   Tracking changes to entities

•   Using a modular asynchronous architecture
Food to Go



•   Take-out food delivery service

•   “Launched” in 2006
Food To Go Architecture
                                   RESTAURANT
        CONSUMER
                                     OWNER



                           Restaurant
Order taking
                          Management



               MySQL
               Database
Success                  Growth challenges


•   Increasing traffic

•   Increasing data volume

•   Distribute across a few data centers

•   Increasing domain model complexity
Limitations of relational databases

•   Scalability

•   Distribution

•   Schema updates

•   O/R impedance mismatch

•   Handling semi-structured data
Solution: Spend Money



http://upload.wikimedia.org/wikipedia/commons/e/e5/Rising_Sun_Yacht.JPG

                                                                          OR




                                                                               http://www.trekbikes.com/us/en/bikes/road/race_performance/madone_5_series/madone_5_2/#
Solution: Use NoSQL
    Benefits                            Drawbacks

•   Higher performance             •   Limited transactions

•   Higher scalability             •   Limited querying

•   Richer data-model              •   Relaxed consistency

•   Schema-less                    •   Unconstrained data
Example NoSQL Databases
Database                                       Key features

Cassandra        Extensible column store, very scalable, distributed

Neo4j            Graph database

                 Document-oriented, fast, scalable
MongoDB

Redis            Key-value store, very fast


            http://nosql-database.org/ lists 122+ NoSQL databases
Redis
                                                    K1    V1
•   Advanced key-value store
                                                    K2    V2
•   Very fast, e.g. 100K reqs/sec

•   Optional persistence                            ...   ...

•   Transactions with optimistic locking

•   Master-slave replication

•   Sharding using client-side consistent hashing
Sorted sets
                                    Value
              Key

                                a      b
                      myset
                              5.0     10.0




Members are sorted            Score
    by score
Adding members to a sorted set
                             Redis Server
  Key    Score     Value

                                             a
zadd myset 5.0 a                   myset
                                            5.0
Adding members to a sorted set
                           Redis Server




                                           a     b
zadd myset 10.0 b                myset
                                          5.0   10.0
Adding members to a sorted set
                           Redis Server




                                            c    a     b
zadd myset 1.0 c                 myset
                                          1.0   5.0   10.0
Retrieving members by index range
            Start     End
    Key
           Index     Index   Redis Server


  zrange myset 0 1

                                              c    a     b
                                   myset
                                            1.0   5.0   10.0
              c      a
Retrieving members by score
                  Min     Max
         Key
                 value    value       Redis Server


zrangebyscore myset 1 6

                                                       c    a     b
                                            myset
                                                     1.0   5.0   10.0
                            c     a
Redis use cases
•   Replacement for Memcached            •   Handling tasks that overload an RDBMS
    •   Session state                        •   Hit counts - INCR
    •   Cache of data retrieved from         •   Most recent N items - LPUSH and LTRIM
        system of record (SOR)
                                             •   Randomly selecting an item –
•   Replica of SOR for queries needing           SRANDMEMBER
    high-performance
                                             •   Queuing – Lists with LPOP, RPUSH, ….
                                             •   High score tables – Sorted sets and ZINCRBY
                                             •   …
Redis is great but there are tradeoffs

•   Low-level query language: PK-based access only

•   Limited transaction model:

    •   Read first and then execute updates as batch

    •   Difficult to compose code

•   Data must fit in memory

•   Single-threaded server: run multiple with client-side sharding

•   Missing features such as access control, ...
And don’t forget:

An RDBMS is fine for many applications
The future is polyglot


                                                                        e.g. Netflix
                                                                        • RDBMS
                                                                        • SimpleDB
                                                                        • Cassandra
                                                                        • Hadoop/Hbase




IEEE Software Sept/October 2010 - Debasish Ghosh / Twitter @debasishg
Agenda

•   Why polyglot persistence?

•   Using Redis as a cache

•   Optimizing queries using Redis materialized views

•   Synchronizing MySQL and Redis

•   Tracking changes to entities

•   Using a modular asynchronous architecture
Increase scalability by caching
                                 RESTAURANT
         CONSUMER
                                   OWNER



                         Restaurant
Order taking
                        Management



                    MySQL
         Cache
                    Database
Caching Options

•   Where:

    •   Hibernate 2nd level cache

    •   Explicit calls from application code

    •   Caching aspect

•   Cache technologies: Ehcache, Memcached, Infinispan, ...



                                Redis is also an option
Using Redis as a cache

•   Spring 3.1 cache abstraction

    •   Annotations specify which methods to cache

    •   CacheManager - pluggable back-end cache

•   Spring Data for Redis

    •   Simplifies the development of Redis applications

    •   Provides RedisTemplate (analogous to JdbcTemplate)

    •   Provides RedisCacheManager
Using Spring 3.1 Caching
@Service
public class RestaurantManagementServiceImpl implements RestaurantManagementService {

  private final RestaurantRepository restaurantRepository;

  @Autowired
  public RestaurantManagementServiceImpl(RestaurantRepository restaurantRepository) {
    this.restaurantRepository = restaurantRepository;
  }

  @Override
  public void add(Restaurant restaurant) {
                                                                      Cache result
    restaurantRepository.add(restaurant);
  }

  @Override
  @Cacheable(value = "Restaurant")
  public Restaurant findById(int id) {
    return restaurantRepository.findRestaurant(id);
                                                                        Evict from
  }                                                                       cache
  @Override
  @CacheEvict(value = "Restaurant", key="#restaurant.id")
  public void update(Restaurant restaurant) {
    restaurantRepository.update(restaurant);
  }
Configuring the Redis Cache Manager
                 Enables caching


	   <cache:annotation-driven />

	   <bean id="cacheManager"
          class="org.springframework.data.redis.cache.RedisCacheManager" >
	   	 <constructor-arg ref="restaurantTemplate"/>
	   </bean>




         Specifies CacheManager                                The RedisTemplate used to access
             implementation                                               Redis
Domain object to key-value mapping?


         Restaurant
                                  K1    V1


TimeRange              MenuItem   K2    V2
 TimeRange             MenuItem

                                  ...   ...
         ServiceArea
RedisTemplate


•   Analogous to JdbcTemplate

•   Encapsulates boilerplate code, e.g. connection management

•   Maps Java objects         Redis byte[]’s
Serializers: object                     byte[]

•   RedisTemplate has multiple serializers

•   DefaultSerializer - defaults to JdkSerializationRedisSerializer

•   KeySerializer

•   ValueSerializer

•   HashKeySerializer

•   HashValueSerializer
Serializing a Restaurant as JSON
@Configuration
public class RestaurantManagementRedisConfiguration {

    @Autowired
    private RestaurantObjectMapperFactory restaurantObjectMapperFactory;

    private JacksonJsonRedisSerializer<Restaurant> makeRestaurantJsonSerializer() {
      JacksonJsonRedisSerializer<Restaurant> serializer =
                              new JacksonJsonRedisSerializer<Restaurant>(Restaurant.class);
      ...
      return serializer;
    }

    @Bean
    @Qualifier("Restaurant")
    public RedisTemplate<String, Restaurant> restaurantTemplate(RedisConnectionFactory factory) {
      RedisTemplate<String, Restaurant> template = new RedisTemplate<String, Restaurant>();
      template.setConnectionFactory(factory);
      JacksonJsonRedisSerializer<Restaurant> jsonSerializer = makeRestaurantJsonSerializer();
      template.setValueSerializer(jsonSerializer);
      return template;
    }



}                                                                                  Serialize restaurants using Jackson JSON
Caching with Redis
                  CONSUMER                 RESTAURANT
                                             OWNER



                                  Restaurant
        Order taking
                                 Management



                 Redis       MySQL
First                                           Second
                 Cache       Database
Agenda

•   Why polyglot persistence?

•   Using Redis as a cache

•   Optimizing queries using Redis materialized views

•   Synchronizing MySQL and Redis

•   Tracking changes to entities

•   Using a modular asynchronous architecture
Finding available restaurants
Available restaurants =
   Serve the zip code of the delivery address
                                                   AND
   Are open at the delivery time


public interface AvailableRestaurantRepository {

   List<AvailableRestaurant>
	

 findAvailableRestaurants(Address deliveryAddress, Date deliveryTime);
   ...
}
Food to Go – Domain model (partial)
class Restaurant {                         class TimeRange {
  long id;                                   long id;
  String name;                               int dayOfWeek;
  Set<String> serviceArea;                   int openTime;
  Set<TimeRange> openingHours;               int closeTime;
  List<MenuItem> menuItems;
                                           }
}


                        class MenuItem {
                          String name;
                          double price;
                        }
Database schema
ID                     Name                           …
                                                                            RESTAURANT table
1                      Ajanta
2                      Montclair Eggshop

    Restaurant_id               zipcode
                                                            RESTAURANT_ZIPCODE table
    1                           94707
    1                           94619
    2                           94611
    2                           94619
                                                          RESTAURANT_TIME_RANGE table

Restaurant_id       dayOfWeek              openTime           closeTime
1                   Monday                 1130               1430
1                   Monday                 1730               2130
2                   Tuesday                1130               …
Finding available restaurants on Monday, 6.15pm for 94619 zipcode
                              Straightforward three-way join

select r.*
from restaurant r
 inner join restaurant_time_range tr
   on r.id =tr.restaurant_id
 inner join restaurant_zipcode sa
   on r.id = sa.restaurant_id
where ’94619’ = sa.zip_code
 and tr.day_of_week=’monday’
 and tr.openingtime <= 1815
 and 1815 <= tr.closingtime
How to scale queries?
Option #1: Query caching


•   [ZipCode, DeliveryTime] ⇨ list of available restaurants

                                             BUT

•   Long tail queries

•   Update restaurant ⇨ Flush entire cache

                              Ineffective
Option #2: Master/Slave replication
                   Writes       Consistent reads



                                                         Queries
                                                   (Inconsistent reads)
                            MySQL
                            Master




         MySQL              MySQL              MySQL
         Slave 1            Slave 2            Slave N
Master/Slave replication

•   Mostly straightforward

                                             BUT

•   Assumes that SQL query is efficient

•   Complexity of administration of slaves

•   Doesn’t scale writes
Option #3: Redis materialized views
                                                RESTAURANT
                   CONSUMER
                                                  OWNER



                                         Restaurant
           Order taking
                                        Management
                                                             System
Copy                      update()                           of Record
         findAvailable()

                                                MySQL
           Redis                Cache
                                                Database
BUT how to implement findAvailableRestaurants() with Redis?!



                                       ?
select r.*
from restaurant r                          K1    V1
 inner join restaurant_time_range tr
   on r.id =tr.restaurant_id
 inner join restaurant_zipcode sa
   on r.id = sa.restaurant_id
                                           K2    V2
where ’94619’ = sa.zip_code
 and tr.day_of_week=’monday’
 and tr.openingtime <= 1815                ...   ...
 and 1815 <= tr.closingtime
Where we need to be
ZRANGEBYSCORE myset 1 6

                =
                               sorted_set
select value,score             key   value   score
from sorted_set
where key = ‘myset’
  and score >= 1
  and score <= 6
We need to denormalize


Think materialized view
Simplification #1: Denormalization
Restaurant_id          Day_of_week     Open_time   Close_time           Zip_code

1                      Monday          1130        1430                 94707
1                      Monday          1130        1430                 94619
1                      Monday          1730        2130                 94707
1                      Monday          1730        2130                 94619
2                      Monday          0700        1430                 94619
…



                SELECT restaurant_id
                FROM time_range_zip_code
                                                                Simpler query:
                WHERE day_of_week = ‘Monday’
                                                                § No joins
                  AND zip_code = 94619                          § Two = and two <
                  AND 1815 < close_time
                  AND open_time < 1815
Simplification #2: Application filtering

SELECT restaurant_id, open_time
FROM time_range_zip_code
                                  Even simpler query
WHERE day_of_week = ‘Monday’
                                  • No joins
  AND zip_code = 94619            • Two = and one <
  AND 1815 < close_time
  AND open_time < 1815
Simplification #3: Eliminate multiple =’s with concatenation
 Restaurant_id    Zip_dow         Open_time   Close_time

 1                94707:Monday    1130        1430
 1                94619:Monday    1130        1430
 1                94707:Monday    1730        2130
 1                94619:Monday    1730        2130
 2                94619:Monday    0700        1430
 …


SELECT restaurant_id, open_time
FROM time_range_zip_code
WHERE zip_code_day_of_week = ‘94619:Monday’
  AND 1815 < close_time
                                                       key

                                   range
Simplification #4: Eliminate multiple RETURN VALUES with
                       concatenation
    zip_dow               open_time_restaurant_id   close_time
    94707:Monday          1130_1                    1430
    94619:Monday          1130_1                    1430
    94707:Monday          1730_1                    2130
    94619:Monday          1730_1                    2130
    94619:Monday          0700_2                    1430
    ...




  SELECT open_time_restaurant_id,
  FROM time_range_zip_code
  WHERE zip_code_day_of_week = ‘94619:Monday’
    AND 1815 < close_time                                        ✔
Using a Redis sorted set as an index
        zip_dow          open_time_restaurant_id                  close_time
        94707:Monday     1130_1                                   1430
        94619:Monday     1130_1                                   1430
        94707:Monday     1730_1                                   2130
        94619:Monday     1730_1                                   2130
        94619:Monday
          94619:Monday   0700_2
                          0700_2                                  1430
                                                                   1430
        ...




Key                               Sorted Set [ Entry:Score, …]

94619:Monday                      [0700_2:1430, 1130_1:1430, 1730_1:2130]

94707:Monday                      [1130_1:1430, 1730_1:2130]
Querying with ZRANGEBYSCORE
Key                            Sorted Set [ Entry:Score, …]

94619:Monday                   [0700_2:1430, 1130_1:1430, 1730_1:2130]

94707:Monday                   [1130_1:1430, 1730_1:2130]



                    Delivery zip and day                      Delivery time


               ZRANGEBYSCORE 94619:Monday 1815 2359
                               è
                            {1730_1}



               1730 is before 1815 è Ajanta is open
Adding a Restaurant
@Component
public class AvailableRestaurantRepositoryImpl implements AvailableRestaurantRepository {

  @Override
  public void add(Restaurant restaurant) {                               Store as
    addRestaurantDetails(restaurant);
    addAvailabilityIndexEntries(restaurant);                              JSON
  }

  private void addRestaurantDetails(Restaurant restaurant) {          Text
    restaurantTemplate.opsForValue().set(keyFormatter.key(restaurant.getId()), restaurant);
  }

  private void addAvailabilityIndexEntries(Restaurant restaurant) {
    for (TimeRange tr : restaurant.getOpeningHours()) {
      String indexValue = formatTrId(restaurant, tr);                          key            member
      int dayOfWeek = tr.getDayOfWeek();
      int closingTime = tr.getClosingTime();
      for (String zipCode : restaurant.getServiceArea()) {
        redisTemplate.opsForZSet().add(closingTimesKey(zipCode, dayOfWeek), indexValue,
                                             closingTime);
      }
    }
  }
                                                                      score
Finding available Restaurants
@Component
public class AvailableRestaurantRepositoryImpl implements AvailableRestaurantRepository {
  @Override
  public List<AvailableRestaurant>
         findAvailableRestaurants(Address deliveryAddress, Date deliveryTime) {                 Find those that close
    String zipCode = deliveryAddress.getZip();                                                          after
    int dayOfWeek = DateTimeUtil.dayOfWeek(deliveryTime);
    int timeOfDay = DateTimeUtil.timeOfDay(deliveryTime);
    String closingTimesKey = closingTimesKey(zipCode, dayOfWeek);

      Set<String> trsClosingAfter =
                   redisTemplate.opsForZSet().rangeByScore(closingTimesKey, timeOfDay, 2359);

      Set<String> restaurantIds = new HashSet<String>();
      for (String tr : trsClosingAfter) {                                                 Filter out those that
        String[] values = tr.split("_");                                                        open after
        if (Integer.parseInt(values[0]) <= timeOfDay)
          restaurantIds.add(values[1]);
      }
      Collection<String> keys = keyFormatter.keys(restaurantIds);
      return availableRestaurantTemplate.opsForValue().multiGet(keys);                             Retrieve open
  }                                                                                                 restaurants
Sorry Ted!




http://en.wikipedia.org/wiki/Edgar_F._Codd
Agenda

•   Why polyglot persistence?

•   Using Redis as a cache

•   Optimizing queries using Redis materialized views

•   Synchronizing MySQL and Redis

•   Tracking changes to entities

•   Using a modular asynchronous architecture
MySQL & Redis
need to be consistent
Two-Phase commit is not an option



•   Redis does not support it

•   Even if it did, 2PC is best avoided
         http://www.infoq.com/articles/ebay-scalability-best-practices
Atomic
Consistent                                                Basically Available
Isolated                                                  Soft state
Durable                                                   Eventually consistent




BASE: An Acid Alternative http://queue.acm.org/detail.cfm?id=1394128
Updating Redis #FAIL
begin MySQL transaction
 update MySQL                        Redis has update
 update Redis                        MySQL does not
rollback MySQL transaction

begin MySQL transaction
 update MySQL
                                   MySQL has update
commit MySQL transaction
                                   Redis does not
<<system crashes>>
 update Redis
Updating Redis reliably
                 Step 1 of 2
begin MySQL transaction
 update MySQL
                                       ACID
 queue CRUD event in MySQL
commit transaction


  Event Id
  Operation: Create, Update, Delete
  New entity state, e.g. JSON
Updating Redis reliably
                    Step 2 of 2
for each CRUD event in MySQL queue
     get next CRUD event from MySQL queue
     If CRUD event is not duplicate then
      Update Redis (incl. eventId)
     end if
     begin MySQL transaction
       mark CRUD event as processed
     commit transaction
Step 1                Step 2
                                  Timer
   EntityCrudEvent     EntityCrudEvent         apply(event)    Redis
     Repository           Processor                           Updater



INSERT INTO ...        SELECT ... FROM ...


ENTITY_CRUD_EVENT


        ID           JSON         processed?
                                                              Redis
Optimistic
 locking                 Updating Redis

WATCH restaurant:lastSeenEventId:≪restaurantId≫

lastSeenEventId = GET restaurant:lastSeenEventId:≪restaurantId≫
                                                                   Duplicate
if (lastSeenEventId >= eventId) return;                            detection
MULTI
 SET restaurant:lastSeenEventId:≪restaurantId≫ eventId
                                                                  Transaction
  ... update the restaurant data...

EXEC
Agenda

•   Why polyglot persistence?

•   Using Redis as a cache

•   Optimizing queries using Redis materialized views

•   Synchronizing MySQL and Redis

•   Tracking changes to entities

•   Using a modular asynchronous architecture
How do we generate CRUD events?
Change tracking options


•   Explicit code

•   Hibernate event listener

•   Service-layer aspect

•   CQRS/Event-sourcing
HibernateEvent                     EntityCrudEvent
   Listener                          Repository




        ENTITY_CRUD_EVENT


               ID           JSON           processed?
Hibernate event listener
public class ChangeTrackingListener
  implements PostInsertEventListener, PostDeleteEventListener, PostUpdateEventListener {

 @Autowired
 private EntityCrudEventRepository entityCrudEventRepository;

 private void maybeTrackChange(Object entity, EntityCrudEventType eventType) {
   if (isTrackedEntity(entity)) {
     entityCrudEventRepository.add(new EntityCrudEvent(eventType, entity));
   }
 }

 @Override
 public void onPostInsert(PostInsertEvent event) {
   Object entity = event.getEntity();
   maybeTrackChange(entity, EntityCrudEventType.CREATE);
 }

 @Override
 public void onPostUpdate(PostUpdateEvent event) {
   Object entity = event.getEntity();
   maybeTrackChange(entity, EntityCrudEventType.UPDATE);
 }

 @Override
 public void onPostDelete(PostDeleteEvent event) {
   Object entity = event.getEntity();
   maybeTrackChange(entity, EntityCrudEventType.DELETE);
 }
Agenda

•   Why polyglot persistence?

•   Using Redis as a cache

•   Optimizing queries using Redis materialized views

•   Synchronizing MySQL and Redis

•   Tracking changes to entities

•   Using a modular asynchronous architecture
Original architecture
   WAR
          Restaurant
         Management

             ...
Drawbacks of this monolithic architecture

WAR
       Restaurant    •   Obstacle to frequent deployments
      Management
                     •   Overloads IDE and web container

          ...        •   Obstacle to scaling development

                     •   Technology lock-in
Need a more modular architecture
Using a message broker

     Asynchronous is preferred


JSON is fashionable but binary format is
             more efficient
Modular architecture
         CONSUMER                  Timer               RESTAURANT
                                                         OWNER




Order taking             Event Publisher       Restaurant
                                              Management




                                           MySQL            Redis
 Redis                 RabbitMQ
                                           Database         Cache
Benefits of a modular asynchronous
                        architecture

•   Scales development: develop, deploy and scale each service independently

•   Redeploy UI frequently/independently

•   Improves fault isolation

•   Eliminates long-term commitment to a single technology stack

•   Message broker decouples producers and consumers
Step 2 of 2

for each CRUD event in MySQL queue
 get next CRUD event from MySQL queue
  Publish persistent message to RabbitMQ
  begin MySQL transaction
   mark CRUD event as processed
  commit transaction
Message flow
EntityCrudEvent
   Processor
                                 AvailableRestaurantManagement
                                             Service
           Redis
          Updater


             Spring Integration glue code


                     RABBITMQ                              REDIS
RedisUpdater                           AMQP
<beans>

	   <int:gateway id="redisUpdaterGateway"
     	 service-interface="net...RedisUpdater"
                                                                  Creates proxy
	   	 default-request-channel="eventChannel"
     	 />

	   <int:channel id="eventChannel"/>

	   <int:object-to-json-transformer
        input-channel="eventChannel" output-channel="amqpOut"/>

	   <int:channel id="amqpOut"/>

	   <amqp:outbound-channel-adapter
	   	 channel="amqpOut"
	   	 amqp-template="rabbitTemplate"
	   	 routing-key="crudEvents"
	   	 exchange-name="crudEvents"
	    />

</beans>
AMQP                     Available...Service
<beans>
	 <amqp:inbound-channel-adapter
	 	 channel="inboundJsonEventsChannel"
	 	 connection-factory="rabbitConnectionFactory"
	 	 queue-names="crudEvents"/>

	 <int:channel id="inboundJsonEventsChannel"/>
	
	 <int:json-to-object-transformer
	 	 input-channel="inboundJsonEventsChannel"
	 	 type="net.chrisrichardson.foodToGo.common.JsonEntityCrudEvent"
	 	 output-channel="inboundEventsChannel"/>
	
	 <int:channel id="inboundEventsChannel"/>                           Invokes service
	
	 <int:service-activator
	 	 input-channel="inboundEventsChannel"
	 	 ref="availableRestaurantManagementServiceImpl"
	 	 method="processEvent"/>
</beans>
Summary

•   Each SQL/NoSQL database = set of tradeoffs

•   Polyglot persistence: leverage the strengths of SQL and NoSQL databases

•   Use Redis as a distributed cache

•   Store denormalized data in Redis for fast querying

•   Reliable database synchronization required
@crichardson chris.richardson@springsource.com
           http://plainoldobjects.com




             Questions?
Sign up for CloudFoundry.com
Cloud Foundry 启动营
在www.cloudfoundry.com注册账号并成功上传应用程序,
即可于12月8日中午后凭账号ID和应用URL到签到处换取Cloud Foundry主题卫衣一件。




88
iPhone5 等你拿
第二天大会结束前,请不要提前离      ,将填写完整的意见反馈表投到签到处的抽奖箱内,
即可参与“iPhone5”抽奖活动。




89
Birds of a Feather 专家面对面
所有讲师都会在课程结束后,到紫兰厅与来宾讨论课程上的问题




90

Developing polyglot persistence applications (SpringOne China 2012)

  • 1.
    Developing polyglot persistence applications Chris Richardson, Author of POJOs in Action, Founder of the original CloudFoundry.com @crichardson chris.richardson@springsource.com http://plainoldobjects.com
  • 2.
    Presentation goal The benefitsand drawbacks of polyglot persistence and How to design applications that use this approach
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.
    vmc push About-Chris Developer Advocate Signup at http://cloudfoundry.com
  • 9.
    Agenda • Why polyglot persistence? • Using Redis as a cache • Optimizing queries using Redis materialized views • Synchronizing MySQL and Redis • Tracking changes to entities • Using a modular asynchronous architecture
  • 10.
    Food to Go • Take-out food delivery service • “Launched” in 2006
  • 11.
    Food To GoArchitecture RESTAURANT CONSUMER OWNER Restaurant Order taking Management MySQL Database
  • 12.
    Success Growth challenges • Increasing traffic • Increasing data volume • Distribute across a few data centers • Increasing domain model complexity
  • 13.
    Limitations of relationaldatabases • Scalability • Distribution • Schema updates • O/R impedance mismatch • Handling semi-structured data
  • 14.
    Solution: Spend Money http://upload.wikimedia.org/wikipedia/commons/e/e5/Rising_Sun_Yacht.JPG OR http://www.trekbikes.com/us/en/bikes/road/race_performance/madone_5_series/madone_5_2/#
  • 15.
    Solution: Use NoSQL Benefits Drawbacks • Higher performance • Limited transactions • Higher scalability • Limited querying • Richer data-model • Relaxed consistency • Schema-less • Unconstrained data
  • 16.
    Example NoSQL Databases Database Key features Cassandra Extensible column store, very scalable, distributed Neo4j Graph database Document-oriented, fast, scalable MongoDB Redis Key-value store, very fast http://nosql-database.org/ lists 122+ NoSQL databases
  • 17.
    Redis K1 V1 • Advanced key-value store K2 V2 • Very fast, e.g. 100K reqs/sec • Optional persistence ... ... • Transactions with optimistic locking • Master-slave replication • Sharding using client-side consistent hashing
  • 18.
    Sorted sets Value Key a b myset 5.0 10.0 Members are sorted Score by score
  • 19.
    Adding members toa sorted set Redis Server Key Score Value a zadd myset 5.0 a myset 5.0
  • 20.
    Adding members toa sorted set Redis Server a b zadd myset 10.0 b myset 5.0 10.0
  • 21.
    Adding members toa sorted set Redis Server c a b zadd myset 1.0 c myset 1.0 5.0 10.0
  • 22.
    Retrieving members byindex range Start End Key Index Index Redis Server zrange myset 0 1 c a b myset 1.0 5.0 10.0 c a
  • 23.
    Retrieving members byscore Min Max Key value value Redis Server zrangebyscore myset 1 6 c a b myset 1.0 5.0 10.0 c a
  • 24.
    Redis use cases • Replacement for Memcached • Handling tasks that overload an RDBMS • Session state • Hit counts - INCR • Cache of data retrieved from • Most recent N items - LPUSH and LTRIM system of record (SOR) • Randomly selecting an item – • Replica of SOR for queries needing SRANDMEMBER high-performance • Queuing – Lists with LPOP, RPUSH, …. • High score tables – Sorted sets and ZINCRBY • …
  • 25.
    Redis is greatbut there are tradeoffs • Low-level query language: PK-based access only • Limited transaction model: • Read first and then execute updates as batch • Difficult to compose code • Data must fit in memory • Single-threaded server: run multiple with client-side sharding • Missing features such as access control, ...
  • 26.
    And don’t forget: AnRDBMS is fine for many applications
  • 27.
    The future ispolyglot e.g. Netflix • RDBMS • SimpleDB • Cassandra • Hadoop/Hbase IEEE Software Sept/October 2010 - Debasish Ghosh / Twitter @debasishg
  • 28.
    Agenda • Why polyglot persistence? • Using Redis as a cache • Optimizing queries using Redis materialized views • Synchronizing MySQL and Redis • Tracking changes to entities • Using a modular asynchronous architecture
  • 29.
    Increase scalability bycaching RESTAURANT CONSUMER OWNER Restaurant Order taking Management MySQL Cache Database
  • 30.
    Caching Options • Where: • Hibernate 2nd level cache • Explicit calls from application code • Caching aspect • Cache technologies: Ehcache, Memcached, Infinispan, ... Redis is also an option
  • 31.
    Using Redis asa cache • Spring 3.1 cache abstraction • Annotations specify which methods to cache • CacheManager - pluggable back-end cache • Spring Data for Redis • Simplifies the development of Redis applications • Provides RedisTemplate (analogous to JdbcTemplate) • Provides RedisCacheManager
  • 32.
    Using Spring 3.1Caching @Service public class RestaurantManagementServiceImpl implements RestaurantManagementService { private final RestaurantRepository restaurantRepository; @Autowired public RestaurantManagementServiceImpl(RestaurantRepository restaurantRepository) { this.restaurantRepository = restaurantRepository; } @Override public void add(Restaurant restaurant) { Cache result restaurantRepository.add(restaurant); } @Override @Cacheable(value = "Restaurant") public Restaurant findById(int id) { return restaurantRepository.findRestaurant(id); Evict from } cache @Override @CacheEvict(value = "Restaurant", key="#restaurant.id") public void update(Restaurant restaurant) { restaurantRepository.update(restaurant); }
  • 33.
    Configuring the RedisCache Manager Enables caching <cache:annotation-driven /> <bean id="cacheManager" class="org.springframework.data.redis.cache.RedisCacheManager" > <constructor-arg ref="restaurantTemplate"/> </bean> Specifies CacheManager The RedisTemplate used to access implementation Redis
  • 34.
    Domain object tokey-value mapping? Restaurant K1 V1 TimeRange MenuItem K2 V2 TimeRange MenuItem ... ... ServiceArea
  • 35.
    RedisTemplate • Analogous to JdbcTemplate • Encapsulates boilerplate code, e.g. connection management • Maps Java objects Redis byte[]’s
  • 36.
    Serializers: object byte[] • RedisTemplate has multiple serializers • DefaultSerializer - defaults to JdkSerializationRedisSerializer • KeySerializer • ValueSerializer • HashKeySerializer • HashValueSerializer
  • 37.
    Serializing a Restaurantas JSON @Configuration public class RestaurantManagementRedisConfiguration { @Autowired private RestaurantObjectMapperFactory restaurantObjectMapperFactory; private JacksonJsonRedisSerializer<Restaurant> makeRestaurantJsonSerializer() { JacksonJsonRedisSerializer<Restaurant> serializer = new JacksonJsonRedisSerializer<Restaurant>(Restaurant.class); ... return serializer; } @Bean @Qualifier("Restaurant") public RedisTemplate<String, Restaurant> restaurantTemplate(RedisConnectionFactory factory) { RedisTemplate<String, Restaurant> template = new RedisTemplate<String, Restaurant>(); template.setConnectionFactory(factory); JacksonJsonRedisSerializer<Restaurant> jsonSerializer = makeRestaurantJsonSerializer(); template.setValueSerializer(jsonSerializer); return template; } } Serialize restaurants using Jackson JSON
  • 38.
    Caching with Redis CONSUMER RESTAURANT OWNER Restaurant Order taking Management Redis MySQL First Second Cache Database
  • 39.
    Agenda • Why polyglot persistence? • Using Redis as a cache • Optimizing queries using Redis materialized views • Synchronizing MySQL and Redis • Tracking changes to entities • Using a modular asynchronous architecture
  • 40.
    Finding available restaurants Availablerestaurants = Serve the zip code of the delivery address AND Are open at the delivery time public interface AvailableRestaurantRepository { List<AvailableRestaurant> findAvailableRestaurants(Address deliveryAddress, Date deliveryTime); ... }
  • 41.
    Food to Go– Domain model (partial) class Restaurant { class TimeRange { long id; long id; String name; int dayOfWeek; Set<String> serviceArea; int openTime; Set<TimeRange> openingHours; int closeTime; List<MenuItem> menuItems; } } class MenuItem { String name; double price; }
  • 42.
    Database schema ID Name … RESTAURANT table 1 Ajanta 2 Montclair Eggshop Restaurant_id zipcode RESTAURANT_ZIPCODE table 1 94707 1 94619 2 94611 2 94619 RESTAURANT_TIME_RANGE table Restaurant_id dayOfWeek openTime closeTime 1 Monday 1130 1430 1 Monday 1730 2130 2 Tuesday 1130 …
  • 43.
    Finding available restaurantson Monday, 6.15pm for 94619 zipcode Straightforward three-way join select r.* from restaurant r inner join restaurant_time_range tr on r.id =tr.restaurant_id inner join restaurant_zipcode sa on r.id = sa.restaurant_id where ’94619’ = sa.zip_code and tr.day_of_week=’monday’ and tr.openingtime <= 1815 and 1815 <= tr.closingtime
  • 44.
    How to scalequeries?
  • 45.
    Option #1: Querycaching • [ZipCode, DeliveryTime] ⇨ list of available restaurants BUT • Long tail queries • Update restaurant ⇨ Flush entire cache Ineffective
  • 46.
    Option #2: Master/Slavereplication Writes Consistent reads Queries (Inconsistent reads) MySQL Master MySQL MySQL MySQL Slave 1 Slave 2 Slave N
  • 47.
    Master/Slave replication • Mostly straightforward BUT • Assumes that SQL query is efficient • Complexity of administration of slaves • Doesn’t scale writes
  • 48.
    Option #3: Redismaterialized views RESTAURANT CONSUMER OWNER Restaurant Order taking Management System Copy update() of Record findAvailable() MySQL Redis Cache Database
  • 49.
    BUT how toimplement findAvailableRestaurants() with Redis?! ? select r.* from restaurant r K1 V1 inner join restaurant_time_range tr on r.id =tr.restaurant_id inner join restaurant_zipcode sa on r.id = sa.restaurant_id K2 V2 where ’94619’ = sa.zip_code and tr.day_of_week=’monday’ and tr.openingtime <= 1815 ... ... and 1815 <= tr.closingtime
  • 50.
    Where we needto be ZRANGEBYSCORE myset 1 6 = sorted_set select value,score key value score from sorted_set where key = ‘myset’ and score >= 1 and score <= 6
  • 51.
    We need todenormalize Think materialized view
  • 52.
    Simplification #1: Denormalization Restaurant_id Day_of_week Open_time Close_time Zip_code 1 Monday 1130 1430 94707 1 Monday 1130 1430 94619 1 Monday 1730 2130 94707 1 Monday 1730 2130 94619 2 Monday 0700 1430 94619 … SELECT restaurant_id FROM time_range_zip_code Simpler query: WHERE day_of_week = ‘Monday’ § No joins AND zip_code = 94619 § Two = and two < AND 1815 < close_time AND open_time < 1815
  • 53.
    Simplification #2: Applicationfiltering SELECT restaurant_id, open_time FROM time_range_zip_code Even simpler query WHERE day_of_week = ‘Monday’ • No joins AND zip_code = 94619 • Two = and one < AND 1815 < close_time AND open_time < 1815
  • 54.
    Simplification #3: Eliminatemultiple =’s with concatenation Restaurant_id Zip_dow Open_time Close_time 1 94707:Monday 1130 1430 1 94619:Monday 1130 1430 1 94707:Monday 1730 2130 1 94619:Monday 1730 2130 2 94619:Monday 0700 1430 … SELECT restaurant_id, open_time FROM time_range_zip_code WHERE zip_code_day_of_week = ‘94619:Monday’ AND 1815 < close_time key range
  • 55.
    Simplification #4: Eliminatemultiple RETURN VALUES with concatenation zip_dow open_time_restaurant_id close_time 94707:Monday 1130_1 1430 94619:Monday 1130_1 1430 94707:Monday 1730_1 2130 94619:Monday 1730_1 2130 94619:Monday 0700_2 1430 ... SELECT open_time_restaurant_id, FROM time_range_zip_code WHERE zip_code_day_of_week = ‘94619:Monday’ AND 1815 < close_time ✔
  • 56.
    Using a Redissorted set as an index zip_dow open_time_restaurant_id close_time 94707:Monday 1130_1 1430 94619:Monday 1130_1 1430 94707:Monday 1730_1 2130 94619:Monday 1730_1 2130 94619:Monday 94619:Monday 0700_2 0700_2 1430 1430 ... Key Sorted Set [ Entry:Score, …] 94619:Monday [0700_2:1430, 1130_1:1430, 1730_1:2130] 94707:Monday [1130_1:1430, 1730_1:2130]
  • 57.
    Querying with ZRANGEBYSCORE Key Sorted Set [ Entry:Score, …] 94619:Monday [0700_2:1430, 1130_1:1430, 1730_1:2130] 94707:Monday [1130_1:1430, 1730_1:2130] Delivery zip and day Delivery time ZRANGEBYSCORE 94619:Monday 1815 2359 è {1730_1} 1730 is before 1815 è Ajanta is open
  • 58.
    Adding a Restaurant @Component publicclass AvailableRestaurantRepositoryImpl implements AvailableRestaurantRepository { @Override public void add(Restaurant restaurant) { Store as addRestaurantDetails(restaurant); addAvailabilityIndexEntries(restaurant); JSON } private void addRestaurantDetails(Restaurant restaurant) { Text restaurantTemplate.opsForValue().set(keyFormatter.key(restaurant.getId()), restaurant); } private void addAvailabilityIndexEntries(Restaurant restaurant) { for (TimeRange tr : restaurant.getOpeningHours()) { String indexValue = formatTrId(restaurant, tr); key member int dayOfWeek = tr.getDayOfWeek(); int closingTime = tr.getClosingTime(); for (String zipCode : restaurant.getServiceArea()) { redisTemplate.opsForZSet().add(closingTimesKey(zipCode, dayOfWeek), indexValue, closingTime); } } } score
  • 59.
    Finding available Restaurants @Component publicclass AvailableRestaurantRepositoryImpl implements AvailableRestaurantRepository { @Override public List<AvailableRestaurant> findAvailableRestaurants(Address deliveryAddress, Date deliveryTime) { Find those that close String zipCode = deliveryAddress.getZip(); after int dayOfWeek = DateTimeUtil.dayOfWeek(deliveryTime); int timeOfDay = DateTimeUtil.timeOfDay(deliveryTime); String closingTimesKey = closingTimesKey(zipCode, dayOfWeek); Set<String> trsClosingAfter = redisTemplate.opsForZSet().rangeByScore(closingTimesKey, timeOfDay, 2359); Set<String> restaurantIds = new HashSet<String>(); for (String tr : trsClosingAfter) { Filter out those that String[] values = tr.split("_"); open after if (Integer.parseInt(values[0]) <= timeOfDay) restaurantIds.add(values[1]); } Collection<String> keys = keyFormatter.keys(restaurantIds); return availableRestaurantTemplate.opsForValue().multiGet(keys); Retrieve open } restaurants
  • 60.
  • 61.
    Agenda • Why polyglot persistence? • Using Redis as a cache • Optimizing queries using Redis materialized views • Synchronizing MySQL and Redis • Tracking changes to entities • Using a modular asynchronous architecture
  • 62.
    MySQL & Redis needto be consistent
  • 63.
    Two-Phase commit isnot an option • Redis does not support it • Even if it did, 2PC is best avoided http://www.infoq.com/articles/ebay-scalability-best-practices
  • 64.
    Atomic Consistent Basically Available Isolated Soft state Durable Eventually consistent BASE: An Acid Alternative http://queue.acm.org/detail.cfm?id=1394128
  • 65.
    Updating Redis #FAIL beginMySQL transaction update MySQL Redis has update update Redis MySQL does not rollback MySQL transaction begin MySQL transaction update MySQL MySQL has update commit MySQL transaction Redis does not <<system crashes>> update Redis
  • 66.
    Updating Redis reliably Step 1 of 2 begin MySQL transaction update MySQL ACID queue CRUD event in MySQL commit transaction Event Id Operation: Create, Update, Delete New entity state, e.g. JSON
  • 67.
    Updating Redis reliably Step 2 of 2 for each CRUD event in MySQL queue get next CRUD event from MySQL queue If CRUD event is not duplicate then Update Redis (incl. eventId) end if begin MySQL transaction mark CRUD event as processed commit transaction
  • 68.
    Step 1 Step 2 Timer EntityCrudEvent EntityCrudEvent apply(event) Redis Repository Processor Updater INSERT INTO ... SELECT ... FROM ... ENTITY_CRUD_EVENT ID JSON processed? Redis
  • 69.
    Optimistic locking Updating Redis WATCH restaurant:lastSeenEventId:≪restaurantId≫ lastSeenEventId = GET restaurant:lastSeenEventId:≪restaurantId≫ Duplicate if (lastSeenEventId >= eventId) return; detection MULTI SET restaurant:lastSeenEventId:≪restaurantId≫ eventId Transaction ... update the restaurant data... EXEC
  • 70.
    Agenda • Why polyglot persistence? • Using Redis as a cache • Optimizing queries using Redis materialized views • Synchronizing MySQL and Redis • Tracking changes to entities • Using a modular asynchronous architecture
  • 71.
    How do wegenerate CRUD events?
  • 72.
    Change tracking options • Explicit code • Hibernate event listener • Service-layer aspect • CQRS/Event-sourcing
  • 73.
    HibernateEvent EntityCrudEvent Listener Repository ENTITY_CRUD_EVENT ID JSON processed?
  • 74.
    Hibernate event listener publicclass ChangeTrackingListener implements PostInsertEventListener, PostDeleteEventListener, PostUpdateEventListener { @Autowired private EntityCrudEventRepository entityCrudEventRepository; private void maybeTrackChange(Object entity, EntityCrudEventType eventType) { if (isTrackedEntity(entity)) { entityCrudEventRepository.add(new EntityCrudEvent(eventType, entity)); } } @Override public void onPostInsert(PostInsertEvent event) { Object entity = event.getEntity(); maybeTrackChange(entity, EntityCrudEventType.CREATE); } @Override public void onPostUpdate(PostUpdateEvent event) { Object entity = event.getEntity(); maybeTrackChange(entity, EntityCrudEventType.UPDATE); } @Override public void onPostDelete(PostDeleteEvent event) { Object entity = event.getEntity(); maybeTrackChange(entity, EntityCrudEventType.DELETE); }
  • 75.
    Agenda • Why polyglot persistence? • Using Redis as a cache • Optimizing queries using Redis materialized views • Synchronizing MySQL and Redis • Tracking changes to entities • Using a modular asynchronous architecture
  • 76.
    Original architecture WAR Restaurant Management ...
  • 77.
    Drawbacks of thismonolithic architecture WAR Restaurant • Obstacle to frequent deployments Management • Overloads IDE and web container ... • Obstacle to scaling development • Technology lock-in
  • 78.
    Need a moremodular architecture
  • 79.
    Using a messagebroker Asynchronous is preferred JSON is fashionable but binary format is more efficient
  • 80.
    Modular architecture CONSUMER Timer RESTAURANT OWNER Order taking Event Publisher Restaurant Management MySQL Redis Redis RabbitMQ Database Cache
  • 81.
    Benefits of amodular asynchronous architecture • Scales development: develop, deploy and scale each service independently • Redeploy UI frequently/independently • Improves fault isolation • Eliminates long-term commitment to a single technology stack • Message broker decouples producers and consumers
  • 82.
    Step 2 of2 for each CRUD event in MySQL queue get next CRUD event from MySQL queue Publish persistent message to RabbitMQ begin MySQL transaction mark CRUD event as processed commit transaction
  • 83.
    Message flow EntityCrudEvent Processor AvailableRestaurantManagement Service Redis Updater Spring Integration glue code RABBITMQ REDIS
  • 84.
    RedisUpdater AMQP <beans> <int:gateway id="redisUpdaterGateway" service-interface="net...RedisUpdater" Creates proxy default-request-channel="eventChannel" /> <int:channel id="eventChannel"/> <int:object-to-json-transformer input-channel="eventChannel" output-channel="amqpOut"/> <int:channel id="amqpOut"/> <amqp:outbound-channel-adapter channel="amqpOut" amqp-template="rabbitTemplate" routing-key="crudEvents" exchange-name="crudEvents" /> </beans>
  • 85.
    AMQP Available...Service <beans> <amqp:inbound-channel-adapter channel="inboundJsonEventsChannel" connection-factory="rabbitConnectionFactory" queue-names="crudEvents"/> <int:channel id="inboundJsonEventsChannel"/> <int:json-to-object-transformer input-channel="inboundJsonEventsChannel" type="net.chrisrichardson.foodToGo.common.JsonEntityCrudEvent" output-channel="inboundEventsChannel"/> <int:channel id="inboundEventsChannel"/> Invokes service <int:service-activator input-channel="inboundEventsChannel" ref="availableRestaurantManagementServiceImpl" method="processEvent"/> </beans>
  • 86.
    Summary • Each SQL/NoSQL database = set of tradeoffs • Polyglot persistence: leverage the strengths of SQL and NoSQL databases • Use Redis as a distributed cache • Store denormalized data in Redis for fast querying • Reliable database synchronization required
  • 87.
    @crichardson chris.richardson@springsource.com http://plainoldobjects.com Questions? Sign up for CloudFoundry.com
  • 88.
  • 89.
    iPhone5 等你拿 第二天大会结束前,请不要提前离 ,将填写完整的意见反馈表投到签到处的抽奖箱内, 即可参与“iPhone5”抽奖活动。 89
  • 90.
    Birds of aFeather 专家面对面 所有讲师都会在课程结束后,到紫兰厅与来宾讨论课程上的问题 90