NoSQL databases such as Redis, MongoDB and Cassandra are emerging as a compelling choice for many applications. They can simplify the persistence of complex data models and offer significantly better scalability and performance. However, using a NoSQL database means giving up the benefits of the relational model such as SQL, constraints and ACID transactions. For some applications, the solution is polyglot persistence: using SQL and NoSQL databases together. In this talk, you will learn about the benefits and drawbacks of polyglot persistence and how to design applications that use this approach. We will explore the architecture and implementation of an example application that uses MySQL as the system of record and Redis as a very high-performance database that handles queries from the front-end. You will learn about mechanisms for maintaining consistency across the various databases.
9. @crichardson
Agenda
• Why polyglot persistence?
• Using Redis as a cache
• Optimizing queries using Redis materialized views
• Synchronizing MySQL and Redis
• Tracking changes to entities
26. @crichardson
Redis use cases
• Replacement for Memcached
• Session state
• Cache of data retrieved from
system of record (SOR)
• Replica of SOR for queries
needing high-performance
• Handling tasks that overload an RDBMS
• Hit counts - INCR
• Most recent N items - LPUSH and
LTRIM
• Randomly selecting an item –
SRANDMEMBER
• Queuing – Lists with LPOP, RPUSH, ….
• High score tables – Sorted sets and
ZINCRBY
• …
27. @crichardson
The future is polyglot
IEEE Software Sept/October 2010 - Debasish Ghosh / Twitter @debasishg
e.g. Netflix
• RDBMS
• SimpleDB
• Cassandra
• Hadoop/Hbase
32. @crichardson
Polyglot Pattern: replicate from
MySQL to NoSQL
Query
Service
Update
Service
MySQL
Database
Reader Writer
Redis
System
of
RecordMaterialized
view query() update()
replicate
and
denormalize
33. @crichardson
Agenda
• Why polyglot persistence?
• Using Redis as a cache
• Optimizing queries using Redis materialized views
• Synchronizing MySQL and Redis
• Tracking changes to entities
34. @crichardson
Food to Go – Domain model (partial)
class Restaurant {
long id;
String name;
Set<String> serviceArea;
Set<TimeRange> openingHours;
List<MenuItem> menuItems;
}
class MenuItem {
String name;
double price;
}
class TimeRange {
long id;
int dayOfWeek;
int openTime;
int closeTime;
}
38. @crichardson
Caching Options
• Where:
• Hibernate 2nd level cache
• Explicit calls from application code
• Caching aspect
• Cache technologies: Ehcache, Memcached, Infinispan, ...
Redis is also an option
39. @crichardson
Using Redis as a cache
• Spring 3.1 cache abstraction
• Annotations specify which methods to cache
• CacheManager - pluggable back-end cache
• Spring Data for Redis
• Simplifies the development of Redis applications
• Provides RedisTemplate (analogous to JdbcTemplate)
• Provides RedisCacheManager
40. @crichardson
Using Spring 3.1 Caching
@Service
public class RestaurantManagementServiceImpl implements RestaurantManagementService {
private final RestaurantRepository restaurantRepository;
@Autowired
public RestaurantManagementServiceImpl(RestaurantRepository restaurantRepository) {
this.restaurantRepository = restaurantRepository;
}
@Override
public void add(Restaurant restaurant) {
restaurantRepository.add(restaurant);
}
@Override
@Cacheable(value = "Restaurant")
public Restaurant findById(int id) {
return restaurantRepository.findRestaurant(id);
}
@Override
@CacheEvict(value = "Restaurant", key="#restaurant.id")
public void update(Restaurant restaurant) {
restaurantRepository.update(restaurant);
}
Cache result
Evict from
cache
41. @crichardson
Configuring the Redis Cache
Manager
<cache:annotation-driven />
<bean id="cacheManager"
class="org.springframework.data.redis.cache.RedisCacheManager" >
<constructor-arg ref="restaurantTemplate"/>
</bean>
Enables caching
Specifies CacheManager
implementation
The RedisTemplate used
to access Redis
43. @crichardson
RedisTemplate handles the
mapping
• Principal API provided by Spring Data to Redis
• Analogous to JdbcTemplate
• Encapsulates boilerplate code, e.g. connection management
• Maps Java objects Redis byte[]’s
48. @crichardson
Agenda
• Why polyglot persistence?
• Using Redis as a cache
• Optimizing queries using Redis materialized views
• Synchronizing MySQL and Redis
• Tracking changes to entities
49. @crichardson
Finding available restaurants
Available restaurants =
Serve the zip code of the delivery address
AND
Are open at the delivery time
public interface AvailableRestaurantRepository {
List<AvailableRestaurant>
findAvailableRestaurants(Address deliveryAddress, Date deliveryTime);
...
}
50. @crichardson
Finding available restaurants on Monday, 6.15pm
for 94619 zipcode
Straightforward three-way join
select r.*
from restaurant r
inner join restaurant_time_range tr
on r.id =tr.restaurant_id
inner join restaurant_zipcode sa
on r.id = sa.restaurant_id
where ’94619’ = sa.zip_code
and tr.day_of_week=’monday’
and tr.openingtime <= 1815
and 1815 <= tr.closingtime
52. @crichardson
Option #1: Query caching
• [ZipCode, DeliveryTime] ⇨ list of available restaurants
BUT
• Long tail queries
• Update restaurant ⇨ Flush entire cache
Ineffective
53. @crichardson
Option #2: Master/Slave replication
MySQL
Master
MySQL
Slave 1
MySQL
Slave 2
MySQL
Slave N
Writes Consistent reads
Queries
(Inconsistent reads)
55. @crichardson
Option #3: Redis materialized
views
Order
taking
Restaurant
Management
MySQL
Database
CONSUMER
RESTAURANT
OWNER
CacheRedis
System
of
RecordCopy
findAvailable()
update()
56. @crichardson
BUT how to implement findAvailableRestaurants()
with Redis?!
?
select r.*
from restaurant r
inner join restaurant_time_range tr
on r.id =tr.restaurant_id
inner join restaurant_zipcode sa
on r.id = sa.restaurant_id
where ’94619’ = sa.zip_code
and tr.day_of_week=’monday’
and tr.openingtime <= 1815
and 1815 <= tr.closingtime
K1 V1
K2 V2
... ...
57. @crichardson
Solution: Build an index using sorted
sets and ZRANGEBYSCORE
ZRANGEBYSCORE myset 1 6
select value
from sorted_set
where key = ‘myset’
and score >= 1
and score <= 6
=
key value score
sorted_set
58. @crichardson
select r.*
from restaurant r
inner join restaurant_time_range tr
on r.id =tr.restaurant_id
inner join restaurant_zipcode sa
on r.id = sa.restaurant_id
where ’94619’ = sa.zip_code
and tr.day_of_week=’monday’
and tr.openingtime <= 1815
and 1815 <= tr.closingtime
select value
from sorted_set
where key = ?
and score >= ?
and score <= ?
?
How to transform the SELECT
statement?
64. @crichardson
zip_dow open_time_restaurant_id close_time
94707:Monday 1130_1 1430
94619:Monday 1130_1 1430
94707:Monday 1730_1 2130
94619:Monday 1730_1 2130
94619:Monday 0700_2 1430
...
Sorted Set [ Entry:Score, …]
[1130_1:1430, 1730_1:2130]
[0700_2:1430, 1130_1:1430, 1730_1:2130]
Using a Redis sorted set as an index
Key
94619:Monday
94707:Monday
94619:Monday 0700_2 1430
65. @crichardson
Querying with ZRANGEBYSCORE
ZRANGEBYSCORE 94619:Monday 1815 2359
è
{1730_1}
1730 is before 1815 è Ajanta is open
Delivery timeDelivery zip and day
Sorted Set [ Entry:Score, …]
[1130_1:1430, 1730_1:2130]
[0700_2:1430, 1130_1:1430, 1730_1:2130]
Key
94619:Monday
94707:Monday
66. @crichardson
Adding a Restaurant
Text
@Component
public class AvailableRestaurantRepositoryImpl implements AvailableRestaurantRepository {
@Override
public void add(Restaurant restaurant) {
addRestaurantDetails(restaurant);
addAvailabilityIndexEntries(restaurant);
}
private void addRestaurantDetails(Restaurant restaurant) {
restaurantTemplate.opsForValue().set(keyFormatter.key(restaurant.getId()), restaurant);
}
private void addAvailabilityIndexEntries(Restaurant restaurant) {
for (TimeRange tr : restaurant.getOpeningHours()) {
String indexValue = formatTrId(restaurant, tr);
int dayOfWeek = tr.getDayOfWeek();
int closingTime = tr.getClosingTime();
for (String zipCode : restaurant.getServiceArea()) {
redisTemplate.opsForZSet().add(closingTimesKey(zipCode, dayOfWeek), indexValue,
closingTime);
}
}
}
key member
score
Store as
JSON
67. @crichardson
Finding available Restaurants
@Component
public class AvailableRestaurantRepositoryImpl implements AvailableRestaurantRepository {
@Override
public List<AvailableRestaurant>
findAvailableRestaurants(Address deliveryAddress, Date deliveryTime) {
String zipCode = deliveryAddress.getZip();
int dayOfWeek = DateTimeUtil.dayOfWeek(deliveryTime);
int timeOfDay = DateTimeUtil.timeOfDay(deliveryTime);
String closingTimesKey = closingTimesKey(zipCode, dayOfWeek);
Set<String> trsClosingAfter =
redisTemplate.opsForZSet().rangeByScore(closingTimesKey, timeOfDay, 2359);
Set<String> restaurantIds = new HashSet<String>();
for (String tr : trsClosingAfter) {
String[] values = tr.split("_");
if (Integer.parseInt(values[0]) <= timeOfDay)
restaurantIds.add(values[1]);
}
Collection<String> keys = keyFormatter.keys(restaurantIds);
return availableRestaurantTemplate.opsForValue().multiGet(keys);
}
Find those that
close after
Filter out those that
open after
Retrieve open
restaurants
71. @crichardson
Agenda
• Why polyglot persistence?
• Using Redis as a cache
• Optimizing queries using Redis materialized views
• Synchronizing MySQL and Redis
• Tracking changes to entities
73. @crichardson
Two-Phase commit is not an
option
• Redis does not support it
• Even if it did, 2PC is best avoided http://www.infoq.com/articles/ebay-scalability-best-practices
75. @crichardson
Updating Redis #FAIL
begin MySQL transaction
update MySQL
update Redis
rollback MySQL transaction
Redis has update
MySQL does not
begin MySQL transaction
update MySQL
commit MySQL transaction
<<system crashes>>
update Redis
MySQL has update
Redis does not
76. @crichardson
Updating Redis reliably
Step 1 of 2
begin MySQL transaction
update MySQL
queue CRUD event in MySQL
commit transaction
ACID
Event Id
Operation: Create, Update, Delete
New entity state, e.g. JSON
77. @crichardson
Updating Redis reliably
Step 2 of 2
for each CRUD event in MySQL queue
get next CRUD event from MySQL queue
If CRUD event is not duplicate then
Update Redis (incl. eventId)
end if
begin MySQL transaction
mark CRUD event as processed
commit transaction
81. @crichardson
Agenda
• Why polyglot persistence?
• Using Redis as a cache
• Optimizing queries using Redis materialized views
• Synchronizing MySQL and Redis
• Tracking changes to entities
86. @crichardson
Summary
• Each SQL/NoSQL database = set of tradeoffs
• Polyglot persistence: leverage the strengths of SQL and
NoSQL databases
• Use Redis as a distributed cache
• Store denormalized data in Redis for fast querying
• Reliable database synchronization required