Polyglot persistence for Java Developers - August 2011 / @Oakjug

Polyglot persistence for Java
developers - moving out of the
relational comfort zone

Chris Richardson

Author of POJOs in Action
Founder of CloudFoundry.com
chris@chrisrichardson.net
@crichardson

Overall presentation goal

The joy and pain of
building Java
applications that
use NoSQL

8/19/11 Copyright (c) 2011 Chris Richardson. All rights reserved.
Slide 2

About Chris
•  Grew up in England and live in Oakland,
CA
•  25+ years of software development
experience including 14+ years of Java
•  Speaker at JavaOne, SpringOne,
PhillyETE, Devoxx, etc.
•  Organize the Oakland JUG and the
Groovy Grails meetup

http://www.theregister.co.uk/2009/08/19/springsource_cloud_foundry/

Slide 3

Agenda
o  Why NoSQL?
o  Overview of NoSQL databases
o  Introduction to Spring Data
o  Case study: POJOs in Action & NoSQL

Slide 4

Relational databases are great
o  SQL = Rich, declarative query language
o  Database enforces referential integrity
o  ACID semantics
o  Well understood by developers
o  Well supported by frameworks and tools, e.g. Spring
JDBC, Hibernate, JPA
o  Well understood by operations
n  Configuration
n  Care and feeding
n  Backups
n  Tuning
n  Failure and recovery
n  Performance characteristics
o  But….

Slide 5

Problem: Complex object graphs
o  Object/relational
impedance
mismatch
o  Complicated to
map rich domain
model to relational
schema
o  Performance issues
n  Many rows in many
tables
n  Many joins

Problem: Semi-structured data
o  Relational schema doesn’t easily handle
semi-structured data:
n  Varying attributes
n  Custom attributes on a customer record
o  Common solution = Name/value table
n  Poor performance
n  E.g. Finding specific attributes for customers
satisfying some criteria = multi-way outer
JOIN
n  Lack of constraints
o  Another solution = Serialize as blob
n  Fewer joins
n  BUT can’t be queried

Problem: Schema evolution
o  For example:
n  Add attributes to an object => add
columns to table
o  Schema changes =
n  Holding locks for a long time =>
application downtime
n  $$

Problem: Scaling
o  Scaling reads:
n  Master/slave
n  But beware of consistency issues
o  Scaling writes
n  Extremely difficult/impossible/expensive
n  Vertical scaling is limited and requires $$
n  Horizontal scaling is limited/requires $$

Solution: Buy high end technology

http://upload.wikimedia.org/wikipedia/commons/e/e5/Rising_Sun_Yacht.JPG

Solution: Hire more developers
o  Application-level sharding
o  Build your own middleware
o  …

http://www.trekbikes.com/us/en/bikes/road/race_performance/madone_4_series/madone_4_5

Solution: Use NewSQL
o  Led by Stonebraker
n  Current databases are designed for 1970s
hardware and for both OLTP and data
warehouses
n  http://www.slideshare.net/VoltDB/sql-
myths-webinar
o  NewSQL
n  Next generation SQL databases, e.g. VoltDB
n  Leverage multi-core, commodity hardware
n  In-memory
n  Horizontally scalable
n  Transparently shardable
n  ACID

NoSQL databases are emerging…
Each one offers
some combination
of:
o  Higher performance
o  Higher scalability
o  Richer data-model
o  Schema-less
In return for:
o  Limited transactions
o  Relaxed consistency
o  Unconstrained data
o  …

Slide 13

… but there are few commonalities

o  Everyone and their dog has written
one
o  Different data models
n  Key-value
n  Column
n  Document
n  Graph
o  Different APIs
o  No JDBC, Hibernate, JPA (generally)

“Same sorry state as the database market in the 1970s before SQL was invented”
http://queue.acm.org/detail.cfm?id=1961297

Slide 14

Future = multi-paradigm data storage
for enterprise applications

IEEE Software Sept/October 2010 - Debasish Ghosh / Twitter @debasishg

Slide 15

Agenda
o  Why NoSQL?

Slide 16

Redis
o  Advanced key-value store
n  Values can be binary strings, Lists, Sets,
Sorted Sets, Hashes, …
n  Data-type specific operations
o  Very fast
n  ~100K operations/second on entry-level
hardware
n  In-memory operations
o  Persistent
n  Periodic snapshots of memory OR
append commands to log file
o  Transactions within a single server
n  Atomic execution of batched commands
n  Optimistic locking

[Figure: key-value pairs K1 => V1, K2 => V2, K3 => V2]

Slide 17

Redis CLI
Sorted set member = value + score

redis> zadd mysortedset 5.0 a
(integer) 1
redis> zadd mysortedset 10.0 b
(integer) 1
redis> zadd mysortedset 1.0 c
(integer) 1
redis> zrange mysortedset 0 1
1) "c"
2) "a"
redis> zrangebyscore mysortedset 1 6
1) "c"
2) "a"
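
The session above can be mirrored in plain Java to make the sorted-set semantics concrete. This is a minimal in-memory sketch, not a Redis client: the class name SortedSetSketch and its method names are hypothetical, and it assumes non-negative rank bounds and inclusive score bounds.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for one Redis sorted set (illustration only, not a Redis client).
class SortedSetSketch {
    private final Map<String, Double> scores = new HashMap<>();

    // Like ZADD: returns 1 when the member is new, 0 when only its score changed.
    int zadd(double score, String member) {
        return scores.put(member, score) == null ? 1 : 0;
    }

    // Members ordered by ascending score, ties broken by member name.
    private List<Map.Entry<String, Double>> ordered() {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(scores.entrySet());
        entries.sort(Map.Entry.<String, Double>comparingByValue()
                .thenComparing(Map.Entry.<String, Double>comparingByKey()));
        return entries;
    }

    // Like ZRANGE, restricted to non-negative, inclusive rank bounds.
    List<String> zrange(int start, int stop) {
        List<Map.Entry<String, Double>> entries = ordered();
        List<String> members = new ArrayList<>();
        for (int i = Math.max(start, 0); i <= stop && i < entries.size(); i++) {
            members.add(entries.get(i).getKey());
        }
        return members;
    }

    // Like ZRANGEBYSCORE with inclusive score bounds.
    List<String> zrangeByScore(double min, double max) {
        List<String> members = new ArrayList<>();
        for (Map.Entry<String, Double> e : ordered()) {
            if (e.getValue() >= min && e.getValue() <= max) {
                members.add(e.getKey());
            }
        }
        return members;
    }
}
```

Adding a (5.0), b (10.0) and c (1.0) and then asking for ranks 0..1 or scores 1..6 yields c followed by a, as in the CLI session.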

Slide 18

Scaling Redis
o  Master/slave replication
n  Tree of Redis servers
n  Non-persistent master can replicate to a
persistent slave
n  Use slaves for read-only queries
o  Sharding
n  Client-side only – consistent hashing based
on key
n  Server-side sharding – coming one day
o  Run multiple servers per physical host
n  Server is single threaded => Leverage
multiple CPUs
n  32 bit more efficient than 64 bit
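
Client-side sharding with consistent hashing can be sketched as follows. This is an illustration under stated assumptions, not the algorithm of any particular Redis client: the class ConsistentHashRing, the use of CRC32 as the hash function, and the server names in the usage note are all hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

// Maps keys to servers by placing several points per server on a hash ring.
class ConsistentHashRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();

    ConsistentHashRing(List<String> servers, int pointsPerServer) {
        for (String server : servers) {
            for (int i = 0; i < pointsPerServer; i++) {
                ring.put(hash(server + "#" + i), server);
            }
        }
    }

    // CRC32 is an assumption chosen for brevity; real clients may hash differently.
    static long hash(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    // The owner is the first ring point at or after the key's hash, wrapping around.
    String serverFor(String key) {
        Map.Entry<Long, String> owner = ring.ceilingEntry(hash(key));
        return (owner != null ? owner : ring.firstEntry()).getValue();
    }
}
```

For example, new ConsistentHashRing(Arrays.asList("redis-a:6379", "redis-b:6379"), 100).serverFor("rest:1") always returns the same server for that key. When a server is added, a key either stays where it was or moves to the new server, which is the reason for preferring this scheme over hash-modulo-N.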

Slide 19

Downsides of Redis
o  Low-level API compared to SQL
o  Single threaded:
n  Multiple cores => multiple Redis servers
o  Master/slave failover is manual
o  Partitioning is done by the client
o  Dataset has to fit in memory

Redis use cases
o  Drop-in replacement for Memcached
n  Session state
n  Cache of data retrieved from SOR
o  Replica of SOR for queries needing high-
performance
o  Miscellaneous yet important
n  Counting using INCR command, e.g. hit counts
n  Most recent N items - LPUSH and LTRIM
n  Randomly selecting an item – SRANDMEMBER
n  Queuing – Lists with LPOP, RPUSH, ….
n  High score tables – Sorted sets and ZINCRBY
n  …

o  Notable users: github, guardian.co.uk, ….
Slide 21

Cassandra
o  An Apache open-source project
o  Developed by Facebook for inbox search
o  Column-oriented database/Extensible row store
n  The data model will hurt your brain
n  Row = map or map of maps
o  Fast writes = append to a log
o  Extremely scalable
n  Transparent and dynamic clustering
n  Rack and datacenter aware data replication
o  Tunable read/write consistency per operation
n  Writes: any, one replica, quorum of replicas, …, all
n  Read: one, quorum, …, all
o  CQL = “SQL”-like DDL and DML
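
The arithmetic behind tunable consistency can be written down in a few lines. This sketch relies on the usual rule of thumb that a read overlaps the latest acknowledged write when (replicas read) + (replicas written) > (replication factor); the class and method names are hypothetical and nothing here calls Cassandra.

```java
// Arithmetic behind per-operation consistency levels (illustration only).
class QuorumSketch {
    // A quorum is a strict majority of the replicas.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // True when every read set must intersect every write set.
    static boolean readOverlapsWrite(int replicationFactor, int writeReplicas, int readReplicas) {
        return readReplicas + writeReplicas > replicationFactor;
    }
}
```

With three replicas, quorum writes plus quorum reads (2 + 2 > 3) overlap, while writing to one replica and reading from one (1 + 1) does not.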
Slide 22

Cassandra data model
My Column family (within a key space)
Keys Columns

a colA: value1 colB: value2 colC: value3

b colA: value colD: value colE: value
A column has a
timestamp to

o  4-D map: keySpace x key x columnFamily x column => value
o  Arbitrary number of columns
o  Column names are dynamic; can contain data
o  Columns for a row are stored on disk in order
determined by comparator
o  One CF row = one DDD aggregate

Slide 23

Cassandra data model – insert/update
Keys Columns

a colA: value1 colB: value2 colC: value3

b colA: value colD: value colE: value

Transaction = updates to a row within a ColumnFamily

Insert(key=a, columnName=colZ, value=foo) Idempotent
Keys Columns

a colA: value1 colB: value2 colC: value3 colZ: foo

b colA: value colD: value colE: value

Slide 24

Cassandra query example – slice
Keys Columns

a colA: value1 colB: value2 colC: value3 colZ: foo

b colA: value colD: value colE: value

slice(key=a, startColumn=colA, endColumnName=colC)

Keys Columns

a colA: value1 colB: value2

You can also do a rangeSlice which returns a range of keys – less efficient
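
One way to internalize the data model is as a map of sorted maps. The following is a minimal in-memory sketch, not Cassandra or Hector code: ColumnFamilySketch and its methods are hypothetical, and it assumes both slice bounds are inclusive (the result pictured on the slide lists only colA and colB).

```java
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// One column family: row key -> (column name -> value), columns sorted by name.
class ColumnFamilySketch {
    private final Map<String, TreeMap<String, String>> rows = new TreeMap<>();

    // Insert or overwrite a column; repeating the same call leaves the same state.
    void insert(String key, String columnName, String value) {
        rows.computeIfAbsent(key, k -> new TreeMap<>()).put(columnName, value);
    }

    // Columns of one row whose names fall in [startColumn, endColumn] (inclusive by assumption).
    SortedMap<String, String> slice(String key, String startColumn, String endColumn) {
        TreeMap<String, String> row = rows.get(key);
        if (row == null) {
            return new TreeMap<>();
        }
        return new TreeMap<>(row.subMap(startColumn, true, endColumn, true));
    }
}
```

Because columns are kept in comparator order, a slice is a contiguous range of one row, which is why it is cheap compared with a scan over many keys.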

Slide 25

Super Column Families – one more
dimension
Keys Super columns

ScA ScB
a
colA: value1 colB: value2 colC: value3

b
colA: value colD: value colE: value

Insert(key=a, superColumn=scB, columnName=colZ, value=foo)

keySpace x key x columnFamily x superColumn x column -> value
Keys Super columns

ScA ScB
a
colA: value1 colB: value2 colC: value3 colZ: foo

b

Slide 26

Getting data with super slice

Keys Super columns

ScA ScB
a
colA: value1 colB: value2 colC: value3

b

superSlice(key=a, startColumn=scB, endColumnName=scC)

Keys Super columns

ScB
a
colC: value3

Slide 27

Cassandra CLI
$ bin/cassandra-cli -h localhost
Connected to: "Test Cluster" on localhost/9160
Welcome to cassandra CLI.
[default@unknown] use Keyspace1;
Authenticated to keyspace: Keyspace1
[default@Keyspace1] list restaurantDetails;
Using default limit of 100
-------------------
RowKey: 1
=> (super_column=attributes,
(column=json, value={"id":1,"name":"Ajanta","menuItems"....
[default@Keyspace1] get restaurantDetails['1']['attributes'];
=> (column=json, value={"id":1,"name":"Ajanta","menuItems"....

Slide 28

Scaling Cassandra
•  Client connects to any node
•  Dynamically add/remove nodes
•  Reads/Writes specify how many nodes
•  Configurable # of replicas
   •  adjacent nodes
   •  rack and data center aware

[Figure: ring of four nodes joined by arrows labelled "replicates".
Node 1: Token = A, Keys = [D, A]
Node 2: Token = B, Keys = [A, B]
Node 3: Token = C, Keys = [B, C]
Node 4: Token = D, Keys = [C, D]]

Slide 29

Downsides of Cassandra
o  Learning curve
o  Still maturing, currently v0.8.4
o  Limited queries, i.e. KV lookup
o  Transactions limited to a column
family row
o  Lacks an easy to use API

Slide 30

Cassandra use cases
o  Use cases
•  Big data
•  Multiple Data Center distributed database
•  Persistent cache
•  (Write intensive) Logging
•  High-availability (writes)
o  Who is using it
n  Digg, Facebook, Twitter, Reddit, Rackspace
n  Cloudkick, Cisco, SimpleGeo, Ooyala, OpenX
n  The largest production cluster has over 100
TB of data in over 150 machines. –
Cassandra web site

Slide 31

MongoDB
o  Document-oriented database
n  JSON-style documents: Lists, Maps, primitives
n  Documents organized into collections (~table)
n  Schema-less
o  Rich query language for dynamic queries
o  Asynchronous, configurable writes:
n  No wait
n  Wait for replication
n  Wait for write to disk
o  Very fast
o  Highly scalable and available:
n  Replica sets (generalized master/slave)
n  Sharding
n  Transparent to client

Slide 32

Data Model = Binary JSON documents
{
  "name" : "Sahn Maru",
  "type" : "Korean",
  "serviceArea" : [
    "94619",
    "94618"
  ],
  "openingHours" : [
    {
      "dayOfWeek" : "Wednesday",
      "open" : 1730,
      "close" : 2230
    }
  ],
  "_id" : ObjectId("4bddc2f49d1505567c6220a0")
}

One document = one DDD aggregate

DBObject o = new BasicDBObject();
o.put("name", "Sahn Maru");

DBObject mi = new BasicDBObject();
mi.put("name", "Daeji Bulgogi");
…
List<DBObject> mis = Collections.singletonList(mi);
o.put("menuItems", mis);

o  Sequence of bytes on disk = fast I/O
n  No joins/seeks
n  In-place updates when possible => no index updates
o  Transaction = update of single document

Slide 33

MongoDB CLI
$ bin/mongo
> use mydb
> r1 = {name: 'Ajanta'}
{name: 'Ajanta'}
> r2 = {name: 'Montclair Egg Shop'}
{name: 'Montclair Egg Shop'}
> db.restaurants.save(r1)
> r1
{ _id: ObjectId("98…"), name: "Ajanta"}
> db.restaurants.save(r2)
> r2
{ _id: ObjectId("66…"), name: "Montclair Egg Shop"}
> db.restaurants.find({name: /^A/})
{ _id: ObjectId("98…"), name: "Ajanta"}
> db.restaurants.update({name: "Ajanta"},
{name: "Ajanta Restaurant"})

Slide 34

MongoDB query by example
Find a restaurant that serves the 94619 zip code and is open at 6pm on a Monday

{
  serviceArea:"94619",
  openingHours: {
    $elemMatch : {
      "dayOfWeek" : "Monday",
      "open": {$lte: 1800},
      "close": {$gte: 1800}
    }
  }
}

DBCursor cursor = collection.find(qbeObject);
while (cursor.hasNext()) {
  DBObject o = cursor.next();
  …
}
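
The important property of $elemMatch is that all conditions must hold for the same array element. A plain-Java sketch of that rule, evaluated against a document held as nested maps and lists, might look like the following. It does not use the MongoDB driver; ElemMatchSketch and its helper methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Evaluates the "serves this zip and is open at this time" query in memory.
class ElemMatchSketch {
    // Builds one openingHours element.
    static Map<String, Object> hours(String dayOfWeek, int open, int close) {
        Map<String, Object> h = new HashMap<>();
        h.put("dayOfWeek", dayOfWeek);
        h.put("open", open);
        h.put("close", close);
        return h;
    }

    @SuppressWarnings("unchecked")
    static boolean matches(Map<String, Object> restaurant, String zip, String dayOfWeek, int time) {
        // serviceArea: "94619" matches when the array contains the value.
        List<String> serviceArea = (List<String>) restaurant.get("serviceArea");
        if (serviceArea == null || !serviceArea.contains(zip)) {
            return false;
        }
        List<Map<String, Object>> openingHours =
                (List<Map<String, Object>>) restaurant.get("openingHours");
        if (openingHours == null) {
            return false;
        }
        // $elemMatch: one single element must satisfy every condition.
        for (Map<String, Object> h : openingHours) {
            boolean sameDay = dayOfWeek.equals(h.get("dayOfWeek"));
            boolean alreadyOpen = ((Integer) h.get("open")) <= time;
            boolean notYetClosed = ((Integer) h.get("close")) >= time;
            if (sameDay && alreadyOpen && notYetClosed) {
                return true;
            }
        }
        return false;
    }
}
```

A restaurant open Monday 1130 to 1430 and Tuesday 1730 to 2230 does not match "Monday at 1800", even though some element is a Monday and some other element spans 1800.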

Slide 35

Scaling MongoDB
[Figure: client, mongos, Config Server (mongod), and two shards (Shard 1, Shard 2), each drawn as one master mongod plus replica mongods]
A shard consists of a replica set = generalization of master slave
Collections spread over multiple shards

Slide 36

Mongo Downsides
o  Server has a global write lock
n  Single writer OR multiple readers
è Long running queries blocks writers
o  Great that writes are not synchronous
n  BUT perhaps an asynchronous response
would be better than a synchronous
getLastError()

Interesting story: http://www.slideshare.net/eonnen/from-100s-to-100s-of-millions

MongoDB use cases
o  Use cases
n  High volume writes
n  Complex data
n  Semi-structured data
o  Who is using it?
n  Shutterfly, Foursquare
n  Bit.ly, Intuit
n  SourceForge, NY Times
n  GILT Groupe, Evite
n  SugarCRM

Slide 38

Other NoSQL databases
Type                                  Examples
Extensible columns/Column-oriented    HBase, SimpleDB
Graph                                 Neo4j
Key-value                             Membase
Document                              CouchDB

http://nosql-database.org/ lists 122+ NoSQL databases

Slide 39

Picking a database
Application requirement      Solution
Complex transactions/ACID    Relational database
Scaling                      NoSQL
Social data                  Graph database
Multiple datacenters         Cassandra
Highly-available writes      Cassandra
Flexible data                Document store
High write volumes           Mongo, Cassandra
Super fast cache             Redis
Adhoc queries                Relational or Mongo
…
http://highscalability.com/blog/2011/6/20/35-use-cases-for-choosing-your-next-nosql-database.html

Slide 40

Proceed with caution
o  Don’t commit to a
NoSQL DB until you
have done a
significant POC
o  Encapsulate your data
access code so you
can switch
o  Hope that one day
you won’t need ACID
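
Encapsulating data access usually means that callers depend on an interface and never see the database API. A minimal sketch of that idea, using hypothetical names (RestaurantNameRepository, InMemoryRestaurantNameRepository) and an in-memory map standing in for whichever store is chosen:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// The only type application code depends on; no database classes leak through it.
interface RestaurantNameRepository {
    void save(long id, String name);

    Optional<String> findNameById(long id);
}

// Stand-in implementation; a Redis-, Cassandra- or MongoDB-backed class
// would implement the same interface and could be swapped in.
class InMemoryRestaurantNameRepository implements RestaurantNameRepository {
    private final Map<Long, String> names = new HashMap<>();

    @Override
    public void save(long id, String name) {
        names.put(id, name);
    }

    @Override
    public Optional<String> findNameById(long id) {
        return Optional.ofNullable(names.get(id));
    }
}
```

Switching databases after a proof of concept then means writing one new implementation rather than touching every caller.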

Agenda
o  Why NoSQL?

Slide 42

NoSQL Java APIs

Database     Libraries
Redis        Jedis, JRedis, JDBC-Redis, RJC
Cassandra    Raw Thrift if you are a masochist; Hector, …
MongoDB      MongoDB provides a Java driver

Some are not so easy to use
Stylistic differences
Boilerplate code
…

Slide 43

Spring Data Project Goals
Bring classic Spring value propositions to a wide
range of NoSQL databases
=>
n  Productivity
n  Programming model consistency: E.g.
<NoSQL>Template classes
n  “Portability”

http://www.springsource.org/spring-data

Slide 44

Spring Data sub-projects
§ Commons: Polyglot persistence
§ Key-Value: Redis, Riak
§ Document: MongoDB, CouchDB
§ Graph: Neo4j
§ GORM for NoSQL
§ Various milestone releases
§ Redis 1.0.0.M4 (July 20th, 2011)
§ Document 1.0.0.M2 (April 9, 2011)
§ Graph - Neo4j Support 1.0.0 (April 19, 2011)
§ …
Slide 45

MongoTemplate
Simplifies data access
Translates exceptions
POJO <=> DBObject mapping

[Class diagram]
MongoTemplate
  properties: databaseName, userId, password, defaultCollectionName, writeConcern, writeResultChecking
  methods: save(), insert(), remove(), updateFirst(), findOne(), find(), …
  uses Mongo (Java Driver class)
  uses MongoConverter
<<interface>> MongoConverter
  write(Object, DBObject)
  read(Class, DBObject)
  implementations shown: SimpleMongoConverter, MongoMappingConverter
Slide 46

Richer mapping
Annotations define mapping:
@Document, @Id, @Indexed, @PersistenceConstructor,
@CompoundIndex, @DBRef, @GeoSpatialIndexed, @Value

Map fields instead of properties => no getters or setters required
Non-default constructor
Index generation

@Document
public class Person {
    @Id
    private ObjectId id;
    private String firstname;

    @Indexed
    private String lastname;

    @PersistenceConstructor
    public Person(String firstname, String lastname) {
        this.firstname = firstname;
        this.lastname = lastname;
    }

    ….
}

Slide 47

Generic Mongo Repositories
interface PersonRepository extends MongoRepository<Person, ObjectId> {
List<Person> findByLastname(String lastName);
}

<beans>
<mongo:repositories
base-package="net.chrisrichardson.mongodb.example.mongorepository"
mongo-template-ref="mongoTemplate" />
</beans>

Person p = new Person("John", "Doe");
personRepository.save(p);

Person p2 = personRepository.findOne(p.getId());

List<Person> johnDoes = personRepository.findByLastname("Doe");
assertEquals(1, johnDoes.size());

Slide 48

Support for the QueryDSL project

Generated from domain model class
Type-safe composable queries

QPerson person = QPerson.person;

Predicate predicate =
    person.homeAddress.street1.eq("1 High Street")
        .and(person.firstname.eq("John"));

List<Person> people = personRepository.findAll(predicate);

assertEquals(1, people.size());
assertPersonEquals(p, people.get(0));

Slide 49

Cross-store/polyglot persistence
@Entity
public class Person {
    // In Database
    @Id private Long id;
    private String firstname;
    private String lastname;

    // In MongoDB
    @RelatedDocument private Address address;
}

Person person = new Person(…);
entityManager.persist(person);

Person p2 = entityManager.find(…)

{ "_id" : ObjectId("….."),
"_entity_id" : NumberLong(1),
"_entity_class" : "net.. Person",
"_entity_field_name" : "address",
"zip" : "94611", "street1" : "1 High Street", …}

Slide 50

Agenda
o  Why NoSQL?
o  Case study: POJOs in Action &
NoSQL

Slide 51

Food to Go – placing a takeout
order
o  Customer enters delivery address and delivery time
o  System displays available restaurants = restaurants
that serve the zip code of the delivery address AND
are open at the delivery time

class Restaurant {
    long id;
    String name;
    Set<String> serviceArea;
    Set<TimeRange> openingHours;
    List<MenuItem> menuItems;
}

class TimeRange {
    long id;
    int dayOfWeek;
    int openingTime;
    int closingTime;
}

class MenuItem {
    String name;
    double price;
}

Slide 52

Database schema
RESTAURANT table
ID Name …
1 Ajanta
2 Montclair Eggshop

RESTAURANT_ZIPCODE table
Restaurant_id zipcode
1 94707
1 94619
2 94611
2 94619

RESTAURANT_TIME_RANGE table
Restaurant_id dayOfWeek openTime closeTime
1 Monday 1130 1430
1 Monday 1730 2130
2 Tuesday 1130 …

Slide 53

Finding available restaurants on
Monday, 7.30pm for 94619 zip
Straightforward three-way join

select r.*
from restaurant r
inner join restaurant_time_range tr
  on r.id = tr.restaurant_id
inner join restaurant_zipcode sa
  on r.id = sa.restaurant_id
where '94619' = sa.zip_code
  and tr.day_of_week = 'monday'
  and tr.openingtime <= 1930
  and 1930 <= tr.closingtime

Slide 54

Redis - Persisting restaurants is
“easy”
Multiple KV value pairs
rest:1:details [ name: “Ajanta”, … ]
rest:1:serviceArea [ “94619”, “94611”, …]
rest:1:openingHours [10, 11]
timerange:10 [“dayOfWeek”: “Monday”, ..]
timerange:11 [“dayOfWeek”: “Tuesday”, ..]

OR

Single KV hash
rest:1 [ name: “Ajanta”,
“serviceArea:0” : “94611”, “serviceArea:1” : “94619”,
“menuItem:0:name”, “Chicken Vindaloo”,
…]

OR

Single KV String
rest:1 { .. A BIG STRING/BYTE ARRAY, E.G. JSON }

Slide 55

BUT…
o  … we can only retrieve them via primary key
=>  We need to implement indexes
=>  Queries, not the data model, drive
NoSQL database design
o  But how can a key-value store support a
query that has:
n  A 3-way join
n  Multiple =
n  > and <
?

Slide 56

Simplification #1: Denormalization
Restaurant_id Day_of_week Open_time Close_time Zip_code

1 Monday 1130 1430 94707
1 Monday 1130 1430 94619
1 Monday 1730 2130 94707
1 Monday 1730 2130 94619
2 Monday 0700 1430 94619
…

SELECT restaurant_id, open_time
FROM time_range_zip_code
WHERE day_of_week = 'Monday'
AND zip_code = 94619
AND 1815 < close_time
AND open_time < 1815

Simpler query:
§  No joins
§  Two = and two <
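
The denormalized table is the cross product of a restaurant's time ranges and the zip codes it serves. A minimal sketch of producing those rows, assuming hypothetical names (DenormalizeSketch, TimeRange) and a plain space-separated row format chosen only for illustration:

```java
import java.util.ArrayList;
import java.util.List;

// Produces one row per (time range, zip code) pair for a restaurant.
class DenormalizeSketch {
    static final class TimeRange {
        final String dayOfWeek;
        final int open;
        final int close;

        TimeRange(String dayOfWeek, int open, int close) {
            this.dayOfWeek = dayOfWeek;
            this.open = open;
            this.close = close;
        }
    }

    // Row format: restaurant_id day_of_week open_time close_time zip_code
    static List<String> rows(long restaurantId, List<TimeRange> openingHours, List<String> serviceArea) {
        List<String> rows = new ArrayList<>();
        for (TimeRange tr : openingHours) {
            for (String zip : serviceArea) {
                rows.add(String.format("%d %s %04d %04d %s",
                        restaurantId, tr.dayOfWeek, tr.open, tr.close, zip));
            }
        }
        return rows;
    }
}
```

Two time ranges and two zip codes give four rows, so storage and write cost grow multiplicatively; that is the price paid for removing the joins.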

Slide 57

Simplification #2: Application filtering

SELECT restaurant_id, open_time
WHERE day_of_week = 'Monday'
AND zip_code = 94619
AND 1815 < close_time

Even simpler query:
•  No joins
•  Two = and one <
•  open_time < 1815 is checked by the application on the returned rows

Slide 58

Simplification #3: Eliminate multiple
=’s with concatenation

Restaurant_id Zip_dow Open_time Close_time

1 94707:Monday 1130 1430
1 94619:Monday 1130 1430
1 94707:Monday 1730 2130
1 94619:Monday 1730 2130
2 94619:Monday 0700 1430
…

SELECT …
WHERE zip_code_day_of_week = '94619:Monday'    (key)
AND 1815 < close_time    (range)

Slide 59

Sorted sets support range queries
Key Sorted Set [ Entry:Score, …]

94707:Monday [1130_1:1430, 1730_1:2130]

94619:Monday [0700_2:1430, 1130_1:1430, 1730_1:2130]

zipCode:dayOfWeek Member: OpeningTime_RestaurantId
Score: ClosingTime

ZRANGEBYSCORE 94619:Monday 1815 2359
=>
{1730_1}

1730 is before 1815 => Ajanta is open
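
The whole lookup (range query on the closing time, then application-side filtering on the opening time) can be emulated in memory. This is a sketch, not Redis client code: AvailableRestaurantIndexSketch and its methods are hypothetical, and a HashMap stands in for the sorted sets.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Index: "zip:day" -> (member "HHMM_restaurantId" -> score closingTime).
class AvailableRestaurantIndexSketch {
    private final Map<String, Map<String, Integer>> index = new HashMap<>();

    // Corresponds to a ZADD per (zip, day, time range) when a restaurant is saved.
    void addTimeRange(String zip, String day, String restaurantId, int open, int close) {
        String member = String.format("%04d_%s", open, restaurantId);
        index.computeIfAbsent(zip + ":" + day, k -> new HashMap<>()).put(member, close);
    }

    Set<String> findOpen(String zip, String day, int time) {
        Set<String> restaurantIds = new TreeSet<>();
        String paddedTime = String.format("%04d", time);
        Map<String, Integer> entries =
                index.getOrDefault(zip + ":" + day, Collections.<String, Integer>emptyMap());
        for (Map.Entry<String, Integer> e : entries.entrySet()) {
            // Step 1, done by ZRANGEBYSCORE <time> 2359: closes at or after the delivery time.
            boolean closesLater = e.getValue() >= time;
            // Step 2, done by the application: opening time prefix is not after the delivery time.
            String member = e.getKey();
            boolean alreadyOpen = member.substring(0, 4).compareTo(paddedTime) <= 0;
            if (closesLater && alreadyOpen) {
                restaurantIds.add(member.substring(member.lastIndexOf('_') + 1));
            }
        }
        return restaurantIds;
    }
}
```

Zero-padding the opening time to four digits matters: it makes the string comparison in step 2 agree with numeric order.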

Slide 60


What did I just do to query the data?
o  Wrote code to maintain an index
o  Reduced performance due to extra
writes

Slide 62

RedisTemplate-based code
@Repository
public class AvailableRestaurantRepositoryRedisImpl implements AvailableRestaurantRepository {

    // Not final: Spring injects this field after construction
    @Autowired private StringRedisTemplate redisTemplate;

    // Sorted set for one zipCode/dayOfWeek pair; score = closing time
    private BoundZSetOperations<String, String> closingTimes(int dayOfWeek, String zipCode) {
        return redisTemplate.boundZSetOps(AvailableRestaurantKeys.closingTimesKey(dayOfWeek, zipCode));
    }

    public List<AvailableRestaurant> findAvailableRestaurants(Address deliveryAddress, Date deliveryTime) {
        String zipCode = deliveryAddress.getZip();
        int timeOfDay = timeOfDay(deliveryTime);
        int dayOfWeek = dayOfWeek(deliveryTime);

        // Time ranges that close at or after the delivery time
        Set<String> closingTrs = closingTimes(dayOfWeek, zipCode).rangeByScore(timeOfDay, 2359);
        Set<String> restaurantIds = new HashSet<String>();
        String paddedTimeOfDay = FormattingUtil.format4(timeOfDay);
        // Application-side filter: keep members whose opening time is not after the delivery time
        for (String trId : closingTrs) {
            if (trId.substring(0, 4).compareTo(paddedTimeOfDay) <= 0)
                restaurantIds.add(StringUtils.substringAfterLast(trId, "_"));
        }

        // Fetch the JSON for all matching restaurants in one round trip
        Collection<String> jsonForRestaurants =
            redisTemplate.opsForValue().multiGet(AvailableRestaurantKeys.timeRangeRestaurantInfoKeys(restaurantIds));
        List<AvailableRestaurant> restaurants = new ArrayList<AvailableRestaurant>();
        for (String json : jsonForRestaurants) {
            restaurants.add(AvailableRestaurant.fromJson(json));
        }
        return restaurants;
    }
}

Slide 63

Redis – Spring configuration
@Configuration
public class RedisConfiguration extends AbstractDatabaseConfig {

@Bean
public RedisConnectionFactory jedisConnectionFactory() {
JedisConnectionFactory factory = new JedisConnectionFactory();
factory.setHostName(databaseHostName);
factory.setPort(6379);
factory.setUsePool(true);
JedisPoolConfig poolConfig = new JedisPoolConfig();
poolConfig.setMaxActive(1000);
factory.setPoolConfig(poolConfig);
return factory;
}

@Bean
public StringRedisTemplate stringRedisTemplate(RedisConnectionFactory factory) {
StringRedisTemplate template = new StringRedisTemplate();
template.setConnectionFactory(factory);
return template;
}
}

Slide 64

Cassandra: Easy to store
restaurants
Column Family: RestaurantDetails
Keys Columns

1 name: Ajanta type: Indian …

2 name: Montclair Egg Shop type: Breakfast …

OR
Column Family: RestaurantDetails
Keys Columns

1 details: { JSON DOCUMENT }

2 details: { JSON DOCUMENT }

Slide 65

Querying using Cassandra
o  Similar challenges to using Redis
o  Limited querying options
n  Row key – exact or range
n  Column name – exact or range
o  Use composite/concatenated keys
n  Prefix - equality match
n  Suffix - can be range scan
o  No joins => denormalize
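
The composite-key idea (equality on the prefix, range scan on the suffix) works on any store that keeps keys sorted. A minimal sketch using a TreeMap as the sorted store; CompositeKeySketch and its methods are hypothetical, and entries sharing the same prefix and suffix would overwrite each other in this simplified form.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Sorted store keyed by "<prefix>:<zero-padded suffix>".
class CompositeKeySketch {
    private final TreeMap<String, String> rows = new TreeMap<>();

    // Zero-padding keeps lexicographic key order consistent with numeric order.
    static String key(String prefix, int suffix) {
        return prefix + ":" + String.format("%04d", suffix);
    }

    void put(String prefix, int suffix, String value) {
        rows.put(key(prefix, suffix), value);
    }

    // Equality match on the prefix, inclusive range scan on the suffix.
    List<String> range(String prefix, int from, int to) {
        return new ArrayList<>(rows.subMap(key(prefix, from), true, key(prefix, to), true).values());
    }
}
```

Storing closing times under "94619:Mon" and asking for the range 1815 to 2359 returns only the entries for that zip code and day that close late enough; other prefixes are never touched.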

Slide 66

Cassandra: Find restaurants that close after the delivery
time and then filter
Key: 94619:Mon
Super columns (name = closing time):
1430 => 0700_2: JSON FOR EGG
1430 => 1130_1: JSON FOR AJANTA
2130 => 1730_1: JSON FOR AJANTA

SuperSlice
key = 94619:Mon
SliceStart = 1815
SliceEnd = 2359

Key: 94619:Mon
Super columns:
2130 => 1730_1: JSON FOR AJANTA

18:15 is after 17:30 => {Ajanta}

Slide 67

Cassandra/Hector code
import me.prettyprint.hector.api.Cluster;

public class CassandraHelper {
    // Not final: Spring injects this field after construction
    @Autowired private Cluster cluster;

    public <T> List<T> getSuperSlice(String keyspace, String columnFamily,
            String key, String sliceStart, String sliceEnd,
            SuperSliceResultMapper<T> resultMapper) {

        // Serializers for row key, super column name, column name and value
        SuperSliceQuery<String, String, String, String> q =
            HFactory.createSuperSliceQuery(HFactory.createKeyspace(keyspace, cluster),
                StringSerializer.get(), StringSerializer.get(), StringSerializer.get(), StringSerializer.get());
        q.setColumnFamily(columnFamily);
        q.setKey(key);
        // Super columns from sliceStart to sliceEnd, not reversed, at most 10000
        q.setRange(sliceStart, sliceEnd, false, 10000);

        QueryResult<SuperSlice<String, String, String>> qr = q.execute();

        SuperColumnRowProcessor<T> rowProcessor = new SuperColumnRowProcessor<T>(resultMapper);

        for (HSuperColumn<String, String, String> superColumn : qr.get().getSuperColumns()) {
            List<HColumn<String, String>> columns = superColumn.getColumns();
            rowProcessor.processRow(key, superColumn.getName(), columns);
        }
        return rowProcessor.getResult();
    }
}

Slide 68

MongoDB = easy to store
{
"_id": "1234",
"name": "Ajanta",
"serviceArea": ["94619", "99999"],
"openingHours": [
{
"dayOfWeek": 1,
"open": 1130,
"close": 1430
},
{
"dayOfWeek": 2,
"open": 1130,
"close": 1430
},
…
]
}

Slide 69

MongoDB = easy to query

{
  "serviceArea": "94619",
  "openingHours": {
    "$elemMatch": {
      "open": { "$lte": 1815},
      "dayOfWeek": 4,
      "close": { "$gte": 1815}
    }
  }
}
db.availableRestaurants.ensureIndex({serviceArea: 1})

Slide 70

MongoTemplate-based code
@Repository
public class AvailableRestaurantRepositoryMongoDbImpl
        implements AvailableRestaurantRepository {

    // Not final: Spring injects this field after construction
    @Autowired private MongoTemplate mongoTemplate;

    @Override
    public List<AvailableRestaurant> findAvailableRestaurants(Address deliveryAddress,
            Date deliveryTime) {
        int timeOfDay = DateTimeUtil.timeOfDay(deliveryTime);
        int dayOfWeek = DateTimeUtil.dayOfWeek(deliveryTime);

        // One openingHours element must match the day and span the delivery time
        Query query = new Query(where("serviceArea").is(deliveryAddress.getZip())
            .and("openingHours").elemMatch(where("dayOfWeek").is(dayOfWeek)
                .and("openingTime").lte(timeOfDay)
                .and("closingTime").gte(timeOfDay)));

        return mongoTemplate.find(AVAILABLE_RESTAURANTS_COLLECTION, query,
            AvailableRestaurant.class);
    }
}

mongoTemplate.ensureIndex("availableRestaurants",
    new Index().on("serviceArea", Order.ASCENDING));
Slide 71

MongoDB – Spring Configuration
@Configuration
public class MongoConfig extends AbstractDatabaseConfig {
private @Value("#{mongoDbProperties.databaseName}")
String mongoDbDatabase;

public @Bean MongoFactoryBean mongo() {
MongoFactoryBean factory = new MongoFactoryBean();
factory.setHost(databaseHostName);
MongoOptions options = new MongoOptions();
options.connectionsPerHost = 500;
factory.setMongoOptions(options);
return factory;
}

public @Bean
MongoTemplate mongoTemplate(Mongo mongo) throws Exception {
MongoTemplate mongoTemplate = new MongoTemplate(mongo, mongoDbDatabase);
mongoTemplate.setWriteConcern(WriteConcern.SAFE);
mongoTemplate.setWriteResultChecking(WriteResultChecking.EXCEPTION);
return mongoTemplate;
}
}

Slide 72

Summary
o  Relational databases are great but
n  Object/relational impedance mismatch
n  Relational schema is rigid
n  Extremely difficult/impossible to scale writes
n  Performance can be suboptimal
o  Each NoSQL database can solve some
combination of those problems BUT
n  Limited transactions
n  One day needing ACID => major rewrite
n  Query-driven, denormalized database design
n  …
=>
o  Carefully pick the NoSQL DB for your application
o  Consider a polyglot persistence architecture

Slide 74

Thank you!
My contact info:

chris@chrisrichardson.net

@crichardson

Slide 75
