GeeCon 2011 - NoSQL and In Memory Data Grids from a developer perspective

GeeCon 2011: NoSQL and In Memory Data Grids from a developer perspective, by Cyrille Le Clerc and Michaël Figuière - Xebia

Published in: Technology

Transcript of "GeeCon 2011 - NoSQL and In Memory Data Grids from a developer perspective"

  1. NoSQL & DataGrids from a Developer Perspective. Cyrille Le Clerc - Michaël Figuière
  2. Speaker: Cyrille Le Clerc, @cyrilleleclerc, blog.xebia.fr. Large Scale DataGrids, Apache CXF
  3. Speaker: Michaël Figuière, @mfiguiere, blog.xebia.fr. Distributed Systems, NoSQL, Search Engines
  4. About NoSQL: No SQL
  5. About NoSQL: Not Only SQL
  6. About NoSQL: Not Only Relational
  7. Once upon a time...
  8. On the Web side. Similar needs for Web giants:
     • Huge amount of data
     • High availability
     • Fault tolerance
     • Scalability on commodity hardware
     Amazon: created Dynamo; less than 40 min of unavailability per year.
     Google: created BigTable & MapReduce; stores every web page of the Internet.
  9. Amazon: the birth of Dynamo. Order pipeline: Fill cart, Checkout, Payment, Process order, Prepare, Send. Some steps require complex requests and tolerate temporary unavailability; others require high availability, and for those a key-value store is enough.
  10. On the Financial side. Needs within financial markets:
      • Very low latency
      • Rich queries & transactions
      • Scalability
      • Data consistency
      Coherence: released in 2001; started as a distributed cache.
      Gigaspaces XAP: released in 2001; routes the request inside the data.
  11. Data Partitioning and Replication
  12. Use Case: Train Ticketing System, with trains, stations, seats, bookings and passengers
  13. Store everything in a Mainframe! Up to 3 TB of RAM! More than $1,000,000. IBM z11
  14. Data Partitioning: split data for scalability. The data held by the mainframe is split into partitions (alpha, beta, gamma) hosted on small servers.
  15. Data Replication: duplicate data for high availability and scalability. Partition alpha is kept in sync across Node 1, Node 2 and Node 3.
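
The partitioning and replication slides above can be illustrated with a minimal Java sketch. It is not taken from the talk: the class `PartitionRouter`, the node names and the train code `TGV-6201` are hypothetical, and the placement rule (hash the key to pick a partition, then copy the partition onto consecutive nodes) is only one simple strategy among many.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: maps a key to a partition and a partition to its replica nodes. */
final class PartitionRouter {
    private final List<String> nodes;     // assumed node names, e.g. "node1"
    private final int partitionCount;     // number of partitions the data is split into
    private final int replicationFactor;  // number of copies kept of each partition

    PartitionRouter(List<String> nodes, int partitionCount, int replicationFactor) {
        if (nodes.isEmpty() || partitionCount <= 0
                || replicationFactor < 1 || replicationFactor > nodes.size()) {
            throw new IllegalArgumentException("invalid cluster layout");
        }
        this.nodes = new ArrayList<>(nodes);
        this.partitionCount = partitionCount;
        this.replicationFactor = replicationFactor;
    }

    /** Partitioning: the same key always lands in the same partition. */
    int partitionOf(String key) {
        return Math.floorMod(key.hashCode(), partitionCount);
    }

    /** Replication: a partition is stored on replicationFactor consecutive nodes. */
    List<String> replicasOf(int partition) {
        List<String> replicas = new ArrayList<>();
        for (int i = 0; i < replicationFactor; i++) {
            replicas.add(nodes.get((partition + i) % nodes.size()));
        }
        return replicas;
    }

    public static void main(String[] args) {
        PartitionRouter router =
                new PartitionRouter(Arrays.asList("node1", "node2", "node3"), 3, 2);
        int partition = router.partitionOf("TGV-6201"); // hypothetical train code
        System.out.println("partition " + partition + " stored on " + router.replicasOf(partition));
    }
}
```

Real stores usually refine this idea (consistent hashing, rack awareness, rebalancing when nodes join or leave); the sketch only shows why a key is enough to locate its data.
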
  16. Partitioned Data Modeling
  17. Partitioned Data Modeling. Typical relational data model: Seat (number, price), Booking (reduction), Passenger (name), Train (code, type), TrainStation (code, name), TrainStop (date).
  18. Partitioned Data Modeling. Partitioning-ready entity tree: find the root entity and denormalize. Same entities as the previous slide, with two annotations: "Root entity" and "Reference data duplicated in each partition".
  19. Partitioned Data Modeling. Remove unused data: Seat (number, price, booked), Booking (reduction), Passenger (name), Train (code, type), TrainStation (code, name), TrainStop (date).
  20. Partitioned Data Modeling. Sharding-ready data structure: Seat (number, price, booked), Train (code, type), TrainStation (code, name), TrainStop (date).
  21. Consistency, Availability and Partition Tolerance
  22. Data Consistency with replicas. Write to all (Node 1, Node 2, Node 3), read from one. Example document: { "name": "Barbie Computer", "price": 15.50, "tags": [ "doll", "barbie" ] }
  23. Data Consistency with replicas. Write to one, read from all (Node 1, Node 2, Node 3). Example document: { "name": "Barbie Computer", "price": 15.50, "tags": [ "doll", "barbie" ] }
  24. Data Consistency with replicas
      • You can adjust the balance between the number of replicas written and the number of replicas read
      • See Eventual Consistency
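
The balance between writes and reads mentioned above is commonly expressed as a quorum rule: with N replicas, W write acknowledgements and R replicas read, a read is guaranteed to touch at least one up-to-date replica when R + W > N. The sketch below is illustrative only; the class `QuorumConfig` and its method names are assumptions, not an API from the talk or from any particular store.

```java
/** Hypothetical helper: checks whether every read overlaps the latest acknowledged write. */
final class QuorumConfig {
    final int replicas; // N: copies of each item
    final int writes;   // W: replicas that must acknowledge a write
    final int reads;    // R: replicas consulted by a read

    QuorumConfig(int replicas, int writes, int reads) {
        if (replicas < 1 || writes < 1 || writes > replicas || reads < 1 || reads > replicas) {
            throw new IllegalArgumentException("need 1 <= W <= N and 1 <= R <= N");
        }
        this.replicas = replicas;
        this.writes = writes;
        this.reads = reads;
    }

    /** True when any read set and any write set share at least one replica (R + W > N). */
    boolean readsSeeLatestWrite() {
        return reads + writes > replicas;
    }

    public static void main(String[] args) {
        // "write to all, read from one" and "write to one, read from all" both overlap.
        System.out.println(new QuorumConfig(3, 3, 1).readsSeeLatestWrite());
        System.out.println(new QuorumConfig(3, 1, 3).readsSeeLatestWrite());
        // Writing to one and reading from one may return stale data (eventual consistency).
        System.out.println(new QuorumConfig(3, 1, 1).readsSeeLatestWrite());
    }
}
```
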
  25. Data Consistency with Multiple Data Centers. The West Coast and East Coast data centers each hold a copy of { "name": "Barbie Computer", "price": 15.50, "tags": [ "doll", "barbie" ] }
  26. Data Consistency with Multiple Data Centers. "Set price to $20.00" is applied in one data center, which now holds "price": 20.00, while the other still holds "price": 15.50. Propagation delay!
  27. Data Consistency with Multiple Data Centers. One data center sets the price to $20.00 while the other adds the tag "girl": the copies are now { "price": 20.00, "tags": [ "doll", "barbie" ] } and { "price": 15.50, "tags": [ "doll", "barbie", "girl" ] }. Reconciliation API needed!
  28. Data Consistency with Multiple Data Centers. Same conflicting updates (set price to $20.00, add tag "girl"), this time caused by network partitioning between the West Coast and East Coast data centers.
  29. Data Consistency with Multiple Data Centers. London, New York, Tokyo: worldwide replication for financial markets.
  30. CAP Theorem. Only 2 of these 3 properties can be achieved in a storage system: Consistency, Availability, Partition Tolerance.
  31. CAP Theorem. Diagram placing Relational DB and NoSQL DB among Consistency, Availability and Partition Tolerance; the combination of all three is marked "Impossible".
  32. Data models & APIs
  33. Request Driven Data Modeling
      • Relational data modeling is business driven
        Adaptation to requests comes with tuning
      • With partitioning, data modeling has to be adapted to requests
        Because network latency matters
      • NoSQL & DataGrids data modeling is request driven
        Two requests may require storing the data twice
  34. Key-Value Store. Three families: in memory; in memory with async persistence; persistent.
  35. Example with a user profile. Key: johndoe. Value: user profile as byte[]. Similar to a Java HashMap.
  36. Write Example with Riak. Inserts a user profile into Riak:

          RiakClient riak = new RiakClient("http://server1:8098/riak");
          RiakObject userProfileObj =
              new RiakObject("bucket", "johndoe", serializer.serialize(userProfile));
          riak.store(userProfileObj);

  37. Read Example with Riak. Fetches a user profile using its key in Riak:

          FetchResponse response = riak.fetch("bucket", "johndoe");
          if (response.hasObject()) {
              userProfileObj = response.getObject();
          }

  38. Column Families Store
  39. Column Families Store. For each Row ID we have a list of key-value pairs, and the key-value pairs are sorted by key. The slide compares the layout of a relational DB with that of a column families DB.
  40. Example with a shopping cart (Row ID, then time and item columns):
      johndoe:   17:21 Iphone, 17:32 DVD Player, 17:44 MacBook
      willsmith: 6:10 Camera, 8:29 Ipad
      pitdavis:  14:45 PlayStation, 15:01 Asus EEE, 15:03 Iphone
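
As a rough mental model of the column families layout above, each Row ID can be pictured as a key pointing to a sorted map of columns. The Java sketch below uses only the standard library; the class `ColumnFamilySketch` and its methods are hypothetical and do not reproduce the API of Cassandra or of any client library. Column names are compared as strings, so times would need zero padding ("06:10") to sort chronologically.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory model: row id -> columns sorted by column name. */
final class ColumnFamilySketch {
    private final Map<String, TreeMap<String, String>> rows = new HashMap<>();

    /** Adds or overwrites one column of a row. */
    void insert(String rowId, String columnName, String value) {
        rows.computeIfAbsent(rowId, k -> new TreeMap<>()).put(columnName, value);
    }

    /** Returns up to count values, starting at fromColumn (inclusive), in column order. */
    List<String> slice(String rowId, String fromColumn, int count) {
        List<String> values = new ArrayList<>();
        TreeMap<String, String> columns = rows.get(rowId);
        if (columns == null) {
            return values; // unknown row: empty slice
        }
        for (String value : columns.tailMap(fromColumn, true).values()) {
            if (values.size() == count) {
                break;
            }
            values.add(value);
        }
        return values;
    }

    public static void main(String[] args) {
        ColumnFamilySketch cart = new ColumnFamilySketch();
        cart.insert("johndoe", "17:44", "MacBook");
        cart.insert("johndoe", "17:21", "Iphone");
        cart.insert("johndoe", "17:32", "DVD Player");
        // Columns come back sorted by name, whatever the insertion order was.
        System.out.println(cart.slice("johndoe", "", 2));
    }
}
```

Because the columns of a row are kept sorted, reading a bounded slice of them is cheap, which is what makes this layout convenient for pagination.
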
  41. Write Example with Cassandra. Inserts a column into the ShoppingCartColumnFamily:

          Cluster cluster = HFactory.getOrCreateCluster("cluster",
              new CassandraHostConfigurator("server1:9160"));
          Keyspace keyspace = HFactory.createKeyspace("EcommerceKeyspace", cluster);
          Mutator<String> mutator = HFactory.createMutator(keyspace, stringSerializer);
          mutator.insert("johndoe", "ShoppingCartColumnFamily",
              HFactory.createStringColumn("14:21", "Iphone"));

  42. Read Example with Cassandra. Reads a slice of 10 columns from ShoppingCartColumnFamily:

          SliceQuery<String, String, String> query =
              HFactory.createSliceQuery(keyspace, stringSerializer, stringSerializer, stringSerializer);
          query.setColumnFamily("ShoppingCartColumnFamily")
               .setKey("johndoe")
               .setRange("", "", false, 10);
          QueryResult<ColumnSlice<String, String>> result = query.execute();

  43. Document Store
  44. Example with an item of a catalog. Key: item_1. Value: { "name": "Iphone", "price": 559.0, "vendor": "Apple", "rating": 4.6, "tags": [ "phone", "touch" ] }. The database is aware of the document's fields and can offer complex queries.
  45. Write Example with MongoDB. Inserts an item document into MongoDB:

          Mongo mongo = new Mongo("mongos_1", 27017);
          DB db = mongo.getDB("Ecommerce");
          DBCollection catalog = db.getCollection("Catalog");
          BasicDBObject doc = new BasicDBObject();
          doc.put("name", "Iphone");
          doc.put("price", 559.0);
          catalog.insert(doc);

  46. Read Example with MongoDB. Queries for all items with a price lower than 600:

          BasicDBObject query = new BasicDBObject();
          query.put("price", new BasicDBObject("$lt", 600));
          DBCursor cursor = catalog.find(query);
          while (cursor.hasNext()) {
              System.out.println(cursor.next());
          }

  47. In Memory Data Grids: eXtreme Scale
  48. Example with train booking with IBM eXtreme Scale. Entities: Seat (number, price, booked), Train (code, type), TrainStop (date). With Data Grids, sub-entities can have cross relations.

          @Entity(schemaRoot=true)
          public class Train {
              @Id
              String code;
              @Index
              @Basic
              String name;
              @OneToMany(cascade=CascadeType.ALL)
              List<Seat> seats = new ArrayList<Seat>();
              @Version
              int version;
              ...
          }

  49. Write Example with IBM eXtreme Scale. eXtreme Scale provides a JPA-style API. Inserts a train into eXtreme Scale:

          void persist(Train train) {
              entityManager.persist(train);
          }

  50. Read Example with IBM eXtreme Scale. Simple and complex queries with eXtreme Scale:

          /** Find by key */
          Train findById(String id) {
              return (Train) entityManager.find(Train.class, id);
          }

          /** Query Language */
          Train findByTrain(String code) {
              Query q = entityManager.createQuery("select t from Train t where t.code=:code");
              q.setParameter("code", code);
              return (Train) q.getSingleResult();
          }

  51. More APIs
      • Another Java EE versus Spring battle? JSR 347 Data Grids vs. Spring Data
        Unified API on top of relational, document, column, key-value? Object to tuple projection API
  52. Transactions
  53. Transactions
      • NoSQL usually means NO transactions
      • Except when it means eXtreme Transactions!
  54. Transactions Concurrency. Place order: three shopping carts, (canon-eos: 1, ipod: 1, headphone: 1, iphone: 1, ...), (ipad: 1, iphone: 1) and (barbie: 1, iphone: 1, cabbage-doll: 1), compete for the warehouse stocks (231, 311, 121, 264, 2, 637, 12). Concurrency on iphone. Cancel the order if one product is missing.
  55. SQL Transactions. Place order against the warehouse stocks:

          begin
          for each shoppingCart.product
              select for update ...
              update ...
          commit

      Lock duration = f(shoppingcart.length). If there are too many locks on the rows, then the table is locked!
  56. SQL Transactions. Place order: a first cart runs "select for update ..." on the warehouse stocks. Lock duration = f(shoppingcart.length). If there are too many locks on the rows, then the table is locked!
  57. SQL Transactions. Place order: "select for update ..." on the warehouse stocks, next animation step. Lock duration = f(shoppingcart.length). If there are too many locks on the rows, then the table is locked!
  58. Transactions with Manual Compensation. Place order: cart (canon-eos: 1, ipod: 1, headphone: 1, iphone: 1, ...). Each cart line applies -1 to a warehouse stock (231, 311, 121, 264, 2, 637, 12) through a DO step paired with an UNDO step:

          DO:   if (stock - quantity >= 0) {
                    stock = stock - quantity;
                } else {
                    throw exception();
                }
          UNDO: stock = stock + quantity;

      Code "do", "undo" and the chain.
  59. Transactions with Manual Compensation. Place order: cart (barbie: 1, iphone: 1, cabbage-doll: 1), with the same DO/UNDO pair on each line. Warehouse stocks: 231, 637, 311, 0, 264, 12, 121.
  60. Transactions with Manual Compensation. The first DO has run: the stock at 637 is now 636.
  61. Transactions with Manual Compensation. The iphone DO fails: no more iphone! An exception is thrown.
  62. Transactions with Manual Compensation. The chain is interrupted at the iphone step and the cabbage-doll step is cancelled.
  63. Transactions with Manual Compensation. The UNDO of the step that already ran is executed: +1 on the stock at 636.
  64. Transactions with Manual Compensation
      • Code "do" & "undo" & chain execution
      • What about interrupted chain execution? Data corruption?
  65. Transactions with Manual Compensation
      • Code "do" & "undo" & chain execution
      • What about interrupted chain execution? Data corruption?
      Answer shown on the slide: data store managed transaction chain execution.
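
The "do", "undo" and chain idea can be sketched in plain Java as follows. This is a single-process, in-memory illustration under stated assumptions: the class `CompensatingOrder`, the method `placeOrder` and the stock figures used in `main` are hypothetical. It does not address the question raised on the slide, namely what happens when the process running the chain dies midway; that is the case where a chain managed by the data store matters.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of manual compensation over an in-memory stock table. */
final class CompensatingOrder {

    /**
     * Runs DO for each cart line in order. If a line cannot be served, runs UNDO
     * for the lines already done, in reverse order, and reports failure.
     */
    static boolean placeOrder(Map<String, Integer> stocks, Map<String, Integer> cart) {
        Deque<Map.Entry<String, Integer>> done = new ArrayDeque<>();
        for (Map.Entry<String, Integer> line : cart.entrySet()) {
            int stock = stocks.getOrDefault(line.getKey(), 0);
            if (stock - line.getValue() < 0) {
                // Chain interrupted: compensate every step that already ran.
                while (!done.isEmpty()) {
                    Map.Entry<String, Integer> undone = done.pop();
                    stocks.merge(undone.getKey(), undone.getValue(), Integer::sum); // UNDO
                }
                return false;
            }
            stocks.put(line.getKey(), stock - line.getValue()); // DO
            done.push(line);
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Integer> stocks = new HashMap<>();
        stocks.put("barbie", 637);      // assumed figures, for illustration only
        stocks.put("iphone", 0);
        stocks.put("cabbage-doll", 12);

        Map<String, Integer> cart = new LinkedHashMap<>(); // keeps the cart order
        cart.put("barbie", 1);
        cart.put("iphone", 1);
        cart.put("cabbage-doll", 1);

        System.out.println("order placed: " + placeOrder(stocks, cart));
        System.out.println("stocks after compensation: " + stocks);
    }
}
```
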
  66. Which solution to choose?
  67. Key-Value Store
      • Get and Set by key
        Simple but enough for a lot of use cases
      • Riak and Voldemort provide great scalability
        Great to persist continuously growing datasets
      • Memcached and Redis offer low overhead and latency
        Great for cache and live data
  68. Column Families Store
      • Get and Set, by key, of a list of columns
        Makes it possible to fetch and update partial data
      • Queries are simple, but fetching slices of columns is possible
        Great for pagination
      • The data model is too low level for many complex data modeling needs
        Should typically be used for the largest scalability needs
  69. Document Store
      • Schemaless
        Great for continuously updated schemas
      • Complex queries are available
        Necessary for filtering and search
      • Scalability may be limited when not querying by partition key
        Can be handled using multiple stores and limited queries
  70. In Memory Data Grid
      • Very Low Latency & eXtreme Transaction Processing (XTP)
        Investment banking, booking & inventory systems
      • In Memory - No Persistence
        Most of the time backed by a database
      • High budget and developer skills required
        Some Open Source alternatives are appearing
  71. Polyglot storage for eCommerce. The application uses a different store per kind of data:
      • Products search: Solr
      • Product catalog: MongoDB
      • User account and shopping cart: Cassandra
      • Warehouse inventory: Coherence
  72. Why NoSQL & DataGrids matter
      • Polyglot Storage: databases that fit the needs of every type of data
      • Linear Scalability: being able to handle any further business requirements
      • High Availability: multi-server and multi-datacenter
      • Elasticity: natural integration with the Cloud Computing philosophy
      • Some new use cases are now possible
  73. Questions / Answers?