GeeCon 2011 - NoSQL and In Memory Data Grids from a developer perspective
 


GeeCon 2011 : NoSQL and In Memory Data Grids from a developer perspective by Cyrille Le Clerc and Michael Figuière - Xebia

Usage Rights

CC Attribution-NonCommercial-NoDerivs License

    Presentation Transcript

    • NoSQL & DataGrids from a Developer Perspective Cyrille Le Clerc - Michaël Figuière
    • Speaker @cyrilleleclerc blog.xebia.fr Cyrille Le Clerc Large Scale DataGrids Apache CXF
    • Speaker @mfiguiere blog.xebia.fr Michaël Figuière Distributed Systems NoSQL Search Engines
    • About NoSQL: “No SQL”, then “Not Only SQL”, then “Not Only Relational”
    • Once upon a time...
    • On the Web side
      • Similar needs for Web giants: huge amount of data, high availability, fault tolerance, scalability on commodity hardware
      • Amazon: created Dynamo, < 40 min of unavailability per year
      • Google: created BigTable & MapReduce, stores every web page of the Internet
    • Amazon: the birth of Dynamo
      • Order pipeline: Fill cart, Checkout, Payment, Process order, Prepare, Send
      • Some steps require complex requests, and temporary unavailability is acceptable
      • Other steps require high availability, and a key-value store is enough
    • On the Financial side
      • Needs within the financial market: very low latency, rich queries & transactions, scalability, data consistency
      • Coherence: released in 2001, started as a distributed cache
      • Gigaspaces XAP: released in 2001, routes the request inside the data
    • Data Partitioning and Replication
    • Use Case : Train Ticketing System With trains, stations, seats, booking and passengers
    • Store everything in a mainframe! IBM z11: up to 3 TB of RAM, more than $1,000,000
    • Data Partitioning: split data for scalability. The mainframe's data is split into partitions alpha, beta and gamma, hosted on small servers
    • Data Replication: duplicate data for high availability and scalability. Partition alpha is synchronized across Node 1, Node 2 and Node 3
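The partitioning and replication slides above can be sketched in a few lines of Java. This is only an illustration under simple assumptions: the class `PartitionRouter`, its method names and the "next nodes in ring order" replica placement are hypothetical and do not come from the talk or from any particular product.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: maps a key to a partition, and a partition to its replica nodes. */
class PartitionRouter {
    private final int partitionCount;
    private final int nodeCount;
    private final int replicaCount;

    PartitionRouter(int partitionCount, int nodeCount, int replicaCount) {
        if (partitionCount <= 0 || nodeCount <= 0
                || replicaCount <= 0 || replicaCount > nodeCount) {
            throw new IllegalArgumentException("invalid cluster layout");
        }
        this.partitionCount = partitionCount;
        this.nodeCount = nodeCount;
        this.replicaCount = replicaCount;
    }

    /** Partition index in [0, partitionCount), derived from the key's hash. */
    int partitionFor(String key) {
        // floorMod keeps the result non-negative even for negative hash codes
        return Math.floorMod(key.hashCode(), partitionCount);
    }

    /** Nodes holding a copy of the partition: a first node, then the following nodes in ring order. */
    List<Integer> nodesFor(int partition) {
        List<Integer> nodes = new ArrayList<>();
        for (int i = 0; i < replicaCount; i++) {
            nodes.add((partition + i) % nodeCount);
        }
        return nodes;
    }
}
```

With three partitions, three nodes and two copies per partition, `new PartitionRouter(3, 3, 2).nodesFor(2)` places the last partition on nodes 2 and 0. Real stores typically use more elaborate schemes (for example consistent hashing) so that adding a node moves as little data as possible.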
    • Partitioned Data Modeling
    • Partitioned Data Modeling: typical relational data model
      Entities: Seat (number, price), Booking (reduction), Passenger (name), Train (code, type), TrainStation (code, name), TrainStop (date)
    • Partitioned Data Modeling: partitioning-ready entities tree
      Entities: Seat (number, price), Booking (reduction), Passenger (name), Train (code, type), TrainStation (code, name), TrainStop (date)
      The tree hangs from a root entity; reference data is duplicated in each partition
      Find the root entity and denormalize
    • Partitioned Data Modeling: remove unused data
      Entities: Seat (number, price, booked), Booking (reduction), Passenger (name), Train (code, type), TrainStation (code, name), TrainStop (date)
    • Partitioned Data Modeling: sharding-ready data structure
      Entities: Seat (number, price, booked), Train (code, type), TrainStation (code, name), TrainStop (date)
    • Consistency, Availability and Partition Tolerance
    • Data Consistency with replicas: write to all (Node 1, Node 2, Node 3), read from one
      { "name": "Barbie Computer", "price": 15.50, "tags": [ "doll", "barbie" ] }
    • Data Consistency with replicas: write to one, read from all (Node 1, Node 2, Node 3)
      { "name": "Barbie Computer", "price": 15.50, "tags": [ "doll", "barbie" ] }
    • Data Consistency with replicas
      • You can adjust the balance between number of writes and number of reads
      • See Eventual Consistency
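The balance between writes and reads described above is commonly expressed as a quorum rule: with N replicas, W replicas acknowledging each write and R replicas consulted on each read, a read is guaranteed to reach at least one up-to-date replica when R + W > N. The slides do not state this formula; the sketch below is an illustration, and `QuorumConfig` and its members are hypothetical names.

```java
/** Hypothetical helper illustrating the read/write balance with N replicas. */
class QuorumConfig {
    final int replicas;     // N: number of copies of each piece of data
    final int writeAcks;    // W: replicas that must acknowledge a write
    final int readReplicas; // R: replicas consulted on a read

    QuorumConfig(int replicas, int writeAcks, int readReplicas) {
        if (replicas < 1
                || writeAcks < 1 || writeAcks > replicas
                || readReplicas < 1 || readReplicas > replicas) {
            throw new IllegalArgumentException("W and R must be between 1 and N");
        }
        this.replicas = replicas;
        this.writeAcks = writeAcks;
        this.readReplicas = readReplicas;
    }

    /** True when every read set overlaps every write set, i.e. R + W > N. */
    boolean readsSeeLatestWrite() {
        return readReplicas + writeAcks > replicas;
    }
}
```

"Write to all, read from one" is `new QuorumConfig(3, 3, 1)` and "write to one, read from all" is `new QuorumConfig(3, 1, 3)`; both satisfy the rule. `new QuorumConfig(3, 1, 1)` does not, which is the situation usually described as eventual consistency.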
    • Data Consistency with Multiple Data Centers: West Coast and East Coast each hold a copy
      { "name": "Barbie Computer", "price": 15.50, "tags": [ "doll", "barbie" ] }
    • Data Consistency with Multiple Data Centers: propagation delay!
      One copy, after "set price to $20.00": { "name": "Barbie Computer", "price": 20.00, "tags": [ "doll", "barbie" ] }
      Other copy: { "name": "Barbie Computer", "price": 15.50, "tags": [ "doll", "barbie" ] }
    • Data Consistency with Multiple Data Centers: reconciliation API needed!
      One copy, after "set price to $20.00": { "name": "Barbie Computer", "price": 20.00, "tags": [ "doll", "barbie" ] }
      Other copy, after "add tag girl": { "name": "Barbie Computer", "price": 15.50, "tags": [ "doll", "barbie", "girl" ] }
    • Data Consistency with Multiple Data Centers: network partitioning
      One copy, after "set price to $20.00": { "name": "Barbie Computer", "price": 20.00, "tags": [ "doll", "barbie" ] }
      Other copy, after "add tag girl": { "name": "Barbie Computer", "price": 15.50, "tags": [ "doll", "barbie", "girl" ] }
    • Data Consistency with Multiple Data Centers: London, New York, Tokyo. Worldwide replication for the financial market
    • CAP Theorem: only 2 of these 3 properties can be achieved in a storage system: Consistency, Availability, Partition Tolerance
    • CAP Theorem: Relational DB and NoSQL DB each sit between two of Consistency, Availability and Partition Tolerance; having all three is impossible
    • Data models & APIs
    • Request Driven Data Modeling
      • Relational data modeling is business driven: adaptation to requests comes with tuning
      • With partitioning, data modeling had to be adapted for requests, because network latency matters
      • NoSQL & DataGrids data modeling is request driven: two requests may require storing the data twice
    • Key-Value Store: in memory, in memory with async persistence, or persistent
    • Example with a user profile: the key is johndoe, the value is the user profile as byte[]. Similar to a Java HashMap
    • Write Example with Riak: inserts a user profile into Riak
      RiakClient riak = new RiakClient("http://server1:8098/riak");
      RiakObject userProfileObj = new RiakObject("bucket", "johndoe", serializer.serialize(userProfile));
      riak.store(userProfileObj);
    • Read Example with Riak: fetches a user profile using its key in Riak
      FetchResponse response = riak.fetch("bucket", "johndoe");
      if (response.hasObject()) {
        userProfileObj = response.getObject();
      }
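The "similar to a Java HashMap" remark above can be made concrete with a purely local sketch. It mimics only the get/set-by-key contract with opaque byte[] values; it is not the Riak client, and `InMemoryKeyValueStore` is a hypothetical name introduced for illustration.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-process stand-in for a key-value store: opaque byte[] values addressed by key. */
class InMemoryKeyValueStore {
    private final Map<String, byte[]> data = new ConcurrentHashMap<>();

    /** Stores a defensive copy of the value under the key, replacing any previous value. */
    void put(String key, byte[] value) {
        data.put(key, value.clone());
    }

    /** Returns a copy of the value, or an empty Optional when the key is unknown. */
    Optional<byte[]> get(String key) {
        byte[] value = data.get(key);
        return value == null ? Optional.empty() : Optional.of(value.clone());
    }
}
```

The store never looks inside the value, so the only possible query is a lookup by key. This is why serialization of the user profile is left to the application, as in the Riak example.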
    • Column Families Store
    • Column Families Store (relational DB vs. column families DB)
      • For each Row ID we have a list of key-value pairs
      • Key-value pairs are sorted by keys
    • Example with a shopping cart
      johndoe: 17:21 Iphone, 17:32 DVD Player, 17:44 MacBook
      willsmith: 6:10 Camera, 8:29 Ipad
      pitdavis: 14:45 PlayStation, 15:01 Asus EEE, 15:03 Iphone
    • Write Example with Cassandra: inserts a column into the ShoppingCartColumnFamily
      Cluster cluster = HFactory.getOrCreateCluster("cluster", new CassandraHostConfigurator("server1:9160"));
      Keyspace keyspace = HFactory.createKeyspace("EcommerceKeyspace", cluster);
      Mutator<String> mutator = HFactory.createMutator(keyspace, stringSerializer);
      mutator.insert("johndoe", "ShoppingCartColumnFamily", HFactory.createStringColumn("14:21", "Iphone"));
    • Read Example with Cassandra: reads a slice of 10 columns from ShoppingCartColumnFamily
      SliceQuery<String, String, String> query = HFactory.createSliceQuery(keyspace, stringSerializer, stringSerializer, stringSerializer);
      query.setColumnFamily("ShoppingCartColumnFamily")
           .setKey("johndoe")
           .setRange("", "", false, 10);
      QueryResult<ColumnSlice<String, String>> result = query.execute();
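The column family model above (one row ID, a list of key-value pairs sorted by key, slices of columns) can be pictured as a map of sorted maps. The following is a local sketch only, not the Hector or Cassandra API; `ColumnFamilySketch` and its methods are hypothetical names, and column names are assumed to be zero-padded so that string order matches time order.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical sketch of a column family: row ID -> columns kept sorted by column name. */
class ColumnFamilySketch {
    private final Map<String, NavigableMap<String, String>> rows = new HashMap<>();

    /** Inserts or overwrites one column of a row. */
    void insert(String rowId, String columnName, String value) {
        rows.computeIfAbsent(rowId, id -> new TreeMap<>()).put(columnName, value);
    }

    /** Returns up to limit column values of a row, in column-name order, starting at fromColumn (inclusive). */
    List<String> slice(String rowId, String fromColumn, int limit) {
        List<String> result = new ArrayList<>();
        NavigableMap<String, String> columns = rows.get(rowId);
        if (columns == null) {
            return result; // unknown row: empty slice
        }
        for (String value : columns.tailMap(fromColumn, true).values()) {
            if (result.size() == limit) {
                break;
            }
            result.add(value);
        }
        return result;
    }
}
```

Because the columns of a row stay sorted, fetching "the next 10 columns after the last one seen" is a cheap range scan, which is what makes slices a natural fit for pagination.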
    • Document Store
    • Example with an item of a catalog
      item_1: { "name": "Iphone", "price": 559.0, "vendor": "Apple", "rating": 4.6, "tags": [ "phone", "touch" ] }
      The database is aware of the document's fields and can offer complex queries
    • Write Example with MongoDB: inserts an item document into MongoDB
      Mongo mongo = new Mongo("mongos_1", 27017);
      DB db = mongo.getDB("Ecommerce");
      DBCollection catalog = db.getCollection("Catalog");
      BasicDBObject doc = new BasicDBObject();
      doc.put("name", "Iphone");
      doc.put("price", 559.0);
      catalog.insert(doc);
    • Read Example with MongoDB: queries for all items with a price lower than 600
      BasicDBObject query = new BasicDBObject();
      query.put("price", new BasicDBObject("$lt", 600));
      DBCursor cursor = catalog.find(query);
      while (cursor.hasNext()) {
        System.out.println(cursor.next());
      }
    • In Memory Data Grids eXtreme Scale
    • Example with train booking with IBM eXtreme Scale
      @Entity(schemaRoot=true)
      public class Train {
        @Id
        String code;
        @Index
        @Basic
        String name;
        @OneToMany(cascade=CascadeType.ALL)
        List<Seat> seats = new ArrayList<Seat>();
        @Version
        int version;
        ...
      }
      Diagram: Seat (number, price, booked), Train (code, type), TrainStop (date)
      With Data Grids, sub entities can have cross relations
    • Write Example with IBM eXtreme Scale: inserts a train into eXtreme Scale
      eXtreme Scale provides a JPA style API
      void persist(Train train) {
        entityManager.persist(train);
      }
    • Read Example with IBM eXtreme Scale: simple and complex queries with eXtreme Scale
      /** Find by key */
      Train findById(String id) {
        return (Train) entityManager.find(Train.class, id);
      }
      /** Query Language */
      Train findByTrain(String code) {
        Query q = entityManager.createQuery("select t from Train t where t.code=:code");
        q.setParameter("code", code);
        return (Train) q.getSingleResult();
      }
    • More APIs
      • Another Java EE versus Spring battle? JSR 347 Data Grids vs. Spring Data
      • Unified API on top of relational, document, column, key-value?
      • Object to tuple projection API
    • Transactions
    • Transactions
      • NoSQL usually means NO transactions
      • Except when it means eXtreme Transactions!
    • Transactions Concurrency: place order
      Orders: (canon-eos: 1, ipod: 1, headphone: 1, iphone: 1, ...), (ipad: 1, iphone: 1), (barbie: 1, iphone: 1, cabbage-doll: 1)
      Warehouse stocks: 231, 311, 121, 264, 2, 637, 12
      Concurrency on iphone; cancel the order if one product is missing
    • SQL Transactions: place order
      begin
      for each shoppingCart.product
        select for update ...
        update ...
      commit
      lock duration = f(shoppingcart.length)
      if too many locks on the rows, then lock table!
    • Transactions with Manual Compensation: place order, code “do”, “undo” and the chain
      For each product line of the order (-1 on its warehouse stock):
        DO: if (stock - quantity >= 0) { stock = stock - quantity; } else { throw exception(); }
        UNDO: stock = stock + quantity;
    • Transactions with Manual Compensation: order (barbie: 1, iphone: 1, cabbage-doll: 1), with barbie stock at 637 and iphone stock at 0
    • Transactions with Manual Compensation: the barbie “do” succeeds, its stock goes from 637 to 636
    • Transactions with Manual Compensation: the iphone “do” fails, no more iphone! An exception is thrown
    • Transactions with Manual Compensation: the iphone step is interrupted and the cabbage-doll step is cancelled
    • Transactions with Manual Compensation: the barbie step is undone (+1 on its stock)
    • Transactions with Manual Compensation
      • Code “do” & “undo” & chain execution
      • What about interrupted chain execution? Data corruption?
      • Data store managed transaction chain execution
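The "do", "undo" and chain execution described above can be sketched as follows. This is a single-threaded, in-memory illustration under stated assumptions: `WarehouseSketch` and its methods are hypothetical names, the stock check uses `>= 0` so that the last item in stock can still be sold, and nothing here protects against the interrupted chain (a crash between two steps) that the slides warn about.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of an order placed as a chain of "do" steps with manual "undo". */
class WarehouseSketch {
    private final Map<String, Integer> stocks = new HashMap<>();

    void setStock(String product, int quantity) {
        stocks.put(product, quantity);
    }

    int stockOf(String product) {
        return stocks.getOrDefault(product, 0);
    }

    /** DO step: decrement the stock, or fail when not enough items are left. */
    private void reserve(String product, int quantity) {
        int stock = stockOf(product);
        if (stock - quantity < 0) {
            throw new IllegalStateException("no more " + product);
        }
        stocks.put(product, stock - quantity);
    }

    /** UNDO step: give the reserved quantity back. */
    private void release(String product, int quantity) {
        stocks.put(product, stockOf(product) + quantity);
    }

    /** Runs the chain; returns false when the order was cancelled and the done steps were undone. */
    boolean placeOrder(Map<String, Integer> order) {
        Deque<Map.Entry<String, Integer>> done = new ArrayDeque<>();
        try {
            for (Map.Entry<String, Integer> line : order.entrySet()) {
                reserve(line.getKey(), line.getValue());
                done.push(line); // remember the step so it can be undone later
            }
            return true;
        } catch (IllegalStateException missingProduct) {
            // compensate in reverse order of execution
            while (!done.isEmpty()) {
                Map.Entry<String, Integer> line = done.pop();
                release(line.getKey(), line.getValue());
            }
            return false;
        }
    }
}
```

With barbie at 637 and iphone at 0, an order of one barbie, one iphone and one cabbage-doll reserves the barbie, fails on the iphone, never touches the cabbage-doll, and restores the barbie stock to 637. If the process dies between the failed step and the undo loop, the stock stays wrong, which is the data corruption risk that a data store managed chain execution is meant to remove.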
    • Which solution to choose?
    • Key-Value Store
      • Get and Set by key: simple but enough for a lot of use cases
      • Riak and Voldemort provide great scalability: great to persist continuously growing datasets
      • Memcached and Redis offer low overhead and latency: great for cache and live data
    • Column Families Store
      • Get and Set by key of a list of columns: makes it possible to fetch and update partial data
      • Queries are simple, but column slice fetching is possible: great for pagination
      • The data model is too low level for many complex data modeling needs: should typically be used for the largest scalability needs
    • Document Store
      • Schemaless: great for continuously updated schemas
      • Complex queries are available: necessary for filtering and search
      • Scalability may be limited if not querying using the partition key: can be handled using multiple storages and limited queries
    • In Memory Data Grid
      • Very low latency & eXtreme Transaction Processing (XTP): investment banking, booking & inventory systems
      • In memory, no persistence: most of the time backed with a database
      • High budget and developer skills required: some Open Source alternatives are appearing
    • Polyglot storage for eCommerce: the application uses
      • Solr for products search
      • MongoDB for the product catalog
      • Cassandra for user accounts and shopping carts
      • Coherence for warehouse inventory
    • Why do NoSQL & DataGrids matter?
      • Polyglot Storage: databases that fit the needs of every type of data
      • Linear Scalability: being able to handle any further business requirements
      • High Availability: multi-servers and multi-datacenters
      • Elasticity: natural integration with Cloud Computing philosophy
      • Some new use cases now available
    • Questions / Answers ?