VMware NoSQL

Published in: Technology
  1. NoSQL / Spring Data
     Polyglot Persistence – An introduction to Spring Data
     Pronam Chatterjee, pronamc@vmware.com
     © 2011 VMware Inc. All rights reserved
  2. Presentation goal
     How Spring Data simplifies the development of NoSQL applications
  3. Agenda
     • Why NoSQL?
     • Overview of NoSQL databases
     • Introduction to Spring Data
     • Database APIs
       - MongoDB
       - HyperSQL
       - Neo4J
  4. Relational databases are great
     • SQL = rich, declarative query language
     • Database enforces referential integrity
     • ACID semantics
     • Well understood by developers
     • Well supported by frameworks and tools, e.g. Spring JDBC, Hibernate, JPA
     • Well understood by operations
       - Configuration
       - Care and feeding
       - Backups
       - Tuning
       - Failure and recovery
       - Performance characteristics
     • But…
  5. The trouble with relational databases
     • Object/relational impedance mismatch
       - Complicated to map a rich domain model to a relational schema
     • Relational schema is rigid
       - Difficult to handle semi-structured data, e.g. varying attributes
       - Schema changes = downtime or $$
     • Extremely difficult/impossible to scale writes:
       - Vertical scaling is limited/requires $$
       - Horizontal scaling is limited or requires $$
     • Performance can be suboptimal for some use cases
  6. NoSQL databases have emerged…
     Each one offers some combination of:
     • High performance
     • High scalability
     • Rich data model
     • Schema-less
     In return for:
     • Limited transactions
     • Relaxed consistency
     • …
  7. … but there are few commonalities
     • Everyone and their dog has written one
     • Different data models
       - Key-value
       - Column
       - Document
       - Graph
     • Different APIs – No JDBC, Hibernate, JPA (generally)
     • “Same sorry state as the database market in the 1970s before SQL was invented”
       http://queue.acm.org/detail.cfm?id=1961297
  8. NoSQL databases have emerged…
     • NoSQL usage small by comparison…
     • But growing…
  9. Agenda
     • Why NoSQL?
     • Overview of NoSQL databases
     • Introduction to Spring Data
     • Database APIs
       - MongoDB
       - HyperSQL
       - Neo4J
  10. Redis
      • Advanced key-value store
        - Think memcached on steroids (the good kind)
        - Values can be binary strings, Lists, Sets, Ordered Sets, Hash maps, …
        - Operations for each data type, e.g. appending to a list, adding to a set, retrieving a slice of a list, …
        - Provides pub/sub-based messaging
      • Very fast:
        - In-memory operations
        - ~100K operations/second on entry-level hardware
      • Persistent
        - Periodic snapshots of memory OR append commands to log file
        - Limits are size of keys retained in memory
      • Has “transactions”
        - Commands can be batched and executed atomically
      [Figure: key-value pairs K1 → V1, K2 → V2, K3 → V2]
  11. Scaling Redis
      • Master/slave replication
        - Tree of Redis servers
        - Non-persistent master can replicate to a persistent slave
        - Use slaves for read-only queries
      • Sharding
        - Client-side only – consistent hashing based on key
        - Server-side sharding – coming one day
      • Run multiple servers per physical host
        - Server is single threaded => leverage multiple CPUs
        - 32 bit more efficient than 64 bit
      • Optional "virtual memory"
        - Ideally data should fit in RAM
        - Values (not keys) written to disk
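The client-side sharding bullet above can be sketched in plain Java. This is a minimal illustration of consistent hashing on the key, not code from the talk and not the API of any Redis client: the class `ConsistentHashRing`, the server names, and the choice of MD5 with virtual nodes are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical client-side shard selector built on a consistent-hash ring. */
class ConsistentHashRing {
    // Ring position -> server name; TreeMap keeps positions sorted.
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int virtualNodes;

    ConsistentHashRing(List<String> servers, int virtualNodes) {
        this.virtualNodes = virtualNodes;
        for (String server : servers) {
            addServer(server);
        }
    }

    /** Places several virtual nodes per server to even out the key distribution. */
    void addServer(String server) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.put(hash(server + "#" + i), server);
        }
    }

    void removeServer(String server) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.remove(hash(server + "#" + i));
        }
    }

    /** Picks the first server at or after the key's ring position, wrapping around. */
    String serverFor(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no servers on the ring");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(key));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    /** First 8 bytes of an MD5 digest as a long (an assumed hash choice). */
    private static long hash(String s) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (digest[i] & 0xFF);
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The point of the ring is that removing a server only moves the keys that server owned; keys owned by the remaining servers keep their placement, which is why this scheme suits a cache spread over several Redis instances.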
  12. Redis use cases
      • Use in conjunction with another database as the SOR
      • Drop-in replacement for Memcached
        - Session state
        - Cache of data retrieved from SOR
        - Denormalized datastore for high-performance queries
      • Hit counts using INCR command
      • Randomly selecting an item – SRANDMEMBER
      • Queuing – Lists with LPOP, RPUSH, …
      • High score tables – Sorted sets
      Notable users: github, guardian.co.uk, …
  13. vFabric Gemfire - Elastic data fabric
      • High performance data grid
      • Enhanced parallel disk persistence
      • Non-disruptive up/down scalability
      • Session state
        - Cache of data retrieved from SOR
        - Denormalized datastore for high-performance queries
      • Heterogeneous data sharing
        - Java
        - .NET
        - C++
      • Co-located transactions
  14. Gemfire - Use Cases
      • Ultra low latency, high throughput applications
      • As an L2 cache in Hibernate
      • Distributed batch processing
      • Session state
        - Tomcat
        - tcServer
      • Wide area replication
  15. Neo4j
      • Graph data model
        - Collection of graph nodes
        - Typed relationships between nodes
        - Nodes and relationships have properties
      • High performance traversal API from roots
        - Breadth first/depth first
      • Query to find root nodes
        - Indexes on node/relationship properties
        - Pluggable - Lucene is the default
      • Graph algorithms: shortest path, …
      • Transactional (ACID) including 2PC
      • Deployment modes
        - Embedded – written in Java
        - Server with REST API
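To make "typed relationships" and "breadth first traversal from roots" concrete, here is a small in-memory sketch in plain Java. It deliberately does not use the Neo4j API; `TinyGraph`, `relate`, and `breadthFirst` are hypothetical names, and node properties are left out to keep the example short.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory graph with typed, directed relationships. */
class TinyGraph {
    private static final class Rel {
        final String type;
        final String to;
        Rel(String type, String to) {
            this.type = type;
            this.to = to;
        }
    }

    private final Map<String, List<Rel>> outgoing = new HashMap<>();

    /** Adds a relationship of the given type from one node to another. */
    void relate(String from, String type, String to) {
        outgoing.computeIfAbsent(from, k -> new ArrayList<>()).add(new Rel(type, to));
        outgoing.computeIfAbsent(to, k -> new ArrayList<>());
    }

    /**
     * Breadth-first traversal from a root, following only one relationship type,
     * up to maxDepth hops. Returns reached nodes in visit order, root excluded.
     */
    List<String> breadthFirst(String root, String relType, int maxDepth) {
        List<String> order = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        seen.add(root);
        Deque<String> frontier = new ArrayDeque<>();
        frontier.add(root);
        for (int depth = 0; depth < maxDepth && !frontier.isEmpty(); depth++) {
            Deque<String> next = new ArrayDeque<>();
            for (String node : frontier) {
                for (Rel rel : outgoing.getOrDefault(node, List.of())) {
                    // Set.add returns false for nodes already visited.
                    if (rel.type.equals(relType) && seen.add(rel.to)) {
                        order.add(rel.to);
                        next.add(rel.to);
                    }
                }
            }
            frontier = next;
        }
        return order;
    }
}
```

A "friends of friends" query is then `graph.breadthFirst("alice", "KNOWS", 2)`: a traversal from a known root rather than a join across tables.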
  16. Neo4j Data Model
  17. Neo4j Use Cases
      • Use cases
        - Anything social
        - Cloud/network management, i.e. tracking/managing physical/virtual resources
        - Any kind of geospatial data
        - Master data management
        - Bioinformatics
        - Fraud detection
        - Metadata management
      • Who is using it?
        - StudiVZ (the largest social network in Europe)
        - Fanbox
        - The Swedish military
        - And big organizations in datacom, intelligence, and finance that wish to remain anonymous
  18. MongoDB
      • Document-oriented database
        - JSON-style documents: Lists, Maps, primitives
        - Documents organized into collections (~table)
      • Full or partial document updates
        - Transactional update in place on one document
        - Atomic modifiers
      • Rich query language for dynamic queries
      • Index support – secondary and compound
      • GridFS for efficiently storing large files
      • Map/Reduce
  19. Data Model = Binary JSON documents
      {
        "name" : "Ajanta",
        "type" : "Indian",
        "serviceArea" : [ "94619", "94618" ],
        "openingHours" : [
          { "dayOfWeek" : "Monday", "open" : 1730, "close" : 2130 }
        ],
        "_id" : ObjectId("4bddc2f49d1505567c6220a0")
      }
      One document = one DDD aggregate
      • Sequence of bytes on disk = fast I/O
        - No joins/seeks
        - In-place updates when possible => no index updates
      • Transaction = update of single document
  20. MongoDB query by example
      • Find a restaurant that serves the 94619 zip code and is open at 6pm on a Monday
      {
        serviceArea: "94619",
        openingHours: {
          $elemMatch: { "dayOfWeek": "Monday", "open": {$lte: 1800}, "close": {$gte: 1800} }
        }
      }
      DBCursor cursor = collection.find(qbeObject);
      while (cursor.hasNext()) {
        DBObject o = cursor.next();
        …
      }
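What the query above asks the server to check can be spelled out in plain Java. The sketch below is not the MongoDB driver; it evaluates the same two conditions against a document held as nested `Map` and `List` values, and the class and method names (`RestaurantQuery`, `matches`) are made up for illustration.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory evaluation of the "serves zip, open at time" query. */
class RestaurantQuery {
    @SuppressWarnings("unchecked")
    static boolean matches(Map<String, Object> doc, String zip, String day, int time) {
        List<String> serviceArea = (List<String>) doc.get("serviceArea");
        List<Map<String, Object>> hours = (List<Map<String, Object>>) doc.get("openingHours");
        if (serviceArea == null || hours == null) {
            return false;
        }
        // serviceArea: "94619" matches when any array element equals the value.
        if (!serviceArea.contains(zip)) {
            return false;
        }
        // $elemMatch: one single array element must satisfy all three conditions.
        for (Map<String, Object> slot : hours) {
            if (day.equals(slot.get("dayOfWeek"))
                    && (Integer) slot.get("open") <= time      // $lte
                    && (Integer) slot.get("close") >= time) {  // $gte
                return true;
            }
        }
        return false;
    }
}
```

The `$elemMatch` detail matters: a restaurant open Monday morning and Tuesday evening must not match a Monday 6pm query, so the three conditions have to hold for the same `openingHours` entry rather than for the array as a whole.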
  21. MongoDB use cases
      • Use cases
        - Real-time analytics
        - Content management systems
        - Single document partial update
        - Caching
        - High volume writes
      • Who is using it?
        - Shutterfly, Foursquare
        - Bit.ly, Intuit
        - SourceForge, NY Times
        - GILT Groupe, Evite
        - SugarCRM
      Copyright (c) 2011 Chris Richardson. All rights reserved.
  22. Other NoSQL databases
      • SimpleDB – “key-value”
      • Cassandra – column-oriented database
      • CouchDB – document-oriented
      • Membase – key-value
      • Riak – key-value + links
      • HBase – column-oriented
      • …
      http://nosql-database.org/ has a list of 122 NoSQL databases
  23. Agenda
      • Why NoSQL?
      • Overview of NoSQL databases
      • Introduction to Spring Data
      • Database APIs
        - MongoDB
        - HyperSQL
        - Neo4J
  24. NoSQL Java APIs
      Database    Libraries
      Redis       Jedis, JRedis, JDBC-Redis, RJC
      Neo4j       Vendor-provided
      MongoDB     Vendor-provided Java driver
      Gemfire     Pure Java map API, Spring-Gemfire templates
      But
      • Usage patterns
      • Tedious configuration
      • Repetitive code
      • Error-prone code
      • …
  25. Spring Data Project Goals
      • Bring classic Spring value propositions to a wide range of NoSQL databases:
        - Productivity
        - Programming model consistency, e.g. <NoSQL>Template classes
        - “Portability”
  26. Spring Data sub-projects
      • Commons: Polyglot persistence
      • Key-Value: Redis, Riak
      • Document: MongoDB, CouchDB
      • Graph: Neo4j
      • GORM for NoSQL
      http://www.springsource.org/spring-data
  27. Many entry points to use
      • Auto-generated repository implementations
      • Opinionated APIs (think JdbcTemplate)
      • Object mapping (Java and GORM)
      • Cross-store persistence programming model
      • Productivity support in Roo and Grails
  28. Cloud Foundry supports NoSQL
      MongoDB and Redis are provided as services
      => Deploy your MongoDB and Redis applications in seconds
  29. Agenda
      • Why NoSQL?
      • Overview of NoSQL databases
      • Introduction to Spring Data
      • Database APIs
        - MongoDB
        - HyperSQL
        - Neo4J
  30. Three databases for today’s talk
      • Document database
      • Relational database
      • Graph database
  31. Three persistence strategies for today’s talk
      • Lower-level template approach
      • Conventions-based persistence (Hades)
      • Cross-store persistence using JPA and a NoSQL datastore
  32. Spring Template Patterns
      • Resource management
      • Callback methods
      • Exception translation
      • Simple query API
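The four bullets above describe a pattern that is easier to see in code. Below is a minimal sketch of it in plain Java, assuming nothing about Spring's actual classes: `StoreConnection`, `ConnectionCallback`, `StoreAccessException`, and `StoreTemplate` are hypothetical stand-ins written for this example.

```java
import java.io.IOException;
import java.util.function.Supplier;

/** Hypothetical connection type; stands in for a real driver's connection. */
interface StoreConnection extends AutoCloseable {
    String send(String command) throws IOException;

    @Override
    void close(); // narrowed so that closing never throws a checked exception
}

/** Callback method: user code that runs against a managed connection. */
interface ConnectionCallback<T> {
    T doInConnection(StoreConnection connection) throws IOException;
}

/** Unchecked exception callers see instead of driver-specific checked ones. */
class StoreAccessException extends RuntimeException {
    StoreAccessException(String message, Throwable cause) {
        super(message, cause);
    }
}

class StoreTemplate {
    private final Supplier<StoreConnection> connections;

    StoreTemplate(Supplier<StoreConnection> connections) {
        this.connections = connections;
    }

    /** Resource management and exception translation wrapped around a callback. */
    <T> T execute(ConnectionCallback<T> action) {
        try (StoreConnection connection = connections.get()) {
            return action.doInConnection(connection);
        } catch (IOException e) {
            throw new StoreAccessException("store access failed", e);
        }
    }

    /** Simple query API: a one-liner built on top of execute. */
    String send(String command) {
        return execute(connection -> connection.send(command));
    }
}
```

Callers never open or close a connection and never catch a checked exception; they either use the one-line convenience method or pass a callback to `execute` when they need several operations on the same connection.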
  33. Repository Implementation
  34. HyperSQL
      • Also known as HSQLDB or Hypersonic SQL
      • Relational database
      • Table-oriented data model
      • SQL used for queries
      • … you know the rest…
  35. Spring Data Repository Support
      • Eliminate boilerplate code – only finder methods
      • findByLastName – Specifications for type-safe queries
      • JPA CriteriaBuilder integration
      • QueryDSL
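The idea behind "only finder methods" is that the developer declares an interface and an implementation is generated from the method names. The sketch below imitates that with a JDK dynamic proxy over an in-memory list. It is not how Spring Data is implemented; `DerivedFinderDemo`, `Person`, `PersonRepository`, and `repository` are hypothetical, and only single-argument `findBy<Property>` methods are handled.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

class DerivedFinderDemo {
    /** Hypothetical entity. */
    static class Person {
        final String firstName;
        final String lastName;

        Person(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }
    }

    /** The only code a developer would write: a finder method declaration. */
    interface PersonRepository {
        List<Person> findByLastName(String lastName);
    }

    /** Builds a repository whose findBy<Property> methods filter the given list. */
    @SuppressWarnings("unchecked")
    static <R> R repository(Class<R> type, List<?> store) {
        return (R) Proxy.newProxyInstance(
                type.getClassLoader(),
                new Class<?>[] {type},
                (proxy, method, args) -> {
                    String name = method.getName();
                    if (!name.startsWith("findBy") || name.length() <= 6
                            || args == null || args.length != 1) {
                        throw new UnsupportedOperationException(name);
                    }
                    // "findByLastName" -> property "lastName"
                    String property = Character.toLowerCase(name.charAt(6)) + name.substring(7);
                    List<Object> result = new ArrayList<>();
                    for (Object entity : store) {
                        Field field = entity.getClass().getDeclaredField(property);
                        field.setAccessible(true);
                        if (Objects.equals(field.get(entity), args[0])) {
                            result.add(entity);
                        }
                    }
                    return result;
                });
    }
}
```

Deriving the query from the method name is what removes the boilerplate; a real implementation would translate the same name into a JPA or MongoDB query instead of scanning a list.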
  36. QueryDSL
      • Type-safe queries in Java for multiple backends, including JPA, SQL and MongoDB
      • Generate query classes using Java APT
      • Code completion in IDE
      • Domain types and properties can be referenced safely
      • Adapts better to refactoring changes in domain types
      http://www.querydsl.com
  37. QueryDSL
      • Repository support
      • Spring Data JPA
      • Spring Data Mongo
      • Spring Data JDBC extensions
      • QueryDslJdbcTemplate
  38. Spring Data Neo4J
      • Uses AspectJ support to provide a new programming model
      • Use annotations to define POJO entities
      • Constructor advice automatically handles entity creation
      • Entity field state persisted to graph using aspects
      • Leverage graph database APIs from POJO model
      • Annotation-driven indexing of entities for search
  39. Spring Data Graph Neo4J cross-store
      • JPA data and “NOSQL” data can share a data model
      • Separate the persistence provider by using annotations
        – could be the entire Entity
        – or some of the fields of an Entity
      • We call this cross-store persistence
        – One transaction manager to coordinate the “NOSQL” store with the JPA relational database
        – AspectJ support to manage the “NOSQL” entities and fields
          • holds on to changed values in “change sets” until the transaction commits, for non-transactional data stores
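The "change sets" bullet is the least obvious part of this slide, so here is a minimal sketch of the idea in plain Java. It is an illustration under stated assumptions, not Spring Data's implementation: the `ChangeSet` class is hypothetical, and a `Map` stands in for a non-transactional store in which every write is immediately visible.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical change set: buffers field writes destined for a
 * non-transactional store until the surrounding transaction ends.
 */
class ChangeSet {
    private final Map<String, Object> pending = new LinkedHashMap<>();
    private final Map<String, Object> store; // stand-in for the NoSQL store

    ChangeSet(Map<String, Object> store) {
        this.store = store;
    }

    /** Records the new value without touching the store. */
    void set(String field, Object value) {
        pending.put(field, value);
    }

    /** Reads see pending values first, then fall back to the store. */
    Object get(String field) {
        return pending.containsKey(field) ? pending.get(field) : store.get(field);
    }

    /** Called when the transaction commits: flush buffered writes. */
    void commit() {
        store.putAll(pending);
        pending.clear();
    }

    /** Called when the transaction rolls back: drop buffered writes. */
    void rollback() {
        pending.clear();
    }
}
```

Because nothing reaches the store before `commit()`, a rollback of the JPA transaction can be mirrored on the NoSQL side simply by discarding the buffer, even though the store itself has no transaction support.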
  40. A cross-store scenario ...
      You have a traditional web app using JPA to persist data to a relational database ...
  41. JPA Data Model (slide dated 8/3/11)
  42. Cross-Store Data Model