NoSQL Overview



This presentation introduces NoSQL, the motivation behind it, and its basic flavors.



  • 1. NoSQL: What Is It and Why Would I Care? Eberhard Wolff, 21.09.11
  • 2. Alternative Databases: NoSQL
      ► NoSQL: Not only SQL
      ► A good example of a catchy but bad name
      ► Not a positive definition, rather "not something else"
      ► Now: even less clear
  • 3. Why NoSQL?
      ► Exponential data growth
      ► More and more connected data
          > Hypertext, blogs, user generated content
      ► Semi-structured data
          > User generated content
          > Full text search / indices instead of Query-by-Example
      ► Integration at the database level is less common
      ► Cloud prefers scale out over scale up
          > Cloud supports scale up: reboot into a larger machine
          > …but eventually you will need to scale out, i.e. add more machines
  • 4. NoSQL Flavors
      ► Key / value store
      ► Document
      ► Wide Column: lots of columns
      ► Graph Database: graphs with nodes, relationships and properties
      ► Object databases: store objects, not rows
      ► Note: NoSQL is actually vaguely defined
  • 5. Key-Value Stores
      ► Maps keys to values
          > Example: key 42 maps to the value "Some data"
      ► Just a large, globally available Map
      ► I.e. not a very powerful data model
      ► Advantages
          > Easy to understand
          > Easier to build scale out solutions (no joins, easy sharding etc.)
      ► Disadvantages
          > Simplistic data model
          > Not a good fit for complex data
          > Might add complexity to the application code
      ► Focus on scalability
      ► Redis: think cache + persistence
      ► Riak
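The "large Map" idea and the claim that sharding is easy can be sketched in a few lines of Python. This is an illustrative toy, not how Redis or Riak work internally: the class name `ShardedKeyValueStore`, the shard count and the use of MD5 to pick a shard are all assumptions made for the example.

```python
import hashlib


class ShardedKeyValueStore:
    """Toy key-value store (hypothetical): every shard is a plain dict."""

    def __init__(self, shard_count=4):
        self.shards = [{} for _ in range(shard_count)]

    def _shard_for(self, key):
        # A stable hash of the key alone decides the shard, so a lookup
        # never has to touch more than one shard (no joins).
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return self.shards[int(digest, 16) % len(self.shards)]

    def put(self, key, value):
        self._shard_for(key)[key] = value

    def get(self, key, default=None):
        return self._shard_for(key).get(key, default)


store = ShardedKeyValueStore()
store.put("42", "Some data")
print(store.get("42"))
```

Because the store knows nothing about the value, anything beyond get and put by key has to be built in application code, which is the added complexity the slide mentions.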
  • 6. Key Value Store: Hybrid Approach
      ► Might just be used to store specific data
      ► E.g. scores of players in an online game
          > No complex structure
          > Need to scale
          > Lots of reads and writes
      ► Player name, age, address would still be in an RDBMS
      ► Hybrid approach
  • 7. Key-Value Stores: Store All Data
      ► Storing data as serialized blobs
          > "user:someuser" → "someuser||more|data|here"
      ► Storing data as multiple keys
          > "user:username:someuser" → "someuser"
          > "user:email:someuser" → ""
          > Requires multi get/set to be efficient
          > Allows some querying if the database supports wildcards, like "user:email:someuser*"
      ► Storing links
          > Blob: "basket:someuser" → "...|item|1|product|product:123|..."
          > Separate keys: "basket:someuser:item:1:product" → "product:123"
              – Multi-get: "basket:someuser:*" loads the shopping basket and all items
      ► Easy to understand, hard to implement
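The two storage variants and the wildcard multi-get can be tried out with a plain dictionary. This is a minimal sketch: the `multi_get` helper and the second basket item (`product:456`) are hypothetical, and `fnmatchcase` only stands in for whatever wildcard support a real store offers.

```python
from fnmatch import fnmatchcase

store = {}

# Variant 1: the whole user is one serialized blob behind a single key.
store["user:someuser"] = "someuser||more|data|here"

# Variant 2: one key per attribute or link (second item is made up).
store["basket:someuser:item:1:product"] = "product:123"
store["basket:someuser:item:2:product"] = "product:456"


def multi_get(kv, pattern):
    """Return all entries whose key matches a shell-style wildcard pattern."""
    return {key: value for key, value in kv.items() if fnmatchcase(key, pattern)}


# The blob must be parsed by the application before any field can be used.
fields = store["user:someuser"].split("|")

# The separate keys can be loaded together with one wildcard lookup.
basket = multi_get(store, "basket:someuser:*")
print(fields[0], sorted(basket.values()))
```

The blob variant needs a single read but pushes parsing into the application; the multi-key variant allows partial reads and simple querying at the cost of more keys to keep consistent.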
  • 8. Document Stores
      ► Aggregates are typically stored as "documents" (key-value collection)
      ► JSON, BSON (binary JSON) and XML are common
      ► Still no schema, so add any data at runtime
      ► The semi-structure of the document allows the database to build indexes, allowing queries that address properties of the document
          > E.g. "find all baskets that contain the product 123"
      ► Relations might be modeled as links
      ► Advantages
          > Good fit for semi-structured data
          > In particular a good fit for JSON, XML, HTML…
          > Probably the easiest transition from RDBMS
      ► Disadvantages
          > Does not scale to the key/value store level
      ► Focus on semi-structured data, e.g. JSON
      ► MongoDB, CouchDB
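The query "find all baskets that contain the product 123" can be sketched with plain dictionaries standing in for documents. Everything here is illustrative: the document layout, the `build_product_index` helper and the second basket are assumptions, and real document stores such as MongoDB or CouchDB build and maintain their indexes themselves.

```python
from collections import defaultdict

# Schemaless documents: the second basket carries a field the first lacks.
baskets = {
    "basket:someuser": {
        "owner": "someuser",
        "items": [{"product": "product:123", "quantity": 1}],
    },
    "basket:otheruser": {
        "owner": "otheruser",
        "items": [{"product": "product:456", "quantity": 2}],
        "coupon": "WELCOME",
    },
}


def build_product_index(documents):
    """Map each product id to the ids of the baskets that contain it."""
    index = defaultdict(set)
    for doc_id, doc in documents.items():
        for item in doc.get("items", []):
            index[item["product"]].add(doc_id)
    return index


product_index = build_product_index(baskets)
matches = sorted(product_index.get("product:123", set()))
print(matches)
```

The index is only possible because the store can look inside the document, which is exactly what a pure key-value store cannot do with an opaque blob.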
  • 9. Wide Column
      ► Add any "column" you like to a row
      ► Not a key-value store, but a "key-(column-value)" store
      ► Column families are like tables
      ► E.g. in the "Users" column family
          > "someuser" → ("username" → "someuser"), ("email" → "")
      ► Since columns are named, some databases provide indexing
          > E.g. Google AppEngine allows you to define columns that can be queried
      ► Advantages
          > Easy to store complex and heterogeneous data
      § Apache Cassandra
      § Amazon SimpleDB
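The "key-(column-value)" shape maps naturally onto nested dictionaries. The following is only a sketch of the data model, not of Cassandra's or SimpleDB's API; the helpers `put_column` and `rows_with_column`, the row `otheruser` and its `twitter` column are hypothetical.

```python
# column family -> row key -> column name -> value
column_families = {"Users": {}}


def put_column(family, row_key, column, value):
    """Add or overwrite a single named column in a row."""
    column_families[family].setdefault(row_key, {})[column] = value


def rows_with_column(family, column):
    """Return the keys of all rows that define the given column."""
    rows = column_families[family]
    return sorted(key for key, columns in rows.items() if column in columns)


put_column("Users", "someuser", "username", "someuser")
put_column("Users", "someuser", "email", "")
put_column("Users", "otheruser", "username", "otheruser")
# Rows in the same family may carry different columns.
put_column("Users", "otheruser", "twitter", "@otheruser")

print(rows_with_column("Users", "twitter"))
```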
  • 10. Graph
      ► Nodes with properties
      ► Typed relationships with properties
      ► Ideal e.g. to model relations in a social network
      ► Easy to find number of followers, degree of relation etc.
      ► Neo4j
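Counting followers and finding the degree of relation can be sketched on a tiny in-memory property graph. This is not Neo4j's API: the node `alice`, the `FOLLOWS` relationship type, the `since` property and both helper functions are assumptions for illustration, and the degree of relation is computed with a plain breadth-first search that ignores relationship direction.

```python
from collections import deque

# Nodes with properties (values are made up for the example).
nodes = {
    "ewolff": {"kind": "user"},
    "spring_rod": {"kind": "user"},
    "spring_juergen": {"kind": "user"},
    "alice": {"kind": "user"},
}

# Typed relationships with properties: (source, type, target, properties).
relationships = [
    ("ewolff", "FOLLOWS", "spring_rod", {"since": 2010}),
    ("ewolff", "FOLLOWS", "spring_juergen", {"since": 2010}),
    ("spring_rod", "FOLLOWS", "spring_juergen", {"since": 2009}),
    ("alice", "FOLLOWS", "ewolff", {"since": 2011}),
]


def follower_count(user):
    """Count incoming FOLLOWS relationships of a node."""
    return sum(1 for _, rel_type, target, _ in relationships
               if rel_type == "FOLLOWS" and target == user)


def degree_of_relation(start, goal):
    """Fewest hops between two nodes, ignoring direction; None if unconnected."""
    neighbours = {}
    for source, _, target, _ in relationships:
        neighbours.setdefault(source, set()).add(target)
        neighbours.setdefault(target, set()).add(source)
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, distance = queue.popleft()
        if node == goal:
            return distance
        for neighbour in neighbours.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, distance + 1))
    return None


print(follower_count("spring_juergen"), degree_of_relation("alice", "spring_juergen"))
```

In a relational schema the same degree-of-relation question needs one self-join per hop, which is why the slide calls the graph model ideal for social networks.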
  • 11. What happened to Queries?
      ► Data is easily and quickly read/stored using primary key
      ► Denormalize data for commonly used queries
          > Store twitter inbox in key/value as
              – "inbox:someuser" → ("posts:123", "posts:234", ...)
          > instead of doing the query (RDBMS)
              – select p.* from POSTS p, POSTLINKS pl where = pl.postId and pl.userid=42
      ► Store reverse lookup
          > "ewolff|following" → ("spring_rod", "spring_juergen")
          > "post:435|RT" → ("post:42", "post:21")
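Denormalizing for the read path means doing the "query" at write time. The sketch below fills each follower's inbox when a post is published, so reading the inbox is a single key lookup. The functions `follow` and `publish`, the "|followers" key and the post texts are hypothetical; only the "inbox:…" and "…|following" key shapes come from the slide.

```python
store = {}


def follow(follower, followee):
    """Record the relation in both directions so each lookup is one key read."""
    store.setdefault(f"{follower}|following", []).append(followee)
    # Reverse lookup (hypothetical key name): who follows this user?
    store.setdefault(f"{followee}|followers", []).append(follower)


def publish(author, post_id, text):
    """Store the post, then append its id to every follower's inbox."""
    store[post_id] = {"author": author, "text": text}
    for follower in store.get(f"{author}|followers", []):
        store.setdefault(f"inbox:{follower}", []).append(post_id)


follow("ewolff", "spring_rod")
follow("ewolff", "spring_juergen")
publish("spring_rod", "posts:123", "first post")
publish("spring_juergen", "posts:234", "second post")

# Reading the inbox needs no join, only a lookup by primary key.
print(store["inbox:ewolff"])
```

The trade-off is write amplification: every post is written once per follower, and the application has to keep the duplicated lists consistent.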
  • 12. What It Means for Developers
      § More technologies to have fun with
      § Broader choice of persistence stores
      § Probably Cross Store Persistence
          • Store name, firstname etc. in RDBMS
          • Store followers in Graph database
          • Store Content in RDBMS
          • Store User Generated Content in Document database
      § Spring Data
          • Similar APIs for JPA and NoSQL
          • Support for cross store persistence
          • Sophisticated support for generic DAOs
          • E.g. just add findByName() method, implementation is provided
      § QueryDSL
          • JPA Criteria API done right