NoSQL Overview
 

This presentation introduces NoSQL and the motivation behind it and the basic flavors of NoSQL.


Usage Rights

© All Rights Reserved

    NoSQL Overview: Presentation Transcript

    • NoSQL: What Is It and Why Would I Care? Eberhard Wolff, 21.09.11
    • Alternative Databases: NoSQL
      • NoSQL: Not only SQL
      • A good example of a catchy but bad name
      • Not a positive definition, rather "not something else"
      • Now: even less clear
    • Why NoSQL?
      • Exponential data growth
      • More and more connected data
        • Hypertext, blogs, user-generated content
      • Semi-structured data
        • User-generated content
        • Full-text search / indices instead of query-by-example
      • Integration on the database is less common
      • Cloud prefers scale-out over scale-up
        • Cloud supports scale-up: reboot into a larger machine
        • …but eventually you will need to scale out, i.e. add more machines
    • NoSQL Flavors
      • Key/value store
      • Document
      • Wide column: lots of columns
      • Graph database: graphs with nodes, relationships and properties
      • Object databases: store objects, not rows
      • Note: NoSQL is actually vaguely defined
    • Key-Value Stores
      • Maps keys to values, e.g. key 42 → value "Some data"
      • Just a large, globally available map
      • I.e. not a very powerful data model
      • Advantages
        • Easy to understand
        • Easier to build scale-out solutions (no joins, easy sharding etc.)
      • Disadvantages
        • Simplistic data model
        • Not a good fit for complex data
        • Might add complexity to the application code
      • Focus on scalability
      • Redis: think cache + persistence
      • Riak
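The "large map" idea and the "easy sharding" claim on this slide can be sketched in a few lines. This is a toy in-memory illustration, not how Redis or Riak work internally; the class name `ShardedKeyValueStore` and the choice of an MD5-based shard function are assumptions made for the example.

```python
import hashlib


class ShardedKeyValueStore:
    """Toy key-value store: a map from keys to opaque values, split over shards."""

    def __init__(self, shard_count=3):
        # Each shard stands in for one machine; here it is just a dict.
        self.shards = [{} for _ in range(shard_count)]

    def _shard_for(self, key):
        # The key alone decides where a value lives, so no joins or
        # cross-shard coordination are needed for a lookup.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return self.shards[int(digest, 16) % len(self.shards)]

    def put(self, key, value):
        self._shard_for(key)[key] = value

    def get(self, key, default=None):
        return self._shard_for(key).get(key, default)


store = ShardedKeyValueStore()
store.put("42", "Some data")
print(store.get("42"))  # prints: Some data
```

The store never inspects the value, which is what keeps the model simple and also what pushes any richer structure into application code.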
    • Key-Value Store: Hybrid Approach
      • Might just be used to store specific data
      • E.g. scores of players in an online game
        • No complex structure
        • Need to scale
        • Lots of reads and writes
      • Player name, age, address would still be in an RDBMS
      • Hybrid approach
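A minimal sketch of the hybrid approach, assuming an in-memory SQLite database as the RDBMS and a plain dict as a stand-in for the key-value store. The table layout, the `score:` key prefix, the sample age and address, and the helper names `add_points` and `player_summary` are all hypothetical.

```python
import sqlite3

# Relational side: structured player data that benefits from a schema.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE players (name TEXT PRIMARY KEY, age INTEGER, address TEXT)")
db.execute("INSERT INTO players VALUES (?, ?, ?)", ("someuser", 30, "Some Street 1"))

# Key-value side: a dict stands in for the store that takes the
# frequent, simple score reads and writes.
scores = {}


def add_points(player, points):
    """Increment the player's score and return the new total."""
    key = "score:" + player
    scores[key] = scores.get(key, 0) + points
    return scores[key]


def player_summary(player):
    """Combine the relational profile with the score from the key-value side."""
    row = db.execute(
        "SELECT name, age FROM players WHERE name = ?", (player,)
    ).fetchone()
    return {"name": row[0], "age": row[1], "score": scores.get("score:" + player, 0)}


add_points("someuser", 10)
add_points("someuser", 5)
print(player_summary("someuser"))  # {'name': 'someuser', 'age': 30, 'score': 15}
```

The application has to join the two sources itself, which is the price of keeping the hot path out of the relational database.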
    • Key-Value Stores: Store All Data
      • Storing data as serialized blobs
        • "user:someuser" → "someuser|someuser@example.com|more|data|here"
      • Storing data as multiple keys
        • "user:username:someuser" → "someuser"
        • "user:email:someuser" → "someuser@example.com"
        • Requires multi get/set to be efficient
        • Allows some querying if the database supports wildcards, like "user:email:someuser*"
      • Storing links
        • Blob: "basket:someuser" → "...|item|1|product|product:123|..."
        • Separate keys: "basket:someuser:item:1:product" → "product:123"
          • Multi-get: "basket:someuser:*" loads the shopping basket and all items
      • Easy to understand, hard to implement
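The two storage strategies and the wildcard multi-get from this slide can be compared side by side. This is a sketch over a plain dict; the `multi_get` helper and the second basket item (`product:456`) are assumptions added for illustration, and real stores differ in whether and how they support wildcard lookups.

```python
from fnmatch import fnmatchcase

store = {}  # stand-in for the key-value store

# Strategy 1: one serialized blob per user. One read, but the
# application must know the field order to parse it.
store["user:someuser"] = "|".join(["someuser", "someuser@example.com"])

# Strategy 2: one key per attribute. Each field is addressable,
# but loading a whole user needs several reads (or a multi-get).
store["user:username:someuser"] = "someuser"
store["user:email:someuser"] = "someuser@example.com"

# Links stored as separate keys that point at other keys.
store["basket:someuser:item:1:product"] = "product:123"
store["basket:someuser:item:2:product"] = "product:456"  # hypothetical second item


def multi_get(pattern):
    """Return every key/value pair whose key matches a shell-style wildcard."""
    return {key: value for key, value in store.items() if fnmatchcase(key, pattern)}


username, email = store["user:someuser"].split("|")
basket = multi_get("basket:someuser:*")
print(email)           # someuser@example.com
print(sorted(basket))  # both basket item keys
```

Note that the toy `multi_get` scans every key; a store that offers wildcard queries would need some form of ordered or indexed key space to do this efficiently.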
    • Document Stores
      • Aggregates are typically stored as "documents" (key-value collections)
      • JSON, BSON (binary JSON) and XML are common
      • Still no schema, so add any data at runtime
      • The semi-structure of the document allows the database to build indexes, allowing queries that address properties of the document
        • E.g. "find all baskets that contain the product 123"
      • Relations might be modeled as links
      • Advantages
        • Good fit for semi-structured data
        • In particular a good fit for JSON, XML, HTML…
        • Probably the easiest transition from RDBMS
      • Disadvantages
        • Does not scale to the key/value store level
      • Focus on semi-structured data, e.g. JSON
      • MongoDB, CouchDB
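Because the store can see inside a document, it can maintain an index on a property and answer "find all baskets that contain the product 123". The sketch below shows that idea with dicts only; it does not use the MongoDB or CouchDB APIs, and `save_basket`, `find_baskets_containing`, and the sample documents are hypothetical.

```python
from collections import defaultdict

baskets = {}                      # document id -> document (a nested dict)
product_index = defaultdict(set)  # product id -> ids of baskets containing it


def save_basket(doc_id, document):
    """Store a document and keep the product index in sync."""
    previous = baskets.get(doc_id)
    if previous is not None:
        # Remove stale index entries before re-indexing the new version.
        for item in previous.get("items", []):
            product_index[item["product"]].discard(doc_id)
    baskets[doc_id] = document
    for item in document.get("items", []):
        product_index[item["product"]].add(doc_id)


def find_baskets_containing(product_id):
    """Answer the query from the index instead of scanning every document."""
    return sorted(product_index.get(product_id, set()))


# No schema: the second document carries a field the first one lacks.
save_basket("basket:someuser", {
    "owner": "someuser",
    "items": [{"product": "product:123", "quantity": 1}],
})
save_basket("basket:otheruser", {
    "owner": "otheruser",
    "items": [{"product": "product:456"}],
    "coupon": "XYZ",
})

print(find_baskets_containing("product:123"))  # ['basket:someuser']
```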
    • Wide Column
      • Add any "column" you like to a row
      • Not a key-value store, but a "key-(column-value)" store
      • Column families are like tables
      • E.g. in the "Users" column family
        • "someuser" → ("username" → "someuser"), ("email" → "someuser@example.com")
      • Since columns are named, some databases provide indexing
        • E.g. Google AppEngine allows you to define columns that can be queried
      • Advantages
        • Easy to store complex and heterogeneous data
      • Apache Cassandra
      • Amazon SimpleDB
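The "key-(column-value)" shape can be pictured as a map of maps. This is only a data-model sketch, not the Cassandra or SimpleDB API; the helpers `insert` and `rows_with_column`, the second user, and the extra `twitter` and `city` columns are assumptions for illustration.

```python
# A column family maps a row key to that row's own set of named columns.
users = {}


def insert(column_family, row_key, columns):
    """Add or overwrite columns on a row; other columns on the row are kept."""
    column_family.setdefault(row_key, {}).update(columns)


def rows_with_column(column_family, column):
    """Named columns make it possible to ask which rows carry a given column."""
    return sorted(key for key, cols in column_family.items() if column in cols)


insert(users, "someuser", {"username": "someuser", "email": "someuser@example.com"})
# Rows in the same column family need not share the same columns.
insert(users, "otheruser", {"username": "otheruser", "twitter": "@otheruser"})
# A new column can be added to an existing row at any time.
insert(users, "someuser", {"city": "Berlin"})

print(users["someuser"]["email"])        # someuser@example.com
print(rows_with_column(users, "email"))  # ['someuser']
```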
    • Graph
      • Nodes with properties
      • Typed relationships with properties
      • Ideal e.g. to model relations in a social network
      • Easy to find number of followers, degree of relation etc.
      • Neo4j
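The two queries named on this slide, number of followers and degree of relation, can be sketched on a small in-memory property graph. This does not use Neo4j or its query language; the `Graph` class, the `FOLLOWS` relationship type, and the sample people are hypothetical, and the degree is computed with a plain breadth-first search that ignores relationship direction.

```python
from collections import deque


class Graph:
    def __init__(self):
        self.nodes = {}  # node id -> properties
        self.edges = []  # (source, relationship type, target, properties)

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties

    def relate(self, source, rel_type, target, **properties):
        self.edges.append((source, rel_type, target, properties))

    def follower_count(self, node_id):
        return sum(
            1 for _source, rel_type, target, _props in self.edges
            if rel_type == "FOLLOWS" and target == node_id
        )

    def degree_of_relation(self, start, goal):
        """Shortest number of hops between two nodes, or None if unconnected."""
        neighbours = {}
        for source, _rel_type, target, _props in self.edges:
            neighbours.setdefault(source, set()).add(target)
            neighbours.setdefault(target, set()).add(source)
        seen, queue = {start}, deque([(start, 0)])
        while queue:
            node, distance = queue.popleft()
            if node == goal:
                return distance
            for nxt in neighbours.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, distance + 1))
        return None


graph = Graph()
for name in ("alice", "bob", "carol", "dave", "erin"):
    graph.add_node(name, kind="person")
graph.relate("alice", "FOLLOWS", "bob", since=2010)
graph.relate("bob", "FOLLOWS", "carol", since=2011)
graph.relate("dave", "FOLLOWS", "carol", since=2011)

print(graph.follower_count("carol"))               # 2
print(graph.degree_of_relation("alice", "carol"))  # 2
```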
    • What Happened to Queries?
      • Data is easily and quickly read/stored using the primary key
      • Denormalize data for commonly used queries
        • Store Twitter inbox in key/value as
          • "inbox:someuser" → ("posts:123", "posts:234", ...)
        • instead of doing the query (RDBMS)
          • select p.* from POSTS p, POSTLINKS pl where p.id = pl.postId and pl.userid=42
      • Store reverse lookup
        • "ewolff|following" → ("spring_rod", "spring_juergen")
        • "post:435|RT" → ("post:42", "post:21")
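One way to read this slide is "do the join at write time". The sketch below keeps a precomputed inbox per user, so reading it is a primary-key lookup. The key shapes follow the slide; the `|followers` reverse-lookup key and the function names `follow`, `publish`, and `read_inbox` are assumptions for the example.

```python
store = {}  # stand-in for the key-value store


def follow(follower, followee):
    store.setdefault(follower + "|following", []).append(followee)
    # Reverse lookup: lets publish() find the audience without a query.
    store.setdefault(followee + "|followers", []).append(follower)


def publish(author, post_id, text):
    post_key = "posts:" + post_id
    store[post_key] = text
    # Denormalize: copy a reference into every follower's inbox now,
    # so that nothing has to be joined when the inbox is read.
    for follower in store.get(author + "|followers", []):
        store.setdefault("inbox:" + follower, []).append(post_key)


def read_inbox(user):
    # One lookup for the inbox, then one lookup per referenced post.
    return [store[post_key] for post_key in store.get("inbox:" + user, [])]


follow("ewolff", "spring_rod")
follow("ewolff", "spring_juergen")
publish("spring_rod", "123", "Hello")
publish("spring_juergen", "234", "Hi")

print(store["inbox:ewolff"])  # ['posts:123', 'posts:234']
print(read_inbox("ewolff"))   # ['Hello', 'Hi']
```

The trade-off is more work and more stored data per write in exchange for cheap reads, which suits queries that are run far more often than the data changes.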
    • What It Means for Developers
      • More technologies to have fun with
      • Broader choice of persistence stores
      • Probably cross-store persistence
        • Store name, firstname etc. in RDBMS
        • Store followers in graph database
        • Store content in RDBMS
        • Store user-generated content in document database
      • Spring Data
        • Similar APIs for JPA and NoSQL
        • Support for cross-store persistence
        • Sophisticated support for generic DAOs
        • E.g. just add a findByName() method, implementation is provided
      • QueryDSL
        • JPA Criteria API done right