Some NoSQL

a NoSQL Introduction

Published in: Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
335
On SlideShare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
10
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

Some NoSQL

  1. Some NoSQL
  2. By "SQL" we mean: relational DBs
  3. So NoSQL: non-relational. The biggest difference
  4. So, what is relational?
  5. Data is represented by tables and relations. Relational model, created at IBM in 1969
  6. Data manipulation is done by queries grouped into transactions
  7. What is the problem with that?
  8. "It does not scale"
  9. A common, but bad, answer!
  10. Why is it common?
  11. ACID: the 4 qualities of transactions on a relational model
  12. Atomicity, Consistency, Isolation, Durability
  13. Atomicity: each transaction is "all or nothing".
  14. Consistency: each transaction brings the database from one valid state to another
  15. Isolation: concurrent transactions do not interfere with one another (reentrancy)
  16. Durability: once a transaction is committed it remains so (even after a shutdown)
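
The "all or nothing" property can be illustrated with a small sketch. It uses Python's standard `sqlite3` module; the `accounts` table, the `transfer` helper and the account names are hypothetical and chosen only for this example. The overdraft is rejected by a CHECK constraint, and the credit that was already applied inside the same transaction is rolled back with it.

```python
import sqlite3

# Hypothetical two-account ledger in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint refused a negative balance; nothing was kept.
        return False


def balances(conn):
    """Return {name: balance} for every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


ok1 = transfer(conn, "alice", "bob", 30)   # valid transfer
ok2 = transfer(conn, "alice", "bob", 500)  # overdraft: the whole transaction is undone
```

After the failed second transfer, `balances(conn)` still reports the state left by the first one, including bob's balance, which had been credited before the failure.
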
  17. Brewer's CAP theorem
  18. Of Consistency, Availability and Partition tolerance: choose two
  19. You cannot scale without partition tolerance: no mainframe would be that big!
  20. You cannot afford to ignore availability: "Sorry customer, our service is down again!"
  21. So to scale you have to drop consistency, which means some stale data is OK. That is a compromise: a risk to be taken and considered.
  22. ACID means having Consistency
  23. To scale you need an alternative
  24. BASE
  25. Basically Available, Soft state, Eventual consistency. Details here: http://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf
  26. You drop Consistency for Eventual Consistency. That is the most important change for scaling purposes
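
A minimal sketch of what "some stale data is OK" can look like in practice. Everything here is an assumption made for illustration: the `Replica` class, the integer versions and the `sync` pass stand in for whatever replication a real datastore uses, and conflicts are resolved with a simple last-write-wins rule.

```python
class Replica:
    """One copy of the data; each key maps to a (version, value) pair."""

    def __init__(self):
        self.data = {}

    def write(self, key, value, version):
        current = self.data.get(key)
        # Last-write-wins: keep the entry with the highest version.
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None


def sync(a, b):
    """Anti-entropy pass: exchange entries so both replicas converge."""
    for key, (version, value) in list(a.data.items()):
        b.write(key, value, version)
    for key, (version, value) in list(b.data.items()):
        a.write(key, value, version)


r1, r2 = Replica(), Replica()
r1.write("status", "online", version=1)
sync(r1, r2)

r1.write("status", "away", version=2)  # accepted by r1 only, r2 is not updated yet
stale = r2.read("status")              # r2 stays available but answers with old data

sync(r1, r2)
fresh = r2.read("status")              # after the sync both replicas agree again
```

Between the write and the second `sync`, both replicas keep answering requests, but they disagree. That window is the consistency being traded away.
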
  27. All that is true!
  28. So if relational implies ACID, and ACID does not scale, then relational databases do not scale, right?
  29. Wrong!
  30. Why is the argument bad?
  31. See Facebook: the world's biggest hive of data
  32. Facebook uses several datastores (polyglotism, we will get to that)
  33. But most of Facebook's data is on MySQL, and it scales
  34. You can make your relational data behave in a BASE way, given enough effort, time and money.
  35. Should you? It depends on your data
  36. So what is the problem with the relational model? The real one?
  37. "If all you have is a hammer, everything looks like a nail" (Abraham Maslow, The Psychology of Science, 1966, p. 15)
  38. We use it for non-relational data
  39. The layers between your app and its data:
      ● Your App
      ● Model Logic (the M in MVC)
      ● Model Translator (ORM usually)
      ● Database Abstraction Layer (avoid lock-in)
      ● SQL Generation (souped-up concatenation)
      ● SQL Interpreter
      ● Database (complex algorithms)
  40. Several layers just to force our data to be something else AND to go back to being our data!
  41. This adds bugs in each layer
  42. This adds performance costs in all those translations
  43. This adds integration costs. Ever spent dev time making those layers work?
  44. This adds dev costs. You must jump through hoops to make your data behave relationally
  45. What about NoSQL?
  46. Several data representations!
      ● Key-Value
      ● Document
      ● Column-Family
      ● Graph
      ● XML-based
      ● Object
      ● Grid
      ● Mixed (using several types)
      ● etc.
  47. Key-Value: Redis, Riak, Couchbase, etc.
  48. Key-Value Datastore: What is it? You store keys (identifiers) and values (pretty much anything, serialized). Just a quick way to store things under a name and recover them using that name.
  49. Key-Value Datastore: When to use?
      ● Dictionaries
      ● Session data
      ● User preferences
      ● Shopping cart
      ● Anything whose content you do not want to inspect or query.
  50. Key-Value Datastore: When to avoid?
      ● You have relations
      ● You have multi-operation transactions
      ● You want to query the values
      ● You want to operate on sets of entries
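
The key-value idea above can be sketched in a few lines. This is a toy in-memory store, not the API of Redis, Riak or any other product; the `KeyValueStore` class and the `session:42` key are hypothetical. Values are serialized to opaque bytes, which is why the store itself cannot query what is inside them.

```python
import json


class KeyValueStore:
    """Toy in-memory key-value store; values are kept as opaque serialized bytes."""

    def __init__(self):
        self._items = {}

    def put(self, key, value):
        # The store never looks inside the value: it only keeps the bytes.
        self._items[key] = json.dumps(value).encode("utf-8")

    def get(self, key, default=None):
        raw = self._items.get(key)
        return default if raw is None else json.loads(raw.decode("utf-8"))

    def delete(self, key):
        self._items.pop(key, None)


store = KeyValueStore()
store.put("session:42", {"user": "ana", "cart": ["book", "pen"]})

session = store.get("session:42")  # recovered by name
store.delete("session:42")
missing = store.get("session:42")  # the key is gone, so the default is returned
```

Finding "every session whose cart contains a pen" would mean deserializing every value by hand, which is the situation the "when to avoid" list warns about.
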
  51. Document: MongoDB, CouchDB, TerraStore, RavenDB, Lotus Notes, etc.
  52. Document Datastore: What is it? As with key-value, but your data is not amorphous: it is a document! Each document behaves like a hash table: it has entries of a given kind that may themselves have entries (like an XML or JSON file). Documents are schemaless: you have complete liberty over what goes inside them.
  53. Document Datastore: When to use?
      ● When you have documents!
        ○ Blogs
        ○ CMS
      ● When freedom of schema is required
        ○ Analytics
        ○ E-commerce products
      ● When you wanted a key-value store but also wanted to query the values.
  54. Document Datastore: When to avoid?
      ● You need complex/atomic transactions over different documents
        ○ in that case you may have a relation, and you may need SQL after all!
      ● Schema-free usage would make your queries impossible.
      ● You want to enforce a schema.
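
A minimal sketch of the difference from key-value: documents are nested, schemaless structures, and their fields can be queried. The `find` and `get_path` helpers and the sample `posts` are hypothetical; a real document datastore offers its own query language instead.

```python
def get_path(doc, path):
    """Follow a dotted path such as 'author.name' through nested dicts."""
    current = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None  # the field is absent in this document
        current = current[part]
    return current


def find(collection, **criteria):
    """Return the documents matching every criterion.

    Keyword names use '__' in place of '.' for nested fields.
    """
    results = []
    for doc in collection:
        if all(
            get_path(doc, key.replace("__", ".")) == wanted
            for key, wanted in criteria.items()
        ):
            results.append(doc)
    return results


# Schemaless: each document carries whatever fields it needs.
posts = [
    {"_id": 1, "title": "Hello", "author": {"name": "ana"}, "tags": ["intro"]},
    {"_id": 2, "title": "CAP notes", "author": {"name": "bob"}},
    {"_id": 3, "title": "BASE notes", "author": {"name": "ana"}, "draft": True},
]

by_ana = find(posts, author__name="ana")
drafts = find(posts, draft=True)
```

Documents without a `draft` field are simply skipped by the second query, which shows both the freedom and the risk of having no schema.
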
  55. Column-Family: Hadoop (HBase), Cassandra, Amazon SimpleDB, Amazon DynamoDB, etc.
  56. Column-Family Datastore: What is it? Data in tables of rows and columns, like the relational model, but:
      ● Each row has a varying number of columns (hence the name)
      ● Each row is timestamped for comparison, expiry and conflict resolution.
      ● There is no master node; writes can be scaled by adding nodes.
      ● A column may contain another row.
  57. Column-Family Datastore: When to use it?
      ● Logging
      ● Registering events
      ● Counters
      ● When you have massive concurrent writes with small chances of collisions (Facebook uses one for its internal messaging system)
      ● When your information has an expiry date
  58. Column-Family Datastore: When to avoid it?
      ● You need ACID
      ● You need aggregate results (sums, averages, etc.)
      ● Your data is not tabular
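
The properties listed for column-family stores (rows with varying columns, timestamps used for conflict resolution and expiry) can be sketched as follows. The `ColumnFamily` class is a toy written for this example, not the data model of Cassandra or HBase, and it keeps a timestamp per column rather than per row, which is an assumption of the sketch.

```python
class ColumnFamily:
    """Toy column family: row key -> {column name: (value, timestamp, ttl)}."""

    def __init__(self):
        self.rows = {}

    def insert(self, row_key, column, value, timestamp, ttl=None):
        row = self.rows.setdefault(row_key, {})
        existing = row.get(column)
        # Conflict resolution: the write with the newest timestamp wins.
        if existing is None or timestamp >= existing[1]:
            row[column] = (value, timestamp, ttl)

    def get_row(self, row_key, now):
        """Return the live columns of a row, skipping expired ones."""
        live = {}
        for column, (value, timestamp, ttl) in self.rows.get(row_key, {}).items():
            if ttl is None or now < timestamp + ttl:
                live[column] = value
        return live


events = ColumnFamily()
events.insert("user:1", "login", "web", timestamp=100)
events.insert("user:1", "token", "abc", timestamp=100, ttl=60)  # expires at 160
events.insert("user:2", "login", "mobile", timestamp=105)       # a row with fewer columns
events.insert("user:1", "login", "old-client", timestamp=90)    # older write loses

early = events.get_row("user:1", now=120)  # token still alive
late = events.get_row("user:1", now=200)   # token has expired
```

No coordination happens between writers here: each write carries its own timestamp, and readers sort out which value is current. That is what makes the model friendly to massive concurrent writes.
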
  59. Graph: Neo4j, Titan, FlockDB, OrientDB, etc.
  60. Graph Datastore: What is it? Data is represented by nodes (objects) connected by edges (relations). The textbook definition of a graph. The same data can represent several graphs. A graph traversal may be persisted as a relation.
  61. Graph Datastore: When to use it? Anywhere you should already be using graphs in your application:
      ● Any relations (in the relational model sense) that have no data.
      ● Social relations (friend of, employee of, manager of, etc.)
      ● Dependencies
      ● Geographical data
      ● Routing, dispatching, etc.
  62. Graph Datastore: When to avoid it?
      ● Your application commonly writes over large sets of nodes (writing to many nodes at once is expensive)
      ● Your relations carry payloads (in that case you need SQL)
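
A minimal sketch of the graph model: nodes connected by labelled edges, queried by traversal. The `Graph` class, the edge labels and the names are hypothetical; graph datastores such as Neo4j expose their own query languages for this.

```python
from collections import deque


class Graph:
    """Toy directed graph with labelled edges stored as adjacency lists."""

    def __init__(self):
        self.edges = {}

    def add_edge(self, source, label, target):
        self.edges.setdefault(source, []).append((label, target))
        self.edges.setdefault(target, [])  # make sure the target node exists

    def neighbours(self, node, label):
        return [target for (lbl, target) in self.edges.get(node, []) if lbl == label]

    def within(self, start, label, max_hops):
        """Breadth-first traversal: nodes reachable in at most max_hops edges."""
        seen = {start}
        found = []
        queue = deque([(start, 0)])
        while queue:
            node, hops = queue.popleft()
            if hops == max_hops:
                continue  # do not expand beyond the hop limit
            for nxt in self.neighbours(node, label):
                if nxt not in seen:
                    seen.add(nxt)
                    found.append(nxt)
                    queue.append((nxt, hops + 1))
        return found


social = Graph()
social.add_edge("ana", "friend_of", "bob")
social.add_edge("bob", "friend_of", "carla")
social.add_edge("carla", "friend_of", "dave")
social.add_edge("ana", "employee_of", "acme")

direct = social.within("ana", "friend_of", max_hops=1)
two_hops = social.within("ana", "friend_of", max_hops=2)
```

The "friends of friends" question is a single traversal here. In a relational schema the same question usually needs one self-join per hop.
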
  63. Which one to choose?
  64. The ones closest to your data (yes, plural)
  65. Polyglot Persistence: different datastores for different data
  66. For each slice of data you want to store
  67. Ask which datastore model would best represent it
  68. Stop nailing screws!
  69. How do you diagnose the correct type for each piece of data?
  70. Linagora can help!
  71. Questions?
