NoSQL

An alternative NoSQL talk: less theory, less detail on the architectural principles, and more code samples.

Published in: Technology

Transcript of "NoSQL"

  1. NoSQL (Tuesday, March 22, 2011)
  2. The Software Crisis: Writing correct, understandable, and verifiable computer programs is difficult. (Edsger Dijkstra)
  3. The Software Crisis: “as long as there were no machines, programming was no problem at all; when we had a few weak computers, programming became a mild problem, and now we have gigantic computers, programming has become an equally gigantic problem.”
  4. IMS: The Hierarchical Database (1966), Vern Watts
  5. “A Relational Model of Data for Large Shared Data Banks” (1970), Ted Codd
  6. “In striving to make every user happy, a technology can actually leave the majority unhappy.” “Every good idea is generalized to its level of inapplicability.” (Peter Principle) Jim Gray
  7. (no text on this slide)
  8. Eric Evans: “NoSQL” reintroduced (2008)
  9. Total Cost of Ownership
       • The price of a license
       • The price of support
       • The price of hardware
       Oracle: roughly 47k per CPU? Software update / support: roughly 10k?
  10. Internet Scale
       • Massive data collections
       • Huge number of requests
       • Coming from geographic areas across the globe
       • 24/7
  11. Availability
  12. Data Models
  13. Data Models
  14. Column Oriented
       • Column Family ≈ Table
       • Can grow “indefinitely”
       • Each row is a key followed by named columns (named column, named column, …)
       • Empty cells are cheap (sparse table)
       • Schemaless
       • No secondary indexes
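The column-family properties on slide 14 (sparse rows, no fixed schema, lookup by key only) can be illustrated with a small in-memory sketch. This is only an illustration in Python, not how any particular column store is implemented; the class `ColumnFamily` and its methods are hypothetical names invented here.

```python
from collections import defaultdict


class ColumnFamily:
    """Toy sparse table: each row key maps to its own set of named columns."""

    def __init__(self):
        # Only cells that were actually written take up space.
        self._rows = defaultdict(dict)

    def put(self, key, column, value):
        # No schema: any row may introduce any column name.
        self._rows[key][column] = value

    def get(self, key, column, default=None):
        # Lookup is by row key; there is no secondary index on values.
        return self._rows.get(key, {}).get(column, default)

    def columns(self, key):
        # Column names present in this row, sorted for stable output.
        return sorted(self._rows.get(key, {}))


people = ColumnFamily()
people.put("ws103177", "firstname", "Wilfred")
people.put("ws103177", "surname", "Springer")
people.put("jd000001", "firstname", "Jane")
people.put("jd000001", "email", "jane@example.org")
```

The two rows carry different columns, and a cell that was never written simply does not exist, which is what makes empty cells cheap.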
  15. BigTable
       DatastoreService service = ...;
       Key key = KeyFactory.createKey(family, recordId);
       Entity entity = service.get(key);
       entity.getProperty("firstname");
       entity.getProperty("surname");
  16. Column Oriented + Super Columns
       • Each row is a key followed by named columns (named column, named column, …)
       • A super column groups a further set of named columns (named column, named column, …)
  17. Key Value Store (illustrated with a binary value, 1011 0110)
       • Schemaless
       • Versioning
  18. Kyoto Cabinet
       DB db = new DB(...);
       db.set("ws103177", "Wilfred Springer <wilfredspringer@sun.com>");
       db.get("ws103177");
       1 million records in 0.9 s
  19. Graph Database: SPARQL
  20. Document Store
       XML:
       <persons>
         <person>
           <name>Wilfred</name>
           <surname>Springer</surname>
         </person>
         …
       </persons>
       JSON:
       [{ "Name" : "Wilfred", "Surname" : "Springer"}, … ]
       • Improved Indexing
       • Serverside Processing
  21. JSON product document (field diagram). Fields shown: Product, DetailPageURL, EditorialReviews (Source, IsLinkSuppressed), ItemAttributes (Publisher, ReleaseDate, Format, Author, Binding, ProductGroup, Label, Type, Languages, Name, ProductName, Studio, PublicationDate, ListPrice with Amount, CurrencyCode and FormattedPrice, Title, Manufacturer), LargeImage (URL, Width, Height), SalesRank, MediumImage (URL, Width, Height)
  22. ItemAttributes (detail of the field diagram): Publisher, ReleaseDate, Format, Author, Binding, ProductGroup, Label, Type, Languages, Name, ProductName, Studio, PublicationDate, ListPrice (Amount, CurrencyCode, FormattedPrice), Title, Manufacturer
  23. Various Queries (diagram: Product, ItemAttributes, NumberOfPages)
       // find all products
       db.products.find()
       // find products with 446 pages (slow)
       db.products.find({"ItemAttributes.NumberOfPages": 446})
       // find products with 446 pages (fast)
       db.products.ensureIndex({"ItemAttributes.NumberOfPages": 1})
       db.products.find({"ItemAttributes.NumberOfPages": 446})
  24. Find books on “java” (diagram: Product, ItemAttributes, Title, fs_keywords_terms)
       db.products.find(
         {"fs_keywords_terms": "java"},
         {"ItemAttributes.Title": 1}
       )
       Example document:
       "Title" : "TCP/IP Sockets in Java, Second Edition: Practical Guide for Programmers"
       "_id" : ObjectId("…")
       fs_keywords_terms: ["kenneth", "l", "calvert", "michael", "j", "donahoo", "tcp", "ip", "sockets", "java", "second", "edition", "practical", "guide", "programmers"]
  25. ... with the worst sales rank
       db.products.find(
         {"fs_keywords_terms": "java"},
         {"ItemAttributes.Title": 1}
       ).sort({"SalesRank": -1})
       Example document: the same “TCP/IP Sockets in Java” document as on slide 24.
  26. Count books per #pages
       db.products.group({
         key: {"ItemAttributes.NumberOfPages": true},
         cond: {},
         initial: {count: 0},
         reduce: function(obj, prev) { prev.count++ }
       })
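To make the semantics of the `group` call on slide 26 concrete, here is a rough Python equivalent that runs over an in-memory list instead of a MongoDB collection. The sample documents and the function name `count_by_pages` are invented for illustration; only the field names `ItemAttributes` and `NumberOfPages` come from the slides.

```python
from collections import Counter

# Hypothetical sample data shaped like the product documents on the slides.
products = [
    {"ItemAttributes": {"Title": "Book A", "NumberOfPages": 446}},
    {"ItemAttributes": {"Title": "Book B", "NumberOfPages": 120}},
    {"ItemAttributes": {"Title": "Book C", "NumberOfPages": 446}},
]


def count_by_pages(docs):
    """Group documents by page count and count the members of each group."""
    counts = Counter()  # plays the role of `initial: {count: 0}` per key
    for doc in docs:
        pages = doc.get("ItemAttributes", {}).get("NumberOfPages")
        counts[pages] += 1  # plays the role of `prev.count++`
    return dict(counts)
```

Calling `count_by_pages(products)` yields one counter per distinct page count, which is the same shape of result the `reduce` function on the slide accumulates.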
  27. SQL (MySQL) versus Mongo (MongoDB)
       SQL:
       SELECT
         Dim1, Dim2,
         SUM(Measure1) AS MSum,
         COUNT(*) AS RecordCount,
         AVG(Measure2) AS MAvg,
         MIN(Measure1) AS MMin,
         MAX(CASE
           WHEN Measure2 < 100
           THEN Measure2
         END) AS MMax
       FROM DenormAggTable
       WHERE (Filter1 IN ('A','B'))
         AND (Filter2 = 'C')
         AND (Filter3 > 123)
       GROUP BY Dim1, Dim2
       HAVING (MMin > 0)
       ORDER BY RecordCount DESC
       LIMIT 4, 8
       Mongo:
       db.runCommand({
         mapreduce: "DenormAggCollection",
         query: {
           filter1: { $in: [ 'A', 'B' ] },
           filter2: 'C',
           filter3: { $gt: 123 }
         },
         map: function() { emit(
           { d1: this.Dim1, d2: this.Dim2 },
           { msum: this.measure1, recs: 1, mmin: this.measure1,
             mmax: this.measure2 < 100 ? this.measure2 : 0 }
         );},
         reduce: function(key, vals) {
           var ret = { msum: 0, recs: 0, mmin: 0, mmax: 0 };
           for(var i = 0; i < vals.length; i++) {
             ret.msum += vals[i].msum;
             ret.recs += vals[i].recs;
             if(vals[i].mmin < ret.mmin) ret.mmin = vals[i].mmin;
             if((vals[i].mmax < 100) && (vals[i].mmax > ret.mmax))
               ret.mmax = vals[i].mmax;
           }
           return ret;
         },
         finalize: function(key, val) {
           val.mavg = val.msum / val.recs;
           return val;
         },
         out: 'result1',
         verbose: true
       });
       db.result1.
         find({ mmin: { $gt: 0 } }).
         sort({ recs: -1 }).
         skip(4).
         limit(8);
       Notes from the chart:
       • Grouped dimension columns are pulled out as keys in the map function, reducing the size of the working set.
       • Measures must be manually aggregated.
       • Aggregates depending on record counts must wait until finalization.
       • Measures can use procedural logic.
       • Filters have an ORM/ActiveRecord-looking style.
       • Aggregate filtering must be applied to the result set, not in the map/reduce.
       • Ascending: 1; Descending: -1
       Chart: Revision 4, created 2010-03, Rick Osborne, rickosborne.org
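The map, reduce and finalize phases on slide 27 can be followed step by step in a small Python sketch. It is only a sketch of the general technique, run over a list of dictionaries rather than through MongoDB; the function names and the sample rows are invented here. As on the slide, the average is computed from the sum of `Measure1`, and unlike the slide the minimum is taken over the emitted values directly.

```python
from collections import defaultdict


def map_row(row):
    """Map phase: emit (key, value), with the grouped dimensions as the key."""
    key = (row["Dim1"], row["Dim2"])
    value = {
        "msum": row["Measure1"],
        "recs": 1,
        "mmin": row["Measure1"],
        # Procedural logic inside a measure, like the CASE expression in SQL.
        "mmax": row["Measure2"] if row["Measure2"] < 100 else 0,
    }
    return key, value


def reduce_values(vals):
    """Reduce phase: aggregate every measure by hand."""
    return {
        "msum": sum(v["msum"] for v in vals),
        "recs": sum(v["recs"] for v in vals),
        "mmin": min(v["mmin"] for v in vals),
        "mmax": max(v["mmax"] for v in vals),
    }


def finalize(val):
    """Finalize phase: aggregates that depend on the record count."""
    return dict(val, mavg=val["msum"] / val["recs"])


def map_reduce(rows, query):
    groups = defaultdict(list)
    for row in rows:
        if query(row):  # the filter runs before the map phase
            key, value = map_row(row)
            groups[key].append(value)
    return {key: finalize(reduce_values(vals)) for key, vals in groups.items()}


def where(row):
    """Same filter as the WHERE clause and the `query` document."""
    return row["Filter1"] in ("A", "B") and row["Filter2"] == "C" and row["Filter3"] > 123


rows = [
    {"Dim1": "x", "Dim2": "y", "Measure1": 10, "Measure2": 50,
     "Filter1": "A", "Filter2": "C", "Filter3": 200},
    {"Dim1": "x", "Dim2": "y", "Measure1": 30, "Measure2": 150,
     "Filter1": "B", "Filter2": "C", "Filter3": 124},
    {"Dim1": "x", "Dim2": "z", "Measure1": 5, "Measure2": 20,
     "Filter1": "A", "Filter2": "D", "Filter3": 500},
]
result = map_reduce(rows, where)
```

Filtering on aggregates (the HAVING clause), sorting and paging would then be applied to `result`, just as the slide applies `find`, `sort`, `skip` and `limit` to the output collection.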
  28. Availability versus Consistency
  29. CAP Theorem, Eric Brewer
  30. Availability, Consistency, Partition Tolerance: pick two
  31. Strong Consistency: After the update, any subsequent access will return the updated value. (Diagram: value = "foo" at step 0; value = "bar" is written at step 1; at step 2, A, B and C all see value = "bar".)
  32. Weak Consistency: The system does not guarantee that at any given point in the future subsequent access will return the updated value. (Diagram: value = "foo" at step 0; value = "bar" is written at step 1; after step 1, A, B and C may see either "bar" or "foo".)
  33. Eventual Consistency: If no updates are made to the object, eventually all accesses will return the last updated value. (Diagram: value = "foo" at step 0; value = "bar" is written at step 1; at some time t ≥ 1, A, B and C all see value = "bar".)
  34. Session Consistency: Within the “session”, the system guarantees read-your-writes consistency. (Diagram: in session 1, value = "foo" at step 0, value = "bar" is written at step 1 and read back at step 2; session 2 still reads value = "foo" at step 2.)
  35. Read-your-writes Consistency: Process A, after updating a data item, always accesses the updated value and never sees an older value. (Diagram: value = "foo" at step 0; A writes value = "bar" at step 1 and reads value = "bar" at step 2.)
  36. Monotonic Read Consistency: If a process has seen a particular value for the object, any subsequent access will never return any previous values. (Diagram: processes A, B and C reading value = "foo" and then value = "bar" over steps 0 to 4.)
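One common way to obtain the session guarantees on slides 34 to 36 is to let each session remember the newest version it has seen and refuse answers from replicas that lag behind it. The sketch below assumes that approach; the slides do not say how any particular store implements it, and the classes `Replica` and `Session` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    version: int = 0
    value: str = "foo"

    def apply(self, version, value):
        # Accept only newer versions, so replaying replication is harmless.
        if version > self.version:
            self.version, self.value = version, value


class Session:
    def __init__(self, replicas):
        self.replicas = replicas
        self.last_seen = 0  # newest version this session has written or read

    def write(self, value):
        # Assumption: replica 0 takes all writes; the others catch up later.
        primary = self.replicas[0]
        primary.apply(primary.version + 1, value)
        self.last_seen = max(self.last_seen, primary.version)

    def read(self, preferred):
        # Try the preferred replica first, then fall back to the others.
        order = [self.replicas[preferred]] + [
            r for i, r in enumerate(self.replicas) if i != preferred
        ]
        for replica in order:
            if replica.version >= self.last_seen:
                self.last_seen = replica.version
                return replica.value
        raise RuntimeError("no replica is fresh enough for this session")
```

A session that wrote "bar" will skip a stale replica and read "bar" elsewhere (read-your-writes), while a different session may still read "foo" from that stale replica until replication catches up. Once a session has read "bar", it never goes back to "foo" (monotonic reads).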
  37. Eventual Consistency in RDBMS: log shipping from the primary to a backup replica is asynchronous (diagram: A, steps 1 to 3). Eventual consistency is not just a property of NoSQL solutions.
  38. No Strong Consistency in Face Of...
  39. Network Partitions (diagram): A writes a new value, the new value is replicated, and a read returns the new value.
  40. Network Partitions (diagram, with a warning mark on the replication step): A writes a new value, the new value is replicated, and a read returns the new value.
  41. Partition Tolerance (diagram): A writes a new value, replication of the new value fails, and a read returns the old value.
  42. Partition Intolerance (diagram): replication of the new value fails, so A's attempt to write a new value fails.
  43. How to do better?
  44. Proper Replication Factor: N=4, W=3, R=2 (diagram: client A)
  45. Optimizations
       • Optimize read: R = 1, N = W
       • Optimize write: W = 1, N = R
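The settings on slides 44 and 45 all satisfy the usual quorum condition R + W > N, which the slides do not spell out: if every write reaches W of the N replicas and every read consults R of them, the two sets must share at least one replica. The Python sketch below checks the arithmetic and also verifies the overlap by brute force; the function names are invented here.

```python
from itertools import combinations


def quorums_overlap(n, w, r):
    """True when any R replicas must include one of any W written replicas."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("W and R must be between 1 and N")
    return r + w > n


def every_read_meets_every_write(n, w, r):
    """Brute-force check of the same property over all replica subsets."""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )
```

With N=4, W=3, R=2 the sum is 5 > 4, so every read overlaps the latest write. R=1 with W=N makes reads cheap and writes expensive, and W=1 with R=N does the opposite. A setting such as N=4, W=2, R=2 fails the check, so a read can miss the latest write.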
  46. Consistent Hashing (diagram: key K placed on a ring of nodes A, B, C, D, E, F, G, H)
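The ring on slide 46 can be sketched in a few lines of Python: nodes and keys are hashed onto the same circle, and a key belongs to the first node found clockwise from its position. The choice of MD5 and the names `ring_position` and `HashRing` are assumptions made for this sketch, and it places each node at a single point, without the virtual nodes that real systems often add.

```python
import bisect
import hashlib


def ring_position(name):
    """Hash a node name or key to an integer position on the ring."""
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    return int(digest, 16)


class HashRing:
    def __init__(self, nodes):
        # Sort the nodes by their position on the ring.
        self._points = sorted((ring_position(node), node) for node in nodes)
        self._positions = [position for position, _ in self._points]

    def node_for(self, key):
        # First node strictly clockwise of the key, wrapping around at the end.
        index = bisect.bisect_right(self._positions, ring_position(key))
        return self._points[index % len(self._points)][1]


ring = HashRing(list("ABCDEFGH"))
owner = ring.node_for("K")
```

The point of the scheme is that membership changes stay local: if node H leaves, only the keys that H owned move to another node, and every other key keeps its owner.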
  47. W=3 (diagram: the ring of nodes A, B, C, D, E, F, G, H)
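Slide 47 only shows W=3 next to the ring. One plausible reading, assumed here, is that a write goes to the node responsible for the key and to the next nodes clockwise until W replicas hold it. The Python sketch below illustrates that reading; `RING` and `preference_list` are names made up for the example.

```python
# Nodes in clockwise ring order, as drawn on the slide.
RING = ["A", "B", "C", "D", "E", "F", "G", "H"]


def preference_list(coordinator, w, ring=RING):
    """Return the W nodes that receive a write, starting at the coordinator."""
    if not 1 <= w <= len(ring):
        raise ValueError("W must be between 1 and the number of nodes")
    start = ring.index(coordinator)
    # Walk clockwise and wrap around the end of the ring.
    return [ring[(start + step) % len(ring)] for step in range(w)]
```

For a key owned by A the write set is A, B, C; for a key owned by G it wraps around to G, H, A.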
  48. No free ride. You need to consider giving up on:
       • Avoiding redundancy
       • Referential integrity
       • Strong consistency
       • Ad hoc queries
       • Joins
       • Ease of reporting
       • ...
  49. NoSQL Today
  50. Resources
       • http://nosqlsummer.org/
       • http://nosql-database.org/
       • http://nosqltapes.com/
  51. Books
  52. No SQL, wspringer@xebia.com