NoSQL

Alternative NoSQL talk. Not as much theory. Not as much details on the architectural principles. More code samples.

Published in: Technology

Transcript

  • 1. NoSQL (Tuesday, March 22, 2011)
  • 2. The Software Crisis: Writing correct, understandable, and verifiable computer programs is difficult. (Edsger Dijkstra)
  • 3. The Software Crisis: “as long as there were no machines, programming was no problem at all; when we had a few weak computers, programming became a mild problem, and now we have gigantic computers, programming has become an equally gigantic problem.”
  • 4. IMS: The Hierarchical Database (1966), Vern Watts
  • 5. “A Relational Model of Data for Large Shared Data Banks” (1970), Ted Codd
  • 6. “In striving to make every user happy, a technology can actually leave the majority unhappy.” “Every good idea is generalized to its level of inapplicability.” (Peter Principle). Jim Gray
  • 8. “NoSQL” reintroduced (2009), Eric Evans
  • 9. Total Cost of Ownership: the price of a license; the price of support; the price of hardware. Oracle: +/- 47k per CPU? Software update / support: +/- 10k?
  • 10. Internet Scale: massive data collections; huge number of requests; coming from geographic areas across the globe; 24/7
  • 11. Availability
  • 12. Data Models
  • 13. Data Models
  • 14. Column Oriented: Column Family ≈ Table. Each row is a key followed by named columns (key | named column | named column | …) and can grow “indefinitely”. Empty cells are cheap (sparse table). Schemaless. No secondary indexes.
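To make the sparse-table idea concrete, here is a minimal in-memory sketch in JavaScript. It is not the API of any real column store: the `ColumnFamily` class, its methods, and the second row (`jd000001`, "Jane") are made up for illustration. It only shows that a row stores the columns it actually has, so an empty cell costs nothing and no schema is declared up front.

```javascript
// Hypothetical in-memory model of a column family: row key -> (column name -> value).
class ColumnFamily {
  constructor() {
    this.rows = new Map();
  }

  // Store a value in a named column, creating the row on first use.
  put(rowKey, column, value) {
    if (!this.rows.has(rowKey)) {
      this.rows.set(rowKey, new Map());
    }
    this.rows.get(rowKey).set(column, value);
  }

  // Return the value, or undefined when the row or the cell does not exist.
  get(rowKey, column) {
    const row = this.rows.get(rowKey);
    return row ? row.get(column) : undefined;
  }

  // Names of the columns actually stored for a row.
  columns(rowKey) {
    const row = this.rows.get(rowKey);
    return row ? [...row.keys()] : [];
  }
}

const persons = new ColumnFamily();
persons.put("ws103177", "firstname", "Wilfred");
persons.put("ws103177", "surname", "Springer");
persons.put("jd000001", "firstname", "Jane");   // made-up row
persons.put("jd000001", "twitter", "@jane");    // a column no other row has
```

Reading `persons.get("ws103177", "twitter")` yields `undefined` rather than a stored null, which is the sense in which empty cells are cheap.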
  • 15. BigTable
        DatastoreService service = ...;
        Key key = KeyFactory.createKey(family, recordId);
        Entity entity = service.get(key);
        entity.getProperty("firstname");
        entity.getProperty("surname");
  • 16. Column Oriented + Super Columns: a row key maps to named columns as before, plus super columns, each of which groups its own set of named columns.
  • 17. Key Value Store: schemaless; versioning
  • 18. Kyoto Cabinet (1 million records in 0.9 s)
        DB db = new DB(...);
        db.set("ws103177", "Wilfred Springer <wilfredspringer@sun.com>");
        db.get("ws103177");
  • 19. Graph Database: SPARQL
  • 20. Document Store: improved indexing; server-side processing
        XML:
        <persons>
          <person>
            <name>Wilfred</name>
            <surname>Springer</surname>
          </person>
          …
        </persons>
        JSON:
        [{ "Name" : "Wilfred", "Surname" : "Springer"}, … ]
  • 21. JSON: a Product document (diagram). Fields shown: DetailPageURL; EditorialReviews (Source, IsLinkSuppressed); ItemAttributes (Publisher, ReleaseDate, Format, Author, Binding, ProductGroup, Label, Languages (Type, Name), ProductName, Studio, PublicationDate, Title, Manufacturer, ListPrice (Amount, CurrencyCode, FormattedPrice)); LargeImage (URL, Width, Height); MediumImage (URL, Width, Height); SalesRank
  • 22. ItemAttributes fields (diagram): Publisher, ReleaseDate, Format, Author, Binding, ProductGroup, Label, Type, Languages, Name, ProductName, Studio, PublicationDate, Title, Manufacturer, ListPrice (Amount, CurrencyCode, FormattedPrice)
  • 23. Various Queries (diagram: Product → ItemAttributes → NumberOfPages)
        // find all products
        db.products.find()
        // find products with 446 pages (slow)
        db.products.find({"ItemAttributes.NumberOfPages": 446})
        // find products with 446 pages (fast)
        db.products.ensureIndex({"ItemAttributes.NumberOfPages": 1})
        db.products.find({"ItemAttributes.NumberOfPages": 446})
  • 24. Find books on “java”
        db.products.find(
          {"fs_keywords_terms": "java"},
          {"ItemAttributes.Title": 1}
        )
        Sample Product document (diagram): "_id" : …; ItemAttributes "Title" : "TCP/IP Sockets in Java, Second Edition: Practical Guide for Programmers"; "fs_keywords_terms" : ["kenneth", "l", "calvert", "michael", "j", "donahoo", "tcp", "ip", "sockets", "java", "second", "edition", "practical", "guide", "programmers"]
  • 25. ... with the worst sales rank
        db.products.find(
          {"fs_keywords_terms": "java"},
          {"ItemAttributes.Title": 1}
        ).sort({"SalesRank": -1})
        Sample Product document (diagram): "_id" : …; ItemAttributes "Title" : "TCP/IP Sockets in Java, Second Edition: Practical Guide for Programmers"; "fs_keywords_terms" : ["kenneth", "l", "calvert", "michael", "j", "donahoo", "tcp", "ip", "sockets", "java", "second", "edition", "practical", "guide", "programmers"]
  • 26. Count books per #pages
        db.products.group({
          key: {"ItemAttributes.NumberOfPages": true},
          cond: {},
          initial: {count: 0},
          reduce: function(obj, prev) { prev.count++ }
        })
  • 27. SQL (MySQL) versus Mongo (MongoDB). Revision 4, created 2010-03-0… by Rick Osborne, rickosborne.org
        MySQL:
        SELECT
          Dim1, Dim2,
          SUM(Measure1) AS MSum,
          COUNT(*) AS RecordCount,
          AVG(Measure2) AS MAvg,
          MIN(Measure1) AS MMin,
          MAX(CASE WHEN Measure2 < 100 THEN Measure2 END) AS MMax
        FROM DenormAggTable
        WHERE (Filter1 IN ('A','B'))
          AND (Filter2 = 'C')
          AND (Filter3 > 123)
        GROUP BY Dim1, Dim2
        HAVING (MMin > 0)
        ORDER BY RecordCount DESC
        LIMIT 4, 8
        MongoDB:
        db.runCommand({
          mapreduce: "DenormAggCollection",
          query: {
            filter1: { $in: [ 'A', 'B' ] },
            filter2: 'C',
            filter3: { $gt: 123 }
          },
          map: function() { emit(
            { d1: this.Dim1, d2: this.Dim2 },
            { msum: this.measure1, recs: 1, mmin: this.measure1,
              mmax: this.measure2 < 100 ? this.measure2 : 0 }
          );},
          reduce: function(key, vals) {
            // start mmin from the first value; starting from 0 would never pick up a positive minimum
            var ret = { msum: 0, recs: 0, mmin: vals[0].mmin, mmax: 0 };
            for(var i = 0; i < vals.length; i++) {
              ret.msum += vals[i].msum;
              ret.recs += vals[i].recs;
              if(vals[i].mmin < ret.mmin) ret.mmin = vals[i].mmin;
              if((vals[i].mmax < 100) && (vals[i].mmax > ret.mmax))
                ret.mmax = vals[i].mmax;
            }
            return ret;
          },
          finalize: function(key, val) {
            // note: this averages measure1 (msum), whereas the SQL averages Measure2
            val.mavg = val.msum / val.recs;
            return val;
          },
          out: 'result1',
          verbose: true
        });
        db.result1.
          find({ mmin: { $gt: 0 } }).
          sort({ recs: -1 }).
          skip(4).
          limit(8);
        Notes on the chart:
        Grouped dimension columns are pulled out as keys in the map function, reducing the size of the working set.
        Measures must be manually aggregated.
        Aggregates depending on record counts must wait until finalization.
        Measures can use procedural logic.
        Filters have an ORM/ActiveRecord-looking style.
        Aggregate filtering must be applied to the result set, not in the map/reduce.
        Ascending: 1; Descending: -1
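The map, reduce, and finalize steps on the slide can be tried without a database. The sketch below is a hypothetical stand-in written in plain JavaScript, not MongoDB's implementation: the `mapReduce` helper and the three sample records are invented for illustration, and a real engine may call `reduce` several times on partial results, which is why the reduced value keeps the same shape as the emitted values.

```javascript
// Hypothetical stand-in for a map-reduce engine: run map over every record,
// group the emitted values by key, then reduce and finalize each group.
function mapReduce(records, map, reduce, finalize) {
  const groups = new Map();
  for (const record of records) {
    map(record, (key, value) => {
      const k = JSON.stringify(key); // objects as keys: group by their JSON form
      if (!groups.has(k)) groups.set(k, []);
      groups.get(k).push(value);
    });
  }
  const results = [];
  for (const [k, values] of groups) {
    const key = JSON.parse(k);
    results.push({ _id: key, value: finalize(key, reduce(key, values)) });
  }
  return results;
}

// Made-up records, for illustration only.
const records = [
  { dim: "a", measure: 10 },
  { dim: "a", measure: 30 },
  { dim: "b", measure: 5 },
];

const results = mapReduce(
  records,
  // map: the grouping dimension becomes the key, the measures become the value
  (rec, emit) => emit({ d: rec.dim }, { msum: rec.measure, recs: 1, mmin: rec.measure }),
  // reduce: aggregate the measures by hand
  (key, vals) => {
    const ret = { msum: 0, recs: 0, mmin: vals[0].mmin };
    for (const v of vals) {
      ret.msum += v.msum;
      ret.recs += v.recs;
      if (v.mmin < ret.mmin) ret.mmin = v.mmin;
    }
    return ret;
  },
  // finalize: the average depends on the record count, so it is computed last
  (key, val) => ({ ...val, mavg: val.msum / val.recs })
);
```

The average is left to `finalize` because it cannot be combined across partial reductions the way a sum or a count can.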
  • 28. Availability versus Consistency
  • 29. CAP Theorem, Eric Brewer
  • 30. Availability, Consistency, Partition Tolerance: pick two
  • 31. Strong Consistency: After the update, any subsequent access will return the updated value. Diagram: at 0, value = "foo"; at 1, A writes value = "bar"; at 2, A, B, and C all read value = "bar".
  • 32. Weak Consistency: The system does not guarantee that at any given point in the future subsequent access will return the updated value. Diagram: at 0, value = "foo"; at 1, A writes value = "bar"; after 1, A, B, and C read value = "bar" or "foo".
  • 33. Eventual Consistency: If no updates are made to the object, eventually all accesses will return the last updated value. Diagram: at 0, value = "foo"; at 1, A writes value = "bar"; at some t ≥ 1, A, B, and C all read value = "bar".
  • 34. Session Consistency: Within the “session”, the system guarantees read-your-writes consistency. Diagram: at 0, value = "foo"; at 1, A writes value = "bar" in session 1; at 2, a read in session 1 returns value = "bar", while a read in session 2 may still return value = "foo".
  • 35. Read-your-writes Consistency: Process A, after updating a data item, always accesses the updated value and never sees an older value. Diagram: at 0, value = "foo"; at 1, A writes value = "bar"; at 2, A reads value = "bar".
  • 36. Monotonic Read Consistency: If a process has seen a particular value for the object, any subsequent access will never return any previous values. Diagram: a reader sees value = "foo" at steps 1 and 2; once it has seen value = "bar" (step 4), it never sees "foo" again.
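One way a client could get monotonic reads on top of replicas that lag behind is to remember the newest version it has seen. The sketch below assumes that every value carries a version number, which the slides do not state; `Replica` and `MonotonicReader` are hypothetical names, not part of any product mentioned in the talk.

```javascript
// Hypothetical replica: holds a value tagged with a version number (an assumption).
class Replica {
  constructor(version, value) {
    this.version = version;
    this.value = value;
  }
  read() {
    return { version: this.version, value: this.value };
  }
}

// Client that enforces monotonic reads by remembering the newest answer it has seen
// and ignoring answers from replicas that lag behind it.
class MonotonicReader {
  constructor() {
    this.lastSeen = null;
  }
  read(replica) {
    const answer = replica.read();
    if (this.lastSeen && answer.version < this.lastSeen.version) {
      return this.lastSeen; // stale replica: keep what we already saw
    }
    this.lastSeen = answer;
    return answer;
  }
}

const fresh = new Replica(2, "bar"); // already received the update
const stale = new Replica(1, "foo"); // update not yet replicated

const reader = new MonotonicReader();
const first = reader.read(stale).value;  // nothing seen yet, so "foo" is acceptable
const second = reader.read(fresh).value; // newer version replaces it
const third = reader.read(stale).value;  // the stale answer is ignored
```

Without the version check, the third read would go back from "bar" to "foo", which is exactly what monotonic read consistency rules out.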
  • 37. Eventual Consistency in RDBMS: log shipping from the primary to a backup replica is asynchronous. Eventual consistency is not just a property of NoSQL solutions.
  • 38. No Strong Consistency in the Face of...
  • 39. Network Partitions: A writes new value; the system replicates new value; a reader reads new value
  • 40. Network Partitions: the same flow (A writes new value, replicates new value, reads new value), with a warning sign (!) added to the diagram
  • 41. Partition Tolerance: A writes new value; the system fails to replicate new value; a reader reads old value
  • 42. Partition Intolerance: the system fails to replicate new value; A's attempt to write a new value fails
  • 43. How to do better?
  • 44. Proper Replication Factor (diagram, node A): N = 4, W = 3, R = 2
  • 45. Optimizations: optimize read: R = 1, N = W; optimize write: W = 1, N = R
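Assuming the usual reading of these symbols (N replicas per item, W replicas that must acknowledge a write, R replicas consulted on a read, none of which the slides spell out), all three configurations above satisfy R + W > N, so every read set shares at least one replica with every write set. The sketch below checks that claim by brute force; the function names are made up for illustration.

```javascript
// Count the set bits in a bitmask (each bit stands for one replica).
function popcount(mask) {
  let count = 0;
  while (mask) {
    count += mask & 1;
    mask >>= 1;
  }
  return count;
}

// All subsets of n replicas that contain exactly `size` members, as bitmasks.
function masksOfSize(n, size) {
  const out = [];
  for (let mask = 0; mask < (1 << n); mask++) {
    if (popcount(mask) === size) out.push(mask);
  }
  return out;
}

// True when every possible write set of size w shares a replica
// with every possible read set of size r.
function alwaysOverlap(n, w, r) {
  return masksOfSize(n, w).every((writeSet) =>
    masksOfSize(n, r).every((readSet) => (writeSet & readSet) !== 0)
  );
}

// The shortcut the brute-force check is meant to confirm.
function quorumCondition(n, w, r) {
  return r + w > n;
}

const slide44 = alwaysOverlap(4, 3, 2);       // N = 4, W = 3, R = 2
const optimizedRead = alwaysOverlap(4, 4, 1); // R = 1, W = N
const optimizedWrite = alwaysOverlap(4, 1, 4);// W = 1, R = N
const tooLoose = alwaysOverlap(4, 2, 2);      // R + W = N: sets can miss each other
```

With N = 4, a configuration such as W = 2, R = 2 fails the check, because a read can land on the two replicas the write never reached.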
  • 46. Consistent Hashing: key K is placed on a ring of nodes A, B, C, D, E, F, G, H
  • 47. W = 3 on a ring of nodes A, B, C, D, E, F, G, H
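A minimal sketch of how a key could be mapped to three nodes on such a ring. Several things here are assumptions rather than anything the slides specify: the replicas are taken to be the next nodes clockwise from the key's position, each node gets a single position, and FNV-1a is used only because it is short enough to write inline. `Ring` and `nodesFor` are hypothetical names.

```javascript
// 32-bit FNV-1a hash, used here only to place names on the ring.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

class Ring {
  constructor(nodes) {
    // Each node gets one position on the ring; positions are kept sorted.
    this.points = nodes
      .map((node) => ({ node, position: fnv1a(node) }))
      .sort((a, b) => a.position - b.position);
  }

  // The first `count` nodes found walking clockwise from the key's position.
  nodesFor(key, count) {
    const position = fnv1a(key);
    let start = this.points.findIndex((p) => p.position >= position);
    if (start === -1) start = 0; // past the last node: wrap around the ring
    const result = [];
    const wanted = Math.min(count, this.points.length);
    for (let i = 0; i < wanted; i++) {
      result.push(this.points[(start + i) % this.points.length].node);
    }
    return result;
  }
}

const nodes = ["A", "B", "C", "D", "E", "F", "G", "H"];
const ring = new Ring(nodes);
const owners = ring.nodesFor("K", 3); // three nodes that would hold key K
```

The point of the ring is that membership changes stay local: if the first owner of a key leaves, the remaining two owners keep the key and only one further node has to take over a copy.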
  • 48. No free ride. You need to consider giving up on: avoiding redundancy; referential integrity; strong consistency; ad hoc queries; joins; ease of reporting; ...
  • 49. NoSQL Today
  • 50. Resources: http://nosqlsummer.org/ http://nosql-database.org/ http://nosqltapes.com/
  • 51. Books
  • 52. No SQL, wspringer@xebia.com