NoSQL
Alternative NoSQL talk. Less theory and less detail on the architectural principles; more code samples.

© All Rights Reserved

NoSQL Presentation Transcript

  • 1. NoSQL (Tuesday, March 22, 2011)
  • 2. The Software Crisis: Writing correct, understandable, and verifiable computer programs is difficult. (Edsger Dijkstra)
  • 3. The Software Crisis: “as long as there were no machines, programming was no problem at all; when we had a few weak computers, programming became a mild problem, and now we have gigantic computers, programming has become an equally gigantic problem.”
  • 4. IMS, the hierarchical database (1966). Vern Watts
  • 5. “A Relational Model of Data for Large Shared Data Banks” (1970). Ted Codd
  • 6. “In striving to make every user happy, a technology can actually leave the majority unhappy.” “Every good idea is generalized to its level of inapplicability.” (Peter Principle) Jim Gray
  • 7. (no text on this slide)
  • 8. Eric Evans: “NoSQL” reintroduced (2009)
  • 9. Total Cost of Ownership
      • The price of a license
      • The price of support
      • The price of hardware
    Oracle +/- 47k / CPU? Software update / support +/- 10k?
  • 10. Internet Scale
      • Massive data collections
      • Huge number of requests
      • Coming from geographic areas across the globe
      • 24/7
  • 11. Availability
  • 12. Data Models
  • 13. Data Models
  • 14. Column Oriented: Column Family ≈ Table. [Diagram: a row consisting of a key followed by named columns …] Can grow “indefinitely”. Empty cells are cheap (sparse table). Schemaless. No secondary indexes.
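The sparse-table idea on slide 14 can be sketched as a map of maps. This is only an illustration: the `put` and `get` helpers and the sample rows are assumptions made for this example and do not come from any particular column store.

```javascript
// Hypothetical in-memory model: a column family maps a row key to a map of
// named columns. A row stores only the columns it actually has.
const columnFamily = new Map();

function put(rowKey, column, value) {
  if (!columnFamily.has(rowKey)) columnFamily.set(rowKey, new Map());
  columnFamily.get(rowKey).set(column, value);
}

function get(rowKey, column) {
  const row = columnFamily.get(rowKey);
  // A missing cell takes no space; it is simply absent from the row's map.
  return row ? row.get(column) : undefined;
}

put("row-1", "firstname", "Wilfred");
put("row-1", "surname", "Springer");
put("row-2", "firstname", "Jane");   // this row has no surname column
put("row-2", "nickname", "jd");      // a column that no other row has
```

Because there is no shared schema, each row can carry a different set of columns, and adding a column to one row does not touch the others.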
  • 15. BigTable
        // Look up one record by its key, then read two of its properties.
        DatastoreService service = ...;
        Key key = KeyFactory.createKey(family, recordId);
        Entity entity = service.get(key);
        entity.getProperty("firstname");
        entity.getProperty("surname");
  • 16. Column Oriented + Super Columns [Diagram: a row consisting of a key followed by named columns …; below it, super columns, each holding its own named columns …]
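Super columns add one more level of nesting to the layout on slide 16. A minimal sketch, assuming a plain nested map; the helper names and the sample data are hypothetical.

```javascript
// Hypothetical model: row key -> super column name -> column name -> value.
const superRows = new Map();

function putSuper(rowKey, superColumn, column, value) {
  if (!superRows.has(rowKey)) superRows.set(rowKey, new Map());
  const row = superRows.get(rowKey);
  if (!row.has(superColumn)) row.set(superColumn, new Map());
  row.get(superColumn).set(column, value);
}

// Returns all columns grouped under one super column as a plain object.
function getSuper(rowKey, superColumn) {
  const row = superRows.get(rowKey);
  const columns = row && row.get(superColumn);
  return columns ? Object.fromEntries(columns) : {};
}

putSuper("row-1", "home", "city", "Sample City");
putSuper("row-1", "home", "zip", "0000");
putSuper("row-1", "work", "city", "Other City");
```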
  • 17. Key Value Store
      • Schemaless
      • Versioning
  • 18. Kyoto Cabinet
        DB db = new DB(...);
        db.set("ws103177", "Wilfred Springer <wilfredspringer@sun.com>");
        db.get("ws103177");
    1 mln records in 0.9 s
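Slides 17 and 18 describe a store that only knows keys and opaque values, optionally keeping versions. The sketch below is an assumption-laden illustration of that idea in plain JavaScript; `VersionedStore` is a hypothetical class and not the Kyoto Cabinet API.

```javascript
// Hypothetical key-value store that keeps every version written for a key.
class VersionedStore {
  constructor() {
    this.data = new Map(); // key -> array of values, oldest first
  }

  // Stores a new version and returns its version number (starting at 1).
  set(key, value) {
    const versions = this.data.get(key) || [];
    versions.push(value);
    this.data.set(key, versions);
    return versions.length;
  }

  // Returns the latest value, or a specific version when one is given.
  get(key, version) {
    const versions = this.data.get(key);
    if (!versions) return undefined;
    return version === undefined
      ? versions[versions.length - 1]
      : versions[version - 1];
  }
}

const store = new VersionedStore();
store.set("ws103177", "Wilfred Springer");
store.set("ws103177", "W. Springer");
```

The store never inspects the value, which is what makes it schemaless: any structure inside the value is the application's concern.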
  • 19. Graph Database: SPARQL
  • 20. Document Store
    XML:
        <persons>
          <person>
            <name>Wilfred</name>
            <surname>Springer</surname>
          </person>
          …
        </persons>
    JSON:
        [{ "Name" : "Wilfred", "Surname" : "Springer" }, … ]
    Improved indexing. Server-side processing.
  • 21. JSON [Diagram: tree of a Product document. Node labels: Product, DetailPageURL, EditorialReviews, Source, IsLinkSuppressed, ItemAttributes, Author, Binding, Format, Label, Languages, Name, Type, ListPrice, Amount, CurrencyCode, FormattedPrice, Manufacterer, ProductGroup, ProductName, PublicationDate, Publisher, RelaseDate, Studio, Title, LargeImage, MediumImage, URL, Width, Height, SalesRank]
  • 22. [Diagram: detail of the ItemAttributes subtree. Node labels: ItemAttributes, Author, Binding, Format, Label, Languages, Name, Type, ListPrice, Amount, CurrencyCode, FormattedPrice, Manufacterer, ProductGroup, ProductName, PublicationDate, Publisher, RelaseDate, Studio, Title]
  • 23. Various Queries
        // find all products
        db.products.find()
        // find products with 446 pages (slow)
        db.products.find({"ItemAttributes.NumberOfPages": 446})
        // find products with 446 pages (fast)
        db.products.ensureIndex({"ItemAttributes.NumberOfPages": 1})
        db.products.find({"ItemAttributes.NumberOfPages": 446})
    [Diagram: Product → ItemAttributes → NumberOfPages]
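The difference between the slow and the fast query on slide 23 can be sketched without a database: the slow path scans every document, the fast path consults a structure built once. In this illustration a hash map stands in for the sorted index a real database would maintain, and the sample documents and function names are hypothetical.

```javascript
// Hypothetical sample documents shaped like the products on the slides.
const products = [
  { Title: "A", ItemAttributes: { NumberOfPages: 446 } },
  { Title: "B", ItemAttributes: { NumberOfPages: 120 } },
  { Title: "C", ItemAttributes: { NumberOfPages: 446 } },
];

// Slow path: inspect every document on every query.
function findByPagesScan(pages) {
  return products.filter((p) => p.ItemAttributes.NumberOfPages === pages);
}

// Fast path: group the documents by the indexed field once, up front.
function buildIndex(docs, keyFn) {
  const index = new Map();
  for (const doc of docs) {
    const key = keyFn(doc);
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(doc);
  }
  return index;
}

const pagesIndex = buildIndex(products, (p) => p.ItemAttributes.NumberOfPages);

function findByPagesIndexed(pages) {
  return pagesIndex.get(pages) || [];
}
```

Both functions return the same documents; the indexed version pays for its speed with extra memory and with the work of keeping the index up to date on every write.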
  • 24. Find books on “java”
        db.products.find(
          {"fs_keywords_terms": "java"},
          {"ItemAttributes.Title": 1}
        )
    [Diagram: a Product document with
        "_id" : ObjectId("…")
        ItemAttributes → "Title" : "TCP/IP Sockets in Java, Second Edition: Practical Guide for Programmers"
        "fs_keywords_terms" : ["kenneth", "l", "calvert", "michael", "j", "donahoo", "tcp", "ip", "sockets", "java", "second", "edition", "practical", "guide", "programmers"]]
  • 25. ... with the worst sales rank
        db.products.find(
          {"fs_keywords_terms": "java"},
          {"ItemAttributes.Title": 1}
        ).sort({"SalesRank": -1})
    [Diagram: the same Product document as on slide 24]
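What the queries on slides 24 and 25 ask for can be restated over a plain array: match documents whose keyword array contains the term, sort by `SalesRank` descending, and keep only the title. The sample catalog and the `titlesByKeyword` helper are hypothetical and exist only for this sketch.

```javascript
// Hypothetical sample data; only the field names follow the slides.
const catalog = [
  { ItemAttributes: { Title: "Book one" }, SalesRank: 1200, fs_keywords_terms: ["java", "sockets"] },
  { ItemAttributes: { Title: "Book two" }, SalesRank: 98000, fs_keywords_terms: ["java", "guide"] },
  { ItemAttributes: { Title: "Book three" }, SalesRank: 15, fs_keywords_terms: ["python"] },
];

function titlesByKeyword(docs, term) {
  return docs
    // An array field matches when any of its elements equals the term.
    .filter((d) => d.fs_keywords_terms.includes(term))
    // Descending sales rank: the largest (worst) rank comes first.
    .sort((a, b) => b.SalesRank - a.SalesRank)
    // Projection: keep only the title.
    .map((d) => d.ItemAttributes.Title);
}
```

Calling `titlesByKeyword(catalog, "java")` lists the two matching titles with the worst-ranked one first.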
  • 26. Count books per #pages
        db.products.group({
          key: {"ItemAttributes.NumberOfPages": true},
          cond: {},
          initial: {count: 0},
          reduce: function(obj, prev) { prev.count++ }
        })
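The grouping on slide 26 amounts to starting every group at `{count: 0}` and incrementing it once per document. A minimal sketch over an in-memory array, with a hypothetical `countBy` helper and made-up sample data:

```javascript
// Counts documents per key; keyFn extracts the grouping key from a document.
function countBy(docs, keyFn) {
  const counts = new Map();
  for (const doc of docs) {
    const key = keyFn(doc);
    counts.set(key, (counts.get(key) || 0) + 1); // initial value 0, then ++
  }
  return counts;
}

// Hypothetical sample documents.
const books = [
  { ItemAttributes: { NumberOfPages: 446 } },
  { ItemAttributes: { NumberOfPages: 120 } },
  { ItemAttributes: { NumberOfPages: 446 } },
];

const booksPerPageCount = countBy(books, (b) => b.ItemAttributes.NumberOfPages);
```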
  • 27. MySQL versus MongoDB
    MySQL:
        SELECT
          Dim1, Dim2,
          SUM(Measure1) AS MSum,
          COUNT(*) AS RecordCount,
          AVG(Measure2) AS MAvg,
          MIN(Measure1) AS MMin,
          MAX(CASE
            WHEN Measure2 < 100
            THEN Measure2
          END) AS MMax
        FROM DenormAggTable
        WHERE (Filter1 IN ('A','B'))
          AND (Filter2 = 'C')
          AND (Filter3 > 123)
        GROUP BY Dim1, Dim2
        HAVING (MMin > 0)
        ORDER BY RecordCount DESC
        LIMIT 4, 8
    MongoDB:
        db.runCommand({
          mapreduce: "DenormAggCollection",
          query: {
            filter1: { $in: [ 'A', 'B' ] },
            filter2: 'C',
            filter3: { $gt: 123 }
          },
          map: function() { emit(
            { d1: this.Dim1, d2: this.Dim2 },
            { msum: this.measure1, recs: 1, mmin: this.measure1,
              mmax: this.measure2 < 100 ? this.measure2 : 0 }
          );},
          reduce: function(key, vals) {
            // Seed mmin from the first value; starting at 0 would hide every positive minimum.
            var ret = { msum: 0, recs: 0, mmin: vals[0].mmin, mmax: 0 };
            for(var i = 0; i < vals.length; i++) {
              ret.msum += vals[i].msum;
              ret.recs += vals[i].recs;
              if(vals[i].mmin < ret.mmin) ret.mmin = vals[i].mmin;
              if((vals[i].mmax < 100) && (vals[i].mmax > ret.mmax))
                ret.mmax = vals[i].mmax;
            }
            return ret;
          },
          finalize: function(key, val) {
            val.mavg = val.msum / val.recs;
            return val;
          },
          out: 'result1',
          verbose: true
        });
        db.result1.
          find({ mmin: { $gt: 0 } }).
          sort({ recs: -1 }).
          skip(4).
          limit(8);
    Notes on the chart:
      • Grouped dimension columns are pulled out as keys in the map function, reducing the size of the working set.
      • Measures must be manually aggregated.
      • Aggregates depending on record counts must wait until finalization.
      • Measures can use procedural logic.
      • Filters have an ORM/ActiveRecord-looking style.
      • Aggregate filtering must be applied to the result set, not in the map/reduce.
      • Ascending: 1; Descending: -1
    Revision 4, created 2010-03-0…, Rick Osborne, rickosborne.org
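The map, reduce and finalize phases on slide 27 can be tried out without a server. The following is a simplified, single-process sketch: the `mapReduce` driver, its callback signatures and the sample records are assumptions made for this example and differ from the real MongoDB command, which passes the document as `this` and may call reduce several times per key.

```javascript
// Hypothetical driver: map emits (key, value) pairs, values are grouped by
// key, reduce folds each group, and finalize post-processes the folded value.
function mapReduce(records, map, reduce, finalize) {
  const groups = new Map();
  for (const rec of records) {
    map(rec, (key, value) => {
      const id = JSON.stringify(key); // object keys are compared by content
      if (!groups.has(id)) groups.set(id, { key, values: [] });
      groups.get(id).values.push(value);
    });
  }
  return [...groups.values()].map((g) => finalize(g.key, reduce(g.key, g.values)));
}

// Hypothetical sample records.
const records = [
  { Dim1: "x", Dim2: "y", measure1: 10 },
  { Dim1: "x", Dim2: "y", measure1: 30 },
  { Dim1: "x", Dim2: "z", measure1: 5 },
];

const aggregated = mapReduce(
  records,
  // map: the grouped dimensions become the key, the measures the value
  (rec, emit) => emit({ d1: rec.Dim1, d2: rec.Dim2 }, { msum: rec.measure1, recs: 1 }),
  // reduce: sums and counts are aggregated by hand
  (key, vals) =>
    vals.reduce((acc, v) => ({ msum: acc.msum + v.msum, recs: acc.recs + v.recs }),
                { msum: 0, recs: 0 }),
  // finalize: the average needs the final record count, so it is computed last
  (key, val) => ({ ...key, ...val, mavg: val.msum / val.recs })
);
```

The average is computed in finalize rather than in reduce because it depends on the complete record count of the group.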
  • 28. Availability versus Consistency
  • 29. CAP Theorem. Eric Brewer
  • 30. Availability, Consistency, Partition Tolerance: pick two
  • 31. Strong Consistency: After the update, any subsequent access will return the updated value. [Diagram: the value is "foo" at time 0; A writes "bar" at time 1; at time 2, A, B and C all read "bar".]
  • 32. Weak Consistency: The system does not guarantee that at any given point in the future subsequent access will return the updated value. [Diagram: the value is "foo" at time 0; A writes "bar" at time 1; after time 1, A, B and C may each read "bar" or "foo".]
  • 33. Eventual Consistency: If no updates are made to the object, eventually all accesses will return the last updated value. [Diagram: the value is "foo" at time 0; A writes "bar" at time 1; at some time t ≥ 1, A, B and C all read "bar".]
  • 34. Session Consistency: Within the “session”, the system guarantees read-your-writes consistency. [Diagram: the value is "foo" at time 0; in session 1, A writes "bar" at time 1 and reads "bar" at time 2; in session 2, a read at time 2 returns "foo".]
  • 35. Read-your-writes Consistency: Process A, after updating a data item, always accesses the updated value and never sees an older value. [Diagram: the value is "foo" at time 0; A writes "bar" at time 1 and reads "bar" at time 2.]
  • 36. Monotonic Read Consistency: If a process has seen a particular value for the object, any subsequent access will never return any previous values. [Diagram: the value is "foo" at time 0; C reads "foo" at times 1 and 2; A writes "bar" at time 3; C reads "bar" at time 4.]
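One way to obtain the session guarantees on slides 34 to 36 is for the client to remember the newest version it has written or read, and to refuse answers from replicas that are older than that. The sketch below assumes versioned replicas and asynchronous replication; the `Replica` and `Session` classes are hypothetical and not taken from any specific product.

```javascript
// Hypothetical replica holding one value and a version counter.
class Replica {
  constructor(value) {
    this.value = value;
    this.version = 0;
  }
  apply(value, version) {
    if (version > this.version) {
      this.value = value;
      this.version = version;
    }
  }
}

// Hypothetical client session; lastSeen is the newest version it knows about.
class Session {
  constructor(replicas) {
    this.replicas = replicas;
    this.lastSeen = 0;
  }
  // The write reaches only the first replica; the others lag behind.
  write(value) {
    const version = this.replicas[0].version + 1;
    this.replicas[0].apply(value, version);
    this.lastSeen = version;
  }
  // Only replicas at least as new as lastSeen may answer, which gives
  // read-your-writes and monotonic reads within this session.
  read() {
    const fresh = this.replicas.find((r) => r.version >= this.lastSeen);
    if (!fresh) throw new Error("no sufficiently fresh replica reachable");
    this.lastSeen = fresh.version;
    return fresh.value;
  }
}

const upToDate = new Replica("foo");
const lagging = new Replica("foo");
const session1 = new Session([upToDate, lagging]);
const session2 = new Session([lagging, upToDate]);
session1.write("bar");
```

After the write, `session1.read()` returns "bar", while `session2.read()` can still return "foo" from the lagging replica, as in the diagram on slide 34.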
  • 37. Eventual Consistency in RDBMS: Eventual consistency is not just a property of NoSQL solutions. [Diagram: log shipping; A writes to the primary, and the log is shipped asynchronously to a backup replica, in steps 1 to 3.]
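The log-shipping setup on slide 37 can be sketched in a few lines: the primary appends every write to a log, and the backup applies the log later. The class names are hypothetical, and the explicit `catchUp` call stands in for what a real system would do asynchronously in the background.

```javascript
// Hypothetical primary: applies writes locally and records them in a log.
class Primary {
  constructor() {
    this.data = new Map();
    this.log = [];
  }
  write(key, value) {
    this.data.set(key, value);
    this.log.push({ key, value });
  }
}

// Hypothetical backup: replays the log entries it has not applied yet.
class Backup {
  constructor() {
    this.data = new Map();
    this.applied = 0; // number of log entries already applied
  }
  catchUp(log) {
    for (; this.applied < log.length; this.applied++) {
      const { key, value } = log[this.applied];
      this.data.set(key, value);
    }
  }
}

const primaryDb = new Primary();
const backupDb = new Backup();

primaryDb.write("value", "foo");
backupDb.catchUp(primaryDb.log);

primaryDb.write("value", "bar");
const staleRead = backupDb.data.get("value"); // log not shipped yet
backupDb.catchUp(primaryDb.log);
const freshRead = backupDb.data.get("value"); // after the log has been applied
```

Between the second write and the second `catchUp`, the backup serves the old value; once no more writes arrive and the log has been applied, both sides agree.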
  • 38. No Strong Consistency in Face Of...
  • 39. Network Partitions [Diagram: A writes a new value; the new value is replicated; a reader reads the new value.]
  • 40. Network Partitions [Diagram: the same setup as slide 39, with a warning mark on the replication link.]
  • 41. Partition Tolerance [Diagram: A writes a new value; replication of the new value fails; a reader reads the old value.]
  • 42. Partition Intolerance [Diagram: replication of the new value fails; A's attempt to write a new value fails.]
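The two behaviours on slides 41 and 42 can be contrasted in a toy model: during a partition, a partition-tolerant system accepts the write on the reachable side and leaves the other side stale, while an intolerant one rejects the write. The node objects and the `write` function are hypothetical simplifications.

```javascript
// Hypothetical two-node cluster in which one node is cut off by a partition.
function makeCluster() {
  return [
    { name: "near", value: "old", reachable: true },
    { name: "far", value: "old", reachable: false },
  ];
}

// Returns true when the write was accepted.
function write(nodes, value, partitionTolerant) {
  const reachable = nodes.filter((n) => n.reachable);
  if (!partitionTolerant && reachable.length < nodes.length) {
    return false; // consistent, but the write is unavailable
  }
  for (const node of reachable) node.value = value; // unreachable nodes stay stale
  return true;
}

const tolerantCluster = makeCluster();
const tolerantAccepted = write(tolerantCluster, "new", true);

const intolerantCluster = makeCluster();
const intolerantAccepted = write(intolerantCluster, "new", false);
```

In the tolerant cluster a reader on the far side still sees "old"; in the intolerant cluster both sides agree, at the cost of a failed write.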
  • 43. How to do better?
  • 44. Proper Replication Factor: N=4, W=3, R=2
  • 45. Optimizations
      • Optimize read: R = 1, N = W
      • Optimize write: W = 1, N = R
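The settings on slides 44 and 45 share one property: with N replicas, W write acknowledgements and R replicas consulted per read, every read set overlaps every write set when R + W > N. A small sketch of that check follows; the function names are hypothetical.

```javascript
// True when every read quorum shares at least one replica with every write quorum.
function quorumsOverlap(n, w, r) {
  return r + w > n;
}

// Minimum number of replicas that a read and a write have in common.
function minimumOverlap(n, w, r) {
  return Math.max(0, r + w - n);
}

const slide44 = quorumsOverlap(4, 3, 2);        // N=4, W=3, R=2
const readOptimized = quorumsOverlap(4, 4, 1);  // R = 1, W = N
const writeOptimized = quorumsOverlap(4, 1, 4); // W = 1, R = N
const noGuarantee = quorumsOverlap(3, 1, 1);    // reads may miss the latest write
```

Optimizing for reads pushes the cost onto writes, which must reach all N replicas, and the reverse holds when optimizing for writes.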
  • 46. Consistent Hashing [Diagram: a ring of nodes A, B, C, D, E, F, G and H, with key K placed on the ring.]
  • 47. W=3 [Diagram: the same ring of nodes A to H.]
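The ring on slides 46 and 47 can be sketched as follows: nodes and keys are hashed onto the same circle, and a key is stored on the first few nodes found clockwise from its position. This is a minimal illustration, assuming MD5 as the hash and one position per node; the `Ring` class is hypothetical, and real systems typically place each node at several positions.

```javascript
const crypto = require("crypto");

// Maps a string to a 32-bit position on the ring (first 8 hex digits of MD5).
function ringPosition(text) {
  const hex = crypto.createHash("md5").update(text).digest("hex");
  return parseInt(hex.slice(0, 8), 16);
}

class Ring {
  constructor(nodes) {
    this.points = nodes
      .map((node) => ({ node, pos: ringPosition(node) }))
      .sort((a, b) => a.pos - b.pos);
  }

  // Returns the first `count` nodes clockwise from the key's position.
  nodesFor(key, count) {
    const pos = ringPosition(key);
    let start = this.points.findIndex((p) => p.pos >= pos);
    if (start === -1) start = 0; // past the last node: wrap around the ring
    const result = [];
    const limit = Math.min(count, this.points.length);
    for (let i = 0; i < limit; i++) {
      result.push(this.points[(start + i) % this.points.length].node);
    }
    return result;
  }
}

const ring = new Ring(["A", "B", "C", "D", "E", "F", "G", "H"]);
const replicasForK = ring.nodesFor("K", 3); // three replicas, as with W=3
```

When a node leaves the ring, only the keys it held move to its clockwise successor; all other keys keep their placement.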
  • 48. No free ride. You need to consider giving up on:
      • Avoiding redundancy
      • Referential integrity
      • Strong consistency
      • Ad hoc queries
      • Joins
      • Ease of reporting
      • ...
  • 49. NoSQL Today
  • 50. Resources
      • http://nosqlsummer.org/
      • http://nosql-database.org/
      • http://nosqltapes.com/
  • 51. Books
  • 52. No SQL. wspringer@xebia.com