NoSQL



Tuesday, March 22, 2011
The Software Crisis

Writing correct, understandable, and verifiable computer programs is difficult.

Edsger Dijkstra

The Software Crisis

“as long as there were no machines, programming was no problem at all; when we had a few weak computers, programming became a mild problem, and now we have gigantic computers, programming has become an equally gigantic problem.”




IMS
The Hierarchical Database (1966)

Vern Watts

“A Relational Model of Data for Large Shared Data Banks” (1970)

Ted Codd

“In striving to make every user happy, a technology can actually leave the majority unhappy.”

“Every good idea is generalized to its level of inapplicability.” (Peter Principle)

Jim Gray

Eric Evans

“NoSQL” Reintroduced (2009)




Total Cost of Ownership

• The price of a license
• The price of support
• The price of hardware

Oracle: +/- 47k per CPU?
Software update / support: +/- 10k?




Internet Scale

• Massive data collections
• Huge number of requests
• Coming from geographic areas across the globe
• 24/7




Availability

Data Models





Column Oriented

Column Family ≈ Table

[Diagram: each row consists of a key followed by a series of named columns; the series of columns can grow “indefinitely”.]

• Empty cells are cheap (sparse table)
• Schemaless
• No secondary indexes
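
The properties listed on this slide can be illustrated with a minimal Python sketch, assuming a column family is modeled as a mapping from row key to a per-row mapping of named columns. The class name `ColumnFamily`, its methods, and the second sample row are hypothetical illustrations, not the API of BigTable or any other store.

```python
from collections import defaultdict


class ColumnFamily:
    """Toy column family: row key -> {column name -> value}."""

    def __init__(self):
        # Each row keeps only the columns that were actually written,
        # so an "empty cell" costs nothing (sparse table).
        self.rows = defaultdict(dict)

    def put(self, key, column, value):
        # No schema: any row may introduce a new column name at any time.
        self.rows[key][column] = value

    def get(self, key, column, default=None):
        return self.rows.get(key, {}).get(column, default)

    def find_by_value(self, column, value):
        # Without a secondary index, a lookup by column value has to
        # scan every row.
        return sorted(k for k, cols in self.rows.items()
                      if cols.get(column) == value)


persons = ColumnFamily()
persons.put("ws103177", "firstname", "Wilfred")
persons.put("ws103177", "surname", "Springer")
persons.put("jd000001", "firstname", "Jane")   # hypothetical second row
persons.put("jd000001", "nickname", "JD")      # column the first row never stores

print(persons.get("ws103177", "surname"))          # Springer
print(persons.get("ws103177", "nickname"))         # None
print(persons.find_by_value("firstname", "Jane"))  # ['jd000001']
```

The full scan in `find_by_value` is the price of having no secondary indexes: access by key is cheap, access by anything else is not.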

BigTable
    // Look up one record by key, then read individual properties.
    DatastoreService service = ...;
    Key key = KeyFactory.createKey(family, recordId);
    Entity entity = service.get(key);
    entity.getProperty("firstname");
    entity.getProperty("surname");




Column Oriented + Super Columns

[Diagram: each row consists of a key followed by named columns; a Super Column groups a further series of named columns under one name.]
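
One way to picture super columns is as one extra level of nesting. The sketch below is a minimal Python illustration, assuming the layout row key -> super column name -> named columns; the function names and the grouping of the sample data are hypothetical.

```python
# Toy model of a super column family:
# row key -> super column name -> {column name -> value}
row_store = {}


def put(key, super_column, column, value):
    """Store a value under key / super column / column, creating levels as needed."""
    row_store.setdefault(key, {}).setdefault(super_column, {})[column] = value


def get_super_column(key, super_column):
    """Return all named columns grouped under one super column (empty if absent)."""
    return row_store.get(key, {}).get(super_column, {})


# Hypothetical data: two groups of columns for a single row.
put("ws103177", "name", "firstname", "Wilfred")
put("ws103177", "name", "surname", "Springer")
put("ws103177", "contact", "email", "wilfredspringer@sun.com")

print(get_super_column("ws103177", "name"))
```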




Key Value Store

[Figure: a block of raw bits, 1011 0110]

• Schemaless
• Versioning




Kyoto Cabinet
    DB db = new DB(...);
    // store a value under a key, then read it back
    db.set("ws103177", "Wilfred Springer <wilfredspringer@sun.com>");
    db.get("ws103177");

1 million records in 0.9 s


Graph Database

SPARQL



Document Store

XML
    <persons>
      <person>
        <name>Wilfred</name>
        <surname>Springer</surname>
      </person>
      …
    </persons>

JSON
    [{ "Name" : "Wilfred",
       "Surname" : "Springer"},
     …
    ]

• Improved Indexing
• Serverside Processing



JSON

[Figure: tree view of a product document. Product branches into DetailPageURL, EditorialReviews (Source, IsLinkSuppressed), ItemAttributes, LargeImage (URL, Width, Height), MediumImage (URL, Width, Height), and SalesRank. ItemAttributes contains Publisher, ReleaseDate, Format, Author, Binding, ProductGroup, Label, Languages (Type, Name), ProductName, Studio, PublicationDate, Title, ListPrice (Amount, CurrencyCode, FormattedPrice), and Manufacturer.]




Various Queries
    // find all products
    db.products.find()

    // find products with 446 pages (slow: scans the whole collection)
    db.products.find({"ItemAttributes.NumberOfPages": 446})

    // find products with 446 pages (fast: uses the index)
    db.products.ensureIndex({"ItemAttributes.NumberOfPages": 1})
    db.products.find({"ItemAttributes.NumberOfPages": 446})

Product → ItemAttributes → NumberOfPages




Find books on “java”
    db.products.find(
        {"fs_keywords_terms": "java"},
        {"ItemAttributes.Title": 1}
    )

[Diagram: a matching Product document]
    "_id" : ObjectId("…")
    ItemAttributes → "Title" : "TCP/IP Sockets in Java, Second Edition: Practical Guide for Programmers"
    "fs_keywords_terms" : ["kenneth", "l", "calvert", "michael", "j", "donahoo", "tcp", "ip", "sockets", "java", "second", "edition", "practical", "guide", "programmers"]




... with the worst sales rank
    db.products.find(
        {"fs_keywords_terms": "java"},
        {"ItemAttributes.Title": 1}
    ).sort({"SalesRank": -1})




Count books per #pages
    db.products.group({
        key: {"ItemAttributes.NumberOfPages": true},
        cond: {},
        initial: {count: 0},
        reduce: function(obj, prev) { prev.count++ }
    })
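
The group call folds every document into one accumulator per distinct key value. The same fold can be sketched in plain Python, purely to show the logic; the sample documents and the function name `count_per_pages` are hypothetical, and no MongoDB driver is involved.

```python
from collections import Counter

# Hypothetical sample documents shaped like the product collection.
products = [
    {"ItemAttributes": {"Title": "A", "NumberOfPages": 446}},
    {"ItemAttributes": {"Title": "B", "NumberOfPages": 446}},
    {"ItemAttributes": {"Title": "C", "NumberOfPages": 120}},
    {"ItemAttributes": {"Title": "D"}},  # no page count recorded
]


def count_per_pages(docs):
    """Fold documents into {NumberOfPages: count}, mirroring initial + reduce."""
    counts = Counter()  # plays the role of initial: {count: 0} per key
    for doc in docs:
        pages = doc.get("ItemAttributes", {}).get("NumberOfPages")
        counts[pages] += 1  # documents without the field end up under None
    return dict(counts)


print(count_per_pages(products))
```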




SQL

MySQL
    SELECT
        Dim1, Dim2,
        SUM(Measure1) AS MSum,
        COUNT(*) AS RecordCount,
        AVG(Measure2) AS MAvg,
        MIN(Measure1) AS MMin,
        MAX(CASE
           WHEN Measure2 < 100
           THEN Measure2
        END) AS MMax
    FROM DenormAggTable
    WHERE (Filter1 IN ('A','B'))
        AND (Filter2 = 'C')
        AND (Filter3 > 123)
    GROUP BY Dim1, Dim2
    HAVING (MMin > 0)
    ORDER BY RecordCount DESC
    LIMIT 4, 8

MongoDB
    db.runCommand({
      mapreduce: "DenormAggCollection",
      query: {
        filter1: { '$in': [ 'A', 'B' ] },
        filter2: 'C',
        filter3: { '$gt': 123 }
      },
      map: function() { emit(
        { d1: this.Dim1, d2: this.Dim2 },
        { msum: this.measure1, recs: 1, mmin: this.measure1,
          mmax: this.measure2 < 100 ? this.measure2 : 0 }
      );},
      reduce: function(key, vals) {
        var ret = { msum: 0, recs: 0, mmin: 0, mmax: 0 };
        for(var i = 0; i < vals.length; i++) {
          ret.msum += vals[i].msum;
          ret.recs += vals[i].recs;
          if(vals[i].mmin < ret.mmin) ret.mmin = vals[i].mmin;
          if((vals[i].mmax < 100) && (vals[i].mmax > ret.mmax))
            ret.mmax = vals[i].mmax;
        }
        return ret;
      },
      finalize: function(key, val) {
        val.mavg = val.msum / val.recs;
        return val;
      },
      out: 'result1',
      verbose: true
    });
    db.result1.
      find({ mmin: { '$gt': 0 } }).
      sort({ recs: -1 }).
      skip(4).
      limit(8);

Notes
1. Grouped dimension columns are pulled out as keys in the map function, reducing the size of the working set.
2. Measures must be manually aggregated.
3. Aggregates depending on record counts must wait until finalization.
4. Measures can use procedural logic.
5. Filters have an ORM/ActiveRecord-looking style.
6. Aggregate filtering must be applied to the result set, not in the map/reduce.
7. Ascending: 1; Descending: -1

Revision 4, Created 2010-03-06. Rick Osborne, rickosborne.org




Availability versus Consistency




CAP Theorem

Eric Brewer




• Availability
• Consistency
• Partition Tolerance

Pick two


Strong Consistency

[Diagram: the object starts as value = "foo" (step 0). Process A writes value = "bar" (step 1). At step 2, A, B, and C each read value = "bar".]

After the update, any subsequent access will return the updated value.




Weak Consistency

[Diagram: the object starts as value = "foo" (step 0). Process A writes value = "bar" (step 1). At any later step (>1), A, B, and C may each read either "bar" or "foo".]

The system does not guarantee that subsequent accesses will return the updated value.




Eventual Consistency

[Diagram: the object starts as value = "foo" (step 0). Process A writes value = "bar" (step 1). At some later time t (t ≥ 1), A, B, and C all read value = "bar".]

If no updates are made to the object, eventually all accesses will return the last updated value.
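
The definition can be made concrete with a small simulation. This is a minimal sketch, assuming a write is acknowledged after reaching one replica and is delivered to the others later through a queue; the class names and the `deliver_one` step are hypothetical stand-ins for real asynchronous replication.

```python
from collections import deque


class Replica:
    def __init__(self, value):
        self.value = value


class EventuallyConsistentStore:
    """Writes land on one replica; the others catch up asynchronously."""

    def __init__(self, replica_count, initial):
        self.replicas = [Replica(initial) for _ in range(replica_count)]
        self.pending = deque()  # (replica index, value) updates not yet delivered

    def write(self, value):
        # Acknowledge after updating replica 0 only; queue the rest.
        self.replicas[0].value = value
        for i in range(1, len(self.replicas)):
            self.pending.append((i, value))

    def read(self, replica_index):
        return self.replicas[replica_index].value

    def deliver_one(self):
        """Simulate the passage of time: apply one queued update."""
        if self.pending:
            i, value = self.pending.popleft()
            self.replicas[i].value = value


store = EventuallyConsistentStore(replica_count=3, initial="foo")
store.write("bar")
print(store.read(0), store.read(1), store.read(2))  # bar foo foo
while store.pending:
    store.deliver_one()
print(store.read(0), store.read(1), store.read(2))  # bar bar bar
```

Between the write and the last delivery, readers of different replicas disagree; once the queue drains and no new writes arrive, every replica returns the last written value.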




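Session Consistency

[Diagram: the object starts as value = "foo" (step 0). In session 1, A writes value = "bar" (step 1) and then reads value = "bar" (step 2). In session 2, a read at step 2 still returns value = "foo".]

Within the “session”, the system guarantees read-your-writes consistency.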




Read-your-writes Consistency

[Diagram: the object starts as value = "foo" (step 0). Process A writes value = "bar" (step 1) and then reads value = "bar" (step 2).]

Process A, after updating a data item, always accesses the updated value and never sees an older value.
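
One possible way to obtain this guarantee on top of lagging replicas is to let the client remember the version of its own last write and skip replicas that have not reached it. The sketch below assumes a single writer and a simple per-replica version counter; `Replica`, `Session`, and their methods are hypothetical names, not taken from any particular product.

```python
class Replica:
    def __init__(self):
        self.version, self.value = 0, "foo"


class Session:
    """Client-side session that refuses replies older than its own last write."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.last_written = 0  # highest version this session has written

    def write(self, replica_index, value):
        replica = self.replicas[replica_index]
        replica.version += 1
        replica.value = value
        self.last_written = replica.version

    def read(self):
        # Accept the first replica that has caught up with our own writes.
        for replica in self.replicas:
            if replica.version >= self.last_written:
                return replica.value
        raise RuntimeError("no replica has caught up yet")


stale, fresh = Replica(), Replica()
session = Session([stale, fresh])   # the stale replica is tried first
session.write(1, "bar")             # only `fresh` receives the write
print(session.read())               # bar
other = Session([stale, fresh])     # a different session has no such guarantee
print(other.read())                 # foo
```

The second session illustrates the difference with strong consistency: the guarantee holds only for the process (or session) that performed the write.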




Monotonic Read Consistency

[Diagram: the object starts as value = "foo" (step 0). Process A reads value = "foo" at steps 1 and 2. At step 3 the value is updated to "bar". At step 4, A reads value = "bar".]

If a process has seen a particular value for the object, any subsequent access will never return any previous values.
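
A reader can enforce monotonic reads on its own side by remembering the newest version it has seen. This is a minimal sketch, assuming every replica reply carries a version number alongside the value; the `MonotonicReader` class and the tuple representation of a replica are hypothetical.

```python
class MonotonicReader:
    """Remembers the newest version it has seen and never goes back."""

    def __init__(self):
        self.seen_version = -1
        self.seen_value = None

    def read(self, replica):
        version, value = replica  # a replica is modeled as a (version, value) pair
        if version >= self.seen_version:
            self.seen_version, self.seen_value = version, value
        # A replica that lags behind is ignored; the cached newer value wins.
        return self.seen_value


old_replica = (0, "foo")
new_replica = (1, "bar")

reader = MonotonicReader()
print(reader.read(old_replica))  # foo
print(reader.read(new_replica))  # bar
print(reader.read(old_replica))  # bar (never falls back to "foo")
```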




Eventual Consistency in RDBMS

[Diagram: log shipping. Process A interacts with the primary (steps 1, 2, 3) while the primary replicates to the backup replica asynchronously.]

Eventual consistency is not just a property of NoSQL solutions.




No Strong Consistency in the Face Of...




Network Partitions

[Diagram: process A and two replicas. Labels: writes new value; replicates new value; reads new value.]

Network Partitions

[Diagram: the same setup, with the replication link marked “!”.]


Partition Tolerance

[Diagram: process A and two replicas. Labels: writes new value; fails to replicate new value; reads old value.]


Partition Intolerance

[Diagram: process A and two replicas. Labels: fails to replicate new value; failing attempt to write a new value.]


How to do better?




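Proper Replication Factor

[Diagram: process A writes to W = 3 and reads from R = 2 of N = 4 replicas.]

N = number of replicas, W = replicas that must acknowledge a write, R = replicas contacted for a read. Because R + W > N here, every read overlaps the most recent write.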


Optimizations

• Optimize read: R = 1, N = W
• Optimize write: W = 1, N = R
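
The trade-off between these settings follows from simple arithmetic on N, R, and W. The helper below is a minimal sketch that spells the arithmetic out; the function name `quorum_properties` and the keys of the returned dictionary are hypothetical, and N = 3 in the last two calls is an arbitrary example value.

```python
def quorum_properties(n, r, w):
    """Describe a replication setup with N replicas, read quorum R, write quorum W."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("R and W must be between 1 and N")
    return {
        # Every read set intersects every write set, so a read sees the latest write.
        "read_sees_latest_write": r + w > n,
        # Two concurrent writes cannot both succeed on disjoint replica sets.
        "writes_overlap": 2 * w > n,
        # How many replicas may be down while reads / writes still succeed.
        "read_tolerates_failures": n - r,
        "write_tolerates_failures": n - w,
    }


print(quorum_properties(n=4, r=2, w=3))   # the N=4, W=3, R=2 example
print(quorum_properties(n=3, r=1, w=3))   # optimize read:  R = 1, N = W
print(quorum_properties(n=3, r=3, w=1))   # optimize write: W = 1, N = R
```

Optimizing reads (R = 1, N = W) makes every write depend on all replicas being reachable, while optimizing writes (W = 1, N = R) does the same to reads.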




Consistent Hashing

[Diagram: nodes A, B, C, D, E, F, G, and H arranged clockwise on a ring, with Key K marked on the ring.]



W = 3

[Diagram: the same ring of nodes A to H.]
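
The ring can be sketched in a few lines of Python. This is a minimal illustration, assuming nodes and keys are placed on the ring by an MD5 hash and a key is stored on the first nodes found clockwise from its position; the hash function, the class `HashRing`, and the method `preference_list` are assumptions for the example, and virtual nodes are left out.

```python
import bisect
import hashlib


def ring_position(name):
    """Map a string to a position on the ring (the hash choice is an assumption)."""
    return int(hashlib.md5(name.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes):
        # Sort nodes by their position on the ring.
        self.ring = sorted((ring_position(n), n) for n in nodes)
        self.positions = [pos for pos, _ in self.ring]

    def preference_list(self, key, replicas=1):
        """Walk clockwise from the key and return the first `replicas` nodes."""
        start = bisect.bisect_right(self.positions, ring_position(key))
        count = min(replicas, len(self.ring))
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(count)]


ring = HashRing("ABCDEFGH")
print(ring.preference_list("Key K", replicas=3))   # three distinct nodes (W = 3)

# Removing one node only remaps the keys that node was responsible for.
smaller = HashRing("ABCDEFG")
keys = ["key-%d" % i for i in range(1000)]
moved = sum(ring.preference_list(k) != smaller.preference_list(k) for k in keys)
owned_by_h = sum(ring.preference_list(k) == ["H"] for k in keys)
print(moved == owned_by_h)   # True
```

The final comparison shows the point of the technique: when node H leaves, only the keys H owned move to another node, and all other keys stay where they were.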

No free ride

You need to consider giving up on:

• Avoiding redundancy
• Referential integrity
• Strong consistency
• Ad hoc queries
• Joins
• Ease of reporting
• ...




NoSQL Today




Resources

http://nosqlsummer.org/
http://nosql-database.org/
http://nosqltapes.com/




Books




No SQL

wspringer@xebia.com




Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 

Recently uploaded (20)

What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 

NoSQL

  • 2. The Software Crisis Writing correct, understandable, and verifiable computer programs is difficult. Edsger Dijkstra Tuesday, March 22, 2011
  • 3. The Software Crisis: “as long as there were no machines, programming was no problem at all; when we had a few weak computers, programming became a mild problem, and now we have gigantic computers, programming has become an equally gigantic problem.”
  • 4. IMS: The Hierarchical Database (1966), Vern Watts
  • 5. “A Relational Model of Data for Large Shared Data Banks” (1970), Ted Codd
  • 6. “In striving to make every user happy, a technology can actually leave the majority unhappy.” “Every good idea is generalized to its level of inapplicability.” (Peter Principle) Jim Gray
  • 8. Eric Evans: “NoSQL” Reintroduced (2009)
  • 9. Total Cost of Ownership: the price of a license; the price of support; the price of hardware. Oracle: +/- 47k / CPU? Software update / support: +/- 10k?
  • 10. Internet Scale: massive data collections; huge number of requests; coming from geographic areas across the globe; 24/7
  • 14. Column Oriented: Column Family ≈ Table. Each row is a key followed by named columns (key | named column | named column | …) and can grow “indefinitely”. Empty cells are cheap (sparse table). Schemaless. No secondary indexes.
  • 15. BigTable
      DatastoreService service = ...;
      Key key = KeyFactory.createKey(family, recordId);
      Entity entity = service.get(key);
      entity.getProperty("firstname");
      entity.getProperty("surname");
  • 16. Column Oriented + Super Columns: a row is a key followed by named columns (key | named column | named column | …); a super column groups further named columns (named column | named column | …).
  • 17. Key Value Store: schemaless; versioning
  • 18. Kyoto Cabinet
      DB db = new DB(...);
      db.set("ws103177", "Wilfred Springer <wilfredspringer@sun.com>");
      db.get("ws103177");
      1 million records in 0.9 s
  • 19. Graph Database: SPARQL
  • 20. Document Store
      XML: <persons> <person> <name>Wilfred</name> <surname>Springer</surname> </person> … </persons>
      JSON: [{ "Name" : "Wilfred", "Surname" : "Springer"}, … ]
      Improved indexing. Server-side processing.
  • 21. JSON: tree diagram of a Product document. Nodes shown: DetailPageURL, EditorialReviews, Source, IsLinkSuppressed, Publisher, ReleaseDate, Format, Author, Binding, ProductGroup, Label, Type, ItemAttributes, Languages, Name, ProductName, Studio, PublicationDate, Amount, Title, CurrencyCode, ListPrice, FormattedPrice, Manufacturer, URL, LargeImage, Width, Height, Product, SalesRank, URL, MediumImage, Width, Height
  • 22. Tree diagram (detail). Nodes shown: Publisher, ReleaseDate, Format, Author, Binding, ProductGroup, Label, Type, ItemAttributes, Languages, Name, ProductName, Studio, PublicationDate, Amount, Title, CurrencyCode, ListPrice, FormattedPrice, Manufacturer
  • 23. Various Queries
      // find all products
      db.products.find()
      // find products with 446 pages (slow)
      db.products.find({"ItemAttributes.NumberOfPages": 446})
      // find products with 446 pages (fast)
      db.products.ensureIndex({"ItemAttributes.NumberOfPages": 1})
      db.products.find({"ItemAttributes.NumberOfPages": 446})
      Diagram: Product → ItemAttributes → NumberOfPages
  • 24. Find books on “java”
      db.products.find(
        {"fs_keywords_terms": "java"},
        {"ItemAttributes.Title": 1}
      )
      Diagram: Product → ItemAttributes → Title, with sample values:
      "_id" : ObjectId("…")
      "Title" : "TCP/IP Sockets in Java, Second Edition: Practical Guide for Programmers"
      fs_keywords_terms: ["kenneth", "l", "calvert", "michael", "j", "donahoo", "tcp", "ip", "sockets", "java", "second", "edition", "practical", "guide", "programmers"]
  • 25. ... with the worst sales rank
      db.products.find(
        {"fs_keywords_terms": "java"},
        {"ItemAttributes.Title": 1}
      ).sort({"SalesRank": -1})
      Diagram: same Product → ItemAttributes → Title sample document as on slide 24.
  • 26. Count books per #pages
      db.products.group({
        key: {"ItemAttributes.NumberOfPages": true},
        cond: {},
        initial: {count: 0},
        reduce: function(obj, prev) { prev.count++ }
      })
  • 27. SQL (MySQL) versus Mongo (MongoDB). Annotated chart: Revision 4, Created 2010-03-07, Rick Osborne, rickosborne.org
      MySQL:
      SELECT
        Dim1, Dim2,
        SUM(Measure1) AS MSum,
        COUNT(*) AS RecordCount,
        AVG(Measure2) AS MAvg,
        MIN(Measure1) AS MMin,
        MAX(CASE
          WHEN Measure2 < 100
          THEN Measure2
        END) AS MMax
      FROM DenormAggTable
      WHERE (Filter1 IN ('A','B'))
        AND (Filter2 = 'C')
        AND (Filter3 > 123)
      GROUP BY Dim1, Dim2
      HAVING (MMin > 0)
      ORDER BY RecordCount DESC
      LIMIT 4, 8
      MongoDB:
      db.runCommand({
        mapreduce: "DenormAggCollection",
        query: {
          filter1: { '$in': [ 'A', 'B' ] },
          filter2: 'C',
          filter3: { '$gt': 123 }
        },
        map: function() { emit(
          { d1: this.Dim1, d2: this.Dim2 },
          { msum: this.measure1, recs: 1, mmin: this.measure1,
            mmax: this.measure2 < 100 ? this.measure2 : 0 }
        );},
        reduce: function(key, vals) {
          var ret = { msum: 0, recs: 0, mmin: 0, mmax: 0 };
          for(var i = 0; i < vals.length; i++) {
            ret.msum += vals[i].msum;
            ret.recs += vals[i].recs;
            if(vals[i].mmin < ret.mmin) ret.mmin = vals[i].mmin;
            if((vals[i].mmax < 100) && (vals[i].mmax > ret.mmax))
              ret.mmax = vals[i].mmax;
          }
          return ret;
        },
        finalize: function(key, val) {
          val.mavg = val.msum / val.recs;
          return val;
        },
        out: 'result1',
        verbose: true
      });
      db.result1.
        find({ mmin: { '$gt': 0 } }).
        sort({ recs: -1 }).
        skip(4).
        limit(8);
      Chart notes:
      1. Grouped dimension columns are pulled out as keys in the map function, reducing the size of the working set.
      2. Measures must be manually aggregated.
      3. Aggregates depending on record counts must wait until finalization.
      4. Measures can use procedural logic.
      5. Filters have an ORM/ActiveRecord-looking style.
      6. Ascending: 1; Descending: -1
      7. Aggregate filtering must be applied to the result set, not in the map/reduce.
  • 28. Availability versus Consistency
  • 29. CAP Theorem, Eric Brewer
  • 30. Availability, Consistency, Partition Tolerance: pick two
  • 31. Strong Consistency: After the update, any subsequent access will return the updated value. (Diagram, processes A, B, C: step 0, value = "foo"; step 1, value = "bar"; step 2, all three read value = "bar".)
  • 32. Weak Consistency: The system does not guarantee that at any given point in the future subsequent access will return the updated value. (Diagram, processes A, B, C: step 0, value = "foo"; step 1, value = "bar"; at any step >1, reads return value = "bar" / "foo".)
  • 33. Eventual Consistency: If no updates are made to the object, eventually all accesses will return the last updated value. (Diagram, processes A, B, C: step 0, value = "foo"; step 1, value = "bar"; at some step t ≥ 1, all reads return value = "bar".)
  • 34. Session Consistency: Within the “session”, the system guarantees read-your-writes consistency. (Diagram, processes A, B, C: Session 1: step 0, value = "foo"; step 1, value = "bar"; step 2, value = "bar". Session 2: step 2, value = "foo".)
  • 35. Read-your-writes Consistency: Process A, after updating a data item, always accesses the updated value and never sees an older value. (Diagram, processes A, B, C: step 0, value = "foo"; step 1, value = "bar"; step 2, value = "bar".)
  • 36. Monotonic Read Consistency: If a process has seen a particular value for the object, any subsequent access will never return any previous values. (Diagram, processes A, B, C: step 0, value = "foo"; steps 1 and 2, value = "foo"; steps 3 and 4, value = "bar".)
  • 37. Eventual Consistency in RDBMS: log shipping (async) from a Primary to a Backup replica. Eventual consistency is not just a property of NoSQL solutions.
  • 38. No Strong Consistency in Face Of...
  • 39. Network Partitions: A writes new value; replicates new value; reads new value
  • 40. Network Partitions: A writes new value; replicates new value (!); reads new value
  • 41. Partition Tolerance: A writes new value; fails to replicate new value; reads old value
  • 42. Partition Intolerance: failing attempt by A to write a new value; fails to replicate new value
  • 43. How to do better?
  • 44. Proper Replication Factor: N=4, W=3, R=2 (client A)
  • 45. Optimizations: optimize read: R = 1, N = W; optimize write: W = 1, N = R
  • 46. Consistent Hashing: key K placed on a ring of nodes A, B, C, D, E, F, G, H
  • 47. W=3 on the ring of nodes A, B, C, D, E, F, G, H
  • 48. No free ride. You need to consider giving up on: avoiding redundancy; referential integrity; strong consistency; ad hoc queries; joins; ease of reporting; ...
  • 50. Resources: http://nosqlsummer.org/ http://nosql-database.org/ http://nosqltapes.com/
  • 52. No SQL wspringer@xebia.com
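
The replication slides (44 to 47) combine two ideas: placing a key on a ring of nodes, and choosing N, R and W for its replicas. The sketch below is one possible reading of those slides, not the speaker's implementation. It assumes MD5 as the hash function, no virtual nodes, and the usual quorum condition R + W > N, which the slides do not state explicitly. The names `Ring`, `replicas` and `quorums_overlap` are hypothetical and chosen only for illustration.

```python
import bisect
import hashlib


def ring_position(label: str) -> int:
    """Map a node name or key to an integer position on the ring (MD5 is an assumption)."""
    return int(hashlib.md5(label.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Hash ring without virtual nodes (a deliberate simplification)."""

    def __init__(self, nodes):
        # Sort nodes by ring position so lookups can use binary search.
        self._entries = sorted((ring_position(node), node) for node in nodes)
        self._positions = [position for position, _ in self._entries]

    def replicas(self, key: str, n: int):
        """Return the n distinct nodes that follow the key clockwise on the ring."""
        if n > len(self._entries):
            raise ValueError("replication factor exceeds number of nodes")
        start = bisect.bisect_right(self._positions, ring_position(key))
        size = len(self._entries)
        return [self._entries[(start + i) % size][1] for i in range(n)]


def quorums_overlap(n: int, r: int, w: int) -> bool:
    """True when every read quorum shares at least one replica with every write quorum."""
    return r + w > n


ring = Ring(list("ABCDEFGH"))          # the eight nodes drawn on slide 46
print(ring.replicas("K", 4))           # the N=4 nodes responsible for key K
print(quorums_overlap(n=4, r=2, w=3))  # slide 44 settings: 2 + 3 > 4
print(quorums_overlap(n=4, r=1, w=4))  # "optimize read": R = 1, N = W
print(quorums_overlap(n=4, r=4, w=1))  # "optimize write": W = 1, N = R
```

With overlapping quorums, a read of R replicas always touches at least one replica that acknowledged the latest write. Removing a node that does not hold key K leaves K's replica list unchanged, which is the property that makes consistent hashing attractive when the cluster changes.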