Beyond NoSQL with MarkLogic and The Universal Index
Awesome document-oriented NoSQL database
NoSQL Frankfurt, 2010
MarkLogic Developer Community

nuno job
nuno.job@marklogic.com
@dscape | nunojob.com

how??
[Chart with axes Structure (ad hoc vs. predefined) and Queries (ad hoc vs. predefined); surviving label: IDMS]

Indexes! indexes!
so… filter map reduce!?
well… sort of…
flickr.com/ayalan

divide and conquer
level of abstraction: ease of use
database
consistent-hashing-like thingy
partition1, partition2, partition3
stand: a group of trees
makes sense to have indexes in the same place

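The "consistent-hashing-like thingy" on this slide can be illustrated with a generic hash ring. This is a minimal sketch of the general technique, not MarkLogic's actual document placement; the class name `HashRing`, the `replicas` parameter, and the example URIs are hypothetical.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a large integer position on the ring."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Generic consistent-hash ring (illustrative, not MarkLogic's algorithm)."""

    def __init__(self, partitions, replicas=64):
        # Each partition gets several virtual points to spread the load.
        self._ring = sorted(
            (_hash(f"{name}#{i}"), name)
            for name in partitions
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def partition_for(self, uri: str) -> str:
        # The first ring point clockwise from the key's hash owns the key.
        idx = bisect.bisect(self._points, _hash(uri)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["partition1", "partition2", "partition3"])
placement = {
    uri: ring.partition_for(uri)
    for uri in ("/articles/1.xml", "/articles/2.xml", "/articles/3.xml")
}
```

The property that makes this attractive for "divide and conquer" is that adding a partition only moves the keys the new partition takes over; every other document stays where it was.
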
shared-nothing cluster
1st: index resolution
2nd: get documents
App Server (same code base): E Host 1, E Host 2, E Host 3
Data: D Host 4, D Host 5, D Host 6, D Host k
HA & DR
partition1, partition2, partition3, partition4, partition m

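The two steps on this slide (index resolution first, then document retrieval) can be sketched as a loop over partitions. This is a toy model under stated assumptions: the `Partition` class and `evaluate` function are hypothetical names, each partition keeps its own index next to its own documents, and a real cluster would send these requests to D hosts over the network, in parallel.

```python
class Partition:
    """One shared-nothing partition: its own index and its own documents."""

    def __init__(self):
        self.index = {}      # term -> set of document ids
        self.documents = {}  # document id -> text

    def add(self, doc_id, text):
        self.documents[doc_id] = text
        for term in text.lower().split():
            self.index.setdefault(term, set()).add(doc_id)

    def resolve(self, term):
        """Step 1: index resolution, no document is touched."""
        return sorted(self.index.get(term, ()))

    def fetch(self, doc_ids):
        """Step 2: get the documents the index pointed at."""
        return [self.documents[doc_id] for doc_id in doc_ids]


def evaluate(partitions, term):
    """What an E host would do: ask every partition, then merge."""
    results = []
    for partition in partitions:
        doc_ids = partition.resolve(term)
        results.extend(partition.fetch(doc_ids))
    return results


p1, p2 = Partition(), Partition()
p1.add(126, "content agility")
p1.add(130, "content creation")
p2.add(143, "accelerating content")
p2.add(152, "accelerating creation")

hits = evaluate([p1, p2], "content")
```

Because no partition needs data held by another, partitions can be added without changing the evaluation logic.
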
universal index
Term                   Term List (Document References)
“accelerating”         123, 127, 129, 152, 344, 791 . . .
“creation”             122, 125, 126, 129, 130, 167 . . .
“content”              123, 126, 130, 142, 143, 167 . . .
“application”          123, 130, 131, 135, 162, 177 . . .
“agility”              126, 130, 167, 212, 219, 377 . . .
<article>              . . .
<article> / <title>    . . .
product: MarkLogic     126, 130, 167, …
Range Indexes
Geospatial

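A term list like the one on this slide can be built by walking each document once and recording words and element structure as terms. The sketch below is an illustration only: the two sample documents are invented, tokenization is a plain whitespace split, and the functions `index_terms` and `build_index` are hypothetical names rather than anything MarkLogic exposes.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def index_terms(xml_text):
    """Collect word terms and structure terms for one document."""
    terms = set()

    def walk(node, parent_tag):
        terms.add(f"<{node.tag}>")
        if parent_tag is not None:
            terms.add(f"<{parent_tag}> / <{node.tag}>")
        for word in (node.text or "").lower().split():
            terms.add(word)
        for child in node:
            walk(child, node.tag)
            # Text that follows a child element still belongs to this node.
            for word in (child.tail or "").lower().split():
                terms.add(word)

    walk(ET.fromstring(xml_text), None)
    return terms


def build_index(docs):
    """Map each term to a sorted list of document ids."""
    index = defaultdict(list)
    for doc_id in sorted(docs):  # ascending ids keep every list sorted
        for term in index_terms(docs[doc_id]):
            index[term].append(doc_id)
    return dict(index)


# Invented sample documents, keyed by document id.
docs = {
    126: "<article><title>content agility</title><p>accelerating</p></article>",
    143: "<article><p>content creation</p></article>",
}
index = build_index(docs)
```

Words and structure end up in the same index, which is what lets a single lookup mechanism answer both full-text and structural questions.
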
semi structured
get tables from computer science articles that include a title with word “content” but not the word “agility”
[Document tree diagram; labels: article, title, paragraph, un-ordered list, table, footer; annotations: information, metadata, structure, parent, child, full text]

universal index
in kelly speak: zippy-ing
Term                   Term List (Document References)
“accelerating”         123, 127, 129, 152, 344, 791 . . .
“creation”             122, 125, 126, 129, 130, 167 . . .
“content”              123, 126, 130, 142, 143, 167 . . .
“application”          123, 130, 131, 135, 162, 177 . . .
“agility”              126, 130, 167, 212, 219, 377 . . .
<article>              122, 125, 126, 129, 130, 143, 167
<article> / <title>    122, 125, 126, 129, 130, 167 . . .
product: MarkLogic     126, 130, 167, …
Range Indexes
Geospatial

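"Zippy-ing" can be read as walking two sorted term lists in step, the way a zipper closes. The sketch below applies that to the "content but not agility" question using only the document ids visible on the slide; the lists there are truncated (". . ."), so the results describe those visible prefixes, not a full database. The function names `intersect` and `and_not` are hypothetical.

```python
def intersect(a, b):
    """Ids present in both sorted lists, found in a single pass."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def and_not(a, b):
    """Ids from sorted list a that do not appear in sorted list b."""
    out, j = [], 0
    for doc_id in a:
        while j < len(b) and b[j] < doc_id:
            j += 1
        if j == len(b) or b[j] != doc_id:
            out.append(doc_id)
    return out


# Visible prefixes of the term lists shown on the slide.
content = [123, 126, 130, 142, 143, 167]
agility = [126, 130, 167, 212, 219, 377]
article_title = [122, 125, 126, 129, 130, 167]

candidates = intersect(content, article_title)  # [126, 130, 167]
survivors = and_not(candidates, agility)        # [] for these prefixes
```

With these particular prefixes every candidate also appears under “agility”, so nothing survives. Note that intersecting a word list with a structure list only yields candidates: the word may occur outside the title, so a later step over the fetched documents would still have to confirm each match.
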
wait a minute…
Directories: exclusive, hierarchical, analogous to file system, map to URI
Collections: set-based, N:N relationship
Security: invisible to your app

universal index
Term                          Term List (Document References)
“accelerating”                123, 127, 129, 152, 344, 791 . . .
“creation”                    122, 125, 126, 129, 130, 167 . . .
“content”                     123, 126, 130, 142, 143, 167 . . .
“application”                 123, 130, 131, 135, 162, 177 . . .
“data base”                   126, 130, 167, 212, 219, 377 . . .
<article>                     . . .
<article> / <title>           . . .
product: MarkLogic            126, 130, 167, …
Directory: /articles/
Collection: CS
Role:Editor + Action:Read
Range Indexes
Geospatial

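If directories, collections, and role permissions are just more terms in the same index, then restricting a search by any of them is one more intersection. The sketch below shows that idea with invented document ids and a hypothetical `search` function; the term-key spelling (`"directory:/articles/"` and so on) is an assumption made for illustration, not MarkLogic's internal format.

```python
# Invented term lists; the keys mimic the rows added on this slide.
index = {
    "content": {123, 126, 130},
    "directory:/articles/": {126, 130, 200},
    "collection:CS": {126, 130, 167},
    "role:editor+action:read": {126, 200},
    "role:admin+action:read": {123, 126, 130, 167, 200},
}


def search(index, terms, user_roles):
    """Intersect the requested terms with what the user's roles may read."""
    # The security term is added by the engine, not by the application,
    # which is one way to read "invisible to your app".
    readable = set().union(
        *(index.get(f"role:{role}+action:read", set()) for role in user_roles)
    )
    result = readable
    for term in terms:
        result = result & index.get(term, set())
    return sorted(result)


query = ["content", "directory:/articles/", "collection:CS"]
as_editor = search(index, query, ["editor"])
as_admin = search(index, query, ["admin"])
as_nobody = search(index, query, [])
```

The same query returns different documents for different users without the query itself mentioning security.
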
throughput: in memory stand(s)
durability: journal
flickr.com/kt

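The pairing on this slide (an in-memory structure for throughput, a journal for durability) is a common write-ahead pattern. The sketch below illustrates that general pattern only; the class `InMemoryStand`, the JSON-lines journal format, and the file name are assumptions, and MarkLogic's real journal is not modeled here.

```python
import json
import os
import tempfile


class InMemoryStand:
    """Toy in-memory stand whose writes are journaled before being applied."""

    def __init__(self, journal_path):
        self.journal_path = journal_path
        self.fragments = {}
        self._replay()
        self._journal = open(journal_path, "a", encoding="utf-8")

    def _replay(self):
        # After a restart, rebuild the in-memory state from the journal.
        if not os.path.exists(self.journal_path):
            return
        with open(self.journal_path, encoding="utf-8") as journal:
            for line in journal:
                record = json.loads(line)
                self.fragments[record["uri"]] = record["body"]

    def insert(self, uri, body):
        # Durability first: the record reaches disk before memory changes.
        self._journal.write(json.dumps({"uri": uri, "body": body}) + "\n")
        self._journal.flush()
        os.fsync(self._journal.fileno())
        self.fragments[uri] = body

    def close(self):
        self._journal.close()


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "journal.log")
    stand = InMemoryStand(path)
    stand.insert("/articles/1.xml", "<article/>")
    stand.close()  # simulate a stop; memory contents are gone

    recovered = InMemoryStand(path)  # a new process replays the journal
    recovered_fragments = dict(recovered.fragments)
    recovered.close()
```

Reads and index updates hit memory, while the only synchronous disk work per write is a sequential append.
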
mvcc
append only database, use sys-timestamps to know which document is currently available
and the marklogic time machine
[Timeline diagram; axis: System timestamp; labels: create, update (could also be create), delete, query]

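The timestamp idea on this slide can be sketched as an append-only list of versions, each stamped with the system timestamp at which it was created and, later, the one at which it stopped being current. This is a simplified illustration of MVCC in general; the class `MvccStore` and its method names are hypothetical.

```python
import itertools


class MvccStore:
    """Append-only store: nothing is overwritten, versions are only expired."""

    def __init__(self):
        self._clock = itertools.count(1)
        self._versions = []  # dicts with uri, body, created, deleted
        self.timestamp = 0   # latest system timestamp

    def _tick(self):
        self.timestamp = next(self._clock)
        return self.timestamp

    def _expire(self, uri, ts):
        for version in self._versions:
            if version["uri"] == uri and version["deleted"] is None:
                version["deleted"] = ts

    def write(self, uri, body):
        """Create or update: expire the current version, append a new one."""
        ts = self._tick()
        self._expire(uri, ts)
        self._versions.append(
            {"uri": uri, "body": body, "created": ts, "deleted": None}
        )
        return ts

    def delete(self, uri):
        ts = self._tick()
        self._expire(uri, ts)
        return ts

    def read(self, uri, at=None):
        """Return the version visible at timestamp `at` (default: now)."""
        at = self.timestamp if at is None else at
        for version in self._versions:
            alive = version["deleted"] is None or at < version["deleted"]
            if version["uri"] == uri and version["created"] <= at and alive:
                return version["body"]
        return None


store = MvccStore()
t_create = store.write("/articles/1.xml", "v1")
t_update = store.write("/articles/1.xml", "v2")
t_delete = store.delete("/articles/1.xml")

past = store.read("/articles/1.xml", at=t_create)  # "v1"
```

Reading at an older timestamp is the "time machine": the old versions are still there, so a query simply picks the ones that were current at that point.
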
too good to be true?
try us out… free version available!
developer.marklogic.com/products
markmail.org
pairs.demo.marklogic.com
heatmap.demo.marklogic.com
bit.ly/ml-demo
flickr.com/nattu

questions?
Love NoSQL databases? Want to change the world? We are hiring!!
Feedback: spkr8.com/t/4590
nuno.job@marklogic.com

not covered, but conversations are welcome!
Open-source, closed development?
REST
Mobile
XQuery and why it’s awesome!
App Server + Search + Database
Scalable ACID transactions
XML vs. JSON?
Merging / Compaction
Relevance
MVCC
Reverse Indexes
Alerting
Higher-Order Functions
Geospatial queries
Co-occurrence
Meta programming
Document databases

MarkLogic and The Universal Index


Editor's Notes

  • #2 Remember: ask people if they know MapReduce, MVCC, sharding, shared-nothing clustering, NoSQL, consistent hashing, fsync.
  • #3 Worked in large companies like IBM on unstructured data management. Mostly client support. A lot of training. Now focused on clients, especially in financial markets. Loves unstructured information data challenges.
  • #4 http://www.theregister.co.uk/2010/09/09/google_caffeine_explained
  • #6 Examples: MarkMail, Apache CouchDB
  • #13 Double-buffered in-memory stand to ensure maximum throughput. Stands comprise indexes and their respective fragments. Fragments are final. No “real” update or delete. Less error-prone. Merging as a self-healing mechanism.
  • #14 Introduce MVCC one liner