MarkLogic and The Universal Index
 

A talk about how MarkLogic Server uses an inverted index (as search engines do) to optimize this document-oriented NoSQL database

  • Remember: ask people if they know MapReduce, MVCC, sharding, shared-nothing clustering, NoSQL, consistent hashing, fsync
  • Worked in large companies like IBM in unstructured data management. Mostly client support. A lot of training. Now focused on clients, especially in financial markets. Loves unstructured information data challenges.
  • http://www.theregister.co.uk/2010/09/09/google_caffeine_explained
  • Examples: MarkMail, Apache CouchDB
  • Double-buffered in-memory stand to ensure maximum throughput. Stands comprise indexes and their respective fragments. Fragments are final. No “real” update or delete. Less error prone. Merging as a self-healing mechanism.
  • Introduce MVCC one liner

MarkLogic and The Universal Index: Presentation Transcript

  • MarkLogic Developer Community
    NoSQL Frankfurt, 2010
    Awesome document-oriented NoSQL database
    Beyond NoSQL with MarkLogic and The Universal Index
  • nunojob
    nuno.job@marklogic.com
    @dscape | nunojob.com
  • how??
    [Chart: Structure (ad hoc to predefined) against Queries (ad hoc to predefined); labeled point: IDMS]
  • Indexes!
    indexes!
    so… filter map reduce !?
    well… sort of…
    flickr.com/ayalan
  • divide and conquer
    level of abstraction: ease of use
    database
    consistent-hashing-like thingy
    partition1
    partition2
    partition3
    stand: a group of trees
    makes sense to have indexes in the same place
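
The "consistent-hashing-like thingy" on this slide can be pictured with a small sketch. The slide does not say how MarkLogic actually assigns documents, so the function name `partition_for` and the hash-then-modulo scheme below are assumptions for illustration only. Plain modulo hashing is also simpler than true consistent hashing, which moves fewer documents when partitions are added.

```python
import hashlib

def partition_for(uri: str, partitions: list) -> str:
    """Pick a partition for a document URI by hashing it (hypothetical scheme)."""
    digest = hashlib.sha256(uri.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer bucket number.
    bucket = int.from_bytes(digest[:8], "big") % len(partitions)
    return partitions[bucket]

partitions = ["partition1", "partition2", "partition3"]

# Every URI always lands on the same partition, so its index entries and
# its content can live side by side.
assignment = {
    uri: partition_for(uri, partitions)
    for uri in ("/articles/a.xml", "/articles/b.xml", "/articles/c.xml")
}
```

The point of the sketch is only that the mapping is deterministic: the application talks to the database, and the hashing step decides where each document and its index entries end up.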
  • 1st index resolution
    2nd get documents
    shared-nothing cluster
    E Host 1, E Host 2, E Host 3
    AppServer
    Same codebase
    Data
    D Host 4, D Host 5, D Host 6, … D Host k
    HA & DR
    partition1, partition2, partition3, partition4, … partitionm
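
The two steps on this slide (resolve against the indexes first, fetch documents second) can be sketched as a scatter/gather over partitions. This is a toy model, not MarkLogic's implementation: the `partitions` dictionary, the placement of document IDs, and the `resolve` and `fetch` helpers are all made up for the example.

```python
# Hypothetical stand-ins for D hosts: every partition keeps its own term lists
# next to its own documents, so nothing is shared between partitions.
partitions = {
    "partition1": {
        "index": {"content": {123, 126}, "agility": {126}},
        "docs": {123: "<article>123</article>", 126: "<article>126</article>"},
    },
    "partition2": {
        "index": {"content": {130}},
        "docs": {130: "<article>130</article>", 131: "<article>131</article>"},
    },
}

def resolve(term):
    """Phase 1: ask each partition's index which of its documents match."""
    return {name: part["index"].get(term, set()) for name, part in partitions.items()}

def fetch(matches):
    """Phase 2: read only the matching documents from the partition that owns them."""
    return [
        partitions[name]["docs"][doc_id]
        for name in sorted(matches)
        for doc_id in sorted(matches[name])
    ]

results = fetch(resolve("content"))
```

Document 131 is never read in this run, because phase 1 already ruled it out using the index alone.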
  • universal index
    Term                   Term List (Document References)
    “accelerating”         123, 127, 129, 152, 344, 791 . . .
    “creation”             122, 125, 126, 129, 130, 167 . . .
    “content”              123, 126, 130, 142, 143, 167 . . .
    “application”          123, 130, 131, 135, 162, 177 . . .
    “agility”              126, 130, 167, 212, 219, 377 . . .
    <article>              . . .
    <article> / <title>    . . .
    product: MarkLogic     126, 130, 167, …
    Range Indexes
    Geospatial
  • semi structured
    Query: get tables from computer science articles that include a title with word “content” but not the word “agility”
    Diagram labels: article, title, paragraph, information, un-ordered list, metadata, structure, parent/child, paragraph, table, full text, footer
  • universal index
    in kelly speak: zippy-ing
    Term                   Term List (Document References)
    “accelerating”         123, 127, 129, 152, 344, 791 . . .
    “creation”             122, 125, 126, 129, 130, 167 . . .
    “content”              123, 126, 130, 142, 143, 167 . . .
    “application”          123, 130, 131, 135, 162, 177 . . .
    “agility”              126, 130, 167, 212, 219, 377 . . .
    <article>              122, 125, 126, 129, 130, 143, 167
    <article> / <title>    122, 125, 126, 129, 130, 167 . . .
    product: MarkLogic     126, 130, 167, …
    Range Indexes
    Geospatial
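
Resolving a query against term lists like these comes down to set operations on document IDs. The sketch below uses the numbers shown on the slide, truncated where the slide has ". . .", so the result is illustrative only. The key names (`word:…`, `element:…`) are hypothetical, and the query is simplified to "articles containing the word content but not the word agility" rather than the title-scoped version.

```python
# Term lists copied from the slide (only the IDs that are visible there).
index = {
    "word:content": {123, 126, 130, 142, 143, 167},
    "word:agility": {126, 130, 167, 212, 219, 377},
    "element:article": {122, 125, 126, 129, 130, 143, 167},
}

# AND is an intersection of term lists, NOT is a set difference.
matches = (index["element:article"] & index["word:content"]) - index["word:agility"]

print(sorted(matches))  # prints [143]
```

Words, element names, and element paths all sit in the same structure, which is why one lookup style can answer both full-text and structural questions.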
  • wait a minute…
    Directories
    Exclusive, hierarchical, analogous to file system, map to URI
    Collections
    Set-based, N:N relationship
    Security
    Invisible to your app
  • universal index
    Term                         Term List (Document References)
    “accelerating”               123, 127, 129, 152, 344, 791 . . .
    “creation”                   122, 125, 126, 129, 130, 167 . . .
    “content”                    123, 126, 130, 142, 143, 167 . . .
    “application”                123, 130, 131, 135, 162, 177 . . .
    “data base”                  126, 130, 167, 212, 219, 377 . . .
    <article>                    . . .
    <article> / <title>          . . .
    product: MarkLogic           126, 130, 167, …
    Directory: /articles/
    Collection: CS
    Role: Editor + Action: Read
    Range Indexes
    Geospatial
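
If directories, collections, and role/action pairs are just more terms in the same index, then scoping and security become extra intersections. A minimal sketch under that assumption follows. The slide gives no document IDs for these three terms, so the ones below are invented, as are the key names and the `search` helper.

```python
index = {
    "word:content": {123, 126, 130, 142, 143, 167},  # from the slide
    "directory:/articles/": {123, 126, 130, 143},    # invented IDs
    "collection:CS": {126, 130, 143, 200},           # invented IDs
    "role:Editor+action:Read": {126, 143, 200},      # invented IDs
}

def search(terms, security_term):
    """Intersect the requested term lists, then apply the security term."""
    result = set(index[terms[0]])
    for term in terms[1:]:
        result &= index[term]
    # The security filter is added by the server, not by the caller,
    # which is what "invisible to your app" suggests.
    return result & index[security_term]

visible = search(
    ["word:content", "directory:/articles/", "collection:CS"],
    "role:Editor+action:Read",
)
```

With these made-up lists, documents 126 and 143 are the only ones an Editor with read permission would get back.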
  • throughput
    in memory stand(s)
    durability: journal
    flickr.com/kt
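
One way to read this slide: a write is made durable by appending it to a journal, and made fast by landing in an in-memory stand that is flushed once it fills up. The class below is a toy under those assumptions. `Store`, `stand_limit`, and the JSON journal format are invented here, on-disk stands are simulated with a list, and the double buffering mentioned in the speaker notes is left out.

```python
import json
import os
import tempfile

class Store:
    def __init__(self, journal_path, stand_limit=2):
        self.journal_path = journal_path
        self.stand_limit = stand_limit  # hypothetical flush threshold
        self.memory_stand = {}          # current in-memory stand
        self.disk_stands = []           # stand-in for stands written to disk

    def insert(self, uri, doc):
        # Durability first: append to the journal and force it to disk.
        with open(self.journal_path, "a", encoding="utf-8") as journal:
            journal.write(json.dumps({"uri": uri, "doc": doc}) + "\n")
            journal.flush()
            os.fsync(journal.fileno())
        # Then the cheap part: update the in-memory stand.
        self.memory_stand[uri] = doc
        if len(self.memory_stand) >= self.stand_limit:
            # Freeze the full stand and start a fresh one.
            self.disk_stands.append(dict(self.memory_stand))
            self.memory_stand = {}

journal_path = os.path.join(tempfile.mkdtemp(), "journal.log")
store = Store(journal_path)
for n in range(3):
    store.insert(f"/articles/{n}.xml", f"<article>{n}</article>")
```

After three inserts with a limit of two, one stand has been flushed, one document is still in memory, and the journal holds all three writes.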
  • mvcc
    append-only database, uses system timestamps to know which document is currently available
    and the marklogic time machine
    [Timeline labels: create, update (could also be create), delete, query; axis: system timestamp]
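
The timestamp idea can be sketched as a store that never overwrites content. Each version records the system timestamp at which it was created and, later, the one at which it stopped being current. A query at timestamp t sees the version that was live at t, which is the "time machine" part. `VersionedStore` and its methods are hypothetical names, not MarkLogic API.

```python
class VersionedStore:
    def __init__(self):
        self.timestamp = 0
        self.versions = []  # grows only; content is never rewritten

    def _tick(self):
        self.timestamp += 1
        return self.timestamp

    def _retire(self, uri, now):
        # Stamp the currently live version (if any) as ended at `now`.
        for version in self.versions:
            if version["uri"] == uri and version["deleted"] is None:
                version["deleted"] = now

    def put(self, uri, content):
        """Create or update: retire the old version, append a new one."""
        now = self._tick()
        self._retire(uri, now)
        self.versions.append(
            {"uri": uri, "content": content, "created": now, "deleted": None}
        )
        return now

    def delete(self, uri):
        now = self._tick()
        self._retire(uri, now)
        return now

    def get(self, uri, at=None):
        """Return the content visible at timestamp `at` (default: now)."""
        at = self.timestamp if at is None else at
        for v in self.versions:
            live = v["deleted"] is None or at < v["deleted"]
            if v["uri"] == uri and v["created"] <= at and live:
                return v["content"]
        return None

store = VersionedStore()
t1 = store.put("/a.xml", "v1")
t2 = store.put("/a.xml", "v2")
t3 = store.delete("/a.xml")
```

Here `store.get("/a.xml", at=t1)` still returns "v1" after the update and the delete, while a query at the latest timestamp finds nothing.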
  • too good to be true?
    try us out… free version available!
    developer.marklogic.com/products
    markmail.org
    pairs.demo.marklogic.com
    heatmap.demo.marklogic.com
    bit.ly/ml-demo
    flickr.com/nattu
  • questions?
    Love NoSQL databases?
    Want to change the world?
    We are hiring!!
    spkr8.com/t/4590
    Feedback
    nuno.job@marklogic.com
  • not covered
    but conversations are welcome!
    Open-source, closed development?
    REST
    Mobile
    XQuery and why it’s awesome!
    App Server + Search + Database
    Scalable ACID transactions
    XML vs. JSON?
    Merging / Compaction
    Relevance
    MVCC
    Reverse Indexes
    Alerting
    Higher-Order Functions
    Geospatial queries
    Co-occurrence
    Meta programming
    Document databases