Choosing the Right NOSQL Database

Tobias Ivarsson
Hacker @ Neo Technology

twitter: @thobe / @neo4j / #neo4j
email: tobias@neotechnology.com
web: http://neo4j.org/
web: http://thobe.org/
Image credit: http://browsertoolkit.com/fault-tolerance.png

This is still the view a lot
of people have of NOSQL.




The Technologies

๏ Graph Databases
  - Neo4j

๏ Document Databases
  - MongoDB

๏ Column Family Databases
  - Cassandra
Neo4j is a Graph Database

   (Neo4j) --IS_A--> (Graph Database)

   Graph databases FOCUS
   on the interconnection
   between entities.
Other Graph Databases
๏ Neo4j
๏ Sones GraphDB
๏ Infinite Graph (by Objectivity)
๏ AllegroGraph (by Franz Inc.)
๏ HypergraphDB
๏ InfoGrid
๏ DEX
๏ VertexDB
๏ FlockDB
Document Databases
๏ MongoDB
๏ Riak
๏ CouchDB
๏ SimpleDB (by Amazon)




ColumnFamily Databases
๏ Cassandra
๏ BigTable (internal at Google)
๏ HBase (part of Hadoop)
๏ Hypertable




Application 1:
 Blog system

Requirements for a Blog System
๏ Get blog posts for a specific blog ordered by date
   • possibly filtered by tag
๏ Blogs can have an arbitrary number of blog posts
๏ Blog posts can have an arbitrary number of comments




the choice:
Document DB

“Schema” design

๏Represent each Blog as a
   Collection of Post
   documents

๏Represent Comments as
   nested documents in the
   Post documents
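
A minimal sketch of this document shape, written as plain Python data rather than MongoDB driver calls. The field names mirror the Java snippets in this deck; the helper `posts_by_tag` and the sample posts are hypothetical and only illustrate why nesting comments inside the post keeps "posts by date, filtered by tag" a lookup in a single collection.

```python
from datetime import datetime

# One blog = one collection (here: a list) of post documents.
my_blog = [
    {
        "title": "JavaOne 2010",
        "pub_date": datetime(2010, 9, 20),
        "tags": ["conference", "java"],
        # Comments live inside the post, so no join is needed to show them.
        "comments": [{"message": "Nice talk!", "date": datetime(2010, 9, 21)}],
    },
    {
        "title": "Graph traversals",
        "pub_date": datetime(2010, 10, 5),
        "tags": ["neo4j", "java"],
        "comments": [],
    },
    {
        "title": "Holiday photos",
        "pub_date": datetime(2010, 8, 1),
        "tags": ["personal"],
        "comments": [],
    },
]


def posts_by_tag(blog, tag=None):
    """Return posts newest first, optionally keeping only those with `tag`."""
    selected = [p for p in blog if tag is None or tag in p["tags"]]
    return sorted(selected, key=lambda p: p["pub_date"], reverse=True)


java_titles = [p["title"] for p in posts_by_tag(my_blog, "java")]
```

Posts without a given field (for example no tags at all) would simply omit it, which is the sparse, semi-structured case a document database is suited to.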

Creating a blog post
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Date;

import com.mongodb.Mongo;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.BasicDBObject;
import com.mongodb.DBObject;
// ...
Mongo mongo = new Mongo( "localhost" ); // Connect to MongoDB
// ...
DB blogs = mongo.getDB( "blogs" ); // Access the blogs database
DBCollection myBlog = blogs.getCollection( "myBlog" ); // One collection per blog

DBObject blogPost = new BasicDBObject();
blogPost.put( "title", "JavaOne 2010" );
blogPost.put( "pub_date", new Date() );
blogPost.put( "body",
   "Publishing a post about JavaOne in my MongoDB blog!" );
blogPost.put( "tags", Arrays.asList( "conference", "java" ) );
blogPost.put( "comments", new ArrayList<DBObject>() ); // Starts empty

myBlog.insert( blogPost );
Retrieving posts
// ...
import com.mongodb.DBCursor;
// ...
// db: the DB handle obtained from mongo.getDB( "blogs" )

public Object getAllPosts( String blogName ) {
   DBCollection blog = db.getCollection( blogName );
   return renderPosts( blog.find() );
}

public Object getPostsByTag( String blogName, String tag ) {
   DBCollection blog = db.getCollection( blogName );
   return renderPosts( blog.find(
      new BasicDBObject( "tags", tag ) ) );
}

private Object renderPosts( DBCursor cursor ) {
   // order by publication date (descending)
   cursor = cursor.sort( new BasicDBObject( "pub_date", -1 ) );
   // ...
}
Adding a comment
DBCollection myBlog = blogs.getCollection( "myBlog" );
// ...

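void addComment( String blogPostId, String message ) {
   // Assumes _id was stored as a String; driver-generated ids are ObjectId values
   DBCursor posts = myBlog.find(
      new BasicDBObject( "_id", blogPostId ) );
   if ( !posts.hasNext() ) throw new NoSuchElementException();

   DBObject blogPost = posts.next();

   // Append the new comment to the list nested in the post document
   @SuppressWarnings( "unchecked" )
   List<Object> comments = (List<Object>) blogPost.get( "comments" );
   comments.add( new BasicDBObject( "message", message )
      .append( "date", new Date() ) );

   myBlog.save( blogPost ); // Write the whole post back
}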
Application 2:
Twitter Clone

Requirements for a Twitter Clone
๏ Handle high load - especially high write load
   • Twitter generates 300GB of tweets / hour (April 2010)
๏ Retrieve all posts by a specific user, ordered by date
๏ Retrieve all posts by people a specific user follows, ordered by date




the choice:
ColumnFamily DB

Schema design
๏ Main keyspace: “Twissandra”, with these ColumnFamilies:
   • User - user data, keyed by user id (UUID)
   • Username - inverted index from username to user id
   • Friends - who is user X following?
   • Followers - who is following user X?
   • Tweet - the actual messages
   • Userline - timeline of tweets posted by a specific user
   • Timeline - timeline of tweets posted by users
       that a specific user follows
... that’s a lot of denormalization ...
๏ ColumnFamilies are similar to tables in an RDBMS
๏ Each ColumnFamily can only have one Key
๏ This makes the data highly shardable
๏ Which in turn enables very high write throughput
๏ Note however that each ColumnFamily will require its own writes
   • There are no ACID transactions
   • YOU as a developer are responsible for Consistency!
   • (again, this gives you really high write throughput)
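
To make the cost of this denormalization concrete, here is a small in-memory sketch in which each ColumnFamily is a dictionary of rows. The class `ColumnFamily` and the functions `follow` and `post` are hypothetical stand-ins, not a Cassandra client. The point is only that one logical action becomes several independent writes, with nothing tying them together if one of them fails.

```python
from collections import defaultdict


class ColumnFamily:
    """Toy stand-in: row key -> {column name: value}. Counts writes."""

    def __init__(self):
        self.rows = defaultdict(dict)
        self.writes = 0

    def insert(self, key, columns):
        self.rows[key].update(columns)
        self.writes += 1

    def get(self, key):
        return dict(self.rows[key])


FRIENDS, FOLLOWERS = ColumnFamily(), ColumnFamily()
TWEET, USERLINE, TIMELINE = ColumnFamily(), ColumnFamily(), ColumnFamily()


def follow(user, friend, now):
    # Two writes: one per direction of the relationship.
    FRIENDS.insert(user, {friend: now})
    FOLLOWERS.insert(friend, {user: now})


def post(user, tweet_id, body, now):
    # One write for the message, one for the author's own line,
    # one for the author's timeline, then one per follower.
    TWEET.insert(tweet_id, {"user_id": user, "body": body})
    USERLINE.insert(user, {now: tweet_id})
    TIMELINE.insert(user, {now: tweet_id})
    for follower in FOLLOWERS.get(user):
        TIMELINE.insert(follower, {now: tweet_id})


follow("alice", "carol", now=1)
follow("bob", "carol", now=2)
post("carol", "t1", "hello", now=3)

total_writes = sum(cf.writes for cf in (TWEET, USERLINE, TIMELINE))
```

With two followers, a single tweet costs five writes in this toy model. The read side stays cheap because every timeline is already materialized.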
Create user
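from uuid import uuid4 as uuid  # assumption: the slide does not show which uuid() is used

new_useruuid = str(uuid())

# Two writes: the user row, plus the inverted index from username to id
USER.insert(new_useruuid, {
  'id': new_useruuid,
  'username': username,
  'password': password
  })
USERNAME.insert(username, {
  'id': new_useruuid
  })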


Follow user
FRIENDS.insert(useruuid, {frienduuid: time.time()})
FOLLOWERS.insert(frienduuid, {useruuid: time.time()})



Create message
import struct
import time

tweetuuid = str(uuid())
timestamp = int(time.time() * 1e6)  # microseconds since the epoch

TWEET.insert(tweetuuid, {
  'id': tweetuuid,
  'user_id': useruuid,
  'body': body,
  '_ts': timestamp})

# Column name: the timestamp packed as a big-endian double; value: the tweet id
message_ref = {
  struct.pack('>d', timestamp): tweetuuid}
USERLINE.insert(useruuid, message_ref)

TIMELINE.insert(useruuid, message_ref)
# Fan out to (at most 5000 of) the author's followers
for otheruuid in FOLLOWERS.get(useruuid, column_count=5000):
    TIMELINE.insert(otheruuid, message_ref)
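
The column name used for each timeline entry is a packed timestamp rather than a plain number. A likely reason, stated here as an assumption since the slides do not spell it out, is that a byte-ordered comparator then sorts columns chronologically: for non-negative values, big-endian IEEE 754 doubles compare bytewise in the same order as the numbers themselves. The sample timestamps and the helper `column_name` are made up for illustration.

```python
import struct

# Hypothetical microsecond timestamps, deliberately out of order.
timestamps = [1286000000000002.0, 1286000000000000.0, 1286000000000001.0]


def column_name(ts):
    """Pack a timestamp as a big-endian double."""
    return struct.pack(">d", ts)


by_bytes = sorted(timestamps, key=column_name)
by_value = sorted(timestamps)

# Little-endian packing puts the least significant byte first,
# so bytewise order no longer follows numeric order.
little_endian_ok = struct.pack("<d", 1.0) < struct.pack("<d", 2.0)
```

Here `by_bytes` and `by_value` come out identical, while the little-endian comparison gets the order of 1.0 and 2.0 wrong.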


Get messages
For all users this user follows
timeline = TIMELINE.get(useruuid,
  column_start=start,
  column_count=NUM_PER_PAGE,
  column_reversed=True)
tweets = TWEET.multiget( timeline.values() )

By a specific user
timeline = USERLINE.get(useruuid,
  column_start=start,
  column_count=NUM_PER_PAGE,
  column_reversed=True)
tweets = TWEET.multiget( timeline.values() )
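
Both reads are column slices over one wide row: start at a column name, walk in reverse order, and stop after a page. The function below imitates that behaviour over a plain dictionary so the paging logic is visible. `get_slice` is hypothetical and only loosely modelled on the `column_start`, `column_count` and `column_reversed` parameters used above.

```python
def get_slice(row, column_start=None, column_count=100, column_reversed=False):
    """Return up to `column_count` (name, value) pairs from `row`.

    Columns are visited in sorted name order (descending when reversed),
    beginning at `column_start` inclusive, or at the first column in
    visiting order when `column_start` is None.
    """
    names = sorted(row, reverse=column_reversed)
    if column_start is not None:
        if column_reversed:
            names = [n for n in names if n <= column_start]
        else:
            names = [n for n in names if n >= column_start]
    return [(n, row[n]) for n in names[:column_count]]


# One timeline row: column name = timestamp, value = tweet id.
timeline_row = {10: "t10", 20: "t20", 30: "t30", 40: "t40", 50: "t50"}

page1 = get_slice(timeline_row, column_count=2, column_reversed=True)
# Fetch one extra column to learn where the next page starts.
probe = get_slice(timeline_row, column_count=3, column_reversed=True)
next_start = probe[-1][0]
page2 = get_slice(timeline_row, column_start=next_start,
                  column_count=2, column_reversed=True)
```

The tweet bodies themselves would then be fetched by id in a second step, which is what the `multiget` call does.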




Application 3:
Social Network

Requirements for a Social Network
๏ Interact with friends
๏ Get recommendations for new friends
๏ View the social context of a person
    i.e. How do I know this person?




the choice:
Graph DB

“Schema” design
๏ Persons represented by Nodes
๏ Friendship represented by Relationships between Person Nodes
๏ Groups represented by Nodes
๏ Group membership represented by Relationship
    from Person Node to Group Node
๏ Index for Person Nodes for lookup by name
๏ Index for Group Nodes for lookup by name
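
As a rough illustration of this model, the sketch below keeps nodes, typed relationships and a name index in plain Python structures. The class `SocialGraph` and its methods are hypothetical and far simpler than a real graph database. The person and group names are borrowed from this deck, but which person belongs to which group is assumed sample data.

```python
from collections import defaultdict


class SocialGraph:
    def __init__(self):
        self.properties = {}                    # node id -> property dict
        self.relationships = defaultdict(set)   # (node id, type) -> neighbour ids
        self.index = {}                         # (index name, value) -> node id
        self._next_id = 0

    def create_node(self, index_name, name, **props):
        node_id = self._next_id
        self._next_id += 1
        self.properties[node_id] = dict(props, name=name)
        self.index[(index_name, name)] = node_id
        return node_id

    def relate(self, start, end, rel_type):
        # Stored in both directions so it can be followed either way.
        self.relationships[(start, rel_type)].add(end)
        self.relationships[(end, rel_type)].add(start)

    def neighbours(self, node_id, rel_type):
        return set(self.relationships.get((node_id, rel_type), ()))


g = SocialGraph()
neo = g.create_node("persons", "Thomas Anderson", age=29)
morpheus = g.create_node("persons", "Morpheus", rank="Captain")
crew = g.create_node("groups", "Nebuchadnezzar crew")

g.relate(neo, morpheus, "FRIENDSHIP")
g.relate(morpheus, crew, "MEMBERSHIP")

found = g.index[("persons", "Morpheus")]
```

Persons and groups are both just nodes. What distinguishes them is the index they are registered in and the type of the relationships attached to them.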



A small social graph example
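[Figure: social graph. Relationship types: FRIENDSHIP, MEMBERSHIP.
 Person nodes: Thomas Anderson, Morpheus, Trinity, Cypher, Tank, Dozer,
 Agent Smith, Agent Brown.
 Group nodes: Nebuchadnezzar crew, Agent taskforce.
 Two relationships carry a qualifier: "Qualifier: Family" (drawn near Tank
 and Dozer) and "Qualifier: Lovers" (drawn near Thomas Anderson and Trinity).]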
Creating the social graph
GraphDatabaseService graphDb = new EmbeddedGraphDatabase(
  GRAPH_STORAGE_LOCATION );
IndexService indexes = new LuceneIndexService( graphDb );
Transaction tx = graphDb.beginTx();
try {
   Node mrAnderson = graphDb.createNode();
   mrAnderson.setProperty( "name", "Thomas Anderson" );
   mrAnderson.setProperty( "age", 29 );
   indexes.index( mrAnderson, "persons", "Thomas Anderson" );
   Node morpheus = graphDb.createNode();
   morpheus.setProperty( "name", "Morpheus" );
   morpheus.setProperty( "rank", "Captain" );
   indexes.index( morpheus, "persons", "Morpheus" );
   Relationship friendship = mrAnderson.createRelationshipTo(
      morpheus, SocialGraphTypes.FRIENDSHIP );

   tx.success();
} finally {
   tx.finish();
}
Making new friends
// Writes must run inside a transaction, as when creating the graph
Node person1 = indexes.getSingleNode( "persons", person1Name );
Node person2 = indexes.getSingleNode( "persons", person2Name );

person1.createRelationshipTo(
  person2, SocialGraphTypes.FRIENDSHIP );



Joining a group
// Writes must run inside a transaction, as when creating the graph
Node person = indexes.getSingleNode( "persons", personName );
Node group = indexes.getSingleNode( "groups", groupName );

person.createRelationshipTo(
  group, SocialGraphTypes.MEMBERSHIP );




How do I know this person?
Node me = ...
Node you = ...

PathFinder<Path> shortestPathFinder = GraphAlgoFactory.shortestPath(
   Traversal.expanderForTypes(
       SocialGraphTypes.FRIENDSHIP, Direction.BOTH ),
   /* maximum depth: */ 4 );

// findSinglePath returns null when no path exists within the maximum depth
Path shortestPath = shortestPathFinder.findSinglePath(me, you);

for ( Node friend : shortestPath.nodes() ) {
   System.out.println( friend.getProperty( "name" ) );
}
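
The idea behind this lookup can be sketched as a breadth-first search with a depth limit. The version below is a minimal illustration over an in-memory adjacency map, not Neo4j's algorithm. The function `shortest_path` and the friendships in `friends` are assumptions made up for the example.

```python
from collections import deque


def shortest_path(friends, start, goal, max_depth=4):
    """Breadth-first search; returns a list of names or None."""
    if start == goal:
        return [start]
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if len(path) - 1 >= max_depth:
            continue  # this path already has max_depth relationships
        for friend in sorted(friends.get(path[-1], ())):
            if friend in visited:
                continue
            if friend == goal:
                return path + [friend]
            visited.add(friend)
            queue.append(path + [friend])
    return None


# Undirected friendships, listed in both directions (sample data).
friends = {
    "Thomas Anderson": {"Morpheus", "Trinity"},
    "Morpheus": {"Thomas Anderson", "Trinity", "Cypher"},
    "Trinity": {"Thomas Anderson", "Morpheus"},
    "Cypher": {"Morpheus", "Agent Smith"},
    "Agent Smith": {"Cypher"},
}

path = shortest_path(friends, "Thomas Anderson", "Agent Smith")
too_far = shortest_path(friends, "Thomas Anderson", "Agent Smith", max_depth=2)
```

Because the search expands one hop at a time, the first path that reaches the goal is also a shortest one, and the depth limit keeps the search from wandering through the whole graph.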




Recommend new friends
Node person = ...

TraversalDescription friendsOfFriends = Traversal.description()
   .expand( Traversal.expanderForTypes(
                 SocialGraphTypes.FRIENDSHIP, Direction.BOTH ) )
   .prune( Traversal.pruneAfterDepth( 2 ) )
   .breadthFirst() // Visit my friends before their friends.
   // Visit a node at most once (don’t recommend direct friends)
   .uniqueness( Uniqueness.NODE_GLOBAL )
   .filter( new Predicate<Path>() {
       // Only return friends of friends
       public boolean accept( Path traversalPos ) {
          return traversalPos.length() == 2;
       }
   } );

for ( Node recommendation :
           friendsOfFriends.traverse( person ).nodes() ) {
   System.out.println( recommendation.getProperty( "name" ) );
}
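
The same friends-of-friends rule can be written out by hand as a sketch: collect everyone two friendship hops away, then drop the person and their direct friends. The function `recommend_friends` and the sample friendships are hypothetical, and this plain-Python version only mirrors the intent of the traversal, not its implementation.

```python
def recommend_friends(friends, person):
    """Names exactly two friendship hops away from `person`."""
    direct = set(friends.get(person, ()))
    recommendations = set()
    for friend in direct:
        for candidate in friends.get(friend, ()):
            # Skip the person themselves and people they already know.
            if candidate != person and candidate not in direct:
                recommendations.add(candidate)
    return sorted(recommendations)


# Undirected friendships, listed in both directions (sample data).
friends = {
    "Thomas Anderson": {"Morpheus", "Trinity"},
    "Morpheus": {"Thomas Anderson", "Trinity", "Cypher", "Tank"},
    "Trinity": {"Thomas Anderson", "Morpheus", "Tank"},
    "Cypher": {"Morpheus"},
    "Tank": {"Morpheus", "Trinity"},
}

suggested = recommend_friends(friends, "Thomas Anderson")
```

Tank is reachable through both Morpheus and Trinity but appears once, which corresponds to visiting each node at most once in the traversal.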
When to use Document DB (e.g. MongoDB)
๏ When data is collections of similar entities
   • But semi-structured (sparse) rather than tabular
   • When fields in entries have multiple values




When to use ColumnFamily DB (e.g. Cassandra)
๏ When scalability is the main issue
   • Both scaling size and scaling load
      ‣In particular scaling write load


๏ Linear scalability (as you add servers) both in read and write
๏ Low level - will require you to duplicate data to support queries




When to use Graph DB (e.g. Neo4j)
๏ When deep traversals are important
๏ For complex domains
๏ When how entities relate is an important aspect of the domain




When not to use a NOSQL Database
๏ RDBMSes have been the de facto standard for years, and still have
     better tools for some tasks
   • Especially for reporting
๏ When maintaining a system that already works
๏ Sometimes when data is uniform / structured
๏ When aggregations over (subsets of) the entire dataset are key

๏ But please don’t use a Relational database for persisting objects
Complex problem? - right tool for each job!




Image credits: Unknown :’(
Polyglot persistence
๏ Use multiple databases in the same system
    - use the right tool for each part of the system
๏ Examples:
   • Use an RDBMS for structured data and a Graph Database for
       modeling the relationships between entities
   • Use a Graph Database for the domain model and a Document
       Database for storing large data objects
Neo Technology - the Graph Database company

http://neotechnology.com

More Related Content

What's hot

5 Pitfalls to Avoid with MongoDB
5 Pitfalls to Avoid with MongoDB5 Pitfalls to Avoid with MongoDB
5 Pitfalls to Avoid with MongoDBTim Callaghan
 
Conceptos básicos. Seminario web 2: Su primera aplicación MongoDB
 Conceptos básicos. Seminario web 2: Su primera aplicación MongoDB Conceptos básicos. Seminario web 2: Su primera aplicación MongoDB
Conceptos básicos. Seminario web 2: Su primera aplicación MongoDBMongoDB
 
Agility and Scalability with MongoDB
Agility and Scalability with MongoDBAgility and Scalability with MongoDB
Agility and Scalability with MongoDBMongoDB
 
Neo4j 4.1 overview
Neo4j 4.1 overviewNeo4j 4.1 overview
Neo4j 4.1 overviewNeo4j
 
Back to Basics Webinar 1: Introduction to NoSQL
Back to Basics Webinar 1: Introduction to NoSQLBack to Basics Webinar 1: Introduction to NoSQL
Back to Basics Webinar 1: Introduction to NoSQLMongoDB
 
Back to Basics Spanish 4 Introduction to sharding
Back to Basics Spanish 4 Introduction to shardingBack to Basics Spanish 4 Introduction to sharding
Back to Basics Spanish 4 Introduction to shardingMongoDB
 
Migrating to MongoDB: Best Practices
Migrating to MongoDB: Best PracticesMigrating to MongoDB: Best Practices
Migrating to MongoDB: Best PracticesMongoDB
 
Breaking the Oracle Tie; High Performance OLTP and Analytics Using MongoDB
Breaking the Oracle Tie; High Performance OLTP and Analytics Using MongoDBBreaking the Oracle Tie; High Performance OLTP and Analytics Using MongoDB
Breaking the Oracle Tie; High Performance OLTP and Analytics Using MongoDBMongoDB
 
Mongodb - NoSql Database
Mongodb - NoSql DatabaseMongodb - NoSql Database
Mongodb - NoSql DatabasePrashant Gupta
 
Back to Basics 2017: Mí primera aplicación MongoDB
Back to Basics 2017: Mí primera aplicación MongoDBBack to Basics 2017: Mí primera aplicación MongoDB
Back to Basics 2017: Mí primera aplicación MongoDBMongoDB
 
Benefits of using MongoDB: Reduce Complexity & Adapt to Changes
Benefits of using MongoDB: Reduce Complexity & Adapt to ChangesBenefits of using MongoDB: Reduce Complexity & Adapt to Changes
Benefits of using MongoDB: Reduce Complexity & Adapt to ChangesAlex Nguyen
 
Elasticsearch quick Intro (English)
Elasticsearch quick Intro (English)Elasticsearch quick Intro (English)
Elasticsearch quick Intro (English)Federico Panini
 
High Performance Applications with MongoDB
High Performance Applications with MongoDBHigh Performance Applications with MongoDB
High Performance Applications with MongoDBMongoDB
 
Webinar: When to Use MongoDB
Webinar: When to Use MongoDBWebinar: When to Use MongoDB
Webinar: When to Use MongoDBMongoDB
 
Back to Basics 2017: Introduction to Sharding
Back to Basics 2017: Introduction to ShardingBack to Basics 2017: Introduction to Sharding
Back to Basics 2017: Introduction to ShardingMongoDB
 
Cloud Backup Overview
Cloud Backup Overview Cloud Backup Overview
Cloud Backup Overview MongoDB
 
Apache Any23 - Anything to Triples
Apache Any23 - Anything to TriplesApache Any23 - Anything to Triples
Apache Any23 - Anything to TriplesMichele Mostarda
 

What's hot (20)

No sq lv1_0
No sq lv1_0No sq lv1_0
No sq lv1_0
 
5 Pitfalls to Avoid with MongoDB
5 Pitfalls to Avoid with MongoDB5 Pitfalls to Avoid with MongoDB
5 Pitfalls to Avoid with MongoDB
 
NoSQL Introduction
NoSQL IntroductionNoSQL Introduction
NoSQL Introduction
 
Conceptos básicos. Seminario web 2: Su primera aplicación MongoDB
 Conceptos básicos. Seminario web 2: Su primera aplicación MongoDB Conceptos básicos. Seminario web 2: Su primera aplicación MongoDB
Conceptos básicos. Seminario web 2: Su primera aplicación MongoDB
 
Agility and Scalability with MongoDB
Agility and Scalability with MongoDBAgility and Scalability with MongoDB
Agility and Scalability with MongoDB
 
Neo4j 4.1 overview
Neo4j 4.1 overviewNeo4j 4.1 overview
Neo4j 4.1 overview
 
Back to Basics Webinar 1: Introduction to NoSQL
Back to Basics Webinar 1: Introduction to NoSQLBack to Basics Webinar 1: Introduction to NoSQL
Back to Basics Webinar 1: Introduction to NoSQL
 
Back to Basics Spanish 4 Introduction to sharding
Back to Basics Spanish 4 Introduction to shardingBack to Basics Spanish 4 Introduction to sharding
Back to Basics Spanish 4 Introduction to sharding
 
Migrating to MongoDB: Best Practices
Migrating to MongoDB: Best PracticesMigrating to MongoDB: Best Practices
Migrating to MongoDB: Best Practices
 
Breaking the Oracle Tie; High Performance OLTP and Analytics Using MongoDB
Breaking the Oracle Tie; High Performance OLTP and Analytics Using MongoDBBreaking the Oracle Tie; High Performance OLTP and Analytics Using MongoDB
Breaking the Oracle Tie; High Performance OLTP and Analytics Using MongoDB
 
Mongodb - NoSql Database
Mongodb - NoSql DatabaseMongodb - NoSql Database
Mongodb - NoSql Database
 
Wmware NoSQL
Wmware NoSQLWmware NoSQL
Wmware NoSQL
 
Back to Basics 2017: Mí primera aplicación MongoDB
Back to Basics 2017: Mí primera aplicación MongoDBBack to Basics 2017: Mí primera aplicación MongoDB
Back to Basics 2017: Mí primera aplicación MongoDB
 
Benefits of using MongoDB: Reduce Complexity & Adapt to Changes
Benefits of using MongoDB: Reduce Complexity & Adapt to ChangesBenefits of using MongoDB: Reduce Complexity & Adapt to Changes
Benefits of using MongoDB: Reduce Complexity & Adapt to Changes
 
Elasticsearch quick Intro (English)
Elasticsearch quick Intro (English)Elasticsearch quick Intro (English)
Elasticsearch quick Intro (English)
 
High Performance Applications with MongoDB
High Performance Applications with MongoDBHigh Performance Applications with MongoDB
High Performance Applications with MongoDB
 
Webinar: When to Use MongoDB
Webinar: When to Use MongoDBWebinar: When to Use MongoDB
Webinar: When to Use MongoDB
 
Back to Basics 2017: Introduction to Sharding
Back to Basics 2017: Introduction to ShardingBack to Basics 2017: Introduction to Sharding
Back to Basics 2017: Introduction to Sharding
 
Cloud Backup Overview
Cloud Backup Overview Cloud Backup Overview
Cloud Backup Overview
 
Apache Any23 - Anything to Triples
Apache Any23 - Anything to TriplesApache Any23 - Anything to Triples
Apache Any23 - Anything to Triples
 

Viewers also liked

RDBMS vs NoSQL-MongoDB
RDBMS vs NoSQL-MongoDBRDBMS vs NoSQL-MongoDB
RDBMS vs NoSQL-MongoDBIntellipaat
 
2014 05-07-fr - add dev series - session 6 - deploying your application-2
2014 05-07-fr - add dev series - session 6 - deploying your application-22014 05-07-fr - add dev series - session 6 - deploying your application-2
2014 05-07-fr - add dev series - session 6 - deploying your application-2MongoDB
 
NoSQL Analytics: JSON Data Analysis and Acceleration in MongoDB World
NoSQL Analytics: JSON Data Analysis and Acceleration in MongoDB WorldNoSQL Analytics: JSON Data Analysis and Acceleration in MongoDB World
NoSQL Analytics: JSON Data Analysis and Acceleration in MongoDB WorldAjay Gupte
 
Managing Social Content with MongoDB
Managing Social Content with MongoDBManaging Social Content with MongoDB
Managing Social Content with MongoDBMongoDB
 
Schema less table & dynamic schema
Schema less table & dynamic schemaSchema less table & dynamic schema
Schema less table & dynamic schemaDavide Mauri
 
Elasticsearch as a search alternative to a relational database
Elasticsearch as a search alternative to a relational databaseElasticsearch as a search alternative to a relational database
Elasticsearch as a search alternative to a relational databaseKristijan Duvnjak
 
New usage model for real-time analytics by Dr. WILLIAM L. BAIN at Big Data S...
 New usage model for real-time analytics by Dr. WILLIAM L. BAIN at Big Data S... New usage model for real-time analytics by Dr. WILLIAM L. BAIN at Big Data S...
New usage model for real-time analytics by Dr. WILLIAM L. BAIN at Big Data S...Big Data Spain
 
Model-Driven Development of Semantic Mashup Applications with the Open-Source...
Model-Driven Development of Semantic Mashup Applications with the Open-Source...Model-Driven Development of Semantic Mashup Applications with the Open-Source...
Model-Driven Development of Semantic Mashup Applications with the Open-Source...InfoGrid.org
 
HBase Vs Cassandra Vs MongoDB - Choosing the right NoSQL database
HBase Vs Cassandra Vs MongoDB - Choosing the right NoSQL databaseHBase Vs Cassandra Vs MongoDB - Choosing the right NoSQL database
HBase Vs Cassandra Vs MongoDB - Choosing the right NoSQL databaseEdureka!
 
Query mechanisms for NoSQL databases
Query mechanisms for NoSQL databasesQuery mechanisms for NoSQL databases
Query mechanisms for NoSQL databasesArangoDB Database
 
NoSQL databases pros and cons
NoSQL databases pros and consNoSQL databases pros and cons
NoSQL databases pros and consFabio Fumarola
 
Sql vs NoSQL
Sql vs NoSQLSql vs NoSQL
Sql vs NoSQLRTigger
 
Introduction to Storm
Introduction to Storm Introduction to Storm
Introduction to Storm Chandler Huang
 
NoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenNoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenLorenzo Alberton
 
Introduction to NoSQL Databases
Introduction to NoSQL DatabasesIntroduction to NoSQL Databases
Introduction to NoSQL DatabasesDerek Stainer
 
DjangoCon 2010 Scaling Disqus
DjangoCon 2010 Scaling DisqusDjangoCon 2010 Scaling Disqus
DjangoCon 2010 Scaling Disquszeeg
 
A Beginners Guide to noSQL
A Beginners Guide to noSQLA Beginners Guide to noSQL
A Beginners Guide to noSQLMike Crabb
 
2015 Upload Campaigns Calendar - SlideShare
2015 Upload Campaigns Calendar - SlideShare2015 Upload Campaigns Calendar - SlideShare
2015 Upload Campaigns Calendar - SlideShareSlideShare
 

Viewers also liked (20)

Mongodb my
Mongodb myMongodb my
Mongodb my
 
RDBMS vs NoSQL-MongoDB
RDBMS vs NoSQL-MongoDBRDBMS vs NoSQL-MongoDB
RDBMS vs NoSQL-MongoDB
 
2014 05-07-fr - add dev series - session 6 - deploying your application-2
2014 05-07-fr - add dev series - session 6 - deploying your application-22014 05-07-fr - add dev series - session 6 - deploying your application-2
2014 05-07-fr - add dev series - session 6 - deploying your application-2
 
NoSQL Analytics: JSON Data Analysis and Acceleration in MongoDB World
NoSQL Analytics: JSON Data Analysis and Acceleration in MongoDB WorldNoSQL Analytics: JSON Data Analysis and Acceleration in MongoDB World
NoSQL Analytics: JSON Data Analysis and Acceleration in MongoDB World
 
Managing Social Content with MongoDB
Managing Social Content with MongoDBManaging Social Content with MongoDB
Managing Social Content with MongoDB
 
Schema less table & dynamic schema
Schema less table & dynamic schemaSchema less table & dynamic schema
Schema less table & dynamic schema
 
Elasticsearch as a search alternative to a relational database
Elasticsearch as a search alternative to a relational databaseElasticsearch as a search alternative to a relational database
Elasticsearch as a search alternative to a relational database
 
New usage model for real-time analytics by Dr. WILLIAM L. BAIN at Big Data S...
 New usage model for real-time analytics by Dr. WILLIAM L. BAIN at Big Data S... New usage model for real-time analytics by Dr. WILLIAM L. BAIN at Big Data S...
New usage model for real-time analytics by Dr. WILLIAM L. BAIN at Big Data S...
 
Model-Driven Development of Semantic Mashup Applications with the Open-Source...
Model-Driven Development of Semantic Mashup Applications with the Open-Source...Model-Driven Development of Semantic Mashup Applications with the Open-Source...
Model-Driven Development of Semantic Mashup Applications with the Open-Source...
 
HBase Vs Cassandra Vs MongoDB - Choosing the right NoSQL database
HBase Vs Cassandra Vs MongoDB - Choosing the right NoSQL databaseHBase Vs Cassandra Vs MongoDB - Choosing the right NoSQL database
HBase Vs Cassandra Vs MongoDB - Choosing the right NoSQL database
 
Query mechanisms for NoSQL databases
Query mechanisms for NoSQL databasesQuery mechanisms for NoSQL databases
Query mechanisms for NoSQL databases
 
NoSQL databases pros and cons
NoSQL databases pros and consNoSQL databases pros and cons
NoSQL databases pros and cons
 
Sql vs NoSQL
Sql vs NoSQLSql vs NoSQL
Sql vs NoSQL
 
Introduction to Storm
Introduction to Storm Introduction to Storm
Introduction to Storm
 
Selecting best NoSQL
Selecting best NoSQL Selecting best NoSQL
Selecting best NoSQL
 
NoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenNoSQL Databases: Why, what and when
NoSQL Databases: Why, what and when
 
Introduction to NoSQL Databases
Introduction to NoSQL DatabasesIntroduction to NoSQL Databases
Introduction to NoSQL Databases
 
DjangoCon 2010 Scaling Disqus
DjangoCon 2010 Scaling DisqusDjangoCon 2010 Scaling Disqus
DjangoCon 2010 Scaling Disqus
 
A Beginners Guide to noSQL
A Beginners Guide to noSQLA Beginners Guide to noSQL
A Beginners Guide to noSQL
 
2015 Upload Campaigns Calendar - SlideShare
2015 Upload Campaigns Calendar - SlideShare2015 Upload Campaigns Calendar - SlideShare
2015 Upload Campaigns Calendar - SlideShare
 

Similar to Choosing the right NOSQL database

Odessapy2013 - Graph databases and Python
Odessapy2013 - Graph databases and PythonOdessapy2013 - Graph databases and Python
Odessapy2013 - Graph databases and PythonMax Klymyshyn
 
Real-Time Integration Between MongoDB and SQL Databases
Real-Time Integration Between MongoDB and SQL DatabasesReal-Time Integration Between MongoDB and SQL Databases
Real-Time Integration Between MongoDB and SQL DatabasesEugene Dvorkin
 
Real-Time Integration Between MongoDB and SQL Databases
Real-Time Integration Between MongoDB and SQL Databases Real-Time Integration Between MongoDB and SQL Databases
Real-Time Integration Between MongoDB and SQL Databases MongoDB
 
I Have a NoSQL Toaster - Troy .NET User Group - July 2017
I Have a NoSQL Toaster - Troy .NET User Group - July 2017I Have a NoSQL Toaster - Troy .NET User Group - July 2017
I Have a NoSQL Toaster - Troy .NET User Group - July 2017Matthew Groves
 
Building your first app with mongo db
Building your first app with mongo dbBuilding your first app with mongo db
Building your first app with mongo dbMongoDB
 
MongoDB Introduction talk at Dr Dobbs Conference, MongoDB Evenings at Bangalo...
MongoDB Introduction talk at Dr Dobbs Conference, MongoDB Evenings at Bangalo...MongoDB Introduction talk at Dr Dobbs Conference, MongoDB Evenings at Bangalo...
MongoDB Introduction talk at Dr Dobbs Conference, MongoDB Evenings at Bangalo...Prasoon Kumar
 
Building a Cross Channel Content Delivery Platform with MongoDB
Building a Cross Channel Content Delivery Platform with MongoDBBuilding a Cross Channel Content Delivery Platform with MongoDB
Building a Cross Channel Content Delivery Platform with MongoDBMongoDB
 
Mongodb introduction and_internal(simple)
Mongodb introduction and_internal(simple)Mongodb introduction and_internal(simple)
Mongodb introduction and_internal(simple)Kai Zhao
 
MongoDB, PHP and the cloud - php cloud summit 2011
MongoDB, PHP and the cloud - php cloud summit 2011MongoDB, PHP and the cloud - php cloud summit 2011
MongoDB, PHP and the cloud - php cloud summit 2011Steven Francia
 
MongoDB World 2019: Terraform New Worlds on MongoDB Atlas
MongoDB World 2019: Terraform New Worlds on MongoDB Atlas MongoDB World 2019: Terraform New Worlds on MongoDB Atlas
MongoDB World 2019: Terraform New Worlds on MongoDB Atlas MongoDB
 
Mongodb intro
Mongodb introMongodb intro
Mongodb introchristkv
 
Big Data Day LA 2015 - Spark after Dark by Chris Fregly of Databricks
Big Data Day LA 2015 - Spark after Dark by Chris Fregly of DatabricksBig Data Day LA 2015 - Spark after Dark by Chris Fregly of Databricks
Big Data Day LA 2015 - Spark after Dark by Chris Fregly of DatabricksData Con LA
 
IMCSummit 2015 - Day 1 Developer Track - Spark After Dark: Generating High Qu...
IMCSummit 2015 - Day 1 Developer Track - Spark After Dark: Generating High Qu...IMCSummit 2015 - Day 1 Developer Track - Spark After Dark: Generating High Qu...
IMCSummit 2015 - Day 1 Developer Track - Spark After Dark: Generating High Qu...In-Memory Computing Summit
 
MongoDB Europe 2016 - Graph Operations with MongoDB
MongoDB Europe 2016 - Graph Operations with MongoDBMongoDB Europe 2016 - Graph Operations with MongoDB
MongoDB Europe 2016 - Graph Operations with MongoDBMongoDB
 
Building your first app with MongoDB
Building your first app with MongoDBBuilding your first app with MongoDB
Building your first app with MongoDBNorberto Leite
 
MongoDB at ZPUGDC
MongoDB at ZPUGDCMongoDB at ZPUGDC
MongoDB at ZPUGDCMike Dirolf
 

Similar to Choosing the right NOSQL database (20)




Choosing the right NOSQL database

  • 1. Choosing the Right NOSQL Database
      Tobias Ivarsson, Hacker @ Neo Technology
      twitter: @thobe / @neo4j / #neo4j
      email: tobias@neotechnology.com
      web: http://neo4j.org/
      web: http://thobe.org/
  • 4. This is still the view a lot of people have of NOSQL. Image credit: http://browsertoolkit.com/fault-tolerance.png
  • 5. The Technologies
      ๏ Graph Databases - Neo4j
      ๏ Document Databases - MongoDB
      ๏ Column Family Databases - Cassandra
  • 6. Neo4j is a Graph Database
      Graph databases FOCUS on the interconnection between entities.
  • 7. Neo4j -[IS_A]-> Graph Database
      Graph databases FOCUS on the interconnection between entities.
  • 8. Other Graph Databases
      ๏ Neo4j
      ๏ Sones GraphDB
      ๏ Infinite Graph (by Objectivity)
      ๏ AllegroGraph (by Franz Inc.)
      ๏ HypergraphDB
      ๏ InfoGrid
      ๏ DEX
      ๏ VertexDB
      ๏ FlockDB
  • 10. Document Databases
      ๏ MongoDB
      ๏ Riak
      ๏ CouchDB
      ๏ SimpleDB (internal at Amazon)
  • 12. ColumnFamily Databases
      ๏ Cassandra
      ๏ BigTable (internal at Google)
      ๏ HBase (part of Hadoop)
      ๏ Hypertable
  • 13. Application 1: Blog system
  • 14. Requirements for a Blog System
      ๏ Get blog posts for a specific blog ordered by date
         • possibly filtered by tag
      ๏ Blogs can have an arbitrary number of blog posts
      ๏ Blog posts can have an arbitrary number of comments
  • 16. “Schema” design
      ๏ Represent each Blog as a Collection of Post documents
      ๏ Represent Comments as nested documents in the Post documents
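To make the nesting concrete, here is a minimal sketch of what one Post document might look like, written as plain Python dictionaries rather than through a MongoDB driver. The field names mirror the blog post example in this deck; the helpers `make_post` and `add_comment` and the sample values are assumptions made for illustration.

```python
from datetime import datetime, timezone


def make_post(title, body, tags):
    """Build one Post document; comments live inside the post itself."""
    return {
        "title": title,
        "pub_date": datetime.now(timezone.utc),
        "body": body,
        "tags": list(tags),
        "comments": [],  # nested documents, not a separate collection
    }


def add_comment(post, message):
    """Append a comment as a nested document of the post."""
    post["comments"].append(
        {"message": message, "date": datetime.now(timezone.utc)}
    )


# A blog is modelled as a collection (here: a list) of Post documents.
my_blog = [
    make_post("JavaOne 2010", "Publishing a post about JavaOne.", ["conference", "java"])
]
add_comment(my_blog[0], "Nice talk!")
```

Because the comments travel with the post, rendering a post with its comments needs a single document read, at the cost of rewriting the post whenever a comment is added.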
  • 17. Creating a blog post
      import java.util.ArrayList;
      import java.util.Arrays;
      import java.util.Date;
      import com.mongodb.Mongo;
      import com.mongodb.DB;
      import com.mongodb.DBCollection;
      import com.mongodb.BasicDBObject;
      import com.mongodb.DBObject;
      // ...
      Mongo mongo = new Mongo( "localhost" ); // Connect to MongoDB
      // ...
      DB blogs = mongo.getDB( "blogs" ); // Access the blogs database
      DBCollection myBlog = blogs.getCollection( "myBlog" ); // One collection per blog
      DBObject blogPost = new BasicDBObject();
      blogPost.put( "title", "JavaOne 2010" );
      blogPost.put( "pub_date", new Date() );
      blogPost.put( "body", "Publishing a post about JavaOne in my MongoDB blog!" );
      blogPost.put( "tags", Arrays.asList( "conference", "java" ) );
      blogPost.put( "comments", new ArrayList<DBObject>() ); // Comments are nested in the post
      myBlog.insert( blogPost );
  • 18. Retrieving posts
      // ...
      import com.mongodb.DBCursor;
      // ...
      public Object getAllPosts( String blogName ) {
          DBCollection blog = db.getCollection( blogName );
          return renderPosts( blog.find() );
      }
      public Object getPostsByTag( String blogName, String tag ) {
          DBCollection blog = db.getCollection( blogName );
          return renderPosts( blog.find( new BasicDBObject( "tags", tag ) ) );
      }
      private Object renderPosts( DBCursor cursor ) {
          // order by publication date (descending)
          cursor = cursor.sort( new BasicDBObject( "pub_date", -1 ) );
          // ...
      }
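The two queries above (all posts, or posts carrying a tag, both newest first) can be sketched without a database. This is only an in-memory illustration under the assumption that posts are dictionaries with `tags` and `pub_date` fields; `posts_by_tag` and the sample posts are hypothetical names and data. A query on an array field in MongoDB matches when any element equals the value, which the `tag in p["tags"]` test imitates.

```python
from datetime import datetime


def posts_by_tag(blog, tag=None):
    """Return posts newest first, optionally only those carrying `tag`."""
    # A post matches if the tag appears anywhere in its "tags" list.
    selected = [p for p in blog if tag is None or tag in p["tags"]]
    # Equivalent in spirit to sorting on pub_date with direction -1.
    return sorted(selected, key=lambda p: p["pub_date"], reverse=True)


blog = [
    {"title": "Old", "pub_date": datetime(2010, 1, 1), "tags": ["java"]},
    {"title": "New", "pub_date": datetime(2010, 9, 20), "tags": ["java", "conference"]},
    {"title": "Other", "pub_date": datetime(2010, 5, 5), "tags": ["python"]},
]
titles = [p["title"] for p in posts_by_tag(blog, "java")]
```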
  • 19. Adding a comment
      DBCollection myBlog = blogs.getCollection( "myBlog" );
      // ...
      void addComment( String blogPostId, String message ) {
          // Note: if _id was generated by the driver it is an ObjectId, not a String,
          // and the id would then need to be wrapped: new ObjectId( blogPostId )
          DBCursor posts = myBlog.find( new BasicDBObject( "_id", blogPostId ) );
          if ( !posts.hasNext() ) throw new NoSuchElementException();
          DBObject blogPost = posts.next();
          List comments = (List)blogPost.get( "comments" );
          comments.add( new BasicDBObject( "message", message )
                            .append( "date", new Date() ) );
          myBlog.save( blogPost ); // Write the whole post back, including the new comment
      }
  • 21. Requirements for a Twitter Clone
      ๏ Handle high load - especially high write load
         • Twitter generates 300GB of tweets / hour (April 2010)
      ๏ Retrieve all posts by a specific user, ordered by date
      ๏ Retrieve all posts by people a specific user follows, ordered by date
  • 23. Schema design
      ๏ Main keyspace: “Twissandra”, with these ColumnFamilies:
         • User - user data, keyed by user id (UUID)
         • Username - inverted index from username to user id
         • Friends - who is user X following?
         • Followers - who is following user X?
         • Tweet - the actual messages
         • Userline - timeline of tweets posted by a specific user
         • Timeline - timeline of tweets posted by users that a specific user follows
  • 24. ... that’s a lot of denormalization ...
      ๏ ColumnFamilies are similar to tables in an RDBMS
      ๏ Each ColumnFamily can only have one Key
      ๏ This makes the data highly shardable
      ๏ Which in turn enables very high write throughput
      ๏ Note however that each ColumnFamily will require its own writes
         • There are no ACID transactions
         • YOU as a developer are responsible for Consistency!
         • (again, this gives you really high write throughput)
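The cost of this denormalization can be sketched with plain dictionaries standing in for ColumnFamilies. This is a toy model, not Cassandra or a client library: each ColumnFamily is assumed to be a `{row_key: {column_name: value}}` mapping, and `follow` and `post_tweet` are hypothetical helpers. It shows how one logical action turns into several independent writes, none of which is protected by a transaction.

```python
from collections import defaultdict

# Each ColumnFamily is modelled as {row_key: {column_name: value}}.
TWEET = {}
USERLINE = defaultdict(dict)
TIMELINE = defaultdict(dict)
FOLLOWERS = defaultdict(dict)


def follow(follower, followee, now):
    """Record that `follower` follows `followee`."""
    FOLLOWERS[followee][follower] = now


def post_tweet(tweet_id, user, body, ts):
    """One logical write fans out to several ColumnFamilies."""
    writes = 0
    TWEET[tweet_id] = {"user_id": user, "body": body, "_ts": ts}
    writes += 1
    USERLINE[user][ts] = tweet_id
    writes += 1
    # The author's own timeline plus one write per follower.
    for reader in [user, *FOLLOWERS[user]]:
        TIMELINE[reader][ts] = tweet_id
        writes += 1
    return writes


follow("bob", "alice", now=1)
follow("carol", "alice", now=2)
n = post_tweet("t1", "alice", "hello", ts=100)
```

With two followers the single tweet costs five writes. If the process stopped halfway through the loop, some timelines would contain the tweet and others would not, which is the consistency burden the slide assigns to the developer.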
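  • 25. Create user
          new_useruuid = str(uuid())
          # Row key and stored id must be the same, newly generated UUID
          USER.insert(new_useruuid, {
              'id': new_useruuid,
              'username': username,
              'password': password
          })
          # Inverted index: username -> user id
          USERNAME.insert(username, {'id': new_useruuid})
      Follow user
          # Both directions are written, one row per ColumnFamily
          FRIENDS.insert(useruuid, {frienduuid: time.time()})
          FOLLOWERS.insert(frienduuid, {useruuid: time.time()})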
  • 26. Create message
          tweetuuid = str(uuid())
          timestamp = long(time.time() * 1e6)
          TWEET.insert(tweetuuid, {
              'id': tweetuuid,
              'user_id': useruuid,
              'body': body,
              '_ts': timestamp})
          # Column name is the packed timestamp, so columns sort by time
          message_ref = {struct.pack('>d', timestamp): tweetuuid}
          USERLINE.insert(useruuid, message_ref)
          TIMELINE.insert(useruuid, message_ref)
          # Fan out to the timeline of every follower
          for otheruuid in FOLLOWERS.get(useruuid, 5000):
              TIMELINE.insert(otheruuid, message_ref)
  • 27. Get messages
      For all users this user follows
          timeline = TIMELINE.get(useruuid,
                                  column_start=start,
                                  column_count=NUM_PER_PAGE,
                                  column_reversed=True)
          tweets = TWEET.multiget( timeline.values() )
      By a specific user
          timeline = USERLINE.get(useruuid,
                                  column_start=start,
                                  column_count=NUM_PER_PAGE,
                                  column_reversed=True)
          tweets = TWEET.multiget( timeline.values() )
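The paging pattern in these reads (start column, column count, reversed order) can be illustrated on an ordinary dictionary. This is a sketch under the assumption that a timeline row maps timestamp to tweet id; `get_page` is a hypothetical helper and does not reproduce any client library's behaviour exactly.

```python
def get_page(row, start=None, count=2):
    """Return up to `count` (timestamp, tweet_id) pairs, newest first.

    `row` maps timestamp -> tweet id, like one Userline or Timeline row.
    `start` is the newest timestamp to include (None means "from the top").
    """
    stamps = sorted(row, reverse=True)  # columns in reversed order
    if start is not None:
        stamps = [ts for ts in stamps if ts <= start]
    return [(ts, row[ts]) for ts in stamps[:count]]


timeline = {100: "t1", 200: "t2", 300: "t3"}
first = get_page(timeline)
# Continue just below the oldest timestamp of the previous page.
second = get_page(timeline, start=first[-1][0] - 1)
```

The tweet ids returned by each page would then be used for a second lookup of the tweet bodies, which is the role of the `TWEET.multiget` call on the slide.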
  • 29. Requirements for a Social Network
      ๏ Interact with friends
      ๏ Get recommendations for new friends
      ๏ View the social context of a person, i.e. How do I know this person?
  • 31. “Schema” design
      ๏ Persons represented by Nodes
      ๏ Friendship represented by Relationships between Person Nodes
      ๏ Groups represented by Nodes
      ๏ Group membership represented by Relationship from Person Node to Group Node
      ๏ Index for Person Nodes for lookup by name
      ๏ Index for Group Nodes for lookup by name
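As a rough illustration of this design, the following sketch models nodes with properties, typed relationships, and a name index per kind of node. It is a toy in-memory structure, not the Neo4j API; the `Graph` class, its method names, and the particular relationships created at the end are assumptions made for the example.

```python
class Graph:
    """A tiny in-memory property graph with a name index per node kind."""

    def __init__(self):
        self.nodes = {}          # node id -> properties
        self.relationships = []  # (start id, relationship type, end id)
        self.index = {}          # (kind, name) -> node id

    def create_node(self, kind, name, **properties):
        node_id = len(self.nodes)
        self.nodes[node_id] = {"name": name, **properties}
        self.index[(kind, name)] = node_id  # lookup by name
        return node_id

    def relate(self, start, rel_type, end):
        self.relationships.append((start, rel_type, end))

    def lookup(self, kind, name):
        return self.index[(kind, name)]


g = Graph()
neo = g.create_node("persons", "Thomas Anderson", age=29)
morpheus = g.create_node("persons", "Morpheus", rank="Captain")
crew = g.create_node("groups", "Nebuchadnezzar crew")
g.relate(neo, "FRIENDSHIP", morpheus)
g.relate(morpheus, "MEMBERSHIP", crew)  # illustrative edge
```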
  • 32. A small social graph example
      [Figure: a social graph with two relationship types, FRIENDSHIP and MEMBERSHIP. Person nodes: Dozer, Tank, Morpheus, Trinity, Cypher, Thomas Anderson, Agent Smith, Agent Brown. Group nodes: Nebuchadnezzar crew, Agent taskforce. Two relationships carry a qualifier: Family and Lovers.]
  • 33. Creating the social graph
      GraphDatabaseService graphDb = new EmbeddedGraphDatabase( GRAPH_STORAGE_LOCATION );
      IndexService indexes = new LuceneIndexService( graphDb );
      Transaction tx = graphDb.beginTx();
      try {
          Node mrAnderson = graphDb.createNode();
          mrAnderson.setProperty( "name", "Thomas Anderson" );
          mrAnderson.setProperty( "age", 29 );
          // Index key "persons" matches the key used for lookups
          indexes.index( mrAnderson, "persons", "Thomas Anderson" );
          Node morpheus = graphDb.createNode();
          morpheus.setProperty( "name", "Morpheus" );
          morpheus.setProperty( "rank", "Captain" );
          indexes.index( morpheus, "persons", "Morpheus" );
          Relationship friendship = mrAnderson.createRelationshipTo(
              morpheus, SocialGraphTypes.FRIENDSHIP );
          tx.success();
      } finally {
          tx.finish();
      }
  • 34. Making new friends
          Node person1 = indexes.getSingle( "persons", person1Name );
          Node person2 = indexes.getSingle( "persons", person2Name );
          person1.createRelationshipTo( person2, SocialGraphTypes.FRIENDSHIP );
      Joining a group
          Node person = indexes.getSingle( "persons", personName );
          Node group = indexes.getSingle( "groups", groupName );
          person.createRelationshipTo( group, SocialGraphTypes.MEMBERSHIP );
  • 35. How do I know this person?
      Node me = ...
      Node you = ...
      PathFinder<Path> shortestPathFinder = GraphAlgoFactory.shortestPath(
          Traversals.expanderForTypes( SocialGraphTypes.FRIENDSHIP, Direction.BOTH ),
          /* maximum depth: */ 4 );
      Path shortestPath = shortestPathFinder.findSinglePath( me, you );
      for ( Node friend : shortestPath.nodes() ) {
          System.out.println( friend.getProperty( "name" ) );
      }
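The idea behind this query, finding the shortest chain of friendships between two people within a depth limit, can be sketched as a plain breadth-first search. This is not necessarily how `GraphAlgoFactory.shortestPath` is implemented; the function `shortest_path`, the adjacency-set representation, and the friendships listed below are all assumptions made for illustration.

```python
from collections import deque


def shortest_path(friends, start, goal, max_depth=4):
    """Breadth-first search over an undirected friendship graph.

    `friends` maps a name to the set of names it is connected to.
    Returns the list of names on one shortest path, or None if the
    goal is not reachable within `max_depth` relationships.
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        if len(path) - 1 >= max_depth:
            continue  # do not expand beyond the depth limit
        for neighbour in sorted(friends.get(path[-1], ())):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


# Hypothetical friendships; the edges are made up for this example.
friends = {
    "Thomas Anderson": {"Morpheus"},
    "Morpheus": {"Thomas Anderson", "Trinity"},
    "Trinity": {"Morpheus", "Cypher"},
    "Cypher": {"Trinity"},
}
path = shortest_path(friends, "Thomas Anderson", "Cypher")
```

The returned list reads as the answer to "how do I know this person?": each name knows the next one in the chain.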
  • 36. Recommend new friends
      Node person = ...
      TraversalDescription friendsOfFriends = Traversal.description()
          .expand( Traversals.expanderForTypes(
              SocialGraphTypes.FRIENDSHIP, Direction.BOTH ) )
          .prune( Traversal.pruneAfterDepth( 2 ) )
          .breadthFirst() // Visit my friends before their friends.
          // Visit a node at most once (don’t recommend direct friends)
          .uniqueness( Uniqueness.NODE_GLOBAL )
          .filter( new Predicate<Path>() {
              // Only return friends of friends
              public boolean accept( Path traversalPos ) {
                  return traversalPos.length() == 2;
              }
          } );
      for ( Node recommendation : friendsOfFriends.traverse( person ).nodes() ) {
          System.out.println( recommendation.getProperty( "name" ) );
      }
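The same friends-of-friends rule can be written out over a simple adjacency map: collect everyone exactly two FRIENDSHIP steps away, excluding the person and their direct friends. This is a sketch only; `recommend_friends` and the sample friendships are hypothetical and stand in for the traversal above.

```python
def recommend_friends(friends, person):
    """Names exactly two friendship hops away from `person`."""
    direct = friends.get(person, set())
    candidates = set()
    for friend in direct:
        candidates |= friends.get(friend, set())
    # Exclude the person and anyone who is already a direct friend.
    return sorted(candidates - direct - {person})


# Hypothetical friendships; the edges are made up for this example.
friends = {
    "Thomas Anderson": {"Morpheus"},
    "Morpheus": {"Thomas Anderson", "Trinity", "Tank"},
    "Trinity": {"Morpheus", "Cypher"},
    "Tank": {"Morpheus"},
    "Cypher": {"Trinity"},
}
recommended = recommend_friends(friends, "Thomas Anderson")
```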
  • 37. When to use Document DB (e.g. MongoDB)
      ๏ When data is collections of similar entities
         • But semi-structured (sparse) rather than tabular
         • When fields in entries have multiple values
  • 38. When to use ColumnFamily DB (e.g. Cassandra)
      ๏ When scalability is the main issue
         • Both scaling size and scaling load
            ‣ In particular scaling write load
      ๏ Linear scalability (as you add servers) both in read and write
      ๏ Low level - will require you to duplicate data to support queries
  • 39. When to use Graph DB (e.g. Neo4j)
      ๏ When deep traversals are important
      ๏ For complex and domains
      ๏ When how entities relate is an important aspect of the domain
  • 40. When not to use a NOSQL Database
      ๏ RDBMSes have been the de facto standard for years, and still have better tools for some tasks
         • Especially for reporting
      ๏ When maintaining a system that works already
      ๏ Sometimes when data is uniform / structured
      ๏ When aggregations over (subsets of) the entire dataset are key
      ๏ But please don’t use a Relational database for persisting objects
  • 41. Complex problem? - right tool for each job! Image credits: Unknown :’(
  • 42. Polyglot persistence
      ๏ Use multiple databases in the same system - use the right tool for each part of the system
      ๏ Examples:
         • Use an RDBMS for structured data and a Graph Database for modeling the relationships between entities
         • Use a Graph Database for the domain model and a Document Database for storing large data objects
  • 43. Neo Technology - the Graph Database company http://neotechnology.com