Squire: A polyglot application combining
   Neo4j, MongoDB, Ruby and Scala




FOSDEM 2013                Alberto Perdomo
Graph Processing DevRoom   @albertoperdomo
About me


Co-Founder of Aentos

Living my dream* (CNY)

Web developer
Features

★ Extensive movies & TV-shows data

★ Info & price on streaming & rent services:
iTunes, Amazon, Hulu, Netflix...

★ Social features (contacts, comments)

★ Ratings

★ Suggestions by users

★ System recommendations
Unconventional System Architecture*
Ingredients


✔ MongoDB
✔ Neo4j
✔ Ruby
✔ Scala
⚛ System Overview ⚛

iPad Client
Main API (Ruby), backed by MongoDB: all data
Recommendation Engine (Scala), backed by Neo4j: users, ratings, movies and TV shows
API: Ruby + MongoDB

⚛ REST+JSON
⚛ Accounts & sessions
⚛ Movies & TV-Shows
⚛ Contacts
⚛ Ratings
⚛ User suggestions & recommendations
Recommendations: Scala + Neo4j

⚛ Movies & TV-shows (IDs & relations)
⚛ Users
⚛ Ratings
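Communication (RabbitMQ)

1. iPad Client → Main API (Ruby): GET /suggestions
2. Main API → Recommendation Engine (Scala): query params in JSON (RabbitMQ + JSON)
3. Recommendation Engine → Neo4j (users, ratings, movies and TV shows): collect IDs of suggestions
4. Recommendation Engine → Main API: JSON array of IDs
5. Main API → MongoDB (all data): retrieve data for response
6. Main API → iPad Client: JSON array of data

MongoDB → Neo4j: graph reset*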
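
The request/reply round trip over the message queue can be sketched roughly as follows. This is a minimal illustration, not the original code: two in-memory `queue.Queue` objects stand in for RabbitMQ, and the dictionaries `GRAPH_SUGGESTIONS` and `MONGO_TITLES` are hypothetical stand-ins for Neo4j and MongoDB.

```python
import json
import queue
import threading

# Assumption: in-memory queues stand in for the RabbitMQ request/reply queues.
requests = queue.Queue()
replies = queue.Queue()

# Hypothetical data; the real IDs would come from Neo4j, the details from MongoDB.
GRAPH_SUGGESTIONS = {"user-1": ["m-42", "m-7"]}
MONGO_TITLES = {
    "m-42": {"id": "m-42", "title": "Movie A"},
    "m-7": {"id": "m-7", "title": "Show B"},
}


def recommendation_engine():
    """Consume one JSON request and reply with a JSON array of IDs."""
    params = json.loads(requests.get())
    ids = GRAPH_SUGGESTIONS.get(params["user_id"], [])
    replies.put(json.dumps(ids))


def get_suggestions(user_id):
    """Main API handler: send query params, wait for IDs, then look up the data."""
    requests.put(json.dumps({"user_id": user_id}))
    ids = json.loads(replies.get(timeout=5))  # the API server blocks here
    return [MONGO_TITLES[i] for i in ids]


worker = threading.Thread(target=recommendation_engine)
worker.start()
result = get_suggestions("user-1")
worker.join()
```

Note that the API handler sits idle in `replies.get` for as long as the graph query takes.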
Issues


☠ API server waiting for query in Neo4j

☠ Query MongoDB for additional data

☠ Transform data

☠ Ruby server is blocking
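Communication (DB)

1. iPad Client → Main API (Ruby): GET /suggestions
2. Main API → MongoDB (all data): retrieve data for query params
3. Main API → iPad Client: redirect to Recommendation Engine with hashed params
4. iPad Client → Recommendation Engine (Scala): follow redirect link
5. Recommendation Engine → Neo4j (users, ratings, movies and TV shows): collect IDs of suggestions
6. Recommendation Engine → MongoDB: retrieve data for response
7. Recommendation Engine → iPad Client: JSON array of suggested results

MongoDB → Neo4j: graph reset*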
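
One possible reading of "hashed params" is that the API appends a signature to the query string, so the Recommendation Engine can trust parameters that arrive via the client. The slides do not say how the hashing was done; the sketch below assumes an HMAC over the sorted parameters, and `SECRET` and `ENGINE_URL` are hypothetical.

```python
import hashlib
import hmac
from urllib.parse import parse_qsl, urlencode

SECRET = b"shared-secret"  # hypothetical key shared by API and engine
ENGINE_URL = "https://recs.example.com/suggestions"  # hypothetical engine URL


def sign(params):
    """Return a hex HMAC-SHA256 over the sorted, URL-encoded params."""
    payload = urlencode(sorted(params.items())).encode()
    return hmac.new(SECRET, payload, hashlib.sha256).hexdigest()


def build_redirect(params):
    """API side: build the Location URL the client is redirected to."""
    query = urlencode(sorted(params.items()))
    return f"{ENGINE_URL}?{query}&sig={sign(params)}"


def verify(query_string):
    """Engine side: recompute the signature and compare in constant time."""
    params = dict(parse_qsl(query_string))
    sig = params.pop("sig", "")
    return hmac.compare_digest(sig, sign(params))


location = build_redirect({"user_id": "user-1", "limit": "10"})
```

The API answers with the redirect immediately, so the slow graph query no longer ties up a Ruby worker.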
Improvements

✌ No more long blocking requests in Ruby server
✌ Move heavy lifting to Spray/Scala (async)
✌ Add cache layer
Additional Cache Layer


* Read-Through
* Stores transformed data
* Implemented using spray-caching
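
A read-through cache that stores already-transformed results can be sketched as below. The production system used spray-caching in Scala; this is only a language-neutral illustration in Python, and `ReadThroughCache` and `load_suggestions` are hypothetical names.

```python
class ReadThroughCache:
    """Serve from the cache; on a miss, call the loader and store its result."""

    def __init__(self, loader):
        self._loader = loader
        self._store = {}

    def get(self, key):
        if key not in self._store:
            # Miss: compute the transformed response once and keep it.
            self._store[key] = self._loader(key)
        return self._store[key]


calls = []


def load_suggestions(user_id):
    """Hypothetical slow path: query the graph, fetch details, transform."""
    calls.append(user_id)
    return [{"id": "m-42", "title": "Movie A"}]


cache = ReadThroughCache(load_suggestions)
first = cache.get("user-1")   # miss: runs the loader
second = cache.get("user-1")  # hit: served from the cache
```

Callers only ever talk to the cache, which is what distinguishes read-through from a cache the caller fills by hand.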
Total Response Time




40s → <1s
Delta Updates


* API: New ratings in queue collection (MongoDB)
* Recommendations: Backend checks for updates in queue and applies them in graph
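Delta Updates

Write path:
1. iPad Client → Main API (Ruby): POST /rating
2. Main API → MongoDB: store rating + put rating in queue

Read path:
1. iPad Client → Recommendation Engine (Scala): GET /suggestions
2. Recommendation Engine → MongoDB: check if updates in queue
3. Recommendation Engine → Neo4j (users, ratings, movies and TV shows): apply updates if pending
4. Recommendation Engine → Neo4j: collect IDs of suggestions
5. Recommendation Engine → MongoDB: retrieve data for response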
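
The delta-update flow can be sketched as follows. All storage here is an in-memory stand-in (a `deque` for the MongoDB queue collection, a dictionary for the Neo4j graph), and the function names are hypothetical.

```python
from collections import deque

rating_queue = deque()  # stand-in for the MongoDB queue collection
ratings_store = []      # stand-in for the MongoDB ratings collection
graph = {}              # stand-in for Neo4j: user -> {item: score}


def post_rating(user, item, score):
    """API side: store the rating and enqueue it for the graph."""
    rating = {"user": user, "item": item, "score": score}
    ratings_store.append(rating)
    rating_queue.append(rating)


def apply_pending_updates():
    """Engine side: drain the queue into the graph; return how many applied."""
    applied = 0
    while rating_queue:
        r = rating_queue.popleft()
        graph.setdefault(r["user"], {})[r["item"]] = r["score"]
        applied += 1
    return applied


def rated_items(user):
    """Engine read path: apply pending updates first, then read the graph."""
    apply_pending_updates()
    return dict(graph.get(user, {}))
```

The graph is only brought up to date when a read arrives, so writes on the API side stay cheap.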
(Re)loading data in graph

1. Create new Neo4j instance

2. Import data from MongoDB

3. Hot swap

★ Allows for massive updates in the primary database

★ Used only for content-related data and user
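
The three steps above (build a new instance, import, hot swap) can be sketched as below. This is an illustration under assumptions: a plain dictionary stands in for a Neo4j instance, and `GraphHandle` and `build_graph` are hypothetical names.

```python
import threading


class GraphHandle:
    """Holds the graph that queries use; swap() replaces it atomically."""

    def __init__(self, graph):
        self._graph = graph
        self._lock = threading.Lock()

    def current(self):
        with self._lock:
            return self._graph

    def swap(self, new_graph):
        with self._lock:
            old, self._graph = self._graph, new_graph
        return old  # the caller can shut the old instance down


def build_graph(documents):
    """Steps 1 and 2: create a fresh graph and import the documents."""
    graph = {}
    for doc in documents:
        graph.setdefault(doc["user"], set()).add(doc["item"])
    return graph


handle = GraphHandle(build_graph([{"user": "u1", "item": "m-1"}]))
fresh = build_graph([
    {"user": "u1", "item": "m-1"},
    {"user": "u1", "item": "m-2"},
])
old = handle.swap(fresh)  # step 3: hot swap
```

Queries keep reading the old graph until the new one is fully built, so the rebuild itself never blocks reads.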
Graph DB as an index for relations*
Powerful queries on relations


⁂ like full text search for strings
Design decisions: Graph

✍ Copy subsets of data in Neo4j
✍ Rebuild graph with no downtime*

✍ Massive updates blazingly fast
✍ Less critical pieces
The right tools, for us! ✌



API in Ruby & MongoDB
Maintainability, flexibility, libraries,
schema-free!, development speed
The right tools, for us! ✌


Recommendations
Scala: Performance + good interface for Neo4j
Neo4j: Graph problem
Thanks! ☃


      Alberto Perdomo
      @albertoperdomo
❤ Credits ❤
Entering Hyperspace by Éole Wind
