These slides discuss using Neo4j, a graph database, for recommendations. They describe modeling data as graphs in Neo4j and developing recommendation algorithms as plugins for it, covering document similarity, movie recommendation, and restricting recommendations to a subgraph. They also present TeleVido.tv, an example application that provides media content recommendations using Neo4j.
1. Neo4j - Graph database for recommendations
Jakub Kříž, Ondrej Proksa
30.5.2013
2. Summary
Graph databases
Working with Neo4j and Ruby (On Rails)
Plugins and algorithms – live demos
Document similarity
Movie recommendation
Recommendation from subgraph
TeleVido.tv
3. Why Graphs?
Graphs are everywhere!
Natural way to model almost everything
“Whiteboard friendly”
Even the internet is a graph
4. Why Graph Databases?
Relational databases are not so great for storing graph structures
Unnatural m:n relations
Expensive joins
Expensive look ups during graph traversals
Graph databases fix this
Efficient storage
Direct pointers = no joins
5. Neo4j
The World's Leading Graph Database
www.neo4j.org
NOSQL database
Open source - github.com/neo4j
ACID
Brief history
Official v1.0 – 2010
Current version 1.9
2.0 coming soon
6. Querying Neo4j
Querying languages
Structurally similar to SQL
Based on graph traversal
Most often used
Gremlin – generic graph querying language
Cypher – graph querying language for Neo4j
SPARQL – generic querying language for data in RDF format
8. Cypher Example (2)
Friend of a friend
START n=node(0)
MATCH (n)--()--(f)
RETURN f
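To make the traversal behind this query concrete, here is a minimal plain-Ruby sketch of a friend-of-a-friend lookup over adjacency lists. It does not use Neo4j at all: the toy graph, the node ids and the `friends_of_friends` helper are assumptions made for illustration. Unlike the Cypher query, which may return the same node once per matching path, the sketch returns each node only once.

```ruby
require 'set'

# Undirected toy graph as adjacency lists; node ids are hypothetical.
GRAPH = {
  0 => [1, 2],
  1 => [0, 3],
  2 => [0, 3, 4],
  3 => [1, 2],
  4 => [2]
}.freeze

# Returns the distinct nodes reachable from `start` in exactly two hops,
# skipping the hop that would lead straight back to `start`.
def friends_of_friends(graph, start)
  result = Set.new
  graph.fetch(start, []).each do |friend|
    graph.fetch(friend, []).each do |fof|
      result << fof unless fof == start
    end
  end
  result.to_a.sort
end

p friends_of_friends(GRAPH, 0) # prints [3, 4]
```

Each hop is a direct lookup of a node's neighbours, which is the "direct pointers = no joins" idea from the earlier slide.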
9. Working with Neo4j
REST API => wrappers
Neography for Ruby
py2neo for Python
…
Your own wrapper
Java API
Direct access in JVM based applications
neo4j.rb
10. Neography – API wrapper example
require 'neography'

# create nodes with properties
n1 = Neography::Node.create("age" => 31, "name" => "Max")
n2 = Neography::Node.create("age" => 33, "name" => "Roel")
# set another property on an existing node
n1.weight = 190
# create a relationship explicitly
new_rel = Neography::Relationship.create(:coding_buddies, n1, n2)
# or create the same kind of relationship with the traversal shorthand
n1.outgoing(:coding_buddies) << n2
# get nodes related by outgoing friends relationships
n1.outgoing(:friends)
# get n1 and nodes related by friends and friends of friends
n1.outgoing(:friends).depth(2).include_start_node
11. Neo4j.rb – JRuby gem example
class Person < Neo4j::Rails::Model
property :name
property :age, :index => :exact # :fulltext
has_n(:friends).to(Person).relationship(Friend)
end
class Friend < Neo4j::Rails::Relationship
property :as
end
mike = Person.new(:name => 'Mike', :age => 24)
john = Person.new(:name => 'John', :age => 27)
mike.friends << john
mike.save
12. Our Approach
Relational databases are not so bad
Good for basic data storage
Widely used for web applications
Well supported in Rails via ActiveRecord
Performance issues with Neo4j
However, we need a graph database
We model the domain as a graph
Our recommendation is based on graph traversal
13. Our Approach (2)
Hybrid model using both MySQL and Neo4j
MySQL contains basic information about entities
Neo4j contains only relationships
Paired via identifiers (neo4j_id)
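The pairing can be illustrated with a small plain-Ruby sketch in which in-memory structures stand in for both stores. Everything except the `neo4j_id` column name is an assumption: the `Movie` struct, the sample rows and the `related_movies` helper are hypothetical, and a real application would query MySQL and Neo4j instead of constants.

```ruby
# Stand-in for a MySQL row: basic attributes plus the paired neo4j_id.
Movie = Struct.new(:id, :title, :year, :neo4j_id)

# Stand-in for the MySQL table, with an index on neo4j_id for the reverse lookup.
MOVIES = [
  Movie.new(1, 'Movie A', 1999, 101),
  Movie.new(2, 'Movie B', 2003, 102),
  Movie.new(3, 'Movie C', 2010, 103)
].freeze
MOVIES_BY_NEO4J_ID = MOVIES.map { |movie| [movie.neo4j_id, movie] }.to_h.freeze

# Stand-in for Neo4j: only relationships between node ids, no attributes.
RELATED = { 101 => [102, 103], 102 => [101], 103 => [101] }.freeze

# Asks the "graph" for neighbouring node ids, then loads the full records
# from the "relational" side through the shared identifier.
def related_movies(movie)
  neighbour_ids = RELATED.fetch(movie.neo4j_id, [])
  neighbour_ids.map { |node_id| MOVIES_BY_NEO4J_ID.fetch(node_id) }
end

p related_movies(MOVIES.first).map(&:title) # prints ["Movie B", "Movie C"]
```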
14. Our Approach (3)
Recommendation algorithms
Made as plugins to Neo4j
Written in Java
Embedded into Neo4j API
Rails application uses custom made wrapper
Creates and modifies nodes and relationships via API calls
Handles recommendation requests
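As a rough idea of what such a wrapper call might look like, the sketch below builds (but does not send) an HTTP request for a recommendation. It assumes the plugin is exposed as a Neo4j server plugin under the `ext/` path of the REST API; the plugin name `Recommender`, the method name `recommend` and the `initial` and `limit` parameters are hypothetical, as the slides do not show the actual interface.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Builds a POST request for a graph-database-level server plugin.
# "Recommender", "recommend", "initial" and "limit" are hypothetical names.
def build_recommendation_request(base_url, neo4j_ids, limit)
  uri = URI.join(base_url, 'ext/Recommender/graphdb/recommend')
  request = Net::HTTP::Post.new(uri.request_uri)
  request['Content-Type'] = 'application/json'
  request['Accept'] = 'application/json'
  request.body = JSON.generate('initial' => neo4j_ids, 'limit' => limit)
  [uri, request]
end

uri, request = build_recommendation_request('http://localhost:7474/db/data/', [101, 102], 10)
puts "#{request.method} #{uri}"
puts request.body
```

With a running server, the request could then be sent with `Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }` and the JSON response mapped back to application records.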
15. Graph Algorithms
Built-in algorithms
Shortest path
All shortest paths
Dijkstra’s algorithm
Custom algorithms
Depth first search
Breadth first search
Spreading activation
Flows, pairing, etc.
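For readers unfamiliar with these algorithms, the following plain-Ruby sketch shows how breadth first search yields a shortest path in an unweighted graph. It is an illustration only, not Neo4j's built-in implementation; the `shortest_path` helper and the toy graph are assumptions.

```ruby
# Returns the shortest path (as a list of nodes) between source and target
# in an unweighted graph, or nil when no path exists.
def shortest_path(graph, source, target)
  previous = { source => nil }
  queue = [source]
  until queue.empty?
    node = queue.shift
    break if node == target

    graph.fetch(node, []).each do |neighbour|
      next if previous.key?(neighbour)

      previous[neighbour] = node
      queue << neighbour
    end
  end
  return nil unless previous.key?(target)

  # Walk the predecessor links back from the target to the source.
  path = []
  node = target
  until node.nil?
    path.unshift(node)
    node = previous[node]
  end
  path
end

graph = {
  a: [:b, :c],
  b: [:a, :d],
  c: [:a, :d],
  d: [:b, :c, :e],
  e: [:d],
  z: []
}
p shortest_path(graph, :a, :e) # prints [:a, :b, :d, :e]
```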
16. Document Similarity
Task: find similarities between documents
Documents data model:
Each document is made of sentences
Each sentence can be divided into n-grams
N-grams are connected with relationships
Example sentence: "Neo4j is graph database in Java"
Resulting n-grams: (Neo4j, graph) – (graph, database) – (database, Java)
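The n-gram step can be sketched in a few lines of Ruby. The example suggests that short function words such as "is" and "in" are dropped before bigrams are formed, so the sketch assumes a small stop-word list; that list and the `ngrams` helper are hypothetical and not taken from the slides.

```ruby
# Hypothetical stop-word list; the slides do not say which words are dropped.
STOP_WORDS = %w[is in a an the of].freeze

# Splits a sentence into words, removes stop words and returns
# every run of n consecutive remaining words.
def ngrams(sentence, n = 2)
  words = sentence.split.reject { |word| STOP_WORDS.include?(word.downcase) }
  words.each_cons(n).to_a
end

p ngrams('Neo4j is graph database in Java')
# prints [["Neo4j", "graph"], ["graph", "database"], ["database", "Java"]]
```

In the graph model each of these n-grams would become a node, with relationships linking consecutive n-grams.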
18. Document Similarity (3)
Detecting similar documents in our graph model
Shortest path between documents
Number of paths shorter than some distance
Weighting relationships
How about a custom plugin?
Spreading activation
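One of the measures listed here, the number of short paths between two documents, can be sketched as follows. The sketch simplifies the data model by linking each document directly to its n-gram nodes (skipping sentences), so a path of length two between documents corresponds to one shared n-gram. The sample data and helper names are assumptions.

```ruby
require 'set'

# Each document points to the set of n-gram nodes it contains (toy data).
DOC_NGRAMS = {
  doc1: Set[%w[neo4j graph], %w[graph database], %w[database java]],
  doc2: Set[%w[graph database], %w[database java], %w[java plugin]],
  doc3: Set[%w[movie recommendation]]
}.freeze

# Number of document -> n-gram -> document paths between two documents,
# which equals the number of n-gram nodes they share.
def shared_paths(doc_ngrams, first, second)
  (doc_ngrams.fetch(first) & doc_ngrams.fetch(second)).size
end

# Ranks all other documents by the number of such two-step paths.
def most_similar(doc_ngrams, doc)
  others = doc_ngrams.keys - [doc]
  others.map { |other| [other, shared_paths(doc_ngrams, doc, other)] }
        .sort_by { |other, count| [-count, other.to_s] }
end

p most_similar(DOC_NGRAMS, :doc1) # prints [[:doc2, 2], [:doc3, 0]]
```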
20. Movie Recommendation
Task: recommend movies based on what we like
We like some entities, let’s call them initial
Movies
People (actors, directors etc.)
Genres
From this input, we want recommended nodes
Find nodes which are
The closest to initial nodes
The most relevant to initial nodes
21. Movie Recommendation (2)
165k nodes
Movies
People
Genre
870k relationships
Movies – People
Movies – Genres
Easy to add more entities
Tags, mood, period, etc.
Will it be fast? We need 1-2 seconds
33. Movie Recommendation – User Model
Spreading energy
Each initial node gets different starting energy
Based on user’s interests and feedback
Improves the recommendation!
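A minimal plain-Ruby sketch of spreading activation with per-node starting energy is shown below. The slides do not give the exact rules, so the sketch assumes that in every step each node passes on a fixed fraction (`decay`) of the energy it just received, split equally among its neighbours, and that nodes are ranked by total accumulated energy. The toy graph and all names are hypothetical.

```ruby
# Spreads energy from the initial nodes through an undirected graph.
# Assumptions (not from the slides): each node forwards `decay` times the
# energy it received in the previous step, split equally among neighbours.
def spread_activation(graph, initial_energy, steps: 2, decay: 0.5)
  total = Hash.new(0.0)
  frontier = initial_energy.transform_values(&:to_f)
  frontier.each { |node, energy| total[node] += energy }

  steps.times do
    next_frontier = Hash.new(0.0)
    frontier.each do |node, energy|
      neighbours = graph.fetch(node, [])
      next if neighbours.empty?

      share = energy * decay / neighbours.size
      neighbours.each { |neighbour| next_frontier[neighbour] += share }
    end
    next_frontier.each { |node, energy| total[node] += energy }
    frontier = next_frontier
  end
  total
end

# Returns the highest-energy nodes that were not part of the input.
def recommend(graph, initial_energy, limit: 3)
  spread_activation(graph, initial_energy)
    .reject { |node, _energy| initial_energy.key?(node) }
    .sort_by { |node, energy| [-energy, node.to_s] }
    .first(limit)
    .map(&:first)
end

GRAPH = {
  'Movie A' => ['Actor X', 'Genre G'],
  'Movie B' => ['Actor X', 'Genre G'],
  'Movie C' => ['Genre G'],
  'Movie D' => ['Actor Y'],
  'Actor X' => ['Movie A', 'Movie B'],
  'Actor Y' => ['Movie D'],
  'Genre G' => ['Movie A', 'Movie B', 'Movie C']
}.freeze

# The user likes Movie A strongly and Genre G somewhat less.
interests = { 'Movie A' => 1.0, 'Genre G' => 0.5 }
p recommend(GRAPH, interests) # prints ["Actor X", "Movie B", "Movie C"]
```

Changing the starting energies shifts the ranking, which is how user interests and feedback could influence the result.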
34. Recommendation from subgraph
Recommend movies which are currently in cinemas
Recommend movies which are currently on TV
How?
The algorithm traverses the whole graph as usual
It builds a subgraph and returns nodes only from it
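The idea can be sketched in plain Ruby as follows: the traversal runs over the full graph, so connections through movies outside the subgraph still count, but only nodes in the allowed set are returned. For brevity the sketch ranks by breadth-first distance rather than by spreading activation; the graph, the `in_cinemas` set and the helper name are assumptions.

```ruby
require 'set'

# Breadth-first traversal over the WHOLE graph; returns only nodes that
# belong to the allowed subgraph, closest first.
def recommend_from_subgraph(graph, start, allowed, limit: 2)
  distance = { start => 0 }
  queue = [start]
  until queue.empty?
    node = queue.shift
    graph.fetch(node, []).each do |neighbour|
      next if distance.key?(neighbour)

      distance[neighbour] = distance[node] + 1
      queue << neighbour
    end
  end
  distance.select { |node, _dist| allowed.include?(node) && node != start }
          .sort_by { |node, dist| [dist, node.to_s] }
          .first(limit)
          .map(&:first)
end

graph = {
  'Movie A' => ['Actor X'],
  'Actor X' => ['Movie A', 'Movie B', 'Movie C'],
  'Movie B' => ['Actor X', 'Genre G'],
  'Movie C' => ['Actor X'],
  'Genre G' => ['Movie B', 'Movie D'],
  'Movie D' => ['Genre G']
}
in_cinemas = Set['Movie C', 'Movie D']
p recommend_from_subgraph(graph, 'Movie A', in_cinemas) # prints ["Movie C", "Movie D"]
```

Movie D is reachable only through Movie B, which is not in cinemas, so restricting the traversal itself to the subgraph would have missed it.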
36. TeleVido.tv
Media content recommendation using Neo4j
Movie recommendation
Recommendation of movies in cinemas
Recommendation of TV programs and schedules
37. Summary
Graph databases
Working with Neo4j and Ruby (On Rails)
Plugins and algorithms
Document similarity
Movie recommendation
Recommendation from subgraph
TeleVido.tv