Neo4j - graph database for recommendations

Jakub Kříž, Ondrej Proksa
30.5.2013

Summary
 Graph databases
 Working with Neo4j and Ruby (On Rails)
 Plugins and algorithms – live demos
 Document similarity
 Movie recommendation
 Recommendation from subgraph
 TeleVido.tv

Why Graphs?
 Graphs are everywhere!
 Natural way to model almost everything
 “Whiteboard friendly”
 Even the internet is a graph

Why Graph Databases?
 Relational databases are not so great for storing graph structures
 Unnatural m:n relations
 Expensive joins
 Expensive look ups during graph traversals
 Graph databases fix this
 Efficient storage
 Direct pointers = no joins
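
A minimal sketch of the "direct pointers = no joins" idea, using plain Ruby data structures rather than a real database. The method names and sample data are hypothetical, chosen only for illustration: the join-table lookup has to search rows on every hop, while the graph-style node simply follows references it already holds.

```ruby
# Relational style: an m:n join table of [person_id, friend_id] rows.
def friends_via_join(join_rows, person_id)
  # Every lookup has to search the table (or an index over it).
  join_rows.select { |from, _to| from == person_id }.map { |_from, to| to }
end

# Graph style: each node record keeps direct references to its neighbours.
def friends_via_pointers(node)
  node[:neighbours].map { |neighbour| neighbour[:id] }
end

join_rows = [[1, 2], [1, 3], [2, 3]]

carol = { id: 3, neighbours: [] }
bob   = { id: 2, neighbours: [carol] }
alice = { id: 1, neighbours: [bob, carol] }

# Both calls return the ids of Alice's friends.
p friends_via_join(join_rows, 1)
p friends_via_pointers(alice)
```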

Neo4j
 The World's Leading Graph Database
 www.neo4j.org
 NoSQL database
 Open source - github.com/neo4j
 ACID
 Brief history
 Official v1.0 – 2010
 Current version 1.9
 2.0 coming soon

Querying Neo4j
 Query languages
 Structurally similar to SQL
 Based on graph traversal
 Most often used
 Gremlin – generic graph query language
 Cypher – graph query language for Neo4j
 SPARQL – generic query language for data in RDF format

Cypher Example
CREATE (n {name: {value}})
CREATE (n)-[r:KNOWS]->(m)
START
[MATCH]
[WHERE]
RETURN [ORDER BY] [SKIP] [LIMIT]

Cypher Example (2)
 Friend of a friend
START n=node(0)
MATCH (n)--()--(f)
RETURN f
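
To make the traversal concrete, here is a rough in-memory equivalent of the friend-of-a-friend query in plain Ruby. It is only a sketch: the adjacency hash and the method name friends_of_friends are assumptions, and unlike Cypher (which returns one row per matching path) it returns each node once.

```ruby
require 'set'

# Undirected graph as an adjacency hash: node id => array of neighbour ids.
def friends_of_friends(graph, start)
  result = Set.new
  graph.fetch(start, []).each do |friend|
    graph.fetch(friend, []).each do |fof|
      # Skip the path that walks straight back to the start node.
      result << fof unless fof == start
    end
  end
  result.to_a.sort
end

graph = {
  0 => [1, 2],
  1 => [0, 3],
  2 => [0, 3, 4],
  3 => [1, 2],
  4 => [2]
}

p friends_of_friends(graph, 0)
```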

Working with Neo4j
 REST API => wrappers
 Neography for Ruby
 py2neo for Python
 …
 Your own wrapper
 Java API
 Direct access in JVM based applications
 neo4j.rb

Neography – API wrapper example
# create nodes and properties
n1 = Neography::Node.create("age" => 31, "name" => "Max")
n2 = Neography::Node.create("age" => 33, "name" => "Roel")
n1.weight = 190
# create relationships
new_rel = Neography::Relationship.create(:coding_buddies, n1, n2)
n1.outgoing(:coding_buddies) << n2
# get nodes related by outgoing friends relationship
n1.outgoing(:friends)
# get n1 and nodes related by friends and friends of friends
n1.outgoing(:friends).depth(2).include_start_node

Neo4j.rb – JRuby gem example
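# Relationship model with its own property; defined first so that
# Person can reference it when everything lives in a single file
class Friend < Neo4j::Rails::Relationship
  property :as
end

# Node model with an indexed property and a typed relationship
class Person < Neo4j::Rails::Model
  property :name
  property :age, :index => :exact # or :fulltext
  has_n(:friends).to(Person).relationship(Friend)
end

mike = Person.new(:name => 'Mike', :age => 24)
john = Person.new(:name => 'John', :age => 27)
mike.friends << john
mike.save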

Our Approach
 Relational databases are not so bad
 Good for basic data storage
 Widely used for web applications
 Well supported in Rails via ActiveRecord
 Performance issues with Neo4j
 However, we need a graph database
 We model the domain as a graph
 Our recommendation is based on graph traversal

Our Approach (2)
 Hybrid model using both MySQL and Neo4j
 MySQL contains basic information about entities
 Neo4j contains only relationships
 Paired via identifiers (neo4j_id)
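
The pairing can be sketched in plain Ruby, with an array of hashes standing in for MySQL rows and an adjacency hash standing in for Neo4j. Everything except the neo4j_id column name is a hypothetical placeholder (table contents, ids, and the rows_for_nodes helper).

```ruby
# Hypothetical stand-in for MySQL rows holding the descriptive data.
movies_table = [
  { id: 1, title: 'Movie A', neo4j_id: 101 },
  { id: 2, title: 'Movie B', neo4j_id: 102 },
  { id: 3, title: 'Movie C', neo4j_id: 103 }
]

# Hypothetical stand-in for Neo4j: only relationships between node ids.
relationships = { 101 => [102, 103], 102 => [101], 103 => [101] }

# Resolve node ids returned by the graph back to relational rows.
def rows_for_nodes(table, node_ids)
  by_node = table.each_with_object({}) { |row, index| index[row[:neo4j_id]] = row }
  node_ids.map { |node_id| by_node[node_id] }.compact
end

related = rows_for_nodes(movies_table, relationships[101])
puts related.map { |row| row[:title] }.inspect
```

The graph side never needs titles or other attributes; it only hands back ids, which keeps the Neo4j store small.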

Our Approach (3)
 Recommendation algorithms
 Made as plugins to Neo4j
 Written in Java
 Embedded into Neo4j API
 Rails application uses custom made wrapper
 Creates and modifies nodes and relationships via API calls
 Handles recommendation requests
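
The custom wrapper itself is not shown in the slides. As a rough idea of what one API call could look like, the sketch below builds (but does not send) a request for the Cypher endpoint of the Neo4j 1.x REST API using Ruby's standard library. The host and port (localhost:7474) and the helper name cypher_request are assumptions; a plugin call would target a different URL.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Build a POST request for the REST Cypher endpoint without sending it.
def cypher_request(base_url, query, params = {})
  uri = URI.join(base_url, '/db/data/cypher')
  request = Net::HTTP::Post.new(uri.request_uri)
  request['Content-Type'] = 'application/json'
  request['Accept'] = 'application/json'
  request.body = JSON.generate('query' => query, 'params' => params)
  [uri, request]
end

uri, request = cypher_request(
  'http://localhost:7474', # assumed default address of a local server
  'START n=node({id}) MATCH (n)--()--(f) RETURN f',
  'id' => 0
)

# Sending it requires a running server:
# response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
puts request.path
puts request.body
```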

Graph Algorithms
 Built-in algorithms
 Shortest path
 All shortest paths
 Dijkstra’s algorithm
 Custom algorithms
 Depth first search
 Breadth first search
 Spreading activation
 Flows, pairing, etc.
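
As an illustration of the traversal style these algorithms share, here is a small breadth-first shortest-path sketch in plain Ruby. It works on an in-memory adjacency hash, not on Neo4j's built-in path finders, and the method name is hypothetical.

```ruby
# Breadth-first search returning one shortest path (fewest hops), or nil.
def bfs_shortest_path(graph, source, target)
  previous = { source => nil }
  queue = [source]
  until queue.empty?
    node = queue.shift
    if node == target
      # Walk the predecessor links back to the source.
      path = []
      while node
        path.unshift(node)
        node = previous[node]
      end
      return path
    end
    graph.fetch(node, []).each do |neighbour|
      next if previous.key?(neighbour)
      previous[neighbour] = node
      queue << neighbour
    end
  end
  nil
end

graph = { a: [:b, :c], b: [:a, :d], c: [:a, :d], d: [:b, :c, :e], e: [:d] }
p bfs_shortest_path(graph, :a, :e)
```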

Document Similarity
 Task: find similarities between documents
 Documents data model:
 Each document is made of sentences
 Each sentence can be divided into n-grams
 N-grams are connected with relationships
 Example: “Neo4j is a graph database in Java”
 (Neo4j, graph) – (graph, database) – (database, Java)
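
The n-gram step can be sketched in a few lines of Ruby. The stop-word list is an assumption (the slides do not say which words are dropped), chosen so that the example sentence yields the bigrams shown above.

```ruby
# Split a sentence into bigrams of content words.
# The default stop-word list is hypothetical.
def bigrams(sentence, stop_words = %w[is a in the of])
  words = sentence.scan(/[[:alnum:]]+/)
  content = words.reject { |word| stop_words.include?(word.downcase) }
  content.each_cons(2).to_a
end

p bigrams('Neo4j is a graph database in Java')
# => [["Neo4j", "graph"], ["graph", "database"], ["database", "Java"]]
```

Each bigram would become a node, and consecutive bigrams would be linked by a relationship.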

Document Similarity (3)
 Detecting similar documents in our graph model
 Shortest path between documents
 Number of paths shorter than some distance
 Weighing relationships
 How about a custom plugin?
 Spreading activation
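
One of the measures listed for this slide, the number of paths shorter than some distance, can be sketched as a bounded depth-first count. The toy graph (documents linked directly to the n-grams they contain) and the method name are hypothetical simplifications of the data model.

```ruby
# Count simple paths from source to target that use at most max_hops edges.
def count_short_paths(graph, source, target, max_hops, visited = [source])
  return 1 if source == target
  return 0 if max_hops.zero?

  graph.fetch(source, []).sum do |neighbour|
    next 0 if visited.include?(neighbour)
    count_short_paths(graph, neighbour, target, max_hops - 1, visited + [neighbour])
  end
end

# Hypothetical toy graph: documents link to the n-grams they contain.
graph = {
  'doc1' => %w[ng_a ng_b],
  'doc2' => %w[ng_a ng_b],
  'doc3' => %w[ng_c],
  'ng_a' => %w[doc1 doc2],
  'ng_b' => %w[doc1 doc2],
  'ng_c' => %w[doc3]
}

p count_short_paths(graph, 'doc1', 'doc2', 2) # two shared n-grams
p count_short_paths(graph, 'doc1', 'doc3', 2) # nothing in common
```

More short paths between two documents suggest more shared content, so the count can serve as a simple similarity score.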

Document Similarity (4)
Live Demo…

Movie Recommendation
 Task: recommend movies based on what we like
 We like some entities, let’s call them initial
 Movies
 People (actors, directors etc.)
 Genres
 We want to recommend nodes based on the initial ones
 Find nodes which are
 The closest to initial nodes
 The most relevant to initial nodes

Movie Recommendation (2)
 165k nodes
 Movies
 People
 Genres
 870k relationships
 Movies – People
 Movies – Genres
 Easy to add more entities
 Tags, mood, period, etc.
 Will it be fast? We need results within 1-2 seconds

Recommendation Algorithms
 Breadth first search
 Union Colors
 Mixing Colors
 Modified Dijkstra
 Weighted relationships between entities
 Spreading activation (energy)
 Each initial node gets same starting energy
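
The slides do not describe how Dijkstra was modified. As a baseline only, the sketch below runs standard Dijkstra from several initial nodes at once over weighted relationships, so that nodes cheapest to reach from anything the user likes rank first. The weights, node names, and method name are hypothetical, and a simple linear scan replaces a priority queue to keep the example short.

```ruby
# Dijkstra from several initial nodes at once; returns node => cheapest cost.
def multi_source_dijkstra(graph, initial_nodes)
  cost = Hash.new(Float::INFINITY)
  initial_nodes.each { |node| cost[node] = 0.0 }
  pending = initial_nodes.dup
  done = {}

  until pending.empty?
    # Pick the cheapest pending node (linear scan instead of a heap).
    node = pending.min_by { |candidate| cost[candidate] }
    pending.delete(node)
    next if done[node]
    done[node] = true

    graph.fetch(node, {}).each do |neighbour, weight|
      new_cost = cost[node] + weight
      next unless new_cost < cost[neighbour]
      cost[neighbour] = new_cost
      pending << neighbour unless pending.include?(neighbour)
    end
  end
  cost
end

# Hypothetical weights: lower means a stronger relationship.
graph = {
  'Movie A' => { 'Actor X' => 1.0, 'Genre G' => 2.0 },
  'Actor X' => { 'Movie A' => 1.0, 'Movie B' => 1.0 },
  'Genre G' => { 'Movie A' => 2.0, 'Movie C' => 2.0 },
  'Movie B' => { 'Actor X' => 1.0 },
  'Movie C' => { 'Genre G' => 2.0 }
}

cost = multi_source_dijkstra(graph, ['Movie A'])
p cost.sort_by { |_node, value| value }
```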

Spreading Activation (Energy)
[Figure: step-by-step animation of spreading activation. Initial nodes start with energy 100.0 and pass it on to neighboring nodes, where the received energy accumulates.]
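
A minimal sketch of spreading activation in plain Ruby, under stated assumptions: each active node passes a fixed fraction (decay) of its energy on, split equally among its neighbours, for a fixed number of rounds. The decay value, round count, graph, and method name are hypothetical and are not taken from the authors' plugin.

```ruby
# Spread energy for a fixed number of rounds; returns node => received energy.
# decay and rounds are hypothetical tuning parameters.
def spread_activation(graph, initial_energy, decay: 0.5, rounds: 2)
  received = Hash.new(0.0)
  frontier = initial_energy.dup

  rounds.times do
    next_frontier = Hash.new(0.0)
    frontier.each do |node, energy|
      neighbours = graph.fetch(node, [])
      next if neighbours.empty?
      # Each neighbour gets an equal share of the decayed energy.
      share = energy * decay / neighbours.size
      neighbours.each do |neighbour|
        received[neighbour] += share
        next_frontier[neighbour] += share
      end
    end
    frontier = next_frontier
  end
  received
end

graph = {
  'Movie A' => ['Actor X', 'Genre G'],
  'Movie B' => ['Actor X'],
  'Movie C' => ['Actor X', 'Genre G'],
  'Actor X' => ['Movie A', 'Movie B', 'Movie C'],
  'Genre G' => ['Movie A', 'Movie C']
}

scores = spread_activation(graph, { 'Movie A' => 100.0, 'Movie B' => 100.0 })
p scores.sort_by { |_node, energy| -energy }
```

Nodes that collect energy from several initial nodes end up with higher scores, which is what makes them good recommendation candidates.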

Recommendation - Evaluation
 Experimental evaluation
 Which algorithm is the best? (rating on a scale of 1-5)
 30 users / 168 scenarios
[Chart: rating per algorithm (axis from 0 to 3.5) for Union Colors, Mixing Colors, Spreading Activation (Energy), and Dijkstra; bar values not recoverable]

Movie Recommendation (4)
Live Demo…

Movie Recommendation – User Model
 Spreading energy
 Each initial node gets different starting energy
 Based on user’s interests and feedback
 Improves the recommendation!
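
How feedback maps to starting energy is not specified in the slides. One simple possibility, shown purely as an assumption, is to scale a base energy by the user's rating of each initial node.

```ruby
# Turn user ratings into per-node starting energy.
# base_energy and the 1-5 rating scale are assumptions for illustration.
def starting_energy(ratings, base_energy: 100.0, max_rating: 5)
  ratings.each_with_object({}) do |(node, rating), energy|
    energy[node] = base_energy * rating / max_rating
  end
end

p starting_energy({ 'Movie A' => 5, 'Genre G' => 2 })
```

The resulting hash can replace the uniform starting energies, so strongly liked nodes contribute more to the final scores.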

Recommendation from subgraph
 Recommend movies which are currently in cinemas
 Recommend movies which are currently on TV
 How?
 Algorithm will traverse normally
 Creates a subgraph from which it returns nodes
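
A sketch of the idea in plain Ruby: the traversal walks the whole graph as usual, but only nodes belonging to an allowed set (for example, movies currently in cinemas) are returned. The breadth-first traversal, the node names, and the method name are assumptions; the authors' plugin may build the subgraph differently.

```ruby
require 'set'

# Traverse the full graph breadth-first, but collect only nodes in `allowed`.
def recommend_from_subgraph(graph, start, allowed, limit)
  visited = Set[start]
  queue = [start]
  results = []
  until queue.empty? || results.size >= limit
    node = queue.shift
    graph.fetch(node, []).each do |neighbour|
      next unless visited.add?(neighbour) # nil when already visited
      queue << neighbour
      results << neighbour if allowed.include?(neighbour)
    end
  end
  results.first(limit)
end

graph = {
  'Liked Movie' => ['Actor X', 'Genre G'],
  'Actor X' => ['Liked Movie', 'Old Movie', 'Cinema Movie 1'],
  'Genre G' => ['Liked Movie', 'Cinema Movie 2']
}
in_cinemas = Set['Cinema Movie 1', 'Cinema Movie 2']

p recommend_from_subgraph(graph, 'Liked Movie', in_cinemas, 2)
```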

Recommendation from subgraph (2)
Live Demo…

TeleVido.tv
 Media content recommendation using Neo4j
 Recommendation of movies in cinemas
 Recommendation of TV programs and schedules

Summary
 Graph databases
 Working with Neo4j and Ruby (On Rails)
 Plugins and algorithms
 Document similarity
 Recommendation from subgraph
 TeleVido.tv
