Kevin Van Gundy | Building a Recommendation Engine with Neo4j and Python

PyData Chicago 2016


  1. Build a Recommendation Engine with Neo4j and Python
  2. Getting Started
      ‣ Download Neo4j: neo4j.com/download
      ‣ Open your browser to http://localhost:7474
      ‣ Type the following command: :play http://guides.neo4j.com/pydatachi
  3. Problem
  4. Generic recommendations are low efficacy…
  5. Generic recommendations are typically low efficacy…
  6. Proposed Solution
  7. "It's all about relationships - Kevin Van Gundy -Lebron James"
  8. …Data Relationships
  9. Property Graph Model Components
      Nodes
      • The objects in the graph
      • Can have name-value properties
      • Can be labeled
      Relationships
      • Relate nodes by type and direction
      • Can have name-value properties
      [Figure: two PERSON nodes (name: “Dan”, born: May 29, 1970, twitter: “@dan”; name: “Ann”, born: Dec 5, 1975) and a CAR node (brand: “Volvo”, model: “V70”), joined by relationships labeled LOVES, LIKES, LIVES WITH, OWNS, and DRIVES (since: Jan 10, 2011)]
  10. Storage on Disk
      [Figure: the same graph of PERSON and CAR nodes with their LOVES, LIKES, LIVES WITH, OWNS, and DRIVES relationships]
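The property graph model described above can be sketched in plain Python. This is only an illustration of the idea, not how Neo4j stores data: the `Node` and `Relationship` classes are hypothetical, and which person drives the car is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A graph object with labels and name-value properties."""
    labels: set
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A typed, directed link between two nodes, with its own properties."""
    start: Node
    type: str
    end: Node
    properties: dict = field(default_factory=dict)


# Values taken from the slide's figure; the direction Dan -> car is assumed.
dan = Node({"PERSON"}, {"name": "Dan", "twitter": "@dan"})
car = Node({"CAR"}, {"brand": "Volvo", "model": "V70"})
drives = Relationship(dan, "DRIVES", car, {"since": "Jan 10, 2011"})

print(drives.start.properties["name"], drives.type, drives.end.properties["brand"])
```

Both nodes and relationships carry properties, which is the point the slide makes with the `since` value on DRIVES.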
  11. Introducing our data set...
  12. meetup.com’s recommendations
  13. Recommendation queries
      ‣ Several different types
        • groups to join
        • topics to follow
        • events to attend
      ‣ As a user of meetup.com trying to find groups to join and events to attend
  14. The data
      meetup.com/meetup_api/
  15. What data do we have?
      ‣ Groups
      ‣ Members
      ‣ Events
      ‣ Topics
      ‣ Time & Date
      ‣ Location
  16. Find similar groups to Neo4j
      "As a member of the Outdoorsy Entrepreneur Meetup
      I want to find other similar meetup groups
      So that I can join those groups"
  17. What makes groups similar?
  18. Recommend groups by topic
      ‣ Download Neo4j: neo4j.com/download
      ‣ Open your browser to http://localhost:7474
      ‣ Type the following command: :play http://guides.neo4j.com/pydatachi
  19. To the Browser!
      Great Graphs Batman!
  20. Take Note: Indexes and Constraints
  21. Unique constraints
      We create unique constraints to:
      ‣ ensure uniqueness
      ‣ allow fast lookup of nodes which match these (label,property) pairs.
      CREATE CONSTRAINT ON (t:Topic)
      ASSERT t.id IS UNIQUE
  22. Indexes
      We create indexes to:
      ‣ Allow fast lookup of nodes which match these (label,property) pairs.
      CREATE INDEX ON :Group(name)
  23. Indexes
      The following are index backed:
      ‣ Equality
      ‣ STARTS WITH
      ‣ CONTAINS
      ‣ ENDS WITH
      ‣ Range Searches
      ‣ (Non) Existence Checks
  24. How does Neo4j use indexes?
      Indexes are only used to find the starting point for queries.
      Relational: Use index scans to look up rows in tables and join them with rows from other tables.
      Graph: Use indexes to find the starting points for a query.
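A toy Python analogy may help with the two roles described above: rejecting duplicates, and finding a starting node quickly. The `UniqueIndex` class and the sample topics are hypothetical stand-ins and say nothing about how Neo4j implements indexes internally.

```python
class UniqueIndex:
    """Toy stand-in for a unique constraint on one (label, property) pair."""

    def __init__(self):
        self._nodes_by_value = {}

    def add(self, value, node):
        # Ensure uniqueness: refuse a second node with the same value.
        if value in self._nodes_by_value:
            raise ValueError(f"constraint violated: {value!r} already exists")
        self._nodes_by_value[value] = node

    def lookup(self, value):
        # Fast lookup: a single dictionary access, no scan over all nodes.
        return self._nodes_by_value.get(value)


# Hypothetical sample data keyed by topic id.
topic_by_id = UniqueIndex()
topic_by_id.add(1, {"id": 1, "name": "nosql", "groups": ["Neo4j", "Graph Nerds"]})
topic_by_id.add(2, {"id": 2, "name": "python", "groups": ["Python Users"]})

# The index only finds the starting point; the rest follows stored links.
start = topic_by_id.lookup(1)
groups = start["groups"]
print(groups)
```

After the lookup, the code never touches the index again, which mirrors the slide's claim that indexes only locate where a query starts.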
  25. Next Guide: Group Membership
  26. Watch Out: Transactions & WITH
  27. Periodic Commit
      Cypher keeps all transaction state in memory while running a query, which is fine most of the time.
  28. Periodic Commit
      Cypher keeps all transaction state in memory while running a query, which is fine most of the time…
      But when using LOAD CSV, this state can get very large and may result in an OutOfMemory exception.
  29. Periodic Commit
      // defaults to 1000
      USING PERIODIC COMMIT
      LOAD CSV
      ...
  30. Periodic Commit
      // defaults to 1000
      USING PERIODIC COMMIT 10000
      LOAD CSV
      ...
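The idea behind periodic commit can be sketched in Python: instead of holding every imported row in one transaction, commit after a fixed number of rows. The `load_in_batches` function and its `commit` callback are hypothetical names for illustration; this is not the Neo4j driver API.

```python
import csv
import io


def load_in_batches(rows, commit, batch_size=1000):
    """Call commit() once per batch_size rows and return the commit count."""
    batch = []
    commits = 0
    for row in rows:
        batch.append(row)
        if len(batch) >= batch_size:
            commit(batch)      # flush the pending state
            commits += 1
            batch = []         # start a fresh, empty batch
    if batch:                  # commit the final, partial batch
        commit(batch)
        commits += 1
    return commits


# Hypothetical CSV with 2500 data rows, built in memory for the example.
sample = io.StringIO(
    "id,name\n" + "\n".join(f"{i},group{i}" for i in range(2500))
)
stored = []
n_commits = load_in_batches(csv.DictReader(sample), stored.extend, batch_size=1000)
print(n_commits, len(stored))
```

At most `batch_size` rows are held in memory at once, which is the property that keeps a large import from exhausting the heap.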
  31. WITH
      The WITH clause allows query parts to be chained together, piping the results from one to be used as starting points or criteria in the next.
  32. WITH
      It’s used to:
      ‣ Limit the number of entries that are then passed on to other MATCH clauses
      ‣ Filter on aggregated values
      ‣ Separate reading from updating of the graph
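As a rough analogy, the chaining that WITH provides resembles a staged pipeline in Python, where each stage consumes the previous stage's result. The membership data below is made up for the example, and the Cypher fragments in the comments are only loose counterparts.

```python
from collections import Counter

# Hypothetical (member, group) pairs.
memberships = [
    ("Ann", "Neo4j"), ("Dan", "Neo4j"), ("Eve", "Neo4j"),
    ("Ann", "Python Users"), ("Dan", "Python Users"),
    ("Eve", "Knitting Circle"),
]

# Stage 1: aggregate (roughly: WITH group, count(*) AS members).
members_per_group = Counter(group for _, group in memberships)

# Stage 2: filter on the aggregated value (roughly: WHERE members > 1).
popular = [(g, n) for g, n in members_per_group.items() if n > 1]

# Stage 3: order and limit what is passed on (roughly: ORDER BY ... LIMIT 1).
top = sorted(popular, key=lambda pair: (-pair[1], pair[0]))[:1]
print(top)
```

The filter in stage 2 can only run after the counts exist, which is why an aggregate has to be piped through an intermediate step before it can be tested.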
  33. Continue with the Guide
  34. Exercise: Find yourself and your groups
  35. Solution: Find yourself and your groups
  36. Explore the graph
      Type the following command into the Neo4j browser to see the answers:
      :play http://guides.neo4j.com/reco/02_find_yourself_answers.html
  37. Continue with the Guide
  38. Find my similar groups
      As a member of several meetup groups
      I want to find other similar meetup groups that I’m not already a member of
      So that I can join those groups
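One way to read this user story is as a set computation: gather the topics of the groups I belong to, score other groups by how many of those topics they share, and drop the groups I have already joined. The sketch below assumes similarity means shared topics; the data and the `recommend_groups` function are hypothetical and are not the guide's solution.

```python
from collections import Counter

# Hypothetical data: group name -> set of topic names.
group_topics = {
    "Neo4j": {"graph databases", "nosql"},
    "Python Users": {"python", "open source"},
    "Graph Nerds": {"graph databases", "nosql", "open source"},
    "Data Science": {"python", "nosql"},
    "Knitting Circle": {"knitting"},
}


def recommend_groups(my_groups, group_topics):
    """Rank groups I have not joined by topics shared with my groups."""
    my_topics = set().union(*(group_topics[g] for g in my_groups))
    scores = Counter()
    for group, topics in group_topics.items():
        if group in my_groups:
            continue                      # not already a member of
        overlap = len(my_topics & topics)
        if overlap:
            scores[group] = overlap
    return scores.most_common()           # highest overlap first


recommendations = recommend_groups({"Neo4j", "Python Users"}, group_topics)
print(recommendations)
```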
  39. Next Step: Member Interest
  40. Member interests
  41. Attention: Lists with split & UNWIND
  42. Splitting up topic ids
      The split function lets us convert a string into a string array based on a delimiting character.
  43. Splitting up topic ids
      The split function lets us convert a string into a string array based on a delimiting character.
      RETURN split("1;2;3", ";") AS topicIds
      ["1", "2", "3"]
  44. Splitting up topic ids
      We can use UNWIND to explode any array or list back into individual rows.
  45. Splitting up topic ids
      We can use UNWIND to explode the resulting array back into individual rows.
      UNWIND [1,2,3] AS value
      RETURN value
      1
      2
      3
  46. Splitting up topic ids
      UNWIND split("1;2;3", ";") AS topicId
      RETURN topicId
      "1"
      "2"
      "3"
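For readers more at home in Python, the same two steps look like this: `str.split` turns the delimited string into a list of strings, and a nested loop plays the role of UNWIND by producing one row per element. The row layout and the `unwind_topics` name are assumptions made for the example.

```python
# Hypothetical rows as they might arrive from a CSV file.
rows = [
    {"group": "Neo4j", "topics": "1;2;3"},
    {"group": "Python Users", "topics": "3;4"},
]


def unwind_topics(rows, delimiter=";"):
    """Yield one output row per topic id found in each input row."""
    for row in rows:
        for topic_id in row["topics"].split(delimiter):
            yield {"group": row["group"], "topicId": topic_id}


unwound = list(unwind_topics(rows))
for row in unwound:
    print(row)
```

Note that the split values are strings, so they need an explicit conversion such as `int(topic_id)` before being compared with numeric ids.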
  47. Continue with the Guide
  48. Exercise: My inferred interests
  49. Solution: My inferred interests
  50. My inferred interests
      Type the following command into the Neo4j browser to see the answers:
      :play http://guides.neo4j.com/reco/03_inferred_answers.html
  51. Continue with the Guide
  52. Next Guide: Events
  53. Exercise: Event recommendations
  54. Solution: Event recommendations
  55. Event recommendations
      Type the following command into the Neo4j browser to see the answers:
      :play http://guides.neo4j.com/reco/04_events_answers.html
  56. Continue with the Guide
  57. Next Guide: Venues
  58. Exercise: Import venues
  59. Solution: Import venues
  60. Import venues
      Type the following command into the Neo4j browser to see the answers:
      :play http://guides.neo4j.com/reco/05_venues_import_answers.html
  61. Continue with the Guide
  62. Next Step: Calculating Distances
  63. Continue with the Guide
  64. Exercise: Using venues in recommendation
  65. Solution: Using venues in recommendation
  66. Using venues in recommendations
      Type the following command into the Neo4j browser to see the answers:
      :play http://guides.neo4j.com/reco/05_venues_distance_queries_answers.html
  67. Using venues in recommendations
      WITH {latitude: 51.518551, longitude: -0.086114} AS here
      MATCH (member:Member {name: "Mark Needham"})
            -[:MEMBER_OF]->(group)-[:HOSTED_EVENT]->(futureEvent),
            (venue)<-[:VENUE]-(futureEvent)
      WHERE futureEvent.time > timestamp()
      RETURN group.name, futureEvent.name,
             round((futureEvent.time - timestamp()) / (24.0*60*60*1000)) AS days,
             distance(venue, here) AS distance
      ORDER BY days, distance
  68. Venues close to here
      WITH {latitude: 51.518551, longitude: -0.086114} AS here
      MATCH (member:Member {name: "Mark Needham"})
            -[:MEMBER_OF]->(group)-[:HOSTED_EVENT]->(futureEvent),
            (venue)<-[:VENUE]-(futureEvent)
      WHERE futureEvent.time > timestamp()
      WITH group, futureEvent, distance(venue, here) AS distance
      WHERE distance < 1000
      RETURN group.name, futureEvent.name,
             round((futureEvent.time - timestamp()) / (24.0*60*60*1000)) AS days,
             distance
      ORDER BY days, distance
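The distance filter in these queries can be approximated in plain Python. The sketch assumes the distance is a great-circle distance in metres and uses the haversine formula for it; that formula, the `haversine_m` helper, and the two sample venues are assumptions for illustration, while the `here` coordinates come from the slide.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(a, b):
    """Great-circle distance in metres between two latitude/longitude dicts."""
    lat1, lon1 = radians(a["latitude"]), radians(a["longitude"])
    lat2, lon2 = radians(b["latitude"]), radians(b["longitude"])
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * asin(sqrt(h))


here = {"latitude": 51.518551, "longitude": -0.086114}

# Hypothetical venues; only the first lies within 1000 m of `here`.
venues = [
    {"name": "Nearby venue", "latitude": 51.5200, "longitude": -0.0870},
    {"name": "Far venue", "latitude": 51.6000, "longitude": -0.2000},
]

# Keep venues closer than 1000 m, nearest first.
nearby = sorted(
    (haversine_m(v, here), v["name"])
    for v in venues
    if haversine_m(v, here) < 1000
)
print(nearby)
```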

  69. Next Guide: RSVPs
  70. Exercise: Events at my venues
  71. Solution: Events at my venues
  72. Events at my venues
      Type the following command into the Neo4j browser to see the answers:
      :play http://guides.neo4j.com/reco/06_my_venues_answers.html
  73. Next Guide: Procedures
  74. Exercise: Import photos metadata
  75. Solution: Import photos metadata
  76. Import photos metadata
      Type the following command into the Neo4j browser to see the answers:
      :play http://guides.neo4j.com/reco/07_photos_answers.html
  77. Next Guide: Latent Social Graph
  78. Watch Out: Transaction State
  79. Transaction State
      Cypher keeps all transaction state in memory while running a query, which is fine most of the time.
      But when refactoring the graph, this state can get very large and may result in an OutOfMemory exception.
  80. Batch all the things
      We therefore need to take a batched approach to large scale refactorings.
      MATCH (m1:Process)
      WITH m1 LIMIT 1000
      REMOVE m1:Process
      WITH m1
      // do the refactoring
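The batching pattern on this slide relies on a marker: nodes still carrying the Process label are pending, and removing the label as each batch is handled means the next run picks up a fresh 1000. A minimal Python sketch of that loop follows, with a set standing in for the labelled nodes; `refactor_in_batches` and its arguments are hypothetical names.

```python
def refactor_in_batches(pending, refactor, batch_size=1000):
    """Process `pending` in batches until it is empty; return the batch count."""
    batches = 0
    while pending:
        # Take up to batch_size items and unmark them (like REMOVE m1:Process).
        size = min(batch_size, len(pending))
        batch = [pending.pop() for _ in range(size)]
        for node in batch:
            refactor(node)     # the actual refactoring work
        batches += 1           # one transaction per batch
    return batches


# Hypothetical workload: 2500 node ids awaiting refactoring.
pending = set(range(2500))
done = []
batches = refactor_in_batches(pending, done.append, batch_size=1000)
print(batches, len(done), len(pending))
```

Because each pass removes the marker from what it processed, the loop terminates once no marked nodes remain, and no single pass holds more than one batch of state.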
  81. Continue with the Guide
  82. Exercise: Add friends to recommendation
  83. Solution: Add friends to recommendation
  84. Add friends to recommendation
      Type the following command into the Neo4j browser to see the answers:
      :play http://guides.neo4j.com/reco/08_latent_answers.html
  85. Next Guide: Scoring
  86. Next Guide: Your turn
  87. JOIN NEO4J.COM/SLACK & #TRAINING-ATTENDEES
