
Develop A Basic Recommendation System using Cypher

Recommendation systems deliver more ROI than any other investment in Data Analytics. This talk will introduce the most basic but effective Recommendation System called Collaborative Filtering and show how to implement it using the Cypher Graph Query Language.

The mathematics behind collaborative filtering will be explained as will the usefulness of Graph in implementing such an engine. We will use AgensGraph, which adds graph capability to PostgreSQL, for the talk.

This talk is aimed at those who are new to either Cypher or basic recommendation system theory.

Published in: Data & Analytics
  1. Basic Recommendation Engine. Joe Fagan, Snr Dir Presales EMEA. May 2018. www.bitnine.net. ⓒ 2017 by Bitnine Co, Ltd. All Rights Reserved. Bitnine Global Company Confidential.
  2. Agenda: AgensGraph Introduction
     • Introduction to Recommendation Engines
     • Collaborative filtering approach
     • Some arithmetic
     • Importing data into AgensGraph
     • Creating graph, ratings and similarity edges
     • Creating recommendation edges
     • Further enhancements to the Recommendation Engine
  3. What is an RS?
     A Recommendation System (RS) finds, for each user, the 'product' most likely to be relevant.
     • Playlist: YouTube, Netflix
     • Content: Facebook, Twitter
     • People: LinkedIn, Badoo
     • Product: Amazon, Alibaba
     The first RS was implemented in 1994. Today 35% of purchases on Amazon and 75% of what we watch on Netflix come from RSs.
     Outcome:
     • 15%-35% increase in sales
     • 30%-80% increase in screen time
     • Greater ROI than any other investment
     Alternative outcome: polarisation
  4. Analytics Journey
     • Descriptive Analytics: sum, avg, patterns. Sell more hats in winter.
     • Diagnostic Analytics: correlations and reasoning. Cold = more hats.
     • Predictive Analytics: describe the future. Next winter = more hats.
     • Prescriptive Analytics: write the future. Make it colder -> more hats.
     Value to business: - $ $$$ $$$
     RS lives somewhere between predictive and prescriptive analytics.
  5. 2 basic principles
     • Collaborative Filtering: Alice and Bob both buy blue and green, so Alice and Bob are similar. Alice buys yellow => recommend yellow to Bob.
     • Content Based Filtering: Bob buys blue. Blue is similar to yellow => recommend yellow to Bob.
  6. Ratings Matrix and Similarity
     A ratings matrix is the starting point. The ratings can be explicit (actual ratings) or implicit (purchases, frequency of purchases, timeline). The challenge is to predict all the blank entries, e.g. how will C4 feel about P3? To answer this we consider how people similar to C4 felt about P3.

     Ratings: Cust, Product (columns P1 to P10; blank cells are omitted, so each row lists only that customer's ratings in column order; ?? is the C4, P3 entry to predict)
     C1: 3 4 2 4 3 5 3 2
     C2: 5 2 1 2 5 3
     C3: 1 5 1
     C4: 1 2 ?? 3 2 5 3 3
     C5: 4 1 2 5 2
     C6: 5 4 5 1
     C7: 4 5 5 3 5 4 1 4
     C8: 1 3 1 5 4 2 4 5

     There are many ways to calculate similarity. The accepted standard is cosine similarity (next slide).

     Similarity: Cust, Cust (- marks a blank cell)
          C1   C2   C3   C4   C5   C6   C7   C8
     C1    -  0.7  0.6  0.5  0.4  0.4  0.6  0.6
     C2  0.7    -  0.2  0.3  0.6  0.5  0.6  0.5
     C3  0.6  0.2    -  0.2    -  0.2  0.2  0.1
     C4  0.5  0.3  0.2    -  0.2  0.7  0.6  0.8
     C5  0.4  0.6    -  0.2    -  0.3  0.6  0.6
     C6  0.4  0.5  0.2  0.7  0.3    -  0.5  0.6
     C7  0.6  0.6  0.2  0.6  0.6  0.5    -  0.6
     C8  0.6  0.5  0.1  0.8  0.6  0.6  0.6    -

     Once we know similarity, we want to favour the ratings of similar people and diminish the influence of dissimilar people. For C4 and P3, the P3 ratings of C1, C5 and C7 are weighted by their similarity to C4:
     (0.5x2 + 0.2x4 + 0.6x5) / (0.5 + 0.2 + 0.6) = 4.8 / 1.3 = 3.7

     Once all missing ratings have been calculated for each customer, we pick the highest calculated rating and recommend that product.
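The weighted average on this slide can be sketched in Python. This is only an illustration of the arithmetic, not the talk's Cypher implementation. The function name `predict_rating` and the list-of-pairs input are assumptions made for the example. The similarity and rating values are those of the C4/P3 calculation on the slide.

```python
def predict_rating(neighbours):
    """Similarity-weighted average of neighbours' ratings.

    neighbours: list of (similarity, rating) pairs, one per customer
    who rated the product. Returns None when there is nothing to weigh.
    """
    total_weight = sum(sim for sim, _ in neighbours)
    if total_weight == 0:
        return None
    weighted_sum = sum(sim * rating for sim, rating in neighbours)
    return weighted_sum / total_weight


# Customers who rated P3, as (similarity to C4, rating of P3):
# C1 -> (0.5, 2), C5 -> (0.2, 4), C7 -> (0.6, 5)
c4_p3 = predict_rating([(0.5, 2), (0.2, 4), (0.6, 5)])
print(round(c4_p3, 1))  # 3.7
```

Dividing by the sum of the similarities keeps the prediction on the same scale as the original ratings, whatever the number of neighbours.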
  7. Cosine similarity
     To measure similarity, place the customers as vectors in a vector space whose co-ordinates are the ratings of the products. Look at 2 vectors in just 3 dimensions (really, the space has as many dimensions as there are products). Similar customers point in similar directions. Measure the cosine of the angle between the vectors: identical = cos(0°) = 1, orthogonal = cos(90°) = 0.

            P1  P2  P3  Len
     A  C1   3   4   2  5.4
     B  C2   5   0   0  5

     Similarity = cos(theta) = (3*5 + 4*0 + 2*0) / (5.4 * 5) = 0.56

     For some mind-blowing graphics and explanations of topics in linear algebra, search YouTube for "3Blue1Brown Linear Algebra".
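The cosine calculation for C1 and C2 can be checked with a short Python sketch. The function name `cosine_similarity` is an assumption for illustration, as is the choice to return 0.0 for a customer with no ratings. A blank rating is treated as 0, as on the slide.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two rating vectors (blank rating = 0)."""
    dot = sum(x * y for x, y in zip(a, b))
    len_a = math.sqrt(sum(x * x for x in a))
    len_b = math.sqrt(sum(y * y for y in b))
    if len_a == 0 or len_b == 0:
        # Assumption: a customer with no ratings is similar to no one.
        return 0.0
    return dot / (len_a * len_b)


c1 = [3, 4, 2]  # C1's ratings of P1, P2, P3
c2 = [5, 0, 0]  # C2's ratings of P1, P2, P3
print(round(cosine_similarity(c1, c2), 2))  # 0.56
```

The same function works unchanged for vectors with one entry per product, which is the real dimensionality the slide refers to.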
  8. Cypher ASCII Art: cust buys prod
     • Base notation: ( ) - [ ] -> ( )
     • Labels: ( :cust ) - [ :buys ] -> ( :prod )
     • Identifiers: ( xx :cust ) - [ r :buys ] -> ( y :prod )
     • Properties and Values: ( xx :cust {name: "C1"} ) - [ r :buys {qty: 4} ] -> ( y :prod {p: "P1"} )
  9. AgensBrowser images: Meta Graph, Similarity, Rates
  10. Further improvements
      • Use Cosine centrality on ratings (each rating minus that customer's mean rating)
        • Avoids treating no rating as a negative rating
        • Smooths harsh and easy raters
      • Categorize products
        • Calculate similarity by category
        • For consumables show products already purchased
        • For Books/Music show similar products
      • Factor in business preference for vendors/products/categories
        • Based on margin, stock, return rates, strategic direction

      Cosine Centrality ratings (columns P1 to P10; blank cells are omitted, so each row lists only that customer's values in column order)
      C1: -0.3 0.8 -1.3 0.8 -0.3 1.8 -0.3 -1.3
      C2: 2.0 -1.0 -2.0 -1.0 2.0
      C3: -1.3 2.7 -1.3
      C4: -1.7 -0.7 0.3 -0.7 2.3 0.3 0.3
      C5: 1.2 -1.8 -0.8 2.2 -0.8
      C6: 1.3 0.3 1.3 -2.8
      C7: 0.1 1.1 1.1 -0.9 1.1 0.1 -2.9 0.1
      C8: -2.1 -0.1 -2.1 1.9 0.9 -1.1 0.9 1.9

      Adjustment to recommendation, shown for product P1:
      • Product factors (held in vertex): F1, F2, F3
      • Business direction scaling factors: K1, K2, K3
      • rec -> Rec Final (final recommendation)

      Example: F1 is stock, F2 is margin
      • Business objective says reducing stock is the priority = increase K1, reduce K2.
      • Business says margin is the #1 priority = reduce K1, increase K2.
      • With enough data and A/B testing, adjusting the K's becomes a Machine Learning problem.
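The "Cosine Centrality" values on this slide match each customer's ratings minus that customer's own mean rating, taken over rated products only. Below is a minimal Python sketch of that centring step. The function name `centre_ratings` and the plain-list input are assumptions for illustration. The input is C5's ratings from the ratings matrix on slide 6.

```python
def centre_ratings(ratings):
    """Subtract the customer's mean rating from each rating.

    ratings: the ratings a customer actually gave (blanks excluded).
    """
    if not ratings:
        return []
    mean = sum(ratings) / len(ratings)
    return [rating - mean for rating in ratings]


c5 = [4, 1, 2, 5, 2]  # C5's ratings; the mean is 2.8
print([round(x, 1) for x in centre_ratings(c5)])  # [1.2, -1.8, -0.8, 2.2, -0.8]
```

After centring, an unrated product sits at 0, the customer's neutral point, rather than below every real rating. This is why the slide says it avoids reading a missing rating as a negative one.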
  11. Thank you for your attention
