Recommendations and Statistics
with Graph Databases
Calin Constantinov
Development Consultant
Neo4j Certified Professional
16th May 2019
Agenda
1. Recommendations 101
2. SQL Drawbacks and NOSQL Alternatives
3. Graph Databases
4. Simple Queries with (open)Cypher
5. Building a Social Recommendations Platform with Neo4j
6. Facebook example: PlacesToBe
7. LinkedIn example: LocalTalent
8. Q&A
RECOMMENDATIONS 101
Smart Things Others Have Said
45% of online shoppers are more likely to shop on a site that offers personalized
recommendations
56% of online shoppers are more likely to return to a site that recommends products
59% of online shoppers believe that it is easier to find more interesting products on
personalized online retail stores
source: https://www.invespcro.com/blog/online-shopping-personalization
Common Approaches
source: https://www.themarketingtechnologist.co/building-a-recommendation-engine-for-geek-setting-up-the-prerequisites-13
The Ratings Matrix
source: https://nikhilwins.wordpress.com/2015/09/18/movie-recommendations-how-does-netflix-do-it-a-9-step-coding-intuitive-guide-into-collaborative-filtering
Basic Similarity Measures
Euclidean distance:
Cosine similarity:
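The formulas on this slide did not survive extraction. As a minimal sketch, assuming the standard definitions (Euclidean distance as the square root of the summed squared differences, cosine similarity as the dot product divided by the product of the vector lengths), the two measures could be computed like this. The rating vectors and the zero-vector convention are illustrative assumptions, not taken from the deck.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length rating vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Convention chosen for this sketch: a user with no ratings
        # is treated as similar to nobody.
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical ratings of the same four items by two users (0 = not rated).
alice = [5, 3, 0, 1]
bob = [4, 0, 0, 1]

print(euclidean_distance(alice, bob))
print(cosine_similarity(alice, bob))
```

A smaller distance means more similar users, while a larger cosine means more similar users, so the two measures sort in opposite directions.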
Multidimensionality: A 360° Customer View
source: Wenkai Mo - Recommender System
Ideal recommendation features
NOVEL – however, reminders do sometimes work.
RELEVANT – even though an item seems interesting, also consider past orders.
SERENDIPITOUS – always recommending the obvious is pointless.
TRANSPARENT – raise trust and credibility by explaining yourself.
SQL DRAWBACKS AND NOSQL ALTERNATIVES
SQL Problems
:(
Although SQL databases are excellent for a vast category of problems, they can be hard to scale.
The “one size fits all” approach of relational databases is no longer valid.
Moreover, modern data increasingly has an obvious graph-like structure.
SQL does not naturally support graph-specific operations (e.g. DFS, BFS).
Complex stored procedures and queries are thus needed for even the simplest graph tasks.
And what about changes to the structure of the data?
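To make the BFS mention concrete, here is a minimal sketch of a breadth-first traversal over an in-memory adjacency list. The `friends` data and the function name are hypothetical. The point is only that the traversal is a few lines when relationships are first-class, whereas in plain SQL each extra hop typically means another self-join or a recursive query.

```python
from collections import deque


def bfs_distances(graph, start):
    """Return the number of hops from start to every reachable node."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in distances:  # first visit is the shortest path
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


# Hypothetical friendship graph stored as an adjacency list.
friends = {
    "ana": ["bogdan", "carmen"],
    "bogdan": ["ana", "dan"],
    "carmen": ["ana"],
    "dan": ["bogdan", "elena"],
    "elena": ["dan"],
}

print(bfs_distances(friends, "ana"))
```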
Case Study: Recommender Systems
Fancy name for “Fooling the customer”
Much more can be told about a person by analyzing their relationships than by reviewing raw
statistics about them.
Recommendations are more likely to be of value when larger volumes of diverse data are
analyzed.
With a traditional approach, queries often take too long to be run on demand.
Spoiler alert! That’s not necessarily the case for graphs!
Precomputed recommendations are usually displayed to the users (but consider an
auctioning site!).
NOSQL
Not solely aimed towards pretentious hipsters anymore!
GRAPH DATABASES
Data Is the New Dollar
source: David Somerville - http://www.smrvl.com/blog
The Labelled Property Graph Model
The Labelled Property Graph Model (cont’d)
Making sense of data
Go graph! All the other kids are doing it!
Takeaway: The value of data isn’t represented by its volume, but by our capacity to
understand the relationships between its constituent elements.
Graph databases offer analytical and discovery capabilities that are hard to match with
other persistence solutions.
Graphs model relations in a generic manner and enable flexibility without major
restructuring of the global schema (as is often needed with SQL).
Bonus: there’s a very high level of abstraction associated with the way graph queries can
be expressed.
Case study: Minimalist social network
Epic battle!
Let’s consider a social network with 1 000 000 users, each having 50 friends.
SQL has to “fake” relationships (don’t we all?).
SQL: Graph:
source: Ian Robinson, Jim Webber, and Emil Eifrem: Graph Databases, 2013, O'Reilly
Minimalist social network (cont’d)
S14E04: You have 0 friends
Also consider an asymmetric (directed) scenario: Who are my followers?
Reversing the direction of a traversal would be difficult with non-native graph processing.
For that, you must either create a costly reverse-lookup index for each traversal or
perform a brute-force search through the original index.
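The two non-native options mentioned above can be sketched as follows, assuming a "follows" relation stored only in the forward direction. The data and function names are hypothetical. The brute-force version scans every user on each call, while the reverse index answers directly but has to be built and then kept in sync with every write.

```python
from collections import defaultdict

# Hypothetical forward index: user -> set of users they follow.
follows = {
    "ana": {"bogdan", "carmen"},
    "bogdan": {"carmen"},
    "carmen": set(),
    "dan": {"ana", "carmen"},
}


def followers_brute_force(follows, user):
    """Scan the whole forward index: work grows with the number of users."""
    return {source for source, targets in follows.items() if user in targets}


def build_reverse_index(follows):
    """Build the reverse lookup once: target -> set of followers."""
    reverse = defaultdict(set)
    for source, targets in follows.items():
        for target in targets:
            reverse[target].add(source)
    return reverse


reverse = build_reverse_index(follows)
print(followers_brute_force(follows, "carmen"))
print(reverse["carmen"])
```

A native graph store avoids this trade-off because each relationship can be walked from either end.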
The results are in!
SIMPLE QUERIES WITH CYPHER
Cypher
‘Member ASCII art? (っ◕‿◕)っ
Powerful and expressive query language requiring 10x to 100x less code than SQL.
Declarative language for describing patterns in graphs visually using an ASCII-art syntax.
Comes with a profiler / interactive query planner.
Collaborative Filtering over a Graph
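// Find users who rated "Home Alone" and the other movies those users rated
MATCH (m:Movie {title: "Home Alone"})<-[:RATED]-(u:User)-[:RATED]->(rec:Movie)
// Rank each candidate by how many of those users rated it
RETURN rec.title AS recommendation, COUNT(u) AS usersWhoAlsoWatched
ORDER BY usersWhoAlsoWatched DESC
LIMIT 25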
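For readers without a Neo4j instance at hand, this is a rough in-memory sketch of what the query above computes: for everyone who rated a given movie, count the other movies they rated and rank by that count. The `ratings` data and the function name are made up for illustration.

```python
from collections import Counter

# Hypothetical data: user -> set of movie titles they rated.
ratings = {
    "u1": {"Home Alone", "Elf", "Die Hard"},
    "u2": {"Home Alone", "Elf"},
    "u3": {"Home Alone", "Elf"},
    "u4": {"Elf", "Die Hard"},
}


def also_rated(ratings, title, limit=25):
    """Rank other movies by how many raters of `title` also rated them."""
    counts = Counter()
    for movies in ratings.values():
        if title in movies:
            counts.update(movies - {title})  # never recommend the seed movie
    return counts.most_common(limit)


print(also_rated(ratings, "Home Alone"))
```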
Weighing In
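// Compute the user's average rating
MATCH (u:User {name: "Nicole Ramsey"})
MATCH (u)-[r:RATED]->(m:Movie)
WITH u, AVG(r.rating) AS average
// Keep only the movies the user rated above their own average
MATCH (u)-[r:RATED]->(m:Movie)
WHERE r.rating > average
RETURN m, r.rating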
BUILDING A SOCIAL RECOMMENDATIONS PLATFORM
Airport places
The metagraph:
Exquisite food and cheap beer, right? <3
source: https://neo4j.com/blog/real-time-recommendation-engine-data-science/
Basic social recommendation
Food and drink places in the following {categories} closest to gate {gate} in terminal {terminal}
that {user}'s friends like:
Making friends and liking stuff
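The query for this slide was an image and is not in the transcript. As a rough stand-in, here is an in-memory sketch of the same idea: collect the places the user's friends like, keep those in the requested categories and terminal, and order them by distance to the gate. All place names, the `metres_to_gate` field and the tie-breaking rule are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical airport data.
places = {
    "Burger Bay": {"category": "burgers", "terminal": "A", "metres_to_gate": {"A1": 120}},
    "Hop House": {"category": "bar", "terminal": "A", "metres_to_gate": {"A1": 40}},
    "Sushi Stop": {"category": "sushi", "terminal": "B", "metres_to_gate": {"B3": 10}},
}
friends = {"ana": {"bogdan", "carmen"}}
likes = {
    "bogdan": {"Burger Bay", "Hop House"},
    "carmen": {"Hop House", "Sushi Stop"},
}


def recommend(user, categories, terminal, gate):
    """Places liked by the user's friends, closest to the gate first."""
    liked_by_friends = Counter()
    for friend in friends.get(user, ()):
        liked_by_friends.update(likes.get(friend, ()))

    candidates = []
    for name, like_count in liked_by_friends.items():
        place = places.get(name)
        if place is None:
            continue
        if (place["category"] in categories
                and place["terminal"] == terminal
                and gate in place["metres_to_gate"]):
            # Sort key: distance first, then more friend likes first.
            candidates.append((place["metres_to_gate"][gate], -like_count, name))
    return [name for _, _, name in sorted(candidates)]


print(recommend("ana", {"burgers", "bar"}, "A", "A1"))
```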
User similarities
Let’s apply weights to the Like relationship and compute similarity distances between users.
The moment we began to fall apart
We could add this part in order to:
Find food and drink places in the following {categories} closest to gate {gate} in terminal
{terminal} that users similar to {user} like.
Applying K-Means
More interestingly, user clusters can be identified:
Always remember that you are absolutely unique. Just like everyone else.
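The clustering step can be illustrated with a bare-bones K-Means, assuming each user is reduced to a vector of like weights per category. This is a teaching sketch, not the deck's implementation: the points are invented, and the centroids are seeded with the first k points so the run is deterministic (real implementations usually pick random or k-means++ seeds). It needs Python 3.8+ for `math.dist`.

```python
import math


def k_means(points, k, iterations=20):
    """Plain K-Means: returns (cluster index per point, centroids)."""
    # Simplification for this sketch: seed with the first k points.
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment, centroids


# Hypothetical users described by (beer weight, food weight).
users = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.0), (1.0, 0.5), (9.0, 8.5)]

clusters, centres = k_means(users, k=2)
print(clusters)
print(centres)
```

Once every user carries a cluster label, a recommendation can be restricted to places liked by members of the same cluster.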
Social cluster recommendations
Find food and drink places in the following {categories} closest to gate {gate} in terminal
{terminal} that users in {user}'s cluster like:
It’s a date!
PLACESTOBE
CraiovaRestaurants
Wanna go out tonight?
Back in 2013, Facebook data from 10 users and their friends was mined.
The final dataset consisted of 21981 users, 48051 check-ins, 549 places and 76 categories, all
linked by 392607 relationships. (7% of all check-ins ever placed in Craiova were captured!)
Yes, this was before Cambridge Analytica.
Popular places
Pub crawl!
Most popular places, by number of visitors.
Places where people return
They keep coming back for more!
Most popular places, by the percentage of visitors that have returned at least once.
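The statistic on this slide is simple to state in code. A minimal sketch, assuming check-ins are available as (user, place) pairs, with invented sample data: count visits per user per place, then take the share of visitors with at least two visits.

```python
from collections import Counter, defaultdict

# Hypothetical check-ins as (user, place) pairs.
check_ins = [
    ("ana", "Pub X"), ("ana", "Pub X"), ("bogdan", "Pub X"),
    ("carmen", "Cafe Y"), ("carmen", "Cafe Y"),
    ("dan", "Cafe Y"), ("dan", "Cafe Y"), ("ana", "Cafe Y"),
]


def return_rates(check_ins):
    """Percentage of each place's visitors who checked in at least twice."""
    visits = defaultdict(Counter)  # place -> Counter of visits per user
    for user, place in check_ins:
        visits[place][user] += 1

    rates = {}
    for place, per_user in visits.items():
        returning = sum(1 for n in per_user.values() if n >= 2)
        rates[place] = 100.0 * returning / len(per_user)
    return rates


for place, rate in sorted(return_rates(check_ins).items()):
    print(f"{place}: {rate:.1f}% of visitors returned")
```

Here `len(per_user)` is also the number of distinct visitors, which is the popularity measure used on the previous slide.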
Places visited by friends
We're social people (at least on Facebook)
Places a given user hasn’t visited yet that are most commonly visited by the users who
most often visit the same places as the given user.
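One possible reading of this recommendation, sketched in memory: weight every other user by the number of places they share with the given user, then score each unvisited place by the summed weights of the users who went there. The scoring rule and the sample data are assumptions. The original query is not part of the transcript.

```python
from collections import Counter

# Hypothetical data: user -> set of places they checked in at.
visited = {
    "ana": {"Pub X", "Cafe Y"},
    "bogdan": {"Pub X", "Cafe Y", "Bistro Z"},
    "carmen": {"Pub X", "Club W"},
    "dan": {"Club Q"},
}


def recommend_places(visited, user, limit=5):
    """Unvisited places, scored by overlap with the users who went there."""
    mine = visited.get(user, set())
    scores = Counter()
    for other, theirs in visited.items():
        if other == user:
            continue
        overlap = len(mine & theirs)  # how often we visit the same places
        if overlap == 0:
            continue
        for place in theirs - mine:  # only places the user hasn't been to
            scores[place] += overlap
    return scores.most_common(limit)


print(recommend_places(visited, "ana"))
```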
Similar places
Bear with me…
Places similar to a given place, based on the number of common categories and the largest
number of users visiting both places.
Highly-Available Neo4j Heterogeneous Load Balanced Cluster
tl;dr
All read times reasonably fall within a “real-time” constraint.
LOCALTALENT
The graph model
The dataset: 206 complete profiles (2044 total), 275 active jobs (775 total), 361 companies,
991 skills, 19421 endorsements, 89 educational institutions.
This is so META!
Biggest companies
Top 15 companies by number of active jobs.
Size Matters!
Loyal employees
#relationshipgoals
Top 15 companies by the average time an employee holds a position at the company (in months).
Employee leaves
Time for breakup songs!
Top 10 employee moves from one company to another.
Active jobs
So many noobs!
A view on the distribution of the active jobs.
Showcased skills
Number of profiles displaying each of the top 20 most displayed skills.
Who doesn’t like a show-off?
Endorsements
She didn’t endorse me back :(
Percentage distribution for top 20 endorsed skills.
Wide-range and niche companies
Finding the perfect job for your hipster-esque coding needs
Percentage distribution for top 3 endorsed skills for selected companies.
(calin:IncredibleGraphExpert)-[:ANSWERS]->(anyQuestion)
See you at the workshop on June 13th
THANK YOU!
