Leveraging Dimensionality for Graph-Based Recommendations with Sparse Data
- 2. Intro
Will Evans, VP Strategy and Innovation at Graphable
Working with Neo4j for 4+ years
My first project was a recommendation engine for
blogs, which I did completely wrong
This will be recommendations + experience to help
guide you in a better direction
Graphable: Consultancy, Exclusive Hume Reseller in
the US
© GRAPHABLE, INC 2020
- 3. Beer Graph Recommendations
The goal for today’s session is to start with a dataset and a business goal, and take you all the
way from schema design to usable results that could be deployed
1. Use Case
2. Schema Design
3. Data Ingestion
4. Cypher Development
5. Finding Similarity for Recommendations
6. Conclusion
- 4. Use Case: Recommendations
Recommendation engines are critical to a good customer experience
These recommendation engines are especially necessary when the customer is faced with a large product
assortment where searching is not feasible, or product discovery is desired
Additionally, recommendation engines are critical to personalization and making sure each
individual customer is shown the products they are most likely to want to buy
◦ At the same time, we want to show each customer as diverse a set of products as possible within this space
But businesses with large product assortments and a customer base that would benefit
from personalization usually:
1. Have sparse data that doesn't lend itself to most machine learning algorithms
2. Involve a customer pool of unique individuals with diverse taste profiles
- 6. Schema Design: Goal and Data
GOAL: Recommend beer to users.
DATA:
- Sparse, high dimensional data
- Dimensions: beer, beer style, brewer, review
text, glass type, flavors, beer name, beer name
sentiment, country, user beer journey/expertise,
review sentiment, etc. etc.
- Facts: Users, review scores
- P > N (more dimensions than observations)
- Most Beers have 1 or fewer reviews, most
Users have reviewed 2 or fewer beers
- 7. Schema Design: The design
With sparse, high dimensional data the schema design is
often simple. In our instance:
1. Beer is the center of our universe, so we start with the
beer. Where does beer come from?
2. Beer comes from a Brewer, and Brewers have
multiple beers.
3. Beers are also of a certain style, and Styles have
many beers
4. Users are unrelated to any of the other entities, so they
become an additional entity
5. Reviews only exist as an interaction of User and Beer
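The steps above can be sketched as a tiny in-memory model of the schema. This is illustrative only: the relationship names (BREWS, HAS_STYLE, REVIEWED) and the sample data are assumptions, not from the talk.

```python
# Minimal sketch of the slide's schema: Brewer-[:BREWS]->Beer,
# Beer-[:HAS_STYLE]->Style, and Review modeled as a User-[:REVIEWED]->Beer
# relationship (reviews only exist as an interaction of User and Beer).
# Relationship names and sample data are illustrative assumptions.
schema = {
    "nodes": ["Brewer", "Beer", "Style", "User"],
    "relationships": [
        ("Brewer", "BREWS", "Beer"),
        ("Beer", "HAS_STYLE", "Style"),
        ("User", "REVIEWED", "Beer"),  # review scores live as properties here
    ],
}

# Example instance data stored as (source label, source, rel, target label, target).
graph = [
    ("Brewer", "Dogfish Head", "BREWS", "Beer", "90 Minute IPA"),
    ("Beer", "90 Minute IPA", "HAS_STYLE", "Style", "Imperial IPA"),
    ("User", "alice", "REVIEWED", "Beer", "90 Minute IPA"),
]

def neighbors(graph, label, name):
    """Return (relationship, target) pairs for a given node."""
    return [(rel, (tl, tn)) for sl, sn, rel, tl, tn in graph
            if (sl, sn) == (label, name)]
```

Keeping Review as a relationship (rather than a node) matches point 5: a review has no identity apart from the User-Beer pair it connects.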
- 8. Data Ingestion: Source Data
We used Orchestra in GraphAware’s Hume for loading data, but a transformation script and LOAD
CSV or a bulk load would work just as well
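A transformation script for the LOAD CSV route might look like the sketch below: deduplicate entities out of a flat review export so each node type gets its own file. The column names and sample rows are hypothetical.

```python
import csv
import io

# Hypothetical raw export: one row per review, entities denormalized.
# Column names are assumptions for illustration.
raw = io.StringIO(
    "user,beer,brewer,style,score\n"
    "alice,90 Minute IPA,Dogfish Head,Imperial IPA,4.5\n"
    "bob,90 Minute IPA,Dogfish Head,Imperial IPA,4.0\n"
)

# Deduplicate entities for node CSVs; keep every review for the
# relationship CSV.
beers, brewers, styles, reviews = set(), set(), set(), []
for row in csv.DictReader(raw):
    beers.add((row["beer"], row["brewer"], row["style"]))
    brewers.add(row["brewer"])
    styles.add(row["style"])
    reviews.append((row["user"], row["beer"], float(row["score"])))
```

The deduplicated sets become the node files and `reviews` the relationship file, so the LOAD CSV (or bulk import) step can MERGE nodes once and create one REVIEWED relationship per row.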
- 10. Cypher Development: Overview
Don’t boil the ocean
Start with individual modules and unify them at the end
◦ Test review scores by X dimension, and explore results
◦ Start aggregating and weighting
◦ Avoid complexity if it’s not needed. You can achieve good, repeatable results from a hard dataset using a
collection of basic techniques
◦ Rule-based+
Clustering algorithms, node2vec, etc. have diminishing returns, and results depend heavily on the
data size. Weigh the ROI of the improved accuracy against the effort, and only do so after
you've already deployed and measured good, repeatable recommendations in production
- 11. Cypher Development: Building
Recommendations
Building a recommendation engine with sparse, high-dimensional data is a tricky process
◦ Cold (cool?) start problem
◦ Many users have reviewed 1 beer, and many beers have only 1 review
◦ Very little overlap across users and beers they've reviewed
◦ Lack of any discernible pattern to walk between Users and Beers and find “similar”
◦ Some brewers have many beers, some have few
◦ Overlapping beer styles that do not clearly identify beer flavor
What we cannot and should not do:
◦ Recommend beers based on co-review occurrence (collaborative filtering)
◦ Base recommendation on beers with highest scores
◦ Simply parse review text to extract vector embeddings to find similar reviews and then recommend the corresponding beers
Proceed carefully to prevent narrow and mismatched recommendations
- 12. Cypher Development: "Cold Start"
Solution
Introducing new products to people is never easy; push the wrong product and bad things happen
Since so many users have only reviewed one or two beers, they would never get relevant beers
recommended to them
Solution is to leverage the hierarchy of beer to style and find highly-related styles to then drill back down to
beer
◦ Even with this approach we are still left with too big of a set
◦ Ranking by review score is still not enough
Next, a little NLP without NLP, leveraging review text to find words that show up in beer names
◦ Adds an additional dimension to the search
◦ Matches users’ review phrasing against beer names to find candidate beers, grounded by review scores
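The "NLP without NLP" step above can be sketched as a plain word overlap between review text and beer names, ranked by score. The stopword list, tokenizer, and all beer data here are illustrative assumptions.

```python
import re

# Sketch of "NLP without NLP": surface beers whose names share a word with
# the user's review text, best review score first. Data is illustrative.
STOPWORDS = {"the", "a", "and", "of", "with"}

def tokens(text):
    """Lowercase word set, minus stopwords -- no NLP library required."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def name_match_recs(review_text, beer_scores):
    """beer_scores: {beer_name: avg_review_score}. Return beers whose names
    share at least one word with the review text, highest score first."""
    words = tokens(review_text)
    hits = [(name, score) for name, score in beer_scores.items()
            if tokens(name) & words]
    return sorted(hits, key=lambda hit: -hit[1])

recs = name_match_recs(
    "lovely citrus aroma with a hazy pour",
    {"Citrus Bomb IPA": 4.2, "Hazy Daze": 4.6, "Dark Stout": 4.8},
)
```

Note that the highest-scored beer ("Dark Stout") is excluded because its name never appears in the user's phrasing: the name match adds the extra dimension, and the score only ranks within it.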
- 14. Cypher Development: Cosine Similarity
This query is designed to take the user on a "Beer Walk" using review scores
But... review scores are multi-dimensional and can go in different directions
◦ Some users might score the same beer more highly on palate vs aroma vs appearance
Because the overall review is in fact a vector of scores, the cosine similarity metric can be deployed
◦ This allows beers to be matched based on sharing similar overall patterns of scoring, not just the total or average score
When a user selects a beer, the algorithm will first find beers that have a similar pattern of review scoring
◦ Then a randomizer breaks the tie by selecting the next best beer
◦ Over multiple rotations, this algorithm will take the user on a weighted "random walk" through the beer landscape
This algorithm can be paired with the Cold Start as an ensemble or as a standalone
Overall, these two algorithms can be paired together to guide users through the "Universe of Beers" and pick the best matches while maintaining diversity
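The mechanics above can be sketched as follows: compare sub-score vectors with cosine similarity, then let a randomizer pick among the top matches to keep the walk diverse. The sub-score order (appearance, aroma, palate, taste) and all beer data are illustrative assumptions.

```python
import math
import random

# Sketch of the "Beer Walk": each beer's review is a vector of sub-scores
# (here assumed: appearance, aroma, palate, taste). Beers are matched on the
# *pattern* of scoring via cosine similarity, not the total or average score.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def next_beer(current, score_vectors, top_k=2, rng=random.Random(0)):
    """Rank the other beers by cosine similarity to the selected beer's
    score pattern, then break the tie randomly among the top-k so repeated
    steps produce a weighted "random walk" through the beer landscape."""
    ranked = sorted(
        ((cosine(score_vectors[current], vec), name)
         for name, vec in score_vectors.items() if name != current),
        reverse=True,
    )
    return rng.choice(ranked[:top_k])[1]

# Illustrative score vectors.
scores = {
    "Hazy Daze":   [4.0, 4.5, 4.5, 4.0],
    "Citrus Bomb": [4.0, 4.6, 4.4, 4.1],
    "Dark Stout":  [4.8, 3.0, 4.9, 3.2],
}
pick = next_beer("Hazy Daze", scores)
```

Because cosine similarity ignores vector magnitude, a user who scores everything half a point higher still lands near users with the same aroma-vs-palate profile, which is the point of matching on scoring pattern.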
- 16. Leveraging Subject Matter Expertise
While human preference is personal, our palate and tastes change as we mature and move toward the
level of "connoisseur"
Using large-scale machine learning assumes that the average level of "palate education" applies to all
beer drinkers
◦ A study on wine expertise has shown how wine tastes and choices change over time
◦ Without accommodating this, there is a strong potential that the model is aggregating away a lot of data
Using parameterized dimensional recommendations allows expert humans to define the key dimensions on
which beers should be matched to individuals
◦ Machine learning is fundamentally regressive and pegs the user to their prior state
◦ With this approach, we can leverage machines to speed up searches across a broad range of products while
leveraging the knowledge of beer experts instead of relying on the "herd" or "crowd" to make recommendations
- 17. Deploying to production
“It depends”
Determine your performance requirements and data size to dictate if you need to precompute or if
you can run queries on the fly
How and where are your recommendations generated?
How do you measure and evaluate success of the recommendations?
-> Use recommendations to generate more data and facts
Rule-based+
Dimensions can evolve along their own timescale because we treat them individually. In a
standard ML algorithm they must all evolve at the same rate
- 19. Conclusion
Rule-based+
Be careful of ROI on data science projects
A few expert-guided modules combined are likely better for highly dimensional data than
ML
Questions?