2. Introduction
There are two types of recommendation engines we can build:
1) Collaborative filtering.
2) Content filtering.
3. Collaborative filtering
Collaborative filtering is the process of filtering for information or patterns using
techniques involving collaboration among multiple agents, viewpoints, and data
sources.
In this case, we collect users' action data on all the product items. For example,
each time a user adds an item to the cart, views an item, or removes an item, the
action is stored as an individual mapping in the DB.
Once we have sufficient data, we run the collaborative filtering algorithm to
calculate the similarity between two items.
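The step above can be sketched as follows. This is a minimal illustration, not the algorithm a production system would necessarily use: it assumes the stored mappings are (user, item, action) rows, it uses cosine similarity as one common choice of measure, and the action names and weights in `ACTION_WEIGHTS` are hypothetical.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical action weights; the slides do not specify any.
ACTION_WEIGHTS = {"view": 1.0, "add_to_cart": 2.0, "remove_from_cart": -1.0}


def build_item_vectors(events):
    """Turn (user, item, action) rows into item -> {user: score} vectors."""
    vectors = defaultdict(lambda: defaultdict(float))
    for user, item, action in events:
        vectors[item][user] += ACTION_WEIGHTS.get(action, 0.0)
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity between two sparse {user: score} vectors."""
    shared_users = set(a) & set(b)
    dot = sum(a[u] * b[u] for u in shared_users)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


events = [
    ("u1", "A", "view"), ("u1", "B", "view"),
    ("u2", "A", "view"), ("u2", "B", "view"),
    ("u3", "C", "view"),
]
vectors = build_item_vectors(events)
sim_ab = cosine_similarity(vectors["A"], vectors["B"])  # same users, high score
sim_ac = cosine_similarity(vectors["A"], vectors["C"])  # no shared users
```

Items acted on by the same users score close to 1, and items with no users in common score 0.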
4. Content filtering
Content filtering is the process of calculating similarity based on the attributes of
two items.
In this case, we define the list of attributes to use for finding the similarity based
on how well the data covers each attribute. Once the list of attributes is finalized,
we match attribute values across the items to calculate the similarity.
This approach works well if we have good data coverage for the chosen list of
attributes.
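A minimal sketch of attribute matching, assuming each item is a dictionary of attribute values and similarity is the fraction of chosen attributes whose values match. The attribute names (`brand`, `color`, `size`) are hypothetical examples, not taken from the slides.

```python
def attribute_similarity(item_a, item_b, attributes):
    """Fraction of the chosen attributes on which both items share a value.

    A missing value never counts as a match, which is why poor data
    coverage for an attribute drags every score down.
    """
    if not attributes:
        return 0.0
    matches = 0
    for name in attributes:
        value_a = item_a.get(name)
        value_b = item_b.get(name)
        if value_a is not None and value_a == value_b:
            matches += 1
    return matches / len(attributes)


ATTRIBUTES = ["brand", "color", "size"]  # hypothetical finalized list
shirt_1 = {"brand": "Acme", "color": "blue", "size": "M"}
shirt_2 = {"brand": "Acme", "color": "red", "size": "M"}
shirt_3 = {"brand": "Acme"}  # poor coverage: color and size are missing

score_full = attribute_similarity(shirt_1, shirt_2, ATTRIBUTES)
score_sparse = attribute_similarity(shirt_1, shirt_3, ATTRIBUTES)
```

Here `shirt_3` matches on brand only, so sparse data lowers its score even though nothing about it actually conflicts with `shirt_1`.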
5. Use case
Burrow is a SaaS company that provides recommendations as a service. It
caters to product recommendations for e-commerce companies and content
recommendations for media companies.
A few of the scenarios that its product addresses are as follows:
* Suggest other items based on the items added to the cart
* Suggest items based on browsing patterns
* Suggest items based on what other people with similar demographics buy
6. Use case continued..
* Suggest items based on how other people with similar demographics browse.
* Suggest articles based on articles read
* Suggest articles based on reading patterns, such as time spent on an article
Any other associated scenarios can also be covered.
7. High level architecture for collaborative filtering
Architecture to use for calculating item-to-item similarity based on users' actions:
1) User: actions (add item, browse item, buy, etc.).
2) DB.
3) Run the collaborative filtering algorithm; we can use Apache Mahout here or any
other algorithm.
4) Store precomputed results.
5) Cache the result if required.
6) User.
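The precompute, store, and cache stages of this flow might look like the sketch below. It is an assumption-laden illustration: a plain dictionary (`PRECOMPUTED`) stands in for the results table, `functools.lru_cache` stands in for the optional cache, and the function names and the similarity scores in the usage lines are hypothetical.

```python
from functools import lru_cache

# Stand-in for the store of precomputed results; in practice a DB table.
PRECOMPUTED = {}


def precompute_top_n(similarity, items, n=5):
    """Batch step: store the n most similar items for every item."""
    for item in items:
        scored = [(other, similarity(item, other)) for other in items if other != item]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        PRECOMPUTED[item] = [other for other, score in scored[:n] if score > 0]


@lru_cache(maxsize=1024)
def recommend(item):
    """Serving step: read the precomputed results, cached in memory."""
    return tuple(PRECOMPUTED.get(item, ()))


# Hypothetical similarity scores, as a collaborative filtering run might produce.
SCORES = {frozenset(("A", "B")): 0.9, frozenset(("A", "C")): 0.1}


def similarity(a, b):
    return SCORES.get(frozenset((a, b)), 0.0)


precompute_top_n(similarity, ["A", "B", "C"], n=1)
top_for_a = recommend("A")
```

Because the cache holds results from the previous batch, `recommend.cache_clear()` would need to be called after each new precompute run.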
8. High level architecture CF continued..
To calculate recommendations from users with similar demographics (buy, browse),
we need to calculate user-to-user similarity based on the items bought or browsed.
This needs to be calculated at run time, as the number of user combinations will be
larger than the number of items. We can use the same approach as explained in the
previous slide; the difference is that we use the items bought or browsed as the
attributes for calculating user similarity.
Once we have calculated the similarity, we can filter the data on user’s location to
show recommended
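User-to-user similarity with a location filter can be sketched as below. The slides do not name a similarity measure, so Jaccard similarity over the sets of items bought or browsed is an assumption here, as are the profile layout (`items`, `location`) and the function names.

```python
def jaccard(items_a, items_b):
    """Shared items divided by all items seen by either user."""
    a, b = set(items_a), set(items_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_users(target_id, users, same_location=True):
    """Rank other users by item overlap, computed on request at run time.

    `users` maps user id -> {"items": [...], "location": str}.
    """
    target = users[target_id]
    scored = []
    for user_id, profile in users.items():
        if user_id == target_id:
            continue
        if same_location and profile["location"] != target["location"]:
            continue
        score = jaccard(target["items"], profile["items"])
        if score > 0:
            scored.append((user_id, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


users = {
    "u1": {"items": ["a", "b", "c"], "location": "Pune"},
    "u2": {"items": ["a", "b"], "location": "Pune"},
    "u3": {"items": ["a", "b", "c"], "location": "Delhi"},
    "u4": {"items": ["z"], "location": "Pune"},
}
nearby = similar_users("u1", users)
anywhere = similar_users("u1", users, same_location=False)
```

Items bought or browsed by the top-ranked users, and not yet seen by the target user, are the natural candidates to recommend.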
9. Content filtering
Item-to-item similarity based on the items' attributes list:
1) Define the attributes list.
2) Recommender: fetch data from the DB and calculate similarity based on the
defined attributes list; store precomputed data in the DB or any other storage.
3) Item's data: the DB the recommender fetches from.
4) Store precomputed results here.
5) User.
10. Content filtering continued..
We can use content filtering to calculate the similarity between articles or any
other media items.
The following attributes can be used:
1) Category.
2) Average time spent.
3) Number of views.
4) Any other attribute.
This list can serve as the attributes list in the architecture on the previous slide.
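Since this list mixes a categorical attribute with numeric ones, exact matching alone is not enough. The sketch below is one possible way to combine them, assuming non-negative numeric values and equal weight per attribute. The field names (`category`, `avg_time_spent`, `views`) and the closeness formula are illustrative assumptions.

```python
def numeric_closeness(x, y):
    """1.0 when equal, falling towards 0.0 as the values diverge.

    Assumes non-negative inputs such as durations or view counts.
    """
    largest = max(x, y)
    if largest == 0:
        return 1.0
    return 1.0 - abs(x - y) / largest


def article_similarity(a, b):
    """Average of a category match and two numeric closeness scores."""
    parts = [
        1.0 if a["category"] == b["category"] else 0.0,
        numeric_closeness(a["avg_time_spent"], b["avg_time_spent"]),
        numeric_closeness(a["views"], b["views"]),
    ]
    return sum(parts) / len(parts)


article_1 = {"category": "sports", "avg_time_spent": 120, "views": 1000}
article_2 = {"category": "sports", "avg_time_spent": 60, "views": 500}
article_3 = {"category": "politics", "avg_time_spent": 120, "views": 1000}

same_category = article_similarity(article_1, article_2)
other_category = article_similarity(article_1, article_3)
```

Equal weighting is the simplest choice; in practice the weights would likely be tuned so that, for example, category counts for more than view counts.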