User behavior based
THE PARSERS ( Kunal, Karan & Anupam )
How we do it presently
• Presently we use Solr content-based recommendation.
• Solr does content matching and gives recommendations
based on content.
• We don’t know how relevant the suggested documents
are to the user.
• Track users' browsing behavior.
• Whenever a user visits a slideshow, generate an
event which contains the
slideshow_id, user_id, platform
• These data would be used to analyze the related
• Apache Mahout uses collaborative filtering and
gives us the recommendations.
• The new set of recommendations might be only
loosely coupled in terms of content, but it is a set of
slideshows which have a higher priority for the
• This would help us attain better engagement.
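A minimal sketch of the idea, assuming the event carries only the three fields named above. The function names (make_view_event, cooccurrence_scores, recommend) are hypothetical, and the scoring is plain co-occurrence counting ("users who viewed A also viewed B"), a simplified stand-in for collaborative filtering rather than Mahout's actual algorithm.

```python
from collections import defaultdict
from itertools import combinations


def make_view_event(user_id, slideshow_id, platform):
    """Build the tracking event for one slideshow visit (hypothetical shape)."""
    return {"user_id": user_id, "slideshow_id": slideshow_id, "platform": platform}


def cooccurrence_scores(events):
    """For each pair of slideshows, count how many users viewed both."""
    viewed = defaultdict(set)
    for event in events:
        viewed[event["user_id"]].add(event["slideshow_id"])

    scores = defaultdict(lambda: defaultdict(int))
    for slideshows in viewed.values():
        for a, b in combinations(sorted(slideshows), 2):
            scores[a][b] += 1
            scores[b][a] += 1
    return scores


def recommend(scores, slideshow_id, top_n=3):
    """Return the slideshows most often co-viewed with slideshow_id."""
    related = scores.get(slideshow_id, {})
    # Highest count first; ties broken by slideshow id for a stable order.
    return sorted(related, key=lambda s: (-related[s], s))[:top_n]


events = [
    make_view_event("u1", 10, "web"),
    make_view_event("u1", 20, "web"),
    make_view_event("u2", 10, "mobile"),
    make_view_event("u2", 20, "mobile"),
    make_view_event("u3", 10, "web"),
    make_view_event("u3", 30, "web"),
]
scores = cooccurrence_scores(events)
print(recommend(scores, 10))
```

Note that nothing here looks at slideshow content: two slideshows are related only because the same users viewed both.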
Steps we took
• Set up a one-node Hadoop cluster
• Imported one week of user behavior logs in the format
• (user_id, slideshow_id)
• Set up Apache Mahout and compiled it using Maven
• Apache Mahout analyzed it for 2-3 hrs and generated
output in the format
• (slideshow_id, recommended_slideshow_id, relevance_score)
• Used this output to show the recommended content.
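The two data-handling ends of these steps can be sketched as follows. This assumes both files are plain comma-separated text, one record per line; the real Mahout input and output layout may differ. The helper names (to_input_lines, load_recommendations) are hypothetical.

```python
import csv
from collections import defaultdict


def to_input_lines(events):
    """Reduce tracking events to 'user_id,slideshow_id' lines (assumed input layout)."""
    return [f"{e['user_id']},{e['slideshow_id']}" for e in events]


def load_recommendations(lines, top_n=5):
    """Parse 'slideshow_id,recommended_slideshow_id,relevance_score' lines.

    Returns a dict mapping each slideshow id to its recommended ids,
    ordered by descending relevance score and cut off at top_n.
    """
    pairs_by_slideshow = defaultdict(list)
    for row in csv.reader(lines):
        if len(row) != 3:
            continue  # skip malformed lines
        slideshow_id, recommended_id, score = (field.strip() for field in row)
        pairs_by_slideshow[slideshow_id].append((recommended_id, float(score)))

    return {
        slideshow_id: [rid for rid, _ in sorted(pairs, key=lambda p: -p[1])[:top_n]]
        for slideshow_id, pairs in pairs_by_slideshow.items()
    }


sample_output = ["10,20,0.90", "10,30,0.95", "20,10,0.90", "bad line"]
recommendations = load_recommendations(sample_output, top_n=2)
print(recommendations["10"])
```

Loading the output into a lookup table like this lets the page that shows a slideshow fetch its recommendations by id without re-running the analysis.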