Recommendation engine


  • 1. Project proposal – Recommendation Engine. Parinita, Masters in Computational Linguistics, University of Washington
  • 2. Recommendation Engine
    • Proposing two types of recommendation engine:
        • 1) Proactive: query-less recommendation engine (based on an Automated Collaborative Filtering mechanism)
        • 2) Reactive: query-based content personalization
    • Both are based on user personalization, with user data gathered from web server logs
    • This data is used to learn the implicit and explicit preferences of individual users, which are then used to personalize their information retrieval
    • Each user profile records relevancy information to discriminate between those jobs that the user merely looks at or considers, and those that she is truly interested in
    • Graded user profiles make it possible to 1) recommend jobs matching a user's interests based on what similar users have previously liked, and 2) supplement each user’s search queries with additional relevant search terms, and filter the retrieval results to weed out irrelevant hits
  • 3. Part 1: Query-less recommendation service
    • LinkedIn has high-quality user profiles available
    • These are a good basis for a proactive, personalized and intelligent model of information access
    • Goal: proactively recommend new jobs to users based on what similar users have previously liked
    • Profiles are constructed automatically and passively by mining server logs
    Fig1: From server logs to graded user profile
  • 4. Relevancy information can be extracted by preprocessing web server logs
    • Each log entry records a single job access by a user
    • It encodes details such as the time and type of access, and the job and user IDs
    • “Revisit data”: the number of times a user has accessed a job indicates their interest in it
    • Misleading data such as "irritation clicks" (repeated clicks on a job description while it is downloading) is removed
    • “Read-time”: the time difference between successive requests by the same user
    • Spurious read-times (for example, those caused by a logoff) are eliminated
    • “Activity data”: use of the online application or email facility is taken as a measure of high relevancy
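The preprocessing steps on this slide can be sketched roughly as follows. This is only an illustration, not the scheme from the cited papers: the `LogEntry` fields, the action labels, and both thresholds (`IRRITATION_WINDOW`, `MAX_READ_TIME`) are assumptions chosen for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class LogEntry:
    user_id: str
    job_id: str
    timestamp: float  # seconds since some reference point
    action: str       # "read", "apply" or "email" (hypothetical labels)


IRRITATION_WINDOW = 10.0  # assumed: repeat reads of a job within 10 s count once
MAX_READ_TIME = 1800.0    # assumed: longer gaps are treated as a logoff


def build_profiles(entries):
    """Return {user: {job: {"revisits", "read_time", "activity"}}}."""
    profiles = defaultdict(lambda: defaultdict(
        lambda: {"revisits": 0, "read_time": 0.0, "activity": False}))
    by_user = defaultdict(list)
    for entry in entries:
        by_user[entry.user_id].append(entry)

    for user, log in by_user.items():
        log.sort(key=lambda e: e.timestamp)
        last_read = {}  # job_id -> timestamp of the most recent read
        for i, entry in enumerate(log):
            item = profiles[user][entry.job_id]
            if entry.action in ("apply", "email"):
                item["activity"] = True  # activity data: high relevancy
                continue
            # Revisit data: skip reads that closely follow a read of the same job.
            previous = last_read.get(entry.job_id)
            if previous is None or entry.timestamp - previous > IRRITATION_WINDOW:
                item["revisits"] += 1
            last_read[entry.job_id] = entry.timestamp
            # Read-time: gap until the same user's next request, unless spurious.
            if i + 1 < len(log):
                gap = log[i + 1].timestamp - entry.timestamp
                if gap <= MAX_READ_TIME:
                    item["read_time"] += gap
    return profiles
```

How the three signals are combined into a single grade per job is left open here; a weighted sum would be one option.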
  • 5. Recommendations can be made based on similarity between user profiles
    • A set of users related to the target user is identified
    • Profile items from these users that are not in the target profile are ranked for recommendation to the target user
    • Similarity measures:
        • The degree of overlap between their profile items
        • The correlation coefficient between their grading lists, whereby a k-nearest-neighbor (k-NN) strategy is used
    Fig. 2 Direct vs. Indirect User Relationships
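As a rough illustration of the two similarity measures and the k-NN recommendation step, the sketch below treats a profile as a hypothetical `{job_id: grade}` dictionary. The Jaccard form of the overlap, the Pearson form of the correlation, and the similarity-weighted ranking are assumptions made for the example; the slides do not fix these details.

```python
from math import sqrt


def overlap(p, q):
    """Jaccard overlap between the job sets of two profiles."""
    a, b = set(p), set(q)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pearson(p, q):
    """Pearson correlation of grades over the jobs both users have graded."""
    common = sorted(set(p) & set(q))
    n = len(common)
    if n < 2:
        return 0.0
    xs = [p[j] for j in common]
    ys = [q[j] for j in common]
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def recommend(target, profiles, k=2, sim=pearson):
    """Rank jobs unseen by `target`, using its k most similar users."""
    scored = [(sim(profiles[target], prof), user)
              for user, prof in profiles.items() if user != target]
    neighbours = sorted(scored, reverse=True)[:k]
    scores = {}
    for s, user in neighbours:
        if s <= 0:
            continue  # ignore uncorrelated or negatively correlated users
        for job, grade in profiles[user].items():
            if job not in profiles[target]:
                scores[job] = scores.get(job, 0.0) + s * grade
    return sorted(scores, key=lambda j: (-scores[j], j))
```

Passing `sim=overlap` instead of the default switches the neighbour selection to the overlap measure.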
  • 6. Two approaches - direct and indirect
    • Direct relationships:
    • Reliance on direct user relationships, for example between A and B
    • Recommendations may be based on a small number of profiles with low degrees of similarity
    • May even result in no recommendations
    • Ignores potential indirect relationships between users. C may have the same job taste as A, but as C has seen a different set of jobs, this will not be recognized.
    • Indirect relationships:
    • An indirect, transitive relationship: if user B is directly related to users A and C, then A and C are indirectly related
    • Users are grouped prior to recommendation: profiles are clustered into virtual communities such that all of the users in a given community are related
    • The single-link clustering technique can be used with a thresholded version of the similarity metric
    • Each community is a maximal set of users such that every user has a similarity value greater than the threshold with at least one other community member
    • Each target user is recommended the most frequently occurring jobs in their virtual community
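Single-link clustering with a similarity threshold amounts to finding the connected components of a graph whose edges join users with similarity above the threshold. A minimal sketch, assuming profiles are plain sets of job IDs and using a union-find structure (both are choices made for this example):

```python
from collections import Counter


def communities(users, sim, threshold):
    """Group users into virtual communities by thresholded single-link clustering."""
    parent = {u: u for u in users}

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    for i, u in enumerate(users):
        for v in users[i + 1:]:
            if sim(u, v) > threshold:
                parent[find(u)] = find(v)  # merge the two communities

    groups = {}
    for u in users:
        groups.setdefault(find(u), set()).add(u)
    return list(groups.values())


def recommend_from_community(target, profiles, groups, n=3):
    """Most frequent jobs in the target's community that the target has not seen."""
    community = next(g for g in groups if target in g)
    counts = Counter(job
                     for user in community if user != target
                     for job in profiles[user]
                     if job not in profiles[target])
    return [job for job, _ in counts.most_common(n)]
```

With this grouping, A and C end up in the same community whenever both are linked to B, even if A and C share no jobs directly.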
  • 7. Part 2: Case-Based User Profiling for Content Personalization
    • Two-step personalized retrieval engine
    • When a user enters a new search query, a server-side similarity-based search engine is used to select a set of similar job cases
    • This is followed by personalization, a post-processing retrieval task where the result set is compared to a user profile in order to filter out irrelevant jobs
  • 8. Step 1: Similarity-Based Retrieval
    • Uses a similarity-based retrieval technique rather than an exact-match technique
    • Each job case is made up of a fixed set of features such as job type, salary, key skills and minimum experience
    • The similarity between each job case and the target query is computed
    • The key-skills feature contains symbolic values, represented as concept trees
    • Symbolic feature similarity is based on subsumption relationships and on the distances between nodes in these trees
  • 9. Step 1: Similarity-Based Retrieval (continued)
    • Subsumption: any job containing a concept that is a descendant of the query's node in the tree is taken to be an exact match
    • Concept proximity: the closer two concepts are in the tree, the more similar they are
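The subsumption and proximity rules can be illustrated with a small concept tree. Everything specific here is assumed for the example: the tree contents in `PARENT`, and the `1 / (1 + distance)` decay, which is just one simple way to make closer concepts score higher.

```python
# Hypothetical key-skills concept tree, stored as child -> parent.
PARENT = {
    "programming": None,
    "oop": "programming",
    "functional": "programming",
    "java": "oop",
    "cpp": "oop",
    "haskell": "functional",
}


def ancestors(node):
    """Path from a node up to the root, starting with the node itself."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def concept_similarity(query, job):
    """Similarity in [0, 1] between a query concept and a job concept."""
    job_path = ancestors(job)
    if query in job_path:
        return 1.0  # subsumption: the job concept is the query or a descendant
    job_depth = {node: steps for steps, node in enumerate(job_path)}
    for steps, node in enumerate(ancestors(query)):
        if node in job_depth:
            # Edge count of the path through the lowest common ancestor.
            distance = steps + job_depth[node]
            return 1.0 / (1 + distance)
    return 0.0  # no common ancestor (not possible in a single tree)
```

A query for "oop" therefore matches a "java" job exactly, while a query for "java" scores a "cpp" job higher than a "haskell" job because the two are closer in the tree.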
  • 10. Step 2: Result Personalization
    • Each retrieved job is classified as either relevant or not relevant
    • A nearest-neighbor classification algorithm uses the graded job cases in the target user’s profile as training data
    • Compare a candidate job to each profile job case, using a similarity metric to locate the k nearest profile jobs
    • Take the majority classification of the nearest neighbors
    • Analyze the server logs again to improve the recommendations (cyclical process)
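The classification step can be sketched as a small k-NN majority vote. The feature names, the labels, and the feature-matching similarity in `case_similarity` are placeholders chosen for this example; a fuller version would plug in the Step 1 case similarity instead. The profile is assumed to be a non-empty list of `(job_case, label)` pairs.

```python
from collections import Counter


def case_similarity(a, b):
    """Fraction of features on which two job cases agree (illustrative metric)."""
    keys = set(a) | set(b)
    if not keys:
        return 0.0
    return sum(a.get(key) == b.get(key) for key in keys) / len(keys)


def classify(candidate, profile, k=3):
    """Majority label among the k profile cases most similar to the candidate."""
    nearest = sorted(profile,
                     key=lambda graded: case_similarity(candidate, graded[0]),
                     reverse=True)[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def personalize(results, profile, k=3):
    """Keep only the retrieved jobs that are classified as relevant."""
    return [job for job in results if classify(job, profile, k) == "relevant"]
```

An odd `k` avoids tied votes when there are only two labels.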
  • 11. Bibliography:
    • Rachael Rafter, Keith Bradley, Barry Smyth. Automated Collaborative Filtering Applications for Online Recruitment Services. Smart Media Institute, Department of Computer Science, University College Dublin, Ireland
    • Keith Bradley, Rachael Rafter, Barry Smyth. Case-Based User Profiling for Content Personalisation. Smart Media Institute, Department of Computer Science, University College Dublin, Ireland
    • Robert Krauthgamer, James R. Lee. Navigating Nets: Simple Algorithms for Proximity Search
    • Susan Gauch, Mirco Speretta, Aravind Chandramouli, Alessandro Micarelli. User Profiles for Personalized Information Access. Electrical Engineering and Computer Science, Information & Telecommunication Technology Center, Lawrence, Kansas
    • Another interesting approach :