Developing A Recommender System
On The Basis of
MovieLens Data Set
Under the Guidance of:
Mr. M VenuGopal Reddy
(Associate Prof.)
BATCH VIII
Sindhu Valavoju (10016T0920)
Saigurudatta P.V (10016T0945)
Tharun Katanguru (10016T0905)
Mounika Parsha (10016T0904)
Vikram Konatham (10016T0951)
Abstract
Recommender System
Types of Recommender Systems
Criteria Followed
Hybrid Recommender System
Existing System
Proposed System
Data Flow Diagram
Expected Outcome
ABSTRACT:
The basic approach proposed for developing a
recommender system is COLLABORATIVE FILTERING (CF).
In CF, recommendations for a user are computed from the ratings
of the k nearest neighbours.
A virtual user is not a separate person: it stands for the ratings
that one user has given within a single product category.
Initially, the ratings given by a user are divided by product
category, and each group of ratings is treated as given by a
VIRTUAL USER.
The recommendations for the virtual users that correspond to the
target user are then combined into the final recommendation.
This improves the performance of the recommender system and
the efficiency of computing the ‘k’ nearest neighbours.
A recommender system receives information from the
user and recommends the product that best fits their
needs.
Recommender systems have become a key
component of modern e-commerce applications.
A collaborative filtering approach has been proposed to
build the recommender system.
The data set contains three files: movies.dat, ratings.dat
and users.dat. Also included are scripts for generating
subsets of the data to support rating predictions.
E.g.: amazon.com uses a recommender system to suggest
books to its users.
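A minimal sketch of reading the rating and movie files follows. The slides do not describe the file layout, so the `::` field separator, the field order (`UserID::MovieID::Rating::Timestamp` and `MovieID::Title::Genres`) and the `|` genre separator are assumptions based on the usual MovieLens .dat format. The helper names and the sample lines are hypothetical.

```python
from collections import defaultdict


def parse_ratings(lines):
    """Parse 'UserID::MovieID::Rating::Timestamp' lines into {user: {movie: rating}}."""
    ratings = defaultdict(dict)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        user_id, movie_id, rating, _timestamp = line.split("::")
        ratings[int(user_id)][int(movie_id)] = int(rating)
    return dict(ratings)


def parse_movies(lines):
    """Parse 'MovieID::Title::Genre1|Genre2' lines into {movie: (title, [genres])}."""
    movies = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        movie_id, title, genres = line.split("::")
        movies[int(movie_id)] = (title, genres.split("|"))
    return movies


# Hypothetical sample lines in the assumed layout (not copied from the real files).
sample_ratings = ["1::10::5::978300760", "1::20::3::978302109", "2::10::4::978301968"]
sample_movies = [
    "10::Example Movie A (1995)::Comedy|Drama",
    "20::Example Movie B (1996)::Action",
]

ratings = parse_ratings(sample_ratings)
movies = parse_movies(sample_movies)
```

With the real data, the same functions could be fed an open file, for example `parse_ratings(open("ratings.dat"))`. The files may need an explicit text encoding.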
Algorithm Criteria:
1. Quality of Predictions
2. Speed/Scalability
3. Easily Updated
Secondary Criteria:
Cold start ability
Sparse data handling
Content-based (E.g.: movielens.org)
objects are defined by their associated features.
Utility-based (E.g.: last.fm.com)
suggestions are based on a computation of the utility
of each object for the user.
Knowledge-based (E.g.: whattorent.com)
functional knowledge: how a particular item meets a
particular need.
Recommender systems typically produce a list of
recommendations in one of two ways.
COLLABORATIVE FILTERING: It is a process
of filtering information or patterns using techniques
that involve collaboration among multiple users.
CONTENT-BASED FILTERING: It is based on the
characteristics of the items that are to be
recommended. In particular, candidate items are
compared with items previously rated by the user and
the best matches are recommended.
Hybrid recommender systems combine multiple methods in order to
take advantage of their strengths and alleviate their drawbacks.
1. Weighted
▫ Scores/votes of several recommendation techniques are combined
to produce a single recommendation
2. Switching
▫ The system switches between recommendation techniques
depending on the current situation
3. Mixed
▫ Recommendations from several different recommenders are
presented at the same time
4. Feature combination
▫ Features from different recommendation data sources are
combined in a single recommendation algorithm
5. Cascade
▫ One recommender refines the recommendations given by
another
6. Feature augmentation
▫ The output of one technique is used as an input feature to
another
7. Meta-level
▫ The model learned by one recommender is used as input to
another
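As an illustration of the first strategy in the list above (weighted), the sketch below adds up per-item scores from two recommenders using fixed weights. The function name, the equal weights and the scores are hypothetical. The slides do not say how the weights would be chosen.

```python
def weighted_hybrid(score_lists, weights):
    """Combine per-item scores from several recommenders into one ranking."""
    combined = {}
    for scores, weight in zip(score_lists, weights):
        for item, score in scores.items():
            combined[item] = combined.get(item, 0.0) + weight * score
    # Highest combined score first; ties are broken by item name.
    return sorted(combined, key=lambda item: (-combined[item], item))


# Hypothetical scores from a collaborative and a content-based recommender.
cf_scores = {"A": 0.9, "B": 0.4, "C": 0.1}
content_scores = {"A": 0.2, "B": 0.8, "D": 0.6}

ranking = weighted_hybrid([cf_scores, content_scores], [0.5, 0.5])
```

An item that only one recommender scores (C and D here) simply gets no contribution from the other one.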
Collaborative filtering approaches are of two types:
1. Memory-based CF: These systems compute
recommendations directly from the stored rating history of the users.
2. Model-based CF: These systems compute predictions from a
model of the users and items.
In CF, recommendations for a target user are computed from
the ratings of the ‘k’ nearest neighbours.
It has three steps:
1. Data Representation,
2. Neighbourhood Formation,
3. Generating Recommendations.
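The three steps above can be sketched for memory-based CF as follows. Cosine similarity and the similarity-weighted mean are assumptions, since the slides do not name a similarity measure or a prediction formula. The rating data is a hypothetical toy example.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse rating dicts {item: rating}."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def k_nearest(ratings, target, k):
    """Step 2: rank the other users by similarity to the target, keep the top k."""
    sims = [(cosine(ratings[target], r), u) for u, r in ratings.items() if u != target]
    sims.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(u, s) for s, u in sims[:k] if s > 0]


def recommend(ratings, target, k, n):
    """Step 3: score unseen items by the similarity-weighted mean of neighbour ratings."""
    num, den = {}, {}
    for user, sim in k_nearest(ratings, target, k):
        for item, rating in ratings[user].items():
            if item not in ratings[target]:
                num[item] = num.get(item, 0.0) + sim * rating
                den[item] = den.get(item, 0.0) + sim
    scores = {item: num[item] / den[item] for item in num}
    return sorted(scores, key=lambda item: (-scores[item], item))[:n]


# Step 1: data representation as {user: {item: rating}} (hypothetical data).
ratings = {
    "u1": {"m1": 5, "m2": 4},
    "u2": {"m1": 5, "m2": 4, "m3": 5},
    "u3": {"m1": 1, "m2": 5, "m4": 2},
    "u4": {"m5": 4},
}

neighbours = k_nearest(ratings, "u1", k=2)
top = recommend(ratings, "u1", k=2, n=2)
```

Here u4 shares no rated item with u1, so it never becomes a neighbour, whatever the value of k.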
Existing System:
The existing approach is inefficient because:
The comparison is done at the user-user level, so
there is a chance of missing near neighbours.
A user may or may not request all categories of
items.
Improving the performance of the collaborative
filtering approach is the main research issue.
The proposed approach improves memory-based CF by exploiting
the categories of the products.
Here the ratings given by a user are divided into sub-groups based
on the category of the products.
The ratings in each sub-group are considered to be given by a
virtual user.
The recommendations for the target user can then be computed
by applying collaborative filtering at the category level, where the
comparison is done to find the near neighbours.
This filtering is named Category-Based Collaborative
Filtering (CCF).
Like CF, it has three steps.
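The division of a user's ratings into virtual users described above can be sketched as follows. The function name and the data are hypothetical. The sketch also assumes that every item belongs to exactly one category, which the slides do not state.

```python
from collections import defaultdict


def split_into_virtual_users(user_ratings, item_category):
    """Split each user's ratings into one virtual user per product category.

    Returns {(user, category): {item: rating}}.
    """
    virtual = defaultdict(dict)
    for user, ratings in user_ratings.items():
        for item, rating in ratings.items():
            virtual[(user, item_category[item])][item] = rating
    return dict(virtual)


# Hypothetical data: one category per item.
item_category = {"m1": "Comedy", "m2": "Comedy", "m3": "Action"}
user_ratings = {"alice": {"m1": 5, "m3": 2}, "bob": {"m2": 4}}

virtual_users = split_into_virtual_users(user_ratings, item_category)
```

A user who rated items in two categories becomes two virtual users. Each of them can then be compared only with virtual users of the same category.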
Proposed System:
Data Representation: In CCF a user is fragmented into
virtual users based on the category of the rated products.
Neighbourhood Generation: The neighbourhood is formed by
processing the ratings of all corresponding virtual users.
Generating Recommendations: Recommendations are generated
for each corresponding virtual user.
These are then combined to generate the final
recommendations for the target user.
1. Random Approach: Combine the recommendations of all
the virtual users and then pick the top N
recommendations for the target user.
2. Ranking Approach: Select the top P ranked virtual users
and then follow the random approach to find the top N
recommendations.
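The two combination approaches above can be sketched as follows. Two details are assumptions. The pooled items are ordered by frequency count, as the project module list suggests. The virtual users are ranked by a score passed in as `rank_score`, because the slides do not say how they are ranked. All names and data are hypothetical.

```python
from collections import Counter


def combine_all(recs_by_virtual_user, n):
    """'Random' approach: pool every virtual user's list, keep the N most frequent items."""
    counts = Counter(item for recs in recs_by_virtual_user.values() for item in recs)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ordered[:n]]


def combine_top_p(recs_by_virtual_user, rank_score, p, n):
    """'Ranking' approach: keep the P best-ranked virtual users, then pool as above."""
    best = sorted(recs_by_virtual_user, key=lambda v: (-rank_score[v], v))[:p]
    return combine_all({v: recs_by_virtual_user[v] for v in best}, n)


# Hypothetical recommendation lists for three virtual users of one target user.
recs = {
    "User1/X": ["a", "b", "c"],
    "User1/Y": ["b", "d"],
    "User1/Z": ["d", "e"],
}
# Hypothetical ranking score per virtual user (for example, number of ratings).
rank_score = {"User1/X": 12, "User1/Y": 7, "User1/Z": 1}

random_top = combine_all(recs, n=2)
ranked_top = combine_top_p(recs, rank_score, p=2, n=2)
```

Dropping the lowest-ranked virtual user changes the result in this example: item d loses one of its two votes and falls out of the top 2.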
Let us consider a set of users of which the target user is User1. Let the
categories of books be X and Y.
So the set of near neighbours must be considered. In the case of CF, the near
neighbour will be User4.
Whereas in the case of category-based CF, along with User4, User5 and User3
are also near neighbours.
USER    X1  X2  X3  Y2  Y3
User1    1   0   1   1   0
User2    1   1   0   0   0
User3    0   1   1   1   0
User4    1   0   1   0   1
User5    0   1   0   1   0
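The table above can be checked with a short sketch that compares User1 with the other users, first over all columns and then per category. Cosine similarity is an assumption, since the slides do not name the measure. Under it, the category-level neighbours match the example (User4 in category X, User3 and User5 in category Y). At the user level, User3 and User4 come out equal, so which of them plain CF picks would depend on the measure or tie-breaking actually used.

```python
from math import sqrt

ITEMS = ["X1", "X2", "X3", "Y2", "Y3"]
TABLE = {
    "User1": [1, 0, 1, 1, 0],
    "User2": [1, 1, 0, 0, 0],
    "User3": [0, 1, 1, 1, 0],
    "User4": [1, 0, 1, 0, 1],
    "User5": [0, 1, 0, 1, 0],
}


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarities(target, columns):
    """Similarity of every other user to the target, restricted to the given columns."""
    idx = [ITEMS.index(c) for c in columns]
    t = [TABLE[target][i] for i in idx]
    return {
        user: round(cosine(t, [row[i] for i in idx]), 3)
        for user, row in TABLE.items()
        if user != target
    }


user_level = similarities("User1", ITEMS)               # plain CF view
category_x = similarities("User1", ["X1", "X2", "X3"])  # virtual user for category X
category_y = similarities("User1", ["Y2", "Y3"])        # virtual user for category Y
```

In category X, User4 has the same ratings as User1 (similarity 1.0). In category Y, User3 and User5 do. This is the effect the example describes: near neighbours within a category become visible even when their overall profiles differ.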
Project Modules:
▪ Collection of the data set:
Includes the movies, ratings and users data sets.
▪ Categorization of products:
The products are classified based on their genre.
▪ Division of ratings:
Classifying the quality of the ratings given by each user.
▪ Creation of virtual users:
Virtual users are created for each category.
▪ Analyzing the virtual users:
Ratings of the virtual users are compared to find neighbours.
▪ Finding the ‘k’ nearest neighbours:
The top recommenders are selected as the nearest neighbours.
▪ Generation of recommendations using CCF:
Recommendations are generated based on the frequency
count.
▪ Coding.
▪ Documentation.
Requirements:
▪ Hardware
o Memory: 512 MB
o Processor: P4
o Hard Disk: 40 GB
▪ Software
o OS: Windows
o Front End: Apache Tomcat (Java)
o Back End: Oracle
o Design Layout: NetBeans
Flow Chart (figure; nodes in order): Start; User Profile; Ratings; Product Profile; Match; Recommendations Generated; Display “No Recommendations Found”; END.
Data Flow Diagram (figure; elements): User; Rating; User Profile; Product; Product Profile; Recommender System; Virtual User Generation; Recommendations Generated.
Expected Outcome:
We made an effort to improve the performance of the CF
approach, which is used to build
recommender systems.
We have proposed a framework in which each user is
divided into virtual users based on the categories of
the products rated.
The proposed approach divides each user into
corresponding virtual users, computes
recommendations for each virtual user and combines
these recommendations to give recommendations to
the target user more efficiently than plain CF.
Advantages:
The proposed approach can make the system more
efficient than plain collaborative
filtering.
The concept of virtual users can also improve the
computation of the nearest neighbours in
item-based, model-based and
other CF approaches.