Movies Recommendation System
Submitted in partial fulfillment of the requirements
of the degree of
T. E. Computer Engineering
By
Suraj R. Maurya Roll No. 56 PID: 182262
Om V. Pise Roll No: 58 PID: 172074
Guide (s):
Mr. Rupesh Mishra
Asst. Professor
Department of Computer Engineering
St. Francis Institute of Technology
(Engineering College)
University of Mumbai
2019-2020
CERTIFICATE
This is to certify that the project entitled “Movies Recommendation System” is a bonafide
work of “Suraj Maurya (Roll No: 56), Om Pise (Roll No: 58)” submitted to the University of
Mumbai in partial fulfillment of the requirement for the award of the degree of T.E. in Computer
Engineering
(Mr. Rupesh Mishra)
Guide
(Dr. Kavita Sonawane)
Head of Department
Project Report Approval for T.E.
This project report entitled Movies Recommendation System by Mr. Suraj Maurya, Mr. Om Pise, is approved for the degree of T.E. in Computer Engineering.
Examiners
1.---------------------------------------------
2.---------------------------------------------
Date:
Place: Mumbai
Declaration
I/We declare that this written submission represents my
ideas in my own words and where other’s ideas or words have been included, I
have adequately cited and referenced the original sources. I also declare that I have
adhered to all principles of academic honesty and integrity and have not
misrepresented or fabricated or falsified any idea/data/fact/source in my
submission. I understand that any violation of the above will be cause for
disciplinary action by the Institute and can also evoke penal action from the
sources which have thus not been properly cited or from whom proper permission
has not been taken when needed.
-----------------------------------------
(Signature)
Suraj Maurya Roll no: 56
Om Pise Roll no: 58
Date:
Abstract
A recommendation engine filters the data using different algorithms and recommends the
most relevant items to users. It first captures the past behavior of a customer and based on
that, recommends products which the users might be likely to buy. If a completely new user
visits an e-commerce site, that site will not have any past history of that user. So how does
the site go about recommending products to the user in such a scenario? One possible
solution could be to recommend the best selling products, i.e. the products which are high in
demand. Another possible solution could be to recommend the products which would bring
the maximum profit to the business. Three main approaches are used for our recommender
systems. The first is demographic filtering, which offers generalized recommendations to every
user based on movie popularity and/or genre. The system recommends the same movies to
users with similar demographic features. Since each user is different, this approach is
considered to be too simple. The basic idea behind this system is that movies that are more
popular and critically acclaimed will have a higher probability of being liked by the average
audience. The second is content-based filtering, where we try to profile the user's interests using
the information collected, and recommend items based on that profile. The third is collaborative
filtering, where we try to group similar users together and use information about the group to
make recommendations to the user.
Contents
Chapter Title
1 INTRODUCTION
1.1 Project Description
1.2 Problem Formulation
1.3 Motivation
1.4 Proposed Solution
1.5 Scope of the project
2 REVIEW OF LITERATURE
3 SYSTEM ANALYSIS
3.1 Functional Requirements
3.2 Non Functional Requirements
3.3 Specific Requirements
3.4 Use-Case Diagrams and description
4 ANALYSIS MODELING
4.1 Activity Diagrams, Class Diagram
4.2 Functional Modeling
5 DESIGN
5.1 Architectural Design
5.2 User Interface Design
6 IMPLEMENTATION
6.1 Algorithms / Methods Used
6.2 Working of the project
7 CONCLUSIONS
References
Acknowledgement
List of Figures
Fig. No. Figure Caption
1. Use Case Diagram for Movies Recommendation system
2. Class Diagram for Movies Recommendation system
3. Activity Diagram for Movies Recommendation system
4. Context Level DFD for Movies Recommendation system
5. Level 0 DFD for Movies Recommendation system
6. Level 1 DFD for Movies Recommendation system
7. Architecture for Movies Recommendation system
8. Main Window (GUI)
List of Abbreviations
Sr. No. Abbreviation Expanded form
1. DFD Data Flow Diagram
2. CFS Collaborative filtering system
Chapter 1
Introduction
1.1 Description
1. A recommendation system is a type of information filtering system which attempts to
predict the preferences of a user, and make suggestions based on these preferences.
There are a wide variety of applications for recommendation systems.
2. These have become increasingly popular over the last few years and are now utilized in
most online platforms that we use.
3. The content of such platforms varies from movies, music, books and video, to friends
and stories on social media platforms, to products on e-commerce websites, to people
on professional and dating websites, to search results returned on Google.
4. Often, these systems are able to collect information about a user’s choices, and can use
this information to improve their suggestions in the future.
5. For example, if Amazon observes that a large number of customers who buy the latest
Apple MacBook also buy a USB-C to USB adapter, they can recommend the adapter
to a new user who has just added a MacBook to their cart.
1.2 Problem Formulation
The movie recommendation system will be built using artificial intelligence algorithms that
analyze a user's favorite genres and recommend movies according to their liking. The user
submits queries describing the kinds of movies they like. The system analyses these preferences
and then recommends movies to the user. The underlying problem is to provide related content
to users of online service providers out of a collection of relevant and irrelevant items. Netflix
aims to recommend movies to users based on the content of items rather than on other users' opinions.
1.3 Motivation
A recommendation system also finds similarities between different products. For example, the
Netflix recommendation system provides you with recommendations of movies that are
similar to the ones you have watched in the past. Furthermore, collaborative filtering
provides you with recommendations based on other users who might have a similar viewing
history or preferences. There are two types of recommendation systems: content-based
recommendation systems and collaborative filtering recommendation systems. In this project,
a recommendation system in R, we will work on a collaborative filtering recommendation
system and, more specifically, an item-based collaborative recommendation system.
1.4 Proposed Solution
The proposed movie recommendation system is based on the maximal clique
method. We propose using k-cliques, which are subgraphs in which k vertices are fully
connected to one another and which are a very effective way to build groups in social
network analysis. In the proposed approach, cosine similarity is used to measure
similarities between users. The proposed solution offers an improved k-clique method
that performs more efficiently than existing collaborative filtering and maximal clique
methods. For performance evaluation, we use the MovieLens data, which is commonly used
in movie recommendation systems. To assess effectiveness, the MovieLens dataset is
divided into training and test data, a split that is widely used in artificial intelligence.
We compare collaborative filtering using k nearest neighbors, the maximal clique
method, the k-clique method, and the improved k-clique method to evaluate performance.
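For reference, the cosine similarity between two users is usually written as follows. The notation is an assumption made here for illustration and does not come from the report: r_{u,i} is the rating user u gave movie i, and the sums run over the movies being compared.

```latex
\mathrm{sim}(u, v) = \cos\theta
  = \frac{\sum_{i} r_{u,i}\, r_{v,i}}
         {\sqrt{\sum_{i} r_{u,i}^{2}}\;\sqrt{\sum_{i} r_{v,i}^{2}}}
```

The value depends only on the angle between the two rating vectors, not on their length, so two users with proportional ratings score 1.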
1.5 Scope of the Project
In the near future, the system will be installed on an Apache server and published on the
internet. Datasets will be updated continuously, and the system will make online rating
predictions for users whose habits change day by day. As a result, it can respond
sensitively to current user tastes. Web services in particular struggle to produce
recommendations of millions of items for millions of users. Time and computational
power can limit the performance of even the best hybrid systems. For larger datasets, we
can work on the scalability problems of recommendation systems. The prediction approach
can also be tried on different datasets to test how well the system copes with these
scalability problems.
Chapter 2
Review of Literature
2.1 A STUDY ON CONTENT-BASED VIDEO RECOMMENDATION
Authors: Yan Li, Hanjie Wang, Hailong Liu, Bo Chen - Tencent WeChat, China
Publication: IBM Research, Yorktown Heights, China
Approach:
The competition is challenging, and the reason lies in three aspects, i.e., large vision appearance
variance, insufficient training data, and serious data incompleteness. Meta-data feature: the
provided meta-data information includes actor/actress, director, description, and genre.
Considering the small amount of training data, in this paper we only take the advantage of show
descriptions. For description representation, we first apply the Latent Dirichlet Allocation
(LDA) algorithm to generate a topic model from about 400 movies descriptions (we build this
corpus by crawling data in terms of genre from IMDb, which is treated as the world’s most
popular and authoritative source for movie, TV and celebrity content), and then compute the
topic distribution probability for each TV-show
2.2 Content-based recommender system for online stores using expert system
Authors: Bogdan Walek, Petra Spackova
Publication: University of Ostrava
Approach:
The main goal of the recommender system is to propose and deliver suitable content to the user.
One of the goals of the proposed recommendation system is to decrease the cold start effect. At
the end of the paper, the proposed system is experimentally verified. The recommender system
uses a collaborative filtering system for recommending suitable items and an expert system for
evaluating the popularity of items. The system also proposes an algorithm for showing items
from similar users after the first login to decrease the effect of cold start problem. The
knowledge base of the proposed expert system contains three input linguistic variables and one
output linguistic variable.
2.3 A Content-based Movie Recommender System based on Temporal User Preferences
Authors: Bagher Rahimpour Cami, Hamid Hassanpour, Hoda Mashayekhi
Publication: Shahrood University of Technology Shahrood, Iran
Approach:
The user profile consists of user activities as userId, activity1, ..., activity-n, where each activity-i
indicates the content and access time of selected items denoted as itemId, itemDesc, accessDate.
This model is user-centered and employs the profile of each user to create a user model for
individuals. In the movie domain, each rating record of the rating matrix (movieId, movieDesc, rate,
accessDate) corresponds to an activity. The temporal preferences model is based on a Bayesian
non-parametric framework and has three main components: interest extraction, inference of
preferences, and prediction. Interest extraction analyzes the user profile to discover user
interests. This model feeds the user profile into the Distance Dependent Chinese Restaurant
Process (DDCRP) [27] and performs clustering. DDCRP is based on Bayesian non-parametrics;
thus, the clusters can grow whenever new data is observed.
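The profile structure described above can be sketched as a pair of small record types. This is only an illustration: the class names and the `activities_since` helper are hypothetical and not taken from the reviewed paper, while the field names mirror userId, itemId, itemDesc and accessDate.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Activity:
    """One user activity: which item was accessed and when."""
    item_id: int
    item_desc: str
    access_date: date


@dataclass
class UserProfile:
    """A user id together with the ordered list of that user's activities."""
    user_id: int
    activities: List[Activity] = field(default_factory=list)

    def activities_since(self, start: date) -> List[Activity]:
        # Hypothetical helper: keep only activities on or after `start`,
        # which is the kind of time filter a temporal model would need.
        return [a for a in self.activities if a.access_date >= start]


profile = UserProfile(
    user_id=1,
    activities=[
        Activity(10, "space thriller", date(2019, 1, 5)),
        Activity(11, "animated comedy", date(2019, 6, 20)),
    ],
)
recent = profile.activities_since(date(2019, 3, 1))
```

Keeping the access date on every activity is what lets a temporal model weight recent interests differently from older ones.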
2.4 An Improved Content Based Collaborative Filtering Algorithm For Movie
Recommendations
Authors: Ashish Pal, Prateek Parhi and Manuj Aggarwal
Publication: ARSD College, University of Delhi, New Delhi, India
Approach:
Our proposed algorithm takes into consideration the tags and genres specified in the dataset, and
for the content-based prediction, we have applied a set matching comparator. This comparator
returns the number of common objects between two movies. The term object here refers to tags
and genres. For each particular movie, the tags and genres are merged into a single set. This
gives us a bulky content for each movie, and the more content there is, the better the predictions are. After
getting the set of common objects, the weight of each set for a movie is calculated. Once the
weights are assigned to each set, they are then used to provide the ratings of the unrated
movies using the rated movies which were previously compared. In our methodology, first the
tags for each movie assigned by different users are collected and converted into a single list. The
genres for each movie are appended to the same list of tags. This final list is referred to as the
objects for a particular movie. The object set for each active movie is compared with the object
set of every other movie in the dataset, and the matching objects are assigned to a set.
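A minimal sketch of the set matching comparator described above, assuming tags and genres are plain strings. The function names and the sample movies are hypothetical, and the weighting step is left out because the summary does not specify how the weights are computed.

```python
def movie_objects(tags, genres):
    """Merge a movie's tags and genres into one set of normalized objects."""
    return {t.strip().lower() for t in tags} | {g.strip().lower() for g in genres}


def common_objects(objects_a, objects_b):
    """Set matching comparator: the objects two movies have in common."""
    return objects_a & objects_b


# Hypothetical sample data, for illustration only.
movie_a = movie_objects(tags=["heist", "crime"], genres=["Thriller", "Crime"])
movie_b = movie_objects(tags=["heist", "car chase"], genres=["Thriller", "Action"])

shared = common_objects(movie_a, movie_b)
match_count = len(shared)
```

Using sets removes duplicates automatically, so a word that appears both as a tag and as a genre is counted once.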
Chapter 3
System Analysis
3.1 Functional Requirements
Major functionalities associated with the user are:
● Enable users to submit their preferred genres by typing into the input text box.
● The input should support text in different encodings, such as UTF-8 or Latin-encoded
text, that represents information in English.
Major functionalities associated with the system are as follows:
● Enable system to use the keywords obtained after tokenization of a new input to find the
cluster it belongs to.
Interface Requirements:
● Field 1 accepts the preferences of a user.
● Field 2 recommends the movies to the user
3.2 Non Functional Requirements
3.2.1 Performance
The computer running the software does not require a powerful CPU or GPU. It
requires a 64-bit operating system to run the program and to call
installed packages such as flask, sklearn, etc.
3.2.2 Reliability
The system takes in the inputs without any error and predicts the expected
response accurately so that users of the system get a response to their query.
3.2.3 Usability
The system is easy to handle and navigates in the most expected way with no
delays. The system interacts with the user in a very friendly manner making it
easier for the users to use the system.
3.3 Specific Requirements
3.3.1 User Interfaces
● Front-end Software: Flask, HTML, CSS, JavaScript, Web browser.
● Back-end Software: R Studio
3.3.2 Hardware Requirements
● CPU Type : Intel Core or above
● Clock speed : 1.0GHz
● Ram size : 1GB and above
● Hard Disk capacity : 100GB and above
● Working keyboard
3.3.3 Software Requirements
● Operating System : Windows
● Python 3.5 or above
3.3.4 Communication Interfaces
This system will be completely based on a local system of the user.
3.4 Use-Case Diagrams and Description
Fig 1. Use Case Diagram for Movies Recommendation System.
Use Case Diagram:
A use case diagram is a dynamic or behavior diagram in UML. Use case diagrams model the
functionality of a system using actors and the use cases. Use cases are set of actions, services,
and functions that the system needs to perform. In this context, a “System” is something being
developed and operated, such as a website. The “Actors” are people or entities operating under
defined roles within the system.
Use Case Specifications
Use case: Submit Query
Brief Description: This use case expects the user to submit a query or a keyword as input
that is compatible with the system.
Primary Actor: User
Use Case: View Response
Brief Description: This use case will show the bot response in a text format
Primary Actor: User
Use Case: Query Processing
Brief Description: In this use case, the user's input query is processed in order to give the response.
Primary Actor: System
Main Flow:
1. The input is split into sentences, followed by word tokenization.
2. Each word in the input is checked against a common set of punctuation marks and
stop words, and matching words are removed.
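The main flow above can be sketched with the Python standard library alone. This is an illustrative sketch rather than the project's code: the stop word set is a small hypothetical sample, and a real system would more likely use a library tokenizer and a full stop word list.

```python
import re
import string

# Hypothetical, deliberately tiny stop word list for illustration.
STOP_WORDS = {"a", "an", "the", "i", "me", "some", "of", "and",
              "to", "with", "like", "want", "show", "too"}


def split_sentences(text):
    """Step 1a: split the input into sentences at ., ! or ? followed by space."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Step 1b: split a sentence into lowercase word tokens."""
    return sentence.lower().split()


def extract_keywords(text):
    """Step 2: strip surrounding punctuation and drop stop words."""
    keywords = []
    for sentence in split_sentences(text):
        for token in tokenize(sentence):
            word = token.strip(string.punctuation)
            if word and word not in STOP_WORDS:
                keywords.append(word)
    return keywords


keywords = extract_keywords("I like action movies. Show me some sci-fi, too!")
```

Punctuation is stripped only from the ends of each token, so hyphenated genre names such as "sci-fi" survive intact.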
Use Case: Generate Response
Brief Description: In this use case responses are generated on the basis of query analysis and
AIML query
Primary Actor: System
Use Case: Submit Feedback
Brief Description: This use case is used for taking feedback from the user in order to improve
the performance of the bot.
Primary Actor: User
Main Flow:
The feedback is stored in a text file and analyzed by the administrator.
Chapter 4
Analysis Modeling
4.1 Class Diagram and Activity Diagram
Fig 2. Class Diagram for Movies Recommendation system.
Fig 3. Activity Diagram for Movies Recommendation system.
4.2 Functional Modeling
Data Flow Diagram
A data flow diagram (DFD) illustrates how data is processed by a system in terms of inputs and
outputs. As its name indicates, its focus is on the flow of information: where data comes from, where
it goes and how it gets stored.
Fig 4. Level 0 DFD for Movies recommendation system.
Context Level DFD:
This level is called the Context Level DFD. It is a basic overview of the whole system or
process being analyzed or modelled. Here the basic flow of the system is shown. The user
gives input which is stored by the system. Based on the input given, the system processes it
accordingly and gives the output to the user.
Fig 5. Level 1 DFD for Movies Recommendation system.
Level 1 DFD:
DFD Level 1 provides a more detailed breakout of pieces of the Context Level DFD. It basically
explains the system in more detail. The level 1 DFD is more detailed than level 0. This level of
DFD elaborates on the details introduced in level 0 of the DFD. Here in DFD 1 we can see
login details and two databases consisting of the movie recommendation system data set and the user
data set.
Fig 6. Level 2 DFD for Movies recommendation system.
Level 2 DFD:
A level 2 DFD is much more informative than its previous counterparts. Here the system is
further divided and is explained in much more detail so that it is very easy to understand the
whole system. We can go further to level 3 and level 4 DFDs, but they will be much more
complicated and make the system hard to understand and implement.
Chapter 5
Design
5.1 Architectural Design
To start with, we present an overall system diagram for recommendation systems in the
following figure. The main components of the architecture contain one or more machine learning
algorithms.
Fig 7. Architectural Design for Movies recommendation system
The simplest thing we can do with data is to store it for later offline processing, which leads to
part of the architecture for managing Offline jobs. However, computation can be done offline,
nearline, or online. Online computation can respond better to recent events and user interaction,
but has to respond to requests in real-time. This can limit the computational complexity of the
algorithms employed as well as the amount of data that can be processed. Offline computation
has fewer limitations on the amount of data and the computational complexity of the algorithms
since it runs in a batch manner with relaxed timing requirements. However, it can easily grow
stale between updates because the most recent data is not incorporated. One of the key issues in a
personalization architecture is how to combine and manage online and offline computation in a
seamless manner. Nearline computation is an intermediate compromise between these two
modes in which we can perform online-like computations, but do not require them to be served
in real-time. Model training is another form of computation that uses existing data to generate a
model that will later be used during the actual computation of results. Another part of the
architecture describes how the different kinds of events and data need to be handled by the Event
and Data Distribution system. A related issue is how to combine the different Signals and
Models that are needed across the offline, nearline, and online regimes. Finally, we also need to
figure out how to combine intermediate Recommendation Results in a way that makes sense for
the user. The rest of this section will detail these components of this architecture as well as their
interactions. In order to do so, we will break the general diagram into different sub-systems and
we will go into the details of each of them. As you read on, it is worth keeping in mind that our
whole infrastructure runs across the public Amazon Web Services cloud.
Online computation can respond quickly to events and use the most recent data. An example is to
assemble a gallery of action movies sorted for the member using the current context. Online
components are subject to an availability and response time Service Level Agreement (SLA)
that specifies the maximum latency of the process in responding to requests from client
applications while our member is waiting for recommendations to appear. This can make it
harder to fit complex and computationally costly algorithms in this approach. Also, a purely
online computation may fail to meet its SLA in some circumstances, so it is always important to
think of a fast fallback mechanism such as reverting to a precomputed result. Computing online
also means that the various data sources involved also need to be available online, which can
require additional infrastructure. Nearline computation can be seen as a compromise between the
two previous modes. In this case, computation is performed exactly like in the online case.
However, we remove the requirement to serve results as soon as they are computed and can
instead store them, allowing it to be asynchronous. The nearline computation is done in response
to user events so that the system can be more responsive between requests. This opens the door
for potentially more complex processing to be done per event. An example is to update
recommendations to reflect that a movie has been watched immediately after a member begins to
watch it. Results can be stored in an intermediate caching or storage back-end. Nearline
computation is also a natural setting for applying incremental learning algorithms.
In any case, the choice of online/nearline/offline processing is not an either/or question. All
approaches can and should be combined. There are many ways to combine them. We already
mentioned the idea of using offline computation as a fallback. Another option is to precompute
part of a result with an offline process and leave the less costly or more context-sensitive parts of
the algorithms for online computation.
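The idea of combining online computation with an offline fallback can be sketched as follows. This is a minimal illustration under stated assumptions, not the architecture's actual code: the function name `recommend`, the `precomputed` dictionary standing in for offline batch results, and the `budget_s` latency budget are all hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Shared worker pool for online computations (size chosen arbitrarily).
_executor = ThreadPoolExecutor(max_workers=4)


def recommend(user_id, online_fn, precomputed, budget_s=0.1):
    """Try the online computation within a latency budget.

    If it times out or raises, fall back to results that an offline
    batch job is assumed to have stored in `precomputed`.
    """
    future = _executor.submit(online_fn, user_id)
    try:
        return future.result(timeout=budget_s)
    except Exception:
        # Covers both the timeout and any error raised by online_fn.
        return precomputed.get(user_id, [])


def slow_online(user_id):
    time.sleep(0.3)  # simulates an online model that misses its budget
    return ["fresh result"]


def fast_online(user_id):
    return ["fresh result"]


offline_results = {"u1": ["cached result"]}
slow_answer = recommend("u1", slow_online, offline_results, budget_s=0.05)
fast_answer = recommend("u1", fast_online, offline_results, budget_s=1.0)
```

The caller always gets an answer within roughly the budget; the price is that the fallback result may be stale, which is the trade-off between online and offline computation described above.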
Much of the computation we need to do when running personalization machine learning
algorithms can be done offline. This means that the jobs can be scheduled to be executed
periodically and their execution does not need to be synchronous with the request or presentation
of the results. There are two main kinds of tasks that fall in this category: model training and
batch computation of intermediate or final results. In the model training jobs, we collect relevant
existing data and apply a machine learning algorithm that produces a set of model parameters (which
we will henceforth refer to as the model). This model will usually be encoded and stored in a file
for later consumption. Although most of the models are trained offline in batch mode, we also
have some online learning techniques where incremental training is indeed performed online.
Batch computation of results is the offline computation process defined above in which we use
existing models and corresponding input data to compute results that will be used at a later time
either for subsequent online processing or direct presentation to the user.
Fig 8. Architecture for Movies recommendation System
5.2 User Interface Design
Fig 9. GUI for Movies recommendation System
Chapter 6
Implementation
6.1 Algorithms Used
USER-based Collaborative Filtering Model
We now use the user-based approach. According to this approach, given a new user, its
similar users are first identified. Then, the top-rated items rated by similar users are
recommended.
For each new user, these are the steps:
1. Measure how similar each user is to the new one. As in item-based collaborative filtering
(IBCF), popular similarity measures are correlation and cosine.
2. Identify the most similar users. The options are:
● Take account of the top k users (k-nearest_neighbors)
● Take account of the users whose similarity is above a defined threshold
3. Rate the movies rated by the most similar users. The rating is the average rating among
similar users and the approaches are:
● Average rating
● Weighted average rating, using the similarities as weights
4. Pick the top-rated movies.
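The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions rather than the project's implementation: it uses cosine similarity, the top k option, and the weighted average option, and the function names and sample ratings are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two rating dicts (missing ratings count as 0)."""
    dot = sum(r * v[m] for m, r in u.items() if m in v)
    norm_u = math.sqrt(sum(r * r for r in u.values()))
    norm_v = math.sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(new_user, others, k=2, top_n=3):
    # Step 1: measure how similar each known user is to the new one.
    sims = {name: cosine(new_user, ratings) for name, ratings in others.items()}
    # Step 2: keep the k most similar users.
    neighbours = sorted(sims, key=sims.get, reverse=True)[:k]
    # Step 3: similarity-weighted average rating for movies the new user
    # has not rated yet.
    totals, weights = {}, {}
    for name in neighbours:
        for movie, rating in others[name].items():
            if movie in new_user:
                continue
            totals[movie] = totals.get(movie, 0.0) + sims[name] * rating
            weights[movie] = weights.get(movie, 0.0) + sims[name]
    scores = {m: totals[m] / weights[m] for m in totals if weights[m] > 0}
    # Step 4: pick the top-rated movies.
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical sample ratings on a 1-5 scale.
ratings = {
    "alice": {"Heat": 5, "Alien": 4, "Up": 1},
    "bob": {"Heat": 4, "Alien": 5, "Blade Runner": 5},
    "carol": {"Up": 5, "Coco": 5, "Heat": 1},
}
new_user = {"Heat": 5, "Alien": 5}
suggestions = recommend(new_user, ratings, k=2)
```

Switching step 2 to the threshold option would only change how `neighbours` is selected; the rating and ranking steps stay the same.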
In content-based filtering, items are recommended based on comparisons between item profile
and user profile. A user profile is content that is found to be relevant to the user in the form of
keywords (or features). A user profile might be seen as a set of assigned keywords (terms,
features) collected by an algorithm from items found relevant (or interesting) by the user. A set of
keywords (or features) of an item is the item profile. For example, consider a scenario in which a
person goes to a pastry shop to buy his favorite cake ‘X’. Unfortunately, cake ‘X’ has been sold out
and as a result the shopkeeper recommends the person to buy cake ‘Y’, which is made up
of ingredients similar to cake ‘X’. This is an instance of content-based filtering.
Fig. 10 Content Based Filtering
We will be using the cosine similarity to calculate a numeric quantity that denotes the
similarity between two movies. We use the cosine similarity score since it is independent of
magnitude and is relatively easy and fast to calculate. Mathematically, it is defined as
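For reference, the standard definition of cosine similarity is given here in LaTeX. The symbols x and y (the feature vectors of two movies, each with n components) are chosen for this note and are not taken from the original figure.

```latex
\[
\operatorname{cosine}(x, y)
  = \frac{x \cdot y}{\lVert x \rVert \, \lVert y \rVert}
  = \frac{\sum_{i=1}^{n} x_i y_i}
         {\sqrt{\sum_{i=1}^{n} x_i^{2}} \; \sqrt{\sum_{i=1}^{n} y_i^{2}}}
\]
```

As a small worked example, for x = (1, 1, 0) and y = (1, 0, 1) the dot product is 1 and both norms are \sqrt{2}, so the score is 1 / (\sqrt{2} \cdot \sqrt{2}) = 0.5. Scaling either vector by a positive constant leaves the score unchanged, which is what "independent of magnitude" means.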
We are now in a good position to define our recommendation function. These are the
steps we'll follow:
● Get the index of the movie given its title.
● Get the list of cosine similarity scores for that particular movie with all movies.
Convert it into a list of tuples where the first element is its position and the second is the
similarity score.
● Sort the aforementioned list of tuples based on the similarity scores; that is, the second
element.
● Get the top 10 elements of this list. Ignore the first element as it refers to self (the
movie most similar to a particular movie is the movie itself).
● Return the titles corresponding to the indices of the top elements.
While our system has done a decent job of finding movies with similar plot descriptions, the
quality of recommendations is not that great. "The Dark Knight Rises" returns all Batman
movies, while it is more likely that the people who liked that movie are more inclined to
enjoy other Christopher Nolan movies. This is something that cannot be captured by the
present system.
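The steps of the recommendation function can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the project's code: the movie titles and plot descriptions are made up, plots are turned into plain word-count vectors (the report does not state its vectorization; TF-IDF weighting is a common alternative), and the names `cosine`, `cosine_sim` and `get_recommendations` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy data: titles and plot descriptions are invented for illustration.
MOVIES = {
    "Space Rescue": "astronaut crew rescue mission in deep space",
    "Moon Colony": "astronaut crew builds a colony on the moon in space",
    "Farm Story": "a family saves the old farm from a flood",
    "Deep Ocean": "a diving crew on a rescue mission in the deep ocean",
}

titles = list(MOVIES)
indices = {title: i for i, title in enumerate(titles)}
# Assumption: each plot is represented by simple word counts.
vectors = [Counter(MOVIES[t].lower().split()) for t in titles]


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Pairwise similarity matrix: cosine_sim[i][j] compares movie i with movie j.
cosine_sim = [[cosine(u, v) for v in vectors] for u in vectors]


def get_recommendations(title, top_n=10):
    # Get the index of the movie given its title.
    idx = indices[title]
    # List of (position, similarity score) tuples for that movie.
    sim_scores = list(enumerate(cosine_sim[idx]))
    # Sort by the second element, highest similarity first.
    sim_scores.sort(key=lambda pair: pair[1], reverse=True)
    # Drop the movie itself, then keep the top entries. Filtering by index
    # instead of skipping position 0 stays correct if another movie ties at 1.0.
    sim_scores = [pair for pair in sim_scores if pair[0] != idx][:top_n]
    # Return the titles corresponding to the selected indices.
    return [titles[i] for i, _ in sim_scores]


print(get_recommendations("Space Rescue", top_n=2))
```

Because the score depends only on shared words in the plot text, a sketch like this has the same limitation noted for the present system: it cannot link movies by director or cast unless those features are added to the item profile.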
6.2 Working of the project
CODE SNIPPETS
Fig 11. Front end code for Movies recommendation System
Fig 12. Backend code for Movies recommendation System
Fig 13. Backend code for Movies recommendation System
Chapter 7
Conclusion
In our project, a collaborative filtering algorithm is used to predict a user's movie rating. The
MovieLens dataset, which has 10 million ratings, is selected for our project and divided into
a training set and a test set. The RMSE (root mean squared error) measure is used to evaluate the
algorithm. According to the evaluation results, our movie recommender system has good prediction
performance. A hybrid approach combining content-based filtering and collaborative filtering is taken to
implement the system. This approach overcomes the drawbacks of each individual algorithm and
improves the performance of the system. Techniques like clustering, similarity and
classification are used to get better recommendations, thus reducing MAE (mean absolute error) and
increasing precision and accuracy. In the future, we can work on a hybrid recommender using clustering and
similarity for better performance. Our approach can be further extended to other domains to
recommend songs, videos, venues, news, books, tourism and e-commerce sites, etc.
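The RMSE and MAE measures mentioned in this chapter can be computed as in the minimal Python sketch below. The function names and the rating values are hypothetical and serve only to show the arithmetic; they are not results from the project.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equally long rating lists."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error between two equally long rating lists."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up test-set ratings and model predictions, for illustration only.
actual = [4, 3, 5, 2]
predicted = [3.5, 3, 4, 2.5]
print(rmse(actual, predicted), mae(actual, predicted))
```

RMSE squares each error before averaging, so it penalizes large misses more heavily than MAE does. Lower values of either measure indicate better predictions on the test set.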
Acknowledgements
We take this opportunity to thank all those people who have helped and guided us through this
project and made this experience worthwhile for us. We wish to sincerely thank our reverend
Bro. Jose Thuruthiyil and Principal Dr. Sincy George for giving us the opportunity to
undertake a project in the Third Year of Engineering. We would also like to thank the HOD of the
Computer Department, Dr. Kavita Sonawane, and all teaching and non-teaching staff for their
immense support and cooperation.
Last but not least, we would like to thank Mr. Rupesh Mishra for guiding us throughout
the project and encouraging us to explore this domain.

Movies recommendation system in R Studio, Machine learning

  • 1. Movies Recommendation System Submitted in partial fulfillment of the requirements of the degree of T. E. Computer Engineering By Suraj R. Maurya Roll No. 56 PID: 182262 Om V. Pise Roll No: 58 PID: 172074 Guide (s): Mr. Rupesh Mishra Asst. Professor Department of Computer Engineering St. Francis Institute of Technology (Engineering College) University of Mumbai 2019-2020
  • 2. CERTIFICATE This is to certify that the project entitled “Movies Recommendation System” is a bonafide work of “Suraj Maurya (Roll No: 56), Om Pise (Roll No: 58)” submitted to the University of Mumbai in partial fulfillment of the requirement for the award of the degree of T.E. in Computer Engineering. (Mr. Rupesh Mishra) Guide (Dr. Kavita Sonawane) Head of Department
  • 3. Project Report Approval for T.E. This project report entitled Movies Recommendation System by Mr. Suraj Maurya, Mr. Om Pise, is approved for the degree of T.E. in Computer Engineering. Examiners 1.--------------------------------------------- 2.--------------------------------------------- Date: Place: Mumbai
  • 4. Declaration I/We (Make changes in college copy. Individual copy it will be I and College copy it will be We) declare that this written submission represents my ideas in my own words and where other’s ideas or words have been included, I have adequately cited and referenced the original sources. I also declare that I have adhered to all principles of academic honesty and integrity and have not misrepresented or fabricated or falsified any idea/data/fact/source in my submission. I understand that any violation of the above will be cause for disciplinary action by the Institute and can also evoke penal action from the sources which have thus not been properly cited or from whom proper permission has not been taken when needed. ----------------------------------------- (Signature) Suraj Maurya Roll no: 56 Om Pise Roll no: 58 Date:
  • 5. Abstract A recommendation engine filters data using different algorithms and recommends the most relevant items to users. It first captures the past behavior of a customer and, based on that, recommends products the user is likely to buy. If a completely new user visits an e-commerce site, the site has no history for that user. So how does the site recommend products in such a scenario? One possible solution is to recommend the best-selling products, i.e. the products that are in high demand. Another is to recommend the products that would bring the maximum profit to the business. Three main approaches are used for our recommender systems. The first is demographic filtering, which offers generalized recommendations to every user based on movie popularity and/or genre. The system recommends the same movies to users with similar demographic features. Since each user is different, this approach is considered too simple. The basic idea behind it is that movies that are more popular and critically acclaimed have a higher probability of being liked by the average audience. The second is content-based filtering, where we try to profile the user's interests using the information collected and recommend items based on that profile. The third is collaborative filtering, where we try to group similar users together and use information about the group to make recommendations to the user.
  • 6. Contents
    1 INTRODUCTION
      1.1 Project Description
      1.2 Problem Formulation
      1.3 Motivation
      1.4 Proposed Solution
      1.5 Scope of the project
    2 REVIEW OF LITERATURE
    3 SYSTEM ANALYSIS
      3.1 Functional Requirements
      3.2 Non Functional Requirements
      3.3 Specific Requirements
      3.4 Use-Case Diagrams and description
    4 ANALYSIS MODELING
  • 7. Contents (continued)
      4.1 Activity Diagrams, Class Diagram
      4.2 Functional Modeling
    5 DESIGN
      5.1 Architectural Design
      5.2 User Interface Design
    6 IMPLEMENTATION
      6.1 Algorithms / Methods Used
      6.2 Working of the project
    7 CONCLUSIONS
    References
    Acknowledgement
  • 8. List of Figures
    1. Use Case Diagram for Movies Recommendation system
    2. Class Diagram for Movies Recommendation system
    3. Activity Diagram for Movies Recommendation system
    4. Context Level DFD for Movies Recommendation system
    5. Level 0 DFD for Movies Recommendation system
    6. Level 1 DFD for Movies Recommendation system
    7. Architecture for Movies Recommendation system
    8. Main Window (GUI)
    List of Abbreviations
    1. DFD: Data Flow Diagram
    2. CFS: Collaborative filtering system
  • 9. Chapter 1 Introduction
    1.1 Description A recommendation system is a type of information filtering system that attempts to predict the preferences of a user and make suggestions based on those preferences. There is a wide variety of applications for recommendation systems. They have become increasingly popular over the last few years and are now used in most of the online platforms we use. The content of such platforms varies from movies, music, books and video, to friends and stories on social media platforms, to products on e-commerce websites, to people on professional and dating websites, to search results returned on Google. Often, these systems are able to collect information about a user’s choices and can use it to improve their suggestions in the future. For example, if Amazon observes that a large number of customers who buy the latest Apple MacBook also buy a USB-C to USB adapter, it can recommend the adapter to a new user who has just added a MacBook to their cart.
    1.2 Problem Formulation The movie recommendation system will be built using artificial intelligence algorithms that analyze a user's favorite genres and recommend movies accordingly. The user submits queries describing the movies they like. The system analyzes these preferences and then recommends movies to the user. The problem is to provide related content, out of a collection of relevant and irrelevant items, to users of online service providers. Netflix aims to recommend movies to users based on the content of items rather than other users’ opinions.
    1.3 Motivation A recommendation system also finds similarities between different products. For example, the Netflix recommendation system recommends movies that are similar to the ones a user has watched in the past. Furthermore, collaborative filtering provides recommendations based on other users who have a similar viewing history or similar preferences. There are two types of recommendation systems: content-based recommendation systems and collaborative filtering recommendation systems. In this recommendation system project in R, we will work on a collaborative filtering recommendation system and, more specifically, an item-based collaborative recommendation system.
  • 10. 1.4 Proposed Solution The proposed movie recommendation system is based on the abstract maximal clique method. k-cliques, which are subgraphs in which k vertices are fully connected, are a very effective method for building groups in social network analysis. In the proposed approach, cosine similarity is used to measure similarities between users. The proposed solution offers improved k-clique methods for more efficient performance than existing collaborative filtering and maximal clique methods. For performance evaluation, we use MovieLens data, which is commonly used in movie recommendation systems. To assess effectiveness, the MovieLens dataset is divided into experimental and test data, a split that is widely used in artificial intelligence. To evaluate performance, we compare collaborative filtering using k nearest neighbors, the maximal clique method, the k-clique method, and the improved k-clique method.
    1.5 Scope of the Project In the near future, the system will be installed on an Apache server and published on the internet. Datasets will be updated continuously, and the system will make online rating predictions for users whose habits change day by day. As a result, it can closely track current user tastes. Web services in particular struggle to produce recommendations of millions of items for millions of users. Time and computational power can limit the performance of even the best hybrid systems. For larger datasets, we can work on the scalability problems of recommendation systems. The prediction approach can also be tried on different datasets to test how the system performs.
  • 11. Chapter 2 Review of Literature
    2.1 A Study on Content-Based Video Recommendation
    Authors: Yan Li, Hanjie Wang, Hailong Liu, Bo Chen - Tencent WeChat, China
    Publication: IBM Research, Yorktown Heights, China
    Approach: The competition is challenging for three reasons: large variance in visual appearance, insufficient training data, and serious data incompleteness. Meta-data feature: the provided meta-data includes actor/actress, director, description, and genre. Considering the small amount of training data, in this paper we only take advantage of show descriptions. For description representation, we first apply the Latent Dirichlet Allocation (LDA) algorithm to generate a topic model from about 400 movie descriptions (we build this corpus by crawling data by genre from IMDb, which is treated as the world’s most popular and authoritative source for movie, TV and celebrity content), and then compute the topic distribution probability for each TV show.
    2.2 Content-based recommender system for online stores using expert system
    Authors: Bogdan Walek, Petra Spackova
    Publication: University of Ostrava
    Approach: The main goal of the recommender system is to propose and deliver suitable content to the user. One of the goals of the proposed recommendation system is to decrease the cold start effect. The recommender system uses a collaborative filtering system for recommending suitable items and an expert system for evaluating the popularity of items. The system also proposes an algorithm for showing items from similar users after the first login to reduce the cold start problem. The knowledge base of the proposed expert system contains three input linguistic variables and one output linguistic variable. At the end of the paper, the proposed system is experimentally verified.
    2.3 A Content-based Movie Recommender System based on Temporal User Preferences
    Authors: Bagher Rahimpour Cami, Hamid Hassanpour, Hoda Mashayekhi
    Publication: Shahrood University of Technology, Shahrood, Iran
    Approach: The user profile consists of user activities as (userId, activity-1, ..., activity-n), where each activity-i indicates the content and access time of a selected item, denoted as (itemId, itemDesc, accessDate).
  • 12. This model is user-centered and employs the profile of each user to create an individual user model. In the movie domain, each rating record of the rating matrix (movieId, movieDesc, rate, accessDate) corresponds to an activity. The temporal preferences model is based on a Bayesian non-parametric framework and has three main components: interest extraction, inference of preferences, and prediction. Interest extraction analyzes the user profile to discover user interests. The model feeds the user profile into a Distance Dependent Chinese Restaurant Process (DDCRP) [27] and performs clustering. DDCRP is Bayesian non-parametric, so the clusters can grow whenever new data is observed.
    2.4 An Improved Content Based Collaborative Filtering Algorithm For Movie Recommendations
    Authors: Ashish Pal, Prateek Parhi and Manuj Aggarwal
    Publication: ARSD College, University of Delhi, New Delhi, India
    Approach: Our proposed algorithm takes into consideration the tags and genres specified in the dataset, and for the content-based prediction we have applied a set matching comparator. This comparator returns the number of common objects between two movies. The term object here refers to tags and genres. In our methodology, the tags assigned to each movie by different users are first converted into a single list. The genres for each movie are appended to the same list of tags. This final list is referred to as the objects for a particular movie. This gives bulky content for each movie, and the more content there is, the better the predictions. The object set for each active movie is compared with the object set of every other movie in the dataset, and the matching objects are assigned to a set. After getting the set of common objects, the weight of each set for a movie is calculated. Once the weights are assigned to each set, they are used to predict ratings for the unrated movies from the rated movies that were previously compared.
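The set matching comparator described in 2.4 can be illustrated with a minimal Python sketch. The movie titles, tags, and helper names (`build_objects`, `common_objects`, `match_counts`) are hypothetical, and the weighting step of the cited paper is not reproduced here because the text above does not specify it; the sketch only shows the merge of tags and genres into one object set and the count of common objects.

```python
def build_objects(tags, genres):
    """Merge a movie's tags and genres into a single set of objects."""
    normalized_tags = {t.strip().lower() for t in tags}
    normalized_genres = {g.strip().lower() for g in genres}
    return normalized_tags | normalized_genres


def common_objects(objects_a, objects_b):
    """Set matching comparator: number of objects two movies share."""
    return len(objects_a & objects_b)


def match_counts(active, catalog):
    """Compare the active movie's object set with every other movie."""
    return {
        title: common_objects(catalog[active], objects)
        for title, objects in catalog.items()
        if title != active
    }


# Hypothetical toy catalog (not taken from the MovieLens data).
catalog = {
    "Movie A": build_objects(["space", "robots"], ["Sci-Fi", "Action"]),
    "Movie B": build_objects(["robots", "future"], ["Sci-Fi"]),
    "Movie C": build_objects(["wedding"], ["Romance", "Comedy"]),
}

counts = match_counts("Movie A", catalog)
print(counts)  # {'Movie B': 2, 'Movie C': 0}
```

Using Python sets keeps the comparison independent of tag order and removes duplicate tags entered by different users.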
  • 13. Chapter 3 System Analysis
    3.1 Functional Requirements
    Major functionalities associated with the user are:
    ● Enable the user to submit preferred genres by typing into the input text box.
    ● The input should support text in different encodings, such as UTF-8 and Latin-encoded text, which represents information in English.
    Major functionalities associated with the system are as follows:
    ● Enable the system to use the keywords obtained after tokenization of a new input to find the cluster it belongs to.
    Interface Requirements:
    ● Field 1 accepts the preferences of a user.
    ● Field 2 recommends the movies to the user.
    3.2 Non Functional Requirements
    3.2.1 Performance The computer running the software does not require a powerful CPU or GPU. It requires a 64-bit operating system to run the program and call packages such as flask and sklearn.
    3.2.2 Reliability The system takes in the inputs without any error and predicts the expected response accurately, so that users get a response to their query.
    3.2.3 Usability The system is easy to handle and navigates in the most expected way with no delays. The system interacts with the user in a very friendly manner, making it easier to use.
  • 14. 3.3 Specific Requirements
    3.3.1 User Interfaces
    ● Front-end Software: Flask, HTML, CSS, JavaScript, Web browser.
    ● Back-end Software: R Studio
    3.3.2 Hardware Requirements
    ● CPU Type: Intel Core or above
    ● Clock speed: 1.0 GHz
    ● RAM size: 1 GB and above
    ● Hard Disk capacity: 100 GB and above
    ● Working keyboard
    3.3.3 Software Requirements
    ● Operating System: Windows
    ● Python 3.5 or above
    3.3.4 Communication Interfaces This system will run entirely on the user's local system.
  • 15. 3.4 Use-Case Diagrams and Description Fig 1. Use Case Diagram for Movies Recommendation System. Use Case Diagram: A use case diagram is a dynamic or behavior diagram in UML. Use case diagrams model the functionality of a system using actors and use cases. Use cases are a set of actions, services, and functions that the system needs to perform. In this context, a “System” is something being developed and operated, such as a website. The “Actors” are people or entities operating under defined roles within the system.
  • 16. Use Case Specifications
    Use Case: Submit Query
    Brief Description: This use case expects the user to submit a query or a keyword as input that is compatible with the system.
    Primary Actor: User
    Use Case: View Response
    Brief Description: This use case shows the bot response in a text format.
    Primary Actor: User
    Use Case: Query Processing
    Brief Description: In this use case the user's input query is processed in order to give the response.
    Primary Actor: System
    Main Flow: 1. The input is tokenized into sentences, followed by word tokenization. 2. Each word in the input is checked against a common set of punctuation marks and stop words, and matches are removed.
    Use Case: Generate Response
    Brief Description: In this use case responses are generated on the basis of query analysis and the AIML query.
    Primary Actor: System
    Use Case: Submit Feedback
    Brief Description: This use case is used for taking feedback from the user in order to improve the performance of the bot.
    Primary Actor: User
    Main Flow: The feedback is stored in a text file and analyzed by the administrator.
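The main flow of the Query Processing use case (sentence tokenization, word tokenization, then removal of punctuation and stop words) could look roughly like the following Python sketch. It is only an illustration: the stop-word list and the function names are assumptions made here, the sentence splitter is a simple regular expression, and the report does not say which tokenizer the project actually uses.

```python
import re
import string

# Hypothetical stop-word list; a real system would use a fuller one.
STOP_WORDS = {
    "a", "an", "the", "i", "me", "my", "and", "or", "of", "to", "too",
    "some", "like", "show", "movie", "movies",
}


def split_sentences(text):
    """Sentence tokenization: split after '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [part.strip() for part in parts if part.strip()]


def extract_keywords(query):
    """Word tokenization, then drop punctuation and stop words."""
    keywords = []
    for sentence in split_sentences(query):
        for token in sentence.lower().split():
            word = token.strip(string.punctuation)
            if word and word not in STOP_WORDS:
                keywords.append(word)
    return keywords


print(extract_keywords("I like action movies. Show me some comedy, too!"))
# ['action', 'comedy']
```

The remaining keywords are what the system would pass on to the matching step, for example to find the genre cluster the query belongs to.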
  • 17. Chapter 4 Analysis Modeling 4.1 Class Diagram and Activity Diagram Fig 2. Class Diagram for Movies Recommendation system.
  • 18. Fig 3. Activity Diagram for Movies Recommendation system.
  • 19. 4.2 Functional Modeling Data Flow Diagram: A data flow diagram (DFD) illustrates how data is processed by a system in terms of inputs and outputs. As its name indicates, its focus is on the flow of information: where data comes from, where it goes, and how it gets stored. Fig 4. Level 0 DFD for Movies recommendation system. Context Level DFD: This level is called the Context Level DFD. It is a basic overview of the whole system or process being analyzed or modelled. Here the basic flow of the system is shown. The user gives input, which is stored by the system. Based on the input given, the system processes it and gives the output to the user.
  • 20. Fig 5. Level 1 DFD for Movies Recommendation system. Level 1 DFD: DFD Level 1 provides a more detailed breakout of pieces of the Context Level DFD and explains the system in more detail than level 0. This level of DFD expands on the details outlined in level 0. Here in DFD 1 we can see login details and two databases, consisting of the movie recommendation system data set and the user data set.
  • 21. Fig 6. Level 2 DFD for Movies recommendation system. Level 2 DFD: A level 2 DFD is much more informative than its previous counterparts. Here the system is further divided and explained in much more detail, so that it is very easy to understand the whole system. We could go further to level 3 and level 4 DFDs, but they would be much more complicated and make the system hard to understand and implement.
  • 22. Chapter 5 Design 5.1 Architectural Design To start with, we present an overall system diagram for recommendation systems in the following figure. The main components of the architecture contain one or more machine learning algorithms. Fig 7. Architectural Design for Movies recommendation system
  • 23. The simplest thing we can do with data is to store it for later offline processing, which leads to the part of the architecture for managing Offline jobs. However, computation can be done offline, nearline, or online. Online computation can respond better to recent events and user interaction, but has to respond to requests in real time. This can limit the computational complexity of the algorithms employed as well as the amount of data that can be processed. Offline computation has fewer limitations on the amount of data and the computational complexity of the algorithms, since it runs in a batch manner with relaxed timing requirements. However, it can easily grow stale between updates because the most recent data is not incorporated. One of the key issues in a personalization architecture is how to combine and manage online and offline computation in a seamless manner. Nearline computation is an intermediate compromise between these two modes in which we can perform online-like computations but do not require them to be served in real time. Model training is another form of computation that uses existing data to generate a model that will later be used during the actual computation of results. Another part of the architecture describes how the different kinds of events and data need to be handled by the Event and Data Distribution system. A related issue is how to combine the different Signals and Models that are needed across the offline, nearline, and online regimes. Finally, we also need to figure out how to combine intermediate Recommendation Results in a way that makes sense for the user. The rest of this section details these components of the architecture as well as their interactions. In order to do so, we will break the general diagram into different sub-systems and go into the details of each of them. As you read on, it is worth keeping in mind that our whole infrastructure runs across the public Amazon Web Services cloud.
    Online computation can respond quickly to events and use the most recent data. An example is to assemble a gallery of action movies sorted for the member using the current context. Online components are subject to an availability and response time Service Level Agreement (SLA) that specifies the maximum latency of the process in responding to requests from client applications while our member is waiting for recommendations to appear. This can make it harder to fit complex and computationally costly algorithms in this approach.
  • 24. Also, a purely online computation may fail to meet its SLA in some circumstances, so it is always important to think of a fast fallback mechanism, such as reverting to a precomputed result. Computing online also means that the various data sources involved need to be available online, which can require additional infrastructure.
    Nearline computation can be seen as a compromise between the two previous modes. In this case, computation is performed exactly as in the online case. However, we remove the requirement to serve results as soon as they are computed and can instead store them, allowing the computation to be asynchronous. Nearline computation is done in response to user events so that the system can be more responsive between requests. This opens the door for potentially more complex processing to be done per event. An example is to update recommendations to reflect that a movie has been watched immediately after a member begins to watch it. Results can be stored in an intermediate caching or storage back-end. Nearline computation is also a natural setting for applying incremental learning algorithms.
    In any case, the choice of online/nearline/offline processing is not an either/or question. All approaches can and should be combined, and there are many ways to combine them. We already mentioned the idea of using offline computation as a fallback. Another option is to precompute part of a result with an offline process and leave the less costly or more context-sensitive parts of the algorithms for online computation.
    Much of the computation we need to do when running personalization machine learning algorithms can be done offline. This means that the jobs can be scheduled to be executed periodically, and their execution does not need to be synchronous with the request or presentation of the results. There are two main kinds of tasks that fall in this category: model training and batch computation of intermediate or final results. In the model training jobs, we collect relevant existing data and apply a machine learning algorithm that produces a set of model parameters (which we will henceforth refer to as the model). This model will usually be encoded and stored in a file for later consumption. Although most of the models are trained offline in batch mode, we also have some online learning techniques where incremental training is performed online.
  • 25. Batch computation of results is the offline computation process defined above, in which we use existing models and corresponding input data to compute results that will be used at a later time, either for subsequent online processing or for direct presentation to the user. Fig 8. Architecture for Movies recommendation System
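The idea of using an offline, precomputed result as a fallback when online computation fails or misses its latency budget can be sketched in Python as follows. Everything in this sketch is an assumption for illustration: the `PRECOMPUTED` table, the `DEFAULT` list, the `recommend` function, and the 0.2 second budget are hypothetical and are not taken from the project. Note also that the sketch only checks the elapsed time after the online call returns; a production service would enforce the timeout while the call is still running.

```python
import time

# Hypothetical output of an offline batch job, keyed by user id.
PRECOMPUTED = {"user_1": ["Movie A", "Movie B"]}
# Hypothetical default list for users with no precomputed result.
DEFAULT = ["Popular Movie 1", "Popular Movie 2"]


def recommend(user_id, online_ranker, budget_seconds=0.2):
    """Try the online ranker; fall back to offline results if it
    raises an error or takes longer than the latency budget."""
    start = time.monotonic()
    try:
        result = online_ranker(user_id)
    except Exception:
        result = None  # treat any online failure as "no result"
    elapsed = time.monotonic() - start

    if result is not None and elapsed <= budget_seconds:
        return result, "online"
    return PRECOMPUTED.get(user_id, DEFAULT), "offline-fallback"


def failing_ranker(user_id):
    raise RuntimeError("data source unavailable")


print(recommend("user_1", lambda uid: ["Fresh Movie"]))
print(recommend("user_1", failing_ranker))
```

The second value in the returned pair records which path served the request, which makes it easy to monitor how often the fallback is used.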
  • 26. 5.2 User Interface Design Fig 9. GUI for Movies recommendation System
  • 27. Fig 9. GUI for Movies recommendation System (continued)
  • 28. 6.1 Algorithms Used USER-based Collaborative Filtering Model Now, I will use the user-based approach. According to this approach, given a new user, its similar users are first identified. Then, the top recommended. For each new user, these are the steps: 1. Measure how similar each user is to the new one. Like IBCF, popular similarity measures are correlation and cosine. 2. Identify the most similar users. The options are: ● Take account of the top k users (k ● Take account of the users whose similarity is above a defined threshold 3. Rate the movies rated by the most similar users. The rating is the average rating among similar users and the approaches are: Chapter 6 Implementation based Collaborative Filtering Model based approach. According to this approach, given a new user, its similar users are first identified. Then, the top-rated items rated by similar users are se are the steps: Measure how similar each user is to the new one. Like IBCF, popular similarity measures are correlation and cosine. Identify the most similar users. The options are: Take account of the top k users (k-nearest_neighbors) e users whose similarity is above a defined threshold Rate the movies rated by the most similar users. The rating is the average rating among similar users and the approaches are: 20 based approach. According to this approach, given a new user, its rated items rated by similar users are Measure how similar each user is to the new one. Like IBCF, popular similarity measures e users whose similarity is above a defined threshold Rate the movies rated by the most similar users. The rating is the average rating among
  • 29. ● Average rating
       ● Weighted average rating, using the similarities as weights
    4. Pick the top-rated movies.
    In content-based filtering, items are recommended based on comparisons between the item profile and the user profile. A user profile is content found to be relevant to the user, in the form of keywords (or features). It can be seen as a set of keywords (terms, features) collected by the algorithm from items the user found relevant (or interesting). The set of keywords (or features) of an item is its item profile. For example, consider a person who goes to a pastry shop to buy their favorite cake ‘X’. Cake ‘X’ has sold out, so the shopkeeper recommends cake ‘Y’, which is made of ingredients similar to those of cake ‘X’. This is an instance of content-based filtering.
    Fig 10. Content Based Filtering
    We use cosine similarity to calculate a numeric quantity that denotes the similarity between two movies. We use the cosine similarity score because it is independent of magnitude and is relatively easy and fast to calculate. Mathematically, for two feature vectors $x$ and $y$, it is defined as follows:
    $\cos(x, y) = \dfrac{x \cdot y}{\lVert x \rVert \, \lVert y \rVert}$
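As a minimal sketch of the cosine similarity score described above, the following Python function divides the dot product of two vectors by the product of their lengths. The function name `cosine_similarity`, the four-word vocabulary and the keyword-count vectors are assumptions made for illustration; they are not taken from the project's code or dataset.

```python
import math


def cosine_similarity(x, y):
    """Return the cosine of the angle between two equal-length numeric vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        # Convention chosen for this sketch: an all-zero vector is similar to nothing.
        return 0.0
    return dot / (norm_x * norm_y)


# Hypothetical keyword counts over the vocabulary
# ["batman", "gotham", "dream", "space"] (illustrative only).
movie_a = [2, 1, 0, 0]
movie_b = [4, 2, 0, 0]  # same direction as movie_a, twice the magnitude
movie_c = [0, 0, 3, 1]  # shares no keywords with movie_a

sim_ab = cosine_similarity(movie_a, movie_b)
sim_ac = cosine_similarity(movie_a, movie_c)
print(sim_ab, sim_ac)
```

Here `movie_b` is simply `movie_a` scaled by two, so their score is 1 up to floating-point rounding, which illustrates why the measure is independent of magnitude. `movie_a` and `movie_c` share no keywords, so their score is 0.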
  • 30. We are now in a good position to define our recommendation function. These are the steps we follow:
    ● Get the index of the movie given its title.
    ● Get the list of cosine similarity scores for that particular movie with all movies. Convert it into a list of tuples where the first element is its position and the second is the similarity score.
    ● Sort the aforementioned list of tuples in descending order of similarity score; that is, by the second element.
    ● Get the top 10 elements of this list. Ignore the first element as it refers to self (the movie most similar to a particular movie is the movie itself).
    ● Return the titles corresponding to the indices of the top elements.
    While our system has done a decent job of finding movies with similar plot descriptions, the quality of recommendations is not that great. "The Dark Knight Rises" returns all Batman movies, while it is more likely that people who liked that movie would enjoy other Christopher Nolan movies. This is something that cannot be captured by the present system.
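The five steps listed above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the project's actual backend: the names `titles`, `features`, `cosine_sim`, `indices` and `get_recommendations`, as well as the toy keyword-count vectors, are hypothetical.

```python
import math

# Hypothetical toy catalogue: titles and keyword-count vectors (illustrative only).
titles = ["The Dark Knight Rises", "The Dark Knight", "Batman Begins",
          "Inception", "Toy Story"]
features = [
    [3, 2, 0, 0],
    [3, 1, 0, 0],
    [2, 2, 0, 0],
    [0, 1, 3, 0],
    [0, 0, 0, 4],
]


def cosine(x, y):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(x, y))
    norms = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norms if norms else 0.0


# Pairwise similarity matrix and a title -> row index lookup.
cosine_sim = [[cosine(a, b) for b in features] for a in features]
indices = {title: i for i, title in enumerate(titles)}


def get_recommendations(title, top_n=10):
    # Step 1: index of the movie given its title.
    idx = indices[title]
    # Step 2: (position, similarity score) tuples against every movie.
    sim_scores = list(enumerate(cosine_sim[idx]))
    # Step 3: sort by the second element, highest similarity first.
    sim_scores.sort(key=lambda pair: pair[1], reverse=True)
    # Step 4: skip the first element (assumed to be the movie itself), keep top_n.
    sim_scores = sim_scores[1:top_n + 1]
    # Step 5: map the remaining indices back to titles.
    return [titles[i] for i, _ in sim_scores]


print(get_recommendations("The Dark Knight Rises", top_n=2))
```

Skipping the first element relies on the queried movie ranking first in its own list. That holds in this toy data, but it can fail if another movie has an identical feature direction, in which case filtering by index would be safer.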
  • 31. 6.2 Working of the project
    CODE SNIPPETS
    Fig 11. Front end code for Movies Recommendation System
  • 32. Fig 12. Backend code for Movies Recommendation System
  • 33. Fig 13. Backend code for Movies Recommendation System
  • 34. Chapter 7 Conclusion
    In our project, a collaborative filtering algorithm is used to predict a user's movie rating. The MovieLens dataset, which has 10 million ratings, is selected and divided into a training set and a test set. The root-mean-square error (RMSE) is used to evaluate the algorithm. According to the evaluation results, our movie recommender system has good prediction performance. A hybrid approach combining content-based filtering and collaborative filtering is taken to implement the system. This approach overcomes the drawbacks of each individual algorithm and improves the performance of the system. Techniques such as clustering, similarity and classification are used to get better recommendations, thus reducing the mean absolute error (MAE) and increasing precision and accuracy. In future, we can work on a hybrid recommender using clustering and similarity for better performance. Our approach can be further extended to other domains to recommend songs, videos, venues, news, books, tourism and e-commerce sites, etc.
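To make the evaluation described in this chapter concrete, the sketch below predicts a held-out rating with a user-based weighted average (the similarity-weighted approach from Section 6.1) and scores predictions with RMSE and MAE. Everything here is an assumption for illustration: the tiny ratings dictionary stands in for MovieLens, and the names `similarity`, `predict`, `rmse` and `mae` are hypothetical rather than the project's code.

```python
import math

# Hypothetical training ratings (user -> {movie: rating}); not MovieLens data.
train = {
    "alice": {"m1": 5.0, "m2": 3.0, "m3": 4.0},
    "bob": {"m1": 4.0, "m2": 2.0, "m3": 5.0, "m4": 4.0},
    "carol": {"m1": 1.0, "m2": 5.0, "m4": 2.0},
}
# Held-out test ratings as (user, movie, actual rating).
test = [("alice", "m4", 4.0)]


def similarity(r_u, r_v):
    """Cosine similarity computed over the movies both users have rated."""
    common = set(r_u) & set(r_v)
    if not common:
        return 0.0
    dot = sum(r_u[m] * r_v[m] for m in common)
    norm_u = math.sqrt(sum(r_u[m] ** 2 for m in common))
    norm_v = math.sqrt(sum(r_v[m] ** 2 for m in common))
    return dot / (norm_u * norm_v)


def predict(user, movie, ratings, k=2):
    """Similarity-weighted average rating from the k most similar users."""
    neighbours = [
        (similarity(ratings[user], r), r[movie])
        for other, r in ratings.items()
        if other != user and movie in r
    ]
    neighbours.sort(key=lambda pair: pair[0], reverse=True)
    top = neighbours[:k]
    total = sum(sim for sim, _ in top)
    if total == 0:
        # Fallback chosen for this sketch: the user's own mean rating.
        own = ratings[user].values()
        return sum(own) / len(own)
    return sum(sim * rating for sim, rating in top) / total


def rmse(pairs):
    """Root-mean-square error over (predicted, actual) pairs."""
    return math.sqrt(sum((p - a) ** 2 for p, a in pairs) / len(pairs))


def mae(pairs):
    """Mean absolute error over (predicted, actual) pairs."""
    return sum(abs(p - a) for p, a in pairs) / len(pairs)


pairs = [(predict(u, m, train), actual) for u, m, actual in test]
print("RMSE:", rmse(pairs), "MAE:", mae(pairs))
```

Lower RMSE and MAE both indicate predictions closer to the held-out ratings. RMSE penalizes large individual errors more heavily than MAE because the errors are squared before averaging.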
  • 35. References
    ● https://data-flair.training/blogs/data-science-r-movie-recommendation/
    ● https://towardsdatascience.com/the-4-recommendation-engines-that-can-predict-your-movie-tastes-109dc4e10c52
    ● https://www.geeksforgeeks.org/python-implementation-of-movie-recommender-system/
    ● https://www.mygreatlearning.com/blog/masterclass-on-movie-recommendation-system/
    ● https://rstudio-pubs-static.s3.amazonaws.com/288836_388ef70ec6374e348e32fde56f4b8f0e.html
  • 36. Acknowledgements
    We take this opportunity to thank all those who helped and guided us through this project and made the experience worthwhile for us. We wish to sincerely thank our reverend Bro. Jose Thuruthiyil and principal Dr. Sincy George for giving us the opportunity to undertake a project in the Third Year of Engineering. We would also like to thank the HOD of the Computer department, Dr. Kavita Sonawane, and all teaching and non-teaching staff for their immense support and cooperation. Last but not least, we would like to thank Mr. Rupesh Mishra for guiding us throughout the project and encouraging us to explore this domain.