Movies Recommendation System Using ML
Submitted By: Kruteeka Samal (2021UGELE16), Akash Vishwakarma (2021UGECE02)
Submitted To: Nagraj Sir
Table of contents
01 Problem Statement
02 Introduction to Recommendation System
03 Project Flow
04 Data Processing
05 Text Vectorization
06 UI building and Deployment
07 Accuracy and Confusion Matrix
08 Results
09 Conclusion
Problem Statement
In the era of digital entertainment, users often
face the challenge of selecting movies from a
vast library of options. To enhance the user
experience and aid in decision-making, a movie
recommendation system is desired. The system
aims to recommend movies based on user
preferences, leveraging collaborative filtering
techniques.
What is a Recommendation System?
Recommendation System
A recommendation system is a technology or algorithmic approach designed to suggest
items or content to users based on their preferences, behaviors, or characteristics.
The primary goal of a recommendation system is to predict what items or content a user
may be interested in and present those recommendations in a personalized and relevant
manner.
These systems are widely used in various industries, including e-commerce, streaming
services, social media, and more.
Types
Content-Based
• This method recommends items based on the characteristics or features of the items themselves and the user's preferences.
• It involves analyzing the content of items and recommending items with similar attributes to those the user has shown interest in.
Collaborative Filtering
• Collaborative filtering is a popular technique used in recommendation systems to provide personalized suggestions to users based on the preferences and behaviors of similar users.
• There are two main types of collaborative filtering: user-based collaborative filtering and item-based collaborative filtering.
Hybrid
• Hybrid recommendation systems combine multiple approaches to provide more accurate and diverse recommendations.
• For example, a system might integrate collaborative filtering and content-based filtering to leverage the strengths of both methods.
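To make the user-based collaborative filtering idea concrete, here is a minimal sketch in plain Python. It is not the project's code: the ratings, the user names, and the function names (cosine_similarity, recommend) are all invented for illustration, and scoring by a similarity-weighted sum of ratings is just one simple choice among several.

```python
from math import sqrt

# Hypothetical ratings (user -> {movie: rating}); invented for illustration only.
ratings = {
    "alice": {"Inception": 5, "Interstellar": 4, "Titanic": 1},
    "bob": {"Inception": 4, "Interstellar": 5, "Avatar": 4},
    "carol": {"Titanic": 5, "Avatar": 2, "The Notebook": 5},
}


def cosine_similarity(a, b):
    """Cosine similarity between two rating dicts; unrated movies count as 0."""
    dot = sum(a[m] * b[m] for m in a if m in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(user, ratings, top_n=2):
    """Rank movies the user has not rated by a similarity-weighted sum of
    the ratings given by the other users."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        for movie, rating in other_ratings.items():
            if movie not in ratings[user]:
                scores[movie] = scores.get(movie, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


print(recommend("alice", ratings))  # ['Avatar', 'The Notebook']
```

Alice's ratings resemble Bob's far more than Carol's, so Bob's well-rated "Avatar" outranks Carol's "The Notebook". A content-based system would instead compare the movies' own attributes, with no need for other users' ratings.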
Project Flow
01 Data Collection
02 Data Processing
03 Text Vectorization
04 UI Building
05 Deployment
Data Processing
Data processing in machine learning is about preparing raw data for model training.
This includes cleaning data, transforming it for compatibility, and splitting it into
training, validation, and testing sets.
Effective data processing is crucial for models to perform well and generalize to new
data.
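The cleaning and splitting steps described above can be sketched as follows. This is an illustrative outline only: the records, the field names ("title", "overview"), the 80/10/10 ratio, and the helper names clean and split are assumptions, not taken from the project.

```python
import random

# Hypothetical raw records; titles and fields are invented for illustration.
raw_rows = [{"title": f"Movie {i}", "overview": f"Plot {i}"} for i in range(10)]
raw_rows.append({"title": "Movie 3", "overview": "Plot 3"})    # duplicate title
raw_rows.append({"title": "Broken entry", "overview": None})   # missing field


def clean(rows):
    """Drop rows with a missing title or overview, strip whitespace,
    and keep only the first row for each title."""
    seen = set()
    cleaned = []
    for row in rows:
        title = (row.get("title") or "").strip()
        overview = (row.get("overview") or "").strip()
        if not title or not overview or title in seen:
            continue
        seen.add(title)
        cleaned.append({"title": title, "overview": overview})
    return cleaned


def split(rows, train=0.8, val=0.1, seed=42):
    """Shuffle a copy of the rows and cut it into train/validation/test parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train)
    n_val = int(len(shuffled) * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )


cleaned = clean(raw_rows)
train_set, val_set, test_set = split(cleaned)
print(len(cleaned), len(train_set), len(val_set), len(test_set))  # 10 8 1 1
```

Fixing the random seed keeps the split reproducible between runs, which makes later accuracy figures comparable.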
Data Set: Credits.csv, Cast, new_df (slide images not reproduced)
Helper Functions: Python (slide images not reproduced)
Text Vectorization
Text vectorization is the process of converting textual data into numerical vectors,
which can be used as input for machine learning algorithms.
In natural language processing (NLP) and text analysis, representing text as numerical
vectors is essential because most machine learning models operate on numerical
data.
Some common techniques for text vectorization:
• Bag-of-Words (BoW)
• Term Frequency-Inverse Document Frequency (TF-IDF)
• N-grams
• Word Embeddings
• and many more...
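As an illustration of one technique from the list, here is a minimal TF-IDF sketch in plain Python. It uses the basic textbook form (term frequency times log(N / document frequency)); libraries often apply smoothed variants, so their numbers may differ. The three-sentence corpus and the function name tf_idf are invented for this example.

```python
import math
from collections import Counter

# Hypothetical mini-corpus, invented for illustration.
docs = ["the cat in the hat", "the quick brown fox", "the cat chased the fox"]


def tf_idf(docs):
    """Return one {word: weight} dict per document using the textbook formula
    tf(word, doc) * log(N / df(word))."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each word appears.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            word: (count / len(tokens)) * math.log(n_docs / df[word])
            for word, count in counts.items()
        })
    return vectors


vectors = tf_idf(docs)
for word, weight in sorted(vectors[0].items()):
    print(f"{word}: {weight:.3f}")
```

In this corpus "the" appears in every document, so its weight is zero, while "hat", which appears in only one document, gets the highest weight in the first document. This is the main difference from raw word counts: common words are down-weighted.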
Text Vectorization
Bag-of-Words (BoW)
• Represents a document as an unordered collection (a "bag") of words, ignoring grammar and word order.
• It converts a document or a piece of text into a numerical vector that records how often each word occurs in the document.
Tokenization
• The first step is to break down the text into individual words or tokens. This process is called tokenization.
Vocabulary Construction
• Create a vocabulary: the set of unique words present in the entire corpus (collection of documents).
• Each word is assigned a unique index in the vocabulary.
Document Representation
• For each document in the corpus, create a vector whose length equals the size of the vocabulary.
• Initialize all values in the vector to zero.
Word Counting
• For each word in the document, increment the value at that word's vocabulary index by one.
• The resulting vector holds the count of each word in the document.
Sparse Representation
• Since most documents use only a small subset of the entire vocabulary, the resulting vectors are typically sparse (contain mostly zeros).
Example:
• Consider two documents: "The cat in the hat" and "The quick brown fox".
• Vocabulary (lowercased): ["the", "cat", "in", "hat", "quick", "brown", "fox"] (unique words across both documents).
• Document Vectors:
⚬ "The cat in the hat": [2, 1, 1, 1, 0, 0, 0] ("the" occurs twice)
⚬ "The quick brown fox": [1, 0, 0, 0, 1, 1, 1]
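The Bag-of-Words steps (tokenization, vocabulary construction, word counting) can be sketched in plain Python on the two example documents. The function names tokenize, build_vocabulary, and vectorize are illustrative, and lowercasing is an assumption of this sketch. Because tokens are lowercased, "The" and "the" count as the same word, so "the" is counted twice in the first document.

```python
def tokenize(text):
    """Split on whitespace, strip surrounding punctuation, and lowercase."""
    tokens = []
    for word in text.split():
        word = word.strip('.,!?"').lower()
        if word:
            tokens.append(word)
    return tokens


def build_vocabulary(docs):
    """Assign each unique token an index in order of first appearance."""
    vocab = {}
    for doc in docs:
        for token in tokenize(doc):
            if token not in vocab:
                vocab[token] = len(vocab)
    return vocab


def vectorize(doc, vocab):
    """Start from a zero vector and increment the slot of every token seen."""
    vector = [0] * len(vocab)
    for token in tokenize(doc):
        if token in vocab:
            vector[vocab[token]] += 1
    return vector


docs = ["The cat in the hat", "The quick brown fox."]
vocab = build_vocabulary(docs)
vectors = [vectorize(doc, vocab) for doc in docs]

print(list(vocab))  # ['the', 'cat', 'in', 'hat', 'quick', 'brown', 'fox']
print(vectors[0])   # [2, 1, 1, 1, 0, 0, 0]
print(vectors[1])   # [1, 0, 0, 0, 1, 1, 1]
```

With a real corpus the vocabulary grows to thousands of words, which is why the vectors are usually stored in a sparse format that keeps only the non-zero counts.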
Helper Functions
UI Building and Deployment
GUI
Accuracy and Confusion Matrix
Accuracy
Confusion Matrix
Results
Conclusion
THANK YOU
