A New Similarity Measurement based on Hellinger Distance
For Collaborating Filtering in Sparse Data Set
Submitted in Fulfillment of Requirements for the
Degree of
MASTER OF TECHNOLOGY IN
COMPUTER SCIENCE AND ENGINEERING
specialization in
Information Security
by
Prabhu Kumar (15MT000624)
Under the guidance of
Dr. Rajendra Pamula
(Assistant Professor)
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
INDIAN INSTITUTE OF TECHNOLOGY (INDIAN SCHOOL OF MINES), DHANBAD
INDIA
MAY 2017
Outline
• Introduction to recommender systems
• Sources of information
• Types of recommendation systems
• Architecture
• Similarity measurements
• Proposed method
• Result
• References
Introduction
What is a Recommender System?
• It is a generic machine learning technique, or information filtering system, that predicts a user's preferences.
Example of Recommender System
• Recommender systems are widely used for movie, news, and music recommendation, among others.
Source of Information
• The data collected for recommendation comes from content, demographic, and social media information.
Source of Information (continued)
Types of Recommendation
1. Collaborative filtering recommendation system: It is based on the way humans have made decisions throughout history, and it relies on the ratings that users have previously given to specific items. The algorithm analyzes these ratings and predicts items to recommend.
2. Content-based recommendation system: It is based on the choices the user made in the past, that is, on the content the user liked most.
3. Hybrid recommendation system: A combination of both. If techniques A and B are used together for recommendation, A compensates for B's disadvantages and B compensates for A's.
Collaborative Filtering based Recommender System
Content-based Recommender System
Architecture of Recommender System
• For the matching process in a recommender system:
"The KNN algorithm is one of the most useful algorithms for recommending items to users."
KNN algorithm (user-oriented)
Continued…
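A user-oriented KNN matching step can be sketched roughly as follows. This is only an illustration, not the architecture shown on the slides: the `Ratings` layout, the function names, the toy data, and the placeholder `agreement` similarity (share of co-rated items rated identically) are all assumptions made for the example. Any of the similarity measures discussed in this presentation could be passed in instead.

```python
from typing import Callable, Dict, List, Tuple

Ratings = Dict[str, Dict[str, float]]  # user -> {item -> rating}


def agreement(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Placeholder similarity (an assumption): share of co-rated items rated identically."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    return sum(1 for item in common if a[item] == b[item]) / len(common)


def k_nearest_neighbours(
    ratings: Ratings,
    active_user: str,
    k: int,
    similarity: Callable[[Dict[str, float], Dict[str, float]], float] = agreement,
) -> List[Tuple[str, float]]:
    """Return the k users most similar to active_user, best first."""
    scores = [
        (other, similarity(ratings[active_user], ratings[other]))
        for other in ratings
        if other != active_user
    ]
    # Sort by similarity (descending); ties are broken by user name for determinism.
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return scores[:k]


# Hypothetical toy data, not taken from the slides.
toy = {
    "alice": {"m1": 5, "m2": 3, "m3": 4},
    "bob": {"m1": 5, "m2": 3},
    "carol": {"m1": 1, "m2": 3, "m3": 2},
    "dave": {"m4": 4},
}
neighbours = k_nearest_neighbours(toy, "alice", k=2)
```

Passing the similarity as a function keeps the neighbour search independent of the measure, so different measures can be compared on the same data.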
Similarity Measurements
• A similarity measurement is a technique that finds the nearest neighbors of a specific active user for further processing of the recommendation.
• Cosine Similarity:
"It measures the angle between two rating vectors: the smaller the angle, the higher the similarity."
$$ sim(u, v)^{cos} = \frac{\vec{r}_u \cdot \vec{r}_v}{\lVert \vec{r}_u \rVert \, \lVert \vec{r}_v \rVert} $$
"A vector is a quantity that has both magnitude and direction."
Drawbacks:
• If the two vectors lie on the same line, for example a = (2,2,2,2) and b = (3,3,3,3), the angle between them is 0 and the cosine similarity is 1, even though the ratings differ.
• It suffers when there are only a few co-rated items.
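The collinear-vector drawback of cosine similarity can be checked numerically. The sketch below is a plain-Python illustration; the function name and the error handling are assumptions of this example, not part of the slides.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length rating vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# The example from the slide: different ratings, same direction.
same_line = cosine_similarity([2, 2, 2, 2], [3, 3, 3, 3])
```

Here `same_line` evaluates to 1 although user a rated every item 2 and user b rated every item 3, which is the behaviour the drawback describes.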
• ACOS (Adjusted Cosine Similarity): "Some people tend to rate high even when they do not like an item very much, while others tend to rate low even when they like an item a lot. ACOS was introduced to account for this."
$$ sim(u, v)^{ACOS} = \frac{\sum_{j=1}^{n} (r_{u_j} - \bar{r}_{u_j})(r_{v_j} - \bar{r}_{v_j})}{\sqrt{\sum_{j=1}^{n} (r_{u_j} - \bar{r}_{u_j})^2} \, \sqrt{\sum_{j=1}^{n} (r_{v_j} - \bar{r}_{v_j})^2}} $$
where n is the total number of co-rated items.
Drawbacks:
• Similar-rating problem
• Few co-rated items problem
• Pearson's correlation coefficient (PCC): "It finds the linear correlation between two rating vectors."
$$ sim(u, v)^{PCC} = \frac{\sum_{p \in J} (r_{u,p} - \bar{r}_u)(r_{v,p} - \bar{r}_v)}{\sqrt{\sum_{p \in J} (r_{u,p} - \bar{r}_u)^2} \cdot \sqrt{\sum_{p \in J} (r_{v,p} - \bar{r}_v)^2}} $$
where J is the set of co-rated items.
Drawbacks:
• If one of the rating vectors is flat, for example a = (2,2,2,2) against b = (1,2,3,4), the PCC cannot be calculated because the denominator is zero.
• If there is only one co-rated item, the PCC will be "0", so it suffers from the few co-rated items problem.
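The flat-vector drawback of PCC can be illustrated with a short sketch. It assumes the means are taken over the co-rated items only (other formulations use each user's overall mean), and it returns `None` when the coefficient is undefined. Both choices, and the function name, are assumptions of this example; a caller may prefer to map the undefined case to 0.

```python
import math
from typing import Optional, Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> Optional[float]:
    """Pearson correlation of two co-rated vectors; None when it is undefined."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("vectors must be non-empty and of equal length")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A flat rating vector makes the denominator zero.
        return None
    return cov / math.sqrt(var_a * var_b)


flat = pearson([2, 2, 2, 2], [1, 2, 3, 4])    # flat vector from the slide
linear = pearson([1, 2, 3, 4], [2, 4, 6, 8])  # perfectly linear relation
```

With the slide's vectors, `flat` is `None` because a = (2,2,2,2) has zero variance, while the perfectly linear pair yields a coefficient of 1.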
• PIP (Proximity-Impact-Popularity):
$$ sim(u, v)^{PIP} = \sum_{j \in C} PIP(r_{u_j}, r_{v_j}) $$
where C is the set of co-rated items, and
$$ PIP(r_1, r_2) = Proximity(r_1, r_2) \cdot Impact(r_1, r_2) \cdot Popularity(r_1, r_2) $$
If $r_1 > r_{med}$ and $r_2 > r_{med}$:
$$ Proximity(r_1, r_2) = |r_1 - r_2| $$
$$ Impact(r_1, r_2) = (|r_1 - r_{med}| + 1)(|r_2 - r_{med}| + 1) $$
$$ Popularity(r_1, r_2) = 1 + \left( \frac{r_1 + r_2}{2} - \mu_k \right)^2 $$
Else:
$$ Proximity(r_1, r_2) = 2 \cdot |r_1 - r_2| $$
$$ Impact(r_1, r_2) = \frac{1}{(|r_1 - r_{med}| + 1)(|r_2 - r_{med}| + 1)} $$
$$ Popularity(r_1, r_2) = 1 $$
and $\mu_k$ is the average rating given to that particular item by all users.
Drawbacks:
• It does not consider the proportion of common ratings made by the users.
• Jaccard similarity measurement:
"It only considers the number of common ratings between two users."
$$ sim(u, v)^{Jaccard} = \frac{|I_u \cap I_v|}{|I_u \cup I_v|} $$
Drawbacks:
• It does not consider the absolute ratings.
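A small sketch makes the Jaccard drawback concrete. The user names and ratings below are hypothetical, and returning 0 for two users without any ratings is a convention chosen for this example.

```python
from typing import Dict


def jaccard(ratings_u: Dict[str, float], ratings_v: Dict[str, float]) -> float:
    """Ratio of co-rated items to all items rated by either user."""
    items_u, items_v = set(ratings_u), set(ratings_v)
    union = items_u | items_v
    if not union:
        return 0.0  # convention of this sketch when neither user rated anything
    return len(items_u & items_v) / len(union)


# Hypothetical users: same items, opposite opinions.
fan = {"m1": 5, "m2": 5, "m3": 4}
critic = {"m1": 1, "m2": 1, "m3": 2}
casual = {"m1": 5}

same_items = jaccard(fan, critic)
partial = jaccard(fan, casual)
```

`same_items` is 1 even though the two users disagree on every item, because only the sets of rated items enter the formula, not the rating values.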
• Mean squared difference:
"It only considers the absolute ratings."
$$ sim(u, v)^{msd} = 1 - \frac{\sum_{p \in I} (r_{u,p} - r_{v,p})^2}{|I|} $$
Drawbacks:
• It does not consider the number of common ratings between two users, so it ignores the credibility of the similarity measurement.
• It ignores the proportion of common ratings between two users.
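The mean squared difference measure and its first drawback can be sketched as below. One assumption goes beyond the slide: ratings are rescaled by the width of an assumed 1 to 5 rating scale (`r_min`, `r_max`) so that the result stays between 0 and 1. The function name and the sample users are also hypothetical.

```python
from typing import Dict


def msd_similarity(
    ratings_u: Dict[str, float],
    ratings_v: Dict[str, float],
    r_min: float = 1.0,  # assumed lower end of the rating scale
    r_max: float = 5.0,  # assumed upper end of the rating scale
) -> float:
    """1 minus the mean squared difference over the co-rated items."""
    common = set(ratings_u) & set(ratings_v)
    if not common:
        raise ValueError("the users have no co-rated items")
    span = r_max - r_min
    total = sum(((ratings_u[p] - ratings_v[p]) / span) ** 2 for p in common)
    return 1.0 - total / len(common)


# One shared item with an identical rating is enough for a perfect score.
one_common = msd_similarity({"a": 4}, {"a": 4, "b": 1, "c": 2, "d": 5})
opposite = msd_similarity({"a": 5}, {"a": 1})
```

`one_common` is 1 although the users share a single item, which shows why the measure says nothing about how credible the similarity is.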
Proposed method
Hellinger Distance:
• It is used to quantify the similarity between two vectors.
• The minimum Hellinger distance is zero, which occurs when the two users have rated every item exactly the same.
• For probability distributions, the value of the unnormalized distance ranges from 0 to √2.
• The factor 1/√2 is defined so that H(P,Q) ≤ 1 when P and Q are probability distributions.
𝑯 𝑷, 𝑸 =
𝟏
𝟐 π’Š=𝟏
π’Œ
( π’‘π’Š βˆ’ π’’π’Š) 𝟐
Let P = {2, 3, 1} and Q= {3, 2, 3}
So, Hellinger distance =
1
2
( 2 βˆ’ 3)2 + 3 βˆ’ 2 2 + ( 1 βˆ’ 3)2
=
1
2
0.101021 + 0.101021 + 0.53589838 =
1
2
𝑋 0.85903 =0.60743
Local references:
• It plays an important role in finding local information about the user's ratings.
• It must provide positive as well as negative correlation between two users.
• It is used to find the actual relation between two users according to their ratings.
$$ loc_{med}(r_{ui}, r_{vi}) = \frac{(r_{ui} - r_{med})(r_{vi} - r_{med})}{\sqrt{\sum_{k \in I_u} (r_{uk} - r_{med})^2} \, \sqrt{\sum_{k \in I_v} (r_{vk} - r_{med})^2}} $$
where
k ranges over the items rated by the user (I_u for user u, I_v for user v),
r_ui is the rating by user u for the ith item,
r_vi is the rating by user v for the ith item,
r_med is the average of the ratings by users.
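The sign behaviour of the local term can be sketched as follows. Several things here are assumptions of the example rather than statements from the slides: the denominator is taken as the product of the square roots of the two sums, `r_med` is passed in as a parameter and set to 3 (the midpoint of an assumed 1 to 5 scale), and the function name is hypothetical. The ratings are those of User4 and User5 in the rating table of the Result section.

```python
import math
from typing import Dict


def loc_med(
    r_ui: float,
    r_vi: float,
    ratings_u: Dict[str, float],
    ratings_v: Dict[str, float],
    r_med: float,
) -> float:
    """Local similarity of one rating pair, relative to the reference rating r_med."""
    dev_u = math.sqrt(sum((r - r_med) ** 2 for r in ratings_u.values()))
    dev_v = math.sqrt(sum((r - r_med) ** 2 for r in ratings_v.values()))
    if dev_u == 0 or dev_v == 0:
        raise ValueError("undefined when all ratings of a user equal r_med")
    return (r_ui - r_med) * (r_vi - r_med) / (dev_u * dev_v)


user4 = {"Item1": 2, "Item2": 1}
user5 = {"Item1": 4, "Item2": 2}
R_MED = 3.0  # assumption: midpoint of a 1-5 scale

item1 = loc_med(user4["Item1"], user5["Item1"], user4, user5, R_MED)
item2 = loc_med(user4["Item2"], user5["Item2"], user4, user5, R_MED)
```

Under these assumptions `item1` is negative (one rating below and one above the reference) and `item2` is positive (both below it), so the term can express both negative and positive correlation.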
Proposed method equation:
$$ S(u, v) = H(u, v) \cdot \sum_{i \in u} \sum_{j \in v} loc(r_{ui}, r_{vj}) + Jaccard(u, v) $$
Where,
H(u, v) is the Hellinger distance,
loc(r_ui, r_vj) is the local similarity between the two users' ratings of those items,
Jaccard(u, v) measures the proportion of common ratings of the two users.
Result:
• In this graph, the flat item-rating problem and the few common ratings problem are solved using the proposed method.
• U1-U3 and U2-U4 have flat ratings; U4-U5 shows the improvement in the proportion of common ratings.
• U3-U5 has the few co-rated items problem.
        Item1  Item2  Item3  Item4
User1     4      3      5      4
User2     5      3      -      -
User3     4      3      4      4
User4     2      1      -      -
User5     4      2      -      -
• The problems of identical co-rated vectors and few co-rated items are improved using the proposed method, and the simultaneous rating difference problem is also solved.
• U1 and U3 have the same co-rated vector; this case improves with the proposed method.
• U1 and U5 suffer from few co-rated items.
• U4 and U5 have the simultaneous difference problem.
• The problems of local similarity and proportion of ratings are improved using the proposed method.
• U4 and U5 have the proportion-of-ratings problem in PIP, which is improved by the proposed method.
• U1 and U4 have the few co-rated items problem.
• U2 and U4 show an improvement in local similarity.
Evaluation of Proposed method in large dataset
• The evaluation uses two MovieLens datasets. ML-100K contains 100,000 ratings from 943 users on 1,682 movies. ML-1M contains 1,000,209 ratings from 6,040 users on 3,952 movies. Each user has rated at least 20 movies.
• Movie recommendation using Cosine Similarity and the proposed method.
• Movie recommendation using PIP (proximity-impact-popularity) and the proposed method.
References
• J. Bobadilla, F. Ortega, A. Hernando, A. Gutiérrez, Recommender systems survey, Knowl.-Based Syst. 46 (2013) 109–132.
• P. Resnick, H.R. Varian, Recommender systems, Commun. ACM 40 (3) (1997) 56–58.
• G. Linden, B. Smith, J. York, Amazon.com recommendations: item-to-item collaborative filtering, IEEE Internet Comput. 7 (1) (2003) 76–80.
• Y. Koren, Factorization meets the neighborhood: a multifaceted collaborative filtering model, in: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2008, pp. 426–434.
• C. Desrosiers, G. Karypis, A comprehensive survey of neighborhood-based recommendation methods, in: Recommender Systems Handbook, 2011, pp. 107–144.
• M.J. Pazzani, D. Billsus, Content-based recommendation systems, The Adap. Web (2007) 325–341.
• H. Junming, C. Xueqi, G. Jiafeng, S. Huawei, Y. Kun, Social recommendation with interpersonal influence, ECAI 10 (2010) 601–606.
Thank You!
A special thanks to my project guide, Dr. Rajendra Pamula, for guiding, motivating and providing me with fruitful information throughout the development of this project work.
My sincere gratitude to the panel of teachers present for giving their precious time to listen to and evaluate my project presentation.

Advanced Test Driven-Development @ php[tek] 2024
Β 
Integration and Automation in Practice: CI/CD in MuleΒ Integration and Automat...
Integration and Automation in Practice: CI/CD in MuleΒ Integration and Automat...Integration and Automation in Practice: CI/CD in MuleΒ Integration and Automat...
Integration and Automation in Practice: CI/CD in MuleΒ Integration and Automat...
Β 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
Β 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Β 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
Β 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
Β 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
Β 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
Β 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
Β 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
Β 
Hot Sexy call girls in Panjabi Bagh πŸ” 9953056974 πŸ” Delhi escort Service
Hot Sexy call girls in Panjabi Bagh πŸ” 9953056974 πŸ” Delhi escort ServiceHot Sexy call girls in Panjabi Bagh πŸ” 9953056974 πŸ” Delhi escort Service
Hot Sexy call girls in Panjabi Bagh πŸ” 9953056974 πŸ” Delhi escort Service
Β 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
Β 

A new similarity measurement based on hellinger distance for collaborating filtering in sparse data set

  • 1. A New Similarity Measurement based on Hellinger Distance For Collaborating Filtering in Sparse Data Set. Submitted in Fulfillment of Requirements for the Degree of MASTER OF TECHNOLOGY IN COMPUTER SCIENCE AND ENGINEERING, specialization in Information Security, by Prabhu Kumar (15MT000624), under the guidance of Dr. Rajendra Pamula (Assistant Professor). DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING, INDIAN INSTITUTE OF TECHNOLOGY (INDIAN SCHOOL OF MINES), DHANBAD, INDIA. MAY 2017
  • 2. Outlines
    • Introduction to recommender systems
    • Source of information
    • Types of recommendation system
    • Architecture
    • Similarity measurements
    • Proposed method
    • Result
    • References
  • 3. Introduction: What is a Recommender System?
    • It is a generic machine learning technique or information filtering system that predicts a user's preferences.
  • 4. Example of Recommender System
    • Recommender systems are widely used for movie, news, and music recommendation.
  • 5. Source of Information
    • The data collected for recommendation comes from content, demographic, and social media information.
  • 6. Source of information (Continued..)
  • 7. Types of Recommendation
    1. Collaborative filtering recommendation system: It is based on the way humans have made decisions throughout history, and relies on the ratings that users have previously given to specific items. The algorithm analyzes these ratings to predict items for recommendation.
    2. Content-based recommendation system: It is based on the choices the user made in the past, in the form of the content the user liked most.
    3. Hybrid recommendation system: A combination of both. If techniques A and B are used together for recommendation, the strengths of A compensate for the disadvantages of B, and the strengths of B compensate for the disadvantages of A.
  • 8. Collaborative filtering based recommender system
  • 11. For the matching process in a recommender system:
    "The KNN algorithm is one of the most useful algorithms for recommending items to users."
  • 14. Similarity Measurements
    • A similarity measurement is a technique that finds the nearest neighbors of a specific active user for further processing of the recommendation.
    • Cosine similarity: "It measures the angle between two rating vectors; the smaller the angle, the higher the similarity."
      $$sim(u,v)^{cos} = \frac{\vec{r}_u \cdot \vec{r}_v}{\lVert \vec{r}_u \rVert \, \lVert \vec{r}_v \rVert}$$
      "A vector is a quantity that has magnitude and direction."
      Drawbacks:
      • If the two vectors lie on the same line, for example a = (2,2,2,2) and b = (3,3,3,3), the angle between them is 0 and the cosine similarity is 1, even though the actual ratings differ.
      • It suffers from the few co-rated items problem.
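The collinear-vector drawback of cosine similarity can be checked with a few lines of Python. This is a minimal sketch, not the author's implementation; the function name `cosine_similarity` is chosen here for illustration, and both users are assumed to have rated the same four items.

```python
import math

def cosine_similarity(r_u, r_v):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(a * b for a, b in zip(r_u, r_v))
    norm_u = math.sqrt(sum(a * a for a in r_u))
    norm_v = math.sqrt(sum(b * b for b in r_v))
    return dot / (norm_u * norm_v)

# The two vectors from the slide: same direction, different magnitudes.
a = (2, 2, 2, 2)
b = (3, 3, 3, 3)
sim = cosine_similarity(a, b)
print(sim)
```

The result is 1 although one user consistently rates lower than the other, which is why cosine similarity alone cannot distinguish the two rating patterns.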
  • 15. ACOS (Adjusted Cosine Similarity): "Some people tend to rate high even when they do not like an item very much, while others tend to rate low even when they like an item a lot. ACOS was introduced to account for this."
      $$sim(u,v)^{ACOS} = \frac{\sum_{j=1}^{n} (r_{u_j} - \bar{r}_{u_j})(r_{v_j} - \bar{r}_{v_j})}{\sqrt{\sum_{j=1}^{n} (r_{u_j} - \bar{r}_{u_j})^2} \, \sqrt{\sum_{j=1}^{n} (r_{v_j} - \bar{r}_{v_j})^2}}$$
      where n is the total number of co-rated items.
      Drawbacks:
      • Similar rating problem
      • Few co-rated items problem
    • Pearson's correlation (PCC): "It finds the linear correlation between two rating vectors."
      $$sim(u,v)^{PCC} = \frac{\sum_{p \in j} (r_{u,p} - \bar{r}_u)(r_{v,p} - \bar{r}_v)}{\sqrt{\sum_{p \in j} (r_{u,p} - \bar{r}_u)^2} \cdot \sqrt{\sum_{p \in j} (r_{v,p} - \bar{r}_v)^2}}$$
      Drawbacks:
      • If a rating vector is flat, for example a = (2,2,2,2) compared with b = (1,2,3,4), PCC cannot be calculated because the denominator becomes zero.
      • If there is only one co-rated item, PCC will be "0", so it suffers from the few co-rated items problem.
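The flat-vector drawback of PCC can be reproduced directly. The sketch below is illustrative only: the function name `pearson` is hypothetical, and it assumes the user means are taken over the co-rated items (the slide does not say whether the means cover co-rated items or all of a user's ratings).

```python
import math

def pearson(r_u, r_v):
    """Pearson correlation of two equal-length rating vectors.

    Assumption: means are computed over the co-rated items only.
    Returns None when the correlation is undefined.
    """
    n = len(r_u)
    mean_u = sum(r_u) / n
    mean_v = sum(r_v) / n
    num = sum((a - mean_u) * (b - mean_v) for a, b in zip(r_u, r_v))
    den_u = math.sqrt(sum((a - mean_u) ** 2 for a in r_u))
    den_v = math.sqrt(sum((b - mean_v) ** 2 for b in r_v))
    if den_u == 0 or den_v == 0:
        return None  # flat vector: zero variance, so the ratio is undefined
    return num / (den_u * den_v)

flat = pearson((2, 2, 2, 2), (1, 2, 3, 4))    # flat vector from the slide
linear = pearson((1, 2, 3, 4), (2, 4, 6, 8))  # perfectly linear pair
print(flat, linear)
```

Returning `None` for the flat case is a design choice of this sketch; a real system would need some fallback similarity for such user pairs.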
  • 16. PIP (Proximity-Impact-Popularity):
      $$sim(u,v)^{PIP} = \sum_{j \in \text{co-rated items}} PIP(r_{u_j}, r_{v_j})$$
      where
      $$PIP(r_1, r_2) = Proximity(r_1, r_2) \cdot Impact(r_1, r_2) \cdot Popularity(r_1, r_2)$$
      If $r_1 > r_{med}$ and $r_2 > r_{med}$:
      $$Proximity(r_1, r_2) = |r_1 - r_2|$$
      $$Impact(r_1, r_2) = (|r_1 - r_{med}| + 1)(|r_2 - r_{med}| + 1)$$
      $$Popularity(r_1, r_2) = 1 + \left(\frac{r_1 + r_2}{2} - \mu_k\right)^2$$
      Else:
      $$Proximity(r_1, r_2) = 2 \cdot |r_1 - r_2|$$
      $$Impact(r_1, r_2) = \frac{1}{(|r_1 - r_{med}| + 1)(|r_2 - r_{med}| + 1)}$$
      $$Popularity(r_1, r_2) = 1$$
      and $\mu_k$ is the average rating for that particular item, as rated by all users.
      Drawbacks:
      • It does not consider the proportion of common ratings made by the users.
  • 17. Jaccard similarity measurement: "It only considers the number of common ratings between two users."
      $$sim(u,v)^{Jaccard} = \frac{|I_u \cap I_v|}{|I_u \cup I_v|}$$
      Drawbacks:
      • It does not consider the absolute ratings.
    • Mean squared difference (MSD): "It only considers the absolute ratings."
      $$sim(u,v)^{msd} = 1 - \frac{\sum_{p \in I} (r_{u,p} - r_{v,p})^2}{|I|}$$
      Drawbacks:
      • It does not consider the number of common ratings between two users, so it ignores the credibility of the similarity measurement.
      • It ignores the proportion of common ratings between two users.
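The complementary blind spots of Jaccard and MSD are easy to see side by side. The following is a sketch under stated assumptions: each user is a dictionary mapping item names to ratings, the function names are hypothetical, and MSD is implemented exactly as written above, without normalizing ratings to [0, 1] (so it can become negative for large rating gaps).

```python
def jaccard(ratings_u, ratings_v):
    """Share of items rated by both users among items rated by either."""
    items_u, items_v = set(ratings_u), set(ratings_v)
    union = items_u | items_v
    if not union:
        return 0.0
    return len(items_u & items_v) / len(union)

def msd_similarity(ratings_u, ratings_v):
    """1 minus the mean squared rating difference over co-rated items."""
    common = set(ratings_u) & set(ratings_v)
    if not common:
        return None  # no co-rated items: undefined
    total = sum((ratings_u[p] - ratings_v[p]) ** 2 for p in common)
    return 1 - total / len(common)

# Illustrative ratings: one user rated four items, the other only two.
user_a = {"Item1": 4, "Item2": 3, "Item3": 5, "Item4": 4}
user_b = {"Item1": 5, "Item2": 3}

j = jaccard(user_a, user_b)
m = msd_similarity(user_a, user_b)
print(j, m)
```

Jaccard looks only at which items were rated and ignores the rating values, while MSD looks only at the rating values of the co-rated items and ignores how many items the users have in common.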
  • 18. Proposed method: Hellinger Distance
    • It is used to quantify the similarity between two vectors.
    • The minimum Hellinger distance is zero, which occurs if no item is rated by both users or if all items are rated exactly the same by both users.
    • Without the normalizing factor, the value of the Hellinger distance ranges from 0 to $\sqrt{2}$.
    • The factor $\frac{1}{\sqrt{2}}$ is defined so that $H(P,Q) \le 1$ for the distance between any two users.
      $$H(P,Q) = \frac{1}{\sqrt{2}} \sqrt{\sum_{i=1}^{k} \left(\sqrt{p_i} - \sqrt{q_i}\right)^2}$$
    Let P = {2, 3, 1} and Q = {3, 2, 3}. Then
      $$H(P,Q) = \frac{1}{\sqrt{2}} \sqrt{(\sqrt{2} - \sqrt{3})^2 + (\sqrt{3} - \sqrt{2})^2 + (\sqrt{1} - \sqrt{3})^2}$$
      $$= \frac{1}{\sqrt{2}} \sqrt{0.101021 + 0.101021 + 0.53589838} = \frac{1}{\sqrt{2}} \times 0.85903 = 0.60743$$
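The worked example above can be reproduced numerically. This is a minimal sketch that applies the formula directly to the raw rating vectors, as the slide does; the function name `hellinger` is chosen here for illustration.

```python
import math

def hellinger(p, q):
    """Hellinger distance between two equal-length non-negative vectors."""
    total = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(total) / math.sqrt(2)

P = (2, 3, 1)
Q = (3, 2, 3)
h = hellinger(P, Q)
print(round(h, 5))
```

Identical vectors give a distance of zero, for example `hellinger(P, P)`. Note that the bound H(P,Q) ≤ 1 is guaranteed when P and Q are probability distributions; for raw rating vectors such as these it is not guaranteed in general.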
  • 19. Local references:
    • It plays an important role in finding local information about the users' ratings.
    • It must provide positive as well as negative correlation between two users.
    • It is used for finding the actual relation between two users according to their ratings.
      $$loc_{med}(r_{ui}, r_{vi}) = \frac{(r_{ui} - r_{med})(r_{vi} - r_{med})}{\sqrt{\sum_{k \in I_u} (r_{uk} - r_{med})^2} \, \sqrt{\sum_{k \in I_v} (r_{vk} - r_{med})^2}}$$
    where k ranges over all items rated by the user, $r_{ui}$ is the rating by user u for the i-th item, $r_{vi}$ is the rating by user v for the i-th item, and $r_{med}$ is the average of the ratings by users.
  • 20. Proposed method equation:
      $$S(u,v) = H(u,v) \cdot \sum_{i \in u} \sum_{j \in v} loc(r_{ui}, r_{vj}) + Jaccard(u,v)$$
    where H(u, v) is the Hellinger distance, $loc(r_{ui}, r_{vj})$ is the local similarity between the users' ratings of those items, and Jaccard(u, v) measures the proportion of common ratings between the two users.
  • 21. Result:
    • In this graph, the flat item-rating problem and the few common ratings problem are solved using the proposed method.
    • U1-U3 and U2-U4 are flat rating cases; U4-U5 shows the improvement in common rating proportion.
    • U3 to U5 have the few co-rated items problem.

    |       | Item1 | Item2 | Item3 | Item4 |
    |-------|-------|-------|-------|-------|
    | User1 | 4     | 3     | 5     | 4     |
    | User2 | 5     | 3     | -     | -     |
    | User3 | 4     | 3     | 4     | 4     |
    | User4 | 2     | 1     | -     | -     |
    | User5 | 4     | 2     | -     | -     |
  • 22. Result (continued):
    • The problems of identical co-rated vectors and few co-rated items are improved using the proposed method, and the simultaneous rating difference problem is also solved.
    • U1 and U3 have the same co-rated vector; the proposed method improves this case.
    • U1 and U5 suffer from few co-rated items.
    • U4 and U5 have the simultaneous difference problem.
  • 23. Result (continued):
    • The problems of local similarity and proportion of ratings are improved using the proposed method.
    • U4 and U5 have the proportion of ratings problem in PIP, which is improved by the proposed method.
    • U1 and U4 have the few co-rated items problem.
    • U2 and U4 show the local similarity improvement.
  • 24. Evaluation of the proposed method on large datasets
    • The proposed method is evaluated on two large MovieLens datasets. ML-100K contains 100,000 ratings from 943 users on 1,682 movies. ML-1M contains 1,000,209 ratings from 6,040 users on 3,952 movies. Each user has rated at least 20 movies.
  • 25. Movie recommendation results using cosine similarity and the proposed method.
  • 26. Movie recommendation results using PIP (proximity-impact-popularity) and the proposed method.
  • 27. References
    • J. Bobadilla, F. Ortega, A. Hernando, A. Gutiérrez, Recommender systems survey, Knowl.-Based Syst. 46 (2013) 109–132.
    • P. Resnick, H.R. Varian, Recommender systems, Commun. ACM 40 (3) (1997) 56–58.
    • G. Linden, B. Smith, J. York, Amazon.com recommendations: item-to-item collaborative filtering, IEEE Internet Comput. 7 (1) (2003) 76–80.
    • Y. Koren, Factorization meets the neighborhood: a multifaceted collaborative filtering model, in: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2008, pp. 426–434.
    • C. Desrosiers, G. Karypis, A comprehensive survey of neighborhood-based recommendation methods, in: Recommender Systems Handbook, 2011, pp. 107–144.
    • M.J. Pazzani, D. Billsus, Content-based recommendation systems, The Adap. Web (2007) 325–341.
    • H. Junming, C. Xueqi, G. Jiafeng, S. Huawei, Y. Kun, Social recommendation with interpersonal influence, ECAI 10 (2010) 601–606.
  • 28. Thank You! A special thanks to my project guide, Dr. Rajendra Pamula, for guiding, motivating, and providing me with fruitful information throughout the development of this project work. My sincere gratitude to the panel of teachers present for giving their precious time to listen to and evaluate my project presentation.