Advanced cosine measures for
collaborative filtering
Prof. Loc Nguyen, PhD
Loc Nguyen’s Academic Network, Vietnam
Email: ng_phloc@yahoo.com
Homepage: www.locnguyen.net
TA measure - Loc Nguyen7/12/2020 1
Abstract
Cosine similarity is an important measure for comparing two vectors in data mining and
information retrieval research. In this research, the cosine measure and its advanced
variants for collaborative filtering (CF) are evaluated. The cosine measure is effective, but
it has a drawback: the end points of two vectors may be far from each other according to
Euclidean distance while their cosine is still high. This negative effect of Euclidean
distance decreases the accuracy of cosine similarity. Therefore, a so-called triangle area
(TA) measure is proposed as an improved version of the cosine measure. The TA measure uses
the ratio of the basic triangle area to the whole triangle area as a reinforced factor for
Euclidean distance, so that it alleviates the negative effect of Euclidean distance while
keeping the simplicity and effectiveness of both the cosine measure and Euclidean distance in
measuring the similarity of two vectors. TA is considered an advanced cosine measure. TA and
other advanced cosine measures are tested against other similarity measures. According to the
experimental results, TA is not a preeminent measure, but it is better than traditional
cosine measures in most cases and is also adequate for real-time applications. Moreover, its
formula is simple.
Table of contents
1. Introduction
2. Methodologies
3. Results and discussions
4. Conclusions
1. Introduction
• A recommendation system recommends items to users from the many items existing in a
database. There are two main approaches to recommendation: content-based filtering (CBF) and
collaborative filtering (CF). One of the popular algorithms in CF is the nearest neighbors
(NN) algorithm. The essence of the NN algorithm [1] is to find the nearest neighbors of the
active user and then to recommend to the active user the items that these neighbors may like.
• Let U = {u1, u2,…, um} be the set of users and let V = {v1, v2,…, vn} be the set of items.
The user-based rating matrix is the matrix in which rows indicate users, columns indicate
items, and each cell is a rating. Let u1 = (r11, r12,…, r1n) and u2 = (r21, r22,…, r2n) be the
rating vectors of user 1 and user 2, in which some rij can be missing (empty).
• Suppose the active rating vector is u4, whose ratings r43 and r44 are missing. The NN
algorithm finds the nearest neighbors of u4 and then computes the predictive values for r43
and r44 based on the similarities between these neighbors and u4.
• Finding the nearest neighbors of the active user means calculating the similarities between
the active vector and the other vectors. Users whose similarity to the active user is larger
than or equal to a threshold are the nearest neighbors of the active user.
• Let sim(u1, u2) denote the similarity of u1 and u2. There are many similarity measures. For
instance, the cosine measure is defined as follows [1]:
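$$\mathrm{sim}(u_1, u_2) = \cos(u_1, u_2) = \frac{u_1 \cdot u_2}{|u_1|\,|u_2|} = \frac{\sum_{j \in I_1 \cap I_2} r_{1j}\, r_{2j}}{\sqrt{\sum_{j \in I_1 \cap I_2} r_{1j}^2}\;\sqrt{\sum_{j \in I_1 \cap I_2} r_{2j}^2}}$$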
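• Note that I1 and I2 are the sets of indices of the items that user 1 and user 2 rated,
respectively. The notation |x| indicates the absolute value of a number, the length of a
vector, the length of a geometric segment, or the cardinality of a set.
• Predictive values for missing ratings are computed from the ratings of the nearest neighbors
and their similarities, where N is the set of nearest neighbors of user 1 and ū_i is the mean
rating of user i:
$$r_{1j} = \bar{u}_1 + \frac{\sum_{i \in N} (r_{ij} - \bar{u}_i)\, \mathrm{sim}(u_1, u_i)}{\sum_{i \in N} |\mathrm{sim}(u_1, u_i)|}$$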
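The cosine measure and the prediction step described above can be sketched in Python. This is a minimal illustration, not the author's implementation. It assumes each rating vector is stored as a dictionary from item index to rating (a missing rating is simply absent), that the cosine is taken over co-rated items only, and that the prediction is normalized by the sum of absolute similarities. The names `cosine_sim`, `mean_rating`, and `predict` are hypothetical.

```python
import math

def cosine_sim(u1, u2):
    """Cosine of two rating dicts, computed over co-rated items only."""
    common = u1.keys() & u2.keys()
    if not common:
        return 0.0
    dot = sum(u1[j] * u2[j] for j in common)
    n1 = math.sqrt(sum(u1[j] ** 2 for j in common))
    n2 = math.sqrt(sum(u2[j] ** 2 for j in common))
    if n1 == 0 or n2 == 0:
        return 0.0
    return dot / (n1 * n2)

def mean_rating(u):
    """Mean of the ratings a user has actually given."""
    return sum(u.values()) / len(u)

def predict(active, neighbors, item, sim=cosine_sim):
    """Predict the active user's rating of `item` from neighbors who rated it."""
    num = 0.0
    den = 0.0
    for v in neighbors:
        if item not in v:
            continue  # this neighbor cannot contribute to the prediction
        s = sim(active, v)
        num += (v[item] - mean_rating(v)) * s
        den += abs(s)
    if den == 0:
        return mean_rating(active)  # fall back to the user's own mean
    return mean_rating(active) + num / den

# Toy data for illustration only (not taken from the slides).
active = {1: 4, 2: 2}
neighbor = {1: 4, 2: 2, 3: 5}
print(round(predict(active, [neighbor], item=3), 3))  # 4.333
```

Passing the similarity as the `sim` argument makes it possible to swap in any other measure from this deck without changing the prediction step.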
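• Pearson correlation:
$$\mathrm{Pearson}(u_1, u_2) = \frac{\sum_{j \in I_1 \cap I_2} (r_{1j} - \bar{u}_1)(r_{2j} - \bar{u}_2)}{\sqrt{\sum_{j \in I_1 \cap I_2} (r_{1j} - \bar{u}_1)^2}\;\sqrt{\sum_{j \in I_1 \cap I_2} (r_{2j} - \bar{u}_2)^2}}$$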
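• Constrained Pearson correlation (CPC), where r_m is the median of the rating scale:
$$\mathrm{CPC}(u_1, u_2) = \mathrm{CON}(u_1, u_2) = \frac{\sum_{j \in I_1 \cap I_2} (r_{1j} - r_m)(r_{2j} - r_m)}{\sqrt{\sum_{j \in I_1 \cap I_2} (r_{1j} - r_m)^2}\;\sqrt{\sum_{j \in I_1 \cap I_2} (r_{2j} - r_m)^2}}$$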
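• Weighted Pearson correlation (WPC) and sigmoid Pearson correlation (SPC), where I = I1 ∩ I2 is the set of co-rated items and H is a threshold:
$$\mathrm{WPC}(u_1, u_2) = \begin{cases} \mathrm{Pearson}(u_1, u_2) \cdot \dfrac{|I|}{H} & \text{if } |I| \le H \\ \mathrm{Pearson}(u_1, u_2) & \text{otherwise} \end{cases}$$
$$\mathrm{SPC}(u_1, u_2) = \mathrm{Pearson}(u_1, u_2) \cdot \frac{1}{1 + \exp\left(-\frac{|I|}{2}\right)}$$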
• Jaccard:
$$\mathrm{Jaccard}(u_1, u_2) = \frac{|I_1 \cap I_2|}{|I_1 \cup I_2|}$$
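The Pearson family and the Jaccard measure listed above can be sketched as follows. This is an illustration under stated assumptions, not the author's code: rating vectors are dictionaries from item index to rating, each user mean is taken over all of that user's ratings, I is the set of co-rated items, and the default `H=50` is a hypothetical threshold chosen only for the example.

```python
import math

def _mean(u):
    """Mean over all ratings given by one user (an assumption of this sketch)."""
    return sum(u.values()) / len(u)

def pearson(u1, u2):
    """Pearson correlation over co-rated items."""
    common = u1.keys() & u2.keys()
    if not common:
        return 0.0
    m1, m2 = _mean(u1), _mean(u2)
    num = sum((u1[j] - m1) * (u2[j] - m2) for j in common)
    d1 = math.sqrt(sum((u1[j] - m1) ** 2 for j in common))
    d2 = math.sqrt(sum((u2[j] - m2) ** 2 for j in common))
    if d1 == 0 or d2 == 0:
        return 0.0
    return num / (d1 * d2)

def jaccard(u1, u2):
    """Number of co-rated items divided by number of items rated by either user."""
    union = u1.keys() | u2.keys()
    if not union:
        return 0.0
    return len(u1.keys() & u2.keys()) / len(union)

def wpc(u1, u2, H=50):
    """Weighted Pearson: damp the correlation when few items are co-rated."""
    n = len(u1.keys() & u2.keys())
    p = pearson(u1, u2)
    return p * n / H if n <= H else p

def spc(u1, u2):
    """Sigmoid Pearson: weight the correlation by a sigmoid of |I| / 2."""
    n = len(u1.keys() & u2.keys())
    return pearson(u1, u2) / (1.0 + math.exp(-n / 2))
```

Both WPC and SPC leave the sign of the Pearson correlation unchanged and only shrink its magnitude when the two users share few items.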
• Mean squared difference (MSD):
$$\mathrm{MSD}(u_1, u_2) = 1 - \frac{\sum_{j \in I} \left(\frac{r_{1j} - r_{2j}}{\mathrm{MAX}}\right)^2}{|I|}$$
• MSDJ(u1, u2) = MSD(u1, u2) * Jaccard(u1, u2)
• NHSM(u1, u2) = PSS(u1, u2) * URP(u1, u2) * Jaccard2(u1, u2)
• Bhattacharyya coefficient in CF (BCF):
BCF(u1, u2) = Jaccard(u1, u2) + BC(u1, u2)
• Similarity Measure for Text Processing (SMTP):
$$\mathrm{SMTP}(u_1, u_2) = \frac{F(u_1, u_2) + \lambda}{1 + \lambda}$$
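• Adjusted cosine measure (COD):
$$\mathrm{COD}(u_1, u_2) = \frac{\sum_{j \in I_1 \cap I_2} (r_{1j} - \bar{v}_j)(r_{2j} - \bar{v}_j)}{\sqrt{\sum_{j \in I_1 \cap I_2} (r_{1j} - \bar{v}_j)^2}\;\sqrt{\sum_{j \in I_1 \cap I_2} (r_{2j} - \bar{v}_j)^2}}$$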
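As a rough illustration of MSD, MSDJ, and the adjusted cosine measure, the sketch below works on rating dictionaries (item index to rating). It makes several assumptions that the slides do not spell out: I is the set of co-rated items, MAX is the maximum rating value (the hypothetical default `max_rating=5`), and the subtracted quantity in COD is the mean rating of item j, supplied by the caller in the hypothetical dictionary `item_mean`.

```python
import math

def jaccard(u1, u2):
    """Number of co-rated items divided by number of items rated by either user."""
    union = u1.keys() | u2.keys()
    if not union:
        return 0.0
    return len(u1.keys() & u2.keys()) / len(union)

def msd(u1, u2, max_rating=5):
    """One minus the mean squared, MAX-normalized rating difference."""
    common = u1.keys() & u2.keys()
    if not common:
        return 0.0
    total = sum(((u1[j] - u2[j]) / max_rating) ** 2 for j in common)
    return 1.0 - total / len(common)

def msdj(u1, u2, max_rating=5):
    """MSD multiplied by Jaccard."""
    return msd(u1, u2, max_rating) * jaccard(u1, u2)

def adjusted_cosine(u1, u2, item_mean):
    """Cosine of ratings after subtracting each item's mean rating."""
    common = u1.keys() & u2.keys()
    if not common:
        return 0.0
    num = sum((u1[j] - item_mean[j]) * (u2[j] - item_mean[j]) for j in common)
    d1 = math.sqrt(sum((u1[j] - item_mean[j]) ** 2 for j in common))
    d2 = math.sqrt(sum((u2[j] - item_mean[j]) ** 2 for j in common))
    if d1 == 0 or d2 == 0:
        return 0.0
    return num / (d1 * d2)
```

Because MSDJ multiplies by Jaccard, two users with identical ratings on a single shared item still score low when each has rated many items the other has not.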
2. Methodologies
• Let u1 = OA and u2 = OB be two rating vectors and let α be the angle formed by u1 and u2.
These two vectors form the triangle OAB.
• The cosine measure has a drawback: there may be two points such as A and B that are far from
each other according to Euclidean distance while their cosine is high.
• For example, given three vectors u1 = (1, 1), u2 = (9, 9), and u3 = (10, 10), the cosine of
every pair is 1, yet Euclidean distance still has a negative effect because u2 = (9, 9) is
obviously nearer to u3 = (10, 10) than u1 = (1, 1) is.
• The so-called triangle area (TA) measure is proposed to alleviate the negative effect of
Euclidean distance.
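The example with u1 = (1, 1), u2 = (9, 9), and u3 = (10, 10) can be checked numerically. The short sketch below uses hypothetical helper names `cosine` and `euclid` and plain tuples for the vectors. It only illustrates the drawback and is not part of the proposed measure.

```python
import math

def cosine(a, b):
    """Plain cosine of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def euclid(a, b):
    """Euclidean distance between the end points of two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

u1, u2, u3 = (1, 1), (9, 9), (10, 10)

# Both pairs have cosine 1, but the distances differ by a factor of 9.
print(round(cosine(u1, u3), 3), round(euclid(u1, u3), 3))  # 1.0 12.728
print(round(cosine(u2, u3), 3), round(euclid(u2, u3), 3))  # 1.0 1.414
```

The cosine cannot tell the two pairs apart, which is the gap the reinforced factor is meant to close.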
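• Given the first case 0 ≤ α ≤ π/2, let S be the area of the whole triangle OAB and let s be the area of the basic triangle
OAH. The reinforced factor k is defined as the ratio of areas: k = s/S. Equation 1 of k is:
$$0 \le \alpha \le \frac{\pi}{2}: \quad k = \begin{cases} \dfrac{|u_1| \cos\alpha}{|u_2|} & \text{if } |u_1| \le |u_2| \\[2ex] \dfrac{|u_2| \cos\alpha}{|u_1|} & \text{if } |u_1| > |u_2| \end{cases}$$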
• There are many pairs of points on the two rays OA = u1 and OB = u2 that have the same cosine, but only the points A’
and B’ whose distance is shortest obtain the highest reinforced factor. Shortest distance viewpoint:
the reinforced factor is optimal if the distance between two vectors is shortest.
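The ratio in equation 1 can be checked with elementary geometry. The derivation below is a sketch under two assumptions that the text does not state explicitly: H is the foot of the perpendicular dropped from A onto the line OB, and |u1| ≤ |u2|.

```latex
\begin{aligned}
S &= \tfrac{1}{2}\,|OB|\,|AH| = \tfrac{1}{2}\,|u_2|\,|u_1|\sin\alpha \\
s &= \tfrac{1}{2}\,|OH|\,|AH| = \tfrac{1}{2}\,\bigl(|u_1|\cos\alpha\bigr)\bigl(|u_1|\sin\alpha\bigr) \\
k &= \frac{s}{S} = \frac{|u_1|\cos\alpha}{|u_2|}
\end{aligned}
```

The case |u1| > |u2| follows by swapping the roles of u1 and u2.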
• Given the second case π/2 < α ≤ π, let S be the area of the whole triangle OAB and let s be the area of the
basic triangle OAH’. The reinforced factor k is defined as the ratio of areas: k = s/S. Equation 2 of k is:
$$\frac{\pi}{2} < \alpha \le \pi: \quad k = \begin{cases} \dfrac{|u_1|}{|u_2|} & \text{if } |u_1| \le |u_2| \\[2ex] \dfrac{|u_2|}{|u_1|} & \text{if } |u_1| > |u_2| \end{cases}$$
• Equal vector-length viewpoint: the reinforced factor is optimal if two vectors have equal length
within the same angle.
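Equation 2 can be verified in the same way as equation 1. The sketch below assumes that H’ lies on the ray OB with |OH’| = |OA| = |u1| (the slides note |OA| = |OH’|) and that |u1| ≤ |u2|.

```latex
\begin{aligned}
S &= \tfrac{1}{2}\,|OA|\,|OB|\sin\alpha = \tfrac{1}{2}\,|u_1|\,|u_2|\sin\alpha \\
s &= \tfrac{1}{2}\,|OA|\,|OH'|\sin\alpha = \tfrac{1}{2}\,|u_1|^2\sin\alpha \\
k &= \frac{s}{S} = \frac{|u_1|}{|u_2|}
\end{aligned}
```

Under these assumptions k no longer depends on α, so it reaches its maximum of 1 exactly when the two vectors have equal length.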
• The equal vector-length viewpoint is related to the shortest distance viewpoint. Let A’
and H’’ be the projections of A and H’ on the lines OB and OA, respectively. Then |AA’| is the
distance from A to the line OB and |H’H’’| is the distance from H’ to the line OA. Because
|OA| = |OH’|, we have |AA’| = |H’H’’|.
• The condition of equal vector length only assures that the two distances from each point to
its dual line are equal (|AA’| = |H’H’’|), but |AH’| is not the shortest distance because
|AA’| = |H’H’’| < |AH’|. So the condition of equal vector length (equation 2) is weaker than
the condition of shortest distance (equation 1).
• Suppose A1 is an approximate projection of A on the ray OB, so that A1 is near H
(|OA1| ≈ |OH| = |u1|cos(α)) according to the shortest distance viewpoint, and we have |OH’|
= |u1| according to the equal vector-length viewpoint.
• When A1 is the optimal point of equation 1 in practice and H’ is the optimal point of
equation 2, equation 1 is better than equation 2 because A1 lies between H and H’, that is,
|OH| < |OA1| < |OH’|. The best A1 occurs if and only if H ≡ A1 ≡ H’, when A ≡ B and the
triangle OAB degenerates into the segment OA (cos(α)=1 and |AB|=0). So equation 1 balances
the lengths of the vectors and the distance between the vectors (points).
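• The TA measure is defined as the product of the cosine value and the reinforced factor: TA(u1, u2) =
k*cos(u1, u2). Let ⋅ denote the dot product and let |x| denote the length of a vector. The full equation
of TA is:
$$\mathrm{TA}(u_1, u_2) = \begin{cases} \dfrac{(u_1 \cdot u_2)^2}{|u_1|\,|u_2|^3} & \text{if } u_1 \cdot u_2 \ge 0 \text{ and } |u_1| \le |u_2| \\[2ex] \dfrac{(u_1 \cdot u_2)^2}{|u_1|^3\,|u_2|} & \text{if } u_1 \cdot u_2 \ge 0 \text{ and } |u_1| > |u_2| \\[2ex] \dfrac{u_1 \cdot u_2}{|u_2|^2} & \text{if } u_1 \cdot u_2 < 0 \text{ and } |u_1| \le |u_2| \\[2ex] \dfrac{u_1 \cdot u_2}{|u_1|^2} & \text{if } u_1 \cdot u_2 < 0 \text{ and } |u_1| > |u_2| \end{cases}$$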
• TAJ is the combined measure which combines TA and Jaccard as follows:
TAJ(u1, u2) = TA(u1, u2) * Jaccard(u1, u2)
• TAN is the normalized version of TA, and TANJ is the combination of TAN and Jaccard.
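The four cases of the TA equation collapse into two once the shorter and longer vector lengths are named, which the following sketch exploits. It is an illustration rather than the author's implementation. `ta` takes two equal-length numeric sequences, while `taj` takes rating dictionaries (item index to rating) and, as an assumption of this sketch, evaluates TA on the co-rated items only. TAN is not implemented because its normalization is not given in the slides.

```python
import math

def ta(u1, u2):
    """Triangle area (TA) measure: reinforced factor k times cosine."""
    dot = sum(a * b for a, b in zip(u1, u2))
    n1 = math.sqrt(sum(a * a for a in u1))
    n2 = math.sqrt(sum(b * b for b in u2))
    if n1 == 0 or n2 == 0:
        return 0.0
    short, long_ = (n1, n2) if n1 <= n2 else (n2, n1)
    if dot >= 0:
        # k = short * cos(alpha) / long, so TA = dot^2 / (short * long^3)
        return dot ** 2 / (short * long_ ** 3)
    # k = short / long, so TA = dot / long^2
    return dot / long_ ** 2

def taj(r1, r2):
    """TA on co-rated items multiplied by Jaccard of the rated item sets."""
    common = sorted(r1.keys() & r2.keys())
    if not common:
        return 0.0
    v1 = [r1[j] for j in common]
    v2 = [r2[j] for j in common]
    jac = len(common) / len(r1.keys() | r2.keys())
    return ta(v1, v2) * jac

# Both pairs have cosine 1, but TA separates them.
print(round(ta((1, 1), (10, 10)), 3))  # 0.1
print(round(ta((9, 9), (10, 10)), 3))  # 0.9
```

For collinear vectors the cosine is 1, so TA reduces to the ratio of the shorter length to the longer one. The nearer pair (9, 9) and (10, 10) therefore scores 0.9, while (1, 1) and (10, 10) scores 0.1.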
3. Results and discussions
• The TA family (TA, TAJ, TAN, TANJ) is tested against the traditional cosine family (cosine,
COJ, CON, COD) and other well-known measures such as the Pearson family (Pearson,
WPC, SPC), Jaccard, the MSD family (MSD, MSDJ), NHSM, BCF, and SMTP.
• Two metrics are used to test the measures: MAE (mean absolute error) and CC
(correlation coefficient). The smaller the MAE, the better the measure. The larger the
CC, the better the measure.
• All measures are tested with the NN algorithm. The NN algorithm applied to the user-based
rating matrix is the user-based NN algorithm, and the NN algorithm applied to the item-based
rating matrix is the item-based NN algorithm.
• The MovieLens dataset [8] is used for evaluation. It has 100,000 ratings from 943
users on 1682 movies (items). It is divided into 5 folds, and each fold includes a
training set and a testing set. The training set and testing set in the same fold are
disjoint. The ratio of the testing set to the whole dataset depends on the testing parameter r
= 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9.
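The two evaluation metrics can be sketched as below. The slides do not give their formulas, so this sketch assumes that MAE is the mean absolute difference between predicted and actual ratings and that CC is the Pearson correlation between them. The function names `mae` and `cc` are hypothetical.

```python
import math

def mae(predicted, actual):
    """Mean absolute error between predicted and actual ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)

def cc(predicted, actual):
    """Pearson correlation between predicted and actual ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(predicted)
    mp = sum(predicted) / n
    ma = sum(actual) / n
    cov = sum((p - mp) * (a - ma) for p, a in zip(predicted, actual))
    sp = math.sqrt(sum((p - mp) ** 2 for p in predicted))
    sa = math.sqrt(sum((a - ma) ** 2 for a in actual))
    if sp == 0 or sa == 0:
        return 0.0  # correlation is undefined for constant input
    return cov / (sp * sa)
```

A lower `mae` and a higher `cc` both indicate that predictions track the held-out ratings more closely, matching the reading of the two metrics given above.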
• With r=0.1, TANJ is the best with the lowest item-based MAE and the highest item-based CC,
whereas WPC is the best with the lowest user-based MAE and the highest user-based CC.
• In general, the top-5 measures according to item-based MAE are TANJ (1), NHSM (2), TAJ (3),
MSDJ (4), and Jaccard (5). However, the top-5 measures according to item-based CC are TANJ,
NHSM, WPC, TAJ, and SPC. The top-5 measures according to user-based MAE and user-based CC are
WPC, TANJ, SPC, Pearson, and NHSM.
• In every ranking, TANJ is among the top-5 measures, and the TA family is better than the
cosine family.
• With r=0.5, NHSM is the best with the lowest item-based MAE, the highest item-based CC, and
the lowest user-based MAE. COJ is the best with the highest user-based CC.
• In general, the top-5 measures according to item-based MAE are NHSM (1), TANJ (2), TAJ (3),
MSDJ (4), and Jaccard (5). However, the top-5 measures according to item-based CC are NHSM,
TAJ, TANJ, MSDJ, and Jaccard. The top-5 measures according to user-based MAE are NHSM, TANJ,
TAJ, cosine, and MSDJ. The top-5 measures according to user-based CC are COJ, TAJ, NHSM, MSDJ,
and TA.
• In every ranking, the TA family is represented among the top-5 measures, and it is better
than the cosine family.
• With r=0.9, the results are unexpected because r=0.9 makes the NN algorithm noisy: the
training set is not large enough. Cosine is the best with the lowest MAE and the highest CC.
• In general, the top-5 measures according to item-based MAE are cosine (1), Pearson (2),
SPC (3), MSD (4), and TA (5). However, the top-5 measures according to item-based CC are
cosine, BCF, Pearson, SPC, and MSD. The top-5 measures according to user-based MAE are cosine,
MSD, Pearson, SMTP, and TA. The top-5 measures according to user-based CC are cosine, Pearson,
SPC, MSD, and SMTP.
• Although TA is among the top-5 measures by MAE, it is worse than pure cosine, and hence the
TA family is not better than the cosine family in this case.
• With the two parameter values r=0.1 and r=0.5, the TA family is better than the cosine
family, and NHSM is the best measure when item-based MAE is selected as the main reference
metric. Item-based MAE is selected because the item-based NN algorithm is always better
than the user-based NN algorithm.
• Three values of r (0.1, 0.5, and 0.9) are enough to survey all measures because they are
key values: the minimum value r=0.1 implies a large real-time database, the medium value
r=0.5 implies a testing database, and the maximum value r=0.9 implies an unexpectedly small
database.
• However, it is not yet possible to conclude which measures are the best in general.
So all measures need to be tested with all values of r (0.1, 0.2, 0.3, 0.4,
0.5, 0.6, 0.7, 0.8, 0.9), and then the average item-based MAE of each measure is
calculated in order to determine the best measures.
Item-based MAE of all measures over all values of r
• Over all r values, the general top-5 measures are TANJ, NHSM, TAJ, MSDJ, and Jaccard, whose average item-
based MAE values are 0.7537, 0.7538, 0.7544, 0.7555, and 0.7584, respectively, with TANJ ranking first,
narrowly ahead of NHSM.
• Because the TA measure itself is not combined with Jaccard, a comparison of pure TA and pure cosine is
necessary. On average, TA (average MAE = 0.7626) is worse than cosine (average MAE = 0.7621). However,
for most values of r (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, and 0.8), TA is better than cosine.
• Only with r=0.9 is TA (MAE=0.8399) unexpectedly worse than cosine (MAE=0.8177). The reason is that
r=0.9 implies a training set too small to train the NN algorithm, so the strong point of TA, which
alleviates the negative effect of Euclidean distance, is undermined by the lack of information in the small training set.
4. Conclusions
• The TA family is better than the traditional cosine family in most cases (r = 0.1,
0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8), although the pure TA measure is not as preeminent as the
TANJ measure. Moreover, the general top-5 measures are TANJ, NHSM, TAJ, MSDJ, and Jaccard,
and the TA variants TANJ and TAJ are in this top-5 list.
• TA is more suitable for real applications than cosine because values of r smaller than 0.5
indicate large rating databases in real applications. Moreover, the formula of TA is still simple.
• Jaccard itself is not a preeminent measure, but it is an important factor for improving any
measure. In fact, good measures such as TANJ, NHSM, TAJ, and MSDJ are combined with
Jaccard. The reason is that numerical measures other than Jaccard are calculated from the
rating values of common items that both users rated, and hence they ignore items rated by
only one user, whereas Jaccard reflects the accuracy of these measures because it is the
ratio of the number of common items to the number of all items. As a result, putting Jaccard
into another measure adjusts the accuracy of that measure.
• In future work, we will try to improve TA so that it follows the idea of the Jaccard
measure instead of combining TA and Jaccard as usual.
References
1. R. D. Torres Júnior, "Combining Collaborative and Content-based Filtering to Recommend Research
Paper," Universidade Federal do Rio Grande do Sul, Porto Alegre, 2004.
2. M.-P. T. Do, D. V. Nguyen and L. Nguyen, "Model-based Approach for Collaborative Filtering," in
Proceedings of The 6th International Conference on Information Technology for Education
(IT@EDU2010), Ho Chi Minh, 2010.
3. B. Sarwar, G. Karypis, J. Konstan and J. Riedl, "Item-based Collaborative Filtering Recommendation
Algorithms," in Proceedings of the 10th international conference on World Wide Web, Hong Kong, 2001.
4. H. Liu, Z. Hu, A. Mian, H. Tian and X. Zhu, "A new user similarity model to improve the accuracy of
collaborative filtering," Knowledge-Based Systems, vol. 56, no. 2014, pp. 156-166, 20 November 2013.
5. B. K. Patra, R. Launonen, V. Ollikainen and S. Nandi, "A new similarity measure using Bhattacharyya
coefficient for collaborative filtering in sparse data," Knowledge-Based Systems, vol. 82, pp. 163-177, 1
March 2015.
6. Y.-S. Lin, J.-Y. Jiang and S.-J. Lee, "A Similarity Measure for Text Classification and Clustering," IEEE
Transactions on Knowledge and Data Engineering, vol. 26, no. 7, pp. 1575-1590, 25 January 2013.
7. J. L. Herlocker, J. A. Konstan, L. G. Terveen and J. T. Riedl, "Evaluating Collaborative Filtering
Recommender Systems," ACM Transactions on Information Systems (TOIS), vol. 22, no. 1, pp. 5-53,
2004.
8. GroupLens, "MovieLens datasets," GroupLens Research Project, University of Minnesota, USA, 22 April
1998. [Online]. Available: http://grouplens.org/datasets/movielens. [Accessed 3 August 2012].
7/12/2020 TA measure - Loc Nguyen 23

More Related Content

What's hot

Trust Region Algorithm - Bachelor Dissertation
Trust Region Algorithm - Bachelor DissertationTrust Region Algorithm - Bachelor Dissertation
Trust Region Algorithm - Bachelor DissertationChristian Adom
 
Time series analysis of collaborative activities-CRIWG2012
Time series analysis of collaborative activities-CRIWG2012Time series analysis of collaborative activities-CRIWG2012
Time series analysis of collaborative activities-CRIWG2012Irene-Angelica Chounta
 
A COMPARATIVE STUDY ON DISTANCE MEASURING APPROACHES FOR CLUSTERING
A COMPARATIVE STUDY ON DISTANCE MEASURING APPROACHES FOR CLUSTERINGA COMPARATIVE STUDY ON DISTANCE MEASURING APPROACHES FOR CLUSTERING
A COMPARATIVE STUDY ON DISTANCE MEASURING APPROACHES FOR CLUSTERINGIJORCS
 
Free Ebooks Download
Free Ebooks Download Free Ebooks Download
Free Ebooks Download Edhole.com
 
Numerical Study of Some Iterative Methods for Solving Nonlinear Equations
Numerical Study of Some Iterative Methods for Solving Nonlinear EquationsNumerical Study of Some Iterative Methods for Solving Nonlinear Equations
Numerical Study of Some Iterative Methods for Solving Nonlinear Equationsinventionjournals
 
article_imen_ridha_2016_version_finale
article_imen_ridha_2016_version_finalearticle_imen_ridha_2016_version_finale
article_imen_ridha_2016_version_finaleMdimagh Ridha
 
Effects of Weight Approximation Methods on Performance of Digital Beamforming...
Effects of Weight Approximation Methods on Performance of Digital Beamforming...Effects of Weight Approximation Methods on Performance of Digital Beamforming...
Effects of Weight Approximation Methods on Performance of Digital Beamforming...IOSR Journals
 
Imagethresholding
ImagethresholdingImagethresholding
Imagethresholdingananta200
 
A New Approach for Speech Enhancement Based On Eigenvalue Spectral Subtraction
A New Approach for Speech Enhancement Based On Eigenvalue Spectral SubtractionA New Approach for Speech Enhancement Based On Eigenvalue Spectral Subtraction
A New Approach for Speech Enhancement Based On Eigenvalue Spectral SubtractionCSCJournals
 
Seminar Report (Final)
Seminar Report (Final)Seminar Report (Final)
Seminar Report (Final)Aruneel Das
 
Strehl Ratio with Higher-Order Parabolic Filter
Strehl Ratio with Higher-Order Parabolic FilterStrehl Ratio with Higher-Order Parabolic Filter
Strehl Ratio with Higher-Order Parabolic FilterIJMER
 
Numerical methods for 2 d heat transfer
Numerical methods for 2 d heat transferNumerical methods for 2 d heat transfer
Numerical methods for 2 d heat transferArun Sarasan
 
Varaiational formulation fem
Varaiational formulation fem Varaiational formulation fem
Varaiational formulation fem Tushar Aneyrao
 

What's hot (18)

Trust Region Algorithm - Bachelor Dissertation
Trust Region Algorithm - Bachelor DissertationTrust Region Algorithm - Bachelor Dissertation
Trust Region Algorithm - Bachelor Dissertation
 
Jmessp13420104
Jmessp13420104Jmessp13420104
Jmessp13420104
 
DIMENSIONAL ANALYSIS (Lecture notes 08)
DIMENSIONAL ANALYSIS (Lecture notes 08)DIMENSIONAL ANALYSIS (Lecture notes 08)
DIMENSIONAL ANALYSIS (Lecture notes 08)
 
Jmestn42351212
Jmestn42351212Jmestn42351212
Jmestn42351212
 
Time series analysis of collaborative activities-CRIWG2012
Time series analysis of collaborative activities-CRIWG2012Time series analysis of collaborative activities-CRIWG2012
Time series analysis of collaborative activities-CRIWG2012
 
A COMPARATIVE STUDY ON DISTANCE MEASURING APPROACHES FOR CLUSTERING
A COMPARATIVE STUDY ON DISTANCE MEASURING APPROACHES FOR CLUSTERINGA COMPARATIVE STUDY ON DISTANCE MEASURING APPROACHES FOR CLUSTERING
A COMPARATIVE STUDY ON DISTANCE MEASURING APPROACHES FOR CLUSTERING
 
Free Ebooks Download
Free Ebooks Download Free Ebooks Download
Free Ebooks Download
 
Numerical Study of Some Iterative Methods for Solving Nonlinear Equations
Numerical Study of Some Iterative Methods for Solving Nonlinear EquationsNumerical Study of Some Iterative Methods for Solving Nonlinear Equations
Numerical Study of Some Iterative Methods for Solving Nonlinear Equations
 
article_imen_ridha_2016_version_finale
article_imen_ridha_2016_version_finalearticle_imen_ridha_2016_version_finale
article_imen_ridha_2016_version_finale
 
Paper 6 (azam zaka)
Paper 6 (azam zaka)Paper 6 (azam zaka)
Paper 6 (azam zaka)
 
Effects of Weight Approximation Methods on Performance of Digital Beamforming...
Effects of Weight Approximation Methods on Performance of Digital Beamforming...Effects of Weight Approximation Methods on Performance of Digital Beamforming...
Effects of Weight Approximation Methods on Performance of Digital Beamforming...
 
Finite Element Methods
Finite Element  MethodsFinite Element  Methods
Finite Element Methods
 
Imagethresholding
ImagethresholdingImagethresholding
Imagethresholding
 
A New Approach for Speech Enhancement Based On Eigenvalue Spectral Subtraction
A New Approach for Speech Enhancement Based On Eigenvalue Spectral SubtractionA New Approach for Speech Enhancement Based On Eigenvalue Spectral Subtraction
A New Approach for Speech Enhancement Based On Eigenvalue Spectral Subtraction
 
Seminar Report (Final)
Seminar Report (Final)Seminar Report (Final)
Seminar Report (Final)
 
Strehl Ratio with Higher-Order Parabolic Filter
Strehl Ratio with Higher-Order Parabolic FilterStrehl Ratio with Higher-Order Parabolic Filter
Strehl Ratio with Higher-Order Parabolic Filter
 
Numerical methods for 2 d heat transfer
Numerical methods for 2 d heat transferNumerical methods for 2 d heat transfer
Numerical methods for 2 d heat transfer
 
Varaiational formulation fem
Varaiational formulation fem Varaiational formulation fem
Varaiational formulation fem
 

Similar to Advanced cosine measures for collaborative filtering

HIGHLY ACCURATE LOG SKEW NORMAL APPROXIMATION TO THE SUM OF CORRELATED LOGNOR...
HIGHLY ACCURATE LOG SKEW NORMAL APPROXIMATION TO THE SUM OF CORRELATED LOGNOR...HIGHLY ACCURATE LOG SKEW NORMAL APPROXIMATION TO THE SUM OF CORRELATED LOGNOR...
HIGHLY ACCURATE LOG SKEW NORMAL APPROXIMATION TO THE SUM OF CORRELATED LOGNOR...cscpconf
 
Performance of Spiked Population Models for Spectrum Sensing
Performance of Spiked Population Models for Spectrum SensingPerformance of Spiked Population Models for Spectrum Sensing
Performance of Spiked Population Models for Spectrum SensingPolytechnique Montreal
 
A reduced complexity and an efficient channel
A reduced complexity and an efficient channelA reduced complexity and an efficient channel
A reduced complexity and an efficient channeleSAT Publishing House
 
Comparitive analysis of doa and beamforming algorithms for smart antenna systems
Comparitive analysis of doa and beamforming algorithms for smart antenna systemsComparitive analysis of doa and beamforming algorithms for smart antenna systems
Comparitive analysis of doa and beamforming algorithms for smart antenna systemseSAT Journals
 
IMAGE REGISTRATION USING ADVANCED TOPOLOGY PRESERVING RELAXATION LABELING
IMAGE REGISTRATION USING ADVANCED TOPOLOGY PRESERVING RELAXATION LABELING IMAGE REGISTRATION USING ADVANCED TOPOLOGY PRESERVING RELAXATION LABELING
IMAGE REGISTRATION USING ADVANCED TOPOLOGY PRESERVING RELAXATION LABELING cscpconf
 
On the approximation of the sum of lognormals by a log skew normal distribution
On the approximation of the sum of lognormals by a log skew normal distributionOn the approximation of the sum of lognormals by a log skew normal distribution
On the approximation of the sum of lognormals by a log skew normal distributionIJCNCJournal
 
Cluster Analysis
Cluster Analysis Cluster Analysis
Cluster Analysis Baivab Nag
 
Boosting CED Using Robust Orientation Estimation
Boosting CED Using Robust Orientation EstimationBoosting CED Using Robust Orientation Estimation
Boosting CED Using Robust Orientation Estimationijma
 
IRJET- Rapid Spectrum Sensing Algorithm in Cooperative Spectrum Sensing for a...
IRJET- Rapid Spectrum Sensing Algorithm in Cooperative Spectrum Sensing for a...IRJET- Rapid Spectrum Sensing Algorithm in Cooperative Spectrum Sensing for a...
IRJET- Rapid Spectrum Sensing Algorithm in Cooperative Spectrum Sensing for a...IRJET Journal
 
1648796607723_Material-8---Concept-on-Estimation-Variance.pdf
1648796607723_Material-8---Concept-on-Estimation-Variance.pdf1648796607723_Material-8---Concept-on-Estimation-Variance.pdf
1648796607723_Material-8---Concept-on-Estimation-Variance.pdfandifebby2
 
A new similarity measurement based on hellinger distance for collaborating fi...
A new similarity measurement based on hellinger distance for collaborating fi...A new similarity measurement based on hellinger distance for collaborating fi...
A new similarity measurement based on hellinger distance for collaborating fi...Prabhu Kumar
 
Introduction geostatistic for_mineral_resources
Introduction geostatistic for_mineral_resourcesIntroduction geostatistic for_mineral_resources
Introduction geostatistic for_mineral_resourcesAdi Handarbeni
 
An Approach for Power Flow Analysis of Radial Distribution Networks
An Approach for Power Flow Analysis of Radial Distribution NetworksAn Approach for Power Flow Analysis of Radial Distribution Networks
An Approach for Power Flow Analysis of Radial Distribution Networksresearchinventy
 
Performance Improvement of Vector Quantization with Bit-parallelism Hardware
Performance Improvement of Vector Quantization with Bit-parallelism HardwarePerformance Improvement of Vector Quantization with Bit-parallelism Hardware
Performance Improvement of Vector Quantization with Bit-parallelism HardwareCSCJournals
 
Performance of cognitive radio networks with maximal ratio combining over cor...
Performance of cognitive radio networks with maximal ratio combining over cor...Performance of cognitive radio networks with maximal ratio combining over cor...
Performance of cognitive radio networks with maximal ratio combining over cor...Polytechnique Montreal
 
Two Dimensional Shape and Texture Quantification - Medical Image Processing
Two Dimensional Shape and Texture Quantification - Medical Image ProcessingTwo Dimensional Shape and Texture Quantification - Medical Image Processing
Two Dimensional Shape and Texture Quantification - Medical Image ProcessingChamod Mune
 
Sensing Method for Two-Target Detection in Time-Constrained Vector Poisson Ch...
Sensing Method for Two-Target Detection in Time-Constrained Vector Poisson Ch...Sensing Method for Two-Target Detection in Time-Constrained Vector Poisson Ch...
Sensing Method for Two-Target Detection in Time-Constrained Vector Poisson Ch...sipij
 
Sensing Method for Two-Target Detection in Time-Constrained Vector Poisson Ch...
Sensing Method for Two-Target Detection in Time-Constrained Vector Poisson Ch...Sensing Method for Two-Target Detection in Time-Constrained Vector Poisson Ch...
Sensing Method for Two-Target Detection in Time-Constrained Vector Poisson Ch...sipij
 

Similar to Advanced cosine measures for collaborative filtering (20)

HIGHLY ACCURATE LOG SKEW NORMAL APPROXIMATION TO THE SUM OF CORRELATED LOGNOR...
HIGHLY ACCURATE LOG SKEW NORMAL APPROXIMATION TO THE SUM OF CORRELATED LOGNOR...HIGHLY ACCURATE LOG SKEW NORMAL APPROXIMATION TO THE SUM OF CORRELATED LOGNOR...
HIGHLY ACCURATE LOG SKEW NORMAL APPROXIMATION TO THE SUM OF CORRELATED LOGNOR...
 
11
1111
11
 
Performance of Spiked Population Models for Spectrum Sensing
Performance of Spiked Population Models for Spectrum SensingPerformance of Spiked Population Models for Spectrum Sensing
Performance of Spiked Population Models for Spectrum Sensing
 
A reduced complexity and an efficient channel
A reduced complexity and an efficient channelA reduced complexity and an efficient channel
A reduced complexity and an efficient channel
 
Comparitive analysis of doa and beamforming algorithms for smart antenna systems
Comparitive analysis of doa and beamforming algorithms for smart antenna systemsComparitive analysis of doa and beamforming algorithms for smart antenna systems
Comparitive analysis of doa and beamforming algorithms for smart antenna systems
 
IMAGE REGISTRATION USING ADVANCED TOPOLOGY PRESERVING RELAXATION LABELING
IMAGE REGISTRATION USING ADVANCED TOPOLOGY PRESERVING RELAXATION LABELING IMAGE REGISTRATION USING ADVANCED TOPOLOGY PRESERVING RELAXATION LABELING
IMAGE REGISTRATION USING ADVANCED TOPOLOGY PRESERVING RELAXATION LABELING
 
On the approximation of the sum of lognormals by a log skew normal distribution
On the approximation of the sum of lognormals by a log skew normal distributionOn the approximation of the sum of lognormals by a log skew normal distribution
On the approximation of the sum of lognormals by a log skew normal distribution
 
Cluster Analysis
Cluster Analysis Cluster Analysis
Cluster Analysis
 
Boosting CED Using Robust Orientation Estimation
Boosting CED Using Robust Orientation EstimationBoosting CED Using Robust Orientation Estimation
Boosting CED Using Robust Orientation Estimation
 
IRJET- Rapid Spectrum Sensing Algorithm in Cooperative Spectrum Sensing for a...
IRJET- Rapid Spectrum Sensing Algorithm in Cooperative Spectrum Sensing for a...IRJET- Rapid Spectrum Sensing Algorithm in Cooperative Spectrum Sensing for a...
IRJET- Rapid Spectrum Sensing Algorithm in Cooperative Spectrum Sensing for a...
 
1648796607723_Material-8---Concept-on-Estimation-Variance.pdf
1648796607723_Material-8---Concept-on-Estimation-Variance.pdf1648796607723_Material-8---Concept-on-Estimation-Variance.pdf
1648796607723_Material-8---Concept-on-Estimation-Variance.pdf
 
Advanced cosine measures for collaborative filtering

  • 1. Advanced cosine measures for collaborative filtering. Prof. Loc Nguyen, PhD, Loc Nguyen’s Academic Network, Vietnam. Email: ng_phloc@yahoo.com. Homepage: www.locnguyen.net. TA measure - Loc Nguyen, 7/12/2020
  • 2. Abstract: Cosine similarity is an important measure for comparing two vectors in much research in data mining and information retrieval. In this research, the cosine measure and its advanced variants for collaborative filtering (CF) are evaluated. The cosine measure is effective, but it has a drawback: the end points of two vectors may be far from each other according to Euclidean distance while their cosine is still high. This negative effect of Euclidean distance decreases the accuracy of cosine similarity. Therefore, a so-called triangle area (TA) measure is proposed as an improved version of the cosine measure. The TA measure uses the ratio of the basic triangle area to the whole triangle area as a reinforced factor for Euclidean distance, so that it alleviates the negative effect of Euclidean distance while keeping the simplicity and effectiveness of both the cosine measure and Euclidean distance in determining the similarity of two vectors. TA is considered an advanced cosine measure. TA and other advanced cosine measures are tested against other similarity measures. According to the experimental results, TA is not a preeminent measure, but it is better than traditional cosine measures in most cases and it is adequate for real-time applications. Moreover, its formula is simple.
  • 3. Table of contents: 1. Introduction; 2. Methodologies; 3. Results and discussions; 4. Conclusions
  • 4. 1. Introduction
    • A recommendation system recommends items to users from among the many items in a database. There are two main approaches to recommendation: content-based filtering (CBF) and collaborative filtering (CF). One of the popular algorithms in CF is the nearest neighbors (NN) algorithm. The essence of the NN algorithm [1] is to find the nearest neighbors of the active user and then to recommend to the active user items that these neighbors may like.
    • Let U = {u1, u2,…, um} be the set of users and let V = {v1, v2,…, vn} be the set of items. The user-based rating matrix is the matrix in which rows indicate users, columns indicate items, and each cell is a rating. Let u1 = (r11, r12,…, r1n) and u2 = (r21, r22,…, r2n) be the rating vectors of user 1 and user 2, in which some rij can be missing (empty).
    • Suppose the active rating vector is u4. The NN algorithm finds the nearest neighbors of u4 and then computes the predictive values for the missing ratings r43 and r44 based on the similarities between these neighbors and u4.
  • 5. 1. Introduction
    • Finding the nearest neighbors of the active user means calculating similarities between the active vector and the other vectors. Users whose similarity to the active user is larger than or equal to a threshold are the nearest neighbors of the active user.
    • Let sim(u1, u2) denote the similarity of u1 and u2. There are many similarity measures; for instance, the cosine measure is defined as follows [1]:
      $$\mathrm{sim}(u_1, u_2) = \cos(u_1, u_2) = \frac{u_1 \cdot u_2}{|u_1|\,|u_2|} = \frac{\sum_{j \in I_1 \cap I_2} r_{1j} r_{2j}}{\sqrt{\sum_{j \in I_1 \cap I_2} r_{1j}^2}\,\sqrt{\sum_{j \in I_1 \cap I_2} r_{2j}^2}}$$
    • Note that I1 and I2 are the sets of indices of items that user 1 and user 2 rated, respectively. The notation |x| indicates the absolute value of a number, the length of a vector, the length of a geometric segment, or the cardinality of a set.
    • Predictive values for missing ratings are computed from the ratings of the nearest neighbors and their similarities, where $\bar{u}_i$ is the mean rating of user i and N is the set of nearest neighbors:
      $$r_{1j} = \bar{u}_1 + \frac{\sum_{i \in N} (r_{ij} - \bar{u}_i)\,\mathrm{sim}(u_1, u_i)}{\sum_{i \in N} |\mathrm{sim}(u_1, u_i)|}$$
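The cosine measure and the prediction rule on this slide can be sketched in Python as follows. This is a minimal illustration, not the author's implementation: rating vectors are assumed to be dictionaries mapping item index to rating (missing ratings are simply absent), the user mean is assumed to be taken over all ratings of that user, the denominator is assumed to sum absolute similarities, and the function names `cosine_sim`, `mean_rating`, and `predict` are hypothetical. Neighbor selection by threshold is assumed to have happened already.

```python
import math

def cosine_sim(u1, u2):
    """Cosine over co-rated items; u1 and u2 map item index -> rating."""
    common = u1.keys() & u2.keys()
    if not common:
        return 0.0
    dot = sum(u1[j] * u2[j] for j in common)
    n1 = math.sqrt(sum(u1[j] ** 2 for j in common))
    n2 = math.sqrt(sum(u2[j] ** 2 for j in common))
    if n1 == 0 or n2 == 0:
        return 0.0
    return dot / (n1 * n2)

def mean_rating(u):
    """Mean over all ratings of one user (an assumption of this sketch)."""
    return sum(u.values()) / len(u)

def predict(active, neighbors, item, sim=cosine_sim):
    """Predict the active user's rating on `item` from already-chosen neighbors."""
    num = 0.0
    den = 0.0
    for v in neighbors:
        if item not in v:
            continue  # this neighbor cannot contribute to the item
        s = sim(active, v)
        num += (v[item] - mean_rating(v)) * s
        den += abs(s)
    base = mean_rating(active)
    return base if den == 0 else base + num / den

active = {1: 4, 2: 2}
neighbor = {1: 4, 2: 2, 3: 5}
print(predict(active, [neighbor], 3))
```

With a single neighbor whose co-rated ratings match the active user exactly, the prediction is the active user's mean plus the neighbor's deviation from its own mean on the target item.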
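  • 6. 1. Introduction
    • Pearson correlation:
      $$\mathrm{Pearson}(u_1, u_2) = \frac{\sum_{j \in I_1 \cap I_2} (r_{1j} - \bar{u}_1)(r_{2j} - \bar{u}_2)}{\sqrt{\sum_{j \in I_1 \cap I_2} (r_{1j} - \bar{u}_1)^2}\,\sqrt{\sum_{j \in I_1 \cap I_2} (r_{2j} - \bar{u}_2)^2}}$$
    • Constrained Pearson correlation (CPC):
      $$\mathrm{CPC}(u_1, u_2) = \mathrm{CON}(u_1, u_2) = \frac{\sum_{j \in I_1 \cap I_2} (r_{1j} - r_m)(r_{2j} - r_m)}{\sqrt{\sum_{j \in I_1 \cap I_2} (r_{1j} - r_m)^2}\,\sqrt{\sum_{j \in I_1 \cap I_2} (r_{2j} - r_m)^2}}$$
    • Weighted Pearson correlation (WPC) and sigmoid Pearson correlation (SPC):
      $$\mathrm{WPC}(u_1, u_2) = \begin{cases} \mathrm{Pearson}(u_1, u_2) \cdot \dfrac{|I|}{H} & \text{if } |I| \le H \\ \mathrm{Pearson}(u_1, u_2) & \text{otherwise} \end{cases}$$
      $$\mathrm{SPC}(u_1, u_2) = \mathrm{Pearson}(u_1, u_2) \cdot \frac{1}{1 + \exp\left(-\frac{|I|}{2}\right)}$$
    • Jaccard:
      $$\mathrm{Jaccard}(u_1, u_2) = \frac{|I_1 \cap I_2|}{|I_1 \cup I_2|}$$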
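The Pearson family and Jaccard from this slide can be sketched as below. The sketch rests on several assumptions that the slide does not state: rating vectors are dictionaries from item index to rating, user means are taken over all ratings of a user, I is the set of co-rated items I1 ∩ I2, and the default threshold `h=50` is a hypothetical value chosen only for illustration.

```python
import math

def pearson(u1, u2):
    """Pearson correlation over co-rated items; user means over all own ratings."""
    common = u1.keys() & u2.keys()
    if not common:
        return 0.0
    m1 = sum(u1.values()) / len(u1)
    m2 = sum(u2.values()) / len(u2)
    num = sum((u1[j] - m1) * (u2[j] - m2) for j in common)
    d1 = math.sqrt(sum((u1[j] - m1) ** 2 for j in common))
    d2 = math.sqrt(sum((u2[j] - m2) ** 2 for j in common))
    if d1 == 0 or d2 == 0:
        return 0.0
    return num / (d1 * d2)

def jaccard(u1, u2):
    """Number of co-rated items divided by number of items rated by either user."""
    union = u1.keys() | u2.keys()
    return len(u1.keys() & u2.keys()) / len(union) if union else 0.0

def wpc(u1, u2, h=50):
    """Weighted Pearson: scaled by |I|/H when at most H items are co-rated."""
    n = len(u1.keys() & u2.keys())
    p = pearson(u1, u2)
    return p * n / h if n <= h else p

def spc(u1, u2):
    """Sigmoid Pearson: damped by a sigmoid of half the co-rated count."""
    n = len(u1.keys() & u2.keys())
    return pearson(u1, u2) / (1.0 + math.exp(-n / 2.0))

a = {1: 1, 2: 2, 3: 3}
b = {1: 2, 2: 4, 3: 6}
print(pearson(a, b), jaccard(a, b), wpc(a, b, h=6), spc(a, b))
```

Both WPC and SPC leave the sign of the Pearson value unchanged and only shrink its magnitude when few items are co-rated.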
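  • 7. 1. Introduction
    • Mean squared difference (MSD):
      $$\mathrm{MSD}(u_1, u_2) = 1 - \frac{\sum_{j \in I} \left(\frac{r_{1j} - r_{2j}}{\mathrm{MAX}}\right)^2}{|I|}$$
    • MSD combined with Jaccard (MSDJ):
      $$\mathrm{MSDJ}(u_1, u_2) = \mathrm{MSD}(u_1, u_2) \cdot \mathrm{Jaccard}(u_1, u_2)$$
    • NHSM:
      $$\mathrm{NHSM}(u_1, u_2) = \mathrm{PSS}(u_1, u_2) \cdot \mathrm{URP}(u_1, u_2) \cdot \mathrm{Jaccard2}(u_1, u_2)$$
    • Bhattacharyya coefficient in CF (BCF):
      $$\mathrm{BCF}(u_1, u_2) = \mathrm{Jaccard}(u_1, u_2) + \mathrm{BC}(u_1, u_2)$$
    • Similarity Measure for Text Processing (SMTP):
      $$\mathrm{SMTP}(u_1, u_2) = \frac{F(u_1, u_2) + \lambda}{1 + \lambda}$$
    • Adjusted cosine measure (COD):
      $$\mathrm{COD}(u_1, u_2) = \frac{\sum_{j \in I_1 \cap I_2} (r_{1j} - \bar{v}_j)(r_{2j} - \bar{v}_j)}{\sqrt{\sum_{j \in I_1 \cap I_2} (r_{1j} - \bar{v}_j)^2}\,\sqrt{\sum_{j \in I_1 \cap I_2} (r_{2j} - \bar{v}_j)^2}}$$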
  • 8. 2. Methodologies
    • Let u1 = OA and u2 = OB be two rating vectors and let α be the angle formed by u1 and u2. These two vectors form the triangle OAB.
    • The cosine measure has a drawback: two points such as A and B may be far from each other according to Euclidean distance while their cosine is still high.
    • For example, given three vectors u1 = (1, 1), u2 = (9, 9), and u3 = (10, 10), the cosine of every pair is 1, yet Euclidean distance still matters because u2 = (9, 9) is obviously nearer to u3 = (10, 10) than u1 = (1, 1) is.
    • The so-called triangle area (TA) measure is proposed to alleviate this negative effect of Euclidean distance.
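The three-vector example can be checked numerically. The short sketch below uses hypothetical helper names (`norm`, `cosine`, `euclidean`) and plain tuples; it only illustrates that the cosine cannot tell the pairs apart while the Euclidean distance can.

```python
import math

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))

def cosine(a, b):
    """Cosine of the angle between two non-zero vectors."""
    return sum(x * y for x, y in zip(a, b)) / (norm(a) * norm(b))

def euclidean(a, b):
    """Euclidean distance between the end points of two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

u1, u2, u3 = (1, 1), (9, 9), (10, 10)
print(cosine(u1, u3), cosine(u2, u3))        # both equal 1 up to rounding
print(euclidean(u1, u3), euclidean(u2, u3))  # about 12.73 versus about 1.41
```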
  • 9. 2. Methodologies
    • Given the first case 0 ≤ α ≤ π/2, let S be the area of the whole triangle OAB and let s be the area of the basic triangle OAH. The reinforced factor k is defined as the areal ratio k = s/S. Equation 1 of k is:
      $$0 \le \alpha \le \frac{\pi}{2}: \quad k = \begin{cases} \dfrac{|u_1| \cos\alpha}{|u_2|} & \text{if } |u_1| \le |u_2| \\ \dfrac{|u_2| \cos\alpha}{|u_1|} & \text{if } |u_1| > |u_2| \end{cases}$$
    • There are many points on the two rays OA = u1 and OB = u2 whose cosine is the same, but only the points A’ and B’ whose distance is shortest obtain the highest reinforced factor.
    • Shortest distance viewpoint: the reinforced factor is optimal if the distance between the two vectors is shortest.
  • 10. 2. Methodologies
    • Given the second case π/2 < α ≤ π, let S be the area of the whole triangle OAB and let s be the area of the basic triangle OAH’. The reinforced factor k is defined as the areal ratio k = s/S. Equation 2 of k is:
      $$\frac{\pi}{2} < \alpha \le \pi: \quad k = \begin{cases} \dfrac{|u_1|}{|u_2|} & \text{if } |u_1| \le |u_2| \\ \dfrac{|u_2|}{|u_1|} & \text{if } |u_1| > |u_2| \end{cases}$$
    • Equal vector-length viewpoint: the reinforced factor is optimal if the two vectors have equal length within the same angle.
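Equations 1 and 2 of the reinforced factor k can be sketched in one small function. This is an illustrative sketch under stated assumptions: vectors are plain sequences of equal length, the case split uses the sign of the dot product (non-negative for 0 ≤ α ≤ π/2, negative for π/2 < α ≤ π), zero-length vectors return 0, and the name `reinforced_factor` is hypothetical.

```python
import math

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))

def reinforced_factor(u1, u2):
    """Areal ratio k = s/S following equation 1 (acute case) and equation 2 (obtuse case)."""
    n1, n2 = norm(u1), norm(u2)
    if n1 == 0 or n2 == 0:
        return 0.0  # assumption of this sketch: k is 0 for a zero vector
    dot = sum(x * y for x, y in zip(u1, u2))
    ratio = min(n1, n2) / max(n1, n2)  # shorter length over longer length
    if dot >= 0:
        return ratio * dot / (n1 * n2)  # equation 1: length ratio times cos(alpha)
    return ratio                        # equation 2: length ratio only

print(reinforced_factor((1, 1), (10, 10)))   # about 0.1
print(reinforced_factor((9, 9), (10, 10)))   # about 0.9
```

For the earlier example the factor separates the two pairs that the cosine alone treats as identical: the nearer pair receives the larger k.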
  • 11. 2. Methodologies
    • The equal vector-length viewpoint is related to the shortest distance viewpoint. Let A’ and H’’ be the projections of A and H’ on the lines OB and OA, respectively. So |AA’| is the distance from A to the line OB and |H’H’’| is the distance from H’ to the line OA, noting that |OA| = |OH’|. We have |AA’| = |H’H’’|.
    • The condition of equal vector length only assures that the two distances from each point to its dual line are equal (|AA’| = |H’H’’|), but |AH’| is not the shortest distance because |AA’| = |H’H’’| < |AH’|. So the condition of equal vector length (equation 2) is weaker than the condition of shortest distance (equation 1).
  • 12. 2. Methodologies
    • Suppose A1 is an approximate projection of A on the ray OB, so that A1 is near H (|OA1| ≈ |OH| = |u1|cos(α)) according to the shortest distance viewpoint, while |OH’| = |u1| according to the equal vector-length viewpoint.
    • When A1 is the optimal point of equation 1 in practice and H’ is the optimal point of equation 2, equation 1 is better than equation 2 because A1 lies between H and H’ and |OH| < |OA1| < |OH’|. The best A1 occurs if and only if H ≡ A1 ≡ H’, which happens when A ≡ B and the triangle OAB degenerates into the segment OA (cos(α) = 1 and |AB| = 0). So equation 1 balances the lengths of the vectors and the distance between the vectors (points).
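  • 13. 2. Methodologies
    • The TA measure is defined as the product of the cosine value and the reinforced factor: TA(u1, u2) = k*cos(u1, u2). Let ⋅ denote the dot product and let |x| denote the length of a vector. The full equation of TA is:
      $$u_1 \cdot u_2 \ge 0: \quad \mathrm{TA}(u_1, u_2) = \begin{cases} \dfrac{(u_1 \cdot u_2)^2}{|u_1|\,|u_2|^3} & \text{if } |u_1| \le |u_2| \\ \dfrac{(u_1 \cdot u_2)^2}{|u_1|^3\,|u_2|} & \text{if } |u_1| > |u_2| \end{cases}$$
      $$u_1 \cdot u_2 < 0: \quad \mathrm{TA}(u_1, u_2) = \begin{cases} \dfrac{u_1 \cdot u_2}{|u_2|^2} & \text{if } |u_1| \le |u_2| \\ \dfrac{u_1 \cdot u_2}{|u_1|^2} & \text{if } |u_1| > |u_2| \end{cases}$$
    • TAJ is the combined measure which combines TA and Jaccard as follows:
      $$\mathrm{TAJ}(u_1, u_2) = \mathrm{TA}(u_1, u_2) \cdot \mathrm{Jaccard}(u_1, u_2)$$
    • TAN is the normalized version of TA, and TANJ is the combination of TAN and Jaccard.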
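The full TA equation and the TAJ combination can be sketched as follows. This is a hedged illustration, not the author's code: `ta` works on plain sequences of equal length, while `taj` assumes dictionaries from item index to rating and assumes that TA is evaluated on the co-rated items only, as the cosine measure is. TAN and TANJ are left out because the slide does not give the normalization. All function names are hypothetical.

```python
import math

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))

def ta(u1, u2):
    """Triangle area measure: cosine multiplied by the reinforced factor k."""
    n1, n2 = norm(u1), norm(u2)
    if n1 == 0 or n2 == 0:
        return 0.0  # assumption of this sketch for zero vectors
    dot = sum(x * y for x, y in zip(u1, u2))
    small, large = min(n1, n2), max(n1, n2)
    if dot >= 0:
        return dot ** 2 / (small * large ** 3)  # acute case of the TA equation
    return dot / large ** 2                     # obtuse case of the TA equation

def taj(r1, r2):
    """TA on co-rated items multiplied by Jaccard; r1 and r2 map item -> rating."""
    common = sorted(r1.keys() & r2.keys())
    union = r1.keys() | r2.keys()
    if not common:
        return 0.0
    jaccard = len(common) / len(union)
    return ta([r1[j] for j in common], [r2[j] for j in common]) * jaccard

print(ta((1, 1), (10, 10)), ta((9, 9), (10, 10)))  # about 0.1 and about 0.9
print(taj({1: 1, 2: 1, 3: 5}, {1: 10, 2: 10}))     # about 0.1 * 2/3
```

Using the shorter and longer lengths directly covers both branches of each case, because the denominator always carries the higher power on the longer vector.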
  • 14. 3. Results and discussions
    • The TA family (TA, TAJ, TAN, TANJ) is tested against the traditional cosine family (cosine, COJ, CON, COD) and other well-known measures such as the Pearson family (Pearson, WPC, SPC), Jaccard, the MSD family (MSD, MSDJ), NHSM, BCF, and SMTP.
    • Two metrics are used to test the measures: MAE (mean absolute error) and CC (correlation coefficient). The smaller MAE is, the better the measure. The larger CC is, the better the measure.
    • All measures are tested with the NN algorithm. The NN algorithm applied to the user-based rating matrix is the user-based NN algorithm, and the NN algorithm applied to the item-based rating matrix is the item-based NN algorithm.
    • The Movielens dataset [8] is used for evaluation; it has 100,000 ratings from 943 users on 1682 movies (items). It is divided into 5 folders, and each folder includes a training set and a testing set. The training set and the testing set in the same folder are disjoint. The ratio of the testing set to the whole dataset is given by the testing parameter r = 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9.
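The two evaluation metrics can be sketched as below. MAE follows directly from its name; for CC the slide only says "correlation coefficient", so this sketch assumes the Pearson correlation between predicted and true ratings on the testing set. The function names `mae` and `cc` and the sample numbers are hypothetical.

```python
import math

def mae(predicted, actual):
    """Mean absolute error between predicted and true ratings (smaller is better)."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def cc(predicted, actual):
    """Pearson correlation between predicted and true ratings (larger is better)."""
    n = len(actual)
    mp = sum(predicted) / n
    ma = sum(actual) / n
    cov = sum((p - mp) * (a - ma) for p, a in zip(predicted, actual))
    sp = math.sqrt(sum((p - mp) ** 2 for p in predicted))
    sa = math.sqrt(sum((a - ma) ** 2 for a in actual))
    if sp == 0 or sa == 0:
        return 0.0  # correlation is undefined for constant sequences
    return cov / (sp * sa)

print(mae([4, 3, 5], [5, 3, 3]))  # absolute errors 1, 0, 2 average to 1.0
print(cc([1, 2, 3], [2, 4, 6]))
```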
  • 15. 3. Results and discussions
    • With r=0.1, TANJ is the best measure by item-based results, with the lowest item-based MAE and the highest item-based CC, whereas WPC is the best by user-based results, with the lowest user-based MAE and the highest user-based CC.
    • In general, the top-5 measures according to item-based MAE are TANJ (1), NHSM (2), TAJ (3), MSDJ (4), and Jaccard (5). However, the top-5 measures according to item-based CC are TANJ, NHSM, WPC, TAJ, and SPC. The top-5 measures according to user-based MAE and user-based CC are WPC, TANJ, SPC, Pearson, and NHSM.
    • In all of these rankings TANJ is in the top 5, and the TA family is better than the cosine family.
  • 16. 3. Results and discussions
    • With r=0.5, NHSM is the best measure, with the lowest item-based MAE, the highest item-based CC, and the lowest user-based MAE. COJ is the best with the highest user-based CC.
    • In general, the top-5 measures according to item-based MAE are NHSM (1), TANJ (2), TAJ (3), MSDJ (4), and Jaccard (5). However, the top-5 measures according to item-based CC are NHSM, TAJ, TANJ, MSDJ, and Jaccard. The top-5 measures according to user-based MAE are NHSM, TANJ, TAJ, cosine, and MSDJ. The top-5 measures according to user-based CC are COJ, TAJ, NHSM, MSDJ, and TA.
    • In all of these rankings the TA family is in the top 5, and it is better than the cosine family.
  • 17. 3. Results and discussions
    • With r=0.9, the results are unexpected because r=0.9 introduces noise into the NN algorithm: the training set is not large enough. Cosine is the best measure, with the lowest MAE and the highest CC.
    • In general, the top-5 measures according to item-based MAE are cosine (1), Pearson (2), SPC (3), MSD (4), and TA (5). However, the top-5 measures according to item-based CC are cosine, BCF, Pearson, SPC, and MSD. The top-5 measures according to user-based MAE are cosine, MSD, Pearson, SMTP, and TA. The top-5 measures according to user-based CC are cosine, Pearson, SPC, MSD, and SMTP.
    • Although TA is in the top 5 by MAE, it is worse than pure cosine; hence, at r=0.9 the TA family is not better than the cosine family.
  • 18. 3. Results and discussions
    • With the two parameter values r=0.1 and r=0.5, the TA family is better than the cosine family. NHSM is the best measure when item-based MAE is selected as the main reference metric, which is chosen because the item-based NN algorithm is always better than the user-based NN algorithm.
    • The three values r = 0.1, 0.5, and 0.9 are enough to survey all measures because they are key values. The minimum value r=0.1 implies a large real-time database, the medium value r=0.5 implies a testing database, and the maximum value r=0.9 implies an unexpectedly small database.
    • However, it is not yet possible to conclude which measures are the best in general. So all measures need to be tested with all values of r (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9), and the average item-based MAE of each measure is then calculated in order to determine the best measures.
  • 19. 3. Results and discussions • [Figure: Item-based MAE of all measures over all values of r]
  • 20. 3. Results and discussions • Over all r values, the general top-5 measures are TANJ, NHSM, TAJ, MSDJ, and Jaccard, whose average item-based MAE values are 0.7537, 0.7538, 0.7544, 0.7555, and 0.7584, respectively, with TANJ ranked first. • Because the pure TA measure is not combined with Jaccard, it must be compared with pure cosine. On average, TA (average MAE = 0.7626) is worse than cosine (average MAE = 0.7621). However, for most values of r (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, and 0.8), TA is better than cosine. • Only with r=0.9 is TA (MAE=0.8399) unexpectedly worse than cosine (MAE=0.8177). The reason is that r=0.9 leaves a training set too small to train the NN algorithm, so the strong point of TA, alleviating the negative effect of Euclidean distance, is undermined by the lack of information in the small training set.
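The effect of the single r=0.9 result on the overall averages can be checked with a short calculation. The sketch below assumes the reported averages are plain arithmetic means over the nine values of r, and it uses only the four figures quoted on this slide; the helper name `average_without_last` is hypothetical. Because the reported averages are rounded to four decimals, the recovered values are approximate.

```python
N_R_VALUES = 9  # r = 0.1, 0.2, ..., 0.9


def average_without_last(avg_all, value_at_last, n=N_R_VALUES):
    """Recover the mean over the first n-1 values from the overall mean
    and the value at the last r (assumes a plain arithmetic mean)."""
    return (avg_all * n - value_at_last) / (n - 1)


# Figures quoted on the slide: average MAE over all r, and MAE at r=0.9.
ta_rest = average_without_last(0.7626, 0.8399)
cosine_rest = average_without_last(0.7621, 0.8177)
```

Under that assumption, the average over r = 0.1 to 0.8 is roughly 0.753 for TA and 0.755 for cosine, so TA's slightly worse overall average is explained by the r=0.9 case alone.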
  • 21. 4. Conclusions • The TA family is better than the traditional cosine family in most cases (r = 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8), although the pure TA measure is not as preeminent as the TANJ measure. Moreover, the general top-5 measures are TANJ, NHSM, TAJ, MSDJ, and Jaccard, so the TA variants TANJ and TAJ are both in the top-5 list. • TA is more suitable for real applications than cosine because values of r smaller than 0.5 indicate large rating databases, as found in real applications. Moreover, the formula of TA is still simple. • Jaccard is not itself a preeminent measure, but it is an important factor for improving any measure. In fact, good measures such as TANJ, NHSM, TAJ, and MSDJ all combine with Jaccard. The reason is that numerical measures other than Jaccard are calculated from the rating values of common items, which both users have rated, and hence they ignore items that only one of the users has rated. Jaccard, in contrast, reflects how reliable these measures are because it is the ratio of the number of common items to the number of all items. As a result, putting Jaccard into another measure adjusts the accuracy of that measure. • In future work, we will try to improve TA so that it follows the idea of the Jaccard measure itself, instead of combining TA and Jaccard as usual.
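The role of Jaccard described in the conclusions can be illustrated with a minimal sketch. It assumes each user's ratings are stored as a dictionary from item to rating, and it assumes the combination is a simple product of cosine and Jaccard; the exact combination used for COJ, TAJ, and TANJ is not restated on this slide, so the function names and the product form are illustrative assumptions.

```python
import math


def jaccard(ratings_u, ratings_v):
    """Ratio of the number of co-rated items to the number of all items
    rated by either user."""
    items_u, items_v = set(ratings_u), set(ratings_v)
    union = items_u | items_v
    if not union:
        return 0.0
    return len(items_u & items_v) / len(union)


def cosine_on_common(ratings_u, ratings_v):
    """Cosine similarity computed only over items rated by both users."""
    common = set(ratings_u) & set(ratings_v)
    if not common:
        return 0.0
    dot = sum(ratings_u[i] * ratings_v[i] for i in common)
    norm_u = math.sqrt(sum(ratings_u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(ratings_v[i] ** 2 for i in common))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def cosine_times_jaccard(ratings_u, ratings_v):
    """Assumed combination: cosine on common items scaled by Jaccard."""
    return cosine_on_common(ratings_u, ratings_v) * jaccard(ratings_u, ratings_v)


# Two users agree perfectly on their 2 common items, but user v rated 2 more.
u = {"a": 5, "b": 3}
v = {"a": 5, "b": 3, "c": 1, "d": 2}
plain = cosine_on_common(u, v)
adjusted = cosine_times_jaccard(u, v)
```

Here the plain cosine is about 1 because it sees only the two common items, while the Jaccard factor of 2/4 halves the combined score to reflect the items that only one user rated.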
  • 22. Thank you for your attention
  • 23. References 1. R. D. Torres Júnior, "Combining Collaborative and Content-based Filtering to Recommend Research Paper," Universidade Federal do Rio Grande do Sul, Porto Alegre, 2004. 2. M.-P. T. Do, D. V. Nguyen and L. Nguyen, "Model-based Approach for Collaborative Filtering," in Proceedings of The 6th International Conference on Information Technology for Education (IT@EDU2010), Ho Chi Minh, 2010. 3. B. Sarwar, G. Karypis, J. Konstan and J. Riedl, "Item-based Collaborative Filtering Recommendation Algorithms," in Proceedings of the 10th international conference on World Wide Web, Hong Kong, 2001. 4. H. Liu, Z. Hu, A. Mian, H. Tian and X. Zhu, "A new user similarity model to improve the accuracy of collaborative filtering," Knowledge-Based Systems, vol. 56, no. 2014, pp. 156-166, 20 November 2013. 5. B. K. Patra, R. Launonen, V. Ollikainen and S. Nandi, "A new similarity measure using Bhattacharyya coefficient for collaborative filtering in sparse data," Knowledge-Based Systems, vol. 82, pp. 163-177, 1 March 2015. 6. Y.-S. Lin, J.-Y. Jiang and S.-J. Lee, "A Similarity Measure for Text Classification and Clustering," IEEE Transactions on Knowledge and Data Engineering, vol. 26, no. 7, pp. 1575-1590, 25 January 2013. 7. J. L. Herlocker, J. A. Konstan, L. G. Terveen and J. T. Riedl, "Evaluating Collaborative Filtering Recommender Systems," ACM Transactions on Information Systems (TOIS), vol. 22, no. 1, pp. 5-53, 2004. 8. GroupLens, "MovieLens datasets," GroupLens Research Project, University of Minnesota, USA, 22 April 1998. [Online]. Available: http://grouplens.org/datasets/movielens. [Accessed 3 August 2012].