Pairwise Reviews Ranking and Classification
for Medicine E-Commerce Application
Shaurya Uppal1, Ambikesh Jayal2, Anuja Arora1
Jaypee Institute of Information Technology, Noida, India
Cardiff School of Technologies, Western Avenue, Cardiff, CF5 2YB
1
Paper Link: https://ieeexplore.ieee.org/document/8844887
Abstract
● Reviews are a very important part of e-commerce and strongly influence customers' buying decisions.
● Segregating useful reviews from an enormous number of reviews is a complex task.
● Relevance-based ranking draws on both the syntactic and semantic sense of the text.
● Google is the only major web application that uses a relevance-based review ranking approach, in Google Maps and the Play Store.
Objective: To show customers reviews ranked by relevance.
Approach: Pairwise Review Relevance Ranking method.
Work Flow: Feature Extraction → Pairwise Ranking → Classification
2
What is Learning to Rank?
Pairwise ranking methodology is used because it compares every review against every other review, giving each review an equal chance of being placed on top.
3
Workflow of Our Strategy
4
Feature Extraction
● Noun Strength (Rn)
● Review Polarity (Rp) - Sentiment of the review, signifying whether it is positive or negative.
● Review Subjectivity (Rs) - A measure of how subjective (as opposed to objective) the sentiment is, ranging from 0 to 1.
5
Feature Extraction
● Review Complexity (Rc) - Richness of the vocabulary used in a review: unique words in the review / total unique words in the corpus.
● Service Tag (Rs) - Checks whether a review relates to delivery, packing, customer service, etc., e.g. "I am not happy as delivery was not on time".
● Compound Score (Rsc) - Able to capture the sentiment of slang (e.g. "SUX!"), emoji (😊, 😥), emoticons ( :) , :D ), and the difference between capitalized and lowercase expressions ("I am HAPPY" and "I am happy" are different expressions).
● Word Length (Rw) - Number of words in a review.
6
Feature Extraction
[Examples: POS tags, subjectivity scores, and compound scores]
7
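The lexicon-free features above can be sketched in a few lines of Python. In practice the sentiment features (Rp, Rs, Rsc) would come from a sentiment library such as VADER or TextBlob; the function name `extract_features` and the service keyword list below are illustrative assumptions, not the paper's implementation:

```python
# Illustrative sketch of the lexicon-free features (Rw, Rc, Service Tag).
# The keyword list for the Service Tag is an assumption for this sketch.
SERVICE_WORDS = {"delivery", "packing", "service", "support", "refund"}

def extract_features(review, corpus_vocab):
    words = review.lower().split()
    return {
        "Rw": len(words),                           # Word Length: word count
        "Rc": len(set(words)) / len(corpus_vocab),  # Review Complexity: unique words / corpus vocabulary
        "service": int(any(w in SERVICE_WORDS for w in words)),  # Service Tag: mentions delivery/service?
    }
```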
Review Segregation
The reviews are segregated into two sets. Set 1 contains reviews with label 1, i.e. informative reviews that are better than all reviews of Set 0; Set 0 contains reviews with label 0, i.e. reviews that are not informative. Reviews that are highly subjective, or that discuss delivery, service, and customer support without talking about the product, also fall into Set 0.
We pairwise compare each review of Set 1 with all reviews of Set 0 and vice versa:
(Ri, Rj, 1) where i ∈ Set 1 and j ∈ Set 0 → Ri is better than Rj
(Rj, Ri, 0) where i ∈ Set 1 and j ∈ Set 0 → Rj is worse than Ri
This now becomes a classification problem.
8
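The pair construction above can be sketched as follows (a minimal illustration; `make_pairs` is a hypothetical helper name):

```python
from itertools import product

def make_pairs(set1, set0):
    """Build labeled training pairs from informative (Set 1) and
    non-informative (Set 0) reviews, in both orders."""
    pairs = []
    for ri, rj in product(set1, set0):
        pairs.append((ri, rj, 1))  # Ri (Set 1) is better than Rj (Set 0)
        pairs.append((rj, ri, 0))  # Rj (Set 0) is worse than Ri (Set 1)
    return pairs
```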
Review Score
Review Score Computation: For a given product, we compare each review (Ri) with every other review (Rj) and get a win/lose outcome, where win means Ri is better than Rj and lose means Ri is worse than Rj. The review score is then computed as:
Review Score = Total #Wins / Total #Comparisons
9
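The win-ratio computation can be sketched as below; `better_than` stands in for the trained pairwise classifier's prediction, and the function name is illustrative:

```python
def review_scores(reviews, better_than):
    """Score each review as the fraction of pairwise comparisons it wins.
    better_than(ri, rj) -> True if ri is judged better than rj."""
    scores = {}
    for ri in reviews:
        wins = sum(better_than(ri, rj) for rj in reviews if rj is not ri)
        scores[ri] = wins / (len(reviews) - 1)  # Total #Wins / Total #Comparisons
    return scores
```

Sorting the reviews by this score in descending order yields the ranked list shown to customers.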
Dataset Statistics
10
Training dataset:
# of reviews: 503
# of categories: 5 (Vitamin B tablet, Vitamin D tablet, Accu-check, Omega 3 fatty acid, and a medicinal shampoo)
# of reviews per category: 83–107
Classification Model
Classification models used:
● Random Forest
● Support Vector Classifier
● Logistic Regression
● Neural Network
11
On experimenting with these classifiers, we obtained the best results with the Random Forest classifier. [Slide 13]
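A minimal sketch of training such a classifier, assuming scikit-learn and synthetic data in place of the real pairwise feature vectors (the feature layout below is an assumption for illustration, not the paper's exact setup):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for pairwise feature differences: each row represents
# features(Ri) - features(Rj); label 1 means Ri is better than Rj.
rng = np.random.default_rng(0)
X_pos = rng.normal(loc=1.0, size=(50, 4))   # Ri better: positive feature gap
X_neg = rng.normal(loc=-1.0, size=(50, 4))  # Rj better: negative feature gap
X = np.vstack([X_pos, X_neg])
y = np.array([1] * 50 + [0] * 50)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, y)
```

The same `fit`/`predict` interface lets the other models (SVC, logistic regression, MLP) be swapped in for comparison.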
Performance Measures
● Ranking Accuracy Measure
The outcome of the pairwise ranking algorithm is a list of reviews sorted by the review score computed earlier. To test this hypothesis, a ranking metric is designed as follows:
Let N(label=1) be the number of reviews labeled 1 in our dataset.
Ranking Accuracy(product) = (# of 1s found in the first N(label=1) positions) / N(label=1)
● Classification Accuracy Measure
Classification accuracy is computed from the true positive, true negative, false positive, and false negative counts:
Classification Accuracy(review) = (TP + TN) / (TP + TN + FP + FN)
12
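Both measures are straightforward to compute; a minimal sketch (function names are illustrative):

```python
def ranking_accuracy(sorted_labels):
    """Fraction of label-1 reviews that appear in the top N(label=1)
    positions of the ranked list (labels ordered by predicted rank)."""
    n1 = sum(sorted_labels)  # number of reviews labeled 1
    return sum(sorted_labels[:n1]) / n1

def classification_accuracy(tp, tn, fp, fn):
    # (TP + TN) / (TP + TN + FP + FN)
    return (tp + tn) / (tp + tn + fp + fn)
```

A ranking accuracy of 1.0 means every informative review was ranked above every non-informative one.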
Results
13
Ranked Reviews for Vitamin E Tablet (Top Ranked Reviews)
14
Ranked Reviews for Vitamin E Tablet (Low Ranked Reviews)
15
Future Work
● Gathering user upvote data.
● Personalized ranking for every user based on their activity, i.e. learning what type of reviews and literature they like.
● Personalized ranking would extend relevance ranking: using per-user upvotes, we learn which type of reviews a person finds helpful and show on top the reviews whose features and relevance match what they upvote.
● A readability score can be added once personalized per-user ranking is in place.
16
References
• Seki, Y. (2002). Sentence Extraction by tf/idf and position weighting from Newspaper Articles.
• Ravikumar, P., Tewari, A., & Yang, E. (2011, June). On NDCG consistency of listwise ranking methods. In
Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics (pp. 618-626).
• Yu, R., Zhang, Y., Ye, Y., Wu, L., Wang, C., Liu, Q., & Chen, E. (2018, October). Multiple Pairwise Ranking with
Implicit Feedback. In Proceedings of the 27th ACM International Conference on Information and Knowledge
Management (pp. 1727-1730). ACM.
• Yu, L., Zhang, C., Pei, S., Sun, G., & Zhang, X. (2018, April). Walkranker: A unified pairwise ranking model with
multiple relations for item recommendation. In Thirty-Second AAAI Conference on Artificial Intelligence.
• Bai, T., Zhao, W. X., He, Y., Nie, J. Y., & Wen, J. R. (2018). Characterizing and predicting early reviewers for
effective product marketing on e-commerce websites. IEEE Transactions on Knowledge and Data Engineering,
30(12), 2271-2284.
• Yan, Y., Liu, Z., Zhao, M., Guo, W., Yan, W. P., & Bao, Y. (2018, September). A Practical Deep Online Ranking
System in E-commerce Recommendation. In Joint European Conference on Machine Learning and Knowledge
Discovery in Databases (pp. 186-201). Springer, Cham.
• Li, H. (2011). A short introduction to learning to rank. IEICE Transactions on Information and Systems, 94(10), 1854-1862.
17
