Article Recommender System
Done by: Lakshya Karwa, Tarun Kumar I. S.
Guided by: Dr. J. Shana
Problem Description
On the Internet, where the number of options is overwhelming, there is a
need to filter, prioritize, and efficiently deliver relevant information.
Doing so alleviates information overload, which is a problem for many
internet users.
Dataset Description
The dataset we use comes from DeskDrop, an internal communications
platform developed by CI&T.
It covers data from 2016 to 2017.
There are two datasets:
Shared_Articles
User_Interactions
Shared_Articles
● timestamp
● eventType
● contentId
● authorPersonId
● authorSessionId
● authorUserAgent
● authorRegion
● authorCountry
● contentType
● url
● title
● text
● lang
Shared_Articles
• Contains information about the articles shared on the platform. Each article has a
sharing date (timestamp), the original URL, title, content, language, and
information about the user who shared it (author).
• Two event types are possible at a given timestamp:
CONTENT SHARED: The article was shared on the platform and is available to
users.
CONTENT REMOVED: The article was removed from the platform and is no longer
available for recommendation.
• For simplicity, we consider only the "CONTENT SHARED" event type here,
assuming (naively) that all articles were available during the whole one-year
period. For a more precise evaluation (and higher accuracy), only articles that were
available at a given time should be recommended.
User_Interactions
● timestamp
● eventType
● contentId
● personId
● sessionId
● userAgent
● userRegion
● userCountry
User_Interactions
• Contains logs of user interactions with shared articles. It can be joined to
the Shared_Articles dataset on the contentId column.
• The eventType values are:
VIEW: The user has opened the article.
LIKE: The user has liked the article.
COMMENT CREATED: The user created a comment in the article.
FOLLOW: The user chose to be notified on any new comment in the article.
BOOKMARK: The user has bookmarked the article for easy return in the future.
Data Pre-Processing and Preparation
• No imputation was required, as the dataset had no missing values.
• A new rating column was created from the user's action on a particular article:
1 - VIEW: The user has opened the article.
2 - LIKE: The user has liked the article.
3 - COMMENT CREATED: The user created a comment in the article.
4 - FOLLOW: The user chose to be notified on any new comment in the article.
5 - BOOKMARK: The user has bookmarked the article for easy return in the future.
• The two datasets were merged with an inner join on the "contentId" attribute,
which is present in both datasets.
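The rating and merge steps above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code: the sample rows, the `EVENT_RATING` name, and the helper functions are hypothetical, and only a few of the real columns are shown.

```python
# Rating assigned to each interaction type, as listed on the slide.
EVENT_RATING = {
    "VIEW": 1,
    "LIKE": 2,
    "COMMENT CREATED": 3,
    "FOLLOW": 4,
    "BOOKMARK": 5,
}

# Hypothetical sample rows; the real datasets have more columns and rows.
shared_articles = [
    {"contentId": 101, "eventType": "CONTENT SHARED", "title": "Intro to ALS"},
    {"contentId": 102, "eventType": "CONTENT SHARED", "title": "BPR explained"},
    {"contentId": 102, "eventType": "CONTENT REMOVED", "title": "BPR explained"},
]
user_interactions = [
    {"personId": 1, "contentId": 101, "eventType": "VIEW"},
    {"personId": 1, "contentId": 101, "eventType": "LIKE"},
    {"personId": 2, "contentId": 102, "eventType": "BOOKMARK"},
    {"personId": 3, "contentId": 999, "eventType": "VIEW"},  # no matching article
]


def add_rating(interactions):
    """Return copies of the interaction rows with a numeric rating column."""
    return [dict(row, rating=EVENT_RATING[row["eventType"]]) for row in interactions]


def inner_join(interactions, articles):
    """Keep only interactions whose contentId matches a CONTENT SHARED article."""
    shared = {a["contentId"]: a for a in articles
              if a["eventType"] == "CONTENT SHARED"}
    merged = []
    for row in interactions:
        article = shared.get(row["contentId"])
        if article is not None:  # inner join: unmatched rows are dropped
            merged.append({**row, "title": article["title"]})
    return merged


merged = inner_join(add_rating(user_interactions), shared_articles)
for row in merged:
    print(row["personId"], row["contentId"], row["rating"], row["title"])
```

The interaction with contentId 999 is dropped because no article matches it, which is the behavior an inner join gives.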
MERGING OF TABLES
IMPLICITLY ADDING VALUES TO PREVIOUS VIEWS
EXPLORATORY DATA ANALYSIS
COUNT OF ITEMS IN EVENT TYPE
NO. OF ARTICLES IN DIFFERENT LANGUAGES
CHECKING INTERACTIONS WITH USERS
MODEL SELECTION
• Alternating Least Squares (ALS) - Performed by Lakshya Karwa
• Bayesian Personalized Ranking (BPR) - Performed by Tarun Kumar I. S.
• Logistic Matrix Factorization (LMF) - Performed by both
MODEL BUILDING PHASES
• Model Selection:
• Models were selected on the basis of collaborative filtering.
• ALS alternates between two least-squares problems: it fixes the item factors and
solves for the user factors, then fixes the user factors and solves for the item factors.
• Scalability was a further selection criterion.
• BPR takes a Bayesian approach: it maximizes the posterior probability that a user
prefers an item they interacted with over one they did not.
• LMF is a matrix factorization method like ALS, but it models the probability of an
interaction with a logistic function, with the aim of improving accuracy.
• Model Fitting:
Done using the implicit library for Python.
• Model Validation:
Checked using a train-test split.
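The validation step can be sketched as below. The slides do not say how the split was made or how accuracy was computed, so everything here is an assumption for illustration: a random hold-out of (user, item) pairs, a hit rate at k as the metric, and a popularity baseline standing in for a trained model. All function names are hypothetical.

```python
import random
from collections import Counter


def train_test_split(interactions, test_fraction=0.2, seed=42):
    """Shuffle (user, item) pairs and hold out a fraction for testing."""
    rng = random.Random(seed)
    shuffled = list(interactions)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def popularity_recommender(train):
    """Baseline stand-in for a model: most popular items the user has not seen."""
    counts = Counter(item for _, item in train)
    ranked = [item for item, _ in counts.most_common()]
    seen = {}
    for user, item in train:
        seen.setdefault(user, set()).add(item)

    def recommend(user, k):
        return [i for i in ranked if i not in seen.get(user, set())][:k]

    return recommend


def hit_rate_at_k(recommend, test, k=3):
    """Fraction of held-out pairs whose item appears in the user's top k."""
    if not test:
        return 0.0
    hits = sum(1 for user, item in test if item in recommend(user, k))
    return hits / len(test)


# Hypothetical toy interactions: 5 users, 6 items.
interactions = [(u, i) for u in range(5) for i in range(6) if (u + i) % 2 == 0]
train, test = train_test_split(interactions)
score = hit_rate_at_k(popularity_recommender(train), test)
print(f"train={len(train)} test={len(test)} hit_rate@3={score:.2f}")
```

Any model exposing a `recommend(user, k)` callable could be dropped in place of the baseline to compare models on the same held-out pairs.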
Performance Analysis
• Accuracy for Bayesian Personalized Ranking (BPR): 82.6%
• Accuracy for Alternating Least Squares (ALS): 98.1%
• Accuracy for Logistic Matrix Factorization (LMF): 97.89%
COMPARISON OF MODELS
Inference For ALS
• Collaborative Filtering can be improved using Matrix Factorization
• The method is pretty robust.
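• The time complexity of each training iteration is O(n) in the number of observed
interactions, for a fixed number of latent factors.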
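For reference, the alternating updates behind ALS can be written out as follows. This assumes the standard implicit-feedback formulation of Hu, Koren and Volinsky, which the implicit library's ALS model is based on. The symbols are not from the slides: r_{ui} is the rating, p_{ui} the binary preference, c_{ui} the confidence, x_u and y_i the user and item factors, and α and λ are hyperparameters.

```latex
% Objective: confidence-weighted squared error plus L2 regularization
\min_{X, Y} \sum_{u,i} c_{ui}\,\bigl(p_{ui} - x_u^\top y_i\bigr)^2
  + \lambda \Bigl(\sum_u \lVert x_u \rVert^2 + \sum_i \lVert y_i \rVert^2\Bigr),
\qquad p_{ui} = \mathbb{1}[r_{ui} > 0], \quad c_{ui} = 1 + \alpha\, r_{ui}

% With Y fixed, each user vector has a closed-form least-squares solution:
x_u = \bigl(Y^\top C^u Y + \lambda I\bigr)^{-1} Y^\top C^u\, p(u)

% With X fixed, each item vector is solved the same way:
y_i = \bigl(X^\top C^i X + \lambda I\bigr)^{-1} X^\top C^i\, p(i)
```

Here C^u is the diagonal matrix of confidences c_{ui} for user u, and p(u) is that user's preference vector; C^i and p(i) are defined likewise for item i.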
Inference For BPR
• The method depends more on previous interactions than on latent
factors.
• The time complexity of each training iteration is O(n) in the number of observed
interactions.
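The pairwise idea behind BPR can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the implicit library's implementation: the `train_bpr` and `score` names, the hyperparameters, and the four-user dataset are hypothetical. For each observed (user, item) pair, it samples an item the user did not interact with and nudges the factors so the observed item scores higher.

```python
import math
import random


def train_bpr(positives, n_items, factors=4, lr=0.05, reg=0.01, epochs=300, seed=0):
    """Fit user and item factors with BPR's pairwise SGD updates.

    positives maps each user index (0..n_users-1) to the set of item indices
    the user interacted with. Every user must have at least one unseen item.
    """
    rng = random.Random(seed)
    n_users = len(positives)
    U = [[rng.gauss(0, 0.1) for _ in range(factors)] for _ in range(n_users)]
    V = [[rng.gauss(0, 0.1) for _ in range(factors)] for _ in range(n_items)]
    pairs = [(u, i) for u, items in positives.items() for i in sorted(items)]
    for _ in range(epochs):
        rng.shuffle(pairs)
        for u, i in pairs:
            # Sample an item j the user has NOT interacted with.
            j = rng.randrange(n_items)
            while j in positives[u]:
                j = rng.randrange(n_items)
            # x_uij is the score difference between the seen and unseen item.
            x_uij = sum(U[u][f] * (V[i][f] - V[j][f]) for f in range(factors))
            g = 1.0 / (1.0 + math.exp(x_uij))  # sigmoid(-x_uij)
            for f in range(factors):
                uf, vi, vj = U[u][f], V[i][f], V[j][f]
                U[u][f] += lr * (g * (vi - vj) - reg * uf)
                V[i][f] += lr * (g * uf - reg * vi)
                V[j][f] += lr * (-g * uf - reg * vj)
    return U, V


def score(U, V, u, i):
    """Predicted preference of user u for item i."""
    return sum(a * b for a, b in zip(U[u], V[i]))


# Hypothetical toy data: users 0-1 read items 0-1, users 2-3 read items 2-3.
positives = {0: {0, 1}, 1: {0, 1}, 2: {2, 3}, 3: {2, 3}}
U, V = train_bpr(positives, n_items=4)
for u in positives:
    print(u, [round(score(U, V, u, i), 2) for i in range(4)])
```

Each epoch visits every observed pair once, which is why the cost of an epoch grows linearly with the number of interactions.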
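Inference For LMF
• This method is similar to ALS, but it models interactions with a logistic function,
which is intended to improve accuracy. In our results, however, ALS scored slightly
higher (98.1% vs. 97.89%).
• The time complexity of each training iteration is O(n) in the number of observed
interactions.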
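To make the difference from ALS concrete, here is a small sketch of logistic matrix factorization in plain Python. It assumes the usual formulation, in which the probability of an interaction is the sigmoid of the user-item dot product and each observed cell is weighted by a confidence `alpha * r`. It is simplified (plain gradient ascent, no bias terms) and is not the implicit library's code; the function names, hyperparameters, and the toy matrix are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def train_lmf(R, factors=4, alpha=2.0, lr=0.05, reg=0.01, iters=500, seed=0):
    """Gradient ascent on the confidence-weighted logistic log-likelihood.

    R is a dense list-of-lists interaction matrix (users x items).
    """
    rng = random.Random(seed)
    n_users, n_items = len(R), len(R[0])
    X = [[rng.gauss(0, 0.1) for _ in range(factors)] for _ in range(n_users)]
    Y = [[rng.gauss(0, 0.1) for _ in range(factors)] for _ in range(n_items)]
    for _ in range(iters):
        # Derivative of the log-likelihood w.r.t. each score x_u . y_i:
        # alpha * r - (1 + alpha * r) * sigmoid(score)
        E = [[alpha * R[u][i] - (1 + alpha * R[u][i]) * sigmoid(dot(X[u], Y[i]))
              for i in range(n_items)] for u in range(n_users)]
        new_X = [[X[u][f] + lr * (sum(E[u][i] * Y[i][f] for i in range(n_items))
                                  - reg * X[u][f])
                  for f in range(factors)] for u in range(n_users)]
        new_Y = [[Y[i][f] + lr * (sum(E[u][i] * X[u][f] for u in range(n_users))
                                  - reg * Y[i][f])
                  for f in range(factors)] for i in range(n_items)]
        X, Y = new_X, new_Y
    return X, Y


# Hypothetical toy data: users 0-1 read items 0-1, users 2-3 read items 2-3.
R = [[1, 1, 0, 0],
     [1, 1, 0, 0],
     [0, 0, 1, 1],
     [0, 0, 1, 1]]
X, Y = train_lmf(R)
probs = [[sigmoid(dot(X[u], Y[i])) for i in range(4)] for u in range(4)]
for row in probs:
    print([round(p, 2) for p in row])
```

Unlike ALS, there is no closed-form solve here: the factors are improved by gradient steps, and the outputs are probabilities between 0 and 1 rather than reconstructed preference scores.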
Time Taken by Each of the Models to Train
- Total Time taken in building the BPR model: 0.25283193588256836
- Total Time taken in building the ALS model: 0.4497077465057373
- Total Time taken in building the LMF model: 0.3670186996459961
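Timings like those above can be collected with a small wrapper around the training call. This is a sketch, not the authors' measurement code: `timed` and `dummy_fit` are hypothetical names, and the stand-in function takes the place of a model's fit call. It reports seconds; the slides do not state the unit of their figures.

```python
import time


def timed(fit, *args, **kwargs):
    """Run fit(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fit(*args, **kwargs)
    return result, time.perf_counter() - start


def dummy_fit(n):
    # Hypothetical stand-in for a model's training call.
    return sum(i * i for i in range(n))


result, elapsed = timed(dummy_fit, 100_000)
print(f"Total time taken: {elapsed:.3f} s")
```

Averaging over several runs gives steadier numbers than a single measurement, since one run can be skewed by caching or other load on the machine.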
RECOMMENDATION
Recommendation by ALS
Recommendation by BPR
Recommendation by LMF
Challenges
• The implicit library used to implement the algorithms was not readily
available for the Windows 10 operating system. It had to be run on Linux
(Ubuntu); running it on Windows 10 required a C/C++ compiler.
• On Ubuntu, the system took a long time to compute the results for ALS,
i.e., 17.xx seconds, every time the model was built.
Learning
• How to use the implicit library for Python.
• How different recommender systems work.
• How to implement ALS, BPR and LMF models using the implicit library,
and how collaborative filtering can be improved.
• How matrix factorization addresses the sparse data problem.
References
• Dataset - https://www.kaggle.com/gspmoreira/articles-sharing-reading-from-cit-deskdrop
• https://implicit.readthedocs.io/en/latest/quickstart.html
• https://implicit.readthedocs.io/en/latest/
• https://readthedocs.org/projects/implicit/downloads/pdf/latest/
THANK YOU
