Article Recommender System
Done by: Lakshya Karwa and Tarun Kumar I. S.
Guided by: Dr. J. Shana
Problem Description
On the Internet, where the number of options is overwhelming,
information needs to be filtered, prioritized, and delivered
efficiently. Doing so alleviates information overload, which is
a problem for many Internet users.
Dataset Description
The dataset we use comes from DeskDrop, an internal communications
platform developed by CI&T.
It covers 2016 to 2017.
It consists of two datasets:
● Shared_Articles
● User_Interactions
Shared_Articles
● timestamp
● eventType
● contentId
● authorPersonId
● authorSessionId
● authorUserAgent
● authorRegion
● authorCountry
● contentType
● url
● title
● text
● lang
Shared_Articles
• Contains information about the articles shared on the platform. Each article has a
sharing date (timestamp), the original URL, title, content, language (lang), and
information about the user who shared it (author).
• Two event types are possible at a given timestamp:
CONTENT SHARED: The article was shared on the platform and is available to
users.
CONTENT REMOVED: The article was removed from the platform and is no longer
available for recommendation.
• For simplicity, we consider only the "CONTENT SHARED" event type, assuming
(naively) that all articles were available during the whole one-year period. For a
more precise evaluation (and higher accuracy), only articles that were available at
a given time should be recommended.
User_Interactions
● timestamp
● eventType
● contentId
● personId
● sessionId
● userAgent
● userRegion
● userCountry
User_Interactions
• Contains logs of user interactions with shared articles. It can be joined to
articles_shared.csv on the contentId column.
• The eventType values are:
VIEW: The user has opened the article.
LIKE: The user has liked the article.
COMMENT CREATED: The user has commented on the article.
FOLLOW: The user chose to be notified of any new comment on the article.
BOOKMARK: The user has bookmarked the article for easy return in the future.
Data Pre-Processing and Preparation
• No imputation was required, as the dataset had no missing values.
• A new rating column was created from the user's action on a particular article:
1 - VIEW: The user has opened the article.
2 - LIKE: The user has liked the article.
3 - COMMENT CREATED: The user has commented on the article.
4 - FOLLOW: The user chose to be notified of any new comment on the article.
5 - BOOKMARK: The user has bookmarked the article for easy return in the future.
• The two datasets were merged with an inner join on the contentId attribute,
which is present in both.
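The rating column and the inner join described above can be sketched with pandas. This is a minimal illustration, not the deck's own code: the two small DataFrames and the names `articles`, `interactions`, and `EVENT_RATING` are hypothetical stand-ins for the DeskDrop CSV files.

```python
import pandas as pd

# Hypothetical miniature versions of the two tables (the real data is in the DeskDrop CSVs).
articles = pd.DataFrame({
    "contentId": [101, 102, 103],
    "eventType": ["CONTENT SHARED", "CONTENT SHARED", "CONTENT REMOVED"],
    "title": ["Intro to ALS", "BPR explained", "Old post"],
})
interactions = pd.DataFrame({
    "personId": [1, 1, 2, 2],
    "contentId": [101, 102, 101, 999],
    "eventType": ["VIEW", "LIKE", "BOOKMARK", "VIEW"],
})

# Rating scale from the slide: 1 = VIEW up to 5 = BOOKMARK.
EVENT_RATING = {"VIEW": 1, "LIKE": 2, "COMMENT CREATED": 3, "FOLLOW": 4, "BOOKMARK": 5}
interactions["rating"] = interactions["eventType"].map(EVENT_RATING)

# Keep only "CONTENT SHARED" articles, then inner-join on contentId.
shared = articles[articles["eventType"] == "CONTENT SHARED"]
merged = interactions.merge(shared[["contentId", "title"]], on="contentId", how="inner")
print(merged)
```

Because the join is an inner join, the interaction with contentId 999 (which has no matching article) is dropped from `merged`.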
MERGING OF TABLES
IMPLICITLY ADDING VALUES TO PREVIOUS VIEWS
EXPLORATORY DATA ANALYSIS
COUNT OF ITEMS IN EVENT TYPE
NUMBER OF ARTICLES IN DIFFERENT LANGUAGES
CHECKING INTERACTIONS WITH USERS
MODEL SELECTION
• Alternating Least Squares (ALS) - Performed by Lakshya Karwa
• Bayesian Personalized Ranking (BPR) - Performed by Tarun Kumar I. S.
• Logistic Matrix Factorization (LMF) - Performed by both
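MODEL BUILDING PHASES
• Model Selection:
• The models were selected on the basis of collaborative filtering.
• ALS alternates between two least-squares problems: it fixes the item factors and
solves for the user factors, then fixes the user factors and solves for the item factors.
• Scalability
• BPR takes a Bayesian approach to ranking: it learns factors that maximize the
probability that a user prefers an item they interacted with over one they did not.
• LMF is also a matrix-factorization method, like ALS, but it models the probability
of an interaction with a logistic function of the latent factors; a log function is
applied in the confidence matrix to improve accuracy.
• Model Fitting:
Done using the implicit library for Python.
• Model Validation:
Checked using a train-test split.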
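The validation step can be illustrated with a plain-Python holdout split. This is only a sketch under assumptions: the deck does not say how its split was made, and the function `train_test_split`, the 80/20 ratio, and the synthetic `triples` are hypothetical.

```python
import random

def train_test_split(triples, test_fraction=0.2, seed=42):
    """Shuffle (user, item, rating) triples and hold out a fraction for testing."""
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(triples)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical interactions: (personId, contentId, rating on the 1-5 scale).
triples = [(u, i, 1 + (u + i) % 5) for u in range(5) for i in range(4)]
train, test = train_test_split(triples)
print(len(train), len(test))
```

The model would then be fitted on `train` only, and the held-out `test` interactions used to check whether it ranks them highly.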
Performance Analysis
• Accuracy for Bayesian Personalized Ranking (BPR): 82.6%
• Accuracy for Alternating Least Squares (ALS): 98.1%
• Accuracy for Logistic Matrix Factorization (LMF): 97.89%
COMPARISON OF MODELS
Inference For ALS
• Collaborative filtering can be improved using matrix factorization.
• The method is fairly robust.
• The time complexity is O(n).
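To make the alternating idea behind ALS concrete, here is a simplified NumPy sketch. It is an assumption-laden illustration, not the implicit library's implementation: it treats every entry of a small dense matrix as observed and omits the confidence weighting used for implicit feedback. The function `als` and the matrix `R` are hypothetical.

```python
import numpy as np

def als(R, factors=2, reg=0.1, iterations=10, seed=0):
    """Factor R (users x items) as U @ V.T by alternating ridge-regression solves."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = rng.normal(scale=0.1, size=(n_users, factors))
    V = rng.normal(scale=0.1, size=(n_items, factors))
    eye = reg * np.eye(factors)
    losses = []
    for _ in range(iterations):
        # Fix the item factors V and solve the least-squares problem for U.
        U = np.linalg.solve(V.T @ V + eye, V.T @ R.T).T
        # Fix the user factors U and solve the least-squares problem for V.
        V = np.linalg.solve(U.T @ U + eye, U.T @ R).T
        # Regularized squared error after this full iteration.
        loss = np.sum((R - U @ V.T) ** 2) + reg * (np.sum(U ** 2) + np.sum(V ** 2))
        losses.append(float(loss))
    return U, V, losses

# Hypothetical 4 users x 4 items rating matrix.
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)
U, V, losses = als(R)
print(losses[0], losses[-1])
```

Each half-step exactly minimizes the same objective with the other factor matrix held fixed, so the recorded loss cannot increase from one iteration to the next.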
Inference for BPR
• The method depends more on previous interactions than on latent factors.
• The time complexity is O(n).
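The pairwise idea behind BPR can be sketched with a small stochastic-gradient loop. This is a hedged illustration rather than the implicit library's code: the function `train_bpr`, its hyperparameters, and the toy `observed` data are hypothetical, and users are assumed to be indexed 0 to n-1.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_bpr(observed, n_items, factors=4, lr=0.05, reg=0.01, steps=3000, seed=0):
    """observed maps each user index (0..n-1) to the set of item indices they interacted with."""
    rng = np.random.default_rng(seed)
    users = sorted(observed)
    P = rng.normal(scale=0.1, size=(len(users), factors))  # user factors
    Q = rng.normal(scale=0.1, size=(n_items, factors))     # item factors
    for _ in range(steps):
        u = int(rng.choice(users))
        i = int(rng.choice(sorted(observed[u])))            # an item the user interacted with
        unseen = [k for k in range(n_items) if k not in observed[u]]
        j = int(rng.choice(unseen))                         # an item the user did not interact with
        p_u, q_i, q_j = P[u].copy(), Q[i].copy(), Q[j].copy()
        # Gradient weight of ln sigmoid(x_uij), where x_uij = p_u . (q_i - q_j).
        g = sigmoid(-(p_u @ (q_i - q_j)))
        P[u] += lr * (g * (q_i - q_j) - reg * p_u)
        Q[i] += lr * (g * p_u - reg * q_i)
        Q[j] += lr * (-g * p_u - reg * q_j)
    return P, Q

# Hypothetical toy data: users 0 and 2 interacted with items 0 and 1, user 1 with items 2 and 3.
observed = {0: {0, 1}, 1: {2, 3}, 2: {0, 1}}
P, Q = train_bpr(observed, n_items=4)
scores = P @ Q.T
print(np.round(scores, 2))
```

Each update pushes the score of an interacted item above the score of a non-interacted one for the same user, which is why BPR is described as a ranking method rather than a rating predictor.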
Inference For LMF
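• This method is similar to ALS, but it applies a log function in the confidence
matrix with the aim of improving accuracy over ALS. In the results reported
here, however, ALS scored slightly higher (98.1% vs. 97.89%).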
• The time complexity is O(n).
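As a rough illustration of logistic matrix factorization, the sketch below models the probability of an interaction as a sigmoid of the dot product of user and item factors and improves a confidence-weighted log-posterior by alternating gradient steps. It rests on assumptions: bias terms are omitted, the confidence is taken as `alpha * R`, and `train_lmf`, `log_posterior`, and `R` are hypothetical names, not the implicit library's API.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def log_posterior(R, X, Y, alpha, lam):
    """Confidence-weighted log-likelihood minus an L2 penalty (no bias terms)."""
    A = X @ Y.T
    C = alpha * R
    fit = np.sum(C * A - (1.0 + C) * np.logaddexp(0.0, A))
    return float(fit - 0.5 * lam * (np.sum(X ** 2) + np.sum(Y ** 2)))

def train_lmf(R, factors=2, alpha=1.0, lam=0.1, lr=0.01, iterations=200, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.normal(scale=0.1, size=(R.shape[0], factors))  # user factors
    Y = rng.normal(scale=0.1, size=(R.shape[1], factors))  # item factors
    C = alpha * R
    history = [log_posterior(R, X, Y, alpha, lam)]
    for _ in range(iterations):
        # Gradient step on the user factors with the item factors held fixed.
        G = C - (1.0 + C) * sigmoid(X @ Y.T)
        X = X + lr * (G @ Y - lam * X)
        # Gradient step on the item factors with the user factors held fixed.
        G = C - (1.0 + C) * sigmoid(X @ Y.T)
        Y = Y + lr * (G.T @ X - lam * Y)
        history.append(log_posterior(R, X, Y, alpha, lam))
    return X, Y, history

# Hypothetical 4 users x 4 items interaction-strength matrix.
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)
X, Y, history = train_lmf(R)
probs = sigmoid(X @ Y.T)
print(history[0], history[-1])
```

The output of the model is a probability per user-item pair, which is the main difference from the squared-error objective used in ALS.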
Time Taken by Each of the Models to Train
- Total time taken in building the BPR model: 0.25283193588256836
- Total time taken in building the ALS model: 0.4497077465057373
- Total time taken in building the LMF model: 0.3670186996459961
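Build times like those above could be collected with a small timing helper. The deck does not say how its figures were measured, so this is only one possible approach; `timed` is a hypothetical helper, and the `sum` call stands in for a model-fitting call.

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# Stand-in workload; in practice fn would be the call that fits a model.
result, elapsed = timed(sum, range(1000))
print(f"Total time taken: {elapsed:.6f} s")
```

`time.perf_counter` is a monotonic, high-resolution clock, which makes it better suited to short measurements than subtracting two wall-clock timestamps.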
RECOMMENDATION
Recommendation by ALS
Recommendation by BPR
Recommendation by LMF
Challenges
• The implicit library used to implement the algorithms was not readily
available for Windows 10: running it there required a C/C++ compiler,
so it was run on Linux (Ubuntu) instead.
• On Ubuntu, the system took a long time to compute the results for ALS
(17.xx seconds) every time the model was built.
Learning
• How to use the implicit library for Python.
• How different recommender systems work.
• How to implement ALS, BPR, and LMF models with the implicit library,
and how collaborative filtering can be improved.
• Matrix factorization for the sparse-data problem.
References
• Dataset - https://www.kaggle.com/gspmoreira/articles-sharing-reading-from-cit-deskdrop
• https://implicit.readthedocs.io/en/latest/quickstart.html
• https://implicit.readthedocs.io/en/latest/
• https://readthedocs.org/projects/implicit/downloads/pdf/latest/
THANK YOU
