Literature Recommendation Software
Faruk Cankaya
Melike Keskin
Supervisor: Florian Schramm
Professor: Prof. Dr. Jürgen Ernstberger
April 15, 2021
Agenda
➢ Introduction
➢ Related Works
➢ Methodology
○ Data preparation
○ Topic Extraction
○ Finding similar papers
➢ Results
➢ Conclusion and Future Work
➢ Questions
Introduction
➢ Problem Statement
○ No preliminary data
○ Paragraph input
Introduction
➢ Keyword based input (X)
➢ Reference based recommendation (X)
➢ Most-cited papers (X)

➢ Motivation
○ First recommender system based solely on a paragraph input
○ Paper recommendation for a specific area
○ Broad scope for trying different combinations of techniques
○ Makes thesis writing easier
○ Saves time
○ Specific domain
Related Works
○ Scienstein: A Research Paper Recommender System
■ Paper recommender
■ Hybrid filtering
■ Citation, author and source analysis
■ Preliminary data (citation analysis, author analysis, source analysis)
○ Science Concierge: A Fast Content-Based Recommendation System for Scientific Publications
■ Paper recommender
■ Content-based filtering
■ Topic Modeling
■ Preliminary data (users’ votes)
○ ScienceDirect: Topic Modeling Driven Content-Based Jobs Recommendation Engine for Recruitment Industry
■ Job recommender
■ Content-based filtering
■ Topic Modeling
■ Preliminary data (job description, user details)
Methodology
➢ Used Method
○ Content-based
○ Data Preprocessing
■ Cleaning + Tokenization + Stop Word Removal + Lemmatization
○ Topic modelling
■ LDA
■ NMF
○ Similarity Function
■ Cosine Similarity
➢ Data preparation
○ Number of documents: ~12,000 papers
○ Candidate steps: tokenization, text cleaning, stop word removal, stemming, lemmatization, synonym replacement, POS tagging, etc.
○ Our model: Cleaning + Tokenization + Stop Word Removal + Lemmatization
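The four steps of the chosen preprocessing model can be sketched as follows. This is a minimal illustration, not the authors' implementation: the stop word list and the lemma lookup table (`STOP_WORDS`, `LEMMAS`) are tiny hypothetical stand-ins for the resources a real NLP library would provide.

```python
import re

# Assumption: a tiny illustrative stop word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "the", "of", "and", "in", "on", "for", "to", "is", "are", "we", "this"}

# Assumption: a toy lookup table standing in for a real lemmatizer.
LEMMAS = {
    "papers": "paper",
    "models": "model",
    "topics": "topic",
    "recommending": "recommend",
}


def preprocess(text):
    """Cleaning -> tokenization -> stop word removal -> lemmatization."""
    cleaned = re.sub(r"[^a-z\s]", " ", text.lower())      # cleaning: keep letters only
    tokens = cleaned.split()                               # tokenization on whitespace
    tokens = [t for t in tokens if t not in STOP_WORDS]    # stop word removal
    return [LEMMAS.get(t, t) for t in tokens]              # lemmatization via lookup


print(preprocess("We propose topic models for recommending papers in 2021!"))
# prints ['propose', 'topic', 'model', 'recommend', 'paper']
```

The same function would be applied both to every paper in the dataset and to the paragraph the user types in, so that both end up in the same token space.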
➢ Vectorization
○ Bag-of-Words
○ TF-IDF
[Diagram: preprocessed input text → vectorization (Bag-of-Words, TF-IDF, ...) → vectorized data]
[Diagram: document-term matrix with axes labeled "terms, features or corpus" and "items or documents"]
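A small sketch of the two vectorization options named above, written in plain Python. The function names `bag_of_words` and `tf_idf` are hypothetical, and the weighting `idf = log(N / df)` is only one common TF-IDF variant; the slides do not say which variant was used.

```python
import math
from collections import Counter


def bag_of_words(docs):
    """Return (vocabulary, count matrix) with one row per document."""
    vocab = sorted({term for doc in docs for term in doc})
    rows = []
    for doc in docs:
        counts = Counter(doc)
        rows.append([counts[term] for term in vocab])
    return vocab, rows


def tf_idf(counts):
    """Weight raw counts by idf = log(N / df), one common TF-IDF variant."""
    n_docs = len(counts)
    n_terms = len(counts[0])
    # Document frequency: in how many documents each term occurs.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(n_terms)]
    idf = [math.log(n_docs / d) for d in df]
    return [[row[j] * idf[j] for j in range(n_terms)] for row in counts]


# Two already-preprocessed toy documents (lists of tokens).
docs = [
    ["topic", "model", "paper"],
    ["paper", "recommend", "paper"],
]
vocab, counts = bag_of_words(docs)
weights = tf_idf(counts)
print(vocab)   # ['model', 'paper', 'recommend', 'topic']
print(counts)  # [[1, 1, 0, 1], [0, 2, 1, 0]]
```

In this toy corpus the term "paper" occurs in every document, so its TF-IDF weight is zero, which illustrates why TF-IDF tends to down-weight terms that do not discriminate between documents.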
➢ Topic Extraction
○ Applied Topic Modeling Technique
■ LDA
■ NMF
[Diagram: vectorized data → topic extraction → terms in each topic, topic probability of each document]
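To make the topic extraction step concrete, here is a minimal sketch of NMF using the standard multiplicative update rules, in plain Python. It is an illustration under assumptions, not the authors' code: the helper names, the iteration count, the seed, and the row normalization used to read `W` as per-document topic weights are all choices made for this example. LDA would produce the same two kinds of output (terms per topic, topic weights per document) by a different, probabilistic route.

```python
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of lists."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def reconstruction_error(v, w, h):
    """Squared Frobenius distance between V and W @ H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, n_topics, n_iter=200, seed=0, eps=1e-9):
    """Factor V (docs x terms) into W (docs x topics) and H (topics x terms)."""
    rng = random.Random(seed)
    n_docs, n_terms = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(n_topics)] for _ in range(n_docs)]
    h = [[rng.random() + eps for _ in range(n_terms)] for _ in range(n_topics)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_terms)]
             for k in range(n_topics)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(n_topics)]
             for i in range(n_docs)]
    return w, h


def normalize_rows(m):
    """Scale each row to sum to 1 so it reads like a topic distribution."""
    return [[x / (sum(row) or 1.0) for x in row] for row in m]


# Toy document-term matrix: 4 documents, 4 terms.
v = [[2, 1, 0, 0], [3, 1, 0, 0], [0, 0, 1, 2], [0, 0, 2, 3]]
w, h = nmf(v, n_topics=2)
doc_topics = normalize_rows(w)   # topic weights of each document
for row in doc_topics:
    print([round(x, 3) for x in row])
```

Rows of `h` give the term weights of each topic, and rows of `doc_topics` give the per-document topic weights that the recommendation step compares. Documents that share vocabulary would typically receive similar topic vectors.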
➢ Prediction / Recommendation
○ Based on Cosine Similarity
[Diagram: cosine similarity between the topic probability matrix of the dataset and the topic probability vector of the input]
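The recommendation step can be sketched as ranking the dataset's topic vectors by cosine similarity to the input's topic vector. The function names, the `top_n` parameter, and the toy numbers below are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def recommend(doc_topics, query_topics, top_n=3):
    """Return indices of the top_n documents most similar to the query."""
    scores = [(cosine_similarity(row, query_topics), idx)
              for idx, row in enumerate(doc_topics)]
    scores.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scores[:top_n]]


# Hypothetical topic probability matrix (4 papers x 3 topics) and input vector.
doc_topics = [
    [0.9, 0.1, 0.0],
    [0.1, 0.8, 0.1],
    [0.6, 0.3, 0.1],
    [0.0, 0.1, 0.9],
]
query = [0.8, 0.2, 0.0]
print(recommend(doc_topics, query, top_n=2))  # prints [0, 2]
```

Cosine similarity compares the direction of the two topic vectors rather than their magnitude, so a short input paragraph and a long paper can still match if their topic mixes are similar.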
Results
➢ Effect of data preprocessing steps
➢ Model Comparisons
➢ Number of Words in User Input
➢ Validation with user feedback
○ Before user feedback
■ Accuracy with content 3
● LDA is better than NMF
■ Accuracy with content 10
● NMF is better than LDA
○ After user feedback
■ NMF is better than LDA
Conclusion & Future Works
➢ Conclusion
○ Found optimal data preprocessing model
■ Cleaning + Tokenization + Stop Word Removal + Lemmatization
○ Compared 2 different topic modelling techniques
■ LDA and NMF
○ Compared model accuracies
○ User ratings
■ Models with LDA and NMF
➢ Future Works
○ Try other techniques such as BERT and check whether they give better results on user rating feedback
○ Use user ratings to improve the recommendation system
○ Add new features to the website
○ Try different topic modelling techniques
○ Try different similarity functions
○ Train a model using the extracted topics
○ Tune the hyperparameters for the new techniques
DEMO
➢ Web Site
Thank You
Questions?
