A Task-based Scientific Paper Recommender System for Literature Review and Manuscript Preparation
Aravind Sesagiri Raamkumar
PhD Candidate
Oral Examination Presentation for fulfillment of PhD
October 3rd, 2017
Background
"How to get the best set of relevant documents for a researcher's literature review and publication purposes?"
• SPRS research area: more than 250 papers, spanning citation recommendation, research paper recommender systems, recommending papers and ad-hoc search
• Interventions: process-based and technology-oriented
2
Research lifecycle stages: Identify research opportunities • Find collaborators • Secure support • Review the literature • Collect research data • Analyze research data • Disseminate findings • Manage the research process
Related Work Classification (1/2)
A) Active and Explicit Information Needs (AEIN) in the Research Lifecycle
B) Passive and Implicit Information Needs (PIIN)
<Recommendation of Scholarly Objects>
• Building a reading list for literature review
• Finding similar papers
• Searching papers based on input text
• Publication Venues
• Citation Context
3
Related Work Classification (2/2)
Passive and Implicit Information Needs (PIIN)
• User Footprint
• Researcher’s Publication History
• Social Network of Authors
• Social Tags
• Reference Management Systems
4
Research Gaps in SPRS Studies
• Consolidated Framework for Contextual Dimensions
• Lack of Connectivity between Tasks
• Lack of Relation(s) between Tasks and RS Filtering Mechanisms
• Absence of Article Type as an Input Dimension
5
Research Objectives
• To identify an appropriate method to map the identified LR and MP tasks to relevant IR/RS algorithms
– RQ1: What are the key search tasks of researchers in the literature review and publication lifecycle?
– RQ2: How to relate the identified tasks of researchers to IR/RS algorithms?
• To evaluate whether the performance of the proposed recommendation techniques for the tasks and the overall system was at the expected level
– RQ3: Do the proposed recommendation techniques of the relevant tasks outperform the existing baseline approaches in system-based evaluation?
– RQ4: Do the proposed recommendation techniques and the overall system meet the expected standards in user-based evaluation?
Study I → Rec4LRW System → Study II
6
Study I - Survey on Inadequate and Omitted Citations (IOC) in Manuscripts
Context: authors and reviewers; manuscripts with improper LR contribute to problems in research quality
7
Study I - Survey on Inadequate and Omitted Citations (IOC) in Manuscripts
Aims
• What are the critical instances of IOC?
• Do the critical instances and reasons of IOC in research manuscripts relate to the scenarios/tasks where researchers need external assistance in finding papers?
• What are the prominent information sources used by researchers?
• What is the researchers' awareness level of available recommendation services for research papers?
8
Study I Details
• Single-center data collection conducted over two months
• Only researchers with paper-authoring experience were recruited
• 207 NTU researchers participated in the study
– 71% of the participants answered from both reviewer and author perspectives
• Survey questionnaire comprising 31 questions
• Agreeability measured on a 5-point Likert scale
• Data analyzed with one-sample t-tests with a test value of either 2 or 3
9
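The one-sample t-test above checks whether mean agreement differs from a fixed test value. A minimal SciPy sketch, using hypothetical Likert responses rather than Study I data:

```python
# One-sample t-test on 5-point Likert responses against a test value,
# as in Study I. The response list is hypothetical illustration data.
from scipy import stats

responses = [4, 5, 3, 4, 4, 2, 5, 4, 3, 4]  # 5-point Likert ratings

# Test whether the mean rating differs from the test value of 3
t_stat, p_value = stats.ttest_1samp(responses, popmean=3)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```

A positive t with a small p suggests agreement significantly above the test value.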
Study I - Results
Instances of IOC
• Authors' viewpoint
– Missed citing seminal and topically-similar papers in journal manuscripts
• Reviewers' viewpoint
– Missed citing seminal and topically-similar papers in all manuscripts
– Insufficient and irrelevant papers in the LR of all manuscripts
Effects of IOC
• Reviewers' viewpoint
– Manuscripts are sent back for revision due to missing citations
10
Study I - Results
Need for External Assistance in Finding Papers
• Authors required support for the following paper types:
1. Interdisciplinary papers
2. Topically-similar papers
3. Seminal papers
4. Citations for placeholders in manuscripts
5. Necessary citations meant for inclusion in manuscripts
Usage of Academic Information Sources
• Researchers used the following sources, in decreasing order of usage:
1. Google Scholar
2. ScienceDirect
3. Web of Science
4. SpringerLink
• 62% of the participants had never used SPRS services
11
Study I – Key Findings
• Researchers need help in finding interdisciplinary, topically-similar and seminal papers
• Generating a reading list (seminal papers) and finding similar papers are two necessary LR search tasks for the proposed system
• Shortlisting papers from the final reading list for inclusion in a manuscript was selected as the third task for the proposed system
• Google Scholar's simple UI makes it the most used information source and an ideal model for the UI design of a new assistive system
12
Rec4LRW Design and Development
• Task Redesign
• Task Interconnectivity
• Informational Display Features
13
Rec4LRW System Design - I
Base Features
• Plug-and-play concept
• Features represent different characteristics of a paper and its relations to references and citations
• Grey Literature Percentage, Coverage, Textual Similarity and Specificity are novel features
• New features can be added as required
14
Rec4LRW System Design - II
Task 1 - Building an Initial Reading List of Research Papers
• Requirements: popular papers, recent papers, survey papers, diverse papers
• Author-specified Keywords based Retrieval (AKR) Technique
• Okapi BM25 similarity scores used to retrieve the top 200 matching papers
• Ranking problem: Composite Rank, a weighted mix of Coverage, Citation Count and Reference Count
15
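The Composite Rank idea can be sketched in a few lines. The feature values, the default weights and the min-max normalization below are illustrative assumptions, not the exact weighting defined in the thesis:

```python
# Sketch of Task 1's Composite Rank: a weighted mix of Coverage,
# Citation Count and Reference Count over BM25-retrieved candidates.
# Weights and min-max normalization are illustrative assumptions.

def minmax(values):
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]

def composite_rank(papers, w_co=0.5, w_cc=0.25, w_rc=0.25):
    cov = minmax([p["coverage"] for p in papers])
    cit = minmax([p["citations"] for p in papers])
    ref = minmax([p["references"] for p in papers])
    scored = [
        (w_co * c + w_cc * ci + w_rc * r, p["id"])
        for c, ci, r, p in zip(cov, cit, ref, papers)
    ]
    return [pid for _, pid in sorted(scored, reverse=True)]

# Hypothetical candidate papers
papers = [
    {"id": "A", "coverage": 0.9, "citations": 120, "references": 30},
    {"id": "B", "coverage": 0.4, "citations": 300, "references": 45},
    {"id": "C", "coverage": 0.7, "citations": 80,  "references": 20},
]
print(composite_rank(papers))  # -> ['A', 'B', 'C']
```

With coverage weighted at 0.5, the high-coverage paper A outranks the highly cited paper B, matching the AKRv1-style weighting in the pre-study.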
Rec4LRW System Design - II
Task 2 - Finding Similar Papers based on a Set of Papers
• Problem: extended paper discovery with multiple input papers
• Integrated Discovery of Similar Papers (IDSP) Technique
• Output: similar papers
16
Rec4LRW System Design - II
Task 3 - Shortlisting Articles from RL for Inclusion in Manuscript
• Problem: cluster detection over the final list of papers from LR
• Citation Network based Shortlisting (CNS) Technique
• Output: unique and important papers
17
Rec4LRW System Design - III
Task Screens: Task 1 → Task 2
• Information cue labels
• Seed Basket (SB)
18
Rec4LRW System Design - III
Task Screens: Task 2 → Task 3
• Shared Co-relations
• Reading List (RL)
19
Rec4LRW System Design - III
Task Screens: Task 3
• Cluster viewing option
• Front end: PHP, HTML, CSS, JavaScript
• Backend: MySQL
• Processing layer: Java
• Java libraries: Apache Lucene (for BM25), Apache Mahout (for IBCF), JUNG (for the community detection algorithm)
20
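Task 3's cluster detection runs on JUNG in the system's Java processing layer. A rough equivalent, using networkx's greedy modularity communities on a toy citation graph (the graph itself is a made-up example):

```python
# Community detection over a reading-list citation network, mirroring
# Task 3's cluster-detection step. The system uses JUNG in Java; this
# is a networkx sketch on a hypothetical graph.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Hypothetical citation links among papers in a reading list
G = nx.Graph([
    ("p1", "p2"), ("p2", "p3"), ("p1", "p3"),   # tight cluster 1
    ("p4", "p5"), ("p5", "p6"), ("p4", "p6"),   # tight cluster 2
    ("p3", "p4"),                               # weak bridge
])

communities = greedy_modularity_communities(G)
print([sorted(c) for c in communities])
```

Each community then becomes a candidate cluster from which unique and important papers can be shortlisted.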
Study II – Rec4LRW Evaluation
21
Study II - Dataset
• XML files provided by ACM
• Papers published in the period 1951 to 2011
• Total of 103,739 articles and corresponding 2,320,345 references
• Data was cleaned and transformed in MySQL
• References were parsed using AnyStyle parser
• All the seven base features were precomputed before Study II
22
Study II – Pre-study
Evaluated Techniques
Label | Abbr. | Technique Description
A | AKRv1 | Basic AKR technique with weights WCC = 0.25, WRC = 0.25, WCO = 0.5
B | AKRv2 | Basic AKR technique with weights WCC = 0.1, WRC = 0.1, WCO = 0.8
C | HAKRv1 | HITS-enhanced AKR technique boosted with weights WCC = 0.25, WRC = 0.25, WCO = 0.5
D | HAKRv2 | HITS-enhanced AKR technique boosted with weights WCC = 0.1, WRC = 0.1, WCO = 0.8
E | CFHITS | IBCF technique boosted with HITS
F | CFPR | IBCF technique boosted with PageRank
G | PR | PageRank technique
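The HITS-enhanced variants (C, D) boost candidates with link-analysis scores from the citation graph. A networkx sketch of the HITS step; the tiny graph and how the scores would be folded into the ranking are illustrative assumptions:

```python
# HITS authority scores over a citation graph, the link-analysis
# signal behind HAKRv1/HAKRv2 and CFHITS. The graph is a toy example.
import networkx as nx

# Edge u -> v means "paper u cites paper v"
G = nx.DiGraph([("a", "c"), ("b", "c"), ("b", "d"), ("a", "d"), ("d", "c")])

hubs, authorities = nx.hits(G, normalized=True)
best = max(authorities, key=authorities.get)
print(best, round(authorities[best], 3))
```

Paper "c", cited by all others, receives the highest authority score; such scores can then boost a base AKR ranking.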
Experiment Setup
• A total of 186 author-specified keywords from the ACM DL dataset were identified as the seed research topics
• The experiment was performed in three sequential steps:
1. The top 200 papers were retrieved using the BM25 similarity algorithm
2. The top 20 papers were identified using the specific ranking schemes of the seven techniques
3. The evaluation metrics were measured for the seven techniques
Evaluation Approach
• The numbers of Recent (R1), Popular (R2), Survey (R3) and Diverse (R4) papers were enumerated for each of the 186 topics and seven techniques
• Ranks were assigned to the techniques based on the highest counts in each recommendation list
• The RankAggreg library was used to perform rank aggregation
23
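The pre-study aggregated 186 per-topic rankings with the RankAggreg R package, which minimizes a distance-based objective function. A simpler Borda-count sketch conveys the idea of rank aggregation (Borda is a stand-in, not the method the study used; the rankings below are made up):

```python
# Borda-count aggregation of per-topic rankings of techniques A-G.
# Study II used the RankAggreg R package (objective-function based);
# Borda count is a simpler stand-in, on hypothetical rankings.

def borda(rankings):
    n = len(rankings[0])
    scores = {}
    for ranking in rankings:
        for pos, item in enumerate(ranking):
            # first place gets n points, last place gets 1
            scores[item] = scores.get(item, 0) + (n - pos)
    return sorted(scores, key=scores.get, reverse=True)

rankings = [
    list("BACDEFG"),
    list("CADBEFG"),
    list("ACBDEFG"),
]
print(borda(rankings))  # -> ['A', 'C', 'B', 'D', 'E', 'F', 'G']
```

The aggregated order surfaces the technique that is consistently near the top, even when it wins no single ranking outright.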
Study II – Part I (Pre-study)
Results
Paper Type (Requirement) | Optimal Aggregated Ranks (positions 1–7) | Min. Obj. Function Score
Recent Papers (R1) | B A C D E F G | 10.66
Popular Papers (R2) | F E C D G A B | 11.89
Literature Survey Papers (R3) | C G D A E F B | 13.38
Diverse Papers (R4) | C D G A B F E | 12.15
• The HITS-enhanced AKR technique HAKRv1 (C) was the best all-round performing technique
• HAKRv1 was particularly good for retrieving literature survey papers and papers from different sub-topics, while the basic AKRv1 technique (A) was good for retrieving recent papers
• The baseline CFPR technique (F) remained the best technique for retrieving popular papers
• The advantage of using weights was demonstrated
• The AKR technique's scalability was highlighted
24
Study II – User Study Evaluation Goals
1. Ascertain the agreement percentages of the evaluation measures for the three tasks and the overall system, and identify whether the values are above a preset threshold criterion of 75%
2. Test the hypothesis that students benefit more from the recommendation tasks/system than staff
3. Measure the correlations between the measures and build a regression model with 'agreeability on a good list' as the dependent variable
4. Track the change in user perceptions across the three tasks
5. Compare the pre-study and post-study variables to understand whether the target participants benefited from the tasks
6. Identify the most preferred and most critical aspects of the task recommendations and the system using the participants' subjective feedback
25
Study II - Details
• The Rec4LRW system was made available over the internet
• Participants were recruited with the intent of reaching a worldwide audience
• Only researchers with paper-authoring experience were recruited, through a pre-screening survey
• 230 researchers participated in the pre-screening survey
• 149 participants were deemed eligible and invited for the study
• Participants were provided with a user guide
• Participants were required to execute all three tasks
• Evaluation questionnaires were embedded in the screen of each task of the Rec4LRW system
26
Study II – Participant Demographics
Stage N
Task 1 132
Task 2 121
Task 3 119
Demographic Variable N
Position
Student 62 (47%)
Staff 70 (53%)
Experience Level
Beginner 15 (11.4%)
Intermediate 61 (46.2%)
Advanced 34 (25.8%)
Expert 22 (16.7%)
Discipline N
Computer Science & Information Systems 51 (38.6%)
Library and Information Studies 30 (22.7%)
Electrical & Electronic Engineering 30 (22.7%)
Communication & Media Studies 8 (6.1%)
Mechanical, Aeronautical & Manufacturing Engineering 5 (3.8%)
Biological Sciences 2 (1.5%)
Statistics & Operational Research 1 (0.8%)
Education 1 (0.8%)
Politics & International Studies 1 (0.8%)
Economics & Econometrics 1 (0.8%)
Civil & Structural Engineering 1 (0.8%)
Psychology 1 (0.8%)
Country N
Singapore 107 (81.1%)
India 4 (3%)
Malaysia 3 (2.3%)
Sri Lanka 3 (2.3%)
Pakistan 3 (2.3%)
Indonesia 2 (1.5%)
Germany 2 (1.5%)
Australia 1 (0.8%)
Iran 1 (0.8%)
Thailand 1 (0.8%)
China 1 (0.8%)
USA 1 (0.8%)
Canada 1 (0.8%)
Sweden 1 (0.8%)
Slovenia 1 (0.8%)
27
Study II – Task Evaluation Measures
Common Measures
• Relevance
• Usefulness
• Good_List
Tasks 1 and 2
• Good_Spread
• Diversity
• Interdisciplinarity
• Popularity
• Recency
• Good_Mix
• Familiarity
• Novelty
• Serendipity
• Expansion_Required
• User_Satisfaction
Task 2 specific
• Seedbasket_Similarity
• Shared_Corelations
• Seedbasket_Usefulness
Task 3 specific
• Importance
• Certainty
• Shortlisting_Feature
28
Subjective feedback questions:
1) From the displayed information, what features did you like the most?
2) Please provide your personal feedback about the execution of this task
Study II – System Evaluation Measures
Effort to use the System (EUS)
• Convenience
• Effort_Required
• Mouse_Clicks
• Little_Time
• Much_Time
Perceived Usefulness (PU)
• Productivity_Improvability
• Enhance_Effectiveness
• Ease_Job
• Work_Usefulness
Perceived System Effectiveness (PSE)
• Recommend
• Pleasant_Experience
• Useless
• Awareness
• Better_Choice
• Findability
• Accomplish_Tasks
• Performance_Improvability
29
Study II – Analysis Procedures
Quantitative Data
• Agreement Percentage (AP) calculated by only considering responses of 4 ('Agree') and 5 ('Strongly Agree') on the 5-point Likert scale
• Independent-samples t-tests for hypothesis testing
• Spearman coefficient for correlation measurement
• Multiple linear regression (MLR) used for the predictive models
– Paired-samples t-tests for model validation
Qualitative Data
• Descriptive coding method was used to code the participant feedback
• Two coders performed the coding in a sequential manner
Inter-coder agreement | Preferred Aspects (κ) | Critical Aspects (κ)
Task 1 | 0.918 | 0.727
Task 2 | 0.930 | 0.758
Task 3 | 0.877 | 0.902
30
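The Agreement Percentage definition above reduces to a one-liner; the response list is hypothetical:

```python
# Agreement Percentage (AP): share of responses that are 4 ('Agree')
# or 5 ('Strongly Agree') on the 5-point Likert scale.
# The response list is hypothetical illustration data.

def agreement_percentage(responses):
    agree = sum(1 for r in responses if r >= 4)
    return 100.0 * agree / len(responses)

responses = [5, 4, 3, 2, 4, 5, 1, 4, 3, 4]
print(agreement_percentage(responses))  # -> 60.0
```

An AP computed this way is then compared against the study's 75% threshold criterion.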
Study II – Results for Goals 1 & 2
31
Study II – Results for Goals 3 and 4
Predictors for “Good_List”
Task Independent Variables
Task 1 Recency, Novelty, Serendipity, Usefulness, User_Satisfaction
Task 2 Seedbasket_Similarity, Usefulness
Task 3 Relevance, Usefulness, Certainty
Transition of User Perception from Task 1 to 2
32
Study II – Results for Goal 5
[Figure: bar charts of response counts for Tasks 1, 2 and 3, comparing Need_Assistance (pre-study, scale Never / Rarely / Sometimes / Often / Always) against Good_List (post-study)]
33
Study II – Results for Goal 6
Top 5 Preferred Aspects
Rank | Task 1 (N=109) | Task 2 (N=100) | Task 3 (N=91)
1 | Information Cue Labels (41%) | Shared Co-citations & Co-references (28%) | Shortlisting Feature & Recommendation Quality (24%)
2 | Rich Metadata (21%) | Recommendation Quality (27%) | Information Cue Labels (15%)
3 | Diversity of Papers (13%) | Information Cue Labels (16%) | View Papers in Clusters (11%)
4 | Recommendation Quality (9%) | Seed Basket (14%) | Rich Metadata (7%)
5 | Recency of Papers (4%) | Rich Metadata (9%) | Ranking of Papers (3%)

Top 5 Critical Aspects
Rank | Task 1 (N=109) | Task 2 (N=100) | Task 3 (N=91)
1 | Broad topics not suitable (20%) | Quality can be improved (16%) | Rote selection of papers for task execution (16%)
2 | Limited dataset (7%) | Limited dataset (12%) | Limited dataset (5%)
3 | Quality can be improved (6%) | Recommendation algorithm could include more dimensions (7%) | Algorithm can be improved (5%)
4 | Different algorithm required (5%) | Speed can be improved (7%) | Not sure of the usefulness (4%)
5 | Free-text search required (4%) | Repeated recommendations from Task 1 (3%) | UI can be improved (3%)
34
Contributions and Implications
• The Rec4LRW system and its recommendations adequately satisfy the most affected user group – students
• Addresses the piecemeal scholarship on scientific paper recommender systems (SPRS)
• Proposes a bridge between task requirements and IR/RS algorithms
• The threefold intervention framework helps integrate research ideas from the UI, IR and RS research areas
35
Limitations
• Recommendation techniques do not cater to disciplinary differences (if any)
• Recommendations could be biased towards certain requirements of the three tasks
• Non-user-personalized techniques (not a serious issue)
• Evaluation study conducted with a limited set of research topics
36
SPRRF – Scientific Paper Retrieval and Recommender Framework
• Seven themes identified using the holistic coding method:
1. Distinct User Groups
2. Usefulness of Information Cue Labels
3. Forced Serendipity vs. Natural Serendipity
4. Learning Algorithms vs. Fixed-Logic Algorithms
5. Inclusion of Control Features in UI
6. Inclusion of Bibliometric Data
7. Diversification of Corpus
• SPRRF conceptualized as a mental model based on the themes
• The framework needs to be validated
37
Future Work
• Validation of the proposed SPRRF framework
• Longitudinal user evaluation studies
• Improvements in recommendation techniques
– Inclusion of more metrics
– More weights for customization
– Citation motivations
– Usage of open web standards
38
Publications
Journal Papers
1. Raamkumar, A. S., Foo, S., & Pang, N. (2016). Survey on inadequate and omitted citations in manuscripts: a precursory study in identification of
tasks for a literature review and manuscript writing assistive system. Information Research, 21(4).
2. Raamkumar, A. S., Foo, S., & Pang, N. (2017). Using author-specified keywords in building an initial reading list of research papers in scientific
paper retrieval and recommender systems. Information Processing & Management, 53(3), 577-594.
3. Sesagiri Raamkumar, A., Foo, S., Pang, N. (2017). Evaluating a threefold intervention framework for assisting researchers in literature review and
manuscript preparatory tasks. Journal of Documentation, 73(3), 555-580.
4. Sesagiri Raamkumar, A., Foo, S., Pang, N. (2017). User Evaluation of a Task for Shortlisting Papers from Researcher’s Reading List for Citing in
Manuscripts. Aslib Journal of Information Management, 69(6).
5. Sesagiri Raamkumar, A., Foo, S., Pang, N. (2017). Can I have more of these please? Assisting researchers in finding similar research papers from
a seed basket of papers. The Electronic Library. Manuscript recommended for publication.
Conference Papers
1. Sesagiri Raamkumar, A., Foo, S., & Pang, N. (2015). Rec4LRW-scientific paper recommender system for literature review and writing. Frontiers in
Artificial Intelligence and Applications (Vol. 275).
2. Raamkumar, A. S., Foo, S., & Pang, N. (2015). Comparison of techniques for measuring research coverage of scientific papers: A case study. In
Digital Information Management (ICDIM), 2015 Tenth International Conference on (pp. 132-137). IEEE.
3. Raamkumar, A. S., Foo, S., & Pang, N. (2015). More Than Just Black and White: A Case for Grey Literature References in Scientific Paper
Information Retrieval Systems. In International Conference on Asian Digital Libraries (pp. 252-257). Springer, Cham.
4. Sesagiri Raamkumar, A., Foo, S., & Pang, N. (2016). Making Literature Review and Manuscript Writing Tasks Easier for Novice Researchers
through Rec4LRW System. In Proceedings of the 16th ACM/IEEE-CS on Joint Conference on Digital Libraries (pp. 229-230). ACM.
5. Sesagiri Raamkumar, A., Foo, S., & Pang, N. (2016). What papers should I cite from my reading list? User evaluation of a manuscript preparatory
assistive task. In Proceedings of the Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital
Libraries (BIRNDL2016) (pp. 51–62).
39
THANK YOU
40

 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Introduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMIntroduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDM
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 

A task-based scientific paper recommender system for literature review and manuscript preparation

  • 6. Research Objectives • To identify an appropriate method to map the identified LR and MP tasks to relevant IR/RS algorithms – RQ1: What are the key search tasks of researchers in the literature review and publication lifecycle? – RQ2: How to relate the identified tasks of researchers to IR/RS algorithms? • To evaluate whether the performance of the proposed recommendation techniques for the tasks and the overall system were at the expected level – RQ3: Do the proposed recommendation techniques of the relevant tasks outperform the existing baseline approaches in system-based evaluation? – RQ4: Do the proposed recommendation techniques and the overall system meet the expected standards in user- based evaluation? Study I Study II Rec4LRW System 6
  • 7. Study I - Survey on Inadequate and Omitted Citations (IOC) in Manuscripts Authors Reviewers Problems in research quality Manuscripts with improper LR 7
  • 8. Study I - Survey on Inadequate and Omitted Citations (IOC) in Manuscripts Aims • What are the critical instances of IOC? • Do the critical instances and reasons of IOC in research manuscripts relate to the scenarios/tasks where researchers need external assistance in finding papers? • Identify the prominent information sources used by researchers • What is the researchers' awareness level of available recommendation services for research papers? 8
  • 9. Study I Details • Single-center data collection conducted for two months • Only researchers with paper authoring experience were recruited • 207 NTU researchers participated in the study; 71% of the participants answered from both reviewer and author perspectives • Survey questionnaire comprising 31 questions • Agreeability measured on a 5-point Likert scale • Data analyses through one-sample t-tests with a test value of either 2 or 3 9
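Study I's analyses rest on one-sample t-tests of Likert agreeability against a fixed test value. A minimal sketch of that computation (the responses below are hypothetical illustrations, not the survey data):

```python
import math
import statistics

def one_sample_t(responses, test_value):
    """t statistic for H0: population mean == test_value."""
    n = len(responses)
    mean = statistics.mean(responses)
    sd = statistics.stdev(responses)  # sample standard deviation (n - 1)
    return (mean - test_value) / (sd / math.sqrt(n))

# Hypothetical 5-point Likert responses for one survey item,
# tested against the scale midpoint of 3
likert = [4, 5, 3, 4, 4, 5, 2, 4, 3, 4]
print(round(one_sample_t(likert, 3), 2))
```

A positive t value here indicates mean agreement above the test value; significance would then be read off the t distribution with n - 1 degrees of freedom.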
  • 10. Study I - Results Instances of IOC • Authors' viewpoint – Missed citing seminal and topically-similar papers in journal manuscripts • Reviewers' viewpoint – Missed citing seminal and topically-similar papers in all manuscripts – Insufficient and irrelevant papers in the LR of all manuscripts Effects of IOC • Reviewers' viewpoint – Manuscripts are sent back for revision due to missing citations 10
  • 11. Study I - Results Need for External Assistance in Finding Papers • Authors required support in finding the following: 1. Interdisciplinary papers 2. Topically-similar papers 3. Seminal papers 4. Citations for placeholders in manuscripts 5. Necessary citations meant for inclusion in manuscripts Usage of Academic Information Sources • Researchers used the following sources, in order of usage: 1. Google Scholar 2. ScienceDirect 3. Web of Science 4. SpringerLink • 62% of the participants had never used SPRS services 11
  • 12. Study I – Key Findings • Researchers need help in finding interdisciplinary, topically-similar and seminal papers • Generating a reading list (seminal papers) and finding similar papers are two necessary LR search tasks for the proposed system • Shortlisting papers from the final reading list for inclusion in a manuscript was selected as the third task for the proposed system • Google Scholar's simplistic UI makes it the most used information source and an ideal reference for the UI design of a new assistive system 12
  • 13. Rec4LRW Design and Development Task Redesign Task Interconnectivity Informational Display Features 13
  • 14. Rec4LRW System Design - I Base Features • Plug-and-play concept • Each feature represents a different characteristic of a paper and its relations to references and citations • Grey Literature Percentage, Coverage, Textual Similarity and Specificity are novel features • New features can be added as required 14
  • 15. Rec4LRW System Design - II Task 1 - Building an Initial Reading List of Research Papers • Requirements: popular papers, recent papers, survey papers, diverse papers • Author-specified Keywords based Retrieval (AKR) Technique • Okapi BM25 similarity score used to retrieve the top 200 matching papers • Ranking problem: Composite Rank is a weighted mix of Coverage, Citation Count and Reference Count 15
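The Composite Rank above combines Coverage, Citation Count and Reference Count under configurable weights (the slides use WCO, WCC, WRC). A hedged sketch of such a weighted ranking; the min-max normalisation and the toy paper data are assumptions for illustration, not Rec4LRW's exact scheme:

```python
def normalise(values):
    """Min-max scale a feature to [0, 1] so weights are comparable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]

def composite_rank(papers, w_co=0.5, w_cc=0.25, w_rc=0.25):
    """Order papers by a weighted mix of coverage, citation and reference counts."""
    cov = normalise([p["coverage"] for p in papers])
    cc = normalise([p["citations"] for p in papers])
    rc = normalise([p["references"] for p in papers])
    scored = [
        (w_co * cov[i] + w_cc * cc[i] + w_rc * rc[i], p["id"])
        for i, p in enumerate(papers)
    ]
    return [pid for score, pid in sorted(scored, reverse=True)]

# Hypothetical candidate papers retrieved by BM25
papers = [
    {"id": "p1", "coverage": 0.9, "citations": 10, "references": 30},
    {"id": "p2", "coverage": 0.4, "citations": 120, "references": 15},
    {"id": "p3", "coverage": 0.7, "citations": 60, "references": 45},
]
print(composite_rank(papers))
```

Shifting weight from WCO toward WCC would push highly cited papers such as "p2" up the list, which is the customisation lever the pre-study techniques (slide 23) vary.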
  • 16. Rec4LRW System Design - II Task 2 - Finding Similar Papers based on a Set of Papers • An extended paper-discovery problem with multiple input papers • Integrated Discovery of Similar Papers (IDSP) Technique returns similar papers 16
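The slides do not spell out the IDSP scoring internals, so the following is only an illustrative sketch of the general idea of scoring candidates against a whole seed basket, here using Jaccard overlap of citing-paper sets as an assumed similarity signal (the real system uses item-based CF via Apache Mahout):

```python
def jaccard(a, b):
    """Set-overlap similarity between two groups of citing papers."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def similar_papers(seed_basket, cited_by, top_n=3):
    """Score each candidate by its summed similarity to every seed paper."""
    scores = {}
    for cand, citers in cited_by.items():
        if cand in seed_basket:
            continue  # never recommend a paper already in the basket
        scores[cand] = sum(jaccard(citers, cited_by.get(s, set()))
                           for s in seed_basket)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Toy data: paper id -> set of papers that cite it (hypothetical)
cited_by = {
    "seed1": {"a", "b", "c"},
    "seed2": {"b", "c", "d"},
    "cand1": {"a", "b", "c", "d"},  # shares citers with both seeds
    "cand2": {"x", "y"},            # unrelated
}
print(similar_papers({"seed1", "seed2"}, cited_by, top_n=2))
```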
  • 17. Rec4LRW System Design - II Task 3 - Shortlisting Articles from RL for Inclusion in Manuscript • A cluster detection problem over the final list of papers from LR • Citation Network based Shortlisting (CNS) Technique identifies unique and important papers 17
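Task 3 is framed as cluster detection: group the reading list by citation links, then keep representative papers. The real system uses a community detection algorithm from JUNG (slide 20); the sketch below simplifies that to connected components via union-find, with a hypothetical importance score picking each cluster's representative:

```python
def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def shortlist(papers, citation_links, importance):
    """Cluster papers connected by citation links; keep the most
    important paper from each cluster as the shortlist."""
    parent = {p: p for p in papers}
    for a, b in citation_links:
        parent[find(parent, a)] = find(parent, b)
    clusters = {}
    for p in papers:
        clusters.setdefault(find(parent, p), []).append(p)
    return sorted(max(group, key=importance.get)
                  for group in clusters.values())

# Toy reading list: two citation clusters (hypothetical data)
papers = ["p1", "p2", "p3", "p4", "p5"]
links = [("p1", "p2"), ("p2", "p3"), ("p4", "p5")]
importance = {"p1": 5, "p2": 9, "p3": 1, "p4": 2, "p5": 7}
print(shortlist(papers, links, importance))
```

Connected components are a crude stand-in for modularity-based communities, but the output has the same shape: one "unique and important" paper per cluster.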
  • 18. Rec4LRW System Design - III Task Screens Task 1 Task 2 Information cue labels Seed Basket (SB) 18
  • 19. Rec4LRW System Design - III Task Screens Task 2 Task 3 Shared Co-relations Reading List (RL) 19
  • 20. Rec4LRW System Design - III Task Screens Task 3 • Front end: PHP, HTML, CSS, JavaScript • Backend: MySQL • Processing layer: Java • Java libraries: Apache Lucene (for BM25), Apache Mahout (for IBCF), JUNG (for community detection algorithm) • Cluster viewing option 20
  • 21. Study II – Rec4LRW Evaluation 21
  • 22. Study II - Dataset • XML files provided by ACM • Papers published in the period 1951 to 2011 • Total of 103,739 articles and their 2,320,345 references • Data was cleaned and transformed in MySQL • References were parsed using the AnyStyle parser • All seven base features were precomputed before Study II 22
  • 23. Study II – Pre-study Evaluated Techniques
    A (AKRv1): basic AKR technique with weights WCC = 0.25, WRC = 0.25, WCO = 0.5
    B (AKRv2): basic AKR technique with weights WCC = 0.1, WRC = 0.1, WCO = 0.8
    C (HAKRv1): HITS-enhanced AKR technique with weights WCC = 0.25, WRC = 0.25, WCO = 0.5
    D (HAKRv2): HITS-enhanced AKR technique with weights WCC = 0.1, WRC = 0.1, WCO = 0.8
    E (CFHITS): IBCF technique boosted with HITS
    F (CFPR): IBCF technique boosted with PageRank
    G (PR): PageRank technique
    Experiment Setup • A total of 186 author-specified keywords from the ACM DL dataset were used as seed research topics • The experiment was performed in three sequential steps: 1. Top 200 papers were retrieved using the BM25 similarity algorithm 2. Top 20 papers were identified using the specific ranking schemes of the seven techniques 3. The evaluation metrics were measured for the seven techniques
    Evaluation Approach • The numbers of Recent (R1), Popular (R2), Survey (R3) and Diverse (R4) papers were enumerated for each of the 186 topics and seven techniques • Ranks were assigned to each technique based on the highest counts in each recommendation list • The RankAggreg library was used to perform rank aggregation 23
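The per-topic rankings of the seven techniques are combined with the RankAggreg library, which uses optimisation-based aggregation. A much simpler Borda-count sketch conveys the idea of turning many ranked lists into one consensus order (this is an illustrative substitute, not RankAggreg's method, and the rankings below are hypothetical):

```python
def borda_aggregate(rankings):
    """Combine ranked lists with Borda counts: a technique earns more
    points the higher it is placed in each individual list."""
    n = len(rankings[0])
    points = {}
    for ranking in rankings:
        for position, tech in enumerate(ranking):
            points[tech] = points.get(tech, 0) + (n - position)
    return sorted(points, key=points.get, reverse=True)

# Hypothetical per-topic rankings of three techniques
rankings = [
    ["A", "B", "C"],
    ["A", "C", "B"],
    ["B", "A", "C"],
]
print(borda_aggregate(rankings))
```

In the actual pre-study each "list" is one of the 186 topics' per-requirement rankings of techniques A–G, and the aggregated order is reported per paper type (slide 24).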
  • 24. Study II – Part I (Pre-study) Results
    Optimal aggregated ranks (positions 1–7) and minimum objective function scores per paper type:
    Recent Papers (R1): B, A, C, D, E, F, G (10.66)
    Popular Papers (R2): F, E, C, D, G, A, B (11.89)
    Literature Survey Papers (R3): C, G, D, A, E, F, B (13.38)
    Diverse Papers (R4): C, D, G, A, B, F, E (12.15)
    • The HITS-enhanced version of the AKR technique, HAKRv1 (C), was the best all-round performing technique • HAKRv1 was particularly good for retrieving literature survey papers and papers from different sub-topics, while the basic AKRv1 technique (A) was good for retrieving recent papers • The baseline CFPR technique (F) remained the best technique for retrieving popular papers • The advantage of using weights was shown • The AKR technique's scalability was highlighted 24
  • 25. Study II – User Study Evaluation Goals 1. Ascertain the agreement percentages of the evaluation measures for the three tasks and the overall system, and identify whether the values are above a preset threshold criterion of 75% 2. Test the hypothesis that students benefit more from the recommendation tasks/system than staff 3. Measure the correlation between the measures and build a regression model with 'agreeability on a good list' as the dependent variable 4. Track the change in user perceptions across the three tasks 5. Compare the pre-study and post-study variables to understand whether the target participants benefited from the tasks 6. Identify the most preferred and most criticized aspects of the task recommendations and the system using the subjective feedback of the participants 25
  • 26. Study II - Details • Rec4LRW system was made available over the internet • Participants were recruited with the intent of reaching a worldwide audience • Only researchers with paper authoring experience were recruited, through a pre-screening survey • 230 researchers participated in the pre-screening survey • 149 participants were deemed eligible and invited for the study • Participants were provided with a user guide • Participants were required to execute all three tasks • Evaluation questionnaires were embedded in each task screen of the Rec4LRW system 26
  • 27. Study II – Participant Demographics
    Participants per stage: Task 1: 132; Task 2: 121; Task 3: 119
    Position: Student 62 (47%), Staff 70 (53%)
    Experience Level: Beginner 15 (11.4%), Intermediate 61 (46.2%), Advanced 34 (25.8%), Expert 22 (16.7%)
    Discipline: Computer Science & Information Systems 51 (38.6%), Library and Information Studies 30 (22.7%), Electrical & Electronic Engineering 30 (22.7%), Communication & Media Studies 8 (6.1%), Mechanical, Aeronautical & Manufacturing Engineering 5 (3.8%), Biological Sciences 2 (1.5%), and one participant each (0.8%) from Statistics & Operational Research, Education, Politics & International Studies, Economics & Econometrics, Civil & Structural Engineering, and Psychology
    Country: Singapore 107 (81.1%), India 4 (3%), Malaysia 3 (2.3%), Sri Lanka 3 (2.3%), Pakistan 3 (2.3%), Indonesia 2 (1.5%), Germany 2 (1.5%), and one participant each (0.8%) from Australia, Iran, Thailand, China, USA, Canada, Sweden, and Slovenia 27
  • 28. Study II – Task Evaluation Measures Common Measures • Relevance • Usefulness • Good_List Tasks 1 and 2 • Good_Spread • Diversity • Interdisciplinarity • Popularity • Recency • Good_Mix • Familiarity • Novelty • Serendipity • Expansion_Required • User_Satisfaction Task 2 specific • Seedbasket_Similarity • Shared_Corelations • Seedbasket_Usefulness Task 3 specific • Importance • Certainty • Shortlisting_Feature 28 1) From the displayed information, what features did you like the most? 2) Please provide your personal feedback about the execution of this task
  • 29. Study II – System Evaluation Measures Effort to use the System (EUS) • Convenience • Effort_Required • Mouse_Clicks • Little_Time • Much_Time Perceived Usefulness (PU) • Productivity_Improvability • Enhance_Effectiveness • Ease_Job • Work_Usefulness Perceived System Effectiveness (PSE) • Recommend • Pleasant_Experience • Useless • Awareness • Better_Choice • Findability • Accomplish_Tasks • Performance_Improvability 29
  • 30. Study II – Analysis Procedures Quantitative Data • Agreement Percentage (AP) calculated by counting only responses of 4 ('Agree') and 5 ('Strongly Agree') on the 5-point Likert scale • Independent-samples t-test for hypothesis testing • Spearman coefficient for correlation measurement • Multiple linear regression (MLR) used for the predictive models – paired-samples t-test for model validation Qualitative Data • Descriptive coding method was used to code the participant feedback • Two coders performed the coding in a sequential manner • Inter-coder reliability (Cohen's κ): Task 1 – preferred aspects 0.918, critical aspects 0.727; Task 2 – 0.930, 0.758; Task 3 – 0.877, 0.902 30
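The Agreement Percentage defined above is a simple proportion over Likert responses. A minimal sketch (the response lists are hypothetical, not the study's data):

```python
def agreement_percentage(responses):
    """Share of 5-point Likert responses that are 4 ('Agree')
    or 5 ('Strongly Agree')."""
    agreeing = sum(1 for r in responses if r >= 4)
    return 100.0 * agreeing / len(responses)

# Hypothetical responses for one evaluation measure
print(agreement_percentage([5, 4, 4, 3, 2, 5, 4, 1, 4, 5]))
```

Goal 1 of the user study then compares each measure's AP against the preset 75% threshold.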
  • 31. Study II – Results for Goals 1 & 2 31
  • 32. Study II – Results for Goals 3 and 4 Predictors for “Good_List” Task Independent Variables Task 1 Recency, Novelty, Serendipity, Usefulness, User_Satisfaction Task 2 Seedbasket_Similarity, Usefulness Task 3 Relevance, Usefulness, Certainty Transition of User Perception from Task 1 to 2 32
  • 33. Study II – Results for Goal 5 Need_Assistance (pre-study) vs. Good_List (post-study) [Bar charts of response counts for Tasks 1–3, cross-tabulating pre-study Need_Assistance frequency (Never/Rarely/Sometimes/Often/Always) against post-study Good_List ratings (1–5)] 33
  • 34. Study II – Results for Goal 6
    Top 5 Preferred Aspects
    Task 1 (N=109): 1. Information Cue Labels (41%) 2. Rich Metadata (21%) 3. Diversity of Papers (13%) 4. Recommendation Quality (9%) 5. Recency of Papers (4%)
    Task 2 (N=100): 1. Shared Co-citations & Co-references (28%) 2. Recommendation Quality (27%) 3. Information Cue Labels (16%) 4. Seed Basket (14%) 5. Rich Metadata (9%)
    Task 3 (N=91): 1. Shortlisting Feature & Recommendation Quality (24%) 2. Information Cue Labels (15%) 3. View Papers in Clusters (11%) 4. Rich Metadata (7%) 5. Ranking of Papers (3%)
    Top 5 Critical Aspects
    Task 1 (N=109): 1. Broad topics not suitable (20%) 2. Limited dataset (7%) 3. Quality can be improved (6%) 4. Different algorithm required (5%) 5. Free-text search required (4%)
    Task 2 (N=100): 1. Quality can be improved (16%) 2. Limited dataset (12%) 3. Recommendation algorithm could include more dimensions (7%) 4. Speed can be improved (7%) 5. Repeated recommendations from Task 1 (3%)
    Task 3 (N=91): 1. Rote selection of papers for task execution (16%) 2. Limited dataset (5%) 3. Algorithm can be improved (5%) 4. Not sure of the usefulness (4%) 5. UI can be improved (3%) 34
  • 35. Contributions and Implications • The Rec4LRW system and its recommendations adequately satisfy the most affected user group – Students • Addresses the piecemeal scholarship on scientific paper recommender systems (SPRS) • Proposes bridge between task requirements and IR/RS algorithms • The threefold intervention framework helps in integrating research ideas from UI, IR and RS research areas 35
  • 36. Limitations • Recommendation techniques do not cater to disciplinary differences (if any) • Recommendations could be biased to certain requirements of the three tasks • Non-user personalized techniques (not a serious issue) • Evaluation study conducted with a limited set of research topics 36
  • 37. Scientific Paper Retrieval and Recommender Framework (SPRRF) Distinct User Groups Usefulness of Information Cue Labels Forced Serendipity vs. Natural Serendipity Learning Algorithms vs. Fixed-Logic Algorithms Inclusion of Control Features in UI Inclusion of Bibliometric Data Diversification of Corpus • Seven themes identified using the holistic coding method • SPRRF conceptualized as a mental model based on the themes • The framework needs to be validated 37
  • 38. Future Work • Validation of the proposed SPRRF framework • Longitudinal user evaluation studies • Improvements in recommendation techniques – Inclusion of more metrics – More weights for customization – Citation motivations – Usage of open web standards 38
  • 39. Publications Journal Papers 1. Raamkumar, A. S., Foo, S., & Pang, N. (2016). Survey on inadequate and omitted citations in manuscripts: a precursory study in identification of tasks for a literature review and manuscript writing assistive system. Information Research, 21(4). 2. Raamkumar, A. S., Foo, S., & Pang, N. (2017). Using author-specified keywords in building an initial reading list of research papers in scientific paper retrieval and recommender systems. Information Processing & Management, 53(3), 577-594. 3. Sesagiri Raamkumar, A., Foo, S., & Pang, N. (2017). Evaluating a threefold intervention framework for assisting researchers in literature review and manuscript preparatory tasks. Journal of Documentation, 73(3), 555-580. 4. Sesagiri Raamkumar, A., Foo, S., & Pang, N. (2017). User Evaluation of a Task for Shortlisting Papers from Researcher's Reading List for Citing in Manuscripts. Aslib Journal of Information Management, 69(6). 5. Sesagiri Raamkumar, A., Foo, S., & Pang, N. (2017). Can I have more of these please? Assisting researchers in finding similar research papers from a seed basket of papers. The Electronic Library. Manuscript recommended for publication. Conference Papers 1. Sesagiri Raamkumar, A., Foo, S., & Pang, N. (2015). Rec4LRW - scientific paper recommender system for literature review and writing. Frontiers in Artificial Intelligence and Applications (Vol. 275). 2. Raamkumar, A. S., Foo, S., & Pang, N. (2015). Comparison of techniques for measuring research coverage of scientific papers: A case study. In Digital Information Management (ICDIM), 2015 Tenth International Conference on (pp. 132-137). IEEE. 3. Raamkumar, A. S., Foo, S., & Pang, N. (2015). More Than Just Black and White: A Case for Grey Literature References in Scientific Paper Information Retrieval Systems. In International Conference on Asian Digital Libraries (pp. 252-257). Springer, Cham. 4. Sesagiri Raamkumar, A., Foo, S., & Pang, N. (2016). Making Literature Review and Manuscript Writing Tasks Easier for Novice Researchers through Rec4LRW System. In Proceedings of the 16th ACM/IEEE-CS Joint Conference on Digital Libraries (pp. 229-230). ACM. 5. Sesagiri Raamkumar, A., Foo, S., & Pang, N. (2016). What papers should I cite from my reading list? User evaluation of a manuscript preparatory assistive task. In Proceedings of the Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL 2016) (pp. 51-62). 39
