Assessing a human mediated
current awareness service
International Symposium of Information Science (ISI 2015)
Zadar, 2015-05-20
Zeljko Carevic1, Thomas Krichel2 and Philipp Mayr1
1firstname.lastname@gesis.org
2lastname@openlib.org
Outline
1. Introduction
2. RePEc and NEP
3. Results
3.1 Editing time
3.2 Indicators for report success
3.3 Editing effort
4. Conclusion and Outlook
Slide 2 / 31
Motivation
• Thomas Krichel, the founder of
RePEc, visited GESIS – Cologne
in Oct. 2014
• Sharing his Russian souvenir
• ~100 GB of XML log files
Slide 3 / 31
1. Introduction
• Current awareness in digital libraries
– To inform users / subscribers about new / relevant
acquisitions in their libraries [1].
• Current awareness services allow subscribers to keep up to
date with new additions in a certain area of research.
• Selection of relevant documents can be done (semi-)automatically or manually.
• In this work we focus on the intellectual (manual) editing process.
• Aim of this work:
How do editors work when creating a subject
specific report in Digital Libraries (DL)?
Slide 4 / 31
2. Use case: RePEc
• RePEc (Research Papers in Economics)
is a DL for working papers in economics
research.
• Covers metadata for working papers and
journal articles.
• Usually document metadata contains links
to full texts
Slide 5 / 31
2. RePEc statistics
[Figure: number of documents per year, 1996 to 2016]
Contributing archives: ~1,700
Documents: 1.77 million
Full-text documents: 1.63 million
Registered authors: ~45,000
Abstract views (April 2015): >2 million
Slide 6 / 31
2. Current awareness service NEP
• NEP (New Economics Papers) is a current awareness service for new additions to RePEc.
• NEP publishes subject-specific reports in over 90 fields, for example:
– Business, Economic and Financial History
– Public Economics
– Social Norms and Social Capital
• Issues are sent to subscribers via e-mail, RSS and Twitter.
• Reports on new additions are compiled by subject-specific editors.
• The selection of relevant documents is done manually by the editor!
Slide 7 / 31
[Diagram: the editors of the subject reports (nep-acc, nep-afr, nep-upt, nep-ure) each select documents from nep-all and send an issue]
Nep-all
• Contains all new RePEc docs
• Created roughly on a weekly basis
• Contains on avg. 488 docs
Manual selection of relevant documents is a time-consuming task.
Slide 8 / 31
ERNAD
• ERNAD (Editing Reports on New Academic Documents) is a purpose-built system
• It re-ranks nep-all for each editor based on the specific report topic
• It looks at past issues of a report to produce a ranked nep-all
• If presorting works well, editors select highly ranked documents from nep-all
Slide 9 / 31
ERNAD example for Nep-Africa (NEP-AFR)
Nep-all unsorted:
1. Tax compliance..
2. Mental accounting..
…
212. Ethnic ..in Africa
317. Sino-African relations:
Nep-all presorted:
1. Ethnic ..in Africa
2. Sino-African relations:
…
50. Tax compliance..
51. Mental accounting..
Slide 10 / 31
Editing stages
Slide 11 / 31
Research questions
• RQ 1: How long is the editing duration?
• RQ 2: What influences the success of a report?
– Editing duration
– Issue size
• RQ 3: How much effort is invested for selecting
and sorting papers per issue?
– Precision @ N
– Relative search length
Slide 12 / 31
RQ 1: Editing time
How much time do editors invest to
create a report?
Slide 13 / 31
Pre-selection
• Editing an issue can be interrupted
• Interruptions would distort the measured editing time
• Exclude interrupted issues by grouping the edit durations into 3-minute chunks
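The chunking step can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes that edit durations are available in minutes, that a chunk is labelled by its upper edge (3, 6, ..., 90, plus a ">90" bucket), and that issues at or above 90 minutes are treated as interrupted. The function names are hypothetical.

```python
import math
from collections import Counter

CHUNK_MINUTES = 3    # chunk width named on the slide
LIMIT_MINUTES = 90   # cut-off named on the slide ("edit time < 90 min")


def chunk_label(duration_min):
    """Return the label of the 3-minute chunk a duration falls into."""
    if duration_min > LIMIT_MINUTES:
        return ">90"
    # Assumption: a chunk is labelled by its upper edge, so (0, 3] -> "3".
    upper = math.ceil(duration_min / CHUNK_MINUTES) * CHUNK_MINUTES
    return str(max(CHUNK_MINUTES, upper))


def chunk_histogram(durations_min):
    """Count issues per 3-minute chunk."""
    return Counter(chunk_label(d) for d in durations_min)


def keep_uninterrupted(durations_min):
    """Keep only issues whose edit duration is below the 90-minute limit."""
    return [d for d in durations_min if d < LIMIT_MINUTES]
```

The histogram helps to pick a plausible cut-off; the filter then applies it before any averages are computed.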
Slide 14 / 31
Pre-selection
[Figure: number of issues per 3-minute chunk of edit duration (3 to 90 minutes, plus >90)]
Limit edit time < 90 min
Slide 15 / 31
RQ 1: Editing time
[Figure: average editing time in minutes per report]
Avg. 15.5 minutes (sd = 10.1)
Min. 2.5 minutes: NEP-RES (Resource economics)
Max. 53 minutes: NEP-ETS (Economic time series)
Slide 16 / 31
Summary of RQ 1
• Average editing time is comparatively low at 15.5 minutes
• Large variation between the reports:
– Min. 2.5 minutes
– Max. 53 minutes
Slide 17 / 31
RQ 2: Influences on report success
• The popularity of a report can be measured by its number of subscribers.
• The number of subscribers varies widely between reports:
– Max. 6859: NEP-HIS (Business, Economic and Financial History)
– Min. 75: NEP-CIS (Confederation of Independent States)
• Factors that may influence a report's success include, for example, its topic and its age.
• Does the issue size or the editing time influence the report's success?
Slide 18 / 31
Editing time
[Figure: number of subscribers against average editing time per report, with the averages marked]
Education: 2198 subscribers (avg. 836)
Project, Program and Portfolio Management: 43.5 min (avg. 15.5)
Slide 19 / 31
Issue size
[Figure: number of subscribers against average issue size per report, with the averages marked]
Sports: issue size 2.5 (avg. 12.4)
Demographic Economics: issue size 21 (avg. 12.4)
Slide 20 / 31
Summary of RQ 2
• There is no correlation between:
– Issue size and number of subscribers
– Editing time and number of subscribers
• We assume that the success of a report is mainly driven by its topic and age.
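The slides do not say which correlation measure was used. As one possible way to check such a relationship, the sketch below computes Pearson's r for two equally long lists, for example the average issue size and the number of subscribers per report. The choice of Pearson's r and the function name are assumptions made for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

A value close to 0 would be in line with the "no correlation" finding; values close to 1 or -1 would indicate a strong linear relationship.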
Slide 21 / 31
RQ 3: Effort in selecting and
sorting
How much effort is invested in selecting and
sorting relevant documents from nep-all?
Two measures are used:
Precision @N
Relative search length
Slide 22 / 31
Precision @ N
• How many of the top N documents from the pre-sorted
nep-all are selected for the issue?
• N set to: 5, 10, 15, 20
• We only consider issues where issue size > N
• A document is relevant if its index position in nep-all
is < N.
Slide 23 / 31
Example: P@5
• M={(D1, 4), (D2, 1), (D3, 7), (D4, 3), (D5, 9)}
• D1, D2 and D4 come from the top 5 of nep-all
• P@5 for issue I in report J = 3/5
• Editors vary between using pre-sorted and
un-sorted nep-all. Therefore:
– Only consider issues with pre-sort usage > 50
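The P@5 example can be reproduced with a short sketch. It assumes that an issue is given as the list of nep-all index positions of its selected documents, and it applies the rule from the slides literally (a document counts if its position is < N). Whether positions are 0-based or 1-based is not stated in the slides, and the function name is hypothetical.

```python
def precision_at_n(selected_positions, n):
    """Share of the top-n slots of pre-sorted nep-all that the editor selected.

    selected_positions: nep-all index positions of the documents in the issue.
    A document counts as a hit if its position is < n (rule from the slides).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    hits = sum(1 for position in selected_positions if position < n)
    return hits / n


# Positions taken from the example M above: D1..D5.
example_positions = [4, 1, 7, 3, 9]
print(precision_at_n(example_positions, 5))  # 0.6, i.e. 3/5
```

Averaging this value over all qualifying issues of a report gives a per-report P@N.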
Slide 24 / 31
Results for P@N
Avg. P@5 (82 rep): 0.77
Avg. P@10 (64 rep): 0.80
Avg. P@15 (50 rep): 0.80
Avg. P@20 (31 rep): 0.82
• Max. found for nep-env (Environmental
Economics) with P@5 = 0.99
• Min. found for nep-cba (Central Bank) with
P@5 = 0.35
Slide 25 / 31
Summary of P@N
• Editors work comfortably with the
presorting in nep-all.
• The number of papers per issue has no
significant influence on the precision.
Slide 26 / 31
Relative Search Length
• We know how many of the top N documents from nep-all are selected.
• To what depth do editors inspect nep-all?
• Ratio between the index position of the last relevant document in nep-all (the highest index position, hin) and the length of nep-all
Slide 27 / 31
Example RSL
• Editor is given a nep-all containing 300
documents.
• M={(D1, 4), (D2, 10), (D3, 7)}
• RSL = 10/300
• We assume that the editor has inspected
nep-all to document 10.
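The RSL example can be written as a small sketch. It assumes that an issue is given as the list of nep-all index positions of its selected documents, together with the length of nep-all; the function name is hypothetical.

```python
def relative_search_length(selected_positions, nep_all_length):
    """Highest selected index position divided by the length of nep-all."""
    if not selected_positions:
        raise ValueError("issue contains no selected documents")
    if nep_all_length <= 0:
        raise ValueError("nep-all length must be positive")
    deepest = max(selected_positions)  # position of the last relevant document
    if deepest > nep_all_length:
        raise ValueError("selected position lies outside nep-all")
    return deepest / nep_all_length


# Values from the example above: positions 4, 10 and 7 in a nep-all of 300.
print(relative_search_length([4, 10, 7], 300))  # 10/300, roughly 0.033
```

A small value means the editor found all selected documents near the top of nep-all.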
Slide 28 / 31
Relative Search Length
[Figure: average RSL per report]
NEP-MAC (Macroeconomics): RSL = 0.35
NEP-SPO (Sports and Economics): RSL = 0.01
Avg. RSL = 0.08
Slide 29 / 31
Summary of RSL
• The relative search length is comparatively
low at 0.08
• Editors select papers from the very upper
part of nep-all.
Slide 30 / 31
Conclusion
• Focused on observable system features
– Editing time
– Influences on report success
– Effort in creating an issue
• Summary: The system supports the editor well in creating
an issue.
• A complete view requires a more user-centred observation.
• Future work:
– Why and under what conditions is a document relevant?
• NEP provides many opportunities for further research on data
that is relatively easily available.
Slide 31 / 31
Thank you!
Questions?

Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 

Recently uploaded (20)

Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 

Assessing a human mediated current awareness service

  • 1. Assessing a human mediated current awareness service. International Symposium of Information Science (ISI 2015), Zadar, 2015-05-20. Zeljko Carevic (1), Thomas Krichel (2) and Philipp Mayr (1). (1) firstname.lastname@gesis.org, (2) lastname@openlib.org
  • 2. Outline 1. Introduction 2. RePEc and NEP 3. Results 3.1 Editing time 3.2 Indicators for report success 3.3 Editing effort 4. Conclusion and Outlook
  • 3. Motivation • Thomas Krichel, the founder of RePEc, visited GESIS – Cologne in Oct. 2014 • Sharing his Russian souvenir • ~100 GB of XML log files
  • 4. 1. Introduction • Current awareness in digital libraries – To inform users / subscribers about new / relevant acquisitions in their libraries [1]. • Current awareness services allow subscribers to keep up to date with new additions in a certain area of research. • Selection of relevant documents can be done (semi-)automatically or manually. • This work focuses on the intellectual editing process. • Aim of this work: How do editors work when creating a subject-specific report in Digital Libraries (DL)?
  • 5. 2. Use case: RePEc • RePEc (Research Papers in Economics) is a DL for working papers in economics research. • Covers metadata for working papers and journal articles. • Usually document metadata contains links to full texts
  • 6. 2. RePEc statistics [Chart: number of documents by year, 1996 to 2016] • Contr. archives: ~1,700 • Documents: 1.77 million • Full-text documents: 1.63 million • Registered authors: ~45,000 • Abstract views (April 2015): >2 million
  • 7. 2. Current awareness service NEP • NEP (New Economics Papers) is a current awareness service for new additions in RePEc. • NEP covers subject-specific reports from over 90 fields. – Business, Economic and Financial History – Public Economics – Social Norms and Social Capital • Issues are sent to subscribers via email, RSS and Twitter • Reports on new additions are generated by subject-specific editors. • Relevant document selection is done manually by the editor!
  • 8. [Diagram: the editors of the subject reports nep-acc, nep-afr, nep-upt and nep-ure each select documents from nep-all and send an issue] • Nep-all contains all new RePEc documents • Created roughly on a weekly basis • Contains on average 488 documents • Manual selection of relevant documents is a time-consuming task.
  • 9. ERNAD • ERNAD (Editing Reports on New Academic Documents) is a purpose-built system • Re-ranks nep-all for each editor based on the specific report topic • Uses past issues of a report to produce a ranked nep-all • If presorting works well, editors select highly ranked documents from nep-all
  • 10. ERNAD example for Nep-Africa (NEP-AFR) • Nep-all unsorted: 1. Tax compliance.. 2. Mental accounting.. … 212. Ethnic ..in Africa 317. Sino-African relations: • Nep-all presorted: 1. Ethnic ..in Africa 2. Sino-African relations: … 50. Tax compliance.. 51. Mental accounting..
  • 12. Research questions • RQ 1: How long is the editing duration? • RQ 2: What influences the success of a report? – Editing duration – Issue size • RQ 3: How much effort is invested in selecting and sorting papers per issue? – Precision @ N – Relative search length
  • 13. RQ 1: Editing time. How much time do editors invest to create a report?
  • 14. Pre-selection • Editing an issue can be interrupted • Interruptions would distort the measured editing time • Exclude interrupted issues by splitting the edit duration into 3-minute chunks
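One possible reading of the chunking rule, sketched in Python under stated assumptions: the logs yield a timestamp (in seconds) for each editor action within an issue, and an issue counts as interrupted when at least one 3-minute chunk between the first and last action contains no action. The function name `is_interrupted` and the input format are hypothetical; the slides do not specify the exact exclusion criterion.

```python
CHUNK_SECONDS = 3 * 60  # chunk length taken from the slide: 3 minutes


def is_interrupted(action_times, chunk_seconds=CHUNK_SECONDS):
    """Return True if some chunk between the first and last action is empty.

    action_times: timestamps in seconds of logged editor actions
    (hypothetical input format, not taken from the RePEc logs).
    """
    if len(action_times) < 2:
        return False
    start = min(action_times)
    end = max(action_times)
    # Number of chunks needed to cover the whole editing span.
    n_chunks = int((end - start) // chunk_seconds) + 1
    # Indices of the chunks that contain at least one action.
    active = {int((t - start) // chunk_seconds) for t in action_times}
    return len(active) < n_chunks


# Actions spread over about six minutes with no empty chunk.
print(is_interrupted([0, 60, 170, 200, 350]))
# A 14-minute gap between the second and third action.
print(is_interrupted([0, 60, 900]))
```

Issues flagged this way would be dropped before averaging the editing time.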
  • 17. Summary RQ 1 • Average editing time is comparatively low at 15.5 minutes • Large variation between the reports: – Min. 2.5 minutes – Max. 53 minutes
  • 18. RQ 2: Influences on successful reports • The popularity of a report can be measured by its number of subscribers. • Large variation in the number of subscribers per report – Max. 6859: NEP-HIS (Business, Economic and Financial History) – Min. 75: NEP-CIS (Confederation of Independent States) • Factors influencing a report's success include, for example, its topic and age. • Does the issue size or the editing time influence the report's success?
  • 19. Editing time [Plot: number of subscribers (0 to 7000) against average editing time (0 to 60); legend: avg. edit time, avg. number of subscribers] • Education: 2198 subscribers (avg. 836) • Project, Program and Portfolio Management: 43.5 min editing time (avg. 15.5)
  • 20. Issue size [Plot: number of subscribers (0 to 7000) against average issue size (0 to 60); legend: avg. issue size, avg. number of subscribers] • Sports: issue size 2.5 (avg. 12.4) • Demographic Economics: issue size 21 (avg. 12.4)
  • 21. Summary RQ 2 • There is no correlation between: – Issue size and number of subscribers – Editing time and number of subscribers • We assume that the success of a report is mainly driven by topic and age.
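A check of this kind can be sketched with a correlation coefficient. The slides do not say which statistic was used; Pearson's r is assumed here purely for illustration, and the function name and the toy input lists are hypothetical, not NEP data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long number lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Toy values only: one perfectly linear pair and one uncorrelated pair.
print(pearson_r([1, 2, 3], [2, 4, 6]))
print(pearson_r([1, 2, 3, 4], [1, -1, -1, 1]))
```

Applied per report, values of r close to 0 for (issue size, subscribers) and for (editing time, subscribers) would be consistent with the summary above.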
  • 22. RQ 3: Effort in selecting and sorting. How much effort is invested in selecting and sorting relevant documents from nep-all? Two measures are used: Precision @ N and relative search length
  • 23. Precision @ N • How many of the top N documents from pre-sorted nep-all are selected for the issue? • N is set to 5, 10, 15 and 20 • Only issues with issue size > N are considered • A document counts as relevant if its index position in nep-all is < N.
  • 24. Example: P@5 • M = {(D1, 4), (D2, 1), (D3, 7), (D4, 3), (D5, 9)}, where each pair gives a selected document and its index position in nep-all • P@5 for issue I in report J = ⅗ • Editors vary between using pre-sorted and unsorted nep-all. Therefore: – Only consider issues with pre-sort usage > 50
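The P@5 example can be reproduced with a few lines of Python. This is a minimal sketch, assuming each pair in M is (selected document, index position in pre-sorted nep-all), that the denominator is N, and that the slide's rule "position < N" is applied literally; the function name `precision_at_n` is hypothetical.

```python
def precision_at_n(selected_positions, n):
    """Share of the top-n slots of pre-sorted nep-all chosen by the editor.

    selected_positions: nep-all index positions of the documents in the issue.
    A document counts as a hit if its position is < n (rule from the slide).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    hits = sum(1 for pos in selected_positions if pos < n)
    return hits / n


# Issue from the slide: documents D1..D5 with their nep-all positions.
issue = {"D1": 4, "D2": 1, "D3": 7, "D4": 3, "D5": 9}
print(precision_at_n(issue.values(), 5))  # 0.6, i.e. 3/5
```

D1, D2 and D4 sit at positions below 5, so three of the five top slots are hits.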
  • 25. Results for P@N • Avg. P@5 (82 reports): 0.77 • Avg. P@10 (64 reports): 0.80 • Avg. P@15 (50 reports): 0.80 • Avg. P@20 (31 reports): 0.82 • Max. found for nep-env (Environmental Economics) with P@5 = 0.99 • Min. found for nep-cba (Central Banking) with P@5 = 0.35
  • 26. Summary P@N • Editors work comfortably with the presorting in nep-all. • The number of papers per issue has no significant influence on the precision.
  • 27. Relative Search Length • We know how many of the top N documents from nep-all were selected. • To what depth do editors inspect nep-all? • Ratio between the highest index position (hin) among the relevant documents, i.e. the position of the last relevant document in nep-all, and the length of nep-all
  • 28. Example RSL • The editor is given a nep-all containing 300 documents. • M = {(D1, 4), (D2, 10), (D3, 7)} • RSL = 10/300 • We assume that the editor has inspected nep-all up to document 10.
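The RSL example can be sketched the same way, assuming each pair in M is (selected document, index position in nep-all). The function name `relative_search_length` is hypothetical.

```python
def relative_search_length(selected_positions, nep_all_length):
    """Highest nep-all position among the selected documents, relative to
    the length of nep-all."""
    positions = list(selected_positions)
    if not positions or nep_all_length <= 0:
        raise ValueError("need at least one selected document and a non-empty nep-all")
    return max(positions) / nep_all_length


# Issue from the slide: three selected documents, nep-all of 300 documents.
issue = {"D1": 4, "D2": 10, "D3": 7}
rsl = relative_search_length(issue.values(), 300)
print(round(rsl, 4))  # 10/300, rounded to four decimals
```

A small value means the editor found everything selected near the top of the list.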
  • 30. Summary RSL • The relative search length is comparatively low at 0.08 • Editors select papers from the very upper part of nep-all.
  • 31. Conclusion • Focused on observable system features – Editing time – Influences on report success – Effort in creating an issue • Summary: the system supports the editor well in creating an issue • A complete view requires a more user-centred observation. • Future work: – Why and under what conditions is a document relevant? • NEP provides many opportunities for further research on data that is relatively easily available.