Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Do Citations and Readership Predict Excellent Publications?

28 views

Published on

Slides for a presentation at the First Workshop on Scholarly Web Mining @ WSDM 2017
Paper: https://dl.acm.org/citation.cfm?id=3057154

Published in: Technology
  • Be the first to comment

  • Be the first to like this

Do Citations and Readership Predict Excellent Publications?

  1. 1. Do Citations and Readership Predict Excellent Publications? Dasha Herrmannova, The Open University, UK Robert Patton, Oak Ridge National Laboratory, USA Petr Knoth, The Open University, UK Chris Stahl, Oak Ridge National Laboratory, USA
  2. 2. Research question Q:Are current research evaluation metrics sufficient for identifying highly influential papers?
  3. 3. Why care about metrics? Research papers Researchers Funding agencies Institutions Who to fund? Returns on investment? Are we doing well? What to subscribe to? What to read? Where to publish? Collaborators? Citationanalysis Altmetrics
  4. 4. Finding what works • ML approach • Evaluate all methods in terms of precision-recall/accuracy/… • Requirement: ground truth • Research evaluation • No ground truth • Authority often established axiomatically • JIF, h-index, etc. • Can we build a ground truth dataset?
  5. 5. Our understanding of "impact" Low impact High impact vs
  6. 6. Our understanding of "impact" Low impact High impact vs Survey papers: "A general view, examination or description of someone or something" Seminal works: "Strongly influencing later developments"
  7. 7. Creating a dataset • Online questionnaire • Discipline? • Reference to a survey paper • Reference to a seminal paper • Collected 314 papers • Labels (seminal, survey) • Title, authors, year of publication, abstract, DOI, ... • Available online • http://trueid.semantometrics.org • Analysis • Seminal papers on average 10 years older • Seminal papers cited on average 5 times more
  8. 8. Do citations/readership predict excellent papers? • Classify papers using citations and readership as features • Model • Select a threshold t • If cit(d) ≥ t → label as seminal • Else → label as survey • Use threshold with best accuracy on the training set • Leave-one-out cross-validation • 3 experiments • Aggregate • Per discipline • Per year
  9. 9. Results Model Data Accuracy Upper bound Baseline Citations 52.87% - Readership 52.87% - Aggregate Citations 63.06% 63.38% Readership 42.68% 52.87% Discipline based Citations 45.28% 68.11% Readership 42.13% 62.60% Year based Citations 55.23% 68.62% Readership 51.05% 65.27%
  10. 10. Conclusion • Both citations and readership provide an improvement over the baseline • Neither of the two metrics is optimal
  11. 11. What next? • Ideal dataset • Multi-disciplinary • Time span • Publication types • Peer review judgement • Better metrics • Citation context • Analyzing content
  12. 12. Thank you! Questions? http://trueid.semantometrics.org

×