Machine Support for Interacting with Scientific Publications, Improving Information Retrieval, and Assessing Quality of Scientific Output

4th German-Russian Young Researchers Forum
Saint-Petersburg, August 2014

  • Outline: Introduction, Vision, Technology, Solutions, Conclusion
    Machine Support for Interacting with Scientific Publications, Improving Information Retrieval, and Assessing Quality of Scientific Output
    4th German-Russian Young Researchers Forum 2014
    Christoph Lange (1, 2)
    (1) Enterprise Information Systems, Institute for Applied Computer Science, University of Bonn
    (2) Fraunhofer Institute for Intelligent Analysis and Information Systems (IAIS), Sankt Augustin
    http://langec.wordpress.com/about
    Lange (Bonn), Interacting with Scientific Publications; Assessing Quality of Scientific Output, 2014-07-07
  • Hello, World!
    2011: PhD at Jacobs Univ. Bremen, Germany (software for collaborating on mathematical documents) [Lan11]
    2011/12: Univ. Bremen, Germany (making knowledge of different complexity manageable for computers) [OntoIOp13]
    2012/13: Univ. Birmingham, UK (enabling domain experts to make mathematical models machine-verifiable) [KLR]
    Since 2013: Enterprise Information Systems @ Univ. Bonn, Germany / Organized Knowledge @ Fraunhofer IAIS (enterprise information integration [AL14], data quality assessment, ...)
  • Assess Quality of Scientific Output (I)
    Vision: answer the following questions about the quality of scientific output:
    Author: “What is a good workshop to discuss my latest idea?”
    Senior Researcher: “Should I accept an invitation to the programme committee of this conference?”
    PhD Student: “What are the best publications I should read to get started?”
    Reviewer: “Is this paper based on high-quality data?”
  • Assess Quality of Scientific Output (II)
    How? With Semantic Web / Linked Open Data technology:
    • weak artificial intelligence: does not aim at replacing, but at supporting humans
    • practically applicable, and scalable to the size of the Web (→ search engine example)
    • suitable for connecting data from heterogeneous sources: scientific publications (bibliographic metadata, citations and full text), social networks (in science? ResearchGate, Mendeley, etc.), research data
  • Linked Open Data: schema.org
    An initiative of search engines (Google, Yandex, ...) for structuring web page content (creative works, events, organisations, persons, places, products).
    Example (Movie description), as rendered in the browser:
        Avatar
        Director: James Cameron (born August 16, 1954)
        Science fiction
        Trailer
    The same page as plain HTML, without machine-readable annotations:
        <div class="movie">
          <h1>Avatar</h1>
          <div class="director">
            Director: James Cameron (born August 16, 1954)
          </div>
          <span class="genre">Science fiction</span>
          <a href="../movies/avatar-theatrical-trailer.html">Trailer</a>
        </div>
    The same page annotated with schema.org microdata:
        <div itemscope itemtype="http://schema.org/Movie">
          <h1 itemprop="name">Avatar</h1>
          <div itemprop="director" itemscope itemtype="http://schema.org/Person">
            Director: <span itemprop="name">James Cameron</span>
            (born <span itemprop="birthDate">August 16, 1954</span>)
          </div>
          <span itemprop="genre">Science fiction</span>
          <a href="../movies/avatar-theatrical-trailer.html" itemprop="trailer">Trailer</a>
        </div>
    The graph that a machine reads from the annotated page:
        movie item: type = Movie; name = Avatar; genre = Science fiction; trailer = ../movies/...; director = person item
        person item: type = Person; name = James Cameron; birthDate = August 16, 1954
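To make the step from annotated HTML to a graph concrete, the following is a minimal sketch, not a full microdata implementation. It uses only Python's standard `html.parser`; the names `MicrodataParser`, `extract_items` and `MOVIE_HTML` are made up for this illustration, and the reader assumes well-formed markup in which every start tag has a matching end tag.

```python
from html.parser import HTMLParser

# The annotated movie description from the slide.
MOVIE_HTML = """
<div itemscope itemtype="http://schema.org/Movie">
  <h1 itemprop="name">Avatar</h1>
  <div itemprop="director" itemscope itemtype="http://schema.org/Person">
    Director: <span itemprop="name">James Cameron</span>
    (born <span itemprop="birthDate">August 16, 1954</span>)
  </div>
  <span itemprop="genre">Science fiction</span>
  <a href="../movies/avatar-theatrical-trailer.html" itemprop="trailer">Trailer</a>
</div>
"""


class MicrodataParser(HTMLParser):
    """Simplified microdata reader (illustrative only).

    Handles nested items, text properties and href properties.
    Assumes every start tag has a matching end tag (no bare <br>, <img>, ...).
    """

    def __init__(self):
        super().__init__()
        self.items = []  # top-level items found in the page
        self.open = []   # one record per currently open element

    def _current_item(self):
        # The innermost enclosing element that opened an item scope.
        for element in reversed(self.open):
            if element["item"] is not None:
                return element["item"]
        return None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        owner = self._current_item()
        element = {"item": None, "prop": None, "text": [], "owner": owner}
        prop = attributes.get("itemprop")
        if "itemscope" in attributes:
            # A new item; attach it to its owner if it is also a property.
            element["item"] = {"type": attributes.get("itemtype")}
            if prop and owner is not None:
                owner[prop] = element["item"]
            else:
                self.items.append(element["item"])
        elif prop and owner is not None:
            if "href" in attributes:
                owner[prop] = attributes["href"]  # link-valued property
            else:
                element["prop"] = prop            # text-valued property
        self.open.append(element)

    def handle_data(self, data):
        # Text belongs to every open text-valued property element.
        for element in self.open:
            if element["prop"]:
                element["text"].append(data)

    def handle_endtag(self, tag):
        if not self.open:
            return
        element = self.open.pop()
        if element["prop"]:
            element["owner"][element["prop"]] = "".join(element["text"]).strip()


def extract_items(html):
    """Return the top-level microdata items of `html` as nested dicts."""
    parser = MicrodataParser()
    parser.feed(html)
    parser.close()
    return parser.items


if __name__ == "__main__":
    movie = extract_items(MOVIE_HTML)[0]
    print(movie["name"], "directed by", movie["director"]["name"])
```

The nested dictionary mirrors the graph on the slide: the movie item points to a person item through its `director` property, which is exactly the structure a search engine can use without understanding the surrounding prose.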
  • Social Data with schema.org
    • review or rating of a creative work, organization or product (written by a person)
    • social network of a person: “knows”, “works for”, “is colleague of”, “has parent/sibling/spouse/child/relative”
    Example (Reviews of a movie), as a graph: a movie (type Movie, name Avatar) has two reviews; each review has an author and a reviewRating; the two ratings have the ratingValue 6 and 8.5; the authors are of type Person, are named Pünktchen and Anton, and are connected by knows.
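A graph like the one on this slide can be queried once it is written down as subject-predicate-object triples. The sketch below is only an illustration under stated assumptions: the node identifiers (`movie1`, `review1`, ...) are invented, and both the pairing of ratings with reviewers and the direction of the knows link are assumed, because the transcript does not preserve them.

```python
# Subject-predicate-object triples for the review example.
# Identifiers such as "movie1" are hypothetical; which reviewer gave which
# rating, and who knows whom, is an assumption made for this sketch.
TRIPLES = [
    ("movie1", "type", "Movie"),
    ("movie1", "name", "Avatar"),
    ("movie1", "reviews", "review1"),
    ("movie1", "reviews", "review2"),
    ("review1", "author", "person1"),
    ("review1", "reviewRating", "rating1"),
    ("rating1", "ratingValue", 6),
    ("review2", "author", "person2"),
    ("review2", "reviewRating", "rating2"),
    ("rating2", "ratingValue", 8.5),
    ("person1", "type", "Person"),
    ("person1", "name", "Pünktchen"),
    ("person2", "type", "Person"),
    ("person2", "name", "Anton"),
    ("person1", "knows", "person2"),
]


def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is in the graph."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def ratings_by_acquaintances(movie, person):
    """Rating values of reviews of `movie` written by people `person` knows."""
    known = set(objects(person, "knows"))
    values = []
    for review in objects(movie, "reviews"):
        if known & set(objects(review, "author")):
            for rating in objects(review, "reviewRating"):
                values.extend(objects(rating, "ratingValue"))
    return values


if __name__ == "__main__":
    print(ratings_by_acquaintances("movie1", "person1"))
```

Combining review data with the social network in one graph is what allows a question such as "how did the people I know rate this movie?" to be answered by a simple traversal.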
  • schema.org in a Search Engine
  • Workshop Quality
    Author: “What is a good workshop to discuss my latest idea?”
  • Workshop Quality: Examples
    Low-quality workshop: 1st International Workshop on Applied Networking (but all non-invited submissions are from authors from the same institution as the chairs)
    High-quality workshop: focused topic, 10 editions so far, balanced continuity and renewal in the organising committee, number of submissions not decreasing, international participation, part of a high-profile conference
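The low-quality example suggests one indicator that could be computed once workshop metadata is available as structured data: the share of non-invited papers whose authors all come from the chairs' institutions. The following is a rough sketch of that single indicator; the `Paper` record, the function name and the sample institutions are hypothetical and not taken from the talk.

```python
from dataclasses import dataclass, field


@dataclass
class Paper:
    """Hypothetical record of one workshop paper."""
    title: str
    affiliations: set = field(default_factory=set)  # institutions of all authors
    invited: bool = False


def same_institution_share(papers, chair_institutions):
    """Share of non-invited papers written entirely at the chairs' institutions.

    Returns a value between 0.0 and 1.0; 1.0 corresponds to the
    low-quality example on the slide.
    """
    chairs = set(chair_institutions)
    regular = [p for p in papers if not p.invited]
    if not regular:
        return 0.0
    insiders = [p for p in regular if p.affiliations and p.affiliations <= chairs]
    return len(insiders) / len(regular)


if __name__ == "__main__":
    papers = [
        Paper("Keynote", {"Univ. B"}, invited=True),
        Paper("Paper 1", {"Univ. A"}),
        Paper("Paper 2", {"Univ. A"}),
    ]
    print(same_institution_share(papers, {"Univ. A"}))
```

A single number like this is only a warning sign, not a verdict; the high-quality example on the slide combines several such signals (number of editions, committee renewal, submission trend, international participation).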
  • Workshop Quality: Data
    Semantic Publishing Challenge [DL14] @ Extended Semantic Web Conference 2014
    One task focused on extracting Linked Data from CEUR-WS.org workshop proceedings volumes:
    • 1,200 workshops since 1995
    • open access
    • most important publisher for computer science workshops
    • semi-structured HTML tables of contents
    • unstructured PDF full text
    A team from Saint-Petersburg (ITMO University) won the award for the best-performing tool [KK14]
  • Conference Quality
    Senior Researcher: “Should I accept an invitation to the programme committee of this conference?”
  • Conference Quality in the Past: Ranking
    CORE (Computing Research and Education Association of Australasia) and ERA (Excellence in Research for Australia) rankings of 2008, 2010 and 2013: infrequent and not transparent
  • Paper Quality in the Past: Impact Factor
    PhD Student: “What are the best publications I should read to get started?”
    Impact Factor:
    • average number of citations of recent articles
    • journals only
    • not comparable across disciplines
    • can be influenced by journal editors
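The slide describes the impact factor only as the "average number of citations of recent articles". As a small worked example, the sketch below uses the commonly cited two-year definition (citations received in year Y by items published in the two preceding years, divided by the number of citable items published in those years); the two-year window and the sample numbers are assumptions added here, not taken from the slide.

```python
def impact_factor(citations_received, citable_items):
    """Two-year journal impact factor for a year Y (commonly used definition).

    citations_received: citations made in year Y to items the journal
                        published in years Y-1 and Y-2
    citable_items:      number of citable items the journal published
                        in years Y-1 and Y-2
    """
    if citable_items <= 0:
        raise ValueError("citable_items must be positive")
    return citations_received / citable_items


if __name__ == "__main__":
    # Made-up numbers: 300 citations to 120 recent articles.
    print(impact_factor(300, 120))  # 2.5
```

Because the result is a plain average over a whole journal, it says nothing about an individual paper, and typical citation counts differ between fields, which is why the slide calls it not comparable across disciplines.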
  • Paper Quality in the Future
    Multidimensional, context-sensitive analysis:
    • trend detection, topic analysis, expert search, community dynamics, research performance at different levels (e.g. [OM14])
    • context-sensitive citation analysis, e.g. 2014 Semantic Publishing Challenge task 2 (using PubMedCentral XML metadata) [DL14]
      “good citation”: B’s contribution is based on A’s methodology
      “bad citation”: A cited in a footnote in the “related work” section
  • Data Quality
    Reviewer: “Is this paper based on high-quality data?”
    Figure: Quality metrics of an evolving dataset [DLA14]
  • Data Quality Assessment
    Quality := “fitness for use”
    Figure: quality categories and dimensions [Zav+13]
    Categories: Representation, Contextual, Intrinsic, Accessibility
    Dimensions: Relevancy, Conciseness, Timeliness, Rep.-Conciseness, Interoperability, Consistency, Interpretability, Understandability, Versatility*, Availability, Performance*, Interlinking*, Syntactic Validity, Trustworthiness, Licensing*, Semantic Accuracy, Completeness, Security*
    Legend: a line between two dimensions (Dim1, Dim2) means that the two dimensions are related
    • Enable authors to upload data with their papers!
    • Give peer reviewers access to data quality metrics
    • Starting collaboration with GESIS (social science)
  • Directions: Jailbreaking the PDF
    “exploring ways to access scholarly data in modern ways”
    Goal: free peer-reviewed scientific knowledge from being locked up in PDF documents
  • Directions: Pact with the Devil
    Openness vs. impact
    • Springer: conference linked data
    • Elsevier: executable paper challenge
    • ResearchGate: open reviews
  • Conclusion
    • Scientists need help with assessing the quality of scientific output.
    • Having PDF documents peer-reviewed by human experts is not sufficient.
    • We need better quality metrics than the impact factor.
    • Not just paper quality matters, but also data quality.
    • Semantic Web/Linked Data technology helps to provide complementary machine support, and is a gateway to openness.
  • References
    [AL14] S. Auer and C. Lange. “Interlinking Data and Knowledge in Enterprises, Research and Society with Linked Data”. In: Proceedings of the 11th International Baltic Conference on Databases and Information Systems (Baltic DB&IS). (Tallinn, Estonia, June 8–11, 2014). Ed. by H.-M. Haav, A. Kalja, and T. Robal. Invited paper. Tallinn, Estonia: Tallinn University of Technology Press, 2014, pp. 3–12.
    [DL14] A. Di Iorio and C. Lange, eds. Semantic Publishing Challenge (Extended Semantic Web Conference, Semantic Web Evaluation Track). (Anissaras, Greece, May 25, 2014). 2014. URL: http://2014.eswc-conferences.org/program/semwebeval.
    [DLA14] J. Debattista, C. Lange, and S. Auer. “Representing Dataset Quality Metadata using Multi-Dimensional Views”. 2014. Submitted.
    [KK14] M. Kolchin and F. Kozlov. “Unstable markup: A template-based information extraction from web sites with unstable markup”. In: Semantic Publishing Challenge (Extended Semantic Web Conference, Semantic Web Evaluation Track). (Anissaras, Greece, May 25, 2014). Ed. by A. Di Iorio and C. Lange. 2014. URL: http://2014.eswc-conferences.org/program/semwebeval.
    [KLR] M. Kerber, C. Lange, and C. Rowat. ForMaRE. Formal Mathematical Reasoning in Economics. URL: http://cs.bham.ac.uk/research/projects/formare/ (visited on 2013-02-10).
    [Lan11] C. Lange. “Enabling Collaboration on Semiformal Mathematical Knowledge by Semantic Web Integration”. PhD thesis. Jacobs University Bremen, 2011.
    [OM14] F. Osborne and E. Motta. “Understanding Research Dynamics”. In: Semantic Publishing Challenge (Extended Semantic Web Conference, Semantic Web Evaluation Track). (Anissaras, Greece, May 25, 2014). Ed. by A. Di Iorio and C. Lange. 2014. URL: http://2014.eswc-conferences.org/program/semwebeval.
    [OntoIOp13] OntoIOp (Ontology, Model and Specification Integration and Interoperability), an OMG Standard Development Initiative. 2013. URL: http://ontoiop.org (visited on 2013-10-09).
    [Zav+13] A. Zaveri, A. Rula, A. Maurino, R. Pietrobon, J. Lehmann, and S. Auer. “Quality Assessment Methodologies for Linked Open Data”. In: Semantic Web Journal (2013). This article is still under review. URL: http://www.semantic-web-journal.net/content/quality-assessment-linked-open-data-survey.