
Towards digitizing scholarly communication


Slides of the VIVO 2016 Conference keynote: Despite the availability of ubiquitous connectivity and information technology, scholarly communication has not changed much in the last hundred years: research findings are still encoded in and decoded from linear, static articles, and the possibilities of digitization are rarely used. In this talk, we will discuss strategies for digitizing scholarly communication. This comprises in particular: the use of machine-readable, dynamic content; the description and interlinking of research artifacts using Linked Data; the crowd-sourcing of multilingual educational and learning content. We discuss the relation of these developments to research information systems and how they could become part of an open ecosystem for scholarly communication.

Published in: Technology

  1. 1. Towards Digitizing Scholarly Communication Sören Auer University of Bonn & Fraunhofer IAIS
  2. 2. Publishing 6600 BCE: Jiahu symbols, 16 distinct markings on prehistoric artifacts found in Jiahu, a Neolithic Peiligang culture site in Henan, China
  3. 3. Publishing 2000 BCE Cursive hieroglyphs from the Papyrus of Ani (1250 BCE)
  4. 4. Publishing 380 BCE Papyrus Oxyrhynchus, with fragment of Plato's Republic
  5. 5. 12th century publishing Codex Gigas (largest extant medieval manuscript) was created in the Benedictine monastery of Podlažice in Bohemia (now Czech Republic).
  6. 6. Scientific publishing in the 17th century One of the earliest research journals: Philosophical Transactions of the Royal Society © CC BY Henry Oldenburg
  7. 7. Scientific publishing today Mainly based on PDF • Is only partially machine-readable • Does not preserve structure • Does not allow embedding of semantics • Does not facilitate interactivity/dynamicity/repurposing • …
  8. 8. Has it changed much? In terms of distribution: YES • Almost zero cost of copying and distribution • (the whole history of publishing is mainly a history of the reduction of marginal costs of publishing) In terms of method/representation: NO • Articles are fixed successions of characters and words • Static in terms of presentation, content, granularity
  9. 9. Researchers spend (most of) their time on: • Encoding their findings in articles • Decoding other researchers' findings from articles • Finding related work • Getting an overview of the state of the art. We need to develop means to make scholarly communication more efficient and effective.
  10. 10. New possibilities in a Digital World • Machine-readability • Semantic representation • Dynamic content, interactive examples • Integration of multimedia content • Rich interlinking with context (related work, calls, reviews, comments/discussions) • Integration of rich metadata (provenance, licensing) • Interactive collaboration • …
  11. 11. Machine-readability • In PDFs the structure of the documents is lost • Headings, paragraphs, tables, references etc. are not recognizable anymore • Semantics can only be added as metadata on a per document level
  12. 12. Semantic Representation In addition to 5-star data, we need 5-star documents: • Machine-readable • Semantics-aware • Interlinked David Shotton: The Five Stars of Online Journal Articles, D-Lib Magazine (2012)
  13. 13. limes-paper describes appr1 . appr1 a approach . appr1 for Link_Discovery . appr1 hasProp lossless . ... limes-paper describes impl1 . impl1 a implementation . impl1 implements appr1 . impl1 language Java . ... limes-paper describes eval1 . eval1 a evaluation . eval1 evaluates impl1 . eval1 uses Dbpedia . ... Internal: Semantic Description of Scientific Content. Facilitates querying for all link discovery approaches having certain properties, or implementations thereof in a certain language, using a certain dataset.
  14. 14. External: Rich Interlinking with Related Work, Calls, Reviews, Discussions, …
  15. 15. Three approaches for digitizing scholarly communication Linked Research – enabling semantic authoring, publishing, discovery SlideWiki – courseware authoring and translation OpenResearch – Collaborative Management of Scholarly Communication Metadata
  16. 16. clientside editor for decentralised article publishing, annotations and social interactions Sarven Capadisli
  17. 17. A holistic view on scientific publishing
  18. 18. Sparklines – Adding small interactive inline charts
  19. 19. Linked Research & Features • Documents are human- and machine-friendly. • Using the plain old semantic HTML marking process, with further semantic annotations using microformats and RDF. • All kinds of interactive content can be embedded into the HTML5 documents, e.g. JavaScript apps, code, videos, audio, interactive visualizations • Different views, e.g. ACM, LNCS, W3C-ED, Slideshow, Native • Builds on Linked Data Platform, Solid and Linked Data Notifications to realize truly decentralized authoring & publishing workflows
  20. 20. Interlinking a research article, call for contributions and workshops, and proceedings @prefix sioc: <> . @prefix schema: <> . @prefix bibo: <> . <> sioc:reply_of <> ; schema:hasPart <> . <> sioc:reply_of <> ; bibo:citedBy <> . <> schema:hasPart <> . <> bibo:uri <> .
  21. 21. Comparison of scientific authoring and publishing approaches. Legend: ACaA = Access control and attribution; AtA = Adaptation to audiences; CaF = Commentary and feedback; DAaP = Decentralised authoring and publishing; DI = Data integration; DVaM = Different views and media; EI = Entity identifiers; FaI = Feedback and interactions; HaMR = Human and machine-readability; IAaPW = Integrated authoring and publication workflow; IC = Interactive content; IM = Impact metrics; IoS = Integration of semantics; M = Multimedia; PaA = Provenance and accountability; PaP = Persistence and preservation; SaSI = Sharing and social interactions
  22. 22. OpenResearch – a Semantic Wiki for Scientific Event Metadata (RIS for Events). We need Research Information Systems not only for organizations but also for communities or specific types of content. Events are a crucial element of scholarly communication. Information about events is difficult to obtain: • Quality (e.g. acceptance rate, PC members) • Logistics (locations, fees) • Dates (submission, registration etc.) • Co-located events • … CC BY 3.0 Wiki4des at English Wikipedia
  23. 23. Structured event metadata; semantic (typed) links inside call text
  24. 24. Interactive Queries (can also be created by users)
  25. 25. Architecture
  26. 26. SlideWiki – A collaborative OpenCourseWare Authoring Platform • Collaborative creation and maintenance of high-quality, multilingual OpenCourseWare is still a major challenge • SlideWiki is a platform for OpenCourseWare creation employing crowdsourcing, full versioning, WYSIWYG
  27. 27. Facilitates translations to many languages. Helps to keep track of authors, translators and sources.
  28. 28. SlideWiki: Self-assessment questions … can be attached to every single slide
  29. 29. Learners can test their knowledge … and be pointed exactly to the content they need to revisit
  30. 30. How is SlideWiki different? SlideWiki differs from other online tools for presentations, such as Google Docs Presentations, Prezi and SlideShare, due to its focus on: • E-learning - you can add questions to slides and thus compose comprehensive self-assessment tests for learners • Collaboration - SlideWiki aims at empowering whole communities to create presentations collaboratively • Translation - with SlideWiki, content can easily be translated into more than 50 languages
  31. 31. Semantic Web Layer Cake 2001 • Monolithic, based on XML • Focus on heavyweight semantics (Ontologies, Logic, Reasoning)
  32. 32. © Fraunhofer The Semantic Web Layer Cake 2015 – Bridging between Big & Smart Data. Diagram labels: Unicode; URIs; XML, JSON, CSV, RDB, HTML; RDF; RDF/XML, JSON-LD, CSV2RDF, R2RML, RDFa; RDF Data Shapes; RDF-Schema; Vocabularies; Ontologies; SKOS Thesauri; Logic; SWRL Rules; SPARQL; (Access control), Signature, Encryption (HTTPS/CERT/DANE) • Lingua Franca of Data integration with many technology interfaces (XML, HTML, JSON, CSV, RDB, …) • Focus on lightweight vocabularies, rules, thesauri etc. • Less “invasive”
  33. 33. Towards an Ecosystem of Open Scholarly Communication Infrastructure. Authoring environments • Crowd-sourcing, versioning, social networking. 5-Star publishing of data, documents, courseware and artefacts • Decentralized, open, interlinked. Research information systems • Organization, community, region, communication-type centered. We need to invest more into techniques tailored for digital knowledge exchange instead of techniques mimicking work-arounds of the past. From document-centricity to knowledge-centricity
  34. 34. Thanks for your attention Sören Auer Sarven Capadisli Christoph Lange
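
Slide 13 claims that a semantic description of a paper makes it possible to query for link discovery approaches with certain properties, their implementations in a certain language, and the datasets used to evaluate them. The following is only a minimal sketch of that idea in plain Python, not the talk's actual tooling: the triples are copied from the slide (reading "looseless" as "lossless"), while the helper names `match` and `find_implementations` are hypothetical and introduced purely for illustration. A real system would more likely store the statements as RDF and query them with SPARQL.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Statements from slide 13 (the slide's "looseless" is read here as "lossless").
TRIPLES: List[Triple] = [
    ("limes-paper", "describes", "appr1"),
    ("appr1", "a", "approach"),
    ("appr1", "for", "Link_Discovery"),
    ("appr1", "hasProp", "lossless"),
    ("limes-paper", "describes", "impl1"),
    ("impl1", "a", "implementation"),
    ("impl1", "implements", "appr1"),
    ("impl1", "language", "Java"),
    ("limes-paper", "describes", "eval1"),
    ("eval1", "a", "evaluation"),
    ("eval1", "evaluates", "impl1"),
    ("eval1", "uses", "Dbpedia"),
]


def match(triples: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def find_implementations(triples: List[Triple], task: str, prop: str,
                         language: str, dataset: str) -> List[Tuple[str, str]]:
    """Hypothetical helper: (approach, implementation) pairs where the approach
    addresses `task` and has property `prop`, and the implementation is written
    in `language` and was evaluated using `dataset`."""
    results = []
    # Approaches for the task that also carry the requested property.
    for approach, _, _ in match(triples, p="for", o=task):
        if not match(triples, approach, "a", "approach"):
            continue
        if not match(triples, approach, "hasProp", prop):
            continue
        # Implementations of that approach in the requested language.
        for impl, _, _ in match(triples, p="implements", o=approach):
            if not match(triples, impl, "language", language):
                continue
            # Keep the pair only if some evaluation of it used the dataset.
            evaluated = any(
                match(triples, ev, "uses", dataset)
                for ev, _, _ in match(triples, p="evaluates", o=impl)
            )
            if evaluated:
                results.append((approach, impl))
    return results


if __name__ == "__main__":
    print(find_implementations(TRIPLES, "Link_Discovery", "lossless", "Java", "Dbpedia"))
```

Because every statement is an explicit subject-predicate-object triple, the same pattern matching works across many papers once their descriptions are merged into one list, which is the kind of cross-paper query a PDF alone does not support.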