Nanopublications and Decentralized Publishing, Tobias Kuhn
1) Current methods of publishing and sharing research results and data pose problems regarding verifiability, immutability, and permanence over time.
2) Nanopublications use cryptographic hashes to create "Trusty URIs" that make digital objects verifiable, immutable, and permanent by linking identifiers to content.
3) A decentralized network of nanopublication servers allows for open, real-time publishing and retrieval of nanopublications without a central authority through propagation across nodes.
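The hash-in-identifier idea behind Trusty URIs can be sketched in a few lines of Python. This is a simplified illustration only: the actual Trusty URI specification defines precise content normalization, module codes, and encoding rules, so the function names and the URI layout below are assumptions.

```python
import base64
import hashlib

def make_trusty_uri(base_uri: str, content: str) -> str:
    """Append a hash of the content to the identifier, so the URI both
    locates and verifies the digital object (simplified sketch)."""
    digest = hashlib.sha256(content.encode("utf-8")).digest()
    code = base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")
    return f"{base_uri}.{code}"

def verify_trusty_uri(uri: str, content: str) -> bool:
    """Recompute the hash from the content; any change to the content
    breaks the match, which is what makes the object verifiable and
    effectively immutable."""
    base_uri = uri.rpartition(".")[0]
    return make_trusty_uri(base_uri, content) == uri
```

Because the identifier commits to the content, anyone holding the URI can check that a retrieved copy is unaltered, with no central authority needed.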
New tasks, new roles: Libraries in the tension between Digital Humanities, Re..., Stefan Schmunk
This document summarizes Dr. Stefan Schmunk's presentation on new roles for libraries in relation to digital humanities, research data, and research infrastructures. The presentation discusses how digital humanities projects involving tasks like digital scholarly editions require new skills from libraries, such as expertise in XML encoding, long-term preservation of digital materials, and creation of virtual research environments. It also explores how libraries must adapt to help researchers with the growing importance of research data in the humanities by taking on roles like hosting data repositories, providing data management support and training, and building research data infrastructures.
Making social science more reproducible by encapsulating access to linked data, Albert Meroño-Peñuela
This document discusses improving reproducibility in social science research by encapsulating access to linked data. It proposes using GitHub to collaboratively write SPARQL queries that can be used to combine and select subsets of linked open data. A tool called GRLC is presented that automatically builds APIs from the SPARQL queries in a GitHub repository. This allows research questions to be encoded as SPARQL and executed through HTTP links. GRLC has been successfully used in several domains and projects to improve data sharing and reuse.
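The grlc convention of storing annotated SPARQL queries in a repository can be pictured with a small parser. The query, its metadata, and the parser below are illustrative assumptions, not grlc's actual code; grlc reads "#+ key: value" comment lines from .rq files to build the API description.

```python
# A hypothetical grlc-style .rq file: metadata comments plus a query
SAMPLE_RQ = """\
#+ summary: People per birth place (hypothetical example query)
#+ endpoint: http://example.org/sparql

SELECT ?place (COUNT(?person) AS ?n)
WHERE { ?person <http://example.org/bornIn> ?place }
GROUP BY ?place
"""

def parse_decorators(rq_text):
    """Separate '#+ key: value' metadata comments from the SPARQL body
    (an illustrative parser, not grlc's actual implementation)."""
    metadata, body = {}, []
    for line in rq_text.splitlines():
        if line.startswith("#+"):
            key, _, value = line[2:].partition(":")
            metadata[key.strip()] = value.strip()
        else:
            body.append(line)
    return metadata, "\n".join(body).strip()
```

A tool following this convention can turn each query file into one HTTP endpoint, using the metadata for documentation and the body as the query to execute, which is how the research question stays version-controlled in the repository.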
This document discusses using linked data technology to solve the problem of disconnected and isolated economic and social science datasets. It proposes hosting updated versions of important historical databases as linked data in a single location. This will allow users to easily query across datasets, upload and link their own datasets, and build a large graph of interconnected public datasets. The document demonstrates some triplestore and linked data browsing tools, including a SPARQL query editor and lightweight linked data browser. It also introduces the team working on the structured datahub and linked data solutions for economic and social historians.
This document outlines plans for the Clariah Structured Data Hub project. Clariah aims to provide humanities scholars access to large digital resources and tools to enable ground-breaking research. The Structured Data Hub will curate and link structured datasets on various levels from micro to macro. It will also create tools to facilitate the research process, such as data evaluation, linking, analysis, and visualization. The project will involve a design phase with two pilot studies, followed by preparation, execution, and close phases to develop a research infrastructure with linked data and tools.
This document describes QB'er, a tool for converting statistical datasets into linked open data on the semantic web. It aims to address problems with today's workflow for working with multiple disconnected datasets, including a lack of comparability and repeating cleaning efforts. QB'er allows researchers to standardize individual datasets according to community best practices, share code lists with colleagues, and publish standardized, interlinked datasets on a structured data hub. This grows a graph of interconnected datasets and makes the cleaning and mapping efforts reusable rather than disposable. A demonstration shows uploading a historical census dataset and mapping its variable values to codes while preserving the original values.
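The "map values to codes while preserving the originals" step can be sketched in plain Python. The record fields and the occupation codelist below are invented for illustration, and QB'er itself produces RDF Data Cube output rather than Python dicts.

```python
def apply_codelist(records, variable, codelist):
    """Attach a standardized code for one variable while keeping the raw
    value, so the mapping effort is reusable and nothing is lost."""
    coded = []
    for record in records:
        record = dict(record)  # copy, to leave the input untouched
        record[variable + "_code"] = codelist.get(record[variable])
        coded.append(record)
    return coded

# Hypothetical census rows and a hypothetical occupation codelist
census = [{"name": "J. Jansen", "occupation": "smid"},
          {"name": "A. de Vries", "occupation": "bakker"}]
codes = {"smid": "OCC-0712", "bakker": "OCC-0751"}
```

Because the shared codes sit alongside the original values, datasets coded against the same list become comparable, while the raw source values remain available for inspection.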
The document discusses using linked data (LD) to model heterogeneous economic and social history data. It provides examples of recurring datasets that could benefit from an LD approach, such as the Historical Sample of the Netherlands (HSN) life course data. LD allows data to be expressed as statements in triples that can be queried using SPARQL. This contributes to reproducible research by making data and efforts more openly accessible and integrated across datasets through shared vocabularies. However, LD may not be suitable if performance is a pressing issue. Examples are then provided of how the HSN data could be modeled and queried using LD by linking together information on individuals, households, and external datasets.
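Expressing data as triples and querying by pattern, the core operation SPARQL builds on, can be sketched as follows. The subjects and predicates are invented for illustration and do not reflect the actual HSN vocabulary.

```python
# (subject, predicate, object) statements about hypothetical individuals
TRIPLES = {
    ("hsn:person/1", "schema:name", "Jan Jansen"),
    ("hsn:person/1", "ex:bornIn", "place:Rotterdam"),
    ("hsn:person/1", "ex:memberOf", "hsn:household/7"),
    ("hsn:person/2", "ex:memberOf", "hsn:household/7"),
}

def match(s=None, p=None, o=None):
    """Return all triples matching a pattern, where None is a wildcard.
    This is the basic triple-pattern match that SPARQL graph patterns
    generalize."""
    return sorted(t for t in TRIPLES
                  if s in (None, t[0])
                  and p in (None, t[1])
                  and o in (None, t[2]))
```

Linking individuals to households and to external datasets then amounts to adding more triples; queries such as "everyone in household 7" fall out of the same pattern-matching operation, with no schema change required.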
This document discusses developing a distributed network of digital heritage information in the Netherlands. It proposes taking a resource-centric linked data approach, implementing linked data principles in data sources, building a knowledge graph, and creating a registry to link organizations, datasets, and resources. This would allow for federated querying across distributed data sources and improved discovery of digital heritage information.
Wednesday 6 May: Hand me the data! What you should know as a humanities researcher before asking for data from a web archive, Ulrich Have, NetLab/DIGHUMLAB, Aarhus University (WARCnet)
The document discusses a data wrangling experiment to create datasets from the Rijksmuseum collection and web archive data for research purposes. A group of researchers from different universities aim to develop standardized code books and controlled vocabularies to structure the data and enable interlinking across collections. They discuss techniques like SPARQL and identifiers in Wikidata to retrieve and organize machine-readable data for future studies of body postures in artworks and web archives.
Repositories are systems mainly used to store and publish academic content. This presentation discusses why repository contents should be published as Linked (Open) Data and how repositories can be extended to do so.
Wikidata is a free and open knowledge base that can be edited by anyone to store structured data. It currently holds over 33.5 million items and has received 1.9 billion edits, with support for 287 languages. Wikidata provides structured, collaborative, free, open, multilingual, and referenced data through its API and licenses its data under CC0 to allow easy access and reuse. It helps projects like Wikipedia by providing integrated access to its data and supports smaller languages and communities through micro-contributions. In 2015, Google's Freebase project moved its data to Wikidata, increasing its scope and ecosystem.
Linked Data allows evolving the web into a global data space by publishing structured data on the web using RDF and by linking data items across different data sources. It follows the Linked Data principles of using URIs to identify things and HTTP URIs to look up those names, providing useful RDF information when URIs are dereferenced, and including RDF links to discover related data. The amount of published Linked Data on the web has grown enormously since 2007. Large data sources like DBpedia extract structured data from Wikipedia and act as hubs by interlinking different data sets, enabling new applications and search over integrated data.
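The "HTTP URIs that return useful RDF when dereferenced" principle rests on content negotiation: the client asks for RDF via the Accept header. The sketch below builds such a request with Python's standard library without sending it; the DBpedia URI is used only as a familiar example.

```python
import urllib.request

def rdf_request(uri):
    """Build (but do not send) an HTTP request that asks for RDF when
    dereferencing a Linked Data URI, via the Accept header."""
    return urllib.request.Request(
        uri,
        headers={"Accept": "text/turtle, application/rdf+xml;q=0.9"},
    )

req = rdf_request("http://dbpedia.org/resource/Amsterdam")
```

Sending this request with urllib.request.urlopen would, assuming the server supports content negotiation, return a Turtle or RDF/XML description of the resource instead of the HTML page a browser receives for the same URI.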
New approaches for data acquisition at Europeana: IIIF, sitemaps and schema.o..., Nuno Freire
Presentation on experiments at Europeana regarding new methods of aggregating metadata.
Presented at the Seminar Linked Data in Research and Cultural Heritage on 1 May 2017.
WORLDMAP: A SPATIAL INFRASTRUCTURE TO SUPPORT TEACHING AND RESEARCH (BROWN BA..., Micah Altman
The WorldMap platform http://worldmap.harvard.edu is the largest open source collaborative mapping system in the world, with over 13,000 map layers contributed by thousands of users from Harvard and around the world. Researchers may upload large spatial datasets to the system, create data-driven visualizations, edit data, and control access. Users may keep their data private, share it in groups, or publish to the world.
The user base is interdisciplinary, including scholars from the humanities, social sciences, sciences, public health, design, planning, etc. All are able to access, view, and use one another’s data, either online, via map services, or by downloading.
Current work is underway to create and maintain a global registry of map services and take us a step closer to one-stop access for public geospatial data. Another project is working on tools to support the visualization of spatial datasets with over a billion features. Collaborations are underway with groups inside Harvard, such as Dataverse, HarvardX, and various departments, and with groups outside Harvard, such as Cornell University and the University of Pennsylvania. Major additional contributors to the underlying source code include the World Bank, the U.S. State Department, and the United Nations.
The source code for the WorldMap platform is available on GitHub https://github.com/cga-harvard/cga-worldmap.
Location: E25-202
Discussant: Ben Lewis is system architect and project manager for WorldMap, an open source infrastructure that supports collaborative research centered on geospatial information. Before joining Harvard, Ben was a project manager with Advanced Technology Solutions of Pennsylvania, where he led the company in adopting platform independent approaches to GIS system development. Ben studied Chinese at the University of Wisconsin and has a Masters in Planning from the University of Pennsylvania. After Penn, Ben helped start the GIS Lab at U.C. Berkeley, founded the GIS group for transportation engineering firm McCormick Taylor, and coordinated the Land Acquisition Mapping System for South Florida Water Management District. Ben is especially interested in technologies that lower the barrier to spatial technology access.
Information Science Brown Bag talks, hosted by the Program on Information Science, consist of regular discussions and brainstorming sessions on all aspects of information science and on uses of information science and technology to assess and solve institutional, social, and research problems. These are informal talks. Discussions are often inspired by real-world problems being faced by the lead discussant.
Tuesday 5 May: The Shapes of Archives and Memory, Helle Strandgaard Jensen (WARCnet)
This document outlines a research project that will examine how the online activities of national archives in Denmark and the UK have shaped archives and their role in public memory-making. The project will use global history theory to analyze the websites of the Danish National Archive and the National Archives UK from their earliest versions archived by the Wayback Machine and national web archives. The analysis will focus on aspects like connections to other sites, functions, size, topics, internal networks, audiences, and design. It will also consider the broader technological contexts. The goal is to understand how the archives' online presences have evolved and how web archives can shape both the history written about archives and archives' own histories on the web.
Mind the gap! Reflections on the state of repository data harvesting, Simeon Warner
A 24x7 presentation at Open Repositories 2017 in Brisbane, Australia.
I start with an opinionated history of the evolution of repository data harvesting from the late 1990s to the present. One conclusion is that we are currently in danger of creating a repository environment with fewer cross-repository services than before, with the potential to reinforce the silos we hope to open. I suggest that the community needs to agree upon a new solution, and further suggest that solution should be ResourceSync.
DataverseNL is a data repository service started in 2014 by DANS as a shared service for 15 Dutch institutions. It currently contains over 200 dataverses and 450 datasets that have been downloaded over 7,000 times. DANS aims to use DataverseNL for ongoing research projects and then archive finalized datasets in its Trusted Digital Repository (TDR) for permanent preservation. DataverseNL serves as a collaboration platform and integration point for sharing research data across Dutch universities and organizations. DANS is working to link DataverseNL metadata to semantic web vocabularies and expose it as linked open data.
1) GRLC is a tool that generates Linked Data APIs from SPARQL queries stored in a GitHub repository. It automatically builds Swagger specifications and API code by mapping the GitHub repository structure and SPARQL queries.
2) This allows SPARQL queries to be organized and maintained externally to applications in a version controlled way. The APIs generated hide the complexity of SPARQL from clients.
3) GRLC was used to build APIs for accessing historical census data, hiding SPARQL from historians. It was also used to reduce coupling between SPARQL and R code for a project analyzing the impact of early life conditions on later outcomes.
Presentation for the "Studiemiddag Linked Data Archieven", Victor de Boer
- Archival data is currently isolated and not integrated, making it difficult to reuse. Linked data provides a flexible way to integrate heterogeneous data sources while retaining their original structures.
- Linked data uses web technologies like URIs and RDF triples to describe things and link them together. This allows data to be connected and queried across collections and domains.
- Publishing data as linked open data according to the 5 star system maximizes its potential for reuse through open standards and linking to other open datasets. This enables new applications and ways of exploring and analyzing the data.
This document outlines a research project to identify and document the skills, knowledge, tools, and training needed for web archiving research. A team of researchers will conduct a pilot skills workshop, survey researchers, and interview researchers to understand what skills and knowledge led to successful research experiences. The goals are to develop a skills matrix, identify resources for acquiring skills, and create "personas" of researchers. The outputs will include papers and a dataset of interviews archived as examples of web archiving use cases.
Maximising (Re)Usability of Library metadata using Linked Data, Asuncion Gomez-Perez
This document discusses maximizing the reusability of library metadata using linked data. It motivates the use of linked data by describing the current heterogeneous data landscape with issues around language, format, and lack of interoperability. It then discusses how linked data allows for uniform access through agreed upon vocabularies and standards. Specific issues around language, provenance, license and the linked data process are covered. Uses of linked library metadata are also discussed.
Nederlab is a consortium that aims to bring together digitized Dutch texts from 800 AD to the present into one online interface. This includes newspapers from 1618-1995 and early Dutch books. The texts are converted to a uniform XML format and processed with tools like TICCL to correct OCR errors. Volunteers are helping to transcribe newspapers. Future work includes incorporating other projects' corpus exploration tools to enhance the portal.
Open content opens up new avenues of research, Felix Lohmeier
(1) The document discusses how open access to digitized content from libraries opens up new avenues of research. It describes the digitization efforts of the Saxon State and University Library (SLUB) in Dresden, Germany, which scanned over 2.7 million pages in 2012.
(2) Open access to this digital content allows for new ways of discovering and interacting with information through semantic search tools and virtual research environments. It also facilitates increased collaboration between researchers as they can more easily share and discuss data.
(3) However, there remains a "data gap" between available digital resources and researchers' needs. The document argues that SLUB aims to help fill this gap through various digital services.
CLARIAH Toogdag 2018: A distributed network of digital heritage information, Enno Meijers
Slides of my keynote at the CLARIAH Toogdag 2018 on 9 March at the National Library of the Netherlands. The main topics were the development of the distributed digital heritage network and its alignment with, and cooperation with, the CLARIAH infrastructure and data. It also points out some current limitations of semantic web technology.
Wednesday 6 May: Hand me the data! What you should know as a humanities resea...WARCnet
Wednesday 6 May: Hand me the data! What you should know as a humanities researcher before asking for data from a web archive, Ulrich Have, NetLab/DIGHUMLAB, Aarhus University
The document discusses a data wrangling experiment to create datasets from the Rijksmuseum collection and web archive data for research purposes. A group of researchers from different universities aim to develop standardized code books and controlled vocabularies to structure the data and enable interlinking across collections. They discuss techniques like SPARQL and identifiers in Wikidata to retrieve and organize machine-readable data for future studies of body postures in artworks and web archives.
Repositories are systems mainly used to store and publish academic contents. This presentation discusses why repositories contents should be published as Linked (Open) Data and how repositories can be extended to do so.
Wikidata is a free and open knowledge base that can be edited by anyone to store structured data. It currently has over 33.5 million articles and 1.9 billion edits in 287 languages. Wikidata provides structured, collaborative, free, open, multilingual, and referenced data through its API and licenses its data under CC0 to allow easy access and reuse. It helps projects like Wikipedia by providing integrated access to its data and supports smaller languages and communities through micro-contributions. In 2015, Google's Freebase project moved its data to Wikidata, increasing its scope and ecosystem.
Linked Data allows evolving the web into a global data space by publishing structured data on the web using RDF and by linking data items across different data sources. It follows the Linked Data principles of using URIs to identify things and HTTP URIs to look up those names, providing useful RDF information when URIs are dereferenced, and including RDF links to discover related data. The amount of published Linked Data on the web has grown enormously since 2007. Large data sources like DBpedia extract structured data from Wikipedia and act as hubs by interlinking different data sets, enabling new applications and search over integrated data.
New approaches for data acquisition at europeana iiif, sitemaps and schema.o...Nuno Freire
Presentation on experiments at Europeana regarding new methods of aggregating metadata.
Presented at the Seminar Linked Data in Research and Cultural Heritage, on 1st of May 2017.
WORLDMAP: A SPATIAL INFRASTRUCTURE TO SUPPORT TEACHING AND RESEARCH (BROWN BA...Micah Altman
The WorldMap platform http://worldmap.harvard.edu is the largest open source collaborative mapping system in the world, with over 13,000 map layers contributed by thousands of users from Harvard and around the world. Researchers may upload large spatial datasets to the system, create data-driven visualizations, edit data, and control access. Users may keep their data private, share it in groups, or publish to the world.
The user base is interdisciplinary, including scholars from the humanities, social sciences, sciences, public health, design, planning, etc. All are able to access, view, and use one another’s data, either online, via map services, or by downloading.
Current work is underway to create and maintain a global registry of map services and take us a step closer to one-stop-access for public geospatial data. Another project is working on tools to support the visualization of spatial datasets with over a billion features. Current collaborations are underway with groups inside Harvard, such as Dataverse, HarvardX, and various departments, and with groups outside Harvard, such as Cornell University and the University of Pennsylvania. Major additional contributors to the underlying source code include the WorldBank, the U.S. State Department, and the United Nations.
The source code for the WorldMap platform is available on GitHub https://github.com/cga-harvard/cga-worldmap.
Location: E25-202
Discussant: Ben Lewis is system architect and project manager for WorldMap, an open source infrastructure that supports collaborative research centered on geospatial information. Before joining Harvard, Ben was a project manager with Advanced Technology Solutions of Pennsylvania, where he led the company in adopting platform independent approaches to GIS system development. Ben studied Chinese at the University of Wisconsin and has a Masters in Planning from the University of Pennsylvania. After Penn, Ben helped start the GIS Lab at U.C. Berkeley, founded the GIS group for transportation engineering firm McCormick Taylor, and coordinated the Land Acquisition Mapping System for South Florida Water Management District. Ben is especially interested in technologies that lower the barrier to spatial technology access.
Information Science Brown Bag talks, hosted by the Program on Information Science, consists of regular discussions and brainstorming sessions on all aspects of information science and uses of information science and technology to assess and solve institutional, social and research problems. These are informal talks. Discussions are often inspired by real-world problems being faced by the lead discussant.
Tuesday 5 May: The Shapes of Archives and Memory, Helle Strandgaard JensenWARCnet
This document outlines a research project that will examine how the online activities of national archives in Denmark and the UK have shaped archives and their role in public memory-making. The project will use global history theory to analyze the websites of the Danish National Archive and the National Archives UK from their earliest versions archived by the Wayback Machine and national web archives. The analysis will focus on aspects like connections to other sites, functions, size, topics, internal networks, audiences, and design. It will also consider the broader technological contexts. The goal is to understand how the archives' online presences have evolved and how web archives can shape both the history written about archives and archives' own histories on the web.
Mind the gap! Reflections on the state of repository data harvestingSimeon Warner
A 24x7 presentation at Open Repositories 2017 in Brisbane, Australia.
I start with an opinionated history of the evolution of repository data harvesting since the late 1990's to the present. A conclusion is that we are currently in danger of creating a repository environment with fewer cross-repository services than before, with the potential to reinforce the silos we hope to open. I suggest that the community needs to agree upon a new solution, and further suggest that solution should be ResourceSync.
DataverseNL is a data repository service started in 2014 by DANS as a shared service for 15 Dutch institutions. It currently contains over 200 dataverses and 450 datasets that have been downloaded over 7,000 times. DANS aims to use DataverseNL for ongoing research projects and then archive finalized datasets in its Trusted Digital Repository (TDR) for permanent preservation. DataverseNL serves as a collaboration platform and integration point for sharing research data across Dutch universities and organizations. DANS is working to link DataverseNL metadata to semantic web vocabularies and expose it as linked open data.
1) GRLC is a tool that generates Linked Data APIs from SPARQL queries stored in a GitHub repository. It automatically builds Swagger specifications and API code by mapping the GitHub repository structure and SPARQL queries.
2) This allows SPARQL queries to be organized and maintained externally to applications in a version controlled way. The APIs generated hide the complexity of SPARQL from clients.
3) GRLC was used to build APIs for accessing historical census data, hiding SPARQL from historians. It was also used to reduce coupling between SPARQL and R code for a project analyzing the impact of early life conditions on later outcomes.
Presentatie for "Studiemiddag Linked Data Archieven"Victor de Boer
- Archival data is currently isolated and not integrated, making it difficult to reuse. Linked data provides a flexible way to integrate heterogeneous data sources while retaining their original structures.
- Linked data uses web technologies like URIs and RDF triples to describe things and link them together. This allows data to be connected and queried across collections and domains.
- Publishing data as linked open data according to the 5 star system maximizes its potential for reuse through open standards and linking to other open datasets. This enables new applications and ways of exploring and analyzing the data.
This document outlines a research project to identify and document the skills, knowledge, tools, and training needed for web archiving research. A team of researchers will conduct a pilot skills workshop, survey researchers, and interview researchers to understand what skills and knowledge led to successful research experiences. The goals are to develop a skills matrix, identify resources for acquiring skills, and create "personas" of researchers. The outputs will include papers and a dataset of interviews archived as examples of web archiving use cases.
Maximising (Re)Usability of Library metadata using Linked Data Asuncion Gomez-Perez
This document discusses maximizing the reusability of library metadata using linked data. It motivates the use of linked data by describing the current heterogeneous data landscape with issues around language, format, and lack of interoperability. It then discusses how linked data allows for uniform access through agreed upon vocabularies and standards. Specific issues around language, provenance, license and the linked data process are covered. Uses of linked library metadata are also discussed.
Nederlab is a consortium that aims to bring together digitized Dutch texts from 800 AD to the present into one online interface. This includes newspapers from 1618-1995 and early Dutch books. The texts are converted to a uniform XML format and processed with tools like TICCL to correct OCR errors. Volunteers are helping to transcribe newspapers. Future work includes incorporating other projects' corpus exploration tools to enhance the portal.
Open content opens up new avenues of research - Felix Lohmeier
(1) The document discusses how open access to digitized content from libraries opens up new avenues of research. It describes the digitization efforts of the Saxon State and University Library (SLUB) in Dresden, Germany, which has scanned over 2.7 million pages in 2012.
(2) Open access to this digital content allows for new ways of discovering and interacting with information through semantic search tools and virtual research environments. It also facilitates increased collaboration between researchers as they can more easily share and discuss data.
(3) However, there remains a "data gap" between available digital resources and researchers' needs. The document argues that SLUB aims to help fill this gap through various digital services.
CLARIAH Toogdag 2018: A distributed network of digital heritage information - Enno Meijers
Slides of my keynote at the CLARIAH Toogdag 2018 on 9 March at the National Library of the Netherlands. The main topics were the development of the distributed digital heritage network and the alignment to and cooperation with the CLARIAH infrastructure and data. It also points at some of the current limitations of the semantic web technology.
A distributed network of digital heritage information - Unesco/NDL India - Enno Meijers
These slides were presented at the Knowledge Engineering for Digital Library Design Workshop in New Delhi on 25 October 2017. The workshop was organised by Unesco and the National Digital Library of India.
A distributed network of digital heritage information by Enno Meijers - Europ... - Europeana
The document discusses the Digital Heritage Network (NDE) in the Netherlands, which aims to increase access to digital heritage information by developing a distributed network. It outlines the NDE's three-layered approach focusing on sustainability, usability, and visibility. Key challenges include poor semantic alignment and data integration issues. The network will implement Linked Data principles by maximizing usability of data at the source, building a shared terminology network, and supporting a mix of semantic and physical/virtual integration approaches like federated querying. This will help realize the vision of a semantically integrated yet distributed network for digital heritage discovery.
Session 1.4 a distributed network of heritage information - semanticsconference
This document discusses strategies for improving discovery of digital heritage information across Dutch cultural institutions. It identifies problems with the current infrastructure based on OAI-PMH including lack of semantic alignment and inefficient data integration. The proposed strategy is to build a distributed network based on Linked Data principles, with a registry of organizations and datasets, a knowledge graph with backlinks to support resource discovery, and virtual data integration using federated querying of Linked Data sources. This will improve usability, visibility, and sustainability of digital heritage information in the Netherlands.
A distributed network of digital heritage information - Semantics Amsterdam - Enno Meijers
This document discusses strategies for improving discovery of digital heritage information across Dutch cultural institutions. It identifies problems with the current infrastructure based on OAI-PMH including lack of semantic alignment and inefficient data integration. The proposed strategy is to build a distributed network based on Linked Data principles, with a registry of organizations and datasets, a knowledge graph with backlinks to support resource discovery, and virtual data integration using federated querying of Linked Data sources. This will improve usability, visibility, and sustainability of digital heritage information in the Netherlands.
20191210 NDLI KEDL2019 Building the dutch digital heritage network - Enno Meijers
This document provides an overview of the Dutch Digital Heritage Network (NDE) programme. It discusses building a digital heritage network by rethinking the current digital network to focus on users and be more interconnected. The current work involves improving data usability at the source, a shared terminology service, registering organizations and datasets, and rethinking the aggregation landscape. Lessons learned include the time needed to implement new technologies and the importance of examples. Next steps involve maturing the design based on current infrastructure and decentralizing implementations.
The document discusses aggregation as an intervention tactic to improve discoverability of online content. It argues that early web approaches focused on human accessibility but hid complexity, while aggregation can expose relationships and make content more understandable and findable by machines. Done strategically with purposes of engagement, value-adding, and enhancing discoverability through promiscuous metadata, aggregation can help unlock online riches.
The Digital Heritage Network in the Netherlands is working on a Linked Data based approach for improving the visibility of Digital Heritage information.
The benefits of Linked Data are well known, but the supporting software ecosystem is still somewhat lacking. During this presentation we will look into the approach taken by Joinup: How we start from a formalized ontology, and map this to the Joinup website. We’ll give an overview of the Open Source components that we created for building linked data based CMS applications.
2010 EGITF Amsterdam - Gap between GRID and Humanities - Dirk Roorda
The document discusses the potential role of grid computing technologies in supporting humanities research. It outlines several European research infrastructure projects that aim to apply such technologies, including CLARIN, DARIAH, CESSDA and SHARE. While these projects initially saw grid as enabling workflows and sustainable tools/data, they found the technology required significant customization and expertise. The document argues that humanities research involves small, linked data analyzed through both automated and human-guided methods. It proposes that "virtual use cases" outlining generic, infrastructure-level tasks could help bridge the gap between such research and complex technologies like the grid.
NordForsk Open Access Reykjavik 14-15/8-2014: NeIC - NordForsk
This document discusses the Nordic e-Infrastructure Collaboration (NeIC) and its efforts to promote open access to research data. NeIC facilitates the development of high-quality e-infrastructure solutions for its member countries. It engages with Nordic research communities to help make their data easily shared and addresses challenges around sensitive data. One example is Tryggve, a Nordic platform for securely storing, computing, and legally sharing sensitive biomedical and health research data while protecting individual privacy. Tryggve aims to strengthen Nordic research collaboration by increasing data findability, reuse, and the potential size of research cohorts.
(Big) bibliographic data @ ScaDS project meeting - 2015-06-12 - Felix Lohmeier
The document discusses big bibliographic data from UB Leipzig and SLUB Dresden libraries. It notes that libraries are becoming data hubs and describes the libraries' metadata including resources like books, journals, and accessibility information. The libraries are working together on projects like finc and d:swarm to process and integrate metadata, link authority files, and discover resources through a unified search interface. Challenges include scaling the graph database d:swarm to handle large metadata volumes for data integration and enrichment.
Building COVID-19 Museum as Open Science Project - vty
This document discusses building a COVID-19 Museum as an open science project. It describes the speaker's background working on various data management projects. It discusses moving towards open science and sharing data according to FAIR principles. It outlines the Time Machine project for digitizing historical documents and its approach to data management. The rest of the document discusses using the Dataverse platform to build repositories, linking metadata to ontologies, using tools like Weblate for translations, and exploring the use of artificial intelligence and machine learning to enhance metadata and facilitate human-in-the-loop review processes.
A summary of DBpedia's History and a detailed analysis of challenges and solutions.
We show how the Linked Data Cloud evolved around DBpedia and also what problems we and other data projects encountered. We included a section on the new solutions that will lead DBpedia into a bright future.
The document discusses integrating the RSpace electronic lab notebook (ELN) with the University of Edinburgh's research data management services. It describes how RSpace can link to files stored in Edinburgh's DataStore storage system, export data and metadata to the DataShare research data repository, and archive data long-term in the future DataVault archive. The integration helps researchers manage and share their data across different projects and institutions while complying with the university's RDM policy. RSpace provides a convenient interface for researchers, while the services help institutions meet requirements for data storage, publication, and preservation.
The document summarizes a presentation about using the Hydra framework to build an institutional repository at the University of Hull. Some key points:
- Hydra allows the repository to support different types of content through customizable templates and handle relationships between items.
- The repository has been used to archive research outputs, events, student works, and experimental data from the history department.
- Customizations were made to integrate maps, DOIs, and additional metadata fields for different data management needs.
- The repository provides a platform for data preservation and access, helping the university comply with research policies like those from funders.
Similar to 20170501 Distributed Network of Digital Heritage Information (20)
Building library networks with linked data - Enno Meijers
Slides of my talk at the Semantics Conference in Vienna in 2018. The topic of the talk was the initiative of the National Library of the Netherlands to publish their bibliographic metadata as Linked Data.
Presentation given at the KNVI Congres on 9 November 2017 in Nieuwegein. After a short introduction to Linked Data, it shows how the Koninklijke Bibliotheek and its partners are working to better connect the digital information that is becoming available in ever larger quantities in the library, heritage and science domains. Linked Data technology plays an important role in this.
20170407 Bruikbaar Erfgoed - Week van het Digitaal Erfgoed - Enno Meijers
Within its 'Bruikbaar' (Usable) programme, the Netwerk Digitaal Erfgoed is working on making digital heritage information more usable. The presentation gives an update on the activities, with contributions from Netwerk Oorlogsbronnen (@LizzyJongma) and the Zuiderzeemuseum (Shannon van Muijden).
Presentatie PCDB overleg Utrecht 28 juni 2016 - Enno Meijers
Presentation on applying Linked Data to improve the visibility of library collections, making clear how the Koninklijke Bibliotheek will commit to this in the coming years.
Slimmer werken met metadata COPE 25 mei 2016 - Enno Meijers
Presentation given at the COPE congress on 25 May 2016 in Duiven on the possibilities for organising the metadata flows in the library network more intelligently.
Presentation given in the Mode Thesaurus workshop at the Modemuseum in Antwerp on 23 May. It shows what is needed to convert a collection description to Linked Data so that it can be connected to other Linked Data sources, among them the terminology network that will be realised as part of the Netwerk Digitaal Erfgoed.
Developing a national digital library stapel - meijers 20160302 - Enno Meijers
In 2015, the Koninklijke Bibliotheek (KB) became legally responsible for the digital infrastructure of the Dutch public libraries.
The KB wants to offer a platform where people and information come together. Their most important task for the years to come is the development of a national digital library - together with their partners in the network.
In this session, representatives from the KB will present their approach towards the Dutch digital library infrastructure. They will address some issues and welcome input from colleague librarians that are facing the same challenges.
Presentatie KNVI Congres 12 november 2015
Four steps to use Linked Data to make information more visible to the customer. From publishing on the web to publishing in the web.
Nbc+ deel 2 informatiedag lokale collecties - Enno Meijers
Presentation on the role of the Nationale Bibliotheek Catalogus of Bibliotheek.nl in improving the findability of local library and heritage collections. Presented on 30 September 2014 at the KB as part of the information day on the project "Bibliotheekcollecties in het Netwerk".
Presentatie nl.dbpedia.org Datasalon 8 Gent 24 Februari 2012 - Enno Meijers
A short plea for setting up a Dutch-language version of DBpedia, to make the collections of libraries, archives and museums easier to find and simpler to link to other relevant information sources. As part of the Open Zoekplatform, Stichting Bibliotheek.nl wants to make a first start on setting up the NL domain. It will only succeed, however, if enthusiastic people and organisations put their shoulders to the wheel. There is work to be done for programmers, data specialists and Wikipedians. Send us an email if you want to help!
20170501 Distributed Network of Digital Heritage Information
1. A distributed network of digital heritage information
Enno Meijers - KB / Netwerk Digitaal Erfgoed
DANS Seminar - 1 May 2017
2. Contents
● The Dutch Digital Heritage Network
● Current aggregation practice
● A distributed web of cultural heritage information
● Design issues
● Roadmap
3. Netwerk Digitaal Erfgoed (NDE) aims at increasing the social value of the (digital) heritage information maintained by libraries, archives, museums and other cultural heritage institutions.
Long-term cooperation between the government and the institutions at national, regional and local level. It’s about information and people!
4. Three-layered approach for improving the sustainability, the usability and the visibility of digital heritage information.
5.
6. Typical aggregation process: collect → clean → enrich → harmonize → select → index → present
● OAI-PMH based => data is copied
● isolated improvements => datasource stays unchanged
● one size fits all => possible loss of details
● aimed at presentation => possible inefficient use of data
● presentation at portal level => no feedback to the datasource
Based on a repository-centric approach...
11. Developing a resource-centric approach
First step: Implement Linked Data principles in the datasources
● use formal (sustainable!) URIs that describe the resources
○ specify a URI strategy for the Digital Heritage Network
○ implement persistent URIs (handles)
● provide resource URIs that deliver the data in RDF
● use of formal definitions for persons, places, concepts, events (URIs)
○ provide Linked and Open reference sources to link to (thesauri, taxonomies, etc)
○ build a Knowledge Graph for Dutch digital heritage
○ align and link related terms
● add support for cross-domain discovery (EDM, Schema.org,...)
=> but how can this help our ‘Googling’ users to find our information?
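As a minimal sketch of what these principles could look like in practice, the Turtle below describes a single museum object with a persistent URI, formal term links, and cross-domain types (EDM, Schema.org). All URIs, titles and identifiers here are illustrative assumptions, not real endpoints.

```turtle
# Hypothetical example: a museum object published following the
# principles above. All URIs are illustrative.
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix schema:  <http://schema.org/> .
@prefix edm:     <http://www.europeana.eu/schemas/edm/> .

<http://data.museum1.nl/object1>
    a edm:ProvidedCHO , schema:Painting ;       # cross-domain types for discovery
    dcterms:title "Still Life with Flowers" ;   # illustrative title
    dcterms:subject <http://knowledgegraph.nde.nl/term1> ;  # shared terminology
    dcterms:spatial <http://sws.geonames.org/2759794/> .    # formal URI for a place
```

Because the object is addressed by a resolvable URI and typed with widely used vocabularies, both domain aggregators and generic crawlers can pick it up without a bespoke export.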
12. Design issues...
A tiny example...suppose a resource exists:
<http://data.museum1.nl/object1>
a nde:painting ;
dcterms:subject <http://knowledgegraph.nde.nl/term1> .
For ‘browsable Linked Data’ you would like to add the inverse relation:
<http://knowledgegraph.nde.nl/term1>
a skos:Concept ;
skos:prefLabel "Flowers" ;
skos:isSubjectOf <http://data.museum1.nl/object1> .
TBL about ‘browsable linked data’ [1]:
In practice, when data is stored in two documents, this means that any RDF statements which relate
things in the two files must be repeated in each.
[1]: https://www.w3.org/DesignIssues/LinkedData.html
13. Possible approaches to make it work
1. Aggregate all the related Linked Data resources:
● collect datadumps and build a large triplestore
● impressive work: LOD Laundromat
2. Use federated querying:
● doesn’t work very well using regular triplestores
● implement Linked Data Fragments?!
● realistic for a network with about 1500 nodes?
● use logic for predicting most relevant endpoint [1]?
[1]: Routing Memento Requests Using Binary Classifiers
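A federated query over such a network might look like the SPARQL sketch below, which reuses the deck's example resources; the endpoint URLs are illustrative assumptions. It finds all objects in museum1's collection whose subject term carries the label "Flowers" in the shared knowledge graph.

```sparql
# Hypothetical federated query; endpoint URLs are illustrative.
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX skos:    <http://www.w3.org/2004/02/skos/core#>

SELECT ?object WHERE {
  SERVICE <http://knowledgegraph.nde.nl/sparql> {   # shared terminology source
    ?term skos:prefLabel "Flowers" .
  }
  SERVICE <http://data.museum1.nl/sparql> {         # institutional datasource
    ?object dcterms:subject ?term .
  }
}
```

With Linked Data Fragments the client would decompose such a query into simple triple-pattern requests itself, which keeps the per-institution servers cheap to run but shifts the join work (and its performance cost) to the client.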
14. Or should we build a registry?
A decentralized infrastructure for the efficient management of resources in the web of data
- Michalis Stefanidakis, Ioannis Papadakis - SWIM 2012
16. NDE’s strategy for resource discovery
1. improve the usability of the data at the source:
- implement linked data principles
- align references for people, places, events, concepts (URIs!)
2. build a knowledge graph for Dutch digital heritage
3. build a registry to link to the resources:
- use formal definitions for organisations and datasets (DCAT/DataID)
- send backlinks to registry (using Linked Data Notifications?)
4. support selective collection of data by federated querying (or harvesting):
- use registry and knowledge graph for selecting the resources
- use Linked Data Fragments for federated querying?
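A registry entry along the lines of step 3 could be sketched in DCAT as below; the registry URIs and names are illustrative assumptions.

```turtle
# Hypothetical registry entry: an organisation and its dataset in DCAT.
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix foaf:    <http://xmlns.com/foaf/0.1/> .

<http://registry.nde.nl/org/museum1>
    a foaf:Organization ;
    foaf:name "Museum 1" ;
    dcat:dataset <http://registry.nde.nl/dataset/museum1-collection> .

<http://registry.nde.nl/dataset/museum1-collection>
    a dcat:Dataset ;
    dcterms:title "Museum 1 collection" ;
    dcat:distribution [
        a dcat:Distribution ;
        dcterms:title "SPARQL endpoint" ;
        dcat:accessURL <http://data.museum1.nl/sparql>
    ] .
```

A backlink could then be announced by POSTing a small notification, for example the single triple `<http://data.museum1.nl/object1> dcterms:subject <http://knowledgegraph.nde.nl/term1>`, to the registry's Linked Data Notifications inbox.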
18. Roadmap from OAI-PMH to LOD (diagram). The figure shows the collection registration systems (CRS) and the aggregators at national, regional and thematic level evolving through four stages:
● current situation: OAI-PMH based aggregator(s), exchanging “Object ID” and “term string”
● Knowledge Graph enabled: term publication via openSKOS; “Object ID” combined with Term URIs
● Linked Data enabled: Linked Data interfaces exposing Object URIs and Term URIs
● Linked Data Platform compliant: LD APIs, with a REGISTRY for organizations, datasets and links to objects, and a KNOWLEDGE GRAPH for digital heritage
21. TBL on Decentralization:
This is a principle of the design of distributed systems, including societies. It points out that any single common point which is involved in any operation tends to limit the way the system scales, and produce a single point of complete failure.
Interesting work:
● Interoperability and FAIRness [1]
● IPFS and Linked Open Data [2]
● https://discuss.ipfs.io/t/whos-using-ipfs-in-libraries-archives-and-museums/130
[1]: Interoperability and FAIRness through a novel combination of Web technologies - Mark D. Wilkinson, Ruben Verborgh et al. - April 2017 - https://peerj.com/articles/cs-110/
[2]: Sharing Linked Open Data over Peer-to-Peer Distributed File Systems: The Case of IPFS - Miguel-Angel Sicilia, Salvador Sánchez-Alonso - November 2016 - DOI: 10.1007/978-3-319-49157-8_1
A distributed network of digital heritage information??
22. Thank you for your attention!
Please share your thoughts with me...
enno.meijers@kb.nl
twitter: ennomeijers
23. Questions for the panel / audience
1. The ideal of the Web of Data seems to be unrealistic because links cannot be used in a bidirectional way. What are your expectations in regard to the future possibilities of real ‘browsable linked data’? Will a central infrastructure like a triplestore, registry or index always be necessary? Will an ideal implementation of the ‘read/write’ web solve this, or a distributed web technology like IPFS?
2. Should web resources by default support a way to register backlinks, similar to the mailbox idea of Linked Data Notifications?