The document discusses challenges in preserving linked data. It describes the PRELIDA project, which aims to identify differences between the linked/open data and digital preservation communities. Key issues include whether preserving linked data requires storing the RDF data alone or additional context as well, and whether it can be treated as a special case of web archiving. Case studies on preserving the DBpedia and Europeana datasets highlight dependencies on external linked datasets and the difficulty of preserving interconnected, evolving data.
The document discusses the PRELIDA project, which aims to identify differences between the linked data and digital preservation communities and to analyse the gap between the two. Its objectives are to collect use cases for the long-term preservation of linked data and to identify the challenges of applying existing preservation approaches to it. Issues discussed include how preservation requirements for linked data differ from those for other data types, and whether linked data preservation can be viewed as a special case of web archiving.
By Sotiris Batsakis & Grigoris Antoniou, presented at the 3rd PRELIDA Consolidation and Dissemination Workshop, Riva, Italy, October 17, 2014. More information about the workshop at: prelida.eu
FAIR Data in Trustworthy Data Repositories Webinar - 12-13 December 2016 | www.eudat.eu | EUDAT
This webinar was co-organised by DANS, EUDAT and OpenAIRE and was held on 12 and 13 December 2016.
Everybody wants to play FAIR, but how do we put the principles into practice?
There is a growing demand for quality criteria for research datasets. In this webinar we will argue that the DSA (Data Seal of Approval for data repositories) and FAIR principles get as close as possible to giving quality criteria for research data. They do not do this by trying to make value judgements about the content of datasets, but rather by qualifying the fitness for data reuse in an impartial and measurable way. By bringing the ideas of the DSA and FAIR together, we will be able to offer an operationalization that can be implemented in any certified Trustworthy Digital Repository.
In 2014 the FAIR Guiding Principles (Findable, Accessible, Interoperable and Reusable) were formulated. The well-chosen FAIR acronym is highly attractive: it is one of these ideas that almost automatically get stuck in your mind once you have heard it. In a relatively short term, the FAIR data principles have been adopted by many stakeholder groups, including research funders.
The FAIR principles are remarkably similar to the underlying principles of DSA (2005): the data can be found on the Internet, are accessible (clear rights and licenses), in a usable format, reliable and are identified in a unique and persistent way so that they can be referred to. Essentially, the DSA presents quality criteria for digital repositories, whereas the FAIR principles target individual datasets.
In this webinar the two sets of principles will be discussed and compared and a tangible operationalization will be presented.
This document discusses standards and metadata. It defines a standard as a document that provides requirements to ensure materials are fit for their purpose. Metadata is defined as "data about data" that describes other data. The document outlines several important metadata standards like Dublin Core, MARC, EAD, MODS and RDF. It provides details on the purpose and specifications of these standards, noting they are used to enhance accessibility, discovery and preservation of information.
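The standards named above differ widely in scope, but Dublin Core is small enough to illustrate in a few lines. The following sketch builds a minimal Dublin Core record in XML using Python's standard library; all field values and the identifier URI are invented for illustration.

```python
# Minimal sketch: serialise a Dublin Core description as XML.
# All field values below are invented for illustration.
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC)

record = ET.Element("record")
for element, value in [
    ("title", "Challenges of Preserving Linked Data"),
    ("creator", "Batsakis, Sotiris"),
    ("date", "2014-10-17"),
    ("format", "application/pdf"),
    ("identifier", "http://example.org/slides/prelida-2014"),  # hypothetical URI
]:
    ET.SubElement(record, f"{{{DC}}}{element}").text = value

print(ET.tostring(record, encoding="unicode"))
```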
EUDAT & OpenAIRE Webinar: How to write a Data Management Plan - July 14, 2016 | www.eudat.eu | EUDAT
2nd Session: July 14, 2016.
In this webinar, Sarah Jones (DCC) and Marjan Grootveld (DANS) talked through the aspects that Horizon 2020 requires of a DMP. They discussed examples from real DMPs and also touched upon the Software Management Plan, which for some projects can be a sensible addition.
HDL - Towards A Harmonized Dataset Model for Open Data Portals (Ahmad Assaf)
This document discusses the need for a harmonized dataset model for open data portals. It describes existing dataset models like DCAT, VoID, CKAN, and others. It proposes classifying metadata into information groups (resource, tag, group, organization) and types (general, ownership, provenance, etc.). The document outlines a process for harmonizing existing models which includes mapping these information groups and types and examining how extras fields are used across different models and portals. The goal is to define a minimum set of metadata needed to build dataset profiles and enable interoperability.
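As a rough illustration of that classification idea, the Python sketch below sorts the fields of a CKAN-style record into information groups and types; the field names and group assignments are invented examples, not the paper's actual mapping.

```python
# Toy illustration of grouping portal metadata fields into information
# groups and types. The concrete field-to-group assignments here are
# examples, not the paper's mapping.
ckan_record = {
    "title": "Air quality measurements 2016",
    "license_id": "cc-by",
    "author": "Environment Agency",
    "tags": ["air", "environment"],
    "organization": "env-agency",
}

information_groups = {
    "resource": ["title"],
    "tag": ["tags"],
    "organization": ["organization"],
}
information_types = {
    "general": ["title", "tags"],
    "ownership": ["author", "organization"],
    "provenance": ["license_id"],
}

# A harmonised profile keeps only the fields every model can express.
minimum_profile = {k: ckan_record[k] for k in ("title", "license_id", "author")}
print(minimum_profile)
```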
EUDAT Research Data Management | www.eudat.eu | EUDAT
The presentation gives an introduction to Research Data Management, explaining why it is important to manage and share data.
November 2016
Metadata harvesting is the automatic collection of metadata from individual repositories using metadata extraction systems or generators. It occurs through analyzing tags and elements like Dublin Core to gather descriptive, technical, and administrative information without human intervention. However, inconsistencies in metadata practices across repositories can cause confusion and insufficient data for service providers harvesting metadata through the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). Improving guidelines, local standards, evaluation, communication, and data quality can help address these harvesting problems.
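On the protocol side, the sketch below issues a plain OAI-PMH ListRecords request and extracts Dublin Core titles from the response; the endpoint URL is a placeholder for a real repository's OAI-PMH base URL.

```python
# Minimal OAI-PMH harvesting sketch using the protocol's ListRecords verb.
# The endpoint URL is a placeholder; substitute a real repository's base URL.
import xml.etree.ElementTree as ET
import requests

BASE_URL = "https://repository.example.org/oai"  # placeholder endpoint
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

response = requests.get(
    BASE_URL, params={"verb": "ListRecords", "metadataPrefix": "oai_dc"}, timeout=30
)
response.raise_for_status()
tree = ET.fromstring(response.content)

# Each record carries a Dublin Core block; pull out the titles.
for record in tree.iterfind(".//oai:record", NS):
    title = record.find(".//dc:title", NS)
    print(title.text if title is not None else "(no title)")
```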
Introduction to Persistent Identifiers | www.eudat.eu | EUDAT
This document provides an introduction to persistent identifiers (PIDs) and their use in the EUDAT system. It defines PIDs as globally unique identifiers that can be used to persistently identify digital objects. The document discusses why PIDs are useful, describing problems with URLs like link rot. It then covers different PID systems like Handle and DOI, as well as EUDAT's use of Handle through the B2HANDLE service. The document also discusses PID policies, use cases, and the B2HANDLE Python library for programmatic PID management.
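Any PID in the Handle system (DOIs included) can be resolved through the public Handle.Net proxy's REST API; the sketch below shows that pattern with a placeholder handle rather than a real PID. The B2HANDLE Python library mentioned above offers a higher-level client for the same kind of operations.

```python
# Sketch: resolve a PID via the public Handle.Net proxy REST API.
# The handle below is a placeholder, not a real identifier.
import requests

handle = "12345/example-object"  # placeholder prefix/suffix
resp = requests.get(f"https://hdl.handle.net/api/handles/{handle}", timeout=30)

if resp.ok:
    # A handle record is a list of typed values (URL, CHECKSUM, ...).
    for value in resp.json().get("values", []):
        print(value["type"], "->", value["data"]["value"])
else:
    print("Handle could not be resolved:", resp.status_code)
```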
Persistent Identifiers in EUDAT services | www.eudat.eu | EUDAT
The EUDAT data domain handles registered data. Each digital object should have a persistent identifier. This persistent identifier is used for: replica identification; identification of the repository of record (in the case of replication); querying of additional information; checksum (time stamped)...
Enabling Accessible Resource Access via Service Providers (Alexander Haffner)
Libraries have become digitized and use information technology to store and manage their resource inventory. Additional metadata describe the properties of non-digital assets as well as of purely digital resources. In particular, as the amount of digital resources increases, there is demand for centralized services for searching and distributing content. Stakeholders in the library and publishing industries have already made progress on archival strategies and on the standardization of preservation strategies. A variety of metadata standards and exchange protocols enable service providers to offer a single access point to resources. However, there is still growing demand for improved resource organization and enhanced quality, particularly in terms of accessibility. This paper presents strategies for increasing the accessibility of resources as a valuable step towards access-for-all. To address accessibility within the publishing chain, we analyse the whole processing chain and identify stakeholders, such as national libraries, and their corresponding responsibilities for the ingest, archival storage and dissemination of digital resources.
The academic research data lifecycle. Session 1.4 of the RDMRose v3 materials.
The JISC-funded RDMRose project (June 2012-May 2013) was a collaboration between the libraries of the Universities of Leeds, Sheffield and York, with the Information School at Sheffield, to provide an Open Educational Resource for information professionals on Research Data Management. The materials were revised between November 2014 and February 2015 for the consortium of North West Academic Libraries (NoWAL).
http://www.sheffield.ac.uk/is/research/projects/rdmrose
B2FIND Integration, Version 4, February 2017 | www.eudat.eu | The aim of this presentation is to illustrate how metadata can be published in the B2FIND catalogue and how EUDAT’s B2FIND metadata catalogue can be integrated.
This document provides an overview of databases, including how data is organized and stored in different types of databases. It discusses the logical components of data like fields, records, and files. The main types of databases are hierarchical, network, relational, multidimensional, and object-oriented. Relational databases store data in tables with rows and columns and relate tables through common data items. Databases are used for both individual and company/shared use and can be local, distributed across networks, or large commercial databases. Security is important because databases contain valuable private information.
Research Data Management Introduction: EUDAT/OpenAIRE Webinar | www.eudat.eu | EUDAT
This webinar discusses research data management. It explains why managing data is important for reproducibility, avoiding data loss, and meeting funder requirements. It outlines Horizon 2020's requirements for open data and describes services from EUDAT and OpenAIRE that can help with the entire data lifecycle from creation to long-term preservation and sharing. The webinar covers best practices like creating data management plans, metadata, using standards, licensing, and selecting repositories to archive and share research data.
In this Business Analysis Training session, you will learn Types of Databases. Topics covered in this session are:
• What is a Database?
• Document-Oriented Database
• Embedded Database
• Graph Database
• Hypertext Database
• Operational Database
• Distributed Database
• Flat-File
To learn more about this course, visit this link: https://www.mindsmapped.com/courses/business-analysis/business-analysis-fundamentals-with-hands-on-training/
This document discusses standardizing data on the web. It notes that data exists in many formats, from informal to curated, and machine to human readable. W3C has focused on integrating data at web scale using standards like RDF, SPARQL, and Linked Data principles. However, converting all data to RDF has challenges. Much data exists as CSV, JSON, XML and does not need full integration. The reality is data on the web is messy with many formats. Developers see converting data as too complex. The document discusses providing tools to publish Linked Data easily, or focusing on raw data without RDF. It notes different approaches can coexist and discusses a workshop on open data formats.
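To make the "publish Linked Data easily" option concrete, here is a minimal sketch that lifts a CSV file into RDF with the rdflib library; the file name, base URI and the choice of FOAF terms are illustrative assumptions.

```python
# Minimal CSV-to-RDF sketch with rdflib. The input file, base URI and
# property choices are illustrative, not a prescribed mapping.
import csv
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, FOAF

EX = Namespace("http://example.org/people/")  # hypothetical base URI
g = Graph()
g.bind("foaf", FOAF)

# people.csv is assumed to have columns: id,name,homepage
with open("people.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        person = EX[row["id"]]
        g.add((person, RDF.type, FOAF.Person))
        g.add((person, FOAF.name, Literal(row["name"])))
        g.add((person, FOAF.homepage, URIRef(row["homepage"])))

print(g.serialize(format="turtle"))
```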
B2SHARE - How to share and store research data using EUDAT’s B2SHARE | www.eudat.eu | EUDAT
B2SHARE is a user-friendly, reliable and trustworthy way for researchers, scientific communities and scientists to store and share small-scale research data from diverse contexts.
5.15.17 Powering Linked Data and Hosted Solutions with Fedora Webinar Slides (DuraSpace)
Fedora is flexible middleware that can handle both simple and complex digital repository use cases. It stores, preserves, and provides access to digital objects while supporting linked data. Fedora is used by ICPSR to power their digital repository and provide long-term preservation of over 250,000 social science research files across various collections. While Fedora provides flexibility and community support, it lacks some database-like features such as SQL querying and bulk operations.
This document summarizes a webinar on metadata for managing scientific research data. The webinar covered why metadata is important for scientific data management, definitions of data and metadata, selected metadata standards including Dublin Core, Darwin Core and FGDC, challenges in generating metadata and opportunities to address these challenges, and advice for getting started with metadata. The webinar emphasized that metadata standards provide guidelines not strict rules, and encouraged participants to keep metadata simple while aiming to facilitate reuse of data.
Presented by Natasha Aburrow-Jones at the CILIP Cataloguing and Indexing Group Conference 2014 at Canterbury on 8 September 2014. Poor quality, non-standardised metadata may not lead directly to the end of the world, but it won't help!
This document provides an overview of CDS/ISIS, an information storage and retrieval system developed by UNESCO. CDS/ISIS allows users to design database structures, create and edit records, and search and retrieve information. It can manage standalone or networked databases. The document discusses basic features and functions of CDS/ISIS, including installing it, running it, creating and editing databases, and searching and retrieving records. It also describes how CDS/ISIS can be used beyond a single computer by publishing to CD-ROMs, sharing over a local network, or publishing databases on the web.
Metadata is data that provides information about other data. It describes elements like the creator, date of creation, file format, and standards used. Metadata can be created automatically by computers or manually by humans. It is typically structured according to standards and metadata schemes to enhance access and understanding. The key purpose of metadata is to facilitate the retrieval of information.
A multimedia database stores different types of media such as text, audio, video and images. It can be organized by linking metadata to the actual media files or by embedding the media files within the database. A multimedia database management system provides support for creating, storing, accessing, querying and controlling multimedia data and formats. Examples of large multimedia databases include YouTube, which serves over 100 million video views daily from a store that has grown to over 45 terabytes, and Google's database of over 33 trillion search entries.
B2FIND - User training, Version 07, June 2017 | www.eudat.eu | B2FIND is EUDAT’s simple, user-friendly metadata catalogue allowing users to discover metadata from a wide range of scientific communities.
The importance of metadata for datasets: The DCAT-AP European standard (Giorgia Lodi)
The document discusses metadata standards for datasets, including DCAT, DCAT-AP, and related standards. It makes three key points:
1. DCAT and DCAT-AP are metadata standards that provide models for describing datasets and their distributions in order to improve discoverability, interoperability, and reuse. DCAT-AP adds constraints to DCAT for use by European data portals.
2. DCAT-AP_IT is the Italian implementation of DCAT-AP, which extends it with additional mandatory properties and controlled vocabularies. It defines core classes and properties for catalogs, datasets, and distributions in RDF.
3. Future developments include DCAT version 2, which introduces new classes and properties.
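A minimal DCAT description, sketched here with rdflib, shows the dataset/distribution split the standard is built around; the URIs and title are invented, and DCAT-AP's mandatory properties and controlled vocabularies are not reproduced.

```python
# Minimal sketch of a DCAT dataset description built with rdflib.
# Titles and URIs are invented; DCAT-AP mandates further properties
# and controlled vocabularies not shown here.
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import DCAT, DCTERMS, RDF

g = Graph()
g.bind("dcat", DCAT)
g.bind("dcterms", DCTERMS)

dataset = URIRef("http://example.org/dataset/air-quality")        # hypothetical
distribution = URIRef("http://example.org/dataset/air-quality/csv")

g.add((dataset, RDF.type, DCAT.Dataset))
g.add((dataset, DCTERMS.title, Literal("Air quality measurements", lang="en")))
g.add((dataset, DCAT.distribution, distribution))

g.add((distribution, RDF.type, DCAT.Distribution))
g.add((distribution, DCAT.mediaType,
       URIRef("https://www.iana.org/assignments/media-types/text/csv")))
g.add((distribution, DCTERMS.license,
       URIRef("http://creativecommons.org/licenses/by/4.0/")))

print(g.serialize(format="turtle"))
```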
The document describes DBpedia, a project that extracts structured data from Wikipedia and makes it available on the Web. DBpedia has extracted over 2.6 million entities from Wikipedia and defined web-dereferenceable identifiers for each. As DBpedia covers many domains, other data sources on the Web have begun linking to DBpedia resources, making DBpedia a central hub. This has resulted in a Web of over 4.7 billion interlinked pieces of data across various domains.
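Because DBpedia exposes a public SPARQL endpoint, its data can be queried directly. A small example with the SPARQLWrapper package follows, fetching the English label of the Berlin resource.

```python
# Sketch: query DBpedia's public SPARQL endpoint with SPARQLWrapper.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://dbpedia.org/sparql")
sparql.setReturnFormat(JSON)
sparql.setQuery("""
    SELECT ?label WHERE {
        <http://dbpedia.org/resource/Berlin> rdfs:label ?label .
        FILTER (lang(?label) = "en")
    }
""")

results = sparql.query().convert()
for binding in results["results"]["bindings"]:
    print(binding["label"]["value"])
```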
This document discusses querying cultural heritage data stored as graphs using SPARQL. It provides examples of retrieving single and sets of triples from the graph and explains how a SPARQL server can perform additional reasoning. Exercises demonstrate querying for object owners and their names, exporting query results to CSV, and counting objects made of different materials.
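In the spirit of those exercises, the sketch below runs a SPARQL aggregate over a tiny in-memory graph and exports the counts to CSV; the sample triples and the ex: vocabulary are invented.

```python
# Count objects per material with a SPARQL aggregate and export to CSV.
# The tiny graph and the ex: vocabulary are invented for illustration.
import csv
from rdflib import Graph

g = Graph()
g.parse(data="""
@prefix ex: <http://example.org/museum/> .
ex:obj1 ex:madeOf ex:Bronze .
ex:obj2 ex:madeOf ex:Bronze .
ex:obj3 ex:madeOf ex:Marble .
""", format="turtle")

rows = g.query("""
    PREFIX ex: <http://example.org/museum/>
    SELECT ?material (COUNT(?obj) AS ?n)
    WHERE { ?obj ex:madeOf ?material }
    GROUP BY ?material
""")

with open("materials.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["material", "count"])
    for material, n in rows:
        writer.writerow([material, n])
```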
RDA is a set of guidelines for cataloging digital resources that is based on FRBR and FRAD models. It addresses shortcomings of AACR2 for describing online resources. The RDA Toolkit provides the full RDA instructions and tools like mappings, workflows and an element set to support efficient RDA implementation. It is maintained by the RDA Steering Committee and aims to produce robust data that clearly defines relationships for discovery of resources in libraries, archives and other cultural heritage organizations.
The document discusses integrating the RSpace electronic lab notebook (ELN) with the University of Edinburgh's research data management services. It describes how RSpace can link to files stored in Edinburgh's DataStore storage system, export data and metadata to the DataShare research data repository, and archive data long-term in the future DataVault archive. The integration helps researchers manage and share their data across different projects and institutions while complying with the university's RDM policy. RSpace provides a convenient interface for researchers, while the services help institutions meet requirements for data storage, publication, and preservation.
Researchers require infrastructures that ensure a maximum of accessibility, stability and reliability to facilitate working with and sharing of research data. Such infrastructures are increasingly summarised under the term Research Data Repositories (RDR). The project re3data.org – Registry of Research Data Repositories – began to index research data repositories in 2012 and offers researchers, funding organisations, libraries and publishers an overview of the heterogeneous research data repository landscape. In December 2014 re3data.org listed more than 1,030 research data repositories, which are described in detail using the re3data.org schema (http://dx.doi.org/10.2312/re3.003). Information icons help researchers to easily identify an adequate repository for the storage and reuse of their data. This talk describes the heterogeneous RDR landscape and presents a typology of institutional, disciplinary, multidisciplinary and project-specific RDR. Further, it outlines the features of re3data.org and shows current developments toward integration with data management planning tools and other services.
By the end of 2015 re3data.org and Databib (Purdue University, USA) will merge their services, which will then be managed under the auspices of DataCite. The aim of this merger is to reduce duplication of effort and to serve the research community better with a single, sustainable registry of research data repositories. The talk will present this organisational development as a best practice example for the development of international research information services.
This presentation gives details on technologies and approaches towards exploiting Linked Data by building LD applications. In particular, it gives an overview of popular existing applications and introduces the main technologies that support implementation and development. Furthermore, it illustrates how data exposed through common Web APIs can be integrated with Linked Data in order to create mashups.
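That mashup pattern, pulling JSON from a Web API and re-expressing it as RDF so it can be merged with Linked Data, can be sketched in a few lines; the API URL and response shape below are hypothetical.

```python
# Sketch of the mashup pattern: pull JSON from a Web API and re-express
# it as RDF for merging with Linked Data. The API URL and the response
# fields are hypothetical.
import requests
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import FOAF

EX = Namespace("http://example.org/api/")  # hypothetical base URI
g = Graph()

resp = requests.get("https://api.example.org/artists", timeout=30)  # hypothetical
for item in resp.json():  # assumed shape: [{"id": ..., "name": ...}, ...]
    artist = EX[str(item["id"])]
    g.add((artist, FOAF.name, Literal(item["name"])))

# g can now be merged (g += other_graph) with triples fetched from
# SPARQL endpoints or dereferenced Linked Data URIs.
print(len(g), "triples")
```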
This document discusses data accessibility and challenges. It covers the data life cycle, including planning, generating data, reliability, ownership, metadata, versioning, and publishing. It discusses expectations for accessing and sharing data. Open access data policies are encouraged by research funders, journals, and initiatives like DataCite to assign identifiers to research data. Data can be shared through repositories, journals, websites, or informally between researchers. Factors that affect sharing and accessing data include size, computing needs, standards, repositories, data nature, governance, and metadata.
The Web of Linked Open Data, or LOD, is the most significant achievement of the Semantic Web to date. Initially proposed by Tim Berners-Lee in a seminal paper published in Scientific American in 2001, the Semantic Web envisions a web where software agents can interact with large volumes of structured, easy-to-process data. Users now have at their disposal the first mature results of this vision, most notably the various LOD initiatives and projects that publish open data in standard formats like RDF.
This presentation provides an overview and comparison of different LOD initiatives in the area of patent information, and analyses potential opportunities for building new information services based on widely available datasets of patent information. The information is based on interviews conducted with innovation agents and on the analysis of professional bibliography and current implementations.
LOD opportunities are not restricted to information aggregators; they extend to end users and innovation agents who must deal with large amounts of data. In both cases, the opportunities offered by LOD need to be assessed, as LOD has become a standard, universal method to distribute, share and access data.
Linked Data for the Masses: The approach and the Software (IMC Technologies)
Title: Linked Data for the Masses: The approach and the Software
@ EELLAK (GFOSS) Conference 2010
Athens, Greece
15/05/2010
Creator: George Anadiotis (R&D Director)
Bio2RDF converts over 40 life science databases, comprising over 30 billion triples, to semantic web formats to support biological discovery. It provides interlinked data through SPARQL endpoints at various locations. The presentation discusses Bio2RDF's methodology for converting, providing, and enabling the reuse of data based on linked open data principles, in order to encourage the original data providers to publish RDF directly and link to other data sources.
Digital Repositories: Essential Information for Academic Librarians (Jeffrey Beall)
This presentation provides essential information for academic librarians about digital repositories. It describes institutional, disciplinary, and data repositories and gives examples of each. The presentation also looks at the current state of access, focusing on OAI-PMH, and it examines digital preservation for IRs. Academic libraries that host repositories essentially become publishers, and this responsibility has many implications for libraries. The talk closes with a brief look at the proposed "all-scholarship repository" (ASR).
This paper surveys the landscape of linked open data projects in cultural heritage, examining the work of groups from around the world. Traditionally, linked open data has been ranked using the five-star method proposed by Tim Berners-Lee. We found this ranking to be lacking when evaluating how cultural heritage groups not merely develop linked open datasets, but find ways to use linked data to augment user experience. Building on the five-star method, we developed a six-stage life cycle describing both dataset development and dataset usage. We use this framework to describe and evaluate fifteen linked open data projects in the realm of cultural heritage.
The document summarizes a presentation about using the Hydra framework to build an institutional repository at the University of Hull. Some key points:
- Hydra allows the repository to support different types of content through customizable templates and handle relationships between items.
- The repository has been used to archive research outputs, events, student works, and experimental data from the history department.
- Customizations were made to integrate maps, DOIs, and additional metadata fields for different data management needs.
- The repository provides a platform for data preservation and access, helping the university comply with research policies like those from funders.
This presentation was provided by Abigail Sparling and Adam Cohen of The University of Alberta Library, during the NISO webinar "Implementing Linked Library Data," held on November 13, 2019.
This document provides an overview of research data management (RDM) priorities, stakeholders, and practices from the perspective of the University of Edinburgh. It discusses the university's RDM roadmap, which aims to implement RDM services and support over multiple phases by April 2015. Key services discussed include general RDM support and consultancy, support for data management planning, storage and collaboration facilities, and tools for long-term data management and deposit. The roles of key university committees in overseeing the RDM program are also outlined. Finally, the document discusses the university's communications plan to raise awareness of RDM among researchers and support staff.
Morning session talk at the second Keystone Training School, "Keyword Search in Big Linked Data", held in Santiago de Compostela.
https://eventos.citius.usc.es/keystone.school/
Similar presentations:
The document describes a semantic recommendation system for helping customers select fish for an aquarium. The system takes into account various criteria like temperature, predator/prey relationships between fish, food requirements, ecosystem needs, size, and color preferences. It integrates data from multiple sources and uses semantic technologies like ontologies and linked data to make personalized recommendations based on a user's needs and preferences. The system aims to connect people interested in fish keeping through a social network application.
SyrtAPI is a new entertainment platform that combines music and book data from multiple sources like MusicBrainz and NYTimes reviews. It uses Linked Data and SPARQL queries to extract lyrics and reviews and recommend songs that match the content of the books. The team learned about using Linked Data vocabularies and linking datasets while building the prototype, which currently retrieves 25 lyrics and 10 book reviews through its pipeline. Future work includes adding more data sources, developing a mobile app, and using natural language processing to better analyze texts.
Keep fit (a bit) - ESWC SSchool 14 - Student project (eswcsummerschool)
The document presents a web-based project called "Keep Fit(a Bit) in Kalamaki" which aims to make Kalamaki, Greece a smart city by developing a personalized health planner. The planner integrates data on restaurants, dishes, energy/calorie content, prices, and walking distances to provide personalized recommendations to help users like Fred stay fit on holidays in Kalamaki. The project team collected data from various sources and implemented a prototype interface that allows users to view personalized recommendations. Future steps include publishing more Kalamaki data, adding social features, and integrating additional health and weather data.
This document discusses the creation of an Arabic sentiment lexicon and finding related entities from Arabic text. It involves processing Arabic financial text data by tagging parts of speech, removing stop words, translating verbs and adjectives to English using Google Translate, stemming the words, and using an existing English sentiment lexicon like SentiWordNet to assign positive and negative sentiment scores. Related entities are extracted using a window-based approach to find nouns occurring near sentiment words. The process aims to create an Arabic sentiment lexicon and identify related entities to help with sentiment analysis on Arabic text.
FIT-8BIT: An activity music assistant - ESWC SSchool 14 - Student project (eswcsummerschool)
The document discusses the advantages of music in sports. It outlines five key ways music can influence preparation and performance: 1) dissociation to lower effort perception, 2) arousal regulation as a stimulant or sedative, 3) synchronization with exercise for increased output, 4) positive impact on acquiring motor skills, and 5) attainment of flow state. It also discusses links between sport and music, defining tempo rhythms, and providing scenarios for a music application interface and workflow.
Personal Tours at the British Museum - ESWC SSchool 14 - Student project (eswcsummerschool)
This document discusses creating personal tours at the British Museum using a mobile app. It would allow visitors to choose a starting point and then be suggested next artifacts to view based on their interests, time constraints, and what they liked or disliked. Challenges included data issues and changing requirements, but enriching descriptions, collecting visitor analytics, and adding game elements could improve the experience.
Empowering fishing business using Linked Data - ESWC SSchool 14 - Student project (eswcsummerschool)
This document describes a project to create a visualization of current fish population and fishing legislation around the world. The project, called PYTHEIA, will provide information to fishing businesses to help them choose suitable new locations by linking data on fish populations, laws, and management from various sources. It outlines the user scenario, workflow, ontology developed to represent the data, and plans for the user interface and enhancing the system in the future.
Tutorial: Social Semantic Web and Crowdsourcing - E. Simperl - ESWC SS 2014 (eswcsummerschool)
This document discusses combining the social web and semantic web through crowdsourcing. It defines key concepts like the social web, crowdsourcing, and semantic technologies. It then provides examples of how semantic tasks can be crowdsourced, such as annotating research papers, mapping topics to ontologies, and curating linked data. Challenges with crowdsourcing semantic tasks are also explored, such as how to optimally structure tasks and validate crowd responses.
Keynote: Global Media Monitoring - M. Grobelnik - ESWC SS 2014 (eswcsummerschool)
This document discusses tools and techniques for monitoring global media data and events. It introduces several systems developed at the Jozef Stefan Institute for collecting news articles from around the world, enriching documents with semantic annotations, linking information across languages, and analyzing news reporting bias. It also addresses representing events with structured and semantic descriptions and tracking how topics evolve over time through an event registry system. The overall goal is to establish an integrated real-time pipeline for processing multilingual media, identifying events, and providing visualization of global event dynamics.
Hands On: Amazon Mechanical Turk - M. Acosta - ESWC SS 2014 (eswcsummerschool)
This document provides an overview of Amazon Mechanical Turk (MTurk) and how it can be used for crowdsourcing projects. It discusses key MTurk concepts like requesters, workers, HITs, assignments, and qualifications. It then walks through the steps to create an MTurk project, including defining the HIT properties, previewing templates, creating batches, publishing HITs, and reviewing results. Finally, it discusses best practices like testing HITs in the sandbox environment and monitoring worker forums.
Tutorial: Querying a Marine Data Warehouse Using SPARQL - I. Fundulaki - ESWC SS 2014 (eswcsummerschool)
This document describes querying a marine data warehouse using SPARQL. It discusses the MarineTLO ontology used to integrate data about marine species from multiple sources. Examples are provided of SPARQL queries against the MarineTLO warehouse to retrieve information about species, their distributions, relationships and more. A series of 21 example queries are also listed that demonstrate different ways of interrogating the semantic data in the warehouse.
This document discusses different data formats for representing cultural data on the web and their pros and cons, including CSV, RDBMS, XML/SOAP, and JSON/REST. It advocates for using URIs, HTTP, and semantic web standards like RDF and SPARQL to represent cultural data in a way that is distributed, extensible, and links related resources on the web.
The document outlines the schedule and activities for a summer school on semantic web technologies. The summer school will include tutorials on topics such as linked data, ontologies, and data publishing/preservation. Students will work in groups on mini-projects with guidance from tutors. There will be keynote speakers each day and social events planned. The goal is for students to learn practical skills through hands-on experience while interacting with peers and experts in the field.
This document outlines the goals and instructions for a hands-on session to publish a dataset as linked data. The session will divide participants into three groups to work on creating, interlinking, and publishing the RDF dataset. Each group will have 40 minutes to select vocabularies, design URIs, transform tabular data into RDF, select target datasets to link to, create metadata using VoID, and select a license. Then each group will present their work in 1 minute without slides. The overall goal is to accomplish the tasks of creating, interlinking, and publishing the RDF dataset.
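For the VoID step of that hands-on, a self-description of the published dataset might look like the following rdflib sketch; every URI and the triple count are invented placeholders.

```python
# Minimal VoID self-description for a published dataset. All URIs and
# the triple count are invented for illustration.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, DCTERMS, XSD

VOID = Namespace("http://rdfs.org/ns/void#")
g = Graph()
g.bind("void", VOID)
g.bind("dcterms", DCTERMS)

ds = URIRef("http://example.org/dataset")  # hypothetical dataset URI
g.add((ds, RDF.type, VOID.Dataset))
g.add((ds, DCTERMS.title, Literal("Example dataset")))
g.add((ds, DCTERMS.license, URIRef("http://creativecommons.org/licenses/by/4.0/")))
g.add((ds, VOID.sparqlEndpoint, URIRef("http://example.org/sparql")))
g.add((ds, VOID.triples, Literal(12345, datatype=XSD.integer)))

print(g.serialize(format="turtle"))
```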
The document discusses processing linked data at high speeds using the Signal/Collect graph algorithm framework. It provides examples of how Signal/Collect can be used to perform tasks like RDFS subclass inference and PageRank calculation on semantic graphs. It also summarizes performance results showing that TripleRush, an implementation of Signal/Collect, outperforms other graph processing systems on benchmark datasets. Finally, it discusses ongoing work on graph partitioning with TripleRush.
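To illustrate the kind of computation involved, without reproducing the Signal/Collect API, the sketch below projects an RDF graph's subject-to-object links into networkx and runs its PageRank implementation on a toy graph.

```python
# Illustration of PageRank over the link structure of an RDF graph.
# Uses networkx rather than Signal/Collect, purely to show the idea.
import networkx as nx
from rdflib import Graph, URIRef

g = Graph()
g.parse(data="""
@prefix ex: <http://example.org/> .
ex:a ex:links ex:b . ex:b ex:links ex:c . ex:c ex:links ex:a .
ex:a ex:links ex:c .
""", format="turtle")

# Project the RDF triples onto a directed graph of subject -> object edges.
nxg = nx.DiGraph()
for s, p, o in g:
    if isinstance(s, URIRef) and isinstance(o, URIRef):
        nxg.add_edge(str(s), str(o))

for node, score in sorted(nx.pagerank(nxg).items(), key=lambda kv: -kv[1]):
    print(f"{score:.3f}  {node}")
```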
This document discusses knowledge engineering and the use of knowledge on the web. It covers web data representation using standards like RDF, HTML5 and SKOS. It discusses categorizing knowledge from different sources and aligning categories. It also discusses using knowledge through techniques like visualization, graph-based search across linked data, and improving search through vocabulary alignment and location-based queries.
This document provides an overview of querying linked data using SPARQL. It begins with an introduction and motivation for querying linked data. It then covers the basics of SPARQL including its components like prefixes, query forms, and solution modifiers. Several examples are provided demonstrating how to construct ASK, SELECT, and other types of SPARQL queries. The document also discusses SPARQL algebra and updating linked data with SPARQL 1.1.
This document provides an overview of SPARQL, the query language for retrieving and manipulating data stored in RDF format. It describes the basic components of SPARQL including triple patterns, basic graph patterns, group graph patterns, filters, and how these patterns are matched against RDF data to retrieve variable bindings. It also gives a brief introduction to SPARQL 1.1 features for querying and updating RDF stores.
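The building blocks listed here (query forms, filters, solution modifiers) can be tried against an in-memory rdflib graph; the two queries below demonstrate ASK with a FILTER, and SELECT with ORDER BY, over invented data.

```python
# Small demonstration of SPARQL building blocks (ASK, SELECT, FILTER,
# ORDER BY) against an in-memory rdflib graph with invented data.
from rdflib import Graph

g = Graph()
g.parse(data="""
@prefix ex: <http://example.org/> .
ex:alice ex:age 30 .
ex:bob   ex:age 42 .
""", format="turtle")

# ASK: does any resource have an age over 40?
print(g.query("""
    PREFIX ex: <http://example.org/>
    ASK { ?p ex:age ?a . FILTER (?a > 40) }
""").askAnswer)

# SELECT with a solution modifier: resources ordered by age, descending.
for person, age in g.query("""
    PREFIX ex: <http://example.org/>
    SELECT ?p ?a WHERE { ?p ex:age ?a } ORDER BY DESC(?a)
"""):
    print(person, age)
```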
The document discusses the development of K-HAL, an AI assistant, through three versions. K-HAL V1.0 used a simple ontology and knowledge base to represent spacecraft piloting knowledge. K-HAL V2.0 leveraged existing online ontologies and linked data to expand its knowledge. It also used crowdsourcing to add new facts. K-HAL V3.0 would represent processes by modeling interactions between a virtual choir, conductor, and listeners. The conclusion advocates reusing and sharing ontologies and data to benefit the semantic web community.
2. Synopsis
● Description of use cases related to the long-term
preservation of, and access to, Linked Data.
● Identification of challenges, problems and limitations
of existing preservation approaches when
applied to Linked Data.
● Identification of differences, and analysis
of the gap, between two communities:
● Linked Data
● Digital Preservation
3. PRELIDA project
● The PRELIDA project's objectives include the
identification of differences, and the analysis
of the gap, between two communities:
Linked Data (or Linked Open Data), as part of
Semantic Web technology, and Digital
Preservation, as discussed in the context of
archives and digital libraries.
4. Issues (1/3)
● What are the differences between Linked Data
and other types of data with respect to digital
preservation requirements?
● Can Linked Data preservation be reduced to
reliably storing RDF data, or do additional
issues arise that must be taken into account?
● Can Linked Data preservation be viewed as a
special case of Web archiving, or do these two
tasks (Linked Data preservation and Web
preservation) have different archiving
requirements?
5. Issues (2/3)
● What are the differences between preserving Linked
Data and preserving other special types of data
(e.g., multimedia content, software)?
● Must the full functionality associated with an RDF
dataset (e.g., SPARQL endpoints) also be
preserved, or just the data?
● Do previous versions of Linked Data need to be
preserved at all?
6. Issues (3/3)
● If there is a reason for preserving Linked Data,
does this reason apply directly to datasets
related to the specific Linked Dataset that must
be preserved?
● Does preservation imply keeping track of
changes on other datasets directly or indirectly
connected to a specific RDF dataset?
● Does Linked Data complicate preservation
requirements in terms of stakeholders, rights
over data, ownership of an interlinked dataset
and ownership of the archived version?
7. Digital Preservation introduction
● “Activities ensuring access to digital objects
(data and software) as long as it is required. In
addition to that, preserved content must be
authenticated and rendered properly upon
request.”
● Problems
● File format obsolescence
● Storage medium failure
● Value and function of the digital object cannot be
determined anymore
● Simple loss of the digital objects
8. Preservation Recommendations
● Using file formats based on open standards
● Using the services of digital archives to store
the objects for the long-term
● Creating and maintaining high-quality
documentation, specifically developed to create
preservation metadata, so that the digital
objects can be reused in the future
● Making use of multiple storage facilities to
reduce the risk that the objects get lost.
9. OAIS reference model
● The OAIS reference model (Reference Model for an
Open Archival Information System) establishes a
common framework of terms and concepts
relevant to the long-term archiving of digital data.
● The model details the processes around and inside
the archive, including the interaction with the
user, but it does not make any statements about
which data need to be preserved.
● The OAIS reference model has been adopted as ISO
standard 14721.
10. OAIS definitions
● An OAIS is defined as an archive, and an
organisation of people and systems, that has
accepted the responsibility to preserve
information and make it available for a
“Designated Community”.
● A Designated Community is defined as “an
identified group of potential consumers who
should be able to understand a particular set of
information. The Designated Community may
be composed of multiple user communities”.
11. Repository responsibilities (1/2)
● Negotiate for and accept appropriate
information from information Producers.
● Obtain sufficient control of the information
provided to the level needed to ensure Long
Term Preservation.
● Determine, either by itself or in conjunction
with other parties, which communities should
become the Designated Community.
12. Repository responsibilities (2/2)
● Ensure that the information to be preserved is
Independently Understandable to the Designated
Community.
● Follow documented policies and procedures which
ensure that the information is preserved against
all reasonable contingencies, including the demise
of the Archive.
● Make the preserved information available to the
Designated Community and enable the
information to be disseminated as copies of, or as
traceable to, the original submitted Data Objects
with evidence supporting its Authenticity.
14. Linked Data
● Use the Web as a platform to publish and re-use
identifiers that refer to data
● Use a standard data model for expressing the data
(RDF).
15. Publishing RDF Data
● As annotation to Web documents
● RDF data is included within the HTML code of Web pages.
● Software with suitable parsers can then extract the RDF content
from the pages.
● As Web documents
● RDF data is serialized and stored on the Web.
● RDF documents are served next to HTML documents and a
machine can request a specific type of document.
● As a database
● RDF can be stored in optimised graph databases (“triple store”)
and queried using the SPARQL query language.
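As an illustrative sketch of the last two publishing options (a minimal example assuming Python with the rdflib library, which the slides do not mention), the same small graph can be serialised as a Web document and queried as a database:

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDFS

EX = Namespace("http://example.org/")  # hypothetical namespace for the sketch

# Build a tiny RDF graph: the raw statements to be published.
g = Graph()
g.add((EX.prelida, RDFS.label, Literal("PRELIDA project")))

# "As Web documents": serialise the graph to Turtle and store/serve the file.
print(g.serialize(format="turtle"))

# "As a database": query the same statements with SPARQL.
for row in g.query("SELECT ?label WHERE { ?s rdfs:label ?label }",
                   initNs={"rdfs": RDFS}):
    print(row.label)
```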
16. Publishing RDF data example
● DBpedia publishing:
● As annotations through the RDFa markup present in the HTML
page
● As RDF content, via content negotiation on the resource URI
● Via a SPARQL query sent to the endpoint
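A rough sketch of how a client could exercise the last two channels (using Python's requests library; the exact headers and the endpoint's parameter handling are assumptions, not stated in the slides):

```python
import requests

# Content negotiation: ask the resource URI for RDF (Turtle) instead of HTML.
resp = requests.get("http://dbpedia.org/resource/Berlin",
                    headers={"Accept": "text/turtle"})
print(resp.headers.get("Content-Type"))

# SPARQL endpoint: send a query and request JSON results.
resp = requests.get(
    "https://dbpedia.org/sparql",
    params={"query": "SELECT ?p ?o WHERE "
                     "{ <http://dbpedia.org/resource/Berlin> ?p ?o } LIMIT 5",
            "format": "application/sparql-results+json"})
for binding in resp.json()["results"]["bindings"]:
    print(binding["p"]["value"], binding["o"]["value"])
```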
17. Web of data: Different versions
● The “Web” Web of Data: a network of semantically linked
resources published exclusively on the Web.
● This content is exclusively accessible on the Web and
cannot be queried using SPARQL, a query language for
RDF.
● The “Data-base” Web of Data
● A set of RDF statements stored in an optimised database
and made queryable using SPARQL.
● This set of resources uses URIs which are not expected
to be, and most of the time are not, dereferenceable. As
such, this Web of Data is a graph disconnected from the Web.
18. Preservation of Linked Data
● Web Data can be preserved just like any web page, especially
if there is structured data embedded in it (RDFa, Microdata).
● It is possible to extract structured data from any Web page
that contains annotations in order to expose it to the user
via various serialisation formats.
● Database Data can be preserved just like any database. RDF is
to be considered as the raw bits of information which are
serialised in RDF/XML (among others).
● The preservation of such files is similar to what would be
done for relational databases with the goal of providing
data consumers with a serialisation format that can be
consumed with current software.
19. Semantics
● The archiving of a Web document covers its
own text and the other Web resources that are
embedded in it. This view differs for a Web of
resources, where the links between the resources
matter and evolve over time.
● On a global graph interconnecting several data
sources through shared conceptualization, this
context is infinite. The only way to preserve the
Web of Data in a meaningful way would be to
snapshot it entirely, a scenario that is intractable
from an architectural point of view.
20. Overlap
● RDF data dumps are easy to preserve, share,
load and consume. Such dumps are already
widely used on data portals as a way to
publish Linked Open Data; DBpedia archiving
is one example. As long as the URIs in
these files are considered not to have any Web
existence, anyone wishing to grasp the meaning
of the resources at the time of preservation will
have to load the relevant snapshots dated from
that same preservation time.
21. DBpedia introduction
● DBpedia's objective is to extract structured knowledge from
Wikipedia and make it freely available on the Web using
Semantic Web and Linked Data technologies.
● Data is extracted in RDF format
● Can be retrieved
● Directly (RDF)
● Through a SPARQL endpoint
● As Web pages.
● Knowledge from different language editions of Wikipedia is
extracted along with links to other Linked Open Data
datasets.
23. DBpedia data extraction
● Wikipedia content is available using Creative Commons
Attribution-Sharealike 3.0 Unported License (CC-BY-SA) and the
GNU Free Documentation License (GFDL).
● DBpedia content (data and metadata such as the DBpedia ontology)
is available to end users under the same terms and licences as the
Wikipedia content.
24. DBpedia preservation (1/2)
● DBpedia preserves different versions of the entire
dataset by means of DBpedia dumps, corresponding
to a versioning mechanism.
● Besides the versioned dumps of DBpedia,
DBpedia Live keeps track of changes in Wikipedia
and extracts newly changed information from
Wikipedia infoboxes and text into RDF format.
● DBpedia Live also contains metadata about the part
of the Wikipedia text from which the information was
extracted, the user who created or modified the
corresponding data, and the date of creation or last
modification. Incremental modifications of DBpedia
Live are also archived.
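As a minimal sketch of how archived incremental modifications could be replayed (treating a version as a set of triples; this changeset structure is an assumption for illustration, not the actual DBpedia Live format):

```python
def apply_changeset(base: set, added: set, removed: set) -> set:
    """Reconstruct the next version from a base snapshot plus one
    incremental changeset (sets of triples added and removed)."""
    return (base - removed) | added

# Usage: replay archived changesets in order to reach a later version.
v0 = {("dbr:Berlin", "dbo:populationTotal", "3400000")}
v1 = apply_changeset(
    v0,
    added={("dbr:Berlin", "dbo:populationTotal", "3500000")},
    removed={("dbr:Berlin", "dbo:populationTotal", "3400000")},
)
print(v1)
```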
25. DBpedia preservation (2/2)
● The DBpedia dataset contains links to other datasets containing
both definitions and data (e.g., Geonames).
● There are currently more than 27 million links from
DBpedia to other datasets.
● The DBpedia archiving mechanisms also preserve links to
these datasets, but not their content.
● The preserved data is DBpedia content in RDF or tabular
(CSV) format.
● Rendering and querying software is not part of the archive,
although the extraction software (for Wikipedia infoboxes
and text) used to create the DBpedia dataset is
preserved on GitHub.
26. Use Cases
● Based on possible interactions and user
requests:
● Request of archived data in RDF or CSV format
● Request of rendered data in Web format
● Submitting SPARQL queries on the archived
versions of the data
● Time
● Point
● Interval
● External sources involved
27. Use Case 1 - RDF data archiving and retrieval
● DBpedia data (in RDF or tabular CSV format) is
archived and the user requests specific data (or the
entire dataset) as it was at a specific date in the past.
● The preservation mechanism must be able to provide the
requested data in RDF (or tabular) format.
● Timestamps specifying the interval during which the data
was valid (i.e., the date the data was created or modified
for the last time before the date specified in the user
request, and the first modification or deletion date after
that date) are a desirable feature of the mechanism.
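A minimal sketch of such a timestamped archive (plain Python with a deliberately simplified data model; this is illustrative and does not reflect the actual DBpedia archiving implementation):

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass(frozen=True)
class TimestampedTriple:
    subject: str
    predicate: str
    obj: str
    valid_from: date          # creation or last-modification date
    valid_to: Optional[date]  # None while the statement is still current

def snapshot(archive: list, at: date) -> list:
    """Return the triples that were valid on a given past date."""
    return [t for t in archive
            if t.valid_from <= at and (t.valid_to is None or at < t.valid_to)]
```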
28. Use Case 1
● Retrieving data for a specific time interval, e.g.,
2010-2012, instead of a specific date is a more
complex case since all versions of the data and
their corresponding validity intervals with respect
to the request interval must be returned.
● Currently, complex requests involving intervals are
not handled by the DBpedia archiving mechanism.
RDF data containing links to other LOD for specific
time points can be retrieved, but the content behind
links to external LOD is not preserved.
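Extending the sketch from use case 1, an interval request would have to return every triple version whose validity overlaps the requested interval, together with its validity bounds (again an assumption for illustration):

```python
from datetime import date

def versions_in_interval(archive: list, start: date, end: date) -> list:
    """All triple versions whose validity overlaps [start, end],
    e.g. for a 2010-2012 request; each returned version carries
    its own valid_from/valid_to bounds."""
    return [t for t in archive
            if t.valid_from <= end and (t.valid_to is None or start < t.valid_to)]
```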
29. Use case 2: Rendering data as Web page
● The user requests the DBpedia data for a specific
topic, at a given temporal point or interval,
presented in Web page format.
● The preservation mechanism should be able to
return the data in RDF format; in case the
description was modified during the given
interval, all corresponding descriptions, the
interval during which each distinct description
was valid, the modification history, the differences
between versions and the editors should be
returned, as in the first use case.
30. Use case 2
● Rendering the requested data as a Web page
introduces the following problem: can the
functionality of external links be preserved
and supported as well, or not?
● Currently, rendering software is not part of the
preservation mechanism, although using a
standard representation (RDF) minimizes the
risk of not being able to render the data in the
future.
31. Use case 3: SPARQL Endpoint functionality
● Reconstruct the functionality of the DBpedia SPARQL endpoint at a
specific temporal point in the past.
● The user specifies both the query and the time point (this use case
can be extended by supporting interval queries, as in use cases 1
and 2 above). There are different kinds of queries that must be
handled, corresponding to different use cases:
a) Queries over RDF data in the DBpedia dataset only
b) Queries over the DBpedia dataset and datasets directly
connected to the DBpedia RDF dataset (e.g., Geonames)
c) Queries over DBpedia data and external datasets
connected indirectly with DBpedia (i.e., through links to
the datasets of case b).
32. Use case 3
● Currently, SPARQL endpoint functionality is
not directly preserved, i.e., users must
retrieve the data and use their own SPARQL
endpoint to query it.
● They will then be able to issue queries of type
(a) above, but not queries of type (b) when the
content of external links is requested.
● Requests of type (c) also cannot be handled by
the current preservation mechanism.
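A sketch of this "own endpoint" workaround for type (a) queries, assuming Python with rdflib and a locally downloaded archive dump (the filename is hypothetical):

```python
from rdflib import Graph

# Load an archived dump locally, since the live endpoint is not preserved.
g = Graph()
g.parse("dbpedia_snapshot.nt", format="nt")  # hypothetical local dump file

# A type (a) query: it touches only data inside the archived DBpedia dataset.
q = """
SELECT ?abstract WHERE {
  <http://dbpedia.org/resource/Berlin>
      <http://dbpedia.org/ontology/abstract> ?abstract .
}
"""
for row in g.query(q):
    print(row.abstract)
```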
33. Europeana
● Europeana.eu is a platform for providing access to digitized
cultural heritage objects from Europe’s museums, libraries
and archives. It currently provides access to over 30M such
objects.
● Europeana functions as a metadata aggregator: its partner
institutions or projects send it (descriptive) metadata
about their digitized objects to enable centralized search
functions.
● The datasets include links to the websites of providers,
where users can get access to the digitized objects
themselves.
● Europeana re-publishes this data openly (CC0), now
mainly by means of an API usable by everyone.
34. Dependency on Third party datasets
● Cultural heritage providers are not Europeana’s only source
of data. To compensate for certain quality gaps in the
providers’ data, especially regarding multilingualism and
semantic linking, Europeana has embarked on enriching this
data.
● Europeana connects to GEMET, Geonames and DBpedia.
Once the links to contextual resources (places, persons)
from these datasets have been created, the data on these
resources is added to Europeana’s own database, to later
be exploited to provide better services.
● This introduces a dependency on external linked
datasets, which Europeana has to take into account.
35. More dependencies
● Europeana has started to encourage its providers
to carry out some linking by themselves.
The results are:
● Links to the same external linked data sources
that Europeana already uses for its own
enrichment
● Links to projects’ and institutions’ own thesauri and
classifications, themselves expressed as linked
data.
36. Europeana functionality
● Europeana re-distributes the metadata it aggregates from its
partners in a fully open way. This is done mainly via its API,
but there have been experiments using semantic mark-up
on object pages (RDFa, notably with the schema.org
vocabulary) and in the form of “real” linked data, either via
HTTP content negotiation or in the form of RDF dumps.
● The data that Europeana gathers changes. This implies some
level of link rot.
● Content decays, as the metadata statements sent by
providers, or Europeana’s own enrichments, change.
37. Europeana problems
● Aligning different time-versions of data for linked data
consumption.
● It could be that a description of an object in Europeana,
given by a provider, uses a third-party URI that is now
deprecated in the most recent version of that third-party
linked dataset.
● Preserving data that aggregates other datasets.
● Europeana’s problem becomes that of preserving an
interconnected set of dataset views. What are the
best practices for doing this?
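One rough way to detect the first problem is to periodically check whether the third-party URIs used in enrichments still dereference; a sketch with Python's requests library, not anything Europeana is stated to run:

```python
import requests

def uri_dereferences(uri: str, timeout: float = 10.0) -> bool:
    """Rough link-rot check: does the URI still dereference successfully?"""
    try:
        r = requests.head(uri, allow_redirects=True, timeout=timeout)
        return r.status_code < 400
    except requests.RequestException:
        return False

# Usage: flag contextual links (places, persons) that may need re-alignment.
for uri in ["http://sws.geonames.org/2950159/"]:  # example Geonames URI
    if not uri_dereferences(uri):
        print("possibly deprecated:", uri)
```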
38. LD characteristics
● Digital objects can be:
● static vs dynamic
● complex vs simple
● active vs passive
● rendered vs non-rendered
● Linked Data is dynamic, complex, passive and
typically non-rendered.
39. LD is dynamic
● Do people need archived versions of LOD datasets, or is
only the most up-to-date version needed?
● Linked Data is usually dynamic, which isn't the case in
many preservation applications for other types of objects,
so older versions should be preserved. This is the case,
for example, with DBpedia, where both older versions and
incremental changes are archived.
● Different statements may be made at any time, and so the
“boundary” of the object under consideration changes over
time.
40. LD is Complex
● Linked Data is typically about expressing
statements (facts) whose truth or falsity is
grounded in the context provided by all the
other statements available at that particular
moment.
● Linked Data is a form of formal knowledge.
As for any kind of data or information, the
problem for long-term preservation is not the
preservation of the object as such, but the
preservation of the meaning of the object.
41. LD rendering
● Typically, Linked Data is not rendered and adopts standards
such as RDF that are open, widely adopted and well
supported.
● This simplifies the preservation mechanism if the only
preservation objects are the data themselves.
● This is the case in the current DBpedia archiving
mechanism.
● On the other hand, if rendering is a requirement, then
appropriate software and perhaps platform emulators must
be preserved in addition to the data themselves.
42. LD is passive
● Linked Data is usually represented in the form
of statements or objects (typically RDF triples),
which are not applications.
● Besides preserving the data, software that handles
the data should be preserved in some cases.
● Rendering functionality or an access method, such as
a SPARQL endpoint for archived DBpedia data, is an
example case not handled by the existing archiving
mechanism.
43. Additional issues
● Linked Data is typically distributed, and the
persistence of the preserved objects depends on all
the individual parts and the
ontologies/vocabularies with which the data is
expressed.
● A lot of data essentially depends on OWL
ontologies that are created/maintained/hosted
by others.
● There are also issues of authenticity, uncertainty,
Web infrastructure, accessibility (in many forms),
versioning, and the preservation of metadata
referring to triples.
44. Problems (1/4)
● Selection
● Which LOD data should actively be preserved?
● Who is responsible for “community” data, such as
DBpedia?
● Durability of the format
● Which formats can we distinguish? Can we create
guidelines for durable LOD formats? RDF, for example, is
based on open standards, so it can be considered very
durable. The use of Persistent Identifiers also contributes to
the durability of LOD. Standardization efforts in the Linked
Open Data community and compliance with W3C standards
greatly reduce risks related to the durability of the adopted
format.
45. Problems (2/4)
● Rights / ownership / licenses.
● LOD is by definition open (which is not always
the case for LD in general), but how is privacy
preserved then? Additional issues are which
licenses to use, and which Creative Commons code.
● The rights and licenses issue is the main difference
between open and non-open LD, since different
licenses are typically required in each case.
46. Problems (3/4)
● Storage.
● The highest quality is storage in a “Trusted Digital
Repository”. But which other models can be
used? One example is providing multiple
copies/mirrors (CLOCKSS).
● There is also the issue of scale, similar to
problems encountered when dealing with Web
archiving.
47. Problems (4/4)
● Metadata and Definitions.
● Documentation is required to enable the designated
community to understand the meaning of LOD objects. Are
LOD objects “self-descriptive”?
● That depends on where the boundary of an LD object is put.
If an LD object doesn't include the ontology(ies) that
provide the classes and properties used to express the
data, then the object is not self-descriptive, and there is
an additional preservation risk.
● Do they need any additional data elements to facilitate
long-term access? Are these metadata adequate, and how can
that be proven?
48. Diachron Project (1/2)
● Identifies the main issues related to LOD
preservation for different use case categories:
● Open Data Markets
● Enterprise Data Intranets
● Scientific Information Systems.
49. Diachron project (2/2)
● Issues identified are:
● Ranking datasets
● Crawling datasets
● Diachronic citations (i.e., data from different
sources for the same fact)
● Temporal annotations
● Cleaning and repairing uncertain data
● Propagation of changes over Linked Data
● Archiving multiple versions
● Longitudinal querying (i.e., querying over different
versions of data spanning a temporal interval, as
in the DBpedia use case of this report).
50. Conclusion
● Linked Data and Linked Open Data are a specific
form of digital objects. The problem for LOD
lies not with the notation of the data model
(RDF triples).
● The key differentiation is between LOD living on the
Web, whose main part is URIs pointing to
Web resources, and LD living in a database-like
environment; this is what creates most problems for
archiving. Any attempt to archive LOD as part
of the living Web shares the problems of archiving
Web resources.
51. Thank you
● Questions?