This document proposes a system to give researchers credit for depositing their data by allowing them to easily submit a "data paper" about their deposited data set. It involves developing a helper application that would integrate with data repositories and publishers to streamline the process of depositing data and submitting the associated data paper. The proposal outlines three phases: requirements gathering and design, development and trial deployment, and expansion and sustainability. The goal is to incentivize data sharing by providing researchers a publication and citation opportunity for describing and taking credit for their datasets.
Using Neo4j for exploring the research graph connections made by RD-Switchboard (amiraryani)
In this talk, Jingbo Wang (NCI) and Amir Aryani (ANDS) presented Neo4j queries that help data managers explore the connections between datasets, researchers, grants, and publications using the graph model and the Research Data Switchboard. They also discussed the paper "Graph connections made by RD-Switchboard using NCI’s metadata", presented at the Reproducible Open Science workshop in Hannover, September 2016.
RankBrain represents a new way of measuring relevance, built on teaching machines to understand relationships between words. How should RankBrain change our approach to SEO and specifically to keyword research? Will we need to fight machine learning with machine learning?
This document discusses converting metadata to linked open data. It provides an overview of the process of mapping metadata fields and their values to URIs and standardized vocabularies. This involves selecting existing terms where possible, cleaning up field values, and manually mapping values that don't match existing terms. It also discusses tools for working with linked data and principles for publishing open data online.
Open Harvester - Search publications for a researcher from CrossRef, PubMed a... (Muhammad Javed)
A Java prototype that processes a result set of pre-downloaded data (from a database) and allows researchers to claim their publications from a ranked list.
Eureka Research Workbench: A Semantic Approach to an Open Source Electroni... (Stuart Chalk)
Scientists are looking for ways to leverage Web 2.0 technologies in the research laboratory, and as a consequence a number of approaches to web-based electronic notebooks are being evaluated. In this presentation I discuss the Eureka Research Workbench, an electronic laboratory notebook built on semantic technology and XML. Using this approach, the context of the information recorded in the laboratory can be captured and searched along with the data itself. A discussion of the current system is presented, along with the next planned development of the framework and long-term plans relative to linked open data. Presented at the 246th American Chemical Society Meeting in Indianapolis, IN, USA on September 12th, 2013.
Toward Semantic Representation of Science in Electronic Laboratory Notebooks ... (Stuart Chalk)
An electronic laboratory notebook (ELN) can be characterized as a system that allows scientists to capture the data and resources used in performing scientific experiments. This allows users to easily organize and find their data; however, little information about the scientific process is recorded.
In this paper we highlight the current status of progress toward semantic representation of science in ELNs.
This document discusses text and data mining (TDM) and provides definitions from 1982, 1999, and 2008 that describe mining as automatically generating logical representations of text passages, the (semi)automated discovery of trends and patterns across large datasets, and the use of automated methods to exploit knowledge in biomedical literature. It also lists different types of content that can be mined, such as images, graphs, tables, datasets, and text, and provides 101 potential uses for content mining, such as finding papers about chemistry in German or papers acknowledging support from the Wellcome Trust.
Svante Schubert presented on metadata and the new metadata model for OpenDocument Format 1.2. The new model addresses limitations of the current ODF metadata by making it more extensible and descriptive. It uses RDF and OWL to annotate content in a common way, aligning with semantic web standards. Metadata is stored in RDF files and linked to content elements via IDs. This allows software to more easily find, combine and share information. OpenOffice.org 3 will provide APIs to access and extend the new metadata capabilities.
The document discusses OpenURL activity data collected by the OpenURL Router. It describes what the data includes, such as anonymized IP addresses and metadata about journal articles accessed. The goals of the project are to make this activity data openly available, develop prototype services using the data, and potentially aggregate data from other institutions to analyze usage on a broader scale. Key considerations for aggregating data include legal issues regarding personal data, technical challenges in standardizing data extraction, and the financial costs of ongoing data sharing and maintenance.
This document summarizes basic search techniques for navigating electronic information sources. It discusses searching by keywords and phrases, truncation to find different word forms, proximity searches to specify distance between words within sentences or paragraphs, Boolean searches using AND, OR and NOT operators, limiting searches by date or file type, and field searching within specific fields like titles or URLs. The techniques described allow researchers to efficiently search and retrieve relevant electronic documents.
Presentation on the use of the Eureka Research Workbench to store data and scientific workflow information. Presented online as part of the Dial-a-molecule 'Liberating Laboratory Data' event (http://www.dial-a-molecule.org/wp/events-listing/liberating-laboratory-data/)
A Generic Scientific Data Model and Ontology for Representation of Chemical Data (Stuart Chalk)
The current movement toward openness and sharing of data is likely to have a profound effect on the speed of scientific research and the complexity of questions we can answer. However, a fundamental problem with currently available datasets (and their metadata) is heterogeneity in terms of implementation, organization, and representation.
To address this issue we have developed a generic scientific data model (SDM) to organize and annotate raw and processed data, and the associated metadata. This paper will present the current status of the SDM, its implementation in JSON-LD, and the associated scientific data model ontology (SDMO). Example usage of the SDM to store data from a variety of sources will be discussed, along with future plans for the work.
Annotopia open annotation services platform (Tim Clark)
Annotopia is an open-access, open-source, open annotation services platform developed for scientific annotation of documents and datasets on the web using the W3C Open Annotation model (http://www.openannotation.org/spec/core/).
Using Annotopia, virtually any client application including lightweight web clients, can create, selectively share, and access annotation of web documents and data. This can be done regardless of the ownership of the base objects being annotated.
Annotopia supports unstructured, semi-structured, and fully structured (semantic) annotation; manual and automated (text mining) annotation; and permissions, groups, and sharing. It also provides access to specialized vocabulary and text analytics services.
Annotopia is an open source platform licensed under Apache 2.0.
The document discusses stacks and queues, which are linear data structures that maintain order. Stacks follow LIFO (last in, first out) order, where new elements are added to the top and the top element is removed first. Queues follow FIFO (first in, first out) order, where new elements are added to the rear and elements are removed from the front. The document compares stacks and queues, noting that stacks are used for calculations and function calls while queues are used for character buffers and print queues.
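To make the contrast concrete, here is a minimal Python sketch (not from the document) of both disciplines:

```python
from collections import deque

# Stack: LIFO (last in, first out). Push and pop happen at the same
# end, the "top"; a plain Python list supports this directly.
stack = []
stack.append("first")   # push
stack.append("second")  # push
print(stack.pop())      # prints "second": the most recent element leaves first

# Queue: FIFO (first in, first out). Enqueue at the rear, dequeue from
# the front; deque gives O(1) operations at both ends, unlike list.pop(0).
queue = deque()
queue.append("first")   # enqueue at the rear
queue.append("second")  # enqueue at the rear
print(queue.popleft())  # prints "first": the earliest element leaves first
```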
This document discusses challenges with the current scientific publishing system and proposes a vision for next generation scientific publishing (NGSP). Some key problems include retractions due to misconduct, lack of reproducibility, and non-reusable data and methods. NGSP would feature transparent and computable data and methods, open annotation of narratives and objects, and no restrictions on text mining or remixing. It would move information more quickly and allow verification through an open, service-oriented system without walled gardens. Taking NGSP forward will require collaboration across stakeholders in research communications.
Scientific Units in the Electronic Age (Stuart Chalk)
Scientists have standardized on the SI unit system since the late 1700s. While much work has been done over the years to refine and redefine the system, little has formally been done to standardize the representation of SI units in electronic systems.
This paper will present a summary of current efforts toward electronic representation of scientific units in text, XML, and RDF, an analysis of needs for current computer/network systems, and an outline of future work.
The document discusses making data FAIR (Findable, Accessible, Interoperable, and Reusable) through a novel combination of web technologies. It describes the core FAIR principles for each component - findable, accessible, interoperable, and reusable. It then discusses how applying these principles through an "internet-inspired" approach using existing standards and protocols could help make large, heterogeneous and complex data more actionable for various applications and users. The presentation provides examples of how this could work through a layered architecture similar to the internet, with shared standards and specifications at each layer.
Rule-based Capture/Storage of Scientific Data from PDF Files and Export using... (Stuart Chalk)
Recently, the US government has mandated that publicly funded scientific research data be made freely available in a usable form, allowing integration of the data into other systems. While this mandate has been articulated, existing publications and new papers (PDF) still do not provide accessible data, meaning their usefulness is limited without human intervention.
This presentation outlines our efforts to extract scientific data from PDF files, using the PDFToText software and regular expressions (regex), and process it into a form that structures the data and its context (metadata). Extracted data is processed (cleaned, normalized), organized, and inserted into a contextually developed MySQL database. The data and metadata can then be output using a generic JSON-LD based scientific data model (SDM) under development in our laboratory.
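The paper's actual extraction rules are not reproduced here. As a rough sketch of the pipeline's shape, assuming the Poppler pdftotext command-line tool, placeholder filenames, and a hypothetical regex for one property, the first two stages might look like:

```python
import re
import subprocess

# Convert a PDF to plain text with the pdftotext command-line tool
# (part of Poppler/Xpdf); "-layout" preserves column positions.
subprocess.run(["pdftotext", "-layout", "paper.pdf", "paper.txt"], check=True)

with open("paper.txt", encoding="utf-8") as f:
    text = f.read()

# A hypothetical extraction rule: capture a measured value with units,
# e.g. "melting point 128.5 °C". Real rules would be tailored to each
# journal's layout and cover many property types.
pattern = re.compile(r"melting point\s+(\d+(?:\.\d+)?)\s*°?C", re.IGNORECASE)
for match in pattern.finditer(text):
    value = float(match.group(1))
    print({"property": "melting point", "value": value, "unit": "degC"})
```

Records of this shape could then be cleaned, inserted into the database, and serialized downstream, as the abstract describes.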
This document discusses basic search techniques for electronic information sources. It describes keyword and phrase searching, truncation searching using right, left, and internal truncation, proximity searching within words, sentences and paragraphs with ordered and unordered options, Boolean searching using AND, OR and NOT operators, limiting searches by date, file type or other limits, and field searching within specific fields like title or URL. The techniques covered allow researchers to more effectively search and navigate the growing amount of electronic information available.
May 2012 JaxDUG presentation by Zachary Gramana on using the Lucene.NET library to add search functionality to .NET applications. Contains an overview of search/information retrieval concepts and highlights some common use-cases.
FAIRPORT domain-specific metadata using W3C DCAT & SKOS with ontology views (Tim Clark)
FAIRPORT is an international project to develop a lightweight interoperability architecture for biomedical - and potentially other - data repositories.
This slide deck is a presentation to the FAIRPORT technical team. It describes a proposed model for supporting domain-specific search metadata using a common schema model across all repositories.
The proposal makes use of the following existing technologies, with minor extensions:
- the W3C DCAT model for dataset description
- the W3C SKOS knowledge organization system
- the OWL 2 Web Ontology Language
- the Dublin Core vocabulary
- the NCBO BioPortal biomedical ontologies collection
Linked Open Data Fundamentals for Libraries, Archives and Museums (trevorthornton)
This document provides an overview of linked open data concepts for libraries, archives, and museums. It discusses what linked open data is, potential benefits for cultural institutions, and technical concepts like URIs, HTTP, RDF, ontologies, and SPARQL. The document also covers publishing linked open data by establishing URIs for resources and using content negotiation. Trust and attribution of linked data sources are addressed. Open data licensing, including options from Creative Commons, is also summarized.
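As a small illustration of these concepts (not from the document; the item URI and VIAF identifier below are placeholders), a collection record can be published as RDF using Python's rdflib:

```python
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DCTERMS, RDF

# Mint an HTTP URI for a collection item; the domain is a placeholder.
item = URIRef("http://example.org/collection/item/42")

SCHEMA = Namespace("http://schema.org/")

g = Graph()
g.bind("dcterms", DCTERMS)
g.bind("schema", SCHEMA)

g.add((item, RDF.type, SCHEMA.Photograph))
g.add((item, DCTERMS.title, Literal("Main Street, 1912")))
# Link to an external authority URI rather than repeating a name string:
g.add((item, DCTERMS.creator, URIRef("http://viaf.org/viaf/123456")))  # placeholder VIAF ID

print(g.serialize(format="turtle"))
```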
This document discusses building Linked Data apps for the iPhone. It covers key Linked Data concepts like URIs, RDF, and SPARQL. It also outlines the technologies needed to develop for the iPhone, such as Objective-C, HTML, and various libraries. Finally, it presents Lodsy as an example Linked Data app for the iPhone that uses facets and views to browse and display RDF data on maps and in detail views.
Interlinking educational data to Web of Data (Thesis presentation) (Enayat Rajabi)
This is a thesis presentation about interlinking educational data to the Web of Data. I explain how I used the Linked Data approach to expose and interlink educational data to the Linked Open Data cloud.
This presentation is an introduction to RDFa, created as the fourth assignment for IST 681 at the iSchool, Syracuse University. The presentation was made by Kai Li, a library science student at Syracuse University.
It's not rocket surgery - Linked In: ALA 2011 (Ross Singer)
This document provides a brief introduction to linked library data and linked data concepts. It explains the core principles of linked data: using URIs as names for things and including links between URIs so that additional related data can be discovered. It also discusses common vocabularies and schemas used in linked data, such as Dublin Core, Bibliontology, and RDA Elements. The document uses a sample book record to demonstrate how linked data can be modeled and interconnected using these vocabularies and external data sources like VIAF, LOC, and Geonames.
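The deck's actual sample record is not reproduced here, but a comparable book record in Turtle, with links out to VIAF, LCSH, and Geonames (all identifiers below are placeholders), could be parsed like so:

```python
from rdflib import Graph

# A sketch of a single book record as linked data, using Dublin Core terms
# and the Bibliographic Ontology (bibo), with links to external authorities:
# VIAF for the author, LCSH for the subject, Geonames for the place.
book_ttl = """
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix bibo:    <http://purl.org/ontology/bibo/> .

<http://example.org/book/1>
    a bibo:Book ;
    dcterms:title   "Linked Data for Libraries" ;
    dcterms:creator <http://viaf.org/viaf/123456> ;
    dcterms:subject <http://id.loc.gov/authorities/subjects/sh00000000> ;
    dcterms:spatial <http://sws.geonames.org/0000000/> .
"""

g = Graph()
g.parse(data=book_ttl, format="turtle")
print(f"{len(g)} triples parsed")
```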
Linked Data provides a standardized framework for publishing structured data on the web by linking data instead of documents. It uses URIs, HTTP, and RDF to link related data across different sources to create a global data space without silos. EnAKTing is a research project focused on building ontologies from large-scale user participation, querying linked data at web-scale, and visualizing the massive amounts of interconnected data. Some of its applications include services for discovering backlinks, geographical resources, and dataset equivalences in the Web of Data.
Providing open data is of interest for its societal and commercial value, for transparency, and because more people can do fun things with data. There is a growing number of initiatives to provide open data from, for example, the UK government and the World Bank. However, much of this data is provided in formats such as Excel files, or even PDF files. This raises several questions:
- How best to provide access to data so it can be most easily reused?
- How to enable the discovery of relevant data within the multitude of available data sets?
- How to enable applications to integrate data from large numbers of formerly unknown data sources?
One way to address these issues is to use the design principles of linked data (http://www.w3.org/DesignIssues/LinkedData.html), which suggest best practices for publishing and connecting structured data on the Web. This presentation gives an overview of linked data technologies (such as RDF and SPARQL), examples of how they can be used, and some starting points for people who want to provide and use linked data.
The presentation was given on August 8 at the Hacknight event (http://hacknight.se/) of Forskningsavdelningen (http://forskningsavd.se/) (Swedish: “Research Department”), a hackerspace in Malmö.
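As an illustration of the SPARQL side (not taken from the talk), the SPARQLWrapper library can query a public endpoint; DBpedia is used here only as a well-known example of queryable linked data:

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Ask a public SPARQL endpoint for the English label of a resource.
sparql = SPARQLWrapper("https://dbpedia.org/sparql")
sparql.setQuery("""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?label WHERE {
      <http://dbpedia.org/resource/Linked_data> rdfs:label ?label .
      FILTER (lang(?label) = "en")
    }
""")
sparql.setReturnFormat(JSON)
results = sparql.query().convert()
for binding in results["results"]["bindings"]:
    print(binding["label"]["value"])
```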
Learning Resource Metadata Initiative: using schema.org to describe open educ... (Phil Barker)
This paper discusses the Learning Resource Metadata Initiative (LRMI), an international project that aims to facilitate the discovery of educational resources through the use of embedded metadata that can be used by search engines (e.g. Google, Yahoo, Bing, Yandex) to refine the search services they offer. LRMI has extended the schema.org metadata vocabulary with terms that are specifically relevant to aiding the discovery of learning resources.
The document discusses using schema.org to describe open educational resources in order to help users more easily find resources that meet their needs. It describes how the Learning Resource Metadata Initiative (LRMI) extended schema.org by adding educational parameters that were previously missing, such as educational alignment, learning resource type, and typical age range. A prototype Google custom search engine is provided as an example of how these LRMI extensions could be used to narrow searches for educational resources.
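For illustration (the resource and alignment values are invented, though the property names are genuine LRMI additions to schema.org), a learning resource described with these extensions as JSON-LD might look like:

```python
import json

# A sketch of LRMI metadata embedded as schema.org JSON-LD.
resource = {
    "@context": "http://schema.org/",
    "@type": "CreativeWork",
    "name": "Introduction to Fractions",
    "url": "http://example.org/oer/fractions",
    "learningResourceType": "lesson plan",   # LRMI addition to schema.org
    "typicalAgeRange": "8-10",               # LRMI addition
    "educationalAlignment": {                # LRMI addition
        "@type": "AlignmentObject",
        "alignmentType": "teaches",
        "educationalFramework": "Common Core State Standards",
        "targetName": "CCSS.Math.Content.3.NF.A.1"
    }
}
print(json.dumps(resource, indent=2))
```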
Transmission6 - Publishing Linked Data (Bill Roberts)
This document provides guidance on publishing linked data by describing how to (1) use URIs to identify things, (2) make those URIs accessible via HTTP, (3) provide useful information about those URIs using standards, and (4) include links between URIs. It recommends starting by describing important things and assigning them URIs, and then representing the descriptions in both human-readable and machine-readable formats like RDF. Publishers should also include links between related URIs and provide licensing information.
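A minimal sketch of steps (2) and (3), assuming Flask and placeholder URIs and data: the same URI answers with HTML for browsers and Turtle for machine clients via content negotiation.

```python
from flask import Flask, Response, request

app = Flask(__name__)

# Placeholder descriptions; a real publisher would generate these from a store.
TURTLE = '<http://example.org/id/thing/1> <http://purl.org/dc/terms/title> "Thing 1" .'
HTML = "<html><body><h1>Thing 1</h1></body></html>"

@app.route("/id/thing/<thing_id>")
def describe(thing_id):
    # Pick the best representation the client accepts.
    best = request.accept_mimetypes.best_match(["text/turtle", "text/html"])
    if best == "text/turtle":
        return Response(TURTLE, mimetype="text/turtle")
    return Response(HTML, mimetype="text/html")
```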
Presentation created for the CILIP Cataloguing Interest Group event on Linked Data, 25th November 2013 (http://www.cilip.org.uk/cataloguing-and-indexing-group/events/linked-data-what-cataloguers-need-know-cig-event)
Commodity Semantic Search: A Case Study of DiscoverEd (Nathan Yergler)
The document discusses DiscoverEd, an open source search tool for open educational resources (OER) that are labeled with Creative Commons licenses. It describes how DiscoverEd uses curators to identify and add metadata to educational resources, which are then indexed using Nutch and can be queried based on metadata fields. Current work focuses on improving provenance tracking of curator metadata and developing tools to help curators publish more linked data descriptions of their resources.
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs (Josef Petrák)
The document discusses the Semantic Web and RDF data formats. It provides an overview of RDF syntaxes like RDF/XML, N3, N-Triples, RDF/JSON, and RDFa. It also discusses software APIs for working with RDF data in languages like Java, PHP, and Ruby. The document outlines handling RDF data using statement-centric, resource-centric, and ontology-centric models, as well as named graphs. It provides examples of reading RDF data from files and querying RDF data using SPARQL.
This document provides an introduction to linked data and RDF. It discusses:
1. The principles of linked data, which involve using URIs to identify things and including links to other related resources.
2. The goals of linked data, which are to transfer information between machines without loss of meaning by identifying data on the web using shared vocabularies and RDF.
3. An overview of RDF, which structures data as subject-predicate-object triples and can be serialized in formats like RDF/XML and Turtle to represent typed links between resources.
Choices, modelling and Frankenstein Ontologies (benosteen)
This document discusses an ontology project at the University of Bristol. It addresses issues with representing research information, which changes frequently. The project uses a combination of ontologies like FOAF, Bio, and Dcterms to model "Things" like people and publications. Context about these Things, like time periods of validity, is represented using named graphs. The current implementation stores this information in a Fedora object store with RDF serialization. The project aims to gather relevant domain taxonomies and provide APIs for researchers to maintain them, taking a "Frankenstein" approach of combining relevant standards. It notes some design flaws of the CERIF interchange format compared to the linked data approach taken.
The document discusses IIIF (International Image Interoperability Framework), annotations, and discourse. It provides an overview of IIIF, describing its image, presentation, and search APIs. It discusses how annotations can be used with IIIF manifests to provide additional context and information. Examples are given of active IIIF efforts involving granularity, 3D images, time-based media, and taking advantage of the web. The document concludes with a demo of the Mirador IIIF viewer.
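For context, the IIIF Image API packs a complete image request into the URL path; the sketch below (server and identifier invented) shows the pattern:

```python
# IIIF Image API URL pattern:
#   {scheme}://{server}{/prefix}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}
def iiif_image_url(server: str, identifier: str,
                   region: str = "full", size: str = "512,",
                   rotation: str = "0", quality: str = "default",
                   fmt: str = "jpg") -> str:
    """Build a IIIF Image API request; defaults ask for a 512px-wide JPEG."""
    return f"{server}/{identifier}/{region}/{size}/{rotation}/{quality}.{fmt}"

print(iiif_image_url("https://iiif.example.org/image", "page-001"))
# -> https://iiif.example.org/image/page-001/full/512,/0/default.jpg
```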
Active Digital Preservation and Data/Metadata Migration (Karen Estlund)
Presentation by Nick Ruest and Karen Estlund at spring 2017 Coalition for Networked Information meeting on digital preservation, Fedora, and import / export efforts.
This document discusses using the International Image Interoperability Framework (IIIF) for digitized newspapers. It outlines the goals of the IIIF Newspaper Interest Group, which aims to promote best practices for exploiting IIIF capabilities for searching, discovering, and annotating newspaper content. Examples are provided of how OCR text and ALTO metadata can be represented as IIIF image and annotation APIs. Guidelines are presented for mapping newspaper structures like titles and pages to IIIF concepts and exposing newspaper collections through APIs.
This document discusses using the IIIF standard to provide access to digitized newspaper collections. It provides examples of several openly accessible newspaper collections and notes that newspapers typically receive more page views than other types of digital collections. The document outlines possible ways to map newspaper metadata and images to the IIIF standard to allow zooming, panning, clipping of images and text correction/annotation. Implementing IIIF could improve access and discovery of historical newspaper content.
Portland Common Data Model (PCDM): Creating and Sharing Complex Digital Objects (Karen Estlund)
Interoperability has long been a goal of digital repositories, as demonstrated by efforts ranging from OAI-PMH, to attempts to create common APIs such as IIIF, to community based metadata standards such as Dublin Core. As repositories have matured and the desire to work more collaboratively and reuse source code has grown, the need for a common understanding of how digital objects are conceived and represented is essential. The Portland Common Data Model (PCDM) is an effort to create a shared, linked data-based model for representing complex digital objects. Starting in the Hydra community but quickly expanding to include contributors from Islandora, Fedora, the Digital Public Library of America, and other repository-related service communities, PCDM is the result of over sixty practitioners’ contributions to a shared model for structuring digital objects. The process was holistic and rooted in concrete use-cases. An initial in-person meeting in Portland, Oregon in fall 2014 resulted in the release of the first draft of the data model for which it is named. With this shared model, we intend to further the goal of interoperability across repositories and related technologies. This presentation will review the origins of PCDM, provide a general technical overview, update on current status, and forecast future work.
Comparison of integration of local digital newspaper projects at the University of Utah Marriott Library, University of Oregon Libraries, and Penn State University Libraries with content created as part of participation in the National Digital Newspaper Program.
Beyond NDNP: Technical Specifications Working Group (Karen Estlund)
Recommendations for a digital newspaper descriptive metadata specification, based on the National Digital Newspaper Program (NDNP) technical specification but extended for uses beyond NDNP.
Library Support for Journal Publishing: Emphasis on multi-modal open peer rev... (Karen Estlund)
Brief review of the University of Oregon Libraries journal publishing program, followed by an in-depth look at Ada. Content also provided by Sarah Hamid and Bryce Peake.
This document summarizes current efforts related to digitizing and providing access to historical newspaper content. It discusses several national programs and projects focused on digitizing newspapers, developing metadata and technical standards, and addressing copyright issues. It also outlines different levels at which newspaper content can be described and accessed (title, issue, page, article levels) and highlights some specific digital newspaper collections and interfaces. Finally, it poses questions to the audience about desired functionality and remaining challenges.
This document summarizes a presentation on rights technical design given at the DPLA Fest 2015. It discusses the common namespace and class used for rights statements. It also describes the URI design for rights statements and how they are designed to be both machine-readable and human-readable. Finally, it lists the technical working group members who worked on rights statement design.
Publishing Ada: A Retrospective Look at the First Three Years of an Open Peer... (Karen Estlund)
Presentation by: Karen Estlund, Sarah Hamid, and Bryce Peake
At the CNI spring 2012 meeting, we presented on a new collaborative journal publishing project from The Fembot Collective and the University of Oregon (UO) Libraries, Ada: A Journal of Gender, New Media, and Technology. The Fembot Collective is a collaborative of feminist media scholars, producers, and artists engaged with the intersection of new media and technology and scholarly communication. One aspiration of this project was to reclaim the means of scholarly production through a community-centered model of open peer review and multi-modal publication processes. As a work in progress, Ada has continuously evolved to meet the needs of diverse authors, readers, and commentators. In the face of changing scholarly communication practices, the Fembot and library collaboration offers an alternative system of open-access publication and review that recaptures academic production structures in favor of cross-disciplinary, multi-modal, collaborative knowledge. Our community standards state that “responding is political work” emphasizing a space that demands constant redirection and active participation by its collaborators in order to generate new expressions of feminist open access scholarship over time. Now in our third year of publication and working on our ninth issue, we will review lessons learned about audience, production, infrastructure, design and assessment. We will discuss the ways in which our intervention has been transformed by, while also transforming, discussions about participatory media, open and collaborative peer review, production costs, and the intersections of technical and intellectual labor.
http://adanewmedia.org
http://fembotcollective.org
https://library.uoregon.edu/digitalscholarship
APIs & Open Data with Oregon Digital Newspapers (Karen Estlund)
The document discusses how APIs and open data can provide access to digitized newspaper content from the Oregon Digital Newspaper Program. It provides examples of different types of API requests that can be used including searches, requests for individual pages or titles, and batch requests. One example of a student project that used the newspaper data is also mentioned, and potential future opportunities are discussed.
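The document's exact requests are not reproduced here. As a hedged sketch, sites built on the open-source Chronicling America stack commonly answer search requests of this shape; the base URL and response fields below are assumptions, not verified against the Oregon service:

```python
import requests

BASE = "https://oregonnews.uoregon.edu"  # assumed base URL; check the site's API docs

# Full-text page search, returned as JSON (chronam-style request).
resp = requests.get(f"{BASE}/search/pages/results/",
                    params={"andtext": "salmon cannery", "format": "json"})
resp.raise_for_status()

# Print the newspaper title and date for the first few hits.
for item in resp.json().get("items", [])[:5]:
    print(item.get("title"), item.get("date"))
```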
This document summarizes an RDF in Hydra summit. It lists the participants and the objective of examining the implications and opportunities of Fedora 4 and Hydra. Key discussion highlights included RDF as a mechanism for leveraging linked data, the convoluted terminology around data models and ontologies, and examples of RDF usage in Hydra practice. Reasons for using RDF included granularity of description, flexibility and extensibility of schemas, reusability of linked data, and RDF being machine actionable. Proposed next steps included forming a Hydra working group to address open issues and holding breakout sessions.
Linked data intro primer
1. Linked Data Principles
Oregon Digital Linked Data
Workshop, Eugene, Oregon
November 25, 2013
Tom Johnson
thomas.johnson@oregonstate.edu
2. 4 Principles
1. Use URIs as names for things.
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL).
4. Include links to other URIs, so that they can discover more things.
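A quick illustration of principles 2 and 3 (not part of the slides): dereference an HTTP URI and ask for RDF via content negotiation, using DBpedia as a well-known server that supports it.

```python
import requests

# Look up an HTTP URI and request a machine-readable representation.
resp = requests.get(
    "http://dbpedia.org/resource/Oregon",
    headers={"Accept": "text/turtle"},
    allow_redirects=True,  # the resource URI redirects to a data document
)
print(resp.headers.get("Content-Type"))
print(resp.text[:300])  # Turtle triples about Oregon, with links to more URIs
```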
8. Practical Semantics
➢ Hierarchical Metadata Terms
⇒ relationships between vocabularies
⇒ e.g. mrel:photographer < dc:contributor
➢ Domain and Range Statements
⇒ Limit vocabulary application for data quality and interoperability
➢ Objects in one statement can be subjects in others.
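An illustrative sketch of the slide's example (the namespace URI and the domain/range choices are assumptions, not from the deck):

```python
from rdflib import Graph, Namespace
from rdflib.namespace import RDFS, DCTERMS, FOAF

# Assumed namespace standing in for the MARC relators vocabulary; the term
# name follows the slide's own notation.
MREL = Namespace("http://id.loc.gov/vocabulary/relators/")

g = Graph()
g.bind("mrel", MREL)
g.bind("dcterms", DCTERMS)

# mrel:photographer < dc:contributor, as a subproperty statement.
g.add((MREL.photographer, RDFS.subPropertyOf, DCTERMS.contributor))
# Domain and range statements limit how the term applies (illustrative choices).
g.add((MREL.photographer, RDFS.domain, FOAF.Image))  # only images have photographers
g.add((MREL.photographer, RDFS.range, FOAF.Agent))   # values must be agents

print(g.serialize(format="turtle"))
# A reasoner can now infer that any mrel:photographer statement also
# implies a dcterms:contributor statement.
```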
9. Global Scale
➢ Statement-centric
➢ Model is “Open World”
⇒ Data we don’t have is assumed to be unknown locally, not globally.
➢ Outside data is valued
➢ Linking is web scale
10. Resources
➢ Linked Open Vocabularies (vocabulary search engine)
http://lov.okfn.org/dataset/lov/
➢ W3C Library Linked Data Incubator Group Reports http://www.w3.org/2005/Incubator/lld/
➢ Open Metadata Registry (hosts RDA vocabularies) http://metadataregistry.org/