This document discusses augmenting interoperability across scholarly repositories. It proposes a shared data model and services, based on core data surrogates for digital objects that can be obtained from, harvested from, and put into repositories. This would allow richer cross-repository services and enable scholarly communication as a global workflow. The Pathways core data model is presented for representing digital objects uniformly across repositories to support interoperable functions.
Keynote presentation delivered at ELAG 2013 in Gent, Belgium, on May 29 2013. Discusses Research Objects and their relationship to work my team has been involved in over the past couple of years: OAI-ORE, Open Annotation, Memento.
Linking Universities - A broader look at the application of linked data and s... (Mathieu d'Aquin)
Presentation at the VIVO - International Research Network about Linked Universities, data.open.ac.uk, linkedup, linked data for universities, education and research.
These slides accompany the LDOW2010 paper "An HTTP-Based Versioning Mechanism for Linked Data", available at http://arxiv.org/abs/1003.3661. It describes how the combination of the Memento (Time Travel for the Web) framework and a resource versioning approach that is aligned both with the Cool URI notion and with Tim Berners-Lee's concept of Time-Generic and Time-Specific resources yields the ability to collect current and prior versions of a resource merely by using "follow your nose" HTTP navigation. The proposed combination further extends the value of a URI and allows the emergence of a novel realm of temporal Web applications.
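To make the "follow your nose" pattern concrete, here is a minimal Python sketch of how a client might discover a TimeGate link and format an Accept-Datetime request header as used by Memento. The URIs are illustrative, and the Link header parsing is deliberately simplified, not a full RFC-compliant parser:

```python
from datetime import datetime, timezone
from email.utils import format_datetime

def parse_link_header(value):
    """Parse an HTTP Link header into (target, rel) pairs (simplified)."""
    links = []
    for part in value.split(","):
        segments = part.split(";")
        target = segments[0].strip().strip("<>")
        for seg in segments[1:]:
            key, _, val = seg.strip().partition("=")
            if key == "rel":
                for rel in val.strip('"').split():
                    links.append((target, rel))
    return links

def accept_datetime(dt):
    """Format a datetime as the RFC 1123 value of an Accept-Datetime header."""
    return format_datetime(dt.astimezone(timezone.utc), usegmt=True)

# A Link header as a versioned resource might expose it (illustrative URI):
header = '<http://example.org/timegate/page>; rel="timegate"'
[(timegate, rel)] = parse_link_header(header)
# A client would now issue a GET on the TimeGate with this request header:
print("Accept-Datetime:", accept_datetime(datetime(2010, 3, 1, tzinfo=timezone.utc)))
```

A TimeGate receiving such a request responds, via datetime negotiation, with (a redirect to) the version of the resource that was active at the requested datetime.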
Slides used for a presentation at the CNI 2013 Fall meeting. Discusses the problem domain of the Hiberlink project, a collaboration between the Los Alamos National Laboratory and the University of Edinburgh, funded by the Andrew W. Mellon Foundation. Hiberlink investigates reference rot in web-based scholarly communication.
This presentation introduces the Memento solution to allow time travel on the Web. Slides used at the first presentation about Memento at the Library of Congress, November 16 2009. Please consult the February 2010 slides (http://www.slideshare.net/hvdsomp/memento-updated-technical-details-february-2010) for up-to-date technical details. More info at http://www.mementoweb.org
As the scholarly communication system evolves to become natively web-based and starts supporting the communication of a wide variety of objects, the manner in which its essential functions – registration, certification, awareness, archiving - are fulfilled co-evolves. This presentation focuses on the nature of the archival function based on a perspective of the future scholarly communication infrastructure. This presentation, prepared for a meeting in June 2014, is based on and updates a previous one that was prepared for a January 2014 meeting. The latter is available at http://www.slideshare.net/atreloar/scholarly-archiveofthefuture
Presentation for PIDapalooza 2016. PIDs need to be used to achieve their intended persistence. Our research (reported at WWW2016, see http://arxiv.org/1602.09102) found that a disturbing percentage of references to papers that have DOIs actually use the landing page HTTP URI instead of the DOI HTTP URI. The problem is likely related to tools used for collecting references such as bookmarks and reference managers. These select the landing page URI instead of the DOI URI because the former is what's available in the address bar. It can safely be assumed that the same problem exists for other types of PIDs. The net result is that the true potential of PIDs is not realized. In order to ameliorate this problem we propose a Signposting pattern for PIDs (http://signposting.org/identifier/). It consists of adding a Link header to HTTP HEAD/GET responses for all resources identified by a DOI, including the landing page and content resources such as "the PDF" and "the dataset". The Link header contains a link, which points with the "identifier" relation type to the DOI HTTP URI. When such a link is available, tools can automatically discover and use the DOI URI instead of the other URIs (landing page, PDF, dataset) associated with the DOI-identified object.
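As a sketch of how a tool might consume the proposed Signposting pattern, the following Python snippet extracts the target of a rel="identifier" link from a Link header. The DOI is hypothetical and the header parsing is simplified:

```python
def find_identifier(link_header):
    """Return the target of the link with rel="identifier", if any
    (the relation used in the Signposting pattern for PIDs)."""
    for part in link_header.split(","):
        segments = part.split(";")
        target = segments[0].strip().strip("<>")
        for seg in segments[1:]:
            key, _, val = seg.strip().partition("=")
            if key == "rel" and "identifier" in val.strip('"').split():
                return target
    return None

# Response header a landing page or PDF might send (hypothetical DOI):
header = '<https://doi.org/10.1234/example>; rel="identifier"'
print(find_identifier(header))  # the DOI HTTP URI, not the landing page URI
```

With such a link in place, a reference manager can record the DOI URI automatically instead of whatever URI happens to be in the address bar.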
Memento: Big Leaps Towards Seamless Navigation of the Web of the Past (Herbert Van de Sompel)
These slides provide an explanation of the Memento Framework (time travel for the Web) from the perspective of resource versioning. It also details progress that has been made with deploying the framework since it was first introduced in November 2009, including standardization, development of tools, and advocacy. In addition, it touches upon new challenges (discovery, branding) and announces plans to make transactional Web archiving software available in the course of 2011.
DBpedia Archive using Memento, Triple Pattern Fragments, and HDT (Herbert Van de Sompel)
DBpedia is the Linked Data version of Wikipedia. Starting in 2007, several DBpedia dumps have been made available for download. In 2010, the Research Library at the Los Alamos National Laboratory used these dumps to deploy a Memento-compliant DBpedia Archive, in order to demonstrate the applicability and appeal of accessing temporal versions of Linked Data sets using the Memento “Time Travel for the Web” protocol. The archive supported datetime negotiation to access various temporal versions of RDF descriptions of DBpedia subject URIs.
In a recent collaboration with the iMinds Group of Ghent University, the DBpedia Archive received a major overhaul. The initial MongoDB storage approach, which was unable to handle increasingly large DBpedia dumps, was replaced by HDT, the Binary RDF Representation for Publication and Exchange. And, in addition to the existing subject URI access point, Triple Pattern Fragments access, as proposed by the Linked Data Fragments project, was added. This allows datetime negotiation for URIs that identify RDF triples that match subject/predicate/object patterns. To add this powerful capability, native Memento support was added to the Linked Data Fragments Server of Ghent University.
In this talk, we will include a brief refresher of Memento, and will cover Linked Data Fragments, Triple Pattern Fragments, and HDT in more detail. We will share lessons learned from this effort and demo the new DBpedia Archive, which, at this point, holds over 5 billion RDF triples.
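For illustration, a Triple Pattern Fragments request is simply an HTTP GET against a URL with subject/predicate/object query parameters; unset positions act as wildcards. A minimal Python sketch with a hypothetical endpoint URL:

```python
from urllib.parse import urlencode

def fragment_url(base, subject=None, predicate=None, obj=None):
    """Build a Triple Pattern Fragments request URL; omitted
    pattern positions are treated as wildcards by the server."""
    params = {}
    if subject:
        params["subject"] = subject
    if predicate:
        params["predicate"] = predicate
    if obj:
        params["object"] = obj
    return base + "?" + urlencode(params)

# All triples with a given subject (hypothetical endpoint):
url = fragment_url("http://example.org/dbpedia",
                   subject="http://dbpedia.org/resource/Ghent")
print(url)
```

A Memento-enabled client would additionally send an Accept-Datetime request header to select the temporal version of the matching fragment.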
The presentation explores the trend towards a scholarly communication system that is friendly to machines. It presents 3 exhibits illustrating the trend and 1 exhibit illustrating inertia in the system. It makes the point that machine-actionability can be achieved much more easily if content and metadata are available in Open Access and under a permissive Creative Commons license. It also observes that even with content and metadata openly available, new costs related to advanced tools to explore the scholarly record will emerge. Finally, it points at significant challenges regarding the persistence of the scholarly record in light of increasingly interconnected and actionable content and advanced tools to interact with it.
The slides were used for a plenary presentation at the LIBER 2011 Conference in Barcelona, Spain, on June 30 2011.
Presentation given at CERN Workshop on Innovations in Scholarly Communication (OAI7) on 22nd June 2011
http://indico.cern.ch/conferenceDisplay.py?confId=103325
A talk delivered by Anne Trefethen at the Anybook Oxford Libraries Conference 2015 - Adapting for the Future: Developing Our Professions and Services, 21st July 2015
2012-03-28 Wf4ever, preserving workflows as digital research objects (Stian Soiland-Reyes)
Presented on 2012-03-28 at EGI Community Forum 2012, Munich.
http://www.wf4ever-project.org/
http://purl.org/wf4ever/model
http://cf2012.egi.eu/
https://www.egi.eu/indico/sessionDisplay.py?sessionId=66&confId=679#20120328
Understanding new ways of sharing content for learning and researching (@cristobalcobo)
This lecture explores how the expansion of the Internet and a variety of digital devices has influenced the way that information and knowledge are generated, consumed and distributed, particularly in the scholarly environment.
Collection directions - towards collective collections (lisld)
How the emergence of new research and learning workflows in digital environments is affecting library collecting and collections. Several trends are reviewed. In the light of diversifying competing requirements, the need to manage down print and develop shared print responses is discussed.
Presentation to OCLC Asia Pacific Regional Council meeting. 13 Oct. 2014.
Presentation about reference rot given at the Complexity Science Hub in Vienna, November 2021.
Links to web resources frequently break (link rot), and linked content can change at unpredictable rates (content drift). These dynamics of the Web are detrimental when references to web resources provide evidence or supporting information.
This presentation will report on research that assessed the extent of these problems for links to web resources in scholarly literature, by using three vast corpora of publications and a range of public web archives. It will also describe the Robust Link approach that offers a proactive, uniform, and machine-actionable way to combat link rot and content drift. Finally, it will introduce the Robustify web service and API that was devised to generate links that remain functional over time, paying special attention to challenges related to deploying infrastructure that is required to be long lasting.
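A Robust Link decorates an ordinary HTML anchor with attributes that carry an archived snapshot URI and the linking datetime, so a reader can fall back to the snapshot when the live link rots or drifts. A minimal Python sketch with illustrative URIs, using the data-versionurl and data-versiondate attribute names of the Robust Links convention:

```python
from html import escape

def robust_link(href, version_url, version_date, text):
    """Render an HTML anchor decorated per the Robust Links convention:
    original URI in href, archived snapshot in data-versionurl,
    linking datetime in data-versiondate."""
    return ('<a href="{}" data-versionurl="{}" data-versiondate="{}">{}</a>'
            .format(escape(href, quote=True),
                    escape(version_url, quote=True),
                    version_date,
                    escape(text)))

# Illustrative URIs and snapshot:
link = robust_link(
    "http://example.org/page",
    "https://web.archive.org/web/20211101000000/http://example.org/page",
    "2021-11-01",
    "the cited page")
print(link)
```

Because the decoration is plain HTML attributes, it is machine-actionable yet degrades gracefully: browsers without Robust Links support simply follow the ordinary href.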
Researcher Pod: Scholarly Communication Using the Decentralized Web (Herbert Van de Sompel)
The presentation provides an overview of the motivation and direction of the Mellon-funded Researcher Pod project that investigates technical aspects of scholarly communication in a decentralized web setting.
Presentation for a workshop about persistent identifiers organized by the Royal Library of The Netherlands and DANS. Highlights the non-trivial commitments required of all parties involved in persistent identifier systems to actually keep links based on persistent identifiers ... err ... persistent.
Various FAIR criteria pertaining to machine interaction with scholarly artifacts can commonly be addressed by means of repository-wide affordances that are uniformly provided for all hosted artifacts rather than through artifact-specific interventions. If various repository platforms provide such affordances in an interoperable manner, devising tools - for both human and machine use - that leverage them becomes easier.
My involvement, over the years, in a range of interoperability efforts has brought the insight that two factors strongly influence adoption: addressing a burning issue and delivering a KISS solution to tackle it. Undoubtedly, FAIR and FAIR DOs are burning issues. FAIR Signposting <https://signposting.org/FAIR/> is an ad-hoc repository interoperability effort that squarely fits in this problem space and that purposely specifies a KISS solution, hoping to inspire wide adoption.
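To give a flavor of how KISS the solution is: a FAIR Signposting response boils down to typed links in an HTTP Link header on the landing page and content resources. A minimal Python sketch with illustrative URIs; the relation types (cite-as, item, describedby) follow the Signposting conventions:

```python
def link_header(links):
    """Serialize (target, rel) pairs as an HTTP Link header value."""
    return ", ".join('<{}>; rel="{}"'.format(target, rel)
                     for target, rel in links)

# Typed links a repository landing page might expose (illustrative URIs):
value = link_header([
    ("https://doi.org/10.1234/example", "cite-as"),
    ("https://repo.example.org/record/1/files/data.csv", "item"),
    ("https://repo.example.org/record/1/metadata.jsonld", "describedby"),
])
print("Link:", value)
```

Because these are repository-wide affordances, a platform can emit them uniformly for every hosted artifact without artifact-specific intervention.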
Registration / Certification Interoperability Architecture (overlay peer-review) (Herbert Van de Sompel)
Presentation for the COAR meeting on Overlay Peer-Review held at INRIA, Paris, France. It provides overall context regarding a scholarly communication system in which the core functions of scholarly communication (registration, certification, awareness, archiving) are implemented in a decoupled manner and whereby each function can simultaneously be fulfilled by different parties, potentially in different ways. It shows how notifications can be used to achieve loosely coupled, point-to-point interoperability in such an environment, zooming in on interoperability between registration and certification, i.e., between repositories and overlay peer-review services.
Slides used for a keynote presentation at the VIVO 2019 Conference in Podgorica, Montenegro.
Abstract: The invitation to present a keynote at the VIVO Conference and the goal of the VIVO platform, as stated on the DuraSpace site, to create an integrated record of the scholarly work of an organisation reminded me of various efforts that I have been involved in over the past years that had similar goals. EgoSystem (2014) attempted to gather information about postdocs that had left the organisation, leaving little or no contact details behind. Autoload (2017), an operational service, discovers papers by organisational researchers in order to upload them in the institutional repository. myresearch.institute (2018), an experiment that is still in progress, discovers artefacts that researchers deposit in web productivity portals and subsequently archives them. More recently, I have been involved in thinking about the future of NARCIS, a portal that provides an overview of research productivity in The Netherlands. The approaches taken in all these efforts share a characteristic motivated by a desire to devise scalable and sustainable solutions: let machines rather than humans do the work. In this talk, I will provide an overview of these efforts, their motivations, the challenges involved, and the nature of success (if any).
Presentation for PIDapalooza 2019, Dublin, Ireland.
The Scholarly Orphans project, funded by the Andrew W. Mellon Foundation, explores technical approaches aimed at capturing and archiving scholarly artifacts that researchers deposit in web productivity portals as a means to collaborate and communicate with their peers. These artifacts are not collected by other frameworks aimed at archiving the scholarly record (e.g., LOCKSS, Portico, Institutional Repositories) and are only incidentally captured by web archives. The project explores an institution-driven approach inspired by web archiving. To demonstrate the ongoing thinking, the project has devised an experimental automated pipeline that continuously discovers, captures, and archives artifacts. These are created by actual researchers who, for the purpose of the experiment, were virtually enlisted in a fictive research institution. A portal at myresearch.institute provides an overview of the artifacts that were discovered and provides access to archived versions stored in both an institutional and a cross-institutional archive. The set-up leverages a range of technologies that share a flavor of persistence: Memento, Memento Tracer, Robust Links, Signposting.
As a memento of my last week of working at LANL, I put together a slide deck that provides an overview of major efforts conducted during the time I was there.
Presentation given at EuropeanaTech 2018 in Rotterdam, The Netherlands. Provides a summary of insights gained from working for about a decade on challenges related to temporal aspects of the web, persistence.
"Scholarly Communication: Deconstruct and Decentralize" was presented at the Fall 2017 Meeting of the Coalition for Networked Information. It explores working towards a Scholarly Commons by applying decentralized web ideas to scholarly communication.
Looks at hyperlinks from the perspective of a managed collection of resources for which link persistence/integrity is considered a quality of service concern. Distinguishes between links into other managed collections and to the web at large. Considers link rot and content drift.
This slide deck provides an overview of proposals to use HTTP Links as a means to address some long standing problems related to scholarly resources on the web.
These slides go with the paper "Reminiscing About 15 Years of Interoperability Efforts" which is available at http://dx.doi.org/10.1045/november2015-vandesompel
Slides were used for a presentation at the Fall 2015 Membership Meeting of the Coalition for Networked Information.
This presentation looks back at several efforts, conducted in the past fifteen years, aimed at establishing interoperability for web-based scholarly communication. It tries to characterize the perspectives/approaches taken by these efforts and, based upon that, proposes an HATEOAS-based approach to interlink scholarly nodes on the web. This was first presented at the Research Data Alliance meeting in Paris, France, September 22 2015.
Extended version of slides presented at the "404/File Not Found" symposium held at Georgetown University on October 24 2014, see http://www.law.georgetown.edu/library/404/ . The presentation provides a brief overview of the link/reference rot problem and then discusses three complementary strategies to combat it: Pro-actively capturing web resources that are linked from a seed collection; Referencing the captures by means of annotated links; Accessing the captures using Memento infrastructure.
This presentation introduces ResourceSync, a specification aimed to enable web-based synchronization of resources. The specification is the result of a collaboration between NISO and the Open Archives Initiative funded by the Sloan Foundation and JISC. The proposed resource synchronization approach is based on several existing specifications (e.g. Sitemaps, PubSubHubbub, well-known URI) and is aligned with common architectural principles (e.g. REST, follow your nose).
A 15 minute video version of these slides is available at https://www.youtube.com/watch?v=ASQ4jMYytsA
This presentation provides an overview of the Memento "Time Travel for the Web" framework that is aligned with the stable version of the Memento protocol, specified in RFC 7089.
The slides were used to accompany an overview of the outcomes of the ResourceSync project at the 2014 Spring Membership Meeting of the Coalition for Networked Information (CNI).
The launch of ResourceSync, a joint project of the National Information Standards Organization (NISO) and the Open Archives Initiative (OAI) funded by the Alfred P. Sloan Foundation, was motivated by the ubiquitous need to synchronize resources for applications in the realm of cultural heritage and research communication. After an initial problem definition and scoping phase, the project has designed, specified, and tested a framework for web-based synchronization that is based on Sitemaps, a protocol widely used by web servers to advertise the resources they make available to search engines for indexing. This choice allows repositories to address both search engine optimization and resource synchronization needs using the same technology.
The ResourceSync framework specifies various modular capabilities that a repository can support in order to allow third party systems to remain synchronized with its evolving resources. For example, a Resource List provides an inventory of resources whereas a Change List details resources that were created, deleted or updated during a given temporal interval. Support for capabilities can be combined in order to meet local or community requirements. The framework specifies capabilities that require a third party to recurrently poll for up-to-date information about a repository's resources but also publish/subscribe capabilities that keep third parties informed about changes through notifications, thereby significantly reducing synchronization latency.
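As an illustration of the Sitemap-based format, the following Python sketch builds and parses a minimal Change List describing one updated resource. The URIs and datetimes are illustrative:

```python
import xml.etree.ElementTree as ET

SM = "http://www.sitemaps.org/schemas/sitemap/0.9"
RS = "http://www.openarchives.org/rs/terms/"

# A minimal ResourceSync Change List (illustrative URIs and datetimes):
changelist = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="{sm}" xmlns:rs="{rs}">
  <rs:md capability="changelist"
         from="2014-01-01T00:00:00Z" until="2014-01-02T00:00:00Z"/>
  <url>
    <loc>http://example.org/res1</loc>
    <rs:md change="updated" datetime="2014-01-01T12:00:00Z"/>
  </url>
</urlset>""".format(sm=SM, rs=RS)

# A synchronizing third party would parse the list like this:
root = ET.fromstring(changelist.encode("utf-8"))
capability = root.find("{%s}md" % RS).get("capability")
changes = [u.find("{%s}md" % RS).get("change")
           for u in root.findall("{%s}url" % SM)]
print(capability, changes)
```

The same urlset structure, distinguished by the capability attribute of the rs:md element, is reused across the framework's capability documents, which keeps a Sitemap-aware toolchain applicable.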
Persistent Identifiers and the Web: The Need for an Unambiguous Mapping (Herbert Van de Sompel)
Presentation given at the International Digital Curation Conference in San Francisco, February 26 2014. Highlights the lack of machine-actionability of persistent identifiers assigned to scholarly communication assets. Proposes an approach to address the issue that meets requirements that take into account the changing nature of web based research communication. A draft paper provides more details: http://public.lanl.gov/herbertv/papers/Papers/2014/IDCC2014_vandesompel.pdf
Augmenting interoperability across scholarly repositories
1. Augmenting Interoperability
across Scholarly Repositories
Harvest
Obtain
Put
Herbert Van de Sompel
Research Library
Los Alamos National Laboratory, USA
This work was supported by NSF award number IIS-0430906 (Pathways)
JISC CNI Conference, York, UK, July 6th 2006
2. Pathways Project
• NSF grant number IIS-0430906
• http://www.infosci.cornell.edu/pathways/
• PIs: Carl Lagoze, Sandy Payette, Herbert Van de Sompel, Simeon Warner
• Research Participants: Lyudmila Balakireva, Jeroen Bekaert, Xiaoming Liu, Chris Wilper, Zhiwu Xie
3. Meeting in NYC, April 20-21 2006
• Supported by Microsoft, the Mellon Foundation, the Coalition for Networked Information, the Digital Library Federation, and JISC
• Representatives from institutional repository projects, scholarly content repositories, registry projects, and various projects that touch on interoperability
• See http://msc.mellon.org/Meetings/Interop/ for agenda, participants, topics & goals, terminology, presentations, and a prototype demonstration
• Report available July 2006
4. And more discussions with the community
• Panel at JCDL 2006, Chapel Hill, NC
• IATUL 2006, Porto, Portugal
• ElPub 2006, Bansko, Bulgaria
• Meeting at the University of Southampton, UK
5. Context: the Repository model
An environment consisting of Digital Object Repositories with a long life expectancy:
o Scholarly repositories
  - Institutional repositories
  - Discipline-oriented repositories
  - Publisher's repositories
  - Dataset repositories
  - …
o Cultural heritage repositories
o Preservation archives
o Educational repositories
6. Context: compound digital objects
Objects of the scholarly communication system are increasingly compound in nature, simultaneously consisting of:
• Multiple media types
• Multiple content types
  o papers,
  o datasets,
  o simulations,
  o software,
  o dynamic knowledge representations,
  o machine-readable chemical structures
7. Context: the Repository model
• We must leverage the value of the materials that become available in those distributed Repositories.
• Think of these Repositories as active nodes in a global environment, not as passive local nodes:
  o These Repositories are about facilitating the use and re-use of materials in many contexts
  o These Repositories are the starting point of value chains
• In order to enable value chains, we need to augment interoperability across repositories
8. Motivation 1 : Richer cross-Repository services
Distributed Repositories provide source materials for cross-Repository overlay services, such as discovery services and selective collecting services.
Need: digital object representation, harvesting interface, datastream semantics
9. Motivation 2 : Scholarly communication workflow
Distributed Repositories are at the basis of a digital scholarly communication system: scholarly communication as a global workflow across those Repositories, in which objects are recombined and value is added.
Need: digital object representation, obtain interface, put interface
10. Augmenting interoperability across Repositories
Shared Data Model and Services
DSpace
Nature
ePrints
Fedora
aDORe
arXiv
Individual Data Models and Services
11. Considerations re interoperable framework
• Scholarly communication is a long-term endeavor:
  o Need abstract definitions of Repository interfaces that can be instantiated on the basis of various technologies as time goes by
  o Repository interfaces need to work with whichever type of identifier (current and future), because Repositories will use whichever type of identifier
• Value chains do not require transfer of all digital object content:
  o The content that needs to be transferred depends on the nature of the value chain
• Recording a chain of evidence of a value chain requires fine granularity of identification:
  o Not only the identifier of the digital object but also of the repository
12. Augmenting interoperability across Repositories
Obtain
Harvest
Put
DSpace
Nature
ePrints
Fedora
aDORe
arXiv
Individual Data Models and Services
13. Augmenting interoperability across Repositories
Pathways Core Data Model for Cross-Repository Services
Bekaert, Jeroen, Xiaoming Liu, Herbert Van de Sompel, Sandy Payette, Carl Lagoze, and Simeon Warner. "Pathways Core: A Data Model for Cross-Repository Services." Poster, JCDL 2006. http://public.lanl.gov/herbertv/papers/pathways_core_poster_submit.pdf
14. Augmenting interoperability across Repositories
Pathways Core Surrogates (currently XML/RDF)
• A Surrogate is available for every Digital Object
• A Surrogate is a representation of the Digital Object according to the Pathways Core data model
• The representation is uniform across repositories; not tied to identifier type, content type, or application domain
• The Surrogate is what is used in the value chains; the Surrogate is used at the Obtain, Harvest and Put interfaces
  o Expresses properties and access points for the Digital Object (see later)
  o The Surrogate for a specific Digital Object can change over time
15. Augmenting interoperability across Repositories
Pathways Core Surrogates (currently XML/RDF)
• The Surrogates provide by-reference access to constituent datastreams of Digital Objects
  o Full asset transfer is only required for certain applications
  o Static asset transfer may be undesirable for dynamic objects => live references
• Avoid IP issues at the level of the interoperability framework
  o The idea is that the Surrogate itself is not encumbered by IP issues; attach, by definition, a liberal Creative Commons license to Surrogates
  o Allow Surrogates to flow freely, independent of the business models of the underlying content
16. Augmenting interoperability across Repositories
Pathways Core Surrogates (currently XML/RDF)
• A Surrogate expresses access points and properties of a Digital Object, e.g.:
  o Location of content streams
  o providerInfo: the keys necessary to Obtain a fresh Surrogate at some later point in time: (repository identifier, preferredIdentifier, versionKey)
  o Lineage: a Surrogate expresses its predecessor(s) == providerInfo in a previous life
  o semantic: a Surrogate expresses the type of content
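The fields listed above can be sketched as a small data structure. This is an illustrative sketch only: the field names are taken from the slide, but the class layout is an assumption, not the actual Pathways Core XML/RDF schema.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass(frozen=True)
class ProviderInfo:
    """The keys necessary to Obtain a fresh Surrogate later on."""
    repository_id: str          # identifier of the repository
    preferred_identifier: str   # identifier of the digital object
    version_key: str            # pins a specific version

@dataclass
class Surrogate:
    """Hypothetical sketch of a Pathways Core Surrogate."""
    provider_info: ProviderInfo
    semantic: str                                                  # type of content, e.g. "paper"
    datastream_locations: List[str] = field(default_factory=list)  # by-reference access points
    lineage: List[ProviderInfo] = field(default_factory=list)      # providerInfo in previous lives

# A Surrogate derived in a value chain records its predecessor in Lineage:
original = Surrogate(ProviderInfo("arXiv", "arXiv:cs/0601066", "v1"),
                     semantic="paper",
                     datastream_locations=["http://arxiv.org/pdf/cs/0601066v1"])
derived = Surrogate(ProviderInfo("Repo2", "oid:123", "v1"),
                    semantic="paper",
                    lineage=[original.provider_info])
assert derived.lineage[0].repository_id == "arXiv"
```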
17. Augmenting interoperability across Repositories
Obtain interface: a Repository interface that supports the request of services pertaining to individual Digital Objects (including their component Datastreams). The core service is the request of a Surrogate for a Digital Object.
Harvest interface: a Repository interface that exposes Surrogates for incremental collecting/harvesting.
Put interface: a Repository interface that supports submission of one or more Surrogates into the Repository, thereby facilitating the addition of Digital Objects to the collection of the Repository.
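The three interfaces were defined abstractly so they could be bound to different technologies over time. A minimal sketch of that abstraction, with method names and signatures that are assumptions rather than the project's actual API:

```python
from abc import ABC, abstractmethod
from typing import Iterable, Optional

class Repository(ABC):
    """Abstract sketch of the Obtain/Harvest/Put Repository interfaces."""

    @abstractmethod
    def obtain(self, preferred_identifier: str,
               version_key: Optional[str] = None) -> dict:
        """Return the Surrogate for one Digital Object (latest version by default)."""

    @abstractmethod
    def harvest(self, since: Optional[str] = None) -> Iterable[dict]:
        """Expose Surrogates for incremental collecting, optionally from a datestamp."""

    @abstractmethod
    def put(self, surrogates: Iterable[dict]) -> None:
        """Accept Surrogates, adding the corresponding Digital Objects to the collection."""

# A toy in-memory binding showing the three interfaces in action:
class MemoryRepository(Repository):
    def __init__(self):
        self._store = {}

    def obtain(self, preferred_identifier, version_key=None):
        return self._store[preferred_identifier]

    def harvest(self, since=None):
        return list(self._store.values())

    def put(self, surrogates):
        for s in surrogates:
            self._store[s["preferredIdentifier"]] = s

repo = MemoryRepository()
repo.put([{"preferredIdentifier": "oid:1", "semantic": "paper"}])
assert repo.obtain("oid:1")["semantic"] == "paper"
assert len(list(repo.harvest())) == 1
```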
18. Surrogate is at the core of the value chain
[Diagram: Surrogates are Obtained from source repositories, recombined with added value, and Put into another repository; each resulting Surrogate carries providerInfo and Lineage pointing back to its predecessors.]
19. Basis for a Network of Linked Digital Objects
20. [Diagram: Repo1 and Repo2 each expose their own Obtain, Harvest, and Put interfaces (Obtain1/Harvest1/Put1 and Obtain2/Harvest2/Put2), which a cross-repository service must discover and use.]
21. Registry Service
(provider, preferredIdentifier, versionKey) == providerInfo
[Diagram: a service uses a Registry Service to resolve the provider from a Surrogate's providerInfo to that repository's interfaces:]

provider | Obtain  | Harvest  | Put
Repo1    | Obtain1 | Harvest1 | Put1
Repo2    | Obtain2 | Harvest2 | Put2
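The registry lookup shown in the table can be sketched as follows. The endpoint names mirror the slide's labels and are placeholders, not real service URLs:

```python
# Registry mapping a provider identifier to that repository's
# Obtain, Harvest, and Put service endpoints (placeholder values).
REGISTRY = {
    "Repo1": {"obtain": "Obtain1", "harvest": "Harvest1", "put": "Put1"},
    "Repo2": {"obtain": "Obtain2", "harvest": "Harvest2", "put": "Put2"},
}

def resolve(provider_info: tuple) -> dict:
    """Map a providerInfo triple (provider, preferredIdentifier, versionKey)
    to the provider's interface endpoints."""
    provider, _preferred_identifier, _version_key = provider_info
    return REGISTRY[provider]

# Follow a Surrogate's providerInfo back to its repository's Obtain interface:
interfaces = resolve(("Repo1", "oid:42", "v2"))
assert interfaces["obtain"] == "Obtain1"
```

Because only the provider component of providerInfo is needed for the lookup, the same registry serves to Obtain a fresh Surrogate for any object or version held by that repository.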