A call to action to discuss and agree on practical considerations around the creation, publication and discovery of linked open data about historical activities and objects.
Text of approximately what I said: http://bit.ly/usable_lod
Good morning! Thank you Leif/Elton for the introduction, and for the invitation to come and give the keynote presentation. I'm honestly excited to be here, as my relatively recent move to the Getty [smile at Nicole] has given me an amazing opportunity to become more deeply involved in both Linked Open Data, and how to use it pragmatically to describe objects and the events that carry them through history.
I’m going to start with a brief meta-level "header" about Keynotes, if you'll indulge me. There are three types of Keynotes, in my experience. The first two types tend to bore me to tears, so I hope not to repeat their mistakes!
This is not a “CV” sort of keynote, where I just talk about things that I’ve done. If you want to read my CV, well ... you can read my CV ... or better, come and talk to me over the next couple of days!
And it’s not a dry, background work, scene setting, domain survey. You already know all of that, and while you might have heard about a couple of things that you didn’t know about, you might also have been asleep by that point.
Which leaves the third type: the call to action. I’m going to try and lay out where I think we, as a community, should be trying to go, and some of the challenges we need to decide how to resolve along the way. How can we make a difference beyond these two days, when we get back to our regular work? And I’m not going to be coy and keep you in suspense as to what I want us to do ...
It’s pretty simple. We should come together as a Community to agree on how best to create and publish linked open data about historical events and their participants.
And then, of course, to go home and actually do something about it :)
That's also pretty much the outline of the presentation. I’m going to talk about Community, which leads into what I think constitutes good linked data, and then about creating and publishing it.
In my experience of the technical information standards world, starting with Z39.50 and SRU about 20 years ago, the most successful specifications are those that come from Communities, not as mandates from self-appointed Committees.
Direction and vision are required... the focus of the community needs to be articulated clearly and convincingly. Communities need awareness and understanding of the problems that they are facing, and the motivation to work together to overcome them.
There needs to be leadership, but not at the expense of being open and welcoming. Not at the expense of engaging and understanding what is actually needed. Instead the focus gives a methodology for community based decision making, ensuring that the problems being solved are real for the members of the community.
Participation is the key requirement in solving practical challenges, not reputation. And Active participation, not just lurking and occasionally making a snide comment about how nothing is getting done. A community that isn't doing anything, isn't a community. It needs to not only think about the end result, but actively consider and adapt how it is getting there.
So Flexible, not Agile? I prefer the notion of the community being flexible but strong – when the willow tree sways with the wind it’s flexible, but stays true to its purpose. A cat agilely avoiding a dangerous situation doesn’t address the danger, it just avoids it. Agility lets you dance around the problem, Flexibility lets you overcome it.
Focused, Open, Active, Flexible ... sounds great, but how do we get there?
Katherine Skinner of Educopia introduced me to this notion of the community engagement pyramid. There are a few people at the top of the pyramid, and increasingly many as you move down the tiers. In the IIIF community, for example, there are probably 5 active leaders, but a good 10-20 experts and advocates, maybe 50-100 contributors, 500 or so members that aren't constantly engaged but actively following, and then an impossible to determine number of people on the edges looking in. Or up.
The point is not to have a hierarchical and fixed structure, but to recognize that people look upwards and it is the sign of a healthy community when everyone on the tier above is reaching down to help those who want to take on a bigger role to do so.
In order to be successful in pulling off the amazing transformation of cultural heritage into linked open data, we need to have a solid understanding of how to work together strategically, while advancing our own organizations' immediate goals.
Catherine Bracy, Director of Community for Code for America, gave an outstanding keynote at the Museum Computer Network in November, and I'm shamelessly echoing her points on how community leadership can most effectively work because they're bang on.
There are four easy steps:
First, know your audience. Who is the community, and who, as a community leader, do you need to be working directly with.
Secondly, reach out to those people on their terms, not on yours. You need them to participate, and for that, they need to understand and agree with the goals and direction.
Thirdly, have a continuing conversation with the community about everything!
And fourth, make opportunities for people to actively and meaningfully participate. Through participation you're building the ladders to bring them to the next level of the pyramid.
Let's get more concrete...
And perhaps a little controversial ... Who IS the audience for Linked Open Data?
(beat)
Developers. Developers, Developers everywhere. And I mean that literally, not just your developers, but developers everywhere. Developers who cannot come to your office and ask why an E12 Production is an Activity, but an E6 Destruction is only an Event.
Oh and if you can explain that to _me_, that'd be great too.
Let's look at the LOD community pyramid, rather than the generic one. There are a few architects, ontologists and leaders who discuss, design, create and advocate for ontologies. Then there are experts and content providers who understand and use those ontologies to make their data available ... to developers to build applications for the users, and one of the most focused-on classes of user is researchers.
Individuals can fit into multiple tiers, and you should think of them for the purposes of this pyramid as if they were the highest of those tiers. A researcher that can also write code or SPARQL queries is a developer. Thinking that “lots of researchers use my data” because they can write SPARQL queries is a category error – developers that are also researchers use your data.
Meet the developers on their terms. Don’t go to developers and talk about ontologies and triples and namespaces and reification and inference and SPARQL and quadstores and Turtle and ... You lost them way back at ontologies.
Instead talk about JSON, and APIs, and HTTP, and content, and applications. Because remember, developers are your gateway to the users of their applications. So as Lee says, listen to them and engage in a way that makes them feel needed and wanted, because they are!
And don't just talk AT them, discuss WITH them. Have a conversation about what they need, what you can do to help them, and of course what they know their users need. Some middle aged white guy [cough] talking at an audience is a presentation, not a community.
Finally, create opportunities for the community to participate meaningfully. Not just listening but actively engaging. Can the developers help fix your bugs? Can their users annotate your content with corrections and suggestions of related content?
Listening to and acting on feedback is important, but think about ways for others to _get involved_. Can you provide APIs to let developers and users actually do something directly?
I love the expression of the woman in the middle. She's REALLY dubious about something! Maybe ...
Ha! Said no Developer, Ever. Meeting on their terms, remember.
That seems much more likely! Or at least, that's what I hear most often.
So let's look at how this workflow plays out. It starts with the data being created and knowledge being represented – which is expensive! Money goes in, and something called "triples" comes out the other end of what appears to most management like a black box.
Those triples go to the developer, who has to actually DO something with them to build a web application ...
Which is then used by researchers (and others) to form hypotheses, do research and write papers. But each of them has different expectations ...
For the creation and transformation of the dataset, there should be “No data left behind” – the ontology and model should be thorough and complete, a good knowledge representation.
The developer, however, wants the data to be usable and understandable, otherwise he can’t do his job of making it available.
And the researcher, not knowing any of the process behind the HTML application, needs the data to be accurate, otherwise when using it and comparing it with other data, his research will be equally inaccurate.
What I've come to realize is that Linked Data can be Complete, Usable, Accurate ...
... Pick one.
... And pick Usable.
It would be great to pick all three, but like “Good, Cheap, Fast, Pick two” it’s a matter of limited resources and priorities. For today, let’s take Accuracy of the data off the table. Ensuring that all statements correctly represent reality is a direct function of resources (spending time to find and fix errors), not the model or its representation. And we’re not going to solve the challenges of humanities funding today.
By trying to make the data both Complete AND Usable, we're trying to optimize for two independent variables at the same time, with different purposes and most importantly different audiences. Are we prioritizing the needs of the ontologist and data manager, or the developer that has to work with the result? In my experience, we tend to meet our own requirements first because we care about the knowledge representation and hope that the developer can make do with what they get, without much thought to the API.
Optimizing for completeness fulfills the knowledge representation use case. But Linked Data also has a protocol aspect – it’s a fundamental part of LOD that the representation is available at its URI via HTTP.
This means that we are implicitly also designing, and hopefully co-optimizing for, a Usable API. And this is where the Completeness and Usability axes become more complicated. The question is not only can I understand and use the model or data in that model, it’s how easy is it to use that data in the way it is made available – its serialization and the network transfers.
I think the chart of ontologies looks something like this ...
At the beginning, any completeness adds to usability. Then there's a dramatic rise in usability, without adding so much completeness when you hit the sweet spot of "just enough data". Then as you add more to the ontology, it starts to drop off slowly at first ... not so much as to be a significant problem ... but then you reach the tipping point where it becomes incomprehensible, and as completeness tends to 100%, usability tends to 0%
Okay, I know you want examples, so some more potential controversy ...
Bibframe 1.0 was terrible. It was complex without actually addressing the issues, worse than Simple Dublin Core, which at least has enough to get /something/ done with. Then you have frameworks further up the usability scale, like Web Annotations and EDM, which are a little more complete, but by no means everything you want to say.
The IIIF Presentation API is, demonstrably, about as usable as a linked data API can get ... but we're constantly fighting to stay at high usability by resisting requests to add features.
Then comes the slippery slope that schema.org is further down ... still usable for now, but they're constantly adding to it ... and not in a sustainable or directed fashion ... until you hit rock bottom with CIDOC CRM and the meta-meta-meta statements available via CRMInf.
The zone most ontologies should aim to be in, in my opinion, is the top right hand corner ... maximizing usability and completeness, maximizing the area between 0,0 and the x,y point reached. The community needs to take into account where on that slope will result in greatest adoption.
For our wedding anniversary, my wife and I drove up to see the giant redwoods in Northern California. There’s a living tree you can drive your car through even. However, there’s still the forest surrounding the trees that people want to look at, and without the forest, those trees would fall in the next storm. You can cut down a few trees to make sure that the rest survive and people can see them, but there’s no need to build paths to every tree. Like the forest, you can and should keep data around towards completeness and stability of the whole, without exposing it to developers. In the Getty vocabularies, we have a changelog of every edit for every resource … we publish that, and it just gets in the way of understanding for no value. We don’t have to throw it away completely to increase usability, we can just leave it out of the API.
So how do we understand what we should publish?
The other stages in the workflow have reasonably well understood evaluation processes -- the formal validity of the ontology, the extent to which it can encode all of the required information, unit and integration tests for code, user acceptance tests for the application, user personae to guide development ... but how should we evaluate the quality of the API we're providing to our data?
Michael Barth lays out six fundamental features for API evaluation.
Abstraction Level -- is the abstraction of the data and functionality appropriate to the audience and use cases. An end user of the "car" API presses a button or turns a key. A "car" developer needs access to engine controls, all the modern safety features and so forth.
Comprehensibility -- is the audience able to understand how to use it to accomplish their goals
Consistency -- if you know the "rules" of the API, how well does it stick to them? Or how many exceptions are there to a core set of design principles (like Destruction not being an Activity)
Documentation -- How easy is it to find out the functionality of the API?
Domain Correspondence -- If you understand the core domain of the data and API, how closely does the understanding of the domain align with an understanding of the data?
And finally, what barriers to getting started are there?
Sometimes I hear linked data experts say "You should just use Federated SPARQL Queries". But SPARQL, let alone federations of systems, performs very poorly on all of those metrics. To explain why "consistency" is "mediocre", that's because everyone has different underlying models and exceptions, and SPARQL is a complex but very thin layer over those models. The Abstraction level for SPARQL is for the car designer who knows everything and needs to be able to get at it, which is a very very limited number of people.
So when I hear people say "You should use SPARQL", my internal reaction is ...
Now you have more problems than you can count.
Or … You can tell that people are using your SPARQL endpoint, because it’s down.
And even if you love SPARQL, the incontrovertible fact is that when you compare SPARQL developers to REST + JSON developers, the Venn diagram looks something like this. Sorry it's a little hard to see, let me zoom in a little for you ...
Is that better? There MIGHT be one SPARQL developer who doesn't know JSON, but I doubt it.
Okay, so what do we need to have available in that data? (This is the brief background, scene setting bit)
The scope, in my view, is broadly covered by "historical activities", and the participants in those activities. We need a model for describing them, and shared resources, such as people, places and objects, should have shared identities. There's no need to get into philosophical wars right now about the best ontology or the most appropriate sources of shared identity, so what CAN we practically and pragmatically discuss?
Let's start with serialization. JSON-LD. It's JSON with explicit, managed structure. The keys are named in a way that’s easily understandable to humans and easy to use when programming. No hyphens, no numbers, no strange symbols everywhere, …
And let me explain the significance of the colors. All the blue strings are URIs, including "ManMadeObject" and "Material". The only actual string literals are the two red labels. JSON-LD lets you manage the complexity of the graph in a way that ends up familiar to the audience, the developer, not daunting to them. Remember: Meet On Their Terms.
Or ... Curly brackets are the new Angle brackets.
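Rendered as plain text, the kind of record on that slide looks something like the sketch below. This is an illustration, not the published model: the context URL, object URI and key names are assumptions, the AAT identifier for "painting" is given from memory, and (as on the slide) the only string literals are the two labels.

```json
{
  "@context": "https://example.org/contexts/crm.json",
  "id": "https://example.org/object/1",
  "type": "ManMadeObject",
  "label": "Example Watercolor Painting",
  "classified_as": "http://vocab.getty.edu/aat/300033618",
  "made_of": {
    "id": "http://vocab.getty.edu/aat/300015045",
    "type": "Material",
    "label": "watercolor"
  }
}
```

Everything except the two label values expands to a URI once the context is applied; the developer just sees friendly keys and nested JSON.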
If we’ve solved serialization ... well, have a direction at least ... what ARE the challenges we need to work on? Let’s count down my top 5.
Coming in at number 5 … Order!
History is Ordered, and order is hard in RDF, right?
We’re fortunate that history is ordered globally by the steady march of time, not locally.
For historical events, we can be universally correct, Dr Who time travel aside, saying that an event in 1500 occurred after an event in 1490. No need to worry about explicitly ordering them, when applications can use the timestamps to do it themselves, as needed.
But for local ordering, use rdf:List. The serialization in JSON-LD is good, SPARQL 1.1 supports property chains, and SPARQL support shouldn’t be our main concern anyway.
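A minimal sketch of what that looks like in JSON-LD, assuming a hypothetical ordered property (the property name and all the URIs here are illustrative only): a term mapped with "@container": "@list" serializes its array value as an rdf:List, so the order of the array survives into the triples.

```json
{
  "@context": {
    "label": "http://www.w3.org/2000/01/rdf-schema#label",
    "stops": {
      "@id": "https://example.org/ns/orderedStop",
      "@container": "@list",
      "@type": "@id"
    }
  },
  "@id": "https://example.org/voyage/1",
  "label": "An example voyage, with its stops in order",
  "stops": [
    "https://example.org/place/lisbon",
    "https://example.org/place/cape-town",
    "https://example.org/place/goa"
  ]
}
```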
Number 4 is "boundary of representation" ... or which triples should be included when you dereference a particular URI.
This is a critical point for the use of Linked Data as an API. We need to optimize the representations for use cases, based on the developer audience and what they need. There's no one rule that can generally determine the best practice here.
Note that the terms from AAT are used both by reference -- the object is classified as a painting, but without even a label -- and by value -- we're explicit that aat:300015045 is a Material, and it has a label of "watercolor". Why? Well, why indeed?! This is just an example, but one that we need to question with a critical eye, and discuss with developers as to whether it meets their needs.
JSON-LD does have an algorithm called Framing that makes this sort of effort much more consistent and efficient. Also, we might take a leaf or two out of Facebook's GraphQL book, where the request can govern some aspects of the response's representation.
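As a sketch of how a frame could encode exactly that by-value versus by-reference decision (reusing the assumed key names from the earlier example), a frame can ask the processor to always embed materials but never embed classifications. The @embed flags are part of the framing algorithm; the rest is an assumption about the model's keys.

```json
{
  "@context": "https://example.org/contexts/crm.json",
  "@type": "ManMadeObject",
  "made_of": {
    "@embed": "@always"
  },
  "classified_as": {
    "@embed": "@never"
  }
}
```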
Number 3! The many levels of meta-data that we try to squash into a single response.
RDF is good at making statements about reality, and bad at making statements about statements. At three levels it’s terrible, and when you end up trying to make statements about what you believe that others used to believe, you’ve gone too far into the dream and need to get back to reality.
Inception made for a cool movie, but would be a terrible API. Make broad statements about your dataset, and leave it at that. For example … Associate a license with the dataset, not with each resource … You’re not letting people use Rembrandt, the actual person, with a CC-BY license, so don't claim that (like ULAN currently does). And don't reify everything in order to make mostly blanket statements about certainty or preference. No one is absolutely certain about anything, and everyone has a preference about which label or description to use ... but don't try and encode all of those subjective assertions against each triple!
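A minimal sketch of the dataset-level alternative, assuming VoID and Dublin Core terms are a reasonable choice for the description (the dataset URI is a placeholder):

```json
{
  "@context": {
    "void": "http://rdfs.org/ns/void#",
    "dct": "http://purl.org/dc/terms/"
  },
  "@id": "https://example.org/datasets/objects",
  "@type": "void:Dataset",
  "dct:license": {
    "@id": "https://creativecommons.org/licenses/by/4.0/"
  }
}
```

One license triple for the whole dataset, rather than one reified statement per resource.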
Of course ... naming things.
Plato has Socrates discuss the correctness of names in Cratylus, which leads to the theory of forms, and nominalism. As a race, we haven’t solved this question in the last 2500 years, so I would predict that we’re unlikely to solve it today.
We have this problem in several different guises:
The URIs of Predicates, and their JSON-LD keys
The URIs of instances, particularly ones that are common across datasets, and hopefully across organizations
The only way to know the "best" or "most correct" name for something is through use and discussion.
And if number 2 is Naming Things, then number 1 must surely be Cache Invalidation.
And it really is. Efficient change management in distributed systems, without a layer of orchestration, is literally asking us to predict the future.
The best we can do is add a lightweight notifications layer, using a publish/subscribe pattern with standards like LDN (Linked Data Notifications) and WebSub. Both are currently going through the standardization process within the Social Web Working Group of the W3C.
That gives us distributed hosting, but the potential for centralized discovery and use, where applications are informed about changes to remote systems that have been updated, so that they can update their caches of that information.
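As a hedged sketch, a notification delivered to a subscriber's LDN inbox could be as small as an ActivityStreams "Update" pointing at the changed resource. LDN deliberately leaves the payload shape open, so this is one plausible convention, not a requirement, and the URIs are placeholders.

```json
{
  "@context": "https://www.w3.org/ns/activitystreams",
  "type": "Update",
  "actor": "https://example.org/",
  "object": "https://example.org/object/1",
  "published": "2017-02-24T12:00:00Z"
}
```

The consumer then re-fetches https://example.org/object/1 and refreshes its cache of that resource.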
And finally Number Zero, Off-By-One errors. (beat) Oh.
Home stretch folks, thank you for your attention so far!
Practical Linked Open Data... P L O D ... Plod. I've gotta say, Leif, it's not an awe-inspiring acronym, sorry. Mr Plod the Police Officer is just not the most exciting figure. Police make you think of being told to stop, and we want to move forwards and learn from our mistakes. To get something out there, and iterate.
Remember ... We Want U. U for Usable.
(beat)
So my call to action, fans, is to Get LOUD. Linked Open Usable Data.
With the Community ... Community Linked Open Usable Data ... or the CLOUD.
(And it wouldn't be a linked data keynote without the LOD CLOUD diagram, would it!)
The community includes everyone in that pyramid, and together we can share the burdens across all levels. Shared ontologies, shared identities, shared code, shared use cases ... but also think about this ... By linking to other people's data, you're reducing your own completeness burden. And theirs, at the same time. Not everyone needs a complete description of Rembrandt, or Plato.
Enabling users and developers to provide feedback on your data reduces your burden of Accuracy. They have the possibility to work with you to correct it.
And working directly with developers, regardless of whether they're in your organization or not, validates Usability. Barth has several really good options for how to put that evaluation into practice.
Remember FOAF ... who got that first time round? ... Focused, Open, Active, Flexible.
The audience for your actual Linked Data is the developers within your community.
We need to meet on their terms, and allow them (and their users) to participate in the creation and management process for the data. We thus need to focus on usability of the data, not necessarily the completeness nor the accuracy.
A good way to do that is through the use of JSON-LD, with frames governing the graph boundaries and validated through use cases and discussion about the names used. Let them know, through notifications, when the data is updated.
Let developers help with usability, let the community help with completeness, and let the users help with accuracy.
(beat)
You’ll have to deal with the off by one errors yourself, unfortunately.
Thank You!
I don't want to "take questions". I want us, as a community, to discuss :) So ... Discussion?