This document provides an overview of the state of Linked Data and the Linking Open Data initiative. It describes key concepts like URIs, RDF, and the LOD cloud. It outlines datasets published as part of LOD and tools for mapping, indexing, searching, and navigating Linked Data. Statistics are presented on the size and structure of the LOD graph. The document concludes by discussing challenges in growing the data web and making Linked Data more usable and useful to end users.
An introduction deck on the Web of Data for my team, including a basic Semantic Web and Linked Open Data primer, followed by DBpedia, the Linked Data Integration Framework (LDIF), the Common Crawl Database, and Web Data Commons.
IFLA LIDASIG Open Session 2017: Introduction to Linked Data
Lars G. Svensson
At the IFLA Linked Data Special Interest Group open session in Wroclaw we briefly introduced the mission of the SIG and then went on to a brief introduction to what linked data is and why that topic is important to libraries.
The presentation was held jointly by Astrid Verheusen (general introduction to the SIG) and Lars G. Svensson (introduction to Linked Data)
There has been plenty of hype around the Semantic Web, but will we ever see the vision of intelligent agents working on our behalf? This talk introduces the concepts of the Semantic Web as envisioned by Tim Berners-Lee over 10 years ago and compares that vision to where we have come since then. It includes a discussion of implementations such as XML, RDF, OWL (Web Ontology Language), and SPARQL. After reviewing the design principles and enabling technologies, I plan to show how these techniques can be implemented in WebGUI.
Persistent Identifiers and the Web: The Need for an Unambiguous Mapping
Herbert Van de Sompel
Presentation given at the International Digital Curation Conference in San Francisco, February 26 2014. Highlights the lack of machine-actionability of persistent identifiers assigned to scholarly communication assets. Proposes an approach to address the issue that meets requirements that take into account the changing nature of web based research communication. A draft paper provides more details: http://public.lanl.gov/herbertv/papers/Papers/2014/IDCC2014_vandesompel.pdf
I presented this keynote talk at the WorldComp conference in Las Vegas, on July 13, 2009. In it, I summarize what grid is about (focusing in particular on the "integration" function, rather than the "outsourcing" function--what people call "cloud" today), using biomedical examples in particular.
Slides from my workshop at Open Repositories 2016 about DSpace's Linked Data support. The slides include a short introduction into the Semantic Web and Linked Data, the main ideas behind the Linked Data support of DSpace, information on how to configure this feature and some examples about how to query DSpace installations for Linked Data.
This presentation addresses the main issues of Linked Data and scalability. In particular, it provides details on approaches and technologies for clustering, distributing, sharing, and caching data. Furthermore, it addresses the means for publishing data through cloud deployment and the relationship between Big Data and Linked Data, exploring how some of the solutions can be transferred to the context of Linked Data.
Talk given at Open Knowledge Foundation 'Opening Up Metadata: Challenges, Standards and Tools' Workshop, Queen Mary University of London, 13th June 2012.
Info on the event at http://openglam.org/2012/05/31/last-places-left-for-opening-up-metadata-challenges-standards-and-tools/
A set of slides that provides a high-level overview of the W3C Linked Data Platform specification presented at the 4th Linked Data in Architecture and Construction Workshop.
For a more detailed and technical version of the presentation, please refer to
http://www.slideshare.net/nandana/learning-w3c-linked-data-platform-with-examples
LDAC 2016 programme
http://smartcity.linkeddata.es/LDAC2016/#programme
Repositories are systems that safely store and publish digital objects and their descriptive metadata. Repositories mainly serve their data through web interfaces that are primarily oriented towards human consumption: they either hide their data behind non-generic interfaces or do not publish them at all in a form a computer can easily process. At the same time, the data stored in repositories are particularly well suited for use in the Semantic Web, as metadata are already available and do not have to be generated or entered manually for publication as Linked Data. In my talk I will present a concept for weaving the metadata and digital objects stored in repositories into the Linked (Open) Data Cloud, and the characteristics of repositories that have to be considered while doing so. In particular, it addresses the reuse of existing metadata to publish Linked Data. The concept can be applied to almost any repository software. At the end of my talk I will present an implementation for DSpace, one of the most widely used repository software solutions. With this implementation, every institution using DSpace should be able to export its repository content as Linked Data.
Resource Oriented Architectures: The Future of Data API?
Victor Olex
From the API Strategy & Practice 2013 conference and the creators of SlashDB (http://www.slashdb.com). What are resource-oriented APIs, and how do they contrast with service-oriented architectures? Will the JSON-LD standard reignite the excitement about Linked Data, RDF, and SPARQL?
These slides go with the paper "Reminiscing About 15 Years of Interoperability Efforts" which is available at http://dx.doi.org/10.1045/november2015-vandesompel
Slides were used for a presentation at the Fall 2015 Membership Meeting of the Coalition for Networked Information.
The W3C Linked Data Platform (LDP) specification describes a set of best practices and a simple approach for a read-write Linked Data architecture, based on HTTP access to web resources that describe their state using the RDF data model. This presentation provides a set of simple examples that illustrate how an LDP client can interact with an LDP server in the context of a read-write Linked Data application, i.e. how to use the LDP protocol for retrieving, updating, creating, and deleting Linked Data resources.
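The four interactions mentioned above map onto plain HTTP verbs. A minimal sketch (the helper function and URIs are my own illustration, not from the LDP specification or the slides) of the requests an LDP client would construct:

```python
# Hedged sketch of the four LDP interactions: ordinary HTTP requests
# against RDF resources. No network calls are made here; we only build
# the request descriptions a client would send.

TURTLE = "text/turtle"

def ldp_request(method, uri, body=None):
    """Build a request description for an LDP interaction (illustrative)."""
    headers = {"Accept": TURTLE}
    if body is not None:
        headers["Content-Type"] = TURTLE  # RDF payload for create/update
    return {"method": method, "uri": uri, "headers": headers, "body": body}

container = "http://example.org/container/"      # hypothetical LDP container
create = ldp_request("POST", container, "<> a <http://example.org/Thing> .")
read   = ldp_request("GET", container + "r1")
update = ldp_request("PUT", container + "r1", "<> a <http://example.org/Thing> .")
delete = ldp_request("DELETE", container + "r1")
```

A real client would send these with any HTTP library; the point is that LDP needs nothing beyond ordinary HTTP plus an RDF serialization such as Turtle.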
Linked Data Basics Slot in WWW2012 Tutorial: Practical Cross-Dataset Queries on the Web of Data
http://latc-project.eu/events/www2012-tutorial-cross-dataset-queries
Morning session talk at the second Keystone Training School "Keyword search in Big Linked Data", held in Santiago de Compostela.
https://eventos.citius.usc.es/keystone.school/
A distributed network of digital heritage information - Unesco/NDL India
Enno Meijers
These slides were presented at the Knowledge Engineering for Digital Library Design Workshop in New Delhi on 25 October 2017. The workshop was organised by Unesco and the National Digital Library of India.
Linked Data for the Masses: The approach and the Software
IMC Technologies
Title: Linked Data for the Masses: The approach and the Software
@ EELLAK (GFOSS) Conference 2010
Athens, Greece
15/05/2010
Creator: George Anadiotis (R&D Director)
Presentation for the Knowledge Graph Conference 2021
Abstract: Show me your schemas, and I will show you a graph! Although graph databases have become very popular in the enterprise, deep expertise in graphs is still in short supply (see "Building an Enterprise Knowledge Graph @Uber: Lessons from Reality" from KGC 2019). Developers often think of graphs as a completely different kind of thing from the rest of their company's data, and will go to great lengths to force their data into a "graph" shape. The amount of manual effort involved in building and maintaining ETL pipelines can become a bottleneck and a maintenance burden. In fact, there is usually a rich domain data model of entities, relationships, and properties which is already implicit in the company's existing schemas, be they interface descriptions for microservices, relational schemas, or various other kinds of storage schemas. Taking advantage of these schemas, and mapping conforming data into the graph, ought to require relatively little extra work, but developers need appropriate tools. In this presentation, we will illustrate such mappings with real-world examples from Uber, as well as introducing formal techniques for schema and data migration. We will also look ahead to the emerging GQL standard as the foundation for a new generation of highly interoperable graph database tools.
Presentation for Data Day Texas, 6/13/2022
Abstract: If you have ever built an enterprise knowledge graph, you know that heterogeneity comes at a cost. The more complex the interfaces to the graph become – more domain data models, more data representation languages and data exchange formats, more programming languages in which applications and ETL code are written – the more time is spent on mappings, and the harder it becomes to keep these mappings in a consistent state. At the same time, support for heterogeneity is often what motivates us to build a graph in the first place. In a previous Data Day talk, A Graph is a Graph is a Graph, I talked about a generic approach for reconciling graph and non-graph data models. The approach was later formalized as Algebraic Property Graphs and implemented in a proprietary tool which I was ultimately not permitted to release as open source software. This time around, I would like to introduce you to a new, open-source project called Hydra which expands the scope of the problem from defining composable transformations for data and schemas, to also porting those transformations between concrete programming languages, encapsulating them in developer-friendly DSLs. Learn to love typed lambda calculi, and see how weird and wonderful things get when a transformation library starts transforming itself.
An Algebraic Data Model for Graphs and Hypergraphs (Category Theory meetup, N...
Joshua Shinavier
A presentation for the Category Theory meetup at Uber in San Francisco, November 21, 2019. A combination of previous slide shows motivating and presenting the Algebraic Property Graphs data model.
In Search of the Universal Data Model (ISWC 2019 Minute Madness)Joshua Shinavier
A one-minute version of my talk from Connected Data London about graph data models versus mental representations. For the "Minute Madness" session at ISWC 2019.
A presentation on mashing up Twitter Annotations with the Semantic Web. June 24, 2010 at the Semantic Technology Conference, San Francisco (SemTech 2010).
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
Communications Mining Series - Zero to Hero - Session 1
DianaGray10
This session provides an introduction to UiPath Communication Mining, its importance, and a platform overview. You will acquire a good understanding of the phases in Communication Mining as we go over the platform with you. Topics covered:
• Communication Mining Overview
• Why is it important?
• How can it help today’s business and the benefits
• Phases in Communication Mining
• Demo on Platform overview
• Q/A
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
A tale of scale & speed: How the US Navy is enabling software delivery from l...
sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
GridMate - End to end testing is a critical piece to ensure quality and avoid...
ThomasParaiso2
End to end testing is a critical piece to ensure quality and avoid regressions. In this session, we share our journey building an E2E testing pipeline for GridMate components (LWC and Aura) using Cypress, JSForce, FakerJS…
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Elevating Tactical DDD Patterns Through Object Calisthenics
Dorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
PHP Frameworks: I want to break free (IPC Berlin 2024)
Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
Neo4j
Neha Bajwa, Vice President of Product Marketing, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
Securing your Kubernetes cluster: a step-by-step guide to success!
KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
The state of the art in Linked Data
1. Joshua Shinavier
The state of the art in
Linked Data
Advanced Semantic Web, Spring 2009
Literature Survey
2. Outline
• Linked Data
• Linking Open Data
• describing linked datasets
• growing the data web
• keeping Linked Data connected
• indexing and searching
• applications
• navigation
• state of the data web
3. Linked Data overview
• resource -- an item of interest
• URI -- global identifier for a resource
• representation -- data corresponding to the state
of a resource
• information resource -- a “document” containing
information
• non-information resource -- anything else
• associated description -- representation describing
a Semantic Web resource
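A minimal sketch of the terms above (my own illustration, using invented example.org URIs): a resource is identified by a URI, and its associated description is a set of RDF-style subject/predicate/object triples.

```python
# Illustrative sketch (not from the slides): URIs are plain strings
# acting as global identifiers, and a representation of a resource's
# state is a set of (subject, predicate, object) triples.

PERSON = "http://example.org/id/alice"   # a non-information resource (a person)
DOC = "http://example.org/doc/alice"     # the information resource describing her

# An "associated description": triples whose subject is the resource.
description = {
    (PERSON, "http://xmlns.com/foaf/0.1/name", "Alice"),
    (PERSON, "http://xmlns.com/foaf/0.1/knows", "http://example.org/id/bob"),
}

def describe(resource, triples):
    """Return the sub-graph (representation) about a given resource URI."""
    return {t for t in triples if t[0] == resource}
```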
4. The Linking Open Data initiative
• “bootstrap” the data web with large, interconnected data sets
to reach a critical mass of semantics
• strict adherence to W3C standards
• identification and transportation (URI, HTTP) of resource
descriptions
• interpretation (RDF, RDFS, OWL) of resource descriptions
• LOD grows as data providers:
• publish structured data on the Web
• set RDF links between entities in different data sources
• transition of the web from a distributed document repository
into a universal, ubiquitous database [Erling 09]
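The two publishing steps above can be sketched as follows (a hypothetical illustration with invented URIs): each provider contributes triples, and RDF links such as owl:sameAs let a consumer merge the sources into one graph.

```python
# Hedged sketch of LOD publishing: two data sources plus an interlinking
# triple, merged by set union into the "universal database" view.

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Step 1: providers publish structured data on the Web (invented URIs).
dataset_a = {("http://dbpedia.example/Berlin", "label", "Berlin")}
dataset_b = {("http://geo.example/2950159", "population", "3400000")}

# Step 2: RDF links between entities in different data sources.
links = {("http://dbpedia.example/Berlin", OWL_SAME_AS,
          "http://geo.example/2950159")}

# A consumer can simply take the union of the graphs.
merged = dataset_a | dataset_b | links
```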
8. Describing linked datasets
• voiD (Vocabulary of Interlinked Datasets)
[Alexander, Cyganiak, Hausenblas, Zhao 09]
• describes data sets and the link sets between them
• DING (Dataset RankING) [Toupikov, Umbrich,
Delbru, Hausenblas, Tummarello 09]
• ranking of linked datasets using formal
descriptions
• modeling of the Linked Data domain [Halpin,
Presutti 09]
9. Keeping Linked Data connected
• network-shaped Entity Name System to enable
systematic reuse of URIs [Bouquet, Stoermer,
Cordioli, Tummarello 08]
• similar to DNS for interlinking hypertext
• n2Mate framework [Peterson, Cregan, Atkinson,
Brisbin 08]
• use social networking principles to facilitate
vocabulary and instance reuse
• graph-based disambiguation of Semantic Web
entities with idMesh [Cudré-Mauroux, Haghani,
Jost, Aberer, de Meer 09]
10. Managing co-reference
• many conflated resources in DBpedia [Jaffri,
Glaser, Millard 08]
• representative of LOD as a whole
• Co-Reference Resolution Service [Glaser, Jaffri,
Millard 09]
• when co-reference is context-specific,
owl:sameAs is inappropriate
• stores co-reference information as a first-class
entity
• ontology-level alignment should precede data-level
alignment [Nikolov, Uren, Motta 09]
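The idea of co-reference information as a first-class entity can be sketched as follows (my own simplification, not the actual Co-Reference Resolution Service API): equivalent URIs are grouped into named bundles, so equivalence can be scoped to a context instead of asserted globally with owl:sameAs.

```python
# Illustrative sketch: co-reference stored as named bundles of URIs.
# A bundle carries its own identity, so equivalence holds only within
# the context (application, task) the bundle represents.

class CoreferenceService:
    def __init__(self):
        self.bundles = {}  # bundle/context id -> set of co-referent URIs

    def assert_equivalent(self, bundle_id, *uris):
        """Record that these URIs are co-referent within one context."""
        self.bundles.setdefault(bundle_id, set()).update(uris)

    def coreferents(self, uri, context):
        """URIs co-referent with `uri` in the given context's bundle."""
        bundle = self.bundles.get(context, set())
        return bundle if uri in bundle else {uri}

crs = CoreferenceService()
crs.assert_equivalent("citations",
                      "http://ex.org/JohnSmith", "http://ex.org/J.Smith")
```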
11. Growing the data web
• how to get data out there?
• challenges of the read-write Semantic Web
• user awareness of social context of data (e.g.
licensing, privacy)
• view update problem
• is the wiki model applicable?
• incentives for posting data on the SW
• validating existing Linked Data with Vapour
[Berrueta, Fernandez, Frade 08]
12. Examples of LOD data sets
• DBpedia [Auer, Bizer, Kobilarov, Lehmann,
Cyganiak, Ives 07]
• extracts structured information from Wikipedia
• linking hub for the LOD cloud
• RDF Book Mashup [Bizer, Cyganiak, Gauss 07]
• product metadata from Amazon.com
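The DBpedia extraction idea can be illustrated with a toy example (the wikitext, regex, and property URIs here are simplified assumptions, not DBpedia's actual extraction framework): infobox key-value pairs become RDF triples.

```python
import re

# Toy sketch: structured key-value pairs in a Wikipedia-style infobox
# are turned into (subject, property, value) triples.
infobox = """{{Infobox settlement
| name       = Berlin
| population = 3644826
}}"""

def extract_triples(subject, wikitext):
    triples = []
    # Match "| key = value" lines of the infobox (simplified pattern).
    for key, value in re.findall(r"\|\s*(\w+)\s*=\s*(.+)", wikitext):
        triples.append((subject,
                        "http://dbpedia.org/property/" + key,
                        value.strip()))
    return triples

triples = extract_triples("http://dbpedia.org/resource/Berlin", infobox)
```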
13. Music and movies as Linked Data
• Linked Movie Database [Hassanzadeh, Consens 09]
• combines data from IMDb, Freebase, OMDB,
DBpedia, RottenTomatoes.com, Stanford Movie
Database
• interlinked music datasets [Raimond, Sutton,
Sandler 08]
• combines data from Jamendo on DBTune, BBC
John Peel sessions, SBSimilarity, Musicbrainz,
DBpedia, Geonames
• links artists, albums, tracks, personal music
collections
• generated links based on similarity of resources,
similarity of neighbors
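Similarity-based link generation, as in the music datasets above, can be sketched like this (a deliberately simple Jaccard measure over label tokens; the cited work uses richer similarity over resources and their neighbors):

```python
# Hedged sketch: propose owl:sameAs links between two datasets when the
# token sets of the resources' labels are similar enough.

def jaccard(a, b):
    a, b = set(a.lower().split()), set(b.lower().split())
    return len(a & b) / len(a | b)

def propose_links(source, target, threshold=0.5):
    links = []
    for s_uri, s_label in source:
        for t_uri, t_label in target:
            if jaccard(s_label, t_label) >= threshold:
                links.append((s_uri,
                              "http://www.w3.org/2002/07/owl#sameAs",
                              t_uri))
    return links

# Invented example records from two hypothetical artist datasets.
artists_a = [("http://ex.org/a/1", "The Beatles")]
artists_b = [("http://ex.org/b/9", "Beatles The"),
             ("http://ex.org/b/3", "The Kinks")]
```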
14. Other sources of data
• the hypertext Web itself [Li, Zhao 08]
• extraction of semantic links from hypertext links and
hierarchical relationships among Web documents
• RDF representation of the HTML DOM using SparqPlug
[Coetzee, Heath, Motta 08]
• multimedia metadata
• interlinking multimedia fragments [Hausenblas, Troncy,
Bürger, Raimond 09]
15. Other sources of data (cont.)
• eXtensible Business Reporting Language (XBRL) [Garcia, Gil
09]
• mapping data to RDF and schemas to OWL
facilitates interoperability
• large thesauri [Neubert 09]
• as interlinking hubs for professional communities
• enterprise data, e.g. technical documentation [Servant
08]
• MARC21 bibliographic records [Styles, Ayers, Shabir
08]
16. Mapping tools
• D2R Server for customizable mappings from
relational databases to ontologies [Bizer, Cyganiak
06]
• browser-based tools for defining RDB-to-RDF
mappings [Zhou, Xu, Chen, Idehen 08]
• Triplify [Auer, Dietzold, Lehmann, Hellmann,
Aumueller 09]
• from generic data silos to Linked Data using
OpenLink Data Spaces [Idehen, Erling 08]
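The D2R/Triplify idea of exposing relational rows as RDF can be sketched as follows: each row becomes a subject URI and each column a property. The table, columns, and namespaces are made-up examples, not an actual mapping configuration.

```python
# Sketch of a Triplify/D2R-style relational-to-RDF mapping:
# one row -> one subject URI, one column -> one property.
# Table name, columns and namespaces are illustrative only.

BASE = "http://example.org/"

def row_to_triples(table, pk, row):
    """Map one relational row (a dict) to N-Triples lines."""
    subject = f"<{BASE}{table}/{row[pk]}>"
    triples = []
    for column, value in row.items():
        if column == pk:
            continue  # the primary key is already encoded in the subject URI
        predicate = f"<{BASE}vocab/{table}#{column}>"
        triples.append(f'{subject} {predicate} "{value}" .')
    return triples

row = {"id": 7, "title": "Weaving the Web", "author": "Tim Berners-Lee"}
for t in row_to_triples("books", "id", row):
    print(t)
```

Real tools add datatype handling, foreign-key-to-object-property mapping, and content negotiation on the generated URIs; this shows only the core row-to-triples step.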
17. Aggregated resources
• Open Archives Initiative Protocol for Metadata
Harvesting (OAI-PMH)
• can be made Web-accessible with OAI2LOD
Server [Haslhofer, Schandl 08]
• Open Archives Initiative - Object Reuse and
Exchange (OAI-ORE) [Van de Sompel, Lagoze,
Nelson, Warner, Sanderson, Johnston 09]
• adheres to Web principles
18. User-driven Linked Data
• existing Linked Data datasets are more
appropriate for machine than human
consumption
• template-generated interlinks are of limited quality
• data from existing silos quickly becomes out of
date
• need human involvement to grow the data web
organically
19. User-driven Linked Data (cont.)
• direct modification using SPARQL/Update
• e.g. in Tabulator [Berners-Lee, Hollenbach, Lu, Presbrey,
Prud’hommeaux, Schraefel 08]
• User Contributed Interlinking [Halb, Raimond, Hausenblas]
• semantic wikis
• Loomp [Roesch, Heese 09]
• semantic annotation of content using a text editor
interface
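Direct modification can be illustrated by composing a SPARQL 1.1 Update request of the kind a Tabulator-style editor could send to a writable endpoint; the graph and resource URIs are invented, and nothing is actually sent here.

```python
# Sketch: serializing user edits as a SPARQL 1.1 INSERT DATA request.
# Graph and resource URIs are illustrative; no HTTP request is made.

def insert_data_update(graph_uri, triples):
    """Serialize (s, p, o) tuples into a SPARQL 1.1 INSERT DATA request."""
    body = "\n".join(f"    {s} {p} {o} ." for s, p, o in triples)
    return f"INSERT DATA {{\n  GRAPH <{graph_uri}> {{\n{body}\n  }}\n}}"

update = insert_data_update(
    "http://example.org/graphs/user-links",
    [("<http://example.org/me>",
      "<http://xmlns.com/foaf/0.1/knows>",
      "<http://dbpedia.org/resource/Tim_Berners-Lee>")],
)
print(update)
```

The view update problem mentioned above arises exactly here: when the browsed view is derived (e.g. from a relational mapping), it is not always clear which underlying data such an update should change.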
20. User-driven Linked Data (cont.)
• public data from existing social networks
• wrappers for Web 2.0 services [Passant 08]
• unifying personal identity across various
networks [Rowe 09]
• Semantically Interlinked Online Communities
(SIOC)
• integrating social media sites (forums, blogs,
wikis, etc.) with the data web [Bojars, Passant,
Cyganiak, Breslin 08]
• Meaning of a Tag (MOAT) ontology gives meaning
to tags on Web 2.0 sites [Passant, Laublet 08]
21. Usability and licensing
• usability (for humans) of Linked Data [Halb,
Raimond, Hausenblas 08]
• current LOD datasets are primarily for machine
consumption
• low semantic strength of current LOD link sets
• provenance information for Linked Data [Hartig
09]
• Open Data Commons license [Miller, Styles, Heath
08]
22. Indexing and searching
• W3C’s TAP semantic search [Guha, McCool 01]
• Swoogle [Ding, Finin, Joshi, Pan, Cost, Peng, Reddivari,
Doshi, Sachs 04]
• adapts PageRank concept to ontologies
• SWSE [Hogan, Harth, Umbrich, Decker 07]
• MultiCrawler [Harth, Umbrich, Decker 06]
• RDF Gateway search
• Watson document-based search
• Falcons [Cheng, Ge, Wu, Qu 08]
• textual search using class hierarchies for query restriction
• Sindice Semantic Web index [Tummarello, Delbru, Oren 07]
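Swoogle's adaptation of PageRank can be sketched on a toy graph of ontology documents linked by references; the graph, damping factor, and node names below are illustrative, not Swoogle's actual weighting scheme.

```python
# Sketch: PageRank over a toy graph of ontology documents, where an
# edge means "references/imports". Documents that many others build
# on accumulate rank. The link graph is a made-up example.

def pagerank(links, damping=0.85, iters=50):
    nodes = set(links) | {t for ts in links.values() for t in ts}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for src, targets in links.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for t in targets:
                    new[t] += share
        # documents with no outgoing links redistribute rank uniformly
        dangling = sum(rank[n] for n in nodes if not links.get(n))
        for n in nodes:
            new[n] += damping * dangling / len(nodes)
        rank = new
    return rank

links = {"foaf": [], "sioc": ["foaf"], "moat": ["foaf", "sioc"]}
ranks = pagerank(links)
assert ranks["foaf"] > ranks["sioc"] > ranks["moat"]
```

The heavily reused vocabulary ("foaf" here) ranks highest, which is the intuition behind ranking ontologies by reference rather than by text relevance alone.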
23. Link discovery
• Silk link discovery framework [Volz, Bizer, Gaedke,
Kobilarov 09]
• find relationships between entities within
different data sources
• generation of owl:sameAs links
• value of Web of Data depends on the amount and
quality of links between data sources
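Silk-style link discovery can be sketched as comparing entity labels with a string metric and emitting owl:sameAs links above a confidence threshold; the data sources here are toy dictionaries, not Silk's actual link specification language.

```python
# Sketch: threshold-based owl:sameAs link generation between two
# sources, using a simple string similarity. Data and URIs are
# illustrative; real frameworks combine many comparison operators.

from difflib import SequenceMatcher

def discover_sameas(source_a, source_b, threshold=0.9):
    """Return owl:sameAs triples for entity pairs with near-identical labels."""
    links = []
    for uri_a, label_a in source_a.items():
        for uri_b, label_b in source_b.items():
            sim = SequenceMatcher(None, label_a.lower(), label_b.lower()).ratio()
            if sim >= threshold:
                links.append(
                    f"<{uri_a}> <http://www.w3.org/2002/07/owl#sameAs> <{uri_b}> ."
                )
    return links

dbpedia = {"http://dbpedia.org/resource/Berlin": "Berlin"}
geonames = {"http://sws.geonames.org/2950159/": "Berlin",
            "http://sws.geonames.org/2950158/": "Bergen"}
links = discover_sameas(dbpedia, geonames)
print(len(links))  # → 1
```

The threshold is the quality knob: lowering it yields more links but risks exactly the conflated resources discussed under co-reference management above.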
24. Navigation
• like early Web, it’s easy to get “Lost in Hyperspace”
• Tabulator generic Linked Data browser [Berners-
Lee, Chen, Chilton, Connolly, Dhanaraj,
Hollenbach, Lerer, Sheets 06]
• encourage deployment of Linked Data
• test, refine and promote Linked Data standards
• faceted views over large-scale linked data with
Virtuoso Cluster Edition [Erling 09]
• Explorator RDF browser [Araujo, Schwabe 09]
• exploratory search using direct manipulation
25. Navigation (cont.)
• DBpedia Mobile map view and faceted Linked
Data browser [Becker, Bizer 08]
• explore the geospatial Semantic Web
• uses current GPS position as a starting point
• potential for Linked Data publishing
26. Navigation (cont.)
• Fenfire generic Linked Data browser [Hastrup,
Cyganiak, Bojars 08]
• uses graph views rather than tables or outlines
• shows graph data as directly as possible
• related to Fentwine [Fallenstein, Lukka 04]
27. Navigation (cont.)
• Humboldt [Kobilarov,
Dickinson 08]
• exploratory browsing
• faceted views
• “resource at a time”
• uses a “pivot” operation
to refocus the view
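The pivot operation can be sketched on a toy triple set: the browsing focus moves from a set of resources (films) to the set of resources related to them along one predicate (their directors). The data and predicate names are invented for illustration.

```python
# Sketch: a Humboldt-style "pivot" — refocus a faceted view from one
# resource set onto the set of related resources. Toy data only.

triples = [
    ("film:Alien", "directedBy", "person:Ridley_Scott"),
    ("film:Blade_Runner", "directedBy", "person:Ridley_Scott"),
    ("film:Brazil", "directedBy", "person:Terry_Gilliam"),
    ("film:Alien", "year", "1979"),
]

def pivot(focus, predicate):
    """Move the browsing focus along `predicate` to the related resources."""
    return {o for s, p, o in triples if s in focus and p == predicate}

films = {"film:Alien", "film:Blade_Runner"}
directors = pivot(films, "directedBy")
print(sorted(directors))  # → ['person:Ridley_Scott']
```

The new focus set can then itself be filtered by facets and pivoted again, which is what enables "a resource set at a time" exploration.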
28. Navigation (cont.)
• zLinks plugin [Bergman, Giasson 08]
• WordPress plugin with supporting server
• relates hypertext links with contextually
relevant Linked Data
• WOWY (WordNet, OpenCyc, Wikipedia, YAGO)
• distinguish between types of resources
• disambiguate alternate senses
29. Navigation (cont.)
• mapping of Linked Data to a file system model
[Schandl 09]
• enables use of this data within desktop
applications
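A file-system mapping in the spirit of Schandl's approach might derive virtual paths from resource URIs so that desktop tools can browse the data as folders; the mount point and path scheme below are assumptions for illustration, not the paper's actual scheme.

```python
# Sketch: mapping Linked Data resource URIs onto virtual file-system
# paths. The mount point and layout are illustrative assumptions.

from urllib.parse import urlparse
from pathlib import PurePosixPath

def uri_to_path(uri, mountpoint="/linkeddata"):
    """Derive a virtual file path for a resource URI (host, then path segments)."""
    parts = urlparse(uri)
    return str(PurePosixPath(mountpoint, parts.netloc,
                             *parts.path.strip("/").split("/")))

print(uri_to_path("http://dbpedia.org/resource/Berlin"))
# → /linkeddata/dbpedia.org/resource/Berlin
```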
30. Other applications
• how to use the data that is out there?
• emerging applications which exploit Linked
Data [Hausenblas 09]
• integrating data sources related to drug and
clinical trials [Jentzsch, Andersson, Hassanzadeh,
Stephens, Bizer 09]
• mashups
• MashQL [Jarrar, Dikaiakos 09]
the Internet is a database; a mashup is a
query over that database
• benefit of specialized, independent Linked Data
services acting together [Bojars, Passant, Giasson,
Breslin 07]
31. The gray area
• U-P2P framework for peer-to-peer linked data [Davoust,
Esfandiari 09]
• data replication provides a measure of popularity
• Linked Data with Named Graphs
• e.g. interlinks with embedded provenance information
[Zhao, Klyne, Shotton 08]
• Ripple scripting language [Shinavier 07]
• embeds Turing-complete programs in the Web of Data
32. State of the data web
• where are we with the Linked Data graph?
• size
• number and type of links
• usefulness to end users
• network characteristics
• single-point-of-access (e.g. DBpedia, GeoNames)
vs. distributed datasets (e.g. FOAF-o-sphere,
SIOC-land)
• syntactic and semantic analysis of the LOD
dataset [Hausenblas, Halb, Raimond, Heath 08]
33. Statistics of the data web
• today’s Linked Data is very different from the
first-generation data web [Halpin 09]
• LOD data accounts for the vast majority of data
• power-law distributions are emerging
• data web is not growing organically
• Web standards are generally adhered to
• is Linked Data useful to ordinary users?
• sampling of Linked Data using Live.com query
logs and FALCON-S semantic search engine
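Whether a degree sequence looks power-law-like can be checked informally by estimating the slope of its log-log rank/degree plot; the degree sequence below is synthetic, and real analyses use more rigorous estimators.

```python
# Sketch: estimating the log-log rank/degree slope of a link graph's
# degree sequence. A slope near -1 suggests a Zipf-like distribution.
# The degree sequence below is synthetic.

import math

def loglog_slope(degrees):
    """Least-squares slope of log(degree) vs. log(rank)."""
    ranked = sorted(degrees, reverse=True)
    xs = [math.log(r) for r in range(1, len(ranked) + 1)]
    ys = [math.log(d) for d in ranked]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# a Zipf-like sequence: degree ~ 1000 / rank
degrees = [1000 // r for r in range(1, 51)]
slope = loglog_slope(degrees)
assert -1.2 < slope < -0.8  # roughly -1, as expected for Zipf
```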
39. Graph analysis for the data web
• common network analysis techniques can be used
to investigate interoperability and structural
patterns of the LOD cloud [Rodriguez 09]
• results based on March 2009 statistics of the LOD
data set graph:
• LOD graph is not strongly connected
• diameter of 8 is large given relatively small size
of the cloud
• data sets have nearly identical incoming and
outgoing link patterns (⇒ majority of reciprocal
owl:sameAs links)
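The reported metrics (strong connectivity and diameter) can be computed with plain breadth-first search; the toy data-set-level graph below is illustrative, not the actual LOD cloud, and the diameter here is taken over reachable pairs only since the graph is not strongly connected.

```python
# Sketch: checking strong connectivity and diameter of a small
# data-set-level link graph via BFS. Nodes and edges are made up;
# the edge "a" -> "b" means data set a links to data set b.

from collections import deque

edges = {"dbpedia": {"geonames", "foaf"},
         "geonames": {"dbpedia"},
         "foaf": {"dbpedia"},
         "musicbrainz": {"dbpedia"}}  # nothing links back to musicbrainz

def bfs_dists(start):
    dist, queue = {start: 0}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

nodes = set(edges) | {t for ts in edges.values() for t in ts}
# strongly connected iff every node reaches every other node
strongly_connected = all(len(bfs_dists(n)) == len(nodes) for n in nodes)
# longest shortest path among reachable pairs
diameter = max(max(bfs_dists(n).values()) for n in nodes)
print(strongly_connected, diameter)  # → False 2
```

Here the one-way link into "musicbrainz" is what breaks strong connectivity, mirroring the finding that the LOD graph as a whole is not strongly connected.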