This document discusses interaction with linked data, focusing on visualization techniques. It begins with an overview of the linked data visualization process, including extracting data analytically, applying visualization transformations, and generating views. It then covers challenges like scalability, handling heterogeneous data, and enabling user interaction. Various visualization techniques are classified and examples are provided, including bar charts, graphs, timelines, and maps. Finally, linked data visualization tools and examples using tools like Sigma, Sindice, and Information Workbench are described.
This presentation covers the whole spectrum of Linked Data production and exposure. After a grounding in the Linked Data principles and best practices, with special emphasis on the VoID vocabulary, we cover R2RML, operating on relational databases, Open Refine, operating on spreadsheets, and GATECloud, operating on natural language. Finally we describe the means to increase interlinkage between datasets, especially the use of tools like Silk.
Big Linked Data - Creating Training Curricula (EUCLID project)
This presentation includes an overview of the basic rules to follow when developing training and education curricula for Linked Data and Big Linked Data.
This presentation gives details on technologies and approaches towards exploiting Linked Data by building LD applications. In particular, it gives an overview of popular existing applications and introduces the main technologies that support implementation and development. Furthermore, it illustrates how data exposed through common Web APIs can be integrated with Linked Data in order to create mashups.
This presentation addresses the main issues of Linked Data and scalability. In particular, it gives details on approaches and technologies for clustering, distributing, sharing, and caching data. Furthermore, it addresses the means for publishing data through cloud deployment and the relationship between Big Data and Linked Data, exploring how some of those solutions can be transferred to the context of Linked Data.
This presentation looks in detail at SPARQL (SPARQL Protocol and RDF Query Language) and introduces approaches for querying and updating semantic data. It covers the SPARQL algebra and the SPARQL protocol, and provides examples of reasoning over Linked Data. We use examples from the music domain, which can be directly tried out and run over the MusicBrainz dataset. This includes gaining some familiarity with the RDFS and OWL languages, which allow developers to formulate generic and conceptual knowledge that can be exploited by automatic reasoning services in order to enhance the power of querying.
Usage of Linked Data: Introduction and Application Scenarios (EUCLID project)
This presentation introduces the main principles of Linked Data, the underlying technologies and background standards. It provides basic knowledge for how data can be published over the Web, how it can be queried, and what are the possible use cases and benefits. As an example, we use the development of a music portal (based on the MusicBrainz dataset), which facilitates access to a wide range of information and multimedia resources relating to music.
Existing data management approaches assume control over schema, data and data generation, which is not the case in open, de-centralised environments such as the Web. The lack of control means that there are social processes necessary to generate 'ordo ab chao' and hence a new life cycle model is necessary.
Based on our experience with Linked Data publishing and consumption over the past years, we have identified the involved parties and fundamental phases, which give rise to a multitude of so-called Linked Data life cycles.
If you want to hear me speak to the slides, you might want to check out the following videos on YouTube:
Part 1: http://www.youtube.com/watch?v=AFJSMKv5s3s
Part 2: http://www.youtube.com/watch?v=G6YJSZdXOsc
Part 3: http://www.youtube.com/watch?v=OagzNpDEPJg
Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage (Ontotext)
Scholars, book researchers, and museum directors who try to find the underlying connections between resources face many issues. Scholars in particular continuously emphasize the role of the digital humanities and the value of linked data in cultural heritage information systems.
First Steps in Semantic Data Modelling and Search & Analytics in the Cloud (Ontotext)
This webinar will break the roadblocks that prevent many from reaping the benefits of heavyweight Semantic Technology in small scale projects. We will show you how to build Semantic Search & Analytics proof of concepts by using managed services in the Cloud.
Alphabet soup: CDM, VRA, CCO, METS, MODS, RDF - Why Metadata Matters (New York University)
This presentation, given to the University of Iowa Libraries on Nov. 17, 2014, discusses 1) the alphabet soup of metadata standards, e.g. CDM, VRA, CCO, METS, MODS, RDF, including sample tagging and their applications for digital libraries, and 2) why metadata matters. It does not address metadata issues and tools for metadata creation, extraction, transformation, quality control, syndication, and ingest.
Introduction to the Data Web, DBpedia and the Life-cycle of Linked Data (Sören Auer)
Over the past 4 years, the Semantic Web activity has gained momentum with the widespread publishing of structured data as RDF. The Linked Data paradigm has therefore evolved from a practical research idea into a very promising candidate for addressing one of the biggest challenges of computer science: the exploitation of the Web as a platform for data and information integration. To translate this initial success into a world-scale reality, a number of research challenges need to be addressed: the performance gap between relational and RDF data management has to be closed, coherence and quality of data published on the Web have to be improved, provenance and trust on the Linked Data Web must be established, and generally the entrance barrier for data publishers and users has to be lowered. This tutorial will discuss approaches for tackling these challenges. As an example of a successful Linked Data project we will present DBpedia, which leverages Wikipedia by extracting structured information and making this information freely accessible on the Web. The tutorial will also outline some recent advances in DBpedia, such as the mappings Wiki, DBpedia Live, as well as the recently launched DBpedia benchmark.
NISO Webinar:
Experimenting with BIBFRAME: Reports from Early Adopters
About the Webinar
In May 2011, the Library of Congress officially launched a new modeling initiative, the Bibliographic Framework Initiative, as a linked data alternative to MARC. The Library then announced the proposed model, called BIBFRAME, in November 2012. Since then, the library world has been moving from mainly theorizing about the BIBFRAME model to practical experimentation and testing. This experimentation is iterative, and continues to shape the model so that it is stable enough, and broadly acceptable enough, for adoption.
In this webinar, several institutions will share their progress in experimenting with BIBFRAME within their library system. They will discuss the existing, developing, and planned projects happening at their institutions. Challenges and opportunities in exploring and implementing BIBFRAME in their institutions will be discussed as well.
Agenda
Introduction
Todd Carpenter, Executive Director, NISO
Experimental Mode: The National Library of Medicine and experiences with BIBFRAME
Nancy Fallgren, Metadata Specialist Librarian, National Library of Medicine, National Institutes of Health, US Department of Health and Human Services (DHHS)
Exploring BIBFRAME at a Small Academic Library
Jeremy Nelson, Metadata and Systems Librarian, Colorado College
Working with BIBFRAME for discovery and production: Linked data for Libraries/Linked Data for Production
Nancy Lorimer, Head, Metadata Dept, Stanford University Libraries
The slide set used for an introduction/tutorial on DBpedia use cases, concepts and implementation aspects, held during the DBpedia community meeting in Dublin on 9 February 2015. (Slide creators: M. Ackermann, M. Freudenberg; additional presenter: Ali Ismayilov)
Linked Data for the Masses: The approach and the Software (IMC Technologies)
Title: Linked Data for the Masses: The approach and the Software
@ EELLAK (GFOSS) Conference 2010
Athens, Greece
15/05/2010
Creator: George Anadiotis (R&D Director)
Linking Open, Big Data Using Semantic Web Technologies - An Introduction (Ronald Ashri)
The Physics Department of the University of Cagliari and the Linkalab Group invited me to talk about the Semantic Web and Linked Data - this is simply an introduction to the technologies involved.
Managing a company of geeks is quite particular. These employees of a special kind want new technologies, time to experiment, knowledge sharing, pair programming, a say in company strategy, and real freedom of speech.
Speaker: Luc Legardeur, President of Xebia, at Devoxx France 2015
Tyler Baldwin, Yunyao Li, Bogdan Alexe, Ioana Roxana Stanoi: Automatic Term Ambiguity Detection. ACL (2) 2013: 804-809
Abstract:
While the resolution of term ambiguity is important for information extraction (IE) systems, the cost of resolving each instance of an entity can be prohibitively expensive on large datasets. To combat this, this work looks at ambiguity detection at the term, rather than the instance, level. By making a judgment about the general ambiguity of a term, a system is able to handle ambiguous and unambiguous cases differently, improving throughput and quality. To address the term ambiguity detection problem, we employ a model that combines data from language models, ontologies, and topic modeling. Results over a dataset of entities from four product domains show that the proposed approach achieves an F-measure of 0.96, significantly above the baseline.
It is quite often observed that when people use retrieval systems, they are not just searching for documents or text passages, but for some information contained inside them, related to entities such as persons, organizations, locations, events, times, etc. The goal is to find various kinds of valuable semantic information about real-world entities embedded in different web pages and databases. But it is difficult to find specific or exact information about entities with present search engines. So we need search engines that will interpret our queries across different domains and extract structured information about entities.
A Comparison of NER Tools w.r.t. a Domain-Specific Vocabulary (Timm Heuss)
Presentation held at SEMANTiCS 2014, accompanying this paper: http://doi.acm.org/10.1145/2660517.2660520
In this paper we compare several state-of-the-art Linked Data knowledge extraction tools with regard to their ability to recognise entities from a controlled, domain-specific vocabulary. This includes tools that offer APIs as a service, locally installed platforms, as well as a UIMA-based approach as a reference. We evaluate under realistic conditions, with natural language source texts from keywording experts of the Städel Museum Frankfurt. The goal is to find first hints as to which tool approach or strategy is more convincing for domain-specific tagging/annotation, towards a working solution of the kind demanded by GLAMs world-wide.
Roy Tennant, Senior Program Officer, OCLC Research
As library collections shift from print materials to digital formats, and as the web enables ubiquitous and instantaneous discovery of information, library users expect to find and access materials online. It’s not enough to have pages “on the web”; library data must be “woven into the web” and integrated into the sites and services that library users frequent daily – Google, Wikipedia, social networks. When information about a library’s collection is locked up behind a specific web site (such as an OPAC), it is often exceedingly difficult for services, such as search engines, to consume that data. Information seekers need to be connected back to their local library resources from wherever they are on the web. The imperative is to make library data available in new data formats that are native to the web, exposing it to the wider web community, making it easily discoverable by other sites, services, and ultimately consumers. Roy Tennant will shed light on what linked data is and how to re-envision, expose and share library data as entities that are part of the web.
(http://lod2.eu/BlogPost/webinar-series) In this webinar Michael Martin presents CubeViz, a faceted browser for statistical data utilizing the RDF Data Cube vocabulary, the state of the art in representing statistical data in RDF. This vocabulary is compatible with SDMX and is increasingly being adopted. Based on the vocabulary and the encoded Data Cube, CubeViz generates a faceted browsing widget that can be used to interactively filter the observations to be visualized in charts. Based on the selected structure, CubeViz offers suitable chart types and options that can be selected by users.
If you are interested in Linked (Open) Data principles and mechanisms, LOD tools & services and concrete use cases that can be realised using LOD then join us in the free LOD2 webinar series!
The process and steps followed in creating a successful visualization, illustrated with Encyclopaedia of Life data and a Tableau visualization prototype.
Presentation of the European project ECHOES, given on 28 June 2018 in Leiden (Netherlands), where CSUC showed the project's objectives and main characteristics to Dutch technology companies.
SEMANCO - Integrating multiple data sources, domains and tools in urban ener... (Álvaro Sicilia)
Semantic interoperability based on ontologies provides an alternative to centralized standard data models. It helps to integrate heterogeneous data produced by loosely coupled information systems and to interlink these data with different tools in ad hoc situations. In the SEMANCO project (www.semanco-project.eu) we have used semantic technologies to create energy models of urban areas encompassing a variety of data sources and domains (building, geospatial, energy, climate, socioeconomic). The semantically modelled data has been made accessible to a set of simulation and analysis tools. The interoperability among the data sources, and between these and the tools that interact with them, is assured by a Semantic Energy Information Framework (SEIF) developed in the project. Access to the data and tools takes place in the SEMANCO integrated platform. In this paper we describe the work carried out to integrate an existing simulation software, URSOS, with the semantic data model. The functionalities of the tool and the integrated platform have been demonstrated in an application case carried out in the city of Manresa, Spain.
December 2013 webinar from the EUCLID project on managing large volumes of Linked Data
webinar recording at https://vimeo.com/84126769 and https://vimeo.com/84126770
more info on EUCLID: http://euclid-project.eu/
Presented in : JIST2015, Yichang, China
Prototype: http://rc.lodac.nii.ac.jp/rdf4u/
Video: https://www.youtube.com/watch?v=z3roA9-Cp8g
Abstract: It is known that the Semantic Web and Linked Open Data (LOD) are powerful technologies for knowledge management, and explicit knowledge is expected to be represented in the RDF format (Resource Description Framework), but ordinary users stay away from RDF due to the technical skills required. Since a concept map or node-link diagram can enhance the learning ability of learners from beginner to advanced level, RDF graph visualization can be a suitable tool for making users familiar with semantic technology. However, an RDF graph generated from a whole query result is not suitable for reading, because it is highly connected, like a hairball, and poorly organized. To make a graph presenting knowledge more readable, this research introduces an approach to sparsify a graph using the combination of three main functions: graph simplification, triple ranking, and property selection. These functions are mostly based on the interpretation of RDF data as knowledge units, together with statistical analysis, in order to deliver an easily readable graph to users. A prototype is implemented to demonstrate the suitability and feasibility of the approach. It shows that the simple and flexible graph visualization is easy to read and makes a good impression on users. In addition, the attractive tool helps to inspire users to realize the advantageous role of linked data in knowledge management.
RDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge (Rathachai Chawuthai)
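The triple-ranking idea described in the RDF4U abstract can be sketched in a few lines of plain Python. This is an illustrative heuristic of my own, not the paper's actual algorithm: triples whose predicate is rarely used are treated as more informative and ranked first, while very common predicates (like rdf:type) sink to the bottom.

```python
from collections import Counter

# Toy triples as (subject, predicate, object) tuples.
triples = [
    ("ex:Alice", "rdf:type", "ex:Person"),
    ("ex:Bob",   "rdf:type", "ex:Person"),
    ("ex:Carol", "rdf:type", "ex:Person"),
    ("ex:Alice", "ex:discovered", "ex:Radium"),
]

# Count how often each predicate occurs; rarer predicates are
# considered more informative and ranked higher.
freq = Counter(p for _, p, _ in triples)
ranked = sorted(triples, key=lambda t: freq[t[1]])

print(ranked[0])  # ('ex:Alice', 'ex:discovered', 'ex:Radium')
```

A visualizer could then keep only the top-ranked triples, sparsifying the "hairball" graph the abstract describes while preserving the most distinctive statements.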
Generative AI Deep Dive: Advancing from Proof of Concept to Production (Aggregage)
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
GraphRAG is All You need? LLM & Knowledge Graph (Guy Korland)
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Securing your Kubernetes cluster: a step-by-step guide to success! (KatiaHIMEUR1)
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Communications Mining Series - Zero to Hero - Session 1 (DianaGray10)
This session provides an introduction to UiPath Communication Mining, its importance, and a platform overview. You will acquire a good understanding of the phases in Communication Mining as we go over the platform with you. Topics covered:
• Communication Mining Overview
• Why is it important?
• How can it help today’s business and the benefits
• Phases in Communication Mining
• Demo on Platform overview
• Q/A
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf (Peter Spielvogel)
Building better applications for business users with SAP Fiori.
• What is SAP Fiori and why it matters to you
• How a better user experience drives measurable business benefits
• How to get started with SAP Fiori today
• How SAP Fiori elements accelerates application development
• How SAP Build Code includes SAP Fiori tools and other generative artificial intelligence capabilities
• How SAP Fiori paves the way for using AI in SAP apps
Transcript: Selling digital books in 2024: Insights from industry leaders - T... (BookNet Canada)
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ... (James Anderson)
3. Motivation: Music! (2)
EUCLID – Interaction with Linked Data 3
• Our aim: build a music-based portal using Linked Data technologies
• So far, we have studied different mechanisms to consume Linked Data:
  • Executing SPARQL queries
  • Dereferencing URIs
  • Downloading RDF dumps
  • Extracting RDFa data
• The output of these mechanisms corresponds to data in machine-readable formats
5. Motivation: Music! (4)
Visualization techniques are needed in order to transform the machine-readable data into this:
Source: http://musicbrainz.fluidops.net/
6. Motivation: Music! (5)
In addition, visualization techniques allow for:
• Telling a story
• Engaging our pattern-matching brain
• Identifying data characteristics which cannot be directly inferred from statistical properties:
  • Anscombe's quartet: four datasets that look very different when plotted, yet share the same summary statistics.
Image: http://en.wikipedia.org/wiki/Anscombe's_quartet
Source: Donaldson, I. and Lamere, P. Using Visualizations for Music Discovery
Image: Chan, W., Qu, H., Mak, W. Visualizing the Semantic Structure in Classical Musical Works.
7. Agenda
1. Linked Data visualization
2. Linked Data search
3. Methods for Linked Data analysis
9. LD Visualization Techniques
• Linked Data visualization techniques should provide graphical representations of the information within the LD datasets
• Visualization techniques should be selected according to:
  – The type of data: specific types of data should be visualized in a certain way
  – The purpose of the visualization: depending on the type of analysis/application to employ
10. LD Visualization Techniques (2)
Overview of the Linked Data Visualization process
• (Raw) RDF data: instance data, taxonomies, ontologies, vocabularies.
• Analytically extracted data: a subset of the data called the region of interest (ROI), obtained via data extraction mechanisms, for example, SPARQL queries.
• Visualization abstraction: obtained by applying visualization transformations to render the data into displayable information.
• View: the final result. Visual mapping transformations produce a graphical representation of the data using the selected visualization technique.
• User interaction (optional): the user interacts (click, zoom, etc.) with the visualization, which may trigger a new visualization process.
[Pipeline: RDF data → (data extraction) → analytically extracted data → (visualization transformation) → visualization abstraction → (visual mapping transformation) → view]
Process partially based on: Brunetti, J.M.; Auer, S.; García, R. The Linked Data Visualization Model.
11. LD Visualization Techniques (3)
Example of the Linked Data Visualization process
Data extraction — SPARQL query: retrieve the number of releases per country of The Beatles:

  SELECT ?country (COUNT(?release) AS ?releases)
  WHERE {
    <http://dbpedia.org/resource/The_Beatles> foaf:made ?release .
    ?release a mo:Release ;
             mo:label ?label .
    ?label foaf:based_near ?country .
  }
  GROUP BY ?country
  ORDER BY DESC(?releases)

Analytically extracted data:

  country          releases
  United Kingdom   225
  United States    140
  Germany          30
  Luxembourg       29
  …

Visualization transformation — formatting the names of the countries:

  ?country_code2 := REPLACE(str(?country), "http://ontologi.es/place/", "", "i")
  ?country_code  := REPLACE(?country_code2, "%", "", "i")

  country_code   releases
  GB             225
  US             140
  DE             30
  LU             29
  …

Visual mapping transformation — selecting the visualization technique (input, output):

  #widget : HeatMap |
  input = 'country_code' |
  output = {{ 'releases' }}

View: the rendered heat map. The visualization transformation and the visual mapping transformation can be performed in a single step.
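The three stages of this example can be sketched in plain Python; this is a minimal illustration, not part of any tool's API — the endpoint call and the actual widget rendering are elided, and the function names `to_country_code` and `to_widget` are made up for the sketch:

```python
import re

# Stage 1 output (data extraction): rows as returned by the SPARQL
# query above (illustrative country URIs).
extracted = [
    {"country": "http://ontologi.es/place/GB%", "releases": 225},
    {"country": "http://ontologi.es/place/US%", "releases": 140},
    {"country": "http://ontologi.es/place/DE%", "releases": 30},
    {"country": "http://ontologi.es/place/LU%", "releases": 29},
]

def to_country_code(uri):
    """Visualization transformation: strip the namespace and the
    trailing '%', mirroring the two REPLACE expressions above."""
    code = re.sub(r"http://ontologi\.es/place/", "", uri, flags=re.I)
    return re.sub(r"%", "", code, flags=re.I)

def to_widget(rows):
    """Visual mapping transformation: bind the prepared columns to a
    (hypothetical) heat-map widget configuration."""
    return {
        "widget": "HeatMap",
        "input": [to_country_code(r["country"]) for r in rows],
        "output": [r["releases"] for r in rows],
    }

view = to_widget(extracted)
```

As the slide notes, the last two stages can be collapsed into one step: `to_widget` already calls `to_country_code` while building the widget configuration.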
13. Challenges for Linked Data Visualization
• Enabling user interaction
  – Users must be able to navigate through the data by exploiting the connections between Linked Data resources
  – The user might edit the underlying data to enrich it by:
    • Creating additional metadata
    • Highlighting or correcting errors
    • Validating data
• Supporting data reusability
  – The output (the plotted data or the visualization itself) might be encoded using standard ontologies and vocabularies
• Scalability
  – Linked Data visualization techniques should support the display of large amounts of data in an efficient way
14. Challenges for Linked Open Data Visualization
• Extracting data from different repositories
  – A Linked Data set might be partitioned into several repositories
  – The region of interest (ROI) might include data from different data sets, requiring access to distributed repositories
• Handling heterogeneous data
  – The same data (concepts) might be modeled differently, for example, using different vocabularies
  – Certain values might have different formats, for example, dates represented as DD-MM-YYYY, MM-DD-YYYY or just YYYY
• Dealing with missing values
  – Due to the semi-structured nature of Linked Data, some instances might have missing values for certain properties
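The last two challenges can be made concrete with a small sketch. Assuming the three date formats mentioned above, a pragmatic normalizer reduces them to the granularity they all share (the year) — note that DD-MM-YYYY and MM-DD-YYYY cannot be told apart in general, which is exactly why heterogeneity is a challenge:

```python
import re

def normalize_year(value):
    """Reduce DD-MM-YYYY, MM-DD-YYYY and plain YYYY to the year.
    The day/month order is ambiguous, but the year position is the
    same in both formats, so the year is safely recoverable.
    Returns None for missing values, which Linked Data instances
    may legitimately lack."""
    if value is None:
        return None
    value = value.strip()
    if re.fullmatch(r"\d{4}", value):
        return int(value)
    m = re.fullmatch(r"\d{2}-\d{2}-(\d{4})", value)
    if m:
        return int(m.group(1))
    return None  # unrecognized format: treat as missing
```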
15. Classification of Visualization Techniques
• Comparison of attributes / values: bar/column and pie charts, line charts, histograms
• Analysis of relationships and hierarchies: graphs, arc diagrams, matrices, node-link visualizations, space-filling techniques (treemaps, icicles and sunburst, circle packing, rose diagrams)
• Analysis of temporal or geographical events: timelines, maps
• Analysis of multi-dimensional data: parallel coordinates, radar/star charts, scatter plots
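An application that recommends chart types can keep this classification as a simple lookup structure; the task keywords below are illustrative shorthand for the four rows, not an established taxonomy:

```python
# The task-to-technique classification as a lookup table.
TECHNIQUES = {
    "comparison": ["bar/column chart", "pie chart", "line chart",
                   "histogram"],
    "relationships": ["graph", "arc diagram", "matrix",
                      "node-link visualization", "treemap", "icicle",
                      "sunburst", "circle packing", "rose diagram"],
    "spatio-temporal": ["timeline", "map"],
    "multi-dimensional": ["parallel coordinates", "radar/star chart",
                          "scatter plot"],
}

def suggest(task):
    """Return candidate visualization techniques for a task keyword."""
    return TECHNIQUES.get(task, [])
```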
16. Comparison of Attributes / Values
Bar/column chart: allows the comparison of values of different categories.
Pie chart: useful for performing comparison of percentages or proportions.
Line chart: allows visualizing data as a series of data points, where the measurement points (x-axis) are ordered.
Histogram: graphical representation of the distribution of the data.
Image sources: http://mbostock.github.io/protovis/, http://musicbrainz.fluidops.net
17. Analysis of Relationships and Hierarchies
Graph: the data entries are represented as nodes and the links as edges.
Arc diagram: the nodes are displayed in one dimension, and the arcs represent the connections.
Adjacency matrix diagram: the nodes are displayed as rows and columns, and the links between the nodes are entries in the matrix.
Node-link visualizations: the data is organized in hierarchies.
Source of images: http://mbostock.github.io/protovis/
18. Analysis of Relationships and Hierarchies (2)
Space-filling techniques:
Treemaps: subdivide the area into rectangles.
Icicles and sunburst: hierarchies are represented by adjacencies.
Circle packing: containment is used to represent the hierarchies.
Rose diagrams: the sectors span equal angles, and the data is represented by the radial extent of each area.
Source of images: http://mbostock.github.io/protovis/
19. Analysis of Temporal or Geographical Events
Timeline: shows discrete data points in time or continuous data in time.
Maps:
• Location maps: display geo-points on a map
• Choropleth maps: aggregate data by geographical area
• Dorling cartograms: aggregate data and replace each area with a circle
Image sources: http://mbostock.github.io/protovis/, http://www.kottke.org/08/08/2008-movie-box-office-chart, http://musicbrainz.fluidops.net, Google Maps API
20. Analysis of Multidimensional Data
Scatter plot: displays the values of two variables as a collection of points, revealing correlations between them.
Radar/star chart: displays multivariate data as a two-dimensional chart. The axes correspond to the variables.
Parallel coordinates: allow visualizing high-dimensional data. Each vertical axis denotes a dimension, and a multidimensional point is represented as a polyline with vertices on the axes.
Source of images: http://mbostock.github.io/protovis/
21. Other Visualization Techniques
• Text-based visualizations: tag clouds
• Some of the previously presented techniques can be combined to produce more complex data visualizations
Examples: Phrase Net of Beatles lyrics (Source: http://many-eyes.com); DBpedia music genres (Source: http://www.wordle.net)
22. Applications of Linked Data Visualization Techniques
• Getting an overview of the data
• Identifying relevant resources, classes or properties in datasets
• Learning about certain underlying characteristics of the data, e.g., vocabularies or ontologies
• Detecting missing links between nodes in an RDF graph
• Discovering new paths between nodes in an RDF graph
• Identifying hidden patterns in the data
• Finding errors or atypical values (outliers)
23. Linked Data Visualization Tool Requirements
The requirements for visualization tools that consume Linked Data can be summarized as follows:
• Data navigation and exploration capabilities in order to understand the structure and the content
• Exploiting data structures:
  • Links, to visualize hierarchies or graphs
  • Multi-dimensional data
• User interaction:
  • Basic and advanced querying
  • Filtering values
  • Interactive UI: responsive to the user input
• Publication/syndication of the graphical representation of the data
• Data extraction, in order to export the data such that it can be reused by third parties
24. Linked Data Visualization Tool Types
1. LD browsers with text-based representation
  • Dereference URIs to retrieve the resource description
  • Use a textual representation of LD resources
  • Display texts and images adequately
  • Mainly support exploratory browsing and knowledge discovery
2. LD and RDF browsers with visualization options
  • Exploit pictures, graphics, images and other visual representations of the data
  • Support user interaction: allow querying, filtering and jumping between resources
  • Suitable for browsing and knowledge discovery as well as analytic activities
25. Linked Data Visualization Tool Types (2)
3. Visualization toolkits
  • Frameworks providing a wide range of visualization techniques
  • General toolkits support LD visualization by applying a set of transformations to the data
  • Some toolkits are specially designed to consume LD
4. SPARQL visualization
  • These tools allow transforming the output of SPARQL queries into graphics
  • Contact SPARQL endpoints in order to evaluate the query
  • Suitable for analytical activities
26. Linked Data Visualization Tool Types (3)
• LD browsers with text-based presentations: Sig.ma, Sindice, OpenLink RDF Browser, Marbles, Disco Hyperdata Browser, Piggy Bank (SIMILE), Zitgist DataViewer, iLOD, URI Burner, Dipper – Talis Platform Browser
• LD and RDF browsers with visualization options: Tabulator, IsaViz, OpenLink Data Explorer, RDF Gravity, RelFinder, DBpedia Mobile, LESS, SIMILE Exhibit, Haystack, FoaF Explorer, Humboldt, LENA, Noadster
• Visualization toolkits
  – Linked Data tools: Information Workbench, Visual RDF (by Graves), LOD Live, LOD Visualization
  – General data tools: Data-Driven Documents (D3), NetworkX, Many Eyes, Tableau, Prefuse
• SPARQL visualization: Information Workbench, Google Visualization API, SPARQL package for R, Gruff (for AllegroGraph)
27. Linked Data Visualization Examples (1)
Sig.ma (Source: http://sig.ma/search?q=The+Beatles)
• Keyword search
• Retrieves information from different LD sources
• Displays values per predicate
• Displays the source for each value
28. Linked Data Visualization Examples (2)
Sig.ma (Source: http://sig.ma/search?q=The+Beatles)
• Displays values per predicate: may include (redundant) information in different languages, for example "années" and "anno"
• URIs are clickable, allowing navigation through RDF resources
Summary:
• Sig.ma lists all the triples and groups them per predicate
• Useful for browsing predicates and values within data sets
• The meaning of the values is not evident
29. Linked Data Visualization Examples (3)
Sindice (Source: http://sindice.com/search?q=The+Beatles)
• Keyword search
• Filtering per type of document
• Retrieves links to documents
• Allows accessing cached documents
• Allows inspecting resources
30. Linked Data Visualization Examples (4)
Sindice
• Both interfaces (cache triples and live triples) display the set of triples related to the inspected resource
31. Linked Data Visualization Examples (5)
Information Workbench
• Demo available at: http://musicbrainz.fluidops.net
• Displays human-readable content about Linked Data resources
• Supports visualization techniques (different types of charts, maps, timelines, etc.) to plot results from SPARQL queries
• Allows the user to interact with the displayed data
35. Linked Data Visualization Examples (9)
Information Workbench: user interaction
LD visualizations must support navigation through the data
Source: http://musicbrainz.fluidops.net/resource/Analytical5
36. Linked Data Visualization Examples (9)
Information Workbench: SPARQL visualization
Implements widgets which allow:
• Retrieving the ROI via SPARQL queries
• Selecting the appropriate visualization technique
• Configuring parameters of the visualization
37. Linked Data Visualization Examples (10)
Information Workbench: SPARQL visualization
SPARQL query — top ten The Beatles releases according to the sum of track durations in minutes:

  SELECT ?release ((SUM(xsd:double(?duration/60000))) AS ?avg)
  WHERE {
    <http://dbpedia.org/resource/The_Beatles> foaf:made ?release .
    ?release mo:record ?record .
    ?record mo:track ?track .
    ?track mo:duration ?duration .
  }
  GROUP BY ?release
  ORDER BY DESC(?avg)
  LIMIT 10

Result set: the top ten releases and their total track durations.
38. Linked Data Visualization Examples (11)
Information Workbench: SPARQL visualization
Widget — visualization: bar chart

  {{#widget: BarChart |
  query = 'SELECT (COUNT(?Release) AS ?COUNT) ?label
           WHERE {
             <http://musicbrainz.org/artist/8538e728-ca0b-4321-b7e5-cff6565dd4c0#_> foaf:made ?Release .
             ?Release rdf:type mo:Release .
             ?Release dc:title ?label .}
           GROUP BY ?label
           ORDER BY DESC(?COUNT)
           LIMIT 20'
  | settings = 'Settings:barvertical_mb'
  | asynch = 'true'
  | input = 'label'
  | output = 'COUNT'
  | height = '300'}}
39. Linked Data Visualization Examples (12)
Information Workbench: SPARQL visualization
Other visualizations of the same result set (top ten The Beatles releases by total track duration): line chart, pie chart.
40. Linked Data Visualization Examples (13)
Information Workbench: automated widget suggestion
• Suggested visualizations for a result set: (1) bar chart, (2) line chart, (3) pie chart, table, pivot view
• Selecting a suggested visualization builds it automatically
41. Linked Data Visualization Examples (14)
Other tools
LOD Live (Source: http://en.lodlive.it)
• Graph visualizations
• Interactive UI (the graph can be expanded by clicking on the nodes)
• Live access to SPARQL endpoints
LOD Visualization (Source: http://lodvisualization.appspot.com)
• Hierarchy visualizations: treemaps and trees
• Live access to SPARQL endpoints (supporting JSON and SPARQL 1.1)
42. Linking Open Data Cloud Visualization (1)
"The Linking Open Data cloud diagram" by Richard Cyganiak and Anja Jentzsch (Source: http://lod-cloud.net)
• The nodes correspond to Linked Data sets
• The edges represent connections between Linked Data sets
• The size of the nodes is proportional to the number of triples in each data set
• The datasets are categorized by knowledge domains, represented with colors
43. Linking Open Data Cloud Visualization (2)
"Linked Open Data Cloud" generated with Gephi (Image source: http://twitpic.com/17qj1h)
• The central cluster (green) displays DBpedia as a central focus
• The size of the nodes reflects the size of the datasets
• The length of the connections encodes information about the data structure
Source: A. Dadzie and M. Rowe. Approaches to Visualizing Linked Data: A Survey. 2011
44. Linking Open Data Cloud Visualization (3)
"Linked Open Data Graph" by Protovis (Source: http://inkdroid.org/lod-graph/)
• The data to be displayed are retrieved using the CKAN API
• The nodes represent Linked Data sets available in the Data Hub "lod-cloud" group
• The size of the nodes is proportional to the data set size
• Edges are connections between data sets
• The colors reflect the CKAN rating, and the intensity of the color reflects the number of received ratings
• The nodes can be clicked to go to the data set's CKAN page
45. LD Reporting
• Visualization techniques are used in the creation of reports included in data monitoring and management solutions
• Provides an overview of the dataset by generating a low-level descriptive analysis: quantitative information about the dataset
• Users may interact with the data via dashboards
• Some systems support this feature over structured data:
  • Google Webmaster Tools (https://www.google.com/webmasters/tools)
  • Information Workbench (http://www.fluidops.com/information-workbench)
  • eCloudManager (http://www.fluidops.com/ecloudmanager)
46. Google Webmaster Tools: Structured Data Dashboard (1)
• Provides webmasters with information about the structured data embedded in their websites (and recognized by Google)
• The dashboard provides three levels:
  i. Site-level view: aggregates the data by classes defined in the vocabulary schema
  ii. Item-type-level view: provides details per page for each type of resource
  iii. Page-level view: shows the attributes of every type of resource on a given web page
47. Google Webmaster Tools: Structured Data Dashboard (2)
Site-level view
Source: http://googlewebmastercentral.blogspot.de/2012/07/introducing-structured-data-dashboard.html
48. Google Webmaster Tools: Structured Data Dashboard (3)
Site-level view and page-level view
Source: http://googlewebmastercentral.blogspot.de/2012/07/introducing-structured-data-dashboard.html
50. Semantic Search Process
Using semantic models for the search process:
• The user submits a query (e.g., keywords or natural language)
• The system performs query analysis: entity extraction / semantic query analysis
• Graph matching evaluates the resulting query against the data graphs
• The results are presented and ranked (result visualization/presentation)
• Optionally, the user analyzes the presentation and refines the query, and the process repeats
The diagram distinguishes the parts of the process addressed by faceted search from those addressed by semantic search.
Image based on: Tran, T., Herzig, D., Ladwig, G. SemSearchPro – Using semantics through the search process
51. Semantic Search: Example (1)
User query (NL): "songs written by members of the beatles"
Entity extraction: song | written by | member (of) | (the) beatles
Query expansion: song → synonyms: track, melody, tune
Entity mapping: candidate: mo:Track
Image source: http://musicontology.com
52. Semantic Search: Example (2)
User query (NL): "songs written by members of the beatles"
Entity extraction: song | written by | member (of) | (the) beatles
Query expansion: written by → synonyms: writer, composer, creator
Entity mapping: candidate: mo:composer (inverse of "written by")
Image source: http://musicontology.com
53. Semantic Search: Example (3)
User query (NL): "songs written by members of the beatles"
Entity extraction: song | written by | member (of) | (the) beatles
Query expansion: member (of)
Entity mapping: candidates: mo:member_of, mo:member (inverse of)
Image source: http://musicontology.com
54. Semantic Search: Example (4)
User query (NL): "songs written by members of the beatles"
Entity extraction: song | written by | member (of) | (the) beatles
Entity mapping for "(the) beatles" — candidates: Beatles (Book), The Beatles (Music Group), Beatle (Animal), Beatle (Automobile)
How to identify the right "Beatle"? Examine the context (contextual analysis)
55. Semantic Search: Example (5)
User query (NL): "songs written by members of the beatles"
Entity extraction: song | written by | member (of) | (the) beatles
Contextual analysis: the query subgraph connects mo:Track via mo:composer to foaf:Agent; mo:MusicArtist and mo:MusicGroup are subclasses (rdfs:subClassOf) of foaf:Agent, and mo:member links a group to its members. This subgraph is part of the query.
Entity mapping: (the) beatles → The Beatles (Music Group) → dbpedia:The_Beatles
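The contextual analysis step can be sketched as a scoring problem: pick the candidate whose types overlap most with the classes implied by the rest of the query. The type sets below are illustrative stand-ins for what a real system would read from the RDF graph around each candidate resource:

```python
# Hypothetical type information for each candidate entity.
CANDIDATES = {
    "Beatles (Book)":            {"types": {"Book"}},
    "The Beatles (Music Group)": {"types": {"MusicGroup",
                                            "MusicArtist", "Agent"}},
    "Beatle (Animal)":           {"types": {"Animal"}},
    "Beatle (Automobile)":       {"types": {"Automobile"}},
}

def disambiguate(candidates, query_context):
    """Return the candidate whose types overlap most with the classes
    implied by the rest of the query (here mo:Track, mo:composer and
    mo:member imply a music group / agent)."""
    def score(name):
        return len(candidates[name]["types"] & query_context)
    return max(candidates, key=score)

# Classes derived from the query subgraph on the slide above.
query_context = {"MusicGroup", "Agent"}
best = disambiguate(CANDIDATES, query_context)
```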
56. Semantic Search: Example (6)
User query (NL): "songs written by members of the beatles"
Entity extraction: song | written by | member (of) | (the) beatles
Query graph: ?y is a mo:Track whose composer ?x is a foaf:Agent who is a member of dbpedia:The_Beatles
Results (answers presented to the user; the results could be ranked):
• (I want to) Come Home
• Angel in Disguise
• Another Day
• …
57. Semantic Search
• Aims at understanding the meaning of the resources specified in the query
• Different approaches to exploit semantics:
  • Query expansion using ontologies: since ontologies represent knowledge about specific domains, they can be used to expand the query by incorporating related ontology terms into the query.
  • Contextual analysis: in LD, this approach may explore the resources specified in the query and their adjacent nodes in the RDF graph. Mainly applied to disambiguate query terms.
  • Reasoning: in some cases, the answer to a specific query is not explicitly contained in the data, but it can be computed by using reasoning methods.
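The query-expansion approach can be sketched in a few lines. The synonym sets are illustrative; a real system would derive them from an ontology or a lexical resource such as WordNet:

```python
# Illustrative synonym sets for terms from the running example.
SYNONYMS = {
    "song": {"track", "melody", "tune"},
    "written by": {"writer", "composer", "creator"},
}

def expand(term):
    """Query expansion: return the term together with its synonyms.
    The expanded set is then matched against ontology terms, e.g.
    'track' maps to mo:Track and 'composer' to mo:composer."""
    return {term} | SYNONYMS.get(term, set())
```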
58. Semantic Search & Linked Data
Semantic search vs. SPARQL query:
• Keyword or NL / concept matching — semantic search: performs entity extraction and matching to formal concepts; SPARQL query: not supported
• Fuzzy concepts/relations/logics — semantic search: allows the application of fuzzy qualifiers as query constraints; SPARQL query: not supported
• Graph patterns — semantic search: uses the context and other semantic information to locate interesting sub-graphs; SPARQL query: applies pattern matching
• Path discovery — semantic search: finds new interesting links that may lead to additional information; SPARQL query: not supported
59. Semantic Search: Google (1)
Google performs semantic search on certain entities and queries!
Input: query in NL
Output: list of answers
60. Semantic Search: Google (2)
Input: question in NL
Output: list of web pages, ranked using the Google PageRank algorithm to display the most relevant pages first
61. Semantic Search: DuckDuckGo (1)
Input: question in NL
Output: list of answers
62. Semantic Search: DuckDuckGo (2)
• Performs disambiguation of the query terms
• The 45 suggestions are grouped by classes according to their corresponding knowledge domain
• This approach is called faceted search
63. Faceted Search: Example
Information Workbench: searching for artists in categories — depictions of artists are shown, with facets for narrowing the results
Source: http://musicbrainz.fluidops.net/resource/mo:MusicArtist?view=pivot
64. Faceted Search
• Facets = properties
• Suitable for browsing multi-dimensional taxonomies based on the search attributes
• Allows the user to explore the data:
  • The user submits a (keyword) query
  • The faceted system dynamically identifies the relevant facets (properties) for the given query and the constraints (values of those properties), and displays the search results
  • The user may "drill down" by selecting specific constraints on the search results
• Information can be accessed and ranked in multiple ways
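The facet-identification and drill-down loop described above can be sketched directly; the result set below is illustrative, but the mechanics (derive facets from the current results, then filter) are exactly the faceted pattern:

```python
from collections import Counter

# Illustrative result set: each entry is a resource with its properties.
RESULTS = [
    {"type": "MusicArtist", "country": "GB", "decade": "1960s"},
    {"type": "MusicArtist", "country": "US", "decade": "1960s"},
    {"type": "MusicGroup",  "country": "GB", "decade": "1970s"},
]

def compute_facets(results):
    """Dynamically derive the facets (properties) and their value
    counts from the current result set."""
    facets = {}
    for entry in results:
        for prop, value in entry.items():
            facets.setdefault(prop, Counter())[value] += 1
    return facets

def drill_down(results, prop, value):
    """Apply a facet constraint, narrowing the result set."""
    return [e for e in results if e.get(prop) == value]
```

After each `drill_down`, `compute_facets` is recomputed on the narrowed set, so the displayed facets and counts always reflect the current results.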
65. Faceted Search (2)
Challenges for supporting faceted search:
• Identifying which facets to surface:
  • In heterogeneous datasets, data entries may have different facets
  • Dynamically identifying the most appropriate facets for each query
  • Ordering the facets depending on their relevance to the query
• Computing previews:
  • Accurately predicting counts without examining all the results
  • Offering facet previews to give users an idea of what to expect
Source: Teevan, J., Dumais, S., Gutt, Z. Challenges for Supporting Faceted Search in Large, Heterogeneous Corpora like the Web
66. Faceted Search: LD Example (1)
FacetedDBLP
• Retrieves information from the DBLP collection
• Shows the result set with different facets: publication years, authors, conferences
• It is implemented on top of the DBLP++ dataset (an enhancement of DBLP including additional keywords and abstracts):
  • DBLP++ is stored in a MySQL database
  • Uses the D2R server to publish the data as RDF triples
67. Faceted Search: LD Example (2)
FacetedDBLP — input: "crowdsourcing"; 485 results, with facets for narrowing them
68. Classification of Search Engines
• Semantic search systems: Google (GKG), Bing, KIM, sig.ma, DuckDuckGo, Hakia, SenseBot, PowerSet, DeepDive, Kosmix, Factibles, Lexxe
• Faceted search systems: LOD cloud cache, /facet, Longwell, mSpace, Exhibit (SIMILE), PoolParty Semantic Search Server
• Information Workbench supports both
69. Searching for Semantic Data
Search for:
• Ontologies
• Vocabularies
• RDF documents
70. Semantic Data Search Engines (1)
Searching for ontologies (keyword search):
• Swoogle: http://swoogle.umbc.edu
• Watson: http://kmi-web05.open.ac.uk/WatsonWUI
71. Semantic Data Search Engines (2)
Searching for vocabularies: LOV portal
• Allows searching for properties, classes or vocabularies in the Linked Open Vocabularies (LOV) catalog
• The LOV search engine implements faceted search on:
  • The knowledge domain
  • The role of the resource matched from the input query
  • The vocabulary containing the resource
• Results are ranked according to a score considering:
  • Relevancy to the query (string)
  • The importance of the matched element labels
  • The number of LOV vocabularies that refer to the element
72. Semantic Data Search Engines (3)
Searching for vocabularies: LOV portal — input: "artist"; 84 results, narrowed via facets
73. Semantic Data Search Engines (4)
Searching for documents:
• Semantic Web Search Engine (SWSE): http://swse.deri.org
• Sindice: http://sindice.com
74. METHODS FOR LINKED DATA ANALYSIS
75. Features of Data Analysis
Statistical analysis
• Allows describing the data via Exploratory Data Analysis (EDA) methods
• Includes statistical inference and prediction
Data aggregation & filtering
• One of the first steps in data analysis is pre-processing, in order to select the appropriate data to study
Machine learning
• Focuses on prediction
• Combines Artificial Intelligence and Statistics
• Includes supervised and unsupervised learning (not covered in this course)
Visualization techniques can be built on top of these as part of data analysis
76. LD Data Aggregation & Filtering
• Data aggregation refers to merging/summarizing several values into a single one
• Filtering retrieves the relevant data properties and selects a particular range of data values
• SPARQL supports both features via SELECT queries, as follows:
Feature       SPARQL capabilities
Aggregation   Aggregate functions (COUNT, SUM, AVG, …) combined with the GROUP BY operator
Filtering     Projection combined with the FILTER and HAVING operators
77. LD Statistical Analysis
• Statistical analysis supports descriptive and predictive
operations
• SPARQL supports some descriptive operations (average,
maximum, minimum) but does not offer more sophisticated
statistical features like:
• Fitting distributions
• Linear regressions
• Analysis of variance
• …
• Some approaches are able to consume data retrieved from
SPARQL endpoints:
– “R for SPARQL” by Willem Robert van Hage & Tomi Kauppinen
– “Performing Statistical Methods on Linked Data” by Zapilko & Mathiak
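The pattern behind such approaches can be sketched as follows: fetch numeric values with a SPARQL SELECT query, then apply a statistical method that SPARQL lacks on the client side (here a least-squares linear regression). The rows below stand in for the bindings a SELECT query over hypothetical year/deforestation data would return.

```python
# Result rows standing in for SPARQL SELECT bindings (?year, ?defor);
# the values are invented for illustration.
rows = [(2002, 1.0), (2003, 1.5), (2004, 2.0), (2005, 2.5)]

xs = [r[0] for r in rows]
ys = [r[1] for r in rows]

# Ordinary least squares: slope = cov(x, y) / var(x)
n = len(xs)
mx = sum(xs) / n
my = sum(ys) / n
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
intercept = my - slope * mx
print(slope)  # 0.5: deforestation grows by 0.5 units per year in the toy data
```

This is exactly the division of labor in the tools above: SPARQL does the selection and aggregation, the statistical environment does the modeling.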
78. R – Statistical Computing
• R is a language and environment for statistical computing
• R provides a wide variety of statistical and graphical
techniques
• Linear and nonlinear modeling
• Classical statistical tests
• Time-series analysis
• Classification (Machine Learning)
• Clustering (Machine Learning)
• Extensible with further functionality
• R is available as Free Software (under the terms of the
GNU General Public License)
80. R for SPARQL
• The R for SPARQL package makes it possible to:
• Connect to a SPARQL endpoint over HTTP
• Pose a SELECT query or an UPDATE operation (LOAD, INSERT, DELETE)
• If given a SELECT query, it returns the results as a data frame
• The results can be directly mapped and visualized
• Posing requests:
• If the parameter query is given, the input is assumed to be a SELECT query,
and a GET request is performed against the URL of the endpoint to retrieve
the results
• If the parameter update is given, the input is assumed to be an UPDATE
operation, and a POST request is submitted to the URL of the endpoint;
nothing is returned
Source: http://linkedscience.org/tools/sparql-package-for-r/
81. R for SPARQL: Example (1)
1. Download the R package and load it:
• library(SPARQL)
• library(sp) # used for plotting spatial data
2. Define the endpoint with the triples
• endpoint = "http://spatial.linkedscience.org/sparql"
3. Define the query
• q = "SELECT ?cell ?row ?col ?polygon ?DEFOR_2002
WHERE {
?cell a <http://linkedscience.org/lsv/ns#Item> ;
<http://spatial.linkedscience.org/context/amazon/Lin> ?row ;
<http://spatial.linkedscience.org/context/amazon/Col> ?col ;
<http://observedchange.com/tisc/ns#geometry> ?polygon ;
<http://spatial.linkedscience.org/context/amazon/DEFOR_2002> ?DEFOR_2002 .
}"
82. R for SPARQL: Example (2)
4. Assign the result to an object
• res <- SPARQL(endpoint, q)$results
5. Handle the results
• res$row <- -res$row # flip the y-axis
• coordinates(res) <- ~ col + row
• gridded(res) <- TRUE # turns res into a SpatialPixelsDataFrame
6. Choose the graphical format and plot the results
• spplot(res, "DEFOR_2002", col.regions=rev(heat.colors(17))[-1],
at=(0:16)/100, main="relative deforestation per pixel during 2002")
83. R for SPARQL: Example (3)
[Resulting map: relative deforestation per pixel during 2002]
84. Machine Learning
• Machine Learning techniques make it possible to extract interesting
information from data sources, and can be used to discover
hidden patterns within datasets by generalizing from examples
• Different ML approaches can be applied:
• Clustering: groups similar data into data partitions called clusters
• Association rule learning: discovers relations between variables
• Decision tree learning: analyses observations to build a predictive
model represented as a tree
• Many others …
• Weka is a Data Mining framework commonly used to apply ML
on tabular data:
– www.cs.waikato.ac.nz/ml/weka
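The clustering approach mentioned above can be illustrated with a minimal pure-Python, one-dimensional k-means sketch; the data values, the number of clusters, and the initial centroids are invented for illustration.

```python
# Minimal 1-D k-means sketch: similar values are grouped into partitions
# (clusters) around iteratively refined centroids.
def kmeans_1d(values, centroids, iterations=10):
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest centroid
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

values = [1.0, 1.2, 0.8, 9.9, 10.1, 10.0]
centroids, clusters = kmeans_1d(values, centroids=[0.0, 5.0])
print(centroids)  # roughly [1.0, 10.0]: two well-separated groups
```

Real frameworks such as Weka provide multi-dimensional versions of this and the other algorithms listed above.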
85. Machine Learning on LD
Challenges for applying Machine Learning on LD
• LD heterogeneity introduces noise to the data:
– Same LD resources, different URIs
– Predicates with similar semantics, but different constraints
• The data is not independent and identically distributed (iid):
– It does not consist of only one type of objects
– The entities are related to each other
• LD rarely contains the negative examples needed by ML
algorithms:
– For example, assertions such as owl:differentFrom are scarce
Source: http://www.cip.ifi.lmu.de/~nickel/iswc2012-slides
86. Applications of Machine Learning on LD
• Node ranking:
– Ranking nodes according to their relevance for a query
• Link prediction:
– Infer edges between LD resources
– Predict the new edges that will be added to the RDF graph
• Entity resolution:
– Determine whether two URIs correspond to the same real-world object
• Taxonomy learning:
– Infer taxonomies or concept hierarchies from a given
vocabulary or ontology
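One common entity-resolution heuristic can be sketched as follows: compare the labels attached to two URIs with a string-similarity measure and treat them as the same entity above a threshold. Real systems combine many such signals; the labels and the 0.85 threshold here are illustrative assumptions.

```python
# Label-similarity heuristic for entity resolution, using Python's
# stdlib SequenceMatcher. Threshold and examples are invented.
from difflib import SequenceMatcher

def same_entity(label_a, label_b, threshold=0.85):
    """True if the two labels are similar enough to merge their URIs."""
    ratio = SequenceMatcher(None, label_a.lower(), label_b.lower()).ratio()
    return ratio >= threshold

print(same_entity("Tim Berners-Lee", "Tim Berners Lee"))  # True
print(same_entity("Tim Berners-Lee", "Tim O'Reilly"))     # False
```

In practice this would be one feature among several (shared property values, owl:sameAs links, type compatibility) rather than a decision on its own.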
87. Summary
• Linked Data visualization techniques:
• Visualizations must be chosen according to the type of the data
• A wide variety of tools support the visualization of SPARQL results
• Visualizations can be used in dashboards to support administrative tasks
• Linked Data search:
• Semantic search: exploits the meaning of user queries (natural language or
sets of keywords) to present useful results
• Faceted search: allows browsing multi-dimensional data
• Linked Data analysis:
• Includes data manipulation such as aggregation & filtering
• Applies statistical methods to gain a better understanding of the data
• Machine Learning techniques can be applied for predictive analysis
• Visualization techniques can be built on top of the previous features
88. For exercises, quizzes and further material visit our website:
http://www.euclid-project.eu
Other channels: @euclid_project, euclidproject, eBook, Course
89. Acknowledgements
• Alexander Mikroyannidis
• Alice Carpentier
• Andreas Harth
• Andreas Wagner
• Andriy Nikolov
• Barry Norton
• Daniel M. Herzig
• Elena Simperl
• Günter Ladwig
• Inga Shamkhalov
• Jacek Kopecky
• John Domingue
• Juan Sequeda
• Kalina Bontcheva
• Maria Maleshkova
• Maria-Esther Vidal
• Maribel Acosta
• Michael Meier
• Ning Li
• Paul Mulholland
• Peter Haase
• Richard Power
• Steffen Stadtmüller