Dr. Jesús Barrasa's slides from his talk at Connected Data London. Jesús, a senior field engineer at Neo4j, presented how semantic web principles can be used in a graph database.
Semantic Variation Graphs: the case for RDF & SPARQL, by Jerven Bolleman
Presentation given to the GA4GH data working group. It starts with an introduction to what RDF is, followed by how one can model genomic variation graphs in RDF, and then shows how one can use SPARQL to query this data.
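To make the idea concrete, here is a toy sketch of a variation graph modelled as RDF-style triples, with a tiny pattern matcher standing in for a SPARQL basic graph pattern. All names (`ex:node1`, `ex:sequence`, `ex:linksTo`) are invented for illustration; they are not the vocabulary used in the talk.

```python
# Illustrative sketch only: a variation graph as (subject, predicate, object)
# triples, plus a minimal matcher that behaves like one SPARQL triple pattern.

EX = "http://example.org/"

# Nodes carry sequences; edges link successive nodes in the graph.
triples = {
    (EX + "node1", EX + "sequence", "ACGT"),
    (EX + "node1", EX + "linksTo",  EX + "node2"),
    (EX + "node2", EX + "sequence", "T"),
    (EX + "node2", EX + "linksTo",  EX + "node3"),
    (EX + "node3", EX + "sequence", "GGA"),
}

def match(pattern, triples):
    """Match one (s, p, o) pattern; None plays the role of a SPARQL variable."""
    s, p, o = pattern
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# Roughly: SELECT ?next WHERE { ex:node1 ex:linksTo ?next }
successors = [o for _, _, o in match((EX + "node1", EX + "linksTo", None), triples)]
print(successors)
```

A real system would of course use a triplestore and actual SPARQL; the point is only that graph nodes, their sequences, and their links all reduce to uniform triples.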
Property graph vs. RDF Triplestore comparison in 2020, by Ontotext
This presentation goes all the way from an introduction to what graph databases are, to a table comparing RDF vs. property graphs, plus two diagrams presenting the market circa 2020.
The Semantic Web is a mesh of information linked up in such a way as to be easily processable by machines, on a global scale. You can think of it as being an efficient way of representing data on the World Wide Web, or as a globally linked database.
Enterprise systems are increasingly complex, often requiring data and software components to be accessed and maintained by different company departments. This complexity often becomes an organization’s biggest challenge as changing data fields and adding new applications rapidly grow to meet business demands for increased customer insights.
These slides are from a Webinar discussing how using SHACL and JSON-LD with AllegroGraph helps our customers simplify the complexity of enterprise systems through the ability to loosely combine independent elements, while allowing the overall system to function smoothly.
In this webinar we will demonstrate how AllegroGraph's SHACL validation engine confirms whether JSON-LD data conforms to the desired requirements. We will describe how SHACL provides a way for a Data Graph to specify the Shapes Graph that should be used for validation, and how a given shape is linked to targets in the data.
The recording is at youtube.com/allegrograph
Although you may not have heard of JavaScript Object Notation Linked Data (JSON-LD), it is already impacting your business. Search engine giants such as Google have mandated JSON-LD as a preferred means of adding structured data to web pages to make them considerably easier to parse for more accurate search engine results. The Google use case is indicative of the larger capacity for JSON-LD to increase web traffic for sites and better guide users to the results they want.
Expectations are high for JSON-LD, and with good reason. JSON-LD effectively delivers the many benefits of JSON, a lightweight data interchange format, into the linked data world. Linked data is the technological approach supporting the World Wide Web and one of the most effective means of sharing data ever devised.
In addition, the growing number of enterprise knowledge graphs fully exploit the potential of JSON-LD as it enables organizations to readily access data stored in document formats and a variety of semi-structured and unstructured data as well. By using this technology to link internal and external data, knowledge graphs exemplify the linked data approach underpinning the growing adoption of JSON-LD—and the demonstrable, recurring business value that linked data consistently provides.
Join us to learn more about optimizing the unique Document and Graph Database capabilities provided by AllegroGraph to develop or enhance your Enterprise Knowledge Graph using JSON-LD.
I'll trace the trajectory schema.org has taken, starting with a history that is less a retrospective than a narrative. I'll follow this narrative to the fortunately-timed emergence of JSON-LD, which provides a flexible, standards-based serialization of the vocabulary.
This, I'll explain, helped fuel the popularity of schema.org, which in turn has caused a demand for more schemas, growing the vocabulary and its capabilities. I'll make the case that schema.org has started to resemble exactly what everyone involved in the initiative declared it shouldn't be: an ontology of everything.
Whether or not that is the case, I'll say, the utility of having a relatively simple, well thought-out, well-understood and very broad vocabulary available has made schema.org (along with JSON-LD) a go-to tool for linked data modelers.
Finally, with a look at the many ways Google, in particular, has made use of schema.org, I'll explore to what extent its utility extends past being a convenient starting point for back-of-the-napkin knowledge graph development, or whether it's making a significant contribution to realizing the promise of a web of data.
A view on data quality in the real estate domain.
Presented at the LDQ workshop, colocated with SEMANTICS 2017 conference.
See https://2017.semantics.cc/satellite-events/linked-data-quality-assessment-and-improvement-academia-industry for more details.
Transforming Your Data with GraphDB: GraphDB Fundamentals, Jan 2018, by Ontotext
These are slides from a live webinar that took place in January 2018.
GraphDB™ Fundamentals builds the basis for working with graph databases that utilize the W3C standards, and particularly GraphDB™. In this webinar, we demonstrated how to install and set up GraphDB™ 8.4 and how you can generate your first RDF dataset. We also showed how to quickly integrate complex and highly interconnected data using RDF and SPARQL, and much more.
With the help of GraphDB™, you can start smartly managing your data assets, visually represent your data model and get insights from them.
First Steps in Semantic Data Modelling and Search & Analytics in the Cloud, by Ontotext
This webinar will break down the roadblocks that prevent many from reaping the benefits of heavyweight Semantic Technology in small-scale projects. We will show you how to build Semantic Search & Analytics proofs of concept by using managed services in the Cloud.
Knowledge Discovery tools using Linked Data techniques - Presentation for the Linked Data 4 Knowledge Discovery Workshop at the ECML/PKDD 2015 conference - http://events.kmi.open.ac.uk/ld4kd2015/
GraphDB Cloud: Enterprise Ready RDF Database on Demand, by Ontotext
GraphDB Cloud is an enterprise-grade RDF graph database providing high-performance querying over large volumes of RDF data. In this webinar, Ontotext demonstrates how to instantly create and deploy a fully managed Graph Database, then import & query data with the (OpenRDF) GraphDB Workbench, and finally explore and visualize data with the built-in visualization tools.
Knowledge graph embeddings are a mechanism that projects each entity in a knowledge graph to a point in a continuous vector space. It is commonly assumed that those approaches project two entities close to each other if they are similar and/or related. In this talk, I take a closer look at the roles of similarity and relatedness with respect to knowledge graph embeddings, and discuss how the well-known embedding mechanism RDF2vec can be tailored towards focusing on similarity, relatedness, or both.
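The underlying idea of "closeness" in the vector space can be illustrated with a toy example. This is not RDF2vec itself; the entities and 3-dimensional vectors below are invented purely to show how cosine similarity reads proximity as similarity or relatedness.

```python
# Toy sketch: cosine similarity between invented "embedding" vectors.
import math

embeddings = {
    "Berlin":  [0.90, 0.10, 0.30],
    "Hamburg": [0.85, 0.15, 0.35],  # intended as a similar entity (another city)
    "Germany": [0.20, 0.90, 0.40],  # intended as related, but not similar
}

def cosine(u, v):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

sim_city = cosine(embeddings["Berlin"], embeddings["Hamburg"])
sim_country = cosine(embeddings["Berlin"], embeddings["Germany"])
print(sim_city > sim_country)  # True: Berlin sits closer to Hamburg than to Germany
```

Whether an embedding method places "Berlin" nearer to "Hamburg" (similarity) or to "Germany" (relatedness) is exactly the design choice the talk examines.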
Reasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven Recipes, by Ontotext
This presentation will provide a brief introduction to logical reasoning and an overview of the most popular semantic schema and ontology languages: RDFS and the profiles of OWL 2.
While automatic reasoning has always inspired the imagination, numerous projects have failed to deliver on their promises. The typical pitfalls related to ontologies and symbolic reasoning fall into three categories:
- Over-engineered ontologies. The selected ontology language and modeling patterns can be too expressive. This can make the results of inference hard to understand and verify, which in turn makes the KG hard to evolve and maintain. It can also impose performance penalties far greater than the benefits.
- Inappropriate reasoning support. There are many inference algorithms and implementation approaches that work well with taxonomies and conceptual models of a few thousand concepts, but cannot cope with KGs of millions of entities.
- Inappropriate data layer architecture. One such example is reasoning with virtual KG, which is often infeasible.
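As a concrete illustration of the lightweight end of the spectrum the talk contrasts with heavier OWL reasoning, here is a minimal sketch of RDFS-style forward chaining: materialising `rdf:type` facts along an `rdfs:subClassOf` hierarchy until a fixpoint is reached. The class names are invented.

```python
# Minimal sketch of RDFS subclass inference (forward chaining to a fixpoint).
# This is the kind of simple, predictable inference that tends to scale,
# in contrast to expressive OWL reasoning over large KGs.

subclass_of = {          # rdfs:subClassOf assertions (child -> parent)
    "Employee": "Person",
    "Manager": "Employee",
}
types = {"alice": {"Manager"}}   # asserted rdf:type facts

# Keep adding superclass types until nothing changes.
changed = True
while changed:
    changed = False
    for entity, classes in types.items():
        for c in list(classes):
            parent = subclass_of.get(c)
            if parent and parent not in classes:
                classes.add(parent)
                changed = True

print(sorted(types["alice"]))  # ['Employee', 'Manager', 'Person']
```

Because each rule application only ever adds facts, this materialisation terminates and its results are easy to inspect, which is exactly the maintainability property over-engineered ontologies lose.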
[Webinar] FactForge Debuts: Trump World Data and Instant Ranking of Industry ..., by Ontotext
This webinar continues a series demonstrating how linked open data and semantic tagging of news can be used for comprehensive media monitoring, market and business intelligence. The platform for the demonstrations is FactForge: a hub for news and data about people, organizations, and locations (POL). FactForge embodies a big knowledge graph (BKG) of more than 1 billion facts that allows various analytical queries, including tracing suspicious patterns of company control and media monitoring of people, including companies owned by them, their subsidiaries, etc.
Building materialised views for linked data systems using microservices, by Connected Data World
In the BBC’s Content Distribution Services division, we build and maintain systems that provide content metadata to a wide range of audience-facing products.
Our current architecture for distributing tagging metadata consists mainly of two RDF-based read and write APIs feeding off a central triplestore. This single storage setup for all operations imposes restrictions on performance and scalability.
I will talk about our work to create an event-driven distribution pipeline that generates materialised views of tagging metadata.
The new microservices architecture comprises small, single-purpose services, lambda functions, event stores, queues, streams, etc. The views are built on data stores that are optimised to serve specific query profiles, thus improving the overall performance and scalability of the system.
PoolParty Semantic Suite is Semantic Web Company's platform for enterprise information integration based on Linked Data principles. PoolParty consists of several components that process and manage RDF-based data sets. These components have consistency requirements for the data they work on.
Users also have requirements for the quality of the data they manage. We want to express constraints for both in a standard way throughout PoolParty components. SKOS-based PoolParty Thesaurus project data requires both consistency and quality.
Data curation and data archiving at different stages of the research process, by Andrea Scharnhorst
Henk van den Berg, Jerry de Vries, Andrea Scharnhorst (2019) Data curation and data archiving at different stages of the research process. Presentation given at the DANS Colloquium on Research and Data: Women readers finding their literary foremothers, March 21, 2019, The Hague
Semantic Web technologies (such as RDF and SPARQL) excel at bringing together diverse data in a world of independent data publishers and consumers. Common ontologies help to arrive at a shared understanding of the intended meaning of data.
However, they don’t address one critically important issue: What does it mean for data to be complete and/or valid? Semantic knowledge graphs without a shared notion of completeness and validity quickly turn into a Big Ball of Data Mud.
The Shapes Constraint Language (SHACL), an upcoming W3C standard, promises to help solve this problem. By keeping semantics separate from validity, SHACL makes it possible to resolve a slew of data quality and data exchange issues.
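The core idea can be sketched without any SHACL machinery at all: a "shape" declares which properties a node must have and of what datatype, entirely separately from what the data means. The following toy validator is not SHACL syntax or a SHACL engine, just an illustration of that separation; the `person_shape` and its fields are invented.

```python
# Toy illustration of shape-based validation (the idea behind SHACL,
# not its syntax): a shape is a declaration of required properties
# and expected datatypes, checked against data nodes.

person_shape = {       # invented shape: required property -> expected type
    "name": str,
    "age": int,
}

def validate(node, shape):
    """Return a list of violation messages; an empty list means conformance."""
    violations = []
    for prop, expected in shape.items():
        if prop not in node:
            violations.append(f"missing required property: {prop}")
        elif not isinstance(node[prop], expected):
            violations.append(f"{prop} has wrong datatype")
    return violations

ok = validate({"name": "Ada", "age": 36}, person_shape)
bad = validate({"name": "Ada", "age": "36"}, person_shape)
print(ok, bad)  # [] ['age has wrong datatype']
```

Real SHACL adds far more (targets, cardinalities, SPARQL-based constraints, validation reports as RDF), but the principle is the same: validity is defined by shapes, not by the ontology's semantics.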
Presented at the Lotico Berlin Semantic Web Meetup.
GraphQL and its schema as a universal layer for database access, by Connected Data World
GraphQL is a query language mostly used to streamline access to REST APIs. It is seeing tremendous growth and adoption, in organizations like Airbnb, Coursera, Docker, GitHub, Twitter, Uber, and Facebook, where it was invented.
As REST APIs are proliferating, the promise of accessing them all through a single query language and hub, which is what GraphQL and GraphQL server implementations bring, is alluring.
A significant recent addition to GraphQL was SDL, its schema definition language. SDL enables developers to define a schema governing interaction with the back-end that GraphQL servers can then implement and enforce.
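The schema-first idea can be sketched in a few lines. To stay self-contained this uses a hand-rolled Python stand-in rather than GraphQL SDL or a real GraphQL server; the `User` type and its fields are invented. The point is only the contract: the schema declares what a type exposes, and anything outside it is rejected.

```python
# Hand-rolled sketch of schema-first enforcement (not GraphQL syntax):
# the schema declares each type's allowed fields; the "server" rejects
# requests for anything the schema does not govern.

schema = {"User": {"id", "name", "email"}}   # invented schema

def execute(type_name, requested_fields, data):
    """Return only the requested fields, refusing any not in the schema."""
    allowed = schema[type_name]
    unknown = [f for f in requested_fields if f not in allowed]
    if unknown:
        raise ValueError(f"unknown fields: {unknown}")
    return {f: data[f] for f in requested_fields}

user = {"id": 1, "name": "Ada", "email": "ada@example.org"}
print(execute("User", ["id", "name"], user))  # {'id': 1, 'name': 'Ada'}
```

In real GraphQL the schema additionally drives introspection, tooling, and type checking of arguments, which is what makes SDL attractive as an interface contract between systems.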
Prisma is a productized version of the data layer leveraging GraphQL to access any database. Prisma works with MySQL, Postgres, and MongoDB, and is adding to this list.
Prisma sees the GraphQL community really coming together around the idea of schema-first development, and wants to use GraphQL SDL as the foundation for all interfaces between systems.
These slides were presented as part of a W3C tutorial at the CSHALS 2010 conference (http://www.iscb.org/cshals2010). The slides are adapted from a longer introduction to the Semantic Web available at http://www.slideshare.net/LeeFeigenbaum/semantic-web-landscape-2009 .
A PDF version of the slides is available at http://thefigtrees.net/lee/sw/cshals/cshals-w3c-semantic-web-tutorial.pdf .
Understanding RDF: the Resource Description Framework in Context (1999), by Dan Brickley
Dan Brickley, 3rd European Commission Metadata Workshop, Luxembourg, April 12th 1999
Understanding RDF: the Resource Description Framework in Context
http://ilrt.org/discovery/2001/01/understanding-rdf/
Applying large scale text analytics with graph databases, by Data Ninja API
Data Ninja Services collaborated with Oracle to reach a major milestone in the integration of text analytics with Oracle Spatial and Graph. The Data Ninja Services client in Java can be used to analyze free texts, extract entities, generate RDF semantic graphs, and choose from a number of graph analytics to infer entity relationships. We demonstrated two case studies involving mining health news and detecting anomalies in product reviews.
Re-using Media on the Web: Media fragment re-mixing and playout, by MediaMixerCommunity
A number of novel application ideas will be introduced based on the media fragment creation, specification and rights management technologies. Semantic search and retrieval allows us to organize sets of fragments by topical or conceptual relevance. These fragment sets can then be played out in a non-linear fashion to create a new media re-mix. We look at a server-client implementation supporting Media Fragments, before allowing the participants to take the sets of media they have selected and create their own re-mix.
Invited talk at USEWOD2014 (http://people.cs.kuleuven.be/~bettina.berendt/USEWOD2014/)
A tremendous amount of machine-interpretable information is available in the Linked Open Data Cloud. Unfortunately, much of this data remains underused as machine clients struggle to use the Web. I believe this can be solved by giving machines interfaces similar to those we offer humans, instead of separate interfaces such as SPARQL endpoints. In this talk, I'll discuss the Linked Data Fragments vision on machine access to the Web of Data, and indicate how this impacts usage analysis of the LOD Cloud. We all can learn a lot from how humans access the Web, and those strategies can be applied to querying and analysis. In particular, we have to focus first on solving those use cases that humans can do easily, and only then consider tackling others.
After the amazing breakthroughs of machine learning (deep learning or otherwise) in the past decade, the shortcomings of machine learning are also becoming increasingly clear: unexplainable results, data hunger and limited generalisability are all becoming bottlenecks.
In this talk we will look at how the combination with symbolic AI (in the form of very large knowledge graphs) can give us a way forward, towards machine learning systems that can explain their results, that need less data, and that generalise better outside their training set.
--
Frank van Harmelen leads the Knowledge Representation & Reasoning group in the CS Department of the VU University Amsterdam. He is also Principal Investigator of the Hybrid Intelligence Centre, a €20M, 10-year collaboration between researchers at 6 Dutch universities into AI that collaborates with people instead of replacing them.
--
While mathematicians have used graph theory since the 18th century to solve problems, the software patterns for graph data are new to most developers. To enable "mass adoption" of graph technology, we need to establish the right abstractions, access APIs, and data models.
RDF triples, while of paramount importance in establishing RDF graph semantics, are a low-level abstraction, much like using assembly language. For practical and productive “graph programming” we need something different.
Similarly, existing declarative graph query languages (such as SPARQL and Cypher) are not always the best way to access graph data, and sometimes you need a simpler interface (e.g., GraphQL), or even a different approach altogether (e.g., imperative traversals such as with Gremlin).
Ora Lassila is a Principal Graph Technologist in the Amazon Neptune graph database group. He has a long experience with graphs, graph databases, ontologies, and knowledge representation. He was a co-author of the original RDF specification as well as a co-author of the seminal article on the Semantic Web.
Knowledge Architecture: Combining Strategy, Data Science and Information Architecture, by Connected Data World
"The most important contribution management needs to make in the 21st Century is to increase the productivity of knowledge work and the knowledge worker", said Peter F. Drucker in 1999, and time has proven him right.
Even NASA is no exception, as it faces a number of challenges. NASA has hundreds of millions of documents, reports, project data, lessons learned, scientific research, medical analysis, geospatial data, IT logs, and all kinds of other data stored nation-wide.
The data is growing in terms of variety, velocity, volume, value and veracity. NASA needs to provide accessibility to engineering data sources, whose visibility is currently limited. To convert data to knowledge a convergence of Knowledge Management, Information Architecture and Data Science is necessary.
This is what David Meza, Acting Branch Chief - People Analytics, Sr. Data Scientist at NASA, calls "Knowledge Architecture": the people, processes, and technology of designing, implementing, and applying the intellectual infrastructure of organizations.
A talk by Aleksa Gordic | Software - Deep Learning engineer, Microsoft | The AI Epiphany
What can you learn about Graph Machine Learning in 2 months?
Aleksa Gordic, Machine Learning engineer @ Microsoft and Founder @ The AI Epiphany, shares his journey in the world of Graph Machine Learning. Aleksa started by exploring the basics of Graph Machine Learning, and ended up implementing and open-sourcing his own Graph Attention Network in PyTorch.
In this talk, Aleksa will share the fundamentals of Graph Machine Learning, provide real-world examples, resources, and everything his younger self would be grateful for. Aleksa will also be available to answer questions.
What is Graph Machine Learning? Simply put, Graph Machine Learning is a branch of machine learning that deals with graph data.
Graphs consist of nodes, that may have feature vectors associated with them, and edges, which again may or may not have feature vectors attached. The applications are endless. Massive-scale recommender systems, particle physics, computational pharmacology / chemistry / biology, traffic prediction, fake news detection, and the list goes on and on.
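The single most important operation behind these methods can be shown in miniature. The sketch below is not a full graph neural network, just the building block: each node forms a new representation by aggregating (here, averaging) its neighbours' feature vectors. The graph and feature values are invented.

```python
# Minimal sketch of neighbourhood aggregation, the core operation behind
# message passing and (with learned weights) graph attention networks.

edges = [(0, 1), (0, 2), (1, 2)]                       # undirected toy graph
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}  # per-node feature vectors

# Build adjacency lists from the edge list.
neighbours = {n: [] for n in features}
for a, b in edges:
    neighbours[a].append(b)
    neighbours[b].append(a)

def aggregate(node):
    """New representation of a node: the mean of its neighbours' features."""
    vecs = [features[m] for m in neighbours[node]]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

print(aggregate(0))  # mean of node 1 and node 2 features -> [0.5, 1.0]
```

Stacking such aggregation layers, and learning how much each neighbour should contribute instead of averaging uniformly, is essentially what a Graph Attention Network does.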
In recent years graphs have been increasingly adopted in financial services for everything from fraud detection to Know Your Customer (KYC) to regulatory requirements. At the same time, Environmental, Social and Governance (ESG) investing has become the fastest-growing segment of financial services. In this session James discusses how many of these historical graph techniques are now being enhanced for the era of sustainable investing. Going beyond definitions, let's identify use cases, discuss news and trends, and wrap up with an ask-me-anything session.
What is graph all about, and why should you care? Graphs come in many shapes and forms, and can be used for different applications: Graph Analytics, Graph AI, Knowledge Graphs, and Graph Databases.
Talk by George Anadiotis. Connected Data London Meetup June 29th 2020.
Up until the beginning of the 2010s, the world was mostly running on spreadsheets and relational databases. To a large extent, it still does. But the NoSQL wave of databases has largely succeeded in instilling the “best tool for the job” mindset.
After relational, key-value, document, and columnar, the latest link in this evolutionary proliferation of data structures is graph. Graph analytics, Graph AI, Knowledge Graphs and Graph Databases have been making waves, included in hype cycles for the last couple of years.
The Year of the Graph marked the beginning of it all before the Gartners of the world got in the game. The Year of the Graph is a term coined to convey the fact that the time has come for this technology to flourish.
The eponymous article that set the tone was published in January 2018 on ZDNet by domain expert George Anadiotis. George has been working with, and keeping an eye on, all things Graph since the early 2000s. He was one of the first to note the continuing rise of Graph Databases, and to bring this technology in front of a mainstream audience.
The Year of the Graph has been going strong since 2018. In August 2018, Gartner started including Graph in its hype cycles. Ever since, Graph has been riding the upward slope of the Hype Cycle.
The need for knowledge on these technologies is constantly growing. To respond to that need, the Year of the Graph newsletter was released in April 2018. In addition, a constant flow of graph-related news and resources is being shared on social media.
To help people make educated choices, the Year of the Graph Database Report was released. The report has been hailed as the most comprehensive of its kind in the market, consistently helping people choose the most appropriate solution for their use case since 2018.
The report, articles, news stream, and the newsletter have been reaching thousands of people, helping them understand and navigate this landscape. We’ll talk about the Year of the Graph, the different shapes, forms, and applications for graphs, the latest news and trends, and wrap up with an ask me anything session.
From Taxonomies and Schemas to Knowledge Graphs: Parts 1 & 2 (Connected Data World)
Parts 1 & 2
Do you have experience in data modeling, or using taxonomies to classify things, and want to upgrade to modeling knowledge graphs? This hands-on workshop with one of the leading knowledge graph practitioners will help you get started.
Part 3
For as long as people have been thinking about thinking, we have imagined that somewhere in the inner reaches of our minds there are ghostly, intangible things called ideas which can be linked together to create representations of the world around us — a world that has a certain structure, conforms to certain rules, and to a certain extent, can be predicted and manipulated on the basis of our ideas.
Rationalist philosophers have struggled for centuries to make a solid case for this intuitive, almost inborn view of human experience, but it is only with the advent of modern computing that we have the opportunity to build machines which truly think the way we think we think.
For the first time, we can give concrete form to our mental representations as graphs or hypergraphs, explicitly specify our mental schemas as ontologies, and formally define the rules by which we reason and act on new information. If we so choose, we can even use these human-like building blocks to construct systems that carry far more information than any single human brain, and that connect and serve millions of people in real time.
As enterprise knowledge graphs become increasingly mainstream, we appear to be headed in that direction, although there is no guarantee that the momentum will continue unless actively sustained. Where knowledge graphs are likely to be the most essential, in the long run, is at the interface between human and machine; mental representation versus formal knowledge representation.
In this talk, we will take a step back from the many practical and social challenges of building large-scale knowledge graphs, which at this point are well-known. Instead, we will take up the quest for an ideal data model for knowledge representation and data integration, seeking common ground among the most popular data models used in industry and open source software, surveying what we suspect to be true of our own inner models, and previewing structure and process in Apache TinkerPop, version 4. We will also take a tentative step forward into the world of augmented perception via graph stream processing.
Graph in Apache Cassandra. The World’s Most Scalable Graph Database (Connected Data World)
Graph databases are everywhere right now. The explosive growth in the graph market coupled with the hype of solving graph problems is causing both excitement and confusion. From labeled property graphs to RDF to pure graph analytics to multi-model databases, the breadth of graph offerings is staggering.
The good news? DataStax has been listening—and building.
In this session, we’ll show you how DataStax Graph is architected into Apache Cassandra to deliver the world’s most scalable graph database. You’ll learn how to integrate Cassandra data into mixed workloads, design scalable property graphs, and even turn your existing tables into graphs.
With your high throughput time series data distributed next to its relationships, what will you build next?
Enterprise Data Governance: Leveraging Knowledge Graph & AI in support of a d... (Connected Data World)
As one of the largest financial institutions worldwide, JP Morgan is reliant on data to drive its day-to-day operations, against an ever-evolving regulatory regime. Our global data landscape poses particular challenges for effectively maintaining data governance and metadata management.
The Data strategy at JP Morgan intends to:
a) generate business value
b) adhere to regulatory & compliance requirements
c) reduce barriers to access
d) democratize access to data
In this talk, we show how JP Morgan leverages semantic technologies to drive the implementation of our data strategy. We demonstrate how we exploit knowledge graph capabilities to answer:
1) What Data do I need?
2) What Data do we have?
3) Where does my Data come from?
4) Where should my Data come from?
5) What Data should be shared most?
Graph applications were once considered “exotic” and expensive. Until recently, few software engineers had much experience putting graphs to work. However, the use cases are now becoming more commonplace.
This talk explores a practical use case, one which addresses key issues of data governance and reproducible research, and depends on sophisticated use of graph technology.
Consider: some academic disciplines such as astronomy enjoy a wealth of data — mostly open data. Popular machine learning algorithms, open source Python libraries, and distributed systems all owe much to those disciplines and their history of big data.
Other disciplines require strong guarantees for privacy and security. Datasets used in social science research involve confidential details about human subjects: medical histories, wages, home addresses for family members, police records, etc.
Those cannot be shared openly, which impedes researchers from learning about related work by others. Reproducibility of research and the pace of science in general are limited. Nonetheless, social science research is vital for civil governance, especially for evidence-based policymaking (US federal law since 2018).
Even when data may be too sensitive to share openly, often the metadata can be shared. Constructing knowledge graphs of metadata about datasets — along with metadata about authors, their published research, methods used, data providers, data stewards, and so on — provides an effective means to tackle hard problems in data governance.
Knowledge graph work supports use cases such as entity linking, discovery and recommendations, axioms to infer about compliance, etc. This talk reviews the Rich Context AI competition and the related ADRF framework used now by more than 15 federal agencies in the US.
We’ll explore knowledge graph use cases, use of open standards and open source, and how this enhances reproducible research. Social science research for the public sector has much in common with data use in industry.
Issues of privacy, security, and compliance overlap, pointing toward what will be required of banks, media channels, etc., and what technologies apply. We’ll look at comparable work emerging in other parts of industry: open source projects, open standards emerging, and in particular a new set of features in Project Jupyter that support knowledge graphs about data governance.
Powering Question-Driven Problem Solving to Improve the Chances of Finding Ne... (Connected Data World)
Making true “molecule”-“mechanism”-“observation” relationship connections is a time consuming, iterative and laborious process. In addition, it is very easy to miss critical information that affects key decisions or helps make plausible scientific connections.
The current practice for deciphering such relationships frequently involves subject matter experts (SMEs) requesting resource from resource-constrained data science departments to refine and redo highly similar ad hoc searches. The result of this is impairment of both the pace and quality of scientific reviews.
In this presentation, I show how semantic integration can be made to ultimately become part of an integrated learning framework for more informed scientific decision making. I will take the audience through our pilot journey and highlight practical learnings that should inform subsequent endeavours.
Semantic similarity for faster Knowledge Graph delivery at scale (Connected Data World)
Knowledge graphs promise a novel platform for better holistic decision making and analytics. Many projects fail to reach their full potential because of the prohibitively high cost of integrating new knowledge from the required information sources.
The talk explains the concept of semantic similarity as a tool for efficient entity clustering and matching based on graph and text embeddings. It will demonstrate the underlying scalable and easy to understand algorithm of Random Indexing.
This work is part of the Ontotext Platform, which increases productivity in developing and maintaining large scale knowledge graphs. The platform enables enterprises to develop and operate on top of such mission-critical systems for decision support, information discovery and metadata management.
Knowledge Graphs and AI to Hyper-Personalise the Fashion Retail Experience at... (Connected Data World)
What is the key to the holistic success of the fastest-growing and most successful companies of our time? Often, the key is the rapid increase in collected and analysed data. Graph databases provide a way to organise data semantically, by classes rather than tables; they are web-aware, and superior to traditional relational or NoSQL data stores for handling deep, complex relationships.
It is these deep, complex relationships that can provide the rich context for hyper-personalising your product offering, inspiring consumers to purchase. In this talk, we describe how we are using artificial intelligence at Farfetch to not only help build a knowledge graph but also to evolve our insights with state-of-the-art graph-based AI.
A world of structured data promises us an incredible future. But most websites struggle to even implement basic schema.org markup. Fewer still represent and connect their pages and content in sophisticated, structured graphs. We can’t reach that incredible future without increasing and improving adoption.
To move forward, we need to make constructing rich structured data as easy as writing a recipe. This isn’t a pipe dream: at Yoast, we think we’ve solved schema for everybody, everywhere. We’d love to share our story.
The relationships between data sets matter. Discovering, analyzing, and learning those relationships is a central part of expanding our understanding, and is a critical step toward being able to predict and act upon the data. Unfortunately, these are not always simple or quick tasks.
To help the analyst we introduce RAPIDS, a collection of open-source libraries, incubated by NVIDIA and focused on accelerating the complete end-to-end data science ecosystem. Graph analytics is a critical piece of the data science ecosystem for processing linked data, and RAPIDS is pleased to offer cuGraph as our accelerated graph library.
Simply accelerating algorithms only addresses a portion of the problem. To address the full problem space, RAPIDS cuGraph strives to be feature-rich, easy to use, and intuitive. Rather than limiting the solution to a single graph technology, cuGraph supports Property Graphs, Knowledge Graphs, Hyper-Graphs, Bipartite graphs, and the basic directed and undirected graph.
A Python API allows the data to be manipulated as a DataFrame, similar and compatible with Pandas, with inputs and outputs being shared across the full RAPIDS suite, for example with the RAPIDS machine learning package, cuML.
This talk will present an overview of RAPIDS and cuGraph, discuss and show examples of how to manipulate and analyze bipartite and property graphs, and show how data can be shared with machine learning algorithms. The talk will include some performance and scalability metrics, then conclude with a preview of upcoming features, like graph query language support, and the general RAPIDS roadmap.
Elegant and Scalable Code Querying with Code Property Graphs (Connected Data World)
Programming is an unforgiving art form in which even minor flaws can cause rockets to explode, data to be stolen, and systems to be compromised. Today, a system tasked to automatically identify these flaws not only faces the intrinsic difficulties and theoretical limits of the task itself, it must also account for the many different forms in which programs can be formulated and account for the awe-inspiring speed at which developers push new code into CI/CD pipelines. So much code, so little time.
The code property graph – a multi-layered graph representation of code that captures properties of code across different abstractions – (application code, libraries and frameworks) – has been developed over the last six years to provide a foundation for the challenging problem of identifying flaws in program code at scale, whether it is high-level dynamically-typed Javascript, statically-typed Scala in its bytecode form, the syntax trees generated by Roslyn C# compiler, or the bitcode that flows through LLVM.
Based on this graph, we define a common query language, built on a formal code property graph specification, to elegantly analyze code regardless of the source language. Paired with a state-of-the-art data flow tracker based on code property graphs, we arrive at a powerful, distributed, cloud-native code analysis. This talk provides an introduction to the technology.
From Knowledge Graphs to AI-powered SEO: Using taxonomies, schemas and knowle... (Connected Data World)
Do you want to learn how to use the low-hanging fruit of knowledge graphs — schema.org and JSON-LD — to annotate content and improve your SEO with semantics and entities? This hands-on workshop with one of the leading Semantic SEO practitioners will help you get started.
May graph technology improve the deployment of humanitarian projects? The goal of using what we call “Graphs for good at Action Against Hunger” is to be more efficient and transparent, and this can have a crucial impact on people’s lives.
Are there common behavioural factors between different projects? Can elements of different resources or projects be related? For example, security incidents in a city could influence the way other projects run there.
The explained use case data comes from a project called Kit For Autonomous Cash Transfer in Humanitarian Emergencies (KACHE) whose goal is to deploy electronic cash transfers in emergency situations when no suitable infrastructure is available.
It also offers the opportunity to track transactions in order to better recognize crisis-affected population behaviours, understanding goods distribution network to improve recommendations, identifying the role of culture in transactional patterns, as well as most required items for every place.
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ... (James Anderson)
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with a passion for making things work, along with a knack for helping others understand how things work. He has around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations on CI/CD and application security integrated into the software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
GraphRAG is All You need? LLM & Knowledge Graph (Guy Korland)
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor... (SOFTTECHHUB)
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
Smart TV Buyer Insights Survey 2024 (91mobiles)
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
The new frontiers of AI in RPA with UiPath Autopilot™ (UiPathCommunity)
In this free online event, organised by the Italian UiPath Community, you can explore the new features of Autopilot, the tool that integrates Artificial Intelligence into the development and use of automations.
📕 Together we will look at some examples of using Autopilot across different tools in the UiPath Suite:
Autopilot for Studio Web
Autopilot for Studio
Autopilot for Apps
Clipboard AI
GenAI applied to Document Understanding
👨🏫👨💻 Speakers:
Stefano Negro, UiPath MVPx3, RPA Tech Lead @ BSP Consultant
Flavio Martinelli, UiPath MVP 2023, Technical Account Manager @UiPath
Andrei Tasca, RPA Solutions Team Lead @NTT Data
Transcript: Selling digital books in 2024: Insights from industry leaders - T... (BookNet Canada)
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
State of ICS and IoT Cyber Threat Landscape Report 2024 preview (Prayukth K V)
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio, using data from Sectrio's cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors and newer malware, including new variants and latent threats at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Welcome to the first live UiPath Community Day Dubai! Join us for this unique occasion to meet our local and global UiPath Community and leaders. You will get a full view of the MEA region's automation landscape and the AI Powered automation technology capabilities of UiPath. Also, hosted by our local partners Marc Ellis, you will enjoy a half-day packed with industry insights and automation peers networking.
📕 Curious on our agenda? Wait no more!
10:00 Welcome note - UiPath Community in Dubai
Lovely Sinha, UiPath Community Chapter Leader, UiPath MVPx3, Hyper-automation Consultant, First Abu Dhabi Bank
10:20 A UiPath cross-region MEA overview
Ashraf El Zarka, VP and Managing Director MEA, UiPath
10:35: Customer Success Journey
Deepthi Deepak, Head of Intelligent Automation CoE, First Abu Dhabi Bank
11:15 The UiPath approach to GenAI with our three principles: improve accuracy, supercharge productivity, and automate more
Boris Krumrey, Global VP, Automation Innovation, UiPath
12:15 Discover how Marc Ellis leverages tech-driven solutions in recruitment and managed services.
Brendan Lingam, Director of Sales and Business Development, Marc Ellis
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Securing your Kubernetes cluster: a step-by-step guide to success! (KatiaHIMEUR1)
Today, after several years of existence, with an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Essentials of Automations: The Art of Triggers and Actions in FME (Safe Software)
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
PHP Frameworks: I want to break free (IPC Berlin 2024) (Ralf Eggert)
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk aims to encourage a more independent use of PHP frameworks, moving towards more flexible and future-proof PHP development.
3. My talk today
What do they mean when they call it semantics?
Semantics in Graph DBs. Is this a thing?
Explicit semantics in Neo4j: A 5 min experiment
And now what?
4. Quick show of hands: What do you mean when you say semantics?
Fragment / Syntax / Semantic?

1) RDF/Turtle. Semantic? Y/N
:JohnSmith :livesIn :London
:London :cityIn :England

2) JSON. Semantic? Y/N
{ uri: "JohnSmith", livesIn: { uri: "London", cityIn: { uri: "England" } } }

3) Cypher. Semantic? Y/N
(j:Resource { uri:'JohnSmith' })-[:livesIn]->(l:Resource { uri:'London' })
(l)-[:cityIn]->(e:Resource { uri:'England' })

4) RDF/XML. Semantic? Y/N
<rdf:RDF>
  <rdf:Description rdf:about="Cljkfkojhg">
    <fjoijlgkih rdf:resource="lksdfjslkjghhhlkjsfd"/>
  </rdf:Description>
</rdf:RDF>

5) RDFS (as RDF/JSON-LD). Semantic? Y/N
{"@graph": [{
  "@id": "http://places.com/London",
  "v:cityIn": { "@id": "http://places.com/England" }
},{
  "@id": "v:cityIn",
  "rdfs:domain": { "@id": "v:Location" },
  "rdfs:range": { "@id": "v:Location" }
}]}

6) CSV. Semantic? Y/N
| who       | livesIn |   | what   | cityIn  |
| JohnSmith | London  |   | London | England |
5. Misconception… that some cheeky RDF vendors keep alive.
Two facts: John Smith lives in London, and London is a city in England.
Two triples:
:JohnSmith :livesIn :London
:London :cityIn :England
And magic will happen…
One question: Who lives in England?
The query:
SELECT ?who WHERE { ?who :livesIn :England }
huh? what’s going on ?!?
6. A little detail is missing…
If someone lives in a city and that city is in a country, then we can derive that this someone lives in that country. But someone has to EXPLICITLY state this for an “intelligent semantic DB” to apply it to the data:
?x :livesIn ?city ^ ?city :cityIn ?ctry => ?x :livesIn ?ctry
Making the semantics of your data explicit is what operates the ‘magic’.
Now we can try again: Who lives in England?
ANSWER: JohnSmith
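The derivation above can be sketched with plain Python tuples standing in for triples; the rule is the slide's own, while the helper name and data layout are illustrative:

```python
# Facts from the slides, as (subject, predicate, object) tuples.
facts = {
    (":JohnSmith", ":livesIn", ":London"),
    (":London", ":cityIn", ":England"),
}

def apply_lives_in_rule(triples):
    """One forward-chaining pass of:
    ?x :livesIn ?city ^ ?city :cityIn ?ctry => ?x :livesIn ?ctry"""
    derived = set()
    for (x, p1, city) in triples:
        if p1 != ":livesIn":
            continue
        for (c, p2, country) in triples:
            if c == city and p2 == ":cityIn":
                derived.add((x, ":livesIn", country))
    return derived

facts |= apply_lives_in_rule(facts)

# "Who lives in England?" now succeeds:
who = [s for (s, p, o) in facts if p == ":livesIn" and o == ":England"]
print(who)  # [':JohnSmith']
```

Without the explicit rule pass, the same question over the raw facts returns nothing, which is exactly the misconception the previous slide pokes at.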
7. more “magic”...
Now with a reasoning engine that understands RDFS semantics we can ask: What locations do we know?
:cityIn rdfs:domain :Location
:cityIn rdfs:range :Location
(:cityIn is a relationship stated between two locations)
ANSWER: London and England
I’m a purist (and have a PhD in description logics) so I’m not writing rules; I use a set of primitives with well-defined meaning, like InverseFunctional, Domain, Disjoint, Range, and a generic rules engine will apply them for me on my data.
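The generic RDFS domain/range behaviour described here can be sketched the same way, again with plain tuples and an illustrative function name (no triplestore assumed):

```python
# Facts plus the RDFS axioms from the slide, as plain triples.
triples = {
    (":JohnSmith", ":livesIn", ":London"),
    (":London", ":cityIn", ":England"),
    (":cityIn", "rdfs:domain", ":Location"),
    (":cityIn", "rdfs:range", ":Location"),
}

def infer_types(kb):
    """One pass of the generic RDFS entailment:
    ?prop rdfs:domain ?class ^ ?res ?prop ?any => ?res a ?class
    ?prop rdfs:range  ?class ^ ?any ?prop ?res => ?res a ?class"""
    inferred = set()
    for (prop, meta, cls) in kb:
        if meta == "rdfs:domain":
            inferred |= {(s, "a", cls) for (s, p, _) in kb if p == prop}
        elif meta == "rdfs:range":
            inferred |= {(o, "a", cls) for (_, p, o) in kb if p == prop}
    return inferred

triples |= infer_types(triples)

# "What locations do we know?"
locations = sorted(s for (s, p, o) in triples if p == "a" and o == ":Location")
print(locations)  # [':England', ':London']
```

Note that the engine itself knows nothing about cities or countries; it only applies domain/range, which is the "domain-independent processor" idea that slide 10 spells out.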
8. back to the origins: The Semantic Web
TBL in 1998: The Semantic Web is not AI
“[…] it does not imply some magical artificial intelligence which allows machines to comprehend human mumblings. It only indicates a machine's ability to solve a well-defined problem by performing well-defined operations on existing well-defined data. Instead of asking machines to understand people's language, it involves asking people to make the extra effort.” (1)
(1) https://www.w3.org/DesignIssues/RDFnot.html
9. An example with (a bit of) code
The data is the same in all three setups:
:JohnSmith :livesIn :London
:London :cityIn :England
…
Implicit semantics: the application runs
SELECT ?loc WHERE { ?loc a :Location }
Explicit semantics in the application: the application runs
SELECT ?loc WHERE {
  { ?loc a :Location }
  UNION { [] :livesIn ?loc }
  UNION { ?loc :cityIn [] }
  UNION { [] :cityIn ?loc }
}
Explicit semantics in the DB: the application runs the simple query,
and either a (FC/BC/H) rules engine applies a rule such as
?x :livesIn ?place => ?place a :Location
or an OWL/RDFS reasoner applies an ontology
:cityIn rdfs:domain :Location
:cityIn rdfs:range :Location
via the generic rule
?prop rdfs:domain ?class ^ ?res ?prop [] => ?res a ?class
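The three setups on this slide can be contrasted in a small plain-Python sketch: sets of tuples stand in for the triple store, and the SPARQL queries become comprehensions. The axiom table is illustrative (it also encodes the rules-engine rule for :livesIn as a range axiom).

```python
data = {
    (":JohnSmith", ":livesIn", ":London"),
    (":London", ":cityIn", ":England"),
}

# Setup 1: SELECT ?loc WHERE { ?loc a :Location } with implicit semantics
def simple_query(g):
    return sorted(s for (s, p, o) in g if p == "a" and o == ":Location")

# Setup 2: the four-way UNION, with the semantics hard-coded in the app
def union_query(g):
    hits = set(simple_query(g))
    hits |= {o for (s, p, o) in g if p == ":livesIn"}   # [] :livesIn ?loc
    hits |= {s for (s, p, o) in g if p == ":cityIn"}    # ?loc :cityIn []
    hits |= {o for (s, p, o) in g if p == ":cityIn"}    # [] :cityIn ?loc
    return sorted(hits)

# Setup 3: a reasoner materialises type triples from domain/range axioms
def rdfs_reasoner(g):
    axioms = {":livesIn": (None, ":Location"),          # range only
              ":cityIn": (":Location", ":Location")}    # domain + range
    inferred = set(g)
    for s, p, o in g:
        dom, rng = axioms.get(p, (None, None))
        if dom:
            inferred.add((s, "a", dom))
        if rng:
            inferred.add((o, "a", rng))
    return inferred

print(simple_query(data))                  # [] : no answer without semantics
print(union_query(data))                   # [':England', ':London']
print(simple_query(rdfs_reasoner(data)))   # [':England', ':London']
```

Setups 2 and 3 return the same answer; the difference is whether the application or the database carries the semantics.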
10. So a semantic DB is…
• A graph database (often based on the RDF model,
but…)
• Some explicit description of the data in the graph
(typically RDFS/OWL or rule-based)
• An (often rules-based) domain-independent
processor that applies the explicit semantics
11. Two consequences:
1. Is my DB still semantic if I just use RDF
but don’t make my semantics explicit?
No, but don’t worry. It’s a GRAPH!
2. So RDF is not the only way to build a
semantic DB?
You got it!
12. After all… RDF ≠ Triple store (!)
• RDF is a graph-based data/knowledge/… exchange model
• Using RDF as an exchange format does not necessarily imply using triple
storage: think of Linked Data
• Neo4j, for instance, can expose graph data as RDF, as do other stores and
middleware (DV, D2R)
Indeed, one of the main driving forces for the Semantic
web, has always been the expression, on the Web, of
the vast amount of relational database information in a
way that can be processed by machines (1).
(1) https://www.w3.org/DesignIssues/RDB-RDF.html
RDF is a standard model for data
interchange on the Web.
https://www.w3.org/TR/2004/REC-rdf-primer-20040210/
RDF is a directed, labeled graph data format for
representing information in the Web.
https://www.w3.org/TR/rdf-sparql-query/
13. So… Linked data, semantic data,
graph data…
Graph Data comes in two flavours: LPG Data and RDF Data.
The same fact in each model:
RDF: :abc :custId :def
LPG: ({id:’abc’})-[:custId]->({id:’def’})
[diagram: node :abc linked to node :def by a :custId edge, drawn in each model]
Explicit semantics, expressed in RDF:
:custId a owl:InverseFunctionalProperty
:custId rdfs:domain :Customer
:Customer rdfs:subClassOf :Person
…or stored as ordinary LPG data:
({id: ‘custId’})-[:domain]->({id: ‘Customer’})
({id: ‘Customer’})-[:subClassOf]->({id: ‘Person’})
14. Build a Semantic Graph DB in 5’
• Learn an Ontology from a data set (3’)
• Formalize the ontology -> make semantics explicit (1’)
• Use these semantics to drive your ‘intelligent’
application (1’)
15. The dataset
• 230K+ article summaries from the Financial Times
Demo...
Fields: articleDate, articleTitle, articleUrl, keywords, description, storySummary

Row 1:
articleDate: 2012-12-17
articleTitle: “Vintage performance”
articleUrl: http://…
keywords: “EU integration, economies, natural resources, energy policy, industry, entrepreneurs, investment, restaurants, filmmaking, central and eastern Europe”
description: “On a warm autumn afternoon in Tokaj, Laszlo Kalocsai waits patiently for his grapes to turn mouldy. On the edge of the Carpathian mountains, the best wines are only possible once the fog-borne fungus Botrytis cinerea has risen from the wetlands”
storySummary: “Thanks to investment and a focus on quality eastern European wines are now much in demand”

Row 2:
articleDate: 2013-01-09
articleTitle: “Google and the US economy”
articleUrl: http://…
keywords: “Lex”
description: “A nice easy question: is the US economy growing or shrinking? The majority view, to simplify slightly, is that increasing employment and rising house prices must amount to an expansion. Dissident pessimists have a more complicated story. The current”
storySummary: “Bullish investors should look at world’s biggest internet search group”
16. More on explicit semantics in
Non-RDF graph DBs
https://jesusbarrasa.wordpress.com