Natural Language Processing and Text Mining with Knowledge Graphs - Vaticle
Text is the medium used to store the tremendous wealth of scientific knowledge regarding the world we live in. However, with its ever-increasing magnitude and throughput, analysing this unstructured data has become a tedious task. This has led to the rise of Natural Language Processing (NLP) as the go-to discipline for examining and processing large amounts of natural language data.
This involves the automatic extraction of structured semantic information from unstructured machine-readable text. The identification of these explicit concepts and relationships helps in discovering the many insights contained in text in a scalable and effective way.
A major challenge is the mapping of unstructured information from raw texts into entities, relationships and attributes in the knowledge graph. In this talk, we demonstrate how Grakn can be used to create a text mining knowledge graph capable of modelling, storing, and exploring beneficial information extracted from medical literature.
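As a rough illustration of the mapping described above, the sketch below uses a toy rule-based extractor to pull (subject, relation, object) triples out of raw text and accumulate them into a graph structure. The pattern and example sentences are purely illustrative and are not Grakn's actual pipeline.

```python
import re

# Toy rule-based extractor: maps sentences of the form
# "<X> treats|causes|inhibits <Y>" to (subject, relation, object) triples.
PATTERN = re.compile(r"(\w+) (treats|causes|inhibits) (\w+)")

def extract_triples(text):
    """Return (subject, relation, object) triples found in raw text."""
    return [m.groups() for m in PATTERN.finditer(text)]

def to_graph(triples):
    """Accumulate triples into a simple adjacency structure."""
    graph = {}
    for subj, rel, obj in triples:
        graph.setdefault(subj, []).append((rel, obj))
    return graph

text = "Aspirin treats headache. Smoking causes cancer."
print(to_graph(extract_triples(text)))
```

In a real system the regex would be replaced by an NLP extraction model, and `to_graph` by inserts into the knowledge graph's entities and relationships.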
#### Syed Irtaza Raza, Software and Biomedical Engineer @ Grakn Labs
Syed is a Software and Biomedical Engineer at Grakn, primarily working on showing the world how to use a knowledge graph such as Grakn to build cognitive/intelligent systems in the biomedical domain. To achieve this, he implements innovative examples that serve as templates and ideas that clients and community members can apply in their own projects, in any field.
With a background in Electrical, Software and Biomedical Engineering, Syed’s mission is to discover and implement intelligent biomedical tools that are only possible with Grakn as a knowledge graph.
This is a clip from the Grakn London Meetup at the Royal Academy of Engineering (March 2019). Join the community: www.grakn.ai/community
Knowledge graph convolutional networks - Berlin 2019 - Vaticle
As humans we use our knowledge, our reasoning and our understanding of situational context to make accurate predictions about the world around us; machine learning doesn’t typically make use of any of this rich information.
The ability to leverage highly interrelated data will yield a step-change in the quality and complexity of predictions that can be made for the same volume of data.
We present Knowledge Graph Convolutional Networks: a method for performing machine learning over a Grakn Knowledge Graph, which captures micro-context and macro-context for any Concept within the graph.
This methodology demonstrates how we can practically combine knowledge, learning and reasoning to build systems that start to look truly intelligent.
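A minimal sketch of the idea of capturing micro-context, assuming a toy graph with scalar node features: one aggregation step mixes each node's feature with its neighbours', and stacking steps widens the receptive field toward macro-context. This illustrates the general principle only, not the KGCN implementation.

```python
def aggregate_step(features, edges):
    """One 'graph convolution' step: each node's new feature is the
    mean of its own feature and its neighbours' features.
    features: {node: float}; edges: {node: [neighbour, ...]}."""
    new_features = {}
    for node, value in features.items():
        neighbourhood = [value] + [features[n] for n in edges.get(node, [])]
        new_features[node] = sum(neighbourhood) / len(neighbourhood)
    return new_features

features = {"a": 1.0, "b": 0.0, "c": 0.0}
edges = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
# After one step, node "a" mixes in its neighbours: (1.0 + 0.0 + 0.0) / 3
print(aggregate_step(features, edges))
```

A learned model would replace the plain mean with trainable weights, but the neighbourhood-mixing structure is the same.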
Associated blog post:
https://blog.grakn.ai/kgcns-machine-learning-over-knowledge-graphs-with-tensorflow-a1d3328b8f02
Associated video:
https://www.youtube.com/watch?v=3adsYypRDsQ
This is a clip from the Grakn Berlin Meetup (Berlin 2019). Join the community: grakn.ai/community
Cognitive/AI systems process knowledge that is far too complex for current databases. They require an expressive data model and an intelligent query language to perform knowledge engineering over complex datasets. In this Meetup event, we will introduce GRAKN.AI, a distributed hyper-relational database for knowledge engineering, to Amsterdam's engineering community.
Grakn provides the knowledge base foundation for intelligent systems to manage complex data. We will also introduce Graql: Grakn's reasoning (through OLTP) and analytics (through OLAP) query language. Graql provides the tools required to do knowledge engineering: an expressive schema for knowledge modelling, reasoning transactions for real-time inference, distributed algorithms for large-scale analytics, and optimisation of query execution. And finally, we will discuss how Graql serves as a unified representation of data for cognitive systems.
Knowledge graph convolutional networks - London 2018 - Vaticle
As humans we use our knowledge, our reasoning and our understanding of situational context to make accurate predictions about the world around us; machine learning doesn’t typically make use of any of this rich information.
The ability to leverage highly interrelated data will yield a step-change in the quality and complexity of predictions that can be made for the same volume of data.
We present Knowledge Graph Convolutional Networks: a method for performing machine learning over a Grakn Knowledge Graph, which captures micro-context and macro-context for any Concept within the graph.
This methodology demonstrates how we can practically combine knowledge, learning and reasoning to build systems that start to look truly intelligent.
Associated blog post:
https://blog.grakn.ai/knowledge-graph-convolutional-networks-machine-learning-over-reasoned-knowledge-9eb5ce5e0f68
Associated video:
https://youtu.be/Jx_Twc75ka0
This is a clip from the Grakn London Meetup at the Royal Academy of Engineering (November 2018). Join the community: grakn.ai/community
Deutsche Telekom Expert System - Router Troubleshooting - Vaticle
This presentation was given at a Grakn meetup held at Deutsche Telekom's hubraum in Berlin on 29 September 2018. It details one example of how to construct an expert system at the database level.
Using Grakn to Analyse Protein Sequence Alignment - Vaticle
Cognitive and AI applications consume data that is far too complex for current databases. These systems require an expressive data model and an intelligent query language to perform knowledge engineering over complex datasets. GRAKN.AI is a database to organise such complex networks of data.
Systems biology is one of the domains that produce huge amounts of data, presenting integration challenges due to the data's complex nature. As understanding the complex relationships among these biological data is one of the key goals in biology, solutions are needed that speed up the integration and querying of such data.
However, analysing large volumes of this biological data through traditional database systems is troublesome and challenging. In this talk, we will demonstrate how integrating a sequencing algorithm with a Grakn knowledge graph leads to valuable new insights into our data at scale.
Cognitive and AI systems process knowledge that is far too complex for current databases. They require an expressive data model and an intelligent query language to perform knowledge engineering over complex datasets.
Grakn provides the knowledge base foundation for intelligent systems to manage complex data. We will also introduce Graql: Grakn's reasoning (through OLTP) and analytics (through OLAP) query language. Graql provides the tools required to do knowledge engineering: an expressive schema for knowledge modelling, reasoning transactions for real-time inference, distributed algorithms for large-scale analytics, and optimisation of query execution. And finally, we will discuss how Graql serves as a unified representation of data for cognitive systems.
Precision Medicine Knowledge Graph with GRAKN.AI - Vaticle
The success or failure of any modern organisation relies greatly on how it leverages its data. However, most institutions and organisations have no way to aggregate their disparate data catalogues, given their magnitude and complexity. They require a unified representation of their data, one that represents their specific domain truthfully as well as conceptually. In this talk, we introduce how a knowledge graph addresses these problems in the field of Precision Medicine.
Precision medicine aims at establishing personalised context-centred therapies and diagnostics. This is done by integrating complex and disparate data repositories relating to environmental and molecular origins of diseases.
It has become increasingly difficult to design models for complex diseases that accommodate individual genetic variability. We need efficient and successful techniques to integrate, manage, maintain and visualise sizeable datasets. These datasets can come from a multitude of sources, in various formats and at various levels of confidentiality. Hence the need to accumulate all this knowledge in one single structured architecture - a knowledge graph.
In this talk, we present a strategy, motivated by translational bioinformatics, to demonstrate how to fulfil the promises of Precision Medicine using Grakn.
This is a clip from the Grakn London Meetup in February 2019. Join the community: www.grakn.ai/community
GRAKN.AI: The Hyper-Relational Database for Knowledge-Oriented Systems - Vaticle
AI systems process knowledge that is far too complex for current databases. They require more expressive data schemas and intelligent query languages to provide a strong abstraction over complex data and their relationships. In this talk, we will discuss how GRAKN.AI, a distributed hyper-relational database, enables knowledge-oriented systems to work with complex data that serves as a knowledge base.
We will discuss how Graql, Grakn's reasoning (through OLTP) and analytics (through OLAP) query language, provides a much higher-level abstraction over traditional query languages. And finally, we will review the challenges of data management when developing Cognitive and AI systems, and how we solve them using Grakn and Graql as the database and query language.
Grakn is a hyper-relational knowledge base designed for AI applications. Many of these applications involve complex computation on big data.
In this talk, we first explore two big data processing models: map-reduce and Pregel. Then we introduce how we make use of these models to build Grakn Analytics, our tool for big data processing. We will also discuss how we transform common algorithms into their massively parallel versions, so they can take full advantage of Grakn Analytics.
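For a feel of the map-reduce model mentioned above, here is a minimal word-count sketch in plain Python; it illustrates the processing model only, not Grakn Analytics itself.

```python
from itertools import groupby

def map_phase(docs):
    """Map: emit (word, 1) for every word in every document."""
    return [(word, 1) for doc in docs for word in doc.split()]

def reduce_phase(pairs):
    """Shuffle/sort by key, then reduce: sum the counts per word."""
    pairs.sort(key=lambda kv: kv[0])
    return {key: sum(v for _, v in group)
            for key, group in groupby(pairs, key=lambda kv: kv[0])}

counts = reduce_phase(map_phase(["to be or not to be"]))
print(counts)  # {'be': 2, 'not': 1, 'or': 1, 'to': 2}
```

In a distributed setting the map and reduce phases run in parallel across partitions of the data, which is what makes the model scale.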
Logical Inference in a Hyper-Relational Database - Vaticle
Inference is something we humans do all the time. Given a set of facts about the world, we derive new ones using some form of inference. Automated reasoning has been studied extensively but its value in providing a more powerful abstraction layer for database languages has been overlooked so far.
This talk explores deductive inference in Grakn, a hyper-relational database that has automated inference as one of its core features. Rather than defining SQL views or writing ad hoc code, in Grakn we can define logical rules that provide a more intuitive way to describe higher level domain concepts. In the talk we give a quick overview of computational logic semantics and of top-down and bottom-up inference algorithms. Then, after introducing some preliminary Grakn concepts, we show how logical rules are resolved in a query.
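A minimal sketch of the bottom-up (forward-chaining) style of evaluation mentioned above: rules are applied to a set of facts until no new facts are derived (a fixpoint). The rule and facts are illustrative, and this is not how Grakn's resolver is implemented.

```python
def forward_chain(facts, rules):
    """Bottom-up evaluation: apply rules until a fixpoint is reached.
    facts: set of tuples; rules: functions mapping facts -> new facts."""
    facts = set(facts)
    while True:
        derived = {f for rule in rules for f in rule(facts)} - facts
        if not derived:
            return facts  # fixpoint: nothing new can be derived
        facts |= derived

def ancestor_rule(facts):
    """Toy rule: ancestor(a, c) if parent/ancestor(a, b) and (b, c)."""
    pairs = {(a, b) for (rel, a, b) in facts if rel in ("parent", "ancestor")}
    return {("ancestor", a, c)
            for (a, b) in pairs for (b2, c) in pairs if b == b2}

facts = {("parent", "ann", "bob"), ("parent", "bob", "cal")}
print(forward_chain(facts, [ancestor_rule]))
```

In Grakn the equivalent rule would be declared once in the schema, and the reasoner decides when and how to apply it during querying.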
Introduction to Knowledge Graphs with Grakn and Graql - Vaticle
Cognitive/AI systems process knowledge that is far too complex for current databases. They require an expressive data model and an intelligent query language to perform knowledge engineering over complex datasets.
In this talk, we will discuss how Grakn, a database to organise complex networks of data and make it queryable, provides the knowledge graph foundation for intelligent systems to manage complex data.
We will discuss how Graql, Grakn's reasoning (through OLTP) and analytics (through OLAP) query language, provides the tools required to do the job: a knowledge schema, a logical inference language, a distributed analytics framework.
And finally, we will discuss how Graql serves as a unified representation of data for cognitive systems.
Integrating Knowledge Bases with Neural Networks - by Nick Powell:
Knowledge bases are used as the underpinning for reasoning systems. This talk will describe experiences using deep learning to facilitate knowledge base completion. With an existing knowledge base as a training set, we programmed a neural net as a binary classifier to find likely relationships and then insert them back into the graph. We'll describe lessons learned and next steps.
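A toy sketch of knowledge-base completion as binary classification, in the spirit of the talk: candidate triples are scored, and accepted links are inserted back into the graph. The embeddings and threshold below are illustrative stand-ins for a trained neural net.

```python
# Illustrative entity embeddings (a trained model would learn these).
EMBED = {"london": [1.0, 0.0], "uk": [0.9, 0.1], "tokyo": [0.0, 1.0]}

def score(head, tail):
    """Dot product of entity embeddings as a toy link score."""
    return sum(h * t for h, t in zip(EMBED[head], EMBED[tail]))

def complete(kb, candidates, threshold=0.5):
    """Accept candidate triples the 'classifier' scores highly and
    insert them back into the knowledge base."""
    accepted = [(h, r, t) for (h, r, t) in candidates
                if score(h, t) > threshold]
    return kb | set(accepted)

kb = {("london", "located-in", "uk")}
candidates = [("tokyo", "located-in", "uk"), ("london", "near", "uk")]
print(complete(kb, candidates))
```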
Artificial intelligence in real-world applications needs the notion of an open-world assumption: such systems need to be able to work in unknown situations. However, most current image processing applications cannot handle unknown situations and objects. Unknown objects are classified as background, and systems are only able to classify images into pretrained and predefined object classes.
Using Grakn's KGLIB package, we designed and trained a graph network for object classification that is able to handle unknown objects. Data-driven insights based on image properties are combined with expert knowledge about class hierarchies to classify images into multiple categories. We tested our network on a dataset of vehicles and predicted higher-level categories (for example, 'land', 'air' or 'sea' vehicle). The graph network is used to predict interesting object characteristics, which require abstract knowledge predefined in a Grakn knowledge graph.
During this talk we will present our approach and discuss the design process we followed. We will cover not only the results, but also the difficulties we encountered and the lessons we learned.
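A minimal sketch of one ingredient described above: rolling fine-grained predictions up an expert class hierarchy while leaving room for unknowns, rather than forcing every object into a pretrained class. The hierarchy is illustrative, not the talk's actual model.

```python
# Expert knowledge: fine-grained vehicle classes roll up to
# higher-level categories (illustrative hierarchy).
HIERARCHY = {"car": "land", "truck": "land", "boat": "sea", "plane": "air"}

def categorise(predicted_label):
    """Map a classifier's fine-grained label to a higher-level
    category; labels outside the hierarchy stay 'unknown' instead of
    being forced into a known class."""
    return HIERARCHY.get(predicted_label, "unknown")

print(categorise("truck"))      # land
print(categorise("submarine"))  # unknown
```

In the approach described, this hierarchy lives in the Grakn knowledge graph and the graph network learns over it jointly with the image features.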
Does Google still need links? - SearchLove San Diego 2017 - Tom Capper
Back in Google's early days, people navigated the web using links, and this made PageRank an excellent proxy for popularity and authority. The web is moving away from primarily link-based surfing, and Google no longer needs a proxy - so what, in 2017, is the point in links?
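Since PageRank anchors the argument, here is a toy power-iteration sketch of it; the graph and damping factor are illustrative.

```python
def pagerank(links, damping=0.85, iters=50):
    """Toy PageRank: each page's rank is a base share plus a damped
    sum of the ranks flowing in from pages that link to it.
    links: {page: [pages it links to]} (every page must have outlinks)."""
    nodes = list(links)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        new = {}
        for n in nodes:
            inbound = sum(rank[m] / len(links[m])
                          for m in nodes if n in links[m])
            new[n] = (1 - damping) / len(nodes) + damping * inbound
        rank = new
    return rank

# A symmetric 3-page cycle: each page converges to ~1/3.
print(pagerank({"a": ["b"], "b": ["c"], "c": ["a"]}))
```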
SearchLove London 2018 - Tom Capper - The two-tiered SERP: Ranking for the mo... - Distilled
Like it or loathe it, as SEOs we often find ourselves being asked to explain rankings, especially for highly visible head terms - but I’ve noticed in the last few years that for these most competitive terms, the normal rules don’t always apply. In this talk, I’ll dig into whether and how Google is going beyond our normal understanding of ranking factors, and how we need to react.
Building a semantic search system - one that can correctly parse and interpret end-user intent and return the ideal results for users’ queries - is not an easy task. It requires semantically parsing the terms, phrases, and structure within queries, disambiguating polysemous terms, correcting misspellings, expanding to conceptually synonymous or related concepts, and rewriting queries in a way that maps the correct interpretation of each end user’s query into the ideal representation of features and weights that will return the best results for that user. Not only that, but the above must often be done within the confines of a very specific domain - rife with its own jargon and linguistic and conceptual nuances.
This talk will walk through the anatomy of a semantic search system and how each of the pieces described above fit together to deliver a final solution. We'll leverage several recently-released capabilities in Apache Solr (the Semantic Knowledge Graph, Solr Text Tagger, Statistical Phrase Identifier) and Lucidworks Fusion (query log mining, misspelling job, word2vec job, query pipelines, relevancy experiment backtesting) to show you an end-to-end working Semantic Search system that can automatically learn the nuances of any domain and deliver a substantially more relevant search experience.
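As a rough illustration of two of the pipeline stages described (misspelling correction and synonym expansion), here is a toy query-rewriting sketch; the dictionaries stand in for the learned jobs and knowledge graph mentioned above, and are not Solr or Fusion APIs.

```python
# Illustrative stand-ins for a learned misspelling model and a
# domain synonym/concept source.
CORRECTIONS = {"iphnoe": "iphone"}
SYNONYMS = {"iphone": ["smartphone", "mobile phone"]}

def rewrite(query):
    """Correct misspellings, then expand each term with its
    conceptually related terms."""
    terms = [CORRECTIONS.get(t, t) for t in query.lower().split()]
    expanded = []
    for t in terms:
        expanded.append(t)
        expanded.extend(SYNONYMS.get(t, []))
    return expanded

print(rewrite("iphnoe case"))
# ['iphone', 'smartphone', 'mobile phone', 'case']
```

A production pipeline would also weight the expansions rather than treating them as equal to the original terms.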
Slides for the sixth meeting of the course 'Big Data and Automated Content Analysis' at the Department of Communication Science, University of Amsterdam
Crowdsourced query augmentation through the semantic discovery of domain spec... - Trey Grainger
Talk Abstract: Most work in semantic search has thus far focused upon either manually building language-specific taxonomies/ontologies or upon automatic techniques such as clustering or dimensionality reduction to discover latent semantic links within the content that is being searched. The former is very labor intensive and is hard to maintain, while the latter is prone to noise and may be hard for a human to understand or to interact with directly. We believe that the links between similar users’ queries represent a largely untapped source for discovering latent semantic relationships between search terms. The proposed system is capable of mining user search logs to discover semantic relationships between key phrases in a manner that is language agnostic, human understandable, and virtually noise-free.
K-anonymity for crowdsourcing databases
In a crowdsourcing database, human operators are embedded into the database engine and collaborate with other conventional database operators to process queries. Each human operator publishes small HITs (Human Intelligence Tasks) to the crowdsourcing platform; each HIT consists of a set of database records and corresponding questions for human workers.
Anonymization techniques are used to ensure the privacy preservation of the data owners, especially for personal and sensitive data. While in most cases, data reside inside the database management system; most of the proposed anonymization techniques operate on and anonymize isolated datasets stored outside the DBMS. Hence, most of the desired functionalities of the DBMS are lost, e.g., consistency, recoverability, and efficient querying. In this paper, we address the challenges involved in enforcing the data privacy inside the DBMS. We implement the k-anonymity algorithm as a relational operator that interacts with other query operators to apply the privacy requirements while querying the data. We study anonymizing a single table, multiple tables, and complex queries that involve multiple predicates. We propose several algorithms to implement the anonymization operator that allow efficient non-blocking and pipelined execution of the query plan. We introduce the concept of k-anonymity view as an abstraction to treat k-anonymity (possibly, with multiple k preferences) as a relational view over the base table(s). For non-static datasets, we introduce the materialized k-anonymity views to ensure preserving the privacy under incremental updates. A prototype system is realized based on PostgreSQL with extended SQL and new relational operators to support anonymity views. The prototype system demonstrates how anonymity views integrate with other privacy-preserving components, e.g., limited retention, limited disclosure, and privacy policy management. Our experiments, on both synthetic and real datasets, illustrate the performance gain from the anonymity views as well as the proposed query optimization techniques under various scenarios.
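A minimal sketch of the k-anonymity property the paper's operator enforces: a table is k-anonymous when every combination of quasi-identifier values occurs at least k times, so no individual can be singled out by those columns. Records and column names below are illustrative.

```python
from collections import Counter

def is_k_anonymous(rows, quasi_ids, k):
    """True if every quasi-identifier combination appears >= k times.
    rows: list of dicts; quasi_ids: column names treated as
    quasi-identifiers (e.g. generalized zip code and age range)."""
    groups = Counter(tuple(row[c] for c in quasi_ids) for row in rows)
    return all(count >= k for count in groups.values())

rows = [
    {"zip": "120**", "age": "20-29", "disease": "flu"},
    {"zip": "120**", "age": "20-29", "disease": "cold"},
    {"zip": "130**", "age": "30-39", "disease": "flu"},
]
print(is_k_anonymous(rows, ["zip", "age"], k=2))  # False: one group of size 1
```

The paper's contribution is doing this check-and-generalize work as a relational operator inside the query plan rather than on an exported dataset, but the property being enforced is the one above.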
Benchmarking for Neural Information Retrieval: MS MARCO, TREC, and Beyond - Bhaskar Mitra
The emergence of deep learning-based methods for information retrieval (IR) poses several challenges and opportunities for benchmarking. Some of these are new, while others have evolved from existing challenges in IR exacerbated by the scale at which deep learning models operate. In this talk, I will present a brief overview of what we have learned from our work on MS MARCO and the TREC Deep Learning track, and reflect on the road ahead.
Slides for the second meeting of the course 'Big Data and Automated Content Analysis' at the Department of Communication Science, University of Amsterdam
When to use Machine Learning Models in SEO and Which ones to use - Lazarina S... - LazarinaStoyanova
This talk is a walk-through of different ways you can incorporate machine learning into SEO tasks. It covers a speed-run of different task categories/aspects of SEO work, the models that you can use for each purpose, and the results of a comparative analysis of how they perform.
Listeners will leave with (1) an understanding of the different ML models and where to incorporate them in their day-to-day SEO work, (2) why and how to choose one solution over another, and (3) how to get started with the recommended ones (videos/templates/walk-throughs will be shared).
This session focuses on:
The process of incorporating ML models and the aspects of SEO work where they can be incorporated (e.g. image captioning, generative work on content or meta elements, content localisation, etc.)
A summary of comparative analysis work I’ve done comparing the performance of different models on specific SEO tasks, with recommendations on which one to use for which task and why
Summary of steps and costs, plus templates/code to use
Who is this talk for?
Any SEO (agency), SEO manager (In-house), or site owner
Team Leads, looking to upskill their teams and processes to rely a bit more on automation
People interested in automation and ML/AI, and interested in going beyond ChatGPT
Those interested in saving some time in their day-to-day tasks via automation
GRAKN.AI: The Hyper-Relational Database for Knowledge-Oriented SystemsVaticle
AI systems process knowledge that is far too complex for current databases. They require more expressive data schemas and intelligent query languages to provide a strong abstraction over complex data and their relationships. In this talk, we will discuss how GRAKN.AI, a distributed hyper-relational database, enables knowledge-oriented systems to work with complex data that serves as a knowledge base.
We will discuss how Graql, Grakn's reasoning (through OLTP) and analytics (through OLAP) query language, provides a much higher-level abstraction over traditional query languages. And finally, we will review the challenges of data management when developing Cognitive and AI systems, and how we solve them using Grakn and Graql as the database and query language.
Grakn is a hyper-relational knowledge base designed for AI applications. Many of these applications involve complex computation on big data.
In this talk, we first explore two big data processing models: map-reduce and Pregel. Then we introduce how we make use of these modes to build Grakn Analytics our powerful tool for big data processing. We will also discuss how we transform common algorithms to their massive parallel versions, so they can take full advantage of Grakn Analytics.
Logical Inference in a Hyper-Relational DatabaseVaticle
Inference is something we humans do all the time. Given a set of facts about the world, we derive new ones using some form of inference. Automated reasoning has been studied extensively but its value in providing a more powerful abstraction layer for database languages has been overlooked so far.
This talk explores deductive inference in Grakn, a hyper-relational database that has automated inference as one of its core features. Rather than defining SQL views or writing ad hoc code, in Grakn we can define logical rules that provide a more intuitive way to describe higher level domain concepts. In the talk we give a quick overview of computational logic semantics and of top-down and bottom-up inference algorithms. Then, after introducing some preliminary Grakn concepts, we show how logical rules are resolved in a query.
Introduction to Knowledge Graphs with Grakn and Graql - Vaticle
Cognitive/AI systems process knowledge that is far too complex for current databases. They require an expressive data model and an intelligent query language to perform knowledge engineering over complex datasets.
In this talk, we will discuss how Grakn, a database to organise complex networks of data and make it queryable, provides the knowledge graph foundation for intelligent systems to manage complex data.
We will discuss how Graql, Grakn's reasoning (through OLTP) and analytics (through OLAP) query language, provides the tools required to do the job: a knowledge schema, a logical inference language, a distributed analytics framework.
And finally, we will discuss how Graql serves as a unified representation of data for cognitive systems.
Integrating Knowledge Bases with Neural Networks - by Nick Powell:
Knowledge bases are used as the underpinning for reasoning systems. This talk will describe experiences using deep learning to facilitate knowledge base completion. With an existing knowledge base as a training set, we programmed the neural net as a binary classifier to find likely relationships and then insert them back into the graph. We'll describe lessons learned and next steps.
Artificial Intelligence in real-world applications needs the notion of an open-world assumption: systems need to be able to work in unknown situations. However, most current image processing applications cannot handle unknown situations and objects. Unknown objects are classified as background, and systems are only able to classify images into pretrained, predefined object classes.
Using the KGLIB package of Grakn, we designed and trained a graph network for object classification that is able to handle unknown objects. Data-driven insights based on image properties are combined with expert knowledge about class hierarchies to classify images into multiple categories. We tested our network on a dataset of vehicles and predicted higher-level categories (for example, 'land', 'air', or 'sea' vehicle). The graph network is used to predict interesting object characteristics that require abstract knowledge predefined in a Grakn knowledge graph.
During this talk we will present our approach and discuss our design process. We will cover not only the results, but also the difficulties we encountered and what we learned along the way.
Does Google still need links? - SearchLove San Diego 2017 - Tom Capper
Back in Google's early days, people navigated the web using links, and this made PageRank an excellent proxy for popularity and authority. The web is moving away from primarily link-based surfing, and Google no longer needs a proxy - so what, in 2017, is the point of links?
SearchLove London 2018 - Tom Capper - The two-tiered SERP: Ranking for the mo... - Distilled
Like it or loathe it, as SEOs we often find ourselves being asked to explain rankings, especially for highly visible head terms - but I’ve noticed in the last few years that for these most competitive terms, the normal rules don’t always apply. In this talk, I’ll dig into whether and how Google is going beyond our normal understanding of ranking factors, and how we need to react.
Building a semantic search system - one that can correctly parse and interpret end-user intent and return the ideal results for users’ queries - is not an easy task. It requires semantically parsing the terms, phrases, and structure within queries, disambiguating polysemous terms, correcting misspellings, expanding to conceptually synonymous or related concepts, and rewriting queries in a way that maps the correct interpretation of each end user’s query into the ideal representation of features and weights that will return the best results for that user. Not only that, but the above must often be done within the confines of a very specific domain - ripe with its own jargon and linguistic and conceptual nuances.
This talk will walk through the anatomy of a semantic search system and how each of the pieces described above fit together to deliver a final solution. We'll leverage several recently-released capabilities in Apache Solr (the Semantic Knowledge Graph, Solr Text Tagger, Statistical Phrase Identifier) and Lucidworks Fusion (query log mining, misspelling job, word2vec job, query pipelines, relevancy experiment backtesting) to show you an end-to-end working Semantic Search system that can automatically learn the nuances of any domain and deliver a substantially more relevant search experience.
Slides for the sixth meeting of the course 'Big Data and Automated Content Analysis' at the Department of Communication Science, University of Amsterdam
Crowdsourced query augmentation through the semantic discovery of domain spec... - Trey Grainger
Talk Abstract: Most work in semantic search has thus far focused upon either manually building language-specific taxonomies/ontologies or upon automatic techniques such as clustering or dimensionality reduction to discover latent semantic links within the content that is being searched. The former is very labor intensive and is hard to maintain, while the latter is prone to noise and may be hard for a human to understand or to interact with directly. We believe that the links between similar users' queries represent a largely untapped source for discovering latent semantic relationships between search terms. The proposed system is capable of mining user search logs to discover semantic relationships between key phrases in a manner that is language agnostic, human understandable, and virtually noise-free.
K-anonymity for crowdsourcing databases
In a crowdsourcing database, human operators are embedded into the database engine and collaborate with other conventional database operators to process the queries. Each human operator publishes small HITs (Human Intelligence Tasks) to the crowdsourcing platform, each consisting of a set of database records and corresponding questions for human workers.
Anonymization techniques are used to ensure the privacy preservation of the data owners, especially for personal and sensitive data. While in most cases, data reside inside the database management system; most of the proposed anonymization techniques operate on and anonymize isolated datasets stored outside the DBMS. Hence, most of the desired functionalities of the DBMS are lost, e.g., consistency, recoverability, and efficient querying.
In this paper, we address the challenges involved in enforcing the data privacy inside the DBMS. We implement the k-anonymity algorithm as a relational operator that interacts with other query operators to apply the privacy requirements while querying the data. We study anonymizing a single table, multiple tables, and complex queries that involve multiple predicates. We propose several algorithms to implement the anonymization operator that allow efficient non-blocking and pipelined execution of the query plan.
We introduce the concept of k-anonymity view as an abstraction to treat k-anonymity (possibly, with multiple k preferences) as a relational view over the base table(s). For non-static datasets, we introduce the materialized k-anonymity views to ensure preserving the privacy under incremental updates.
A prototype system is realized based on PostgreSQL with extended SQL and new relational operators to support anonymity views. The prototype system demonstrates how anonymity views integrate with other privacy-preserving components, e.g., limited retention, limited disclosure, and privacy policy management. Our experiments, on both synthetic and real datasets, illustrate the performance gain from the anonymity views as well as the proposed query optimization techniques under various scenarios.
Benchmarking for Neural Information Retrieval: MS MARCO, TREC, and Beyond - Bhaskar Mitra
The emergence of deep learning-based methods for information retrieval (IR) poses several challenges and opportunities for benchmarking. Some of these are new, while others have evolved from existing challenges in IR exacerbated by the scale at which deep learning models operate. In this talk, I will present a brief overview of what we have learned from our work on MS MARCO and the TREC Deep Learning track, and reflect on the road ahead.
Slides for the second meeting of the course 'Big Data and Automated Content Analysis' at the Department of Communication Science, University of Amsterdam
When to use Machine Learning Models in SEO and Which ones to use - Lazarina S... - LazarinaStoyanova
This talk is a walk-through of different ways you can incorporate machine learning in SEO tasks. It will involve a speed-run of different task categories/ aspects of SEO work, and the models that you can use for this purpose, and the results of a comparative analysis of how they perform.
Listeners will leave with (1) an understanding of the different ML models and where to incorporate them in their day-to-day SEO work, (2) why and how to choose one solution over another, and (3) how to get started with the recommended ones (I will be sharing videos/templates/walk-throughs).
This session focuses on:
The process of incorporating ML models and the aspects of SEO work when they can be incorporated (e.g. image captioning, generative work in content or meta elements, content localization, etc)
A summary of comparative analysis work I’ve done where I’m comparing the performance of different models for specific tasks in SEO, and providing a recommendation of which one to use for what task and why
Summary of steps and costs, plus templates/code to use
Who is this talk for?
Any SEO (agency), SEO manager (In-house), or site owner
Team Leads, looking to upskill their teams and processes to rely a bit more on automation
People interested in automation and ML/AI, and interested in going beyond ChatGPT
Those interested in saving some time in their day-to-day tasks via automation
GPT and other Text Transformers: Black Swans and Stochastic Parrots - Konstantin Savenkov
Over the last year, we have seen increasingly more performant Text Transformer models, such as GPT-3 from OpenAI, Turing from Microsoft, and T5 from Google. They are capable of transforming text in very creative and unexpected ways, like generating a summary of an article, explaining complex concepts in simple language, or synthesizing realistic datasets for AI training. Unlike more traditional Machine Learning models, they do not require vast training datasets and can start from just a few examples.
In this talk, we will make a short overview of such models, share the first experimental results and ask questions about the future of the content creation process. Are those models ready for prime time? What will happen to the professional content creators? Will they be able to compete against such powerful models? Will we see GPT post-editing similar to MT post-editing? We will share some answers we have based on the extensive experimenting and the first production projects that employ this new technology.
How the Web can change social science research (including yours) - Frank van Harmelen
A presentation for a group of PhD students from the Leibniz Institutes (section B, social sciences) to discuss how they could use the Web, and even better the Web of Data, as an instrument in their research.
Google is using Large Language Models and Machine Learning in the algorithms that rank your sites and show them to users.
This talk will help you better understand these systems - from BERT to RankBrain to Neural Matching and SGE - how they work, and what you should do about it.
Social media posts on platforms such as Twitter or Instagram use hashtags, which are author-created labels representing topics or themes, to assist in the categorization of posts and searches for posts of interest. The structural analysis of hashtags is necessary as a precursor to understanding their meanings. This paper describes our work on segmenting non-delimited strings of hashtag-type English text. We adapt and extend methods used mostly in non-Eng...
To Make Your Chatbot Smart, You Need to Feed It Right: How to Write for Chatb... - LavaConConference
Chatbots are becoming an increasingly popular delivery channel for many types of content, including customer support, marketing, and pre-sales. To make chatbots scalable, helpful, and smoothly integrated into the content ecosystem of your organization, you need to feed the chatbots with the right content prepared in the right way.
In this workshop, you’ll learn how to write content for chatbots in a way that lets the chatbots find the relevant content and precisely match it to the user’s request.
In this session, attendees will learn:
How chatbots work: approaches to recognizing user’s intent, handling incomplete requests, and finding matching content
How the content needs to be organized, structured, and written to be discoverable by the chatbot
How to avoid creating isolated, chatbot-specific content that is available only to the chatbot
What makes content undiscoverable by the chatbot and causes the chatbot to give wrong or irrelevant answers
How to handle situations when the chatbot is unsure which content should be provided to the user
How to handle content variations (for example, product- or audience-specific)
Presented Saturday - June 23rd, 2018
I presented a 45-minute version of my "TypeScript 101" talk that serves as a short introduction to TypeScript and the benefits it provides to large-scale projects.
Let's talk about GPT: A crash course in Generative AI for researchers - Steven Van Vaerenbergh
This talk delves into the extraordinary capabilities of the emerging technology of generative AI, outlining its recent history and emphasizing its growing influence on scientific endeavors. Through a series of practical examples tailored for researchers, we will explore the transformative influence of these powerful tools on scientific tasks such as writing, coding, data wrangling and literature review.
Software, like board games, needs instructions. Should I draw a card or play my card first? Will the connection be reused, or should I use a connection pool? In heroic tales, a protagonist needs directions from elder scrolls to use magical artifacts and complete their mission.
In this talk, Ignasi will explain his journey from neglecting docs to considering them an important step in the software delivery cycle. Ignasi will share tips and tricks he’s been collecting over the years and focus on the habits he adopted to make sure he doesn’t forget docs. Ignasi will also share the types of audiences and cases where documentation can save your organization time and money.
After discussing the importance of documentation within several teams, Ignasi will try to counter the usual arguments and excuses of those who don’t document. No, the code is not the documentation; the code doesn’t tell the whole story. You can have a strong type system restricting how to call an API and still end up with an unusable API: “Hmm, I need a Token here, where do I get it?”.
Join Ignasi for a talk about board games, child tales, and embarrassing PRs.
This talk targets beginners, daily users, and experts alike.
Unlocking Academic Integrity Research Using Simulations, AI Assistance and Ch... - Thomas Lancaster
Can research papers be faked using ChatGPT? These slides were presented at the Welsh Integrity and Assessment Network Symposium Event in June 2023 and considered examples of research using ChatGPT in ways that could be considered both ethical and unethical. The session included a live demo showing how ChatGPT can be used to construct a research paper. Screenshots of the live demo are included at the end of the slide set.
Distributed Natural Language Processing Systems in Python - Clare Corthell
Much of human knowledge is “locked up” in a type of data called text. Humans are great at reading, but are computers? This workshop leads you through open source data science libraries in Python that turn text into valuable data, then tours an open source system built for the Wordnik dictionary to source definitions of words from across the internet.
Thinking Machines Conference, Manila, February 2016
http://thinkingmachin.es/events/
Building Biomedical Knowledge Graphs for In-Silico Drug Discovery - Vaticle
The rapid development and spread of analytical tools in the biomedical sciences has produced a variety of information about all sorts of biological components and their functions. Though important individually, their biological characteristics need to be understood in relation to the interactions they have with other biological components, which requires the integration of vast amounts of complex, semantically rich, heterogeneous data.
Traditional systems are inadequate at accurately modelling and handling data at this scale and complexity, making solutions that speed up the integration and querying of such data a necessity.
In this talk, we present various approaches being used in organisations to build biomedical computational pipelines to address these problems using tools such as Machine Learning and TypeDB. In particular, we discuss how to create an accurate and scalable semantic representation of molecular level biomedical data by presenting examples from drug discovery, precision medicine and competitive intelligence.
Speaker: Tomás Sabat
Tomás is the Chief Operating Officer at Vaticle, dedicated to building a strongly-typed database for intelligent systems. He works directly with TypeDB's open source and enterprise users so they can fulfil their potential with TypeDB and change the world. He focuses mainly on life sciences, cyber security, finance, and robotics.
Loading a lot of data into a graph database is not a trivial exercise. TypeDB Loader (formerly known as GraMi) was developed to allow large-scale data import into TypeDB, a strongly-typed database. Recent improvements have immensely simplified the configuration interface to allow for easier data importing, while maintaining features and the promise of loading huge amounts of data into TypeDB as fast as possible.
Natural Language Interface to Knowledge Graph - Vaticle
Natural language interfaces (NLI) offer end-users an easy and convenient way to query ontology-based knowledge graphs. They automatically generate database queries based on their natural language inputs, avoiding the need for the end user to learn different query languages. NLIs can be used with REST APIs to facilitate and enrich the interactions with knowledge graphs, in domains such as interactive root cause analysis (RCA), dynamic dashboard generation, and Online Transactional Processing (OLTP).
In this talk, you'll learn about a natural language interface built with a TypeDB server running on Raspberry Pi4. This application offers a conversational bot assistant with Cisco Webex for an efficient and flexible way to facilitate human-machine interactions. In particular, this talk will demonstrate how natural language inputs are translated into TypeQL queries using Abstract Syntax Trees that represent the syntactic structure discovered during the Named Entity Recognition (NER) analysis of the textual inputs provided by Rasa 2.X running on an Intel Celeron J3455 miniPC.
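As a hedged sketch of the kind of translation described above (the `person`, `employment`, and `name` types here are hypothetical illustrations, not the schema used in the talk), a question like "Which companies does Alice work for?" might be converted into a TypeQL query such as:

```typeql
match
$p isa person, has name "Alice";       # entity recognised by the NER step
(employee: $p, employer: $c) isa employment;
$c has name $cn;
get $cn;                               # return the company names
```

The Abstract Syntax Tree mentioned in the talk would supply the entities and relations that fill in such a query template.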
A Data Modelling Framework to Unify Cyber Security Knowledge - Vaticle
Cyber security companies collect massive amounts of heterogeneous data coming from a huge number of sources. These describe hundreds of different data types, such as vulnerabilities, observables, incidents, and malware. While this data is highly complex (with many types of relations, type hierarchies, and rules), its structure doesn't significantly change between organisations. However, without a publicly available data model, organisations end up modelling the same data in different ways: in other words, reinventing the wheel and wasting their resources. This modelling complexity makes scaling cyber security applications extremely difficult.
That's why efforts are underway to provide ready-made solutions for typical cyber security use cases, which provide the flexibility to expand for the specific requirements of individual setups. The combination of these efforts has created a lot of inter-related knowledge silos (e.g. CVE, CAPEC, CWE, CVSS, Cocoa, MITRE, VERIS, STIX, MAEC). To unify these silos, various ontologies have been proposed by researchers, with different levels of granularity - from specific use cases like defence exercises, to more comprehensive cases like the UCO project.
During this talk, you’ll learn about the OmnibusCyber Project, an open-source, ready-made solution that aggregates cyber security knowledge silos, based on TypeDB. TypeDB’s framework offers the expressivity, safety, and inference properties required to implement a knowledge graph without the complexity associated with the OWL/RDF semantic frameworks.
Unifying Space Mission Knowledge with NLP & Knowledge Graph - Vaticle
Synopsis
The number of space missions being designed and launched worldwide is growing exponentially. Information on these missions, such as their objectives, orbit, or payload, is disseminated across various documents and datasets. Facilitating access to this information is key to accelerating the design of future missions, enabling experts to link an application to a mission, and following various stakeholders' activities.
This presentation introduces recent research done at the ESA to combine the latest Language Models with Knowledge Graphs, unifying our knowledge on space missions. Language Models such as GPT-3 and BERT are trained to understand the patterns of human (natural) language. These models have revolutionised the field of NLP, the branch of AI enabling machines to understand human language in all its complexity. In this work, key information on a mission is parsed from documents with the GPT-3 model, and the parsed data is then migrated to a TypeDB Knowledge Graph to be easily queried. Although this work focuses on an application in the space sector, the method can be transferred to other engineering fields.
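A minimal sketch of the second step described above (the `mission` type and its attributes are illustrative assumptions, not ESA's actual model): once the language model has extracted a mission's name, orbit, and launch year from a document, the values can be written to the knowledge graph with a TypeQL insert:

```typeql
insert
$m isa mission,
    has name "EarthObs-1",      # values parsed from the document
    has orbit-type "LEO",       # by the language model
    has launch-year 2023;
```

Once migrated, the mission can be queried and linked to other missions, payloads, and stakeholders like any other concept in the graph.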
Presenters
Dr. Audrey Berquand is a Research Fellow at the ESA. Her research aims at enhancing space mission design and knowledge management with text mining, NLP, and Knowledge Graphs. She was awarded her PhD in 2021 from the University of Strathclyde (Scotland) for her thesis on “Text Mining and Natural Language Processing for the Early Stages of Space Mission Design”. Audrey has a background in space systems engineering: she holds an MSc in Aerospace Engineering from the Royal Institute of Technology KTH (Sweden) and a diplôme d'ingénieur from the EPF Graduate School of Engineering (France). Before diving into the world of AI, she spent 3 years at ESA involved in the early design phases of future Earth Observation missions.
Ana Victória Ladeira works with Knowledge Management at the ESA, using automated methods to exploit the information contained in the piles and piles of documents that ESA generates every day. With a Master's degree in Data Science from Maastricht University, Ana is particularly excited about how NLP methods can help large organizations connect different documents and highlight the bigger picture over a big universe of data sources, as well as using Knowledge Graphs to help connect people to the expertise and information they need.
Talk Summary:
State of the art AI approaches can struggle to create solutions which provide accurate results that stand the test of time. They are also plagued by problems such as bias and a lack of explainability. Causal AI addresses these key problems and is at the center of the Geminos Causeway platform, which is built on TypeDB.
This webinar will give you an introduction to why causal AI is so important, and how you can start to use it to drive more value for your organisation.
Speaker: Stuart Frost
Stu is the CEO and founder of Geminos. Their focus is on building AI-driven solutions for mid-sized Smart Manufacturing and Logistics companies, that are frustrated by their inability to digitalize their operations at sensible cost. Stu has 30 years’ experience in founding and leading successful data management and analytics startups, starting at 26 when he founded SELECT Software Tools, and led the company to a NASDAQ IPO in 1996. He then founded DATAllegro in 2003 which was acquired by Microsoft.
Building a Cyber Threat Intelligence Knowledge Graph - Vaticle
Knowledge of cyber threats is a key focus in cyber security. In this talk, we present TypeDB CTI, an open-source threat intelligence platform to store and manage such knowledge. It enables Cyber Threat Intelligence (CTI) professionals to bring their disparate CTI information together in one platform, enabling them to more easily manage such data and discover new insights about cyber threats.
We will describe how we use TypeDB to represent STIX 2.1, the most widely used language and serialization format for exchanging cyber threat intelligence. We cover how we leverage TypeDB's modelling constructs - such as type hierarchies, nested relations, hyper-relations, unique attributes, and logical inference - to build this threat intelligence platform.
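To give a flavour of the modelling constructs mentioned (this is a simplified, hypothetical sketch, not the actual TypeDB CTI schema), a STIX-like type hierarchy with a relation between domain objects could look like:

```typeql
define

stix-id sub attribute, value string;
name sub attribute, value string;

# All STIX domain objects share an abstract supertype.
stix-object sub entity, abstract,
    owns stix-id,
    owns name;
threat-actor sub stix-object, plays uses:user;
malware sub stix-object, plays uses:used;

# A "uses" relation, e.g. a threat actor uses a piece of malware.
uses sub relation,
    relates user,
    relates used;
```

Subtypes inherit the attributes of `stix-object`, so new STIX object types can be added without repeating shared structure.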
Speaker: Tomás Sabat
Tomás is the Chief Operating Officer at Vaticle. He works closely with TypeDB's open source and enterprise users who use TypeDB to build applications in a wide number of industries including financial services, life sciences, cyber security and supply chain management. A graduate of the University of Cambridge, Tomás has spent the last seven years founding and building businesses in the technology industry.
Knowledge Graphs for Supply Chain Operations - Vaticle
Agility in supply chain operations has never been so important, especially in today's nonlinear and complex world. That is why companies with supply chains need knowledge graphs.
So how do enterprises unleash the power of their own supply chain data to make smarter decisions? This is where bops comes into play. Bops activates supply chain data from existing operating systems (ERPs, POS, OMS, etc.), simplifying how operators optimize working capital in every decision.
In this session, bops will showcase a few use cases that portray the power of a knowledge graph to represent a supply chain network composed of an end-to-end product flow driven by actions among plants, customers, and suppliers.
Supply chain operations visibility:
- Story of a Product and an SKU: from raw material to finished goods - track & trace and bill-of-materials deviations
- Story of a Supplier – risk assessments – “the most influential supplier”
- Story of a Process – anomaly detection – “what went wrong?”
Join us for a lively discussion to learn how using knowledge graphs is already helping supply chain companies to better collect, unify, and activate their data.
Speaker: Jorge Risquez
Jorge is the Co-founder and CEO of bops, a headless supply chain intelligence platform helping manufacturers and distributors source, make, and deliver their products, and unlock working capital. Previously, Jorge spent a decade as a Supply Chain Consultant for Deloitte, where he worked with Fortune 500 companies such as Tyson and Cargill. In his spare time, he enjoys going for a run in Central Park and spending time with family and friends.
Building a Distributed Database with Raft - Vaticle
Applications running in production have much higher requirements. Not only do they need to be correct, they also need to be "always-on", handle a much bigger user load, and be secure.
Meet TypeDB Cluster, the TypeDB database for production-scale, built using the Raft replication algorithm. Join us for a walk through the underlying architecture and what value it brings to developers running an application at scale.
Speaker: Ganeshwara Henanda
Ganesh leads the development of TypeDB Cluster while also managing other aspects such as infrastructure and project management. His day-to-day work involves building concurrent and distributed algorithms such as Raft and the Actor Model.
He graduated with an MSc in Grid Computing from the University of Amsterdam, and has built several large-scale distributed and real-time systems throughout his career.
Enabling the Computational Future of Biology - Vaticle
Computational biology has revolutionised biomedicine. The volume of data it is generating is growing exponentially. This requires tools that enable computational and non-computational biologists to collaborate and derive meaningful insights. However, traditional systems are inadequate to accurately model and handle data at this scale and complexity.
In this talk, we discuss how TypeDB enables biologists to build a deeper understanding of life, and increase the probability of groundbreaking discoveries, across the life sciences.
Speaker: Tomás Sabat
Tomás is the Chief Operating Officer at Vaticle. He works closely with TypeDB's open source and enterprise users who use TypeDB to build applications in a wide number of industries including financial services, life sciences, cybersecurity and supply chain management. A graduate of the University of Cambridge, Tomás has spent the last seven years founding and building businesses in the technology industry.
Build your skills and learn how TypeDB's native inference engine works.
Good for:
- Beginners to TypeDB and TypeQL
- Those who have been using TypeDB and want a refresher on inference in TypeDB
- Experienced software engineers
- Those who want to better represent their domain in a model that allows for logical reasoning at the database level
Description:
TypeDB is capable of reasoning over data via pre-defined rules. A TypeQL rule looks for a given pattern in the database and, when the pattern is found, infers the given queryable fact. The inference provided by rules is performed at query (run) time. Rules not only allow shortening and simplifying commonly-used queries, but also enable knowledge discovery and the implementation of business logic at the database level.
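For instance (a hedged sketch using hypothetical `person`, `employment`, and `colleagueship` types), a rule can infer a colleague relation between people employed by the same company:

```typeql
define
rule infer-colleagues:
when {
    (employee: $x, employer: $c) isa employment;
    (employee: $y, employer: $c) isa employment;
    not { $x is $y; };               # a person is not their own colleague
} then {
    (colleague: $x, colleague: $y) isa colleagueship;
};
```

A query such as `match (colleague: $a, colleague: $b) isa colleagueship;` then returns the inferred pairs at query time, without any `colleagueship` instances ever being stored.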
Takeaways:
- Understanding of fundamental components of TypeDB's inference engine and how to write rules for your domain
- Write at least 1 rule for your use case
- Utilise the rule you wrote in a query
Tomás Sabat:
Tomás is the Chief Operating Officer at Vaticle, dedicated to building a strongly-typed database for intelligent systems. He works directly with TypeDB's open source and enterprise users so they can fulfil their potential with TypeDB and change the world. He focuses mainly on life sciences, cyber security, finance, and robotics.
Join the TypeDB community to learn how we think about data modelling, and how TypeDB's expressivity allows you to model your domain based on logical and object-oriented programming principles.
Good for:
- Engineers, scientists, and technical executives
- Those in a technical field working with complex datasets, and building intelligent systems
- Anyone curious to learn about the expressive power of TypeDB's data model
Description:
We open this training with an exploration into what a schema looks like in TypeDB, starting with clarifying the motivation for the conceptual model in TypeDB, and its relationship to the Enhanced Entity-Relationship model.
Then we break things down a bit more philosophically, delving into what it means to represent data in TypeDB, and how TypeDB allows you to think at a higher level, as opposed to join tables, columns, documents, vertices, edges, and properties.
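As a small, hypothetical example of what such a schema looks like in practice, entities, a relation, and an attribute are defined directly as concepts, with no join tables or foreign keys:

```typeql
define

name sub attribute, value string;

person sub entity,
    owns name,
    plays employment:employee;

company sub entity,
    owns name,
    plays employment:employer;

# Employment is a first-class relation, not a join table.
employment sub relation,
    relates employee,
    relates employer;
```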
Takeaways:
- Be able to articulate why TypeDB's data model is so beneficial for complex data, and why we use it to build intelligent systems
- Write a TypeDB schema in TypeQL
- Practice modelling one of your own domains
Tomás Sabat:
Tomás is the Chief Operating Officer at Vaticle, dedicated to building a strongly-typed database for intelligent systems. He works directly with TypeDB's open source and enterprise users so they can fulfil their potential with TypeDB and change the world. He focuses mainly on life sciences, cyber security, finance, and robotics.
Using SQL to query relational databases is easy. As a declarative language, it’s straightforward to write queries and build powerful applications. However, relational databases struggle when working with complex data: challenges arise especially in modelling and querying it.
For example, the large number of necessary JOINs forces us to write long and verbose queries, which are error-prone and hard to maintain.
Just as SQL is the standard query language for relational databases, TypeQL is TypeDB's query language. It’s a declarative language, and allows us to model, query and reason over our data.
In this talk, we will look at how TypeQL compares to SQL. Why and when should you use TypeQL over SQL? How do we do outer/inner joins in TypeQL? We'll look at the common concepts, but mostly talk about the differences between the two.
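As a flavour of the difference: where SQL would join a person table, an employment table, and a company table, TypeQL queries the relation directly (the schema names below are illustrative):

```typeql
match
$p isa person, has name $n;
$c isa company, has name "Vaticle";
(employee: $p, employer: $c) isa employment;
get $n;
```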
Speaker: Tomás Sabat
Tomás is the Chief Operating Officer at Vaticle. He works closely with TypeDB's open source and enterprise users who use TypeDB to build applications in a wide number of industries including financial services, life sciences, cybersecurity and supply chain management. A graduate of the University of Cambridge, Tomás has spent the last seven years founding and building businesses in the technology industry.
TypeDB Academy - Getting Started with Schema Design | Vaticle
In this TypeDB Academy, we start by gaining an understanding of the fundamental components of TypeDB's type system and what makes it unique. We will see how we can download, install, and run TypeDB, and learn to perform basic database operations.
We'll then explore what a schema looks like in TypeDB, starting with clarifying the motivation for schema, the conceptual schema of TypeDB, and its relationship to the Enhanced Entity-Relationship model.
Good for:
- Beginners to TypeDB and TypeQL
- Those who have been using TypeDB and want a refresher on schema and TypeQL
- Experienced database administrators and software engineers
Takeaways:
- Understanding of fundamental components of TypeDB
- How to download, install, and run TypeDB on your computer
- Be able to articulate why schema is so beneficial when using TypeDB, why we use one, and how it enables a more expressive model
- Write a TypeDB schema in TypeQL
Comparing Semantic Web Technologies to TypeDB | Vaticle
Semantic Web technologies enable us to represent and query very complex and heterogeneous datasets. We can add semantics and reason over large bodies of data on the web. However, despite the wealth of educational material available, they have failed to achieve mass adoption outside academia.
TypeDB works at a higher level of abstraction and enables developers to be more productive when working with complex data. TypeDB is easier to learn, reducing the barrier to entry and enabling more developers to access semantic technologies. Instead of using a myriad of standards and technologies, we just use one language - TypeQL.
In this talk we will:
- look at how TypeQL compares to Semantic Web standards, specifically RDF, SPARQL, RDFS, OWL and SHACL.
- cover questions such as, how do we represent hyper-relations in TypeDB? How does one use rdfs:domain and rdfs:range in TypeDB? And how do the modelling philosophies compare?
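As one concrete point of comparison: a hyper-relation such as a ternary employment — which in RDF typically requires reification or an intermediate node — is expressed natively in TypeQL (type and role names here are illustrative):

```typeql
define

employment sub relation,
    relates employee,
    relates employer,
    relates negotiated-contract;
```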
Speaker: Tomás Sabat
Tomás is the Chief Operating Officer at Vaticle. He works closely with TypeDB's open source and enterprise users who use TypeDB to build applications in a wide number of industries including financial services, life sciences, cyber security and supply chain management. A graduate of the University of Cambridge, Tomás has spent the last seven years founding and building businesses in the technology industry.
How might we utilise an actor-based execution model to build a powerful yet elegant reasoning engine?
The actor model is an asynchronous, inherently parallel framework that forms the basis of some of the most computationally heavy systems in the world. By leveraging it in an event-driven design, we can build an execution engine that makes efficient use of all available hardware resources to answer your reasoning queries.
We'll visit the key ideas behind actors, and then walk through how we break reasoning into neat, actor-sized building blocks. As we do this, it will become clear how our marriage of reasoning and actors naturally produces a scalable and elegant execution engine. By examining the problem of reasoning from an actor-based lens, we'll be able to better understand the complexities of reasoning and visualise bottlenecks and optimisations.
Intro to TypeDB and TypeQL | A strongly-typed database | Vaticle
TypeDB is a strongly-typed database. It provides a rich and logical type system that lets you break down complex problems into meaningful, logical models, using TypeQL as its query language.
TypeDB allows you to model your domain based on logical and object-oriented principles. Composed of entity, relationship, and attribute types, as well as type hierarchies, roles, and rules, TypeDB allows you to think at a higher level, as opposed to in join tables, columns, documents, vertices, and edges.
Types describe the logical structures of your data, allowing TypeDB to validate that your code inserts and queries data correctly. Query validation goes beyond static type-checking to include logical validation, rejecting queries that are semantically meaningless. With strict type-checking errors, you have a dataset that you can trust.
Finally, TypeDB encodes your data for logical interpretation by its reasoning engine. It enables type-inference and rule-inference, which create logical abstractions of data. This allows for the discovery of facts and patterns that would otherwise be too hard to find.
With these abstractions, queries in the tens to hundreds of lines in SQL or NoSQL databases can be written in just a few lines in TypeQL – collapsing code complexity by orders of magnitude.
Join Tomás from the Vaticle team where he'll discuss the origins of TypeDB, the impetus for inventing a new query language, TypeQL, and why we are so excited about the future of software and intelligent systems.
Tomás Sabat:
Tomás is the Chief Operating Officer at Vaticle, dedicated to building a strongly-typed database for intelligent systems. He works directly with TypeDB's open source and enterprise users so they can fulfil their potential with TypeDB and change the world. He focuses mainly on life sciences, cyber security, finance and robotics.
Graph Databases vs TypeDB | What you can't do with graphs | Vaticle
Developing with graph databases has a number of challenges, such as the modelling of complex schemas, and maintaining data consistency in your database.
In this talk, we discuss how TypeDB addresses these challenges, as well as how it compares to property graph databases. We’ll look at how to read and write data, how to model complex domains, and TypeDB’s ability to infer new data.
The main differences between TypeDB and graph databases can be summarised as:
1. TypeDB provides a concept-level schema with a type system that fully implements the Entity-Relationship (ER) model. Graph databases, on the other hand, use vertices and edges without integrity constraints imposed in the form of a schema
2. TypeDB contains a built-in inference engine - graph databases don’t provide native inferencing capabilities
3. TypeDB is an abstraction over a graph, and leverages a graph database under the hood to create a higher-level model, while graph databases work at different levels of abstraction
Tomás Sabat
Tomás is the Chief Operating Officer at Vaticle. He works closely with TypeDB's open source and enterprise users who use TypeDB to build applications in a wide number of industries including financial services, life sciences, cyber security and supply chain management. A graduate of the University of Cambridge, Tomás has spent the last seven years founding and building businesses in the technology industry.
In this seminar we use TypeDB to open a window on the Pandora Papers, a massive 'data tsunami' based on 11.9 million leaked source documents obtained by the International Consortium of Investigative Journalists (ICIJ).
We will use an automated query builder to get an initial set of results, and then hop from node to node, exploring neighbours and mapping out a suspicious-looking network of offshore shell companies, officers and intermediaries.
Speaker: Jon Thompson
Jon has an MSc in Applied Mathematics and has worked for several years as a Data Scientist in high-throughput biological sequencing. He is the founder of Nodelab, which is on a mission to provide a fully-featured graphical user interface experience for TypeDB.
Heterogeneous data holds significant inherent context. We would like our machine learning models to understand this context, and utilise this ancillary but critical information to improve the accuracy and versatility of our models.
How can we systematically make use of context in Machine Learning?
We delve into the knowledge modelling techniques which, applied with the right ML strategies, give us a promising approach for robustly handling heterogeneous data in large knowledge models. We aim to do this in a way that allows us to build any machine learning model, including graph learning models like our KGCN.
Speaker: James Fletcher, Vaticle
James comes from a background in Computer Vision, specialising in automated diagnostics. As Principal Scientist at Vaticle, his mission is to demonstrate to the world how traditional symbolic approaches to AI, built into TypeDB, can be combined with present-day research in machine learning.
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23... | John Andrews
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Empowering the Data Analytics Ecosystem: A Laser Focus on Value
The data analytics ecosystem thrives when every component functions at its peak, unlocking the true potential of data. Here's a laser focus on key areas for an empowered ecosystem:
1. Democratize Access, Not Data:
Granular Access Controls: Provide users with self-service tools tailored to their specific needs, preventing data overload and misuse.
Data Catalogs: Implement robust data catalogs for easy discovery and understanding of available data sources.
2. Foster Collaboration with Clear Roles:
Data Mesh Architecture: Break down data silos by creating a distributed data ownership model with clear ownership and responsibilities.
Collaborative Workspaces: Utilize interactive platforms where data scientists, analysts, and domain experts can work seamlessly together.
3. Leverage Advanced Analytics Strategically:
AI-powered Automation: Automate repetitive tasks like data cleaning and feature engineering, freeing up data talent for higher-level analysis.
Right-Tool Selection: Strategically choose the most effective advanced analytics techniques (e.g., AI, ML) based on specific business problems.
4. Prioritize Data Quality with Automation:
Automated Data Validation: Implement automated data quality checks to identify and rectify errors at the source, minimizing downstream issues.
Data Lineage Tracking: Track the flow of data throughout the ecosystem, ensuring transparency and facilitating root cause analysis for errors.
5. Cultivate a Data-Driven Mindset:
Metrics-Driven Performance Management: Align KPIs and performance metrics with data-driven insights to ensure actionable decision making.
Data Storytelling Workshops: Equip stakeholders with the skills to translate complex data findings into compelling narratives that drive action.
Benefits of a Precise Ecosystem:
Sharpened Focus: Precise access and clear roles ensure everyone works with the most relevant data, maximizing efficiency.
Actionable Insights: Strategic analytics and automated quality checks lead to more reliable and actionable data insights.
Continuous Improvement: Data-driven performance management fosters a culture of learning and continuous improvement.
Sustainable Growth: Empowered by data, organizations can make informed decisions to drive sustainable growth and innovation.
By focusing on these precise actions, organizations can create an empowered data analytics ecosystem that delivers real value by driving data-driven decisions and maximizing the return on their data investment.
As Europe's leading economic powerhouse and the fourth-largest #economy globally, Germany stands at the forefront of innovation and industrial might. Renowned for its precision engineering and high-tech sectors, Germany's economic structure is heavily supported by a robust service industry, accounting for approximately 68% of its GDP. This economic clout and strategic geopolitical stance position Germany as a focal point in the global cyber threat landscape.
In the face of escalating global tensions, particularly those emanating from geopolitical disputes with nations like #Russia and #China, #Germany has witnessed a significant uptick in targeted cyber operations. Our analysis indicates a marked increase in #cyberattack sophistication aimed at critical infrastructure and key industrial sectors. These attacks range from ransomware campaigns to #AdvancedPersistentThreats (#APTs), threatening national security and business integrity.
🔑 Key findings include:
🔍 Increased frequency and complexity of cyber threats.
🔍 Escalation of state-sponsored and criminally motivated cyber operations.
🔍 Active dark web exchanges of malicious tools and tactics.
Our comprehensive report delves into these challenges, using a blend of open-source and proprietary data collection techniques. By monitoring activity on critical networks and analyzing attack patterns, our team provides a detailed overview of the threats facing German entities.
This report aims to equip stakeholders across public and private sectors with the knowledge to enhance their defensive strategies, reduce exposure to cyber risks, and reinforce Germany's resilience against cyber threats.
Explore our comprehensive data analysis project presentation on predicting product ad campaign performance. Learn how data-driven insights can optimize your marketing strategies and enhance campaign effectiveness. Perfect for professionals and students looking to understand the power of data analysis in advertising. For more details, visit: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/
3–7. What is Text Mining?
Text mining is the automatic extraction of structured semantic information from unstructured machine-readable text.
Twitter: @GraknLabs
8–11. What are the Challenges of Going Beyond Text Mining?
Integration: Difficult to ingest and integrate complex networks of text-mined output across bodies of text.
Normalisation: Difficult to contextualise extracted knowledge from text with existing knowledge.
Discovery: Difficult to investigate insights contained in text in a scalable and efficient way.
12–15. How Can we Solve These Challenges?
Integration: Ingest and integrate complex networks of text-mined output into one collection – a knowledge graph.
Normalisation: Impose an explicit structure on text-mined data to contextualise the relationships with existing knowledge.
Discovery: Use automated reasoning and analytics to investigate and interpret insights contained across text in a scalable and efficient way.
16. Grakn is the knowledge base foundation for intelligent systems – a.k.a. a Knowledge Graph.
Knowledge Storage System | Knowledge Inference | Knowledge Analytics
18. How Do We Build A Text Mined Knowledge Graph?
1. Model and migrate text mined output into a knowledge graph – Grakn.
2. Discover and interpret new insights.
19. How Do We Build A Text Mined Knowledge Graph?
1. Model and migrate text mined output into a knowledge graph – Grakn.
31. What Does Text Mined Output Look Like?
[Diagram: NLP output for the sentence "Syed is proud to be the brother of Zainab." – the tokens "Syed" and "Zainab" are each typed Person with confidence 1.0; a Sibling relation links them with confidence 0.9; the sentence also carries a sentiment annotation.]
37–39. How Do We Model in Graql?
Entities, Relationships, Attributes
40. How Do We Migrate Data Into Grakn?
// Node.js client
const query = await transaction.query(
  'insert $t isa token, has lemma "Syed", has type "person";'
);
// Java client
InsertQuery query = Graql.insert(
  var("t").isa("token").has("lemma", "Syed").has("type", "person")
);
# Python client
query = transaction.query(
  'insert $t isa token, has lemma "Syed", has type "person";'
)
41. How Do We Build A Text Mined Knowledge Graph?
2. Discover and interpret new insights.
42–43. How Do We Discover An Insight?
Question: What knowledge is extracted from a PubMed article? (answered with a Graql query)
44. [Diagram: an abstract excerpt with mined annotations – the phrases "BRAF inhibitor Dabrafenib", "MEK inhibitor Trametinib" and "Trametinib and Dabrafenib treat Melanoma" are highlighted, with concept labels gene, protein, drug and disease, and relation labels treatment and inhibition.]
45–46. How Do We Discover An Insight?
Question: Which PubMed articles mention the disease Melanoma and the gene BRAF? (answered with a Graql query)
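The Graql query on this slide was shown as an image; a sketch of what it might look like, using the deck's pubmed-article and token types together with a hypothetical mention relation and role names:

```graql
match
$a isa pubmed-article, has pmid $id;
$m isa token, has lemma "Melanoma";
$b isa token, has lemma "BRAF";
(mentioning: $a, mentioned: $m) isa mention;
(mentioning: $a, mentioned: $b) isa mention;
get $id;
```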
51–53. How Do We Interpret An Insight?
[Diagram: a Pubmed-article (pmid, title, abstract) contains a sentence in which a mined-relation of type treatment links the tokens "Dabrafenib" and "melanoma". When this pattern holds, then a treatment relation is inferred between the drug Dabrafenib (role: therapeutic) and the disease Melanoma (role: treated-condition).]
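The when/then pattern above corresponds to a Graql rule; a sketch of what it might look like (the therapeutic and treated-condition roles come from the slides, while relation-token and relation-type are hypothetical names for the mined-relation's role and attribute):

```graql
define

treatment-from-mined-relation sub rule,
when {
    # a mined "treatment" relation between two tokens...
    (relation-token: $t1, relation-token: $t2) isa mined-relation,
        has relation-type "treatment";
    $t1 has lemma $l1;
    $t2 has lemma $l2;
    # ...whose lemmas match a known drug and disease
    $d isa drug, has name $l1;
    $c isa disease, has name $l2;
},
then {
    (therapeutic: $d, treated-condition: $c) isa treatment;
};
```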
54–61. How Do We Do Reasoning in Graql?
[Diagram, built up step by step: starting from a Pubmed-article (pmid, title, abstract), to the sentence it contains, to the mined-relation of type treatment between the tokens "Dabrafenib" and "melanoma", to the entities Drug: Dabrafenib and Disease: Melanoma, and finally to the inferred treatment relation between them.]
63. What makes Grakn a Knowledge Base for Text Mining?
Integration: Ingest and integrate complex networks of text-mined output into one collection – a knowledge graph.
Normalisation: Impose an explicit structure on text-mined data to contextualise the relationships with existing knowledge.
Discovery: Use automated reasoning and analytics to investigate and interpret insights contained across text in a scalable and efficient way.
64. Thank you for attending this webinar!
Follow us on:
@graknlabs
Tomás: @tasabat
Daniel: @meanwhile_inn
Join our chatroom on:
https://discord.gg/graknlabs