Computational biology has revolutionised biomedicine. The volume of data it is generating is growing exponentially. This requires tools that enable computational and non-computational biologists to collaborate and derive meaningful insights. However, traditional systems are inadequate to accurately model and handle data at this scale and complexity.
In this talk, we discuss how TypeDB enables biologists to build a deeper understanding of life, and increase the probability of groundbreaking discoveries, across the life sciences.
Speaker: Tomás Sabat
Tomás is the Chief Operating Officer at Vaticle. He works closely with TypeDB's open source and enterprise users who use TypeDB to build applications in a wide number of industries including financial services, life sciences, cybersecurity and supply chain management. A graduate of the University of Cambridge, Tomás has spent the last seven years founding and building businesses in the technology industry.
Building Biomedical Knowledge Graphs for In-Silico Drug Discovery (Vaticle)
The rapid development and spread of analytical tools in the biomedical sciences have produced a variety of information about all sorts of biological components and their functions. Though these components are important individually, their biological characteristics need to be understood in relation to the interactions they have with other biological components, which requires the integration of vast amounts of complex, semantically rich, heterogeneous data.
Traditional systems are inadequate at accurately modelling and handling data at this scale and complexity, making solutions that speed up the integration and querying of such data a necessity.
In this talk, we present various approaches being used in organisations to build biomedical computational pipelines to address these problems using tools such as Machine Learning and TypeDB. In particular, we discuss how to create an accurate and scalable semantic representation of molecular level biomedical data by presenting examples from drug discovery, precision medicine and competitive intelligence.
Speaker: Tomás Sabat
Tomás is the Chief Operating Officer at Vaticle, dedicated to building a strongly-typed database for intelligent systems. He works directly with TypeDB's open source and enterprise users so they can fulfil their potential with TypeDB and change the world. He focuses mainly on life sciences, cyber security, finance and robotics.
Precision Medicine Knowledge Graph with GRAKN.AI (Vaticle)
The success or failure of any modern organisation relies greatly on the way they leverage their data. However, most institutions and organisations have no way to aggregate the magnitude and complexity of their disparate data catalogs. They require a unified representation of their data which represents their specific domain truthfully as well as conceptually. In this talk, we introduce how using a knowledge graph addresses these problems in the field of Precision Medicine.
Precision medicine aims at establishing personalised context-centred therapies and diagnostics. This is done by integrating complex and disparate data repositories relating to environmental and molecular origins of diseases.
It has become increasingly difficult to design models for complex diseases which accommodate individual genetic variability. We need efficient and successful techniques to integrate, manage, maintain and visualise sizeable datasets. These datasets can come from a multitude of sources, in many different formats and with different levels of confidentiality. This creates the need to accumulate all this knowledge in one single structured architecture - a knowledge graph.
In this talk, we outline a strategy, motivated by translational bioinformatics, for fulfilling the promises of Precision Medicine using Grakn.
This is a clip from the Grakn London Meetup in February 2019. Join the community: www.grakn.ai/community
An overview of the i2b2 clinical research platform, and the implications of connecting Indivo to i2b2 as a source of patient-reported outcomes. Presented at the 2012 Indivo X Users' Conference.
By Shawn Murphy MD, Ph.D., Partners Healthcare.
Introduction to Gene Mining Part A: BLASTn-off! (adcobb)
In this lesson, students will learn to use bioinformatics portals and tools to mine plant versions of human genes. Student handout and teacher resource materials are available at www.Araport.org, Teaching Resources (Community tab). Suitable for grades 9-12 or first year undergraduate students.
Accelerating Scientific Research Through Machine Learning and Graph (Neo4j)
Miroculus is a molecular diagnostics company that leverages the potential of microRNAs as biomarkers and has created the most easy-to-use and automated platform for their detection. MicroRNAs are small non-coding RNA molecules whose primary role is to regulate the expression of our genes. Their discovery circulating in body fluids such as blood plasma/serum, urine and saliva has been followed up by a multitude of studies, providing evidence that detection of specific microRNA molecules can give clues about a person’s health status and may therefore be used as biomarkers for various conditions.
Loom is an up-to-date snapshot of the scientific literature landscape focused on microRNAs that we built to expedite our own research. As of today, there is no compelling way to access much of the microRNA research. By using Loom's easy-to-use, interactive UI, the researcher is able to quickly locate the relevant sentences across many publications relating specific microRNAs with her disease or gene of interest. With this tool, our objective is to provide a visually compelling and complete overview of how microRNAs relate to specific diseases and genes.
At the backend, Loom is comprised of 4 microservices. The first one is a listener that fetches, daily, the new publications available in the NCBI databases: PubMed for abstracts and PMC for full-text, open-access publications. Then, a natural language processor scans each publication, breaking it down into its constituent sentences and detecting mentions of microRNAs, genes and diseases.
Within each sentence, a machine learning scorer evaluates the strength and type of relationship on a scale from 0 to 1 and writes the results to a graph database. The resulting graph database is then queried in real-time by the UI to retrieve the sentences and relationships the user is interested in.
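The sentence-level flow described above (split into sentences, detect mentions, score the relationship, emit graph edges) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Loom's implementation: the vocabularies, the regular expression, the `Edge` record, and the keyword-based `score_sentence` stand-in for the machine learning scorer are all hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical toy vocabularies; a real system would use curated
# dictionaries or a trained entity recogniser.
MIRNA_PATTERN = re.compile(r"\b(?:hsa-)?miR-\d+[a-z]?\b", re.IGNORECASE)
DISEASES = {"breast cancer", "gastric cancer"}
STRONG_VERBS = {"regulates", "suppresses", "promotes", "inhibits"}


@dataclass(frozen=True)
class Edge:
    """One microRNA-disease relationship found in a single sentence."""
    mirna: str
    disease: str
    score: float
    sentence: str


def split_sentences(text):
    # Naive splitter: break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def score_sentence(sentence):
    # Placeholder for the machine learning scorer (assumption):
    # 0.9 if the sentence contains a strong relation verb, else 0.5.
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    return 0.9 if words & STRONG_VERBS else 0.5


def extract_edges(text):
    """Return one Edge per (microRNA, disease) pair co-mentioned in a sentence."""
    edges = []
    for sentence in split_sentences(text):
        mirnas = MIRNA_PATTERN.findall(sentence)
        lowered = sentence.lower()
        diseases = [d for d in sorted(DISEASES) if d in lowered]
        for mirna in mirnas:
            for disease in diseases:
                edges.append(Edge(mirna, disease, score_sentence(sentence), sentence))
    return edges


abstract = (
    "miR-21 promotes breast cancer progression. "
    "We also measured miR-155 in gastric cancer samples. "
    "No microRNA was found here."
)
edges = extract_edges(abstract)
```

Each `Edge` would then be written to the graph store as a relationship between a microRNA node and a disease node, with the sentence and score kept as properties so the UI can retrieve them.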
The Foundation of P4 Medicine Keynote Presentation as presented by Leroy Hood, M.D., PhD, at the Ohio State University Personalized Health Care National Conference 2010.
Possible Solution for Managing the Worlds Personal Genetic Data - DNA Guide, ... (DNA Compass)
World DNA Day and Genome Day, Dalian China 2011
"Possible Solution for Managing the Worlds Genetic Data" given by Alice Rathjen, Founder & President DNA Guide, Inc.
Proposes genetic tests be given a rating for quality of science, medical utility and viewing risk so as to facilitate the flow of genetic information in a responsible manner from the lab to the physician and patient. Explains how technology combined with public policy could enable both privacy and personalized medicine to thrive. Advocates individual ownership over personal genetic data and suggests the genome as a data format could provide the foundation for digital human rights.
tags: DNA, genetic testing, privacy, personalized medicine, FDA regulation
Semantic Web for Health Care and Biomedical Informatics (Amit Sheth)
Amit Sheth, "Semantic Web for Health Care and Biomedical Informatics," Keynote at NSF Biomed Web Workshop, Corbett, Oregon, December 4-5, 2007.
http://www.biomedweb.info/2007/
Overview of GSK Machine Learning and Artificial Intelligence activities, by Kim Branson, SVP and Head of AI at GSK Pharma, November 3rd, 2021. AI methods are becoming widely used due to the exponential nature of data generation. AI is used to collect the data, process it, and derive causal relations. AI is also being used to help design the next experiment in an efficient manner (RL, Bandits ..). The exponential nature of data improves AI in a virtuous cycle. Target discovery: integration of Functional Genomic, Genetic and other data sources for target discovery.
Companion Software: for each asset we will generate software for stratification and individual response prediction.
Fundamental AI Research: fundamental research into causal machine learning, automated machine learning, and multi-modal data combination. We are developing a feedback loop for each AI system we build. We have best-in-industry, fully automated discovery biology robotics. We ask the model what data it needs. We only know what to do with 15% of the genetic variants we obtain from genetic association studies. How do we unlock all the value of our investments in genetic data? We build AI for Variant to Gene Prediction: it transforms a complex genetic locus into a ranked list of candidate genes with confidence bounds, which are tested experimentally through Functional Genomics. Variant to Gene AI: a multi-AI system for solving the variant to gene problem. Teaching our AI what we know about the world: internal and external data, a custom NLP model for biomedical data developed by the GSK AI team, and a Knowledge Graph of all data. Data becomes a critical factor for AI success. Private data sources: generated data allows us to determine the value of other public and private sources. Models trained on private and public data are unique. Common public data sources. Moving beyond medical records for cohort definition. Image Derived Phenotype (IDP) discovery and generation using AI/ML. Computational companion diagnostics and learning from clinical trials. Focusing on Computational Pathology: applying the advances in AI for image analysis. Tissues are collected as part of the biopsy for pathology. Digital versions of these H&E slides serve as a tool for diagnosis and prognosis by human pathologists. What else can we do with this image data? Genetic differences are not human discernible; they are currently determined by sequencing the tumor. Should we be constrained by human ability? AI can determine HRD genetic status from an image.
The Human Phenotype Ontology (HPO) was developed to describe phenotypic abnormalities, also known as “deep phenotyping”, whereby symptoms and characteristic phenotypic findings (a phenotypic profile) are captured. The HPO has been used with great success to assist computational phenotype comparison against known diseases, other patients, and model organisms to support diagnosis of rare disease patients. Clinicians and geneticists create phenotypic profiles based on clinical evaluation, but this is time consuming and can miss important phenotypic features. Patients are sometimes the best source of information about symptoms that might otherwise be missed in a clinical encounter. However, the HPO primarily uses medical terminology, which can be difficult for patients and their families to understand. To make the HPO accessible to patients, we systematically added non-expert (i.e., layperson) synonyms. Using semantic similarity, patient-recorded phenotypic profiles can be evaluated against those created clinically for undiagnosed patients to determine the improvement gained from the patient-driven phenotyping, as well as how much the patient phenotyping narrows the diagnosis. This patient-centric HPO can be utilized by all: in patient-centered rare disease websites, in patient community platforms and registries, or even to post one’s hard-to-diagnose phenotypic profile on the Web.
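To make the comparison step concrete, here is a small sketch of how a patient-recorded profile written in lay terms might be normalised and compared with a clinical profile. Everything in it is an assumption for illustration: the term names are toy labels rather than real HPO identifiers, the hierarchy is invented, and Jaccard overlap of ancestor-closed term sets stands in for whichever semantic similarity measure the authors actually used.

```python
# Hypothetical lay-term synonyms mapped to toy phenotype terms.
LAY_SYNONYMS = {
    "cloudy lens": "cataract",
    "crossed eyes": "strabismus",
    "irregular heartbeat": "arrhythmia",
}

# Toy is-a hierarchy (assumption): term -> list of parent terms.
PARENTS = {
    "root": [],
    "eye_abnormality": ["root"],
    "cataract": ["eye_abnormality"],
    "strabismus": ["eye_abnormality"],
    "heart_abnormality": ["root"],
    "arrhythmia": ["heart_abnormality"],
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def normalise(reported):
    """Translate lay phrases to ontology terms; unknown phrases are dropped."""
    return {LAY_SYNONYMS[p.lower()] for p in reported if p.lower() in LAY_SYNONYMS}


def closure(profile):
    """Expand a profile with every ancestor of every term in it."""
    result = set()
    for term in profile:
        result |= ancestors(term)
    return result


def similarity(profile_a, profile_b):
    """Jaccard overlap of the two ancestor-closed profiles, in [0, 1]."""
    a, b = closure(profile_a), closure(profile_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


patient = normalise(["Crossed eyes", "irregular heartbeat", "tired"])
clinical = {"cataract", "arrhythmia"}
score = similarity(patient, clinical)
```

Closing each profile over its ancestors means that two different eye findings still count as partially similar because they share a parent term, which is the intuition behind ontology-based comparison.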
Short tutorials on how to use the web-based tool DAVID (Database for Annotation, Visualization and Integrated Discovery) - http://david.abcc.ncifcrf.gov/
DAVID provides a comprehensive set of functional annotation tools for investigators to understand the biological meaning behind large lists of genes.
Loading a lot of data into a graph database is not a trivial exercise. TypeDB Loader (formerly known as GraMi) was developed to allow large-scale data import into TypeDB, a strongly-typed database. Recent improvements have immensely simplified the configuration interface to allow for easier data importing, while maintaining features and the promise of loading huge amounts of data into TypeDB as fast as possible.
Natural Language Interface to Knowledge Graph (Vaticle)
Natural language interfaces (NLIs) offer end users an easy and convenient way to query ontology-based knowledge graphs. They automatically generate database queries from the user's natural language input, avoiding the need for the end user to learn different query languages. NLIs can be used with REST APIs to facilitate and enrich interactions with knowledge graphs, in domains such as interactive root cause analysis (RCA), dynamic dashboard generation, and Online Transaction Processing (OLTP).
In this talk, you'll learn about a natural language interface built with a TypeDB server running on a Raspberry Pi 4. This application offers a conversational bot assistant with Cisco Webex for an efficient and flexible way to facilitate human-machine interactions. In particular, this talk will demonstrate how natural language inputs are translated into TypeQL queries using Abstract Syntax Trees that represent the syntactic structure discovered during the Named Entity Recognition (NER) analysis of the textual inputs provided by Rasa 2.X running on an Intel Celeron J3455 miniPC.
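As a rough illustration of the translation step, the sketch below turns a set of recognised entities into a very small syntax tree and renders it as a TypeQL 2.x style `match ... get` query string. It is not the speaker's implementation: the `parsed` dictionary is only loosely modelled on what an NLU component might return, and the `router` type, the `location` attribute, and the helper names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class MatchNode:
    """A tiny AST node: one entity type plus its attribute constraints."""
    entity_type: str
    constraints: dict = field(default_factory=dict)


def build_ast(parsed):
    """Build a MatchNode from recognised entities (hypothetical input shape)."""
    entity_type = None
    constraints = {}
    for ent in parsed["entities"]:
        if ent["entity"] == "type":
            entity_type = ent["value"]
        else:
            constraints[ent["entity"]] = ent["value"]
    if entity_type is None:
        raise ValueError("no entity type recognised")
    return MatchNode(entity_type, constraints)


def escape(value):
    # Escape backslashes and double quotes before embedding in a string literal.
    return value.replace("\\", "\\\\").replace('"', '\\"')


def to_typeql(node, var="x"):
    """Render the AST as a TypeQL 2.x style query string."""
    parts = [f"${var} isa {node.entity_type}"]
    for attr, value in node.constraints.items():
        parts.append(f'has {attr} "{escape(value)}"')
    return "match " + ", ".join(parts) + f"; get ${var};"


parsed = {
    "text": "show me the router located in Paris",
    "entities": [
        {"entity": "type", "value": "router"},
        {"entity": "location", "value": "Paris"},
    ],
}
query = to_typeql(build_ast(parsed))
```

Keeping an intermediate tree, rather than formatting the query directly from the entities, leaves room to validate the node against the schema before any query text is produced.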
Similar to Enabling the Computational Future of Biology.pdf
A Data Modelling Framework to Unify Cyber Security Knowledge (Vaticle)
Cyber security companies collect massive amounts of heterogeneous data coming from a huge number of sources. These describe hundreds of different data types, such as vulnerabilities, observables, incidents, and malware. While this data is highly complex (with many types of relations, type hierarchies, and rules), its structure doesn't significantly change between organisations. However, without a publicly available data model, organisations end up modelling the same data in different ways: in other words, reinventing the wheel and wasting their resources. This modelling complexity makes scaling cyber security applications extremely difficult.
That's why efforts are underway to provide ready-made solutions for typical cyber security use cases which provide the flexibility to expand for the specific requirements of individual setups. The combination of those efforts has created a lot of inter-related knowledge silos (e.g. CVE, CAPEC, CWE, CVSS, Cocoa, MITRE, VERIS, STIX, MAEC). To unify these silos, various ontologies have been proposed by researchers, with different levels of granularity - from specific use cases like defence exercises, to more comprehensive cases like the UCO project.
During this talk, you’ll learn about the OmnibusCyber Project, an open-source, ready-made solution that aggregates cyber security knowledge silos, based on TypeDB. TypeDB’s framework offers the expressivity, safety, and inference properties required to implement a knowledge graph without the complexity associated with the OWL/RDF semantic frameworks.
Unifying Space Mission Knowledge with NLP & Knowledge Graph (Vaticle)
Synopsis
The number of space missions being designed and launched worldwide is growing exponentially. Information on these missions, such as their objectives, orbit, or payload, is disseminated across various documents and datasets. Facilitating access to this information is key to accelerating the design of future missions, enabling experts to link an application to a mission, and following various stakeholders' activities.
This presentation introduces recent research done at the ESA to combine the latest Language Models with Knowledge Graphs, unifying our knowledge on space missions. Language Models such as GPT-3 and BERT are trained to understand the patterns of human (natural) language. These models have revolutionised the field of NLP, the branch of AI enabling machines to understand human language in all its complexity. In this work, key information on a mission is parsed from documents with the GPT-3 model, and the parsed data is then migrated to a TypeDB Knowledge Graph to be easily queried. Although this work focuses on an application in the space sector, the method can be transferred to other engineering fields.
Presenters
Dr. Audrey Berquand is a Research Fellow at the ESA. Her research aims at enhancing space mission design and knowledge management with text mining, NLP, and Knowledge Graphs. She was awarded her PhD in 2021 from the University of Strathclyde (Scotland) for her thesis on “Text Mining and Natural Language Processing for the Early Stages of Space Mission Design”. Audrey has a background in space systems engineering; she holds an MSc in Aerospace Engineering from the Royal Institute of Technology KTH (Sweden), and a diplôme d'ingénieur from the EPF Graduate School of Engineering (France). Before diving into the world of AI, she spent 3 years at ESA, involved in the early design phases of future Earth Observation missions.
Ana Victória Ladeira works with Knowledge Management at the ESA, using automated methods to exploit the information contained in the piles and piles of documents that ESA generates every day. With a Master's degree in Data Science from Maastricht University, Ana is particularly excited about how NLP methods can help large organizations connect different documents and highlight the bigger picture over a big universe of data sources, as well as about using Knowledge Graphs to help connect people to the expertise and information they need.
Talk Summary:
State of the art AI approaches can struggle to create solutions which provide accurate results that stand the test of time. They are also plagued by problems such as bias and a lack of explainability. Causal AI addresses these key problems and is at the center of the Geminos Causeway platform, which is built on TypeDB.
This webinar will give you an introduction to why causal AI is so important, and how you can start to use it to drive more value for your organisation.
Speaker: Stuart Frost
Stu is the CEO and founder of Geminos. Their focus is on building AI-driven solutions for mid-sized Smart Manufacturing and Logistics companies that are frustrated by their inability to digitalize their operations at sensible cost. Stu has 30 years’ experience in founding and leading successful data management and analytics startups, starting at 26 when he founded SELECT Software Tools and led the company to a NASDAQ IPO in 1996. He then founded DATAllegro in 2003, which was acquired by Microsoft.
Building a Cyber Threat Intelligence Knowledge Graph (Vaticle)
Knowledge of cyber threats is a key focus in cyber security. In this talk, we present TypeDB CTI, which is an open source threat intelligence platform to store and manage such knowledge. It enables Cyber Threat Intelligence (CTI) professionals to bring together their disparate CTI information into one platform, enabling them to more easily manage such data and discover new insights about cyber threats.
We will describe how we use TypeDB to represent STIX 2.1, the most widely used language and serialization format for exchanging cyber threat intelligence. We cover how we leverage TypeDB's modelling constructs such as type hierarchies, nested relations, hyper relations, unique attributes, and logical inference to build this threat intelligence platform.
Speaker: Tomás Sabat
Tomás is the Chief Operating Officer at Vaticle. He works closely with TypeDB's open source and enterprise users who use TypeDB to build applications in a wide number of industries including financial services, life sciences, cyber security and supply chain management. A graduate of the University of Cambridge, Tomás has spent the last seven years founding and building businesses in the technology industry.
Knowledge Graphs for Supply Chain Operations (Vaticle)
Agility in supply chain operations has never been so important, especially with today's nonlinear and complex world. That is why companies with supply chains need knowledge graphs.
So how do enterprises unleash the power of their own supply chain data to make smarter decisions? This is where bops comes into play. Bops activates supply chain data from existing operating systems (ERPs, Pos, OMS, etc) simplifying how operators optimize working capital in every decision.
In this session, bops will showcase a few use cases that portray the power of a knowledge graph to represent a supply chain network composed of an end to end product flow driven by actions among plants, customers and suppliers.
Supply chain operations visibility:
- Story of a Product and an SKU: from raw material to finished goods track trace & bill of material deviations
- Story of a Supplier – risk assessments – “the most influential supplier”
- Story of a Process – anomaly detection – “what went wrong?”
Join us for a lively discussion to learn how using knowledge graphs is already helping supply chain companies to better collect, unify, and activate their data.
Speaker: Jorge Risquez
Jorge is the Co-founder and CEO of bops, a headless supply chain intelligence platform helping manufacturers and distributors source, make, and deliver their products, and unlock working capital. Previously, Jorge spent a decade as a Supply Chain Consultant for Deloitte, where he worked with Fortune 500 companies such as Tyson and Cargill. In his spare time, he enjoys going for a run in Central Park and spending time with family and friends.
Building a Distributed Database with Raft (Vaticle)
Applications running in production have much more demanding requirements. Not only do they need to be correct, they also need to be "always-on", handle a much bigger user load, and be secure.
Meet TypeDB Cluster, the TypeDB database for production-scale, built using the Raft replication algorithm. Join us for a walk through the underlying architecture and what value it brings to developers running an application at scale.
Speaker: Ganeshwara Henanda
Ganesh leads the development of TypeDB Cluster while also managing other aspects such as infrastructure and project management. His day-to-day work involves building concurrent and distributed algorithms such as Raft and the Actor Model.
He graduated with an MSc in Grid Computing from the University of Amsterdam, and has built several large-scale distributed and real-time systems throughout his career.
Build your skills and learn how TypeDB's native inference engine works.
Good for:
- Beginners to TypeDB and TypeQL
- Those who have been using TypeDB and want a refresher on inference in TypeDB
- Experienced software engineers
- Those who want to better represent their domain in a model that allows for logical reasoning at the database level
Description:
TypeDB is capable of reasoning over data via pre-defined rules. TypeQL rules look for a given pattern in the database and when found, infer the given queryable fact. The inference provided by rules is performed at query (run) time. Rules not only allow shortening and simplifying of commonly-used queries, but also enable knowledge discovery and implementation of business logic at the database level.
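As a rough illustration of the rule mechanism described above, the sketch below defines a small schema and one rule in TypeDB 2.x-style TypeQL. Every type, role and rule name in it (gene, encoding, gene-target, drug-targets-gene-via-protein and so on) is hypothetical and chosen only for this example; it is not taken from the training material.

```typeql
define

# Hypothetical schema, for illustration only.
gene sub entity,
    plays encoding:encoding-gene,
    plays gene-target:targeted-gene;

protein sub entity,
    plays encoding:encoded-protein,
    plays interaction:interacted;

drug sub entity,
    plays interaction:interacting,
    plays gene-target:targeting-drug;

encoding sub relation,
    relates encoding-gene,
    relates encoded-protein;

interaction sub relation,
    relates interacting,
    relates interacted;

gene-target sub relation,
    relates targeting-drug,
    relates targeted-gene;

# "when" is the pattern to look for; "then" is the fact inferred for each match.
rule drug-targets-gene-via-protein:
    when {
        (interacting: $drug, interacted: $protein) isa interaction;
        (encoding-gene: $gene, encoded-protein: $protein) isa encoding;
    } then {
        (targeting-drug: $drug, targeted-gene: $gene) isa gene-target;
    };
```

With a rule like this in place, a query that matches `gene-target` relations can return inferred instances alongside any that were inserted explicitly. In TypeDB 2.x this assumes inference is enabled for the read transaction, since the facts are derived at query time rather than stored.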
Takeaways:
- Understanding of fundamental components of TypeDB's inference engine and how to write rules for your domain
- Write at least 1 rule for your use case
- Utilise the rule you wrote in a query
Tomás Sabat:
Tomás is the Chief Operating Officer at Vaticle, dedicated to building a strongly-typed database for intelligent systems. He works directly with TypeDB's open source and enterprise users so they can fulfil their potential with TypeDB and change the world. He focuses mainly on life sciences, cyber security, finance and robotics.
Join the TypeDB community to learn how we think about data modelling, and how TypeDB's expressivity allows you to model your domain based on logical and object-oriented programming principles.
Good for:
- Engineers, scientists, and technical executives
- Those in a technical field working with complex datasets, and building intelligent systems
- Anyone curious to learn about the expressive power of TypeDB's data model
Description:
We open this training with an exploration into what a schema looks like in TypeDB, starting with clarifying the motivation for the conceptual model in TypeDB, and its relationship to the Enhanced Entity-Relationship model.
Then we break things down a bit more philosophically, delving into: what does it mean to represent data in TypeDB, and how TypeDB allows you to think higher-level, as opposed to join-tables, columns, documents, vertices, edges, and properties.
Takeaways:
- Be able to articulate why TypeDB's data model is so beneficial for complex data, and why we use it to build intelligent systems
- Write a TypeDB schema in TypeQL
- Practice modelling one of your own domains
Tomás Sabat:
Tomás is the Chief Operating Officer at Vaticle, dedicated to building a strongly-typed database for intelligent systems. He works directly with TypeDB's open source and enterprise users so they can fulfil their potential with TypeDB and change the world. He focuses mainly on life sciences, cyber security, finance and robotics.
Using SQL to query relational databases is easy. As a declarative language, it’s straightforward to write queries and build powerful applications. However, relational databases struggle when working with complex data. When querying such data in SQL, challenges especially arise in the modelling and querying of the data.
For example, due to the large number of necessary JOINs, it forces us to write long and verbose queries. Such queries are difficult to write and prone to mistakes.
Just as SQL is the standard query language for relational databases, TypeQL is the query language of TypeDB. It's a declarative language, and allows us to model, query and reason over our data.
In this talk, we will look at how TypeQL compares to SQL. Why and when should you use TypeQL over SQL? How do we do outer/inner joins in TypeQL? We'll look at the common concepts, but mostly talk about the differences between the two.
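To make the comparison a little more concrete, here is a hedged sketch of how a query that needs two JOINs in SQL might read in TypeQL. The relational tables shown in the comments and the TypeDB types (person, company, employment, name, full-name) are assumptions made up for this example, as is the company name.

```typeql
# Rough SQL equivalent, assuming hypothetical tables company(id, name),
# person(id, full_name) and a join table employment(company_id, person_id):
#
#   SELECT p.full_name
#   FROM company c
#   JOIN employment e ON e.company_id = c.id
#   JOIN person p ON p.id = e.person_id
#   WHERE c.name = 'ExampleCorp';
#
# In TypeQL the relation is matched as a pattern, so no join keys are spelled out.
match
    $c isa company, has name "ExampleCorp";
    $p isa person, has full-name $n;
    (employer: $c, employee: $p) isa employment;
get $n;
```

The point of the sketch is only the shape of the query: the join table and its foreign keys become a single relation pattern with named roles.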
Speaker: Tomás Sabat
Tomás is the Chief Operating Officer at Vaticle. He works closely with TypeDB's open source and enterprise users who use TypeDB to build applications in a wide number of industries including financial services, life sciences, cybersecurity and supply chain management. A graduate of the University of Cambridge, Tomás has spent the last seven years founding and building businesses in the technology industry.
TypeDB Academy: Getting Started with Schema Design (Vaticle)
In this TypeDB Academy, we start by gaining an understanding of the fundamental components of TypeDB's type system and what makes it unique. We will see how we can download, install, and run TypeDB, and learn to perform basic database operations.
We'll then explore what a schema looks like in TypeDB, starting with clarifying the motivation for schema, the conceptual schema of TypeDB, and its relationship to the Enhanced Entity-Relationship model.
Good for:
- Beginners to TypeDB and TypeQL
- Those who have been using TypeDB and want a refresher on schema and TypeQL
- Experienced database administrators and software engineers
Takeaways:
- Understanding of fundamental components of TypeDB
- How to download, install, and run TypeDB on your computer
- Be able to articulate why schema is so beneficial when using TypeDB, why we use one, and how it enables a more expressive model
- Write a TypeDB schema in TypeQL
Comparing Semantic Web Technologies to TypeDB (Vaticle)
Semantic Web technologies enable us to represent and query for very complex and heterogeneous datasets. We can add semantics and reason over large bodies of data on the web. However, despite a lot of educational material available, they have failed to achieve mass adoption outside academia.
TypeDB works at a higher level of abstraction and enables developers to be more productive when working with complex data. TypeDB is easier to learn, reducing the barrier to entry and enabling more developers to access semantic technologies. Instead of using a myriad of standards and technologies, we just use one language - TypeQL.
In this talk we will:
- look at how TypeQL compares to Semantic Web standards, specifically RDF, SPARQL, RDFS, OWL and SHACL.
- cover questions such as, how do we represent hyper-relations in TypeDB? How does one use rdfs:domain and rdfs:range in TypeDB? And how do the modelling philosophies compare?
Speaker: Tomás Sabat
Tomás is the Chief Operating Officer at Vaticle. He works closely with TypeDB's open source and enterprise users who use TypeDB to build applications in a wide number of industries including financial services, life sciences, cyber security and supply chain management. A graduate of the University of Cambridge, Tomás has spent the last seven years founding and building businesses in the technology industry.
How might we utilise an actor-based execution model to build a powerful yet elegant reasoning engine?
The actor model is an asynchronous, inherently parallel framework that forms the basis of some of the most computationally heavy systems in the world. By leveraging this in an event-driven model, we can build an execution engine that makes efficient use of all available hardware resources to answer your reasoning queries.
We'll visit the key ideas behind actors, and then walk through how we break reasoning into neat, actor-sized building blocks. As we do this, it will become clear how our marriage of reasoning and actors naturally produces a scalable and elegant execution engine. By examining the problem of reasoning from an actor-based lens, we'll be able to better understand the complexities of reasoning and visualise bottlenecks and optimisations.
Intro to TypeDB and TypeQL | A strongly-typed database (Vaticle)
TypeDB is a strongly-typed database. It provides a rich and logical type system which breaks down complex problems into meaningful and logical systems, using TypeQL as its query language.
TypeDB allows you to model your domain based on logical and object-oriented principles. Composed of entity, relationship, and attribute types, as well as type hierarchies, roles, and rules, TypeDB allows you to think higher-level, as opposed to join-tables, columns, documents, vertices, and edges.
Types describe the logical structures of your data, allowing TypeDB to validate that your code inserts and queries data correctly. Query validation goes beyond static type-checking, and includes logical validation of meaningless queries. With strict type-checking errors, you have a dataset that you can trust.
Finally, TypeDB encodes your data for logical interpretation by its reasoning engine. It enables type-inference and rule-inference, which create logical abstractions of data. This allows for the discovery of facts and patterns that would otherwise be too hard to find.
With these abstractions, queries in the tens to hundreds of lines in SQL or NoSQL databases can be written in just a few lines in TypeQL – collapsing code complexity by orders of magnitude.
Join Tomás from the Vaticle team where he'll discuss the origins of TypeDB, the impetus for inventing a new query language, TypeQL, and why we are so excited about the future of software and intelligent systems.
Tomás Sabat:
Tomás is the Chief Operating Officer at Vaticle, dedicated to building a strongly typed database for intelligent systems. He works directly with TypeDB's open source and enterprise users so they can fulfil their potential with TypeDB and change the world. He focuses mainly on life sciences, cyber security, finance and robotics.
Graph Databases vs TypeDB | What you can't do with graphs (Vaticle)
Developing with graph databases has a number of challenges, such as the modelling of complex schemas, and maintaining data consistency in your database.
In this talk, we discuss how TypeDB addresses these challenges, as well as how it compares to property graph databases. We’ll look at how to read and write data, how to model complex domains, and TypeDB’s ability to infer new data.
The main differences between TypeDB and graph databases can be summarised as:
1. TypeDB provides a concept-level schema with a type system that fully implements the Entity-Relationship (ER) model. Graph databases, on the other hand, use vertices and edges without integrity constraints imposed in the form of a schema
2. TypeDB contains a built-in inference engine - graph databases don’t provide native inferencing capabilities
3. TypeDB is an abstraction over a graph, and leverages a graph database under the hood to create a higher-level model, while graph databases work at different levels of abstraction
Tomás Sabat
Tomás is the Chief Operating Officer at Vaticle. He works closely with TypeDB's open source and enterprise users who use TypeDB to build applications in a wide number of industries including financial services, life sciences, cyber security and supply chain management. A graduate of the University of Cambridge, Tomás has spent the last seven years founding and building businesses in the technology industry.
In this seminar we use TypeDB to open a window on the Pandora Papers, a massive 'data tsunami' based on 11.9 million leaked source documents obtained by the International Consortium of Investigative Journalists (ICIJ).
We will use an automated query builder to get an initial set of results, and then hop from node to node, exploring neighbours and mapping out a suspicious-looking network of offshore shell companies, officers and intermediaries.
Speaker: Jon Thompson
Jon has an MSc in Applied Mathematics and has worked for several years as a Data Scientist in high-throughput biological sequencing. He is the founder of Nodelab, which is on a mission to provide a fully-featured graphical user interface experience for TypeDB.
Heterogeneous data holds significant inherent context. We would like our machine learning models to understand this context, and utilise this ancillary but critical information to improve the accuracy and versatility of our models.
How can we systematically make use of context in Machine Learning?
We investigate knowledge modelling techniques which, applied with the right ML strategies, give us a promising approach for robustly handling heterogeneous data in large knowledge models. We aim to do this in a way that allows us to build any kind of Machine Learning model, including graph learning models like our KGCN.
Speaker: James Fletcher, Vaticle
James comes from a background of Computer Vision, specialising in automated diagnostics. As Principal Scientist at Vaticle, his mission is to demonstrate to the world how traditional symbolic approaches to AI, built-in to TypeDB, can be combined with present-day research in machine learning.
AI offers enormous potential in terms of improving the effectiveness and efficiency of robots. In recent years, data-driven AI has achieved remarkable success in specialised tasks such as speech recognition, machine translation and object detection. Despite these successes, there are also some clear signs of the limitations.
To address these limitations, we study the following three challenges:
1) How may robots operate under real-world conditions, which are dynamic and packed with unknown objects and situations?
2) How may robots be able to execute multiple tasks, instead of just one?
3) How can robots cooperate with other robots and with human team-mates?
In this talk the first two challenges will be addressed. Also, we will show how the knowledge-base of TypeDB enables us to tackle such challenges.
Speaker: Joris Sijs, Scientist @ TNO
Joris is a team-lead at TNO, where he develops and integrates software modules for the perception, awareness and planning of autonomous systems and autonomous robots. He recently started to extend this work with the development of knowledge-graphs (or cognitive databases), and how to combine this type of AI with the machine- and deep-learning solutions in AI.
Combining Causal and Knowledge Modeling for Digital Transformation (Vaticle)
Geminos has created a low-code digital transformation platform that combines causal and knowledge modeling. It uses TypeDB as its internal repository. Initial projects are in supply chain and smart manufacturing, with a focus on sustainability.
Speakers: Stuart Frost (CEO), Owen Frost (Analyst)
Stu is the CEO and founder of Geminos. Their focus is on building AI-driven solutions for mid-sized Smart Manufacturing and Logistics companies that are frustrated by their inability to digitalize their operations at sensible cost.
9. 360 real time patient views In silico clinical trials
Hyper-personalised medicine De novo drug design
Cell and gene therapy Ageing research
Immunotherapy mRNA technology
31. protein
Choose the major entities
Identify the relationship types
drug
interaction
Data Modelling
32. Choose the major entities
Identify the relationship types
Determine which attributes belong to which entities
protein
uniprot-id
drug
chembl-id
interaction
Data Modelling
33. Choose the major entities
Identify the relationship types
Determine which attributes belong to which entities
Normalise
protein
uniprot-id
drug
chembl-id
interaction
Data Modelling
37. Choose the major entities
Identify the relationship types
Determine which attributes belong to which entities
Normalise
X
protein
uniprot-id
drug
chembl-id
interaction
Data Modelling
39. protein
uniprot-id
drug
chembl-id
interaction
define
    protein sub entity,
        owns uniprot-id,
        plays interaction:interacted;
    drug sub entity,
        owns chembl-id,
        plays interaction:interacting;
    interaction sub relation,
        relates interacting,
        relates interacted;
    uniprot-id sub attribute, value string;
    chembl-id sub attribute, value string;
No need to normalise our data!
Data Modelling
41. drug
chembl-id
owns
protein
uniprot-id
drug
chembl-id
kinase ion-channel
interaction
define
    protein sub entity,
        owns uniprot-id,
        plays interaction:interacted;
    kinase sub protein;
    ion-channel sub protein;
    drug sub entity,
        owns chembl-id,
        plays interaction:interacting;
    interaction sub relation,
        relates interacting,
        relates interacted;
    uniprot-id sub attribute, value string;
    chembl-id sub attribute, value string;
Data Modelling
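A minimal sketch of loading data against the schema on this slide, assuming TypeDB 2.x. Because kinase and ion-channel are subtypes of protein, they inherit the ability to own uniprot-id and to play interaction:interacted without redeclaring either. The UniProt values below are placeholders, and the drug-protein pairings are invented for illustration rather than taken from any real record.

```typeql
insert
    # The ChEMBL id is the one used in the talk's example query.
    $drug isa drug, has chembl-id "CHEMBL1193654";

    # Placeholder identifiers, not real UniProt accessions.
    $kinase isa kinase, has uniprot-id "EXAMPLE-KINASE-ID";
    $channel isa ion-channel, has uniprot-id "EXAMPLE-CHANNEL-ID";

    # Subtypes of protein can play the "interacted" role inherited from protein.
    (interacting: $drug, interacted: $kinase) isa interaction;
    (interacting: $drug, interacted: $channel) isa interaction;
```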
43. interaction
protein
interacting interacted
match
    $drug isa drug, has chembl-id "CHEMBL1193654";
    $protein isa protein;
    (interacted: $protein, interacting: $drug) isa interaction;
get $protein;
drug
interacting interacted
Return kinases and ion-channels connected to drugs
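The query on this slide returns kinases and ion-channels because `isa protein` also matches instances of protein's subtypes. As a hedged sketch of two variations under the same schema (TypeDB 2.x syntax assumed), the first query narrows the match to one subtype and the second excludes subtypes altogether. They are two separate queries, shown together only for comparison.

```typeql
# Variation 1: only kinases that interact with the drug, returning their ids.
match
    $drug isa drug, has chembl-id "CHEMBL1193654";
    $kinase isa kinase, has uniprot-id $id;
    (interacted: $kinase, interacting: $drug) isa interaction;
get $id;

# Variation 2: "isa!" matches direct instances of protein only,
# so kinases and ion-channels are not returned.
match
    $protein isa! protein;
get $protein;
```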
51. Drug Discovery
Data Harmonisation
Precision Medicine
Competitive Intelligence
Precision Medicine
Supply Chain Optimisation
Clinical Trial
Cohort Selection
Disease Understanding
52. Questions we can ask
Who have run clinical trials on Ebola who also own patents?
What are the most likely gene targets for Melanoma?
Given someone’s biological and genetic profile, what clinical trials are they eligible for?
73. Competitive Intelligence: Who have run clinical trials on Ebola who also own patents?
Drug Discovery: What are the most likely gene targets for Melanoma?
Precision Medicine: Given someone’s biological and genetic profile, what clinical trials are they eligible for?
82. Who have run clinical trials on Ebola who also own patents?
Diagram: Public Data, Unstructured data, Structured Data; Molecular, Clinical Trials, Patents, Disease, …; Text Mining (coreNLP, …); TypeDB Loader, Custom Loaders, Connectors, …; Client Drivers (Python, Java, NodeJS, etc); Output: Competitive Insights
96. Who have run clinical trials on Ebola who also own patents?
Diagram: person (owner) and patent (owned) in ownership; person (investigator) and clinical-trial (investigated) in investigation; clinical-trial (studying) and disease (studied, name: "Ebola") in study
match
  $person isa person;
  $patent isa patent;
  $trial isa clinical-trial;
  $disease isa disease, has name "Ebola";
  (owner: $person, owned: $patent) isa ownership;
  (investigator: $person, investigated: $trial) isa investigation;
  (studying: $trial, studied: $disease) isa study;
get $person;
97. Who have run clinical trials on Ebola who also own patents?
Diagram: person (owner) and patent (owned) in ownership; person (investigator) and clinical-trial (investigated) in investigation; clinical-trial (studying) and disease (studied, name: "Ebola") in study
match
  $person isa person, has name $name;
  $patent isa patent;
  $trial isa clinical-trial;
  $disease isa disease, has name "Ebola";
  (owner: $person, owned: $patent) isa ownership;
  (investigator: $person, investigated: $trial) isa investigation;
  (studying: $trial, studied: $disease) isa study;
get $name;
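The query joins three relations on shared variables: `$person` appears in both ownership and investigation, and `$trial` appears in both investigation and study. A minimal Python sketch of the same join over plain tuples is below. It does not use the TypeDB driver, and every name, patent and trial in the sample data is invented for illustration.

```python
# Invented sample facts; none of them come from the talk.
ownership = [("Alice", "patent-1"), ("Bob", "patent-2")]  # (owner, owned)
investigation = [  # (investigator, investigated)
    ("Alice", "trial-1"),
    ("Carol", "trial-2"),
    ("Bob", "trial-3"),
]
study = [  # (studying, studied)
    ("trial-1", "Ebola"),
    ("trial-2", "Ebola"),
    ("trial-3", "Melanoma"),
]


def investigators_with_patents(disease_name, ownership, investigation, study):
    """Names of people who investigated a trial on the disease and own a patent."""
    # (studying: $trial, studied: $disease) with the disease name fixed
    trials = {trial for trial, disease in study if disease == disease_name}
    # (investigator: $person, investigated: $trial)
    investigators = {person for person, trial in investigation if trial in trials}
    # (owner: $person, owned: $patent)
    owners = {person for person, _patent in ownership}
    # $person must satisfy both relations
    return sorted(investigators & owners)


names = investigators_with_patents("Ebola", ownership, investigation, study)
print(names)
```

Carol investigated an Ebola trial but owns no patent, and Bob owns a patent but investigated a different disease, so only Alice satisfies every constraint in this sample.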
102. Diagram: Public Data, Unstructured data, Structured Data; Legacy data, Lab data, Internal Data; Text Mining (coreNLP, …); TypeDB Loader, Custom Loaders, Connectors, …; Client Drivers (Python, Java, NodeJS, etc); KGCN
103. Diagram: TypeQL Query, Subgraph (Query Result, a subgraph), Graph Learning Algorithm (Learner), Predictions
match
  $p isa protein;
  $d isa disease, has disease-group "Cancer", has disease-id $did;
  $t isa tissue;
  $g isa gene, has gene-id $gid;
  ($p, $t);
  ($t, $d);
  ($g, $t);
104. What are the most likely gene targets for Melanoma?
Diagram: Public Data, Unstructured data, Structured Data; Legacy data, Lab data, Internal Data; Text Mining (coreNLP, …); TypeDB Loader, Custom Loaders, Connectors, …; Client Drivers (Python, Java, NodeJS, etc); KGCN; Output: List of targets
108. What are the most likely gene targets for Melanoma?
> match $g isa gene, has gene-id $gid;
  $d isa disease, has disease-name "melanoma";
  ($g, $d) isa gene-disease-association, has kgcn-prob $p;
  get $gid; sort desc $p;
{$gid "DDX11L1" isa gene-id;}
{$gid "WASH7P" isa gene-id;}
{$gid "MIR1302-10" isa gene-id;}
{$gid "MIR1302-11" isa gene-id;}
{$gid "OR4F5" isa gene-id;}
{$gid "FAM138D" isa gene-id;}
{$gid "FAM41C" isa gene-id;}
{$gid "NOC2L" isa gene-id;}
{$gid "HES4" isa gene-id;}
{$gid "RNF223" isa gene-id;}
{$gid "TNFRSF4" isa gene-id;}
...
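The query ranks gene-disease associations by the probability attribute `kgcn-prob` and returns the gene ids in descending order. A minimal Python sketch of that ranking step is below, assuming the associations have already been fetched as (gene-id, probability) pairs. The gene ids and probabilities are placeholders invented for illustration, not results from the talk, and `rank_targets` is a hypothetical helper.

```python
# Placeholder (gene-id, kgcn-prob) pairs; the values are invented.
associations = [("GENE-A", 0.42), ("GENE-B", 0.91), ("GENE-C", 0.67)]


def rank_targets(associations, top_n=None):
    """Return gene ids ordered by descending probability, optionally truncated."""
    ranked = sorted(associations, key=lambda pair: pair[1], reverse=True)
    gene_ids = [gene_id for gene_id, _prob in ranked]
    return gene_ids if top_n is None else gene_ids[:top_n]


ranked = rank_targets(associations)
print(ranked)
```

Passing `top_n` keeps only the highest-scoring candidates, which is often what a target shortlist needs.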
117. Given someone’s biological and genetic profile, what clinical trials are they eligible for?
Diagram: Public Data, Unstructured data, Structured Data; Molecular, Clinical Trials, …; Precision DBs, …; Text Mining (coreNLP, …); TypeDB Loader, Custom Loaders, Connectors, …; Client Drivers (Python, Java, NodeJS, etc); Output: Personalised-therapies
118. Given someone’s biological and genetic profile, what clinical trials are they eligible for?
Diagram: person, personalised-therapy, trial
match
  $person isa person, has name "Alice";
  $trial isa clinical-trial, has nct-id $nct;
  ($person, $trial) isa personalised-therapy;
get $nct;
119. Given someone’s biological and genetic profile, what clinical trials are they eligible for?
Relevance for a clinical trial: Patient has the same gene and variant mentioned in the clinical trial
Eligibility for a clinical trial: Patient is within the right age bracket and gender for the trial
124. Given someone’s biological and genetic profile, what clinical trials are they eligible for?
Relevance for a clinical trial: Patient has the same gene and variant mentioned in the clinical trial
relevant-trial
128. Diagram: person and trial linked by relevant-trial; gene (symbol: $gs) and variant (symbol: $vs), each linked to person by assoc and to trial by mention
rule trial-participant-relevance:
when {
  $person isa person;
  $gene isa gene;
  $variant isa variant;
  $trial isa clinical-trial;
  ($person, $gene);
  ($person, $variant);
  ($trial, $gene);
  ($trial, $variant);
} then {
  ($person, $trial) isa relevant-trial;
};
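The relevance rule infers a relevant-trial link whenever a person and a trial share a gene and also share a variant. A minimal Python sketch of that condition follows, using plain dictionaries instead of a database. All names, genes, variants and trial ids are placeholders invented for illustration, and `relevant_trials` is a hypothetical helper rather than part of any TypeDB API.

```python
# Placeholder facts: which genes and variants each person or trial is linked to.
person_genes = {"Alice": {"gene-1"}, "Bob": {"gene-2"}}
person_variants = {"Alice": {"variant-1"}, "Bob": {"variant-2"}}
trial_genes = {"trial-1": {"gene-1"}, "trial-2": {"gene-2"}}
trial_variants = {"trial-1": {"variant-1"}, "trial-2": {"variant-1"}}


def relevant_trials(person):
    """Trials sharing at least one gene and at least one variant with the person."""
    genes = person_genes.get(person, set())
    variants = person_variants.get(person, set())
    return sorted(
        trial
        for trial in trial_genes
        # ($person, $gene); ($trial, $gene);
        if genes & trial_genes[trial]
        # ($person, $variant); ($trial, $variant);
        and variants & trial_variants.get(trial, set())
    )


print(relevant_trials("Alice"))
print(relevant_trials("Bob"))
```

As in the rule on the slide, the gene and the variant are matched independently; the sketch does not check that the variant belongs to the gene.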
129. Given someone’s biological and genetic profile, what clinical trials are they eligible for?
Eligibility for a clinical trial: Patient is within the right age bracket and gender for the trial
eligible-trial
133. Diagram: person (age, gender) and trial (min-age, max-age, gender) linked by eligible-trial; disease linked to both person and trial by assoc; age compared with min-age (greater than) and max-age (less than)
rule trial-participant-eligibility:
when {
  $person isa person, has age $age, has gender $gender;
  $trial isa clinical-trial,
    has min-age <= $age,
    has max-age >= $age,
    has gender = $gender;
  $disease isa disease;
  ($disease, $person);
  ($disease, $trial);
} then {
  ($person, $trial) isa eligible-trial;
};
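The eligibility rule combines three checks: the person's age lies between the trial's min-age and max-age, the gender matches, and the person and the trial are linked to the same disease. A minimal Python sketch of those checks is below. The `Person` and `Trial` classes, the field names and all sample values are hypothetical, and the age bounds are treated as inclusive because the rule uses `<=` and `>=`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str
    age: int
    gender: str
    diseases: frozenset


@dataclass(frozen=True)
class Trial:
    trial_id: str
    min_age: int
    max_age: int
    gender: str
    diseases: frozenset


def is_eligible(person, trial):
    """Apply the three conditions of the eligibility rule."""
    in_age_bracket = trial.min_age <= person.age <= trial.max_age
    same_gender = trial.gender == person.gender
    shared_disease = bool(person.diseases & trial.diseases)
    return in_age_bracket and same_gender and shared_disease


# Invented sample data for illustration only.
alice = Person("Alice", 34, "female", frozenset({"melanoma"}))
trials = [
    Trial("trial-1", 18, 65, "female", frozenset({"melanoma"})),
    Trial("trial-2", 40, 70, "female", frozenset({"melanoma"})),
    Trial("trial-3", 18, 65, "male", frozenset({"melanoma"})),
    Trial("trial-4", 18, 65, "female", frozenset({"ebola"})),
]

eligible = [trial.trial_id for trial in trials if is_eligible(alice, trial)]
print(eligible)
```

Each of the three rejected trials fails exactly one condition (age, gender, disease), which makes the sample a quick check that every clause of the rule is doing work.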
135. Drug Discovery
Data Harmonisation
Precision Medicine
Competitive Intelligence
Precision Medicine
Supply Chain Optimisation
Clinical Trial
Cohort Selection
Disease Understanding