Cyber security companies collect massive amounts of heterogeneous data from a huge number of sources. This data describes hundreds of different data types, such as vulnerabilities, observables, incidents, and malware. While the data is highly complex (with many types of relations, type hierarchies, and rules), its structure doesn't change significantly between organisations. Without a publicly available data model, however, organisations end up modelling the same data in different ways: reinventing the wheel and wasting resources. This modelling complexity makes scaling cyber security applications extremely difficult.
That's why efforts are underway to provide ready-made solutions for typical cyber security use cases, with the flexibility to expand to the specific requirements of individual setups. Together, these efforts have created many inter-related knowledge silos (e.g. CVE, CAPEC, CWE, CVSS, Cocoa, MITRE, VERIS, STIX, MAEC). To unify these silos, researchers have proposed various ontologies at different levels of granularity, from specific use cases like defence exercises to more comprehensive ones like the UCO project.
During this talk, you’ll learn about the OmnibusCyber Project, an open-source, ready-made solution built on TypeDB that aggregates cyber security knowledge silos. TypeDB offers the expressivity, safety, and inference properties required to implement a knowledge graph without the complexity associated with the OWL/RDF semantic frameworks.
Knowledge Graphs for Supply Chain Operations (Vaticle)
Agility in supply chain operations has never been more important, especially in today's nonlinear and complex world. That is why companies with supply chains need knowledge graphs.
So how do enterprises unleash the power of their own supply chain data to make smarter decisions? This is where bops comes into play. Bops activates supply chain data from existing operating systems (ERPs, POS, OMS, etc.), simplifying how operators optimize working capital in every decision.
In this session, bops will showcase a few use cases that portray the power of a knowledge graph to represent a supply chain network: an end-to-end product flow driven by actions among plants, customers, and suppliers.
Supply chain operations visibility:
- Story of a Product and an SKU: from raw materials to finished goods, track and trace, and bill-of-materials deviations
- Story of a Supplier – risk assessments – “the most influential supplier”
- Story of a Process – anomaly detection – “what went wrong?”
Join us for a lively discussion to learn how using knowledge graphs is already helping supply chain companies to better collect, unify, and activate their data.
Speaker: Jorge Risquez
Jorge is the Co-founder and CEO of bops, a headless supply chain intelligence platform helping manufacturers and distributors source, make, and deliver their products, and unlock working capital. Previously, Jorge spent a decade as a Supply Chain Consultant for Deloitte, where he worked with Fortune 500 companies such as Tyson and Cargill. In his spare time, he enjoys going for a run in Central Park and spending time with family and friends.
Building a Cyber Threat Intelligence Knowledge Graph (Vaticle)
Knowledge of cyber threats is a key focus in cyber security. In this talk, we present TypeDB CTI, an open-source threat intelligence platform to store and manage such knowledge. It enables Cyber Threat Intelligence (CTI) professionals to bring their disparate CTI information together in one platform, making it easier to manage the data and discover new insights about cyber threats.
We will describe how we use TypeDB to represent STIX 2.1, the most widely used language and serialization format for exchanging cyber threat intelligence. We cover how we leverage TypeDB's modelling constructs, such as type hierarchies, nested relations, hyper-relations, unique attributes, and logical inference, to build this threat intelligence platform.
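TypeDB schemas themselves are written in TypeQL; as a rough, language-neutral illustration of what a type hierarchy over STIX 2.1 objects looks like (this is a hypothetical sketch, not Vaticle's actual schema), consider:

```python
from dataclasses import dataclass

# Illustrative only: a tiny slice of STIX 2.1 domain objects modelled as a
# type hierarchy, with relationships reified as first-class objects.
# Class and field names mirror STIX conventions but are a sketch.

@dataclass
class StixObject:
    stix_id: str              # unique attribute, e.g. "malware--<uuid>"
    spec_version: str = "2.1"

@dataclass
class Malware(StixObject):
    name: str = ""
    is_family: bool = False

@dataclass
class Indicator(StixObject):
    pattern: str = ""

@dataclass
class Relationship(StixObject):
    relationship_type: str = ""   # e.g. "indicates"
    source_ref: str = ""
    target_ref: str = ""

emotet = Malware(stix_id="malware--0001", name="Emotet", is_family=True)
ioc = Indicator(stix_id="indicator--0002",
                pattern="[file:hashes.MD5 = 'abc']")
link = Relationship(stix_id="relationship--0003",
                    relationship_type="indicates",
                    source_ref=ioc.stix_id, target_ref=emotet.stix_id)
```

In TypeDB the same structure is declared once in the schema, and logical inference rules can then derive new relationships (e.g. transitive attribution) at query time.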
Speaker: Tomás Sabat
Tomás is the Chief Operating Officer at Vaticle. He works closely with TypeDB's open source and enterprise users who use TypeDB to build applications in a wide number of industries including financial services, life sciences, cyber security and supply chain management. A graduate of the University of Cambridge, Tomás has spent the last seven years founding and building businesses in the technology industry.
The Intuit Data Ecosystem supports unique consumer and small-business assets at scale and handles petabytes of customer data. We have 8M active small-business customers and 16M paid workers who use Intuit QuickBooks and QuickBooks Payroll products. A huge customer base and large volumes of data constantly challenge data teams on the freshness and correctness of data. This presentation covers such problems we faced at Intuit, along with the data observability model we follow to detect, cure, and prevent data issues. We would like to provide deep insights into the implementations and the impact of some of the great work done by Intuit in this direction.
Unifying Space Mission Knowledge with NLP & Knowledge Graph (Vaticle)
Synopsis
The number of space missions being designed and launched worldwide is growing exponentially. Information on these missions, such as their objectives, orbit, or payload, is scattered across various documents and datasets. Facilitating access to this information is key to accelerating the design of future missions, enabling experts to link an application to a mission, and following various stakeholders' activities.
This presentation introduces recent research done at the ESA to combine the latest Language Models with Knowledge Graphs, unifying our knowledge on space missions. Language Models such as GPT-3 and BERT are trained to understand the patterns of human (natural) language. These models have revolutionised the field of NLP, the branch of AI enabling machines to understand human language in all its complexity. In this work, key information on a mission is parsed from documents with the GPT-3 model, and the parsed data is then migrated to a TypeDB Knowledge Graph to be easily queried. Although this work focuses on an application in the space sector, the method can be transferred to other engineering fields.
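The pipeline described above can be caricatured as: a language model turns free text into structured fields, and those fields are then loaded into the graph. A minimal Python sketch of that parse-then-load pattern (the extractor is stubbed; in the ESA work this role is played by GPT-3, and the output would be inserted into TypeDB rather than a Python list):

```python
# Sketch of the parse-then-load pattern. `extract_mission` stands in for
# an LLM call; the field names and example values are hypothetical.

def extract_mission(text: str) -> dict:
    # Stub: a real implementation would prompt a model to fill these fields.
    return {"mission": "Sentinel-2", "orbit": "sun-synchronous",
            "payload": "MSI multispectral imager"}

def to_edges(record: dict) -> list:
    # Turn the structured record into graph edges (subject, relation, object).
    mission = record["mission"]
    return [(mission, "has_orbit", record["orbit"]),
            (mission, "carries", record["payload"])]

doc = "Sentinel-2 flies in a sun-synchronous orbit carrying the MSI..."
edges = to_edges(extract_mission(doc))
```

Once in the graph, the same fields extracted from many documents become queryable together, which is the point of the unification step.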
Presenters
Dr. Audrey Berquand is a Research Fellow at the ESA. Her research aims at enhancing space mission design and knowledge management with text mining, NLP, and Knowledge Graphs. She was awarded her PhD in 2021 from the University of Strathclyde (Scotland) for her thesis on “Text Mining and Natural Language Processing for the Early Stages of Space Mission Design”. Audrey has a background in space systems engineering: she holds an MSc in Aerospace Engineering from the Royal Institute of Technology KTH (Sweden) and a diplôme d'ingénieur from the EPF Graduate School of Engineering (France). Before diving into the world of AI, she spent 3 years at ESA involved in the early design phases of future Earth Observation missions.
Ana Victória Ladeira works with Knowledge Management at the ESA, using automated methods to exploit the information contained in the piles and piles of documents that ESA generates every day. With a Masters degree in Data Science from Maastricht University, Ana is particularly excited about how NLP methods can help large organizations connect different documents and highlight the bigger picture over a big universe of data sources, as well as using Knowledge Graphs to help connect people to the expertise and information they need.
Optimizing Your Supply Chain with the Neo4j Graph (Neo4j)
With the world’s supply chain system in crisis, it’s clear that better solutions are needed. Digital twins built on knowledge graph technology allow you to achieve an end-to-end view of the process, supporting real-time monitoring of critical assets.
The perfect couple: Uniting Large Language Models and Knowledge Graphs for En... (Neo4j)
Large Language Models are amazing, but they are also black-box models that often fail to capture and accurately represent factual knowledge. Knowledge graphs, by contrast, are structural knowledge models that explicitly represent knowledge and even allow us to detect implicit relationships. In this talk we will demonstrate how LLMs can be improved by Knowledge Graphs, and how LLMs can augment Knowledge Graphs. A perfect couple!
“Artificial Intelligence” covers a wide range of technologies today, including those that enable machine vision, affective computing, deep learning, and natural language processing. As advances increase, so do expectations. We now see a rush to add “AI inside” to applications and appliances in almost every domain. The reality is that some firms will have mega-hits with AI-enabled applications, and many more will suffer setbacks based on flawed adoption strategies.
This webinar will present an assessment of key AI technologies today, and help participants identify promising applications based on matching requirements to mature-enough technologies.
MLOps and Data Quality: Deploying Reliable ML Models in Production (Provectus)
Looking to build a robust machine learning infrastructure to streamline MLOps? Learn from Provectus experts how to ensure the success of your MLOps initiative by implementing Data QA components in your ML infrastructure.
For most organizations, the development of multiple machine learning models, their deployment and maintenance in production are relatively new tasks. Join Provectus as we explain how to build an end-to-end infrastructure for machine learning, with a focus on data quality and metadata management, to standardize and streamline machine learning life cycle management (MLOps).
Agenda
- Data Quality and why it matters
- Challenges and solutions of Data Testing
- Challenges and solutions of Model Testing
- MLOps pipelines and why they matter
- How to expand validation pipelines for Data Quality
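The Data QA components in the agenda above boil down to automated checks that gate a pipeline run on incoming data. A minimal illustration of such a validation step (the column names and threshold here are made up, not Provectus's framework):

```python
# Toy data-quality gate: fail a pipeline run when a batch violates
# simple expectations. Columns and thresholds are hypothetical.

def validate_batch(rows, required=("user_id", "amount"), max_null_rate=0.05):
    issues = []
    for col in required:
        nulls = sum(1 for r in rows if r.get(col) is None)
        if nulls / max(len(rows), 1) > max_null_rate:
            issues.append(f"{col}: null rate too high ({nulls}/{len(rows)})")
    return issues

batch = [{"user_id": 1, "amount": 9.5},
         {"user_id": 2, "amount": None},
         {"user_id": 3, "amount": 4.0}]
problems = validate_batch(batch)   # 1/3 nulls in "amount" exceeds 5%
```

In a real MLOps pipeline the same idea is expressed with a dedicated validation library and the check runs before training or serving ever sees the data.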
Observability for Data Pipelines With OpenLineage (Databricks)
Data is increasingly becoming core to many products, whether through recommendations for users, insights into how they use the product, or machine learning to improve the experience. This creates a critical need for reliable data operations and an understanding of how data flows through our systems. Data pipelines must be auditable, reliable, and run on time. This proves particularly difficult in a constantly changing, fast-paced environment.
Collecting this lineage metadata as data pipelines run provides an understanding of the dependencies between the many teams consuming and producing data, and of how constant changes impact them. It is the underlying foundation that enables the many use cases related to data operations. The OpenLineage project is an API that standardizes this metadata across the ecosystem, reducing complexity and duplicate work in collecting lineage information. It enables many projects and consumers of lineage in the ecosystem, whether they focus on operations, governance, or security.
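Concretely, OpenLineage standardizes a JSON event emitted when a run starts, completes, or fails. A hand-rolled sketch of the event shape (field names follow the OpenLineage spec at a high level, but the producer URL and IDs are placeholders; real code should use an official client and the published JSON schema):

```python
import json
from datetime import datetime, timezone

# Minimal OpenLineage-style run event, built by hand for illustration only.

def run_event(event_type, run_id, job_name, inputs, outputs):
    return {
        "eventType": event_type,                      # START / COMPLETE / FAIL
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": run_id},
        "job": {"namespace": "my-pipeline", "name": job_name},
        "inputs": [{"namespace": "warehouse", "name": n} for n in inputs],
        "outputs": [{"namespace": "warehouse", "name": n} for n in outputs],
        "producer": "https://example.com/my-scheduler",  # hypothetical
    }

event = run_event("COMPLETE", "run-0001", "daily_orders",
                  inputs=["raw.orders"], outputs=["mart.orders_daily"])
payload = json.dumps(event)   # what would be POSTed to the lineage backend
```

Because every tool emits the same shape, a backend such as Marquez can stitch events from schedulers, warehouses, and Spark jobs into one dependency graph.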
Marquez is an open-source project, part of the LF AI & Data Foundation, which instruments data pipelines to collect lineage and metadata and enable those use cases. It implements the OpenLineage API and provides context by making dependencies across organizations and technologies visible as they change over time.
MITRE ATT&CK is quickly gaining traction and is becoming an important standard for assessing the overall cyber security posture of an organization. Tools like ATT&CK Navigator facilitate corporate adoption and allow for a holistic overview of attack techniques and of how the organization is preventing and detecting them. Furthermore, many vendors, technologies, and open-source initiatives are aligning with ATT&CK. Join Erik Van Buggenhout in this presentation, where he will discuss how MITRE ATT&CK can be leveraged in the organization as part of your overall cyber security program, with a focus on adversary emulation.
Erik Van Buggenhout is the lead author of SANS SEC599 - Defeating Advanced Adversaries - Purple Team Tactics & Kill Chain Defenses. Next to his activities at SANS, Erik is also a co-founder of NVISO, a European cyber security firm with offices in Brussels, Frankfurt and Munich.
Tracking Noisy Behavior and Risk-Based Alerting with ATT&CK (MITRE ATT&CK)
From ATT&CKcon 3.0
By Haylee Mills, Splunk
Having ATT&CK to identify threats, prioritize data sources, and improve security posture has been a huge step forward for our industry, but how do we actualize those insights for better detection and alerting? By shifting to observations of behavior over one-to-one direct alerts, noisy datasets become valuable treasure troves with ATT&CK metadata. Additionally, we can begin to look at detection and threat hunting on behavior instead of users or systems. In this presentation, Haylee will discuss the shift in mindset and the nuts and bolts of detections that leverage this metadata in Splunk, but the concept can be applied with custom tools to any valuable security dataset.
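The core of the risk-based alerting idea is that each noisy, low-fidelity observation contributes a small risk score tagged with ATT&CK metadata, and an alert fires only when an entity accumulates enough diverse behavior. A toy version of that aggregation (technique IDs are real ATT&CK identifiers; the scores and threshold are invented, and a Splunk deployment would express this as searches over a risk index):

```python
from collections import defaultdict

# Toy risk-based alerting: observations add risk to an entity; alert when
# accumulated risk crosses a threshold across multiple distinct techniques.

observations = [
    {"host": "wks-12", "technique": "T1059", "risk": 30},  # scripting interpreter
    {"host": "wks-12", "technique": "T1053", "risk": 25},  # scheduled task/job
    {"host": "wks-12", "technique": "T1105", "risk": 30},  # ingress tool transfer
    {"host": "srv-01", "technique": "T1059", "risk": 30},
]

def entities_to_alert(obs, threshold=70):
    risk = defaultdict(int)
    techniques = defaultdict(set)
    for o in obs:
        risk[o["host"]] += o["risk"]
        techniques[o["host"]].add(o["technique"])
    # require both accumulated risk and more than one distinct technique
    return [h for h, r in risk.items()
            if r >= threshold and len(techniques[h]) > 1]

alerts = entities_to_alert(observations)   # only wks-12 qualifies
```

A single noisy hit never pages anyone; a host exhibiting several ATT&CK behaviors in a window does.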
From ATT&CKcon 3.0
By Fred Frey and Jonathan Mulholland, SnapAttack
Atomic Red Team and Sigma are the largest open-source attack simulation and analytic projects. Many organizations use one or both internally for security controls validation or to supplement their detections and alerts. Building on the work of these two great communities, we smashed (scientific term) the attacks and analytics together and applied data science to analyze the results. We'll describe our methodology and testing framework, show the real-world MITRE ATT&CK coverage and gaps, and discuss our algorithms for calculating analytic similarity, identifying log sources for a technique, and determining the best analytics to deploy to maximize ATT&CK coverage.
This project aims to:
- Bring a measurable testing rigor to community analytics to improve adoption
- Test every analytic against every attack, validating the true positive detection
- Understand the log sources required to detect specific attack techniques
- Apply data science to identify analytic similarity (reduce community duplication)
- Identify gaps between the projects: analytics without attack simulations, attack simulations without detections, missing or incorrect MITRE ATT&CK labels, etc.
- Automate the process so insights can stay up to date with new attack/analytic contributions over time
- Share our analysis back to the community to improve these projects
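One of the aims above, identifying analytic similarity to reduce duplication, can be approximated by comparing the sets of attack tests each analytic fires on. A hedged sketch using Jaccard similarity (the analytic and test names are invented, and the project's actual algorithm may differ):

```python
# Toy analytic-similarity measure: two detections that fire on the same
# attack simulations are candidates for de-duplication. Jaccard over
# hit-sets is one simple way to quantify this.

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

hits = {  # analytic -> attack tests it detected (hypothetical data)
    "sigma_powershell_encoded": {"atomic_T1059_001_a", "atomic_T1059_001_b"},
    "sigma_ps_b64_cmdline":     {"atomic_T1059_001_a", "atomic_T1059_001_b"},
    "sigma_schtasks_create":    {"atomic_T1053_005_a"},
}

sim = jaccard(hits["sigma_powershell_encoded"], hits["sigma_ps_b64_cmdline"])
# identical hit-sets -> similarity 1.0, flag for review
```

The same hit matrix also answers the coverage question: the union of all hit-sets versus the full catalog of tests is the project's measured ATT&CK coverage.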
AI offers enormous potential for improving the effectiveness and efficiency of robots. In recent years, data-driven AI has achieved remarkable success in specialised tasks such as speech recognition, machine translation, and object detection. Despite these successes, there are also clear signs of its limitations.
In search of a solution to these limitations, we study the following three challenges:
1) How may robots operate under real-world conditions, which are dynamic and packed with unknown objects and situations?
2) How may robots be able to execute multiple tasks, instead of just one?
3) How can robots cooperate with other robots and with human team-mates?
In this talk, the first two challenges will be addressed. We will also show how TypeDB's knowledge base enables us to tackle such challenges.
Speaker: Joris Sijs, Scientist @ TNO
Joris is a team lead at TNO, where he develops and integrates software modules for the perception, awareness, and planning of autonomous systems and robots. He recently started extending this work with the development of knowledge graphs (or cognitive databases), and with combining this type of AI with machine- and deep-learning solutions.
Slides: Knowledge Graphs vs. Property Graphs (DATAVERSITY)
We are in the era of graphs. Graphs are hot. Why? Flexibility is one strong driver: Heterogeneous data, integrating new data sources, and analytics all require flexibility. Graphs deliver it in spades.
Over the last few years, a number of new graph databases came to market. As we start the next decade, dare we say “the semantic twenties,” we also see vendors that never before mentioned graphs starting to position their products and solutions as graphs or graph-based.
Graph databases are one thing, but “Knowledge Graphs” are an even hotter topic. We are often asked to explain Knowledge Graphs.
Today, there are two main graph data models:
• Property Graphs (also known as Labeled Property Graphs)
• RDF Graphs (Resource Description Framework) aka Knowledge Graphs
Other graph data models are possible as well, but over 90 percent of the implementations use one of these two models. In this webinar, we will cover the following:
I. A brief overview of each of the two main graph models noted above
II. Differences in Terminology and Capabilities of these models
III. Strengths and Limitations of each approach
IV. Why Knowledge Graphs provide a strong foundation for Enterprise Data Governance and Metadata Management
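The practical difference between the two models shows up in how a single fact with metadata is encoded. A rough Python illustration of both shapes (not a formal serialization of either model; the names are hypothetical):

```python
# The same fact -- "Alice knows Bob since 2019" -- in both graph models.

# Property graph: the edge itself carries properties.
pg_edge = {"from": "alice", "type": "KNOWS", "to": "bob",
           "properties": {"since": 2019}}

# RDF: plain triples cannot attach data to an edge directly, so the
# relationship is reified as a statement node (one common workaround).
rdf_triples = [
    ("ex:stmt1", "rdf:subject",   "ex:alice"),
    ("ex:stmt1", "rdf:predicate", "ex:knows"),
    ("ex:stmt1", "rdf:object",    "ex:bob"),
    ("ex:stmt1", "ex:since",      "2019"),
]
```

Property graphs make edge attributes cheap and direct; RDF trades that convenience for global identifiers, standard semantics, and interoperability, which is part of why the two ecosystems feel so different in practice.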
Introduction to DataOps and AIOps (or MLOps) (Adrien Blind)
This presentation introduces the audience to the DataOps and AIOps practices. It deals with organizational and tech aspects, and provides hints to start your data journey.
Driving Intelligence with MITRE ATT&CK: Leveraging Limited Resources to Build... (MITRE ATT&CK)
From ATT&CKcon 4.0
By Scott Roberts, Interpres Security
"Building threat intelligence is challenging, even under the most ideal circumstances. But what if you are even more limited in your resources? You are part of a small (but skilled) team, with high expectations, and people are relying on you to make business-critical decisions…all the time! What do you do in that situation? Turn a Toyota Tercel into a tank, of course.
The Interpres Security threat intelligence team found itself in that exact situation. Wanting to leverage the MITRE ATT&CK catalog in creating a comprehensive and timely threat intelligence repository, the Interpres team built a series of tools, processes, and paradigms that we call Intelligence Engineering. In this talk, we’ll examine how we combined ATT&CK, STIX2, the Vertex Project’s open-source intelligence platform, Synapse, and custom code to deliver meaningful, rapid, verifiable intelligence to our customers. We’ll share lessons learned on automation, how to run multiple ATT&CK libraries side-by-side, and making programmatic intelligence delivery scalable and effective – just like building a tank out of an imported sedan."
Knowledge for the masses: Storytelling with ATT&CK (MITRE ATT&CK)
From ATT&CKcon 3.0
By Ismael Valenzuela and Jose Luis Sanchez Martinez, Trellix
The Trellix team believes that creating and sharing compelling stories about cyber threats, with ATT&CK, is a powerful way to raise awareness and enable actionability against cyber threats.
In this talk the team will share their experiences leveraging ATT&CK to disseminate threat knowledge to different audiences (software development teams, managers, threat detection engineers, threat hunters, cyber threat analysts, support engineers, upper management, etc.).
They will show concrete examples and representations created with ATT&CK to describe the threats at different levels, including: 1) an Attack Path graph that shows the overall flow of the attack; 2) Tactic-specific TTP summary tables and graphs; 3) very detailed, step-by-step description of the attacker's behaviors.
Property graph vs. RDF Triplestore comparison in 2020 (Ontotext)
This presentation goes all the way from an introduction to what graph databases are, to a table comparing RDF vs. property graphs, plus two diagrams presenting the market circa 2020.
It's just a jump to the left (of boom): Prioritizing detection implementation... (MITRE ATT&CK)
From ATT&CKcon 3.0
By Lindsay Kaye and Scott Small, Recorded Future
Many organizations ask: "Where do I start, and where do I go next" when prioritizing implementation of behavior-based detections? We often hear "use threat intelligence!" but your goals must be qualified and quantified in order to properly prioritize the most relevant TTPs. A wealth of open-sourced, ATT&CK-mapped resources now exists, giving security teams greater access to both detections and red team tests they can implement, but intelligence (also aligned with ATT&CK), is essential to provide necessary context to ensure that detection efforts are focused effectively.
This session will discuss a new approach to the prioritization challenge, starting with an analysis of the current defensive landscape, as measured by ATT&CK coverage for more than a dozen detection repositories and technologies, and guidance on sourcing TTP intelligence. The team will then show how real-world defensive strategies can be strengthened by encompassing a full-spectrum view of threat detection, including the implementation of YARA, Sigma, and Snort in security appliances. Critically, alignment of both intelligence and defenses with ATT&CK enables defenders to move the focus of detection efforts to indications of malicious behavior before the final payload is deployed, where controls are most effective at preventing serious damage to the organization.
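The prioritization step the speakers describe amounts to set arithmetic over ATT&CK technique IDs: rank the TTPs your intelligence says are most relevant, then subtract what your detections already cover. A minimal sketch (the technique IDs are real ATT&CK identifiers, but the counts and coverage set are invented):

```python
# Toy TTP prioritization: relevant-but-uncovered techniques, ordered by
# how often threat intel reports them. All data here is hypothetical.

intel_ttp_counts = {"T1059": 14, "T1566": 11, "T1053": 6, "T1105": 4}
covered = {"T1059", "T1105"}   # techniques with existing detections

def prioritize(intel, covered):
    gaps = {t: n for t, n in intel.items() if t not in covered}
    return sorted(gaps, key=gaps.get, reverse=True)

todo = prioritize(intel_ttp_counts, covered)   # highest-value gaps first
```

Because both the intelligence and the detection repositories are mapped to ATT&CK, this comparison stays mechanical even as both sides change.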
The Apache Solr Semantic Knowledge Graph (Trey Grainger)
What if instead of a query returning documents, you could alternatively return other keywords most related to the query: i.e. given a search for "data science", return me back results like "machine learning", "predictive modeling", "artificial neural networks", etc.? Solr’s Semantic Knowledge Graph does just that. It leverages the inverted index to automatically model the significance of relationships between every term in the inverted index (even across multiple fields) allowing real-time traversal and ranking of any relationship within your documents. Use cases for the Semantic Knowledge Graph include disambiguation of multiple meanings of terms (does "driver" mean truck driver, printer driver, a type of golf club, etc.), searching on vectors of related keywords to form a conceptual search (versus just a text match), powering recommendation algorithms, ranking lists of keywords based upon conceptual cohesion to reduce noise, summarizing documents by extracting their most significant terms, and numerous other applications involving anomaly detection, significance/relationship discovery, and semantic search. In this talk, we'll do a deep dive into the internals of how the Semantic Knowledge Graph works and will walk you through how to get up and running with an example dataset to explore the meaningful relationships hidden within your data.
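Under the hood, the Semantic Knowledge Graph scores a candidate term by how much more often it occurs in the foreground set (documents matching the query) than chance would predict from the whole corpus. A simplified z-score illustration of that idea (not Solr's exact relatedness formula; the document counts are invented):

```python
import math

# Simplified relatedness: z-score of a term's foreground document frequency
# against its background rate. Solr's SKG uses a related but not identical
# statistic computed directly from the inverted index.

def relatedness(fg_docs_with_term, fg_docs, bg_docs_with_term, bg_docs):
    p_bg = bg_docs_with_term / bg_docs              # background probability
    expected = fg_docs * p_bg                       # expected fg occurrences
    sd = math.sqrt(fg_docs * p_bg * (1 - p_bg)) or 1.0
    return (fg_docs_with_term - expected) / sd

# foreground = 1,000 docs matching "data science", corpus = 1,000,000 docs
ml = relatedness(400, 1000, 5_000, 1_000_000)      # "machine learning"
the = relatedness(990, 1000, 980_000, 1_000_000)   # stopword-ish term
# ml >> the: "machine learning" is strongly related to the query,
# while the ubiquitous term scores near zero despite appearing everywhere
```

This is why the graph surfaces "machine learning" rather than common words: relevance is over-representation relative to the background, not raw frequency.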
We discuss the different ways models can be served with MLflow, covering both open-source MLflow and Databricks-managed MLflow. We cover the basic differences between batch scoring and real-time scoring, with special emphasis on the upcoming production-ready Databricks model serving.
Building Biomedical Knowledge Graphs for In-Silico Drug Discovery (Vaticle)
The rapid development and spread of analytical tools in the biomedical sciences has produced a variety of information about all sorts of biological components and their functions. Though important individually, their biological characteristics need to be understood in relation to the interactions they have with other biological components, which requires the integration of vast amounts of complex, semantically rich, heterogeneous data.
Traditional systems are inadequate at accurately modelling and handling data at this scale and complexity, making solutions that speed up the integration and querying of such data a necessity.
In this talk, we present various approaches being used in organisations to build biomedical computational pipelines to address these problems using tools such as Machine Learning and TypeDB. In particular, we discuss how to create an accurate and scalable semantic representation of molecular level biomedical data by presenting examples from drug discovery, precision medicine and competitive intelligence.
Speaker: Tomás Sabat
Tomás is the Chief Operating Officer at Vaticle, dedicated to building a strongly-typed database for intelligent systems. He works directly with TypeDB's open source and enterprise users so they can fulfil their potential with TypeDB and change the world. He focuses mainly on life sciences, cyber security, finance, and robotics.
Tsunami of Technologies. Are we prepared?
Slide from workshop with open source community in Malaysia.
"Bengkel Bersama Komuniti Sumber Terbuka Bilangan 1 Tahun 2020" in De Baron Resort, Langkawi, Kedah, Malaysia
MLOps and Data Quality: Deploying Reliable ML Models in Production (Provectus)
Looking to build a robust machine learning infrastructure to streamline MLOps? Learn from Provectus experts how to ensure the success of your MLOps initiative by implementing Data QA components in your ML infrastructure.
For most organizations, the development of multiple machine learning models, their deployment and maintenance in production are relatively new tasks. Join Provectus as we explain how to build an end-to-end infrastructure for machine learning, with a focus on data quality and metadata management, to standardize and streamline machine learning life cycle management (MLOps).
Agenda
- Data Quality and why it matters
- Challenges and solutions of Data Testing
- Challenges and solutions of Model Testing
- MLOps pipelines and why they matter
- How to expand validation pipelines for Data Quality
Observability for Data Pipelines With OpenLineage (Databricks)
Data is increasingly becoming core to many products, whether it is providing recommendations for users, getting insights into how they use the product, or using machine learning to improve the experience. This creates a critical need for reliable data operations and an understanding of how data flows through our systems. Data pipelines must be auditable, reliable, and run on time. This proves particularly difficult in a constantly changing, fast-paced environment.
Collecting this lineage metadata as data pipelines run provides an understanding of the dependencies between the many teams consuming and producing data, and of how constant changes impact them. It is the underlying foundation that enables the many use cases related to data operations. The OpenLineage project is an API that standardizes this metadata across the ecosystem, reducing the complexity and duplicate work of collecting lineage information. It enables the many consumers of lineage in the ecosystem, whether they focus on operations, governance, or security.
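As a rough illustration of the standard, an OpenLineage run event is a JSON document that ties a job run to its input and output datasets; the namespaces, names, and IDs below are invented for the example:

```json
{
  "eventType": "COMPLETE",
  "eventTime": "2021-11-03T10:53:52.427Z",
  "run": { "runId": "d46e465b-d358-4d32-83d4-df660ff614dd" },
  "job": { "namespace": "example-namespace", "name": "daily_orders_load" },
  "inputs": [ { "namespace": "warehouse", "name": "raw.orders" } ],
  "outputs": [ { "namespace": "warehouse", "name": "analytics.daily_orders" } ],
  "producer": "https://github.com/OpenLineage/OpenLineage"
}
```

Because every tool emits the same event shape, a consumer can stitch runs from different schedulers and processing engines into one lineage graph.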
Marquez is an open source project, part of the LF AI & Data Foundation, which instruments data pipelines to collect lineage and metadata and enable those use cases. It implements the OpenLineage API and provides context by making dependencies across organizations and technologies visible as they change over time.
MITRE ATT&CK is quickly gaining traction and is becoming an important standard to use to assess the overall cyber security posture of an organization. Tools like ATT&CK Navigator facilitate corporate adoption and allow for a holistic overview on attack techniques and how the organization is preventing and detecting them. Furthermore, many vendors, technologies and open-source initiatives are aligning with ATT&CK. Join Erik Van Buggenhout in this presentation, where he will discuss how MITRE ATT&CK can be leveraged in the organization as part of your overall cyber security program, with a focus on adversary emulation.
Erik Van Buggenhout is the lead author of SANS SEC599 - Defeating Advanced Adversaries - Purple Team Tactics & Kill Chain Defenses. Next to his activities at SANS, Erik is also a co-founder of NVISO, a European cyber security firm with offices in Brussels, Frankfurt and Munich.
Tracking Noisy Behavior and Risk-Based Alerting with ATT&CK (MITRE ATT&CK)
From ATT&CKcon 3.0
By Haylee Mills, Splunk
Having ATT&CK to identify threats, prioritize data sources, and improve security posture has been a huge step forward for our industry, but how do we actualize those insights for better detection and alerting? By shifting to observations of behavior over one-to-one direct alerts, noisy datasets become valuable treasure troves with ATT&CK metadata. Additionally, we can begin to look at detection and threat hunting on behavior instead of users or systems. In this presentation, Haylee will discuss the shift in mindset and the nuts and bolts of detections that leverage this metadata in Splunk, but the concept can be applied with custom tools to any valuable security dataset.
From ATT&CKcon 3.0
By Fred Frey and Jonathan Mulholland, SnapAttack
Atomic Red Team and Sigma are the largest open-source attack simulation and analytic projects. Many organizations utilize one or both internally for security controls validation or supplementing their detections and alerts. Building on the work from these two great communities, we smashed (scientific-term) the attacks and analytics together and applied data science to analyze the results. We'll describe our methodology and testing framework, show the real-world MITRE ATT&CK coverage and gaps, discuss our algorithms for calculating analytic similarity, identifying log sources for a technique, and determining the best analytics to deploy that maximizes ATT&CK coverage.
This project aims to:
- Bring a measurable testing rigor to community analytics to improve adoption
- Test every analytic against every attack, validating the true positive detection
- Understand the log sources required to detect specific attack techniques
- Apply data science to identify analytic similarity (reduce community duplication)
- Identify gaps between the projects: analytics without attack simulations, attack simulations without detections, missing or incorrect MITRE ATT&CK labels, etc.
- Automate the process so insights can stay up to date with new attack/analytic contributions over time
- Share our analysis back to the community to improve these projects
AI offers enormous potential in terms of improving the effectiveness and efficiency of robots. In recent years, data-driven AI has achieved remarkable success in specialised tasks such as speech recognition, machine translation and object detection. Despite these successes, there are also clear signs of its limitations.
On finding a solution to these limitations, we study the following three challenges:
1) How can robots operate under real-world conditions, which are dynamic and packed with unknown objects and situations?
2) How can robots execute multiple tasks, instead of just one?
3) How can robots cooperate with other robots and with human team-mates?
In this talk, the first two challenges will be addressed. We will also show how TypeDB's knowledge base enables us to tackle such challenges.
Speaker: Joris Sijs, Scientist @ TNO
Joris is a team-lead at TNO, where he develops and integrates software modules for the perception, awareness and planning of autonomous systems and autonomous robots. He recently started to extend this work with the development of knowledge graphs (or cognitive databases), exploring how to combine this type of AI with machine- and deep-learning solutions.
Slides: Knowledge Graphs vs. Property Graphs (DATAVERSITY)
We are in the era of graphs. Graphs are hot. Why? Flexibility is one strong driver: Heterogeneous data, integrating new data sources, and analytics all require flexibility. Graphs deliver it in spades.
Over the last few years, a number of new graph databases came to market. As we start the next decade, dare we say “the semantic twenties,” we also see vendors that never before mentioned graphs starting to position their products and solutions as graphs or graph-based.
Graph databases are one thing, but “Knowledge Graphs” are an even hotter topic. We are often asked to explain Knowledge Graphs.
Today, there are two main graph data models:
• Property Graphs (also known as Labeled Property Graphs)
• RDF Graphs (Resource Description Framework) aka Knowledge Graphs
Other graph data models are possible as well, but over 90 percent of the implementations use one of these two models. In this webinar, we will cover the following:
I. A brief overview of each of the two main graph models noted above
II. Differences in Terminology and Capabilities of these models
III. Strengths and Limitations of each approach
IV. Why Knowledge Graphs provide a strong foundation for Enterprise Data Governance and Metadata Management
Introduction to DataOps and AIOps (or MLOps) (Adrien Blind)
This presentation introduces the audience to the DataOps and AIOps practices. It deals with organizational and tech aspects, and provides hints to start your data journey.
Driving Intelligence with MITRE ATT&CK: Leveraging Limited Resources to Build... (MITRE ATT&CK)
From ATT&CKcon 4.0
By Scott Roberts, Interpres Security
"Building threat intelligence is challenging, even under the most ideal circumstances. But what if you are even more limited in your resources? You are part of a small (but skilled) team, with high expectations, and people are relying on you to make business-critical decisions…all the time! What do you do in that situation? Turn a Toyota Tercel into a tank, of course.
The Interpres Security threat intelligence team found itself in that exact situation. Wanting to leverage the MITRE ATT&CK catalog in creating a comprehensive and timely threat intelligence repository, the Interpres team built a series of tools, processes, and paradigms that we call Intelligence Engineering. In this talk, we’ll examine how we combined ATT&CK, STIX2, the Vertex Project’s open-source intelligence platform, Synapse, and custom code to deliver meaningful, rapid, verifiable intelligence to our customers. We’ll share lessons learned on automation, how to run multiple ATT&CK libraries side-by-side, and making programmatic intelligence delivery scalable and effective – just like building a tank out of an imported sedan."
Knowledge for the masses: Storytelling with ATT&CK (MITRE ATT&CK)
From ATT&CKcon 3.0
By Ismael Valenzuela and Jose Luis Sanchez Martinez, Trellix
The Trellix team believes that creating and sharing compelling stories about cyber threats -with ATT&CK- is a powerful way for raising awareness and enabling actionability against cyber threats.
In this talk the team will share their experiences leveraging ATT&CK to disseminate Threat knowledge to different audiences (Software Development teams, Managers, Threat detection engineers, Threat hunters, Cyber Threat Analysts, Support Engineers, upper management, etc.).
They will show concrete examples and representations created with ATT&CK to describe the threats at different levels, including: 1) an Attack Path graph that shows the overall flow of the attack; 2) Tactic-specific TTP summary tables and graphs; 3) very detailed, step-by-step description of the attacker's behaviors.
Property graph vs. RDF Triplestore comparison in 2020 (Ontotext)
This presentation goes all the way from an introduction to what graph databases are, to a table comparing RDF vs. property graphs, plus two diagrams presenting the market circa 2020.
It's just a jump to the left (of boom): Prioritizing detection implementation... (MITRE ATT&CK)
From ATT&CKcon 3.0
By Lindsay Kaye and Scott Small, Recorded Future
Many organizations ask: "Where do I start, and where do I go next" when prioritizing implementation of behavior-based detections? We often hear "use threat intelligence!" but your goals must be qualified and quantified in order to properly prioritize the most relevant TTPs. A wealth of open-sourced, ATT&CK-mapped resources now exists, giving security teams greater access to both detections and red team tests they can implement, but intelligence (also aligned with ATT&CK), is essential to provide necessary context to ensure that detection efforts are focused effectively.
This session will discuss a new approach to the prioritization challenge, starting with an analysis of the current defensive landscape, as measured by ATT&CK coverage for more than a dozen detection repositories and technologies, and guidance on sourcing TTP intelligence. The team will then show how real-world defensive strategies can be strengthened by encompassing a full-spectrum view of threat detection, including the implementation of YARA, Sigma, and Snort in security appliances. Critically, alignment of both intelligence and defenses with ATT&CK enables defenders to move the focus of detection efforts to indications of malicious behavior before the final payload is deployed, where controls are most effective at preventing serious damage to the organization.
The Apache Solr Semantic Knowledge Graph (Trey Grainger)
What if instead of a query returning documents, you could alternatively return other keywords most related to the query: i.e. given a search for "data science", return me back results like "machine learning", "predictive modeling", "artificial neural networks", etc.? Solr’s Semantic Knowledge Graph does just that. It leverages the inverted index to automatically model the significance of relationships between every term in the inverted index (even across multiple fields) allowing real-time traversal and ranking of any relationship within your documents. Use cases for the Semantic Knowledge Graph include disambiguation of multiple meanings of terms (does "driver" mean truck driver, printer driver, a type of golf club, etc.), searching on vectors of related keywords to form a conceptual search (versus just a text match), powering recommendation algorithms, ranking lists of keywords based upon conceptual cohesion to reduce noise, summarizing documents by extracting their most significant terms, and numerous other applications involving anomaly detection, significance/relationship discovery, and semantic search. In this talk, we'll do a deep dive into the internals of how the Semantic Knowledge Graph works and will walk you through how to get up and running with an example dataset to explore the meaningful relationships hidden within your data.
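In recent Solr versions this capability is exposed through the `relatedness()` aggregation of the JSON Facet API: a foreground query defines the concept of interest and a background query provides the baseline against which term significance is scored. A sketch of such a request (the field names and queries here are illustrative):

```json
{
  "query": "*:*",
  "params": {
    "fore": "body:\"data science\"",
    "back": "*:*"
  },
  "facet": {
    "related_terms": {
      "type": "terms",
      "field": "body",
      "limit": 10,
      "sort": { "relatedness": "desc" },
      "facet": {
        "relatedness": "relatedness($fore,$back)"
      }
    }
  }
}
```

Posted against a query handler, this returns the terms in the `body` field ranked by how semantically related they are to "data science", rather than a list of matching documents.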
We discuss the different ways models can be served with MLflow, covering both open source MLflow and Databricks-managed MLflow. We will cover the basic differences between batch scoring and real-time scoring, with special emphasis on the new, upcoming Databricks production-ready model serving.
Building Biomedical Knowledge Graphs for In-Silico Drug Discovery (Vaticle)
The rapid development and spread of analytical tools in the biomedical sciences has produced a variety of information about all sorts of biological components and their functions. Though important individually, their biological characteristics need to be understood in relation to the interactions they have with other biological components, which requires the integration of vast amounts of complex, semantically-rich, heterogeneous data.
Traditional systems are inadequate at accurately modelling and handling data at this scale and complexity, making solutions that speed up the integration and querying of such data a necessity.
In this talk, we present various approaches being used in organisations to build biomedical computational pipelines to address these problems using tools such as Machine Learning and TypeDB. In particular, we discuss how to create an accurate and scalable semantic representation of molecular level biomedical data by presenting examples from drug discovery, precision medicine and competitive intelligence.
Speaker: Tomás Sabat
Tomás is the Chief Operating Officer at Vaticle, dedicated to building a strongly-typed database for intelligent systems. He works directly with TypeDB's open source and enterprise users so they can fulfil their potential with TypeDB and change the world. He focuses mainly on life sciences, cyber security, finance and robotics.
Tsunami of Technologies. Are we prepared?
Slide from workshop with open source community in Malaysia.
"Bengkel Bersama Komuniti Sumber Terbuka Bilangan 1 Tahun 2020" in De Baron Resort, Langkawi, Kedah, Malaysia
Top 10 Software to Detect & Prevent Security Vulnerabilities from BlackHat US... (Mobodexter)
BlackHat USA 2015 recently concluded, and we heard a bunch of news about how BlackHat brought to light various security vulnerabilities in day-to-day life, such as the ZigBee protocol, a device for stealing keyless cars, and ATM card skimmers. However, the presenters, who are also ethical hackers, also shared a bunch of tools to help the software community detect and prevent security holes in hardware and software before a product is released. We have reviewed all the presentations from the conference and give you here a list of the top 10 tools/utilities that help in security vulnerability detection and prevention.
Edge computing and the Internet of Things bring great promise, but often just getting data from the edge requires moving mountains. Let's learn how to make edge data ingestion and analytics easier using StreamSets Data Collector Edge, an ultralight, platform-independent, small-footprint open source solution written in Go for streaming data from resource-constrained sensors and personal devices (like medical equipment or smartphones) to Apache Kafka, Amazon Kinesis and many others. This talk includes an overview of the SDC Edge main features, supported protocols and available processors for data transformation, insights on how it solves some challenges of traditional approaches to data ingestion, pipeline design basics, a walk-through of some practical applications (Android devices and Raspberry Pi) and its integration with other technologies such as StreamSets Data Collector, Apache Kafka, Apache Hadoop, InfluxDB and Grafana. The goal here is to make attendees ready to quickly become IoT data intake and SDC Edge ninjas.
Speaker
Guglielmo Iozzia, Big Data Delivery Manager, Optum (United Health)
Open Source Insight: SCA for DevOps, DHS Security, Securing Open Source for G... (Black Duck by Synopsys)
It’s an acronym-filled issue of Open Source Insight, as we look at the question of SCA (software composition analysis) and how it fits into the DevOps environment. The DHS (Department of Homeland Security) has concerning security gaps, according to its OIG (Office of Inspector General). Can the CVE (Common Vulnerabilities and Exposures) gap be closed? The GDPR (General Data Protection Regulation) is bearing down on us like a freight train, and it’s past time to include open source security into your GDPR plans.
Plus, an intro to the Open Hub community, looking at security for blockchain apps, and best practices for open source security in container environments are all featured in this week’s cybersecurity and open source security news.
Product security by Blockchain, AI and Security Certs (LabSharegroup)
Three themes You need to think about Product Security — and some tips for How to Do It
I have been working with software security laboratories and IT security firms for years. I have talked with clients, read and watched dozens of articles/videos and talked with several experts about product security themes, future, technologies.
The three themes are:
Is the blockchain the new technology of trust?
Blockchain has the potential to transform industries. However, security experts have raised questions: if blockchain is broadly used in technology solutions, will security standards be adopted? How do we protect the cryptographic keys that allow access to blockchain applications? It is true that the potential is huge: securing IoT nodes and edge devices with authentication, improving confidentiality and data integrity, disrupting current PKI systems, reducing DDoS attacks, and more.
AI (Machine Learning, Deep Learning, Reinforcement Learning algorithm) potential in Product Security
Machine learning can help in creating products that analyse threats and respond to attacks and security incidents. There are several repositories on GitHub, as well as open-source code from IBM, available to developers. Deep learning networks are growing rapidly thanks to cheap cloud GPU services, and after reinforcement learning's recent successes, nobody knows the upper limit.
Product Security by International security standards and practices
The present, future, and developmental orientations of the independent third-party certification industry. How can international standards keep up with the rapid growth of new technologies and keep applications secure in IoT, blockchain, or AI-driven industries?
Are IT products reliable, secure and will they stay that way?
I would like to explain Product Security in a simple way. My goal is the introduction of product security for Tech startups, fast-growing Tech firms. Furthermore, I would like to emphasize the benefits of product security certification.
Open Source Insight: Black Duck Announces OpsSight for DevOps Open Source Sec... (Black Duck by Synopsys)
Continuing a month of major announcements, Black Duck launched its new product, OpsSight — comprehensive, automated open source container security for production environments — at its FLIGHT 2017 user conference in Boston this week. Targeting the production phase of the software development life cycle, the initial release of OpsSight is optimized for Red Hat’s OpenShift Container Platform.
If you missed FLIGHT 2017, you can read all the news about OpsSight below, as well as stories on FLIGHT keynoters Charlie Miller and Chris Valasek’s presentation on why IoT insecurity is here to stay; the top 5 cybersecurity mistakes you need to avoid; the SEC prepares new cybersecurity guidelines; and security for the connected car
Log Standards & Future Trends, by Dr. Anton Chuvakin
The presentation will discuss how to bring order (in the form of standards!) to the chaotic world of logging.
It will give a brief introduction to logs and logging and explain how and why logs grew so chaotic and disorganized.
Next it will cover why log standards are sorely needed.
It will offer a walk-through that highlights the critical areas of log standardization, and current standardization efforts will be discussed.
Finally, the presentation will cover a few of the emerging and yet-to-emerge trends related to logging and log management.
- Understanding what IoT security is
- The scope of IoT security
- Uses of IoT and where we see it in our daily life
- Possible attack surfaces and the likelihood of IoT-related attacks
- IoT-specific security assessment (the approach, IoT protocols, and how it combines different types of assessment)
- The myths of IoT security, how it has progressed in the past few years, and how far-fetched it can be
- Available resources and tools
OASIS: open source and open standards: internet of things (Jamie Clark)
How FOSS projects and open ICT standards often interact in a virtuous cycle. Recent examples, and a list of IoT-relevant open standards projects at OASIS. Feb 2014
Loading a lot of data into a graph database is not a trivial exercise. TypeDB Loader (formerly known as GraMi) was developed to allow large-scale data import into TypeDB, a strongly-typed database. Recent improvements have immensely simplified the configuration interface to allow for easier data importing, while maintaining features and the promise of loading huge amounts of data into TypeDB as fast as possible.
Natural Language Interface to Knowledge Graph (Vaticle)
Natural language interfaces (NLIs) offer end-users an easy and convenient way to query ontology-based knowledge graphs. They automatically generate database queries from natural language inputs, avoiding the need for the end user to learn different query languages. NLIs can be used with REST APIs to facilitate and enrich interactions with knowledge graphs, in domains such as interactive root cause analysis (RCA), dynamic dashboard generation, and Online Transaction Processing (OLTP).
In this talk, you'll learn about a natural language interface built with a TypeDB server running on Raspberry Pi4. This application offers a conversational bot assistant with Cisco Webex for an efficient and flexible way to facilitate human-machine interactions. In particular, this talk will demonstrate how natural language inputs are translated into TypeQL queries using Abstract Syntax Trees that represent the syntactic structure discovered during the Named Entity Recognition (NER) analysis of the textual inputs provided by Rasa 2.X running on an Intel Celeron J3455 miniPC.
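To make the translation step concrete: given a hypothetical network-monitoring schema, a question such as "Which servers run nginx?" could be turned by such a pipeline into a TypeQL query along these lines (the entity, attribute, and role names are invented for illustration):

```typeql
match
$s isa server, has hostname $h;
$sw isa software, has name "nginx";
(host: $s, installed: $sw) isa deployment;
get $h;
```

The NER step would recognise "servers" and "nginx" as a type and an attribute value, and the abstract syntax tree would determine which relation pattern connects them.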
Talk Summary:
State-of-the-art AI approaches can struggle to create solutions that provide accurate results and stand the test of time. They are also plagued by problems such as bias and a lack of explainability. Causal AI addresses these key problems and is at the center of the Geminos Causeway platform, which is built on TypeDB.
This webinar will give you an introduction to why causal AI is so important, and how you can start to use it to drive more value for your organisation.
Speaker: Stuart Frost
Stu is the CEO and founder of Geminos. Their focus is on building AI-driven solutions for mid-sized smart manufacturing and logistics companies that are frustrated by their inability to digitalize their operations at a sensible cost. Stu has 30 years' experience founding and leading successful data management and analytics startups, starting at 26 when he founded SELECT Software Tools and led the company to a NASDAQ IPO in 1996. He then founded DATAllegro in 2003, which was acquired by Microsoft.
Building a Distributed Database with Raft (Vaticle)
Applications running in production have much higher requirements. Not only do they need to be correct, they also need to be "always-on", handle a much bigger user load, and be secure.
Meet TypeDB Cluster, the TypeDB database built for production scale using the Raft replication algorithm. Join us for a walk through the underlying architecture and the value it brings to developers running an application at scale.
Speaker: Ganeshwara Henanda
Ganesh leads the development of TypeDB Cluster while also managing other aspects such as infrastructure and project management. His day-to-day work involves building concurrent and distributed algorithms such as Raft and the Actor Model.
He graduated with an MSc in Grid Computing from the University of Amsterdam, and has built several large-scale distributed and real-time systems throughout his career.
Enabling the Computational Future of Biology (Vaticle)
Computational biology has revolutionised biomedicine. The volume of data it is generating is growing exponentially. This requires tools that enable computational and non-computational biologists to collaborate and derive meaningful insights. However, traditional systems are inadequate to accurately model and handle data at this scale and complexity.
In this talk, we discuss how TypeDB enables biologists to build a deeper understanding of life, and increase the probability of groundbreaking discoveries, across the life sciences.
Speaker: Tomás Sabat
Tomás is the Chief Operating Officer at Vaticle. He works closely with TypeDB's open source and enterprise users who use TypeDB to build applications in a wide number of industries including financial services, life sciences, cybersecurity and supply chain management. A graduate of the University of Cambridge, Tomás has spent the last seven years founding and building businesses in the technology industry.
Build your skills and learn how TypeDB's native inference engine works.
Good for:
- Beginners to TypeDB and TypeQL
- Those who have been using TypeDB and want a refresher on inference in TypeDB
- Experienced software engineers
- Those who want to better represent their domain in a model that allows for logical reasoning at the database level
Description:
TypeDB is capable of reasoning over data via pre-defined rules. TypeQL rules look for a given pattern in the database and when found, infer the given queryable fact. The inference provided by rules is performed at query (run) time. Rules not only allow shortening and simplifying of commonly-used queries, but also enable knowledge discovery and implementation of business logic at the database level.
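As a minimal sketch of what such a rule looks like (the `location-hierarchy` types are invented for illustration), a TypeQL rule inferring transitive location containment could read:

```typeql
define

rule transitive-location:
when {
    # if x is inside y, and y is inside z ...
    (subordinate: $x, superior: $y) isa location-hierarchy;
    (subordinate: $y, superior: $z) isa location-hierarchy;
} then {
    # ... then x is also inside z
    (subordinate: $x, superior: $z) isa location-hierarchy;
};
```

A `match` query over `location-hierarchy` would then return both stored and inferred relations at query time, without the client having to spell out the transitivity itself.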
Takeaways:
- Understanding of fundamental components of TypeDB's inference engine and how to write rules for your domain
- Write at least 1 rule for your use case
- Utilise the rule you wrote in a query
Tomás Sabat:
Tomás is the Chief Operating Officer at Vaticle, dedicated to building a strongly-typed database for intelligent systems. He works directly with TypeDB's open source and enterprise users so they can fulfil their potential with TypeDB and change the world. He focuses mainly on life sciences, cyber security, finance and robotics.
Join the TypeDB community to learn how we think about data modelling, and how TypeDB's expressivity allows you to model your domain based on logical and object-oriented programming principles.
Good for:
- Engineers, scientists, and technical executives
- Those in a technical field working with complex datasets, and building intelligent systems
- Anyone curious to learn about the expressive power of TypeDB's data model
Description:
We open this training with an exploration into what a schema looks like in TypeDB, starting with clarifying the motivation for the conceptual model in TypeDB, and its relationship to the Enhanced Entity-Relationship model.
Then we break things down a bit more philosophically, delving into what it means to represent data in TypeDB, and how TypeDB allows you to think at a higher level, rather than in terms of join tables, columns, documents, vertices, edges, and properties.
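A minimal sketch of what such a TypeQL schema can look like, using a hypothetical employment domain (the names are illustrative, not taken from the training itself):

```typeql
define

# attributes carry the values
full-name sub attribute, value string;
name sub attribute, value string;

# entities own attributes and play roles in relations
person sub entity,
    owns full-name,
    plays employment:employee;
company sub entity,
    owns name,
    plays employment:employer;

# relations declare the roles their players fill
employment sub relation,
    relates employee,
    relates employer;
```

Note that there are no foreign keys or join tables here: the `employment` relation and its named roles are first-class concepts in the model.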
Takeaways:
- Be able to articulate why TypeDB's data model is so beneficial for complex data, and why we use it to build intelligent systems
- Write a TypeDB schema in TypeQL
- Practice modelling one of your own domains
Tomás Sabat:
Tomás is the Chief Operating Officer at Vaticle, dedicated to building a strongly-typed database for intelligent systems. He works directly with TypeDB's open source and enterprise users so they can fulfil their potential with TypeDB and change the world. He focuses mainly on life sciences, cyber security, finance and robotics.
Using SQL to query relational databases is easy. As a declarative language, it makes it straightforward to write queries and build powerful applications. However, relational databases struggle when working with complex data: challenges arise especially in how that data is modelled and queried.
For example, the large number of necessary JOINs forces us to write long, verbose queries that are difficult to write and prone to mistakes.
TypeQL is the query language used in TypeDB: just as SQL is the standard query language for relational databases, TypeQL is TypeDB's. It is a declarative language, and allows us to model, query and reason over our data.
In this talk, we will look at how TypeQL compares to SQL. Why and when should you use TypeQL over SQL? How do we do outer/inner joins in TypeQL? We'll look at the common concepts, but mostly talk about the differences between the two.
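To give a flavour of the comparison (the schema and names here are hypothetical), a lookup that needs two JOINs in SQL reads as a single pattern in TypeQL:

```typeql
# Rough SQL equivalent:
#   SELECT p.full_name FROM person p
#   JOIN employment e ON e.person_id = p.id
#   JOIN company c ON c.id = e.company_id
#   WHERE c.name = 'Acme';
match
$p isa person, has full-name $n;
$c isa company, has name "Acme";
(employee: $p, employer: $c) isa employment;
get $n;
```

The join logic is absorbed into the relation pattern, so the query stays the same shape however many hops are involved.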
Speaker: Tomás Sabat
Tomás is the Chief Operating Officer at Vaticle. He works closely with TypeDB's open source and enterprise users who use TypeDB to build applications in a wide number of industries including financial services, life sciences, cybersecurity and supply chain management. A graduate of the University of Cambridge, Tomás has spent the last seven years founding and building businesses in the technology industry.
TypeDB Academy: Getting Started with Schema Design (Vaticle)
In this TypeDB Academy, we start by gaining an understanding of the fundamental components of TypeDB's type system and what makes it unique. We will see how we can download, install, and run TypeDB, and learn to perform basic database operations.
We'll then explore what a schema looks like in TypeDB, starting with the motivation for a schema, the conceptual schema of TypeDB, and its relationship to the Enhanced Entity-Relationship model.
Good for:
- Beginners to TypeDB and TypeQL
- Those who have been using TypeDB and want a refresher on schema and TypeQL
- Experienced database administrators and software engineers
Takeaways:
- Understanding of fundamental components of TypeDB
- How to download, install, and run TypeDB on your computer
- Be able to articulate why schema is so beneficial when using TypeDB, why we use one, and how it enables a more expressive model
- Write a TypeDB schema in TypeQL
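To make the last takeaway concrete, here is a minimal sketch of what a TypeDB schema looks like in TypeQL (the type names are illustrative, not from the session):

```typeql
define
  name sub attribute, value string;
  person sub entity,
    owns name,
    plays employment:employee;
  company sub entity,
    owns name,
    plays employment:employer;
  employment sub relation,
    relates employee,
    relates employer;
```

Entities, relations, and attributes are declared as first-class types, which is what connects the schema directly to the Enhanced Entity-Relationship model discussed in the session.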
Comparing Semantic Web Technologies to TypeDB (Vaticle)
Semantic Web technologies enable us to represent and query very complex and heterogeneous datasets. We can add semantics and reason over large bodies of data on the web. However, despite the wealth of educational material available, they have failed to achieve mass adoption outside academia.
TypeDB works at a higher level of abstraction and enables developers to be more productive when working with complex data. TypeDB is easier to learn, reducing the barrier to entry and enabling more developers to access semantic technologies. Instead of using a myriad of standards and technologies, we just use one language - TypeQL.
In this talk we will:
- look at how TypeQL compares to Semantic Web standards, specifically RDF, SPARQL, RDFS, OWL and SHACL.
- cover questions such as: how do we represent hyper-relations in TypeDB? How does one use rdfs:domain and rdfs:range in TypeDB? And how do the modelling philosophies compare?
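As a rough sketch of the kind of comparison covered (the type names are illustrative): where RDFS constrains a property with rdfs:domain and rdfs:range, TypeQL constrains a relation through its roles and the types permitted to play them:

```typeql
define
  # An "employs" property with rdfs:domain company and
  # rdfs:range person becomes a relation whose roles
  # only the corresponding entity types may play.
  employment sub relation,
    relates employer,
    relates employee;
  company sub entity,
    plays employment:employer;
  person sub entity,
    plays employment:employee;
```

Unlike rdfs:domain and rdfs:range, which are used for inference in RDFS, these role constraints are validated at write time in TypeDB.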
Speaker: Tomás Sabat
Tomás is the Chief Operating Officer at Vaticle. He works closely with TypeDB's open source and enterprise users who use TypeDB to build applications in a wide number of industries including financial services, life sciences, cyber security and supply chain management. A graduate of the University of Cambridge, Tomás has spent the last seven years founding and building businesses in the technology industry.
How might we utilise an actor-based execution model to build a powerful yet elegant reasoning engine?
Actors are an asynchronous, inherently parallel framework that forms the basis of some of the most computationally heavy systems in the world. By leveraging this in an event-driven model, we can build an execution engine that makes efficient use of all available hardware resources to answer your reasoning queries.
We'll visit the key ideas behind actors, and then walk through how we break reasoning into neat, actor-sized building blocks. As we do this, it will become clear how our marriage of reasoning and actors naturally produces a scalable and elegant execution engine. By examining the problem of reasoning from an actor-based lens, we'll be able to better understand the complexities of reasoning and visualise bottlenecks and optimisations.
Intro to TypeDB and TypeQL | A strongly-typed database (Vaticle)
TypeDB is a strongly-typed database. It provides a rich and logical type system that breaks down complex problems into meaningful, well-structured models, using TypeQL as its query language.
TypeDB allows you to model your domain based on logical and object-oriented principles. Composed of entity, relationship, and attribute types, as well as type hierarchies, roles, and rules, TypeDB allows you to think higher-level, as opposed to join-tables, columns, documents, vertices, and edges.
Types describe the logical structures of your data, allowing TypeDB to validate that your code inserts and queries data correctly. Query validation goes beyond static type-checking: it also logically validates queries, rejecting meaningless ones. With strict type-checking errors, you have a dataset that you can trust.
Finally, TypeDB encodes your data for logical interpretation by its reasoning engine. It enables type-inference and rule-inference, which create logical abstractions of data. This allows for the discovery of facts and patterns that would otherwise be too hard to find.
With these abstractions, queries in the tens to hundreds of lines in SQL or NoSQL databases can be written in just a few lines in TypeQL – collapsing code complexity by orders of magnitude.
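As one hedged example of the rule inference described above (the rule and type names are illustrative, not from the talk), a TypeDB rule that makes a containment relation transitive looks like:

```typeql
define
  rule transitive-containment:
  when {
    (container: $a, contained: $b) isa containment;
    (container: $b, contained: $c) isa containment;
  } then {
    (container: $a, contained: $c) isa containment;
  };
```

At query time, the reasoning engine applies such rules automatically, so the derived relations appear in match results without ever being inserted.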
Join Tomás from the Vaticle team where he'll discuss the origins of TypeDB, the impetus for inventing a new query language, TypeQL, and why we are so excited about the future of software and intelligent systems.
Tomás Sabat:
Tomás is the Chief Operating Officer at Vaticle, dedicated to building a strongly-typed database for intelligent systems. He works directly with TypeDB's open source and enterprise users so they can fulfil their potential with TypeDB and change the world. He focuses mainly on life sciences, cyber security, finance and robotics.
Graph Databases vs TypeDB | What you can't do with graphs (Vaticle)
Developing with graph databases has a number of challenges, such as the modelling of complex schemas, and maintaining data consistency in your database.
In this talk, we discuss how TypeDB addresses these challenges, as well as how it compares to property graph databases. We’ll look at how to read and write data, how to model complex domains, and TypeDB’s ability to infer new data.
The main differences between TypeDB and graph databases can be summarised as:
1. TypeDB provides a concept-level schema with a type system that fully implements the Entity-Relationship (ER) model. Graph databases, on the other hand, use vertices and edges without integrity constraints imposed in the form of a schema
2. TypeDB contains a built-in inference engine - graph databases don’t provide native inferencing capabilities
3. TypeDB is an abstraction over a graph, and leverages a graph database under the hood to create a higher-level model, while graph databases work at different levels of abstraction
Tomás Sabat
Tomás is the Chief Operating Officer at Vaticle. He works closely with TypeDB's open source and enterprise users who use TypeDB to build applications in a wide number of industries including financial services, life sciences, cyber security and supply chain management. A graduate of the University of Cambridge, Tomás has spent the last seven years founding and building businesses in the technology industry.
In this seminar we use TypeDB to open a window on the Pandora Papers, a massive 'data tsunami' based on 11.9 million leaked source documents obtained by the International Consortium of Investigative Journalists (ICIJ).
We will use an automated query builder to get an initial set of results, and then hop from node to node, exploring neighbours and mapping out a suspicious-looking network of offshore shell companies, officers and intermediaries.
Speaker: Jon Thompson
Jon has an MSc in Applied Mathematics and has worked for several years as a Data Scientist in high-throughput biological sequencing. He is the founder of Nodelab, which is on a mission to provide a fully-featured graphical user interface experience for TypeDB.
Heterogeneous data holds significant inherent context. We would like our machine learning models to understand this context, and utilise this ancillary but critical information to improve the accuracy and versatility of our models.
How can we systematically make use of context in Machine Learning?
We delve into the knowledge modelling techniques which, applied with the right ML strategies, give us a promising approach for robustly handling heterogeneous data in large knowledge models. We aim to do this in a way that allows us to build any machine learning model, including graph learning models like our KGCN.
Speaker: James Fletcher, Vaticle
James comes from a background of Computer Vision, specialising in automated diagnostics. As Principal Scientist at Vaticle, his mission is to demonstrate to the world how traditional symbolic approaches to AI, built into TypeDB, can be combined with present-day research in machine learning.
Combining Causal and Knowledge Modeling for Digital Transformation (Vaticle)
Geminos has created a low-code digital transformation platform that combines causal and knowledge modeling. It uses TypeDB as its internal repository. Initial projects are in supply chain and smart manufacturing, with a focus on sustainability.
Speakers: Stuart Frost (CEO), Owen Frost (Analyst)
Stu is the CEO and founder of Geminos. Their focus is on building AI-driven solutions for mid-sized Smart Manufacturing and Logistics companies that are frustrated by their inability to digitalize their operations at a sensible cost.
A Knowledge Graph is as valuable as the insights we can derive from it. So, what do we do when our Knowledge Graph doesn’t contain the answers? We need to complete it.
We know that Grakn's logical reasoner can help us to deduce insights. However, when our answers can't be deduced, we need to turn to statistical methods to infer new facts, making predictions inductively, by example. These could be relations, attributes, or even rules.
In this talk, we will delve into the advanced graph learning systems that we can construct and use on top of Grakn to create intelligent systems. This is the core of the research that we conduct at Grakn Labs - all of which is made available in KGLIB.
Text is the medium used to store the tremendous wealth of scientific knowledge regarding the world we live in. However, with its ever-increasing magnitude and throughput, analysing this unstructured data has become a tedious task. This has led to the rise of Natural Language Processing (NLP), as the go-to for examining and processing large amounts of natural language data.
This involves the automatic extraction of structured semantic information from unstructured machine-readable text. The identification of these explicit concepts and relationships helps in discovering multiple insights contained in text in a scalable and effective way.
A major challenge is the mapping of unstructured information from raw texts into entities, relationships and attributes in the knowledge graph. In this talk, we demonstrate how Grakn can be used to create a text mining knowledge graph capable of modelling, storing, and exploring beneficial information extracted from medical literature.
Introduction to Knowledge Graphs with Grakn and Graql (Vaticle)
Cognitive/AI systems process knowledge that is far too complex for current databases. They require an expressive data model and an intelligent query language to perform knowledge engineering over complex datasets.
In this talk, we will discuss how Grakn, a database to organise complex networks of data and make it queryable, provides the knowledge graph foundation for intelligent systems to manage complex data.
We will discuss how Graql, Grakn's reasoning (through OLTP) and analytics (through OLAP) query language, provides the tools required to do the job: a knowledge schema, a logical inference language, and a distributed analytics framework.
And finally, we will discuss how Graql serves as a unified representation of data for cognitive systems.
We explain how we use Grakn as part of a wider solution to deliver next generation Data Operations (Data Ops) tooling, enabling us to deliver sophisticated "Run Graph Analytics".
The Run Graph is a component to passively track and trace our data assets as they move across the organisation, and is used to quickly reverse engineer our global flows of data to better plan change and understand hidden dependencies. When operational failures do arise, we demonstrate how Grakn quickly allows us to assess the inferred impacts downstream, and to prioritise and communicate the impacts of outages to stakeholders.
GraphRAG is All You Need? LLM & Knowledge Graph (Guy Korland)
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Connector Corner: Automate dynamic content and events by pushing a button (DianaGray10)
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti... (Jeffrey Haguewood)
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Accelerate your Kubernetes clusters with Varnish Caching (Thijs Feryn)
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
State of ICS and IoT Cyber Threat Landscape Report 2024 preview (Prayukth K V)
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio, using data from Sectrio's cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors and newer malware, including new variants and latent threats at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on countries – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Neuro-symbolic is not enough, we need neuro-*semantic* (Frank van Harmelen)
Neuro-symbolic (NeSy) AI is on the rise. However, simply doing machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as "predictable inference".
All of this is illustrated with link prediction over knowledge graphs, but the argument is general.
UiPath Test Automation using UiPath Test Suite series, part 3 (DianaGray10)
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation introduction
UI automation sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Let's dive deeper into the world of ODC! Ricardo Alves (OutSystems) will join us to tell us all about the new Data Fabric. After that, Sezen de Bruijn (OutSystems) will get into the details of how to best design a sturdy architecture within ODC.
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova... (Ramesh Iyer)
In today's fast-changing business world, companies that fail to adapt and embrace new ideas often struggle to keep up with the competition. However, fostering a culture of innovation takes real work: it takes vision, leadership, and a willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
"Impact of front-end architecture on development cost", Viktor Turskyi (Fwdays)
I have heard many times that architecture is not important for the front-end. I have also often seen developers implement front-end features just by following the standard rules of a framework, thinking that this is enough to successfully launch the project, and then the project fails. How can we prevent this, and which approach should we choose? I have launched dozens of complex projects, and during this talk we will analyze which approaches have worked for me and which have not.
A Data Modelling Framework to Unify Cyber Security Knowledge
1. A Data Modelling Framework to Unify Cyber Security Knowledge
OmnibusCyber
Authors: Dr. Paolo Di Prodi, Dr. Brett Forbes
2. About Me
Paolo Di Prodi
PhD in Machine Learning
Software and automation engineer
Worked for Microsoft, now at Fortinet
Mostly data science in cyber security; prior to that, malware reversing
3. Problem we have right now!
External: Threat Intelligence Exchange
Internal: Any cyber data
4. The Sheriff of Data Modelling
• The classic drama: buy vs build vs reuse
• Buy is not an option
• Build is usually the option
• How can we avoid typical mistakes?
• Can we provide a basic structure?
• With the ability to extend it for each company?
5. External Threat Intelligence
• STIX and TAXII are standards developed in an effort to improve the prevention and mitigation of cyber-attacks. STIX states the "what" of threat intelligence, while TAXII defines "how" that information is relayed. Unlike previous methods of sharing, STIX and TAXII are machine-readable and therefore easily automated.
6. STIX and TAXII
STIX, short for Structured Threat Information eXpression, is a standardized language developed by MITRE and the OASIS Cyber Threat Intelligence (CTI) Technical Committee for describing cyber threat information.
TAXII, short for Trusted Automated eXchange of Intelligence Information, defines how cyber threat information can be shared via services and message exchanges. It is designed specifically to support STIX information.
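For illustration, a minimal machine-readable STIX indicator object looks like the following (the field values here are made up; the shape follows the STIX 2.1 specification):

```json
{
  "type": "indicator",
  "spec_version": "2.1",
  "id": "indicator--d81f86b9-975b-4c0b-875e-810c5ad45a4f",
  "created": "2022-01-01T10:00:00.000Z",
  "modified": "2022-01-01T10:00:00.000Z",
  "name": "Known malicious IP",
  "pattern": "[ipv4-addr:value = '198.51.100.1']",
  "pattern_type": "stix",
  "valid_from": "2022-01-01T10:00:00Z"
}
```

Because every object is plain JSON with a typed `id`, bundles of such objects can be exchanged over TAXII and ingested automatically.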
11. God created silos on the last day
[Diagram: separate query silos for the IPS, EDR, AV, FIM, and ZTN product databases, surrounded by questions such as "Where are my CVEs?", "What is a vulnerability?", "What is the context?", and "Where is my OLAP?"]
• Each product has its own syntax, taxonomies, and ontologies
• Building a federated DB is a big challenge
• Just look at the SIEM vendor space…
15. Unified Cyber Ontology (UCO)
• A foundation for standardized information representation across the cyber security domain/ecosystem
• Last version: 0.9.0 on 16 June 2022
• First version: 0.1.0 on 5 Jan 2017
• Based on: OWL, Java 11
• Key stats: 418 classes, 707 properties, 11,812 triples
• RDF adoption; focus on observables
16. Open Cybersecurity Schema Framework (OCSF)
• The Open Cybersecurity Schema Framework is an open-source project delivering an extensible framework for developing schemas, along with a vendor-agnostic core security schema. Vendors and other data producers can adopt and extend the schema for their specific domains.
• OCSF is intended to be used by both products and devices which produce log events, analytic systems, and logging systems which retain log events.
• First version: 14 July 2022
• Schema: JSON
• Loose inheritance; there is no reference database implementation.
17. Our Advantages
• Extensibility: base schema, inheritance
• Reference implementation: TypeDB, toolbox
• ER: entity-relationships, URIs
• Sharing: native STIX import/export
19. STIX Databases and Extensions
• Section 7.3: Extension Definition Policy (JSON schema)
• Section 11: Custom Object Extensions (deprecated)
• A work in progress for now, in cooperation with OASIS
• Is it possible? Yes
20. Omnibus Design
Layered schemas: Prod schema → Corp schema → Base schema
• Basic pattern: inherit and extend
• The base schema contains the main concepts:
  • CVE/CVSS/CWE/CAPEC
  • MAEC
  • COCOA
  • ATT&CK, D3FEND, Attack Flow, etc.
  • VERIZON/VERIS
• The specialized schema contains the business logic:
  • Sensor facts
  • Incident response playbooks
22. Example for IPS Packet
Inherit and expand
• It's excellent for additions
• The example here is to derive the CVE entity:
  • Add a relation to the device object
  • Add a relation to a volume count
• A simple YAML config drives the specific schema
• Sample record: Hostxyz|2022-01-01T10:00:00|CVE-2022-1234|1000
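A hedged sketch of this inherit-and-extend pattern in TypeQL, assuming the base schema already defines cve and device entity types (the relation and attribute names below are hypothetical):

```typeql
define
  # Extend the base schema: record per-device CVE sightings
  # from the IPS feed, with a timestamp and a volume count.
  volume-count sub attribute, value long;
  observed-at sub attribute, value datetime;
  cve-sighting sub relation,
    owns volume-count,
    owns observed-at,
    relates observed-cve,
    relates observing-device;
  cve plays cve-sighting:observed-cve;
  device plays cve-sighting:observing-device;
```

The base cve and device types are left untouched; the specialized schema only adds the relation and attributes it needs, which is what keeps the base schema shareable across companies.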
23. Cool Benefits
Auto enrichment
• Each entity could have an authoritative source
• This means auto-enrichment in real time, if required
Demo example
• Let's enrich the CVE data stream
• The source is the NVD database
• https://youtu.be/R0fyiBZCEyg
Project: https://github.com/priamai/omnicyberdb/tree/experimental