Machine learning aims at learning complex functions from data. Very often, this challenge remains ill-defined given the available amount of data; however, background knowledge that is available as knowledge graphs, ontologies, or symbolic (physical) equations allows for an improved specification of the targeted solution. In this talk, we discuss several use cases that incorporate symbolic background knowledge into machine learning tasks as regularizing priors, as constraints, or as other inductive biases.
Symbolic Background Knowledge for Machine Learning
1. IPVS – Institute for Parallel and Distributed Systems
Analytic Computing
Symbolic Background Knowledge for Machine Learning
Steffen Staab
https://www.ipvs.uni-stuttgart.de/departments/ac/
With slides contributed by Alexandra Baier, Luis Chamon, Tim Schneider, Bo Xiong, Thomas Monninger
2. • What is machine learning?
• Why (symbolic) background knowledge in machine learning?
• Which background knowledge?
• In which applications?
• How to apply in machine learning?
• A broad range of methods for applying background knowledge in ML
So, background knowledge helps, but...
3. Supervised Learning
Classification: predict discrete label for given examples
• Given a news article, assign a topic
Regression: predict continuous value for given examples
• Given meteorological conditions today, predict temperature tomorrow
Sequence Prediction: predict sequence for given sequence
• Given text, identify all noun phrases in the text
[Figure: a machine learning model maps example documents to the labels Politics, Culture, Lifestyle, Sports]
From KnowGraphs Winter School 2021
4. Unsupervised Learning
Clustering: summarize similar examples in clusters
• Given articles, form k clusters of most similar articles
Visualization and Dimensionality Reduction: map high-dimensional data into lower-dimensional space
• Map documents in d-dimensional space such that similar documents are close
Rule Mining: learn general rules from data
• Given gene network, find rules about frequent relationships
[Figure: a machine learning model groups example documents into clusters]
From KnowGraphs Winter School 2021
5. Supervised vs. Unsupervised
Many supervised methods now include some form of unsupervised learning:
• Word embedding layers: can encode interesting linguistic patterns
• Convolutional layers: can encode interesting visual or sequential patterns
From KnowGraphs Winter School 2021
The second-to-last layer encodes learned features:
• embeddings
• implicit, re-usable, non-symbolic background knowledge
6. Loss-based Machine Learning
• Define a loss function that evaluates the errors of a model w.r.t. the training data
• Training data: 𝒟 = {(x_n, y_n)}_{n=1}^{N}, x_n ∈ ℝ^d, y_n ∈ ℝ
• Adjust the model parameters θ to minimize the empirical risk:
min_θ (1/N) Σ_{n=1}^{N} Loss(f_θ(x_n), y_n)
[Figure: data {(x_n, y_n)} is fed into a machine learning model f_θ with parameters θ, which produces outputs f_θ(x_n)]
From KnowGraphs Winter School 2021
7. Training f_θ: Gradient-Based Adaptation of θ
Minimize Σ_{n=1}^{N} Loss(f_θ(x_n), y_n) locally:
1. initialize parameters θ randomly,
2. change parameters θ in the direction of the negative gradient,
3. repeat until a local minimum is reached
8. Example: Linear Regression
• Training data: 𝒟 = {(x_n, y_n)}_{n=1}^{N}, x_n ∈ ℝ, y_n ∈ ℝ
• Linear function: f_θ(x) = θ_1 x + θ_0
• Loss function: Loss(ŷ, y) = (ŷ − y)²
• Minimization of the empirical risk:
min_θ (1/N) Σ_{n=1}^{N} (f_θ(x_n) − y_n)²
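To make the training loop concrete, here is a minimal sketch (Python/NumPy, not from the slides) of gradient descent for exactly this linear-regression risk; the learning rate and step count are illustrative choices.

import numpy as np

def fit_linear(x, y, lr=0.01, steps=1000):
    # Minimize (1/N) * sum_n (theta1*x_n + theta0 - y_n)^2 by gradient descent.
    theta0, theta1 = 0.0, 0.0
    n = len(x)
    for _ in range(steps):
        residual = theta1 * x + theta0 - y          # f_theta(x_n) - y_n
        grad0 = 2.0 / n * residual.sum()            # partial derivative w.r.t. theta0
        grad1 = 2.0 / n * (residual * x).sum()      # partial derivative w.r.t. theta1
        theta0 -= lr * grad0                        # step along the negative gradient
        theta1 -= lr * grad1
    return theta0, theta1

# Example: noisy data around y = 2x + 1
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * x + 1.0 + 0.05 * np.random.randn(50)
print(fit_linear(x, y))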
10. Sparse data
Zero-shot learning: classify unseen data
• no training data about unicorns
• background knowledge: a unicorn is a horse with a horn
Few-shot learning: classify unseen data
• one/few training examples
11. Learning physics
General differential equation for water flux:
∂u/∂t = D(u) · ∂²u/∂x² − v(u) · ∂u/∂x + q(u)
Approximate for all finite volumes i.
Flux kernel ℱ_i:
ℱ_i = Σ_{j=1}^{N_{s_i}} f_j ≈ ∮_{ω⊆Ω} ( D(u) · ∂²u/∂x² − v(u) · ∂u/∂x ) · n dΓ
Simulation: numerically solve for all ℱ_i, exchanging with neighbors given boundary conditions.
ML alternative (Karlbauer et al 2022):
1. learn behavior of finite volumes
2. interleave with numerical solving
Picture (cc) by Shu, L., Ullrich, P. A., Duffy, C. J. (2020) in Geosci. Model Dev.:
https://gmd.copernicus.org/articles/13/2743/2020/
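For illustration only (not the architecture of Karlbauer et al.), a minimal sketch of one explicit update of the equation above on a 1D grid, assuming constant D, v and q and crude boundary handling:

import numpy as np

def advection_diffusion_step(u, dx, dt, D, v, q=0.0):
    # One explicit Euler step of du/dt = D*u_xx - v*u_x + q with constant coefficients.
    # dt must satisfy the usual stability restriction for explicit schemes.
    u_xx = (np.roll(u, -1) - 2.0 * u + np.roll(u, 1)) / dx**2   # second spatial derivative
    u_x = (np.roll(u, -1) - np.roll(u, 1)) / (2.0 * dx)         # first spatial derivative
    u_new = u + dt * (D * u_xx - v * u_x + q)
    u_new[0], u_new[-1] = u_new[1], u_new[-2]                   # crude zero-gradient boundaries
    return u_new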
12. ML alternative (Karlbauer et al 2022):
1. learn behavior of finite volumes
2. interleave with numerical solving
Background knowledge:
• water does not appear or disappear
• sum of amount of water is constant
• energy must be constant
• water cannot rise arbitrarily
• bedrock does not allow entry of water
• ...
Learning physics
Picture (cc) by Shu, L., Ullrich, P. A., Duffy, C. J. (2020) in Geosci. Model Dev.:
https://gmd.copernicus.org/articles/13/2743/2020/#&gid=1&pid=1
13. Traffic Scene Understanding
Given:
• Perception
• Tracked dynamic agents
• Background knowledge: a high-definition map (lane topology, infrastructure, semantic information)
Predict: intention of others, sensing mistakes
14. Knowledge Representations in Informed ML (von Rueden et al 2021)
We will look at several of them and some others.
Pre-trained non-symbolic models are missing here.
15. Influencing ML by Empirical Risk Minimization
Given training data 𝒟 = {(x_n, y_n)}_{n=1}^{N}, x_n ∈ ℝ^d, y_n ∈ ℝ:
min_θ (1/N) Σ_{n=1}^{N} Loss(f_θ(x_n), y_n)
(On the slide, this empirical risk is shown several times, each time highlighting a different element that background knowledge can influence: the training data, the model f_θ, the loss, and the parameters θ.)
Specific algorithms often modify several elements at once.
16. A pipeline view of the same consideration (von Rueden et al 2021)
23. EL++ Knowledge Bases
• EL++ is a lightweight description logic that
  • balances expressive power and reasoning complexity (polynomial)
  • is applied for large-scale ontologies, e.g. the Gene Ontology
TBox statements can be normalized into:
1. concept subsumption: C ⊑ D
2. concept intersection: C1 ⊓ C2 ⊑ D
3. right existential: ∃r.C1 ⊑ D
4. left existential: C1 ⊑ ∃r.C2
The ABox contains:
1. concept assertions: C(a)
2. role assertions: r(a,b)
Baader, F., Brandt, S., & Lutz, C. Pushing the EL envelope. In IJCAI 2005.
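As a small illustration (hypothetical axioms with made-up concept and role names, not from the slides), one instance of each normal form:
• Mother ⊑ Parent (concept subsumption)
• Parent ⊓ Female ⊑ Mother (concept intersection)
• ∃hasChild.Person ⊑ Parent (right existential)
• Parent ⊑ ∃hasChild.Person (left existential)
• ABox: Mother(mary), hasChild(mary, tom)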
24. Box EL++ Embedding
Idea: mapping logical constraints to geometric (soft) constraints
Solution: finding a geometric interpretation for each statement
• designing one loss term for each logical statement
• such that the KB is satisfiable when the loss is 0 (a.k.a. soundness)
• satisfiability implies that there is a geometric interpretation satisfying all logical statements
25. Geometric Interpretations of the ABox
Concept assertion C(a): geometric membership of the point for a in the region for C
Role assertion r(a, b): affine transformation T_r between the two points for a and b
[Figure: a point a inside a box C; a transformation T_r mapping the point a to the point b]
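A minimal sketch (Python/NumPy, illustrative rather than the exact Box EL++ formulation) of turning these two assertions into soft geometric losses, assuming concepts are axis-aligned boxes and the affine transformation is simplified to a translation:

import numpy as np

def concept_assertion_loss(point, box_lower, box_upper):
    # Soft version of C(a): how far the point for a lies outside the box for C (0 if inside).
    below = np.maximum(box_lower - point, 0.0)
    above = np.maximum(point - box_upper, 0.0)
    return np.linalg.norm(below + above)

def role_assertion_loss(a, b, t_r):
    # Soft version of r(a, b): distance between the translated head and the tail point.
    return np.linalg.norm((a + t_r) - b)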
28. Physics-informed Neural Networks
• Super important and successful class of ML methods for supporting scientific problem solving
• 3 pages, short easy read: Chris Edwards. 2022. Neural networks learn to speed up simulations. Commun. ACM 65, 5 (May 2022), 27–29. https://doi.org/10.1145/3524015
• A bit longer: Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S. and Yang, L., "Physics-informed machine learning," Nature Reviews Physics, 3(6), 2021, pp. 422-440.
• I skip it now for lack of time
30. Structured Multilabel Prediction
• Structured multilabel prediction assigns every instance multiple labels
• Labels are constrained by some background knowledge
• Question: can we produce predictions that are logically consistent with structured background knowledge?
31. Structured background knowledge
• Labels are organized in taxonomies/hierarchies/ontologies
• Two kinds of logical constraints (a sketch of turning them into soft losses follows below):
  • implication: rdfs:subClassOf
  • exclusion: owl:disjointWith
Examples: WordNet, biomedical ontologies
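One simple way (a sketch, not the method of the following slides) to express both kinds of constraints as differentiable penalties on per-label probabilities:

import torch

def constraint_loss(probs, implications, exclusions):
    # probs: (batch, num_labels) sigmoid outputs of a multilabel classifier.
    # implications: (child, parent) index pairs; p(child) should not exceed p(parent).
    # exclusions: (i, j) index pairs; p(i) + p(j) should not exceed 1.
    loss = probs.new_zeros(())
    for child, parent in implications:
        loss = loss + torch.relu(probs[:, child] - probs[:, parent]).mean()
    for i, j in exclusions:
        loss = loss + torch.relu(probs[:, i] + probs[:, j] - 1.0).mean()
    return loss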
32. Try support vector machines
[Figure: label hierarchy with implication and exclusion relations among Person, Parent, Mother, Father, Woman, Girl, Plant, Tree, shown next to separate SVM decision boundaries for Mother, Father, Person, Plant]
Issues:
• The Mother classifier misclassifies people (solvable with the kernel trick)
• The decision boundaries are unrelated to each other
33. Inductive bias in Poincaré hyperbolic space
• Each label is associated with a region contained by a Poincaré hyperplane
• Instances are points inside the region
• Logical constraints on labels are transformed into geometric soft constraints on the corresponding label regions
[Figure: the same label hierarchy (implication and exclusion relations among Person, Parent, Mother, Father, Woman, Girl, Plant, Tree) embedded as regions of the Poincaré disk]
38. Use case: Multi-step trajectory prediction
Critical issue: does the neural network make catastrophic predictions on unseen data?
39. Pattern: From full capacity to constrained learning
f_θ = g_θ1 ∘ σ ∘ MLP_θ2, trained with min_θ (1/N) Σ_{n=1}^{N} Loss(f_θ(x_n), y_n)
• MLP_θ2: ℝ^{d1} → ℝ^{d2}
  • arbitrary mapping from input to output
  • full learning capacity
• σ
  • component-wise sigmoid
  • bounds the output of MLP_θ2
• g_θ1
  • predicting with background restrictions
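A minimal PyTorch sketch of this composition (names and the concrete form of g are my assumptions; here g is a fixed affine map into an admissible output range rather than a learned component):

import torch
import torch.nn as nn

class ConstrainedPredictor(nn.Module):
    # f_theta = g_theta1 ∘ sigma ∘ MLP_theta2
    def __init__(self, d_in, d_out, lo, hi, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(                       # MLP_theta2: full learning capacity
            nn.Linear(d_in, hidden), nn.ReLU(), nn.Linear(hidden, d_out))
        self.register_buffer("lo", torch.as_tensor(lo, dtype=torch.float32))
        self.register_buffer("hi", torch.as_tensor(hi, dtype=torch.float32))

    def forward(self, x):
        z = torch.sigmoid(self.mlp(x))                  # sigma: bounds the MLP output to (0, 1)
        return self.lo + (self.hi - self.lo) * z        # g: maps into the admissible range [lo, hi]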
40. System identification
Linear system: x_t = A x_{t−1} + B u_t
The next system state x_t (e.g. the velocity of a ship) depends linearly on the previous state x_{t−1} and the control u_t (e.g. a force).
Switched linear system: x_t = A_{σ(t)} x_{t−1} + B_{σ(t)} u_t
σ(t) chooses between different matrices A_{σ(t)}, B_{σ(t)}.
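For reference, a small sketch (Python/NumPy) simulating such a switched linear system for a given switching signal σ; all names are illustrative:

import numpy as np

def simulate_switched_linear(A_list, B_list, sigma, x0, controls):
    # x_t = A_{sigma(t)} x_{t-1} + B_{sigma(t)} u_t
    x = np.asarray(x0, dtype=float)
    trajectory = []
    for t, u in enumerate(controls):
        k = sigma(t)                         # switching signal selects the active linear model
        x = A_list[k] @ x + B_list[k] @ u
        trajectory.append(x)
    return np.array(trajectory)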
41. ReLiNet: Stable and Explainable Multistep Prediction (Baier et al 2023)
[Figure: an LSTM with hidden state h_t consumes the control input u_t and, via mappings W(A) and W(B), predicts matrices A_t and B_t, which produce the next state x_{t+1} from x_t and u_t]
1. The LSTM predicts a linear model at each time step
2. The prediction only depends on the linear models → inherently explainable
3. ReLiNet is a switched linear system → stability guarantees with simple constraints
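A rough PyTorch sketch of the idea (my own simplification, not the published ReLiNet code; the stability constraints on the predicted matrices are omitted here): an LSTM emits A_t and B_t at every step, and the state update remains a switched linear system.

import torch
import torch.nn as nn

class LinearModelPredictor(nn.Module):
    def __init__(self, state_dim, control_dim, hidden_dim=64):
        super().__init__()
        self.lstm = nn.LSTMCell(control_dim, hidden_dim)
        self.W_A = nn.Linear(hidden_dim, state_dim * state_dim)    # hidden state -> A_t
        self.W_B = nn.Linear(hidden_dim, state_dim * control_dim)  # hidden state -> B_t
        self.state_dim, self.control_dim = state_dim, control_dim

    def forward(self, x0, controls):
        # x0: (state_dim,), controls: (T, control_dim)
        h = controls.new_zeros(1, self.lstm.hidden_size)
        c = controls.new_zeros(1, self.lstm.hidden_size)
        x, states = x0, []
        for u in controls:
            h, c = self.lstm(u.unsqueeze(0), (h, c))
            A = self.W_A(h).view(self.state_dim, self.state_dim)
            B = self.W_B(h).view(self.state_dim, self.control_dim)
            x = A @ x + B @ u                 # prediction depends only on the linear models
            states.append(x)
        return torch.stack(states)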
42. Explaining Predictions
• Contribution of past inputs to the prediction
• The explanation allows a faithful reconstruction of the prediction
• F_{t,d} = feature weight at time t for output d; u_t = input at time t
43. Falcon applying the pattern (Tang et al 2023)
f_θ = g_θ1 ∘ σ ∘ MLP_θ2, trained with min_θ (1/N) Σ_{n=1}^{N} Loss(f_θ(x_n), y_n)
• MLP_θ2: ℝ^{d1} → ℝ^{d2}
  • arbitrary mapping from input to output
  • full learning capacity, maps entities and concepts
• σ
  • component-wise sigmoid
  • bounds the output of MLP_θ2
• g_θ1
  • encodes background restrictions
44. Falcon: Faithful Neural Semantic Entailment over 𝓐𝓛𝓒 Ontologies
Input: an ALC ABox and TBox
1. ℝ^d is the domain of interpretations
2. f_e is learned to interpret concept, relation and object names as elements of ℝ^d
3. E.g., check whether an object belongs to a class via m: m(x, C^ℐ) = σ(MLP(f_e(C), f_e(x)))
4. Exemplary part of the loss function, where E is sampled from ℝ^d:
ℒ_𝒯 = (1/|E|) · (1/|𝒯|) · Σ_{C⊑D∈𝒯} Σ_{e∈E} m(e, (C ⊓ ¬D)^ℐ)
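A toy sketch (PyTorch; dimensions, names and the MLP shape are assumptions) of the membership function m from step 3; the TBox loss above then averages this score for sampled points e over the regions interpreting C ⊓ ¬D:

import torch
import torch.nn as nn

class MembershipScorer(nn.Module):
    # m(x, C^I) = sigmoid(MLP([f_e(C); f_e(x)])): fuzzy degree to which point x belongs to concept C.
    def __init__(self, d, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2 * d, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, concept_emb, point_emb):
        return torch.sigmoid(self.mlp(torch.cat([concept_emb, point_emb], dim=-1)))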
50. Use Case: Environmental Science
[Figure: sorption isotherm relating the sorptive concentration c (mg/L) to the sorbate concentration s (mg/kg), annotated with background knowledge]
51. Knowledge-informed Machine Learning to Extend Scientific Knowledge
Research questions: "How to represent (RQ1) and automatically exploit (RQ2) scientific knowledge for inference?"
[Figure: pipeline in which (1) training data and (2) domain knowledge (class of expressions, symmetries, ...; e.g. symbolic structure and continuous parameters) are input to a Bayesian Machine Scientist, whose output are algebraic equations / analytical forms]
52. How to represent and exploit the knowledge?
How likely does an explored solution conform to the knowledge?
Representation (RQ1):
1. Scientific domain knowledge
2. Probabilistic Regular Tree Expression (pRTE)
3. Factor graph (built as a probabilistic finite state machine)
Exploitation (RQ2):
4. (Prior) probabilities to perform Bayesian inference (MCMC)
(A small illustration of grammar-based sampling follows below.)
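To make a grammar-based prior over symbolic structure tangible, a small sketch (Python, with made-up rules and probabilities) of sampling expression structures from a probabilistic grammar; an actual pRTE prior and the MCMC over continuous parameters are beyond this sketch:

import random

RULES = {
    "E": [(0.4, ["(", "E", "+", "E", ")"]), (0.3, ["(", "E", "*", "E", ")"]), (0.3, ["T"])],
    "T": [(0.5, ["x"]), (0.5, ["c"])],
}

def sample(symbol="E", depth=0, max_depth=4):
    if symbol not in RULES:
        return symbol                                   # terminal symbol
    if symbol == "E" and depth >= max_depth:
        return sample("T", depth + 1, max_depth)        # force termination at maximum depth
    r, acc = random.random(), 0.0
    for p, rhs in RULES[symbol]:
        acc += p
        if r <= acc:
            return "".join(sample(s, depth + 1, max_depth) for s in rhs)
    return "".join(sample(s, depth + 1, max_depth) for s in RULES[symbol][-1][1])

print(sample())   # e.g. "((x+c)*x)"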
55. Ontology of the graph neural network for traffic scene understanding (Schmidt et al 2023)
56. Message passing in graph neural networks
• Every instantiated node is randomly initialized with a vector
• Every instantiated node sends its vector to its neighbors
• Every instantiated node learns how to aggregate neighbor information
• Parameter sharing over same node types and edge types
(A minimal sketch of one message-passing round follows below.)
[Figure: scene graph with nodes Car 1, Car 2, Car 3, Lane A, Lane B and a stop sign]
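A minimal sketch (PyTorch) of a single message-passing round over such a scene graph; for simplicity it uses one shared parameter set and mean aggregation rather than per-node-type and per-edge-type parameters:

import torch

def message_passing_round(node_feats, edges, W_self, W_neigh):
    # node_feats: (num_nodes, d); edges: list of (src, dst) pairs; W_self, W_neigh: (d, d) matrices.
    num_nodes = node_feats.shape[0]
    agg = torch.zeros_like(node_feats)
    deg = torch.zeros(num_nodes)
    for src, dst in edges:
        agg[dst] += node_feats[src]                     # every node sends its vector to its neighbors
        deg[dst] += 1.0
    agg = agg / deg.clamp(min=1.0).unsqueeze(-1)        # mean aggregation of neighbor messages
    return torch.relu(node_feats @ W_self + agg @ W_neigh)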
58. Limitations
• Approaches often modify several aspects of min_θ (1/N) Σ_{n=1}^{N} Loss(f_θ(x_n), y_n)
• The focus of my presentation was on work done at Analytic Computing @ University of Stuttgart
• For practical reasons: not a comprehensive literature study, but our papers point to other papers
• Existing research on knowledge-informed ML goes much beyond this
59. Outlook on Knowledge-informed ML
• From knowledge-informed ML to knowledge discovery
• From interpolation to extrapolation, including few- and zero-shot learning
• From implicit knowledge to explicit knowledge, including self-learning and prototypical knowledge
• From classification to non-standard queries: similarity, analogy
60. Thank you!
Steffen Staab
Analytic Computing, IPVS, Universität Stuttgart
Universitätsstraße 32, 70569 Stuttgart
E-Mail: Steffen.staab@ipvs.uni-stuttgart.de
Web: ipvs.uni-stuttgart.de/departments/ac/
My thanks and acknowledgements to all my collaborators within and beyond Analytic Computing, within and beyond EXC SimTech, in particular: Alexandra Baier, Daniel Frank, Cosimo Gregucci, Daniel Hernandez, Wolfgang Nowak, Nico Potyka, Tim Schneider, Amin Totounferoush, Bo Xiong, Thomas Monninger, Julian Schmidt, and many more
61. References
Surveys
• Hu, Y., Chapman, A., Wen, G. and Hall, D.W., "What Can Knowledge Bring to Machine Learning? A Survey of Low-shot Learning for Structured Data," ACM Transactions on Intelligent Systems and Technology (TIST), 13(3), 2021, pp. 1-45.
• von Rueden, Laura, et al. "Informed Machine Learning - A Taxonomy and Survey of Integrating Prior Knowledge into Learning Systems." IEEE Transactions on Knowledge and Data Engineering (2021).
Factual knowledge
• J. Schmidt, T. Monninger, J. Rupprecht, D. Raba, J. Jordan, D. Frank, S. Staab, K. Dietmayer. SCENE: Reasoning about Traffic Scenes using Heterogeneous Graph Neural Networks. IEEE Robotics and Automation Letters, 8(3), 2023.
62. References: Physics-informed Neural Networks
• Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S. and Yang, L., "Physics-informed machine learning," Nature Reviews Physics, 3(6), 2021, pp. 422-440.
• Chris Edwards. 2022. Neural networks learn to speed up simulations. Commun. ACM 65, 5 (May 2022), 27–29. https://doi.org/10.1145/3524015
• Matthias Karlbauer, Timothy Praditia, Sebastian Otte, Sergey Oladyshkin, Wolfgang Nowak, Martin V. Butz: Composing Partial Differential Equations with Physics-Aware Neural Networks. ICML 2022: 10773-10801.
63. References: Inductive Biases
• A. Baier, D. Aspandi-Latif, S. Staab. ReLiNet: Stable and Explainable Multistep Prediction with Recurrent Linear Parameter Varying Networks. Unpublished/submitted 2023.
• Zhenwei Tang, Tilman Hinnerichs, Xi Peng, Xiangliang Zhang, Robert Hoehndorf. FALCON: Faithful Neural Semantic Entailment over ALC Ontologies. https://arxiv.org/abs/2208.07628
Geometric inductive biases
• B. Xiong, S. Zhu, M. Nayyeri, C. Xu, S. Pan, C. Zhou, S. Staab. Ultrahyperbolic Knowledge Graph Embeddings. In: Proc. of KDD '22 - The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, East Lansing, MI, USA, August 14-18, 2022.
• B. Xiong, N. Potyka, T.-K. Tran, M. Nayyeri, S. Staab. Faithful Embeddings for EL++ Knowledge Bases. In: 21st International Semantic Web Conference (ISWC 2022), (virtual event) Nov 2022, Springer 2022.
• B. Xiong, S. Zhu, N. Potyka, S. Pan, C. Zhou, S. Staab. Pseudo-Riemannian Graph Convolutional Networks. In: Proc. of the 36th Conference on Neural Information Processing Systems (NeurIPS 2022), Nov 28-Dec 9, 2022.
• B. Xiong, M. Cochez, M. Nayyeri, S. Staab. Hyperbolic Embedding Inference for Structured Multi-Label Prediction. In: Proc. of the 36th Conference on Neural Information Processing Systems (NeurIPS 2022), Nov 28-Dec 9, 2022.
64. References
Data augmentation
• A. Hotho, S. Staab, G. Stumme. Ontologies Improve Text Document Clustering. In: Proceedings of the International Conference on Data Mining (ICDM 2003), IEEE Press, 2003a.
• A. Hotho, S. Staab, G. Stumme. Explaining Text Clustering Results using Semantic Structures. In: Principles of Data Mining and Knowledge Discovery, 7th European Conference (PKDD 2003), Dubrovnik, Croatia, September 22-26, 2003b.
• Stephan Bloehdorn, Andreas Hotho: Text Classification by Boosting Weak Learners based on Terms and Concepts. ICDM 2004: 331-334.
• Stephan Bloehdorn, Andreas Hotho: Ontologies for Machine Learning. Handbook on Ontologies 2009: 637-661.
Optimization
• T. Schneider, A. Totounferoush, W. Nowak, S. Staab. Probabilistic Regular Tree Priors for Scientific Symbolic Reasoning. Unpublished/submitted 2023. (grammar-induced optimization)
• L. F. O. Chamon, S. Paternain, M. Calvo-Fullana and A. Ribeiro, "Constrained Learning With Non-Convex Losses," IEEE Transactions on Information Theory, vol. 69, no. 3, pp. 1739-1760, March 2023, doi: 10.1109/TIT.2022.3187948.