Genetics influence inter-subject Brain State Prediction, by Cameron Craddock
Poster from 2011 Annual Meeting of the Organization for Human Brain Mapping.
Support vector regression trained to predict intrinsic brain activity from one individual, applied to their twin, works better for identical twins than fraternal twins.
Deep reinforcement learning from scratch, by Jie-Han Chen
1. The document provides an overview of deep reinforcement learning and the Deep Q-Network algorithm. It defines the key concepts of Markov Decision Processes including states, actions, rewards, and policies.
2. The Deep Q-Network uses a deep neural network as a function approximator to estimate the optimal action-value function. It employs experience replay and a separate target network to stabilize learning; a minimal sketch of this update loop follows the list.
3. Experiments applying DQN to the Atari 2600 game Space Invaders are discussed, comparing different loss functions and optimizers. The standard DQN configuration with MSE loss and RMSProp performed best.
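To make points 2 and 3 concrete, here is a minimal PyTorch-style sketch of the replay-buffer/target-network update loop. It is illustrative only, not the slides' code: the network sizes, hyperparameters, and buffer contents are assumptions, though it uses the MSE loss and RMSProp that point 3 reports working best.

```python
# Minimal DQN update sketch (illustrative, not the slides' code).
# Assumes discrete actions and a replay buffer of (s, a, r, s', done) tuples.
import random
from collections import deque

import torch
import torch.nn as nn

def make_net(obs_dim, n_actions):
    # small MLP as the Q-function approximator (sizes are placeholders)
    return nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                         nn.Linear(64, n_actions))

obs_dim, n_actions, gamma = 4, 2, 0.99
q_net = make_net(obs_dim, n_actions)             # online network
target_net = make_net(obs_dim, n_actions)        # separate target network
target_net.load_state_dict(q_net.state_dict())   # start in sync
optimizer = torch.optim.RMSprop(q_net.parameters(), lr=2.5e-4)
replay = deque(maxlen=100_000)                   # experience replay buffer

def train_step(batch_size=32):
    if len(replay) < batch_size:
        return
    s, a, r, s2, done = map(torch.tensor, zip(*random.sample(replay, batch_size)))
    q = q_net(s.float()).gather(1, a.long().unsqueeze(1)).squeeze(1)
    with torch.no_grad():                        # bootstrap from the frozen target net
        best_next = target_net(s2.float()).max(1).values
        target = r.float() + gamma * (1.0 - done.float()) * best_next
    loss = nn.functional.mse_loss(q, target)     # MSE loss, per point 3
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # every N steps: target_net.load_state_dict(q_net.state_dict())
```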
Deep neural networks with remarkably strong generalization performance are usually over-parameterized. Although practitioners use explicit regularization strategies to avoid over-fitting, their impact is often small. Some theoretical studies have analyzed the implicit regularization effect of stochastic gradient descent (SGD) on simple machine learning models under certain assumptions, but how it behaves in practice on state-of-the-art models and real-world datasets is still unknown. To bridge this gap, we study the role of SGD's implicit regularization in deep learning systems. We show that pure SGD tends to converge to minima with better generalization performance on multiple natural language processing (NLP) tasks, and that this phenomenon coexists with dropout, an explicit regularizer. In addition, a neural network's finite learning capacity does not alter the intrinsic nature of SGD's implicit regularization effect: under limited training samples or with certain corrupted labels, the effect remains strong. We further analyze its stability by varying the weight initialization range, and corroborate these experimental findings with a decision boundary visualization using a 3-layer neural network for interpretation. Altogether, our work deepens the understanding of how implicit regularization affects deep learning models and sheds light on future study of the generalization ability of over-parameterized models.
Lecture slides from DASI, spring 2018, National Cheng Kung University, Taiwan. The content covers deep reinforcement learning: policy gradient methods, including variance reduction and importance sampling.
This document summarizes different approaches for multi-agent deep reinforcement learning. It discusses training multiple independent agents concurrently, centralized training with decentralized execution, and approaches that involve agent communication, like parameter sharing and multi-agent deep deterministic policy gradient (MADDPG). MADDPG allows each agent to have its own reward function and trains agents centrally while executing decisions in a decentralized manner. The document provides examples of applying these methods to problems like predator-prey and uses the prisoner's dilemma to illustrate how agents can learn communication protocols.
- The document discusses the multi-armed bandit problem, which is a simplified decision-making problem used to discuss exploration-exploitation dilemmas in reinforcement learning.
- It provides examples of applying the k-armed bandit problem to recommendation systems, choosing experimental medical treatments, and other scenarios.
- Two methods are introduced for estimating the value of each action: sample-average methods which average rewards over time, and incremental implementations which update estimates online without storing all past rewards.
- Exploration involves selecting non-greedy actions to improve estimates, while exploitation selects the action with the highest estimated value. The ε-greedy policy balances exploration and exploitation.
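A minimal sketch of the two mechanisms in the last two bullets, the incremental sample-average update and ε-greedy selection. The number of arms and the value of ε are arbitrary choices for illustration.

```python
# Minimal k-armed bandit sketch: incremental value estimates + ε-greedy.
import random

k, epsilon = 10, 0.1
Q = [0.0] * k      # estimated value of each action
N = [0] * k        # how many times each action was taken

def select_action():
    if random.random() < epsilon:                 # explore: try a random arm
        return random.randrange(k)
    return max(range(k), key=lambda a: Q[a])      # exploit: greedy action

def update(action, reward):
    # Incremental sample average: no need to store all past rewards.
    N[action] += 1
    Q[action] += (reward - Q[action]) / N[action]
```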
The document discusses challenges in reinforcement learning. It defines reinforcement learning as combining aspects of supervised and unsupervised learning, using sparse, time-delayed rewards to learn optimal behavior. The two main challenges are the credit assignment problem of determining which actions led to rewards, and balancing exploration of new actions with exploitation of existing knowledge. Q-learning is introduced as a way to estimate state-action values to learn optimal policies, and deep Q-networks are proposed to approximate Q-functions using neural networks for large state spaces. Experience replay and epsilon-greedy exploration are also summarized as techniques for improving deep Q-learning's stability and exploration.
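For reference, the tabular Q-learning update that the deep variant builds on can be written in a few lines; this is a generic sketch, with placeholder state/action representations and step size.

```python
# One-step tabular Q-learning update (illustrative).
from collections import defaultdict

alpha, gamma = 0.1, 0.99
Q = defaultdict(float)   # Q[(state, action)] -> estimated value

def q_update(s, a, r, s_next, actions):
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    # Move the estimate toward the bootstrapped target r + gamma * max_a' Q(s', a').
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])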
Using RealTime fMRI Based Neurofeedback To Probe Default Network Regulation, by Cameron Craddock
Talk given at the 63rd Annual Meeting of the American Academy of Child & Adolescent Psychiatry. Describes an experiment using realtime fMRI neurofeedback to probe participants' ability to modulate default network regulation, along with preliminary results.
Discrete sequential prediction of continuous actions for deep RL, by Jie-Han Chen
This paper proposes a method called SDQN (Sequential Deep Q-Network) to solve continuous-action problems with a value-based reinforcement learning approach. SDQN discretizes continuous actions into sequential discrete steps, transforming the original MDP into an "inner MDP" between consecutive discrete steps and an "outer MDP" between states. SDQN uses two Q-networks: an inner Q-network that estimates state-action values for each discrete step, and an outer Q-network that estimates values between states. The inner networks are updated with Q-learning, and regression matches the last inner Q to the outer Q. The method is tested on a multimodal environment and several MuJoCo tasks, where it outperforms...
https://telecombcn-dl.github.io/2018-dlai/
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks or Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles of deep learning from both algorithmic and computational perspectives.
Delayed Rewards in the context of Reinforcement Learning based Recommender ..., by Debmalya Biswas
We present a Reinforcement Learning (RL) based approach to implement recommender systems. The results are based on a real-life wellness app that is able to provide personalized health / activity related content to users in an interactive fashion. Unfortunately, current recommender systems are unable to adapt to continuously evolving features, e.g. user sentiment, and scenarios where the RL reward needs to be computed based on multiple and unreliable feedback channels (e.g., sensors, wearables). To overcome this, we propose three constructs: (i) weighted feedback channels, (ii) delayed rewards, and (iii) rewards boosting, which we believe are essential for RL to be used in recommender systems.
https://telecombcn-dl.github.io/2017-dlai/
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks or Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles of deep learning from both algorithmic and computational perspectives.
An introduction to reinforcement learning, by Jie-Han Chen
This document provides an introduction and overview of reinforcement learning. It begins with a syllabus that outlines key topics such as Markov decision processes, dynamic programming, Monte Carlo methods, temporal difference learning, deep reinforcement learning, and active research areas. It then defines the key elements of reinforcement learning including policies, reward signals, value functions, and models of the environment. The document discusses the history and applications of reinforcement learning, highlighting seminal works in backgammon, helicopter control, Atari games, Go, and dialogue generation. It concludes by noting challenges in the field and prominent researchers contributing to its advancement.
Systems biology in polypharmacology: explaining and predicting drug secondary..., by Andrei KUCHARAVY
This document discusses using systems biology approaches to predict and explain off-target drug effects. It notes that unexpected secondary drug effects and lack of therapeutic effects are major reasons drug development fails. The document proposes using computational methods to predict all protein targets a drug may affect and using systems biology to model the consequences, including secondary effects, unexpected therapeutic effects via drug repositioning, and unexpected lack of effects. It outlines a master's project to develop models of polypharmacological drug effects by analyzing networks of drug-affected protein targets and retrieving relevant biological annotation to interpret effects.
Understanding Protein Function on a Genome-scale through the Analysis of Molecular Networks
Cornell Medical School, Physiology, Biophysics and Systems Biology (PBSB) graduate program, 2009.01.26, 16:00-17:00; [I:CORNELL-PBSB] (Long networks talk, incl. the following topics: why networks w. amsci*, funnygene*, net. prediction intro, memint*, tse*, essen*, sandy*, metagenomics*, netpossel*, tyna*+ topnet*, & pubnet* . Fits easily into 60’ w. 10’ questions. PPT works on mac & PC and has many photos w. EXIF tag kwcornellpbsb .)
Date Given: 01/26/2009
Modified Monkey Optimization Algorithm for Solving Optimal Reactive Power Dis..., by ijeei-iaes
This paper presents a novel Modified Monkey Optimization (MMO) algorithm for solving the optimal reactive power dispatch problem. MMO is a population-based stochastic meta-heuristic inspired by the intelligent foraging behaviour of monkeys; the paper improves both the local-leader and global-leader phases. The proposed MMO algorithm has been tested on the standard IEEE 30-bus test system, and simulation results show its worthy performance in reducing real power loss.
Robust Immunological Algorithms for High-Dimensional Global Optimization, by Mario Pavone
The document describes an immunological algorithm for global optimization problems. It introduces global optimization problems and challenges in solving them. It then describes how artificial immune systems and clonal selection algorithms can be applied to optimization through cloning, hypermutation, aging and selection operators. The algorithm is tested on benchmark optimization functions and its performance is analyzed using different potential mutation approaches and parameter tuning. Results show the algorithm is effective for solving high-dimensional global optimization problems.
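A minimal sketch of one generation of the clonal selection loop named above (cloning, hypermutation, aging, selection). The operator parameters and Gaussian mutation are illustrative assumptions, not the paper's exact operators.

```python
# One generation of a clonal-selection immune algorithm (illustrative).
import random

def clonal_generation(pop, f, dup=2, rho=0.3, max_age=10):
    # pop: list of (solution_vector, age) pairs; f: fitness to minimize.
    clones = []
    for x, age in pop:
        for _ in range(dup):                                  # cloning operator
            y = [xi + random.gauss(0.0, rho) for xi in x]     # hypermutation
            clones.append((y, 0))                             # clones start at age 0
    aged = [(x, age + 1) for x, age in pop + clones]
    survivors = [(x, a) for x, a in aged if a <= max_age]     # aging operator
    survivors.sort(key=lambda xa: f(xa[0]))                   # selection (best first)
    return survivors[:len(pop)]
```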
Accelerating a System’s Biology Kernel Using FPGAs, by Muhammad Awais
This document describes using an FPGA to accelerate the simulation of a gene regulatory network model of cortical area development. It involves implementing a Boolean network model of the interactions between 5 genes (Fgf8, Emx2, Pax6, Coup-tfi, Sp8) in Verilog. 24 possible interactions between the genes were defined. The logical rules for each gene's expression were transformed into Boolean logic functions. Initial and desired steady states were defined for anterior and posterior compartments. Over 1.68 million possible networks were simulated, with 50,559 networks found to follow the desired trajectory from initial to steady states.
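For illustration, a synchronous Boolean-network update of the kind the FPGA simulates can be sketched in Python (the document's implementation is in Verilog). The logic rules below are invented placeholders: the summary names the five genes but does not give the actual 24 interactions.

```python
# Synchronous Boolean-network update (illustrative; rules are placeholders).
GENES = ["Fgf8", "Emx2", "Pax6", "Coup_tfi", "Sp8"]

RULES = {  # each gene's next state as a Boolean function of the current state
    "Fgf8":     lambda s: s["Fgf8"] and not s["Emx2"],
    "Emx2":     lambda s: s["Coup_tfi"] and not s["Fgf8"],
    "Pax6":     lambda s: s["Sp8"] and not s["Emx2"],
    "Coup_tfi": lambda s: not s["Fgf8"],
    "Sp8":      lambda s: s["Fgf8"],
}

def step(state):
    return {g: RULES[g](state) for g in GENES}

def trajectory(state, max_steps=32):
    # Iterate until a fixed point (steady state) or the step budget runs out.
    for _ in range(max_steps):
        nxt = step(state)
        if nxt == state:
            break
        state = nxt
    return state
```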
Particle Swarm Optimization (PSO) is an optimization technique inspired by swarm behavior in animals. It works by having a population (swarm) of potential solutions (particles) and updating the movement of the particles based on their personal best position and the global best position. The basic algorithm initializes a swarm of random particles and then iteratively updates the velocity and position of particles using equations that factor in inertia, cognition toward personal best, and social behavior toward global best. PSO has been shown to perform comparably to genetic algorithms but requires fewer parameters to adjust. Variants and hybridizations of PSO have also been developed to improve performance for different problem types.
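A minimal sketch of the canonical update just described, with inertia, a cognitive pull toward each particle's personal best, and a social pull toward the global best. The coefficients are typical textbook values, not prescribed ones.

```python
# Canonical PSO velocity/position update (inertia + cognitive + social terms).
import random

def pso_step(xs, vs, pbest, gbest, w=0.7, c1=1.5, c2=1.5):
    # xs, vs: particle positions/velocities; pbest: per-particle bests; gbest: swarm best.
    for i in range(len(xs)):
        for d in range(len(xs[i])):
            r1, r2 = random.random(), random.random()
            vs[i][d] = (w * vs[i][d]                           # inertia
                        + c1 * r1 * (pbest[i][d] - xs[i][d])   # cognition (personal best)
                        + c2 * r2 * (gbest[d] - xs[i][d]))     # social (global best)
            xs[i][d] += vs[i][d]
```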
OPTIMIZATION OF QOS PARAMETERS IN COGNITIVE RADIO USING ADAPTIVE GENETIC ALGO..., by ijngnjournal
Genetic algorithm (GA) based optimization relies on explicit relationships between parameters, observations, and criteria. Applied to cognitive radio, GA-based optimization can accommodate secondary users in the best available space in the spectrum by interacting with the dynamic radio environment in real time. In this paper we propose an adaptive genetic algorithm with adaptive crossover and mutation parameters for the reasoning engine in cognitive radio, used to obtain optimal radio configurations. This method ensures better control of the algorithm parameters and hence increases performance. The main advantage of genetic algorithms over other soft-computing techniques is their multi-objective handling capability. We focus on spectrum management under the hypothesis that inputs are provided either by sensing information from the radio environment or by the secondary user, and that the QoS requirements are also specified. The cognitive radio senses radio-frequency parameters from the environment, and its reasoning engine takes the decisions required to provide a new spectrum allocation as demanded by the user. Transmission parameters that can be taken into consideration include modulation method, bandwidth, data rate, symbol rate, and power consumption. We simulated a cognitive radio engine driven by a genetic algorithm to determine the optimal set of radio transmission parameters. Fitness objectives guide the system to an optimal state; these objectives are combined into one multi-objective fitness function using a weighted-sum approach, so that each objective is assigned a rank representing its importance. Transmission parameters serve as decision variables, and environmental parameters are used as inputs to the objective function. We compared the proposed adaptive genetic algorithm (AGA) with a conventional genetic algorithm (CGA) under the same set of conditions; MATLAB simulations were used to analyze the scenarios.
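The weighted-sum scalarization the abstract describes can be written compactly; a minimal sketch, where the objective functions, environment inputs, and weights are placeholders rather than the paper's actual formulation.

```python
# Weighted-sum scalarization of multiple fitness objectives (illustrative).
def weighted_fitness(params, env, objectives, weights):
    # objectives: functions scoring one goal each (e.g. minimize power,
    # maximize data rate) given decision variables and environment inputs.
    assert abs(sum(weights) - 1.0) < 1e-9   # weights encode objective importance
    return sum(w * obj(params, env) for w, obj in zip(weights, objectives))
```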
Peter Langfelder presented on weighted gene co-expression network analysis of HD data. Key points:
- WGCNA identified gene modules in mouse striatum associated with CAG repeat length. Neuronal modules were down-regulated with increasing repeat length, while oligodendrocyte modules were up-regulated.
- Human HD brain regions showed common and region-specific responses. A neuronal module was down-regulated across all regions, while astrocyte and microglial modules were up-regulated.
- Consensus modules identified co-expressed genes consistently changed across multiple human HD datasets, providing robust modules for further investigation.
This presentation is based on an article titled "Knowledge-Primed Neural Networks Enable Biologically Interpretable Deep Learning on Single-Cell Sequencing Data", as an application of artificial neural networks to gene regulatory networks in systems biology.
The document discusses biological computation and gene regulation in cells. It describes how (1) cells perform biochemical information processing to transform cues into biological functions, (2) embryonic stem cells can adopt different states through gene regulation, and (3) techniques like Boolean networks and satisfiability modulo theories can be used to model and analyze gene interaction networks inferred from experimental data. The techniques allow predicting cell behaviors and identifying gene interaction programs governing processes like stem cell differentiation.
Machine learning techniques can help address several unsolved problems in structural bioinformatics, including predicting protein flexibility and binding sites. The document discusses using machine learning models like SVMs trained on structural data to predict flexibility regions and protein-protein interaction sites from sequence alone. It also presents challenges in defining protein domain boundaries and predicting other structural features from sequence.
Bioinformatics emerged from the marriage of computer science and molecular biology to analyze massive amounts of biological data, like that produced by the Human Genome Project. It uses algorithms and techniques from computer science to solve problems in molecular biology, like comparing genomic sequences to understand evolution. As genomic data exploded publicly, bioinformatics was needed to efficiently store, analyze, and make sense of this information, which has applications in molecular medicine, drug development, agriculture, and more.
The Influence of Age Assignments on the Performance of Immune Algorithms, by Mario Pavone
How long a B cell remains, evolves, and matures inside a population plays a crucial role in an immune algorithm's ability to escape local optima and find the global optimum. Assigning the right age to each clone (or offspring, in general) means finding the proper balance between exploration and exploitation. In this research work we present an experimental study conducted on an immune algorithm based on the clonal selection principle, performed over eleven different age assignments, with the main aim of verifying whether at least one or two of the top 4 in the previous efficiency ranking, produced on the one-max problem, still appear among the top 4 in a new efficiency ranking obtained on a different complex problem. The NK landscape model, a mathematical model formulated for the study of tunably rugged fitness landscapes, was chosen as the test problem. From the many experiments performed, it is possible to assert that in the elitist variant of the immune algorithm, two of the best age assignments previously discovered still appear among the top 3 of the new rankings, while three do so in the non-elitist version. Further, in the elitist variant none of the previous top 4 ever ranks in first position, unlike in the non-elitist variant, where the previous best continues to appear in 1st position more often than the others. Finally, this study confirms that assigning the cloned B cell the same age as its parent is not a good strategy, since it remains the worst in the new efficiency ranking.
Talk and poster presented at the American Society for Biochemistry and Molecular Biology/Experimental Biology Conference on April 4, 2016. Abstract: A gene regulatory network (GRN) consists of genes, transcription factors, and the regulatory connections between them that govern the level of expression of mRNA and proteins from those genes. Our group has developed a MATLAB software package, called GRNmap, that uses ordinary differential equations to model the dynamics of medium-scale GRNs. The program uses a penalized least squares approach (Dahlquist et al. 2015, DOI: 10.1007/s11538-015-0092-6) to estimate production rates, expression thresholds, and regulatory weights for each transcription factor in the network based on gene expression data, and then performs a forward simulation of the dynamics of the network. GRNmap has options for using a sigmoidal or Michaelis-Menten production function. Parameters for a series of related networks, ranging in size from 15 to 35 genes, were optimized against DNA microarray data measuring the transcriptional response to cold shock in wild type and five strains individually deleted for the transcription factors, Cin5, Gln3, Hap4, Hmo1, Zap1, of budding yeast, Saccharomyces cerevisiae BY4741. Model predictions fit the experimental data well, within the 95% confidence interval. Open source code and a compiled executable that can run without a MATLAB license are available from http://kdahlquist.github.io/GRNmap/. GRNsight is an open source web application for visualizing such models of gene regulatory networks. GRNsight accepts GRNmap- or user-generated spreadsheets containing an adjacency matrix representation of the GRN and automatically lays out the graph of the GRN model. The application colors the edges and adjusts their thicknesses based on the sign (activation or repression) and the strength (magnitude) of the regulatory relationship, respectively. Users can then modify the graph to define the best visual layout for the network. The GRNsight open source code and application are available from http://dondi.github.io/GRNsight/index.html.
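As a rough illustration of the kind of model GRNmap fits, here is a generic sigmoidal-production ODE system in Python. The functional form (production rate times a sigmoid of weighted regulatory inputs minus a threshold, less linear degradation) is the standard one for such models; all weights, thresholds, and rates below are invented placeholders, not GRNmap's fitted values.

```python
# Generic sigmoidal-production GRN ODE sketch:
#   dx_i/dt = P_i * sigmoid(sum_j W_ij * x_j - b_i) - d_i * x_i
import numpy as np
from scipy.integrate import solve_ivp

def grn_rhs(t, x, W, P, b, d):
    # W[i, j]: regulatory weight of gene j on gene i; P: production rates;
    # b: expression thresholds; d: degradation rates.
    return P / (1.0 + np.exp(-(W @ x - b))) - d * x

n = 3
W = np.array([[0.0, 1.2, -0.8],
              [0.5, 0.0, 0.0],
              [-1.0, 0.7, 0.0]])   # placeholder network
P, b, d = np.ones(n), np.zeros(n), 0.2 * np.ones(n)
x0 = 0.1 * np.ones(n)
sol = solve_ivp(grn_rhs, (0.0, 60.0), x0, args=(W, P, b, d))  # forward simulation
```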
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
In this work, we propose to apply trust region optimization to deep reinforcement learning using a recently proposed Kronecker-factored approximation to the curvature. We extend the framework of natural policy gradient and propose to optimize both the actor and the critic using Kronecker-factored approximate curvature (K-FAC) with trust region; hence we call our method Actor Critic using Kronecker-Factored Trust Region (ACKTR). To the best of our knowledge, this is the first scalable trust region natural gradient method for actor-critic methods. It is also a method that learns non-trivial tasks in continuous control as well as discrete control policies directly from raw pixel inputs. We tested our approach across discrete domains in Atari games as well as continuous domains in the MuJoCo environment. With the proposed methods, we are able to achieve higher rewards and a 2- to 3-fold improvement in sample efficiency on average, compared to previous state-of-the-art on-policy actor-critic methods. Code is available at https://github.com/openai/baselines.
1. The document describes the Behavior Regularized Actor Critic (BRAC) framework, which evaluates different design choices for offline reinforcement learning algorithms.
2. BRAC experiments show that simple variants using a fixed regularization weight, minimum ensemble Q-targets, and value penalty regularization can achieve good performance, outperforming more complex techniques from previous work.
3. The experiments find that choices like the divergence used for regularization and number of ensemble Q-functions do not have large impacts on performance, and hyperparameter sensitivity also varies between design choices.
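The recipe point 2 highlights (fixed regularization weight, minimum ensemble Q-targets, value penalty) can be sketched in a few lines. This is an illustrative reconstruction, not the BRAC reference code: the network interface and the `divergence` estimate (e.g. a sampled KL against the behavior policy) are assumptions.

```python
# Sketch of a BRAC-style target: minimum over an ensemble of target Q-networks,
# with a value-penalty divergence term weighted by a fixed alpha.
import torch

def brac_target(r, s2, a2, done, target_nets, divergence, alpha=0.1, gamma=0.99):
    # target_nets: ensemble of target Q-networks taking (state, action);
    # divergence: per-sample estimate of D(pi(.|s2), behavior policy),
    # a placeholder for e.g. a sampled KL, not a specific library call.
    with torch.no_grad():
        qs = torch.stack([q(s2, a2) for q in target_nets])  # (ensemble, batch)
        q_min = qs.min(dim=0).values                        # minimum ensemble Q-target
        return r + gamma * (1.0 - done) * (q_min - alpha * divergence)  # value penalty
```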
PREDICTING MORE INFECTIOUS VIRUS VARIANTS FOR PANDEMIC PREVENTION THROUGH DEE..., by gerogepatton
The document describes a deep learning approach called Optimus PPIme that uses a transformer network and search algorithms to predict future, more infectious virus variants. The transformer is trained on protein-protein interaction data to score how well a virus protein binds to a host receptor, indicating infectivity. Greedy search and beam search algorithms then propose variants with higher predicted binding by iteratively mutating the virus protein and assessing the new scores. Experiments showed that pre-training the transformer on masked language modeling, using sharpness-aware minimization, and applying data augmentation improved its accuracy in scoring novel protein interactions. When applied to SARS-CoV-2 spike variants, the approach predicted higher infectivity for variants of concern like Alpha and Omicron compared to...
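The greedy variant-proposal loop the summary describes can be sketched as follows. Here `score` stands in for the trained transformer's binding score, and the single-residue mutation set and step budget are assumptions for illustration.

```python
# Greedy search over single-residue mutations: at each step, keep the
# mutation the scoring model rates as most infectious (illustrative sketch).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def greedy_variants(seq, score, steps=5):
    seq = list(seq)
    for _ in range(steps):
        best = (score("".join(seq)), None, None)
        for i in range(len(seq)):
            for aa in AMINO_ACIDS:
                if aa == seq[i]:
                    continue
                cand = seq[:i] + [aa] + seq[i + 1:]
                s = score("".join(cand))       # predicted binding of the mutant
                if s > best[0]:
                    best = (s, i, aa)
        if best[1] is None:                    # no mutation improves the score
            break
        seq[best[1]] = best[2]                 # apply the best single mutation
    return "".join(seq)
```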
Similar to Deep Reinforcement Learning for control of PBNs--CNA2020
TrustArc Webinar - 2024 Global Privacy Survey, by TrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
Generating privacy-protected synthetic data using Secludy and Milvus, by Zilliz
During this demo, the founders of Secludy will demonstrate how their system utilizes Milvus to store and manipulate embeddings for generating privacy-protected synthetic data. Their approach not only maintains the confidentiality of the original data but also enhances the utility and scalability of LLMs under privacy constraints. Attendees, including machine learning engineers, data scientists, and data managers, will witness first-hand how Secludy's integration with Milvus empowers organizations to harness the power of LLMs securely and efficiently.
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A..., by Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on integration of Salesforce with Bonterra Impact Management.
Interested in deploying an integration with Salesforce for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Unlock the Future of Search with MongoDB Atlas: Vector Search Unleashed.pdf, by Malak Abu Hammad
Discover how MongoDB Atlas and vector search technology can revolutionize your application's search capabilities. This comprehensive presentation covers:
* What is Vector Search?
* Importance and benefits of vector search
* Practical use cases across various industries
* Step-by-step implementation guide
* Live demos with code snippets
* Enhancing LLM capabilities with vector search
* Best practices and optimization strategies
Perfect for developers, AI enthusiasts, and tech leaders. Learn how to leverage MongoDB Atlas to deliver highly relevant, context-aware search results, transforming your data retrieval process. Stay ahead in tech innovation and maximize the potential of your applications.
#MongoDB #VectorSearch #AI #SemanticSearch #TechInnovation #DataScience #LLM #MachineLearning #SearchTechnology
HCL Notes and Domino License Cost Reduction in the World of DLAU, by panagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-and-domino-license-cost-reduction-in-the-world-of-dlau/
The introduction of DLAU and the CCB & CCX licensing model caused quite a stir in the HCL community. As a Notes and Domino customer, you may have faced challenges with unexpected user counts and license costs. You probably have questions on how this new licensing approach works and how to benefit from it. Most importantly, you likely have budget constraints and want to save money where possible. Don’t worry, we can help with all of this!
We’ll show you how to fix common misconfigurations that cause higher-than-expected user counts, and how to identify accounts which you can deactivate to save money. There are also frequent patterns that can cause unnecessary cost, like using a person document instead of a mail-in for shared mailboxes. We’ll provide examples and solutions for those as well. And naturally we’ll explain the new licensing model.
Join HCL Ambassador Marc Thomas in this webinar with a special guest appearance from Franz Walder. It will give you the tools and know-how to stay on top of what is going on with Domino licensing. You will be able to lower your cost through an optimized configuration and keep it low going forward.
These topics will be covered
- Reducing license cost by finding and fixing misconfigurations and superfluous accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how to best utilize it
- Tips for common problem areas, like team mailboxes, functional/test users, etc
- Practical examples and best practices to implement right away
Monitoring and Managing Anomaly Detection on OpenShift.pdf, by Tosin Akinosho
Monitoring and Managing Anomaly Detection on OpenShift
Overview
Dive into the world of anomaly detection on edge devices with our comprehensive hands-on tutorial. This SlideShare presentation will guide you through the entire process, from data collection and model training to edge deployment and real-time monitoring. Perfect for those looking to implement robust anomaly detection systems on resource-constrained IoT/edge devices.
Key Topics Covered
1. Introduction to Anomaly Detection
- Understand the fundamentals of anomaly detection and its importance in identifying unusual behavior or failures in systems.
2. Understanding Edge (IoT)
- Learn about edge computing and IoT, and how they enable real-time data processing and decision-making at the source.
3. What is ArgoCD?
- Discover ArgoCD, a declarative, GitOps continuous delivery tool for Kubernetes, and its role in deploying applications on edge devices.
4. Deployment Using ArgoCD for Edge Devices
- Step-by-step guide on deploying anomaly detection models on edge devices using ArgoCD.
5. Introduction to Apache Kafka and S3
- Explore Apache Kafka for real-time data streaming and Amazon S3 for scalable storage solutions.
6. Viewing Kafka Messages in the Data Lake
- Learn how to view and analyze Kafka messages stored in a data lake for better insights.
7. What is Prometheus?
- Get to know Prometheus, an open-source monitoring and alerting toolkit, and its application in monitoring edge devices.
8. Monitoring Application Metrics with Prometheus
- Detailed instructions on setting up Prometheus to monitor the performance and health of your anomaly detection system.
9. What is Camel K?
- Introduction to Camel K, a lightweight integration framework built on Apache Camel, designed for Kubernetes.
10. Configuring Camel K Integrations for Data Pipelines
- Learn how to configure Camel K for seamless data pipeline integrations in your anomaly detection workflow.
11. What is a Jupyter Notebook?
- Overview of Jupyter Notebooks, an open-source web application for creating and sharing documents with live code, equations, visualizations, and narrative text.
12. Jupyter Notebooks with Code Examples
- Hands-on examples and code snippets in Jupyter Notebooks to help you implement and test anomaly detection models.
5th LF Energy Power Grid Model Meet-up Slides, by DanBrown980551
5th Power Grid Model Meet-up
It is with great pleasure that we extend to you an invitation to the 5th Power Grid Model Meet-up, scheduled for 6th June 2024. This event will adopt a hybrid format, allowing participants to join us either through an online Microsoft Teams session or in person at TU/e, located at Den Dolech 2, Eindhoven, Netherlands. The meet-up will be hosted by Eindhoven University of Technology (TU/e), a research university specializing in engineering science & technology.
Power Grid Model
The global energy transition is placing new and unprecedented demands on Distribution System Operators (DSOs). Alongside upgrades to grid capacity, processes such as digitization, capacity optimization, and congestion management are becoming vital for delivering reliable services.
Power Grid Model is an open source project from Linux Foundation Energy and provides a calculation engine that is increasingly essential for DSOs. It offers a standards-based foundation enabling real-time power systems analysis, simulations of electrical power grids, and sophisticated what-if analysis. In addition, it enables in-depth studies and analysis of the electrical power grid’s behavior and performance. This comprehensive model incorporates essential factors such as power generation capacity, electrical losses, voltage levels, power flows, and system stability.
Power Grid Model is currently being applied in a wide variety of use cases, including grid planning, expansion, reliability, and congestion studies. It can also help in analyzing the impact of renewable energy integration, assessing the effects of disturbances or faults, and developing strategies for grid control and optimization.
What to expect
For the upcoming meetup we are organizing, we have an exciting lineup of activities planned:
- Insightful presentations covering two practical applications of the Power Grid Model.
- An update on the latest advancements in Power Grid Model technology during the first and second quarters of 2024.
- An interactive brainstorming session to discuss and propose new feature requests.
- An opportunity to connect with fellow Power Grid Model enthusiasts and users.
Building Production Ready Search Pipelines with Spark and MilvusZilliz
Spark is a widely used ETL tool for processing, indexing, and ingesting data into the serving stack for search. Milvus is a production-ready, open-source vector database. In this talk we will show how to use Spark to process unstructured data into vector representations, and push the vectors to the Milvus vector database for search serving.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/building-and-scaling-ai-applications-with-the-nx-ai-manager-a-presentation-from-network-optix/
Robin van Emden, Senior Director of Data Science at Network Optix, presents the “Building and Scaling AI Applications with the Nx AI Manager,” tutorial at the May 2024 Embedded Vision Summit.
In this presentation, van Emden covers the basics of scaling edge AI solutions using the Nx tool kit. He emphasizes the process of developing AI models and deploying them globally. He also showcases the conversion of AI models and the creation of effective edge AI pipelines, with a focus on pre-processing, model conversion, selecting the appropriate inference engine for the target hardware and post-processing.
van Emden shows how Nx can simplify the developer’s life and facilitate a rapid transition from concept to production-ready applications. He provides valuable insights into developing scalable and efficient edge AI solutions, with a strong focus on practical implementation.
Skybuffer SAM4U tool for SAP license adoptionTatiana Kojar
Manage and optimize your license adoption and consumption with SAM4U, a free SAP software asset management tool for customers.
SAM4U delivers a detailed, well-structured overview of license inventory and usage through a user-friendly SAP Fiori interface. We offer a hosted, cost-effective, and performance-optimized SAM4U setup in the Skybuffer Cloud environment: you retain ownership of the system and data, while we manage the ABAP 7.58 infrastructure, ensuring a fixed Total Cost of Ownership (TCO) and exceptional service.
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slackshyamraj55
Discover the seamless integration of RPA (Robotic Process Automation), COMPOSER, and APM with AWS IDP enhanced with Slack notifications. Explore how these technologies converge to streamline workflows, optimize performance, and ensure secure access, all while leveraging the power of AWS IDP and real-time communication via Slack notifications.
Taking AI to the Next Level in Manufacturing.pdfssuserfac0301
Read Taking AI to the Next Level in Manufacturing to gain insights on AI adoption in the manufacturing industry, such as:
1. How quickly AI is being implemented in manufacturing.
2. Which barriers stand in the way of AI adoption.
3. How data quality and governance form the backbone of AI.
4. Organizational processes and structures that may inhibit effective AI adoption.
5. Ideas and approaches to help build your organization's AI strategy.
Fueling AI with Great Data with Airbyte WebinarZilliz
This talk will focus on collecting data from a variety of sources, leveraging that data for RAG and other GenAI use cases, and finally charting your course to productionization.
Your One-Stop Shop for Python Success: Top 10 US Python Development Providersakankshawande
Simplify your search for a reliable Python development partner! This list presents the top 10 trusted US providers offering comprehensive Python development services, ensuring your project's success from conception to completion.
Deep Reinforcement Learning for control of PBNs--CNA2020
1. Deep Reinforcement Learning for Control
of Probabilistic Boolean Networks
Georgios Papagiannis1 and Sotiris Moschoyiannis2
1University of Cambridge, UK
2University of Surrey, UK
Complex Networks and their Applications 2020 – 1 Dec 2020
s.moschoyiannis@surrey.ac.uk
2. Boolean Networks1 (BNs)
[Figure: a Boolean network (BN) over nodes n1–n5 with AND/OR update functions, alongside its corresponding state space]
A class of discrete dynamical systems:
• Nodes represent genes,
gene expression is quantized: 0 (inactive), 1 (active)
• Expression level of each gene is functionally related to
the expression states of some other genes
• At each time step,
each node computes and produces output (0 or 1),
which is input for its connected nodes in the next time step
1 Kauffman (1969) Metabolic stability and epigenesis in randomly constructed genetic nets, J. of Theoretical Biology, 22(3):437-467
3. Attractors
[Figure: the same BN (nodes n1–n5) and its corresponding state space, with attractors highlighted]
Dynamics of BNs dictate that the network will
evolve to a state, or set of states, that it cannot
leave without external intervention
• Fixed point attractors
• Limit cycle attractors
4. Probabilistic Boolean Networks2 (PBNs)
[Figure: the example network as a PBN; node n1 now has three candidate functions (AND, p=0.5; OR, p=0.3; NAND, p=0.2), shown alongside the corresponding state space]
More than one Boolean function at each node;
one function executes at each step t, with prob. p.
Accommodates uncertainty in gene regulation.
• Dynamics of PBNs
- admit Markov Chain theory (MDPs)
- exhibit attractors; these manifest as:
• absorbing states
• irreducible sets
2 Shmulevich et al (2002) Probabilistic Boolean Networks: a rule-based uncertainty model for gene regulatory networks , Bioinformatics 18(2):261-274
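To make the PBN dynamics concrete, here is a minimal Python sketch of one synchronous update step. The 3-node toy wiring below is hypothetical (it is not the deck's n1–n5 network); only the AND/OR/NAND probability mix for the first node mirrors the slide.

import random

def step(state, node_funcs):
    """state: tuple of 0/1; node_funcs: per-node list of (fn, prob) pairs."""
    next_state = []
    for funcs in node_funcs:
        fns, probs = zip(*funcs)
        fn = random.choices(fns, weights=probs, k=1)[0]  # pick one function
        next_state.append(fn(state))                     # synchronous update
    return tuple(next_state)

# Hypothetical 3-node wiring; node 0 mixes AND / OR / NAND of nodes 1 and 2
# with the p = 0.5 / 0.3 / 0.2 split shown for n1 on the slide.
node_funcs = [
    [(lambda s: s[1] & s[2], 0.5),
     (lambda s: s[1] | s[2], 0.3),
     (lambda s: 1 - (s[1] & s[2]), 0.2)],
    [(lambda s: s[0], 1.0)],            # node 1 copies node 0
    [(lambda s: s[0] ^ s[1], 1.0)],     # node 2: XOR of nodes 0 and 1
]

state = (1, 0, 1)
for _ in range(5):
    state = step(state, node_funcs)
    print(state)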
5. Gene Regulatory Networks (GRNs)
[Figures: two example GRNs, the segment polarity genes5 and the fission yeast cell-cycle6]
• Spontaneous emergence of ordered collective behaviour 3
e.g., functional states of the cell such as growth or quiescence
correspond to such attractors 3
e.g., high / low resistance to antibiotics at different attractors 4
• (Why PBN study is useful) Targeted therapeutics: external
perturbation on certain gene(s), at certain state(s), can drive the
GRN to a desirable attractor (drug targets)
• where perturbation = change of state (i.e., 0->1, 1->0)
• Kauffman: attractors are stable under most gene perturbations
Study PBNs as a dynamical system where change of state on
certain genes, at certain states, may drastically affect the state
of the network as a whole, and
• lead to a different attractor, with desirable properties
• switch between attractors
3 Huang, Ingber (2000) Shape-dependent control of cell growth: Switching between attractors in cell regulatory networks. Experimental Cell Research, 261(1): 91-103
4 Reardon (2017) Modified viruses deliver death to antibiotic-resistant bacteria. Nature, 546:586-587
5 Albert, Othmer (2003) The topology of the regulatory interactions predicts the expression pattern of the segment polarity genes in Drosophila melanogaster. J. of Theoretical Biology, 223(1):1-18
6 Wang, Du, Chen, et al (2010) Process-based network decomposition reveals backbone motif structure. Proc Natl Acad Sci 107(23):10,478–10,483
6. Control
Complex Networks perspective:
A dynamical system is controllable if it can be driven from
any initial state to any desired state, within finite time 7
7 Liu, Slotine, Barabasi (2011) Controllability of complex networks, Nature 473: 167-173
Our goal:
Discover control strategies to effect perturbations on individual nodes
(targeted intervention), aiming to drive the whole network from its current
state to a specified target state that exhibits desirable (biological) properties.
7. Control in (P)BNs
Comes in different flavours.
e.g.,
Assume control inputs,
intervene only using these 8
Intervene on one node to
affect another node’s state 9
Intervene on any node to affect
the long-run network behaviour3
8 Wu, Guo, Toyoda (2020) Policy Iteration Approach to the Infinite Horizon Average Optimal Control of Probabilistic Boolean Networks, IEEE Trans, Neural Netw. Learn. Syst.
9 Pal, Datta, Dougherty (2006) Optimal Infinite-Horizon Control for Probabilistic Boolean Networks. IEEE Trans on Signal Processing, 54(6):2375-2387
10 Shmulevich, Dougherty, Zhang (2002) Gene Perturbation and Intervention in PBNs. Bioinformatics 18(10):1319-1331
[Figures: Lac operon on E. coli 8; metastatic melanoma 9; toy example from Shmulevich 10]
8. Control (in our work, here)
What is the series of required interventions (which gene, at which step) to drive a
PBN from any state towards a target attractor, within a finite number of steps?
- Can intervene on any node
- Can intervene on at most one node at each time step
- Each intervention is followed by a natural evolution step (internal dynamics)
- Aim for minimum number of interventions (perturbations)
- Limit the number of steps (or num of interventions)
- Assume no additional info from systems biology study
- One requirement: knowledge of the target attractor
9. STG / Probability Transition Matrix
Intractable
Lac operon E. coli PBN, n = 9 nodes
Corresponding State Transition Graph (STG): 2^9 = 512 states
Corresponding Probability Transition Matrix (PTM): 2^9 × 2^9
10. STG / Probability Transition Matrix
Fission yeast PBN, n = 10 nodes
Corresponding STG: 2^10 = 1024 states
Corresponding Probability Transition Matrix: 2^10 × 2^10
11. STG / Probability Transition Matrix
Synthetic PBN, n = 20 nodes (in this paper – see Section 4)
Corresponding STG: 2^20 = 1,048,576 states
Corresponding Probability Transition Matrix: 2^20 × 2^20
12. STG / Probability Transition Matrix
Metastatic Melanoma PBN, n = 28 nodes
Corresponding STG: 2^28 = 268,435,456 states
Corresponding Probability Transition Matrix: 2^28 × 2^28
The Probability Transition Matrix (PTM) becomes computationally intractable for larger networks 11 – it requires the estimation of 2^n × (2^n – 1) probabilities.
Can we work without the PTM?
11 Akutsu, Hayashida, et al (2007) Control of Boolean networks: Hardness results and algorithms for tree structured network. Journal of Theoretical Biology, 244(4):670-679
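To see why the PTM blows up, a quick back-of-the-envelope check in Python, using only the 2^n state count stated above:

# An n-node PBN has 2^n states; the PTM needs 2^n * (2^n - 1) independent
# probabilities (each row of the matrix must sum to 1).
for n in (9, 10, 20, 28):
    states = 2 ** n
    entries = states * (states - 1)
    print(f"n={n:2d}: {states:,} states, {entries:,} probabilities to estimate")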
14. Reinforcement Learning
Policy 𝜋(𝑠) : How to select an action, at each state,
a distribution over actions given a state.
Goal: Maximize expected cumulative reward.
12 Sutton, Barto (2018) Reinforcement Learning: an introduction, MIT Press [Chapter 6]
15. Markov Decision Process (MDP)
An MDP is a tuple (𝑆, 𝐴, 𝑃, 𝑅, 𝛾)
𝑆 – Set of states of the environment
𝐴 – Set of possible actions to perform
at some state s ∈ 𝑆
𝑃 – State transition matrix, where P^a_{s s′} = P[s_{t+1} = s′ | s_t = s, a_t = a], for a ∈ 𝐴
𝑅 – Reward function, where R^a_s = Ε[R_{t+1} | s_t = s, a_t = a]
𝛾 – Discount factor 𝛾 ∈ [0, 1]
PBNs as MDPs:
𝑆 – Binary states
𝐴 – Possible interventions given a state s ∈ 𝑆
(in fact, N+1 actions at each state)
𝑃 – Probability of transitioning between
binary states, given Boolean function realisations
𝑅 – Problem dependent
(we define the reward function)
𝛾 – Problem dependent
(we choose this)
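As a hedged sketch of the PBN-as-MDP formulation, the environment below follows the slide's action set (flip one of the N nodes, or do nothing, so N+1 actions) and applies one natural evolution step after each intervention. All names (PBNEnv, reset, step) and the reward constants are hypothetical, not taken from the paper.

import random

class PBNEnv:
    """Hypothetical PBN control environment (names not from the paper)."""

    def __init__(self, node_funcs, target_states):
        self.node_funcs = node_funcs     # per-node list of (fn, prob) pairs
        self.target = target_states     # set of states in the target attractor
        self.n = len(node_funcs)
        self.state = None

    def reset(self):
        # start from a uniformly random binary state
        self.state = tuple(random.randint(0, 1) for _ in range(self.n))
        return self.state

    def step(self, action):
        # actions 0..n-1 flip one node; action n does nothing (N+1 actions)
        s = list(self.state)
        if action < self.n:
            s[action] = 1 - s[action]
        s = tuple(s)
        # natural evolution step: each node draws one of its functions
        nxt = []
        for funcs in self.node_funcs:
            fns, probs = zip(*funcs)
            fn = random.choices(fns, weights=probs, k=1)[0]
            nxt.append(fn(s))
        self.state = tuple(nxt)
        done = self.state in self.target
        reward = 1.0 if done else -1.0   # placeholder reward shape
        return self.state, reward, done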
17. Q-Learning
Expected reward of taking action a at state s, at time step t:
Q(s_t, a_t) ← Q(s_t, a_t) + α [ R_{t+1} + γ max_a Q(s_{t+1}, a) − Q(s_t, a_t) ]
i.e., new estimate = old estimate + step size × (target for update − old estimate); the bracketed term is the error in the estimate, and each update is an increment (a sample update).
12 Sutton, Barto (2018) Reinforcement Learning: an introduction, MIT Press [Chapter 6]
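In code, the tabular form of this update might look as follows (a sketch; the alpha and gamma values are placeholders):

from collections import defaultdict

Q = defaultdict(float)      # Q[(state, action)] -> estimated value
alpha, gamma = 0.1, 0.99    # placeholder learning rate / discount factor

def q_update(s, a, r, s_next, n_actions):
    best_next = max(Q[(s_next, a2)] for a2 in range(n_actions))
    td_target = r + gamma * best_next     # target for the update
    td_error = td_target - Q[(s, a)]      # error in the estimate
    Q[(s, a)] += alpha * td_error         # increment (sample update)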
19. ε-greedy
Simplest idea for ensuring continual exploration.
All m actions available are tried with non-zero probability:
with probability ε choose an action at random (define small ε)
with probability 1 – ε choose the greedy action
where a greedy action is an action whose expected reward is the greatest, and is given by argmax_a Q(s, a).
Approximate Q* iteratively, by selecting actions at each time step.
N.B. Q has been shown to converge to Q* with ε-greedy, e.g., see Sutton, Barto (2018) Reinforcement Learning: an introduction, MIT Press
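A minimal ε-greedy selector over the tabular Q above (a sketch; the tie-breaking and default ε are arbitrary choices):

import random

def epsilon_greedy(Q, s, n_actions, eps=0.1):
    # with probability eps explore; otherwise take the greedy action
    if random.random() < eps:
        return random.randrange(n_actions)
    return max(range(n_actions), key=lambda a: Q[(s, a)])  # argmax_a Q(s, a)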
21. Deep Q Net (DQN)
Use a function approximator to learn a parameterised form Q(s, a; θ).
Use DQN to iteratively update θ, in order to approximate Q*(s, a) (the true Q values).
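A possible PyTorch rendering of this idea; the layer sizes, optimiser, and MSE loss here are placeholders, since the deck does not specify the architecture:

import torch
import torch.nn as nn

n = 10                                   # number of PBN nodes (placeholder)
qnet = nn.Sequential(                    # Q(s, a; θ): one output per action
    nn.Linear(n, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, n + 1),                # N flips + "do nothing"
)
optimizer = torch.optim.Adam(qnet.parameters(), lr=1e-3)

def dqn_loss(s, a, r, s_next, done, gamma=0.99):
    # s: float tensor [B, n]; a: long tensor [B]; r, done: float tensors [B]
    q_sa = qnet(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():                # TD-target held fixed
        target = r + gamma * (1 - done) * qnet(s_next).max(dim=1).values
    return nn.functional.mse_loss(q_sa, target)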
23. Double DQN (DDQN)
Use a separate network to determine the TD-target.
• “target” DQN – initialised with the same parameters as the main (“policy”) DQN, but has its parameters updated only every k iterations
• “policy” DQN – trained at every step while the target DQN’s Q values stay fixed; every k iterations the policy DQN’s parameters are copied to the target, i.e. θ′_t = θ_t every k time steps
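The periodic parameter copy can be sketched as below (k is a placeholder; the paper's value is not stated on the slide):

import copy

target_net = copy.deepcopy(qnet)   # "target" DQN starts as a copy
k = 1000                           # sync period (placeholder)

def maybe_sync(t):
    # θ' <- θ every k time steps, as on the slide
    if t % k == 0:
        target_net.load_state_dict(qnet.state_dict())

# In dqn_loss, the TD-target would then read target_net(s_next) instead of
# qnet(s_next), so the target stays fixed between syncs.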
24. Reward
Objective: Find a policy that drives a PBN to an attractor in order to maximise reward
(the reward is defined over the set of states in the target attractor)
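The exact reward constants are not given on the slide; the speaker notes only say that reaching the target attractor is rewarded, other outcomes are penalised, and non-target attractors are penalised more. A purely illustrative sketch:

def reward(state, target_states, other_attractor_states):
    # illustrative constants only; the paper's values are not on the slide
    if state in target_states:
        return 10.0                 # reached the target attractor
    if state in other_attractor_states:
        return -10.0                # penalise non-target attractors more
    return -1.0                     # small penalty for every other step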
25. Training a network from consecutive samples taken directly from the environment is susceptible to strong correlations in the data.
26. Prioritised Experience Replay (DDQN with PER)
Sample from a batch of experiences, at each time step t, to update the DDQN.
During training,
• the agent observes state s, performs action a on the environment, and
• then, the environment transitions to s′ and the agent receives reward r
The transition / experience (s, a, r, s′) is stored in a replay buffer
• 5K buffer (for n=10 nodes); 500K (for n=20)
At each t, a batch of experiences is sampled in order to update the network parameters
• batch size 128 for n=10; 512 for n=20
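A plain (uniform) replay buffer capturing the buffer and batch sizes quoted above; the prioritised sampling itself is sketched after the next slide:

import random
from collections import deque

class ReplayBuffer:
    def __init__(self, capacity=5_000):     # 5K for n=10; 500K for n=20
        self.buf = deque(maxlen=capacity)

    def push(self, s, a, r, s_next, done):
        self.buf.append((s, a, r, s_next, done))

    def sample(self, batch_size=128):       # 128 for n=10; 512 for n=20
        return random.sample(self.buf, batch_size)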
27. PER – proportional, importance
N.B. Please see Section 3.2 (pp. 4-5) in the paper for more detail.
• Proportional – the probability of experience i being sampled is given by P(i) = p_i^ω / Σ_k p_k^ω, where the priority of sample i is p_i = |δ_i| + c, δ is the TD-error, c is a small constant to prevent experiences with zero TD-error from never being replayed, and ω is the magnitude of prioritisation
• Importance weights w_i = (1/L · 1/P(i))^β are used to compensate for samples with high TD-error being sampled more often, where L is the size of the replay memory and β is used to anneal the amount of importance sampling over training episodes.
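A sketch of proportional PER sampling in the slide's notation (the default values of c, omega, and beta are illustrative, not the paper's):

import numpy as np

def per_sample(td_errors, batch_size, c=1e-5, omega=0.6, beta=0.4):
    # priority p_i = |delta_i| + c; sampling probability P(i) ~ p_i^omega
    p = (np.abs(td_errors) + c) ** omega
    probs = p / p.sum()
    L = len(td_errors)                       # size of the replay memory
    idx = np.random.choice(L, size=batch_size, p=probs)
    # importance weights w_i = (1/L * 1/P(i))^beta, normalised for stability
    weights = (1.0 / (L * probs[idx])) ** beta
    weights /= weights.max()
    return idx, weights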
29. Results
Success rate is at least 99% from any initial state

PBN10:
- 1024 states
- attractor occurs 1/100 times
- random interventions: 1,387
- horizon of 11 is < 1% of that
- DRL: 99.8% successful control; 100% if horizon is set to 14

PBN20:
- 1,048,576 states
- attractor occurs 1/10,000 times
- random interventions: 6,511
- horizon of 100 is < 1.5% of that
- DRL: 100% successful control; 99% if horizon is set to 15

Melanoma:
- 512 states
- attractor set to 1001111, motivated by biology (2nd gene, WNT5A, unexpressed)
- DRL: 100% successful control for horizon 10; 99.72% if horizon is set to 7
30. Not in this paper – control larger PBNs
We have tried our DRL (DDQN with PER) method
on the more common type of control problem 13,14
Intervene on pirin’s state only, aiming to drive the PBN to a state where WNT5A is OFF (target state).
[Figure: cancerous Melanoma PBN inferred from GRN data 13,14, with WNT5A driven OFF]
13 Pal, Datta, Dougherty (2006) Optimal Infinite-Horizon Control for Probabilistic Boolean Networks. IEEE Trans on Signal Processing, 54(6):2375-2387
14 Sirin, Polat, Alhajj (2013) Employing Batch Reinforcement Learning to Control Gene Regulation Without Explicitly Constructing Gene Regulatory Networks, 23rd IJCAI 2013, 2042-2048
31. Not in this paper – control larger PBN (N = 70)
On N=7 we get favourable performance
to existing literature 13,14
On N=28 we get favourable performance
to existing literature 14
On N=70 we get 97.6% successful
control.
This is the largest PBN to be controlled,
from real data or synthetic data.
Joint work with Vytenis Sliogeris, paper under preparation
13 Pal, Datta, Dougherty (2006) Optimal Infinite-Horizon Control for Probabilistic Boolean Networks. IEEE Trans on Signal Processing, 54(6):2375-2387
14 Sirin, Polat, Alhajj (2013) Employing Batch Reinforcement Learning to Control Gene Regulation Without Explicitly Constructing Gene Regulatory Networks, 23rd IJCAI, 2042-2048
32. Not in this paper – infer larger PBNs
We have been successful in inferring a PBN directly from real gene expression data
(samples taken when network in a steady-state distribution)
• Metastatic melanoma dataset from Bittner et al1
• Using CoDs and a perceptron as a predictive model2,3
Our approach does not build the PTM (as our control method does not need it!).
We are looking at inferring a PBN from real, time-series gene expression data
• But, typically, studies provide no more than 6-7 time steps
This is in progress – please get in touch if you are also working on something like this.
1 Bittner, Meltzer, Chen et al (2000) Molecular classification of cutaneous malignant melanoma by gene expression profiling. Nature 406: 536-540
2 Kim,Dougherty, et al (2000) General nonlinear framework for the analysis of gene interaction via multivariate expression arrays. Journal of Biomedical Optics 5: 411-424
3 Shmulevich, Dougherty, Zhang (2002) Gene Perturbation and Intervention in PBNs. Bioinformatics 18(10): 1319-1331
33. Thank you for listening.
Any questions?
s.moschoyiannis@surrey.ac.uk
Editor's Notes
Boolean networks were introduced by Stuart Kauffman as a model of genetic networks.
BNs are a class of discrete dynamical systems,
characterized by interactions over a set of Boolean variables.
Nodes represent genes…
As the network evolves we build the corresponding state space, shown here, which shows the state of the network as a whole between time steps.
If you let these networks evolve for some time, they will invariably end up in a state they cannot get out of, at least not without external intervention.
These are the so-called attractors and they come in the form of fixed points (blue) or limit cycles (red)
Probabilistic BNs extend BNs to accommodate uncertainty in gene regulation.
In PBNs each node is associated with more than one function, and one of these executes at each time step, with some probability p.
So node n1 will use AND 5 out of 10 times, OR 3/10 times, NAND 2/10 times.
The dynamics are similar to BNs with absorbing states and irreducible sets, this time, being the attractors.
BNs have been used to model Gene Regulatory Networks
Seminal work by Reka Albert, who gave the keynote in last year's Complex Networks conf (Lisbon), another example - the fission yeast cycle here
There is a nice correspondence between ordered collective behaviour in GRNs and attractors in PBNs.
I note here just 2 of the works from systems biology and genetics which point to the fact that certain attractors are desirable, others not (ref in 3 here); sometimes switching between attractors is a good idea (ref in 4 here)
So beyond using PBN for modelling GRNs, one starts to think where and when to intervene (what gene to change from 0 to 1, and vice versa) in order to drive the GRN to a desired state.
Kauffman since 1993 tells us that not all interventions are effective.. so we propose the study of PBNs as a dynamical system ... PAUSE.. where change of state on certain genes (your control parameter - micro) has drastic effect on the state of the whole network (your order parameter - macro)
So we approach Control from a complex networks perspective.
I quote from Liu Slotine Barabasi’s 2011 paper in Nature:
“READ IT OUT LOUD “
And I say this because Control in the literature comes in different flavours.
In certain GRNs one can assume control inputs and focus on those only (the green ones in the example at the top) to control the network.
Other works consider intervening on one node only, to affect another node’s state (study of Melanoma in ref 9 here)
Just to be clear, in the work in this paper specifically, we ask:
“READ IT”
And these are the rules of engagement, so to speak..
Won’t go through all of them, they are discussed in detail in the paper!
--
Interventions allowed on any node
Only one node at a time, so can’t change all of the nodes at once
Respect the internal dynamics as much as possible, so minimum number of interventions, finite number of steps
I mentioned the state space when introducing PBNs.
For N nodes, and since nodes are Boolean variables, there are 2^N states.
So 9 nodes in the PBN, 512 states
Matrix representation of that is 2^N times 2^N
Fission yeast modelled as PBN of 10 nodes, state space of 1024
So one more node, double the num of states…
In the paper we control a PBN of 20 nodes, that is about a MILLION states!
We have also worked with PBN with 28 nodes since submitting this paper, which takes us closer to 270M states!
So the point here is that although there are some elegant techniques around, the PTM is intractable for larger networks.
See this paper here from 2007 …
Soo… can we work without the PTM?
YES WE CAN !
We formulate the problem as one of reward maximization and use RL ---- this is what the paper is about.
I try to give you a quick tour of how we do this.
Very briefly, in RL the agent learns by interaction with the environment.
The agent's goal is to maximise the expected cumulative reward.
The key word here is cumulative.
So having been presented with a state by the env, the Agent selects an action, and that choice not only determines the reward it receives at the next step, but also affects the next state it is presented with... and by virtue of that, also the set of actions it can take at that next step.
So selecting an action is important; what policy the agent follows, as we say... this may be random or more sophisticated.
The environment is often modelled as Markov Decision Process.
-The binary states of the PBN become the states in the MDP
-There are N+1 actions at each state: the option for flipping each of the N nodes plus doing nothing
-There is the probability of function selection at each node, at each time step
As far as the agent is concerned we equip it in our work with a Q-Learning algorithm. This is a Temporal-Difference Learning method
....so it combines learning from experience (so sampling like MC) with bootstrapping (like DP).
Importantly Q-Learning is model-free, so no knowledge of the dynamics is required. So we don’t need the PTM (CLICK!), which as we saw earlier is intractable for larger networks.
Also it is off-policy and I will come back to this later.
The idea behind Temporal-Difference (TD) Learning methods is to learn directly from the environment but update estimates based on other learned estimates, without waiting for the end of the episode (bootstrap – like DP)
As you probably suspect it is important to be able to calculate the expected reward of each action (as that can be used in selecting actions at each step).
This formula here is key for improving the estimates
I will not go into detail, this is pretty standard in RL
So this is the expected reward at t+1 having taken a at t,
This is the expected value of the best action available from that next state (target for the update)
And if I take out the current estimate, this gives me the error in the estimate
And \alpha is the “learning rate“ which says how much I should take this error into account when updating to the new estimate …
So that formula tells us how to improve our estimated Q values....
But how do we get to the true Q values?
The answer is… Approximate iteratively.
In this work, we use an epsilon-greedy policy
so with prob epsilon we choose an action randomly
with prob 1- epsilon we choose greedily – CLICK so we go for the action with the greatest expected reward
Important note: Q converges to Q* with this policy
So we know how to get the true values. PAUSE
But there is some computational cost associated with storing these, especially in large state spaces.
We turn to Deep RL to address this.
We use a function approximator to learn a parameterised form
This Deep Q Net (DQN) is trained by minimizing a sequence of loss functions L_i you see here…
Now TD-Learning is often susceptible to overestimating….
To address this, and I won't go into much detail here, we use a __separate__ network for the target during training,
And we use another, a second n/w, the so-called “policy DQN”, for updating the parameters of the “target DQN” [CLICK] every k number of time steps.
So we effectively use a Double DQN (DDQN) in our control method.
--
The parameters from this second DQN, the so-called “policy DQN”, are used to update the parameters of the “target DQN”
When it comes to rewards, we assign negative rewards for actions that do not lead to the desired attractor and positive for those that do.
We also take into account the internal dynamics, the natural tendency of a PBN to graaavitate towards an attractor , which might not be the desired one, so we penalise more the actions that lead to a non-target attractor.
Before we talk results, last point on the method…
Training from successive experiences is susceptible to strong correlations in the data.
To break up such correlations we sample from a batch of experiences, at each t, to update the n/w parameters.
CLICK
Size of the sample varies depending on the size of the PBN.
The particular flavour of PER we use in this work is
Proportional, as compared to rank-based for example,
and we also use importance weighting .
...
RESULTS !
We have applied our method to various networks.
HEADLINE NEWS: our DRL method leads to successful control, over 99% of the time!! from any initial state
Some highlights:
CLICK
On the synthetic PBN20 (just over 1M states), the target attractor occurs 1/10K times (!), and it would take 6,500 random interventions to get to that attractor;
in comparison, our DRL method needs under 100 interventions to take us there
(...in faaact, the success rate only drops to 99% if we limit the agent to 15 interventions)
On the PBN of a real GRN, from the well studied Melanoma gene expression dataset, we get 100% success rate when allowing up to 10 interventions;
if we limit to 7, success drops by only a fraction to 99.72%
So we are quite pleased with results.
That made us think:
Can we do larger networks?
Can we do the other type of control?
I refer to the problem that other work on control has been looking at: play with some subset of genes (CLICK pirin, in the studies I reference here), to fix the state of another gene (CLICK WNT5A OFF here)
We applied our method to this kind of control on the 7-node PBN from the melanoma GRN
We did it on the 28 node PBN -- which is the largest PBN addressed in existing literature.
We get favourable results!
Recently, we have been successful in controlling a 70-node PBN from this Melanoma dataset.
So these developments make us hopeful we can control larger networks!!
That's all from me!
hope you found at least soooome parts of this talk interesting.
Please do get in touch if you are working on something similar.
Thank you.