The document discusses quasi-Newton methods for solving nonlinear systems of equations and for unconstrained optimization problems. Quasi-Newton methods approximate Newton's method by iteratively updating an estimate of the inverse Hessian matrix without having to explicitly compute second derivatives. This avoids expensive evaluations of the Hessian and allows the methods to converge superlinearly. Examples of quasi-Newton methods discussed include Broyden's method and the BFGS method.
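The BFGS update described above can be sketched in a few lines of numpy. This is an illustrative sketch, not code from the summarized document; the quadratic test problem and all function names are invented for illustration, and a simple Armijo backtracking line search stands in for the more careful line searches production solvers use.

```python
import numpy as np

def bfgs_minimize(f, grad, x0, tol=1e-8, max_iter=100):
    """Minimize f by BFGS: maintain an inverse-Hessian estimate H,
    never forming second derivatives explicitly."""
    x = np.asarray(x0, dtype=float)
    n = x.size
    H = np.eye(n)                      # initial inverse-Hessian estimate
    g = grad(x)
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        p = -H @ g                     # quasi-Newton search direction
        alpha = 1.0                    # Armijo backtracking line search
        while f(x + alpha * p) > f(x) + 1e-4 * alpha * (g @ p):
            alpha *= 0.5
        x_new = x + alpha * p
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g    # step and gradient change
        sy = s @ y
        if sy > 1e-12:                 # curvature condition keeps H pos. def.
            rho = 1.0 / sy
            I = np.eye(n)
            H = (I - rho * np.outer(s, y)) @ H @ (I - rho * np.outer(y, s)) \
                + rho * np.outer(s, s)
        x, g = x_new, g_new
    return x

# Quadratic test: f(x) = 0.5 x^T A x - b^T x has minimizer A^{-1} b
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x_star = bfgs_minimize(lambda x: 0.5 * x @ A @ x - b @ x,
                       lambda x: A @ x - b, np.zeros(2))
```

Only gradients are evaluated; the rank-two update builds the curvature information that Newton's method would obtain from the Hessian.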
The document discusses using Tensor Train (TT) decomposition to efficiently represent tensors and apply it to machine learning models. Some key points:
- TT decomposition provides a compact representation of tensors that allows efficient linear algebra operations.
- It has been used to compress the weight matrices of neural networks without loss of accuracy.
- Exponential machines model all feature interactions using a TT-formatted weight tensor, controlling complexity with TT-rank. This outperforms other models on classification tasks involving interactions.
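The TT format mentioned in these points can be illustrated with a minimal TT-SVD sweep in numpy. This is a sketch under the standard TT-SVD construction, not code from the summarized work; function names and the test tensor are invented for illustration.

```python
import numpy as np

def tt_svd(tensor, eps=1e-12):
    """TT-SVD: sweep over the modes once, splitting off one TT core per
    mode with a truncated SVD; the retained ranks are the TT-ranks."""
    shape = tensor.shape
    cores, r = [], 1
    C = tensor.reshape(1, -1)
    for k in range(len(shape) - 1):
        C = C.reshape(r * shape[k], -1)
        U, S, Vt = np.linalg.svd(C, full_matrices=False)
        r_new = max(1, int(np.sum(S > eps * S[0])))   # numerical TT-rank
        cores.append(U[:, :r_new].reshape(r, shape[k], r_new))
        C = S[:r_new, None] * Vt[:r_new]
        r = r_new
    cores.append(C.reshape(r, shape[-1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract the TT cores back into the full tensor."""
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([out.ndim - 1], [0]))
    return out.reshape([c.shape[1] for c in cores])

rng = np.random.default_rng(0)
T = rng.standard_normal((3, 4, 5))
cores = tt_svd(T)
```

Linear-algebra operations (additions, contractions, matrix-by-vector products) can then act on the small cores instead of the full tensor, which is what makes the format attractive for compressing weight tensors.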
The document discusses various 2-D orthogonal and unitary transforms that can be used to represent digital images, including:
1. The discrete Fourier transform (DFT) which transforms an image into the frequency domain and has properties like energy conservation and fast computation via FFT.
2. The discrete cosine transform (DCT) which has good energy compaction properties and is close to the optimal Karhunen-Loeve transform.
3. The discrete sine transform (DST) which is real, symmetric, and orthogonal like the DCT.
4. The Hadamard transform, which uses only ±1 values and admits fast computation, and the Haar transform, which is the simplest wavelet transform.
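Two of the properties listed above, energy conservation for unitary transforms and the DCT's energy compaction on smooth signals, can be checked directly in numpy. This is an illustrative sketch; the ramp test signal and the 8-point size are arbitrary choices, not taken from the document.

```python
import numpy as np

N = 8
n = np.arange(N)

# Unitary DFT matrix: F[k, m] = exp(-2j*pi*k*m/N) / sqrt(N)
F = np.exp(-2j * np.pi * np.outer(n, n) / N) / np.sqrt(N)

# Orthonormal DCT-II matrix: C[k, m] = c_k * cos(pi*(2m+1)*k / (2N))
k, m = n[:, None], n[None, :]
C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * m + 1) * k / (2 * N))
C[0, :] = np.sqrt(1.0 / N)

x = np.linspace(0.0, 1.0, N)   # a smooth ramp signal
X_dft = F @ x
X_dct = C @ x

# Energy compaction: fraction of the signal energy held by the
# first two DCT coefficients of the smooth ramp
compaction = np.sum(X_dct[:2] ** 2) / np.sum(x ** 2)
```

For a smooth input the DCT packs nearly all of the energy into a handful of low-index coefficients, which is why it is the workhorse of image compression.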
1) The document discusses equivariant imaging, a method for learning to solve inverse problems from measurements alone without ground truth data.
2) It introduces the concept of exploiting symmetries in natural signals to learn the inversion function from multiple related measurement operators.
3) A key innovation is an unsupervised loss function that enforces the learned inversion to be equivariant to the underlying signal symmetries, allowing training from noisy measurements without clean signals.
This document discusses greedy algorithms and provides examples of their use. It begins by defining characteristics of greedy algorithms, such as making locally optimal choices that reduce a problem into smaller subproblems. The document then covers designing greedy algorithms, proving their optimality, and analyzing examples like the fractional knapsack problem and minimum spanning tree algorithms. Specific greedy algorithms covered in more depth include Kruskal's and Prim's minimum spanning tree algorithms and Huffman coding.
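The fractional knapsack problem mentioned above is the canonical example of a provably optimal greedy choice. A minimal sketch (the instance values are the textbook example, not data from the document):

```python
def fractional_knapsack(items, capacity):
    """Greedy: take items in decreasing value/weight order, splitting the
    last one if needed; this is provably optimal for the fractional problem."""
    total = 0.0
    for value, weight in sorted(items, key=lambda vw: vw[0] / vw[1],
                                reverse=True):
        if capacity <= 0:
            break
        take = min(weight, capacity)   # locally optimal choice
        total += value * take / weight
        capacity -= take
    return total

# Classic instance: capacity 50, items given as (value, weight)
best = fractional_knapsack([(60, 10), (100, 20), (120, 30)], 50)
```

The exchange argument behind the optimality proof is exactly the one the document describes: any solution that deviates from the ratio order can be improved by swapping weight toward the higher-ratio item.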
Linear Machine Learning Models with L2 Regularization and Kernel Tricks (Fengtao Wu)
The slides are the course project presentation for the INFSCI 2915 Machine Learning Foundations course. The presentation reviews how L2 regularization is applied to linear models, including linear regression, logistic regression, the support vector machine, and the perceptron learning algorithm. It also reviews the quadratic programming problem, using the SVM as an example to illustrate the relation between the primal and dual problems. Finally, it reviews the general conclusion, the representer theorem, and connects the kernel trick to L2-regularized linear models.
This document discusses tensor decomposition with Python. It begins by explaining what tensor decomposition and factorization are, and how they can be used to represent multi-dimensional datasets and perform dimensionality reduction. It then discusses matrix and tensor factorization methods like NMF, topic modeling, and CP/PARAFAC decomposition. The remainder of the document provides examples of tensor decomposition using Python tools and libraries, and discusses applications to analyzing temporal network and sensor data.
Algorithm Design and Complexity - Course 3 (Traian Rebedea)
The document provides an overview of recursive algorithms and complexity analysis. It discusses recursive algorithms, divide and conquer design technique, and several examples of recursive algorithms including Towers of Hanoi, Merge Sort, and Quick Sort. For recursive algorithms, it explains how to analyze their running time using recurrence relations. It then covers four methods for solving recurrence relations: iteration, recursion trees, substitution method, and master theorem. The substitution method and master theorem are described as the most rigorous mathematical approaches.
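The recurrences named above can be checked numerically against their closed forms. A small sketch (the specific recurrences are the standard ones for merge sort and Towers of Hanoi; the closed forms follow from the master theorem and from iteration, respectively):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    """Merge sort recurrence T(n) = 2 T(n/2) + n, T(1) = 0 (n a power
    of two). Master theorem, case 2: T(n) = Theta(n log n)."""
    return 0 if n <= 1 else 2 * T(n // 2) + n

@lru_cache(maxsize=None)
def H(n):
    """Towers of Hanoi recurrence H(n) = 2 H(n-1) + 1, H(0) = 0,
    which iteration/substitution solves to H(n) = 2**n - 1."""
    return 0 if n <= 0 else 2 * H(n - 1) + 1

# For n = 2**k the merge sort recurrence solves exactly to k * 2**k
```

Evaluating the recurrence directly and comparing with the closed form is a useful sanity check before attempting the substitution-method proof.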
This document discusses various unitary transforms that can be used to decompose images, including the discrete Fourier transform (DFT), discrete cosine transform (DCT), Karhunen-Loève transform (KLT), Hadamard transform, and wavelet transforms. Unitary transforms have desirable properties like energy conservation, orthonormal bases, and de-correlation of image elements. The KLT provides optimal energy compaction and de-correlation but relies on signal statistics. Practical transforms like the DCT approximate the KLT while having fast implementations and being signal-independent. Transforms are widely used for applications like image compression, feature extraction, and pattern recognition.
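The KLT's de-correlation property described above can be demonstrated in a few lines of numpy: the KLT basis is the eigenbasis of the data covariance, so the transformed coefficients have a diagonal covariance. This is an illustrative sketch; the synthetic 2-D data stands in for neighboring-pixel pairs and is not from the document.

```python
import numpy as np

rng = np.random.default_rng(0)
# Strongly correlated 2-D data, standing in for neighboring-pixel pairs
samples = rng.standard_normal((1000, 2)) @ np.array([[2.0, 1.5],
                                                     [0.0, 0.5]])

cov = np.cov(samples, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)     # KLT basis: covariance eigenvectors
coeffs = samples @ eigvecs                 # KLT coefficients
cov_coeffs = np.cov(coeffs, rowvar=False)  # (numerically) diagonal
```

The dependence on `cov` is exactly the signal-statistics dependence the document notes: the basis must be recomputed for each source, which is why fixed, signal-independent transforms like the DCT are preferred in practice.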
Algorithm Design and Complexity - Course 11 (Traian Rebedea)
This document discusses algorithms for solving the all-pairs shortest paths (APSP) problem. It begins by defining the problem and describing the input as a weighted graph. It then discusses several approaches to solving APSP, including running a single-source shortest path algorithm from every vertex, and specialized dynamic programming algorithms like Floyd-Warshall and Johnson's algorithm. Floyd-Warshall finds shortest paths by considering intermediate vertices between pairs of vertices; it runs in O(n³) time for a graph with n vertices, using an O(n²) distance table. Johnson's algorithm improves the running time for sparse graphs with negative edge weights.
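Floyd-Warshall is short enough to state in full. A minimal sketch (the example graph is invented for illustration, not taken from the slides):

```python
def floyd_warshall(w):
    """All-pairs shortest paths in O(n^3) time: pass k allows vertex k
    as an intermediate vertex on every path."""
    n = len(w)
    d = [row[:] for row in w]          # O(n^2) distance table
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

INF = float("inf")
w = [
    [0, 3, INF, 7],
    [8, 0, 2, INF],
    [5, INF, 0, 1],
    [2, INF, INF, 0],
]
d = floyd_warshall(w)
```

The triple loop is the dynamic program: after iteration k, `d[i][j]` is the shortest path from i to j using only intermediate vertices from {0, ..., k}.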
Digital Signal Processing[ECEG-3171]-Ch1_L04 (Rediet Moges)
This Digital Signal Processing lecture material is the property of the author (Rediet M.). It is not for publication, nor is it to be sold or reproduced.
Digital Signal Processing[ECEG-3171]-Ch1_L03 (Rediet Moges)
This Digital Signal Processing lecture material is the property of the author (Rediet M.). It is not for publication, nor is it to be sold or reproduced.
This document discusses image transformation, which represents an image as a series of summations of unitary matrices. It describes how 1D signals can be represented as linear combinations of orthogonal basis functions through Fourier analysis. Similarly, images can be transformed and represented as linear combinations of orthonormal basis images using unitary transformations. The transformation decomposes an image into coefficient values that can be used for processing and analysis.
Lecture 15 DCT, Walsh and Hadamard Transform (VARUN KUMAR)
This document discusses discrete cosine, Walsh, and Hadamard transforms for 2D signals. It provides the mathematical formulas for the forward and inverse transforms of each. The discrete cosine transform uses cosine functions in its kernel. The Walsh transform kernel is built from (−1) factors whose exponents come from the binary representations of the sample and frequency indices; the Hadamard transform has a similar kernel. Each transform decomposes 2D signals into component frequencies or patterns in a way that is separable and symmetric.
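The Hadamard matrix underlying the last two transforms has a one-line recursive (Sylvester/Kronecker) construction, sketched here in numpy as an illustration; the order 8 is an arbitrary choice.

```python
import numpy as np

def hadamard(k):
    """Order-2**k Hadamard matrix via the Sylvester construction:
    H_{2n} = [[H_n, H_n], [H_n, -H_n]], all entries +/-1."""
    H = np.array([[1]])
    for _ in range(k):
        H = np.block([[H, H], [H, -H]])
    return H

H8 = hadamard(3)   # 8 x 8, rows mutually orthogonal: H H^T = N I
```

Because the entries are only ±1, applying the transform needs additions and subtractions alone, which is the source of its fast computation.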
The document discusses various image filtering techniques in the frequency domain. It begins by introducing convolution as frequency domain filtering using the Fourier transform. It then provides examples of low pass and high pass filtering using sharp cut-off and Gaussian filters. Additional topics covered include the Butterworth filter, homomorphic filtering to separate illumination and reflectance, and systematic design of 2D finite impulse response (FIR) filters.
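The core idea above, filtering by multiplying in the frequency domain, fits in a few lines of numpy. This is an illustrative sketch with an invented Gaussian transfer function and test images, not code from the document.

```python
import numpy as np

def gaussian_lowpass(image, sigma):
    """Multiply the 2-D DFT of the image by a Gaussian transfer function,
    then transform back: convolution realized in the frequency domain."""
    h, w = image.shape
    fy = np.fft.fftfreq(h)[:, None]   # vertical frequencies (cycles/sample)
    fx = np.fft.fftfreq(w)[None, :]   # horizontal frequencies
    Hf = np.exp(-(fx ** 2 + fy ** 2) / (2.0 * sigma ** 2))
    return np.fft.ifft2(np.fft.fft2(image) * Hf).real

rng = np.random.default_rng(0)
noise = rng.standard_normal((32, 32))
smooth = gaussian_lowpass(noise, 0.05)       # noise is strongly attenuated
flat = gaussian_lowpass(np.ones((8, 8)), 0.05)   # DC gain is 1
```

A sharp cut-off filter would instead zero `Hf` outside a radius, producing the ringing artifacts that motivate smoother choices like the Gaussian and Butterworth filters.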
This document provides an overview of key concepts in discrete random signal processing including:
1) Discrete time random processes are indexed sequences of random variables that map from a sample space to discrete time signals. Examples include tossing a coin or rolling a die.
2) Key concepts discussed include the Bernoulli process, ensemble averages, stationary and wide-sense stationary processes, autocorrelation, power spectral density, filtering of random processes, and the Yule-Walker equations.
3) Important theorems covered are the Parseval theorem relating energy in the time and frequency domains, and the Wiener-Khinchine relation showing the power spectral density is the Fourier transform of the autocorrelation function.
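The Wiener-Khinchine relation in point 3 can be verified numerically for a finite sequence, where the power spectrum equals the DFT of the circular autocorrelation. An illustrative sketch (the random test signal is invented):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(64)
N = x.size

X = np.fft.fft(x)
psd = np.abs(X) ** 2    # power spectrum of this realization

# Circular autocorrelation r[m] = sum_n x[n] * x[(n+m) mod N]
r = np.array([np.dot(x, np.roll(x, -m)) for m in range(N)])

# Wiener-Khinchine (finite, circular form): DFT of r equals |X|^2
```

For a stationary random process the statement holds in expectation for the true autocorrelation; the finite-sample identity above is the deterministic counterpart used in periodogram-based spectrum estimation.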
MVPA with SpaceNet: sparse structured priors (Elvis DOHMATOB)
The GraphNet (aka S-Lasso), as well as other "sparsity + structure" priors like TV (Total Variation) and TV-L1, are not easily applicable to brain data because of technical problems relating to the selection of the regularization parameters; in their own right, such models also lead to challenging high-dimensional optimization problems. In this manuscript, we present heuristics for speeding up the overall optimization process: (a) early stopping, whereby one halts the optimization when the test score (performance on left-out data) in the internal cross-validation for model selection stops improving, and (b) univariate feature screening, whereby irrelevant (non-predictive) voxels are detected and eliminated before the optimization problem is entered, thus reducing the size of the problem. Empirical results with GraphNet on real MRI (Magnetic Resonance Imaging) datasets indicate that these heuristics are a win-win strategy, as they add speed without sacrificing the quality of the predictions. We expect the proposed heuristics to work on other models like TV-L1 as well.
This document describes analysis of linear time-invariant (LTI) systems through impulse response and convolution. It defines LTI systems and explains that their behavior is characterized by impulse response. The input and output of an LTI system are related by convolution. Impulse response is the output of an LTI system when the input is a unit impulse. Given impulse response, the output can be found for any input using convolution. Difference equations are also used to represent LTI systems. The document provides examples of finding impulse responses and performing convolution in MATLAB. It discusses properties of convolution like associativity and commutativity. Tasks are given on finding impulse responses and convolving various signals.
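The central fact above, that convolution with the impulse response reproduces the system output, can be checked for a first-order difference equation. The document's examples use MATLAB; this is an equivalent numpy sketch with an invented system (`a = 0.5`) and signal length.

```python
import numpy as np

a = 0.5    # feedback coefficient of y[n] = a*y[n-1] + x[n]
N = 20

def simulate(x):
    """Run the difference equation directly, sample by sample."""
    y, prev = np.zeros(len(x)), 0.0
    for i, xn in enumerate(x):
        prev = a * prev + xn
        y[i] = prev
    return y

impulse = np.zeros(N)
impulse[0] = 1.0
h = simulate(impulse)               # impulse response: h[n] = a**n

step = np.ones(N)                   # any other input ...
y_direct = simulate(step)           # ... via the difference equation
y_conv = np.convolve(step, h)[:N]   # ... or via convolution with h
```

`np.convolve` plays the role of MATLAB's `conv`; the two outputs agree exactly over the simulated window because the input is zero before n = 0.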
This paper proposes a method called tensorizing neural networks to compress neural network models by decomposing weight matrices into tensor trains. It first introduces background on neural networks and backpropagation. The motivation is to address the large memory requirements of neural networks. It then formulates the problem and discusses a naive low-rank SVD approach. The main idea is to recursively apply low-rank SVD and adaptively reshape matrices into higher-dimensional tensors to decompose them into tensor train formats. Experiments on MNIST and ImageNet show the tensor train decomposition approach can effectively compress large fully-connected layers with better results than the naive method.
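The naive low-rank SVD baseline the paper compares against can be sketched directly; the tensor train method then generalizes this by reshaping into many small modes before factorizing. An illustrative sketch with a synthetic "weight matrix" (not the paper's code or data):

```python
import numpy as np

def low_rank_factors(W, r):
    """Truncated SVD: replace an m x n weight matrix by factors of sizes
    m x r and r x n, storing r*(m+n) numbers instead of m*n."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :r] * S[:r], Vt[:r]

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 8)) @ rng.standard_normal((8, 64))  # rank <= 8
A, B = low_rank_factors(W, 8)
# Storage drops from 64*64 = 4096 to 2*64*8 = 1024 parameters
```

Real fully-connected layers are rarely exactly low-rank, which is why the paper's recursive reshape-then-factorize scheme achieves better compression at the same accuracy.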
DSP_FOEHU - MATLAB 01 - Discrete Time Signals and Systems (Amr E. Mohamed)
This document provides an overview of discrete-time signals and systems in MATLAB. It defines discrete signals as sequences represented by x(n) and how they can be implemented as vectors in MATLAB. It describes various types of sequences like unit sample, unit step, exponential, sinusoidal, and random. It also covers operations on sequences like addition, multiplication, scaling, shifting, folding, and correlations. Discrete time systems are defined as operators that transform an input sequence x(n) to an output sequence y(n). Key properties discussed are linearity, time-invariance, stability, causality, and the use of convolution to represent the output of a linear time-invariant system. Examples are provided to demonstrate various concepts.
The document discusses uncertainty quantification (UQ) using quasi-Monte Carlo (QMC) integration methods. It introduces parametric operator equations for modeling input uncertainty in partial differential equations. Both forward and inverse UQ problems are considered. QMC methods like interlaced polynomial lattice rules are discussed for approximating high-dimensional integrals arising in UQ, with convergence rates superior to standard Monte Carlo. Algorithms for single-level and multilevel QMC are presented for solving forward and inverse UQ problems.
This document summarizes key aspects of the discrete Fourier transform (DFT). It defines the DFT, provides the formula for calculating it, and explains that the DFT transforms a discrete-time signal from the time domain to the frequency domain. It also outlines several important properties of the DFT, including linearity, shift property, duality, symmetry, and circular convolution. Examples are provided to illustrate duality and symmetry. References for further information on the discrete Fourier transform are also included.
DSP_2018_FOEHU - Lec 03 - Discrete-Time Signals and Systems (Amr E. Mohamed)
The document discusses discrete-time signals and systems. It defines discrete-time signals as sequences represented by x[n] and discusses important sequences like the unit sample, unit step, and periodic sequences. It then defines discrete-time systems as devices that take a discrete-time signal x(n) as input and produce another discrete-time signal y(n) as output. The document classifies systems as static vs. dynamic, time-invariant vs. time-varying, linear vs. nonlinear, and causal vs. noncausal. It provides examples to illustrate each classification.
DSP_FOEHU - MATLAB 04 - The Discrete Fourier Transform (DFT) (Amr E. Mohamed)
The document discusses the discrete Fourier transform (DFT) and its implementation in MATLAB. It introduces the DFT as a numerically computable alternative to the discrete-time Fourier transform and z-transform. The DFT decomposes a sequence into its constituent frequency components. MATLAB functions like fft and ifft efficiently compute the DFT and inverse DFT using fast Fourier transform algorithms. Zero-padding a sequence provides more samples of its discrete-time Fourier transform without adding new information. Circular convolution relates to the DFT through its properties. Linear convolution can be computed from the DFT of zero-padded sequences.
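The zero-padding point above is worth making concrete: multiplying unpadded DFTs gives circular convolution, but padding both sequences to length `len(x) + len(h) - 1` recovers the linear convolution. The document demonstrates this in MATLAB with `fft`/`ifft`; this is an equivalent numpy sketch with invented sequences.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
h = np.array([1.0, 1.0, 1.0, 1.0])

# Zero-pad both sequences to the linear-convolution length, multiply
# the DFTs, and invert: circular convolution of the padded sequences
# equals the linear convolution of the originals.
L = len(x) + len(h) - 1
y_fft = np.fft.ifft(np.fft.fft(x, L) * np.fft.fft(h, L)).real
y_ref = np.convolve(x, h)   # direct linear convolution, for comparison
```

The second argument of `np.fft.fft` performs the zero-padding; for long sequences the FFT route is far cheaper than direct convolution.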
Stochastic Alternating Direction Method of Multipliers (Taiji Suzuki)
This document discusses stochastic optimization methods for solving regularized learning problems with structured regularization and large datasets. It proposes applying the alternating direction method of multipliers (ADMM) in a stochastic manner. Specifically, it introduces two stochastic ADMM methods for online data: RDA-ADMM, which extends regularized dual averaging with ADMM updates; and OPG-ADMM, which extends online proximal gradient descent with ADMM updates. These methods allow the regularization term to be optimized in batches, resolving computational difficulties, while the loss is optimized online using only a small number of samples per iteration.
This document contains a 10-question problem set on digital signal processing. The problems cover topics such as determining the impulse response and transfer function of discrete-time systems described by difference equations, analyzing stability of systems, computing outputs given certain inputs, and realizing systems in block diagrams. Students are asked to find input-output relations, plot frequency responses, solve for poles and zeros, and analyze other properties of linear time-invariant discrete-time systems. The problems progress from simpler analyses to more complex issues like stability determination.
The document discusses the research focus of the TAO group at Inria Saclay, which includes machine learning and optimization applications for energy management. The group has one permanent member and others part-time. It collaborates closely with partners in Taiwan and the company Artelys. The research aims to address challenges in power grid simulation like variable demand and renewable energy sources using techniques like mathematical programming, reinforcement learning, and direct policy search combined with heuristics and Monte Carlo tree search.
Dynamic Optimization without Markov Assumptions: application to power systems (Olivier Teytaud)
Ilab METIS is a collaboration between TAO, a machine learning and optimization team at INRIA, and Artelys, an SME focused on optimization. They work on optimizing energy policies through modeling power systems and simulating operational and investment decisions. Their methodologies hybridize reinforcement learning, mathematical programming, and direct policy search to optimize complex, constrained problems with uncertainties while minimizing model error. They have applied these techniques to problems involving European-scale power grids with stochastic renewables.
This document discusses various unitary transforms that can be used to decompose images, including the discrete Fourier transform (DFT), discrete cosine transform (DCT), Karhunen-Loève transform (KLT), Hadamard transform, and wavelet transforms. Unitary transforms have desirable properties like energy conservation, orthonormal bases, and de-correlation of image elements. The KLT provides optimal energy compaction and de-correlation but relies on signal statistics. Practical transforms like the DCT approximate the KLT while having fast implementations and being signal-independent. Transforms are widely used for applications like image compression, feature extraction, and pattern recognition.
Algorithm Design and Complexity - Course 11Traian Rebedea
This document discusses algorithms for solving the all-pairs shortest paths (APSP) problem. It begins by defining the problem and describing the input as a weighted graph. It then discusses several approaches to solving APSP, including using single-source shortest path algorithms repeatedly, and specialized dynamic programming algorithms like Floyd-Warshall and Johnson's algorithm. Floyd-Warshall finds shortest paths by considering intermediate vertices between pairs of vertices. It works in O(n3) time and space for a graph with n vertices. Johnson's algorithm improves the running time for sparse graphs with negative weights.
Digital Signal Processing[ECEG-3171]-Ch1_L04Rediet Moges
This Digital Signal Processing Lecture material is the property of the author (Rediet M.) . It is not for publication,nor is it to be sold or reproduced.
#Africa#Ethiopia
Digital Signal Processing[ECEG-3171]-Ch1_L03Rediet Moges
This Digital Signal Processing Lecture material is the property of the author (Rediet M.) . It is not for publication,nor is it to be sold or reproduced.
#Africa#Ethiopia
This document discusses image transformation, which represents an image as a series of summations of unitary matrices. It describes how 1D signals can be represented as linear combinations of orthogonal basis functions through Fourier analysis. Similarly, images can be transformed and represented as linear combinations of orthonormal basis images using unitary transformations. The transformation decomposes an image into coefficient values that can be used for processing and analysis.
Lecture 15 DCT, Walsh and Hadamard TransformVARUN KUMAR
This document discusses discrete cosine, Walsh, and Hadamard transforms for 2D signals. It provides the mathematical formulas for the forward and inverse transforms of each. The discrete cosine transform uses cosine functions in its kernel. The Walsh transform uses the binary representation of values, with the kernel containing terms with (−1) factors. The Hadamard transform has a similar kernel to the Walsh transform. Each transform decomposes 2D signals into component frequencies or patterns in a way that is separable and symmetric.
The document discusses various image filtering techniques in the frequency domain. It begins by introducing convolution as frequency domain filtering using the Fourier transform. It then provides examples of low pass and high pass filtering using sharp cut-off and Gaussian filters. Additional topics covered include the Butterworth filter, homomorphic filtering to separate illumination and reflectance, and systematic design of 2D finite impulse response (FIR) filters.
This document provides an overview of key concepts in discrete random signal processing including:
1) Discrete time random processes are indexed sequences of random variables that map from a sample space to discrete time signals. Examples include tossing a coin or rolling a die.
2) Key concepts discussed include the Bernoulli process, ensemble averages, stationary and wide-sense stationary processes, autocorrelation, power spectral density, filtering of random processes, and the Yule-Walker equations.
3) Important theorems covered are the Parseval theorem relating energy in the time and frequency domains, and the Wiener-Khinchine relation showing the power spectral density is the Fourier transform of the autocorrelation function.
MVPA with SpaceNet: sparse structured priorsElvis DOHMATOB
The GraphNet (aka S-Lasso), as well as other “sparsity + structure” priors like TV (Total-Variation), TV-L1, etc., are not easily applicable to brain data because of technical problems
relating to the selection of the regularization parameters. Also, in
their own right, such models lead to challenging high-dimensional optimization problems. In this manuscript, we present some heuristics for speeding up the overall optimization process: (a) Early-stopping, whereby one halts the optimization process when the test score (performance on leftout data) for the internal cross-validation for model-selection stops improving, and (b) univariate feature-screening, whereby irrelevant (non-predictive) voxels are detected and eliminated before the optimization problem is entered, thus reducing the size of the problem. Empirical results with GraphNet on real MRI (Magnetic Resonance Imaging) datasets indicate that these heuristics are a win-win strategy, as they add speed without sacrificing the quality of the predictions. We expect the proposed heuristics to work on other models like TV-L1, etc.
This document describes analysis of linear time-invariant (LTI) systems through impulse response and convolution. It defines LTI systems and explains that their behavior is characterized by impulse response. The input and output of an LTI system are related by convolution. Impulse response is the output of an LTI system when the input is a unit impulse. Given impulse response, the output can be found for any input using convolution. Difference equations are also used to represent LTI systems. The document provides examples of finding impulse responses and performing convolution in MATLAB. It discusses properties of convolution like associativity and commutativity. Tasks are given on finding impulse responses and convolving various signals.
This paper proposes a method called tensorizing neural networks to compress neural network models by decomposing weight matrices into tensor trains. It first introduces background on neural networks and backpropagation. The motivation is to address the large memory requirements of neural networks. It then formulates the problem and discusses a naive low-rank SVD approach. The main idea is to recursively apply low-rank SVD and adaptively reshape matrices into higher-dimensional tensors to decompose them into tensor train formats. Experiments on MNIST and ImageNet show the tensor train decomposition approach can effectively compress large fully-connected layers with better results than the naive method.
DSP_FOEHU - MATLAB 01 - Discrete Time Signals and SystemsAmr E. Mohamed
This document provides an overview of discrete-time signals and systems in MATLAB. It defines discrete signals as sequences represented by x(n) and how they can be implemented as vectors in MATLAB. It describes various types of sequences like unit sample, unit step, exponential, sinusoidal, and random. It also covers operations on sequences like addition, multiplication, scaling, shifting, folding, and correlations. Discrete time systems are defined as operators that transform an input sequence x(n) to an output sequence y(n). Key properties discussed are linearity, time-invariance, stability, causality, and the use of convolution to represent the output of a linear time-invariant system. Examples are provided to demonstrate various concepts.
The document discusses uncertainty quantification (UQ) using quasi-Monte Carlo (QMC) integration methods. It introduces parametric operator equations for modeling input uncertainty in partial differential equations. Both forward and inverse UQ problems are considered. QMC methods like interlaced polynomial lattice rules are discussed for approximating high-dimensional integrals arising in UQ, with convergence rates superior to standard Monte Carlo. Algorithms for single-level and multilevel QMC are presented for solving forward and inverse UQ problems.
This document summarizes key aspects of the discrete Fourier transform (DFT). It defines the DFT, provides the formula for calculating it, and explains that the DFT transforms a discrete-time signal from the time domain to the frequency domain. It also outlines several important properties of the DFT, including linearity, shift property, duality, symmetry, and circular convolution. Examples are provided to illustrate duality and symmetry. References for further information on the discrete Fourier transform are also included.
DSP_2018_FOEHU - Lec 03 - Discrete-Time Signals and SystemsAmr E. Mohamed
The document discusses discrete-time signals and systems. It defines discrete-time signals as sequences represented by x[n] and discusses important sequences like the unit sample, unit step, and periodic sequences. It then defines discrete-time systems as devices that take a discrete-time signal x(n) as input and produce another discrete-time signal y(n) as output. The document classifies systems as static vs. dynamic, time-invariant vs. time-varying, linear vs. nonlinear, and causal vs. noncausal. It provides examples to illustrate each classification.
DSP_FOEHU - MATLAB 04 - The Discrete Fourier Transform (DFT)Amr E. Mohamed
The document discusses the discrete Fourier transform (DFT) and its implementation in MATLAB. It introduces the DFT as a numerically computable alternative to the discrete-time Fourier transform and z-transform. The DFT decomposes a sequence into its constituent frequency components. MATLAB functions like fft and ifft efficiently compute the DFT and inverse DFT using fast Fourier transform algorithms. Zero-padding a sequence provides more samples of its discrete-time Fourier transform without adding new information. Circular convolution relates to the DFT through its properties. Linear convolution can be computed from the DFT of zero-padded sequences.
Stochastic Alternating Direction Method of MultipliersTaiji Suzuki
This document discusses stochastic optimization methods for solving regularized learning problems with structured regularization and large datasets. It proposes applying the alternating direction method of multipliers (ADMM) in a stochastic manner. Specifically, it introduces two stochastic ADMM methods for online data: RDA-ADMM, which extends regularized dual averaging with ADMM updates; and OPG-ADMM, which extends online proximal gradient descent with ADMM updates. These methods allow the regularization term to be optimized in batches, resolving computational difficulties, while the loss is optimized online using only a small number of samples per iteration.
This document contains a 10-question problem set on digital signal processing. The problems cover topics such as determining the impulse response and transfer function of discrete-time systems described by difference equations, analyzing stability of systems, computing outputs given certain inputs, and realizing systems in block diagrams. Students are asked to find input-output relations, plot frequency responses, solve for poles and zeros, and analyze other properties of linear time-invariant discrete-time systems. The problems progress from simpler analyses to more complex issues like stability determination.
The document discusses the research focus of the TAO group at Inria Saclay, which includes machine learning and optimization applications for energy management. The group has one permanent member and others part-time. It collaborates closely with partners in Taiwan and the company Artelys. The research aims to address challenges in power grid simulation like variable demand and renewable energy sources using techniques like mathematical programming, reinforcement learning, and direct policy search combined with heuristics and Monte Carlo tree search.
Dynamic Optimization without Markov Assumptions: application to power systems (Olivier Teytaud)
Ilab METIS is a collaboration between TAO, a machine learning and optimization team at INRIA, and Artelys, an SME focused on optimization. They work on optimizing energy policies through modeling power systems and simulating operational and investment decisions. Their methodologies hybridize reinforcement learning, mathematical programming, and direct policy search to optimize complex, constrained problems with uncertainties while minimizing model error. They have applied these techniques to problems involving European-scale power grids with stochastic renewables.
Complexity of planning and games with partial information (Olivier Teytaud)
A survey of the computational complexity and computability of sequential decision making (games, planning). It contains two more detailed proofs:
- EXPSPACE-completeness of unobservable adversarial planning for the existence of a 100% winning strategy (Hasslum et al.)
- undecidability of unobservable adversarial planning for an arbitrary winning rate (including optimal play in the Nash sense)
Weather, opponents, geopolitics: so many uncertainties in such a setting. How can we manage power systems in spite of these uncertainties, and how can we decide on investments?
Talk at Saint-Etienne in 2015; thanks to R. Leriche and to the "games and optimizations" days in Saint-Etienne.
This document discusses the go variant "killall go" where black wins by capturing all white stones, and proposes different board sizes and handicap placements to test balance. It analyzes MoGo evaluations of some placements, finding discrepancies between single-game evaluations and full-game results. It concludes more research could be done on MCTS opening evaluations, Batoo variants where players take turns placing stones, and using bandits to learn balanced handicap placements.
Computers have made progress playing the game of Go but still have weaknesses. In 9x9 Go, computers have reached human professional level by beating professionals without handicaps. In 19x19 Go, however, computers have only beaten professionals with handicaps of 6-9 stones, and still require at least a 6-stone handicap against top professionals. Future improvements may allow computers to reach professional human level in 19x19 Go without handicaps.
The document discusses tools for artificial intelligence and their applications. It describes Olivier Teytaud's work at TAO, a research group at Inria Saclay focused on reservoir computing, optimal decision making under uncertainty, optimization, and machine learning. It then provides examples of applications of these tools in electricity generation, Urban Rivals, Pokémon, Minesweeper, and solving unsolved situations in the game of Go. Olivier suggests that breakthroughs in games can help open doors to applying these algorithms to more important real-world problems by building trust in the approaches.
This document provides an overview of statistical concepts related to item response theory (IRT), including posterior probability, Bayes' theorem, maximum a posteriori (MAP) estimation, and the Jacobi algorithm. It discusses how to initialize IRT parameters like student abilities and item parameters, and evaluates options for fitting IRT models like R and Octave packages.
The document discusses energy management in France and Taiwan. It notes that both countries have a long history in energy but differ in their approaches. France relies heavily on nuclear power through large state-owned companies while Taiwan has focused more recently on renewable resources like solar, wind, and ocean currents. The challenges of energy management are outlined as deciding how to operate existing power sources and plan new investments over decades to improve the system under uncertainty. Optimization methods that incorporate stochastic modeling are proposed to help with these long-term planning problems.
This document discusses using Meta-Monte Carlo Tree Search (Meta-MCTS) to build an opening book for 7x7 Go. Meta-MCTS improved its play against a sparring partner that incorporated human variations. While Meta-MCTS won all games as black and white against professionals, humans found at least one variation where it did not play correctly. The document concludes that Meta-MCTS performed well but incorporating human data helped, and exactly solving 7x7 Go would require immense work collecting and solving all leaf variations.
Machine learning 2016: deep networks and Monte Carlo Tree Search (Olivier Teytaud)
This talk describes two key machine learning algorithms, MCTS and deep networks (DN), presented as the main AI innovations of the last 20 years. Interestingly, the talk was given a few days before a combination of MCTS and DNs was used by Google DeepMind to win against a professional player (https://docs.google.com/document/d/1ZjniEJiotdCfvBYI3MTBpjtOTSlUvf3ma7V8DHVmjhk/edit#)
Talk ENS-Lyon at "Sept Laux"
Introduction to the TAO Uct Sig, a team working on computational intelligence... (Olivier Teytaud)
The document discusses research from Tao-Uctsig, a special interest group within Tao focused on artificial intelligence. Tao has 11 permanent staff and around 22 PhD students/postdocs working across various fields of mathematics, computer science, and sciences. The SIG works on problems where computers make decisions, particularly challenges where humans are currently better than computers. Their work involves games, important applications, and previously included mathematics. Specific applications discussed include controlling a robot arm, analyzing strategy in Pokemon and Urban Rivals, solving Minesweeper puzzles, and Go. Industrial applications include helping optimize France's major electricity industry.
Stochastic modelling and quasi-random numbers (Olivier Teytaud)
Stochastic models use random numbers to simulate random variables and processes. However, random numbers can be disappointing as they do not cover the space uniformly. Quasi-random numbers provide an alternative by being more uniformly distributed. The document discusses using quasi-random numbers instead of purely random numbers in stochastic models to generate sequences that better cover the sample space.
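A small sketch of what "better coverage" means (grid size and point count are arbitrary choices for the illustration): count the cells of a 16x16 grid over the unit square that receive no sample point. With n = 1024 points, a scrambled Sobol' sequence, being a digital net in base 2, leaves no cell empty, while i.i.d. uniform points typically miss several:

```python
import numpy as np
from scipy.stats import qmc

n, d, g = 1024, 2, 16  # number of points, dimension, grid cells per axis

def empty_cells(pts):
    # number of cells of the g-by-g grid over [0,1]^2 containing no point
    idx = np.minimum((pts * g).astype(int), g - 1)
    return g * g - len({(i, j) for i, j in idx})

rand_empty = empty_cells(np.random.default_rng(0).random((n, d)))
sobol_empty = empty_cells(qmc.Sobol(d, scramble=True, seed=0).random(n))
print(rand_empty, sobol_empty)
```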
Introductory talk; more technical details in:
@inproceedings{schoenauer:inria-00625855,
hal_id = {inria-00625855},
url = {http://hal.inria.fr/inria-00625855},
title = {{A Rigorous Runtime Analysis for Quasi-Random Restarts and Decreasing Stepsize}},
author = {Schoenauer, Marc and Teytaud, Fabien and Teytaud, Olivier},
abstract = {{Multi-Modal Optimization (MMO) is ubiquitous in engineering, machine learning and artificial intelligence applications. Many algorithms have been proposed for multimodal optimization, and many of them are based on restart strategies. However, only few works address the issue of initialization in restarts. Furthermore, very few comparisons have been done, between different MMO algorithms, and against simple baseline methods. This paper proposes an analysis of restart strategies, and provides a restart strategy for any local search algorithm for which theoretical guarantees are derived. This restart strategy is to decrease some 'step-size', rather than to increase the population size, and it uses quasi-random initialization, that leads to a rigorous proof of improvement with respect to random restarts or restarts with constant initial step-size. Furthermore, when this strategy encapsulates a (1+1)-ES with 1/5th adaptation rule, the resulting algorithm outperforms state of the art MMO algorithms while being computationally faster.}},
language = {Anglais},
affiliation = {TAO - INRIA Saclay - Ile de France , Microsoft Research - Inria Joint Centre - MSR - INRIA , Laboratoire de Recherche en Informatique - LRI},
booktitle = {{Artificial Evolution}},
address = {Angers, France},
audience = {internationale },
year = {2011},
month = Oct,
pdf = {http://hal.inria.fr/inria-00625855/PDF/qrrsEA.pdf},
}
1. The document discusses various methods for continuous optimization, including rates of convergence for noise-free and noisy settings.
2. In noise-free settings, methods like Newton's method and BFGS have quadratic or superlinear convergence rates, while evolutionary strategies (ES) have linear convergence rates.
3. Lower bounds on optimization complexity are also discussed, showing minimum comparisons or evaluations needed depending on problem properties like domain size and precision required.
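As a sketch of the linear convergence of an ES (the step-size constants below are illustrative choices, not taken from the document), here is a minimal (1+1)-ES with a 1/5th-success-rule flavour on the sphere function:

```python
import numpy as np

def sphere(x):
    return float(x @ x)

rng = np.random.default_rng(0)
d = 10
x, fx, sigma = np.ones(d), float(d), 1.0
for _ in range(5000):
    y = x + sigma * rng.standard_normal(d)
    fy = sphere(y)
    if fy <= fx:                    # success: accept and enlarge the step-size
        x, fx = y, fy
        sigma *= 1.5 ** (1.0 / d)
    else:                           # failure: shrink; equilibrium near a 1/5 success rate
        sigma *= 1.5 ** (-0.25 / d)

print(fx)  # log f decreases roughly linearly with the iteration count
```

The geometric decrease of f is the "linear convergence rate" referred to above, in contrast with the quadratic or superlinear rates of Newton-type methods.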
Combining UCT and Constraint Satisfaction Problems for Minesweeper (Olivier Teytaud)
@inproceedings{buffet:hal-00750577,
hal_id = {hal-00750577},
url = {http://hal.inria.fr/hal-00750577},
title = {{Optimistic Heuristics for MineSweeper}},
author = {Buffet, Olivier and Lee, Chang-Shing and Lin, Woanting and Teytaud, Olivier},
abstract = {{We present a combination of Upper Confidence Tree (UCT) and domain specific solvers, aimed at improving the behavior of UCT for long term aspects of a problem. Results improve the state of the art, combining top performance on small boards (where UCT is the state of the art) and on big boards (where variants of CSP rule).}},
language = {Anglais},
affiliation = {MAIA - INRIA Nancy - Grand Est / LORIA , Department of Computer Science and Information Engineering - CSIE , National University of Tainan - NUTN , TAO - INRIA Saclay - Ile de France , Laboratoire de Recherche en Informatique - LRI , Department of Electrical Engineering and Computer Science - Institut Montefiore},
booktitle = {{International Computer Symposium}},
address = {Hualien, Ta{\"\i}wan, Province De Chine},
audience = {internationale },
year = {2012},
pdf = {http://hal.inria.fr/hal-00750577/PDF/mines3.pdf},
}
The document discusses portfolio methods for optimization problems with uncertainty. It introduces noisy optimization problems where the objective function includes random variables. It then discusses various optimization criteria and methods for noisy optimization problems, including resampling methods to reduce noise. The document also covers portfolio approaches that combine or select among multiple optimization solvers to handle uncertainty.
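The basic resampling argument can be sketched directly (the noisy objective below is a made-up example): averaging k evaluations at the same point divides the noise standard deviation by sqrt(k), at the cost of k function calls:

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_f(x, k=1):
    # hypothetical noisy objective: sphere value plus Gaussian noise, averaged over k calls
    return np.mean(x @ x + rng.standard_normal(k))

x = np.array([1.0, 2.0])
single = np.std([noisy_f(x, k=1) for _ in range(20000)])
avg100 = np.std([noisy_f(x, k=100) for _ in range(20000)])
print(single / avg100)  # close to sqrt(100) = 10
```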
A Mathematically Derived Number of Resamplings for Noisy Optimization (GECCO2... (Jialin LIU)
"A Mathematically Derived Number of Resamplings for Noisy Optimization". Jialin Liu, David L. St-Pierre and Olivier Teytaud. (Accepted as short paper) Genetic and Evolutionary Computation Conference (GECCO), 2014.
The document discusses algorithms, including their definition, properties, analysis of time and space complexity, and examples of recursion and iteration. It defines an algorithm as a finite set of instructions to accomplish a task. Properties include inputs, outputs, finiteness, definiteness, and effectiveness. Time complexity is analyzed using big-O notation, while space complexity considers static and variable parts. Recursion uses function calls to solve sub-problems, while iteration uses loops. Examples include factorial calculation, GCD, and Towers of Hanoi solved recursively.
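The recursive examples mentioned above can be written compactly; a quick sketch of factorial, Euclid's GCD, and Towers of Hanoi:

```python
def factorial(n):                      # recursive: n! = n * (n-1)!
    return 1 if n <= 1 else n * factorial(n - 1)

def gcd(a, b):                         # Euclid's algorithm, iterative
    while b:
        a, b = b, a % b
    return a

def hanoi(n, src="A", aux="B", dst="C"):
    """Return the list of moves solving Towers of Hanoi with n disks (2**n - 1 moves)."""
    if n == 0:
        return []
    return hanoi(n - 1, src, dst, aux) + [(src, dst)] + hanoi(n - 1, aux, src, dst)

print(factorial(5), gcd(48, 36), len(hanoi(4)))  # 120 12 15
```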
This document discusses automatic Bayesian cubature for numerical integration. It begins with an introduction to multivariate integration and the challenges it poses. It then describes an automatic cubature algorithm that generates sample points and computes error bounds iteratively until a tolerance threshold is met. Next, it covers Bayesian cubature, which treats integrands as random functions to obtain probabilistic error bounds. It defines a Bayesian trio identity relating the integration error to discrepancies, variations, and alignments. The document concludes with discussions of future work.
This document discusses algorithms and their analysis. It begins by defining an algorithm and its key characteristics like being finite, definite, and terminating after a finite number of steps. It then discusses designing algorithms to minimize cost and analyzing algorithms to predict their performance. Various algorithm design techniques are covered like divide and conquer, binary search, and its recursive implementation. Asymptotic notations like Big-O, Omega, and Theta are introduced to analyze time and space complexity. Specific algorithms like merge sort, quicksort, and their recursive implementations are explained in detail.
The document describes the syllabus for a course on design analysis and algorithms. It covers topics like asymptotic notations, time and space complexities, sorting algorithms, greedy methods, dynamic programming, backtracking, and NP-complete problems. It also provides examples of algorithms like computing greatest common divisor, Sieve of Eratosthenes for primes, and discusses pseudocode conventions. Recursive algorithms and examples like Towers of Hanoi and permutation generation are explained. Finally, it outlines the steps for designing algorithms like understanding the problem, choosing appropriate data structures and computational devices.
This document summarizes a talk on dynamic graph algorithms. It begins with an introduction to dynamic graph algorithms, which involve maintaining a graph structure and answering queries efficiently as the graph undergoes a sequence of edge insertions and deletions. It then discusses several examples of fully dynamic algorithms for problems like connectivity, minimum spanning trees, and graph spanners. A key data structure introduced is the Euler tour tree, which represents a dynamic tree as a one-dimensional structure to support efficient updates and queries. The document concludes by outlining a fully dynamic randomized algorithm for maintaining connectivity under edge updates with polylogarithmic update time, using a hierarchical approach with multiple levels of edge partitions and ET trees.
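The fully dynamic structures above (Euler tour trees, hierarchical edge levels) are involved; for the insertions-only special case, a plain union-find already answers connectivity queries in near-constant amortised time, which makes a useful baseline sketch:

```python
class UnionFind:
    """Incremental connectivity: edge insertions and connectivity queries only."""
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, v):
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]  # path halving
            v = self.parent[v]
        return v

    def union(self, u, v):           # insert edge (u, v)
        self.parent[self.find(u)] = self.find(v)

    def connected(self, u, v):       # connectivity query
        return self.find(u) == self.find(v)

uf = UnionFind(5)
uf.union(0, 1); uf.union(3, 4)
print(uf.connected(0, 1), uf.connected(1, 3))  # True False
```

Handling edge deletions as well is precisely what requires the hierarchical ET-tree machinery of the talk.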
A generalized class of normalized distance functions called Q-Metrics is described in this presentation. The Q-Metrics approach relies on a unique functional, using a single bounded parameter (Lambda), which characterizes the conventional distance functions in a normalized per-unit metric space. In addition to this coverage property, a distinguishing and extremely attractive characteristic of the Q-Metric function is its low computational complexity. Q-Metrics satisfy the standard metric axioms. Novel networks for classification and regression tasks are defined and constructed using Q-Metrics. These new networks are shown to outperform conventional feed forward back propagation networks with the same size when tested on real data sets.
The document provides an overview and outline of the course "Optimization for Machine Learning". Key points:
- The course covers topics like convexity, gradient methods, constrained optimization, proximal algorithms, stochastic gradient descent, and more.
- Mathematical modeling and computational optimization for machine learning are discussed. Optimization algorithms like gradient descent and stochastic gradient descent are important for learning model parameters.
- Convex optimization problems have desirable properties like every local minimum being a global minimum. Gradient descent and related algorithms are guaranteed to converge for convex problems.
- Convex sets and functions are introduced, including characterizations using epigraphs and subgradients. Convex functions have useful properties like continuity and satisfying Jensen's inequality.
The document discusses algorithm analysis and asymptotic notation. It begins by explaining how to analyze algorithms to predict resource requirements like time and space. It defines asymptotic notation like Big-O, which describes an upper bound on the growth rate of an algorithm's running time. The document then provides examples of analyzing simple algorithms and classifying functions based on their asymptotic growth rates. It also introduces common time functions like constant, logarithmic, linear, quadratic, and exponential time and compares their growth.
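Growth rates become concrete by counting basic operations (a deliberately simple sketch): when n grows by a factor of 10, a linear loop does 10x the work and a quadratic nested loop 100x:

```python
def linear_ops(n):
    # O(n): one operation per loop iteration
    count = 0
    for _ in range(n):
        count += 1
    return count

def quadratic_ops(n):
    # O(n^2): one operation per pair of iterations
    count = 0
    for _ in range(n):
        for _ in range(n):
            count += 1
    return count

for n in (10, 100):
    print(n, linear_ops(n), quadratic_ops(n))
```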
Introduction to Algorithms and Asymptotic Notation (Amrinder Arora)
Asymptotic Notation is a notation used to represent and compare the efficiency of algorithms. It is a concise notation that deliberately omits details, such as constant time improvements, etc. Asymptotic notation consists of 5 commonly used symbols: big oh, small oh, big omega, small omega, and theta.
Simplified Runtime Analysis of Estimation of Distribution Algorithms (Per Kristian Lehre)
We demonstrate how to estimate the expected optimisation time of UMDA, an estimation of distribution algorithm, using the level-based theorem. The talk was given at the GECCO 2015 conference in Madrid, Spain.
This document provides an introduction to linear and integer programming. It defines key concepts such as linear programs (LP), integer programs (IP), and mixed integer programs (MIP). It discusses the complexity of different optimization problem types and gives examples of LP and IP formulations. It also covers common techniques for solving LPs and IPs, including the simplex method, cutting plane methods, branch and bound, and heuristics like beam search.
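A minimal LP example (the numbers are made up), using SciPy's linprog, which minimises, so the maximisation objective is negated:

```python
from scipy.optimize import linprog

# Maximise 3x + 2y subject to x + y <= 4, x + 3y <= 6, x >= 0, y >= 0.
res = linprog(c=[-3, -2],
              A_ub=[[1, 1], [1, 3]], b_ub=[4, 6],
              bounds=[(0, None), (0, None)])

print(res.x, -res.fun)  # optimal vertex (4, 0) with objective value 12
```

Restricting x and y to integers would turn this into an IP, for which branch and bound or cutting planes are needed.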
Opening of our Deep Learning Lunch & Learn series. First session: introduction to Neural Networks, Gradient descent and backpropagation, by Pablo J. Villacorta, with a prologue by Fernando Velasco
This document discusses algorithm analysis and asymptotic notation. It defines common asymptotic notations like O(N), Ω(N), and Θ(N) and provides examples of analyzing simple algorithms and determining their time complexities. The document also outlines general rules for analyzing algorithms with loops, nested loops, consecutive statements, and recursion to determine their asymptotic running times.
Subgradient Methods for Huge-Scale Optimization Problems - Yurii Nesterov, Cat... (Yandex)
We consider a new class of huge-scale problems, the problems with sparse subgradients. The most important functions of this type are piecewise linear. For optimization problems with uniform sparsity of corresponding linear operators, we suggest a very efficient implementation of subgradient iterations, the total cost of which depends only logarithmically on the dimension. This technique is based on a recursive update of the results of matrix/vector products and the values of symmetric functions. It works well, for example, for matrices with few nonzero diagonals and for max-type functions.
We show that the updating technique can be efficiently coupled with the simplest subgradient methods. Similar results can be obtained for a new non-smooth random variant of a coordinate descent scheme. We also present promising results of preliminary computational experiments.
1. Noisy optimization
Astete-Morales, Cauwet, Decock, Liu, Rolet, Teytaud
Thanks all!
● Runtime analysis
● Black-box complexity
● Noisy objective function

@inproceedings{dagstuhlNoisyOptimization,
  title     = {Noisy Optimization},
  booktitle = {Proceedings of Dagstuhl Seminar 13271 "Theory of Evolutionary Algorithms"},
  author    = {Sandra Astete-Morales and Marie-Liesse Cauwet and Jeremie Decock
               and Jialin Liu and Philippe Rolet and Olivier Teytaud},
  year      = {2013}}
2. This talk: about noisy optimization in continuous domains
[Diagram: boxes "EA theory in discrete domains" and "EA theory in continuous domains"; one of them is labeled "has an impact on algorithms".]
Yes, I wanted a little bit of troll in my talk.
3. In case I still have friends in the room, a second troll.
[Diagram: the same two boxes; the other one is now labeled "has an impact on algorithms" (but ok, when we really want it to work, we use mathematical programming).]
4. Noisy optimization: preliminaries (1)
● Evaluated points: the points at which the fitness function is evaluated (they might be bad).
● Obtained fitness values.
● Recommended points: the points recommended as approximations of the optimum (they should be good).
6. Noisy optimization: preliminaries (3)
A noise model (among others):
    f(x) = ||x||^p + ||x||^z * Noise
with "Noise" some i.i.d. noise.
● z = 0 ==> additive noise with constant variance
● z = 2 ==> noise quickly converging to 0 near the optimum
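The noise model above can be sketched in a few lines (a hypothetical helper: the function name and the choice of Gaussian noise are mine; `p` and `z` follow the slide):

```python
import numpy as np

def noisy_fitness(x, p=2.0, z=0.0, rng=None):
    """Noisy objective: f(x) = ||x||^p + ||x||^z * Noise, Noise i.i.d.

    z = 0: additive noise with constant variance everywhere.
    z = 2: noise variance shrinks quickly as x approaches the optimum.
    """
    rng = np.random.default_rng() if rng is None else rng
    norm = np.linalg.norm(x)
    return norm ** p + norm ** z * rng.standard_normal()
```

With z = 2 the noise vanishes exactly at the optimum, which is why this regime behaves almost like the noise-free case.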
7. Part 1: what about ES?
Let us start with:
● ES with nothing special
● Just reevaluate, and same business as usual
8. Noisy optimization: log-log or log-linear convergence for ES?
Set z = 0 (constant additive noise).
Define r(n) = number of reevaluations at iteration n. Schemes considered:
● r(n) = poly(n)
● r(n) = expo(n)
● r(n) = poly( 1 / step-size(n) )
Results:
● Log-linear convergence w.r.t. iterations
● Log-log convergence w.r.t. evaluations
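A minimal sketch of this "nothing special" ES with a reevaluation schedule r(n). The (1+1) acceptance rule, the step-size adaptation constants, and the linear default schedule are illustrative assumptions, not the slides' exact setup:

```python
import numpy as np

def one_plus_one_es(f, x0, iters=50, sigma=1.0, revals=lambda n: n + 1, rng=None):
    """(1+1)-ES where each fitness value is averaged over r(n) reevaluations.

    Averaging r(n) samples divides the noise variance by r(n); the slides
    consider r(n) = poly(n), expo(n) and poly(1 / step-size(n)).
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)

    def avg_fitness(pt, n):
        r = revals(n)
        return sum(f(pt, rng) for _ in range(r)) / r

    for n in range(1, iters + 1):
        y = x + sigma * rng.standard_normal(x.shape)  # mutation
        if avg_fitness(y, n) <= avg_fitness(x, n):    # averaged comparison
            x = y
            sigma *= 1.5  # crude success-based step-size adaptation
        else:
            sigma *= 0.9
    return x
```

Because the per-iteration cost r(n) grows, progress that is log-linear in iterations becomes only log-log when plotted against the number of evaluations.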
9. Noisy optimization: log-log or log-linear convergence for ES?
● r(n) = expo(n) ==> log(||~x_n||) ≈ -C log(n)
● r(n) = poly( 1 / step-size(n) ) ==> log(||~x_n||) ≈ -C log(n)
Noise' = noise after averaging over r(n) reevaluations.
Proof idea: enough reevaluations so that no pair of points is misranked:
● d(n) = sequence decreasing to zero
● p(n) = P( |E f(x_i) - E f(x_j)| <= d(n) )  (≈ same fitness)
● p'(n) = λ² P( |Noise'(x_i) - Noise'(x_j)| > d(n) )
● choose the coefficients so that, if there is linear convergence in the noise-free case, ∑ p(n) + p'(n) < δ
10. So with simple reevaluation schemes
● z = 0 ==> log-log convergence (w.r.t. #evaluations)
● z large (2?) ==> not finished checking; we believe it is log-linear
Now: what about "races"?
Races = reevaluate until statistical difference.
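A race between two points can be sketched with a Hoeffding-style confidence bound (hypothetical sketch; it assumes the noisy fitness values lie in a range of width `noise_range`, an assumption not stated on the slide):

```python
import math
import numpy as np

def race(f, x, y, delta=0.05, noise_range=1.0, max_evals=100000, rng=None):
    """Reevaluate x and y until a Hoeffding bound separates their means.

    Returns the point with the lower empirical mean once the confidence
    intervals no longer overlap, or None if the budget runs out (the
    slides' warning: nearly equal points need a huge number of revals).
    """
    rng = np.random.default_rng() if rng is None else rng
    sx = sy = 0.0
    for n in range(1, max_evals + 1):
        sx += f(x, rng)
        sy += f(y, rng)
        # Hoeffding: with prob >= 1 - delta, |empirical - true mean| <= eps
        eps = noise_range * math.sqrt(math.log(2.0 / delta) / (2.0 * n))
        if abs(sx / n - sy / n) > 2.0 * eps:
            return x if sx < sy else y
    return None  # statistically indistinguishable within the budget
```

The width eps shrinks like 1/sqrt(n), so separating means that differ by d costs on the order of 1/d² evaluations: this is exactly why careful sampling matters.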
11. Part 2
● Bernoulli noise model (sorry, another model...)
● Combining bandits and ES for runtime analysis (see also work by V. Heidrich-Meisner and C. Igel)
● Complexity bounds by information theory (#bits of information)
13. ES + races
● The race stops when you know one of the good points
● Assume the sphere function (sorry)
● Sample five points in a row, on one axis
● At some point two of them are statistically different (because of the sphere)
● Reduce the domain accordingly
● Now split another axis
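A hypothetical 1-D sketch of this axis-wise scheme, with a fixed reevaluation budget per point standing in for a full race (the function name, the shrink factor, and the budget are my assumptions):

```python
import numpy as np

def race_shrink_1d(f, lo, hi, rounds=10, revals=200, rng=None):
    """Sketch of 'ES + races' on the sphere, along one axis: at each round,
    sample five equally spaced points in [lo, hi], average `revals` noisy
    evaluations per point, and shrink the interval around the best point."""
    rng = np.random.default_rng() if rng is None else rng
    for _ in range(rounds):
        xs = np.linspace(lo, hi, 5)
        means = [np.mean([f(x, rng) for x_rep in range(revals)]) for x in xs]
        best = xs[int(np.argmin(means))]
        width = (hi - lo) / 2.0       # halve the interval each round
        lo, hi = best - width / 2.0, best + width / 2.0
    return (lo + hi) / 2.0
```

On the full sphere one would apply this axis by axis; the statistical test from a real race replaces the fixed `revals` budget.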
16. Black-box complexity: the tools
● OK, we got a rate for both evaluated and recommended points.
● What about lower bounds?
● UR = uniform rate = bound for the worst evaluated point (see also: cumulative regret).
Proof technique for precision e with Bernoulli noise, with b = #bits of information (yes, it is a rigorous proof):
● b > log2( number of possible solutions with precision e )
● b < ∑ q(n), with q(n) = P( fitness(x*, w) ≠ fitness(x**, w) )
17. Rates on the Bernoulli model
log ||x(n)|| < C - α log(n)
Other algorithms reach α = 1/2. So, yes, sampling close to the optimum sometimes makes algorithms slower.
18. Conclusions of parts 1 and 2
● Part 1: with constant noise, log-log convergence; improved with noise decreasing to 0.
● Part 2:
  – When using races, sampling should be careful.
  – Sampling not too close to the optimum can improve convergence rates.
19. Part 3: Mathematical programming and surrogate models
● Fabian 1967: good convergence rate for recommended points (evaluated points: not good) with stochastic gradient.
● z = 0, constant noise:
    log ||~x_n|| ≈ C - log(n)/2    (rate = -1/2)
● Chen 1988: this is optimal.
● Rediscovered recently in ML conferences.
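Fabian's approach builds on Kiefer-Wolfowitz stochastic approximation: descend along finite-difference gradient estimates with shrinking step sizes. A minimal sketch (the schedules a/n and c/n^(1/4) are common textbook choices, not necessarily Fabian's exact ones):

```python
import numpy as np

def kiefer_wolfowitz(f, x0, iters=500, a=0.5, c=0.5, rng=None):
    """Stochastic gradient descent with central finite-difference gradients.

    a_n = a/n is the step size, c_n = c/n^(1/4) the finite-difference width.
    The recommended point ~x_n converges fast even though the evaluated
    points x_n +/- c_n e_i hover away from the optimum.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    d = x.size
    for n in range(1, iters + 1):
        an, cn = a / n, c / n ** 0.25
        g = np.zeros(d)
        for i in range(d):
            e = np.zeros(d)
            e[i] = cn
            # two noisy evaluations per coordinate
            g[i] = (f(x + e, rng) - f(x - e, rng)) / (2.0 * cn)
        x = x - an * g
    return x  # the recommended point ~x_n
```

The gap between the quality of x (recommended) and of the probing points x ± c_n e_i (evaluated) is the simple-regret vs cumulative-regret tension the conclusions return to.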
20. Newton-style information
● Estimate the gradient & Hessian by finite differences.
● s(SR) = slope(simple regret) = log |E f(~x_n) - E f(x*)| / log(n)
● s(CR) = slope(cumulative regret) = log |∑_i (E f(x_i) - E f(x*))| / log(n)
Tuned quantities: the step-size for finite differences and the #revals per point.
21. Newton-style information
● Estimate the gradient & Hessian by finite differences.
● s(SR) = slope(simple regret) = log |E f(~x_n) - E f(x*)| / log(n)
● s(CR) = slope(cumulative regret) = log |∑_i (E f(x_i) - E f(x*))| / log(n)
(same rate as Fabian for z = 0; better for z > 0)
(CR is the ES-friendly criterion: it considers the worst points)
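The two tuned quantities named on slide 20, the finite-difference step size and the #revals per point, appear explicitly in this hypothetical sketch of one Newton-style step (function name and parameter choices are mine):

```python
import numpy as np

def newton_step(f, x, h=0.1, revals=100, rng=None):
    """One Newton-style step: gradient and Hessian estimated by central
    finite differences, each evaluation averaged over `revals` noisy
    samples. `h` is the finite-difference step size."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    d = x.size

    def F(p):  # averaged noisy fitness at point p
        return np.mean([f(p, rng) for rep in range(revals)])

    e = np.eye(d) * h
    g = np.array([(F(x + e[i]) - F(x - e[i])) / (2 * h) for i in range(d)])
    H = np.empty((d, d))
    for i in range(d):
        for j in range(d):
            # central cross difference for the second derivatives
            H[i, j] = (F(x + e[i] + e[j]) - F(x + e[i] - e[j])
                       - F(x - e[i] + e[j]) + F(x - e[i] - e[j])) / (4 * h * h)
    return x - np.linalg.solve(H, g)  # Newton update
```

The probing points x ± h e_i ± h e_j are the "evaluated" points; the returned update is the "recommended" point, so h and revals directly trade off s(SR) against s(CR).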
22. Conclusions
● Compromise between SR and CR (ES better for CR).
● Information theory can help for proving complexity bounds in the noisy case.
● Bandits + races require careful sampling (if two points are very close, a huge #revals is needed).
● Newton-style algorithms are fast … when z > 0.
● No difference between evaluated points & recommended points ==> slow rates (similar to the simple regret vs cumulative regret debates in bandits).
23. I have no xenophobia against discrete domains
http://www.lri.fr/~teytaud/discretenoise.pdf
is a Dagstuhl-collaborative work on discrete optimization (thanks Youhei, Adam, Jonathan).