This talk describes two key machine learning algorithms, MCTS and deep networks (DN), presented as the main AI innovations of the last 20 years. Interestingly, the talk was given a few days before a combination of MCTS and DN was used by Google DeepMind to win against a professional player (https://docs.google.com/document/d/1ZjniEJiotdCfvBYI3MTBpjtOTSlUvf3ma7V8DHVmjhk/edit#)
Talk for ENS-Lyon at "Sept Laux"
This document provides an overview of digital logic circuits. It begins with an introduction to logic gates and Boolean algebra. Common logic gates like AND, OR, NOT are described along with their truth tables. Boolean algebra is discussed as a way to analyze and synthesize digital logic circuits using Boolean variables and logic operations. Combinational logic and sequential logic are defined. Techniques for simplifying Boolean functions are covered, including Karnaugh maps and Boolean identities. Implementation of logic functions using sum-of-products form is also summarized.
A Proposal of Loose Asymmetric Cryptography Algorithm - SMCE2017 - Loc Nguyen
Although the traditional asymmetric algorithm and its implementations are successful in keeping important documents confidential, they use only one private key. This research proposes the loose asymmetric (LA) algorithm to satisfy the requirement of generating many access keys, each of which is granted to only one user. This demand is real because a group of members needs to retrieve the same documents, but each member requires confidentiality of access. Because the implementation of the LA algorithm is complicated, I also propose a scheme for deploying it. The research is a proposal because I have not yet experimented with the LA algorithm.
This document provides an overview of digital logic circuits. It begins with an introduction to logic gates and Boolean algebra. Common logic gates like AND, OR, NOT, NAND, NOR, XOR and XNOR are explained along with their truth tables. Boolean algebra identities and theorems like De Morgan's theorem are discussed. Karnaugh maps are introduced as a method for simplifying Boolean functions. The document also covers combinational logic, sequential logic, implementation of logic functions using sum-of-products form and provides examples of logic circuit design.
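As a quick illustration of the Boolean identities mentioned above, De Morgan's theorem can be verified exhaustively over its truth table; the helper below is our own sketch, not code from the document:

```python
from itertools import product

def de_morgan_holds(a: bool, b: bool) -> bool:
    # De Morgan's theorem: NOT(a AND b) == (NOT a) OR (NOT b)
    return (not (a and b)) == ((not a) or (not b))

# Exhaustively check the identity over all four truth-table rows.
assert all(de_morgan_holds(a, b) for a, b in product([False, True], repeat=2))
```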
Training deep neural networks has long been a difficult task. Recently, diverse approaches have been presented to tackle these difficulties, showing that deep models improve on the performance of shallow ones in areas like signal processing, classification, or segmentation, whatever the type of signal, e.g. video, audio, or images. One of the most important methods is greedy layer-wise unsupervised pre-training followed by a fine-tuning phase. Despite the advantages of this procedure, it does not fit scenarios where real-time learning is needed, such as the adaptation of some time-series models. This paper proposes to couple both phases into one, modifying the loss function to mix the unsupervised and supervised parts. Benchmark experiments with the MNIST database show the viability of the idea for simple image tasks, and experiments with time-series forecasting encourage the incorporation of this idea into online learning approaches. The interest of this method for time-series forecasting is motivated by the study of predictive models for domotic houses with intelligent control systems.
The document summarizes Bresenham's line drawing algorithm. It derives the equations for calculating the next pixel position when drawing a line on a digital display. It considers cases where the slope is less than or equal to 1 and greater than 1. For each case, it calculates the distance from intersection points to pixel positions and derives the decision parameter equation to determine the next pixel.
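The derivation above can be sketched in integer-only Python for the first case (slope between 0 and 1); the function name and layout are our own:

```python
def bresenham_line(x0, y0, x1, y1):
    """Integer-only Bresenham line for slopes in [0, 1].

    At each column the decision parameter p chooses between the pixel to
    the east (y unchanged) and the one to the north-east (y incremented).
    """
    dx, dy = x1 - x0, y1 - y0
    assert 0 <= dy <= dx, "this sketch handles slopes in [0, 1] only"
    points = [(x0, y0)]
    p = 2 * dy - dx                   # initial decision parameter
    x, y = x0, y0
    while x < x1:
        x += 1
        if p < 0:
            p += 2 * dy               # stay on the same row
        else:
            y += 1
            p += 2 * dy - 2 * dx      # move up one row
        points.append((x, y))
    return points

# e.g. bresenham_line(0, 0, 5, 2)
# → [(0, 0), (1, 0), (2, 1), (3, 1), (4, 2), (5, 2)]
```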
This document provides an overview of key concepts in differential equations including:
- A differential equation contains derivatives of dependent variables with respect to independent variables.
- The order of a differential equation is defined as the order of the highest derivative. The degree is the highest power of the highest order derivative.
- Differential equations can be formed by differentiating curves and eliminating arbitrary constants.
- Common methods for solving differential equations include variable separation, homogeneous equations, and linear equations.
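As a one-line illustration of the variable-separation method listed above, consider the separable equation dy/dx = xy:

```latex
\frac{dy}{dx} = xy
\;\Longrightarrow\;
\int \frac{dy}{y} = \int x\,dx
\;\Longrightarrow\;
\ln|y| = \frac{x^2}{2} + C
\;\Longrightarrow\;
y = A e^{x^2/2}, \quad A = \pm e^{C}.
```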
We experiment with Wiener's attack for breaking RSA when the secret exponent is short, i.e. its bit length is less than roughly one quarter of that of the public modulus. We discuss the cryptanalysis details and present demos of the attack. Our very minor extension of Wiener's attack is also discussed.
If we have an RSA-2048 configuration but our private exponent d is only about 512 bits, the above attack breaks RSA in a few seconds.
This work uses continued fractions to derive the private key from the given public key. It turns out that one can recover the private exponent d because the ratio k/d appears among the convergents of the continued fraction expansion of e/n, where e and n are both public values.
In the default settings of standard RSA libraries, this attack and my minor extension are not relevant (to the best of our knowledge). However, if we configure our library to choose a very large public encryption exponent e, then the private decryption exponent d could be short enough to mount the attack.
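The continued-fraction search described above can be sketched in a few lines of Python (the helper names are ours, not from the original demos); it recovers d = 5 on the standard textbook toy instance n = 90581, e = 17993:

```python
from math import isqrt

def convergents(e, n):
    """Yield the convergents k/d of the continued fraction expansion of e/n."""
    a, b = e, n
    coeffs = []
    while b:
        coeffs.append(a // b)
        a, b = b, a % b
    num0, den0, num1, den1 = 1, 0, coeffs[0], 1
    yield num1, den1
    for q in coeffs[1:]:
        num0, num1 = num1, q * num1 + num0
        den0, den1 = den1, q * den1 + den0
        yield num1, den1

def wiener_attack(e, n):
    """Recover a short private exponent d from (e, n), if one exists.

    For each convergent k/d of e/n: if k divides e*d - 1 exactly, the
    candidate phi = (e*d - 1) // k is checked by testing whether the
    implied factors of n (roots of x^2 - s*x + n with s = n - phi + 1)
    are integers.
    """
    for k, d in convergents(e, n):
        if k == 0 or (e * d - 1) % k:
            continue
        phi = (e * d - 1) // k
        s = n - phi + 1                     # would equal p + q
        disc = s * s - 4 * n
        if disc >= 0:
            root = isqrt(disc)
            if root * root == disc and (s + root) % 2 == 0:
                return d
    return None
```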
An entry-level sharing for Trend Micro scan engine team members. The sharing takes a binary script-type classification task as a demonstration. Topics start from the machine learning problem definition and cover the computational graph for deep neural networks (DNN), recurrent neural networks, LSTM, GRU, and some RNN fine-tuning tricks.
This document provides an overview of a NET training session on computer system architecture conducted by Prof. Gopika S. of Kristu Jayanti College, Bengaluru.
The syllabus and paper pattern for the UGC NET Computer Science exam is discussed. Important topics from computer system architecture that are likely to appear in exams like GATE are listed, including digital logic circuits, data representation, computer organization, memory hierarchy, and instruction set architecture.
Sample questions from various topics are provided, along with explanations of answers. Topics of the questions include digital logic, data representation, computer arithmetic, cache memory, instruction sets, and I/O organization.
This document provides an overview of deep learning concepts including neural networks, supervised learning, perceptrons, logistic regression, feature transformation, feedforward neural networks, activation functions, loss functions, and gradient descent. It explains how neural networks can learn representations through hidden layers and how different activation functions, loss functions, and tasks relate. It also shows examples of calculating the gradient of the loss with respect to weights and biases for logistic regression.
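The gradient calculation for logistic regression mentioned above collapses, via the chain rule, to a single residual term; a minimal sketch (our own helper names) for one training example:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logistic_gradients(w, b, x, y):
    """Gradient of the binary cross-entropy loss for one example.

    With p = sigmoid(w . x + b) and L = -(y log p + (1 - y) log(1 - p)),
    the chain rule gives dL/dz = p - y, hence
    dL/dw_i = (p - y) * x_i  and  dL/db = p - y.
    """
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    dz = sigmoid(z) - y
    return [dz * xi for xi in x], dz

# e.g. logistic_gradients([0.0, 0.0], 0.0, [1.0, 2.0], 1)
# → ([-0.5, -1.0], -0.5)
```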
The document discusses neural networks and their application to image recognition tasks. It describes the basic architecture of a neural network, including input, output, and hidden units. It then provides an example of how a single layer perceptron works using matrix multiplication and activation functions. Finally, it discusses using a neural network with Python to classify handwritten digits from the MNIST dataset, which contains labeled images of handwritten numbers that the network can learn to identify.
The document discusses using structured support vector machines to predict structured outputs by learning a scoring function F(x,y) = w·φ(x,y) that is maximized to make predictions. It provides an example of using this approach for category-level object localization in images by representing image-box pairs as features and learning to localize objects.
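The prediction step of this scoring-function approach is just an argmax over candidate outputs; a toy sketch where the feature map and candidates are entirely hypothetical (here φ is a single box-area feature, not the features used in the document):

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def predict(x, candidates, phi, w):
    """Structured prediction: y* = argmax over y of F(x, y) = w . phi(x, y)."""
    return max(candidates, key=lambda y: dot(w, phi(x, y)))

# Hypothetical example: boxes (x1, y1, x2, y2), scored by area alone.
boxes = [(0, 0, 1, 1), (0, 0, 2, 3)]
area_feature = lambda x, y: [(y[2] - y[0]) * (y[3] - y[1])]
# predict(None, boxes, area_feature, [1.0]) picks the larger box.
```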
The document provides bibliographic references for 14 books and papers on the topics of tensors, vector analysis, and continuum mechanics. It includes publication information such as author names, titles, publishers, and years. The references are listed alphabetically by author surname.
This document discusses neural networks in Python using Theano and Lasagne libraries. It begins with an introduction to machine learning concepts like supervised learning and neural network training as minimizing a cost function. It then demonstrates how to build and train a simple neural network classifier for MNIST digits using Theano. Finally, it shows how to build a deeper multi-layer network for MNIST using Lasagne, obtaining better results through multiple layers and dropout regularization.
Stochastic modelling and quasi-random numbers - Olivier Teytaud
Stochastic models use random numbers to simulate random variables and processes. However, random numbers can be disappointing as they do not cover the space uniformly. Quasi-random numbers provide an alternative by being more uniformly distributed. The document discusses using quasi-random numbers instead of purely random numbers in stochastic models to generate sequences that better cover the sample space.
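A classic one-dimensional low-discrepancy construction is the van der Corput sequence, which fills [0, 1) far more evenly than i.i.d. uniform draws; a self-contained sketch (the function name is ours):

```python
def van_der_corput(i, base=2):
    """i-th element of the van der Corput low-discrepancy sequence.

    Writes i in the given base and mirrors its digits around the radix
    point: 1, 2, 3, 4 (binary 1, 10, 11, 100) map to 0.5, 0.25, 0.75, 0.125.
    """
    x, denom = 0.0, 1.0
    while i:
        i, digit = divmod(i, base)
        denom *= base
        x += digit / denom
    return x

# First eight points: 0.5, 0.25, 0.75, 0.125, 0.625, 0.375, 0.875, 0.0625
points = [van_der_corput(i) for i in range(1, 9)]
```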
The document discusses tools for artificial intelligence and their applications. It describes Olivier Teytaud's work at Tao, a research group in Paris focused on reservoir computing, optimal decision making under uncertainty, optimization, and machine learning. It then provides examples of applications for these tools in electricity generation, urban rivals, pokemons, minesweeper, and solving unsolved situations in the game of Go. Olivier suggests that breakthroughs in games can help open doors to applying these algorithms to more important real-world problems by building trust in the approaches.
Dynamic Optimization without Markov Assumptions: application to power systems - Olivier Teytaud
Ilab METIS is a collaboration between TAO, a machine learning and optimization team at INRIA, and Artelys, an SME focused on optimization. They work on optimizing energy policies through modeling power systems and simulating operational and investment decisions. Their methodologies hybridize reinforcement learning, mathematical programming, and direct policy search to optimize complex, constrained problems with uncertainties while minimizing model error. They have applied these techniques to problems involving European-scale power grids with stochastic renewables.
Combining UCT and Constraint Satisfaction Problems for Minesweeper - Olivier Teytaud
@inproceedings{buffet:hal-00750577,
hal_id = {hal-00750577},
url = {http://hal.inria.fr/hal-00750577},
title = {{Optimistic Heuristics for MineSweeper}},
author = {Buffet, Olivier and Lee, Chang-Shing and Lin, Woanting and Teytaud, Olivier},
abstract = {{We present a combination of Upper Confidence Tree (UCT) and domain specific solvers, aimed at improving the behavior of UCT for long term aspects of a problem. Results improve the state of the art, combining top performance on small boards (where UCT is the state of the art) and on big boards (where variants of CSP rule).}},
language = {Anglais},
affiliation = {MAIA - INRIA Nancy - Grand Est / LORIA , Department of Computer Science and Information Engineering - CSIE , National University of Tainan - NUTN , TAO - INRIA Saclay - Ile de France , Laboratoire de Recherche en Informatique - LRI , Department of Electrical Engineering and Computer Science - Institut Montefiore},
booktitle = {{International Computer Symposium}},
address = {Hualien, Ta{\"\i}wan, Province De Chine},
audience = {internationale },
year = {2012},
pdf = {http://hal.inria.fr/hal-00750577/PDF/mines3.pdf},
}
Computers have made progress playing the game of Go but still have weaknesses. In 19x19 Go, computers have beaten professionals with handicaps of 6-9 stones. In 9x9 Go, computers have reached human professional level by beating professionals without handicaps. However, in 19x19 Go computers still require at least a 6 stone handicap against top professionals. Future improvements may allow computers to reach professional human level in 19x19 Go without handicaps.
This document discusses using Meta-Monte Carlo Tree Search (Meta-MCTS) to build an opening book for 7x7 Go. Meta-MCTS improved its play against a sparring partner that incorporated human variations. While Meta-MCTS won all games as black and white against professionals, humans found at least one variation where it did not play correctly. The document concludes that Meta-MCTS performed well but incorporating human data helped, and exactly solving 7x7 Go would require immense work collecting and solving all leaf variations.
The document discusses energy management in France and Taiwan. It notes that both countries have a long history in energy but differ in their approaches. France relies heavily on nuclear power through large state-owned companies while Taiwan has focused more recently on renewable resources like solar, wind, and ocean currents. The challenges of energy management are outlined as deciding how to operate existing power sources and plan new investments over decades to improve the system under uncertainty. Optimization methods that incorporate stochastic modeling are proposed to help with these long-term planning problems.
Introduction to the TAO Uct Sig, a team working on computational intelligence... - Olivier Teytaud
The document discusses research from Tao-Uctsig, a special interest group within Tao focused on artificial intelligence. Tao has 11 permanent staff and around 22 PhD students/postdocs working across various fields of mathematics, computer science, and sciences. The SIG works on problems where computers make decisions, particularly challenges where humans are currently better than computers. Their work involves games, important applications, and previously included mathematics. Specific applications discussed include controlling a robot arm, analyzing strategy in Pokemon and Urban Rivals, solving Minesweeper puzzles, and Go. Industrial applications include helping optimize France's major electricity industry.
Introductory talk
more technicities in
@inproceedings{schoenauer:inria-00625855,
hal_id = {inria-00625855},
url = {http://hal.inria.fr/inria-00625855},
title = {{A Rigorous Runtime Analysis for Quasi-Random Restarts and Decreasing Stepsize}},
author = {Schoenauer, Marc and Teytaud, Fabien and Teytaud, Olivier},
abstract = {{Multi-Modal Optimization (MMO) is ubiquitous in engineering, machine learning and artificial intelligence applications. Many algorithms have been proposed for multimodal optimization, and many of them are based on restart strategies. However, only few works address the issue of initialization in restarts. Furthermore, very few comparisons have been done, between different MMO algorithms, and against simple baseline methods. This paper proposes an analysis of restart strategies, and provides a restart strategy for any local search algorithm for which theoretical guarantees are derived. This restart strategy is to decrease some 'step-size', rather than to increase the population size, and it uses quasi-random initialization, that leads to a rigorous proof of improvement with respect to random restarts or restarts with constant initial step-size. Furthermore, when this strategy encapsulates a (1+1)-ES with 1/5th adaptation rule, the resulting algorithm outperforms state of the art MMO algorithms while being computationally faster.}},
language = {Anglais},
affiliation = {TAO - INRIA Saclay - Ile de France , Microsoft Research - Inria Joint Centre - MSR - INRIA , Laboratoire de Recherche en Informatique - LRI},
booktitle = {{Artificial Evolution}},
address = {Angers, France},
audience = {internationale },
year = {2011},
month = Oct,
pdf = {http://hal.inria.fr/inria-00625855/PDF/qrrsEA.pdf},
}
This document discusses the Go variant "killall go", where black wins by capturing all white stones, and proposes different board sizes and handicap placements to test balance. It analyzes MoGo evaluations of some placements, finding discrepancies between single-game evaluations and full-game results. It concludes that more research could be done on MCTS opening evaluations, on Batoo variants where players take turns placing stones, and on using bandits to learn balanced handicap placements.
The document discusses the research focus of the TAO group at Inria Saclay, which includes machine learning and optimization applications for energy management. The group has one permanent member and others part-time. It collaborates closely with partners in Taiwan and the company Artelys. The research aims to address challenges in power grid simulation like variable demand and renewable energy sources using techniques like mathematical programming, reinforcement learning, and direct policy search combined with heuristics and Monte Carlo tree search.
This document provides an overview of statistical concepts related to item response theory (IRT), including posterior probability, Bayes' theorem, maximum a posteriori (MAP) estimation, and the Jacobi algorithm. It discusses how to initialize IRT parameters like student abilities and item parameters, and evaluates options for fitting IRT models like R and Octave packages.
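MAP estimation for an IRT ability parameter can be illustrated with a simple grid search under the 1PL (Rasch) model and a standard-normal prior; this is our own minimal sketch, not the Jacobi-based procedure from the document:

```python
import math

def rasch_prob(theta, b):
    """P(correct answer) under the 1PL (Rasch) model with item difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def map_ability(responses, difficulties):
    """MAP estimate of ability theta on a grid over [-4, 4].

    Maximizes log posterior = log prior + log likelihood, with a
    standard-normal prior N(0, 1) on theta (up to an additive constant).
    """
    grid = [i / 100.0 for i in range(-400, 401)]
    def log_post(theta):
        lp = -0.5 * theta * theta        # log N(0, 1) prior, up to a constant
        for u, b in zip(responses, difficulties):
            p = rasch_prob(theta, b)
            lp += math.log(p) if u else math.log(1.0 - p)
        return lp
    return max(grid, key=log_post)
```

With all items answered correctly the estimate is pulled above zero; all wrong, below zero; the prior keeps it from diverging to infinity.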
1. The document discusses various methods for continuous optimization, including rates of convergence for noise-free and noisy settings.
2. In noise-free settings, methods like Newton's method and BFGS have quadratic or superlinear convergence rates, while evolutionary strategies (ES) have linear convergence rates.
3. Lower bounds on optimization complexity are also discussed, showing minimum comparisons or evaluations needed depending on problem properties like domain size and precision required.
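The linear convergence of evolution strategies mentioned above can be observed with a tiny (1+1)-ES using the 1/5th success rule on the sphere function; the step-size constants below are a common textbook choice, not taken from the document:

```python
import random

def one_plus_one_es(f, x, sigma=1.0, iters=1000, seed=0):
    """(1+1)-ES with the 1/5th success rule.

    One Gaussian mutation per iteration; the step-size sigma grows on
    success and shrinks on failure so that roughly 1/5 of mutations
    succeed, yielding linear convergence on the sphere function.
    """
    rng = random.Random(seed)
    fx = f(x)
    for _ in range(iters):
        y = [xi + sigma * rng.gauss(0, 1) for xi in x]
        fy = f(y)
        if fy < fx:
            x, fx = y, fy
            sigma *= 1.5                 # success: enlarge the step
        else:
            sigma *= 1.5 ** (-0.25)      # failure: shrink (1/5th balance)
    return x, fx

sphere = lambda x: sum(xi * xi for xi in x)
```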
Weather, opponents, geopolitics: so many uncertainties in such a setting. How can we manage power systems in spite of these uncertainties, and how can we decide on investments?
Talk at Saint-Etienne in 2015; thanks to R. Leriche and to the "games and optimizations" days in Saint-Etienne.
3. Lower bounds on optimization complexity are also discussed, showing minimum comparisons or evaluations needed depending on problem properties like domain size and precision required.
Weather, opponents, geopolitics: so many uncertainties in such a case ? How to manage power systems in spite of these uncertainties, and how to decide investments.
Talk at Saint-Etienne in 2015; thanks to R. Leriche and to the "games and optimizations" days in Saint-Etienne.
Complexity of planning and games with partial informationOlivier Teytaud
Survey of computational complexity or computability of sequential decision making (games, planning)
contains two more detailed proofs:
- EXPSPACE completeness of unobservable adversarial planning for existence of 100% winning strategy (Hasslum et al)
- undecidability of unobservable adversarial planning for arbitrary winning rate (including optimal play in the Nash sense)
Deep learning uses neural networks with many hidden layers to learn representations of data with multiple levels of abstraction. It has been shown to outperform simpler models with fewer layers on complex tasks like image and speech recognition. Deep learning works by defining a set of candidate functions (neural networks) and using gradient descent to optimize the network parameters to minimize loss on training data. Deeper networks with more parameters generally perform better but require large datasets and computational resources to train effectively.
https://telecombcn-dl.github.io/2017-dlsl/
Winter School on Deep Learning for Speech and Language. UPC BarcelonaTech ETSETB TelecomBCN.
The aim of this course is to train students in methods of deep learning for speech and language. Recurrent Neural Networks (RNN) will be presented and analyzed in detail to understand the potential of these state of the art tools for time series processing. Engineering tips and scalability issues will be addressed to solve tasks such as machine translation, speech recognition, speech synthesis or question answering. Hands-on sessions will provide development skills so that attendees can become competent in contemporary data anlytics tools.
Hardware Acceleration for Machine LearningCastLabKAIST
This document provides an overview of a lecture on hardware acceleration for machine learning. The lecture will cover deep neural network models like convolutional neural networks and recurrent neural networks. It will also discuss various hardware accelerators developed for machine learning, including those designed for mobile/edge and cloud computing environments. The instructor's background and the agenda topics are also outlined.
Opening of our Deep Learning Lunch & Learn series. First session: introduction to Neural Networks, Gradient descent and backpropagation, by Pablo J. Villacorta, with a prologue by Fernando Velasco
the slides are aimed to give a brief introductory base to Neural Networks and its architectures. it covers logistic regression, shallow neural networks and deep neural networks. the slides were presented in Deep Learning IndabaX Sudan.
The document provides an introduction to machine learning. It discusses the author's path to becoming a data scientist and some key machine learning concepts, including:
- Required skills at different experience levels for machine learning roles
- Popular machine learning approaches like deep learning and reinforcement learning
- Common machine learning problems like one shot learning and imbalanced datasets
- How machine learning works by using tricks on data through parametric models and free parameters
- Key questions in machine learning like what to teach, how to teach, and to what entity
- Popular machine learning frameworks like TensorFlow that automate tasks
https://telecombcn-dl.github.io/2018-dlai/
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks or Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles of deep learning from both an algorithmic and computational perspectives.
The document summarizes a presentation titled "Yoyak" given by Heejong Lee at ScalaDays 2015. The presentation introduces Yoyak, a static analysis framework developed by the speaker. It covers the following topics:
- Static analysis and abstract interpretation theory
- Implementation highlights of the Yoyak framework
- Experiences using Scala in developing Yoyak
- The roadmap for future development of Yoyak
The document proposes a "blind coupon mechanism" (BCM) to spread signals or rumors quickly in a network while preventing an adversary from identifying the source or presence of the signal. The BCM uses an abstract group structure and instantiates it using elliptic curves over Z_n or bilinear groups. It allows processes to spread coupons by continually broadcasting and combining received coupons with their own, in a way that an adversary cannot distinguish dummy from signal coupons or forge new signal coupons.
Martin Takac - “Solving Large-Scale Machine Learning Problems in a Distribute...diannepatricia
Martin Takac, Assistant Professor, Lehigh University, gave a great presentation today on “Solving Large-Scale Machine Learning Problems in a Distributed Way” as part of our Cognitive Systems Institute Speaker Series.
The document discusses various policies for order picking, including pick sequencing, batching, and zoning. It describes the pick sequencing problem as analogous to the Traveling Salesman Problem (TSP) of finding the shortest route that visits all locations. It provides an analytical formulation of the TSP as a minimization problem with constraints. It also describes several heuristics for approximating solutions to the TSP, including the closest insertion algorithm. A special case admitting a polynomial-time solution involves a grid-like warehouse layout. Overall, the document outlines different approaches to modeling and solving the pick sequencing problem programmatically to minimize travel time.
This document provides an overview of support vector machines (SVMs) for machine learning. It explains that SVMs find the optimal separating hyperplane that maximizes the margin between examples of separate classes. This is achieved by formulating SVM training as a convex optimization problem that can be solved efficiently. The document discusses how SVMs can handle non-linear decision boundaries using the "kernel trick" to implicitly map examples to higher-dimensional feature spaces without explicitly performing the mapping.
This document discusses properties of pseudo-random numbers and methods for generating random numbers computationally. It covers:
- Properties of pseudo-random numbers including being continuous between 0 and 1 and uniformly distributed.
- Common methods for generating pseudo-random numbers including table lookup, linear congruential generators (LCG), and feedback shift registers.
- Desirable properties for random number generators including being fast, requiring little memory, having a long cycle or period, and producing numbers that are close to uniform and independent.
This document provides an overview of artificial neural networks including their history, applications, properties, and basic concepts like perceptrons, gradient descent, backpropagation, and multi-layer networks. It then gives an example of using a neural network for face recognition, describing the input/output encoding, network structure, training parameters, and achieving 90% accuracy on the test set. The document encourages the reader to try implementing and running the face recognition network code provided online.
Deep learning @ University of Oradea - part I (16 Jan. 2018)Vlad Ovidiu Mihalca
Deep Learning series of presentations at University of Oradea, Faculty of Managerial and Technological Engineering, Mechatronics department.
English and Romanian language series held in parallel for Erasmus foreign students and Engineering Doctoral School students, teachers as well as anyone interested within the university.
This presentation was the first in the English language series, covering a tiny part of the theoretical aspects of Deep Learning. It will be followed by presentations and discussion regarding frameworks for use in products featuring Deep Learning, as well as current state of the art in Deep Learning research and applications in Robotics and Computer/Machine Vision.
Machine learning 2016: deep networks and Monte Carlo Tree Search
1. Machine learning: deep networks
and MCTS
olivier.teytaud@inria.fr
1. What is machine learning (ML)
2. Critically needed: optimization
3. Two recent algorithms: DN and MCTS
4. The mathematics of ML
5. Conclusion
2. What is machine learning ?
It's when machines learn :-)
● Learn to recognize, classify, make decisions,
play, speak, translate …
● Can be inductive (from data, using statistics)
and/or deductive
3. Examples
● Learn to play chess
● Learn to translate French → English
● Learn to recognize bears / planes / …
● Learn to drive a car (from examples ?)
● Learn to recognize handwritten digits
● Learn which ads you like
● Learn to recognize music
4. Different flavors of learning
● From data: given 100000 pictures of bears and 100000 pictures
of beers, learn to discriminate a picture of a bear from a picture
of a beer.
● From data, 2: given 10000 pictures (no categories!
“unsupervised”)
– Find categories and classify
– Or find a “good” representation as a vector
● From simulators: given a simulator (~ the rules) of Chess, play
(well) chess.
● From experience: control a robot, and avoid bumps.
Deductive: not much... (was important at the time of your
grandfathers/grandmothers)
5. Machine learning everywhere ! ! !
Finding ads most likely to get your money.
Local weather forecasts.
Translation.
Handwritten text recognition.
Predicting traffic.
Detecting spam.
...
6. 2. Optimization: a key component of
ML
● Given: a function k: w → k(w)
● Output: w* such that k(w*) minimum
Usually, only an approximation of w*.
Many algorithms exist; one of the best for ML is
stochastic gradient descent.
7. 2.a. Gradient descent
● w = random
● for m=1,2,3,....
– alpha = 0.01 / square-root(m)
– compute the gradient g of k at w
– w = w – alpha g
Key problem: computing g quickly.
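As a concrete illustration, here is the slide's pseudocode as runnable Python on a toy one-dimensional objective (the objective and its hand-written gradient are my own example, not from the talk):

```python
import math

def gradient_descent(grad_k, w0, iterations):
    """Plain gradient descent, as on the slide: alpha = 0.01 / sqrt(m)."""
    w = w0
    for m in range(1, iterations + 1):
        alpha = 0.01 / math.sqrt(m)   # decreasing step size
        g = grad_k(w)                 # gradient of k at w
        w = w - alpha * g
    return w

# Toy objective k(w) = (w - 3)^2, minimized at w* = 3; its gradient is 2(w - 3).
w_star = gradient_descent(lambda w: 2 * (w - 3), w0=0.0, iterations=50000)
```

With the slowly decreasing step size, many iterations are needed even on this trivial objective, which is exactly why computing g quickly matters.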
8. 2.b. Stochastic gradient descent
● k(w) = k1(w) + k2(w) + … + kn(w)
● Then at iteration i, use the gradient of kj where j=i mod n
==> THE key algorithm for machine learning
● w = random
● for m=1,2,3,....
– alpha = 0.01 / square-root(m)
– compute the gradient g of k(m mod n) at w
– w = w – alpha g
Gradient can often be computed by “reverse-mode differentiation”, termed
“backpropagation” in neural networks (not that hard)
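A minimal sketch of the same loop in stochastic form, where k is a sum of n terms and each iteration uses a single term's gradient (toy example of mine: k(w) = Σj (w − cj)², minimized at the mean of the cj):

```python
import math

def sgd(grads, w0, sweeps):
    """Cycle through the partial gradients grad k_j (j = m mod n)."""
    w, m = w0, 0
    for _ in range(sweeps):
        for g in grads:        # one pass = one gradient per term k_j
            m += 1
            alpha = 0.01 / math.sqrt(m)
            w = w - alpha * g(w)
    return w

cs = [1.0, 2.0, 6.0]                               # k_j(w) = (w - c_j)^2
grads = [lambda w, c=c: 2 * (w - c) for c in cs]   # their gradients
w_star = sgd(grads, w0=0.0, sweeps=20000)          # -> close to mean(cs) = 3
```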
9. 3. Two ML algorithms
● Part 1: Deep learning (learning to predict)
– Neural networks
– Empirical risk minimization & variants
– Deep networks
● Part 2: MCTS (learning to play)
14. Neural networks & empirical risk
minimization
Define the model:
f(x,w) = σ(w1·x + w1b), w = (w1, w1b)
f(x,w) = σ(w2·σ(w1·x + w1b) + w2b), w = (w1, w2, w1b, w2b)
f(x,w) = … more layers …
how to find a good w ?
15. What is a good w ?
Try to find w such that ||f(xi,w) – yi||² is small
==> finding a predictor of y, given x
[diagram: x → network → f(x,w)]
16. Neural networks & empirical risk
minimization
● Inputs = x1,...,xN (vectors in R^d) and y1,...,yN (vectors in R^k)
● Assumption: the (xi,yi) are randomly drawn, i.i.d, for some probability
distribution
● Define a loss:
L(w) = E (f(x,w) – y)²
and
its empirical approximation L'(w) = average over i of (f(xi,w) – yi)²
● Optimize:
– Computing w= argmin L(w) impossible (L unknown)
– So w = argmin L'(w) ==> by stochastic gradient descent: gradient ?
Empirical risk
17. Neural networks with SGD
(stochastic gradient descent)
Minimize the sum of the ||f(xi,w) – yi||²
by
● w ← w – alpha grad ||f(x1,w) – y1||²
● w ← w – alpha grad ||f(x2,w) – y2||²
● …
● w ← w – alpha grad ||f(xn,w) – yn||²
● + restart
[diagram: x → network → f(x,w) ≈ y]
The network sees “xi” and “yi” one at a time.
18. Backpropagation ==> gradient
(thanks http://slideplayer.com/slide/5214241)
● Sigmoid function: σ(z) = 1 / (1 + exp(-z)), with derivative σ(z)(1 - σ(z))
● Partial derivatives written in terms of outputs (o)
and activations (z), using per-node deltas (δ):
output node: δ = o(1-o)(o-y)    internal node: δj = oj(1-oj) Σk wjk δk
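A hand-coded sketch of these δ rules for the smallest possible network, one sigmoid hidden unit feeding one sigmoid output (architecture, data and step size are my own illustration, not from the talk):

```python
import math, random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# f(x, w) = sigmoid(w2 * sigmoid(w1*x + b1) + b2), trained by per-example SGD.
# Backprop uses sigmoid'(z) = o * (1 - o), where o is the unit's output.
def train(samples, steps=20000, alpha=0.5, seed=0):
    rng = random.Random(seed)
    w1, b1, w2, b2 = (rng.uniform(-1, 1) for _ in range(4))
    for _ in range(steps):
        x, y = rng.choice(samples)
        h = sigmoid(w1 * x + b1)               # hidden activation
        o = sigmoid(w2 * h + b2)               # output
        delta_o = (o - y) * o * (1 - o)        # output-node delta
        delta_h = delta_o * w2 * h * (1 - h)   # internal-node delta
        w2 -= alpha * delta_o * h
        b2 -= alpha * delta_o
        w1 -= alpha * delta_h * x
        b1 -= alpha * delta_h
    return lambda x: sigmoid(w2 * sigmoid(w1 * x + b1) + b2)

# Learn "x > 0 -> 1, x < 0 -> 0" from four examples.
f = train([(-2.0, 0.0), (-1.0, 0.0), (1.0, 1.0), (2.0, 1.0)])
```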
19. Neural networks as encoders
Try to find w such that ||f(xi,w) – xi||² is small + remove the end
==> finding an encoder of x!
i.e. we get a function f such that x should be g(f(x)) (for some g).
… looks crazy ? Just f(x) = x is a solution!
[diagram: x → encoder → f(x,w); the “Delete this ! ! !” arrow points at the decoder half]
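A tiny runnable instance of this idea, with the trivial f(x) = x ruled out by a bottleneck: a linear autoencoder squeezing 2-D points through a single number (data, dimensions and step size are my own toy choice):

```python
import random

# encode(x) = <a, x> (one number), decode(c) = c * b (back to 2-D).
# The 1-D bottleneck is what prevents the trivial identity solution.
def train_autoencoder(points, steps=20000, alpha=0.01, seed=1):
    rng = random.Random(seed)
    a = [rng.uniform(-1, 1), rng.uniform(-1, 1)]   # encoder weights
    b = [rng.uniform(-1, 1), rng.uniform(-1, 1)]   # decoder weights
    for _ in range(steps):
        x = rng.choice(points)
        c = a[0] * x[0] + a[1] * x[1]              # code
        r = [c * b[0], c * b[1]]                   # reconstruction
        e = [r[0] - x[0], r[1] - x[1]]             # error
        eb = e[0] * b[0] + e[1] * b[1]
        for i in range(2):                         # gradient of ||r - x||^2
            b[i] -= alpha * 2 * e[i] * c
            a[i] -= alpha * 2 * eb * x[i]
    return a, b

# Points on the line y = 2x are perfectly compressible to one number.
pts = [(t, 2 * t) for t in (-1.0, -0.5, 0.5, 1.0)]
a, b = train_autoencoder(pts)
c = a[0] * 0.5 + a[1] * 1.0
reconstruction = (c * b[0], c * b[1])
```

The decoder half (b) is exactly the part the slide says to delete once training is done; the encoder a is what we keep.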
20. Ok, neural networks
We have seen two possibilities:
● Neural networks as predictors (supervised)
● Neural networks as encoders (unsupervised)
Both use stochastic gradient descent and ERM.
Now, let us come back to predictors, but with a
better algorithm, for “deep” learning – using
encoders.
(learning from examples, one example at a time)
21. Empirical risk minimization and
numerical optimization
● We would like to optimize the “real” error (expectation; termed
generalization error, GE) but we have only access to the empirical error
(ER).
● For the same ER, we can have different GE.
● Two questions:
– How to reduce the difference between ER and GE ?
Regularization: minimize L' + ||w||² (small parameters)
Sparsity: minimize L' + ||w||₀ (few parameters)
==> VC theory (no details here)
– Which of the ER optima are best for GE ? ? ? ?
(now known to be an excellent question!)
==> deep network learning by unsupervised tools!
22. Deep neural networks
● What if many layers ?
● Many local minima (proof: symmetries!)
==> does not work
● Two steps:
– unsupervised learning, layer by layer; the network is
growing;
– then, apply ERM for fine tuning.
● Unsupervised pretraining ==> with the same
empirical error, generalization error is better!
31. Deep networks in one slide
● For i = 1, 2, 3, …, k:
– Learn one layer by autoencoding (unsupervised)
– Remove the second part
● Learn one more layer in a supervised manner
● Learn the whole network (supervised as well;
fine tuning)
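The greedy loop above can be sketched as follows, with each “layer” a linear autoencoder trained by SGD — a stand-in of mine for the sigmoid layers of a real deep network, with the final supervised fine-tuning step omitted:

```python
import random

def train_layer(data, out_dim, steps=20000, alpha=0.01, seed=2):
    """One autoencoding layer: learn enc/dec by SGD, keep only enc."""
    rng = random.Random(seed)
    d = len(data[0])
    enc = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(out_dim)]
    dec = [[rng.uniform(-1, 1) for _ in range(out_dim)] for _ in range(d)]
    for _ in range(steps):
        x = rng.choice(data)
        code = [sum(enc[i][j] * x[j] for j in range(d)) for i in range(out_dim)]
        rec = [sum(dec[j][i] * code[i] for i in range(out_dim)) for j in range(d)]
        err = [rec[j] - x[j] for j in range(d)]
        s = [sum(err[k] * dec[k][i] for k in range(d)) for i in range(out_dim)]
        for j in range(d):                      # gradient of ||rec - x||^2
            for i in range(out_dim):
                dec[j][i] -= alpha * 2 * err[j] * code[i]
        for i in range(out_dim):
            for j in range(d):
                enc[i][j] -= alpha * 2 * s[i] * x[j]
    return enc                                  # "remove the second part"

def encode(enc, x):
    return [sum(w * xj for w, xj in zip(row, x)) for row in enc]

# Greedy stacking: compress 3-D data lying on a line to 2-D, then to 1-D.
data = [[t, 2 * t, -t] for t in (-1.0, -0.5, 0.5, 1.0)]
layer1 = train_layer(data, out_dim=2)
codes = [encode(layer1, x) for x in data]       # outputs feed the next layer
layer2 = train_layer(codes, out_dim=1, seed=5)
deep_codes = [encode(layer2, c) for c in codes]
```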
32. Deep networks
● A revolution in vision
● Important point (not developed here): sharing some parameters,
because first layers = low level feature extractors, and LLF are
the same everywhere ==> convolutional nets
● Link with natural learning: learn simple concepts first;
unsupervised learning.
● Not only “σ”, this was just an example; e.g.
output = w0 · exp(−w2 · ||input − w1||²)
● Great success in speech & vision
● Surprising performance in Go (discuss later :-) )
33. Part 2: MCTS
● MCTS originates in 2006
● UCT = one particular flavor, from 2007, most
well known probably
● A revolution in Computer Go
34. Part I : The Success Story
(less showing off in part II :-) )
The game of Go is a beautiful
Challenge.
35. Part I : The Success Story
(less showing off in part II :-) )
The game of Go is a beautiful
challenge.
We did the first wins against
professional players
in the game of Go
But with handicap!
43. Game of Go: counting territories
(white has 7.5 “bonus” points, as black starts)
44. Game of Go: the rules
Black plays at the blue circle:
the white group dies (it is
removed)
It's impossible to kill white (two “eyes”).
“Superko” rule: we don't come back to the same
situation.
(without superko: “PSPACE hard”
with superko: “EXPTIME-hard”)
At the end, we count territories
==> black starts, so +7.5 for white.
58. UCT in one slide
Great progress in the game of Go and in various other games
59. Why ?
Why “+ square-root( log(...)/ … )” ?
because there are nice maths on this in
completely different settings.
Seriously, no good reason, use whatever
you want :-)
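For what it's worth, the formula in question is the UCB1 term, and at the root UCT treats move selection as a bandit. A minimal sketch (fixed win probabilities stand in for random playouts; the constant C and the numbers are my own choice):

```python
import math, random

# UCB1, as in UCT: pick the move maximizing
#   mean reward + C * sqrt(log(total visits) / visits(move)),
# then run a "Monte Carlo" evaluation and update the statistics.
def uct_bandit(arms, simulations=5000, C=1.4, seed=3):
    rng = random.Random(seed)
    n = len(arms)
    visits = [0] * n
    wins = [0.0] * n
    for t in range(1, simulations + 1):
        if t <= n:
            a = t - 1          # try every move once first
        else:
            a = max(range(n),
                    key=lambda i: wins[i] / visits[i]
                    + C * math.sqrt(math.log(t) / visits[i]))
        reward = 1.0 if rng.random() < arms[a] else 0.0  # random playout
        visits[a] += 1
        wins[a] += reward
    return max(range(n), key=lambda i: visits[i])  # most-simulated move

best = uct_bandit([0.2, 0.5, 0.8])  # true win probability of each move
```

The sqrt(log(t)/visits) term is what forces occasional exploration of apparently worse moves; the most-simulated move is returned.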
60. Current status ?
MCTS has invaded game applications:
• For games which have a good simulator
(required!)
• For games for which there is no good
evaluation function, i.e. no simple map
“board → probability that black wins”
Also some hard discrete control tasks.
61. Current status ?
Go ? Humans still much stronger than
computers.
Deep networks: surprisingly good
performance as an evaluation function.
Still performs far worse than best MCTS.
Merging MCTS and deep networks ?
62. Current MCTS research ?
Recent years:
• parallelization
• extrapolation (between branches of the
search)
But most progress = human expertise
and tricks in the random part.
63. 4. The maths of ML
One can find theorems justifying regularization
(+||w||² or +||w||₀), or theorems justifying that deep
networks need fewer parameters than shallow
networks for approximating some functions.
Still, MCTS and neural networks were born quite
independently of maths.
Still, you need stochastic gradient descent.
Maybe the next real progress in ML will be born in
maths ?
65. Random projection ?
● Randomly project your data (linearly or not)
● Learn on these random projections
● Super fast, not that bad
66. Machine learning + encryption
● Statistics on data... without decrypting them
● Critical for applications
– Where we must “know” what you do (predicting
power consumption)
– But we should not know too much (privacy)
67. Simulation-based + data-based
optimization
● Optimization of models = forgets too many features
from the real world
● Optimization of simulators = better
==> technically, optimization of expensive functions
(the optimization algorithm can spend computational
power) + surrogate model (i.e. ML)
68. Distributed collaborative
decision making ?
● Power network:
– frequency = 50Hz (deviations ≈ )
– (frequency)' = k x (production – demand) → ≈ 0!
● Too much wind power ==> unstable network
because hard to satisfy “production = demand”
● Solutions ?
– Detect frequency
– Increase/decrease production but also demand
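A toy Euler simulation of the slide's relation (frequency)' = k × (production − demand), with a proportional demand response; all constants are invented for illustration. It also shows the classic droop effect: a proportional response stabilizes the frequency, but at an offset from 50 Hz whenever there is a steady imbalance:

```python
# f' = k * (production - demand); flexible demand shifts up when the
# frequency is above 50 Hz and down when below, driving f back.
def simulate(production, base_demand, flexible=0.5, k=0.1,
             dt=0.1, steps=2000):
    f = 50.0
    for _ in range(steps):
        demand = base_demand + flexible * (f - 50.0)  # demand response
        f += dt * k * (production - demand)           # Euler step
    return f

# Steady 1-unit surplus: frequency settles at 50 + 1/flexible = 52 Hz.
f_end = simulate(production=10.0, base_demand=9.0)
```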
69. [figure labels: limited capacity; ramping constraint (power output must be smooth)]
Typical example of natural monopoly.
Deregulation + more distributed production
+ more renewable energy
==> who regulates the network ?
More regulation after all ?
Distributed collaborative decision making.
IMHO, distributed collaborative
decision making is a great
research area (useful + not well
understood)
70. Power systems must change!
● Tired of buying oil which leads to ?
● Don't want ?(coal)
● Afraid of ?
But unstable ?
COME AND HELP ! ! ! STABILIZATION NEEDED :-)
71. Conclusions 1: recent
success stories
● MCTS success story
– 2006: immediately reasonably good
– 2007: thanks to fun tricks in the MC part, strong against pros in
9x9
– 2008: with parallelization, good in 19x19
● Deep networks
– Convolutional DN excellent in 1998 (!) in vision, slightly
overlooked for years
– Now widely recognized in many areas
● Both make sense only with strong computers
72. Conclusions 2: mathematics &
publication & research
● During so many years:
– SVM was the big boss of supervised ML (because there were
theorems, whereas there are few theorems in deep learning)
– Alpha-beta was the big boss of games
● MCTS was immediately recognized as a key contribution
to ML; why wasn't it the case for deep learning ? Maybe
because SVMs were easier to explain, prove, advertise.
(but highest impact factor = +squareRoot(... / … ) ! )
● Both deep learning and MCTS look like fun exercises
rather than science; still, they are key tools for ML.
==> keep time for “fun” research, don't worry too much for
publications
73. Conclusions 3: applications are fun!
(important ones :-) )
● Both deep learning and MCTS were born from
applications
● Machine learning came from experiments more than
from pure theory
● Automatic driving, micro-emotions (big
brother ?), bioinformatics, …. and POWER
SYSTEMS (with open source / open data!).
74. References
● Backpropagation: Rumelhart et al., 1986
● MCTS: Coulom, 2006; Kocsis et al., 2007; Gelly et al., 2007
● Convolutional networks: Fukushima, 1980
● Deep convolutional networks: LeCun, 1998
● Regularization: Vapnik et al., 1971