1) The document traces the evolution from Deep Blue through AlphaGo and AlphaGo Zero to AlphaZero.
2) It explains the key concepts behind AlphaGo, including the policy and value neural networks, and how it was initially trained via supervised learning and then improved with reinforcement learning by playing against itself.
3) It summarizes the differences between AlphaGo Zero, which was trained solely via reinforcement learning without human data, and AlphaZero, which aims to master games without game-specific tuning or data augmentation.
1) AlphaZero was an AI developed by DeepMind that achieved master-level play in chess, shogi, and Go without relying on human data or prior knowledge.
2) It achieved this with a new form of deep reinforcement learning that learns solely from games of self-play, starting from random play.
3) AlphaZero demonstrated superhuman performance in chess, shogi, and Go by defeating the previous champion programs in each game, despite being given no domain knowledge except the game rules.
The document provides an introduction and overview of AlphaGo Zero, including:
- AlphaGo Zero achieved superhuman performance at Go without human data by using self-play reinforcement learning.
- It uses a combined policy-and-value network together with Monte Carlo tree search to select moves. The network is trained on self-play games, using the search probabilities and game outcomes as training targets.
- Experiments showed AlphaGo Zero outperformed previous AlphaGo versions and human-trained networks, and continued improving with deeper networks and more self-play training.
AlphaZero is an AI system created by DeepMind that achieved superhuman ability in the games of chess, shogi, and Go without relying on human data. It uses a new form of deep reinforcement learning combined with Monte Carlo tree search to learn from games generated by self-play. AlphaZero was able to master each game to superhuman level in a matter of hours, defeating the previous world-champion programs in each case. It represents a major advance in self-taught machine learning without human supervision.
AlphaGo Zero is an AI agent created by DeepMind to master the game of Go without human data or expertise. It uses reinforcement learning through self-play with the following key aspects:
1. It uses a single deep neural network that predicts both the next move and the winner of the game from the current board position. This dual network is trained solely through self-play reinforcement learning.
2. The neural network improves the Monte Carlo tree search used to select moves. The search uses the network predictions to guide selection and backup of information during search.
3. Training involves repeated self-play games to generate data, then using this data to update the neural network parameters through gradient descent. The updated network then plays further self-play games, and the cycle repeats.
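The self-play training cycle described in these three points can be sketched in miniature. Everything below is a hypothetical stand-in: `ToyNet` is a lookup table rather than a deep network, the "game" is three fixed moves rather than Go, and `update` nudges stored values rather than running true gradient descent (the real system also refines the move probabilities with MCTS before they become targets).

```python
# Minimal sketch of the AlphaGo Zero-style training loop.
# ToyNet, the toy game, and the update rule are illustrative stand-ins.
class ToyNet:
    def __init__(self):
        self.params = {}

    def predict(self, state):
        # Returns (move_probabilities, value_estimate) for a state.
        return self.params.get(state, ([0.5, 0.5], 0.0))

    def update(self, examples, lr=0.1):
        # Stand-in for a gradient step: move predictions toward targets.
        for state, pi, z in examples:
            p, v = self.predict(state)
            new_p = [a + lr * (b - a) for a, b in zip(p, pi)]
            self.params[state] = (new_p, v + lr * (z - v))

def self_play_game(net):
    """Play one toy 3-move game; record (state, search_probs, outcome)."""
    history, state = [], "start"
    for ply in range(3):
        pi, _ = net.predict(state)      # real MCTS would sharpen these
        move = 0 if pi[0] >= pi[1] else 1
        history.append((state, pi))
        state = state + str(move)
    z = 1.0                              # toy outcome: first player wins
    return [(s, pi, z) for s, pi in history]

def train(net, iterations=5):
    for _ in range(iterations):
        examples = self_play_game(net)   # 1. generate data by self-play
        net.update(examples)             # 2. gradient step on that data
    return net                           # 3. repeat with updated network
```

The value estimate for the opening state drifts toward the observed game outcome over the iterations, which is the essence of the loop.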
J-Fall 2017 - AI Self-learning Game Playing (Richard Abbuhl)
The document discusses AI self-learning game playing, providing an overview of machine learning and reinforcement learning techniques used in game playing such as backpropagation, Q-learning, TD-Gammon, and AlphaGo. It reviews the history of machine learning in game playing from the 1950s to modern implementations, and discusses concepts like weak and strong AI as well as skills needed for the future of employment with advances in AI.
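To make the Q-learning reference above concrete, here is the tabular update rule on a deliberately tiny, made-up environment: a two-state chain where moving "right" from state 0 reaches a terminal reward. The environment, rewards, and hyperparameters are illustrative, not from the talk.

```python
import random

def q_learning(episodes=200, alpha=0.5, gamma=0.9, eps=0.1):
    # Q-table for the single non-terminal state 0 and its two actions.
    Q = {(0, "left"): 0.0, (0, "right"): 0.0}
    rng = random.Random(0)
    for _ in range(episodes):
        # Epsilon-greedy action selection.
        a = "right" if Q[(0, "right")] >= Q[(0, "left")] else "left"
        if rng.random() < eps:
            a = rng.choice(["left", "right"])
        r = 1.0 if a == "right" else 0.0   # reward for reaching the goal
        # TD update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a));
        # the next state is terminal here, so the bootstrap term is 0.
        Q[(0, a)] += alpha * (r - Q[(0, a)])
    return Q
```

After a few hundred episodes the table reflects that "right" is the rewarding action.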
TensorFlow London 11: Pierre Harvey Richemond 'Trends and Developments in Rei...' (Seldon)
Speaker: Pierre Harvey Richemond, PhD student at the Data Science Institute, Imperial College
After a successful career in quantitative finance, Pierre is researching deep learning and reinforcement learning at the Data Science Institute. He holds several degrees in mathematics and engineering.
Abstract:
In this high-level talk, he will go through the most recent significant developments in the theory of reinforcement learning. Topics range from soft Q-learning to proximal policy optimization and the Monte Carlo tree search used in AlphaGo Zero. He will discuss strategies to implement these methods in TensorFlow, combine and replicate them in practice, and highlight connections with related fields such as convex optimization and optimal transport.
Thanks to all TensorFlow London meetup organisers and supporters:
Seldon.io
Altoros
Rewired
Google Developers
Rise London
Adversarial search is an algorithm used in game playing to plan ahead when other agents are planning against you. The minimax algorithm determines the optimal strategy by assuming the opponent will make the best counter-move. It searches the game tree to find the move with the highest minimum payoff. α-β pruning improves on minimax by pruning branches that cannot affect the choice of move. State-of-the-art game programs use techniques like precomputed databases, deep search trees, and pattern knowledge bases to defeat human champions at games like checkers, chess, and Othello.
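The minimax-with-pruning idea described above reads in code as follows; the four-leaf game tree and its scores are invented purely for illustration.

```python
import math

# An explicit toy game tree (real programs generate moves on the fly).
TREE = {
    "root": ["A", "B"],
    "A": ["A1", "A2"],
    "B": ["B1", "B2"],
}
LEAF_SCORES = {"A1": 3, "A2": 5, "B1": 2, "B2": 9}

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf):
    """Minimax with alpha-beta pruning over the toy tree."""
    if node in LEAF_SCORES:
        return LEAF_SCORES[node]
    if maximizing:
        best = -math.inf
        for child in TREE[node]:
            best = max(best, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:   # opponent will never allow this branch
                break           # prune the remaining children
        return best
    best = math.inf
    for child in TREE[node]:
        best = min(best, alphabeta(child, True, alpha, beta))
        beta = min(beta, best)
        if alpha >= beta:
            break
    return best
```

On this tree, once branch B's first leaf scores 2 (worse than the 3 already guaranteed under A), leaf B2 is pruned without ever being evaluated.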
The document discusses how AlphaGo, a computer program developed by DeepMind, was able to defeat world champion Lee Sedol at the game of Go. It achieved this through a combination of deep learning and tree search techniques. Four deep neural networks were used: three convolutional networks to reduce the action space and search depth through imitation learning, self-play reinforcement learning, and value prediction; and a smaller network for faster simulations. This combination of deep learning and search allowed AlphaGo to master the complex game of Go, demonstrating the capabilities of modern AI.
Devoxx 2017 - AI Self-learning Game Playing (Richard Abbuhl)
This document provides an overview of the history of AI self-learning game playing and machine learning. It discusses early work using search trees and perceptrons in the 1950s-1970s. Reinforcement learning techniques like TD-Gammon and Q-Learning are explained. Landmark projects including Deep Blue, AlphaGo, and AlphaGo Zero using neural networks and reinforcement learning to master challenging games like chess and Go are summarized. The document provides high-level descriptions of machine learning basics and techniques demonstrated through examples like Tic-Tac-Toe.
AlphaGo uses a novel combination of Monte Carlo tree search and neural networks to master the game of Go. It trains two neural networks - a policy network to predict expert moves and a value network to evaluate board positions. During gameplay, AlphaGo runs multiple Monte Carlo tree simulations that use the neural networks to guide search and evaluate positions. The move selected is the one most frequently visited after all simulations. This approach allowed AlphaGo to defeat world champion Lee Sedol 4-1, achieving a milestone in artificial intelligence.
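A miniature PUCT-style search in the spirit of that description might look like this. The uniform prior and the noisy value signal are hypothetical stand-ins for the policy-network and value-network outputs; only the search skeleton (visit counts, exploration bonus, most-visited move) follows the scheme above.

```python
import math
import random

def mcts_choose_move(moves, simulations=200, c_puct=1.0, seed=0):
    """One-ply Monte Carlo tree search sketch with network-style guidance."""
    rng = random.Random(seed)
    N = {m: 0 for m in moves}          # visit counts
    W = {m: 0.0 for m in moves}        # accumulated value
    prior = 1.0 / len(moves)           # uniform policy prior (stand-in)
    # Fake "true" move values so the sketch is runnable end to end.
    true_value = {m: i / len(moves) for i, m in enumerate(moves)}
    for _ in range(simulations):
        total = sum(N.values()) + 1
        def score(m):
            # PUCT: exploit mean value, explore in proportion to the prior.
            q = W[m] / N[m] if N[m] else 0.0
            return q + c_puct * prior * math.sqrt(total) / (1 + N[m])
        m = max(moves, key=score)
        # A real value network's evaluation, faked with a noisy signal.
        W[m] += true_value[m] + rng.uniform(-0.1, 0.1)
        N[m] += 1
    # As described above: play the most-visited move after all simulations.
    return max(moves, key=lambda m: N[m])
```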
AlphaZero: A General Reinforcement Learning Algorithm that Masters Chess, Sho... (Joonhyung Lee)
An introduction to DeepMind's newest board-game playing AI, AlphaZero.
I have improved significantly on my previous presentation in https://www.slideshare.net/ssuserc416e2/alphago-zero-mastering-the-game-of-go-without-human-knowledge, which had several errors (some rather glaring, such as the temperature equation for simulated annealing). Also, DeepMind released far more details in their new Science paper for AlphaZero.
One comment I would like to add is that the AlphaGo Zero used for comparison in this paper is a very weak version, not the final version. Thus, AlphaGo Zero is still SOTA for Go.
1. Game playing is an important domain for artificial intelligence research as games provide formal reasoning problems that allow direct comparison between computer programs and humans.
2. Alpha-beta pruning can speed up minimax search in game trees by pruning branches that cannot alter the outcome. It works by maintaining lower and upper bounds on the score.
3. Evaluating leaf nodes is challenging. For chess, linear evaluation functions combining weighted features like material and position are commonly used, and reinforcement learning can help tune the weights.
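A linear evaluation function of the kind described in point 3 can be sketched as below; the features, weights, and the TD-style tuning step are all illustrative, not taken from any real chess engine.

```python
# Hand-crafted features with illustrative weights (not a real engine's).
FEATURE_WEIGHTS = {
    "material_balance": 1.0,   # pawns of material up (+) or down (-)
    "mobility": 0.1,           # legal-move count difference
    "king_safety": 0.5,        # heuristic safety score difference
}

def evaluate(features):
    """Score a position as a weighted sum of its features."""
    return sum(FEATURE_WEIGHTS[name] * value
               for name, value in features.items())

def td_update(features, target, lr=0.01):
    """One reinforcement-learning step of the kind hinted at above:
    nudge the weights toward a target score (e.g. a TD backup)."""
    error = target - evaluate(features)
    for name, value in features.items():
        FEATURE_WEIGHTS[name] += lr * error * value
```

Repeated `td_update` calls against outcomes of played games are one way reinforcement learning can tune such weights.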
Artificial Intelligence
Deep Learning vs Machine Learning
Machine Learning
Terminology
Core Concepts
JavaScript and AI
TensorFlow
TensorFlow JS
Examples
Evan Estola – Data Scientist, Meetup.com at MLconf ATL (MLconf)
Beyond Collaborative Filtering: using Machine Learning to power recommendations at Meetup
Collaborative filtering and other common recommendation algorithms are a powerful technique for some scenarios. I will cover how to design a recommendation system from the ground up using an ensemble classifier and supervised learning to avoid some of the pitfalls of collaborative filtering. From sampling to deployment, we’ve had to invent our approach with few non-academic and non-toy examples to follow. At Meetup we’re all about sharing information and empowering communities, so I’ll present the details of our model as well as some of the new features we are still developing.
Implementation and analysis of search algorithms in single player connect fou... (Anmol Rajpurohit)
The document discusses the implementation and analysis of search algorithms in a single-player Connect Four game. It outlines the game rules and previous work analyzing strategies. It then describes the problem statement, algorithms implemented including minimax and alpha-beta pruning, heuristics to evaluate board positions, and a comparative analysis of the algorithms. Exponential heuristics were found to explore more nodes than linear heuristics but require less than 1 second to search to a depth of 10. Alpha-beta pruning reduced the number of nodes explored by 10 to 100 times compared to not using pruning.
In this talk we discuss the application of Reinforcement Learning to games. Recently, OpenAI created an algorithm capable of beating a human team at DOTA, a game of considerable complexity and strategy. We'll evaluate the role reinforcement learning plays in the world of games, looking at some of the main achievements and what they look like in terms of implementation. We'll also review some of the history of AI applied to games and how things have evolved over time.
Yuandong Tian at AI Frontiers: Planning in Reinforcement Learning (AI Frontiers)
Deep Reinforcement Learning (DRL) has made strong progress in many tasks, such as board games, robotics, navigation, neural architecture search, etc. I will present our recently open-sourced DRL frameworks to facilitate game research and development. Our framework is scalable, so we can reproduce AlphaGo Zero and AlphaZero using 2000 GPUs, achieving superhuman performance of a Go AI that beats 4 top-30 professional players. We also show the usability of our platform by training agents in real-time strategy games, and show interesting behaviors with a small amount of resources.
This document discusses how while deep learning has achieved success in areas like image recognition and natural language processing, it is not always the best or most accurate approach and should not be obsessively pursued to the exclusion of other machine learning techniques. Specifically, simpler models may perform equally well due to Occam's razor. Unsupervised learning and feature engineering are also important. Ensembles of different models can further improve results compared to relying on a single approach. The document cautions against an overemphasis on deep learning without considering factors like system complexity, costs, and the ability to distribute models.
Deep learning to the rescue - solving long standing problems of recommender ... (Balázs Hidasi)
I gave this talk at the 1st Budapest RecSys and Personalization Meetup about using deep learning to solve long-standing problems of recommender systems. I also presented our approach on using RNNs for session-based recommendations in detail.
This document provides an overview of reinforcement learning and AlphaZero. It discusses the math behind reinforcement learning concepts like policy iteration, policy improvement, and policy evaluation. It then explains how AlphaZero uses these concepts along with a deep neural network and self-play to master the game of Go without human data. Key algorithms discussed include Monte Carlo tree search and how AlphaZero implements them in code to learn directly from games played between copies of itself.
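The policy-iteration loop mentioned there, policy evaluation alternating with policy improvement, can be written out on a tiny made-up two-state MDP (the states, dynamics, and rewards below are purely illustrative):

```python
STATES = [0, 1]
ACTIONS = ["stay", "move"]
GAMMA = 0.9

def step(s, a):
    """Deterministic toy dynamics: returns (next_state, reward)."""
    if a == "move":
        return 1 - s, 1.0 if s == 0 else 0.0
    return s, 0.5 if s == 1 else 0.0

def policy_iteration():
    policy = {s: "stay" for s in STATES}
    V = {s: 0.0 for s in STATES}
    while True:
        # Policy evaluation: iterate the Bellman expectation to convergence.
        for _ in range(200):
            for s in STATES:
                ns, r = step(s, policy[s])
                V[s] = r + GAMMA * V[ns]
        # Policy improvement: act greedily with respect to V.
        new_policy = {}
        for s in STATES:
            def q(a):
                ns, r = step(s, a)
                return r + GAMMA * V[ns]
            new_policy[s] = max(ACTIONS, key=q)
        if new_policy == policy:   # stable policy => optimal
            return policy, V
        policy = new_policy
```

On this toy MDP the loop settles on moving out of state 0 to collect the reward, then staying in state 1.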
Netflix uses machine learning and algorithms to power recommendations for over 69 million members across more than 50 countries. They experiment with a wide range of algorithms including regression, matrix factorization, deep neural networks, and more. Some lessons learned are to first build an offline experimentation framework with clear metrics, consider distribution from the start, and design production code to also support experimentation. The goal is to efficiently iterate experiments and smoothly implement successful models in production.
Netflix uses machine learning and algorithms to power recommendations that help members find content to watch. Some of the models and algorithms used include regression, matrix factorization, deep neural networks, and clustering. Key lessons learned are to first build an offline experimentation framework with proper metrics and data splits before tackling new problems. When experimenting, consider distributing algorithms to offset communication overhead if data is large enough. Design production code to also support experimentation through shared engines and avoiding dual implementations.
This document summarizes different procedural content generation (PCG) methods and approaches, including their strengths and weaknesses. It discusses constructive methods, constraint-based systems, optimization techniques, and grammars. For each method, it provides examples and discusses the "power and peril" - the strengths but also challenges to address. The document concludes with practical advice on choosing a PCG approach based on factors like desired player interaction, control level, and speed needs. It also discusses tools that use PCG to aid designers in a mixed-initiative process.
Juantomás García gave a talk on machine learning pipelines for developing AI that can learn to play and solve the 1980s video game "The Abbey of the Crime". He discussed gathering game data, exploring different reinforcement learning strategies, and developing a simple neural network model with policy and value networks to determine moves and rewards. He described his current pipeline that moves raw game data through processing steps using technologies like Kubernetes, PubSub, training jobs, and model storage. The talk encouraged attendees to collaborate on the open source project on GitHub and join the AbadIA Slack channel.
Building an AI looks glamorous, but it is a long process, and at the end of the day the tasks related to the AI model itself are 5% or less of the project.
We will see how to start an AI project from zero: defining the objectives, creating the architecture, building the game interfaces, massive data pipelines, defining model strategies, how to parallelize everything, etc.
"The Abbey of the Crime" is a notoriously hard 8-bit game. It is more complicated than Montezuma's Revenge and is a perfect challenge for an AI: its complexity is about 10^1000 legal moves to solve it.
As AI technology, we will use Reinforcement Learning using Deep Neural Networks and Monte Carlo Tree Search.
The takeaways of this talk will be: understanding all the processes involved to create an AI and learning the basics of Reinforcement Learning.
Similar to From alpha go to alpha zero TLP innova 2018
1. The document discusses using reinforcement learning to create an AI that can learn to play and solve the 1987 8-bit role playing game "The Abbey of Crime".
2. The game was recreated in C++ using a video game framework to make it playable on modern systems.
3. Training an AI to master the game is an immense challenge due to the enormous number of possible game states and moves, far exceeding the number of atoms in the universe or moves in Go.
How we use Reinforcement Learning to solve The Abbey of the Crime
Do you know the Abbey of the Crime?
The abbey is an 8-bit game (for Spectrum and CPC) that became the first 3D (2.5D) RPG in 1987.
This game is a marvel from a technological point of view: in less than 120 KB it stores the sound, the images, all the program logic, and the data.
Did you manage to finish the game without help?
I do not know any human being who has finished it without help. It is one of the most complicated games ever developed, roughly 1000x harder than Atari's Montezuma's Revenge. Its complexity is around 10^10000.
In the talk, we explain how we designed and built an AI capable of playing on its own and learning to complete the game.
The document discusses using reinforcement learning to create an AI that can learn to play and solve the 1987 8-bit role playing game "La Abadía del Crimen". It notes that the game is considered a legend in video game history. The plan is outlined to make an AI that can learn to play and solve the full game. Challenges are discussed around the enormous number of possible moves. Tools and collaboration methods are presented for creating the AI, including interacting with the game code and saving game information.
3. Who I am
Juantomás García
•Chief Envisioning Officer @ Sngular
•GDE (Google Developer Expert) for cloud
•#AbadIA Public Relations
Others
•Co-Author of the first Spanish free software book “La Pastilla Roja”
•Former President of Hispalinux (Spanish Linux User Group)
•Organizer of Machine Learning Spain and GDG Cloud Madrid.
4. Who the Audience Is
• People interested in Machine Learning
• Who want to know more about what AlphaGo is
• With a good technical background
5. Why I did this presentation
•I love Machine Learning.
•There are a lot of takeaways from this project.
•I want to share them
6. Outline
•AlphaGo: the epic project
•AlphaGo Zero: re-evolution version
•AlphaZero: looking for general solutions
•DIY: AlphaZero Connect 4
•Takeaways
7. A brief introduction
• Deep Blue was about brute force
• They were emulating how humans play chess
8. A brief introduction
• A huge search space:
Chess -> 20 possible opening moves
Go -> 361 possible opening moves
9. AlphaGo Main Concepts
• Policy Neural Network
"To decide which are the most sensible moves in a particular board position."
10. AlphaGo Main Concepts
• Value Neural Network
"How good is a particular board arrangement."
"How likely you are to win the game from this position."
11. AlphaGo First Approach: SL
• Just train both networks using human games.
• Just plain, ordinary Supervised Learning.
• With this, AlphaGo plays like a weak human.
• It is like the approach of Deep Blue: just emulating human chess players.
12. AlphaGo Second Approach: RL
• Improve the SL version by making it play against itself.
• With Reinforcement Learning it is able to play well against state-of-the-art Go programs.
• Those programs use MCTS.
15. AlphaGo Second Approach: RL
• It is not 2 NNs vs Monte Carlo Tree Search.
• It is a better MCTS thanks to the NNs.
16. AlphaGo Second Approach: RL
• Optimal Value Function V*(s)
"Determines the outcome of the game from every board position (s is the state)."
A brute-force solution is impossible:
Chess: ~35^80 positions
Go: ~250^150 positions
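To get a feel for why brute force fails, here is a quick back-of-the-envelope check in Python. The branching-factor and depth figures are the slide's own estimates, not exact game-tree sizes:

```python
# Count the digits of the game-tree size estimates quoted on the slide
# (branching factor ** typical game length).
chess_states = 35 ** 80    # chess: ~35 moves per position, ~80 plies
go_states = 250 ** 150     # Go: ~250 moves per position, ~150 moves

print(len(str(chess_states)))  # 124 digits, i.e. ~10^123
print(len(str(go_states)))     # 360 digits, i.e. ~10^359
# For comparison, the observable universe holds roughly 10^80 atoms.
```

Even enumerating one position per atom in the universe would not come close, which is why the search space has to be pruned rather than exhausted.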
17. AlphaGo Second Approach: RL
• Two solutions to reduce the effective search space:
Truncate the search tree: approximate V*(s) with a learned V(s)
Reduce the breadth of the search with the policy P(a|s)
We roll out (with MCTS) the moves chosen by the policy network and evaluate them with the value function.
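This policy-guided search can be sketched as a PUCT-style selection rule, where the prior P(a|s) narrows the breadth and an exploration bonus u(s,a) keeps less-visited moves alive. A minimal sketch, assuming illustrative values: the constant `c_puct` and the toy statistics below are made up, not DeepMind's actual numbers.

```python
import math

def puct_select(stats, prior, c_puct=1.5):
    """Pick the move maximizing Q(s,a) + u(s,a), where the exploration
    bonus is u(s,a) = c_puct * P(a|s) * sqrt(N(s)) / (1 + N(s,a)).
    stats: move -> (visit count N(s,a), total value W(s,a))
    prior: move -> policy network probability P(a|s)
    """
    n_total = sum(n for n, _ in stats.values())
    best_move, best_score = None, -float("inf")
    for move, (n, w) in stats.items():
        q = w / n if n > 0 else 0.0                       # mean value Q(s,a)
        u = c_puct * prior[move] * math.sqrt(n_total) / (1 + n)
        if q + u > best_score:
            best_move, best_score = move, q + u
    return best_move

# A rarely visited move with a high prior can outrank a well-explored one:
stats = {"a": (10, 6.0), "b": (1, 0.2)}
prior = {"a": 0.3, "b": 0.7}
print(puct_select(stats, prior))
```

As visit counts grow, u(s,a) shrinks and the search gradually shifts from exploring the prior's suggestions to exploiting the moves with the best measured value.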
19. AlphaGo Zero: Re-Evolution version
•Trained with Reinforcement Learning only
•Encourages exploration of less-visited moves via the bonus u(s,a)
•Just one neural network for both policy and value
•Every time a round of self-play search is done, the neural network is retrained
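The single two-headed network can be sketched in plain NumPy: a shared trunk feeding a policy head (a distribution over moves) and a value head (a scalar in [-1, 1]). This is a tiny dense stand-in, not the paper's deep residual CNN, and all layer sizes here are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: 64 input features, 32 hidden units, 8 legal moves.
N_IN, N_HID, N_MOVES = 64, 32, 8

W1 = rng.normal(scale=0.1, size=(N_IN, N_HID))     # shared trunk
Wp = rng.normal(scale=0.1, size=(N_HID, N_MOVES))  # policy head
Wv = rng.normal(scale=0.1, size=(N_HID, 1))        # value head

def forward(board_features):
    h = np.tanh(board_features @ W1)       # shared representation
    logits = h @ Wp
    policy = np.exp(logits - logits.max())
    policy /= policy.sum()                 # softmax over moves: P(a|s)
    value = np.tanh(h @ Wv)[0]             # scalar in [-1, 1]: V(s)
    return policy, value

policy, value = forward(rng.normal(size=N_IN))
```

Sharing the trunk means one forward pass yields both the move priors for MCTS and the position evaluation, which is what lets AlphaGo Zero drop the separate rollout policy.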
20. AlphaGo Zero: Re-Evolution version
•Human games were noisy and not reliable.
•It does not use rollouts to predict who will win.
23. Alpha Zero: New Challenges
AlphaGo Zero vs AlphaZero:
• Binary outcome (win/loss) vs expected outcome (including draws or potentially other outcomes)
• Board positions transformed before passing to the neural network (by randomly selected rotation or reflection) vs no data augmentation
• Games generated by the best player from previous iterations (55% win margin) vs continual update using the latest parameters (without the evaluation and selection steps)
• Hyper-parameters tuned by Bayesian optimisation vs the same hyper-parameters reused without game-specific tuning
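In both variants, each self-play position yields two training targets: the MCTS visit-count distribution pi becomes the policy target, and the final game outcome z becomes the value target, combined in the loss (z - v)^2 - pi . log(p) plus an L2 term. A minimal sketch with made-up numbers (the visit counts and network outputs below are purely illustrative):

```python
import numpy as np

def alphazero_loss(p, v, pi, z):
    """Per-position AlphaZero-style loss: (z - v)^2 - pi . log(p).
    p  : network policy output (probabilities over moves)
    v  : network value output (scalar in [-1, 1])
    pi : MCTS visit-count distribution (the policy training target)
    z  : final game outcome from this player's view (+1 win, -1 loss)
    The paper adds an L2 regularization term, omitted here.
    """
    return (z - v) ** 2 - np.sum(pi * np.log(p + 1e-12))

# Visit counts from a finished search become the policy target:
visits = np.array([90.0, 8.0, 2.0])
pi = visits / visits.sum()
p = np.array([0.5, 0.3, 0.2])   # current (untrained) network output
loss = alphazero_loss(p, v=0.1, pi=pi, z=+1.0)
print(round(float(loss), 3))
```

Minimizing this loss pulls the policy head toward the search's visit distribution and the value head toward the actual game result, so the network distills what the search discovered.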
31. Questions?
•email: juantomas.garcia@gmail.com
•twitter: @juantomas
This talk has a lifetime free-questions warranty: if you have any questions or concerns about this talk, feel free to contact me anytime.
Selfie time: if you liked the talk, just smile while I take the selfie ;-)
We’re Hiring, Sngular People