An introduction to DeepMind's newest board-game playing AI, AlphaZero.
This presentation improves significantly on my previous one (https://www.slideshare.net/ssuserc416e2/alphago-zero-mastering-the-game-of-go-without-human-knowledge), which had several errors (some rather glaring, such as the temperature equation for simulated annealing). DeepMind has also released far more detail in their new Science paper on AlphaZero.
One comment I would like to add is that the AlphaGo Zero used for comparison in this paper is a very weak version, not the final version. Thus, AlphaGo Zero is still SOTA for Go.
2. Introduction: AlphaGo and its Successors
▪ AlphaGo: January 27th, 2016
▪ AlphaGo Master: December 29th, 2016
▪ AlphaGo Zero: October 19th, 2017
▪ AlphaZero: December 5th, 2017
▪ The full AlphaZero paper was published on December 6th, 2018, in Science.
3. AlphaZero: One Program to Rule them All
▪ Going Beyond the Game of Go: All three games of chess, shogi, and go are played by a single algorithm and a single network architecture. Training is performed separately for each game.
▪ No human data: Starts tabula rasa (hence the “Zero” in the name) from random play and uses only self-play.
▪ No hand-crafted features: Only the rules of each game and raw board positions are used (unlike the original AlphaGo).
▪ Shared hyperparameters: Only the learning-rate schedule and exploration-noise parameters differ between games.
5. Introduction
▪ Reinforcement Learning (RL) concerns how software agents should take actions in an environment to maximize some reward.
▪ It differs from Supervised Learning (SL) in that the agent discovers the reward by exploring its environment, making labelled data unnecessary.
▪ AlphaZero uses the discrete Markov Decision Process (MDP) paradigm, where outcomes are partly random and partly under the control of the agent.
6. Terminology
▪ Agent: The thing interacting with the environment.
▪ State (s): The situation that the agent is in.
▪ Action (a): The action that the agent takes.
▪ Reward (r): The reward (or penalty) that the agent receives from taking an action in a state.
▪ Policy (π): The function that decides probabilities for taking each possible action in a given state. Returns a vector with probabilities for all actions.
• Value Function (V(s)): The value (long-term discounted total reward) of the given state.
• Action-Value Function (Q(s, a)): The value of a given action in a given state.

$V(s) = \sum_{a \in A} \pi(s, a)\, Q(s, a)$
7. Key Properties
▪ The value of a state is the sum of its action-values, weighted by the likelihood of each action:

$V(s) = \sum_{a \in A} \pi(s, a)\, Q(s, a)$

▪ Policies must sum to 1 because they are the probabilities of choosing the possible actions:

$\sum_{a \in A} \pi(s, a) = 1$
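As a quick check with made-up numbers: for two actions with π(s, ·) = (0.7, 0.3) and Q(s, ·) = (+1, −1), the policy sums to 1 and V(s) = 0.7 · 1 + 0.3 · (−1) = 0.4.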
8. The Explore-Exploit Tradeoff
▪ The fundamental question of Reinforcement Learning:
▪ Explore: Explore the environment further to find higher rewards.
▪ Exploit: Exploit the known states/actions to maximize reward.
Should I just eat the cheese that I have already found, or should I search the maze for more/better cheese?
9. The Markov Property
▪ All states in the Markov Decision Process (MDP) must satisfy the Markov Property: each state must depend only on the state immediately before it. There is no memory of previous states.
▪ A stochastic process has the Markov property if the conditional probability distribution of future states, given the present and the past, depends only on the present state and not on any previous states.
▪ Unfortunately, board games do not satisfy the Markov property.
11. Monte Carlo Simulation
▪ Using repeated random sampling to simulate intractable systems.
▪ The name derives from the Casino de Monte-Carlo in Monaco.
▪ Monte Carlo simulation can be applied to any problem with a probabilistic interpretation.
12. Monte Carlo Tree Search
▪ Node: State
▪ Edge: Action
▪ Tree Search: Searching the various “leaves” of the “tree” of possibilities.
▪ The simulation begins from the “root” node.
▪ When visited in simulation, a “leaf” node becomes a “branch” node and sprouts its own “leaf” nodes in the “tree”.
13. MCTS in AlphaZero
▪ MCTS is used to simulate games in AlphaZero’s “imagination”.
▪ The processes for selecting the next move in the “imagination” and in “reality” are very different.
15. Network Architecture: Introduction
▪ Inputs: Concatenated board positions from the previous 8 turns, from the current player’s perspective.
▪ Outputs: Policy for MCTS simulation (policy head, top) and the value of the given state (value head, bottom).
▪ Inputs also include additional information, such as the current player, concatenated channel-wise.
▪ Policy outputs for chess and shogi are in 2D, unlike go, which has 1D outputs.
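To make the input stack concrete, here is a small shape sketch in Python. The plane counts follow the chess description in the AlphaZero paper (8 history steps × 14 planes per step, plus 7 constant planes); the variable names and exact layout are my own illustration, not the paper's table.

```python
import numpy as np

# Chess input stack: 8 past positions, each encoded as
# 6 piece types x 2 players + 2 repetition planes = 14 planes,
# plus 7 constant planes (colour, move count, castling rights, etc.).
T, planes_per_step, constant_planes = 8, 14, 7
x = np.zeros((T * planes_per_step + constant_planes, 8, 8), dtype=np.float32)
print(x.shape)  # (119, 8, 8)
```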
16. Overview
• Select the next move in the simulation using Polynomial Upper Confidence Trees (PUCT).
• Repeat until an unevaluated leaf node is encountered.
• Backup from the node after evaluating its value and action value. Update the statistics of the branches.
• Play after enough (800 was used for AlphaZero) simulations have been performed to generate a policy.
17. Core Concepts
▪ N(s, a): Visit count, the number of times a state-action pair has been visited.
▪ W(s, a): Total action-value, the sum of all NN value outputs from that branch.
▪ Q(s, a): Mean action-value, W(s, a) / N(s, a).
▪ P(s, a): Prior probability, the policy output of the NN for the given state-action pair (s, a).
▪ N(s): Parent visit count, $N(s) = \sum_{a \in A} N(s, a)$.
▪ C(s): Exploration rate. Stays nearly constant in a single simulation.
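As one concrete way to hold these statistics in memory, here is a minimal Python sketch. The `Node` class and its field names are hypothetical, chosen to mirror the slide's notation; the paper does not prescribe a data structure.

```python
class Node:
    """One state s in the search tree; statistics live on its edges."""
    def __init__(self, prior_probs):
        # prior_probs: dict mapping each legal action a to P(s, a),
        # taken from the network's policy head at expansion time.
        self.children = {}                        # action -> child Node
        self.N = {a: 0 for a in prior_probs}      # visit counts N(s, a)
        self.W = {a: 0.0 for a in prior_probs}    # total action-values W(s, a)
        self.P = dict(prior_probs)                # prior probabilities P(s, a)

    def Q(self, a):
        # Mean action-value Q(s, a) = W(s, a) / N(s, a); 0 if unvisited.
        return self.W[a] / self.N[a] if self.N[a] > 0 else 0.0

    def N_total(self):
        # Parent visit count N(s) = sum over a of N(s, a).
        return sum(self.N.values())
```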
18. Select
▪ Select the next move in the simulation using the PUCT algorithm:

$a_{\text{selected}} = \operatorname*{argmax}_a \left[ Q(s, a) + U(s, a) \right]$

$U(s, a) = C(s)\, P(s, a)\, \frac{\sqrt{N(s)}}{1 + N(s, a)}$

$C(s) = \log\left(\frac{1 + N(s) + c_{\text{base}}}{c_{\text{base}}}\right) + c_{\text{init}}$

▪ Q(s, a) + U(s, a): Upper Confidence Bound.
▪ Q(s, a): The exploitation component.
▪ U(s, a): The exploration component.
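Continuing the hypothetical `Node` sketch above, the selection step might look as follows. The constants c_base = 19652 and c_init = 1.25 are those reported in the AlphaZero paper's pseudocode; the function names are my own.

```python
import math

def C(node, c_base=19652, c_init=1.25):
    # Exploration rate C(s); grows very slowly with the parent visit count.
    return math.log((1 + node.N_total() + c_base) / c_base) + c_init

def U(node, a):
    # Exploration bonus U(s, a) = C(s) * P(s, a) * sqrt(N(s)) / (1 + N(s, a)).
    return C(node) * node.P[a] * math.sqrt(node.N_total()) / (1 + node.N[a])

def select_action(node):
    # PUCT: pick the action maximizing Q(s, a) + U(s, a).
    return max(node.P, key=lambda a: node.Q(a) + U(node, a))
```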
19. Key Points
▪ All statistics for MCTS (N, W, Q, P, C) are maintained for one game only, not for one simulation, and not across multiple games.
▪ The NN evaluates each node only once, when it is a leaf node.
▪ The NN outputs P(s, a) and V(s) via the policy and value heads, respectively.
▪ P(s, a), Q(s, a), and U(s, a) are vectors with one element per action, not scalars.
20. Expand and Evaluate
▪ From the root node, go down the branch nodes of the tree until a leaf node (an unevaluated node) is encountered.
▪ Evaluate the leaf node s′ using the neural network $f_\theta$ to obtain the policy and value for the simulation:

$(\mathbf{p}, v) = f_\theta(s'), \qquad \mathbf{p} = P(s', \cdot), \quad v = V(s')$

▪ The tree then grows a branch where there was a leaf.
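In the same sketch, expansion could be a few lines; `net` is a hypothetical callable returning a prior-probability dict over legal actions and a scalar value for a state.

```python
def expand_and_evaluate(parent, action, leaf_state, net):
    # (p, v) = f_theta(s'): the network returns priors over the legal
    # actions at the leaf and a scalar value estimate V(s').
    prior_probs, v = net(leaf_state)
    parent.children[action] = Node(prior_probs)   # the leaf becomes a branch
    return v                                      # v is then backed up
```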
21. Backup
$N(s) \leftarrow N(s) + 1$
$N(s, a) \leftarrow N(s, a) + 1$
$W(s, a) \leftarrow W(s, a) + v$
$Q(s, a) \leftarrow \frac{W(s, a)}{N(s, a)}$

▪ A simulation terminates when a leaf node is reached, the game ends in the simulation, the value falls below a resignation threshold, or a maximum game length is reached.
▪ Update the visit counts and mean action-values for all previous state-action pairs, all the way up the tree to the root node.
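With the hypothetical `Node` from earlier, backup reduces to updating N and W along the visited path; Q is then recovered as W / N by `Node.Q`.

```python
def backup(path, v):
    # path: list of (node, action) pairs from the root down to the parent
    # of the evaluated leaf; v: the value head's output for the leaf.
    # Note: in two-player self-play, implementations typically flip the
    # sign of v at alternate depths (perspective of the player to move);
    # that detail is omitted here for brevity.
    for node, a in reversed(path):
        node.N[a] += 1      # N(s, a) <- N(s, a) + 1
        node.W[a] += v      # W(s, a) <- W(s, a) + v
```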
22. Play
$\pi(a \mid s') = \left( \frac{N(s', a)}{N(s')} \right)^{1/\tau}$

▪ After a specified number of simulations (800 were used), the policy for play is decided by the visit counts and the temperature parameter.
▪ τ: The temperature parameter, which controls the entropy of the policy.
▪ The moves in play are “real” moves, not “imaginary” simulations.
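A minimal sketch of turning root visit counts into the play policy, again reusing the hypothetical `Node` fields from earlier:

```python
import numpy as np

def play_policy(root, tau=1.0):
    # pi(a | s') proportional to N(s', a)^(1/tau).
    actions = list(root.N)
    counts = np.array([root.N[a] for a in actions], dtype=np.float64)
    if tau < 1e-3:
        # tau ~ 0: deterministically pick the most-visited action.
        probs = np.zeros_like(counts)
        probs[counts.argmax()] = 1.0
    else:
        counts = counts ** (1.0 / tau)
        probs = counts / counts.sum()   # normalize to a distribution
    return dict(zip(actions, probs))
```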
23. Key Points
▪ The probabilities of the play policy π are given by the visit counts of the MCTS simulation, not by the NN directly.
▪ No NN training occurs during MCTS simulation.
▪ The action selection mechanisms for simulation and play are different.
24. The Loss Function
$l = (z - v)^2 - \boldsymbol{\pi}^{T} \log \mathbf{p} + c \lVert \theta \rVert^2$

Loss = MSE(actual value, predicted value)
     + Cross-Entropy(MCTS policy, predicted policy)
     + L2 Decay(model weights)

▪ z = 1, 0, −1 for a win, tie, or loss as the true outcome of the game.
▪ c: Weight-decay hyperparameter.
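A sketch of this loss in PyTorch (my own illustration; the default c = 1e-4 follows the value reported for AlphaGo Zero and is an assumption here):

```python
import torch.nn.functional as F

def alphazero_loss(z, v, pi, p_logits, model, c=1e-4):
    # z: game outcomes in {+1, 0, -1}; v: value head outputs;
    # pi: MCTS visit-count policies; p_logits: raw policy head outputs.
    value_loss = F.mse_loss(v, z)                          # (z - v)^2
    policy_loss = -(pi * F.log_softmax(p_logits, dim=-1)).sum(dim=-1).mean()
    l2 = sum((w ** 2).sum() for w in model.parameters())   # ||theta||^2
    return value_loss + policy_loss + c * l2
```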
26. Self-Play vs Evaluation
Prior Probabilities

$P(s', a) = (1 - \epsilon)\, p_a + \epsilon\, \eta_a, \qquad \boldsymbol{\eta} \sim \mathrm{Dir}(\alpha)$

▪ In training, noise is added to the prior probabilities at the root node.
▪ ε = 0.25 and α = {0.3, 0.15, 0.03} for chess, shogi, and go, respectively.
▪ α is scaled in inverse proportion to the approximate number of legal moves in a typical position.

Temperature

$\pi(a \mid s') = \left( \frac{N(s', a)}{N(s')} \right)^{1/\tau}$

▪ Simulated annealing is used to increase exploration during the first few moves (τ = 1 for the first 30 moves, τ ≈ 0 afterwards).
▪ τ ≈ 0 is equivalent to choosing the action with the highest probability, while τ = 1 is equivalent to randomly choosing an action according to the probabilities given by the vector π(a | s′).
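A sketch of the root-noise injection during self-play, reusing the hypothetical `Node` fields from earlier:

```python
import numpy as np

def add_root_noise(root, epsilon=0.25, alpha=0.3):
    # P(s', a) = (1 - eps) * p_a + eps * eta_a, with eta ~ Dir(alpha).
    # alpha = 0.3 is the chess setting (0.15 for shogi, 0.03 for go).
    actions = list(root.P)
    eta = np.random.dirichlet([alpha] * len(actions))
    for a, noise in zip(actions, eta):
        root.P[a] = (1 - epsilon) * root.P[a] + epsilon * noise
```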
27. Details of Training Data Generation
▪ Self-Play games of the most recent model are used to generate training data.
▪ Multiple self-play games are run in parallel to provide enough training data.
▪ 5,000 first-generation TPUs were used for data generation during training.
▪ 16 second-generation TPUs were used for model training.
▪ The actual MCTS is performed asynchronously for better resource utilization.
▪ A batch size of 4096 game steps was used for training.
28. Differences with AlphaGo Zero
▪ No data augmentation by symmetries. Go is symmetric, but chess and shogi are not.
▪ A single network is continually updated, instead of testing for the best player every 1,000 steps. Self-play games are always generated by the latest model.
▪ No Bayesian optimization of hyperparameters.
▪ 19 residual blocks in the body of the NN, unlike the final version of AlphaGo Zero, which had 39. However, this is identical to the early version of AlphaGo Zero.
30. Network Architecture: Structure
▪ 19 residual blocks in the body with 2 output heads.
▪ The policy head (top) has softmax activation to output probabilities for the policy for the state.
▪ The value head (bottom) has tanh activation to output the value of the state (since +1: win, 0: tie, −1: loss).
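A compact PyTorch sketch of this structure (residual tower plus two heads). The head designs follow the AlphaGo Zero description; the channel count, move count (4672 for chess), and layer details are assumptions for illustration.

```python
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, ch=256):
        super().__init__()
        self.conv1 = nn.Conv2d(ch, ch, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(ch)
        self.conv2 = nn.Conv2d(ch, ch, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(ch)

    def forward(self, x):
        y = self.bn1(self.conv1(x)).relu()
        y = self.bn2(self.conv2(y))
        return (x + y).relu()                 # skip connection, then ReLU

class AlphaZeroNet(nn.Module):
    def __init__(self, in_ch=119, board=8, n_moves=4672, ch=256, blocks=19):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(in_ch, ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(ch), nn.ReLU())
        self.body = nn.Sequential(*[ResBlock(ch) for _ in range(blocks)])
        # Policy head: move logits (softmax is applied inside the loss).
        self.policy = nn.Sequential(
            nn.Conv2d(ch, 2, 1), nn.BatchNorm2d(2), nn.ReLU(),
            nn.Flatten(), nn.Linear(2 * board * board, n_moves))
        # Value head: scalar in [-1, 1] via tanh (+1 win, 0 tie, -1 loss).
        self.value = nn.Sequential(
            nn.Conv2d(ch, 1, 1), nn.BatchNorm2d(1), nn.ReLU(),
            nn.Flatten(), nn.Linear(board * board, ch), nn.ReLU(),
            nn.Linear(ch, 1), nn.Tanh())

    def forward(self, x):
        h = self.body(self.stem(x))
        return self.policy(h), self.value(h).squeeze(-1)
```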
40. Common Misunderstandings
▪ Computers just search for all possible positions.
▪ Computers cannot have creativity or intuition like humans.
▪ Computers can only perform tasks programmed by humans; therefore, they cannot exceed humans.
▪ AlphaZero needs a supercomputer to run.
42. Expert Opinion
“I admit that I was pleased to see that AlphaZero had a dynamic, open style like my own. The conventional wisdom was that machines would approach perfection with endless dry maneuvering, usually leading to drawn games. But in my observation, AlphaZero prioritizes piece activity over material, preferring positions that to my eye looked risky and aggressive. Programs usually reflect priorities and prejudices of programmers, but because AlphaZero programs itself, I would say that its style reflects the truth. This superior understanding allowed it to outclass the world's top traditional program despite calculating far fewer positions per second. It's the embodiment of the cliché, ‘work smarter, not harder.’”
- Garry Kasparov, former World Chess Champion
43. Additional Information
▪ The “Zero” in AlphaZero and AlphaGo Zero means that these systems began learning tabula rasa, from random initialization with zero human input, using only the rules of the game.
▪ A single machine with 4 first-generation TPUs and 44 CPU cores was used for game-play. A first-generation TPU has a similar inference speed to an NVIDIA Titan V GPU.
▪ Leela Zero, an open-source implementation of AlphaGo Zero and AlphaZero, is available for those without access to 5,000 TPUs.