2. Background and Definition
• Reinforcement learning (RL) is an area of machine learning inspired by
behaviourist psychology, concerned with how software agents ought
to take actions in an environment so as to maximize some notion of
cumulative reward (Wikipedia)
• Reinforcement learning copies a very simple principle from nature,
one that the psychologist Edward Thorndike documented more than
100 years ago. Thorndike placed cats inside boxes from which they
could escape only by pressing a lever. After a considerable amount of
pacing around and meowing, the animals would eventually step on
the lever by chance. After they learned to associate this behavior with
the desired outcome, they eventually escaped with increasing speed.
3. • In 1951, Marvin Minsky, a student at Harvard who would become
one of the founding fathers of AI as a professor at MIT, built a
machine that used a simple form of reinforcement learning to mimic
a rat learning to navigate a maze. Minsky’s Stochastic Neural Analog
Reinforcement Calculator, or SNARC, consisted of dozens of tubes,
motors, and clutches that simulated the behavior of 40 neurons and
synapses. As a simulated rat made its way out of a virtual maze, the
strength of some synaptic connections would increase, thereby
reinforcing the underlying behavior.
5. Will Knight@technologyreview.com
• By experimenting, computers are figuring out how to do things that
no programmer could teach them
• The hard part is how to get a computer to calculate the value that
should be assigned to, say, each right or wrong turn that a rat might
make on its way out of its maze
6. Algorithm
• Q-learning works by learning a value for each state-action pair, so
that the best action in any state can be chosen simply by picking the
action with the maximal value in that state.
• Given enough exploration, Q-learning can find an optimal policy for a
finite Markov Decision Process
• Markov Decision Process (MDP): a mathematical framework, named after
Andrey Markov, for modelling decision-making systems with states,
actions, and rewards
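The Q-learning idea in the bullets above can be sketched on a toy problem. The example below is a minimal illustration, not taken from the slides: the five-cell corridor "maze", the `step` function, and the values of `ALPHA`, `GAMMA`, and `EPSILON` are all assumptions chosen for readability. It learns a table of state-action values and then reads off the greedy action in each cell.

```python
import random

# Hypothetical toy maze: a corridor of cells 0..4, where cell 4 is the exit.
N_STATES = 5
ACTIONS = (-1, +1)                        # step left or step right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1     # assumed learning rate, discount, exploration rate


def step(state, action):
    """Move along the corridor; reward 1.0 only when the exit is reached."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < EPSILON:
                a = rng.randrange(len(ACTIONS))             # explore
            else:
                best = max(q[state])                        # exploit, ties broken at random
                a = rng.choice([i for i, v in enumerate(q[state]) if v == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a')
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the maximal-value action in every non-terminal cell."""
    return [ACTIONS[row.index(max(row))] for row in q[:-1]]


q_table = train()
print(greedy_policy(q_table))   # [1, 1, 1, 1]: always step right, toward the exit
```

In this sketch the learned values shrink by a factor of `GAMMA` for each extra step away from the exit, which is one simple way of assigning a value to every right or wrong turn.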
8. Interesting facts
• DeepMind combined Deep Learning & Reinforcement Learning to
create the first artificial agents to achieve human-level performance
across many challenging domains
• In March 2016, AlphaGo, a program trained using reinforcement
learning, defeated one of the best Go players of all time, South
Korea’s Lee Sedol
Reinforcement learning is an area of machine learning inspired by the behavior of living creatures, as documented by Edward Thorndike, who placed a cat in a cage that could be opened by pressing a lever. After several trials with the same cat, the cat escaped from the cage faster and faster.
It was applied to AI in 1951 by Marvin Minsky, whose device, the Stochastic Neural Analog Reinforcement Calculator, or SNARC, simulated a rat learning to solve a maze.
Q-learning algorithm: used to learn to play games on the Atari 2600 console
Deep learning: the basic idea is that software can simulate the neocortex’s large array of neurons in an artificial “neural network”