The document describes reinforcement learning algorithms. It defines the reinforcement learning framework as a 5-tuple (S, A, T, R, γ): the state space, action space, transition function, reward function, and discount factor. It presents the Bellman equations, which express the optimal state-action value function Q* and the optimal state value function V* recursively. It also describes the iterative policy evaluation algorithm, which applies the Bellman equation as an update rule to compute the value function V_k at each iteration k.
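
As a rough illustration of the iterative policy evaluation step mentioned above, here is a minimal tabular sketch in pure Python. Several details are assumptions rather than anything stated in the summary: the function name `iterative_policy_evaluation`, the nested-list layout `T[s][a][s2]` for transition probabilities, a reward of the form `R[s][a]`, a stochastic policy given as `policy[s][a]`, and the stopping tolerance `tol`. Each sweep computes V_{k+1}(s) = Σ_a π(a|s) [R(s, a) + γ Σ_s' T(s, a, s') V_k(s')] from the previous estimate V_k.

```python
def iterative_policy_evaluation(T, R, policy, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Estimate the value function of a fixed policy on a small tabular MDP.

    Assumed (hypothetical) layout:
      T[s][a][s2]  - probability of moving from state s to s2 under action a
      R[s][a]      - expected immediate reward for taking action a in state s
      policy[s][a] - probability that the policy picks action a in state s
    """
    n_states = len(T)
    V = [0.0] * n_states  # V_0: start from all zeros
    for _ in range(max_iters):
        V_new = []
        for s in range(n_states):
            v = 0.0
            for a, pi_sa in enumerate(policy[s]):
                # Expected value of the next state under V_k
                expected_next = sum(p * V[s2] for s2, p in enumerate(T[s][a]))
                v += pi_sa * (R[s][a] + gamma * expected_next)
            V_new.append(v)
        # Largest change between V_k and V_{k+1} across states
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:
            break
    return V


# Toy MDP (made-up numbers): 2 states, 2 actions, uniform random policy.
T = [
    [[0.8, 0.2], [0.1, 0.9]],
    [[0.5, 0.5], [0.0, 1.0]],
]
R = [[1.0, 0.0], [0.0, 2.0]]
policy = [[0.5, 0.5], [0.5, 0.5]]

V = iterative_policy_evaluation(T, R, policy, gamma=0.9)
print(V)
```

The sketch keeps V_k and V_{k+1} in separate lists so that every state in a sweep is updated from the same V_k. An in-place variant that overwrites values as it goes is also common, but it no longer matches the per-iteration indexing by k.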