The document discusses several approaches to reinforcement learning: learning heuristic functions from experience, learning in explicit and implicit graphs, and specifying tasks through rewards rather than goals. It also covers algorithms such as temporal difference learning and value iteration, which help an agent learn an optimal policy by assigning credit to the state-action pairs responsible for the rewards it receives.
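
As a rough illustration of value iteration, one of the algorithms named above, the sketch below runs it on a small chain of states. Everything in it is an assumption made for the example and does not come from the document: the four-state chain, the reward of 1.0 for reaching the last state, the discount factor of 0.9, and the names `step`, `value_iteration`, and `greedy_policy`.

```python
# Hypothetical 4-state chain (an assumption for illustration only):
# states 0..3, state 3 is terminal, and entering it yields reward 1.0.
N_STATES = 4
TERMINAL = 3
ACTIONS = (-1, +1)  # move one state left or one state right
GAMMA = 0.9         # assumed discount factor


def step(state, action):
    """Deterministic model of the chain: return (next_state, reward)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == TERMINAL else 0.0
    return next_state, reward


def value_iteration(tol=1e-9):
    """Sweep the Bellman optimality backup until values change by less than tol."""
    values = [0.0] * N_STATES
    while True:
        delta = 0.0
        for s in range(N_STATES):
            if s == TERMINAL:
                continue  # the terminal state keeps value 0
            outcomes = [step(s, a) for a in ACTIONS]
            best = max(r + GAMMA * values[s2] for s2, r in outcomes)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


def action_value(state, action, values):
    """One-step lookahead: immediate reward plus discounted next-state value."""
    next_state, reward = step(state, action)
    return reward + GAMMA * values[next_state]


def greedy_policy(values):
    """Pick, in each non-terminal state, the action with the best lookahead."""
    return {
        s: max(ACTIONS, key=lambda a: action_value(s, a, values))
        for s in range(N_STATES)
        if s != TERMINAL
    }


if __name__ == "__main__":
    v = value_iteration()
    print(v)
    print(greedy_policy(v))
```

In this toy setting the reward is only observed at the end of the chain, yet the backups propagate it to earlier states with a discount, which is one simple way to see how credit reaches the states and actions that lead toward the reward.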