The document discusses two recent papers on off-policy meta-reinforcement learning: 1) "Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables," which introduces PEARL, an off-policy meta-RL method that infers a probabilistic latent context variable from collected experience and conditions the policy on it, enabling sample-efficient meta-training and rapid adaptation to new tasks; and 2) "Guided Meta-Policy Search," which splits meta-training into two levels: a task-learning phase that trains per-task expert policies with standard RL, and a meta-learning phase that trains the meta-policy by supervised imitation of those experts. Both papers aim to improve the sample efficiency of meta-RL by making use of off-policy data. Sketches of both ideas follow below.
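
To make the first idea concrete, here is a minimal sketch of PEARL's core mechanism: a permutation-invariant context encoder that maps a batch of transitions to a Gaussian posterior over a latent task variable z, with the policy conditioned on a sampled z. This assumes PyTorch; the module names, dimensions, and architectures are illustrative stand-ins, not the authors' reference implementation.

```python
# Minimal PEARL-style sketch (illustrative; not the paper's reference code).
import torch
import torch.nn as nn

class ContextEncoder(nn.Module):
    """Maps context transitions (s, a, r, s') to a Gaussian posterior over
    a latent task variable z. Each transition contributes an independent
    Gaussian factor, so the posterior is permutation-invariant, as in PEARL."""
    def __init__(self, transition_dim, latent_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(transition_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * latent_dim),  # per-transition mean and log-variance
        )

    def forward(self, context):  # context: (N, transition_dim)
        mu, log_var = self.net(context).chunk(2, dim=-1)
        var = log_var.exp().clamp(min=1e-7)
        # Product of the N Gaussian factors gives a single Gaussian posterior.
        post_var = 1.0 / (1.0 / var).sum(dim=0)
        post_mu = post_var * (mu / var).sum(dim=0)
        return post_mu, post_var

class ContextConditionedPolicy(nn.Module):
    """Policy that acts on the state concatenated with the sampled z."""
    def __init__(self, state_dim, latent_dim, action_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim), nn.Tanh(),
        )

    def forward(self, state, z):
        return self.net(torch.cat([state, z], dim=-1))

# Usage: infer z from a handful of transitions on a new task, then act.
state_dim, action_dim, latent_dim = 8, 2, 5
transition_dim = state_dim + action_dim + 1 + state_dim  # (s, a, r, s')
encoder = ContextEncoder(transition_dim, latent_dim)
policy = ContextConditionedPolicy(state_dim, latent_dim, action_dim)

context = torch.randn(16, transition_dim)  # transitions gathered on the new task
post_mu, post_var = encoder(context)
z = post_mu + post_var.sqrt() * torch.randn_like(post_mu)  # reparameterized sample
action = policy(torch.randn(1, state_dim), z.unsqueeze(0))
```

And a similarly hedged sketch of the GMPS two-level structure: per-task adaptation in an inner step, followed by an outer imitation loss against held-out expert data. For simplicity the inner step below also uses a behavioral-cloning loss, whereas the paper's inner loop adapts with policy-gradient RL; `tasks`, the dimensions, and the one-step adaptation are all assumptions made for illustration.

```python
# GMPS-style meta-training step sketch (simplified; not the paper's exact loop).
import torch
import torch.nn as nn
from torch.func import functional_call

meta_policy = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(meta_policy.parameters(), lr=1e-3)
inner_lr = 0.1

def adapted_params(params, states, expert_actions):
    """One inner gradient step adapting the meta-policy to a task
    (behavioral cloning here; the paper uses RL for this inner step)."""
    pred = functional_call(meta_policy, params, (states,))
    loss = ((pred - expert_actions) ** 2).mean()
    grads = torch.autograd.grad(loss, list(params.values()), create_graph=True)
    return {k: p - inner_lr * g for (k, p), g in zip(params.items(), grads)}

# Each task: (adaptation states/actions, held-out expert states/actions).
tasks = [(torch.randn(32, 8), torch.randn(32, 2),
          torch.randn(32, 8), torch.randn(32, 2)) for _ in range(4)]

# Outer step: after adaptation, the policy should imitate held-out expert data.
params = dict(meta_policy.named_parameters())
outer_loss = 0.0
for s_adapt, a_adapt, s_eval, a_eval in tasks:
    phi = adapted_params(params, s_adapt, a_adapt)
    pred = functional_call(meta_policy, phi, (s_eval,))
    outer_loss = outer_loss + ((pred - a_eval) ** 2).mean()
opt.zero_grad()
(outer_loss / len(tasks)).backward()
opt.step()
```

The design point the sketch is meant to surface: because the outer objective is supervised imitation of already-trained experts rather than on-policy RL, meta-training can reuse off-policy expert data, which is the source of GMPS's sample-efficiency gain.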