Designing Reliable Reinforcement Learning Agents

Speaker: Thiago D. Simão - Eindhoven University of Technology
Wednesday, May 29, 2024 at 12:30 PM

Safety is a crucial concern when deploying reinforcement learning (RL) algorithms in real-world scenarios. In this two-part lecture series, we delve into safety considerations from two perspectives: ensuring reasonable performance and adhering to predefined constraints.
- PART 1. In the first part, we investigate the offline setting, where the RL agent has access only to a fixed dataset of previously collected trajectories and cannot interact with the environment directly. Assuming that the behavior policy used to collect the data is available, the main challenge is to compute a policy that outperforms it. We study algorithms that leverage the behavior policy to compute a policy that improves on it with high probability, and we discuss how to improve their sample efficiency.
- PART 2. In the second part, we confront the limitations of specifying the behavior expected from an agent solely through a reward function. We introduce a model that mitigates this issue using constraints, and we discuss how to compute the corresponding optimal policy when the problem is known. Finally, we study algorithms that can efficiently explore the environment and eventually converge to an optimal policy when the model is unknown.



May 29, 12.30-14.30 (Room B)
May 29, 15.30-17.30 (Room 1.02)
May 31, 15.30-17.30 (Room 1.02)

June 5, 12.30-14.30 (Room B)
June 5, 15.30-17.30 (Room 1.02)
June 7, 15.30-17.30 (Room 1.02)


The minicourse is related to the "Reinforcement Learning" course (Master in Artificial Intelligence).

Programme Director
Alberto Castellini

Publication date
April 8, 2024