Mathematical Foundations of Reinforcement Learning [1 ECTS - 8 hours]
This course concerns multi-stage decision processes in the framework of dynamic programming and the Bellman equation, where optimal policies are synthesized based on both immediate and long-term rewards. However, the computational requirements of dynamic programming techniques can become prohibitive when the policy/state space is very large, the so-called Bellman's "curse of dimensionality". In this course we will overcome this difficulty by means of different techniques for computing suboptimal solutions to dynamic programming equations. The lectures will address theoretical, algorithmic, and computational aspects of such techniques.
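As a rough illustration of the dynamic programming setting described above, the following minimal sketch runs value iteration on a made-up three-state problem. It is not taken from the course material: the function name value_iteration, the arrays P and R, the discount factor gamma, and the toy example are all assumptions chosen for illustration. The value table has one entry per state, which is exactly what becomes infeasible when the state space is high-dimensional.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Illustrative value iteration for a finite, discounted decision process.

    P: array of shape (A, S, S), P[a, s, t] = probability of moving s -> t under action a.
    R: array of shape (S, A), immediate reward for taking action a in state s.
    """
    n_states = P.shape[1]
    V = np.zeros(n_states)
    for _ in range(max_iter):
        # Bellman update: Q[s, a] = R[s, a] + gamma * sum_t P[a, s, t] * V[t]
        Q = R + gamma * np.einsum("ast,t->sa", P, V)
        V_new = Q.max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    # Greedy policy with respect to the final value estimate
    Q = R + gamma * np.einsum("ast,t->sa", P, V)
    return V, Q.argmax(axis=1)

# Hypothetical toy problem: states 0, 1, 2 on a line; state 2 is absorbing.
# Action 0 = stay, action 1 = move right. Moving right from state 1 pays reward 1.
P = np.zeros((2, 3, 3))
P[0] = np.eye(3)                      # stay in place
P[1, 0, 1] = P[1, 1, 2] = P[1, 2, 2] = 1.0  # move right (state 2 stays put)
R = np.zeros((3, 2))
R[1, 1] = 1.0

V, policy = value_iteration(P, R, gamma=0.9)
print(V, policy)
```

In this toy case the table has only three entries; for a state with d components discretized on n points each, the same table would need n**d entries, which is the difficulty the approximation techniques in the lectures are meant to address.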
Teacher: Dr. Dante Kalise (email: dante.kalise@nottingham.ac.uk)
Lectures will be recorded and live streamed according to the following schedule:
Tue 9 June 10:30-12:30; [video]
Wed 10 June 10:30-12:30; [video]
Thu 11 June 12:30-14:30; [video] associated article: [arxiv]
Fri 12 June 10:30-12:30; [video] the MATLAB code zermelo.m can be found below
Find the handwritten notes below
Students willing to participate are asked to send a registration email to: giacomo.albi@univr.it