Speaker: Prof. Dante Kalise - University of Nottingham
Tuesday, June 9, 2020 at 10:30 AM

Mathematical Foundations of Reinforcement Learning [1 ECTS - 8 hours]

This course concerns multi-stage decision processes in the framework of dynamic programming and the Bellman equation, where optimal policies are synthesized based on both immediate and long-term rewards. However, the computational requirements of dynamic programming techniques can be prohibitive when the policy/state space is overwhelmingly large, Bellman's so-called "curse of dimensionality". In this course we will address this difficulty by means of different techniques for computing suboptimal solutions to dynamic programming equations. The lectures will cover theoretical, algorithmic, and computational aspects of such techniques.
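
As a minimal illustration of the dynamic programming principle described above (a sketch that is not taken from the course material), the Bellman equation for a small deterministic problem can be solved by value iteration. The five-state chain, the reward, the discount factor, and all names below are hypothetical choices made only for this example.

```python
# Value iteration for a deterministic, discounted decision process.
# Everything here (chain problem, names, parameters) is an illustrative assumption.

def value_iteration(n_states, actions, step, reward, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Iterate V(s) <- max_a [ reward(s, a) + gamma * V(step(s, a)) ] until convergence."""
    V = [0.0] * n_states
    for _ in range(max_iter):
        V_new = [
            max(reward(s, a) + gamma * V[step(s, a)] for a in actions)
            for s in range(n_states)
        ]
        diff = max(abs(x - y) for x, y in zip(V, V_new))
        V = V_new
        if diff < tol:
            break
    # Greedy policy with respect to the computed value function.
    policy = [
        max(actions, key=lambda a, s=s: reward(s, a) + gamma * V[step(s, a)])
        for s in range(n_states)
    ]
    return V, policy


N = 5                 # states 0..4 on a line
ACTIONS = (-1, +1)    # move left or right

def step(s, a):
    # Deterministic transition, clamped to the ends of the chain.
    return min(max(s + a, 0), N - 1)

def reward(s, a):
    # Reward 1 whenever the move lands on (or stays at) the rightmost state.
    return 1.0 if step(s, a) == N - 1 else 0.0

V, policy = value_iteration(N, ACTIONS, step, reward)
print("value function:", [round(v, 3) for v in V])
print("greedy policy: ", policy)
```

The table V has one entry per state, so its size grows exponentially with the dimension of the state space once a continuous problem is discretized on a grid. This is the difficulty that motivates the approximate techniques treated in the lectures.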

Lectures will be recorded and live-streamed according to the following schedule:
Tue   9 June   10:30-12:30;   [video]
Wed  10 June   10:30-12:30;   [video]
Thu  11 June   12:30-14:30;   [video]  associated article: [arxiv]
Fri  12 June   10:30-12:30.   [video]  find below the MATLAB code zermelo.m

Find the handwritten notes below

