# Dynamic programming and Optimal Control
Schedule: Fall 2023, Thursdays 9:00am - 12:15pm Location: 430 Kravis Hall
Professor: Daniel Russo Office hours: 12:15-1:00 PM or by appointment
TA: David Cheikhi
Course Number: B9120-001
This course serves as an advanced introduction to dynamic programming and optimal control. The first part of the course will cover problem formulation and problem specific solution ideas arising in canonical control problems. The second part of the course covers algorithms, treating foundations of approximate dynamic programming and reinforcement learning alongside exact dynamic programming algorithms.
Markov chains; Some python and some linear programming; mathematical maturity (this is a doctoral course);
Dynamic Programming and Optimal Control by Dimitris Bertsekas, 4th Edition, Volumes I and II.
# | Date | Topic | Notes | Reading | Extras |
---|---|---|---|---|---|
Part I: Problem formulation and special problem structure | |||||
1 | 9/7 | Intro | Vol 1, Sec 1.2-1.4 | ||
2 | 9/7 | Optimal stopping | Vol 1, Sec 3.4 | ||
3 | 9/14 | Inventory control | Vol 1, Sec 3.2 | ||
4 | 9/14 | Linear quadratic control | Vol 1, Sec 3.1 | ||
5 | 9/21 | Imperfect state information; reformulation as a problem with perfect state information; separation principle | Vol 1, Sec 4.1-4.3 | ||
6 | 9/28 | Discounted infinite horizon objectives. | Vol 2 Sec 1.4 | ||
7 | 10/5 | Continuous time, discrete event problems; uniformization | Vol 2 Sec 1.5 | ||
8 | 10/5 | Gittins index theorem; Applications to scheduling and optimal learning; | pdf1, pdf2, pdf3 | ||
9 | 10/12 | Gittins index theorem continued | |||
10/19 | No B-school classes scheduled | ||||
Part 2: Algorithms and approximations | |||||
10 | 10/26 | Infinite (or indefinite) horizon problems without discounting: average cost objectives & stochastic shortest path problems. | |||
11 | 11/2 | Online value iteration. | |||
12 | 11/9 | Approximate value iteration. | |||
12 | 11/9 | Approximation benefits of online value iteration. | |||
13 | 11/16 | Policy iteration; impact of distribution shift on convergence rate. | |||
11/23 | No B-school classes scheduled | ||||
14 | 11/30 | Policy gradient methods | |||
15 | Dec 7 |
Topics for part 2 are yet to be determined. Some possibilities include:
Homework (90%). Lecture scribing and class participation (10%).
We will have a short homework each week. Please write down a precise, rigorous, formulation of all word problems. For example, specify the state space, the cost functions at each state, etc.