Fixed titles, math rendering, and links on some pages

2026-01-30 05:53:38 +00:00 · 2021-07-26 09:13:20 -04:00 · 330ace0de9
commit 330ace0de9
parent 9f096a8720
61 changed files with 303 additions and 115 deletions
--- a/content/research/reinforcementlearning/notes/mdp.md
+++ b/content/research/reinforcementlearning/notes/mdp.md
@@ -1,4 +1,8 @@
-# Chapter 3: Finite Markov Decision Processes
+---
+title: Chapter 3 - Finite Markov Decision Processes
+showthedate: false
+math: true
+---

 Markov decision processes (MDPs) are a classical formalization of sequential decision making, where actions influence not just immediate rewards but also subsequent situations, or states, and through those, future rewards. Thus MDPs involve delayed reward and the need to trade off immediate and delayed reward. Whereas in bandit problems we estimated the value $q_*(a)$ of each action $a$, in MDPs we estimate the value $q_*(s, a)$ of each action $a$ in each state $s$.