Removing raw HTML

Brandon Rozek 2025-02-16 22:04:56 -05:00
parent e06d45e053
commit 572d587b8e
GPG key ID: DFB0E78F805F4567
33 changed files with 373 additions and 386 deletions

@@ -24,7 +24,7 @@ This algorithm is called *iterative policy evaluation*.
To produce each successive approximation, $v_{k + 1}$ from $v_k$, iterative policy evaluation applies the same operation to each state $s$: it replaces the old value of $s$ with a new value obtained from the old values of the successor states of $s$, and the expected immediate rewards, along all the one-step transitions possible under the policy being evaluated.
-<u>**Iterative Policy Evaluation**</u>
+**Iterative Policy Evaluation**
```
Input π, the policy to be evaluated
@@ -69,7 +69,7 @@ Each policy is guaranteed to be a strict improvement over the previous one (unle
This way of finding an optimal policy is called *policy iteration*.
-<u>Algorithm</u>
+**Algorithm**
```
1. Initialization