mirror of
https://github.com/Brandon-Rozek/website.git
synced 2025-10-10 23:11:14 +00:00
Fixed titles, math rendering, and links on some pages
parent
9f096a8720
commit
330ace0de9
61 changed files with 303 additions and 115 deletions
@@ -1,4 +1,8 @@
-# Progress for Week of March 26
+---
+title: Progress for Week of March 26
+showthedate: false
+math: true
+---

 ## Parallelized Evolutionary Strategies

@@ -8,4 +12,4 @@ When the parallel ES class is declared, I start a pool of workers that then gets

 I started looking through papers on exploration and am interested in taking the theoretical niceness of count-based exploration in tabular settings and seeing its effects in the non-tabular case.

 "[Unifying Count-Based Exploration and Intrinsic Motivation](https://arxiv.org/abs/1606.01868)" starts from an arbitrary density model and asks that it follow a couple of nice properties we would expect of probabilities. Namely, $P(S) = N(S) / n$ and $P'(S) = (N(S) + 1) / (n + 1)$, where $N(S)$ represents the number of times you've seen that state, $n$ represents the total number of states you've seen, and $P'(S)$ represents $P(S)$ after you have seen $S$ another time. With this model, we are able to solve for $N(S)$ and derive what the authors call a *Pseudo-Count*.
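
Solving the two equations above for $N(S)$ (substitute $n = N(S) / P(S)$ into the second one) gives $N(S) = P(S)(1 - P'(S)) / (P'(S) - P(S))$. Here is a minimal Python sketch of that formula. The function name `pseudo_count` and the example numbers are made up for illustration; they are not taken from the paper's code, and a real density model would supply `p` and `p_prime` instead of exact counts.

```python
def pseudo_count(p, p_prime):
    """Recover N(S) from P(S) = N/n and P'(S) = (N + 1)/(n + 1).

    p       -- density assigned to the state before observing it again
    p_prime -- density assigned to the state after observing it once more
    (Hypothetical helper for illustration only.)
    """
    # The formula only makes sense when seeing the state again
    # raises its probability, and both values are valid probabilities.
    if not (0.0 <= p < p_prime <= 1.0):
        raise ValueError("need 0 <= p < p_prime <= 1")
    return p * (1.0 - p_prime) / (p_prime - p)


# Sanity check in the tabular case, where the true count is known.
N, n = 3, 10                  # state seen 3 times out of 10 observations
p = N / n                     # P(S)
p_prime = (N + 1) / (n + 1)   # P'(S)
print(pseudo_count(p, p_prime))  # should be (approximately) the true count N
```

In the tabular case this just hands back the count you already had, which is the point: the same expression can be evaluated with any density model that gives you $P(S)$ and $P'(S)$, even when explicit counts aren't available.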