# Progress Report for Week of April 2nd

## Added Video Recording Capability to the MinAtar Environment

You can now use the OpenAI Gym Monitor wrapper to watch the actions performed by agents in the MinAtar suite. (Currently, the videos are in grayscale.)

Problems I had to solve:
- How to collapse the state channels into a single grayscale value
- Getting the tensor into the right format (shape and dtype)
- Adding the additional metadata that OpenAI Gym expected
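
The first two problems can be sketched roughly as follows. This is a minimal illustration, not the code used in the library: it assumes the state arrives as a `(height, width, channels)` boolean array, and the function name `channels_to_grayscale` and the choice of evenly spaced gray levels are hypothetical.

```python
import numpy as np


def channels_to_grayscale(state: np.ndarray) -> np.ndarray:
    """Collapse a (height, width, channels) boolean state into a uint8 video frame.

    Assumption: each channel marks one kind of object, so giving every
    channel its own gray level keeps the objects distinguishable.
    """
    num_channels = state.shape[-1]
    # Channel k gets gray level (k + 1) / num_channels; empty cells stay black.
    levels = np.arange(1, num_channels + 1, dtype=np.float32) / num_channels
    # Where several channels overlap in one cell, keep the brightest one.
    gray = (state.astype(np.float32) * levels).max(axis=-1)
    gray_uint8 = np.round(gray * 255).astype(np.uint8)
    # Repeat the gray value over three channels to get a
    # (height, width, 3) uint8 frame, the usual shape for video frames.
    return np.repeat(gray_uint8[..., None], 3, axis=-1)
```

For the third problem, my understanding is that older Gym releases only record video when the environment lists `"rgb_array"` under `metadata["render.modes"]` and returns such a frame from `render(mode="rgb_array")`. Treat that as an assumption and check it against the Gym version in use.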
## Progress Towards \#Exploration

After getting nowhere trying to combine the ideas from the Random Network Distillation paper with those from the paper on count-based exploration and intrinsic motivation, I turned to the paper \#Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning.

This paper uses an autoencoder to learn a smaller latent representation of the input state. That smaller representation can then serve as a hash, and states are counted by their hash.

Playing around with autoencoders, I wanted a way to discretize my hash more coarsely than floating-point precision allows. Of course, this turns it into a non-differentiable function, so I turned to evolutionary methods to optimize it. Sadly, optimization was drastically slower with the evolutionary approach, so my experiments for this week failed.

I'll probably implement what the paper did for my library and move on to a different piece.
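
For reference, the counting step itself is small. Below is a minimal sketch of hash-based counting using a SimHash-style code (the sign of a random Gaussian projection), which is one of the hashing schemes the \#Exploration paper describes. It is not the paper's code or my library's: the class name `HashCounter` and the parameters `code_bits` and `bonus_scale` are hypothetical, and the input is assumed to be a flat feature vector, whether raw state features or an autoencoder's latent code.

```python
from collections import defaultdict

import numpy as np


class HashCounter:
    """Count visits to hashed states and return a count-based bonus."""

    def __init__(self, feature_dim, code_bits=16, bonus_scale=0.01, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed random projection; it is never trained.
        self.projection = rng.standard_normal((code_bits, feature_dim))
        self.bonus_scale = bonus_scale
        self.counts = defaultdict(int)

    def hash_state(self, features):
        """Discretize a feature vector into a binary code usable as a dict key."""
        flat = np.asarray(features, dtype=np.float64).ravel()
        # Taking the sign is the discretization step (and is not differentiable).
        bits = (self.projection @ flat) > 0
        return bits.tobytes()

    def visit(self, features):
        """Record one visit and return bonus_scale / sqrt(visit count)."""
        key = self.hash_state(features)
        self.counts[key] += 1
        return self.bonus_scale / np.sqrt(self.counts[key])
```

Each call to `visit` returns a bonus that shrinks as the same hash is seen more often, so it can be added to the environment reward. Fewer `code_bits` means more states share a hash, which is the coarser discretization I was after, here obtained without any gradient through the hash.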