rltorch

Author	SHA1	Message	Date
Brandon Rozek	11d99df977	Added improvements to the REINFORCE algorithm	2019-03-04 17:10:24 -05:00
Brandon Rozek	a59f84b446	Cleaned up scripts, added more comments	2019-03-04 17:09:46 -05:00
Brandon Rozek	e42f5bba1b	Corrected A2C and PPO to train at the end of an episode	2019-03-01 21:04:13 -05:00
Brandon Rozek	76a044ace9	Added Evolutionary Strategies Network and added more example scripts	2019-02-27 09:52:28 -05:00
Brandon Rozek	21b820b401	Implemented REINFORCE into the library	2019-02-16 20:30:27 -05:00
Brandon Rozek	14ba64d525	Added a single-process environment runner. Also added an example of using this class.	2019-02-16 18:15:45 -05:00
Brandon Rozek	460d4c05c1	Fixed EnvironmentRun to be properly multiprocess. Fixed the priority of bad states to be the smallest. [TODO] Make EnvironmentEpisode properly multiprocess	2019-02-13 23:47:37 -05:00
Brandon Rozek	115543d201	Fixed parallel implementation of getting experiences by using a queue	2019-02-13 00:36:23 -05:00
Brandon Rozek	5094ed53af	Updated examples to have new features	2019-02-11 10:23:11 -05:00
Brandon Rozek	9cd3625fd3	Made sure everything went to their appropriate devices	2019-02-03 00:45:14 -05:00
Brandon Rozek	a03abe2bb1	Initial Commit	2019-01-31 23:34:32 -05:00