| 
							
						 | 
						
							
							
							
							
								
							
							
								720bb1b051
								
							
						 | 
						
							
							
								
								Documented scheduler module
							
							
							
							
							
						 | 
						
							2020-03-20 17:59:56 -04:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								ea62ccf389
								
							
						 | 
						
							
							
								
								Added templates for unit testing and sphinx documentation
							
							
							
							
							
						 | 
						
							2020-03-15 14:27:56 -04:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Brandon Rozek
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								2e01bc16ea
								
							
						 | 
						
							
							
								
								Merge pull request #5 from Brandon-Rozek/dependabot/pip/tensorflow-1.15.2
							
							
							
							
							
							
							
							Bump tensorflow from 1.15.0 to 1.15.2 
							
						 | 
						
							2020-03-15 13:16:03 -04:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									dependabot[bot]
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								c6ca2a4cfe
								
							
						 | 
						
							
							
								
								Bump tensorflow from 1.15.0 to 1.15.2
							
							
							
							
							
							
							
							Bumps [tensorflow](https://github.com/tensorflow/tensorflow) from 1.15.0 to 1.15.2.
- [Release notes](https://github.com/tensorflow/tensorflow/releases)
- [Changelog](https://github.com/tensorflow/tensorflow/blob/master/RELEASE.md)
- [Commits](https://github.com/tensorflow/tensorflow/compare/v1.15.0...v1.15.2)
Signed-off-by: dependabot[bot] <support@github.com> 
							
						 | 
						
							2020-01-28 22:34:13 +00:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Brandon Rozek
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								557c0f689a
								
							
						 | 
						
							
							
								
								Merge pull request #4 from Brandon-Rozek/dependabot/pip/urllib3-1.24.2
							
							
							
							
							
							
							
							Bump urllib3 from 1.24.1 to 1.24.2 
							
						 | 
						
							2020-01-02 23:50:58 -05:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									dependabot[bot]
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								8323b1b073
								
							
						 | 
						
							
							
								
								Bump urllib3 from 1.24.1 to 1.24.2
							
							
							
							
							
							
							
							Bumps [urllib3](https://github.com/urllib3/urllib3) from 1.24.1 to 1.24.2.
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/master/CHANGES.rst)
- [Commits](https://github.com/urllib3/urllib3/compare/1.24.1...1.24.2)
Signed-off-by: dependabot[bot] <support@github.com> 
							
						 | 
						
							2020-01-03 04:48:59 +00:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Brandon Rozek
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								0c7640fea1
								
							
						 | 
						
							
							
								
								Merge pull request #1 from Brandon-Rozek/dependabot/pip/werkzeug-0.15.3
							
							
							
							
							
							
							
							Bump werkzeug from 0.14.1 to 0.15.3 
							
						 | 
						
							2020-01-02 23:48:21 -05:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Brandon Rozek
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								d8d1a0a5e9
								
							
						 | 
						
							
							
								
								Merge pull request #2 from Brandon-Rozek/dependabot/pip/pillow-6.2.0
							
							
							
							
							
							
							
							Bump pillow from 5.4.1 to 6.2.0 
							
						 | 
						
							2020-01-02 23:48:09 -05:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Brandon Rozek
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								3c09867fb2
								
							
						 | 
						
							
							
								
								Merge pull request #3 from Brandon-Rozek/dependabot/pip/tensorflow-1.15.0
							
							
							
							
							
							
							
							Bump tensorflow from 1.12.0 to 1.15.0 
							
						 | 
						
							2020-01-02 23:47:54 -05:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									dependabot[bot]
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								35f57a3f22
								
							
						 | 
						
							
							
								
								Bump tensorflow from 1.12.0 to 1.15.0
							
							
							
							
							
							
							
							Bumps [tensorflow](https://github.com/tensorflow/tensorflow) from 1.12.0 to 1.15.0.
- [Release notes](https://github.com/tensorflow/tensorflow/releases)
- [Changelog](https://github.com/tensorflow/tensorflow/blob/master/RELEASE.md)
- [Commits](https://github.com/tensorflow/tensorflow/compare/v1.12.0...v1.15.0)
Signed-off-by: dependabot[bot] <support@github.com> 
							
						 | 
						
							2019-12-16 21:27:03 +00:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Brandon Rozek
								
							 
						 | 
						
							
							
							
							
								
							
							
								3217c76a79
								
							
						 | 
						
							
							
								
								DQfD memory was adjusted to actually update the weights in the priority trees, fixing a bug in the sampling
							
							
							
							
							
						 | 
						
							2019-11-17 19:50:49 -05:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Brandon Rozek
								
							 
						 | 
						
							
							
							
							
								
							
							
								23532fc372
								
							
						 | 
						
							
							
								
								Added a way to cap the number of demonstrations that are kept in the buffer
							
							
							
							
							
						 | 
						
							2019-11-17 18:29:12 -05:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Brandon Rozek
								
							 
						 | 
						
							
							
							
							
								
							
							
								038d406d0f
								
							
						 | 
						
							
							
								
								Fixed errors with n-step returns
							
							
							
							
							
						 | 
						
							2019-11-13 22:56:27 -05:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Brandon Rozek
								
							 
						 | 
						
							
							
							
							
								
							
							
								ed62e148d5
								
							
						 | 
						
							
							
								
								Initial implementation of n-step loss
							
							
							
							
							
						 | 
						
							2019-11-11 10:24:40 -05:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								07c90a09f9
								
							
						 | 
						
							
							
								
								Fixed scoping error with Transitions
							
							
							
							
							
						 | 
						
							2019-11-04 12:09:09 -05:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								ad75539776
								
							
						 | 
						
							
							
								
								Implemented components necessary for Deep Q Learning from Demonstrations
							
							
							
							
							
						 | 
						
							2019-11-04 07:44:39 -05:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Brandon Rozek
								
							 
						 | 
						
							
							
							
							
								
							
							
								17391c7467
								
							
						 | 
						
							
							
								
								First draft of Deep Q Learning From Demonstrations
							
							
							
							
							
						 | 
						
							2019-10-31 20:54:52 -04:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									dependabot[bot]
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								6b2b6da9e6
								
							
						 | 
						
							
							
								
								Bump pillow from 5.4.1 to 6.2.0
							
							
							
							
							
							
							
							Bumps [pillow](https://github.com/python-pillow/Pillow) from 5.4.1 to 6.2.0.
- [Release notes](https://github.com/python-pillow/Pillow/releases)
- [Changelog](https://github.com/python-pillow/Pillow/blob/master/CHANGES.rst)
- [Commits](https://github.com/python-pillow/Pillow/compare/5.4.1...6.2.0)
Signed-off-by: dependabot[bot] <support@github.com> 
							
						 | 
						
							2019-10-22 22:14:02 +00:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									dependabot[bot]
								
							 
						 | 
						
							
							
								
								
							
							
							
								
							
							
								324fa5c667
								
							
						 | 
						
							
							
								
								Bump werkzeug from 0.14.1 to 0.15.3
							
							
							
							
							
							
							
							Bumps [werkzeug](https://github.com/pallets/werkzeug) from 0.14.1 to 0.15.3.
- [Release notes](https://github.com/pallets/werkzeug/releases)
- [Changelog](https://github.com/pallets/werkzeug/blob/master/CHANGES.rst)
- [Commits](https://github.com/pallets/werkzeug/compare/0.14.1...0.15.3)
Signed-off-by: dependabot[bot] <support@github.com> 
							
						 | 
						
							2019-10-21 18:14:40 +00:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								559efa38b0
								
							
						 | 
						
							
							
								
								Corrected for numba deprecation
							
							
							
							
							
							
							
							Enable the ability to render out scenes to play back data 
							
						 | 
						
							2019-09-19 07:57:39 -04:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								a99ca66b4f
								
							
						 | 
						
							
							
								
								Fixed multiprocessing with CUDA. Added entropy importance as a config option.
							
							
							
							
							
						 | 
						
							2019-09-18 07:26:32 -04:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								9d32a9edd1
								
							
						 | 
						
							
							
								
								Merge branch 'master' of https://github.com/Brandon-Rozek/rltorch
							
							
							
							
							
							
							
							# Conflicts:
#	rltorch/agents/QEPAgent.py 
							
						 | 
						
							2019-09-13 20:00:13 -04:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								da83f1470c
								
							
						 | 
						
							
							
								
								Some work on multiprocessing evolutionary strategies from last semester
							
							
							
							
							
						 | 
						
							2019-09-13 19:53:19 -04:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								7aa698c349
								
							
						 | 
						
							
							
								
								Added save and load functionality
							
							
							
							
							
						 | 
						
							2019-09-13 19:49:04 -04:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								dcf7cce30d
								
							
						 | 
						
							
							
								
								Flush out print text so I can keep track of the rewards by SSHing in
							
							
							
							
							
						 | 
						
							2019-09-13 19:48:51 -04:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								912e3d42cb
								
							
						 | 
						
							
							
								
								Added new OpenAI Baseline Wrappers
							
							
							
							
							
						 | 
						
							2019-09-13 19:48:24 -04:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								6d3a78cd20
								
							
						 | 
						
							
							
								
								Added parallel version of ES
							
							
							
							
							
						 | 
						
							2019-03-30 16:33:40 -04:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								9ad63a6921
								
							
						 | 
						
							
							
								
								Added license
							
							
							
							
							
						 | 
						
							2019-03-30 16:32:57 -04:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								b2f5220585
								
							
						 | 
						
							
							
								
								Made sure the reward_batch is float across different agents
							
							
							
							
							
						 | 
						
							2019-03-14 10:43:14 -04:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								cdfd3ab6b9
								
							
						 | 
						
							
							
								
								Playing around with QEP
							
							
							
							
							
						 | 
						
							2019-03-14 00:53:51 -04:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								8683b75ad9
								
							
						 | 
						
							
							
								
								Corrected gamma multiplication
							
							
							
							
							
						 | 
						
							2019-03-04 22:04:13 -05:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								190eb1f0c4
								
							
						 | 
						
							
							
								
								Correct discount_rewards function to only multiply with gamma throughout
							
							
							
							
							
						 | 
						
							2019-03-04 21:59:02 -05:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								11d99df977
								
							
						 | 
						
							
							
								
								Added improvements to the REINFORCE algorithm
							
							
							
							
							
						 | 
						
							2019-03-04 17:10:24 -05:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								a59f84b446
								
							
						 | 
						
							
							
								
								Cleaned up scripts, added more comments
							
							
							
							
							
						 | 
						
							2019-03-04 17:09:46 -05:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								e42f5bba1b
								
							
						 | 
						
							
							
								
								Corrected A2C and PPO to train at the end of an episode
							
							
							
							
							
						 | 
						
							2019-03-01 21:04:13 -05:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								1958fc7c7e
								
							
						 | 
						
							
							
								
								Corrected device when constructing fitness tensor
							
							
							
							
							
						 | 
						
							2019-02-28 14:41:34 -05:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								9740c40527
								
							
						 | 
						
							
							
								
								d.sample returns a tensor, so we stack them to not lose the device
							
							
							
							
							
						 | 
						
							2019-02-28 14:30:49 -05:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								714443192d
								
							
						 | 
						
							
							
								
								Added entropy into QEP (1% of loss)
							
							
							
							
							
							
							
							Made random numbers generated in ESNetwork happen on the same device 
							
						 | 
						
							2019-02-28 12:17:35 -05:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								76a044ace9
								
							
						 | 
						
							
							
								
								Added Evolutionary Strategies Network and added more example scripts
							
							
							
							
							
						 | 
						
							2019-02-27 09:52:28 -05:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								26084d4c7c
								
							
						 | 
						
							
							
								
								Added PPOAgent and A2CAgent to the agents submodule.
							
							
							
							
							
							
							
							Also made some small changes to how memories are queried 
							
						 | 
						
							2019-02-19 20:54:30 -05:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								21b820b401
								
							
						 | 
						
							
							
								
								Implemented REINFORCE into the library
							
							
							
							
							
						 | 
						
							2019-02-16 20:30:27 -05:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								14ba64d525
								
							
						 | 
						
							
							
								
								Added a single-process environment runner. Also added an example of using such a class.
							
							
							
							
							
						 | 
						
							2019-02-16 18:15:45 -05:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								736e73a1f7
								
							
						 | 
						
							
							
								
								Took away explicit deleting since the next_state variable gets used in another slot
							
							
							
							
							
						 | 
						
							2019-02-14 22:01:13 -05:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								2caf869fd6
								
							
						 | 
						
							
							
								
								Added numba as a dependency and decorated the Prioritized Replay function
							
							
							
							
							
						 | 
						
							2019-02-14 21:42:31 -05:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								19a859a4f6
								
							
						 | 
						
							
							
								
								If memory or logger does not exist, then don't create those shared memory structures
							
							
							
							
							
						 | 
						
							2019-02-14 21:06:44 -05:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								460d4c05c1
								
							
						 | 
						
							
							
								
								Fixed EnvironmentRun to be properly multiprocess.
							
							
							
							
							
							
							
							Fixed the priority of bad states to be the smallest
[TODO] Make EnvironmentEpisode properly multiprocess 
							
						 | 
						
							2019-02-13 23:47:37 -05:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								115543d201
								
							
						 | 
						
							
							
								
								Fixed parallel implementation of getting experiences by using a queue
							
							
							
							
							
						 | 
						
							2019-02-13 00:36:23 -05:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								5094ed53af
								
							
						 | 
						
							
							
								
								Updated examples to have new features
							
							
							
							
							
						 | 
						
							2019-02-11 10:23:11 -05:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								fe97a9b78d
								
							
						 | 
						
							
							
								
								Corrected typo
							
							
							
							
							
						 | 
						
							2019-02-11 00:00:34 -05:00 | 
						
						
							
							
							
							
								
							
							
						 | 
					
				
					
						| 
							
						 | 
						
							
							
							
							
								
							
							
								be637664e7
								
							
						 | 
						
							
							
								
								Added collections import
							
							
							
							
							
						 | 
						
							2019-02-10 23:59:29 -05:00 | 
						
						
							
							
							
							
								
							
							
						 |