website/content/research/reinforcementlearning.md

---
Title: Reinforcement Learning
Description: The study of optimally mapping situations to actions
---

# Reinforcement Learning
Reinforcement learning is the art of analyzing situations and mapping them to actions in order to maximize a numerical reward signal.

In this independent study, Dr. Stephen Davies and I will explore the reinforcement learning problem and its subproblems. We will go over the bandit problem and Markov decision processes, and discover how best to formulate a problem in order to **make decisions**.

I have provided a list of topics that I wish to explore in a [syllabus](syllabus).

## Readings

In order to spend more time learning, I decided to follow a textbook this time. 

*Reinforcement Learning: An Introduction* by Richard S. Sutton and Andrew G. Barto


[Reading Schedule](readings) 


## Notes

The notes for this course are going to be an extremely summarized version of the textbook. There will also be notes on whatever side tangents Dr. Davies and I explore.

[Notes page](notes)

I wrote a short, quirky report describing the bandit problem. It is a good way to learn about the common considerations in reinforcement learning problems.

[The Bandit Report](/files/research/TheBanditReport.pdf)
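
To give a flavor of the bandit problem, here is a minimal epsilon-greedy sketch. It is not taken from the report or the repository. The function name `run_bandit`, the Gaussian reward noise, the arm means, and the default `epsilon` are all assumptions chosen for illustration.

```python
import random


def run_bandit(true_means, epsilon=0.1, steps=1000, seed=0):
    """Epsilon-greedy agent on a Gaussian multi-armed bandit (illustrative only).

    Returns the estimated value of each arm and how often each arm was pulled.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # running average reward per arm
    counts = [0] * n_arms       # number of pulls per arm

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Assumed reward model: the arm's true mean plus unit-variance noise.
        reward = rng.gauss(true_means[arm], 1.0)

        # Incremental sample-average update of the pulled arm's estimate.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates, counts


if __name__ == "__main__":
    estimates, counts = run_bandit([0.2, 0.5, 1.5])
    print("estimates:", estimates)
    print("pulls per arm:", counts)
```

A smaller `epsilon` makes the agent exploit its current estimates more often, while a larger one makes it explore more. Balancing the two is one of the common considerations the bandit problem brings out.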

## Code

Code will occasionally be written to solidify the learning material and to act as aids for more exploration. 

[GitHub repository](https://github.com/brandon-rozek/ReinforcementLearning)

If you want to see the agents I've created to solve some OpenAI environments, take a look at this folder in the GitHub repository:

[Agents folder](https://github.com/Brandon-Rozek/ReinforcementLearning/tree/master/agents)

Website snapshot 2020-01-16 02:51:49 +00:00