Q-Learning (1)

Purpose: To understand the basics of reinforcement learning by implementing the Q-Learning algorithm in a simple environment.
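
For reference, Q-Learning maintains a table of action-value estimates Q(s, a) and, after each transition (s, a, r, s′), updates them with the standard rule, where α is the learning rate and γ the discount factor:

$$
Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]
$$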

Background: While a graduate student at the University of Florida, I completed this project as the final assignment for a Pattern Recognition class (Spring 2022). The deliverables were a final paper and a two-part video presentation (Part 1, Part 2).

Source Code: See the GitHub repository to test the code for yourself!

Paper Abstract: In Reinforcement Learning (RL), an agent’s movement through an environment is related to the expected reward, or policy, at state-action pairs. The number of state-action pairs increases when an agent has access to more feedback channels. This paper explores how increasing the number of feedback channels influences an agent’s ability to reach a goal. The agent’s performance is evaluated in terms of the number of episodes needed to reach the goal, the mean reward across episodes prior to solving the game, and the evolution of the agent’s state-space trajectories as it learns across episodes. The agent is a two-dimensional vehicle from the Mountain Car environment distributed through OpenAI Gym [3]. Results indicate that varying feedback modalities has little to no influence on an agent’s state-space trajectory, though it does influence other performance metrics, such as the number of episodes required to train an agent. This finding may depend on the limited complexity of the environment.
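
To make the setup concrete, here is a minimal sketch of tabular Q-Learning on MountainCar-v0. This is not the project’s actual code: the bin counts, hyperparameters (alpha, gamma, epsilon), and episode budget are illustrative assumptions, and it targets the classic pre-0.26 Gym API, in which env.step() returns (obs, reward, done, info).

```python
import numpy as np
import gym

env = gym.make("MountainCar-v0")

# Discretize the continuous (position, velocity) observation into a grid.
# Bin counts here are assumed values, not those used in the paper.
n_bins = (18, 14)
low, high = env.observation_space.low, env.observation_space.high
bin_width = (high - low) / n_bins

def discretize(obs):
    """Map a continuous observation to a discrete grid cell index."""
    idx = ((obs - low) / bin_width).astype(int)
    return tuple(np.clip(idx, 0, np.array(n_bins) - 1))

# One Q-value per (position bin, velocity bin, action).
q_table = np.zeros(n_bins + (env.action_space.n,))
alpha, gamma, epsilon = 0.1, 0.99, 0.1  # assumed hyperparameters

for episode in range(5000):
    state = discretize(env.reset())
    done = False
    while not done:
        # Epsilon-greedy action selection.
        if np.random.rand() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(np.argmax(q_table[state]))
        obs, reward, done, _ = env.step(action)
        next_state = discretize(obs)
        # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        target = reward + gamma * np.max(q_table[next_state])
        q_table[state + (action,)] += alpha * (target - q_table[state + (action,)])
        state = next_state
```

Discretizing position and velocity into a grid is a common way to make Mountain Car’s continuous state space tabular; the bin counts trade resolution against table size and training time.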