Solving OpenAI Gym With Deep Q-Learning

Learning to Play Pong!

Reinforcement Learning

Check out our exploration here!

Reinforcement Learning has played a massive role in bringing us closer to Artificial General Intelligence. It is one of the few areas where it feels like the model is replicating a human action that requires true cognitive effort. The OpenAI Gym platform offers a collection of environments, including video games that humans can play relatively easily, and we can train a machine to play them even better!

What makes Reinforcement Learning different from other techniques is the inclusion of a Reward Function. Where most Machine Learning tasks are supervised by labels, reinforcement learning reformulates the problem: the Deep Learning model is an agent in an environment. It takes its current state within that environment as input and outputs an action for that state. Each action the model takes is given some reward (or lack thereof), and that reward signal supervises the learning process so the model can learn the optimal action to take in every state.
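To make the agent/environment loop concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `CorridorEnv`, `run_episode`, and the two policies are hypothetical names, not part of OpenAI Gym and not the code from our exploration. The environment is a tiny corridor where the agent only gets a reward when it reaches the far end.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: walk right along a corridor to reach the goal."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        # Start every episode at the left end and return the initial state.
        self.position = 0
        return self.position

    def step(self, action):
        # action 1 moves right, anything else moves left; walls clamp the position.
        move = 1 if action == 1 else -1
        self.position = max(0, min(self.length - 1, self.position + move))
        done = self.position == self.length - 1
        reward = 1.0 if done else 0.0  # reward only when the goal is reached
        return self.position, reward, done


def run_episode(env, policy, max_steps=100):
    """Run one episode: observe state, pick an action, collect the reward."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(state)
        state, reward, done = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward


def random_policy(state):
    # An untrained agent: ignores the state and acts at random.
    return random.choice([0, 1])


def always_right(state):
    # The optimal policy for this particular toy environment.
    return 1


if __name__ == "__main__":
    print("random policy:", run_episode(CorridorEnv(), random_policy))
    print("always right: ", run_episode(CorridorEnv(), always_right))
```

The loop in `run_episode` is the whole idea in miniature: state in, action out, reward back. Training is about replacing `random_policy` with something that learns from those rewards.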

The other concept that allows the algorithm to determine which action to take is balancing a step taken for a current reward against one taken for a future reward. Essentially, we want the model to take actions that maximize future rewards rather than just grabbing whatever reward is available in the current state, but we can weight this trade-off (typically with a discount factor) so that we don't ignore the current low-hanging fruit.
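The trade-off between current and future rewards is usually written with a discount factor, often called gamma. Below is a minimal sketch of two standard pieces: the discounted return of a reward sequence, and the one-step Q-learning target. The function names `discounted_return` and `q_target` are assumptions for illustration, and this is the textbook formulation rather than the exact code from our exploration.

```python
def discounted_return(rewards, gamma):
    """Sum of rewards where the reward k steps ahead is scaled by gamma**k."""
    total = 0.0
    # Work backwards so each step is: reward now + gamma * (everything after).
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


def q_target(reward, next_q_values, gamma, done):
    """One-step Q-learning target: reward now plus the discounted best next value."""
    if done:
        # No future after a terminal state, so only the immediate reward counts.
        return reward
    return reward + gamma * max(next_q_values)


if __name__ == "__main__":
    rewards = [1.0, 0.0, 0.0, 10.0]  # small reward now, big reward later
    for gamma in (0.0, 0.5, 1.0):
        print(f"gamma={gamma}: return={discounted_return(rewards, gamma)}")
```

With gamma at 0 only the immediate reward counts, and with gamma at 1 every future reward counts as much as the current one. Values in between are how we tell the agent to care about the future without ignoring what is right in front of it.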

There's a ton more I want to explore, from multi-agent setups to actor-critic methods, but that will be in the next one!
