We're going to be building and training a deep Q-network to learn to balance a pole on a moving cart. This is widely known as the cart and pole problem.

AI-Ahmed/DQN-OpenAI-CartPole-PyTorch

Deep Q-Network - Reinforcement Learning with PyTorch

We'll be using OpenAI's Gym toolkit to set up our cart and pole environment.

Image Snapped from Deeplizard

Image Snapped from Packt – Hands-On Q-Learning with Python


Q-Learning Algorithm

  1. Initialize replay memory capacity.

  2. Initialize the policy network with random weights.

  3. Clone the policy network, and call it the target network.

  4. For each episode:

  • Initialize the starting state.

  • For each time step:

    1. Select an action.
       • Via exploration or exploitation.

    2. Execute selected action in an emulator.

    3. Observe reward and next state.

    4. Store experience in replay memory.

    5. Sample random batch from replay memory.

    6. Preprocess states from batch.

    7. Pass batch of preprocessed states to policy network.

    8. Calculate loss between output Q-values and target Q-values.
       • Requires a pass to the target network for the next state.

    9. Gradient descent updates weights in the policy network to minimize loss.
       • After x time steps, weights in the target network are updated to the weights in the policy network.
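
Several of the steps above (storing and sampling experiences, choosing between exploration and exploitation, and forming the target Q-value) can be illustrated without any deep learning library. The following is a minimal sketch in plain Python, not the code from this repository: the names `Experience`, `ReplayMemory`, `exploration_rate`, `select_action`, and `td_target`, the exponential decay schedule, and all default values are assumptions chosen for illustration. A real implementation would typically operate on PyTorch tensors and take the Q-values from the policy and target networks.

```python
import math
import random
from collections import deque, namedtuple

# One stored transition. Field names are illustrative assumptions.
Experience = namedtuple(
    "Experience", ["state", "action", "reward", "next_state", "done"]
)


class ReplayMemory:
    """Fixed-capacity buffer; the oldest experience is dropped when full."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, experience):
        self.buffer.append(experience)

    def can_sample(self, batch_size):
        return len(self.buffer) >= batch_size

    def sample(self, batch_size):
        # Uniform random batch without replacement.
        return random.sample(list(self.buffer), batch_size)


def exploration_rate(step, start=1.0, end=0.01, decay=0.001):
    """Assumed schedule: decay epsilon exponentially from start toward end."""
    return end + (start - end) * math.exp(-decay * step)


def select_action(q_values, epsilon):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))  # explore: random action
    # exploit: index of the largest Q-value
    return max(range(len(q_values)), key=q_values.__getitem__)


def td_target(reward, next_q_values, done, gamma=0.99):
    """Target Q-value: reward + gamma * max over next-state Q-values.

    next_q_values stands in for the target network's output for the
    next state; at the end of an episode the target is the reward alone.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)
```

In this sketch, the loss in step 8 would compare the policy network's Q-value for the action actually taken against `td_target(...)`, where `next_q_values` comes from the target network. Keeping the target network's weights fixed between periodic updates keeps those targets from shifting on every gradient step.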
