⏲ Reinforcement Learning
Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.
Genetic Algorithm and Neural Network for the snake game.
This repo contains the syllabus of the Hugging Face Deep Reinforcement Learning Course.
The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement …
Tic Tac Toe with the AlphaZero method - My first work
Code and other material for the book "Deep Learning and the Game of Go"
Python Implementation of Reinforcement Learning: An Introduction
A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more
Minimal AlphaZero in PyTorch, trained on Connect4 on a 6x6 board.
A simple Gomoku game written in Python with the help of Streamlit
The absolute most basic example of AlphaZero and Monte Carlo Tree Search I could come up with
Implementations of basic RL algorithms with minimal lines of code! (PyTorch based)
An implementation of an improved AlphaGo algorithm for the game of Gomoku.
Contains high quality implementations of Deep Reinforcement Learning algorithms written in PyTorch
An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)
Train transformer language models with reinforcement learning.
[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios (awesome MCTS)
A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
A minimal example of aligning language models with RLHF, similar to ChatGPT
EasyRLHF aims to provide an easy and minimal interface to train aligned language models, using off-the-shelf solutions and datasets
A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
A Python implementation of active inference for Markov Decision Processes