Stars

⏲ Reinforcement Learning

RL examples including RLHF
40 repositories

Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.

Python 2,332 440 Updated Apr 29, 2024
Python 5 Updated Aug 26, 2023

Genetic Algorithm and Neural Network for the snake game.

Python 21 6 Updated Feb 21, 2021

This repo contains the syllabus of the Hugging Face Deep Reinforcement Learning Course.

MDX 3,815 585 Updated Sep 12, 2024

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement …

C# 16,928 4,135 Updated Sep 14, 2024

Tic Tac Toe with Alpha Zero method - My first work

Python 16 4 Updated Aug 23, 2018

Code and other material for the book "Deep Learning and the Game of Go"

Python 971 385 Updated Aug 2, 2024

Python Implementation of Reinforcement Learning: An Introduction

Python 13,480 4,814 Updated Aug 9, 2024

A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more

Jupyter Notebook 3,831 1,027 Updated Jun 6, 2024

Minimal AlphaZero in PyTorch, trained on Connect4 on a 6x6 board.

Python 11 2 Updated Aug 12, 2022

A simple gomoku game written in Python and with the help of Streamlit

Python 6 1 Updated Jan 12, 2024

MuZero

Python 2,464 606 Updated Sep 3, 2024

The absolute most basic example of AlphaZero and Monte Carlo Tree Search I could come up with

Python 181 33 Updated Apr 3, 2023

Implementations of basic RL algorithms with minimal lines of code! (PyTorch-based)

Python 2,838 458 Updated Apr 22, 2023

An implementation of improved AlphaGo algorithm in the game of Gomoku.

Python 57 20 Updated Nov 12, 2019

Contains high quality implementations of Deep Reinforcement Learning algorithms written in PyTorch

Jupyter Notebook 1,052 325 Updated May 19, 2021

An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)

Python 3,273 964 Updated Apr 24, 2024

A Gomoku game implemented with neural-network Q-Learning

Python 18 14 Updated Apr 12, 2017

Monte Carlo tree search in JAX

Python 2,313 187 Updated Jul 25, 2024

AlphaZero in JAX

Python 68 18 Updated Apr 3, 2024

Train transformer language models with reinforcement learning.

Python 9,316 1,168 Updated Sep 19, 2024

[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios (awesome MCTS)

Python 1,050 109 Updated Sep 20, 2024

A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.

Python 761 59 Updated Jul 1, 2024

Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM

Python 7,671 668 Updated Jan 14, 2024

A minimum example of aligning language models with RLHF similar to ChatGPT

Python 210 28 Updated Sep 26, 2023

EasyRLHF aims to provide an easy and minimal interface to train aligned language models, using off-the-shelf solutions and datasets

Python 6 Updated Dec 12, 2023

[NIPS2023] RRHF & Wombat

Python 789 49 Updated Sep 22, 2023

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)

Python 4,447 470 Updated Jan 8, 2024

A Python implementation of active inference for Markov Decision Processes

Python 445 86 Updated Sep 20, 2024