snoop2head

leg godt

271 followers · 254 following

KAIST AI
Seoul, South Korea
snoop2head.github.io

Stars

⏲ Reinforcement Learning

RL examples including RLHF

40 repositories

kzl / decision-transformer

Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.

Python 2,332 440 Updated Apr 29, 2024

kkugosu / RL_BASIC

Python 5 Updated Aug 26, 2023

arthurdjn / snake-reinforcement-learning

Genetic Algorithm and Neural Network for the snake game.

Python 21 6 Updated Feb 21, 2021

huggingface / deep-rl-class

This repo contains the syllabus of the Hugging Face Deep Reinforcement Learning Course.

MDX 3,815 585 Updated Sep 12, 2024

Unity-Technologies / ml-agents

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement …

C# 16,928 4,135 Updated Sep 14, 2024

kekmodel / gym-tictactoe-zero

Tic Tac Toe with Alpha Zero method - My first work

Python 16 4 Updated Aug 23, 2018

patrickloeber / snake-ai-pytorch

Python 610 395 Updated Jun 11, 2024

maxpumperla / deep_learning_and_the_game_of_go

Code and other material for the book "Deep Learning and the Game of Go"

Python 971 385 Updated Aug 2, 2024

ShangtongZhang / reinforcement-learning-an-introduction

Python Implementation of Reinforcement Learning: An Introduction

Python 13,480 4,814 Updated Aug 9, 2024

suragnair / alpha-zero-general

A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more

Jupyter Notebook 3,831 1,027 Updated Jun 6, 2024

zhihanyang2022 / alpha-zero

Minimal AlphaZero in PyTorch, trained on Connect4 on a 6x6 board.

Python 11 2 Updated Aug 12, 2022

TeddyHuang-00 / streamlit-gomoku

A simple gomoku game written in Python and with the help of Streamlit

Python 6 1 Updated Jan 12, 2024

werner-duvaud / muzero-general

MuZero

Python 2,464 606 Updated Sep 3, 2024

JoshVarty / AlphaZeroSimple

The absolute most basic example of AlphaZero and Monte Carlo Tree Search I could come up with

Python 181 33 Updated Apr 3, 2023

seungeunrho / minimalRL

Implementations of basic RL algorithms with minimal lines of codes! (pytorch based)

Python 2,838 458 Updated Apr 22, 2023

PolyKen / 15_by_15_AlphaGomoku

An implementation of improved AlphaGo algorithm in the game of Gomoku.

Python 57 20 Updated Nov 12, 2019

qfettes / DeepRL-Tutorials

Contains high quality implementations of Deep Reinforcement Learning algorithms written in PyTorch

Jupyter Notebook 1,052 325 Updated May 19, 2021

junxiaosong / AlphaZero_Gomoku

An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)

Python 3,273 964 Updated Apr 24, 2024

deepseasw / OmokQLearning

An Omok (Gomoku) game implemented with neural-network Q-Learning

Python 18 14 Updated Apr 12, 2017

google-deepmind / mctx

Monte Carlo tree search in JAX

Python 2,313 187 Updated Jul 25, 2024

NTT123 / a0-jax

AlphaZero in JAX

Python 68 18 Updated Apr 3, 2024

huggingface / trl

Train transformer language models with reinforcement learning.

Python 9,316 1,168 Updated Sep 19, 2024

opendilab / LightZero

[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios (awesome MCTS)

Python 1,050 109 Updated Sep 20, 2024

tatsu-lab / alpaca_farm

A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.

Python 761 59 Updated Jul 1, 2024

lucidrains / PaLM-rlhf-pytorch

Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM

Python 7,671 668 Updated Jan 14, 2024

ethanyanjiali / minChatGPT

A minimum example of aligning language models with RLHF similar to ChatGPT

Python 210 28 Updated Sep 26, 2023

DaehanKim / EasyRLHF

EasyRLHF aims to provide an easy and minimal interface to train aligned language models, using off-the-shelf solutions and datasets

Python 6 Updated Dec 12, 2023

GanjinZero / RRHF

[NIPS2023] RRHF & Wombat

Python 789 49 Updated Sep 22, 2023

CarperAI / trlx

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)

Python 4,447 470 Updated Jan 8, 2024

infer-actively / pymdp

A Python implementation of active inference for Markov Decision Processes

Python 445 86 Updated Sep 20, 2024
