Stars
OpenDILab Decision AI Engine. The Most Comprehensive Reinforcement Learning Framework B.P.
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.
(SIGGRAPH 2024) Official repository for "Taming Diffusion Probabilistic Models for Character Control"
An open-source Chinese font derived from Fontworks' Klee One.
⚡️HivisionIDPhotos: a lightweight and efficient AI ID photo tool.
AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio a…
Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
An optical music recognition (OMR) system. Converts sheet music to a machine-readable version.
Code Implementation, Evaluations, Documentation, Links and Resources for the Min P paper
Speech to text (PocketSphinx, iFlytek API, Baidu API) and text to speech (pyttsx3)
Chinese speech pretrained models
SoftVC VITS Singing Voice Conversion
MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning
Transformer Explained Visually: Learn How LLM Transformer Models Work with Interactive Visualization
The code release for https://image-dream.github.io/
[ECCV 2024] Single Image to 3D Textured Mesh in 10 seconds with Convolutional Reconstruction Model.
Vencord installer with stereo and other stuff (from philhk)
A FER, AFINN and Heart-Monitoring system to integrate into NLP/LLM, Game Engines, and more, by detecting the player's heart rate and emotional state throughout gameplay to create more reactive scena…
[CAAI AIR'24] Bilateral Reference for High-Resolution Dichotomous Image Segmentation
Text-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
Official inference repo for FLUX.1 models
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
The best way to write secure and reliable applications. Write nothing; deploy nowhere.
A Discord bot LLM for Voice Chat (and text)