Yushen CHEN SWivid

2 followers · 1 following

Highlights

Pro

Stars

BytedanceSpeech / seed-tts-eval

Python 944 97 Updated Jun 14, 2024

feizc / FluxMusic

Text-to-Music Generation with Rectified Flow Transformers

Python 1,451 110 Updated Sep 6, 2024

innnky / MagVITS

VITS with phoneme-level prosody modeling based on MaskGIT

Python 72 7 Updated Aug 31, 2024

bfs18 / e2_tts

Python 44 6 Updated Sep 3, 2024

huggingface / speech-to-speech

Speech To Speech: an effort for an open-sourced and modular GPT-4o

Python 3,035 321 Updated Sep 20, 2024

csukuangfj / kaldifeat

Kaldi-compatible online & offline feature extraction with PyTorch, supporting CUDA, batch processing, chunk processing, and autograd - Provide C++ & Python API

C++ 186 35 Updated Sep 14, 2024

X-LANCE / VoiceFlow-TTS

[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"

Python 299 20 Updated Sep 3, 2024

CrossmodalGroup / DynamicVectorQuantization

Official Pytorch Implementation of Our CVPR2023 Paper: "Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization"

Python 149 6 Updated Jul 23, 2023

lucidrains / e2-tts-pytorch

Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in Pytorch

Python 239 21 Updated Sep 11, 2024

Variante / video-postproc-toolbox

Various small tools built for a new video post-production workflow

Python 18 Updated Apr 14, 2024

lucidrains / voicebox-pytorch

Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch

Python 593 50 Updated Feb 16, 2024

atong01 / conditional-flow-matching

TorchCFM: a Conditional Flow Matching library

Python 1,061 83 Updated Aug 21, 2024

Plachtaa / FAcodec

Training code for FAcodec presented in NaturalSpeech3

Python 158 17 Updated Aug 26, 2024

bdashore3 / flash-attention

Forked from Dao-AILab/flash-attention

Fast and memory-efficient exact attention

Python 226 18 Updated Jul 26, 2024

dukGuo / valle-audiodec

Inference code for Audiodec-Valle-Wenetspeech4TTS

Python 43 2 Updated Jul 14, 2024

Plachtaa / VALL-E-X

An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available at https://plachtaa.github.io/vallex/

Python 7,570 756 Updated Feb 11, 2024

kale4eat / nisqalib

This is a Python package for NISQA.

Python 4 2 Updated Apr 9, 2024

GitYCC / g2pW

Chinese Mandarin Grapheme-to-Phoneme Converter. Converts Chinese text to Zhuyin or Pinyin (INTERSPEECH 2022)

Python 276 37 Updated Jun 16, 2024

FunAudioLLM / SenseVoice

Multilingual Voice Understanding Model

Python 2,701 258 Updated Sep 2, 2024

FunAudioLLM / CosyVoice

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Python 4,924 498 Updated Sep 19, 2024

OpenNMT / CTranslate2

Fast inference engine for Transformer models

C++ 3,233 286 Updated Sep 20, 2024

mobiusml / faster-whisper

Forked from SYSTRAN/faster-whisper

Faster Whisper ASR transcription with CTranslate2

Python 11 3 Updated Sep 10, 2024

SYSTRAN / faster-whisper

Faster Whisper transcription with CTranslate2

Python 11,488 953 Updated Aug 21, 2024

Dao-AILab / flash-attention

Fast and memory-efficient exact attention

Python 13,478 1,234 Updated Sep 21, 2024

facebookresearch / seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

Jupyter Notebook 10,770 1,049 Updated Aug 15, 2024

fishaudio / fish-speech

Brand new TTS solution

Python 12,232 927 Updated Sep 20, 2024

cwx-worst-one / EAT

[IJCAI 2024] EAT: Self-Supervised Pre-Training with Efficient Audio Transformer

Python 99 3 Updated Apr 19, 2024

SillyTavern / SillyTavern

LLM Frontend for Power Users.

JavaScript 7,579 2,175 Updated Sep 22, 2024

ankitapasad / layerwise-analysis

Layer-wise analysis of self-supervised pre-trained speech representations

Python 89 15 Updated Aug 8, 2024

THUDM / ChatGLM-6B

ChatGLM-6B: An Open Bilingual Dialogue Language Model

Python 40,425 5,187 Updated Jun 27, 2024