SWivid

Text-to-Music Generation with Rectified Flow Transformers

Python 1,451 110 Updated Sep 6, 2024

VITS with phoneme-level prosody modeling based on MaskGIT

Python 72 7 Updated Aug 31, 2024
Python 44 6 Updated Sep 3, 2024

Speech To Speech: an effort for an open-sourced and modular GPT4-o

Python 3,035 321 Updated Sep 20, 2024

Kaldi-compatible online & offline feature extraction with PyTorch, supporting CUDA, batch processing, chunk processing, and autograd - Provide C++ & Python API

C++ 186 35 Updated Sep 14, 2024

[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"

Python 299 20 Updated Sep 3, 2024

Official Pytorch Implementation of Our CVPR2023 Paper: "Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization"

Python 149 6 Updated Jul 23, 2023

Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in Pytorch

Python 239 21 Updated Sep 11, 2024

Various small tools built for new video post-production workflows

Python 18 Updated Apr 14, 2024

Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch

Python 593 50 Updated Feb 16, 2024

TorchCFM: a Conditional Flow Matching library

Python 1,061 83 Updated Aug 21, 2024

Training code for FAcodec presented in NaturalSpeech3

Python 158 17 Updated Aug 26, 2024

Fast and memory-efficient exact attention

Python 226 18 Updated Jul 26, 2024

Inference code for Audiodec-Valle-Wenetspeech4TTS

Python 43 2 Updated Jul 14, 2024

An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io/vallex/

Python 7,570 756 Updated Feb 11, 2024

This is a Python package for NISQA.

Python 4 2 Updated Apr 9, 2024

Chinese Mandarin Grapheme-to-Phoneme Converter. Converts Chinese text to Zhuyin or Pinyin (INTERSPEECH 2022)

Python 276 37 Updated Jun 16, 2024

Multilingual Voice Understanding Model

Python 2,701 258 Updated Sep 2, 2024

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Python 4,924 498 Updated Sep 19, 2024

Fast inference engine for Transformer models

C++ 3,233 286 Updated Sep 20, 2024

Faster Whisper ASR transcription with CTranslate2

Python 11 3 Updated Sep 10, 2024

Faster Whisper transcription with CTranslate2

Python 11,488 953 Updated Aug 21, 2024

Fast and memory-efficient exact attention

Python 13,478 1,234 Updated Sep 21, 2024

Foundational Models for State-of-the-Art Speech and Text Translation

Jupyter Notebook 10,770 1,049 Updated Aug 15, 2024

Brand new TTS solution

Python 12,232 927 Updated Sep 20, 2024

[IJCAI 2024] EAT: Self-Supervised Pre-Training with Efficient Audio Transformer

Python 99 3 Updated Apr 19, 2024

LLM Frontend for Power Users.

JavaScript 7,579 2,175 Updated Sep 22, 2024

Layer-wise analysis of self-supervised pre-trained speech representations

Python 89 15 Updated Aug 8, 2024

ChatGLM-6B: An Open Bilingual Dialogue Language Model

Python 40,425 5,187 Updated Jun 27, 2024