View stevenhillis's full-sized avatar

Sequence alignment methods with helpers for PyTorch.

Python 24 3 Updated Nov 30, 2022

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Python 4,760 389 Updated Aug 10, 2024

Python forced alignment

Jupyter Notebook 71 4 Updated Apr 12, 2024

The reproduced code for Google's SoundStorm

Python 242 18 Updated Oct 7, 2023

Implementation of SoundStorm, Efficient Parallel Audio Generation from Google Deepmind, in Pytorch

Python 1,288 80 Updated Sep 15, 2024

Multilingual and Controllable Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart.

Python 1,387 158 Updated Sep 25, 2024

Making large AI models cheaper, faster and more accessible

Python 38,660 4,333 Updated Sep 26, 2024

[WIP] VoiceSmith makes training text to speech models easy.

Python 218 32 Updated Oct 10, 2022

QuartzNet implementation in PyTorch [https://arxiv.org/abs/1910.10261]

Jupyter Notebook 26 7 Updated Jul 16, 2021

This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.

Python 350 40 Updated Sep 25, 2024

A free online portfolio website to showcase your photos.

JavaScript 955 991 Updated Aug 20, 2024

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Python 4,048 398 Updated Sep 24, 2024

Python interface to the WebRTC Voice Activity Detector

C 2,020 406 Updated Jul 4, 2024

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Jupyter Notebook 5,975 757 Updated Sep 26, 2024

💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies

1,265 137 Updated Jun 6, 2024

Official Pytorch implementation of CutMix regularizer

Python 1,216 159 Updated Sep 16, 2020

In defence of metric learning for speaker recognition

Python 1,030 272 Updated Mar 26, 2024

A PyTorch-based Speech Toolkit

Python 8,609 1,368 Updated Sep 25, 2024

The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain users can easily create speech processing systems, ranging from speech recognition (both HMM/DN…

HTML 363 29 Updated Sep 26, 2024

A python package to build AI-powered real-time audio applications

Python 1,006 86 Updated Jul 8, 2024

A fast and lightweight python-based CTC beam search decoder for speech recognition.

Python 421 89 Updated Jul 13, 2023

Working online speech recognition based on RNN Transducer. (Trained model available in the releases.)

Python 288 44 Updated Aug 5, 2021

A PyPI package for fast word/character error rate (WER/CER) calculation

Python 67 14 Updated Jul 1, 2023

OpenMMLab Detection Toolbox and Benchmark

Python 29,183 9,390 Updated Aug 21, 2024

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Python 132,607 26,421 Updated Sep 26, 2024

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Python 82,476 22,197 Updated Sep 26, 2024

📹 Library for making playlists and scraping youtube videos - alternative to pafy, pytube, and youtube-dl.

Python 7 5 Updated Feb 13, 2022

A collection of basic python modules for spoken natural language processing

Python 56 15 Updated Dec 1, 2019

The PyTorch-based audio source separation toolkit for researchers

Python 2,230 422 Updated Jul 19, 2024