A simple, high-quality voice conversion tool focused on ease of use and performance.
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
A desktop application that uses AI to translate voice between languages in real time, while preserving the speaker's tone and emotion.
Code for NeurIPS 2023 paper "DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation".
If you've ever wanted to talk to your AI Waifu with high-quality characters and voices, try Soul of Waifu. Don't miss the opportunity to touch your dream!
💬 "Realtime" voice transcription and cloning using ElevenLabs's API.
Speech-to-text-to-speech using ElevenLabs.
Code for the INTERSPEECH 2023 paper "Learning When to Speak: Latency and Quality Trade-offs for Simultaneous Speech-to-Speech Translation with Offline Models"
Chatter Box is an Android app capable of voice, text, and image-text translation, as well as end-to-end chat translation.
A user-friendly interface for ElevenLabs' API with added audio transcription capability.
Systems submitted to IWSLT 2022 by the MT-UPC group.
Speech-to-Speech translation dataset for German and English (text and speech quadruplets).
This repository contains the code for a speech-to-speech translation system, built from scratch, for translating digits from English to Tamil.
CtrlSpeak is a voice assistant activated with [Control]+Q, listening and responding only when you want.
3-month project on artificial intelligence in teams of 3 with Manon Duboscq and Léa Mariot
A Flask web page hosting a speech-to-speech translation demo.
End-to-End AI Voice Assistant pipeline with Whisper for Speech-to-Text, Hugging Face LLM for response generation, and Edge-TTS for Text-to-Speech. Features include Voice Activity Detection (VAD), tunable parameters for pitch, gender, and speed, and real-time response with latency optimization.
A speech-to-speech real-time translation bot for Discord
A lightweight tool to quickly customize LLM chatbot workflow pipelines, such as Text-to-Text, Text-to-Speech, or Speech-to-Speech.