Stars
NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.
A voice cloning tool with a web interface that uses your own voice or any other sound to record audio.
Text utilities, including beam search decoding, tokenizing, and more, built for use in Flashlight.
Official inference repo for FLUX.1 models
Fast and memory-efficient exact attention
YaRN: Efficient Context Window Extension of Large Language Models
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
A Python package for the Whisper normalizer
Qdrant - High-performance, massive-scale Vector Database for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
Make PyTorch models up to 40% faster! Thunder is a source-to-source compiler for PyTorch. It enables using different hardware executors at once, across one or thousands of GPUs.
Open-Sora: Democratizing Efficient Video Production for All
Diff Match Patch is a high-performance library in multiple languages that manipulates plain text.
This is the Python library for an unsupervised, fast method for robust voice activity detection (rVAD), as described in the paper "rVAD: An Unsupervised Segment-Based Robust Voice Activity Detection Method".
Convert WebVTT to JSON, optionally removing duplicate lines
Tools for merging pretrained large language models.
A tool for enriching the output of nvidia-smi.
A joint community effort to create one central leaderboard for LLMs.
Universal LLM Deployment Engine with ML Compilation
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translatio…
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
ModelScope: bring the notion of Model-as-a-Service to life.
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding