Starred repositories
Foundational model for human-like, expressive TTS
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
This repository provides the code and model checkpoints of the research paper: Scalable Pre-training of Large Autoregressive Image Models
An Open Source text-to-speech system built by inverting Whisper.
Enhanced ChatGPT Clone: Features Anthropic, AWS, OpenAI, Assistants API, Azure, Groq, o1, GPT-4o, Mistral, OpenRouter, Vertex AI, Gemini, Artifacts, AI model switching, message search, langchain, D…
Instant voice cloning by MIT and MyShell.
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
llama.cpp with the BakLLaVA model describes what it sees
A programming framework for agentic AI 🤖
A state-of-the-art open visual language model | multimodal pre-trained model
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
⏩ Continue is the leading open-source AI code assistant. You can connect any models and any context to build custom autocomplete and chat experiences inside VS Code and JetBrains
WebUI for Fine-Tuning and Self-hosting of Open-Source Large Language Models for Coding
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
Easily train or fine-tune SOTA computer vision models with one open source training library. The home of YOLO-NAS.
Toy Gaussian Splatting visualization in Unity
Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available at https://plachtaa.github.io/vallex/
Search images with a text or image query, using OpenAI's pretrained CLIP model.
Foundational Models for State-of-the-Art Speech and Text Translation
Simple, open source, lightweight (< 1 KB) and privacy-friendly web analytics alternative to Google Analytics.
♾ Infisical is the open-source secret management platform: Sync secrets across your team/infrastructure, prevent secret leaks, and manage internal PKI
Driver for the VL53L1 time-of-flight sensor in pure Rust.