Starred repositories
[NeurIPS 2020] Official code for the paper "DeepSVG: A Hierarchical Generative Network for Vector Graphics Animation". Includes a PyTorch library for deep learning with SVG data.
✨✨ Latest Advances on Multimodal Large Language Models
Investment Research for Everyone, Everywhere.
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
Code for the paper Hybrid Spectrogram and Waveform Source Separation
An Open-Source Python3 tool for recognizing layouts, tables, math formulas (LaTeX), and text in images, converting them into Markdown format. A free alternative to Mathpix, empowering seamless conv…
Bring projects, wikis, and teams together with AI. AppFlowy is an AI collaborative workspace where you achieve more without losing control of your data. The best open source alternative to Notion.
Explore a comprehensive collection of resources, tutorials, papers, tools, and best practices for fine-tuning Large Language Models (LLMs). Perfect for ML practitioners and researchers!
Private chat with local GPT with documents, images, video, etc. 100% private, Apache 2.0. Supports Ollama, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/
Finetune Llama 3.1, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
This UI serves as a Synthetic ASR Dataset Generator powered by/for OpenAI Whisper, enabling users to capture audio, transcribe it on the fly, and manage the generated dataset 🤗. Fine tune Whisper…
Transcription, forced alignment, and audio indexing with OpenAI's Whisper
An Android keyboard that performs speech-to-text (STT/ASR) with OpenAI Whisper and inputs the recognized text; supports English, Chinese, Japanese, etc., and even mixed languages.
Deep Learning Book Chinese Translation
The world's simplest facial recognition API for Python and the command line
A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python
PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
Large Action Model framework to develop AI Web Agents
🐙 Guides, papers, lectures, notebooks and resources for prompt engineering
GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.
H2O LLM Studio - a framework and no-code GUI for fine-tuning LLMs. Documentation: https://docs.h2o.ai/h2o-llmstudio/
Interact with your documents using the power of GPT, 100% privately, no data leaks
WaveGAN: Learn to synthesize raw audio with generative adversarial networks
Unsupervised Word Segmentation for Neural Machine Translation and Text Generation
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Faster Whisper transcription with CTranslate2