Depth Pro: Sharp Monocular Metric Depth in Less Than a Second.
High-resolution models for human tasks.
PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from Meta AI
Official code for "RB-Modulation: Training-Free Personalization of Diffusion Models using Stochastic Optimal Control"
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) on multi-GPU Clusters
[Arxiv 2024] Official code for MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions
Use PEFT or Full-parameter to finetune 350+ LLMs or 90+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vi…
CLAY: A Controllable Large-scale Generative Model for Creating High-quality 3D Assets
A high-throughput and memory-efficient inference and serving engine for LLMs
DeepSeek-VL: Towards Real-World Vision-Language Understanding
[WIP] Layer Diffusion for WebUI (via Forge)
[CVPR 2024 Highlight] DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models
Transparent Image Layer Diffusion using Latent Transparency
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
Official Code for MotionCtrl [SIGGRAPH 2024]
🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio).
Code for the paper "Pix2Video: Video Editing using Image Diffusion"
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
A feature-rich command-line audio/video downloader
Official and maintained implementation of the paper "Differentiable JPEG: The Devil is in the Details" [WACV 2024].
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
PyTorch implementation of InstructDiffusion, a unifying and generic framework for aligning computer vision tasks with human instructions.
CoTracker is a model for tracking any point (pixel) on a video.
[CVPR 2024 Highlight] Official PyTorch implementation of CoDeF: Content Deformation Fields for Temporally Consistent Video Processing