Stars
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
StoryMaker: Towards consistent characters in text-to-image generation
face-cluster-by-infomap: an unsupervised face clustering method that achieves SOTA results on open-source datasets
Research Code for Multimodal-Cognition Team in Ant Group
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with …
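🔥 Enterprise-grade low-code platform with a decoupled front-end/back-end architecture: SpringBoot 2.x/3.x, SpringCloud, Ant Design & Vue3, Mybatis, Shiro, JWT. A powerful code generator produces front-end and back-end code in one click, with no hand-written code required. It promotes a new development workflow (OnlineCoding -> code generation -> manual MERGE) that removes 70% of the repetitive work in Java projects and lets developers focus on business logic, quickly improving efficiency and saving the company costs, while at the same time not losing…
A visual no-code/code-free web crawler/spider. 易采集: a visual browser automation testing, data collection, and crawler tool that lets you design and run crawling tasks graphically without writing code. Also known as ServiceWrapper, an intelligent service wrapping system for web applications.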
Real-time face swap and one-click video deepfake with only a single image
SGLang is a fast serving framework for large language models and vision language models.
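A tool dedicated to watermarking and masking images, built entirely on browser-local APIs with no network requests (especially suitable for sensitive documents such as ID cards)
agentUniverse is an LLM multi-agent framework that allows developers to easily build multi-agent applications.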
Use PEFT or Full-parameter to finetune 350+ LLMs or 90+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vi…
Ikaros-521 / AI-Vtuber
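Forked from sandboxdream/AI-Vtuber
AI Vtuber is a virtual streamer [Live2D/UE/xuniren] driven by [ChatterBot/ChatGPT/claude/langchain/chatglm/text-gen-webui/Wenda/Qwen/kimi/ollama] that can interact with viewers in real time during live streams on [Bilibili/Douyin/Kuaishou/WeChat Channels/Pinduoduo/Douyu/YouTube/twitch/TikTok], or chat directly on the local machine…
A digital human video player with an HTTP API; it uses the gradio API to connect to Easy-Wav2Lip, Sadtalker, GeneFacePlusPlus, and MuseTalk, and can also play local videos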
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…
[NeurIPS 2024] Official code for PuLID: Pure and Lightning ID Customization via Contrastive Alignment
Production First and Production Ready End-to-End Keyword Spotting Toolkit
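This project implements Wav2lip video lip synthesis on top of SadTalkers. It generates lip movements driven by speech from a video file, and applies a configurable enhancement to the synthesized lip (face) region to improve the clarity of the generated lips. It uses DAIN, a deep-learning frame-interpolation algorithm, to fill in frames in the generated video, smoothing the transitions between synthesized lip movements so that the lips look more fluid, realistic, and natural.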
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
Comprehensive and timely academic information on federated learning (papers, frameworks, datasets, tutorials, workshops)