[ECCV 2024] Official Repository for DiffiT: Diffusion Vision Transformers for Image Generation

435 14 Updated Jul 1, 2024

Includes the code for training and testing the CountGD model from the paper CountGD: Multi-Modal Open-World Counting.

Python 55 8 Updated Jul 11, 2024

📈 The largest collection to date of industrial defect detection datasets and papers. A continuously updated summary of open-source datasets and key papers in the field of surface defect research.

Python 3,021 520 Updated May 27, 2024

Unsupervised training on positive samples only; detects defects and segments images

Python 21 5 Updated Nov 1, 2021

Segment Anything in Defect Detection

Jupyter Notebook 15 3 Updated Jun 2, 2024

Brand new TTS solution

Python 12,117 919 Updated Sep 20, 2024

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Python 3,750 286 Updated Sep 19, 2024

Dead simple FLUX LoRA training UI with LOW VRAM support

Python 732 48 Updated Sep 17, 2024

[ICLR 2023] Official implementation of the paper "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"

Python 2,172 235 Updated Jul 31, 2024

[CVPR2024 Highlight]GLEE: General Object Foundation Model for Images and Videos at Scale

Python 1,029 82 Updated Aug 8, 2024

[CVPR 2024] Official implementation of the paper "Visual In-context Learning"

Python 364 16 Updated Apr 8, 2024

API for Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series

Python 716 22 Updated Aug 9, 2024

Tutorials for creating and using ONNX models

Jupyter Notebook 3,340 626 Updated Jul 15, 2024

📻Terminal/ssh/telnet/serialport/RDP/VNC/sftp client(linux, mac, win)

JavaScript 11,006 938 Updated Sep 21, 2024

Labeling tool with SAM (Segment Anything Model); supports SAM, SAM2, sam-hq, MobileSAM, EdgeSAM, etc. An interactive, semi-automatic image annotation tool.

Python 1,209 129 Updated Sep 14, 2024

OLMoE: Open Mixture-of-Experts Language Models

Jupyter Notebook 369 27 Updated Sep 17, 2024

Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Python 2,093 119 Updated Sep 20, 2024

⚡️HivisionIDPhotos: a lightweight and efficient AI tool and algorithm for making ID photos.

Python 9,826 925 Updated Sep 20, 2024

High-resolution models for human tasks.

Python 3,926 201 Updated Sep 20, 2024

【CVPR 2024 Highlight】Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models

Python 1,775 123 Updated Sep 5, 2024

A Powerful web scraper powered by LLM | OpenAI, Gemini & Ollama

Python 1,043 106 Updated Sep 10, 2024

Nanolog is an extremely performant nanosecond scale logging system for C++ that exposes a simple printf-like API.

C++ 2,967 342 Updated Mar 4, 2024

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)

Python 1,803 145 Updated Sep 17, 2024

mimalloc is a compact general purpose allocator with excellent performance.

C 10,401 840 Updated Aug 22, 2024

Open-source vector similarity search for Postgres

C 11,830 538 Updated Sep 21, 2024

MixTeX: multimodal LaTeX, ZhEn, and table OCR. It performs efficient CPU-based inference locally and offline on Windows.

Python 654 30 Updated Sep 4, 2024

🔊 Text-Prompted Generative Audio Model

Jupyter Notebook 35,371 4,157 Updated Aug 19, 2024

Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining"

Python 461 19 Updated Aug 16, 2024

Automatically quantize GGUF models

Python 119 11 Updated Sep 20, 2024

[ECCV 2022] This is the official implementation of BEVFormer, a camera-only framework for autonomous driving perception, e.g., 3D object detection and semantic map segmentation.

Python 3,245 527 Updated Aug 15, 2024