helloheshee

8 followers · 74 following

Stars

NVlabs / DiffiT

[ECCV 2024] Official Repository for DiffiT: Diffusion Vision Transformers for Image Generation

435 14 Updated Jul 1, 2024

niki-amini-naieni / CountGD

Includes the code for training and testing the CountGD model from the paper CountGD: Multi-Modal Open-World Counting.

Python 55 8 Updated Jul 11, 2024

Charmve / Surface-Defect-Detection

📈 The largest industrial defect detection database and paper collection to date. Continuously summarizes the open-source datasets and key papers in the field of surface defect research.

Python 3,021 520 Updated May 27, 2024

amazingcodeLYL / Positive_sample_defect_detection

Unsupervised training on positive samples to detect defects and segment images

Python 21 5 Updated Nov 1, 2021

bozhenhhu / DefectSAM

Segment Anything in Defect Detection

Jupyter Notebook 15 3 Updated Jun 2, 2024

fishaudio / fish-speech

Brand new TTS solution

Python 12,117 919 Updated Sep 20, 2024

Ucas-HaoranWei / GOT-OCR2.0

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Python 3,750 286 Updated Sep 19, 2024

cocktailpeanut / fluxgym

Dead simple FLUX LoRA training UI with LOW VRAM support

Python 732 48 Updated Sep 17, 2024

IDEA-Research / DINO

[ICLR 2023] Official implementation of the paper "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"

Python 2,172 235 Updated Jul 31, 2024

FoundationVision / GLEE

[CVPR 2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale

Python 1,029 82 Updated Aug 8, 2024

UX-Decoder / DINOv

[CVPR 2024] Official implementation of the paper "Visual In-context Learning"

Python 364 16 Updated Apr 8, 2024

IDEA-Research / Grounding-DINO-1.5-API

API for Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series

Python 716 22 Updated Aug 9, 2024

onnx / tutorials

Tutorials for creating and using ONNX models

Jupyter Notebook 3,340 626 Updated Jul 15, 2024

electerm / electerm

📻Terminal/ssh/telnet/serialport/RDP/VNC/sftp client(linux, mac, win)

JavaScript 11,006 938 Updated Sep 21, 2024

yatengLG / ISAT_with_segment_anything

Interactive semi-automatic image labeling tool built on SAM (Segment Anything Model); supports SAM, SAM2, sam-hq, MobileSAM, EdgeSAM, etc.

Python 1,209 129 Updated Sep 14, 2024

allenai / OLMoE

OLMoE: Open Mixture-of-Experts Language Models

Jupyter Notebook 369 27 Updated Sep 17, 2024

QwenLM / Qwen2-VL

Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Python 2,093 119 Updated Sep 20, 2024

Zeyi-Lin / HivisionIDPhotos

⚡️HivisionIDPhotos: a lightweight and efficient AI tool for creating ID photos.

Python 9,826 925 Updated Sep 20, 2024

facebookresearch / sapiens

High-resolution models for human tasks.

Python 3,926 201 Updated Sep 20, 2024

Yuliang-Liu / Monkey

【CVPR 2024 Highlight】Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models

Python 1,775 123 Updated Sep 5, 2024

itsOwen / CyberScraper-2077

A Powerful web scraper powered by LLM | OpenAI, Gemini & Ollama

Python 1,043 106 Updated Sep 10, 2024

PlatformLab / NanoLog

Nanolog is an extremely performant nanosecond scale logging system for C++ that exposes a simple printf-like API.

C++ 2,967 342 Updated Mar 4, 2024

NVlabs / VILA

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)

Python 1,803 145 Updated Sep 17, 2024

microsoft / mimalloc

mimalloc is a compact general purpose allocator with excellent performance.

C 10,401 840 Updated Aug 22, 2024

pgvector / pgvector

Open-source vector similarity search for Postgres

C 11,830 538 Updated Sep 21, 2024

RQLuo / MixTeX-Latex-OCR

MixTeX: multimodal LaTeX, ZhEn, and table OCR. It performs efficient CPU-based inference locally and offline on Windows.

Python 654 30 Updated Sep 4, 2024

suno-ai / bark

🔊 Text-Prompted Generative Audio Model

Jupyter Notebook 35,371 4,157 Updated Aug 19, 2024

Alpha-VLLM / Lumina-mGPT

Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining"

Python 461 19 Updated Aug 16, 2024

leafspark / AutoGGUF

Automatically quantize GGUF models

Python 119 11 Updated Sep 20, 2024

fundamentalvision / BEVFormer

[ECCV 2022] This is the official implementation of BEVFormer, a camera-only framework for autonomous driving perception, e.g., 3D object detection and semantic map segmentation.

Python 3,245 527 Updated Aug 15, 2024
