Recognize Any Regions

Python 116 4 Updated Nov 22, 2023

[NeurIPS 2021] You Only Look at One Sequence

Jupyter Notebook 832 118 Updated May 4, 2022

[ICCV 2023 Oral] IOMatch: Simplifying Open-Set Semi-Supervised Learning with Joint Inliers and Outliers Utilization

Jupyter Notebook 51 3 Updated Jan 28, 2024
16 Updated Sep 12, 2024

[ECCV 2024] Be-Your-Outpainter https://arxiv.org/abs/2403.13745

Python 204 6 Updated Jul 10, 2024

A curated list of papers and resources related to Described Object Detection, Open-Vocabulary/Open-World Object Detection and Referring Expression Comprehension. Updated frequently and pull request…

180 14 Updated Aug 17, 2024

[ICCV'23 Main Track, WECIA'23 Oral] Official repository of paper titled "Self-regulating Prompts: Foundational Model Adaptation without Forgetting".

Python 219 8 Updated Sep 28, 2023

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

Python 12,044 843 Updated Sep 13, 2024

SuperPrompt is an attempt to engineer prompts that might help us understand AI agents.

4,308 400 Updated Sep 18, 2024

CVPR 2023: Language in a Bottle: Language Model Guided Concept Bottlenecks for Interpretable Image Classification

Python 71 7 Updated May 28, 2024

PyTorch Implementation of ECCV 2024 OOD-CV Workshop SSB Challenge (Open-Set Recognition Track) - 1st Place

Python 13 1 Updated Sep 13, 2024

Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Python 2,235 128 Updated Sep 24, 2024

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Python 4,837 368 Updated Aug 7, 2024

(Pattern Recognition) PyTorch implementation of “HTR-VT: Handwritten Text Recognition with Vision Transformer”

Python 17 1 Updated Sep 18, 2024

Examples and tutorials on using SOTA computer vision models and techniques. Learn everything from old-school ResNet, through YOLO and object-detection transformers like DETR, to the latest models l…

Jupyter Notebook 5,231 807 Updated Sep 16, 2024

Images to inference with no labeling (use foundation models to train supervised models).

Python 1,874 149 Updated Sep 19, 2024

GPT4V-level open-source multi-modal model based on Llama3-8B

Python 2,001 132 Updated Sep 3, 2024

A state-of-the-art-level open visual language model | multimodal pretrained model

Python 5,895 403 Updated May 29, 2024
Python 343 13 Updated Jul 29, 2024

An open source implementation of CLIP.

Python 9,870 955 Updated Aug 19, 2024

Real-time and accurate open-vocabulary end-to-end object detection

Python 1,484 142 Updated Sep 6, 2024

[CVPR 2023] Bridging Precision and Confidence: A Train-Time Loss for Calibrating Object Detection

Python 32 2 Updated Jun 21, 2023

PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI

Python 571 22 Updated Sep 17, 2024

A curated list of papers, datasets and resources pertaining to open vocabulary object detection.

274 18 Updated Jun 25, 2024

Code release for "Active Teacher for Semi-Supervised Object Detection", CVPR2022

Python 80 8 Updated Mar 27, 2023

A curated list of papers & resources linked to open set recognition, out-of-distribution, open set domain adaptation and open world recognition

1,056 145 Updated Mar 1, 2024

(TPAMI 2024) A Survey on Open Vocabulary Learning

799 45 Updated Aug 24, 2024

[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.

Python 744 37 Updated Jun 2, 2024