Starred repositories
PyTorch Lightning + Hydra. A very user-friendly template for ML experimentation. ⚡🔥⚡
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Run OpenAI's CLIP and Apple's MobileCLIP model on iOS to search photos.
🔍 Search local images with natural language on Android, powered by OpenAI's CLIP model.
Container runtimes on macOS (and Linux) with minimal setup
LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images
[ECCV 2024] Official implementation of the paper "DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs".
(CVPR 2024) RMT: Retentive Networks Meet Vision Transformer
VisionLLaMA: A Unified LLaMA Backbone for Vision Tasks
Research Code for Multimodal-Cognition Team in Ant Group
Official PyTorch (MMCV) implementation of "Adversarial AutoMixup" (ICLR 2024 spotlight)
[ICML 2024] This repository includes the official implementation of our paper "Rejuvenating image-GPT as Strong Visual Representation Learners"
Official code of "Improving Image Captioning via Predicting Structured Concepts"
Using LLMs and pre-trained caption models for super-human performance on image captioning.
Train, Evaluate, Optimize, Deploy Computer Vision Models via OpenVINO™
Dataset Management Framework, a Python library and a CLI tool to build, analyze and manage Computer Vision datasets.
Model Preparation Algorithm: a Transfer Learning Framework
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image (see the sketch after this list)
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
CVPR 2022 (Oral) - Rethinking Semantic Segmentation: A Prototype View
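A minimal sketch of the zero-shot retrieval workflow the CLIP entry above describes: encode one image and several candidate text snippets, then rank the snippets by similarity. It uses OpenAI's CLIP package; the model name "ViT-B/32", the image path, and the captions are illustrative placeholders, not taken from the list.

# Rank candidate text snippets against an image with OpenAI's CLIP.
# Assumes the CLIP package (pip install git+https://github.com/openai/CLIP.git),
# torch, and Pillow are installed; paths and captions below are placeholders.
import clip
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

image = preprocess(Image.open("photo.jpg")).unsqueeze(0).to(device)
texts = clip.tokenize([
    "a dog on a beach",
    "a city skyline at night",
    "a bowl of ramen",
]).to(device)

with torch.no_grad():
    # logits_per_image holds the image-to-text similarity scores
    logits_per_image, _ = model(image, texts)
    probs = logits_per_image.softmax(dim=-1)

best = probs.argmax(dim=-1).item()
print("most relevant snippet index:", best, "probabilities:", probs.tolist())

The same encode-and-rank pattern is what the iOS and Android photo-search apps listed above build on, only with the image embeddings precomputed and indexed on device.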