Starred repositories
Official PyTorch Implementation of ParGo: Bridging Vision-Language with Partial and Global Views.
All-In-One VLM: Image + Video + Transfer to Other Languages / Domains (TPAMI 2023)
Source code of our AAAI 2024 paper "Cross-Modal and Uni-Modal Soft-Label Alignment for Image-Text Retrieval"
Linguistic-Aware Patch Slimming Framework for Fine-grained Cross-Modal Alignment, CVPR, 2024
mPLUG-HalOwl: Multimodal Hallucination Evaluation and Mitigating
Ambiguity-Aware and High-Order Relation Learning for Multi-Grained Image-Text Alignment
Easily compute CLIP embeddings and build a CLIP retrieval system with them
A paper list covering large multi-modality models, parameter-efficient finetuning, vision-language pretraining, and conventional image-text matching, for preliminary insight.
ESA: External Space Attention Aggregation for Image-Text Retrieval
A Framework of Small-scale Large Multimodal Models
A family of highly capable yet efficient large multimodal models
Code for the paper "GraDual: Graph-based Dual-modal Representation for Image-Text Matching" (WACV 2022)
Code for the paper "A Differentiable Semantic Metric Approximation in Probabilistic Embedding for Cross-Modal Retrieval" (NeurIPS 2022)
USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval, TIP 2024
PyTorch source code for "Stacked Cross Attention for Image-Text Matching" (ECCV 2018)
Enhanced Citation Counts Manager for Zotero 7
[CVPR 2022] Official code for "RegionCLIP: Region-based Language-Image Pretraining"
PyTorch Code for the paper "VSE++: Improving Visual-Semantic Embeddings with Hard Negatives"
📖 Official Code for “PIR-CLIP: Remote Sensing Image-text Retrieval with Prior Instruction Representation Learning”
An open-source implementation of CLIP
The official source code for the paper Consensus-Aware Visual-Semantic Embedding for Image-Text Matching (ECCV 2020)