Toloka Visual Question Answering Challenge at WSDM Cup 2023

Yangget/WSDMCup2023

We challenge you with a visual question answering task! Given an image and a textual question, draw the bounding box around the object that correctly answers the question.

Example questions (each is paired with an image and an answer bounding box):

  • What do you use to hit the ball?
  • What do people use for cutting?
  • What do we use to support the immune system and get vitamin C?

Dataset

Our dataset consists of images paired with textual questions. Each entry (instance) is a question-image pair labeled with the ground-truth coordinates of a bounding box containing the visual answer to the question. The images were obtained from a CC BY-licensed subset of the Microsoft Common Objects in Context dataset, MS COCO. All data labeling was performed on the Toloka crowdsourcing platform, https://toloka.ai/. The entire dataset can be downloaded at https://doi.org/10.5281/zenodo.7057740.
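As a rough illustration of what one entry holds, the sketch below models a question-image pair with its ground-truth box. The class name `VQAEntry` and the field names (`image`, `question`, `left`, `top`, `right`, `bottom`) are assumptions made for this example, not a confirmed schema; check the downloaded files for the actual column names and coordinate convention.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class VQAEntry:
    """One question-image pair with its ground-truth bounding box.

    All field names are hypothetical; they are not taken from the dataset files.
    """

    image: str     # path or URL of the image
    question: str  # textual question about the image
    left: int      # box edges in pixel coordinates (assumed convention)
    top: int
    right: int
    bottom: int

    def box(self) -> Tuple[int, int, int, int]:
        """Return the box as (left, top, right, bottom)."""
        return (self.left, self.top, self.right, self.bottom)

    def area(self) -> int:
        """Box area in pixels; zero for a degenerate box."""
        return max(0, self.right - self.left) * max(0, self.bottom - self.top)


# Made-up entry for illustration only; the values are not from the dataset.
entry = VQAEntry("example.jpg", "What do you use to hit the ball?", 10, 20, 110, 220)
print(entry.box(), entry.area())
```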

The dataset is licensed under the Creative Commons Attribution 4.0 License. See the LICENSE-CC-BY.txt file for more details.

Baseline

We offer a zero-shot baseline in Baseline.ipynb. First, it uses a detection model, YOLOR, to generate candidate rectangles. Then it applies CLIP to measure the similarity between the question and the image region bounded by each candidate rectangle. The prediction is the candidate with the highest similarity. This baseline achieves an intersection over union (IoU) of 0.20 on both the public and private test subsets.
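The selection step and the IoU metric described above might be sketched as follows. This is not the code from Baseline.ipynb: it calls neither YOLOR nor CLIP, and the names `iou`, `pick_best`, and `toy_scores`, the `(left, top, right, bottom)` box layout, and all numbers are assumptions chosen for illustration. The `similarity` argument stands in for whatever question-to-region score a real model would produce.

```python
from typing import Callable, Sequence, Tuple

# Assumed box layout: (left, top, right, bottom) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = max(0.0, a[2] - a[0]) * max(0.0, a[3] - a[1])
    area_b = max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def pick_best(candidates: Sequence[Box], similarity: Callable[[Box], float]) -> Box:
    """Return the candidate box with the highest similarity score."""
    if not candidates:
        raise ValueError("no candidate boxes")
    return max(candidates, key=similarity)


# Toy stand-ins: in a real pipeline the candidates would come from a detector
# and the scores from a text-image similarity model.
candidates = [(0, 0, 50, 50), (40, 40, 120, 120), (200, 200, 260, 260)]
toy_scores = {candidates[0]: 0.12, candidates[1]: 0.31, candidates[2]: 0.05}
ground_truth = (50, 50, 100, 100)

prediction = pick_best(candidates, toy_scores.__getitem__)
print(prediction, iou(prediction, ground_truth))
```

Passing the scoring function as a parameter keeps the selection logic independent of any particular detector or similarity model, so either could be swapped without touching `pick_best`.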

The baseline code is licensed under the Apache License, Version 2.0. See the LICENSE-APACHE.txt file for more details.
