VISCOUNTH

A Large Dataset for Visual Question Answering for Cultural Heritage

General Information

Visual Question Answering (VQA) is gaining momentum for its ability to bridge Computer Vision and Natural Language Processing. However, VQA approaches rely on Machine Learning (ML) algorithms, which need to be trained on large annotated datasets to achieve good performance, and, once trained, an ML model is hardly portable to a different domain. This calls for agile methodologies for building large annotated datasets from existing resources. The cultural heritage domain represents both a natural application of this task and an extensive source of data for training and validating VQA models. To this end, by using data and models from ArCo (Architecture of Knowledge), the knowledge graph of the Italian cultural heritage, we generated a large dataset for VQA in Italian and English.

Data

The dataset contains 6.49M question-answer pairs covering cultural assets, 43 question templates and 427 verbal forms. The number of pairs per template ranges from 35 to 576K. Together with the complete dataset, we provide a small sample of question-answer pairs from different question templates and cultural asset types, and a grid with the distribution of question-answer pairs over question templates and cultural asset types. We also provide statistics on how many times questions from each question template were formulated by the surveyed users and how many times a question template was evoked by a user.
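As an illustration of the distribution grid mentioned above, the following is a minimal sketch of how pairs could be counted per question template and cultural asset type. The record layout (the `template_id` and `asset_type` keys) and the toy records are assumptions made for this example; they do not reflect the actual file format of the dataset.

```python
from collections import Counter


def distribution_grid(pairs):
    """Count question-answer pairs per (template_id, asset_type) cell."""
    grid = Counter()
    for pair in pairs:
        grid[(pair["template_id"], pair["asset_type"])] += 1
    return grid


def totals_per_template(grid):
    """Sum the grid over asset types to get one total per template."""
    totals = Counter()
    for (template_id, _asset_type), count in grid.items():
        totals[template_id] += count
    return totals


# Toy records, invented for illustration only (hypothetical keys and values).
pairs = [
    {"template_id": 1, "asset_type": "painting"},
    {"template_id": 1, "asset_type": "painting"},
    {"template_id": 1, "asset_type": "sculpture"},
    {"template_id": 2, "asset_type": "painting"},
]

grid = distribution_grid(pairs)
totals = totals_per_template(grid)
```

Summing the grid along the other axis (per asset type) would follow the same pattern.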

Details

The Dataset-Statistics file reports, for each question template, the accuracy obtained from the manual validation of the Italian (Column H) and English (Column I) samples, the total number of question-answer pairs (Column B), the number of distinct verbal forms associated with the template (Column C), the variety of types and classes of cultural assets that populate it (Columns E and F), and the number of short answers with the text span of their description (Column G).
The stats_occurrences_cultural-property-types file contains the breakdown of question-answer pair counts by cultural asset type, together with the total number of question-answer pairs per question template (Column J) and the total number of occurrences per cultural asset type (Row 45).
The Expert and users questions - statistics file contains the 43 question types with their 427 verbal forms, and the number of times each was proposed (column Mentions) by experts and/or non-expert users.
The Italian-sample file contains an Italian-language sample of 50 elements per question template, together with the results obtained from the manual validation.
The English-sample file contains an English-language sample of 50 elements per question template, together with the results obtained from the manual validation.
The IDEHAdataset - Transformation Rules file contains the list of transformation rules used to clean and normalize the data.
The QATemplate-ITA_definitivo file contains the templates, each composed of a question-answer pair in natural language (Columns D and E) and a short answer (Column F). Each pair is associated with an ID (Column A) and with a SPARQL query to extract data from the ArCo KG (Column G). Each pair is also labeled as "user", "expert" or "both" (Column B), and as "visual" (e.g., What is the shape?), "contextual" (e.g., Who is the author?) or "mixed" (e.g., What is the material?) depending on the type of information needed to answer the question (Column C).
The Description folder contains the set of unique descriptions for each of the cultural assets present in the dataset. Both the English and Italian versions have a total of 584,514 texts, obtained from the union of the answers associated with each cultural property (Column B). In addition to the description, each file also includes the depiction (Column C) and a unique identifier structured specifically for downloading the images associated with the single asset (Column D).
The img_download file supports downloading the images associated with each cultural property, taking as input the document contained in Decription.rar.
The Dataset 1.0 folder contains the first version of the Dataset divided into Italian and English versions, each of which is divided according to the identifiers associated with each question-answer template.
The Dataset 2.0 folder contains the second version of the Dataset divided into Italian and English versions, each of which is divided according to the identifiers associated with each question-answer template.
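To make the template layout described for QATemplate-ITA_definitivo more concrete, here is a minimal sketch of one possible in-memory representation of a template row and a filter on the information type. The class name, field names, helper function, audience labels assigned to the examples, and placeholder values are all hypothetical; only the column roles and the allowed labels come from the description above.

```python
from dataclasses import dataclass

# Allowed labels, as listed in the description of Columns B and C.
AUDIENCES = {"user", "expert", "both"}
INFO_TYPES = {"visual", "contextual", "mixed"}


@dataclass(frozen=True)
class QATemplate:
    """Hypothetical representation of one row of the template file."""
    template_id: str   # Column A
    audience: str      # Column B: "user", "expert" or "both"
    info_type: str     # Column C: "visual", "contextual" or "mixed"
    question: str      # Column D
    answer: str        # Column E
    short_answer: str  # Column F
    sparql: str        # Column G

    def __post_init__(self):
        # Reject labels outside the documented vocabularies.
        if self.audience not in AUDIENCES:
            raise ValueError(f"unknown audience label: {self.audience!r}")
        if self.info_type not in INFO_TYPES:
            raise ValueError(f"unknown information type: {self.info_type!r}")


def select_by_info_type(templates, info_type):
    """Return the templates whose information type matches info_type."""
    return [t for t in templates if t.info_type == info_type]


# Questions are the examples quoted above; IDs, audience labels, answers and
# queries are placeholders, not values taken from the actual file.
templates = [
    QATemplate("t1", "both", "visual", "What is the shape?",
               "<answer>", "<short answer>", "<SPARQL query>"),
    QATemplate("t2", "both", "contextual", "Who is the author?",
               "<answer>", "<short answer>", "<SPARQL query>"),
    QATemplate("t3", "both", "mixed", "What is the material?",
               "<answer>", "<short answer>", "<SPARQL query>"),
]

visual_templates = select_by_info_type(templates, "visual")
```

Validating the labels at construction time is a design choice of this sketch: it surfaces a mistyped label as soon as a row is loaded rather than when it is later filtered.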

License

This dataset is licensed under the Creative Commons BY-SA 4.0 license. The data of ArCo, the knowledge graph of the Italian cultural heritage, were extracted through its SPARQL endpoint.


misaelmongiovi/IDEHAdataset
