Modelling word learning and recognition using visually grounded speech

A set of recordings of isolated nouns and verbs, together with image annotations, used for testing the word recognition performance of our speech2image model.

We trained a word recognition model on a set of images and spoken utterances. The model learns to recognise words without ever seeing written transcripts. Word recognition performance is measured as the number of images, out of the top 10 retrieved, that display the correct visual referent.
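The metric described above corresponds to precision@10: count how many of the top-10 retrieved images show the queried referent. A minimal sketch (not the authors' code; function and variable names are our own) could look like this:

```python
# Hedged sketch of the evaluation metric described above: for a spoken-word
# query, retrieve a ranked list of images and count how many of the top 10
# display the correct visual referent (precision@10, expressed as a count).

def precision_at_10(retrieved_ids, relevant_ids):
    """Number of the top-10 retrieved images annotated as relevant.

    retrieved_ids: ranked list of image ids returned for a spoken-word query
    relevant_ids:  set of image ids annotated as showing the referent
    """
    top10 = retrieved_ids[:10]
    return sum(1 for img in top10 if img in relevant_ids)

# Example: 4 of the top-10 retrieved images show the queried object.
retrieved = ["img3", "img7", "img1", "img9", "img2",
             "img8", "img5", "img4", "img6", "img0"]
relevant = {"img3", "img1", "img2", "img5"}
print(precision_at_10(retrieved, relevant))  # 4
```

Because every tested noun and verb has at least 10 annotated test images, a perfect model can in principle score 10 out of 10 for each word.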

We furthermore collected new ground truth object and action annotations for the Flickr8k test images for this purpose. This set consists of 1000 images, all annotated for the presence of the 50 actions and objects corresponding to the test verbs and nouns.

To test word recognition performance, we took the 50 most common nouns and the 50 most common verbs in the training data and confirmed that at least 10 images in our test image data displayed each of these actions and objects. The nouns were recorded in singular and plural form, and the verbs in root, third person and progressive form. We furthermore annotated 1000 images from the Flickr8k test set for the presence of these nouns and verbs. These annotations are included in CSV format.
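Presence annotations of this kind can be read with the standard library's csv module. The sketch below uses an invented in-memory sample; the column names ("image" plus one column per noun/verb, with 1/0 presence flags) are assumptions for illustration, and the actual layout of the dataset's CSV files may differ:

```python
# Hedged sketch: parsing per-image presence annotations into a nested dict
# mapping image id -> word -> bool. The header and column layout here are
# assumed, not taken from the dataset itself.
import csv
import io

sample = """image,dog,run,ball
1000268201.jpg,1,0,1
1001773457.jpg,0,1,0
"""

reader = csv.DictReader(io.StringIO(sample))
words = reader.fieldnames[1:]  # every column after the image id
annotations = {row["image"]: {w: row[w] == "1" for w in words}
               for row in reader}

print(annotations["1000268201.jpg"]["dog"])  # True
```

With such a mapping, the relevant-image set for a word is simply the ids whose flag is true, which is what the retrieval evaluation needs.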

Identifier
DOI https://doi.org/10.17026/dans-22n-xh47
PID https://nbn-resolving.org/urn:nbn:nl:ui:13-uh-1sta
Metadata Access https://easy.dans.knaw.nl/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=oai:easy.dans.knaw.nl:easy-dataset:242736
Provenance
Creator Merkx, D.G.M.; Frank, S.L.; Scharenborg, O.E.; Ernestus, M.T.C.; Scholten, S.
Publisher Data Archiving and Networked Services (DANS)
Contributor Radboud University
Publication Year 2022
Rights info:eu-repo/semantics/openAccess; License: http://creativecommons.org/licenses/by/4.0
OpenAccess true
Representation
Resource Type Dataset
Format zip
Discipline Computer Science; Computer Science, Electrical and System Engineering; Engineering Sciences