Dataset - B2FIND

Slovene Lexicographic QA Fine-Tuning Corpus SloLexQA 1.0

The Slovene Lexicographic QA Fine-Tuning Corpus is a specialized dataset designed to advance the performance of AI models in understanding the structural, grammatical, and...

Corpus-grounded evaluation dataset for grammatical question answering GramQA 1.0

The Corpus-grounded evaluation dataset for grammatical question answering (GramQA) consists of 13 grammatical questions inspired by WALS, the World Atlas of Language Structures...

sqad 3.0

Simple question answering database version 3 (SQAD v3) created from Czech Wikipedia. The new version consists of 13477 records. Each SQAD record consists of multiple files -...

SQAD v2

Simple question answering database (SQAD) created from Czech Wikipedia. Each SQAD record consists of four files (in vertical format, with lemmatization and POS tagging)...

SQAD

The SQAD database consists of 3301 records obtained from Czech Wikipedia articles. The record structure is as follows: - the original sentence(s) from Wikipedia - a question...

sqad 2.1

Simple question answering database version 2.1 (SQAD_v2.1) created from Czech Wikipedia. Each SQAD record consists of four files (in vertical format, with lemmatization...

Fine-tuned models for extractive question answering in the Slovenian language

6 different fine-tuned Transformer-based models that solve the downstream task of extractive question answering in the Slovenian language. The fine-tuned models included are:...

Questions and answers from the Wikipedia "Czy wiesz" ("Did you know") service, version 2.0

The collection has been enriched with annotations marking the specific passages that contain the answers to the given questions. All marked passages have been verified by a human. For some...

Real-world misleading visualizations QA dataset

The real-world misleading visualization QA dataset accompanies the paper "Protecting multimodal large language models against misleading visualizations". The dataset contains...

M2QA: A Multi-domain Multilingual Question Answering Benchmark Dataset

M2QA (Multi-domain Multilingual Question Answering) is an extractive question answering benchmark for evaluating joint language and domain transfer. M2QA includes 13,500 SQuAD...

10 datasets found