Dataset - B2FIND

SiR 2.0

SiR 2.0 is an update of an annotated corpus of Czech articles published on iRozhlas, a news server of Czech public radio (https://www.irozhlas.cz/). SiR 2.0 is a collection of...

WoodVIT_V1

This deep learning dataset is designed for image classification and segmentation of bulky waste. It contains 22,659 patches with dimensions of 50 × 50 × 717 px. The dataset...

Replication Data for: CAPTIV8 : A comprehensive large scale CAPsule endoscopy...

General description and ethics approvals: The dataset contains images and videos of wireless capsule endoscopic examinations of 10 patients focused on the large colon conducted...

Annotationsguidelines für die Evaluation automatischer Koreferenzannotation

Guidelines for manual evaluation of automatic coreference annotation of German language data with CorPipe (Straka 2023). Funded by the Deutsche Forschungsgemeinschaft...

W4M00006_BPA-MMusculus

Study: Metabolic shifts induced in vivo after perinatal exposure to low doses of BPA in CD-1 mice were assessed. In this study conducted by the ToxAlim-MeX team and the Tufts...

W4M00004_GCMS-Algae

Study: Characterization of the physiological variations of the metabolome in algae exposed to 3 different abiotic stresses (salt concentration). Dataset: The dataset contains 12...

W4M00002_Sacurine-comprehensive

Study: Characterization of the physiological variations of the metabolome in biofluids is critical to understand human physiology and to avoid confounding effects in cohort...

German causal language annotations and lexicon (verbs, nouns, prepositions) (DE)

Annotations of causal verbs, nouns and prepositions in context and lexicon file for causal verbs, nouns and prepositions.

tweeDe

A German UD Twitter treebank, with >12,000 tokens from 519 tweets, annotated in the Universal Dependencies framework

DeModify

deModify consists of 3631 instances, each with three annotations obtained through CrowdFlower. An instance is a short story in which a modifier is annotated with respect to its...

The MSC Data Set

From this page you can download resources we created for modal sense classification as reported in Zhou et al. (2015), Marasović et al. (2016) and Marasović and Frank (2015)...

Converter for content-to-head style syntactic dependencies

A set of Python scripts that convert function-head style encodings in dependency treebanks into a content-head style encoding (as used in the UD treebanks) and vice versa (for...

Opinion role extractor

System for the Extraction of Subjective Expressions, Sentiment Sources and Sentiment Targets from German Text

Caulifinder banks

This dataset gathers libraries used to run the Caulifinder pipeline. Caulifinder is an automated tool for annotation of endogenous viral elements of the Caulimoviridae family. You...

PDT-Vallex: Czech Valency lexicon linked to treebanks 4.0 (PDT-Vallex 4.0)

The valency lexicon PDT-Vallex 4.0 has been built in close connection with the annotation of the Prague Dependency Treebank project (PDT) and its successors (mainly the Prague...

TrEd

Tree Editor TrEd is a fully customizable and programmable graphical editor and viewer for tree-like structures. Among other projects, it was used as the main annotation tool for...

KAMOKO-Digitalizer

This editor was developed especially for the needs of the KAMOKO project (https://lindat.mff.cuni.cz/repository/xmlui/handle/11372/LRT-3261). The editor allows the quick entry...

Annotate

Annotate is a web and desktop application that should simplify the process of transforming photos of manuscripts to a browsable collection. It also allows users to annotate...

SiR 1.0

SiR 1.0 is a corpus of Czech articles published on iRozhlas, a news server of Czech public radio (https://www.irozhlas.cz/). It is a collection of 1 718 articles (42 890...

Czech Court Decisions Dataset

We present the Czech Court Decisions Dataset (CCDD) -- a dataset of 300 manually annotated court decisions published by The Supreme Court of the Czech Republic and the...

44 datasets found