-
Czech RST Discourse Treebank 1.0
The Czech RST Discourse Treebank 1.0 (CzRST-DT 1.0) is a dataset of 54 Czech journalistic texts manually annotated according to Rhetorical Structure Theory (RST). Each text... -
EVALD 2.0 for Foreigners
EVALD 2.0 for Foreigners is software for automatic evaluation of surface coherence (cohesion) in Czech texts written by non-native speakers of Czech. -
Prague Dependency Treebank - Consolidated 2.0 (PDT-C 2.0)
A manually annotated and genre-diversified language resource with rich linguistic information from morphology and syntax to semantics, the Prague Dependency Treebank –... -
CzeDLex 1.0
CzeDLex 1.0 is the first production version (the fourth development version) of the Lexicon of Czech discourse connectives. The lexicon contains connectives partially... -
DiscoMT 2017 Shared Task on Cross-lingual Pronoun Prediction
Data used in the 2017 shared task on cross-lingual pronoun prediction. -
Prague Dependency Treebank - Consolidated 1.0 (PDT-C 1.0)
A richly annotated and genre-diversified language resource, the Prague Dependency Treebank – Consolidated 1.0 (PDT-C 1.0, or PDT-C for short) is a consolidated... -
Prague Discourse Treebank 4.0
The Prague Discourse Treebank 4.0 (PDiT 4.0; Synková et al., 2024) is an annotation of discourse relations marked by primary and secondary discourse connectives in the whole... -
EVALD 4.0 for Beginners – Evaluator of Discourse
EVALD 4.0 for Beginners is software for automatic evaluation of Czech texts written by non-native speakers of Czech who are beginners in the language. -
EVALD 2.0
EVALD 2.0 is software for automatic evaluation of surface coherence (cohesion) in Czech texts written by native speakers of Czech. -
EVALD 4.0 for Foreigners – Evaluator of Discourse
EVALD 4.0 for Foreigners is software for automatic evaluation of surface coherence (cohesion) in Czech texts written by non-native speakers of Czech. -
Prague Dependency Treebank 3.5
The Prague Dependency Treebank 3.5 is the 2018 edition of the core Prague Dependency Treebank (PDT). It contains all PDT annotation made at the Institute of Formal and Applied... -
Legitimation Strategies of Regional Organizations (LegRO)
In an era of increasing political challenges to global and regional organizations, it is crucial to understand how they claim legitimacy and how successful they are in this... -
Invasion@Ukraine
We publish a dataset of raw tweets collected via the Twitter Streaming API around the onset of the war that Russia started in Ukraine on February 24, 2022. In... -
SciTweets - A Dataset and Annotation Framework for Detecting Scientific Onlin...
This repository contains an expert-annotated dataset of 1261 tweets and the corresponding annotation framework from the publication "SciTweets - A Dataset and Annotation... -
Communicating Strategically about What? Europe and China in the Kenyan Media
European actors are increasingly relying on strategic communication tools in their external relations, especially in key partner countries like Kenya. Based on a large-scale... -
TweetsKB: A Public and Large-Scale RDF Corpus of Annotated Tweets (Part 12, S...
TweetsKB is a public RDF corpus of anonymized data for a large collection of annotated tweets. The dataset currently contains data for more than 3.1 billion tweets, spanning... -
TweetsKB: A Public and Large-Scale RDF Corpus of Annotated Tweets (Part 11, J...
TweetsKB is a public RDF corpus of anonymized data for a large collection of annotated tweets. The dataset currently contains data for nearly 3.0 billion tweets, spanning more... -
TweetsKB: A Public and Large-Scale RDF Corpus of Annotated Tweets (Part 10, J...
TweetsKB is a public RDF corpus of anonymized data for a large collection of annotated tweets. The dataset currently contains data for nearly 3.0 billion tweets, spanning more...
