-
SciTweets - A Dataset and Annotation Framework for Detecting Scientific Onlin...
This repository contains an expert-annotated dataset of 1261 tweets and the corresponding annotation framework from the publication "SciTweets - A Dataset and Annotation... -
Legitimation Strategies of Regional Organizations (LegRO)
In an era of increasing political challenges to global and regional organizations, it is crucial to understand how they claim legitimacy and how successful they are in this... -
TweetsCOV19 - A Semantically Annotated Corpus of Tweets About the COVID-19 Pa...
TweetsCOV19 is a semantically annotated corpus of Tweets about the COVID-19 pandemic. It is a subset of TweetsKB and aims at capturing online discourse about various aspects of... -
TweetsKB: A Public and Large-Scale RDF Corpus of Annotated Tweets (Part 10, J...
TweetsKB is a public RDF corpus of anonymized data for a large collection of annotated tweets. The dataset currently contains data for nearly 3.0 billion tweets, spanning more... -
TweetsKB: A Public and Large-Scale RDF Corpus of Annotated Tweets (Part 11, J...
TweetsKB is a public RDF corpus of anonymized data for a large collection of annotated tweets. The dataset currently contains data for nearly 3.0 billion tweets, spanning more... -
Invasion@Ukraine
We publish a dataset of raw tweets collected via the Twitter Streaming API in the context of the onset of the war, which Russia started in Ukraine on February 24, 2022. In... -
PUSH*BACK*LASH X Dataset
These data were collected as part of PushBackLash's Work Package on anti-gender discourses online to analyze anti-gender equality strategies on Twitter/X to develop typologies... -
Communicating Strategically about What? Europe and China in the Kenyan Media
European actors are increasingly relying on strategic communication tools in their external relations, especially in key partner countries like Kenya. Based on a large-scale... -
TweetsKB: A Public and Large-Scale RDF Corpus of Annotated Tweets (Part 12, S...
TweetsKB is a public RDF corpus of anonymized data for a large collection of annotated tweets. The dataset currently contains data for more than 3.1 billion tweets, spanning... -
Why and How Victimhood Matters? Between Strategic Ontological Narrative...
These are annexes to an article forthcoming in Media, War & Conflict as Why and How Victimhood Matters? Between Strategic Ontological Narratives and Intersectional... -
Prague Dependency Treebank - Consolidated 2.0 (PDT-C 2.0)
A manually annotated and genre-diversified language resource with rich linguistic information from morphology and syntax to semantics, the Prague Dependency Treebank –... -
Prague Dependency Treebank - Consolidated 1.0 (PDT-C 1.0)
A richly annotated and genre-diversified language resource, The Prague Dependency Treebank – Consolidated 1.0 (PDT-C 1.0, or PDT-C in short in the sequel) is a consolidated... -
Prague Discourse Treebank 2.0
PDiT 2.0 is a new version of the Prague Discourse Treebank. It contains a complex annotation of discourse phenomena enriched by the annotation of secondary connectives. -
CzeDLex 1.0
CzeDLex 1.0 is the first production version (the fourth development version) of the Lexicon of Czech discourse connectives. The lexicon contains connectives partially... -
EVALD 4.0 for Foreigners – Evaluator of Discourse
EVALD 4.0 for Foreigners is software for the automatic evaluation of surface coherence (cohesion) in Czech texts written by non-native speakers of Czech. -
Prague Discourse Treebank 4.0
The Prague Discourse Treebank 4.0 (PDiT 4.0; Synková et al., 2024) is an annotation of discourse relations marked by primary and secondary discourse connectives in the whole... -
Prague Discourse Treebank 3.0
The Prague Discourse Treebank 3.0 (PDiT 3.0) is a new version of annotation of discourse relations marked by primary and secondary discourse connectives in the data of the... -
DiscoMT 2015 Shared Task on Pronoun Translation
The data set includes training, development and test data from the shared tasks on pronoun-focused machine translation and cross-lingual pronoun prediction from the EMNLP 2015...