Dataset - B2FIND

German causal language annotations and lexicon (verbs, nouns, prepositions) (DE)

Annotations of causal verbs, nouns and prepositions in context and lexicon file for causal verbs, nouns and prepositions.

Topological Field Labeler for German

This resource contains the code of the topological labeler used in the paper: Do and Rehbein (2020). "Parsers Know Best: German PP Attachment Revisited". For this tool, labeling...

Pre-trained POS tagging models for German social media

Pre-trained POS tagging models for the HunPos tagger (Halácsy et al. 2007), the biLSTM-char-CRF tagger (Reimers & Gurevych 2017), and Online-Flors (Yin et al. 2015)....

tweeDe

A German UD Twitter treebank, with >12,000 tokens from 519 tweets, annotated in the Universal Dependencies framework

Tool for Extracting PP Attachment Disambiguation Dataset

This resource contains code to extract a PP attachment disambiguation dataset as described in the paper: Do and Rehbein (2020). "Parsers Know Best: German PP Attachment...

Neural Rerankers for Dependency Parsing

This resource contains code for different types of neural rerankers (RCNN, RCNN-shared and GCN) from the paper: Do and Rehbein (2020). "Neural Reranking for Dependency Parsing:...

Real-World PP Attachment Disambiguation Dataset

This resource contains a German dataset for real-world PP attachment disambiguation. The creation, analysis and experimental results of the dataset are described in the paper: Do...

A harmonised testsuite for social media POS tagging (DE)

A harmonised POS testsuite of web data, CMC and Twitter microtext, with word forms and STTS POS tags (+ some additional CMC-specific tags). UD POS tags have been automatically...

Converter for content-to-head style syntactic dependencies

A set of Python scripts that convert function-head style encodings in dependency treebanks into a content-head style encoding (as used in the UD treebanks) and vice versa (for...

Datasets for Dependency Tree Reranking

This resource contains the datasets for dependency tree reranking in three languages: English, German and Czech. The creation, analysis and experimental results of the datasets are...

Neural PP Attachment Disambiguation Systems

This resource contains code for different types of neural PP attachment disambiguation systems: A disambiguation system inspired by de Kok et al. (2017) but with the ranking...

MACE-AL

A method for detecting noise in automatically annotated sequence-labelled data, combining MACE (Hovy et al. 2013) with Active Learning.

Head Selection Parsers and LSTM Labelers

This resource contains code, data and pre-trained models for various types of neural dependency parsers and LSTM labelers used in the papers: Do et al. (2017). "What Do We Need...

Neural Dependency Parser with Biaffine Attention and BERT Embeddings

This resource contains the code of the dependency parser used in the paper: Do and Rehbein (2020). "Parsers Know Best: German PP Attachment Revisited". The parser is a...

MACE-AL-TREE

A method for detecting noise in automatically annotated dependency parse trees, combining MACE (Hovy et al. 2013) with Active Learning.