Dataset - B2FIND

Re3: A Holistic Framework and Dataset for Modeling Collaborative Document Rev...

A dataset of aligned scientific paper revisions manually labeled according to their action and intent, and supplemented with the respective peer reviews and human-written edit...

Propositional Claim Detection (NLP dataset)

This is a natural language processing (NLP) training dataset. Models trained on this data are intended to classify claims that...

Combining text and vision in compound semantics: Towards a cognitively plausi...

In the current state-of-the-art distributional semantics model of the meaning of noun-noun compounds (such as chainsaw, butterfly, home phone), CAOSS (Marelli...

Transcribed newspaper articles from the NCSE collection

CLOCR-C: Transcribed newspaper articles from the NCSE collection. This dataset contains 91 pairs of newspaper articles from the Nineteenth Century Serials Edition (NCSE). The...

ChunkRel WS

ChunkRel-WS is a prototype service for recognition of three syntactic relations between chunks. The service may be run against plain text (input format: text), then the...

Chunker WS

Chunker-WS provides shallow parsing of Polish. The parser may be run against plain text (input format: text, then it runs WCRFT for tagging) or already tagged input (other input...

Cinderella - tool for Clustering and Classifications of Texts in Polish

System for clustering and classification of texts in Polish. Source code.

WebStylo

Web-based, open stylometry system built on Multilevel Text Analysis. Runs the cluto and stylo (R system) clustering methods. Based on Natural Language Processing Workflow...

Movie Title Puns

Context: The data is based on the following paper on pun generation: Hämäläinen, M., & Alnajjar, K. (2019). Modelling the Socialization of Creative Agents in a...

Dataset for color terms, 2012

This dataset comprises adjective-noun phrases with color terms.

50 datasets found