xLiMe Twitter Corpus XTC 1.0.1

Dataset

The xLiMe Twitter Corpus contains tweets in German, Italian and Spanish, manually annotated with part-of-speech tags, named entities, and message-level sentiment polarity. In total, the corpus contains almost 20K annotated messages and 350K tokens.

The corpus is described in: Luis Rei, Dunja Mladenić, Simon Krek. A Multilingual Social Media Linguistic Corpus. Proceedings of the 4th Conference on CMC and Social Media Corpora for the Humanities, 27–28 September 2016, Ljubljana, Slovenia. https://nl.ijs.si/janes/cmc-corpora2016/proceedings/

Identifier
PID	http://hdl.handle.net/11356/1078
Related Identifier	https://github.com/lrei/xlime_twitter_corpus
Metadata Access	http://www.clarin.si/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:www.clarin.si:11356/1078
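
The Metadata Access URL above is an OAI-PMH GetRecord request. As a minimal sketch, assuming only Python's standard library, it can be assembled from its parts as shown here. The helper name `build_getrecord_url` and the constants are hypothetical and used for illustration only; they are not part of any repository tooling.

```python
from urllib.parse import urlencode

# Values copied from the Identifier block of this record.
OAI_BASE = "http://www.clarin.si/repository/oai/request"
OAI_IDENTIFIER = "oai:www.clarin.si:11356/1078"


def build_getrecord_url(base: str, identifier: str, metadata_prefix: str = "oai_dc") -> str:
    """Assemble an OAI-PMH GetRecord URL (hypothetical helper for illustration)."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "metadataPrefix": metadata_prefix,
            "identifier": identifier,
        },
        safe=":/",  # keep ':' and '/' unescaped so the result matches the listed URL
    )
    return f"{base}?{query}"


url = build_getrecord_url(OAI_BASE, OAI_IDENTIFIER)
print(url)
```

Requesting this URL (for example with `urllib.request.urlopen`) should return the record as Dublin Core XML, since `oai_dc` is the metadata prefix requested.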

Provenance
Creator	Rei, Luis; Krek, Simon; Mladenić, Dunja
Publisher	Jožef Stefan Institute
Publication Year	2016
Funding Reference	info:eu-repo/grantAgreement/EC/FP7/611346
Rights	The MIT License (MIT); PUB; https://opensource.org/licenses/mit-license.php
OpenAccess	true
Contact	info(at)clarin.si

Representation
Language	Spanish (Castilian); Italian; German
Resource Type	corpus
Format	application/zip; application/pdf; text/plain (charset=utf-8); downloadable_files_count: 2
Discipline	Linguistics