xLiMe Twitter Corpus XTC 1.0.1


The xLiMe Twitter Corpus contains tweets in German, Italian and Spanish, manually annotated with part-of-speech tags, named entities, and message-level sentiment polarity. In total, the corpus contains almost 20K annotated messages and 350K tokens.

The corpus is described in: Luis Rei, Dunja Mladenić, Simon Krek. A Multilingual Social Media Linguistic Corpus. In Proceedings of the 4th Conference on CMC and Social Media Corpora for the Humanities, 27–28 September 2016, Ljubljana, Slovenia. https://nl.ijs.si/janes/cmc-corpora2016/proceedings/

PID http://hdl.handle.net/11356/1078
Related Identifier https://github.com/lrei/xlime_twitter_corpus
Metadata Access http://www.clarin.si/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:www.clarin.si:11356/1078
Creator Rei, Luis; Krek, Simon; Mladenić, Dunja
Publisher Jožef Stefan Institute
Publication Year 2016
Funding Reference info:eu-repo/grantAgreement/EC/FP7/611346
Rights The MIT License (MIT); PUB; https://opensource.org/licenses/mit-license.php
OpenAccess true
Contact info(at)clarin.si
Language Spanish; Castilian; Italian; German
Resource Type corpus
Format application/zip; application/pdf; text/plain; charset=utf-8; downloadable_files_count: 2
Discipline Linguistics