Dataset: tweets and analyses related to the paper 'The (Un)Predictability of Emotional Hashtags in Twitter'

This dataset contains all the tweet IDs and labels that were used to model the language of 24 hashtags and to test performance at predicting those hashtags in unseen tweets. The study is described in:

Kunneman, F.A., Liebrecht, C.C. & Bosch, A.P.J. van den (2014). The (Un)Predictability of Emotional Hashtags in Twitter. In Proceedings of the 5th Workshop on Language Analysis for Social Media (LASM) @ EACL 2014 (pp. 26-34). s.l.: Association for Computational Linguistics, http://hdl.handle.net/2066/127067

In addition to the train and test data, this dataset includes the most indicative features (words and phrases) for four of the hashtags, as well as human judgements of whether the tweets that contain or are classified with these hashtags convey the presumed emotion of the hashtags.

Subject period: December 16th 2010 until February 1st 2013

Date: start=2013-07-01; end=2013-07-31 (data collection)

Identifier
DOI https://doi.org/10.17026/dans-zs9-fj3t
Metadata Access https://datacatalogue.cessda.eu/oai-pmh/v0/oai?verb=GetRecord&metadataPrefix=oai_ddi25&identifier=eb4f1f1dd74719d7bf9d85a1597b21a7731fdd311e3ac5142466a5afd5cf01e5
Provenance
Creator F.A. Kunneman; C.C. Liebrecht; A.P.J. van den Bosch
Publisher DANS Data Station Social Sciences and Humanities
Publication Year 2017
OpenAccess true
Representation
Discipline Agriculture, Forestry, Horticulture, Aquaculture; Agriculture, Forestry, Horticulture, Aquaculture and Veterinary Medicine; Humanities; Life Sciences; Social Sciences; Social and Behavioural Sciences; Soil Sciences