User Comments from a SNL 'Fast Fashion Ad' Sketch Combined with RoBERTa and BERTopic Outputs

Context

This dataset was created for a Master's thesis in Digital Humanities by Ka Yee Suvini Lai (see Related Works for the thesis, titled "Emotion Classification, Topic Modelling, and Discourse Evaluation of Audience Responses to SNL's Fast Fashion Sketch on Social Media: Leveraging RoBERTa, BERTopic and Discourse Analysis"). The dataset consists of user comments on an SNL sketch titled 'Fast Fashion Ad', extracted from YouTube, Instagram and TikTok (n=4028). It also contains emotion classification and topic modelling outputs from RoBERTa and BERTopic.

Technical details

The dataset consists of the following columns (with explanations in brackets):

comment_text (the user comments on the SNL sketch from YouTube, Instagram and TikTok)

top_emotion (RoBERTa's output of the highest-scoring emotion for the comment)

emotion_scores (RoBERTa's output of all the emotions and their scores for the comment)

topic (BERTopic's output of the topic number for the comment)

topic_label (BERTopic's output of the topic number and topic label for the comment)

probability (BERTopic's output of the probability of the assigned topic for the comment)

This dataset is a .csv file and is interoperable across many digital tools. It contains the aggregated results from the RoBERTa and BERTopic Python pipelines (see Related Works for the source code).
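
As an illustration of how the columns listed above might be read, the following minimal Python sketch loads a file with this layout using only the standard library, checks that the documented columns are present, and counts comments per top_emotion. The sample rows, the emotion labels "joy" and "anger", the topic labels, and the helper names load_comments and emotion_counts are placeholders invented for this example; they are not taken from the dataset or from the author's pipelines, and the real format of the emotion_scores cell is not assumed here.

```python
import csv
import io
from collections import Counter

# Column names as documented in the dataset description.
EXPECTED_COLUMNS = [
    "comment_text", "top_emotion", "emotion_scores",
    "topic", "topic_label", "probability",
]


def load_comments(handle):
    """Read rows from an open text handle and verify the documented columns."""
    reader = csv.DictReader(handle)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


def emotion_counts(rows):
    """Count how many comments fall under each top_emotion value."""
    return Counter(row["top_emotion"] for row in rows)


# Placeholder rows, invented for illustration only; NOT real dataset content.
SAMPLE = io.StringIO(
    "comment_text,top_emotion,emotion_scores,topic,topic_label,probability\n"
    "placeholder comment A,joy,{},0,0_placeholder,0.9\n"
    "placeholder comment B,anger,{},1,1_placeholder,0.7\n"
    "placeholder comment C,joy,{},0,0_placeholder,0.8\n"
)

rows = load_comments(SAMPLE)
counts = emotion_counts(rows)
```

With the real file, the in-memory SAMPLE handle would be replaced by something like open(path, newline="", encoding="utf-8"), where path points to the .csv obtained from the author.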

Further details

To gain access to the dataset, please reach out to the author via email: ka.lai@tuwien.ac.at

Identifier
DOI https://doi.org/10.48436/c3j49-2pv45
Related Identifier IsPartOf https://urn.kb.se/resolve?urn=urn%3Anbn%3Ase%3Alnu%3Adiva-140368
Related Identifier IsDerivedFrom https://doi.org/10.5281/zenodo.15506533
Related Identifier IsVersionOf https://doi.org/10.48436/15y32-w1573
Metadata Access https://researchdata.tuwien.ac.at/oai2d?verb=GetRecord&metadataPrefix=oai_datacite&identifier=oai:researchdata.tuwien.ac.at:c3j49-2pv45
Provenance
Creator Lai, Ka Yee Suvini
Publisher TU Wien
Publication Year 2025
Rights Creative Commons Attribution 4.0 International; https://creativecommons.org/licenses/by/4.0/legalcode
OpenAccess true
Contact tudata(at)tuwien.ac.at
Representation
Language English
Resource Type Dataset
Discipline Humanities