Trans-related Online Corpus (TROC)

The resource includes three corpora:

0. Raw corpus of YouTube and Reddit comments (see readme.txt).
1. Stance-annotated corpus of contra-trans and pro-trans comments (contra-trans comments: 1264; pro-trans comments: 1023).
2. Stance-annotated corpus, part-of-speech tagged.

FILES

No.          Number of comments   Words
01_youtube   660                  14298
02_youtube   568                  13704
Total:       1227                 28002

No.          Number of comments   Words
01_reddit    240                  14152
02_reddit    1167                 50498
03_reddit    58                   1540
07_reddit    877                  40487
Total:       2342                 106677

Reference: J. Ruzaitė et al. (2026), "Polarisation and Stance-Taking across Platforms: Challenges of Stance Annotation in Trans-Related Debates".

Identifier
PID http://hdl.handle.net/20.500.11821/82
Metadata Access https://clarin.vdu.lt/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:clarin.vdu.lt:20.500.11821/82
Provenance
Creator Ruzaitė, Jūratė; Utka, Andrius; Negrea-Busuioc, Elena; Lewandowska-Tomaszczyk, Barbara; Kazazi, Ledia; Dylgjeri, Ardita; Bączkowska, Anna
Publisher Vytautas Magnus University
Publication Year 2026
Rights PUB_CLARIN-LT_End-User-Licence-Agreement_EN-LT; https://clarin.vdu.lt/licenses/eula/PUB_CLARIN-LT_End-User-Licence-Agreement_EN-LT.htm; PUB
OpenAccess true
Contact info(at)clarin.vdu.lt
Representation
Language English
Resource Type corpus
Format text/plain; application/zip; text/plain; charset=utf-8; downloadable_files_count: 4
Discipline Linguistics