SFU Opinion and Comments Corpus (SOCC) for NoSketch Engine

The SFU Opinion and Comments Corpus (SOCC) is a corpus for the analysis of online news comments. It contains opinion articles and their comments. This version was tagged with TreeTagger and prepared for the NoSketch Engine corpus manager.

The 7z archive contains the prepared registry file ("sfu_opinion_and_comments"), subcdef files, scripts, and the vertical file, which is itself archived in 7z format. To complete the setup, configure the paths in the registry and compile the corpus.
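The path configuration step can be sketched in Python as follows. This is a minimal illustration, not part of the distributed scripts: it assumes the registry uses top-level `PATH` and `VERTICAL` lines, as is common in NoSketch Engine (Manatee) corpus configuration files, and the function name, the sample registry text, and the directories are hypothetical. Check the shipped registry file before relying on it.

```python
import re


def set_registry_paths(registry_text, data_path, vertical_path):
    """Return registry_text with the top-level PATH and VERTICAL values replaced.

    Assumption: each of the two settings sits on its own unindented line,
    e.g. PATH "/some/dir/". Indented lines (for example inside ATTRIBUTE
    or STRUCTURE blocks) are left untouched.
    """
    new_values = {"PATH": data_path, "VERTICAL": vertical_path}
    out = []
    for line in registry_text.splitlines():
        match = re.match(r"^(PATH|VERTICAL)\s+", line)
        if match:
            key = match.group(1)
            line = f'{key} "{new_values[key]}"'
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical registry fragment and target locations, for illustration only.
sample = (
    'NAME "SOCC"\n'
    'PATH "/old/data/"\n'
    'VERTICAL "/old/socc.vert"\n'
    'ATTRIBUTE word\n'
)
updated = set_registry_paths(sample, "/corpora/socc/", "/corpora/vert/socc.vert")
print(updated)
```

Once the paths point at the compiled-data directory and the extracted vertical file, the corpus still has to be compiled with the tooling of your installation (for example Manatee's `compilecorp`, if that is what your setup provides); that step is not covered by this sketch.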

Identifier
PID http://hdl.handle.net/11234/1-5969
Related Identifier https://nlp.fi.muni.cz/projekty/socc/
Metadata Access http://lindat.mff.cuni.cz/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:lindat.mff.cuni.cz:11234/1-5969
Provenance
Creator Marek Hába
Publisher Masaryk University, NLP Centre
Publication Year 2024
Rights Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0); http://creativecommons.org/licenses/by-sa/4.0/; PUB
OpenAccess true
Contact lindat-help(at)ufal.mff.cuni.cz
Representation
Language English
Resource Type corpus
Format text/plain; charset=utf-8; application/octet-stream; downloadable_files_count: 1
Discipline Linguistics