Spoken corpus Gos 1.1

Gos is a corpus of spoken Slovene containing transcripts of approximately 120 hours of speech recorded in a range of situations: radio and TV shows, school lessons and lectures, private conversations among friends or family, work meetings, consultations, conversations in sales situations, etc. All speech is transcribed in two versions, one with pronunciation-based spelling and one with standardized spelling, and the corpus comprises over one million words. It can be searched through a web concordancer, which also allows listening to the corresponding recordings: http://www.korpus-gos.net.

Compared with the previous version, this release corrects some transcription errors and introduces various changes to the TEI and vertical encodings.

Identifier
PID http://hdl.handle.net/11356/1438
Related Identifier http://hdl.handle.net/11356/1040
Related Identifier http://hdl.handle.net/11356/1771
Related Identifier http://eng.slovenscina.eu/korpusi/gos
Metadata Access http://www.clarin.si/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:www.clarin.si:11356/1438
Provenance
Creator Zwitter Vitez, Ana; Zemljarič Miklavčič, Jana; Krek, Simon; Stabej, Marko; Erjavec, Tomaž
Publisher Centre for Language Resources and Technologies, University of Ljubljana
Publication Year 2021
Rights Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0); https://creativecommons.org/licenses/by-nc-sa/4.0/; PUB
OpenAccess true
Contact info(at)clarin.si
Representation
Language Slovenian; Slovene
Resource Type corpus
Format text/plain; charset=utf-8; application/zip; downloadable_files_count: 2
Discipline Linguistics
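
The Metadata Access URL listed in this record follows the OAI-PMH GetRecord request pattern. The following is a minimal sketch, assuming the endpoint and the `oai:www.clarin.si:` identifier prefix are exactly as they appear in that URL, of how the same request could be assembled from the handle. The function name `build_get_record_url` is hypothetical, and the code only builds the URL without fetching it.

```python
from urllib.parse import urlencode

# Endpoint copied from the Metadata Access field of this record.
OAI_ENDPOINT = "http://www.clarin.si/repository/oai/request"


def build_get_record_url(handle: str, metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH GetRecord URL for a handle such as '11356/1438'.

    Hypothetical helper: the identifier prefix is an assumption taken from
    the single URL shown in this record.
    """
    params = {
        "verb": "GetRecord",
        "metadataPrefix": metadata_prefix,
        "identifier": f"oai:www.clarin.si:{handle}",
    }
    # safe=":/" leaves the colons and slash in the identifier unescaped,
    # so the result has the same form as the URL in the record.
    return f"{OAI_ENDPOINT}?{urlencode(params, safe=':/')}"


url = build_get_record_url("11356/1438")
print(url)
```

Passing the handle of a related record (for example `11356/1040`) would produce the corresponding request, provided that record is exposed through the same endpoint.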