Corpus of Legislation texts of Republic of Serbia 1.0

The dataset was compiled from a large number of Serbian legislation texts gathered from the https://www.pravno-informacioni-sistem.rs/ website. The texts were used to fine-tune the SRBerta neural network on the masked language modeling task. The dataset covers the following legislation categories:
• Constitution of the Republic of Serbia and state regulation
• Justice
• Defense, military and internal affairs
• Public incomes
• Monetary system, financial organizations and business
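To illustrate the masked language modeling task mentioned above, here is a minimal sketch of how input tokens can be hidden and kept as prediction targets. It is not the authors' training code: the whitespace tokenization, the `<mask>` placeholder, the `mask_tokens` helper, and the masking probabilities are all assumptions made for illustration, and the sample sentence is made up rather than taken from the corpus.

```python
import random

# Assumption: a placeholder mask symbol; the actual SRBerta tokenizer may use a different one.
MASK_TOKEN = "<mask>"


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Return (masked_tokens, labels) for a simplified masked language modeling example.

    labels[i] holds the original token where a mask was applied, otherwise None.
    """
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    masked, labels = [], []
    for token in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK_TOKEN)  # hide the token from the model
            labels.append(token)       # the model is trained to recover it
        else:
            masked.append(token)
            labels.append(None)        # position is not scored
    return masked, labels


# Made-up sample sentence, split on whitespace (a real tokenizer works on subwords).
sentence = "Narodna skupština donosi zakone".split()
masked, labels = mask_tokens(sentence, mask_prob=0.5, seed=1)
print(masked)
print(labels)
```

During fine-tuning, the model sees the masked sequence and is scored only on the positions whose label is not `None`.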

Identifier
PID http://hdl.handle.net/11356/1754
Related Identifier https://huggingface.co/JelenaTosic/SRBerta
Metadata Access http://www.clarin.si/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:www.clarin.si:11356/1754
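The Metadata Access URL above is an OAI-PMH GetRecord request. As a minimal sketch, the same URL can be assembled from the handle suffix with the Python standard library. The helper name `build_get_record_url` is hypothetical, and the sketch assumes the `oai:www.clarin.si:` identifier prefix seen in this record applies unchanged; it only builds the URL and does not contact the repository.

```python
from urllib.parse import urlencode

# Endpoint taken from the Metadata Access field of this record.
OAI_ENDPOINT = "http://www.clarin.si/repository/oai/request"


def build_get_record_url(handle, metadata_prefix="oai_dc"):
    """Build an OAI-PMH GetRecord URL for a handle such as '11356/1754'."""
    query = {
        "verb": "GetRecord",
        "metadataPrefix": metadata_prefix,
        # Assumption: identifier prefix copied from this record.
        "identifier": f"oai:www.clarin.si:{handle}",
    }
    # safe=":/" keeps the identifier readable instead of percent-encoding it.
    return f"{OAI_ENDPOINT}?{urlencode(query, safe=':/')}"


url = build_get_record_url("11356/1754")
print(url)
```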
Provenance
Creator Bogdanović, Miloš; Tošić, Jelena
Publisher Miloš Bogdanović; Jelena Tošić
Publication Year 2022
Rights Creative Commons - Attribution 4.0 International (CC BY 4.0); https://creativecommons.org/licenses/by/4.0/; PUB
OpenAccess true
Contact info(at)clarin.si
Representation
Language Serbian
Resource Type corpus
Format text/plain; charset=utf-8; downloadable_files_count: 5
Discipline Linguistics