Corpus of Legislation texts of Republic of Serbia 1.0

Dataset

The dataset was compiled from a large number of Serbian legislative texts gathered from the https://www.pravno-informacioni-sistem.rs/ website. The gathered texts were used to fine-tune a neural network called SRBerta on the masked language modeling task. The dataset contains texts from the following legislation categories:
• Constitution of the Republic of Serbia and state regulation
• Justice
• Defense, military and internal affairs
• Public incomes
• Monetary system, financial organizations and business

Identifier
PID	http://hdl.handle.net/11356/1754
Related Identifier	https://huggingface.co/JelenaTosic/SRBerta
Metadata Access	http://www.clarin.si/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:www.clarin.si:11356/1754
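
The Metadata Access link above is an OAI-PMH GetRecord request. As a minimal sketch, the same URL can be assembled from the handle in the PID field; the function name build_get_record_url and the constant OAI_ENDPOINT are illustrative names, not part of the repository's own tooling, and the identifier pattern is inferred only from the single URL shown in this record.

```python
from urllib.parse import urlencode

# Endpoint taken from the Metadata Access field of this record.
OAI_ENDPOINT = "http://www.clarin.si/repository/oai/request"


def build_get_record_url(handle: str, metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH GetRecord URL for a handle such as '11356/1754'.

    Assumption: identifiers follow the 'oai:www.clarin.si:<handle>' pattern
    seen in this record's Metadata Access URL.
    """
    identifier = f"oai:www.clarin.si:{handle}"
    query = urlencode(
        {
            "verb": "GetRecord",
            "metadataPrefix": metadata_prefix,
            "identifier": identifier,
        },
        safe=":/",  # keep ':' and '/' unescaped, as in the original URL
    )
    return f"{OAI_ENDPOINT}?{query}"


url = build_get_record_url("11356/1754")
print(url)
```

Fetching the resulting URL should return the Dublin Core (oai_dc) metadata for this record as XML; the sketch only builds the URL and makes no network request.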

Provenance
Creator	Bogdanović, Miloš; Tošić, Jelena
Publisher	Miloš Bogdanović; Jelena Tošić
Publication Year	2022
Rights	Creative Commons - Attribution 4.0 International (CC BY 4.0); https://creativecommons.org/licenses/by/4.0/; PUB
OpenAccess	true
Contact	info(at)clarin.si

Representation
Language	Serbian
Resource Type	corpus
Format	text/plain; charset=utf-8; downloadable_files_count: 5
Discipline	Linguistics