KUKY 1.0

KUKY is a curated selection of 224 Czech administrative and legal documents for readability research, stored in two JSON files. The documents come partly from public databases (Office of the Ombudsman, courts) and partly from private sources (letters, public local administration announcements). Some documents come in documented draft-revision pairs. All documents are manually enriched with a two-level annotation: "Relevance Stoplight" and "Speech Acts". This annotation mimics the way a plain-language expert scrutinizes a document before redesigning it for better readability. In the first step ("Relevance Stoplight"), the expert closely reads the entire document and detects problematic passages, classifying each passage as incomprehensible, superfluous, or relevant. In the second step ("Speech Acts"), the expert works with the relevant text according to a genre-specific template. At the metadata level, the documents are graded for readability as judged by experienced teachers of plain legal writing.
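
The two-level annotation described above can be pictured with a small Python sketch. It is only an illustration of the workflow, not the actual KUKY JSON schema: the class `Passage`, its field names, the label strings, and the speech-act label `"request"` are all hypothetical and chosen for this example.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical label names for the first annotation level ("Relevance Stoplight").
# The real dataset may encode these categories differently.
STOPLIGHT_LABELS = {"relevant", "incomprehensible", "superfluous"}


@dataclass
class Passage:
    """One annotated passage of a document (illustrative structure only)."""
    text: str
    stoplight: str                      # first level: Relevance Stoplight
    speech_act: Optional[str] = None    # second level: Speech Acts

    def __post_init__(self) -> None:
        if self.stoplight not in STOPLIGHT_LABELS:
            raise ValueError(f"unknown stoplight label: {self.stoplight!r}")


def relevant_passages(passages: List[Passage]) -> List[Passage]:
    """Step 1 result: keep only passages approved as relevant."""
    return [p for p in passages if p.stoplight == "relevant"]


def assign_speech_act(passage: Passage, label: str) -> None:
    """Step 2: label a passage, which is only allowed for relevant text."""
    if passage.stoplight != "relevant":
        raise ValueError("speech acts are annotated on relevant text only")
    passage.speech_act = label


# Toy document with invented English text.
doc = [
    Passage("Please send the form by 1 June.", "relevant"),
    Passage("Pursuant to the aforementioned provisions hereinafter...", "incomprehensible"),
    Passage("We wish you a pleasant day.", "superfluous"),
]

kept = relevant_passages(doc)
assign_speech_act(kept[0], "request")  # "request" is a made-up template slot
```

The point of the sketch is the ordering: the second level is applied only to what survives the first.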

Identifier
PID http://hdl.handle.net/11234/1-5812
Related Identifier https://ufal.mff.cuni.cz/grants/ponk/kuky
Metadata Access http://lindat.mff.cuni.cz/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:lindat.mff.cuni.cz:11234/1-5812
Provenance
Creator Cinková, Silvie; Kuk, Michal; Šamánková, Jana; Kubíková, Barbora; Pospíšil, Přemysl; Mírovský, Jiří; Hladká, Barbora; Novotná, Tereza
Publisher Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Publication Year 2024
Rights Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0); http://creativecommons.org/licenses/by-nc-sa/4.0/; PUB
OpenAccess true
Contact lindat-help(at)ufal.mff.cuni.cz
Representation
Language Czech
Resource Type corpus
Format application/octet-stream; text/html; downloadable_files_count: 3
Discipline Linguistics
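
The Metadata Access link listed above is an OAI-PMH `GetRecord` request. As a minimal sketch, the following Python snippet shows how that URL is composed from the handle in the PID. The function name `oai_get_record_url` and its parameter names are assumptions made for this example; the base URL, the verb, the metadata prefix, and the identifier pattern are copied from the link in this record. The snippet only builds the URL and does not contact the server.

```python
from urllib.parse import urlencode

# Base endpoint taken from the "Metadata Access" entry of this record.
BASE_URL = "http://lindat.mff.cuni.cz/repository/oai/request"


def oai_get_record_url(handle: str, metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH GetRecord URL for a LINDAT handle (illustrative helper)."""
    params = {
        "verb": "GetRecord",
        "metadataPrefix": metadata_prefix,
        # Identifier pattern as it appears in the record's own link.
        "identifier": f"oai:lindat.mff.cuni.cz:{handle}",
    }
    # Keep ':' and '/' unescaped so the result matches the link as published.
    return f"{BASE_URL}?{urlencode(params, safe=':/')}"


# The handle is the path part of the PID http://hdl.handle.net/11234/1-5812.
url = oai_get_record_url("11234/1-5812")
print(url)
```

Fetching the resulting URL with any HTTP client should return the Dublin Core (`oai_dc`) metadata as XML, assuming the endpoint is still available at this address.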