POS Tagging and Lemmatization (Czech model)


A model for Czech POS tagging and lemmatization built on RobeCzech, a Czech version of the BERT model. It was trained on data from the Prague Dependency Treebank 3.5. The model is part of the master's thesis "Czech NLP with Contextualized Embeddings" and achieved state-of-the-art performance at the time the thesis was submitted. A demo Jupyter notebook is available on the project's GitHub.

PID http://hdl.handle.net/11234/1-4613
Related Identifier https://dspace.cuni.cz/handle/20.500.11956/147648
Metadata Access http://lindat.mff.cuni.cz/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:lindat.mff.cuni.cz:11234/1-4613
Creator Vysušilová, Petra; Straka, Milan
Publisher Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Publication Year 2021
Rights Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0); http://creativecommons.org/licenses/by-nc-sa/4.0/; PUB
OpenAccess true
Contact lindat-help(at)ufal.mff.cuni.cz
Language Czech
Resource Type languageDescription
Format text/plain; charset=utf-8; application/octet-stream; downloadable_files_count: 4
Discipline Linguistics
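
The Metadata Access URL above is an OAI-PMH GetRecord request. As a minimal sketch, it can be rebuilt from the handle in the PID field. The function name build_get_record_url and the constant OAI_ENDPOINT are hypothetical, introduced only for illustration; the endpoint, the "oai:lindat.mff.cuni.cz:" identifier prefix, and the oai_dc metadata prefix are copied from the URL in this record. The sketch only builds the URL and sends no request.

```python
from urllib.parse import urlencode

# Endpoint copied from the Metadata Access field of this record.
OAI_ENDPOINT = "http://lindat.mff.cuni.cz/repository/oai/request"


def build_get_record_url(handle: str, metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH GetRecord URL for a handle (hypothetical helper)."""
    # Identifier prefix as it appears in this record's Metadata Access URL.
    identifier = f"oai:lindat.mff.cuni.cz:{handle}"
    query = urlencode(
        {
            "verb": "GetRecord",
            "metadataPrefix": metadata_prefix,
            "identifier": identifier,
        },
        safe=":/",  # keep ':' and '/' unescaped, as in the record's URL
    )
    return f"{OAI_ENDPOINT}?{query}"


# Handle taken from the PID field (http://hdl.handle.net/11234/1-4613).
url = build_get_record_url("11234/1-4613")
print(url)
```

Assuming the repository follows the same identifier scheme for its other records, passing a different handle would give the corresponding metadata URL.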