The model for morphosyntactic annotation of standard Croatian was built with the CLASSLA-StanfordNLP tool (https://github.com/clarinsi/classla-stanfordnlp) by training on the hr500k training corpus (http://hdl.handle.net/11356/1183) and using the CLARIN.SI-embed.hr word embeddings (http://hdl.handle.net/11356/1205). The model produces simultaneously UPOS, FEATS and XPOS (MULTEXT-East) labels. The estimated F1 of the XPOS annotations is ~94.1.
The difference to the previous version of the model is that the pre-trained embeddings are limited to 250 thousand entries and adapted to the new code base.