The Corpus of the Colloquial Polish Language (CCPL) is a corpus of user-generated content (UGC) tagged with morphosyntactic features by a team of professional linguists from the Wrocław University of Technology. It consists of 400 000 tagged segments and has been used to train the UGC tagger, which is also available in the CLARIN repository.
Main resources:
Corpus files (NCP tagset): CCPL - anonimizacja_xml_out_ver(3.05).zip
Manual annotation guidelines: Specification for morphosyntactic tagging of UGC texts.pdf
Corpus files (UD tagset): corpus_petrov_tags.zip