CCLL Lemmatised Frequency Lists

The resource contains 6 frequency lists for the Corpus of Contemporary Lithuanian language (CCLL) (https://sitti.vdu.lt/en/services/):

1-LT_token_freq_list.txt - a full frequency list of all tokens in CCLL
2-LT_token_freq_stats.txt - token statistics and the 100 most common tokens in CCLL
3-LT_alpha_wordform_freq_list.txt - a full frequency list of Lithuanian alphabetic wordforms in CCLL
4-LT_lemma_alpha_freq_list.txt - a full frequency list of Lithuanian alphabetic lemmas in CCLL
5-LT_lemma_and_punct_freq_list_freq_list.txt - a full frequency list of Lithuanian lemmas and punctuation marks in CCLL
6-LT_lemma_and_punct_freq_stats.txt - statistics of lemmas and punctuation marks, and the 100 most common lemmas and punctuation marks in CCLL
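
As an illustration of how one of the full frequency lists might be consumed, here is a minimal Python sketch. The record does not document the internal layout of the files, so the sketch assumes one entry per line in the form item, a tab, then an integer count; the separator, the function names parse_freq_lines and relative_frequency, and the sample entries are all hypothetical and should be checked against the downloaded files.

```python
from collections import Counter
from typing import Iterable


def parse_freq_lines(lines: Iterable[str], sep: str = "\t") -> Counter:
    """Parse 'item<sep>count' lines into a Counter.

    ASSUMPTION: the layout (item, separator, integer count) is a guess;
    the record does not describe the real file format.
    """
    freqs: Counter = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # Split on the LAST separator so items containing it stay intact.
        item, _, count = line.rpartition(sep)
        if not item or not count.strip().isdigit():
            continue  # skip headers or lines that do not fit the assumed layout
        freqs[item] += int(count)
    return freqs


def relative_frequency(freqs: Counter, item: str) -> float:
    """Share of all counted occurrences that belong to `item`."""
    total = sum(freqs.values())
    return freqs[item] / total if total else 0.0


# Made-up sample entries, not taken from the resource.
sample = ["ir\t120", "yra\t80", "", "# header", "kad\t50"]
freqs = parse_freq_lines(sample)
print(freqs.most_common(2))
print(relative_frequency(freqs, "kad"))
```

For a real file, the same function could be fed an open file handle, for example `open("4-LT_lemma_alpha_freq_list.txt", encoding="utf-8")`, since the record lists the text files as UTF-8.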

Identifier
PID http://hdl.handle.net/20.500.11821/77
Metadata Access https://clarin.vdu.lt/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:clarin.vdu.lt:20.500.11821/77
Provenance
Creator Mindaugas, Petkevičius
Publisher Institute of Digital Resources and Interdisciplinary Research (SITTI) at Vytautas Magnus University
Publication Year 2025
Rights PUB_CLARIN-LT_End-User-Licence-Agreement_EN-LT; https://clarin.vdu.lt/licenses/eula/PUB_CLARIN-LT_End-User-Licence-Agreement_EN-LT.htm; PUB
OpenAccess true
Contact info(at)clarin.vdu.lt
Representation
Language Lithuanian
Resource Type lexicalConceptualResource
Format text/plain; application/zip; text/plain; charset=utf-8; downloadable_files_count: 2
Discipline Linguistics
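
The Metadata Access URL listed under Identifier follows the usual OAI-PMH GetRecord pattern, so it can be rebuilt from the handle. The sketch below is only an illustration: the function name oai_get_record_url is hypothetical, and the base URL, metadata prefix, and identifier scheme are copied from the URL in this record rather than from any service documentation.

```python
from urllib.parse import urlencode

# Values copied from the Metadata Access URL in this record.
BASE_URL = "https://clarin.vdu.lt/oai/request"
HANDLE = "20.500.11821/77"


def oai_get_record_url(handle: str, metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH GetRecord URL for a handle (hypothetical helper)."""
    params = {
        "verb": "GetRecord",
        "metadataPrefix": metadata_prefix,
        "identifier": f"oai:clarin.vdu.lt:{handle}",
    }
    # Keep ':' and '/' unescaped so the result matches the URL in the record.
    return f"{BASE_URL}?{urlencode(params, safe=':/')}"


url = oai_get_record_url(HANDLE)
print(url)
```

The resulting string can be passed to any HTTP client to retrieve the Dublin Core record as XML.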