Free morphological analyzer Majka

Dataset

Majka is a fast tool that assigns a lemma (base form) and all possible grammatical tags to each word form in the input. It can also be used for word form generation or diacritics restoration. Majka builds on Ajka, the earlier system for morphological analysis, and the two give roughly the same results. However, Majka is an entirely new and independent implementation based completely on finite automata, and it is much faster and more flexible than Ajka.

Free Majka databases for multiple languages can be found at: https://nlp.fi.muni.cz/ma/

When using Majka for research purposes, please cite: Pavel Šmerk. Fast Morphological Analysis of Czech. In Petr Sojka and Aleš Horák (eds.). Proceedings of Third Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2009. Brno: Masaryk University, 2009. p. 13–16. ISBN 978-80-210-5048-8. https://nlp.fi.muni.cz/raslan/2009/papers/13.pdf

Usage: The program expects one entry per line on its standard input (a word, a lemma, or a lemma:tag string, depending on the data file in use) and prints the requested information on its standard output.
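The line-oriented interface described above can be driven from a script. The following is a minimal Python sketch, not part of the Majka distribution: the helper name run_analyzer and the stand-in ECHO_COMMAND are hypothetical, and the actual Majka command line (binary path and whichever option selects the data file) is deliberately left to the caller because it is not specified here. The output lines are returned unparsed, since their format depends on the data file in use.

```python
import subprocess
import sys
from typing import List, Sequence


def run_analyzer(command: Sequence[str], entries: Sequence[str]) -> List[str]:
    """Write one entry per line to the command's stdin; return its stdout lines.

    `command` is the full invocation supplied by the caller, for example the
    majka binary plus the options that select a data file (assumption: those
    options are taken from Majka's own documentation, not from this sketch).
    """
    # One entry per line, as the usage note requires.
    payload = "".join(entry + "\n" for entry in entries)
    completed = subprocess.run(
        list(command),
        input=payload,
        capture_output=True,
        encoding="utf-8",  # the listed text format is UTF-8
        check=True,        # raise CalledProcessError on a non-zero exit status
    )
    return completed.stdout.splitlines()


# Stand-in command so the sketch runs without Majka installed: it copies
# stdin to stdout. Replace it with the real majka invocation.
ECHO_COMMAND = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read())",
]

if __name__ == "__main__":
    for line in run_analyzer(ECHO_COMMAND, ["ab", "cd"]):
        print(line)
```

Passing the whole command as a list keeps the sketch independent of any particular Majka option, and check=True surfaces a failing run as an exception instead of silently returning an empty result.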

Binaries: majka for Linux, majka.exe for Windows
Source code: majkalinux.tgz for Linux, majkawin.tgz for Windows

Identifier
PID	http://hdl.handle.net/11234/1-5988
Related Identifier	https://nlp.fi.muni.cz/ma/
Metadata Access	http://lindat.mff.cuni.cz/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:lindat.mff.cuni.cz:11234/1-5988

Provenance
Creator	Pavel Šmerk
Publisher	Masaryk University, NLP Centre
Publication Year	2009
Rights	GNU General Public License 2 or later (GPL-2.0); http://opensource.org/licenses/GPL-2.0; PUB
OpenAccess	true
Contact	lindat-help(at)ufal.mff.cuni.cz

Representation
Resource Type	toolService
Format	application/octet-stream; application/x-gzip; text/plain; charset=utf-8; downloadable_files_count: 5
Discipline	Linguistics