Corpus extraction tool LIST 1.0

The LIST corpus extraction tool is a Java program for extracting lists from text corpora at the level of characters, word parts, words, and word sets. It supports the VERT and TEI P5 XML input formats and outputs CSV files that can be imported into Microsoft Excel or similar software for statistical processing.

Identifier
PID http://hdl.handle.net/11356/1227
Related Identifier http://www.sdjt.si/wp/wp-content/uploads/2018/09/JTDH-2018_Kljucevsek-et-al_Ucinkovit-izracun-frekvencnih-statistik-za-slovenske-jezikovne-korpuse.pdf
Related Identifier https://gitea.cjvt.si/lkrsnik/list
Related Identifier http://hdl.handle.net/11356/1276
Related Identifier http://slovnica.ijs.si/
Metadata Access http://www.clarin.si/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:www.clarin.si:11356/1227
Provenance
Creator Krsnik, Luka; Arhar Holdt, Špela; Čibej, Jaka; Dobrovoljc, Kaja; Ključevšek, Aleksander; Krek, Simon; Robnik-Šikonja, Marko
Publisher Centre for Language Resources and Technologies, University of Ljubljana; Faculty of Computer and Information Science, University of Ljubljana; Jožef Stefan Institute
Publication Year 2019
Rights The MIT License (MIT); https://opensource.org/licenses/mit-license.php; PUB
OpenAccess true
Contact info(at)clarin.si
Representation
Language Slovenian; Slovene; English
Resource Type toolService
Format application/zip; text/plain; charset=utf-8; downloadable_files_count: 1
Discipline Linguistics