1,005 datasets found

Language: Spanish

  • Diccionario de neologismos on line

    Lexicographic resource containing 3,530 neologisms documented in the Spanish-language press between 1989 and 2007.
  • Universal Dependencies 2.8

    Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual...
  • CoNLL 2017 and 2018 Shared Task Blind and Preprocessed Test Data

    CoNLL 2017 and 2018 shared tasks: Multilingual Parsing from Raw Text to Universal Dependencies. This package contains the test data in the form in which they were presented to...
  • Universal Dependencies 1.0

    Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual...
  • VOLEM

    Multilingual verbal lexicon: Catalan, Spanish (connection with French and Basque of other groups)
  • Deep Universal Dependencies 2.5

    Deep Universal Dependencies is a collection of treebanks derived semi-automatically from Universal Dependencies (http://hdl.handle.net/11234/1-3105). It contains additional...
  • Corpus bilingüe d’alternança de llengües (codeswitching)

    8 interactive recordings of group dynamics. Bilingual speakers (L1 -> English; L1 -> Catalan/Spanish).
  • HamleDT 2.0

    HamleDT 2.0 is a collection of 30 existing treebanks harmonized into a common annotation style, the Prague Dependencies, and further transformed into Stanford Dependencies, a...
  • Universal Segmentations 1.0 (UniSegments 1.0)

    Universal Segmentations (UniSegments) is a collection of lexical resources capturing morphological segmentations harmonised into a cross-linguistically consistent annotation...
  • WMT 13 Test Set

    We provide the Vietnamese version of the multi-lingual test set from WMT 2013 [1] competition. The Vietnamese version was manually translated from English. For completeness,...
  • OmegaWiki

    This dataset has no description

  • Universal Derivations v1.1

    Universal Derivations (UDer) is a collection of harmonized lexical networks capturing word-formation, especially derivational relations, in a cross-linguistically consistent...
  • SynSemClass 5.0

    The SynSemClass synonym verb lexicon version 5.0 is a multilingual resource that enriches previous editions of this event-type ontology with a new language, Spanish. The...
  • Multilingual Central Repository

    Multilingual lexical database that follows the model proposed by the EuroWordNet project. The MCR integrates into the same EuroWordNet framework wordnets from five different...
  • Universal Dependencies 2.4 Models for UDPipe (2019-05-31)

    Tokenizer, POS Tagger, Lemmatizer and Parser models for 90 treebanks of 60 languages of Universal Dependencies 2.4 Treebanks, created solely using UD 2.4 data...
  • NameTag 3 Multilingual Model 250203

    This is a trained model for the supervised machine learning tool NameTag 3 (https://ufal.mff.cuni.cz/nametag/3/). NameTag 3 is an open-source tool for both flat and nested named...
  • CorPipe 23 multilingual CorefUD 1.2 model (corpipe23-corefud1.2-240906)

    The corpipe23-corefud1.2-240906 is an mT5-large-based multilingual model for coreference resolution usable in CorPipe 23 (https://github.com/ufal/crac2023-corpipe). It is released...
  • C4Corpus (CC BY-SA part)

    A large web corpus (over 10 billion tokens) in 50+ languages, licensed under the Creative Commons license family, that has been extracted from CommonCrawl, the largest publicly...
  • Bwananet

    Tool for querying the Technical Corpus of the Institut Universitari de Lingüística Aplicada.
  • PALIC

    A package of tools for processing the Corpus Tècnic in Catalan and Spanish. It includes a preprocessor, a PoS tagger and a linguistic disambiguator.
You can also access this registry using the API (see API Docs).