118 datasets found

Keywords: Language resources

  • Galician LMF Apertium Dictionary

    This is the LMF version of the Galician Apertium dictionary. Monolingual dictionaries for Spanish, Catalan, Galician and Euskera have been generated from the Apertium expanded...
  • French-Spanish LMF Apertium Bilingual dictionary

    This is the LMF version of the Apertium bilingual dictionary for the French and Spanish languages. Bilingual LMF dictionaries were generated from Apertium bilingual dix files. For...
  • Spanish LMF Parole/Simple Lexicon

    This is the LMF version of the Spanish Parole-Simple lexicon. The original PAROLE lexica (20,000 entries per language) were built to conform to a model based on EAGLES guidelines...
  • LMF version of the SenSem Spanish Data Base

    This is the LMF version of the SenSem database created by the Spanish Inter-University Research Group GRIAL. As part of the SenSem project, a corpus of sentences annotated at the...
  • Corpus92 Corpus

    The corpus consists of a number of texts corresponding to Access to University examinations held in June 1992 at several Spanish universities. It contains about 350,000 words...
  • English-Galician CLUVI Dictionary

    This is the LMF version of the English-Galician CLUVI Dictionary developed under the direction of Xavier Gómez Guinovart (2005-2012) from parallel texts in the CLUVI Corpus of...
  • IULA Spanish-English Technical Corpus

    The corpus consists of a number of specialized texts (Law, Economics, Medicine, Environment and Computer Science domains) available in both Spanish and English. This...
  • CLUVI Parallel Corpus

    The CLUVI Corpus of the University of Vigo is an open collection of parallel text corpora developed under the direction of Xavier Gómez Guinovart (2003-2012) that covers...
  • GrAF version of Catalan portions of Wikipedia Corpus

    This is the stand-off GrAF version of the Catalan portions of Wikipedia (based on a 2006 dump). This Wikipedia Catalan Corpus contains 122052 articles that contain about 47,3...
  • IULA Penn Treebank

    This treebank consists of a number of Spanish and English sentences that have been manually annotated with syntactic information. The sentences have been chosen from the Penn...
  • IULA Spanish LSP Treebank

    This treebank consists of a number of syntactically analyzed sentences. The sentences have been chosen from the IULA LSP corpus, automatically annotated with POS information...
  • GrAF version of Spanish portions of Wikipedia Corpus

    This is the stand-off GrAF version of the Spanish portions of Wikipedia (based on a 2006 dump). This Wikipedia Spanish Corpus contains 257019 articles that contain about 150,1...
  • MATE Parser module for Spanish

    In this package we include the following: logonFinal20130315_4matetools361.model; parse_ESCAsentences_mate.sh; freeling_spaMate.sh; toconll2006.py; prueba.txt (test file: 4...
  • PANACEA Environment Multi Word Italian Lexicon

    The Environment MW Italian Lexicon is a lexicon of noun-noun multiword expressions automatically extracted from a 36-million-word web-crawled corpus in the environmental domain....
  • PANACEA Italian V-SUBCAT Repubblica lexicon (language independent extractor)

    This is a lexicon of verb subcategorisation frames automatically extracted from a 300-million-word newspaper corpus using language-independent SCF acquisition software. The...
  • PANACEA Italian V-SUBCAT Repubblica lexicon (language dependent extractor)

    The OpenDomain SCF Italian Lexicon is a lexicon of verb subcategorisation frames automatically extracted from a 300-million-word newspaper corpus using a language-dependent SCF...
  • PANACEA Labour Multi Word Italian Lexicon

    The Labour MW Italian Lexicon is a lexicon of noun-noun multiword expressions automatically extracted from a 70-million-word web-crawled corpus in the labour law domain. The...
  • PANACEA Spanish multi-level, multi-domain lexicon

    This is a multi-level, multi-domain lexicon for Spanish. It combines the automatically acquired lexica for the ENV and LAB domains using the PANACEA platform and some general domain...
  • PANACEA Labour SCF MWE merged Italian Lexicon

    The Italian PANACEA_LAB_SCF_MWE_merged.lmf.xml lexicon is obtained by merging two automatically extracted lexicons: a domain lexicon (labour) for SCFs,...
  • PANACEA Environment SCF MWE merged Italian Lexicon

    The Italian PANACEA_ENV_MWE_SCF_merged.lmf.xml lexicon is obtained by merging two automatically extracted lexicons: a domain lexicon (environment) for SCFs,...
You can also access this registry using the API (see API Docs).