-
ECODE: Extractor de Contextos Definitorios (Definitional Context Extractor)
Source code of a system based on linguistic rules for extracting definitional contexts from specialized texts in Spanish. This system consists of... -
Quantification of chemical components present in phellem and phelloderm + plh...
The dataset contains the chemical composition (water content, fiber, six carbohydrates, total proteins, organic matter, and lignin) of the subcortical tissues (phellem and... -
IULA Spanish-English Technical Corpus
The corpus consists of a number of specialized texts (Law, Economics, Medicine, Environment and Computer Science domains) available in both Spanish and English languages. This... -
IULA Penn Treebank
This treebank consists of a number of Spanish and English sentences that have been manually annotated with syntactic information. The sentences have been chosen from the Penn... -
LMF version of the SenSem Spanish Data Base
This is the LMF version of the SenSem database created by the Spanish Inter-University Research Group GRIAL. As part of the SenSem project, a corpus of sentences annotated at the... -
MATE Parser module for Spanish
In this package we include the following: logonFinal20130315_4matetools361.model; parse_ESCAsentences_mate.sh; freeling_spaMate.sh; toconll2006.py; prueba.txt (test file: 4... -
PANACEA Spanish automatically acquired lexicon for ENV domain: Lexical Semant...
This is a domain-specific lexicon for Spanish for the environment (ENV) domain. This lexicon contains a set of nouns classified into nine different semantic classes. It has... -
PANACEA Environment Bilingual Glossary EL-EN (Greek-English)
This folder contains files for bilingual glossary creation from factored phrase tables that include part-of-speech tagged text for the EL-EN language pair. The tables are firstly... -
PANACEA Environment Corpus n-grams ES (Spanish)
This data set contains Spanish word n-grams and Spanish word/tag/lemma n-grams in the "Environment" (ENV) domain. N-grams are accompanied by their observed frequency counts. The... -
PANACEA English Gold Standard for lexical semantic classification
We present a set of English gold standards for different noun classes created in PANACEA to train and test automatic classifiers. To create these gold standards we used the... -
PANACEA Labour Legislation Corpus n-grams EN (English)
This data set contains English word n-grams and English word/tag/lemma n-grams in the "Labour Legislation" (LAB) domain. N-grams are accompanied by their observed frequency... -
PANACEA Annotated Dependency Greek Labour Legislation Corpus Version 2
PANACEA Annotated Greek Labour Legislation Corpus Version 2 consists of Greek texts in the Labour Legislation (LAB) domain that were collected and automatically annotated in the... -
PANACEA Environment Corpus n-grams IT (Italian)
This data set contains Italian word n-grams and Italian word/tag/lemma n-grams in the "Environment" (ENV) domain. N-grams are accompanied by their observed frequency counts. The... -
PANACEA Spanish automatically acquired lexicon for ENV domain: Subcategorizat...
This is a domain-specific lexicon for Spanish for the environment (ENV) domain. This lexicon contains both subcategorization frames for verbs and lexical semantic classes for nouns.... -
PANACEA Environment Bilingual Glossary FR-EN (French-English)
This folder contains files for bilingual glossary creation from factored phrase tables that include part-of-speech tagged text for the FR-EN language pair. The tables are firstly... -
PANACEA Labour and Repubblica merged Italian Lexicon
The Italian PANACEA_rep_lab_merged.lmf.xml is an SCF lexicon obtained by merging two automatically extracted lexicons: a domain lexicon (labour) PANACEA_SCF_IT_labour.lmf.xml and a... -
PANACEA Italian V-SUBCAT gold-standard for LAB domain
The PANACEA_SCF_Gold_LAB_IT is a manually created "gold-standard" lexicon of verbal subcategorisation frames for 27 verb lemmas. The language is Italian and the domain is Labour... -
PANACEA Italian Parole V-SUBCAT Gold Standard lexicon
The PAROLE-SCF-31-IT is a lexicon of verb subcategorisation frames for 31 verb lemmas extracted from the PAROLE Italian Lexicon (Ruimy et al. 2003). -
PANACEA Labour and Parole merged Italian Lexicon
The Italian PAROLE_lab_merged.lmf.xml is an SCF lexicon obtained by merging two automatically extracted lexicons: a domain lexicon (labour) PANACEA_SCF_IT_labour.lmf.xml and a the... -
PANACEA English V-SUBCAT gold-standard for LAB domain
This is a domain-specific gold standard for English subcategorization frames, in this case for the labour (LAB) domain. This gold standard was manually developed, choosing a set of...