PANACEA English automatically acquired lexicon for ENV domain: Subcategorization Frames (V-SUBCAT)

This lexicon was produced with tpc_subcat_inductive, an inductive subcategorization frame (SCF) classifier provided as a web service in the PANACEA project. It was acquired automatically from the PANACEA MCv2 crawled corpus in two steps: the corpus was parsed with the RASP parser (Third Release, Open-Source Version, February 2001, available from http://ilexir.co.uk; see also E. Briscoe, J. Carroll, and R. Watson, 2006, The Second Release of the RASP System, in Proceedings of COLING/ACL Interactive Presentation Sessions), and the parsed data was then processed with tpc_subcat_inductive. Only verb lemmas with at least 200 instances in MCv2 were retained.
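The frequency cut-off in the description can be illustrated with a minimal sketch. It is not the PANACEA implementation: the function name, the input format (a flat list with one lemma per verb occurrence) and the toy data are assumptions made for illustration. Only the threshold of 200 instances comes from the description above.

```python
from collections import Counter

# Threshold stated in the dataset description.
MIN_INSTANCES = 200


def frequent_verb_lemmas(lemma_occurrences, min_instances=MIN_INSTANCES):
    """Return the set of lemmas that occur at least `min_instances` times.

    `lemma_occurrences` is assumed to be an iterable with one verb lemma
    per occurrence in the corpus (a hypothetical input format).
    """
    counts = Counter(lemma_occurrences)
    return {lemma for lemma, n in counts.items() if n >= min_instances}


# Toy data (hypothetical, not taken from MCv2).
occurrences = ["pollute"] * 250 + ["emit"] * 200 + ["recycle"] * 199
print(sorted(frequent_verb_lemmas(occurrences)))  # ['emit', 'pollute']
```

In this sketch "emit" is kept because the cut-off is inclusive ("at least 200"), while "recycle", with 199 occurrences, is dropped.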

Identifier
DOI https://doi.org/10.34810/data363
Metadata Access https://dataverse.csuc.cat/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.34810/data363
Provenance
Creator University of Cambridge. Department of Theoretical and Applied Linguistics
Publisher CORA.Repositori de Dades de Recerca
Contributor Universitat Pompeu Fabra. Institut Universitari de Lingüística Aplicada (IULA)
Publication Year 2023
Funding Reference European Commission 248064
Rights Custom Dataset Terms; info:eu-repo/semantics/openAccess; https://dataverse.csuc.cat/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.34810/data363
OpenAccess true
Representation
Resource Type Textual data; Dataset
Format application/pdf; text/xml; text/plain; text/html
Size 172184; 989214; 273; 9731; 3967
Version 1.0
Discipline Agriculture, Forestry, Horticulture, Aquaculture; Agriculture, Forestry, Horticulture, Aquaculture and Veterinary Medicine; Humanities; Life Sciences; Social Sciences; Social and Behavioural Sciences; Soil Sciences
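
The Metadata Access URL listed under Identifier is an OAI-PMH GetRecord request. As a hedged sketch, the same URL can be assembled from its parts with the Python standard library. The endpoint, verb, metadata prefix and identifier are copied from this record; the helper name oai_get_record_url is hypothetical.

```python
from urllib.parse import urlencode

# Endpoint taken from the Metadata Access field of this record.
OAI_ENDPOINT = "https://dataverse.csuc.cat/oai"


def oai_get_record_url(identifier, metadata_prefix="oai_datacite"):
    """Build an OAI-PMH GetRecord URL for one record (hypothetical helper)."""
    params = {
        "verb": "GetRecord",
        "metadataPrefix": metadata_prefix,
        "identifier": identifier,
    }
    # Keep ':' and '/' unescaped so the result matches the URL in the record.
    return OAI_ENDPOINT + "?" + urlencode(params, safe=":/")


url = oai_get_record_url("doi:10.34810/data363")
print(url)
```

The resulting URL can then be fetched with any HTTP client, for example urllib.request.urlopen, to retrieve the DataCite metadata as XML, assuming the endpoint is reachable.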