DK-CLARIN LSP Corpus - Environment domain

Texts in the Environment domain come from Hovedland, Danske Miljøundersøgelser, Det Økologiske Råd and Aktuel Naturvidenskab (via DMI). The corpus consists of 1,478,298 words in 93 files. Communicative setting (number of files): expert->expert (2), expert->advanced (23), expert->basic (68). All texts are in XML TEI P5 format (TEIP5DKCLARIN format), with tokenisation, POS tagging, sentence and paragraph segmentation, lemmatisation and termhood annotation placed in separate text-external span groups. "DK-CLARIN LSP Corpus - Environment domain" is part of the Danish DK-CLARIN LSP corpus, which consists of seven sub-corpora from the following subject domains: Agriculture, Construction, Economics, Environment, Health, IT and Nanotechnology.

Identifier
PID http://hdl.handle.net/20.500.12115/13
Metadata Access http://repository.clarin.dk/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:repository.clarin.dk:20.500.12115/13
Provenance
Creator Olsen, Sussi; Braasch, Anna; Halskov, Jakob; Hansen, Dorte Haltrup
Publisher Centre for Language Technology, NorS, University of Copenhagen; The Danish Language Council
Publication Year 2011
Rights CLARIN-ACA-NC; https://kitwiki.csc.fi/twiki/bin/view/FinCLARIN/ClarinEulaAca?ID=1&AFFIL=EDU&BY=1&NC=1&NORED=1; ACA
OpenAccess true
Contact info(at)clarin.dk
Representation
Language Danish
Resource Type corpus
Format text/plain; charset=utf-8; application/zip; text/plain; application/pdf; text/xml; downloadable_files_count: 11
Discipline Linguistics
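
The Metadata Access URL listed under Identifier has the shape of an OAI-PMH GetRecord request. As a minimal sketch, assuming other records in the same repository follow the same identifier pattern (which this record does not confirm), the URL can be assembled from a handle as shown below. The helper name `build_get_record_url` is hypothetical and used for illustration only.

```python
from urllib.parse import urlencode

# Base endpoint taken from the Metadata Access field of this record.
OAI_ENDPOINT = "http://repository.clarin.dk/repository/oai/request"


def build_get_record_url(handle: str, metadata_prefix: str = "oai_dc") -> str:
    """Return an OAI-PMH GetRecord URL for a handle such as '20.500.12115/13'.

    Hypothetical helper: the 'oai:repository.clarin.dk:' prefix is copied
    from this record and is only assumed to apply to other handles.
    """
    params = {
        "verb": "GetRecord",
        "metadataPrefix": metadata_prefix,
        "identifier": f"oai:repository.clarin.dk:{handle}",
    }
    # Keep ':' and '/' unescaped so the result matches the URL in the record.
    return f"{OAI_ENDPOINT}?{urlencode(params, safe=':/')}"


# Rebuild the Metadata Access URL for this corpus from its handle.
print(build_get_record_url("20.500.12115/13"))
```

Building the query from a dictionary keeps the three request parameters (verb, metadataPrefix, identifier) explicit, so a different handle or metadata prefix can be substituted without editing the URL by hand.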