Providing access to Grey Literature: the CLARIN infrastructure

Dataset

Technological progress, particularly in computer science, has eased access to, and retrieval and use of, information as a consequence of the radical transformation that formats have undergone: from papers organized on shelves to electronic files archived on the web. “The Internet has thus had the paradoxical result of making grey literature far easier to access and retrieve than once was the case, but simultaneously making so much available that it is often much harder to find or identify relevant material in the first place” (Hartman, 2006). While in its early days technology, as Hartman states, kept a very quick and somewhat wild pace in publishing any type of information online, there is now a need for more sophisticated core technologies and technological building blocks to better exploit the huge amount of digital content available on the web. Hence the blossoming of infrastructures: large technological shells that host documentary repositories intended to meet the expectations of a well-educated and demanding audience. The strengthening of these infrastructures at different levels (academic, national, trans-national, community, disciplinary, commercial, industrial, etc.) implies a further step in the process of gathering, organizing, managing, preserving and disseminating a huge amount of relevant information. “The official definition of ‘research infrastructure’ refers to structures, resources and services used by a scientific community for carrying out a high-level research in several fields, from astronomy, physics, biology, archaeology, to the humanities. At a European level the scientific communities get together in a consortium thus creating infrastructures accessible to all their members and sharing the same resources” (Monachini, Frontini, 2016).
Infrastructures stimulate new research avenues by relying on the comparison, re-use and integration into current research of the outcomes of past and ongoing field and laboratory activity. Such data are scattered across diverse digital collections and datasets, unpublished reports (grey literature), and publications. Given this scenario, the authors, who have worked for years on documentation, digitization and language technologies for the Humanities, focus on an important European research infrastructure called CLARIN (Common Language Resources and Technology Infrastructure) in order to assess the traceability of grey literature within it.

Identifier
DOI	https://doi.org/10.17026/dans-xwd-7j28
PID	https://nbn-resolving.org/urn:nbn:nl:ui:13-li-pnet
Metadata Access	https://easy.dans.knaw.nl/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=oai:easy.dans.knaw.nl:easy-dataset:78290

Provenance
Creator	Goggi, S.
Publisher	Data Archiving and Networked Services (DANS)
Contributor	Pardelli, G.; Russo, I.; Bartolini, R.; Monachini, M.
Publication Year	2018
Rights	info:eu-repo/semantics/openAccess; License: http://creativecommons.org/publicdomain/zero/1.0
OpenAccess	true

Representation
Language	English
Resource Type	Dataset
Format	XLSX; PDF
Discipline	Humanities