NLP in Diagnostic Texts from Nephropathology [Research Data]


This data set contains all annotated topic word tables from the work "NLP in Diagnostic Texts from Nephropathology", as well as all pre-processed and tf-idf-vectorized text files. The raw texts (i.e., the descriptive and diagnostic sections) are deliberately not made available, because it cannot be ruled out that the patient or the person who wrote the report could be identified from them. This restriction is in accordance with our local ethics committee.

Please note: This data set is not yet complete; the remaining files will be added soon.

Please refer to chapter 3.1.2 of our paper to learn how to interpret the annotated topic word tables.

The associated GitLab project contains examples of how the .pkl files can be opened and used with Python.
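
As a rough orientation, a .pkl file can usually be opened with Python's standard pickle module. The following is a minimal sketch and not the code from the GitLab project: the file name example_texts.pkl and the stand-in content (a list of tokenized texts) are assumptions for illustration, and the actual files may hold other object types (for example sparse matrices, which would require the corresponding library to be installed). Since unpickling can execute arbitrary code, only open .pkl files from a source you trust.

```python
import pickle
import tempfile
from pathlib import Path


def load_pickle(path):
    """Load and return the Python object stored in a .pkl file."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def describe(obj):
    """Return the type name and, if available, the length of a loaded object."""
    size = len(obj) if hasattr(obj, "__len__") else None
    return type(obj).__name__, size


# Stand-in data: a hypothetical list of pre-processed, tokenized texts.
# The real files in this data set may hold different object types.
example_docs = [["glomerulus", "sclerosis"], ["tubulus", "atrophy"]]

with tempfile.TemporaryDirectory() as tmp_dir:
    example_path = Path(tmp_dir) / "example_texts.pkl"  # hypothetical file name
    with open(example_path, "wb") as handle:
        pickle.dump(example_docs, handle)

    # Load the file back and inspect what kind of object it contains.
    loaded = load_pickle(example_path)
    summary = describe(loaded)
    print(summary)
```

To inspect one of the downloaded files, call load_pickle with its local path and pass the result to describe before working with the data.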

Metadata Access
Creator Legnar, Maximilian; Daumke, Philipp; Hesser, Jürgen; Porubsky, Stefan; Popovic, Zoran; Bindzus, Jan Niklas; Siemoneit, Joern-Helge; Weis, Cleo-Aron
Publisher heiDATA
Contributor Legnar, Maximilian; Weis, Cleo-Aron; Institute of Pathology, Medical Faculty Mannheim, Heidelberg University; Bindzus, Jan Niklas
Publication Year 2022
Rights CC BY 4.0; info:eu-repo/semantics/openAccess
OpenAccess true
Contact Legnar, Maximilian (Institute of Pathology, Medical Faculty Mannheim, Heidelberg University, Germany); Weis, Cleo-Aron (Institute of Pathology, Medical Faculty Mannheim, Heidelberg University, Germany)
Resource Type Dataset
Format application/octet-stream; application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
Size 658789; 875657; 316789; 475357; 26482; 26513; 25989; 25160; 24221; 27573; 27585
Version 1.3
Discipline Life Sciences; Medicine