Dataset of network traffic features extracted from real traffic on the internal hospital network of Helios (Germany). The dataset comprises 12 files in Parquet format, containing statistical network features extracted from one week of hospital network traffic. The files are organized into three categories according to the feature extraction methodology: flow-based features, window-based features, and combined features.
Each feature category includes four Parquet files, one per temporal segment of the weekly capture: (i) Friday capture, (ii) weekend capture, (iii) Monday capture, and (iv) aggregated Monday-to-Friday capture. Three feature types (flow, window, and combined) with four files each give the twelve files of the dataset.
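The file layout described above can be sketched as a small manifest. This is only an illustration: the actual file names are not given in this description, so the category and segment labels and the naming pattern below are hypothetical placeholders that must be adapted to the names of the downloaded files.

```python
from itertools import product

# Hypothetical labels; the real file names in the dataset may differ.
CATEGORIES = ("flow", "window", "combined")
SEGMENTS = ("friday", "weekend", "monday", "monday_to_friday")


def expected_files(pattern="{category}_{segment}.parquet"):
    """Return one (category, segment, filename) triple per expected file."""
    return [
        (category, segment, pattern.format(category=category, segment=segment))
        for category, segment in product(CATEGORIES, SEGMENTS)
    ]


manifest = expected_files()
for category, segment, name in manifest:
    print(f"{category:9s} {segment:17s} {name}")
```

Once the names are matched to the real files, each one can be loaded with pandas.read_parquet(path), which requires a Parquet engine such as pyarrow or fastparquet to be installed.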
The flow-based files contain 40 features computed at the level of individual network flows. The window-based files contain 35 statistical features aggregated over windows of 100 packets. The combined files integrate flow-level and window-level attributes into a unified set of 72 features. This structure enables comparative analysis across feature extraction strategies and temporal traffic distributions within a real hospital network environment.
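To make the idea of window-based features concrete, the following minimal sketch aggregates a per-packet quantity over consecutive windows. It rests on several assumptions that are not confirmed by this description: that a window means 100 consecutive packets, that windows do not overlap, and that packet length is one of the aggregated quantities. The feature names and the input data are invented for illustration and are not the 35 features of the dataset.

```python
from statistics import mean, pstdev

WINDOW_SIZE = 100  # packets per window (assumed reading of the description)


def window_features(packet_lengths, window_size=WINDOW_SIZE):
    """Aggregate packet lengths over consecutive, non-overlapping windows."""
    rows = []
    # Step through the sequence one full window at a time; a trailing
    # partial window is dropped (a choice made for this sketch only).
    for start in range(0, len(packet_lengths) - window_size + 1, window_size):
        window = packet_lengths[start:start + window_size]
        rows.append({
            "pkt_len_mean": mean(window),
            "pkt_len_std": pstdev(window),  # population standard deviation
            "pkt_len_min": min(window),
            "pkt_len_max": max(window),
        })
    return rows


# Synthetic packet lengths in bytes, not taken from the dataset.
lengths = [60 + (i % 5) * 100 for i in range(250)]
features = window_features(lengths)
print(len(features), features[0])
```

With 250 synthetic packets the sketch yields two complete windows and discards the remaining 50 packets. Flow-based extraction differs in that packets are first grouped by flow before any statistic is computed.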
The dataset consists solely of aggregated, anonymized statistical features, so that no private patient data can be reconstructed from it.
Other funding agency: INCIBE CARISMATICA Chair of Cybersecurity