Electromagnetic modes of photonic nanostructures can exhibit increased near-field energy densities, which can be exploited in many fields such as biosensing, quantum dot solar cells, or photon upconversion. Optimizing such systems requires the systematic analysis of large amounts of numerically obtained three-dimensional field distribution data, such as the data in the presented dataset. The simulated system is a silicon photonic crystal slab on glass (subspace) with a hexagonal lattice of cylindrical holes. The holes are filled with a medium with a constant refractive index of 1.65, which also fills the complete superspace. The system is illuminated from the superspace with plane waves of transverse electric (TE) and transverse magnetic (TM) polarization at a polar angle θ and an azimuthal angle φ. The data includes electric and magnetic field distributions on characteristic planes, as well as input parameters (e.g. geometrical parameters of the system), derived quantities (e.g. reflectance, transmittance), and computational cost information for the simulations (e.g. CPU time and memory consumption). The dataset is composed of five HDF5 databases with data derived from time-harmonic simulations using the finite-element Maxwell solver JCMsuite. Specifically, these are (1) one database with the input data and derived quantities ('parameters_and_results.h5') and (2) four databases with the electric ('E') or magnetic ('H') field strength data for the polarizations TE or TM (filenames of the form 'field_data_{field}_{polarization}.h5'). All databases contain a '/data' group holding the main data and a number of groups with metadata tables. Each '/data' table has an index (the 'number' column) that uniquely identifies a simulation. All input parameters of the simulations are given in columns of the 'parameters_and_results.h5' database, together with the computational cost information and derived quantities. The metadata tables of this file are less relevant for further analyses. The '/data' tables of the field databases store the absolute values of the electric or magnetic field, respectively, for each simulation (i.e. index) and polarization (TE or TM) on the xy, xz, and yz planes of the computational domain. These distributions are flattened and have to be reshaped based on the information in the metadata tables. This data publication is accompanied by a code publication. The code takes the data as input and provides all the tools needed to reproduce the clustering analysis that underlies the main publication. The code publication also contains a detailed description of the quantities in the '/data' table of the 'parameters_and_results.h5' database, as well as code to restructure the field distribution data.
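
The file layout described above can be illustrated with a minimal Python sketch. It is not the code from the accompanying code publication, which remains the authoritative tool for restructuring the field data. The helper names (`field_database_names`, `load_data_table`, `reshape_plane`) are hypothetical. The sketch assumes that the '/data' tables can be read with `pandas.read_hdf` (which requires PyTables) and that each flattened distribution is stored in row-major order. Both assumptions, as well as the grid shape of each plane, should be checked against the metadata tables.

```python
import itertools

import numpy as np

FIELDS = ("E", "H")            # electric and magnetic field strength
POLARIZATIONS = ("TE", "TM")   # polarization of the incident plane wave


def field_database_names():
    """Build the four field database filenames from the stated pattern."""
    return [
        f"field_data_{field}_{polarization}.h5"
        for field, polarization in itertools.product(FIELDS, POLARIZATIONS)
    ]


def load_data_table(filename):
    """Load the '/data' table of one database into a DataFrame.

    Assumption: the table is stored in a pandas-compatible HDF5 layout,
    so that pandas.read_hdf (which needs PyTables) can read it.
    """
    import pandas as pd  # imported lazily; the other helpers do not need it

    return pd.read_hdf(filename, key="data")


def reshape_plane(flat_values, n_rows, n_cols):
    """Reshape one flattened field distribution into a 2D array.

    Assumption: row-major (C) ordering. The real grid shape and ordering
    must be taken from the metadata tables of the field database.
    """
    flat = np.asarray(flat_values, dtype=float)
    if flat.size != n_rows * n_cols:
        raise ValueError(
            f"expected {n_rows * n_cols} values, got {flat.size}"
        )
    return flat.reshape(n_rows, n_cols)


if __name__ == "__main__":
    # The four field databases; 'parameters_and_results.h5' is the fifth file.
    print(field_database_names())

    # Synthetic stand-in for one flattened plane (not data from the dataset).
    demo_plane = reshape_plane(np.arange(6), n_rows=2, n_cols=3)
    print(demo_plane.shape)
```

With the files available locally, `load_data_table('parameters_and_results.h5')` would return the table of input parameters and derived quantities, whose 'number' column can then be matched against the corresponding column of the field databases.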