This dataset serves as training data for modeling the temperature field emanating from open-loop groundwater heat pumps.
The dataset was simulated in 2D with FEFLOW using cut-outs from interpolated hydrogeological measurements of the Munich, Germany, region. Heat pumps are placed at realistic positions, and their extraction rates are adapted to the available groundwater.
To prepare the data for machine learning, it was transformed from unstructured to structured data (2560x2560 cells) with Python. Inputs consist of hydrogeological parameters such as hydraulic conductivity [m/d], log(conductivity), hydraulic head [m a.s.l.], hydraulic head gradient, transmissivity [m^2/d], and aquifer thickness [m], as well as operational pump parameters such as maximum flow rate [m^3/d]. All hydrogeological parameters were extracted prior to heat pump operation. Additionally, the Darcy velocities in the x- and y-directions [m/d] before heat pump operation may be used and are already included in the inputs, since they can easily be estimated from the other input fields.
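As an illustration of how the Darcy velocities relate to the other input fields, the following is a minimal sketch based on Darcy's law, q = -K * grad(h). It assumes an isotropic conductivity, a regular grid, and that array axis 1 is the x-direction and axis 0 the y-direction. The function name darcy_velocity and the cell_size value are hypothetical; the grid spacing is not stated in this description.

```python
import numpy as np

def darcy_velocity(conductivity, head, cell_size):
    """Estimate Darcy velocities q = -K * grad(h) on a regular grid.

    conductivity: hydraulic conductivity [m/d], shape (H, W)
    head:         hydraulic head [m a.s.l.], shape (H, W)
    cell_size:    grid spacing [m] (assumed; not given in the dataset description)
    Returns (q_x, q_y) in [m/d], taking axis 1 as x and axis 0 as y (an assumption).
    """
    # np.gradient returns one array per axis, in axis order: (d/dy, d/dx)
    dh_dy, dh_dx = np.gradient(head, cell_size)
    return -conductivity * dh_dx, -conductivity * dh_dy

# Toy example: head drops linearly from 510 m to 500 m across 5 cells in x
head = np.tile(np.linspace(510.0, 500.0, 5), (4, 1))
conductivity = np.full((4, 5), 100.0)  # uniform K of 100 m/d
q_x, q_y = darcy_velocity(conductivity, head, cell_size=5.0)
```

In this toy case the head gradient is -0.5 in x and zero in y, so the flow points in the positive x-direction only.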
Heat pumps are operated under seasonal load as depicted in prepared/normed_flow_injection_series.npy and prepared/temperature_injection_series.npy. The former contains the normalized flow rates of the heat pumps, while the latter contains the corresponding injection temperatures.
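A possible way to turn the normalized series into per-pump flow rates is sketched below. It assumes that the normalized series is one-dimensional over time and is expressed relative to each pump's maximum flow rate; neither assumption is confirmed by this description. The helper pump_flow_series and the example values are hypothetical.

```python
import numpy as np

def pump_flow_series(normed_flow, max_flow_rate):
    """Scale a normalized flow series of shape (T,) by per-pump maximum
    flow rates of shape (P,), giving flow rates [m^3/d] of shape (P, T)."""
    normed_flow = np.asarray(normed_flow, dtype=float)
    max_flow_rate = np.asarray(max_flow_rate, dtype=float)
    # Broadcasting: (P, 1) * (1, T) -> (P, T)
    return max_flow_rate[:, None] * normed_flow[None, :]

# With the dataset, the series would be loaded instead, e.g.
# normed_flow = np.load("prepared/normed_flow_injection_series.npy")
normed_flow = np.array([0.0, 0.5, 1.0, 0.5])  # made-up seasonal pattern
max_flow_rate = np.array([100.0, 250.0])      # made-up maxima for two pumps
flow = pump_flow_series(normed_flow, max_flow_rate)
```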
The prepared data is split into training and validation data. Each split contains a set of inputs (inputs_unnormed) and labels (labels_unnormed). The overall normalization information is contained in general/properties_info_normalization.yaml. The order of the inputs is given by index_orig in that YAML file.
The train and validation splits are only a suggestion; a different split may be chosen.
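The sketch below shows one way the normalization information could be applied to the unnormed inputs. Only index_orig is named in this description; the dictionary layout, the "min" and "max" keys, the property names, and all values are hypothetical stand-ins for whatever the YAML file actually contains, and min-max scaling is an assumed normalization scheme.

```python
import numpy as np

# Hypothetical excerpt of the parsed YAML (e.g. obtained via yaml.safe_load).
# Only "index_orig" is mentioned in the description; the rest is made up.
info = {
    "hydraulic_head": {"index_orig": 1, "min": 500.0, "max": 520.0},
    "conductivity": {"index_orig": 0, "min": 0.0, "max": 200.0},
}

def normalize_inputs(inputs, info):
    """Min-max normalize a (C, H, W) array channel by channel,
    using index_orig to find the channel of each property."""
    out = np.empty_like(inputs, dtype=float)
    for props in info.values():
        c = props["index_orig"]
        out[c] = (inputs[c] - props["min"]) / (props["max"] - props["min"])
    return out

# Toy input with two channels: conductivity (index 0) and head (index 1)
inputs = np.stack([np.full((2, 2), 100.0), np.full((2, 2), 505.0)])
normed = normalize_inputs(inputs, info)
```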
The test data is hidden for final evaluation and contains only the last time step of the temperature label, in the format (N, C, H, W), where N is the number of samples, C the number of channels (1 in this case), and H and W the spatial dimensions of the grid.
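To bring a predicted temperature series into the same (N, C, H, W) layout, a minimal sketch could look as follows. It assumes the predictions are stored as (N, T, H, W) with T time steps, which is not specified in this description; the function name last_step_prediction is hypothetical.

```python
import numpy as np

def last_step_prediction(series):
    """Reduce a temperature series of shape (N, T, H, W) to its last
    time step with a single channel axis, i.e. shape (N, 1, H, W)."""
    series = np.asarray(series)
    if series.ndim != 4:
        raise ValueError(f"expected 4 dimensions (N, T, H, W), got {series.ndim}")
    # Slicing with -1: keeps the time axis, which then acts as the channel axis
    return series[:, -1:, :, :]

# Toy example: 3 samples, 12 time steps, 8x8 grid
rng = np.random.default_rng(0)
series = rng.normal(loc=10.0, scale=1.0, size=(3, 12, 8, 8))
final = last_step_prediction(series)
```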