This dataset provides solar images and physical features covering June 2010 to June 2024 for use in solar wind speed (SWS) forecasting. It supplements the publication "PROSWIN: Probabilistic Solar Wind Speed Forecasting Using Deep Distributional Regression With Solar Images" (Journal TBD) by Collin et al. (2026) and allows all results therein to be reproduced. We provide:
(1) three preprocessed solar image channels and preprocessed magnetograms from the SDO satellite (AIA: 171 Å, 193 Å, 211 Å; HMI: magnetograms; note that the images are neither normalized nor scaled to the interval [-1, 1]),
(2) 63 physical features describing the solar wind conditions from the previous solar rotation, the state of the solar cycle, and the position angle of Earth relative to the solar equator,
(3) the trained neural network models based on combinations of solar image channels, magnetograms, and physical features from the journal publication,
(4) a list of high-speed solar wind streams (HSSs) and coronal mass ejections (CMEs), which can be used to evaluate a prediction model's performance with regard to HSSs and CMEs,
(5) the predicted time series of all the models we trained and of the models from the literature that we compare against in the journal publication, and
(6) the solar wind speed and sunspot number time series we use in the journal publication.
Image downloading and preprocessing were done using the following Python code: https://github.com/DanielCollin96/solar_image_processing. For all details on data preparation and usage, we refer the reader to the original journal article by Collin et al. (2026).
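Since the images in item (1) are provided without normalization or scaling to [-1, 1], users need to apply this step themselves before feeding the images to a model. The following is a minimal sketch of a generic min-max scaling to [-1, 1], not necessarily the procedure used in the journal article. The function name `scale_to_symmetric_range`, its `lower` and `upper` parameters, and the small example array are hypothetical and chosen for illustration only; NumPy is assumed to be installed.

```python
import numpy as np


def scale_to_symmetric_range(image, lower=None, upper=None):
    """Linearly map pixel values from [lower, upper] to [-1, 1].

    Hypothetical helper for illustration. If `lower` or `upper` is not
    given, the minimum or maximum of the image (ignoring NaNs) is used.
    Values outside [lower, upper] are clipped before scaling.
    """
    image = np.asarray(image, dtype=np.float64)
    lower = np.nanmin(image) if lower is None else float(lower)
    upper = np.nanmax(image) if upper is None else float(upper)
    if upper <= lower:
        raise ValueError("upper must be strictly greater than lower")
    clipped = np.clip(image, lower, upper)
    # Map lower -> -1 and upper -> +1.
    return 2.0 * (clipped - lower) / (upper - lower) - 1.0


# Tiny stand-in for an image; real data would be loaded from the dataset files.
example = np.array([[0.0, 50.0], [100.0, 200.0]])
scaled = scale_to_symmetric_range(example)
```

For signed data such as magnetograms, passing symmetric bounds (for example `lower=-b, upper=b` for some chosen field strength `b`) keeps a value of zero mapped to zero. Fixed bounds shared across all images also keep the scaling consistent over the full time range, whereas per-image minima and maxima do not.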