This repository contains the training dataset, evaluation dataset, and expert validation data for the paper "Deep learning for multi-criteria construction product selection: A context-sensitive preference scoring model applied to concrete". The dataset comprises 42,874 labelled scenarios that combine deterministic control cases, LLM-generated synthetic labels, and expert annotations. Each scenario describes between two and five concrete product alternatives, characterised by sustainability, performance, stakeholder, and situational features. The expert validation subset contains preference rankings and confidence scores collected from six domain experts across 32 real-product scenarios; it is used to assess how well model outputs align with professional judgement. The code used to train, test, and validate the model is publicly available on GitHub (https://github.com/eirasroger/concrete-selection-dl-model).