The electronic density of states (DOS) describes the distribution of electronic states in a material, and can be used to approximate its optical and electronic properties and therefore to guide computational materials design. Given its usefulness and relative simplicity, it has been one of the first electronic properties used as a target for machine-learning approaches going beyond interatomic potentials. A subtle but important point, well appreciated in the condensed matter community but usually overlooked in the construction of data-driven models, is that for bulk configurations the absolute energy reference of single-particle energy levels is ill-defined. Only energy differences matter, and quantities derived from the DOS are typically independent of the absolute alignment. We introduce an adaptive scheme that optimizes the energy reference of each structure as part of training, and show that it consistently improves the quality of ML models compared to traditional choices of energy reference, for different classes of materials and different model architectures. On a practical level, we trace the improved performance to the ability of this self-aligning scheme to match the most prominent features in the DOS. More broadly, we believe that this work highlights the importance of incorporating insights into the nature of the physical target into the definition of the architecture and of the appropriate figures of merit for machine-learning models, which translates into better transferability and overall performance.
This record contains all the necessary data files and scripts to support the results presented in the paper with the same title.
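
The self-aligning idea described in the abstract can be illustrated with a minimal sketch. It is not taken from the scripts in this record: the function names (`gaussian_dos`, `self_aligned_mse`), the Gaussian broadening width, the energy grid, and the eigenvalues are all hypothetical, and a discrete search over rigid shifts stands in for whatever optimization of the energy reference the actual training procedure uses. The point is only that an error minimized over the energy reference does not penalize a prediction that differs from the target by a rigid shift.

```python
import numpy as np

def gaussian_dos(eigenvalues, grid, sigma=0.3):
    """Gaussian-broadened DOS on an energy grid (illustrative helper)."""
    diff = grid[:, None] - eigenvalues[None, :]
    norm = sigma * np.sqrt(2.0 * np.pi)
    return np.exp(-0.5 * (diff / sigma) ** 2).sum(axis=1) / norm

def self_aligned_mse(pred, target, max_shift):
    """Minimum MSE over rigid shifts of the target by up to max_shift grid points.

    A fixed-length window of `pred` is compared with the shifted window of
    `target`, so every candidate shift is scored on the same number of points.
    Returns (best_error, best_shift_in_grid_points).
    """
    n = len(pred)
    window = pred[max_shift:n - max_shift]
    best_err, best_shift = np.inf, 0
    for s in range(-max_shift, max_shift + 1):
        shifted = target[max_shift + s:n - max_shift + s]
        err = np.mean((window - shifted) ** 2)
        if err < best_err:
            best_err, best_shift = err, s
    return best_err, best_shift

# Hypothetical data: the target is the prediction shifted rigidly by 0.5 eV.
grid = np.linspace(-10.0, 10.0, 401)          # 0.05 eV spacing (assumed)
levels = np.array([-3.0, -1.2, 0.4, 2.5])     # made-up eigenvalues
pred = gaussian_dos(levels, grid)
target = gaussian_dos(levels + 0.5, grid)

fixed_err = np.mean((pred - target) ** 2)     # fixed energy reference
aligned_err, best_shift = self_aligned_mse(pred, target, max_shift=20)
print(fixed_err, aligned_err, best_shift * (grid[1] - grid[0]))
```

With a fixed reference the two curves appear to disagree, while the shift-minimized error recovers the rigid offset and essentially vanishes. In a training loop, a per-structure reference of this kind would typically be a continuous, learnable parameter rather than the result of a grid search.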