Randomly-displaced methane configurations


Most datasets used to benchmark machine-learning models contain minimum-energy structures, or small fluctuations around stable geometries, and focus on the diversity of chemical compositions or the presence of different phases. This dataset provides a large number (7732488) of configurations for a simple CH4 composition, generated in an almost completely unbiased fashion. Hydrogen atoms are randomly distributed in a sphere of radius 3 Å centered on the carbon atom, and the only structures discarded are those in which two atoms are closer than 0.5 Å, or for which the reference DFT calculation does not converge. This dataset is ideal for benchmarking structural representations and regression algorithms, and for verifying whether they can reach arbitrary accuracy in the data-rich regime.
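
The sampling procedure described above can be illustrated with a minimal Python sketch. It is not the authors' generation script. The function name sample_methane, the uniform-in-volume distribution of the hydrogen positions, and the application of the 0.5 Å cutoff to every atom pair (C-H as well as H-H) are assumptions made for illustration. The DFT convergence filter is not reproduced.

```python
import numpy as np


def sample_methane(rng, radius=3.0, min_dist=0.5, max_tries=10_000):
    """Return a (5, 3) array of positions in angstrom: carbon first, then four hydrogens.

    Hypothetical helper: the carbon sits at the origin and each hydrogen is
    drawn independently inside a sphere of the given radius.
    """
    for _ in range(max_tries):
        # Assumption: uniform sampling in volume, i.e. a random direction
        # scaled by radius * u**(1/3) with u uniform in [0, 1).
        directions = rng.normal(size=(4, 3))
        directions /= np.linalg.norm(directions, axis=1, keepdims=True)
        radii = radius * rng.random(4) ** (1.0 / 3.0)
        hydrogens = directions * radii[:, None]

        positions = np.vstack([np.zeros(3), hydrogens])

        # Pairwise distances between all five atoms (upper triangle only).
        diffs = positions[:, None, :] - positions[None, :, :]
        dists = np.linalg.norm(diffs, axis=-1)
        upper = np.triu_indices(5, k=1)

        # Assumption: reject the structure if any pair is closer than min_dist.
        if dists[upper].min() >= min_dist:
            return positions

    raise RuntimeError("no valid configuration found within max_tries")


rng = np.random.default_rng(0)
positions = sample_methane(rng)
```

With a 3 Å sphere and a 0.5 Å cutoff, rejections are rare, so the loop typically returns on the first attempt. The resulting geometries are mostly far from the equilibrium tetrahedral structure, which is the point of the dataset.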

DOI https://doi.org/10.24435/materialscloud:qy-dp
Source https://archive.materialscloud.org/record/2020.110
Metadata Access https://archive.materialscloud.org/xml?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:materialscloud.org:528
Creator Pozdnyakov, Sergey; Willatt, Michael; Ceriotti, Michele
Publisher Materials Cloud
Publication Year 2020
Rights info:eu-repo/semantics/openAccess; Creative Commons Attribution Non Commercial 4.0 International https://creativecommons.org/licenses/by-nc/4.0/legalcode
OpenAccess true
Contact archive(at)materialscloud.org
Language English
Resource Type Dataset
Discipline Materials Science and Engineering