Fixed node diffusion Monte Carlo energies for over one thousand small organic molecules

In the past decade, quantum diffusion Monte Carlo (DMC) has been shown to predict the energetics and properties of a wide range of molecules and solids by numerically solving the electronic many-body Schrödinger equation. We show that, when DMC is coupled with quantum machine learning (QML) based surrogate methods, the computational burden can be alleviated to the point that quantum Monte Carlo (QMC) shows clear potential to undergird high-quality descriptions across chemical space. We discuss three crucial approximations necessary to accomplish this: the fixed-node approximation, universal and accurate references for chemical bond dissociation energies, and scalable QML models based on minimal amon sets (AQML). The numerical evidence presented includes converged DMC results for over one thousand small organic molecules with up to 5 heavy atoms, used as amons, and for 50 medium-sized organic molecules with 9 heavy atoms, used to validate the AQML predictions. Results for Δ-AQML models suggest that even modestly sized QMC training sets of amons suffice to predict total energies with near chemical accuracy throughout chemical space. This archive contains the DMC energies for both sets of molecules, as well as energies computed at cheaper levels of theory such as HF, DFT, MP2 and CCSD(T).
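
The Δ-learning idea behind the Δ-AQML models mentioned above can be illustrated with a minimal sketch: a model is trained on the difference between an expensive and a cheap level of theory, and the learned correction is then added to the cheap result. The code below is not the authors' AQML implementation. It assumes a plain Gaussian-kernel ridge regression, made-up two-dimensional feature vectors in place of molecular representations, and synthetic "low-level" and "high-level" energies. All function names are hypothetical.

```python
import numpy as np

def gaussian_kernel(a, b, sigma):
    # Pairwise Gaussian kernel between the rows of a and the rows of b.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def fit_delta_model(x_train, e_low, e_high, sigma=1.0, reg=1e-6):
    # Kernel ridge regression on the correction (high level minus low level).
    delta = e_high - e_low
    k = gaussian_kernel(x_train, x_train, sigma)
    return np.linalg.solve(k + reg * np.eye(len(x_train)), delta)

def predict_high(x_new, e_low_new, x_train, alpha, sigma=1.0):
    # Predicted high-level energy = cheap baseline + learned correction.
    return e_low_new + gaussian_kernel(x_new, x_train, sigma) @ alpha

def toy_low_level(x):
    # Synthetic stand-in for a cheap method (assumption, not real data).
    return x.sum(axis=1)

def toy_high_level(x):
    # Synthetic stand-in for an expensive method: baseline plus a smooth shift.
    return x.sum(axis=1) + 0.1 * np.sin(x[:, 0])

rng = np.random.default_rng(0)
x_train = rng.uniform(-1.0, 1.0, size=(40, 2))
alpha = fit_delta_model(x_train, toy_low_level(x_train), toy_high_level(x_train))

x_test = rng.uniform(-1.0, 1.0, size=(5, 2))
pred = predict_high(x_test, toy_low_level(x_test), x_train, alpha)
max_err = np.abs(pred - toy_high_level(x_test)).max()
print(f"max abs error on toy test points: {max_err:.2e}")
```

Learning the correction rather than the total energy is attractive when the correction varies more smoothly than the energy itself, because fewer expensive training points are then needed. In this toy setting that smoothness is built in by construction.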

Identifier
Source https://archive.materialscloud.org/record/2022.177
Metadata Access https://archive.materialscloud.org/xml?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:materialscloud.org:1421
Provenance
Creator Huang, Bing; von Lilienfeld, Anatole; Krogel, Jaron; Benali, Anouar
Publisher Materials Cloud
Publication Year 2022
Rights info:eu-repo/semantics/openAccess; Creative Commons Attribution 4.0 International https://creativecommons.org/licenses/by/4.0/legalcode
OpenAccess true
Contact archive(at)materialscloud.org
Representation
Language English
Resource Type Dataset
Discipline Materials Science and Engineering