Knowledge of the oxidation state of a metal centre in a material is essential to understanding its properties. Chemists have developed theories that predict the oxidation state from counting rules, but these can fail for systems such as metal-organic frameworks.
Here we present a data-driven approach to automatically assign oxidation states, using a machine learning model trained on the assignments that chemists have encoded in the chemical names in the Cambridge Structural Database.
Our approach considers only the immediate local environment around a metal centre and is robust to experimental uncertainties (such as incorrect protonation, unbound solvents, or changes in bond length).
Our predictions are accurate enough that we can use the method to detect incorrect assignments.
This work illustrates the power of the collective knowledge of chemists: machine learning can harvest this knowledge and convert it into a useful tool for chemists.