A new way to represent atomic species improves molecular machine learning models

AUG 23, 2019
The creation of elemental modes to represent atomic species enables transfer learning and lowers computational cost, bringing chemical modeling closer to a single machine learning model that can be fine-tuned for different systems.

Machine learning helps chemists explore candidate molecules and predict their properties. Currently, however, every atomic species in a molecule requires a different machine learning model, which is computationally costly and prevents information sharing between models. Developing a better way to represent atomic species is a necessary step toward a single machine learning model that could be applied to any system.

Herr et al. present a new way to represent atomic species, called elemental modes. The authors identified a set of physical properties for each atomic species and compressed these properties into a lower-dimensional space with an auto-encoder, a type of artificial neural network. These compressed representations are the elemental modes, which retain periodic table trends but are scalable for machine learning models.
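
The compression step can be illustrated with a minimal sketch. This is not the authors' implementation: the property table below is synthetic random data standing in for real elemental properties, and the network size, the tanh activation, the learning rate, and the names `train_autoencoder`, `n_modes`, and `toy_properties` are all assumptions made for illustration. The sketch trains a small auto-encoder with plain gradient descent and reads off the bottleneck activations as the compressed codes.

```python
import numpy as np


def train_autoencoder(X, n_modes=2, n_steps=5000, lr=0.1, seed=0):
    """Train a tiny auto-encoder on X (rows = species, columns = properties).

    Returns the compressed codes (one row per species) and the loss history.
    All hyperparameters here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    W_enc = rng.normal(scale=0.1, size=(n_features, n_modes))
    b_enc = np.zeros(n_modes)
    W_dec = rng.normal(scale=0.1, size=(n_modes, n_features))
    b_dec = np.zeros(n_features)

    losses = []
    for _ in range(n_steps):
        # Forward pass: encode to the bottleneck, then decode.
        Z = np.tanh(X @ W_enc + b_enc)
        X_hat = Z @ W_dec + b_dec
        err = X_hat - X
        losses.append(float(np.mean(err ** 2)))

        # Backward pass for the mean squared reconstruction error.
        g_out = 2.0 * err / err.size
        g_W_dec = Z.T @ g_out
        g_b_dec = g_out.sum(axis=0)
        g_pre = (g_out @ W_dec.T) * (1.0 - Z ** 2)  # derivative of tanh
        g_W_enc = X.T @ g_pre
        g_b_enc = g_pre.sum(axis=0)

        # Plain gradient descent update.
        W_enc -= lr * g_W_enc
        b_enc -= lr * g_b_enc
        W_dec -= lr * g_W_dec
        b_dec -= lr * g_b_dec

    codes = np.tanh(X @ W_enc + b_enc)
    return codes, losses


# Synthetic stand-in data (NOT real elemental properties): 20 hypothetical
# species, 6 properties each, generated from 2 hidden factors plus noise.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(20, 2))
toy_properties = hidden @ rng.normal(size=(2, 6)) + 0.05 * rng.normal(size=(20, 6))
# Standardize each property column so no single property dominates the loss.
toy_properties = (toy_properties - toy_properties.mean(axis=0)) / toy_properties.std(axis=0)

modes, losses = train_autoencoder(toy_properties, n_modes=2)
print("compressed shape:", modes.shape)
print("reconstruction loss: first step", losses[0], "last step", losses[-1])
```

In a setup like this, each row of `modes` would play the role of a compact per-species descriptor that a downstream model takes as input in place of the full property list.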

To evaluate their performance, the authors used the elemental modes to train a neural network to predict formation energies of a crystalline material. The network did so with increased accuracy, demonstrating that the elemental modes could help rapidly screen new materials and drug candidates before synthesis at lower computational cost.

The neural network was also able to generalize its knowledge of one element to improve predictions for another. When the authors removed chlorine from the training data, the network was still able to extrapolate information about chlorine from its knowledge of other elements. This transfer learning reduces the amount of required training data.

Author John Herr said the work demonstrates that it is possible to take generalized models trained on large datasets and fine-tune them to a specific system with smaller amounts of data.

Source: “Compressing physics with an auto-encoder: Creating an atomic species representation to improve machine learning models in the chemical sciences,” by John E. Herr, Kevin Koh, Kun Yao, and John Parkhill, The Journal of Chemical Physics (2019). The article can be accessed at https://doi.org/10.1063/1.5108803 .
