A new way to represent atomic species improves molecular machine learning models

AUG 23, 2019
The creation of elemental modes to represent atomic species enables transfer learning and lowers computational cost, bringing chemical modeling closer to a single machine learning model that can be fine-tuned for different systems.

Machine learning helps chemists model molecules and predict their properties. Currently, however, each atomic species in a molecule requires its own machine learning model, which is computationally costly and prevents information from being shared between models. A better way to represent atomic species is a necessary step toward a single machine learning model that can be applied to any system.

Herr et al. present a new way to represent atomic species, called elemental modes. The authors identified a set of physical properties for each atomic species and compressed these properties into a lower-dimensional space with an auto-encoder, a type of artificial neural network. These compressed representations are the elemental modes, which retain periodic table trends while remaining scalable for machine learning models.
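
The compression step can be illustrated with a minimal sketch. It assumes a toy table of four properties for eight elements and a tiny NumPy auto-encoder with a two-dimensional bottleneck. The property set, network size, training settings, and all function names here are illustrative assumptions, not the authors' actual features or architecture.

```python
import numpy as np

# Hypothetical toy property table, NOT the authors' feature set.
# Columns: atomic number, period, group, Pauling electronegativity.
ELEMENTS = ["H", "Li", "C", "N", "O", "F", "Na", "Cl"]
PROPERTIES = np.array([
    [1, 1, 1, 2.20],
    [3, 2, 1, 0.98],
    [6, 2, 14, 2.55],
    [7, 2, 15, 3.04],
    [8, 2, 16, 3.44],
    [9, 2, 17, 3.98],
    [11, 3, 1, 0.93],
    [17, 3, 17, 3.16],
])


def standardize(x):
    """Scale each property column to zero mean and unit variance."""
    return (x - x.mean(axis=0)) / x.std(axis=0)


def encode(x, w_enc):
    """Map property vectors to the compressed (bottleneck) representation."""
    return np.tanh(x @ w_enc)


def train_autoencoder(x, n_modes=2, steps=3000, lr=0.02, seed=0):
    """Fit a tanh encoder and linear decoder by plain gradient descent."""
    rng = np.random.default_rng(seed)
    n_features = x.shape[1]
    w_enc = rng.normal(scale=0.1, size=(n_features, n_modes))
    w_dec = rng.normal(scale=0.1, size=(n_modes, n_features))
    losses = []
    for _ in range(steps):
        z = encode(x, w_enc)          # compressed representation
        err = z @ w_dec - x           # reconstruction error
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the mean-squared reconstruction loss.
        grad_out = 2.0 * err / err.size
        grad_w_dec = z.T @ grad_out
        grad_pre = (grad_out @ w_dec.T) * (1.0 - z ** 2)
        grad_w_enc = x.T @ grad_pre
        w_enc -= lr * grad_w_enc
        w_dec -= lr * grad_w_dec
    return w_enc, w_dec, losses


x = standardize(PROPERTIES)
w_enc, w_dec, losses = train_autoencoder(x)
modes = encode(x, w_enc)  # one 2-component vector per element

for name, mode in zip(ELEMENTS, modes):
    print(name, np.round(mode, 3))
```

Each row of `modes` plays the role of a compressed element descriptor in this sketch: a downstream model would take these vectors as input instead of a separate label or network per element.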

To evaluate the elemental modes, the authors used them to train a neural network to predict formation energies of a crystalline material. The network did so with increased accuracy, demonstrating that the elemental modes could help rapidly screen new materials and drug candidates before synthesis at lower computational cost.

The neural network was also able to generalize its knowledge of one element to improve predictions for another. When the authors removed chlorine from the training data, the network was still able to extrapolate information about chlorine from its knowledge of other elements. This transfer learning reduces the amount of required training data.
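
A hold-out test of this kind can be set up as in the sketch below, which assumes a dataset stored as a list of records that each name the elements in a compound. The compounds, the record layout, and the `split_by_element` helper are hypothetical and carry no energies or results from the paper.

```python
# Hypothetical records: each lists the elements present in a compound.
SAMPLES = [
    {"formula": "NaCl", "elements": {"Na", "Cl"}},
    {"formula": "LiF", "elements": {"Li", "F"}},
    {"formula": "NaF", "elements": {"Na", "F"}},
    {"formula": "LiCl", "elements": {"Li", "Cl"}},
    {"formula": "HCl", "elements": {"H", "Cl"}},
    {"formula": "H2O", "elements": {"H", "O"}},
]


def split_by_element(samples, held_out):
    """Keep every compound containing `held_out` out of the training set."""
    train = [s for s in samples if held_out not in s["elements"]]
    test = [s for s in samples if held_out in s["elements"]]
    return train, test


train, test = split_by_element(SAMPLES, "Cl")
print("train:", [s["formula"] for s in train])
print("test: ", [s["formula"] for s in test])
```

A model trained only on `train` never sees the held-out element, so any skill it shows on `test` has to come from what the shared representation encodes about related elements.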

Author John Herr said the work demonstrates that it is possible to take generalized models trained on large datasets and fine-tune them to a specific system with smaller amounts of data.

Source: “Compressing physics with an auto-encoder: Creating an atomic species representation to improve machine learning models in the chemical sciences,” by John E. Herr, Kevin Koh, Kun Yao, and John Parkhill, The Journal of Chemical Physics (2019). The article can be accessed at https://doi.org/10.1063/1.5108803.
