Generating high-quality training data for atomic neural networks with virtual reality
DOI: 10.1063/10.0002645
Applying machine learning to calculate the structures and properties of molecules is a growing trend in computational chemistry. Machine learning methods let researchers get past limitations that previously stemmed from insufficient processing power. As a result, the emphasis has shifted from increasing computational speed to creating better data for training the algorithms.
Amabilino et al. demonstrate the use of virtual reality as an efficient, intuitive way to generate a high-quality training dataset for deriving the energy functions of large molecular systems. They developed a program that runs real-time quantum molecular dynamics simulations, which a user can manipulate directly through a virtual reality headset and controllers. The software package, named Narupa, is completely open source and free for any group to use.
“A scientist can literally go into virtual reality and reach out to touch the molecule as if it’s a tangible object,” said senior author David Glowacki. “The program is running according to a real-time physics simulation, so the user can set up these different geometries that can then be fed into the machine from which to learn.”
With this technique, the researchers created six different training datasets containing only small hydrocarbons, with up to six carbon atoms in each molecule. These datasets were then used to train atomic neural networks, which were able to accurately predict the energies of much higher-dimensional systems containing nearly 100 atoms. Specifically, they predicted the energy of a large hydrocarbon chain called squalane reacting with a cyano radical.
The results suggest that even small training datasets, when intelligently curated, can guide neural networks to fit accurate potential energy surfaces for large molecular systems.
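One way to see why a network trained on small fragments can be applied to a much larger molecule is the per-atom energy decomposition commonly used in atomic neural networks: the total energy is a sum of atomic contributions, each computed from an atom's local environment only. The Python sketch below illustrates that idea under stated assumptions. The article does not describe the network internals, so the cutoff radius, the radial descriptor, the tiny network, and every name in the code are hypothetical, and the weights are random rather than trained. It is not the authors' implementation.

```python
import math
import random

CUTOFF = 5.0            # assumed cutoff radius in angstrom (hypothetical value)
ETAS = (0.5, 1.0, 2.0)  # assumed Gaussian widths for the radial descriptor


def cutoff_fn(r):
    """Smooth cosine cutoff: 1 at r = 0, exactly 0 for r >= CUTOFF."""
    if r >= CUTOFF:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / CUTOFF) + 1.0)


def descriptor(i, positions):
    """Radial descriptor of atom i, built only from neighbours inside CUTOFF."""
    feats = [0.0] * len(ETAS)
    for j, pos in enumerate(positions):
        if j == i:
            continue
        r = math.dist(positions[i], pos)
        fc = cutoff_fn(r)
        for k, eta in enumerate(ETAS):
            feats[k] += math.exp(-eta * r * r) * fc
    return feats


def make_atomic_net(seed, n_hidden=4):
    """Tiny one-hidden-layer network with random (untrained) weights."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1, 1) for _ in ETAS] for _ in range(n_hidden)]
    b1 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    w2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]

    def net(feats):
        hidden = [
            math.tanh(sum(w * f for w, f in zip(row, feats)) + b)
            for row, b in zip(w1, b1)
        ]
        return sum(w * h for w, h in zip(w2, hidden))

    return net


# One network per element; the same networks serve molecules of any size.
NETS = {"C": make_atomic_net(0), "H": make_atomic_net(1)}


def total_energy(elements, positions):
    """Total energy as a sum of per-atom contributions."""
    return sum(
        NETS[el](descriptor(i, positions)) for i, el in enumerate(elements)
    )


# Toy methane-like fragment: C at the origin, four H atoms about 1.09 angstrom away.
d = 0.629
methane_el = ["C", "H", "H", "H", "H"]
methane_xyz = [(0, 0, 0), (d, d, d), (d, -d, -d), (-d, d, -d), (-d, -d, d)]

# Larger system: two copies of the fragment, 20 angstrom apart (beyond CUTOFF).
big_el = methane_el * 2
big_xyz = methane_xyz + [(x + 20.0, y, z) for x, y, z in methane_xyz]

e_small = total_energy(methane_el, methane_xyz)
e_big = total_energy(big_el, big_xyz)
print(e_small, e_big)
```

Because each atomic contribution depends only on neighbours inside the cutoff, the ten-atom system is evaluated with exactly the same per-element networks as the five-atom fragment, and its energy equals twice the fragment energy up to floating-point rounding. In a trained model, the same locality is what would allow local environments sampled in small hydrocarbons to inform predictions for a much larger chain.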
Source: “Training atomic neural networks using fragment-based data generated in virtual reality,” by Silvia Amabilino, Lars A. Bratholm, Simon Jonathan Bennie, Michael O’Connor, and David Glowacki, Journal of Chemical Physics (2020). The article can be accessed at http://doi.org/10.1063/5.0015950