Supervised machine learning looks to help researchers design collective variables

SEP 03, 2018
A newly demonstrated automated approach defines low-dimensional representations of molecules for accelerated sampling, tested on alanine dipeptide and the Chignolin mini-protein.

Collective variables are functions designed using simulation data to provide a simplified representation of complex molecular systems. The invention of new methods for the identification of optimal collective variables describing protein dynamics is a highly active area of data science applied to biochemical physics.

In computationally intensive molecular dynamics simulations, choosing an appropriate collective variable is especially important, yet defining one is often challenging. Advances in machine learning promise new tools that improve our ability to effectively define collective variables from biomolecular simulation data.

Sultan and Pande demonstrated a method for designing collective variables for accelerated sampling with the help of supervised machine learning (SML) algorithms. Using solvated alanine dipeptides (amino acids) and the mini-protein Chignolin as examples, the group showed that their SML techniques produce an initial estimate of collective variables from limited data, one that can be further improved by other forms of parameter optimization.

The authors report several approaches that may be used to reversibly sample slow structural transitions between protein conformational states. These include using the output probability estimates of logistic models and the outputs of shallow or deep neural network classifiers as collective variables. Sultan said he hopes the current paper can serve as a bridge between the group’s previous work on machine learning and Markov state modeling and enhanced sampling.
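To make the logistic-model idea concrete, here is a minimal sketch in Python. It is not the authors' implementation: the data are synthetic, the two "states" and the dihedral-angle features are hypothetical stand-ins for labeled simulation frames, and the function names `fit_logistic` and `collective_variable` are invented for illustration. It fits a logistic classifier to frames from two known states and then treats the predicted probability of one state as a one-dimensional collective variable.

```python
import numpy as np

def fit_logistic(X, y, lr=0.5, n_steps=2000):
    """Fit a logistic model by plain gradient descent; returns (weights, bias)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probability of state B
        w -= lr * (X.T @ (p - y)) / len(y)      # gradient of the mean log-loss
        b -= lr * np.mean(p - y)
    return w, b

def collective_variable(X, w, b):
    """Map each frame's features to a value in (0, 1): the model's state-B probability."""
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))

# Synthetic stand-in for labeled simulation frames (an assumption, not real data):
# each frame is described by two hypothetical backbone dihedral angles in radians.
rng = np.random.default_rng(0)
state_a = rng.normal(loc=[-1.5, 1.0], scale=0.2, size=(200, 2))
state_b = rng.normal(loc=[1.0, -1.0], scale=0.2, size=(200, 2))
angles = np.vstack([state_a, state_b])

# Featurize angles as sines and cosines so the features respect periodicity.
X = np.hstack([np.sin(angles), np.cos(angles)])
y = np.concatenate([np.zeros(200), np.ones(200)])  # 0 = state A, 1 = state B

w, b = fit_logistic(X, y)
cv = collective_variable(X, w, b)
print("mean CV in state A:", cv[:200].mean())
print("mean CV in state B:", cv[200:].mean())
```

Because the probability varies smoothly between the two states, a function of this kind could in principle be handed to an enhanced sampling scheme as the coordinate to bias; how that coupling is done in practice depends on the simulation software and is outside this sketch.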

“We hope this will allow researchers to worry less about the design of enhanced sampling simulations, allowing them to focus more on interpreting the results or designing new simulations,” Sultan said. “We also hope that this will stimulate more discussion on the use of other ML algorithms for accelerating molecular simulations.”

Source: “Automated design of collective variables using supervised machine learning,” by Mohammad M. Sultan and Vijay S. Pande, The Journal of Chemical Physics (2018). The article can be accessed at https://doi.org/10.1063/1.5029972 .
