Supervised machine learning looks to help researchers design collective variables

SEP 03, 2018
A newly demonstrated automated approach defines low-dimensional representations of molecules for accelerated sampling, shown here on alanine dipeptide and Chignolin mini-protein models.

Collective variables are functions designed using simulation data to provide a simplified representation of complex molecular systems. Developing new methods to identify optimal collective variables that describe protein dynamics is a highly active area of data science applied to biochemical physics.

In computationally intensive molecular dynamics simulations, choosing an appropriate collective variable takes on heightened importance, yet defining one is often challenging. Advances in machine learning promise new tools that improve our ability to effectively define collective variables from biomolecular simulation data.

Sultan and Pande demonstrated a method for designing collective variables for accelerated sampling with the help of supervised machine learning (SML) algorithms. Using solvated alanine dipeptide (a small capped amino acid model) and the mini-protein Chignolin as examples, the group showed that their SML techniques produce an initial estimate of collective variables from limited data, one that can be further improved by other forms of parameter optimization.

The authors report several classifier outputs that may be used as collective variables to reversibly sample slow structural transitions between protein conformational states, including probability estimates from logistic models and the outputs of shallow or deep neural network classifiers. Sultan said he hopes the current paper can serve as a bridge connecting the group’s previous work on machine learning and Markov state modeling to enhanced sampling.
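
To make the logistic-model idea concrete, here is a minimal sketch in Python. It is not the authors' implementation: the two-state "simulation frames" are synthetic, the two features and their state centers are hypothetical stand-ins for quantities such as backbone dihedral angles, and the classifier is a hand-rolled logistic regression trained by plain gradient descent. The only point illustrated is that the predicted probability of belonging to one labeled state gives a smooth scalar between 0 and 1 that could play the role of a collective variable.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_two_state_data(n_per_state=200, seed=0):
    # Synthetic stand-in for labeled frames from two short simulations.
    # The feature centers below are hypothetical, chosen only for illustration.
    rng = random.Random(seed)
    centers = {0: (-1.5, 1.0), 1: (1.0, -1.0)}
    frames, labels = [], []
    for label, (c0, c1) in centers.items():
        for _ in range(n_per_state):
            frames.append((rng.gauss(c0, 0.3), rng.gauss(c1, 0.3)))
            labels.append(label)
    return frames, labels


def fit_logistic(frames, labels, lr=0.5, n_steps=500):
    # Batch gradient descent on the mean logistic (cross-entropy) loss.
    n_features = len(frames[0])
    n = len(frames)
    w = [0.0] * n_features
    b = 0.0
    for _ in range(n_steps):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(frames, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def collective_variable(x, w, b):
    # Candidate collective variable: predicted probability of state 1.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


frames, labels = make_two_state_data()
w, b = fit_logistic(frames, labels)
cv_state_a = collective_variable((-1.5, 1.0), w, b)
cv_state_b = collective_variable((1.0, -1.0), w, b)
print(f"CV at state A center: {cv_state_a:.3f}")
print(f"CV at state B center: {cv_state_b:.3f}")
```

In this toy setting the value is close to 0 near one state and close to 1 near the other, and it varies continuously in between. In an actual enhanced sampling run, a function of this kind would be evaluated on molecular features inside the simulation engine, which this sketch does not attempt.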

“We hope this will allow researchers to worry less about the design of enhanced sampling simulations, allowing them to focus more on interpreting the results or designing new simulations,” Sultan said. “We also hope that this will stimulate more discussion on the use of other ML algorithms for accelerating molecular simulations.”

Source: “Automated design of collective variables using supervised machine learning,” by Mohammad M. Sultan and Vijay S. Pande, The Journal of Chemical Physics (2018). The article can be accessed at https://doi.org/10.1063/1.5029972.
