By applying natural language processing tools to the movements of protein molecules, University of Maryland scientists created an abstract language that describes the multiple shapes a protein molecule can take and how and when it transitions from one shape to another.
A protein molecule’s function is often determined by its shape and structure, so understanding the dynamics that control shape and structure can open a door to understanding everything from how a protein works to the causes of disease and the best way to design targeted drug therapies. This is the first time a machine learning algorithm has been applied to biomolecular dynamics in this way, and the method’s success provides insights that can also help advance artificial intelligence (AI). A research paper on this work was published on October 9, 2020, in the journal Nature Communications.
“Here we show the same AI architectures used to complete sentences when writing emails can be used to uncover a language spoken by the molecules of life,” said the paper’s senior author, Pratyush Tiwary, an assistant professor in UMD’s Department of Chemistry and Biochemistry and Institute for Physical Science and Technology. “We show that the movement of these molecules can be mapped into an abstract language, and that AI techniques can be used to generate biologically truthful stories out of the resulting abstract words.”
Biological molecules are constantly in motion, jiggling around in their environment. Their shape is determined by how they are folded and twisted. They may remain in a given shape for seconds or days before suddenly springing open and refolding into a different shape or structure. The transition from one shape to another occurs much like the stretching of a tangled coil that opens in stages. As different parts of the coil release and unfold, the molecule assumes different intermediary conformations.
But the transition from one form to another occurs in picoseconds (trillionths of a second) or faster, which makes it difficult for experimental methods such as high-powered microscopes and spectroscopy to capture exactly how the unfolding happens, what parameters affect the unfolding and what different shapes are possible. The answers to those questions form the biological story that Tiwary’s new method can reveal.
Tiwary and his team ran simulations based on Newton’s laws of motion, which can predict the movement of atoms within a molecule, on powerful supercomputers, including UMD’s Deepthought2, to develop statistical physics models that simulate the shape, movement and trajectory of individual molecules.
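To give a rough sense of what simulating motion from Newton’s laws involves, the sketch below follows a single particle in a one-dimensional double-well energy landscape, a common toy stand-in for a molecule with two stable shapes. It is an illustration only, not the team’s code or models: the potential, the function names (`force`, `potential`, `velocity_verlet`, `total_energy`) and all parameter values are assumptions chosen for this example, and the velocity Verlet integrator shown is simply a standard scheme for this kind of simulation.

```python
# Toy illustration (assumed setup, not the researchers' model):
# one particle moving in the double-well potential U(x) = (x^2 - 1)^2,
# whose two minima at x = -1 and x = +1 stand in for two molecular shapes.

def potential(x):
    """Potential energy U(x) = (x^2 - 1)^2."""
    return (x * x - 1.0) ** 2


def force(x):
    """Force from Newton's second law: F = -dU/dx = -4x(x^2 - 1)."""
    return -4.0 * x * (x * x - 1.0)


def velocity_verlet(x, v, dt, n_steps, mass=1.0):
    """Integrate the equations of motion with the velocity Verlet scheme.

    Returns the list of visited positions and the final velocity.
    """
    trajectory = [x]
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append(x)
    return trajectory, v


def total_energy(x, v, mass=1.0):
    """Kinetic plus potential energy, which should stay nearly constant."""
    return 0.5 * mass * v * v + potential(x)


if __name__ == "__main__":
    x0, v0 = -1.0, 1.5  # start in the left well with enough energy to cross
    traj, v_end = velocity_verlet(x0, v0, dt=0.01, n_steps=2000)
    print("initial energy:", total_energy(x0, v0))
    print("final energy:  ", total_energy(traj[-1], v_end))
    print("position range:", min(traj), max(traj))
```

The list of positions this produces is a trajectory: a record of where the particle was at each time step. A real simulation tracks many thousands of atoms in three dimensions under far more detailed forces, but the step-by-step update is the same in spirit.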
Then they fed those