A new algorithm makes interpreting the results of cryo-electron microscopy maps easier and more accurate, helping researchers to determine protein structures and potentially create drugs that block their functions.
Cryo-electron microscopy, or cryo-EM, uses electron beams to obtain 3-D images of biomolecular structures. The use of this technique has skyrocketed in recent years due to technological advancements, but as cryo-EM gains steam in the field, additional tools are needed to interpret the images it outputs.
The final product of cryo-EM is a map of the density of atoms in biological molecules, including proteins and nucleotides. To get the level of detail they really need, researchers must identify atom or amino acid residue positions in a map, which requires specialized computer analysis. Programs that do this exist, but they aren’t always accurate or easy to use, said Daisuke Kihara, a professor of biological sciences and computer science at Purdue University.
Kihara and a postdoctoral researcher in his lab, Genki Terashi, have created a fully automated algorithm for interpreting maps of proteins at lower-than-ideal resolution – around 4 to 5 ångströms (Å, a unit of length used to express the size of atoms and molecules). Many similar tools were developed for more detailed images or for X-ray crystallography, and they don’t work as well on lower-resolution cryo-EM images.
Kihara’s program, MAINMAST, identifies local density points in a given EM map and connects them into a tree structure – like connecting the dots. The algorithm tries different parameters for defining density points and branches in a tree.
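The "connecting the dots" step can be illustrated with a minimal sketch. It assumes the tree is a minimum spanning tree built over Euclidean distances between density points, which the description above does not spell out; the function name, the point coordinates and the choice of Prim's algorithm are hypothetical and for illustration only, not MAINMAST's actual code.

```python
import math

def minimum_spanning_tree(points):
    """Connect 3-D points into a tree using Prim's algorithm.

    Returns a list of (parent_index, child_index) edges.
    Illustrative only: not MAINMAST's implementation.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # shortest known distance from the tree to each point
    best_parent = [-1] * n       # tree point that achieves that distance
    best_dist[0] = 0.0           # start growing the tree from point 0
    edges = []
    for _ in range(n):
        # Pick the point outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_parent[u] >= 0:
            edges.append((best_parent[u], u))
        # Update distances for the points still outside the tree.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_parent[v] = u
    return edges

# Hypothetical local density points (coordinates in ångströms).
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 1.5, 0.0)]
tree_edges = minimum_spanning_tree(points)
```

A tree over n points always has n - 1 edges and no loops, which is what lets a chain-like path through the map be traced from it.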
“With this method, you don’t need to tune the parameters from 1 to 1.2 to 1.5, or need any expert knowledge about how to do this. Typically, when people use this kind of software, that’s critical,” Kihara said. “This algorithm has the different parameters already inside, so users don’t have to do anything but wait.”
The generated trees are then ranked by a score that evaluates their similarity to the density of each amino acid in the protein sequence. The top 500 models are fully reconstructed and refined.
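The built-in parameter sweep and the ranking step can be sketched roughly as follows. This is a toy outline under stated assumptions: the parameter names (`density_threshold`, `edge_cutoff`), their values and the scoring function are all hypothetical stand-ins, since the article does not say which parameters MAINMAST varies or how its score is computed.

```python
import heapq
from itertools import product

def rank_candidates(param_grid, build_and_score, keep=500):
    """Try every parameter combination and keep the best-scoring ones.

    param_grid: dict mapping a parameter name to the list of values to try.
    build_and_score: callable taking the parameters as keyword arguments
        and returning a score (higher is better).
    Returns up to `keep` (score, params) pairs, best first.
    """
    names = list(param_grid)
    combos = [dict(zip(names, values))
              for values in product(*(param_grid[name] for name in names))]
    scored = [(build_and_score(**combo), combo) for combo in combos]
    return heapq.nlargest(keep, scored, key=lambda pair: pair[0])

# Hypothetical stand-in for "build a tree with these parameters and score it".
def toy_score(density_threshold, edge_cutoff):
    return -abs(density_threshold - 0.3) - abs(edge_cutoff - 1.2)

# Hypothetical parameter grid; the user never has to pick one value by hand.
grid = {
    "density_threshold": [0.1, 0.3, 0.5],
    "edge_cutoff": [1.0, 1.2, 1.5],
}
top = rank_candidates(grid, toy_score, keep=2)
best_score, best_params = top[0]
```

Because every combination is tried and scored automatically, the choice of parameters falls out of the ranking rather than out of expert tuning.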
Other methods for interpreting cryo-EM maps exist, but many look to similar, previously solved protein structures as a starting point.
“If structures of similar proteins have already been solved, this is an obvious place to start because the new structure probably looks similar,” Kihara said. “Reference-based methods can be accurate, but if you’re solving a completely new structure, you can’t use them because you don’t have anything to start with.”
MAINMAST doesn’t rely on previously solved structures to get started – it’s a completely “de novo” method and, thus, models new structures using only information from EM density maps.
MAINMAST assigns confidence levels to different regions of the map, which tell users which regions are likely to be accurate and which should be checked manually. If the researcher knows some biological information, they can see which structures agree with their knowledge of the protein, Kihara said.
On the other hand, the de novo approach poses some challenges. Sometimes MAINMAST’s structures need a little more refinement, because the program doesn’t know what protein structures really look like. And if a cryo-EM map is low-resolution and doesn’t have density in some areas, MAINMAST can’t fill those parts. Kihara hopes to correct these flaws in the future, he said.
On EM density maps between 2.6 and 4.8 Å resolution, MAINMAST performed substantially better than two other existing de novo methods. The code is available now, and Kihara’s team is working to make the plugin more user friendly.
The findings were published in the journal Nature Communications.
Funding was provided by grants from the National Institutes of Health (R01GM123055, R01GM097528) and the National Science Foundation (IIS1319551, IOS1127027, DMS1614777).
Full release: http://www.purdue.edu/newsroom/releases/2018/Q2/new-method-for-interpreting-cryo-em-maps-makes-it-easier-to-determine-protein-structures.html