Researchers at U of T have developed new machine learning algorithms to determine the 3D structures of proteins, paving the way for faster and more efficient drug discoveries and diagnostic methods.

To be effective, drugs must bind to specific proteins in a cell in the right orientation. Drugs act by changing the conformation of a protein, which in turn changes that protein’s function. Knowing a protein’s 3D structure can significantly improve understanding of how it works in the body and, consequently, help researchers develop drugs that counter its potentially harmful effects faster and more effectively.

“The ability to discover 3D structures of protein molecules is one of the major goals of the field of structural biology. Proteins, which are the building blocks of every biological process, are tiny molecular machines that interact, bind, move and work together to make life happen,” said Ali Punjani, a PhD student who is working on these algorithms under Professor David Fleet, Chair of the Computer and Mathematical Sciences department at UTSC.

This team of researchers, along with Dr. Marcus Brubaker, an Assistant Professor at York University who worked on these algorithms with Fleet as a postdoctoral researcher, has developed new machine learning algorithms to dramatically speed up the process of solving the 3D structure of a protein.

This novel approach does not require a scientist to guess what the protein could look like.

“We employ an algorithm for finding 3D structures that effectively ‘explores’ the space of possible structures to find the one that best explains the observed data,” explained Punjani. “Generally, with our algorithms, if a protein can be purified and 2D images of it can be acquired of sufficient quality, the structure can be solved without prior knowledge.”
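The idea of searching for the structure that best explains the observed data can be illustrated with a toy example. The sketch below is not the team’s algorithm, which the article does not describe in detail. It assumes a tiny 2D “structure” on a 4×4 grid, uses row and column sums as stand-ins for observed images, and applies plain gradient descent from a random starting guess. All names (`project`, `loss`, `fit`) are hypothetical.

```python
import random

def project(grid):
    """Return the row sums followed by the column sums of a square grid.

    These two 1D 'views' stand in for observed images (a toy assumption).
    """
    n = len(grid)
    rows = [sum(grid[i]) for i in range(n)]
    cols = [sum(grid[i][j] for i in range(n)) for j in range(n)]
    return rows + cols

def loss(grid, observed):
    """Squared error between the views predicted by `grid` and the observed views."""
    return sum((p - o) ** 2 for p, o in zip(project(grid), observed))

def fit(observed, n, steps=500, lr=0.05, seed=0):
    """Start from a random guess and follow the gradient of the loss downhill."""
    rng = random.Random(seed)
    grid = [[rng.random() for _ in range(n)] for _ in range(n)]
    for _ in range(steps):
        resid = [p - o for p, o in zip(project(grid), observed)]
        for i in range(n):
            for j in range(n):
                # d(loss)/d(grid[i][j]) = 2 * (row residual i + column residual j)
                grid[i][j] -= lr * 2 * (resid[i] + resid[n + j])
    return grid

# A made-up "true" structure, used only to generate the observed views.
true_grid = [
    [0, 1, 1, 0],
    [1, 2, 2, 1],
    [1, 2, 2, 1],
    [0, 1, 1, 0],
]
observed = project(true_grid)

# The fit sees only the observed views, never true_grid itself.
fitted = fit(observed, n=4)
print("final loss:", loss(fitted, observed))
```

With only two views, many different grids explain the data equally well, so the fitted grid need not match the original. Real reconstructions rely on thousands of images taken from many directions to pin the structure down.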

High-resolution protein structures can now be identified rapidly and automatically, addressing a computational bottleneck in the field. “Previous approaches [took] days or weeks to solve a single structure on expensive computer equipment. Our new algorithms, running on an inexpensive desktop computer, can solve the same structures in minutes,” added Punjani.

This can dramatically improve the speed at which scientists identify complex structures and use them to discover more effective drugs that target a protein’s function. Since there are many proteins in a cell, existing drugs on the market may bind to off-target proteins and cause unwanted side effects. The new method could allow researchers to design drugs whose structures target proteins with high specificity, reducing those side effects.

These algorithms use microscopic images of proteins taken using electron cryomicroscopy (cryo-EM), a tool that enables the direct discovery of 3D protein structures. Cryo-EM works by firing a beam of electrons at a protein sample and detecting the emerging electrons to map out the structure of what the electrons collided with. This process produces thousands of 2D images that are then computationally analyzed to uncover the 3D structures of proteins at near-atomic resolution.
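To see why many 2D images are needed to recover a 3D structure, it helps to look at how one object can produce different images from different directions. The sketch below is a heavily simplified illustration, not a model of a real microscope: it assumes each image is just the sum of a 3D density grid along one axis, and it ignores noise, optics, and arbitrary viewing angles. The function name `project_along` and the tiny 2×2×2 volume are made up for this example.

```python
def project_along(volume, axis):
    """Sum a 3D density grid, indexed volume[x][y][z], along one axis.

    Returns a 2D image as a list of lists.
    """
    nx, ny, nz = len(volume), len(volume[0]), len(volume[0][0])
    if axis == 0:  # looking down x: image indexed [y][z]
        return [[sum(volume[x][y][z] for x in range(nx)) for z in range(nz)]
                for y in range(ny)]
    if axis == 1:  # looking down y: image indexed [x][z]
        return [[sum(volume[x][y][z] for y in range(ny)) for z in range(nz)]
                for x in range(nx)]
    if axis == 2:  # looking down z: image indexed [x][y]
        return [[sum(volume[x][y][z] for z in range(nz)) for y in range(ny)]
                for x in range(nx)]
    raise ValueError("axis must be 0, 1, or 2")

# A tiny L-shaped object lying in the z = 0 plane.
volume = [
    [[1, 0], [1, 0]],  # x = 0
    [[1, 0], [0, 0]],  # x = 1
]

print(project_along(volume, 0))  # [[2, 0], [1, 0]]
print(project_along(volume, 1))  # [[2, 0], [1, 0]]
print(project_along(volume, 2))  # [[1, 1], [1, 0]]
```

Two of the three views are identical here, and only the third reveals the L shape. This is one reason a reconstruction needs images from many different orientations.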

The ability to visualize accurate 3D protein structures will give researchers in academia and industry a foundation for further work. “We hope that the techniques we’ve developed will help other scientists solve structures quickly and accurately, paving the way for drug discoveries and a deeper understanding of how biological life works,” said Punjani.

The paper was recently published in Nature Methods. The team collaborated with U of T professor Dr. John Rubinstein, a Canada Research Chair in Electron Cryomicroscopy, and their work was funded by the Natural Sciences and Engineering Research Council of Canada.

The team’s startup, Structura Biotechnology Inc., has developed software called cryoSPARC. It applies these algorithms to cryo-EM data and is already in use in labs. Its graphical user interface allows many users to work on the program remotely, upload and share data, and view the results in real time as they are computed.

The startup is being funded and supported by U of T’s Innovations and Partnerships Office through the Connaught Innovation Award, U of T’s Early Stage Technologies Program, the Ontario Centres of Excellence, and FedDev Ontario’s Investing in Commercialization Partnerships Program with York University.