Accurate descriptions of protein-protein interactions are essential for understanding biological systems. Very recently, AlphaFold2 (AF2) has been shown to be remarkably accurate at predicting the atomic structures of individual proteins. Can AF2 be adapted to predict a structural model of a protein complex? We introduce AF2Complex, which employs the same neural network models developed for AlphaFold2 to predict the structures of multimeric protein complexes. In contrast to common approaches that invariably require paired multiple sequence alignments, AF2Complex works without such paired alignments. It achieves higher accuracy than more elaborate strategies that combine AlphaFold2 with protein-protein docking. New metrics are introduced for predicting direct protein-protein interactions between arbitrary protein pairs. The approach is tested on some challenging CASP14 assembly targets, a small but appropriate benchmark set, and the E. coli proteome. Using the cytochrome c biogenesis system as an example, we present high-confidence models of three sought-after assemblies formed by eight members of this system.
Details are described in the following publication:
Mu Gao, Davi Nakajima An, Jerry M. Parks, and Jeffrey Skolnick. 2022. AF2Complex predicts direct physical interactions in multimeric proteins with deep learning. Nat Commun. 13, 1774. doi.org/10.1038/s41467-022-29394-2. PDF
Source Code
The source code of AF2Complex is freely available on GitHub.
Data Set
The benchmark data sets CP17, Dimer1193, and Oligomer562 and the full E. coli proteome, including pre-generated input features for AF2Complex, together with the top computational models of the E. coli Ccm system I, are available at Zenodo.
Please direct questions and comments to Dr. Mu Gao.