Predicting interaction partners and generating new protein sequences using protein language models
Speaker: Anne-Florence Bitbol (EPFL)
Date: 23/10/2025
Time: 10:00 CEST
Host: Nora Martin (CRG)
Protein language models trained on multiple sequence alignments of homologous proteins successfully capture coevolution between amino acids in structural contact: this is one of the ingredients of the success of AlphaFold. We have used such models, especially MSA Transformer, to generate new protein sequences from given protein families, and to predict which proteins interact among the members of two protein families.
Despite their successes, a drawback of models based on multiple sequence alignments is that sequence alignment can be imperfect. Thus, we developed ProtMamba, a homology-aware but alignment-free protein language model, which is able to generate new protein sequences from given protein families.
Beyond the amino-acid scale, coevolution also exists between genes in a genome. To capture it, we trained ProteomeLM on complete proteomes spanning the tree of life. This model allows quick and precise scans of whole protein interaction networks.
If you would like to attend the seminar, please register here.