Protein Function Prediction with Pretrained Transformers: Performance, Pitfalls, and Practical Guidance
Abstract
Transformer-based protein language models (PLMs) learn meaningful representations from millions of unlabeled sequences, capturing evolutionary patterns and functional relationships. Recent advances include ESM-2’s systematic scaling to 15 billion parameters, structure-aware vocabularies (SaProt), and multimodal foundation models (ESM-3, 98B parameters). PLMs achieve state-of-the-art performance on core tasks: Gene Ontology term prediction (F-max 0.64–0.68), enzyme classification (81% accuracy), and variant effect prediction (Spearman ρ 0.52–0.55). Attention in deep layers aligns with 3D residue contacts (44–63% agreement), despite the models receiving no structural supervision during training. This review synthesizes recent PLM developments, benchmarks, and practical applications, and offers experimental biologists guidance on model selection and validation strategies.
© 2026 Kushal Raj Roy, published by European Biotechnology Thematic Network Association
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License.