Using Convolutional Neural Networks with Image-Based Representations of Amino Acid Sequences for Predicting the Effects of Genetic Variants

Open Access | Oct 2025

Abstract

Proteins are among the fundamental molecules that regulate cellular processes in living organisms. Because protein-protein, DNA-protein, and RNA-protein interactions play a pivotal role in many biological processes, variants in the regions where these interactions occur can have serious consequences for the phenotype. Various supervised learning techniques are used to determine the relationship between protein variants and the development of a specific disease. This study proposes a convolutional neural network-based model that predicts the pathogenicity of variants by converting amino acid sequences into two-dimensional images. A transfer-learning-based protein embedding method (TAPE) was used to generate the feature vector, which was then transformed into a square, single-channel image and used to train a convolutional neural network. Binary classification (benign versus pathogenic) was performed on missense variants in the BRCA1 protein obtained from the open-access ClinVar database. The results indicate that the model is highly effective in predicting the pathogenicity of variants within the functional regions of the BRCA1 protein. Variants in the benign class were classified with 91% accuracy (93% sensitivity), and the model performed robustly across both benign and pathogenic classes, with an AUC of 92%. These findings suggest that the model may be useful for classifying BRCA1 variants and assessing their potential pathogenicity. The model shows promise and may benefit from further refinement in future research.
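
The step of turning a feature vector into a square, single-channel image can be illustrated with a minimal sketch. The abstract does not state the embedding length or how a vector whose length is not a perfect square is handled, so the zero-padding strategy, the function name `vector_to_square_image`, and the example length of 768 are assumptions made for illustration only, not the authors' implementation.

```python
import math

import numpy as np


def vector_to_square_image(vec):
    """Reshape a 1-D feature vector into a (side, side, 1) image.

    Assumption: when the length is not a perfect square, the vector is
    zero-padded at the end up to the next perfect square.
    """
    vec = np.asarray(vec, dtype=np.float32).ravel()
    n = vec.size
    if n == 0:
        raise ValueError("feature vector must not be empty")

    # Smallest side length whose square can hold all n values.
    side = math.isqrt(n)
    if side * side < n:
        side += 1

    # Pad with trailing zeros, then reshape row by row.
    padded = np.zeros(side * side, dtype=np.float32)
    padded[:n] = vec
    return padded.reshape(side, side, 1)


# Illustrative input: a random vector standing in for a protein embedding.
embedding = np.random.default_rng(0).normal(size=768)
image = vector_to_square_image(embedding)
print(image.shape)
```

An array in this layout (height, width, channels) could then be passed to a convolutional network that expects single-channel input; the network architecture itself is not described in the abstract and is not sketched here.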

Language: English
Page range: 247 - 256
Published on: Oct 23, 2025
Published by: European Biotechnology Thematic Network Association
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2025 Gülbahar Merve Şilbir, Burçin Kurt, published by European Biotechnology Thematic Network Association
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.