
Decoding the latent molecular interactions between aspartame and hepatocellular carcinoma: A multi-omics and machine learning integration

By: Zhou An, Xianhua Wang, and Yuyun Jia
Open Access
|Apr 2026

Figures & Tables

Figure 1

Identification of HCC-associated target genes. (a) Prior to batch correction, the PCA scatter plot shows distinct separation of the two datasets (GSE36376 and GSE54236), indicating the existence of batch effects. After batch correction, the PCA scatter plot demonstrates effective integration of the datasets, with a marked reduction in batch effects. (b) A heatmap displays the expression profiles of DEGs across all included samples. Downregulated genes are indicated in blue, while upregulated genes are highlighted in red. (c) A volcano plot categorizes DEGs based on log2 fold change (log2FC) and statistical significance (|log2FC| > 0.585 and p-value < 0.05). Brown dots represent upregulated genes, green dots denote downregulated genes, and gray dots correspond to genes with no significant expression alterations. Abbreviations: HCC: hepatocellular carcinoma; PCA: principal component analysis; DEGs: differentially expressed genes
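
The thresholds in panel (c) can be read as a simple classification rule. The following is a minimal sketch of that rule, not the authors' pipeline; the gene names and values are hypothetical and serve only as illustration. Note that 0.585 is approximately log2(1.5), so the cutoff corresponds to a 1.5-fold change.

```python
import math

LOG2FC_CUTOFF = 0.585  # from the caption; log2(1.5) is about 0.585
P_CUTOFF = 0.05        # from the caption

def classify_gene(log2fc, p_value):
    """Label a gene 'up', 'down', or 'ns' using the caption's thresholds."""
    if p_value < P_CUTOFF and log2fc > LOG2FC_CUTOFF:
        return "up"
    if p_value < P_CUTOFF and log2fc < -LOG2FC_CUTOFF:
        return "down"
    return "ns"  # not significant, or fold change too small

# Hypothetical (log2FC, p-value) pairs, for illustration only
examples = {
    "geneA": (1.2, 0.001),
    "geneB": (-0.9, 0.01),
    "geneC": (0.3, 0.001),
    "geneD": (2.0, 0.2),
}
labels = {gene: classify_gene(fc, p) for gene, (fc, p) in examples.items()}
```

A gene must pass both criteria to be colored as up- or downregulated; a large fold change with a non-significant p-value (geneD here) stays in the gray group.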

Figure 2

Construction of WGCNA and identification of overlapping genes. (a) Determination of the optimal soft-threshold power. Horizontal lines, arranged from top to bottom, correspond to R² values of 0.9 and 0.8, respectively. (b) Soft connectivity (power = 10) and verification of scale-free topology (R² = 0.89, slope = –1.83). (c) A gene dendrogram generated via WGCNA illustrates hierarchical clustering based on co-expression relationships. Different gene modules are represented by distinct colors in the lower section of the dendrogram. (d) A module-trait relationship heatmap exhibits correlations between WGCNA-identified modules and sample traits (Control group vs Tumor group). Correlation coefficients and P-values are displayed within each box. (e) Venn diagrams illustrate the overlapping genes among in silico predicted aspartame targets, OS-related genes, WGCNA module genes, and DEGs. Abbreviations: WGCNA: weighted gene co-expression network analysis; OS: oxidative stress; DEGs: differentially expressed genes
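
The scale-free check in panel (b) reports the R² and slope of a straight-line fit of log10 p(k) against log10 k, where k is connectivity and p(k) its frequency. The sketch below illustrates that fit on an idealized, hypothetical power-law distribution; it does not reproduce the binning or fitting procedure of the WGCNA package itself.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares fit of y on x; returns (slope, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return slope, 1.0 - ss_res / ss_tot

# Hypothetical connectivity values with an exact power law p(k) = k^(-1.83)
k_values = [1, 2, 4, 8, 16, 32]
p_values = [k ** -1.83 for k in k_values]

slope, r_squared = fit_line(
    [math.log10(k) for k in k_values],
    [math.log10(p) for p in p_values],
)
```

On real expression data the points scatter around the line, which is why the caption reports an R² below 1; a negative slope with R² near the 0.8 to 0.9 reference lines is the usual sign of approximately scale-free topology.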

Figure 3

PPI networks and functional enrichment analysis. (a) PPI network encompassing genes overlapping with in silico predicted aspartame targets, DEGs, OS-related genes, and WGCNA-derived module genes. The central network visualizes interaction relationships among these overlapping genes. GO enrichment analysis annotates the overlapping genes in terms of (b) BP, (c) CC, and (d) MF. For these plots, the x-axis denotes gene count, and the color gradient correlates with adjusted P-values (darker red reflects greater statistical significance). (e) KEGG pathway analysis identifies enriched pathways associated with the overlapping genes. In this plot, the x-axis represents gene ratio, dot size corresponds to gene count, and the color gradient reflects adjusted P-values (all < 0.05). Abbreviations: PPI: protein–protein interaction; GO: Gene Ontology; BP: biological processes; CC: cellular components; MF: molecular functions; KEGG: Kyoto Encyclopedia of Genes and Genomes

Figure 4

Identification of core genes associated with HCC status overlapping with in silico predicted aspartame targets. (a) Comparison of model performance: A heatmap displays AUC values of distinct models across multiple cohorts. The left column lists the models, while the right column presents AUC values (higher values indicate superior performance). Cohort sources are distinguished by different colors. (b–i) ROC curves: ROC curves for the core genes are shown for three external validation datasets (GSE14811, GSE25097, and GSE76427) and the training dataset. For these curves, the x-axis represents false positive rate, and the y-axis denotes sensitivity. Predictive capability is assessed via AUC values. (j–m) Expression levels of the four core genes in the external validation datasets. ****, p < 0.0001. Abbreviations: AUC: Area under the receiver operating characteristic curve; ROC: Receiver operating characteristic
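
The ROC axes and AUC described for panels (b–i) can be illustrated with a short, dependency-free sketch. The scores and labels below are hypothetical and are not taken from the study's datasets; the functions are illustrative helpers, not the authors' code.

```python
def roc_points(scores, labels):
    """Sweep thresholds from high to low; return (fpr, tpr) lists.

    labels: 1 = tumor (positive), 0 = control (negative).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pairs = sorted(zip(scores, labels), key=lambda t: -t[0])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume every sample tied at this score before adding a point
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / n_neg)  # x-axis: false positive rate
        tpr.append(tp / n_pos)  # y-axis: sensitivity
    return fpr, tpr

def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2
        for k in range(1, len(fpr))
    )

# Hypothetical expression-based scores and class labels
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 0]
fpr, tpr = roc_points(scores, labels)
area = auc(fpr, tpr)
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of tumor and control samples.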

Figure 5

Model interpretation via SHAP analysis. (a) Ranking of feature importance: A bar chart ranks the top genes according to their statistical feature importance for HCC prediction. Larger bar heights correspond to greater contributions to the predictive model. (b) Violin plot: A violin plot depicts the distribution of gene expression across distinct experimental conditions. Plot width represents data density, and colors indicate expression levels. (c–f) Distribution of SHAP values: Scatter plots display SHAP values for the core genes, reflecting their statistical impacts on model predictions for HCC status. (g) SHAP summary plot: A SHAP summary plot exhibits the statistical contribution of each core gene to model predictions for HCC status. Negative SHAP values indicate a reduction in predictive probability, while positive values indicate an elevation in predictive probability. Abbreviations: SHAP: SHapley Additive exPlanations
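
The sign convention in panel (g) follows from how Shapley values are defined: each feature's value is its average marginal contribution to the prediction over all orderings of the features. The sketch below computes exact Shapley values for a hypothetical three-feature toy model against a single baseline input. It is an assumption-laden illustration only; the SHAP library uses background datasets and model-specific approximations rather than this brute-force enumeration.

```python
from itertools import permutations

def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        previous = model(current)
        for i in order:
            current[i] = x[i]            # switch feature i to its actual value
            updated = model(current)
            phi[i] += updated - previous  # marginal contribution of feature i
            previous = updated
    return [total / len(orders) for total in phi]

def toy_model(v):
    """Hypothetical linear score over three features (not the study's model)."""
    return 0.5 + 0.2 * v[0] - 0.3 * v[1] + 0.1 * v[2]

x = [1.0, 2.0, 0.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
```

Here the second feature receives a negative value because it pulls the score below the baseline prediction, and the values sum to the difference between the prediction for `x` and the prediction for the baseline.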

Figure 6

In silico molecular docking prediction of the binding potential between aspartame and core genes. Molecular docking results (binding energy < –5.0 kcal/mol) for the binding potential between aspartame and (a) ARF1 (–5.6 kcal/mol), (b) AURKA (–5.0 kcal/mol), and (c) GSTZ1 (–7.1 kcal/mol). Abbreviations: ARF1: ADP-ribosylation factor 1; AURKA: aurora kinase A; GSTZ1: glutathione S-transferase zeta 1
Language: English
Page range: 68 - 80
Submitted on: Nov 25, 2025
Accepted on: Apr 17, 2026
Published on: Apr 18, 2026
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2026 Zhou An, Xianhua Wang, Yuyun Jia, published by Hirszfeld Institute of Immunology and Experimental Therapy
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.