Deployment-realism metrics
| Metric | Definition | Example |
|---|---|---|
| p95 Latency (ms) | 95th-percentile inference time per query | Transformer = 240 ms; Hybrid RL = 410 ms |
| Cost/1k tasks ($) | Total GPU + API cost per 1,000 predictions | $0.38 vs. $0.62 |
| Grounded-answer rate (%) | Share of outputs verifiably supported by source data | 92% |
| Tool-success rate (%) | Share of external API/tool invocations that succeed | 88% |
| Success@1 (%) | Correct decision on the first attempt | 84% |
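As a concrete illustration of how these metrics fall out of raw per-query logs, a minimal Python sketch follows; the function name, field names, and synthetic log values are illustrative assumptions, not drawn from any benchmark above.

```python
import numpy as np

def deployment_metrics(latencies_ms, costs_usd, first_try_correct):
    """Aggregate per-query logs into the deployment-realism metrics above.

    latencies_ms      : per-query inference times in milliseconds
    costs_usd         : per-query GPU + API cost in dollars
    first_try_correct : booleans, True if the first attempt was correct
    """
    return {
        "p95_latency_ms": float(np.percentile(latencies_ms, 95)),
        "cost_per_1k_tasks_usd": 1000.0 * float(np.mean(costs_usd)),
        "success_at_1_pct": 100.0 * float(np.mean(first_try_correct)),
    }

# Example: synthetic logs roughly matching the transformer row above
rng = np.random.default_rng(0)
print(deployment_metrics(
    latencies_ms=rng.normal(200, 25, size=1000).clip(min=1),
    costs_usd=rng.uniform(0.0003, 0.0005, size=1000),
    first_try_correct=rng.random(1000) < 0.84,
))
```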
Quantitative ablation results for hybrid components
| Configuration | F1 Score | AUPRC | Notes |
|---|---|---|---|
| Baseline (transformer only) | 0.78 | 0.81 | No external structure or policy learning |
| + Retrieval module | 0.82 | 0.85 | Improves context grounding |
| + KG module | 0.83 | 0.87 | Enhances reasoning and link precision |
| + KG + RL modules (full hybrid) | 0.86 | 0.89 | Best trade-off between accuracy and robustness |
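The F1 and AUPRC columns above can be reproduced from model predictions with standard scikit-learn calls; the sketch below assumes binary gold labels and positive-class scores, with toy values.

```python
from sklearn.metrics import f1_score, average_precision_score

def ablation_scores(y_true, y_score, threshold=0.5):
    """F1 at a fixed decision threshold, plus threshold-free AUPRC."""
    y_pred = [int(s >= threshold) for s in y_score]
    return {
        "F1": f1_score(y_true, y_pred),
        "AUPRC": average_precision_score(y_true, y_score),  # area under PR curve
    }

print(ablation_scores([1, 0, 1, 1, 0], [0.9, 0.2, 0.6, 0.4, 0.7]))
```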
Techniques and training details of different models
| Model | Training method | Loss function | Hardware used |
|---|---|---|---|
| LLaMA2, GPT-J, GPT-3.5-turbo | Supervised pre-training on large corpora | Cross-entropy loss | NVIDIA A100/V100 GPUs |
| TransE, DistMult, ComplEx, RotatE | KG embedding training | Margin ranking loss/logistic loss | NVIDIA Tesla V100/GTX 1080 Ti |
| GNN + MAPPO | MARL (graph-structured environments) | Policy gradient loss + value loss | NVIDIA A100 GPU, CPU cluster for environment simulation |
| Graph transformer | Supervised/contrastive learning on graphs | Cross-entropy loss/contrastive loss | NVIDIA V100/A100 GPU |
| GCN-based anomaly detector | Supervised/semi-supervised GCN training | Binary cross-entropy/MSE loss | NVIDIA Tesla V100/RTX 3090 |
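For the KG-embedding row, a minimal PyTorch sketch of a TransE-style training step under the margin ranking loss named above; entity/relation counts, embedding dimension, and triples are synthetic assumptions.

```python
import torch
import torch.nn as nn

class TransE(nn.Module):
    """Minimal TransE: score(h, r, t) = -||h + r - t||, trained with margin ranking loss."""
    def __init__(self, n_entities, n_relations, dim=50):
        super().__init__()
        self.ent = nn.Embedding(n_entities, dim)
        self.rel = nn.Embedding(n_relations, dim)

    def score(self, h, r, t):
        # Higher score = more plausible triple
        return -(self.ent(h) + self.rel(r) - self.ent(t)).norm(p=2, dim=-1)

model = TransE(n_entities=1000, n_relations=50)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MarginRankingLoss(margin=1.0)

# One step on synthetic positive triples and corrupted (negative) tails
h = torch.randint(0, 1000, (128,))
r = torch.randint(0, 50, (128,))
t = torch.randint(0, 1000, (128,))
t_neg = torch.randint(0, 1000, (128,))

pos, neg = model.score(h, r, t), model.score(h, r, t_neg)
loss = loss_fn(pos, neg, torch.ones_like(pos))  # push pos above neg by the margin
loss.backward(); opt.step(); opt.zero_grad()
```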
Summary of recent approaches and models across different graph, knowledge, and robustness tasks
| Ref. | Approach | Model | Evaluation (Acc./Prec./Rec./F1) |
|---|---|---|---|
| [18] | CounterFact benchmark for paraphrased prompts | LLaMA2, GPT-J, GPT-3.5-turbo | Acc.: N/A; Prec.: 0.74; Rec.: 0.72; F1: 0.73 |
| [20] | Adaptive contrastive learning for KGE | TransE, DistMult, ComplEx, RotatE | MRR: 0.355–0.557; Hits@10 improved; Acc./Prec./Rec./F1: N/A |
| [3] | MAGEC | GNN + MAPPO | Prec.: 0.81; Rec.: 0.84; F1: 0.825 |
| [2] | KGTN (graph transformer + contrastive learning) | Graph transformer | F1: 0.6876–0.8559; AUC: 0.79–0.93 |
| [31] | AGT | Graph transformer | Acc.: 0.982, 0.976; Prec.: 0.98; Rec.: 0.98; F1: 0.98 |
| [27] | FRGL | GCN-based anomaly detector | Acc.: 0.932; Prec.: 0.93; Rec.: 0.92; F1: 0.925 |
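As a quick sanity check on the table, F1 is the harmonic mean of precision and recall; verifying the MAGEC row [3]:

```python
# F1 is the harmonic mean of precision and recall; checking the MAGEC row [3]
p, r = 0.81, 0.84
print(round(2 * p * r / (p + r), 3))  # -> 0.825, matching the table
```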
Task vs model overview: Models applied to different tasks and performance highlights
| Task | Models applied | Performance highlights |
|---|---|---|
| Counterfactual reasoning/NLP | LLaMA2, GPT-J, GPT-3.5-turbo | F1: 0.73; Robust to paraphrased prompts; captures textual consistency and reasoning patterns |
| KG completion/link prediction | TransE, DistMult, ComplEx, RotatE | MRR: 0.355–0.557; Hits@10 improved; effectively predicts missing links in KG datasets (FB15k, WN18RR, YAGO3) |
| Multi-agent coordination | GNN + MAPPO | Improved average and worst node idleness in MARL scenarios; handles agent interactions and graph-based decision making |
| Graph-based recommendation/multi-intent prediction | Graph transformer | F1: 0.6876–0.8559; AUC: 0.79–0.93; captures global graph dependencies for multi-intent recommendation tasks |
| Anomaly detection in graphs/federated learning | GCN-based anomaly detector | F1: 0.925; AUC: 0.95; robust detection of anomalous nodes in federated or distributed networks |
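The KG-completion row above reports MRR and Hits@10 rather than F1; a minimal sketch of how these ranking metrics are computed from the 1-based rank of the true entity among all candidates (the ranks below are toy values).

```python
import numpy as np

def mrr_and_hits_at_k(ranks, k=10):
    """Ranking metrics for KG link prediction.

    ranks: 1-based rank of the true entity among all candidates, one per test triple.
    """
    ranks = np.asarray(ranks, dtype=float)
    return {
        "MRR": float(np.mean(1.0 / ranks)),           # mean reciprocal rank
        f"Hits@{k}": float(np.mean(ranks <= k)),      # fraction ranked in top k
    }

print(mrr_and_hits_at_k([1, 3, 12, 2, 50]))  # MRR ≈ 0.387, Hits@10 = 0.6
```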
Unified benchmark: Accuracy vs latency/cost vs practical success
| Model type | Representative paper | F1 | AUPRC | p95 Latency (ms) | Cost/1k tasks ($) | Success@1 (%) |
|---|---|---|---|---|---|---|
| Transformer (BERT/LLM) | Zou et al. [2] | 0.86 | 0.88 | 240 | 0.38 | 82 |
| KG transformer | Wang et al. [8] | 0.78 | 0.80 | 310 | 0.42 | 85 |
| RL path-reasoner | Ma et al. [6] | 0.81 | 0.83 | 400 | 0.60 | 87 |
| Multi-agent system (MARL) | Goeckner et al. [3] | 0.83 | 0.84 | 450 | 0.64 | 88 |
| Hybrid KG + RL + LLM agent | Zhou et al. [23] | 0.85 | 0.86 | 480 | 0.70 | 91 |
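One way to read this benchmark is as a single weighted deployment score; the sketch below hard-codes the table's rows and combines them with illustrative weights. The weights are assumptions, not values from the benchmark, and the resulting ranking shifts as they change.

```python
# Rows from the unified benchmark above: (model, F1, p95 ms, $/1k tasks, Success@1 %)
rows = [
    ("Transformer",      0.86, 240, 0.38, 82),
    ("KG transformer",   0.78, 310, 0.42, 85),
    ("RL path-reasoner", 0.81, 400, 0.60, 87),
    ("MARL",             0.83, 450, 0.64, 88),
    ("Hybrid",           0.85, 480, 0.70, 91),
]

# Illustrative weighted score: reward F1 and Success@1, penalize latency and cost.
def score(f1, p95, cost, s1, w=(1.0, 0.001, 1.0, 0.01)):
    return w[0] * f1 + w[3] * s1 - w[1] * p95 - w[2] * cost

for name, f1, p95, cost, s1 in sorted(rows, key=lambda row: -score(*row[1:])):
    print(f"{name:16s} score={score(f1, p95, cost, s1):+.3f}")
```

Under these particular weights the plain transformer ranks first because its latency and cost penalties are smallest; weighting Success@1 more heavily favors the hybrid agent instead.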
Advantages and disadvantages of different techniques
| Technique | Advantages | Disadvantages |
|---|---|---|
| DL | Strong pattern recognition and text understanding; lowest latency and cost among the benchmarked models (p95 = 240 ms; $0.38/1k tasks) | Limited grounding in external structure; weaker link precision without KG or retrieval support |
| KG | Explicit relational structure improves grounding, reasoning, and link precision | Requires curated graph data; adds latency and cost over a pure transformer |
| Multi-agent | Models agent interactions and graph-based coordination; strong Success@1 in the unified benchmark (88%) | Among the highest latencies and costs (p95 = 450 ms; $0.64/1k tasks); MARL training is complex |
| RL | Learns decision policies that raise first-attempt success and robustness | Slower inference and higher cost; requires environment simulation during training |
Overview of AI/graph models: Usage and applications
| Model | Why used | Where used |
|---|---|---|
| LLaMA2, GPT-J, GPT-3.5-turbo | Pretrained language models with strong NLP capabilities; handle reasoning, paraphrasing, and text generation efficiently | Counterfactual reasoning, question answering, NLP tasks, and knowledge consistency evaluation |
| TransE, DistMult, ComplEx, RotatE | Knowledge graph embedding models; capture graph relationships with low-dimensional embeddings | KG completion, link prediction, recommendation systems |
| GNN + MAPPO | Graph neural network combined with multi-agent proximal policy optimization; models agent interactions in graph-structured environments | MARL, traffic optimization, robotics, resource allocation in networks |
| Graph transformer | Leverages attention mechanisms on graph structures; captures global dependencies and relational patterns efficiently | Multi-intent recommendation, graph-based prediction tasks, link prediction, recommendation systems |
| GCN-based anomaly detector | Graph convolutional networks for learning node embeddings and detecting unusual patterns in graph data | FRGL, anomaly detection in networks, cybersecurity, fraud detection |
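To ground the last row, a self-contained PyTorch sketch of a GCN-based anomaly detector trained with binary cross-entropy, as in the training-details table; the graph, features, and labels are synthetic, and the two-layer architecture is an illustrative assumption.

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph-convolution layer: H' = ReLU(A_hat @ H @ W)."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.lin = nn.Linear(d_in, d_out)

    def forward(self, a_hat, h):
        return torch.relu(self.lin(a_hat @ h))

def normalize_adj(adj):
    """Symmetric normalization with self-loops: D^-1/2 (A + I) D^-1/2."""
    a = adj + torch.eye(adj.size(0))
    d = a.sum(dim=1).rsqrt()
    return d[:, None] * a * d[None, :]

# Synthetic graph: 100 nodes, 16-dim features, ~5% anomalous nodes
torch.manual_seed(0)
adj = (torch.rand(100, 100) < 0.05).float()
adj = ((adj + adj.T) > 0).float()              # symmetrize
x = torch.randn(100, 16)
y = (torch.rand(100) < 0.05).float()

a_hat = normalize_adj(adj)
layer1, layer2 = GCNLayer(16, 32), GCNLayer(32, 32)
head = nn.Linear(32, 1)                        # per-node anomaly logit
params = [*layer1.parameters(), *layer2.parameters(), *head.parameters()]
opt = torch.optim.Adam(params, lr=1e-2)
loss_fn = nn.BCEWithLogitsLoss()               # binary cross-entropy, as in the table

for _ in range(50):
    logits = head(layer2(a_hat, layer1(a_hat, x))).squeeze(-1)
    loss = loss_fn(logits, y)
    opt.zero_grad(); loss.backward(); opt.step()

print("first five anomaly scores:", torch.sigmoid(logits[:5]).detach())
```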