
A large language model-based analysis of vulnerability discovery in Windows software

Open Access | Jun 2026

Figures & Tables

Fig. 1

A historical overview of Windows development frameworks.

Fig. 2

The analysis of the commit frequency over time.

Fig. 3

The code frequency over time.

Fig. 4

Proposed architecture.

Fig. 5

Reduction rates.

Fig. 6

Severity distribution before vs after disagreement-aware LLM interpretation.

Severity distribution before and after LLM-based interpretation (Windows App SDK, C/C++)

| Severity Level | Findings (SA) | Findings (LLM-Based) | Main Vulnerability Categories | Static Analysis Tools |
| --- | --- | --- | --- | --- |
| 5 (Critical) | 1 | 0 | Privilege escalation (baseline highest-risk item) | AppScan Static Analyzer [51] |
| 4 (High) | 1 | 2 | Command injection; reclassified critical item (context-limited) | AppScan Static Analyzer [51] |
| 3 (Medium) | 11 | 7 | Improper resource access control; permission/validation warnings | Flawfinder; AppScan Static Analyzer [47,51] |
| 2 (Low) | 42 | 28 | Information exposure; input validation; dependency integrity; API pattern alerts | AppScan; Fluid Attacks; Cppcheck; RATS [48-50] |

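Effectiveness of LLM-based alert consolidation across static analysis tools

| SA Tool | Raw Alerts | Unique Code Locations | LLM-Refined Findings | Alert Reduction (%) |
| --- | --- | --- | --- | --- |
| Flawfinder | 8 | 6 | 3 | 62.5% |
| RATS | 7 | 6 | 3 | 57.1% |
| Cppcheck | 6 | 6 | 0 | 100.0% |
| Fluid Attacks | 2 | 2 | 2 | 0.0% |
| AppScan Static Analyzer | 1 | 1 | 1 | 0.0% |
| Total | 24 | 21 | 9 | 62.5% |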
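The reduction percentages in the table above can be checked with a few lines of arithmetic. The sketch below assumes that alert reduction is measured against the raw alert count, i.e. (raw − refined) / raw × 100. This page does not state the formula, but that definition reproduces every row, including the total. The names `ALERTS` and `reduction_pct` are illustrative and do not come from the paper.

```python
# Counts copied from the table above:
# tool -> (raw alerts, unique code locations, LLM-refined findings).
ALERTS = {
    "Flawfinder": (8, 6, 3),
    "RATS": (7, 6, 3),
    "Cppcheck": (6, 6, 0),
    "Fluid Attacks": (2, 2, 2),
    "AppScan Static Analyzer": (1, 1, 1),
}


def reduction_pct(raw: int, refined: int) -> float:
    """Share of raw alerts removed by refinement, in percent.

    Assumed definition: (raw - refined) / raw * 100, rounded to one decimal.
    """
    if raw <= 0:
        raise ValueError("raw alert count must be positive")
    return round(100.0 * (raw - refined) / raw, 1)


# Per-tool reduction, then column totals and the overall reduction.
per_tool = {
    tool: reduction_pct(raw, refined)
    for tool, (raw, _unique, refined) in ALERTS.items()
}
total_raw = sum(raw for raw, _, _ in ALERTS.values())
total_unique = sum(unique for _, unique, _ in ALERTS.values())
total_refined = sum(refined for _, _, refined in ALERTS.values())
overall = reduction_pct(total_raw, total_refined)

for tool, pct in per_tool.items():
    print(f"{tool}: {pct:.1f}%")
print(f"Total: {total_raw} raw, {total_unique} unique, "
      f"{total_refined} refined, {overall:.1f}% reduction")
```

Under this definition the overall figure is a ratio of the column totals, not an average of the per-tool percentages, which is why it coincides with the Flawfinder row.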

Project details

| Metric | ID | Metric Value |
| --- | --- | --- |
| Application Name | AN | Windows App SDK 1.6.2 |
| Review Date | RD | December 12, 2025 |
| Objective | OBJ | Security Code Review |
| Number of Lines (LOC) | LOC | 167,894 |
| Code Review Mode | CRM | Static |

Comparison of research work on bug detection

Ref. | Semantic Reasoning | Explainability | Hybrid Static + AI | Failure Mode | Evaluation
[37] × ✓ (LLM) ✓ (shallow reasoning) High FDR (>50%)
[38] × × ✓ (LLM) ✓ (industrial errors) FP reduced by ≈ 94–98%
[39] × × × ✓ (count bias) F1 ≈ 0.97; Recall < 30%
[40] × × × × Accuracy ≈ 0.87; F1 ≈ 0.86
[41] × × × Accuracy ≈ 0.86; F1 ≈ 0.85
[42] × × × × Accuracy ≈ 0.90; F1 ≈ 0.91
[43] × × × Accuracy ≈ 0.87
[44] × × × Accuracy ≈ 91.8%
Our Proposed Model ✓ (LLM) ✓ (tool disagreement) Alert reduction 62.5%
Language: English
Submitted on: Mar 4, 2026
Accepted on: Apr 1, 2026
Published on: Jun 2, 2026
Published by: Harran University
In partnership with: Paradigm Publishing Services
Publication frequency: 2 issues per year

© 2026 Puya Pakshad, Samson Quaye, Jamal Al-Karaki, Marwan Omar, Maurice E. Dawson, published by Harran University
This work is licensed under the Creative Commons Attribution 4.0 License.

AHEAD OF PRINT