A large language model-based analysis of vulnerability discovery in Windows software

Puya Pakshad; Samson Quaye; Jamal Al-Karaki; Marwan Omar; Maurice E. Dawson

doi:10.2478/ijmce-2026-0019

International Journal of Mathematics and Computer in Engineering

Volume 4 (2026): Issue 2 (December 2026)

Open Access

Jun 2026

Figures & Tables

A historical overview of Windows development frameworks.

The analysis of the commit frequency over time.

Severity distribution before vs after disagreement-aware LLM interpretation.

Severity distribution before and after LLM-based interpretation (Windows App SDK, C/C++)

Severity Level	Findings (SA)	Findings (LLM-Based)	Main Vulnerability Categories	Static Analysis Tools
5 (Critical)	1	0	Privilege escalation (baseline highest-risk item)	AppScan Static Analyzer [51]
4 (High)	1	2	Command injection; reclassified critical item (context-limited)	AppScan Static Analyzer [51]
3 (Medium)	11	7	Improper resource access control; permission/validation warnings	Flawfinder; AppScan Static Analyzer [47,51]
2 (Low)	42	28	Information exposure; input validation; dependency integrity; API pattern alerts	AppScan; Fluid Attacks; Cppcheck; RATS [48-50]

Effectiveness of LLM-based alert consolidation across static analysis tools

SA Tool	Raw Alerts	Unique Code Locations	LLM-Refined Findings	Alert Reduction (%)
Flawfinder	8	6	3	62.5%
RATS	7	6	3	57.1%
Cppcheck	6	6	0	100.0%
Fluid Attacks	2	2	2	0.0%
AppScan Static Analyzer	1	1	1	0.0%
Total	24	21	9	62.5%
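
The Alert Reduction column is consistent with the ratio (raw alerts − LLM-refined findings) / raw alerts, with the Total row computed from pooled counts rather than by averaging the per-tool percentages. This formula is inferred from the table values and is not stated on this page. The short Python sketch below recomputes the column under that assumption; the names `ALERTS`, `alert_reduction`, and `summarize` are hypothetical and used only for illustration.

```python
# Raw alert and LLM-refined finding counts, copied from the table above.
ALERTS = {
    "Flawfinder": (8, 3),
    "RATS": (7, 3),
    "Cppcheck": (6, 0),
    "Fluid Attacks": (2, 2),
    "AppScan Static Analyzer": (1, 1),
}


def alert_reduction(raw: int, refined: int) -> float:
    """Percentage of raw alerts removed by consolidation (assumed formula)."""
    if raw <= 0:
        raise ValueError("raw alert count must be positive")
    return 100.0 * (raw - refined) / raw


def summarize(alerts: dict) -> dict:
    """Per-tool reduction plus a Total row based on pooled counts."""
    rows = {
        tool: round(alert_reduction(raw, refined), 1)
        for tool, (raw, refined) in alerts.items()
    }
    total_raw = sum(raw for raw, _ in alerts.values())
    total_refined = sum(refined for _, refined in alerts.values())
    rows["Total"] = round(alert_reduction(total_raw, total_refined), 1)
    return rows


if __name__ == "__main__":
    for tool, pct in summarize(ALERTS).items():
        print(f"{tool}: {pct}%")
```

Pooling the counts matters here: the unweighted mean of the five per-tool percentages would be about 43.9%, whereas the pooled figure matches the 62.5% reported in the Total row.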

Project details

Metric	ID	Metric Value
Application Name	AN	Windows App SDK 1.6.2
Review Date	RD	December 12, 2025
Objective	OBJ	Security Code Review
Number of Lines (LOC)	LOC	167,894
Code Review Mode	CRM	Static

Comparison of research work on bug detection

Ref.	Semantic Reasoning	Explainability	Hybrid Static + AI	Failure Mode	Evaluation
[37]	✓	×	✓ (LLM)	✓ (shallow reasoning)	High FDR (>50%)
[38]	×	×	✓ (LLM)	✓ (industrial errors)	FP reduced by ≈ 94–98%
[39]	×	×	×	✓ (count bias)	F1 ≈ 0.97; Recall < 30%
[40]	×	×	×	×	Accuracy ≈ 0.87; F1 ≈ 0.86
[41]	✓	×	×	×	Accuracy ≈ 0.86; F1 ≈ 0.85
[42]	×	×	×	×	Accuracy ≈ 0.90; F1 ≈ 0.91
[43]	✓	×	×	×	Accuracy ≈ 0.87
[44]	×	✓	×	×	Accuracy ≈ 91.8%
Our Proposed Model	✓	✓	✓ (LLM)	✓ (tool disagreement)	Alert reduction 62.5%

DOI: https://doi.org/10.2478/ijmce-2026-0019 | Journal eISSN: 2956-7068

Language: English

Page range: 351 - 372

Submitted on: Mar 4, 2026

Accepted on: Apr 1, 2026

Published on: Jun 2, 2026

Published by: Harran University

In partnership with: Paradigm Publishing Services

Publication frequency: 2 issues per year

Keywords: Software security, static code analysis, large language models, vulnerability discovery, Windows App SDK

Related subjects: Computer sciences; Computer sciences, other; Engineering; Introductions and overviews; Physics

© 2026 Puya Pakshad, Samson Quaye, Jamal Al-Karaki, Marwan Omar, Maurice E. Dawson, published by Harran University
This work is licensed under the Creative Commons Attribution 4.0 License.
