SecuGuard: Leveraging pattern-exploiting training in language models for advanced software vulnerability detection

By: Mahmoud Basharat and Marwan Omar
Open Access | Jun 2024

Abstract

Identifying vulnerabilities in source code remains essential to software quality and security. This study introduces a refined semi-supervised learning method that combines pattern-exploiting training with cloze-style questions. A language model is trained on the Software Assurance Reference Dataset (SARD) and the Devign dataset, both of which contain many vulnerable code fragments. During training, selected segments of the code are masked and the model is prompted to predict the masked tokens. Empirical analyses show that our method is effective at pinpointing vulnerabilities in source code and benefits substantially from patterns found within the code fragments. These results highlight the potential of combining pattern-exploiting training with cloze-based queries to improve the precision of vulnerability detection in source code.
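
To make the cloze-style training step described above more concrete, the following is a minimal Python sketch of how a code fragment might be masked and wrapped in a pattern. It is an illustration under stated assumptions, not the authors' implementation: the `<mask>` placeholder, the prompt wording, the `yes`/`no` verbalizer, and all function names are hypothetical, and no language model is called. The per-word scores passed to `label_from_scores` stand in for whatever a masked language model would return.

```python
import re
from typing import Dict, Tuple

MASK = "<mask>"  # assumed placeholder; real models define their own mask token

# Hypothetical verbalizer: maps each class label to the single word the
# language model would be asked to predict at the masked position.
VERBALIZER: Dict[str, str] = {"vulnerable": "yes", "safe": "no"}


def mask_token(code: str, target: str) -> Tuple[str, str]:
    """Hide the first whole-word occurrence of `target` in `code`.

    Returns the masked code and the hidden token, which would serve as
    the prediction target during training.
    """
    pattern = re.compile(rf"\b{re.escape(target)}\b")
    if not pattern.search(code):
        raise ValueError(f"token {target!r} not found in code")
    return pattern.sub(MASK, code, count=1), target


def build_cloze_prompt(code: str) -> str:
    """Wrap a code fragment in a cloze pattern with one masked answer slot."""
    return f"{code.strip()}\nQuestion: is this code vulnerable? Answer: {MASK}"


def label_from_scores(scores: Dict[str, float]) -> str:
    """Pick the label whose verbalizer word received the highest score."""
    return max(
        VERBALIZER,
        key=lambda label: scores.get(VERBALIZER[label], float("-inf")),
    )


if __name__ == "__main__":
    snippet = "char buf[8];\nstrcpy(buf, input);"
    masked, hidden = mask_token(snippet, "strcpy")
    print(masked)                       # code with the call name hidden
    print(build_cloze_prompt(snippet))  # classification-style cloze prompt
    # Placeholder scores standing in for a model's output at the mask.
    print(label_from_scores({"yes": 0.7, "no": 0.3}))
```

In this sketch, `mask_token` corresponds to hiding part of the code so the model must recover it, while `build_cloze_prompt` and `label_from_scores` show how a classification decision can be phrased as filling in a blank. The patterns and verbalizers actually used in the study are not given on this page.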

Language: English
Page range: 47 - 56
Submitted on: Oct 28, 2023
Accepted on: Jan 16, 2024
Published on: Jun 2, 2024
Published by: Harran University
In partnership with: Paradigm Publishing Services
Publication frequency: 2 issues per year

© 2024 Mahmoud Basharat, Marwan Omar, published by Harran University
This work is licensed under the Creative Commons Attribution 4.0 License.