From text to threats: A language model approach to software vulnerability detection

By: Marwan Omar and Darrell Burrell
Open Access | Oct 2023

Abstract

In the rapidly evolving landscape of software development, detecting vulnerabilities in source code has become critically important. Our study introduces a novel knowledge distillation (KD) technique aimed at enhancing vulnerability detection in software codebases. Using the benchmark datasets SARD, SeVC, Devign, and D2A, we assess the performance of the KD method when applied to three classifiers: GPT-2, CodeBERT, and LSTM. The empirical results reveal a marked improvement in the performance of these classifiers once the KD technique is applied, with the GPT-2 model showing the most promising outcomes. This work underscores the potential of combining transformer-based language models, such as GPT-2, with knowledge distillation for more efficient and accurate vulnerability detection.
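
The abstract does not spell out the form of the KD objective, so the following is only a minimal sketch of the widely used soft-target distillation loss, not the authors' method. It assumes a binary vulnerable/not-vulnerable classifier; the function names, the temperature of 2.0, and the weighting `alpha` are hypothetical choices made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of raw scores."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=2.0, alpha=0.5):
    """Generic soft-target KD loss (an assumption, not the paper's exact objective).

    Combines:
      - a soft term: KL(teacher || student) on temperature-softened
        distributions, scaled by temperature squared;
      - a hard term: cross-entropy of the student against the true label.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    soft = (temperature ** 2) * kl
    hard = -math.log(softmax(student_logits)[true_label])
    return alpha * soft + (1 - alpha) * hard


# Hypothetical logits for one code sample: index 0 = safe, index 1 = vulnerable.
teacher = [-2.0, 3.0]
student = [-0.5, 1.0]
loss = distillation_loss(student, teacher, true_label=1)
```

In this sketch, a student whose logits match the teacher's incurs no soft-term penalty, so the loss reduces to the weighted cross-entropy against the ground-truth label.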

Language: English
Page range: 23 - 34
Submitted on: Sep 4, 2023
Accepted on: Oct 26, 2023
Published on: Oct 31, 2023
Published by: Harran University
In partnership with: Paradigm Publishing Services
Publication frequency: 2 issues per year

© 2023 Marwan Omar, Darrell Burrell, published by Harran University
This work is licensed under the Creative Commons Attribution 4.0 License.