
From text to threats: A language model approach to software vulnerability detection

By: Marwan Omar and Darrell Burrell
Open Access | Oct 2023

Figures & Tables

Fig. 1

An Overview of our defense framework.

Fig. 2

Comparison of F1 scores across different models and datasets.

Fig. 3

Model size comparison across the three models.

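Comparison of models’ performance on various datasets

| Model | Score | SARD | SeVC | Devign | D2A |
|---|---|---|---|---|---|
| VulBERTa | 88.7 | 84.2 | 80.5 | 81.8 | 79.9 |
| SySeVR | 81.5 | 82.6 | 78.3 | 80.2 | 72.7 |
| DistilVulBERT | 94.0 | 91.4 | 82.2 | 87.5 | 85.9 |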

Fine-tuning time comparison

| Model | Dataset | Fine-tuning time (hours) |
|---|---|---|
| VulBERTa | SARD | 1.2 |
| SySeVR | SeVC | 1.1 |
| DistilVulBERT | SARD | 0.8 |
| DistilVulBERT | SeVC | 0.9 |

Model overhead analysis

| Model | Parameters (millions) | Training time (hours) |
|---|---|---|
| VulBERTa | 110 | 8.2 |
| SySeVR | 90 | 6.5 |
| DistilVulBERT | 66 | 5.0 |

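Hyperparameters of the models

| Hyperparameter | GPT-2 | CodeBERT | LSTM |
|---|---|---|---|
| Learning rate | 0.001 | 0.0005 | 0.01 |
| Batch size | 32 | 64 | 128 |
| Epochs | 5 | 10 | 3 |
| Optimizer | Adam | AdamW | RMSprop |
| Dropout rate | 0.1 | 0.05 | 0.2 |
| Hidden units | 768 | 312 | 256 |
| Attention heads | 12 | 8 | - |
| Layers | 12 | 12 | 1 |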


Require: Set of labeled training data $D = \{(x_i, y_i)\}$
Require: Set of $K$ teacher models $T = \{T_k\}$
Require: Student model $S$
Ensure: Trained student model
1: Initialize student model parameters $\theta_S$ randomly.
2: for each teacher model $T_k \in T$ do
3: Compute predictions $p_k(x_i)$ for each $x_i \in D$.
4: Initialize student model weights to match $T_k$.
5: Train student model on $D$ using:
$$\mathrm{KDLoss}(\theta_S, \theta_T^{(k)}; D) = \frac{1}{n} \sum_{i=1}^{n} D_{KL}\left(p_k(x_i) \parallel q_s(x_i; \theta_S, \theta_T^{(k)})\right)$$
where $D_{KL}$ denotes the Kullback-Leibler divergence and $q_s(x_i; \theta_S, \theta_T^{(k)})$ is the softmax output of the student model.
6: end for
7: return Trained student model $S$
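The loss in step 5 can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function names (`softmax`, `kl_divergence`, `kd_loss`) are hypothetical, the inputs are assumed to be raw logits for both teacher and student, and no temperature scaling is applied because the algorithm above does not mention one.

```python
import math


def softmax(logits):
    """Convert a list of raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q) for two discrete distributions of equal length."""
    # eps guards against log(0); terms with p_i == 0 contribute nothing.
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def kd_loss(teacher_logits, student_logits):
    """Mean KL divergence between teacher and student outputs over n samples.

    teacher_logits, student_logits: lists of per-sample logit lists
    (an assumed input format chosen for this illustration).
    """
    n = len(teacher_logits)
    total = 0.0
    for t, s in zip(teacher_logits, student_logits):
        p_k = softmax(t)  # teacher prediction p_k(x_i)
        q_s = softmax(s)  # student softmax output q_s(x_i)
        total += kl_divergence(p_k, q_s)
    return total / n
```

When the student reproduces the teacher's distribution exactly, every term is zero and the loss vanishes; any disagreement yields a positive value, which is what gradient-based training of the student would then reduce.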
Language: English
Page range: 23 - 34
Submitted on: Sep 4, 2023
Accepted on: Oct 26, 2023
Published on: Oct 31, 2023
Published by: Harran University
In partnership with: Paradigm Publishing Services
Publication frequency: 2 issues per year

© 2023 Marwan Omar, Darrell Burrell, published by Harran University
This work is licensed under the Creative Commons Attribution 4.0 License.