SecuGuard: Leveraging pattern-exploiting training in language models for advanced software vulnerability detection

Mahmoud Basharat; Marwan Omar

doi:10.2478/ijmce-2025-0005

References

Abbasi R., Bashir A.K., Mateen A., Amin F., Ge Y., Omar M., Efficient security and privacy of lossless secure communication for sensor-based urban cities, IEEE Sensors Journal, DOI: 10.1109/JSEN.2023.3305716, 2024.
Kinoon M.A., Omar M., Mohaisen M., Mohaisen D., Security breaches in the healthcare domain: a spatiotemporal analysis, Computational Data and Social Networks: 10th International Conference, CSoNet 2021, Virtual Event, 15–17 November 2021, 171–183, 2021.
Alharbi A.R., Hijji M., Aljaedi A., Enhancing topic clustering for Arabic security news based on k-means and topic modelling, IET Networks, 10(6), 278–294, 2021.
Aluru S.S., Mathew B., Saha P., Mukherjee A., Deep learning models for multilingual hate speech detection, arXiv:2004.06465, 2020.
Beyer L., Zhai X., Royer A., Markeeva L., Anil R., Kolesnikov A., Knowledge distillation: a good teacher is patient and consistent, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 18–24 June 2022, New Orleans, Louisiana, USA, 10925–10934, 2022.
Chakraborty S., Krishna R., Ding Y., Ray B., Deep learning based vulnerability detection: Are we there yet?, IEEE Transactions on Software Engineering, 48(9), 3280–3296, 2021.
Cheng X., Wang H., Hua J., Xu G., Sui Y., DeepWukong: statically detecting software vulnerabilities using deep graph neural network, ACM Transactions on Software Engineering and Methodology, 30(3), 1–33, 2021.
Furlanello T., Lipton Z., Tschannen M., Itti L., Anandkumar A., Born again neural networks, International Conference on Machine Learning, PMLR, 1607–1616, 2018.
Gholami S., Omar M., Can a student large language model perform as well as it’s teacher?, arXiv:2310.02421, 2023.
Gholami S., Omar M., Do generative large language models need billions of parameters?, arXiv:2309.06589, 2023.
Hanif H., Maffeis S., VulBERTa: simplified source code pre-training for vulnerability detection, 2022 International Joint Conference on Neural Networks, 18–23 July 2022, Padua, Italy, 1–8, 2022.
Kim S., Woo S., Lee H., Oh H., VUDDY: a scalable approach for vulnerable code clone discovery, 2017 IEEE Symposium on Security and Privacy, IEEE, 22–26 May 2017, San Jose, California, USA, 595–614, 2017.
Kim S., Choi J., Ahmed M.E., Nepal S., Kim H., VulDeBERT: a vulnerability detection system using BERT, 2022 International Symposium on Software Reliability Engineering Workshops, IEEE, 31 October–3 November 2022, Charlotte, North Carolina, USA, 69–74, 2022.
Li Z., Zou D., Xu S., Jin H., Qi H., Hu J., VulPecker: an automated vulnerability detection system based on code similarity analysis, Proceedings of the 32nd Annual Conference on Computer Security Applications, Association for Computing Machinery, New York, USA, 5–8 December 2016, Los Angeles, California, USA, 201–213, 2016.
Omar M., Application of machine learning (ML) to address cybersecurity threats, In Machine Learning for Cybersecurity: Innovative Deep Learning Solutions, Springer, 1–11, 2022.
Omar M., Machine Learning for Cybersecurity: Innovative Deep Learning Solutions, Springer, 2022.
Omar M., Machine Learning for Cybersecurity: Innovative Deep Learning Solutions (Chapter: Malware anomaly detection using local outlier factor technique), Springer, 2022.
Omar M., Backdoor learning for NLP: recent advances, challenges, and future research directions, arXiv:2302.06801, 2023.
Omar M., VulDefend: a novel technique based on pattern exploiting training for detecting software vulnerabilities using language models, 2023 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology, IEEE, 22–24 May 2023, Amman, Jordan, 287–293, 2023.
Omar M., Burrell D., From text to threats: a language model approach to software vulnerability detection, International Journal of Mathematics and Computer in Engineering, 2(1), 23–34, 2024.
Omar M., Choi S., Nyang D., Mohaisen D., Quantifying the performance of adversarial training on language models with distribution shifts, Proceedings of the 1st Workshop on Cybersecurity and Social Sciences, 30 May 2022, Nagasaki, Japan, 3–9, 2022.
Omar M., Choi S., Nyang D., Mohaisen D., Robust natural language processing: recent advances, challenges, and future directions, arXiv:2201.00768, 2022.
Omar M., Jones R., Burrell D.N., Dawson M., Nobles C., Mohammed M., Bashir A.K., Harnessing the power and simplicity of decision trees to detect IoT Malware, Transformational Interventions for Business Technology and Healthcare, 215–229, 2023.
Omar M., Mohaisen D., Making adversarially-trained language models forget with model retraining: a case study on hate speech detection, Companion Proceedings of the Web Conference 2022, Virtual Event, 25–29 April 2022, Lyon, France, 887–893, 2022.
Rabheru R., Hanif H., Maffeis S., DeepTective: detection of PHP vulnerabilities using hybrid graph neural networks, Proceedings of the 36th Annual ACM Symposium on Applied Computing, Virtual Event, Republic of Korea, 22–26 March 2021, 1687–1690, 2021.
Radford A., Wu J., Child R., Luan D., Amodei D., Sutskever I., Language models are unsupervised multitask learners, OpenAI Blog, 1(8), 9, 2019.
Russell R., Kim L., Hamilton L., Lazovich T., Harer J., Ozdemir O., Ellingwood P., McConley M., Automated vulnerability detection in source code using deep representation learning, 2018 17th IEEE International Conference on Machine Learning and Applications, IEEE, 17–20 December 2018, Orlando, Florida, USA, 757–762, 2018.
Saleem M.A., Li X., Mahmood K., Shamshad S., Ayub M.F., Bashir A.K., Omar M., Provably secure conditional-privacy access control protocol for intelligent customers-centric communication in VANET, IEEE Transactions on Consumer Electronics, 2023.
Salimi S., Kharrazi M., VulSlicer: vulnerability detection through code slicing, Journal of Systems and Software, 193, 111450, 2022.
Shoeybi M., Patwary M., Puri R., LeGresley P., Casper J., Catanzaro B., Megatron-LM: training multi-billion parameter language models using model parallelism, arXiv:1909.08053, 2019.
Yamaguchi F., Golde N., Arp D., Rieck K., Modeling and discovering vulnerabilities with code property graphs, 2014 IEEE Symposium on Security and Privacy, 590–604, 2014.
Yan R., Xiao X., Hu G., Peng S., Jiang Y., New deep learning method to detect code injection attacks on hybrid applications, Journal of Systems and Software, 137, 67–77, 2018.
Zheng Y., Pujar S., Lewis B., Buratti L., Epstein E., Yang B., Laredo J., Morari A., Su Z., D2A: a dataset built for AI-based vulnerability detection methods using differential analysis, 2021 IEEE/ACM 43rd International Conference on Software Engineering: Software Engineering in Practice, IEEE, 25–28 May 2021, Madrid, Spain, 111–120, 2021.
Zhou X., Verma R.M., Vulnerability detection via multimodal learning: datasets and analysis, Proceedings of the 2022 ACM on Asia Conference on Computer and Communications Security, 30 May–3 June 2022, Nagasaki, Japan, 1225–1227, 2022.
Zhou Y., Liu S., Siow J.K., Du X., Liu Y., Advances in Neural Information Processing Systems (Chapter: Devign: Effective vulnerability identification by learning comprehensive program semantics via graph neural networks), 10197–10207, 2019.
Zou D., Wang S., Xu S., Li Z., Jin H., µVulDeePecker: a deep learning-based system for multiclass vulnerability detection, IEEE Transactions on Dependable and Secure Computing, 18(5), 2224–2236, 2021.