From text to threats: A language model approach to software vulnerability detection

By: Marwan Omar and Darrell Burrell
Open Access | Oct 2023

References

  1. Alharbi A.R., Hijji M., Aljaedi A., Enhancing topic clustering for Arabic security news based on k-means and topic modelling, IET Networks, 10(6), 278–294, 2021.
  2. Beyer L., Zhai X., Royer A., Markeeva L., Anil R., Kolesnikov A., Knowledge distillation: A good teacher is patient and consistent, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 18–24 2022, New Orleans, LA, USA, 10915–10924, 2022.
  3. Furlanello T., Lipton Z., Tschannen M., Itti L., Anandkumar A., Born again neural networks, Proceedings of the 35th International Conference on Machine Learning, 10–15 July 2018, Stockholmsmässan, Stockholm, Sweden, 80, 1607–1616, 2018.
  4. Xie C., Tan M., Gong B., Wang J., Yuille A.L., Le Q.V., Adversarial examples improve image recognition, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 13–19 June 2020, Seattle, WA, USA, 819–828, 2020.
  5. Chen Z., Xie X., Li Y., Luo J., Code representation learning with AST paths and their contexts, Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis, July 16–21 2018, Amsterdam, Netherlands, 312–322, 2018.
  6. Hanif H., Maffeis S., VulBERTa: simplified source code pre-training for vulnerability detection, 2022 International Joint Conference on Neural Networks (IJCNN), IEEE, 18–23 July 2022, Padua, Italy, 1–8, 2022.
  7. Hinton G., Vinyals O., Dean J., Distilling the knowledge in a neural network, arXiv:1503.02531, 2015.
  8. Kim S., Woo S., Lee H., Oh H., VUDDY: A scalable approach for vulnerable code clone discovery, 2017 IEEE Symposium on Security and Privacy (SP), IEEE, 22–26 May 2017, San Jose, California, USA, 595–614, 2017.
  9. Kim S., Choi J., Ahmed M.E., Nepal S., Kim H., VulDeBERT: A vulnerability detection system using BERT, 2022 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW), IEEE, October 31 to November 3 2022, Charlotte, North Carolina, USA, 69–74, 2022.
  10. Kingma D.P., Ba J., Adam: A method for stochastic optimization, arXiv:1412.6980, 2014.
  11. Li Z., Zou D., Xu S., Jin H., Qi H., Hu J., VulPecker: An automated vulnerability detection system based on code similarity analysis, Proceedings of the 32nd Annual Conference on Computer Security Applications, December 5–8, 2016, Los Angeles, California, USA, 201–213, 2016.
  12. Rabheru R., Hanif H., Maffeis S., DeepTective: detection of PHP vulnerabilities using hybrid graph neural networks, Proceedings of the 36th Annual ACM Symposium on Applied Computing, March 22–26, 2021, Virtual Event, Republic of Korea, 1687–1690, 2021.
  13. Salimi S., Kharrazi M., VulSlicer: Vulnerability detection through code slicing, Journal of Systems and Software, 193, 111450, 2022.
  14. Yamaguchi F., Golde N., Arp D., Rieck K., Modeling and discovering vulnerabilities with code property graphs, 2014 IEEE Symposium on Security and Privacy, IEEE, May 18–21 2014, Berkeley, California, USA, 590–604, 2014.
  15. Zhou X., Verma R.M., Vulnerability detection via multimodal learning: Datasets and analysis, Proceedings of the 2022 ACM on Asia Conference on Computer and Communications Security, 30 May to June 3 2022, Nagasaki, Japan, 1225–1227, 2022.
  16. Russell R., Kim L., Hamilton L., Lazovich T., Harer J., Ozdemir O., Ellingwood P., McConley M., Automated vulnerability detection in source code using deep representation learning, 17th IEEE International Conference on Machine Learning and Applications (ICMLA), IEEE, December 17–20 2018, Orlando, Florida, USA, 757–762, 2018.
  17. Li Z., Zou D., Xu S., Ou X., Jin H., Wang S., Deng Z., Zhong Y., VulDeePecker: A deep learning-based system for vulnerability detection, arXiv:1801.01681, 2018.
  18. Zou D., Wang S., Xu S., Li Z., Jin H., µVulDeePecker: A deep learning-based system for multiclass vulnerability detection, IEEE Transactions on Dependable and Secure Computing, 18(5), 2224–2236, 2021.
  19. Zhou Y., Liu S., Siow J., Du X., Liu Y., Devign: effective vulnerability identification by learning comprehensive program semantics via graph neural networks, Proceedings of the 33rd International Conference on Neural Information Processing Systems, December 8–14 2019, Vancouver, British Columbia, Canada, 10197–10207, 2019.
  20. Fu M., Nguyen V., Tantithamthavorn C.K., Le T., Phung D., VulExplainer: a transformer-based hierarchical distillation for explaining vulnerability types, IEEE Transactions on Software Engineering, 49(10), 4550–4565, 2023.
  21. Gholami S., Omar M., Do generative large language models need billions of parameters?, arXiv:2309.06589, 2023.
  22. Gholami S., Omar M., Can pruning make large language models more efficient?, arXiv:2310.04573, 2023.
  23. Saleem M.A., Li X., Mahmood K., Shamshad S., Ayub M.F., Bashir A.K., Omar M., Provably secure conditional-privacy access control protocol for intelligent customers-centric communication in VANET, IEEE Transactions on Consumer Electronics, doi: 10.1109/TCE.2023.3324273, 2023.
  24. Lan X., Zhu X., Gong S., Knowledge distillation by on-the-fly native ensemble, arXiv:1806.04606, 2018.
  25. Shoeybi M., Patwary M., Puri R., LeGresley P., Casper J., Catanzaro B., Megatron-LM: Training multi-billion parameter language models using model parallelism, arXiv:1909.08053, 2019.
  26. Zheng Y., Pujar S., Lewis B., Buratti L., Epstein E., Yang B., Laredo J., Morari A., Su Z., D2A: A dataset built for AI-based vulnerability detection methods using differential analysis, 43rd International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP), IEEE, May 25–28, 2021, Virtual Event, Spain, 111–120, 2021.
  27. Li Z., Zou D., Xu S., Jin H., Zhu Y., Chen Z., SySeVR: A framework for using deep learning to detect software vulnerabilities, IEEE Transactions on Dependable and Secure Computing, 19(4), 2244–2258, 2022.
Language: English
Page range: 23 - 34
Submitted on: Sep 4, 2023
Accepted on: Oct 26, 2023
Published on: Oct 31, 2023
Published by: Harran University
In partnership with: Paradigm Publishing Services
Publication frequency: 2 issues per year

© 2023 Marwan Omar, Darrell Burrell, published by Harran University
This work is licensed under the Creative Commons Attribution 4.0 License.