From text to threats: A language model approach to software vulnerability detection

Omar, Marwan; Burrell, Darrell

doi:10.2478/ijmce-2024-0003

References

Alharbi A.R., Hijji M., Aljaedi A., Enhancing topic clustering for Arabic security news based on k-means and topic modelling, IET Networks, 10(6), 278–294, 2021.
Beyer L., Zhai X., Royer A., Markeeva L., Anil R., Kolesnikov A., Knowledge distillation: A good teacher is patient and consistent, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 18–24 2022, New Orleans, LA, USA, 10915–10924, 2022.
Furlanello T., Lipton Z., Tschannen M., Itti L., Anandkumar A., Born again neural networks, Proceedings of the 35th International Conference on Machine Learning, 10–15 July 2018, Stockholmsmässan, Stockholm, Sweden, 80, 1607–1616, 2018.
Xie C., Tan M., Gong B., Wang J., Yuille A.L., Le Q.V., Adversarial examples improve image recognition, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 13–19 June 2020, Seattle, WA, USA, 819–828, 2020.
Chen Z., Xie X., Li Y., Luo J., Code representation learning with AST paths and their contexts, Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis, July 16–21 2018, Amsterdam, Netherlands, 312–322, 2018.
Hanif H., Maffeis S., VulBERTa: simplified source code pre-training for vulnerability detection, 2022 International Joint Conference on Neural Networks (IJCNN), IEEE, 18–23 July 2022, Padua, Italy, 1–8, 2022.
Hinton G., Vinyals O., Dean J., Distilling the knowledge in a neural network, arXiv:1503.02531, 2015.
Kim S., Woo S., Lee H., Oh H., VUDDY: A scalable approach for vulnerable code clone discovery, 2017 IEEE Symposium on Security and Privacy (SP), IEEE, 22–26 May 2017, San Jose, California, USA, 595–614, 2017.
Kim S., Choi J., Ahmed M.E., Nepal S., Kim H., VulDeBERT: A vulnerability detection system using BERT, 2022 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW), IEEE, October 31 to November 3 2022, Charlotte, North Carolina, USA, 69–74, 2022.
Kingma D.P., Ba J., Adam: A method for stochastic optimization, arXiv:1412.6980, 2014.
Li Z., Zou D., Xu S., Jin H., Qi H., Hu J., VulPecker: An automated vulnerability detection system based on code similarity analysis, Proceedings of the 32nd Annual Conference on Computer Security Applications, December 5–8, 2016, Los Angeles, California, USA, 201–213, 2016.
Rabheru R., Hanif H., Maffeis S., DeepTective: detection of PHP vulnerabilities using hybrid graph neural networks, Proceedings of the 36th Annual ACM Symposium on Applied Computing, March 22–26, 2021, Virtual Event, Republic of Korea, 1687–1690, 2021.
Salimi S., Kharrazi M., VulSlicer: Vulnerability detection through code slicing, Journal of Systems and Software, 193, 111450, 2022.
Yamaguchi F., Golde N., Arp D., Rieck K., Modeling and discovering vulnerabilities with code property graphs, 2014 IEEE Symposium on Security and Privacy, IEEE, May 18–21 2014, Berkeley, California, USA, 590–604, 2014.
Zhou X., Verma R.M., Vulnerability detection via multimodal learning: Datasets and analysis, Proceedings of the 2022 ACM on Asia Conference on Computer and Communications Security, 30 May to June 3 2022, Nagasaki, Japan, 1225–1227, 2022.
Russell R., Kim L., Hamilton L., Lazovich T., Harer J., Ozdemir O., Ellingwood P., McConley M., Automated vulnerability detection in source code using deep representation learning, 17th IEEE International Conference on Machine Learning and Applications (ICMLA), IEEE, December 17–20 2018, Orlando, Florida, USA, 757–762, 2018.
Li Z., Zou D., Xu S., Ou X., Jin H., Wang S., Deng Z., Zhong Y., VulDeePecker: A deep learning-based system for vulnerability detection, arXiv:1801.01681, 2018.
Zou D., Wang S., Xu S., Li Z., Jin H., µVulDeePecker: A deep learning-based system for multiclass vulnerability detection, IEEE Transactions on Dependable and Secure Computing, 18(5), 2224–2236, 2021.
Zhou Y., Liu S., Siow J., Du X., Liu Y., Devign: effective vulnerability identification by learning comprehensive program semantics via graph neural networks, Proceedings of the 33rd International Conference on Neural Information Processing Systems, December 8–14 2019, Vancouver, British Columbia, Canada, 10197–10207, 2019.
Fu M., Nguyen V., Tantithamthavorn C.K., Le T., Phung D., VulExplainer: a transformer-based hierarchical distillation for explaining vulnerability types, IEEE Transactions on Software Engineering, 49(10), 4550–4565, 2023.
Gholami S., Omar M., Do generative large language models need billions of parameters?, arXiv:2309.06589, 2023.
Gholami S., Omar M., Can pruning make large language models more efficient?, arXiv:2310.04573, 2023.
Saleem M.A., Li X., Mahmood K., Shamshad S., Ayub M.F., Bashir A.K., Omar M., Provably secure conditional-privacy access control protocol for intelligent customers-centric communication in VANET, IEEE Transactions on Consumer Electronics, doi:10.1109/TCE.2023.3324273, 2023.
Lan X., Zhu X., Gong S., Knowledge distillation by on-the-fly native ensemble, arXiv:1806.04606, 2018.
Shoeybi M., Patwary M., Puri R., LeGresley P., Casper J., Catanzaro B., Megatron-LM: Training multi-billion parameter language models using model parallelism, arXiv:1909.08053, 2019.
Zheng Y., Pujar S., Lewis B., Buratti L., Epstein E., Yang B., Laredo J., Morari A., Su Z., D2A: A dataset built for AI-based vulnerability detection methods using differential analysis, 43rd International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP), IEEE, May 25–28, 2021, Virtual Event, Spain, 111–120, 2021.
Li Z., Zou D., Xu S., Jin H., Zhu Y., Chen Z., SySeVR: A framework for using deep learning to detect software vulnerabilities, IEEE Transactions on Dependable and Secure Computing, 19(4), 2244–2258, 2022.
