Harnessing GPT-2 for Feature Extraction in Malware Detection: A Novel Approach to Cybersecurity

Open Access | Feb 2024

Abstract

The growing complexity of malware presents a formidable challenge to cybersecurity efforts and is rendering traditional signature-based detection methods increasingly obsolete. These methods struggle to keep pace with the rapid evolution of malware, particularly polymorphic and metamorphic variants designed to bypass conventional detection. This study introduces a novel approach to malware detection that uses GPT-2, a language model developed by OpenAI, for feature extraction. By applying GPT-2 to the EMBER and Drebin datasets, the research explores the model’s effectiveness in identifying malware through the intricate patterns present in binary data. Although GPT-2 was originally designed for natural language processing, its application in this context demonstrates significant potential for enhancing malware detection strategies. The model’s proficiency in extracting complex features from binary sequences marks a notable advance over traditional methods, providing a more adaptive and robust mechanism for identifying malicious software. The study also acknowledges the challenges posed by the limited interpretability of deep learning models and their susceptibility to adversarial attacks, which underscore the need for ongoing innovation in cybersecurity. This unconventional use of GPT-2 for feature extraction not only showcases the model’s versatility beyond language tasks but also sets a precedent for applying unsupervised learning models to strengthen cybersecurity defenses.

DOI: https://doi.org/10.2478/raft-2024-0008 | Journal eISSN: 3100-5071 | Journal ISSN: 3100-5063
Language: English
Page range: 74 - 84
Published on: Feb 28, 2024
Published by: Nicolae Balcescu Land Forces Academy
In partnership with: Paradigm Publishing Services
Publication frequency: 4 times per year

© 2024 Mahmoud Basharat, Marwan Omar, published by Nicolae Balcescu Land Forces Academy
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.