The ever increasing number of Android malware has always been a concern for cybersecurity professionals. Even though plenty of anti-malware solutions exist, we hypothesize that the performance of existing approaches can be improved by deriving relevant attributes through effective feature selection methods. In this paper, we propose a novel two-step feature selection approach based on Rough Set and Statistical Test named as RSST to extract refined system calls, which can effectively discriminate malware from benign apps. By refined set of system call, we mean the existence of highly relevant calls that are uniformly distributed thought target classes. Moreover, an optimal attribute set is created, which is devoid of redundant system calls. To address the problem of higher dimensional attribute set, we derived suboptimal system call space by applying the proposed feature selection method to maximize the separability between malware and benign samples. Comprehensive experiments conducted on three datasets resulted in an accuracy of 99.9%, Area Under Curve (AUC) of 1.0, with 1% False Positive Rate (FPR). However, other feature selectors (Information Gain, CFsSubsetEval, ChiSquare, FreqSel, and Symmetric Uncertainty) used in the domain of malware analysis resulted in the accuracy of 95.5% with 8.5% FPR. Moreover, the empirical analysis of RSST derived system calls outperformed other attributes such as permissions, opcodes, API, methods, call graphs, Droidbox attributes, and network traces.
Identification of Android Malware Using Refined System Calls.
Vinod P.;Mohammad Shojafar;Mauro Conti
2019
Abstract
The ever increasing number of Android malware has always been a concern for cybersecurity professionals. Even though plenty of anti-malware solutions exist, we hypothesize that the performance of existing approaches can be improved by deriving relevant attributes through effective feature selection methods. In this paper, we propose a novel two-step feature selection approach based on Rough Set and Statistical Test named as RSST to extract refined system calls, which can effectively discriminate malware from benign apps. By refined set of system call, we mean the existence of highly relevant calls that are uniformly distributed thought target classes. Moreover, an optimal attribute set is created, which is devoid of redundant system calls. To address the problem of higher dimensional attribute set, we derived suboptimal system call space by applying the proposed feature selection method to maximize the separability between malware and benign samples. Comprehensive experiments conducted on three datasets resulted in an accuracy of 99.9%, Area Under Curve (AUC) of 1.0, with 1% False Positive Rate (FPR). However, other feature selectors (Information Gain, CFsSubsetEval, ChiSquare, FreqSel, and Symmetric Uncertainty) used in the domain of malware analysis resulted in the accuracy of 95.5% with 8.5% FPR. Moreover, the empirical analysis of RSST derived system calls outperformed other attributes such as permissions, opcodes, API, methods, call graphs, Droidbox attributes, and network traces.Pubblicazioni consigliate
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.