key: cord-0059537-su1trutu
authors: Han, RuiDong; Yang, Chao; Ma, JianFeng; Ma, Siqi; Wang, YunBo; Li, Feng
title: IMShell-Dec: Pay More Attention to External Links in PowerShell
date: 2020-08-01
journal: ICT Systems Security and Privacy Protection
DOI: 10.1007/978-3-030-58201-2_13
sha: c6574ba0edcefbbf9b26914657f49969d2b0a8d6
doc_id: 59537
cord_uid: su1trutu

Windows introduced the PowerShell command-line shell as a replacement for the traditional CMD. However, attackers often exploit its versatile functionality to compromise victims. In this paper, we investigate an attack that combines PowerShell with image steganography. Compared with traditional methods, this attack can deceive defenders by hiding its malicious content in benign-looking images. To detect this attack effectively, we propose a framework, IMShell-Dec, whose main goal is to check external links before a PowerShell script is executed. IMShell-Dec trains a machine learning classifier on image examples, where the features are generated by merging the histograms of the three image color channels. IMShell-Dec then examines a script by tracking and classifying the images it references. The detector achieves more than 95% precision on a dataset of 9,589 high-definition images.

Windows PowerShell is an adaptive and versatile command-line shell environment. It allows users to take advantage of the .NET Framework [12, 20], but it also gives attackers additional functions for building malicious scripts. Several open-source frameworks (e.g., Empire, Nishang, PowerSploit) exploit it to attack victims. Traditional methods for detecting malicious scripts [1, 5] rely on regular expression matching and complex rules. Regular expressions are time-consuming to create when analyzing PowerShell scripts, and complex rules are hard to derive and pose a maintenance burden as attack methods evolve. Recently, several automated solutions have been proposed to address these issues. Hendler et al. [8] leverage deep neural networks to detect obfuscated malicious PowerShell scripts, encoding characters as features to train a classifier. Zhenyuan et al. [15] design a novel subtree-based de-obfuscation method, since attackers routinely use obfuscation to conceal their malicious content. They implement obfuscation detection and emulation-based recovery on the abstract syntax tree. PowerDrive [19], a de-obfuscator for PowerShell attacks, recursively de-obfuscates code through multi-stage de-obfuscation.

Previous works assume that the payload exists in the form of a script. However, we find that attackers can mount their malicious PowerShell payload on a harmless medium outside of the script. Specifically, attackers may hide malicious PowerShell content in an external resource and use another harmless script to recover it later, which eliminates the distinctive characteristics caused by excessive obfuscation. In this work, we focus on PowerShell attacks combined with image steganography, where the attacker injects a PowerShell script's information into an image's color channels and then generates a separate PowerShell release script to decode the malicious content from the image. Both the release script and the image itself appear harmless and, to improve stealthiness, the release script is usually embedded into a file (e.g., Office, JavaScript, C#) before being delivered to the victims. When they run the file, the latent PowerShell release script retrieves the image and releases the malicious script. The malicious script can download Web files with the framework plugin WebClient, establish remote control by sending requests to a remote service, set up a persistence mechanism by creating a scheduled task, or forcefully uninstall a local application.

To counter this attack, we propose a novel machine-learning-based detection method named IMShell-Dec. Unlike previous research, which only considers the security of the script itself, we also consider external links, since attackers can conceal their real malicious script in an external resource. We locate the external resources referenced in the script, then apply a machine-learning-based method to check them. We use the color histogram as the feature and train a classifier to identify malicious scripts.

Our contributions are twofold. First, we study a new type of PowerShell attack. It hides the malicious script in an image and generates a standard release script, which cannot be detected by existing detection methods. Second, to address this emerging threat, we propose IMShell-Dec, which locates and identifies potentially malicious content hidden in external images. IMShell-Dec achieves more than 95% precision on a dataset of 9,589 high-definition images.

The rest of this paper is organized as follows. Sect. 2 introduces the threat model of the PowerShell attack, including the victim setting. Sect. 3 reports the detailed process of the threat. Sect. 4 presents the detection mechanism, which combines the image color histogram feature with machine learning. Sect. 5 describes how we generate data samples and reports the detection performance of our method. Finally, related work and conclusions are presented in Sect. 6 and Sect. 7, respectively.

In this paper, we explore a novel attack that combines a PowerShell attack with image steganography. In this attack, the attacker generates two resources: an image and a trap file containing a release script. The attacker then spreads the trap file through Web documents, Webmail, or USB devices, and attempts to fool potential victims into granting execution permissions to the release script. The release script then decodes the malicious script from the image, which is hosted on a website or sent to the victim along with the trap file. The whole attack flow is shown in Fig. 1. The scope of the attack is limited to the following scenarios. The target's system is Windows 7 or newer, since Microsoft ships PowerShell as a default application in these versions. The victim is unaware of or unfamiliar with the system security policy and is not proficient in PowerShell. When victims receive a trap file, they agree to run it and grant the necessary permissions to the release script. For example, it is common for staff to download Office Word documents from the Internet and open them with a local editor. When the document asks for permission to update its sources or modify the file, users often click the confirmation button without paying attention to the prompt in the dialog box. Such an action grants the file specific permissions, which allows the release script to retrieve a malicious payload and launch an attack.

In this section, we demonstrate the attack process through a concrete example and explain why the two parts of the attack can evade detection.

Conventional rule-based detection methods mainly rely on the character form of a PowerShell script to separate benign from malicious content. However, image steganography allows the attacker to conceal the malicious payload in an external image, thus bypassing existing script detection. The attacker can then use a release script, which is indistinguishable from common benign scripts, to recover the payload and execute the intended attack.

In this work, we assume the attacker uses Invoke-PSImage, a commonly used PowerShell tool, to generate the steganographic image. Invoke-PSImage embeds the bytes of a PowerShell script into the pixels of a PNG image by using the least significant 4 bits of 2 color values in each pixel to hold the payload, then generates a release script that can extract the original payload later. Treated separately, both the release script and the image appear harmless: the image is a PNG file, and the script's content is no more than a benign PowerShell command. The diverse formats of the release script further strengthen the stealthiness, as the script can be dropped into an Office document, VBScript, JavaScript, a BAT script, or a Base64 certificate. Once the attacker lures the user into opening or running the file with certain permissions, an image decoding command is executed in memory without any GUI activity. The malicious payload is then extracted from the image in local or remote storage, and the intended attack is launched. As mentioned in our threat model, the release script is embedded in another file to ensure it can sneak into the user's system environment. For example, Windows provides several methods for transferring data between applications. One method is the Dynamic Data Exchange (DDE) protocol [10]. The DDE protocol enables macro-less code execution in Office documents. Although Microsoft restricted it in ADV170021 (December 2017), some users have still not installed this patch. We conducted a pilot experiment on a colleague's computer with Office 2013 (15.0.4.4569.1504) installed, and found that this older version of Office can execute PowerShell code under the default permissions. Excel4-DCOM enables raw shellcode execution on a remote Excel (32-bit) instance, which opens the possibility of combining a shellcode attack with lateral movement.
JavaScript is capable of running a PowerShell script by using the "child process" component; it can also start a process that executes the local PowerShell.exe to run a script. The .NET Framework also manages applications through the SCM (Service Control Manager), where PowerShell scripts can be invoked from C# through a public API.
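To make the nibble embedding attributed to Invoke-PSImage earlier in this section concrete, the following minimal Python sketch hides each payload byte in the low 4 bits of two color values of one pixel and recovers it again. It illustrates the general idea only and is not Invoke-PSImage's actual code: the choice of the G and B channels, the nibble order, and the function names `embed_payload` and `extract_payload` are assumptions made for this example.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each value in 0..255


def embed_payload(pixels: List[Pixel], payload: bytes) -> List[Pixel]:
    """Hide one payload byte per pixel in the low nibbles of G and B.

    Assumption: the high nibble of the byte goes to G and the low nibble
    to B; the source only states that 4 bits of 2 color values are used.
    """
    if len(payload) > len(pixels):
        raise ValueError("payload does not fit into the image")
    out = list(pixels)
    for i, byte in enumerate(payload):
        r, g, b = out[i]
        g = (g & 0xF0) | (byte >> 4)    # keep the upper 4 bits of G
        b = (b & 0xF0) | (byte & 0x0F)  # keep the upper 4 bits of B
        out[i] = (r, g, b)
    return out


def extract_payload(pixels: List[Pixel], length: int) -> bytes:
    """Rebuild the payload bytes from the low nibbles of G and B."""
    return bytes(((g & 0x0F) << 4) | (b & 0x0F) for _, g, b in pixels[:length])
```

Because only the lower 4 bits change, each affected color value moves by at most 15, which is hard to see in a high-definition image. The overwritten nibbles no longer follow the image's natural tonal distribution, however, and this is the property the detection method relies on.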

We perform experiments to assess the capability of the attack using sample PowerShell scripts in three different forms. We also explain why release scripts can escape the victim's attention and why image steganography makes the attack payload harder to detect.

To verify the sensitivity of different defenders to scripts, we collect a corpus of 4,079 PowerShell scripts from iocs, which contains 27 kinds of malicious PowerShell scripts. The most frequent script type is Downloader DFSP, which downloads a file with WebClient. To test the response of the defenders, we reproduce a Downloader DFSP example as provided by iocs and prepare it in three script forms: the original script, a Base64-encoded obfuscated script, and an image steganography script. In this simulation, the script downloads the 7z application (in a real attack, a malicious file) and executes it. More specifically, the original script (see Fig. 2a) calls WebClient to download "7z.exe" into the local directory "$HOME\Documents" and executes it. The script can also be encoded in Base64 (see Fig. 2b) and executed directly through PowerShell with the option "-enc". For the image steganography attack, we encode the script into an image's color channels with Invoke-PSImage, which generates a lossless PNG image and a release script (see Fig. 2c). Figure 4 compares the original image with its steganographically processed copy.

We evaluate the stealthiness of the methods by observing each defender's response during the execution of the scripts. Before this experiment, we download the latest defenders from their official websites and install them on Windows 10 (1903) with PowerShell version 5.1.18362. Nine experimental results for the security defenders are listed in Fig. 3. We observe that none of the tested defenders raises a warning for the original script, a natural result since the script itself does not contain any abnormal behavior. However, defenders can easily intercept a plain download attempt from a malicious URL. Even when we obfuscate the original (malicious) script with deep embedding, half of the defenders report that the script is operating suspiciously. This observation is consistent with the findings in [8, 11]. Image steganography conceals the true payload in a legitimate medium and extracts it later through an independent, benign-looking release script, thus bypassing conventional script detection methods.

As for the image, both the defender and the firewall only examine the script itself and pay no attention to the external image it references. Besides, as Fig. 4 shows, it is difficult to notice any blemish in the steganographic image with the naked eye.

To address the above PowerShell attacks, we propose a machine-learning-based defense framework, IMShell-Dec. In this section, we provide an overview of the proposed framework and describe its two key components: the feature extractor and the detection model.

IMShell-Dec is a detection framework that aims to identify suspicious payloads hidden in images. The overview of IMShell-Dec is illustrated in Fig. 5. When IMShell-Dec receives an unknown script, it starts by searching for external image links in the script and attempts to retrieve the image for every link located. Once an image has been successfully retrieved, the feature extractor transforms it into useful features, and the detection model then determines the category of the image. If the detection model labels the image as malicious, IMShell-Dec marks the source script as suspicious and raises a warning to the user. In the following subsections, we describe the two key components of our framework in detail: the feature extractor and the detection model.
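The first step of this workflow, locating external image links in a script, could be approximated with simple pattern matching. The sketch below is an assumption about how such a step might look, not the paper's implementation: the regular expression, the list of image extensions, and the name `find_image_links` are all hypothetical.

```python
import re
from typing import List

# Hypothetical pattern: an http(s) URL, a drive-letter path, a UNC path, or a
# relative path that ends in a common image extension.
IMAGE_LINK = re.compile(
    r"""(?:https?://|[A-Za-z]:\\|\\\\|\.{1,2}[\\/])[^\s'"()]+?\.(?:png|bmp|jpe?g|gif)\b""",
    re.IGNORECASE,
)


def find_image_links(script: str) -> List[str]:
    """Return the distinct image links found in a script, in order of appearance."""
    return list(dict.fromkeys(IMAGE_LINK.findall(script)))


release_script = (
    'sal a New-Object; $g = a System.Drawing.Bitmap('
    '(a Net.WebClient).OpenRead("http://example.com/cat.png"))'
)
print(find_image_links(release_script))
```

A purely textual match like this only finds links that appear literally in the script; a path that is assembled at run time or deliberately obfuscated would be missed.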

Before calling the detection model, we use the feature extractor to distill useful information from the raw images. A pixel in a typical RGB image consists of three integers, where each integer represents one color channel with a value from 0 to 255. If we plot, for each channel, the number of pixels at each possible value, we obtain a frequency graph that represents the tonal distribution of a digital image. Such a graph is called a "histogram". The histogram of an unmodified image usually tends to be smooth. However, steganography tools like Invoke-PSImage introduce additional offsets to pixel values, which may break the smooth shape of the distribution.
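As a minimal sketch of the histogram described above, the following Python function counts, for each of the three channels, how many pixels take each value from 0 to 255. The pixel representation (an iterable of `(R, G, B)` tuples) and the name `channel_histograms` are assumptions for illustration; in practice the pixels would come from an image decoding library.

```python
from typing import Iterable, List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each value in 0..255


def channel_histograms(pixels: Iterable[Pixel]) -> List[List[int]]:
    """Return three 256-bin histograms, one each for R, G, and B."""
    hists = [[0] * 256 for _ in range(3)]
    for pixel in pixels:
        for channel in range(3):
            hists[channel][pixel[channel]] += 1
    return hists


sample = [(0, 0, 0), (255, 0, 10), (255, 0, 10)]
red, green, blue = channel_histograms(sample)
print(red[255], green[0], blue[10])
```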

To examine this conjecture, we record several image histograms and compare the smoothness of the distribution before and after the steganographic process. As Fig. 6 shows, the steganographic process introduces numerous small yet obvious spikes in the image histogram. Hence, we apply a filter with kernel [−0.5, 1, −0.5] to each color histogram and merge the results of the three channels into one feature vector, which reflects the smoothness of the transition between a particular value and its neighbors. To neutralize the influence of the frequency scale, we further apply min-max normalization and rescale the feature vector to the range [−1, 1]. The visualized image features are displayed in Fig. 7. The figure shows that the features extracted from a benign image and a malicious image are quite different.
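The filtering and normalization steps can be sketched as follows. The text fixes the kernel [−0.5, 1, −0.5] and the target range [−1, 1]; everything else here is an assumption for illustration: the border bins are handled by repeating the edge value, the normalization is applied once over the merged vector, a constant vector maps to all zeros, and the function names are hypothetical.

```python
from typing import List, Sequence


def smoothness_response(hist: Sequence[float]) -> List[float]:
    """Apply the kernel [-0.5, 1, -0.5] to one histogram.

    Assumption: border bins reuse their own value as the missing neighbor.
    """
    last = len(hist) - 1
    return [
        -0.5 * hist[max(i - 1, 0)] + hist[i] - 0.5 * hist[min(i + 1, last)]
        for i in range(len(hist))
    ]


def min_max_scale(values: Sequence[float], lo: float = -1.0, hi: float = 1.0) -> List[float]:
    """Rescale values linearly so that the minimum maps to lo and the maximum to hi."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:  # constant input: no spread to rescale
        return [0.0] * len(values)
    return [lo + (hi - lo) * (v - vmin) / (vmax - vmin) for v in values]


def extract_features(hists: Sequence[Sequence[float]]) -> List[float]:
    """Filter each channel histogram, merge the channels, and normalize."""
    merged: List[float] = []
    for hist in hists:
        merged.extend(smoothness_response(hist))
    return min_max_scale(merged)
```

On a perfectly flat or linearly increasing histogram the interior responses are zero, while an isolated spike produces a large positive response flanked by negative ones, so the feature emphasizes exactly the kind of irregularity the steganographic process introduces.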

Once the image has been processed into a feature vector, we can use a detection model to classify it into one of two categories: benign or malicious. To obtain this model, we need to train it on a training set before deployment.

A training set contains a set of images with ground-truth labels, where each image is processed beforehand by the feature extractor to generate the corresponding feature vector. During training, these features are merged into one matrix; the detection model takes the feature matrix as input and outputs its predictions. By comparing the predictions with the ground truth, the machine learning algorithm corrects and updates the detection model, thus improving its classification performance. When training is complete, we freeze the model parameters and deploy the discriminative model to predict unseen images.

The label prediction process takes as input the discriminative model and a feature vector whose label is to be predicted. The discriminative model assigns the feature vector a likelihood of belonging to each of the two categories. The category with the highest likelihood is output as the predicted label. The prediction for the feature vector is the verdict on the image.
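The prediction step amounts to taking the category with the largest likelihood. The sketch below shows this under stated assumptions: the model is represented as any callable that maps a feature vector to per-category likelihoods, and `toy_model` is a hypothetical stand-in with no relation to the trained classifiers evaluated in this paper.

```python
from typing import Callable, Dict, Sequence

Model = Callable[[Sequence[float]], Dict[str, float]]


def predict_label(model: Model, features: Sequence[float]) -> str:
    """Return the category to which the model assigns the highest likelihood."""
    likelihoods = model(features)
    # On a tie, max() keeps the first category in the dictionary.
    return max(likelihoods, key=likelihoods.get)


def toy_model(features: Sequence[float]) -> Dict[str, float]:
    """Hypothetical placeholder: treats a large mean magnitude as suspicious."""
    score = min(1.0, sum(abs(v) for v in features) / len(features))
    return {"benign": 1.0 - score, "malicious": score}


print(predict_label(toy_model, [0.9, -0.8, 1.0]))
```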

Several machine learning algorithms can perform this classification task. In this study, we select three algorithms as our experiment candidates: Linear Discriminant Analysis (LDA), Random Forest (RF), and Back-Propagation Networks (BPNs). Their detection performance and time consumption are evaluated in the next section.

We implement our code in Python 3.7 and perform the experiments on a PC equipped with an Intel i7-9700 CPU. We use iocs, a corpus containing 4,079 PowerShell scripts, as our malicious script database; the scripts have an average size of 312 bytes and can be divided into 27 categories by attack behavior. We then collect 5,510 high-definition images from the Internet and randomly select 4,079 of them to generate synthetic copies with malicious payloads. This image dataset (9,589 samples in total) is used for classifier training.

To ensure the fidelity and reproducibility of the experiment, we apply k-fold cross-validation to generate diversified data for both training and evaluation. Specifically, the entire dataset is randomly split into k subsets (k = 10 in our work); each subset contains approximately 958 images, with roughly 550 benign and 408 malicious. We then generate ten distinct data groups by taking each subset in turn as test data and the remaining subsets as training data. We train and evaluate the models on each data group and report the average performance.
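The construction of the ten data groups can be sketched in a few lines of Python. This is an illustration only: it uses a plain random shuffle over sample indices, whereas the text does not say whether the split was stratified by label, and the function name `k_fold_groups` and the fixed seed are assumptions.

```python
import random
from typing import List, Tuple


def k_fold_groups(n_samples: int, k: int = 10, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Split sample indices into k folds and return (train, test) index pairs."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # fold sizes differ by at most one
    groups = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        groups.append((train, test))
    return groups


groups = k_fold_groups(9589, k=10)
print([len(test) for _, test in groups])
```

With 9,589 samples and k = 10, every test subset holds 958 or 959 images, and each image appears in exactly one test subset.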

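We use accuracy, precision, recall, and F1-score as the performance metrics, which are defined as follows:

$$\text{Accuracy} = \frac{TP + TN}{P + N}, \qquad \text{Precision} = \frac{TP}{TP + FP},$$

$$\text{Recall} = \frac{TP}{TP + FN}, \qquad F1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}},$$

where P is the number of malicious images, N is the number of benign images, TP is the number of correctly identified malicious images, TN is the number of correctly identified benign images, FP is the number of benign images incorrectly labeled as malicious, and FN is the number of malicious images wrongly classified as benign.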
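The four metrics can be computed directly from the confusion-matrix counts, as in the sketch below. It uses the standard definitions; the count of correctly identified benign images (`tn`) and the function name `classification_metrics` are introduced here for illustration, and returning 0.0 for an undefined ratio is an assumption.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    p = tp + fn  # all malicious images
    n = fp + tn  # all benign images
    accuracy = (tp + tn) / (p + n)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / p if p else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up counts, used only to exercise the formulas.
example = classification_metrics(tp=90, fp=10, fn=10, tn=90)
print(example)
```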

We also use the receiver operating characteristic (ROC) curve, the area under the curve (AUC), and the precision/recall (PR) curve to assess the overall effectiveness of IMShell-Dec. The ROC curve is a graphical plot that illustrates the diagnostic ability of a binary classifier as its discrimination threshold is varied, and the AUC, the area under the ROC curve, represents the model's capability to distinguish between classes. Both indicators of the PR curve focus on the positive examples in a binary classification task. If the PR curve of one classifier is entirely enclosed by that of another, the latter can be said to perform better. In past studies, F1, accuracy, recall, and precision scores of 0.8 or above are often considered reasonable (e.g., [3, 13, 17]).

We implement and evaluate three classification algorithms: linear discriminant analysis, back-propagation network, and random forest. The experimental results are shown in Table 1. The results of all three classifiers are reasonably good, which means our proposed scheme performs well in classifying images. The F1 scores of the three machine learning algorithms are 0.920, 0.943, and 0.961 (out of 1), respectively. The result of LDA is lower than those of the other two algorithms. The ROC, AUC, and PR results shown in Fig. 8 and Fig. 9 make the differences between the three algorithms visible.

Regardless of which classifier is applied, our framework shows high performance in malicious image detection. Moreover, the results indicate that BPNs perform better than the other two algorithms, and we believe that BPNs are more suitable for our framework. The back-propagation network classifier has the following advantages:

- It can handle thousands of input features without feature deletion.

- It points out the important potential features for classification.

- It performs an internal unbiased estimate of the generalization error.

We also investigate the time consumption of each algorithm by measuring the time to process the entire training dataset and the time to predict 1000 images with malicious payloads. The result is illustrated in Table 2 . 

Limitation. We manually inspected some incorrectly classified images and identified the following issues that IMShell-Dec cannot handle.

- If a malicious image contains a sizeable pure-color area, IMShell-Dec may assign the wrong label to the image.

- IMShell-Dec may incorrectly report a benign image of inferior quality.

- If the image path is deliberately obfuscated, our proposed scheme is unable to locate the image.

Future Work. To address the limitations mentioned above, we plan to find new feature extraction methods that solve the problem of inaccurate features being extracted from some edges of the image. Besides, we plan to design a better pattern matching method to locate image links more accurately.

Several works [6, 8, 11, 16] have proposed methods and algorithms to detect malicious scripts. For example, Hendler et al. [8] extract features from malicious PowerShell scripts through the bag-of-words model, a natural language processing approach in which the system transforms PowerShell commands into a multi-set of words and then calculates their frequencies to generate the feature vectors. These feature vectors are further processed with Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) to identify the category of the PowerShell commands.

Khan et al. [11] extract critical features through the wrapper approach to detect unseen malicious scripts. They collect malicious JavaScript code from the client side, apply the wrapper method to distill an information-rich feature subset, then feed this feature subset into the detection model. In that work, the authors compare four supervised machine learning classifiers (Naive Bayes, Support Vector Machines, K-Nearest Neighbour, and Decision Trees) and choose the one with the best prediction performance as the detection model.

Although these studies apply different feature extraction strategies, none of them considers attack vectors outside of the script. Therefore, existing script detection schemes cannot identify the attack we describe, as the true attack payload is located in an external resource and the release script itself is clean and appears harmless.

Owing to the data structure of images, researchers have proposed several machine-learning-based detection methods [9, 14, 21-23] to recognize images processed with steganography tools. Wu et al. [21] leverage the residual network [7] to detect steganographic images. Ye et al. [22] propose a CNN architecture consisting of diverse activation modules to analyze steganography. Ke et al. [9] propose a hybrid deep learning framework, which pairs hand-crafted convolutional kernels and threshold quantizers at the bottom with a compact deep-learning model on top.

For adaptive pattern-based detection, Chen et al. [4] use the local texture pattern (LTP) to detect binary image steganography, where the LTP describes the texture distribution of an area and is built from the pixels within that area. Similarly, a feature selection approach [2] based on adaptive inertia weight-based particle swarm optimization has been proposed. Saman et al. [18] propose a novel blind statistical analysis technique to detect least significant bit flipping image steganography.

We investigate a new class of PowerShell attack combined with steganography, which allows attackers to conceal their malicious payload in a medium outside of the script, thus bypassing conventional intrusion detection methods. To examine its feasibility, we generate images carrying a hidden script with a popular steganography tool, Invoke-PSImage, then retrieve and execute the payload successfully through a separate, harmless-looking release script. Our pilot study shows that the synthesized image has no visible difference from the original, and that multiple mainstream defenders fail to intercept either the image or the release script. Both results confirm the stealthiness of this attack.

To address this emerging threat, we propose a machine-learning-based defense framework, IMShell-Dec, to identify malicious PowerShell scripts that hide their real payload in an external image. We train and evaluate our proposed framework on a synthesized dataset, on which it achieves high detection performance across multiple metrics. Our work can serve as an inspiration for designing more robust and secure detection models against the proposed attack schemes.

Identifying malicious queries

Image steganalysis using improved particle swarm optimization based feature selection

Is it a bug or an enhancement?: a text-based approach to classify change requests

Binary image steganalysis based on local texture pattern

Static analysis of executables to detect malicious patterns

JaSt: fully syntactic detection of malicious (Obfuscated) JavaScript

Deep residual learning for image recognition

Detecting malicious PowerShell commands using deep neural networks

Image steganalysis via multi-column convolutional neural network

Dynamic data exchange server

Defending malicious script attacks using machine learning classifiers

Windows PowerShell 2.0 Bible

Benchmarking classification models for software defect prediction: a proposed framework and novel findings

ReST-Net: diverse activation modules and parallel subnets-based CNN for spatial image steganalysis

Effective and lightweight deobfuscation and semantic-aware attack detection for PowerShell scripts

Workshop on Trustworthy Manufacturing and Utilization of Secure Devices

A comparative analysis of the efficiency of change metrics and static code attributes for defect prediction

PSW statistical LSB image steganalysis

PowerDrive: accurate deobfuscation and analysis of PowerShell malware

Windows PowerShell 3.0 First Steps

Deep residual learning for image steganalysis

Deep learning hierarchical representations for image steganalysis

Large-scale JPEG image steganalysis using hybrid deep-learning framework