key: cord-0941659-ghabmg5z
authors: Beaman, Craig; Barkworth, Ashley; Akande, Toluwalope David; Hakak, Saqib; Khan, Muhammad Khurram
title: Ransomware: Recent Advances, Analysis, Challenges and Future Research Directions
date: 2021-09-24
journal: Comput Secur
DOI: 10.1016/j.cose.2021.102490
sha: e5dfb630efa2c483522f9426a69043d8aefd3956
doc_id: 941659
cord_uid: ghabmg5z

The COVID-19 pandemic has witnessed a huge surge in the number of ransomware attacks. Different institutions such as healthcare, financial, and government have been targeted. There can be numerous reasons for such a sudden rise in attacks, but it appears working remotely in home-based environments (which is less secure compared to traditional institutional networks) could be one of the reasons. Cybercriminals are constantly exploring different approaches like social engineering attacks, such as phishing attacks, to spread ransomware. Hence, in this paper, we explored recent advances in ransomware prevention and detection and highlighted future research challenges and directions. We also carried out an analysis of a few popular ransomware samples and developed our own experimental ransomware, AESthetic, that was able to evade detection against eight popular antivirus programs.

The COVID-19 pandemic has led to an increase in the rate of cyberattacks. As the workplace paradigm shifted to home-based scenarios, resulting in weaker security controls, attackers lured people through COVID-19 themed ransomware phishing emails. For example, many phishing campaigns prompted users to click on specific links to get sensitive information related to a COVID-19 vaccine, shortage of surgical masks, etc. Attackers made good use of fake COVID-19 related information as a hook to launch more successful phishing campaigns. Higher levels of unemployment can be another factor that motivates people towards cybercrime, such as launching ransomware attacks and disrupting critical IT services, in order to support themselves [1].

Cyber extortion methods have existed since the 1980s. The first ransomware sample dates back to 1989 with the PC Cyborg Trojan [2] . After the target computer was restarted 90 times, PC Cyborg hid directories and encrypted the names of all files on the C drive, rendering the system unusable. In the 1990s and early 2000s, ransomware attacks were mostly carried out by hobbyist hackers who aimed to gain notoriety through cyber pranks and vandalism [3] . Modern ransomware emerged around 2005 and quickly became a viable business strategy for attackers [4, 5] . Targets shifted from individuals to companies and organizations in order to fetch larger ransoms [6] . The following industries were particularly targeted: transportation, healthcare, financial services, and government [7] . The number of ransomware attacks has grown exponentially thanks to easily obtainable ransomware toolkits and ransomware-as-a-service (RaaS) that allows novices to launch ransomware attacks [8] . [9] Ransomware is a type of malware designed to facilitate different nefarious activities, such as preventing access to personal data unless a ransom is paid [10, 11, 12] . This ransom typically uses cryptocurrency like Bitcoin, which makes it difficult to track the recipient of the transaction and is ideal for attackers to evade law-enforcement agencies [13, 14] . There has been a surge in ransomware attacks in the past few years. For example, during the ongoing COVID-19 pandemic, an Android app called CovidLock was developed to monitor heat map visuals and statistics on COVID-19 [15] . The application tricked users by locking user contacts, pictures, videos, and access to social media accounts as soon as they installed it. To regain access, users were asked to pay some ransom in Bitcoin; otherwise, their data was made public [16] . Another notorious example of ransomware is the WannaCry worm, which spread rapidly across many computer networks in May 2017 [17, 18] . 
Within days, it had infected over 200,000 computers across 150 countries [19]. Hospitals across the U.K. were knocked offline [20]; government systems, railway networks, and private companies were affected as well [21].

Ransomware can be categorized into three main forms: locker, crypto, and scareware [22, 23], as shown in Figure 1. Scareware may use pop-up ads to manipulate users into assuming that they are required to download certain software, thereby coercing them into downloading malware. In scareware, the cyber crooks exploit the victim's fear rather than lock the device or encrypt any data [9]. This form of ransomware does not do any harm to the victim's computer. The aim of locker ransomware is to block primary computer functions. Locker ransomware may encrypt certain files, which can lock the computer screen and/or keyboard, but it is generally easy to overcome and can often be resolved by rebooting the computer in safe mode or running an on-demand virus scanner [24]. Locker ransomware may allow limited user access. Crypto ransomware encrypts the user's sensitive files but does not interfere with basic computer functions. Unlike locker ransomware, crypto ransomware is often irreversible, as current encryption techniques (e.g., AES and RSA) are nearly impossible to reverse if implemented properly [25, 23]. Table 1 presents a few popular ransomware families.

Crypto ransomware can use one of three encryption schemes: symmetric, asymmetric, or hybrid [26]. A purely symmetric approach is problematic because the encryption key must be embedded in the ransomware [27], which makes this approach vulnerable to reverse engineering. The second approach is to use asymmetric encryption. The issue with this approach is that asymmetric encryption is slow compared to symmetric encryption and hence struggles to encrypt larger files [28].

The most effective approach (i.e., the hardest to decrypt) is hybrid encryption, which uses both symmetric and asymmetric encryption. An overview of the hybrid approach is given in Figure 2. For hybrid encryption, the first step is to create a random symmetric key. The ransomware usually creates this key by calling a cryptographic API on the user's operating system [29]. The symmetric key encrypts the victim's files as the ransomware traverses the file system. Once all files are encrypted, a public-private key pair is generated by a command and control (C&C) server to which the ransomware connects. The public key is sent to the ransomware and is used to encrypt the symmetric key, while the private key is held by the C&C server. The plaintext version of the symmetric key is then deleted to ensure that the victim cannot use it to recover their files. Instructions for how to pay the ransom are left for the victim. If the ransom is paid, then the decryption process will begin. Decryption starts by requesting the private key from the C&C server. Once obtained, the private key is used to decrypt the symmetric key. Finally, the symmetric key is used to recover the victim's files. Generally, a unique public-private key pair is generated for each new ransomware infection; this prevents victims from sharing private keys with other victims to enable them to recover the symmetric key.

Ransomware attacks can cause significant financial damage, reduce productivity, disrupt normal business operations, and harm the reputations of individuals or companies [30, 31] . The global survey 'The State of Ransomware 2021' commissioned by Sophos announced in its findings that, among roughly 2,000 respondents whose organizations had been hit by a ransomware attack, the average total cost to an organization to rectify the impacts of a ransomware attack (considering downtime, people time, device cost, network cost, lost opportunity, ransom paid etc.) was US$1.85 million, more than double the US$761,106 cost reported in 2020 [32] . These attacks may also result in a permanent loss of information or files. Paying the ransom does not guarantee that the locked system or files will be released [33] . For companies who pay the ransom, the cost of recovering from the attack doubles on average [34] . By the end of the year 2021, ransomware attacks are expected to cost the world $20 billion, up from $325 million in 2015 [7] . These attacks have been particularly devastating since the COVID-19 pandemic started by targeting hospitals, vaccine research labs, and contact tracing apps [35] . From all these statistics, it is clear that we need to understand the behaviour of ransomware and its variations to effectively detect and mitigate future attacks. Due to its profitability, new variations of ransomware continue to emerge that circumvent traditional antivirus applications and other detection methods. Hence, it is critical to come up with a new generation of efficient countermeasures.

There is an emerging need to highlight the recent advancements in the area of ransomware. The contribution of this paper is as follows:

• Recent state-of-the-art ransomware detection and prevention approaches are presented.

• Different ransomware samples are tested in a virtual environment.

• A new experimental ransomware known as AESthetic is developed and tested on eight popular antivirus programs.

• The effectiveness of a few popular ransomware countermeasures on implemented ransomware samples is analyzed.

• Future research challenges and directions are identified and elaborated.

The rest of the article is organized as follows. Section 2 surveys the recent literature on ransomware detection and prevention approaches. Section 3 presents our new ransomware sample, AESthetic, and the experimental testbed setup along with in-depth analysis. A discussion of our literature survey and test results is in Section 4. Section 5 highlights future research challenges and directions. Finally, Section 6 concludes the article.

Before our own survey, we searched for and identified relevant surveys on ransomware and summarized their contributions in Table 2. Most existing surveys were outdated and focused on papers from 2014 to 2017. Hence, for our own literature review, we sourced papers on ransomware solutions from 2017 onwards. The papers came from the following article databases: IEEE Xplore, ACM, Science Direct, and Springer. Our searches were made using combinations of the following keywords: 'ransomware detection', 'ransomware prevention', 'crypto-ransomware', 'malware detection', 'key backup', 'data backup', 'access control', 'honeypots', 'machine learning', and 'intrusion/anomaly detection'. We categorized the surveyed papers into ransomware prevention and detection approaches. Most of the existing works within these two categories involved the preliminary step of malware analysis, which is explained below.

Table 2: Existing surveys on ransomware and their contributions.

Reference | Contribution | Year
[2, 7] | Various ransomware detection and mitigation techniques are presented from literature, along with their pros and cons | 2017, 2020
[4] | In this article, the history of ransomware and best practices to mitigate it are presented | 2017
[45] | In this study, a review on ransomware detection and prevention is carried out | 2017
[46] | In this study, emerging ransomware attacks and a few security challenges are highlighted | 2017
[47] | This article provides a general overview of ransomware and how it works | 2016
[48] | A detailed review on ransomware attack methodology is conducted | 2017
[49] | In this study, the authors carried out a survey on Windows-based ransomware | 2020
[50] | In this study, the authors focused on detection techniques with the core focus on crypto ransomware | 2019

Malware analysis is a standard approach to understand the components and behaviour of malware, ransomware included. This analysis is useful to detect malware attacks and prevent similar attacks in the future. Malware analysis is broadly categorized into static and dynamic analysis. Static analysis analyzes binary file contents, whereas dynamic analysis studies the behaviour and actions of a process during execution [36, 37, 38] .

Signature-based malware detection is a static analysis approach that uses the unique patterns within the malicious file in order to detect it. For ransomware, this includes the unique sequences of bytes within the binary file, the order of function calls, or the analysis of ransomware notes [7, 39, 40]. The signature can then be checked against the signatures of known malware samples. The main advantages of signature-based detection are that it is fast and has a low false-positive rate; for these reasons, signature-based detection is very popular. However, if malware is concealed through code obfuscation techniques like binary packing, then it may evade detection [41]. Dynamic analysis is less susceptible to these evasion techniques because it does not rely on analyzing the binary code itself for patterns or signatures that imply the maliciousness of the analyzed file, as static analysis does [42]. Additionally, signature-based approaches will fail against newly created malware [43, 44].
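The signature check described above can be illustrated with a minimal sketch. The signature database, family names, byte patterns, and the `scan_bytes` helper below are hypothetical placeholders chosen for illustration; they do not correspond to real malware signatures or to any tool from the surveyed literature.

```python
import hashlib
from typing import List

# Hypothetical signature database: name -> byte sequence expected in the binary.
# These patterns are made up for illustration and match no real malware.
BYTE_SIGNATURES = {
    "ExampleFamilyA": bytes.fromhex("deadbeef0badf00d"),
    "ExampleFamilyB": b"YOUR FILES HAVE BEEN ENCRYPTED",
}

# Hypothetical whole-file signatures: SHA-256 digests of known-bad files.
KNOWN_BAD_HASHES = {
    hashlib.sha256(b"known malicious sample").hexdigest(),
}


def scan_bytes(data: bytes) -> List[str]:
    """Return the names of all signatures that match the given file contents."""
    matches = []
    # Whole-file match: fast, but any single-byte change defeats it.
    if hashlib.sha256(data).hexdigest() in KNOWN_BAD_HASHES:
        matches.append("known-bad-hash")
    # Byte-sequence match: survives small edits elsewhere in the file.
    for name, pattern in BYTE_SIGNATURES.items():
        if pattern in data:
            matches.append(name)
    return matches


def xor_obfuscate(data: bytes, key: int = 0x5A) -> bytes:
    """Toy stand-in for packing: XOR every byte with a single-byte key."""
    return bytes(b ^ key for b in data)
```

Scanning a buffer that embeds the `ExampleFamilyA` pattern reports a match, while the same buffer passed through `xor_obfuscate` reports none. This mirrors the limitation noted above: obfuscated or newly created samples carry no known signature.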

Analysis can reveal some of the steps ransomware takes to infect a user's computer. For example, Bajpai and Enbody [51] performed static and dynamic analysis on decompiled .NET ransomware samples and found that .NET ransomware first attempts to gain execution privileges and then contacts a C&C server to obtain the encryption key. Zimba and Mulenga [52] examined the static and behavioural properties of WannaCry ransomware; they discovered that WannaCry retrieves the network adapter properties to determine whether it's residing in a private or public subnet in order to effectuate substantial network propagation and subsequent damage. Malware analysis can discover the unique characteristics of ransomware which can then be used to help design prevention or detection mechanisms.

As mentioned previously, most existing studies have analyzed the nature of malware. Based on the analysis, they have proposed different approaches to prevent or detect ransomware. We have classified the existing studies based on their goal which is to either prevent ransomware infection or to detect ransomware once it has infected the system. A classification diagram of the utilized tools from the reviewed studies can be found in Figure 3 .

Preventative solutions aim to block, mitigate, or reverse the damage done by ransomware. Common preventative approaches include: enforcing strict access control, storing data and/or key backups, and increasing user awareness and training. Raising user awareness of ransomware attacks and training users on how to avoid them can prevent attacks before they occur. A summary of the tools used in the surveyed literature on ransomware prevention can be found in Table 3.

Access control prevents ransomware encryption by restricting access to the file system.

Parkinson [54] examined how to use built-in security controls to prevent ransomware from executing in the host computer via elevated privileges. One way that ransomware gains access to files is through a user's credentials if the user has a high level of permissions. He proposed implementing least privilege and separation of duties through role-based access control, restricting data access as far up the directory hierarchy as possible, and routinely auditing permissions and roles.

Kim and Lee [55] proposed an access control list that whitelists specific programs for each file type. Only whitelisted programs are allowed to access files. This implicitly blocks malicious processes from accessing and encrypting files. Whereas a blacklist cannot stop ransomware whose code signature it does not contain, a whitelist can effectively block new and unknown ransomware.
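A per-file-type whitelist of this kind can be sketched as a simple lookup. The table contents, program names, and the `is_access_allowed` helper are assumptions made for illustration and are not taken from Kim and Lee's implementation; in particular, denying access to file types that have no entry is a design choice of this sketch.

```python
from pathlib import PurePath

# Hypothetical whitelist: file extension -> programs allowed to open that type.
WHITELIST = {
    ".docx": {"winword.exe"},
    ".pdf": {"acrord32.exe", "chrome.exe"},
    ".txt": {"notepad.exe", "code.exe"},
}


def is_access_allowed(program: str, file_path: str) -> bool:
    """Allow access only if the program is whitelisted for the file's type.

    Unknown programs and unlisted file types are denied by default, so a new
    ransomware binary is blocked without needing a signature for it.
    """
    extension = PurePath(file_path).suffix.lower()
    allowed_programs = WHITELIST.get(extension, set())
    return program.lower() in allowed_programs
```

With this table, `is_access_allowed("winword.exe", "docs/report.docx")` is permitted, whereas any unlisted executable requesting the same file is refused.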

Ami et al. [53] developed a solution known as AntiBotics containing three key components: a policy enforcement driver, a policy specification interface, and a challenge-response mechanism. This program makes use of both biometric authentication (e.g., a fingerprint) and human response (e.g., CAPTCHA) to prevent the deletion or modification of data. AntiBotics enforces access control by presenting periodic identification challenges. It assigns access permissions to executable objects based on rules specified by an administrator as well as the outcome of the challenges presented upon attempts to modify or delete files. One limitation is that AntiBotics has only been tested on Windows. Also, although modern ransomware failed to evade AntiBotics, it is possible that future ransomware could adapt to it. For example, ransomware could avoid AntiBotics by injecting itself into a permitted process and waiting until the process is granted permission. Ransomware may also attempt to rename a protected folder to conceal itself, but AntiBotics can block such an attempt by presenting a challenge when a rename operation is carried out.

McIntosh et al. [56] proposed a framework that allows access control decisions for a file system to be deferred when required, so that the consequences of an access request can be observed and the changes rolled back if necessary. The authors suggested that their framework could be applied to implement a malware-resilient file system and potentially deter ransomware attacks. They demonstrated the practicality of their framework through prototype testing that captured relevant ransomware situations. The experimental results against a large ransomware dataset showed that their framework can be effectively applied in practice.

Genç et al. [57] developed an access control mechanism with the insight that without access to true randomness, ransomware relies on the pseudo random number generators that modern operating systems make available to applications in order to generate keys. They proposed a strategy to mitigate ransomware attacks that considers pseudo random number generator functions as critical resources, controls accesses on their APIs, and stops unauthorized applications that call them. Their strategy was tested against 524 active real-world ransomware samples and stopped 94% of them, including WannaCry, Locky, CryptoLocker, CryptoWall, and NotPetya samples.

Keeping regular backups of the data stored on a computer or network can greatly reduce the impact of ransomware: instead of losing all data, the damage is limited to any data that has been created since the last backup. There is overhead in backing up large amounts of data, and so choosing how often backups should be taken and how long they will be kept are important decisions to be made.

Huang et al. [61] proposed a solution called FlashGuard that does not rely on software at all. Instead, it uses the fact that Solid State Drives (SSDs) don't overwrite data right away; a garbage collector does this after a while. The authors modified SSD firmware so the garbage collector doesn't remove data as quickly, and hence lost data can be restored. When tested against ransomware samples, FlashGuard successfully recovered encrypted data with little impact on SSD performance and life span.

Thomas and Galligher [62] conducted a literature review of the ransomware process, functional backup architecture paradigms, and the ability of backups to address ransomware attacks. They also provided suggestions to improve information security risk assessments to better address ransomware threats, and presented a new tool for conducting backup system evaluations during such assessments. The tool enables auditors to effectively analyze backup systems and improve an organization's ability to combat and recover from a ransomware attack.

Min et al. [63] proposed Amoeba, an autonomous backup and recovery SSD system to defend against ransomware attacks. Amoeba contains a hardware accelerator to detect the infection of pages by ransomware attacks at high speed, as well as a fine-grained backup control mechanism to minimize space overhead for original data backup. To evaluate their system, the authors extended the Microsoft SSD simulator to implement Amoeba and evaluated it using realistic block-level traces collected while running actual ransomware. Their experiments found that Amoeba had negligible overhead and outperformed the state-of-the-art SSD solution, FlashGuard, in both performance and space efficiency.

Kharraz and Kirda [60] proposed Redemption, a system that requires minimal modification of the operating system to maintain a transparent buffer for all storage I/O. Redemption monitors the I/O request patterns of applications on a per-process basis for signs of ransomware-like behavior. If I/O request patterns are observed that indicate possible ransomware activity, the offending processes can be terminated and the data restored. The evaluation of their system showed that Redemption can ensure zero data loss against current ransomware families without detracting from the user experience or inducing alarm fatigue. Additionally, they proved that Redemption incurs modest overhead, averaging 2.6% for realistic workloads.

Key management refers to recovering the encryption key that was used to encrypt files and using that to decrypt them, without paying the ransom. For some ransomware samples, such as samples that hard code the key directly into their executable binary, this may be rather straightforward. For hybrid models, this can be more challenging, as the key is only available in plain text while the files are actively being encrypted.

Bajpai and Enbody [51] decompiled eight different .NET ransomware variants and determined that some ransomware samples use poor key generation techniques that call common libraries. Ransomware countermeasures can exploit this insight by keeping a backup of the attacker's symmetric encryption key, which can be used later to recover any encrypted files. For example, Lee et al. [64] observed that many ransomware programs use the CNG library, a cryptographic library for Windows machines, to generate the encryption key. They developed a prevention system that hooks these functions such that when ransomware calls them, the system stores the encryption key. To evaluate their system, Lee et al. [64] implemented a sample ransomware program along with their prevention solution, which hooks into the ransomware process that performs encryption in order to extract the encryption key. After hooking, the prevention program displays the extracted key when the sample ransomware generates it. In experiments where the ransomware program attempted encryption 10, 100, 1,000, 10,000, and 100,000 times, their ransomware prevention program was able to extract the encryption key 100% of the time. One limitation of this solution is the assumption that ransomware calls a specific library to obtain the encryption key; if the assumption is invalid, the solution fails.

Some ransomware programs use a symmetric session key for encryption. This key is stored on the victim's computer and is used to encrypt the user's files. Kolodenker et al. [65] developed a key backup solution called PayBreak which relies on signatures. PayBreak implements a key escrow approach that stores session keys in a vault, including the symmetric key that the attacker uses. When tested, PayBreak successfully recovered all files encrypted with known encryption signatures.

The security of the symmetric encryption key is vital for ransomware developers. Furthermore, a large subset of current ransomware exclusively deploys AES for data encryption. With this in mind, Bajpai and Enbody [66] developed a side-channel attack on ransomware's key management to extract exposed ransomware keys from system memory during the encryption process. Their attack leverages the knowledge that the encryption process is a white box on the host system; this approach is successful regardless of which cryptographic API is being used by the malware and regardless of whether a cryptographic API is being used by the malware at all. Their attack was able to identify exposed AES keys in ransomware process memory with a 100% success rate in preliminary experiments, including against NotPetya, WannaCry, LockCrypt, CryptoRoger, and AutoIT samples.

Chung [68] looked at preventing ransomware attacks within companies and organizations, arguing that they should help individual employees take precautions against ransomware scams. This is especially important since, as mentioned previously, ransomware attacks are increasingly targeting institutions such as financial or healthcare organizations. The author listed five prevention tips for employees to follow: install anti-virus or anti-malware software on every computer and mobile device in use; choose strong and unique passwords for personal and work accounts; regularly back up files to an external hard drive; never open suspicious email attachments; and use mirror shielding technology such as NeuShield as a failsafe data protection measure.

Thomas [67] also examined how users and employees within organizations can avoid ransomware attacks, but this paper focused on how individuals can avoid falling for phishing attacks, which are a common first step for ransomware. The author surveyed several security professionals and, based on the findings from the survey, proposed several recommendations. The first recommendation was to segment company employees based on factors such as their familiarity with phishing and the impact level of their jobs. After segmentation, the next recommendation was to develop targeted training for each group; this training should include real-life examples highlighting the seriousness and damage caused by phishing, use real case studies, and include actual incidents within the company. Sharing these actual and personal examples will result in a strong realization of the dangerous impact of spear phishing and will evoke a more personal protection response.

Researchers have proposed various detection solutions to spot ongoing ransomware attacks. Once ransomware programs have been spotted, they can be stopped and removed. Below is a classification of different detection approaches. A summary of the tools used in the surveyed literature on ransomware detection can be found in Table 4. An overview of the experimental results of the surveyed literature on ransomware detection, including sensitivity and specificity rates, can be found in Table 5.

A few of the surveyed papers used system information, such as log files or changes to the Windows registry, as a method of detecting ransomware. A brief summary of all those works is presented below.

Monika et al. [70] noted that ransomware samples tend to add and modify many Windows registry values. They suggested that the continuous monitoring of Windows registry values, along with file system activity, can be used to detect ransomware attacks. Chen et al. [20] analyzed system log files to detect ransomware activity. This was done by extracting various features from the log files that are relevant to malware activity. Ultimately, they found that malware (ransomware included) can be effectively detected using their approach, even when the logs contain mostly benign events, and that the approach is resilient to polymorphism.

After the execution of a ransomware attack, a ransom note is usually left behind. This note could be saved to the user's computer in the form of a text file or displayed on the user's screen. This note informs the user that their personal files have been encrypted (or, in the case of locker ransomware, are inaccessible) and gives steps on how to pay and retrieve them. Static and dynamic analysis can reveal the traits of ransomware notes. For example, Groenewegen et al. [114] performed static and dynamic behaviour analysis to identify the traits of the NEFILIM ransomware strain that targets Windows machines. They found that if a NEFILIM sample is executed with administrative privileges, the accompanying ransom note is written to the root directory of the machine (C:); otherwise, it is written to the user's "AppData" directory. Furthermore, the ransomware calls the "CreateFileW" and "WriteFile" Windows functions to create the ransomware note and write to it, respectively. Lastly, they determined that the ransomware note file is always named "NEFILIM-DECRYPT.txt". In the case where the ransom note is displayed on the screen, some researchers took screen captures and used image and text analysis methods to detect the presence of a ransom note [111, 75].

As mentioned in Section 2.1, ransomware typically displays a ransom note on the user's computer to receive payment. Some researchers used static and/or dynamic analysis to detect the presence of such a note to ascertain whether a ransomware attack is underway.

Alzahrani et al. [111] proposed RanDroid, a framework to detect ransomware embedded in malicious Android applications by looking for ransom notes displayed during the app's execution. RanDroid measures the structural similarity between a set of images collected from the inspected application and a set of threatening images collected from known ransomware variants. The framework first decompiles the Android Application Package (APK) which contains a set of files and folders. It then extracts images from the resources folder and XML layout files using static analysis. Dynamic analysis is performed with a UI-guided test input generator to interact with the application without instrumentation, in order to trigger the app's events, capture the activities that appear while the app is running, and collect additional images. Several pre-processing steps are applied to the images, including extracting the text from the images. Image and text similarity measurements are calculated against a database of images and texts collected from known ransomware variants; both measurements are used for a final classification. RanDroid was tested by running 300 applications (100 ransomware and 200 goodware applications) and achieved a 91% accuracy rate.

Kharraz et al. [75] designed a system called UNVEIL to detect ransomware; a core component of UNVEIL is aimed at detecting screen locker ransomware, with the key insight that ransom notes generally cover a significant part, if not all, of the display. UNVEIL monitors the desktop of the victim machine and takes screenshots of the desktop before and after a sample is executed. The series of screenshots are then analyzed and compared with image analysis methods to determine if a large part of the screen has changed substantially between captures. When evaluated against 148,223 samples, UNVEIL achieved a 96.3% detection rate with zero false positives.
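The idea of flagging a sample when a large part of the screen changes between captures can be sketched with a plain pixel-difference measure. This is a simplified stand-in, not the image analysis method used by UNVEIL; the grayscale list-of-lists representation, the function names, and both thresholds are assumptions made for illustration.

```python
from typing import List

Image = List[List[int]]  # grayscale pixels (0-255), one inner list per row


def changed_fraction(before: Image, after: Image, pixel_threshold: int = 30) -> float:
    """Fraction of pixels whose intensity differs by more than pixel_threshold."""
    same_shape = len(before) == len(after) and all(
        len(row_b) == len(row_a) for row_b, row_a in zip(before, after)
    )
    if not same_shape:
        raise ValueError("screenshots must have identical dimensions")
    total = sum(len(row) for row in before)
    if total == 0:
        return 0.0
    changed = sum(
        1
        for row_b, row_a in zip(before, after)
        for pixel_b, pixel_a in zip(row_b, row_a)
        if abs(pixel_b - pixel_a) > pixel_threshold
    )
    return changed / total


def looks_like_screen_lock(before: Image, after: Image, area_threshold: float = 0.7) -> bool:
    """Flag the sample if at least area_threshold of the display has changed."""
    return changed_fraction(before, after) >= area_threshold
```

A small pop-up that alters only a few pixels stays well below the area threshold, whereas a full-screen ransom note changes most of the display and is flagged.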

Crypto ransomware modifies a file when encrypting it. Large changes made to many files in a computer's file system could indicate that a ransomware attack is underway. There are several metrics that can be used to detect significant changes in files. The three metrics identified from the surveyed literature are entropy, file type, and file differences (i.e. similarity). In addition, several researchers analyzed file I/O operations to detect suspicious activity. These four methods of file analysis are defined below.

• File entropy: This measures the "randomness" of a file. Encrypted and compressed files have high entropy compared to plaintext files. Hence, calculating the entropy of the file and comparing the value to previous calculations for the same file can be used to determine whether a file has been infected by ransomware. Scaife et al. [71] calculated file entropy with Shannon's formula and used it as one feature to detect ransomware. Mehnaz et al. [72] also used Shannon entropy as a metric for detecting ransomware. Lee et al. [73] applied machine learning to classify infected files based on file entropy analysis.

• File type: A file's type refers to its extension. Ransomware typically changes the extension of any file that it encrypts. In addition to entropy, both Scaife et al. [71] and Mehnaz et al. [72] used file type changes as a feature to determine the presence of ransomware. The detection system designed by Ramesh and Menen [69] monitors for changes such as large numbers of files being created with the same extension or any files with more than one extension.

• Similarity: In comparison with benign file changes, such as modifying parts of a file or adding new text, the contents of a file encrypted by ransomware should be completely dissimilar from the original plaintext content. Hence, measuring the similarity of two versions of the same file can be used to detect whether ransomware is present. Scaife et al. [71] measured the similarity between two files with a hash function sdhash, which outputs a similarity score from 0 to 100 that describes the confidence of similarity between two files. Comparisons between previous versions of a file and the encrypted version of the file should yield a score close to 0, as the ciphertext should be indistinguishable from random data. Mehnaz et al. [72] also used sdhash to perform similarity checks between file versions to determine if a file has been encrypted by ransomware. 
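As an illustration of the entropy metric, the following Python sketch computes Shannon entropy in bits per byte, H = -sum(p_i * log2(p_i)) over the byte frequencies p_i, and flags a large increase between two versions of a file. The function names and the 2.0-bit threshold are assumptions made for this example and are not taken from [71], [72], or [73].

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # frequency of each byte value
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_jump(before: bytes, after: bytes, threshold: float = 2.0) -> bool:
    """Flag a file whose entropy rose by more than `threshold` bits per byte.

    The threshold is an illustrative assumption, not a value from the literature.
    """
    return shannon_entropy(after) - shannon_entropy(before) > threshold
```

Plain text typically scores well below 8 bits per byte, while encrypted or compressed content approaches 8, so already-compressed formats would need separate handling in practice.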

A finite state machine (FSM) is an abstract mathematical model that can be used to represent the state of a system and track changes. It has been noted that many ransomware samples tend to carry out similar sets of actions once they reach a target system. Also, the changes made by ransomware differ significantly from those made by benign programs. Hence, ransomware can be quickly identified in most cases. FSMs can be used to track those actions by associating system events with transitions between the states in the FSM. The state of the FSM can be monitored and, if certain states are reached, the FSM can signal that a ransomware attack is underway. Monitoring the state changes that occur in the computer system in terms of utilization, persistence, and the lateral movement of resources can detect ransomware [69].

Ramesh and Menen [69] proposed a finite state machine (FSM) with eight total states. The changes represented in the FSM include: changes in file entropy, as encrypted files have higher levels of entropy; changes in retention state, which occurs if a process has been added to the Run registry or startup directory; lateral movement, which checks for suspicious file names such as doubled file extensions (e.g. .pdf.exe); and system resources, which looks for processes that modify the system-restore settings or stop a large number of other processes in a short amount of time. If the FSM ever moves into one of its four final states, then the system is considered to be under a ransomware attack. Their method was tested against 475 different ransomware samples and 1500 benign programs. It detected 98.1% of the tested samples and had a 0% false positive rate. The main drawbacks of this approach are its inability to detect locker-type ransomware and its inability to detect ransomware samples that use sophisticated code-obfuscation and incremental unpacking techniques, such as NotPetya.
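The idea of mapping system events to state transitions can be sketched as follows. The states, event names, and transition table are hypothetical and much smaller than the eight-state machine of [69]; they only show how reaching a final state signals an attack.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical states and events; the machine of [69] is not reproduced here.
START = "clean"
FINAL_STATES = {"attack"}

TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("clean", "run_key_added"): "persisted",
    ("clean", "double_extension"): "suspicious",
    ("persisted", "double_extension"): "suspicious",
    ("persisted", "entropy_increase"): "attack",
    ("suspicious", "entropy_increase"): "attack",
    ("suspicious", "restore_disabled"): "attack",
}


def run_fsm(events: Iterable[str]) -> str:
    """Feed events to the FSM and return the state it ends in."""
    state = START
    for event in events:
        # Events with no matching transition leave the state unchanged.
        state = TRANSITIONS.get((state, event), state)
        if state in FINAL_STATES:
            break  # a final state means an attack is signalled
    return state
```

In this toy table, an entropy increase alone does not raise an alarm; it must follow a persistence or suspicious-name event, which mirrors the idea of combining several indicators before declaring an attack.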

Honeypots (or honeyfiles) are decoy files set up for the ransomware to attack. Once these files are attacked, the attack is detected and stopped. Honeyfiles are easy to set up and require little maintenance. However, there is no guarantee the attacker will target these decoys, so an attacker may encrypt other files while leaving the honeyfiles untouched [78] . Gómez-Hernández and Álvarez-González [23] proposed R-Locker, a tool for Unix platforms containing a "trap layer" with a series of honeyfiles. Any process or application that accesses the trap layer is detected and stopped. Unfortunately, R-Locker only protects part of the complete file system, and the tool can be defeated by deleting the central trap file.

Similarly, Kharraz et al. [75] designed UNVEIL to limit the damage that can be done by attackers before they are detected with honeyfiles. UNVEIL generates a virtual environment that aims to attract attackers. It then monitors its file system I/O and detects any presence of a screen locker. Their solution detected 96.3% of ransomware samples and had zero false positives.

Shaukat and Rebeiro [58] proposed RansomWall, a multi-layered defense system that incorporates honeyfiles to protect against crypto-ransomware. When the trap layer suspects a process is malicious, any modified files are backed up until the process is classified as either ransomware or benign by other layers. When tested, RansomWall had a 98.25% accuracy rate and generated zero false positives. One challenge is that some ransomware samples have limited file system activity.
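A minimal polling-based sketch of the honeyfile idea is given below, assuming decoys are checked periodically against a baseline digest. It is not the trap layer of R-Locker [23] or RansomWall [58]; the function names and decoy contents are illustrative assumptions.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path) -> str:
    """Return the SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def plant_honeyfiles(directory: Path, names: List[str]) -> Dict[Path, str]:
    """Create decoy files and return a baseline mapping of path to digest."""
    baseline: Dict[Path, str] = {}
    for name in names:
        path = directory / name
        path.write_bytes(b"decoy content: " + name.encode())
        baseline[path] = file_digest(path)
    return baseline


def tampered_honeyfiles(baseline: Dict[Path, str]) -> List[Path]:
    """Return decoys that were deleted, renamed, or modified since the baseline."""
    return [
        path
        for path, digest in baseline.items()
        if not path.exists() or file_digest(path) != digest
    ]
```

Any non-empty result from `tampered_honeyfiles` would be treated as an alarm. A polling check of this kind only notices damage after the fact, whereas the systems described above intercept the offending process.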

Network traffic analysis intercepts network packets and analyzes communication traffic patterns to detect ongoing malware attacks. For certain ransomware families, the communication between the victim host and the C&C server behaves much differently compared to normal conditions. This anomalous behavior can be revealed by studying distinct traffic features, such as suspicious source or destination IP addresses in a packet header or unusually large packet sizes, which may indicate that a message contains an encryption key. Some of the network traffic parameters used by researchers to detect ransomware are discussed below.

• Packet size: The size of messages exchanged may be unusually large if they contain an encryption key or encryption instructions. Cabaj et al. [96] analyzed CryptoWall and Locky ransomware samples under execution and extracted the message size from HTTP packet headers to determine the average size of messages exchanged between the infected host and the C&C server, then used these statistics to build an anomaly detection system based on message size. Bekerman et al. [95] used TCP packet size as a feature in a supervised learning-based system for detecting malware.

• Message frequency: Determining an uptick in certain kinds of traffic can be used to detect the presence of a ransomware attack. Almashhadani et al. [94] observed that Locky ransomware significantly increases the number of HTTP POST request packets within the traffic stream compared to normal traffic. Additionally, they found that there are numerous TCP RST and TCP ACK packets in Locky's traffic used to terminate the malicious TCP connections abnormally. The authors used these features and others as part of a multi-classifier detection system.

• Other features: Hundreds of other extracted network features from various OSI layers can also be used for ransomware detection. Many of these are outlined in [95], where the authors did not focus on ransomware detection specifically, but instead on general malware detection.
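To make the packet-size feature concrete, the sketch below flags message sizes that deviate strongly from a baseline using a simple z-score rule. This rule is a stand-in chosen for brevity rather than the method of [96] or [95], and the 3.0 threshold and function name are assumptions.

```python
from statistics import mean, stdev
from typing import List, Sequence


def size_anomalies(
    baseline: Sequence[int],
    observed: Sequence[int],
    z_threshold: float = 3.0,
) -> List[int]:
    """Return observed message sizes far from the baseline mean.

    `baseline` must hold at least two sizes measured under normal conditions.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation
    if sigma == 0:
        # All baseline messages had the same size: flag anything different.
        return [size for size in observed if size != mu]
    return [size for size in observed if abs(size - mu) / sigma > z_threshold]
```

For example, with baseline sizes clustered around 500 bytes, a 4096-byte message would be flagged while a 505-byte message would not.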

Many studies proposed machine learning models that detect ransomware by classifying computer programs as either benign or ransomware based on their behaviour. With sufficient training data, these models can spot attacks with a high degree of accuracy. Additionally, they are frequently able to detect ransomware before it has a chance to encrypt any files. However, finding a suitable model requires trial and error, and bias or overfitting may occur if proper measures are not taken [43]. What distinguishes the models proposed by different researchers are the classifier algorithms that are applied and the features that are used for training. The features used in the surveyed literature include the following:

• APIs / System calls: API calls are functions that facilitate the exchange of data among applications, while system calls are service requests made by the ransomware to the OS or kernel [116] . Often, ransomware makes API calls to the C&C server to obtain an encryption or decryption key. Other API calls can be made to maintain execution privileges on the host computer, enumerate the list of files to encrypt, and access or modify files. Ransomware and benign programs have specific call patterns or a unique order of calls that can be used to differentiate them. Examples of system calls include create, delete, execute, terminate, etc [117, 116, 85] .

• Log files: Log files can come from a variety of sources and record information that can indicate whether a ransomware attack is underway. For instance, Herrera Silva and Hernández-Alvarez [93] found that both WannaCry and Petya ransomware exploit DNS and NetBIOS and can be spotted by analyzing DNS and NetBIOS logs. I/O request packets are generated for each file operation and contain parameters such as the type of operation and the address and size of the data being read or written to. These parameters can be extracted from I/O request packet logs and used as features.

• File I/O: Ransomware typically executes many more read operations than benign programs, since it must read every file it encrypts. Additionally, it executes more write operations on average. File operation metrics such as the number of files written to or read from; the average entropy of file-write operations; the number of file operations performed for each file extension; and the total number of files accessed can be used to gauge if the file operations being performed are benign or part of a ransomware attack [59, 80] .

• HPC values: Hardware Performance Counters (HPCs) are a set of special-purpose registers that were first introduced to verify the static and dynamic integrity of programs in order to detect any malicious modifications to them [91] . The time-series data collected from these counters can be fed into a model to learn the behaviour of a system and detect malicious programs through any statistical deviations in the data.

• Network traffic: Network traffic features include average packet size, the number of packets exchanged between the host and other machines, and the source and/or destination IP addresses contained within packet headers. Ransomware frequently displays anomalous communications patterns. For example, the work by Cabaj et al. [109] found that CryptoWall and Locky ransomware samples involve a defined sequence of HTTP packets exchanged between the host and a C&C server to distribute the encryption key; in addition, these packets tend to be larger than average. Machine learning models can learn normal and anomalous traffic features to distinguish normal communication from malicious communication. Chadha and Kumar [107] analyzed network traffic to obtain the names of benign and malicious domains to use as features for their model, which detects ransomware by predicting if incoming or outgoing packets transmitted to or from the host contains a malicious domain.

• Opcode/Bytecode sequences: Opcodes ("operation codes") specify the basic processor instructions to be performed by a machine, whereas bytecode is a form of instruction designed to be executed by a program interpreter (e.g., Java Virtual Machine). These sequences have rich context and semantic information that provide a snapshot of the program's behaviour. This information can be extracted through dynamic analysis and fed into a model to predict if a given program is benign or malicious.

• Process actions: This refers to the sequence of events that occur while a program or application is running. Ransomware will typically cause different events to occur compared to a benign program; these events can be transformed into feature vectors and learned by a model by extracting information such as text and encoding it as numerical values [106] .

• Others: Many other features were used by researchers and extracted from assorted sources. Some of these features are derived from the raw bytes extracted from executable files using static analysis [10]. Other features relate to web domains (e.g., the length of the domain name, the number of days a domain is registered for [118]) or DNS (e.g., the number of DNS name errors, the number of meaningless domain names [94]). Portable Executable (PE) file headers, which show the structure of a file and contain important information about the nature of the executable file, have components that can be used as features. Other sources for features include the CPU (e.g., power usage), k-mer substrings (e.g., frequencies), volatile memory, and the Windows Registry [97, 89, 80].
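As a sketch of how such features might be assembled before training, the following Python code turns a list of file events into a few of the File I/O metrics named above. The `FileEvent` record, its fields, and the feature names are hypothetical; a real system would obtain events from OS-level monitoring and pass the resulting vector to a classifier.

```python
from dataclasses import dataclass
from pathlib import PurePath
from typing import Dict, List


@dataclass
class FileEvent:
    op: str               # "read" or "write"
    path: str             # file the operation touched
    entropy: float = 0.0  # entropy of written data (bits per byte); unused for reads


def io_features(events: List[FileEvent]) -> Dict[str, float]:
    """Summarize a process's file events as a flat feature dictionary."""
    reads = [e for e in events if e.op == "read"]
    writes = [e for e in events if e.op == "write"]
    extensions = {PurePath(e.path).suffix.lower() for e in events}
    avg_entropy = sum(e.entropy for e in writes) / len(writes) if writes else 0.0
    return {
        "files_read": len({e.path for e in reads}),
        "files_written": len({e.path for e in writes}),
        "files_accessed": len({e.path for e in events}),
        "avg_write_entropy": avg_entropy,
        "distinct_extensions": len(extensions),
    }
```

A process that reads many files and writes high-entropy data to newly named files would produce a vector quite different from that of a typical benign program.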

A complete list of the works that focused on detecting ransomware using machine learning is highlighted in Table 6 .

In this section, we highlight the motivation for implementing existing ransomware samples and testing the effectiveness of existing countermeasures against those samples. A brief description of our new ransomware is also presented.

From the literature review, few studies were found to test the effectiveness of existing ransomware countermeasures, such as antivirus products. There seems to be a gap between research-based proposed solutions and existing practical solutions. To validate our claim, we decided to test different AV products against random known ransomware samples and a simple ransomware created by us. This was done to evaluate the effectiveness of existing practical countermeasures against both known and unknown ransomware samples. Our aim is not to claim that existing AV products are unable to detect ransomware samples, as it is possible that the tested AV products are able to detect other samples from other known ransomware families. Through these experiments, our motive is simply to highlight the need for effective countermeasures against known and unknown ransomware samples.

Table 6: Overview of surveyed machine learning detection approaches

Ref. | Classifier Algorithm(s) | Features
[10] | Random Forest | Raw bytes
[22] | Decision trees | APIs/system calls
[37] | SVM, Random Forest | Strings, APIs/system calls
[41] | Linear Regression | k-mer frequency
[58] | Logistic Regression, SVM, ANN, Random Forest, Gradient Tree Boosting | APIs/system calls
[59] | Random Forest | Log files
[72] | Naïve Bayes, Logistic Regression, Decision trees, Random Forest | Log files
[73] | KNN, ANN | Log files
[87] | Random Forest, Logistic Regression, Naïve Bayes, SGD, KNN, SVM | APIs/system calls
[88] | Linear Regression, Decision trees | APIs/system calls
[89] | Decision trees, Random Forest, Naïve Bayes, Bayesian networks, Logistic Regression, LogitBoost, Bagging, AdaBoost | Volatile memory dump features
[90] | Linear Regression | APIs/system calls
[92] | ANN (LSTM) | HPC values
[93] | None (proof of concept) | Log files
[94] | Random Forest, Bayesian Network, SVM | Network traffic
[95] | Naïve Bayes, Decision trees, Random Forest | Network traffic
[97] | KNN, ANN, SVM, Random Forest | CPU power usage
[100] | Random Forest | Network traffic
[101] | CNN | Opcodes
[102] | SVM | Opcode/bytecode sequences
[103] | CNN | PE header components
[104] | Naïve Bayes, Logistic Regression, SVM, Random Forest, Decision trees | DLL function calls, Opcode/bytecode sequences
[105] | Logistic Regression, SVM, Random Forest, Decision trees | DLL function calls, Opcode/bytecode sequences
[106] | LSTM, CNN | Event sequences
[107] | KNN, SVM, ANN | Network traffic
[109] | k-means Clustering | Network traffic
[112] | SVM, Naïve Bayes | Network traffic
[113] | ANN |

Testing was done using a VirtualBox virtual machine running the latest version of Windows 10. VirtualBox Guest Additions were not installed as some malware samples are known to detect these additions [119]. Ransomware samples were taken from the work of [120]. The samples were in a binary format and had to be extracted from an encrypted ZIP file before use. In most cases, the file extensions were manually added before the execution of the ransomware. To conduct the tests safely on these ransomware samples, a few precautions were taken. This included setting the network adaptor to host only, ensuring all software was up-to-date, and removing any shared folders between the guest and the host operating systems. On the host side, data was backed up to an external hard drive and the internet connection was disconnected. The reason for disconnecting the internet was to make sure ransomware does not escape the environment of the virtual machine. The ransomware samples were all taken from https://github.com/ytisf/theZoo in January of 2021.
Several test folders were placed in different areas of the file system, including the Desktop, Documents, and Pictures folders. Test folders were also placed in protected areas of the file system such as Program Files, Program Files (x86), and Windows. One of the folders was placed in the Recycle Bin to analyze whether the ransomware scans the Recycle Bin. The test folders contained four different file formats: rich-text, text, PDF, and image files. All of these files had a non-zero size.

Testing consisted of three parts, where in each part various ransomware samples are pitted against various antivirus products. The first test was on well-known ransomware samples. The second test used a RaaS generator. The third and final test used a novel custom-made ransomware sample. All of the antivirus products were the most up-to-date versions as of January, 2021.

The first round of testing was simply a control test to see the impact of the ransomware samples when no security controls were in place; all antivirus applications were turned off. The User Account Control settings of Windows were left at their defaults. The ransomware samples tested were WannaCry [17], Cerber [121], Thanos, and Jigsaw [122]. The results are shown in Table 7, where it can be seen that most of the files within the Desktop, Documents, and other user folders were encrypted, while those in the protected operating system folders were not. Cerber ransomware failed to encrypt folders that the other samples encrypted. The explanation for this behaviour is unknown, but the sample may simply have been programmed that way.

Other ransomware samples were also tested, but unfortunately, we were not able to analyze them. As mentioned earlier, some forms of ransomware need to connect via the internet to a C&C server before they can be executed. In our scenario, due to the testing being done offline, it was not possible to analyze that category of ransomware. The same ransomware samples were then tested against eight popular antivirus programs. In all cases, the ransomware samples were rapidly detected and removed before any test files became encrypted. The samples were often removed before they were even clicked on.

The second round of testing was done using a RaaS generator called RAASNet, which can be downloaded from https://github.com/leonv024/RAASNet. RAASNet is a free, cross-platform, and open-source software project designed to educate the public about how easy it is to create and use ransomware. It allows for custom ransomware to be created and tested. Although RAASNet generates real ransomware, the decryption key can be freely obtained from the author's website.

A control test was performed for two different RAASNet generated ransomware samples with no antivirus software running. These two samples were identical except for the fact that one ran with administrator privileges while the other did not. The payloads of both samples were generated using the default settings of RAASNet. The results of this control test can be seen in Table 8 . Both of the samples were set to target all of the listed folder locations. The sample with administrator privileges was tested to see if it would be able to infect the protected operating system folders, but this was unsuccessful. The only difference between the two tests was that the one with administrator privileges generated a user account control (UAC) prompt message, but allowing access still did not let the ransomware modify the files. The advantage of testing RAASNet ransomware over well-known ransomware samples (e.g. Jigsaw) is that RAASNet generated samples are not included in all antivirus signature databases. One of the generated payloads was uploaded to VirusTotal.com, and only 20 out of 72 antivirus engines detected the payload as malicious. Comparatively, Jigsaw's sample was also uploaded and this was detected by 67 out of 72 engines. This means that the antivirus programs can be tested for their dynamic detection abilities rather than strictly through static-based detection. This is important since it is a better indication of how they might do against novel ransomware samples in the future where static analysis is more likely to fail.

A RAASNet generated payload (created with default settings and without administrator privileges) was then tested against several popular antivirus programs. The results of these tests can be found in Table 9. Folders were placed in different locations across the file system and marked as either encrypted or safe depending on whether the ransomware encrypted them or not. The worst performing antivirus programs were Microsoft Defender, MalwareBytes (Free), and Avira (Free). All of the antivirus programs had real-time protection turned on. Overall, the antivirus programs did quite well and quickly caught the ransomware before it could do any real damage. However, the antivirus programs with the best results appeared to detect the ransomware samples through static analysis. This is evidenced by the fact that many of these antivirus programs gave messages indicating that they detected the ransomware by preemptively scanning the file, seemingly before it could run. It is worth noting that many antivirus programs, such as Microsoft Defender, do have an effective form of ransomware protection built-in. This protection comes in the form of folder protection, which checks if a process is trusted. If it is not, the antivirus software prevents the process from modifying the folder contents. A protected folder was set up on the Desktop using Microsoft Defender, and the contents in this folder were successfully protected. It would appear that a similar form of protection also safeguards important operating system folders, as evidenced by the fact that no ransomware sample was able to encrypt files in these areas of the file system.
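The folder protection logic described above can be sketched as a simple policy check. How a real product decides whether a process is trusted is not described here, so the name-based allowlist, the function name, and the example paths are all assumptions made for illustration.

```python
from pathlib import PureWindowsPath
from typing import Iterable, Set


def is_write_allowed(
    process: str,
    target: str,
    protected: Iterable[str],
    trusted: Set[str],
) -> bool:
    """Deny writes into a protected folder unless the process is trusted.

    `trusted` holds lowercase process names; this is a stand-in for whatever
    trust decision a real product makes.
    """
    target_path = PureWindowsPath(target)
    for folder in protected:
        folder_path = PureWindowsPath(folder)
        # The target is covered if it is the folder itself or lies beneath it.
        if target_path == folder_path or folder_path in target_path.parents:
            return process.lower() in trusted
    return True  # writes outside protected folders are not restricted
```

This also shows the usability cost noted for such schemes: every benign program that needs to write into a protected folder has to be added to the trusted set.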

The final tests were done using the AESthetic ransomware sample. This sample was custom-made for this research and was created in Java using Java's standard cryptographic package, javax.crypto. AESthetic uses a hybrid encryption approach with the help of a C&C server that runs on localhost. It starts by generating a symmetric key using secure cryptographic modules. It then recursively crawls through the file system from a specified target directory and encrypts all specified file types using AES-256 in CBC mode. A unique and randomly generated initialization vector is used for each file and is prepended to the encrypted file for later use. A ransom note is placed in every directory that AESthetic traverses. Once all of the files are encrypted, AESthetic connects to the C&C server to obtain an RSA public key that it uses to encrypt the symmetric key. Once the symmetric key is encrypted, the plaintext version of the symmetric key is deleted. New files are created to store the encrypted data and the original plaintext files are deleted. After ten seconds, it will automatically start to decrypt the encrypted files. To do this, it once again connects to the C&C server to obtain the corresponding RSA private key to decrypt the encrypted AES symmetric key. This sample was tested against eight popular antivirus programs (which are the same as those listed in Table 9). All of the test files were encrypted by AESthetic. None of the antivirus programs reported any suspicious activity. Both the source code and an executable JAR file were uploaded to VirusTotal.com, and in both cases, this resulted in zero detections. The zero detections are most likely because the malware was made just for this research and its signature has not yet been added to any signature database.

From the results of our literature review and experiments, we can make several observations on the current trends and limitations of ransomware countermeasure solutions. Most papers preferred to study ransomware using dynamic analysis over static analysis, or used a combination of the two. This is perhaps unsurprising, as static analysis can frequently be evaded through code obfuscation or polymorphic/metamorphic attacks [58]. However, some papers found that certain dynamic analysis approaches can be evaded as well. For instance, the virtual environment in UNVEIL [75] could potentially be detected and avoided by attackers. One limitation of both types of analysis is that the results cannot usually be generalized to all ransomware variants. For example, the key backup technique proposed by Lee et al. [64] relies on their analysis that ransomware calls specific functions in the CNG library. The HTTP traffic characteristics that Cabaj et al. [96] used to detect ransomware come from studying two ransomware families, CryptoWall and Locky. Almashhadani et al. [94] base their detection system on the behavioural analysis of one family, Locky.

Preventative techniques such as access control and key or data backups can reduce the damage that ransomware can inflict on systems and possibly deter future attacks. However, these prevention-based approaches suffer from shortcomings as well. In particular, they can have significant overhead. Access control or key backup schemes can incur significant computational costs [123]. Creating data backups can cause the system to take a significant performance hit, especially under high workloads [7]. To counter these shortcomings, machine learning approaches based on behavioural analysis can help prevent ransomware.

Machine learning models were the most common technique for detecting ransomware. These models can be trained to recognize the general behaviour patterns of ransomware through suspicious behaviour or specific basic processor instruction patterns. The ability for machine learning to detect the general behaviour of ransomware is important, as ransomware is constantly evolving and can easily change its code signature, but has difficulty changing its attack pattern [43] . However, many of these models require an attack to already be underway in order to detect suspicious activity, such as file access or communication to a malicious domain. Khan et al.'s [41] use of digital DNA sequencing is a promising approach since it is designed to detect ransomware before infection.

Based on the results of our experiments, which were conducted on a number of different ransomware samples, we have learned a few interesting things about ransomware. Our tests using RAASNet have shown how easy it is to acquire and use ransomware through RaaS software. RaaS lets ransomware developers sell or lease their ransomware variants to affiliates, who use these variants to perform attacks; both developers and affiliates get a cut of any profits. As previously mentioned, RaaS enables users without technical expertise to launch ransomware attacks, meaning that ransomware is no longer limited to the developers who create it. For developers, RaaS reduces their risk since they do not launch the attacks themselves. The RaaS model has gained popularity amongst cybercriminals and has caused a dramatic increase in the rate of ransomware attacks in recent years [82] .

Although antivirus programs were successful against previously known samples, they did not fare quite so well against the lesser-known RAASNet sample and the completely novel AESthetic sample. The novel sample is, of course, not present in antivirus signature databases, and it went completely undetected. This suggests that current antivirus software likely relies too heavily on simple signature-based static analysis and should invest more in the approaches seen in the literature, especially dynamic analysis and honeypot approaches. For example, our ransomware AESthetic was designed with many tell-tale ransomware behaviors in mind, such as leaving ransom notes, reading and writing to many files throughout the file system, and using cryptographic libraries. These behaviors could potentially have been used to detect AESthetic as malicious using dynamic analysis. The only tested antivirus countermeasure that successfully repelled all of the tested ransomware samples was ransomware folder protection, such as the "Controlled folder access" feature offered by Microsoft Defender. However, such an approach requires the user to manually decide which folders to protect, and it is not very user-friendly, as benign programs must be manually allowed through the protection.

In this section, we highlight key research challenges based on the literature review and explore future research directions. The identified research challenges include unawareness among users, lack of open-access ransomware libraries, and inadequate detection and false-positive rates for ransomware. Future research directions include edge and fog-assisted ransomware detection, DeepFake ransomware, remote working vulnerabilities, blockchain-based countermeasures, increases in RaaS attacks, and expansion to AESthetic.

1. Unawareness among users: Awareness among users is one of the fundamental challenges that needs to be addressed to reduce the impact of ransomware. For example, there is no foolproof automatic system that is able to consistently counter ransomware attacks that propagate through phishing campaigns. Although existing spam filters are efficient, there is always a possibility that some malicious emails will make their way into a user's inbox. In that scenario, basic knowledge of how to recognize spam can save a victim from being infected. There are currently many workshops, programs, and online websites available to educate users about such threats, but based on the statistics of ransomware attacks, it seems more effort is needed.

2. Lack of Open-Access ransomware Libraries: In order to propose and develop new solutions that can tackle ransomware, there is an emerging need for open-access ransomware libraries. The availability of such libraries will help researchers to better understand the varying features of existing ransomware samples, including their working mechanisms. Based on that understanding, researchers can propose better solutions more quickly. As it stands, it is a tedious task to implement a particular ransomware sample and then test a countermeasure against it. However, collecting many of the existing ransomware samples is itself a big research challenge that needs international research collaboration, as well as a large amount of funding to obtain the necessary resources.

3. Inadequate Detection and False Positive Rates: Existing ransomware detection systems face a difficult challenge in achieving both a high detection rate and few false alarms. A large number of false alarms is frustrating for administrators, whereas a low detection rate makes the system ineffective [112]. Signature-based detection systems may miss attacks if the signature is too specific; conversely, the system may flag too many benign programs as ransomware if the signature is too generic. Anomaly-based detection systems flag behaviour that is sufficiently far from normal [113]. However, not all abnormal behaviour is malicious. Consequently, these systems can generate a high number of false alarms and require a human to manually review each alarm. This manual validation adds to the system workload and reduces the system's practicality. Al-Rimy et al. [82] were able to achieve both high detection and low false-positive rates by combining two behavioural detection methods into a single model. However, their system relies on a time-based threshold. Hence, more research is needed to improve ransomware detection models and to increase their applicability.
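For reference, the two rates discussed in this challenge can be computed from the outcome counts of an evaluation as sketched below. The function names are illustrative, and the counts in the usage note are hypothetical rather than results from any surveyed paper.

```python
def detection_rate(true_positives: int, false_negatives: int) -> float:
    """Fraction of ransomware samples that were flagged (true positive rate)."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no ransomware samples were evaluated")
    return true_positives / total


def false_positive_rate(false_positives: int, true_negatives: int) -> float:
    """Fraction of benign programs that were wrongly flagged as ransomware."""
    total = false_positives + true_negatives
    if total == 0:
        raise ValueError("no benign programs were evaluated")
    return false_positives / total
```

With hypothetical counts of 95 detected and 5 missed samples, `detection_rate(95, 5)` gives 0.95; with 3 false alarms among 100 benign programs, `false_positive_rate(3, 97)` gives 0.03. Reporting both values together is what makes the trade-off between missed attacks and false alarms visible.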

1. Edge and Fog-assisted Ransomware Detection and Prevention using Federated Learning: There have been major advancements in edge and fog-based technologies [124, 16, 125, 126]. In addition, with the arrival of federated learning [127], numerous opportunities for improving state-of-the-art machine learning-based approaches have emerged. These concepts could be used to detect and prevent ransomware based on machine learning approaches [128]. One possibility is to train and deploy machine learning-based algorithms on edge/fog nodes to detect and prevent ransomware. Through federated learning, the learning process of each node can be personalized.

2. DeepFake Ransomware: Deepfakes are manipulated digital representations, such as images and videos, in which an attacker tries to mimic a real person [129]. In the future, attackers may be able to create ransomware that automatically generates DeepFake content of a victim performing some incriminating or intimate action that he or she never performed. The victim would then be asked to pay a ransom to prevent that content from being published online. Mitigating such ransomware attacks will be challenging due to the velocity of data and the availability of numerous social media channels through which the content can spread.

3. Remote Working Vulnerabilities: The recent COVID-19 pandemic made it mandatory for several institutions to initiate work-from-home arrangements or implement a bring-your-own-device (BYOD) policy [130]. As a result, attackers exploited several vulnerabilities [131], leading to several ransomware attacks. According to a report by SkyBox Security, ransomware attacks grew by 72 percent compared to previous years. Hence, mitigating such attacks in remote working scenarios is a future research direction.

4. Blockchain-based Countermeasures: Blockchain is an immutable decentralized ledger that makes tampering difficult [132] due to its decentralized nature along with its linked hash function, timestamp function and consensus mechanism [133, 134]. Using blockchain-based solutions to mitigate ransomware attacks appears to be a promising research direction. A first step in this direction is the work of [135], where the authors highlighted the use of smart contracts for the limited payment of ransoms to obtain the decryption keys.

5. Increase in Ransomware-as-a-Service (RaaS) Attacks: Ransomware as a service, or RaaS, has been gaining popularity over the past few years [136]. In the RaaS model, an experienced attacker creates ransomware and offers that code to script kiddies or gray-hat hackers for a price [11, 137]. The script kiddies or gray-hat hackers then use that code to carry out their own attacks. The Cerber ransomware attack is one example of the RaaS model in action. With emerging technologies and an increasing number of internet users, there is a strong possibility of a surge in these types of attacks. Hence, mitigating such attacks is a potential future research direction.

6. AESthetic Ransomware Artifact Development: The source code of the AESthetic ransomware has been posted to GitHub at https://github.com/kregg34/AESthetic, and the repository has been made private. As we are still in the initial phases of developing a decryption tool for AESthetic, we aim to create an artifact for the AESthetic ransomware so that researchers can evaluate the efficacy of their solutions against ransomware. Once the decryption tool is finalised, we will release the code of AESthetic.

7. AESthetic Performance: The antivirus products were likely able to detect the other, well-known samples due to their known signatures. However, our ransomware AESthetic, which has no known signatures, went undetected. This may indicate that these products rely too heavily on static analysis and do not effectively utilize dynamic analysis. Dynamic analysis may be able to detect AESthetic, as it was designed to exhibit many of the tell-tale signs of ransomware behaviour. However, validating this claim requires more research, owing to the black-box nature of antivirus products.
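The point made in item 7 about known signatures can be sketched in a few lines. The example below assumes the simplest possible form of signature, a SHA-256 hash of the whole file looked up in a set of known hashes. How commercial antivirus products actually match signatures is not known to us, so the function names and placeholder byte strings here are purely hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def build_signature_db(known_samples):
    """Build a set of hashes from previously analysed samples."""
    return {sha256_hex(sample) for sample in known_samples}


def is_flagged(sample: bytes, signature_db) -> bool:
    """A purely static check: flag the sample only if its hash is already known."""
    return sha256_hex(sample) in signature_db


# Placeholder byte strings stand in for real binaries (assumption for illustration).
known = [b"placeholder bytes standing in for a well-known sample"]
db = build_signature_db(known)
novel = b"placeholder bytes standing in for a new, unseen sample"

print(is_flagged(known[0], db))  # the known sample matches its stored hash
print(is_flagged(novel, db))     # the unseen sample has no matching hash
```

Under this assumption, a sample that has never been analysed cannot match, regardless of how it behaves at run time. That is consistent with the suggestion above that behaviour-based (dynamic) analysis may be needed for new ransomware.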

In this work, recent advances in ransomware analysis, detection, and prevention were explored. It was found that state-of-the-art ransomware detection techniques mostly revolve around honeypots, network traffic analysis, and machine-learning-based approaches. Prevention techniques, meanwhile, mostly focus on access control, data and key backups, and hardware-based solutions. There also appears to be a trend towards using machine-learning-based approaches to detect ransomware. We conducted a number of experiments on ransomware samples, from which we observed that more intelligent approaches are needed to detect and prevent ransomware. The experiments also showed that ransomware can be easily created and used. Finally, we highlighted the existing research challenges and enumerated some future research directions in the field of ransomware.

Manuscript title: Ransomware: Recent Advances, Analysis, Challenges and Future Research Directions

All persons who meet authorship criteria are listed as authors, and all authors certify that they have participated sufficiently in the work to take public responsibility for the content, including participation in the concept, design, analysis, writing, or revision of the manuscript. Furthermore, each author certifies that this material or similar material has not been and will not be submitted to or published in any other publication before its appearance in the Computers and Security Journal.

The contribution of each author is listed below:

-Craig Beaman conducted the literature review, worked on implementation details, and was involved in drafting the manuscript.

-Toluwalope David Akande conducted the literature review and was involved in drafting the manuscript.

-Saqib Hakak designed the study, assisted in classification, worked on the future research challenges and directions section, and coordinated the whole work.

-M. Khurram Khan provided potentially useful recommendations and directions to improve the work, and assisted in addressing reviewer comments and proofreading.

Cyber security in the age of covid-19: A timeline and analysis of cyber-crime and cyber-attacks during the pandemic

A comprehensive survey: ransomware attacks prevention, monitoring and damage control

Hobby hackers to billion-dollar industry: the evolution of ransomware

Ransomware: Evolution, mitigation and prevention

On the social science of ransomware: Technology, security, and society

A Study of Ransomware Attacks: Evolution and Prevention

Ransomware Prevention and Mitigation Techniques

Avoiding Future Digital Extortion Through Robust Protection Against Ransomware Threats Using Deep Learning Based Adaptive Approaches

HelDroid: Dissecting and Detecting Mobile Ransomware

Ransomware Detection using Random Forest Technique

The Ransomware-as-a-Service economy within the darknet

A survey on malware detection and classification

Cyber Fraud: Detection and Analysis of the Crypto-Ransomware

Ransomware as a Service using Smart Contracts and IPFS

Malware in Computer Systems: Problems and Solutions

Have you been a victim of COVID-19-related cyber incidents? Survey, taxonomy, and mitigation strategies

WannaCry ransomware: Analysis of infection, persistence, recovery prevention and propagation mechanisms

WannaCry Aftershock

Privacy, confidentiality, and security of health care information: Lessons from the recent Wannacry Cyberattack

Automated Behavioral Analysis of Malware: A Case Study of WannaCry Ransomware

New Challenges in Forensic Analysis in Railway Domain

Prevention of crypto-ransomware using a pre-encryption detection algorithm

R-Locker: Thwarting ransomware action through a honeyfile-based approach

Ransomware Prediction Using Supervised Learning Algorithms

Contemporary cybercrime: A taxonomy of ransomware threats mitigation techniques

Analysis of Encryption Key Generation in Modern Crypto Ransomware

A Cyber-Kill-Chain based taxonomy of crypto-ransomware features

A key-management-based taxonomy for ransomware

Recent Advances in Cryptovirology: State-of-the-Art Crypto Mining and Crypto Ransomware Attacks

Awareness Learning Analysis of Malware and Ransomware in Bitcoin

The aftermath of a crypto-ransomware attack at a large academic institution

Ransomware: How to Prevent and Recover (ITSAP.00.099)

Paying the Ransom Doubles Cost of Recovering from a Ransomware Attack, According to Sophos

COVID-19 pandemic cybersecurity issues

Dynamic malware analysis in the modern era: A state of the art survey

Integrated static and dynamic analysis for malware detection

Developing realistic distributed denial of service (DDoS) attack dataset and taxonomy

Deep feature transfer learning for trusted and automated malware signature generation in private cloud environments

A comprehensive review on malware detection approaches

A Digital DNA Sequencing Engine for Ransomware Detection Using Machine Learning

Dynamic Malware Analysis in the Modern Era: A State of the Art Survey

Ransomware, threat and detection techniques: A review

When Malware is Packin' Heat; Limits of Machine Learning Classifiers Based on Static Analysis Features

Ransomware threat success factors, taxonomy, and countermeasures: A survey and research directions

The rise of ransomware and emerging security challenges in the Internet of Things

Ransomware attacks: detection, prevention and cure

Ransomware: a survey and trends

Windows-based Ransomware: A Survey

A survey on detection techniques for cryptographic ransomware

Dissecting .NET ransomware: key generation, encryption and operation

A Dive into the Deep: Demystifying WannaCry Crypto Ransomware Network Attacks Via Digital Forensics

Ransomware prevention using application authentication-based file access control

Use of access control to minimise ransomware impact

Blacklist vs. Whitelist-Based Ransomware Solutions

Enforcing situation-aware access control to build malware-resilient file systems

No random, no ransom: a key to stop cryptographic ransomware

RansomWall: A layered defense system against cryptographic ransomware attacks using machine learning

ShieldFS: a self-healing, ransomware-aware filesystem

Redemption: Real-time protection against ransomware at end-hosts

FlashGuard: Leveraging intrinsic flash properties to defend against encryption ransomware

Improving backup system evaluations in information security risk assessments to combat ransomware

Amoeba: an autonomous backup and recovery SSD for ransomware attack defense

Ransomware prevention technique using key backup

Paybreak: Defense against cryptographic ransomware

Attacking key management in ransomware

Individual cyber security: Empowering employees to resist spear phishing to prevent identity theft and ransomware attacks

Why employees matter in the fight against ransomware

Automated dynamic approach for detecting ransomware using finite-state machine

Experimental Analysis of Ransomware on Windows and Android Platforms: Evolution and Characterization

CryptoLock (and Drop It): Stopping Ransomware Attacks on User Data

Rwguard: A real-time detection system against cryptographic ransomware

Machine learning based file entropy analysis for ransomware detection in backup systems

Ransomware detection method based on context-aware entropy analysis

UNVEIL: A large-scale, automated approach to detecting ransomware

SSD-insider: Internal defense of solid-state drive against ransomware with perfect data recovery

Ransomware detection using I/O patterns. US Patent 10,078,459

Detecting ransomware with honeypot techniques

Early detection of crypto-ransomware using pre-encryption detection algorithm

Automated dynamic analysis of ransomware: Benefits, limitations and use for detection

Detecting ransomware using support vector machines

Zero-day aware decision fusion-based model for crypto-ransomware early detection

Insights into Malware Detection via Behavioral Frequency Analysis Using Machine Learning

A pseudo feedback-based annotated TF-IDF technique for dynamic crypto-ransomware pre-encryption boundary delineation and features extraction

API Call Based Ransomware Dynamic Detection Approach Using TextCNN

An I/O Request Packet (IRP) Driven Effective Ransomware Detection Scheme using Artificial Neural Network

Ransomware detection using machine learning algorithms

Detection and elimination of spyware and ransomware by intercepting kernel-level system routines

Trusted detection of ransomware in a private cloud using machine learning methods leveraging meta-features from volatile memory

Crypto-ransomware early detection model using novel incremental bagging with enhanced semi-random subspace selection

RAPPER: Ransomware prevention via performance counters

RATAFIA: ransomware analysis using time and frequency informed autoencoders

Large scale ransomware detection by cognitive security

A multi-classifier network-based crypto ransomware detection system: a case study of Locky ransomware

Unknown malware detection using network traffic classification

Software-defined networking-based crypto ransomware detection using HTTP traffic characteristics

Detecting crypto-ransomware in IoT networks based on energy consumption footprint

Leveraging machine learning techniques for windows ransomware network traffic detection

Ransomware early detection by the analysis of file sharing traffic

Machine learning-based detection of ransomware using sdn

Ransomware classification using patch-based CNN and self-attention network on embedded N-grams of opcodes

Leveraging support vector machine for opcode density based detection of crypto-ransomware

A New Method for Ransomware Detection Based on PE Header Using Convolutional Neural Networks

A framework for analyzing ransomware using machine learning

A multi-level ransomware detection framework using natural language processing and machine learning

DRTHIS: Deep ransomware threat hunting and intelligence system at the fog layer

Ransomware: Let's fight back!

A Novel Approach for Detecting DGA-based Ransomwares

Using software-defined networking for ransomware mitigation: the case of cryptowall

Ransomware Attack PredicTOR

RanDroid: Structural Similarity Approach for Detecting Ransomware Applications in Android Platform

Intelligent and Dynamic Ransomware Spread Detection and Mitigation in Integrated Clinical Environments

Catch It If You Can: Real-Time Network Anomaly Detection with Low False Alarm Rates

A behavioral analysis of the ransomware strain NEFILIM

What is the Difference Between API and System Call

An Empirical Study of API Calls in Ransomware

RAPTOR: Ransomware Attack PredicTOR

Ransomware Families

Ransomware deployment methods and analysis: views from a predictive model and human responses

Efficient attribute-based comparable data access control

Survey of fog computing: Fundamental, network applications, and research challenges

A Framework for Edge-Assisted Healthcare Data Analytics using Federated Learning

A survey of multi-access edge computing in 5G and beyond: Fundamentals, technology integration, and state-of-the-art

Federated machine learning: Concept and applications

Adaptive privacy-preserving federated learning

Deepfake video detection using recurrent neural networks

BYOD policy compliance: Risks and strategies in organizations

Cyber security and the remote workforce

Recent advances in Blockchain Technology: A survey on Applications and Challenges

Securing smart cities through blockchain technology: Architecture, requirements, and challenges

Industrial wastewater management using blockchain technology: Architecture, requirements, and future directions

Blockchain-based semi-autonomous ransomware

The new generation of ransomware: an in depth study of Ransomware-as-a-Service

RANSOMWARE AS A SERVICE AND PUBLIC AWARENESS

Craig Beaman is a graduate student at the University of New Brunswick, where he is completing a Master of Applied Cybersecurity. Craig received a B.Sc. (Honours) from the University of New Brunswick with a major in physics and minors in mathematics and computer science. His research interests include cryptography, network security, and malware detection and prevention.

Ashley Barkworth is a graduate student at the University of New Brunswick, where she is completing a Master of Applied Cybersecurity. Ashley received a B.Sc. (Honours) from the University of British Columbia with a major in computer science and a minor in mathematics in 2020. Her research interests include information security, cryptography, and data management in centralized systems.

Toluwalope David Akande is a graduate student at the University of New Brunswick, where he is completing a Master of Applied Cybersecurity. He received a B.Sc. (Honours) from Obafemi Awolowo University with a major in Computer Engineering. His research interests include network security, intrusion detection using machine learning, and cloud computing security.

With more than 5 years of industrial and academic experience, he has received several Gold/Silver awards in international innovation competitions and serves as a technical committee member and reviewer for several reputed conferences and journals. His current research interests include risk management, fake news detection using AI, security and privacy concerns in IoE, applications of federated learning in IoT, and blockchain technology.

Muhammad Khurram Khan is currently working as a Professor of Cybersecurity at the Center of Excellence in Information Assurance

All persons who have made substantial contributions to the work reported in the manuscript (e.g., technical help, writing and editing assistance, general support), but who do not meet the criteria for authorship, are named in the Acknowledgements and have given us their written permission to be named. If we have not included an Acknowledgements, then that indicates that we have not received substantial contributions from non-authors.

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.