title: An Analysis and Evaluation of Open Source Capture the Flag Platforms as Cybersecurity e-Learning Tools
authors: Karagiannis, Stylianos; Maragkos-Belmpas, Elpidoforos; Magkos, Emmanouil
date: 2020-08-16
journal: Information Security Education
DOI: 10.1007/978-3-030-59291-2_5

Capture the Flag (CTF) challenges are typically used for hosting competitions related to cybersecurity. Like any other event, CTF competitions vary in context, topics and purpose, and integrate various features and characteristics. This article presents the results of a comparative evaluation of 4 popular open source CTF platforms with regard to their use for learning purposes. We conducted this evaluation as part of a user-centered design process, demonstrating the platforms to potential participants in order to collect descriptive insights regarding the features of each platform. The results of this evaluation show that the participants confirmed the high importance of the selected features and their significance for enhancing the learning process. This study may be useful for organizers of learning events who need to select the right platform, as well as for future researchers who wish to upgrade or extend a particular platform according to their needs.

Cybersecurity is a fast-growing topic and a compound industry that is rapidly changing, following the lightning-fast evolution of technology. Large sums are consistently invested in security research and in the training of professionals in order to protect critical infrastructures against possible threats [1]. As part of their cybersecurity strategy, many companies choose to train their employees in order to sharpen their skills and increase their security awareness [2]. Traditional methodologies for teaching cybersecurity and information security topics may not allow trainees to use and test their knowledge under realistic conditions [3].

Capture the Flag (CTF) competitions [4] are very popular for testing skills and for presenting challenges that provide practice on various security topics such as cryptography, steganography, web or binary exploitation, and reverse engineering, among others. The game takes place in the digital world: each team must protect and attack vulnerable systems and collect the flags, which are alphanumeric strings. Each challenge has a description, related files or website links, potential hints, and an amount of reward points that each participant or team collects after a successful flag submission [4]. Groups or individual participants try to collect as many reward points as possible within a certain time. The winner is the individual or team with the most reward points.

CTF competitions can be categorized according to their purpose. The first category involves the use of CTF tools by educational institutions as an alternative way of teaching security concepts [5, 6]. This gives participants the opportunity to acquire practical experience as well as to better understand context related to academic topics. The second category involves the use of CTF tools by organizations and even governments for recruiting purposes [7]. Organizing CTF competitions is an ideal way for companies or organizations to find competent people and evaluate their skills. The third and final purpose for organizing a CTF is entertainment and self-directed learning [4].
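To make the challenge structure described above concrete, the following is a minimal, hypothetical sketch of how a CTF challenge and its flag check might be represented. The field names and the example flag are illustrative only and do not correspond to any particular platform's schema.

```python
from dataclasses import dataclass, field

@dataclass
class Challenge:
    """Illustrative challenge record: names and fields are hypothetical."""
    title: str
    category: str          # e.g. cryptography, steganography, web exploitation
    description: str
    points: int            # reward points granted on a correct submission
    flag: str              # the secret alphanumeric string to be captured
    files: list = field(default_factory=list)   # related files or website links
    hints: list = field(default_factory=list)   # optional hints shown on request

def check_submission(challenge: Challenge, submitted: str) -> int:
    """Return the points earned for a submission (0 if the flag is wrong)."""
    return challenge.points if submitted.strip() == challenge.flag else 0

# Example usage with a made-up warm-up challenge
caesar = Challenge(
    title="Warm-up cipher",
    category="cryptography",
    description="Decrypt the attached message to recover the flag.",
    points=100,
    flag="CTF{r0t13_is_not_encrypt1on}",
    files=["cipher.txt"],
    hints=["The alphabet has 26 letters."],
)
print(check_submission(caesar, "CTF{r0t13_is_not_encrypt1on}"))  # 100
```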
CTFs have greatly evolved over the past decade, and modern CTF competitions use gamification elements [8-11] such as storytelling, rich graphics, prizes, and even augmented reality, which transform them into interesting and fun activities. Over the years this has led to the creation of entire online communities, which can be considered social networks uniting people who share the same passion [12]. Depending on the category a CTF belongs to, some features may be more important than others. Our study reviews the technical elements and key components of 4 open source CTF platforms, focusing on their use for educational purposes [13-15]. Towards this direction, this article addresses the following research questions:

RQ1: Which features do CTF platforms provide for presenting information regarding the included CTF challenges?
RQ2: Which features could enhance the learning curve, and how?
RQ3: Are there any missing features which could be important for supporting the learning process?
RQ4: Which potential features of the CTF platforms could enhance the gamification attribute?

RQ1 and RQ2 evaluate the options that current CTF platforms support in terms of challenge presentation and flag submission, while RQ3 and RQ4 focus on possible missing key components and possible extensions which could enhance the learning process. Towards this direction, we conducted an empirical study, using direct observation of 4 open source CTF platforms from the perspective of the facilitator and organizer. Furthermore, qualitative research was conducted, specifically an experimental study using one-on-one interviews, in order to extract evidence on the impact of each individual key component from the participants' perspective.

Noor et al. [16] conducted an evaluation of the most popular open source and online CTF platforms. By focusing on usability, their research does not delve into a holistic analysis of each platform, leaving many important aspects unclear. Raman et al. [17] also evaluated various CTF contests along with their key differences, mostly from a technical point of view. Other important research is that of Chung [18, 19], presenting the key elements of CTFd in comparison to other CTF platforms such as OpenCTF, picoCTF, TinyCTF, Mellivora, and the iCTF framework. Key differences for each platform are mentioned; however, the details are rather generic and do not include specific evaluation criteria. Similarly, Kucek and Leitner [21] present a survey and comparison of 8 open source CTF platforms, describing technical details and features of the selected platforms. Most of the above studies focus on the organizers' perspective and mostly on technical aspects, whereas our research focuses on the capability of using CTF platforms for educational purposes.

Our research provides an in-depth analysis for evaluating CTF platforms and extracting their key components as e-Learning tools in higher education, in order to provide a more complete perspective of the special characteristics, limitations and capabilities of each platform. Specifically, we evaluated 4 open source CTF platforms, using both a systematic comparative study and an experimental study based on one-on-one interviews with undergraduate computer science students. The students expressed interest after an open call for participation, providing their opinions and comments.
Towards this direction, we used open-ended questions in order to gather information from the participants' perspective about the features and key components of the selected CTF platforms, which map to specific attributes. The results of this research could help organizers or facilitators select the most suitable CTF platform for learning or training purposes and highlight potential features which could be important according to their needs.

Several commercial and online platforms also exist, such as Shelter Labs, among others. Some of these platforms could be used for hosting a CTF event; however, they require a premium account and therefore extra costs. Finally, CTF365 is a fully commercial product with a 30-day free trial. Most of the challenges presented on such platforms are usually restricted to cybersecurity topics, without providing any other educational context, and are appropriate mostly for experienced users. In contrast, open source CTF platforms can be used for deploying educational content and presenting custom challenges on specific topics which extend beyond ethical hacking and penetration testing.

The criteria for evaluating the key components of the selected CTF platforms were chosen by combining criteria from Systems and software Quality Requirements and Evaluation (SQuaRE) and ISO/IEC 25010:2011 [20], as well as criteria related to the educational perspective, using a rubric for the evaluation of e-learning tools in higher education [21, 22]. The selected criteria reflect various attributes which are affected by the platforms' features. Evaluation rubrics related to higher education have also been presented elsewhere [23]. Towards this direction, our research is not focused on the strictly technical attributes of the platforms; we therefore mapped the evaluation attributes to rubric categories which represent not only the instructors' perspective but also the participants' perspective [23]. Most of the mentioned disadvantages of the CTF platforms include, among others, objective factors such as incomplete documentation, insufficient reporting and a lack of migration tools. However, some of these factors directly affect the learning experience, while others are less important from specific perspectives. For example, the use of gamification features in virtual learning environments has been shown to have positive effects [19], and the platforms include specific features which enhance this attribute.

For conducting this research, we deployed the selected CTF platforms and extracted the features each platform provides (Fig. 1). During the deployment we added five main challenges, each comprising 5 to 12 sub-challenges. After extracting the criteria for evaluation, we conducted an experimental study using one-on-one interviews with undergraduate students of the 4th semester or higher of the Department of Informatics, Corfu, Greece. More specifically, an open request for participation was distributed to students of academic courses in information security in order to collect their perspective on each CTF platform; a total of nine (9) participants responded and were asked to provide feedback for each CTF platform. The interviews were conducted both physically and remotely using sound and screen recording, each lasting about one hour.
Informed consent was explicitly requested from the candidates and documented before the interview and recording process began, and all recording and data collection was carried out without retaining any personal information.

FBCTF. The Facebook CTF platform (FBCTF) was developed by Facebook security engineers in order to provide an easy way of organizing CTF competitions. The platform stands out for its ease of installation, the capability to host King of the Hill type competitions, its rich graphics in the form of a world map that works as a gamification element and, finally, its support for multiple languages.

CTFd. CTFd was developed for the needs of Cyber Security Awareness Worldwide (CSAW). Its ease of installation, use and customization, combined with its rich features, make it a particularly attractive choice for organizers. The platform focuses on extensibility, along with descriptive reporting tools and statistics.

Mellivora. Mellivora is a CTF platform developed in the PHP programming language. It might not be as popular as the other platforms, but its simplicity makes it a particularly attractive choice for CTF contest organizers.

Root the Box. Root the Box focuses mostly on presenting the challenges as a "box", meaning that each challenge includes minor steps that must be performed in order to complete the main challenge. Its reward system is more complex than the others: reward points are virtual credits which participants can spend to acquire extra features.

The key components of each of the selected CTF hosting platforms were identified and matched with the criteria for evaluating the platforms [22]. The results derive from the deployment and from our experience as facilitators. Since some features and attributes could not be classified strictly as either strengths or weaknesses, these have been included as comments for each platform. The criteria and the comparison may include a degree of subjectivity, and for that reason we conducted the evaluation experiment based on one-on-one interviews in order to validate our initial assumptions (Sect. 2.3).

Evaluation Criterion 01 - Functionality. This criterion concerns the extent to which the tool's operations and processes make the platform easier to use as a learning environment. Related attributes include visualization, ease of use, sufficient documentation and hypermediability. The strengths and weaknesses of each platform are presented in Table 1. For instance, the visualization attribute covers all elements which present visualized information, such as scoreboards, scenarios, a map and challenge categories, among others. Ease of use (EoU) is evaluated for both administrators and participants. Regarding EoU, Root the Box includes many complex elements which in some cases might be difficult to use and become familiar with. Table 1 highlights a distinct advantage of FBCTF, mainly because of its rich graphics and engaging environment. FBCTF maintains sufficient documentation, while CTFd was the easiest to deploy by following its documentation. For Mellivora we had to look further into setting up the localhost, as some steps were not described extensively. Root the Box was also easy to deploy using its documentation. CTFd additionally provides extensive documentation for developing extra plugins and themes, with detailed information about the platform's capabilities. Attributes related to visualization usually have a direct impact on usability. Hypermediability refers to the ability to upload hypermedia such as images, videos and other documents inside the platform; all platforms except Root the Box support uploading files.
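As an illustration of the plugin documentation mentioned above, the following is a minimal sketch of what a CTFd plugin can look like, assuming CTFd's documented convention that a plugin is a Python package exposing a load(app) function which receives the Flask application. The package name, route and page content here are hypothetical.

```python
# CTFd/plugins/hello_plugin/__init__.py  (hypothetical plugin package)
from flask import Blueprint

def load(app):
    """Entry point called by CTFd when the plugin package is discovered."""
    # A hypothetical extra page added to the platform by the plugin.
    hello = Blueprint("hello_plugin", __name__)

    @hello.route("/hello")
    def hello_page():
        return "Extra educational content could be served from a plugin like this."

    app.register_blueprint(hello)
```

Published plugins and themes follow this general pattern, which is one reason CTFd was perceived as comparatively easy to extend.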
Evaluation Criterion 02 - Extensibility. This criterion includes attributes such as ease of use, which is affected by features such as support for extra plugins and themes, among others (Table 2). Plugins and themes already exist for CTFd, and Root the Box includes some end-user themes as well. CTFd has specific advantages related to customization, providing an easy way to customize the theme through a CSS editor. Custom plugins exist for CTFd, such as a world map and a plugin for multiple-choice questions. Most of the CTF platforms are customizable and open source; however, CTFd offers an easier way to maintain changes and customizations and already has published themes and plugins. Root the Box maintains specific themes and appears extensible, offering many extra features; for example, it provides bonus challenges which participants can unlock using virtual credits, giving them the opportunity to access bonus content and extra features.

Evaluation Criterion 03 - Teaching Presence. This criterion is very important for our approach, namely the use of CTF platforms in the classroom, and includes the features which could be used to enhance the learning environment and process. More particularly, this criterion covers options which could enhance the learning process and help facilitators better present their challenges. For example, CTFd includes the option of creating and maintaining extra webpages inside the platform, featuring HTML and rich content (Table 3). Setting prerequisites for unlocking challenges can be important for students, who engage more with the learning process, and for facilitators, who can present educational content gradually. Regarding facilitation, and more specifically the options for interactive communication with the participants, FBCTF provides an announcement window which participants might miss, while CTFd provides notifications using alerts such as sound indications, popup windows and a subpage for announcements. Mellivora shows notifications on the homepage without any alerts. Root the Box provides 4-second pop-up notifications for each announcement, which are also kept on the homepage. Other attributes which affect teaching presence and the learning process include statistics, readability, filters, the option to hide or lock specific challenges, and the reward system, which is related to the gamification elements.

Evaluation Criterion 04 - Flag and Challenge Management/Submission. This criterion concerns the way the selected platforms maintain and handle the flags (Table 4); a brief illustration of typical flag matching options is sketched below. CTFd, for example, not only supports opening a challenge when a set of prerequisites is met, but a published plugin extends this option further. Regarding Root the Box, the option for an administrator to evaluate a flag submission manually is important. The MVP (Most Valuable Player) indication on the scoreboard was also considered a benefit of Root the Box for increasing competitiveness.

Evaluation Criterion 05 - Social Presence. This criterion relates to features such as integration of the scoreboard with online communities and features for identifying and authenticating the participants (Table 5). Moreover, it relates to the popularity of each platform and its ability to be socially identified.
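The sketch referenced under Criterion 04 follows. It is a generic, hypothetical illustration of static, regular-expression and multi-flag checking, not the actual implementation of any of the evaluated platforms; the spec format is invented for this example.

```python
import re

def check_flag(submission: str, flag_specs: list) -> bool:
    """Return True if the submission satisfies any of the configured flags.

    Each spec is a hypothetical dict such as:
      {"type": "static", "value": "CTF{example}", "case_insensitive": True}
      {"type": "regex",  "value": r"CTF\{user_\d{4}\}"}
    """
    candidate = submission.strip()
    for spec in flag_specs:
        if spec["type"] == "static":
            expected = spec["value"]
            if spec.get("case_insensitive"):
                if candidate.lower() == expected.lower():
                    return True
            elif candidate == expected:
                return True
        elif spec["type"] == "regex":
            # fullmatch avoids accepting a flag embedded in extra text
            if re.fullmatch(spec["value"], candidate):
                return True
    return False

# A challenge accepting either of two flags, one defined as a regular expression
specs = [
    {"type": "static", "value": "CTF{stat1c_flag}", "case_insensitive": True},
    {"type": "regex", "value": r"CTF\{team_[0-9a-f]{8}\}"},
]
print(check_flag("ctf{STAT1C_FLAG}", specs))    # True
print(check_flag("CTF{team_deadbeef}", specs))  # True
```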
Evaluation Criterion 06 - Sustainability. Sustainability includes features such as licensing and the system requirements for maintaining the platform, as well as its overall social presence (Table 6). FBCTF requires quite a lot of system resources, while CTFd, Root the Box and Mellivora are lighter environments. In particular, Mellivora is appropriate for low-resource systems or for conducting large-scale competitions that would otherwise increase the demand for resources. FBCTF, CTFd and Mellivora are frequently used at events (especially CTFd and Mellivora), while Root the Box is not very popular, maintaining a low presence.

Evaluation Criterion 07 - Portability. This criterion relates to features concerning compatibility with various screen resolutions, responsiveness and options for offline access (Table 7). CTFd is highly compatible and responsive out of the box, retaining all of its functionality.

The selected CTF platforms were presented to the participants, who were asked questions regarding their opinion on each of them during the presentation. The attributes which are affected by the platform features and were used for the evaluation are presented in Table 8.

Table 8. Evaluated attributes and their explanations:
Scoreboards: The amount of information that the scoreboards provide. Visuals might affect this attribute.
Sense of control: The ability of the participant to understand which challenge is next and to monitor overall progress.
Readability: The ability of the platform to present clear, understandable and complete information regarding the challenges to the participants.
Reward system: The options which the platforms provide for rewarding the participants.
Structure: Taxonomies, filters and every feature which ensures a good structure of the various CTF challenges; important when the number of challenges is large.
Socializing: Features which establish a good connection between the facilitator and the team members.
Storytelling elements: How the platform itself can enhance the presentation of storytelling elements of the CTF challenges.
Hypermedia support: The ability to maintain content such as images, video, documents and other files.
Flag submission: The options which the platforms provide for creating a flag; for example, regular expressions or multiple flags per challenge.
Extensibility: The ability of the platform to be extended and whether extensions such as themes or extra plugins already exist.
Educational acceptance: The suitability of the CTF platform as an educational tool.
Events: Which of the CTF platforms is better suited for creating short-term or long-term events, and how.
Total acceptance: The overall acceptance of, and feedback on, the platforms.

All selected attributes were rated as important by the participants (mean values higher than 3.6/5 and most of them higher than 4/5). At first, all participants expressed high acceptance of FBCTF, since the visuals and immersion of this platform are promising. However, some of the other platforms (CTFd and Root the Box) were later distinguished as more appropriate for educational purposes (Fig. 2). For each attribute the participants were asked to provide a score for its importance (Fig. 2) and a score for each platform. The results, presented in Fig. 2 and Fig. 3, refine and strengthen our assumptions regarding each platform.
The attributes in the two figures differ slightly, but the similarities are easy to distinguish (Fig. 3 presents the scores from our own perspective). Through this approach we were able to identify the psychological and personal characteristics behind the participants' opinions and to define which elements are important for each participant.

CTFd was already designed with the educational perspective in mind. The ability to create dependencies between challenges is important for facilitators who want to present challenges in a linear sequence or conditionally. Root the Box maintains a very extensive reporting system, which is very important for facilitators and educators. Moreover, the reward system of Root the Box enhances the gamification elements and promises high engagement levels for competitive players. However, most participants found Root the Box somewhat complex and difficult for beginners to understand and use. Mellivora seems the most appropriate choice when a simple design and, especially, high performance with minimal hardware resources are required. More specifically, Mellivora is designed with a combination of methods and tools that allow it to host very large competitions with minimal hardware while remaining extremely stable and fast. FBCTF is recommended for competitions in which organizers want to introduce strong gamification elements in order to increase the students' engagement and active participation. Since CTFd offers better scoreboards and result graphs, and especially team-based statistics, it is a more attractive platform for facilitators. Based on the above, we can confirm that Root the Box and CTFd are the most suitable for educational purposes, while FBCTF is suitable for conducting CTF competitions as an event. Finally, Mellivora is suitable when system resources are limited.

The key components which the participants recognized as very important were the following:

Visuals and Immersion. Participants mentioned the importance of visuals and rich graphics in their first impression of the platforms. User experience is also affected by such attributes, and most participants mentioned that FBCTF was the most appealing, although a bit complex. Root the Box was also mentioned as having high visual complexity, while CTFd was described as an easy way to engage beginners, with participants noting that customization options such as custom themes would be very important. Mellivora was underrated and criticized for not presenting rich graphical elements.

Sense of Control. This attribute was mentioned as important for knowing one's progress and understanding what to do next. To this extent, it is important to mention that participants are usually discouraged if they cannot make any significant progress. Finally, ease of use and user experience seem to be highly affected by this attribute.

Hypermedia. Participants mentioned the importance of hypermedia for enhancing the storytelling elements and for engaging more with an enhanced, gamified version of the challenges.

Capabilities to Support Events. For conducting events, participants mentioned the importance of presenting the live scoreboard on a large screen during the event. They highlighted the importance of conducting events in order to engage newcomers. For hosting events, FBCTF was approved as the most appropriate platform because of the highly immersive environment it provides.
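Since live scoreboards came up both for event hosting above and as a self-evaluation aid below, the following is a minimal, hypothetical sketch of how standings can be derived from solve records. The field names and the tie-breaking rule (earlier final solve wins) are illustrative assumptions, not the behavior of any specific platform.

```python
from collections import defaultdict

# Hypothetical solve records: (team, challenge, points, solve_time in seconds)
solves = [
    ("alpha", "crypto-1", 100, 120),
    ("bravo", "crypto-1", 100, 300),
    ("alpha", "web-1", 200, 900),
    ("bravo", "forensics-1", 250, 650),
]

def standings(solves):
    """Aggregate points per team and order the scoreboard."""
    totals = defaultdict(lambda: {"points": 0, "last_solve": 0})
    for team, _challenge, points, solve_time in solves:
        totals[team]["points"] += points
        totals[team]["last_solve"] = max(totals[team]["last_solve"], solve_time)
    # Higher points first; ties broken by whoever reached their score earlier.
    return sorted(totals.items(),
                  key=lambda item: (-item[1]["points"], item[1]["last_solve"]))

for rank, (team, score) in enumerate(standings(solves), start=1):
    print(rank, team, score["points"])
# 1 bravo 350
# 2 alpha 300
```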
Scoreboards. Participants recognized the importance of scoreboards, since scoreboards can increase competitiveness and provide useful information regarding the progress of each team. Furthermore, the participants mentioned the importance of the scoreboard as a self-evaluation tool and as a way for facilitators to monitor each team or participant. Competitive players mentioned that they would use information from the scoreboards to judge the difficulty of a specific challenge. The scoreboard was therefore identified as very important from the participants' perspective and as an especially motivating element for competitive players.

Reward System. Rewards were identified by the participants as a way of increasing motivation and competence. The option for participants to unlock hidden challenges using their reward points was mentioned as an interesting feature; a small sketch of this unlock mechanic is given at the end of this section. In this direction, many participants mentioned the possibility of adding extra content or hidden challenges as bonus challenges in order to increase their engagement.

Personalization. Most participants mentioned that personalization is important for enhancing the storytelling elements. The appropriate use of themes, colors and content could therefore improve the process of embedding storytelling elements related to the challenges.

Flag Submission Options. Participants mentioned that flag submission should be easy. However, one participant noted that it is important to pay attention to the details and to provide the flag in exactly the right form. Support for multiple flags and for regular expressions could be helpful, as could the flag validation tool that Root the Box provides.

Storytelling Elements. Storytelling elements were unexpectedly mentioned as an important feature by the participants. They described this as a very engaging attribute and a motivation to finish the challenges. However, some participants noted that this attribute is mostly associated with games and could be distracting for people who are not interested in that perspective.

Structure. CTF challenges often suffer from a lack of structure, meaning that each challenge is separate from the others, without a distinct categorization or taxonomy. Most participants expressed a preference for a structured presentation of the challenges in order to enhance the learning process. Moreover, for educational purposes it is best to split a main challenge into smaller sub-challenges so that participants can proceed gradually. Finally, the ability to maintain well-structured challenges is important when the number of challenges is large.

Educational Appropriateness. Participants found that the use of CTF platforms and challenges would be very interesting for educational purposes, especially for beginners and people who are not very familiar with IT topics. CTFd was the most approved for making it easy for beginners to engage quickly and for presenting the challenges in a clear and readable way.
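As referenced in the Reward System paragraph above, the following is a small, hypothetical sketch of the unlock-with-credits idea (reward points spent to reveal bonus challenges or hints). It illustrates the mechanic in general terms and is not Root the Box's actual implementation.

```python
class RewardWallet:
    """Tracks a team's reward points and what they have unlocked (illustrative)."""

    def __init__(self):
        self.credits = 0
        self.unlocked = set()

    def award(self, points: int):
        """Credit points earned from a correct flag submission."""
        self.credits += points

    def unlock(self, item: str, cost: int) -> bool:
        """Spend credits on a bonus challenge or hint; return True on success."""
        if item in self.unlocked:
            return True
        if self.credits < cost:
            return False
        self.credits -= cost
        self.unlocked.add(item)
        return True

# Example: a team earns 300 points and spends 150 to reveal a bonus challenge.
team = RewardWallet()
team.award(300)
print(team.unlock("bonus-steganography", cost=150))  # True
print(team.credits)                                  # 150
print(team.unlock("hint:web-2", cost=200))           # False, not enough credits
```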
The main purpose of this study was to compare four popular open source CTF platforms as possible learning platforms. To investigate all aspects of the CTF platforms, a comparative study was conducted highlighting the distinct features of each platform, which allowed us to draw conclusions about the advantages and disadvantages of each one. Given that each platform has different features and characteristics, it turns out to be quite difficult for organizers to choose the most appropriate platform, depending on the purpose and the audience. To this end, a number of one-on-one interviews refined our assumptions, providing important information regarding the use of CTF platforms for learning purposes. Extra features which could improve the platforms were discussed as well. In our case, we tried to identify the most suitable platform for setting up a hands-on lab at the Ionian University, Corfu, Greece, and to highlight CTF challenges as a complementary learning method. For learning purposes, CTFd scored the highest on the criterion of teaching presence.

Future work includes the creation of custom CTF challenges focusing on the learning perspective and on presenting extensive educational content. Towards this direction, specific features could be updated or extended in order to provide enhanced gamification elements, quizzes and evaluation processes. An important next step would be to embed storytelling elements in order to explore and evaluate the potential of using CTF platforms and customized CTF challenges for learning purposes, not only in cybersecurity but also in related topics such as user privacy and privacy-aware data governance, capitalizing on the results of related projects such as DEFeND [24].

References
1. Game based cyber security training: are serious games suitable for cyber security training?
2. NIZKCTF: a noninteractive zero-knowledge capture-the-flag platform
3. Measuring the human factor of cyber security
4. The fun and future of CTF
5. Capture the flag as cyber security introduction
6. A CTF-based approach in information security education: an extracurricular activity in teaching students at Altai State University
7. Innovative approaches to building comprehensive talent pipelines: helping to grow a strong and diverse professional workforce
8. Learning cyber security through gamification
9. Automatic problem generation for capture-the-flag competitions
10. PicoCTF: a game-based computer security competition for high school students
11. Gamification for teaching and learning computer security in higher education
12. Hacking competitions and their untapped potential for security education
13. Capture-the-flag: learning computer security under fire
14. Gamifying education and research on ICS security: design, implementation and results of S3
15. Using capture-the-flag to enhance the effectiveness of cybersecurity education
16. Usability evaluation of open source and online capture the flag platforms
17. Framework for evaluating Capture The Flag (CTF) security competitions
18. Learning obstacles in the capture the flag model
19. Live lesson: lowering the barriers to capture the flag administration and participation
20. Software sustainability characteristic for software development towards long living software
21. An empirical survey of functions and configurations of open source capture the Flag (CTF) environments
22. A technological acceptance of e-learning tools used in practical and laboratory teaching, according to the European higher education area
23. Comparison of requirement prioritization techniques to find best prioritization technique
24. DEFeND architecture: a privacy by design platform for GDPR compliance

Acknowledgements.
This project has received funding from the GSRT for the European Union's Horizon 2020 research and innovation programme DEFeND under grant agreement No 787068.