title: Development of a novel hybrid cognitive model validation framework for implementation under COVID-19 restrictions
authors: Stone, Paul B.; Nelson, Hailey Marie; Fendley, Mary E.; Ganapathy, Subhashini
date: 2021-05-12
journal: Hum Factors Ergon Manuf
DOI: 10.1002/hfm.20904

The purpose of this study was to develop a method for the validation of cognitive models consistent with the remote working situation arising from the COVID-19 restrictions in place in Spring 2020. We propose a framework for structuring validation tasks and applying a scoring system to determine initial model validity. We infer an objective validity level for cognitive models requiring no in-person observations and minimal reliance on remote usability and observational studies. This approach was derived from the necessity of the COVID-19 response; however, we believe it can also lower costs and reduce timelines to initial validation in post-COVID-19 studies, enabling faster progress in the development of cognitive engineering systems. A three-stage hybrid validation framework was developed based on existing validation methods and adapted to comply with the specific limitations derived from COVID-19 response restrictions. This validation method combines elements of argument-based validation with a cognitive walkthrough analysis and reflexivity assessments. We conducted a case study of the proposed framework on a developmental cognitive model of cardiovascular surgery to demonstrate its application to a real-world validation task. The framework can be implemented easily and quickly by a small research team and provides a structured validation method to increase confidence in assumptions as well as to provide evidence supporting validity claims in the early stages of model development.

The recent outbreak of COVID-19 (SARS-CoV-2) at the beginning of 2020 has caused 49.7 million individuals to become infected and 1.2 million fatalities globally as of November 2020 (World Health Organization, 2020). The outbreak has prompted government intervention through widespread shutdown of nonessential businesses and services, implementation of social distancing guidance, and reallocation of resources and funds to better assist with viral mitigation and containment efforts (Ashraf, 2020). In addition to social and institutional shutdowns, economic downturn has ensued due to loss of funding, lack of consumer spending, and uncertainty around the return to pre-COVID-19 normalcy. The ramifications have not only impacted the global economy but have also had a significant effect on the research community within public and private institutions (Ashraf, 2020).

The American Journal of Emergency Medicine has outlined specific focus areas and guidelines for research during the COVID-19 pandemic (Haleem et al., 2020). Personal protective equipment has been redistributed to those fighting the virus on the frontlines, participants and researchers have been prohibited from taking part in in-person research (unless pertaining directly to the COVID-19 response), and research staff have been reassigned to staggered schedules or remote work to reduce face-to-face interaction and to maintain appropriate social distancing guidelines.
Pertinent research in the healthcare field has also been largely suspended in an attempt to allocate as many physical and financial resources as possible to fighting the virus and predicting future viral outcomes. The face-to-face work of human factors researchers has been hit particularly hard by these restrictions, as it historically requires extensive human interaction to elicit information regarding cognitive and decision-making processes (Sy et al., 2020). Much of this work requires structured in-person studies utilizing observational and probing techniques. Given that many current studies in human factors research are not related to battling COVID-19, they are not considered essential practices. In the case of healthcare, this problem is compounded: subject matter experts (SMEs), especially those on the frontlines, are not readily available to participate in related studies, and non-essential research personnel are restricted from healthcare facilities, necessitating that test and evaluation procedures move online or remote (Sy et al., 2020).

Cognitive models are widely used in psychology and cognitive engineering to understand where errors are made or how training systems can be developed to reinforce a user's cognitive model. They are a means for researchers to understand, describe, and predict how individuals or teams perform cognitive tasks, such as information processing and decision making, and can highlight relevant cognitive states and actions (Rupp & Leighton, 2016). Wagenaar et al. (1990) suggest that specific errors are associated with specific cognitive states, and that knowledge of the cognitive model can be used to reduce errors and improve decision making. Hayes-Roth and Hayes-Roth (1979) developed a cognitive model to understand the nature of planning activity, from the apparently rational to the apparently chaotic decisions made when individuals develop plans. The model allows the researcher to abstract the types of knowledge and decisions made in planning, to parse top-level thinking, and to enable simulation of the underlying process. Similarly, the recognition-primed decision (RPD) model (Klein, 1993) examines the more abstract concept of naturalistic decision making. This model allows the researcher to relate outcomes of testing to more intuitive decision types, which in turn can be used to confirm or reject an assertion about the decision making being employed. The application of cognitive models is addressed by Belkin (1984), who concludes that it is vital to understand the user's problem to build a cognitive model. This explicit representation allows researchers to use computational techniques to quantify human cognitive performance through simulation (Cooper et al., 1996).

For cognitive modeling to deliver its full potential benefit, it is important to ensure sound methodological principles are used so that output is repeatable and models produce valid predictions. There is a need for evidence in simulation and ergonomics science, and studies of validity can provide this, giving additional credibility to the associated models (David, 2013; Stanton, 2016). Validation can improve confidence in the methods used by human factors engineers and is an important step in the modeling of a system (Annett, 2002; Stanton & Young, 1999; Stanton, 2016). We considered the questions asked by Landry et al. (1983) in determining our approach to validation: what does it mean for a model to be valid,
and does validity refer to the output, structure, or modeling process? Annett (2002) argues that, in ergonomic models in particular, it is essential to the validity of a model that its performance is consistent and predictive in nature. It is important to note that cognitive models can be complex, and a validation method that works for one model may not work for another (Strube, 2001). Keehner et al. (2017) discuss the general approach to the validation of cognitive models, highlighting the requirement for an iterative, staged process. The goal of any validation method should be to support claims and validity arguments about the specific models to which it is applied (Keehner et al., 2017). Similarly, Kane (2013) argues that validation is an ongoing and iterative process, and we believe a structured validation framework could have utility in the early stages of research where gathering appropriate resources is difficult or not cost-effective.

In this study we define validation as determining whether the real system is represented by the model (Law, 1991). This is achieved by mapping the system capabilities to the model representation. More robust, quantitative validation procedures may be required as models are developed, but here we seek to determine whether the model meets the basic requirement of representing the underlying cognitive processes. For this study, the safety-critical nature of cardiovascular surgery makes validation key to future implementation of the proposed model and heightens the need to find alternative means of cognitive model validation to enable progress under COVID-19 restrictions.

This paper attempts to address the question: how can we validate a cognitive model in the age of COVID-19 and remote testing? We outline a streamlined validation process and explain how we adapted existing thinking, while developing an understanding of the evolving COVID-19 situation, and innovated new approaches to cognitive model validation. We aim to advance a simple, structured framework for cognitive model validation requiring no direct contact and minimal reliance on remote usability and observational studies. While this has been borne out of the necessity of the COVID-19 response, we believe a hybrid validation framework for cognitive modeling can have broader application to support cognitive systems engineering. We believe this approach can lower costs and reduce timelines to initial validation and could allow identification of problems early in model development, potentially preventing problems further downstream.

Our case study focuses on the validation of a cognitive model of cardiovascular surgery, given the restrictions associated with COVID-19. The intended output of this study is a Hybrid Cognitive Validation Framework that provides human factors researchers with a means to expedite initial validation with minimal resources. We propose a validation process based on analysis of existing literature, a reflexivity cross-check, and a cognitive walkthrough within the research team. A case study implementation of the validation framework is presented to demonstrate application and provide example output from the framework. This analysis was contingent on the availability of a candidate cognitive model developed before the COVID-19 restrictions. Although these restrictions may impact the development of cognitive models, this study focuses exclusively on the validation of cognitive models in this context.
The scoring system used in this study is preliminary and is intended for framework confirmation through implementation of the framework and associated feedback on scoring usefulness and accuracy. In line with the aims and objectives of this study, we used the following process to develop and conduct a test implementation of the Hybrid Cognitive Validation Framework.

COVID-19 restrictions are not uniform among countries, states, and even cities and localities (Hale et al., 2020). As authorities balance the need for public safety with the desire to maintain economic activity, these regulations also vary over time. We therefore define the specific conditions with which this validation framework is compatible. In addition to representing COVID-19 restrictions, these conditions also represent future situations under which the Hybrid Cognitive Validation Framework can be an effective tool for research teams. The restrictions used in this study are:

• The validation framework can be implemented by a research team consisting of a minimum of two individuals, to satisfy the reflexivity analysis requirements (Davies & Dodd, 2002) and to collaborate and ensure checks and balances on the validation.
• No requirement for in-person intra-team meetings.
• No in-person contact with external SMEs during the validation process.
• Communication with SMEs is limited to confirmatory questioning: no probing or enhanced analysis, due to an assumed lack of availability.
• The research team has access to validation resources, such as those available online.

Validity has traditionally been classified into four types: predictive, concurrent, construct, and content validity (American Psychological Association, 1954). Predictive and concurrent methods are considered together as criterion validity and refer to the ability of a test to accurately predict, either in advance or concurrently, a predetermined measure or characteristic (Cronbach & Meehl, 1955). By contrast, construct validity (American Psychological Association, 1954) aims to determine whether a test measures the underlying concept it aims to address (Middleton, 2019). Finally, there is content validity, which is established by demonstrating that test subjects are representative of the population of interest (Cronbach & Meehl, 1955). In this paper, we concentrate on construct and content validity, ensuring the model and its inputs are representative, rather than on comparison of model outputs to known standards. These measures relate to the internal validity of the model; given the potential complexity of the assessment, we do not focus on external validity.

Before COVID-19 restrictions, the proposed validation procedure for the cognitive model of cardiovascular surgery was to implement a method used by Craig et al. (2012) to validate a cognitive model of laparoscopic surgery. This method comprises three validation stages: data collection through SME interviews; construct encoding and comparison; and a reflexivity phase, including reassessment by additional participants who did not take part in the initial assessment. It is a modified version of the evidence collection and model validation approach developed in the knowledge audit method (Militello & Hutton, 1998). The process requires multiple data elicitation procedures and cross-comparison with SMEs, with explicit procedures aimed at minimizing researcher bias. The restrictions adopted in this study exclude face-to-face, concurrently gathered validity measures and drive us toward remote, asynchronous measures, which, although easier to collect, make it somewhat harder to infer the target cognitive processes (Embretson, 1983).
We considered several approaches to establish construct and content validity for the HCOG framework. Thoroman et al. (2019) use interview data as a reference standard to evaluate the validity of a near-miss reporting form. In this study, we adopt a similar approach, utilizing the cognitive walkthrough as our reference standard for empirical validity assessment of the HCOG model. Silva et al. (2020) develop and validate a descriptive cognitive model for predicting usability issues in a low-code development platform. Stanton and Baber (2005) validate the Task Analysis for Error Identification technique, demonstrating improved performance compared with heuristic evaluation and building on the approach of Stanton and Stevenage (1998); this work showed good reliability and concurrent validity for the Task Analysis for Error Identification technique. Cornelissen et al. (2014) consider the validation of a formative method, concentrating on cognitive work analysis and noting the importance of pooling the results of multiple analysts in establishing validity. This reinforces the requirement for multiple researchers to conduct our validation framework. Vinod et al. (2016) utilize Markovian modeling to represent humans and build task simulations to compare outcomes with potential human action. This was initially considered a strong option to provide an objective, quantitative basis for evidence generation in the Hybrid Cognitive Validation Framework; however, the complexity of this approach and the expert nature of surgeons meant it was discounted.

The argument-based approach to validation (Kane, 2013) aims to minimize complexity in the validation process while still evaluating claims and providing evidence to support them. The argument-based approach builds on early construct validity models (Cronbach & Meehl, 1955) and details three general principles for validity:

• The focus of the validation is on the interpretation of the output rather than the output itself.
• Validation is part of an ongoing research program.
• The proposed interpretation of the output is subject to critical evaluation.

The argument-based validation framework (Kane, 2006, 2013) also requires two argument types to support validity: an interpretive/use argument (IUA) and a validity argument. The IUA specifies the claims that are to be evaluated in the validation, while the validity argument is used to evaluate the interpretation of validation scoring. The argument-based method claims that if these arguments are clear, coherent, and complete, and the inferences reasonable and the assumptions plausible, a model can be said to be valid. We use these definitions as the basis for the argument-based elements of our validation framework.

The cognitive walkthrough is a structured review method for conducting usability assessments early in the design cycle of a product (Lewis et al., 1990). It involves the generation of task scenarios and explicit assumptions regarding the user population and context of use. This method was initially developed to assess the usability of user interfaces, but we believe it can be adapted to establish the construct validity of models. The method requires definition of the user along with sample tasks or scenarios and action sequences that are compared with an implementation of the user interface (Lewis et al., 1990). In adapting this method to the validation of cognitive models, we establish sample tasks and incorporate them into credible vignettes or scenarios based on the implementation associated with the cognitive model.
We ask the reviewer to determine whether the cognitive paths and states in the model are representative of the decisions associated with the tasks in example scenarios, and to walk through the scenario task by task, comparing it with the cognitive model of cardiovascular surgery. The intended use of the model of cardiovascular surgery is to predict the surgeon's workload through indirect assessment of cognitive state, linking the paths taken through the model to periods in a procedure where workload is high or low. Ideally, a method to evaluate the concurrent validity of the predictions made by the model would be implemented, but given the restrictions due to COVID-19, this is not possible. We aim instead to demonstrate construct validity and content validity. Specifically, we aim to establish the ability of the model to represent the underlying concept (construct validity) and the representativeness of the model to the target population, in this case cardiovascular surgeons.

For the purposes of this study, we utilize existing cognitive models in the decision ladder and the RPD model. These models have been widely used and have been subject to validation (Lintern, 2010; Rasmussen, 1974; Soh, 2007). Rather than focus on internal validation of the model structure, this study focuses on developing a validation approach to answer the question "Is a specific model representative of the cognitive tasks for which it is built?" We therefore propose a hybrid of the cognitive walkthrough (Lewis et al., 1990) and the argument-based validation method (Kane, 2006) to establish construct validity, augmented with reflexivity analysis (Davies & Dodd, 2002) to establish content validity. We believe that including both the walkthrough analysis and argument-based methods provides a broad but flexible and adaptable basis for the validation of cognitive models, one that integrates into early iterations of model development and can be used to derive requirements for more complex assessments as well as to provide evidence for model validity.

This concept was developed into a specific validation framework detailing the tasks required to establish both construct and content validity. The tasks are representative of those in the donor validation methods, with surrogate tasks highlighted where the restrictions of this study limited the scope of initial application. The expected outputs and interpretations are also defined. The resultant framework was implemented in a case-study validation of a cognitive model of cardiovascular surgery. The wider context for the development of this cognitive model is the development of a decision support system (DSS) to improve performance and reduce risk in cardiovascular surgery; Woods (1985) proposes the design of joint human-machine cognitive systems as a basis for such support. Task 2 of the framework is a representation assessment in which scenarios, or vignettes, representative of the use case are developed, and Subject-Matter Experts utilize these vignettes, with reference to the cognitive model, to determine how representative the states and pathways are of the vignette requirements. Rather than requiring SMEs to record outcomes at each decision point and attempting to encode potentially incomplete or inaccurate data, we propose a four-level qualitative scoring system to enable a simplified assessment. Task 3 is an implementation of the reflexivity analysis (Davies & Dodd, 2002). This stage aims to ensure that model validation procedures and assessments are supported by suitable evidence and that the SMEs used are qualified and documented. Clarity, coherence, and completeness are key to all tasks in this framework, in line with the requirements defined by Kane (2013).
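To make the comparison step concrete, the minimal sketch below (Python; not part of the published framework) shows one way a research team might encode a vignette as a sequence of tasks with expected model paths and flag where a reviewer's recorded path diverges. The step descriptions, path labels, and decision point name are illustrative assumptions only.

```python
# Minimal sketch of encoding a walkthrough vignette for review.
# All task descriptions, path labels (P1, P2, P3) and the decision point (D1)
# are illustrative assumptions, not values prescribed by the framework.
from dataclasses import dataclass

@dataclass
class WalkthroughStep:
    description: str         # task or decision point presented in the vignette
    expected_path: str       # path through the cognitive model asserted by the modeller
    observed_path: str = ""  # path recorded by the reviewer during the walkthrough

def mismatches(steps):
    """Return the steps where the reviewer's recorded path disagrees with the model."""
    return [s for s in steps if s.observed_path and s.observed_path != s.expected_path]

# Example vignette: a routine procedure with one unexpected complication.
scenario = [
    WalkthroughStep("Diagnosis and procedure planning", "P1", "P1"),
    WalkthroughStep("Patient preparation and sedation", "P1", "P1"),
    WalkthroughStep("Assessment of vascular geometry (decision point D1)", "P2", "P3"),
]

for step in mismatches(scenario):
    print(f"Mismatch at '{step.description}': expected {step.expected_path}, "
          f"reviewer recorded {step.observed_path}")
```

A record of this kind gives the research team a simple, auditable basis for the four-level qualitative scoring described above without requiring SMEs to encode outcomes at every decision point.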
The resultant Hybrid Cognitive Validation Framework is detailed in Table 1. The reviewers identified to conduct the cognitive walkthrough are provided with the following instructions on the implementation of the method:

• This is a walkthrough validation of the states and paths detailed in the cognitive model of cardiovascular surgery.
• The aim is to establish validity through comparison with the tasks and decision points in the scenario, in line with the validation framework (Table 1).
• Conduct a task-by-task walkthrough of the validation scenario (Tables 2 and 3) with reference to the cognitive task matrix (Table 4).
• For each task identified in the walkthrough scenario, record the paths taken through the cognitive model and identify where decisions, cognitive states, and paths do not match.

These instructions are designed to be sent electronically and can be followed up with a discussion with the reviewer to clarify any elements of the walkthrough task. The research team can then utilize the results of the walkthrough analysis, in conjunction with the validation interpretation framework (Table 5), to assign validity scores to the model. This is an initial implementation of the walkthrough analysis method to support cognitive model validation, and it is expected that the guidance for reviewers will be expanded based on the feedback received during this study.

The Hybrid Cognitive Validation Framework (Table 1) should be used as part of an iterative, scalable validation process with tasks completed sequentially. The model development stage is not scored but is included to indicate the chronology of development within the validation process. The argument-based analysis, walkthrough analysis, and reflexivity analysis tasks detailed in Table 1 each have two sub-tasks with a potential for 5 points. The success criteria, or Validation (V), element scores (0-3 points) are summed with the Objectivity (O) scores (0-2 points) to give an overall Validation Framework (VF) score (0-5 points) per sub-task. This gives a potential score of 5 points for each of the six sub-tasks and a total of 30 points for each implementation of the validation framework, with 18 points attributable to validation success criteria and 12 points to objectivity scoring. The inferences derived from this scoring are variable, dependent on the validation task, as defined in the argument-based validation task (Kane, 2013). For a model to be considered "good," some objectivity analysis should be undertaken. For this reason, the threshold for "good" is set above 18 out of 30, so that even if a model achieves 18/18 on the validation criteria element, the interpretation threshold requires a score above this to ensure that some objectivity analysis has been completed.
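As a minimal illustration of this arithmetic, and under the assumption that the six scored sub-tasks are labelled 2A through 4B (the labels and function names below are ours, not part of the framework), the VF score and the "good" threshold could be computed as follows:

```python
# Minimal sketch of the Validation Framework (VF) scoring arithmetic described
# above; the sub-task identifiers are assumed placeholders for the six scored
# sub-tasks defined in Table 1.
SUBTASKS = ["2A", "2B", "3A", "3B", "4A", "4B"]

def vf_score(validation, objectivity):
    """Sum success-criteria (V, 0-3) and objectivity (O, 0-2) scores over sub-tasks."""
    assert all(0 <= validation[s] <= 3 for s in SUBTASKS), "V scores range 0-3"
    assert all(0 <= objectivity[s] <= 2 for s in SUBTASKS), "O scores range 0-2"
    v_total = sum(validation[s] for s in SUBTASKS)    # maximum 18
    o_total = sum(objectivity[s] for s in SUBTASKS)   # maximum 12
    return v_total, o_total, v_total + o_total        # overall maximum 30

def meets_good_threshold(total_vf):
    # A "good" rating requires more than 18/30, so that some objectivity
    # evidence is always needed even with perfect success-criteria scores.
    return total_vf > 18
```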
A unified model representing naturalistic, analytical, and mixed decision types was synthesized by modifying and combining existing cognitive models.

The RPD model (Klein, 1993, 1999) is used to understand how people make quick decisions in complex situations, particularly in expert domains. The model is derived from research into intuitive decision making and assumes the use of prior knowledge and pattern recognition to make decisions. There are two key elements in the RPD model: first, the way a decision maker assesses a situation and recognizes a suitable course of action, and second, how a course of action is imagined and potential outcomes evaluated. Both of these elements are dependent on the ability to recognize both features of the situation and corresponding actions (Klein, 1993). This model is more typical of the advanced or expert decision maker, as higher situational awareness and the ability to predict outcomes based on experience enable this type of nuanced, heuristic decision making (Klein, 1993).

The decision ladder model (Rasmussen, 1974) is representative of both the analytical decision-making paradigm and the heuristic, intuitive paradigm. In this model, the rational decision process follows the outer path of the "ladder," whereas the heuristic decision process may start and finish anywhere in the model, appearing as "shortcuts" (Rasmussen, 1974). These models were combined with a single start point, as in the Complex RPD Strategy Model (Klein, 1993, 1999).

The following case study implements the three tasks defined in the Hybrid Cognitive Validation Framework (see Table 1). These tasks are implemented in the validation of a developmental cognitive model. The scores for each task are detailed in Tables 6 and 7. We defined a five-level interpretation hierarchy with inferences attributable to both the validity and reflexivity scores. The validation score interpretation framework is shown in Table 5.

6.2 | Task 2 - Representation assessment (cognitive walkthrough analysis)

Scenarios representative of the use case were developed to enable the cognitive walkthrough. The scenario includes the following defined complications:

• The fluoroscopic imaging of the patient reveals a narrower than expected vascular structure for a man this size, and the initial expectations on catheter size and shape are violated.
• Catheter maneuver complications: the catheter maneuver is unsuccessful due to the narrowed vascular geometry.

The cognitive walkthrough of this scenario, detailed in Table 3, proceeds step by step as follows:

Step 1: A 58-year-old male has been diagnosed with a narrowing of a coronary artery and has been scheduled for a stent fitting to correct the condition. The patient characteristics are within the experience of the cardiovascular surgeon and the planning elements are routine. This follows path P1 in the cognitive model and is representative of intuitive decision making.
Step 2: The preparation is conducted by the anesthetist and the patient responds as expected. This follows path P1.
Step 3: The vascular geometry is assessed, creating a decision point at D1. The vascular geometry is found to be narrower than expected, in line with the scenario definition, resulting in the surgeon using their experience and associated heuristics to reselect an appropriate catheter based on this new information. This is a mixed intuitive-analytical decision, following path P2.
Step 4: A suitable insertion point is easily found, but the catheter insertion task is somewhat harder than expected. The surgeon intuitively corrects and quickly achieves a successful insertion without requiring consultation with other team members. This is within the bounds of the recognition-primed decision model, not requiring analytical heuristics, so it still follows path P1.
Step 5: The maneuver of the catheter to the procedure site is unsuccessful, in line with the scenario definition. The surgeon corrects the position and tries again but is still unsuccessful. Expectations are violated and path P3 is adopted. The surgeon may need to consult external decision support and generate multiple options, such as a different catheter or entry point. These options are evaluated through a value judgement at the analytical level of the model.
To ensure the scenarios generated for the cognitive walkthrough were representative, a cognitive task matrix (Table 4) was developed for generic cardiovascular surgery scenarios, mapping sub-tasks to the cognitive states and decision points identified in the model in Figure 1. For each sub-task, the matrix gives an intuitive, a mixed, and an analytical example, for instance:

• Intuitive example: the combination of patient characteristics, procedure type, and complexity is familiar. Mixed example: the patient and procedure type are largely familiar, but some anomalies are discovered, within situations governed by analytical heuristics. Analytical example: the patient characteristics and/or procedure type are unfamiliar and additional cognitive resources are assigned to develop an analytical solution.
• Intuitive example: the patient responds to sedative and initial preparation as expected. Mixed example: some anomalous response, but within experiential reference, and the patient responds as expected. Analytical example: the patient response is not as expected, and initial remedial measures are unsuccessful.

The validation score interpretation framework (Table 5) defines the interpretation levels, including:

• High overall validity (total VF score ≥25 of a maximum 30; interpretation score ≥7; reflexivity score ≥7; minimum success criteria score ≥2; minimum objectivity criteria score ≥1): The model can be said to have high validity for an early developmental model and is suitable for implementation in initial research studies only; model validation should continue as part of an ongoing, iterative design process. The model has high validity scores across all validation tasks. There is good evidence that the underlying assumptions are valid, data collection techniques are sound, and researcher bias has been addressed through reflexivity assessment. The model has been demonstrated to be representative, the inferences of the interpretation framework are reasonable, the assumptions are plausible, and the definitions are clear, coherent, and complete.
• Poor reflexivity (total VF score ≥19 of 30; interpretation score ≥6 (60%); representation score ≥6 (60%); reflexivity score ≤6 (60%); objectivity score ≤6 (60%)): There is evidence that the model has good validity for an early developmental model and clear, complete, and coherent interpretations were defined. The model may be useful for implementation in initial research studies; however, reflexivity scores were low, so there is a remaining caveat on the potential for researcher bias, and further external confirmation of the model is required to use it with confidence.
• Good validity: There is evidence that the model has good validity for an early developmental model and reflexivity assessments have been completed. The model may be useful for implementation in initial research studies; however, an interpretation framework was not provided, so findings should be treated as somewhat speculative until one is defined.

The output from the cognitive walkthrough is detailed in Table 3, and both initial and reflexivity scores are provided in Tables 6 and 7.

6.3 | Task 3 - Reflexivity assessment

The preliminary scores derived from the initial Hybrid Cognitive Validation Framework assessment of the case study model are summarized in Table 6. It can be seen from these results that there is a "0" objectivity score, as the assessment was conducted by the primary researcher alone. In this case, the interpretation guidance would be that this model has poor validity, despite high validation scores, as no objectivity analysis had been completed at this point. The rationale for this decision is outlined in the section "Validation Framework Implementation Scoring."
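Continuing the hypothetical scoring sketch above, the effect of a missing objectivity assessment on the interpretation can be illustrated with the extreme case noted earlier:

```python
# Continuing the hypothetical scoring sketch: even maximum success-criteria
# scores cannot reach the "good" threshold without any objectivity evidence.
v_total = 6 * 3   # six sub-tasks at the maximum V score of 3
o_total = 6 * 0   # no objectivity scoring (single-researcher assessment)
total_vf = v_total + o_total
print(total_vf, total_vf > 18)   # 18 False -> not "good" despite maximum V scores
```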
The validation procedure was repeated by a second researcher to demonstrate the improvement in scoring associated with the objectivity element of the framework. Variance in the second researcher's implementation of the validation framework scoring stemmed from Tasks 3B and 4B in both the success criteria and objectivity components. Task 3B demonstrated that the basic requirements of the framework were met, but only on a single scenario; additionally, this was not found to be completely representative of all surgical phases. Task 4B, alternatively, demonstrated that data collection covered the breadth and depth of the model for a representation to be made and that collection conditions were consistent; however, there was insufficient evidence to determine that data were collected without bias or omission. The second researcher was not present for a debrief within one week of data collection, limiting the objectivity score to 1. Overall, the second researcher's scoring lowered the success criteria score from 17 to 15 but improved the objectivity score from 0 to 11. This underlines the importance of second researcher involvement in the assessment phase. The VF score resulting from the second researcher's implementation of the case study was 26; when applied to the interpretation guide (Table 5), this exceeds the total of ≥25 required for the highest interpretation band for an early developmental model.

The argument-based validation elements used are comparable with those outlined by Kane (2013). The reflexivity assessments are comparable with the process outlined by Davies and Dodd (2002); however, these are contingent on review by a second researcher or external SME. This framework may have broader application in domains where access to expert input is limited, such as the military, petrochemical, medical, or aviation domains. The keys to broader implementation of this method are the ability to establish the IUA and validity arguments defined in the argument-based validation method, along with the development of credible scenarios with tasks that represent the potential cognitive states and paths defined in the cognitive model (Kane, 2006, 2013).

While we believe this validation framework has utility under the circumstances identified, it is more prescriptive than other methods discussed in this paper and does not cover criterion validity. Ideally, we would like to establish concurrent validity through correlation tests between the predicted model state and concurrent assessments of a surgeon's employed mental model assessed by SMEs. This approach would be potentially less subjective and easier to compare using quantitative tests. While this would establish further validity evidence, it is potentially much more complex and requires access to resources that are not compatible with the rationale of the early-stage, low-cost validation approach presented here. The use of interview data as a reference for model validity (Thoroman et al., 2019) would provide a potentially more robust means of gathering validity data. A simulation approach, as employed by Vinod et al. (2016), could potentially establish more objective validity data but has potential corresponding validity issues arising from the simulation of human agents in a specialized role. Stanton (2016) notes that small assessment groups are frequently a problem with validation in human factors engineering, and this case study was no exception, limiting confidence in the conclusions until further research can be completed. Due to time constraints, this study considers only a single scenario for the walkthrough analysis validation, with a single SME.
The next stage in the development of this validation framework will be to develop a more extensive set of scenarios for the cognitive walkthrough to establish more evidence for model validity. To enable this, it is expected that a more detailed scoring framework will be required to bridge the gap between the reviewer's cognitive walkthrough responses and the validity scores assigned in the validation interpretation framework. COVID-19-like restrictions also have the potential to disrupt the development of cognitive models, before or in parallel with validation activities; future studies should address methods to mitigate the impact on model development. The level of detail in the instructions given is another potential limitation of this study; further development of the instructions and the presentation of the cognitive walkthrough task is important to ensure clarity and consistency of interpretation between reviewers. Reliability and validity are closely related and are often combined to establish confidence in a model. We have not considered reliability in this study, but in the future this could be established with intra-rater agreement analysis using Pearson's correlation test, as sketched below. As part of an iterative validation process, we recommend conducting a full validation in line with the procedure defined by the American Educational Research Association, American Psychological Association, and National Council on Measurement in Education (1954).
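As a sketch of the reliability check mentioned above (the score values are invented purely for illustration; scipy's pearsonr is one possible implementation):

```python
# Sketch of an agreement check between two sets of sub-task scores using
# Pearson's correlation. The data below are hypothetical.
from scipy.stats import pearsonr

scores_pass_1 = [3, 2, 3, 2, 1, 3]   # V scores from one assessment pass
scores_pass_2 = [3, 2, 2, 2, 1, 3]   # V scores from a repeat assessment

r, p_value = pearsonr(scores_pass_1, scores_pass_2)
print(f"Pearson r = {r:.2f} (p = {p_value:.3f})")
```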
REFERENCES

Technical recommendations for psychological tests and diagnostic techniques
A note on the validity and reliability of ergonomics methods
Economic impact of government interventions during the COVID-19 pandemic: International evidence from financial markets
Cognitive models and information transfer
A systematic methodology for cognitive modelling
Validating the strategies analysis diagram: Assessing the reliability and validity of a formative method
Using cognitive task analysis to identify critical decisions in laparoscopic environments
Construct validity in psychological tests
Validating simulations
Qualitative research and the question of rigor
Director's Stay Safe Ohio Order. Ohio Department of Health
Construct validity: Construct representation versus nomothetic span
Variation in government responses to COVID-19. Blavatnik School of Government working paper
Areas of academic research with the impact of COVID-19
A cognitive model of planning
Educational measurement
The argument-based approach to validation
Developing and validating cognitive models in assessment. The Handbook of Cognition and Assessment: Frameworks, Methodologies, and Applications
COCATS 4 Task Force 10: Training in cardiac catheterization
A recognition-primed decision (RPD) model of rapid decision making
Sources of power: How people make decisions
Model validation in operations research
Simulation Modeling and Analysis
Testing a walkthrough methodology for theory-based design of walk-up-and-use interfaces
A comparison of the decision ladder and the recognition-primed decision model
The 4 types of validity [Website]
Applied cognitive task analysis (ACTA): A practitioner's toolkit for understanding cognitive task demands
The human data processor as a system component
The Wiley Handbook of Cognition and Assessment: Frameworks, Methodologies, and Applications
Development and validation of a descriptive cognitive model for predicting usability issues in a low-code development platform
Validation of the recognition-primed decision model and the roles of common-sense strategies in an adversarial environment (Doctoral dissertation)
On the reliability and validity of, and training in, ergonomics methods: A challenge revisited
Validating task analysis for error identification: Reliability and validity of a human error prediction technique
Learning to predict human error: Issues of acceptability, reliability, and validity
What price ergonomics?
Stent: Purpose, procedure, and risks
Improving decision support systems through context and demand aware augmented intelligence in dynamic joint cognitive systems
Cognitive modeling: Research logic in cognitive science
Doing interprofessional research in the COVID-19 era: A discussion paper
Evaluation of construct and criterion-referenced validity of a systems-thinking based near miss reporting form
Validation of cognitive models for collaborative hybrid systems with discrete human input
Cognitive failures and accidents
Cognitive technologies: The design of joint human-machine cognitive systems