key: cord-0492698-dbwoa0ux
authors: Rohanian, Omid; Kouchaki, Samaneh; Soltan, Andrew; Yang, Jenny; Rohanian, Morteza; Yang, Yang; Clifton, David
title: Privacy-aware Early Detection of COVID-19 through Adversarial Training
date: 2022-01-09
journal: nan
DOI: nan
sha: 46b6f3b16fc8c4d37aa9909b62bdcd897460768d
doc_id: 492698
cord_uid: dbwoa0ux

Early detection of COVID-19 is an ongoing area of research that can help with triage, monitoring and general health assessment of potential patients and may reduce operational strain on hospitals coping with the coronavirus pandemic. Different machine learning techniques have been used in the literature to detect coronavirus using routine clinical data (blood tests and vital signs). Data breaches and information leakage when using these models can bring reputational damage and cause legal issues for hospitals. In spite of this, protecting healthcare models against leakage of potentially sensitive information is an understudied research area. In this work, we examine two machine learning approaches intended to predict a patient's COVID-19 status using routinely collected and readily available clinical data. We employ adversarial training to explore robust deep learning architectures that protect attributes related to demographic information about the patients. The two models we examine in this work are intended to preserve sensitive information against adversarial attacks and information leakage. In a series of experiments using datasets from the Oxford University Hospitals, Bedfordshire Hospitals NHS Foundation Trust, University Hospitals Birmingham NHS Foundation Trust, and Portsmouth Hospitals University NHS Trust, we train and test two neural networks that predict PCR test results using information from basic laboratory blood tests and vital signs taken on a patient's arrival to hospital. We assess the level of privacy each of the models can provide and show the efficacy and robustness of our proposed architectures against a comparable baseline. One of our main contributions is that we specifically target the development of effective COVID-19 detection models with built-in mechanisms that selectively protect sensitive attributes against adversarial attacks.

COVID-19 has impacted millions across the world. Its early signs cannot easily be distinguished from those of other respiratory illnesses, and hence an accurate and rapid testing approach is vital for its management. RT-PCR assay of nasopharyngeal swabs is the widely accepted gold-standard test, but it has several limitations, including limited sensitivity and slow turnaround time (12-24 h in hospitals in high- and middle-income countries). Several other techniques, including qualitative rapid-antigen tests, offer faster turnaround at the cost of accuracy. Machine learning models trained on routinely collected clinical data have therefore been explored as a complementary screening tool, but most existing approaches share the following shortcomings:

1. Most approaches that have appeared in the literature so far are based on basic machine learning techniques that require complete retraining whenever a new batch of data becomes available. However, in a dynamic situation like a pandemic, where new streams of data need to be processed, it is vital to learn from data incrementally without having to start over and retrain the system on all previously seen instances.

2. ML-based models explored in the COVID-19 literature are not equipped with an inherent mechanism to guard against possible issues that might arise due to the presence of demographic features. For example, models could easily become biased towards a certain demographic group, causing incorrect associations and overfitting.
3. Another issue is preserving the privacy of patients and robustness against adversarial attacks. Most basic models can easily 'leak' information, making it easy for an adversary to recover sensitive information contained in the hidden representation. Because blood tests are known to include features that typically correlate with demographic attributes such as sex and ethnicity, excluding demographics from the input does not by itself solve the problem. For example, conditions such as Benign Ethnic Neutropenia (Haddy et al., 1999) and Sickle Cell Disease (Rees et al., 2010) are predominantly found in certain ethnic groups and are much less likely to occur in others. As a further example, healthy men and women have different reference ranges for blood tests (Park et al., 2016).

This work aims to address the above-mentioned shortcomings in existing research. The proposed adversarial architectures (Section 4) are designed to prevent the learning model from encoding unwanted demographic biases and to protect its sensitive information during the learning process. In the first architecture (Section 4.1), protection of attributes is explicit, with the option to select which attributes to guard against adversarial attacks. In Section 5.3.1 we investigate whether these direct protective measures hurt generalisability to unseen data. In the second architecture (Section 4.2), protection is based on a general adversarial regularisation and is not tied to any specific subset of selected attributes.

Several recent studies in natural language processing (NLP) have shown that textual data carries informative features regarding authors' race, age and other social factors. This makes embedding and predictive models susceptible to a wide range of biases that can negatively affect performance and severely limit generalisability. This kind of bias also raises concerns in areas where fairness and privacy are important. Numerous works have examined the ways representation learning can be biased towards or against certain demographics, and different countermeasures have been proposed to counteract such bias (Gonen & Goldberg, 2019). Most of these studies, however, use text and image data; there is currently limited research on the application of representation learning and adversarial models in healthcare.

The proposed models in this study are designed to preserve sensitive information against adversarial attacks, allow incremental learning, and reduce the potential impact of demographic bias. The main focus of this work, however, is privacy preservation. The contributions of this work are as follows:

• We introduce two adversarial learning models for the task of COVID-19 identification from electronic health records (EHR) that perform satisfactorily on a real COVID-19 dataset in comparison with strong baselines. Unlike conventional tree-based methods, these architectures support transfer learning, multi-modal data, and the other advantages of neural models without a significant performance trade-off.

• The models use adversarial regularisation to make them robust against leakage of sensitive information and adversarial attacks, which makes them suitable for scenarios where preservation of privacy is important or classification bias is costly.
• We run a series of tests to quantitatively demonstrate the efficacy of the proposed architectures in protecting sensitive information against adversarial attacks, in comparison with a neural model that is not adversarially trained.

• We perform several tests to observe the effect of this type of training on generalisability across different demographic groups.

• We externally validate the models using data from other hospital groups.

There are various ways a trained model can be attacked by an adversary. The goal in most of them is to infer some kind of knowledge that is not originally meant to be shared or is unintentionally encoded by the model. At least three different forms of attack are known, namely membership inference, property inference, and model inversion (Shokri et al., 2017). In this work, we focus on property inference, in which an adversary who has access to the model's parameters during training tries to extract information about certain properties of the training data that are not necessarily related to the main task. Figure 1 shows a general overview of privacy attacks according to Rigaki & Garcia (2020). The adversary, in our case, can see the model and its parameters and wants information about the data to which they do not have direct access. Attacks of this kind are possible in any scenario where the model is stored and trained on an external server. Protecting an ML model against property inference attacks is especially useful in the context of collaborative and federated learning, where models train locally on different portions of the dataset and share their parameters over a network that may or may not be fully secure against eavesdropping (Melis et al., 2019).

Within the context of healthcare, such attacks can reveal sensitive personal data and prove disastrous for hospitals. The GDPR defines personal data as 'any information relating to an identified or identifiable natural person'. Article 9(1) of the GDPR declares the following types of personal data as sensitive: data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, or trade union membership; genetic and biometric data; and data concerning health or the sex life or sexual orientation of the subject (Voigt & Von dem Bussche, 2017). Sensitive information such as age, gender, location, or ethnicity is usually quantised or anonymised in large healthcare datasets. However, as we will see in Section 5.3, this information can easily be recovered by a simple attack model because of the implicit associations that exist between such information and other features in the dataset. Property inference attacks are not limited to recovering any specific type of data and can predict both categorical and numerical values. For instance, they can be used to train attacker models that learn to identify both demographic features (implicitly present in the data) and blood test features (explicitly present) that correlate highly with certain diseases. It is then possible to use this trained model to re-identify some patients based on their demographic features and possible combinations of diseases (Jegorova et al., 2021).

In our binary classification setting, each neural network f is trained to predict labels y_1, y_2, ..., y_n from instances x_1, x_2, ..., x_n. Each instance x_i contains a set of sensitive (in this case demographic) discrete features z_i ∈ {1, 2, ..., k} which we intend to "protect". These sensitive features are called protected attributes.
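Before formalising this setup further, the following minimal sketch (our own illustration rather than the authors' code; the synthetic representations and all variable names are hypothetical) shows what a property-inference attacker looks like in practice: a small network trained on fixed hidden representations to recover a binary protected attribute, compared against the majority-class baseline.

```python
# Minimal sketch of a property-inference attack on fixed representations.
# Synthetic vectors stand in for encoder outputs h(x); names are illustrative only.
import torch
import torch.nn as nn

torch.manual_seed(0)

n, dim = 2000, 32
z = torch.randint(0, 2, (n,))                        # binary protected attribute
# Representations that implicitly encode z (a small per-group mean shift).
h = torch.randn(n, dim) + 0.8 * z.unsqueeze(1).float()

attacker = nn.Sequential(nn.Linear(dim, 16), nn.ReLU(), nn.Linear(16, 2))
opt = torch.optim.Adam(attacker.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

train_idx, test_idx = torch.arange(0, 1500), torch.arange(1500, n)
for epoch in range(50):
    opt.zero_grad()
    loss = loss_fn(attacker(h[train_idx]), z[train_idx])
    loss.backward()
    opt.step()

with torch.no_grad():
    preds = attacker(h[test_idx]).argmax(dim=1)
    acc = (preds == z[test_idx]).float().mean().item()
p = z[test_idx].float().mean().item()
majority = max(p, 1 - p)                             # accuracy of always guessing the majority class
print(f"attacker accuracy: {acc:.3f}  vs  majority-class baseline: {majority:.3f}")
```

In the experiments reported later, the representations come from a trained encoder rather than synthetic vectors, but the attack principle is identical.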
In the context of classification, any neural network f(x) can be characterised as an encoder followed by a linear layer W: f(x) = W · h(x). W can be seen as the last layer of the network (i.e. dense + softmax) and h as all the preceding layers (Ravfogel et al., 2020). Suppose we have an attacker model f_att that is trained on the encoder h(x) of a neural classifier in order to predict z_i. If this trained adversary is able to predict z_i from the encoded representation, the model has leaked and its privacy has been compromised. It is unlikely that h(x) would be completely guarded against an attack: if it encodes sufficient information about x_i, it might reveal some information to a properly trained f_att. We say that the trained model f is private with regard to z_i if an attacker model f_att that has access to f's encoder h(x) cannot predict z_i with a greater probability than a majority-class baseline. If we perturb h(x) too much, it will not be informative to f_att, but the model will also fail to accurately predict the main task label y_i. Therefore, we would like to ensure privacy against potential attackers with regard to the protected attributes while achieving a reasonably good result on the main task.

We follow a standard supervised learning scenario where each training instance x_i represents information from blood tests and vital signs for each patient seen at the hospital, and y_i is the corresponding Boolean value denoting the result of the PCR test for that patient. The task is to train a model to predict the correct label for each patient.

The first adversarial architecture we explore comprises one main component and a number of secondary networks:

I. A main classifier M, the central component of the model. It consists of a stack of n fully connected layers with dropout and batch normalisation, followed by a softmax layer at the end.

II. d networks with auxiliary objectives separate from the main task. Supposing we have d categorical features, each of these secondary networks (henceforth referred to as discriminators) predicts the value of one such feature for each training instance.

Assume h_i is the representation of an instance at the i-th layer within M. This is the point of interception where the auxiliary networks get access to the contents of M. All these components then train in tandem with the following loss function:

L_total = L_M − λ Σ_{i=1}^{d} L_{D_i}    (1)

Each D_i corresponds to a separate discriminator network that predicts one of the d different categorical features of interest, and L_{D_i} is its loss. λ is a weighting factor that controls the contribution of each individual auxiliary loss. Formula 1 is set up so that, after backpropagation, the contents of h are maximally informative for the main task and minimally informative for prediction of the protected features. The loss of the main task, L_M, is computed using binary cross entropy. If x and y are the features and labels, ŷ and ẑ the predictions for the main target and protected features, θ_M and θ_{D_i} the parameters of the main classifier and its d discriminators, and L the joint binary cross entropy loss function, we can formulate the training objective as finding the optimal parameters θ̂ such that:

θ̂ = arg min_{θ_M} max_{θ_{D_1}, ..., θ_{D_d}} [ L(ŷ, y) − λ Σ_{i=1}^{d} L(ẑ_i, z_i) ]    (2)

4.1.1 GRADIENT REVERSAL LAYER

As discussed in Section 4.1, during training the objective is to jointly optimise the following two terms (our formulation of the GRL in this section follows Elazar & Goldberg (2018)):

arg min_{h, c} L(c(h(x_i)), y_i)    (3)

arg max_{h} min_{D} L(D(h(x_i)), z_i)    (4)

where each x_i is an instance of the data which is associated with the protected attribute z.
D is the discriminator (the adversarial network), and c is the classifier used to predict the labels for the main task from the representation h. L denotes the loss function. Using an optimisation trick called the Gradient Reversal Layer (GRL), we can combine the above terms into a single objective. This idea was first introduced in the context of domain adaptation (Ganin & Lempitsky, 2015) and was later also applied to text processing (Elazar & Goldberg, 2018; Li et al., 2018). The GRL is easy to implement and only requires adding a new layer to the end of the discriminator's encoder. During forward propagation, the GRL acts as an identity layer, passing along the input from the previous layer without any changes. However, during backpropagation, it multiplies the computed gradients by −1. Mathematically, the layer can be formulated as a pseudo-function R with the following two incompatible equations:

R(x) = x (forward pass)
dR(x)/dx = −I (backward pass)

Using this layer, we can fold the loss into a single formula and perform a single backpropagation in each training epoch. For the trivial case of having only one protected attribute, we can consolidate Equations 3 and 4 as follows:

L_total = L(c(h(x_i)), y_i) + λ · L(D(R(h(x_i))), z_i)    (5)

The objective is to minimise the total loss; in the case of the discriminator, the gradients are reversed and scaled by λ. We can generalise this to the case where we have multiple (in our case three, namely age, gender, and ethnicity) protected attributes and corresponding discriminators D_j:

L_total = L(c(h(x_i)), y_i) + λ Σ_{j=1}^{d} L(D_j(R(h(x_i))), z_{i,j})    (6)

As the second adversarial architecture, we develop another model in which the adversarial component can perturb the representation during training with some added noise. The direction of this noise (i.e. whether the added noise is positive or negative) depends on the signs of the computed gradients. This adversarial method is based on linear perturbation of the inputs fed to a classifier. In every dataset, the measurements enjoy a certain degree of precision, below which differences can be considered negligible error ε. If x is the representation of an instance, the classifier is likely to treat x the same as x̃ = x + η, as long as ‖η‖_∞ < ε. However, this small perturbation grows when it is multiplied by a weight matrix w:

wᵀ x̃ = wᵀ x + wᵀ η    (7)

Subject to the max-norm constraint defined above, the perturbation is maximised when we set

η = ε · sign(w)    (8)

In the context of deep learning, the method can be formulated in the following way: if θ denotes the parameters of the model and J the cost function, then during training a perturbation η is added to the representation of each instance such that:

η = ε · sign(∇_x J(θ, x, y))    (9)

This procedure is known as the fast gradient sign method (FGSM), originally introduced in a seminal paper by Goodfellow et al. (2015). It can be viewed either as a regularisation technique or as a data augmentation method that includes unlikely instances in the dataset. For training, the following adversarial objective function can be used:

J̃(θ, x, y) = α J(θ, x, y) + (1 − α) J(θ, x + ε · sign(∇_x J(θ, x, y)), y)    (10)

This method can be seen as making the model robust against worst-case errors when the data is perturbed by an adversary (Goodfellow et al., 2015). Because of this regularisation, our expectation is that the hidden representations become less informative to an attacker network that attempts to retrieve demographic attributes. Following the original paper, α is usually taken to be 0.5, which turns the equation into a linear combination with equal weights given to both terms of the objective function.
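To make the gradient reversal mechanism of Section 4.1.1 concrete, the sketch below (a minimal PyTorch illustration under our own naming conventions, not the authors' implementation) implements a GRL as an autograd function, attaches three discriminators to the encoder output, and performs one training step in the spirit of Equation 6.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; flips the sign of the gradient on the backward pass."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output

class AdvModel(nn.Module):
    """Main classifier plus d discriminators attached through a gradient reversal layer."""
    def __init__(self, in_dim, hidden=64, n_protected=3):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                     nn.Linear(hidden, hidden), nn.ReLU())
        self.task_head = nn.Linear(hidden, 2)             # main task: PCR positive / negative
        self.discriminators = nn.ModuleList(
            [nn.Sequential(nn.Linear(hidden, 32), nn.ReLU(), nn.Linear(32, 2))
             for _ in range(n_protected)])                # e.g. age, gender, ethnicity

    def forward(self, x):
        h = self.encoder(x)
        y_hat = self.task_head(h)
        z_hats = [d(GradReverse.apply(h)) for d in self.discriminators]
        return y_hat, z_hats

# One training step corresponding to Equation 6: task loss plus lambda-weighted,
# gradient-reversed discriminator losses, backpropagated together.
torch.manual_seed(0)
model, ce, lam = AdvModel(in_dim=20), nn.CrossEntropyLoss(), 1.0
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.randn(8, 20)                                    # a toy batch of clinical features
y = torch.randint(0, 2, (8,))                             # PCR labels
z = torch.randint(0, 2, (8, 3))                           # three binary protected attributes

y_hat, z_hats = model(x)
loss = ce(y_hat, y) + lam * sum(ce(z_hat, z[:, i]) for i, z_hat in enumerate(z_hats))
opt.zero_grad()
loss.backward()
opt.step()
```

Because the reversal happens only on the path into the discriminators, each discriminator still learns to predict its attribute while the encoder receives the opposite gradient and learns to hide it.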
In our implementation (Figure 3), alongside the main component there is an attacker that intercepts the model at a certain step during each training epoch, makes a copy of the pre-attack parameters of the intercepted layer, and injects noise into the model. Based on this information, an adversarial loss is computed and backpropagation is applied. After this step, a restore function is executed, returning the parameters of the intercepted layer to their pre-attack values. A regular loss is then computed and backpropagation is applied a second time. The added noise is computed based on Equation 9: if h is the representation of a training instance at the time of interception by the attacker, the perturbed representation is h̃ = h + η.

For the experiments in this study we use a hospital dataset which we refer to as OUH. OUH is a de-identified EHR dataset covering unscheduled emergency presentations to emergency and acute medical services at Oxford University Hospitals NHS Foundation Trust (Oxford, UK). The trust comprises four teaching hospitals, which serve a population of 600,000 and provide tertiary referral services to the surrounding region. At the time of model development, linked de-identified demographic and clinical data were obtained for the period of November 30, 2017 to March 6, 2021. For each presentation, the data extracted included presentation blood tests, blood gas results, vital sign measurements, results of RT-PCR assays for SARS-CoV-2, and PCR results for influenza and other respiratory viruses. Patients who opted out of EHR research, did not receive laboratory blood tests, or were younger than 18 years of age were excluded from this dataset.

For OUH, hospital presentations before December 1, 2019, and thus before the global outbreak, were included in the COVID-19-negative cohort. Patients presenting to hospital between December 1, 2019, and March 6, 2021, with PCR-confirmed SARS-CoV-2 infection were included in the COVID-19-positive cohort. This period includes both the first and second waves of the pandemic in England. Because of incomplete penetrance of testing during the early stages of the pandemic and the limited sensitivity of PCR swab tests, there is uncertainty in the viral status of patients presenting during the pandemic who were untested or tested negative; these patients were therefore excluded from the datasets. There are 3081 COVID-19-positive instances in the original dataset and 112121 negative instances. For the experiments with OUH, we subsampled the majority class to reach a more balanced dataset with prevalence 0.5 (i.e. 6162 instances in total).

Age, gender, and ethnicity information were binarised during preprocessing. For age, the average of 64 years was taken as the cut-off point for binarisation. The ethnicity information, which was encoded using NHS ethnic categories, was divided into white and non-white. While quantising features in this way involves oversimplification and loss of detail, it keeps the values binary across all the protected attributes, making comparisons easier in our experimental setup. Table 4 shows the distribution of demographic labels in the OUH dataset. In Section 5.3.2 we externally validate our models on three NHS Foundation Trust datasets (Soltan et al., 2022), namely Bedfordshire Hospitals NHS Foundation Trust (BH), University Hospitals Birmingham NHS Foundation Trust (UHB), and Portsmouth Hospitals University NHS Trust (PUH).
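Returning briefly to the perturbation-based model described above, the following sketch (again an illustration under stated assumptions, not the authors' code; ε, the choice of perturbed layer, and the simplified single-backward bookkeeping are ours) shows an FGSM-style training step in which the hidden representation is perturbed in the direction of the gradient sign (Equation 9) and the clean and perturbed losses are combined with α = 0.5 (Equation 10).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

encoder = nn.Sequential(nn.Linear(20, 64), nn.ReLU())
head = nn.Linear(64, 2)
opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()), lr=1e-3)
ce = nn.CrossEntropyLoss()
eps, alpha = 0.05, 0.5            # assumed epsilon; alpha = 0.5 as in Goodfellow et al. (2015)

x = torch.randn(16, 20)           # toy batch of clinical features
y = torch.randint(0, 2, (16,))    # PCR labels

# Clean forward pass on the intercepted hidden representation.
h = encoder(x)
clean_loss = ce(head(h), y)

# Gradient of the clean loss w.r.t. the representation, used only to pick the noise direction.
grad_h, = torch.autograd.grad(clean_loss, h, retain_graph=True)
eta = eps * grad_h.sign()                         # Equation 9: eta = eps * sign(dJ/dh)

adv_loss = ce(head(h + eta), y)                   # loss on the perturbed representation
loss = alpha * clean_loss + (1 - alpha) * adv_loss  # Equation 10 with alpha = 0.5

opt.zero_grad()
loss.backward()
opt.step()
```

This collapses the intercept / restore procedure described above into a single combined backward pass, but the regularising effect, penalising the model on representations nudged in the worst-case direction, is the same.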
We use the entire test sets in their original label distribution within the pandemic timeframe to ensure that the evaluation is fair and mirrors the highly imbalanced data seen in hospitals. Table 1 shows the statistics for the COVID-19-positive cases in the datasets.

We performed a series of experiments to test the proposed models and compare them against baselines. The baseline non-adversarial model, which we use as the basic structure to start from, consists of 3 fully connected dense layers with batch normalisation and dropout. We refer to this model as Base. During 10-fold cross-validation, the best hyperparameters were chosen using random search. We found empirically that heavy hyperparameter optimisation had at best mixed results and that adding more layers to the model did not consistently boost performance. We chose a set of parameters that worked well across all the models during cross-validation (Table 2). We also kept the Base model simple, with only a few layers, so that we could make direct and straightforward comparisons with the adversarially trained models.

The demographic-based adversarial model is referred to as ADV, and its main component is the same as Base. Since only the Base part is tested after training (i.e. the discriminators are detached), the ADV model ends up with exactly the same number of parameters as Base. The perturbation-based adversarial model, which also has the same number of parameters as Base, is referred to as ADV_per. All reported results on the test set are the median of three consecutive runs. In what follows we explain the feature sets used and the train and test procedure, and finally report the main task and attacker results under different scenarios.

Two sets of clinical variables were investigated (Table 3): presentation blood tests from the first blood draw on arrival to hospital, and vital signs. Only blood test markers that are commonly taken within existing care pathways and are usually available within 1 hour in middle- and high-income countries were considered. The models are trained and tested in a binary classification task in which the labels are confirmed PCR test results. As the first step, the model is evaluated on the TRAIN set in a stratified 10-fold cross-validation scenario, during which a threshold is set on the ROC curve to meet the minimum recall constraint. The model is then trained on the TRAIN set and tested on the held-out TEST data, and results are computed using the previously set threshold.

During training of the ADV model, the expectation is that the accuracy of the main classifier increases over subsequent epochs while, since the learning setup constantly misleads the discriminators, their performance stays below or around 50% accuracy. To test this assumption, we plotted the trajectory of accuracy for the main and three auxiliary tasks over the first 15 epochs, while the ADV model is being trained on the TRAIN set and before it is tested on the held-out TEST set. As can be seen in Figure 5, accuracy on the main task keeps growing steadily while discriminator accuracy drops below 50% and plateaus. In Table 4 we report the results on the main task of predicting PCR results for all the models. The results demonstrate that the models perform well at the main task, namely predicting the outcome of the PCR test.
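The threshold-setting step described above can be sketched with scikit-learn as follows (an illustration only; the recall target of 0.85 and the tie-breaking rule are our assumptions, not values taken from the paper): among ROC operating points whose sensitivity meets the target, pick the one with the lowest false-positive rate, and reuse that threshold unchanged on the held-out test set.

```python
import numpy as np
from sklearn.metrics import roc_curve

def threshold_for_min_recall(y_true, y_score, min_recall=0.85):
    """Pick the largest score threshold whose sensitivity (TPR) still meets the recall target."""
    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    ok = tpr >= min_recall
    assert ok.any(), "no operating point meets the recall target"
    # roc_curve returns thresholds in decreasing order, so the first admissible index
    # has the highest threshold and the lowest false-positive rate among admissible points.
    return thresholds[np.argmax(ok)]

# Usage: derive the threshold from cross-validated scores, then reuse it on the test set.
rng = np.random.default_rng(0)
y_cv = rng.integers(0, 2, 500)
scores_cv = np.clip(y_cv * 0.3 + rng.normal(0.5, 0.2, 500), 0, 1)   # synthetic CV scores
thr = threshold_for_min_recall(y_cv, scores_cv, min_recall=0.85)
print(f"chosen threshold: {thr:.3f}")
```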
To assess how much privacy each model can provide against an adversarial attack, we perform a series of experiments in which 3 different non-adversarial Base models are trained on the training data, each corresponding to the prediction of a different demographic attribute. In other words, instead of predicting the PCR test result, a protected attribute is provided as the label to train and test on. We perform these experiments under the same conditions as the main task. The attacker is first trained in a 10-fold cross-validation scenario and a threshold is set based on the ROC curve with a minimum recall constraint of 0.8 ± 0.07. Subsequently, the attackers are trained on the TRAIN set and tested on the TEST portion of the dataset, predicting the same values given the threshold obtained during 10-fold CV. These results are important to the final interpretation of model privacy because they determine the upper bound on how much the proposed models can leak. In Table 5 we report the results for the trained attackers on the TEST portion of the dataset for each protected attribute. The lower bound is the majority-class baseline, in which the attacker simply relies on prior information about the distribution of the protected attributes to predict these features and does not make use of the obtained hidden representations. For instance, if a dataset were obtained in Scotland, then, relying on the known fact that the predominant ethnic category is White British, the attacker would simply assign that label to all of the instances. Statistics about the majority classes for each attribute are given in Table 6 for both the TRAIN and TEST sets. As can be seen, ethnicity is the most unbalanced category, whereas for gender and age the class labels are more evenly distributed.

As the next step, we trained our baseline and proposed adversarial models on the TRAIN set and saved the weights of the neural networks. We then loaded our trained attackers and tested them, not on the features directly this time, but on the output of the encoder of the baseline and adversarially trained models. The idea is that, if an adversarially trained model is indeed protecting demographic attributes, it should be harder for an attacker to predict those values from its encoded representations than from those of a baseline model that is not specifically designed for privacy preservation. The results in Table 7 already show a degree of privacy provided by the non-adversarial encoder, as they indicate a noticeable decrease in performance compared to Table 5. The most marked decrease is in the prediction of gender, where performance drops from an AUC of 0.9104 to 0.6926. In the case of age, however, the attacker is more robust. Since we want to keep the attackers blind to the encoding strategy used by the adversarially trained models, in order to test the attackers on the ADV and ADV_per models we have to use the same threshold set during 10-fold CV on the encoded representation of the Base model. We therefore load the attacker trained on the non-adversarial encoder on the TRAIN set and test it on the ADV/ADV_per model's encoder to predict the three attributes.
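A minimal sketch of this evaluation protocol is shown below (illustrative only: the encoders are random stand-ins for the trained Base and ADV networks, the same synthetic set is reused for training and scoring for brevity, and the fixed-threshold step is reduced to an AUC comparison).

```python
import torch
import torch.nn as nn
from sklearn.metrics import roc_auc_score

torch.manual_seed(0)

# Stand-ins for the trained encoders of the Base and adversarially trained models
# (in the real pipeline these would be loaded from saved weights).
base_encoder = nn.Sequential(nn.Linear(20, 64), nn.ReLU())
adv_encoder = nn.Sequential(nn.Linear(20, 64), nn.ReLU())

x = torch.randn(1024, 20)
z = torch.randint(0, 2, (1024,)).float()          # a binary protected attribute

# 1) Train the attacker on the Base encoder's representations only.
attacker = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(attacker.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()
with torch.no_grad():
    h_base = base_encoder(x)
for _ in range(200):
    opt.zero_grad()
    loss = bce(attacker(h_base).squeeze(1), z)
    loss.backward()
    opt.step()

# 2) Score the same attacker, unchanged, on representations from each encoder.
def attack_auc(encoder):
    with torch.no_grad():
        scores = attacker(encoder(x)).squeeze(1)
    return roc_auc_score(z.numpy(), scores.numpy())

print(f"attack AUC on Base encoder: {attack_auc(base_encoder):.3f}")
print(f"attack AUC on ADV  encoder: {attack_auc(adv_encoder):.3f}")
```

The comparison of interest is the drop in attack performance when the frozen attacker is pointed at the adversarially trained encoder instead of the Base encoder.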
The results in Tables 8 and 9 confirm the assumption that an adversarial learning procedure, either with separate discriminator networks for each protected attribute or using perturbation-based regularisation, provides a greater level of privacy against attacks by an intruder who intends to recover this information from a representation obtained from the model.

The application of an adversarial learning procedure to protect selected attributes involves a training setup with competing losses, which is intended to weaken undesirable implicit associations contained in the hidden representations of the network. This is expected to result in a certain performance drop compared to the non-adversarial baseline. As long as this drop is not large, the performance-privacy trade-off is justified. A more general concern, however, is whether a model like ADV, with its 3 different discriminators and its direct and specific manipulation of the hidden representations, would generalise poorly when tested on certain demographic sub-populations of the dataset. Since ADV_per applies its regularisation without specifically targeting any protected attributes, it is less likely to suffer from this issue.

To investigate whether protecting demographic attributes damages the generalisability of ADV, we performed a series of experiments in which we trained our Base and ADV models on one demographic group and tested them on the other. We compare the adversarial model with the baseline to make sure that the generalisability of the ADV model is not hurt. Since we have 3 different binary attributes, there are 6 possible ways to cross-test the models. We denote these subgroups with f (female), m (male), w (white), n (non-white), o (old), and y (young). To restructure the dataset for these experiments, in each case we combine all the data and filter TRAIN and TEST based on the targeted demographic. For example, 'm2f' means that our TRAIN set contains only males and the TEST set only females. The results in Table 10 clearly indicate that adversarial learning has not damaged generalisability in any of the scenarios in which the Base and ADV models were tested.

To validate the models on external data, we trained Base, ADV, and ADV_per on the OUH dataset (as described in Section 4.3) and tested them on the entirety of the UHB, BH, and PUH datasets. We followed the same procedure as in the previous experiments: first we ran a 10-fold CV on the OUH dataset to set a threshold, and then tested the models on the external test data with the previously obtained threshold. The hyperparameters were kept the same for these experiments, with the exception of ADV_per, which seemed to converge better after 30 epochs during 10-fold CV. Tables 11, 12, and 13 show the results of this experiment on the UHB, BH, and PUH test sets, respectively.

In this work, we introduced and tested two adversarially trained models for the task of predicting COVID-19 PCR test results based on routinely collected blood tests and vital signs. The data was processed in the form of tabular data. In our experiments, we addressed the issue of leakage of potentially sensitive attributes that are implicitly contained in the dataset, and demonstrated how an attacker network can successfully retrieve this information under different circumstances. Information such as age seems to be easily inferred with high accuracy from the features or from the hidden representation of the Base model.
In this case, the ADV and ADV_per models significantly reduced this vulnerability, which highlights the protective power of these adversarial methods in hiding such implicit information from invasive models specifically trained to infer it. The same pattern was seen for the other two demographic attributes, namely gender and ethnicity. For ethnicity, the representation was less informative to the attacker network for two reasons: I. A certain percentage of the patients had preferred not to state their ethnicity. Since we wanted to keep all the tasks binary, we treated this category as non-white, which is clearly sub-optimal and further complicates ethnicity prediction for the attacker. II. There are limitations in the accuracy with which ethnicity is documented by hospital staff during data collection, which may increase the amount of noise in the data. However, even though the overall results are lower for ethnicity, the ADV model still shows better privacy than the baseline. In such cases, the adversary is likely to rely on prior knowledge of the dataset, or on general information about the prevalence of ethnic groups in the data, rather than on the output of the encoder.

Our adversarial setup came with only a minimal performance cost (Table 4) and proved robust both in the generalisability tests (Table 10) and in external validation on highly imbalanced datasets (Section 5.3.2). More experiments (at the level of both data and model) are needed to ascertain whether the same general patterns hold under different conditions. Nonetheless, these methods are not tied to the specifics of the Base model and can be applied to any neural architecture. Furthermore, in the case of the ADV model, the protected attributes need not be demographic: in principle, any categorical feature of interest (or any feature that can be meaningfully quantised) can be used during training. Future work could also include experimenting with continuous features, for which the attacker would have to infer the features in a regression task.

To conclude, in this paper we introduced two effective methods to protect sensitive attributes in a tabular dataset for the task of predicting COVID-19 PCR test results from routinely collected clinical data. We demonstrated the effectiveness of adversarial training by assessing the proposed models against a comparable baseline, both on the main task, where they achieved performance scores largely on a par with the baseline, and in the context of privacy preservation, where a trained attacker was employed to retrieve sensitive information by intercepting the contents of the models' encoders. In the second setting, the adversarially trained models consistently showed superior performance compared to the non-adversarial baseline.

References

Performance evaluation of the SAMBA II SARS-CoV-2 test for point-of-care detection of SARS-CoV-2.
Rapid, point-of-care antigen and molecular-based tests for diagnosis of SARS-CoV-2 infection.
Adversarial removal of demographic attributes from text data.
Unsupervised domain adaptation by backpropagation.
Lipstick on a pig: Debiasing methods cover up systematic gender biases in word embeddings but do not remove them.
Explaining and harnessing adversarial examples.
A machine learning algorithm to increase COVID-19 inpatient diagnostic capacity.
Benign ethnic neutropenia: what is a normal absolute neutrophil count?
Survey: Leakage and privacy at inference time.
COVID-Classifier: An automated machine learning model to assist in the diagnosis of COVID-19 infection in chest X-ray images.
Towards robust and privacy-preserving text representations.
Exploiting unintended feature leakage in collaborative learning.
Establishment of age- and gender-specific reference ranges for 36 routine and 57 cell population data items in a new automated blood cell analyzer, Sysmex XN-2000.
Null it out: Guarding protected attributes by iterative nullspace projection.
Sickle-cell disease. The Lancet.
A survey of privacy attacks in machine learning.
Membership inference attacks against machine learning models.
Rapid triage for COVID-19 using routine clinical data for patients attending hospital: development and prospective validation of an artificial intelligence screening test.
Real-world evaluation of AI-driven COVID-19 triage for emergency admissions: External validation & operational assessment of lab-free and high-throughput screening solutions. The Lancet Digital Health.
The EU General Data Protection Regulation (GDPR).
Lateral flow device specificity in phase 4 (post marketing) surveillance.
Machine learning-based prediction of COVID-19 diagnosis based on symptoms.

The funders of the study had no role in study design, data collection, data analysis, data interpretation, or writing of the manuscript. AS is an NIHR Academic Clinical Fellow. The views expressed are those of the authors and not necessarily those of the NHS, NIHR, or the Wellcome Trust.