key: cord-1041493-jwe2csvh
authors: Pape, Constantin; Remme, Roman; Wolny, Adrian; Olberg, Sylvia; Wolf, Steffen; Cerrone, Lorenzo; Cortese, Mirko; Klaus, Severina; Lucic, Bojana; Ullrich, Stephanie; Anders‐Össwein, Maria; Wolf, Stefanie; Cerikan, Berati; Neufeldt, Christopher J.; Ganter, Markus; Schnitzler, Paul; Merle, Uta; Lusic, Marina; Boulant, Steeve; Stanifer, Megan; Bartenschlager, Ralf; Hamprecht, Fred A.; Kreshuk, Anna; Tischer, Christian; Kräusslich, Hans‐Georg; Müller, Barbara; Laketa, Vibor
title: Microscopy‐based assay for semi‐quantitative detection of SARS‐CoV‐2 specific antibodies in human sera: A semi‐quantitative, high throughput, microscopy‐based assay expands existing approaches to measure SARS‐CoV‐2 specific antibody levels in human sera
date: 2020-12-30
journal: Bioessays
DOI: 10.1002/bies.202000257
sha: fd452a2c6cd5adfc738b023fa6bae054b8771a60
doc_id: 1041493
cord_uid: jwe2csvh

Emergence of the novel pathogenic coronavirus SARS‐CoV‐2 and its rapid pandemic spread presents challenges that demand immediate attention. Here, we describe the development of a semi‐quantitative high‐content microscopy‐based assay for detection of three major classes (IgG, IgA, and IgM) of SARS‐CoV‐2 specific antibodies in human samples. The possibility to detect antibodies against the entire viral proteome together with a robust semi‐automated image analysis workflow resulted in specific, sensitive and unbiased assay that complements the portfolio of SARS‐CoV‐2 serological assays. Sensitive, specific and quantitative serological assays are urgently needed for a better understanding of humoral immune response against the virus as a basis for developing public health strategies to control viral spread. The procedure described here has been used for clinical studies and provides a general framework for the application of quantitative high‐throughput microscopy to rapidly develop serological assays for emerging virus infections.

The recent emergence of the novel pathogenic coronavirus SARS-CoV-2 [1] [2] [3] and the rapid pandemic spread of the virus has dramatic consequences in all affected countries. In the absence of a protective vaccine or a causative antiviral therapy for COVID-19 patients, testing for SARS-CoV-2 infection and tracking of transmission and outbreak events are of paramount importance to control viral spread and avoid the overload of healthcare systems. The sequence of the viral genome became publicly available only weeks after the initial reports on COVID-19 via the community online resource virological.org and allowed rapid development of reliable and standardized quantitative RT-PCR (qPCR) based tests for direct virus detection in nasopharyngeal swab specimens. [4] [5] [6] These tests are the key to identify acutely infected individuals and monitor virus load as a basis for the implementation of quarantine measures and treatment decisions.

In response to the initial wave of COVID-19 infection many countries implemented more or less severe lockdown strategies, resulting in a gradual decrease in the rate of new infections and deaths. [7] With gradual release of these lockdown strategies, monitoring and tracking of SARS-CoV-2 specific antibody levels becomes highly important. Many critical aspects of the humoral immune response against SARS-CoV-2 are currently not well understood. [8] In addition, levels of infection in the general population in different areas remain largely unknown due to proportion of undocumented cases arising from asymptomatic individuals. [9, 10] who had not been subjected to RNA testing, or to limitations in testing capacity especially in areas of relatively high prevalence. Public health control strategies aiming at regulating human mobility and social behaviour in order to suppress the infection rate will have to take into account the proportion of seropositive individuals in the general population, or in specific population groups. [11] Information on the level of antiviral antibodies, as well as on the serological response against different viral proteins, is also a key element of understanding the nature, development and durability of the antiviral immune response. Therefore, specific, sensitive and reliable methods for the quantitative detection of virus specific antibodies in human specimens are urgently needed from the beginning of an emerging pandemic.

Compared to approaches for direct virus diagnostics by PCR, development of test systems for detection of SARS-CoV-2 specific antibodies proved to be more challenging. In particular, cross reactivity of antibodies against circulating common cold coronaviruses (strains OC43, NL63, 229E and HKU1) are of concern in this respect as it was observed in case of serological tests developed for closely related SARS-CoV and MERS-CoV. [12] Developments in the past months yielded well val-idated, commercially available ELISA or (electro)chemoluminescencebased kits for SARS-CoV-2 serological diagnostics. However, initially marketed test kits underwent a very rapid development and approval process due to the emergency of the situation; low numbers of samples were used for validation. Consequently, sensitivity and specificity of the test systems often failed to meet the practical requirements. [13] Furthermore, the disruption of supply chains and high demand for tests during pandemic situations can lead to shortage of commercially available test kits and/or required reagents, as witnessed in the early phases of the ongoing SARS-CoV-2 pandemic. Thus, complementary strategies to test for antiviral antibodies that can be rapidly deployed in situations where commercially available kits are either not yet developed or not available are an important addition to the diagnostic toolkit.

is a classical serological approach in virus diagnostics and has been applied to coronavirus infections, including the closely related virus SARS-CoV. [14] [15] [16] The advantages of IF are (i) that it does not depend on specific diagnostic reagent kits or instruments, (ii) that the specimen contains all viral antigens expressed in the cellular context and (iii) that the method has the potential to provide high information content (differentiation of staining patterns and intensities due to reactivity against various viral proteins). A mayor disadvantage of the IF approach as it is typically used in serological testing is its limited throughput capacity due to the involvement of manual microscopy handling steps and sample evaluation based on visual inspection of micrographs. Furthermore, visual classification is subjective and thus not well standardized and yields only binary results. Here, we address those limitations, making use of advanced automated microscopy and image analysis strategies developed for basic research. We present the establishment and validation of a semi-quantitative, semi-automated workflow for SARS-CoV-2 specific antibody detection. With its 96-well format, semi-automated microscopy and automated image analysis workflow it combines advantages of IF with a reliable and objective semi-quantitative readout and high throughput compatibility. The protocol described here was developed in response to the emergence of SARS-CoV-2, but it represents a general approach that can be adapted for the study of other viral infections and is suitable for rapid deployment to support diagnostics of emerging viral infections in the future.

We decided to use cells infected with SARS-CoV-2 as samples for our IF analyses, since this setup provides the best chance for detection of antibodies targeted at the different viral proteins expressed in the host cell context. African green monkey kidney epithelial cells (Vero E6 cell line) have been used for infection with SARS-CoV-2, virus production and IF. [3, 17] In preparation for our analyses we compared different cell   lines for use in infection and IF experiments, but all tested cell lines   were found to be inferior to Vero E6 cells for our purposes (see Mate- rials and Methods and Figure S1 ). All following experiments were thus carried out using the Vero E6 cell line.

In order to allow for clear identification of positive reactivity in spite of a variable and sometimes high nonspecific background from human sera, our strategy involves a direct comparison of the IF signal from infected and non-infected cells in the same sample. Preferential antibody binding to infected compared to non-infected cells indicates the presence of specific SARS-CoV-2 antibodies in the examined serum. Under our conditions, infection rates of ∼40%-80% of the cell population were achieved, allowing for a comparison of infected and non-infected cells in the same well of the test plate. An antibody that detects dsRNA produced during viral replication was used to distinguish infected from non-infected cells within the same field of view ( Figure 1A ).

In order to define the conditions for immunostaining using human serum, we selected a small panel of negative and positive control sera. Four sera from healthy donors collected before November 2019 were chosen as negative controls, and eight sera from PCR confirmed COVID-19 inpatients collected at day 14 or later post symptom onset was employed as positive controls. Sera from this test cohort were used for primary staining, and bound antibodies were detected using fluorophore-coupled secondary antibodies against human IgG, IgA or IgM.

No difference between infected and non-infected cells in serum IgG antibody binding was observed when sera collected before the onset of the SARS-CoV-2 pandemic were examined ( Figure 1B , Figure S2 ). In contrast, COVID-19 patient sera were clearly characterized by higher serum IgG antibody binding to infected compared to non-infected cells ( Figure 1B ). All eight COVID-19 patient serum samples yielded higher IgG binding to infected compared to non-infected cells as assessed by visual inspection ( Figure S2 ). Similar results were obtained when an IgA or IgM specific secondary antibody was used for detection (Figure S3) . In order to allow for the parallel assessment of IgG and IgA or IgM antibodies, we established conditions for the parallel detection of anti-IgG coupled to AlexaFluor488 and anti-IgA or anti-IgM coupled to DyLight650 or AlexaFluor647 secondary antibodies, respectively, without signal bleedthrough. Using this approach, it was possible to implement detection of SARS-CoV-2 specific IgG and IgA or IgM antibodies in a single experimental setup ( Figure S4 ).

Titration experiments were performed with positive control sera to determine the optimal range of serum concentration in the IF experiments. All eight positive control samples showed visually detectable specific labelling of infected cells over the range of 1:10 2 and 1:10 5 , demonstrating robustness of the assay ( Figure S5 ). Serum concentrations of less than 1:10 5 did not yield detectable signals in all cases. We decided to employ a dilution of 1:10 2 in the further experiments

Our next aim was to establish a semi-automated analysis workflow for image acquisition and analysis for a medium to high throughput setting.

Vero E6 cells were seeded into 96-well plates infected and immunostained using anti-dsRNA antibody and patient serum, followed by indirect detection using a mixture of anti-IgG and anti-IgA/IgM secondary antibodies. Images were acquired using an automated widefield microscope (see Materials and Methods section for more detail).

To obtain a measure for specific antibody binding we performed automated segmentation of cells and classified them into infected and non-infected cells based on the dsRNA staining. We then measured fluorescence intensities in the serum channel per cell as a proxy for the amount of bound antibodies for both infected and non-infected cells and calculated the ratio between these values for infected and noninfected cells in a given specimen. To enable training of a machine learning approach for cell segmentation and to directly evaluate infected cell classification, we manually labelled cells and annotated them as infected/non-infected in 10 images chosen from five positive and five control specimens. Figure 2 presents a graphical overview of all analysis steps; the full description of every step can be found in Materials and Methods. Briefly, our approach works as follows:

First, we manually discarded all images that contained obvious artefacts such as large dust particles or dirt and out-of-focus images. Then, images were processed to correct for the uneven illumination profile in each channel. Next, we segmented individual cells with a seeded watershed algorithm, [18] using nuclei segmented via StarDist [19] as seeds and boundary predictions from a U-Net [20, 21] as a heightmap. We evaluated this approach using leave-one-image-out cross-validation on the manual annotations and measured an average precision [22] of 0.77 ± 0.08 (i.e., on average 77% of segmented cells are matched correctly to the corresponding cell in the annotations). Combined with extensive automatic quality control, which discards outliers in the results, the segmentation was found to be of sufficient quality for our analysis, especially since robust intensity measurements were used to reduce the effect of remaining errors.

We then classified the segmented cells into infected and noninfected, by measuring the 95th percentile intensities in the dsRNA channel and classifying cells as infected if this value exceeded 4.8 times the noise level, determined by the mean absolute deviation. This factor and the percentile were determined empirically using grid search on the manually annotated images (see above). Using leave-one-out cross validation on the image level, we found that this approach yields an average F1-score of 84.3%.

In order to make our final measurement more reliable, we then dis- panels) and a negative control serum (lower panels), followed by staining with an AlexaFluor488-coupled anti-IgG secondary antibody. Nuclei (grey), IgG (green), dsRNA (magenta) channels and a composite image are shown. White boxes mark the zoomed areas. Dashed lines mark borders of non-infected cells that are not visible at the chosen contrast setting. Note that the upper and lower panels are not displayed with the same brightness and contrast settings. In the lower panels the brightness and contrast scales have been expanded in order to visualize cells in the IgG serum channel where only background staining was detected. Scale bar is 20 µm in overview and 10 µm in the insets To score each sample, we computed the intensity ratio:

Here, m I is the median serum intensity of infected cells and m N the median serum intensity of non-infected cells. For each cell, we compute its intensity by computing the mean pixel intensity in the serum channel (excluding the nucleus area where we typically did not observe serum binding) and then subtracting the background intensity, which is measured on two control wells that did not contain any serum.

We used efficient implementations for all processing steps and deployed the analysis software on a computer cluster in order to enhance the speed of imaging data processing. For visual inspection, we have further developed an open-source software tool (PlateViewer) F I G U R E 2 Schematic overview of the image processing pipeline. Initially, images are subjected to the first manual quality control, where images with acquisition defects are discarded. A pre-processing step is then applied to correct for barrel artifacts. Subsequently, segmentation is obtained via seeded watershed, this algorithm requires seeds obtained from StarDist segmentation of the nuclei and boundary evidence computed using a neural network. Lastly, using the virus marker channel we classify each cell as infected or not infected and we computed the scoring. A final automated quality control identifies and automatically discards non-conform results. All intermediate results are saved in a database for ensuring fully reproducibility of the results for interactive visualization of high-throughput microscopy data. [23] PlateViewer was used in a final quality control step to visually inspect positive hits. For example, PlateViewer inspection allowed identifying a characteristic spotted pattern co-localizing with the dsRNA staining ( Figure S6 ) that was sometimes observed in the IgA channel upon staining with negative control serum. In contrast, sera from COVID-19 patients typically displayed cytosol, ER-like and plasma membrane staining patterns in this channel ( Figure 1B , Figure S3 ). The dsRNA colocalizing pattern observed for sera from the negative control cohort is by definition non-specific for SARS-CoV-2, but would be classified as a positive hit based on staining intensity alone. Using PlateViewer, we performed a quality control on all IgA positive hits and removed those displaying the spotted pattern colocalizing with the dsRNA signal from further analysis.

With the immunofluorescence protocol and automated image analysis in place we proceeded to test a larger number of control samples in a high throughput compatible manner for assay validation. All samples were processed for IF as described above, and in parallel analysed by a commercially available semi-quantitative SARS-CoV-2 ELISA approved for diagnostic use (Euroimmun, Lübeck, Germany) for the presence of SARS-CoV-2 specific IgG and IgA antibodies.

As outlined above, a main concern regarding serological assays for SARS-CoV-2 antibody detection is the occurrence of false positive results. A particular concern in this case is cross-reactivity of antibodies that originated from infection with any of the four types of common cold Corona viruses (ccCoV) circulating in the population.

The highly immunogenic major structural proteins of SARS-CoV-2 nucleocapsid (N) and spike (S) protein, have an overall homology of ∼30% [3] to their counterparts in ccCoV and subdomains of these proteins display a higher degree homology; cross-reactivity with ccCoV has been discussed as the major reason for false positive detection in serological tests for closely related SARS-CoV and MERS-CoV. [12] F I G U R E 3 Examples of results from the automated image analysis pipeline. Panels display images that correspond to three different ratio scores (ratio score is indicated above the image) determined from samples stained with three different human sera, followed by staining with an anti-IgG secondary antibody coupled to AlexaFluore488. Images represent overlays of three channels-nuclei (blue), IgG (green) and dsRNA (red). White boxes mark the zoomed area. Cells in the insets are highlighted with yellow or cyan boundaries, indicating infected and non-infected cells, respectively. Scale bar = 10 µm

(CMV) may result in unspecific reactivity of human sera. [24, 25] We therefore selected a negative control panel consisting of 218 sera col- Sera were employed as primary antisera for IF staining using IgM, IgA or IgG specific secondary antibodies, and samples were imaged and analysed as described above. This procedure yielded a ratiometric intensity score for each serum sample. Based on the scores obtained for the negative control cohort and the patient sera, we defined the threshold separating negative from positive scores for each of the antibody channels. For this, we performed ROC curve analysis [26] [27] [28] on a subset of the data (cohorts A, B, C, Z). Using this approach, it is possible to take the relative importance of sensitivity versus specificity as well as seroprevalence in the population (if known) into account for optimal threshold definition. By giving more weight to false positive or false negative results, one can adjust the threshold dependent on the context of the study. Whereas high sensitivity is of importance for e.g. monitoring seroconversion of a patient known to be infected, high specificity is crucial for population based screening approaches, where large study cohorts characterized by low seroprevalence are tested.

Since we envision the use of the assay for screening approaches, we decided to assign more weight to specificity at the cost of sensitivity for our analyses (see Materials and Methods for an in-depth description of the analysis). Optimal separation in this case was given using threshold values of 1.39, 1.31, and 1.27 for IgA, IgG and IgM channels, respectively ( Figure S7 ). We validated the classification performance on negative control cohort E (n = 57) that was not seen during threshold selection, and detected no positive scores. Results from the analysis of the negative control sera are presented in Figure 4 and Table 1 .

While the majority of these samples tested negative in ELISA measurements as well as in the IF analyses, some positive readings were obtained in each of the assays, in particular in the IgA specific analyses ( Figure 4 and Table 1 ). Since samples from these cohorts were collected between 2015 and 2019, and donors were therefore not exposed to SARS-CoV-2 before sampling, these readings represent false positives. Of note, negative control cohort E displayed a particularly high rate of false positives in ELISA measurements, but not in IF (Table 1) . We conclude that the threshold values determined achieve our goal of yielding highly specific IF results (at the cost of sub-maximal sensitivity).

Roughly 10.6% (IgA) or 3% (IgG) of the samples were classified as positive or potentially positive by ELISA ( Table 1 ). The notably lower specificity of the IgA determination in a seronegative cohort observed here is in accordance with findings in other studies [29, 30] and information provided by the manufacturer of the test (90.5% for Since a corresponding IgM specific ELISA kit from Euroimmun was not available, correlation was not analysed in this case. In some cases, antibody binding above background was undetectable by IF in non-infected as well as in infected cells, indicating low unspecific cross-reactivity and lack of specific reactivity of the respective serum. In order to allow for inclusion of these data points in the graph, the IF ratio was set to 1. 

In order to determine the sensitivity of our IF assay, we employed For an assessment of sensitivity, we stratified the samples according to the day post symptom onset, as shown in Figure 6 . and Table 2 . 

Here, we describe the development of a semi-quantitative IF based assay for detection of SARS-CoV-2 specific antibodies in human samples that complements available ELISA-based testing systems. [33, 34] Alternatives to ELISA-based commercial test kits are important in situations where those kits are not available either because they are not yet developed in early days of the pandemic or due to high global demands for tests and required reagents. The microscopy-based assay described here has been developed during the early phase of the COVID-19 pandemic to support the serological testing needs of the University Hospital Heidelberg, Germany and is employed as a confirmatory assay in clinical studies [35] and ongoing studies]. The assay displayed comparable or slightly better sensitivity and specificity than a commercially available semi-quantitative SARS-CoV-2 ELISA approved for diagnostic use at the time. More importantly, combining two technically different serological assays, IF and ELISA, and classifying as "positive hits" only those that scored positive in both assays was instrumental to minimize false positive results while maintaining high sensitivity, and thus serves as a principle for serological studies or diagnostics where specificity of detection is of critical importance. Specificity of detection is essential in settings of relatively low SARS-CoV-2 antibody prevalence [36] [37] [38] in conjunction with high prevalence of potentially cross-reactive anti-ccCoV antibodies in a global population. [39] Advantages and disadvantages

One advantage of the IF based assay presented here is that the specimens used for detection present the entire viral proteome, while ELISA or chemiluminescent approaches use a single recombinantly expressed antigen. Both the N and S protein of coronaviruses are highly immuno-genic, and antibodies binding to the receptor binding domain on the S1 subunit are considered most relevant for neutralization. However, the relative importance of antibodies directed against the N protein for potential protective immunity against SARS-CoV-2 and the possible relevance of the overall breadth of the antibody response is currently unclear. Other SARS-CoV-2 structural and non-structural proteins might play a role in immune response as it was shown for proteins 3a and 9b of the closely related SARS-CoV. [40] In addition, expression of the viral proteome in permissive cells ensures correct protein folding and post-translational modification patterns. Alterations in post-translational modifications are likely to influence the ability of serum antibodies to bind to different viral epitopes as it was shown for other viruses such as HIV-1. [41] It has to be noted that the detection of viral RNA requires fixation and permeabilization of cells, which has the potential to affect epitope preservation. However, based on the high sensitivity of antibody detection and the good correlation to ELISA measurements observed we conclude that this was no major concern in this case. This is already the throughput in the range of some ELISA-based automated systems used in diagnostics and is sufficient for urgent applications in an early phase of disease response.

The major disadvantage of the procedure described here for a virus like SARS-CoV-2 is the requirement of a BSL3 containment area to generate virus stocks and produce infected cell specimens. Recombinant cell lines expressing key viral antigens can address this drawback and also allows to easily implement already established automated cell seeding and immunostaining pipelines for a true high-throughput application. [42, 43] Combining such cell lines with spectral unmixing microscopy [44] would not only enable simultaneous determination of levels of all three major classes of antibodies (IgM, IgG, and IgA), but also identification of the viral antigens recognized, in a single multiplexed approach. The high information content of the IF data (differential staining patterns) together with a machine learning-based approach [45] and the implementation of stable cell lines expressing selected viral antigens in the IF assay will provide additional parameters for classification of patient sera and further improve sensitivity and specificity of the presented IF assay.

The described analysis pipeline can be readily applied for serological analysis of other virus infections, provided that an infectable cell line and a staining procedure that allows differentiating between infected and non-infected cells are available. The assay described here thus offers potential as an immediate response to any future virus pandemic, as it can be rapidly deployed from the moment the first isolate of the pathogen has been obtained without requiring information on the expression or immunogenicity of viral proteins.

In summary, we have developed a microscopy-based assay for semi- 

In order to find a suitable cell line for our application, we performed 

ELISA measurements for determination of reactivity against the S1 domain of the viral spike protein were carried out using the Euroim- 

Two of our processing steps require manually annotated data: in order to train the convolutional neural network used for boundary and foreground prediction, we needed label masks for the individual cells. To determine suitable parameters for the infected cell classification, we needed a set of cells classified as being infected or noninfected. We have produced these annotations for 10 images with the following steps. First, we created an initial segmentation following the approach outlined in the Segmentation subsection, using boundary and foreground predictions from the ilastik [46] pixel classification workflow, which can be obtained from a few sparse annotations.

We then corrected this segmentation using the annotation tool BigCat 

On all acquired images, we performed minimal preprocessing (i.e., flatfield correction) in order compensate for uneven illumination of the microscope system. [47] First, we subtract a constant CCD camera offset (ccd_offset). Secondly, we correct uneven illumination by dividing each channel by a corresponding corrector image (flatfield(x, y)), which was obtained as a normalized average of all images of that channel, smoothed by a normalized convolution with a Gaussian filter with a bandwidth of 30 pixels.

This corrector image was obtained for all images of a given microscope set-up. Full background subtraction is performed later in the pipeline using either the background measured on wells that (deliberately) do not contain any serum or, if not available, using a fixed value that was determined manually.

Cell segmentation forms the basis of our analysis method. In order to obtain an accurate segmentation, we make use of both the DAPI and the serum channel. First, we segment the nuclei on the DAPI channel using the StarDist method [19] trained on data from Caicedo et al.

2019. [48] Note that this method yields an instance segmentation: each nucleus in the image is assigned a unique ID. In addition, we predict per pixel probabilities for the boundaries between cells and for the foreground (i.e., whether a given pixel is part of a cell) using a 2D U-Net [20] based on the implementation of Wolny et al. 2020. [21] This method was trained using the nine annotated images, see above. The cells are then segmented by the seeded watershed algorithm. [18] We use the nucleus segmentation, dilated by three pixels, as seeds and the boundary predictions as the height map. In addition, we threshold the foreground predictions, erode the resulting binary image by 20 pixels and intersect it with the binarised seeds. The result is used as a foreground mask for the watershed. The dilation/erosion is performed to alleviate issues with very small nucleus segments/imprecise foreground predictions. In order to evaluate this segmentation method, we train nine different networks using leave-one-out cross-validation, training each network on eight of the manually annotated images and evaluating it on the remaining one. We measure the segmentation quality using average precision [22] at an intersection over union (IoU) threshold of 0.5 as described in https://www.kaggle.com/c/datascience-bowl-2018/overview/evaluation. We measure a value of 0.77 ± 0.08 with the optimum value being 1.0.

Infection classification 

For additional robustness against intensity variations we adapt the threshold based on the variation in the background in the plate. Hence, we define it as a multiple of the mean absolute deviation of all background pixels of that plate with N = 4.8:

To determine the optimal values of the parameters used in our procedure, we used the cells manually annotated as infected/non-infected (see above). We performed grid search over the following parameter ranges: 

In order to obtain a relative measure of antibody binding, we determined the mean intensity and the integrated intensity in each segmented cell from images recorded in the IgG, IgA, or IgM channel. A comparative analysis revealed that the mean intensity was more robust against the variability of cell sizes, whereas using the integrated intensity as a proxy yielded a higher variance in non-infected cells. Thus, mean intensity per cell was chosen as a proxy for the amount of antibody bound. Non-specific auto-fluorescence signals required a background correction of the measured average serum channel intensities.

For background normalization, we used cells (one well per plate) that

were not immunostained with primary antiserum. From this we computed the background to be the median serum intensity of all pixels of images taken from this well. This value was subtracted from all images recorded from the respective plate. In case this control well was not available, background was subtracted manually by selecting the area outside of cells in randomly selected wells and measuring the median intensity.

The core interest of the assay is to measure the difference of anti- 

Using these, the ratio r, difference d and robust z score z are computed:

We compute above scores for each well and each image, taking into account only the cells that passed all quality control criteria (see below). While the final readout of the assay is well-based, image scores are useful for quality control.

In order to determine the presence of SARS-CoV-2 specific antibodies in patient sera, it was necessary to define a decision threshold r*.

If a measured intensity ratio r is above a decision threshold r* than the serum would be characterized as positive for SARS-CoV-2 antibodies.

For this an ROC analysis was performed. [28] Each possible choice of r* for a test corresponds to a particular sensitivity/specificity pair. By continuously varying the decision threshold, we measured all possible sensitivity/specificity pairs, known as ROC curves ( Figure S7 ). To determine the appropriate r* we considered two factors [26] :

• The undesirability of errors or relative cost of false-positive and false-negative classifications • The prevalence, or prior probability of disease These factors can be combined to calculate a slope in the ROC plot [26] [27] [28] 

where P is the prevalence or prior probability of disease.

The optimal decision threshold r*, given the false-positive/falsenegative cost ratio and prevalence, is the point on the ROC curve where a line with slope m touches the curve. As discussed in the main text, a major concern regarding serological assays for SARS-CoV-2 antibody detection is the occurrence of false-positive results. Therefore, we choose m to be larger than one in our analysis. In particular, we determine r* for the choice of m = 10 (see Figure S7 ).

We performed quality control of the images and analysis results at the level of wells, images and cells. The entities that did not pass quality control are not taken into account when computing the score during final analysis. We exclude wells that contain less than 100 non-infected cells, that have a median serum intensity of infected cells smaller than three times the noise level (measured by the median absolute deviation), or that have negative intensity ratios, which can happen due to the background subtraction. Out of 1736 wells, 94 did not pass the quality control, corresponding to 5.4% of wells. At the image level, we visually inspect all images and mark those that contain imaging artifacts using a viewer based on napari. [49] We distinguish the following types of artifacts during the visual inspection: empty, unstained or over-saturated images, as well as images covered by a large bright object. In addition, we automatically exclude images that contain less than 10 or more than 1000 cells. These thresholds are motivated by the observation that too few or too many cells often result from a problem in the assay. Thus, 296 of the total 15,624 images were excluded from further analysis, corresponding to 1.9% of images. Out of these, 295 were manually marked as outliers and only a single one did not pass the subsequent automatic quality control. Finally, we automatically exclude segmented cells with a size smaller than 250 pixels or larger than 12,500 pixels that most likely correspond to segmentation errors. These limits were derived by the histogram of cell sizes investigated for several plates. Two percent of the approximate 5.5 million segmented cells did not pass this quality control. In addition, we have also inspected all samples scored as positives. For the IgA channel, we have found a dotty staining pattern in ten cases that produced positive hits based on intensity ratio in negative control cohorts, but does not appear to indicate a specific antibody response. We have also excluded these samples from further analysis.

In order to scale the analysis workflow to the large number of images produced by the assay, we implemented an open-source python library to run the individual analysis steps. This library allows rerunning experiments for a given plate for newly added data on demand and caches intermediate results in order to rerun the analysis from checkpoints in case of errors in one of the processing steps. To this end, we use a file layout based on hdf5 [50] to store multi-resolution image data and tabular data. The processing steps are parallelized over the images of a plate if possible. We use efficient implementations for the U-Net, [21] StarDist [19] and the watershed algorithm (http://ukoethe. github.io/vigra/) as well as other image processing algorithms. [51] We 

In order to explore the numerical results of our analysis together with the underlying image data we further developed a Fiji [52] based opensource software tool for interactive visualization of high-throughput microscopy data. [23] The PlateViewer links interactive results tables and configurable scatter plots (image and well based) with a plate view of all raw, processed and segmentation images. The PlateViewer is connected to the centralised database such that also image and well based metadata can be accessed. The viewer thus enables efficient visual inspection and scientific exploration of all relevant data of the presented assay.

We would like to thank Martin Weigert and Uwe Schmidt for their help with setting up prediction for StarDist. We would like to acknowledge Infectious Disease Imaging Platform (IDIP) at Center for Integrative Infectious Diseases Research (CIID) for microscopy support. We would like to thank EMBL, especially the EMBL IT Services Department for providing computational infrastructure and support, as well as Wolfgang Huber for discussions on computing image based scores and statistical tests. We thank the patients who participated in this study. We also thank Christian Drosten at the Charité, Berlin and the European Virus Archive (EVAg) for the provision of the SARS-CoV-2 strain BavPat1. Individual images used in the Figure 1A Open access funding enabled and organized by Projekt DEAL.

A new coronavirus associated with human respiratory disease in China

The species Severe acute respiratory syndrome-related coronavirus: Classifying 2019-nCoV and naming it SARS-CoV-2

A pneumonia outbreak associated with a new coronavirus of probable bat origin

Detection of 2019 novel coronavirus (2019-nCoV) by real-time RT-PCR

A colorimetric RT-LAMP assay and LAMP-sequencing for detecting SARS-CoV-2 RNA in clinical samples

Towards effective diagnostic assays for COVID-19: A review

The effect of large-scale anti-contagion policies on the COVID-19 pandemic

Six months of coronavirus: The mysteries scientists are still racing to solve

Estimating the asymptomatic proportion of coronavirus disease 2019 (COVID-19) cases on board the Diamond Princess cruise ship

Clinical and epidemiological features of 36 children with coronavirus disease 2019 (COVID-19) in Zhejiang, China: An observational cohort study

Modeling shield immunity to reduce COVID-19 epidemic spread

Serological assays for emerging coronaviruses: Challenges and pitfalls

Evaluation of SARS-CoV-2 serology assays reveals a range of test performance

Immunofluorescence assay for detection of the nucleocapsid antigen of the severe acute respiratory syndrome (SARS)-associated coronavirus in cells derived from throat wash samples of patients with SARS

Evaluation of a safe and sensitive Spike protein-based immunofluorescence assay for the detection of antibody responses to SARS-CoV

SARS-coronavirus-2 replication in Vero E6 cells: Replication kinetics, rapid adaptation and cytopathology

The Watershed Transform: Definitions, Algorithms and Parallelization Strategies

Cell detection with star-convex polygons

U-Net: Convolutional networks for biomedical image segmentation. In Lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics

Accurate and versatile 3D segmentation of plant tissues at cellular resolution. Elife

The pascal visual object classes (VOC) challenge

PlateViewer: Fiji plugin for visual inspection of high-throughput microscopy based image data

Epstein-Barr virus and Cytomegalovirus infections cause falsepositive results in IgM two-test protocol for early Lyme borreliosis

Acute Epstein-Barr virus infection and human immunodeficiency virus antibody cross-reactivity

Receiver-operating characteristic (ROC) plots: A fundamental evaluation tool in clinical medicine

A review on the methodology for assessing diagnostic tests

Primer on certain elements of medical decision making

Evaluation of nine commercial SARS-CoV-2 immunoassays

Severe acute respiratory syndrome coronavirus 2−Specific antibody responses in coronavirus disease

Virological assessment of hospitalized patients with COVID-2019

Antibody responses to SARS-CoV-2 in patients with COVID-19

A serological assay to detect SARS-CoV-2 seroconversion in humans

Serology assays to manage COVID-19

Prevalence of SARS-CoV-2 Infection in Children and Their Parents in Southwest Germany

COVID-19 Antibody Seroprevalence in

Suppression of a SARS-CoV-2 outbreak in the Italian municipality of Vo

Repeated seroprevalence of anti-SARS-CoV-2 IgG antibodies in a population

Development of a nucleocapsidbased human coronavirus immunoassay and estimates of individuals exposed to coronavirus in a U.S. metropolitan population

Antibody responses to individual proteins of SARS coronavirus and their neutralization activities

A potent and broad neutralizing antibody recognizes and penetrates the HIV glycan shield

High-throughput fluorescence microscopy for systems biology

The beautiful cell: High-content screening in drug discovery

Spectral imaging and its applications in live cell microscopy

Leveraging machine vision in cellbased diagnostics to do more with less

Ilastik: Interactive machine learning for (bio)image analysis

Comparative evaluation of retrospective shading correction methods

Nucleus segmentation across imaging experiments: The 2018 Data Science Bowl

Hierarchical data format 5 : HDF5. In Handbook of open source tools

Scikit-image: Image processing in Python

Fiji: An open-source platform for biological-image analysis

Microscopy-based assay for semi-quantitative detection of SARS-CoV-2 specific antibodies in human sera

The authors declare they have no conflicts of interest. 

The data from the IF assay are available in the BioImage Archive