key: cord-0271594-re8acamc
authors: Deng, Wenxuan; Zhu, Biqing; Park, Seyoung; Sumida, Tomokazu S.; Unterman, Avraham; Hafler, David; Cruz, Charles S. Dela; Kaminski, Naftali; Lucas, Carrie L.; Zhao, Hongyu
title: ProtAnno, an Automated Cell Type Annotation Tool for Single Cell Proteomics Data that integrates information from Multiple Reference Sources
date: 2021-09-13
journal: bioRxiv
DOI: 10.1101/2021.09.13.460162
sha: 85c3fccaf0b0e6686550123b9805b528fa3aa154
doc_id: 271594
cord_uid: re8acamc

Compared with sequencing-based global genomic profiling, cytometry labels targeted surface markers on millions of cells in parallel either by conjugated rare earth metal particles or Unique Molecular Identifier (UMI) barcodes. Correct annotation of these cells to specific cell types is a key step in the analysis of these data. However, there is no computational tool that automatically annotates single cell proteomics data for cell type inference. In this manuscript, we propose an automated single cell proteomics data annotation approach called ProtAnno to facilitate cell type assignments without laborious manual gating. ProtAnno is designed to incorporate information from annotated single cell RNA-seq (scRNA-seq), CITE-seq, and prior data knowledge (which can be imprecise) on biomarkers for different cell types. We have performed extensive simulations to demonstrate the accuracy and robustness of ProtAnno. For several single cell proteomics datasets that have been manually labeled, ProtAnno was able to correctly label most single cells. In summary, ProtAnno offers an accurate and robust tool to automate cell type annotations for large single cell proteomics datasets, and the analysis of such annotated cell types can offer valuable biological insights.

rich data and resources on single cell transcriptomics (Regev et al. 2017; Lindeboom, Regev, 59 and Teichmann 2021). 60

In addition to collecting single cell data for one specific -oimcs data type, it is possible to 61 collect multi-omics data simultaneously at the single cell level, e.g., CITE-seq (Stoeckius et 62 al. 2017 ) that measures mRNA and antibody counts simultaneously by UMI barcode. These 63 data can characterize the cellular relationship between transcript and cell surface marker 64 abundance. Methods have been developed to bridge these two types of omics data. For 65 example, cTP-net (Zhou et al. 2020 ) is a transfer learning approach under a deep learning 66 framework to predict surface protein levels from scRNA-seq data. The generation of 67 diverse types of single cell data poses many computational challenges due to their high 68 dimensionality and large sample sizes. In this paper, we focus on the annotation of single 69

cell proteomics data to their corresponding cell types, which is a critical step in single cell 70 analysis. 71 A major advantage of single cell proteomics data compared with scRNA-seq data is their 72 high sensitivity and specificity. Based on surface marker expression patterns, manual 73 gating can be used to identify different cell populations by the expression distributions. 74

However, cell type labeling by manual gating on single cell data is labor intensive and 75 subjective as it highly depends on the expert who annotates the cells. Although some 76 methods can analyze these data based on state-of-the-art machine learning methods ( Preliminary labeling by SingleR can vastly accelerate cell type inference. However, there is 83 no similar automated tool for single cell proteomics data annotation. A recently developed 84 tool named CellGrid ) applied the same idea to CyTOF data but needed a 85 large number of labeled proteomics datasets as input. However, there is a lack of labeled 86 single cell proteomics data although well-annotated scRNA-seq data are more broadly 87

available. For the CITE-seq data, it is still necessary to annotate the transcriptomics and 88 proteomics data separately. 89

To overcome the lack of labeled single cell proteomics data, we introduce an automated 90 single cell proteomics data annotation approach called ProtAnno, based on non-negative 91 matrix factorization (NMF) to incorporate data from different reference sources. The only 92 essential input of ProtAnno is some prior knowledge on cell type-specific biomarkers. To 93 further improve annotation accuracy, ProtAnno can take advantage of publicly available 94 CITE-seq data and annotated scRNA-seq data. This enables ProtAnno to perform cell type 95 annotation with no prior characterization between cell types and surface proteins by 96 leveraging these external references. 97

We have evaluated the performance of ProtAnno through simulations under different 98 settings. The results showed the robustness of ProtAnno to biological variability, technical 99 noise, cell type number, and incomplete and inaccurate expert knowledge. We then applied 100

ProtAnno to three real datasets: peripheral blood mononuclear cell (PBMC) paired 101 stimulated B cell receptor CyTOF data, PBMC CITE-seq data from healthy subjects, and 102

longitudinal whole blood covid-19 CyTOF data grouped by patients' disease severity. In the 103 analyses of these real data, ProtAnno provided fast and accurate labeling as demonstrated 104 through comparisons with manual annotations or downstream biological investigations. In 105 summary, ProtAnno is a computationally efficient and statistically robust approach for 106 automated cell type annotation when only limited expert knowledge is available for single 107 cell proteomics data. ProtAnno deconvolutes the proteomic expression profile ∈ × into the product of the 115 cell type-specific signature matrix, ∈ × , and cell type assignment matrix, ∈ × , 116

i.e. X = WH. In the model, we have surface markers for cells in the proteomics data, e.g., 117

the cytometry data and antibody profile in CITE-seq. We denote as the number of cell 118

types. The columns of are matched with the known cell types in the same order. Due to 119 the non-negative requirement on the estimations of and , ProtAnno implemented NMF 120 for solving = . 121

In ProtAnno, we integrate information from both prior knowledge encoded in a matrix A 0 122 and relationship between protein markers and RNA-seq data encoded in a matrix A. public labeled single cell expression profile, and penalty parameters , , , . 152

Step 1: Set = 0 and generate the initial non-negative and with all ones. 153

Step 2: Update by rows and by columns: 154

Step 3: Repeat Step 2 until the convergence or reaching the pre-specified number of 156

iterations. 157

It can be shown that the algorithm converges in theory with the details of the theorems and 158

converging rates provided in the supplementary materials. 159

Since the penalty parameter values are critical to ProtAnno, we developed the following 161 algorithm to tune their values iteratively. 162

To ensure that ProtAnno can group expression profiles into reasonable clusters, we use the 163 default unsupervised Louvain algorithm for preliminary clustering. We then set the initial 164 parameter by the KKT condition; and then search the initial , , and values by 165 choosing from among 0.1, 1, 10, and 100. To find the initial and values, we maximize 166 the ARI between the Louvain clusterings and ProtAnno results. We consider a novel index 167 to select the initial value for : 168

The above function considers the difference between the mean values of and . The 170 main purpose of ( ) is to ensure that is on approximately the same scale as .

Otherwise, the deconvolution would be unstable since we penalize the column summation 172 of H to make the cell type proportion estimations meaningful. 173

To set more precise penalty terms, ProtAnno uses binary search to determine the final 174 optimal output as detailed in Algorithm 2, and allows the specified search depth depending 175 on the running time. If the final ARI is lower than 0.75, the algorithm will restart the binary 176 search at higher resolution. 177

Model input: Normalized expression profile matrix , discrete prior knowledge matrix , 178

publicly labeled single cell expression profile. 179

Step 1: Estimate Louvain clusters. 180

Step 2: Initialize and by an optimization method of choice ( = 1and = 10), and 181 initialize by KKT conditions: 182

Step 3: Initialize and by choosing from (0.1,1,10,100) and minimize Adjusted Rank 185

Index (ARI) with Louvain clustering. 186

Step 4: Initialize by estimated signature matrix W reliability by the following metric: 187 ( ) ∶= − . 188

Step 5: Select the best , , and by binary search. 190

Step 6 (Optional): If the final ARI with Louvain cluster is still lower than 0.75, we re-do the 191 binary search on all the penalty parameters. 

In the following, we describe how we simulated the protein expression profiles for each 200 single cell. We first generated the entries in the expert matrix from a multinomial 201 distribution with three categories, +1, -1, and 0, corresponding to biomarker knowledge for 202 a cell type, i.e., high expression, no expression, and no information. The matrix will be built 203

with the angles between vectors as large as possible, so that the simulated single cell 204 expression profile is a nonnegative linear combination of the proteomics signature 205 matrix (Zhang et al. 2008 ). To achieve it, we generate 100 matrices randomly and minmax 206 the inner products between column vectors to get the optimal one. This design captures the 207 nature of signature genes. However, the prior knowledge may not be in line with true 208

relationships between markers and cell types due to technical and biological variations. 209 We, therefore, regenerated an intermediate discrete matrix based on containing 210 three possible discrete values, 2 (high expression), 1 (low expression), and 0 (no 211 expression), by random walk and the protocols observed in real data (details in Methods). 212 We then derived the signature matrix based on from two truncated normal 213 distributions with high and low mean values, respectively, and varying relationships 214 between variances and means that reflect the signal noise ratio for a biomarker. The 215

difference of the means of the two truncated normal distributions together with their 216 variances dictate the informativeness of a specific biomarker for cell type specification. The 217

proteomics expression profile was sampled from the truncated normal distributions 218 based on the signature matrix that defines the mean and associated variance for each 219 biomarker in each cell type. Finally, to simulate the predicted proteomics signature matrix 220 from transcriptomics data, which is the product of the protein-RNA association matrix , 221 and the transcriptomics signature matrix , [ ], we sampled the element [ ] in the 222 matrix with mean , the expression level of signature gene in cell type , and variance 223 /(2 * ). A smaller results in a weaker correlation between the two proteomics 224 signature matrices. All the simulation details can be found in Materials and Methods. 225 226 We considered six simulation scenarios to cover a wide range of biological and technical 227 variations in real data, with data quality varying from high (scenario 1) to low (scenario 6). 228 We tuned the parameters of the normal distributions to decrease the cell type distinction 229 and enlarge the expression profile divergence. To simulate the situations when AS or is 230 too noisy to have a good prediction power, we also added more uncertainty to AS and 231 that led to a lower correlation with real signature matrix W. The overall noise increased 232 from scenario 1 to 6. In scenario 5, we increased the noisy level of expert matrix so that 233

it is more aberrant compared with the single cell proteomics data expression pattern. In 234 addition, the correlation of and was set at only 0.2. These large noises from 235 references would reduce the annotation accuracy. Furthermore, we increased expression 236 profile variations by lowering the mean-variance ratio in scenario 6, making the annotation 237 more challenging. All the parameters settings and details are listed in Supplementary Table  238 1. 239 240

We compared the performance of six models: the full ProtAnno model, the ProtAnno model 241 without transcriptomics information , the ProtAnno model without expert knowledge 242

, the unsupervised ProtAnno model, the unsupervised Louvain clustering, and cell type 243 assignment by non-negative least squares (NNLS) when the protein signature matrix is 244

known. The last one is the best result an NMF model can achieve. We consider five metrics 245 for annotation accuracy: Adjusted Rand Index (ARI), annotation accuracy assigned by the 246 largest value for every single cell, Normalized Mutual Information (NMI), cosine similarity 247 between the estimated and real assignment, and Average Silhouette Width. 248

The simulation results in Figure 2A show that the full ProtAnno model achieved the best 249 annotation accuracy, especially compared with the NNLS model. Specifically, from 250 scenarios 1 to 4, ProtAnno was able to achieve the same performance an NNLS model could 251

have when the true signature matrix is known. These results suggest the importance of 252 having precise information for and . Besides the full ProtAnno model, the partial 253

ProtAnno model with the expert matrix had the second-best performance, just slightly 254

worse than the full model. The partial ProtAnno model only having the transcriptomics 255 reference was not competitive with the first two models. But it was still much better than 256 the unsupervised clustering, even in terms of ARI and NMI. In general, both full and partial 257

ProtAnno models could obtain accurate cell type assignments even in the presence of 258 considerable noise. 259

We investigated the numerical convergence of ProtAnno in our simulation studies. Fig 2B  260 shows an example of ProtAnno cell type assignment in a UMAP plot with 29 cell types, 261

where the accuracy was 98.8%. The high agreement of clustering between the real labels 262

(left) and the ProtAnno annotation (right) demonstrates the power of ProtAnno. 263 264 Algorithm convergence and robustness 265 266

ProtAnno generally converges after 100 iterations (Supp 1A). To filter the high confidence 267 annotation, the subsetting step will keep the cell whose estimated is greater than 0.5 268 after column normalization. The subsetting step was able to slightly enhance the 269 annotation accuracy (Supp 1B). 270

ProtAnno's robustness was investigated with respect to three parameters: mean-variance 271 ratio ( Fig 3A) , correlation between and AS ( Fig 3A, supp 1C) , and cell type number 272 ( Fig 3C) . Since a protein with a higher mean expression level tends to have a higher 273 variance, we used mean-variance ratio to characterize the signal noise ratio of a protein. As 274 expected, the prediction accuracy dropped with a reduced mean-variance ratio ( Fig 3A) . 275 Overall, the full ProtAnno model was very robust even when the prediction power was 276 weak (supp 1C), implying the critical role of . We also explored how the transcriptomics 277 information could help the annotation when is not available (supp 1C). The accuracy 278 rate was almost linearly associated with the prediction correlation. Therefore, scRNA-seq 279 data and dictionary CITE-seq data with high quality are essential for the good performance 280

of the partial ProtAnno model when no reliable expert knowledge is available. As for the 281 impact of the number of cell types, ProtAnno was robust to the large cell type number as 282 long as the cell counts of each cell population were more than 100 in our simulation ( Fig  283  2B , Fig 3C) . 284

The good performance of ProtAnno depends on the appropriate choices of the penalty 285 parameters. Simulation results suggest that our proposed parameter tuning algorithm was 286 able to find good penalty values ( 

The above objective function form illustrates the imposed relationships between W, AS, 291

and . The second penalty terms will force the ratio of elements in matrices W and AS to 292 be approximately equal to . Specifically, the ratio of scRNA-seq. We also constructed an expert-guided matrix (Table 1) . ProtAnno assigned a 314 cell to the cell type with the largest value in matrix H. 315

The overall median accuracy of ProtAnno for this data set was 83% (Fig 4A) annotation. Therefore, an overall accuracy of 80% may be considered satisfactory for this 327 dataset. We further investigated the misclassification patterns by the confusion matrix for 328 patient 1 in the stimulated group ( Fig 4B) . It can be seen that most of the cells could be 329 correctly assigned by ProtAnno. However, some of the natural killer cells (NK cells) were 330 annotated as CD8 T cells. That is partly because the BCR data did not measure the 331 biomarker CD8, which offers important information for CD8 T cell assignment. For this 332 data set, the only biomarker that was informative to distinguish these two cell populations 333 is CD3, which makes computational annotation difficult. The UMAP plot also shows that the 334 annotation is highly consistent with the manually annotated cell types, although the 335 boundary of NK cells and CD8 T cells is hard to recognize (Fig 4E) . 336

Since many genes had significant differential expression levels across groups in the BCR 337 dataset, we studied the W matrix variations to test whether ProtAnno could accurately 338 estimate the signature matrix. The most differential expression directions in the stimulated 339 group by ProtAnno are consistent with prior biological knowledge ( Fig 4C) . For example, 340 biomarker HLA-DR had a significantly elevated expression in most cell types after BCR 341

stimulation. The inferred cell type proportions also suggest that ProtAnno can provide an 342

accurate estimation in the comparative study. For example, the proportions of naive B cells 343

were lower in the stimulated group ( Fig 4D) . These downstream analyses suggest accurate 344 annotation by ProtAnno. 345 (2020) (Ramaswamy et al. 2021 ) to obtain the association matrix and construct (Table  356 2). The automated annotation accuracy was around 99% with 100% accuracy for most cell 357 types, except for non-classical monocytes. ProtAnno failed to recognize this cell type since 358

it is a rare population whose cell type proportion is below 1% (Fig 5A) a slight increase at the end of the study (Fig 5D, 5E ). The changes between day 1 to day 12 386 were significant, especially for the ICU group with a p-value of 0.029. These results suggest 387 that most ICU patients were recovered after treatment. 388

In contrast, the lymphocyte proportion is expected to decline in severe covid-19 patients. 389

Lymphopenia has been used as a biomarker to define a patient's morbidity (Fathi and 390 Rezaei 2020; Tavakolpour et al. 2020; I. Huang and Pranata 2020). When we applied 391

ProtAnno to infer CD4 T and CD8 T cells, although there was no statistically significant 392 difference across groups, the CD4 T cell count was indeed elevated if the patients recovered 393 (Fig 5F) . In the longitudinal curves of cell counts, almost all the patients had increased CD4 394 T cell and CD8 T cell counts after treatment (Fig 5G) . This implies that the lymphopenia 395 was alleviated by treatment. 396

The analysis of this data set shows the usefulness of ProtAnno even when the data are In summary, ProtAnno can provide a robust preliminary automated cell type annotation. 433

Its performance could be further improved through adopting more advanced clustering 434 approaches and more reliable references. The annotation accuracy barplots across samples from stimulated (first row) and 541 unstimulated (second row) groups. The columns are the cell type numbers used (5 cell 542 types in the first column, 6 cell types in the second column, and 7 cell types in the third 543 column). The color represents the raw output or subsetted output from ProtAnno. 544

A) The annotation metrics with different subsetting cutoffs. The x-axis represents the cutoff 546

value. The y-axis represents the annotation metric value. 547

B) The numbers of kept cells after subsetting filter. The x-axis represents the cutoff value. 548

The y-axis represents the remaining cell counts. . 580

Step 2: Update with multiplicative update algorithm 581

In this step, we optimize ℎ column-wise. 582 The expert matrix was generated from a multinomial distribution containing three 590 categories, +1, -1, and 0. We generated 100 matrices and selected the optimal one that 591 had the smallest inner product between the vectors by minimax. Such construction can 592 more likely enable the expression profile X to be a nonnegative linear combination of basis 593

vectors. 594

In most cases, the expert matrix may be biased. We therefore generated an intermediate 595 discrete matrix with three possible discrete values, 2, 1, and 0, to represent the expert 596 knowledge matrix used. They were matched with three biomarker distributions within a 597 cell population: high, low, and no expression. In practice, if the entry in was 1, the 598 corresponding gene would have high expression levels in real data from our observations. 599

Thus, the corresponding element in kept the value, 1. However, if the entry in was -1, 600

it is possible to have low or medium expression level instead of no expression completely. 601

In some rare instances, the corresponding gene might even have high expression. 602

Therefore, the values of -1 would random walk to 1 with probability and to 2 with 603 probability in . If the entry in was 0, then the distribution was uncertain. In this 604 case, the values of 0 would random walk to 1 with probability and to 2 with probability 605

in . This generated the new discrete matrix based on the above random walk 606 protocols to distinguish the different biomarkers behaviors. 607

Next, the signature matrix was generated based on from two positive truncated 608 normal distributions with different expectations. When the entry in was 2, the 609 corresponding signature gene average expression was sampled from a positive truncated 610 normal distribution with a large mean; when the entry in was 1, the corresponding 611 signature gene average expression was sampled from a half-normal distribution with a 612 small mean; otherwise, it was set to 0.1. 613

After randomly simulating the cell type labels for each single cell in the proteomics 614 expression profile , the expression level was sampled from normal distributions with the 615 expectation of average signature gene expression in and variance calculated from 616 sampled mean and the specified mean-variance ratio. 617

618 ARI was calculated using the function adjustedRandIndex from the R package 619 mclust (Scrucca et al. 2016) . For NMI, the ground truth and the predicted label for each cell 620

were converted into one-hot vectors and the function NMI in the R package aricode (Vinh, 621 Epps, and Bailey 2009) was used to calculate the NMI for each cell. The reported result was 622 the average across all the cells. Cosine similarity was calculated by first computing the 623 cosine measure between the ground truth one-hot vector and the cluster assignment vector 624 using the cosine function from the lsa package in R, and finally averaging over all the cells. -

Public scRNA-seq S Prior Knowledge on cell type-specific surface marker

Prior knowledge matrix ! " Dictionary CITE-seq To get the optimal penalty parameters λ 1 ,λ 2 ,µ and η, we considered both the KarushKuhn-Tucker (KKT) condition combined with empirical screening. We first obtained the initial W and H by setting the arbitrary penalization: λ 1 = 1, λ 2 = 10, µ = 50 and η = 50. This setting can give an acceptable result in most cases based on our empirical experiments. The first order derivatives of loss function are

We first initialized η by satisfying KKT conditions.

by the second sufficient inequality, we set

After getting the initial η, we initialized λ2 and λ 1 in order by minimizing Adjusted Rank Index (ARI) with Louvain clustering. We considered the tuning parameters λ2 and λ 1 from 0.1, 1, 10, and 100. The parameter µ is charging of the scale and penalization power on signature matrix W . Thus, a smaller µ can result in a larger norm of W . Therefore, we developed a new metric to evaluate the reliability by the difference between the mean value of expression profile X and the mean value of signature matrix W . To eliminate the non-Gaussian effects, we also considered the difference between the median values of X and W . Thus, the new metric is formulated as

2 Theoretical Proofs

Lemma 2.1. For any symmetric nonnegative matrix Q ∈ R K×K and row vector w ∈ R K + and a ∈ R K ≥0 , the following matrix

is always semi-positive definite.

Proof. We construct a new matrix S ∈ R K×K by S ij = w i F ij w j , where S ij is the element of S whose row is i and column is j. And we reformulate F to be

For any nonnegative row vector v ∈ R K×K , we have

Since row vectors w and v are nonnegative, F is semi-positive definite when S is semi-positive definite. Therefore, it is sufficient if we can prove S is semi-positive definite. In the following, we follow equation 7 and prove that the product is always nonnegative.

The first term in the above formula is nonnegative since w is nonnegative and Q is semi-positive definite. The second term is always greater than or equal to 0 due the nonnegativity of a and w. Thus, S is a semi-positive definite matrix.

Theorem 2.2. Consider the following quadratic optimization problem,

which is the loss function for a row vector w ∈ R K . In this optimization problem, H ∈ R K×N is a constant non-negative matrix, and x, a 1 , a 2 ∈ R K are constant row vectors. The penalty parameters, λ 1 , λ 2 , and µ are positive numbers. The the following update rule

where we denote

converges to its optimal solution with the convergence rate

Proof. To prove that there exists a w t+1 that E(w t+1 ) ≥ E(w t ), we would like to construct an auxiliary function F (w, w t ), s.t.

where w t+1 = arg min w F (w, w t ). Thus, the loss function E(w t ) is monotonously non-increasing w.r.t the iteration t. We define the auxiliary function as the following.

where

We can approximate E(w t ) based on Taylor expansion.

where, by simple derivation,

Now we can have 14 satisfied if F (w t+1 , w t ) ≥ E(w t+1 ). To get it, we have

Obviously, ∇ 2 E(w t ) = HH T + µI is a positive definite matrix since µ is positive. Furthermore, the row vector

is always nonnegative. Due to the 2.1, J(w t ) − ∇ 2 E(w t ) is a semi-positive definite matrix. So that F (w t+1 , w t ) ≥ E(w t+1 ) is always satisfied and E(w t ) is nonincreasing w.r.t to t. Next, we would like to find the optimizer of F (w, w t ). To achieve this, we set ∂F (w,w t ) ∂w | w=w t+1 = 0, which is equivalent to equation ∇E(w t ) + (w t+1 − w t )J(w t ) = 0. Therefore, we have

For each element of w t+1 , the above formula can be reformulated as

3 So far we have proved that the above updating rule can make E(w t ) nonincreasing w.r.t to the t. Specifically, we have the lower bound for every element on the diagonal of J matrix.

Based on 14 and 23, we can derive the lower bound of the difference between two loss function values after one update.

since ∇F (w t+1 , w t ) = 0 and ∇ 2 F (w t+1 , w t ) = J(w t ). And by accumulating w t − w t+1 2 from 0 to T − 1, we then can have

Thus, for any given fixed iteration number T , we can have

Theorem 2.3. Consider the following quadratic optimization problem,

which is the loss function for a column vector h ∈ R K . In this optimization problem, W ∈ R D×K is the constant non-negative matrix, and x ∈ R K is a constant column vectors. The penalty parameter, η is a positive number. The the following update rule

converges to its optimal solution with the convergence rate

Proof. To prove that there exists an h t+1 that E(h t+1 ) ≥ E(h t ), we can construct an auxiliary function

where h t+1 = arg min h F (h, h t ). Thus, the loss function E(h t ) is monotonously non-increasing w.r.t the iteration t. We define the auxiliary function as the following.

where

. . . ,

We can approximate E(h t ) based on Taylor expansion.

where, by simple derivation,

Now we can have 31 satisfied if F (h t+1 , h t ) ≥ E(h t+1 ). To get it, we have

Obviously, ∇ 2 E(h t ) = W T W + η1 k 1 T k is a positive definite matrix since η is positive. Furthermore, the row vector

is always nonnegative. Due to the 2.1, J(h t ) − ∇ 2 E(h t ) is a semi-positive definite matrix. So that F (h t+1 , h t ) ≥ E(h t+1 ) is always satisfied and E(h t ) is nonincreasing w.r.t to t. Nextly, we would like to find the optimizer of F (h, h t ). To achieve this, we set ∂F (h,h t ) ∂h | h=h t+1 = 0, which is equivalent to equation ∇E(h t ) + (h t+1 − h t )J(h t ) = 0. Therefore, we have

For each element of h t+1 , the above formula can be reformulated as

So far we have proved that the above updating rule can make E(h t ) nonincreasing w.r.t to the t. Specifically, we have the lower bound for every element on the diagonal of J matrix.

Based on 31 and 40, we can derive the lower bound of the difference between two lose function values after 5 one update.

since ∇F (h t+1 , h t ) = 0 and ∇ 2 F (h t+1 , h t ) = J(h t ). And by accumulating h t − h t+1 2 from 0 to T − 1, we have

Thus, for any given fixed iteration number T , we have

To conduct the parameter analysis and validate the optimization convergence, we rewrote the original loss function to a equivalent formula as following. 

6

Specific Neutrophil Signatures in Blood Transcriptomes Stratify COVID-19 Patients

Mass 659

Cytometry: Technique for Real Time Single Cell Multitarget Immunoassay Based on Inductively 660

Coupled Plasma Time-of-Flight Mass Spectrometry

Multiplexed Mass Cytometry Profiling of Cellular States Perturbed 663 by Small-Molecule Regulators

A 665

Test Metric for Assessing Single-Cell RNA-Seq Batch Correction

Jaromir Mikes, and Petter Brodin. 2020

Classification Using Learned Cell Phenotypes

Accurate Identification of Single-Nucleotide Variants in Whole-Genome-Amplified 671

Single Cells

Comprehensive Analysis of Single Cell ATAC-Seq Data with SnapATAC

Lymphopenia in COVID-19: Therapeutic Opportunities

Impaired Type I Interferon Activity and Inflammatory Responses in 681

Severe COVID-19 Patients

Integrated Analysis of Multimodal Single-Cell Data

Clinical Features of Patients Infected with 2019 Novel Coronavirus in Wuhan, China

COVID-19): Systematic Review and Meta-Analysis

Single-Cell RNA Sequencing Technologies 692 and Bioinformatics Pipelines

Single-Cell Analysis Targeting the Proteome

Data-Driven Phenotypic Dissection of AML Reveals Progenitor-like 697

Cells That Correlate with Prognosis

Gating Mass Cytometry Data by Deep Learning

Towards a Human Cell Atlas: 701 Taking Notes from the Past

Single-Cell Hi-C Reveals Cell-to-Cell 705 Variability in Chromosome Structure

Workflow: Differential Discovery in High-Throughput High-Dimensional Cytometry Datasets

A Dynamic Immune 712

Response Shapes COVID-19 Progression

Highly Multiparametric Analysis by Mass Cytometry

Extracting a Cellular 718

Hierarchy from High-Dimensional Cytometry Data with SPADE

Correlate with Disease Severity in SARS-CoV-2-Associated Multisystem Inflammatory 723 Syndrome in Children

The Human Cell Atlas

Systems-Level Immunomonitoring from Acute 729 to Recovery Phase of Severe COVID-19

Classification and Density Estimation Using Gaussian Finite Mixture Models

Mass Cytometry: Single Cells, Many Features

Simultaneous 737

Epitope and Transcriptome Measurement in Single Cells

Multi-Omics Resolves a Sharp Disease-State Shift between Mild and 740

Moderate COVID-19

Lymphopenia during the COVID-19 Infection: What It Shows and What Can Be Learned

Single-Cell Omics Reveals Dyssynchrony of the Innate and 746

Adaptive Immune System in Progressive COVID-19

FlowSOM: Using Self-Organizing Maps for Visualization and 749 Interpretation of Cytometry Data

Information Theoretic Measures for 752

Clusterings Comparison: Is a Correction for Chance Necessary?

Data Denoising with Transfer Learning in Single-Cell Transcriptomics

BREM-SC: A Bayesian Random Effects Mixture Model for 759

Joint Clustering Single Cell Multi-Omics Data

Differential Discovery in High-Dimensional Cytometry via High-Resolution Clustering

Cytometry Benchmark Datasets in Bioconductor Object Formats

Nonnegative Matrix Factorization: When Data Is Not Nonnegative

Nonnegative Matrix Factorization: Algorithm and Applications to Blind Source Separation

Surface Protein Imputation 772 from Single Cell Transcriptomes by Deep Neural Networks