SAMCNet for Spatial-configuration-based Classification: A Summary of Results
Majid Farhadloo, Carl Molnar, Gaoxiang Luo, Yan Li, Shashi Shekhar, Rachel L. Maus, Svetomir N. Markovic, Raymond Moore, Alexey Leontovich
December 22, 2021

Abstract. The goal of spatial-configuration-based classification is to build a classifier to distinguish two classes (e.g., responder, non-responder) based on the spatial arrangements (e.g., spatial interactions between different point categories), given multi-category point data from two classes. This problem is important for generating hypotheses in medical pathology towards discovering new immunotherapies for cancer treatment, as well as for other applications in biomedical research and microbial ecology. The problem is challenging due to an exponential number of category subsets, which may vary in the strength of their spatial interactions. Most prior efforts rely on human-selected spatial association measures, which may not be sufficient for capturing the relevant (e.g., surrounded-by) spatial interactions that may be of biological significance. In addition, the related deep neural networks are limited to category pairs and do not explore larger subsets of point categories. To overcome these limitations, we propose a Spatial-interaction Aware Multi-Category deep neural Network (SAMCNet) architecture and contribute novel local reference frame characterization and point pair prioritization layers for spatial-configuration-based classification. Extensive experimental results on multiple cancer datasets show that the proposed architecture provides higher prediction accuracy than baseline methods.

Spatial-configuration-based classification aims to build a classifier that can learn spatial patterns in multi-category point patterns to distinguish between two classes. When each data point carries a categorical feature, the varying strength of the spatial interactions between points of different categories is additionally significant for the classification task. For example, the impact of one type of immune cell (e.g., cytotoxic T lymphocytes (CTLs)) on nearby cancer cells may be affected by other immune cells (e.g., T regulatory cells) [29]. A multi-category point pattern is a collection of spatially-defined objects with corresponding categorical features (e.g., immune and tumor cells). Fig. 1 illustrates the point patterns of pathologist-driven fields of view (FOVs) of human tissue samples stained with a chemical dye (e.g., hematoxylin/eosin (H&E)) at the tumor margin for two classes, responder and non-responder, indicating two clinical responses to immunotherapy treatment. Each data point shows the centroid pixel coordinates of a cell and is colored by its cell type. This problem is important because spatial configurations, a proxy for physical interactions, help generate new hypotheses towards discovering disease therapeutics (e.g., immunotherapies for cancer treatment). These hypotheses could be used in applications such as medical pathology, biomedical research, and microbial ecology, where multi-category point patterns frequently appear. In many diseases, such as cancer, the spatial arrangement of and associations between distinct phenotypic markers are crucial to understanding normal tissue function and disease biology.
For example, the development of effective intervention strategies relies on knowledge of the spatial arrangement and cellular mechanisms of coronavirus infections [28]. Researchers in immunology also seek to understand the spatial configuration of immune and tumor cells to help evaluate the effectiveness of immune checkpoint inhibitors (e.g., anti-PD-1) in cancer treatment [12]. These indicate the need for an automated system for analyzing spatial interactions at the molecular level to identify targets for disease therapeutics. Table 1 presents use cases of the proposed model in different application domains.

This problem is challenging for the following reasons. First, the number of potential spatial patterns is exponentially related to the number of different category subsets. In addition, the spatial association between various point pair instances is not always equal, requiring the model to learn these distinctions. CTL-tumor cell interactions, for example, are more biologically relevant in the context of effector function (e.g., a CTL must engage a tumor cell to kill it); in contrast, B cell-tumor cell interactions are more indirect [9]. A point pair instance describes the spatial relationship between a point belonging to one category and its neighbor belonging to the same or a different category. The second challenge is that multi-category point patterns are heterogeneous and form complicated structural and higher-order spatial interactions. Lastly, point patterns possess different properties (e.g., invariance to permutation), which means the classifier needs to meet the same requirements to achieve a robust surrogate representation.

Most prior works that identify spatial associations in multi-category point patterns rely on hand-constructed features using spatial association interest measures (e.g., Pearson correlation, G-cross, Ripley's cross-K, participation index) [2, 16, 18, 24, 33]. For example, [18] uses classical statistical measures (e.g., Pearson correlation), which are sensitive to the choice of spatial partitioning. In addition, neighbor-graph-based approaches (e.g., [2, 26]) are primarily a function of distance, which may not accurately model the true spatial relationships among categorical points on a three-dimensional surface (e.g., an organ). Furthermore, these techniques are used in isotropic space, with the same intensity regardless of measurement direction, which may not be enough to capture relevant features that might be biologically significant. Lastly, analyzing the spatial association between different communities (e.g., tumor and stromal cells at the sub-graph level) [16, 33] does not reveal critical information about the spatial relationships between distinct categorical points within each community. In more recent work, a spatial-relationship aware neural network (SRNet) [15] aims to overcome these limitations by leveraging machine-constructed features to model spatial relationships between points of different categories. However, SRNet is limited to binary spatial relationships, and the importance of distinct binary category pairs is assumed to be equal. To overcome these limitations, we propose a Spatial-interaction Aware Multi-Category deep neural Network (SAMCNet) architecture for spatial-configuration-based classification and contribute novel local reference frame characterization and point pair prioritization layers.
Table 1. Use cases of the proposed model in different application domains.
• Understanding the spatial configuration between immune and tumor cells to evaluate the effectiveness of ICI [12]
• Pharmacology: identifying protein interactions and bindings towards discovering structure-based drugs [7]
• Ecology: inferring predator-prey spatial interactions in food webs [21]
• Paleontology: studying fossils to classify organisms and examine their interactions with each other and the environment [11]
• Epidemiology: investigating the relationship between human mobility and the spread of Covid-19 [25]

SAMCNet provides a promising way to identify the importance of different point pair instances and the most relevant N-way spatial relationships. As shown in Fig. 2a, we first aim to provide a better way to represent spatial information using a multi-scale local reference frame characterization (LRFC) and spatial feature decomposition (described in Section 5.1) before applying an EdgeConv operation [30]. As indicated in Fig. 2b, a point pair prioritization sub-network is designed to assign a weight to point pair instances that are more important in an N-way spatial relationship (described in Section 5.3). Two- and three-way spatial relationships are indicated by hyperedges connecting vertices belonging to different categories in Fig. 2b. Lastly, we use an asymmetric function (e.g., average pooling) to aggregate information across all points neighboring the center point. The thickness of different edges shows the contribution of distinct category pairs to the overall representation of the center point. Fig. 2 shows the overall framework of the proposed SAMCNet.

Figure 2. The overall framework of SAMCNet. (a) Multi-scale spatial representation learning using local reference frame characterization and the EdgeConv operation; for simplicity, only one edge feature is shown. (b) Learning point pair importance based on categorical attributes (node colors) in N-way spatial relationships.

We highlight our contributions as follows:
• We design a dynamic point pair prioritization sub-network to learn the most relevant features in N-way spatial relationships (e.g., ternary, etc.) and use it in a Spatial-interaction Aware Multi-Category deep neural Network (SAMCNet).
• We experimentally show that the proposed model outperforms existing baseline methods and is also computationally more efficient than the competing DNN architecture (e.g., SRNet).
• We present case studies on two cancer datasets which show that the proposed model is able to identify high-order spatial patterns ignored by the related work, as well as its potential to advance scientific discovery.

Scope: We aim to identify N-way spatial relationships to help distinguish between point patterns belonging to two different classes, in which spatial relationships may vary in the strength of spatial associations. Analyzing the presence of noisy points in constructing a neighborhood graph falls outside the scope of this paper. We also do not evaluate the effect of standard data augmentation techniques (e.g., rotation). Field trials to assess the clinical value of the proposed method also fall outside the scope of this study. Patient privacy and the proprietary nature of the data prevent us from publishing the dataset.

Organization: The rest of the paper is organized as follows. Section 2 briefly describes an important application domain of this problem. Related work is reviewed in Section 3. Section 4 formally defines the problem. Section 5 describes the details of our proposed work.
In Section 6, we present the experimental results and case studies. Finally, Section 7 concludes the paper and outlines future research.

Figure 3. Importance of spatial analysis in evaluating the effectiveness of ICI therapy in killing tumor cells [19].

The recent development of multiplex immunofluorescence (MxIF, Fig. 1) technology has enabled exploration into the complexity of tumor-immune microenvironments within spatially-preserved metastatic tissue and in the therapeutic context of immune checkpoint inhibitors (ICI). ICI therapy works by augmenting the antitumor properties of preexisting tumor-specific CTLs, which become more efficient at infiltrating tumor masses and destroying cancer cells. Through cyclic rounds of antibody staining, imaging, and dye inactivation, MxIF technology provides a state-of-the-art method to visualize and identify many cell subtypes (e.g., immune and malignant) and their corresponding spatial coordinates using single-cell analysis of formalin-fixed paraffin-embedded (FFPE) tissue sections, including metastatic melanoma lymph nodes. With continuous refinement of these techniques, it is currently possible to identify over 50 cellular phenotypic markers (e.g., CD3, FoxP3, CD14) and their corresponding cellular phenotypic and functional characteristics within a single tissue section. Emerging research in this area has begun to highlight the need for an automated process to analyze the complex spatial relationships among different cellular subsets and functional states in the context of ICI therapy, which allows identifying critical intercellular interactions relevant to clinical outcomes [31]. Furthermore, it is clinically crucial to examine the importance of cell species along with their activation states in a spatially informed manner, due to the clinical implications of interactions in close spatial proximity (see Fig. 3). For example, a CTL will likely be unable to kill a cancer cell if it is near a tumor-associated macrophage that expresses PD-L1 on its surface. In contrast, a CTL is more likely to kill a cancer cell if it is expressing Granzyme B and is not in close proximity to FoxP3-expressing regulatory T cells (Tregs). We aim to provide an algorithmic description of the importance of different relationships in a tumor microenvironment, potentially revealing insights to enhance the manual visual assessments provided by pathologists.

The related works can be classified into two major categories: (1) data-driven spatial quantification, and (2) machine-constructed features using deep neural networks (DNNs). Data-driven spatial quantification: A spatial association (i.e., spatial co-location) is an intuitive representation to help understand the spatial interactions of a multi-category point pattern by identifying a subset of points frequently located in close spatial proximity to one another [26]. Spatial association interest measures (e.g., Pearson correlation, cross-K, G-cross, participation index) are commonly used in spatial data mining to quantify multi-category point patterns. Previously, the spatial association between tumor and immune cells was studied in digitized images of H&E-stained breast cancer tissues, using classical statistical methods (e.g., the Pearson correlation coefficient) after imposing a spatial grid partitioning on the two-dimensional map of the cell center points [18]. However, classical statistical measures are sensitive to the choice of spatial partitioning.
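As a concrete illustration of this baseline family (not the exact method of [18]), the following minimal sketch imposes a square grid on synthetic cell coordinates, counts two cell types per grid cell, and computes the Pearson correlation of the counts. The data, the helper name `gridded_counts`, and the chosen cell sizes are all illustrative assumptions; the point is that re-running the statistic with a different `cell_size` typically changes it, which is exactly the sensitivity to spatial partitioning noted above.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
# Hypothetical point pattern: pixel coordinates of two cell types in one FOV.
tumor = rng.uniform(0, 1000, size=(300, 2))
immune = rng.uniform(0, 1000, size=(200, 2))

def gridded_counts(points, cell_size, extent=1000):
    """Count points per grid cell after imposing a square spatial partition."""
    bins = np.arange(0, extent + cell_size, cell_size)
    counts, _, _ = np.histogram2d(points[:, 0], points[:, 1], bins=[bins, bins])
    return counts.ravel()

for cell_size in (50, 100, 250):
    r, _ = pearsonr(gridded_counts(tumor, cell_size),
                    gridded_counts(immune, cell_size))
    print(f"cell size {cell_size:>3} px -> Pearson r = {r:+.3f}")
```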
More recent work used neighbor-graph-based spatial statistical measures such as G-cross to quantify the spatial association between cancer and immune cells in lung cancer [2]. The limitation of this approach is that it operates in isotropic space, with the same intensity regardless of measurement direction, which may not be enough to capture relevant (e.g., surrounded-by) spatial interactions that might be biologically significant. Finally, quantifying spatial associations between different communities (e.g., at the sub-graph level) [16, 33] does not reveal critical information regarding the spatial relationship between distinct categorical points within each community.

Machine-constructed features using DNNs: A new approach that begins to address these limitations leverages machine-constructed features using a spatial-relationship aware neural network (SRNet) [15] to model spatial relationships between points of different categories. However, SRNet is limited to binary spatial relationships, and the importance of distinct binary category pairs is assumed to be equal. Also, SRNet uses a fixed neighborhood distance to construct the input graph to the network, and no operator is defined to work with different-sized neighborhoods. All the feature-based DNNs reviewed in a recent computational pathology survey [6] use images, i.e., a regular grid, as the input. Hence, they cannot handle a simple but significantly important geometric structure, the point pattern.

The main goal of this study is to build a spatial-configuration-based classifier to distinguish between multi-category point patterns that belong to two different classes. The primary objective is to achieve high solution quality (e.g., accuracy). In addition, we aim to identify the most relevant N-way spatial relationships that help distinguish between different classes. Fig. 1 shows an example of the point patterns of two fields of view of MxIF images at the tumor margin that need to be classified into two unique classes, namely "responder" and "non-responder", reflecting different clinical outcomes.

There are three key building blocks in constructing our model. The first is a multi-scale local reference frame characterization (LRFC), which takes as input a neighborhood graph that models the spatial distribution of one point and its neighbors. The second block uses EdgeConv [30] to learn local and global information and a semantic representation by dynamically updating the neighborhood graph at each layer (see Section 5.2). The third is a prioritization sub-network that distinguishes between point pair instances belonging to different categories by learning the importance of each distinct pair; we then use an asymmetric function (e.g., average pooling) to aggregate the information of one point and all its neighbors (see Section 5.3).

The primary objective of the proposed neural network architecture is to learn N-way spatial relationships in a multi-category point pattern. The main difference between SAMCNet and traditional data-driven association interest measures is the LRFC, which allows categorical points belonging to different distributions (e.g., clustered versus even distributions) to be represented through a multi-scale representation, overcoming the inefficiency of intrinsically single-scale methods like radial basis function kernels or discretization. Furthermore, SAMCNet differs from the competing DNN architecture SRNet in two distinct ways.
First, it incorporates a point pair prioritization sub-network, which learns the importance of point pairs in N-way spatial relationships based on their categorical attributes. Second, the connectivity of nodes in a locally connected graph allows SAMCNet to learn relevant high-order spatial patterns based on the input nearest neighbors and the aggregation choice.

Given a multi-category point set P = {p_1, ..., p_n}, where each point p_i consists of spatial coordinates x_i and a categorical attribute c_i, we compute a directed graph G = (V, E), where V and E are the vertices and edges. We construct E as the k-nearest-neighbor edges of each point x_i in IR^F, so that there are k × n directed edges. Note that the neighborhood graph of each consecutive layer in SAMCNet relies on the output of the preceding layer and is dynamically updated in IR^F, where F represents the feature dimensionality of the given layer. For example, in the beginning, the neighborhood graph is constructed from the k nearest neighbors of each point x_i in IR^2, i.e., the spatial coordinates. This approach allows the network to learn how to build the graph G utilized in each layer, rather than using a fixed constant graph established before the network is evaluated. Reconstructing the neighborhood graph in the embedding space produced by the hidden layer using nearest neighbors is empirically beneficial in related classification tasks, as shown in [30].

Figure 4. The network architecture takes as input a multi-category point set containing n points, where a local reference frame characterization (LRFC) layer calculates an embedding for each point using its coordinates and neighborhood spatial distribution. Embeddings are then passed into the EdgeConv layer to specify an edge feature set of size k for each point. The categorical features are passed using a skip connection, where a point pair prioritization sub-network calculates the importance between a point and its k nearest neighbors belonging to different category pairs. Lastly, an average pooling aggregates information from all points to the center point. Notice that since a k-nn graph is built in the LRFC layer, the reconstruction of the k-nn graph in the first EdgeConv layer is skipped.

The next step is to use local reference frame characterization (LRFC) to model the distribution of one point and its neighbors using only spatial coordinates. This technique allows us to model the relative distance between a given point x_i and each of its nearest points x_j, where 1 ≤ j ≤ k, into a corresponding edge e'_{ij}. The intuition behind the LRFC is that spatial coordinates are illustrative location indicators; using discretization or feed-forward neural network techniques is insufficient to capture the spatial distribution due to the lack of feature decomposition between spatial and categorical attributes. Inspired by the multi-scale periodic representation of grid cells in mammals [1] and a vector representation of self-position [10], Mai et al. [17] proposed a multi-scale embedding, namely positional encoding (PE), which uses sine and cosine functions of different frequencies to represent positions in space. We adopt this idea in our network as follows. Given a point x in the studied 2D space, PE(x) = NN(PE_grid(x)), where NN(·) is a learnable transformation and PE_grid(x) = [PE_1(x); ...; PE_S(x)] is a multi-scale representation at scales s, 1 ≤ s ≤ S, used to capture the distribution of mixed multi-category point patterns. The overall formulation of local reference frame characterization (LRFC) is:

PE_s(x) = [cos(⟨x, a_1⟩ / λ_s), sin(⟨x, a_1⟩ / λ_s), ..., cos(⟨x, a_3⟩ / λ_s), sin(⟨x, a_3⟩ / λ_s)],  with λ_s = λ_min · g^{s/(S−1)},

where a_1, a_2, a_3 are unit vectors, the angle between every pair of vectors is 2π/3, λ_min and λ_max are the minimum and maximum grid scales, and g = λ_max / λ_min.
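To make the encoding concrete, below is a minimal NumPy sketch of the multi-scale grid-cell positional encoding described above, written under stated assumptions rather than as the authors' exact implementation: the function name `grid_positional_encoding`, the specific choice of the three unit directions, and the geometric scale schedule are illustrative; the defaults (S = 5, minimum scale 1, maximum scale 100) follow the hyperparameter values reported later in the experimental setup.

```python
import numpy as np

def grid_positional_encoding(x, s=5, lambda_min=1.0, lambda_max=100.0):
    """Multi-scale sine/cosine encoding of a 2D offset x (illustrative sketch).
    Returns a vector of length 6 * s (3 directions x sin/cos per scale)."""
    # Three unit vectors separated by 2*pi/3, as in the LRFC formulation above.
    angles = np.array([0.0, 2 * np.pi / 3, 4 * np.pi / 3])
    a = np.stack([np.cos(angles), np.sin(angles)], axis=1)   # shape (3, 2)
    g = lambda_max / lambda_min
    feats = []
    for j in range(s):
        lam = lambda_min * g ** (j / max(s - 1, 1))          # geometric scale schedule
        proj = a @ np.asarray(x, dtype=float) / lam          # <x, a_k> / lambda_j
        feats.append(np.cos(proj))
        feats.append(np.sin(proj))
    return np.concatenate(feats)

# Example: encode the displacement between a center cell and one of its neighbors.
offset = np.array([12.0, -7.5])                 # hypothetical pixel offset x_j - x_i
print(grid_positional_encoding(offset).shape)   # (30,) for s = 5
```

In the full network, this deterministic encoding would be followed by the learnable transformation NN(·) and fed into EdgeConv; the sketch only illustrates the multi-scale representation itself.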
We define the input to PE as the relative distance between the center point and each of its k nearest neighbors, i.e., PE(|x_i − x_j|), where 1 ≤ j ≤ k.

The EdgeConv operation defines an edge feature e_{ij} = h_Θ(x_i, x_j), where h_Θ: IR^F × IR^F → IR^{F'} is a nonlinear function with a set of learnable parameters Θ. Lastly, an asymmetric operation (e.g., Σ or max) is applied to aggregate information along all the edge features neighboring the center node x_i. The choice of h_Θ is critical in defining EdgeConv; for example, taking the dot product between a set of filters Θ = {θ_1, ..., θ_M} and image pixels in a regular grid and aggregating the information using Σ results in a standard convolution. A detailed discussion of different forms of h_Θ can be found in [30]. We have adapted the EdgeConv operation from DGCNN [30] in our network to learn both the global shape structure, captured by the center coordinates x_i, and local neighborhood information, captured by |x_i − x_j|. The overall formulation is:

e''_{ij} = ReLU(θ · e'_{ij} + φ · x_i),

where θ and φ are learnable parameters for local and global information, respectively, and e'_{ij} is the positional embedding representing the relative distance along each edge starting at x_i.

Thus far, we have built the graph and defined the edge embeddings in terms of strictly spatial features. If we followed existing graph-based DNN architectures for point patterns (e.g., PointNet++ [23], DGCNN [30]), we would simply concatenate the categorical features into the embedded feature space. However, the importance of interactions between vertices with categorical features c_i and c_j, j ∈ N(i), would not be learned in this way. As a result, the model would be confined to learning individual category features. Instead, the classifier should learn how to correctly weight diverse point pair associations as a stronger inductive bias. To this end, we propose a point pair prioritization layer to learn the importance (i.e., strength) of the spatial relationship between different category pairs, followed by an average pooling layer to weigh different subsets accordingly. As a whole, this layer is analogous to a weighted average pooling function, where the weights correspond to the importance of the categorical interaction. The input to this layer is an edge embedding e''_{ij}, which is the output of the EdgeConv layer. In the prioritization layer, we first derive ê_{ij}, an edge embedding augmented by the strength of the categorical pairwise association:

ê_{ij} = a_{c_i c_j} · W e''_{ij},

where W is a learnable linear transformation on the original embedding to aid prioritization expressivity, and a_{c_i c_j} is our learned pairwise association weight vector for the categorical point pair features (c_i, c_j). In this formulation, we have included i ∈ N(i), where a_{c_i c_i} is a learned self-weighting based only on the categorical feature of x_i. We also note that interactions are assumed invariant with respect to the ordering of the categories; for example, C_1 C_2 ≡ C_2 C_1. Similar to other prioritization (i.e., attention) layers, we then apply a LeakyReLU (LR) activation, followed by a softmax function, resulting in the normalized pairwise association (as sketched below):

α_{ij} = softmax_j(LR(ê_{ij})),

where α_{ij} is the learned categorical pairwise association for each neighbor, such that α_i ∈ IR^k.
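A minimal PyTorch-style sketch of this prioritization step is given below. The class name `PointPairPrioritization`, the tensor shapes, and the lookup table of pairwise weight vectors indexed by (and symmetrized over) category pairs are illustrative assumptions; the sketch only mirrors the equations above (score each neighbor's EdgeConv embedding with a learned categorical pairwise weight vector, apply LeakyReLU and a softmax over the k neighbors, then pool), not the authors' exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PointPairPrioritization(nn.Module):
    """Sketch of the point pair prioritization layer (assumed interface)."""
    def __init__(self, num_categories, feat_dim):
        super().__init__()
        self.W = nn.Linear(feat_dim, feat_dim, bias=False)   # learnable transform W
        # One association weight vector per category pair (c_i, c_j).
        self.assoc = nn.Parameter(torch.randn(num_categories, num_categories, feat_dim))

    def forward(self, edge_feat, c_center, c_neighbors):
        # edge_feat:    (n, k, feat_dim)  EdgeConv output e''_ij
        # c_center:     (n,)   long tensor, category of each center point
        # c_neighbors:  (n, k) long tensor, categories of the k neighbors
        we = self.W(edge_feat)                                # W e''_ij
        # Symmetrize so that C1C2 and C2C1 share the same weight vector.
        a = 0.5 * (self.assoc + self.assoc.transpose(0, 1))
        a_ij = a[c_center.unsqueeze(1), c_neighbors]          # (n, k, feat_dim)
        scores = F.leaky_relu((a_ij * we).sum(-1))            # scalar scores, shape (n, k)
        alpha = torch.softmax(scores, dim=-1)                 # normalized coefficients
        return (alpha.unsqueeze(-1) * we).sum(dim=1)          # weighted average pooling, (n, feat_dim)
```

Usage would resemble `PointPairPrioritization(num_categories=8, feat_dim=256)(e2, c_i, c_j)`; in the full network this block follows each EdgeConv layer, with the categorical features supplied through the skip connection.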
With this normalized attention coefficient α_{ij}, we can calculate the weighted average pooling and produce the final vertex embedding:

x'_i = Σ_{j ∈ N(i)} α_{ij} W e''_{ij}.

This formulation can be extended to H heads, following other prioritization networks such as GAT [27], where each head learns a separate categorical pairwise association and linear transformation weight, followed by an aggregation operation (AGG) over the different head outputs:

x'_i = AGG_{m=1..H} σ( Σ_{j ∈ N(i)} α^{(m)}_{ij} W^{(m)} e''_{ij} ),

where σ is a non-linear activation function such as LeakyReLU and the aggregation operation can take the form of an average or a concatenation. Since this layer preserves the identity of the center vertex, it can also be extended to multiple layers of the network by maintaining the categorical features of vertices between layers with a skip connection. We do this by adding point pair prioritization after each layer's EdgeConv operation. In the context of hierarchical feature learning, our network is therefore effectively capable of learning the importance of categorical N-way interactions in a hierarchical feature space. Finally, we note that the choice of aggregation is not limited to average pooling; for example, one may choose a large number of k nearest neighbors when building the graph while pooling only a top-k' subset of the highest-scoring features, in order to filter out an overpowering number of weak interactions. Max pooling can be seen as a special case of this concept, where only the top-1 of a neighbor's features is selected.

Evaluation Tasks: We validated our proposed approach with (1) a comparative analysis to evaluate the proposed SAMCNet against classical spatial association interest measures and state-of-the-art DNN architectures on this problem, (2) a sensitivity analysis to evaluate the impact of key building blocks (e.g., self-prioritization, neighboring prioritization, etc.) and of key parameters on selected performance metrics (see Appendix Section B), and (3) a feature selection analysis to evaluate the ability of the prioritization sub-network to learn the importance of different point pair instances and identify the most relevant N-way spatial relationships.

Model Architecture: Fig. 4 shows the network architecture. The proposed SAMCNet was implemented in PyTorch. For local reference frame characterization, the grid-scale count, minimum grid scale, and maximum grid scale were set to 5, 1, and 100, respectively. The number of nearest neighbors k was set to 6. We followed the same settings for the 4 EdgeConv layers, residual block connections, batch normalization, activation functions, and dropout as described in [30]. We used the Adam optimization algorithm with a learning rate of 10^-3 and cross-entropy loss for 200 epochs to train SAMCNet. The batch size, momentum, and number of prioritization heads were set to 7, 0.9, and 1, respectively. All hyper-parameters were set through tuning on the validation set. DNN candidate methods were tested with the same settings described above.

Baseline Methods: We compared our proposed framework on selected classification metrics with the following baseline methods. Point patterns were first summarized by hand-constructed features using the participation index (PI) [26]; the spatial association interest measure values were then fed into (1) a decision tree with a max depth of 2 (PI + DT), (2) a random forest of similar depth (PI + RF), and (3) a fully connected neural network (PI + NN) with four ReLU hidden layers of 2048 neurons.
In similar settings, we used point patterns summarized by cross-K [13] values fed into the same classifiers described above, giving us three more candidate methods: (4) cross-K + DT, (5) cross-K + RF, and (6) cross-K + NN. We used a fixed neighborhood distance of 50 pixels to compute the participation index and cross-K values. We also evaluated the proposed model against state-of-the-art DNN architectures: (7) PointNet [22], a neural network architecture that directly consumes point sets for applications ranging from object classification to part segmentation; (8) DGCNN [30], a dynamic graph convolutional neural network architecture for CNN-based high-level point cloud tasks such as classification and segmentation; and (9) SRNet [15], a DNN architecture for binary spatial relationships in multi-category point patterns.

Dataset: Experiments were conducted on two multi-category point pattern cancer datasets from MxIF images. The first dataset was used for two distinct classification tasks, (1) tumor-margin classification and (2) tumor-core classification. The second dataset was used for a (3) disease classification task. In the tumor-margin classification task, we used 145 FOV point sets indicating two different clinical outcomes of ICI therapy: 68 were labeled as responders and 77 as non-responders, the latter for individuals who progressed and experienced recurrence in less than a year. We used 103 FOV point sets in the tumor-core classification task, 30 of which were labeled as responders and 73 as non-responders, extracted from the tumor area of metastatic lymph nodes. In the disease classification task, we used 143 point sets of chronic pancreatitis and 53 of pancreatic ductal adenocarcinoma (PDAC).

Evaluation Metrics: Model performance was measured using the weighted average of precision, recall, F1-score, and accuracy (ACC).

Data Preparation: In each classification task, we divided the data into 80% training and 20% testing. Ten percent of the training set was selected as the validation set. Due to the limited number of learning samples, we used data augmentation, whereby each learning sample was rotated 12 degrees clockwise five times during the training procedure. We restricted rotation to only five times due to potential overfitting issues. We uniformly sampled 1,024 points from each point set for the underlying classification task.

Platform: We used a K40 GPU cluster composed of 40 Haswell Xeon E5-2680 v3 nodes. Each node has 128 GB of RAM and 2 NVIDIA Tesla K40m GPUs. Each K40m GPU has 11 GB of RAM and 2880 CUDA cores.

Comparative Analysis: We tested the candidate methods on the three classification tasks described in Section 6.1. Results on selected classification metrics are presented in Table 2. The results show the superiority of the proposed SAMCNet over traditional data-driven spatial association interest measures and the existing DNN competitors (i.e., PointNet, DGCNN, SRNet). Most notably, we were able to improve accuracy over SRNet by margins of 7.0% and 17.0% on the tumor-margin and disease classification tasks, respectively. These results suggest that local reference frame characterization (LRFC) and assigning different weights to points of the same neighborhood based on their distinct categorical attributes are beneficial. In addition, we used the model inference time on the three classification tasks as a measure of the model's computational time complexity and examined the trade-off between time complexity and classification accuracy.
In this experiment, we only compared our proposed model with SRNet, a direct competitor that is specifically designed to learn spatial associations in multi-category point patterns. SAMCNet is not only more accurate than SRNet, but it also runs 3 to 10 times faster on the three classification tasks. Table 3 provides the details for each classification task. The PyTorch Profiler (https://pytorch.org/tutorials/recipes/recipes/profiler_recipe.html) was used to measure the inference time across the different candidate methods.

Sensitivity Analysis: To evaluate the contribution of the primary building blocks of our proposed DNN architecture, we asked one question: How does the model perform in the presence and absence of important components? To answer this, we incrementally added key elements of the model (e.g., using only local reference frame characterization (LRFC), only self-prioritization (i.e., self-prt), only neighbor-prioritization (i.e., neighbor-prt), etc.) and assessed performance using a variety of classification measures. Results on selected classification metrics are presented in Table 4. The results show that using a prioritization sub-network is beneficial for filtering out the exponential number of weak interactions caused by center points (e.g., center cells) and neighboring points (e.g., neighboring cells). In addition, it can be observed that the local reference frame characterization (LRFC) plays a critical role in representing relative distances and the mixture distribution of neighboring points (e.g., cells), where LRFC combined with a prioritization sub-network (i.e., self or neighbor prioritization) provides better classification performance in most cases. This last result suggests that the proposed model performs best when the local reference frame characterization layer is integrated with the point pair prioritization sub-network as a whole.

Impact of the Prioritization Sub-network: The goal of this experiment was to demonstrate the interpretability of SAMCNet by measuring the impact of various point pair values (e.g., the contribution of distinct cell pairs) in distinguishing between point patterns of different classes (e.g., responder and non-responder). However, the non-linear activation functions that DNN architectures require for learning complex configurations make ML models hard to interpret and this task challenging. To address this problem, we separated the feature vector indicating the relevance of distinct point pairs from the pooling layer and non-linear activation functions. We transformed the feature vector of each distinct point pair into a scalar value using a vector norm [14] to measure the magnitude (i.e., importance) of the learned associations. We divided each scalar by the maximum value found across all point pairs to normalize them for a direct comparison. The feature vectors were composed of distinct cell-pair values extracted at layer-1 and layer-4 of the point pair prioritization sub-network, an indication of the ability of SAMCNet to learn hierarchical feature representations. These values represent the vector norm of the learned pairwise association weights a_{c_i c_j} in the prioritization layer (Section 5.3). Results are presented in Fig. 5, where it can be observed that the spatial interactions between points within the same category (e.g., {Tumor Cell, Tumor Cell}, {Vasculature, Vasculature}) remain critical across all layers.
By contrast, the spatial interactions between distinct pairs (e.g., {Tumor Cell, Macrophage}) are adjusted through the different prioritization layers, but they remain relatively important for learning the N-way spatial relationships separating the two distinct classes (e.g., responder and non-responder) in the tumor core. Note that we chose the tumor-enriched areas of the lymph node (tumor-core) to interpret the proposed SAMCNet because this region is primarily used in oncology analysis to determine the efficacy of ICI therapy by examining the various spatial interactions and variability between different cell species.

Most Relevant N-way Spatial Relationships: Thus far, we have demonstrated the most distinctive point pairs; but the main idea of this work is to show SAMCNet's ability to identify the most significant high-order spatial interactions. Hence, each sample point pattern was represented by extracting its corresponding feature vectors composed of all N-way spatial relationships (e.g., ternary) with respect to its center and neighboring points (e.g., different cell categories). To be more precise, the trained model was used to extract features after the point pair prioritization network at layer-4, where the model has learned both spatial and categorical associations. Thus, for each N-way spatial relationship we obtain a feature vector by aggregating (e.g., averaging) the embeddings over the point categories located around the center point, where the embedded feature space has dimension 256. For example, having a tumor cell as the center cell (i.e., data point) and unique counts of macrophages and neutrophils as neighboring cells is an instance of a 3-way spatial association. We evaluated the importance of the identified spatial relationships, namely the SAMCNet representation for different category subsets, through permutation feature importance. This metric measures the importance of a given feature by the increase observed in the prediction error caused by randomly shuffling that feature. The relevance of the discovered N-way spatial relationships was tested in this experiment by the classification accuracy after permuting the corresponding elements in the representation vectors. The top five most relevant spatial associations found within the tumor-core are shown in Table 5.

The pro-tumor (non-responder) relationship between tumor-associated macrophages and tumor-associated neutrophils has been studied in past works [5], although its nature remains not entirely clear. Tumor-associated macrophages appear to play a protective role against antitumor immunotherapy, and the association of tumor-associated macrophages with tumor cells preferentially in patients who did not respond is consistent with established biology. However, added to this is the effect of tumor-associated neutrophils, which also appear to promote tumor cell survival and lack of response to immunotherapy. The latter is less well understood, with notable findings indicating that tumor-associated neutrophils are relevant to tumor progression but not necessarily to immunotherapy resistance or a relationship with tumor-associated macrophages. These findings suggest a previously unknown shared biology between these two populations of myeloid cells (macrophages and neutrophils) and provide new insight into the possibility of a relationship between tumor-associated macrophages and tumor-associated neutrophils as they engage tumor cells in the tumor core.
This is an intriguing pattern that has not yet been studied. Future research is warranted to understand the relationships of the macrophage, tumor cell, and neutrophil sub-populations in these interactions.

In this paper, we propose SAMCNet, a neural network architecture with local reference frame characterization and a point pair prioritization sub-network. SAMCNet provides a promising way to help understand the spatial configuration of multi-category point patterns and the most relevant N-way spatial relationships. Experimental evaluation shows that the proposed model outperforms existing DNN techniques. In the future, we plan to investigate a dynamic local reference frame characterization layer to learn the spatial distribution of an embedded feature space between a given point and its neighbors. We also plan to identify a multi-category public benchmark dataset for a larger and broader evaluation of the proposed method. We further plan to extend this work to consider spatial variability by learning point pair importance based on the density and distribution of multi-category points in different sub-regions.

To promote open science and reproducibility, the code used in the experiments is shared through GitHub. Patient privacy and the proprietary nature of the data prevent us from publishing the dataset. However, to help the research community evaluate the capability of the proposed model, we note that a recent paper (AstroPath) highlighted a similar dataset used in experiments on big data for cancer immunology and spatially-preserved analysis [3]. This dataset was not publicly available at the time of submission. A more detailed discussion can be found here: https://ventures.jhu.edu/news/astronomypathology-biopath-biomarkers-cancer/.

Table 6 presents the details of all parameters used to train SAMCNet. We further evaluated our proposed model by varying the key parameters, namely the grid-scale count, the k-nearest-neighborhood size, and the number of prioritization heads. As shown in Fig. 6a, the trends show that classification accuracy is sensitive to the choice of scale representation, where increasing the grid-scale count does not guarantee better performance. For example, classification accuracy drastically dropped to below 0.75 in disease classification when the grid-scale count (i.e., S in the LRFC formulation) of the multi-scale representation was set to 10. While we did not thoroughly test all feasible grid-scale counts, our intuition is that representing the relative distance in an embedded space whose dimensionality is very close to, or even higher than, that of the first multi-layer perceptron (i.e., the first EdgeConv in layer-1) makes it challenging to approximate local information.
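To make that intuition concrete, here is a back-of-the-envelope check under two assumptions not stated explicitly in the paper: that each grid scale contributes six encoding features (three directions × sin/cos, as in the LRFC sketch earlier) and that the first EdgeConv MLP is 64 units wide, following DGCNN [30].

```python
# Hypothetical check: positional-encoding width vs. the first EdgeConv MLP width.
FIRST_MLP_WIDTH = 64                 # assumed, following DGCNN [30]
for s in (2, 5, 10):
    pe_dim = 6 * s                   # 3 directions x (sin, cos) per scale (assumption)
    relation = ">= " if pe_dim >= FIRST_MLP_WIDTH else "< "
    print(f"S={s:2d}: PE dim = {pe_dim:3d} {relation}first MLP width {FIRST_MLP_WIDTH}")
```

Under these assumptions, at S = 10 the encoding is already nearly as wide as the first hidden layer, which is consistent with the drop in accuracy observed above.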
We also tested our model with different sizes of nearest neighborhoods. As shown in Fig. 6b, a large neighborhood size k deteriorates classification performance. This shows that beyond a certain threshold density, the locally connected neighborhood graph fails to approximate geodesic distance and destroys the geometry of each patch, as discussed in [30]. In addition, this confirms the hypothesis that a large k allows an overpowering number of weak point pair interactions to contribute to the overall representation of the center node. Hence, as discussed in Section 5.3, one may further investigate combinations of different aggregation operations as k increases. Lastly, we also tested SAMCNet with different numbers of prioritization heads, H = {1, 2, 4}, at each layer. As shown in Fig. 6c, our point pair prioritization sub-network convincingly learned point pair interactions between various categorical attributes. This also implies that, compared to a multi-head attention network, our one-head prioritization sub-network provides the best trade-off in model complexity, computational time complexity, and classification performance, in terms of learning fewer parameters and taking less time.

The success of convolutional neural networks (CNNs) in many pattern recognition tasks (e.g., [4, 8]) has inspired researchers to generalize convolution-like operations to apply directly to 2D/3D point cloud data without any computationally expensive intermediate conversion layers. PointNet [22], the first neural network architecture applied directly to point cloud data, learns point features independently through several fully connected neural network layers and aggregates them using an asymmetric function (e.g., max pooling). PointNet++ [23], a variation of PointNet, accounts for local structure by recursively applying a graph coarsening operation and a shared PointNet to sets of local points chosen by farthest point sampling and, subsequently, their k-nearest neighbors. However, these techniques are limited in learning fine-grained local structures, since they learn the representation of each point independently at a localized scale to preserve permutation invariance. DGCNN [30] proposes a dynamic graph CNN that updates the neighborhood graph at each layer and learns both local and global information. This work is inspired by PointNet but uses a simple operation known as EdgeConv which, rather than being applied to individual points independently, constructs a locally connected neighborhood graph to exploit both center nodes and edge features. Many other efforts have been made to learn local structure. For example, SpiderCNN [32] proposes a multi-scale hierarchical architecture that extends convolutional operations from regular grids to irregular point sets that can be embedded in IR^n. However, these approaches are not designed to learn spatial relationships in multi-categorical point sets. In addition, they do not fully exploit the spatial distribution of points beyond simply measuring relative distance or applying a discretization or feed-forward neural network to the coordinates.

Spatial Association Interest Measures. Cross-K Function: Spatial statistics [13] uses the cross-K function, a generalization of Ripley's K function, to detect spatial relationships between point patterns with more than one feature. The cross-K function K_{ij}(h) for binary spatial features is defined as K_{ij}(h) = λ_j^{-1} E[number of type-j instances within distance h of a randomly chosen type-i instance], where i and j represent two category types, λ_j is the density of type-j instances, h is the distance, and E[·] is the expectation. The cross-K function can be estimated as K̂_{ij}(h) = (λ_i λ_j A)^{-1} Σ_k Σ_l I_h(d(i_k, j_l)), where d(i_k, j_l) is the distance between the k-th type-i instance and the l-th type-j instance, I_h is an indicator of whether that distance is within h, and A is the study area [26]. The value of cross-K is a function of the neighborhood distance h, which captures the spatial relationship between categorical points at different scales.

Participation Index: The co-location pattern interest measure most related to the cross-K function is the participation index. The participation index, an upper-bound approximation of the cross-K function, possesses an anti-monotone property that can be used for computational efficiency. Before defining the participation index, we need to define another interest measure, the participation ratio.
The participation ratio pr(C, f_i) of feature f_i in a co-location pattern C = {f_1, ..., f_m}, 1 ≤ i ≤ m, is the fraction of spatial objects of feature f_i in the neighborhood of instances of co-location C. The participation index pi(C) is then defined as the minimum participation ratio of the features in a co-location pattern, that is, pi(C) = min_{f_i ∈ C} { pr(C, f_i) }. The overall formulation of the participation ratio is:

pr(C, f_i) = (number of distinct instances of f_i in instances of C) / (number of instances of f_i).

From the equation above, it can be observed that the value of the participation index is between 0 and 1. A large pi(C) value shows that events of f_i tend to be located in close spatial proximity to events of the other features in C. We used the cross-K function and the participation index to quantify the spatial relationship between categorical point sets. For example, given a point set containing points belonging to c categories and a set of neighborhood distance thresholds H = {h_1, ..., h_t}, there will be c(c − 1) × t cross-K functions or participation index pairs.

Local reference frame characterization is proposed based on the following theorem, whose proof is given in [10].

Theorem E.1. Let ψ(x) = (e^{i⟨a_j, x⟩}, j = 1, 2, 3) ∈ C^3, where e^{iθ} = cos θ + i sin θ and ⟨a_j, x⟩ is the inner product of a_j and x. Here a_1, a_2, a_3 ∈ R^2 are 2D vectors such that the angle between each pair is 2π/3 and, for all j, ‖a_j‖ = 2. Let U ∈ C^{3×3} be a random complex matrix such that U*U = I. Then v(x) = Uψ(x) and M(Δx) = U diag(ψ(Δx)) U* satisfy

v(x + Δx) = M(Δx) v(x)   and   ⟨v(x + Δx), v(x)⟩ = d(1 − ‖Δx‖²),

where v(x) is the representation of location x, d = 3 is the dimension of v(x), and Δx is a small displacement from x. The multi-scale representation PE_grid(x) is then formed as a concatenation of such position embeddings v(x) over the S scales.

References
[1] Nobel prize for decoding brain's sense of place.
[2] Spatial interaction of tumor cells and regulatory T cells correlates with survival in non-small cell lung cancer.
[3] Analysis of multispectral imaging with the AstroPath platform informs efficacy of PD-1 blockade.
[4] Grape detection with convolutional neural networks.
[5] The colorectal cancer tumor microenvironment and its impact on liver and lung metastasis.
[6] Deep computational pathology in breast cancer.
[7] Identification of FDA approved drugs targeting COVID-19 virus by structure-based drug repositioning.
[8] Twitter sentiment on Affordable Care Act using score embedding.
[9] CD8+ cytotoxic T lymphocytes in cancer immunotherapy: A review.
[10] Learning grid cells as vector representation of self-position coupled with matrix representation of self-motion.
[11] REvoSim: Organism-level simulation of macro and microevolution.
[12] Close proximity of immune and tumor cells underlies response to anti-PD-1 based therapies in metastatic melanoma patients.
[13] An introduction to spatial data mining. The Geographic Information Science & Technology Body of Knowledge.
[14] Attention is not only a weight: Analyzing transformers with vector norms.
[15] SRNet: A spatial-relationship aware point-set classification method for multiplexed pathology images.
[16] Feature-driven local cell graph (FLocK): New computational pathology-based descriptors for prognosis of lung cancer and HPV status of oropharyngeal cancers.
[17] Multi-scale representation learning for spatial feature distributions using grid cells.
[18] An ecological measure of immune-cancer colocalization as a prognostic factor for breast cancer.
[19] How do immune checkpoint inhibitors work against cancer?
[20] Scikit-learn: Machine learning in Python.
[21] Inferring predator-prey interactions in food webs.
[22] PointNet: Deep learning on point sets for 3D classification and segmentation.
[23] PointNet++: Deep hierarchical feature learning on point sets in a metric space.
[24] histoCAT: Analysis of cell phenotypes and interactions in multiplex image cytometry data.
[25] Understanding COVID-19 effects on mobility: A community-engaged approach.
[26] Discovering spatial co-location patterns: A summary of results.
[27] Graph attention networks.
[28] Coronavirus biology and replication: Implications for SARS-CoV-2.
[29] PD1 blockade reverses the suppression of melanoma antigen-specific CTL by CD4+ CD25Hi regulatory T cells.
[30] Dynamic graph CNN for learning on point clouds.
[31] Tumor microenvironment and therapeutic response.
[32] SpiderCNN: Deep learning on point sets with parameterized convolutional filters.
[33] Understanding heterogeneous tumor microenvironment in metastatic melanoma.

Acknowledgments: This material is based upon work supported by the NSF under Grants No. 2040459, 1737633, 1901099, 1218168, and 1916518. We also thank Kim Koffolt and the Spatial Computing Research Group for valuable comments and refinements.