Product Manifold Learning
Sharon Zhang, Amit Moscovich, Amit Singer
October 20, 2020

Abstract. We consider problems of dimensionality reduction and learning data representations for continuous spaces with two or more independent degrees of freedom. Such problems occur, for example, when observing shapes with several components that move independently. Mathematically, if the parameter space of each continuous independent motion is a manifold, then their combination is known as a product manifold. In this paper, we present a new paradigm for non-linear independent component analysis called manifold factorization. Our factorization algorithm is based on spectral graph methods for manifold learning and the separability of the Laplacian operator on product spaces. Recovering the factors of a manifold yields meaningful lower-dimensional representations and provides a new way to focus on particular aspects of the data space while ignoring others. We demonstrate the potential use of our method for an important and challenging problem in structural biology: mapping the motions of proteins and other large molecules using cryo-electron microscopy datasets.

Consider a data-generating process F which maps vectors of latent (unobserved) variables to observations,

x = F(θ_1, . . . , θ_m).   (1)

Our focus is on the case where the latent variables are low-dimensional and the observations are high-dimensional vectors. As an illustrative example, consider an industrial articulated robot where the latent variables θ_1, . . . , θ_m correspond to the angles of its m rotary joints, and the observed vector x is a set of measurements recorded by an external monitoring system. When F is deterministic (i.e. without noise), full-rank, smooth and injective, the space of observations is a submanifold of R^D (Lee, 2012, Theorem 4.14). This observation underlies the field of manifold learning (Tenenbaum et al., 2000; Belkin and Niyogi, 2003; Coifman and Lafon, 2006; Talmon et al., 2013). A key problem in this field is dimensionality reduction, where a set of high-dimensional observations x_1, . . . , x_n ∈ R^D is mapped to a low-dimensional space, preferably one whose dimension is not much larger than the dimension of the latent space.

In this work, we explore the idea of manifold learning when the latent space is a product manifold. If each latent variable θ_i lies on a manifold M_i of dimension d_i, then the latent space is the Cartesian product

M = M_1 × · · · × M_m,   (2)

which is a manifold of dimension d = d_1 + · · · + d_m. We refer to the decomposition (2) as a manifold factorization of M and to each of the manifolds M_i as factors. Learning in the latent space is achieved by a data-dependent manifold factorization algorithm whose input is the output of spectral graph-based methods for manifold learning (Belkin and Niyogi, 2003; Coifman and Lafon, 2006). These algorithms output a set of N Laplacian eigenvectors which approximate the Laplacian eigenfunctions of the manifold. On a product manifold, the eigenfunctions are nothing but products of eigenfunctions on the factor manifolds. Hence, we can try to find the eigenvectors of the factor manifolds by seeking two small sets of vectors whose pairwise elementwise products produce all of the N eigenvectors.
Instead of having each observation map to a point on a high-dimensional manifold, our approach enables each observation to be mapped to several points on separate low-dimensional spaces. Since each factor may encode a different aspect of the data, this opens the door to exciting new methods for data visualization and representation. For example, one can use it to visualize observations which come from a high-dimensional latent space as a set of points on m low-dimensional manifolds that correspond to the factors M_1, . . . , M_m, then determine which aspects of the data space correspond to which factor, and ultimately focus only on the factors of interest.

In this paper, we build on spectral embedding methods for dimensionality reduction and data representation (Belkin and Niyogi, 2003; Coifman and Lafon, 2006). These methods construct a data-dependent graph Laplacian which approximates the Laplacian operator on the data manifold. Furthermore, the eigenvectors of the graph Laplacian converge to the eigenfunctions of the manifold Laplacian. By mapping the observations to their eigenvector coordinates one can obtain a low-dimensional embedding of the data space (Bates, 2014).

One work that is closely related to ours is the spectral method of Singer (2006a) for linear independent component analysis (ICA), as it is also based on the separability of the Laplacian on a product space. This was later extended to a method for non-linear ICA that assumes that the latent space has a product structure and that the Jacobian of the mapping (θ_1, . . . , θ_m) → x is either known or can be estimated in some way (Singer and Coifman, 2008). Recently, Rodolà et al. (2019) studied applications of spectral representations on product manifolds to shape analysis problems such as finding correspondences. In the neural-network literature, learning disentangled representations has drawn interest in recent years (Reed et al., 2014; Locatello et al., 2019; Tran et al., 2017; Siddharth et al., 2017; Kim and Mnih, 2018). Unlike our work, which makes strict assumptions on the data manifold, the disentanglement literature seems to treat broader classes of data-generating processes and does not have a formal notion of what constitutes a disentangled representation.

We begin with a brief review of some useful properties of the Laplace operator, and then examine how the Helmholtz equation separates over product domains in R^d. Lastly, we go over results that relate the graph Laplacian to the continuous Laplacian, and introduce the connection to diffusion maps.

Throughout this section, let f be a real-valued, twice-differentiable function and let M be a compact d-dimensional manifold. Let ∇ = ∇_M be the gradient and ∆ = ∆_M = ∇_M · ∇_M the manifold Laplacian (or Laplace-Beltrami operator). We consider the solutions of the Helmholtz equation

∆f = λf,   (3)

with Neumann boundary conditions

∂f/∂ν = 0 on the boundary of M,   (4)

where ν is a unit vector normal to the boundary. The functions f are known as the Neumann eigenfunctions, and together with the corresponding eigenvalues λ they comprise the spectrum of the Laplacian operator on M. In this paper, we will only consider the particular Neumann boundary condition specified in (4), so we refer to the Neumann eigenfunctions simply as the eigenfunctions. A well-known property of the eigenfunctions is the following.

Theorem 1. Let {f_k}_{k=1}^∞ be the Neumann eigenfunctions, and {λ_k}_{k=1}^∞ be the corresponding eigenvalues, of a compact domain Ω ⊂ R^d.
Then:
(i) The eigenvalues are non-negative and form a discrete sequence 0 = λ_1 ≤ λ_2 ≤ · · · with λ_k → ∞.
(ii) The eigenfunctions {f_k}_{k=1}^∞ form a complete orthonormal basis for L^2(Ω).

For a complete proof and a discussion of generalizations, see (Folland, 1976). In particular, the second result of Theorem 1 tells us that we can express any square-integrable function over the data manifold in terms of the eigenfunctions. Analogs of these results also hold for more general Riemannian manifolds.

As we shift to the case where our data comes from a product manifold M, we can take advantage of some key separability properties of the Laplacian. For simplicity, we treat the case M = M_1 × M_2.

Proposition 1. Let f_1 : M_1 → R and f_2 : M_2 → R be twice-differentiable functions such that ∆_{M_1} f_1 = λ_1 f_1 and ∆_{M_2} f_2 = λ_2 f_2. Take π_i : M → M_i to be the projection of M onto M_i for i = 1, 2. We can then define the natural extension of f_i to M via g_i = f_i ∘ π_i. It follows that ∆_M (g_1 g_2) = (λ_1 + λ_2) g_1 g_2.

For a proof, see for example (Canzani, 2013, Section 4.6). Proposition 1 tells us that the product of eigenfunctions of M_1 and M_2 is an eigenfunction of M and that the corresponding eigenvalue is equal to the sum of the eigenvalues on M_1 and M_2. Note that this result easily extends to the general product of m > 2 Riemannian manifolds.

Thus far, we can conclude that products of eigenfunctions of manifold factors are eigenfunctions of the product manifold. The converse is actually also true, and this can be shown by an application of the Stone-Weierstrass theorem (Canzani, 2013). Hence, the eigenfunctions of M = M_1 × M_2 are precisely the products {f_i · g_j}_{i,j}, where {f_i} are the eigenfunctions of M_1 with eigenvalues {λ_i} and {g_j} are the eigenfunctions of M_2 with eigenvalues {µ_j} (both extended to M via the projections), with corresponding eigenvalues {λ_i + µ_j}_{i,j}. For the m > 2 case, the eigenfunctions of M are of the form f^{(1)}_{k_1} · · · f^{(m)}_{k_m}, with corresponding eigenvalues λ^{(1)}_{k_1} + · · · + λ^{(m)}_{k_m}. Note that since the constant function is an eigenfunction with eigenvalue λ = 0 on each factor, the eigenfunctions of the individual factors also appear in this list.

As an example, consider the rectangle Ω = [0, a] × [0, b]. The Laplacian eigenfunctions with Neumann boundary conditions are

f_{m,n}(x, y) = (1/2) cos(mπx/a) cos(nπy/b),

with eigenvalues

λ_{m,n} = π²(m²/a² + n²/b²).

The eigenfunctions are precisely the products of the eigenfunctions on the closed intervals [0, a] and [0, b], which are cos(mπx/a) and cos(nπy/b). The eigenvalues of the product space are also the sums of the corresponding eigenvalues of the intervals. For the Laplacian eigenfunctions and eigenvalues on other domains, see the review by Grebenkov and Nguyen (2013).

The results of the previous sections hold for the continuous Laplacian, but in practice we only have a finite set of observed samples x_1, . . . , x_n ∈ R^D. Thus, we must turn to the discrete analog of the Laplace operator, the graph Laplacian. To start, define the fully-connected graph G = (V, E) on the samples whose edge weight matrix W ∈ R^{n×n} is given by a kernel, e.g. the Gaussian kernel W_{ij} = exp(−‖x_i − x_j‖² / ε) for some bandwidth ε > 0. We then compute the random walk matrix A = D^{−1} W, where D is the diagonal degree matrix with D_{ii} = Σ_j W_{ij}. The matrix A can be viewed as a transition probability matrix: the (i, j) entry of A^t is the probability that a random walk on G starting at i ends up at position j at time t. The diffusion map (Coifman and Lafon, 2006; Belkin and Niyogi, 2003) is a mapping φ : V → R^{n−1} given by

φ(x_i) = (φ_1(i), φ_2(i), . . . , φ_{n−1}(i)),

where φ_j are the non-trivial right eigenvectors of A and φ_j(i) denotes the ith coordinate of φ_j. The coordinates of the embedding are known as diffusion coordinates. It can be shown that the diffusion distance, which is defined as the Euclidean distance in the diffusion coordinates, corresponds to the difference between the probability clouds of random walks started at different points (Coifman and Lafon, 2006).
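This construction can be sketched in a few lines of Python. The following is a minimal sketch, assuming a Gaussian affinity kernel with a user-supplied bandwidth eps; the function name, the parameter names and the dense eigendecomposition are illustrative choices, not the implementation used in the paper, which relies on Scikit-learn's randomized SVD.

```python
import numpy as np
from scipy.spatial.distance import cdist

def random_walk_laplacian_eigs(X, eps, num_eigs):
    """Approximate eigenpairs of L_rw = I - D^{-1} W from samples X (n x D).

    eps      : Gaussian kernel bandwidth (an assumed tuning parameter)
    num_eigs : number of non-trivial eigenpairs to return
    """
    # Gaussian affinity matrix of the fully-connected graph
    W = np.exp(-cdist(X, X, 'sqeuclidean') / eps)
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetrically normalized affinity S = D^{-1/2} W D^{-1/2};
    # L_sym = I - S shares its eigenvalues with L_rw.
    S = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(S)                 # ascending order
    order = np.argsort(-vals)                      # largest S-eigenvalues first
    vals, vecs = vals[order], vecs[:, order]
    lambdas = 1.0 - vals                           # eigenvalues of L_rw, ascending
    phis = d_inv_sqrt[:, None] * vecs              # right eigenvectors of A = D^{-1} W
    # Drop the trivial constant eigenvector (eigenvalue 0)
    return lambdas[1:num_eigs + 1], phis[:, 1:num_eigs + 1]
```

The smallest eigenvalue of L_rw is zero with a constant eigenvector, so the returned columns start from the first non-trivial eigenvector; these play the role of the diffusion coordinates φ_1, . . . , φ_N used below.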
Alternatively, if we view the observed data as samples of a smooth signal on the manifold, then the eigenvectors can also be interpreted as a Fourier basis for these signals (Coifman and Lafon, 2006; Lee and Izbicki, 2016).

The standard graph Laplacian is defined as L = D − W. We can also define a normalized version called the random walk Laplacian, L_rw = D^{−1} L = I − A. The random walk Laplacian has several properties, discussed in the next section, which make it the most suitable approximation for our purposes. Computing the eigenvectors and eigenvalues of L_rw directly is computationally inefficient, but we can compute them indirectly via the spectrum of the symmetric Laplacian, L_sym = D^{−1/2} L D^{−1/2}. The spectrum of L_rw can then be derived directly from the spectrum of L_sym (Von Luxburg, 2007). A condensed version of this procedure can also be found in the thesis of Lafon (2004).

The graph Laplacian constructed from an i.i.d. sample of points on a manifold converges to a linear differential operator. For the standard kernel-based construction of the graph, and when the samples are drawn uniformly, the limiting operator is the Laplace-Beltrami operator ∆_M (Hein et al., 2005; Singer, 2006b; Belkin and Niyogi, 2008). If the data instead has a nonuniform density p(x), the graph Laplacian converges to the Fokker-Planck operator on the manifold, which has an additional drift term given by the log-density U(x) = −2 log p(x) (Nadler et al., 2005; Coifman and Lafon, 2006). Alternative graph constructions, such as the k-nearest-neighbor graph, also converge to a Fokker-Planck operator (Ting et al., 2010). In addition to the pointwise convergence, several works have proved the spectral convergence of the graph Laplacian, i.e. the convergence of the eigenvalues and eigenvectors of graph Laplacians (von Luxburg et al., 2008; Rosasco et al., 2010; García Trillos and Slepčev, 2018; García Trillos et al., 2020).

As discussed in Section 2.2, each eigenfunction ϕ_k on the product space M = M_1 × M_2 is the product of an eigenfunction ϕ_i on M_1 and an eigenfunction ϕ_j on M_2. A corollary of the convergence of the graph Laplacian (see Section 2.4) is the following: given a large sample of points on M (or any isometric embedding of it), if we construct the graph Laplacian from the sample, then each Laplacian eigenvector ϕ_k should be approximately equal (up to normalization) to an elementwise product ϕ_i ϕ_j, where ϕ_i and ϕ_j approximate eigenfunctions on M_1 and M_2 respectively. If either ϕ_i or ϕ_j corresponds to an eigenfunction with eigenvalue λ = 0, we shall say ϕ_k is a factor eigenvector. Otherwise, we call ϕ_k ≈ ϕ_i ϕ_j a product eigenvector. Note that for data sampled uniformly from a connected manifold, the only eigenfunctions with λ = 0 are the constant functions.

The main idea behind our method is to first differentiate between the factor eigenvectors and the product eigenvectors, and then to divide the factor eigenvectors into two sets that correspond to the eigenspaces of M_1 and M_2. This is based on the observation that if ϕ_k ≈ ϕ_i ϕ_j is a product eigenvector, then ϕ_i and ϕ_j likely belong to different factor manifolds. Mathematically, our method is based on two assumptions.

Assumption 1. The observed data lies on (an isometric embedding of) a product manifold M = M_1 × M_2.

Remark 1. Isometric embeddings into Euclidean space exist for any Riemannian manifold (Nash, 1954; Kuiper, 1955). As isometric embeddings preserve the Riemannian metric, the spectral properties are not affected by the specific embedding.

Assumption 2. The latent variables
θ_1, . . . , θ_m are drawn independently of each other.

Remark 2. This assumption may be relaxed by using the diffusion maps normalization for the graph Laplacian (Coifman and Lafon, 2006). In contrast to the random-walk graph Laplacian, which converges to a density-dependent Fokker-Planck operator, with the diffusion maps normalization the graph Laplacian converges to the Laplace-Beltrami operator, which does not depend on the density.

We now present an algorithm for factoring the set of eigenvectors of the graph Laplacian. First, we find for each eigenvector ϕ_k the pair ϕ_i, ϕ_j whose elementwise product is closest to it. This "closeness" is measured by the absolute cosine similarity

|⟨ϕ_i ∘ ϕ_j, ϕ_k⟩| / (‖ϕ_i ∘ ϕ_j‖ ‖ϕ_k‖),

where ∘ denotes the elementwise product. We take the absolute value of the dot product because of the sign ambiguity of the computed eigenvectors. For each k, we find the combination (i, j) with the highest similarity. To avoid computing the elementwise product for all k(k − 1)/2 combinations, we skip triplets (i, j, k) for which |λ_i + λ_j − λ_k| > δ for some eigenvalue threshold δ > 0, which is specified as a parameter. This "eigenvalue criterion" is based on the results of Proposition 1. In terms of the quality of the match, the eigenvalue criterion only acts as a coarse filter. Thus, we additionally filter out triplets whose highest similarity is less than some threshold γ, which we call the "similarity criterion." In total, we store at most N combinations, one for every 1 ≤ k ≤ N. This process is summarized in Algorithm 1.

Algorithm 1.
Data: Eigenvectors {ϕ_1, . . . , ϕ_N} of L_rw.
Result: List of triplets (i, j, k) where ϕ_k ≈ ϕ_i ϕ_j, and their corresponding similarity scores.

The result of the algorithm is a list of index triplets (i, j, k), indicating that the best factorization we could find for ϕ_k is ϕ_i ϕ_j. Next, we use the list of triplets to divide the candidate factor eigenvectors into two sets that correspond to the manifolds M_1 and M_2, using a Max-Cut algorithm. To do so, we store the computed similarity scores in a T × T separability matrix C, where T is the number of eigenvectors that appear as a factor eigenvector in any triplet. For each stored triplet (i, j, k) with i < j, the entry C_{ij} is set to the similarity score of that triplet, and all other entries are zero; note that C is upper-triangular with zero diagonal. Here, the similarity score is repurposed as a representation of the factorization assignments between each pair of factor eigenvectors in a triplet. The separation of two eigenvectors into eigenspaces of distinct manifold factors should be prioritized according to the similarity of the triplet in which they appear. This naturally lends itself to a graph-cut approach for determining the ideal separation. We use the symmetrized separability matrix, C + C^T, as the input to our Max-Cut algorithm to obtain two groups of factor eigenvectors. Since the original Max-Cut problem is NP-hard, we use the semi-definite program (SDP) relaxation described by Goemans and Williamson (1995).

The entire algorithm was implemented in Python, and the Max-Cut SDP was implemented using CVXPY (Diamond and Boyd, 2016). The first part of the algorithm involves computing the diffusion map of the data. The runtime of this step depends on the Scikit-learn randomized SVD algorithm (Pedregosa et al., 2011; Martinsson et al., 2011). In the subsequent step, we must find the best triplet for each k = 1, . . . , N out of all k(k − 1)/2 candidate pairs. Taking into account the elementwise vector multiplication at each step, this takes O(nN³) time in the worst case.
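To make the pairing step concrete, here is a minimal sketch of the triplet search and of the symmetrized separability matrix, assuming the eigenpairs come from a construction such as the one sketched earlier. The names find_triplets, separability_matrix, eval_tol (δ) and sim_thresh (γ) are hypothetical, and the matrix is kept at the full N × N size rather than the compacted T × T form described above; this is a simplified sketch, not the authors' implementation.

```python
import numpy as np
from itertools import combinations

def find_triplets(lambdas, phis, eval_tol, sim_thresh):
    """Search for triplets (i, j, k) with phi_k approximately equal to phi_i * phi_j.

    lambdas    : (N,) eigenvalues of L_rw (non-trivial, ascending)
    phis       : (n, N) corresponding eigenvectors
    eval_tol   : delta, the eigenvalue criterion |lambda_i + lambda_j - lambda_k| <= delta
    sim_thresh : gamma, the minimum absolute cosine similarity to accept a match
    """
    N = phis.shape[1]
    triplets = []
    for k in range(N):
        best_pair, best_sim = None, 0.0
        for i, j in combinations(range(k), 2):
            # Eigenvalue criterion: skip pairs whose eigenvalues cannot sum to lambda_k
            if abs(lambdas[i] + lambdas[j] - lambdas[k]) > eval_tol:
                continue
            prod = phis[:, i] * phis[:, j]
            # Absolute cosine similarity (eigenvector signs are arbitrary)
            sim = abs(prod @ phis[:, k]) / (np.linalg.norm(prod) * np.linalg.norm(phis[:, k]))
            if sim > best_sim:
                best_pair, best_sim = (i, j), sim
        if best_pair is not None and best_sim >= sim_thresh:
            triplets.append((best_pair[0], best_pair[1], k, best_sim))
    return triplets

def separability_matrix(triplets, N):
    """Symmetrized separability matrix over all N indices (a simplification of the
    T x T matrix in the text, which only keeps indices appearing in some triplet)."""
    C = np.zeros((N, N))
    for i, j, k, sim in triplets:
        C[i, j] = max(C[i, j], sim)
    return C + C.T
```

The two groups of factor eigenvectors are then obtained by (approximately) solving Max-Cut on C + C^T. A sketch of the Goemans-Williamson SDP relaxation in CVXPY with randomized hyperplane rounding might look as follows; the paper uses this relaxation through CVXPY, but the rounding details and the choice of solver here are assumptions.

```python
import cvxpy as cp
import numpy as np

def max_cut_split(S):
    """Split eigenvector indices into two groups via the Goemans-Williamson
    SDP relaxation of Max-Cut on the symmetric similarity matrix S."""
    T = S.shape[0]
    X = cp.Variable((T, T), PSD=True)
    objective = cp.Maximize(cp.sum(cp.multiply(S, 1 - X)) / 4)
    cp.Problem(objective, [cp.diag(X) == 1]).solve()  # default SDP solver (e.g. SCS)
    # Randomized hyperplane rounding of the SDP solution
    w, V = np.linalg.eigh(X.value)
    U = V * np.sqrt(np.clip(w, 0.0, None))   # rows satisfy <u_i, u_j> ~= X_ij
    signs = np.sign(U @ np.random.randn(T))
    return np.where(signs >= 0)[0], np.where(signs < 0)[0]
```

Rounding with a single random hyperplane is the standard way to extract a cut from the SDP solution; averaging over several random hyperplanes and keeping the best cut is a common refinement.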
In practice, however, the eigenvalue criterion eliminates the multiplication computation for the vast majority of triplets. The last part of the algorithm consists of iterating through the list of triplets to form the separability matrix C and then using a Max-Cut SDP solver. Table 1 gives a breakdown of runtimes for each of these parts on 10,000 samples from a rectangle with three-dimensional noise.

We tested the algorithm on two different types of data: uniform samples from a noisy rectangle and synthetic cryo-EM data.

As a baseline, we ran the algorithm on data sampled from a rectangle that lies in R^3. The data was generated by taking random samples from a product of intervals ℓ_1 = [0, √π + 1] and ℓ_2 = [0, 1.5], and then adding Gaussian noise in the z-direction. We ran two different trials on the rectangle data: first, using 10,000 samples with Gaussian noise applied in the z-direction; second, repeating the first trial with additional Gaussian noise added in the x and y directions. For each trial, we used γ = 0.85 and δ = 0.5. The results of the separations are shown in Figure 1. The two groups of eigenvectors clearly show cosine waves running along the x and y directions, which precisely matches the true eigenfunctions for this domain.

Our algorithm has two main input parameters which act as thresholds: δ for the eigenvalue criterion and γ for the similarity criterion. A natural question is, for a given product eigenvector ϕ_k, how much of an outlier is the similarity of the true combination of factor eigenvectors ϕ_i ϕ_j compared to the similarities of all other pairs? To investigate this, we create scatterplots of the similarities against the eigenvalue criterion error over all pairs for various product eigenvectors, shown in Figure 3. We see that in most cases the similarity score for the chosen triplet is indeed an outlier from the rest of the distribution. Unsurprisingly, for product eigenvectors with lower eigenvalues the highest similarity stands out more significantly, as this part of the spectrum is more stable. As the eigenvalue of the product eigenvector increases, the best pair, i.e. the one that has the highest similarity score and satisfies the eigenvalue criterion, becomes slightly more ambiguous. However, this ambiguity tends to be between relatively few candidates. This confirms that the eigenvalue and similarity criteria work well in conjunction.

Figure 3: Analysis of the eigenvalue criterion and similarity score for the rectangle in 3D. We choose six eigenvectors uniformly from the set of product eigenvectors and plot the eigenvalue criterion error and similarity scores for all possible triplets. Indices at the top identify the product eigenvector to which a column corresponds. The highest similarity is marked by a red arrow. The similarity is measured along the x-axis from 0 to 1.

A particularly compelling application of manifold factorization is cryo-electron microscopy (cryo-EM). Cryo-EM is a technique for imaging proteins and other macromolecules that has fostered major advances in structural biology (Kühlbrandt, 2014), earning it the 2017 Nobel Prize in Chemistry (Cressey and Callaway, 2017). More recently, cryo-EM has played a significant role in unraveling the structural properties of the novel coronavirus SARS-CoV-2 (Zimmer, 2020). In cryo-EM, molecular samples are rapidly frozen in a thin layer of ice, thus capturing them in their native states (Dubochet et al., 1988). The preservation of this natural state is key to a better understanding of their biological function. Once the samples are frozen, a transmission electron microscope is used to produce images of
the molecules, which are 2D tomographic projections of their electrostatic potential (Vulović et al., 2013). These 2D images are then used, through a complicated series of computational steps, to compute a 3D reconstruction of the molecule.

The classical cryo-EM reconstruction problem is posed as follows: given n tomographic projection images I_1, . . . , I_n of a particular molecule at random (unknown) orientations, how can we recover the three-dimensional structure of the molecule? A core obstacle to this problem is handling the low signal-to-noise ratio present in the micrographs (Singer and Sigworth, 2020). This task is made even more difficult if we consider the heterogeneity problem, which has been the subject of recent literature (Frank, 2018; Nakane et al., 2018; Andén and Singer, 2018; Sorzano et al., 2019; Lederman et al., 2020; Zhong et al., 2020). Here, a separate challenge arises from structural variation due to motion within the molecule. Namely, for different images I_i we will likely capture different molecular conformations. In particular, several works have focused on the application of diffusion maps to the analysis of cryo-EM datasets with continuous heterogeneity (Dashti et al., 2014; Schwander et al., 2014; Zelesko et al., 2020; Dashti et al., 2020). For molecules that exhibit two or more independent continuous motions, the molecular shape manifold of 3D electrostatic densities is a product space, so one can try using our algorithm to factor it.

For our experiments, we used the model of a potassium ion channel with two independently deformable parts: a spinning subunit with 360° of rotation, and a subunit that stretches in the xy-plane (see Figure 2). To create the images, we sampled 10,000 conformations of the molecule with uniformly random angles and with x and y stretches ranging over [−20, 20] in each direction. All the images were projected from a single view. This models the computational pipeline of Dashti et al. (2020), which computes a diffusion maps embedding for each viewing direction separately. Finally, we use PCA with four components to preprocess the raw images and transform them so that they are zero-centered with unit variance.

We perform three different trials: first, a dataset of molecules with rotations and stretches in the x-direction only and low noise; then, a dataset of molecules with rotations and stretches in the x-direction only and high noise; and finally, a dataset of molecules with rotations and stretches in both the x and y directions and low noise. For each trial, we use γ = 0.80 and δ = 1.0. The results for the second trial are shown in Figure 2. The eigenvectors corresponding to the x-stretch manifold exhibit the ground-truth cosine waves, and the eigenvectors corresponding to the rotation manifold (a circle) exhibit the ground-truth sine waves. Note that even though the latent space is a product space, Assumption 1 does not strictly hold due to the additive Gaussian noise. Nonetheless, our method works nicely on this dataset.

Figure 4: Laplacian eigenmap embeddings for the two manifold factors M_1 and M_2. Each scatterplot shows 10,000 samples of the first two non-trivial eigenvectors found by the algorithm. The colors represent the ground-truth data in the parameter space. Top row: samples from a rectangle with z-direction noise.
The algorithm was run with δ = 0.5, γ = 0.85. Middle row: samples from noisy cryo-EM data. The algorithm was run with δ = 1.0, γ = 0.85. Bottom row: samples from a torus. The algorithm was run with δ = 1.0 and γ = 0.6.

We can visualize the manifold factors by plotting the Laplacian eigenmap embeddings. Figure 4 shows the eigenmap embeddings for three different experiments: the rectangle with noise in the z-direction; noisy cryo-EM data with a rotational component and a stretch component in the x-direction; and a torus. In each case, the embeddings are in agreement with the actual structure of the underlying domain. These embeddings show that the results of the algorithm can be used to accomplish two objectives. First, we can use representative factor eigenvectors from each manifold factor to get a more condensed representation of the data, thus achieving dimensionality reduction. This is similar to using diffusion coordinates for dimensionality reduction, but the diffusion coordinates of the different factors generally appear interleaved, in no particular order. Additionally, by separating the factor eigenvectors according to the separate manifold factors, we also "de-mix" these diffusion coordinates into a more interpretable order.

In this paper, we presented a spectral method for non-linear dimensionality reduction and data representation. Our method is based on manifold factorization and applies to data sets with two or more independent degrees of freedom. We tested our algorithm on electron-microscope snapshots of a protein with two independently moving subunits and demonstrated that we can recover the motion manifolds for each subunit. Hence, our approach shows promise as a tool for the analysis of macromolecules with continuous degrees of freedom. While our discussion and experiments are limited to the two-factor case, our algorithm can be extended to the general case of K independent latent variables by replacing Max-Cut with Max-K-Cut (Newman, 2018) or similar procedures. Another interesting direction for future work is to study the potential of our approach for analyzing broader classes of data sets, other than product spaces, where the Laplacian operator is still separable.
Code for reproducing the figures in this paper can be found at: https://github.com/sxzhang25/product-manifold-learning

References

Structural Variability from Noisy Tomographic Projections
The embedding dimension of Laplacian eigenfunction maps
Laplacian Eigenmaps for Dimensionality Reduction and Data Representation
Towards a theoretical foundation for Laplacian-based manifold methods
Analysis on Manifolds via the Laplacian (lecture notes)
Applied and Computational Harmonic Analysis
Cryo-electron microscopy wins chemistry Nobel
Trajectories of the ribosome as a Brownian nanomachine
Retrieving functional pathways of biomolecules from single-particle snapshots
CVXPY: A Python-Embedded Modeling Language for Convex Optimization
Cryo-electron microscopy of vitrified specimens
Introduction to Partial Differential Equations
New Opportunities Created by Single-Particle Cryo-EM: The Mapping of Conformational Space
A variational approach to the consistency of spectral clustering
Error Estimates for Spectral Convergence of the Graph Laplacian on Random Geometric Graphs Toward the Laplace-Beltrami Operator
Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming
From Graphs to Manifolds - Weak and Strong Pointwise Consistency of Graph Laplacians
Disentangling by factorising
The Resolution Revolution
On C1-isometric imbeddings. I. Indagationes Mathematicae (Proceedings)
Diffusion Maps and Geometric Harmonics
Hyper-molecules: on the representation and recovery of dynamical structures for applications in flexible macro-molecules in cryo-EM
A spectral series approach to high-dimensional nonparametric regression
Introduction to Smooth Manifolds
Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations
A randomized algorithm for the decomposition of matrices
Cryo-EM reconstruction of continuous heterogeneity by Laplacian spectral volumes
Diffusion Maps, Spectral Clustering and Eigenfunctions of Fokker-Planck operators
Erik Lindahl and Sjors HW Scheres. Characterisation of molecular motions in cryo-EM single-particle data by multibody refinement in RELION. eLife
C1 Isometric Imbeddings
Complex semidefinite programming and max-k-cut
Scikit-learn: Machine Learning in Python
Learning to disentangle factors of variation with manifold interaction
Functional Maps Representation on Product Manifolds
On Learning with Integral Operators
Russell Fung and Abbas Ourmazd. Conformations of macromolecules and their complexes from heterogeneous datasets
Learning Disentangled Representations with Semi-Supervised Deep Generative Models
From graph to manifold Laplacian: The convergence rate
Spectral independent component analysis
Non-linear independent component analysis with diffusion maps
Computational Methods for Single-Particle Electron Cryomicroscopy
Survey of the analysis of continuous conformational variability of biological macromolecules by electron microscopy
Diffusion Maps for Signal Processing: A Deeper Look at Manifold-Learning Techniques Based on Kernels and Graphs
A Global Geometric Framework for Nonlinear Dimensionality Reduction
An Analysis of the Convergence of Graph Laplacians
Disentangled Representation Learning GAN for Pose-Invariant Face Recognition
A tutorial on spectral clustering
Consistency of spectral clustering
Image formation modeling in cryo-electron microscopy
Earthmover-Based Manifold Learning for Analyzing Molecular Conformation Spaces
Reconstructing continuous distributions of 3D protein structure from cryo-EM images
The Coronavirus Unveiled