key: cord-0435982-sykotx8d authors: Bu, Kaifeng; Koh, Dax Enshan; Li, Lu; Luo, Qingxian; Zhang, Yaobo title: On the statistical complexity of quantum circuits date: 2021-01-15 journal: nan DOI: nan sha: 25483d718631fe569c10895f265225ff8983b02c doc_id: 435982 cord_uid: sykotx8d In theoretical machine learning, the statistical complexity is a notion that measures the richness of a hypothesis space. In this work, we apply a particular measure of statistical complexity, namely the Rademacher complexity, to the quantum circuit model in quantum computation and study how the statistical complexity depends on various quantum circuit parameters. In particular, we investigate the dependence of the statistical complexity on the resources, depth, width, and the number of input and output registers of a quantum circuit. To study how the statistical complexity scales with resources in the circuit, we introduce a resource measure of magic based on the $(p,q)$ group norm, which quantifies the amount of magic in the quantum channels associated with the circuit. These dependencies are investigated in the following two settings: (i) where the entire quantum circuit is treated as a single quantum channel, and (ii) where each layer of the quantum circuit is treated as a separate quantum channel. The bounds we obtain can be used to constrain the capacity of quantum neural networks in terms of their depths and widths as well as the resources in the network. Owing to its ability to recognize and analyze patterns in data and use them to make predictions, deep learning-a subfield of machine learning-has made a profound impact on the computing industry [1] [2] [3] and has found applications in a myriad of fields, including natural language processing [4] [5] [6] , drug design [7, 8] , fraud detection [9, 10] , medical image analysis [11, 12] , self-driving cars [13, 14] , handwriting recognition [15, 16] , and computer vision [17, 18] . A central object in many deep learning models is the neural network, an interconnected collection of nodes that can learn from data and model relationships between them [19] . Different neural networks differ in terms of their ability to learn from data, and understanding this difference is a key problem in theoretical machine learning. This ability of neural networks has been quantified by various statistical complexity measures, including the Vapnik-Chervonenkis (VC) dimension [20, 21] , the metric entropy [22] , the Gaussian complexity [23] , and the Rademacher complexity [23] . The dependence of these measures on various structure parameters of the neural network, such as its depth and width and the number of parameters in the neural network, has been studied in a number of papers [24] [25] [26] [27] [28] [29] . In addition to the progress in deep learning, the last decade also saw rapid developments in quantum computing [30] . With the development of noisy intermediate-scale quantum (NISQ) hardware [31] as well as near-term quantum algorithms like the variational quantum eigensolver (VQE) [32] and the quantum approximate optimization algorithm * kfbu@fas.harvard.edu † dax_koh@ihpc.a-star.edu.sg (QAOA) [33, 34] , there are expectations that quantum computers are poised to revolutionize computation by speeding up the solutions of certain practical computational problems [35] . Major experimental milestones in this direction include the recent demonstrations of quantum computational supremacy [36, 37] (also called quantum advantage [38] ), defined to be an event in which a quantum computer empirically solves a computational problem deemed intractable for classical computers, independent of the practical value of the problem [39] [40] [41] [42] . At the intersection of deep learning and quantum computing is the field of quantum deep learning, which has the quantum neural network-the quantum generalization of the classical neural network-as one of its central objects [43] [44] [45] [46] [47] [48] . Quantum deep learning has been explored as an application of quantum machine learning, which has gained significant interest of late [49] [50] [51] [52] [53] . Compared to the classical neural networks, however, considerably less is known about quantum neural networks and characterizations of their statistical complexities. For example, the following question has hitherto remained largely unaddressed: how does the statistical complexity of quantum neural networks depend on the structure parameters of the quantum circuit underlying it as well as the amount of certain resources it contains? In this paper, we address the above gap by characterizing the statistical complexity of quantum circuits in terms of their Rademacher complexity. To characterize the dependence of Rademacher complexity on resources in the framework of quantum resource theories [54, 55] , we introduce a resource measure of magic [56] for quantum channels based on the (p, q) group norm. We consider the Rademacher complexity of quantum circuits in two different settings. First, we consider the case where the entire quantum circuit is treated as a single quantum channel independent of its depth or width. In this case, we find a bound for the statistical complexity that depends on a resource measure of magic as well as the number of input and output qubits. Second, we consider the case where each layer of the quantum circuit is treated as a separate quantum channel. In this case, we find a bound for the statistical complexity that depends not only on the resource measure of magic but also on the depth and width of the quantum circuit. Consider m independent samples S = ( x 1 , . . . , x m ), where each x i is encoded as a quantum state |ψ( x i ) . After a quantum circuit C (e.g., C could be an instance of a variational quantum circuit or a quantum neural network) is applied to the quantum state |ψ( x i ) and a (Hermitian) observable H is measured on the output, the expected measurement outcome is given by In this way, each quantum circuit C defines a real-valued function f C . Let F • C := { f C : C ∈ C} denote the function class defined by the set of quantum circuits C. Consider the hypothesis space H = F • C, where C is a given set of quantum circuits. Given m independent samples from some unknown probability distribution D on some X × Y, let us consider a loss function l : Y × Y → R. The goal of the learning task is to find some function in the hypothesis space that minimizes the expected error L( f ) = E ( x,y)∼D l( f ( x), y). As we have access to only the m independent samples { ( x i , y i ) } m i=1 , one strategy is to find some function in hypothesis space to minimize the empirical errorL The difference between the empirical and expected error is called the generalization error, which determines the performance of the hypothesis function f on the unseen data drawn from the unknown probability distribution. The Rademacher complexity is a measure of the richness of a hypothesis space and can be used to provide bounds on the generalization error associated with learning from training data [23, 57] . Let us consider the Rademacher complexity of F • C on m independent samples S = { x 1 , ..., x m }, defined as where each ε i in the expectation above is a Rademacher random variable, which takes the values ±1 with equal probability 1/2. Here, we use the Rademacher complexity as a measure of the statistical complexity of the hypothesis space F •C. A. Rademacher complexity of quantum channels Given a quantum channel Φ : L((C 2 ) ⊗n 1 ) → L((C 2 ) ⊗n 2 ) from n 1 qubits to n 2 qubits, we define the 4 n 2 × 4 n 1 representation matrix M Φ of Φ to be the matrix whose entries are given by where x ∈ { 0, 1, 2, 3 } n 1 , z ∈ { 0, 1, 2, 3 } n 2 , and P x , P z are the corresponding Pauli operators. For any Hermitian operator P, the representation vector α P of P is defined as For any N 1 × N 2 matrix M, which can be treated as a column of N 1 row vectors, the (p, q) group norm of M, where . Of interest to us is the (p, q) group norm of the representation matrix of quantum channels. As we shall show in Appendix A, the (p, q) group norm of the representation matrix of quantum gates can be used as a resource measure to quantify the amount of magic in the quantum gates. Here, we treat the entire quantum circuit as a single quantum channel. Let us define C n 0 ,n 1 · p,q ≤µ to be the set of quantum circuits from n 0 qubits to n 1 qubits that have a (p, q) group norm bounded by µ. Theorem 1. Given the set of quantum circuits C from n 0 qubits to n 1 qubits with bounded (p, q) norm · p,q , the Rademacher complexity of F • C n 0 ,n 1 · p,q ≤µ on m independent samples S = { x 1 , ..., x m } is bounded as follows: (1) For 1 ≤ p ≤ 2, we have (2) For 2 < p < ∞, we have where p * is the Hölder conjugate of p, i.e., 1 p + 1 p * = 1; and α and f I ( x i ) are the representation vectors of H and |ψ(x i ) ψ(x i )| in the Pauli basis, respectively. This result provides an upper bound on the Rademacher complexity of quantum circuits that depends on the amount of magic and the number of input and output qubits. (See Appendix A for a proof of Theorem 1.) We now consider the special case where the quantum channel Φ is unital, i.e., Φ(I) = I. In this case, the representation matrix M Φ has the following form M Φ = 1 0 T 0M Φ . We shall define the modified representation matrixM Φ to be the bottom-right (4 n 2 − 1) × (4 n 2 − 1) submatrix of M Φ . Next, note that the representation vector of a Hermitian operator P can be written as α P = (α 0 ,ˆ α P ). We shall callˆ α P the modified representation vector of the operator P. For a unital channel Φ, we shall denote the (p, q) group norm of the modified representation matrixM Φ as M Φ p,q . Note that the (p, q) group norm of the modified representation matrix of unital quantum channels can be regarded as a resource measure of magic (see Appendix B). Similarly, let us define C n 0 ,n 1 · p,q ≤µ to be the set of unital quantum circuits C from n 0 qubits to n 1 qubits with bounded norm · p,q . Theorem 2. Let H be a traceless observable. Given a set of unital quantum circuits from n 0 qubits to n 1 qubits with bounded norm · p,q , the Rademacher complexity of F • C n 0 ,n 1 · p,q ≤µ on m samples S = { x 1 , ..., x m } is bounded as follows. (1) For 1 ≤ p ≤ 2, we have (2) For 2 < p < ∞, we have andˆ α andˆ f I ( x i ) are the modified representation vector of H and |ψ(x i ) ψ(x i )| in the Pauli basis, respectively. The proof of this theorem is presented in Appendix B. In this subsection, we take the depth and width of the quantum circuits involved into account by considering the the layer structure of the circuits. Consider a depth-l quan- Fig. 1 for a circuit diagram). We shall denote the quantum circuit as C l = (Φ l , Φ l−1 , ..., Φ 1 ) and the set of quantum circuits with fixed depth l and width vector n = (n l , . . . , n 1 , n 0 ) as Next, let us define the resource measure for a depth-l quantum circuit C l as follows: FIG. 1. Circuit diagram of a depth-l quantum circuit which represents the average amount of magic over the layers of the quantum circuit. Let us denote C l, n ν p,q ≤ν to be the set of quantum circuits with bounded resource ν p,q ≤ ν, fixed depth l, and width vector n (See Fig. 2 ). Then we have the following results. Theorem 3. Given the set of depth-l quantum circuits with bounded resource ν p,q ≤ ν, the Rademacher complexity on m independent samples S = { x 1 , ..., x m } is bounded as follows. ( (14) where K p (S, H) is defined by Eq. (7), and n 1 = ∑ l i=1 n i . This theorem tells us how the Rademacher complexity depends on the depth, width and the amount of magic in the quantum circuits. Note that we can choose suitable p, q to reduce the exponential dependence on the width vector to polynomial dependence, for example, by taking p * = q = Ω( n 1 / log n 1 ) or p * = q = ∞. The proof of Theorem 3 is presented in Appendix C. If the quantum channel in the quantum circuit is unital (for example, a unitary quantum channel), then we modify the resource measure as follows (See Fig. 3 ): We are now ready to state our next result. Theorem 4. Let H be a traceless observable. Given the set of depth-l quantum circuits with bounded resourceν p,q ≤ ν, the Rademacher complexity of F • C l, n ν p,q ≤ν on m independent samples S = { x 1 , ..., x m } satisfies the following bounds (2) For 2 < p < ∞, we have The proof of this theorem is presented in Appendix D. Remark 5. While we based our resource measure in this paper on the arithmetic mean, we could have alternatively defined a resource measure of the quantum circuit C l = (Φ l , Φ l−1 , . . . , Φ 1 ) that is based on the geometric mean, viz. which is the geometric mean of the resource over the layers of the quantum circuit. By the arithmetic mean-geometric mean inequality, it is easy to see that Also, we could define the path norm as a resource measure as follows: where The modified version of these resource measures for quantum circuits can also be similarly defined. We present similar results on the Rademacher complexity of quantum circuits based on these resource measures in Appendices C and D. Remark 6. Note that for any given quantum channel Ψ, there could be many different ways to realize it by quantum circuits of the same depth l and width vector n, i.e., there could be multiple circuits Furthermore, note that resource measures such as ν p,q depend on the realization of the channel. Thence, if we would like to define a resource measure for quantum channels Ψ that is independent of their quantum circuit realization, it would be necessary to adopt a definition like the one below: which quantifies the minimum amount of resources necessary to realize the target channel over all quantum circuits with a given depth and width. The quantities µ l, n p,q and γ l, n p,q may also be defined analogously. These resource measures may be of independent interest in resource theory. In this work, we studied the Rademacher complexity of quantum circuits. First, we introduced the (p, q) group norm to define the resource measure of magic for quantum channels and for quantum circuits with a layered structure. Second, we proved that the Rademacher complexity of quantum circuits is bounded by its depth and width as well as its amount of magic, where the dependence on the width is determined by the choice of (p, q). These results reveal the dependence of statistical complexity on the resources and structure parameters (such as depth and width) of the quantum circuit. While our results are stated in terms of the Rademacher complexity, there are other prominent choices of measures of statistical complexity, such as the VC dimension and metric entropy, that could be used. Due to the close relationship between the Rademacher complexity and the VC dimension and the metric entropy [58] [59] [60] , it is straightforward to extend our results to obtain bounds on these complexity measures of quantum circuits. Another measure that has recently gained prominence is the topological entropy, a concept from dynamic systems that has recently been used to measure the complexity of classical neural networks [61] . We leave for future work the problem of generalizing the results about Rademacher complexity to topological entropy. Finally, we note that while our results are based on expressing each quantum channel in the Pauli basis, there are also other choices of bases, or more generally frames, that can be used to express quantum channels, a notable example being the phase space point operator basis [62, 63] . How do our results generalize to the case where the basis is chosen arbitrarily? We leave this question for further work. For any N 1 × N 2 real-valued matrix M, which can be written as a column of N 1 rows, we define the (p, q) group norm, with 0 < p, q ≤ ∞, as follows: where the l p norm of the i-th row vector M i is The (p, q) group norm satisfies the following multiplicative property. Lemma 7. Given two matrices M 1 and M 2 , it holds that Proof. This follows directly from the fact that Let P 0 = I, P 1 = X, P 2 = Y , and P 3 = Z be the single-qubit Pauli matrices. The n-qubit Pauli matrices P z are defined as P z = P z 1 ⊗ P z 2 ⊗ . . . ⊗ P z n for any vector z ∈ { 0, 1, 2, 3 } n . Given a quantum channel Φ : L((C 2 ) ⊗n 1 ) → L((C 2 ) ⊗n 2 ) from n 1 qubits to n 2 qubits, we define the 4 n 1 × 4 n 2 representation matrix M Φ in the Pauli basis by its matrix elements as follows: where x ∈ { 0, 1, 2, 3 } n 1 , z ∈ { 0, 1, 2, 3 } n 2 , and P x and P z are the corresponding Pauli operators. From the definition of M Φ , it is easy to see that the representation matrix of quantum channels in the Pauli basis satisfies the following properties. Lemma 8. Given two quantum channels Φ 1 : L((C 2 ) ⊗n 1 ) → L((C 2 ) ⊗n 2 ) and Φ 2 : Proof. Based on definition of the representation matrix M Φ , we have Therefore, it follows that Hence, The other two identities follow directly from the definition of the representation matrix. Next, we consider the (p, q) group norm of the representation matrix for quantum channels, for the case when the channel is unitary. Lemma 9. For a Clifford unitary U, M U is a permutation matrix, up to a ± sign. That is, M U x y = ±δ x,π( y) , where π is a permutation over the set { 0, 1, 2, 3 } n . Proof. This comes directly from the definition of the Clifford unitaries, which map Pauli operators to Pauli operators. Lemma 10. The (p, q) group norm is invariant under the left or right multiplication of M U , if U is a Clifford unitary. That is, for any matrix A, we have Proof. Based on the Lemma 9, M U is a permutation matrix up to some ± sign. Hence, AM U is just a permutation of the columns of A with some ± sign. Thus, for the i-th row vector, we have (AM U ) i p = A i p . Therefore, AM U p,q = A p,q . Similarly, M U A is just a permutation of the columns of A with some ± sign. Thus, for the i-th row vector Lemma 11. Given a unitary channel U, we have the following result: Proof. First, for any unitary U, it is easy to see that M U is an orthogonal matrix. Therefore, M U x 2 = 1 for any x and M U 0 = (1, 0, . . . , 0) . Therefore, we have the statement in (3). (1) For 0 < p < 2, we have M U x p ≥ M U x 2 = 1 for any x. Therefore, M U p,q ≥ 1. Besides, M U p,q = 1 iff M U x p = 1 for any x iff every row vector M U x has only one nonzero element, which could only be ±1, iff U is a Clifford unitary. (2) For p > 2, 0 < q < ∞, we have M U x p ≤ M U x 2 = 1 for any x. Therefore, M U p,q ≤ 1. Besides, M U p,q = 1 iff M U x p = 1 for any x iff every row vector M U x has only one nonzero element, which could only be ±1, iff U is Clifford. Based on the above facts, it is easy to see that the (p, q) norm of the representation matrix can be regarded as some resource measure of magic of quantum gates. Proposition 12. Given a unitary channel U, the (p, q) norm can be regarded as a resource measure satisfying the following properties (1) (Faithfulness) For 0 < p < 2, we have M U p,q ≥ 1, M U p,q = 1 iff U is Clifford unitary. (1') (Faithfulness) For p > 2, 0 < q < ∞, we have M U p,q ≤ 1, M U p,q = 1 iff U is Cllifford unitary. (2) (Invariance under Clifford unitaries) M U 1 •U•U 2 p,q = M U p,q for any Clifford unitaries U 1 and U 2 . Proof. (1) and (1') come from Lemma 11 directly. (2) where the first equality comes from Lemma 8 and the second equality comes from Lemma 11. (3) where the first equality comes from Lemma 8 and the second equality comes from Lemma 7. (4) comes directly from the convexity of l p and l q norm for p ≥ 1, q ≥ 1. Let p * denote the Hölder conjugate of p, i.e., 1 p + 1 p * = 1. Lemma 13. For any N 1 × N 2 real-valued matrix M, and any vector v ∈ R N 2 , we have Proof. First, let us prove the following inequality This inequality holds because If q > p * , then max { 1 p * , 1 q } = 1 p * and M p,q ≥ M p,p * . Hence the inequality Eq. (A12) reduces to where v i ∈ R N . Proof. The proof is similar to that of Lemma 15 in [25] . If 1 ≤ p ≤ 2 log 2 (N) 2 log 2 (N)−1 , then 2 log 2 (N) ≤ p * . Thence, If 2 log 2 (N) 2 log 2 (N)−1 < p < ∞, then by the Khintchine-Kahane inequality, we have and the first inequality comes from the Minkowski inequality and the second inequality from the fact that (x + y) p * /2 ≤ x p * /2 + y p * /2 , for p * /2 < 1. Therefore (A16) Lemma 15 (Massart lemma [64] ). Given a finite set A ⊂ R m , we have where¯ v = 1 |A| ∑ v∈A v. Theorem 16 (Restatement of Theorem 1). Given a set of quantum circuits Φ from n 0 qubits to n 1 qubits with bounded (p, q) norm · p,q , the Rademacher complexity on m samples S = { x 1 , ..., x m } satisfies the following bounds (1) For 1 ≤ p ≤ 2, we have R S (F • C · p,q ≤µ ) ≤ µ4 Proof. First, we compute where the third inequality follows from Lemma 13. Using Lemma 14, we get the results of this theorem. If a quantum channel Φ is unital, i.e., Φ(I) = I, then the representation matrix M Φ has the following form We callM Φ the modified representation matrix of Φ. (Note that a unitary channel is a special case of a unital channel.) For a unital channel Φ, we define the (p, q) group norm of the modified representation matrixM Φ as follows: where N = 4 n − 1. We now state and prove the following properties of the (p, q) norm of the representation matrixM U , where U is a unitary channel. Proposition 17. For any unitary channel U, we have the following relationships between M U p,q and M U p,q , with equality iff U is a Clifford unitary. for any unitary U. (3) For p > 2, q > 0, we have with equality iff U is a Clifford unitary (4) For p = 2, q > 0 Proof. (4) is obvious, as M U andM U are orthogonal matrices. For 0 < p < 2, q > 0 we have M U x p ≥ M U 0 p = 1 for any x = 0. Therefore, M U p,q ≤ M U p,q for 0 < q < ∞ and M U p,q = M U p,q for q = ∞. Hence, we get (2). Next, for 0 < q < ∞, M U p,q = M U p,q iff M U x p = 1 for any x = 0 iff every row vector M U x has only one nonzero element, which could only be ±1, iff U is a Clifford unitary. Hence, we get (1). for any x iff every row vector M U x has only one nonzero element, which could only be ±1, iff U is a Clifford unitary. Therefore, we get (3). A direct consequence of the above proposition is the following corollary. Proposition 19. Let U 1 and U 2 be unitary channels. (1) For 0 < p < 2, 0 < q < ∞, we have with equality iff U 1 and U 2 are Clifford unitaries. (2) For 0 < p < 2, q = ∞, we have for any unitaries U 1 and U 2 . (3) For p > 2, 0 < q ≤ ∞, we have For p > 2, 0 < q < ∞, "=" holds iff U 1 and U 2 are Clifford unitary. (4) For p = 2, 0 < q ≤ ∞, we have Proof. (3) is obvious as bothM U 1 ⊗M U 2 andM U 1 ⊗U 2 are orthogonal matrices. Using the property where N 1 = 4 n 1 − 1, N 2 = 4 n 2 − 1. Hence to compare M U 1 ⊗M U 2 p,q and M U 1 ⊗U 2 p,q , we need only to compare To this end, let us consider a simple inequality first. It is easy to verify the following two inequalities: (1) For a, b ≥ 1, we have Morevover, equality holds iff a = b = 1. (2) For 0 < a, b ≤ 1, we have Moreover, equality holds iff a = b = 1. Thus, for 0 < p < 2, 0 < q < ∞, we have M U x q p ≥ 1 for any x = 0, and M U x q p = 1 for all x = 0 iff U is Clifford. Let a = 1 Then by the first inequality (1), we have Therefore, for 0 < p < 2, 0 < q < ∞, we have where equality holds iff U 1 and U 2 are Clifford unitary. Similarly, for p > 2, 0 < q < ∞, we have M U Then by the second inequality (2), we have Therefore, for p > 2, 0 < q < ∞, we have M U 1 ⊗M U 2 p,q ≤ M U 1 ⊗U 2 p,q . Moreover, equality holds iff U 1 and U 2 are Clifford unitary. Now, let us consider the case where q = ∞. For q = ∞, we have , max , max Hence, for 0 < p < 2, we have M U x p ≥ 1 for any x = 0; therefore, max , max That is, For p > 2, we have M U x p ≤ 1 for any x = 0; therefore, max , max In this subsection, we will assume for simplicity that the observable H is traceless, which implies that α 0 = 0. Theorem 20 (Restatement of Theorem 2). Given the set of unital quantum circuits Φ from n 0 qubits to n 1 qubits with bounded (p, q) norm of the modified representation matrix, the Rademacher complexity on m samples S = { x 1 , ..., x m } satisfies the following bounds (1) For 1 ≤ p ≤ 2, we have where N 1 = 4 n 1 − 1. (2) For 2 < p < ∞, we have Proof. Since α x = 0, it follows that α = (0,ˆ α), whereˆ α ∈ R N 1 . Hence, Similarly, for unital quantum channels Φ, we havê Therefore, we have where the third inequality come from the Lemma 13. Using Lemma 14, we get the results of the theorem. Next, let us denote the set of depth-l quantum circuits C l with bounded depth-norm µ p,q as C l, n µ p,q ≤µ , that is, Lemma 23. Given the set of depth-l quantum circuits C l, n and the set of depth-l quantum circuits C l, n , where n = ( n , n l ) and n = ( n , n l−1 ), then ∀ ε ∈ { ±1 } m , we have (C6) Thus, Proof. The lemma follows from where the inequality follows from Lemma 13. Theorem 24. Given the set of depth-l quantum circuits with bounded depth-norm µ p,q , the Rademacher complexity on m samples S = { x 1 , . . . , x m } satisfies the following bounds (1) For 1 ≤ p ≤ 2, we have (2) For 2 < p < ∞, we have Proof. These bounds follow from where the the second inequality comes from Lemma 23. Using Lemma 14, we obtain the results of the theorem. To get rid of the exponential dependence on the width of the quantum neural network, we need to take p * ≥ ∑ l−1 i=1 n i , q ≥ ∑ l−1 i=1 n i . For example, we could take p = 1, q = ∞. Proposition 25. Given the set of depth-l quantum circuits with bounded µ 1,∞ norm, the Rademacher complexity on m samples S = { x 1 , . . . , x m } satisfies the following bounds: We denote the set of depth-l variational quantum circuits with parameters θ and fixed structure A by C l,n A, θ . By 'fixed structure', we mean that the position of each parametrized gate is fixed. Then, for the i-th layer of the variational quantum circuit, denoted as Φ i ( θ ), let us define . Therefore, for any such depth-l variational quantum circuit with a fixed structure, we have µ p,q ≤ ∏ i µ i . It follows that the Rademacher complexity of the class of depth-l variational quantum circuits with fixed structure is bounded by In this subsection, let us define the summation (p, q) depth-norm for depth-l quantum circuits C l = (Φ l , ..., Φ 1 ) as follows: for any r > 0. For example, if we take r = 1, then which is the average value of the amount of resources in each layer of the quantum circuit. Proposition 26. The summation (p, q) depth-norm satisfy the following properties: (1) Given a depth-l quantum circuit C l and a depth-m quantum circuit C m , we have (2) Given two depth-l quantum circuits C l and C l , we have where r, s,t > 0 and 1 s + 1 t = 1 r . Proof. (1) follows directly from the definition of ν (r) p,q , and (2) follows directly from Hölder's inequality. Similarly, for the depth-l quantum circuit C l , where each layer only contains unitary gates, i.e., C l = (U l ,U l−1 , · · · ,U 1 ), ν r p,q can be viewed as a resource measure of magic. Lemma 27. Given a depth-l quantum circuits C l = (U l ,U l−1 , · · · ,U 1 ), we have (1) (Faithfulness) For 0 < p < 2, q > 0, r > 0: ν (r) p,q ( C l ) ≥ 1, and ν (r) p,q ( C l ) = 1 iff C l is a Clifford circuit. (1') (Faithfulness) For p > 2, 0 < q < ∞, r > 0: ν (r) p,q ( C l ) ≤ 1, and ν (r) p,q ( C l ) = 1 iff C l is a Clifford circuit. (2) (Nonincreasing under Clifford circuits) For 0 < p < 2, q > 0, r > 0, we have ν (r) p,q ( C 1 • C l • C 2 ) ≤ ν (r) p,q ( C l ) if C 1 , C 2 are Clifford circuit. (2') (Nondecreasing under Clifford circuits) For p > 2, 0 < q < ∞, r > 0, we have ν (r) p,q ( C 1 • C l • C 2 ) ≥ ν (r) p,q ( C l ) if C 1 , C 2 are Clifford circuit. Note that for p > 2, 0 < q < ∞, we can define the resource measure as 1 − ν (r) p,q , in which case the resource measure also satisfies the properties of faithfulness and nonincreasing-ness under Clifford circuits. Let us define the set of depth-l quantum circuits with bounded ν (r) p,q norm by C l, n ν (r) p,q ≤ν . It is easy to see the following relationship ν (r) p,q and µ p,q ν (r) p,q ( C l ) ≥ µ p,q ( C l ) 1/l , which follows directly from the Arithmetic Mean-Geometric Mean inequality. Hence we have C l, n ν (r) p,q ≤ν ⊆ C l, n µ p,q ≤ν l . This allows us to obtain the following result on the Rademacher complexity of quantum circuits with bounded ν r p,q norm directly from Theorem 24. (2) For 2 < p < ∞, we have To get rid of the exponential dependence on the width of quantum neural networks, we need to take p * ≥ ∑ l−1 i=1 n i and q ≥ ∑ l−1 i=1 n i . For example, we could take p = 1, q = ∞. Proposition 29. Given the set of depth-l quantum circuits with bounded ν Let us define the (p, q) path-norm for the depth-l quantum circuits C l = (Φ l , Φ l−1 , ..., Φ 1 ). First, for a fixed output P z , where z ∈ { 0, 1, 2, 3 } n l , let us define Hence, we can define the (p, q) path-norm for the depth-l quantum circuits as follows, Proposition 30. The multiplication (p, q) path-norm satisfies the following properties: (1) Given a depth-l quantum circuit C l and a depth-m quantum circuit C m , we have γ p,q ( C l • C m ) ≤ γ p,q ( C l )γ p,∞ ( C m ). (2) Given two depth-l quantum circuits C l and C l , we have γ p,q ( C l ⊗ C l ) = γ p,q ( C l )γ p,q ( C l ). Deep learning Deep Learning Probabilistic Machine Learning: An introduction Recent trends in deep learning based natural language processing Deep learning in natural language processing Deep learning for natural language processing: advantages and challenges Deep learning for drug design: an artificial intelligence paradigm for drug discovery in the big data era Deep learning in drug discovery Deep learning detecting fraud in credit card transactions Credit card fraud detection using deep learning based on auto-encoder and restricted boltzmann machine Deep Learning in Medical Image Analysis A survey on deep learning in medical image analysis Detecting unexpected obstacles for self-driving cars: Fusing deep learning and geometric modeling Deep learning for self-driving cars: Chances and challenges Dropout improves recurrent neural networks for handwriting recognition A comparative study on handwriting digit recognition using neural networks Anastasios Doulamis, Eftychios Protopapadakis, and Diego Andina Everything you wanted to know about deep learning for computer vision but were afraid to ask Neural networks and deep learning On the uniform convergence of relative frequencies of events to their probabilities Necessary and sufficient conditions for the uniform convergence of means to their expectations ε-entropy and ε-capacity of sets in functional spaces Rademacher and Gaussian complexities: Risk bounds and structural results Benefits of depth in neural networks Norm-based capacity control in neural networks Nearly-tight VC-dimension bounds for piecewise linear neural networks Spectrally-normalized margin bounds for neural networks Exploring generalization in deep learning Sizeindependent sample complexity of neural networks Quantum Computing: Progress and Prospects Quantum Computing in the NISQ era and beyond A variational eigenvalue solver on a photonic quantum processor A quantum approximate optimization algorithm Quantum approximate optimization algorithm: Performance, mechanism, and implementation on near-term devices Variational quantum algorithms Quantum supremacy using a programmable superconducting processor Quantum computational advantage using photons Instead of 'supremacy' use 'quantum advantage Quantum computing and the entanglement frontier Quantum sampling problems, BosonSampling and quantum supremacy Quantum computational supremacy How many qubits are needed for quantum computational supremacy Classification with quantum neural networks on near term processors Training deep quantum neural networks Trainability of dissipative perceptron-based quantum neural networks The quest for a Quantum Neural Network Continuousvariable quantum neural networks Quantum convolutional neural networks Quantum algorithms for supervised and unsupervised machine learning Quantum machine learning: what quantum computing means to data mining Quantum machine learning Quantum machine learning: a classical perspective Machine learning & artificial intelligence in the quantum domain: a review of recent progress A mathematical theory of resources Quantum resource theories Application of a Resource Theory for Magic States to Fault-Tolerant Quantum Computing Local Rademacher complexities and oracle inequalities in risk minimization The sizes of compact subsets of Hilbert space and continuity of Gaussian processes Gaussian random processes and measures of solid angles in Hilbert space Entropy and the combinatorial dimension Depth-width trade-offs for neural networks via topological entropy The resource theory of stabilizer quantum computation Framed Hilbert space: hanging the quasi-probability pictures of quantum theory Understanding machine learning: From theory to algorithms ACKNOWLEDGMENTS K. B. thanks Arthur Jaffe and Zhengwei Liu for the help and support during the breakout of the COVID -19 pandemic.K. B. acknowledges the support of ARO Grants W911NF-19-1-0302 and W911NF-20-1-0082, and the support from Yau Mathematical Science Center at Tsinghua University during the visit. Consider a depth-l quantum circuit, where each layer of the quantum circuit is a treated as a quantum channel. We denote the depth-l quantum circuit as C l as followswhere the i-th layer Φ i : L((C 2 ) ⊗n i−1 ) → L((C 2 ) ⊗n i ) (See Figure 4 ). Let us define C l, n with the width vector n = (n l , ..., n 0 ) to be the set of all depth-l quantum circuits C l = (Φ l , Φ l−1 , · · · , Φ 1 ), where the i-th layer Φ i : L((C 2 ) ⊗n i−1 ) → L((C 2 ) ⊗n i ). In this section, we introduce three resource measures to quantify the amount of magic in quantum circuits by making use of the (p, q) group norm. In this subsection, let us define the multiplication (p, q) depth-norm for depth-l quantum circuits C l = (Φ l , Φ l−1 , · · · , Φ 1 ) as followsProposition 21. The multiplication (p, q) depth-norm satisfies the following properties:(1) Given a depth-l quantum circuit C l and a depth-m quantum circuit C m , we have µ p,q ( C l • C m ) = µ p,q ( C l )µ p,q ( C m ),where C l • C m := ( C l , C m ).(2) Given two depth-l quantum circuits C l and C l , we have µ p,q ( C l ⊗ C l ) = µ p,q ( C l )µ p,q ( C l ).where C l ⊗ C l := (Φ l ⊗ Φ l , . . . ,Proof. These two properties follow directly from the definition of µ p,q .Note that for the depth-l quantum circuit C l , where each layer contains only unitary gates, i.e., C l = (U l ,U l−1 , · · · ,U 1 ), µ p,q can be viewed as a resource measure of magic.Lemma 22. Given a depth-l quantum circuit C l = (U l ,U l−1 , · · · ,U 1 ), we have (1) (Faithfulness) For 0 < p < 2, q > 0, it holds that µ p,q ( C l ) ≥ 1, and µ p,q ( C l ) = 1 iff C l is a Clifford circuit, i.e., each U i is a Clifford unitary.(1') (Faithfulness) For p > 2, 0 < q < ∞, it holds that µ p,q ( C l ) ≤ 1, and µ p,q ( C l ) = 1 iff C l is a Clifford circuit.(2) (Invariance under Clifford circuit) For p > 0, q > 0, we have µ p,q ( C 1 • C l • C 2 ) = µ p,q ( C l ) if C 1 , C 2 are Clifford circuits.Proof.(1) holds becausewhere the second equality comes from the fact thatProposition 31. For any depth-l quantum circuit C l , we have the following relationship: For any 0 < p ≤ 1, q > 0, we haveProof. To prove this result, we only need to prove that for any z, we have For a depth-l quantum circuit C l , where each layer contains only unitary gates, i.e., C l = (U l ,U l−1 , · · · ,U 1 ), the path norm γ p,q can be viewed as a resource measure of magic.Lemma 32. Given a depth-l quantum circuit C l = (U l ,U l−1 , · · · ,U 1 ), we have (1) (Faithfulness) For 0 < p ≤ 1, q > 0: γ p,q ( C l ) ≥ 1, γ p,q ( C l ) = 1 iff C l is a Clifford circuit.(2) (Invariance under Clifford circuit) γ p,q ( C 1 • C l • C 2 ) = γ p,q ( C l ) if C 1 and C 2 are Clifford circuits.Proof. γ p,q ( C l ) ≥ 1 comes from the facts that γ p,q ( C l ) ≥ M C l p,q and M C l p,q ≥ 1 (by Lemma 11) . Finally, the invariance under Clifford circuits has been proved in Proposition 30.Let us define the normalized representation matrix of the quantum channel in the depth-l quantum circuit C l as follows.(C26)It is easy to see that for any row vector mBesides, it is easy to verify thatIt is easy to see thatLemma 33. For any p * > 0, the following statement holds for any k-depth quantum circuitsProof. This lemma comes from the fact thatTheorem 34. Given the set of depth-l quantum circuits with bounded path norm γ p,q , the Rademacher complexity on m independent samples S = { x 1 , . . . , x m } satisfies the following boundsProof. First we compute the followingwhere the last inequality comes from Lemma 33. The theorem follows from this and Lemma 14. In this section, let us consider depth-l unital quantum circuits, where each layer of the quantum circuit is a quantum channel. Furthermore, we shall assume that the observable H is traceless. Unlike the previous section, we consider the (p, q) norm of the modified representation matrix, where I, and only I, is mapped to I. Let us define the modified multiplication (p, q) depth-norm for depth-l unital quantum circuits C l = (Φ l , Φ l−1 , · · · , Φ 1 ) as followsμLemma 35. Given a depth-l quantum circuit C l = (U l ,U l−1 , · · · ,U 1 ), we have (1) (Faithfulness) For 0 < p < 2, q > 0:μ p,q ( C l ) ≥ 1, andμ p,q ( C l ) = 1 iff C l is a Clifford circuit.(1') (Faithfulness) For p > 2, q > 0:μ p,q ( C l ) ≤ 1, andμ p,q ( C l ) = 1 iff C l is a Clifford circuit.(2) (Invariance under Clifford circuit) For p > 0, q > 0, we haveμ p,q (Proof. This lemma follows directly from Proposition 18 and 21.Lemma 36. Given the set of depth-l unital quantum circuit C l, n and the set of depth-l unital quantum circuit C l, n , where n = ( n , n l ) and n = ( n , n l−1 ), then ∀ ε ∈ { ±1 } m , we havewhere N l = 4 n l − 1. Thus,Proof. This is becausewhere the inequality follows from Lemma 13.Theorem 37. Given the set of depth-l unital quantum circuits with bounded depth-normμ p,q , the Rademacher complexity on m samples S = { x 1 , . . . , x m } satisfies the following bounds (1) For 1 ≤ p ≤ 2, we haveProof. The theorem follows fromwhere the second inequality follows from Lemma 36. Using Lemma 14, we get the results of theorem theorem.If we take p = 1, q = ∞, then we have the following results directly from the previous results. Let us define the modified summation (p, q) depth-norm for depth-l quantum circuits as followŝfor any r > 0.Lemma 40. Given a depth-l quantum circuit C l = (U l ,U l−1 , · · · ,U 1 ), we have(2') (Nondecreasing under Clifford circuit) For p > 2, q > 0, r > 0, we haveνProof. This lemma comes directly from Lemma 18 and Proposition 26.Based on the following relationship betweenνwe have C l, n ν p,q ≤ν ⊆ C l, n µ p,q ≤ν l .We obtain the following results directly from Theorem 37.Theorem 41 (Restatement of Theorem 4). Given the set of depth-l unital quantum circuits with boundedν (r) p,q norm, the Rademacher complexity on m samples S = { x 1 , . . . , x m } satisfies the following bounds (1) For 1 ≤ p ≤ 2, we haveIf we take p = 1, q = ∞, then we get the following results directly from the previous results.Proposition 42 . Given the set of depth-l unital quantum circuits with boundedν r 1,∞ norm, the Rademacher complexity on m samples S = { x 1 , . . . , x m } satisfies the following bounds3. Modified (p, q) path-norm Let us define the modified (p, q) path-norm for depth-l circuits bŷwhere N l = 4 n l − 1.Let us define the normalized representation matrix of quantum channels in depth-l quantum circuits C l as followŝIt is easy to see that for any row vectormBesides, it is easy to verify thatTherefore,It is easy to see thatSimilarly to γ p,q ,γ p,q satisfies the following property.Proposition 43. For any depth-l unital quantum circuit C l , we have the following relationship: For any 0 < p ≤ 1, q > 0, we haveγProof. The proof is similar to that of Proposition 31.Lemma 44. Given a depth-l quantum circuit C l = (U l ,U l−1 , · · · ,U 1 ), we have (1) (Faithfulness) For 0 < p ≤ 1, q > 0,γ p,q ( C l ) ≥ 1, γ p,q ( C l ) = 1 iff C l is Clifford.(2) (Invariance under Clifford circuit)γ p,q ( C 1 • C l • C 2 ) =γ p,q ( C l ) if C 1 and C 2 are Clifford circuits.Proof. The proof is similar to that of Lemma 32.Theorem 45. Given the set of depth-l unital quantum circuits with bounded path norm γ p,q , the Rademacher complexity on m samples S = { x 1 , . . . , x m } satisfies the following bounds (1) For 1 ≤ p ≤ 2, we have R S (F • C l, n γ p,q (C l )≤γ ) ≤ γN(2) For 2 < p < ∞, we have R S (F • C l, n γ p,q (C l )≤γ ) ≤ γNwhere N i = 4 n i − 1.Proof. The statement in the theorem holds because R S (F • C l, n γ p,q (C l )≤γ ) = E ε 1 m supwhere the last inequality follows from Lemma 33. Using Lemma 14 completes the proof of the theorem.