key: cord-0984211-y7sapceh authors: Kim, Yosep; Teo, Yong Siah; Ahn, Daekun; Im, Dong-Gil; Cho, Young-Wook; Leuchs, Gerd; Sanchez-Soto, Luis L.; Jeong, Hyunseok; Kim, Yoon-Ho title: Universal compressive characterization of quantum dynamics date: 2020-05-28 journal: Physical review letters DOI: 10.1103/physrevlett.124.210401 sha: 55d2a544327cf44490345de02c3a9a815efe9dd1 doc_id: 984211 cord_uid: y7sapceh Recent quantum technologies utilize complex multidimensional processes that govern the dynamics of quantum systems. We develop an adaptive diagonal-element-probing compression technique that feasibly characterizes any unknown quantum processes using much fewer measurements compared to conventional methods. This technique utilizes compressive projective measurements that are generalizable to arbitrary number of subsystems. Both numerical analysis and experimental results with unitary gates demonstrate low measurement costs, of order $O(d^2)$ for $d$-dimensional systems, and robustness against statistical noise. Our work potentially paves the way for a reliable and highly compressive characterization of general quantum devices. Introduction.-Quantum processes are nature's directives that guide the evolution of all physical systems in the quantum realm. Such processes ubiquitously occur in untamed open-system dynamics under interactions with the environment as, for example, depolarizing [1] , dephasing [2] and photon-loss [3] channels. They also exist as universal gates to carry out quantum computation. Notably, quantum processors [4] [5] [6] employ a series of such unitary processes [7] [8] [9] [10] [11] to carry out computations using d-dimensional systems as resources. Thus, reliable characterizations of quantum processes are crucial prerequisites for enhancing the quality of quantum technologies. Such a characterization conventionally require O(d 4 ) measurements [12] [13] [14] [15] [16] that are too resource-intensive to perform for large d. Ancilla- [17] [18] [19] [20] and error-correction-based [21] [22] [23] [24] quantum process tomography (QPT) were introduced to circumvent this problem. These demand sophisticated state and measurement preparations. For specific property prediction tasks, direct schemes may be sufficient [25] [26] [27] [28] [29] [30] . When the unknown process has a certain maximum possible rank, the concept of compressed sensing [31] [32] [33] [34] [35] [36] [37] has so far been the status quo for reconstructing the unknown process with a small set of specialized measurements [38] [39] [40] . In practice however, this concept is only as reliable as the accuracy of the rank knowledge, and lacks an independent verification method to check the reconstruction results without fidelity comparison with target processes [39, 40] . Existing remedies for tackling these issues in compressed sensing are generally ad hoc and incomplete [41] . In what follows, we shall present and experimentally demonstrate an adaptive compressive quantum process tomography scheme (ACQPT) that uniquely characterizes any process through direct diagonal-element-probing measurements in optimally-chosen bases that are much fewer than O(d 4 ), ergo highly compressive. Our scheme does not rely on any sort of prior assumption about the process, with the exception that one knows the dimension d of the underlying quantum system. Instead, it is designed to extract information that is already inherently encoded in the measured data to reveal all process-matrix elements and check if they are uniquely consistent with the data using an efficient semidefinite program [42, 43] . If not, the scheme adaptively chooses the next optimal measurement to perform, and repeats itself until the process is uniquely characterized. We shall elaborate the theoretical formalism of ACQPT, and demonstrate that it is both highly compressive and achievable in practice using a proof-of-principle quantum optics experiment for two-qubit processes. Numerical simulations of a range of dimensions supply compelling evidence of an O(d 2 ) O(d 4 ) measurement cost for characterizing qudit unitary processes, while experimental data confirms the robustness of ACQPT in the presence of statistical noise. Compressive characterization of physical processes.-Every quantum process that governs natural phenomena can be completely described by a positive semidefinite matrix χ, which is defined by O(d 4 ) parameters [12] . This matrix represents a state-to-state transformation rule for the process that accounts for all physical characteristics of the quantum system. We shall unambiguously determine χ using minimal number of measurements necessary with no other presumed information apart from knowing the dimension d of the system (dim{χ} = d 2 ). Without loss of generality, we shall investigate trace-preserving processes, examples of which include all unitary processes used in quantum computation. All subsequent discussions directly apply also to non-trace-preserving processes if so desired. Characterizing a physical process is equivalent to unambiguously finding out the elements of χ. The aim is to do this with as little measurement resources as possible without the need for any other a priori information about χ. We first emphasize a pivotal observation about rank-deficient χ ma- 1. An iterative schematic of adaptive compressive quantum process tomography (ACQPT) to find out the χ matrix that represents a quantum process M. This schematic is feasible in practice and generalizes to systems of any dimension, making ACQPT universal. At a particular step k, the input state and projection basis are set by the control unitaries V o . The detection probability that corresponds to the first diagonal element is accumulated in the dataset D. Because the diagonal element probing of U † k χU k in general demands resource-intensive measurements, U † k χU k is replaced by the closely approximated V † k χV k whose diagonal element can be measured with the simple experimental setup. Completely positive and trace-preserving process (CPTP) matrices χ that satisfy the dataset D form a convex set C k , which shrinks as data accumulate. In informational completeness certification (ICC), we track this shrinkage with a linear function f (χ) = Tr[χZ]/ Tr[Z 2 ], where Z is a full-rank positive operator. Accordingly, the size monotone sCVX is defined as the difference between its unique maximum fmax and minimum fmin over C k . If C k contains only a single estimator (sCVX = 0), then this unique estimator χ (k) est is used to represent the quantum process and terminate ACQPT. If not, ACQPT picks an optimal estimator χ (k) est having the minimum entropy from the convex set C k to pick the next V trices: When the unknown process is unitary, its χ matrix is rank-1 and possesses only one positive eigenvalue. Then, we just need to diagonalize χ via its diagonalizing unitary matrix U diag and measure that single eigenvalue to fully characterize χ (The trace-preserving condition would have fixed this eigenvalue anyway so no measurement is even needed in this case). This straightforward argument can be extended to a rank-r process. In this case, we simply measure all r − 1 positive diagonal values of U † diag χU diag . In other words, diagonalelement measurements, in view of the prior knowledge about U diag , supply the most information compared to other kinds of measurements. Evidently, as one has no knowledge about χ, U diag is also unknown. Nonetheless, we can design ACQPT to adaptively choose a sequence of unitary rotations U = {U 1 , U 2 , . . . , U k , . . .} on χ that converges to U diag . Put simply, the iterative scheme executes two basic stages per step. In stage (i), the scheme deterministically certifies if the accumulated dataset D from the experiment correspond to a unique estimator χ est for the unknown χ matrix-the informationally complete (IC) situation. The contrary would imply that there is a convex set of processes C consistent with the non-IC D [44, 45] . We call this the informational completeness certification (ICC) stage, and its successful implementation stems from the convexity property of C that allows us to assign a mathematically justified number s CVX (size monotone) to indicate whether D is IC (s CVX = 0) or not. For a realistic numerical approach, we preset a certain threshold ε of s CVX . If s CVX > ε, ACQPT finds an optimal measurement setting to collect more data in stage (ii) based on D alone. This is the adaptive measurement stage responsi-ble for generating U. Since our χ of interest is rank-deficient, we can turn ACQPT into a compressive scheme by employing an effective low-rank guiding prescription: After ICC, an estimator with the minimum von Neumann entropy (minENT) is chosen from C [46] [47] [48] [49] . The next optimal U for the χ-rotation would be the one that diagonalizes this estimator to be the eigenvalues in descending order. At the kth iterative step, the κ k th diagonal element of U † k χU k is measured following the rule κ k = mod(k, r k−1 ) + 1, where r k−1 = rank{χ (k−1) est }. The logic of this "modulo rule" is to measure the diagonal element of cyclically-shifted index within the positive-eigenvalue sector of the previously estimated χ est , such that eventually all positive eigenvalues of χ are measured (up to statistical noise) at step k = k IC , that is the final step at which IC measurement data are obtained. As an example, ACQPT measures the first diagonal element of χ at k = 1, then the second diagonal element of U † 2 χU 2 at k = 2, and back to the first diagonal element of U † 3 χU 3 if r 2 = rank{χ (2) est } < 3, and so forth. Figure 1 shows an iterative schematic of ACQPT. The diagonal element probing of U † k χU k in general demands resource-intensive measurements, thus we approximate the probing in an experimental feasible way using a variable input V o , can be obtained from the detection probability, and V are chosen to closely approximate the κ k th diagonal of U † k χU k . We invite the reader to visit Appendix A for further elaboration. As ACQPT proceeds, more independent data are collected Here, the kIC values are recorded at the instant sCVX < 5 × 10 −5 , which are respectively 34.7 ± 3.6 and 47.0 ± 5.9 for the adaptive and random measurements. All simulations are statistically noiseless, so that F at kIC for every process is unity to numerical precision. such that U k → U diag quickly. In this way, our scheme can efficiently acquire optimal data and determine whether they are sufficient to uniquely recover χ without ever requiring spurious pre-experimental assumptions about χ. Notice that the rank-deficiency of χ is not assumed here. A (nearly) full-rank χ unbeknownst to us would automatically result in a much slower convergence of ACQPT. Numerical results.-Our first numerical showcase of AC-QPT demonstrates the superiority of adaptive χ-rotation sequences over random ones. For this, both the size monotone s CVX and fidelity F with the true process are numerically simulated with multiple randomly chosen ququart (d = 4) unitary processes (see Fig. 2 ). The results clearly show that the adaptive strategy gives a significantly smaller k IC than the random strategy. We further give numerical estimates on the scaling behaviors of k IC for both the adaptive and random strategies on qudit unitary processes in Appendix C. We show, for a reasonably large range of d and multiple simulations with random processes, that these strategies only need measurement resources of O(d 2 ) in contrast with the standard O(d 4 ). For a more complete analysis, we also compared the minENT strategy in ACQPT with another available adaptive strategy, the minimum-L1 norm strategy. Experimental results.-The experimental platform we use for demonstrating ACQPT utilizes a source of two-photons. The quantum processes of interest are implemented for the polarization modes. A 140-fs ultrafast laser is impinged on a 1 mm-thick type-II Barium Borate (BBO) crystal to emit imprints a phase shift of π onto two-qubit state, that is, controlledphase gate. To make a CNOT gate, HWPs at 22.5 • are applied on the second (target) qubit before and after the controlled-phase gate [50, 51] . two single-photons in the beamlike configuration using spontaneous parametric down conversion (SPDC). The individual photons each possesses the central wavelength of 780 nm and delivered to the experimental setup shown in Fig. 3 (a) with single-mode fibers. To maximize the indistinguishability between two photons, their spectra are truncated by making use of interference filters having 2 nm full-width at half-maximum bandwidth. The two-photon state is first initialized to the doubly horizontally-polarized state |00 00|. At the kth iterative step of ACQPT, the initial state and projection basis are manipulated using half-(HWP) and quarter-wave (QWP) plates according to the previously chosen two-qubit operations V respectively. For simple implementation, the closest separable initial state and projection basis are utilized. We anticipate an even better performance from ACQPT with sophisticated entangling operations. The ACQPT proceeds as follows: Starting with k = 1, V to measure in the next iteration, and the whole procedure repeats until a unique process matrix is completely characterized at k = k IC . We investigate both the I ⊗ H (Hadamard transform only on the second qubit) and controlled-NOT (CNOT) gates constructed as in Figs. 3(b) and (c). The Hadamard gate is simply realized using a HWP at 22.5 • , whereas the CNOT gate is implemented by exploiting Hong-Ou-Mandel interference effects on a partially polarizing beam splitter (PPBS) that partially reflects vertical polarization with a transmittance of 1/3 and perfectly transmits horizontal polarization [50] [51] [52] . Figure 4 shows plots of s CVX and F at each step k for the twoqubit processes. ACQPT essentially gives almost the same results as standard QPT with much fewer measurement outcomes (38.1±6.2 and 51.9±6.7 for I ⊗ H and CNOT gates) than 256 and no prior information. The disparity in the required number of measurements for the two gates stems from their different degrees of implementation imperfections that result in non-unitary processes. This clearly shows that AC-QPT works adaptively. Discussion.-All presented simulation and experimental results have confirmed, indeed, that our adaptive elementprobing compressive scheme can characterize any quantum process using drastically less measurement resources than the standard O(d 4 ) without imposing ad hoc assumptions. Additional simulation graphs and procedures illustrated in Appendix C provide evidence that for general qudit unitary processes, there exists a quadratic enhancement to O(d 2 ) in terms of measurement resource costs needed to unambiguously characterize any qudit unitary process in contrast with O(d 4 ). One may additionally incorporate trusted prior information into ACQPT. Most straightforwardly, if one insists in knowing the rank r of the actual unknown process, then one may simply replace r k with r in the "modulo rule" when measuring diagonal elements. Numerically for example, if we enforce r = 1, ACQPT becomes comparable to the most efficient Baldwin-Kalev-Deutsch (BKD) scheme [38, 53] for unitary channel known to date, that requires projective measurements of 2d 2 − d. See Appendix C for details. The advantage of incorporating the prior information this way into ACQPT is that even if the prior information turns out to be inaccurate, the effect is not detrimental since an additional layer of certification is carried out to verify if the process estimator is truly unique. This failsafe is what distinguishes ACQPT in merit from all other reported undersampled characterization schemes to the authors' knowledge. Furthermore, as shown in Fig. 2 , the size monotone and fidelity progress very similarly for both the adaptive and random strategies during the initial measurement phase. We may then use this observation to arrive at a hybrid compressive scheme where random measurements are first used at the initial phase before they are switched to adaptive ones. This would also further reduce the overall computational load in trying to execute the adaptive stage repeatedly. where {B m } is a set of Hermitian basis operators that are mutually trace-orthonormal (Tr B † m B n = δ mn ) [12] . In this operator basis, M has a matrix representation χ defined by the elements χ mn in some computational basis. We hereby consider (with no loss of generality) completely positive and trace-preserving (CPTP) processes, where the positive process matrix χ ≥ 0 satisfies an additional operator constraint To characterize any d 2 -dimensional quantum physical process, one needs to uniquely determine all its d 4 − d 2 independent parameters. The relevant dataset D collected in the experiment for this purpose approximately estimates the actual detection probabilities p k = Tr[O k M[ρ k ]] that are accumulated from studying M with various pairs (ρ k , O k ) of input states ρ k and output observables O k for the kth measurement chosen according to some prescription that shall be discussed shortly. This dataset, in the absence of statistical noise, thus has a linear relationship with χ, a column of χ mn elements, inasmuch as where the K × d 4 transformation matrix possesses elements Φ k,mn = Tr O k B m ρ k B † n . An informationally complete (IC) and noiseless D means that the linear data constraint in Eq. (A3) and CPTP operator constraints permit a unique characterization of the χ matrix. Such an IC situation can occur even when K ≡ k IC d 4 . The adaptive compressive quantum process tomography (ACQPT) scheme is designed to characterize χ (or χ) with much fewer than O(d 4 ) of p k s and requires no additional assumptions about the process. It progresses as an iterative procedure, where in each step it carries out informational completeness certification (ICC) to check if D is IC or not. If not, the scheme adaptively chooses the next optimal measurement to perform. As more data accumulate, the convex set C of all CPTP operators that obey Eq. (A3) eventually shrinks to a singleton that contains χ in the absence of statistical noise. To indicate if C is a singleton or not, we can define an indicator s CVX over C by first defining a linear function f (χ) = Tr[χZ] / Tr[Z 2 ], which is a distance from χ to a hyperplane Tr[χZ] for any positive operator Z. Upon denoting the minimum and maximum of f over C by f min and f max respectively, we may define the indicator s CVX = f max − f min . It is well-known in the study of convex optimization that minimizing or maximizing such a linear f over any convex region gives a unique optimum, so that the convexity of C immediately implies that s CVX = 0 ↔ C = {χ} (noiseless case). For a more realistic numerical approach, the instant s CVX reaches below a certain preset threshold, the present process matrix χ (k) est is considered as the desired unknown target matrix and ACQPT is terminated. When the dataset D becomes IC, s CVX is abruptly reduced by a few orders. Thus, the threshold can be distinctly decided this way in practice. The indicator s CVX,k ≥ 0 is in fact a size monotone for the data convex set C k , which is a (non-strict) monotonically increasing function with the size s k of C k . In principle, this special property holds for any convex/concave function f that defines this indicator. We can instructively prove this for a concave function f (χ). In this case, we have f max,1 ≥ f max,2 ≥ . . . and f min,1 ≤ f min,2 ≤ . . . It is clear that if f max,k+1 − f min,k+1 < f max,k − f min,k , then C k+1 ⊂ C k . It follows immediately that if s CVX,k ≡ (f max,k − f min,k )/(f max,1 − f min,1 ), then s CVX,k is a size monotone that decreases with increasing k. When s CVX,kIC = 0, the convexity of C kIC implies that C kIC must contain only χ due to the unique maximum possessed by f . Similar arguments hold for a convex f . Since f (χ) is a linear function, it can also be used to formulate the size monotone as it facilitates the class of semidefinite programs known to give unique stationary points in the CPTP space. We emphasize that the above arguments hold provided that Z is non-pathological. A trivial pathological instance would be when Z = I/d, such that f = 1 over C. Another example is when Z is rank-deficient, in which case any χ ∈ C (if exists) that lies in the kernel of Z results in f = 0 and s CVX = 0 C = {χ} in general. A randomly-chosen full-rank Z would therefore avoid such pathological situations. After ICC, the adaptive measurement stage ensues if s CVX is not sufficiently small. We reiterate the basic observation that if an observer knows U diag that diagonalizes a rank-r χ, then the rank-r diagonalized D = U † diag χU diag has r nonzero parameters so that measuring all the r − 1 independent diagonal terms can completely characterize the quantum process. In other words, the measurements of the diagonal terms of D provide more information about the quantum channel than any other kind of measurements. Based on the above observation, we design the ACQPT scheme to make an informed guess about the unknown diagonalizing U diag from the available dataset D at hand. Suppose that ACQPT now operates at the kth iterative step. Then since the unknown χ matrix is rank-deficient, the next optimal rotation U k+1 that approximates U is taken to be the one that diagonalizes a low-rank estimator χ (k) est from C k , where the unknown rank r is also estimated to be r k , the rank of χ (k) est . The estimator χ (k) est is defined to be the one that minimizes the entropy (minENT) over C k which has been found to perform very well in compressive tomography. After this optimization, we sort the r k estimated diagonal elements of the diagonal matrix D est U k+1 in descending order and ensure that U k+1 precisely gives the same sorted order of eigenvalues, before rotating χ with it in the (k + 1)th step. After which, we measure the actual diagonal elements of D (k+1) = U † k+1 χU k+1 using this sorted list as a guide. The straightforward logical action now is to measure the [κ k+1 = mod(k, r k ) + 1]-th largest diagonal term of D (k+1) , which spreads the measurements over the predicted support of χ, so that its entire actual support is covered with larger probability. As more linearly independent data are collected, the principle of tomography dictates that C k → {χ}, U k+1 → U diag and r k → r as k → k IC , and the minENT adaptation compressively reduces the value of k IC . There are only a few easy adjustments to accommodate statistical noise in actual experiments. We first understand that this time, D is now a set of normalized detection counts that are no longer the actual probabilities. Hence, the column p on the left-hand side of Eq. (A3) must be replaced by another column of physical probabilities p est such that there shall still exist a CPTP solution χ est for Now, for our experiment, the photon source generates photon counts that well follow a Poisson distribution every time we collect a particular datum D k . For large number of sampling copies, the distribution at each k can then be further approximated to a Gaussian distribution with mean and variance both proportional to p k . In statistics, there exists a log-likelihood function, which is the logarithmic conditional probability of obtaining the observed normalized photon counts ν k given the true probabilities p k . For every ν up to the (k = K)th measurement, we may obtain the physical probability column p est = p ML that approximates the unknown true detection probability column p by maximizing log L(ν|p ) over p subject to the CPTP constraints-the maximum-likelihood (ML) method. In effect, we have searched for the most likely physical probability column p ML that can give rise to the observed normalized photon-count column ν. The second easy adjustment is to now regard C k as the convex set of process matrices that give the same maximum value of log L for the accumulated dataset up to the kth step. These are the process matrices that lie in the domain of the plateau of L. When k = k IC , the final unique ML estimator χ ML shall also clearly be different from χ. The distance between the two matrices can be further reduced either with more sampling copies or additional new measurements. Apart from these, we stress that the double implication s CVX,kIC = 0 ↔ C kIC = {χ ML } is perfectly robust against noise in the sense that even if D is statistically noisy, C is always convex and all arguments leading to the above double implication relies solely on this convexity property. When χ is expressible in the operator basis whose elements are of the form where Tr B † m B n = δ mn , the diagonal element χ mm can be directly estimated from the detection probability p m = Tr[O i M[ρ j ]] of the input state ρ j = |j j| and the output projection observable O i = |i i|. However, if χ cannot be expressed with operator basis elements of such a form, then measurements of diagonal elements become complicated. For simplicity, we first assume to know the identity of the diagonalizing unitary U diag of χ. To reveal the operator bases of D = U † diag χU diag , we may rewrite Eq. (A1) in terms of a new operator basis, It is evident that if the reference basis elements B n takes the form in (A6), then the diagonalizing operator basis ele- ≥ . . ., typically possess multiple components in its singular-value decomposition. Since these components are mutually noncommuting, a simultaneous measurement of all of them in order to determine D nn in one experiment incurs intrinsic quantum uncertainties, which turns out to be a physically impossible task. As a physically realistic alternative to B n , the closest operator in the form of Eq. (A6) is defined as S n = |b n a n | ≡ |φ 1 . Measuring such a rank-1 component corresponds to a measurement values that approximate D nn . These largest-singular-value principle is applied to approximately measure diagonal elements of any rotated χ in the course of an ACQPT run. This is the first necessary augmentation to the idealized ACQPT procedure. The χ element for S n = |b n a n | can be measured with the input state ρ = |a n a n | and output observable O = |b n b n |. The second augmentation finds the nearest product observables for both ρ and O in every iterative step when one is dealing with many-body systems. There are many ways to do this, one of which is to express o in terms of a reference state |0 0|, and next respectively look for product unitary operators V o . In this way, the closest approximation for the κ 1 th diagonal element of U † 1 χU 1 is accomplished. If the quantum system is many-body, measure instead the normalized counts of the nearest separable counterparts. Fix a randomly-generated fullrank d 2 × d 2 positive square matrix Z that is of unit trace and define ν = ν 1 . 1. ML: Find the physical ML probabilities p ML given the data ν. 2. ICC stage: Compute the unique minimum and maximum values of f (χ) over χ ∈ C k , that is, subject to the CPTP constraints of χ and p ML = Φχ, where Φ k,mn = Tr O k B m ρ k B † n . Obtain s CVX . If s CVX < ε, terminate ACQPT, otherwise proceed to the next step. 3. Adaptive stage: Find the minENT estimator χ (k) est over χ ∈ C k that minimizes the process entropy −Tr[χ log χ]. (k) est , sort its eigenvalues in descending order and find its correct diagonalizing unitary U k+1 such that U † k+1 χ (k) est U k+1 gives the correct ordered eigen-value matrix. Find out the rank r k of χ (k) est and calculate κ k+1 = mod (k, r k ) + 1. Increase k by one. We first provide a short numerical procedure that generates the random unitary matrices needed for the study of random compressive strategies. These matrices are distributed uniformly according to the Haar measure of the unitary group. Constructing a random d 2 × d 2 Haar unitary matrix 1. Generate a random d × d matrix A with entries independently and identically distributed according to the standard Gaussian distribution. 2. Compute Q and R from the QR decomposition A = QR. 3. Define R diag = diag{R}. 4. Define L = R diag |R diag | ( refers to the Hadamard division). We next supply another general recipe that generates a distribution of random rank-r χ matrices that are used to investigate the behavior of k IC against r. To do this, we state that the "rank" of a process is equivalent to the number of linearly independent Kraus operators K l used to represent this process. To see this, recall the process evolution relation To proceed with a more convenient notation, we define the superket of an operator A, |A , as a basis-dependent transposition map that transforms A into an element in the d 2dimensional complex vector space through matrix-column stacking. The superbra is then the adjoint of the superket, Using the formula |AXB = B T ⊗ A|X , and the fact that either Eq. (B1) or (B2) must hold for any ρ, we arrive at which leads to So, χ mn are the (transposed) matrix elements of l |K l K † l | in the basis {|B n }, whose rank r equals the degree of linear independence of the set of rank-1 superoperators {|K l K † l |} and is independent of the basis choice. We therefore essentially require a simple procedure to generate a random set of r linearly-independent Kraus operators K l . Furthermore, the property r l=1 K † l K l = I is to be preserved: Generate r random d × d Kraus matrices 1. Generate r random d × d matrices A l with entries independently and identically distributed according to the standard Gaussian distribution. 3. Define the r Kraus matrices as K l = A l S −1/2 . In this section, we discuss the scaling behaviors of k IC for both ACQPT and the random strategy. For d-dimensional unitary processes, based on simulation results in the interval 2 ≤ d ≤ 7, we find that ACQPT can uniquely determine the process with an average of only about 3.5d 2 measurements compared to the Haar-random strategy that requires 4.8d 2 for large d, see Fig. 5 . These results imply that both adaptive and random strategies that employ ICC for uniqueness certification use only O(d 2 ) measurements, exponentially fewer than O(d 4 )], to characterize qudit gates. For two-qubit processes, tensor-product local unitary rotation sequences are used on χ in ACQPT. Interestingly, ququart and two-qubit processes have almost identical k IC behaviors for high ranks. We benchmark these two strategies with the known Baldwin-Kalev-Deutsch (BKD) scheme for unitary channels that requires less projective measurements [38] . To derive the scaling behavior of the BKD scheme, Baldwin, Kalev and Deutsch argued that in order to characterize a process that is presumably unitary, one may choose to feed the process with a specific set of d input pure states and characterize the corresponding output pure states. To reiterate their arguments, we parametrize the unknown unitary operator as U = d−1 j=0 |u j j|, where u j |u k = δ j,k are kets we want to characterize and j|k = δ j,k are computational kets, and consider the set of d input kets |ψ 0 = |0 , Then feeding |ψ 0 to U yields |u 0 , and |ψ n gives (|u 0 + |u n )/ √ 2. After |u 0 is fully determined, subsequent output states require the determination of fewer than d amplitudes in the computational basis {|j }, the number of which decreases as n increases. For instance, the output state corresponding to ψ k for some n = k > 0 can be determined by characterizing the amplitudes of |u k . Since all previous k − 1 states are determined, we only need to characterize d − k amplitudes j|u k for 0 ≤ j ≤ d−k−1 and make use of 2k orthogonality relations with the kets |u 0 through |u k−1 . The total number of measurements needed is thus k IC = est from C k by minimizing the norm U † k χU k 1 over all χ ∈ C k instead of entropy. The former is known to promote sparsity, which here refers to the (approximately found) diagonal basis representation of χ. Graphs of sCVX and F are plotted against k and averaged over 60 random ququart unitary processes. All other specifications are those for Fig. 2 in the main article. The kIC value for minL1 is 42.95 ± 5.43, which is larger than minENT, suggesting that ACQPT with minENT is still the more efficient compressive strategy. to fully characterize an arbitrary pure state. In the original BKD unitary scheme, M = 2d is the number of outcomes of a complicated measurement scheme involving non-projective observables. In our context, we shall consider only rank-1 projective measurements that are more feasible to carry out in experiments. For this, there exists a lower bound of M for such measurements, which is 3d − 2 [53] . The final scaling behavior for the projective BKD unitary scheme then reads k IC = 2d 2 − d. We reiterate that these BKD optimal measurements, however, are effective only when the unknown process is strictly unitary. By contrast, ACQPT works without such a risky unitarity assertion and the reduction is also dramatic [O(d 2 )] compared to O(d 4 ) measurements employed in traditional QPT. Furthermore, the scaling behavior of the projective BKD scheme is, incidentally, very close to the IC number when r k is replaced by r = 1 in the "modulo rule" used in ACQPT if one assumes that the unknown qudit process is unitary, which is k IC = O(2.2d 2 ) in the large-d limit. This tells us that any presumed rank assumption can, and should be conservatively incorporated, in such a way that we allow ACQPT to decide if the data ultimately agree with such an assumption. Figure 6 demonstrates the superiority of minENT over the minimum-L1 principle (minL1), the characteristics of which extends beyond ququart systems considered in the figure. Experimental implementation of a fully controllable depolarizing quantum operation Experimental demonstration of high fidelity entanglement distribution over decoherence channels via qubit transduction Protecting entanglement from decoherence using weak measurement and quantum measurement reversal Quantum computers Roads towards fault-tolerant universal quantum computation Blueprint for a microwave trapped ion quantum computer Fast quantum logic gates with trapped-ion qubits Accurate quantum logic gates by spin echo in Rydberg atoms Implementation of a quantum controlled-swap gate with photonic circuits A quantum Fredkin gate Linear optical Fredkin gate based on partial-swap gate Quantum Computation and Quantum Information Quantum process tomography of a controlled-not gate Maximum-likelihood estimation of quantum measurement Complete characterization of a quantum process: The two-bit quantum gate Adaptive schemes for incomplete quantum process tomography Ancilla-assisted quantum process tomography Chois proof as a recipe for quantum process tomography Imprinting complete information about a quantum channel on its output state Quantum tomography for measuring experimentally the matrix elements of an arbitrary quantum operation Characterization of quantum dynamics using quantum error correction Quantum code for quantum error characterization Direct characterization of quantum dynamics: General theory Direct characterization of quantum dynamics Direct quantum process tomography via measuring sequential weak values of incompatible observables Experimental demonstration of selective quantum process tomography on an nmr quantum information processor Selective and efficient quantum state tomography and its application to quantum process tomography Selective and efficient quantum process tomography without ancilla Selective and efficient quantum process tomography Selective and efficient estimation of parameters for quantum process tomography Compressed sensing Near-optimal signal recovery from random projections: Universal encoding strategies? Exact matrix completion via convex optimization Quantum state tomography via compressed sensing Quantum tomography protocols with positivity are compressed sensing protocols Experimentally exploring compressed sensing quantum tomography Experimental quantum compressed sensing for a seven-qubit system Quantum process tomography of unitary and near-unitary maps Compressed sensing quantum process tomography for superconducting quantum gates Efficient measurement of quantum dynamics via compressive sensing Pitfalls in compressed sensing reconstruction and how to avoid them Convex Optimization Semidefinite programming Optimal Experiment Design for Quantum State and Process Tomography and Hamiltonian Parameter Estimation Quantum Process Tomography via L1-norm Minimization Adaptive compressive tomography with no a priori information Adaptive compressive tomography: A numerical study Sparse signal recovery based on nonconvex entropy minimization Low-rank matrices recovery via entropy function Linear Optics Controlled-Phase Gate Made Simple Demonstration of an Optical Quantum Controlled-NOT Gate without Path Interference Measurement of subpicosecond time intervals between two photons by interference Pure-state informationally complete and "really" complete measurements