We present a variational quantum framework for optimization problems constrained by nonlinear partial differential equations (PDE). This extends the recently proposed bi-level variational quantum PDE constrained optimization (BVQPCO) framework for linear PDEs in [1] to a nonlinear setting. This is an important extension as many science and engineering applications such as aerodynamics, computational fluid dynamics (CFD), material science, computer vision, and inverse problems necessitate the solution of optimization problems constrained by a system of nonlinear PDE. Examples include Euler and Navier-Stokes PDEs and heat equation in fluid dynamics, combustion, and weather forecasting; magnetohydrodynamics and Vlasov-Maxwell PDEs in plasma dynamics and astrophysics; wave equation in structural mechanics; Black–Scholes PDE in finance; and reaction-diffusion PDEs in chemistry, biology, and ecology to name a few. Closed-form solutions are generally unavailable for PDE constrained optimization problems, necessitating the development of numerical methods [2–4]. A variety of classical gradient-based and gradient-free numerical methods have been developed, and rely on repeated calls to the underlying PDE solver. Since PDE simulations are computationally expensive, using them within the design/optimization loop can become a bottleneck.
To extend the BVQPCO framework to nonlinear PDEs we employ the Carleman linearization (CL) framework. The CL framework allows one to transform polynomial ordinary differential equations (ODE), i.e., ODEs with a polynomial vector field, into an infinite system of linear ODEs [5]. For instance, such polynomial ODEs naturally arise when PDEs, e.g., those mentioned above, are semi-discretized in the spatial dimensions. By truncating the CL system to a finite order, one obtains a finite system of linear ODEs, and under suitable conditions, the impact of the truncation error can be characterized [6,7]. CL has become an important tool in developing quantum algorithms for simulating polynomial ODEs on quantum devices since the seminal work [8]. Specifically, assuming $R_2 < 1$, where $R_2$ is a parameter characterizing the ratio of the nonlinearity and forcing to the linear dissipation, the query/gate complexity of the algorithm proposed in [8] takes the form
The main contributions of this work are as follows:
We formulate a bi-level variational quantum framework, referred to as nonlinear BVQPCO (nBVQPCO), for solving nonlinear PDE constrained optimization problems as discussed above. While the original optimization problem is formulated in terms of an unnormalized PDE solution, we show how the problem can be transformed under certain conditions into an equivalent form which depends only on the normalized PDE solution. This is significant since VQLS, like most quantum algorithms, generates only a normalized solution of the linear system; obtaining the unnormalized solution would otherwise have required substantial overhead to additionally estimate the norm of the solution.
We use the sigma basis [14], an alternative tensor product decomposition that exploits the sparsity/structure of the linear system arising from PDE discretization to facilitate VQLS computations. This results in a substantial reduction in the number of terms in the linear combination of unitaries (LCU) decomposition necessary for VQLS cost function evaluations, and thus leads to a significant computational advantage compared to the Pauli basis traditionally used for the LCU decomposition.
We present a detailed computational error analysis for CL+VQLS-based solution of polynomial ODE and prove that under suitable assumptions the CL+VQLS approach can generate a normalized solution of a given ODE to an arbitrary accuracy. To the best of our knowledge such a rigorous analysis of CL+VQLS approach has not been presented before in the literature. Furthermore, by combining this error analysis with the empirically known results for the run time of VQLS, we assess potential utility of our nBVQPCO framework over equivalent classical methods.
We implement the nBVQPCO framework using the PennyLane library and apply it to solve a prototypical inverse problem applied to nonlinear Burgers’ PDE which involves calibrating the PDE model parameter to match the given measurements.
The paper is organized as follows: We start with mathematical preliminaries in Section 2 and introduce the nonlinear PDE constrained optimization problem in Section 3.1. We describe the CL+VQLS procedure in Section 3.2 and develop the nBVQPCO framework extension including an outline of pseudo-code for its implementation in Section 3.3. Complexity and error analysis of the proposed framework are covered in Section 4. In Section 5 we formulate the inverse PDE problem in our nBVQPCO framework, apply an alternative tensor product decomposition for computing VQLS cost function, and describe the implementation details and numerical results using the PennyLane library. We conclude in Section 7 with avenues for future work.
Let ℕ = {1, 2,…}, ℝ, and ℂ be the sets of positive integers, real numbers, and complex numbers, respectively. We will denote vectors by small bold letters and matrices by capital bold letters. A† and A′ will denote the vector/matrix complex conjugate and vector/matrix transpose, respectively. Tr(A) will denote the trace of a matrix. We will represent an identity matrix of size s × s by Is.
Kronecker product will be denoted by ⊗. For any pair of vectors $x \in \mathbb{R}^n$ and $y \in \mathbb{R}^m$, their Kronecker product $w = x \otimes y \in \mathbb{R}^{nm}$ is
$$w = (x_1 y_1, \ldots, x_1 y_m, x_2 y_1, \ldots, x_n y_m)'.$$
Similarly, for $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{p \times q}$, their Kronecker product $A \otimes B \in \mathbb{R}^{mp \times nq}$ is defined blockwise as the matrix whose $(i, j)$-th block is $a_{ij} B$.
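For concreteness, both products can be computed with NumPy's `kron`; a minimal illustration:

```python
import numpy as np

x = np.array([1.0, 2.0])          # x in R^2
y = np.array([3.0, 4.0, 5.0])     # y in R^3
w = np.kron(x, y)                 # w = x ⊗ y in R^6: [3, 4, 5, 6, 8, 10]

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.eye(2)
AB = np.kron(A, B)                # A ⊗ B is 4 x 4, blockwise a_ij * B
```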
The standard inner product between two complex vectors $x, y \in \mathbb{C}^n$ will be denoted by $\langle x, y \rangle$. The $l_p$-norm in the Euclidean space $\mathbb{R}^n$ will be denoted by $\| \cdot \|_p$, $p = 1, 2, \ldots, \infty$, and is defined as
$$\|x\|_p = \Big(\sum_{i=1}^{n} |x_i|^p\Big)^{1/p}, \qquad \|x\|_\infty = \max_i |x_i|.$$
For the norm of a matrix $A \in \mathbb{R}^{n \times m}$, we use the induced norm, namely
$$\|A\|_p = \sup_{x \neq 0} \frac{\|Ax\|_p}{\|x\|_p}.$$
We will use $p = 2$, i.e., the $l_2$ norm for vectors and the spectral norm for matrices, unless stated otherwise. The spectral radius of a matrix, denoted by $\rho(A)$, is defined as
$$\rho(A) = \max_i |\lambda_i(A)|,$$
where $\lambda_i(A)$ are the eigenvalues of $A$.
The trace norm of a matrix $A$ is defined as
$$\|A\|_{\mathrm{Tr}} = \mathrm{Tr}\big(\sqrt{A^\dagger A}\big).$$
Let $\mathcal{F}$ be a vector space of real vector-valued functions $u(t) : [0, T] \to \mathbb{R}^n$ defined on $[0, T]$. Then the inner product between $u_1, u_2 \in \mathcal{F}$ is defined as
$$\langle u_1, u_2 \rangle = \int_0^T u_1'(t)\, u_2(t)\, dt.$$
We will use the standard braket notation, i.e., |ψ〉 and 〈ψ|, in representing quantum states [15]. The inner product between two quantum states |ψ〉 and |ϕ〉 will be denoted by 〈ψ|ϕ〉. For a vector x, we denote by $|x\rangle = x/\|x\|$ the corresponding normalized quantum state.
We consider a general class of PDE constrained design optimization problems of the form [16]
When semi-discretized in space, the PDE (10) and associated boundary/initial conditions (11) result in a polynomial ODE of the form
The cost function is a quadratic function of the solution u(t), t ∈ [0, T], i.e.
Under these assumptions, the optimization problem (9) can be represented as
Assumption 1 is fairly general, as many PDEs arising in engineering and science, including the examples listed in the Introduction, result in polynomial ODEs when semi-discretized in space; see [9] and references therein. Furthermore, polynomial ODEs also arise directly in mechanics, molecular dynamics, chemical kinetics, epidemiology, social dynamics, and biological networks.
Assumption 2 commonly arises in many PDE constrained optimization problems. For instance in inverse problems the cost function is typically taken as a squared norm of difference between the solution and measured variables, which results in a quadratic form, see Section 5.1 for details. Other examples include PDE optimal control where the cost function is a weighted combination of deviation from a reference signal and squared norm of control input, aerodynamic design where the objective is to minimize heat transfer/drag or maximize lift, and computer vision problems such as shape from shading, surface reconstruction from sparse data, and optical flow computation where the objective function depends on an appropriate form of squared norm.
The BVQPCO framework proposed in [1] cannot be applied directly to the optimization problem (15)–(18), as the constraints are nonlinear. To address this we employ the CL framework, whereby the polynomial ODEs are transformed into an infinite-dimensional system of linear ODEs and truncated. Since it is always possible to map the k-th degree polynomial ODE (13) to a higher-dimensional quadratic polynomial ODE and then apply CL [9], without loss of generality we restrict to k = 2 throughout this paper.
Consider a system of inhomogeneous quadratic polynomial ODEs (16) with k = 2, i.e.,
$$\frac{du}{dt} = F_2\, u^{[2]} + F_1\, u + F_0(t), \qquad u(0) = u_{in}, \qquad (19)$$
where $u \in \mathbb{R}^n$, $F_2 \in \mathbb{R}^{n \times n^2}$, $F_1 \in \mathbb{R}^{n \times n}$, and $F_0(t) \in \mathbb{R}^n$.
Below we describe the steps required to transform the solution of these nonlinear ODEs into a form amenable to VQLS, see Figure 1 for illustration of the steps involved.
- Step 1:
For CL we introduce variables $w_i = u^{[i]} \in \mathbb{R}^{n^i}$, $i \in \mathbb{N}$, where $u^{[i]} = u \otimes \cdots \otimes u$ denotes the $i$-fold Kronecker power, which satisfy
$$\frac{dw_i}{dt} = \sum_{j=0}^{2} A^{i}_{i+j-1}\, w_{i+j-1}, \qquad (20)$$
where $A^{i}_{i+j-1} \in \mathbb{R}^{n^i \times n^{i+j-1}}$ is given by
$$A^{i}_{i+j-1} = \sum_{\nu=1}^{i} I_n^{\otimes (\nu-1)} \otimes F_j \otimes I_n^{\otimes (i-\nu)}, \qquad (21)$$
with $0 \le j \le 2$. Note that (20) defines an infinite-dimensional system of linear ODEs with state $(w_1', w_2', \ldots)'$ and initial condition $w_i(0) = u_{in}^{[i]}$, and is known as CL. For any finite $N \in \mathbb{N}$ define
$$w = (w_1', \ldots, w_N')' \in \mathbb{R}^{N_c}, \qquad (22)$$
where $N_c = \sum_{i=1}^{N} n^i$. The CL, when truncated to order $N$, results in a finite system of linear ODEs
$$\frac{d\hat{w}(t)}{dt} = A_N(t)\,\hat{w}(t) + b_N(t), \qquad \hat{w}(0) = \hat{w}_{in} = \big(u_{in}', (u_{in}^{[2]})', \ldots, (u_{in}^{[N]})'\big)', \qquad (23)\text{–}(25)$$
where $A_N(t)$ is the block tridiagonal matrix built from the blocks $A^i_{i+j-1}$, $b_N(t) = (F_0'(t), 0, \ldots, 0)'$, and we have dropped the dependence on $t$ and $p$ in the RHS terms for brevity. Note that $\hat{w}(t)$ approximates $w(t)$ as defined in Eq. (22), and the error incurred as a function of the truncation level $N$ is described by Lemma 1.

Illustration of steps involved in transforming a quadratic polynomial ODE (19) into a system of linear algebraic equations (30) amenable to VQLS. This involves CL and truncation resulting in a system of linear ODEs, time discretization of linear ODEs to generate a system of linear difference equations, and expressing difference equation into a single system of linear algebraic equations with normalized solution and right-hand side. While an explicit time discretization is shown in the figure, one can use implicit Euler discretization or other schemes as well.
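To make Step 1 concrete, the following NumPy sketch (our own illustration, with our naming conventions, not the paper's code) assembles the truncated Carleman matrix $A_N$ and forcing $b_N$ from Eqs. (20)–(21), assuming time-independent $F_0$, $F_1$, $F_2$ stored as `F = [F0, F1, F2]` with `F0` an $n \times 1$ array:

```python
import numpy as np

def transfer_matrix(i, j, F):
    """Carleman transfer block A^i_{i+j-1} of Eq. (21):
    sum over nu of I^(nu-1) ⊗ F_j ⊗ I^(i-nu)."""
    n = F[1].shape[0]                       # F[j] maps R^{n^j} -> R^n
    A = np.zeros((n**i, n**(i + j - 1)))
    for nu in range(1, i + 1):
        A += np.kron(np.kron(np.eye(n**(nu - 1)), F[j]), np.eye(n**(i - nu)))
    return A

def carleman_system(F, N):
    """Assemble the block-tridiagonal A_N and forcing b_N of the truncated CL."""
    n = F[1].shape[0]
    dims = [n**i for i in range(1, N + 1)]
    offs = np.cumsum([0] + dims)
    Nc = offs[-1]
    A_N = np.zeros((Nc, Nc))
    for i in range(1, N + 1):
        for j in (0, 1, 2):
            col = i + j - 1                 # block index of w_{i+j-1}
            if 1 <= col <= N:
                A_N[offs[i-1]:offs[i], offs[col-1]:offs[col]] += transfer_matrix(i, j, F)
    b_N = np.zeros(Nc)
    b_N[:n] = F[0].ravel()                  # forcing enters only the w_1 block
    return A_N, b_N
```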
- Step 2:
Applying the forward Euler method to the system (23) with step size h = T/M, M ∈ ℕ, results in a set of difference equations
$$\hat{w}_{k+1} = \big(I + A_N(kh)\,h\big)\,\hat{w}_k + h\, b_N(kh), \qquad (26)$$
where $\hat{w}_k$ approximates $\hat{w}(kh)$, as defined in Eq. (25), for $k \in \{0, \cdots, M-1\}$, with $\hat{w}_0 = \hat{w}_{in}$. The error introduced by the Euler discretization is characterized by Lemma 2.
- Step 3:
The iterative system (26) can be expressed as a single system of linear equations
$$\tilde{A}\,\tilde{w} = \tilde{b}, \qquad (27)$$
where $\tilde{w} = (\hat{w}_0', \ldots, \hat{w}_M')' \in \mathbb{R}^{N_c(M+1)}$, $\tilde{A}$ is block lower bidiagonal with identity blocks on the diagonal and $-(I + A_N h)$ blocks on the sub-diagonal, and $\tilde{b} = (\hat{w}_{in}', h\,b_N', \ldots, h\,b_N')'$.
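A matching sketch of Step 3 (our illustration, assuming time-independent $A_N$ and $b_N$), stacking the recursion (26) into the single system (27):

```python
import numpy as np

def forward_euler_system(A_N, b_N, w_in, h, M):
    """Stack the forward-Euler recursion (26) into one linear system (27).
    Block row k+1 enforces w_{k+1} - (I + h A_N) w_k = h b_N."""
    Nc = A_N.shape[0]
    dim = Nc * (M + 1)
    A_tilde = np.eye(dim)
    b_tilde = np.zeros(dim)
    b_tilde[:Nc] = w_in                     # first block row: w_0 = w_in
    E = np.eye(Nc) + h * A_N                # forward-Euler step matrix
    for k in range(M):
        r, c = (k + 1) * Nc, k * Nc
        A_tilde[r:r + Nc, c:c + Nc] = -E
        b_tilde[r:r + Nc] = h * b_N
    return A_tilde, b_tilde
```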
- Step 2:
Similarly, applying the backward Euler method to the system (23) with step size h = T/M yields
$$\big(I - A_N((k+1)h)\,h\big)\,\hat{w}_{k+1} = \hat{w}_k + h\, b_N((k+1)h), \qquad (28)$$
where $\hat{w}_k$ approximates $\hat{w}(kh)$, for $k \in \{0, \cdots, M-1\}$.
- Step 3:
The iterative system (28) can similarly be expressed as a single system of linear equations
$$\tilde{A}\,\tilde{w} = \tilde{b}, \qquad (29)$$
where now $\tilde{A}$ has the identity in its first diagonal block, $(I - A_N h)$ in the remaining diagonal blocks, and $-I$ blocks on the sub-diagonal, with $\tilde{b} = (\hat{w}_{in}', h\,b_N', \ldots, h\,b_N')'$.
In the VQLS framework the linear system obtained using forward Euler (Eq. (27)) or backward Euler (Eq. (29)) is further transformed into the form
$$\tilde{A}\,|x\rangle \propto |\phi\rangle, \qquad |\phi\rangle = \tilde{b}/\|\tilde{b}\|.$$
The optimal parameters $\theta^*$ then prepare a solution
$$|\tilde{x}\rangle = V(\theta^*)|0\rangle \approx |x\rangle,$$
where $V(\theta)$ is the selected ansatz.
Different cost functions have been proposed for the VQLS algorithm [13]. These include the global cost function $C_{ug}(\theta)$ and its normalized version $C_g(\theta)$:
$$C_{ug}(\theta) = \langle x(\theta)|\tilde{A}^\dagger \big(I - |\phi\rangle\langle\phi|\big) \tilde{A} |x(\theta)\rangle, \qquad C_g(\theta) = \frac{C_{ug}(\theta)}{\langle x(\theta)|\tilde{A}^\dagger \tilde{A}|x(\theta)\rangle}.$$
Linear Combination of Unitaries Decomposition: The square matrix $\tilde{A}$ can be decomposed as a linear combination of unitary operators $\tilde{A}_i$ with complex scalar coefficients $\alpha_i$ as
$$\tilde{A} = \sum_{i=1}^{L} \alpha_i\, \tilde{A}_i.$$
One popular approach for the unitary operators is based on the Pauli basis formed from the identity I and the Pauli gates X, Y, and Z. A matrix $\tilde{A}$ of size $2^n \times 2^n$ can then be written as a linear combination of elements selected from the basis set
$$\mathcal{P}_n = \{I, X, Y, Z\}^{\otimes n}.$$
In this basis, each $\tilde{A}_i \in \mathcal{P}_n$ and its corresponding coefficient $\alpha_i$ can be determined via different numerical approaches [17]. More recently, an alternative tensor product decomposition has been proposed which uses different basis elements to better exploit the underlying structure and sparsity of matrices, resulting in a more efficient decomposition [14,18]. For example, for the matrices arising from discretization of the Poisson equation [18] and the heat equation [14], the number of LCU terms under the sigma basis scales only logarithmically, i.e., as $\mathcal{O}(n)$ with the matrix size $2^n \times 2^n$, compared to the Pauli basis for which the number of terms can vary from $\mathcal{O}(2^n)$ (diagonal matrices) to $\mathcal{O}(2^{2n})$ (dense matrices).
The sigma basis set comprises the following operators
$$S = \{I, \sigma_+, \sigma_-, \sigma_+\sigma_-, \sigma_-\sigma_+\},$$
where $\sigma_+ = |0\rangle\langle 1|$ and $\sigma_- = |1\rangle\langle 0|$.
Even though some of the operators in S are non-unitary, using the concept of unitary completion, one can still design efficient quantum circuits for computing the global/local VQLS cost functions, see [14] for details. We have used this alternate decomposition in the numerical study in Section 5.
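To make the Pauli-basis alternative concrete, here is a brute-force sketch of the Pauli LCU (our illustration; the paper instead uses the more efficient matrix slicing method of [17]). It evaluates the standard coefficient formula $\alpha_P = \mathrm{Tr}(P\tilde{A})/2^n$ and is only viable for tiny matrices, since it enumerates all $4^n$ Pauli strings:

```python
import itertools
import numpy as np

PAULI = {"I": np.eye(2), "X": np.array([[0, 1], [1, 0]]),
         "Y": np.array([[0, -1j], [1j, 0]]), "Z": np.diag([1.0, -1.0])}

def pauli_lcu(A, tol=1e-12):
    """Brute-force Pauli LCU of a 2^n x 2^n matrix: alpha_P = Tr(P A) / 2^n."""
    n = int(np.log2(A.shape[0]))
    terms = {}
    for label in itertools.product("IXYZ", repeat=n):
        P = PAULI[label[0]]
        for s in label[1:]:
            P = np.kron(P, PAULI[s])
        alpha = np.trace(P @ A) / 2**n      # Paulis are Hermitian, so P† = P
        if abs(alpha) > tol:
            terms["".join(label)] = alpha
    return terms
```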
To solve the optimization problem (15)–(18) in a variational quantum framework, we extend the BVQPCO framework to the nonlinear setting: the outer optimization level iteratively selects $p_k$ using a classical black box optimizer based on the cost function evaluated from the solution of the truncated CL system obtained via VQLS. To do so, let $K_0 \in \mathbb{R}^{n \times N_c(M+1)}$ and $K_f \in \mathbb{R}^{n(M+1) \times N_c(M+1)}$ be matrices such that
Thus, the optimization problem (15)–(18) can be expressed in a variational quantum form as
Note that the above quantum formulation, unlike the original optimization problem (15)–(18), depends only on the normalized PDE solution. This is significant since VQLS, like most quantum algorithms, generates only a normalized solution of the linear system; obtaining the unnormalized solution would otherwise require substantial overhead to additionally estimate the norm of the solution.
nBVQPCO Algorithm.
Input: Semi-discretized ODE in the form (19), cost function (14),
Output: Optimal parameters: p*
1: Initialize k = 0 and p0 = pin.
2: Apply CL to (19) with truncation level N and determine Ã(p) and b̃(p).
3: Determine the unitary Uϕ to prepare |ϕ〉, and the LCU decompositions for Φf and Φ0.
4: while stopping criteria not met do
5: Compute the LCU decomposition {αi(pk), Ãi} of Ã(pk).
6: Compute the normalized solution |x̃(pk)〉 of the linear system via VQLS.
7: Evaluate the design cost at pk from the VQLS solution using the quantum circuits described below.
8: Use classical black box optimizer to select next pk+1 subject to constraints
9: Check the stopping criterion, e.g., whether S(pk, pk+1) ≤ ϵ.
10: k ← k +1
11: end while
12: Return pk
The pseudo-code for nBVQPCO is summarized in Algorithm 1, and Figure 2 shows the overall flow diagram. Several remarks follow:
Note that VQLS provides solution of the given linear system only up to a normalization constant. By reformulating the cost function expression as discussed above one can make the cost function evaluation independent of the normalization constant.
The inputs for VQLS in line (6) include the linear combination of unitaries (LCU) decomposition of Ã(p), the unitary Uϕ which prepares |ϕ〉, the selected ansatz V(θ), the stopping threshold γ, and the number of shots nsh used in VQLS cost function evaluation. See Section 5 for the details. From the relations (60), one can select γ such that stopping at Cg ≤ γ results in a VQLS approximation error within a prescribed tolerance; see Lemma 4.

For the LCU decomposition a popular approach is to use the Pauli basis as discussed above. We propose to employ a more efficient tensor product decomposition [14], which exploits the underlying structure and sparsity of matrices such as those arising from PDE discretizations, see Section 5.
Depending on how à depends on the parameters p, a parameter-dependent LCU decomposition {αi(p), Ãi} can be pre-computed once, thus saving computational effort, see Section 5.1 for an example.
For computing the design cost terms in line 7, one can use the unitary Uϕ as computed in line (3) and the quantum circuit for the SWAP test. Similarly, given the LCU decompositions of Φf and Φ0 as computed in line (3), one can express and compute each term in the above sums using the quantum circuit associated with the Hadamard test.

For the outer optimization one can utilize any global black box optimization method [1]; a minimal grid-search sketch is given below. For instance, one can sample the optimization variables p on a set of predetermined grid points, evaluate the cost function at those points, and select the grid point with the minimum cost. Alternatively, one can use more adaptive techniques such as Bayesian optimization (BO), a sequential design strategy for global optimization of black-box functions which uses an exploration/exploitation trade-off to find the optimal solution with a minimum number of function calls.
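As a minimal illustration of the simplest such outer loop (our sketch; `design_cost` is an assumed callable wrapping lines 5–7 of Algorithm 1, i.e., the CL+VQLS-based cost evaluation):

```python
import numpy as np

def outer_grid_search(p_grid, design_cost):
    """Exhaustive-search outer loop: evaluate the CL+VQLS-based design cost
    at predetermined grid points and return the minimizing parameter."""
    costs = np.array([design_cost(p) for p in p_grid])
    k = int(np.argmin(costs))
    return p_grid[k], costs[k]
```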
For convergence, line 9 uses a step size tolerance, i.e., S(pk, pk+1) = ∥pk+1 − pk∥. However, other conditions can be employed, such as a functional tolerance, i.e., the algorithm is terminated when the change in design cost is within a tolerance ϵ.

In Section 4 we provide a detailed computational error analysis for the CL+VQLS-based solution of polynomial ODEs. Since there are no theoretical results available for the run time of VQLS, we employ empirically known results along with our rigorous CL+VQLS error analysis to assess the potential advantage of the nBVQPCO framework over classical methods.

Top: Schematic showing transformation of a nonlinear PDE constrained optimization problem into a variational quantum form. Bottom: Flow diagram of the nBVQPCO framework. VQLS uses an inner level optimization to solve the linear system constraints, arising from the discretization of the underlying PDEs, for given design parameters, and evaluate the quantities related to design cost/objective function. A black box optimizer is used for the outer level optimization to select next set of parameters values based on the evaluated design cost.
Consider a system of inhomogeneous quadratic polynomial ODEs (19)
See also Figure 1, which illustrates the relationship between these different variables. We make the following assumptions.
For the system (46)
$F_1$, $F_2$ are time-independent matrices.
The spectral norms $\|F_1\|$, $\|F_2\|$, and $\|F_0\| = \max_{t \in [0,T]} \|F_0(t)\|$ are known, and
$F_1$ is diagonalizable with eigenvalues $\lambda_i$ of $F_1$ satisfying $\mathrm{Re}(\lambda_n) \le \cdots \le \mathrm{Re}(\lambda_1) < 0$.
Let
$$R_2 = \frac{1}{|\mathrm{Re}(\lambda_1)|}\left(\|u_{in}\|\,\|F_2\| + \frac{\|F_0\|}{\|u_{in}\|}\right) < 1. \qquad (48)$$
Condition (48) can be conservative and too restrictive to be satisfied in practical applications. As pointed out in the Introduction, for CFD-type applications, one can work with alternative form of conservation laws, e.g., LBM [10], for which the condition Eq. (48) can more readily be met in practical regimes of interest. Additionally, one can employ a rescaling technique proposed in [19] to further improve computational advantage when using the Carleman linearization framework.
Under the condition (48), system (46) can be rescaled
Note that under this rescaling, the value of R2 in (48) remains unchanged.
Under Assumption 3, Lemmas 1 – 3 hold, see [8] for the details.
The error
Suppose that the error η(t), as introduced in Lemma 1, satisfies
The stiffness κ of à in (Eq. (27)), recall arising from application of steps 1–3 to the ODE (46), is bounded as follows
Consider a linear system
The proof of the above result can be found in [13].
We assume that the run time of VQLS scales as
$$\mathcal{O}\big((\log \tilde{N})^{8.5}\, \kappa\, \log(1/\epsilon)\big),$$
where $\tilde{N}$ denotes the size of the linear system, $\kappa$ its condition number, and $\epsilon$ the desired solution accuracy.
Note that the above assumption is based on an empirical scaling study in [13] and no theoretical guarantees are available. While using Assumption 4 for a given application, one needs to be careful about how well the underlying empirical setup used to obtain this run time scaling of VQLS is met for that application.
The following empirical setup underlies Assumption 4 (see Section 2.2 in [13]).
The run time of the VQLS solution is characterized in terms of time-to-solution, which refers to the number of exact cost function evaluations during the optimization needed to guarantee that the solution error is below a specified value. Since the true solution of the linear system is not known, the value of the cost computed during VQLS, combined with the error bounds (60), is used to determine the worst-case error for the stopping criterion.
Randomly sampled sparse matrices were used for which the number of LCU terms scales as $\mathcal{O}((\log N)^2)$.
Normalized local VQLS cost function was used.
A problem efficient ansatz was used and time-to-solution was averaged over several VQLS runs.
VQLS was implemented with exact sampling, i.e., no finite sampling or shots were used for the computation of the cost function. Furthermore, no hardware noise was considered.
As discussed in Remark 4, we will use the term run time to refer to time-to-solution throughout the rest of the paper unless otherwise stated.
The trace norm and l2 norm between |ψ〉 and |ϕ〉 are related as follows
See Lemma 2 in [1] for the proof.
Consider the ODE system (46). Then under Assumption 3, for any given ϵ > 0, one can choose the CL truncation level N (see Eq. (A9)) and Euler discretization step size h (see Eq. (A5)) such that
The proof of the above theorem is given in Appendix A.1.
Consider the ODE system (46). Then under Assumption 3, for any given 0 < ϵ < 1, one can choose CL truncation level N, Euler discretization step size h and VQLS stopping threshold γ such that
Proof. From Theorem 1, choose N, h such that
Let γ be
Thus, using relations (65) and (67)
Furthermore, by Assumption 4 the run time of VQLS behaves as $\mathcal{O}\big((\log(N_c(M+1)))^{8.5}\, \kappa\, \log(1/\epsilon')\big)$, which using Lemma 3 can be simplified as follows
Theorem 1 and Theorem 2 can also be applied to higher-order polynomial systems (13), by replacing Assumption 3 with Assumption 5.
For the system (13)
Let Fi, i = 0,…, k be time independent matrices with bounded norms.
Assume $F_1$ is diagonalizable and its eigenvalues satisfy the decay condition in Eq. (70), where the associated Carleman transfer matrix is given by Eq. (71), which is a generalization of Eq. (21); see [9] for details. Let $R_k$, the degree-k analogue of $R_2$ defined in Eq. (72), satisfy $R_k < 1$.

To apply the nBVQPCO framework to the higher-order polynomial case, one can either transform (13) to the form (19) and then follow the procedure described in Section 3.2, or apply CL in Step 1 directly to (13) and then follow the subsequent steps. As shown in [9], the latter is equivalent to the former but leads to a more compact CL representation.
Classically, a variety of methods can be used to solve the optimization problem (15)–(18). Consistent with the nBVQPCO framework, we use forward Euler to solve the ODE (46) for a given parameter p, compute the cost function, and wrap a black box optimizer, e.g., BO, around these steps.
Applying the forward Euler discretization scheme with step size h = T/M to (46) results in the set of nonlinear finite difference equations
$$\tilde{u}_{k+1} = \tilde{u}_k + h\big(F_2\, \tilde{u}_k^{[2]} + F_1\, \tilde{u}_k + F_0(kh)\big), \qquad \tilde{u}_0 = u_{in}, \qquad k = 0, \ldots, M-1. \qquad (73)$$
The next theorem shows that one can always choose a step size h so that ũ approximates uc to within a desired error.
Consider the ODE system (46). Then under Assumption 3, for any given ϵ > 0 one can choose h (see Eq. (A23)) such that
For the proof see Appendix A.2.
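A minimal classical-baseline sketch of the recursion (73), assuming time-independent $F_1$, $F_2$ and a callable forcing `F0(t)` (our names, not the paper's code):

```python
import numpy as np

def classical_forward_euler(F0, F1, F2, u_in, T, M):
    """Forward Euler for the quadratic ODE (46):
    u_{k+1} = u_k + h (F2 (u_k ⊗ u_k) + F1 u_k + F0(k h))."""
    h = T / M
    u = np.array(u_in, dtype=float)
    traj = [u.copy()]
    for k in range(M):
        u = u + h * (F2 @ np.kron(u, u) + F1 @ u + F0(k * h))
        traj.append(u.copy())
    return np.array(traj)
```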
Let s be the maximum of the sparsities of F1 and F2. The computational complexity of solving (73) thus scales as
On the other hand, based on Theorem 2, under certain conditions the run time of the VQLS-based explicit Euler scheme scales polylogarithmically w.r.t. the system size, see Eq. (64). Assuming that the number of outer loop iterations is similar for this classical method and the nBVQPCO framework, the run time of nBVQPCO will scale polylogarithmically with the system size, and thus could provide a significant computational advantage for simulation-based design problems.
In the comparison above, note that we have not accounted for the computational effort of the LCU decomposition required within the nBVQPCO framework (see line 5 in Algorithm 1). This, however, may not be a significant overhead. As pointed out earlier, depending on how Ã depends on the parameters p, a parameter-dependent LCU decomposition {αi(p), Ãi} can be pre-computed once, thus saving computational effort; we show this via an example in Section 5.1. Moreover, it may be possible to derive the LCU decomposition analytically and implement it efficiently, e.g., when using the sigma basis, see Section 5.2.
Finally, we would like to remind the reader that the conclusion that nBVQPCO could be advantageous over equivalent classical methods rests on the strong Assumption 4 regarding VQLS scaling. Recall that this scaling was determined empirically using random matrices whose LCU decomposition admits $\mathcal{O}((\log N)^2)$ terms. In the future, it will be worthwhile to characterize the VQLS scaling behavior for the sparse and structured matrices which typically arise from PDE discretization, and refine the comparison above.
Consider the 1D Burgers’ equation, which models convective flow u(x, t) on a spatial domain [0, L] with Dirichlet boundary conditions:
$$\frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} = \nu \frac{\partial^2 u}{\partial x^2} + f(x, t).$$
Discretizing the PDE with nx spatial grid points with grid size Δx = L/(nx + 1) using a central difference scheme, and letting ui(t) = u(xi, t) with xi = iΔx, i = 1, ⋯, nx, leads to
$$\frac{du_i}{dt} = \nu\, \frac{u_{i+1} - 2u_i + u_{i-1}}{\Delta x^2} - u_i\, \frac{u_{i+1} - u_{i-1}}{2\Delta x} + f(x_i, t), \qquad i = 1, \ldots, n_x. \qquad (84)$$
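A minimal sketch (our naming; homogeneous Dirichlet boundary values assumed) of assembling the diffusion matrix $F_1$ and convection matrix $F_2$ that appear in Eqs. (85)–(86):

```python
import numpy as np

def burgers_matrices(nx, L, nu):
    """F1 and F2 of the semi-discretized Burgers' ODE (84) on nx interior
    points with homogeneous Dirichlet BCs and central differences."""
    dx = L / (nx + 1)
    # F1: nu times the standard tridiagonal second-difference operator
    F1 = nu * (np.diag(-2.0 * np.ones(nx)) + np.diag(np.ones(nx - 1), 1)
               + np.diag(np.ones(nx - 1), -1)) / dx**2
    # F2: row i picks -u_i u_{i+1}/(2 dx) + u_i u_{i-1}/(2 dx) out of u ⊗ u,
    # where entry i*nx + j of u ⊗ u equals u_i u_j (0-based indices)
    F2 = np.zeros((nx, nx * nx))
    for i in range(nx):
        if i + 1 < nx:
            F2[i, i * nx + (i + 1)] = -1.0 / (2 * dx)
        if i - 1 >= 0:
            F2[i, i * nx + (i - 1)] = +1.0 / (2 * dx)
    return F1, F2
```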
The system (84) is in the form (19). Applying CL with truncation level N, we can transform (84) to (23) with
We further decompose $\tilde{A}_N$ as
$$\tilde{A}_N = A_{N,1} + \nu A_{N,2},$$
where $A_{N,1}$ collects the ν-independent (convection and forcing) blocks and $A_{N,2}$ the diffusion blocks.
Using the backward Euler time stepping and taking M = nt − 1 time steps with step size h = T/M, Ã in (29) can be expressed as
$$\tilde{A} = \tilde{A}_1 + \nu \tilde{A}_2.$$
Consequently, for VQLS one can compute the LCU for Ã1, Ã2 once, and then generate a parameter-dependent LCU for Ã for any parameter ν for implementation of the nBVQPCO framework.
Furthermore, the optimization problem (78)–(80) can be approximated as
Thus, the optimization problem can be expressed in the form of (41)–(43) as
Note that all the inner products can be computed using appropriate quantum circuits once the solution from VQLS is available as described in Section 3.3.
We use a recursive strategy for the tensor decomposition of Ã1, Ã2 under the sigma basis as described in Section 3.2. For simplicity, assume that the forcing function f(x, t) ≡ 0. Since F1, F2 as defined in Eq. (85) and Eq. (86) are time independent, $A_N(t) \coloneqq A_N$ is also time independent. Assume $n_x = 2^s$, $n_t = 2^t$. Note that $A_N \in \mathbb{R}^{N_c \times N_c}$, where $N_c = \sum_{i=1}^{N} n_x^i$.
Then
Similarly, Ã2 can be written as
Next, we find the tensor decomposition of AN,1, AN,2. Each diagonal block of AN,2 is of the form
With these relations,
Next, we focus on the $A_{N,1}$ term. Since f(x, t) = 0, the sub-diagonal blocks, which involve $F_0$, vanish.
Let’s say that each block
Each block
Thus, AN,1 has a total of
Finally, we look at the decomposition of the blocks
Let $A_s$ be a $2^s \times 2^s$ matrix with a single non-zero entry in position (r, c), r, c ∈ {0, 1, …, nx − 1}. Let the binary representations of r and c be $r_1 r_2 \cdots r_s$ and $c_1 c_2 \cdots c_s$, respectively. Then $A_s$ factors as the tensor product $A_s = |r_1\rangle\langle c_1| \otimes \cdots \otimes |r_s\rangle\langle c_s|$, where each factor is an element of the sigma basis.
This can be proved by induction. It is trivially true for s = 1 by the definition of the sigma basis. Assume the statement holds for s − 1. Then writing $A_s = |r_1\rangle\langle c_1| \otimes A_{s-1}$, where $A_{s-1}$ is the $2^{s-1} \times 2^{s-1}$ single-entry matrix indexed by the remaining bits, the claim follows.
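As a quick check of this single-entry decomposition, the tensor factors can be read off directly from the bits of r and c (our sketch; the σ± labels assume the convention σ+ = |0〉〈1|, σ− = |1〉〈0| used above):

```python
# Map each bit pair (r_k, c_k) to the 2x2 single-entry factor |r_k><c_k|.
KET_BRA = {(0, 0): "sigma+ sigma-",   # |0><0|
           (0, 1): "sigma+",          # |0><1|
           (1, 0): "sigma-",          # |1><0|
           (1, 1): "sigma- sigma+"}   # |1><1|

def single_entry_sigma_factors(r, c, s):
    """Sigma-basis tensor factors of the 2^s x 2^s matrix with a 1 at (r, c)."""
    r_bits = [(r >> (s - 1 - k)) & 1 for k in range(s)]   # MSB first
    c_bits = [(c >> (s - 1 - k)) & 1 for k in range(s)]
    return [KET_BRA[(rb, cb)] for rb, cb in zip(r_bits, c_bits)]

# Example: the 4x4 matrix with a single 1 at (r, c) = (2, 1) factors as
# |1><0| ⊗ |0><1| = sigma- ⊗ sigma+.
```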
To demonstrate the efficacy of using the sigma basis compared to the Pauli basis, we compute the number of terms needed in the LCU decomposition of Ã1, Ã2 for different matrix sizes. To vary the matrix size we consider different numbers of spatial/temporal discretization points nx and nt, respectively, and different values of the CL truncation level N. For the Pauli basis we used the matrix slicing method proposed in [17] to determine the number of terms, while for the sigma basis we used the analytical expressions derived above. Figure 3 shows the comparison and highlights the significantly better scaling of the sigma basis. For example, for nx = nt = 4, N = 4, LCU with the Pauli basis generates 276,000 terms whereas the sigma basis gives rise to only 111 terms. Furthermore, computation of the Pauli LCU decomposition became prohibitively time consuming as the problem size increased, as indicated by the missing data points for nx = nt = 8, 16 and N > 2.

Comparison of the number of LCU terms using sigma basis (blue curve) and Pauli basis (red curves) for different values of CL truncation levels and number of spatial/temporal discretization points nx and nt.
Finally, note that the sigma basis-based LCU decomposition analysis presented in this section, while developed in the context of a specific example, is general and can be readily extended to Carleman matrices arising from the linearization of any other polynomial nonlinear ODE.
In this section, we demonstrate the application of our nBVQPCO framework to solve the inverse problem described in Section 5.1. We take L = 0.5 and T = 0.35, and assume that the forcing function f(x, t) ≡ 0. The problem is discretized on a spatial grid with nx = 4 grid points, leading to Δx = 0.1. For time discretization we take nt = 8 time steps, resulting in Δt = 0.05. We consider an initial condition of the form u0(x) = sin(k(x − Δx)), scaled for simplicity so that ∥u0∥ = 1. To simulate the measurement data y(t), t ∈ [0, T], we take the measurement point to be xp = x2, where xi = iΔx. We integrate the nonlinear ODEs (84) and generate the measurement data y(ti) = u2(ti), ti = ih, i = 0, ⋯, nt − 1, using ν = 0.07. In our numerical studies we investigate truncation levels N = 1 and N = 2 for the CL.
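A hypothetical sketch of this data-generation step (our names and choices: forward Euler time stepping, an illustrative sin profile with a specific wavenumber, and `burgers_matrices` from the earlier sketch):

```python
import numpy as np

nx, nt, L, T = 4, 8, 0.5, 0.35
dx, h = L / (nx + 1), T / (nt - 1)              # dx = 0.1, h = 0.05
F1, F2 = burgers_matrices(nx, L, nu=0.07)       # "true" viscosity

x = dx * np.arange(1, nx + 1)                   # interior grid points x_1..x_nx
u = np.sin(2 * np.pi * (x - dx) / L)            # illustrative sin(k(x - dx)) profile
u = u / np.linalg.norm(u)                       # normalize so ||u0|| = 1

y = [u[1]]                                      # measurement at x_p = x_2 (0-based index 1)
for k in range(nt - 1):
    u = u + h * (F2 @ np.kron(u, u) + F1 @ u)   # forward Euler, f ≡ 0
    y.append(u[1])
```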
The nBVQPCO framework was implemented using the PennyLane software framework for quantum computing. PennyLane’s lightning.qubit device was used as the simulator, autograd interface was used as the automatic differentiation library and the adjoint method was used for gradient computations. We do not consider any shot, device or measurement noise throughout this study.
In this section we provide details related to VQLS implementation including the state preparation circuit, and the selected ansatz, optimizer, and LCU decomposition approach.
State Preparation: Here we describe the circuit to construct |ϕ〉, the normalized right-hand side of the linear system.
State preparation for u0 can be accomplished using RY, CNOT, and PauliZ gates as shown in Figure 4 for different values of nx. Next, starting from the state |0〉nx, we first create the state
Note that since ∥u0∥ = 1 and

Quantum circuit to prepare the initial condition u0 Eq. (95) for nx = 4 and nx = 8 grid points.

Quantum circuit to prepare the state

Sequential construction of the quantum circuit to prepare the desired state
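As a minimal PennyLane illustration of this state preparation step, one can amplitude-encode the normalized initial condition with a generic routine (our sketch; the paper instead uses the hand-built RY/CNOT/PauliZ circuits of Figures 4–6, which we do not reproduce here):

```python
import numpy as np
import pennylane as qml

n_qubits = 2                                  # log2(nx) qubits encode nx = 4 amplitudes
dev = qml.device("lightning.qubit", wires=n_qubits)

u0 = np.sin(np.pi * np.arange(1, 5) / 5)      # illustrative grid profile
u0 = u0 / np.linalg.norm(u0)                  # normalized, as required for a state

@qml.qnode(dev)
def prepare_u0():
    # Generic amplitude encoding in place of the explicit Figure 4 circuit.
    qml.StatePrep(u0, wires=range(n_qubits))
    return qml.state()
```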
Ansatz: We use a modified version of the ansatz circuit 9 from [20] with 3 repetitions inspired by the work in [12]. This is shown in Figure 7.

Modified real version of circuit 9 from [20] that is used as the VQLS ansatz.
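For orientation, here is a generic sketch of a real-valued hardware-efficient ansatz in this spirit, i.e., layered RY rotations with a CZ entangling chain (our assumption; the exact gate pattern of the paper's modified circuit 9 is given in Figure 7 and [20] and may differ):

```python
import pennylane as qml

def ansatz(theta, n_qubits, reps=3):
    """Sketch of a real-valued layered ansatz: an initial RY layer followed
    by `reps` blocks of a CZ entangling chain plus another RY layer."""
    theta = theta.reshape(reps + 1, n_qubits)
    for w in range(n_qubits):
        qml.RY(theta[0, w], wires=w)
    for r in range(1, reps + 1):
        for w in range(n_qubits - 1):
            qml.CZ(wires=[w, w + 1])
        for w in range(n_qubits):
            qml.RY(theta[r, w], wires=w)
```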
LCU Decomposition: As discussed in Section 5.2, we use the sigma basis-based tensor product decomposition due to its computational efficiency compared to the Pauli basis for our application. Since the sigma basis elements are non-unitary, we used circuits based on unitary completion [14] to implement the Hadamard tests required for computing the VQLS cost functions.
Optimizer: There are several choices of optimizers that can be used with VQLS [21]. We used the Adagrad optimizer from PennyLane with a step size of 0.8. The convergence criteria for VQLS were set to a maximum of 200 iterations. The optimizer was initialized randomly with samples taken from the Beta distribution with shape parameters α = β = 0.5.
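A minimal sketch of this optimization setup in PennyLane (our illustration; `vqls_cost` is an assumed helper returning the local VQLS cost for given ansatz parameters, and `n_params` matches the ansatz sketch above):

```python
import numpy as np
import pennylane as qml
from pennylane import numpy as pnp

n_params = (3 + 1) * 2                        # hypothetical parameter count
rng = np.random.default_rng(0)
theta = pnp.array(rng.beta(0.5, 0.5, size=n_params), requires_grad=True)

opt = qml.AdagradOptimizer(stepsize=0.8)
for it in range(200):                         # convergence criterion: max 200 iterations
    theta, cost = opt.step_and_cost(vqls_cost, theta)
```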
VQLS Results: We study the convergence of VQLS for two Carleman truncation levels N = 1, 2 using the local VQLS cost function. The VQLS results and the convergence of the VQLS cost functions are shown in Figure 8. The VQLS solution is compared with the solution obtained by classically solving the underlying linear system (29). A qualitative comparison of the two solutions indicates that VQLS produces reasonable solutions to the problem. To characterize the VQLS solution quality we define the aggregated solution error as the time average of the norm of the normalized solution error at each time step, i.e.,
$$\varepsilon = \frac{1}{n_t} \sum_{k=0}^{n_t - 1} \left\| \frac{\hat{w}_k}{\|\hat{w}_k\|} - \frac{w_k}{\|w_k\|} \right\|, \qquad (97)$$
where $w_k$ and $\hat{w}_k$ denote the classically computed and VQLS solutions at time step k, respectively.

Comparison of normalized solution, obtained classically and via VQLS, of the implicit linear system (29) for the inverse Burgers problem with ν = 0.07 and for two CL truncation levels.
Next we study the performance of the nBVQPCO framework. The true optimum of the inverse problem is ν* = 0.07 as we used that value to generate the simulated measurement data y(t). Since the optimization variable is a scalar, we use exhaustive search as the black box optimizer. In this approach we uniformly sample ν in the range [νmin, νmax], use CL+VQLS to generate the normalized solution and compute the design cost (92).
As the curves in Fig. 9b indicate, our algorithm returns the minimum solution at ν = 0.06 for both CL truncation levels N = 1 and N = 2, which is close to the true optimal value of ν* = 0.07. The value of R2 in Eq. (48) ranges from 2.46 to 18 over the range of ν values considered here. Despite being well outside the bound in Eq. (48), CL works well in this practical use case.

(a) CL+VQLS solution error ε (97) for the inverse Burgers problem for different values of ν and two CL truncation levels N = 1 and N = 2. (b) Design cost (Eq. (92)) as a function of ν obtained via our nBVQPCO framework for CL truncation levels N = 1 and N = 2. Also marked are the ν values where the cost takes its smallest value, which is ν = 0.06 for both truncation levels. The true optimum is known to be ν = 0.07.
In this section we discuss various pros/cons of the nBVQPCO framework and a path toward fully fault-tolerant quantum computing implementation. While the nBVQPCO framework is well suited for NISQ implementation, like other variational quantum algorithms it is difficult to subject it to a fully theoretical complexity analysis. As pointed out in Remark 8, it will be worthwhile to analyze VQLS empirical scaling behavior for sparse and structured matrices. Furthermore, the error and computational complexity results we presented depend on the condition in Eq. (48), which can be conservative and too restrictive to be satisfied in practical applications. As we pointed out earlier, for CFD type applications, one can work with alternative forms of conservation laws, e.g., LBM [10], for which the condition in Eq. (48) can more readily be met in practical regimes of interest. Additionally, one can employ a rescaling technique and higher-order discretization schemes proposed in [19] to further improve computational advantage when using the Carleman linearization framework.
One of the bottlenecks in the nBVQPCO framework is the computation of VQLS cost function which scales polynomially with the number of LCU terms. By replacing conventionally employed Pauli basis with sigma basis, we showed in our application that the number of LCU terms scales more favorably. Ideally, one would like this scaling to have a polylogarithmic dependence on the matrix size, and further enhancing the sigma basis type approaches would be necessary to make the proposed framework scalable. Other variational quantum algorithms such as variational quantum eigensolver (VQE), and quantum approximate optimization algorithm (QAOA) can also significantly benefit from such efficient LCU decompositions.
In the nBVQPCO framework one can replace VQLS with the quantum linear systems algorithm (QLSA) [22–24]. This could be beneficial since QLSA has provable exponential advantage over classical linear system solvers and thus comes with rigorous guarantees unlike variational methods. However, QLSA-based implementation is expected to have high quantum resource needs. For instance, in [11], the authors obtained a detailed quantum resource estimate for implementing the CL-based LBM simulation of incompressible flow fields and concluded that the QLSA-based framework can only be feasible in a fault-tolerant quantum computing setting. This study also highlighted that dominant quantum resource needs arise due to the block encoding step which is required to implement the oracle in QLSA. Block encoding in fault-tolerant implementation can be thought of as an analogous step to LCU decomposition in NISQ implementation. Techniques like sigma basis which exploit sparsity and structure for more efficient LCU decomposition can also be potentially leveraged to make block encoding schemes more efficient.
In this paper we presented nBVQPCO, a novel variational quantum framework for nonlinear PDE constrained optimization problems. The proposed framework utilizes Carleman linearization, the VQLS algorithm, and a black box optimizer nested in a bi-level optimization structure. We presented a detailed computational error and complexity analysis to establish potential benefits of our framework over classical techniques under certain assumptions. We demonstrated the framework on an inverse problem and presented simulation results. The analysis and results demonstrate the correctness of our framework for solving nonlinear simulation-based design optimization problems.
Future work will involve studying and mitigating the effect of device/measurement noise and making the nBVQPCO framework robust. It will also be important to study the scalability of the framework by applying it to larger problem sizes and implementing it on quantum hardware. It will also be beneficial to explore the application of our framework with other PDE discretization approaches such as finite volume and finite element methods, and extend the error and complexity analysis for higher-order temporal discretization schemes. Finally, exploring and refining the framework to better exploit sparsity/structure and extension to fault-tolerant setting as discussed above are also avenues for future research.