We present a variational quantum framework for optimization problems constrained by nonlinear partial differential equations (PDE). This extends the recently proposed bi-level variational quantum PDE constrained optimization (BVQPCO) framework for linear PDEs in [1] to a nonlinear setting. This is an important extension as many science and engineering applications such as aerodynamics, computational fluid dynamics (CFD), material science, computer vision, and inverse problems necessitate the solution of optimization problems constrained by a system of nonlinear PDE. Examples include Euler and Navier-Stokes PDEs and heat equation in fluid dynamics, combustion, and weather forecasting; magnetohydrodynamics and Vlasov-Maxwell PDEs in plasma dynamics and astrophysics; wave equation in structural mechanics; Black–Scholes PDE in finance; and reaction-diffusion PDEs in chemistry, biology, and ecology to name a few. Closed-form solutions are generally unavailable for PDE constrained optimization problems, necessitating the development of numerical methods [2–4]. A variety of classical gradient-based and gradient-free numerical methods have been developed, and rely on repeated calls to the underlying PDE solver. Since PDE simulations are computationally expensive, using them within the design/optimization loop can become a bottleneck.
To extend the BVQPCO framework to nonlinear PDEs we employ the Carleman linearization (CL) framework. The CL framework allows one to transform polynomial ordinary differential equations (ODE), i.e., ODEs with a polynomial vector field, into an infinite system of linear ODEs [5]. For instance, such polynomial ODEs naturally arise when PDEs, e.g., those mentioned above, are semi-discretized in the spatial dimensions. By truncating the CL system to a finite order, one obtains a finite system of linear ODEs, and under suitable conditions, the impact of the truncation error can be characterized [6,7]. CL has become an important tool in developing quantum algorithms for simulating polynomial ODEs on quantum devices since the seminal work [8]. Specifically, assuming $R_2 < 1$, where $R_2$ is a parameter characterizing the ratio of the nonlinearity and forcing to the linear dissipation, the query/gate complexity of the algorithm proposed in [8] takes the form
The main contributions of this work are as follows:
We formulate a bi-level variational quantum framework, referred to as nonlinear BVQPCO (nBVQPCO), for solving nonlinear PDE constrained optimization problems as discussed above. While the original optimization problem is formulated in terms of an unnormalized PDE solution, we show how the problem can be transformed under certain conditions into an equivalent form which depends only on the normalized PDE solution. This is significant since VQLS, like most quantum algorithms, generates only a normalized solution of the linear system; obtaining the unnormalized solution would otherwise have required substantial overhead to additionally estimate the norm of the solution.
We use the sigma basis [14], an alternative tensor product decomposition that exploits the sparsity/structure of the linear system arising from PDE discretization to facilitate VQLS computations. This results in a substantial reduction in the number of terms in the linear combination of unitaries (LCU) decomposition necessary for VQLS cost function evaluations, and thus leads to a significant computational advantage compared to the Pauli basis traditionally used for the LCU decomposition.
We present a detailed computational error analysis for CL+VQLS-based solution of polynomial ODE and prove that under suitable assumptions the CL+VQLS approach can generate a normalized solution of a given ODE to an arbitrary accuracy. To the best of our knowledge such a rigorous analysis of CL+VQLS approach has not been presented before in the literature. Furthermore, by combining this error analysis with the empirically known results for the run time of VQLS, we assess potential utility of our nBVQPCO framework over equivalent classical methods.
We implement the nBVQPCO framework using the PennyLane library and apply it to solve a prototypical inverse problem applied to nonlinear Burgers’ PDE which involves calibrating the PDE model parameter to match the given measurements.
The paper is organized as follows: We start with mathematical preliminaries in Section 2 and introduce the nonlinear PDE constrained optimization problem in Section 3.1. We describe the CL+VQLS procedure in Section 3.2 and develop the nBVQPCO framework extension including an outline of pseudo-code for its implementation in Section 3.3. Complexity and error analysis of the proposed framework are covered in Section 4. In Section 5 we formulate the inverse PDE problem in our nBVQPCO framework, apply an alternative tensor product decomposition for computing VQLS cost function, and describe the implementation details and numerical results using the PennyLane library. We conclude in Section 7 with avenues for future work.
Let ℕ = {1, 2,…}, ℝ, and ℂ be the sets of positive integers, real numbers, and complex numbers, respectively. We will denote vectors by small bold letters and matrices by capital bold letters. A† and A′ will denote the vector/matrix complex conjugate and vector/matrix transpose, respectively. Tr(A) will denote the trace of a matrix. We will represent an identity matrix of size s × s by Is.
Kronecker product will be denoted by ⊗. For any pair of vectors $x \in \mathbb{R}^n$ and $y \in \mathbb{R}^m$, their Kronecker product $w = x \otimes y \in \mathbb{R}^{nm}$ is
$$w = (x_1 y_1, \ldots, x_1 y_m, x_2 y_1, \ldots, x_n y_m)'.$$
Similarly, for $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{p \times q}$, their Kronecker product $A \otimes B \in \mathbb{R}^{mp \times nq}$ is defined blockwise as the matrix whose $(i, j)$-th block is $a_{ij} B$.
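For concreteness, both products can be computed with NumPy's `kron`; a minimal illustration:

```python
import numpy as np

x = np.array([1.0, 2.0])          # x in R^2
y = np.array([3.0, 4.0, 5.0])     # y in R^3
w = np.kron(x, y)                 # w = x ⊗ y in R^6: [3, 4, 5, 6, 8, 10]

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.eye(2)
AB = np.kron(A, B)                # A ⊗ B is 4 x 4, blockwise a_ij * B
```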
The standard inner product between two complex vectors $x, y \in \mathbb{C}^n$ will be denoted by $\langle x, y \rangle$. The $l_p$-norm in the Euclidean space $\mathbb{R}^n$ will be denoted by $\| \cdot \|_p$, $p = 1, 2, \ldots, \infty$, and is defined as
$$\|x\|_p = \Big(\sum_{i=1}^{n} |x_i|^p\Big)^{1/p}, \qquad \|x\|_\infty = \max_i |x_i|.$$
For the norm of a matrix $A \in \mathbb{R}^{n \times m}$, we use the induced norm, namely
$$\|A\|_p = \sup_{x \neq 0} \frac{\|Ax\|_p}{\|x\|_p}.$$
We will use $p = 2$, i.e., the $l_2$ norm for vectors and the spectral norm for matrices, unless stated otherwise. The spectral radius of a matrix, denoted by $\rho(A)$, is defined as
$$\rho(A) = \max_i |\lambda_i(A)|,$$
where $\lambda_i(A)$ are the eigenvalues of $A$.
The trace norm of a matrix $A$ is defined as
$$\|A\|_{\mathrm{Tr}} = \mathrm{Tr}\big(\sqrt{A^\dagger A}\big).$$
Let $\mathcal{F}$ be a vector space of real vector-valued functions $u(t) : [0, T] \to \mathbb{R}^n$ defined on $[0, T]$. Then the inner product between $u_1, u_2 \in \mathcal{F}$ is defined as
$$\langle u_1, u_2 \rangle = \int_0^T u_1'(t)\, u_2(t)\, dt.$$
We will use the standard braket notation, i.e., |ψ〉 and 〈ψ|, in representing quantum states [15]. The inner product between two quantum states |ψ〉 and |ϕ〉 will be denoted by 〈ψ|ϕ〉. For a vector x, we denote by $|x\rangle = x/\|x\|$ the corresponding normalized quantum state.
We consider a general class of PDE constrained design optimization problems of the form [16]
When semi-discretized in space, the PDE (10) and associated boundary/initial conditions (11) result in a polynomial ODE of the form
The cost function is a quadratic function of the solution u(t), t ∈ [0, T], i.e.
Under these assumptions, the optimization problem (9) can be represented as
Assumption 1 is fairly general, as many PDEs arising in engineering and science, including the examples listed in the Introduction, result in polynomial ODEs when semi-discretized in space; see [9] and references therein. Furthermore, polynomial ODEs also arise directly in mechanics, molecular dynamics, chemical kinetics, epidemiology, social dynamics, and biological networks.
Assumption 2 commonly arises in many PDE constrained optimization problems. For instance in inverse problems the cost function is typically taken as a squared norm of difference between the solution and measured variables, which results in a quadratic form, see Section 5.1 for details. Other examples include PDE optimal control where the cost function is a weighted combination of deviation from a reference signal and squared norm of control input, aerodynamic design where the objective is to minimize heat transfer/drag or maximize lift, and computer vision problems such as shape from shading, surface reconstruction from sparse data, and optical flow computation where the objective function depends on an appropriate form of squared norm.
The BVQPCO framework proposed in [1] cannot be applied directly to the optimization problem (15)–(18), as the constraints are nonlinear. To address this we employ the CL framework, whereby the polynomial ODEs are transformed into an infinite-dimensional system of linear ODEs and truncated. Since it is always possible to map the k-th degree polynomial ODE (13) to a higher-dimensional quadratic polynomial ODE and then apply CL [9], without loss of generality we restrict to k = 2 throughout this paper.
Consider a system of inhomogeneous quadratic polynomial ODEs (16) with k = 2, i.e.,
$$\frac{du}{dt} = F_2\, u^{[2]} + F_1\, u + F_0(t), \qquad u(0) = u_{in}, \qquad (19)$$
where $u \in \mathbb{R}^n$, $F_2 \in \mathbb{R}^{n \times n^2}$, $F_1 \in \mathbb{R}^{n \times n}$, and $F_0(t) \in \mathbb{R}^n$.
Below we describe the steps required to transform the solution of these nonlinear ODEs into a form amenable to VQLS, see Figure 1 for illustration of the steps involved.
- Step 1:
For CL we introduce variables $w_i = u^{[i]} \in \mathbb{R}^{n^i}$, $i \in \mathbb{N}$, where $u^{[i]} = u \otimes \cdots \otimes u$ denotes the $i$-fold Kronecker power, which satisfy
$$\frac{dw_i}{dt} = \sum_{j=0}^{2} A^{i}_{i+j-1}\, w_{i+j-1}, \qquad (20)$$
where $A^{i}_{i+j-1} \in \mathbb{R}^{n^i \times n^{i+j-1}}$ is given by
$$A^{i}_{i+j-1} = \sum_{\nu=1}^{i} I_n^{\otimes (\nu-1)} \otimes F_j \otimes I_n^{\otimes (i-\nu)}, \qquad (21)$$
with $0 \le j \le 2$. Note that (20) defines an infinite-dimensional system of linear ODEs with state $(w_1', w_2', \ldots)'$ and initial condition $w_i(0) = u_{in}^{[i]}$, and is known as CL. For any finite $N \in \mathbb{N}$ define
$$w = (w_1', \ldots, w_N')' \in \mathbb{R}^{N_c}, \qquad (22)$$
where $N_c = \sum_{i=1}^{N} n^i$. The CL, when truncated to order $N$, results in a finite system of linear ODEs
$$\frac{d\hat{w}(t)}{dt} = A_N(t)\,\hat{w}(t) + b_N(t), \qquad \hat{w}(0) = \hat{w}_{in} = \big(u_{in}', (u_{in}^{[2]})', \ldots, (u_{in}^{[N]})'\big)', \qquad (23)\text{–}(25)$$
where $A_N(t)$ is the block tridiagonal matrix built from the blocks $A^i_{i+j-1}$, $b_N(t) = (F_0'(t), 0, \ldots, 0)'$, and we have dropped the dependence on $t$ and $p$ in the RHS terms for brevity. Note that $\hat{w}(t)$ approximates $w(t)$ as defined in Eq. (22), and the error incurred as a function of the truncation level $N$ is described by Lemma 1.

Illustration of steps involved in transforming a quadratic polynomial ODE (19) into a system of linear algebraic equations (30) amenable to VQLS. This involves CL and truncation resulting in a system of linear ODEs, time discretization of linear ODEs to generate a system of linear difference equations, and expressing difference equation into a single system of linear algebraic equations with normalized solution and right-hand side. While an explicit time discretization is shown in the figure, one can use implicit Euler discretization or other schemes as well.
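To make Step 1 concrete, the following NumPy sketch (our own illustration, with our naming conventions, not the paper's code) assembles the truncated Carleman matrix $A_N$ and forcing $b_N$ from Eqs. (20)–(21), assuming time-independent $F_0$, $F_1$, $F_2$ stored as `F = [F0, F1, F2]` with `F0` an $n \times 1$ array:

```python
import numpy as np

def transfer_matrix(i, j, F):
    """Carleman transfer block A^i_{i+j-1} of Eq. (21):
    sum over nu of I^(nu-1) ⊗ F_j ⊗ I^(i-nu)."""
    n = F[1].shape[0]                       # F[j] maps R^{n^j} -> R^n
    A = np.zeros((n**i, n**(i + j - 1)))
    for nu in range(1, i + 1):
        A += np.kron(np.kron(np.eye(n**(nu - 1)), F[j]), np.eye(n**(i - nu)))
    return A

def carleman_system(F, N):
    """Assemble the block-tridiagonal A_N and forcing b_N of the truncated CL."""
    n = F[1].shape[0]
    dims = [n**i for i in range(1, N + 1)]
    offs = np.cumsum([0] + dims)
    Nc = offs[-1]
    A_N = np.zeros((Nc, Nc))
    for i in range(1, N + 1):
        for j in (0, 1, 2):
            col = i + j - 1                 # block index of w_{i+j-1}
            if 1 <= col <= N:
                A_N[offs[i-1]:offs[i], offs[col-1]:offs[col]] += transfer_matrix(i, j, F)
    b_N = np.zeros(Nc)
    b_N[:n] = F[0].ravel()                  # forcing enters only the w_1 block
    return A_N, b_N
```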
- Step 2:
Applying the forward Euler method to the system (23) with step size h = T/M, M ∈ ℕ, results in a set of difference equations
$$\hat{w}_{k+1} = \big(I + A_N(kh)\,h\big)\,\hat{w}_k + h\, b_N(kh), \qquad (26)$$
where $\hat{w}_k$ approximates $\hat{w}(kh)$, as defined in Eq. (25), for $k \in \{0, \cdots, M-1\}$, with $\hat{w}_0 = \hat{w}_{in}$. The error introduced by the Euler discretization is characterized by Lemma 2.
- Step 3:
The iterative system (26) can be expressed as a single system of linear equations
$$\tilde{A}\,\tilde{w} = \tilde{b}, \qquad (27)$$
where $\tilde{w} = (\hat{w}_0', \ldots, \hat{w}_M')' \in \mathbb{R}^{N_c(M+1)}$, $\tilde{A}$ is block lower bidiagonal with identity blocks on the diagonal and $-(I + A_N h)$ blocks on the sub-diagonal, and $\tilde{b} = (\hat{w}_{in}', h\,b_N', \ldots, h\,b_N')'$.
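A matching sketch of Step 3 (our illustration, assuming time-independent $A_N$ and $b_N$), stacking the recursion (26) into the single system (27):

```python
import numpy as np

def forward_euler_system(A_N, b_N, w_in, h, M):
    """Stack the forward-Euler recursion (26) into one linear system (27).
    Block row k+1 enforces w_{k+1} - (I + h A_N) w_k = h b_N."""
    Nc = A_N.shape[0]
    dim = Nc * (M + 1)
    A_tilde = np.eye(dim)
    b_tilde = np.zeros(dim)
    b_tilde[:Nc] = w_in                     # first block row: w_0 = w_in
    E = np.eye(Nc) + h * A_N                # forward-Euler step matrix
    for k in range(M):
        r, c = (k + 1) * Nc, k * Nc
        A_tilde[r:r + Nc, c:c + Nc] = -E
        b_tilde[r:r + Nc] = h * b_N
    return A_tilde, b_tilde
```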
- Step 2:
Similarly, applying the backward Euler method to the system (23) with step size h = T/M yields
$$\big(I - A_N((k+1)h)\,h\big)\,\hat{w}_{k+1} = \hat{w}_k + h\, b_N((k+1)h), \qquad (28)$$
where $\hat{w}_k$ approximates $\hat{w}(kh)$, for $k \in \{0, \cdots, M-1\}$.
- Step 3:
The iterative system (28) can similarly be expressed as a single system of linear equations
$$\tilde{A}\,\tilde{w} = \tilde{b}, \qquad (29)$$
where now $\tilde{A}$ has the identity in its first diagonal block, $(I - A_N h)$ in the remaining diagonal blocks, and $-I$ blocks on the sub-diagonal, with $\tilde{b} = (\hat{w}_{in}', h\,b_N', \ldots, h\,b_N')'$.
In the VQLS framework the linear system obtained using forward Euler (Eq. (27)) or backward Euler (Eq. (29)) is further transformed into the form
$$\tilde{A}\,|x\rangle \propto |\phi\rangle, \qquad |\phi\rangle = \tilde{b}/\|\tilde{b}\|.$$
The optimal parameters $\theta^*$ then prepare a solution
$$|\tilde{x}\rangle = V(\theta^*)|0\rangle \approx |x\rangle,$$
where $V(\theta)$ is the selected ansatz.
Different cost functions have been proposed for the VQLS algorithm [13]. These include the global cost function $C_{ug}(\theta)$ and its normalized version $C_g(\theta)$:
$$C_{ug}(\theta) = \langle x(\theta)|\tilde{A}^\dagger \big(I - |\phi\rangle\langle\phi|\big) \tilde{A} |x(\theta)\rangle, \qquad C_g(\theta) = \frac{C_{ug}(\theta)}{\langle x(\theta)|\tilde{A}^\dagger \tilde{A}|x(\theta)\rangle}.$$
Linear Combination of Unitaries Decomposition: The square matrix $\tilde{A}$ can be decomposed as a linear combination of unitary operators $\tilde{A}_i$ with complex scalar coefficients $\alpha_i$ as
$$\tilde{A} = \sum_{i=1}^{L} \alpha_i\, \tilde{A}_i.$$
One popular approach for the unitary operators is based on the Pauli basis formed from the identity I and the Pauli gates X, Y, and Z. A matrix $\tilde{A}$ of size $2^n \times 2^n$ can then be written as a linear combination of elements selected from the basis set
$$\mathcal{P}_n = \{I, X, Y, Z\}^{\otimes n}.$$
In this basis, each $\tilde{A}_i \in \mathcal{P}_n$ and its corresponding coefficient $\alpha_i$ can be determined via different numerical approaches [17]. More recently, an alternative tensor product decomposition has been proposed which uses different basis elements to better exploit the underlying structure and sparsity of matrices, resulting in a more efficient decomposition [14,18]. For example, for the matrices arising from discretization of the Poisson equation [18] and the heat equation [14], the number of LCU terms under the sigma basis scales only logarithmically, i.e., as $\mathcal{O}(n)$ with the matrix size $2^n \times 2^n$, compared to the Pauli basis for which the number of terms can vary from $\mathcal{O}(2^n)$ (diagonal matrices) to $\mathcal{O}(2^{2n})$ (dense matrices).
The sigma basis set comprises the following operators
$$S = \{I, \sigma_+, \sigma_-, \sigma_+\sigma_-, \sigma_-\sigma_+\},$$
where $\sigma_+ = |0\rangle\langle 1|$ and $\sigma_- = |1\rangle\langle 0|$.
Even though some of the operators in S are non-unitary, using the concept of unitary completion, one can still design efficient quantum circuits for computing the global/local VQLS cost functions, see [14] for details. We have used this alternate decomposition in the numerical study in Section 5.
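To make the Pauli-basis alternative concrete, here is a brute-force sketch of the Pauli LCU (our illustration; the paper instead uses the more efficient matrix slicing method of [17]). It evaluates the standard coefficient formula $\alpha_P = \mathrm{Tr}(P\tilde{A})/2^n$ and is only viable for tiny matrices, since it enumerates all $4^n$ Pauli strings:

```python
import itertools
import numpy as np

PAULI = {"I": np.eye(2), "X": np.array([[0, 1], [1, 0]]),
         "Y": np.array([[0, -1j], [1j, 0]]), "Z": np.diag([1.0, -1.0])}

def pauli_lcu(A, tol=1e-12):
    """Brute-force Pauli LCU of a 2^n x 2^n matrix: alpha_P = Tr(P A) / 2^n."""
    n = int(np.log2(A.shape[0]))
    terms = {}
    for label in itertools.product("IXYZ", repeat=n):
        P = PAULI[label[0]]
        for s in label[1:]:
            P = np.kron(P, PAULI[s])
        alpha = np.trace(P @ A) / 2**n      # Paulis are Hermitian, so P† = P
        if abs(alpha) > tol:
            terms["".join(label)] = alpha
    return terms
```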
To solve the optimization problem (15)–(18) in a variational quantum framework, we extend the BVQPCO framework to the nonlinear setting: the outer optimization level iteratively selects $p_k$ using a classical black box optimizer based on the cost function evaluated from the solution of the truncated CL system obtained via VQLS. To do so, let $K_0 \in \mathbb{R}^{n \times N_c(M+1)}$ and $K_f \in \mathbb{R}^{n(M+1) \times N_c(M+1)}$ be matrices such that
Thus, the optimization problem (15)–(18) can be expressed in a variational quantum form as
Note that the above quantum formulation, unlike the original optimization problem (15)–(18), depends only on the normalized PDE solution. This is significant since VQLS, like most quantum algorithms, generates only a normalized solution of the linear system; obtaining the unnormalized solution would otherwise require substantial overhead to additionally estimate the norm of the solution.
nBVQPCO Algorithm.
Input: Semi-discretized ODE in the form (19), cost function (14),
Output: Optimal parameters: p*
1: Initialize k = 0 and p0 = pin.
2: Apply CL to (19) with truncation level N and determine Ã(p) and b̃(p).
3: Determine the unitary Uϕ to prepare |ϕ〉, and the LCU decompositions for Φf and Φ0.
4: while stopping criteria not met do
5: Compute the LCU decomposition {αi(pk), Ãi} of Ã(pk).
6: Compute the normalized solution |x̃(pk)〉 of the linear system via VQLS.
7: Evaluate the design cost at pk from the VQLS solution using the quantum circuits described below.
8: Use classical black box optimizer to select next pk+1 subject to constraints
9: Check the stopping criterion, e.g., whether S(pk, pk+1) ≤ ϵ.
10: k ← k +1
11: end while
12: Return pk
The pseudo-code for nBVQPCO is summarized in Algorithm 1, and Figure 2 shows the overall flow diagram. Several remarks follow:
Note that VQLS provides solution of the given linear system only up to a normalization constant. By reformulating the cost function expression as discussed above one can make the cost function evaluation independent of the normalization constant.
The inputs for VQLS in line (6) include the linear combination of unitaries (LCU) decomposition of Ã(p), the unitary Uϕ which prepares |ϕ〉, the selected ansatz V(θ), the stopping threshold γ, and the number of shots nsh used in VQLS cost function evaluation. See Section 5 for the details. From the relations (60), one can select γ such that stopping at Cg ≤ γ results in a VQLS approximation error within a prescribed tolerance; see Lemma 4.

For the LCU decomposition a popular approach is to use the Pauli basis as discussed above. We propose to employ a more efficient tensor product decomposition [14], which exploits the underlying structure and sparsity of matrices such as those arising from PDE discretizations, see Section 5.
Depending on how à depends on the parameters p, a parameter-dependent LCU decomposition {αi(p), Ãi} can be pre-computed once, thus saving computational effort, see Section 5.1 for an example.
For computing the design cost terms in line 7, one can use the unitary Uϕ as computed in line (3) and the quantum circuit for the SWAP test. Similarly, given the LCU decompositions of Φf and Φ0 as computed in line (3), one can express and compute each term in the above sums using the quantum circuit associated with the Hadamard test.

For the outer optimization one can utilize any global black box optimization method [1]; a minimal grid-search sketch is given below. For instance, one can sample the optimization variables p on a set of predetermined grid points, evaluate the cost function at those points, and select the grid point with the minimum cost. Alternatively, one can use more adaptive techniques such as Bayesian optimization (BO), a sequential design strategy for global optimization of black-box functions which uses an exploration/exploitation trade-off to find the optimal solution with a minimum number of function calls.
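As a minimal illustration of the simplest such outer loop (our sketch; `design_cost` is an assumed callable wrapping lines 5–7 of Algorithm 1, i.e., the CL+VQLS-based cost evaluation):

```python
import numpy as np

def outer_grid_search(p_grid, design_cost):
    """Exhaustive-search outer loop: evaluate the CL+VQLS-based design cost
    at predetermined grid points and return the minimizing parameter."""
    costs = np.array([design_cost(p) for p in p_grid])
    k = int(np.argmin(costs))
    return p_grid[k], costs[k]
```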
For convergence, line 9 uses a step size tolerance, i.e., S(pk, pk+1) = ∥pk+1 − pk∥. However, other conditions can be employed, such as a functional tolerance, i.e., the algorithm is terminated when the change in design cost is within a tolerance ϵ.

In Section 4 we provide a detailed computational error analysis for the CL+VQLS-based solution of polynomial ODEs. Since there are no theoretical results available for the run time of VQLS, we employ empirically known results along with our rigorous CL+VQLS error analysis to assess the potential advantage of the nBVQPCO framework over classical methods.

Top: Schematic showing transformation of a nonlinear PDE constrained optimization problem into a variational quantum form. Bottom: Flow diagram of the nBVQPCO framework. VQLS uses an inner level optimization to solve the linear system constraints, arising from the discretization of the underlying PDEs, for given design parameters, and evaluate the quantities related to design cost/objective function. A black box optimizer is used for the outer level optimization to select next set of parameters values based on the evaluated design cost.
Consider a system of inhomogeneous quadratic polynomial ODEs (19)
See also Figure 1, which illustrates the relationship between these different variables. We make the following assumptions.
For the system (46)
$F_1$, $F_2$ are time-independent matrices.
The spectral norms $\|F_1\|$, $\|F_2\|$, and $\|F_0\| = \max_{t \in [0,T]} \|F_0(t)\|$ are known, and
$F_1$ is diagonalizable with eigenvalues $\lambda_i$ of $F_1$ satisfying $\mathrm{Re}(\lambda_n) \le \cdots \le \mathrm{Re}(\lambda_1) < 0$.
Let
$$R_2 = \frac{1}{|\mathrm{Re}(\lambda_1)|}\left(\|u_{in}\|\,\|F_2\| + \frac{\|F_0\|}{\|u_{in}\|}\right) < 1. \qquad (48)$$
Condition (48) can be conservative and too restrictive to be satisfied in practical applications. As pointed out in the Introduction, for CFD-type applications, one can work with alternative form of conservation laws, e.g., LBM [10], for which the condition Eq. (48) can more readily be met in practical regimes of interest. Additionally, one can employ a rescaling technique proposed in [19] to further improve computational advantage when using the Carleman linearization framework.
Under the condition (48), system (46) can be rescaled
Note that under this rescaling, the value of R2 in (48) remains unchanged.
Under Assumption 3, Lemmas 1 – 3 hold, see [8] for the details.
The error
Suppose that the error η(t), as introduced in Lemma 1, satisfies
The stiffness κ of à in (Eq. (27)), recall arising from application of steps 1–3 to the ODE (46), is bounded as follows
Consider a linear system
The proof of the above result can be found in [13].
We assume that the run time of VQLS scales as
$$\mathcal{O}\big((\log \tilde{N})^{8.5}\, \kappa\, \log(1/\epsilon)\big),$$
where $\tilde{N}$ denotes the size of the linear system, $\kappa$ its condition number, and $\epsilon$ the desired solution accuracy.
Note that the above assumption is based on an empirical scaling study in [13] and no theoretical guarantees are available. While using Assumption 4 for a given application, one needs to be careful about how well the underlying empirical setup used to obtain this run time scaling of VQLS is met for that application.
The following empirical setup underlies Assumption 4 (see Section 2.2 in [13]).
The run time of the VQLS solution is characterized in terms of time-to-solution, which refers to the number of exact cost function evaluations during the optimization needed to guarantee that the solution error is below a specified value. Since the true solution of the linear system is not known, the value of the cost computed during VQLS, combined with the error bounds (60), is used to determine the worst-case error for the stopping criterion.
Randomly sampled sparse matrices were used for which the number of LCU terms scales as $\mathcal{O}((\log N)^2)$.
Normalized local VQLS cost function was used.
A problem efficient ansatz was used and time-to-solution was averaged over several VQLS runs.
VQLS was implemented with exact sampling, i.e., no finite sampling or shots were used for the computation of the cost function. Furthermore, no hardware noise was considered.
As discussed in Remark 4, we will use the term run time to refer to time-to-solution throughout the rest of the paper unless otherwise stated.
The trace norm and l2 norm between |ψ〉 and |ϕ〉 are related as follows
See Lemma 2 in [1] for the proof.
Consider the ODE system (46). Then under Assumption 3, for any given ϵ > 0, one can choose the CL truncation level N (see Eq. (A9)) and Euler discretization step size h (see Eq. (A5)) such that
The proof of the above theorem is given in Appendix A.1.
Consider the ODE system (46). Then under Assumption 3, for any given 0 < ϵ < 1, one can choose CL truncation level N, Euler discretization step size h and VQLS stopping threshold γ such that
Proof. From Theorem 1, choose N, h such that
Let γ be
Thus, using relations (65) and (67)
Furthermore, by Assumption 4 the run time of VQLS behaves as $\mathcal{O}\big((\log(N_c(M+1)))^{8.5}\, \kappa\, \log(1/\epsilon')\big)$, which using Lemma 3 can be simplified as follows
Theorem 1 and Theorem 2 can also be applied to higher-order polynomial systems (13), by replacing Assumption 3 with Assumption 5.
For the system (13)
Let Fi, i = 0,…, k be time independent matrices with bounded norms.
Assume $F_1$ is diagonalizable and its eigenvalues satisfy the decay condition in Eq. (70), where the associated Carleman transfer matrix is given by Eq. (71), which is a generalization of Eq. (21); see [9] for details. Let $R_k$, the degree-k analogue of $R_2$ defined in Eq. (72), satisfy $R_k < 1$.

To apply the nBVQPCO framework to the higher-order polynomial case, one can either transform (13) to the form (19) and then follow the procedure described in Section 3.2, or apply CL in Step 1 directly to (13) and then follow the subsequent steps. As shown in [9], the latter is equivalent to the former but leads to a more compact CL representation.
Classically, a variety of methods can be used to solve the optimization problem (15)–(18). Consistent with the nBVQPCO framework, we use forward Euler to solve the ODE (46) for a given parameter p, compute the cost function, and wrap a black box optimizer, e.g., BO, around these steps.
Applying the forward Euler discretization scheme with step size h = T/M to (46) results in the set of nonlinear finite difference equations
$$\tilde{u}_{k+1} = \tilde{u}_k + h\big(F_2\, \tilde{u}_k^{[2]} + F_1\, \tilde{u}_k + F_0(kh)\big), \qquad \tilde{u}_0 = u_{in}, \qquad k = 0, \ldots, M-1. \qquad (73)$$
The next theorem shows that one can always choose a step size h so that ũ approximates uc to within a desired error.
Consider the ODE system (46). Then under Assumption 3, for any given ϵ > 0 one can choose h (see Eq. (A23)) such that
For the proof see Appendix A.2.
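A minimal classical-baseline sketch of the recursion (73), assuming time-independent $F_1$, $F_2$ and a callable forcing `F0(t)` (our names, not the paper's code):

```python
import numpy as np

def classical_forward_euler(F0, F1, F2, u_in, T, M):
    """Forward Euler for the quadratic ODE (46):
    u_{k+1} = u_k + h (F2 (u_k ⊗ u_k) + F1 u_k + F0(k h))."""
    h = T / M
    u = np.array(u_in, dtype=float)
    traj = [u.copy()]
    for k in range(M):
        u = u + h * (F2 @ np.kron(u, u) + F1 @ u + F0(k * h))
        traj.append(u.copy())
    return np.array(traj)
```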
Let s be the maximum of the sparsities of F1 and F2. The computational complexity of solving (73) thus scales as
On the other hand, based on Theorem 2, under certain conditions the run time of the VQLS-based explicit Euler scheme scales polylogarithmically w.r.t. the system size, see Eq. (64). Assuming that the number of outer loop iterations is similar for this classical method and the nBVQPCO framework, the run time of nBVQPCO will scale polylogarithmically with the system size, and thus could provide a significant computational advantage for simulation-based design problems.
In the comparison above, note that we have not accounted for the computational effort of the LCU decomposition required within the nBVQPCO framework (see line 5 in Algorithm 1). This, however, may not be a significant overhead. As pointed out earlier, depending on how Ã depends on the parameters p, a parameter-dependent LCU decomposition {αi(p), Ãi} can be pre-computed once, thus saving computational effort; we show this via an example in Section 5.1. Moreover, it may be possible to derive the LCU decomposition analytically and implement it efficiently, e.g., when using the sigma basis, see Section 5.2.
Finally, we would like to remind the reader that the conclusion that nBVQPCO could be advantageous over equivalent classical methods rests on the strong Assumption 4 regarding VQLS scaling. Recall that this scaling was determined empirically using random matrices whose LCU decomposition admits $\mathcal{O}((\log N)^2)$ terms. In the future, it will be worthwhile to characterize the VQLS scaling behavior for the sparse and structured matrices which typically arise from PDE discretization, and refine the comparison above.
Consider the 1D Burgers’ equation, which models convective flow u(x, t) on a spatial domain [0, L] with Dirichlet boundary conditions:
$$\frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} = \nu \frac{\partial^2 u}{\partial x^2} + f(x, t).$$
Discretizing the PDE with nx spatial grid points with grid size Δx = L/(nx + 1) using a central difference scheme, and letting ui(t) = u(xi, t) with xi = iΔx, i = 1, ⋯, nx, leads to
$$\frac{du_i}{dt} = \nu\, \frac{u_{i+1} - 2u_i + u_{i-1}}{\Delta x^2} - u_i\, \frac{u_{i+1} - u_{i-1}}{2\Delta x} + f(x_i, t), \qquad i = 1, \ldots, n_x. \qquad (84)$$
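A minimal sketch (our naming; homogeneous Dirichlet boundary values assumed) of assembling the diffusion matrix $F_1$ and convection matrix $F_2$ that appear in Eqs. (85)–(86):

```python
import numpy as np

def burgers_matrices(nx, L, nu):
    """F1 and F2 of the semi-discretized Burgers' ODE (84) on nx interior
    points with homogeneous Dirichlet BCs and central differences."""
    dx = L / (nx + 1)
    # F1: nu times the standard tridiagonal second-difference operator
    F1 = nu * (np.diag(-2.0 * np.ones(nx)) + np.diag(np.ones(nx - 1), 1)
               + np.diag(np.ones(nx - 1), -1)) / dx**2
    # F2: row i picks -u_i u_{i+1}/(2 dx) + u_i u_{i-1}/(2 dx) out of u ⊗ u,
    # where entry i*nx + j of u ⊗ u equals u_i u_j (0-based indices)
    F2 = np.zeros((nx, nx * nx))
    for i in range(nx):
        if i + 1 < nx:
            F2[i, i * nx + (i + 1)] = -1.0 / (2 * dx)
        if i - 1 >= 0:
            F2[i, i * nx + (i - 1)] = +1.0 / (2 * dx)
    return F1, F2
```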
The system (84) is in the form (19). Applying CL with truncation level N, we can transform (84) to (23) with
We further decompose $\tilde{A}_N$ as
$$\tilde{A}_N = A_{N,1} + \nu A_{N,2},$$
where $A_{N,1}$ collects the ν-independent (convection and forcing) blocks and $A_{N,2}$ the diffusion blocks.
Using the backward Euler time stepping and taking M = nt − 1 time steps with step size h = T/M, Ã in (29) can be expressed as
$$\tilde{A} = \tilde{A}_1 + \nu \tilde{A}_2.$$
Consequently, for VQLS one can compute the LCU for Ã1, Ã2 once, and then generate a parameter-dependent LCU for Ã for any parameter ν for implementation of the nBVQPCO framework.
Furthermore, the optimization problem (78)–(80) can be approximated as
Thus, the optimization problem can be expressed in the form of (41)–(43) as
Note that all the inner products can be computed using appropriate quantum circuits once the solution from VQLS is available as described in Section 3.3.
We use a recursive strategy for the tensor decomposition of Ã1, Ã2 under the sigma basis as described in Section 3.2. For simplicity, assume that the forcing function f(x, t) ≡ 0. Since F1, F2 as defined in Eq. (85) and Eq. (86) are time independent, $A_N(t) \coloneqq A_N$ is also time independent. Assume $n_x = 2^s$, $n_t = 2^t$. Note that $A_N \in \mathbb{R}^{N_c \times N_c}$, where $N_c = \sum_{i=1}^{N} n_x^i$.
Then
Similarly, Ã2 can be written as
Next, we find the tensor decomposition of AN,1, AN,2. Each diagonal block of AN,2 is of the form
With these relations,
Next, we focus on the $A_{N,1}$ term. Since f(x, t) = 0, the sub-diagonal blocks, which involve $F_0$, vanish.
Let’s say that each block
Each block
Thus, AN,1 has a total of
Finally, we look at the decomposition of the blocks
Let $A_s$ be a $2^s \times 2^s$ matrix with a single non-zero entry in position (r, c), r, c ∈ {0, 1, …, nx − 1}. Let the binary representations of r and c be $r_1 r_2 \cdots r_s$ and $c_1 c_2 \cdots c_s$, respectively. Then $A_s$ factors as the tensor product $A_s = |r_1\rangle\langle c_1| \otimes \cdots \otimes |r_s\rangle\langle c_s|$, where each factor is an element of the sigma basis.
This can be proved by induction. It is trivially true for s = 1 by the definition of the sigma basis. Assume the statement holds for s − 1. Then writing $A_s = |r_1\rangle\langle c_1| \otimes A_{s-1}$, where $A_{s-1}$ is the $2^{s-1} \times 2^{s-1}$ single-entry matrix indexed by the remaining bits, the claim follows.
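As a quick check of this single-entry decomposition, the tensor factors can be read off directly from the bits of r and c (our sketch; the σ± labels assume the convention σ+ = |0〉〈1|, σ− = |1〉〈0| used above):

```python
# Map each bit pair (r_k, c_k) to the 2x2 single-entry factor |r_k><c_k|.
KET_BRA = {(0, 0): "sigma+ sigma-",   # |0><0|
           (0, 1): "sigma+",          # |0><1|
           (1, 0): "sigma-",          # |1><0|
           (1, 1): "sigma- sigma+"}   # |1><1|

def single_entry_sigma_factors(r, c, s):
    """Sigma-basis tensor factors of the 2^s x 2^s matrix with a 1 at (r, c)."""
    r_bits = [(r >> (s - 1 - k)) & 1 for k in range(s)]   # MSB first
    c_bits = [(c >> (s - 1 - k)) & 1 for k in range(s)]
    return [KET_BRA[(rb, cb)] for rb, cb in zip(r_bits, c_bits)]

# Example: the 4x4 matrix with a single 1 at (r, c) = (2, 1) factors as
# |1><0| ⊗ |0><1| = sigma- ⊗ sigma+.
```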
To demonstrate the efficacy of using the sigma basis compared to the Pauli basis, we compute the number of terms needed in the LCU decomposition of Ã1, Ã2 for different matrix sizes. To vary the matrix size we consider different numbers of spatial/temporal discretization points nx and nt, respectively, and different values of the CL truncation level N. For the Pauli basis we used the matrix slicing method proposed in [17] to determine the number of terms, while for the sigma basis we used the analytical expressions derived above. Figure 3 shows the comparison and highlights the significantly better scaling of the sigma basis. For example, for nx = nt = 4, N = 4, LCU with the Pauli basis generates 276,000 terms whereas the sigma basis gives rise to only 111 terms. Furthermore, computation of the Pauli LCU decomposition became prohibitively time consuming as the problem size increased, as indicated by the missing data points for nx = nt = 8, 16 and N > 2.

Comparison of the number of LCU terms using sigma basis (blue curve) and Pauli basis (red curves) for different values of CL truncation levels and number of spatial/temporal discretization points nx and nt.
Finally, note that the sigma basis-based LCU decomposition analysis presented in this section, while developed in the context of a specific example, is general and can be readily extended to Carleman matrices arising from the linearization of any other polynomial nonlinear ODE.
In this section, we demonstrate the application of our nBVQPCO framework to solve the inverse problem described in Section 5.1. We take L = 0.5 and T = 0.35, and assume that the forcing function f(x, t) ≡ 0. The problem is discretized on a spatial grid with nx = 4 grid points, leading to Δx = 0.1. For time discretization we take nt = 8 time steps, resulting in Δt = 0.05. We consider an initial condition of the form u0(x) = sin(k(x − Δx)), scaled for simplicity so that ∥u0∥ = 1. To simulate the measurement data y(t), t ∈ [0, T], we take the measurement point to be xp = x2, where xi = iΔx. We integrate the nonlinear ODEs (84) and generate the measurement data y(ti) = u2(ti), ti = ih, i = 0, ⋯, nt − 1, using ν = 0.07. In our numerical studies we investigate truncation levels N = 1 and N = 2 for the CL.
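A hypothetical sketch of this data-generation step (our names and choices: forward Euler time stepping, an illustrative sin profile with a specific wavenumber, and `burgers_matrices` from the earlier sketch):

```python
import numpy as np

nx, nt, L, T = 4, 8, 0.5, 0.35
dx, h = L / (nx + 1), T / (nt - 1)              # dx = 0.1, h = 0.05
F1, F2 = burgers_matrices(nx, L, nu=0.07)       # "true" viscosity

x = dx * np.arange(1, nx + 1)                   # interior grid points x_1..x_nx
u = np.sin(2 * np.pi * (x - dx) / L)            # illustrative sin(k(x - dx)) profile
u = u / np.linalg.norm(u)                       # normalize so ||u0|| = 1

y = [u[1]]                                      # measurement at x_p = x_2 (0-based index 1)
for k in range(nt - 1):
    u = u + h * (F2 @ np.kron(u, u) + F1 @ u)   # forward Euler, f ≡ 0
    y.append(u[1])
```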
The nBVQPCO framework was implemented using the PennyLane software framework for quantum computing. PennyLane’s lightning.qubit device was used as the simulator, autograd interface was used as the automatic differentiation library and the adjoint method was used for gradient computations. We do not consider any shot, device or measurement noise throughout this study.
In this section we provide details related to VQLS implementation including the state preparation circuit, and the selected ansatz, optimizer, and LCU decomposition approach.
State Preparation: Here we describe the circuit to construct |ϕ〉, the normalized right-hand side of the linear system.
State preparation for u0 can be accomplished using RY, CNOT, and PauliZ gates as shown in Figure 4 for different values of nx. Next, starting from the state |0〉nx, we first create the state
Note that since ∥u0∥ = 1 and

Quantum circuit to prepare the initial condition u0 Eq. (95) for nx = 4 and nx = 8 grid points.

Quantum circuit to prepare the state

Sequential construction of the quantum circuit to prepare the desired state
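As a minimal PennyLane illustration of this state preparation step, one can amplitude-encode the normalized initial condition with a generic routine (our sketch; the paper instead uses the hand-built RY/CNOT/PauliZ circuits of Figures 4–6, which we do not reproduce here):

```python
import numpy as np
import pennylane as qml

n_qubits = 2                                  # log2(nx) qubits encode nx = 4 amplitudes
dev = qml.device("lightning.qubit", wires=n_qubits)

u0 = np.sin(np.pi * np.arange(1, 5) / 5)      # illustrative grid profile
u0 = u0 / np.linalg.norm(u0)                  # normalized, as required for a state

@qml.qnode(dev)
def prepare_u0():
    # Generic amplitude encoding in place of the explicit Figure 4 circuit.
    qml.StatePrep(u0, wires=range(n_qubits))
    return qml.state()
```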
Ansatz: We use a modified version of the ansatz circuit 9 from [20] with 3 repetitions inspired by the work in [12]. This is shown in Figure 7.

Modified real version of circuit 9 from [20] that is used as the VQLS ansatz.
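For orientation, here is a generic sketch of a real-valued hardware-efficient ansatz in this spirit, i.e., layered RY rotations with a CZ entangling chain (our assumption; the exact gate pattern of the paper's modified circuit 9 is given in Figure 7 and [20] and may differ):

```python
import pennylane as qml

def ansatz(theta, n_qubits, reps=3):
    """Sketch of a real-valued layered ansatz: an initial RY layer followed
    by `reps` blocks of a CZ entangling chain plus another RY layer."""
    theta = theta.reshape(reps + 1, n_qubits)
    for w in range(n_qubits):
        qml.RY(theta[0, w], wires=w)
    for r in range(1, reps + 1):
        for w in range(n_qubits - 1):
            qml.CZ(wires=[w, w + 1])
        for w in range(n_qubits):
            qml.RY(theta[r, w], wires=w)
```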
LCU Decomposition: As discussed in Section 5.2, we use the sigma basis-based tensor product decomposition due to its computational efficiency compared to the Pauli basis for our application. Since the sigma basis elements are non-unitary, we used circuits based on unitary completion [14] to implement the Hadamard tests required for computing the VQLS cost functions.
Optimizer: There are several choices of optimizers that can be used with VQLS [21]. We used the Adagrad optimizer from PennyLane with a step size of 0.8. The convergence criteria for VQLS were set to a maximum of 200 iterations. The optimizer was initialized randomly with samples taken from the Beta distribution with shape parameters α = β = 0.5.
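A minimal sketch of this optimization setup in PennyLane (our illustration; `vqls_cost` is an assumed helper returning the local VQLS cost for given ansatz parameters, and `n_params` matches the ansatz sketch above):

```python
import numpy as np
import pennylane as qml
from pennylane import numpy as pnp

n_params = (3 + 1) * 2                        # hypothetical parameter count
rng = np.random.default_rng(0)
theta = pnp.array(rng.beta(0.5, 0.5, size=n_params), requires_grad=True)

opt = qml.AdagradOptimizer(stepsize=0.8)
for it in range(200):                         # convergence criterion: max 200 iterations
    theta, cost = opt.step_and_cost(vqls_cost, theta)
```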
VQLS Results: We study the convergence of VQLS for two Carleman truncation levels N = 1, 2 using the local VQLS cost function. The VQLS results and the convergence of the VQLS cost functions are shown in Figure 8. The VQLS solution is compared with the solution obtained by classically solving the underlying linear system (29). A qualitative comparison of the two solutions indicates that VQLS produces reasonable solutions to the problem. To characterize the VQLS solution quality we define the aggregated solution error as the time average of the norm of the normalized solution error at each time step, i.e.,
$$\varepsilon = \frac{1}{n_t} \sum_{k=0}^{n_t - 1} \left\| \frac{\hat{w}_k}{\|\hat{w}_k\|} - \frac{w_k}{\|w_k\|} \right\|, \qquad (97)$$
where $w_k$ and $\hat{w}_k$ denote the classically computed and VQLS solutions at time step k, respectively.

Comparison of normalized solution, obtained classically and via VQLS, of the implicit linear system (29) for the inverse Burgers problem with ν = 0.07 and for two CL truncation levels.
Next we study the performance of the nBVQPCO framework. The true optimum of the inverse problem is ν* = 0.07 as we used that value to generate the simulated measurement data y(t). Since the optimization variable is a scalar, we use exhaustive search as the black box optimizer. In this approach we uniformly sample ν in the range [νmin, νmax], use CL+VQLS to generate the normalized solution and compute the design cost (92).
As the curves in Fig. 9b indicate, our algorithm returns the minimum solution at ν = 0.06 for both CL truncation levels N = 1 and N = 2, which is close to the true optimal value of ν* = 0.07. The value of R2 in Eq. (48) ranges from 2.46 to 18 over the range of ν values considered here. Despite being well outside the bound in Eq. (48), CL works well in this practical use case.

(a) CL+VQLS solution error ε (97) for the inverse Burgers problem for different values of ν and two CL truncation levels N = 1 and N = 2. (b) Design cost (Eq. (92)) as a function of ν obtained via our nBVQPCO framework for CL truncation levels N = 1 and N = 2. Also marked are the ν values where the cost takes its smallest value, which is ν = 0.06 for both truncation levels. The true optimum is known to be ν = 0.07.
In this section we discuss various pros/cons of the nBVQPCO framework and a path toward fully fault-tolerant quantum computing implementation. While the nBVQPCO framework is well suited for NISQ implementation, like other variational quantum algorithms it is difficult to subject it to a fully theoretical complexity analysis. As pointed out in Remark 8, it will be worthwhile to analyze VQLS empirical scaling behavior for sparse and structured matrices. Furthermore, the error and computational complexity results we presented depend on the condition in Eq. (48), which can be conservative and too restrictive to be satisfied in practical applications. As we pointed out earlier, for CFD type applications, one can work with alternative forms of conservation laws, e.g., LBM [10], for which the condition in Eq. (48) can more readily be met in practical regimes of interest. Additionally, one can employ a rescaling technique and higher-order discretization schemes proposed in [19] to further improve computational advantage when using the Carleman linearization framework.
One of the bottlenecks in the nBVQPCO framework is the computation of VQLS cost function which scales polynomially with the number of LCU terms. By replacing conventionally employed Pauli basis with sigma basis, we showed in our application that the number of LCU terms scales more favorably. Ideally, one would like this scaling to have a polylogarithmic dependence on the matrix size, and further enhancing the sigma basis type approaches would be necessary to make the proposed framework scalable. Other variational quantum algorithms such as variational quantum eigensolver (VQE), and quantum approximate optimization algorithm (QAOA) can also significantly benefit from such efficient LCU decompositions.
In the nBVQPCO framework one can replace VQLS with the quantum linear systems algorithm (QLSA) [22–24]. This could be beneficial since QLSA has provable exponential advantage over classical linear system solvers and thus comes with rigorous guarantees unlike variational methods. However, QLSA-based implementation is expected to have high quantum resource needs. For instance, in [11], the authors obtained a detailed quantum resource estimate for implementing the CL-based LBM simulation of incompressible flow fields and concluded that the QLSA-based framework can only be feasible in a fault-tolerant quantum computing setting. This study also highlighted that dominant quantum resource needs arise due to the block encoding step which is required to implement the oracle in QLSA. Block encoding in fault-tolerant implementation can be thought of as an analogous step to LCU decomposition in NISQ implementation. Techniques like sigma basis which exploit sparsity and structure for more efficient LCU decomposition can also be potentially leveraged to make block encoding schemes more efficient.
In this paper we presented nBVQPCO, a novel variational quantum framework for nonlinear PDE constrained optimization problems. The proposed framework utilizes Carleman linearization, the VQLS algorithm, and a black box optimizer nested in a bi-level optimization structure. We presented a detailed computational error and complexity analysis to establish potential benefits of our framework over classical techniques under certain assumptions. We demonstrated the framework on an inverse problem and presented simulation results. The analysis and results demonstrate the correctness of our framework for solving nonlinear simulation-based design optimization problems.
Future work will involve studying and mitigating the effect of device/measurement noise and making the nBVQPCO framework robust. It will also be important to study the scalability of the framework by applying it to larger problem sizes and implementing it on quantum hardware. It will also be beneficial to explore the application of our framework with other PDE discretization approaches such as finite volume and finite element methods, and extend the error and complexity analysis for higher-order temporal discretization schemes. Finally, exploring and refining the framework to better exploit sparsity/structure and extension to fault-tolerant setting as discussed above are also avenues for future research.