Have a personal or library account? Click to login
Variational Quantum Framework for Nonlinear PDE Constrained Optimization Using Carleman Linearization Cover

Variational Quantum Framework for Nonlinear PDE Constrained Optimization Using Carleman Linearization

Open Access
|Jul 2025

Full Article

1.
Introduction

We present a variational quantum framework for optimization problems constrained by nonlinear partial differential equations (PDE). This extends the recently proposed bi-level variational quantum PDE constrained optimization (BVQPCO) framework for linear PDEs in [1] to a nonlinear setting. This is an important extension as many science and engineering applications such as aerodynamics, computational fluid dynamics (CFD), material science, computer vision, and inverse problems necessitate the solution of optimization problems constrained by a system of nonlinear PDE. Examples include Euler and Navier-Stokes PDEs and heat equation in fluid dynamics, combustion, and weather forecasting; magnetohydrodynamics and Vlasov-Maxwell PDEs in plasma dynamics and astrophysics; wave equation in structural mechanics; Black–Scholes PDE in finance; and reaction-diffusion PDEs in chemistry, biology, and ecology to name a few. Closed-form solutions are generally unavailable for PDE constrained optimization problems, necessitating the development of numerical methods [24]. A variety of classical gradient-based and gradient-free numerical methods have been developed, and rely on repeated calls to the underlying PDE solver. Since PDE simulations are computationally expensive, using them within the design/optimization loop can become a bottleneck.

To extend the BVQPCO framework for nonlinear PDE we employ the Carleman linearization (CL) framework. The CL framework allows one to transform polynomial ordinary differential equations (ODE), i.e., ODE with polynomial vector field, into a system of infinite but linear ODE [5]. For instance, such polynomial ODE naturally arises when PDE, e.g., as mentioned above, are semi-discretized in spatial dimensions. By truncating the CL system to a finite order, one obtains a finite system of linear ODE, and under suitable conditions, the impact of truncation error can be characterized [6,7]. CL has become an important tool in developing quantum algorithms for simulating polynomial ODE on quantum devices since the seminal work [8]. Specifically, assuming R2 < 1, where R2 is a parameter characterizing the ratio of the nonlinearity and forcing to the linear dissipation, the query/gate complexity of the proposed algorithm takes the form sT2qϵpoly(logT,logn,log1/ϵ) , where ϵ is desired solution accuracy, n is system dimension, q = ∥x(0)∥/∥x(T)∥ measures the decay of the solution x(t) ∈ ℝn, T is the period of integration, and is the sparseness of system matrices describing the ODE. This algorithm/analysis was extended for systems of k-th degree polynomial ODE for arbitrary (finite) values of k in [9]. Recent work has shown that by transforming the problem or working with a different formulation of conservation laws in CFD applications, e.g. lattice Boltzmann method (LBM), one may be able to more readily satisfy such conditions [10]. Furthermore, [11] developed detailed quantum resource estimates for implementing CL-based LBM simulation of incompressible CFD on a fully fault-tolerant quantum computer. While the motivation of exploring quantum algorithms in [11] is to accelerate simulation-based design which results in a PDE constrained optimization, there was no algorithmic treatment of that problem. The application of CL for solving polynomial ODE in a variational quantum setting has also been numerically investigated in [12], in which QLSA is replaced by the variational quantum linear solver (VQLS) [13]. This work was also restricted to applying CL+VQLS for PDE simulation and did not consider any associated design/optimization problem. Furthermore, the work does not make any computational error/complexity assessments for solving polynomial ODE by the CL+VQLS approach.

The main contributions of this work are as follows:

  • We formulate a bi-level variational quantum framework, referred to as nonlinear BVQPCO (nBVQPCO), for solving nonlinear PDE constrained optimization problems as discussed above. While the original optimization problem is formulated in terms of an unnormalized PDE solution, we show how the problem can be transformed under certain conditions into an equivalent form which depends only on the normalized PDE solution. This is significant as VQLS generates only a normalized solution of the linear system as typically quantum algorithms do, and would have otherwise required a substantial overhead to obtain the unnormalized solution via additional estimation of the norm of the solution.

  • We use the sigma basis [14], an alternative tensor product decomposition that exploits the sparsity/structure of linear system arising from PDE discretization to facilitate VQLS computations. This results in a substantial reduction in the number of terms in the linear combination of unitary (LCU) decomposition necessary for VQLS cost function evaluations, and thus leads to a significant computational advantage compared to the Pauli basis traditionally used for the LCU decomposition.

  • We present a detailed computational error analysis for CL+VQLS-based solution of polynomial ODE and prove that under suitable assumptions the CL+VQLS approach can generate a normalized solution of a given ODE to an arbitrary accuracy. To the best of our knowledge such a rigorous analysis of CL+VQLS approach has not been presented before in the literature. Furthermore, by combining this error analysis with the empirically known results for the run time of VQLS, we assess potential utility of our nBVQPCO framework over equivalent classical methods.

  • We implement the nBVQPCO framework using the PennyLane library and apply it to solve a prototypical inverse problem applied to nonlinear Burgers’ PDE which involves calibrating the PDE model parameter to match the given measurements.

The paper is organized as follows: We start with mathematical preliminaries in Section 2 and introduce the nonlinear PDE constrained optimization problem in Section 3.1. We describe the CL+VQLS procedure in Section 3.2 and develop the nBVQPCO framework extension including an outline of pseudo-code for its implementation in Section 3.3. Complexity and error analysis of the proposed framework are covered in Section 4. In Section 5 we formulate the inverse PDE problem in our nBVQPCO framework, apply an alternative tensor product decomposition for computing VQLS cost function, and describe the implementation details and numerical results using the PennyLane library. We conclude in Section 7 with avenues for future work.

2.
Mathematical Preliminaries

Let ℕ = {1, 2,…}, ℝ, and ℂ be the sets of positive integers, real numbers, and complex numbers, respectively. We will denote vectors by small bold letters and matrices by capital bold letters. A and A′ will denote the vector/matrix complex conjugate and vector/matrix transpose, respectively. Tr(A) will denote the trace of a matrix. We will represent an identity matrix of size s × s by Is.

Kronecker product will be denoted by ⊗. For any pair of vectors, x ∈ ℝn and y ∈ ℝm, their Kronecker product w ∈ ℝnm is w=xy=(x1y1,x1y2,,x1ym,,x2ym,,xny1,,xnym)T.

Similarly, for A ∈ ℝm×n and B ∈ ℝp×q, their Kronecker product is defined as 1C=AB=(a11Ba1nBam1BamnB), where C ∈ ℝmp×nq. The Kronecker power is a convenient notation to express all possible products of elements of a vector up to a given order, and is denoted by 2x[i]=xxxitimes, for any i ∈ ℕ with the convention x[0] = 1. Moreover, dim(x[i]) = ni, and each component of x[i] is of the form x1ω1x2ω2xnωn for some multi-index ∈ ℕn of weight j=1nωj=i . Similarly, we denote the matrix Kronecker power by, 3A[i]=AAAitimes.

The standard inner product between two complex vectors x, y ∈ ℝn will be denoted by 〈x, y〉. The lp-norm in the Euclidean space ℝn will be denoted by ∥ · ∥p, p = 1, 2,…, ∞ and is defined as follows 4xp=(j=1n| xj |p)1/p.

For the norm of a matrix A ∈ ℝn×m, we use the induced norm, namely 5Ap=maxxmAxpxp.

We will use p = 2, i.e., l2 norm for vectors and spectral norm for matrices unless stated otherwise. The spectral radius of a matrix denoted by ρ(A) is defined as 6ρ(A)=max{|λ|:λeigenvaluesofA}.

Trace norm of a matrix A is defined as ATr=Tr(A*A) which is the Schatten q-norms with q = 1.

Let ℱ be a vector space of real vector valued functions u(t) : [0, T] → ℝn defined on [0, T]. Then the inner product between u1, u2 ∈ ℱ is defined as 7 u1,u2 T=0T u1(t),u2(t) dt.

We will use standard braket notation, i.e. |ψ〉 and 〈ψ| in representing the quantum states [15]. The inner product between two quantum states |ψ〉 and |ϕ〉 will be denoted by 〈ψ|ϕ〉. For a vector x, we denote by |x=xx as the corresponding quantum state. The trace norm ρ(ψ, ϕ) between two pure states |ψ〉 and |ϕ〉 is defined as 8ρ(ψ,ϕ)=12|ϕϕ||ψψ|Tr=1|ϕψ|2.

3.
Variational Quantum Framework for Nonlinear PDE Constrained Optimization
3.1.
Nonlinear PDE Constrained Optimization Problem

We consider a general class of PDE constrained design optimization problems of the form [16] 9minp,uCd(u,p) 10s.t.(u,p,t)=0, 11(u,p,t)=0, 12gi(p)0,i=1,,nc, where t ∈ [0, T] for given T ∈ ℝ, p ∈ ℝnp is the vector of design variables (e.g., material properties, aerodynamic shape), Cd is the cost function (e.g., heat flux, drag/lift), and ℱ(u, p, t) are the PDE constraints (e.g., conservation laws, constitutive relations) with u(x, t; p) being the solution of the PDE defined as a function of space x and time t for given parameters p, ℬ(u, p, t) to capture the constraints coming from the PDE boundary and initial conditions, and gi(p), i = 1, ⋯, nc are constraints on the parameters. We make the following assumptions.

Assumption 1.

When semi-discretized in space, the PDE (10) and associated boundary/initial conditions (11) results in a polynomial ODE of the form 13u.(t)=F0(t,p)+F1(t,p)u+·+Fk(t,p)u[k],u(0)=u0(p)n, where k ∈ ℕ is the order of the polynomial, u(t) ∈ ℝn, t ∈ [0, T] (with slight about of notation) is an -dimensional solution vector and Fi(t, p) ∈ ℝn×ni are in general time dependent matrices. We assume that the boundary conditions have already been accounted for in this ODE representation, see Section 5.1 for an example.

Assumption 2.

The cost function is a quadratic function of the solution u(t), t ∈ [0, T], i.e. 14Cd(u,p)=f(w1u,HuT+w2u,hT,p), where w1, w2 ∈ ℝ, H ∈ ℝn×n and h ∈ ℝn

Under these assumptions, the optimization problem (9) can be represented as 15minp,uCd(u,p)f(w1u,HuT+w2u,hT,p) 16s.t.u.(t)=F0(t,p)+F1(t,p)u+·+Fk(t,p)u[k], 17u(0)=u0(p), 18gi(p)0,i=1,,nc.

Remark 1.

Assumption 1 is fairly general as several PDEs arising in engineering and science result in polynomial ODEs when semi-discretized in space, see [9] and references therein. Examples include Euler and Navier-Stokes PDEs and heat equation in fluid dynamics, combustion, and weather forecasting; magnetohydrodynamics and Vlasov-Maxwell PDEs in plasma dynamics and astrophysics; wave equation in structural mechanics; Black–Scholes PDE in finance; and reaction-diffusion PDEs in chemistry, biology, and ecology to name a few. Furthermore, polynomial ODEs also arise in mechanics, molecular dynamics, chemical kinetics, epidemiology, social dynamics, and biological networks.

Remark 2.

Assumption 2 commonly arises in many PDE constrained optimization problems. For instance in inverse problems the cost function is typically taken as a squared norm of difference between the solution and measured variables, which results in a quadratic form, see Section 5.1 for details. Other examples include PDE optimal control where the cost function is a weighted combination of deviation from a reference signal and squared norm of control input, aerodynamic design where the objective is to minimize heat transfer/drag or maximize lift, and computer vision problems such as shape from shading, surface reconstruction from sparse data, and optical flow computation where the objective function depends on an appropriate form of squared norm.

The BVQPCO framework proposed in [1] cannot be applied directly to the optimization problem (15)-(18), as the constraints are nonlinear. To address this we employ the CL framework, whereby the polynomial ODEs are transformed into an infinite-dimensional system of linear ODEs and truncated. Since, it is always possible to map the -th degree polynomial ODE (13) to a higher dimensional quadratic polynomial ODE, and then apply CL [9], without loss of generality we restrict k = 2 throughout this paper.

3.2.
Solving Nonlinear ODEs via Carleman Linearization-Based VQLS

Consider a system of inhomogeneous quadratic polynomial ODE (16) with k = 2, i.e. 19u.=F0(t,p)+F1(t,p)u+F2(t,p)u[2],u(0)=u0n.

Below we describe the steps required to transform the solution of these nonlinear ODEs into a form amenable to VQLS, see Figure 1 for illustration of the steps involved.

  • Step 1:

    For CL we introduce variables wi = u[i] ∈ ℝni , i ∈ ℕ, which satisfy 20w.i=j=02Ai+j1i(t,p)wi+j1, where Ai+j1ini×ni+j1 is given by 21Ai+j1i(t,p)=p=1iIn×nFj(t,p)pthpositionIn×nifactors, with 0 ≤ j ≤ 2. Note that (20) defines an infinite dimensional system of linear ODE with state w(t)=(w1T(t),w2T(t),)T and initial condition w(0)=(u0T,(u0[2])T,)T , and is known as CL. For any finite N ∈ ℕ define 22w(t)=(w1(t)w2(t)w3(t)wN(t))Nc, where Nc=nN+1nn1 . The CL when truncated to order N, results in a finite system of linear ODEs 23w^.=AN(t,p)w^+b(t,p), 24w^(0)=(u0T,(u0[2])T,,(u0[N])T)T, 25AN(t,p)=(A11A2100A12A22A3200A23A33A430000AN1NANN),w^(t)=(w^1w^2w^3w^N)Nc,b(t,p)=(F0000) where we have dropped dependence on , t, p in the RHS terms for brevity. Note that w^(t) approximates w(t) as defined in Eq. (22) and the error incurred as a function of the truncation level N is described by Lemma 1.

Figure 1.

Illustration of steps involved in transforming a quadratic polynomial ODE (19) into a system of linear algebraic equations (30) amenable to VQLS. This involves CL and truncation resulting in a system of linear ODEs, time discretization of linear ODEs to generate a system of linear difference equations, and expressing difference equation into a single system of linear algebraic equations with normalized solution and right-hand side. While an explicit time discretization is shown in the figure, one can use implicit Euler discretization or other schemes as well.

Forward Euler
  • Step 2:

    Applying the forward Euler method to the system (23) with step size h = T/M, M ∈ ℕ, results in a set of difference equations 26w^k+1(I+AN(kh,p)h)w^k=b(kh,p)h, where w^k , approximates w^(kh) as defined in Eq. (25) for k = {0, ⋯ , M – 1} with w^0=w^(0) . The error introduced by Euler discretization is characterized by Lemma 2.

  • Step 3:

    The iterative system (26) can be expressed as a single system of linear equations 27(I00[ I+AN(0,p)h ]I00000[ I+AN((M1)h,p)h ]I)A˜(p)(w^0w^1w^M)w˜=(w^(0)hb(0,p)hb((M1)h,p))b˜(p).

Backward Euler
  • Step 2:

    Similarly, applying the backward Euler method to the system (23) with step size h = T/M yields 28(IAN(kh,p)h)w^k+1w^k=b(kh,p)h, where w^k , approximatesw^(kh) , for k = {0, ⋯ M, – 1}.

  • Step 3:

    The iterative system (28) can similarly be expressed as a single system of linear equations 29(I00I[ IAN(0,p)h ]00000I[ IAN((M1)h,p)h ])A˜(p)(w^0w^1w^M)w˜=(w^inhb(0,p)hb((M1)h,p))b˜(p).

In the VQLS framework the linear system obtained using forward Euler (Eq. (27)) or backward Euler (Eq. (29)) is further transformed into the form 30A˜(p)|w˜=|b˜(p), where |b˜=b˜/b˜ and |w˜=w˜/w˜ are the normalized vectors. The VLQS algorithm then optimizes the parameter of θ the ansatz V(θ) such that A˜(p)V(θ)|0=|b˜(p).

The optimal parameter θ* then prepares a solution |w¯=V(θ*)|0 which is an approximation to |w˜ . The approximation error is characterized by the Lemma 4.

Different cost functions have been proposed for the VQLS algorithm [13]. This includes global cost function Cug(θ) and its normalized version Cg(θ) 31Cug(θ)=Tr(|ϕϕ|(I|b˜b˜|))=ψ|Hg|ψ,Cg(θ)=ugϕϕ=1|Dϕ|2ϕϕ where |ϕ〉 = Ã|ψ(θ)〉, |ϕ(ψ)〉 = V(θ)|0〉 and Hg = Ã*Ub(I – |0〉〈0|)(Ub)*Ã, and similarly unnormalized and normalized local cost functions 32Cul(θ)=ψ|Hl|ψ,Cl(θ)=Culϕϕ,Hl=A˜*Ub(I1ni=1n|0k0k|Ik˜)(Ub)*A˜ respectively, with |0k〉 being a zero state on qubit k, Ik˜ being identity on all qubits except the qubit k, and the unitary Ub prepares |b˜=Ub|0 . Computation of these cost functions can be accomplished via the Hadamard test circuit and its variations (see [13] for the details) and rely on the linear combination of unitaries (LCU) decomposition of the matrix Ã.

Linear Combination of Unitaries Decomposition: The square matrix à can be decomposed as a linear combination of unitary operators Ãi with complex scalar coefficients αi as, 33A˜=i=1nlαiA˜i.

One popular approach for unitary operators is based on the Pauli basis formed from the identity I and the Pauli gates X, Y, and Z. A matrix à with the size 2n × 2n can then be written as a linear combination of elements selected from the basis set 𝒫n={ P1P2Pn:Pi{I,X,Y,Z} }.

On this basis, each Ãn ∈ 𝒫n and its corresponding coefficient αi can be determined via different numerical approaches [17]. More recently, alternative tensor product decomposition has been proposed which uses different basis elements to better exploit the underlying structure and sparsity of matrices and results in more efficient decomposition [14,18]. For example, for the matrices arising from discretization of Poisson equation [18] and heat equation [14] the number of LCU terms under sigma basis only scale logarithmically, i.e., as 𝒪(n) with the matrix size 2n × 2n, compared to the Pauli basis for which the number of terms can vary from 𝒪(2n) (diagonal matrices) to 𝒪(22n) (dense matrices).

The sigma basis set comprises of following operators 34S={ σ+,σ,σ+σ,σ+σ,I }, where σ+ = |0〉〈1|, σ = |1〉〈0|, σ+σ = |1〉〈1|. Given matrix Ã, its LCU-type tensor product decomposition in terms if sigma basis takes similer form as (33) but with Ãi ∈ 𝒮 where Sn={ σ1σ2σn:σiS }.

Even though some of the operators in S are non-unitary, using the concept of unitary completion, one can still design efficient quantum circuits for computing the global/local VQLS cost functions, see [14] for details. We have used this alternate decomposition in the numerical study in Section 5.

3.3.
Nonlinear BVQPCO

To solve the optimization problem (15)–(18) in a variational quantum framework, we extend the BVQPCO framework to nonlinear setting, where the outer optimization level iteratively selects pk using a classical black box optimizer based on the cost function evaluated using solution of CL truncated system obtained via VQLS. To do so, let K0 ∈ ℝn×Nc(M+1) and Kf ∈ ℝn(M+1)×Nc(M+1) be matrices such that 35u0=w^10=K0w˜, and 36(w^10w^11w^1M)=Kfw˜, where w^1k is the first n components extracted from the vectors w^k for k = 0, ⋯, M. Thus, the two terms in the cost function Eq. (14) can be approximated via the Riemann sum as 37u,HuTk=0Mhu(kh),Hu(kh)h Kfw˜,(IM+1H)Kfw˜ , 38u,hTh 1M+1h,Kfw˜ , where 1M+1 ∈ ℝM+1 is a vector of ones. Furthermore 39 Kfw˜,(IM+1H)Kfw˜ = u0 2 Kfw˜,(IM+1H)Kfw˜ K0w˜,K0w˜ ,= u0 2w˜|KfT(IM+1H)Kf|w˜w˜|K0TK0|w˜, and similarly 40 1M+1h,Kfw˜ = u0 1M+1h,Kfw˜ K0w˜,K0w˜ = u0 (1M+1h)TKfw˜ w˜|K0TK0|w˜.

Thus, the optimization problem (15)–(18) can be expressed in a variational quantum form as 41minp,θf(g(θ),p), 42s.t.A˜(p)V(θ)|0=|b˜(p), 43gi(p)0,i=1,,nc, where 44g(θ)=(w1 u0 2w¯|KfT(IM+1H)Kf|w¯w¯|K0TK0|w¯+w2 u0 (1M+1h)TKfw¯ w¯|K0TK0|w¯)|w¯=V(θ)|0, 45Φf=KfT(IM+1H)Kf,Φ0=K0TK0,ϕ=(1M+1h)TKf.

Note that the above quantum formulation, unlike the original optimization problem (15)–(18), depends only on the normalized PDE solution. This is significant as VQLS generates only a normalized solution of the linear system as typically quantum algorithms do and would have otherwise required a substantial overhead to obtain the unnormalized solution via additional estimation of the norm of the solution.

Algorithm 1

nBVQPCO Algorithm.

Input: Semi-discretized ODE in the form (19), cost function (14), { gi(p) }i=1nc , CL truncation level N, V(θ),γ, nsh and ϵ

Output: Optimal parameters: p*

1: Initialize k = 0 and p0 = pin.

2: Apply CL to (19) with truncation level N and determine Ã(p), b˜(p) in Eq. (30) and associated ϕ, Φf, Φ0 as defined in Eq. (45).

3: Determine the unitary Uϕ to prepare |ϕ〉, and LCU decomposition for Φf=i=1nfαfiΦfi and Φ0=i=1n0α0iΦ0i .

4: while stopping criteria not met do

5:  Compute LCU { αi(pk),A˜i(pk) }i=1nk , of Ã(Pk), and determine a unitary Ubk such that |b˜(pk)=Ubk|0 .

6:  Compute θ*k=VQLS({ αi,A˜i },Ubk,V(θ),γ,nsh) .

7:  Let |ψ(θ*k)=V(θ*k)|0 . Compute ϕψ(θ*k) , ψ(θ*k)|Φf|ψ(θ*k) and ψ(θ*k)|Φ0|ψ(θ*k) using associated quantum circuits, and evaluate the cost function (41).

8:  Use classical black box optimizer to select next pk+1 subject to constraints { gi(p) }i=1nc .

9:  Check the stopping criterion, e.g., whether S(pk, pk+1) ≤ ϵ.

10:  kk +1

11: end while

12: Return pk

The pseudo-code for nBVQPCO is summarized in Algorithm 1, and Figure 2 shows the overall flow diagram. Several remarks follow:

  • Note that VQLS provides solution of the given linear system only up to a normalization constant. By reformulating the cost function expression as discussed above one can make the cost function evaluation independent of the normalization constant.

  • The inputs for VQLS in line (6) include the linear combination of unitaries (LCU) decomposition of Ã(p), unitary Ubk which prepares |b˜(pk) , V(θ) the selected ansatz, γ the stopping threshold, and nsh the number of shots used in VQLS cost function evaluation. See Section 5 for the details. From the relations (60), one can select such that, for Cgγ instance, for ≤ , results in a VLQS approximation error ϵκγlogN , see Lemma 4.

  • For LCU decomposition a popular approach is to use Pauli basis as discussed above. We propose to employ a more efficient tensor product decomposition [14], which exploits the underlying structure and sparsity of matrices such as arising from PDE discretizations, see Section 5.

  • Depending on how à depends on the parameters p, a parameter-dependent LCU decomposition {αi(p), Ãi} can be pre-computed once, thus saving computational effort, see Section 5.1 for an example.

  • For computing ϕψ(θ*k) in line 7 one can use the unitary Uϕ as computed in line (3) and quantum circuit for the SWAP test. Similarly, given the LCU decompositions of Φf and Φ0 as computed in line (3) one can express ψ(θ*k)|Φf|ψ(θ*k) =i=1nfαfi ψ(θ*k)|Φfi|ψ(θ*k) , ψ(θ*k)|Φ0|ψ(θ*k) =i=1n0αfi ψ(θ*k)|Φ0i|ψ(θ*k) , and compute each term in the above sums using quantum circuit associated with the Hadamard test.

  • For the outer optimization one can utilize any global black box optimization method [1]. For instance, one can sample the optimization variables p to generate a set of predetermined grid points, evaluate the cost function at those points and select the grid point with the minimum cost. Alternatively, one can use more adaptive techniques such as Bayesian optimization (BO) which is a sequential design strategy for global optimization of black-box functions and uses exploration/exploitation trade-off to find optimal solution with minimum number of function calls.

  • For convergence, line 9 uses a step size tolerance, i.e., S(pk, pk+1) = ∥pk+1pk∥. However, other conditions can be employed, such as functional tolerance, i.e., the algorithm is terminated when the change in design cost | Cdk+1Cdk | is within a tolerance ϵ.

  • In Section 4 we provide a detailed computational error analysis for CL+VQLS-based solution of polynomial ODEs. Since there are no theoretical results available for run time of VQLS, we employ empirically known results along with our rigorous CL+VQLS error analysis to assess the potential advantage of the nBVQPCO framework over classical methods.

Figure 2.

Top: Schematic showing transformation of a nonlinear PDE constrained optimization problem into a variational quantum form. Bottom: Flow diagram of the nBVQPCO framework. VQLS uses an inner level optimization to solve the linear system constraints, arising from the discretization of the underlying PDEs, for given design parameters, and evaluate the quantities related to design cost/objective function. A black box optimizer is used for the outer level optimization to select next set of parameters values based on the evaluated design cost.

4.
Computational Complexity and Error Analysis

Consider a system of inhomogeneous quadratic polynomial ODEs (19) 46u.=F0(t)+F1u+F2u[2],u(0)=u0n, where we have dropped explicit dependence on the parameters p and time t. Associated with above system, let w, w^ , and W˜ be as defined in the Eqs. (22), (25), and (27), and let 47wc=(w(0)w(h)w(Mh)).

See also Figure 1, which illustrates the relationship between these different variables. We make following assumptions.

Assumption 3.

For the system (46)

  • F1, F2 be time independent matrices.

  • Spectral normsF1∥, ∥F2∥, ∥F0∥ = maxt∈[0,T]F0(t)∥ are known, and maxt[0,T] F.0 <

  • F1 be diagonalizable with eigenvalues λi of F1 satisfying Re(λn) ≤ ⋯ ≤ Re(λ1) < 0.

  • Let 48R2=1| Re(λ1) |( u0 F2 + F0 u0 )<1.

Condition (48) can be conservative and too restrictive to be satisfied in practical applications. As pointed out in the Introduction, for CFD-type applications, one can work with alternative form of conservation laws, e.g., LBM [10], for which the condition Eq. (48) can more readily be met in practical regimes of interest. Additionally, one can employ a rescaling technique proposed in [19] to further improve computational advantage when using the Carleman linearization framework.

Remark 3.

Under the condition (48), system (46) can be rescaled uu¯γ to (see Appendix in [8] for details) 49u¯.=F¯0+F¯1u¯+F¯2u¯[2], such that the following relations hold where 50 F¯2 + F¯0 <| Re(λ1) |, 51u¯(0)<1, where F¯0=γF0 , F¯1=F1 , F¯2=1γF2 and γ satisfies, 52γ=1u(0)r+, with 53r±=Re(λ1)±(Re(λ1))24 F2 F0 2 F2

Note that under this rescaling, the value of R2 in (48) remains unchanged.

Under Assumption 3, Lemmas 13 hold, see [8] for the details.

Lemma 1.

The error ηj(t)=wj(t)w^j(t) for j ∈ {1, 2, ⋯ , N} is bounded as follows 54 ηj(t) η(t)tN F2 u0 N+1,

Lemma 2.

Suppose that error η(t) as introduced in Lemma (1) satisfies 55η(t)u(T)4, then there exists a sufficiently small h such that 56 w^jkw^j(kh) w^kw^(kh) 3N2.5kh2(( F2 + F1 + F0 )2+ F.0 ), for all j ∈ {1, ⋯ , N} and ∈ {0, 1, ⋯ ,M – 1}. Concretely, the choice of h should satisfy 57hmin{ 1N F1 ,2(| Re(λ1) | F2 F0 )N(| Re(λ1) |2( F1 + F0 )2+ F1 2) }.

Lemma 3.

The stiffness κ of à in (Eq. (27)), recall arising from application of steps 1–3 to the ODE (46), is bounded as follows 58κ3(M+1), where M = T/h.

Lemma 4.

Consider a linear system 59A˜|ψ=|b˜, where ÃRN×N, N = 2n, ∈ ℕ and, bRN. Then following bounds hold for the different VQLS cost functions as defined in Eqs. (31) and (32) 60Cugϵ2κ2,Cgϵ2κ2A˜,Culϵ2nκ2,Clϵ2nκ2A˜, where κ is the condition number of Ã, and ϵ is the desired error tolerance ϵ=ρ(ψ,ψ(θ*)), with ρ being the trace norm, and |ψ(θ*)〉 being the approximate VQLS solution.

The proof of above result can be found in [13].

Assumption 4.

We assume that the run time of VQLS scales as O(log8.5Nκlog(1/ϵ)), where κ is the condition number, N is size of the matrix A, and ϵ is the desired accuracy of the VQLS solution as described in Lemma 4.

Note that the above assumption is based on an empirical scaling study in [13] and no theoretical guarantees are available. While using Assumption 4 for a given application, one needs to be careful about how well the underlying empirical setup used to obtain this run time scaling of VQLS is met for that application.

Remark 4.

Underlying Assumption 4, the following empirical setup was used (see Section 2.2 in [13]).

  • The run time of VQLS solution is characterized in terms of time-to-solution, which refers to the number of exact cost function evaluations during the optimization needed to guarantee thatis below a specified value. Since the true solution of the linear system is not known, the value of the cost computed during the VQLS, combined with the error bounds (60), is used to determine the worst casefor the stopping criterion.

  • Randomly sampled sparse matrices were used for which the number of LCU terms scales as 𝒪((log N)2).

  • Normalized local VQLS cost function was used.

  • A problem efficient ansatz was used and time-to-solution was averaged over several VQLS runs.

  • VQLS was implemented with exact sampling i.e. no finite sampling or shots were used for the computation of the cost function. Furthermore, no hardware noise was considered.

Remark 5.

As discussed in Remark 4, we will use the term run time to refer to time-to-solution throughout the rest of the paper unless otherwise stated.

Lemma 5.

The trace norm and l2 norm between |ψand |ϕare related as follows 61(1|ψ|ϕ222)2+ρ2(ψ,ϕ)=1.

See Lemma 2 in [1] for the proof.

Theorem 1.

Consider the ODE system (46). Then under Assumption 3, for any given ϵ > 0, one can choose CL truncation level (see Eq (A9)) and Euler discretization step size h (see Eq (A5)) such that 62|wc|w˜ϵ, where |w˜ is the solution of the linear system (30) and wc is the solution of CL as defined in (47).

The proof of the above theorem is given in Appendix A.1.

Theorem 2.

Consider the ODE system (46). Then under Assumption 3, for any given 0 < ϵ < 1, one can choose CL truncation level N, Euler discretization step size h and VQLS stopping threshold γ such that 63|wc|w¯ϵ, where |w¯=V(θ*)|0 is solution generated by VQLS for system (30) and wc is solution of CL as defined in (47). Furthermore, under the Assumption 4 the run time scales as 64O(log8.5(nT4 wc 2ϵ2)T4 wc 2ϵ2log(1/ϵ)).

Proof. From Theorem 1, choose N, h such that 65|wc|w˜ϵ/2.

Let γ be 66γ=1(1ϵ28)29(M+1)2ln(Nc(M+1))<1, where note M = T/h Nc and is the dimension of the truncated CL system (23). Let the VQLS algorithm be terminated under the condition Clγ, then from Lemmas 3 and 4 we obtain ρ2(w˜,w¯)=(ϵ)2nκ2γ1(1ϵ28)2 where n = ln(Nc(M + 1)). It then follows from Lemma 5 that 1(1|w˜|w¯222)21(1ϵ28)2, which implies 67|w˜|w¯2ϵ/2.

Thus, using relations (65) and (67) 68|wc|w¯|wc|w˜+|w˜|w¯ϵ.

Furthermore, by Assumption 4 the run time of VLQS behaves as O((log(Nc(M + 1))8.5#x03BA;log(1/ϵ′)) which using Lemma 3 can be simplified as follows 69O(log8.5(Nc(M+1))κlog(1/ϵ))=O(log8.5(NcM)3(M+1)log(1/ϵ))=O(log8.5(nNTh)Thlog(1/ϵ))=O(log8.5(nT4 wc 2ϵ2)T4 wc 2ϵ2log(1/ϵ)), where we have used the bounds Eq. (A5) and Eq. (A9), and note that wc 2=k=0Mw(kh)2=k=0Mi=1Nu(kh)2i.

Remark 6.

Theorem 1 and Theorem 2 can also be applied to higher-order polynomial systems (13), by replacing Assumption 3 with Assumption 5.

Assumption 5.

For the system (13)

  • Let Fi, i = 0,…, k be time independent matrices with bounded norms.

  • Assume F˜1 is diagonalizable and its eigenvalues satisfy λ˜nkλ˜1<0 , where 70F˜1=(A11A21A31Ak11A12A22A32Ak1200·00Ak2k1Ak1k1), and Ai+j1ini×ni+j1 is given by 71Ai+j1i=p=1iIn×nFjpth positionIn×nifactors. which is a generalization of Eq. (21), see [9] for details.

  • Let 72Rk=1| Re(λ˜1) |((k1)j=2k Fj i=1k1 x0 2i+ F0 i=1k1 x0 2i)<1.

    To apply the nBVQPCO framework to the higher-order polynomial case, one can either transform (13) to the form (19) and then follow the procedure described in Section 3.2 or apply CL in Step 1 directly to (13) and then follow the subsequent steps. As shown in [9], the latter is equivalent to the former but leads to more compact CL representation.

Comparison with Classical Approach

Classically to solve the optimization problem (15)–(18) a variety of methods can be used. Consistent with the nBVQPCO framework we use forward Euler to solve the ODE (46) for a given parameter p, compute the cost function and warp a black box optimizer, e.g., BO.

Applying forward Euler discretization scheme with step size h = T/M to (46) results in a set of nonlinear finite difference equations 73u^k+1=u^k+h(F0(kh)+F1u^k+F2(u^k)[2]), where k = 0, ⋯ ,M – 1, u^ku(kh) and u^0=u0 . Let 74uc=(u(0)u(h)u(Mh)),u˜=(u^0u^1u^M).

The next theorem shows that one can always choose a step size h so that ũ approximates uc to within desired error.

Theorem 3.

Consider the ODE system (46). Then under Assumption 3, for any given ϵ > 0 one can choose h (see Eq. (A23)) such that 75|uc|u˜ϵ.

For the proof see Appendix A.2.

Let s be the maximum of sparsity of F1 and F2. The computational complexity for solving (73) thus scales as O(nsM)=O(snTh)=O(snT(h0+C2T uc 2ϵ2))=O(snT2 uc 2ϵ2), where we have used the bound (A23) and uc 2=k=0Mu(kh)2.

On the other hand, based on Theorem 2, under certain conditions the run time of VQLS based explicit Euler scheme scales polylogarithmically w.r.t , see Eq. (64). Assuming that the number of outer loop iterations are similar for both this classical and the nBVQPCO framework, the run time of nBVQPCO will scale polylogarithmically with , and thus could provide a significant computational advantage for simulation-based design problems.

Remark 7.

In the comparison above, note we have not accounted for computational effort for LCU decomposition required within the nBVQPCO framework (see line 5 in Algorithm 1). This however may not be a significant overhead. As pointed out earlier, depending on how à depends on the parameters p, a parameter-dependent LCU decomposition {αi(p), Ai} can be pre-computed once, thus saving computational effort: we show that via an example in Section 5.1. Moreover, it may be possible to derive LCU decomposition analytically and implement efficiently, e.g., when using sigma basis, see Section 5.2.

Remark 8.

Finally, we would like to remind the reader that the conclusion nBVQPCO could be advantageous over equivalent classical methods is based on the strong Assumption 4 regarding VQLS scaling. Recall this scaling is determined empirically using random matrices whose LCU decomposition admits 𝒪((log N)2) terms. In the future, it will be worthwhile to characterize VQLS scaling behavior for sparse and structured matrices which typically arise from PDE discretization, and refine the comparison above.

5.
Numerical Study
5.1.
Inverse Burgers’ problem

Consider the 1D Burgers’ equation, which models convective flow u(x, t), ona spatial domain [0, L] with Dirichlet 76tu+uxu=νx2u+f(x,t), 77u(x,0)=u0(x),u(0,t)=0,u(L,t)=0, with f(x, t) being the forcing function. We consider an inverse problem, where given y(t) = u(xp, t), t ∈ [0, T] where xp ∈ [0,L] is some fixed point, find v such that 78minv1T0T(y(t)u(xp,t))2dtu02(xp), 79s.t.Eqs.(76)and(77) 80νminννmax.

Discretizing the PDE with nx spatial grid points with grid size Δx = L/(nx + 1) using a central difference scheme, and letting ui(t) = u(xi, t) with xi = iΔx, i = 1, ⋯, nx, leads to 81u˙i=uiui+1ui12Δx+νui+12ui+ui12(Δx)2+f(xi,t), for i = 2, ⋯, nx – 1. Using Dirichlet boundary conditions, we get at i = 1, and i = nx 82u˙1=u1u22Δx+νu22u12(Δx)2+f(x1,t), and 83u˙nx=unxunx12Δx+ν2unx+unx12(Δx)2+f(xnx,t), respectively. Expressing in vector form, the evolution of u = (u1, ⋯, unx)T can be expressed as an ODE 84u.=F0(t)+F1u+F2u[2], with an initial condition u(0) = u0 = (u0(x1), ⋯ , u0(xnx))T, 85F0(t)=(f(x1,t)f(x2,t)f(xnx,t)),F1=νF˜1=ν12Δx2(2100121001210012), and 86F2(i,j)=12Δx{ 1ifi=1,j=21ifi=nx,j=nx11ifj=(nx+1)(i1),1<i<nx1ifj=(nx+1)(i1)+2,1<i<nx0otherwise.

The system (84) is in the form (19). Applying CL with truncation level N, we can transform (84) to (23) with 87AN(t)=(νA˜11A2100A12νA˜22A3200A23νA˜33A430000AN1NνA˜NN), where 88A˜ii=p=1iInx×nxF˜1pthpositionInx×nxifactors.

We further decompose ÃN as AN=AN,1+νAN,2, with AN,1=(0A2100A120A3200A230A430000AN1N0),AN,2=(A˜110000A˜220000A˜33000000A˜NN).

Using the backward Euler time stepping and taking M = nt – 1 time steps with step size h = T/M, Ã in (29) can be expressed as, A˜(ν)=A˜1+νA˜2, where A˜1=(I00IIAN,1(0)h00000I IAN,1((M1)h) )h),A˜2=(0000AN,2(0)h00000AN,2((M1)h)h).

Consequently, for VQLS one can apply LCU for Ã1, Ã2once, and then generate a parameter-dependent LCU for à for any parameter ν for implementation of the nBVQPCO framework.

Furthermore, the optimization problem (78–80) can be approximated as 89minνhTyHw˜,yHw˜|h,w˜|2, 90s.tA˜(ν)w˜=b˜, 91νminννmax where u0(xp)=h,w˜ for some appropriate vector h, y = (y(0), ⋯, y(kh), ⋯, y(Mh))T and H is an appropriate matrix which picks sub-vector (w^1p0,,w^1pM)T from w˜ , where w^1pk is p-th element taken from the first nx components w^1k of vector ŵk, for each k = 0, ⋯, M. Note that yHw˜,yHw˜|h,w˜|2=y,yu02(xp)2y,Hw˜| u0(xp) |h,w˜+Hw˜,Hw˜|h,w˜|2,=y,yu02(xp)2y,Hw˜| u0(xp) |h,w˜+Hw˜,Hw˜|h,w˜|2,=y,yu02(xp)2 HTy | u0(xp) |h HTyw˜ hw˜+1h2w˜|HTH|w˜|hw˜|2.Cd(ν,|w˜).

Thus, the optimization problem can be expressed in the form of (41–43) as 92minν,|w˜Cd(ν,|w˜) 93s.tA˜(ν)|w˜=|b˜, 94νminννmax

Note that all the inner products can be computed using appropriate quantum circuits once the solution from VQLS is available as described in Section 3.3.

5.2.
LCU Decomposition with Sigma Basis

We use a recursive strategy for tensor decomposition of Ã1, Ã2 under the sigma basis as described in Section 3.2. For simplicity, assume that the forcing function f(x, t) ≡ 0. Since, F1, F2 as defined in Eq. (85) and Eq. (86) are time independent, AN(t) ≔ AN is also time independent. Assume nx = 2s, nt = 2t. Note that AN ∈ ℝNc × Nc where Nc=nxN+1nxnx1 is not a power of 2 and it is padded with zero block of size N0 = 2sN + 1Nc and thus can be represented with a sN + 1 + t qubits and can be written as A˜1:=A˜1(sN+1+t)=A^1(sN+1+t)+σ+σσ+σttimesAN,1h, where A^1(sN+1+t)=(A^1(sN+t)0D1(sN+t)A^1(sN+t)),D1(sN+t)=(0IsN+100)=σ+σ+ttimesIsN+1.

Then A^1(sN+1+t)=I2A^1(sN+t)+σD1(sN+t),A^1(sN+2)=(IsN+1AN,1h0IsN+1IsN+1AN,1h)=I2(IsN+1AN,1h)σIsN+1.

Similarly, Ã2 can be written as A˜2=IntAN,2σ+σσ+σttimesAN,2.

Next, we find the tensor decomposition of AN,1, AN,2. Each diagonal block of AN,2 is of the form A˜ii=p=1iIn×nF˜1pthpositionIn×nifactors, where F˜1:=F˜1(s)=I2F˜1(s1)+σD˜1(s1)+σ+(D˜1(s1))T,D˜1(s1)=σ+σ+s1times,F˜1(1)=2I2+σ+σ+.

With these relations, F˜1(s) can be written using 2 log nx + 1 terms. As a result, A˜ii can be written using i(2 log nx + 1) terms out of which the factor I ⊗ ⋯ ⊗ I will repeat i – 1 times and can be combined into a single term. Hence, A˜ii can be written using 2i log nx + 1 terms. Finally AN,2 can be decomposed using ∑i(2i log nx + 1) = N + N(N + 1) log nx terms.

Next, we focus on the AN,1 term. Since f(x, t) = 0, the sub-diagonal blocks Aii+1 are zero. Each of the superdiagonal block Ai+1inxi×nxi+1 . The first term A21=F2nx×nx2 can be split into square blocks F2i of size nx × nx A21=F2=(F21F22F2nx).

Let’s say that each block F2i,i=1,,nx can be decomposed with at most C terms where C is a small constant. We will show below that C ≤ 2 and there are no repeat factors in the decomposition of F2p and F2q for any pq. Then, representing A21 in AN,1 requires C nx terms; one for each F2i . Next, we can write A32=IF2+F2I=(F2F2F2)+(F21IF22IF2nxI).

Each block F2iI is of size nx2×nx2 and can be written directly using the tensor decomposition of F2i . Then, F2I requires C nx terms. On the other hand, each of the diagonal blocks of IF2 requires nx terms. Thus, representing A32 in AN,1 requires C(nx+nx2) terms. In general, we can prove by induction that the number of terms needed are as follows: A21:Cnxterms,A32:C(nx+nx2)terms,A43:C(nx+nx2+nx3)terms,ANN1:Ci=1N1nxiterms.

Thus, AN,1 has a total of j=1N1i=1jCnxi=Cnx(nx1)2(nxN1N(nx1)) . Putting it all together, A˜1:lognt+1+2(Cnx(nx1)2(nxN1N(nx1))),A˜2:2(N+N(N+1)lognx), terms, respectively.

Finally, we look at the decomposition of the blocks F2i and determine the constant C. The first and last block of F2 have one non-zero entry with the following structure F21=12Δx(010000000000)=12Δxσ+σσ+σs1timesσ+,F2nx=12Δx(000000000010)=12Δxσσ+σσ+s1timesσ, and thus have a single decomposition term, i.e. C = 1. The remaining F2k blocks have exactly two non-zero entries and can be decomposed into two terms (i.e., C = 2), which follows from the result below. In general, the tensor decomposition of a square matrix Arc in sigma basis with a single non-zero entry at (r, c) position for r, c ∈ {0, 1, …, nx – 1} can be determined as given in Theorem 4.

Theorem 4

Let As be a 2s × 2s matrix with a single non-zero entry in position (r, c) r, c ∈ {0, 1, …, nx – 1}. Let the binary representation of r=p=0s12pqr(p) and c=p=0s12pqc(p) . Then, the tensor decomposition ofin sigma basis is given by As = As (r, c)(⊗pσ(qr(p), qc(p))), where σ(qr(p),qc(p))={ σ+σifqr(p)=0,qc(p)=0σ+ifqr(p)=0,qc(p)=1σifqr(p)=1,qc(p)=0σσ+ifqr(p)=1,qc(p)=1.

Proof

This can be proved by induction. Trivially true for s = 1, by the definition of sigma basis. Assume, the statement holds for s – 1. Then As=As(r,c)([ abcd ]As1), where a, b, c, d ∈ {0, 1}. Since, there is only one non-zero entry in As, only one of a, b, c or d can be non-zero. If r // 2s–1 = // 2s–1 = 0, where // corresponds to integer division, then a = 1. Equivalently, using the binary representation of r, c, if qr(s – 1) = qc(s – 1) = 0, then a = 1 and the corresponding basis is σ+σ-. Similarly, b = 1 if qr(s – 1) = 0, qc(s – 1) = 1, c = 1 if qr(s – 1) = 1, qc(s – 1) = 0, and d = 1 if qr(s – 1) = qc(s – 1) = 1. Combining this with the induction hypothesis completes the proof.

To demonstrate the efficacy of using the sigma basis compared to the Pauli basis, we compute the number of terms needed in the LCU decomposition for different matrix sizes for Ã1, Ã2. To vary the matrix size we consider different numbers of spatial/temporal discretization points nx and nt, respectively, and different values of CL truncation level N. For the Pauli basis we used the matrix slicing method proposed in [17] to determine the number of terms, while for sigma basis we used analytical expressions derived above. Figure 3 shows the comparison and highlights significantly better scaling for the sigma basis. For example, for nx = nt = 4, N = 4, LCU with Pauli basis generates 276, 000 terms whereas sigma basis gives rise to only 111 terms. Furthermore, computation of the Pauli LCU decomposition was too time consuming as the problem size increases as indicated by the missing data points for nx = nt = 8, 16 and N > 2.

Figure 3.

Comparison of the number of LCU terms using sigma basis (blue curve) and Pauli basis (red curves) for different values ofCL truncation levels and number of spatial/temporal discretization points nx and nt.

Finally, note that the sigma basis-based LCU decomposition analysis presented in this section, while developed in context of a specific example, is general and can be readily extended to Carleman matrices arising from the linearization of any other polynomial nonlinear ODE.

5.3.
Implementation

In this section, we demonstrate the application of our nBVQPCO framework to solve the inverse problem described in Section 5.1. We take L = 0.5 and T = 0.35, and assume that the forcing function f(x, t) ≡ 0. The problem is discretized on a spatial grid with nx = 4 grid points leading to Δx = 0.1. For time discretization we take nt = 8 time steps resulting in Δt = 0.05. We consider an initial condition of the form u0(x) = sin(k(x – Δx)), k=2πL , which upon discretization takes the form 95u0=c(sin(0)sin(kΔx)sin((nx1)kΔx)).

For simplicity we choose such that ∥u0∥ = 1. To simulate the measurement data y(t), t ∈ [0, T], we take the measurement point to be xp = x2 where xi = iΔx. We integrate the nonlinear ODEs (84), and generate the measurement data y(ti) = u2(ti), ti = ih, i = 0, ⋯, nt – 1 using ν = 0.07. In our numerical studies we investigate N = 1 and N = 2 truncation for the CL.

The nBVQPCO framework was implemented using the PennyLane software framework for quantum computing. PennyLane’s lightning.qubit device was used as the simulator, autograd interface was used as the automatic differentiation library and the adjoint method was used for gradient computations. We do not consider any shot, device or measurement noise throughout this study.

5.3.1.
VQLS Implementation and Results

In this section we provide details related to VQLS implementation including the state preparation circuit, and the selected ansatz, optimizer, and LCU decomposition approach.

State Preparation: Here we describe the circuit to construct b˜ in (29). Given padding of AN as discussed in Section 5.2, we equivalently pad b˜ which takes the form b˜=(0w^(0)0hb(0,p)0hb((nt2)h,p))=(0w^(0)0000), where w^(0)=( u0 (u0[2])T(u0[N])T )T with u0 defined in Eq. (95). Since f(x, t) = 0, implies F0(t) = 0 and thus b(t, p) = 0 in Eq. (25), and the second equality above follows. Since all other entries are zero, it is sufficient to focus on state preparation of the form 1w^(0)(0w^(0)) . We generate this state using a sequence of RY, (multi-) controlled RY gates and controlled applications of the circuit for generating u0 as follows.

State preparation for u0 can be accomplished using RY, CNOT, and PauliZ gates as shown in Figure 4 for different values of nx. Next, starting from the state |0〉nx, we first create the state |w^1(0)=1N(0,e1T,(e1[2])T,,(e1[N])T)T where e1 = [1, 0,…, 0]T ∈ ℝnx using RY gates as shows in Figure 5. Finally, by adding the circuit to prepare u0 and its controlled application as illustrated in Figure 6, we sequentially create the states |w^1(0)1N(0T,u0T,u0Te1T,,u0T(e1[N1])T)T1N(0T,u0T,(u0[2])T,,(u0[2])T(e1[N2])T)T1N(0T,u0T,(u0[2])T,(u0[3])T,,(u0[3])T(e1[N3])T)T1N(0T,u0T,(u0[2])T,(u0[3])T,,(u0[N])T)T=1w^(0)(0w^(0)).

Note that since ∥u0∥ = 1 and u0[i] =1 for i = 1, 2, ⋯, N, the states generated above are normalized as required. Finally, since the forcing term f(x, t) = 0 in our example, simply adding empty log2 nt wires completes state preparation of the state b˜ . The technique described here can be used in general to prepare the vector ŵ, arising due to Carleman linearization with the circuit depth scaling approximately as 𝒪(N2) to 𝒪(N2.5). The depth also depends on the user-specified initial and boundary conditions.

Figure 4.

Quantum circuit to prepare the initial condition u0 Eq. (95) for nx = 4 and nx = 8 grid points.

Figure 5.

Quantum circuit to prepare the state 1N(0T,e1T,(e1[2])T,,(e1[N])T)T for nx = 4 with (a) N 2 and (b) N 3.

Figure 6.

Sequential construction of the quantum circuit to prepare the desired state (0,u0T,(u0[2])T,(u0[3])T)T with N = 3, nx = 4 and u0as defined in Eq. (95).

Ansatz: We use a modified version of the ansatz circuit 9 from [20] with 3 repetitions inspired by the work in [12]. This is shown in Figure 7.

Figure 7.

Modified real version of circuit 9 from [20] that is used as the VQLS ansatz.

LCU Decomposition: As discussed in Section 5.2, we use sigma basis-based tensor product decomposition due to its computationally efficiency compared to the Pauli basis for our application. Since sigma basis is non-unitary, we used circuits based on unitary completion [14] to implement the Hadamard test required for computing the VQLS cost functions.

Optimizer: There are several choices of optimizers that can be used with VQLS [21]. We used the Adagrad optimizer from PennyLane with a step size of 0.8. The convergence criteria for VQLS were set to a maximum of 200 iterations. The optimizer was initialized randomly with samples taken from the Beta distribution with shape parameters α = β = 0.5.

VQLS Results: We study the convergence of VQLS for two Carleman truncation levels N = 1, 2 using local VQLS cost function. The VQLS results and convergence of the VQLS cost functions are shown in Figure 8. The VQLS solution is compared with the solution obtained by classically solving the underlying linear system (29). Qualitative comparison of the two solutions indicates that VQLS produces reasonable solutions to the problem. To characterize VQLS solution quality we define aggregated solution error as time average of norm of normalized solution error at each time step, i.e. 97=1nti=1nt w˜iw¯i , where w˜i is the vector of components taken from |w˜ corresponding to classical solution at the time step i, and similarly w¯i is the vector of components taken from |w¯ corresponding to VQLS solution at the i-th time step. Figure 9 shows the error ε for different values of ν and two CL truncation levels. The VQLS solution with N = 2 truncation level has lower error compared to the truncation level N = 1.

Figure 8.

Comparison of normalized solution, obtained classically and via VQLS, of the implicit linear system (29) for the inverse Burgers problem with ν = 0.07 and for two CL truncation levels.

5.3.2.
nBVQPCO Results

Next we study the performance of the nBVQPCO framework. The true optimum of the inverse problem is ν* = 0.07 as we used that value to generate the simulated measurement data y(t). Since the optimization variable is a scalar, we use exhaustive search as the black box optimizer. In this approach we uniformly sample ν in the range [νmin, νmax], use CL+VQLS to generate the normalized solution and compute the design cost (92).

As the curves in Fig. 9b indicate, our algorithm returns the minimum solution at ν = 0.06 for both CL truncation levels N = 1 and N = 2, which is close to the true optimal value of ν* = 0.07. The value of R2 in Eq. (48) ranges from 2.46 to 18 over the range of ν values considered here. Despite being well outside the bound in Eq. (48), CL works well in this practical use case.

Figure 9.

(a) CL+VQLS solution error ε (97) for the inverse Burgers problem for different values of ν and two CL truncation levels N = 1 and N = 2. (b) Design cost (Eq. (92)) as a function of ν obtained via our nBVQPCO framework for CL truncation levels N = 1 and N = 2. Also, marked are ν values where the cost takes the smallest value, which is ν = 0.06 for both the truncation levels. The true optima is known to be ν = 0.07.

6.
Discussion

In this section we discuss various pros/cons of the nBVQPCO framework and a path toward fully fault-tolerant quantum computing implementation. While the nBVQPCO framework is well suited for NISQ implementation, like other variational quantum algorithms it is difficult to subject it to a fully theoretical complexity analysis. As pointed out in Remark 8, it will be worthwhile to analyze VQLS empirical scaling behavior for sparse and structured matrices. Furthermore, the error and computational complexity results we presented depend on the condition in Eq. (48), which can be conservative and too restrictive to be satisfied in practical applications. As we pointed out earlier, for CFD type applications, one can work with alternative forms of conservation laws, e.g., LBM [10], for which the condition in Eq. (48) can more readily be met in practical regimes of interest. Additionally, one can employ a rescaling technique and higher-order discretization schemes proposed in [19] to further improve computational advantage when using the Carleman linearization framework.

One of the bottlenecks in the nBVQPCO framework is the computation of VQLS cost function which scales polynomially with the number of LCU terms. By replacing conventionally employed Pauli basis with sigma basis, we showed in our application that the number of LCU terms scales more favorably. Ideally, one would like this scaling to have a polylogarithmic dependence on the matrix size, and further enhancing the sigma basis type approaches would be necessary to make the proposed framework scalable. Other variational quantum algorithms such as variational quantum eigensolver (VQE), and quantum approximate optimization algorithm (QAOA) can also significantly benefit from such efficient LCU decompositions.

In the nBVQPCO framework one can replace VQLS with the quantum linear systems algorithm (QLSA) [22–24]. This could be beneficial since QLSA has provable exponential advantage over classical linear system solvers and thus comes with rigorous guarantees unlike variational methods. However, QLSA-based implementation is expected to have high quantum resource needs. For instance, in [11], the authors obtained a detailed quantum resource estimate for implementing the CL-based LBM simulation of incompressible flow fields and concluded that the QLSA-based framework can only be feasible in a fault-tolerant quantum computing setting. This study also highlighted that dominant quantum resource needs arise due to the block encoding step which is required to implement the oracle in QLSA. Block encoding in fault-tolerant implementation can be thought of as an analogous step to LCU decomposition in NISQ implementation. Techniques like sigma basis which exploit sparsity and structure for more efficient LCU decomposition can also be potentially leveraged to make block encoding schemes more efficient.

7.
Conclusions

In this paper we presented nBVQPCO, a novel variational quantum framework for nonlinear PDE constrained optimization problems. The proposed framework utilizes Carleman linearization, VQLS algorithm, and a black box optimizer nested in bi-level optimization structure. We presented a detailed computational error and complexity analysis to establish potential benefits of our framework over classical techniques under certain assumptions. We demonstrated the framework on an inverse problem and presented simulation results. The analysis and results demonstrate the correctness of our framework for solving nonlinear simulation-based design optimization problems.

Future work will involve studying and mitigating the effect of device/measurement noise and making the nBVQPCO framework robust. It will also be important to study the scalability of the framework by applying it to larger problem sizes and implementing it on quantum hardware. It will also be beneficial to explore the application of our framework with other PDE discretization approaches such as finite volume and finite element methods, and extend the error and complexity analysis for higher-order temporal discretization schemes. Finally, exploring and refining the framework to better exploit sparsity/structure and extension to fault-tolerant setting as discussed above are also avenues for future research.

DOI: https://doi.org/10.2478/qic-2025-0014 | Journal eISSN: 3106-0544 | Journal ISSN: 1533-7146
Language: English
Page range: 260 - 289
Submitted on: Mar 7, 2025
Accepted on: May 6, 2025
Published on: Jul 1, 2025
Published by: Cerebration Science Publishing Co., Limited
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year
Related subjects:

© 2025 Abeynaya Gnanasekaran, Amit Surana, Hongyu Zhu, published by Cerebration Science Publishing Co., Limited
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.