There exist as yet no quantum computers that fully realize the formal principles of quantum mechanics. Nevertheless, many vendors have already delivered rather "noisy" intermediate-scale quantum computers that follow different development approaches. Currently, the largest quantum computer in operation consists of 433 qubits; the first machine with more than 1000 qubits has recently been announced, and other companies will probably surpass this mark very soon.
Although quantum systems with millions of qubits are not expected to be developed in the short run, the unprecedented potential that such systems offer has been widely accepted, mainly through the study of several R&D use cases in disciplines such as financial technology, medicine, and data science. In particular, these studies have shown that quantum computers can perform certain calculations much faster than classical computers by exploiting principles and properties of quantum mechanics. Such a speed-up is shown, for example, in [1] for computations in cryptography.
Quantum commercialization reports originating from various countries around the globe seem to be rather convincing and present impressive spending intentions and revenue predictions [2].
The benefit from the computing throughput provided by quantum computers has the potential to change many industries. Several quantum applications have recently been proposed and applied to use cases with great success. Demonstrations have been constructed using various types of qubits. However, many important problems might very well not have the inherent properties required for benefiting from quantum computing. Is this the case for scientific computing in general and numerical linear algebra in particular?
For more than half a century, numerical linear algebra has been motivating early explorers such as John von Neumann [3] and Alan Turing [4], computer architects and developers, and generally the whole computer industry [5]. It still does so, as new computer architectures such as FPGAs, GPUs, and TPUs (Tensor Processing Units) emerge.
It is worth recalling that numerical linear algebra and in particular, the LINPACK Benchmark has been, since 1993, the basis to assemble and maintain a list of the 500 most powerful computer systems around the globe [6].
Linear algebra mainly deals with systems of linear algebraic equations of the form
Linear algebra plays a significant role in almost all fields of science and engineering and linear systems of algebraic equations either constitute the core of many applications or are cornerstones of larger algorithms. In either case, they often include enormous amounts of data as n increases arbitrarily.
There is a variety of classical methods for the solution of Linear Systems Problem (LSP) which may be categorized as follows:
Direct Methods like Gaussian elimination, which is suitable for dense matrices and has a time complexity of O(n³).
Iterative Methods like the conjugate gradient method, which is used for sparse matrices and has a time complexity of
In general, all these classical linear solvers are of polynomial complexity. Further improving their performance in a classical computer seems almost impossible. This alone justifies the turn of the researchers’ attention to algorithms that can be implemented in quantum computers and can solve the LSP fast and efficiently.
The objective of our study is two-fold.
Survey the peer-reviewed literature focusing on articles that examine whether quantum computing is suitable for numerical scientific computing in general and numerical linear algebra in particular.
Present our efforts that lead to improvements of already proposed methods.
This review allows us to claim that the intersection between quantum computing and high-performance scientific computing has not yet received the attention it deserves. To the best of our knowledge, our paper is one of the few attempts to address this particular issue. It offers an updated exposition of the subject through a literature review that has long been missing.
Our motivation is further supported by applications that need linear solvers whose efficiency can only be achieved by quantum algorithms. Such needs have recently led to the exploitation of quantum solvers for several specific use cases. Next, we briefly mention a few.
Linear systems describing geologic fracture networks are too vast to be fully solved, even with the most advanced classical techniques, and simplifying them through methods like up-scaling yields only rough approximations that might miss key network properties [7]. For example, omitting smaller fractures could disrupt percolation, eliminating full connectivity within a fracture region.
The computational core in many areas of computational geophysics, in general, and hydrological modeling, in particular, is a linear algebraic system whose size and characteristics prohibit the use of direct solvers. The plethora of existing commercial computational systems rely on iterative methods that apparently require preconditioning. Such an inverse Laplacian preconditioner is introduced in [8] that reduces the system’s condition number scaling from O(N) to
For more related case studies that appeared very recently in the peer-reviewed literature, we refer to Section VIII, entitled "Applications of quantum linear systems solvers", in [9] and the references therein, as well as the article at https://quantumzeitgeist.com/quantum-algorithms-for-solving-linear-systems-why-they-matter/, which accurately represent the overall activities reported both in peer-reviewed articles and in grey-literature reports.
Our review does not focus on quantum mechanics and assumes readers with introductory quantum computing knowledge (e.g., basic gates and states) and some linear algebra background, supplemented by the paper’s preliminaries. We mainly consider this subject from a theoretical standpoint; however, we also proceed to the analysis of a few practical experiments using Qiskit, an open-source SDK for working with quantum computers.
The rest of this paper is organized as follows. In Section 2 we present the necessary theoretical background of quantum mechanics and quantum computing. We mainly focus on the fundamental concepts which are related to the motivation of our study. Section 3 provides the state of the art in the area of quantum numerical linear algebra. In Section 4 we discuss our experimental results on a hybrid quantum-classical linear solver and propose a quantum circuit that enhances the performance of the algorithm. The synopsis and our preliminary concluding remarks can be found in Section 5.
This paper assumes that the reader is aware of the basics of quantum computing. These include, among others, the knowledge of the qubit as the unit of quantum information, quantum states and bra-ket notation, quantum gates and unitary operators, Hermitian matrices, as well as the concepts of quantum systems of n qubits and quantum parallelism. We will take just the meaning of these preliminary terms for granted, however, absolute beginners should not be discouraged, as [10], and a plethora of other sources, provide a thorough and comprehensible introduction to the subject of quantum computing. For example, we assume familiarity with basic quantum state preparation (as in [11], Ch. 5); for a quick recap, see the related material at https://www.youtube.com/channel/UCs9jqTGwJgVpBe50KxLozkw.
Specifically, we first introduce approximation methods of inner products of quantum states, which are necessary for quantum linear algebra as anticipated. In this regard, Sections 2.1 and 2.2 provide insight into the swap test and the Hadamard test respectively. Then in Section 2.3 we explain the concept of density operators which are essential for the description of the quantum solution of an LSP. We conclude with the presentation of the Quantum Fourier Transform in Section 2.4 and the consequent quantum phase estimation in Section 2.5.
In quantum computing, inner products 〈x|y〉 express how similar arbitrary quantum states are: the inner product of identical states is 1, while the inner product of orthogonal states is 0. Although inner products emerge in numerous quantum formulas and algorithms, in practice it is only possible to approximate them.
For their estimation, a probabilistic implementation called swap test is used. This is based on the 3-register quantum circuit of Figure 1 which includes two input registers for the quantum states and an output register.

The Swap (left) and the Hadamard (right) tests.
The transformation of the quantum state of the registers is expressed mathematically as follows:
After the measurement occurs, the probabilities of the output qubit being in the state |0〉 or |1〉 can easily be calculated as follows
The output of the circuit can be considered as a Bernoulli random variable which produces |1〉 with probability
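These statistics are easy to reproduce classically. The following numpy sketch (the states and shot count are illustrative choices, and this is a direct sample of the Bernoulli outcome rather than a circuit simulation) uses the swap-test relation P(|1〉) = (1 − |〈x|y〉|²)/2 and inverts it to estimate the squared overlap:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two normalized 2-qubit states (illustrative random choices).
x = rng.normal(size=4) + 1j * rng.normal(size=4)
y = rng.normal(size=4) + 1j * rng.normal(size=4)
x /= np.linalg.norm(x)
y /= np.linalg.norm(y)

# Swap-test statistics: P(|1>) = (1 - |<x|y>|^2) / 2.
overlap_sq = abs(np.vdot(x, y)) ** 2
p1 = (1.0 - overlap_sq) / 2.0

# Sample the Bernoulli outcome and invert to estimate |<x|y>|^2.
shots = 100_000
ones = rng.binomial(shots, p1)
overlap_est = 1.0 - 2.0 * ones / shots

print(overlap_sq, overlap_est)
```

Note that identical states give p1 = 0 and orthogonal states give p1 = 1/2, so the test can only ever distinguish overlaps up to this statistical resolution.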
Another quantum circuit for the calculation of inner products is the Hadamard test. It computes a value of the form Re(〈ψ|U|ψ〉), where U is an arbitrary unitary operator and |ψ〉 is an arbitrary quantum state. The Hadamard test is a 2-register quantum circuit, illustrated in Figure 1.
The input quantum state of the circuit evolves as follows:
After the measurement occurs, the probabilities of taking |0〉 or |1〉 from the output qubit can be estimated:
Following the same concept as the one used for the swap test, the output of the Hadamard test circuit is a Bernoulli random variable which produces |1〉 with probability
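Analogously, the Hadamard-test statistics P(|0〉) = (1 + Re〈ψ|U|ψ〉)/2 can be checked with a short classical simulation; the state and unitary below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)

# A normalized single-qubit state |psi> and a random unitary U (illustrative).
psi = rng.normal(size=2) + 1j * rng.normal(size=2)
psi /= np.linalg.norm(psi)
m = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
U, _ = np.linalg.qr(m)          # QR of a random complex matrix yields a unitary

# Hadamard-test statistics: P(|0>) = (1 + Re<psi|U|psi>) / 2.
expval = np.vdot(psi, U @ psi)
p0 = (1.0 + expval.real) / 2.0

# Estimate Re<psi|U|psi> from simulated measurement frequencies.
shots = 100_000
zeros = rng.binomial(shots, p0)
re_est = 2.0 * zeros / shots - 1.0

print(expval.real, re_est)
```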
Quantum states are divided into two categories:
Pure states: They can be sufficiently represented with a vector |x〉 in bra-ket notation and can be defined without uncertainty even when they are complex superpositions of the basis states.
Mixed states: They include some uncertainty about the quantum state. For this reason, these quantum states are a probabilistic combination of multiple pure states.
The density operators ρ are matrices that are designed to express both pure and mixed states:
For pure states, they are defined as the outer product of the quantum state:
\rho = |x\rangle \langle x|. \qquad (1)

For mixed states that consist of a set of pure states |x_j\rangle with probabilities of occurrence p_j (\sum_j p_j = 1), they are defined as the weighted sum of the outer products of these pure states:

\rho = \sum_j p_j |x_j\rangle \langle x_j|. \qquad (2)

Density operators are Hermitian positive semi-definite matrices with unit trace, so

Tr(\rho) = \sum_j \rho_{jj} = 1.
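These defining properties are straightforward to verify numerically. A small sketch with an illustrative pure state and an illustrative two-component mixture:

```python
import numpy as np

# Pure state |+> = (|0> + |1>) / sqrt(2):  rho = |x><x|.
x = np.array([1.0, 1.0]) / np.sqrt(2.0)
rho_pure = np.outer(x, x.conj())

# Mixed state: |0> with probability 3/4 and |1> with probability 1/4.
e0, e1 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
rho_mixed = 0.75 * np.outer(e0, e0) + 0.25 * np.outer(e1, e1)

for rho in (rho_pure, rho_mixed):
    assert np.isclose(np.trace(rho), 1.0)               # unit trace
    assert np.allclose(rho, rho.conj().T)               # Hermitian
    assert np.all(np.linalg.eigvalsh(rho) >= -1e-12)    # positive semi-definite

# The purity Tr(rho^2) equals 1 exactly for pure states and is < 1 for mixed ones.
purity_pure = np.trace(rho_pure @ rho_pure).real
purity_mixed = np.trace(rho_mixed @ rho_mixed).real
print(purity_pure, purity_mixed)
```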
The counterpart of the discrete Fourier transform (DFT) in quantum computing is the Quantum Fourier Transform (QFT), whose input sequence includes a total number of samples (quantum states) equal to a power of 2. The QFT formula is given by:
An important property of the resulting state in (3) is that it can be expressed as a tensor product of single-qubit states. To see this, we write the integers x and y in binary form:
where each xj, yj ∈ {0, 1} and x0, y0 denote the least significant bits. We also use binary fraction notation:
where x1 denotes the first digit after the binary point (contributing 1/2), x2 the second digit (contributing 1/22), and so on. With this notation, the exponential factor can be decomposed as
This shows that the state in (3) factorizes as
Figure 2 illustrates how (5) can be implemented in a quantum circuit that consists of Hadamard gates and controlled phase shift gates:

Implementation of the Quantum Fourier Transform. (Reproduced from https://commons.wikimedia.org/wiki/File:Q_fourier_nqubits.png via Wikimedia Commons.)
The Rk gates used in the circuit are phase shift gates of the form:
The inverse of the QFT can easily be implemented by reversing the circuit. This inverse is of vital importance, since it can take an input state of the form (3) and recover the corresponding computational basis state
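Since the QFT on N = 2^n basis states is just the unitary matrix with entries ω^{xy}/√N, where ω = e^{2πi/N}, both it and its inverse can be checked directly against numpy's FFT (the qubit count below is an arbitrary choice):

```python
import numpy as np

n = 3                      # number of qubits (illustrative)
N = 2 ** n                 # dimension of the state space
omega = np.exp(2j * np.pi / N)

# QFT matrix: F[y, x] = omega^{x y} / sqrt(N).
F = np.array([[omega ** (x * y) for x in range(N)] for y in range(N)]) / np.sqrt(N)

# F is unitary, so the inverse QFT is simply the conjugate transpose.
assert np.allclose(F @ F.conj().T, np.eye(N))

# Applying the QFT and then its inverse recovers a computational basis state.
x = np.zeros(N); x[5] = 1.0
assert np.allclose(F.conj().T @ (F @ x), x)

# Relation to numpy's DFT sign convention: the QFT matches the inverse FFT
# up to the sqrt(N) normalization.
v = np.arange(N, dtype=complex)
assert np.allclose(F @ v, np.fft.ifft(v) * np.sqrt(N))
```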
The quantum phase estimation problem [12] refers to an unknown unitary operator U that has an eigenvector |ψ〉 with a corresponding eigenvalue

Phase Estimation Algorithm.
Based on properties of eigenvectors and eigenvalues we get that
Using (6) for all the qubits of the circuit, the subsequent state of the quantum system before the inverse QFT is applied is
As the quantum state of (7) resembles the one of (3), if ϕ can be written with at most n binary digits (ϕ = 0.a1 a2⋯an), the inverse QFT can produce the precise value of ϕ. In cases where this is not possible, it has been proved that the phase estimation algorithm outputs the best n-bit approximation of the value of ϕ with probability
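The output distribution of ideal phase estimation can be computed classically: the amplitude of outcome y is (1/2^n) Σ_k e^{2πik(ϕ − y/2^n)}. The sketch below (with an illustrative ϕ and n of our choosing) confirms that the most likely outcome is the nearest n-bit fraction and that its probability exceeds the well-known 4/π² lower bound:

```python
import numpy as np

def qpe_distribution(phi: float, n: int) -> np.ndarray:
    """Exact outcome probabilities of ideal n-bit phase estimation for phase phi."""
    N = 2 ** n
    k = np.arange(N)
    probs = np.empty(N)
    for y in range(N):
        amp = np.sum(np.exp(2j * np.pi * k * (phi - y / N))) / N
        probs[y] = abs(amp) ** 2
    return probs

# phi = 0.1 is not an exact 4-bit binary fraction, so the output is spread out.
probs = qpe_distribution(0.1, 4)
best = int(np.argmax(probs))

# The most likely outcome is the 4-bit fraction closest to 0.1, i.e. 2/16 = 0.125.
print(best / 16, probs[best])
```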
Next, we review selected recent papers that present quantum linear system solvers. In particular, we describe in Section 3.1 the quantum mechanical version of the linear system problem. Then we discuss the main eigenvalue quantum methods for linear systems. Specifically, in Section 3.2 we focus on the most well-known quantum solver, the HHL algorithm, and in Section 3.3 we analyze the WZP algorithm which is similar to the HHL algorithm and can be efficiently applied both to dense and sparse matrices. In Section 3.4 we discuss the iterative solution algorithms and in particular we describe the row and column iterative methods. In Section 3.5 we focus on the variational quantum linear solver which is a hybrid method that requires both a classical and a quantum computer in order to compensate for the flawed nature of the current quantum computers. Finally, in Section 3.6 we present another hybrid algorithm that is based on random walks. We summarize and briefly compare the above algorithms in Section 3.7.
A Linear Systems Problem (LSP) takes as input a matrix A ∈ Rm×n and a vector
where A is a quantum operator, while |x〉 and |b〉 are quantum states represented by quantum vectors. A and |b〉 are given and the purpose of the QLSP is to construct a quantum state |x〉 such that:
Contrary to the classical LSP, |x〉 is not the exact solution of the linear system, because such quantum vectors are normalized by default, but it is proportional to the actual solution vector:
For a quantum solution to be successful, given a threshold ϵ > 0, the produced quantum density operator ρx needs to satisfy the following inequality for the trace distance Dρx,x
The most prominent quantum solution method for linear systems of equations was proposed by Harrow, Hassidim, and Lloyd [13]. Their approach, which is commonly referred to as HHL, involves n × n linear systems of the form
The new matrix A′ is Hermitian, since A′† = A′. The preference for Hermitian matrices is due to their capability to be decomposed as follows:
where λi are their eigenvalues and |ui〉 their corresponding eigenvectors. This simplifies the calculation of their inverse matrix
Expressing the vector
it is possible to find the solution of the linear system using the formula
The output of the HHL algorithm is a solution quantum state of the form
m qubits of working space initialized in |0〉⊗m
⌈log2 n⌉ qubits which host the vector |b〉
an ancilla – supplementary – qubit initialized in |0〉.
The preparation of the vector |b〉 is based on the work of Grover et al. [14] and is taken for granted for the purposes of this algorithm. Using the eigenvectors |ui〉 as the basis of the state space, |b〉 can be represented as follows:
where βi = 〈ui|b〉. The HHL solution method can be divided into the following steps:
Phase estimation: The working space qubits constitute the control qubits while the conditional Hamiltonian evolution unitary operator [15] is used:
U = \sum_{k=0}^{T-1} |k\rangle \langle k| \otimes e^{iAkt_0/T}.

A full analytical derivation of this operator can be found in [15]. Its purpose is to construct the exponentiated e^{iAt} for a superposition of different values of t, which is then applied to |b〉, as an operator of this form is suitable for mapping the eigenvalues of A onto the working space.
So, phase estimation yields the eigenvalue-encoded state

\sum_{i=1}^{n} \beta_i |u_i\rangle |\lambda_i\rangle, \qquad (9)

as seen in Section 2.5, with a precision of m bits.
Controlled rotation: This process is applied on the ancilla qubit. As A is Hermitian, the values \lambda_i^{-1} are the eigenvalues of the inverse of A, and they need to be incorporated into the state (9). The values of λi first need to be extracted in order to calculate the rotation angle θi, for which \sin(\theta_i) = C / \lambda_i with C ∈ R. The state that results from this process is

\sum_{i=1}^{n} \beta_i |u_i\rangle |\lambda_i\rangle \left( \sqrt{1 - \frac{C^2}{\lambda_i^2}} |0\rangle + \frac{C}{\lambda_i} |1\rangle \right), \quad C \in \mathbf{R}. \qquad (10)

Inverse phase estimation: This is the reverse of the first step, so the qubits of the working space are reset to |0〉 and the resulting state is:

\sum_{i=1}^{n} \beta_i |u_i\rangle \left( \sqrt{1 - \frac{C^2}{\lambda_i^2}} |0\rangle + \frac{C}{\lambda_i} |1\rangle \right). \qquad (11)
The algorithm terminates with a post-selection on the ancilla qubit being in the state |1〉. This means that we condition on the outcome of the measurement of this qubit being |1〉, so the quantum state that occurs from this process is:
which is proportional to the theoretical solution. A schematic diagram that summarizes the steps of the HHL algorithm is given in Figure 4:

Schematic representation of the HHL algorithm ([16]).
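The linear-algebra identity that HHL exploits, namely inverting A on its eigenbasis so that the solution is Σ_i (β_i/λ_i)|u_i〉, can be verified with a purely classical eigendecomposition. This is a sanity check of the mathematics, not a quantum implementation; the Hermitian system below is random and illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

# A random 4x4 symmetric (Hermitian) system, shifted to keep it well conditioned.
m = rng.normal(size=(4, 4))
A = (m + m.T) / 2 + 4 * np.eye(4)
b = rng.normal(size=4)
b = b / np.linalg.norm(b)            # |b> must be a unit vector

# Spectral decomposition A = sum_i lambda_i |u_i><u_i|.
lam, U = np.linalg.eigh(A)

# beta_i = <u_i|b>; the solution is x = sum_i (beta_i / lambda_i) u_i.
beta = U.T @ b
x = U @ (beta / lam)

# The same vector, obtained from a classical direct solver:
assert np.allclose(x, np.linalg.solve(A, b))

# The quantum state HHL outputs corresponds to the normalized version of x.
x_state = x / np.linalg.norm(x)
```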
Contrary to the optimal classical linear solvers, which solve a linear system with a time complexity that scales polynomially with the size of the system, the corresponding complexity of the proposed quantum HHL algorithm is O(s²κ² log(n)/ϵ) when A is s-sparse. This translates into an exponential speed-up. The density of the matrix is a significant, but not the sole, factor that determines the performance of the algorithm. The condition number κ also needs to be taken into consideration, as it sometimes scales polynomially or even exponentially with n. In this case, it is doubtful that HHL can be faster than its classical counterparts and the scaling of κ must be controlled. This could possibly be achieved in two ways according to [13]:
By processing only the well-conditioned part of A:

\sum_{j:\lambda_j \ge 1/\kappa} \lambda_j^{-1} \beta_j |u_j\rangle |well\rangle + \sum_{j:\lambda_j \lt 1/\kappa} \beta_j |u_j\rangle |ill\rangle.

By applying preconditioners in order to improve the condition of the matrix before proceeding with the algorithm.
The dependence of the complexity of HHL on the precision parameter ϵ is also noteworthy. This is mainly related to the use of the phase estimation routine which can significantly reduce the performance of the algorithm if high accuracy of the solution is required. Finally, possible sources of failure need to be considered. Phase estimation is one of them because it can produce wrong results under certain circumstances (Section 2.5). In addition, postselection is not guaranteed to succeed. According to [13], if the constant C is O(1/κ), O(κ) repetitions of the routine including the post-selection step seem to be enough to produce the desired result.
The HHL algorithm is a breakthrough in the field of quantum linear solvers, although it has certain limitations. It remains the cornerstone of the field, as illustrated by the several solvers it has inspired in recent years.
Another notable quantum solver for linear sets of equations is that proposed by Wossnig, Zhao, and Prakash [17] which is commonly referred to as WZP. This algorithm bears some resemblance to the HHL algorithm, but it is appropriate for both dense and sparse matrices. Similar to HHL, the matrix A needs to be Hermitian, but non-Hermitian matrices may also be treated as described for HHL above.
The WZP algorithm relies on quantum singular value estimation (QSVE) which has been proposed in [18] for the implementation of a quantum recommendation system. QSVE is a factorization method that estimates the eigenvalues of the matrix A. It takes as an input a state of the form
where U and V are unitary matrices whose columns are the left and right singular vectors of A, and Σ is a diagonal matrix whose nonnegative entries σi are called the singular values of A. In particular, for each singular value σi we have
where vi is a right singular vector and ui is the corresponding left singular vector. We will use this notation throughout the description of QSVE. This process is based on two matrices P and Q which are defined as follows:
The i-th column of the matrix P ∈ R^{n²×n} is equal to

e_i \otimes \frac{A_i}{\|A_i\|},

with 1 ≤ i ≤ n, where A_i is the i-th row of the matrix A.

The j-th column of the matrix Q ∈ R^{n²×n} is equal to

\frac{A_F}{\|A\|_F} \otimes e_j,

with 1 ≤ j ≤ n, where A_F ∈ R^n is a vector which contains the norms ‖A_i‖ of the rows of the matrix A and ‖A‖_F is the Frobenius norm of the matrix A.
P and Q have orthonormal columns, as P† P = In and Q† Q = In, and the following equalities hold
For the implementation of these quantum states, a data structure is used that performs the mappings:
The complete presentation of this data structure is available in the cited literature [18], while we include here the essential elements relevant to our discussion. The combination of (12) and (13) proves that the matrices P and Q can be used to factorize a normalized version of the matrix A:
Based on the matrices P and Q, two other matrices U and V are defined as follows
These are the reflections in the column space of P and Q respectively. The matrix
is the basis of the QSVE algorithm. It can be proved that given a singular vector vi of A with singular value σi, Qvi is an eigenvector of W with eigenvalue e^{iθ_i}, where cos(θ_i/2) = σ_i/‖A‖_F. This result stems from the research on bipartite quantum walks [19]. A preliminary element of its proof is the fact that an orthogonal projection that maps a vector Qv of the column space of Q to the column space of P is based on the orthogonal projector PP†, while mapping a vector Pw to the column space of Q is based on the orthogonal projector QQ†:
Given a singular value σi of A with right singular unit vector vi and left singular unit vector ui, it follows that:
It can be observed that (15) transforms Qvi to Pui, thus it transforms a vector from the column space of Q to the column space of P. On the other hand, (16) does the inverse. Furthermore, it can be proved that:
and similarly |Pui| = |ui|. Taking these observations into consideration, it is evident that
Geometrically θi/2 is equal to the angle between Pui and Qvi, as:
Thus, an alternative expression for (15) and (16) is the following:
The subspace with basis {Pui, Qvi} is 2-dimensional in the non-trivial case. This subspace is invariant under the action of the matrix W, as W is a reflection in the column space of Q followed by a reflection in the column space of P, which in the case of the aforementioned 2-dimensional space translates to a reflection about the axis Qvi followed by a reflection about the axis Pui. The total effect of these two reflections is a rotation by an angle that is double the angle θi/2 between the axes. Therefore e^{iθ_i} is an eigenvalue of W and the following equality holds:
The previous analysis is essential for the comprehension of the steps of QSVE. These include:
The preparation of the quantum state of an arbitrary vector, which can be expressed using a basis of the vector space that consists of the singular vectors vi of A ∈ Rn×n:

|x\rangle = \sum_i a_i |v_i\rangle.

The transformation of the previous quantum state using the Q operator:

|Qx\rangle = \sum_i a_i |Qv_i\rangle.

Then ⌈log2 n⌉ qubits initialized in |0〉 are appended to the resulting state. These qubits will be required for the next step.

Phase estimation, which extracts \bar\theta_i as estimations of the phases of the eigenvalues of W, building on the |Qx〉 state. The resulting state of this step is

\sum_i a_i |Qv_i, \bar\theta_i\rangle.

The transformation of the previous state in order to approximate the singular values σi of A, taking \bar\theta_i into account. This is achieved using the formula

\bar\sigma_i = \cos(\bar\theta_i / 2) \|A\|_F.

The reversion of the effect of the operator Q which was used in step 2, so that the final state is reached:

\sum_i a_i |v_i\rangle |\bar\sigma_i\rangle.
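The whole construction can be checked numerically for a small random matrix. In the classical sketch below (an illustration, not the quantum algorithm) we take W in the standard reflection form W = (2PP† − I)(2QQ† − I), which is our reading of the reflections described above, and recover the singular values of A from the eigenphases of W:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 3
A = rng.normal(size=(n, n))
A_F = np.linalg.norm(A)                   # Frobenius norm ||A||_F
row_norms = np.linalg.norm(A, axis=1)

# P, Q in R^{n^2 x n} with the column definitions given above.
P = np.zeros((n * n, n))
Q = np.zeros((n * n, n))
for i in range(n):
    e = np.zeros(n); e[i] = 1.0
    P[:, i] = np.kron(e, A[i] / row_norms[i])     # e_i (x) A_i / ||A_i||
    Q[:, i] = np.kron(row_norms / A_F, e)         # (A_F vector / ||A||_F) (x) e_i

# Orthonormal columns, and P^T Q equals the normalized matrix A / ||A||_F.
assert np.allclose(P.T @ P, np.eye(n))
assert np.allclose(Q.T @ Q, np.eye(n))
assert np.allclose(P.T @ Q, A / A_F)

# W: reflection in col(Q) followed by reflection in col(P).
W = (2 * P @ P.T - np.eye(n * n)) @ (2 * Q @ Q.T - np.eye(n * n))

# Nonzero eigenphases of W come in +/- pairs; cos(theta_i/2) = sigma_i / ||A||_F.
phases = np.sort(np.abs(np.angle(np.linalg.eigvals(W))))[::-1]
theta = phases[: 2 * n : 2]               # one representative per +/- pair
recovered = np.sort(A_F * np.cos(theta / 2.0))[::-1]
sigma = np.linalg.svd(A, compute_uv=False)
assert np.allclose(recovered, sigma, atol=1e-8)
```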
The above process is a significant part of the proposed WZP method. As the algorithm works with Hermitian matrices, the singular values equal the absolute values of the eigenvalues, so the knowledge of the approximations \bar\sigma_i = |\bar\lambda_i| alone is not sufficient: the signs of the eigenvalues must also be recovered. The matrix A is assumed to be scaled so that \lambda_i \in [-1, -1/\kappa] \cup [1/\kappa, 1]. The algorithm proceeds as follows:

Prepare the state of an arbitrary vector and express it as a linear combination \sum_i \beta_i |v_i\rangle of the singular vectors vi of A.

Use the QSVE routine for the matrices A and A + μI with μ = 1/κ. The matrix A + μI has the same eigenvectors as A and its eigenvalues are equal to μ + λi, giving us

\sum_i \beta_i |v_i\rangle_A \, \big| |\bar\lambda_i| \big\rangle_B \, \big| |\bar\lambda_i + \mu| \big\rangle_C.

The notation A, B, C divides the generated quantum state into individual registers. The values of the registers B and C reveal the sign of the eigenvalues λi. If λi > 0, then |λi + μ| > |λi|, so the value of register C is larger than the value of register B. On the other hand, if λi < 0, then \lambda_i \lt -1/\kappa \Rightarrow \lambda_i \lt -\mu \Rightarrow \lambda_i + \mu \lt 0 because of the aforementioned scaling of A. Thus:

\lambda_i \lt \lambda_i + \mu \Rightarrow -|\lambda_i| \lt -|\lambda_i + \mu| \Rightarrow |\lambda_i| \gt |\lambda_i + \mu|,

and, as a consequence, the value of register B is now larger than the value of register C.

Introduce an additional ancilla register D which is set to |1〉 if register B contains a larger value than register C, namely when λi < 0. We apply a conditional phase gate so that the state that occurs is the following:

\sum_i (-1)^{f_i} \beta_i |v_i\rangle_A \, \big| |\bar\lambda_i| \big\rangle_B \, \big| |\bar\lambda_i + \mu| \big\rangle_C \, |f_i\rangle_D.

Add another ancilla register and apply a controlled rotation by γ = O(1/κ) conditioned on the register B. Afterwards, the registers B, C, D are no longer required, so we reset them and reach the state

\sum_i (-1)^{f_i} \beta_i |v_i\rangle \left( \frac{\gamma}{|\bar\lambda_i|} |0\rangle + \sqrt{1 - \left(\frac{\gamma}{\bar\lambda_i}\right)^2} |1\rangle \right).

To obtain a state |x〉 proportional to the solution of the linear system, we post-select on the register that was introduced in this step being in state |0〉. The choice of the value of γ is such that the measurement produces |0〉 with high probability, so in any case, not too many repetitions should be required.
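The sign-recovery comparison between registers B and C is purely arithmetic and can be illustrated classically; the eigenvalues and κ below are arbitrary choices consistent with the assumed scaling:

```python
import numpy as np

kappa = 10.0
mu = 1.0 / kappa

# Eigenvalues scaled into [-1, -1/kappa] U [1/kappa, 1], as the method assumes.
lams = np.array([-0.8, -0.15, 0.12, 0.9])

# QSVE only returns magnitudes; the shift by mu reveals the sign.
reg_B = np.abs(lams)          # |lambda_i|       from QSVE of A
reg_C = np.abs(lams + mu)     # |lambda_i + mu|  from QSVE of A + mu*I

# B > C exactly when lambda_i < 0, so comparing the registers recovers the sign.
signs = np.where(reg_B > reg_C, -1.0, 1.0)
assert np.array_equal(signs, np.sign(lams))
```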
The WZP method achieves an exponential speed-up in the size of the linear system and, unlike HHL, performs well even for dense matrices. The complexity of the algorithm is
The WZP method has some resemblance to the HHL algorithm, mainly because it is based on the eigenvalues of A in order to calculate the solution vector. In addition, both algorithms use phase estimation as part of their routines. However, this can significantly increase their cost, as phase estimation is an expensive procedure when high accuracy is required. On the other hand, WZP has an important advantage over HHL. In particular, HHL requires the preparation of suitable Hamiltonians which are difficult to implement in order to appropriately represent A, whereas WZP relies on a tree-like data structure which seems easier to construct. Therefore, it would be justified to prefer WZP over HHL under certain conditions.
The use of iterative methods as quantum solvers for linear sets of equations has been actively investigated in recent years [20–22], as the potential of quantum parallelism can lead to a speed-up in comparison to the corresponding classical algorithms. Classical iterative methods are widely used nowadays for large-scale linear systems, because they are faster than the direct methods, especially in cases where a very precise solution is not required or the matrix is sparse. The complexity of these classical algorithms is approximately O(Tn), where T is the total number of iterations and n is the size of the system.
Quantum iterative algorithms achieve an exponential acceleration in n; however, their main limitation is that T significantly reduces their performance, and some of them even have an exponential dependence on it. This is mainly due to the requirement for multiple copies of the approximation xk of the solution, which is calculated during the k-th iteration, in order to proceed to the (k + 1)-th iteration. The algorithms often need to restart repeatedly from x0 an exponential number of times until the desired number of copies of xk is reached.
Current research indicates that it is probably not feasible to implement a quantum algorithm that can outperform classical iterative solution methods both in n and T. An exponential speed-up in n is definitely possible, however, a polynomial complexity in T is the best we can expect at the moment.
An interesting work towards that direction is proposed in [20] and is based on the row and column iteration methods, which are used in many large-scale data science applications. The main advantage of these methods is that each iteration needs only a single row or column of A, not the whole matrix. The two methods are defined as follows:
Row (or Kaczmarz) iteration method: It selects in each iteration a row with index 1 ≤ ik ≤ n with probability proportional to the norm of the rows of the matrix. If we denote the i-th row of A as ai and the i-th element of the vector \vec{b} as bi, the (k + 1)-th approximation of the solution of the system is calculated as follows:

x_{k+1} = x_k - \left( \frac{a_{i_k}^T x_k}{\|a_{i_k}\|} - \frac{b_{i_k}}{\|a_{i_k}\|} \right) \frac{a_{i_k}}{\|a_{i_k}\|}. \qquad (17)

Geometrically, a step of the row iteration method is equivalent to the orthogonal projection of the vector xk onto the hyperplane a_{i_k}^T x = b_{i_k}.

Column (or coordinate descent) iteration method: It selects in each iteration a column with index 1 ≤ jk ≤ n with probability proportional to the norm of the columns of the matrix. If we denote the j-th column of A as cj and the j-th column of the identity matrix as ej, the (k + 1)-th approximation of the solution of the system is calculated as follows:

x_{k+1} = x_k + \frac{c_{j_k}^T (b - A x_k)}{\|c_{j_k}\|^2} e_{j_k}. \qquad (18)

If we denote the residual of the k-th iteration as rk = b − Axk, the expression of (18) can be written as follows:

x_{k+1} = x_k + \frac{c_{j_k}^T r_k}{\|c_{j_k}\|^2} e_{j_k}, \qquad (19)

where the residual also follows an iterative scheme:

r_{k+1} = b - A x_{k+1} = b - A x_k - \frac{c_{j_k}^T r_k}{\|c_{j_k}\|^2} A e_{j_k} = r_k - \frac{c_{j_k} c_{j_k}^T}{\|c_{j_k}\|^2} r_k. \qquad (20)
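The classical randomized Kaczmarz iteration (17) is easy to implement directly. The sketch below uses a random well-conditioned system of our choosing and samples rows with probability proportional to their squared norms (the standard randomized Kaczmarz choice); each step projects the iterate onto the hyperplane of the selected equation:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 20
A = rng.normal(size=(n, n)) + n * np.eye(n)   # a well-conditioned test system
b = rng.normal(size=n)

# Row sampling probabilities proportional to the squared row norms.
row_norms2 = np.linalg.norm(A, axis=1) ** 2
p = row_norms2 / row_norms2.sum()

x = np.zeros(n)
for _ in range(5000):
    i = rng.choice(n, p=p)
    a_i = A[i]
    # Project x onto the hyperplane a_i^T x = b_i, cf. (17).
    x = x - ((a_i @ x - b[i]) / row_norms2[i]) * a_i

assert np.allclose(x, np.linalg.solve(A, b), atol=1e-6)
```

Each projection never increases the distance to the true solution, and for a consistent system the expected error decays geometrically, which is why the fixed iteration budget above suffices here.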
The quantum algorithm of Shao assumes the existence of a QRAM [23] which is responsible for preparing a row a_t in the form of a quantum state and is equivalent to a unitary operator Vt such that:
All the rows of A are supposed to be scaled to ‖at‖ = 1 for the purpose of their representation as quantum states. The solution unit vector approximations |xk〉 are assumed to be included in quantum states |Xk〉 such that
Here and in what follows, we use the notation |⋯〉 as a placeholder for an unspecified quantum state, whose explicit form is not relevant to the argument. The basic idea of the algorithm involves preparing a quantum state of the form
Applying a SWAP gate between the first two qubits of such a state we get
For the next step, the algorithm deploys the following unitary operator:
Through equality
Using (22) and applying the operator $U_t$ on the first part of the quantum state of (21) produces the quantum state calculated below:
From (17) it can be observed that in terms of quantum states we have
and this expression resembles the first part of (24) if the values of β, γ are chosen appropriately. For example, β = ‖xk‖ and
Following this intuitive explanation of the concept of the quantum row iteration algorithm, the analytical sequence of its steps is presented next.
An initial approximation |x0〉 of the unit vector of the solution is selected. We set k = 0 and a variable vk = 1 to express the state of the quantum system as follows:
\[
|X_k\rangle = \frac{\lVert x_k\rVert}{v_k}\,|0\rangle^{\otimes k}|x_k\rangle + |0^{\perp}\rangle^{\otimes k}|\cdots\rangle. \tag{25}
\]
Here $|0^{\perp}\rangle^{\otimes k}$ denotes a tensor product of states orthogonal to $|0\rangle$. The explicit form of these states is not needed, as they serve only to complete the expression.
The row with index $t_k \in \{1, \cdots, n\}$ is chosen in order to prepare a state of the form
\[
|Y_k\rangle = \beta_{t_k}|0\rangle|X_k\rangle + \gamma_{t_k}|1\rangle|0\rangle^{\otimes k}|a_{t_k}\rangle, \quad \text{where } \beta_{t_k}^2 = \frac{v_k^2}{v_k^2 + b_{t_k}^2} \text{ and } \gamma_{t_k}^2 = 1 - \beta_{t_k}^2.
\]
To implement this state, the coefficients $\beta_{t_k}$ and $\gamma_{t_k}$ are introduced with a suitable rotation operator, whereas $|X_k\rangle$ and $|a_{t_k}\rangle$ are incorporated using appropriate controlled operations.
A SWAP operator exchanges the first with the $(k+1)$-th qubit of $|Y_k\rangle$ and an operator of the form $I_2^{\otimes k} \otimes U_{t_k}$ is applied:
\[
\begin{aligned}
|X_{k+1}\rangle &= (I_2^{\otimes k} \otimes U_{t_k})\,\mathrm{swap}_{1,k+1}|Y_k\rangle \\
&= (I_2^{\otimes k} \otimes U_{t_k})\,\mathrm{swap}_{1,k+1}\bigl(\beta_{t_k}|0\rangle|X_k\rangle + \gamma_{t_k}|1\rangle|0\rangle^{\otimes k}|a_{t_k}\rangle\bigr) \\
&= (I_2^{\otimes k} \otimes U_{t_k})\,\mathrm{swap}_{1,k+1}\left(\frac{\beta_{t_k}\lVert x_k\rVert}{v_k}|0\rangle^{\otimes k}|0\rangle|x_k\rangle + \gamma_{t_k}|1\rangle|0\rangle^{\otimes k}|a_{t_k}\rangle + |0^{\perp}\rangle^{\otimes(k+1)}|\cdots\rangle\right) \\
&= |0\rangle^{\otimes(k+1)} \otimes \left(\frac{\beta_{t_k}\lVert x_k\rVert}{v_k}(I_n - |a_{t_k}\rangle\langle a_{t_k}|)|x_k\rangle + \gamma_{t_k}(|a_{t_k}\rangle\langle a_{t_k}|)|a_{t_k}\rangle\right) + |0^{\perp}\rangle^{\otimes(k+1)}|\cdots\rangle \\
&= |0\rangle^{\otimes(k+1)} \otimes \left(\frac{\beta_{t_k}\lVert x_k\rVert}{v_k}(I_n - |a_{t_k}\rangle\langle a_{t_k}|)|x_k\rangle + \gamma_{t_k}|a_{t_k}\rangle\right) + |0^{\perp}\rangle^{\otimes(k+1)}|\cdots\rangle.
\end{aligned} \tag{26}
\]
From the previous step we have $\beta_{t_k}^2 = \frac{v_k^2}{v_k^2 + b_{t_k}^2}$ and $\gamma_{t_k}^2 = 1 - \beta_{t_k}^2 = \frac{b_{t_k}^2}{v_k^2 + b_{t_k}^2}$, so $b_{t_k}^2\beta_{t_k}^2 = \gamma_{t_k}^2 v_k^2 \Rightarrow \gamma_{t_k} = \frac{\beta_{t_k} b_{t_k}}{v_k}$. As a result, (26) can alternatively be written as follows:
\[
\begin{aligned}
|X_{k+1}\rangle &= \frac{\beta_{t_k}}{v_k}|0\rangle^{\otimes(k+1)} \otimes \bigl(\lVert x_k\rVert (I_n - |a_{t_k}\rangle\langle a_{t_k}|)|x_k\rangle + b_{t_k}|a_{t_k}\rangle\bigr) + |0^{\perp}\rangle^{\otimes(k+1)}|\cdots\rangle \\
&= \frac{\lVert x_{k+1}\rVert}{v_{k+1}}|0\rangle^{\otimes(k+1)}|x_{k+1}\rangle + |0^{\perp}\rangle^{\otimes(k+1)}|\cdots\rangle, \quad \text{with } v_{k+1} = \frac{v_k}{\beta_{t_k}}.
\end{aligned}
\]
It can be observed that the produced state has the form of (25); thus an iteration of the algorithm is completed at this point. To proceed to the next iteration of the algorithm, we set k = k + 1 and go to step 2.
Similarly to the row iteration method, the column iteration method assumes the existence of a unitary operator $S_j$ which prepares the column vectors $|c_j\rangle$:
These columns are scaled to ‖cj‖ = 1 so that they can be represented as quantum states. The algorithm initially selects an approximation x0, calculates the initial residual r0 = b − Ax0 and then updates the respective quantum states |xk〉 and |rk〉 as follows during each iteration
The approximations |xk〉 and the residuals |rk〉 are included in quantum states |Xk〉 and |Rk〉, respectively, of the form
The update of |rk〉 is achieved with a SWAP operator which exchanges the first with the (k + 1)-th qubit and a
This resembles the update of |xk〉 in the row iteration algorithm if we set vk = 1. For the update of |xk〉 it is assumed that the approximations are represented as follows:
Using two ancilla qubits these quantum states are combined in a state of the form:
A Wtk operator is applied, where:
as well as a SWAP operator exchanging the second with the third qubit, to produce:
The first part of |ϕ2〉 is similar to (27), but to compute |xk + 1〉 it is essential to rotate its third qubit with an appropriate rotation operator of the form
This leads to the following state
Combining (27) and (28), the expected quantum state of |Xk + 1〉 has the form:
Comparing this state with (28) and setting vk = k + 1, it is evident that
At this point, the reader should have a sufficient understanding of how the quantum column iteration algorithm works. Next, we summarize and analyze its main steps.
The algorithm begins with an initial approximation |x0〉 of the unit vector of the solution. An initial residual r0 = b − Ax0 is calculated. The iteration number k is set to 0 and the quantum states |Xk〉 and |Rk〉 are prepared.
The column with index $t_k \in \{1, \cdots, n\}$, which is used for each iteration, is chosen in order to prepare a state of the form
\[
|\psi_0\rangle = \sqrt{\tfrac{k+1}{k+2}}\,|00\rangle|X_k\rangle + \sqrt{\tfrac{1}{k+2}}\,|10\rangle (I^{\otimes 2k} \otimes S_{t_k})|0\rangle^{\otimes k}|R_k\rangle.
\]
Two SWAP operators exchange the second with the $(k+2)$-th qubit and the first with the $(k+1)$-th qubit of the state $|\psi_0\rangle$. Afterwards, in order to update $|X_k\rangle$, an operator of the form
\[
(I_2^{\otimes(2k+1)} \otimes G_k \otimes I_n)(I_2^{\otimes 2k} \otimes W_{t_k})
\]
is applied. To provide more insight into this process, by substituting $|X_k\rangle$ and $|R_k\rangle$, $|\psi_0\rangle$ can alternatively be expressed as follows:
\[
|\psi_0\rangle = \sqrt{\tfrac{k+1}{k+2}}\,|00\rangle\left(\frac{\lVert x_k\rVert}{k+1}|0\rangle^{\otimes 2k}|x_k\rangle + |0^{\perp}\rangle^{\otimes 2k}|\cdots\rangle\right) + \sqrt{\tfrac{1}{k+2}}\,|10\rangle\left(\lVert r_k\rVert\,|0\rangle^{\otimes 2k} S_{t_k}|r_k\rangle + |0^{\perp}\rangle^{\otimes 2k}|\cdots\rangle\right).
\]
The use of the SWAP gates leads to state $|\psi_1\rangle$:
\[
|\psi_1\rangle = |0\rangle^{\otimes 2k} \otimes \left(\sqrt{\tfrac{k+1}{k+2}}\,\frac{\lVert x_k\rVert}{k+1}|00\rangle|x_k\rangle + \sqrt{\tfrac{1}{k+2}}\,\lVert r_k\rVert\,|10\rangle S_{t_k}|r_k\rangle\right) + |0^{\perp}\rangle^{\otimes 2k}|\cdots\rangle.
\]
The operator $I_2^{\otimes 2k} \otimes W_{t_k}$ transforms the previous state to a new state $|\psi_2\rangle$:
\[
|\psi_2\rangle = |0\rangle^{\otimes(2k+1)} \otimes \left(\sqrt{\tfrac{k+1}{k+2}}\,\frac{\lVert x_k\rVert}{k+1}|0\rangle|x_k\rangle + \sqrt{\tfrac{1}{k+2}}\,\lVert r_k\rVert\,|1\rangle|t_k\rangle\langle t_k| S_{t_k}|r_k\rangle\right) + |0^{\perp}\rangle^{\otimes(2k+1)}|\cdots\rangle.
\]
Finally, the operator $I_2^{\otimes(2k+1)} \otimes G_k \otimes I_n$ leads to a state $|\psi_3\rangle$ of the form
\[
\begin{aligned}
|\psi_3\rangle &= |0\rangle^{\otimes(2k+2)} \otimes \left(\sqrt{\tfrac{k+1}{k+2}}\,\frac{\lVert x_k\rVert}{k+1}\sqrt{\tfrac{k+1}{k+2}}\,|x_k\rangle + \sqrt{\tfrac{1}{k+2}}\sqrt{\tfrac{1}{k+2}}\,\lVert r_k\rVert\,|t_k\rangle\langle t_k| S_{t_k}|r_k\rangle\right) + |0^{\perp}\rangle^{\otimes(2k+2)}|\cdots\rangle \\
&= |0\rangle^{\otimes(2k+2)} \otimes \left(\frac{\lVert x_k\rVert}{k+2}|x_k\rangle + \frac{1}{k+2}\lVert r_k\rVert\,|t_k\rangle\langle t_k| S_{t_k}|r_k\rangle\right) + |0^{\perp}\rangle^{\otimes(2k+2)}|\cdots\rangle = |X_{k+1}\rangle.
\end{aligned}
\]
The state of $|R_k\rangle$ is updated using the same operators that are used for the update of $|X_k\rangle$ in the row iteration algorithm, as explained above. We set k = k + 1 and go to step 2 to proceed to the next iteration.
The row and column iteration algorithms are an innovative idea for a quantum linear solver. Their main advantage compared to the classical iterative methods is that in order to solve a linear system of the form
The performance of the quantum row and column iteration algorithms is approximately
On the other hand, the main drawback of these algorithms is their dependence on ancilla qubits, which are used during step 2 of both the row and column algorithms. This leads to a linear increase in qubit requirements with the number of iterations, which might be a problem if many iterations are needed to achieve high accuracy.
All the above-described quantum linear solvers exhibit quantum supremacy and promise to solve linear systems faster than their classical counterparts. In practice, however, they have not been implemented on actual quantum computers for large-scale systems of equations. This is because only noisy intermediate-scale quantum (NISQ) computers are currently available; these contain a limited number of qubits and are susceptible to quantum decoherence, which is a significant limitation, especially since error correction methods cannot easily be applied to such quantum systems.
Large-scale quantum computers are not expected to emerge soon, so we are forced to lower our expectations concerning the immediate application of the aforementioned quantum algorithms for linear systems and resort to more realistic options. In this regard, Bravo-Prieto et al. proposed the variational quantum linear solver (VQLS) [24], a hybrid quantum–classical algorithm that current quantum computers can handle. The concept of this algorithm is to incorporate classical computations in order to reduce the complexity of the quantum circuits that are needed.
In this subsection, we focus on a short description of the structure of the VQLS algorithm which is summarized in Figure 5:

Diagram of the VQLS algorithm ([24]).
Considering a linear system of the form
To find |x〉 that is proportional to the solution
The parameters a are inputs from the classical computer to the quantum computer and the latter calculates |x(a)〉, as well as the value of a cost function C(a). The calculated C(a) is sent to the classical computer which runs an optimization algorithm that changes the values of a in order to minimize the cost function. The new a is sent to the quantum computer and the whole process is repeated until an optimal value a = aopt is found which leads to a sufficient approximation of the solution
and also
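The feedback loop described above can be sketched in code. The following toy Python version, which is only an illustration of the loop structure and not the implementation of [24], replaces the quantum device with direct statevector arithmetic and uses a deliberately simple product-of-Ry stand-in for V(a); the helper names are our own, and the example system A = 0.55𝕀 + 0.45ℤ3 with |b⟩ = H⊗3|0⟩ is the one used in the experiments of Section 4.

```python
import numpy as np
from scipy.optimize import minimize

def ry(theta):
    """Single-qubit Ry rotation matrix."""
    c, s = np.cos(theta / 2.0), np.sin(theta / 2.0)
    return np.array([[c, -s], [s, c]])

def ansatz_state(a):
    """Stand-in for V(a)|0>: a product of Ry rotations, one per qubit.
    (A real variational circuit would also contain entangling gates.)"""
    state = np.array([1.0])
    for theta in a:
        state = np.kron(state, ry(theta) @ np.array([1.0, 0.0]))
    return state

def cost(a, A, b_state):
    """Normalized global VQLS-style cost: 1 - |<b|Phi>|^2 / <Phi|Phi>."""
    phi = A @ ansatz_state(a)
    return 1.0 - (b_state @ phi) ** 2 / (phi @ phi)

# Example 8x8 system: A = 0.55 I + 0.45 Z_3, |b> = H^{(x)3}|000>.
Z = np.diag([1.0, -1.0])
A = 0.55 * np.eye(8) + 0.45 * np.kron(np.eye(4), Z)
b_state = np.full(8, 1.0 / np.sqrt(8.0))

# The classical optimizer closes the loop: it proposes parameters a
# and receives the value C(a) that the "quantum" routine computes.
res = minimize(cost, x0=np.full(3, 0.5), args=(A, b_state), method="COBYLA")
x_opt = ansatz_state(res.x)   # final approximation of |x(a_opt)>
```

On hardware, `cost` would be estimated from circuit measurements (e.g., Hadamard tests) instead of exact linear algebra, but the alternation between quantum evaluation and classical minimization is the same.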
In this subsection, more details about the implementation of the VQLS algorithm are discussed. First, the form of the cost function needs to be defined, so that it measures the component of |Φ〉 = A|x〉 which is orthogonal to |b〉. Four different cost functions are proposed in [24]
Two global cost functions, consisting of the function
\[
\widehat{C_G} = \mathrm{Tr}\bigl(|\Phi\rangle\langle\Phi|(I - |b\rangle\langle b|)\bigr) = \langle x|H_G|x\rangle, \quad \text{where } H_G = A^{\dagger}(I - |b\rangle\langle b|)A,
\]
and its normalized version
\[
C_G = \widehat{C_G}/\langle\Phi|\Phi\rangle.
\]
Two local cost functions, consisting of the function
\[
\widehat{C_L} = \langle x|H_L|x\rangle, \quad \text{where } H_L = A^{\dagger}U\left(1 - \frac{1}{n}\sum_{j=1}^{n} |0_j\rangle\langle 0_j| \otimes I_{\bar j}\right)U^{\dagger}A,
\]
where $|0_j\rangle$ is the zero state on the $j$-th qubit and $I_{\bar j}$ is the identity matrix that excludes the $j$-th qubit, and its normalized version
\[
C_L = \widehat{C_L}/\langle\Phi|\Phi\rangle.
\]
All the above cost functions are suitable for the solution of an n × n system, but the local cost functions lead to better convergence for large values of n. It can be proved that
where κ is the condition number of the matrix A and ϵ is the precision parameter. The right part of these inequalities can be used to define the termination parameter γ.
V(a) can be mathematically expressed as
where k1, k2,…, kl specify the gate types and their positions in the quantum circuit. In cases where k1 = k2 = ⋯ = kl, only a single gate type is used throughout the circuit, which is overly restrictive. More generally, one can consider a fixed-structure ansatz, where the sequence (k1,…,kl) is kept constant during training and only the parameters a are optimized, or a variable-structure ansatz, where the sequence itself may change. In addition, different training strategies can be applied: one may optimize all parameters simultaneously or proceed layer by layer, optimizing parameters by circuit layers.
Finally, another noteworthy element of the VQLS method is the training algorithm that the classical computer uses in order to minimize C(a). There are a few different approaches to achieve that. Ref. [24] proposes a method which in each iteration randomly selects a direction w in the parameter space and solves the optimization problem mins∈RC(a + sw) in order to find the most suitable value for s.
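The random-direction line search just described can be captured in a few lines. The sketch below applies it to a plain quadratic stand-in for C(a), since the point here is the training strategy rather than the quantum cost evaluation; the helper name and the 50-step budget are our own choices.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def random_direction_step(cost, a, rng):
    """One training iteration in the spirit of [24]: draw a random unit
    direction w in parameter space and line-search min_s cost(a + s*w)."""
    w = rng.standard_normal(a.shape)
    w /= np.linalg.norm(w)
    res = minimize_scalar(lambda s: cost(a + s * w))
    return a + res.x * w

# Toy smooth cost standing in for C(a); its minimizer is `target`.
target = np.array([0.3, -1.2, 0.7])
def cost(a):
    return float(np.sum((a - target) ** 2))

rng = np.random.default_rng(0)
a = np.zeros(3)
for _ in range(50):
    a = random_direction_step(cost, a, rng)
```

Each step eliminates the error component along the sampled direction, so repeated random directions drive the cost toward its minimum without requiring gradients.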
The performance of the VQLS algorithm was evaluated in [24] by solving up to 1024 × 1024 linear systems using Rigetti’s Quantum Cloud Services. Thus, the complexity of the algorithm presented here is based on experimental data rather than theoretical formulas. The time needed to solve a linear system depends close to linearly on the condition number κ and logarithmically on the inverse of the precision parameter
Although the calculated complexity is not so reliable, as it is based on a limited number of specific experiments, it is sufficient to show that VQLS leads to an exponential speed-up in n compared to the classical solvers of linear systems, despite the fact that it is implemented in the currently available flawed quantum computers.
Linear algebraic methods that use random walks can handle linear systems of the form
so the solution vector can be calculated as follows
The random walks γ = (i0, i1,⋯,ik) required by these methods, which terminate after k steps, are implemented using Markov chains expressed with an n × n matrix P such that
To construct the matrix P and to define how the corresponding random walk terminates, von Neumann and Ulam [25] suggested that the matrix P meet the following requirements:
\[
P_{ij} \ge 0, \quad \sum_j P_{ij} \le 1, \quad H_{ij} \ne 0 \rightarrow P_{ij} \ne 0.
\]
In addition, a vector T is defined which contains the termination probabilities:
\[
T_i = 1 - \sum_j P_{ij}.
\]
After a random walk $\gamma = (i_0, i_1, \cdots, i_k)$ finishes, an approximation of the $i_0$-th element of the solution vector can be calculated using the formula
\[
\widetilde{x_{i_0}} = \frac{H_{i_0 i_1} \cdots H_{i_{k-1} i_k}\, b_{i_k}}{P_{i_0 i_1} \cdots P_{i_{k-1} i_k}\, T_{i_k}}.
\]
Dimov et al. [26] proposed that $P_{ij} = \frac{|H_{ij}|}{\sum_j |H_{ij}|}$. Given a random walk $\gamma = (i_0, i_1, \cdots)$, there are no termination probabilities, but some weights $W_i$ are used instead, such that:
\[
W_0 = 1, \quad W_1 = W_0\frac{H_{i_0 i_1}}{P_{i_0 i_1}}, \; \cdots, \; W_n = W_{n-1}\frac{H_{i_{n-1} i_n}}{P_{i_{n-1} i_n}}.
\]
The random walk finishes after its $k$-th step if $|W_k| < \epsilon$, where $\epsilon$ is a predefined tolerance. Then an approximation of the $i_0$-th element of the solution vector is calculated as follows:
\[
\widetilde{x_{i_0}} = \sum_{n=0}^{k} W_n b_{i_n}.
\]
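The von Neumann–Ulam estimator translates directly into code. The following NumPy sketch assumes the system is given in the fixed-point form x = Hx + b with a convergent Neumann series; the constant termination probability of 0.5 (with P proportional to |H|) is one admissible choice for illustration, not the one prescribed in [25].

```python
import numpy as np

def neumann_ulam_estimate(H, b, i0, walks=20000, rng=None):
    """Monte Carlo estimate of x_{i0} for x = H x + b: run random walks
    with transition matrix P and termination probabilities T_i, scoring
    each walk by the product of H../P.. ratios divided by T at the end."""
    rng = np.random.default_rng(rng)
    n = H.shape[0]
    stop = 0.5                                   # constant T_i = 0.5 (our choice)
    absH = np.abs(H)
    P = (1.0 - stop) * absH / absH.sum(axis=1)[:, None]
    total = 0.0
    for _ in range(walks):
        i, weight = i0, 1.0
        while rng.random() >= stop:              # continue with prob 1 - T_i
            j = rng.choice(n, p=P[i] / (1.0 - stop))
            weight *= H[i, j] / P[i, j]          # accumulate H../P.. ratio
            i = j
        total += weight * b[i] / stop            # score b_{i_k} / T_{i_k}
    return total / walks
```

Averaging over many walks gives an unbiased estimate of a single solution component without ever forming (I − H)⁻¹, which is precisely the feature the quantum version of Section 3.6 exploits.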
A hybrid classical-quantum linear solver that utilizes random walks implemented in a quantum computer was proposed in [27] where the reader is referred for a comprehensive explanation of this method. Next, we provide a general description of its structure.
Given an n × n linear system,
where P is a Markov chain transition matrix that satisfies the relations Pij ≥ 0 and
To approximate the i0-th element of x, the following truncated version of (30) is used
The evaluation of (31) is based on a quantum random walk on a graph consisting of n nodes. Given a tolerance of ϵ, c should be selected to be approximately
Contrary to the previous quantum solvers that were presented in this paper, the form of this algorithm does not require the vectors
The proposed implementation requires an additional qubit which acts as a coin register. Given a node represented as an m-bit binary string
The transitions from one node of the graph to another during the steps of a random walk are achieved by flipping the bits ji with appropriate probabilities. This functionality is based on the coin register: starting with j0 and continuing with the rest of the graph qubits in sequence, the coin register is rotated by an angle θk, and then a CNOT gate is applied to the graph qubit being processed. Mathematically, this is expressed with the operator
where the operator U is the following general single qubit gate
The parameters ϕ and λ are phase angles that represent components of a single qubit rotation. Their ranges are typically defined as ϕ ∈[0, 2π) and λ ∈[0, 2π).
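The rotate-then-CNOT mechanism can be checked on a single coin/graph qubit pair. The sketch below uses a plain Ry rotation for the coin (the special case of U with ϕ = λ = 0) and confirms that the graph qubit flips with probability sin²(θ/2); the gate choice and angle are illustrative, not the specific parameters of [27].

```python
import numpy as np

def ry(theta):
    """Single-qubit Ry rotation matrix (U with phi = lam = 0)."""
    c, s = np.cos(theta / 2.0), np.sin(theta / 2.0)
    return np.array([[c, -s], [s, c]])

def flip_probability(theta):
    """Rotate the coin qubit by Ry(theta), then apply a CNOT with the
    coin as control and the graph qubit as target; return the
    probability that the graph qubit ends up in |1>."""
    # Basis order |coin, graph>: |00>, |01>, |10>, |11>
    state = np.array([1.0, 0.0, 0.0, 0.0])      # start in |0>_coin |0>_graph
    coin_rot = np.kron(ry(theta), np.eye(2))    # rotate the coin register
    cnot = np.array([[1, 0, 0, 0],              # CNOT: coin controls graph
                     [0, 1, 0, 0],
                     [0, 0, 0, 1],
                     [0, 0, 1, 0]], dtype=float)
    state = cnot @ (coin_rot @ state)
    return state[1] ** 2 + state[3] ** 2        # weight on |graph> = |1>
```

The flip probability equals sin²(θ/2), so the rotation angle of the coin directly encodes the transition probability of the walk.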
After flipping the coin register once for all the graph qubits, the resulting quantum state is the following
where i−1 = 0. The transition probabilities can be calculated as follows
where
It can be proved that the transition matrix P cannot be written as a tensor product of independent 2 × 2 matrices, which means that the qubits of the graph are entangled; hence the probability that one qubit flips depends on whether the other qubits flip.
The proposed random walks solver exploits the probabilistic nature of quantum states in order to implement a Markov chain structure. It is noteworthy that it does not have extravagant space requirements, as 1 + log n qubits are sufficient. In addition, according to [27], the time complexity for the computation of an element of the solution vector is O(log n). The only parameter taken into consideration is the size of the linear system; nevertheless, the algorithm still seems to outperform the classical methods, especially when only a part of the solution is required. Moreover, contrary to the other quantum solvers, the output of this solver is classical data, so it is more accessible and can easily be extracted.
As Table 1 summarizes, all the quantum solvers presented above exhibit supremacy compared to the best-performing classical linear solvers.
Comparison of the quantum solvers.
| Algorithm | Complexity | Ref. | Type | Limitations |
|---|---|---|---|---|
| HHL | O(s²κ² log n/ϵ) | [13] | eigendecomposition | phase estimation, matrix density, condition number, Hamiltonians |
| WZP | | [18] | eigendecomposition | phase estimation, condition number |
| Row and column iteration | | [20] | iterative | ancilla qubits, condition number |
| VQLS | | [24] | hybrid | ansatz option, condition number |
| Random walks | ≥ O(log n) per xᵢ | [27] | hybrid, Monte Carlo | scalability |
All of these algorithms theoretically achieve an exponential speed-up. However, a lot depends on the condition number κ of the matrix A which sometimes scales polynomially with the size of the linear system. If A is ill-conditioned, the quantum algorithms may be slower than the classical methods. This problem can partially be solved using several approaches like preconditioners, but it is often impossible to restrict κ in a scale logarithmic to the number of equations. Moreover, the dependence of the complexity of the algorithms on the precision parameter ϵ should also be considered. If very accurate solutions are required, then an arbitrary increase in the runtime of the algorithms should be expected.
The aforementioned observations can lead to some skepticism about the effectiveness of the quantum linear solvers, but their performance is nonetheless impressive. It is also encouraging that some of them, like the VQLS algorithm, have been tested in practice, which confirms that these solvers are not limited to theoretical studies that cannot be practically implemented on quantum computers. Nevertheless, asymptotic complexity is not the only factor that determines their viability, and it is equally important to examine the trade-offs between theoretical guarantees and hardware feasibility.
Algorithms such as HHL and WZP rely on quantum phase estimation and deep controlled operations; while they enjoy strong asymptotic guarantees, their circuit depth and noise sensitivity make them infeasible on today’s NISQ devices. By contrast, hybrid solvers like VQLS are explicitly designed for hardware efficiency, since they use shallow variational circuits and shift part of the computational burden to a classical optimizer. This reduces qubit and gate overhead but introduces other challenges: measurement noise can accumulate due to the repeated sampling required, convergence heavily depends on the choice of ansatz, and optimizer robustness is not guaranteed. Thus, the comparison highlights a fundamental balance: algorithms with phase estimation achieve stronger guarantees in theory but lack near-term feasibility, whereas VQLS is practical and moderately robust under current hardware constraints, provided that careful ansatz design and optimizer selection are employed.
In this section we propose an implementation of the hybrid quantum-classical VQLS algorithm discussed in Section 3.5. The requisite mathematical formulas are the following:
Given a quantum linear system of the form A|x〉 = |b〉, the matrix A has to be written as a linear combination of unitary operators Al, as the first formula indicates. The second formula introduces the ansatz V(a) which transforms a set of qubits initialized to |0〉 to an approximation of the solution |x(a)〉. The goal of the VQLS algorithm is to construct the best possible V(a) by finding the optimal a. The third formula calculates a quantum state |Φ〉 which stems from the multiplication of A with the approximate solution |x(a)〉. Finally, the last formula includes the operator U which is responsible for the preparation of |b〉. The cost function used for this implementation is the following
To calculate this in practice using a quantum computer, we have to decompose the terms 〈Φ|Φ〉 and |〈b|Φ〉|² as follows and apply the Hadamard test (see Section 2.2):
\[
\langle\Phi|\Phi\rangle = \langle x(a)|A^{\dagger}A|x(a)\rangle = \langle 0|V(a)^{\dagger}A^{\dagger}AV(a)|0\rangle = \sum_m \sum_n c_m^{*} c_n \langle 0|V(a)^{\dagger} A_m^{\dagger} A_n V(a)|0\rangle,
\]
\[
\begin{aligned}
|\langle b|\Phi\rangle|^2 &= |\langle b|A x(a)\rangle|^2 = |\langle 0|U^{\dagger}AV(a)|0\rangle|^2 = \langle 0|U^{\dagger}AV(a)|0\rangle \langle 0|V(a)^{\dagger}A^{\dagger}U|0\rangle \\
&= \sum_m \sum_n c_m^{*} c_n \langle 0|U^{\dagger}A_n V(a)|0\rangle \langle 0|V(a)^{\dagger}A_m^{\dagger}U|0\rangle.
\end{aligned}
\]
Given that all involved operators exclusively include real numbers, we note that
\[
\langle 0|U^{\dagger}A_i V(a)|0\rangle = \langle 0|V(a)^{\dagger}A_i^{\dagger}U|0\rangle.
\]
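These decompositions are straightforward to verify numerically. The sketch below uses the two-term example A = 0.55𝕀 + 0.45ℤ3 from our experiments and random real unit vectors as stand-ins for V(a)|0⟩ and U|0⟩, and checks that the term-by-term double sums (each factor of which would be one Hadamard-test expectation on hardware) reproduce the directly computed quantities.

```python
import numpy as np

# Example decomposition A = sum_l c_l A_l with A = 0.55 I + 0.45 Z_3.
I8 = np.eye(8)
Z3 = np.kron(np.eye(4), np.diag([1.0, -1.0]))
A_terms, c = [I8, Z3], np.array([0.55, 0.45])
A = c[0] * A_terms[0] + c[1] * A_terms[1]

# Random real unit vectors standing in for V(a)|0> and U|0>.
rng = np.random.default_rng(1)
x = rng.standard_normal(8); x /= np.linalg.norm(x)
b = rng.standard_normal(8); b /= np.linalg.norm(b)

phi = A @ x
direct_norm = phi @ phi                  # <Phi|Phi> computed directly
direct_overlap = (b @ phi) ** 2          # |<b|Phi>|^2 computed directly

# Term-by-term double sums over the decomposition coefficients.
norm_sum = sum(c[m] * c[n] * (x @ A_terms[m].T @ A_terms[n] @ x)
               for m in range(2) for n in range(2))
overlap_sum = sum(c[m] * c[n] * (b @ A_terms[n] @ x) * (x @ A_terms[m].T @ b)
                  for m in range(2) for n in range(2))
```

Since the vectors are real, the two inner products in each overlap term coincide, which is exactly the simplification noted above.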
In all our experiments, the coding of the quantum computation routines relies on the Qiskit framework, which allows access to both simulators and cloud-based quantum processors. For the purposes of our study, the Aer simulator has been used. Inspired by the approach described in [29], we solve 8 × 8 linear systems utilizing the VQLS algorithm. Our main contributions relative to [29] are the following improvements.
We construct and code quantum operator blocks, to reduce the qubit requirements for the inner product calculations.
We utilize different optimizers included in the scipy Python library and compare them to discover the one that produces the optimal a and, by extension, the best ansatz V(a).
We propose an alternative ansatz structure to the ones in [29] and [24], tailored to real-valued problem instances. In our simulations this ansatz achieves consistently faster and more stable convergence while using reduced circuit complexity. Although classically simulable, this design illustrates how problem-specific ansatz choices can substantially enhance the practical performance of VQLS.
Ref. [29] adapts Hadamard test circuits (extending the 2-register design in Section 2.2, Fig. 1) by adding an ancilla for controlled operations, enabling efficient block encoding without full decomposition. By coding the V(a), U, and Ai operators as “block operators”, as illustrated in Figure 6, this additional qubit can be omitted. In cases of more complex linear systems, more than one ancilla qubit may be omitted.

Inner product circuits for 〈Φ|Φ〉 (left) and |〈b|Φ〉|2 (right).
The ansatz may take many different forms [24]; an indicative fixed ansatz structure is presented here which alternates Ry gates with controlled-Z gates, as depicted in the left scheme of Figure 10. The question that arises is which optimizer is the most appropriate for the calculation of the optimal values of a if the ansatz has this form. To address this, we try to solve a linear system such that A = 0.55𝕀 + 0.45ℤ3, where 𝕀 is the identity operator and ℤ3 is an operator which includes a single-qubit Z gate acting on the third qubit, while U = H⊗3. The optimizers we experiment with stem from the scipy Python library, and the interested reader can find details about their application in [30]. The convergence behavior of these optimizers is shown in Figure 7.

The convergence of the optimizers.
It is evident that the best-performing algorithm is the COBYLA optimizer, which performs constrained optimization using linear approximation when the derivative of the target function is unknown. The Powell optimizer produces comparable results, but it requires many more iterations to converge and its behavior is very unstable. Furthermore, the solution vector it calculates fails to sufficiently converge to the actual solution, as the residual plot of Figure 8 indicates.

The cost function of the optimizers (left) and the corresponding residuals |b−Ax| (right).
On the other hand, the use of the COBYLA optimizer leads to a quantum state that is a much better approximation of the solution. Its density matrix can be seen in Figure 9—white squares indicate positive values and black squares correspond to negative values, whereas the size of the squares is proportional to the magnitude of the values.

Density matrix of the COBYLA quantum solution vector.
Further improving the performance of the VQLS algorithm largely depends on the structure of the ansatz. Thus, we propose an ansatz depicted in Figure 10 consisting of four gates for each qubit (U(θ,0,0), U(θ,0,π), U(θ,π,0) and U(θ,π,π)), where U is the general single qubit unitary gate. This structure is suitable for linear systems that explicitly contain real numbers, as the selected gates cover all the single-qubit gate cases whose operators do not include complex numbers. We note that the proposed U-ansatz does not include entangling gates.

The initial ansatz structure (Ry, CZ gates) on the left and the proposed ansatz structure (U gates) on the right.
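The four proposed gates can be written out explicitly. The sketch below assumes the standard U(θ, ϕ, λ) convention (as in OpenQASM/Qiskit) and checks that the four phase choices indeed yield real (orthogonal) matrices, which is the property motivating the design.

```python
import numpy as np

def u_gate(theta, phi, lam):
    """General single-qubit gate U(theta, phi, lam) in the standard
    OpenQASM/Qiskit convention (an assumption for this sketch)."""
    c, s = np.cos(theta / 2.0), np.sin(theta / 2.0)
    return np.array([[c, -np.exp(1j * lam) * s],
                     [np.exp(1j * phi) * s, np.exp(1j * (phi + lam)) * c]])

theta = 0.7  # an arbitrary trainable angle
variants = [u_gate(theta, 0.0, 0.0),        # U(theta, 0, 0)  = Ry(theta)
            u_gate(theta, 0.0, np.pi),      # U(theta, 0, pi)
            u_gate(theta, np.pi, 0.0),      # U(theta, pi, 0)
            u_gate(theta, np.pi, np.pi)]    # U(theta, pi, pi)
```

With ϕ, λ ∈ {0, π} the phase factors reduce to ±1, so every entry is real for any θ, and the four variants exhaust the sign patterns available to a real single-qubit unitary.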
The values of θ are the parameters that are tuned by the optimizer. We solve the aforementioned linear system using this ansatz and present in Figure 11 the resulting density matrix.

Density matrix of the quantum solution vector produced by the proposed ansatz.
The results indicate a clear improvement in the performance of the algorithm since the cost function converges much more efficiently. The same applies to most of the linear systems which we have tested. Due to computational limitations, error bars were not computed for the figure. However, the results reflect averages over 10 independent runs, and the consistent trends across runs suggest robustness of the observed reductions. For instance, for the linear system A = 0.3ℤ1 + 0.4ℤ2, we find comparable results, as shown in Figure 12.

Comparison of the ansatz structures A = 0.55𝕀 + 0.45Z3 (left) and A = 0.3Z1 + 0.4Z2 (right). The metrics shown are 〈Φ|Φ〉, |〈b|Φ〉|, and the cost
Future extensions could incorporate real-preserving entangling operations to enhance expressivity while maintaining compatibility with real-valued problem instances. Consequently, it seems that the proposed ansatz is worth considering for the solution of linear systems using the VQLS algorithm, as it serves as a reliable alternative to other quantum circuit structures that have been examined in practice and in theory.
The question arising from this research is whether the algorithms studied could eventually replace the well-known classical linear solvers, such as Gaussian elimination. Much depends on whether the scientific community will be able to build universal quantum computers consisting of millions of qubits, and whether error correction techniques will be able to ensure the stability of these systems under all conditions.
The performance of these quantum linear solvers is undoubtedly promising, as they achieve even exponential speed-up in the size of the linear system compared to their classical counterparts. The dependence of most of them on the condition number of the involved matrices raises some concerns though. Therefore, possible future research efforts should focus on limiting or even eliminating this dependence. In addition, the inability to extract the whole solution vector from the output quantum states is something that cannot be overlooked.
The study of the presented algorithms is also still at a theoretical level, and minimal practical implementations have been published, which mainly concern small linear systems. In this regard, we attempt to contribute to the enhancement of the performance of the VQLS algorithm from an experimental perspective by solving 8 × 8 linear systems using the Aer Qiskit simulators. It remains to be seen whether these methods can be effectively extended to large-scale systems, as our current experimentation indicates. The convergence trends in Figure 12 are based on averaged data without statistical error quantification due to resource constraints. Future work with larger-scale simulations could incorporate confidence intervals to confirm statistical significance.
An alternative approach may very well be through the algorithms considered in [31]. It seems that these randomized approaches could be mapped to quantum computer architectures in a very effective and natural way. Such a method is presented in Section 3.6 above. A comprehensive study of the whole ecosystem of these randomized approaches is beyond the scope of this paper and will be presented elsewhere. We should note that several of these methods were proposed a long time ago and ignored by the majority of researchers. We believe that this was because they could not be properly mapped onto the von Neumann computing model and effectively implemented on the existing classical computers. We strongly believe that they have the potential to exhibit their supremacy in quantum computing systems.
Apart from the above-mentioned algorithms, there is a plethora of other related publications [10,32–52] which for various reasons we do not comment on in this paper. The interested reader can refer to them in order to acquire a global view of the whole research ecosystem of quantum solvers for linear algebraic systems.