Quantum computations offer exponential speedups over their classical counterparts by leveraging the principles of quantum mechanics. In 1980 Benioff and Manin [1] proposed the concept of quantum computing, and Feynman [2] introduced the theoretical framework of a quantum computer in 1982. In 1985 Deutsch [3] proposed the quantum parallel algorithm, showcasing the significant computational advantages of quantum computing over classical methods.
In 1994 Shor [4,5] presented a quantum algorithm for prime factorization, drawing widespread attention to the advantages of quantum computing in solving certain problems. Grover [6] followed in 1996 with a quantum search algorithm, further demonstrating quantum computing’s potential in data search applications. In 2009 the HHL algorithm [7,8] was developed for solving linear systems of equations, greatly advancing the quantum algorithm researches. Several renowned quantum operations, such as quantum Fourier transform [9–11] and quantum phase estimation [11,12], have significantly contributed to the field.
Matrix operations serve as the core computational power support for quantum algorithms. The efficient quantum implementation of its basic operations (including row transformation, transposition, and trace calculation, etc.) holds significant theoretical and practical value. In quantum chemistry simulations [13], the elementary row transformation of matrices can rapidly simplify the Hamiltonian matrix, reducing the time complexity of solving molecular energy spectra from the polynomial level of classical algorithms to the logarithmic level. Matrix Transpose operation enables the efficient analysis of molecular symmetry, while the calculation of the matrix trace can directly yield key physical quantities such as the number of electrons. In quantum cryptography [14], elementary row transformation can optimize the encoding of quantum states to enhance security. Matrix Transpose can achieve efficient information routing in quantum communication networks, and the matrix trace can be used for the integrity detection of quantum states. In the field of quantum machine learning [15], elementary row transformation can accelerate the feature extraction of Quantum Support Vector Machines (QSVMs). Matrix Transpose can improve the backpropagation efficiency of quantum neural networks, and the matrix trace operation can quickly calculate kernel functions to evaluate model performance. Given the wide-ranging influence of matrix operations in interdisciplinary fields, exploring efficient quantum computing methods to solve matrix-related algebraic problems has become a current research hotspot.
In recent years, researchers have proposed several quantum algorithms for matrix operations. For example, the Trotter approximation formula [16] has been utilized to simulate quantum circuits, addressing complex operations such as matrix addition, multiplication and Kronecker products. Quantum algorithms for computing vector inner products [17–22] have also been developed. These algorithms leverage the superposition and entanglement of qubits and have potential applications in fields like machine learning and data processing. Based on a “sender-receiver” quantum computing model, in 2022 Wen-Tao Qi et al. [19] proposed a class of quantum algorithms for matrix operations, including matrix-vector multiplication, matrix-matrix multiplication, matrix addition, determinant computation and matrix inversion. In 2023 A.I. Zenchuk et al. [20] further developed the idea of using specific types of unitary transformations to implement linear algebra protocols according to multi-qubit Toffoli gates and the simplest single-qubit operations. Auxiliary measurements were employed to remove redundant information and extract desired results. Building on this work, in 2024 they [21] also introduced a method to encode the N rows of a matrix into pure states of N independent quantum subsystems. By selecting appropriate unitary operators, the determinant can be computed efficiently. Based on this, they [22] proposed a variational quantum singular value decomposition method based on matrix encoding, which reduced the number of measurements through controlled measurement. Furthermore, in 2025 they [23] discussed the method of encording unnormalized matrices, and the system encoding the matrix was extended adding one more probability amplitude that satisfy the normalization condition.
However, these quantum algorithms for basic matrix operations still face challenges such as high computational complexity and low success probabilities, which limit their broader applicability.
The elementary row transformations of matrices play significant roles in solving linear systems of equations, determining matrix rank, finding invertible matrices and simplifying determinant computations, with wide-ranging value in both theoretical research and practical applications. In quantum computing, elementary row transformations can simplify matrices for efficient linear equation solving, provide theoretical support for matrix decomposition, advance large-scale matrix operations, and help to interpret the geometric meanings of linear transformations. These contributions offer new insights into the designs of quantum algorithms. Currently, there are no specialized quantum algorithms specifically designed for elementary row transformations of matrices.
Drawing on the foundational ideas from [20], this paper proposes algorithms for two elementary row transformations of matrices, transpose and trace of matrices, along with their corresponding quantum circuits. These algorithms collectively involve the following key steps:
Construction of the initial state: matrix elements are first encoded into the probability amplitudes of quantum pure states by using tensor product operations.
Information separation and labeling: combining multi-qubit control operators with ancillary states, where the ancillary states are used to accurately label the useful and redundant parts of the information, achieving effective separation of the data.
State swaping operations: depending on the specific algorithms, C-SWAP operations are flexibly employed to precisely exchange the states of two quantum bits.
State superposition processing: Hadamard operators are applied to induce superposition states among quantum bits.
Measurement on the ancillary states: auxiliary measurement techniques are used to eliminate redundant information, ensuring that the desired results are accurately reflected in the probability amplitudes of the final state.
This paper focuses on the core algorithm layer of quantum computers and aims to design a set of quantum circuit schemes for implementing basic matrix operations. The research is carried out under the following premise: the input state, which is the quantum state encoding matrix elements, is successfully prepared. On this basis, we systematically construct a quantum circuit architecture capable of performing basic matrix operations such as row transformation transposition, and trace calculation.
This approach is of great value in the current development of quantum computing. On the one hand, quantum state preparation and quantum state measurement, as crucial aspects in quantum computing, have attracted much attention on conducting specialized studies. Transforming classical matrices into quantum pure states is a fundamental issue that needs to be addressed in the field of quantum state preparation. Moreover, converting the results of quantum state measurements into classical data interpretable by humans makes quantum readout the core part of quantum state measurement research. On the other hand, the algorithm designed in this paper fully considers the compatibility with quantum state preparation and quantum state measurement schemes, laying a solid foundation for the rapid deployment and application of quantum computers after they become practically implementable in the future.
The remainder of this paper is organized as follows: Section 2 introduces quantum algorithms for two types of elementary matrix transformations. Section 3 presents quantum algorithms for computing the trace and the transposition of a matrix. Conclusion is given in section 2.
This section presents quantum algorithms for two types of elementary row transformations of an N × M matrix A = {aij} (N = 2n, M = 2m), including adding one row to another and swapping two rows.
For the quantum algorithm that adds one row of a matrix to another, the quantum circuit diagram is shown in Figure 1. This algorithm requires two pure states and four auxiliary states, and is divided into seven steps.

Quantum circuit diagram for adding one row of a matrix to another row.
The following steps give rise to the algorithm of adding the (k + 1)th row of a matrix A to the (l + 1)th row with 0 ≤ l, k ≤ N – 1.
First, introduce an n-qubits register R1 and an m-qubit register C1, which are used to enumerate the rows and columns of matrix A, respectively. The entries aij of the matrix A are then encoded into a pure state as follows,
Next, introduce an N-dimensional auxiliary column vector Z (with the elements of the k + 1 and the l + 1 rows being
The initial state of the entire system is then given by
In order to accurately separate the useful and useless information terms in |Φ0〉 for subsequent calculations, we need to label the two states of R2 in |Φ0〉 accordingly.
Introduce a 1-qubit auxiliary B1 in state |0〉B1, and the following control operator,
That is, for the term where the state of R2 is |k〉R2, it is labeled by |1〉B1, and for the term where the state of R2 is |l〉R2, it is labeled by |0〉B1.
Kitaev A. Y. et al. [24] represented an important result: A control operator with n control qubits can be represented by O(n) Toffoli gates. Since the above control operator
To precisely extract the element located at the (k + 1)th row of the second term in |Φ1〉, from the terms in |Φ1〉 labeled by |0〉B1 we only extract the terms where R1 is in the state |k〉R1. To do so, we introduce a 1-qubit auxiliary B2 in the state |0〉B2, and the following control operator,
Applying the operator
That is, the terms where the state of R1 and B1 are |k〉R1 |0〉B1 are labeled by |1〉B2, while the other terms are labeled by |0〉B2. The states |1〉B1 |0〉B2 and |0〉B1 |1〉B2 in |Φ2〉 correspond to the useful information terms required by the algorithm. All other useless information terms are represented by |g1〉R1C1R2.
The control operator
To add the (k + 1)th row to the (l + 1)th row, we apply a C-SWAP gate operation to the terms in |Φ2〉 labeled by |1〉B2, which swaps the states of R1 and R2. This operation is given by the following gate,
Applying
Note that the SWAPs in the control operator
To effectively avoid mixing useful and useless information terms in the subsequent computation process, we introduce an auxiliary qubit B3 in state |0〉B3, together with the projection operator PB1B2 = |00〉B1 B2 B1B2〈00|. We use B3 to label the useful and useless information terms in |Φ3〉 accordingly by using the following control operator,
Applying
In this way, |0〉B3 (|1〉B3) labels the useful (useless) information terms.
Since the control operator
Applying the Hadamard operator
Here, the terms in the state |0〉B1|0〉B2|0〉B3 are retained, which contain all the information required for further process. The remaining terms are represented as |g2〉R1C1R2B1B2B3.
Since only two Hadamard gates are used here, the circuit complexity is O(1).
By measuring the auxiliary B1B2B3 in the state |000〉B1B2B3, we get
The success probability is
The overall computational complexity of the algorithm is O(n), mainly determined by
For the quantum algorithm that swaps two rows of a matrix, the quantum circuit diagram is shown in Figure 2. The algorithm requires two pure states and four auxiliaries, and is divided into seven steps to swap the (k + 1)th and (l + 1)th rows of matrix A (0 ≤ l, k ≤ N – 1).

The quantum circuit diagram for swapping two rows of a matrix.
First, introduce an n-qubit register R1 and an m-qubit register C1, which enumerate the rows and columns of the matrix A, respectively. The elements of the matrix A are then encoded into the following pure state,
Next, introduce an N × N auxiliary matrix Z such that the elements in the (l + 1)-th row and (k + 1)-th column, the (k + 1)-th row and (k + 1)-th column, and the (l + 1)-th row and (l + 1)-th column are
The initial state is of the form,
To accurately separate the useful from the useless information terms in |Φ0〉, introduce an auxiliary qubit B1 in state |0〉B1, and the following control operator,
In this way, the terms where the states of R2 and C2 are different are labeled by |1〉B1, and the terms where the states of R2 and C2 are the same are labeled by |0〉B1.
Since the control operator
We proceed to separate the useful information terms from the useless ones.
- 1)
In the first sum, extract all elements except for those in the (k + 1)th and (l + 1)th rows.
- 2)
In the second sum, extract all elements in the (l + 1)th row.
- 3)
In the third sum, extract all elements in the (k + 1)th row.
- 4)
For the terms in |Φ1〉 labeled by |1〉B1, extract all terms except those where R1 is in the states |k〉R1 and |l〉R1.
- 5)
For the two sums in |Φ1〉 labeled by |0〉B1, extract the terms where R1 is in the state |l〉R1 and the terms where R1 is in the state |k〉R1.
We introduce an auxiliary qubit B2 in state |0〉B2, and use the following control operators,
When
That is, the terms labeled by |1〉B1 |00〉B2, |0〉B1 |01〉B2 and |0〉B1 |10〉B2 are the useful information terms required by the algorithm, while the other terms are the useless information terms denoted by |g1〉R1C1R2C2B1B2.
Since the control operator
To swap the (k + 1)th and (l + 1)th rows of the matrix, we apply C-SWAP gate to the terms in |Φ2〉 labeled by |10〉B2 and |01〉B2, so as to swap the states of R1 and R2, as well as the states of R1 and C2. Denote
Applying
The SWAP gates transfer the elements originally in the (k + 1)th row to the (l + 1)th row in the terms labeled by |0〉B1 |01〉B2, and the elements originally in the (l + 1)th row to the (k + 1)th row in the terms labeled by |0〉B1 |10〉B2.
Note that the SWAPs in the control operator
Introduce an auxiliary qubit B3 in state |0〉B3. To use B3 to label the useful and useless information terms in |Φ3〉, we apply the following control operator,
Acting
Here, |1〉B3 labels the useful information term, while |0〉B3 labels the useless information term.
Since the control operator
Applying the Hadamard operator
At this point, we have successfully swaped the elements of the (k + 1) -th row of matrix A with the corresponding elements of the (l + 1)-th row.
Since only three Hadamard gates are used here, the circuit complexity is O(1).
We measure the auxiliary B1B2B3 in the state |0001〉B1B2B3 to remove the useless information terms, and obtain
The success probability is
The overall computational complexity of the algorithm is O(n), mainly determined by
For specific examples of this algorithm, please refer to Appendix A.
In this section we present quantum algorithms for trace calculation of an N × N matrix and transposition of an N × M matrix.
The quantum circuit for calculating the trace of the matrix S is shown in Figure 3. Assume that a matrix S={aik} is of size N × N with N = 2n. This algorithm requires one pure state and three auxiliary qubits, and is divided into the following six steps.

Quantum circuit for calculating matrix trace.
First, introduce two n-qubit registers R and C, which are used to enumerate the rows and columns of the matrix S, and encode the elements of the matrix S to obtain the following pure initial state,
To calculate the sum of the diagonal elements of matrix S, we need to mark the diagonal elements of S, i.e., the terms where the states of registers R and C are the same. To do this, introduce an n-qubit auxiliary A in the state |0〉A, and an operator
Applying the operator
In the process of realizing the quantum state |Φ1〉, we check whether the j-th qubits of registers R and C are the same. If their j-th qubits are the same, the j-th qubit of A is flipped from |0〉 to |1〉. If the states of all n qubits of R and C are the same, the states of all n qubits of A are flipped from |0〉A to |N – 1〉A.
The control operator
Currently, all useful information terms are stored in the first sum of |Φ1〉, while the second sum |g1〉RCA represents useless information terms that are not needed by the algorithm.
To separately mark the useful and useless information terms in the state |Φ1〉, we introduce the projection operator PA = |N – 1〉A A 〈N – 1|, and a 1-qubit auxiliary B1 in the state |0〉B1. The control operator
Here, the useful information terms are marked by |1〉B1, and the useless information terms are marked by |0〉B1.
Since the control operator
Up to this point, we have obtained the diagonal elements of S. To compute the sum we apply the Hadamard operator
In the state |Φ3〉, we select the terms in the state |0〉r|0〉C|0〉A|1〉B1, as these terms contain all the useful information required for the trace calculation. The other terms are represented by |g2〉RCAB1.
Since 3n Hadamard gates are used here, the circuit complexity is O(n).
Since new useless information terms were generated in the previous step and the useful information is stored in the first term of |Φ3〉, we need to re-mark the useful and useless information terms. To do this, we introduce an auxiliary qubit B2 in state |0〉A2, as well as the projection operator PRCAB1 = |0〉R|0〉C|0〉A|1〉B1 R〈0|C〈0|A〈0|B1 〈1|. We apply the following control operator
Here, the useful information terms are marked by |1〉B2, and the useless information terms are marked by |0〉B2.
Since the control operator
The trace of the matrix S is stored in the quantum state |Φ4〉 in the form of probability amplitudes, with the useful information marked by |1〉B2. To extract the useful information, we measure the auxiliary qubit B2 in the state |1〉B2, and obtain
The trace is obtained based on the measurement probability
The overall computational complexity of the algorithm is O(n), mainly determined by
The quantum circuit for matrix transpose is shown in Figure 4. Assume that a matrix S={aik} is of size N × M with N = 2n and M = 2m. The algorithm requires two pure states and is divided into two steps.

Quantum Circuit for Matrix Transpose.
First, introduce an n -qubit register R and an m-qubit register C, where R and C enumerate the rows and columns of the matrix S, respectively. The elements of matrix S are encoded into the following pure state,
Then, introduce a pure state |0〉D for an m-qubit register D. The initial state is give by
We use a SWAP gate to exchange the states of the D and C registers. The operator SWAPDC is applied to |Φ0〉, yielding
Now, the new rows and new columns of the current target matrix are enumerated by the states D and R respectively. Clearly, the success probability of this algorithm is 1.
Since the exchange operator SWAPDC only involves exchange operations when acting on different qubit pairs, its computational complexity is O(m).
Formally the non-square matrix can be made a square matrix by including the proper number of zero rows or columns. In this case transposition of non-square matrix can be done as a transposition of a square matrix, i.e. by the swap operator SWAPRC. The complexity of this operation is O(n).
We have developed quantum algorithms for two types of elementary row transformations of matrices, the trace and transpose of matrices. The operational mechanism of all these algorithms is as follows: First, construct the initial state of the entire system. Second, utilize unitary operations and ancillary qubits to mark the information items required by the algorithm. Third, apply the Hadamard transformation to superpose these information items. Finally, measure the ancillary qubits to eliminate the useless information items and retain only the useful ones.
In all these algorithms, the final result is stored in the probability amplitude of certain quantum states. Our success probability for adding one row of a matrix to another is
It would be desired to develop new methods to improve the success probability given by the final measurement in the algorithm, or explore the designs of other quantum algorithms for matrix operations based on multi-qubit Toffoli type gates and the simplest single qubit operations. It would be also interesting to conduct further researches on quantum algorithms for other matrix operations.