Reducing T-Count In Quantum String Matching Algorithm Using Relative-Phase Fredkin Gate

Park, Byeongyong; Noh, Hansol; Ahn, Doyeol

doi:10.2478/qic-2025-0016

Full Article

1.

Introduction

Grover’s quantum search algorithm is renowned for providing a quadratic speedup in solving unstructured search problems [1,2]. A notable application of this speedup is in addressing the string-matching problem [3,4], where the goal is to find a pattern P of length M within a given string L of length N. This problem is fundamental in computer science due to its wide-ranging applications [5], including information retrieval [6], plagiarism detection [7], text mining [8], intrusion detection [9,10], language translation [11], and DNA sequence analysis [12–15], among others.

A significant challenge in implementing the quantum string matching (QSM) algorithm is the requirement for fault-tolerant quantum computation due to the inherent fragility of quantum information. In fault-tolerant quantum computations, circuits are constructed using a universal and fault-tolerant gate set. The Clifford + T gate set is commonly employed in various leading fault-tolerant schemes [16,17]. Within these approaches, Clifford gates are relatively straightforward to implement, often through transversal methods. In contrast, T gates are resource-intensive and require techniques like magic state distillation, which significantly increase the implementation costs [18–21]. Consequently, optimizing the number of T gates (T-count) is crucial for the efficient execution of large-scale quantum algorithms, including QSM.

Several studies have developed QSM algorithms [3,4,22], with the most notable contribution being the work of Niroula and Nam [4]. Their work is particularly significant because it provides an explicit quantum circuit implementation of the QSM algorithm. In their circuit design, they introduce a component called the “cyclic shift operator,” which plays an essential role in constructing the Grover’s operator specific to the QSM. The construction of these operators in the QSM circuit is the primary factor that increases the T-count within the QSM algorithm.

In this paper, we propose a method to reduce the T-count for the QSM circuit, based on the circuit design proposed by Niroula and Nam [4]. The QSM circuit in Niroula and Nam’s work requires approximately 14N^3/2 log₂ N T gates, the majority of which arise from the synthesis of controlled-SWAP (Fredkin) gates in cyclic shift operators. They synthesized each Fredkin gate into Clifford + T gates using seven T gates (see Figure 1a). As part of our strategy to reduce the T-count, we introduce a new concept: the relative-phase Fredkin gate, inspired by the relative-phase Toffoli gate [23]. We synthesize a relative-phase Fredkin gate that contains only four T gates (see Figure 1b) and prove that each Fredkin gate in the QSM circuit can be replaced with any relative-phase Fredkin gate. As a result, we reduce the leading-order term of the T-count for QSM to 8N^3/2 log₂ N. Additionally, we demonstrate that our approach provides advantages in terms of reducing other circuit costs, such as the depth of T gates (T-depth) and the number of CNOT gates (CNOT-count).

The circuit cost reduction presented in this work is achieved by incorporating relative-phase encoding into the quantum states of the QSM algorithm proposed by Niroula and Nam [4]. This approach is feasible because the relevant information about the string and pattern is encoded in basis states rather than amplitudes, preserving measurement outcomes performed in the computational basis. Furthermore, our relative-phase-based optimization strategy effectively reduces circuit costs not only for the QSM algorithm but also for a broader class of quantum algorithms, particularly those employing controlled quantum gates such as Fredkin or similar controlled permutations.

2.

Backgrounds

2.1.

Grover’s Algorithm

Grover’s algorithm is a quantum algorithm that offers a quadratic speedup over classical search algorithms in unstructured search problems [1,2]. This section briefly explains Grover’s algorithm using general quantum amplitude amplification [2].

Consider a Boolean function f : {0,1,…, 2ⁿ – 1} → {0,1}. Within an n-qubit system, two subspaces, G and B, are defined as follows: G ≡ span({|x〉 | f (x) = 1}) and B = span({|x〉 | f (x) = 0}). Note that I_n = P_G + P_B, where P_G and P_B are projectors onto G and B, respectively. Given a unitary operator A acting on the n-qubit system, A|0〉_n = |ψ〉 can be expressed as a linear combination of P_G|ψ〉 and P_B|ψ〉 as shown in Eq. (1). Here, the states |ϕ_g〉 and |ϕ_b〉 are normalized states of P_G|ψ〉 and P_B|ψ〉, respectively.1 $A | 0 〉_{n} \equiv | ψ 〉 = P_{G} | ψ 〉 + P_{B} | ψ 〉 = \sin θ | ϕ_{g} 〉 + \cos θ | ϕ_{b} 〉$ A|0{\rangle _n} \equiv |\psi \rangle = {P_G}|\psi \rangle + {P_B}|\psi \rangle = \sin \theta \left| {{\phi _g}} \right\rangle + \cos \theta \left| {{\phi _b}} \right\rangle

In this study, we call the operator A the initialization operator.

The primary goal of Grover’s algorithm is to find (or measure) the state in G with high probability. The operators utilized in Grover’s algorithm are defined in Eqs. (2) to (5): 2 $Q \equiv - R_{ψ} R_{g}$ Q \equiv - {R_\psi }{R_g} 3 $R_{ψ} \equiv I - 2 | ψ 〉〈 ψ | \equiv A R_{0} A^{- 1}$ {R_\psi } \equiv I - 2|\psi \rangle \langle \psi | \equiv A{R_0}{A^{ - 1}} 4 $R_{0} \equiv I - | 0 〉_{n} 〈 0 |$ {R_0} \equiv I - |0{\rangle _n}\langle 0| 5 $R_{g} \equiv I - 2 | ϕ_{g} 〉〈 ϕ_{g} |$ {R_g} \equiv I - 2\left| {{\phi _g}} \right\rangle \langle {\phi _g}|

We refer to operator Q as Grover’s operator. The process of Grover’s algorithm can be delineated as follows: First, an n-qubit is set to state |0〉_n, after which the operator A is applied to the state |0〉_n. Next, Q^k is applied to the state A|0〉_n = |ψ〉, where k = ⌊π/4θ⌋. Finally, a measurement is conducted on the n-qubit state. The post measurement state becomes a state in G with a probability higher than max(1 – sin² θ, sin² θ) [2].

2.2.

Quantum String Matching Algorithm

This section describes the QSM algorithm and its circuit representation described in Ref. [4]. String matching aims to locate a pattern P of length M in a given string L of length N. For simplicity of analysis, we assume P and L to be binary digit strings and N = 2ⁿ throughout this study.

The core subcircuit of the algorithm is the cyclic shift operator C. The cyclic shift operator C on the (n + N)-qubit system is defined as Eq. (6), where |k〉_n is a computational basis state, the additions are modulo N additions, and each x_i is a binary number 0 or 1.6 $C | k 〉_{n} {| x_{0} x_{1} x_{2} \dots x_{N - 1} 〉}_{N} = | k 〉_{n} {| x_{0 + k} x_{1 + k} x_{2 + k} \dots x_{N - 1 + k} 〉}_{N}$ C|k{\rangle _n}{\left| {{x_0}{x_1}{x_2} \ldots \;{x_{N - 1}}} \right\rangle _N} = \;|k{\rangle _n}{\left| {{x_{0 + k}}\;{x_{1 + k}}\;{x_{2 + k}}\; \ldots \;{x_{N - 1 + k}}} \right\rangle _N}

To construct the circuit of this operator, the cyclic shift operator C is divided into operators C_k, which are defined as follows.7 $C_{k} | l 〉_{n} {| x_{0} x_{1} x_{2} \dots x_{N - 1} 〉}_{N} = {\begin{array}{l} | l 〉_{n} {| x_{0} x_{1} x_{2} \dots x_{N - 1} 〉}_{N}, & if l \neq k, \\ | l 〉_{n} {| x_{0 + k} x_{1 + k} x_{2 + k} \dots x_{N + k - 1} 〉}_{N}, & if l = k . \end{array}$ {C_k}|l{\rangle _n}{\left| {{x_0}{x_1}{x_2} \ldots \;{x_{N - 1}}} \right\rangle _N} = \;\left\{ {\matrix{ {|l{\rangle _n}{{\left| {{x_0}{x_1}{x_2} \ldots {x_{N - 1}}} \right\rangle }_N},} \hfill & {{\rm{ if }}l \ne k,} \hfill \cr {|l{\rangle _n}{{\left| {{x_{0 + k}}\;{x_{1 + k}}\;{x_{2 + k}}\; \ldots \;{x_{N + k - 1}}} \right\rangle }_N},} \hfill & {{\rm{ if }}l = k.} \hfill \cr } } \right. where |l〉_n is a computational basis state. Then, the cyclic shift operator C can then be rewritten as described in Eq. (8).8 $C = C_{2^{0}} C_{2^{1}} C_{2^{2}} \dots C_{2^{n - 1}}$ C = {C_{{2^0}}}{C_{{2^1}}}{C_{{2^2}}} \ldots {C_{{2^{n - 1}}}}

Constructing a C_2^j operator requires a maximum of N – 1 controlled-SWAP (Fredkin) gates. The process of constructing the C_2^j operator is divided into n stages. In the first stage, N/2 digits are moved to the right place using N/2 Fredkin gates. In the second stage, N/4 digits are moved to the right place using N/4 Fredkin gates, and so on. The total number of Fredkin gates used to construct the C_2^j operator is at most N/2 + N/4 + … + 1 = N – 1 (see Figure 2). Optionally, the Fredkin gates at each stage of constructing a C_2^j operator can be parallelized using N/2 – 1 ancilla qubits and fan-out operations with N – 2 CNOT gates. The exact number of Fredkin gates required to construct the cyclic shift operator C can be found in Appendix A.

To implement Grover’s algorithm, the initialization operator A such that A|0〉_(n+N+M) = |ψ〉 needs to be defined. The process of applying A to state |0〉_(n+N+M) is as follows (see Figure 3a):

(i)
Prepare three quantum registers with n, N, and M qubits. We call the n-qubit register the first register, N-qubit register the second register, and M-qubit register the third register. All the states were initialized to state |0〉.
(ii)
Apply H^⊗n to the first register and apply X gates to encode L and P in the second and third registers, respectively. Then, the entire state becomes state |ψ₁ in Eq. (9). |L〉_N and |P〉_M in Eq. (9) are defined in Eqs. (10) and (11), respectively, where each l_j and p_j is a binary number.9 $| ψ_{1} 〉 = \frac{1}{\sqrt{N}} \sum_{k = 0}^{N - 1} | k 〉_{n} | L 〉_{N} | P 〉_{M}$ \left| {{\psi _1}} \right\rangle = {1 \over {\sqrt N }}\sum\limits_{k = 0}^{N - 1} {|k{\rangle _n}|L{\rangle _N}|P{\rangle _M}} 10 $| L 〉_{N} = | l_{0} l_{1} l_{2} \dots l_{N - 1} 〉$ |L{\rangle _N} = \left| {{l_0}{l_1}{l_2} \ldots {l_{N - 1}}} \right\rangle 11 $| P 〉_{M} = | p_{0} p_{1} p_{2} \dots p_{M - 1} 〉$ |P{\rangle _M} = \left| {{p_0}{p_1}{p_2} \ldots {p_{M - 1}}} \right\rangle
(iii)
Apply cyclic shift operator C to the first and second registers. Subsequently, the state evolves into state |ψ₂〉. in Eq. (12), where the additions are modulo N additions.12 $| ψ_{2} 〉 = \frac{1}{\sqrt{N}} \sum_{k = 0}^{N - 1} | k 〉_{n} | l_{0 + k} l_{1 + k} l_{2 + k} \dots l_{N - 1 + k} 〉 | p_{0} p_{1} p_{2} \dots p_{M - 1} 〉$ \left| {{\psi _2}} \right\rangle = {1 \over {\sqrt N }}\sum\limits_{k = 0}^{N - 1} {|k{\rangle _n}\left| {{l_{0 + k}}{l_{1 + k}}{l_{2 + k}} \ldots {l_{N - 1 + k}}} \right\rangle \left| {{p_0}{p_1}{p_2} \ldots {p_{M - 1}}} \right\rangle } \;
(iv)
Apply M CNOT gates to the first M qubits in the second register and to all qubits in the third register. For the CNOT operations, each qubit in the second register serves as the control qubit, and each qubit in the third register serves as the target qubit (see Figure 3b). Subsequently, the state evolves into state |ψ〉 as described in Eq. (13).13 $\begin{array}{l} | ψ 〉 & = \frac{1}{\sqrt{N}} \sum_{k = 0}^{N - 1} | k 〉_{n} | l_{0 + k} l_{1 + k} \dots l_{N + k - 1} 〉 \\ | (p_{0} \oplus l_{0 + k}) (p_{1} \oplus l_{1 + k}) \dots (p_{M - 1 + k} \oplus l_{M - 1 + k}) 〉 \end{array}$ \matrix{ {|\psi \rangle } \hfill & { = {1 \over {\sqrt N }}\sum\limits_{k = 0}^{N - 1} | k{\rangle _n}\left| {{l_{0 + k}}{l_{1 + k}} \ldots {l_{N + k - 1}}} \right\rangle } \hfill \cr {} \hfill & {\left| {\left( {{p_0} \oplus {l_{0 + k}}} \right)\left( {{p_1} \oplus {l_{1 + k}}} \right) \ldots \left( {{p_{M - 1 + k}} \oplus {l_{M - 1 + k}}} \right)} \right\rangle } \hfill \cr }

In the remainder of this study, we use the notation in Eqs. (14) and (15).14 $| l_{0 + k} l_{1 + k} \dots l_{N + k - 1} 〉 \equiv {| L_{k} 〉}_{N}$ \left| {{l_{0 + k}}{l_{1 + k}} \ldots {l_{N + k - 1}}} \right\rangle \equiv {\left| {{L_k}} \right\rangle _N} 15 $| (p_{0} \oplus l_{0 + k}) (p_{1} \oplus l_{1 + k}) \dots (p_{M - 1 + k} \oplus l_{M - 1 + k}) 〉 \equiv {| P_{k} 〉}_{M}$ \left| {\left( {{p_0} \oplus {l_{0 + k}}} \right)\left( {{p_1} \oplus {l_{1 + k}}} \right) \ldots \left( {{p_{M - 1 + k}} \oplus {l_{M - 1 + k}}} \right)} \right\rangle \equiv {\left| {{P_k}} \right\rangle _M}

We define the state in G as the state in which the third register is in state |0〉_M.

As described earlier, Grover’s operator Q is given by AR₀A⁻¹R_g (see Figure 3c). If amplitude amplification is carried out using Grover’s operator Q = AR₀A⁻¹R_g on the n + N + M qubits, with operator A defined as in Figure 3a, the pattern P can be identified within the given string L. R₀ and R_g can be constructed using X gates and a multi-qubit controlled-Z gate. A k-qubit controlled-Z gate is decomposed using two H gates and a k-qubit Toffoli gate. In Ref. [4], multi-qubit Toffoli gates are decomposed using relative-phase Toffoli gates in Ref. [23]. The calculations regarding the size and cost of R₀, which is not explicitly presented in Ref. [4], can be found in Appendix B.

3.

Proposed QSM Circuit: Design and Validation

First, we define the relative-phase Fredkin gate analogously to the relative-phase Toffoli gate [23]. The Fredkin gate, also known as the controlled-SWAP gate, has a matrix representation on a computational basis given by $diag (1, 1, 1, 1, 1,_{1}^{0}_{0}^{1}, 1, 1)$ {\mathop{\rm diag}\nolimits} \left( {1,1,1,1,1,_1^0_0^1,1,1} \right). We extend this to define the relative-phase Fredkin gate, which has a matrix representation of $diag (z_{0}, z_{1}, z_{2}, z_{3}, z_{4},_{z_{5}}^{0}_{0}^{z_{6}}, z_{5}, z_{7})$ {\mathop{\rm diag}\nolimits} \left( {{z_0},{z_1},{z_2},{z_3},{z_4},_{{z_5}}^0_0^{{z_6}},{z_5},{z_7}} \right), where each z_i is an arbitrary complex number with norm 1.

A Fredkin gate can be constructed using a Toffoli gate and two CNOT gates (see Figure 4a) [24]. Similarly, a relative-phase Fredkin gate can be constructed using a relative-phase Toffoli gate and two CNOT gates (see Figure 4b). This construction can be easily demonstrated through a straightforward matrix multiplication: 16 $\begin{array}{l} diag ([\begin{matrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{matrix}], [\begin{matrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{matrix}]) \cdot diag (z_{0}, z_{1}, z_{2}, z_{3}, z_{4}, z_{5}, [\begin{matrix} 0 & z_{7} \\ z_{6} & 0 \end{matrix}]) \cdot \\ diag ([\begin{matrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{matrix}], [\begin{matrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{matrix}]) = diag (z_{0}, z_{1}, z_{2}, z_{3}, z_{4}, [\begin{matrix} 0 & z_{6} \\ z_{5} & 0 \end{matrix}], z_{7}) \end{array}$ \matrix{ {{\mathop{\rm diag}\nolimits} \left( {\left[ {\matrix{ 1 & 0 & 0 & 0 \cr 0 & 0 & 0 & 1 \cr 0 & 0 & 1 & 0 \cr 0 & 1 & 0 & 0 \cr } } \right],\left[ {\matrix{ 1 & 0 & 0 & 0 \cr 0 & 0 & 0 & 1 \cr 0 & 0 & 1 & 0 \cr 0 & 1 & 0 & 0 \cr } } \right]} \right)\;\cdot\;{\mathop{\rm diag}\nolimits} \left( {{z_0},{z_1},{z_2},{z_3},{z_4},{z_5},\left[ {\matrix{ 0 & {{z_7}} \cr {{z_6}} & 0 \cr } } \right]} \right)\cdot} \hfill \cr {{\mathop{\rm diag}\nolimits} \left( {\left[ {\matrix{ 1 & 0 & 0 & 0 \cr 0 & 0 & 0 & 1 \cr 0 & 0 & 1 & 0 \cr 0 & 1 & 0 & 0 \cr } } \right],\left[ {\matrix{ 1 & 0 & 0 & 0 \cr 0 & 0 & 0 & 1 \cr 0 & 0 & 1 & 0 \cr 0 & 1 & 0 & 0 \cr } } \right]} \right) = {\mathop{\rm diag}\nolimits} \left( {{z_0},{z_1},{z_2},{z_3},{z_4},\left[ {\matrix{ 0 & {{z_6}} \cr {{z_5}} & 0 \cr } } \right],{z_7}} \right)} \hfill \cr }

Using this method, as shown in Figure 1b, we constructed a relative-phase Fredkin gate from the relative-phase Toffoli gate described in Ref. [23]. The relative-phase Fredkin gate in Figure 1b has a matrix representation of $diag (1, 1, 1, 1, 1,_{- i}^{0}_{0}^{i}, - 1)$ {\mathop{\rm diag}\nolimits} (1,1,1,1,1,_{ - i}^0_0^i, - 1).

Next, we prove that any relative-phase Fredkin gate can replace the Fredkin gates used to construct the cyclic shift operator C in the QSM algorithm.

Lemma 1.

Any quantum circuit X_r composed solely of relative-phase Fredkin gates can be represented by Eq. (17), where X is a quantum circuit in which all relative-phase Fredkin gates in X_r are replaced with Fredkin gates, and D_r and D_l are circuits with diagonal matrix representation.17 $X_{r} = X \cdot D_{r} = D_{l} \cdot X$ {X_r} = X\cdot{D_r} = {D_l}\cdotX

Proof. It is sufficient to verify Eq. (18), where RFred represents an arbitrary relative-phase Fredkin gate, Fred represents a Fredkin gate, and D₁ and D₂ represent circuits with a diagonal matrix representation.18 $RFred = Fred \cdot D_{1} = D_{2} \cdot Fred$ {\rm{RFred }} = {\rm{ Fred }}\cdot{D_1} = {D_2}\cdot{\rm{ Fred}}

Basic matrix multiplication can prove Eq. (18): 19 $\begin{array}{l} diag (z_{0}, z_{1}, z_{2}, z_{3}, z_{4}, [\begin{matrix} 0 & z_{6} \\ z_{5} & 0 \end{matrix}], z_{7}) \\ = diag (1, 1, 1, 1, 1, [\begin{array}{l} 0 & 1 \\ 1 & 0 \end{array}], 1) \cdot diag (z_{0}, z_{1}, z_{2}, z_{3}, z_{4}, z_{5}, z_{6}, z_{7}) \\ = diag (z_{0}, z_{1}, z_{2}, z_{3}, z_{4}, z_{6}, z_{5}, z_{7}) \cdot diag (1, 1, 1, 1, 1, [\begin{array}{l} 0 & 1 \\ 1 & 0 \end{array}], 1) \end{array}$ \matrix{ {{\mathop{\rm diag}\nolimits} \left( {{z_0},{z_1},{z_2},{z_3},{z_4},\left[ {\matrix{ 0 & {{z_6}} \cr {{z_5}} & 0 \cr } } \right],{z_7}} \right)} \hfill \cr { = {\mathop{\rm diag}\nolimits} \left( {1,1,1,1,1,\left[ {\matrix{ 0 \hfill & 1 \hfill \cr 1 \hfill & 0 \hfill \cr } } \right],1} \right)\;\cdot{\mathop{\rm diag}\nolimits} \left( {{z_0},{z_1},{z_2},{z_3},{z_4},{z_5},{z_6},{z_7}} \right)} \hfill \cr { = {\mathop{\rm diag}\nolimits} \left( {{z_0},{z_1},{z_2},{z_3},{z_4},{z_6},{z_5},{z_7}} \right)\;\cdot{\mathop{\rm diag}\nolimits} \left( {{\rm{1, 1, 1, 1, 1,}}\left[ {\matrix{ 0 \hfill & 1 \hfill \cr 1 \hfill & 0 \hfill \cr } } \right],1} \right)} \hfill \cr }

Theorem 1.

Even if arbitrary relative-phase Fredkin gates replace the Fredkin gates in constructing the cyclic shift operator C in the QSM algorithm described in Ref. [4], the algorithm yields the same results.

Proof. Assume that we replace all Fredkin gates in cyclic shift operator C with relative-phase Fredkin gates. We express operators A and C with relative-phase Fredkin gates as A′ and C′. By Lemma 1, C′ can be written as C · D, where D is an (n + N)-qubit operator with a diagonal matrix representation. Subsequently, A|0〉_n |0〉_N|0〉_M and A′|0〉_n|0〉_N|0〉_M evolve into the states in Eqs. (20) and (21), where each z_k is a complex number with norm 1.20 $A | 0 〉_{n} | 0 〉_{N} | 0 〉_{M} = | ψ 〉 = \frac{1}{\sqrt{N}} \sum_{k = 0}^{N - 1} | k 〉_{n} {| L_{k} 〉}_{N} {| P_{k} 〉}_{M} \equiv \sin θ | ϕ_{g} 〉 + \cos θ | ϕ_{b} 〉$ A|0{\rangle _n}|0{\rangle _N}|0{\rangle _M} = |\psi \rangle = {1 \over {\sqrt N }}\sum\limits_{k = 0}^{N - 1} | k{\rangle _n}{\left| {{L_k}} \right\rangle _N}{\left| {{P_k}} \right\rangle _M} \equiv \sin \theta \left| {{\phi _g}} \right\rangle + \cos \theta \left| {{\phi _b}} \right\rangle 21 $A^{'} | 0 〉_{n} | 0 〉_{N} | 0 〉_{M} = | ψ^{'} 〉 = \frac{1}{\sqrt{N}} \sum_{k = 0}^{N - 1} z_{k} | k 〉_{n} {| L_{k} 〉}_{N} {| P_{k} 〉}_{M} \equiv \sin θ^{'} | ϕ_{g}^{'} 〉 + \cos θ | ϕ_{b}^{'} 〉$ {A^\prime }|0{\rangle _n}|0{\rangle _N}|0{\rangle _M} = \left| {{\psi ^\prime }} \right\rangle = {1 \over {\sqrt N }}\sum\limits_{k = 0}^{N - 1} {{z_k}} |k{\rangle _n}{\left| {{L_k}} \right\rangle _N}{\left| {{P_k}} \right\rangle _M} \equiv \sin {\theta ^\prime }\left| {\phi _g^\prime } \right\rangle + \cos \theta \left| {\phi _b^\prime } \right\rangle

Note that θ = θ′ because states |k〉_n|L_k)_N|P_k)_M with different k values are orthonormal. Therefore, Grover’s algorithm using A′ gives the same results as the case using A.

Based on Theorem 1, we can replace every Fredkin gate in the QSM algorithm with the relative-phase Fredkin gate of Figure 1b. Seven T gates are required to synthesize a Fredkin gate into Clifford + T gates (see Figure 1a). However, only four T gates are required to synthesize the relative-phase Fredkin gate of Figure 1b. Therefore, with this replacement, the cyclic shift operator C′ is decomposed into Clifford + T gates using 4(N log₂ N – N + 1) T gates. This results in reducing the leading-order term of the T-count in the QSM algorithm from 14N^3/2 log₂ N to 8N^3/2 log₂ N, assuming that N^1/2 Grover iterations (the number of executions of the Grover operator) are implemented.

To validate the proposed QSM circuit, we simulated the QSM algorithm using the quantum simulator in Qiskit [25]. As shown in Figure 5, the proof-of-principle results exhibit excellent agreement with the expected outcomes of Grover’s search algorithm. Our designed circuit targeted the pattern “11” within the data sequence “00110000”. We varied the number of Grover iterations from 0 to 9 and compared the probabilities of finding the correct answer between the theoretically ideal Grover’s search algorithm and the QSM algorithm implemented with the relative-phase Fredkin gate. Each QSM simulation was executed 10,000 times to estimate the probabilities.

4.

Results

In this section, we present the T-count required for the implementation of the QSM algorithm with relative-phase Fredkin gates. When Grover’s operator is repeatedly implemented N^1/2 times, the leading order term of the total T-count for the QSM algorithm is reduced from 14N^3/2 log₂ N to 8N^3/2 log₂ N. Additionally, we observe further T-count reductions in each iteration of A′R₀(A′)⁻¹, where the primed operators refer to those using the relative-phase Fredkin gates. To construct C′, we sequentially combine $C_{1}^{'}, C_{2}^{'}, C_{4}^{'}, \dots$ C_1^\prime,C_2^\prime,C_4^\prime, \ldots in sequence. For (C′)⁻¹, we combine the inverse gates of the gates constituting C′ in reverse order. Then, as shown in Figure 6, there is a T-count reduction around R₀. Each time the structure of A′R₀(A′)⁻¹ appears in the QSM circuit, the T-count is reduced by 2N compared to the case of the QSM algorithm with Fredkin gates.

The introduction of the proposed relative-phase Fredkin gate for the QSM algorithm also benefits the T-depth by allowing the parallelization of the T gates without additional ancilla qubits for the construction of the cyclic shift operator. The QSM algorithm described in Ref. [4] parallelizes Fredkin gates in the cyclic shift operator by using N/2 – 1 ancilla qubits and fan-out operations with CNOT gates, which leads to the parallelization of the T gates. In contrast, the use of relative-phase Fredkin gates allows for automatic parallelization of T gates without the need for fan-out operations or additional ancilla qubits, as depicted in Figure 7. Compared to the Fredkin gate in Figure 1a with a T-depth of five, the relative-phase Fredkin gate in Figure 1b has a T-depth of four. Consequently, the T-depth required for the cyclic shift operator is reduced to 4/5 of the result in Ref. [4] without additional ancilla qubits. The use of relative-phase Fredkin gates also offers advantages in terms of other circuit costs, such as the number of CNOT gates. In Table 1, the circuit cost of the QSM algorithms proposed in this paper is compared against those previously reported.

Table 1.

The table presents the required resources for the QSM algorithms, including T-count (the number of T gates), T-depth (the depth of T gates), CNOT-count (the number of CNOT gates), and Qubit-count (the number of qubits). Here, N represents the size of the database to be searched, and M represents the size of the pattern. For the comparison, we assume that N^1/2 Grover iterations are implemented.

Source	Ref. [4]	This Work
T-count	14N^3/2 log₂ N – 14N^3/2 + 7N log₂ N – 7N + 8N^1/2 log₂ N + N^1/2(8M – 20) + 7 = 14N^3/2 log₂ N – O(N^3/2)	8N^3/2 log₂ N – 10N^3/2 + 4N log₂ N – 4N + 8N^1/2 log₂ N + N^1/2(8M – 26) + 1 = 8N^3/2 log₂ N – O(N^3/2)
T-depth	$5 N^{1 / 2} \log_{2}^{2} N + 9 N^{1 / 2} \log_{2} N + N^{1 / 2} (4 M + 2) + \frac{5}{2} \log_{2}^{2} N + \frac{5}{2} \log_{2} N = 5 N^{1 / 2} \log_{2}^{2} N + O (N^{1 / 2} \log N)$ 5{N^{1/2}}\log _2^2N\, + \,9{N^{1/2}}{\log _2}N\, + \,{N^{1/2}}(4M + 2)\, + \,{5 \over 2}\,\log _2^2N\, + \;{5 \over 2}\;{\log _2}N = 5{N^{1/2}}\log _2^2N + O\left( {{N^{1/2}}\log N} \right)	$4 N^{1 / 2} \log_{2}^{2} N + 8 N^{1 / 2} \log_{2} N + N^{1 / 2} (4 M - 2) + 2 \log_{2}^{2} N + 2 \log_{2} N = 4 N^{1 / 2} \log_{2}^{2} N + O (N^{1 / 2} \log N)$ 4{N^{1/2}}\log _2^2N\; + \;8{N^{1/2}}{\log _2}N\; + \;{N^{1/2}}(4M - 2)\; + \;2\log _2^2N\; + \;2{\log _2}N = 4{N^{1/2}}\log _2^2N + O\left( {{N^{1/2}}\log N} \right)
CONT-count	16N^3/2 log₂ N – 14N^3/2 + 7N log₂ N – 7N + 10N^1/2 log₂ N + N^1/2(8M – 10) + M + 7 = 16N^3/2 log₂ N – O(N^3/2)	10N^3/2 log₂ N – 10N^3/2 + 5N log₂ N – 5N + 6N^1/2 log₂ N + N^1/2(8M– 14) + M + 5 = 10N^3/2 log₂ N – O(N^3/2)
Qubit-count	$\frac{3}{2} N + \log_{2} N + M - 1$ {3 \over 2}N\; + \;{\log _2}N + M - 1	N + log₂ N + M

One might argue that the presence of an O(N^3/2) term in the second dominant component of the T-count, appearing in both our result and Ref. [4], could diminish the significance of the improvement achieved in the leading coefficient. However, the difference in T-count between the two approaches is given by N^3/2(6 log₂ N – 4) + O(N log N), which remains positive even for relatively small values of N and continues to grow as the problem size increases. As illustrated in Figure 8, the T-count difference between our proposed approach and the one in Ref. [4] becomes increasingly significant with larger values of N. Given that the QSM algorithm is expected to exhibit quantum advantage primarily for large-scale instances, this trend indicates that the proposed method offers progressively greater benefits as the input size grows.

5.

Discussion

The string-matching problem is a widespread challenge in computer science. As the scale of the problem increases, it becomes essential to enhance the efficiency of algorithms by reducing computational costs. In this paper, we have reduced the T-count of the QSM algorithm from 14N^3/2 log₂ N – O(N^3/2) to 8N^3/2 log₂ N – O(N^3/2). Additionally, our approach reduces other circuit costs of the QSM algorithm, including T-depth. Given that T gates dominate the cost of implementing fault-tolerant quantum circuits, our results may expedite the utilization of quantum advantages in solving string-matching problems, potentially impacting fields such as bioinformatics, text processing, and cryptography.

Our primary strategy for reducing circuit costs is to incorporate relative-phase encoding into the quantum states used in the QSM algorithm presented in Ref. [4]. This approach is feasible because the QSM algorithm from Ref. [4] stores information about the string and pattern in basis states rather than in quantum amplitudes. We anticipate that this circuit cost reduction strategy could be broadly applicable to various quantum algorithms. A practical example involving quantum image processing is provided in Appendix C.

To facilitate relative-phase encoding, we introduced the concept of the relative-phase Fredkin gate, inspired by the relative-phase Toffoli gate [23]. Given the extensive use of relative-phase Toffoli gates in optimizing quantum circuits [23,26–28], we expect relative-phase Fredkin gates to have similarly broad applications in quantum circuit optimization. An illustrative example of this application is presented in Appendix D.

By employing the Toffoli gate implementation method presented in Ref. [29] and the Fredkin gate synthesis method presented in Ref. [24], it is possible to execute a Fredkin gate using four T gates. This method can achieve a T-count reduction comparable to our approach. However, this Fredkin gate implementation method necessitates measurement, classically controlled gates, an ancilla qubit, and additional Clifford gates. In contrast, our method, which utilizes the relative-phase Fredkin gate shown in Figure 1b, employs a straightforward circuit synthesis approach. This enables us to reduce the T-count without incurring additional circuit costs. One might also argue that using the method from Ref. [26] allows a Fredkin gate to be implemented with a single T-depth, thereby further reducing the T-depth of the QSM algorithm. However, while this method can create a single Fredkin gate with a lower T-depth, the T gates of different Fredkin gates are not parallelized, ultimately resulting in a higher T-depth compared to our method.

Recently, K. Prousalis et al. proposed a quantum string-matching algorithm based on quantum image processing techniques, where string-matching is performed using a quantum-generated dot-matrix image followed by quantum median filtering to detect similarity patterns [30]. Their approach achieves a time complexity of O(N²) and a space complexity of O(log N) for a string of length N, demonstrating significant improvements in memory efficiency compared to previous quantum methods. They utilized the NEQR (novel enhanced quantum representation) framework [31], which encodes image information in basis states rather than in quantum amplitudes. Therefore, our approach, which incorporates relative-phase encoding in quantum states, may also be applicable to the NEQR-based QSM algorithm proposed by Prousalis et al. [30], potentially reducing quantum circuit complexity in their framework as well. Exploring this possibility would be an interesting direction for future research.

Reducing T-Count In Quantum String Matching Algorithm Using Relative-Phase Fredkin Gate

Full Article

Paradigm

My account