The Overlap Gap Property Limits Limit Swapping in the QAOA

Goh, Mark

doi:10.2478/qic-2025-0018

1.

Introduction

Combinatorial Optimization Problems (COPs) are notoriously difficult even as a decision problem [1]—well-known examples include the travelling salesman problem [2] and finding the ground state of a spin glass Hamiltonian [3]. Rather than attempting to find an exact solution, one is often interested in approximate solutions. One such algorithm is the Quantum Approximate Optimization Algorithm (QAOA) introduced by Farhi et al. [4], a type of variational quantum algorithm that, given p layers, uses 2p optimization parameters.

Attempting to evaluate the expectation value of the QAOA is incredibly difficult. Naively, given a problem with size n, each parameter in the QAOA requires a sum over 2ⁿ terms. In a series of works starting with [5], algorithms to evaluate the expectation value of the QAOA on q-spin glass models with time complexity independent of n have been found with increasing performance. The best known one for evaluating q-spin glass is found in [6] with a time complexity of 𝒪(p²4^p) using algebraic techniques.

Another line of research on the QAOA proves limitations on its performance at shallow depths via the Overlap Gap Property (OGP). One of the first such results uses the locality of the QAOA: at shallow depth, the QAOA does not explore the whole graph underlying a COP and is unable to output a solution that beats the OGP barrier for the Maximum Independent Set problem [7]. Limitations of the QAOA arising from the OGP have predominantly been established for sparse graphs, but a breakthrough came in [8], which uses a dense-from-sparse relation between complete graphs and sparse graphs to show that the QAOA is also limited in performance even if it sees the whole graph. Furthermore, the authors of [8] showed that for any constant p in the asymptotic analysis, the QAOA is unable to surpass the OGP barrier. Assuming a stronger version of the OGP, a similar and slightly stronger result shows a limitation at super-constant depth, p ~ 𝒪(log log n), for dense graphs [9].

In this paper, we re-derive a result of [8]: for the Max-q-XORSAT problem, and equivalently the mean-field q-spin glass, the QAOA is unable to find the optimal value even if p goes to infinity for even q ≥ 4 if we swap the order of limits, the thermodynamic limit and the run time of the algorithm (i.e., taking lim_{p→∞} lim_{n→∞} rather than lim_{n→∞} lim_{p→∞}). More precisely, the analysis done by various authors studies the performance of the QAOA asymptotically by taking the problem size n to infinity and studying the output of the QAOA at constant depth p. As a result, the underlying graph explored by the QAOA appears as a tree, and the parameters that optimize the QAOA’s performance in this setting are known as tree-parameters, which work well on any Hamiltonian instance [10,11]. While this is a re-derivation of a known result [8], the proof in [8] is significantly more technical than ours. Furthermore, our theorem provides some hints that their proof can be extended beyond constant p and hold when p ≤ ϵ log n for dense graphs, since we show that their method is implicitly valid at logarithmic depth for sparse graphs.

The paper is organized as follows: In Section 2 we give a brief background to random graphs, spin glass problems, the OGP, and the QAOA; in Section 3, we summarize what is known in literature about the results of the QAOA on spin glasses and their equivalence between the mean-field and dilute spin model; in Section 4 we formalize the point about the OGP in random regular hypergraphs that was mentioned in [6] and re-derive the result in [8], that the QAOA cannot find the optimal value for a spin glass COP even if the algorithm runs indefinitely under limit swapping. Following that, we outline a proof to affirm the theorem while providing some numerical evidence that the proof should extend to odd q-spin glass as well.

1.1.

Statement of Result

The main result of this work is to show that the OGP exists as a limitation for Max-q-XORSAT on a random regular hypergraph with sufficiently large degree. This is done via the following theorem.

Theorem 1.

If the Overlap Gap Property limits the performance of a local algorithm on the Erdös–Rényi hypergraph at logarithmic depth p, it also limits the performance on a random regular hypergraph at logarithmic depth p.

The proof is by contradiction. First, we show that we can trim the Erdös–Rényi hypergraph of average degree λ to a regular hypertree by removing at most an 𝒪(1/λ^{log λ})-fraction of edges. Then, one can form a λ-regular hypergraph from the hypertree. Assuming that an algorithm limited by the OGP is able to find a near-optimal solution on the regular hypergraph leads to a contradiction, since such an algorithm is unable to find a near-optimal solution on the Erdös–Rényi hypergraph. Thus, the OGP also acts as a barrier to optimization on random regular hypergraphs.

This limitation to logarithmic depth local algorithms leads us to re-derive a recently discovered theorem and improve upon its results:

Theorem 2

(Informal, theorems 3 and 4 of [8]). When q ≥ 4 and is even, the performance of the QAOA is limited at logarithmic depth for the Max-q-XORSAT with an underlying D-regular q-uniform hypergraph. The performance is strictly upper-bounded by the OGP since

$$\lim_{n \to \infty} \lim_{p \to \epsilon \log n} \frac{1}{|E|} \langle \gamma, \beta | H_{XOR}^{q} | \gamma, \beta \rangle = \frac{1}{2} + v_{\infty}^{[q]}(D, \gamma, \beta) \sqrt{\frac{q}{2D}}, \tag{1}$$

with $v_{\infty}^{[q]}(D, \gamma, \beta) < v_{\infty}^{[q]}(\gamma, \beta) < \Pi_{q}$.

There are several immediate corollaries of this result for the QAOA. The first of which was noted in [6] as a side-note.

Corollary 1.

Optimizing the QAOA using the algorithm in [6,8] only allows it to match the performance of local classical algorithms, thus providing no quantum advantage.

The above corollary is a result of optimizing the QAOA under limit swapping of the algorithm run time p and the problem size n. This therefore results in the following corollary:

Corollary 2.

If a COP exhibits the OGP, then optimizing QAOA via limit swapping (i.e. using the tree parameters) results in sub-optimal performance.

2.

Background

2.1.

Random Graphs

Here, we standardize the notation we use to denote a hypergraph. A hypergraph G = (V, E) has |V| = n vertices and |E| = m edges. A hypergraph is q-uniform if every hyperedge connects exactly q vertices, and d-regular if every vertex has degree d. Conventionally, an instance of the Erdös–Rényi–(Gilbert) q-uniform hypergraph $G = G_{ER}^{q}(n, p)$ is a random hypergraph with n vertices where each hyperedge is added independently with probability p. The original Erdös–Rényi q-uniform hypergraph $G = G_{ER}^{q}(n, m)$ is chosen uniformly at random from the set of hypergraphs with n vertices and m hyperedges. The former is now more frequently used. The two types of Erdös–Rényi hypergraphs are similar to each other when $\binom{n}{q} p = m$. In fact, it has been shown that the two types of random graphs are asymptotically equivalent under certain conditions [12].
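As an illustration of the two Erdös–Rényi models just described, the following minimal Python sketch samples a small q-uniform hypergraph from each and matches their edge counts through the relation between p and m. The function names (`sample_gnp`, `sample_gnm`, `matching_probability`) and the toy sizes are our own assumptions for illustration, and the enumeration of all possible hyperedges is only practical for very small n.

```python
import itertools
import math
import random


def sample_gnp(n, q, p, rng):
    """Gilbert-style model: keep each of the C(n, q) possible hyperedges with probability p."""
    return [e for e in itertools.combinations(range(n), q) if rng.random() < p]


def sample_gnm(n, q, m, rng):
    """Original Erdos-Renyi model: choose exactly m distinct hyperedges uniformly at random."""
    all_edges = list(itertools.combinations(range(n), q))
    return rng.sample(all_edges, m)


def matching_probability(n, q, m):
    """Edge probability p for which the expected edge count C(n, q) * p equals m."""
    return m / math.comb(n, q)


# Hypothetical toy sizes, chosen only so that full enumeration stays cheap.
rng = random.Random(0)
n, q, m = 12, 3, 40
p = matching_probability(n, q, m)
edges_nm = sample_gnm(n, q, m, rng)   # always exactly m hyperedges
edges_np = sample_gnp(n, q, p, rng)   # m hyperedges only on average
```

The (n, m) sample always has exactly m hyperedges, whereas the (n, p) sample has m hyperedges only in expectation.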

Another type of random hypergraph of interest is the d-regular q-uniform hypergraph G = ℝ^q(n, d), where we implicitly assume that nd = qx for some integer x. Unlike Erdös–Rényi graphs that can be generated randomly, there is no easy unbiased way to generate such graphs, though one such method is known as the configuration model introduced by Bollobás [13].
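The stub-matching idea behind the configuration model can be sketched in a few lines of Python. This is a simplified illustration rather than the construction of [13]: the function names are hypothetical, and the raw sample may contain a repeated vertex inside a hyperedge or duplicate hyperedges, which practical generators typically reject or repair.

```python
import random


def configuration_model(n, d, q, rng):
    """Give every vertex d stubs, shuffle them, and group consecutive q stubs into a hyperedge.

    The output is a multi-hypergraph: repeated vertices within a hyperedge and
    duplicate hyperedges are possible and are not removed here.
    """
    if (n * d) % q != 0:
        raise ValueError("n * d must be divisible by q")
    stubs = [v for v in range(n) for _ in range(d)]
    rng.shuffle(stubs)
    return [tuple(stubs[i:i + q]) for i in range(0, len(stubs), q)]


def degrees(n, edges):
    """Count how many stubs of each vertex appear across all hyperedges."""
    deg = [0] * n
    for edge in edges:
        for v in edge:
            deg[v] += 1
    return deg


# Hypothetical toy sizes: 12 vertices, degree 4, hyperedges of size 3.
reg_edges = configuration_model(n=12, d=4, q=3, rng=random.Random(1))
reg_degrees = degrees(12, reg_edges)
```

By construction every vertex ends up with degree exactly d and the sample has nd/q hyperedges, which is why the divisibility condition nd = qx is needed.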

2.2.

Spin Glass

In this paper, we focus on the mean field spin glass model and the related dilute spin glass model. For the mean field model, the main goal, roughly speaking, is to find the ground state energy of a spin glass Hamiltonian. A widely studied model is the Sherrington–Kirkpatrick (SK) model [14]. More generally, an Ising mean field q-spin model is given by the following

$$H_{q}(z) = \sqrt{\frac{q!}{2 n^{(q-1)}}} \sum_{j < k < \dots < q} J_{jk \dots q} z_{j} z_{k} \dots z_{q}, \tag{2}$$

where the couplings are randomly chosen over a normal distribution and z_i ∈ {–1,1}.

The ground state energy of the mean-field spin glass can be computed exactly [15], which we denote as the Parisi constant

$$\Pi_{q} = \lim_{n \to \infty} \max_{z \in \{-1, 1\}^{n}} \frac{H_{q}(z)}{n}. \tag{3}$$

The dilute spin glass model is also known as the XOR-satisfiability (XORSAT) problem. Specifically, given a q-uniform hypergraph G = (V, E) where E ⊂ V^q, and a given signed weight J_{i₁,…,i_q} ∈ {–1,+1}, Max-q-XORSAT is the problem of maximizing the following cost function

$$H_{XOR}^{q}(z) = \sum_{(i_{1}, \dots, i_{q}) \in E} \frac{1}{2} \left(1 + J_{i_{1} i_{2} \dots i_{q}} z_{i_{1}} z_{i_{2}} \dots z_{i_{q}}\right). \tag{4}$$

In terms of the hypergraph, the variables $z_{i_j}$ correspond to the vertices and the couplings J_{i₁i₂…i_q} correspond to the hyperedges.
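To make the Max-q-XORSAT cost function concrete, here is a small Python sketch that evaluates it for a given assignment and finds the optimum of a toy instance by exhaustive search. The helper names and the example instances are hypothetical, and the brute-force search over all 2ⁿ assignments is only meant for tiny n.

```python
import itertools


def xorsat_cost(z, edges, couplings):
    """Number of satisfied clauses: sum over hyperedges of (1 + J * prod_i z_i) / 2."""
    total = 0
    for edge, J in zip(edges, couplings):
        prod = J
        for i in edge:
            prod *= z[i]
        total += (1 + prod) // 2   # prod is +1 or -1, so each clause adds 1 or 0
    return total


def brute_force_max(n, edges, couplings):
    """Maximum of the cost function over all assignments z in {-1, +1}^n."""
    return max(xorsat_cost(z, edges, couplings)
               for z in itertools.product((-1, 1), repeat=n))


# Hypothetical 3-uniform instance on 4 vertices with signed couplings.
toy_edges = [(0, 1, 2), (1, 2, 3), (0, 2, 3), (0, 1, 3)]
toy_couplings = [1, 1, 1, -1]
all_up_value = xorsat_cost((1, 1, 1, 1), toy_edges, toy_couplings)
best_value = brute_force_max(4, toy_edges, toy_couplings)

# Two clauses on the same vertices with opposite signs can never both hold.
frustrated_best = brute_force_max(3, [(0, 1, 2), (0, 1, 2)], [1, -1])
```

The first toy instance is satisfiable, while the second is a minimal frustrated instance in which at most half of the clauses can be satisfied.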

We say that an instance of the problem is satisfiable if there is an assignment of values to the bit-string z which satisfies all the clauses (i.e. $H_{XOR}^{q}(z) = |E|$). Otherwise, we say it is unsatisfiable. Suppose we fix d > q sufficiently large so that we are in the unsatisfiable regime. The maximum fraction of satisfiable equations in an instance of random XORSAT for both $G_{ER}^{q}(n, p)$ and ℝ^q(n, d) has been found [16] to be

$$\frac{1}{|E|} \max_{z} H_{XOR}^{q}(z) = \frac{1}{2} + \Pi_{q} \sqrt{\frac{q}{2d}} + \mathcal{O}(1/\sqrt{d}). \tag{5}$$

2.3.

Overlap Gap Property

One major obstacle to finding optimal solutions for COPs is known as the Overlap Gap Property (OGP). The term was introduced in [17], though the concept was already used by various authors [18,19].

For the definition of the OGP, one can informally think that for certain choices of disorder J, there is a gap in the set of possible pairwise overlaps of near-optimal solutions. That is, for every two near-optimal solutions z¹, z², the distance between them is either extremely small or extremely large. Formally, we define the OGP for a single instance as follows:

Definition 1 (Overlap Gap Property [20]). For a general maximization problem with random input J, the OGP holds if there exist some ϵ > 0 and 0 ≤ μ₁ < μ₂ such that for every pair z¹, z² of ϵ-optimal solutions, i.e. solutions satisfying

$$H_{J}(z^{i}) \geq \max_{z \in 2^{n}} H_{J}(z) - \epsilon, \tag{6}$$

it holds that the (normalized) overlap between them is either at most μ₁ or at least μ₂,

$$|R_{1,2}| \in [0, \mu_{1}] \cup [\mu_{2}, 1], \tag{7}$$

where

$$R_{1,2} = \frac{1}{n} \sum_{i=1}^{n} z_{i}^{1} z_{i}^{2}. \tag{8}$$

Rather than using the overlap, it is often easier to visualize sets of solutions as clusters using the Hamming distance between two states via the relation

$$\frac{H_{1,2}}{n} = \frac{1}{2} |1 - R_{1,2}| \in [0, a] \cup [b, 1], \tag{9}$$

where $H_{1,2}$ is the Hamming distance between states z¹ and z² and 0 < a < b < 1.
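The relation between the overlap and the normalized Hamming distance is easy to check numerically. The short Python sketch below uses two hypothetical ±1 strings; the helper names are ours.

```python
def overlap(z1, z2):
    """Normalized overlap R_{1,2} = (1/n) * sum_i z1_i * z2_i for strings in {-1, +1}^n."""
    return sum(a * b for a, b in zip(z1, z2)) / len(z1)


def hamming_fraction(z1, z2):
    """Normalized Hamming distance: fraction of positions where the two strings differ."""
    return sum(a != b for a, b in zip(z1, z2)) / len(z1)


# Hypothetical pair of spin configurations that differ in two of six positions.
z_a = (1, 1, -1, -1, 1, 1)
z_b = (1, -1, -1, 1, 1, 1)
r_ab = overlap(z_a, z_b)
h_ab = hamming_fraction(z_a, z_b)
```

Each position where the strings agree contributes +1 to the overlap sum and each disagreement contributes –1, so the Hamming fraction equals (1 – R)/2; a gap in one quantity is therefore equivalent to a gap in the other.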

The first interval is trivial, as we can simply take the overlap of z¹ with itself. It is the existence of the second interval, or rather the absence of pairs of near-optimal solutions at distances in the interval (a, b), that is difficult to prove.

A more general version is known as the ensemble-OGP introduced in [21], or the coupled-OGP as used in [22]. This version is required to prove limitations of local algorithms for technical reasons and requires an interpolation scheme between two different instances of Erdös–Rényi graphs. For spin glasses, a branching OGP has been developed that makes use of the ultrametric structure in the Parisi solution [23].

Informally speaking, one shows that the OGP limits the performance of algorithms by first showing that the set of near-optimal solutions exhibits a strong clustering property. That is, with high probability, a gap exists in their overlaps or, equivalently, in the Hamming distance between near-optimal solutions. Then, one proceeds to show that if an algorithm were able to find arbitrary near-optimal solutions, it would output a solution in the region forbidden by the OGP. In the context of the QAOA, one typically uses the concentration of measure to prove the limitation of the QAOA at shallow depth [8,22]. We refer the reader to the review papers by Gamarnik [3,20] for details on how the OGP limits the performance of algorithms.

2.4.

QAOA

The QAOA is a local quantum algorithm [22] designed to find approximate solutions to combinatorial optimization problems [4]. The goal is to find a bit string z ∈ {–1, +1}ⁿ that maximizes the cost function C(z). Given a classical cost function C, we can define a corresponding quantum operator $\hat{C}$ that is diagonal in the computational basis, $\hat{C} |z\rangle = C(z) |z\rangle$. In addition, define the operator $\hat{B} = \sum_{j=1}^{n} \hat{X}_{j}$ where $\hat{X}_{j}$ is the Pauli X operator acting on qubit j. Given a set of parameters γ = (γ₁, …, γ_p) ∈ ℝ^p and β = (β₁, …, β_p) ∈ ℝ^p, the QAOA prepares the initial state as $|s\rangle = |+\rangle^{n} = 2^{-n/2} \sum_{z} |z\rangle$ and applies p layers of alternating unitary operators $e^{-i \gamma_{k} \hat{C}}$ and $e^{-i \beta_{k} \hat{B}}$ to prepare the state

$$|\gamma, \beta\rangle = e^{-i \beta_{p} \hat{B}} e^{-i \gamma_{p} \hat{C}} \dots e^{-i \beta_{1} \hat{B}} e^{-i \gamma_{1} \hat{C}} |s\rangle. \tag{10}$$

For a given cost function C, the corresponding QAOA objective function is the expectation value $\langle \gamma, \beta | \hat{C} | \gamma, \beta \rangle$. The goal in the QAOA at depth p is to find the 2p optimal parameters (γ*, β*) that maximize this expectation value (i.e. (γ*, β*) = arg max_{γ, β} 〈γ, β|C|γ, β〉). Heuristic strategies to optimize 〈γ, β|C|γ, β〉 with respect to (γ, β) using a good initial guess have been proposed in [24]. Recently, when applying the QAOA to regular graphs with girth greater than 2p + 1, the graph appears locally as a regular tree, and the authors of [6] developed an algorithm to find the optimal parameters, which we call tree-parameters.
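For small instances, the QAOA state and its objective can be evaluated by direct state-vector simulation, which makes the 2ⁿ cost of the naive approach explicit. The following NumPy sketch is only illustrative: the function names are ours, the cost is taken to be the Max-q-XORSAT function with ±1 couplings, and it is not the n-independent algorithm of [6].

```python
import itertools

import numpy as np


def cost_vector(n, edges, couplings):
    """Diagonal of C-hat: the value C(z) for every z in {-1, +1}^n (first spin varies slowest)."""
    values = []
    for z in itertools.product((1, -1), repeat=n):
        values.append(sum((1 + J * np.prod([z[i] for i in e])) / 2
                          for e, J in zip(edges, couplings)))
    return np.array(values, dtype=float)


def apply_mixer(state, n, beta):
    """Apply exp(-i beta B) = prod_j exp(-i beta X_j), one qubit at a time."""
    c, s = np.cos(beta), -1j * np.sin(beta)
    psi = state.reshape([2] * n)
    for j in range(n):
        psi = np.moveaxis(psi, j, 0)
        # exp(-i beta X) = [[cos(beta), -i sin(beta)], [-i sin(beta), cos(beta)]]
        psi = np.stack([c * psi[0] + s * psi[1], s * psi[0] + c * psi[1]])
        psi = np.moveaxis(psi, 0, j)
    return psi.reshape(-1)


def qaoa_expectation(n, edges, couplings, gammas, betas):
    """Expectation of C-hat in the depth-p QAOA state, with p = len(gammas)."""
    cost = cost_vector(n, edges, couplings)
    state = np.full(2 ** n, 2 ** (-n / 2), dtype=complex)   # |s> = uniform superposition
    for gamma, beta in zip(gammas, betas):
        state = np.exp(-1j * gamma * cost) * state           # exp(-i gamma C) is diagonal
        state = apply_mixer(state, n, beta)                  # exp(-i beta B)
    return float(np.real(np.vdot(state, cost * state)))


# Hypothetical single-edge instance with J = -1, i.e. the clause asks for z_0 != z_1.
single_edge = qaoa_expectation(2, [(0, 1)], [-1], [np.pi / 2], [np.pi / 8])
# With no layers (p = 0), the uniform state satisfies half of the clauses on average.
depth_zero = qaoa_expectation(3, [(0, 1, 2)], [1], [], [])
```

On the single-edge instance, one layer with γ = π/2 and β = π/8 already satisfies the clause with certainty, while at depth zero the expectation is |E|/2.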

3.

Summary of Known Theorems

3.1.

QAOA and COPs

The first result of the QAOA on spin glasses can be found in [5] where the authors applied the QAOA on the Sherrington–Kirkpatrick (SK) model and found an algorithm to evaluate the expectation value in the infinite limit after averaging over the disorder J. More generally, for a q-spin glass with cost function

$$H_{q}(z) = \sum_{j < k < \dots < q} J_{jk \dots q} z_{j} z_{k} \dots z_{q}, \tag{11}$$

it was shown in [8] that the following theorem holds.

Theorem 3

(Theorem 1 of [8]). For any p and any parameters (γ, β), we have

$$\lim_{n \to \infty} \mathbb{E}_{J} \left[ \langle \gamma, \beta | H_{q} / n | \gamma, \beta \rangle \right] = V_{p}^{(q)}(\gamma, \beta), \tag{12}$$

where $V_{p}^{(q)}(\gamma, \beta)$ is some analytic expression that can be computed explicitly.

In [6], the authors evaluated the performance of the QAOA for Max-q-XORSAT on large-girth (D + 1)-regular graphs. By restricting to graphs that are regular and have girth (also known as the shortest Berge-cycle) greater than 2p + 1, the subgraph explored by the QAOA at depth p will appear as a regular tree. Since the optimal cut fraction is of the form $1/2 + \mathcal{O}(1/\sqrt{D})$ in a typical random graph as in (5), we have

$$\frac{1}{|E|} \langle \gamma, \beta | H_{XOR}^{q} | \gamma, \beta \rangle = \frac{1}{2} + v_{p}^{[q]}(D, \gamma, \beta) \sqrt{\frac{q}{2D}} + \mathcal{O}(1/\sqrt{D}). \tag{13}$$

Let

$$v_{p}^{[q]}(\gamma, \beta) = \lim_{D \to \infty} v_{p}^{[q]}(D, \gamma, \beta), \tag{14}$$

then we have the following theorem.

Theorem 4

(Theorem 2 of [6]). For $H_{XOR}^{q}$ on any (D + 1)-regular q-uniform hypergraph with girth > 2p + 1, for any choice of J, (13) can be evaluated using 𝒪(p4^pq) time and 𝒪(4^p) space. In addition, the infinite D limit can be evaluated with an iteration using 𝒪(p²4^p) time and 𝒪(p²) space.

Furthermore, the authors also made the following conjecture based on promising numerical evidence.

Conjecture 5 (Conjecture of [6]). Optimizing the QAOA using tree-parameters found in [6], the Parisi value for the Sherrington–Kirkpatrick model can be reached:

$$\lim_{p \to \infty} v_{p}^{[2]}(\gamma, \beta) = \Pi_{2}. \tag{15}$$

Remark 1.

Note that to compute $v_{p+1}^{[2]}(\gamma, \beta)$, one technically first computes $v_{p+1}^{[2]}(D, \gamma, \beta)$ before taking the large D limit $\lim_{D \to \infty} v_{p+1}^{[2]}(D, \gamma, \beta)$, so the statement is saying that $\Pi_{2} = \lim_{D \to \infty} \lim_{p \to \infty} v_{p}^{[2]}(D, \gamma, \beta)$.

While we are unable to prove or disprove the conjecture, we prove later that the generalized Parisi value Π_q is not obtainable if the OGP is present.

One point to note is that in [7], it has been shown that at low depth p, if a problem exhibits the OGP, then the locality of the QAOA makes it such that it is prevented from getting close to the optimal value if it does not see the whole graph. Specifically, the following theorem is proven.

Theorem 6

(modified version of Corollary 4.4 in [22]). For Max-q-XORSAT on a random Erdös–Rényi directed multihypergraph, for every even q ≥ 4, there exists a value η_OGP < η_OPT, where η_OPT is the energy of the optimal solution, and a sequence {δ(d)}_{d≥1} with the following property: for every ϵ > 0 there exists sufficiently large d₀ such that for every d > d₀, every p < δ(d) log n and an arbitrary choice of parameters γ, β, with probability converging to 1 as n → ∞, the performance of the QAOA with depth p satisfies $\langle \gamma, \beta | C_{XOR}^{q} / n | \gamma, \beta \rangle \leq \eta_{OGP} + \epsilon$.

The authors of [6] noted that, assuming the OGP also holds for regular hypergraphs, a similar argument can be used to show that the QAOA’s performance as measured by the algorithm in Theorem 4 does not converge to the Parisi value Π_q for even q ≥ 4. This is because the large girth assumption implies that the graph has at least D^p vertices, so p is always less than ϵ log n in this limit. For the Max-q-XORSAT, the subgraph explored at constant p has q[(q – 1)^p D^p + … + (q – 1)D + 1] vertices. This lays the foundation of Theorem 10 later.
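The vertex count of the explored subgraph can be tabulated to see why the depth is at most logarithmic in n. The Python sketch below reads the leading term of the formula as (q – 1)^p D^p, which is our interpretation of the expression above; the function names are hypothetical.

```python
def explored_vertices(q, D, p):
    """q * [((q-1) D)^p + ... + (q-1) D + 1]: vertices in the depth-p tree around one hyperedge."""
    return q * sum(((q - 1) * D) ** k for k in range(p + 1))


def max_tree_depth(n, q, D):
    """Largest depth p whose tree neighbourhood has at most n vertices (0 if even depth 1 does not fit)."""
    p = 0
    while explored_vertices(q, D, p + 1) <= n:
        p += 1
    return p


# Hypothetical check for ordinary graphs (q = 2) with branching factor D = 3.
size_depth_two = explored_vertices(2, 3, 2)
depth_for_million = max_tree_depth(10 ** 6, 2, 3)
```

Since the count grows geometrically in p with ratio (q – 1)D, the largest admissible depth grows like log n divided by log((q – 1)D), in line with the bound p < ϵ log n.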

3.2.

Equivalence of Performance

The first equivalence between spin glass and MaxCUT for the QAOA was shown in [6], where the performance of the QAOA at depth p on the SK model as n → ∞ is equal to the performance of the QAOA at depth p on MaxCUT problems on large-girth (D + 1)-regular graphs when D → ∞. In the follow-up work [8], the authors generalize this result to show that the QAOA’s performance for the q-spin model is equivalent to that for Max-q-XORSAT on any large-girth D-regular hypergraph in the limit D → ∞.

Theorem 7

(Theorem 3 of [8]). Let $v_{p}^{[q]}(\gamma, \beta)$ be the performance of the QAOA on any instance of Max-q-XORSAT that has an underlying D-regular, q-uniform hypergraph with girth > 2p + 1 as given in [6]. Then for any p and any parameters (γ, β), we have

$$V_{p}^{(q)}(\gamma, \beta) = \sqrt{2}\, v_{p}^{[q]}(\sqrt{q}\, \gamma, \beta). \tag{16}$$

The equivalence of performance of the QAOA on dense and sparse graphs is also shown to hold in the case of Erdös–Rényi graphs.

Theorem 8

(modified version of theorem 2 in [8]). Let

$$V_{p}(\mathbb{G}, \gamma, \beta) = \lim_{n \to \infty} \mathbb{E}_{J \sim \mathbb{G}(n)} \langle \gamma, \beta | H_{J} / n | \gamma, \beta \rangle, \tag{17}$$

where 𝔾 denotes the underlying graph and H_J the cost function associated with it. Then, for the q-spin model 𝔾_q and the Erdös–Rényi graph with connectivity λ, the asymptotic performance of the QAOA on 𝔾_q is the same as on $\mathbb{G}_{ER}^{q}$ for any (γ, β):

$$V_{p}^{(q)}(\gamma, \beta) = \lim_{\lambda \to \infty} V_{p}\left(\mathbb{G}_{ER'}^{q}, \gamma, \beta\right). \tag{18}$$

4.

Main Results

We now have the pieces in place to state our main theorem.

Theorem 9.

Given a local algorithm 𝒜 that is limited in performance up to depth p = ϵ log n on an Erdös–Rényi hypergraph with sufficiently large average degree λ, 𝒜 is also limited in performance up to depth p on a random D-regular hypergraph for sufficiently large D.

We delay the proof to Section 4.1 and note an immediate consequence for the performance of the QAOA.

Theorem 10

For Max-q-XORSAT on a D-regular q-uniform hypergraph, for every even q ≥ 4, there exists a value η_OGP smaller than the optimal value η_OPT with the following property: for every ϵ > 0 there exists sufficiently large d₀ such that for every d > d₀, every p ≤ d log n and an arbitrary choice of parameters γ, β, with probability converging to 1 as n → ∞, the performance of the QAOA with depth p satisfies

$$\frac{\langle \gamma, \beta | C_{XOR}^{q} / n | \gamma, \beta \rangle}{|E|} = \frac{1}{2} + v_{\infty}^{[q]}(D, \gamma, \beta) \sqrt{\frac{q}{2D}} + \mathcal{O}(1/\sqrt{D}) < \eta_{OGP} + \epsilon. \tag{19}$$

Proof. The proof of the inequality follows from Theorem 6 and Theorem 9.

The proof of the equality comes from Theorem 9, which states that evaluating $v_{p}^{[q]}(D, \gamma, \beta)$ is valid up to p = ϵ log n. In the case of MaxCUT or Max-q-XORSAT, we have

$$\lim_{n \to \infty} \lim_{p \to \epsilon \log n} \frac{1}{|E|} \langle \gamma, \beta | H_{XOR}^{q} | \gamma, \beta \rangle = \frac{1}{2} + v_{\infty}^{[q]}(D, \gamma, \beta) \sqrt{\frac{q}{2D}} + \mathcal{O}(1/\sqrt{D}) \tag{20}$$

$$< \frac{1}{2} + \Pi_{q} \sqrt{\frac{q}{2D}} + \mathcal{O}(1/\sqrt{D}), \tag{21}$$

where $v_{\infty}^{[q]}(D, \gamma, \beta) = \lim_{p \to \infty} v_{p}^{[q]}(D, \gamma, \beta) < v_{\infty}^{[q]}(\gamma, \beta) < \Pi_{q}$.

As a result of this theorem, we re-derive and improve a theorem of [8]:

Corollary 3

(Theorem 4 of [8]). From Theorem 10 the performance of the QAOA on the pure q-spin glass for even q ≥ 4 is upper bounded by η_OGP as p → ∞ and is strictly less than the optimal value, i.e. the Parisi value Π_q, under the swapping of limits.

We emphasize again that while [8] proves this corollary, it does so via explicit calculation of the QAOA using a highly technical proof involving a generalized multinomial sum. This work improves upon it by providing a simpler, straightforward argument using graph properties and extending their result from constant depth to ϵ log n depth for the random regular graph. Note that both proofs require Theorem 6 to prove this corollary.

Unfortunately, the QAOA performance on mean-field spin glass has only been shown to be equivalent in performance to the dilute spin glass model via Eq. (16) at constant depth p, and we are unable to find a proof to extend their equivalence. As such, we are not able to prove that the OGP acts as a barrier on dense graphs at logarithmic depth. If one can show that an asymptotic analysis of the mean-field spin glass at constant depth and logarithmic depth result in the same solution as in the random regular graph, this could extend the algorithmic barrier for dense graphs to logarithmic depth as well.

We note that Corollary 3 shows that the QAOA will not be able to find the optimal value even when it sees the whole graph and the algorithm runs indefinitely, if one optimizes the parameters of the QAOA using the tree-parameters obtained via asymptotic analysis. Formally, the Parisi value is attainable via the QAOA with the following limits

$$\lim_{n \to \infty} \lim_{p \to \infty} \mathbb{E}_{J} \langle \gamma, \beta | H_{q} / n | \gamma, \beta \rangle = \Pi_{q}. \tag{22}$$

The iteration provided in Theorem 4 swaps the limits which results in failure of the QAOA to find the optimal value. This leads us to the following corollaries.

Corollary 4.

If the OGP exists in a spin glass type problem, then the swapping of limits results in a sub-optimal solution for both random regular graphs and Erdös–Rényi graphs. In other words, a necessary condition for the validity of limit swapping is that the problem does not exhibit the OGP.

Remark 2.

For q-spin glass, it is expected that OGP holds for all q ≥ 3, which suggests that limit swapping is not allowed for all mean-field spin glasses with the possible exception for the 2-spin glass model (i.e. the SK model) [25].

In addition, Corollary 4 also applies to the Maximum Independent Set problem since [7] shows a similar limitation at logarithmic depth. We suspect that similar results extend to all COPs rather than being limited to spin glass type COPs.

4.1.

Proof of Theorem 9

Proof. The proof is as follows: first, we need to show that for some λ ∈ ℤ⁺, a λ-regular q-uniform hypertree can be embedded into an Erdös–Rényi hypergraph of sufficiently high average connectivity. Then, we show that a λ-regular q-uniform hypergraph can be generated from said hypertree. Finally, we show that algorithm 𝒜 must also fail to find solutions arbitrarily close to the optimal solution in a λ-regular q-uniform hypergraph as doing so would result in a contradiction.

We note that a COP instance with m chosen edges can be converted into a regular instance changing only $o_{\lambda}(1/\sqrt{\lambda})$¹ edges. This reduction has been used several times before as in [16,26] and recently in [27]. We follow the proof of [27] but simplify one of the lemmas using $G_{ER}^{q}(n, p)$ later to avoid the combinatoric arguments needed when considering $G_{ER}^{q}(n, m)$, and extend the argument from constant depth l to ϵ log n.

Here, we show that an Erdös–Rényi hypergraph can be converted into a hypertree by changing only $o(1/\sqrt{\lambda})$ edges. Given $G_{ER}^{q}(n, m)$ with average degree λ, define $\lambda' = \lceil \lambda + \sqrt{\lambda} \log \lambda \rceil$. Let d_i be the degree of vertex V_i. Modify the graph as follows:

Remove edges until d_i ≤ λ′ for all vertices.
Add edges to all vertices until d_i = λ′ where each vertex is chosen with probability proportional to λ′ – d_i.

In order to prove this, we will need the following lemmas.

Lemma 1.

In the limit n → ∞, the number of edges removed from $G_{ER}^{q}(n, m)$ is at most n · 𝒪_λ(1/λ^{c log λ}) for some constant c > 0.

Proof. Note that the degree of a vertex in an Erdös–Rényi hypergraph follows a binomial distribution B(m, 1/n). For each vertex v_i, the number of edges removed is either 0 or d_i – λ′, so Δ_i := max(d_i – λ′, 0). The first moment of Δ_i is bounded by

$$\mathbb{E}[\Delta_{i}] = \sum_{k > \lambda'} P(d_{i} = k) (k - \lambda') = \sum_{k > \lambda'} P(d_{i} \geq k) \leq \int_{\lambda'}^{\infty} P(d_{i} \geq x)\, dx. \tag{23}$$

Here, in the second equality, we used the fact that the expectation of a non-negative integer-valued random variable equals the sum of its tail probabilities.
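The tail-sum identity used for the first moment can be verified numerically for a binomial degree distribution. The sketch below is a sanity check under assumed toy parameters (m trials with success probability 1/20, so mean degree λ = 10, and the natural logarithm in the definition of λ′); the function names are ours.

```python
from math import ceil, comb, log, sqrt


def binomial_pmf(m, prob, k):
    """P(d = k) for d ~ Binomial(m, prob)."""
    return comb(m, k) * prob ** k * (1 - prob) ** (m - k)


def excess_moment_direct(m, prob, cap):
    """E[max(d - cap, 0)] computed term by term from the probability mass function."""
    return sum(binomial_pmf(m, prob, k) * (k - cap) for k in range(cap + 1, m + 1))


def excess_moment_tail(m, prob, cap):
    """The same expectation via the tail-sum formula: sum over k > cap of P(d >= k)."""
    pmf = [binomial_pmf(m, prob, k) for k in range(m + 1)]
    tail, total = 0.0, 0.0
    for k in range(m, cap, -1):   # k = m, m - 1, ..., cap + 1
        tail += pmf[k]            # tail now equals P(d >= k)
        total += tail
    return total


# Hypothetical parameters: mean degree lam = m * prob = 10.
m_trials, prob, lam = 200, 1 / 20, 10
cap = ceil(lam + sqrt(lam) * log(lam))   # the degree cap lambda'
direct = excess_moment_direct(m_trials, prob, cap)
via_tail = excess_moment_tail(m_trials, prob, cap)
```

Both routes give the same value, and for this choice of cap the expected excess degree per vertex is already small compared with λ.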

The second moment can also be bounded as

$$\mathbb{E}[\Delta_{i}^{2}] = \sum_{k > \lambda'} P(d_{i} = k) (k - \lambda')^{2} \leq \int_{\lambda'}^{\infty} P(d_{i} \geq x)\, 2 (x - \lambda')\, dx. \tag{24}$$

Using Chernoff’s bound for the binomial distribution, we have

$$P(d_i \ge x) \le 2\exp\left\{-\frac{(x - \lambda')^2}{3\lambda'}\right\}. \tag{25}$$

Applying Eq. (25) to the second moment bound, we have

$$\begin{aligned} \int_{\lambda'}^{\infty} 2(x - \lambda')\exp\left\{-\frac{(x - \lambda')^2}{3\lambda'}\right\}dx &= 3\lambda'\exp\left\{-\frac{(\lambda' - \lambda)^2}{3\lambda}\right\} \\ &= \mathcal{O}_{\lambda}\left(\lambda\exp\left\{-c(\log\lambda)^2\right\}\right) \end{aligned} \tag{26}$$

for some c > 0. Thus, both the first and second moments are bounded by $\mathcal{O}_{\lambda}(\lambda^{-c'\log\lambda})$ for some constant c′ > 0.
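To make the constants concrete, the last step of Eq. (26) can be unpacked as follows. This is only a sketch: it takes the exponent on the right-hand side of Eq. (26) as written, assumes λ ≥ 1 so that log λ ≥ 0, and the value c = 1/3 is one admissible choice under that reading rather than a constant stated in the text.

```latex
\lambda' - \lambda \;\ge\; \sqrt{\lambda}\,\log\lambda
\quad\Longrightarrow\quad
\frac{(\lambda' - \lambda)^2}{3\lambda} \;\ge\; \frac{(\log\lambda)^2}{3},
\qquad\text{so}\qquad
3\lambda'\exp\left\{-\frac{(\lambda' - \lambda)^2}{3\lambda}\right\}
\;\le\; 3\lambda'\,\lambda^{-\frac{1}{3}\log\lambda}
\;=\; \mathcal{O}_{\lambda}\!\left(\lambda^{\,1 - \frac{1}{3}\log\lambda}\right).
```

Here we used exp{–(log λ)²/3} = λ^{–(log λ)/3} and λ′ ≤ λ + √λ log λ + 1 = 𝒪(λ). For any c′ < 1/3, the extra factor of λ is absorbed once log λ ≥ 1/(1/3 – c′), which gives a bound of the form λ^{–c′ log λ}.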

For the total number of edges removed Δ, we note that by a union bound, $\Delta \le \tilde{\Delta}$, where $\tilde{\Delta} := \sum_{i} \Delta_i$. Furthermore, we have $E[\tilde{\Delta}] = nE[\Delta_i]$. Unless Δ_i and Δ_j share the same edge, they are independent, so

$$\mathrm{Var}(\tilde{\Delta}) \le nE[\Delta_i^2] + 2\sum_{i < j}\mathrm{Cov}(\Delta_i, \Delta_j). \tag{27}$$

Since the vertex degrees are not independent (they jointly follow a multinomial distribution), the covariance term is negative. Therefore, by Chebyshev’s inequality, as n → ∞, Δ is at most $n\,\mathcal{O}_{\lambda}(\lambda^{-c\log\lambda})$ for some c > 0 with high probability.

The process of removing edges does not create cycles (i.e. it does not destroy the tree-like property). However, we need to ensure that the graph was initially tree-like and remains tree-like after adding edges. Rather than working with $G = G_{ER}^{q}(n, m)$, we will work with $G = G_{ER}^{q}(n, p)$ for the next lemma and use the facts that containing a k-cycle is a monotone increasing graph property and that, for any monotone increasing graph property 𝒫, $P(G_{ER}^{q}(n, m) \in \mathcal{P}) \le C \cdot P(G_{ER}^{q}(n, p) \in \mathcal{P})$ for some constant C [28].

Lemma 2.

Fix any constant λ and l ≤ ϵ log n. With high probability as n → ∞, a 1 – o(1) fraction of the l-local neighbourhoods are tree-like.

Proof. Let $p = \frac{c + \log n + \lambda\log\log n}{\binom{n}{q-1}} \sim \mathcal{O}(\log n / n^{q-1})$ for some constant c > 1, and k ∈ ℕ. Let X be the number of k-cycles. Leaving k in the big-𝒪 notation to account for k as a function of n later, we have

$$E[X] = \binom{n}{k}p^{k} \sim \mathcal{O}\left((q!)^{k}\frac{n^{2k}\log^{k} n}{n^{kq}\,k!}\right). \tag{28}$$

By Markov’s inequality, we thus have

$$P(X > 0) \le \mathcal{O}\left(\frac{(q!)^{k}\log^{k} n}{n^{k(q-2)}\,k!}\right), \tag{29}$$

which vanishes in the limit n → ∞, implying that the number of k-cycles of constant length k is o(1). The same argument implies that cycles of size log n and $(\log n)^{\log n}$ also vanish.

Now we show that the l-local neighbourhood of an arbitrary vertex has at most o(1) k-cycles. In the limit n → ∞, the degree of each vertex follows a Poisson distribution with mean λ. Let Y denote the degree of an arbitrary vertex. The probability that any vertex v has degree at most log n is given by

$$P(Y \le \log n) = 1 - P(Y > \log n). \tag{30}$$

From the Chernoff bound for the Poisson distribution, we have that

$$P(Y > x) \le e^{-\lambda}\frac{(e\lambda)^{x}}{x^{x}} \sim \mathcal{O}(c^{x}/x^{x}) = o(1), \tag{31}$$

for some constant c, so all vertices have degree at most log n with high probability.

Thus, the l-local neighbourhood has at most $(\log n)^{\epsilon\log n}$ vertices. Repeating the same argument for the number of k-cycles shows that only an o(1) fraction of the l-local neighbourhoods will contain a cycle.
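The Poisson tail bound of Eq. (31) is easy to check numerically. The sketch below compares the exact tail P(Y > x) with the right-hand side of Eq. (31) at x = ⌈log n⌉. The function names and the choice λ = 3 are illustrative assumptions, and the bound is only meaningful for x > λ.

```python
import math


def poisson_tail(lam, x):
    """Exact P(Y > x) for Y ~ Poisson(lam) and integer x >= 0."""
    cdf = sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(x + 1))
    return max(0.0, 1.0 - cdf)


def chernoff_bound(lam, x):
    """Right-hand side of Eq. (31): e^{-lam} (e*lam)^x / x^x, for x > lam."""
    return math.exp(-lam) * (math.e * lam / x) ** x


if __name__ == "__main__":
    lam = 3.0
    for n in (10**3, 10**6, 10**9):
        x = math.ceil(math.log(n))
        print(n, x, poisson_tail(lam, x), chernoff_bound(lam, x))
```

The printed bound shrinks as n grows, in line with the o(1) claim. Note that this checks only the single-vertex tail at a few finite values of n and says nothing about how quickly a bound over all vertices becomes small.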

Lemma 3.

Fix any λ. There exists some ϵ > 0 such that for l ≤ ϵ log n, with high probability as n → ∞, adding edges preserves the tree structure in a 1 – o(1) fraction of the l-local neighbourhoods.

Proof. Right after removing edges, every vertex has degree at most λ′, so given some constant λ and l ≤ ϵ log n, the size of the l-local neighbourhood B_G(v, l) is upper-bounded by $\lambda'^{\epsilon\log n}$, and B_G(v, l) is a hypertree. Then, we have to add on average n(λ log λ) ~ Θ(n) edges, but since B_G(v, l) is of order $\mathcal{O}(\lambda'^{\epsilon\log n})$, the probability that an added hyperedge contains at least two vertices in B_G(v, l) is $\mathcal{O}(\lambda'^{\epsilon\log n}/n)$.

Choose ϵ < 1/log λ′. As a result, adding hyperedges creates a cycle in only an o(1) fraction of the l-local neighbourhoods.

Now, we show that a λ-regular, q-uniform hypergraph is locally also a hypertree.

Lemma 4.

Fix any λ > 1 and p ≤ ϵ log n for some ϵ > 0. With high probability as n → ∞, a 1 – o_λ(1) fraction of vertices in the p-local neighbourhood are tree-like.

Proof. As we are interested in the large n limit, we first show that for fixed p, the dominant term in the probability that a cycle is formed is given by $\frac{(q-1)^{p}\lambda^{p}}{n - 1 - \dots - \lambda^{p-1}(q-1)^{p-1}}$. Consider p = 1 and choose any hyperedge. Then, the first (q – 1) vertices form no cycle with probability 1. The next hyperedge added will form a cycle with probability $\frac{q-1}{n-1}$. This process repeats until we reach the last hyperedge for the root vertex (i.e. the λ-th hyperedge), where the probability of forming a cycle is given by $\frac{(\lambda-1)(q-1)}{n-1}$, so the dominant term is of the form $\frac{\lambda(q-1)}{n-1}$. In other words, the term that contributes the highest probability of forming a cycle at depth p arises when we are filling up the last hyperedge.

For p = 2 and higher, choosing the first hyperedge already has a non-trivial probability of forming a cycle, as we might add a vertex at the p – 1 level. Focusing on p = 2, this means that adding the first (q – 1) vertices has a probability of $\frac{\lambda(q-1) - 1}{n-1}$ to form a cycle. If we are in the middle of filling up the second layer (i.e. some of the p = 1 vertices already have degree λ), then adding the next hyperedge and vertex would form a cycle with a p = 1 vertex with probability $\frac{\lambda(q-1) - c}{n - 1 - c}$ for some constant c, while the probability that it forms a cycle with a p = 2 vertex is given by $\frac{c\,\lambda(q-1)}{n - 1 - \lambda(q-1)}$. For the very last hyperedge added in p = 2, the probability of forming a cycle is given by $\frac{(q-1)^{2}(\lambda^{2} - 1)}{n - 1 - \lambda(q-1)}$, which is the dominant term. This process can be iterated to show that the dominant term is of the form $\mathcal{O}\left(\frac{q^{p}\lambda^{p}}{n - \dots - \lambda^{p-1}(q-1)^{p-1}}\right)$ at depth p.

For any fixed λ and p ≤ ϵ log n for some constant ϵ > 0, the probability that the neighbourhood up to distance p from any vertex v_i remains tree-like is given by

$$\max\left(0,\; 1 - \frac{(q-1)\lambda}{n-1} - \dots - \frac{(q-1)^{p}\lambda^{p}}{n - 1 - \dots - \lambda^{p-1}(q-1)^{p-1}}\right). \tag{32}$$

Since

$$\mathcal{O}\left(\frac{c^{\epsilon\log n}}{n}\right) = \mathcal{O}\left(\frac{n^{\epsilon\log c}}{n}\right), \tag{33}$$

choose $\epsilon < \frac{1}{\log c} = \frac{1}{\log(\lambda q)}$ so that $\lim_{n\to\infty} n^{\epsilon\log c - 1}$ goes to 0. Then the probability that the neighbourhood up to distance ϵ log n from any vertex is tree-like converges to unity for n → ∞,

$$\lim_{n\to\infty}\left(1 - \dots - \frac{(q-1)^{p}\lambda^{p}}{n - 1 - \dots - \lambda^{p-1}(q-1)^{p-1}}\right) = 1. \tag{34}$$
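The expression in Eq. (32) can be evaluated directly for finite n. The following is a rough sketch rather than part of the proof: the function names are hypothetical, and the depth is set to p = ⌊ϵ log n⌋ with ϵ chosen as an arbitrary fraction (here one half) of the threshold 1/log(λq).

```python
import math


def treelike_lower_bound(n, q, lam, p):
    """Evaluate Eq. (32) term by term for depth p."""
    total = 0.0
    used = 0  # sum of (lam*(q-1))^i over the shallower depths i < j
    for j in range(1, p + 1):
        denom = n - 1 - used
        if denom <= 0:  # neighbourhood would exceed the graph: bound is vacuous
            return 0.0
        term = ((q - 1) * lam) ** j
        total += term / denom
        used += term
    return max(0.0, 1.0 - total)


def depth_for(n, q, lam, frac=0.5):
    """Depth p = floor(eps * log n) with eps = frac / log(lam * q), frac < 1."""
    return math.floor(frac * math.log(n) / math.log(lam * q))


if __name__ == "__main__":
    q, lam = 3, 4
    for n in (10**4, 10**8, 10**12):
        p = depth_for(n, q, lam)
        print(n, p, treelike_lower_bound(n, q, lam, p))
```

With q = 3 and λ = 4 the depths are 1, 3 and 5, and the printed values move towards 1 as n grows, which is the behaviour stated in Eq. (34).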

Now we can show that the OGP is also an obstruction in random regular hypergraphs via contradiction. Assume that an algorithm 𝒜 at logarithmic depth is able to find solutions arbitrarily close to the optimal solution for the Max-q-XORSAT on a regular hypergraph. Then this would imply that 𝒜 is also able to find such solutions when performed on an Erdös–Rényi hypergraph, since both graphs are p-locally the same. However, this contradicts Theorem 6 and thus, the OGP must also restrict the performance of logarithmic depth local algorithms when applied to a regular hypergraph.

It is of note that proving that the OGP exists in a problem is much easier when the underlying graph is an Erdös–Rényi hypergraph as compared to a regular hypergraph, since only the former can be described by a probability distribution. This is why there is no proof that the OGP exists for the Max-q-XORSAT on regular hypergraphs, as such a proof requires the Poisson distribution found in an Erdös–Rényi hypergraph. Given Theorem 9 and that it is possible to show that the OGP exists in both Erdös–Rényi hypergraphs and regular hypergraphs in some problems [29], it is reasonable to think that if the OGP exists in the former, it also exists in the latter. Motivated by this, we make the following conjectures.

Conjecture 11. If the overlap gap property exists in a COP with an underlying Erdös–Rényi hypergraph of sufficiently high connectivity, then it also exists when the underlying hypergraph is a regular hypergraph of sufficiently high degree.

Remark 3.

While Theorem 9 seems to support this conjecture, we have only proved this asymptotically and at logarithmic depth. It is not entirely clear if the same result applies when the depth is 𝒪(n) or for the finite case, since the proof of the OGP for Max-q-XORSAT on an Erdös–Rényi hypergraph holds for some constant n ∈ ℤ.

Conjecture 12 (Monotonicity of the OGP). For the Max-q-XORSAT problem, the overlap gap property is a monotonically increasing graph property.

We note that the proof of Theorem 9 is much simpler if the conjectures are true as can be seen in Appendix A.

4.2.

Numerical evidence

We refer the reader to numerous numerical studies showing that the performance of the QAOA is unable to surpass the OGP barrier. For instance, the authors of [6] numerically evaluated the performance of the QAOA on Max-3-XORSAT up to p = 14 and obtained 0.6623Π₃. For the 3-spin glass, the OGP limits the performance of the AMS algorithm to 0.987Π₃ [30]. A study of the QAOA on the Max-q-XORSAT problem similarly finds that for n = 18, the QAOA is unable to get close to the 0.987 approximation ratio even at a depth of p = 30 for q = 3 [31].

Instead, we provide some numerical evidence that instances of the OGP can occur in random regular hypergraphs of odd degree. Our numerical simulation proceeds in the following manner. First, we define the problem size n, uniformity q, and degree d, where we implicitly assume that nd is a multiple of q. Then, we randomly generate a d-regular q-uniform hypergraph so that the total number of hyperedges is |E| = nd/q. Next, we randomly generate the list J ∈ {–1, +1}^{|E|} for the coupling strengths of the hyperedges. Finally, we perform a branch and bound algorithm and record the bit-strings whose cut fraction exceeds a certain threshold.
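The instance-generation steps above can be sketched as follows. Several choices here are assumptions made for illustration and are not taken from the text: the hypergraph is sampled with a configuration-model style shuffle and rejection, a clause counts as satisfied when the coupling times the product of its spins equals +1, and plain exhaustive enumeration stands in for branch and bound, which limits this sketch to small n. All function names are hypothetical.

```python
import itertools
import random


def random_regular_hypergraph(n, d, q, rng, max_tries=100_000):
    """Shuffle d copies of each vertex and cut into q-tuples; retry on repeated vertices."""
    assert (n * d) % q == 0, "n*d must be a multiple of q"
    stubs = [v for v in range(n) for _ in range(d)]
    for _ in range(max_tries):
        rng.shuffle(stubs)
        edges = [tuple(stubs[i:i + q]) for i in range(0, n * d, q)]
        if all(len(set(e)) == q for e in edges):
            return edges
    raise RuntimeError("no valid hypergraph found; increase max_tries")


def cut_fraction(z, edges, J):
    """Fraction of hyperedges e with J_e * prod_{v in e} z_v == +1 (assumed convention)."""
    satisfied = 0
    for e, j in zip(edges, J):
        prod = j
        for v in e:
            prod *= z[v]
        satisfied += prod == 1
    return satisfied / len(edges)


def near_optimal(n, edges, J, ratio=0.95):
    """Exhaustive search (not branch and bound): strings within `ratio` of the best value."""
    scored = [(cut_fraction(z, edges, J), z)
              for z in itertools.product((-1, 1), repeat=n)]
    best = max(score for score, _ in scored)
    return best, [z for score, z in scored if score >= ratio * best]


if __name__ == "__main__":
    rng = random.Random(0)
    n, d, q = 10, 3, 3
    edges = random_regular_hypergraph(n, d, q, rng)
    J = [rng.choice((-1, 1)) for _ in edges]
    best, sols = near_optimal(n, edges, J)
    print(len(edges), best, len(sols))
```

The rejection step only forbids repeated vertices inside a hyperedge, so duplicate hyperedges are not excluded. A branch and bound search would replace `near_optimal` for larger n.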

Once we have the list of bit-strings and their corresponding cut fractions, we have to choose some ϵ > 0 such that the list of bit-strings that are ϵ-optimal solutions is small. By default, we keep only the bit-strings whose cut fraction is at least 95% of the optimal value. Finally, we compute the overlap between all such ϵ-optimal bit-strings and obtain the overlap spectrum.
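A minimal sketch of the overlap computation is given below. It assumes the bit-strings are stored as ±1 tuples and uses the signed, normalised overlap (1/n) Σ_i z_i z′_i. Whether an absolute value should be taken depends on the convention, which is not specified here. The function names, including the `largest_gap` helper, are hypothetical.

```python
from itertools import combinations


def overlap(z1, z2):
    """Signed normalised overlap (1/n) * sum_i z1[i] * z2[i] of two +/-1 strings."""
    return sum(a * b for a, b in zip(z1, z2)) / len(z1)


def overlap_spectrum(solutions):
    """Sorted list of pairwise overlaps between all distinct pairs of solutions."""
    return sorted(overlap(a, b) for a, b in combinations(solutions, 2))


def largest_gap(spectrum):
    """Widest interval between consecutive values of a sorted spectrum, or None."""
    if len(spectrum) < 2:
        return None
    return max(zip(spectrum, spectrum[1:]), key=lambda ab: ab[1] - ab[0])


if __name__ == "__main__":
    sols = [(1, 1, 1, 1), (1, 1, 1, -1), (-1, -1, -1, -1)]
    spec = overlap_spectrum(sols)
    print(spec, largest_gap(spec))
```

A wide interval with no overlap values, persisting as n grows, is the kind of signature one would look for in the overlap spectrum. A single small instance like the toy example above is of course not evidence either way.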

We find that on average, when d < q, the OGP is not present. It is only when d is greater than q that instances of problems exhibiting the OGP first appear. The numerical simulations were run on q = 3 and varying n up to 30. A histogram of the distribution of solutions can be found in Figure 1 while the typical evolution of the overlap spectrum can be found in Figure 2.

We also ran simulations on the SK model as it is believed, though not yet proven, that the SK model does not exhibit the OGP [3,30]. We find that indeed the SK model does not exhibit the OGP at n = 45 as can be seen in Figure 3.

5.

Discussion and Further Work

Being a heuristic algorithm, the limitations and potential of the QAOA have not yet been fully explored. While swapping the order of limits allows us to evaluate the expectation value with a classical computer faster, it also seems to lead to sub-optimal results. This of course is expected and one can instead use the algorithm developed in [6] as a heuristic starting ansatz for (γ, β) to be further optimized for a specific problem.

Currently, the OGP has only been shown to be a limitation on dense models at super-constant depth p ~ 𝒪(log log n) [9]. Given the “dense-from-sparse” reduction performed in [8] that showed equivalence between constant depth QAOA for dense and sparse graphs, perhaps one can extend the equivalence to logarithmic depth similar to how we extended the validity of the algorithm from constant to logarithmic depth.

We note that these results suggest that at logarithmic depth, the performance of the QAOA equals that of AMS’s algorithm for the mean field spin glass [30], a type of Approximate Message Passing (AMP) algorithm. This suggests that if the QAOA is optimized correctly beyond logarithmic depth, for example at polynomial depth, it should outperform such classical algorithms, since the QAOA is known to find exact solutions by reduction to the Quantum Adiabatic Algorithm. It is still an open question to determine at what depth p the QAOA outperforms the algorithm. Furthermore, given the similarity in performance to the AMP algorithm, this also suggests that the conjecture in [6] that the Parisi value for the Sherrington–Kirkpatrick model is obtained under limit swapping might be true, as the AMP algorithm achieves the optimal value under the assumption that the OGP does not exist.
