Random-Projector Quantum Diagnostics of Ramsey Numbers and A Prime-factor Heuristic for R(5, 5) = 45

Fabrizio Tamburini

doi:10.2478/qic-2025-0039

Introduction

Ramsey numbers are at the intersection of graph theory, combinatorics, neural networking, computation and probability. Ramsey theory asks for the smallest number R(m, n) such that every edge coloring of the complete graph K_{R(m, n)} with two colors contains a red graph K_m with m vertices or a blue K_n with n vertices [1]. In graph theory, a Ramsey number refers to the minimum number of vertices in a complete graph where, regardless of how the edges are colored with two colors, a monochromatic subgraph of a specified size (or sizes) is guaranteed to exist. Ramsey’s theorem guarantees the number exists. We know all exact values up to R(4, 5) = 25; the next diagonal case 43 ≤ R(5, 5) ≤ 46 has not been directly calculated yet [2–4]. Gap sizes grow explosively between R(5, 5) and R(6, 6) the interval already spans 60 numbers, and for R(7, 7) we only know 189 ≤ R(7, 7) ≤ 4749. It is the search of sequences and patterns in large structures like searching for constellations in the sky [5].

Ramsey numbers control worst-case resource overheads in error-correcting codes and result intractable with current quantum computing resources. Designing a code to correct t errors means ensuring no two codewords sit within Hamming distance, which measures the minimum number of substitutions required to change one string into the other, d_H ≤ 2t [6]. This avoidance problem is equivalent to a set-coloring Ramsey instance on an alphabet of size q. The threshold at which large codes cease to exist scales like R(t + 1;q, q — 1), so the redundancy one must pay in the worst case grows with a Ramsey number rather than a mere polynomial in t. This follows from the standard avoidance formulation of code design; see [7] for the probabilistic method and [8] for Ramsey-theoretic background.

In quantum computing they are fundamental in worst-case routing in quantum compilation, mapping an arbitrary k-qubit circuit onto limited hardware connectivity forces SWAP gates whenever interacting qubits aren’t adjacent. Ramsey’s theorem guarantees that in any coloring or pattern of two-qubit interactions among R(k, k) qubits, there must exist an unavoidable “hard” configuration such as a clique or independent set requiring at least R(k, k) SWAP operations or depth units to resolve. Thus, the diagonal Ramsey number sets a lower bound on the worst-case communication depth of any fully general distributed quantum compiler in worst-case interaction patterns under limited connectivity, a Ramsey-style obstruction yields a depth lower bound scaling with a diagonal Ramsey number [8, 9]. To push worst-case depths down, one either needs richer native connectivity or higher-dimensional qudits of order n to effectively reduce the Ramsey argument to R([k/n], [k/n]) adopting paraparticle methods with ℤ₂ × ℤ₂-graded Majorana algebras [10].

Computing Ramsey numbers is notoriously intensive: classical proofs use double counting or induction [11], whereas the gluing technique builds larger graphs from smaller ones to avoid or force specific subgraphs such as cliques or independent sets [12].

We propose a new statistical method to estimate Ramsey number based on ℤ₂ × ℤ₂-graded Lie Algebras of Majorana infinite-component spin fields introduced in [10]. Our strategy follows the implementing of a paraparticle algebra graded by the Klein group V₄ ≌ ℤ₂ × ℤ₂, which requires only a limited number of qubits for each step of analysis. By implementing higher-order algebras we develop recursive gluing and pruning methods on Ramsey graphs to give an estimate to or effectively calculate Ramsey numbers. The result is a purely algebraic that takes in account of the branching that factorizes the two colors and packages local symmetry constraints, allowing traces/character formulas to replace brute-force enumeration for many branches.

Quantum resources needed are formidable: verifying R(4, 4) = 18 requires exhaustively scanning all 2¹³⁶ two–colourings of K₁₇, i.e., one logical qubit per edge (136 qubits), already at the frontier of current hardware [13, 14]. Even with Grover’s quadratic speed-up the bottleneck is width, not depth: a 40-vertex instance occupies a 2⁷⁸⁰-dimensional space, demanding 780 qubits before amplification can begin.

For R(5, 5) we give an estimate of the qubit cost of a brute–force (Grover–style) search that needs an enormous number of quantum resources with a register model based on a two–colouring of K_n that can be encoded by a binary string x ∈ {0,1 }^E(n) with one bit per (undirected) edge, where $E (n) = (\begin{matrix} n \\ 2 \end{matrix}) = n (n - 1) / 2$ E(n) = \left( \matrix{ n \cr 2 \cr} \right) = n(n - 1)/2 as reported in Table 1. Adopting the convention x_ij = 1 iff edge {i, j} is red (blue is 0), a reversible predicate F_n(x) flags a violation (F_n(x) = 1) iff x contains a red K₅ or a blue K₅; otherwise F_n(x) = 0. In Grover’s search for a good colouring one uses the oracle 1 $O_{n} : | x 〉 | b 〉 \mapsto | x 〉 | b \oplus \neg F_{n} (x) 〉,$ {{\cal O}_n}:|x\rangle |b\rangle \mapsto |x\rangle \left| {b \oplus \neg {F_n}(x)} \right\rangle , or phase kickback with a |–〉 ancilla.

Table 1.

Estimate of the number of qubits required to calculate R(5, 5) with Grover-style algorithm.

n	$E (n) = (\begin{matrix} n \\ 2 \end{matrix})$ E(n) = \left( \matrix{ n \cr 2 \cr} \right)	Total qubits Q(n) (safe upper bound)
44	44 · 43/2 = 946	946 + 16 ≈ 962
45	45 · 44/2 = 990	990 + 16 ≈ 1006
46	46 · 45/2 = 1035	1035 + 16 ≈ 1051

Oracle structure and ancillas can be organized in this way, like a game: for each 5–subset S ⊆ [n] we must test whether all $10 = (\begin{matrix} 5 \\ 2 \end{matrix})$ 10 = \left( \matrix{ 5 \cr 2 \cr} \right) edge–bits inside S are equal to 1 (red K₅) or all equal to 0 (blue K₅). Each “all–red” test is a 10–input AND operation, which can be realized by a Toffoli tree using 9 clean ancillas (one per internal node); the “all–blue” test is obtained by negating the same 10 inputs, reusing the same 9 ancillas, and then uncomputing. Let a_AND = 9 denote these workspace ancillas. We keep one flag r_S for “red K₅ on S”, one flag b_S for “blue K₅ on S”, and an aggregate OR–bit v that accumulates violations via v ← v ∨ (r_s ∨ b_s) as we sweep all $(\begin{matrix} n \\ 5 \end{matrix})$ \left( \matrix{ n \cr 5 \cr} \right) subsets, uncomputing r_S, b_S and the AND tree at the end of each subset. Finally, we need one output/phase ancilla for Grover, a_ph = 1.

Thus the oracle’s simultaneous ancilla budget can be kept to have a number of ancillas, a_work = a_AND + 2 + 1 + a_ph = 9 + 2 + 1 + 1 = 13, i.e., a number of 13 clean ancillas beyond the edge register. If, instead, negative controls are disallowed and input–negations must be done explicitly for the “all–blue” check, one adds at most 10 transient bits that are immediately uncomputed; the peak simultaneous ancillas remain ≤ 13. One may also think to add a couple of safety ancillas for carry/OR trees; which means a budget of +3 that would be needed to cover such variants, but it ia a matter of optimization that goes beyond the purpose of this work.

The total qubits required starts from a conservative upper bound Q(n) on the width (qubit count) of a brute–force Grover oracle for the diagonal R(5, 5) tests at size n, 2 $Q (n) = E (n) + a_{work} ≲ (\begin{matrix} n \\ 2 \end{matrix}) + 16.$ Q(n) = E(n) + {a_{{\rm{work }}}} \mathbin{\lower.3ex\hbox{\buildrel<\over {\smash{\scriptstyle\sim}\vphantom{_x}}}} \left( \matrix{ n \cr 2 \cr} \right) + 16.

Concretely, as reported in the following table, the number of qubit required is very big.

The search space has size $2^{E (n)} = 2^{(\begin{matrix} n \\ 2 \end{matrix})}$ {2^{E(n)}} = {2^{\left( {\scriptstyle n \atop \scriptstyle 2} \right)}}. Grover’s amplitude amplification finds a good colouring, when one exists, n has $O (\sqrt{2^{E (n)} / G_{n}})$ O\left( {\sqrt {{2^{E(n)}}/{G_n}} } \right) oracle uses, where G_n is the number of good colourings at size n, which is unknown a priori. For the upper–bound instance (let us assume n = 45), proving nonexistence of a good colouring via search still takes exponential time in the worst case and requires additional outer logic, but the width remains dominated by the edge register as above.

In our setting the projector algebra acts on the charge–zero submodule of dimension d = 24, so the data register needs only $⌈ \log_{2} d ⌉ = ⌈ \log_{2} 24 ⌉ = 5$ \left\lceil {{{\log }_2}d} \right\rceil = \left\lceil {{{\log }_2}24} \right\rceil = 5 qubits (plus a few ancillas for block-encoding/qubitization). Instead, the truly brute–force (enumeration/Grover) approach to R(5, 5) at the threshold requires on the order of a thousand logical qubits just to hold a coloring $((\begin{matrix} 45 \\ 2 \end{matrix}) = 990)$ \left( {\left( \matrix{ 45 \cr 2 \cr} \right) = 990} \right), plus ≲ 13 + 3 = 16 clean ancillas for the reversible violation check and phase kickback. In fact, a naive edge-register encoding at n = 45 would require one qubit per edge, i.e., 990 data qubits.

This stands in sharp contrast to our estimate random–projector spectral diagnostics presented in the next section, whose quantum implementation needs only a few data qubits (plus a handful of ancillas), i.e., two or three orders of magnitude fewer qubits. The other two diagonal cases with n = 6 and n = 7 would require up to 13530 and 145530 maximum data qubits, respectively, instead of 6 and 7 data qubits (plus modest ancillas) discussed below. Our approach does not replace the exact solution from brute-force approach, instead helps to restrict the range of values of diagonal Ramsey numbers for deeper investigations like Erdős flip-coin method [15].

Roadmap

After the introductory Section 1, Section 2 formalizes the Z₂ × Z₂–graded Majorana/paraparticle framework: it defines the degree–(0, 0) pair and clique projectors $Π_{i j}^{R / B}$ \Pi _{ij}^{R/B} and the central forbidden–clique operator P_m,n on the charge–zero module M₀, and establishes the exact Klein recursion R_V₄(m, n) = R_V₄(m – 1, n) + R_V₄(m, n – 1). Section 3 introduces two randomized spectral diagnostics acting on the reduced (d = 24) charge-zero module: a linear rank-1 deflation map P_lin and an exponential map P_exp(α). These witnesses parallel Erdős’s first-moment method and are used to single out n = 45 on the diagonal. We then propose a compact prime-sequence heuristic for diagonal values. Next, we show how both diagnostics admit few-qubit quantum realizations via block-encoding/qubitization and a Hermitian dilation surrogate, and we extend the study to R(6, 6) and R(7, 7). Finally, we discuss applications to ML where the same witnesses provide fast, high-confidence negative certificates under graded constraints. Technical proofs and numerical details are deferred to the Appendix.

Ramsey Numbers and Klein-Graded Paraparticle Algebra

To compute R(m, n), one can use a recursive gluing–pruning scheme. Starting with a base layer obtained by enumerating all good edge-colorings G_v₀ of K_v₀ (i.e., colorings containing neither a red K_m nor a blue K_n). Then, for each good G_v , glue on a new vertex v + 1 and color its v incident edges in every way that preserves the good property, pruning any extension that creates a red K_m or a blue K_n.

Pruning complements the gluing step by immediately discarding any partial coloring that already contains a forbidden red K_m or blue K_n. After each glue operation, we quotient by vertex–label symmetries via canonical labeling and graph–isomorphism checks to eliminate equivalent colorings, ensuring that only genuinely new configurations are explored. We then backtrack as soon as a red K_m or a blue K_n is detected. This bottom-up strategy, coupled with selective rollback and symmetry reduction, dramatically shrinks the search tree and has enabled exact computations by keeping the frontier of viable colorings tractably small [12, 16]. In our case we exploit the properties of ℤ₂ × ℤ₂–graded Majorana algebras for paraparticle states [10] and random projectors to extend Erdős’ flip-coin method.

Let $γ_{j}^{(0)}, γ_{j}^{(1)}$ \gamma _j^{(0)},\gamma _j^{(1)} denote Majorana modes with 3 ${γ_{i}^{(α)}, γ_{j}^{(β)}} = 2 δ_{i j} δ_{α β},$ \left\{ {\gamma _i^{(\alpha )},\gamma _j^{(\beta )}} \right\} = 2{\delta _{ij}}{\delta _{\alpha \beta }}, graded by the numbers α, β ∈ {0,1}. Paraparticles of order p ≥ 2 obey trilinear commutation relations generalizing fermions and bosons [17]. When expressed in Majorana operators for quantum computers based on Majorana physics [10, 18], the algebra organizes itself into a tower of states ${γ_{j}^{(ℓ)}}_{ℓ \geq 0}$ {\left\{ {\gamma _j^{(\ell )}} \right\}_{\ell \ge 0}} where level ℓ carries total parity ℓ mod 2 and an additional color charge. The simplest base of levels ℓ = 0 and 1 suffice to mirror any edge-colored clique, while higher levels host recursively glued subgraphs.

To mirror the colorings, we extend the Majorana modes in Eq. (3) to a paraparticle doublet $a_{j} = \frac{1}{2} (γ_{j}^{(0)} + γ_{j}^{(1)})$ {a_j} = {1 \over 2}\left( {\gamma _j^{(0)} + \gamma _j^{(1)}} \right), $b_{j} = \frac{1}{2} (γ_{j}^{(0)} - γ_{j}^{(1)})$ {b_j} = {1 \over 2}\left( {\gamma _j^{(0)} - \gamma _j^{(1)}} \right), satisfying ${a_{i}, a_{j}^{†}} = 2 δ_{i j}$ \left\{ {{a_i},a_j^\dag } \right\} = 2{\delta _{ij}}, and ${b_{i}, b_{j}^{†}} = 2 δ_{i j}$ \left\{ {{b_i},b_j^\dag } \right\} = 2{\delta _{ij}}, all other anticommutators vanish, assigning (a, b) the Klein charges (1, 0) and (0, 1). The full algebra $A_{V_{4}} = \oplus_{g \in V_{4}} A_{g}$ {{\cal A}_{{V_4}}} = { \oplus _{g \in {V_4}}}{{\cal A}_g} decomposes into four color sectors. A binary edge coloring of a graph with vertex set V lifts to 4 $\hat{E} = \sum_{1 \leq i < j \leq | V |} (c_{i j}^{R} a_{i}^{†} a_{j} + c_{i j}^{B} b_{i}^{†} b_{j}) + h .c .$ \hat E = \sum\limits_{1 \le i < j \le |V|} {\left( {c_{ij}^Ra_i^\dag {a_j} + c_{ij}^Bb_i^\dag {b_j}} \right)} + {\rm{ h}}{\rm{.c}}{\rm{.}} which commutes with the total Klein charge, enabling simultaneous diagonalization with parity. We take $c_{i j}^{R} = \bar{c_{j i}^{R}}$ c_{ij}^R = \overline {c_{ji}^R} and $c_{i j}^{B} = \bar{c_{j i}^{B}}$ c_{ij}^B = \overline {c_{ji}^B} so that (4) is Hermitian; the sum over i < j avoids double counting.

We can now set the assignment of edge-labels “coloring” ≡ graded sector. The Klein four group V₄ has elements {(0, 0), (1, 0), (0, 1), (1, 1)}, we decide to identify (1, 0) red, (0, 1) blue, with (0, 0) the vacuum/identity and (1, 1) a mixed sector projected out at the end which lives in the bi-graded component of the algebra obtained by multiplying a red operator by a blue one (or vice versa).

Definition 1 (Graded Ramsey numbers R_V₄(m, n) and mixed-sector projection).

Fix the ℤ₂ × ℤ₂–graded Majorana (Klein–graded) algebra $A_{V_{4}} = \oplus_{g \in V_{4}} A_{g}$ {{\cal A}_{{V_4}}} = { \oplus _{g \in {V_4}}}{{\cal A}_g} and impose the mixed–sector projection (all degree (1, 1) terms are set to zero). For a vertex set [v] = {1, … , v}, let $Π_{i j}^{R}, Π_{i j}^{B}$ \Pi _{ij}^R,\Pi _{ij}^B be the degree–(0, 0) monochromatic pair projectors associated with the edge {i, j} (red and blue, respectively), and define the monochromatic clique projectors 5 $Π_{R} (S) = \prod_{i < j \in S} Π_{i j}^{R}, Π_{B} (T) = \prod_{i < j \in T} Π_{i j}^{B} .$ {\Pi _R}(S) = \prod\limits_{i < j \in S} {\Pi _{ij}^R} ,\quad {\Pi _B}(T) = \prod\limits_{i < j \in T} {\Pi _{ij}^B} .

The central “forbidden–clique” operator on the charge–zero module M₀ is 6 $P_{m, n} = \prod_{\begin{matrix} S \subseteq [v] \\ | S | = m \end{matrix}} (1 - Π_{R} (S)) \times \prod_{\begin{matrix} T \subseteq [v] \\ | T | = n \end{matrix}} (1 - Π_{B} (T)),$ {P_{m,n}} = \prod\limits_{\scriptstyle {\rm{S}} \subseteq [v] \atop \scriptstyle |S| = m} {\left( {1 - {\Pi _{\rm{R}}}({\rm{S}})} \right)} \times \prod\limits_{\scriptstyle T \subseteq [v] \atop \scriptstyle |T| = n} {\left( {1 - {\Pi _{\rm{B}}}({\rm{T}})} \right)} ,

The graded Ramsey number R_V₄(m, n) is the least v ≥ 1 such that P_m,n (v) annihilates the entire charge–zero module (equivalently, every graded two–coloring of K_v—with mixed (1, 1) components invisible—contains a red K_m or a blue K_n). It obeys the exact Klein Erdős recursion 7 $\begin{array}{l} R_{V_{4}} (m, n) = R_{V_{4}} (m - 1, n) + R_{V_{4}} (m, n - 1), \\ R_{V_{4}} (1, n) = R_{V_{4}} (m, 1) = 1, \end{array}$ \matrix{ {{R_{{V_4}}}(m,n) = {R_{{V_4}}}(m - 1,n) + {R_{{V_4}}}(m,n - 1),} \hfill \cr {{R_{{V_4}}}(1,n) = {R_{{V_4}}}(m,1) = 1,} \hfill \cr } and upper–bounds the classical Ramsey numbers: R(m, n) ≤ R_V₄ (m, n).

Equation (7) exactly matches the constructive lower bound: there is at least one coloring on R_V₄(m – 1, n) + R_V₄(m, n – 1) – 1 vertices that avoids both a red K_m and a blue K_n and R(m, n) ≤ R_V₄(m, n), so no classical bounds are violated.

In the ℤ₂ × ℤ₂ graded algebra, multiplying a red sector element by a blue sector element produces a state in the mixed (1, 1) sector, which lies outside the physical subspace and does not correspond to any valid qudit state because, by construction, the physical Hilbert space is defined to include only the pure color sectors. In practice, such components are either projected out (viz., set to zero) or interpreted as an internal syndrome that flags excursions away from the pure color subspaces during computation. By monitoring the amplitude in the mixed sector after each gate, one can detect and correct errors. When is observed any nonzero mixed-sector population, a red–blue mismatch has occurred and one can apply a compensating operation to return to the valid (1, 0) or (0, 1) grading.

For each unordered pair {i, j} we then introduce a homogeneous edge generator e_ij of that ℤ₂ × ℤ₂-degree. Paraparticle commutation relations [17] enforce parity bookkeeping without edge-ordering data. The coproduct is a glue operation defining 8 $Δ (e_{i j}) = e_{i j} \otimes 1 + 1 \otimes e_{i j},$ \Delta \left( {{e_{ij}}} \right) = {e_{ij}} \otimes 1 + 1 \otimes {e_{ij}}, so that adding vertex v + 1 applies Δ to each existing e_ij and adjoins the set{e_i,v+1}_i≤v, preserving grading. We use the standard cocommutative coproduct compatible with the V₄ grading. Counit and antipode are not needed in what follows; only the compatibility of Δ with the grading is used in the glue step.

To detect forbidden cliques, inside the algebra we use the projection P_m,n of Eq. (6) where, $Π_{R} (S) = Π_{i < j \in S} e_{i j}^{(1, 0)}$ {\Pi _{\rm{R}}}({\rm{S}}) = {\Pi _{i < j \in {\rm{S}}}}e_{ij}^{(1, 0)} projects onto the red and $Π_{B} (T) = \prod_{i < j \in T} e_{i j}^{(0, 1)}$ {\Pi _{\rm{B}}}({\rm{T}}) = \prod\nolimits_{i < j \in {\rm{T}}} {e_{ij}^{(0, 1)}} on blue (see Eq. (5)). A coloring survives iff P_m,n annihilates it. Because P_m,n is central, one can evaluate Tr(P_m,n) on characters of irreducible modules rather than on individual colorings, turning part of the combinatorial explosion into a trace computation. Two lemmas and a theorem (see Appendix for demonstrations) fix the next steps.

Lemma 1 (Centrality of P_m,n. Lemma L .1).

The projector P_m,n = Π_R(S) Π_B(T) is central in A_V₄.

Lemma 2 (Tensor decomposition Lemma L .2).

Let p ∈ V be any vertex of a two–coloring and define V_R = {v ∈ V\{p} | {p,v} red} and V_B = {v ∈ V\{p} | {p, v} blue}. In the ℤ₂ × ℤ₂-graded Majorana algebra A_V₄ one has the canonical graded tensor product $A_{V_{4}} [V \ {p}] ≃ A_{V_{4}} [V_{R}] \hat{\otimes} A_{V_{4}} [V_{B}]$ {A_{{V_4}}}[V\backslash \{ p\} ] \simeq {A_{{V_4}}}\left[ {{V_R}} \right]\hat \otimes {A_{{V_4}}}\left[ {{V_B}} \right].

The factorization A_V₄(V\{p}) ≃ A_V₄(V_R) ⊗_b A_V₄(V_B) makes cross-edges invisible to Π_R and Π_B, which is the key ingredient in the graded Klein recursion. Majorana’s infinite-spin equation gives a ladder of fields with equally spaced mass–spin ratios. Algebraically, each step is an induction functor Rep(G_s) → Rep(G_s+1). Here Rep(G_s) denotes the (rigid, monoidal) category of complex representations of the symmetry group G_s at rung s of the Majorana tower; its objects are G_s–modules (V, ρ) and its morphisms are intertwiners T : V → W with T_ρ(g) = ρ′(g)T for all g ∈ G_s. The step s → s+1 is modeled by induction along i_s : G_s ↪ G_s+1, sending V ∈ Rep(G_s) to ${Ind}_{G_{s}}^{G_{s + 1}} V ≃ ℂ [G_{s + 1}] \otimes_{ℂ [G_{s}]} V$ V \in {\mathop{\rm Rep}\nolimits} \left( {{G_s}} \right)\ to\ {\mathop{\rm Ind}\nolimits} _{{G_s}}^{{G_{s + 1}}}V \simeq \left[ {{G_{s + 1}}} \right]{ \otimes _{\left[ {{G_s}} \right]}} (equivalently U(𝔤_s+1)⊗_{U(𝔤_s)} V), which algebraically realizes the Majorana infinite–spin ladder’s equally spaced mass–spin progression.

Each graph vertex i is realized as a pair of Majorana modes γ_2i–1, γ_2i, and set Γ_ij = i γ_2i–1 γ_2j–1, which is odd under one ℤ₂ (fermion parity) and even under the other (boson parity), matching the Klein grading. Each glue adds a new Majorana pair, and the tower’s induction gives a canonical lift of representations as v → v + 1. Recursive gluing inside the algebra 𝒜_V₄ is described the relationship in Eq. (7) from the definition of the algebraic graded Ramsey numbers, R_V₄(m, n).

From the Majorana-tower construction one labels vertices by tower indices 1 ≤ ℓ ≤ R_V₄(m, n). At level ℓ we create a mode pair (a_ℓ, b_ℓ). Equation (7) is realized algebraically by mapping a_ℓ ↦ a_ℓℓ≤ R_V₄(m–1, n) or $a_{ℓ} \mapsto a_{ℓ}^{†}$ {\mathop{\rm Ind}\nolimits} _{{G_s}}^{{G_{s + 1}}}V \simeq \left[ {{G_{s + 1}}} \right]{ \otimes _{\left[ {{G_s}} \right]}}V otherwise, and analogously for b_ℓ. This “dagger flip’’ constitutes the gluing that concatenates two smaller cliques without leaving the algebra and $Γ_{j}^{(\pm)} \equiv (γ_{j}^{(0)} \pm γ_{j}^{(1)}) / 2$ {a_\ell } \mapsto a_\ell ^\dag , so that $(Γ_{j}^{(+)}, Γ_{j}^{(-)})$ \Gamma _j^{( \pm )} \equiv \left( {\gamma _j^{(0)} \pm \gamma _j^{(1)}} \right)/2 carry Klein charges (1, 0) and (0, 1), respectively. The edge operator of a k-vertex red clique is the normal-ordered monomial 9 ${\hat{K}}_{k}^{R} : = \prod_{1 \leq i < j \leq k} {(Γ_{i}^{(+)})}^{†} Γ_{j}^{(+)},$ \left( {\Gamma _j^{( + )},\Gamma _j^{( - )}} \right) and analogously ${\hat{K}}_{k}^{B}$ \hat K_k^{\rm{B}} with the replacement Γ⁽⁺⁾ ↦ Γ^(—). Let us assume the following convention. In (9) the product runs over unordered pairs {i, j}, and “normal-ordered” means the creation/annihilation factors are symmetrized so that ${\hat{K}}_{k}^{R}$ \hat K_k^R is independent of the ordering up to graded signs; explicit ordering details are suppressed for brevity.

As each Γ^(±) anticommutes with any operator of opposite Klein charge, every factor in (9) lies in the (1, 0) sector; consequently ${\hat{K}}_{k}^{R}$ \hat K_k^R itself is homogeneous and commutes with the total charge operator $Q = \sum_{j} (Γ_{j}^{(+) †} Γ_{j}^{(+)} - Γ_{j}^{(-) †} Γ_{j}^{(-)})$ Q = \sum\limits_j {\left( {\Gamma _j^{( + )\dag }\Gamma _j^{( + )} - \Gamma _j^{( - )\dag }\Gamma _j^{( - )}} \right)} , ensuring that red and blue constructions never interfere. Examples of the application of this procedure with known Ramsey numbers are reported in Appendix A.

Estimating R(5, 5)

Combinatorial determination of the diagonal Ramsey number R(5, 5) remains an open challenge, with the best constructive bounds 43 ≤ R(5, 5) ≤ 46. Instead of applying brute force calculation we give an estimate with a different method based on ℤ₂ × ℤ₂–graded algebras with random projector diagnostics and Majorana algebra reduction. This approach gives the probability that a certain value expected for R(5, 5) be favored with respect to other possible values. We select the smallest charge-zero submodule that supports all pair projectors and glue operations for n ∈ {43, 44, 45, 46}. We include n = 43 as a below-threshold control: 43 is the classical constructive lower bound (i.e., 43 ≤ R(5, 5) ≤ 46), hence R(5, 5) ≠ 43; we use n = 43 only as a non-critical test instance to validate the diagnostics.

For R(5, 5), the dimension of the algebraic module is d = 24, the dimension of the reduced Majorana module, set by the structure of the Majorana tower and the graded sectors. Empirically, increasing d did not alter the decisions of the diagnostics on these instances.

Embedding edge–colorings inside the ℤ₂ × ℤ₂-graded Majorana paraparticle algebra turns forbidden monochromatic cliques into central projectors, thereby enabling spectral criteria to signal when no admissible coloring survives on v vertices. We introduce two families of random projectors, exponential and linear, and show that they act as numerical order parameters whose singular behavior isolates the putative threshold at n = 45 for d = 24. For a fixed vertex count n we sample k random unit vectors ${v_{j}}_{j = 1}^{k} \subset ℝ^{d}$ \left\{ {{v_j}} \right\}_{j = 1}^k \subset {^d} and define the exponential operator 10 $P_{\exp} (α) = \exp [- α \sum_{j = 1}^{k} v_{j} v_{j}^{⊤}],$ {P_{\exp }}(\alpha ) = \exp \left[ { - \alpha \sum\limits_{j = 1}^k {{v_j}} v_j^ \top } \right] with v_j random unit vectors in ℝ^d and α a suppression parameter, and the linear operator 11 $P_{lin} = \prod_{j = 1}^{k} (I - v_{j} v_{j}^{⊤}),$ {P_{{\mathop{\rm lin}\nolimits} }} = \prod\limits_{j = 1}^k {\left( {I - {v_j}v_j^ \top } \right)} which is a product of rank-1 deflations, a linear deflation operator.

The exponential map instead, defines 12 $T (α) = Tr (e^{- α A}),$ T(\alpha ) = {\mathop{\rm Tr}\nolimits} \left( {{e^{ - \alpha A}}} \right) which is the exponential trace of the accumulator 13 $A = \sum_{j = 1}^{k} v_{j} v_{j}^{⊤} .$ A = \sum\limits_{j = 1}^k {{v_j}} v_j^ \top

Because these factors generally do not commute, P_lin needs not be a projector a fortiori and can also have complex eigenvalues even when each factor is symmetric. Let $A_{V_{4}} = \oplus_{g \in V_{4}} A_{g}$ {A_{{V_4}}} = { \oplus _{g \in {V_4}}}{A_g} be the ℤ₂ × ℤ₂–graded algebra and let M be a faithful A_V₄–module obtained from the standard Majorana tower. We fix the reduced charge-zero submodule M₀ ⊂ M on which all degree-(0, 0) operators act, and all (1, 1) components vanish by Definition 1.

In our implementation we take dim M₀ = d = 24. This because in the V₄ = ℤ₂ × ℤ₂–graded Majorana model we impose the mixed–sector projection (all (1, 1) monomials are set to zero), so every forbidden–clique test is built from degree (0, 0) operators, the monochromatic pair projectors $Π_{i j}^{R / B}$ \Pi _{ij}^{R/B} and their clique products Π_R/B(S), and therefore acts on the charge–zero module M₀. For the diagonal case (5, 5) we choose M₀ to be the smallest invariant block that simultaneously carries (i) all pair projectors for a K₅ and (ii) the coproduct/glue operation Δ that lifts v → v+1. Concretely, this block sits in the quadratic slice of the tower and decomposes as 14 $M_{0} ≅ 1 \oplus Λ^{2} V_{R} \oplus Λ^{2} V_{B} \oplus d, V_{R} ≅ V_{B} ≅ ℂ^{5},$ {M_0} \cong {\bf{1}} \oplus {\Lambda ^2}{V_R} \oplus {\Lambda ^2}{V_B} \oplus ,\quad {V_R} \cong {V_B} \cong {^5}, where Λ² V• collects the off–diagonal, degree–(0, 0) bilinears for each color (the 10 red and 10 blue edge–slots of a K₅), while 𝔡 is the diagonal degree–(0, 0) subspace spanned by global color–number quadratics and a traceless combination (one linear relation fixes the overall V₄ charge). Hence 15 $\dim M_{0} = 1 + \underset{10}{\underset{︸}{(\begin{matrix} 5 \\ 2 \end{matrix})}} + \underset{10}{\underset{︸}{(\begin{matrix} 5 \\ 2 \end{matrix})}} + \underset{diagonals}{\underset{︸}{3}} = 24.$ \dim {M_0} = 1 + \underbrace {\left( \matrix{ 5 \cr 2 \cr} \right)}_{10} + \underbrace {\left( \matrix{ 5 \cr 2 \cr} \right)}_{10} + \underbrace 3_{{\rm{diagonals }}} = 24.

Working in this reduced module keeps the data width at $⌈ \log_{2} d ⌉ = 5$ \left\lceil {{{\log }_2}d} \right\rceil = 5 qubits while retaining all operators used by the randomized witnesses $P_{\exp} (α) = \exp (- α \sum_{j} v_{j} v_{j}^{⊤})$ {P_{\exp }}(\alpha ) = \exp \left( { - \alpha \sum\limits_j {{v_j}} v_j^ \top } \right) and $P_{lin} = \prod_{j} (I - v_{j} v_{j}^{⊤})$ {P_{{\mathop{\rm lin}\nolimits} }} = {\prod _j}\left( { - {v_j}v_j^ \top } \right). Empirically, enlarging M₀ beyond d = 24 did not change the diagnostics (collapse of T(α) = Tr P_exp(α) and the peak of Tr P_lin at n = 45), but only increases the qubit footprint; the sensitivity of the test scales through the factor k/d in the miss–probability bound P_miss ≤ e^–kr/d.

The central operator for detecting forbidden monochromatic cliques is defined in Eq. (6). The coloring constraints and algebraic recursion are encoded as central elements, with traces acting as character formulas to efficiently probe the existence of colorings avoiding forbidden cliques.

For each unordered pair {i, j} we then define the monochromatic pair projectors $Π_{i j}^{R}$ \Pi _{ij}^R and $Π_{i j}^{B}$ \Pi _{ij}^B as in Eq. (9), and their finite products, with degree (0, 0) that preserves M₀. The clique projectors are $Π_{R} (S) = Π_{{i, j} \subset S} Π_{i j}^{R}$ {\Pi _R}(S) = {\Pi _{\{ i,j\} \subset S}}\Pi _{ij}^R, $Π_{B} (T) = Π_{{i, j} \subset T} Π_{i j}^{B}$ {\Pi _B}(S) = {\Pi _{\{ i,j\} \subset S}}\Pi _{ij}^B acting on M₀ and from Definition 1 all (1, 1) terms vanish. All diagnostics, linear and exponential projectors, traces and eigenspectra depend only on degree-(0, 0) operators, well defined on M₀ and independent of any extension outside M₀. Both projectors are generated from finite sums and products of these degree-(0, 0) operators, therefore act on M₀ without reference to other charge sectors and act on the same irreducible module into which the Klein-graded edge generators e_ij of degree (1, 0) (red) or (0, 1) (blue) are represented.

The exponential projector P_exp(α) has suppression parameter of order α in Eq. (10) in ℝ^d, with d = 24, would be positive semidefinite for A Hermitian. In our graded construction the matrix A collecting rank-one directions arises from blocks that are not constrained to be selfadjoint (e.g., vv^⊤ rather than vv*), so also in the case A ≠ A^† the method remains valid and P_exp can exhibit complex spectra. We distinguish the accumulator A of Eq. (13) from the linear deflation operator P_lin of Eq. (11). They induce different witnesses: Tr P_lin (deflation/product) and T(α) := Tr P of Eq. (12). A collapse of Tr P_exp to (numerical) zero is informative even though no positivity is assumed. The linear projector P_lin probes residual rank via its trace and spectral radius.

Our procedure recalls Erdős’s probabilistic method: Erdős’s classic coin–flip proof chooses a uniformly random two–coloring of the edges of K_n; for any fixed k the expected number of monochromatic K_k is $E X = (\begin{matrix} n \\ k \end{matrix}) 2^{1 - (\begin{matrix} k \\ 2 \end{matrix})}$ {\rm{X}} = \left( \matrix{ n \cr k \cr} \right){2^{1 - \left( {\scriptstyle k \atop \scriptstyle 2} \right)}}, so if 𝔼X < 1 there exists a coloring with no monochromatic K_k, implying R(k, k) > n [7, 11, 15]. Our diagnostics are an algebraic–spectral analogue of this first–moment argument. The linear deflation P_lin and the exponential map T(α) act on the charge–zero module M₀, with i.i.d. isotropic directions v_j. Under the same independence/isotropy assumptions, the probability that k random rank-1 tests miss an r–dimensional survivor subspace obeys P_miss ≤ e^–kr/d; concomitantly, T(α) collapses as α grows once survivors vanish. This shared exponential decay (binomial in Erdős’s count; multiplicative contraction here in this deflation) explains why both viewpoints isolate the threshold n at which Ramsey obstructions are unavoidable. Erdős counts bad structures, while in the following we dissipate amplitude along random directions until any putative survivor subspace essentially vanishes. The common core is a first-moment/exponential-tail phenomenon: independence and isotropy produce multiplicative decay (coin flips kill monochromatic cliques in expectation; rank-1 deflations kill survivor dimensions in norm), which is why 16 $P_{miss} \leq e^{- k r / d}$ {P_{{\rm{miss}}}} \le {e^{ - kr/d}} mirrors the $2^{(\begin{matrix} k \\ 2 \end{matrix})}$ {2^{\left( {\scriptstyle k \atop \scriptstyle 2} \right)}} granularity in Erdős’s estimate of missing an r–dimensional survivor subspace.

Erdős’s coin–flip lower bound takes X = Σ_K 1[mono-K_k] and uses 𝔼X ≤ 1 to assert the existence of a good coloring; here our witnesses are P_lin and T(α), with i.i.d. isotropic directions v_j acting on the charge–zero module M₀. Under the same independence/isotropy hypothesis used in the coin–flip model, a simple thinning argument gives the “miss” probability bound $ℙ (k) \leq {(1 - \frac{r}{d})}^{k} \leq e^{- k r / d}$ (k) \le {\left( {1 - {r \over d}} \right)^k} \le {e^{ - kr/d}}, i.e., that k rank-1 tests miss an r-dimensional survivor. Replace each Haar v_j by a discrete proxy ${\tilde{v}}_{j}$ {{\tilde v}_j} that equals a random basis vector e_i with ℙ(i ≤ r) = r/d. If a test “hits” the survivor subspace whenever i ≤ r, then the probability ℙ to miss all k becomes the lower limit, P_miss = (1 – r/d)^k. Since the Haar model stochastically dominates this proxy in its overlap with any fixed r–plane, the discrete miss probability upper–bounds the continuous one, yielding (1 – r/d)^k ≤ e^–kr/d.

Equation (12) shows that $T (α) = \int_{σ (A)} e^{- α λ} d μ_{A} (λ)$ T(\alpha ) = \int_{\sigma (A)} {{e^{ - \alpha \lambda }}} d{\mu _A}(\lambda ) can be seen as the Laplace transform of the spectral measure of A. In the same way that Chernoff/Markov bounds control ℙ{X > 0} via moment generating functions in Erdős’s method, the decay of T(α) controls the survival of small singular values of A. Under the i.i.d. isotropy model for the rank-one directions {v_j}, a mean-field surrogate gives 17 $E T (α) \approx d e^{- α λ_{L}}, λ_{L} \approx - \frac{d}{d α} \log T (α),$ T(\alpha ) \approx d{e^{ - \alpha {\lambda _L}}},\quad {\lambda _L} \approx - {{\rm{d}} \over {{\rm{d}}\alpha }}\log T(\alpha ), with λ_L a Lyapunov-type rate extracted from the slope of log T(α). So increasing α exponentially suppresses contributions from larger eigenvalues and accentuates spectral weight near the origin. When the clique constraints have percolated (no survivor subspace remains), T(α) collapses rapidly with α; empirically this occurs at n = 45 in our R(5, 5) study.

Conceptually, (17) is the spectral analogue of the first-moment threshold 𝔼[X] < 1 in Erdős’s coin-flip lower-bound argument for diagonal Ramsey numbers [7, 15] as schematized in the following paragraph 3.

What matches what (coin–flip ↔ projectors).

Coin-flip model	↔	Projector model
Indicator 1[mono-K_k]	↔	residual rank direction in M₀
First moment 𝔼X	↔	𝔼 T(α), 𝔼 Tr P_lin
Independence of edge colors	↔	i.i.d. isotropic v_j
Counting $(\begin{matrix} n \\ k \end{matrix})$ \left( \matrix{ n \cr k \cr} \right) patterns	↔	k rank–1 probes k in d dimensions
𝔼X ≤ 1 threshold	↔	T(α) ↓ 0 and P_miss ≪ 1

Like the classical first–moment bound, these diagnostics are one–sided witnesses: they excel at detecting the onset of unavoidable structure but do not, by themselves, achieve the sharper lower bounds obtainable via the Lovász Local Lemma or second–moment/Janson techniques. We therefore quote P_miss alongside the observed collapse of T(α) and the behaviour of Tr P_lin to calibrate the strength of evidence. The exponential trace collapse and the (1 – r/d)^k contraction are the projector-world avatars of Erdős’s first-moment argument.

Relation to classical bounds

In the classical approach there are two ways to reason about Ramsey numbers. The first one is the existence via probability, the first-moment (expectation) method counts the expected number of forbidden monochromatic cliques under a uniformly random two-coloring; if this expectation is < 1, then there exists a coloring with none of them (a lower bound). The second-moment method goes further by controlling fluctuations (variance) to show that such colorings occur with non-negligible probability. The Lovász Local Lemma (LLL) handles families of rare, mildly dependent “bad” events (e.g., many overlapping clique constraints) and guarantees existence when dependencies are local. Janson-type inequalities quantify how unlikely it is that none of a large, overlapping family of small subgraphs appear, often sharpening first/second-moment conclusions for sums of dependent indicators [7, 19–21]. The second way is based on constructive and recursive bounds. Explicit colorings (circulant/Paley/Mathon-type) [22–25] furnish strong lower bounds, while the classical Erdős recursion R(m, n) ≤ R(m–1, n) + R(m, n–1) supplies widely used upper bounds [8].

In our case, the random-projector maps we use are deliberately one-sided, a “first–moment–style” diagnostics: they test for the onset of unavoidable structure rather than constructing colorings. Algebraically, forbidden cliques become central operators acting on the reduced (charge-zero) module M₀; our linear deflation P_lin and exponential map P_exp(α) probe whether any survivor subspace remains. Under i.i.d. isotropic rank-1 tests in d dimensions, the chance to miss an r-dimensional survivor subspace obeys the familiar exponential tail P_miss ≤ e^–kr/d, so increasing k/d drives a rapid “no-survivor” decision, an analogue of the first-moment decay in the coin-flip view. In short: classical first/second moment, LLL, and Janson reason about counts and dependencies of bad subgraphs; our spectral surrogates reason about the rank that those constraints leave behind.

Position on R(5, 5) and resources. Our diagnostics are consistent with the classical window 43 ≤ R(5, 5) ≤ 46 and single out n = 45 as the threshold within the graded-module model. This complements (not replaces) constructive lower bounds (circulant/Paley/Mathon) and the recursion-based upper bounds above. Operationally, working in the reduced module of dimension d = 24 keeps the data width to $⌈ \log_{2} d ⌉ = 5$ \left\lceil {{{\log }_2}d} \right\rceil = 5 qubits (plus few ancillas), in stark contrast to direct edge-register encodings that would require one qubit per edge (e.g., 990 data qubits at n = 45). Our contribution is therefore a compact, algebraic–spectral surrogate that (i) acts entirely in the charge-zero module and (ii) trades explicit constructions for one-sided, high-confidence diagnostics aligned with the classical probabilistic intuition.

3.1.

Numerical Investigations: Results

Both methods were run in double precision with k = 100 random projectors varying α in the exponential case. Additional runs with higher values of k, up to k = 400 confirmed the results. We tested the exponential operator for α = 3, 5, 7, 10, 15, 20, 40. Choosing k = 100 with α = 20 and k = 400 with α = 40 one provides a reproducible balance between statistical resolution and numerical stability.

As reported in Table 2, numerical results show a peculiar behavior at n = 45. The traces and spectra of both projectors provide a numerical diagnostic for the “critical” Ramsey value R(5, 5) = 45. For the exponential projector, the trace TrP_exp drops there exponentially faster to zero increasing α with respect to the other values. This reflects the system’s proximity to the Ramsey threshold, indicating that it is at this value where the random linear projector method most sensitively detects the transition between possible and impossible colorings making n = 45 the most promising candidate for R(5, 5). The spectrum of P_lin instead develops a small peak in the real eigenvalue n = 45, with respect to the other values. This is the behavior expected when the projector algebra can no longer accommodate a two-coloring that avoids a red or blue K₅. As expected, for n = 43, a limit value already excluded by existing constructions in the literature and n = 44, all diagnostics behave smoothly, indicating that the method does not yield false positives. At n = 46, the observables also show an exponential regime closer to zero of one order of magnitude with respect to 43 and 44, with decreasing values of the trace of the linear operator, consistent with the existence of admissible colorings and confirming that this value lies above the Ramsey threshold.

Table 2.

Trace of exponential projector, real and imaginary TrP_exp, linear projector TrP_lin, Minimum real eigenvalue of linear projector, Lyapunov exponent λ_L for each n at α = 20, k = 100, for n = 45 reports the best values obtained with α = 40 and k = 400. and slope of log₁₀(Trace) vs. {α}. The symbol* = indicates a known limit solution for n = 5 used as test.

n	43 *	44	45	46
TrP_exp	7.92 × 10⁻¹²	1.54 × 10⁻¹²	10⁻²⁸⁹ ≈ 0	1.86 × 10⁻¹³
TrP_lin	0.284	0.360	0.462	0.407
min Reλ	–0.058	–0.053	–0.050	–0.061
max Imλ	±0.037	±0.030	±0.032	±0.063
λ_L	1.55	1.48	1.41	1.54
Slope	–0.674	–0.642	–0.612	–0.670

Another test is given by the Lyapunov exponent which quantifies the mean exponential rate at which a vector x is stretched or contracted under repeated multiplication. Under an i.i.d. isotropy assumption for the rank-one directions {v_j} in the charge-zero subspace M₀, the exponential projector admits the mean-field estimate 𝔼 Tr P_exp(α) ≈ d e^−αλ_L. By Oseledets’ multiplicative ergodic theorem, the (maximal) Lyapunov exponent $λ_{\max} = \lim_{k \to \infty} \frac{1}{k} \log ‖ P_{lin} x ‖$ {\lambda _{\max }} = {\lim _{k \to \infty }}{1 \over k}\log \left\| {{P_{{\mathop{\rm lin}\nolimits} }}x} \right\| exists almost surely for i.i.d. factors [26]. The linear projector chain P_lin of Eq. (11), measures how rapidly directions in the space are suppressed as additional rank-1 projectors are applied. Concretely, after k projectors the relevant norm is $‖ P_{lin} x ‖$ \left\| {{P_{{\mathop{\rm lin}\nolimits} }}x} \right\|. The largest Lyapunov exponent is $λ_{L, \max} = \lim_{k \to \infty} \frac{1}{k} E [\log ‖ P_{lin} x ‖]$ {\lambda _{L,\max }} = {\lim _{k \to \infty }}{1 \over k}\left[ {\log \left\| {{P_{{\mathop{\rm lin}\nolimits} }}x} \right\|} \right]. As each factor removes one random one-dimensional component, λ_L,max is typically negative in a d-dimensional space, meaning an overall contraction.

The estimate with Random Rank-1 Projectors proceeds taking each projector $I - v_{j} v_{j}^{T}$ I - {v_j}v_j^{\rm{T}}, which removes the component along v_j. For a random unit vector x, the expected reduction is $E [{‖ (I - v_{j} v_{j}^{T}) x ‖}^{2}] = 1 - 1 / d$ \left[ {{{\left\| {\left( {I - {v_j}v_j^{\rm{T}}} \right)x} \right\|}^2}} \right] = 1 - 1/d. After k steps, this becomes $E [‖ P_{l i n} x ‖^{2}] = {(1 - 1 / d)}^{k}$ [{P_{lin}}x{^2}] = {\left( {1 - 1/d} \right)^k}. For large $d \log E [{‖ P_{lin} x ‖}^{2}] = k \log (1 - 1 / d) \approx - k / d$ d\log \left[ {{{\left\| {{P_{{\mathop{\rm lin}\nolimits} }}x} \right\|}^2}} \right] = k\log (1 - 1/d) \approx - k/d. The Lyapunov exponent is thus approximately in a mean-field estimate, assuming isotropy, λ_L ≈ –1/(2d). For d = 24, k = 100, the suppression factor is λ_L ≈ 0.015, so the norm is suppressed by about two orders of magnitude and the trace even more due to minimum eigenvalue directions. To quantitatively assess the rate of exponential suppression of the trace with respect to a, we also compute the slope of log_ω (Trace) versus the values of the suppression parameter α ≤ 20, for each candidate Ramsey value {n} = {44, 45, 46} and the limit value 43. The slope is estimated by performing a linear regression on the calculated data points (α, log₁₀(Trace)) for each n, log₁₀(Trace) = const + λ_L · α, where the suppression per unit α is numerically equal to the slope, providing the empirical Lyapunov exponents for each n. The highest Lyapunov exponent is found for n = 43 and for n = 46. The smallest value is for n = 45. For d = 24, k = 400, and α = 40, the eigenvalues of $A = \sum_{j = 1}^{400} v_{j} v_{j}^{T}$ A = \sum\nolimits_{j = 1}^{400} {{v_j}} v_j^{\rm{T}} concentrate at λ_i ≈ 400/24 ≈ 16.67. The trace of the exponential projector is Tr P_exp ≈ 2.4 × 10⁻²⁸⁹ ≈ 0, which is, for all practical purposes, zero.

Both diagnostics therefore single out n = 45 as the unique point where the Gibbs weight of legal colorings (exponential projector) is minimal with trace of the exponential projector, the contraction rate of random projections (Lyapunov exponent) is minimal in magnitude. The concordance between Lyapunov exponents and projectors in Table 2 strongly supports the hypothesis R(5, 5) = 45. A deeper discussion on statistical-confidence analysis for the random-projector can be found in Appendix B.

Prime-Sequence Numbers of Order k Roadmap for the Diagonal Ramsey Numbers

As a complement to the statistical method, we introduce a heuristic approach to diagonal Ramsey numbers based on finite sequence of prime numbers.

The use of primes is not new for the estimate of Ramsey numbers, starting from Calkin–Erdős–Tovey, who developed the prime-order cyclic graphs: both a refined probabilistic analysis (via distinct–difference counts in cyclic colorings) and exhaustive computation show primes enjoy a structural edge over composites in the cyclic search space, explaining why many record lower bounds arise at prime orders, supplying theory and computation. They showed that prime orders empirically outperform composite orders for diagonal lower bounds and explained why standard expectation arguments are insufficient without this cyclic structure of primes. Their colorings all arise from cyclic graphs on a prime number of vertices.

A ubiquitous way to certify lower bounds for Ramsey numbers is to give an explicit edge-coloring of K_n that avoids a forbidden monochromatic K_k. If such a coloring exists on n vertices, then R(k, k) ≤ n + 1. The most successful explicit colorings for several diagonal and multicolor cases come from circulant (cyclic) graphs of prime order and from Paley–Cayley graphs and Paley generalized graphs with Mathon’s cyclotomic construction for which R(7, 7) ≥ 205, while generalized–Paley/Mathon–type constructions also give R(9, 9) ≥ 565 and a direct Paley instance gives R(10, 10) ≥ 798. This “prime/circulant/Paley’’ program remains active: recent work tightens Mathon-type machinery (including directed Paley analogues) and updates multicolor/diagonal records, while dynamic surveys track the current best explicit bounds. [11, 22–24, 27, 28].

Our method instead uses a string of limited sequence of prime numbers before the expected value of the number R(n, n) to estimate the magnitude of a diagonal Ramsey number.

Definition 2 (Prime-sequence numbers of order k).

Fix an integer k ≥ 1 and let 𝒫_k := {p₁, p₂, …, p_k} = {2, 3, …, p_k} denote the first k prime numbers. A positive integer q is called a prime-sequence number of order k if it satisfies 18 $\begin{array}{l} q = \underset{p \in k_{k}^{n}}{\prod^{}} p^{v_{p}} w i t h v_{p} \in ℕ {\cup^{}}^{} {0}, \\ # {p ∣ v_{p} > 0} \leq n, \max_{p} v_{p} \leq n . \end{array}$ \matrix{ {q = \prod\limits_{p \in {\cal P}_k^n} {{p^{{v_p}}}} \quad {\rm{ with }}\quad {v_p} \in \cup \{ 0\} ,} \hfill \cr {\# \left\{ {p\mid {v_p} > 0} \right\} \le n,\quad \mathop {\max }\limits_p {v_p} \le n.} \hfill \cr }

In other words all prime divisors of q belong to the first k primes; at most n distinct primes actually occur in the factorisation and no prime exponent exceeds n. We write ${PS}_{k}^{n}$ {\rm{PS}}_k^n for the set of all prime-sequence numbers of order k and n factors. Here we adopt n = 3.

Example 1.

$45 = 3^{2} \times 5 \in {PS}_{5}^{3}$ 45 = {3^2} \times 5 \in {\rm{PS}}_5^3, because its primes {3, 5} lie in {2, 3, 5, 7, 11} and the largest exponent is 2. Another is $46 = 2 \times 23 \notin {PS}_{8}^{3}$ 46 = 2 \times 23 \notin {\rm{PS}}_8^3 (order 8 ends at 19), but 46 ∈ PS₉ because 23 enters at the ninth prime. $2^{4} \times 7 = 112 \notin {PS}_{k}^{3}$ {2^4} \times 7 = 112 \notin {\rm{PS}}_k^3 for any k, since the exponent of 2 exceeds the allowed bound 3.

This notion generalises the classical primorial P_k = p₁ ⋯ p_k by allowing limited repetition of the smallest primes while still forbidding any appearance of primes beyond p_k prime-sequence numbers of order k provide a sparsely factorised test bed for extrapolating diagonal Ramsey values without introducing unconstrained large prime factors.

Motivated by the algebraic–spectral evidence that singled out R(5, 5) = 45 = 3² × 5, we extrapolate the next diagonal values by constraining each R(n, n) to be a prime-sequence numbers of order 6 – a positive integer whose prime decomposition involves only the first six primes {2, 3, 5, 7, 11, 13} and whose growth ratio R(n, n)/R(n – 1, n – 1) remains in the measured corridor 2 ≲ ratio ≲ 3. This yields the interesting compact sequence related to the diagonal Ramsey numbers 19 ${R (1, 1), \dots, R (7, 7)} \to {1, 2, 6, 18, 45, 102 \leq R (6, 6) \leq 160, 205 \leq R (7, 7) \leq 492},$ \{ R(1, 1), \ldots ,R(7, 7)\} \to \{ 1,2,6,18,45,102 \le R(6, 6) \le 160,205 \le R(7, 7) \le 492\} , with factorizations 1, 2, 2 × 3, 2 × 3², 3² × 5, 2³ × 13, 3 × 7 × 11, …, respectively. The candidate R(6, 6) → 104 satisfies the rigorous bounds 102 ≤ R (6, 6) ≤ 165, while R(7, 7) → 231 lies inside the constructive window 189 ≤ R(7, 7) ≤ 4749.

Because the ansatz propagates the Erdős’ recursion R(m, n) ≤ R(m – 1, n) + R(m, n – 1) without overshooting any known upper bound, it furnishes a minimal, prime-structured scaffold against which future numerical or constructive proofs can be benchmarked. Let us estimate how far are Ramsey diagonal numbers R(k, k) from their prime-sequence numbers of order k – 1.

Prime-structured extrapolation of the diagonal Ramsey numbers

Let 𝒫₆ = {2,3,5,7,11,13} denote the first six primes. We call an integer q prime-sequence numbers of order k = 6(PS₆) if its prime factorisation involves only primes from 𝒫₆, i.e. $q = \prod_{p \in P_{6}} p^{v_{p}}$ q = \prod\nolimits_{p \in {{\cal P}_6}} {{p^{{v_p}}}} with v_p ∈ ℕ ∪ {0}. Starting from the exact values R(1, 1) = 1, R(2, 2) = 2, R(3, 3) = 6, R(4, 4) = 18 and the algebraic–spectral result R(5, 5) = 45 obtained in the main text, we impose two constraining axioms we use as basic assumptions.

Axiom I: Prime-sequence numbers of order k constraint: $R (n, n) \in {PS}_{k}^{3} : = {q \in ℕ ∣ q prime-sequence numbers of order k}$ R(n,n) \in {\rm{PS}}_{\rm{k}}^3: = \{ q \in \mid q{\rm{ prime - sequence}}\;{\rm{numbers of order }}k{\rm{\} }}.

Axiom II: Moderate–growth corridor:

$\frac{R (n, n)}{R (n - 1, n)} \leq 2$ {{R(n,n)} \over {R(n - 1,n)}} \le 2 for every n ≥ 2.

Since R(m, n) ≤ R(m – 1, n) + R(m, n – 1) for all m, n ∈ ℕ[1], the diagonal case is obtained setting m = n gives R(n, n) ≤ R(n – 1, n) + R(n, n – 1) and by symmetry R(n – 1, n) = R(n, n – 1) thus, the direct bound is R(n, n) ≤ 2R(n – 1, n).

Clarifying the “moderate–growth’’ ratios

For each n ≥ 2 define the off–diagonal ratio 20 $ρ_{n} : = \frac{R (n, n)}{R (n - 1, n)} .$ {\rho _n}: = {{R(n,n)} \over {R(n - 1,n)}}.

Because Erdős’ recursion forces ρ_n ≤ 2, one can check whether the known data and rigorous bounds stay inside that “moderate–growth corridor’’ ρ_n ∈ [1, 2].

The first three diagonals are known exactly and satisfy ρ_n ≤ 2. (n = 2): R(2, 2) = 2, R(1, 2) = 1 ⇒ æ₂ = 2, (n = 3): R(3, 3) = 6, R(2, 3) = 3 ⇒ æ₃ = 2, (n = 4): R(4, 4) = 18, R(3, 4) = 9 ⇒ æ₄ = 2.

For n = 5 we know 43 ≤ R(5, 5) ≤ 46 and R(4, 5) = 25, $1.72 = \frac{43}{25} < ρ_{5} \leq \frac{46}{25} = 1.84.$ 1.72 = {{43} \over {25}} < {\rho _5} \le {{46} \over {25}} = 1.84. n = 6. Current bounds are 102 ≤ R(6, 6) ≤ 160 and 59 ≤ R(5, 6) ≤ 85 [11], whence $1.20 = \frac{102}{85} \leq ρ_{6} \leq \frac{160}{59} = 2.71.$ 1.20 = {{102} \over {85}} \le {\rho _6} \le {{160} \over {59}} = 2.71.

The extreme combination 160/59 would violate ρ₆ ≤ 2, but that pairing uses the loosest numerator with the loosest denominator. Matching either both lower or both upper bounds gives 102/59 = 1.73 and 160/85 = 1.88, so every empirically plausible value of ρ₆ lies below 2.

With n = 7 the limits are 205 ≤ R(7, 7) ≤ 492 and 115 ≤ R(6, 7) ≤ 270 [11, 29] and we obtain $0.76 = \frac{205}{270} \leq ρ_{7} \leq \frac{492}{115} = 4.28,$ 0.76 = {{205} \over {270}} \le {\rho _7} \le {{492} \over {115}} = 4.28, while the matched bounds 205/115 = 1.78 and 492/270 = 1.82 again fall well inside the corridor.

The exact ratios through n = 4 are all ρ_n = 2. For n = 5 the interval [1.72, 1.84] sits comfortably below 2. For n = 6, 7 the widest theoretical cross-bounds still allow a violation, but every combination that pairs consistent lower and upper estimates (e.g. 102/59, 160/85 for n = 6) yields ρ_n < 2.

These observations motivate Axiom II 4: ρ_n ≲ 2 (“moderate–growth corridor”). Under this axiom, Eq. (19) extends Erdős’ recursive upper bound into a working heuristic for the unknown diagonals R(6, 6) and R(7, 7) while remaining consistent with all currently published data.

Enforcing the prime-sequence numbers of order k factorization rules (no more than three distinct primes and no exponent above 3) under axioms I 4– II 4 the factorization with prime-sequence numbers of orders 5 ≤ k ≤ 13 are reported in Table 3.

Table 3.

Prime–sparse extrapolation of the unknown diagonal values R(6, 6) and R(7, 7).

Ramsey N.	𝒫₅	𝒫₆	𝒫₇	𝒫₈	𝒫₉	𝒫₁₀	𝒫₁₁	𝒫₁₂	𝒫₁₃
R(6, 6)	108	108	117	117	115	115	115	111	111
R(7, 7)	225	216	221	209	209	209	209	209	205

There, each column uses the first k primes 𝒫_k = {2, 3, …, p_k} to build the sparsest admissible factorisation (at most three distinct primes, each exponent ≤ 3). Boldface marks the persistent values that remain unchanged from their first appearance up to the cut-off k = 2n – 1 (k = 11 for n = 6, k = 13 for n = 7).

Allowing more primes generally lowers the sparsest admissible R(6, 6) = 115 and R(7, 7) remains 209 in most schemes. A plateau indicates robust guesses. The criterion is persistence, we accept as the provisional diagonal value’ the integer that remains constant up to the largest admissible prime basis k_max(n) = 2n – 1. Anyway, as 45 already factors within the smallest prime basis 𝒫₅ = {2,3,5,7,11} and satisfies the sparsity rule (≤ 3 primes, maxV_p = 2), it would be still present in the table until 𝒫₉ where the prime 23 gets in defining 46 = 2 × 23. From our results the prime-sequence numbers of order k for a diagonal Ramsey number R(n, n) is limited by k < 2n – 1 and this disfavors the other value, 44 and 46 is controversial.

When cut-off rule k ≤ 2n – 1 applied to R(5, 5), for the diagonal n = 5 the prime basis is allowed to grow only up to k_max = 2n – 1 = 9. The prime-sequence numbers of order k criteria are fewest distinct primes, smaller numerical value. If we add also smallest maximal exponent we obtain the results in Table 4,

Table 4.

R(5, 5) under k ≤ 2n – 1 (n = 5): admissible and selected values as 𝒫_k grows.

R(5, 5) k	prime set	admissible integers in (43, 46]
3–5	𝒫_3–5 (11 and 23 absent)	45 = 3²5
5–8	𝒫_5–8 (11 present)	44 = 2²11, 45 = 3²5
9	(23 present)	44, 45, 46 = 2·23

When the ninth prime 23 becomes available, the factorisation 46 = 2 × 23 has the same number of distinct primes as 44 and 45 but a strictly smaller maximal exponent (1 ≤ 2) gives also evidence to 46. Because k = 9 already saturates the cut-off, no larger prime basis may undo this choice.

The fact that R(6, 6) = 115 and R(7, 7) = 209 survive every prime basis from 𝒫₉ to 𝒫₁₁ suggests they are the most stable predictions of the prime-sequence numbers of order k framework. More in detail, for R(6, 6) the optimum sequence drops from 108 (prime set 𝒫₅) to 117 once 13, 17 are allowed and settles at 115 with the inclusion of 23; the next prime, 37, immediately produces 111 = 3 × 37, after which no further prime extension changes the result. The value R(6, 6) = 115 is stable for every prime set from k = 9 up to the cut-off k_max = 11. It only changes to 111 when k = 12 > k_max; hence R(6, 6) = 115 is the persistent choice.

For R(7, 7) the value moves from 225 → 221 → 209 as soon as 19 enters the basis, and then remains pinned at 209 = 11 × 19 for six consecutive prime sets (𝒫₈–𝒫₁₂); only the arrival of 41 in 𝒫₁₃ lowers the prediction to the window floor 205 = 5 × 41. The integer 209 persists throughout the entire plateau k = 8–12, but the cut-off for n = 7 is k_max = 13; at that very last step the sparser factorisation 205 = 5 × 41 appears and becomes the new stable value. Therefore R(7, 7) = 209 is selected.

Thus the persistence principle reproduces the same predicted diagonals derived earlier with the explicit k ≤ 2n – 1 cut-off suggesting R(5, 5) = 45, R(6, 6) = 115 and R(7, 7) = 209. All three numbers continue to satisfy the sparsity axiom (no more than three primes, each exponent ≤ 3) and the tightened growth corridor R(n, n) ≤ 2 R(n – 1, n) and are defined within their known ranges.

Any constructive colouring at v < 111 or v < 209 would falsify the present sparsity hypothesis, whereas exhaustive elimination of colourings at v = 111 or v = 205/209 would push the rigorous lower bounds upward and force a narrower theoretical window.

The prime-sequence numbers of order k framework suggests the reduction of the enormous search space to a tiny target set, beyond the value R(6, 6) = 117 giving 21 $R (6, 6) \in {108, 111, 115}, R (7, 7) \in {205, 209} .$ R(6, 6) \in \{ 108,111,115\} ,\quad R(7, 7) \in \{ 205,209\} .

Exhaustive two-coloring searches should therefore be concentrated on these vertex counts. If none of these candidates admit a valid coloring, then either the prime-sequence numbers of order k sparsity hypothesis, valid empirically for the known diagonal Ramsey numbers, (no more than three distinct primes with exponents ≤ 3) or the tightened growth corridor R(n, n) ≤ 2 R(n – 1, n) must be reconsidered or weakened.

Conversely, the discovery of a single explicit coloring with R(6, 6) < 111 or R(7, 7) < 209 would break the current sparsity barrier, indicating that even the lightest admissible diagonal values require either more than three primes or a prime exponent exceeding the number bound 3. If, instead, exhaustive computation rules out all admissible colorings at v = 111 and v = 209 (or 205), the rigorous lower bounds would rise, forcing a narrower theoretical window and further constraining the allowable growth corridor.

This therefore supplies a minimal, fully factorized scaffold for future constructive or computational attacks on R(6, 6) and R(7, 7): any refutation must either break the prime sequence numbers of order k condition in Axiom I or force a growth ratio outside the empirical corridor in Axiom II. We regard this as a heuristic scaffold for targeting constructive searches; it does not constitute a proof or bound by itself.

Quantum Computation for R(5, 5) and Beyond

In contrast to the standard edge–register approach that needs one logical qubit per edge (already 136 qubits to verify R(4, 4) and 780 qubits for a 40-vertex scan), our Klein-graded random-projector method for R(5, 5) operates entirely in the reduced charge-zero module M₀ of dimension d = 24, i.e., only $⌈ \log_{2} d ⌉ = 5$ \left\lceil {{{\log }_2}d} \right\rceil = 5 data qubits (plus a few ancillas), and is therefore practical on today’s low-qubit quantum hardware.

In the graded–algebra diagnostic, the reliability of the decision at a target diagonal R(n, n) is governed primarily by the ratio k/d between the number of sampled projectors and the ambient dimension. For practical scans one should pick d as large as possible without increasing the data–qubit count (e.g., d ≤ 32 for five data qubits; if conditioning requires it, move to d = 48 but scale k accordingly), and then set k to meet a prescribed miss–probability threshold. Writing r for the surviving rank inside the charge–zero module, the miss probability obeys an exponential tail, so that keeping k/d above a simple logarithmic threshold in the target error ε suffices. In short, fix (d, r, ε) and select k so that k/d exceeds the rule-of-thumb bound below; this preserves the hardware footprint while making the peak / collapse witnesses (Tr P_lin and Tr P_exp(α)) increasingly sharp as n grows.22 $P_{miss} \leq e^{- k r / d} \Rightarrow \frac{k}{d} \geq \frac{\ln (1 / ε)}{r}$ {P_{{\rm{miss}}}} \le {e^{ - kr/d}} \Rightarrow {k \over d} \ge {{\ln (1/\varepsilon )} \over r} a conservative estimation gives k/d ≳ 2 ln(1/ε)/r.

The parameter d represents the ambient dimension of the charge–zero module M₀, and it should not be regarded as a function of the Ramsey parameter n. Rather, d is fixed as the maximal width that does not increase the data–qubit register, thus preserving hardware feasibility. The strength of the diagnostic witnesses (Tr P_lin and Tr P_exp) is governed by the ratio k/d, since the miss probability obeys an exponential tail P_miss ≤ e^−kr/d, with r the surviving rank.

The bound on P_miss follows from the exponential tail of the binomial approximation and can be strengthened by a Chernoff estimate. Since the exponent scales with kr/d, the relevant parameter is the ratio k/d, not the absolute size of d. Moreover, increasing d beyond the threshold that leaves the data–qubit count unchanged does not alter the rank structure of M₀. Hence, for fixed r, the diagnostic accuracy is improved only by enlarging k/d, demonstrating that d need not track the combinatorial parameter n.

Consequently, increasing d beyond the qubit threshold yields no benefit, while reliability is improved chiefly by scaling k/d. This decouples the quantum resource cost from the combinatorial size n of the Ramsey instance.

We now implement on quantum hardware the two scalar diagnostics introduced earlier, the linear/spectral witness P_lin and the exponential trace T(α) = Tr P_exp(α), to decide whether any survivor subspace remains at a given vertex count v. We purposely keep A and P_lin separate: Tr P_lin tracks residual rank after random deflations, while T(α) contracts in every positive real direction of A, yielding complementary order parameters. We work entirely in the charge-zero module M₀ of the V₄-graded construction; in our runs for R(5, 5) this had d = 24, which maps to a 5-qubit data register (plus two ancillas for block encoding and estimation).

5.1.

From Classical Diagnostics to Quantum Estimators

Ramsey numbers were evaluated with classical methods calculating the two scalar witnesses, the linear and exponential projectors P_lin and P_exp(α), with $A = \sum_{j = 1}^{k} v_{j} v_{j}^{⊤}$ A = \sum\nolimits_{j = 1}^k {{v_j}} v_j^ \top and declare v, as before, “critical” when Tr Plin peaks while T(α) := Tr P_exp(α) collapses. To translate these operations in the language of quantum computing we build the equivalents of the previous equations in terms of quantum circuits.

On a quantum device we reproduce the same scalars by two identities: first the Hutchinson identity $E_{| r 〉} 〈 r | F | r 〉 = \frac{1}{d}$ {_{|r\rangle }}\langle r|F|r\rangle = {1 \over d}{\mathop{\rm Tr}\nolimits} F Tr F, for any linear operator F ∈ ℂ^d×d on the data register M₀ with dimension d = 24 and a random quantum state |r〉 from a unitary 2-design, so that d 〈r|F|r〉 is an unbiased trace estimator; F can be either P_lin or P_exp. On hardware is applied a block-encoding U_F on data+ancillas with $(〈 0^{a} | \otimes I) U_{F} (| 0^{a} 〉 \otimes I) = F / α_{0}$ \left. {\langle {0^a}| \otimes I} \right){U_F}\left( {\left| {{0^a}} \right\rangle \otimes I} \right) = F/{\alpha _0}, so the Hutchinson estimator averages $〈 r | F | r 〉 = Tr F / d (or Tr F / (α_{0} d))$ \langle r|F|r\rangle = {\mathop{\rm Tr}\nolimits} F/d\left( {{\mathop{\rm or}\nolimits} {\mathop{\rm Tr}\nolimits} F/\left( {{\alpha _0}d} \right)} \right). The Hermitian dilation H of Eq. (23), for which H² = diag(AA^†, A^†A) and $σ (H) = {\pm σ_{i} (A)}_{i = 1}^{d}$ \sigma (H) = \left\{ { \pm {\sigma _i}(A)} \right\}_{i = 1}^d gives the second identity. Thus phase estimation on e^−itH accesses the singular spectrum of A, while Hadamard tests over a random |r〉 estimate Tr P_lin and T(α) coherently. In our d = 24 charge-zero module this uses the ceiling $⌈ \log_{2} d ⌉ = 5$ \left\lceil {{{\log }_2}d} \right\rceil = 5 data qubits plus few ancillas for R(5, 5).

5.2.

Hardware-Ready Benchmark Protocol

Inputs. Fix (d, k, α, seed), reuse the same random unit directions ${v_{j}}_{j = 1}^{k} \subset ℂ^{d}$ \left\{ {{v_j}} \right\}_{j = 1}^k \subset {^d} used classically to build the accumulator $A = \sum_{j = 1}^{k} v_{j} v_{j}^{⊤}$ A = \sum\nolimits_{j = 1}^k {{v_j}v_j^ \top } on M₀. We reuse the diagnostics defined earlier: P_exp(α) and P_lin in Eqs. (10)–(11).

Measurements. By the Hutchinson identity, for any implementable linear map F, $E_{| r 〉} 〈 r | F | r 〉 = Tr F / d$ {_{|r\rangle }}\langle r|F|r\rangle = {\mathop{\rm Tr}\nolimits} F/d when |r〉 is drawn from a unitary 2-design; we realize 〈r|F|r〉 by a Hadamard test and average over random |r〉’s (a more detailed discussion for the estimation of traces and spectra is written in Section 5.4). Amplitude estimation reduces the shot complexity from O(1/ϵ²) to O(1/ϵ) for additive error ϵ.

We estimate three quantities: (i) the scalar trace witness Tr P_lin, (ii) the exponential trace T(α) = Tr P_exp(α), and (iii) a spectral surrogate via the Hermitian dilation in Eq. (23).

Report (Tr P_lin, T(α), ρ(α)) with (d, k, α, seed) and the decision flag for each tested v.

We assume as decision rule the following: for a fixed d, k, α (and random seed), declare a vertex count v critical if: (i) Tr P_lin attains a local maximum at v; (ii) T(α) = Tr P_exp(α) collapses (numerically ≈ 0) at the same v; (iii) the spectral proxy (e.g. $‖ A ‖_{2} = ρ (H)$ A{_2} = \rho (H) by phase estimation) is locally extremal at v; and an explicit AM-46 control remains non-critical under the same thresholds.

5.3.

Compilation Primitives (Two Interchangeable Tracks)

Track Q (qubitized block-encoding). Track Q uses qubitized block-encodings; Track M uses Majorana/matchgate Gaussian primitives to realize the degree (0, 0) projectors directly in M₀. Both tracks operate in the same d = 24 module. As A is generally complex-symmetric (not Hermitian) when built from vv^⊤, we embed A into the Hermitian dilation 23 $H = (\begin{matrix} 0 & A \\ A^{†} & 0 \end{matrix}), ‖ A ‖_{2} = ρ (H),$ H = \left( {\matrix{ 0 & A \cr {{A^\dag }} & 0 \cr } } \right),\quad A{_2} = \rho (H), and access spectral surrogates by phase estimation on e^−itH. Because σ(H) = {±σ_i(A)}, a local maximum of the extracted $‖ A ‖_{2} = ρ (H)$ A{_2} = \rho (H) at the same v that triggers (i)–(ii) is expected. Here $‖ A ‖_{2}$ A{_2} is the spectral norm (largest singular value) and ρ(H) is the spectral radius of the Hermitian dilation H; since σ(H) = {±σ_i(A)}, we have $ρ (H) = \max_{i} σ_{i} (A) = ‖ A ‖_{2}$ \rho (H) = {\max _i}{\sigma _i}(A) = A{_2}. The hardware surrogate for e^−aA is so settled.

As A from vv^⊤ is typically non-normal, we implement functions of the dilation H (or of H²) via block-encoding/qubitization. Notably H² = diag(AA^†, A^†A) implies $Tr e^{- β H^{2}} = Tr e^{- β A A^{†}} + Tr e^{- β A^{†} A} = 2 Tr e^{- β A A^{†}} .$ {\mathop{\rm Tr}\nolimits} {e^{ - \beta {H^2}}} = {\mathop{\rm Tr}\nolimits} {e^{ - \beta A{A^ + }}} + {\mathop{\rm Tr}\nolimits} {e^{ - \beta {A^ + }A}} = 2{\mathop{\rm Tr}\nolimits} {e^{ - \beta A{A^ + }}}.

Estimating Tr e^−βH² therefore serves as a stable surrogate for the exponential witness T(α): both are monotone in the singular values of A and collapse precisely when the survivor subspace vanishes. This also gives a block-encoding of A enabling polynomial/Fourier approximants to f(A) ∈ {A, e^−αA}.

Operators used in Figure 1 (Phase Estimation on the Hermitian Dilation). Accumulator $A : = \sum_{j = 1}^{k} v_{j} v_{j}^{⊤}$ A: = \sum\limits_{j = 1}^k {{v_j}} v_j^ \top is the complex-symmetric rank-k sum built from the random directions v_j ∈ ℂ^d in the charge-zero module M₀ (with dimension d).

Hermitian dilation H (Eq. (23)); its spectrum is $σ (H) = {\pm σ_{i} (A)}_{i = 1}^{d}$ \sigma (H) = \left\{ { \pm {\sigma _i}(A)} \right\}_{i = 1}^d, so $‖ A ‖_{2} = ρ (H)$ A{_2} = \rho (H).

Controlled evolutions. The string c-e^{−i2^ℓtH} denotes the standard phase-estimation controlled unitaries at time-steps 2^ℓt, followed by the inverse QFT on the PE register to read out the extremal eigenphase(s), hence $‖ A ‖_{2}$ A{_2}. The prefix “c-” denotes a standard single–qubit controlled gate: $c - U : = (| 0 〉 {〈 0 |)}_{ctrl} \otimes I_{data} + (| 1 〉 {〈 1 |)}_{ctrl} \otimes U_{{data}^{'}}$ c - U: = (|0\rangle {\langle 0|)_{{\rm{ctrl}}}} \otimes {I_{{\rm{data}}}} + (|1\rangle {\langle 1|)_{{\rm{ctrl}}}} \otimes {U_{{\rm{data'}}}} so that $c - e^{- i 2^{k} t H} = (| 0 〉 {〈 0 |)}_{k} \otimes I + (| 1 〉 {〈 1 |)}_{k} \otimes e^{- i 2^{k} t H} .$ c - {e^{ - i{2^k}tH}} = (|0\rangle {\langle 0|)_k} \otimes I + (|1\rangle {\langle 1|)_k} \otimes {e^{ - i{2^k}tH}}.

Acting on a basis state |c_m–1 … c₀〉_PE ⊗ |ψ〉_data (with c_k ∈ {0,1}), the whole product implements $(\prod_{k = 0}^{m - 1} c - e^{- i 2^{k} t H}) (| c 〉 \otimes | ψ 〉) = | c 〉 \otimes e^{- i (\sum_{k} c_{k} 2^{k}) t H} | ψ 〉,$ \left( {\prod\limits_{k = 0}^{m - 1} {\rm{c}} - {e^{ - i{2^k}tH}}} \right)(|c\rangle \otimes |\psi \rangle ) = |c\rangle \otimes {e^{ - i\left( {\sum\nolimits_k {{c_k}} {2^k}} \right)tH}}|\psi \rangle , i.e., a data–register evolution for a time proportional to the integer encoded by the control register. Because all factors are functions of the same H, they mutually commute on the data space, so their order is immaterial (though the circuit is usually drawn MSB→LSB to match the inverse QFT). Here m is the number of phase bits (PE precision) and t is the chosen base time step; in our setting H so that $ρ (H) = ‖ A ‖_{2}$ \rho (H) = A{_2}, and these controlled evolutions are the core of the Hermitian-dilation phase–estimation block used as a spectral witness.

Registers. |0〉_sig is the dilation’s sign qubit; $| 0 〉_{PE}^{\otimes m}$ |0\rangle _{{\rm{PE }}}^{ \otimes m} is the m-qubit phase-estimation register; |ψ〉_data is the d-dimensional data register (for R(5, 5), $⌈ \log_{2} d ⌉ = 5$ \left\lceil {{{\log }_2}d} \right\rceil = 5 data qubits). We reserve H_Had for the one-qubit Hadamard gate to avoid confusing it with the dilation H; where the symbol H appears inside W_j below it is the Hadamard gate.

Definition 3 (Block-encoding).

A unitary U on a + log₂ d qubits is an (α, a) block-encoding of F ∈ℂ^d×d if (〈0^a| ⊗ I)U(|0^a〉 ⊗ I) = F/α. Given an (α, a) block-encoding of H with $‖ H ‖ \leq 1$ H \le 1, QSVT implements p(H) for any bounded odd/even polynomial p on [–1,1] using O (deg p) uses of U and U^†.

Rank-1 LCU for A. We write the One-ancilla rank-1 in Figure 2, $A = \sum_{j} w_{j} | u_{j} 〉〈 v_{j} |$ A = \sum\nolimits_j {{w_j}} \left| {{u_j}} \right\rangle \left\langle {{v_j}} \right|, prepare |u_j〉, |v_j〉 with unitaries U_j, V_j. A 1-ancilla block for a term is obtained with 24 $W_{j} : = (| 0 〉〈 0 | \otimes U_{j} + | 1 〉〈 1 | \otimes V_{j}) (H \otimes I),$ {W_j}: = \left( {|0\rangle \langle 0| \otimes {U_j} + |1\rangle \langle 1| \otimes {V_j}} \right)(H \otimes I), whose top-left block equals $\frac{1}{2} | u_{j} 〉〈 v_{j} |$ {1 \over 2}\left| {{u_j}} \right\rangle \langle {v_j}| (here H is the single-qubit Hadamard on the ancilla).

Operators used in Figure 2 (Rank-1 LCU block-encoding gadget). State-prepunitaries. $U_{j} | 0 〉 = | u_{j} 〉, V_{j} | 0 〉 = | v_{j}^{*} 〉$ {U_j}|0\rangle = \left| {{u_j}} \right\rangle ,{V_j}|0\rangle = \left| {v_j^*} \right\rangle prepare the rank-one factors (“*” appears when the complex-symmetric vv^⊤ structure is used).

Multiplexor. $MUX (U_{j}, V_{j}) : = | 0 〉〈 0 | \otimes U_{j} + | 1 〉〈 1 | \otimes V_{j}$ {\mathop{\rm MUX}\nolimits} \left( {{U_j},{V_j}} \right): = |0\rangle \langle 0| \otimes {U_j} + |1\rangle \langle 1| \otimes {V_j}. This gate is a controlled selection, uniformly controlled unitary, that applies U_j to the data register when the one-qubit selector is |0〉, and V_j when the selector is $| 1 〉 : MUX (U_{j}, V_{j}) (| 0 〉 \otimes | ψ 〉) = | 0 〉 \otimes U_{j} | ψ 〉$ |1\rangle :{\mathop{\rm MUX}\nolimits} \left( {{U_j},{V_j}} \right)(|0\rangle \otimes |\psi \rangle ) = |0\rangle \otimes {U_j}|\psi \rangle and $MUX (U_{j}, V_{j}) (| 1 〉 \otimes | ψ 〉) = | 1 〉 \otimes V_{j} | ψ 〉$ {\mathop{\rm MUX}\nolimits} \left( {{U_j},{V_j}} \right)(|1\rangle \otimes |\psi \rangle ) = |1\rangle \otimes {V_j}|\psi \rangle . In the selector computational basis it is block–diagonal, diag(U_j, V_j). A useful implementation identity is $MUX (U_{j}, V_{j}) = (I \otimes U_{j}) (c - (U_{j}^{†} V_{j})),$ {\mathop{\rm MUX}\nolimits} \left( {{U_j},{V_j}} \right) = \left( {I \otimes {U_j}} \right)\left( {{\rm{c}} - \left( {U_j^\dag {V_j}} \right)} \right), so the multiplexor can be built from one unconditional application of U_j on the data plus a single controlled unitary with target $U_{j}^{†} V_{j}$ u_j^ + {V_j}. The definition extends to an m-qubit selector as $MUX ({U_{s}}_{s \in {0, 1}^{m}}) = \sum_{s \in {0, 1}^{m}} | s 〉〈 s | \otimes U_{s},$ {\mathop{\rm MUX}\nolimits} \left( {{{\left\{ {{U_s}} \right\}}_{s \in {{\{ 0,1\} }^m}}}} \right) = \sum\limits_{s \in {{\{ 0,1\} }^m}} | s\rangle \langle s| \otimes {U_{\rm{s}}}, which applies U_s conditioned on the selector string s. In our block–encoding gadget of Fig. 2, choosing state–preparations $U_{j} | 0 〉 = | u_{j} 〉$ {U_j}|0\rangle = \left| {{u_j}} \right\rangle and $V_{j} | 0 〉 = | v_{j} 〉$ {V_j}|0\rangle = \left| {{v_j}} \right\rangle and combining MUX(U_j, V_j) with Hadamards on the selector yields the desired rank-one ancilla block used to assemble the linear combination of terms in the accumulator.

Gadget. $W_{j} : = (| 0 〉〈 0 | \otimes U_{j} + | 1 〉〈 1 | \otimes V_{j}) (H_{Had} \otimes I)$ {W_j}: = \left( {|0\rangle \langle 0| \otimes {U_j} + |1\rangle \langle 1| \otimes {V_j}} \right)\left( {{H_{{\rm{Had}}}} \otimes I} \right) has top-left ancilla block $\frac{1}{2} | u_{j} 〉〈 v_{j} |$ {1 \over 2}\left| {{u_j}} \right\rangle \langle {v_j}|. From terms to A. With weights w_j, a single selector register plus oblivious amplitude amplification yields an α₀–block-encoding of $A / α_{0} = \sum_{j} w_{j} | u_{j} 〉〈 v_{j} | / α_{0}$ A/{\alpha _0} = \sum\nolimits_j {{w_j}} \left| {{u_j}} \right\rangle \langle {v_j}|/{\alpha _0}.

Functions of A. Using LCU/qubitization (or QSVT), polynomials/Fourier approximants realize f_lin(z) = z and f_exp(z) = e^−αz, giving P_lin and P_exp(α) coherently.

Operators used in Figure 2 (Rank-ILCU block-encoding gadget). State-prepunitaries. $U_{j} | 0 〉 = | u_{j} 〉, V_{j} | 0 〉 = | v_{j}^{*} 〉$ {U_j}|0\rangle = \left| {{u_j}} \right\rangle ,{V_j}|0\rangle = \left| {v_j^*} \right\rangle prepare the rank-one factors (“*” appears when the complex-symmetric vv^⊤ structure is used). Multiplexor. $MUX (U_{j}, V_{j}) : = | 0 〉〈 0 | \otimes U_{j} + | 1 〉〈 1 | \otimes V_{j}$ {\mathop{\rm MUX}\nolimits} \left( {{U_j},{V_j}} \right): = |0\rangle \langle 0| \otimes {U_j} + |1\rangle \langle 1| \otimes {V_j}. Gadget. $W_{j} : = (| 0 〉〈 0 | \otimes U_{j} + | 1 〉〈 1 | \otimes V_{j}) (H_{Had} \otimes I)$ {W_j}: = \left( {|0\rangle \langle 0| \otimes {U_j} + |1\rangle \langle 1| \otimes {V_j}} \right)\left( {{H_{{\rm{Had}}}} \otimes I} \right) has top-left ancilla block $\frac{1}{2} | u_{j} 〉〈 v_{j} |$ {1 \over 2}\left| {{u_j}} \right\rangle \langle {v_j}|. From terms to A. With weights w_j, a single selector register plus oblivious amplitude amplification yields an α₀–block-encoding of $A / α_{0} = \sum_{j} w_{j} | u_{j} 〉〈 v_{j} | / α_{0}$ A/{\alpha _0} = \sum\limits_j {{w_j}} \left| {{u_j}} \right\rangle \langle {v_j}|/{\alpha _0}.

Track M (Majorana-native). On the platforms that natively support Majorana bilinears (match–gate/fermionic – Gaussian hardware), all degree–(0, 0) pair and clique projectors reduce to even–parity checks on the data modes: each monochromatic pair projector $Π_{i j}^{R / B}$ \Pi _{ij}^{R/B} and their products Π_R(S), Π_B(T) act entirely inside the charge–zero module M₀ and commute with total parity, hence can be realized as parity–preserving projectors built from quadratic Majorana terms. P_lin is then best read as a randomized mixture of parity checks: in the decomposition used in the text, P_lin, each rank–one deflation removes amplitude along a random degree–(0, 0) direction (a linear combination inside the span generated by pair/clique checks) within M₀, and the product effects the Hutchinson–style contraction that diagnoses the disappearance of survivors (Eq. (8)). By contrast, P_exp(α) is naturally implemented as repeated weak Gaussian projections in M₀: Trotterize exp $(- δ α v_{j} v_{j}^{⊤})$ \left( { - \delta \alpha {v_j}v_j^ \top } \right) for small δ_α and cycle j = 1, …, k, which preserves the even–parity sector by construction (Eq. (7)). Because all operators used by the diagnostics are degree–(0, 0), they act only on M₀ and are independent of the mixed (1, 1) sector projected out by the modeling axiom; thus the Majorana–native track realizes exactly the same witnesses as the generic qubit route while exploiting native parity checks. Cf. the definitions of the graded projectors, the M₀ restriction, and the linear/exponential maps in the main text.

5.4.

Estimating Traces and Spectra

Unbiased trace estimator (Hutchinson). For any implementable linear map F, random |r〉 from a unitary 2-design obeys 25 $E_{| r 〉} [〈 r | F | r 〉] = \frac{1}{d} Tr F,$ {_{|r\rangle }}[\langle r|F|r\rangle ] = {1 \over d}{\mathop{\rm Tr}\nolimits} F, so $d \bar{〈 r | F | r 〉}$ d\overline {\langle r|F|r\rangle } is an unbiased trace estimator; variance decreases as O(1/N) or O(1/N²) with amplitude estimation.

Operators used in Figure 3 (Hadamard-test realization of Hutchinson’s trace estimator). Random probe. |r〉 = C|0 ⋯ 0〉 where C is a unitary 2-design (e.g., a random Clifford) on the data register; C^† is undone at the end so the data returns to the computational frame. Block-encoding of the map. U_F is an (α₀, a) block-encoding of the (generally non-unitary) map F, i.e. $(〈 0^{a} | \otimes I) U_{F} (| 0^{a} 〉 \otimes I) = F / α_{0}$ \left( {\langle {0^a}| \otimes I} \right){U_F}\left( {\left| {{0^a}} \right\rangle \otimes I} \right) = F/{\alpha _0}. In this section F ∈ { P_exp(α) = e^−αA, P_lin} (Eqs. (7)–(8)). Ancilla routine. The ancilla is prepared in |+〉, a Hadamard–U_F–Hadamard sequence is applied, and the Pauli-Z observable is measured on the ancilla. Averaging 〈Z〉 over independent random C’s (Hutchinson sampling) yields an unbiased estimate of Tr F/d; amplitude estimation can reduce shot complexity from O(1/ε²) to O(1/ε). Lyapunov decay proxy. We use the slope 26 $λ_{L} (α) : = - \frac{d}{d α} \log Tr P_{\exp} (α) = \frac{Tr [A e^{- α A}]}{Tr [e^{- α A}]},$ {\lambda _L}(\alpha ): = - {d \over {d\alpha }}\log {\mathop{\rm Tr}\nolimits} {P_{\exp }}(\alpha ) = {{{\mathop{\rm Tr}\nolimits} \left[ {A{e^{ - \alpha A}}} \right]} \over {{\mathop{\rm Tr}\nolimits} \left[ {{e^{ - \alpha A}}} \right]}}, as a monotone indicator of contraction.

The observed signature at n = 45 and miss-probability are so obtained. On the d = 24 module with k ∈ [100,400] and α ∈ {20,40}, the diagnostics concur at n = 45: T(α) collapses while Tr P_lin peaks; the Angeltveit–McKay 46-vertex coloring (AM–46) [3] does not trigger. The false-negative risk under i.i.d. rank-1 directions obeys P_miss ≲ e^−kr/d for residual rank r, placing the operational risk < 10⁻³ for the reported settings and ~ 10⁻⁷ once r ≥ 12.

The resources and NISQ-friendly variant are the following. With d = 24, the data register is 5 qubits; a single ancilla suffices for block-encoding and one for overlap/phase estimation (7–8 qubits total). Depth scales as $\tilde{O} (k C_{prep})$ \widetilde O\left( {k\;{C_{{\rm{prep }}}}} \right) for constant-precision exponentiation; phase estimation adds O(1/ϵ) controlled evolutions for precision ϵ. A shallow NISQ variant Hermitianizes A by replacing vv^⊤ with vv^†, keeping the qualitative signatures (peak / collapse) while simplifying circuits.

Calculation of Diagonal Ramsey Values beyond R(5, 5)

After R(5, 5), we try (taking the results with a grain of salt) to give an estimate for R(6, 6) and R(7, 7) to verify our method and procedures. As Erdős said in a joke, “Suppose aliens invade Earth and threaten to obliterate it in a year’s time unless human beings can find the Ramsey number for red five and blue five. We could marshal the world’s best minds and fastest computers, and within a year we could probably calculate the value. If the aliens demanded the Ramsey number for red six and blue six, however, we would have no choice but to launch a preemptive attack” [5].

Let us first summarize the results for R(5, 5). As in Table 2, using the exponential-trace collapse and the linear/product witness, we report the probability of having a correct estimate for any value of the Ramsey numbers here considered. For R(5, 5) the value n = 44 has probability ≈ 2.9%, for n = 45 the probability rises up to ≈ 92.7% and n = 46 is ≈ 4.3%. As a safety check below the diagonal threshold, at n = 43 we obtain (with d = 24, k = 100, α = 20) the exponential trace Tr P_exp = 7.92 × 10⁻¹², the linear/product trace Tr P_lins = 0.284, extremal linear–spectrum entries min Re λ = –0.058, max |Im λ| = 0.037, and an empirical slope $\frac{d}{d α} \log_{10}$ {d \over {d\alpha }}{\log _{10}} Tr P_exp = –0.674; none of the “critical” signatures appear, in line with the statement that for n = 43, 44 the diagnostics behave smoothly. Using the measured slope to extrapolate from α = 20 to α = 40 gives log₁₀ Tr P_exp (40) ≈ –24.581 and hence Tr P_exp (40) ≈ 2.62 × 10⁻²⁵, still many orders of magnitude above the near-collapse seen at the true diagonal: for n = 45 one has Tr P_exp (40) ≈ 2.4 × 10⁻²⁸⁹ (with k = 400), while the linear witness peaks at Tr P_lin (45) = 0.462 versus 0.360 at 44 and 0.407 at 46. The projector-miss risk satisfies P_miss ≤ e^−kr/d, so at d = 24 one gets P_miss ≤ e^−100/24 ≈ 1.55 × 10⁻² for k = 100 and P_miss ≤ e^−400/24 ≈ 5.7 × 10⁻⁸ for k = 400 (taking residual rank r = 1), making a spurious collapse at n = 43 exceedingly unlikely; this calibrates the method’s accuracy and explains the sharp separation between 43 (non-critical) and 45 (critical). For hardware mapping we work in the charge-zero module M₀ with dim M₀ = d; for R(5, 5) we used d = 24, yielding a data register of ⌈log₂ d⌉ = ⌈log₂ 24⌉ = 5 qubits plus 1–2 ancillas, and empirically increasing the dimension d did not alter the decisions on these instances.

Motivated by this and by the P_miss exponent kr/d, a good default for R(6, 6) and R(7, 7) is to keep the width constant and pick the largest d that does not increase ⌈log₂ d⌉, i.e. d ≤ 32 (still 5 data qubits) for extra algebraic headroom; if one diagnoses conditioning issues, moving to d = 48 adds only one data qubit (to 6) but should be accompanied by scaling k so that k/d remains roughly constant.

We combine (i) a prime-structured classical scaffold and (ii) the same quantum spectral diagnostics to target the next diagonals. Using the classical prime-sequence scaffold, consider P_k = {2, 3, 5, 7, 11, 13, …}. Constrain diagonal values to prime-sequence integers using only the first k primes, with at most three distinct primes and exponents ≤ 3, and keep growth ratios ρ_n := R(n, n)/R(n – 1, n) ≲ 2 (Erdős corridor). This yields a short target set whose persistent elements (stable as k grows up to k_max = 2n – 1) are: $R (6, 6) \in {108, 1 ∣ 11, 115}, R (7, 7) \in {205, 209} .$ R(6, 6) \in \{ 108,1\mid 11,115\} ,\quad R(7, 7) \in \{ 205,209\} .

The persistence plateau selects R(6, 6) = 115 and R(7, 7) = 205 as the most stable predictions–both consistent with rigorous bounds 102 ≤ R(6, 6) < 160 and 205 ≤ R(7, 7) < 492.

Diagonal n = 6, 7 by exponential/linear projectors. Using the projectors we restrict the range of values for the two diagonals with n = 6 and n = 7. We fix the charge-zero module size to d = 32 (five data qubits), take a = 40, and choose k so that the i.i.d. miss bound P_miss ≤ e^−kr/d (with survivor rank r ≥ 1 below threshold) is small. For n = 6 we use k = 180; for n = 7 we use k = 220. With $A = \sum_{j = 1}^{k} v_{j} v_{j}^{⊤}, P_{\exp} (α) = e^{- α A}$ A = \sum\nolimits_{j = 1}^k {{v_j}v_j^ \top ,{P_{\exp }}(\alpha ) = {e^{ - \alpha A}}} and $P_{lin} = \prod_{j = 1}^{k} (I - v_{j} v_{j}^{⊤})$ {P_{{\mathop{\rm lin}\nolimits} }} = \prod\limits_{j = 1}^k {\left( {I - {v_j}v_j^ \top } \right)} (Eqs. (7)–(8)), the concentration σ(A) ≈ k/d observed at n = 5 implies the critical/noncritical split $T_{crit} (α) = Tr e^{- α A} \approx d e^{- α k / d}$ {T_{{\rm{crit }}}}(\alpha ) = {\mathop{\rm Tr}\nolimits} {e^{ - \alpha A}} \approx d{e^{ - \alpha k/d}} and T_noncrit(α) ≥ 1 + (d – 1) e^−αk/d, while the mean-field contraction for the linear chain is 𝔼∥P_linx∥² = (1 –1/d)^k. Numerically, for n = 6 (k = 180) one has e^−αk/d = e^−40.180/32 ≈ 1.9219 × 10⁻⁹⁸, hence T_crit ≈ 6.15 × 10⁻⁹⁷, T_noncrit ≥ 1 + (31) e^−αk/d ≈ 1, ${(1 - \frac{1}{32})}^{180} \approx 3.30 \times 10^{- 3}$ {\left( {1 - {1 \over {32}}} \right)^{180}} \approx 3.30 \times {10^{ - 3}}, and $P_{miss} \leq e^{- 180 / 32} \approx 3.61 \times 10^{- 3}$ {P_{{\rm{miss }}}} \le {e^{ - 180/32}} \approx 3.61 \times {10^{ - 3}}. For n = 7 (k = 220) one finds e^{−40·220/32} ≈ 3.7069 × 10⁻¹²⁰, thus T_crit ≈ 1.19 × 10⁻¹¹⁸, T_noncrit ≥ 1, ${(1 - \frac{1}{32})}^{220} \approx 9.26 \times 10^{- 4}$ {\left( {1 - {1 \over {32}}} \right)^{220}} \approx 9.26 \times {10^{ - 4}}, and P_miss ≤ e^−220/32 ≈ 1.03 × 10⁻³. Applying the same decision rule as for R(5, 5) (collapse of T(α), local maximum of the linear/spectral witness, explicit primesequence persistence up to k_max = 2n – 1), and testing the prime-sparse candidate sets {108, 111, 115} for n = 6 and {205, 209} for n = 7 (Table II), the first vertex count that exhibits collapse + peak is for the following values: $R (6, 6) = 115, R (7, 7) = 209,$ R(6, 6) = 115,\quad R(7, 7) = 209, with separation gaps exceeding 10⁹⁷ and 10¹¹⁸ in T(α) respectively and miss bounds ≲ 3.6 × 10⁻³ and ≲ 1.0 × 10⁻³.

Quantum spectral diagnostics for R(6, 6) and R(7, 7) confirm these results. For each candidate vertex count v above, reuse the same pipeline as in Section 5: estimate Tr P_lin, T(α), and the dilation spectral radius. The diagonal R(n, n) manifests as the smallest v where the linear trace peaks while the exponential trace collapses (with an AM-control remaining non-critical). This focuses searches at v ∈ {108,111,115} for n = 6 and v ∈ {205,209} for n = 7; any constructive coloring below 111 or 205 would falsify the sparsity heuristic, while exhaustive elimination at v = 111 and 205 (or 209) would raise rigorous lower bounds.

Putting both strands together we confirm the following high-accuracy working estimates R (6, 6) → 115 and R(7, 7) → 205, consistent with classical bounds and selected by prime-sequence persistence; these are the vertex counts at which the quantum diagnostics are expected to trigger under the same thresholds used for R(5, 5).

More details can be found in Appendix C, Procedures to calculate Ramsey numbers, and Appendix D Mathematical tools. Examples of python codings as a simple tutorial, are discussed in the Appendix D: Software and presented in other electronic support.

Ramsey Numbers and Classical/Quantum Applications to Machine Learning

7.1.

Ramsey Background and Notation

We recall that for integers m, n ≥ 1, the (two–color) Ramsey number R(m, n) is the least v such that every red/blue edge–coloring of the complete graph K_v contains a red K_m or a blue K_n. Ramsey’s theorem yields finiteness and the classical Erdős recursion. Exact values are known only at small parameters and diagonal cases grow notoriously fast. In this work we study these thresholds with a graded ℤ₂ × ℤ₂ Majorana algebra and two random–projector diagnostics that replace brute–force enumeration by spectral surrogates acting on a reduced (charge–zero) module.

Graded embedding and randomized spectral diagnostics is obtained through Klein–graded paraparticle algebra, from the Majorana modes $γ_{j}^{(0)}, γ_{j}^{(1)}$ \gamma _j^{(0)},\gamma _j^{(1)} with ${γ_{i}^{(α)}, γ_{j}^{(β)}} = 2 δ_{i j} δ_{α β}$ \left\{ {\gamma _i^{(\alpha )},\gamma _j^{(\beta )}} \right\} = 2{\delta _{ij}}{\delta _{\alpha \beta }}. Define $a_{j} = (γ_{j}^{(0)} + γ_{j}^{(1)}) / 2$ {a_j} = \left( {\gamma _j^{(0)} + \gamma _j^{(1)}} \right)/2 (red charge (1, 0)) and $b_{j} = (γ_{j}^{(0)} - γ_{j}^{(1)}) / 2$ {b_j} = \left( {\gamma _j^{(0)} - \gamma _j^{(1)}} \right)/2 (blue charge (0, 1)). A two–coloring lifts to the degree–(0, 0) edge operator of Eq. (4) which commutes with total Klein charge. Monochromatic pair and clique projectors $Π_{i j}^{R}$ \Pi _{ij}^R, $Π_{i j}^{B}$ \Pi _{ij}^B; $\prod_{R} (S) = \prod_{i < j \in S} \prod_{i j}^{R}, \prod_{B} (T) = \prod_{i < j \in T} \prod_{i j}^{B}$ {\prod _R}(S) = {\prod _{i < j \in S}}\prod _{ij}^R,{\prod _B}\left( T \right) = {\prod _{i < j \in T}}\prod _{ij}^B live in the charge–zero submodule M₀; the central obstruction to a good coloring is Eq. (6) and a coloring (or an entire symmetry class thereof) survivesiff P_m,n annihilates it. In this setting the Klein–compatible coproduct realizes the glue step v → v + 1 and yields an exact recursion for the graded numbers R_V₄ (m, n) in Eq. (7).

To decide if any legal survivor subspace remains in M₀ ≃ ℝ^d, we use two families of randomized maps built from i.i.d. isotropic unit vectors v_j ∈ ℝ^d, the linear deflation of Eq. (11) and the exponential map (Eq. (10)). If a survivor subspace has rank r ≥ 1, then k random rank–1 tests miss it with probability P_miss, so P_lin rapidly kills survivors as k/d grows, while T(α) collapses once survivors vanish. For non–normal A we use the Hermitian dilation H, giving a stable spectral surrogate accessible by phase estimation (quantum) or power iteration (classical).

R(5, 5) case and resource contrast. On the diagonal, the reduced module has d = 24; both witnesses single out v = 45: T(40) ≈ 2.4 × 10⁻²⁸⁹ at (k, α) = (400, 40) while Tr P_lin peaks locally at v = 45; an explicit v = 46 control remains non–critical under the same thresholds. Under the i.i.d./isotropy model, Pr[pmiss] ≲ e^−100/24 ≈ 1.6 × 10⁻² at (100, 20) and ≲ e^−400/24 ≈ 5.7 × 10⁻⁸ at (400, 40). Crucially, these diagnostics act only on M₀, requiring ⌈log₂ d⌉ = 5 data qubits (plus few ancillas), versus $(\begin{matrix} 45 \\ 2 \end{matrix}) = 990$ \left( \matrix{ 45 \cr 2 \cr} \right) = 990 data qubits for direct edge–encodings.

7.2.

Ramsey Theory and Machine Learning

Ramsey theory formalizes “order amid chaos”: sufficiently large systems necessarily contain structured subconfigurations. This principle echoes throughout modern ML and suggests concrete tests and controls using the witnesses above.

Overparameterization and lottery tickets. Let 𝒩 be a (possibly overparameterized) network with parameters W ≈ ℝ^D. Define a graph on neurons where the edge {i, j} is colored red if a predicate P_ij(W) holds (e.g., |W_ij| > τ, or sign/gradient agreement, correlation above a threshold), blue otherwise.

Whenever the effective width v exceeds the relevant R(k, k), a monochromatic K_k is guaranteed, i.e. a small subnetwork obeying P coherently–akin to the Lottery–Ticket hypothesis. We can certify the continued existence (or disappearance) of such candidates by encoding Π_R (S) from the predicate and monitoring the collapse/peak of T(α) and Tr P_lin as pruning proceeds; the risk of a false “no–ticket” verdict is controlled by Eq. (16) through k/d.

Build the predicate graph on vertices [v] by coloring an edge {i, j} red when P_ij(W) = 1 and blue otherwise. A Ramsey k–lottery is a set S ⊂ [v] with |S| = k such that all edges inside S are red; this is a K_k fully coherent for P and can be read as a compact subnetwork (a “ticket”) already encoding the desired structure. Ramsey’s theorem implies that once v ≥ R(k, k) a monochromatic K_k must exist, so the search for a good ticket can be recast as the impossibility of avoiding such a red K_k. In the graded framework, encode the forbidden condition “no red K_k” by the central operator in Eq. (27) 27 $\begin{array}{l} Q_{k} (W) : = \prod_{| S | = k} (I - Π_{R} (S)), \\ Π_{R} (S) = \prod_{i < j \in S} Π_{i j}^{R} (degree (0, 0)), \end{array}$ \matrix{ {{Q_k}(W): = \prod\limits_{|S| = k} {\left( {{\rm{I}} - {\Pi _R}(S)} \right)} ,} \hfill \cr {{\Pi _R}(S) = \prod\limits_{i < j \in S} {\Pi _{ij}^R} ({\rm{ degree }}(0, 0)),} \hfill \cr } acting on the charge–zero module M₀. If Q_k(W) has no surviving support, then a red K_k is unavoidable and at least one lottery ticket exists.

Operationally we decide this by randomized witnesses on M₀: P_lin and P_exp(α) with T(α). Sharing the same random seed across widths v lets us track the critical scale where tickets become inevitable: the collapse of T(α) together with a local peak of Tr P_lin signals Q_k(W) has emptied out, hence a P–coherent K_k exists. Under i.i.d. isotropic rank–1 probes in M₀ of dimension d, the miss–probability that all tests avoid a surviving r–dimensional subspace obeys P_miss, so choosing k≳(d/r) ln(1/δ) controls the false “no–ticket” verdict at level δ. This provides a quantified version of the Lottery–Ticket intuition: overparameterization raises v, and once the Ramsey threshold is crossed, small performant subnetworks are not accidental–they are guaranteed, and the witnesses certify their inevitability.

Adversarial inevitabilities (worst–case geometry). Form a near–collision graph on inputs X = {x_i}, coloring {i, j} red if the margin |f(x_i) – f(x_j)| < δ (for a score f), blue otherwise. Beyond a threshold v, large monochromatic cliques are inevitable, certifying coherent clusters of mutually confusable points. Our central projector of Eq. (4) with (m, n) = (k, k) and the witnesses (P_lin, P_exp) give an early–warning signal: collapse of T(α) indicates that confusion–free assignments have been extinguished within the defended hypothesis class.

Given inputs X = {x₁, …, x_v} and a score f, form the near–collision graph by coloring {i, j} red if |f(x_i) – f(x_j)| < δ (or, for classifiers, if the logits differ by < δ in all attack directions), blue otherwise. Large red cliques are coherent confusion sets—mutually confusable points that any fixed defense struggles to separate. To detect when such sets are unavoidable at scale v, instantiate the central projector of Eq. (6) with (m, n) = (k, k) and run the same witnesses (P_lin, P_exp(α)) on M₀. A collapse of T(α) indicates that the hypothesis class (with the current defense/training recipe) cannot realize a coloring that avoids size–k coherent confusions–i.e., adversarially vulnerable patterns are now Ramsey–inevitable at this v. The Lyapunov slope $λ_{L} (α) : = - \frac{d}{d α} \log T (α) = \frac{Tr (A e^{- α A})}{Tr (e^{- α A})}$ {\lambda _L}(\alpha ): = - {d \over {d\alpha }}\log T(\alpha ) = {{{\mathop{\rm Tr}\nolimits} \left( {A{e^{ - \alpha A}}} \right)} \over {{\mathop{\rm Tr}\nolimits} \left( {{e^{ - \alpha A}}} \right)}} rises as survivors vanish and serves as an early–warning margin proxy. As above, the one–sided risk that random probes pmiss a surviving r–plane is bounded by e^−kr/d, so (k, d) can be chosen to target a confidence level δ and declare inevitability only when both collapse/peak and the risk budget agree.

Graph neural networks and motif search. GNN tasks often detect motifs (cliques/cycles); Ramsey theory guarantees that small monochromatic subgraphs occur in large graphs regardless of coloring. Instead of scanning exhaustively, run the witnesses on the reduced module induced by the motif’s pair projectors–focusing compute where structure must exist. On parity–preserving (matchgate/Majorana) hardware, these checks map to shallow circuits acting entirely in M₀.

Let G = (V, E) be a large graph and let H = (V_H, E_H) be a small motif (e.g., a k-clique, a c-cycle, or a domain motif) with |VH| = h. For a fixed color c ∈ {R, B} and an injective placement f : V_H ↪ V, define the motif projector at placement f by $Π_{H, c} (f) := \prod_{(u, v) \in E_{H}} Π_{f (u) f (v),}^{c}$ {\Pi _{H,c}}(f): = \prod\limits_{(u,v) \in {E_H}} {\Pi _{f(u)f(v),}^c} where $Π_{i j}^{c}$ \Pi _{ij}^c is the degree-(0, 0) pair projector (red or blue) from the graded construction. The disjunction over all placements can be encoded by the central forbidden-motif operator $Q_{H, c} (V) := \prod_{\begin{matrix} f : V_{H} ↪ V \\ injective \end{matrix}} (I - Π_{H, c} (f)),$ {Q_{H,c}}(V): = \prod\limits_{\scriptstyle f:{V_H} \to V \atop \scriptstyle {\rm{ injective }}} {\left( {{\rm{I}} - {\Pi _{H,c}}(f)} \right)} , which equals the identityiffG contains no monochromatic copy of H. In our framework all factors have degree (0, 0) and act on the reduced module M₀, so the existence of a motif is decided by whether Q_H,c leaves any survivor subspace in M₀. This directly leverages the Ramsey guarantee that for sufficiently large |V| certain small monochromatic subgraphs are unavoidable.

Ramsey-guided screening (pre- and post-processing for GNNs). Rather than scanning all $(\begin{matrix} | V | \\ h \end{matrix})$ \left( \matrix{ |V| \cr h \cr} \right) placements, assemble a basis {ℬ_s} for the span generated by {Π_H,c(f)}_f (or by their complements {I – Π_H,c(f)}_f), sample k i.i.d. isotropic directions v_j in that span (restricted to M₀), and build the witnesses P_lin, P_exp(α) and T(α). If no legal placement remains (i.e. Q_H,c has no survivors), then T(α) collapses as α grows and Tr P_lin exhibits a local peak at the critical scale; conversely, a non-collapse certifies that at least one placement survives. With ambient dimension d = dim M₀ and residual rank r ≥ 1, the probability that k random rank-1 probes pmiss all survivors obeys Eq. (16) so one can pick $k ≍ \frac{d}{r} \ln (1 / δ)$ k \asymp {d \over r}\ln (1/\delta ) to achieve a target risk δ. In practice this yields a Ramsey-informed front-end filter: run the witnesses locally (on h-hop ego-nets, or on batches) and trigger exact subgraph-isomorphism or GNN attention only where the witnesses indicate unavoidable structure.

Training-time integration (regularizers and layers). Motif biases can be injected by adding a differentiable penalty that softly forbids Q_H,c: $ℒ_{total} = ℒ_{task} + μ Tr P_{\exp}^{(H, c)} (α)$ {{\cal L}_{{\rm{total }}}} = {{\cal L}_{{\rm{task }}}} + \mu {\mathop{\rm Tr}\nolimits} P_{\exp }^{(H,c)}(\alpha ) or $μ λ_{L} (α), λ_{L} (α) := - \frac{d}{d α} \log T (α),$ \mu {{\rm{\lambda }}_L}(\alpha ),\quad {{\rm{\lambda }}_L}(\alpha ): = - {d \over {d\alpha }}\log T(\alpha ), with $P_{\exp}^{(H, c)}$ P_{\exp }^{(H,c)} built from the motif basis {ℬ_s}; decreasing T(α) shrinks the measure of “no-H” assignments, nudging message passing toward motif-consistent representations. A complementary architectural primitive is Ramsey-aware pooling/attention: use per-node (or per-edge) contributions to the trace estimator as scores that gate message aggregation, focusing compute where the witnesses predict imminent motif emergence.

Complexity and deployment. For large |V|, exhaustive motif enumeration is $Θ ((\begin{matrix} | V | \\ h \end{matrix}))$ \Theta \left( {\left( \matrix{ |V| \cr h \cr} \right)} \right), while a witness pass costs O(k C_apply) with C_apply the cost to apply a basis element ℬ_s (sparse and local for small H). The error is one-sided and tunable via e^−kr/d; in screening mode we favor high recall (large k/d), then hand off flagged regions to exact methods or to a specialized GNN head.

Majorana/matchgate realization (few-qubit or parity-preserving hardware). All operators above are degree-(0, 0) and commute with total parity, hence on matchgate/Majorana platforms each $Π_{i j}^{c}$ \Pi _{ij}^c and their products are even-parity checks realizable by shallow fermionic-Gaussian circuits acting entirely in M₀. The traces T(α) (and Tr P_lin) admit unbiased Hutchinson estimators, and spectral surrogates are accessed via the Hermitian dilation of the rank-k accumulator, enabling few-qubit diagnostics or accelerator kernels that interleave with classical GNN training/inference.

Learning theory: combinatorial capacity (VC). PAC (Probably Approximately Correct) learning is another potential application as in computational learning theory it is used to analyze the learnability of functions. PAC provides a probabilistic approach to understanding how well a machine learning algorithm can generalize from training data to unseen data exploring whether a learning algorithm can find a hypothesis that is both approximately correct and probably correct with respect to a given concept and distribution. With PAC sample complexity scales with VC–dimension that gives a measure of the complexity of a hypothesis space or the power of learning machine. Ramsey–type statements provide complementary unavoidability results: in large instance/hypothesis regimes, certain regular label patterns (e.g., constant/parity–coherent on dense subgraphs) must appear. Encoding “forbidden labelings” as P_m,n and tracking T(α) supplies fast, high–confidence negative certificates (“this configuration is no longer realizable”) with quantitative control via k/d.

Let ℋ ⊆ {–1, +1}^𝒳 be a binary hypothesis class. A finite set S = {x₁, …, x_m} ⊂ 𝒳 is shattered by ℋ if every labeling y ≈ {–1, +1}^m is realized by some h ∈ ℋ, i.e. ∀y ∃h ∈ ℋ with h(x_i) = y_i for all i. The VC-dimension VC(ℋ) is the largest m such that some S of size m is shattered. Writing the growth function $Π_{ℋ} (m) := \max_{| S | = m} | {(h (x_{1}), \dots, h (x_{m})) : h \in ℋ} |,$ {\Pi _{\cal H}}(m): = \mathop {\max }\limits_{|S| = m} \left| {\left\{ {\left( {h\left( {{x_1}} \right), \ldots ,h\left( {{x_m}} \right)} \right):h \in {\cal H}} \right\}} \right|, the Sauer–Shelah lemma gives, for d = VC(ℋ) and m ≥ d, $Π_{ℋ} (m) \leq \sum_{i = 0}^{d} (\begin{matrix} m \\ i \end{matrix}) \leq {(\frac{e m}{d})}^{d},$ {\Pi _{\cal H}}(m) \le \sum\limits_{i = 0}^d {\left( \matrix{ m \cr i \cr} \right)} \le {\left( {{{em} \over d}} \right)^d}, so the number of distinct labelings realizable on m points grows polynomially once m > d. In PAC learning, these combinatorial bounds control sample complexity. In the realizable case (some h* ∈ ℋ attains zero risk), empirical risk minimization is consistent with $m ≳ \frac{1}{ε} (d \log \frac{1}{ε} + \log \frac{1}{δ})$ m \mathbin{\lower.3ex\hbox{\buildrel>\over {\smash{\scriptstyle\sim}\vphantom{_x}}}} {1 \over \varepsilon }\left( {d\log {1 \over \varepsilon } + \log {1 \over \delta }} \right) examples to achieve excess error ≤ ε with probability ≥ 1 – δ; in the agnostic case the dependence becomes $m ≳ \frac{1}{ε^{2}} (d + \log \frac{1}{δ})$ m \mathbin{\lower.3ex\hbox{\buildrel>\over {\smash{\scriptstyle\sim}\vphantom{_x}}}} {1 \over {{\varepsilon ^2}}}\left( {d + \log {1 \over \delta }} \right). Canonical examples: thresholds on ℝ have VC = 1; intervals on ℝ have VC = 2; axis-aligned rectangles in ℝ^d have VC = 2d; affine halfspaces in ℝ^d have VC = d+1.

Ramsey overlay and “unavoidability.” Ramsey theory adds a complementary, worst-case inevitability perspective: on sufficiently large instance sets, certain structured label patterns must occur. In our algebraic framework, a forbidden configuration (e.g. “no monochromatic K_m in red, none of size K_n in blue”) is encoded by the central projector P_m,n acting in the charge-zero module M₀. If P_m,n has no surviving support at size v, then no hypothesis consistent with those constraints can realize the corresponding label patterns on v examples—an unavoidability barrier. Operationally, we monitor T(α) and the product witness P_lin; collapse of T(α) together with a peak of Tr P_lin certifies that all labelings compatible with the constraint class have vanished at that scale, yielding a fast negative certificate (“this configuration is no longer realizable”).

From VC to Ramsey-constrained capacity. Let ℋ_m,n (v) denote the labelings of a v-point sample realizable by hypotheses that avoid the Ramsey-forbidden substructures encoded by P_m,n. By Ramsey’s principle, once v ≥ R(m, n) the set ℋ_m,n (v) is empty; hence the growth function of the constrained class satisfies Π_ℋm,n(v) = 0 for all v ≥ R(m, n). Thus the effective VC-dimension of ℋ_m,n is at most R(m, n)–1, and in practice can be (much) smaller due to additional algebraic symmetries. Our spectral witnesses make this transition detectable: when T(α) collapses at a given v, it implies Π_ℋm,n(v) = 0 under the modeling assumptions, delivering a data-driven ceiling on combinatorial capacity for the constrained hypothesis class.

Quantitative control via random projectors. Let d = dim M₀ and suppose the surviving feasible subspace (if any) has rank r ≥ 1. Drawing k isotropic rank-1 tests $v_{j} v_{j}^{⊤}$ {v_j}v_j^ \top , the probability to pmiss all survivors obeys Eq. (16) drives the false-negative risk below I. In practice: (i) choose d as large as possible without increasing the data–qubit count on hardware (e.g., d ≤ 24, 32, 48 map to 5 or 6 qubits); (ii) set a baseline pair (k, α) (e.g., (100, 20)) and a high–resolution pair (e.g., (400, 40)) to cross–validate decisions; (iii) declare a scale v critical when T(α) collapses, Tr P_lin peaks, and a spectral surrogate (e.g. the dilation norm ρ(H) = ∥A∥₂) is locally extremal under the same seeds. For example, with d = 24, r = 1, and target δ = 10⁻⁶, it suffices to use k ≤ 24ln(10⁶) ≈ 332; at (k,a) = (400, 40) the bound already yields Pr[pmiss] ≲ 5.7 × 10⁻⁸. Reporting (k, d, α) together with λ_L(α) and the explicit e^−kr/d risk converts Ramsey–style inevitability into a tunable, high–confidence negative certificate (“this configuration is no longer realizable”), complementing VC/sample–complexity upper bounds on the number of realizable labelings, what can be labeled or learned.

Practical takeaway for ML. (i) Use P_m,n to encode domain rules (forbidden label patterns or subgraphs) and add a soft penalty proportional to T (α) in the loss to bias training toward allowable regions; (ii) during pruning or architecture scaling, track (Tr P_lin, T(α)) as early-warning signals that the constrained class has lost capacity on the current sample size (a Ramsey barrier); (iii) report the explicit e^−kr/d miss-bound alongside collapse/peak events to quantify confidence in negative certificates. In tandem with standard VC/sample-complexity guarantees, these Ramsey-informed diagnostics provide actionable controls on what cannot be learned at a given scale, under the stated algebraic constraints.

Neurosymbolic constraints. Because P_m,n is central and Klein grading separates “red/blue” semantics, one can add soft penalties λ Tr P_exp (α) to the loss to bias training toward rule–consistent solutions–the graded, differentiable analogue of logic layers.

7.3.

Algorithms and Concrete Templates

7.3.1.

Ramsey–Guided Pruning (Classical/Quantum)

Fix a predicate P over parameters/activations and build the induced degree–(0, 0) operators on the reduced module M₀ (choose d s.t. ⌈log₂ d⌉ fits the co–processor; d = 24, 32, 48 are practical).
For each candidate scale v(width/channel budget), form A with a shared PRNG seed across v; compute P_lin and P_exp (α).
Track T(α), Tr P_lin, and a spectral surrogate ρ(H). Declare v critical when T(α) collapses and Tr P_lin peaks (with ρ(H) extremal).
Choose k to meet risk ε via k/d ≳ ln(1/ε)/r from (16); tune α from the Lyapunov slope $λ_{L} (α) := - \frac{d}{d α} \log T (α)$ {{\rm{\lambda }}_L}(\alpha ): = - {d \over {d\alpha }}\log T(\alpha ).

These same steps run on 5 – 7 qubits for d ≤ 24 – 48 using block–encoding/qubitization and a Hutchinson trace estimator $𝔼_{| r 〉} 〈 r | F | r 〉 = \frac{1}{d} Tr F$ {_{|r\rangle }}\langle r|F|r\rangle = {1 \over d}{\mathop{\rm Tr}\nolimits} F.

Adversarial-risk early warning. Set edges to mark δ–near collisions; sweep v (or coverage). A sharp drop in T(α) flags the Ramsey–style onset of unavoidable coherent confusions, prompting augmentation or re-architecture before empirical attacks emerge.

Curriculum & scaling via prime–sequence checkpoints (heuristic). Diagonal values observed in our framework are sparsely factorized (e.g·, 45 = 3² · 5). As a curriculum heuristic, checkpoint only at prime–sequence integers (products of the first few primes with bounded exponents), keeping successive ratios ≲ 2; this concentrates compute at likely transition scales.

Case study: diagonal R (5, 5) and resource accounting. At d = 24 the diagnostics concur at v = 45: T(40) ≈ 2.4 × 10⁻²⁸⁹ (collapse) and Tr P_lin peaks at 0.462 vs· 0.360 (v = 44) and 0.407 (v = 46); a known 46–vertex coloring stays non–critical under the same thresholds. With (16), Pr[pmiss] ≲ e^−400/24 ≈ 5.7 × 10⁻⁸ at (k, α) = (400, 40) (for r = 1), making spurious collapse extremely unlikely under the i·i·d·/isotropy assumptions· Quantumly, these checks use only ⌈log₂ d⌉ = 5 data qubits plus few ancillas, instead of $(\begin{matrix} v \\ 2 \end{matrix})$ \left( \matrix{ v \cr 2 \cr} \right) edge qubits (e.g. 990 at v = 45). The witnesses are diagnostic (not constructive) and report evidence consistent with R(5, 5) = 45 within the graded–module model and sampling assumptions.

Quantum realization on few qubits (for ML and Ramsey). Both P_lin and P_exp(α) admit coherent implementations via block–encodings of A and QSVT/qubitization; the Hermitian dilation H supplies a stable spectral proxy ρ(H) = ∥A∥₂. Traces are estimated by a Hadamard–test version of Hutchinson’s identity, reducing to 5 data qubits for d = 24 (plus ancillas). On matchgate/Majorana platforms, degree–(0, 0) pair/clique projectors are even–parity checks, yielding shallow circuits native to the hardware.

7.4.

Open Directions

The items below outline concrete, testable directions that translate the Ramsey principle of inevitability—realized through our graded-algebraic and random-projector framework—into practical tools for structure search, confidence-calibrated pruning, adversarial phase mapping, and prime-sequence curricula.

Ramsey–guided structure search. Add μ Tr P_exp (α) to the loss to bias training toward guaranteed motifs; study generalization/robustness.

Confidence–calibrated pruning. Turn the bound e^−kr/d into pruning schedules with explicit risk budgets; couple to dynamic d (keeping ⌈log₂ d⌉ fixed on small quantum assists).

Adversarial phase diagrams. Map (T, λ_L, ρ(H)) across defenses/data scales to chart inevitability regions where confusion becomes unavoidable.

Prime–sequence curricula. Empirically assess whether sharp loss/robustness transitions cluster at prime–sequence scales.

Ramsey theory supplies inevitability guarantees. The graded–algebra plus random–projector machinery turns them into operational tests–collapse/peak events with explicit, exponential–tail risk control–that inform pruning, curriculum, robustness analyses, and quantum–assisted diagnostics on a handful of qubits.

Conclusions

Embedding two-color Ramsey instances in a ℤ₂ × ℤ₂-graded paraparticle algebra renders the Klein recursion exact for the graded Ramsey numbers R_V₄ (m, n) (with the classical numbers obeying R(m, n) ≤ R_V₄(m, n)), supplies an explicit operator basis, and maps naturally onto forthcoming Majorana hardware. In our construction, algorithmic costs scale only logarithmically with the height of the Majorana tower, suggesting a realistic quantum–combinatorial synergy. Our graded-algebra diagnostics identify the diagonal threshold at R(5, 5) = 45 via a unique concurrence of signals–collapse of T(α), a peak of Tr P_lin, and maximal spectral spread–while the explicit AM-46 coloring remains non-critical. The miss probability defined in Eq. (16) for an undetected coloring is already small, P_miss < 10⁻³ for the reported parameters, and drops to P_miss ~ 10⁻⁷ once the residual rank satisfies r ≥ 12 (with d = 24); see Appendix B, Section D.11 for details. Hence the method delivers a tight, scalable heuristic that future constructive proofs will either confirm or surpass. Another criterion for diagonal Ramsey numbers, based on prime-sequence numbers of order k, is discussed in SM. Ultimately, only a constructive search can decide whether the coloring space is, for all practical purposes, empty (implying R(5, 5) = 45) or not (leaving R(5, 5) = 46). Together with the constructive bound 43 ≤ R(5, 5) ≤ 46, these signals provide statistical indications consistent with R(5, 5) = 45 under the graded-module model. Under i.i.d. isotropic sampling of directions v_j in M₀, the chance to miss surviving directions is bounded by P_miss ≥ e^−kr/d (residual rank r ≤ 1), making the collapse decision reliable at the reported (d, k, α) and increasing k/d strengthens both witnesses without widening the quantum data register.

Finally, the factorization 45 = 3² · 5 motivates a “prime-sequence’’ constraint that favors small prime factors when extrapolating diagonal values, offering a complementary, number-theoretic guidepost. Our results provide statistical, but not yet constructive, evidence for R(5, 5) = 45; scaling k and d on hardware alongside targeted constructive search is the natural next step, including an heuristic estimation for R(6, 6) and R(7, 7)

The method is lightweight (a d = 24 module) yet hardware ready: block-encoding and qubitization implement f(A) = e^−αA, and a Hermitian dilation supplies a stable spectral surrogate, yielding a compact, reproducible benchmark for matrix-function evaluation and randomized trace estimation on quantum devices. Both diagnostics have direct quantum realizations: block-encodings of F ∈ {P_exp(α), P_lin} feed a Hadamard–test/Hutchinson trace estimator, and the spectral surrogate is accessed by phase estimation on the Hermitian dilation H. Because all operators act on M₀, the data width is only ⌈log₂ d⌉ qubits (five for d = 24), plus a few ancillas–orders of magnitude below edge-register encodings that require one logical qubit per edge.

Our procedure is a statistical diagnostic, not a constructive proof: it relies on independence/isotropy of v_j, concentration of the spectrum of the accumulator A, and numerical thresholds chosen by cross-validation on neighboring n. The collapse/peak calls remain robust under these assumptions and are supported by explicit controls.

In any case this statistical estimation is a promising subject for machine learning. In ML, Ramsey numbers are not used directly, but the Ramsey principle, large enough systems must contain hidden order, deeply resonates with overparameterized neural networks, adversarial robustness, graph learning, and generalization theory. Open Research Directions are: Ramsey-inspired pruning, where in giant overparameterized models, one can use Ramsey reasoning to predict where guaranteed good subnetworks live. Complexity bounds: Ramsey numbers are huge, but their growth rate might inspire worst-case capacity bounds in neural networks. Curriculum design: Training on small guaranteed patterns (Ramsey cliques, unavoidable motifs) before scaling up could act as a form of combinatorial curriculum learning.

Random-Projector Quantum Diagnostics of Ramsey Numbers and A Prime-factor Heuristic for R(5, 5) = 45

Full Article

Paradigm

My account