Have a personal or library account? Click to login
Random-Projector Quantum Diagnostics of Ramsey Numbers and A Prime-factor Heuristic for R(5, 5) = 45 Cover

Random-Projector Quantum Diagnostics of Ramsey Numbers and A Prime-factor Heuristic for R(5, 5) = 45

Open Access
|Mar 2026

Full Article

1.
Introduction

Ramsey numbers are at the intersection of graph theory, combinatorics, neural networking, computation and probability. Ramsey theory asks for the smallest number R(m, n) such that every edge coloring of the complete graph KR(m, n) with two colors contains a red graph Km with m vertices or a blue Kn with n vertices [1]. In graph theory, a Ramsey number refers to the minimum number of vertices in a complete graph where, regardless of how the edges are colored with two colors, a monochromatic subgraph of a specified size (or sizes) is guaranteed to exist. Ramsey’s theorem guarantees the number exists. We know all exact values up to R(4, 5) = 25; the next diagonal case 43 ≤ R(5, 5) ≤ 46 has not been directly calculated yet [24]. Gap sizes grow explosively between R(5, 5) and R(6, 6) the interval already spans 60 numbers, and for R(7, 7) we only know 189 ≤ R(7, 7) ≤ 4749. It is the search of sequences and patterns in large structures like searching for constellations in the sky [5].

Ramsey numbers control worst-case resource overheads in error-correcting codes and result intractable with current quantum computing resources. Designing a code to correct t errors means ensuring no two codewords sit within Hamming distance, which measures the minimum number of substitutions required to change one string into the other, dH ≤ 2t [6]. This avoidance problem is equivalent to a set-coloring Ramsey instance on an alphabet of size q. The threshold at which large codes cease to exist scales like R(t + 1;q, q — 1), so the redundancy one must pay in the worst case grows with a Ramsey number rather than a mere polynomial in t. This follows from the standard avoidance formulation of code design; see [7] for the probabilistic method and [8] for Ramsey-theoretic background.

In quantum computing they are fundamental in worst-case routing in quantum compilation, mapping an arbitrary k-qubit circuit onto limited hardware connectivity forces SWAP gates whenever interacting qubits aren’t adjacent. Ramsey’s theorem guarantees that in any coloring or pattern of two-qubit interactions among R(k, k) qubits, there must exist an unavoidable “hard” configuration such as a clique or independent set requiring at least R(k, k) SWAP operations or depth units to resolve. Thus, the diagonal Ramsey number sets a lower bound on the worst-case communication depth of any fully general distributed quantum compiler in worst-case interaction patterns under limited connectivity, a Ramsey-style obstruction yields a depth lower bound scaling with a diagonal Ramsey number [8, 9]. To push worst-case depths down, one either needs richer native connectivity or higher-dimensional qudits of order n to effectively reduce the Ramsey argument to R([k/n], [k/n]) adopting paraparticle methods with ℤ2 × ℤ2-graded Majorana algebras [10].

Computing Ramsey numbers is notoriously intensive: classical proofs use double counting or induction [11], whereas the gluing technique builds larger graphs from smaller ones to avoid or force specific subgraphs such as cliques or independent sets [12].

We propose a new statistical method to estimate Ramsey number based on ℤ2 × ℤ2-graded Lie Algebras of Majorana infinite-component spin fields introduced in [10]. Our strategy follows the implementing of a paraparticle algebra graded by the Klein group V4 ≌ ℤ2 × ℤ2, which requires only a limited number of qubits for each step of analysis. By implementing higher-order algebras we develop recursive gluing and pruning methods on Ramsey graphs to give an estimate to or effectively calculate Ramsey numbers. The result is a purely algebraic that takes in account of the branching that factorizes the two colors and packages local symmetry constraints, allowing traces/character formulas to replace brute-force enumeration for many branches.

Quantum resources needed are formidable: verifying R(4, 4) = 18 requires exhaustively scanning all 2136 two–colourings of K17, i.e., one logical qubit per edge (136 qubits), already at the frontier of current hardware [13, 14]. Even with Grover’s quadratic speed-up the bottleneck is width, not depth: a 40-vertex instance occupies a 2780-dimensional space, demanding 780 qubits before amplification can begin.

For R(5, 5) we give an estimate of the qubit cost of a brute–force (Grover–style) search that needs an enormous number of quantum resources with a register model based on a two–colouring of Kn that can be encoded by a binary string x ∈ {0,1 }E(n) with one bit per (undirected) edge, where E(n)=(n2)=n(n1)/2E(n) = \left( \matrix{ n \cr 2 \cr} \right) = n(n - 1)/2 as reported in Table 1. Adopting the convention xij = 1 iff edge {i, j} is red (blue is 0), a reversible predicate Fn(x) flags a violation (Fn(x) = 1) iff x contains a red K5 or a blue K5; otherwise Fn(x) = 0. In Grover’s search for a good colouring one uses the oracle 1On:|x|b|x|b¬Fn(x),{{\cal O}_n}:|x\rangle |b\rangle \mapsto |x\rangle \left| {b \oplus \neg {F_n}(x)} \right\rangle , or phase kickback with a |–〉 ancilla.

Table 1.

Estimate of the number of qubits required to calculate R(5, 5) with Grover-style algorithm.

nE(n)=(n2)E(n) = \left( \matrix{ n \cr 2 \cr} \right)Total qubits Q(n) (safe upper bound)
4444 · 43/2 = 946946 + 16 ≈ 962
4545 · 44/2 = 990990 + 16 ≈ 1006
4646 · 45/2 = 10351035 + 16 ≈ 1051

Oracle structure and ancillas can be organized in this way, like a game: for each 5–subset S ⊆ [n] we must test whether all 10=(52)10 = \left( \matrix{ 5 \cr 2 \cr} \right) edge–bits inside S are equal to 1 (red K5) or all equal to 0 (blue K5). Each “all–red” test is a 10–input AND operation, which can be realized by a Toffoli tree using 9 clean ancillas (one per internal node); the “all–blue” test is obtained by negating the same 10 inputs, reusing the same 9 ancillas, and then uncomputing. Let aAND = 9 denote these workspace ancillas. We keep one flag rS for “red K5 on S”, one flag bS for “blue K5 on S”, and an aggregate OR–bit v that accumulates violations via vv ∨ (rsbs) as we sweep all (n5)\left( \matrix{ n \cr 5 \cr} \right) subsets, uncomputing rS, bS and the AND tree at the end of each subset. Finally, we need one output/phase ancilla for Grover, aph = 1.

Thus the oracle’s simultaneous ancilla budget can be kept to have a number of ancillas, awork = aAND + 2 + 1 + aph = 9 + 2 + 1 + 1 = 13, i.e., a number of 13 clean ancillas beyond the edge register. If, instead, negative controls are disallowed and input–negations must be done explicitly for the “all–blue” check, one adds at most 10 transient bits that are immediately uncomputed; the peak simultaneous ancillas remain ≤ 13. One may also think to add a couple of safety ancillas for carry/OR trees; which means a budget of +3 that would be needed to cover such variants, but it ia a matter of optimization that goes beyond the purpose of this work.

The total qubits required starts from a conservative upper bound Q(n) on the width (qubit count) of a brute–force Grover oracle for the diagonal R(5, 5) tests at size n, 2Q(n)=E(n)+awork (n2)+16.Q(n) = E(n) + {a_{{\rm{work }}}} \mathbin{\lower.3ex\hbox{\buildrel<\over {\smash{\scriptstyle\sim}\vphantom{_x}}}} \left( \matrix{ n \cr 2 \cr} \right) + 16.

Concretely, as reported in the following table, the number of qubit required is very big.

The search space has size 2E(n)=2(n2){2^{E(n)}} = {2^{\left( {\scriptstyle n \atop \scriptstyle 2} \right)}}. Grover’s amplitude amplification finds a good colouring, when one exists, n has O(2E(n)/Gn)O\left( {\sqrt {{2^{E(n)}}/{G_n}} } \right) oracle uses, where Gn is the number of good colourings at size n, which is unknown a priori. For the upper–bound instance (let us assume n = 45), proving nonexistence of a good colouring via search still takes exponential time in the worst case and requires additional outer logic, but the width remains dominated by the edge register as above.

In our setting the projector algebra acts on the charge–zero submodule of dimension d = 24, so the data register needs only log2d = log224 =5\left\lceil {{{\log }_2}d} \right\rceil = \left\lceil {{{\log }_2}24} \right\rceil = 5 qubits (plus a few ancillas for block-encoding/qubitization). Instead, the truly brute–force (enumeration/Grover) approach to R(5, 5) at the threshold requires on the order of a thousand logical qubits just to hold a coloring ((452)=990)\left( {\left( \matrix{ 45 \cr 2 \cr} \right) = 990} \right), plus ≲ 13 + 3 = 16 clean ancillas for the reversible violation check and phase kickback. In fact, a naive edge-register encoding at n = 45 would require one qubit per edge, i.e., 990 data qubits.

This stands in sharp contrast to our estimate random–projector spectral diagnostics presented in the next section, whose quantum implementation needs only a few data qubits (plus a handful of ancillas), i.e., two or three orders of magnitude fewer qubits. The other two diagonal cases with n = 6 and n = 7 would require up to 13530 and 145530 maximum data qubits, respectively, instead of 6 and 7 data qubits (plus modest ancillas) discussed below. Our approach does not replace the exact solution from brute-force approach, instead helps to restrict the range of values of diagonal Ramsey numbers for deeper investigations like Erdős flip-coin method [15].

Roadmap

After the introductory Section 1, Section 2 formalizes the Z2 × Z2–graded Majorana/paraparticle framework: it defines the degree–(0, 0) pair and clique projectors ΠijR/B\Pi _{ij}^{R/B} and the central forbidden–clique operator Pm,n on the charge–zero module M0, and establishes the exact Klein recursion RV4(m, n) = RV4(m – 1, n) + RV4(m, n – 1). Section 3 introduces two randomized spectral diagnostics acting on the reduced (d = 24) charge-zero module: a linear rank-1 deflation map Plin and an exponential map Pexp(α). These witnesses parallel Erdős’s first-moment method and are used to single out n = 45 on the diagonal. We then propose a compact prime-sequence heuristic for diagonal values. Next, we show how both diagnostics admit few-qubit quantum realizations via block-encoding/qubitization and a Hermitian dilation surrogate, and we extend the study to R(6, 6) and R(7, 7). Finally, we discuss applications to ML where the same witnesses provide fast, high-confidence negative certificates under graded constraints. Technical proofs and numerical details are deferred to the Appendix.

2.
Ramsey Numbers and Klein-Graded Paraparticle Algebra

To compute R(m, n), one can use a recursive gluing–pruning scheme. Starting with a base layer obtained by enumerating all good edge-colorings Gv0 of Kv0 (i.e., colorings containing neither a red Km nor a blue Kn). Then, for each good Gv , glue on a new vertex v + 1 and color its v incident edges in every way that preserves the good property, pruning any extension that creates a red Km or a blue Kn.

Pruning complements the gluing step by immediately discarding any partial coloring that already contains a forbidden red Km or blue Kn. After each glue operation, we quotient by vertex–label symmetries via canonical labeling and graph–isomorphism checks to eliminate equivalent colorings, ensuring that only genuinely new configurations are explored. We then backtrack as soon as a red Km or a blue Kn is detected. This bottom-up strategy, coupled with selective rollback and symmetry reduction, dramatically shrinks the search tree and has enabled exact computations by keeping the frontier of viable colorings tractably small [12, 16]. In our case we exploit the properties of ℤ2 × ℤ2–graded Majorana algebras for paraparticle states [10] and random projectors to extend Erdős’ flip-coin method.

Let γj(0),γj(1)\gamma _j^{(0)},\gamma _j^{(1)} denote Majorana modes with 3{ γi(α),γj(β) }=2δijδαβ,\left\{ {\gamma _i^{(\alpha )},\gamma _j^{(\beta )}} \right\} = 2{\delta _{ij}}{\delta _{\alpha \beta }}, graded by the numbers α, β ∈ {0,1}. Paraparticles of order p ≥ 2 obey trilinear commutation relations generalizing fermions and bosons [17]. When expressed in Majorana operators for quantum computers based on Majorana physics [10, 18], the algebra organizes itself into a tower of states { γj() }0{\left\{ {\gamma _j^{(\ell )}} \right\}_{\ell \ge 0}} where level carries total parity mod 2 and an additional color charge. The simplest base of levels = 0 and 1 suffice to mirror any edge-colored clique, while higher levels host recursively glued subgraphs.

To mirror the colorings, we extend the Majorana modes in Eq. (3) to a paraparticle doublet aj=12(γj(0)+γj(1)){a_j} = {1 \over 2}\left( {\gamma _j^{(0)} + \gamma _j^{(1)}} \right), bj=12(γj(0)γj(1)){b_j} = {1 \over 2}\left( {\gamma _j^{(0)} - \gamma _j^{(1)}} \right), satisfying { ai,aj }=2δij\left\{ {{a_i},a_j^\dag } \right\} = 2{\delta _{ij}}, and { bi,bj }=2δij\left\{ {{b_i},b_j^\dag } \right\} = 2{\delta _{ij}}, all other anticommutators vanish, assigning (a, b) the Klein charges (1, 0) and (0, 1). The full algebra AV4=gV4Ag{{\cal A}_{{V_4}}} = { \oplus _{g \in {V_4}}}{{\cal A}_g} decomposes into four color sectors. A binary edge coloring of a graph with vertex set V lifts to 4E^=1i<j|V|(cijRaiaj+cijBbibj)+ h.c.\hat E = \sum\limits_{1 \le i &#x003C; j \le |V|} {\left( {c_{ij}^Ra_i^\dag {a_j} + c_{ij}^Bb_i^\dag {b_j}} \right)} + {\rm{ h}}{\rm{.c}}{\rm{.}} which commutes with the total Klein charge, enabling simultaneous diagonalization with parity. We take cijR=cjiR¯c_{ij}^R = \overline {c_{ji}^R} and cijB=cjiB¯c_{ij}^B = \overline {c_{ji}^B} so that (4) is Hermitian; the sum over i < j avoids double counting.

We can now set the assignment of edge-labels “coloring” ≡ graded sector. The Klein four group V4 has elements {(0, 0), (1, 0), (0, 1), (1, 1)}, we decide to identify (1, 0) red, (0, 1) blue, with (0, 0) the vacuum/identity and (1, 1) a mixed sector projected out at the end which lives in the bi-graded component of the algebra obtained by multiplying a red operator by a blue one (or vice versa).

Definition 1 (Graded Ramsey numbers RV4(m, n) and mixed-sector projection).

Fix the2 × ℤ2graded Majorana (Klein–graded) algebra AV4=gV4Ag{{\cal A}_{{V_4}}} = { \oplus _{g \in {V_4}}}{{\cal A}_g} and impose the mixed–sector projection (all degree (1, 1) terms are set to zero). For a vertex set [v] = {1, … , v}, let ΠijR,ΠijB\Pi _{ij}^R,\Pi _{ij}^B be the degree–(0, 0) monochromatic pair projectors associated with the edge {i, j} (red and blue, respectively), and define the monochromatic clique projectors 5ΠR(S)=i<jSΠijR,ΠB(T)=i<jTΠijB.{\Pi _R}(S) = \prod\limits_{i &#x003C; j \in S} {\Pi _{ij}^R} ,\quad {\Pi _B}(T) = \prod\limits_{i &#x003C; j \in T} {\Pi _{ij}^B} .

The central “forbidden–clique” operator on the charge–zero module M0 is 6Pm,n=S[v]|S|=m(1ΠR( S))×T[v]|T|=n(1ΠB( T)),{P_{m,n}} = \prod\limits_{\scriptstyle {\rm{S}} \subseteq [v] \atop \scriptstyle |S| = m} {\left( {1 - {\Pi _{\rm{R}}}({\rm{S}})} \right)} \times \prod\limits_{\scriptstyle T \subseteq [v] \atop \scriptstyle |T| = n} {\left( {1 - {\Pi _{\rm{B}}}({\rm{T}})} \right)} ,

The graded Ramsey number RV4(m, n) is the least v ≥ 1 such that Pm,n (v) annihilates the entire charge–zero module (equivalently, every graded two–coloring of Kv—with mixed (1, 1) components invisible—contains a red Km or a blue Kn). It obeys the exact Klein Erdős recursion 7RV4(m,n)=RV4(m1,n)+RV4(m,n1),RV4(1,n)=RV4(m,1)=1,\matrix{ {{R_{{V_4}}}(m,n) = {R_{{V_4}}}(m - 1,n) + {R_{{V_4}}}(m,n - 1),} \hfill \cr {{R_{{V_4}}}(1,n) = {R_{{V_4}}}(m,1) = 1,} \hfill \cr } and upper–bounds the classical Ramsey numbers: R(m, n) ≤ RV4 (m, n).

Equation (7) exactly matches the constructive lower bound: there is at least one coloring on RV4(m – 1, n) + RV4(m, n – 1) – 1 vertices that avoids both a red Km and a blue Kn and R(m, n) ≤ RV4(m, n), so no classical bounds are violated.

In the ℤ2 × ℤ2 graded algebra, multiplying a red sector element by a blue sector element produces a state in the mixed (1, 1) sector, which lies outside the physical subspace and does not correspond to any valid qudit state because, by construction, the physical Hilbert space is defined to include only the pure color sectors. In practice, such components are either projected out (viz., set to zero) or interpreted as an internal syndrome that flags excursions away from the pure color subspaces during computation. By monitoring the amplitude in the mixed sector after each gate, one can detect and correct errors. When is observed any nonzero mixed-sector population, a red–blue mismatch has occurred and one can apply a compensating operation to return to the valid (1, 0) or (0, 1) grading.

For each unordered pair {i, j} we then introduce a homogeneous edge generator eij of that ℤ2 × ℤ2-degree. Paraparticle commutation relations [17] enforce parity bookkeeping without edge-ordering data. The coproduct is a glue operation defining 8Δ(eij)=eij1+1eij,\Delta \left( {{e_{ij}}} \right) = {e_{ij}} \otimes 1 + 1 \otimes {e_{ij}}, so that adding vertex v + 1 applies Δ to each existing eij and adjoins the set{ei,v+1}iv, preserving grading. We use the standard cocommutative coproduct compatible with the V4 grading. Counit and antipode are not needed in what follows; only the compatibility of Δ with the grading is used in the glue step.

To detect forbidden cliques, inside the algebra we use the projection Pm,n of Eq. (6) where, ΠR(S)=Πi<jSeij(1,0){\Pi _{\rm{R}}}({\rm{S}}) = {\Pi _{i &#x003C; j \in {\rm{S}}}}e_{ij}^{(1, 0)} projects onto the red and ΠB(T)=i<jTeij(0,1){\Pi _{\rm{B}}}({\rm{T}}) = \prod\nolimits_{i &#x003C; j \in {\rm{T}}} {e_{ij}^{(0, 1)}} on blue (see Eq. (5)). A coloring survives iff Pm,n annihilates it. Because Pm,n is central, one can evaluate Tr(Pm,n) on characters of irreducible modules rather than on individual colorings, turning part of the combinatorial explosion into a trace computation. Two lemmas and a theorem (see Appendix for demonstrations) fix the next steps.

Lemma 1 (Centrality of Pm,n. Lemma L .1).

The projector Pm,n = ΠR(S) ΠB(T) is central in AV4.

Lemma 2 (Tensor decomposition Lemma L .2).

Let pV be any vertex of a two–coloring and define VR = {vV\{p} | {p,v} red} and VB = {vV\{p} | {p, v} blue}. In the2 × ℤ2-graded Majorana algebra AV4 one has the canonical graded tensor product AV4[V\{p}]AV4[VR]^AV4[VB]{A_{{V_4}}}[V\backslash \{ p\} ] \simeq {A_{{V_4}}}\left[ {{V_R}} \right]\hat \otimes {A_{{V_4}}}\left[ {{V_B}} \right].

The factorization AV4(V\{p}) ≃ AV4(VR) ⊗b AV4(VB) makes cross-edges invisible to ΠR and ΠB, which is the key ingredient in the graded Klein recursion. Majorana’s infinite-spin equation gives a ladder of fields with equally spaced mass–spin ratios. Algebraically, each step is an induction functor Rep(Gs) → Rep(Gs+1). Here Rep(Gs) denotes the (rigid, monoidal) category of complex representations of the symmetry group Gs at rung s of the Majorana tower; its objects are Gs–modules (V, ρ) and its morphisms are intertwiners T : VW with Tρ(g) = ρ′(g)T for all gGs. The step ss+1 is modeled by induction along is : GsGs+1, sending V ∈ Rep(Gs) to IndGsGs+1V[ Gs+1 ][ Gs ]VV \in {\mathop{\rm Rep}\nolimits} \left( {{G_s}} \right)\ to\ {\mathop{\rm Ind}\nolimits} _{{G_s}}^{{G_{s + 1}}}V \simeq \left[ {{G_{s + 1}}} \right]{ \otimes _{\left[ {{G_s}} \right]}} (equivalently U(𝔤s+1)U(𝔤s) V), which algebraically realizes the Majorana infinite–spin ladder’s equally spaced mass–spin progression.

Each graph vertex i is realized as a pair of Majorana modes γ2i–1, γ2i, and set Γij = i γ2i–1 γ2j–1, which is odd under one ℤ2 (fermion parity) and even under the other (boson parity), matching the Klein grading. Each glue adds a new Majorana pair, and the tower’s induction gives a canonical lift of representations as vv + 1. Recursive gluing inside the algebra 𝒜V4 is described the relationship in Eq. (7) from the definition of the algebraic graded Ramsey numbers, RV4(m, n).

From the Majorana-tower construction one labels vertices by tower indices 1 ≤ ℓ ≤ RV4(m, n). At level we create a mode pair (a, b). Equation (7) is realized algebraically by mapping aaRV4(m–1, n) or aa{\mathop{\rm Ind}\nolimits} _{{G_s}}^{{G_{s + 1}}}V \simeq \left[ {{G_{s + 1}}} \right]{ \otimes _{\left[ {{G_s}} \right]}}V otherwise, and analogously for b. This “dagger flip’’ constitutes the gluing that concatenates two smaller cliques without leaving the algebra and Γj(±)(γj(0)±γj(1))/2{a_\ell } \mapsto a_\ell ^\dag , so that (Γj(+),Γj())\Gamma _j^{( \pm )} \equiv \left( {\gamma _j^{(0)} \pm \gamma _j^{(1)}} \right)/2 carry Klein charges (1, 0) and (0, 1), respectively. The edge operator of a k-vertex red clique is the normal-ordered monomial 9K^kR:=1i<jk(Γi(+))Γj(+),\left( {\Gamma _j^{( + )},\Gamma _j^{( - )}} \right) and analogously K^kB\hat K_k^{\rm{B}} with the replacement Γ(+) ↦ Γ(—). Let us assume the following convention. In (9) the product runs over unordered pairs {i, j}, and “normal-ordered” means the creation/annihilation factors are symmetrized so that K^kR\hat K_k^R is independent of the ordering up to graded signs; explicit ordering details are suppressed for brevity.

As each Γ(±) anticommutes with any operator of opposite Klein charge, every factor in (9) lies in the (1, 0) sector; consequently K^kR\hat K_k^R itself is homogeneous and commutes with the total charge operator Q=j(Γj(+)Γj(+)Γj()Γj())Q = \sum\limits_j {\left( {\Gamma _j^{( + )\dag }\Gamma _j^{( + )} - \Gamma _j^{( - )\dag }\Gamma _j^{( - )}} \right)} , ensuring that red and blue constructions never interfere. Examples of the application of this procedure with known Ramsey numbers are reported in Appendix A.

3.
Estimating R(5, 5)

Combinatorial determination of the diagonal Ramsey number R(5, 5) remains an open challenge, with the best constructive bounds 43 ≤ R(5, 5) ≤ 46. Instead of applying brute force calculation we give an estimate with a different method based on ℤ2 × ℤ2–graded algebras with random projector diagnostics and Majorana algebra reduction. This approach gives the probability that a certain value expected for R(5, 5) be favored with respect to other possible values. We select the smallest charge-zero submodule that supports all pair projectors and glue operations for n ∈ {43, 44, 45, 46}. We include n = 43 as a below-threshold control: 43 is the classical constructive lower bound (i.e., 43 ≤ R(5, 5) ≤ 46), hence R(5, 5) ≠ 43; we use n = 43 only as a non-critical test instance to validate the diagnostics.

For R(5, 5), the dimension of the algebraic module is d = 24, the dimension of the reduced Majorana module, set by the structure of the Majorana tower and the graded sectors. Empirically, increasing d did not alter the decisions of the diagnostics on these instances.

Embedding edge–colorings inside the ℤ2 × ℤ2-graded Majorana paraparticle algebra turns forbidden monochromatic cliques into central projectors, thereby enabling spectral criteria to signal when no admissible coloring survives on v vertices. We introduce two families of random projectors, exponential and linear, and show that they act as numerical order parameters whose singular behavior isolates the putative threshold at n = 45 for d = 24. For a fixed vertex count n we sample k random unit vectors { vj }j=1kd\left\{ {{v_j}} \right\}_{j = 1}^k \subset {^d} and define the exponential operator 10Pexp(α)=exp[ αj=1kvjvj ],{P_{\exp }}(\alpha ) = \exp \left[ { - \alpha \sum\limits_{j = 1}^k {{v_j}} v_j^ \top } \right] with vj random unit vectors in ℝd and α a suppression parameter, and the linear operator 11Plin=j=1k(Ivjvj),{P_{{\mathop{\rm lin}\nolimits} }} = \prod\limits_{j = 1}^k {\left( {I - {v_j}v_j^ \top } \right)} which is a product of rank-1 deflations, a linear deflation operator.

The exponential map instead, defines 12T(α)=Tr(eαA),T(\alpha ) = {\mathop{\rm Tr}\nolimits} \left( {{e^{ - \alpha A}}} \right) which is the exponential trace of the accumulator 13A=j=1kvjvj.A = \sum\limits_{j = 1}^k {{v_j}} v_j^ \top

Because these factors generally do not commute, Plin needs not be a projector a fortiori and can also have complex eigenvalues even when each factor is symmetric. Let AV4=gV4Ag{A_{{V_4}}} = { \oplus _{g \in {V_4}}}{A_g} be the ℤ2 × ℤ2–graded algebra and let M be a faithful AV4–module obtained from the standard Majorana tower. We fix the reduced charge-zero submodule M0M on which all degree-(0, 0) operators act, and all (1, 1) components vanish by Definition 1.

In our implementation we take dim M0 = d = 24. This because in the V4 = ℤ2 × ℤ2–graded Majorana model we impose the mixed–sector projection (all (1, 1) monomials are set to zero), so every forbidden–clique test is built from degree (0, 0) operators, the monochromatic pair projectors ΠijR/B\Pi _{ij}^{R/B} and their clique products ΠR/B(S), and therefore acts on the charge–zero module M0. For the diagonal case (5, 5) we choose M0 to be the smallest invariant block that simultaneously carries (i) all pair projectors for a K5 and (ii) the coproduct/glue operation Δ that lifts vv+1. Concretely, this block sits in the quadratic slice of the tower and decomposes as 14M01Λ2VRΛ2VBd,VRVB5,{M_0} \cong {\bf{1}} \oplus {\Lambda ^2}{V_R} \oplus {\Lambda ^2}{V_B} \oplus ,\quad {V_R} \cong {V_B} \cong {^5}, where Λ2 V• collects the off–diagonal, degree–(0, 0) bilinears for each color (the 10 red and 10 blue edge–slots of a K5), while 𝔡 is the diagonal degree–(0, 0) subspace spanned by global color–number quadratics and a traceless combination (one linear relation fixes the overall V4 charge). Hence 15dimM0=1+(52)10+(52)10+3diagonals =24.\dim {M_0} = 1 + \underbrace {\left( \matrix{ 5 \cr 2 \cr} \right)}_{10} + \underbrace {\left( \matrix{ 5 \cr 2 \cr} \right)}_{10} + \underbrace 3_{{\rm{diagonals }}} = 24.

Working in this reduced module keeps the data width at log2d =5\left\lceil {{{\log }_2}d} \right\rceil = 5 qubits while retaining all operators used by the randomized witnesses Pexp(α)=exp(αjvjvj){P_{\exp }}(\alpha ) = \exp \left( { - \alpha \sum\limits_j {{v_j}} v_j^ \top } \right) and Plin=j(Ivjvj){P_{{\mathop{\rm lin}\nolimits} }} = {\prod _j}\left( { - {v_j}v_j^ \top } \right). Empirically, enlarging M0 beyond d = 24 did not change the diagnostics (collapse of T(α) = Tr Pexp(α) and the peak of Tr Plin at n = 45), but only increases the qubit footprint; the sensitivity of the test scales through the factor k/d in the miss–probability bound Pmissekr/d.

The central operator for detecting forbidden monochromatic cliques is defined in Eq. (6). The coloring constraints and algebraic recursion are encoded as central elements, with traces acting as character formulas to efficiently probe the existence of colorings avoiding forbidden cliques.

For each unordered pair {i, j} we then define the monochromatic pair projectors ΠijR\Pi _{ij}^R and ΠijB\Pi _{ij}^B as in Eq. (9), and their finite products, with degree (0, 0) that preserves M0. The clique projectors are ΠR(S)=Π{i,j}SΠijR{\Pi _R}(S) = {\Pi _{\{ i,j\} \subset S}}\Pi _{ij}^R, ΠB(T)=Π{i,j}TΠijB{\Pi _B}(S) = {\Pi _{\{ i,j\} \subset S}}\Pi _{ij}^B acting on M0 and from Definition 1 all (1, 1) terms vanish. All diagnostics, linear and exponential projectors, traces and eigenspectra depend only on degree-(0, 0) operators, well defined on M0 and independent of any extension outside M0. Both projectors are generated from finite sums and products of these degree-(0, 0) operators, therefore act on M0 without reference to other charge sectors and act on the same irreducible module into which the Klein-graded edge generators eij of degree (1, 0) (red) or (0, 1) (blue) are represented.

The exponential projector Pexp(α) has suppression parameter of order α in Eq. (10) in ℝd, with d = 24, would be positive semidefinite for A Hermitian. In our graded construction the matrix A collecting rank-one directions arises from blocks that are not constrained to be selfadjoint (e.g., vv rather than vv*), so also in the case AA the method remains valid and Pexp can exhibit complex spectra. We distinguish the accumulator A of Eq. (13) from the linear deflation operator Plin of Eq. (11). They induce different witnesses: Tr Plin (deflation/product) and T(α) := Tr P of Eq. (12). A collapse of Tr Pexp to (numerical) zero is informative even though no positivity is assumed. The linear projector Plin probes residual rank via its trace and spectral radius.

Our procedure recalls Erdős’s probabilistic method: Erdős’s classic coin–flip proof chooses a uniformly random two–coloring of the edges of Kn; for any fixed k the expected number of monochromatic Kk is EX=(nk)21(k2){\rm{X}} = \left( \matrix{ n \cr k \cr} \right){2^{1 - \left( {\scriptstyle k \atop \scriptstyle 2} \right)}}, so if 𝔼X < 1 there exists a coloring with no monochromatic Kk, implying R(k, k) > n [7, 11, 15]. Our diagnostics are an algebraic–spectral analogue of this first–moment argument. The linear deflation Plin and the exponential map T(α) act on the charge–zero module M0, with i.i.d. isotropic directions vj. Under the same independence/isotropy assumptions, the probability that k random rank-1 tests miss an r–dimensional survivor subspace obeys Pmissekr/d; concomitantly, T(α) collapses as α grows once survivors vanish. This shared exponential decay (binomial in Erdős’s count; multiplicative contraction here in this deflation) explains why both viewpoints isolate the threshold n at which Ramsey obstructions are unavoidable. Erdős counts bad structures, while in the following we dissipate amplitude along random directions until any putative survivor subspace essentially vanishes. The common core is a first-moment/exponential-tail phenomenon: independence and isotropy produce multiplicative decay (coin flips kill monochromatic cliques in expectation; rank-1 deflations kill survivor dimensions in norm), which is why 16Pmissekr/d{P_{{\rm{miss}}}} \le {e^{ - kr/d}} mirrors the 2(k2){2^{\left( {\scriptstyle k \atop \scriptstyle 2} \right)}} granularity in Erdős’s estimate of missing an r–dimensional survivor subspace.

Erdős’s coin–flip lower bound takes X = ΣK 1[mono-Kk] and uses 𝔼X ≤ 1 to assert the existence of a good coloring; here our witnesses are Plin and T(α), with i.i.d. isotropic directions vj acting on the charge–zero module M0. Under the same independence/isotropy hypothesis used in the coin–flip model, a simple thinning argument gives the “miss” probability bound (k)(1rd)kekr/d(k) \le {\left( {1 - {r \over d}} \right)^k} \le {e^{ - kr/d}}, i.e., that k rank-1 tests miss an r-dimensional survivor. Replace each Haar vj by a discrete proxy v˜j{{\tilde v}_j} that equals a random basis vector ei with ℙ(ir) = r/d. If a test “hits” the survivor subspace whenever ir, then the probability ℙ to miss all k becomes the lower limit, Pmiss = (1 – r/d)k. Since the Haar model stochastically dominates this proxy in its overlap with any fixed r–plane, the discrete miss probability upper–bounds the continuous one, yielding (1 – r/d)kekr/d.

Equation (12) shows that T(α)=σ(A)eαλdμA(λ)T(\alpha ) = \int_{\sigma (A)} {{e^{ - \alpha \lambda }}} d{\mu _A}(\lambda ) can be seen as the Laplace transform of the spectral measure of A. In the same way that Chernoff/Markov bounds control ℙ{X > 0} via moment generating functions in Erdős’s method, the decay of T(α) controls the survival of small singular values of A. Under the i.i.d. isotropy model for the rank-one directions {vj}, a mean-field surrogate gives 17ET(α)deαλL,λLd dαlogT(α),T(\alpha ) \approx d{e^{ - \alpha {\lambda _L}}},\quad {\lambda _L} \approx - {{\rm{d}} \over {{\rm{d}}\alpha }}\log T(\alpha ), with λL a Lyapunov-type rate extracted from the slope of log T(α). So increasing α exponentially suppresses contributions from larger eigenvalues and accentuates spectral weight near the origin. When the clique constraints have percolated (no survivor subspace remains), T(α) collapses rapidly with α; empirically this occurs at n = 45 in our R(5, 5) study.

Conceptually, (17) is the spectral analogue of the first-moment threshold 𝔼[X] < 1 in Erdős’s coin-flip lower-bound argument for diagonal Ramsey numbers [7, 15] as schematized in the following paragraph 3.

What matches what (coin–flip ↔ projectors).

Coin-flip modelProjector model
Indicator 1[mono-Kk]residual rank direction in M0
First moment 𝔼X𝔼 T(α), 𝔼 Tr Plin
Independence of edge colorsi.i.d. isotropic vj
Counting (nk)\left( \matrix{ n \cr k \cr} \right) patternsk rank–1 probes k in d dimensions
𝔼X ≤ 1 thresholdT(α) ↓ 0 and Pmiss ≪ 1

Like the classical first–moment bound, these diagnostics are one–sided witnesses: they excel at detecting the onset of unavoidable structure but do not, by themselves, achieve the sharper lower bounds obtainable via the Lovász Local Lemma or second–moment/Janson techniques. We therefore quote Pmiss alongside the observed collapse of T(α) and the behaviour of Tr Plin to calibrate the strength of evidence. The exponential trace collapse and the (1 – r/d)k contraction are the projector-world avatars of Erdős’s first-moment argument.

Relation to classical bounds

In the classical approach there are two ways to reason about Ramsey numbers. The first one is the existence via probability, the first-moment (expectation) method counts the expected number of forbidden monochromatic cliques under a uniformly random two-coloring; if this expectation is < 1, then there exists a coloring with none of them (a lower bound). The second-moment method goes further by controlling fluctuations (variance) to show that such colorings occur with non-negligible probability. The Lovász Local Lemma (LLL) handles families of rare, mildly dependent “bad” events (e.g., many overlapping clique constraints) and guarantees existence when dependencies are local. Janson-type inequalities quantify how unlikely it is that none of a large, overlapping family of small subgraphs appear, often sharpening first/second-moment conclusions for sums of dependent indicators [7, 1921]. The second way is based on constructive and recursive bounds. Explicit colorings (circulant/Paley/Mathon-type) [2225] furnish strong lower bounds, while the classical Erdős recursion R(m, n) ≤ R(m–1, n) + R(m, n–1) supplies widely used upper bounds [8].

In our case, the random-projector maps we use are deliberately one-sided, a “first–moment–style” diagnostics: they test for the onset of unavoidable structure rather than constructing colorings. Algebraically, forbidden cliques become central operators acting on the reduced (charge-zero) module M0; our linear deflation Plin and exponential map Pexp(α) probe whether any survivor subspace remains. Under i.i.d. isotropic rank-1 tests in d dimensions, the chance to miss an r-dimensional survivor subspace obeys the familiar exponential tail Pmissekr/d, so increasing k/d drives a rapid “no-survivor” decision, an analogue of the first-moment decay in the coin-flip view. In short: classical first/second moment, LLL, and Janson reason about counts and dependencies of bad subgraphs; our spectral surrogates reason about the rank that those constraints leave behind.

Position on R(5, 5) and resources. Our diagnostics are consistent with the classical window 43 ≤ R(5, 5) ≤ 46 and single out n = 45 as the threshold within the graded-module model. This complements (not replaces) constructive lower bounds (circulant/Paley/Mathon) and the recursion-based upper bounds above. Operationally, working in the reduced module of dimension d = 24 keeps the data width to log2d =5\left\lceil {{{\log }_2}d} \right\rceil = 5 qubits (plus few ancillas), in stark contrast to direct edge-register encodings that would require one qubit per edge (e.g., 990 data qubits at n = 45). Our contribution is therefore a compact, algebraic–spectral surrogate that (i) acts entirely in the charge-zero module and (ii) trades explicit constructions for one-sided, high-confidence diagnostics aligned with the classical probabilistic intuition.

3.1.
Numerical Investigations: Results

Both methods were run in double precision with k = 100 random projectors varying α in the exponential case. Additional runs with higher values of k, up to k = 400 confirmed the results. We tested the exponential operator for α = 3, 5, 7, 10, 15, 20, 40. Choosing k = 100 with α = 20 and k = 400 with α = 40 one provides a reproducible balance between statistical resolution and numerical stability.

As reported in Table 2, numerical results show a peculiar behavior at n = 45. The traces and spectra of both projectors provide a numerical diagnostic for the “critical” Ramsey value R(5, 5) = 45. For the exponential projector, the trace TrPexp drops there exponentially faster to zero increasing α with respect to the other values. This reflects the system’s proximity to the Ramsey threshold, indicating that it is at this value where the random linear projector method most sensitively detects the transition between possible and impossible colorings making n = 45 the most promising candidate for R(5, 5). The spectrum of Plin instead develops a small peak in the real eigenvalue n = 45, with respect to the other values. This is the behavior expected when the projector algebra can no longer accommodate a two-coloring that avoids a red or blue K5. As expected, for n = 43, a limit value already excluded by existing constructions in the literature and n = 44, all diagnostics behave smoothly, indicating that the method does not yield false positives. At n = 46, the observables also show an exponential regime closer to zero of one order of magnitude with respect to 43 and 44, with decreasing values of the trace of the linear operator, consistent with the existence of admissible colorings and confirming that this value lies above the Ramsey threshold.

Table 2.

Trace of exponential projector, real and imaginary TrPexp, linear projector TrPlin, Minimum real eigenvalue of linear projector, Lyapunov exponent λL for each n at α = 20, k = 100, for n = 45 reports the best values obtained with α = 40 and k = 400. and slope of log10(Trace) vs. {α}. The symbol* = indicates a known limit solution for n = 5 used as test.

n43 *444546
TrPexp7.92 × 10−121.54 × 10−1210−289 ≈ 01.86 × 10−13
TrPlin0.2840.3600.4620.407
min Reλ–0.058–0.053–0.050–0.061
max Imλ±0.037±0.030±0.032±0.063
λL1.551.481.411.54
Slope–0.674–0.642–0.612–0.670

Another test is given by the Lyapunov exponent which quantifies the mean exponential rate at which a vector x is stretched or contracted under repeated multiplication. Under an i.i.d. isotropy assumption for the rank-one directions {vj} in the charge-zero subspace M0, the exponential projector admits the mean-field estimate 𝔼 Tr Pexp(α) ≈ d e−αλL. By Oseledets’ multiplicative ergodic theorem, the (maximal) Lyapunov exponent λmax=limk1klog Plinx {\lambda _{\max }} = {\lim _{k \to \infty }}{1 \over k}\log \left\| {{P_{{\mathop{\rm lin}\nolimits} }}x} \right\| exists almost surely for i.i.d. factors [26]. The linear projector chain Plin of Eq. (11), measures how rapidly directions in the space are suppressed as additional rank-1 projectors are applied. Concretely, after k projectors the relevant norm is Plinx \left\| {{P_{{\mathop{\rm lin}\nolimits} }}x} \right\|. The largest Lyapunov exponent is λL,max=limk1kE[ log Plinx ]{\lambda _{L,\max }} = {\lim _{k \to \infty }}{1 \over k}\left[ {\log \left\| {{P_{{\mathop{\rm lin}\nolimits} }}x} \right\|} \right]. As each factor removes one random one-dimensional component, λL,max is typically negative in a d-dimensional space, meaning an overall contraction.

The estimate with Random Rank-1 Projectors proceeds taking each projector IvjvjTI - {v_j}v_j^{\rm{T}}, which removes the component along vj. For a random unit vector x, the expected reduction is E[ (IvjvjT)x 2 ]=11/d\left[ {{{\left\| {\left( {I - {v_j}v_j^{\rm{T}}} \right)x} \right\|}^2}} \right] = 1 - 1/d. After k steps, this becomes E[Plinx2]=(11/d)k[{P_{lin}}x{^2}] = {\left( {1 - 1/d} \right)^k}. For large dlogE[ Plinx 2 ]=klog(11/d)k/dd\log \left[ {{{\left\| {{P_{{\mathop{\rm lin}\nolimits} }}x} \right\|}^2}} \right] = k\log (1 - 1/d) \approx - k/d. The Lyapunov exponent is thus approximately in a mean-field estimate, assuming isotropy, λL ≈ –1/(2d). For d = 24, k = 100, the suppression factor is λL ≈ 0.015, so the norm is suppressed by about two orders of magnitude and the trace even more due to minimum eigenvalue directions. To quantitatively assess the rate of exponential suppression of the trace with respect to a, we also compute the slope of logω (Trace) versus the values of the suppression parameter α ≤ 20, for each candidate Ramsey value {n} = {44, 45, 46} and the limit value 43. The slope is estimated by performing a linear regression on the calculated data points (α, log10(Trace)) for each n, log10(Trace) = const + λL · α, where the suppression per unit α is numerically equal to the slope, providing the empirical Lyapunov exponents for each n. The highest Lyapunov exponent is found for n = 43 and for n = 46. The smallest value is for n = 45. For d = 24, k = 400, and α = 40, the eigenvalues of A=j=1400vjvjTA = \sum\nolimits_{j = 1}^{400} {{v_j}} v_j^{\rm{T}} concentrate at λi ≈ 400/24 ≈ 16.67. The trace of the exponential projector is Tr Pexp ≈ 2.4 × 10−289 ≈ 0, which is, for all practical purposes, zero.

Both diagnostics therefore single out n = 45 as the unique point where the Gibbs weight of legal colorings (exponential projector) is minimal with trace of the exponential projector, the contraction rate of random projections (Lyapunov exponent) is minimal in magnitude. The concordance between Lyapunov exponents and projectors in Table 2 strongly supports the hypothesis R(5, 5) = 45. A deeper discussion on statistical-confidence analysis for the random-projector can be found in Appendix B.

4.
Prime-Sequence Numbers of Order k Roadmap for the Diagonal Ramsey Numbers

As a complement to the statistical method, we introduce a heuristic approach to diagonal Ramsey numbers based on finite sequence of prime numbers.

The use of primes is not new for the estimate of Ramsey numbers, starting from Calkin–Erdős–Tovey, who developed the prime-order cyclic graphs: both a refined probabilistic analysis (via distinct–difference counts in cyclic colorings) and exhaustive computation show primes enjoy a structural edge over composites in the cyclic search space, explaining why many record lower bounds arise at prime orders, supplying theory and computation. They showed that prime orders empirically outperform composite orders for diagonal lower bounds and explained why standard expectation arguments are insufficient without this cyclic structure of primes. Their colorings all arise from cyclic graphs on a prime number of vertices.

A ubiquitous way to certify lower bounds for Ramsey numbers is to give an explicit edge-coloring of Kn that avoids a forbidden monochromatic Kk. If such a coloring exists on n vertices, then R(k, k) ≤ n + 1. The most successful explicit colorings for several diagonal and multicolor cases come from circulant (cyclic) graphs of prime order and from Paley–Cayley graphs and Paley generalized graphs with Mathon’s cyclotomic construction for which R(7, 7) ≥ 205, while generalized–Paley/Mathon–type constructions also give R(9, 9) ≥ 565 and a direct Paley instance gives R(10, 10) ≥ 798. This “prime/circulant/Paley’’ program remains active: recent work tightens Mathon-type machinery (including directed Paley analogues) and updates multicolor/diagonal records, while dynamic surveys track the current best explicit bounds. [11, 2224, 27, 28].

Our method instead uses a string of limited sequence of prime numbers before the expected value of the number R(n, n) to estimate the magnitude of a diagonal Ramsey number.

Definition 2 (Prime-sequence numbers of order k).

Fix an integer k ≥ 1 and let 𝒫k := {p1, p2, …, pk} = {2, 3, …, pk} denote the first k prime numbers. A positive integer q is called a prime-sequence number of order k if it satisfies 18q=pkknpvp with vp{0},#{pvp>0}n,maxpvpn.\matrix{ {q = \prod\limits_{p \in {\cal P}_k^n} {{p^{{v_p}}}} \quad {\rm{ with }}\quad {v_p} \in \cup \{ 0\} ,} \hfill \cr {\# \left\{ {p\mid {v_p} > 0} \right\} \le n,\quad \mathop {\max }\limits_p {v_p} \le n.} \hfill \cr }

In other words all prime divisors of q belong to the first k primes; at most n distinct primes actually occur in the factorisation and no prime exponent exceeds n. We write PSkn{\rm{PS}}_k^n for the set of all prime-sequence numbers of order k and n factors. Here we adopt n = 3.

Example 1.

45=32×5PS5345 = {3^2} \times 5 \in {\rm{PS}}_5^3, because its primes {3, 5} lie in {2, 3, 5, 7, 11} and the largest exponent is 2. Another is 46=2×23PS8346 = 2 \times 23 \notin {\rm{PS}}_8^3 (order 8 ends at 19), but 46 ∈ PS9 because 23 enters at the ninth prime. 24×7=112PSk3{2^4} \times 7 = 112 \notin {\rm{PS}}_k^3 for any k, since the exponent of 2 exceeds the allowed bound 3.

This notion generalises the classical primorial Pk = p1pk by allowing limited repetition of the smallest primes while still forbidding any appearance of primes beyond pk prime-sequence numbers of order k provide a sparsely factorised test bed for extrapolating diagonal Ramsey values without introducing unconstrained large prime factors.

Motivated by the algebraic–spectral evidence that singled out R(5, 5) = 45 = 32 × 5, we extrapolate the next diagonal values by constraining each R(n, n) to be a prime-sequence numbers of order 6 – a positive integer whose prime decomposition involves only the first six primes {2, 3, 5, 7, 11, 13} and whose growth ratio R(n, n)/R(n – 1, n – 1) remains in the measured corridor 2 ≲ ratio ≲ 3. This yields the interesting compact sequence related to the diagonal Ramsey numbers 19{R(1,1),,R(7,7)}{1,2,6,18,45,102R(6,6)160,205R(7,7)492},\{ R(1, 1), \ldots ,R(7, 7)\} \to \{ 1,2,6,18,45,102 \le R(6, 6) \le 160,205 \le R(7, 7) \le 492\} , with factorizations 1, 2, 2 × 3, 2 × 32, 32 × 5, 23 × 13, 3 × 7 × 11, …, respectively. The candidate R(6, 6) → 104 satisfies the rigorous bounds 102 ≤ R (6, 6) ≤ 165, while R(7, 7) → 231 lies inside the constructive window 189 ≤ R(7, 7) ≤ 4749.

Because the ansatz propagates the Erdős’ recursion R(m, n) ≤ R(m – 1, n) + R(m, n – 1) without overshooting any known upper bound, it furnishes a minimal, prime-structured scaffold against which future numerical or constructive proofs can be benchmarked. Let us estimate how far are Ramsey diagonal numbers R(k, k) from their prime-sequence numbers of order k – 1.

Prime-structured extrapolation of the diagonal Ramsey numbers

Let 𝒫6 = {2,3,5,7,11,13} denote the first six primes. We call an integer q prime-sequence numbers of order k = 6(PS6) if its prime factorisation involves only primes from 𝒫6, i.e. q=pP6pvpq = \prod\nolimits_{p \in {{\cal P}_6}} {{p^{{v_p}}}} with vp ∈ ℕ ∪ {0}. Starting from the exact values R(1, 1) = 1, R(2, 2) = 2, R(3, 3) = 6, R(4, 4) = 18 and the algebraic–spectral result R(5, 5) = 45 obtained in the main text, we impose two constraining axioms we use as basic assumptions.

Axiom I: Prime-sequence numbers of order k constraint: R(n,n)PSk3:={qq prime-sequencenumbers of order k}R(n,n) \in {\rm{PS}}_{\rm{k}}^3: = \{ q \in \mid q{\rm{ prime - sequence}}\;{\rm{numbers of order }}k{\rm{\} }}.

Axiom II: Moderate–growth corridor:

R(n,n)R(n1,n)2{{R(n,n)} \over {R(n - 1,n)}} \le 2 for every n ≥ 2.

Since R(m, n) ≤ R(m – 1, n) + R(m, n – 1) for all m, n ∈ ℕ[1], the diagonal case is obtained setting m = n gives R(n, n) ≤ R(n – 1, n) + R(n, n – 1) and by symmetry R(n – 1, n) = R(n, n – 1) thus, the direct bound is R(n, n) ≤ 2R(n – 1, n).

Clarifying the “moderate–growth’’ ratios

For each n ≥ 2 define the off–diagonal ratio 20ρn:=R(n,n)R(n1,n).{\rho _n}: = {{R(n,n)} \over {R(n - 1,n)}}.

Because Erdős’ recursion forces ρn ≤ 2, one can check whether the known data and rigorous bounds stay inside that “moderate–growth corridor’’ ρn ∈ [1, 2].

The first three diagonals are known exactly and satisfy ρn ≤ 2. (n = 2): R(2, 2) = 2, R(1, 2) = 1 ⇒ æ2 = 2, (n = 3): R(3, 3) = 6, R(2, 3) = 3 ⇒ æ3 = 2, (n = 4): R(4, 4) = 18, R(3, 4) = 9 ⇒ æ4 = 2.

For n = 5 we know 43 ≤ R(5, 5) ≤ 46 and R(4, 5) = 25, 1.72=4325<ρ54625=1.84.1.72 = {{43} \over {25}} &#x003C; {\rho _5} \le {{46} \over {25}} = 1.84. n = 6. Current bounds are 102 ≤ R(6, 6) ≤ 160 and 59 ≤ R(5, 6) ≤ 85 [11], whence 1.20=10285ρ616059=2.71.1.20 = {{102} \over {85}} \le {\rho _6} \le {{160} \over {59}} = 2.71.

The extreme combination 160/59 would violate ρ6 ≤ 2, but that pairing uses the loosest numerator with the loosest denominator. Matching either both lower or both upper bounds gives 102/59 = 1.73 and 160/85 = 1.88, so every empirically plausible value of ρ6 lies below 2.

With n = 7 the limits are 205 ≤ R(7, 7) ≤ 492 and 115 ≤ R(6, 7) ≤ 270 [11, 29] and we obtain 0.76=205270ρ7492115=4.28,0.76 = {{205} \over {270}} \le {\rho _7} \le {{492} \over {115}} = 4.28, while the matched bounds 205/115 = 1.78 and 492/270 = 1.82 again fall well inside the corridor.

The exact ratios through n = 4 are all ρn = 2. For n = 5 the interval [1.72, 1.84] sits comfortably below 2. For n = 6, 7 the widest theoretical cross-bounds still allow a violation, but every combination that pairs consistent lower and upper estimates (e.g. 102/59, 160/85 for n = 6) yields ρn < 2.

These observations motivate Axiom II 4: ρn ≲ 2 (“moderate–growth corridor”). Under this axiom, Eq. (19) extends Erdős’ recursive upper bound into a working heuristic for the unknown diagonals R(6, 6) and R(7, 7) while remaining consistent with all currently published data.

Enforcing the prime-sequence numbers of order k factorization rules (no more than three distinct primes and no exponent above 3) under axioms I 4– II 4 the factorization with prime-sequence numbers of orders 5 ≤ k ≤ 13 are reported in Table 3.

Table 3.

Prime–sparse extrapolation of the unknown diagonal values R(6, 6) and R(7, 7).

Ramsey N.𝒫5𝒫6𝒫7𝒫8𝒫9𝒫10𝒫11𝒫12𝒫13
R(6, 6)108108117117115115115111111
R(7, 7)225216221209209209209209205

There, each column uses the first k primes 𝒫k = {2, 3, …, pk} to build the sparsest admissible factorisation (at most three distinct primes, each exponent ≤ 3). Boldface marks the persistent values that remain unchanged from their first appearance up to the cut-off k = 2n – 1 (k = 11 for n = 6, k = 13 for n = 7).

Allowing more primes generally lowers the sparsest admissible R(6, 6) = 115 and R(7, 7) remains 209 in most schemes. A plateau indicates robust guesses. The criterion is persistence, we accept as the provisional diagonal value’ the integer that remains constant up to the largest admissible prime basis kmax(n) = 2n – 1. Anyway, as 45 already factors within the smallest prime basis 𝒫5 = {2,3,5,7,11} and satisfies the sparsity rule (≤ 3 primes, maxVp = 2), it would be still present in the table until 𝒫9 where the prime 23 gets in defining 46 = 2 × 23. From our results the prime-sequence numbers of order k for a diagonal Ramsey number R(n, n) is limited by k < 2n – 1 and this disfavors the other value, 44 and 46 is controversial.

When cut-off rule k ≤ 2n – 1 applied to R(5, 5), for the diagonal n = 5 the prime basis is allowed to grow only up to kmax = 2n – 1 = 9. The prime-sequence numbers of order k criteria are fewest distinct primes, smaller numerical value. If we add also smallest maximal exponent we obtain the results in Table 4,

Table 4.

R(5, 5) under k ≤ 2n – 1 (n = 5): admissible and selected values as 𝒫k grows.

R(5, 5) kprime setadmissible integers in (43, 46]
3–5𝒫3–5 (11 and 23 absent)45 = 325
5–8𝒫5–8 (11 present)44 = 2211, 45 = 325
9(23 present)44, 45, 46 = 2·23

When the ninth prime 23 becomes available, the factorisation 46 = 2 × 23 has the same number of distinct primes as 44 and 45 but a strictly smaller maximal exponent (1 ≤ 2) gives also evidence to 46. Because k = 9 already saturates the cut-off, no larger prime basis may undo this choice.

The fact that R(6, 6) = 115 and R(7, 7) = 209 survive every prime basis from 𝒫9 to 𝒫11 suggests they are the most stable predictions of the prime-sequence numbers of order k framework. More in detail, for R(6, 6) the optimum sequence drops from 108 (prime set 𝒫5) to 117 once 13, 17 are allowed and settles at 115 with the inclusion of 23; the next prime, 37, immediately produces 111 = 3 × 37, after which no further prime extension changes the result. The value R(6, 6) = 115 is stable for every prime set from k = 9 up to the cut-off kmax = 11. It only changes to 111 when k = 12 > kmax; hence R(6, 6) = 115 is the persistent choice.

For R(7, 7) the value moves from 225 → 221 → 209 as soon as 19 enters the basis, and then remains pinned at 209 = 11 × 19 for six consecutive prime sets (𝒫8–𝒫12); only the arrival of 41 in 𝒫13 lowers the prediction to the window floor 205 = 5 × 41. The integer 209 persists throughout the entire plateau k = 8–12, but the cut-off for n = 7 is kmax = 13; at that very last step the sparser factorisation 205 = 5 × 41 appears and becomes the new stable value. Therefore R(7, 7) = 209 is selected.

Thus the persistence principle reproduces the same predicted diagonals derived earlier with the explicit k ≤ 2n – 1 cut-off suggesting R(5, 5) = 45, R(6, 6) = 115 and R(7, 7) = 209. All three numbers continue to satisfy the sparsity axiom (no more than three primes, each exponent ≤ 3) and the tightened growth corridor R(n, n) ≤ 2 R(n – 1, n) and are defined within their known ranges.

Any constructive colouring at v < 111 or v < 209 would falsify the present sparsity hypothesis, whereas exhaustive elimination of colourings at v = 111 or v = 205/209 would push the rigorous lower bounds upward and force a narrower theoretical window.

The prime-sequence numbers of order k framework suggests the reduction of the enormous search space to a tiny target set, beyond the value R(6, 6) = 117 giving 21R(6,6){108,111,115},R(7,7){205,209}.R(6, 6) \in \{ 108,111,115\} ,\quad R(7, 7) \in \{ 205,209\} .

Exhaustive two-coloring searches should therefore be concentrated on these vertex counts. If none of these candidates admit a valid coloring, then either the prime-sequence numbers of order k sparsity hypothesis, valid empirically for the known diagonal Ramsey numbers, (no more than three distinct primes with exponents ≤ 3) or the tightened growth corridor R(n, n) ≤ 2 R(n – 1, n) must be reconsidered or weakened.

Conversely, the discovery of a single explicit coloring with R(6, 6) < 111 or R(7, 7) < 209 would break the current sparsity barrier, indicating that even the lightest admissible diagonal values require either more than three primes or a prime exponent exceeding the number bound 3. If, instead, exhaustive computation rules out all admissible colorings at v = 111 and v = 209 (or 205), the rigorous lower bounds would rise, forcing a narrower theoretical window and further constraining the allowable growth corridor.

This therefore supplies a minimal, fully factorized scaffold for future constructive or computational attacks on R(6, 6) and R(7, 7): any refutation must either break the prime sequence numbers of order k condition in Axiom I or force a growth ratio outside the empirical corridor in Axiom II. We regard this as a heuristic scaffold for targeting constructive searches; it does not constitute a proof or bound by itself.

5.
Quantum Computation for R(5, 5) and Beyond

In contrast to the standard edge–register approach that needs one logical qubit per edge (already 136 qubits to verify R(4, 4) and 780 qubits for a 40-vertex scan), our Klein-graded random-projector method for R(5, 5) operates entirely in the reduced charge-zero module M0 of dimension d = 24, i.e., only log2d =5\left\lceil {{{\log }_2}d} \right\rceil = 5 data qubits (plus a few ancillas), and is therefore practical on today’s low-qubit quantum hardware.

In the graded–algebra diagnostic, the reliability of the decision at a target diagonal R(n, n) is governed primarily by the ratio k/d between the number of sampled projectors and the ambient dimension. For practical scans one should pick d as large as possible without increasing the data–qubit count (e.g., d ≤ 32 for five data qubits; if conditioning requires it, move to d = 48 but scale k accordingly), and then set k to meet a prescribed miss–probability threshold. Writing r for the surviving rank inside the charge–zero module, the miss probability obeys an exponential tail, so that keeping k/d above a simple logarithmic threshold in the target error ε suffices. In short, fix (d, r, ε) and select k so that k/d exceeds the rule-of-thumb bound below; this preserves the hardware footprint while making the peak / collapse witnesses (Tr Plin and Tr Pexp(α)) increasingly sharp as n grows.22Pmissekr/dkdln(1/ε)r{P_{{\rm{miss}}}} \le {e^{ - kr/d}} \Rightarrow {k \over d} \ge {{\ln (1/\varepsilon )} \over r} a conservative estimation gives k/d ≳ 2 ln(1/ε)/r.

The parameter d represents the ambient dimension of the charge–zero module M0, and it should not be regarded as a function of the Ramsey parameter n. Rather, d is fixed as the maximal width that does not increase the data–qubit register, thus preserving hardware feasibility. The strength of the diagnostic witnesses (Tr Plin and Tr Pexp) is governed by the ratio k/d, since the miss probability obeys an exponential tail Pmissekr/d, with r the surviving rank.

The bound on Pmiss follows from the exponential tail of the binomial approximation and can be strengthened by a Chernoff estimate. Since the exponent scales with kr/d, the relevant parameter is the ratio k/d, not the absolute size of d. Moreover, increasing d beyond the threshold that leaves the data–qubit count unchanged does not alter the rank structure of M0. Hence, for fixed r, the diagnostic accuracy is improved only by enlarging k/d, demonstrating that d need not track the combinatorial parameter n.

Consequently, increasing d beyond the qubit threshold yields no benefit, while reliability is improved chiefly by scaling k/d. This decouples the quantum resource cost from the combinatorial size n of the Ramsey instance.

We now implement on quantum hardware the two scalar diagnostics introduced earlier, the linear/spectral witness Plin and the exponential trace T(α) = Tr Pexp(α), to decide whether any survivor subspace remains at a given vertex count v. We purposely keep A and Plin separate: Tr Plin tracks residual rank after random deflations, while T(α) contracts in every positive real direction of A, yielding complementary order parameters. We work entirely in the charge-zero module M0 of the V4-graded construction; in our runs for R(5, 5) this had d = 24, which maps to a 5-qubit data register (plus two ancillas for block encoding and estimation).

5.1.
From Classical Diagnostics to Quantum Estimators

Ramsey numbers were evaluated with classical methods calculating the two scalar witnesses, the linear and exponential projectors Plin and Pexp(α), with A=j=1kvjvjA = \sum\nolimits_{j = 1}^k {{v_j}} v_j^ \top and declare v, as before, “critical” when Tr Plin peaks while T(α) := Tr Pexp(α) collapses. To translate these operations in the language of quantum computing we build the equivalents of the previous equations in terms of quantum circuits.

On a quantum device we reproduce the same scalars by two identities: first the Hutchinson identity E|rr|F|r=1d{_{|r\rangle }}\langle r|F|r\rangle = {1 \over d}{\mathop{\rm Tr}\nolimits} F Tr F, for any linear operator F ∈ ℂd×d on the data register M0 with dimension d = 24 and a random quantum state |r〉 from a unitary 2-design, so that dr|F|r〉 is an unbiased trace estimator; F can be either Plin or Pexp. On hardware is applied a block-encoding UF on data+ancillas with (0a|I )UF(|0aI)=F/α0\left. {\langle {0^a}| \otimes I} \right){U_F}\left( {\left| {{0^a}} \right\rangle \otimes I} \right) = F/{\alpha _0}, so the Hutchinson estimator averages r|F|r=TrF/d(orTrF/(α0d))\langle r|F|r\rangle = {\mathop{\rm Tr}\nolimits} F/d\left( {{\mathop{\rm or}\nolimits} {\mathop{\rm Tr}\nolimits} F/\left( {{\alpha _0}d} \right)} \right). The Hermitian dilation H of Eq. (23), for which H2 = diag(AA, AA) and σ(H)={ ±σi(A) }i=1d\sigma (H) = \left\{ { \pm {\sigma _i}(A)} \right\}_{i = 1}^d gives the second identity. Thus phase estimation on eitH accesses the singular spectrum of A, while Hadamard tests over a random |r〉 estimate Tr Plin and T(α) coherently. In our d = 24 charge-zero module this uses the ceiling log2d =5\left\lceil {{{\log }_2}d} \right\rceil = 5 data qubits plus few ancillas for R(5, 5).

5.2.
Hardware-Ready Benchmark Protocol

Inputs. Fix (d, k, α, seed), reuse the same random unit directions { vj }j=1kd\left\{ {{v_j}} \right\}_{j = 1}^k \subset {^d} used classically to build the accumulator A=j=1kvjvjA = \sum\nolimits_{j = 1}^k {{v_j}v_j^ \top } on M0. We reuse the diagnostics defined earlier: Pexp(α) and Plin in Eqs. (10)–(11).

Measurements. By the Hutchinson identity, for any implementable linear map F, E|rr|F|r=TrF/d{_{|r\rangle }}\langle r|F|r\rangle = {\mathop{\rm Tr}\nolimits} F/d when |r〉 is drawn from a unitary 2-design; we realize 〈r|F|r〉 by a Hadamard test and average over random |r〉’s (a more detailed discussion for the estimation of traces and spectra is written in Section 5.4). Amplitude estimation reduces the shot complexity from O(1/ϵ2) to O(1/ϵ) for additive error ϵ.

We estimate three quantities: (i) the scalar trace witness Tr Plin, (ii) the exponential trace T(α) = Tr Pexp(α), and (iii) a spectral surrogate via the Hermitian dilation in Eq. (23).

Report (Tr Plin, T(α), ρ(α)) with (d, k, α, seed) and the decision flag for each tested v.

We assume as decision rule the following: for a fixed d, k, α (and random seed), declare a vertex count v critical if: (i) Tr Plin attains a local maximum at v; (ii) T(α) = Tr Pexp(α) collapses (numerically ≈ 0) at the same v; (iii) the spectral proxy (e.g. A2=ρ(H)A{_2} = \rho (H) by phase estimation) is locally extremal at v; and an explicit AM-46 control remains non-critical under the same thresholds.

5.3.
Compilation Primitives (Two Interchangeable Tracks)

Track Q (qubitized block-encoding). Track Q uses qubitized block-encodings; Track M uses Majorana/matchgate Gaussian primitives to realize the degree (0, 0) projectors directly in M0. Both tracks operate in the same d = 24 module. As A is generally complex-symmetric (not Hermitian) when built from vv, we embed A into the Hermitian dilation 23H=(0AA0),A2=ρ(H),H = \left( {\matrix{ 0 & A \cr {{A^\dag }} & 0 \cr } } \right),\quad A{_2} = \rho (H), and access spectral surrogates by phase estimation on eitH. Because σ(H) = {±σi(A)}, a local maximum of the extracted A2=ρ(H)A{_2} = \rho (H) at the same v that triggers (i)–(ii) is expected. Here A2A{_2} is the spectral norm (largest singular value) and ρ(H) is the spectral radius of the Hermitian dilation H; since σ(H) = {±σi(A)}, we have ρ(H)=maxiσi(A)=A2\rho (H) = {\max _i}{\sigma _i}(A) = A{_2}. The hardware surrogate for eaA is so settled.

As A from vv is typically non-normal, we implement functions of the dilation H (or of H2) via block-encoding/qubitization. Notably H2 = diag(AA, AA) implies TreβH2=TreβAA+TreβAA=2TreβAA.{\mathop{\rm Tr}\nolimits} {e^{ - \beta {H^2}}} = {\mathop{\rm Tr}\nolimits} {e^{ - \beta A{A^ + }}} + {\mathop{\rm Tr}\nolimits} {e^{ - \beta {A^ + }A}} = 2{\mathop{\rm Tr}\nolimits} {e^{ - \beta A{A^ + }}}.

Estimating Tr eβH2 therefore serves as a stable surrogate for the exponential witness T(α): both are monotone in the singular values of A and collapse precisely when the survivor subspace vanishes. This also gives a block-encoding of A enabling polynomial/Fourier approximants to f(A) ∈ {A, eαA}.

Operators used in Figure 1 (Phase Estimation on the Hermitian Dilation). Accumulator A:=j=1kvjvjA: = \sum\limits_{j = 1}^k {{v_j}} v_j^ \top is the complex-symmetric rank-k sum built from the random directions vj ∈ ℂd in the charge-zero module M0 (with dimension d).

Figure 1.

Phase-estimation on the Hermitian dilation H (Eq. (23)) to estimate its extremal eigenphase(s), hence ρ(H)=A2\rho (H) = A{_2}. The “sig” (sign) qubit together with the data register realizes the 2d-dimensional space of the dilation; the PE register controls powers of eitH. A local maximum of the extracted A2A{_2} at the critical v complements the linear/exponential diagnostics.

Hermitian dilation H (Eq. (23)); its spectrum is σ(H)={ ±σi(A) }i=1d\sigma (H) = \left\{ { \pm {\sigma _i}(A)} \right\}_{i = 1}^d, so A2=ρ(H)A{_2} = \rho (H).

Controlled evolutions. The string c-ei2tH denotes the standard phase-estimation controlled unitaries at time-steps 2t, followed by the inverse QFT on the PE register to read out the extremal eigenphase(s), hence A2A{_2}. The prefix “c-” denotes a standard single–qubit controlled gate: cU:=(|00|)ctrlIdata+(|11|)ctrlUdatac - U: = (|0\rangle {\langle 0|)_{{\rm{ctrl}}}} \otimes {I_{{\rm{data}}}} + (|1\rangle {\langle 1|)_{{\rm{ctrl}}}} \otimes {U_{{\rm{data'}}}} so that cei2ktH=(|00|)kI+(|11|)kei2ktH.c - {e^{ - i{2^k}tH}} = (|0\rangle {\langle 0|)_k} \otimes I + (|1\rangle {\langle 1|)_k} \otimes {e^{ - i{2^k}tH}}.

Acting on a basis state |cm–1c0PEdata (with ck ∈ {0,1}), the whole product implements (k=0m1cei2ktH)(|c|ψ)=|cei(kck2k)tH|ψ,\left( {\prod\limits_{k = 0}^{m - 1} {\rm{c}} - {e^{ - i{2^k}tH}}} \right)(|c\rangle \otimes |\psi \rangle ) = |c\rangle \otimes {e^{ - i\left( {\sum\nolimits_k {{c_k}} {2^k}} \right)tH}}|\psi \rangle , i.e., a data–register evolution for a time proportional to the integer encoded by the control register. Because all factors are functions of the same H, they mutually commute on the data space, so their order is immaterial (though the circuit is usually drawn MSB→LSB to match the inverse QFT). Here m is the number of phase bits (PE precision) and t is the chosen base time step; in our setting H so that ρ(H)=A2\rho (H) = A{_2}, and these controlled evolutions are the core of the Hermitian-dilation phase–estimation block used as a spectral witness.

Registers. |0〉sig is the dilation’s sign qubit; |0PE m|0\rangle _{{\rm{PE }}}^{ \otimes m} is the m-qubit phase-estimation register; |ψdata is the d-dimensional data register (for R(5, 5), log2d =5\left\lceil {{{\log }_2}d} \right\rceil = 5 data qubits). We reserve HHad for the one-qubit Hadamard gate to avoid confusing it with the dilation H; where the symbol H appears inside Wj below it is the Hadamard gate.

Definition 3 (Block-encoding).

A unitary U on a + log2 d qubits is an (α, a) block-encoding of F ∈ℂd×d if (〈0a| ⊗ I)U(|0a〉 ⊗ I) = F/α. Given an (α, a) block-encoding of H with H1H \le 1, QSVT implements p(H) for any bounded odd/even polynomial p on [–1,1] using O (deg p) uses of U and U.

Rank-1 LCU for A. We write the One-ancilla rank-1 in Figure 2, A=jwj|ujvj|A = \sum\nolimits_j {{w_j}} \left| {{u_j}} \right\rangle \left\langle {{v_j}} \right|, prepare |uj〉, |vj〉 with unitaries Uj, Vj. A 1-ancilla block for a term is obtained with 24Wj:=(|00|Uj+|11|Vj)(HI),{W_j}: = \left( {|0\rangle \langle 0| \otimes {U_j} + |1\rangle \langle 1| \otimes {V_j}} \right)(H \otimes I), whose top-left block equals 12|ujvj|{1 \over 2}\left| {{u_j}} \right\rangle \langle {v_j}| (here H is the single-qubit Hadamard on the ancilla).

Figure 2.

One-ancilla rank-1 block-encoding gadget Wj implementing the top-left block 12|ujvj|{1 \over 2}\left| {{u_j}} \right\rangle \langle {v_j}| (see Eq. (24)). The multiplexor applies Uj when the ancilla is |0〉 and Vj when it is |1〉. Composing these gadgets with a single selector/index register and oblivious amplitude amplification yields an α0–block-encoding of A/α0=jwj|ujvj|/α0A/{\alpha _0} = \sum\nolimits_j {{w_j}} \left| {{u_j}} \right\rangle \langle {v_j}|/{\alpha _0}.

Operators used in Figure 2 (Rank-1 LCU block-encoding gadget). State-prepunitaries. Uj|0=|uj,Vj|0=|vj*{U_j}|0\rangle = \left| {{u_j}} \right\rangle ,{V_j}|0\rangle = \left| {v_j^*} \right\rangle prepare the rank-one factors (“*” appears when the complex-symmetric vv structure is used).

Multiplexor. MUX(Uj,Vj):=|00|Uj+|11|Vj{\mathop{\rm MUX}\nolimits} \left( {{U_j},{V_j}} \right): = |0\rangle \langle 0| \otimes {U_j} + |1\rangle \langle 1| \otimes {V_j}. This gate is a controlled selection, uniformly controlled unitary, that applies Uj to the data register when the one-qubit selector is |0〉, and Vj when the selector is |1:MUX(Uj,Vj)(|0|ψ)=|0Uj|ψ|1\rangle :{\mathop{\rm MUX}\nolimits} \left( {{U_j},{V_j}} \right)(|0\rangle \otimes |\psi \rangle ) = |0\rangle \otimes {U_j}|\psi \rangle and MUX(Uj,Vj)(|1|ψ)=|1Vj|ψ{\mathop{\rm MUX}\nolimits} \left( {{U_j},{V_j}} \right)(|1\rangle \otimes |\psi \rangle ) = |1\rangle \otimes {V_j}|\psi \rangle . In the selector computational basis it is block–diagonal, diag(Uj, Vj). A useful implementation identity is MUX(Uj,Vj)=(IUj)(c(UjVj)),{\mathop{\rm MUX}\nolimits} \left( {{U_j},{V_j}} \right) = \left( {I \otimes {U_j}} \right)\left( {{\rm{c}} - \left( {U_j^\dag {V_j}} \right)} \right), so the multiplexor can be built from one unconditional application of Uj on the data plus a single controlled unitary with target UjVju_j^ + {V_j}. The definition extends to an m-qubit selector as MUX({ Us }s{0,1}m)=s{0,1}m|ss|Us,{\mathop{\rm MUX}\nolimits} \left( {{{\left\{ {{U_s}} \right\}}_{s \in {{\{ 0,1\} }^m}}}} \right) = \sum\limits_{s \in {{\{ 0,1\} }^m}} | s\rangle \langle s| \otimes {U_{\rm{s}}}, which applies Us conditioned on the selector string s. In our block–encoding gadget of Fig. 2, choosing state–preparations Uj|0=|uj{U_j}|0\rangle = \left| {{u_j}} \right\rangle and Vj|0=|vj{V_j}|0\rangle = \left| {{v_j}} \right\rangle and combining MUX(Uj, Vj) with Hadamards on the selector yields the desired rank-one ancilla block used to assemble the linear combination of terms in the accumulator.

Gadget. Wj:=(|00|Uj+|11|Vj)(HHadI){W_j}: = \left( {|0\rangle \langle 0| \otimes {U_j} + |1\rangle \langle 1| \otimes {V_j}} \right)\left( {{H_{{\rm{Had}}}} \otimes I} \right) has top-left ancilla block 12|ujvj|{1 \over 2}\left| {{u_j}} \right\rangle \langle {v_j}|. From terms to A. With weights wj, a single selector register plus oblivious amplitude amplification yields an α0–block-encoding of A/α0=jwj|ujvj|/α0A/{\alpha _0} = \sum\nolimits_j {{w_j}} \left| {{u_j}} \right\rangle \langle {v_j}|/{\alpha _0}.

Functions of A. Using LCU/qubitization (or QSVT), polynomials/Fourier approximants realize flin(z) = z and fexp(z) = eαz, giving Plin and Pexp(α) coherently.

Operators used in Figure 2 (Rank-ILCU block-encoding gadget). State-prepunitaries. Uj|0=|uj,Vj|0=|vj*{U_j}|0\rangle = \left| {{u_j}} \right\rangle ,{V_j}|0\rangle = \left| {v_j^*} \right\rangle prepare the rank-one factors (“*” appears when the complex-symmetric vv structure is used). Multiplexor. MUX(Uj,Vj):=|00|Uj+|11|Vj{\mathop{\rm MUX}\nolimits} \left( {{U_j},{V_j}} \right): = |0\rangle \langle 0| \otimes {U_j} + |1\rangle \langle 1| \otimes {V_j}. Gadget. Wj:=(|00|Uj+|11|Vj)(HHadI){W_j}: = \left( {|0\rangle \langle 0| \otimes {U_j} + |1\rangle \langle 1| \otimes {V_j}} \right)\left( {{H_{{\rm{Had}}}} \otimes I} \right) has top-left ancilla block 12|ujvj|{1 \over 2}\left| {{u_j}} \right\rangle \langle {v_j}|. From terms to A. With weights wj, a single selector register plus oblivious amplitude amplification yields an α0–block-encoding of A/α0=jwj|ujvj|/α0A/{\alpha _0} = \sum\limits_j {{w_j}} \left| {{u_j}} \right\rangle \langle {v_j}|/{\alpha _0}.

Track M (Majorana-native). On the platforms that natively support Majorana bilinears (match–gate/fermionic – Gaussian hardware), all degree–(0, 0) pair and clique projectors reduce to even–parity checks on the data modes: each monochromatic pair projector ΠijR/B\Pi _{ij}^{R/B} and their products ΠR(S), ΠB(T) act entirely inside the charge–zero module M0 and commute with total parity, hence can be realized as parity–preserving projectors built from quadratic Majorana terms. Plin is then best read as a randomized mixture of parity checks: in the decomposition used in the text, Plin, each rank–one deflation removes amplitude along a random degree–(0, 0) direction (a linear combination inside the span generated by pair/clique checks) within M0, and the product effects the Hutchinson–style contraction that diagnoses the disappearance of survivors (Eq. (8)). By contrast, Pexp(α) is naturally implemented as repeated weak Gaussian projections in M0: Trotterize exp(δαvjvj)\left( { - \delta \alpha {v_j}v_j^ \top } \right) for small δα and cycle j = 1, …, k, which preserves the even–parity sector by construction (Eq. (7)). Because all operators used by the diagnostics are degree–(0, 0), they act only on M0 and are independent of the mixed (1, 1) sector projected out by the modeling axiom; thus the Majorana–native track realizes exactly the same witnesses as the generic qubit route while exploiting native parity checks. Cf. the definitions of the graded projectors, the M0 restriction, and the linear/exponential maps in the main text.

5.4.
Estimating Traces and Spectra

Unbiased trace estimator (Hutchinson). For any implementable linear map F, random |r〉 from a unitary 2-design obeys 25E|r[r|F|r]=1dTrF,{_{|r\rangle }}[\langle r|F|r\rangle ] = {1 \over d}{\mathop{\rm Tr}\nolimits} F, so dr|F|r¯d\overline {\langle r|F|r\rangle } is an unbiased trace estimator; variance decreases as O(1/N) or O(1/N2) with amplitude estimation.

Operators used in Figure 3 (Hadamard-test realization of Hutchinson’s trace estimator). Random probe. |r〉 = C|0 ⋯ 0〉 where C is a unitary 2-design (e.g., a random Clifford) on the data register; C is undone at the end so the data returns to the computational frame. Block-encoding of the map. UF is an (α0, a) block-encoding of the (generally non-unitary) map F, i.e. (0a|I)UF(|0aI)=F/α0\left( {\langle {0^a}| \otimes I} \right){U_F}\left( {\left| {{0^a}} \right\rangle \otimes I} \right) = F/{\alpha _0}. In this section F ∈ { Pexp(α) = e−αA, Plin} (Eqs. (7)–(8)). Ancilla routine. The ancilla is prepared in |+〉, a Hadamard–UF–Hadamard sequence is applied, and the Pauli-Z observable is measured on the ancilla. Averaging 〈Z〉 over independent random C’s (Hutchinson sampling) yields an unbiased estimate of Tr F/d; amplitude estimation can reduce shot complexity from O(1/ε2) to O(1/ε). Lyapunov decay proxy. We use the slope 26λL(α):=ddαlogTrPexp(α)=Tr[ AeαA ]Tr[ eαA ],{\lambda _L}(\alpha ): = - {d \over {d\alpha }}\log {\mathop{\rm Tr}\nolimits} {P_{\exp }}(\alpha ) = {{{\mathop{\rm Tr}\nolimits} \left[ {A{e^{ - \alpha A}}} \right]} \over {{\mathop{\rm Tr}\nolimits} \left[ {{e^{ - \alpha A}}} \right]}}, as a monotone indicator of contraction.

Figure 3.

Hadamard-test realization of the Hutchinson identity (Eq. (25)). A random |r〉 is prepared by a unitary 2-design C on the data register; U˜F{{\tilde U}_F} is a block-encoding of F (either F = Pexp (α) = e−aA or F = Plin). Averaging the ancilla’s 〈Z〉 over random |r〉 gives an unbiased estimate ofTr F/d; amplitude estimation can reduce shot complexity.

The observed signature at n = 45 and miss-probability are so obtained. On the d = 24 module with k ∈ [100,400] and α ∈ {20,40}, the diagnostics concur at n = 45: T(α) collapses while Tr Plin peaks; the Angeltveit–McKay 46-vertex coloring (AM–46) [3] does not trigger. The false-negative risk under i.i.d. rank-1 directions obeys Pmisse−kr/d for residual rank r, placing the operational risk < 10−3 for the reported settings and ~ 10−7 once r ≥ 12.

The resources and NISQ-friendly variant are the following. With d = 24, the data register is 5 qubits; a single ancilla suffices for block-encoding and one for overlap/phase estimation (7–8 qubits total). Depth scales as O˜(kCprep )\widetilde O\left( {k\;{C_{{\rm{prep }}}}} \right) for constant-precision exponentiation; phase estimation adds O(1/ϵ) controlled evolutions for precision ϵ. A shallow NISQ variant Hermitianizes A by replacing vv with vv, keeping the qualitative signatures (peak / collapse) while simplifying circuits.

6.
Calculation of Diagonal Ramsey Values beyond R(5, 5)

After R(5, 5), we try (taking the results with a grain of salt) to give an estimate for R(6, 6) and R(7, 7) to verify our method and procedures. As Erdős said in a joke, “Suppose aliens invade Earth and threaten to obliterate it in a year’s time unless human beings can find the Ramsey number for red five and blue five. We could marshal the world’s best minds and fastest computers, and within a year we could probably calculate the value. If the aliens demanded the Ramsey number for red six and blue six, however, we would have no choice but to launch a preemptive attack” [5].

Let us first summarize the results for R(5, 5). As in Table 2, using the exponential-trace collapse and the linear/product witness, we report the probability of having a correct estimate for any value of the Ramsey numbers here considered. For R(5, 5) the value n = 44 has probability ≈ 2.9%, for n = 45 the probability rises up to ≈ 92.7% and n = 46 is ≈ 4.3%. As a safety check below the diagonal threshold, at n = 43 we obtain (with d = 24, k = 100, α = 20) the exponential trace Tr Pexp = 7.92 × 10−12, the linear/product trace Tr Plins = 0.284, extremal linear–spectrum entries min Re λ = –0.058, max |Im λ| = 0.037, and an empirical slope ddαlog10{d \over {d\alpha }}{\log _{10}} Tr Pexp = –0.674; none of the “critical” signatures appear, in line with the statement that for n = 43, 44 the diagnostics behave smoothly. Using the measured slope to extrapolate from α = 20 to α = 40 gives log10 Tr Pexp (40) ≈ –24.581 and hence Tr Pexp (40) ≈ 2.62 × 10−25, still many orders of magnitude above the near-collapse seen at the true diagonal: for n = 45 one has Tr Pexp (40) ≈ 2.4 × 10−289 (with k = 400), while the linear witness peaks at Tr Plin (45) = 0.462 versus 0.360 at 44 and 0.407 at 46. The projector-miss risk satisfies Pmisse−kr/d, so at d = 24 one gets Pmisse−100/24 ≈ 1.55 × 10−2 for k = 100 and Pmisse−400/24 ≈ 5.7 × 10−8 for k = 400 (taking residual rank r = 1), making a spurious collapse at n = 43 exceedingly unlikely; this calibrates the method’s accuracy and explains the sharp separation between 43 (non-critical) and 45 (critical). For hardware mapping we work in the charge-zero module M0 with dim M0 = d; for R(5, 5) we used d = 24, yielding a data register of ⌈log2 d⌉ = ⌈log2 24⌉ = 5 qubits plus 1–2 ancillas, and empirically increasing the dimension d did not alter the decisions on these instances.

Motivated by this and by the Pmiss exponent kr/d, a good default for R(6, 6) and R(7, 7) is to keep the width constant and pick the largest d that does not increase ⌈log2 d⌉, i.e. d ≤ 32 (still 5 data qubits) for extra algebraic headroom; if one diagnoses conditioning issues, moving to d = 48 adds only one data qubit (to 6) but should be accompanied by scaling k so that k/d remains roughly constant.

We combine (i) a prime-structured classical scaffold and (ii) the same quantum spectral diagnostics to target the next diagonals. Using the classical prime-sequence scaffold, consider Pk = {2, 3, 5, 7, 11, 13, …}. Constrain diagonal values to prime-sequence integers using only the first k primes, with at most three distinct primes and exponents ≤ 3, and keep growth ratios ρn := R(n, n)/R(n – 1, n) ≲ 2 (Erdős corridor). This yields a short target set whose persistent elements (stable as k grows up to kmax = 2n – 1) are: R(6,6){108,111,115},R(7,7){205,209}.R(6, 6) \in \{ 108,1\mid 11,115\} ,\quad R(7, 7) \in \{ 205,209\} .

The persistence plateau selects R(6, 6) = 115 and R(7, 7) = 205 as the most stable predictions–both consistent with rigorous bounds 102 ≤ R(6, 6) < 160 and 205 ≤ R(7, 7) < 492.

Diagonal n = 6, 7 by exponential/linear projectors. Using the projectors we restrict the range of values for the two diagonals with n = 6 and n = 7. We fix the charge-zero module size to d = 32 (five data qubits), take a = 40, and choose k so that the i.i.d. miss bound Pmisse−kr/d (with survivor rank r ≥ 1 below threshold) is small. For n = 6 we use k = 180; for n = 7 we use k = 220. With A=j=1kvjvj,Pexp(α)=eαAA = \sum\nolimits_{j = 1}^k {{v_j}v_j^ \top ,{P_{\exp }}(\alpha ) = {e^{ - \alpha A}}} and Plin=j=1k(Ivjvj){P_{{\mathop{\rm lin}\nolimits} }} = \prod\limits_{j = 1}^k {\left( {I - {v_j}v_j^ \top } \right)} (Eqs. (7)–(8)), the concentration σ(A) ≈ k/d observed at n = 5 implies the critical/noncritical split Tcrit (α)=TreαAdeαk/d{T_{{\rm{crit }}}}(\alpha ) = {\mathop{\rm Tr}\nolimits} {e^{ - \alpha A}} \approx d{e^{ - \alpha k/d}} and Tnoncrit(α) ≥ 1 + (d – 1) e−αk/d, while the mean-field contraction for the linear chain is 𝔼∥Plinx2 = (1 –1/d)k. Numerically, for n = 6 (k = 180) one has e−αk/d = e−40.180/32 ≈ 1.9219 × 10−98, hence Tcrit ≈ 6.15 × 10−97, Tnoncrit ≥ 1 + (31) e−αk/d ≈ 1, (1132)1803.30×103{\left( {1 - {1 \over {32}}} \right)^{180}} \approx 3.30 \times {10^{ - 3}}, and Pmiss e180/323.61×103{P_{{\rm{miss }}}} \le {e^{ - 180/32}} \approx 3.61 \times {10^{ - 3}}. For n = 7 (k = 220) one finds e−40·220/32 ≈ 3.7069 × 10−120, thus Tcrit ≈ 1.19 × 10−118, Tnoncrit ≥ 1, (1132)2209.26×104{\left( {1 - {1 \over {32}}} \right)^{220}} \approx 9.26 \times {10^{ - 4}}, and Pmisse−220/32 ≈ 1.03 × 10−3. Applying the same decision rule as for R(5, 5) (collapse of T(α), local maximum of the linear/spectral witness, explicit primesequence persistence up to kmax = 2n – 1), and testing the prime-sparse candidate sets {108, 111, 115} for n = 6 and {205, 209} for n = 7 (Table II), the first vertex count that exhibits collapse + peak is for the following values: R(6,6)=115,R(7,7)=209,R(6, 6) = 115,\quad R(7, 7) = 209, with separation gaps exceeding 1097 and 10118 in T(α) respectively and miss bounds ≲ 3.6 × 10−3 and ≲ 1.0 × 10−3.

Quantum spectral diagnostics for R(6, 6) and R(7, 7) confirm these results. For each candidate vertex count v above, reuse the same pipeline as in Section 5: estimate Tr Plin, T(α), and the dilation spectral radius. The diagonal R(n, n) manifests as the smallest v where the linear trace peaks while the exponential trace collapses (with an AM-control remaining non-critical). This focuses searches at v ∈ {108,111,115} for n = 6 and v ∈ {205,209} for n = 7; any constructive coloring below 111 or 205 would falsify the sparsity heuristic, while exhaustive elimination at v = 111 and 205 (or 209) would raise rigorous lower bounds.

Putting both strands together we confirm the following high-accuracy working estimates R (6, 6) → 115 and R(7, 7) → 205, consistent with classical bounds and selected by prime-sequence persistence; these are the vertex counts at which the quantum diagnostics are expected to trigger under the same thresholds used for R(5, 5).

More details can be found in Appendix C, Procedures to calculate Ramsey numbers, and Appendix D Mathematical tools. Examples of python codings as a simple tutorial, are discussed in the Appendix D: Software and presented in other electronic support.

7.
Ramsey Numbers and Classical/Quantum Applications to Machine Learning
7.1.
Ramsey Background and Notation

We recall that for integers m, n ≥ 1, the (two–color) Ramsey number R(m, n) is the least v such that every red/blue edge–coloring of the complete graph Kv contains a red Km or a blue Kn. Ramsey’s theorem yields finiteness and the classical Erdős recursion. Exact values are known only at small parameters and diagonal cases grow notoriously fast. In this work we study these thresholds with a graded ℤ2 × ℤ2 Majorana algebra and two random–projector diagnostics that replace brute–force enumeration by spectral surrogates acting on a reduced (charge–zero) module.

Graded embedding and randomized spectral diagnostics is obtained through Klein–graded paraparticle algebra, from the Majorana modes γj(0),γj(1)\gamma _j^{(0)},\gamma _j^{(1)} with { γi(α),γj(β) }=2δijδαβ\left\{ {\gamma _i^{(\alpha )},\gamma _j^{(\beta )}} \right\} = 2{\delta _{ij}}{\delta _{\alpha \beta }}. Define aj=(γj(0)+γj(1))/2{a_j} = \left( {\gamma _j^{(0)} + \gamma _j^{(1)}} \right)/2 (red charge (1, 0)) and bj=(γj(0)γj(1))/2{b_j} = \left( {\gamma _j^{(0)} - \gamma _j^{(1)}} \right)/2 (blue charge (0, 1)). A two–coloring lifts to the degree–(0, 0) edge operator of Eq. (4) which commutes with total Klein charge. Monochromatic pair and clique projectors ΠijR\Pi _{ij}^R, ΠijB\Pi _{ij}^B; R(S)=i<jSijR,B(T)=i<jTijB{\prod _R}(S) = {\prod _{i &#x003C; j \in S}}\prod _{ij}^R,{\prod _B}\left( T \right) = {\prod _{i &#x003C; j \in T}}\prod _{ij}^B live in the charge–zero submodule M0; the central obstruction to a good coloring is Eq. (6) and a coloring (or an entire symmetry class thereof) survivesiff Pm,n annihilates it. In this setting the Klein–compatible coproduct realizes the glue step vv + 1 and yields an exact recursion for the graded numbers RV4 (m, n) in Eq. (7).

To decide if any legal survivor subspace remains in M0 ≃ ℝd, we use two families of randomized maps built from i.i.d. isotropic unit vectors vj ∈ ℝd, the linear deflation of Eq. (11) and the exponential map (Eq. (10)). If a survivor subspace has rank r ≥ 1, then k random rank–1 tests miss it with probability Pmiss, so Plin rapidly kills survivors as k/d grows, while T(α) collapses once survivors vanish. For non–normal A we use the Hermitian dilation H, giving a stable spectral surrogate accessible by phase estimation (quantum) or power iteration (classical).

R(5, 5) case and resource contrast. On the diagonal, the reduced module has d = 24; both witnesses single out v = 45: T(40) ≈ 2.4 × 10−289 at (k, α) = (400, 40) while Tr Plin peaks locally at v = 45; an explicit v = 46 control remains non–critical under the same thresholds. Under the i.i.d./isotropy model, Pr[pmiss] ≲ e−100/24 ≈ 1.6 × 10−2 at (100, 20) and ≲ e−400/24 ≈ 5.7 × 10−8 at (400, 40). Crucially, these diagnostics act only on M0, requiring ⌈log2 d⌉ = 5 data qubits (plus few ancillas), versus (452)=990\left( \matrix{ 45 \cr 2 \cr} \right) = 990 data qubits for direct edge–encodings.

7.2.
Ramsey Theory and Machine Learning

Ramsey theory formalizes “order amid chaos”: sufficiently large systems necessarily contain structured subconfigurations. This principle echoes throughout modern ML and suggests concrete tests and controls using the witnesses above.

Overparameterization and lottery tickets. Let 𝒩 be a (possibly overparameterized) network with parameters W ≈ ℝD. Define a graph on neurons where the edge {i, j} is colored red if a predicate Pij(W) holds (e.g., |Wij| > τ, or sign/gradient agreement, correlation above a threshold), blue otherwise.

Whenever the effective width v exceeds the relevant R(k, k), a monochromatic Kk is guaranteed, i.e. a small subnetwork obeying P coherently–akin to the Lottery–Ticket hypothesis. We can certify the continued existence (or disappearance) of such candidates by encoding ΠR (S) from the predicate and monitoring the collapse/peak of T(α) and Tr Plin as pruning proceeds; the risk of a false “no–ticket” verdict is controlled by Eq. (16) through k/d.

Build the predicate graph on vertices [v] by coloring an edge {i, j} red when Pij(W) = 1 and blue otherwise. A Ramsey k–lottery is a set S ⊂ [v] with |S| = k such that all edges inside S are red; this is a Kk fully coherent for P and can be read as a compact subnetwork (a “ticket”) already encoding the desired structure. Ramsey’s theorem implies that once vR(k, k) a monochromatic Kk must exist, so the search for a good ticket can be recast as the impossibility of avoiding such a red Kk. In the graded framework, encode the forbidden condition “no red Kk” by the central operator in Eq. (27) 27Qk(W):=|S|=k(IΠR(S)),ΠR(S)=i<jSΠijR( degree (0,0)),\matrix{ {{Q_k}(W): = \prod\limits_{|S| = k} {\left( {{\rm{I}} - {\Pi _R}(S)} \right)} ,} \hfill \cr {{\Pi _R}(S) = \prod\limits_{i &#x003C; j \in S} {\Pi _{ij}^R} ({\rm{ degree }}(0, 0)),} \hfill \cr } acting on the charge–zero module M0. If Qk(W) has no surviving support, then a red Kk is unavoidable and at least one lottery ticket exists.

Operationally we decide this by randomized witnesses on M0: Plin and Pexp(α) with T(α). Sharing the same random seed across widths v lets us track the critical scale where tickets become inevitable: the collapse of T(α) together with a local peak of Tr Plin signals Qk(W) has emptied out, hence a P–coherent Kk exists. Under i.i.d. isotropic rank–1 probes in M0 of dimension d, the miss–probability that all tests avoid a surviving r–dimensional subspace obeys Pmiss, so choosing k≳(d/r) ln(1/δ) controls the false “no–ticket” verdict at level δ. This provides a quantified version of the Lottery–Ticket intuition: overparameterization raises v, and once the Ramsey threshold is crossed, small performant subnetworks are not accidental–they are guaranteed, and the witnesses certify their inevitability.

Adversarial inevitabilities (worst–case geometry). Form a near–collision graph on inputs X = {xi}, coloring {i, j} red if the margin |f(xi) – f(xj)| < δ (for a score f), blue otherwise. Beyond a threshold v, large monochromatic cliques are inevitable, certifying coherent clusters of mutually confusable points. Our central projector of Eq. (4) with (m, n) = (k, k) and the witnesses (Plin, Pexp) give an early–warning signal: collapse of T(α) indicates that confusion–free assignments have been extinguished within the defended hypothesis class.

Given inputs X = {x1, …, xv} and a score f, form the near–collision graph by coloring {i, j} red if |f(xi) – f(xj)| < δ (or, for classifiers, if the logits differ by < δ in all attack directions), blue otherwise. Large red cliques are coherent confusion sets—mutually confusable points that any fixed defense struggles to separate. To detect when such sets are unavoidable at scale v, instantiate the central projector of Eq. (6) with (m, n) = (k, k) and run the same witnesses (Plin, Pexp(α)) on M0. A collapse of T(α) indicates that the hypothesis class (with the current defense/training recipe) cannot realize a coloring that avoids size–k coherent confusions–i.e., adversarially vulnerable patterns are now Ramsey–inevitable at this v. The Lyapunov slope λL(α):=ddαlogT(α)=Tr(AeαA)Tr(eαA){\lambda _L}(\alpha ): = - {d \over {d\alpha }}\log T(\alpha ) = {{{\mathop{\rm Tr}\nolimits} \left( {A{e^{ - \alpha A}}} \right)} \over {{\mathop{\rm Tr}\nolimits} \left( {{e^{ - \alpha A}}} \right)}} rises as survivors vanish and serves as an early–warning margin proxy. As above, the one–sided risk that random probes pmiss a surviving r–plane is bounded by e−kr/d, so (k, d) can be chosen to target a confidence level δ and declare inevitability only when both collapse/peak and the risk budget agree.

Graph neural networks and motif search. GNN tasks often detect motifs (cliques/cycles); Ramsey theory guarantees that small monochromatic subgraphs occur in large graphs regardless of coloring. Instead of scanning exhaustively, run the witnesses on the reduced module induced by the motif’s pair projectors–focusing compute where structure must exist. On parity–preserving (matchgate/Majorana) hardware, these checks map to shallow circuits acting entirely in M0.

Let G = (V, E) be a large graph and let H = (VH, EH) be a small motif (e.g., a k-clique, a c-cycle, or a domain motif) with |VH| = h. For a fixed color c ∈ {R, B} and an injective placement f : VHV, define the motif projector at placement f by ΠH,c(f):=(u,v)EHΠf(u)f(v),c{\Pi _{H,c}}(f): = \prod\limits_{(u,v) \in {E_H}} {\Pi _{f(u)f(v),}^c} where Πijc\Pi _{ij}^c is the degree-(0, 0) pair projector (red or blue) from the graded construction. The disjunction over all placements can be encoded by the central forbidden-motif operator QH,c(V):=f:VHV injective (IΠH,c(f)),{Q_{H,c}}(V): = \prod\limits_{\scriptstyle f:{V_H} \to V \atop \scriptstyle {\rm{ injective }}} {\left( {{\rm{I}} - {\Pi _{H,c}}(f)} \right)} , which equals the identityiffG contains no monochromatic copy of H. In our framework all factors have degree (0, 0) and act on the reduced module M0, so the existence of a motif is decided by whether QH,c leaves any survivor subspace in M0. This directly leverages the Ramsey guarantee that for sufficiently large |V| certain small monochromatic subgraphs are unavoidable.

Ramsey-guided screening (pre- and post-processing for GNNs). Rather than scanning all (|V|h)\left( \matrix{ |V| \cr h \cr} \right) placements, assemble a basis {ℬs} for the span generated by {ΠH,c(f)}f (or by their complements {I – ΠH,c(f)}f), sample k i.i.d. isotropic directions vj in that span (restricted to M0), and build the witnesses Plin, Pexp(α) and T(α). If no legal placement remains (i.e. QH,c has no survivors), then T(α) collapses as α grows and Tr Plin exhibits a local peak at the critical scale; conversely, a non-collapse certifies that at least one placement survives. With ambient dimension d = dim M0 and residual rank r ≥ 1, the probability that k random rank-1 probes pmiss all survivors obeys Eq. (16) so one can pick kdrln(1/δ)k \asymp {d \over r}\ln (1/\delta ) to achieve a target risk δ. In practice this yields a Ramsey-informed front-end filter: run the witnesses locally (on h-hop ego-nets, or on batches) and trigger exact subgraph-isomorphism or GNN attention only where the witnesses indicate unavoidable structure.

Training-time integration (regularizers and layers). Motif biases can be injected by adding a differentiable penalty that softly forbids QH,c: total =task +μTrPexp(H,c)(α){{\cal L}_{{\rm{total }}}} = {{\cal L}_{{\rm{task }}}} + \mu {\mathop{\rm Tr}\nolimits} P_{\exp }^{(H,c)}(\alpha ) or μλL(α),λL(α):=ddαlogT(α),\mu {{\rm{\lambda }}_L}(\alpha ),\quad {{\rm{\lambda }}_L}(\alpha ): = - {d \over {d\alpha }}\log T(\alpha ), with Pexp(H,c)P_{\exp }^{(H,c)} built from the motif basis {ℬs}; decreasing T(α) shrinks the measure of “no-H” assignments, nudging message passing toward motif-consistent representations. A complementary architectural primitive is Ramsey-aware pooling/attention: use per-node (or per-edge) contributions to the trace estimator as scores that gate message aggregation, focusing compute where the witnesses predict imminent motif emergence.

Complexity and deployment. For large |V|, exhaustive motif enumeration is Θ((|V|h))\Theta \left( {\left( \matrix{ |V| \cr h \cr} \right)} \right), while a witness pass costs O(k Capply) with Capply the cost to apply a basis element ℬs (sparse and local for small H). The error is one-sided and tunable via e−kr/d; in screening mode we favor high recall (large k/d), then hand off flagged regions to exact methods or to a specialized GNN head.

Majorana/matchgate realization (few-qubit or parity-preserving hardware). All operators above are degree-(0, 0) and commute with total parity, hence on matchgate/Majorana platforms each Πijc\Pi _{ij}^c and their products are even-parity checks realizable by shallow fermionic-Gaussian circuits acting entirely in M0. The traces T(α) (and Tr Plin) admit unbiased Hutchinson estimators, and spectral surrogates are accessed via the Hermitian dilation of the rank-k accumulator, enabling few-qubit diagnostics or accelerator kernels that interleave with classical GNN training/inference.

Learning theory: combinatorial capacity (VC). PAC (Probably Approximately Correct) learning is another potential application as in computational learning theory it is used to analyze the learnability of functions. PAC provides a probabilistic approach to understanding how well a machine learning algorithm can generalize from training data to unseen data exploring whether a learning algorithm can find a hypothesis that is both approximately correct and probably correct with respect to a given concept and distribution. With PAC sample complexity scales with VC–dimension that gives a measure of the complexity of a hypothesis space or the power of learning machine. Ramsey–type statements provide complementary unavoidability results: in large instance/hypothesis regimes, certain regular label patterns (e.g., constant/parity–coherent on dense subgraphs) must appear. Encoding “forbidden labelings” as Pm,n and tracking T(α) supplies fast, high–confidence negative certificates (“this configuration is no longer realizable”) with quantitative control via k/d.

Let ℋ ⊆ {–1, +1}𝒳 be a binary hypothesis class. A finite set S = {x1, …, xm} ⊂ 𝒳 is shattered by ℋ if every labeling y ≈ {–1, +1}m is realized by some h ∈ ℋ, i.e. ∀yh ∈ ℋ with h(xi) = yi for all i. The VC-dimension VC(ℋ) is the largest m such that some S of size m is shattered. Writing the growth function Π(m):=max|S|=m| { (h(x1),,h(xm)):h } |,{\Pi _{\cal H}}(m): = \mathop {\max }\limits_{|S| = m} \left| {\left\{ {\left( {h\left( {{x_1}} \right), \ldots ,h\left( {{x_m}} \right)} \right):h \in {\cal H}} \right\}} \right|, the Sauer–Shelah lemma gives, for d = VC(ℋ) and md, Π(m)i=0d(mi)(emd)d,{\Pi _{\cal H}}(m) \le \sum\limits_{i = 0}^d {\left( \matrix{ m \cr i \cr} \right)} \le {\left( {{{em} \over d}} \right)^d}, so the number of distinct labelings realizable on m points grows polynomially once m > d. In PAC learning, these combinatorial bounds control sample complexity. In the realizable case (some h* ∈ ℋ attains zero risk), empirical risk minimization is consistent with m1ε(dlog1ε+log1δ)m \mathbin{\lower.3ex\hbox{\buildrel>\over {\smash{\scriptstyle\sim}\vphantom{_x}}}} {1 \over \varepsilon }\left( {d\log {1 \over \varepsilon } + \log {1 \over \delta }} \right) examples to achieve excess error ≤ ε with probability ≥ 1 – δ; in the agnostic case the dependence becomes m1ε2(d+log1δ)m \mathbin{\lower.3ex\hbox{\buildrel>\over {\smash{\scriptstyle\sim}\vphantom{_x}}}} {1 \over {{\varepsilon ^2}}}\left( {d + \log {1 \over \delta }} \right). Canonical examples: thresholds on ℝ have VC = 1; intervals on ℝ have VC = 2; axis-aligned rectangles in ℝd have VC = 2d; affine halfspaces in ℝd have VC = d+1.

Ramsey overlay and “unavoidability.” Ramsey theory adds a complementary, worst-case inevitability perspective: on sufficiently large instance sets, certain structured label patterns must occur. In our algebraic framework, a forbidden configuration (e.g. “no monochromatic Km in red, none of size Kn in blue”) is encoded by the central projector Pm,n acting in the charge-zero module M0. If Pm,n has no surviving support at size v, then no hypothesis consistent with those constraints can realize the corresponding label patterns on v examples—an unavoidability barrier. Operationally, we monitor T(α) and the product witness Plin; collapse of T(α) together with a peak of Tr Plin certifies that all labelings compatible with the constraint class have vanished at that scale, yielding a fast negative certificate (“this configuration is no longer realizable”).

From VC to Ramsey-constrained capacity. Let ℋm,n (v) denote the labelings of a v-point sample realizable by hypotheses that avoid the Ramsey-forbidden substructures encoded by Pm,n. By Ramsey’s principle, once vR(m, n) the set ℋm,n (v) is empty; hence the growth function of the constrained class satisfies Πm,n(v) = 0 for all vR(m, n). Thus the effective VC-dimension of ℋm,n is at most R(m, n)–1, and in practice can be (much) smaller due to additional algebraic symmetries. Our spectral witnesses make this transition detectable: when T(α) collapses at a given v, it implies Πm,n(v) = 0 under the modeling assumptions, delivering a data-driven ceiling on combinatorial capacity for the constrained hypothesis class.

Quantitative control via random projectors. Let d = dim M0 and suppose the surviving feasible subspace (if any) has rank r ≥ 1. Drawing k isotropic rank-1 tests vjvj{v_j}v_j^ \top , the probability to pmiss all survivors obeys Eq. (16) drives the false-negative risk below I. In practice: (i) choose d as large as possible without increasing the data–qubit count on hardware (e.g., d ≤ 24, 32, 48 map to 5 or 6 qubits); (ii) set a baseline pair (k, α) (e.g., (100, 20)) and a high–resolution pair (e.g., (400, 40)) to cross–validate decisions; (iii) declare a scale v critical when T(α) collapses, Tr Plin peaks, and a spectral surrogate (e.g. the dilation norm ρ(H) = ∥A∥2) is locally extremal under the same seeds. For example, with d = 24, r = 1, and target δ = 10−6, it suffices to use k ≤ 24ln(106) ≈ 332; at (k,a) = (400, 40) the bound already yields Pr[pmiss] ≲ 5.7 × 10−8. Reporting (k, d, α) together with λL(α) and the explicit e−kr/d risk converts Ramsey–style inevitability into a tunable, high–confidence negative certificate (“this configuration is no longer realizable”), complementing VC/sample–complexity upper bounds on the number of realizable labelings, what can be labeled or learned.

Practical takeaway for ML. (i) Use Pm,n to encode domain rules (forbidden label patterns or subgraphs) and add a soft penalty proportional to T (α) in the loss to bias training toward allowable regions; (ii) during pruning or architecture scaling, track (Tr Plin, T(α)) as early-warning signals that the constrained class has lost capacity on the current sample size (a Ramsey barrier); (iii) report the explicit e−kr/d miss-bound alongside collapse/peak events to quantify confidence in negative certificates. In tandem with standard VC/sample-complexity guarantees, these Ramsey-informed diagnostics provide actionable controls on what cannot be learned at a given scale, under the stated algebraic constraints.

Neurosymbolic constraints. Because Pm,n is central and Klein grading separates “red/blue” semantics, one can add soft penalties λ Tr Pexp (α) to the loss to bias training toward rule–consistent solutions–the graded, differentiable analogue of logic layers.

7.3.
Algorithms and Concrete Templates
7.3.1.
Ramsey–Guided Pruning (Classical/Quantum)
  • Fix a predicate P over parameters/activations and build the induced degree–(0, 0) operators on the reduced module M0 (choose d s.t. ⌈log2 d⌉ fits the co–processor; d = 24, 32, 48 are practical).

  • For each candidate scale v(width/channel budget), form A with a shared PRNG seed across v; compute Plin and Pexp (α).

  • Track T(α), Tr Plin, and a spectral surrogate ρ(H). Declare v critical when T(α) collapses and Tr Plin peaks (with ρ(H) extremal).

  • Choose k to meet risk ε via k/d ≳ ln(1/ε)/r from (16); tune α from the Lyapunov slope λL(α):=ddαlog T(α){{\rm{\lambda }}_L}(\alpha ): = - {d \over {d\alpha }}\log T(\alpha ).

These same steps run on 5 – 7 qubits for d ≤ 24 – 48 using block–encoding/qubitization and a Hutchinson trace estimator 𝔼|rr|F|r=1dTrF{_{|r\rangle }}\langle r|F|r\rangle = {1 \over d}{\mathop{\rm Tr}\nolimits} F.

Adversarial-risk early warning. Set edges to mark δ–near collisions; sweep v (or coverage). A sharp drop in T(α) flags the Ramsey–style onset of unavoidable coherent confusions, prompting augmentation or re-architecture before empirical attacks emerge.

Curriculum & scaling via prime–sequence checkpoints (heuristic). Diagonal values observed in our framework are sparsely factorized (e.g·, 45 = 32 · 5). As a curriculum heuristic, checkpoint only at prime–sequence integers (products of the first few primes with bounded exponents), keeping successive ratios ≲ 2; this concentrates compute at likely transition scales.

Case study: diagonal R (5, 5) and resource accounting. At d = 24 the diagnostics concur at v = 45: T(40) ≈ 2.4 × 10−289 (collapse) and Tr Plin peaks at 0.462 vs· 0.360 (v = 44) and 0.407 (v = 46); a known 46–vertex coloring stays non–critical under the same thresholds. With (16), Pr[pmiss] ≲ e−400/24 ≈ 5.7 × 10−8 at (k, α) = (400, 40) (for r = 1), making spurious collapse extremely unlikely under the i·i·d·/isotropy assumptions· Quantumly, these checks use only ⌈log2 d⌉ = 5 data qubits plus few ancillas, instead of (v2)\left( \matrix{ v \cr 2 \cr} \right) edge qubits (e.g. 990 at v = 45). The witnesses are diagnostic (not constructive) and report evidence consistent with R(5, 5) = 45 within the graded–module model and sampling assumptions.

Quantum realization on few qubits (for ML and Ramsey). Both Plin and Pexp(α) admit coherent implementations via block–encodings of A and QSVT/qubitization; the Hermitian dilation H supplies a stable spectral proxy ρ(H) = ∥A2. Traces are estimated by a Hadamard–test version of Hutchinson’s identity, reducing to 5 data qubits for d = 24 (plus ancillas). On matchgate/Majorana platforms, degree–(0, 0) pair/clique projectors are even–parity checks, yielding shallow circuits native to the hardware.

7.4.
Open Directions

The items below outline concrete, testable directions that translate the Ramsey principle of inevitability—realized through our graded-algebraic and random-projector framework—into practical tools for structure search, confidence-calibrated pruning, adversarial phase mapping, and prime-sequence curricula.

Ramsey–guided structure search. Add μ Tr Pexp (α) to the loss to bias training toward guaranteed motifs; study generalization/robustness.

Confidence–calibrated pruning. Turn the bound e−kr/d into pruning schedules with explicit risk budgets; couple to dynamic d (keeping ⌈log2 d⌉ fixed on small quantum assists).

Adversarial phase diagrams. Map (T, λL, ρ(H)) across defenses/data scales to chart inevitability regions where confusion becomes unavoidable.

Prime–sequence curricula. Empirically assess whether sharp loss/robustness transitions cluster at prime–sequence scales.

Ramsey theory supplies inevitability guarantees. The graded–algebra plus random–projector machinery turns them into operational tests–collapse/peak events with explicit, exponential–tail risk control–that inform pruning, curriculum, robustness analyses, and quantum–assisted diagnostics on a handful of qubits.

8.
Conclusions

Embedding two-color Ramsey instances in a ℤ2 × ℤ2-graded paraparticle algebra renders the Klein recursion exact for the graded Ramsey numbers RV4 (m, n) (with the classical numbers obeying R(m, n) ≤ RV4(m, n)), supplies an explicit operator basis, and maps naturally onto forthcoming Majorana hardware. In our construction, algorithmic costs scale only logarithmically with the height of the Majorana tower, suggesting a realistic quantum–combinatorial synergy. Our graded-algebra diagnostics identify the diagonal threshold at R(5, 5) = 45 via a unique concurrence of signals–collapse of T(α), a peak of Tr Plin, and maximal spectral spread–while the explicit AM-46 coloring remains non-critical. The miss probability defined in Eq. (16) for an undetected coloring is already small, Pmiss < 10−3 for the reported parameters, and drops to Pmiss ~ 10−7 once the residual rank satisfies r ≥ 12 (with d = 24); see Appendix B, Section D.11 for details. Hence the method delivers a tight, scalable heuristic that future constructive proofs will either confirm or surpass. Another criterion for diagonal Ramsey numbers, based on prime-sequence numbers of order k, is discussed in SM. Ultimately, only a constructive search can decide whether the coloring space is, for all practical purposes, empty (implying R(5, 5) = 45) or not (leaving R(5, 5) = 46). Together with the constructive bound 43 ≤ R(5, 5) ≤ 46, these signals provide statistical indications consistent with R(5, 5) = 45 under the graded-module model. Under i.i.d. isotropic sampling of directions vj in M0, the chance to miss surviving directions is bounded by Pmisse−kr/d (residual rank r ≤ 1), making the collapse decision reliable at the reported (d, k, α) and increasing k/d strengthens both witnesses without widening the quantum data register.

Finally, the factorization 45 = 32 · 5 motivates a “prime-sequence’’ constraint that favors small prime factors when extrapolating diagonal values, offering a complementary, number-theoretic guidepost. Our results provide statistical, but not yet constructive, evidence for R(5, 5) = 45; scaling k and d on hardware alongside targeted constructive search is the natural next step, including an heuristic estimation for R(6, 6) and R(7, 7)

The method is lightweight (a d = 24 module) yet hardware ready: block-encoding and qubitization implement f(A) = e−αA, and a Hermitian dilation supplies a stable spectral surrogate, yielding a compact, reproducible benchmark for matrix-function evaluation and randomized trace estimation on quantum devices. Both diagnostics have direct quantum realizations: block-encodings of F ∈ {Pexp(α), Plin} feed a Hadamard–test/Hutchinson trace estimator, and the spectral surrogate is accessed by phase estimation on the Hermitian dilation H. Because all operators act on M0, the data width is only ⌈log2 d⌉ qubits (five for d = 24), plus a few ancillas–orders of magnitude below edge-register encodings that require one logical qubit per edge.

Our procedure is a statistical diagnostic, not a constructive proof: it relies on independence/isotropy of vj, concentration of the spectrum of the accumulator A, and numerical thresholds chosen by cross-validation on neighboring n. The collapse/peak calls remain robust under these assumptions and are supported by explicit controls.

In any case this statistical estimation is a promising subject for machine learning. In ML, Ramsey numbers are not used directly, but the Ramsey principle, large enough systems must contain hidden order, deeply resonates with overparameterized neural networks, adversarial robustness, graph learning, and generalization theory. Open Research Directions are: Ramsey-inspired pruning, where in giant overparameterized models, one can use Ramsey reasoning to predict where guaranteed good subnetworks live. Complexity bounds: Ramsey numbers are huge, but their growth rate might inspire worst-case capacity bounds in neural networks. Curriculum design: Training on small guaranteed patterns (Ramsey cliques, unavoidable motifs) before scaling up could act as a form of combinatorial curriculum learning.

DOI: https://doi.org/10.2478/qic-2025-0039 | Journal eISSN: 3106-0544 | Journal ISSN: 1533-7146
Language: English
Page range: 739 - 772
Submitted on: Aug 25, 2025
|
Accepted on: Oct 30, 2025
|
Published on: Mar 9, 2026
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2026 Fabrizio Tamburini, published by Cerebration Science Publishing Co., Limited
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.