Global Central Limit Theorems for Stationary Markov Chains

Lin, Michael

doi:10.2478/amsil-2025-0011


1. Introduction

Let P = P(x, A) be a Markov transition probability function on a general state space (S, Σ), with invariant probability measure m (i.e. m(·) = ∫_S P(x, ·)dm(x)). Let Ω := S^ℕ be the space of trajectories with σ-algebra 𝒜 := Σ^{⊗ℕ}, and let ℙ_x be the probability measure on 𝒜 governing the chain with transition probability function P and initial distribution δ_x. The probability of the chain with initial distribution m is then ℙ_m = ∫_S ℙ_x dm(x). By invariance of m, ℙ_m is shift invariant on (Ω, 𝒜). Let X_n be the projection of Ω on the nth coordinate. Then (X_n) on (Ω, 𝒜, ℙ_m) is a stationary Markov chain with state space S.

For 1 ≤ p < ∞ we denote by L^p(m) the Banach space {f : S → ℝ : ∫_S | f |^p dm < ∞}, and put $L_0^p(m) = \{f \in L^p(m) : \int_S f\,dm = 0\}$.

We assume m ergodic for P, which means (by one of the equivalent definitions) that if f ∈ L²(m) satisfies f(x) = ∫_S f(y)P(x, dy) m-a.e., then f is constant a.e. Then the chain is ergodic too, i.e. the shift θ on (Ω, 𝒜, ℙ_m), defined by θ(X_n)_{n∈ℕ} = (X_{n+1})_{n∈ℕ}, is ergodic.

We say that a real centered $f \in L_0^2(m)$ satisfies the annealed CLT if in (Ω, ℙ_m) we have $\frac{1}{\sqrt{n}} \sum_{k=1}^{n} f(X_k) \overset{\mathcal{D}}{\longrightarrow} \mathcal{N}(0, \sigma^2)$, where $\mathcal{N}(0, 0) := \delta_0$.

We say that a real centered $f \in L_0^2(m)$ satisfies the L²-normalized CLT if $\frac{1}{\sigma_n(f)} \sum_{k=1}^{n} f(X_k) \overset{\mathcal{D}}{\longrightarrow} \mathcal{N}(0, 1)$, provided $\sigma_n(f) := \left\| \sum_{k=1}^{n} f(X_k) \right\|_{L^2(\mathbb{P}_m)} > 0$ for sufficiently large n ∈ ℕ.

We denote by P also the Markov operator defined as $Pf(x) := \int_S f(y)\,P(x, dy)$ for every bounded measurable f and every x ∈ S. By invariance of m, P extends to all L¹(m) functions, and is a contraction of all L^p(m) spaces, 1 ≤ p ≤ ∞, meaning that it does not increase the norm of functions in these spaces. As previously mentioned, ergodicity implies that Pf = f ∈ L^p holds only for f constant. We denote by Pⁿ the n-fold composition of the operator P, and by Ef := ∫_S f dm the expectation (with respect to the probability measure m) of f ∈ L^p(m), p ≥ 1.
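As a worked illustration (a standard computation, not taken from the original text), the normalizing quantity σ_n(f) of the L²-normalized CLT can be expressed through the operator P. For real f ∈ L²(m), stationarity and the Markov property give 𝔼_{ℙ_m}[f(X_j) f(X_{j+k})] = ⟨P^k f, f⟩ in L²(m), so expanding the square yields:

```latex
\sigma_n(f)^2
  = \Bigl\| \sum_{k=1}^{n} f(X_k) \Bigr\|_{L^2(\mathbb{P}_m)}^2
  = n \|f\|_2^2 + 2 \sum_{k=1}^{n-1} (n-k)\, \langle P^k f, f \rangle .
```

Dividing by n shows that the behaviour of σ_n(f)²/n is governed by the weighted sums of the correlations ⟨P^k f, f⟩.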

Following the early work of Doeblin, many efforts were made to identify conditions on an ergodic Markov operator P with invariant measure m which would ensure that every centered f ∈ L²(m) satisfies the annealed CLT – an L²-global annealed CLT for the chain.

2. History

Nagaev ([21]) used the following condition of Dobrushin: there exist k ∈ ℕ and δ < 1 such that $\sup_{x, y \in S} |P^k(x, A) - P^k(y, A)| < \delta, \quad \forall A \in \Sigma.$ This condition implies uniform geometric ergodicity: sup_x ‖Pⁿ(x, ·) − m‖_TV ≤ Mρⁿ for some M > 0 and 0 < ρ < 1. But the latter condition implies ‖Pⁿ − E‖_∞ → 0, which turns out to be equivalent to Doeblin’s condition; see [24, p. 213]. Ibragimov ([18]) used a strong mixing condition (φ-mixing), which also turns out to imply Doeblin’s condition. Davydov ([9], [10]) constructed a positive recurrent aperiodic chain with countable state space such that the CLT fails for some centered f ∈ L²(m).

Theorem 1 (M. Rosenblatt, [24]).

If ‖Pⁿ − E‖₂ → 0, then every centered f ∈ L²(m) satisfies the annealed CLT.

Rosenblatt proved that his condition is equivalent to ρ-mixing of the chain, and gave examples showing that it yields neither ‖Pⁿ − E‖_∞ → 0 nor ‖Pⁿ − E‖₁ → 0, although each of these conditions implies it; but ‖Pⁿ − E‖₂ → 0 if and only if ‖Pⁿ − E‖_p → 0 for some (every) 1 < p < ∞. Importantly, Rosenblatt’s condition does not necessarily imply Harris recurrence; see the example below.

Example (Random walks on the unit circle 𝕋).

Let μ be a probability measure on 𝕋, and define the convolution operator Pf = μ ∗ f, f ∈ L¹(𝕋, m), m the normalized Haar (Lebesgue) measure. It is shown in [11] that if $\lim_{|k| \to \infty} \hat{\mu}(k) = 0$, that is, the Fourier transform of μ vanishes at infinity (i.e. μ is Rajchman), then ‖Pⁿ − E‖₂ → 0. When μ is Rajchman with all its powers singular with respect to Lebesgue measure, P is not Harris recurrent.
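The mechanism can be sketched with Fourier coefficients. This is a standard computation included as an illustration, not necessarily the argument of [11]; the normalization $\hat{f}(k) = \int_{\mathbb{T}} z^{-k} f\,dm$ is an assumed convention. The operator P acts as a Fourier multiplier and E keeps only the zeroth coefficient, so Parseval's identity gives:

```latex
\widehat{Pf}(k) = \hat{\mu}(k)\,\hat{f}(k) \quad (k \in \mathbb{Z}),
\qquad
\|P^n - E\|_2 = \sup_{k \neq 0} |\hat{\mu}(k)|^n .
```

If μ is Rajchman, then |μ̂(k)| < 1 for every k ≠ 0: equality for some k ≠ 0 would force μ to be carried by a finite set, which makes |μ̂| periodic and contradicts μ̂(k) → 0. Since μ̂ vanishes at infinity, the supremum over k ≠ 0 is attained and is strictly less than 1, so the norms decay geometrically.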

A contraction T on a Banach space 𝒳 is called uniformly ergodic if $\frac{1}{n} \sum_{k=1}^{n} T^k$ converges in the operator norm. The limit is a projection onto Fix(T) := {f ∈ 𝒳 : Tf = f} corresponding to the decomposition $\mathcal{X} = \mathrm{Fix}(T) \oplus \overline{(I - T)\mathcal{X}}$. A contraction T is uniformly ergodic if and only if (I−T)𝒳 is closed in 𝒳 ([20]).

When P is uniformly ergodic in L²(m), we have $L_0^2(m) = (I - P)L^2(m) = (I - P)L_0^2(m)$. (Recall that $L_0^2(m) := \{f \in L^2(m) : Ef = 0\}$.) If ‖Pⁿ − E‖₂ → 0, then P is uniformly ergodic on L²(m); moreover, the spectral radius satisfies $r(P_{|L_0^2(m)}) < 1$, meaning P has a spectral gap in the complex $L_0^2(m)$.

Theorem 2 (Gordin-Lifshits, [15]).

Let P be a Markov operator with invariant probability measure m, and assume that P is ergodic.

If f ∈ (I − P)L²(m), then f satisfies the annealed CLT, with $\sigma^2 = \sigma_f^2 := \lim_{n \to \infty} \frac{1}{n} \left\| \sum_{k=1}^{n} f(X_k) \right\|_2^2 = \|g\|^2 - \|Pg\|^2,$

where f = (I − P)g with $g \in L_0^2(m)$.

When $\sigma_f^2 > 0$ (which is the case when P^∗P is ergodic), f satisfies also the L²-normalized CLT, which follows from a theorem of Slutsky ([25]) (see [8, p. 254]).

By [7], f ∈ (I − P)L²(m) if and only if $\sup_n \left\| \sum_{k=1}^{n} P^k f \right\|_2 < \infty$.

Theorem 1 now follows from Corollary 3 below.

Corollary 3.

Let P be a Markov operator with invariant probability measure m, and assume that P is uniformly ergodic in L²(m) with limit equal to E. Then every $f \in L_0^2(m)$ satisfies the annealed CLT.

Note that uniform ergodicity does not necessarily imply Harris recurrence.

Problem 1.

Let P be a Markov operator with invariant probability measure m, and assume that P is ergodic. If every $f \in L_0^2(m)$ satisfies the annealed CLT, does it follow that P is uniformly ergodic in L²(m)?

3. Some ergodic properties

Theorem 4 (Derriennic-Lin, [11]).

Let P be a Markov operator with invariant probability measure m, and assume P is ergodic. Then the following conditions are equivalent:

(i) P is uniformly ergodic in L²(m).
(ii) For every $f \in L_0^2(m)$ we have $\sup_{n \geq 1} \frac{1}{n} \left\| \sum_{k=1}^{n} f(X_k) \right\|_{L^2(\mathbb{P}_m)}^2 < \infty$.
(iii) For every $f \in L_0^2(m)$ we have $\sup_{n \geq 1} \left\| \frac{1}{\sqrt{n}} \sum_{k=1}^{n} P^k f \right\|_2 < \infty$.
(iv) For every $f \in L_0^2(m)$ we have $\sup_{n \geq 1} \left| \sum_{k=1}^{n} \langle P^k f, f \rangle \right| < \infty$.

Note that P is a contraction also of each complex L^p(m) space, 1 ≤ p ≤ ∞, and it is uniformly ergodic in the complex L^p(m) iff it is uniformly ergodic in the real L^p(m). A similar statement holds also for norm convergence of Pⁿ.

Theorem 5.

Let P be a Markov operator with invariant probability measure m. If P is uniformly ergodic on L^p(m), 1 ≤ p < ∞, and is weakly mixing on the complex L^p(m) (the only unimodular eigenvalue of P is 1), then ‖Pⁿ − E‖_p → 0.

The proof primarily relies on positivity and ergodicity.

Lemma 6.

If P^∗P is ergodic, then for every $f \in L_0^2(m)$ we have Pⁿ f → 0 weakly in L²(m); thus the shift θ on (Ω, 𝒜, ℙ_m) is weakly mixing, hence totally ergodic (all powers θ^k are ergodic). Moreover, ‖(P^∗P)ⁿ f ‖₂ → 0 for every $f \in L_0^2(m)$ if and only if P^∗P is ergodic.

Proof

We assume that P^∗P is ergodic. Let 𝒦 be the unitary space of P: $\mathcal{K} := \{g \in L^2(m) : \|P^n g\|_2 = \|P^{*n} g\|_2 = \|g\|_2 \text{ for every } n \geq 1\}.$ Clearly $\|Pg\|_2^2 = \|g\|_2^2$ if and only if $\langle P^*Pg, g \rangle = \|g\|_2^2$. Hence, by the Cauchy-Schwarz inequality, g ∈ 𝒦 implies P^∗Pg = g, and the ergodicity of P^∗P implies that 𝒦 contains only the constant functions. Any centered f is therefore orthogonal to 𝒦, and by [13] both Pⁿ f → 0 and P^{∗n} f → 0 weakly in L²(m). Thus P is weakly mixing.

The weak mixing of P implies that the shift θ is weakly mixing; see [1, Section 2].

The operator P^∗P is symmetric positive semi-definite in the complex L²(m), so its spectrum is a subset of [0, 1]. If P^∗P is ergodic, then for centered f ∈ L²(m) we have ‖(P^∗P)ⁿ f ‖₂ → 0 by the spectral theorem.

Conversely, if ‖(P^∗P)ⁿ f ‖₂ → 0 for every centered f ∈ L²(m), then obviously P^∗P is ergodic.

Lemma 7.

Let the shift θ be totally ergodic on (Ω, 𝒜, ℙ_m), which is the case when P^∗P is ergodic. If f ≠ 0 belongs to $L_0^2(m)$, then σ_n(f) > 0 for every n ≥ 1.

Proof

By stationarity of the chain (X_n), σ_n(f) = 0 implies $\left\| \sum_{k=0}^{n-1} f(X_k) \right\|_{L^2(\mathbb{P}_m)} = 0,$ so $\begin{aligned} f(X_0) \circ \theta^n - f(X_0) &= f(X_n) + \left[ \sum_{k=0}^{n-1} f(X_k) \right] - f(X_0) \\ &= \sum_{k=1}^{n} f(X_k) = \left[ \sum_{k=0}^{n-1} f(X_k) \right] \circ \theta = 0. \end{aligned}$ By ergodicity of θⁿ, f(X₀) is a constant, which is zero since f is centered, contradicting f ≠ 0.

4. Global central limit theorems

Theorem 8.

Let P be a Markov operator with invariant probability measure m. If P^∗P is ergodic and P is uniformly ergodic, then ‖Pⁿ − E‖₂ → 0, and every centered 0 ≠ f ∈ L²(m) satisfies a non-degenerate annealed CLT and the L²-normalized CLT.

Moreover, if 0 ≠ f ∈ L³(m) is centered, then $\sup_{t \in \mathbb{R}} \left| \mathbb{P}_m \left\{ \frac{\sum_{k=1}^{n} f(X_k)}{\sigma_f \sqrt{n}} \leq t \right\} - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{t} e^{-x^2/2}\,dx \right| = O\left( \frac{1}{\sqrt{n}} \right). \qquad (1)$

Proof

Ergodicity of P^∗P implies ergodicity of P, by Lemma 6. The assumption of uniform ergodicity implies that every $f \in L_0^2(m)$ is of the form f = (I − P)g with g ∈ L²(m) centered.

Fix 0 ≠ f = (I − P)g with g ∈ L²(m) centered. By the Gordin-Lifshits CLT, the annealed CLT holds for f, with variance of the limit expressed as $\sigma^2 = \sigma_f^2 = \lim_{n \to \infty} \left\| \frac{1}{\sqrt{n}} \sum_{k=1}^{n} f(X_k) \right\|_{L^2(\mathbb{P}_m)}^2 = \|g\|_2^2 - \|Pg\|_2^2, \quad g \in L_0^2(m).$ Hence σ_f = 0 if and only if P^∗Pg = g. If σ_f = 0, then g is constant by the ergodicity of P^∗P. Since g is centered, σ_f = 0 implies g = 0, so f = 0.

By Lemma 6 the shift is totally ergodic, so Lemma 7 yields σ_n(f) > 0 for n ≥ 1. Thus, for centered f ≠ 0 we have n^{−1/2}σ_n(f) → σ_f > 0, so the annealed CLT implies the L²-normalized CLT, by Slutsky’s theorem [25].

Ergodicity of P^∗P implies weak mixing of P (Lemma 6), so uniform ergodicity yields ‖Pⁿ − E‖₂ → 0 (by Theorem 5). For centered 0 ≠ f ∈ L³(m) we have σ_f > 0, as shown above, and (1) holds by [17].

Corollary 9.

Let P be a Markov operator with invariant probability measure m, and assume that P is ergodic and uniformly ergodic. Every centered 0 ≠ f ∈ L²(m) satisfies a non-degenerate annealed CLT if and only if P^∗P is ergodic.

Proof

When P^∗P is ergodic, Theorem 8 applies. For the converse, if P^∗Pg = g for non-constant g ∈ L²(m), then P^∗P(g − Eg) = g − Eg, and f = (I − P)(g − Eg) ≠ 0 satisfies the CLT with σ_f = 0.

Proposition 10.

Let P be a Markov operator with invariant probability measure m, and assume that P is normal in L²(m), i.e. P^∗P = PP^∗. If ‖Pⁿ − E‖₂ → 0, then P^∗P is ergodic, and Theorem 8 applies.

Proof

Let P^∗Pg = g ∈ L²(m). Since P^∗E = E, normality yields ‖g − Eg‖₂ = ‖(P^∗P)ⁿ g − Eg‖₂ = ‖P^{∗n}Pⁿg − P^{∗n}Eg‖₂ ≤ ‖Pⁿg − Eg‖₂ → 0, so g = Eg is constant.

Example

In general, ‖Pⁿ − E‖₂ → 0 does not imply that P^∗P is ergodic.

Let us define P on S := {1, 2, 3} by the matrix $\begin{bmatrix} \frac{1}{2} & \frac{1}{2} & 0 \\ 0 & 0 & 1 \\ \frac{1}{2} & \frac{1}{2} & 0 \end{bmatrix}$. The invariant probability vector is $\left( \frac{1}{3}, \frac{1}{3}, \frac{1}{3} \right)$, and P^∗ is given by the adjoint matrix. P has no non-trivial invariant sets, its only unimodular eigenvalue is 1, but P^∗P is not ergodic.
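The claims in this example can be checked by exact arithmetic. The following is a minimal sketch, using only the Python standard library; the helper names `matmul`, `adjoint` and `apply_op` are hypothetical and chosen for illustration, and the adjoint is computed from the general formula P^∗(x, y) = m(y)P(y, x)/m(x), which for the uniform m reduces to the transpose.

```python
from fractions import Fraction as F

# Transition matrix of the three-state example (each row sums to 1).
P = [[F(1, 2), F(1, 2), F(0)],
     [F(0),    F(0),    F(1)],
     [F(1, 2), F(1, 2), F(0)]]
m = [F(1, 3)] * 3  # candidate invariant probability vector


def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def adjoint(P, m):
    """Adjoint of P in L^2(m): P*(x, y) = m(y) P(y, x) / m(x)."""
    n = len(P)
    return [[m[y] * P[y][x] / m[x] for y in range(n)] for x in range(n)]


def apply_op(A, f):
    """Apply the operator A to a function f on the state space."""
    return [sum(a * v for a, v in zip(row, f)) for row in A]


# Invariance of m: (mP)(y) = sum_x m(x) P(x, y) should equal m(y).
mP = [sum(m[x] * P[x][y] for x in range(3)) for y in range(3)]

# P*P fixes the non-constant indicator of state 3, so P*P is not ergodic.
PstarP = matmul(adjoint(P, m), P)
ind3 = [F(0), F(0), F(1)]
fixed_by_PstarP = apply_op(PstarP, ind3) == ind3

# P^n approaches E (all rows equal to m): largest entry of P^20 - E.
Pn = P
for _ in range(19):
    Pn = matmul(Pn, P)  # after the loop, Pn = P^20
max_dev = max(abs(Pn[i][j] - m[j]) for i in range(3) for j in range(3))

print(mP == m, fixed_by_PstarP, float(max_dev))
```

On a finite state space all operator norms are equivalent, so the entrywise decay of Pⁿ − E is enough to illustrate ‖Pⁿ − E‖₂ → 0 here.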

Problem 2.

If a Markov operator P is ergodic, and every centered nonzero f ∈ L²(m) satisfies a non-degenerate annealed CLT, does ‖Pⁿ − E‖₂ → 0?

Note that under the assumptions of Problem 2, P^∗P is ergodic (proof of Corollary 9), so P is weakly mixing.

Below we present a sufficient “moment improving” condition for uniform ergodicity (called hyperboundedness); this condition is sometimes easy to check.

Theorem 11 (Glück, [14]).

Let P be a Markov operator with invariant probability measure m, assumed to be ergodic. Assume that for some 1 ≤ s < r < ∞ we have P L^s(m) ⊂ L^r(m). Then P is uniformly ergodic in all L^p(m) spaces, 1 < p < ∞ (i.e. $\left\| \frac{1}{n} \sum_{k=1}^{n} P^k - E \right\|_p \to 0$); hence (by Corollary 3) every centered f ∈ L²(m) satisfies the annealed CLT.

Example (A hyperbounded Markov operator).

Let (S, m) be the unit circle with normalized Lebesgue measure. Let 0 ≤ g ∈ L²(m) with ∫ g dm = 1, and define P by P f = g ∗ f. Then m is invariant, P is ergodic and normal in L²(m). Since ‖P f ‖₂ = ‖g ∗ f ‖₂ ≤ ‖g‖₂‖f ‖₁ for f ∈ L¹(m), P maps L¹(m) into L²(m).

Proposition 12 (Becker, [2]).

A power-bounded operator T (i.e. sup_{n≥0} ‖Tⁿ‖ < ∞) on a Banach space 𝒳 is uniformly ergodic if and only if for every $f \in \overline{(I - T)\mathcal{X}}$ the series ∑_{n≥1} n⁻¹Tⁿf converges in 𝒳.

Proposition 13.

Let P be a Markov operator with invariant probability measure m, assumed to be ergodic. Then the following conditions are equivalent:

(i) The Markov chain is ρ-mixing⁽¹⁾.
(ii) ‖Pⁿ − E‖₂ → 0.
(iii) For every $f \in L_0^2(m)$ the series $\sum_{k=1}^{\infty} \langle P^k f, f \rangle$ converges.
(iv) For every $f \in L_0^2(m)$ we have $\sum_{n=1}^{\infty} \|P^n f\|_2^2 < \infty$.
(v) There exists 1 ≤ p < ∞ such that for every $f \in L_0^p(m)$ there exists r > 1 with $\sum_{n=1}^{\infty} \|P^n f\|_p^r < \infty$.

If any of the above conditions holds, then the annealed CLT holds for every $f \in L_0^2(m)$. The variance of the limiting normal distribution is $\sigma_f^2 = \|f\|_2^2 + 2 \sum_{k=1}^{\infty} \langle P^k f, f \rangle.$

Proof

The equivalence of (i) and (ii) is by [24, p. 207].

By [11, Proposition 3.1], condition (ii) is equivalent to the existence of ρ < 1 and M > 0 such that ‖Pⁿ − E‖₂ ≤ M ρⁿ for n ≥ 1. This yields (iii) and (iv).

(iii) implies uniform ergodicity, by Theorem 4. By [13, Lemma 2.1], (iii) implies Pⁿ f → 0 weakly in L²(m) for every $f \in L_0^2(m)$; hence P is weakly mixing. Now (ii) holds by Theorem 5.

Obviously (iv) implies (v) with p = 2.

If (v) holds, then for every centered f ∈ L^p(m), Hölder’s inequality, applied with s = r/(r − 1), yields $\sum_{n=1}^{\infty} \frac{\|P^n f\|_p}{n} \leq \left( \sum_{n=1}^{\infty} \frac{1}{n^s} \right)^{\frac{1}{s}} \left( \sum_{n=1}^{\infty} \|P^n f\|_p^r \right)^{\frac{1}{r}} < \infty.$ Hence the series $\sum_{n=1}^{\infty} \frac{P^n f}{n}$ is convergent in L^p-norm when f ∈ L^p(m) is centered. By Becker’s Proposition 12, P is then uniformly ergodic in L^p(m). Since condition (v) implies that P has no unimodular eigenvalues, we have ‖Pⁿ − E‖_p → 0 (by Theorem 5), and by [24, Theorem VII.4.1] (ii) holds.

Finally, (ii) implies the CLT statement by Theorem 1. By Theorem 2 the variance of the limit is lim_{n→∞} σ_n(f)²/n.
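The agreement between the series formula for σ_f² and the Gordin-Lifshits expression ‖g‖² − ‖Pg‖² can be checked numerically on a small chain. The sketch below is illustrative only: the three-state chain, the function f and the truncation level N are assumptions made for the example, the helper names are hypothetical, and g is approximated by the truncated series ∑_{k≥0} P^k f, which solves f = (I − P)g up to a negligible remainder.

```python
# Three-state chain with uniform invariant vector m, used purely as a test case.
P = [[0.5, 0.5, 0.0],
     [0.0, 0.0, 1.0],
     [0.5, 0.5, 0.0]]
m = [1 / 3, 1 / 3, 1 / 3]
f = [2.0, -1.0, -1.0]  # centered: sum_x m(x) f(x) = 0


def apply_op(A, u):
    """(Au)(x) = sum_y A(x, y) u(y)."""
    return [sum(a * v for a, v in zip(row, u)) for row in A]


def inner(u, v):
    """Inner product of L^2(m) for real functions on the state space."""
    return sum(w * a * b for w, a, b in zip(m, u, v))


N = 200
powers = [f]  # powers[k] holds P^k f
for _ in range(N):
    powers.append(apply_op(P, powers[-1]))

# Series formula: ||f||^2 + 2 * sum_{k>=1} <P^k f, f>, truncated at N.
sigma2_series = inner(f, f) + 2 * sum(inner(powers[k], f) for k in range(1, N + 1))

# Gordin-Lifshits formula with g = sum_{k>=0} P^k f (truncated), f = (I - P)g.
g = [sum(p[x] for p in powers) for x in range(3)]
Pg = apply_op(P, g)
sigma2_gordin = inner(g, g) - inner(Pg, Pg)


def sigma_n_squared(n):
    """Exact sigma_n(f)^2 = n||f||^2 + 2 * sum_{k<n} (n - k) <P^k f, f>."""
    return n * inner(f, f) + 2 * sum((n - k) * inner(powers[k], f) for k in range(1, n))


sigma2_finite = sigma_n_squared(N) / N  # approximates lim sigma_n(f)^2 / n

print(sigma2_series, sigma2_gordin, sigma2_finite)
```

For this particular f one can verify by hand that P^k f = (−1/2)^{k−1}(1/2, −1, 1/2) for k ≥ 1, which gives σ_f² = 8/3; the finite-n value σ_n(f)²/n differs from the limit by a term of order 1/n.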

Proposition 14.

Let P be a Markov operator with invariant probability measure m. If every $0 \neq f \in L_0^2(m)$ satisfies the L²-normalized CLT, then P^∗P is ergodic. Consequently (Lemma 6 and Theorem 5), if P is uniformly ergodic, then ‖Pⁿ − E‖₂ → 0.

5. α-mixing

Rosenblatt in [24] introduced a certain “strong mixing” condition, now called α-mixing, and proved that for the stationary chain generated by P with invariant probability measure m, α-mixing is equivalent to $4\alpha(n) := \sup_{\int f\,dm = 0} \frac{\|P^n f\|_1}{\|f\|_\infty} \to 0 \quad \text{as } n \to \infty.$ The above supremum is bounded by ‖Pⁿ − E‖₂, so ρ-mixing implies α-mixing. Clearly α-mixing implies ‖Pⁿ g − Eg‖₂ → 0 for every g ∈ L²(m), hence total ergodicity of the shift θ.

A stationary Markov chain which is Harris recurrent and aperiodic is α-mixing; see [4, Section 3.2].

Theorem 15.

Let P be a Markov operator with invariant probability measure m, and assume that the chain is α-mixing. If every $0 \neq f \in L_0^2(m)$ satisfies the L²-normalized CLT, then P^∗P is ergodic, every $0 \neq f \in L_0^2(m)$ satisfies a non-degenerate annealed CLT, and ‖Pⁿ − E‖₂ → 0.

Proof

By Proposition 14, P^∗P is ergodic, so the shift is totally ergodic. Hence for $0 \neq f \in L_0^2(m)$, σ_n(f) > 0 for every n ≥ 1, by Lemma 7.

Let γ ∈ (0, 1) be fixed. Fix $0 \neq f \in L_0^2(m)$, and put σ_n = σ_n(f). Since the chain is α-mixing, the stationary sequence {f(X_j)} is also α-mixing. By a result in [19], the L²-normalized CLT implies that there exists a function L(t), t > 0, slowly varying at ∞, such that $\sigma_n^2 = nL(n)$. By a property of slowly varying functions, we obtain $n^{-(\gamma + 1)} \sigma_n^2 = n^{-\gamma} L(n) \to 0$. Then $\frac{1}{n^{(\gamma + 1)/2}} \left\| \sum_{k=1}^{n} P^k f \right\|_2 \leq \frac{1}{n^{(\gamma + 1)/2}} \left\| \sum_{k=1}^{n} f(X_k) \right\|_{L^2(\mathbb{P}_m)} = n^{-(\gamma + 1)/2} \sigma_n \to 0.$ The above convergence holds for every $f \in L_0^2(m)$. Denoting ε = (1 − γ)/2, we apply it to f = g − Eg, g ∈ L²(m), to obtain $n^{\varepsilon} \left\| \frac{1}{n} \sum_{k=1}^{n} P^k g - Eg \right\|_2 = \frac{1}{n^{(\gamma + 1)/2}} \left\| \sum_{k=1}^{n} P^k (g - Eg) \right\|_2 \leq C_g \quad \forall n \quad (g \in L^2(m)).$ By the Banach-Steinhaus theorem, the norms $\left\{ n^{\varepsilon} \left\| \frac{1}{n} \sum_{k=1}^{n} P^k - E \right\|_2 \right\}$ are bounded, so $\left\| \frac{1}{n} \sum_{k=1}^{n} P^k - E \right\|_2 \leq \frac{K}{n^{\varepsilon}} \to 0$. Thus P is uniformly ergodic. Theorem 8 yields ‖Pⁿ − E‖₂ → 0 and the non-degenerate annealed CLT for every $0 \neq f \in L_0^2(m)$.

Theorem 16.

Let P be an ergodic Markov operator with invariant probability measure m. Then the following conditions are equivalent:

(i) ‖Pⁿ − E‖₂ → 0 and P^∗P is ergodic.
(ii) The chain is α-mixing and every $0 \neq f \in L_0^2(m)$ satisfies the L²-normalized CLT.
(iii) Every $0 \neq f \in L_0^2(m)$ satisfies a non-degenerate annealed CLT and the L²-normalized CLT.

Proof

(i) implies (ii) follows from Theorem 8 and the fact that ρ-mixing implies α-mixing (combined with Proposition 13).

(ii) implies (i): Indeed, P^∗P is ergodic by Proposition 14, and ‖Pⁿ − E‖₂ → 0 by Theorem 15.

(i) implies (iii) by Theorem 8.

(iii) implies (i): First of all, P^∗P is ergodic by Proposition 14. Further, fix $0 \neq f \in L_0^2(m)$. We shall prove that $\{\sigma_n(f)/\sqrt{n}\}$ is bounded. For the sake of contradiction, suppose it is not bounded. Then there is an increasing sequence {n_k}_{k∈ℕ} such that $\sqrt{n_k}/\sigma_{n_k}(f)$ converges to zero, whence $\frac{1}{\sigma_{n_k}(f)} \sum_{j=1}^{n_k} f(X_j) = \frac{\sqrt{n_k}}{\sigma_{n_k}(f)} \cdot \frac{1}{\sqrt{n_k}} \sum_{j=1}^{n_k} f(X_j). \qquad (2)$ The left-hand side of (2) converges in distribution to 𝒩(0, 1) by the assumption of the L²-normalized CLT for f; the right-hand side converges to 𝒩(0, 0), by the assumed annealed CLT for f and Slutsky’s theorem, leading to a contradiction. Hence $\{\sigma_n(f)/\sqrt{n}\}$ is bounded for every $f \in L_0^2(m)$. By Theorem 4, P is uniformly ergodic. By Proposition 14, P^∗P is ergodic, so P is weakly mixing by Lemma 6, and then ‖Pⁿ − E‖₂ → 0 by Theorem 5.

Problem 3.

Assume that P is a Markov operator with invariant probability measure m such that $\lim_{n \to \infty} \|P^n g - Eg\|_2 = \lim_{n \to \infty} \|P^{*n} g - Eg\|_2 = 0 \quad \text{for every } g \in L^2(m),$ and assume that every non-zero $f \in L_0^2(m)$ satisfies the L²-normalized CLT. Does it follow that P is uniformly ergodic in L²(m)?

If yes, then ‖Pⁿ − E‖₂ → 0 by Theorem 5, since P is weakly mixing by the strong convergence of Pⁿ. Note that the assumption implies that P^∗P is ergodic, by Proposition 14.

By Theorem 15, the answer is yes for P which is Harris recurrent and aperiodic.

Example (P not uniformly ergodic with P^∗P ergodic).

Let Q be an ergodic Markov operator with invariant probability measure m, and assume that Q is not uniformly ergodic. For ε ∈ (0, 1) define P = P_ε := εI +(1 − ε)Q. We shall prove that P^∗P is ergodic. Clearly m is invariant also for P and for P^∗P. For A ∈ Σ we have $P^*P1_A = \varepsilon^2 1_A + \varepsilon(1 - \varepsilon)(Q^*1_A + Q1_A) + (1 - \varepsilon)^2 Q^*Q1_A.$ If P^∗P1_A = 1_A a.e., then for almost every x ∉ A the above summands are zero, so in particular Q1_A ≤ 1_A a.e. Since m is invariant, Q1_A = 1_A, and A is trivial by the ergodicity of Q. By definition (I − P)L²(m) = (I − Q)L²(m), so when Q is not uniformly ergodic (I − P)L²(m) is not closed; hence P is not uniformly ergodic.
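The effect of the lazy modification on P^∗P can be seen on a finite chain. The sketch below is only a partial illustration under stated assumptions: every ergodic chain on a finite state space is uniformly ergodic, so the non-uniformly-ergodic part of the example cannot be reproduced this way. The three-state matrix Q, the value ε = 1/2 and the helper names are choices made for the illustration. It relies on the fact that, for a finite Markov matrix that is self-adjoint in L²(m) with m > 0, ergodicity is equivalent to connectedness of the graph of its positive entries.

```python
from fractions import Fraction as F

# Q: ergodic three-state chain with uniform invariant vector; Q*Q is not ergodic.
Q = [[F(1, 2), F(1, 2), F(0)],
     [F(0),    F(0),    F(1)],
     [F(1, 2), F(1, 2), F(0)]]
m = [F(1, 3)] * 3


def lazy(Q, eps):
    """P_eps = eps * I + (1 - eps) * Q."""
    n = len(Q)
    return [[eps * (1 if i == j else 0) + (1 - eps) * Q[i][j] for j in range(n)]
            for i in range(n)]


def star_times(P, m):
    """Matrix of P*P in L^2(m), using P*(x, z) = m(z) P(z, x) / m(x)."""
    n = len(P)
    return [[sum(m[z] * P[z][x] / m[x] * P[z][y] for z in range(n))
             for y in range(n)] for x in range(n)]


def is_connected(A):
    """Depth-first search over the graph with an edge x -> y whenever A[x][y] > 0."""
    n = len(A)
    seen, stack = {0}, [0]
    while stack:
        x = stack.pop()
        for y in range(n):
            if A[x][y] > 0 and y not in seen:
                seen.add(y)
                stack.append(y)
    return len(seen) == n


eps = F(1, 2)
P = lazy(Q, eps)

QstarQ_ergodic = is_connected(star_times(Q, m))
PstarP_ergodic = is_connected(star_times(P, m))

# I - P = (1 - eps)(I - Q), so the ranges of I - P and I - Q coincide.
same_range_identity = all(
    (1 if i == j else 0) - P[i][j] == (1 - eps) * ((1 if i == j else 0) - Q[i][j])
    for i in range(3) for j in range(3)
)

print(QstarQ_ergodic, PstarP_ergodic, same_range_identity)
```

The identity I − P = (1 − ε)(I − Q) checked at the end is the finite-dimensional counterpart of the equality (I − P)L²(m) = (I − Q)L²(m) used in the example.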

6. Geometric ergodicity

Definition.

A Markov operator P with invariant probability measure m is called geometrically ergodic if, for some ρ < 1, $M_x := \sup_n \rho^{-n} \|P^n(x, \cdot) - m\|_{TV} < \infty \quad \text{a.e.}$

Geometric ergodicity implies aperiodic Harris recurrence and α-mixing, with the α-mixing coefficients α(n) converging to 0 exponentially fast; see [4, Section 3.2].

Theorem 17 (Doukhan-Massart-Rio, [12]).

Let Σ be countably generated and let P be a geometrically ergodic Markov operator. Then any centered f with ∫ | f |² log⁺ | f | dm < ∞ satisfies the annealed CLT.

Theorem 18 (Roberts-Tweedie, [23]).

Let Σ be countably generated, and let P be a Harris positive recurrent Markov chain. If ‖Pⁿ − E‖₂ → 0, then P is geometrically ergodic.

Note that ‖Pⁿ − E‖₂ → 0 does not necessarily imply Harris recurrence; therefore Harris recurrence must be assumed.

Note.

The converse may fail: [3] and [16] contain examples of geometrically ergodic P with some centered f ∈ L²(m) which does not satisfy the annealed CLT, so lim_{n→∞} ‖Pⁿ − E‖₂ > 0.

Theorem 19.

Let P be a Markov operator with invariant probability measure m, and assume that P is normal in L²(m). Then ‖Pⁿ − E‖₂ → 0 if (and only if) the α-mixing coefficients converge to zero (at least) exponentially fast.

Bradley ([5]) proved the theorem when P is symmetric.

In general, if P is geometrically ergodic, then P is Harris aperiodic and the α-mixing coefficients converge to zero exponentially fast. We do not know if a Harris aperiodic P whose α-mixing coefficients converge to zero exponentially fast is geometrically ergodic.

Corollary 20.

Let Σ be countably generated. If a Markov operator P is geometrically ergodic, and is additionally normal in L²(m), then $\|P^n - E\|_2 \to 0.$

The symmetric case is in [22]. For S countable Corollary 20 is established in [26].

Remarks.

P in Theorem 19 need not be Harris recurrent.
When Σ is countably generated and P is Harris recurrent and normal in L²(m), Theorems 18 and 19 yield that exponential decay to 0 of α(n), geometric ergodicity and ρ-mixing are equivalent.

In Bradley’s and Häggström’s examples P is geometrically ergodic, and every centered f ∈ L^p(m), p > 2, satisfies the CLT, by Theorem 17; however, P does not have a spectral gap in L^p(m), i.e. lim_{n→∞} ‖Pⁿ − E‖_p > 0, since otherwise it would imply lim_{n→∞} ‖Pⁿ − E‖₂ = 0 ([24]), and so the CLT for every centered f ∈ L²(m). By Corollary 20, P in such examples cannot be normal in L²(m).

The examples of Bradley and Häggström show that without normality Theorem 19 fails, although we have geometric ergodicity.

Problem 4.

Let P be a Harris aperiodic Markov chain, and suppose that every centered f such that ∫ | f |² log⁺ | f | dm < ∞ satisfies the annealed CLT. Does this imply that P is geometrically ergodic? (Is a converse of Theorem 17 true?)

Dedecker informed the author that an example of Bradley ([6]) exhibits P Harris recurrent which is not geometrically ergodic, such that every $f \in L_0^p(m)$, p > 2, satisfies the annealed CLT. In Problem 4 we (necessarily) assume more, i.e. that the annealed CLT is satisfied by a strictly larger subset of $L_0^2(m)$.

⁽¹⁾ See the definition, under the name “asymptotically uncorrelated”, in [24, pp. 206–207].
