
Hewitt-Savage 0-1 Law

Let \((X_i)_{i\in\mathbb{N}}\) be a family of independent and identically distributed (i.i.d.) random variables defined on a common probability space and taking values in a measurable space \((\mathsf{X}, \mathcal{X})\).

Equivalent Formulation: We can also consider \(X_i\) as the coordinate projection associated with the probability space \((\mathsf{X}^\mathbb{N}, \mathcal{X}^{\otimes\mathbb{N}}, \mathbb{P})\), where, under \(\mathbb{P}\), the sequence \((X_i)\) is i.i.d.

Permutation-Invariant σ-Fields: For any \(n \in \mathbb{N}\), let \(\mathcal{G}_n\) be the σ-field generated by measurable functions \(f: \mathsf{X}^\mathbb{N} \to \mathbb{R}\) that are invariant under any permutation of the first \(n\) coordinates. Define the σ-field $\mathcal{G}_\infty$ as: \[ \mathcal{G}_\infty = \bigcap_{n\in\mathbb{N}} \mathcal{G}_n, \] which is generated by measurable functions invariant under any permutation of a finite number of coordinates.

Statement: For any \(A \in \mathcal{G}_\infty\), \(\mathbb{P}(A) = 0\) or \(1\).

This result is known as the Hewitt-Savage 0-1 Law.

Proof of the Hewitt-Savage 0-1 Law

Objective: Show that \(\mathbb{P}(A) = \mathbb{P}(A)^2\), which implies \(\mathbb{P}(A) \in \{0,1\}\).

Key Idea: We aim to establish the identity \(\mathbb{P}(A) = \mathbb{E}[\mathsf{1}_A \mathsf{1}_A] = \mathbb{E}[\mathsf{1}_A]^2 = \mathbb{P}(A)^2\). To do so, the idea is to approximate \(\mathsf{1}_A\) simultaneously by the indicator functions of two independent events, built from disjoint blocks of coordinates.

Step 1 (Approximation Lemma). Let \(\delta > 0\). By the Approximation Lemma (proved below), there exist \(n \in \mathbb{N}\) and a set \(B \in \mathcal{F}_n = \sigma(X_{1:n})\) such that: \[ \mathbb{E}\left[|\mathsf{1}_A - \mathsf{1}_B|\right] \leq \delta. \] Since \(B \in \mathcal{F}_n\), there exists \(\bar{B} \in \mathcal{X}^{\otimes n}\) such that: \[ \mathsf{1}_B = \mathsf{1}_{\bar{B}}(X_{1:n}). \] Thus, we have: \[ \mathbb{E}\left[|\mathsf{1}_A - \mathsf{1}_{\bar{B}}(X_{1:n})|\right] \leq \delta. \]

Step 2 (Permutation argument). Define the permutation \(\pi\) of \(\{1, \ldots, 2n\}\) that swaps the first \(n\) coordinates with the next \(n\): \[ \pi(1:2n) = (n+1:2n, 1:n). \] Since the \((X_i)\) are i.i.d., the sequence \((X_{\pi(1:2n)}, X_{2n+1:\infty})\) has the same distribution as \((X_i)_{i\in\mathbb{N}}\), and since \(A \in \mathcal{G}_\infty\), the event \(A\) is invariant under \(\pi\). Therefore: \[ \delta \geq \mathbb{E}\left[|\mathsf{1}_A - \mathsf{1}_{\bar{B}}(X_{1:n})|\right] = \mathbb{E}\left[|\mathsf{1}_A(X_{\pi(1:2n)}, X_{2n+1:\infty}) - \mathsf{1}_{\bar{B}}(X_{n+1:2n})|\right] = \mathbb{E}\left[|\mathsf{1}_A - \mathsf{1}_{\bar{B}}(X_{n+1:2n})|\right], \] where the first equality uses the equality in distribution and the second the invariance of \(A\) under \(\pi\).

Step 3 (Independence and expectation). The random variables \(\mathsf{1}_{\bar{B}}(X_{1:n})\) and \(\mathsf{1}_{\bar{B}}(X_{n+1:2n})\) depend on disjoint blocks of the i.i.d. sequence, hence are independent, so that \(\mathbb{E}[\mathsf{1}_{\bar{B}}(X_{1:n})\mathsf{1}_{\bar{B}}(X_{n+1:2n})] = \mathbb{E}[\mathsf{1}_{\bar{B}}(X_{1:n})]\,\mathbb{E}[\mathsf{1}_{\bar{B}}(X_{n+1:2n})]\). Combining this identity with the bounds of Steps 1 and 2, we obtain: $$ |\mathbb{E}[\mathsf{1}_A \mathsf{1}_A] - \mathbb{E}[\mathsf{1}_A]\mathbb{E}[\mathsf{1}_A]| \leq |\mathbb{E}[\mathsf{1}_A \mathsf{1}_A] - \mathbb{E}[\mathsf{1}_{\bar{B}}(X_{1:n})\mathsf{1}_{\bar{B}}(X_{n+1:2n})]| + | \mathbb{E}[\mathsf{1}_{\bar{B}}(X_{1:n})]\mathbb{E}[\mathsf{1}_{\bar{B}}(X_{n+1:2n})]- \mathbb{E}[\mathsf{1}_A] \mathbb{E} [\mathsf{1}_A]| \leq 4\delta. $$ Since \(\delta > 0\) is arbitrary, we conclude that: \[ \mathbb{E}[\mathsf{1}_A \mathsf{1}_A] = \mathbb{E}[\mathsf{1}_A]^2, \] which implies \(\mathbb{P}(A) = \mathbb{P}(A)^2\). Therefore, \(\mathbb{P}(A) \in \{0,1\}\). \(\Box\)
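As a numerical sanity check, here is a minimal Python sketch (a hypothetical toy setup, not part of the proof) of the block-independence identity used in Step 3: an event \(\bar{B}\) evaluated on the block \(X_{1:n}\) is independent of the same event evaluated on \(X_{n+1:2n}\).

<code python>
import numpy as np

# Toy check of Step 3: for i.i.d. coordinates, indicators built from the
# disjoint blocks X_{1:n} and X_{n+1:2n} are independent.
rng = np.random.default_rng(0)
n, n_mc = 5, 200_000
X = rng.random((n_mc, 2 * n))        # i.i.d. Uniform(0,1) coordinates

def in_B(block):                     # a finite-dimensional event \bar{B}:
    return block.mean(axis=1) > 0.5  # the block average exceeds 1/2

b1 = in_B(X[:, :n])                  # 1_{\bar B}(X_{1:n})
b2 = in_B(X[:, n:])                  # 1_{\bar B}(X_{n+1:2n})

print(np.mean(b1 & b2))              # ~ E[1_{\bar B}(X_{1:n}) 1_{\bar B}(X_{n+1:2n})]
print(np.mean(b1) * np.mean(b2))     # ~ E[1_{\bar B}(X_{1:n})] E[1_{\bar B}(X_{n+1:2n})]
</code>

Up to Monte Carlo error, the two printed values coincide, as the independence identity predicts.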

Conclusion: The Hewitt-Savage 0-1 Law states that any event in the exchangeable σ-field \(\mathcal{G}_\infty\), i.e. any event invariant under finite permutations of the coordinates, has probability 0 or 1. Note that \(\mathcal{G}_\infty\) contains the tail σ-field but is in general strictly larger: for real-valued \(X_i\), the event \(\{S_n = 0 \text{ infinitely often}\}\), where \(S_n = X_1 + \cdots + X_n\), is exchangeable but not a tail event.


Approximation Lemma

Let $(\mathsf{X}^{\mathbb{N}},\mathcal{X}^{\otimes\mathbb{N}},\mathbb{P})$ be a probability space. For each $k\in\mathbb{N}$, define $$ \mathcal{F}_k=\sigma(X_{1:k}), $$ where $X_i$ denotes the coordinate projection on the $i$-th component, that is, $X_i(\omega)=\omega_i$ for $\omega\in\mathsf{X}^{\mathbb{N}}$.

Lemma (Approximation Lemma). Any set $A\in\mathcal{X}^{\otimes\mathbb{N}}$ satisfies the approximation property:

For every $\delta>0$, there exist an integer $k\in\mathbb{N}$ and a set $B\in\mathcal{F}_k$ such that $$ \mathbb{E}\big[|\mathbb{1}_A-\mathbb{1}_B|\big]\leq\delta. $$

Proof

The proof is a classical application of the monotone class theorem.

Let $\mathcal{M}$ denote the collection of all sets $A\in\mathcal{X}^{\otimes\mathbb{N}}$ that satisfy the approximation property stated in the lemma. We show that $\mathcal{M}$ is a monotone class.

  • Stability under set differences.

Let $A_0,A_1\in\mathcal{M}$ with $A_0\subset A_1$. We claim that $A_1\setminus A_0\in\mathcal{M}$.

For arbitrary sets $A_0,A_1,B_0,B_1$, the following identity holds: $$ \mathbb{1}_{A_1\setminus A_0}-\mathbb{1}_{B_1\setminus B_0} =\mathbb{1}_{A_1}\mathbb{1}_{A_0^c}-\mathbb{1}_{B_1}\mathbb{1}_{B_0^c} =\mathbb{1}_{A_1}(\mathbb{1}_{A_0^c}-\mathbb{1}_{B_0^c}) +(\mathbb{1}_{A_1}-\mathbb{1}_{B_1})\mathbb{1}_{B_0^c}. $$

Taking absolute values and expectations yields $$ \mathbb{E}\big[|\mathbb{1}_{A_1\setminus A_0}-\mathbb{1}_{B_1\setminus B_0}|\big] \leq \mathbb{E}\big[|\mathbb{1}_{A_0}-\mathbb{1}_{B_0}|\big] + \mathbb{E}\big[|\mathbb{1}_{A_1}-\mathbb{1}_{B_1}|\big]. $$

Since $A_0,A_1\in\mathcal{M}$, both terms on the right-hand side can be made arbitrarily small: choose $B_0\in\mathcal{F}_{k_0}$ and $B_1\in\mathcal{F}_{k_1}$ accordingly, and note that $B_1\setminus B_0\in\mathcal{F}_k$ with $k=\max(k_0,k_1)$. Hence, the approximation property holds for $A_1\setminus A_0$.

  • Stability under increasing limits.

Let $(A_n)_{n\geq0}$ be an increasing sequence of sets in $\mathcal{M}$, and define $$ A=\bigcup_{n\geq0}A_n. $$

For any set $B$, the triangle inequality gives $$ \mathbb{E}\big[|\mathbb{1}_A-\mathbb{1}_B|\big] \leq \mathbb{E}\big[|\mathbb{1}_A-\mathbb{1}_{A_n}|\big] + \mathbb{E}\big[|\mathbb{1}_{A_n}-\mathbb{1}_B|\big]. $$

Since $\mathbb{1}_{A_n}\uparrow\mathbb{1}_A$ pointwise, the first term converges to zero by monotone convergence, while the second term can be made arbitrarily small because $A_n\in\mathcal{M}$. Therefore, $A\in\mathcal{M}$.

Finally, each $\mathcal{F}_k$ is contained in $\mathcal{M}$ (for $A\in\mathcal{F}_k$, simply take $B=A$), so $\mathcal{M}$ is a monotone class containing the algebra $\bigcup_{k\geq0}\mathcal{F}_k$. By the monotone class theorem, it contains $$ \sigma\Big(\bigcup_{k\geq0}\mathcal{F}_k\Big)=\mathcal{X}^{\otimes\mathbb{N}}. $$ This completes the proof. $\Box$
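To make the lemma concrete, here is a small Python sketch (a hypothetical example, with all names chosen for illustration): for i.i.d. Bernoulli(1/2) bits $(X_i)$, the event $A=\{\sum_{i\geq1} X_i 2^{-i}\leq t\}$ depends on all coordinates, but comparing only the first $k$ binary digits gives an event $B_k\in\mathcal{F}_k$ with $\mathbb{E}[|\mathbb{1}_A-\mathbb{1}_{B_k}|]\leq 2^{-k}$.

<code python>
import numpy as np

# U = sum_i X_i 2^{-i} is Uniform(0,1) for i.i.d. Bernoulli(1/2) bits X_i.
# A = {U <= t} depends on every coordinate; B_k compares only the first k
# bits of U and t, so B_k is in F_k = sigma(X_{1:k}).
rng = np.random.default_rng(1)
t, depth, n_mc = 0.3, 40, 100_000
bits = rng.integers(0, 2, size=(n_mc, depth))
U = bits @ (0.5 ** np.arange(1, depth + 1))           # U truncated to 'depth' bits

for k in (2, 5, 10):
    U_k = bits[:, :k] @ (0.5 ** np.arange(1, k + 1))  # first k bits of U
    t_k = np.floor(t * 2**k) / 2**k                   # first k bits of t
    A, B = U <= t, U_k <= t_k
    print(k, np.mean(A != B))   # E|1_A - 1_{B_k}|, decays like 2^{-k}
</code>

Indeed, the disagreement event \(\{\mathbb{1}_A\neq\mathbb{1}_{B_k}\}\) is contained in an interval of \(U\)-values of length at most \(2^{-k}\) just above \(t\), which is the decay observed numerically.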


Upcrossing Inequality and Martingale Convergence

Doob's Upcrossing Inequality

Let $a<b$ and let $(X_k)_{k\ge0}$ be a real-valued process. We define an indicator process $(C_k)$ that tracks the phases during which the process is performing an upcrossing from $a$ to $b$.

Set $$ C_0 := \mathbf 1_{\{X_0<a\}}. $$ For all $k\ge1$, define recursively $$ C_k = \mathbf 1_{\{C_{k-1}=1\}}\mathbf 1_{\{X_{k-1}\le b\}} + \mathbf 1_{\{C_{k-1}=0\}}\mathbf 1_{\{X_{k-1}<a\}}. $$

Define also $$ Y_k := \sum_{\ell=1}^k C_\ell (X_\ell - X_{\ell-1}). $$

Interpretation.

  • $C_k=1$ indicates that the process is currently in an upcrossing phase, started after going below $a$.
  • Once the process exceeds $b$, the upcrossing phase ends and $C_k$ returns to $0$.
  • The process $Y_k$ accumulates only the increments of $X$ occurring during these upcrossing phases.

Let $U_{0:n}[a,b]$ denote the number of completed upcrossings from $a$ to $b$ between times $0$ and $n$. Then the following inequality holds: $$ Y_n \ge (b-a)\,U_{0:n}[a,b] - (X_n-a)^-. $$

Proof

Each completed upcrossing from $a$ to $b$ contributes at least $b-a$ to the sum $Y_n$, since only the increments occurring during the corresponding phase are accumulated.

The only possible negative contribution comes from an unfinished upcrossing at time $n$, when $X_n<a$. This loss is exactly controlled by the term $(X_n-a)^-$.

Summing the contributions of all completed upcrossings and accounting for the final correction yields the stated inequality. ∎
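The following Python sketch (a hypothetical toy setup, with names chosen for illustration) implements the recursion defining \((C_k)\), accumulates \(Y_n\), counts the completed upcrossings, and checks the inequality \(Y_n \ge (b-a)\,U_{0:n}[a,b] - (X_n-a)^-\) path by path on simulated random walks.

<code python>
import numpy as np

def upcrossings(X, a, b):
    """C_k recursion from the text; returns (Y_n, U_{0:n}[a,b], (X_n - a)^-)."""
    n = len(X) - 1
    C_prev = 1 if X[0] < a else 0          # C_0
    Y, U = 0.0, 0
    for k in range(1, n + 1):
        C = int(X[k-1] <= b) if C_prev == 1 else int(X[k-1] < a)
        Y += C * (X[k] - X[k-1])
        if C == 1 and X[k] > b:            # current upcrossing phase completes
            U += 1
        C_prev = C
    return Y, U, max(a - X[n], 0.0)

rng = np.random.default_rng(2)
a, b = -1.0, 1.0
for _ in range(5):
    X = np.concatenate(([0.0], np.cumsum(rng.choice([-1.0, 1.0], size=200))))
    Y, U, corr = upcrossings(X, a, b)
    assert Y >= (b - a) * U - corr         # the upcrossing inequality
    print(f"Y_n = {Y:+.1f}, U = {U}, (b-a)U - (X_n-a)^- = {(b-a)*U - corr:+.1f}")
</code>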

Theorem 1 (Doob's Forward Convergence Theorem)

Let $(X_n)_{n\ge0}$ be a supermartingale such that $$ \sup_n \mathbb E[|X_n|] < \infty. $$ Then the limit $$ \lim_{n\to\infty} X_n $$ exists almost surely.

Proof

By the upcrossing inequality, for any $a<b$, $$ (b-a)\,\mathbb E[U_{0:n}[a,b]] \le \mathbb E[Y_n] + \mathbb E[(X_n-a)^-]. $$

Since $(C_k)$ is predictable (each $C_k$ is a function of $X_{0:k-1}$), non-negative and bounded, and $(X_n)$ is a supermartingale, $(Y_n)$ is also a supermartingale; as $Y_0=0$, this gives $\mathbb E[Y_n]\le0$. The $L^1$ boundedness assumption implies that $\sup_n \mathbb E[(X_n-a)^-]<\infty$, since $(X_n-a)^-\le|X_n|+|a|$.

By the monotone convergence theorem, $$ \mathbb E[U_{0:\infty}[a,b]]=\lim_{n\to\infty}\mathbb E[U_{0:n}[a,b]]<\infty, $$ which implies that the total number of upcrossings of $[a,b]$ is almost surely finite. Taking a union over all rational pairs $a<b$, almost surely the process does not oscillate infinitely often across any interval, so $\lim_{n\to\infty}X_n$ exists in $[-\infty,+\infty]$ almost surely. Finally, Fatou's lemma gives $\mathbb E[|\lim_n X_n|]\le\sup_n\mathbb E[|X_n|]<\infty$, so the limit is almost surely finite. ∎
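As an illustration (a hypothetical example, not taken from the text), the following Python sketch simulates a non-negative martingale that is bounded in \(L^1\) but whose almost-sure limit is \(0\), showing that the convergence in Theorem 1 need not hold in \(L^1\).

<code python>
import numpy as np

# X_n = prod_{i<=n} (1 + eps_i/2), eps_i = ±1 i.i.d., is a non-negative
# martingale with E[X_n] = 1 for all n, hence bounded in L^1. Theorem 1 gives
# a.s. convergence; since E[log(1 + eps_1/2)] = (1/2) log(3/4) < 0, the a.s.
# limit is 0, while E[X_n] = 1: no L^1 convergence.
rng = np.random.default_rng(3)
n_steps, n_paths = 500, 10_000
eps = rng.choice([-0.5, 0.5], size=(n_paths, n_steps))
X = np.cumprod(1.0 + eps, axis=1)
print(np.quantile(X[:, -1], 0.999))   # even the 99.9% quantile is ~0 at n=500
print(X[:, 20].mean())                # empirical mean at small n is ~1
</code>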

Lévy's Downward Theorem

Intuition. Conditioning with respect to smaller and smaller $\sigma$-fields means losing information. The conditional expectations stabilize and converge to the conditional expectation with respect to the remaining common information.

Theorem 2 (Lévy's Downward Convergence Theorem)

Let $(\mathcal F_{-n})_{n\ge1}$ be a decreasing sequence of $\sigma$-fields such that $$ \mathcal F_{-\infty} = \bigcap_{k\ge1}\mathcal F_{-k} \subseteq \cdots \subseteq \mathcal F_{-(n+1)} \subseteq \mathcal F_{-n} \subseteq \cdots \subseteq \mathcal F_{-1}. $$

For any random variable $Z$ satisfying $\mathbb E[|Z|]<\infty$, we have $$ \lim_{n\to\infty}\mathbb E[Z\mid\mathcal F_{-n}] = \mathbb E[Z\mid\mathcal F_{-\infty}] \quad\text{almost surely}. $$

Proof

Define the process $$ X_k := \mathbb E[Z\mid\mathcal F_k], \qquad -n\le k\le -1. $$ This is a martingale indexed by negative times, bounded in $L^1$ since $\mathbb E[|X_k|]\le\mathbb E[|Z|]$.

As before, we define $C_{-n}=\mathsf{1}_{\{X_{-n}<a\}}$ and, for all $-n < k \leq -1$, $$ C_k=\mathsf{1}_{\{C_{k-1}=1\}}\mathsf{1}_{\{X_{k-1}\leq b\}}+\mathsf{1}_{\{C_{k-1}=0\}}\mathsf{1}_{\{X_{k-1}<a\}}. $$ Define also $Y_k=\sum_{\ell=-n+1}^k C_\ell (X_\ell - X_{\ell-1})$. Then, for the same reasons, the number of upcrossings $U_{-n:-1}[a,b]$ between times $-n$ and $-1$ satisfies $$ Y_{-1} \geq (b-a)\,U_{-n:-1}[a,b] - (X_{-1}-a)^-. $$

As before, $\mathbb E[Y_{-1}]\le0$ and $\mathbb E[(X_{-1}-a)^-]\le\mathbb E[|Z|]+|a|$, so the expected number of upcrossings $\mathbb E[U_{-n:-1}[a,b]]$ is bounded uniformly in $n$. Letting $n\to\infty$ and taking a union over rational pairs $a<b$ as in Theorem 1, we deduce that $(X_k)$ converges almost surely as $k\to-\infty$.

The limit is measurable with respect to $\mathcal F_{-n}$ for every $n$, hence with respect to $\mathcal F_{-\infty}$; moreover, the family $(\mathbb E[Z\mid\mathcal F_{-n}])_{n\ge1}$ is uniformly integrable, so for every $A\in\mathcal F_{-\infty}$ we may pass to the limit in $\mathbb E[X_{-n}\mathbf 1_A]=\mathbb E[Z\mathbf 1_A]$. This identifies the limit as $\mathbb E[Z\mid\mathcal F_{-\infty}]$, which completes the proof. ∎
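As a quick illustration (a hypothetical example): for i.i.d. integrable $X_i$ and $\mathcal F_{-n}=\sigma(S_n,X_{n+1},X_{n+2},\ldots)$ with $S_n=X_1+\cdots+X_n$, symmetry gives $\mathbb E[X_1\mid\mathcal F_{-n}]=S_n/n$, so Lévy's downward theorem yields almost-sure convergence of the empirical mean. The Python sketch below shows the stabilization of $S_n/n$ along a few paths.

<code python>
import numpy as np

# E[X_1 | F_{-n}] = S_n / n is a reverse martingale; Levy's downward theorem
# gives a.s. convergence (here to E[X_1] = 2, cf. the LLN discussed later).
rng = np.random.default_rng(4)
n = 10_000
X = rng.exponential(scale=2.0, size=(3, n))            # E[X_1] = 2
S_over_n = np.cumsum(X, axis=1) / np.arange(1, n + 1)
print(S_over_n[:, [9, 99, 999, 9999]])                 # rows stabilize near 2.0
</code>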


De Finetti's Representation Theorem for Exchangeable Random Elements

Theorem

De Finetti's Theorem: Let $(X_i)_{i\in\mathbb{N}}$ be a family of exchangeable random elements taking values in a measurable space $(\mathsf{X},\mathcal{X})$. Then, there exists a $\sigma$-field $\mathcal{G}_\infty$ such that, conditionally on $\mathcal{G}_\infty$, the random variables $(X_i)_{i\in\mathbb{N}}$ are independent and identically distributed (i.i.d.).

The proof is based on the paper “Uses of exchangeability” by J. F. C. Kingman (Annals of Probability, 1978).

Proof

Without loss of generality, we model $(X_i)_{i\in\mathbb{N}}$ as the coordinate projections on the canonical probability space $(\mathsf{X}^{\mathbb{N}},\mathcal{X}^{\otimes\mathbb{N}},\mathbb{P})$. We proceed as follows:

1. Reverse filtration construction:

  • For each $n\in\mathbb{N}$, let $\mathcal{G}_n$ be the $\sigma$-field generated by all measurable functions $f:\mathsf{X}^{\mathbb{N}}\to\mathbb{R}$ invariant under any permutation of the first $n$ coordinates, i.e. for any permutation $\pi$ of $\{1,\ldots,n\}$ and any $x\in\mathsf{X}^{\mathbb{N}}$,

$$ f(x_1,\ldots,x_n,x_{n+1},\ldots)=f(x_{\pi(1)},\ldots,x_{\pi(n)},x_{n+1},\ldots). $$

  • $(\mathcal{G}_n)_{n\in\mathbb{N}}$ is a reverse filtration with $\mathcal{G}_n\supseteq\mathcal{G}_{n+1}$ and

$$ \mathcal{G}_\infty=\bigcap_{n\in\mathbb{N}}\mathcal{G}_n. $$

  • $\mathcal{G}_\infty$ is generated by all functions invariant under any finite permutation of coordinates.

2. Conditional expectation and empirical averages:

  • For any bounded measurable $h:\mathsf{X}\to\mathbb{R}$, the empirical average $\frac1n\sum_{i=1}^n h(X_i)$ is $\mathcal{G}_n$-measurable.
  • For any $A\in\mathcal{G}_n$, exchangeability implies

$$ \mathbb{E}\!\left[\left(\frac1n\sum_{i=1}^n h(X_i)\right)\mathbf1_A\right] =\frac1n\sum_{i=1}^n \mathbb{E}\!\left[\left( h(X_i)\right)\mathbf1_A\right] =\mathbb{E}[h(X_1)\mathbf1_A]. $$

  • The two previous items establish the remarkable identity:

$$ \frac1n\sum_{i=1}^n h(X_i)=\mathbb{E}[h(X_1)\mid\mathcal{G}_n], \quad a.s. $$

By Lévy's downward convergence theorem applied along the reverse filtration $(\mathcal{G}_n)$, letting $n\to\infty$ yields $$ \frac1n\sum_{i=1}^n h(X_i)\xrightarrow{\mathrm{a.s.}}\mathbb{E}[h(X_1)\mid\mathcal{G}_\infty]. $$

3. Multivariate functions:

  • For bounded measurable $f:\mathsf{X}^k\to\mathbb{R}$, the same argument as in the previous step, applied to the (symmetric, $\mathcal{G}_n$-measurable) average over distinct indices, gives

$$ \mathbb{E}[f(X_1,\ldots,X_k)\mid\mathcal{G}_\infty] =\lim_{n\to\infty}\frac1{n(n-1)\cdots(n-k+1)} \sum_{\substack{i_{1:k} \in [1:n]^k\\ i_j\neq i_\ell}} f(X_{i_1},\ldots,X_{i_k}). $$

  • On the other hand, since $f$ is bounded, the diagonal terms (with at least two equal indices) contribute $O(1/n)$, so that (see the numerical check after this proof)

$$ \frac1{n(n-1)\cdots(n-k+1)} \sum_{\substack{i_{1:k} \in [1:n]^k\\ i_j\neq i_\ell}} f(X_{i_1},\ldots,X_{i_k}) + O\!\left(\frac1n\right) = \frac1{n^k}\sum_{i_1=1}^n\cdots\sum_{i_k=1}^n f(X_{i_1},\ldots,X_{i_k}). $$

  • Hence, for product functions $f(x_1,\ldots,x_k)=f_1(x_1)\cdots f_k(x_k)$,

$$ \mathbb{E}[f_1(X_1)\cdots f_k(X_k)\mid\mathcal{G}_\infty] =\prod_{\ell=1}^k\mathbb{E}[f_\ell(X_1)\mid\mathcal{G}_\infty]. $$

  • Thus, conditionally on $\mathcal{G}_\infty$, the $(X_i)$ are independent and identically distributed, with common conditional distribution $B\mapsto\mathbb{E}[\mathbf{1}_B(X_1)\mid\mathcal{G}_\infty]$. $\blacksquare$
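Here is a short Python check (a hypothetical example) of the $O(1/n)$ comparison above between the average over distinct indices ($U$-statistic) and the average over all indices ($V$-statistic), with $k=2$ and $f(x,y)=xy$.

<code python>
import numpy as np

# U-statistic (distinct pairs) vs V-statistic (all pairs) for f(x, y) = x*y.
rng = np.random.default_rng(5)
for n in (10, 100, 1000):
    x = rng.random(n)
    V = x.sum() ** 2 / n**2                            # (1/n^2) * sum over all pairs
    U = (x.sum() ** 2 - (x**2).sum()) / (n * (n - 1))  # distinct pairs only
    print(n, abs(U - V))                               # gap shrinks like O(1/n)
</code>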

If $(X_i)$ are real-valued random variables, then, since the distribution of any random variable $X$ is completely determined by the values $\mathbb{P}(X \le x)$ for $x \in \mathbb{Q}$, we can replace $\mathcal{G}_\infty$ with the $\sigma$-field generated by the countable family of random variables \[ \left\{ \mathbb{E}\big[\mathbf{1}_{\{X_1 \le x\}} \mid \mathcal{G}_\infty \big] : x \in \mathbb{Q} \right\}. \] It follows that there exists a random variable $S$ such that, conditional on $S$, the sequence $(X_i)$ is independent and identically distributed.
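As a concrete illustration of the theorem (a hypothetical example, not taken from the text): a Pólya urn produces an exchangeable but non-i.i.d. binary sequence, and de Finetti's representation appears in the fact that the empirical frequency converges almost surely to a random limit $\Theta$ (here $\Theta\sim\mathrm{Beta}(1,1)$, i.e. uniform), conditionally on which the draws are i.i.d. Bernoulli($\Theta$).

<code python>
import numpy as np

# Polya urn: start with 1 red and 1 blue ball; draw uniformly and return the
# ball together with an extra ball of the same colour. The colour sequence is
# exchangeable but not independent; conditionally on the limiting frequency
# Theta ~ Beta(1,1) = Uniform(0,1), the draws are i.i.d. Bernoulli(Theta).
rng = np.random.default_rng(6)

def polya_frequency(n_draws):
    red, blue, reds = 1, 1, 0
    for _ in range(n_draws):
        if rng.random() < red / (red + blue):   # drew a red ball
            red, reds = red + 1, reds + 1
        else:
            blue += 1
    return reds / n_draws

freqs = [polya_frequency(2000) for _ in range(1000)]
print(np.quantile(freqs, [0.1, 0.25, 0.5, 0.75, 0.9]))
# Close to (0.1, 0.25, 0.5, 0.75, 0.9): the limiting frequency is Uniform(0,1).
</code>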

Comments: Another proof of the Law of Large Numbers for i.i.d. Random Variables

The previous approach also yields the strong law of large numbers (a proof of the LLN using only the dominated convergence theorem is available on the linked page).

Let \((X_i)\) be a sequence of independent and identically distributed (i.i.d.) random variables such that \(\mathbb{E}[|X_1|] < \infty\). Then, \[ \frac{1}{n} \sum_{i=1}^n X_i \xrightarrow{a.s.} \mathbb{E}[X_1]. \]

Proof: By the reverse martingale convergence theorem (see Upcrossing Inequality and Martingale Convergence), applied to the identity $\frac{1}{n}\sum_{i=1}^n X_i = \mathbb{E}[X_1|\mathcal{G}_n]$ established in the proof of de Finetti's theorem, the empirical average converges almost surely to the conditional expectation with respect to the exchangeable σ-field \(\mathcal{G}_\infty\): \[ \frac{1}{n} \sum_{i=1}^n X_i = \mathbb{E}[X_1|\mathcal{G}_n] \xrightarrow{a.s.} \mathbb{E}[X_1|\mathcal{G}_\infty]. \]

The Hewitt-Savage 0-1 Law implies that any \(\mathcal{G}_\infty\)-measurable random variable is almost surely constant. Consequently, the conditional expectation \(\mathbb{E}[X_1|\mathcal{G}_\infty]\) is almost surely equal to the unconditional expectation: \[ \mathbb{E}[X_1|\mathcal{G}_\infty] = \mathbb{E}[X_1] \quad \text{a.s.} \]

Thus, the empirical average converges almost surely to the theoretical expectation: \[ \frac{1}{n} \sum_{i=1}^n X_i \xrightarrow{a.s.} \mathbb{E}[X_1]. \]


Stochastic calculus for Machine Learning

Registration to the course

Program of the course

  • Where? ENS Paris-Saclay.
  • When?
Course | Topics | Material
1 | Brownian motion and the Wiener space: definitions and first properties | TD 0 and TD 1 and notes
2 | Brownian motion: Markov property and further properties | TD 3 and notes
3 | End of Chapter 2. Filtration and measurability. Starting continuous martingales | complete notes chapter 2 and notes chapter 3
4 | Continuous martingales – Bounded variation processes | notes chapter 4 and TD3
5 | Local martingales and their bracket | notes chapter 5 and TD4
6 | |
7 | |
8 | |

Evaluation

Contact

Name | Email address
Alain Durmus | alain.durmus "Arobase" polytechnique.edu
