This is an old revision of the document!

$$ \newcommand{\arginf}{\mathrm{arginf}} \newcommand{\argmin}{\mathrm{argmin}} \newcommand{\argmax}{\mathrm{argmax}} \newcommand{\asconv}[1]{\stackrel{#1-a.s.}{\rightarrow}} \newcommand{\Aset}{\mathsf{A}} \newcommand{\b}[1]{{\mathbf{#1}}} \newcommand{\ball}[1]{\mathsf{B}(#1)} \newcommand{\bbQ}{{\mathbb Q}} \newcommand{\bproof}{\textbf{Proof :}\quad} \newcommand{\bmuf}[2]{b_{#1,#2}} \newcommand{\card}{\mathrm{card}} \newcommand{\chunk}[3]{{#1}_{#2:#3}} \newcommand{\condtrans}[3]{p_{#1}(#2|#3)} \newcommand{\convprob}[1]{\stackrel{#1-\text{prob}}{\rightarrow}} \newcommand{\Cov}{\mathbb{C}\mathrm{ov}} \newcommand{\cro}[1]{\langle #1 \rangle} \newcommand{\CPE}[2]{\PE\lr{#1| #2}} \renewcommand{\det}{\mathrm{det}} \newcommand{\dimlabel}{\mathsf{m}} \newcommand{\dimU}{\mathsf{q}} \newcommand{\dimX}{\mathsf{d}} \newcommand{\dimY}{\mathsf{p}} \newcommand{\dlim}{\Rightarrow} \newcommand{\e}[1]{{\left\lfloor #1 \right\rfloor}} \newcommand{\eproof}{\quad \Box} \newcommand{\eremark}{</WRAP>} \newcommand{\eqdef}{:=} \newcommand{\eqlaw}{\stackrel{\mathcal{L}}{=}} \newcommand{\eqsp}{\;} \newcommand{\Eset}{ {\mathsf E}} \newcommand{\esssup}{\mathrm{essup}} \newcommand{\fr}[1]{{\left\langle #1 \right\rangle}} \newcommand{\falph}{f} \renewcommand{\geq}{\geqslant} \newcommand{\hchi}{\hat \chi} \newcommand{\Hset}{\mathsf{H}} \newcommand{\Id}{\mathrm{Id}} \newcommand{\img}{\text{Im}} \newcommand{\indi}[1]{\mathbf{1}_{#1}} \newcommand{\indiacc}[1]{\mathbf{1}_{\{#1\}}} \newcommand{\indin}[1]{\mathbf{1}\{#1\}} \newcommand{\itemm}{\quad \quad \blacktriangleright \;} \newcommand{\jointtrans}[3]{p_{#1}(#2,#3)} \newcommand{\ker}{\text{Ker}} \newcommand{\klbck}[2]{\mathrm{K}\lr{#1||#2}} \newcommand{\law}{\mathcal{L}} \newcommand{\labelinit}{\pi} \newcommand{\labelkernel}{Q} \renewcommand{\leq}{\leqslant} \newcommand{\lone}{\mathsf{L}_1} \newcommand{\lp}[1]{\mathsf{L}_{{#1}}} \newcommand{\lrav}[1]{\left|#1 \right|} \newcommand{\lr}[1]{\left(#1 \right)} \newcommand{\lrb}[1]{\left[#1 \right]} \newcommand{\lrc}[1]{\left\{#1 \right\}} \newcommand{\lrcb}[1]{\left\{#1 \right\}} \newcommand{\ltwo}[1]{\PE^{1/2}\lrb{\lrcb{#1}^2}} \newcommand{\Ltwo}{\mathrm{L}^2} \newcommand{\mc}[1]{\mathcal{#1}} \newcommand{\mcbb}{\mathcal B} \newcommand{\mcD}{\mathcal{D}} \newcommand{\mcf}{\mathcal{F}} \newcommand{\mcl}{\mathcal{L}} \newcommand{\meas}[1]{\mathrm{M}_{#1}} \newcommand{\norm}[1]{\left\|#1\right\|} \newcommand{\normmat}[1]{{\left\vert\kern-0.25ex\left\vert\kern-0.25ex\left\vert #1 \right\vert\kern-0.25ex\right\vert\kern-0.25ex\right\vert}} \newcommand{\nset}{\mathbb N} \newcommand{\N}{\mathcal{N}} \newcommand{\one}{\mathsf{1}} \newcommand{\PE}{\mathbb E} \newcommand{\pminfty}{_{-\infty}^\infty} \newcommand{\PP}{\mathbb P} \newcommand{\projorth}[1]{\mathsf{P}^\perp_{#1}} \newcommand{\Psif}{\Psi_f} \newcommand{\pscal}[2]{\langle #1,#2\rangle} \newcommand{\pscal}[2]{\langle #1,#2\rangle} \newcommand{\psconv}{\stackrel{\PP-a.s.}{\rightarrow}} \newcommand{\qset}{\mathbb Q} \newcommand{\revcondtrans}[3]{q_{#1}(#2|#3)} \newcommand{\rmd}{\mathrm d} \newcommand{\rme}{\mathrm e} \newcommand{\rmi}{\mathrm i} \newcommand{\Rset}{\mathbb{R}} \newcommand{\rset}{\mathbb{R}} \newcommand{\rti}{\sigma} \newcommand{\section}[1]{==== #1 ====} \newcommand{\seq}[2]{\lrc{#1\eqsp: \eqsp #2}} \newcommand{\set}[2]{\lrc{#1\eqsp: \eqsp #2}} \newcommand{\sg}{\mathrm{sgn}} \newcommand{\supnorm}[1]{\left\|#1\right\|_{\infty}} \newcommand{\thv}{{\theta_\star}} \newcommand{\tmu}{ {\tilde{\mu}}} \newcommand{\Tset}{ {\mathsf{T}}} \newcommand{\Tsigma}{ {\mathcal{T}}} \newcommand{\ttheta}{{\tilde \theta}} \newcommand{\tv}[1]{\left\|#1\right\|_{\mathrm{TV}}} \newcommand{\unif}{\mathrm{Unif}} \newcommand{\weaklim}[1]{\stackrel{\mathcal{L}_{#1}}{\rightsquigarrow}} \newcommand{\Xset}{{\mathsf X}} \newcommand{\Xsigma}{\mathcal X} \newcommand{\Yset}{{\mathsf Y}} \newcommand{\Ysigma}{\mathcal Y} \newcommand{\Var}{\mathbb{V}\mathrm{ar}} \newcommand{\zset}{\mathbb{Z}} \newcommand{\Zset}{\mathsf{Z}} $$

2025/04/30 10:21

Krein-Milman, Choquet and Birkhoff-Von Neuman Theorems

Definition: Let $K$ be convex. We say that $x \in K$ is an extreme point if $$ \forall y,z \in K,\ \forall \lambda \in [0,1],\ x = \lambda y + (1-\lambda)z \Rightarrow \lambda \in \{0,1\}. $$ We denote by $\mathrm{Extr}(K)$ the set of extreme points.

Theorem (Krein–Milman). Let $E$ be a Euclidean space of dimension $n$ and $K \subset E$ convex, compact, non-empty. Then $$ K = \mathrm{conv}(\mathrm{Extr}(K)). $$ where the notation $\mathrm{conv}(G)$ stands for the convex enveloppe of the set $G$.

Proof.

It suffices to show $K \subset \mathrm{conv}(\mathrm{Extr}(K))$, the reverse inclusion being clear by convexity of $K$.

The proof proceeds by induction on $n$, the dimension of $E$. We assume the property is true for every Euclidean subspace of dimension $n-1$. Without loss of generality, we assume that $K$ is not reduced to a singleton.

Let $x \in K$. Choose $y \in K \setminus \{x\}$ arbitrarily. Consider $[a,b]$ the maximal segment contained in $K$ and containing $[x,y]$. Since $a,b \in K$ and $x$ is a convex combination of $a,b$, it suffices to show that $a,b \in \mathrm{conv}(\mathrm{Extr}(K))$. We prove that $a \in \mathrm{conv}(\mathrm{Extr}(K))$ (the case $b \in \mathrm{conv}(\mathrm{Extr}(K))$ is similar). Since $a \in \partial K$, there exists a sequence $a_k \notin K$ converging to $a$ and for each $k$, we choose $u_k$ unitary such that for all $y \in K$, $\langle y-a_k,u_k \rangle \geq 0$. We can extract from $(u_k)$ a convergent subsequence on the unit sphere, and let $u$ be its limit. Passing to the limit yields $\langle y-a,u \rangle \geq 0$. Thus, defining $f(y)=\langle y-a,u \rangle$, we obtain that $a \in H_a=f^{-1}(0) \cap K=\mathrm{argmin}_{y\in K} f(y)$. The set $H_a$ is convex, compact, non-empty in a space of dimension $n-1$, and the induction hypothesis shows that $a \in \mathrm{conv}(\mathrm{Extr}(H_a))$. It remains to show that $\mathrm{Extr}(H_a) \subset \mathrm{Extr}(K)$. Indeed, since $f$ is linear, if $z \in \mathrm{Extr}(H_a)$ and $z=\lambda x + (1-\lambda) y$ with $x,y \in K$, then $f(z)=0=\lambda f(x) + (1-\lambda) f(y)$, hence $f(x)=f(y)=0$, which implies $x,y \in H_a$ and thus $\lambda \in \{0,1\}$ since $z \in \mathrm{Extr}(H_a)$.

Theorem (Choquet). Let $E$ be a Euclidean space and $K \subset E$ convex compact such that $\mathrm{Extr}(K)$ is compact. Let $f : E \to \mathbb{R}$ be linear and continuous. Then $f$ attains its minimum on $K$ at a point of $\mathrm{Extr}(K)$.

Proof.

Let $x \in \mathrm{argmin}_{y\in K} f(y)$. By Krein–Milman, there exist $y_1,\ldots,y_n \in \mathrm{Extr}(K)$ such that $x$ is a convex combination of the $(y_i)$, and since $f$ is linear and $f(x)=\min_{y\in K} f(y)$, we obtain $f(x)=f(y_1)=\cdots=f(y_n)$, which completes the proof.

Theorem (Birkhoff–Von Neumann). Let $B_n$ be the set of bistochastic matrices, i.e. defined by $$ B_n = \left\{ A = (a_{ij}) \in \mathcal{M}_n([0,1]) \mid \sum_j a_{ij} = 1,\ \sum_i a_{ij} = 1 \right\}. $$ Then $\mathrm{Extr}(B_n) = P_n$, the set of permutation matrices (i.e. matrices having exactly one $1$ in each row and each column).

Proof.

A permutation matrix is clearly extreme. We prove by induction on the dimension that any extreme matrix is a permutation matrix. Assume it is true for $n-1$. Now take $A \in \mathrm{Extr}(B_n)$.

We first show by contradiction that it has at most $2n-1$ nonzero entries. Indeed, otherwise there exist distinct pairs $(i_k,j_k)$ such that $a_{i_k j_k} > 0$ for all $k \in [1:2n]$. Define $F_1=\mathrm{Vect}(E_{i_k,j_k}, k \in [1:2n])$, where $E_{i,j}$ is the matrix whose only nonzero entry is at $(i,j)$ and equal to $1$. Let $F_2$ be the vector space of matrices whose row and column sums are zero (a basis is given by $(E_{ij} - E_{in} - E_{nj})$ with $i,j \in [1:n-1]$). Then $\dim F_1=2n$ and $\dim F_2=(n-1)^2$, hence $F_1$ and $F_2$ cannot be in direct sum since the sum of their dimensions is $n^2+1$, which exceeds the dimension of the space of $n \times n$ matrices. Thus there exists a nonzero matrix $N \in F_1 \cap F_2$. For $\epsilon$ small enough, $A \pm \epsilon N$ is bistochastic, and we would have $$ A=\frac{1}{2}(A+\epsilon N)+\frac{1}{2}(A-\epsilon N), $$ which contradicts that $A$ is extreme.

Thus $A$ has at most $2n-1$ nonzero entries. Therefore one of its $n$ rows contains only one nonzero entry, which must be equal to $1$. The other entries in the corresponding column must be $0$. Removing this row and column, we obtain a matrix of size $n-1$ which is still bistochastic and still extreme. By the induction hypothesis, it is a permutation matrix. Therefore $A$ is a permutation matrix, which concludes the induction.

Welcome to Randal Douc's wiki

Sidebar

Wiki

Wiki

Courses and public working groups

Courses and public working groups

Private Working Groups

Private Working Groups

Personal Notes

Personal Notes

Réponses

Réponses

Miscellanous

Miscellanous

Krein-Milman, Choquet and Birkhoff-Von Neuman Theorems

Welcome to Randal Douc's wiki

User Tools

Site Tools

Sidebar

Wiki

Wiki

Courses and public working groups

Courses and public working groups

Private Working Groups

Private Working Groups

Personal Notes

Personal Notes

Réponses

Réponses

Miscellanous

Miscellanous

Krein-Milman, Choquet and Birkhoff-Von Neuman Theorems

Page Tools