{{page>:defs}}
====== Krein-Milman, Choquet and Birkhoff-Von Neumann Theorems ======
**Definition:** Let $K$ be convex. We say that $x \in K$ is an extreme point if
$$
\forall y,z \in K \text{ with } y \neq z,\ \forall \lambda \in [0,1],\ x = \lambda y + (1-\lambda)z \Rightarrow \lambda \in \{0,1\}.
$$
We denote by $\mathrm{Extr}(K)$ the set of extreme points.
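
For instance (an illustrative example, with $K$ chosen here for concreteness), let $K=[0,1]^2 \subset \mathbb{R}^2$. The midpoint of the bottom edge is not extreme, since
$$
(1/2,0) = \tfrac{1}{2}(0,0) + \tfrac{1}{2}(1,0),
$$
whereas a corner cannot be written as $\lambda y + (1-\lambda) z$ with two distinct points $y,z \in K$ and $\lambda \in (0,1)$, so $\mathrm{Extr}(K)=\{(0,0),(1,0),(0,1),(1,1)\}$. For the closed unit disk, by contrast, every point of the unit circle is extreme.
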
**Theorem (Krein–Milman).** Let $E$ be a Euclidean space of dimension $n$ and $K \subset E$ convex, compact, non-empty. Then
$$
K = \mathrm{conv}(\mathrm{Extr}(K)).
$$
Here $\mathrm{conv}(G)$ denotes the convex hull of the set $G$.
It suffices to show $K \subset \mathrm{conv}(\mathrm{Extr}(K))$, the reverse inclusion being clear by convexity of $K$.
The proof proceeds by induction on $n$, the dimension of $E$. For $n=0$, $K$ is a single point and the claim is trivial. We assume the property holds for every convex, compact, non-empty subset of a Euclidean (affine) space of dimension $n-1$. Without loss of generality, we assume that $K$ is not reduced to a singleton.
Let $x \in K$ and choose $y \in K \setminus \{x\}$ arbitrarily. Let $[a,b]$ be the maximal segment contained in $K$ and containing $[x,y]$, that is, the intersection of $K$ with the line through $x$ and $y$; it is a segment because $K$ is convex and compact. Since $x$ is a convex combination of $a$ and $b$, it suffices to show that $a,b \in \mathrm{conv}(\mathrm{Extr}(K))$. We prove it for $a$; the case of $b$ is similar.

By maximality of $[a,b]$, we have $a \in \partial K$, so there exists a sequence $a_k \notin K$ converging to $a$. For each $k$, since $K$ is closed and convex, we can choose a unit vector $u_k$ such that $\langle w-a_k,u_k \rangle \geq 0$ for all $w \in K$ (for example, the normalized vector from $a_k$ to its projection onto $K$). By compactness of the unit sphere, we can extract from $(u_k)$ a convergent subsequence; let $u$ be its limit. Passing to the limit yields $\langle w-a,u \rangle \geq 0$ for all $w \in K$. Thus, defining $f(w)=\langle w-a,u \rangle$, we have $f \geq 0$ on $K$ and $f(a)=0$, so $a \in H_a=f^{-1}(0) \cap K=\mathrm{argmin}_{w\in K} f(w)$. The set $H_a$ is convex, compact and non-empty, and it lies in the affine hyperplane $f^{-1}(0)$, which has dimension $n-1$ since $u \neq 0$. The induction hypothesis therefore shows that $a \in \mathrm{conv}(\mathrm{Extr}(H_a))$.

It remains to show that $\mathrm{Extr}(H_a) \subset \mathrm{Extr}(K)$. Let $z \in \mathrm{Extr}(H_a)$ and suppose $z=\lambda p + (1-\lambda) q$ with $p,q \in K$ and $\lambda \in (0,1)$. Since $f$ is affine, $0=f(z)=\lambda f(p) + (1-\lambda) f(q)$, hence $f(p)=f(q)=0$ because $f\geq 0$ on $K$. Therefore $p,q \in H_a$, so the decomposition takes place inside $H_a$, where it is ruled out by $z \in \mathrm{Extr}(H_a)$ unless $p=q$. This shows that $z \in \mathrm{Extr}(K)$, and the proof is complete.
**Theorem (Choquet).** Let $E$ be a Euclidean space and $K \subset E$ convex, compact and non-empty, such that $\mathrm{Extr}(K)$ is compact. Let $f : E \to \mathbb{R}$ be linear and continuous. Then $f$ attains its minimum on $K$ at a point of $\mathrm{Extr}(K)$.
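Since $f$ is continuous and $K$ is compact, $f$ attains its minimum on $K$; let $x \in \mathrm{argmin}_{y\in K} f(y)$. By Krein–Milman, there exist $y_1,\ldots,y_m \in \mathrm{Extr}(K)$ and weights $\lambda_1,\ldots,\lambda_m > 0$ with $\sum_{i=1}^m \lambda_i = 1$ such that $x = \sum_{i=1}^m \lambda_i y_i$. Since $f$ is linear, $f(x)=\sum_{i=1}^m \lambda_i f(y_i)$, and each $f(y_i) \geq f(x)$ because $f(x)=\min_{y\in K} f(y)$. A weighted average of numbers that are all at least $f(x)$ can equal $f(x)$ only if each of them equals $f(x)$, so $f(x)=f(y_1)=\cdots=f(y_m)$, which completes the proof.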
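
The statement can be checked numerically on a toy case. The sketch below is a minimal illustration, assuming $K$ is the unit square (so that its extreme points are the four corners) and $f(x)=\langle c,x\rangle$ for an arbitrarily chosen vector $c$. The names `VERTICES`, `linear_form`, `min_over_vertices` and `random_point` are hypothetical helpers introduced here, not part of the original text.

```python
import random

# Assumption for illustration: K is the unit square, described by its extreme points.
VERTICES = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]


def linear_form(c, x):
    """Evaluate the linear form f(x) = <c, x>."""
    return sum(ci * xi for ci, xi in zip(c, x))


def min_over_vertices(c, vertices):
    """Return (minimal value, minimizing vertex) of f over the extreme points."""
    return min((linear_form(c, v), v) for v in vertices)


def random_point(vertices, rng):
    """Draw a random convex combination of the vertices, i.e. a point of K."""
    weights = [rng.random() + 1e-9 for _ in vertices]
    total = sum(weights)
    dim = len(vertices[0])
    return tuple(
        sum(w * v[k] for w, v in zip(weights, vertices)) / total
        for k in range(dim)
    )


if __name__ == "__main__":
    c = (2.0, -3.0)  # arbitrary direction chosen for the demo
    best_value, best_vertex = min_over_vertices(c, VERTICES)
    rng = random.Random(0)
    samples = [linear_form(c, random_point(VERTICES, rng)) for _ in range(1000)]
    # No sampled point of K should beat the best extreme point (up to rounding).
    print(best_value, best_vertex, min(samples) >= best_value - 1e-9)
```

Sampling only illustrates the claim and proves nothing; the point is that searching the finitely many corners is enough to find the minimum over the whole square.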
**Theorem (Birkhoff–Von Neumann).** Let $B_n$ be the set of bistochastic matrices of size $n\times n$, that is,
$$
B_n = \left\{ A = (a_{ij}) \in \mathcal{M}_n([0,1]) \mid \forall i,\ \sum_{j=1}^n a_{ij} = 1 \ \text{ and } \ \forall j,\ \sum_{i=1}^n a_{ij} = 1 \right\}.
$$
Then $\mathrm{Extr}(B_n) = P_n$, the set of permutation matrices (i.e. matrices with entries in $\{0,1\}$ having exactly one $1$ in each row and each column).
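A permutation matrix $P$ is extreme: its entries are $0$ or $1$, the endpoints of $[0,1]$, so if $P=\lambda A+(1-\lambda)B$ with $A,B \in B_n$ and $\lambda \in (0,1)$, then $A=B=P$ entry by entry. Conversely, we prove by induction on $n$ that any extreme matrix is a permutation matrix. For $n=1$, $B_1=\{(1)\}$ and there is nothing to prove. Assume the result holds for $n-1$ and take $A \in \mathrm{Extr}(B_n)$.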
We first show by contradiction that $A$ has at most $2n-1$ nonzero entries. Indeed, otherwise there exist distinct pairs $(i_k,j_k)$ such that $a_{i_k j_k} > 0$ for all $k \in [1:2n]$. Define $F_1=\mathrm{span}(E_{i_k,j_k}, k \in [1:2n])$, where $E_{i,j}$ is the matrix whose only nonzero entry is at $(i,j)$ and equal to $1$.
Let $F_2$ be the vector space of matrices whose row and column sums are all zero (a basis is given by $(E_{ij} - E_{in} - E_{nj} + E_{nn})$ with $i,j \in [1:n-1]$). Then $\dim F_1=2n$ and $\dim F_2=(n-1)^2$, so the sum of their dimensions is $n^2+1$, which exceeds $n^2$, the dimension of the space of $n \times n$ matrices. Hence $F_1$ and $F_2$ cannot be in direct sum, and there exists a nonzero matrix $N \in F_1 \cap F_2$. Since $N$ is supported on entries where $A$ is positive and has zero row and column sums, $A \pm \epsilon N$ is bistochastic for $\epsilon > 0$ small enough, and we would have
$$
A=\frac{1}{2}(A+\epsilon N)+\frac{1}{2}(A-\epsilon N),
$$
which contradicts the extremality of $A$, since $A+\epsilon N \neq A-\epsilon N$.
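
As a sanity check of this perturbation argument (an illustrative case chosen here, not part of the original proof), take $n=2$ and the matrix with all entries equal to $1/2$, which has $4 = 2n$ positive entries:
$$
A=\begin{pmatrix} 1/2 & 1/2 \\ 1/2 & 1/2 \end{pmatrix},\qquad N=\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix} \in F_1 \cap F_2 .
$$
For $0 < \epsilon \leq 1/2$, both $A \pm \epsilon N$ have entries in $[0,1]$ and unit row and column sums. With $\epsilon = 1/2$, the identity above reads
$$
A=\frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}+\frac{1}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},
$$
so this $A$ is not extreme.
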
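Thus $A$ has at most $2n-1$ nonzero entries. Each of its $n$ rows has at least one nonzero entry, since it sums to $1$; if every row had at least two, $A$ would have at least $2n$ nonzero entries. Therefore some row contains exactly one nonzero entry, which must be equal to $1$. The other entries in the corresponding column must be $0$, since that column also sums to $1$. Removing this row and column, we obtain a matrix of size $n-1$ which is still bistochastic and still extreme (a nontrivial decomposition of the smaller matrix would extend to one of $A$ by putting back the removed row and column). By the induction hypothesis, it is a permutation matrix. Therefore $A$ is a permutation matrix, which concludes the induction.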
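
Combined with Krein–Milman, the theorem says that every bistochastic matrix is a convex combination of permutation matrices. The sketch below illustrates this with a simple greedy peeling procedure. It is a swapped-in illustration rather than the argument of the proof above, it uses brute-force search over permutations (so it is only meant for small $n$), and it assumes exact rational entries. The function names `is_bistochastic` and `birkhoff_decomposition` are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def is_bistochastic(a):
    """Check nonnegativity and unit row/column sums (exact arithmetic)."""
    n = len(a)
    nonneg = all(x >= 0 for row in a for x in row)
    rows_ok = all(sum(row) == 1 for row in a)
    cols_ok = all(sum(a[i][j] for i in range(n)) == 1 for j in range(n))
    return nonneg and rows_ok and cols_ok


def birkhoff_decomposition(a):
    """Return [(weight, sigma), ...] with a = sum of weight * P_sigma.

    sigma is a tuple where sigma[i] is the column of the 1 in row i.
    """
    if not is_bistochastic(a):
        raise ValueError("input matrix is not bistochastic")
    n = len(a)
    residual = [[Fraction(x) for x in row] for row in a]
    remaining = Fraction(1)  # common row/column sum of the residual matrix
    terms = []
    while remaining > 0:
        # A nonnegative matrix with equal positive row and column sums always
        # has a permutation inside its support; find one by brute force.
        sigma = next(
            s for s in permutations(range(n))
            if all(residual[i][s[i]] > 0 for i in range(n))
        )
        weight = min(residual[i][sigma[i]] for i in range(n))
        for i in range(n):
            residual[i][sigma[i]] -= weight  # zeroes at least one entry
        remaining -= weight
        terms.append((weight, sigma))
    return terms


if __name__ == "__main__":
    half, quarter = Fraction(1, 2), Fraction(1, 4)
    example = [
        [half, half, 0],
        [quarter, quarter, half],
        [quarter, quarter, half],
    ]
    for weight, sigma in birkhoff_decomposition(example):
        print(weight, sigma)
```

Each pass removes at least one positive entry from the residual matrix, so the loop stops after at most $n^2$ iterations, and the weights sum to $1$ by construction.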