# The Pinsker inequality

$$\newcommand{\arginf}{\mathrm{arginf}} \newcommand{\argmin}{\mathrm{argmin}} \newcommand{\argmax}{\mathrm{argmax}} \newcommand{\asconv}[1]{\stackrel{#1-a.s.}{\rightarrow}} \newcommand{\Aset}{\mathsf{A}} \newcommand{\b}[1]{{\mathbf{#1}}} \newcommand{\ball}[1]{\mathsf{B}(#1)} \newcommand{\bproof}{\textbf{Proof :}\quad} \newcommand{\bmuf}[2]{b_{#1,#2}} \newcommand{\card}{\mathrm{card}} \newcommand{\chunk}[3]{{#1}_{#2:#3}} \newcommand{\convprob}[1]{\stackrel{#1-\text{prob}}{\rightarrow}} \newcommand{\Cov}{\mathbb{C}\mathrm{ov}} \newcommand{\CPE}[2]{\PE\lr{#1| #2}} \renewcommand{\det}{\mathrm{det}} \newcommand{\dimlabel}{\mathsf{m}} \newcommand{\dimU}{\mathsf{q}} \newcommand{\dimX}{\mathsf{d}} \newcommand{\dimY}{\mathsf{p}} \newcommand{\dlim}{\Rightarrow} \newcommand{\e}[1]{{\left\lfloor #1 \right\rfloor}} \newcommand{\eproof}{\quad \Box} \newcommand{\eremark}{</WRAP>} \newcommand{\eqdef}{:=} \newcommand{\eqlaw}{\stackrel{\mathcal{L}}{=}} \newcommand{\eqsp}{\;} \newcommand{\Eset}{ {\mathsf E}} \newcommand{\esssup}{\mathrm{essup}} \newcommand{\fr}[1]{{\left\langle #1 \right\rangle}} \newcommand{\falph}{f} \renewcommand{\geq}{\geqslant} \newcommand{\hchi}{\hat \chi} \newcommand{\Hset}{\mathsf{H}} \newcommand{\Id}{\mathrm{Id}} \newcommand{\img}{\text{Im}} \newcommand{\indi}[1]{\mathbf{1}_{#1}} \newcommand{\indiacc}[1]{\mathbf{1}_{\{#1\}}} \newcommand{\indin}[1]{\mathbf{1}\{#1\}} \newcommand{\itemm}{\quad \quad \blacktriangleright \;} \newcommand{\ker}{\text{Ker}} \newcommand{\klbck}[2]{\mathrm{K}\lr{#1||#2}} \newcommand{\law}{\mathcal{L}} \newcommand{\labelinit}{\pi} \newcommand{\labelkernel}{Q} \renewcommand{\leq}{\leqslant} \newcommand{\lone}{\mathsf{L}_1} \newcommand{\lrav}[1]{\left|#1 \right|} \newcommand{\lr}[1]{\left(#1 \right)} \newcommand{\lrb}[1]{\left[#1 \right]} \newcommand{\lrc}[1]{\left\{#1 \right\}} \newcommand{\lrcb}[1]{\left\{#1 \right\}} \newcommand{\ltwo}[1]{\PE^{1/2}\lrb{\lrcb{#1}^2}} \newcommand{\Ltwo}{\mathrm{L}^2} \newcommand{\mc}[1]{\mathcal{#1}} 
\newcommand{\mcbb}{\mathcal B} \newcommand{\mcf}{\mathcal{F}} \newcommand{\meas}[1]{\mathrm{M}_{#1}} \newcommand{\norm}[1]{\left\|#1\right\|} \newcommand{\normmat}[1]{{\left\vert\kern-0.25ex\left\vert\kern-0.25ex\left\vert #1 \right\vert\kern-0.25ex\right\vert\kern-0.25ex\right\vert}} \newcommand{\nset}{\mathbb N} \newcommand{\one}{\mathsf{1}} \newcommand{\PE}{\mathbb E} \newcommand{\PP}{\mathbb P} \newcommand{\projorth}[1]{\mathsf{P}^\perp_{#1}} \newcommand{\Psif}{\Psi_f} \newcommand{\pscal}[2]{\langle #1,#2\rangle} \newcommand{\pscal}[2]{\langle #1,#2\rangle} \newcommand{\psconv}{\stackrel{\PP-a.s.}{\rightarrow}} \newcommand{\qset}{\mathbb Q} \newcommand{\rmd}{\mathrm d} \newcommand{\rme}{\mathrm e} \newcommand{\rmi}{\mathrm i} \newcommand{\Rset}{\mathbb{R}} \newcommand{\rset}{\mathbb{R}} \newcommand{\rti}{\sigma} \newcommand{\section}[1]{==== #1 ====} \newcommand{\seq}[2]{\lrc{#1\eqsp: \eqsp #2}} \newcommand{\set}[2]{\lrc{#1\eqsp: \eqsp #2}} \newcommand{\sg}{\mathrm{sgn}} \newcommand{\supnorm}[1]{\left\|#1\right\|_{\infty}} \newcommand{\thv}{{\theta_\star}} \newcommand{\tmu}{ {\tilde{\mu}}} \newcommand{\Tset}{ {\mathsf{T}}} \newcommand{\Tsigma}{ {\mathcal{T}}} \newcommand{\ttheta}{{\tilde \theta}} \newcommand{\tv}[1]{\left\|#1\right\|_{\mathrm{TV}}} \newcommand{\unif}{\mathrm{Unif}} \newcommand{\weaklim}[1]{\stackrel{\mathcal{L}_{#1}}{\rightsquigarrow}} \newcommand{\Xset}{{\mathsf X}} \newcommand{\Xsigma}{\mathcal X} \newcommand{\Yset}{{\mathsf Y}} \newcommand{\Ysigma}{\mathcal Y} \newcommand{\Var}{\mathbb{V}\mathrm{ar}} \newcommand{\zset}{\mathbb{Z}} \newcommand{\Zset}{\mathsf{Z}}$$

## What?

The inequality is the following:

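The Pinsker inequality. For all probability measures $\sigma$, $\pi$ and every measurable function $f$ that is square-integrable under $\mu$, $$\left|\int f (\rmd \sigma-\rmd \pi)\right|^2\leq 2 \Var_{\mu}(f) KL(\sigma|| \pi)$$ where $\mu$ is the probability measure $\mu=2\pi/3+\sigma/3$ and $KL(\sigma||\pi)=\int \log\frac{\rmd \sigma}{\rmd \pi}\,\rmd \sigma$ is the Kullback-Leibler divergence.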

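If $0\leq f\leq 1$, then $f^2\leq f$, so $\Var_{\mu}(f)=\PE_{\mu}[f^2]-\PE_{\mu}[f]^2\leq \PE_{\mu}[f]-\PE_{\mu}[f]^2=\PE_{\mu}[f](1-\PE_{\mu}[f]) \leq 1/4$. Taking the supremum over all such $f$, we get $$\rmd_{tv}(\sigma,\pi)^2 \leq KL(\sigma|| \pi)/2$$ where $\rmd_{tv}(\sigma,\pi)=\sup_{0\leq f\leq 1}\left|\int f (\rmd \sigma-\rmd \pi)\right|$ is the total variation distance. Equivalently, $\rmd_{tv}(\sigma,\pi)\leq \sqrt{KL(\sigma||\pi)/2}$.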
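As a sanity check, both bounds can be evaluated numerically on a finite state space. The sketch below is only an illustration: the three-point distributions `sigma`, `pi` and the function `f` are arbitrary choices made here, and the helper names (`kl`, `tv`, `pinsker_sides`) are hypothetical. It compares $|\int f(\rmd\sigma-\rmd\pi)|^2$ with $2\Var_\mu(f)KL(\sigma||\pi)$, and the squared total variation distance with $KL(\sigma||\pi)/2$.

```python
import math

def kl(sigma, pi):
    """KL(sigma || pi) for discrete distributions given as lists of probabilities."""
    return sum(s * math.log(s / p) for s, p in zip(sigma, pi) if s > 0)

def tv(sigma, pi):
    """Total variation distance sup_A |sigma(A) - pi(A)|, i.e. half the L1 distance."""
    return 0.5 * sum(abs(s - p) for s, p in zip(sigma, pi))

def pinsker_sides(f, sigma, pi):
    """Return (lhs, rhs) of |int f d(sigma - pi)|^2 <= 2 Var_mu(f) KL(sigma || pi)."""
    # mu = 2*pi/3 + sigma/3, the mixture appearing in the statement
    mu = [2 * p / 3 + s / 3 for s, p in zip(sigma, pi)]
    mean = sum(m * v for m, v in zip(mu, f))
    var = sum(m * (v - mean) ** 2 for m, v in zip(mu, f))
    lhs = sum(v * (s - p) for v, s, p in zip(f, sigma, pi)) ** 2
    rhs = 2 * var * kl(sigma, pi)
    return lhs, rhs

# Arbitrary illustrative inputs (assumptions, not taken from the text)
sigma = [0.5, 0.3, 0.2]
pi = [0.2, 0.5, 0.3]
f = [1.0, 0.0, 0.5]

lhs, rhs = pinsker_sides(f, sigma, pi)
print("weighted form holds:", lhs <= rhs)
print("total variation form holds:", tv(sigma, pi) ** 2 <= kl(sigma, pi) / 2)
```

A single example proves nothing, of course; the point is only to make the roles of $\mu$, $\Var_\mu(f)$ and the square on the left-hand side concrete.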

## The proof

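$\bproof$ Here are the main ideas. For simplicity, we assume that $\sigma$ and $\pi$ have densities with respect to a common reference measure $\rmd x$, still denoted by $\sigma$ and $\pi$, and that $\sigma(x)>0$ for all $x$. First note that for all $y> 0$,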

$$(y-1)^2 \leq \frac{2}{3}(2y+1) (-\log y +y-1) \quad (\star)$$

Indeed, set $h(y)=\frac{2}{3}(2y+1)(-\log y+y-1)-(y-1)^2$ and check that $h(1)=h'(1)=0$ and $h''(y)=2(y-1)^2/(3y^2)\geq 0$ for all $y > 0$. Therefore, $h$ is a convex function on $(0,\infty)$ with a critical point at $y=1$, so it attains its minimum there. Hence $h(y) \geq h(1)=0$ for all $y> 0$, which proves $(\star)$. Then, using $(\star)$ with $y=\frac{\pi(x)}{\sigma(x)}$, and taking $\bar f=f-\mu(f)$ where $\mu=2\pi/3 +\sigma/3$ (subtracting the constant $\mu(f)$ does not change the integral against $\sigma-\pi$, since both measures have total mass $1$),

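\begin{align*}
\left|\int f (\rmd \sigma-\rmd \pi)\right|^2=\left|\int \bar f (\rmd \sigma-\rmd \pi)\right|^2 &\leq \left(\int |\bar f(x)|\ |\sigma(x)-\pi(x)|\rmd x\right)^2 \\
&=\left(\int |\bar f(x)|\ \sigma(x)\ \left|\frac{\pi(x)}{\sigma(x)}-1\right|\rmd x\right)^2 \\
&\leq\left(\int \sigma(x) \sqrt{\frac{2\bar f^2(x)}{3}\left(2\frac{\pi(x)}{\sigma(x)}+1\right)} \sqrt{\left( -\log \frac{\pi(x)}{\sigma(x)} +\frac{\pi(x)}{\sigma(x)}-1\right)}\rmd x\right)^2
\end{align*}

We conclude by applying the Cauchy-Schwarz inequality, $\left(\int \sigma(x) \sqrt{a(x)}\sqrt{b(x)}\rmd x \right)^2\leq \int \sigma(x) a(x)\rmd x \times \int \sigma(x) b(x)\rmd x$, with $a=\frac{2\bar f^2}{3}\left(2\frac{\pi}{\sigma}+1\right)$ and $b=-\log \frac{\pi}{\sigma}+\frac{\pi}{\sigma}-1$. Indeed, $\int \sigma(x) a(x)\rmd x=2\int \bar f^2 \rmd \mu=2\Var_{\mu}(f)$ and $\int \sigma(x) b(x)\rmd x=KL(\sigma||\pi)+\int \pi(x)\rmd x-\int \sigma(x)\rmd x=KL(\sigma||\pi)$. $\eproof$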
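The scalar inequality $(\star)$ carries the whole argument, so it is worth a quick numerical check. The following sketch evaluates the function $h$ from the proof on a grid and compares the closed form $2(y-1)^2/(3y^2)$ of $h''$ with a central finite difference. The grid, the step `eps` and the function names are choices made here for illustration only.

```python
import math

def h(y):
    """h(y) = (2/3)(2y+1)(-log y + y - 1) - (y-1)^2, the function from the proof of (star)."""
    return (2.0 / 3.0) * (2 * y + 1) * (-math.log(y) + y - 1) - (y - 1) ** 2

def h_second(y):
    """Closed form of the second derivative: 2(y-1)^2 / (3 y^2)."""
    return 2 * (y - 1) ** 2 / (3 * y ** 2)

def h_second_numeric(y, eps=1e-4):
    """Central finite-difference approximation of h''(y)."""
    return (h(y + eps) - 2 * h(y) + h(y - eps)) / eps ** 2

# Grid of points in (0, 10]; an arbitrary range chosen for the check
grid = [0.01 * k for k in range(1, 1001)]

# Smallest value of h on the grid: should be (numerically) nonnegative
min_h = min(h(y) for y in grid)

# Largest gap between the closed form and the finite difference, away from 0
max_gap = max(abs(h_second(y) - h_second_numeric(y)) for y in grid if y >= 0.1)

print("min of h on grid:", min_h)
print("max gap between closed-form and numeric h'':", max_gap)
```

A grid check is not a proof, but it catches sign or algebra slips in $h''$ quickly; the convexity argument in the proof is what establishes $(\star)$ for all $y>0$.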