Wiki
Wiki
Courses and public working groups
Courses and public working groups
Private Working Groups
Private Working Groups
- New!!! Reading Group
- Theatre
- Admin
- Research
- Teaching
This is an old revision of the document!
For adding a newpage here, choose the namespace (world, here) and the name of your newpage.
You are not allowed to add pagesIndiquez votre disponibilité (tranche horaire) pour une visio le 1er aout ici (double cliquez pour modifier la page (ou sur l'icone de crayon à droite) et n'oubliez pas de cliquer sur save, une fois la page modifiée):
Objectif : Ce cours combine une approche pratique (modélisation, entraînement, validation de modèles) et une approche théorique rigoureuse, permettant aux étudiants de :
Une partie significative du cours est consacrée à des exercices d'application mathématique afin de renforcer la compréhension conceptuelle des modèles.
Evaluation : Examen sur table + Tps notés. Des quiz réguliers compteront comme un bonus sur la note finale.
Randal DOUC (du 29 septembre au 10 octobre)
$\newcommand{\Ent}{\mathrm{Ent}}$
Let $\mu$ be a probability measure on $\mathbb{R}^n$ satisfying the following logarithmic Sobolev inequality: $$ \forall f \in C_b^\infty(\mathbb{R}^n), \quad \Ent_\mu(f^2) \leq c \, \mathbb{E}_\mu(|\nabla f|^2), $$ then for every Lipschitz function $F : \mathbb{R}^n \to \mathbb{R}$ with $\|F\|_{\mathrm{Lip}} \leq 1$, and for all $r > 0$, we have: $$ \mu(F \geq \mathbb{E}_\mu(F) + r) \leq \exp\left(-\frac{r^2}{c}\right), \\ \mu(|F - \mathbb{E}_\mu(F)| \geq r) \leq 2 \exp\left(-\frac{r^2}{c}\right). $$
We first assume that $F$ is a smooth and bounded function such that $\|F\|_{\mathrm{Lip}} \leq 1$. Consider the Laplace transform of $F$: $$ H(\lambda) := \mathbb{E}_\mu\left(e^{\lambda F}\right), \quad \lambda > 0. $$
Apply the logarithmic Sobolev inequality to the function $f = e^{\lambda F / 2}$. Then: $$ \Ent_\mu(e^{\lambda F}) \leq c \, \mathbb{E}_\mu\left( \left| \nabla e^{\lambda F/2} \right|^2 \right). $$ Since $\nabla e^{\lambda F / 2} = \frac{\lambda}{2} e^{\lambda F / 2} \nabla F$, we get: $$ \left| \nabla e^{\lambda F / 2} \right|^2 = \frac{\lambda^2}{4} e^{\lambda F} |\nabla F|^2. $$ Because $\|\nabla F\|_\infty \leq 1$, it follows that: $$ \Ent_\mu(e^{\lambda F}) \leq \frac{c \lambda^2}{4} H(\lambda). $$
Also, we have the identity: $$ \Ent_\mu(e^{\lambda F}) = \int e^{\lambda F} \log \frac{e^{\lambda F}}{\int e^{\lambda F} \, d\mu} \, d\mu = \lambda H'(\lambda) - H(\lambda) \log H(\lambda), $$ which gives: $$ \lambda H'(\lambda) - H(\lambda) \log H(\lambda) \leq \frac{c \lambda^2}{4} H(\lambda). $$
Dividing by $\lambda^2 H(\lambda)$ (which is valid since $H(\lambda) > 0$), we obtain: $$ \frac{H'(\lambda)}{\lambda H(\lambda)} - \frac{\log H(\lambda)}{\lambda^2} \leq \frac{c}{4}. $$
Define: $$ K(\lambda) := \frac{1}{\lambda} \log H(\lambda), \quad \lambda > 0. $$
Then we have: $$ K'(\lambda) \leq \frac{c}{4}. $$
The function $K$ is continuous on $[0,\infty)$ and differentiable on $(0,\infty)$. Taking the limit as $\lambda \to 0^+$ gives: $$ \lim_{\lambda \to 0^+} K(\lambda) = \mathbb{E}_\mu(F). $$
Therefore, for all $\lambda > 0$: $$ K(\lambda) \leq \mathbb{E}_\mu(F) + \frac{c}{4} \lambda, $$ that is: $$ \log H(\lambda) \leq \lambda \mathbb{E}_\mu(F) + \frac{c}{4} \lambda^2, $$ and hence: $$ H(\lambda) \leq \exp\left( \lambda \mathbb{E}_\mu(F) + \frac{c}{4} \lambda^2 \right). $$
Now apply Markov’s inequality: $$ \mu(F \geq \mathbb{E}_\mu(F) + r) = \mu\left(e^{\lambda F} \geq e^{\lambda (\mathbb{E}_\mu(F) + r)} \right) \leq \frac{H(\lambda)}{e^{\lambda (\mathbb{E}_\mu(F) + r)}}. $$
Using the previous bound on $H(\lambda)$, we get: $$ \mu(F \geq \mathbb{E}_\mu(F) + r) \leq \exp\left( \frac{c}{4} \lambda^2 - \lambda r \right). $$
Optimizing this upper bound by choosing $\lambda = \frac{2r}{c}$ gives: $$ \mu(F \geq \mathbb{E}_\mu(F) + r) \leq \exp\left( -\frac{r^2}{c} \right). $$
This proves the first inequality. The second follows by applying the same bound to $-F$ and combining both tails: $$ \mu(|F - \mathbb{E}_\mu(F)| \geq r) \leq \mu(F \geq \mathbb{E}_\mu(F) + r) + \mu(F \leq \mathbb{E}_\mu(F) - r) \leq 2 e^{-r^2 / c}. $$
Finally, the general case where $F$ is only Lipschitz (not smooth and bounded) is obtained by standard mollification (i.e., convolution with a smooth kernel) and a limiting argument using the semicontinuity of the Lipschitz norm and uniform upper bounds.
$$ \newcommand{\arginf}{\mathrm{arginf}} \newcommand{\argmin}{\mathrm{argmin}} \newcommand{\argmax}{\mathrm{argmax}} \newcommand{\asconv}[1]{\stackrel{#1-a.s.}{\rightarrow}} \newcommand{\Aset}{\mathsf{A}} \newcommand{\b}[1]{{\mathbf{#1}}} \newcommand{\ball}[1]{\mathsf{B}(#1)} \newcommand{\bbQ}{{\mathbb Q}} \newcommand{\bproof}{\textbf{Proof :}\quad} \newcommand{\bmuf}[2]{b_{#1,#2}} \newcommand{\card}{\mathrm{card}} \newcommand{\chunk}[3]{{#1}_{#2:#3}} \newcommand{\condtrans}[3]{p_{#1}(#2|#3)} \newcommand{\convprob}[1]{\stackrel{#1-\text{prob}}{\rightarrow}} \newcommand{\Cov}{\mathbb{C}\mathrm{ov}} \newcommand{\cro}[1]{\langle #1 \rangle} \newcommand{\CPE}[2]{\PE\lr{#1| #2}} \renewcommand{\det}{\mathrm{det}} \newcommand{\dimlabel}{\mathsf{m}} \newcommand{\dimU}{\mathsf{q}} \newcommand{\dimX}{\mathsf{d}} \newcommand{\dimY}{\mathsf{p}} \newcommand{\dlim}{\Rightarrow} \newcommand{\e}[1]{{\left\lfloor #1 \right\rfloor}} \newcommand{\eproof}{\quad \Box} \newcommand{\eremark}{</WRAP>} \newcommand{\eqdef}{:=} \newcommand{\eqlaw}{\stackrel{\mathcal{L}}{=}} \newcommand{\eqsp}{\;} \newcommand{\Eset}{ {\mathsf E}} \newcommand{\esssup}{\mathrm{essup}} \newcommand{\fr}[1]{{\left\langle #1 \right\rangle}} \newcommand{\falph}{f} \renewcommand{\geq}{\geqslant} \newcommand{\hchi}{\hat \chi} \newcommand{\Hset}{\mathsf{H}} \newcommand{\Id}{\mathrm{Id}} \newcommand{\img}{\text{Im}} \newcommand{\indi}[1]{\mathbf{1}_{#1}} \newcommand{\indiacc}[1]{\mathbf{1}_{\{#1\}}} \newcommand{\indin}[1]{\mathbf{1}\{#1\}} \newcommand{\itemm}{\quad \quad \blacktriangleright \;} \newcommand{\jointtrans}[3]{p_{#1}(#2,#3)} \newcommand{\ker}{\text{Ker}} \newcommand{\klbck}[2]{\mathrm{K}\lr{#1||#2}} \newcommand{\law}{\mathcal{L}} \newcommand{\labelinit}{\pi} \newcommand{\labelkernel}{Q} \renewcommand{\leq}{\leqslant} \newcommand{\lone}{\mathsf{L}_1} \newcommand{\lp}[1]{\mathsf{L}_{{#1}}} \newcommand{\lrav}[1]{\left|#1 \right|} \newcommand{\lr}[1]{\left(#1 \right)} \newcommand{\lrb}[1]{\left[#1 \right]} \newcommand{\lrc}[1]{\left\{#1 \right\}} \newcommand{\lrcb}[1]{\left\{#1 \right\}} \newcommand{\ltwo}[1]{\PE^{1/2}\lrb{\lrcb{#1}^2}} \newcommand{\Ltwo}{\mathrm{L}^2} \newcommand{\mc}[1]{\mathcal{#1}} \newcommand{\mcbb}{\mathcal B} \newcommand{\mcf}{\mathcal{F}} \newcommand{\meas}[1]{\mathrm{M}_{#1}} \newcommand{\norm}[1]{\left\|#1\right\|} \newcommand{\normmat}[1]{{\left\vert\kern-0.25ex\left\vert\kern-0.25ex\left\vert #1 \right\vert\kern-0.25ex\right\vert\kern-0.25ex\right\vert}} \newcommand{\nset}{\mathbb N} \newcommand{\N}{\mathcal{N}} \newcommand{\one}{\mathsf{1}} \newcommand{\PE}{\mathbb E} \newcommand{\pminfty}{_{-\infty}^\infty} \newcommand{\PP}{\mathbb P} \newcommand{\projorth}[1]{\mathsf{P}^\perp_{#1}} \newcommand{\Psif}{\Psi_f} \newcommand{\pscal}[2]{\langle #1,#2\rangle} \newcommand{\pscal}[2]{\langle #1,#2\rangle} \newcommand{\psconv}{\stackrel{\PP-a.s.}{\rightarrow}} \newcommand{\qset}{\mathbb Q} \newcommand{\revcondtrans}[3]{q_{#1}(#2|#3)} \newcommand{\rmd}{\mathrm d} \newcommand{\rme}{\mathrm e} \newcommand{\rmi}{\mathrm i} \newcommand{\Rset}{\mathbb{R}} \newcommand{\rset}{\mathbb{R}} \newcommand{\rti}{\sigma} \newcommand{\section}[1]{==== #1 ====} \newcommand{\seq}[2]{\lrc{#1\eqsp: \eqsp #2}} \newcommand{\set}[2]{\lrc{#1\eqsp: \eqsp #2}} \newcommand{\sg}{\mathrm{sgn}} \newcommand{\supnorm}[1]{\left\|#1\right\|_{\infty}} \newcommand{\thv}{{\theta_\star}} \newcommand{\tmu}{ {\tilde{\mu}}} \newcommand{\Tset}{ {\mathsf{T}}} \newcommand{\Tsigma}{ {\mathcal{T}}} \newcommand{\ttheta}{{\tilde \theta}} \newcommand{\tv}[1]{\left\|#1\right\|_{\mathrm{TV}}} \newcommand{\unif}{\mathrm{Unif}} \newcommand{\weaklim}[1]{\stackrel{\mathcal{L}_{#1}}{\rightsquigarrow}} \newcommand{\Xset}{{\mathsf X}} \newcommand{\Xsigma}{\mathcal X} \newcommand{\Yset}{{\mathsf Y}} \newcommand{\Ysigma}{\mathcal Y} \newcommand{\Var}{\mathbb{V}\mathrm{ar}} \newcommand{\zset}{\mathbb{Z}} \newcommand{\Zset}{\mathsf{Z}} $$
Let $f, g: \Xset \to \rset$ be two measurable functions on a measurable space $(\Xset, \Xsigma)$, and let $\mu$ be a non-negative measure on $(\Xset, \Xsigma)$. Then, for any $p \geq 1$, $$\lr{\int |f+g|^p \rmd \mu}^{1/p} \leq \lr{\int |f|^p \rmd \mu}^{1/p} + \lr{\int |g|^p \rmd \mu}^{1/p}.$$
Some alternative proofs can be found here.
Without loss of generality, we assume that $p>1$, $f, g \in \lp{p}(\mu)$ and $f, g \geq 0$ (the general case can be handled using the inequality $|f+g| \leq |f| + |g|$). For $s > 0$, define $$\varphi(s) = \lr{\int (f + s g)^p \rmd \mu}^{1/p}.$$ Then, $$\varphi'(s) = \lr{\int (f + s g)^p \rmd \mu}^{\frac{1}{p} - 1} \int (f + s g)^{p-1} g \, \rmd \mu.$$ Using Hölder’s inequality for the second term, we can bound $\varphi'(s)$ as follows: $$\varphi'(s) \leq \lr{\int (f + s g)^p \rmd \mu}^{\frac{1}{p} - 1} \lr{\int (f + s g)^p \rmd \mu}^{\frac{p-1}{p}} \lr{\int g^p \rmd \mu}^{1/p} = \lr{\int g^p \rmd \mu}^{1/p}.$$ Hence, $$\lr{\int (f + g)^p \rmd \mu}^{1/p} = \varphi(1) = \varphi(0) + \int_0^1 \varphi'(s) \, \rmd s \leq \lr{\int f^p \rmd \mu}^{1/p}+ \lr{\int g^p \rmd \mu}^{1/p}.$$ This completes the proof.
$$ \newcommand{\arginf}{\mathrm{arginf}} \newcommand{\argmin}{\mathrm{argmin}} \newcommand{\argmax}{\mathrm{argmax}} \newcommand{\asconv}[1]{\stackrel{#1-a.s.}{\rightarrow}} \newcommand{\Aset}{\mathsf{A}} \newcommand{\b}[1]{{\mathbf{#1}}} \newcommand{\ball}[1]{\mathsf{B}(#1)} \newcommand{\bbQ}{{\mathbb Q}} \newcommand{\bproof}{\textbf{Proof :}\quad} \newcommand{\bmuf}[2]{b_{#1,#2}} \newcommand{\card}{\mathrm{card}} \newcommand{\chunk}[3]{{#1}_{#2:#3}} \newcommand{\condtrans}[3]{p_{#1}(#2|#3)} \newcommand{\convprob}[1]{\stackrel{#1-\text{prob}}{\rightarrow}} \newcommand{\Cov}{\mathbb{C}\mathrm{ov}} \newcommand{\cro}[1]{\langle #1 \rangle} \newcommand{\CPE}[2]{\PE\lr{#1| #2}} \renewcommand{\det}{\mathrm{det}} \newcommand{\dimlabel}{\mathsf{m}} \newcommand{\dimU}{\mathsf{q}} \newcommand{\dimX}{\mathsf{d}} \newcommand{\dimY}{\mathsf{p}} \newcommand{\dlim}{\Rightarrow} \newcommand{\e}[1]{{\left\lfloor #1 \right\rfloor}} \newcommand{\eproof}{\quad \Box} \newcommand{\eremark}{</WRAP>} \newcommand{\eqdef}{:=} \newcommand{\eqlaw}{\stackrel{\mathcal{L}}{=}} \newcommand{\eqsp}{\;} \newcommand{\Eset}{ {\mathsf E}} \newcommand{\esssup}{\mathrm{essup}} \newcommand{\fr}[1]{{\left\langle #1 \right\rangle}} \newcommand{\falph}{f} \renewcommand{\geq}{\geqslant} \newcommand{\hchi}{\hat \chi} \newcommand{\Hset}{\mathsf{H}} \newcommand{\Id}{\mathrm{Id}} \newcommand{\img}{\text{Im}} \newcommand{\indi}[1]{\mathbf{1}_{#1}} \newcommand{\indiacc}[1]{\mathbf{1}_{\{#1\}}} \newcommand{\indin}[1]{\mathbf{1}\{#1\}} \newcommand{\itemm}{\quad \quad \blacktriangleright \;} \newcommand{\jointtrans}[3]{p_{#1}(#2,#3)} \newcommand{\ker}{\text{Ker}} \newcommand{\klbck}[2]{\mathrm{K}\lr{#1||#2}} \newcommand{\law}{\mathcal{L}} \newcommand{\labelinit}{\pi} \newcommand{\labelkernel}{Q} \renewcommand{\leq}{\leqslant} \newcommand{\lone}{\mathsf{L}_1} \newcommand{\lp}[1]{\mathsf{L}_{{#1}}} \newcommand{\lrav}[1]{\left|#1 \right|} \newcommand{\lr}[1]{\left(#1 \right)} \newcommand{\lrb}[1]{\left[#1 \right]} \newcommand{\lrc}[1]{\left\{#1 \right\}} \newcommand{\lrcb}[1]{\left\{#1 \right\}} \newcommand{\ltwo}[1]{\PE^{1/2}\lrb{\lrcb{#1}^2}} \newcommand{\Ltwo}{\mathrm{L}^2} \newcommand{\mc}[1]{\mathcal{#1}} \newcommand{\mcbb}{\mathcal B} \newcommand{\mcf}{\mathcal{F}} \newcommand{\meas}[1]{\mathrm{M}_{#1}} \newcommand{\norm}[1]{\left\|#1\right\|} \newcommand{\normmat}[1]{{\left\vert\kern-0.25ex\left\vert\kern-0.25ex\left\vert #1 \right\vert\kern-0.25ex\right\vert\kern-0.25ex\right\vert}} \newcommand{\nset}{\mathbb N} \newcommand{\N}{\mathcal{N}} \newcommand{\one}{\mathsf{1}} \newcommand{\PE}{\mathbb E} \newcommand{\pminfty}{_{-\infty}^\infty} \newcommand{\PP}{\mathbb P} \newcommand{\projorth}[1]{\mathsf{P}^\perp_{#1}} \newcommand{\Psif}{\Psi_f} \newcommand{\pscal}[2]{\langle #1,#2\rangle} \newcommand{\pscal}[2]{\langle #1,#2\rangle} \newcommand{\psconv}{\stackrel{\PP-a.s.}{\rightarrow}} \newcommand{\qset}{\mathbb Q} \newcommand{\revcondtrans}[3]{q_{#1}(#2|#3)} \newcommand{\rmd}{\mathrm d} \newcommand{\rme}{\mathrm e} \newcommand{\rmi}{\mathrm i} \newcommand{\Rset}{\mathbb{R}} \newcommand{\rset}{\mathbb{R}} \newcommand{\rti}{\sigma} \newcommand{\section}[1]{==== #1 ====} \newcommand{\seq}[2]{\lrc{#1\eqsp: \eqsp #2}} \newcommand{\set}[2]{\lrc{#1\eqsp: \eqsp #2}} \newcommand{\sg}{\mathrm{sgn}} \newcommand{\supnorm}[1]{\left\|#1\right\|_{\infty}} \newcommand{\thv}{{\theta_\star}} \newcommand{\tmu}{ {\tilde{\mu}}} \newcommand{\Tset}{ {\mathsf{T}}} \newcommand{\Tsigma}{ {\mathcal{T}}} \newcommand{\ttheta}{{\tilde \theta}} \newcommand{\tv}[1]{\left\|#1\right\|_{\mathrm{TV}}} \newcommand{\unif}{\mathrm{Unif}} \newcommand{\weaklim}[1]{\stackrel{\mathcal{L}_{#1}}{\rightsquigarrow}} \newcommand{\Xset}{{\mathsf X}} \newcommand{\Xsigma}{\mathcal X} \newcommand{\Yset}{{\mathsf Y}} \newcommand{\Ysigma}{\mathcal Y} \newcommand{\Var}{\mathbb{V}\mathrm{ar}} \newcommand{\zset}{\mathbb{Z}} \newcommand{\Zset}{\mathsf{Z}} $$
PC | Doc | Additional Links | |
---|---|---|---|
1 | PC1 | Leave One Out Cross Validation (LOOCV) and Cook's distance | |
2 | PC2 | A useful result on maximisation | |
3 | KKT | ||
4 | PC4 | ||
5 | Computer session | tpfondmachlearn.pdf | |
6 | PC6 | A useful result for the consistency of Decision Trees, Max-bounds for the normal distribution. | |
7 | PC7 | Handwritten notes on the solutions of PC7 All to be known about pca | |
8 | PC8 | Some properties on the determinant of symmetric positive matrices | |
9 | xgboost | Many interesting videos can be found on statquest. |
name | email adresses |
---|---|
Randal Douc | randal.douc At polytechnique.edu |