Welcome to Randal Douc's wiki

A collaborative site on maths but not only!



Subdirectories



To add a new page here, choose the namespace (world, here) and the name of your new page.


Video call on August 1st

Syllabus of the talks

Indicate your availability (time slot) for a video call on August 1st here (double-click to edit the page, or click the pencil icon on the right, and don't forget to click Save once you have edited the page):

  • Randal: Available all day (8am-6pm). I am also available during the first week of August.
  • Eric Moulines:
  • Mehdi Abou El Qassime:
  • Marine Saux:
  • Laurent Miclet:
  • Sylvain Le Corff:
  • Jean-François Giovanelli:
  • Matthieu Jonkheere: Not available on the 1st; I may be available on Monday the 4th or Tuesday the 5th
  • Julien Perez:
  • Mohamed El Machkouri:
2025/07/28 11:08 · rdouc

Course on Regression and Supervised Classification

Syllabus

Objective: This course combines a practical approach (modeling, training, and validating models) with a rigorous theoretical approach, enabling students to:

  • Understand the mathematical principles underlying the algorithms
  • Formulate and analyze a supervised learning problem in a precise way
  • Master the classical methods (linear/logistic regression, SVM, etc.) as well as probabilistic approaches

A significant part of the course is devoted to mathematical application exercises, in order to reinforce the conceptual understanding of the models.

Assessment: Written exam + graded lab work (TPs). Regular quizzes will count as a bonus towards the final grade.

Provisional Outline

Randal DOUC (from September 29 to October 10)

  1. Introduction to ML
    • Overview of the types of learning.
    • Objectives, pipeline, overfitting.
    • Bias-variance trade-off, metrics.
  2. Supervised regression
    • Simple/multiple linear regression.
    • Inference, variable selection.
    • Ridge, Lasso, Elastic-Net.
    • Cross-validation, learning curves (see the code sketch after this outline).
  3. Supervised classification
    • Bayes classifier.
    • Binary logistic regression.
    • Performance metrics.
  4. Probabilistic methods / generative models
    • EM, Gaussian mixtures.
  5. Unsupervised methods
    • PCA, K-means.
  6. Advanced methods
    • SVM (introduction).
    • Random Forest.
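
To illustrate the practical side of the course (fitting a model and selecting its regularization level by cross-validation, as in item 2 of the outline), here is a minimal sketch assuming scikit-learn and a synthetic dataset; none of the function names, parameters, or data below are prescribed by the syllabus.

<code python>
# Minimal illustrative sketch: ridge regression with the penalty chosen by
# cross-validation, on a synthetic (hypothetical) dataset.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import RidgeCV
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic regression problem, for illustration only.
X, y = make_regression(n_samples=200, n_features=20, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Ridge regression; the penalty alpha is selected by 5-fold cross-validation.
model = RidgeCV(alphas=np.logspace(-3, 3, 13), cv=5)
model.fit(X_train, y_train)

print("selected alpha:", model.alpha_)
print("test R^2:", r2_score(y_test, model.predict(X_test)))
</code>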
2025/07/09 17:20 · rdouc

Logarithmic Sobolev inequality and concentration

$\newcommand{\Ent}{\mathrm{Ent}}$

Theorem

  • Taken from Thm 7.4.1 in the book “Sur les inégalités de Sobolev logarithmiques”.

Let $\mu$ be a probability measure on $\mathbb{R}^n$ satisfying the following logarithmic Sobolev inequality: $$ \forall f \in C_b^\infty(\mathbb{R}^n), \quad \Ent_\mu(f^2) \leq c \, \mathbb{E}_\mu(|\nabla f|^2), $$ then for every Lipschitz function $F : \mathbb{R}^n \to \mathbb{R}$ with $\|F\|_{\mathrm{Lip}} \leq 1$, and for all $r > 0$, we have: $$ \mu(F \geq \mathbb{E}_\mu(F) + r) \leq \exp\left(-\frac{r^2}{c}\right), \\ \mu(|F - \mathbb{E}_\mu(F)| \geq r) \leq 2 \exp\left(-\frac{r^2}{c}\right). $$

Proof.

We first assume that $F$ is a smooth and bounded function such that $\|F\|_{\mathrm{Lip}} \leq 1$. Consider the Laplace transform of $F$: $$ H(\lambda) := \mathbb{E}_\mu\left(e^{\lambda F}\right), \quad \lambda > 0. $$

Apply the logarithmic Sobolev inequality to the function $f = e^{\lambda F / 2}$. Then: $$ \Ent_\mu(e^{\lambda F}) \leq c \, \mathbb{E}_\mu\left( \left| \nabla e^{\lambda F/2} \right|^2 \right). $$ Since $\nabla e^{\lambda F / 2} = \frac{\lambda}{2} e^{\lambda F / 2} \nabla F$, we get: $$ \left| \nabla e^{\lambda F / 2} \right|^2 = \frac{\lambda^2}{4} e^{\lambda F} |\nabla F|^2. $$ Because $\|\nabla F\|_\infty \leq 1$, it follows that: $$ \Ent_\mu(e^{\lambda F}) \leq \frac{c \lambda^2}{4} H(\lambda). $$

Also, we have the identity: $$ \Ent_\mu(e^{\lambda F}) = \int e^{\lambda F} \log \frac{e^{\lambda F}}{\int e^{\lambda F} \, d\mu} \, d\mu = \lambda H'(\lambda) - H(\lambda) \log H(\lambda), $$ which gives: $$ \lambda H'(\lambda) - H(\lambda) \log H(\lambda) \leq \frac{c \lambda^2}{4} H(\lambda). $$

Dividing by $\lambda^2 H(\lambda)$ (which is valid since $H(\lambda) > 0$), we obtain: $$ \frac{H'(\lambda)}{\lambda H(\lambda)} - \frac{\log H(\lambda)}{\lambda^2} \leq \frac{c}{4}. $$

Define: $$ K(\lambda) := \frac{1}{\lambda} \log H(\lambda), \quad \lambda > 0. $$

Differentiating $K$ and using the previous inequality, we get: $$ K'(\lambda) = \frac{H'(\lambda)}{\lambda H(\lambda)} - \frac{\log H(\lambda)}{\lambda^2} \leq \frac{c}{4}. $$

The function $K$ is differentiable on $(0,\infty)$ and extends continuously to $[0,\infty)$: since $F$ is bounded, $\log H(\lambda) = \lambda \mathbb{E}_\mu(F) + O(\lambda^2)$ as $\lambda \to 0^+$, so that $$ \lim_{\lambda \to 0^+} K(\lambda) = \mathbb{E}_\mu(F). $$

Therefore, for all $\lambda > 0$: $$ K(\lambda) \leq \mathbb{E}_\mu(F) + \frac{c}{4} \lambda, $$ that is: $$ \log H(\lambda) \leq \lambda \mathbb{E}_\mu(F) + \frac{c}{4} \lambda^2, $$ and hence: $$ H(\lambda) \leq \exp\left( \lambda \mathbb{E}_\mu(F) + \frac{c}{4} \lambda^2 \right). $$

Now apply Markov’s inequality: $$ \mu(F \geq \mathbb{E}_\mu(F) + r) = \mu\left(e^{\lambda F} \geq e^{\lambda (\mathbb{E}_\mu(F) + r)} \right) \leq \frac{H(\lambda)}{e^{\lambda (\mathbb{E}_\mu(F) + r)}}. $$

Using the previous bound on $H(\lambda)$, we get: $$ \mu(F \geq \mathbb{E}_\mu(F) + r) \leq \exp\left( \frac{c}{4} \lambda^2 - \lambda r \right). $$

Optimizing this upper bound by choosing $\lambda = \frac{2r}{c}$ gives: $$ \mu(F \geq \mathbb{E}_\mu(F) + r) \leq \exp\left( -\frac{r^2}{c} \right). $$

This proves the first inequality. The second follows by applying the same bound to $-F$ and combining both tails: $$ \mu(|F - \mathbb{E}_\mu(F)| \geq r) \leq \mu(F \geq \mathbb{E}_\mu(F) + r) + \mu(F \leq \mathbb{E}_\mu(F) - r) \leq 2 e^{-r^2 / c}. $$

Finally, the general case where $F$ is only Lipschitz (not smooth and bounded) is obtained by standard mollification (i.e., convolution with a smooth kernel) and a limiting argument using the semicontinuity of the Lipschitz norm and uniform upper bounds.
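
As a numerical sanity check: the standard Gaussian measure on $\mathbb{R}^n$ is the classical example of a measure satisfying the logarithmic Sobolev inequality, with constant $c = 2$. The sketch below (assuming this value of $c$, the 1-Lipschitz function $F(x) = x_1$, and plain Monte Carlo; none of this is part of the statement above) compares the empirical tail $\mu(F \geq \mathbb{E}_\mu(F) + r)$ with the bound $e^{-r^2/c}$.

<code python>
# Monte Carlo check of the concentration bound for the standard Gaussian,
# which satisfies the log-Sobolev inequality with constant c = 2.
import numpy as np

rng = np.random.default_rng(0)
n, n_samples = 5, 10**6
X = rng.standard_normal((n_samples, n))

F = X[:, 0]   # F(x) = x_1 is 1-Lipschitz, with E_mu(F) = 0
c = 2.0

for r in [0.5, 1.0, 2.0, 3.0]:
    empirical = np.mean(F >= F.mean() + r)
    bound = np.exp(-r**2 / c)
    print(f"r = {r}: empirical tail = {empirical:.4g} <= bound = {bound:.4g}")
</code>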

2025/07/09 10:52 · rdouc

$$ \newcommand{\arginf}{\mathrm{arginf}} \newcommand{\argmin}{\mathrm{argmin}} \newcommand{\argmax}{\mathrm{argmax}} \newcommand{\asconv}[1]{\stackrel{#1-a.s.}{\rightarrow}} \newcommand{\Aset}{\mathsf{A}} \newcommand{\b}[1]{{\mathbf{#1}}} \newcommand{\ball}[1]{\mathsf{B}(#1)} \newcommand{\bbQ}{{\mathbb Q}} \newcommand{\bproof}{\textbf{Proof :}\quad} \newcommand{\bmuf}[2]{b_{#1,#2}} \newcommand{\card}{\mathrm{card}} \newcommand{\chunk}[3]{{#1}_{#2:#3}} \newcommand{\condtrans}[3]{p_{#1}(#2|#3)} \newcommand{\convprob}[1]{\stackrel{#1-\text{prob}}{\rightarrow}} \newcommand{\Cov}{\mathbb{C}\mathrm{ov}} \newcommand{\cro}[1]{\langle #1 \rangle} \newcommand{\CPE}[2]{\PE\lr{#1| #2}} \renewcommand{\det}{\mathrm{det}} \newcommand{\dimlabel}{\mathsf{m}} \newcommand{\dimU}{\mathsf{q}} \newcommand{\dimX}{\mathsf{d}} \newcommand{\dimY}{\mathsf{p}} \newcommand{\dlim}{\Rightarrow} \newcommand{\e}[1]{{\left\lfloor #1 \right\rfloor}} \newcommand{\eproof}{\quad \Box} \newcommand{\eremark}{</WRAP>} \newcommand{\eqdef}{:=} \newcommand{\eqlaw}{\stackrel{\mathcal{L}}{=}} \newcommand{\eqsp}{\;} \newcommand{\Eset}{ {\mathsf E}} \newcommand{\esssup}{\mathrm{essup}} \newcommand{\fr}[1]{{\left\langle #1 \right\rangle}} \newcommand{\falph}{f} \renewcommand{\geq}{\geqslant} \newcommand{\hchi}{\hat \chi} \newcommand{\Hset}{\mathsf{H}} \newcommand{\Id}{\mathrm{Id}} \newcommand{\img}{\text{Im}} \newcommand{\indi}[1]{\mathbf{1}_{#1}} \newcommand{\indiacc}[1]{\mathbf{1}_{\{#1\}}} \newcommand{\indin}[1]{\mathbf{1}\{#1\}} \newcommand{\itemm}{\quad \quad \blacktriangleright \;} \newcommand{\jointtrans}[3]{p_{#1}(#2,#3)} \newcommand{\ker}{\text{Ker}} \newcommand{\klbck}[2]{\mathrm{K}\lr{#1||#2}} \newcommand{\law}{\mathcal{L}} \newcommand{\labelinit}{\pi} \newcommand{\labelkernel}{Q} \renewcommand{\leq}{\leqslant} \newcommand{\lone}{\mathsf{L}_1} \newcommand{\lp}[1]{\mathsf{L}_{{#1}}} \newcommand{\lrav}[1]{\left|#1 \right|} \newcommand{\lr}[1]{\left(#1 \right)} \newcommand{\lrb}[1]{\left[#1 \right]} \newcommand{\lrc}[1]{\left\{#1 \right\}} \newcommand{\lrcb}[1]{\left\{#1 \right\}} \newcommand{\ltwo}[1]{\PE^{1/2}\lrb{\lrcb{#1}^2}} \newcommand{\Ltwo}{\mathrm{L}^2} \newcommand{\mc}[1]{\mathcal{#1}} \newcommand{\mcbb}{\mathcal B} \newcommand{\mcf}{\mathcal{F}} \newcommand{\meas}[1]{\mathrm{M}_{#1}} \newcommand{\norm}[1]{\left\|#1\right\|} \newcommand{\normmat}[1]{{\left\vert\kern-0.25ex\left\vert\kern-0.25ex\left\vert #1 \right\vert\kern-0.25ex\right\vert\kern-0.25ex\right\vert}} \newcommand{\nset}{\mathbb N} \newcommand{\N}{\mathcal{N}} \newcommand{\one}{\mathsf{1}} \newcommand{\PE}{\mathbb E} \newcommand{\pminfty}{_{-\infty}^\infty} \newcommand{\PP}{\mathbb P} \newcommand{\projorth}[1]{\mathsf{P}^\perp_{#1}} \newcommand{\Psif}{\Psi_f} \newcommand{\pscal}[2]{\langle #1,#2\rangle} \newcommand{\pscal}[2]{\langle #1,#2\rangle} \newcommand{\psconv}{\stackrel{\PP-a.s.}{\rightarrow}} \newcommand{\qset}{\mathbb Q} \newcommand{\revcondtrans}[3]{q_{#1}(#2|#3)} \newcommand{\rmd}{\mathrm d} \newcommand{\rme}{\mathrm e} \newcommand{\rmi}{\mathrm i} \newcommand{\Rset}{\mathbb{R}} \newcommand{\rset}{\mathbb{R}} \newcommand{\rti}{\sigma} \newcommand{\section}[1]{==== #1 ====} \newcommand{\seq}[2]{\lrc{#1\eqsp: \eqsp #2}} \newcommand{\set}[2]{\lrc{#1\eqsp: \eqsp #2}} \newcommand{\sg}{\mathrm{sgn}} \newcommand{\supnorm}[1]{\left\|#1\right\|_{\infty}} \newcommand{\thv}{{\theta_\star}} \newcommand{\tmu}{ {\tilde{\mu}}} \newcommand{\Tset}{ {\mathsf{T}}} \newcommand{\Tsigma}{ {\mathcal{T}}} \newcommand{\ttheta}{{\tilde \theta}} 
\newcommand{\tv}[1]{\left\|#1\right\|_{\mathrm{TV}}} \newcommand{\unif}{\mathrm{Unif}} \newcommand{\weaklim}[1]{\stackrel{\mathcal{L}_{#1}}{\rightsquigarrow}} \newcommand{\Xset}{{\mathsf X}} \newcommand{\Xsigma}{\mathcal X} \newcommand{\Yset}{{\mathsf Y}} \newcommand{\Ysigma}{\mathcal Y} \newcommand{\Var}{\mathbb{V}\mathrm{ar}} \newcommand{\zset}{\mathbb{Z}} \newcommand{\Zset}{\mathsf{Z}} $$

2023/11/14 18:37

Minkowski's inequality

Let $f, g: \Xset \to \rset$ be two measurable functions on a measurable space $(\Xset, \Xsigma)$, and let $\mu$ be a non-negative measure on $(\Xset, \Xsigma)$. Then, for any $p \geq 1$, $$\lr{\int |f+g|^p \rmd \mu}^{1/p} \leq \lr{\int |f|^p \rmd \mu}^{1/p} + \lr{\int |g|^p \rmd \mu}^{1/p}.$$

Proof

Some alternative proofs can be found here.

Without loss of generality, we assume that $p>1$, $f, g \in \lp{p}(\mu)$ and $f, g \geq 0$ (the general case can be handled using the inequality $|f+g| \leq |f| + |g|$). For $s > 0$, define $$\varphi(s) = \lr{\int (f + s g)^p \rmd \mu}^{1/p}.$$ Then, $$\varphi'(s) = \lr{\int (f + s g)^p \rmd \mu}^{\frac{1}{p} - 1} \int (f + s g)^{p-1} g \, \rmd \mu.$$ Using Hölder’s inequality for the second term, we can bound $\varphi'(s)$ as follows: $$\varphi'(s) \leq \lr{\int (f + s g)^p \rmd \mu}^{\frac{1}{p} - 1} \lr{\int (f + s g)^p \rmd \mu}^{\frac{p-1}{p}} \lr{\int g^p \rmd \mu}^{1/p} = \lr{\int g^p \rmd \mu}^{1/p}.$$ Hence, $$\lr{\int (f + g)^p \rmd \mu}^{1/p} = \varphi(1) = \varphi(0) + \int_0^1 \varphi'(s) \, \rmd s \leq \lr{\int f^p \rmd \mu}^{1/p}+ \lr{\int g^p \rmd \mu}^{1/p}.$$ This completes the proof. 
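
For a quick numerical illustration, take $\mu$ to be the counting measure on a finite set, so that the integrals reduce to sums. The sketch below (with arbitrary hypothetical vectors playing the role of $f$ and $g$; nothing here is part of the proof) checks the inequality for a few values of $p$.

<code python>
# Numerical check of Minkowski's inequality for the counting measure on a
# finite set: ||f + g||_p <= ||f||_p + ||g||_p.
import numpy as np

def lp_norm(h, p):
    # (sum_i |h_i|^p)^(1/p), the L^p norm w.r.t. the counting measure
    return np.sum(np.abs(h) ** p) ** (1.0 / p)

rng = np.random.default_rng(1)
f = rng.normal(size=1000)   # hypothetical functions on a 1000-point space,
g = rng.normal(size=1000)   # represented here by random vectors

for p in [1.0, 1.5, 2.0, 4.0]:
    lhs = lp_norm(f + g, p)
    rhs = lp_norm(f, p) + lp_norm(g, p)
    print(f"p = {p}: {lhs:.4f} <= {rhs:.4f} : {lhs <= rhs}")
</code>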

2025/01/29 09:31 · rdouc


Foundation of Machine Learning

Program of the course

Contact

name | email address
Randal Douc | randal.douc At polytechnique.edu
2024/10/16 18:43