$$ \newcommand{\arginf}{\mathrm{arginf}} \newcommand{\argmin}{\mathrm{argmin}} \newcommand{\argmax}{\mathrm{argmax}} \newcommand{\asconv}[1]{\stackrel{#1-a.s.}{\rightarrow}} \newcommand{\Aset}{\mathsf{A}} \newcommand{\b}[1]{{\mathbf{#1}}} \newcommand{\ball}[1]{\mathsf{B}(#1)} \newcommand{\bbQ}{{\mathbb Q}} \newcommand{\bproof}{\textbf{Proof :}\quad} \newcommand{\bmuf}[2]{b_{#1,#2}} \newcommand{\card}{\mathrm{card}} \newcommand{\chunk}[3]{{#1}_{#2:#3}} \newcommand{\condtrans}[3]{p_{#1}(#2|#3)} \newcommand{\convprob}[1]{\stackrel{#1-\text{prob}}{\rightarrow}} \newcommand{\Cov}{\mathbb{C}\mathrm{ov}} \newcommand{\cro}[1]{\langle #1 \rangle} \newcommand{\CPE}[2]{\PE\lr{#1| #2}} \renewcommand{\det}{\mathrm{det}} \newcommand{\dimlabel}{\mathsf{m}} \newcommand{\dimU}{\mathsf{q}} \newcommand{\dimX}{\mathsf{d}} \newcommand{\dimY}{\mathsf{p}} \newcommand{\dlim}{\Rightarrow} \newcommand{\e}[1]{{\left\lfloor #1 \right\rfloor}} \newcommand{\eproof}{\quad \Box} \newcommand{\eremark}{</WRAP>} \newcommand{\eqdef}{:=} \newcommand{\eqlaw}{\stackrel{\mathcal{L}}{=}} \newcommand{\eqsp}{\;} \newcommand{\Eset}{ {\mathsf E}} \newcommand{\esssup}{\mathrm{essup}} \newcommand{\fr}[1]{{\left\langle #1 \right\rangle}} \newcommand{\falph}{f} \renewcommand{\geq}{\geqslant} \newcommand{\hchi}{\hat \chi} \newcommand{\Hset}{\mathsf{H}} \newcommand{\Id}{\mathrm{Id}} \newcommand{\img}{\text{Im}} \newcommand{\indi}[1]{\mathbf{1}_{#1}} \newcommand{\indiacc}[1]{\mathbf{1}_{\{#1\}}} \newcommand{\indin}[1]{\mathbf{1}\{#1\}} \newcommand{\itemm}{\quad \quad \blacktriangleright \;} \newcommand{\jointtrans}[3]{p_{#1}(#2,#3)} \newcommand{\ker}{\text{Ker}} \newcommand{\klbck}[2]{\mathrm{K}\lr{#1||#2}} \newcommand{\law}{\mathcal{L}} \newcommand{\labelinit}{\pi} \newcommand{\labelkernel}{Q} \renewcommand{\leq}{\leqslant} \newcommand{\lone}{\mathsf{L}_1} \newcommand{\lp}[1]{\mathsf{L}_{{#1}}} \newcommand{\lrav}[1]{\left|#1 \right|} \newcommand{\lr}[1]{\left(#1 \right)} 
\newcommand{\lrb}[1]{\left[#1 \right]} \newcommand{\lrc}[1]{\left\{#1 \right\}} \newcommand{\lrcb}[1]{\left\{#1 \right\}} \newcommand{\ltwo}[1]{\PE^{1/2}\lrb{\lrcb{#1}^2}} \newcommand{\Ltwo}{\mathrm{L}^2} \newcommand{\mc}[1]{\mathcal{#1}} \newcommand{\mcbb}{\mathcal B} \newcommand{\mcD}{\mathcal{D}} \newcommand{\mcf}{\mathcal{F}} \newcommand{\mcl}{\mathcal{L}} \newcommand{\meas}[1]{\mathrm{M}_{#1}} \newcommand{\norm}[1]{\left\|#1\right\|} \newcommand{\normmat}[1]{{\left\vert\kern-0.25ex\left\vert\kern-0.25ex\left\vert #1 \right\vert\kern-0.25ex\right\vert\kern-0.25ex\right\vert}} \newcommand{\nset}{\mathbb N} \newcommand{\N}{\mathcal{N}} \newcommand{\one}{\mathsf{1}} \newcommand{\PE}{\mathbb E} \newcommand{\pminfty}{_{-\infty}^\infty} \newcommand{\PP}{\mathbb P} \newcommand{\projorth}[1]{\mathsf{P}^\perp_{#1}} \newcommand{\Psif}{\Psi_f} \newcommand{\pscal}[2]{\langle #1,#2\rangle} \newcommand{\pscal}[2]{\langle #1,#2\rangle} \newcommand{\psconv}{\stackrel{\PP-a.s.}{\rightarrow}} \newcommand{\qset}{\mathbb Q} \newcommand{\revcondtrans}[3]{q_{#1}(#2|#3)} \newcommand{\rmd}{\mathrm d} \newcommand{\rme}{\mathrm e} \newcommand{\rmi}{\mathrm i} \newcommand{\Rset}{\mathbb{R}} \newcommand{\rset}{\mathbb{R}} \newcommand{\rti}{\sigma} \newcommand{\section}[1]{==== #1 ====} \newcommand{\seq}[2]{\lrc{#1\eqsp: \eqsp #2}} \newcommand{\set}[2]{\lrc{#1\eqsp: \eqsp #2}} \newcommand{\sg}{\mathrm{sgn}} \newcommand{\supnorm}[1]{\left\|#1\right\|_{\infty}} \newcommand{\thv}{{\theta_\star}} \newcommand{\tmu}{ {\tilde{\mu}}} \newcommand{\Tset}{ {\mathsf{T}}} \newcommand{\Tsigma}{ {\mathcal{T}}} \newcommand{\ttheta}{{\tilde \theta}} \newcommand{\tv}[1]{\left\|#1\right\|_{\mathrm{TV}}} \newcommand{\unif}{\mathrm{Unif}} \newcommand{\weaklim}[1]{\stackrel{\mathcal{L}_{#1}}{\rightsquigarrow}} \newcommand{\Xset}{{\mathsf X}} \newcommand{\Xsigma}{\mathcal X} \newcommand{\Yset}{{\mathsf Y}} \newcommand{\Ysigma}{\mathcal Y} \newcommand{\Var}{\mathbb{V}\mathrm{ar}} \newcommand{\zset}{\mathbb{Z}} 
\newcommand{\Zset}{\mathsf{Z}} $$
Definition: Let $K$ be a convex set. A point $x \in K$ is an extreme point of $K$ if it cannot be written as a nontrivial convex combination of points of $K$: $$ \forall y,z \in K,\ \forall \lambda \in (0,1),\ x = \lambda y + (1-\lambda)z \Rightarrow y = z = x. $$ We denote by $\mathrm{Extr}(K)$ the set of extreme points of $K$.
Theorem (Krein–Milman). Let $E$ be a Euclidean space of dimension $n$ and let $K \subset E$ be convex and compact. Then $$ K = \mathrm{conv}(\mathrm{Extr}(K)), $$ where the notation $\mathrm{conv}(G)$ stands for the convex hull of the set $G$.
Proof.
It suffices to show $K \subset \mathrm{conv}(\mathrm{Extr}(K))$, the reverse inclusion being clear by convexity of $K$.
The proof proceeds by induction on $n$, the dimension of $E$. The case $n=0$ is trivial, so we assume the property holds in every Euclidean space of dimension $n-1$. Without loss of generality, we may also assume that $K$ is not reduced to a singleton.
Let $x \in K$ and choose $y \in K \setminus \{x\}$ arbitrarily. Consider a maximal segment $[a,b]$ contained in $K$ and containing $[x,y]$. Since $a,b \in K$ and $x$ is a convex combination of $a$ and $b$, it suffices to show that $a,b \in \mathrm{conv}(\mathrm{Extr}(K))$; we treat the case of $a$ (the case of $b$ is similar). By maximality of the segment, $a \in \partial K$, so there exists a sequence $(a_k)$ with $a_k \notin K$ converging to $a$, and for each $k$, separating the compact convex set $K$ from the point $a_k$ yields a unit vector $u_k$ such that $\langle z-a_k,u_k \rangle \geq 0$ for all $z \in K$. Extracting from $(u_k)$ a subsequence converging on the unit sphere to some $u$ and passing to the limit gives $\langle z-a,u \rangle \geq 0$ for all $z \in K$. Thus, setting $f(z)=\langle z-a,u \rangle$, we have $f \geq 0$ on $K$ and $f(a)=0$, so that $a \in H_a := f^{-1}(0) \cap K=\mathrm{argmin}_{z\in K} f(z)$. The set $H_a$ is convex, compact, non-empty, and contained in an affine hyperplane, i.e. a space of dimension $n-1$; the induction hypothesis then shows that $a \in \mathrm{conv}(\mathrm{Extr}(H_a))$. It remains to check that $\mathrm{Extr}(H_a) \subset \mathrm{Extr}(K)$. Indeed, let $z \in \mathrm{Extr}(H_a)$ and suppose $z=\lambda v + (1-\lambda) w$ with $v,w \in K$ and $\lambda \in (0,1)$. Since $f$ is affine with $f \geq 0$ on $K$, the identity $f(z)=0=\lambda f(v) + (1-\lambda) f(w)$ forces $f(v)=f(w)=0$, hence $v,w \in H_a$, and therefore $v=w=z$ because $z \in \mathrm{Extr}(H_a)$. This shows that $z \in \mathrm{Extr}(K)$ and the proof is concluded. $\Box$
Corollary. Let $E$ be a Euclidean space and $K \subset E$ convex and compact. Let $f : E \to \mathbb{R}$ be linear. Then $f$ attains its minimum on $K$ at a point of $\mathrm{Extr}(K)$.
Proof.
Let $x \in \mathrm{argmin}_{y\in K} f(y)$. By Krein–Milman, there exist $y_1,\ldots,y_m \in \mathrm{Extr}(K)$ such that $x$ is a convex combination of the $y_i$. Since $f$ is linear, $f(x)$ is the same convex combination of the values $f(y_i)$, each of which satisfies $f(y_i) \geq f(x) = \min_{y\in K} f(y)$; hence $f(y_i)=f(x)$ for every $i$ with positive weight, and the minimum is attained at an extreme point, which completes the proof. $\Box$
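As a finite-dimensional sanity check (an illustrative sketch, not part of the text; all names are ours), take $K$ to be the convex hull of a finite planar point set: its extreme points are exactly the convex-hull vertices, and a linear functional attains its minimum at one of them.

```python
# Illustration of the corollary for K = conv of a finite planar point set:
# the extreme points are the hull vertices (Andrew's monotone chain),
# and a linear functional attains its minimum at one of them.

def convex_hull(points):
    """Return the hull vertices of a 2D point set, counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def cross(o, a, b):
        return (a[0]-o[0])*(b[1]-o[1]) - (a[1]-o[1])*(b[0]-o[0])
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

# Unit square corners plus interior points: only the corners are extreme.
K = [(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.5), (0.25, 0.75)]
extr = convex_hull(K)
assert sorted(extr) == [(0, 0), (0, 1), (1, 0), (1, 1)]

f = lambda p: 2*p[0] - p[1]          # a linear functional
# the minimum over K equals the minimum over the extreme points
assert min(f(p) for p in K) == min(f(p) for p in extr)
```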
Theorem (Birkhoff–von Neumann). Let $B_n$ be the set of bistochastic (doubly stochastic) matrices of size $n\times n$, i.e. $$ B_n = \left\{ A = (a_{ij}) \in \mathcal{M}_n([0,1]) \;\middle|\; \sum_{j=1}^n a_{ij} = 1 \text{ for all } i,\ \sum_{i=1}^n a_{ij} = 1 \text{ for all } j \right\}. $$ Then $\mathrm{Extr}(B_n) = P_n$, the set of permutation matrices (i.e. matrices having exactly one $1$ in each row and each column and zeros everywhere else).
Proof.
A permutation matrix is clearly extreme. Conversely, we prove by induction on $n$ that any extreme matrix is a permutation matrix; the case $n=1$ is trivial. Assume the claim holds for $n-1$ and take $A \in \mathrm{Extr}(B_n)$.
We first show by contradiction that $A$ has at most $2n-1$ nonzero entries. Otherwise there exist $2n$ distinct pairs $(i_k,j_k)$ such that $a_{i_k j_k} > 0$ for all $k \in [1:2n]$. Define $F_1=\mathrm{Vect}(E_{i_k,j_k},\ k \in [1:2n])$, where $E_{i,j}$ denotes the matrix whose only nonzero entry is a $1$ at position $(i,j)$. Let $F_2$ be the vector space of matrices whose row and column sums are all zero (a basis is given by the matrices $E_{ij} - E_{in} - E_{nj} + E_{nn}$ with $i,j \in [1:n-1]$). Then $\dim F_1=2n$ and $\dim F_2=(n-1)^2$, so $F_1$ and $F_2$ cannot be in direct sum: the sum of their dimensions is $n^2+1$, which exceeds the dimension $n^2$ of the space of $n \times n$ matrices. Thus there exists a nonzero matrix $N \in F_1 \cap F_2$. Since $N$ has zero row and column sums and is supported on entries where $A$ is positive, $A \pm \epsilon N$ is bistochastic for $\epsilon$ small enough, and we would have $$ A=\frac{1}{2}(A+\epsilon N)+\frac{1}{2}(A-\epsilon N), $$ which contradicts that $A$ is extreme.
Thus $A$ has at most $2n-1$ nonzero entries. Since each of its $n$ rows contains at least one nonzero entry (its entries sum to $1$), some row contains exactly one, and that entry must equal $1$. The other entries in the corresponding column must then be $0$. Removing this row and column, we obtain a matrix of size $n-1$ which is still bistochastic and still extreme. By the induction hypothesis it is a permutation matrix, hence so is $A$, which concludes the induction. $\Box$
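The theorem is constructive in spirit: repeatedly finding a permutation supported on the positive entries and subtracting it yields a decomposition of a bistochastic matrix as a convex combination of permutation matrices. A minimal Python sketch (brute-force permutation search, fine only for small $n$; function names are ours):

```python
from itertools import permutations

def birkhoff_decompose(A, tol=1e-12):
    """Greedy Birkhoff decomposition of a bistochastic matrix A
    (list of lists) into weighted permutations. Illustrative only."""
    n = len(A)
    A = [row[:] for row in A]          # work on a copy
    parts = []
    while True:
        # find a permutation supported on the positive entries of A
        sigma = next((p for p in permutations(range(n))
                      if all(A[i][p[i]] > tol for i in range(n))), None)
        if sigma is None:
            break
        w = min(A[i][sigma[i]] for i in range(n))   # largest removable weight
        parts.append((w, sigma))
        for i in range(n):
            A[i][sigma[i]] -= w        # zeroes at least one entry per pass
    return parts

A = [[0.5, 0.3, 0.2],
     [0.2, 0.5, 0.3],
     [0.3, 0.2, 0.5]]
parts = birkhoff_decompose(A)
assert abs(sum(w for w, _ in parts) - 1.0) < 1e-9   # weights sum to 1
# reconstruct A as the weighted sum of the permutation matrices
R = [[sum(w for w, s in parts if s[i] == j) for j in range(3)]
     for i in range(3)]
assert all(abs(R[i][j] - A[i][j]) < 1e-9 for i in range(3) for j in range(3))
```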
Let \((X_i)_{i\in\mathbb{N}}\) be a family of independent and identically distributed (i.i.d.) random variables defined on a common probability space and taking values in a measurable space \((\mathsf{X}, \mathcal{X})\).
Equivalent Formulation: We can also consider \(X_i\) as the coordinate projection associated with the probability space \((\mathsf{X}^\mathbb{N}, \mathcal{X}^{\otimes\mathbb{N}}, \mathbb{P})\), where, under \(\mathbb{P}\), the sequence \((X_i)\) is i.i.d.
Permutation-Invariant σ-Fields: For any \(n \in \mathbb{N}\), let \(\mathcal{G}_n\) be the σ-field generated by measurable functions \(f: \mathsf{X}^\mathbb{N} \to \mathbb{R}\) that are invariant under any permutation of the first \(n\) coordinates. Define the σ-field $\mathcal{G}_\infty$ as: \[ \mathcal{G}_\infty = \bigcap_{n\in\mathbb{N}} \mathcal{G}_n, \] which is generated by measurable functions invariant under any permutation of a finite number of coordinates.
Statement: For any \(A \in \mathcal{G}_\infty\), \(\mathbb{P}(A) = 0\) or \(1\).
This result is known as the Hewitt-Savage 0-1 Law.
Objective: Show that \(\mathbb{P}(A) = \mathbb{P}(A)^2\), which implies \(\mathbb{P}(A) \in \{0,1\}\).
Key Idea: We aim to establish the identity \(\mathbb{P}(A) = \mathbb{E}[\mathsf{1}_A \mathsf{1}_A] = \mathbb{E}[\mathsf{1}_A]^2 = \mathbb{P}(A)^2\). To do so, the idea is to approximate $\mathsf{1}_A$ by the indicators of two different independent events.
Step 1: Approximation Lemma Let \(\delta > 0\). By the Approximation Lemma, there exist \(n \in \mathbb{N}\) and a set \(B \in \mathcal{F}_n = \sigma(X_{1:n})\) such that: \[ \mathbb{E}\left[|\mathsf{1}_A - \mathsf{1}_B|\right] \leq \delta. \] Since \(B \in \mathcal{F}_n\), there exists \(\bar{B} \in \mathcal{X}^{\otimes n}\) such that: \[ \mathsf{1}_B = \mathsf{1}_{\bar{B}}(X_{1:n}). \] Thus, we have: \[ \mathbb{E}\left[|\mathsf{1}_A - \mathsf{1}_{\bar{B}}(X_{1:n})|\right] \leq \delta. \]
Step 2: Permutation Argument Let \(\pi\) be the permutation of \(\{1, \ldots, 2n\}\) that swaps the first \(n\) coordinates with the next \(n\): \[ \pi(1:2n) = (n+1:2n, 1:n). \] Since \(A \in \mathcal{G}_\infty\), the event \(A\) is invariant under \(\pi\), and, the \((X_i)\) being i.i.d., the sequence \((X_{\pi(1:2n)}, X_{2n+1:\infty})\) has the same distribution as \((X_i)_{i\geq1}\). Therefore: \[ \delta \geq \mathbb{E}\left[|\mathsf{1}_A - \mathsf{1}_{\bar{B}}(X_{1:n})|\right] = \mathbb{E}\left[|\mathsf{1}_A(X_{\pi(1:2n)}, X_{2n+1:\infty}) - \mathsf{1}_{\bar{B}}(X_{\pi(1:n)})|\right] = \mathbb{E}\left[|\mathsf{1}_A - \mathsf{1}_{\bar{B}}(X_{n+1:2n})|\right], \] where the last equality uses the invariance of \(A\) under \(\pi\) and \(X_{\pi(1:n)} = X_{n+1:2n}\).
Step 3: Independence and Expectation The random variables \(\mathsf{1}_{\bar{B}}(X_{1:n})\) and \(\mathsf{1}_{\bar{B}}(X_{n+1:2n})\) are independent, by independence of the \((X_i)\); in particular \(\mathbb{E}[\mathsf{1}_{\bar{B}}(X_{1:n})\mathsf{1}_{\bar{B}}(X_{n+1:2n})] = \mathbb{E}[\mathsf{1}_{\bar{B}}(X_{1:n})]\,\mathbb{E}[\mathsf{1}_{\bar{B}}(X_{n+1:2n})]\). Then, by the triangle inequality: $$ |\mathbb{E}[\mathsf{1}_A \mathsf{1}_A] - \mathbb{E}[\mathsf{1}_A]\mathbb{E}[\mathsf{1}_A]| \leq |\mathbb{E}[\mathsf{1}_A \mathsf{1}_A] - \mathbb{E}[\mathsf{1}_{\bar{B}}(X_{1:n})\mathsf{1}_{\bar{B}}(X_{n+1:2n})]| + | \mathbb{E}[\mathsf{1}_{\bar{B}}(X_{1:n})]\mathbb{E}[\mathsf{1}_{\bar{B}}(X_{n+1:2n})]- \mathbb{E}[\mathsf{1}_A] \mathbb{E} [\mathsf{1}_A]| \leq 4\delta. $$ Since \(\delta > 0\) is arbitrary, we conclude that \(\mathbb{E}[\mathsf{1}_A \mathsf{1}_A] = \mathbb{E}[\mathsf{1}_A]^2\), which implies \(\mathbb{P}(A) = \mathbb{P}(A)^2\). Therefore, \(\mathbb{P}(A) \in \{0,1\}\).
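For completeness, here is the term-by-term accounting behind the bound $4\delta$, using Steps 1 and 2 together with $0 \leq \mathsf{1}_A, \mathsf{1}_{\bar B} \leq 1$:

$$ \big|\mathbb{E}[\mathsf{1}_A \mathsf{1}_A] - \mathbb{E}[\mathsf{1}_{\bar{B}}(X_{1:n})\mathsf{1}_{\bar{B}}(X_{n+1:2n})]\big| \leq \mathbb{E}\big[\mathsf{1}_A\,|\mathsf{1}_A - \mathsf{1}_{\bar{B}}(X_{n+1:2n})|\big] + \mathbb{E}\big[|\mathsf{1}_A - \mathsf{1}_{\bar{B}}(X_{1:n})|\,\mathsf{1}_{\bar{B}}(X_{n+1:2n})\big] \leq 2\delta, $$

$$ \big|\mathbb{E}[\mathsf{1}_{\bar{B}}(X_{1:n})]\mathbb{E}[\mathsf{1}_{\bar{B}}(X_{n+1:2n})] - \mathbb{E}[\mathsf{1}_A]\mathbb{E}[\mathsf{1}_A]\big| \leq \mathbb{E}\big[|\mathsf{1}_{\bar{B}}(X_{n+1:2n}) - \mathsf{1}_A|\big] + \mathbb{E}\big[|\mathsf{1}_{\bar{B}}(X_{1:n}) - \mathsf{1}_A|\big] \leq 2\delta, $$

and the two middle quantities coincide because $\mathbb{E}[\mathsf{1}_{\bar{B}}(X_{1:n})\mathsf{1}_{\bar{B}}(X_{n+1:2n})] = \mathbb{E}[\mathsf{1}_{\bar{B}}(X_{1:n})]\,\mathbb{E}[\mathsf{1}_{\bar{B}}(X_{n+1:2n})]$ by independence.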
Conclusion: The Hewitt–Savage 0-1 law states that any event in the exchangeable σ-field \(\mathcal{G}_\infty\), i.e. any event invariant under permutations of finitely many coordinates, has probability 0 or 1.
Let $(\Xset^{\mathbb{N}},\Xsigma^{\otimes\mathbb{N}},\mathbb{P})$ be a probability space. For each $k\in\mathbb{N}$, define $$ \mathcal{F}_k=\sigma(X_{1:k}), $$ where $X_i$ denotes the coordinate projection on the $i$-th component, that is, $X_i(\omega)=\omega_i$ for $\omega\in\Xset^{\mathbb{N}}$.
Lemma (Approximation Lemma). Any set $A\in\mathcal{X}^{\otimes\mathbb{N}}$ satisfies the approximation property:
For every $\delta>0$, there exist an integer $k\in\mathbb{N}$ and a set $B\in\mathcal{F}_k$ such that $$ \mathbb{E}\big[|\mathbb{1}_A-\mathbb{1}_B|\big]\leq\delta. $$
The proof is a classical application of the monotone class theorem.
Let $\mathcal{M}$ denote the collection of all sets $A\in\mathcal{X}^{\otimes\mathbb{N}}$ that satisfy the approximation property stated in the lemma. Note that $\mathcal{M}$ contains every $\mathcal{F}_k$ (any $A\in\mathcal{F}_k$ satisfies the property with $B=A$), and in particular $\Xset^{\mathbb{N}}\in\mathcal{M}$. We show that $\mathcal{M}$ is a monotone class.
Let $A_0,A_1\in\mathcal{M}$ with $A_0\subset A_1$. We claim that $A_1\setminus A_0\in\mathcal{M}$.
For arbitrary sets $A_0,A_1,B_0,B_1$, the following identity holds: $$ \mathbb{1}_{A_1\setminus A_0}-\mathbb{1}_{B_1\setminus B_0} =\mathbb{1}_{A_1}\mathbb{1}_{A_0^c}-\mathbb{1}_{B_1}\mathbb{1}_{B_0^c} =\mathbb{1}_{A_1}(\mathbb{1}_{A_0^c}-\mathbb{1}_{B_0^c}) +(\mathbb{1}_{A_1}-\mathbb{1}_{B_1})\mathbb{1}_{B_0^c}. $$
Taking absolute values and expectations yields $$ \mathbb{E}\big[|\mathbb{1}_{A_1\setminus A_0}-\mathbb{1}_{B_1\setminus B_0}|\big] \leq \mathbb{E}\big[|\mathbb{1}_{A_0}-\mathbb{1}_{B_0}|\big] + \mathbb{E}\big[|\mathbb{1}_{A_1}-\mathbb{1}_{B_1}|\big]. $$
Since $A_0,A_1\in\mathcal{M}$, both terms on the right-hand side can be made arbitrarily small. Hence, the approximation property holds for $A_1\setminus A_0$.
Let $(A_n)_{n\geq0}$ be an increasing sequence of sets in $\mathcal{M}$, and define $$ A=\bigcup_{n\geq0}A_n. $$
For any set $B$, the triangle inequality gives $$ \mathbb{E}\big[|\mathbb{1}_A-\mathbb{1}_B|\big] \leq \mathbb{E}\big[|\mathbb{1}_A-\mathbb{1}_{A_n}|\big] + \mathbb{E}\big[|\mathbb{1}_{A_n}-\mathbb{1}_B|\big]. $$
Since $\mathbb{1}_{A_n}\uparrow\mathbb{1}_A$ pointwise, the first term converges to zero, while the second term can be made arbitrarily small because $A_n\in\mathcal{M}$. Therefore, $A\in\mathcal{M}$.
Finally, $\mathcal{M}$ contains all the $\mathcal{F}_k$ and is closed under proper differences and increasing limits; closure under decreasing limits follows by taking complements. By the monotone class theorem, $\mathcal{M}$ therefore contains $$ \sigma\Big(\bigcup_{k\geq0}\mathcal{F}_k\Big)=\mathcal{X}^{\otimes\mathbb{N}}. $$ This completes the proof.
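As a numerical illustration of the lemma (a sketch under assumptions of our choosing, not part of the proof): take $X_i$ i.i.d. Bernoulli$(1/2)$, so that $U=\sum_i X_i 2^{-i}$ is uniform on $[0,1]$, and let $A=\{U < t\}$ for a non-dyadic threshold $t$. The finite-coordinate event $B_k=\{\sum_{i\leq k} X_i 2^{-i} < t\} \in \mathcal{F}_k$ approximates $A$ with error of order $2^{-k}$:

```python
import random

random.seed(0)

t = 1 / 3        # a non-dyadic threshold (illustrative choice)
K = 40           # deep truncation standing in for "all coordinates"
n_mc = 20000

def partial(bits, k):
    """Partial dyadic sum sum_{i<=k} X_i 2^{-i} of a bit sequence."""
    return sum(b * 2.0 ** -(i + 1) for i, b in enumerate(bits[:k]))

samples = [[random.randint(0, 1) for _ in range(K)] for _ in range(n_mc)]

# Monte Carlo estimate of E|1_A - 1_{B_k}| for increasing k
est = {}
for k in (2, 6, 10):
    d = sum(1 for bits in samples
            if (partial(bits, k) < t) != (partial(bits, K) < t))
    est[k] = d / n_mc

# disagreement events are nested in k, so the estimates decrease,
# and the error is roughly of order 2^{-k}
assert est[2] >= est[6] >= est[10]
assert est[10] < 0.02
```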
Let $a<b$ and let $(X_k)_{k\ge0}$ be a real-valued process. We define an indicator process $(C_k)$ that tracks the phases during which the process is performing an upcrossing from $a$ to $b$.
Set $$ C_0 := \mathbf 1_{\{X_0<a\}}. $$ For all $k\ge1$, define recursively $$ C_k = \mathbf 1_{\{C_{k-1}=1\}}\mathbf 1_{\{X_{k-1}\le b\}} + \mathbf 1_{\{C_{k-1}=0\}}\mathbf 1_{\{X_{k-1}<a\}}. $$
Define also $$ Y_k := \sum_{\ell=1}^k C_\ell (X_\ell - X_{\ell-1}). $$
Interpretation.
Let $U_{0:n}[a,b]$ denote the number of completed upcrossings from $a$ to $b$ between times $0$ and $n$. Then the following inequality holds: $$ Y_n \ge (b-a)\,U_{0:n}[a,b] - (X_n-a)^-. $$
Each completed upcrossing from $a$ to $b$ contributes at least $b-a$ to the sum $Y_n$, since only the increments occurring during the corresponding phase are accumulated.
The only possible negative contribution comes from an unfinished upcrossing at time $n$, when $X_n<a$. This loss is exactly controlled by the term $(X_n-a)^-$.
Summing the contributions of all completed upcrossings and accounting for the final correction yields the stated inequality. ∎
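The recursion for $(C_k)$ and the inequality are easy to check numerically. Below is an illustrative Python sketch (the random-walk parameters are arbitrary choices of ours) that computes $Y_n$ and the completed-upcrossing count $U$, counting a completed upcrossing each time $C$ switches from $1$ to $0$:

```python
import random

def upcrossing_check(path, a, b):
    """Compute Y_n and the number U of completed upcrossings of [a, b]
    along path = (X_0, ..., X_n), following the recursion for (C_k)."""
    C_prev = 1 if path[0] < a else 0          # C_0
    Y, U = 0.0, 0
    for k in range(1, len(path)):
        if C_prev == 1:
            C = 1 if path[k - 1] <= b else 0  # phase ends once X exceeds b
        else:
            C = 1 if path[k - 1] < a else 0   # phase starts once X drops below a
        if C_prev == 1 and C == 0:
            U += 1                            # an upcrossing was completed
        Y += C * (path[k] - path[k - 1])
        C_prev = C
    return Y, U

# check Y_n >= (b - a) U - (X_n - a)^- on random walk paths
random.seed(1)
a, b = -1.0, 1.0
for _ in range(200):
    X, path = 0.0, [0.0]
    for _ in range(300):
        X += random.gauss(0.0, 1.0)
        path.append(X)
    Y, U = upcrossing_check(path, a, b)
    assert Y >= (b - a) * U - max(0.0, a - path[-1]) - 1e-9
```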
Let $(X_n)_{n\ge0}$ be a supermartingale such that $$ \sup_n \mathbb E[|X_n|] < \infty. $$ Then the limit $$ \lim_{n\to\infty} X_n $$ exists almost surely.
By the upcrossing inequality, for any $a<b$, $$ (b-a)\,\mathbb E[U_{0:n}[a,b]] \le \mathbb E[Y_n] + \mathbb E[(X_n-a)^-]. $$
Since $(X_n)$ is a supermartingale and $(C_k)$ is predictable, nonnegative and bounded, $(Y_n)$ is also a supermartingale; as $Y_0=0$, this gives $\mathbb E[Y_n]\le0$. The $L^1$-boundedness assumption implies that $\sup_n \mathbb E[(X_n-a)^-]<\infty$.
Letting $n\to\infty$, the monotone convergence theorem yields $$ \mathbb E[U_{0:\infty}[a,b]]<\infty, $$ so the total number of upcrossings of $[a,b]$ is almost surely finite. Applying this to every pair of rationals $a<b$, the process cannot oscillate infinitely many times across any nondegenerate interval, hence $\liminf_n X_n = \limsup_n X_n$ and $X_n$ converges almost surely; the limit is finite a.s. by Fatou's lemma and $L^1$-boundedness. ∎
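As an illustration of the convergence phenomenon (a sketch under assumptions of our choosing, not the proof): the Doob martingale $X_n=\mathbb E[Z\mid\mathcal F_n]$, where $\mathcal F_n$ is generated by the first $n$ binary digits of a uniform $U$ and $Z=f(U)$, is uniformly bounded, hence an $L^1$-bounded (super)martingale, and converges to $Z$. For $f(x)=x^2$ the conditional expectation has a closed form:

```python
# Doob martingale X_n = E[f(U) | first n binary digits of U], f(x) = x^2:
# X_n is the average of f over the dyadic interval of length 2^{-n}
# containing U, and |X_n - f(U)| <= 2 * 2^{-n} since f is 2-Lipschitz on [0,1].

def cond_exp(u, n):
    """E[f(U) | generation-n dyadic interval containing u], f(x) = x^2."""
    j = int(u * 2**n)                     # index of the dyadic interval
    p, q = j * 2.0**-n, (j + 1) * 2.0**-n
    return (p*p + p*q + q*q) / 3.0        # exact mean of x^2 over [p, q]

u = 0.6180339887                          # an arbitrary sample point (our choice)
Z = u * u
errs = [abs(cond_exp(u, n) - Z) for n in range(1, 30)]

# the deterministic Lipschitz bound confirms convergence along this path
assert all(errs[n] <= 2.0 * 2.0**-(n + 1) + 1e-12 for n in range(len(errs)))
assert errs[-1] < 1e-6
```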
Intuition. Conditioning with respect to smaller and smaller $\sigma$-fields means losing information. The conditional expectations stabilize and converge to the conditional expectation with respect to the remaining common information.
Let $(\mathcal F_{-n})_{n\ge1}$ be a decreasing sequence of $\sigma$-fields such that $$ \mathcal F_{-\infty} = \bigcap_{k\ge1}\mathcal F_{-k} \subseteq \cdots \subseteq \mathcal F_{-(n+1)} \subseteq \mathcal F_{-n} \subseteq \cdots \subseteq \mathcal F_{-1}. $$
For any random variable $Z$ satisfying $\mathbb E[|Z|]<\infty$, we have $$ \lim_{n\to\infty}\mathbb E[Z\mid\mathcal F_{-n}] = \mathbb E[Z\mid\mathcal F_{-\infty}] \quad\text{almost surely}. $$
Define the process $$ X_k := \mathbb E[Z\mid\mathcal F_k], \qquad -n\le k\le -1. $$ This is a martingale bounded in $L^1$.
Similarly as before, we define $C_{-n}=\mathsf{1}_{X_{-n} < a}$. For all $-n < k \leq -1$, we set $C_k=\mathsf{1}_{C_{k-1}=1} \mathsf{1}_{X_{k-1} \leq b}+ \mathsf{1}_{C_{k-1}=0} \mathsf{1}_{X_{k-1} <a}$. Define also $Y_k=\sum_{\ell=-n+1}^k C_\ell (X_\ell - X_{\ell-1})$. Then, for the same reasons, the number of upcrossings $U_{-n:-1}[a,b]$ between times $-n$ and $-1$ can be bounded as follows: $$ Y_{-1} \geq (b-a)\, U_{-n:-1}[a,b] - (X_{-1}-a)^- $$
As before, the integrability of $Z$ ensures that the expected number of upcrossings is finite, which implies that $(X_k)$ converges almost surely as $k\to-\infty$.
The limit is measurable with respect to each $\mathcal F_{-n}$, hence with respect to $\mathcal F_{-\infty}$; moreover, the family $(\mathbb E[Z\mid\mathcal F_{-n}])_{n}$ is uniformly integrable, so the convergence also holds in $L^1$, and passing to the limit in the defining property of conditional expectation identifies the limit as $\mathbb E[Z\mid\mathcal F_{-\infty}]$, which completes the proof. ∎
De Finetti's Theorem: Let $(X_i)_{i\in\mathbb{N}}$ be a family of exchangeable random elements taking values in a measurable space $(\mathsf{X},\mathcal{X})$, i.e. such that the joint distribution of the sequence is invariant under any permutation of finitely many indices. Then there exists a $\sigma$-field $\mathcal{G}_\infty$ such that, conditionally on $\mathcal{G}_\infty$, the random variables $(X_i)_{i\in\mathbb{N}}$ are independent and identically distributed (i.i.d.).
The proof is based on the paper “Uses of exchangeability” by J. F. C. Kingman.
Without loss of generality, we model $(X_i)_{i\in\mathbb{N}}$ as the coordinate projections on the canonical probability space $(\mathsf{X}^{\mathbb{N}},\mathcal{X}^{\otimes\mathbb{N}},\mathbb{P})$. We proceed as follows:
1. Reverse filtration construction: For each $n\in\mathbb{N}$, let $\mathcal{G}_n$ be the $\sigma$-field generated by the measurable functions $f:\mathsf{X}^{\mathbb{N}}\to\mathbb{R}$ that are invariant under permutations of the first $n$ coordinates, i.e. such that for every permutation $\pi$ of $[1:n]$, $$ f(x_1,\ldots,x_n,x_{n+1},\ldots)=f(x_{\pi(1)},\ldots,x_{\pi(n)},x_{n+1},\ldots). $$ The sequence $(\mathcal{G}_n)$ is nonincreasing; define $$ \mathcal{G}_\infty=\bigcap_{n\in\mathbb{N}}\mathcal{G}_n. $$
2. Conditional expectation and empirical averages: Let $h:\mathsf{X}\to\mathbb{R}$ be bounded and measurable, and let $A\in\mathcal{G}_n$. By exchangeability, $\mathbb{E}[h(X_i)\mathbf1_A]$ does not depend on $i\in[1:n]$, so $$ \mathbb{E}\!\left[\left(\frac1n\sum_{i=1}^n h(X_i)\right)\mathbf1_A\right] =\frac1n\sum_{i=1}^n \mathbb{E}\!\left[ h(X_i)\mathbf1_A\right] =\mathbb{E}[h(X_1)\mathbf1_A]. $$ Since $\frac1n\sum_{i=1}^n h(X_i)$ is invariant under permutations of the first $n$ coordinates, it is $\mathcal{G}_n$-measurable, and the identity above means that $$ \frac1n\sum_{i=1}^n h(X_i)=\mathbb{E}[h(X_1)\mid\mathcal{G}_n], \quad a.s. $$ By the reverse martingale convergence theorem, $$ \frac1n\sum_{i=1}^n h(X_i)\xrightarrow{\mathrm{a.s.}}\mathbb{E}[h(X_1)\mid\mathcal{G}_\infty]. $$
3. Multivariate functions: The same argument applied to a bounded measurable $f:\mathsf{X}^k\to\mathbb{R}$, averaging over $k$-tuples of distinct indices, yields $$ \mathbb{E}[f(X_1,\ldots,X_k)\mid\mathcal{G}_\infty] =\lim_{n\to\infty}\frac1{n(n-1)\cdots(n-k+1)} \sum_{\substack{i_{1:k} \in [1:n]^k\\ i_j\neq i_\ell}} f(X_{i_1},\ldots,X_{i_k}). $$ Moreover, since $f$ is bounded, the tuples with repeated indices are asymptotically negligible: $$ \frac1{n(n-1)\cdots(n-k+1)} \sum_{\substack{i_{1:k} \in [1:n]^k\\ i_j\neq i_\ell}} f(X_{i_1},\ldots,X_{i_k}) + O\lr{\frac1n} = \frac1{n^k}\sum_{i_1=1}^n\cdots\sum_{i_k=1}^n f(X_{i_1},\ldots,X_{i_k}). $$ Applying this with $f(x_1,\ldots,x_k)=f_1(x_1)\cdots f_k(x_k)$, the right-hand side factors into a product of empirical averages, each converging almost surely by Step 2, so that $$ \mathbb{E}[f_1(X_1)\cdots f_k(X_k)\mid\mathcal{G}_\infty] =\prod_{\ell=1}^k\mathbb{E}[f_\ell(X_1)\mid\mathcal{G}_\infty], $$ which is precisely conditional independence, with a common conditional distribution, given $\mathcal{G}_\infty$.
If $(X_i)$ are real-valued random variables, then, since the distribution of any random variable $X$ is completely determined by the values $\mathbb{P}(X \le x)$ for $x \in \mathbb{Q}$, we can replace $\mathcal{G}_\infty$ with the $\sigma$-field generated by the countable family of random variables \[ \left\{ \mathbb{E}\big[\mathbf{1}_{\{X_1 \le x\}} \mid \mathcal{G}_\infty \big] : x \in \mathbb{Q} \right\}. \] It follows that there exists a random variable $S$ such that, conditional on $S$, the sequence $(X_i)$ is independent and identically distributed.
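A numerical sketch of this picture (variable names and parameters are illustrative choices of ours): mixing i.i.d. Bernoulli sequences over a random bias $\theta$ produces an exchangeable sequence, and the empirical averages of Steps 2 and 3 recover the conditional moments given $\theta$, including the product structure for $k=2$:

```python
import random

random.seed(42)

# draw a random bias theta (playing the role of S / G_infty), then an
# i.i.d. Bernoulli(theta) sequence conditionally on theta: the resulting
# sequence is exchangeable
n = 20000
theta = random.random()
xs = [1 if random.random() < theta else 0 for _ in range(n)]

# (1/n) sum h(X_i) with h = identity: approximates E[X_1 | theta] = theta
mean1 = sum(xs) / n

# average over pairs of DISTINCT indices, as in the k = 2 formula:
# approximates E[X_1 X_2 | theta] = theta^2
s = sum(xs)
mean2 = (s * s - sum(x * x for x in xs)) / (n * (n - 1))

assert abs(mean1 - theta) < 0.05
assert abs(mean2 - theta ** 2) < 0.05
```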
The previous approach makes it possible to prove the strong law of large numbers.
Let \((X_i)\) be a sequence of independent and identically distributed (i.i.d.) random variables such that \(\mathbb{E}[|X_1|] < \infty\). Then, \[ \frac{1}{n} \sum_{i=1}^n X_i \xrightarrow{a.s.} \mathbb{E}[X_1]. \]
Proof: As shown in the proof of de Finetti's theorem, the empirical average is a reverse martingale: \(\frac1n\sum_{i=1}^n X_i = \mathbb{E}[X_1|\mathcal{G}_n]\). By the reverse martingale convergence theorem (see Upcrossing Inequality and Martingale Convergence), it converges almost surely to the conditional expectation with respect to the exchangeable σ-field \(\mathcal{G}_\infty\): \[ \frac{1}{n} \sum_{i=1}^n X_i = \mathbb{E}[X_1|\mathcal{G}_n] \xrightarrow{a.s.} \mathbb{E}[X_1|\mathcal{G}_\infty]. \]
The Hewitt–Savage 0-1 law states that every event in \(\mathcal{G}_\infty\) has probability 0 or 1; consequently, every \(\mathcal{G}_\infty\)-measurable random variable is almost surely constant. The conditional expectation \(\mathbb{E}[X_1|\mathcal{G}_\infty]\) is therefore almost surely equal to the unconditional expectation: \[ \mathbb{E}[X_1|\mathcal{G}_\infty] = \mathbb{E}[X_1] \quad \text{a.s.} \]
Thus, the empirical average converges almost surely to the theoretical expectation: \[ \frac{1}{n} \sum_{i=1}^n X_i \xrightarrow{a.s.} \mathbb{E}[X_1]. \]
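A quick numerical sanity check of the conclusion (illustrative parameters, not part of the proof):

```python
import random

random.seed(7)

# sample mean of i.i.d. Exponential(1) variables, for which E[X_1] = 1
n = 100000
total = 0.0
for _ in range(n):
    total += random.expovariate(1.0)   # Exponential(1): mean 1, variance 1

# the standard error is 1/sqrt(n) ~ 0.003, so a 0.05 tolerance is generous
assert abs(total / n - 1.0) < 0.05
```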