{{page>:defs}}
====== Statement ======

Let $(X_i)_{1\leq i \leq n}$ be $n$ observations in $\rset^d$. Define $\Hset_p$ as the set of all subspaces of $\rset^d$ of dimension $p$, and write $\projorth{H}X$ for the orthogonal projection of a vector $X \in \rset^d$ onto a subspace $H$.

In Principal Component Analysis (PCA), we consider the optimisation problem:
\begin{align*}
V_p&= \argmin_{H \in \Hset_p} \sum_{i=1}^n \norm{X_i - \projorth{H}X_i}^2 \\
&= \argmax_{H \in \Hset_p} \sum_{i=1}^n \norm{\projorth{H}X_i}^2
\end{align*}
where the second equality follows from Pythagoras' theorem: for any subspace $H$ and any $i$, $\norm{X_i}^2=\norm{X_i - \projorth{H}X_i}^2+\norm{\projorth{H}X_i}^2$.
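As a quick sanity check, the Pythagoras identity $\norm{X_i}^2=\norm{X_i - \projorth{H}X_i}^2+\norm{\projorth{H}X_i}^2$ underlying this optimisation problem can be verified numerically. The sketch below is illustrative only (synthetic Gaussian data, NumPy assumed; none of these names come from the page):

```python
# Illustrative check: for any subspace H, the total energy splits into
# residual energy plus projected energy, so minimising one sum over H
# is the same as maximising the other.
import numpy as np

rng = np.random.default_rng(3)
n, d, p = 30, 4, 2
X = rng.standard_normal((n, d))                    # rows play the role of X_i
B, _ = np.linalg.qr(rng.standard_normal((d, p)))   # orthonormal basis of some H
P = B @ B.T                                        # orthogonal projector onto H

total = np.sum(X ** 2)                             # sum_i ||X_i||^2
proj = np.sum((X @ P) ** 2)                        # sum_i ||P_H X_i||^2
resid = np.sum((X - X @ P) ** 2)                   # sum_i ||X_i - P_H X_i||^2
assert np.isclose(total, proj + resid)             # Pythagoras, summed over i
```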
Define $S_n= \sum_{i=1}^n X_i X_i^T$. Since $S_n$ is a symmetric positive semi-definite matrix with real entries, it is diagonalizable in an orthonormal basis $(w_j)_{1\leq j \leq d}$ with eigenvalues $(\lambda_j)_{1\leq j \leq d}$ ranked in decreasing order, that is, $\lambda_1 \geq \ldots \geq \lambda_d \geq 0$. Defining $D={\mathrm{Diag}}((\lambda_j)_{1\leq j \leq d})$ and $U=[w_1,\ldots,w_d]$, we have $S_n = U D U^T=\sum_{j=1}^d \lambda_j w_j w_j^T$.
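The decomposition $S_n = U D U^T$ can be reproduced numerically. The following sketch (synthetic data; NumPy assumed, not part of the original page) builds $S_n$ and checks the spectral factorisation and the eigenvalue ordering:

```python
# Build S_n for random data and check S_n = U D U^T with eigenvalues
# sorted in decreasing order, mirroring the notation of the text.
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 4
X = rng.standard_normal((n, d))      # rows play the role of the observations X_i

S_n = X.T @ X                        # S_n = sum_i X_i X_i^T
lam, U = np.linalg.eigh(S_n)         # eigh returns eigenvalues in ascending order
lam, U = lam[::-1], U[:, ::-1]       # reorder so that lambda_1 >= ... >= lambda_d
D = np.diag(lam)

assert np.allclose(S_n, U @ D @ U.T) # S_n = U D U^T
assert np.all(lam[:-1] >= lam[1:])   # decreasing order
assert np.all(lam >= -1e-12)         # positive semi-definite
```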
<note tip> For any $p \in [1:d]$, we have
\begin{equation} \label{eq:vp}
V_p=\mathrm{Span}(w_1,\ldots,w_p)
\end{equation}
</note>
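The lemma lends itself to a numerical check. In the sketch below (synthetic data; the helper `residual` is ours, not from the page), no random $p$-dimensional subspace achieves a smaller residual than $\mathrm{Span}(w_1,\ldots,w_p)$, whose residual equals $\sum_{j=p+1}^d \lambda_j$:

```python
# Illustrative check of the lemma: the span of the top-p eigenvectors of S_n
# minimises the residual sum over p-dimensional subspaces.
import numpy as np

rng = np.random.default_rng(2)
n, d, p = 50, 5, 2
X = rng.standard_normal((n, d))
S_n = X.T @ X
lam, U = np.linalg.eigh(S_n)             # ascending eigenvalues
W = U[:, ::-1][:, :p]                    # orthonormal basis of Span(w_1,...,w_p)

def residual(B):
    """sum_i ||X_i - P_H X_i||^2 for H spanned by the orthonormal columns of B."""
    P = B @ B.T                          # orthogonal projector onto H
    return float(np.sum((X - X @ P) ** 2))

best = residual(W)
assert np.isclose(best, lam[:d - p].sum())   # residual = sum of smallest d-p eigenvalues
for _ in range(200):
    B, _ = np.linalg.qr(rng.standard_normal((d, p)))   # random p-dim subspace
    assert residual(B) >= best - 1e-9
```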
===== Proof =====

The proof is by induction. We start with $p=1$. For any unit vector $w\in \rset^d$,
\begin{align}
\sum_{i=1}^n \norm{\projorth{\rset w}X_i}^2&=\sum_{i=1}^n (X_i^T w)^2=\sum_{i=1}^n w^T X_i X_i^T w=w^T \lr{\sum_{i=1}^n X_i X_i^T} w \nonumber\\
&=w^T S_n w=w^T \lr{\sum_{j=1}^d \lambda_j w_j w_j^T} w=\sum_{j=1}^d \lambda_j (w^T w_j)^2 \label{eq:dim1}
\end{align}
Therefore, since $\lambda_j \leq \lambda_1$ for all $j$ and $(w_j)_{1\leq j \leq d}$ is an orthonormal basis,
\begin{align*}
\sum_{i=1}^n \norm{\projorth{\rset w}X_i}^2 &\leq \lambda_1 \sum_{j=1}^d (w^T w_j)^2=\lambda_1 \norm{w}^2=\lambda_1
\end{align*}
Note that from \eqref{eq:dim1}, we have for any eigenvector $w_k$ of $S_n$,
\begin{equation} \label{eq:eigenvector}
\sum_{i=1}^n \norm{\projorth{\rset w_k}X_i}^2=w_k^T S_n w_k=\lambda_k.
\end{equation}
In particular, $\sum_{i=1}^n \norm{\projorth{\rset w_1}X_i}^2=\lambda_1$, so the upper bound $\lambda_1$ is attained at $w=w_1$. Therefore \eqref{eq:vp} holds true for $p=1$.
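The base case can also be illustrated numerically. In this sketch (synthetic data; the helper name `projected_energy` is our choice, not from the page), the projected energy of any random unit vector stays below $\lambda_1$, with equality at $w_1$:

```python
# Illustrative check of the p=1 case: sum_i ||P_{Rw} X_i||^2 = w^T S_n w
# is at most lambda_1 for unit vectors w, with equality when w = w_1.
import numpy as np

rng = np.random.default_rng(1)
n, d = 50, 4
X = rng.standard_normal((n, d))
S_n = X.T @ X
lam, U = np.linalg.eigh(S_n)             # ascending eigenvalues
lam1, w1 = lam[-1], U[:, -1]             # largest eigenvalue and its eigenvector

def projected_energy(w):
    """sum_i ||P_{Rw} X_i||^2 = sum_i (X_i^T w)^2 for a unit vector w."""
    return float(np.sum((X @ w) ** 2))

for _ in range(100):
    w = rng.standard_normal(d)
    w /= np.linalg.norm(w)               # random unit vector
    assert projected_energy(w) <= lam1 + 1e-9

assert np.isclose(projected_energy(w1), lam1)   # bound attained at w = w_1
```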
Assume now that \eqref{eq:vp} holds true for some $p \in [1:d-1]$ and let $H \in \Hset_{p+1}$. Denote $G=\mathrm{Span}(w_1,\ldots,w_p)^\perp$. Since $\mathrm{dim}(G)=d-p$ and $\mathrm{dim}(H)=p+1$, we must have $G \cap H \neq \{0\}$ (otherwise the subspace $G+H$ would be of dimension $d-p+p+1=d+1$, which is not possible). Let $w_0$ be a unit vector of $G \cap H$. Then, we have the orthogonal decomposition $H=\rset w_0 \stackrel{\perp}{+} H_0$, where $H_0$ is of dimension $p$. Applying \eqref{eq:dim1} to $w_0$,
\begin{align*}
\sum_{i=1}^n \norm{\projorth{H}X_i}^2&= \sum_{i=1}^n \lr{\norm{\projorth{\rset w_0}X_i}^2+\norm{\projorth{H_0}X_i}^2}\\
&=\sum_{j=1}^d \lambda_j (w_0^T w_j)^2 +\sum_{i=1}^n \norm{\projorth{H_0}X_i}^2\\
&=\sum_{j=p+1}^d \lambda_j (w_0^T w_j)^2 +\sum_{i=1}^n \norm{\projorth{H_0}X_i}^2
\end{align*}
where we used that $w_0 \in G=\mathrm{Span}(w_1,\ldots,w_p)^\perp$, so that $w_0^T w_j=0$ for $j \in [1:p]$. Applying the induction assumption and then \eqref{eq:eigenvector},
\begin{align*}
\sum_{i=1}^n \norm{\projorth{H}X_i}^2& \leq \lambda_{p+1} \sum_{j=p+1}^d (w_0^T w_j)^2 +\sum_{i=1}^n \norm{\projorth{\mathrm{Span}(w_1,\ldots,w_p)}X_i}^2\\
& = \lambda_{p+1} \norm{w_0}^2+\sum_{i=1}^n \norm{\projorth{\mathrm{Span}(w_1,\ldots,w_p)}X_i}^2\\
& = \lambda_{p+1} +\sum_{i=1}^n \norm{\projorth{\mathrm{Span}(w_1,\ldots,w_p)}X_i}^2\\
& = \sum_{i=1}^n \norm{\projorth{\rset w_{p+1}}X_i}^2 +\sum_{i=1}^n \norm{\projorth{\mathrm{Span}(w_1,\ldots,w_p)}X_i}^2\\
& = \sum_{i=1}^n \lr{\norm{\projorth{\rset w_{p+1}}X_i}^2 +\norm{\projorth{\mathrm{Span}(w_1,\ldots,w_p)}X_i}^2}\\
& = \sum_{i=1}^n \norm{\projorth{\mathrm{Span}(w_1,\ldots,w_{p+1})}X_i}^2
\end{align*}
Since $H \in \Hset_{p+1}$ was arbitrary, this shows that $V_{p+1}=\mathrm{Span}(w_1,\ldots,w_{p+1})$, which concludes the proof by induction.