Welcome to Randal Douc's wiki

A collaborative site on maths but not only!

User Tools

Site Tools


world:pca

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
Last revision Both sides next revision
world:pca [2022/11/12 13:50]
rdouc [Proof]
world:pca [2022/11/13 18:37]
rdouc [Proof]
Line 3: Line 3:
 ====== Statement ====== ====== Statement ======
  
-Let $(X_i)_{1\leq i \leq n}$ be $n$ observations in $\rset^d$. Define $\Hset_p$ the set of all the subspaces of $\rset^d$ of dimension $p$ and write $\projorth{H}X$ the orthogonal projection of a vector $X \in \rset^d$ on a subspace $\Hset$. +Let $(X_i)_{1\leq i \leq n}$ be $n$ observations in $\rset^d$. Define $\Hset_p$ the set of all the subspaces of $\rset^d$ of dimension $p$ and write $\projorth{H}X$ the orthogonal projection of a vector $X \in \rset^d$ on a subspace $H$. 
  
 In Principal Component Analysis (PCA), we consider the optimisation problem: ​ In Principal Component Analysis (PCA), we consider the optimisation problem: ​
Line 27: Line 27:
     &=w^T S_n w=w^T \lr{\sum_{j=1}^d \lambda_j ​ w_j w_j^T} w=\sum_{j=1}^d \lambda_j (w^T w_j)^2 \label{eq:​dim1}     &=w^T S_n w=w^T \lr{\sum_{j=1}^d \lambda_j ​ w_j w_j^T} w=\sum_{j=1}^d \lambda_j (w^T w_j)^2 \label{eq:​dim1}
 \end{align} \end{align}
 +Therefore, we have 
 \begin{align*} \begin{align*}
-    &\leq \lambda_1 \sum_{j=1}^d (w^T w_j)^2=\lambda_1 \norm{w}^2=\lambda_1 ​+    ​\sum_{i=1}^n \norm{\projorth{\rset w}X_i}^2 ​&\leq \lambda_1 \sum_{j=1}^d (w^T w_j)^2=\lambda_1 \norm{w}^2=\lambda_1 ​
 \end{align*} \end{align*}
-Since $\sum_{i=1}^n \norm{\projorth{\rset ​w_1}X_i}^2=w_1^T S_n w_1=\lambda_1$. Therefore \eqref{eq:​vp} holds true for $p=1$. ​+Note that from \eqref{eq:​dim1},​ we have for any eigenvector ​$w_k$ of $S_n$, ​  
 +\begin{equation} \label{eq:​eigenvector} 
 +\sum_{i=1}^n \norm{\projorth{\rset ​w_k}X_i}^2=w_k^T S_n w_k=\lambda_k. 
 +\end{equation} 
 +In particular, $\sum_{i=1}^n \norm{\projorth{\rset ​w_1}X_i}^2=\lambda_1$. Therefore \eqref{eq:​vp} holds true for $p=1$. ​
  
-Assume now that \eqref{eq:​vp} hold true for some $p \in [1:d-1]$ and let $H \in \Hset_p$. Then, denote by $G=\mathrm{Span}(w_1,​\ldots,​w_p)^\perp$. Since $\mathrm{dim} (G)=d-p$ and $\mathrm{dim}(H)=p+1$,​ we must have $G \cap H \notin \{0\}$ (Otherwise the subspace $G+H$ would be of dimension $n-p+p+1=n+1$ which is not possible). Let $w_0$ a unitary vector of $G \cap H$. Then, we have the decomposition $H=\rset w_0 \stackrel{\perp}{+} H_0$ where $H_0$ is of dimension $p$. Then, +Assume now that \eqref{eq:​vp} hold true for some $p \in [1:d-1]$ and let $H \in \Hset_{p+1}$. Then, denote by $G=\mathrm{Span}(w_1,​\ldots,​w_p)^\perp$. Since $\mathrm{dim} (G)=d-p$ and $\mathrm{dim}(H)=p+1$,​ we must have $G \cap H \notin \{0\}$ (Otherwise the subspace $G+H$ would be of dimension $d-p+p+1=d+1$ which is not possible). Let $w_0$ a unitary vector of $G \cap H$. Then, we have the decomposition $H=\rset w_0 \stackrel{\perp}{+} H_0$ where $H_0$ is of dimension $p$. Then, applying \eqref{eq:​dim1}
 \begin{align*} \begin{align*}
     \sum_{i=1}^n \norm{\projorth{H}X_i}^2&​= \sum_{i=1}^n \norm{\projorth{\rset w_0}X_i}^2+\norm{\projorth{H_0}X_i}^2\\     \sum_{i=1}^n \norm{\projorth{H}X_i}^2&​= \sum_{i=1}^n \norm{\projorth{\rset w_0}X_i}^2+\norm{\projorth{H_0}X_i}^2\\
Line 38: Line 43:
     &​=\sum_{j=p+1}^d \lambda_j (w_0^T w_j)^2 +\sum_{i=1}^n \norm{\projorth{H_0}X_i}^2\\     &​=\sum_{j=p+1}^d \lambda_j (w_0^T w_j)^2 +\sum_{i=1}^n \norm{\projorth{H_0}X_i}^2\\
 \end{align*} \end{align*}
-where we used that $w_0 \in G=\mathrm{Span}(w_1,​\ldots,​w_p)^\perp$. Applying ​now \eqref{eq:​dim1} ​and the induction assumption+where we used that $w_0 \in G=\mathrm{Span}(w_1,​\ldots,​w_p)^\perp$. Applying ​the induction assumption and then \eqref{eq:​dim1},​
 \begin{align*} \begin{align*}
     \sum_{i=1}^n \norm{\projorth{H}X_i}^2&​ \leq \lambda_{p+1} \sum_{j=p+1}^d (w_0^T w_j)^2 +\sum_{i=1}^n \norm{\projorth{\mathrm{Span}(w_1,​\ldots,​w_p)}X_i}^2\\     \sum_{i=1}^n \norm{\projorth{H}X_i}^2&​ \leq \lambda_{p+1} \sum_{j=p+1}^d (w_0^T w_j)^2 +\sum_{i=1}^n \norm{\projorth{\mathrm{Span}(w_1,​\ldots,​w_p)}X_i}^2\\
world/pca.txt · Last modified: 2022/11/13 18:38 by rdouc · Currently locked by: 18.191.84.33