| Next revision | Previous revision | ||
|
world:kullback [2023/11/10 10:08] rdouc created |
world:kullback [2023/11/10 10:19] (current) rdouc [Proof] |
||
|---|---|---|---|
| Line 1: | Line 1: | ||
| {{page>:defs}} | {{page>:defs}} | ||
| - | The aim is to compute the Kullback divergence between two normal distributions: | + | ====== Kullback-Leibler divergence for normal distributions ====== |
| + | |||
| + | <WRAP center round tip 90%> | ||
| + | Let $\mu_0,\mu_1 \in \rset^p$ and $\Sigma_0,\Sigma_1 \in \rset^{p \times p}$, where $\Sigma_0,\Sigma_1$ are symmetric positive definite. Then, | ||
| $$ | $$ | ||
| - | \klbck{\N(\mu_0,\Sigma_0)}{\N(\mu_1,\Sigma_1)} | + | \klbck{\N(\mu_0,\Sigma_0)}{\N(\mu_1,\Sigma_1)}=-\frac{p}{2} + \frac{(\mu_0-\mu_1)^T \Sigma_1^{-1} (\mu_0-\mu_1) }{2} + \frac{Tr\lr{\Sigma_1^{-1} \Sigma_0}}{2} - \frac{1}{2} \log \frac{\mathrm{det} \Sigma_0}{\mathrm{det} \Sigma_1} |
| $$ | $$ | ||
| - | where $\mu_0,\mu_1 \in \rset^p$ and $ \Sigma_0,\Sigma_1 \in \rset^{p \times p}$. | + | |
| + | </WRAP> | ||
| + | |||
| + | ===== Proof ===== | ||
| Assume that $X_0\sim \N(\mu_0,\Sigma_0)$; then $X_0= \mu_0 + \Sigma_0^{1/2} U_0$, where $U_0\sim \N(0,I_p)$ | Assume that $X_0\sim \N(\mu_0,\Sigma_0)$; then $X_0= \mu_0 + \Sigma_0^{1/2} U_0$, where $U_0\sim \N(0,I_p)$ | ||
| Line 14: | Line 20: | ||
| &= -\frac{p}{2} + \frac{(\mu_0-\mu_1)^T \Sigma_1^{-1} (\mu_0-\mu_1) }{2} + \frac 1 2 \PE\lrb{U_0^T \Sigma_0^{1/2} \Sigma_1^{-1} \Sigma_0^{1/2} U_0} - \frac{1}{2} \log \frac{\mathrm{det} \Sigma_0}{\mathrm{det} \Sigma_1}\\ | &= -\frac{p}{2} + \frac{(\mu_0-\mu_1)^T \Sigma_1^{-1} (\mu_0-\mu_1) }{2} + \frac 1 2 \PE\lrb{U_0^T \Sigma_0^{1/2} \Sigma_1^{-1} \Sigma_0^{1/2} U_0} - \frac{1}{2} \log \frac{\mathrm{det} \Sigma_0}{\mathrm{det} \Sigma_1}\\ | ||
| &= -\frac{p}{2} + \frac{(\mu_0-\mu_1)^T \Sigma_1^{-1} (\mu_0-\mu_1) }{2} + \frac{Tr\lr{\Sigma_1^{-1} \Sigma_0}}{2} - \frac{1}{2} \log \frac{\mathrm{det} \Sigma_0}{\mathrm{det} \Sigma_1} | &= -\frac{p}{2} + \frac{(\mu_0-\mu_1)^T \Sigma_1^{-1} (\mu_0-\mu_1) }{2} + \frac{Tr\lr{\Sigma_1^{-1} \Sigma_0}}{2} - \frac{1}{2} \log \frac{\mathrm{det} \Sigma_0}{\mathrm{det} \Sigma_1} | ||
| + | \end{align*} | ||
| + | where, in the last line, we have used $\PE\lrb{U_0 U_0^T}=I_p$ and the cyclic property of the trace: | ||
| + | |||
| + | \begin{align*} | ||
| + | \PE\lrb{U_0^T \Sigma_0^{1/2} \Sigma_1^{-1} \Sigma_0^{1/2} U_0}&=\PE\lrb{Tr\lr{U_0^T \Sigma_0^{1/2} \Sigma_1^{-1} \Sigma_0^{1/2} U_0}}=Tr\lr{\PE\lrb{\Sigma_1^{-1} \Sigma_0^{1/2} U_0 U_0^T \Sigma_0^{1/2} }} \\ | ||
| + | &=Tr\lr{\Sigma_1^{-1} \Sigma_0^{1/2} \PE\lrb{U_0 U_0^T} \Sigma_0^{1/2} }=Tr\lr{\Sigma_1^{-1} \Sigma_0} | ||
| \end{align*} | \end{align*} | ||
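
The closed form and the trace identity used in the proof can be checked numerically. The following is a minimal sketch in Python with NumPy, not part of the original page: the function names `kl_gaussians` and `trace_identity_mc`, the sample size, and the seed are assumptions chosen for illustration. The first function evaluates the formula of the statement; the second estimates $\PE\lrb{U_0^T \Sigma_0^{1/2} \Sigma_1^{-1} \Sigma_0^{1/2} U_0}$ by Monte Carlo so that it can be compared with $Tr\lr{\Sigma_1^{-1} \Sigma_0}$.

```python
import numpy as np


def kl_gaussians(mu0, sigma0, mu1, sigma1):
    """KL( N(mu0, sigma0) || N(mu1, sigma1) ) using the closed form of the statement.

    Hypothetical helper: sigma0 and sigma1 are assumed symmetric positive definite.
    """
    mu0, mu1 = np.asarray(mu0, dtype=float), np.asarray(mu1, dtype=float)
    sigma0, sigma1 = np.asarray(sigma0, dtype=float), np.asarray(sigma1, dtype=float)
    p = mu0.shape[0]
    diff = mu0 - mu1
    # Solve linear systems instead of forming Sigma_1^{-1} explicitly.
    quad = diff @ np.linalg.solve(sigma1, diff)
    trace = np.trace(np.linalg.solve(sigma1, sigma0))
    # slogdet returns (sign, log|det|); the sign is +1 for positive definite input.
    logdet0 = np.linalg.slogdet(sigma0)[1]
    logdet1 = np.linalg.slogdet(sigma1)[1]
    return 0.5 * (-p + quad + trace - (logdet0 - logdet1))


def trace_identity_mc(sigma0, sigma1, n=200_000, seed=0):
    """Monte Carlo estimate of E[U^T S0^{1/2} S1^{-1} S0^{1/2} U] with U ~ N(0, I_p)."""
    sigma0 = np.asarray(sigma0, dtype=float)
    sigma1 = np.asarray(sigma1, dtype=float)
    rng = np.random.default_rng(seed)
    # Symmetric square root of sigma0 via its eigendecomposition.
    w, v = np.linalg.eigh(sigma0)
    root0 = (v * np.sqrt(w)) @ v.T
    m = root0 @ np.linalg.solve(sigma1, root0)
    u = rng.standard_normal((n, sigma0.shape[0]))
    # Row-wise quadratic forms u_k^T m u_k, then their average.
    return float(np.mean(np.einsum("ni,ij,nj->n", u, m, u)))


if __name__ == "__main__":
    s0 = np.array([[2.0, 0.5], [0.5, 1.0]])
    s1 = np.array([[1.0, 0.2], [0.2, 3.0]])
    print("KL:", kl_gaussians([0.0, 1.0], s0, [1.0, -1.0], s1))
    print("MC estimate:", trace_identity_mc(s0, s1))
    print("Tr(S1^{-1} S0):", np.trace(np.linalg.solve(s1, s0)))
```

When both distributions coincide, the quadratic term and the log-determinant term vanish and the trace equals $p$, so the function returns zero up to rounding, as expected of a divergence.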