This shows you the differences between two versions of the page.
world:kullback [2023/11/10 10:08] rdouc created |
world:kullback [2023/11/10 10:19] (current) rdouc [Proof] |
||
---|---|---|---|
Line 1: | Line 1: | ||
{{page>:defs}} | {{page>:defs}} | ||
- | The aim is to compute the Kullback divergence between two normal distributions: | + | ====== Kullback-Leibler divergence for normal distributions ====== |
+ | |||
+ | <WRAP center round tip 90%> | ||
+ | Let $\mu_0,\mu_1 \in \rset^p$ and $\Sigma_0,\Sigma_1 \in \rset^{p \times p}$, where $\Sigma_0,\Sigma_1$ are symmetric positive definite. Then, | ||
$$ | $$ | ||
- | \klbck{\N(\mu_0,\Sigma_0)}{\N(\mu_1,\Sigma_1)} | + | \klbck{\N(\mu_0,\Sigma_0)}{\N(\mu_1,\Sigma_1)}=-\frac{p}{2} + \frac{(\mu_0-\mu_1)^T \Sigma_1^{-1} (\mu_0-\mu_1) }{2} + \frac{Tr\lr{\Sigma_1^{-1} \Sigma_0}}{2} - \frac{1}{2} \log \frac{\mathrm{det} \Sigma_0}{\mathrm{det} \Sigma_1} |
$$ | $$ | ||
- | where $\mu_0,\mu_1 \in \rset^p$ and $ \Sigma_0,\Sigma_1 \in \rset^{p \times p}$. | + | |
+ | </WRAP> | ||
+ | |||
+ | ===== Proof ===== | ||
Assume that $X_0\sim \N(\mu_0,\Sigma_0)$. Then $X_0= \mu_0 + \Sigma_0^{1/2} U_0$ where $U_0\sim \N(0,I_p)$ | Assume that $X_0\sim \N(\mu_0,\Sigma_0)$. Then $X_0= \mu_0 + \Sigma_0^{1/2} U_0$ where $U_0\sim \N(0,I_p)$ | ||
Line 14: | Line 20: | ||
&= -\frac{p}{2} + \frac{(\mu_0-\mu_1)^T \Sigma_1^{-1} (\mu_0-\mu_1) }{2} + \frac 1 2 \PE\lrb{U_0^T \Sigma_0^{1/2} \Sigma_1^{-1} \Sigma_0^{1/2} U_0} - \frac{1}{2} \log \frac{\mathrm{det} \Sigma_0}{\mathrm{det} \Sigma_1}\\ | &= -\frac{p}{2} + \frac{(\mu_0-\mu_1)^T \Sigma_1^{-1} (\mu_0-\mu_1) }{2} + \frac 1 2 \PE\lrb{U_0^T \Sigma_0^{1/2} \Sigma_1^{-1} \Sigma_0^{1/2} U_0} - \frac{1}{2} \log \frac{\mathrm{det} \Sigma_0}{\mathrm{det} \Sigma_1}\\ | ||
&= -\frac{p}{2} + \frac{(\mu_0-\mu_1)^T \Sigma_1^{-1} (\mu_0-\mu_1) }{2} + \frac{Tr\lr{\Sigma_1^{-1} \Sigma_0}}{2} - \frac{1}{2} \log \frac{\mathrm{det} \Sigma_0}{\mathrm{det} \Sigma_1} | &= -\frac{p}{2} + \frac{(\mu_0-\mu_1)^T \Sigma_1^{-1} (\mu_0-\mu_1) }{2} + \frac{Tr\lr{\Sigma_1^{-1} \Sigma_0}}{2} - \frac{1}{2} \log \frac{\mathrm{det} \Sigma_0}{\mathrm{det} \Sigma_1} | ||
+ | \end{align*} | ||
+ | where the last line uses the cyclic property of the trace together with $\PE\lrb{U_0 U_0^T} = I_p$: | ||
+ | |||
+ | \begin{align*} | ||
+ | \PE\lrb{U_0^T \Sigma_0^{1/2} \Sigma_1^{-1} \Sigma_0^{1/2} U_0}&=\PE\lrb{Tr\lr{U_0^T \Sigma_0^{1/2} \Sigma_1^{-1} \Sigma_0^{1/2} U_0}}=Tr\lr{\PE\lrb{\Sigma_1^{-1} \Sigma_0^{1/2} U_0 U_0^T \Sigma_0^{1/2} }} \\ | ||
+ | &=Tr\lr{\Sigma_1^{-1} \Sigma_0^{1/2} \PE\lrb{U_0 U_0^T} \Sigma_0^{1/2} }=Tr\lr{\Sigma_1^{-1} \Sigma_0} | ||
\end{align*} | \end{align*} |
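
The closed form can be checked numerically. The sketch below is not part of the original page: it assumes NumPy, and the function names `kl_gaussians` and `kl_monte_carlo` are hypothetical helpers introduced only for illustration. The first evaluates the four terms of the formula; the second estimates the same quantity by averaging the log-density ratio over samples drawn from $\N(\mu_0,\Sigma_0)$.

```python
import numpy as np


def kl_gaussians(mu0, sigma0, mu1, sigma1):
    """Closed-form KL( N(mu0, sigma0) || N(mu1, sigma1) ).

    Assumes sigma0 and sigma1 are symmetric positive definite.
    """
    p = mu0.shape[0]
    diff = mu0 - mu1
    # Solve linear systems rather than forming sigma1^{-1} explicitly.
    maha = diff @ np.linalg.solve(sigma1, diff)
    trace = np.trace(np.linalg.solve(sigma1, sigma0))
    # slogdet returns (sign, log|det|); the sign is +1 for SPD matrices.
    _, logdet0 = np.linalg.slogdet(sigma0)
    _, logdet1 = np.linalg.slogdet(sigma1)
    return -p / 2 + maha / 2 + trace / 2 - (logdet0 - logdet1) / 2


def _gaussian_logpdf(x, mu, sigma):
    """Log-density of N(mu, sigma) evaluated at each row of x."""
    p = mu.shape[0]
    d = x - mu
    # Row-wise quadratic form d_i^T sigma^{-1} d_i.
    quad = np.einsum("ij,ij->i", d, np.linalg.solve(sigma, d.T).T)
    _, logdet = np.linalg.slogdet(sigma)
    return -0.5 * (p * np.log(2 * np.pi) + logdet + quad)


def kl_monte_carlo(mu0, sigma0, mu1, sigma1, n=200_000, seed=0):
    """Monte Carlo estimate of the same divergence, sampling from N(mu0, sigma0)."""
    rng = np.random.default_rng(seed)
    x = rng.multivariate_normal(mu0, sigma0, size=n)
    log_ratio = _gaussian_logpdf(x, mu0, sigma0) - _gaussian_logpdf(x, mu1, sigma1)
    return float(np.mean(log_ratio))


if __name__ == "__main__":
    # Arbitrary illustrative parameters (not taken from the page).
    mu0 = np.array([0.0, 1.0])
    mu1 = np.array([1.0, -1.0])
    sigma0 = np.array([[2.0, 0.3], [0.3, 1.0]])
    sigma1 = np.array([[1.0, -0.2], [-0.2, 1.5]])
    print("closed form :", kl_gaussians(mu0, sigma0, mu1, sigma1))
    print("monte carlo :", kl_monte_carlo(mu0, sigma0, mu1, sigma1))
```

The two printed values should agree up to Monte Carlo error. In dimension $p=1$ the formula reduces to the familiar $\log(\sigma_1/\sigma_0) + \frac{\sigma_0^2 + (\mu_0-\mu_1)^2}{2\sigma_1^2} - \frac12$, which gives a second independent check.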