{{page>:defs}} ====== Bayes Optimal Classifier ====== Let $(X,Y)$ be random vector on $ (\Omega,\mcf,\PP) $, taking values on $\rset^k \times [1:p]$. We are interested in solving the optimization problem $$ \inf_{\phi \in \sf{F}} \PP(Y\neq \phi(X)) $$ where $\sf{F}$ is the set of measurable functions from $\rset^k$ to $[1:p]$ where we equip $\rset^k$ with the $\sigma$-field $\mcbb(\rset^k)$ and $[1:p]$ with the $\sigma$-field $\mc{P}([1:p])$. **__Proposition__** $$ \inf_{\phi \in \sf{F}} \PP(Y\neq \phi(X))= \PE\lrb{\min_{i \in [1:p]} \PP(Y \neq i|X)}=\PP(Y \neq \phi^\star(X)) $$ where $\phi^\star(X)=\argmax_{i \in [1:p]} \PP(Y=i|X)$. ===== Proof ===== For any classifier $\phi$, decomposing into the disjoint events $\{\phi(X)=i\}$ and using the tower property, \begin{align*} \PP(Y\neq \phi(X)) &= \sum_{i=1}^p \PP(\phi(X)=i, Y \neq i)=\sum_{i=1}^p \PE\lrb{\indiacc{\phi(X)=i} \PP(Y \neq i|X)} \\ & =\PE\lrb{\sum_{i=1}^p \indiacc{\phi(X)=i} \PP(Y \neq i|X)} \geq \PE\lrb{\min_{i\in [1:p]} \PP(Y \neq i|X)} \end{align*} Moreover the inequality above becomes an equality for $\phi=\phi^\star$ which satisfies the property: $$ \phi^\star(X)= \argmin_{i \in [1:p]} \PP(Y \neq i|X)=\argmax_{i \in [1:p]} \PP(Y=i|X) $$ where the last equality follows from the identity $\PP(Y \neq i|X)=1-\PP(Y=i|X)$. This concludes the proof.