Welcome to Randal Douc's wiki

A collaborative site on maths but not only!

world:kkt

world:kkt [2019/01/28 12:40]
douc
world:kkt [2022/10/03 00:23]
rdouc
 +{{page>:​defs}}
 +{{tag>​karush_kuhn_tucker kkt farkas_lemma}}
 +
 +
 +
 +$$
 +\newcommand{\mcD}{\mathcal D}
 +\newcommand{\mcl}{\mathcal L}
 +$$
 +
 +====== Weak duality ======
 +Let $f$ and $(h_i)_{1 \leq i \leq n}$ be convex differentiable functions on $\Xset=\rset^p$. Let $\mcD=\cap_{i=1}^n \{h_i \leq 0\}\neq \emptyset$ and note that by convexity of the functions $h_i$, the set $\mcD$ is actually a convex set. We are interested in the infimum of the convex function $f$ on the convex set $\mcD$. Define the Lagrange function
 +$$
 +\mcl(x,​\lambda)=f(x)+\sum_{i=1}^n \lambda_i h_i(x)
 +$$
 +where $\lambda=(\lambda_1,​\ldots,​\lambda_n)^T \in \rset^n$. We first show the weak duality relation. Start with this simple remark: for all $(x,​\lambda) \in \Xset \times \rset^n$, ​
 +$$
 +\inf_{x \in \Xset} \mcl(x,​\lambda) \leq \mcl(x,​\lambda) ​
 +$$
 +Taking the supremum wrt $\lambda \geq 0$ on both sides yields: ​
 +$$
 +\sup_{\lambda\geq 0}\inf_{x \in \Xset} \mcl(x,​\lambda) \leq \sup_{\lambda\geq 0} \mcl(x,​\lambda)=\infty \indin{x\notin \mcD}+f(x)\indin{x\in \mcD}
 +$$
 +And taking now the infimum wrt $x \in \Xset$ on both sides, we finally get the <color red>​**weak duality relation**</​color>: ​
 +\begin{equation} \label{eq:​weak}
 +\sup_{\lambda\geq 0}\inf_{x \in \Xset} \mcl(x,​\lambda) \leq \inf_{x \in \Xset} \sup_{\lambda\geq 0}\mcl(x,​\lambda)=\inf_{x \in \mcD} f(x) 
 +\end{equation}
 +Note that on the lhs (left-hand side), the infimum is taken over all $x\in \Xset$, so the inner minimization is unconstrained, which is nice...
 +The rhs (right-hand side) is called the //primal problem// and the lhs the //dual problem//. Since $x\mapsto \mcl(x,\lambda)$ is convex and differentiable for every $\lambda \geq 0$, its infimum over $\Xset$, when it is attained, is attained exactly at the points where $\nabla_x \mcl(x,\lambda)=0$. In that case, the <color red>dual problem</color> $\sup_{\lambda\geq 0}\inf_{x \in \Xset} \mcl(x,\lambda)$ is equivalent to
 +$$
 +\sup \set{\mcl(x,​\lambda)}{\lambda \geq 0\mbox{ and }\nabla_x \mcl(x,​\lambda)=0}
 +$$
 +
 +To obtain the equality in \eqref{eq:​weak} (which is named the <color red>​**strong duality relation**</​color>​),​ we actually need some additional assumptions (for example the existence of a Slater point). But before that, let us check a very useful relation...
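The weak duality relation can be checked numerically on a small example. The sketch below is only an illustration under assumptions that are not in the text above: the toy problem (minimise $x^2$ subject to $1-x \leq 0$ on $\rset$), the function names and the grid of multipliers are all hypothetical choices.

```python
# Toy problem (an assumption for illustration, not from the text above):
#   minimise f(x) = x**2  subject to  h(x) = 1 - x <= 0,  x in R.

def lagrangian(x, lam):
    """L(x, lambda) = f(x) + lambda * h(x)."""
    return x ** 2 + lam * (1.0 - x)

def dual_function(lam):
    """inf over x of L(x, lambda): the gradient 2x - lambda vanishes at x = lambda / 2."""
    return lagrangian(lam / 2.0, lam)

# Primal value: f is increasing on the feasible set [1, +inf), so the infimum is f(1).
primal_value = 1.0

# Evaluate the dual function on a grid of multipliers lambda >= 0.
grid = [k / 100.0 for k in range(0, 501)]
dual_values = [dual_function(lam) for lam in grid]
best_dual = max(dual_values)
best_lam = grid[dual_values.index(best_dual)]

# Weak duality: every dual value is a lower bound on the primal value.
weak_duality_ok = all(g <= primal_value + 1e-12 for g in dual_values)
print(weak_duality_ok, best_lam, best_dual)
```

On this grid the largest dual value is reached at $\lambda=2$ and coincides with the primal value, which is consistent with the strong duality discussed next (the constraint admits a Slater point, for instance $x=2$).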
 +====== Karush-Kuhn-Tucker conditions ======
 +
 +===== Sufficient conditions =====
 +
 +<WRAP center round box 80%>
 +__**Lemma**__
 +Assume that there exist $(x^*,​\lambda^*) \in \mcD\times (\rset^+)^n$ such that 
 +\begin{equation}
 +\nabla_x \mcl(x^*,​\lambda^*)=0
 +\end{equation}
 +and for all $i \in [1:n]$, we have either $\lambda^*_i=0$ or $h_i(x^*)=0$.
 +Then, the strong duality holds and 
 +$$
 +f(x^*)=\inf_{x \in \mcD} f(x)=\mcl(x^*,​\lambda^*)
 +$$
 +</​WRAP>​
 +<hidden Click to see the proof>
 +$\bproof$
 +For every $x \in \mcD$, the convexity of the function $f$ yields ​
 +$$
 +f(x)-f(x^*) \geq \nabla f(x^*)^T (x-x^*).
 +$$
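 +It remains to prove $\nabla f(x^*)^T (x-x^*)\geq 0$. Using first $\nabla_x \mcl(x^*,\lambda^*)=0$, then the convexity of the functions $h_i$ together with $\lambda^*_i \geq 0$, and finally the complementary slackness $\lambda^*_i h_i(x^*)=0$, we get
 +$$
 +\nabla f(x^*)^T (x-x^*)=-\sum_{i=1}^n \lambda^*_i \nabla h_i(x^*)^T (x-x^*) \geq -\sum_{i=1}^n \lambda^*_i (h_i(x)-h_i(x^*))=- \sum_{i=1}^n \lambda^*_i  \underbrace{h_i(x)}_{\leq 0} \geq 0
 +$$
 +Hence $f(x^*)=\inf_{x \in \mcD} f(x)$, and complementary slackness gives $\mcl(x^*,\lambda^*)=f(x^*)$. Finally, since $x \mapsto \mcl(x,\lambda^*)$ is convex with a vanishing gradient at $x^*$, we have $\inf_{x \in \Xset} \mcl(x,\lambda^*)=\mcl(x^*,\lambda^*)=\inf_{x \in \mcD} f(x)$, so the dual value is at least the primal value and, by \eqref{eq:weak}, the strong duality holds. This finishes the proof. $\eproof$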
 +</​hidden>​
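To make the sufficient conditions concrete, here is a minimal sketch that tests them at a candidate pair. The two-dimensional problem, the candidate $(x^*,\lambda^*)$ and the helper name `kkt_holds` are assumptions chosen for illustration, not something taken from the text above.

```python
# Hypothetical example: minimise f(x) = (x1 - 2)^2 + (x2 - 1)^2
# subject to h1(x) = x1 + x2 - 2 <= 0 and h2(x) = -x1 <= 0.

def f(x):
    return (x[0] - 2.0) ** 2 + (x[1] - 1.0) ** 2

def grad_f(x):
    return [2.0 * (x[0] - 2.0), 2.0 * (x[1] - 1.0)]

constraints = [lambda x: x[0] + x[1] - 2.0, lambda x: -x[0]]
constraint_grads = [lambda x: [1.0, 1.0], lambda x: [-1.0, 0.0]]

def kkt_holds(x, lam, tol=1e-9):
    """Check feasibility, lam >= 0, stationarity and complementary slackness."""
    feasible = all(h(x) <= tol for h in constraints)
    nonnegative = all(l >= 0.0 for l in lam)
    grad = grad_f(x)  # gradient of the Lagrange function in x, built term by term
    for l, gh in zip(lam, constraint_grads):
        grad = [g + l * d for g, d in zip(grad, gh(x))]
    stationary = all(abs(g) <= tol for g in grad)
    slack = all(abs(l * h(x)) <= tol for l, h in zip(lam, constraints))
    return feasible and nonnegative and stationary and slack

x_star, lam_star = [1.5, 0.5], [1.0, 0.0]
print(kkt_holds(x_star, lam_star))

# The lemma then says x_star minimises f on the feasible set; spot-check on a grid.
grid = [(i / 10.0, j / 10.0) for i in range(0, 41) for j in range(-20, 41)]
feasible_points = [p for p in grid if all(h(p) <= 0.0 for h in constraints)]
print(all(f(p) >= f(x_star) - 1e-12 for p in feasible_points))
```

The grid check is only a sanity test: it cannot prove optimality, which is what the lemma provides once the conditions are verified.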
 +===== Saddle points =====
 +
 +<WRAP center round box 80%>
 +__**Definition**__
 +We say that $(x^*,​\lambda^*) \in \Xset \times (\rset^+)^n$ is a saddle point of the Lagrange function $\mcl$ if for every $(x,​\lambda) \in \Xset \times (\rset^+)^n$, ​
 +\begin{equation}\label{eq:​saddle}
 +\mcl(x^*,​\lambda) \leq \mcl(x,​\lambda^*) ​
 +\end{equation}
 +</​WRAP>​
 +<WRAP center round box 80%>
 +__**Saddle point Lemma**__
 +If $(x^*,\lambda^*) \in \Xset \times (\rset^+)^n$ is a saddle point for $\mcl$ then the strong duality holds, and the KKT conditions hold for $(x^*,\lambda^*)$.
 +</​WRAP>​
 +<hidden click here to see the proof>
 +$\bproof$ ​
 +Using \eqref{eq:saddle} with $x=x^*$ for the first inequality and with $\lambda=\lambda^*$ for the second one, we have
 +for every $(x,​\lambda) \in \Xset \times (\rset^+)^n$, ​
 +\begin{equation*}
 +\mcl(x^*,​\lambda) \leq \mcl(x^*,​\lambda^*) ​ \leq \mcl(x,​\lambda^*) ​
 +\end{equation*}
 +The existence of a saddle point implies the **strong duality** since 
 +$$
 +\inf_{x \in\Xset} \sup_{\lambda \geq 0} \mcl(x,​\lambda) \leq \sup_{\lambda \geq 0} \mcl(x^*,​\lambda) \leq \mcl(x^*,​\lambda^*) ​ \leq   ​\inf_{x \in\Xset} \mcl(x,​\lambda^*) \leq \sup_{\lambda \geq 0}  \inf_{x \in\Xset} \mcl(x,​\lambda)
 +$$
 +which is the converse inequality of \eqref{eq:​weak}. This finally implies: ​
 +\begin{equation}\label{eq:​all}
 +\sup_{\lambda \geq 0} \inf_{x \in\Xset} ​ \mcl(x,​\lambda)=\inf_{x \in\Xset} \mcl(x,​\lambda^*)=\mcl(x^*,​\lambda^*)=\inf_{x \in\Xset} \sup_{\lambda \geq 0}  \mcl(x,​\lambda)=\inf_{x \in\mcD} f(x).
 +\end{equation} ​
 +
 +Note that $\mcl(x^*,​\lambda^*)=\inf_{x \in\Xset} \mcl(x,​\lambda^*)$ shows that $\nabla_x \mcl(x^*,​\lambda^*)=0$. ​
 + 
 +Since \eqref{eq:saddle} bounds $\mcl(x^*,\lambda)$ from above uniformly in $\lambda\geq 0$, letting $\lambda_i \to \infty$ shows that $h_i(x^*)\leq 0$ for every $i \in [1:n]$, that is $x^* \in \mcD$. Now taking \eqref{eq:saddle} with $\lambda=0$ yields for $x\in \mcD$,
 +$$
 +f(x^*)=\mcl(x^*,​0) \leq \mcl(x,​\lambda^*) \leq f(x)  ​
 +$$    ​
 +which shows that $f(x^*)=\inf_{x \in\mcD} f(x)$. Combining with \eqref{eq:all} yields $\mcl(x^*,\lambda^*)=f(x^*)$ so that $\sum_{i=1}^n \lambda^*_i h_i(x^*)=0$ and since $\lambda^* \geq 0$ and $x^* \in \mcD$, every term of this sum is nonpositive, which implies $\lambda^*_i h_i(x^*)=0$ for all $i \in [1:n]$.
 +$\eproof$
 +</​hidden>​
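The two inequalities used in the proof, $\mcl(x^*,\lambda) \leq \mcl(x^*,\lambda^*) \leq \mcl(x,\lambda^*)$, can be spot-checked by random sampling. The sketch below assumes a hypothetical two-dimensional problem and a candidate saddle point chosen for illustration; the sampling ranges and the seed are arbitrary.

```python
import random

# Hypothetical example: f(x) = (x1 - 2)^2 + (x2 - 1)^2,
# h1(x) = x1 + x2 - 2 <= 0 and h2(x) = -x1 <= 0.

def f(x):
    return (x[0] - 2.0) ** 2 + (x[1] - 1.0) ** 2

def h(x):
    return [x[0] + x[1] - 2.0, -x[0]]

def lagrangian(x, lam):
    """L(x, lambda) = f(x) + sum_i lambda_i * h_i(x)."""
    return f(x) + sum(l * hi for l, hi in zip(lam, h(x)))

x_star, lam_star = (1.5, 0.5), (1.0, 0.0)   # candidate saddle point
center = lagrangian(x_star, lam_star)

rng = random.Random(0)  # fixed seed so that the run is reproducible
saddle_ok = True
for _ in range(10000):
    x = (rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0))    # x is unconstrained
    lam = (rng.uniform(0.0, 10.0), rng.uniform(0.0, 10.0))  # lambda >= 0
    left_ok = lagrangian(x_star, lam) <= center + 1e-9
    right_ok = center <= lagrangian(x, lam_star) + 1e-9
    saddle_ok = saddle_ok and left_ok and right_ok

# Complementary slackness shows up as L(x*, lambda*) = f(x*).
print(saddle_ok, abs(center - f(x_star)) < 1e-12)
```

Sampling can only falsify a candidate saddle point; here the inequalities can also be verified by hand, since $\mcl(x,\lambda^*)=(x_1-1.5)^2+(x_2-0.5)^2+0.5$.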
 +===== Necessary conditions and strong duality =====
 + 
 +We now assume the existence of Slater points: there exists $\tilde x \in \mcD$ such that for all $i \in \{1,​\ldots,​n\}$,​ $h_i(\tilde x)<​0$. ​
 +
 +<WRAP center round box 80%>
 +__**The convex Farkas lemma**__ ​
 +Assume that there exists a Slater point. Then, $\{f<0\} \cap \mcD =\emptyset$ iff there exists $\lambda^* \geq 0$ such that for all $x\in \Xset$, $f(x)+\sum_{i=1}^n \lambda^*_i h_i(x) \geq 0$.
 +</WRAP>
 +<hidden Click here to see the proof>
 +$\bproof$
 +The converse implication is immediate: if such a $\lambda^*$ exists, then for $x \in \mcD$ we have $\sum_{i=1}^n \lambda^*_i h_i(x) \leq 0$ and hence $f(x) \geq 0$. For the direct implication, set
 +$$
 +U=\{u=u_{0:n} \in \rset^{n+1};\ \exists\ x \in \Xset, f(x)<u_0 \ \mbox{and} \ h_i(x)\leq u_i  \mbox{ for all } i \in [1:n]\}.
 +$$
 +The condition $\{f<0\} \cap \mcD =\emptyset$ is equivalent to saying that $0 \notin U$ and since $U$ is a convex set (by convexity of $f$ and of the functions $h_i$), by a separation argument, there exists a __**nonzero**__ vector $\phi \in \rset^{n+1}$ such that
 +\begin{equation} \label{eq:ineg}
 +\forall u \in U\,, \quad \phi^T u\geq 0
 +\end{equation}
 +Take an arbitrary $i \in [0:n]$.
 +If $u\in U$ then $u+t e_i \in U$ for every $t>0$, where $e_i=(\indiacc{k=i})_{k \in [0:n]} \in \rset^{n+1}$. The previous inequality implies $\phi^T (u+t e_i)=\phi^T u + t\phi_i \geq 0$ for all $t>0$; dividing by $t$ and letting $t \to \infty$ gives $\phi_i\geq 0$. Now, for $x \in \Xset$ and $\epsilon>0$, the vector $(f(x)+\epsilon,h_1(x),\ldots,h_n(x))$ belongs to $U$; applying \eqref{eq:ineg} and letting $\epsilon \to 0$ yields for all $x\in \Xset$,
 +$$
 +\phi_0 f(x)+\sum_{i=1}^n \phi_i h_i(x) \geq 0
 +$$
 +To conclude, and since we already know that $\phi_i\geq 0$ for all $i \in [0:n]$, it only remains to prove that $\phi_0 \neq 0$ and to set in that case $\lambda^*_i=\frac{\phi_i}{\phi_0}$. The rest of the argument is by contradiction. If $\phi_0=0$, then the previous inequality at a Slater point $x=\tilde x$ yields $\sum_{i=1}^n \phi_i h_i(\tilde x) \geq 0$. Since $\phi_i \geq 0$ and $h_i(\tilde x)<0$ for all $i \in [1:n]$, every term of this sum is nonpositive, so that $\phi_i=0$ for all $i \in [1:n]$. Then all the components of $\phi$ are zero, which contradicts the fact that $\phi$ is nonzero. The proof is completed.
 +$\eproof$
 +</​hidden>​
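As a worked illustration (a hypothetical example, not taken from the text above), take $p=n=1$, $f(x)=x+1$ and $h_1(x)=x^2-1$, so that $\mcD=[-1,1]$, $\tilde x=0$ is a Slater point and $\{f<0\}\cap \mcD=\emptyset$. A multiplier as in the lemma is $\lambda^*=1/2$, since
$$
f(x)+\lambda^* h_1(x)=x+1+\tfrac{1}{2}(x^2-1)=\tfrac{1}{2}(x+1)^2 \geq 0,
$$
an inequality that holds for every real $x$, hence in particular on $\mcD$. The Slater assumption matters here: with $f(x)=x$ and $h_1(x)=x^2$ we get $\mcD=\{0\}$, there is no Slater point, and
$$
f(x)+\lambda h_1(x)=x(1+\lambda x)<0 \quad \mbox{for } x<0 \mbox{ close enough to } 0,
$$
whatever the value of $\lambda \geq 0$, so that no multiplier makes the inequality hold on all of $\rset$ although $\{f<0\}\cap \mcD=\emptyset$.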
 +
 +The Farkas lemma will imply the strong duality under the existence of a Slater point. ​
 +<WRAP center round box 80%>
 +__**The necessary condition**__ ​
 +Assume that $f(x^*)=\inf_{x\in \mcD} f(x)$ for some $x^* \in\mcD$ and that there exists a Slater point. Then there exists $\lambda^* \geq 0$ such that $(x^*,\lambda^*)$ is a saddle point and by the Saddle point lemma, the strong duality holds.
 +</​WRAP>​
 +<hidden Click here to see the proof>
 +$\bproof$
 +Since $f(x^*)=\inf_{x\in \mcD} f(x)$, we have $\{f-f(x^*)<0\} \cap \mcD =\emptyset$. The separation argument of the convex Farkas lemma, applied to $f-f(x^*)$ with all $x \in \Xset$ allowed in the definition of $U$, provides $\lambda^* \geq 0$ satisfying for all $x\in \Xset$, $f(x)-f(x^*)+\sum_{i=1}^n \lambda^*_i h_i(x) \geq 0$. Moreover, $h_i(x^*) \leq 0$ for all $i \in [1:n]$ since $x^* \in \mcD$. Therefore, for all $\lambda \geq 0$ and $x \in \Xset$,
 +$$
 +f(x^*)+\sum_{i=1}^n \lambda_i h_i(x^*) \leq f(x^*) \leq f(x)+\sum_{i=1}^n \lambda^*_i h_i(x)
 +$$
 +that is, $\mcl(x^*,\lambda) \leq \mcl(x,\lambda^*)$. Hence $(x^*,\lambda^*)$ is a saddle point and, by the Saddle point lemma, the strong duality holds. This concludes the proof.
 +$\eproof$
 +</​hidden>​
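The conclusion can be illustrated numerically on a problem with a Slater point. The sketch below is an assumption-laden toy case, not part of the text above: it takes $f(x)=x$ and $h_1(x)=x^2-1$ on $\rset$, so that $\mcD=[-1,1]$, $x^*=-1$ and $\tilde x=0$ is a Slater point, and it evaluates the dual function on an arbitrary grid.

```python
# Hypothetical example: minimise f(x) = x subject to h(x) = x**2 - 1 <= 0.

def lagrangian(x, lam):
    """L(x, lambda) = x + lambda * (x**2 - 1)."""
    return x + lam * (x * x - 1.0)

def dual_function(lam):
    """inf over x of L(x, lambda) for lambda > 0, reached where 1 + 2*lambda*x = 0."""
    return lagrangian(-1.0 / (2.0 * lam), lam)

# Primal value: f(x) = x is minimised on D = [-1, 1] at x_star = -1.
x_star = -1.0
primal_value = x_star

# Grid of multipliers lambda > 0 (for lambda = 0 the infimum over x is -infinity).
grid = [k / 1000.0 for k in range(1, 5001)]
dual_values = [dual_function(lam) for lam in grid]
best_dual = max(dual_values)
best_lam = grid[dual_values.index(best_dual)]
print(best_lam, best_dual, primal_value)

# Saddle point check at (x_star, best_lam) on a grid of x and lambda values.
xs = [k / 10.0 for k in range(-50, 51)]
saddle_ok = all(
    lagrangian(x_star, lam) <= lagrangian(x, best_lam) + 1e-12
    for x in xs
    for lam in grid[::100]
)
print(saddle_ok)
```

In this example the multiplier found on the grid is $\lambda^*=1/2$, the same value that the convex Farkas lemma gives when applied to $f-f(x^*)=x+1$, and the best dual value matches the primal value.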
 +
  
world/kkt.txt · Last modified: 2022/10/03 00:23 by rdouc