This shows you the differences between two versions of the page.
Both sides previous revision Previous revision Next revision | Previous revision | ||
world:kkt [2021/01/21 09:22] rdouc |
world:kkt [2022/10/03 00:23] rdouc |
||
---|---|---|---|
Line 10: | Line 10: | ||
====== Weak duality ====== | ====== Weak duality ====== | ||
- | Let $f$ and $(g_i)_{1 \leq i \leq n}$ be convex differentiable functions on $\Xset=\rset^p$. Let $\mcD=\cap_{i=1}^n \{g_i \leq 0\}\neq \emptyset$ and note that by convexity of the functions $g_i$, the set $\mcD$ is actually a convex set. We are interested in the infimum of the convex function $f$ on the convex set $\mcD$. Define the Lagrange function | + | Let $f$ and $(h_i)_{1 \leq i \leq n}$ be convex differentiable functions on $\Xset=\rset^p$. Let $\mcD=\cap_{i=1}^n \{h_i \leq 0\}\neq \emptyset$ and note that by convexity of the functions $h_i$, the set $\mcD$ is actually a convex set. We are interested in the infimum of the convex function $f$ on the convex set $\mcD$. Define the Lagrange function |
$$ | $$ | ||
- | \mcl(x,\lambda)=f(x)+\sum_{i=1}^n \lambda_i g_i(x) | + | \mcl(x,\lambda)=f(x)+\sum_{i=1}^n \lambda_i h_i(x) |
$$ | $$ | ||
where $\lambda=(\lambda_1,\ldots,\lambda_n)^T \in \rset^n$. We first show the weak duality relation. Start with this simple remark: for all $(x,\lambda) \in \Xset \times \rset^n$, | where $\lambda=(\lambda_1,\ldots,\lambda_n)^T \in \rset^n$. We first show the weak duality relation. Start with this simple remark: for all $(x,\lambda) \in \Xset \times \rset^n$, | ||
Line 27: | Line 27: | ||
\end{equation} | \end{equation} | ||
Note that in the lhs (left-hand side), the infimum is wrt $x\in \Xset$ and therefore, there is no constraint, which is nice... | Note that in the lhs (left-hand side), the infimum is wrt $x\in \Xset$ and therefore, there is no constraint, which is nice... | ||
- | The rhs (right-hand side) is called the //primal problem// and the lhs the //dual problem//. Note, since $x\mapsto \mcl(x,\lambda)$ is convex, that the dual problem $\sup_{\lambda\geq 0}\inf_{x \in \Xset} \mcl(x,\lambda)$ is equivalent to | + | The rhs (right-hand side) is called the //primal problem// and the lhs the //dual problem//. Note, since $x\mapsto \mcl(x,\lambda)$ is convex, that the <color red>dual problem</color> $\sup_{\lambda\geq 0}\inf_{x \in \Xset} \mcl(x,\lambda)$ is equivalent to |
$$ | $$ | ||
- | \sup_{\lambda \geq 0\mbox{ and }\nabla_x \mcl(x,\lambda)=0} \mcl(x,\lambda) | + | \sup \set{\mcl(x,\lambda)}{\lambda \geq 0\mbox{ and }\nabla_x \mcl(x,\lambda)=0} |
$$ | $$ | ||
- | subject to the constraint $\lambda \geq 0$ and $\nabla_x \mcl(x,\lambda)=0$. | ||
- | To obtain the equality (which is named the <color red>**strong duality relation**</color>), we actually need some additional assumptions (for example the existence of a Slater point). But before that, let us check a very useful relation... | + | To obtain the equality in \eqref{eq:weak} (which is named the <color red>**strong duality relation**</color>), we actually need some additional assumptions (for example the existence of a Slater point). But before that, let us check a very useful relation... |
====== Karush-Kuhn-Tucker conditions ====== | ====== Karush-Kuhn-Tucker conditions ====== | ||
Line 44: | Line 43: | ||
\nabla_x \mcl(x^*,\lambda^*)=0 | \nabla_x \mcl(x^*,\lambda^*)=0 | ||
\end{equation} | \end{equation} | ||
- | and for all $i \in [1:n]$, we have either $\lambda^*_i=0$ or $g_i(x^*)=0$ . | + | and for all $i \in [1:n]$, we have either $\lambda^*_i=0$ or $h_i(x^*)=0$ . |
- | Then, | + | Then, the strong duality holds and |
$$ | $$ | ||
- | f(x^*)=\inf_{x \in \mcD} f(x) | + | f(x^*)=\inf_{x \in \mcD} f(x)=\mcl(x^*,\lambda^*) |
$$ | $$ | ||
</WRAP> | </WRAP> | ||
Line 56: | Line 55: | ||
f(x)-f(x^*) \geq \nabla f(x^*)^T (x-x^*). | f(x)-f(x^*) \geq \nabla f(x^*)^T (x-x^*). | ||
$$ | $$ | ||
- | It remains to prove $\nabla f(x^*)^T (x-x^*)\geq 0$. Using first $\nabla_x \mcl(x^*,\lambda^*)=0$ and then the convexity of the functions $g_i$, we get | + | It remains to prove $\nabla f(x^*)^T (x-x^*)\geq 0$. Using first $\nabla_x \mcl(x^*,\lambda^*)=0$ and then the convexity of the functions $h_i$, we get |
$$ | $$ | ||
- | \nabla f(x^*)^T (x-x^*)=-\sum_{i=1}^n \lambda^*_i \nabla_x g_i^T(x^*) (x-x^*) \geq -\sum_{i=1}^n \lambda^*_i (g_i(x)-g_i(x^*))=- \sum_{i=1}^n \lambda^*_i \underbrace{g_i(x)}_{\leq 0} \geq 0 | + | \nabla f(x^*)^T (x-x^*)=-\sum_{i=1}^n \lambda^*_i \nabla_x h_i^T(x^*) (x-x^*) \geq -\sum_{i=1}^n \lambda^*_i (h_i(x)-h_i(x^*))=- \sum_{i=1}^n \lambda^*_i \underbrace{h_i(x)}_{\leq 0} \geq 0 |
$$ | $$ | ||
This finishes the proof. $\eproof$ | This finishes the proof. $\eproof$ | ||
Line 73: | Line 72: | ||
<WRAP center round box 80%> | <WRAP center round box 80%> | ||
__**Saddle point Lemma**__ | __**Saddle point Lemma**__ | ||
- | If $(x^*,\lambda^*) \in \Xset \times (\rset^+)^n$ is a saddle point for $\mcl$ then the strong duality holds, $x^*=\arginf_{x \in \mcD} f(x)$, and $\lambda^*_i g_i(x^*)=0$ for all $i \in [1:n]$. | + | If $(x^*,\lambda^*) \in \Xset \times (\rset^+)^n$ is a saddle point for $\mcl$ then the strong duality holds, and the KKT conditions holds for $(x^*,\lambda^*)$. |
</WRAP> | </WRAP> | ||
<hidden click here to see the proof> | <hidden click here to see the proof> | ||
Line 88: | Line 87: | ||
which is the converse inequality of \eqref{eq:weak}. This finally implies: | which is the converse inequality of \eqref{eq:weak}. This finally implies: | ||
\begin{equation}\label{eq:all} | \begin{equation}\label{eq:all} | ||
- | \sup_{\lambda \geq 0} \inf_{x \in\Xset} \mcl(x,\lambda)=\mcl(x^*,\lambda^*)=\inf_{x \in\Xset} \sup_{\lambda \geq 0} \mcl(x,\lambda)=\inf_{x \in\mcD} f(x). | + | \sup_{\lambda \geq 0} \inf_{x \in\Xset} \mcl(x,\lambda)=\inf_{x \in\Xset} \mcl(x,\lambda^*)=\mcl(x^*,\lambda^*)=\inf_{x \in\Xset} \sup_{\lambda \geq 0} \mcl(x,\lambda)=\inf_{x \in\mcD} f(x). |
\end{equation} | \end{equation} | ||
- | The upper bound in \eqref{eq:saddle} regardless the value of $\lambda\geq 0$ shows that $g_i(x^*)\leq 0$ for every $i \in [1:n]$, that is $x^* \in \mcD$. Now taking \eqref{eq:saddle} with $\lambda=0$ yields for $x\in \mcD$, | + | Note that $\mcl(x^*,\lambda^*)=\inf_{x \in\Xset} \mcl(x,\lambda^*)$ shows that $\nabla_x \mcl(x^*,\lambda^*)=0$. |
+ | |||
+ | The upper bound in \eqref{eq:saddle} regardless the value of $\lambda\geq 0$ shows that $h_i(x^*)\leq 0$ for every $i \in [1:n]$, that is $x^* \in \mcD$. Now taking \eqref{eq:saddle} with $\lambda=0$ yields for $x\in \mcD$, | ||
$$ | $$ | ||
f(x^*)=\mcl(x^*,0) \leq \mcl(x,\lambda^*) \leq f(x) | f(x^*)=\mcl(x^*,0) \leq \mcl(x,\lambda^*) \leq f(x) | ||
$$ | $$ | ||
- | which shows that $f(x^*)=\inf_{x \in\mcD} f(x)$. Combining with \eqref{eq:all} yields $\mcl(x^*,\lambda^*)=f(x^*)$ so that $\sum_{i=1}^n \lambda^*_i g_i(x^*_i)=0$ and since $\lambda^* \geq 0$ and $x^* \in \mcD$, this implies $\lambda^*_i g_i(x^*)=0$ for all $i \in [1:n]$. | + | which shows that $f(x^*)=\inf_{x \in\mcD} f(x)$. Combining with \eqref{eq:all} yields $\mcl(x^*,\lambda^*)=f(x^*)$ so that $\sum_{i=1}^n \lambda^*_i h_i(x^*_i)=0$ and since $\lambda^* \geq 0$ and $x^* \in \mcD$, this implies $\lambda^*_i h_i(x^*)=0$ for all $i \in [1:n]$. |
$\eproof$ | $\eproof$ | ||
</hidden> | </hidden> | ||
===== Necessary conditions and strong duality ===== | ===== Necessary conditions and strong duality ===== | ||
- | We now assume the existence of Slater points: there exists $\tilde x \in \mcD$ such that for all $i \in \{1,\ldots,n\}$, $g_i(\tilde x)<0$. | + | We now assume the existence of Slater points: there exists $\tilde x \in \mcD$ such that for all $i \in \{1,\ldots,n\}$, $h_i(\tilde x)<0$. |
<WRAP center round box 80%> | <WRAP center round box 80%> | ||
__**The convex Farkas lemma**__ | __**The convex Farkas lemma**__ | ||
- | Assume that there exists a Slater point. Then, $\{f<0\} \cap \mcD =\emptyset$ iff there exists $\lambda^* \geq 0$ such that for all $x\in \mcD$, $f(x)+\sum_{i=1}^n \lambda^*_i g_i(x) \geq 0$. | + | Assume that there exists a Slater point. Then, $\{f<0\} \cap \mcD =\emptyset$ iff there exists $\lambda^* \geq 0$ such that for all $x\in \mcD$, $f(x)+\sum_{i=1}^n \lambda^*_i h_i(x) \geq 0$. |
</WRAP> | </WRAP> | ||
<hidden Click here to see the proof> | <hidden Click here to see the proof> | ||
Line 110: | Line 111: | ||
Set | Set | ||
$$ | $$ | ||
- | U=\{u=u_{0:n} \in \rset^{n+1}; \exists\ x \in \mcD, f(x)<u_0 \ \mbox{and} \ g_i(x)\leq u_i \mbox{ for all } i \in [1:n]\}. | + | U=\{u=u_{0:n} \in \rset^{n+1}; \exists\ x \in \mcD, f(x)<u_0 \ \mbox{and} \ h_i(x)\leq u_i \mbox{ for all } i \in [1:n]\}. |
$$ | $$ | ||
The condition $\{f<0\} \cap \mcD =\emptyset$ is equivalent to saying that $0 \notin U$ and since $U$ is clearly a convex set, by a separation argument, there exists a __**non-null**__ vector $\phi \in \rset^{n+1}$ such that | The condition $\{f<0\} \cap \mcD =\emptyset$ is equivalent to saying that $0 \notin U$ and since $U$ is clearly a convex set, by a separation argument, there exists a __**non-null**__ vector $\phi \in \rset^{n+1}$ such that | ||
Line 119: | Line 120: | ||
If $u\in U$ then $u+t e_i \in U$ where $t>0$ and $e_i=(\indiacc{k=i})_{k \in [0:n]} \in \rset^{n+1}$. The previous inequality implies $\phi^T (u+t e_i)=\phi^T u + t\phi_i \geq 0$ for all $t>0$, therefore $\phi_i\geq 0$. Now, since $\phi^T u\geq 0$ for all $u \in U$, a simple limiting argument yields for all $x\in \mcD$, | If $u\in U$ then $u+t e_i \in U$ where $t>0$ and $e_i=(\indiacc{k=i})_{k \in [0:n]} \in \rset^{n+1}$. The previous inequality implies $\phi^T (u+t e_i)=\phi^T u + t\phi_i \geq 0$ for all $t>0$, therefore $\phi_i\geq 0$. Now, since $\phi^T u\geq 0$ for all $u \in U$, a simple limiting argument yields for all $x\in \mcD$, | ||
$$ | $$ | ||
- | \phi_0 f(x)+\sum_{i=1}^n \phi_i g_i(x) \geq 0 | + | \phi_0 f(x)+\sum_{i=1}^n \phi_i h_i(x) \geq 0 |
$$ | $$ | ||
- | To conclude, and since we already know that $\phi_i\geq 0$ for all $i \in [0:n]$, it only remains to prove that $\phi_0 \neq 0$ and to set in that case $\lambda^*_i=\frac{\phi_i}{\phi_0}$. The rest of the argument is by contradiction. If $\phi_0=0$, then the previous inequality on a Slater point $x=\tilde x$ yields $\sum_{i=1}^n \phi_i g_i(\tilde x) \geq 0$ but since $g_i(\tilde x)<0$ for all $i \in [1:n]$, we finally get $\phi_i=0$ for all $i \in [1:n]$. Finally all the components of $\phi$ are null and we are face to a contradiction. The proof is completed. | + | To conclude, and since we already know that $\phi_i\geq 0$ for all $i \in [0:n]$, it only remains to prove that $\phi_0 \neq 0$ and to set in that case $\lambda^*_i=\frac{\phi_i}{\phi_0}$. The rest of the argument is by contradiction. If $\phi_0=0$, then the previous inequality on a Slater point $x=\tilde x$ yields $\sum_{i=1}^n \phi_i h_i(\tilde x) \geq 0$ but since $h_i(\tilde x)<0$ for all $i \in [1:n]$, we finally get $\phi_i=0$ for all $i \in [1:n]$. Finally all the components of $\phi$ are null and we are face to a contradiction. The proof is completed. |
$\eproof$ | $\eproof$ | ||
</hidden> | </hidden> | ||
Line 132: | Line 133: | ||
<hidden Click here to see the proof> | <hidden Click here to see the proof> | ||
$\bproof$ | $\bproof$ | ||
- | Assume now that $f(x^*)=\inf_{x\in \mcD} f(x)$ for some $x^* \in\mcD$. Then, $\{f-f(x^*)<0\} \cap \mcD =\emptyset$ so that there exists $\lambda^* \geq 0$ satisfying for all $x\in \mcD$, $f(x)-f(x^*)+\sum_{i=1}^n \lambda^*_i g_i(x) \geq 0$. And this implies that $(x^*,\lambda^*)$ is a saddle point since: for all $\lambda \geq 0$ and $x \in \mcD$, | + | Assume now that $f(x^*)=\inf_{x\in \mcD} f(x)$ for some $x^* \in\mcD$. Then, $\{f-f(x^*)<0\} \cap \mcD =\emptyset$ so that there exists $\lambda^* \geq 0$ satisfying for all $x\in \mcD$, $f(x)-f(x^*)+\sum_{i=1}^n \lambda^*_i h_i(x) \geq 0$. And this implies that $(x^*,\lambda^*)$ is a saddle point since: for all $\lambda \geq 0$ and $x \in \mcD$, |
$$ | $$ | ||
- | f(x^*)+\sum_{i=1}^n \lambda_i g_i(x^*) \leq f(x^*) \leq f(x)+\sum_{i=1}^n \lambda^*_i g_i(x) | + | f(x^*)+\sum_{i=1}^n \lambda_i h_i(x^*) \leq f(x^*) \leq f(x)+\sum_{i=1}^n \lambda^*_i h_i(x) |
$$ | $$ | ||
so that $(x^*,\lambda^*)$ is a saddle point and this implies that the strong duality holds (as we have seen in the previous section). This concludes the proof. | so that $(x^*,\lambda^*)$ is a saddle point and this implies that the strong duality holds (as we have seen in the previous section). This concludes the proof. |