Differences

This shows you the differences between two versions of the page.

--- world:kkt [2021/01/21 09:22]
rdouc
+++ world:kkt [2022/10/03 00:23]
rdouc
@@ Line 10: / Line 10: @@
 ====== Weak duality ======
-Let $f$ and $(g_i)_{1 \leq i \leq n}$ be convex differentiable functions on $\Xset=\rset^p$. Let $\mcD=\cap_{i=1}^n \{g_i \leq 0\}\neq \emptyset$ and note that by convexity of the functions $g_i$, the set $\mcD$ is actually a convex set. We are interested in the infimum of the convex function $f$ on the convex set $\mcD$. Define the Lagrange function
+Let $f$ and $(h_i)_{1 \leq i \leq n}$ be convex differentiable functions on $\Xset=\rset^p$. Let $\mcD=\cap_{i=1}^n \{h_i \leq 0\}\neq \emptyset$ and note that by convexity of the functions $h_i$, the set $\mcD$ is actually a convex set. We are interested in the infimum of the convex function $f$ on the convex set $\mcD$. Define the Lagrange function
 $$
-\mcl(x,\lambda)=f(x)+\sum_{i=1}^n \lambda_i g_i(x)
+\mcl(x,\lambda)=f(x)+\sum_{i=1}^n \lambda_i h_i(x)
 $$
 where $\lambda=(\lambda_1,\ldots,\lambda_n)^T \in \rset^n$. We first show the weak duality relation. Start with this simple remark: for all $(x,\lambda) \in \Xset \times \rset^n$,
@@ Line 27: / Line 27: @@
 \end{equation}
 Note that in the lhs (left-hand side), the infimum is wrt $x\in \Xset$ and therefore, there is no constraint, which is nice...
-The rhs (right-hand side) is called the //primal problem// and the lhs the //dual problem//. Note, since $x\mapsto \mcl(x,\lambda)$ is convex, that the dual problem $\sup_{\lambda\geq 0}\inf_{x \in \Xset} \mcl(x,\lambda)$ is equivalent to
+The rhs (right-hand side) is called the //primal problem// and the lhs the //dual problem//. Note, since $x\mapsto \mcl(x,\lambda)$ is convex, that the <color red>dual problem</color> $\sup_{\lambda\geq 0}\inf_{x \in \Xset} \mcl(x,\lambda)$ is equivalent to
 $$
-\sup_{\lambda \geq 0\mbox{ and }\nabla_x \mcl(x,\lambda)=0} \mcl(x,\lambda)
+\sup \set{\mcl(x,\lambda)}{\lambda \geq 0\mbox{ and }\nabla_x \mcl(x,\lambda)=0}
 $$
-subject to the constraint $\lambda \geq 0$ and $\nabla_x \mcl(x,\lambda)=0$.
-To obtain the equality (which is named the <color red>**strong duality relation**</color>), we actually need some additional assumptions (for example the existence of a Slater point). But before that, let us check a very useful relation...
+To obtain the equality in \eqref{eq:weak} (which is named the <color red>**strong duality relation**</color>), we actually need some additional assumptions (for example the existence of a Slater point). But before that, let us check a very useful relation...
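As a hedged illustration (a toy instance chosen here, not taken from the original page), take $p=n=1$, $f(x)=x^2$ and $h_1(x)=1-x$, so that $\mcD=[1,\infty)$. Then
$$
\mcl(x,\lambda)=x^2+\lambda(1-x), \qquad \nabla_x \mcl(x,\lambda)=2x-\lambda,
$$
so that for $\lambda \geq 0$, the infimum in $x$ is attained at $x=\lambda/2$ and
$$
\inf_{x \in \Xset} \mcl(x,\lambda)=\lambda-\frac{\lambda^2}{4}, \qquad \sup_{\lambda \geq 0} \left(\lambda-\frac{\lambda^2}{4}\right)=1 \quad \mbox{attained at } \lambda=2.
$$
Since $\inf_{x \in \mcD} f(x)=f(1)=1$, the dual and primal values coincide in this particular example, where $\tilde x=2$ is a Slater point.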
 ====== Karush-Kuhn-Tucker conditions ======
@@ Line 44: / Line 43: @@
 \nabla_x \mcl(x^*,\lambda^*)=0
 \end{equation}
-and for all $i \in [1:n]$, we have either $\lambda^*_i=0$ or $g_i(x^*)=0$ .
+and for all $i \in [1:n]$, we have either $\lambda^*_i=0$ or $h_i(x^*)=0$ .
-Then,
+Then, the strong duality holds and
 $$
-f(x^*)=\inf_{x \in \mcD} f(x)
+f(x^*)=\inf_{x \in \mcD} f(x)=\mcl(x^*,\lambda^*)
 $$
 </WRAP>
@@ Line 56: / Line 55: @@
 f(x)-f(x^*) \geq \nabla f(x^*)^T (x-x^*).
 $$
-It remains to prove $\nabla f(x^*)^T (x-x^*)\geq 0$. Using first $\nabla_x \mcl(x^*,\lambda^*)=0$ and then the convexity of the functions $g_i$, we get
+It remains to prove $\nabla f(x^*)^T (x-x^*)\geq 0$. Using first $\nabla_x \mcl(x^*,\lambda^*)=0$ and then the convexity of the functions $h_i$, we get
 $$
-\nabla f(x^*)^T (x-x^*)=-\sum_{i=1}^n \lambda^*_i \nabla_x g_i^T(x^*) (x-x^*) \geq -\sum_{i=1}^n \lambda^*_i (g_i(x)-g_i(x^*))=- \sum_{i=1}^n \lambda^*_i  \underbrace{g_i(x)}_{\leq 0} \geq 0
+\nabla f(x^*)^T (x-x^*)=-\sum_{i=1}^n \lambda^*_i \nabla_x h_i^T(x^*) (x-x^*) \geq -\sum_{i=1}^n \lambda^*_i (h_i(x)-h_i(x^*))=- \sum_{i=1}^n \lambda^*_i  \underbrace{h_i(x)}_{\leq 0} \geq 0
 $$
 This finishes the proof. $\eproof$
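The sufficient conditions above can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the toy problem, the candidate pair `x_star`, `lam_star` and all function names are hypothetical choices made here, not part of the original page.

```python
import random

# Toy convex problem (an assumption for illustration, not from the original page):
#   minimize f(x) = (x1 - 2)^2 + (x2 - 1)^2
#   subject to h_1(x) = x1 + x2 - 2 <= 0 and h_2(x) = -x1 <= 0.

def f(x):
    return (x[0] - 2.0) ** 2 + (x[1] - 1.0) ** 2

def grad_f(x):
    return [2.0 * (x[0] - 2.0), 2.0 * (x[1] - 1.0)]

def h(x):
    return [x[0] + x[1] - 2.0, -x[0]]

# The h_i are affine here, so their gradients are constant.
GRAD_H = [[1.0, 1.0], [-1.0, 0.0]]

def lagrangian(x, lam):
    return f(x) + sum(l * hi for l, hi in zip(lam, h(x)))

def grad_x_lagrangian(x, lam):
    g = grad_f(x)
    return [g[k] + sum(lam[i] * GRAD_H[i][k] for i in range(len(lam)))
            for k in range(len(x))]

def kkt_holds(x, lam, tol=1e-9):
    """Check feasibility, lam >= 0, stationarity and complementary slackness."""
    feasible = all(hi <= tol for hi in h(x))
    dual_feasible = all(l >= -tol for l in lam)
    stationary = all(abs(c) <= tol for c in grad_x_lagrangian(x, lam))
    slack = all(abs(l * hi) <= tol for l, hi in zip(lam, h(x)))
    return feasible and dual_feasible and stationary and slack

def sample_feasible(rng, n_points):
    """Rejection sampling of points of D inside the box [-3, 3]^2."""
    pts = []
    while len(pts) < n_points:
        x = [rng.uniform(-3.0, 3.0), rng.uniform(-3.0, 3.0)]
        if all(hi <= 0.0 for hi in h(x)):
            pts.append(x)
    return pts

# Candidate pair, obtained by hand: projection of (2, 1) on {x1 + x2 <= 2}.
x_star, lam_star = [1.5, 0.5], [1.0, 0.0]

rng = random.Random(0)
gap = min(f(x) - f(x_star) for x in sample_feasible(rng, 1000))
print(kkt_holds(x_star, lam_star), gap, f(x_star), lagrangian(x_star, lam_star))
```

If the KKT check passes, the statement above predicts that `gap` is nonnegative and that `f(x_star)` and `lagrangian(x_star, lam_star)` agree. A random search over feasible points is of course only a sanity check, not a proof.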
@@ Line 73: / Line 72: @@
 <WRAP center round box 80%>
 __**Saddle point Lemma**__
-If $(x^*,\lambda^*) \in \Xset \times (\rset^+)^n$ is a saddle point for $\mcl$ then  the strong duality holds, $x^*=\arginf_{x \in \mcD} f(x)$, and $\lambda^*_i g_i(x^*)=0$ for all $i \in [1:n]$.
+If $(x^*,\lambda^*) \in \Xset \times (\rset^+)^n$ is a saddle point for $\mcl$, then the strong duality holds and the KKT conditions hold for $(x^*,\lambda^*)$.
 </WRAP>
 <hidden click here to see the proof>
@@ Line 88: / Line 87: @@
 which is the converse inequality of \eqref{eq:weak}. This finally implies:
 \begin{equation}\label{eq:all}
-\sup_{\lambda \geq 0} \inf_{x \in\Xset}  \mcl(x,\lambda)=\mcl(x^*,\lambda^*)=\inf_{x \in\Xset} \sup_{\lambda \geq 0}  \mcl(x,\lambda)=\inf_{x \in\mcD} f(x).
+\sup_{\lambda \geq 0} \inf_{x \in\Xset}  \mcl(x,\lambda)=\inf_{x \in\Xset} \mcl(x,\lambda^*)=\mcl(x^*,\lambda^*)=\inf_{x \in\Xset} \sup_{\lambda \geq 0}  \mcl(x,\lambda)=\inf_{x \in\mcD} f(x).
 \end{equation}
-The upper bound in \eqref{eq:saddle} regardless the value of $\lambda\geq 0$ shows that $g_i(x^*)\leq 0$ for every $i \in [1:n]$, that is $x^* \in \mcD$. Now taking \eqref{eq:saddle} with $\lambda=0$ yields for $x\in \mcD$,
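+Note that the equality $\mcl(x^*,\lambda^*)=\inf_{x \in\Xset} \mcl(x,\lambda^*)$ means that $x^*$ minimizes the differentiable function $x \mapsto \mcl(x,\lambda^*)$ over $\Xset=\rset^p$, which shows that $\nabla_x \mcl(x^*,\lambda^*)=0$.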
+The upper bound in \eqref{eq:saddle} holds regardless of the value of $\lambda\geq 0$; letting $\lambda_i \to \infty$ shows that $h_i(x^*)\leq 0$ for every $i \in [1:n]$, that is $x^* \in \mcD$. Now taking \eqref{eq:saddle} with $\lambda=0$ yields for $x\in \mcD$,
 $$
 f(x^*)=\mcl(x^*,0) \leq \mcl(x,\lambda^*) \leq f(x)
 $$
-which shows that $f(x^*)=\inf_{x \in\mcD} f(x)$. Combining with \eqref{eq:all} yields $\mcl(x^*,\lambda^*)=f(x^*)$ so that $\sum_{i=1}^n \lambda^*_i g_i(x^*_i)=0$ and since $\lambda^* \geq 0$ and $x^* \in \mcD$, this implies $\lambda^*_i g_i(x^*)=0$ for all $i \in [1:n]$.
+which shows that $f(x^*)=\inf_{x \in\mcD} f(x)$. Combining with \eqref{eq:all} yields $\mcl(x^*,\lambda^*)=f(x^*)$ so that $\sum_{i=1}^n \lambda^*_i h_i(x^*)=0$ and, since $\lambda^* \geq 0$ and $x^* \in \mcD$, every term of this sum is nonpositive, which implies $\lambda^*_i h_i(x^*)=0$ for all $i \in [1:n]$.
 $\eproof$
 </hidden>
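The saddle point property can also be probed numerically. The sketch below assumes the usual saddle-point inequalities $\mcl(x^*,\lambda) \leq \mcl(x^*,\lambda^*) \leq \mcl(x,\lambda^*)$ for all $\lambda \geq 0$ and $x \in \Xset$, and tests them on random samples for a hypothetical toy problem chosen here (it is not part of the original page).

```python
import random

# Hypothetical toy problem (an assumption, not from the original page):
#   f(x) = (x1 - 2)^2 + (x2 - 1)^2, h_1(x) = x1 + x2 - 2, h_2(x) = -x1.
def lagrangian(x, lam):
    f_val = (x[0] - 2.0) ** 2 + (x[1] - 1.0) ** 2
    h_val = [x[0] + x[1] - 2.0, -x[0]]
    return f_val + sum(l * hi for l, hi in zip(lam, h_val))

# Candidate saddle point, computed by hand for this toy problem.
x_star, lam_star = [1.5, 0.5], [1.0, 0.0]
center = lagrangian(x_star, lam_star)

rng = random.Random(1)
tol = 1e-12

# Left inequality: L(x*, lam) <= L(x*, lam*) for sampled lam >= 0.
left_ok = all(
    lagrangian(x_star, [rng.uniform(0.0, 10.0), rng.uniform(0.0, 10.0)]) <= center + tol
    for _ in range(1000)
)

# Right inequality: L(x*, lam*) <= L(x, lam*) for sampled x in R^2 (x is unconstrained).
right_ok = all(
    center <= lagrangian([rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0)], lam_star) + tol
    for _ in range(1000)
)

print(left_ok, right_ok, center)
```

Note that the right inequality is sampled over all of $\rset^2$ and not only over $\mcD$, since the saddle point property is stated on $\Xset$.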
 ===== Necessary conditions and strong duality =====
-We now assume the existence of Slater points: there exists $\tilde x \in \mcD$ such that for all $i \in \{1,\ldots,n\}$, $g_i(\tilde x)<0$.
+We now assume the existence of Slater points: there exists $\tilde x \in \mcD$ such that for all $i \in \{1,\ldots,n\}$, $h_i(\tilde x)<0$.
 <WRAP center round box 80%>
 __**The convex Farkas lemma**__
-Assume that there exists a Slater point. Then, $\{f<0\} \cap \mcD =\emptyset$ iff there exists $\lambda^* \geq 0$ such that for all $x\in \mcD$, $f(x)+\sum_{i=1}^n \lambda^*_i g_i(x) \geq 0$.
+Assume that there exists a Slater point. Then, $\{f<0\} \cap \mcD =\emptyset$ iff there exists $\lambda^* \geq 0$ such that for all $x\in \mcD$, $f(x)+\sum_{i=1}^n \lambda^*_i h_i(x) \geq 0$.
 </WRAP>
 <hidden Click here to see the proof>
@@ Line 110: / Line 111: @@
 Set
 $$
-U=\{u=u_{0:n} \in \rset^{n+1}; \exists\ x \in \mcD, f(x)<u_0 \ \mbox{and} \ g_i(x)\leq u_i  \mbox{ for all } i \in [1:n]\}.
+U=\{u=u_{0:n} \in \rset^{n+1}; \exists\ x \in \mcD, f(x)<u_0 \ \mbox{and} \ h_i(x)\leq u_i  \mbox{ for all } i \in [1:n]\}.
 $$
 The condition $\{f<0\} \cap \mcD =\emptyset$  is equivalent to saying that $0 \notin U$ and since $U$ is clearly a convex set, by a separation argument, there exists a __**non-null**__ vector $\phi \in \rset^{n+1}$ such that
@@ Line 119: / Line 120: @@
 If $u\in U$ then $u+t e_i \in U$ where $t>0$ and $e_i=(\indiacc{k=i})_{k \in [0:n]} \in \rset^{n+1}$.  The previous inequality implies $\phi^T (u+t e_i)=\phi^T u + t\phi_i \geq 0$ for all $t>0$, therefore $\phi_i\geq 0$. Now, since $\phi^T u\geq 0$ for all $u \in U$, a simple limiting argument yields for all $x\in \mcD$,
 $$
-\phi_0 f(x)+\sum_{i=1}^n \phi_i g_i(x) \geq 0
+\phi_0 f(x)+\sum_{i=1}^n \phi_i h_i(x) \geq 0
 $$
-To conclude, and since we already know that $\phi_i\geq 0$ for all $i \in [0:n]$, it only remains to prove that $\phi_0 \neq 0$ and to set in that case $\lambda^*_i=\frac{\phi_i}{\phi_0}$. The rest of the argument is by contradiction. If $\phi_0=0$, then the previous inequality on a Slater point $x=\tilde x$ yields $\sum_{i=1}^n \phi_i g_i(\tilde x) \geq 0$ but since $g_i(\tilde x)<0$ for all $i \in [1:n]$, we finally get $\phi_i=0$ for all $i \in [1:n]$. Finally all the components of $\phi$ are null and we are face to a contradiction. The proof is completed.
+To conclude, and since we already know that $\phi_i\geq 0$ for all $i \in [0:n]$, it only remains to prove that $\phi_0 \neq 0$ and to set in that case $\lambda^*_i=\frac{\phi_i}{\phi_0}$. The rest of the argument is by contradiction. If $\phi_0=0$, then the previous inequality on a Slater point $x=\tilde x$ yields $\sum_{i=1}^n \phi_i h_i(\tilde x) \geq 0$ but since $\phi_i \geq 0$ and $h_i(\tilde x)<0$ for all $i \in [1:n]$, we get $\phi_i=0$ for all $i \in [1:n]$. Then all the components of $\phi$ are null, which contradicts the fact that $\phi$ is a non-null vector. The proof is completed.
 $\eproof$
 </hidden>
@@ Line 132: / Line 133: @@
 <hidden Click here to see the proof>
 $\bproof$
-Assume now that $f(x^*)=\inf_{x\in \mcD} f(x)$ for some $x^* \in\mcD$. Then, $\{f-f(x^*)<0\} \cap \mcD =\emptyset$ so that there exists $\lambda^* \geq 0$ satisfying for all $x\in \mcD$, $f(x)-f(x^*)+\sum_{i=1}^n \lambda^*_i g_i(x) \geq 0$. And this implies that $(x^*,\lambda^*)$ is a saddle point since: for all $\lambda \geq 0$ and $x \in \mcD$,
+Assume now that $f(x^*)=\inf_{x\in \mcD} f(x)$ for some $x^* \in\mcD$. Then, $\{f-f(x^*)<0\} \cap \mcD =\emptyset$ so that there exists $\lambda^* \geq 0$ satisfying for all $x\in \mcD$, $f(x)-f(x^*)+\sum_{i=1}^n \lambda^*_i h_i(x) \geq 0$. And this implies that $(x^*,\lambda^*)$ is a saddle point since: for all $\lambda \geq 0$ and $x \in \mcD$,
 $$
-f(x^*)+\sum_{i=1}^n \lambda_i g_i(x^*) \leq f(x^*) \leq f(x)+\sum_{i=1}^n \lambda^*_i g_i(x)
+f(x^*)+\sum_{i=1}^n \lambda_i h_i(x^*) \leq f(x^*) \leq f(x)+\sum_{i=1}^n \lambda^*_i h_i(x)
 $$
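 Taking $x=x^*$ and $\lambda=\lambda^*$ in this chain of inequalities shows that $f(x^*)=\mcl(x^*,\lambda^*)$, so that $(x^*,\lambda^*)$ is a saddle point and this implies that the strong duality holds (as we have seen in the previous section). This concludes the proof.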

Welcome to Randal Douc's wiki
