Welcome to Randal Douc's wiki

A collaborative site on maths but not only!

User Tools

Site Tools


world:kkt

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
world:kkt [2022/03/16 07:40]
127.0.0.1 external edit
world:kkt [2022/10/03 00:23] (current)
rdouc
Line 10: Line 10:
  
 ====== Weak duality ====== ====== Weak duality ======
-Let $f$ and $(g_i)_{1 \leq i \leq n}$ be convex differentiable functions on $\Xset=\rset^p$. Let $\mcD=\cap_{i=1}^n \{g_i \leq 0\}\neq \emptyset$ and note that by convexity of the functions $g_i$, the set $\mcD$ is actually a convex set. We are interested in the infimum of the convex function $f$ on the convex set $\mcD$. Define the Lagrange function+Let $f$ and $(h_i)_{1 \leq i \leq n}$ be convex differentiable functions on $\Xset=\rset^p$. Let $\mcD=\cap_{i=1}^n \{h_i \leq 0\}\neq \emptyset$ and note that by convexity of the functions $h_i$, the set $\mcD$ is actually a convex set. We are interested in the infimum of the convex function $f$ on the convex set $\mcD$. Define the Lagrange function
 $$ $$
-\mcl(x,​\lambda)=f(x)+\sum_{i=1}^n \lambda_i ​g_i(x)+\mcl(x,​\lambda)=f(x)+\sum_{i=1}^n \lambda_i ​h_i(x)
 $$ $$
 where $\lambda=(\lambda_1,​\ldots,​\lambda_n)^T \in \rset^n$. We first show the weak duality relation. Start with this simple remark: for all $(x,​\lambda) \in \Xset \times \rset^n$, ​ where $\lambda=(\lambda_1,​\ldots,​\lambda_n)^T \in \rset^n$. We first show the weak duality relation. Start with this simple remark: for all $(x,​\lambda) \in \Xset \times \rset^n$, ​
Line 43: Line 43:
 \nabla_x \mcl(x^*,​\lambda^*)=0 \nabla_x \mcl(x^*,​\lambda^*)=0
 \end{equation} \end{equation}
-and for all $i \in [1:n]$, we have either $\lambda^*_i=0$ or $g_i(x^*)=0$ . +and for all $i \in [1:n]$, we have either $\lambda^*_i=0$ or $h_i(x^*)=0$ . 
 Then, the strong duality holds and  Then, the strong duality holds and 
 $$ $$
Line 55: Line 55:
 f(x)-f(x^*) \geq \nabla f(x^*)^T (x-x^*). f(x)-f(x^*) \geq \nabla f(x^*)^T (x-x^*).
 $$ $$
-It remains to prove $\nabla f(x^*)^T (x-x^*)\geq 0$. Using first $\nabla_x \mcl(x^*,​\lambda^*)=0$ and then the convexity of the functions $g_i$, we get+It remains to prove $\nabla f(x^*)^T (x-x^*)\geq 0$. Using first $\nabla_x \mcl(x^*,​\lambda^*)=0$ and then the convexity of the functions $h_i$, we get
 $$ $$
-\nabla f(x^*)^T (x-x^*)=-\sum_{i=1}^n \lambda^*_i \nabla_x ​g_i^T(x^*) (x-x^*) \geq -\sum_{i=1}^n \lambda^*_i (g_i(x)-g_i(x^*))=- \sum_{i=1}^n \lambda^*_i ​ \underbrace{g_i(x)}_{\leq 0} \geq 0+\nabla f(x^*)^T (x-x^*)=-\sum_{i=1}^n \lambda^*_i \nabla_x ​h_i^T(x^*) (x-x^*) \geq -\sum_{i=1}^n \lambda^*_i (h_i(x)-h_i(x^*))=- \sum_{i=1}^n \lambda^*_i ​ \underbrace{h_i(x)}_{\leq 0} \geq 0
 $$  ​ $$  ​
 This finishes the proof. $\eproof$ This finishes the proof. $\eproof$
Line 92: Line 92:
 Note that $\mcl(x^*,​\lambda^*)=\inf_{x \in\Xset} \mcl(x,​\lambda^*)$ shows that $\nabla_x \mcl(x^*,​\lambda^*)=0$. ​ Note that $\mcl(x^*,​\lambda^*)=\inf_{x \in\Xset} \mcl(x,​\lambda^*)$ shows that $\nabla_x \mcl(x^*,​\lambda^*)=0$. ​
    
-The upper bound in \eqref{eq:​saddle} regardless the value of $\lambda\geq 0$ shows that $g_i(x^*)\leq 0$ for every $i \in [1:n]$, that is $x^* \in \mcD$. Now taking \eqref{eq:​saddle} with $\lambda=0$ yields for $x\in \mcD$, ​+The upper bound in \eqref{eq:​saddle} regardless the value of $\lambda\geq 0$ shows that $h_i(x^*)\leq 0$ for every $i \in [1:n]$, that is $x^* \in \mcD$. Now taking \eqref{eq:​saddle} with $\lambda=0$ yields for $x\in \mcD$, ​
 $$ $$
 f(x^*)=\mcl(x^*,​0) \leq \mcl(x,​\lambda^*) \leq f(x)  ​ f(x^*)=\mcl(x^*,​0) \leq \mcl(x,​\lambda^*) \leq f(x)  ​
 $$    ​ $$    ​
-which shows that $f(x^*)=\inf_{x \in\mcD} f(x)$. Combining with \eqref{eq:​all} yields $\mcl(x^*,​\lambda^*)=f(x^*)$ so that $\sum_{i=1}^n \lambda^*_i ​g_i(x^*_i)=0$ and since $\lambda^* \geq 0$ and $x^* \in \mcD$, this implies $\lambda^*_i ​g_i(x^*)=0$ for all $i \in [1:​n]$. ​+which shows that $f(x^*)=\inf_{x \in\mcD} f(x)$. Combining with \eqref{eq:​all} yields $\mcl(x^*,​\lambda^*)=f(x^*)$ so that $\sum_{i=1}^n \lambda^*_i ​h_i(x^*_i)=0$ and since $\lambda^* \geq 0$ and $x^* \in \mcD$, this implies $\lambda^*_i ​h_i(x^*)=0$ for all $i \in [1:​n]$. ​
 $\eproof$ $\eproof$
 </​hidden>​ </​hidden>​
 ===== Necessary conditions and strong duality ===== ===== Necessary conditions and strong duality =====
    
-We now assume the existence of Slater points: there exists $\tilde x \in \mcD$ such that for all $i \in \{1,​\ldots,​n\}$,​ $g_i(\tilde x)<​0$. ​+We now assume the existence of Slater points: there exists $\tilde x \in \mcD$ such that for all $i \in \{1,​\ldots,​n\}$,​ $h_i(\tilde x)<​0$. ​
  
 <WRAP center round box 80%> <WRAP center round box 80%>
 __**The convex Farkas lemma**__ ​ __**The convex Farkas lemma**__ ​
-Assume that there exists a Slater point. Then, $\{f<0\} \cap \mcD =\emptyset$ iff there exists $\lambda^* \geq 0$ such that for all $x\in \mcD$, $f(x)+\sum_{i=1}^n \lambda^*_i ​g_i(x) \geq 0$.+Assume that there exists a Slater point. Then, $\{f<0\} \cap \mcD =\emptyset$ iff there exists $\lambda^* \geq 0$ such that for all $x\in \mcD$, $f(x)+\sum_{i=1}^n \lambda^*_i ​h_i(x) \geq 0$.
 </​WRAP>​ </​WRAP>​
 <hidden Click here to see the proof> ​ <hidden Click here to see the proof> ​
Line 111: Line 111:
 Set  Set 
 $$ $$
-U=\{u=u_{0:​n} \in \rset^{n+1};​ \exists\ x \in \mcD, f(x)<u_0 \ \mbox{and} \ g_i(x)\leq u_i  \mbox{ for all } i \in [1:n]\}.+U=\{u=u_{0:​n} \in \rset^{n+1};​ \exists\ x \in \mcD, f(x)<u_0 \ \mbox{and} \ h_i(x)\leq u_i  \mbox{ for all } i \in [1:n]\}.
 $$ $$
 The condition $\{f<0\} \cap \mcD =\emptyset$ ​ is equivalent to saying that $0 \notin U$ and since $U$ is clearly a convex set, by a separation argument, there exists a __**non-null**__ vector $\phi \in \rset^{n+1}$ such that  The condition $\{f<0\} \cap \mcD =\emptyset$ ​ is equivalent to saying that $0 \notin U$ and since $U$ is clearly a convex set, by a separation argument, there exists a __**non-null**__ vector $\phi \in \rset^{n+1}$ such that 
Line 120: Line 120:
 If $u\in U$ then $u+t e_i \in U$ where $t>0$ and $e_i=(\indiacc{k=i})_{k \in [0:n]} \in \rset^{n+1}$. ​ The previous inequality implies $\phi^T (u+t e_i)=\phi^T u + t\phi_i \geq 0$ for all $t>0$, therefore $\phi_i\geq 0$. Now, since $\phi^T u\geq 0$ for all $u \in U$, a simple limiting argument yields for all $x\in \mcD$, ​ If $u\in U$ then $u+t e_i \in U$ where $t>0$ and $e_i=(\indiacc{k=i})_{k \in [0:n]} \in \rset^{n+1}$. ​ The previous inequality implies $\phi^T (u+t e_i)=\phi^T u + t\phi_i \geq 0$ for all $t>0$, therefore $\phi_i\geq 0$. Now, since $\phi^T u\geq 0$ for all $u \in U$, a simple limiting argument yields for all $x\in \mcD$, ​
 $$ $$
-\phi_0 f(x)+\sum_{i=1}^n \phi_i ​g_i(x) \geq 0+\phi_0 f(x)+\sum_{i=1}^n \phi_i ​h_i(x) \geq 0
 $$ $$
-To conclude, and since we already know that $\phi_i\geq 0$ for all $i \in [0:n]$, it only remains to prove that $\phi_0 \neq 0$ and to set in that case $\lambda^*_i=\frac{\phi_i}{\phi_0}$. The rest of the argument is by contradiction. If $\phi_0=0$, then the previous inequality on a Slater point $x=\tilde x$ yields $\sum_{i=1}^n \phi_i ​g_i(\tilde x) \geq 0$ but since $g_i(\tilde x)<0$ for all $i \in [1:n]$, we finally get $\phi_i=0$ for all $i \in [1:n]$. Finally all the components of $\phi$ are null and we are face to a contradiction. The proof is completed.  ​+To conclude, and since we already know that $\phi_i\geq 0$ for all $i \in [0:n]$, it only remains to prove that $\phi_0 \neq 0$ and to set in that case $\lambda^*_i=\frac{\phi_i}{\phi_0}$. The rest of the argument is by contradiction. If $\phi_0=0$, then the previous inequality on a Slater point $x=\tilde x$ yields $\sum_{i=1}^n \phi_i ​h_i(\tilde x) \geq 0$ but since $h_i(\tilde x)<0$ for all $i \in [1:n]$, we finally get $\phi_i=0$ for all $i \in [1:n]$. Finally all the components of $\phi$ are null and we are face to a contradiction. The proof is completed.  ​
 $\eproof$ $\eproof$
 </​hidden>​ </​hidden>​
Line 133: Line 133:
 <hidden Click here to see the proof> <hidden Click here to see the proof>
 $\bproof$ $\bproof$
-Assume now that $f(x^*)=\inf_{x\in \mcD} f(x)$ for some $x^* \in\mcD$. Then, $\{f-f(x^*)<​0\} \cap \mcD =\emptyset$ so that there exists $\lambda^* \geq 0$ satisfying for all $x\in \mcD$, $f(x)-f(x^*)+\sum_{i=1}^n \lambda^*_i ​g_i(x) \geq 0$. And this implies that $(x^*,​\lambda^*)$ is a saddle point since: for all $\lambda \geq 0$ and $x \in \mcD$, ​+Assume now that $f(x^*)=\inf_{x\in \mcD} f(x)$ for some $x^* \in\mcD$. Then, $\{f-f(x^*)<​0\} \cap \mcD =\emptyset$ so that there exists $\lambda^* \geq 0$ satisfying for all $x\in \mcD$, $f(x)-f(x^*)+\sum_{i=1}^n \lambda^*_i ​h_i(x) \geq 0$. And this implies that $(x^*,​\lambda^*)$ is a saddle point since: for all $\lambda \geq 0$ and $x \in \mcD$, ​
 $$ $$
-f(x^*)+\sum_{i=1}^n \lambda_i ​g_i(x^*) \leq f(x^*) \leq f(x)+\sum_{i=1}^n \lambda^*_i ​g_i(x) +f(x^*)+\sum_{i=1}^n \lambda_i ​h_i(x^*) \leq f(x^*) \leq f(x)+\sum_{i=1}^n \lambda^*_i ​h_i(x) 
 $$ $$
 so that $(x^*,​\lambda^*)$ is a saddle point and this implies that the strong duality holds (as we have seen in the previous section). This concludes the proof. ​ so that $(x^*,​\lambda^*)$ is a saddle point and this implies that the strong duality holds (as we have seen in the previous section). This concludes the proof. ​
world/kkt.1647412822.txt.gz ยท Last modified: 2022/03/16 07:40 by 127.0.0.1