====== Weak duality ======
Let $f$ and $(h_i)_{1 \leq i \leq n}$ be convex differentiable functions on $\Xset=\rset^p$. Let $\mcD=\cap_{i=1}^n \{h_i \leq 0\}\neq \emptyset$ and note that by convexity of the functions $h_i$, the set $\mcD$ is actually a convex set. We are interested in the infimum of the convex function $f$ on the convex set $\mcD$. Define the Lagrange function
$$
\mcl(x,\lambda)=f(x)+\sum_{i=1}^n \lambda_i h_i(x)
$$
where $\lambda=(\lambda_1,\ldots,\lambda_n)^T \in \rset^n$. We first show the weak duality relation. Start with this simple remark: for all $(x,\lambda) \in \Xset \times \rset^n$,
$$
\inf_{x' \in \Xset} \mcl(x',\lambda) \leq \mcl(x,\lambda).
$$
Moreover, for all $x \in \Xset$, $\sup_{\lambda\geq 0}\mcl(x,\lambda)=f(x)$ if $x \in \mcD$ and $+\infty$ otherwise. Taking the supremum over $\lambda\geq 0$ and then the infimum over $x \in \Xset$ in the display above therefore yields
\begin{equation}\label{eq:weak}
\sup_{\lambda\geq 0}\inf_{x \in \Xset} \mcl(x,\lambda) \leq \inf_{x \in \Xset} \sup_{\lambda\geq 0}\mcl(x,\lambda)=\inf_{x \in \mcD} f(x)
\end{equation}
Note that in the lhs (left-hand side), the infimum is with respect to $x\in \Xset$ and therefore there is no constraint, which is nice...

The rhs (right-hand side) is called the //primal problem// and the lhs the //dual problem//. Note, since $x\mapsto \mcl(x,\lambda)$ is convex, that the <color red>dual problem</color> $\sup_{\lambda\geq 0}\inf_{x \in \Xset} \mcl(x,\lambda)$ is equivalent to
$$
\sup \set{\mcl(x,\lambda)}{\lambda \geq 0\mbox{ and }\nabla_x \mcl(x,\lambda)=0}
$$
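As a simple illustration, take $p=n=1$, $f(x)=x^2$ and $h_1(x)=1-x$, so that $\mcD=[1,+\infty)$ and $\mcl(x,\lambda)=x^2+\lambda(1-x)$. For fixed $\lambda\geq 0$, the dual reformulation above amounts to solving $\nabla_x \mcl(x,\lambda)=2x-\lambda=0$, that is $x=\lambda/2$, so that
$$
\inf_{x \in \rset} \mcl(x,\lambda)=\lambda-\frac{\lambda^2}{4} \quad \mbox{and} \quad \sup_{\lambda\geq 0}\inf_{x \in \rset} \mcl(x,\lambda)=1,
$$
the supremum being attained at $\lambda^*=2$. On the primal side, $\inf_{x \in \mcD} f(x)=\inf_{x\geq 1}x^2=1$, so both sides of \eqref{eq:weak} coincide on this example. We keep this toy example as a running illustration below.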
To obtain the equality in \eqref{eq:weak} (which is called the <color red>**strong duality relation**</color>), we actually need some additional assumptions (for example the existence of a Slater point). But before that, let us check a very useful relation...
====== Karush-Kuhn-Tucker conditions ======
  
<WRAP center round box 80%>
__**KKT conditions**__
Assume that $(x^*,\lambda^*) \in \mcD \times (\rset^+)^n$ is such that
\begin{equation}
\nabla_x \mcl(x^*,\lambda^*)=0
\end{equation}
and for all $i \in [1:n]$, we have either $\lambda^*_i=0$ or $h_i(x^*)=0$. Then, the strong duality holds and
$$
f(x^*)=\inf_{x \in \mcD} f(x)=\mcl(x^*,\lambda^*)
$$
</WRAP>
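On the toy example, the pair $(x^*,\lambda^*)=(1,2)$ satisfies these conditions: $\nabla_x \mcl(x^*,\lambda^*)=2x^*-\lambda^*=0$ and $h_1(x^*)=0$, so the alternative "$\lambda^*_1=0$ or $h_1(x^*)=0$" holds. In agreement with the statement, $f(x^*)=1=\inf_{x\geq 1}x^2$ and $\mcl(x^*,\lambda^*)=1+2\times(1-1)=1$.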
<hidden click here to see the proof>
$\bproof$
Let $x \in \mcD$. By convexity of $f$,
$$
f(x)-f(x^*) \geq \nabla f(x^*)^T (x-x^*).
$$
It remains to prove $\nabla f(x^*)^T (x-x^*)\geq 0$. Using first $\nabla_x \mcl(x^*,\lambda^*)=0$ and then the convexity of the functions $h_i$, we get
$$
\nabla f(x^*)^T (x-x^*)=-\sum_{i=1}^n \lambda^*_i \nabla_x h_i^T(x^*) (x-x^*) \geq -\sum_{i=1}^n \lambda^*_i (h_i(x)-h_i(x^*))=- \sum_{i=1}^n \lambda^*_i  \underbrace{h_i(x)}_{\leq 0} \geq 0
$$
This shows that $f(x^*)=\inf_{x \in \mcD} f(x)$. Moreover, the assumption that $\lambda^*_i=0$ or $h_i(x^*)=0$ for all $i$ gives $\mcl(x^*,\lambda^*)=f(x^*)$, and since $x\mapsto \mcl(x,\lambda^*)$ is convex with $\nabla_x \mcl(x^*,\lambda^*)=0$, we also have $\inf_{x \in \Xset}\mcl(x,\lambda^*)=\mcl(x^*,\lambda^*)$. Hence the lhs of \eqref{eq:weak} is at least $f(x^*)=\inf_{x \in \mcD} f(x)$, and the strong duality follows. This finishes the proof. $\eproof$
</hidden>

===== Saddle point =====

We say that $(x^*,\lambda^*) \in \Xset \times (\rset^+)^n$ is a //saddle point// of $\mcl$ if for all $(x,\lambda) \in \Xset \times (\rset^+)^n$,
\begin{equation}\label{eq:saddle}
\mcl(x^*,\lambda) \leq \mcl(x^*,\lambda^*) \leq \mcl(x,\lambda^*).
\end{equation}
<WRAP center round box 80%>
__**Saddle point Lemma**__
If $(x^*,\lambda^*) \in \Xset \times (\rset^+)^n$ is a saddle point for $\mcl$ then the strong duality holds and the KKT conditions hold for $(x^*,\lambda^*)$.
</WRAP>
<hidden click here to see the proof>
$\bproof$
Assume that $(x^*,\lambda^*)$ is a saddle point of $\mcl$, that is, \eqref{eq:saddle} holds for all $(x,\lambda) \in \Xset \times (\rset^+)^n$. Then,
$$
\inf_{x \in\mcD} f(x)=\inf_{x \in\Xset} \sup_{\lambda \geq 0} \mcl(x,\lambda) \leq \sup_{\lambda \geq 0} \mcl(x^*,\lambda) \leq \mcl(x^*,\lambda^*) \leq \inf_{x \in\Xset} \mcl(x,\lambda^*) \leq \sup_{\lambda \geq 0} \inf_{x \in\Xset} \mcl(x,\lambda),
$$
which is the converse inequality of \eqref{eq:weak}. This finally implies:
\begin{equation}\label{eq:all}
\sup_{\lambda \geq 0} \inf_{x \in\Xset} \mcl(x,\lambda)=\inf_{x \in\Xset} \mcl(x,\lambda^*)=\mcl(x^*,\lambda^*)=\inf_{x \in\Xset} \sup_{\lambda \geq 0} \mcl(x,\lambda)=\inf_{x \in\mcD} f(x).
\end{equation}
  
Note that $\mcl(x^*,\lambda^*)=\inf_{x \in\Xset} \mcl(x,\lambda^*)$ shows that $\nabla_x \mcl(x^*,\lambda^*)=0$.

The upper bound in \eqref{eq:saddle}, regardless of the value of $\lambda\geq 0$, shows that $h_i(x^*)\leq 0$ for every $i \in [1:n]$, that is $x^* \in \mcD$. Now taking \eqref{eq:saddle} with $\lambda=0$ yields for $x\in \mcD$,
$$
f(x^*)=\mcl(x^*,0) \leq \mcl(x,\lambda^*) \leq f(x)
$$
which shows that $f(x^*)=\inf_{x \in\mcD} f(x)$. Combining with \eqref{eq:all} yields $\mcl(x^*,\lambda^*)=f(x^*)$ so that $\sum_{i=1}^n \lambda^*_i h_i(x^*)=0$ and since $\lambda^* \geq 0$ and $x^* \in \mcD$, this implies $\lambda^*_i h_i(x^*)=0$ for all $i \in [1:n]$.
$\eproof$
</hidden>
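As an illustration of the lemma, in the toy example the pair $(x^*,\lambda^*)=(1,2)$ is indeed a saddle point of $\mcl(x,\lambda)=x^2+\lambda(1-x)$: for every $\lambda\geq 0$, $\mcl(1,\lambda)=1=\mcl(1,2)$ and, for every $x\in\rset$, $\mcl(x,2)=(x-1)^2+1\geq 1$, so that \eqref{eq:saddle} holds.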
===== Necessary conditions and strong duality =====
    
We now assume the existence of a Slater point: there exists $\tilde x \in \mcD$ such that $h_i(\tilde x)<0$ for all $i \in \{1,\ldots,n\}$.
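In the toy example, $\tilde x=2$ is such a Slater point since $h_1(2)=-1<0$.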
  
<WRAP center round box 80%>
__**The convex Farkas lemma**__
Assume that there exists a Slater point. Then, $\{f<0\} \cap \mcD =\emptyset$ iff there exists $\lambda^* \geq 0$ such that for all $x\in \mcD$, $f(x)+\sum_{i=1}^n \lambda^*_i h_i(x) \geq 0$.
</WRAP>
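To see the lemma at work on a small case, keep $h_1(x)=1-x$ (so that $\mcD=[1,+\infty)$ and $\tilde x=2$ is a Slater point) but now take $f(x)=x^2-1$. Then $\{f<0\}\cap\mcD=(-1,1)\cap[1,+\infty)=\emptyset$ and, in agreement with the lemma, the choice $\lambda^*_1=2$ gives for all $x$,
$$
f(x)+\lambda^*_1 h_1(x)=x^2-2x+1=(x-1)^2 \geq 0.
$$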
<hidden Click here to see the proof>
$\bproof$
The //if// part is straightforward: for $x \in \mcD$ and $\lambda^* \geq 0$, $\sum_{i=1}^n \lambda^*_i h_i(x) \leq 0$, so that $f(x)+\sum_{i=1}^n \lambda^*_i h_i(x) \geq 0$ implies $f(x) \geq 0$. We now turn to the //only if// part.
Set
$$
U=\{u=u_{0:n} \in \rset^{n+1};\ \exists\ x \in \mcD, f(x)<u_0 \ \mbox{and} \ h_i(x)\leq u_i \mbox{ for all } i \in [1:n]\}.
$$
The condition $\{f<0\} \cap \mcD =\emptyset$ is equivalent to saying that $0 \notin U$ and since $U$ is clearly a convex set, by a separation argument, there exists a __**non-null**__ vector $\phi \in \rset^{n+1}$ such that
$$
\phi^T u \geq 0 \quad \mbox{for all } u \in U.
$$
If $u\in U$ then $u+t e_i \in U$ where $t>0$ and $e_i=(\indiacc{k=i})_{k \in [0:n]} \in \rset^{n+1}$. The previous inequality implies $\phi^T (u+t e_i)=\phi^T u + t\phi_i \geq 0$ for all $t>0$, therefore $\phi_i\geq 0$. Now, since $\phi^T u\geq 0$ for all $u \in U$, a simple limiting argument yields for all $x\in \mcD$,
 $$ $$
-\phi_0 f(x)+\sum_{i=1}^n \phi_i ​g_i(x) \geq 0+\phi_0 f(x)+\sum_{i=1}^n \phi_i ​h_i(x) \geq 0
 $$ $$
To conclude, since we already know that $\phi_i\geq 0$ for all $i \in [0:n]$, it only remains to prove that $\phi_0 \neq 0$ and to set in that case $\lambda^*_i=\frac{\phi_i}{\phi_0}$. The rest of the argument is by contradiction. If $\phi_0=0$, then the previous inequality applied to a Slater point $x=\tilde x$ yields $\sum_{i=1}^n \phi_i h_i(\tilde x) \geq 0$ but since $h_i(\tilde x)<0$ for all $i \in [1:n]$, we get $\phi_i=0$ for all $i \in [1:n]$. Then all the components of $\phi$ are null, which is a contradiction. The proof is completed.
$\eproof$
</hidden>
<WRAP center round box 80%>
__**Necessary conditions**__
Assume that there exists a Slater point and that $f(x^*)=\inf_{x \in \mcD} f(x)$ for some $x^* \in \mcD$. Then, there exists $\lambda^* \geq 0$ such that $(x^*,\lambda^*)$ is a saddle point of $\mcl$. In particular, the strong duality holds and the KKT conditions hold for $(x^*,\lambda^*)$.
</WRAP>
<hidden Click here to see the proof>
$\bproof$
Assume now that $f(x^*)=\inf_{x\in \mcD} f(x)$ for some $x^* \in\mcD$. Then, $\{f-f(x^*)<0\} \cap \mcD =\emptyset$ so that, by the convex Farkas lemma, there exists $\lambda^* \geq 0$ satisfying for all $x\in \mcD$, $f(x)-f(x^*)+\sum_{i=1}^n \lambda^*_i h_i(x) \geq 0$. This implies that $(x^*,\lambda^*)$ is a saddle point since for all $\lambda \geq 0$ and $x \in \mcD$,
$$
f(x^*)+\sum_{i=1}^n \lambda_i h_i(x^*) \leq f(x^*) \leq f(x)+\sum_{i=1}^n \lambda^*_i h_i(x)
$$
so that $(x^*,\lambda^*)$ is indeed a saddle point, which implies that the strong duality holds (as we have seen in the previous section). This concludes the proof. $\eproof$
</hidden>
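Back to the toy example: $x^*=1$ minimizes $f(x)=x^2$ on $\mcD=[1,+\infty)$, a Slater point exists ($\tilde x=2$), and a multiplier as in the statement is $\lambda^*=2$, for which $(x^*,\lambda^*)=(1,2)$ is exactly the saddle point exhibited above.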