Welcome to Randal Douc's wiki

A collaborative site on maths but not only!

User Tools

Site Tools


world:kkt

This is an old revision of the document!


$$ \newcommand{\arginf}{\mathrm{arginf}} \newcommand{\argmin}{\mathrm{argmin}} \newcommand{\argmax}{\mathrm{argmax}} \newcommand{\asconv}[1]{\stackrel{#1-a.s.}{\rightarrow}} \newcommand{\Aset}{\mathsf{A}} \newcommand{\b}[1]{{\mathbf{#1}}} \newcommand{\ball}[1]{\mathsf{B}(#1)} \newcommand{\bbQ}{{\mathbb Q}} \newcommand{\bproof}{\textbf{Proof :}\quad} \newcommand{\bmuf}[2]{b_{#1,#2}} \newcommand{\card}{\mathrm{card}} \newcommand{\chunk}[3]{{#1}_{#2:#3}} \newcommand{\condtrans}[3]{p_{#1}(#2|#3)} \newcommand{\convprob}[1]{\stackrel{#1-\text{prob}}{\rightarrow}} \newcommand{\Cov}{\mathbb{C}\mathrm{ov}} \newcommand{\cro}[1]{\langle #1 \rangle} \newcommand{\CPE}[2]{\PE\lr{#1| #2}} \renewcommand{\det}{\mathrm{det}} \newcommand{\dimlabel}{\mathsf{m}} \newcommand{\dimU}{\mathsf{q}} \newcommand{\dimX}{\mathsf{d}} \newcommand{\dimY}{\mathsf{p}} \newcommand{\dlim}{\Rightarrow} \newcommand{\e}[1]{{\left\lfloor #1 \right\rfloor}} \newcommand{\eproof}{\quad \Box} \newcommand{\eremark}{</WRAP>} \newcommand{\eqdef}{:=} \newcommand{\eqlaw}{\stackrel{\mathcal{L}}{=}} \newcommand{\eqsp}{\;} \newcommand{\Eset}{ {\mathsf E}} \newcommand{\esssup}{\mathrm{essup}} \newcommand{\fr}[1]{{\left\langle #1 \right\rangle}} \newcommand{\falph}{f} \renewcommand{\geq}{\geqslant} \newcommand{\hchi}{\hat \chi} \newcommand{\Hset}{\mathsf{H}} \newcommand{\Id}{\mathrm{Id}} \newcommand{\img}{\text{Im}} \newcommand{\indi}[1]{\mathbf{1}_{#1}} \newcommand{\indiacc}[1]{\mathbf{1}_{\{#1\}}} \newcommand{\indin}[1]{\mathbf{1}\{#1\}} \newcommand{\itemm}{\quad \quad \blacktriangleright \;} \newcommand{\jointtrans}[3]{p_{#1}(#2,#3)} \newcommand{\ker}{\text{Ker}} \newcommand{\klbck}[2]{\mathrm{K}\lr{#1||#2}} \newcommand{\law}{\mathcal{L}} \newcommand{\labelinit}{\pi} \newcommand{\labelkernel}{Q} \renewcommand{\leq}{\leqslant} \newcommand{\lone}{\mathsf{L}_1} \newcommand{\lrav}[1]{\left|#1 \right|} \newcommand{\lr}[1]{\left(#1 \right)} \newcommand{\lrb}[1]{\left[#1 \right]} \newcommand{\lrc}[1]{\left\{#1 \right\}} \newcommand{\lrcb}[1]{\left\{#1 \right\}} \newcommand{\ltwo}[1]{\PE^{1/2}\lrb{\lrcb{#1}^2}} \newcommand{\Ltwo}{\mathrm{L}^2} \newcommand{\mc}[1]{\mathcal{#1}} \newcommand{\mcbb}{\mathcal B} \newcommand{\mcf}{\mathcal{F}} \newcommand{\meas}[1]{\mathrm{M}_{#1}} \newcommand{\norm}[1]{\left\|#1\right\|} \newcommand{\normmat}[1]{{\left\vert\kern-0.25ex\left\vert\kern-0.25ex\left\vert #1 \right\vert\kern-0.25ex\right\vert\kern-0.25ex\right\vert}} \newcommand{\nset}{\mathbb N} \newcommand{\N}{\mathcal{N}} \newcommand{\one}{\mathsf{1}} \newcommand{\PE}{\mathbb E} \newcommand{\pminfty}{_{-\infty}^\infty} \newcommand{\PP}{\mathbb P} \newcommand{\projorth}[1]{\mathsf{P}^\perp_{#1}} \newcommand{\Psif}{\Psi_f} \newcommand{\pscal}[2]{\langle #1,#2\rangle} \newcommand{\pscal}[2]{\langle #1,#2\rangle} \newcommand{\psconv}{\stackrel{\PP-a.s.}{\rightarrow}} \newcommand{\qset}{\mathbb Q} \newcommand{\revcondtrans}[3]{q_{#1}(#2|#3)} \newcommand{\rmd}{\mathrm d} \newcommand{\rme}{\mathrm e} \newcommand{\rmi}{\mathrm i} \newcommand{\Rset}{\mathbb{R}} \newcommand{\rset}{\mathbb{R}} \newcommand{\rti}{\sigma} \newcommand{\section}[1]{==== #1 ====} \newcommand{\seq}[2]{\lrc{#1\eqsp: \eqsp #2}} \newcommand{\set}[2]{\lrc{#1\eqsp: \eqsp #2}} \newcommand{\sg}{\mathrm{sgn}} \newcommand{\supnorm}[1]{\left\|#1\right\|_{\infty}} \newcommand{\thv}{{\theta_\star}} \newcommand{\tmu}{ {\tilde{\mu}}} \newcommand{\Tset}{ {\mathsf{T}}} \newcommand{\Tsigma}{ {\mathcal{T}}} \newcommand{\ttheta}{{\tilde \theta}} \newcommand{\tv}[1]{\left\|#1\right\|_{\mathrm{TV}}} \newcommand{\unif}{\mathrm{Unif}} \newcommand{\weaklim}[1]{\stackrel{\mathcal{L}_{#1}}{\rightsquigarrow}} \newcommand{\Xset}{{\mathsf X}} \newcommand{\Xsigma}{\mathcal X} \newcommand{\Yset}{{\mathsf Y}} \newcommand{\Ysigma}{\mathcal Y} \newcommand{\Var}{\mathbb{V}\mathrm{ar}} \newcommand{\zset}{\mathbb{Z}} \newcommand{\Zset}{\mathsf{Z}} $$

2017/10/07 23:39 · douc

$$ \newcommand{\mcD}{\mathcal D} \newcommand{\mcl}{\mathcal L} $$

Weak duality

Let $f$ and $(g_i)_{1 \leq i \leq n}$ be convex differentiable functions on $\Xset=\rset^p$. Let $\mcD=\cap_{i=1}^n \{g_i \leq 0\}\neq \emptyset$ and note that by convexity of the functions $g_i$, the set $\mcD$ is actually a convex set. We are interested in the infimum of the convex function $f$ on the convex set $\mcD$. Define the Lagrange function $$ \mcl(x,\lambda)=f(x)+\sum_{i=1}^n \lambda_i g_i(x) $$ where $\lambda=(\lambda_1,\ldots,\lambda_n)^T \in \rset^n$. We first show the weak duality relation. Start with this simple remark: for all $(x,\lambda) \in \Xset \times \rset^n$, $$ \inf_{x \in \Xset} \mcl(x,\lambda) \leq \mcl(x,\lambda) $$ Taking the supremum wrt $\lambda \geq 0$ on both sides yields: $$ \sup_{\lambda\geq 0}\inf_{x \in \Xset} \mcl(x,\lambda) \leq \sup_{\lambda\geq 0} \mcl(x,\lambda)=\infty \indin{x\notin \mcD}+f(x)\indin{x\in \mcD} $$ And taking now the infimum wrt $x \in \Xset$ on both sides, we finally get the weak duality relation: \begin{equation} \label{eq:weak} \sup_{\lambda\geq 0}\inf_{x \in \Xset} \mcl(x,\lambda) \leq \inf_{x \in \Xset} \sup_{\lambda\geq 0}\mcl(x,\lambda)=\inf_{x \in \mcD} f(x) \end{equation} Note that in the lhs, the infimum is wrt $x\in \Xset$ and therefore, there is no constraint, which is nice… To obtain the equality (which is named the strong duality relation), we actually need some additional assumptions (for example the existence of a Slater point). But before that, let us check a very useful relation…

Karush-Kuhn-Tucker conditions

Sufficient conditions

Lemma Assume that there exist $(x^*,\lambda^*) \in \mcD\times (\rset^+)^n$ such that \begin{equation} \nabla_x \mcl(x^*,\lambda^*)=0 \end{equation} and for all $i \in [1:n]$, we have either $\lambda^*_i=0$ or $g_i(x^*)=0$ . Then, $$ f(x^*)=\inf_{x \in \mcD} f(x) $$

Click to see the proof

Click to see the proof

$\bproof$ For every $x \in \mcD$, the convexity of the function $f$ yields $$ f(x)-f(x^*) \geq \nabla f(x^*)^T (x-x^*). $$ It remains to prove $\nabla f(x^*)^T (x-x^*)\geq 0$. Using first $\nabla_x \mcl(x^*,\lambda^*)=0$ and then the convexity of the functions $g_i$, we get $$ \nabla f(x^*)^T (x-x^*)=-\sum_{i=1}^n \lambda^*_i \nabla_x g_i^T(x^*) (x-x^*) \geq -\sum_{i=1}^n \lambda^*_i (g_i(x)-g_i(x^*))=- \sum_{i=1}^n \lambda^*_i \underbrace{g_i(x)}_{\leq 0} \geq 0 $$ This finishes the proof. $\eproof$

Saddle points

Definition We say that $(x^*,\lambda^*) \in \Xset \times (\rset^+)^n$ is a saddle point of the Lagrange function $\mcl$ if for every $(x,\lambda) \in \Xset \times (\rset^+)^n$, \begin{equation}\label{eq:saddle} \mcl(x^*,\lambda) \leq \mcl(x,\lambda^*) \end{equation}

Saddle point Lemma If $(x^*,\lambda^*) \in \Xset \times (\rset^+)^n$ is a saddle point for $\mcl$ then the strong duality holds, $x^*=\arginf_{x \in \mcD} f(x)$, and $\lambda^*_i g_i(x^*)=0$ for all $i \in [1:n]$.

click here to see the proof

click here to see the proof

$\bproof$ Using \eqref{eq:saddle} with $\lambda=\lambda^*$ first and then with $x=x^*$, we have for every $(x,\lambda) \in \Xset \times (\rset^+)^n$, \begin{equation*} \mcl(x^*,\lambda) \leq \mcl(x^*,\lambda^*) \leq \mcl(x,\lambda^*) \end{equation*} The existence of a saddle point implies the strong duality since $$ \inf_{x \in\Xset} \sup_{\lambda \geq 0} \mcl(x,\lambda) \leq \sup_{\lambda \geq 0} \mcl(x^*,\lambda) \leq \mcl(x^*,\lambda^*) \leq \inf_{x \in\Xset} \mcl(x,\lambda^*) \leq \sup_{\lambda \geq 0} \inf_{x \in\Xset} \mcl(x,\lambda) $$ which is the converse inequality of \eqref{eq:weak}. This finally implies: \begin{equation}\label{eq:all} \sup_{\lambda \geq 0} \inf_{x \in\Xset} \mcl(x,\lambda)=\mcl(x^*,\lambda^*)=\inf_{x \in\Xset} \sup_{\lambda \geq 0} \mcl(x,\lambda)=\inf_{x \in\mcD} f(x). \end{equation}

The upper bound in \eqref{eq:saddle} regardless the value of $\lambda\geq 0$ shows that $g_i(x^*)\leq 0$ for every $i \in [1:n]$, that is $x^* \in \mcD$. Now taking \eqref{eq:saddle} with $\lambda=0$ yields for $x\in \mcD$, $$ f(x^*)=\mcl(x^*,0) \leq \mcl(x,\lambda^*) \leq f(x) $$ which shows that $f(x^*)=\inf_{x \in\mcD} f(x)$. Combining with \eqref{eq:all} yields $\mcl(x^*,\lambda^*)=f(x^*)$ so that $\sum_{i=1}^n \lambda^*_i g_i(x^*_i)=0$ and since $\lambda^* \geq 0$ and $x^* \in \mcD$, this implies $\lambda^*_i g_i(x^*)=0$ for all $i \in [1:n]$. $\eproof$

Necessary conditions and strong duality

We now assume the existence of Slater points: there exists $\tilde x \in \mcD$ such that for all $i \in \{1,\ldots,n\}$, $g_i(\tilde x)<0$.

The convex Farkas lemma Assume that there exists a Slater point. Then, $\{f<0\} \cap \mcD =\emptyset$ iff there exists $\lambda^* \geq 0$ such that for all $x\in \mcD$, $f(x)+\sum_{i=1}^n \lambda^*_i g_i(x) \geq 0$.

Click here to see the proof

Click here to see the proof

$\bproof$ Set $$ U=\{u=u_{0:n} \in \rset^{n+1}; \exists\ x \in \mcD, f(x)<u_0 \ \mbox{and} \ g_i(x)\leq u_i \mbox{ for all } i \in [1:n]\}. $$ The condition $\{f<0\} \cap \mcD =\emptyset$ is equivalent to saying that $0 \notin U$ and since $U$ is clearly a convex set, by a separation argument, there exists a non-null vector $\phi \in \rset^{n+1}$ such that \begin{equation} \label{eq:ineg} \forall u \in U\,, \quad \phi^T u\geq 0 \end{equation} Take an arbitrary $i \in [0:n]$. If $u\in U$ then $u+t e_i \in U$ where $t>0$ and $e_i=(\indiacc{k=i})_{k \in [0:n]} \in \rset^{n+1}$. The previous inequality implies $\phi^T (u+t e_i)=\phi^T u + t\phi_i \geq 0$ for all $t>0$, therefore $\phi_i\geq 0$. Now, since $\phi^T u\geq 0$ for all $u \in U$, a simple limiting argument yields for all $x\in \mcD$, $$ \phi_0 f(x)+\sum_{i=1}^n \phi_i g_i(x) \geq 0 $$ To conclude, and since we already know that $\phi_i\geq 0$ for all $i \in [0:n]$, it only remains to prove that $\phi_0 \neq 0$ and to set in that case $\lambda^*_i=\frac{\phi_i}{\phi_0}$. The rest of the argument is by contradiction. If $\phi_0=0$, then the previous inequality on a Slater point $x=\tilde x$ yields $\sum_{i=1}^n \phi_i g_i(\tilde x) \geq 0$ but since $g_i(\tilde x)<0$ for all $i \in [1:n]$, we finally get $\phi_i=0$ for all $i \in [1:n]$. Finally all the components of $\phi$ are null and we are face to a contradiction. The proof is completed. $\eproof$

The Farkas lemma will imply the strong duality under the existence of a Slater point.

The necessary condition Assume that $f(x^*)=\inf_{x\in \mcD} f(x)$ for some $x^* \in\mcD$ and that there exists a Slater point. Then there exist $\lambda^* \geq 0$ such that $(x^*,\lambda^*)$ is a saddle point and by the Saddle point lemma, the strong duality holds.

Click here to see the proof

Click here to see the proof

$\bproof$ Assume now that $f(x^*)=\inf_{x\in \mcD} f(x)$ for some $x^* \in\mcD$. Then, $\{f-f(x^*)<0\} \cap \mcD =\emptyset$ so that there exists $\lambda^* \geq 0$ satisfying for all $x\in \mcD$, $f(x)-f(x^*)+\sum_{i=1}^n \lambda^*_i g_i(x) \geq 0$. And this implies that $(x^*,\lambda^*)$ is a saddle point since: for all $\lambda \geq 0$ and $x \in \mcD$, $$ f(x^*)+\sum_{i=1}^n \lambda_i g_i(x^*) \leq f(x^*) \leq f(x)+\sum_{i=1}^n \lambda^*_i g_i(x) $$ so that $(x^*,\lambda^*)$ is a saddle point and this implies that the strong duality holds (as we have seen in the previous section). This concludes the proof. $\eproof$

world/kkt.1611172275.txt.gz · Last modified: 2022/03/16 01:36 (external edit)