Differences

This shows you the differences between two versions of the page.

--- world:kkt [2026/04/08 10:42]
rdouc
+++ world:kkt [2026/04/09 23:17] (current)
rdouc
@@ Line 34: / Line 34: @@
 $$
-Taking the infimum over $x \in \Xset$ leads to the **weak duality relation**:
+Taking the infimum over $x \in \Xset$ leads to the <color red>weak duality relation </color>:
 \begin{equation} \label{eq:weak}
 \sup_{\lambda \geq 0,\ \mu} \inf_{x \in \Xset} \mcl(x,\lambda,\mu)
@@ Line 43: / Line 43: @@
 Observe that in the lhs (left-hand side), the infimum is taken over $x \in \Xset$, hence no constraint is imposed. In contrast, the rhs (right-hand side) corresponds to the constrained problem.
-The rhs is called the //primal problem//, while the lhs is referred to as the //dual problem//. Since $x \mapsto \mcl(x,\lambda,\mu)$ is convex, the dual problem $\sup_{\lambda\geq 0,\mu}\inf_{x \in \Xset} \mcl(x,\lambda,\mu)$ is equivalent to
+The rhs is called the <color red>//primal problem//</color>, while the lhs is referred to as the <color red>//dual problem//</color>. Since $x \mapsto \mcl(x,\lambda,\mu)$ is convex, the dual problem $\sup_{\lambda\geq 0,\mu}\inf_{x \in \Xset} \mcl(x,\lambda,\mu)$ is equivalent to
 $$
-\sup \{\mcl(x,\lambda,\mu)\,:\lambda \geq 0, \mu \mbox{ and }\nabla_x \mcl(x,\lambda)=0\}
+\sup \{\mcl(x,\lambda,\mu)\,:\lambda \geq 0, \mu \mbox{ and }\nabla_x \mcl(x,\lambda,\mu)=0\}
 $$
 This formulation is often useful when solving the optimization problem.
-To obtain equality in \eqref{eq:weak} (known as the **strong duality relation**), additional assumptions are required, such as the existence of a Slater point. Before discussing this, we introduce a useful intermediate result.
+To obtain equality in \eqref{eq:weak} (known as the <color red>**strong duality relation**</color>), additional assumptions are required, such as the existence of a Slater point. Before discussing this, we introduce a useful intermediate result.
 ===== Karush-Kuhn-Tucker conditions =====
@@ Line 63: / Line 63: @@
   * for all $i \in [1:n]$, either $\lambda_i^* = 0$ or $h_i(x^*) = 0$.
-Then strong duality holds and
+Then
 $$
-f(x^*) = \inf_{x \in \mcD} f(x) = \mcl(x^*, \lambda^*, \mu^*).
+f(x^*) = \inf_{x \in \mcD} f(x) = \mcl(x^*, \lambda^*, \mu^*),
 $$
+and strong duality holds.
 </WRAP>
@@ Line 82: / Line 82: @@
 which implies
 \begin{equation} \label{eq:grad}
-\nabla f(x^*)^T (x - x^*) = - \sum_{i=1}^n \lambda_i^* \nabla h_i(x^*)^T (x - x^*) - \sum_{j=1}^m \mu_j^* \nabla g_j(x^*)^T (x - x^*).
+f(x) - f(x^*) \geq \nabla f(x^*)^T (x - x^*) = - \sum_{i=1}^n \lambda_i^* \nabla h_i(x^*)^T (x - x^*) - \sum_{j=1}^m \mu_j^* \nabla g_j(x^*)^T (x - x^*).
 \end{equation}
 **Inequality constraints**: $h_i$: by convexity, for any $x \in \mcD$,
 $$
-- \nabla h_i(x^*)^T (x - x^*) \ge h_i(x^*) - h_i(x) \ge -h_i(x^*),
+- \nabla h_i(x^*)^T (x - x^*) \ge h_i(x^*) - h_i(x) \ge h_i(x^*),
 $$
 hence
 $$
-- \sum_{i=1}^n \lambda_i^* \nabla h_i(x^*)^T (x - x^*) \ge -\sum_{i=1}^n \lambda_i^* h_i(x^*) = 0.
+- \sum_{i=1}^n \lambda_i^* \nabla h_i(x^*)^T (x - x^*) \ge \sum_{i=1}^n \lambda_i^* h_i(x^*) = 0.
 $$
@@ Line 99: / Line 99: @@
 $$
-Combining these relations in \eqref{eq:grad}, we obtain
+Combining these relations in \eqref{eq:grad}, we obtain that for any $x \in \mcD$,
 $$
-\nabla f(x^*)^T (x - x^*) \ge - \sum_{i=1}^n \lambda_i^* h_i(x^*) = 0,
+f(x) - f(x^*) \ge \nabla f(x^*)^T (x - x^*) \ge  0,
 $$
 which proves that $f(x^*) = \inf_{x \in \mcD} f(x)= \mcl(x^*, \lambda^*, \mu^*)$. Since $x \mapsto \mcl(x,\lambda^*,\mu^*)$ is convex, the KKT conditions show that
@@ Line 107: / Line 107: @@
  \mcl(x^*, \lambda^*, \mu^*) = \inf_{x\in \Xset} \mcl(x, \lambda^*, \mu^*) \leq \sup_{\lambda \geq 0,\mu} \inf_{x\in \Xset} \mcl(x, \lambda, \mu)
 $$
-and the strong duality holds.
+which is the converse inequality in \eqref{eq:weak} and the strong duality holds.
 </hidden>
@@ Line 142: / Line 142: @@
 $$
-This chain of inequalities implies strong duality.
+This chain of inequalities implies strong duality (see \eqref{eq:weak} for the reverse inequality).
-Moreover, the upper bound in the saddle point property, which holds for all $\lambda \geq 0, \mu$, implies that $h_i(x^*) \leq 0$ and $g_j(x^*) = 0$.
+Moreover, the upper bound in the saddle point property, which holds for all $\lambda \geq 0, \mu$, implies that $\sup_{\lambda \geq 0, \mu}  \mcl(x^*, \lambda, \mu) <\infty$ and hence $h_i(x^*) \leq 0$ and $g_j(x^*) = 0$.
 Finally, choosing $\lambda = 0$ and $\mu = 0$, we obtain
 $$
-f(x^*) = \mcl(x^*, 0, 0) \le \mcl(x, \lambda^*, \mu^*) \le f(x), \quad \forall x \in \mcD,
+f(x^*) = \mcl(x^*, 0, 0) \le \sup_{\lambda \geq 0, \mu} \mcl(x^*, \lambda, \mu)
+= \inf_{x \in \Xset} \mcl(x, \lambda^*, \mu^*) \le \mcl(x, \lambda^*, \mu^*) \le f(x), \quad \forall x \in \mcD,
 $$
 which shows that $x^*$ minimizes $f$ over $\mcD$ and that $\lambda_i^* h_i(x^*) = 0$ for all $i$ where the last identity follows from the above inequality with $x=x^*$.
@@ Line 171: / Line 172: @@
 g_j(x) = a_j^T x - b_j, \quad j=1,\dots,m.
 $$
-Without loss of generality, we assume that $(a_j)_{1\leq j \leq m}$ are  \textbf{linearly independent}.
+Without loss of generality, we assume that $(a_j)_{1\leq j \leq m}$ are  **linearly independent**.
 We define the set $U$ as
 $$
@@ Line 195: / Line 196: @@
-<WRAP center round help 80%>
+<WRAP center round tip 80%>
 === Theorem (KKT conditions) ===

Welcome to Randal Douc's wiki

User Tools

Site Tools

Differences

Page Tools