This shows you the differences between two versions of the page.
| Both sides previous revision Previous revision Next revision | Previous revision | ||
|
world:kkt [2026/04/08 10:42] rdouc |
world:kkt [2026/04/09 23:17] (current) rdouc |
||
|---|---|---|---|
| Line 34: | Line 34: | ||
| $$ | $$ | ||
| - | Taking the infimum over $x \in \Xset$ leads to the **weak duality relation**: | + | Taking the infimum over $x \in \Xset$ leads to the <color red>weak duality relation </color>: |
| \begin{equation} \label{eq:weak} | \begin{equation} \label{eq:weak} | ||
| \sup_{\lambda \geq 0,\ \mu} \inf_{x \in \Xset} \mcl(x,\lambda,\mu) | \sup_{\lambda \geq 0,\ \mu} \inf_{x \in \Xset} \mcl(x,\lambda,\mu) | ||
| Line 43: | Line 43: | ||
| Observe that in the lhs (left-hand side), the infimum is taken over $x \in \Xset$, hence no constraint is imposed. In contrast, the rhs (right-hand side) corresponds to the constrained problem. | Observe that in the lhs (left-hand side), the infimum is taken over $x \in \Xset$, hence no constraint is imposed. In contrast, the rhs (right-hand side) corresponds to the constrained problem. | ||
| - | The rhs is called the //primal problem//, while the lhs is referred to as the //dual problem//. Since $x \mapsto \mcl(x,\lambda,\mu)$ is convex, the dual problem $\sup_{\lambda\geq 0,\mu}\inf_{x \in \Xset} \mcl(x,\lambda,\mu)$ is equivalent to | + | The rhs is called the <color red>//primal problem//</color>, while the lhs is referred to as the <color red>//dual problem//</color>. Since $x \mapsto \mcl(x,\lambda,\mu)$ is convex, the dual problem $\sup_{\lambda\geq 0,\mu}\inf_{x \in \Xset} \mcl(x,\lambda,\mu)$ is equivalent to |
| $$ | $$ | ||
| - | \sup \{\mcl(x,\lambda,\mu)\,:\lambda \geq 0, \mu \mbox{ and }\nabla_x \mcl(x,\lambda)=0\} | + | \sup \{\mcl(x,\lambda,\mu)\,:\lambda \geq 0, \mu \mbox{ and }\nabla_x \mcl(x,\lambda,\mu)=0\} |
| $$ | $$ | ||
| This formulation is often useful when solving the optimization problem. | This formulation is often useful when solving the optimization problem. | ||
| - | To obtain equality in \eqref{eq:weak} (known as the **strong duality relation**), additional assumptions are required, such as the existence of a Slater point. Before discussing this, we introduce a useful intermediate result. | + | To obtain equality in \eqref{eq:weak} (known as the <color red>**strong duality relation**</color>), additional assumptions are required, such as the existence of a Slater point. Before discussing this, we introduce a useful intermediate result. |
| ===== Karush-Kuhn-Tucker conditions ===== | ===== Karush-Kuhn-Tucker conditions ===== | ||
| Line 63: | Line 63: | ||
| * for all $i \in [1:n]$, either $\lambda_i^* = 0$ or $h_i(x^*) = 0$. | * for all $i \in [1:n]$, either $\lambda_i^* = 0$ or $h_i(x^*) = 0$. | ||
| - | Then strong duality holds and | + | Then |
| $$ | $$ | ||
| - | f(x^*) = \inf_{x \in \mcD} f(x) = \mcl(x^*, \lambda^*, \mu^*). | + | f(x^*) = \inf_{x \in \mcD} f(x) = \mcl(x^*, \lambda^*, \mu^*), |
| $$ | $$ | ||
| + | and strong duality holds. | ||
| </WRAP> | </WRAP> | ||
| Line 82: | Line 82: | ||
| which implies | which implies | ||
| \begin{equation} \label{eq:grad} | \begin{equation} \label{eq:grad} | ||
| - | \nabla f(x^*)^T (x - x^*) = - \sum_{i=1}^n \lambda_i^* \nabla h_i(x^*)^T (x - x^*) - \sum_{j=1}^m \mu_j^* \nabla g_j(x^*)^T (x - x^*). | + | f(x) - f(x^*) \geq \nabla f(x^*)^T (x - x^*) = - \sum_{i=1}^n \lambda_i^* \nabla h_i(x^*)^T (x - x^*) - \sum_{j=1}^m \mu_j^* \nabla g_j(x^*)^T (x - x^*). |
| \end{equation} | \end{equation} | ||
| **Inequality constraints**: $h_i$: by convexity, for any $x \in \mcD$, | **Inequality constraints**: $h_i$: by convexity, for any $x \in \mcD$, | ||
| $$ | $$ | ||
| - | - \nabla h_i(x^*)^T (x - x^*) \ge h_i(x^*) - h_i(x) \ge -h_i(x^*), | + | - \nabla h_i(x^*)^T (x - x^*) \ge h_i(x^*) - h_i(x) \ge h_i(x^*), |
| $$ | $$ | ||
| hence | hence | ||
| $$ | $$ | ||
| - | - \sum_{i=1}^n \lambda_i^* \nabla h_i(x^*)^T (x - x^*) \ge -\sum_{i=1}^n \lambda_i^* h_i(x^*) = 0. | + | - \sum_{i=1}^n \lambda_i^* \nabla h_i(x^*)^T (x - x^*) \ge \sum_{i=1}^n \lambda_i^* h_i(x^*) = 0. |
| $$ | $$ | ||
| Line 99: | Line 99: | ||
| $$ | $$ | ||
| - | Combining these relations in \eqref{eq:grad}, we obtain | + | Combining these relations in \eqref{eq:grad}, we obtain that for any $x \in \mcD$, |
| $$ | $$ | ||
| - | \nabla f(x^*)^T (x - x^*) \ge - \sum_{i=1}^n \lambda_i^* h_i(x^*) = 0, | + | f(x) - f(x^*) \ge \nabla f(x^*)^T (x - x^*) \ge 0, |
| $$ | $$ | ||
| which proves that $f(x^*) = \inf_{x \in \mcD} f(x)= \mcl(x^*, \lambda^*, \mu^*)$. Since $x \mapsto \mcl(x,\lambda^*,\mu^*)$ is convex, the KKT conditions show that | which proves that $f(x^*) = \inf_{x \in \mcD} f(x)= \mcl(x^*, \lambda^*, \mu^*)$. Since $x \mapsto \mcl(x,\lambda^*,\mu^*)$ is convex, the KKT conditions show that | ||
| Line 107: | Line 107: | ||
| \mcl(x^*, \lambda^*, \mu^*) = \inf_{x\in \Xset} \mcl(x, \lambda^*, \mu^*) \leq \sup_{\lambda \geq 0,\mu} \inf_{x\in \Xset} \mcl(x, \lambda, \mu) | \mcl(x^*, \lambda^*, \mu^*) = \inf_{x\in \Xset} \mcl(x, \lambda^*, \mu^*) \leq \sup_{\lambda \geq 0,\mu} \inf_{x\in \Xset} \mcl(x, \lambda, \mu) | ||
| $$ | $$ | ||
| - | and the strong duality holds. | + | which is the converse inequality in \eqref{eq:weak} and the strong duality holds. |
| </hidden> | </hidden> | ||
| Line 142: | Line 142: | ||
| $$ | $$ | ||
| - | This chain of inequalities implies strong duality. | + | This chain of inequalities implies strong duality (see \eqref{eq:weak} for the reverse inequality). |
| - | Moreover, the upper bound in the saddle point property, which holds for all $\lambda \geq 0, \mu$, implies that $h_i(x^*) \leq 0$ and $g_j(x^*) = 0$. | + | Moreover, the upper bound in the saddle point property, which holds for all $\lambda \geq 0, \mu$, implies that $\sup_{\lambda \geq 0, \mu} \mcl(x^*, \lambda, \mu) <\infty$ and hence $h_i(x^*) \leq 0$ and $g_j(x^*) = 0$. |
| Finally, choosing $\lambda = 0$ and $\mu = 0$, we obtain | Finally, choosing $\lambda = 0$ and $\mu = 0$, we obtain | ||
| $$ | $$ | ||
| - | f(x^*) = \mcl(x^*, 0, 0) \le \mcl(x, \lambda^*, \mu^*) \le f(x), \quad \forall x \in \mcD, | + | f(x^*) = \mcl(x^*, 0, 0) \le \sup_{\lambda \geq 0, \mu} \mcl(x^*, \lambda, \mu) |
| + | = \inf_{x \in \Xset} \mcl(x, \lambda^*, \mu^*) \le \mcl(x, \lambda^*, \mu^*) \le f(x), \quad \forall x \in \mcD, | ||
| $$ | $$ | ||
| which shows that $x^*$ minimizes $f$ over $\mcD$ and that $\lambda_i^* h_i(x^*) = 0$ for all $i$ where the last identity follows from the above inequality with $x=x^*$. | which shows that $x^*$ minimizes $f$ over $\mcD$ and that $\lambda_i^* h_i(x^*) = 0$ for all $i$ where the last identity follows from the above inequality with $x=x^*$. | ||
| Line 171: | Line 172: | ||
| g_j(x) = a_j^T x - b_j, \quad j=1,\dots,m. | g_j(x) = a_j^T x - b_j, \quad j=1,\dots,m. | ||
| $$ | $$ | ||
| - | Without loss of generality, we assume that $(a_j)_{1\leq j \leq m}$ are \textbf{linearly independent}. | + | Without loss of generality, we assume that $(a_j)_{1\leq j \leq m}$ are **linearly independent**. |
| We define the set $U$ as | We define the set $U$ as | ||
| $$ | $$ | ||
| Line 195: | Line 196: | ||
| - | <WRAP center round help 80%> | ||
| + | <WRAP center round tip 80%> | ||
| === Theorem (KKT conditions) === | === Theorem (KKT conditions) === | ||