KKT条件学习

编程入门 行业动态 更新时间:2024-10-08 03:28:05

KKT<a href=https://www.elefans.com/category/jswz/34/1771358.html style=条件学习"/>

KKT条件学习

不等式约束问题

(P)  min ⁡ f ( x ) s.t.  g i ( x ) ≤ 0 , i = 1 , 2 , … , m \begin{array}{lll} \text { (P) } & \min & f(\mathbf{x}) \\ & \text { s.t. } & g_{i}(\mathbf{x}) \leq 0, & i=1,2, \ldots, m \end{array}  (P) ​min s.t. ​f(x)gi​(x)≤0,​i=1,2,…,m​
其中 f , g 1 , ⋯ , g m f,g_1,\cdots,g_m f,g1​,⋯,gm​是在 R n \mathbb{R}^n Rn连续可微的函数

可行下降方向

min ⁡ b ( x ) s.t.  x ∈ C \begin{array}{ll} \min & b(\mathbf{x}) \\ \text { s.t. } & \mathbf{x} \in C \end{array} min s.t. ​b(x)x∈C​
其中 h h h是在闭凸集 C ⊆ R n \mathbf{C}\subseteq \mathbb{R}^n C⊆Rn上连续可微的函数。
设向量 d ≠ 0 \mathbf{d}\neq 0 d​=0 , x ∈ C \mathbf{x}\in\mathbf{C} x∈C
如果 ∇ h ( x ) T d < 0 \nabla h(\mathbf{x})^T\mathbf{d}<0 ∇h(x)Td<0,并且存在 ϵ > 0 \epsilon>0 ϵ>0使得 ∀ t ∈ [ 0 , ϵ ] , x + t d ∈ C \forall t\in\left[0,\epsilon\right],\mathbf{x}+t\mathbf{d}\in \mathbf{C} ∀t∈[0,ϵ],x+td∈C
则 d \mathbf{d} d是在 x \mathbf{x} x点的可行下降方向

引理1

考虑问题
(G)  min ⁡ h ( x ) s.t.  x ∈ C . \begin{array}{lll} \text { (G) } & \min & h(\mathbf{x}) \\ & \text { s.t. } & \mathbf{x} \in C . \end{array}  (G) ​min s.t. ​h(x)x∈C.​
如果 x ∗ \mathbf{x}^* x∗是一个局部最优解,则在 x ∗ \mathbf{x}^* x∗没有可行下降方向

证明:
假设有下降方向,则存在 d ≠ 0 , ϵ 1 > 0 \mathbf{d}\neq 0,\epsilon_1>0 d​=0,ϵ1​>0
∀ t ∈ [ 0 , ϵ 1 ] , x ∗ + t d ∈ C , ∇ f ( x ) T d < 0 \forall t \in \left[0,\epsilon_1\right],\mathbf{x}^*+t\mathbf{d}\in\mathbf{C},\nabla f(\mathbf{x})^T\mathbf{d}<0 ∀t∈[0,ϵ1​],x∗+td∈C,∇f(x)Td<0
根据下降方向的性质
∃ ϵ 2 < ϵ 1 , ∀ t ∈ [ 0 , ϵ 2 ] , f ( x ∗ + t d ) < f ( x ∗ ) \exists \epsilon_2<\epsilon_1,\forall t\in\left[0,\epsilon_2\right],f(\mathbf{x}^*+t\mathbf{d})<f(\mathbf{x}^*) ∃ϵ2​<ϵ1​,∀t∈[0,ϵ2​],f(x∗+td)<f(x∗)
和局部最优矛盾了

引理2

设 x ∗ \mathbf{x}^* x∗是
min ⁡ f ( x ) s.t.  g i ( x ) ≤ 0 , i = 1 , 2 , … , m \begin{array}{ll} \min & f(\mathbf{x}) \\ \text { s.t. } & g_{i}(\mathbf{x}) \leq 0, \quad i=1,2, \ldots, m \end{array} min s.t. ​f(x)gi​(x)≤0,i=1,2,…,m​
的局部最优解
其中 f , g 1 , ⋯ , g m f,g_1,\cdots,g_m f,g1​,⋯,gm​是 R n \mathbb{R}^n Rn上连续可微的函数

I ( x ∗ ) = { i : g i ( x ∗ ) = 0 } I(\mathbf{x}^*)=\left\{i: g_i(\mathbf{x}^*)=0\right\} I(x∗)={i:gi​(x∗)=0}
则不存在 d ∈ R n \mathbf{d}\in\mathbb{R}^n d∈Rn,使得
∇ f ( x ∗ ) T d < 0 , ∇ g i ( x ∗ ) T d < 0 , i ∈ I ( x ∗ ) \begin{aligned} &\nabla f\left(\mathbf{x}^{*}\right)^{T} \mathbf{d}<0, \\ &\nabla g_{i}\left(\mathbf{x}^{*}\right)^{T} \mathbf{d}<0, \quad i \in I\left(\mathbf{x}^{*}\right) \end{aligned} ​∇f(x∗)Td<0,∇gi​(x∗)Td<0,i∈I(x∗)​
证明:
假设存在 d \mathbf{d} d使得
∇ f ( x ∗ ) T d < 0 , ∇ g i ( x ∗ ) T d < 0 , i ∈ I ( x ∗ ) \begin{aligned} &\nabla f\left(\mathbf{x}^{*}\right)^{T} \mathbf{d}<0, \\ &\nabla g_{i}\left(\mathbf{x}^{*}\right)^{T} \mathbf{d}<0, \quad i \in I\left(\mathbf{x}^{*}\right) \end{aligned} ​∇f(x∗)Td<0,∇gi​(x∗)Td<0,i∈I(x∗)​
那么存在 ϵ 1 > 0 \epsilon_1>0 ϵ1​>0, ∀ t ∈ ( 0 , ϵ 1 ) , i ∈ I ( x ∗ ) \forall t \in (0,\epsilon_1),i\in I(\mathbf{x}^*) ∀t∈(0,ϵ1​),i∈I(x∗)
使得 f ( x ∗ + t d ) < f ( x ∗ ) f(\mathbf{x}^*+t\mathbf{d})<f(\mathbf{x}^*) f(x∗+td)<f(x∗)以及 g i ( x ∗ + t d ) < g i ( x ∗ ) = 0 g_i(\mathbf{x}^*+t\mathbf{d})<g_i(\mathbf{x}^*)=0 gi​(x∗+td)<gi​(x∗)=0

对于 i ∉ I ( x ∗ ) i\notin I(\mathbf{x}^*) i∈/​I(x∗),有 g i ( x ∗ ) < 0 g_i(\mathbf{x}^*)<0 gi​(x∗)<0
因此 ∃ ϵ 2 > 0 , ∀ t ∈ ( 0 , ϵ 2 ) , i ∉ I ( x ∗ ) \exists \epsilon_2>0,\forall t\in (0,\epsilon_2),i\notin I(\mathbf{x}^*) ∃ϵ2​>0,∀t∈(0,ϵ2​),i∈/​I(x∗),有 g i ( x ∗ + t d ) < 0 g_i(\mathbf{x}^*+t\mathbf{d})<0 gi​(x∗+td)<0

所以 ∀ t ∈ ( 0 , min ⁡ { ϵ 1 , ϵ 2 } ) \forall t\in \left(0,\min \left\{\epsilon_1,\epsilon_2\right\}\right) ∀t∈(0,min{ϵ1​,ϵ2​}),有
f ( x ∗ + t d ) < f ( x ∗ ) g i ( x ∗ + t d ) < 0 , i = 1 , 2 , … , m , \begin{aligned} &f\left(\mathbf{x}^{*}+t \mathbf{d}\right)<f\left(\mathbf{x}^{*}\right) \\ &g_{i}\left(\mathbf{x}^{*}+t \mathbf{d}\right)<0, \quad i=1,2, \ldots, m, \end{aligned} ​f(x∗+td)<f(x∗)gi​(x∗+td)<0,i=1,2,…,m,​
与 x ∗ \mathbf{x}^* x∗局部最优解矛盾

不等式约束问题的Fritz-John条件

设 x ∗ \mathbf{x}^* x∗是
min ⁡ f ( x ) s.t.  g i ( x ) ≤ 0 , i = 1 , 2 , … , m \begin{array}{ll} \min & f(\mathbf{x}) \\ \text { s.t. } & g_{i}(\mathbf{x}) \leq 0, \quad i=1,2, \ldots, m \end{array} min s.t. ​f(x)gi​(x)≤0,i=1,2,…,m​
的局部最优解
其中 f , g 1 , ⋯ , g m f,g_1,\cdots,g_m f,g1​,⋯,gm​是 R n \mathbb{R}^n Rn上连续可微的函数
则存在不全为0的 λ 0 , λ 1 , ⋯ , λ m ≥ 0 \lambda_0,\lambda_1,\cdots,\lambda_m\ge 0 λ0​,λ1​,⋯,λm​≥0,使得
λ 0 ∇ f ( x ∗ ) + ∑ i = 1 m λ i ∇ g i ( x ∗ ) = 0 , λ i g i ( x ∗ ) = 0 , i = 1 , 2 , … , m \begin{aligned} \lambda_{0} \nabla f\left(\mathbf{x}^{*}\right)+\sum_{i=1}^{m} \lambda_{i} \nabla g_{i}\left(\mathbf{x}^{*}\right) &=0, \\ \lambda_{i} g_{i}\left(\mathbf{x}^{*}\right) &=0, \quad i=1,2, \ldots, m \end{aligned} λ0​∇f(x∗)+i=1∑m​λi​∇gi​(x∗)λi​gi​(x∗)​=0,=0,i=1,2,…,m​
证明:
由引理2
不存在 d \mathbf{d} d使得
∇ f ( x ∗ ) T d < 0 , ∇ g i ( x ∗ ) T d < 0 , i ∈ I ( x ∗ ) \begin{aligned} &\nabla f\left(\mathbf{x}^{*}\right)^{T} \mathbf{d}<0, \\ &\nabla g_{i}\left(\mathbf{x}^{*}\right)^{T} \mathbf{d}<0, \quad i \in I\left(\mathbf{x}^{*}\right) \end{aligned} ​∇f(x∗)Td<0,∇gi​(x∗)Td<0,i∈I(x∗)​
其中 I ( x ∗ ) = { i : g i ( x ∗ ) } = { i 1 , i 2 , ⋯ , i k } I(\mathbf{x}^*)=\left\{i:g_i(\mathbf{x}^*)\right\}=\left\{i_1,i_2,\cdots,i_k\right\} I(x∗)={i:gi​(x∗)}={i1​,i2​,⋯,ik​}
这个等价于
A d < 0 \mathbf{Ad}<0 Ad<0
无解
其中
A = ( ∇ f ( x ∗ ) T ∇ g i 1 ( x ∗ ) T ⋮ ∇ g i k ( x ∗ ) T ) \mathbf{A}=\left(\begin{array}{c} \nabla f\left(\mathbf{x}^{*}\right)^{T} \\ \nabla g_{i_{1}}\left(\mathbf{x}^{*}\right)^{T} \\ \vdots \\ \nabla g_{i_{k}}\left(\mathbf{x}^{*}\right)^{T} \end{array}\right) A=⎝⎜⎜⎜⎛​∇f(x∗)T∇gi1​​(x∗)T⋮∇gik​​(x∗)T​⎠⎟⎟⎟⎞​
根据Gordan定理,
无解等价于存在 η = ( λ 0 , ⋯ , λ i k ) T ≠ 0 \eta=\left(\lambda_0,\cdots,\lambda_{i_k}\right)^T\neq 0 η=(λ0​,⋯,λik​​)T​=0,使得
A T η = 0 , η ≥ 0 \mathrm{A}^{T} \eta=0, \quad \eta \geq 0 ATη=0,η≥0
于是
λ 0 ∇ f ( x ∗ ) + ∑ i = 1 m λ i ∇ g i ( x ∗ ) = 0 , λ i g i ( x ∗ ) = 0 , i = 1 , 2 , … , m \begin{aligned} \lambda_{0} \nabla f\left(\mathbf{x}^{*}\right)+\sum_{i=1}^{m} \lambda_{i} \nabla g_{i}\left(\mathbf{x}^{*}\right) &=0, \\ \lambda_{i} g_{i}\left(\mathbf{x}^{*}\right) &=0, \quad i=1,2, \ldots, m \end{aligned} λ0​∇f(x∗)+i=1∑m​λi​∇gi​(x∗)λi​gi​(x∗)​=0,=0,i=1,2,…,m​

不等式约束问题的KKT条件

设 x ∗ \mathbf{x}^* x∗是
min ⁡ f ( x ) s.t.  g i ( x ) ≤ 0 , i = 1 , 2 , … , m \begin{array}{ll} \min & f(\mathbf{x}) \\ \text { s.t. } & g_{i}(\mathbf{x}) \leq 0, \quad i=1,2, \ldots, m \end{array} min s.t. ​f(x)gi​(x)≤0,i=1,2,…,m​
的局部最优解
其中 f , g 1 , ⋯ , g m f,g_1,\cdots,g_m f,g1​,⋯,gm​是 R n \mathbb{R}^n Rn上连续可微的函数

I ( x ∗ ) = { i : g i ( x ∗ ) = 0 } I(\mathbf{x}^*)=\left\{i:g_i(\mathbf{x}^*)=0\right\} I(x∗)={i:gi​(x∗)=0}
设 { ∇ g i ( x ∗ ) } i ∈ I ( x ∗ ) \left\{\nabla g_i(\mathbf{x}^*)\right\}_{i\in I(\mathbf{x}^*)} {∇gi​(x∗)}i∈I(x∗)​线性无关
则存在 λ 1 , ⋯ , λ m ≥ 0 \lambda_1,\cdots,\lambda_m\ge 0 λ1​,⋯,λm​≥0,使得
∇ f ( x ∗ ) + ∑ i = 1 m λ i ∇ g i ( x ∗ ) = 0 , λ i g i ( x ∗ ) = 0 , i = 1 , 2 , … , m \begin{aligned} \nabla f\left(\mathbf{x}^{*}\right)+\sum_{i=1}^{m} \lambda_{i} \nabla g_{i}\left(\mathbf{x}^{*}\right) &=0, \\ \lambda_{i} g_{i}\left(\mathbf{x}^{*}\right) &=0, \quad i=1,2, \ldots, m \end{aligned} ∇f(x∗)+i=1∑m​λi​∇gi​(x∗)λi​gi​(x∗)​=0,=0,i=1,2,…,m​
证明:
根据Fritz-John条件,存在不全为0的 λ 0 ~ , ⋯ , λ m ~ ≥ 0 \tilde{\lambda_0},\cdots,\tilde{\lambda_m}\ge 0 λ0​~​,⋯,λm​~​≥0
使得 λ 0 ~ ∇ f ( x ∗ ) + ∑ i = 1 m λ i ~ ∇ g i ( x ∗ ) = 0 , λ i ~ g i ( x ∗ ) = 0 , i = 1 , 2 , … , m \begin{aligned} \tilde{\lambda_0} \nabla f\left(\mathbf{x}^{*}\right)+\sum_{i=1}^{m} \tilde{\lambda_i} \nabla g_{i}\left(\mathbf{x}^{*}\right) &=0, \\ \tilde{\lambda_i} g_{i}\left(\mathbf{x}^{*}\right) &=0, \quad i=1,2, \ldots, m \end{aligned} λ0​~​∇f(x∗)+i=1∑m​λi​~​∇gi​(x∗)λi​~​gi​(x∗)​=0,=0,i=1,2,…,m​
如果 λ 0 ~ = 0 \tilde{\lambda_0}=0 λ0​~​=0,则 λ i ~ = 0 \tilde{\lambda_i}=0 λi​~​=0,与 { ∇ g i ( x ∗ ) } i ∈ I ( x ∗ ) \left\{\nabla g_i(\mathbf{x}^*)\right\}_{i\in I(\mathbf{x}^*)} {∇gi​(x∗)}i∈I(x∗)​线性无关
所以令 λ i = λ i ~ λ 0 ~ \lambda_i=\frac{\tilde{\lambda_i}}{\tilde{\lambda_0}} λi​=λ0​~​λi​~​​
得证

不等式与等式约束问题

不等式与等式约束问题的KKT条件

设 x ∗ \mathbf{x}^* x∗是问题
min ⁡ f ( x ) s.t.  g i ( x ) ≤ 0 , i = 1 , 2 , … , m h j ( x ) = 0 , j = 1 , 2 , … , p \begin{array}{ll} \min & f(\mathbf{x}) \\ \text { s.t. } & g_{i}(\mathbf{x}) \leq 0, \quad i=1,2, \ldots, m \\ & h_{j}(\mathbf{x})=0, \quad j=1,2, \ldots, p \end{array} min s.t. ​f(x)gi​(x)≤0,i=1,2,…,mhj​(x)=0,j=1,2,…,p​
的局部最优解,
其中 f , g 1 , ⋯ , g m , h 1 , ⋯ , h p f,g_1,\cdots,g_m,h_1,\cdots,h_p f,g1​,⋯,gm​,h1​,⋯,hp​是在 R n \mathbb{R}^n Rn上连续可微的函数
假设
{ ∇ g i ( x ∗ ) : i ∈ I ( x ∗ ) } ∪ { ∇ h j ( x ∗ ) : j = 1 , 2 , … , p } \left\{\nabla g_{i}\left(\mathbf{x}^{*}\right): i \in I\left(\mathbf{x}^{*}\right)\right\} \cup\left\{\nabla h_{j}\left(\mathbf{x}^{*}\right): j=1,2, \ldots, p\right\} {∇gi​(x∗):i∈I(x∗)}∪{∇hj​(x∗):j=1,2,…,p}
线性无关
其中 I ( x ∗ ) = { i : g i ( x ∗ ) = 0 } I(\mathbf{x}^*)=\left\{i:g_i(\mathbf{x}^*)=0\right\} I(x∗)={i:gi​(x∗)=0}
则存在 λ 1 , ⋯ , λ m ≥ 0 \lambda_1,\cdots,\lambda_m\ge 0 λ1​,⋯,λm​≥0和 μ 1 , ⋯ , μ p ∈ R \mu_1,\cdots,\mu_p\in \mathbb{R} μ1​,⋯,μp​∈R,使得
∇ f ( x ∗ ) + ∑ i = 1 m λ i ∇ g i ( x ∗ ) + ∑ j = 1 p μ j ∇ h j ( x ∗ ) = 0 , λ i g i ( x ∗ ) = 0 , i = 1 , 2 , … , m \begin{aligned} \nabla f\left(\mathbf{x}^{*}\right)+\sum_{i=1}^{m} \lambda_{i} \nabla g_{i}\left(\mathbf{x}^{*}\right)+\sum_{j=1}^{p} \mu_{j} \nabla h_{j}\left(\mathbf{x}^{*}\right) &=0, \\ \lambda_{i} g_{i}\left(\mathbf{x}^{*}\right) &=0, \quad i=1,2, \ldots, m \end{aligned} ∇f(x∗)+i=1∑m​λi​∇gi​(x∗)+j=1∑p​μj​∇hj​(x∗)λi​gi​(x∗)​=0,=0,i=1,2,…,m​

KKT点

一个可行解 x ∗ \mathbb{x}^* x∗
如果存在 λ 1 , ⋯ , λ m ≥ 0 \lambda_1,\cdots,\lambda_m\ge 0 λ1​,⋯,λm​≥0和 μ 1 , ⋯ , μ p ∈ R \mu_1,\cdots,\mu_p\in \mathbb{R} μ1​,⋯,μp​∈R,使得
∇ f ( x ∗ ) + ∑ i = 1 m λ i ∇ g i ( x ∗ ) + ∑ j = 1 p μ j ∇ h j ( x ∗ ) = 0 , λ i g i ( x ∗ ) = 0 , i = 1 , 2 , … , m \begin{aligned} \nabla f\left(\mathbf{x}^{*}\right)+\sum_{i=1}^{m} \lambda_{i} \nabla g_{i}\left(\mathbf{x}^{*}\right)+\sum_{j=1}^{p} \mu_{j} \nabla h_{j}\left(\mathbf{x}^{*}\right) &=0, \\ \lambda_{i} g_{i}\left(\mathbf{x}^{*}\right) &=0, \quad i=1,2, \ldots, m \end{aligned} ∇f(x∗)+i=1∑m​λi​∇gi​(x∗)+j=1∑p​μj​∇hj​(x∗)λi​gi​(x∗)​=0,=0,i=1,2,…,m​
则 x ∗ \mathbf{x}^* x∗称为KKT点

规范(regularity)

一个可行解 x ∗ \mathbf{x}^* x∗
如果 { ∇ g i ( x ∗ ) : i ∈ I ( x ∗ ) } ∪ { ∇ h j ( x ∗ ) : j = 1 , 2 , … , p } \left\{\nabla g_{i}\left(\mathbf{x}^{*}\right): i \in I\left(\mathbf{x}^{*}\right)\right\} \cup\left\{\nabla h_{j}\left(\mathbf{x}^{*}\right): j=1,2, \ldots, p\right\} {∇gi​(x∗):i∈I(x∗)}∪{∇hj​(x∗):j=1,2,…,p}
线性无关,则称为规范(regularity)

凸的条件下

凸优化问题中KKT条件的充分性

设 x ∗ \mathbf{x}^* x∗是问题
min ⁡ f ( x ) s.t.  g i ( x ) ≤ 0 , i = 1 , 2 , … , m h j ( x ) = 0 , j = 1 , 2 , … , p \begin{array}{ll} \min & f(\mathbf{x}) \\ \text { s.t. } & g_{i}(\mathbf{x}) \leq 0, \quad i=1,2, \ldots, m \\ & h_{j}(\mathbf{x})=0, \quad j=1,2, \ldots, p \end{array} min s.t. ​f(x)gi​(x)≤0,i=1,2,…,mhj​(x)=0,j=1,2,…,p​
的一个可行解
其中 f , g 1 , ⋯ , g m f,g_1,\cdots,g_m f,g1​,⋯,gm​是在 R n \mathbb{R}^n Rn上连续可微的凸函数
h 1 , ⋯ , h p h_1,\cdots,h_p h1​,⋯,hp​是仿射函数
如果存在 λ 1 , ⋯ , λ m ≥ 0 \lambda_1,\cdots,\lambda_m\ge 0 λ1​,⋯,λm​≥0和 μ 1 , ⋯ , μ p ∈ R \mu_1,\cdots,\mu_p\in \mathbb{R} μ1​,⋯,μp​∈R,使得
∇ f ( x ∗ ) + ∑ i = 1 m λ i ∇ g i ( x ∗ ) + ∑ j = 1 p μ j ∇ h j ( x ∗ ) = 0 , λ i g i ( x ∗ ) = 0 , i = 1 , 2 , … , m \begin{aligned} \nabla f\left(\mathbf{x}^{*}\right)+\sum_{i=1}^{m} \lambda_{i} \nabla g_{i}\left(\mathbf{x}^{*}\right)+\sum_{j=1}^{p} \mu_{j} \nabla h_{j}\left(\mathbf{x}^{*}\right) &=0, \\ \lambda_{i} g_{i}\left(\mathbf{x}^{*}\right) &=0, \quad i=1,2, \ldots, m \end{aligned} ∇f(x∗)+i=1∑m​λi​∇gi​(x∗)+j=1∑p​μj​∇hj​(x∗)λi​gi​(x∗)​=0,=0,i=1,2,…,m​
则 x ∗ \mathbf{x}^* x∗是最优解
证明:
s ( x ) = f ( x ) + ∑ i = 1 m λ i g i ( x ) + ∑ j = 1 p μ j h j ( x ) s(\mathbf{x})=f(\mathbf{x})+\sum_{i=1}^{m} \lambda_{i} g_{i}(\mathbf{x})+\sum_{j=1}^{p} \mu_{j} h_{j}(\mathbf{x}) s(x)=f(x)+i=1∑m​λi​gi​(x)+j=1∑p​μj​hj​(x)
∇ s ( x ∗ ) = 0 \nabla s(\mathbf{x}^*)=0 ∇s(x∗)=0,所以 s ( x ∗ ) ≤ s ( x ) s(\mathbf{x}^*)\le s(\mathbf{x}) s(x∗)≤s(x)
所以
f ( x ∗ ) = f ( x ∗ ) + ∑ i = 1 m λ i g i ( x ∗ ) + ∑ j = 1 p μ j h j ( x ∗ ) = s ( x ∗ ) ≤ s ( x ) = f ( x ) + ∑ i = 1 m λ i g i ( x ) + ∑ j = 1 p μ j h j ( x ) ≤ f ( x ) \begin{aligned} f\left(\mathbf{x}^{*}\right) &=f\left(\mathbf{x}^{*}\right)+\sum_{i=1}^{m} \lambda_{i} g_{i}\left(\mathbf{x}^{*}\right)+\sum_{j=1}^{p} \mu_{j} h_{j}\left(\mathbf{x}^{*}\right) \\ &=s\left(\mathbf{x}^{*}\right) \\ & \leq s(\mathbf{x}) \\ &=f(\mathbf{x})+\sum_{i=1}^{m} \lambda_{i} g_{i}(\mathbf{x})+\sum_{j=1}^{p} \mu_{j} h_{j}(\mathbf{x}) \\ & \leq f(\mathbf{x}) \end{aligned} f(x∗)​=f(x∗)+i=1∑m​λi​gi​(x∗)+j=1∑p​μj​hj​(x∗)=s(x∗)≤s(x)=f(x)+i=1∑m​λi​gi​(x)+j=1∑p​μj​hj​(x)≤f(x)​

不等式约束问题中KKT条件下Slater条件的必要性

设 x ∗ \mathbf{x}^* x∗是
min ⁡ f ( x ) s.t.  g i ( x ) ≤ 0 , i = 1 , 2 , … , m \begin{array}{ll} \min & f(\mathbf{x}) \\ \text { s.t. } & g_{i}(\mathbf{x}) \leq 0, \quad i=1,2, \ldots, m \end{array} min s.t. ​f(x)gi​(x)≤0,i=1,2,…,m​
的局部最优解
其中 f , g 1 , ⋯ , g m f,g_1,\cdots,g_m f,g1​,⋯,gm​是 R n \mathbb{R}^n Rn上连续可微的函数,
并且 g 1 , ⋯ , g m g_1,\cdots,g_m g1​,⋯,gm​是凸函数
假设存在 x ^ ∈ R n \hat{\mathbf{x}}\in \mathbb{R}^n x^∈Rn,使得
g i ( x ^ ) < 0 , i = 1 , 2 , … , m g_{i}(\hat{\mathbf{x}})<0, \quad i=1,2, \ldots, m gi​(x^)<0,i=1,2,…,m
则存在 λ 1 , ⋯ , λ m ≥ 0 \lambda_1,\cdots,\lambda_m\ge 0 λ1​,⋯,λm​≥0,使得
∇ f ( x ∗ ) + ∑ i = 1 m λ i ∇ g i ( x ∗ ) = 0 , λ i g i ( x ∗ ) = 0 , i = 1 , 2 , … , m \begin{aligned} \nabla f\left(\mathbf{x}^{*}\right)+\sum_{i=1}^{m} \lambda_{i} \nabla g_{i}\left(\mathbf{x}^{*}\right) &=0, \\ \lambda_{i} g_{i}\left(\mathbf{x}^{*}\right) &=0, \quad i=1,2, \ldots, m \end{aligned} ∇f(x∗)+i=1∑m​λi​∇gi​(x∗)λi​gi​(x∗)​=0,=0,i=1,2,…,m​
证明:
因为 x ∗ \mathbf{x}^* x∗是局部最优解,所以一定满足Fritz-John条件
即存在不全为0的 λ 0 , λ 1 , ⋯ , λ m ≥ 0 \lambda_0,\lambda_1,\cdots,\lambda_m\ge 0 λ0​,λ1​,⋯,λm​≥0,使得
λ 0 ∇ f ( x ∗ ) + ∑ i = 1 m λ i ∇ g i ( x ∗ ) = 0 , λ i g i ( x ∗ ) = 0 , i = 1 , 2 , … , m \begin{aligned} \lambda_{0} \nabla f\left(\mathbf{x}^{*}\right)+\sum_{i=1}^{m} \lambda_{i} \nabla g_{i}\left(\mathbf{x}^{*}\right) &=0, \\ \lambda_{i} g_{i}\left(\mathbf{x}^{*}\right) &=0, \quad i=1,2, \ldots, m \end{aligned} λ0​∇f(x∗)+i=1∑m​λi​∇gi​(x∗)λi​gi​(x∗)​=0,=0,i=1,2,…,m​

假设 λ 0 = 0 \lambda_0=0 λ0​=0,则
∑ i = 1 m λ ~ i ∇ g i ( x ∗ ) = 0 \sum_{i=1}^{m} \tilde{\lambda}_{i} \nabla g_{i}\left(\mathbf{x}^{*}\right)=0 i=1∑m​λ~i​∇gi​(x∗)=0
由凸函数的一阶条件
0 > g i ( x ^ ) ≥ g i ( x ∗ ) + ∇ g i ( x ∗ ) T ( x ^ − x ∗ ) 0>g_{i}(\hat{\mathbf{x}}) \geq g_{i}\left(\mathbf{x}^{*}\right)+\nabla g_{i}\left(\mathbf{x}^{*}\right)^{T}\left(\hat{\mathbf{x}}-\mathbf{x}^{*}\right) 0>gi​(x^)≥gi​(x∗)+∇gi​(x∗)T(x^−x∗)
于是
0 > ∑ i = 1 m λ ~ i g i ( x ∗ ) + [ ∑ i = 1 m λ ~ i ∇ g i ( x ∗ ) ] T ( x ^ − x ∗ ) = 0 0>\sum_{i=1}^{m} \tilde{\lambda}_{i} g_{i}\left(\mathbf{x}^{*}\right)+\left[\sum_{i=1}^{m} \tilde{\lambda}_{i} \nabla g_{i}\left(\mathbf{x}^{*}\right)\right]^{T}\left(\hat{\mathbf{x}}-\mathbf{x}^{*}\right)=0 0>i=1∑m​λ~i​gi​(x∗)+[i=1∑m​λ~i​∇gi​(x∗)]T(x^−x∗)=0
矛盾
所以 λ 0 > 0 \lambda_0>0 λ0​>0,然后取 λ i = λ i ~ λ 0 \lambda_i=\frac{\tilde{\lambda_i}}{\lambda_0} λi​=λ0​λi​~​​
得证

广义Slater条件

g i ( x ) ≤ 0 , i = 1 , 2 , … , m , h j ( x ) ≤ 0 , j = 1 , 2 , … , p , s k ( x ) = 0 , k = 1 , 2 , … , q , \begin{array}{ll} g_{i}(\mathbf{x}) \leq 0, & i=1,2, \ldots, m, \\ h_{j}(\mathbf{x}) \leq 0, & j=1,2, \ldots, p, \\ s_{k}(\mathbf{x})=0, & k=1,2, \ldots, q, \end{array} gi​(x)≤0,hj​(x)≤0,sk​(x)=0,​i=1,2,…,m,j=1,2,…,p,k=1,2,…,q,​
其中 g i g_i gi​是凸函数, h j , s k h_j,s_k hj​,sk​是仿射函数
如果存在 x ^ ∈ R n \hat{\mathbf{x}}\in\mathbb{R}^n x^∈Rn,满足
g i ( x ^ ) < 0 , i = 1 , 2 , … , m , h j ( x ^ ) ≤ 0 , j = 1 , 2 , … , p , s k ( x ^ ) = 0 , k = 1 , 2 , … , q , \begin{array}{ll} g_{i}(\hat{\mathbf{x}}) < 0, & i=1,2, \ldots, m, \\ h_{j}(\hat{\mathbf{x}}) \leq 0, & j=1,2, \ldots, p, \\ s_{k}(\hat{\mathbf{x}})=0, & k=1,2, \ldots, q, \end{array} gi​(x^)<0,hj​(x^)≤0,sk​(x^)=0,​i=1,2,…,m,j=1,2,…,p,k=1,2,…,q,​
则满足广义Slater条件

不等式与等式约束问题的KKT条件下的广义Slater条件的必要性

设 x ∗ \mathbf{x}^* x∗是问题
min ⁡ f ( x ) s.t.  g i ( x ) ≤ 0 , i = 1 , 2 , … , m , h j ( x ) ≤ 0 , j = 1 , 2 , … , p , s k ( x ) = 0 , k = 1 , 2 , … , q , \begin{array}{ll} \min & f(\mathbf{x}) \\ \text { s.t. } & g_{i}(\mathbf{x}) \leq 0, \quad i=1,2, \ldots, m, \\ & h_{j}(\mathbf{x}) \leq 0, \quad j=1,2, \ldots, p, \\ & s_{k}(\mathbf{x})=0, \quad k=1,2, \ldots, q, \end{array} min s.t. ​f(x)gi​(x)≤0,i=1,2,…,m,hj​(x)≤0,j=1,2,…,p,sk​(x)=0,k=1,2,…,q,​
的最优解
其中 f , g 1 , ⋯ , g m f,g_1,\cdots,g_m f,g1​,⋯,gm​是在 R n \mathbb{R}^n Rn上的连续可微凸函数, h j , s k h_j,s_k hj​,sk​是仿射函数
如果存在 x ^ ∈ R n \hat{\mathbf{x}}\in\mathbb{R}^n x^∈Rn,满足
g i ( x ^ ) < 0 , i = 1 , 2 , … , m , h j ( x ^ ) ≤ 0 , j = 1 , 2 , … , p , s k ( x ^ ) = 0 , k = 1 , 2 , … , q , \begin{array}{ll} g_{i}(\hat{\mathbf{x}}) < 0, & i=1,2, \ldots, m, \\ h_{j}(\hat{\mathbf{x}}) \leq 0, & j=1,2, \ldots, p, \\ s_{k}(\hat{\mathbf{x}})=0, & k=1,2, \ldots, q, \end{array} gi​(x^)<0,hj​(x^)≤0,sk​(x^)=0,​i=1,2,…,m,j=1,2,…,p,k=1,2,…,q,​
则存在 λ 1 , ⋯ , λ m , η 1 , ⋯ , η p ≥ 0 , μ 1 , ⋯ , μ q ∈ R \lambda_1,\cdots,\lambda_m,\eta_1,\cdots,\eta_p\ge 0,\mu_1,\cdots,\mu_q\in\mathbb{R} λ1​,⋯,λm​,η1​,⋯,ηp​≥0,μ1​,⋯,μq​∈R,使得
∇ f ( x ∗ ) + ∑ i = 1 m λ i ∇ g i ( x ∗ ) + ∑ j = 1 p η j ∇ h j ( x ∗ ) + ∑ k = 1 q μ k ∇ s k ( x ∗ ) = 0 , λ i g i ( x ∗ ) = 0 , i = 1 , 2 , … , m , η j h j ( x ∗ ) = 0 , j = 1 , 2 , … , p . \begin{aligned} \nabla f\left(\mathbf{x}^{*}\right)+\sum_{i=1}^{m} \lambda_{i} \nabla g_{i}\left(\mathbf{x}^{*}\right)+\sum_{j=1}^{p} \eta_{j} \nabla h_{j}\left(\mathbf{x}^{*}\right)+\sum_{k=1}^{q} \mu_{k} \nabla s_{k}\left(\mathbf{x}^{*}\right) &=0, \\ \lambda_{i} g_{i}\left(\mathbf{x}^{*}\right) &=0, \quad i=1,2, \ldots, m, \\ \eta_{j} h_{j}\left(\mathbf{x}^{*}\right) &=0, \quad j=1,2, \ldots, p . \end{aligned} ∇f(x∗)+i=1∑m​λi​∇gi​(x∗)+j=1∑p​ηj​∇hj​(x∗)+k=1∑q​μk​∇sk​(x∗)λi​gi​(x∗)ηj​hj​(x∗)​=0,=0,i=1,2,…,m,=0,j=1,2,…,p.​

二阶优化条件

再说吧

更多推荐

KKT条件学习

本文发布于:2024-02-27 22:58:43,感谢您对本站的认可!
本文链接:https://www.elefans.com/category/jswz/34/1766690.html
版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系,我们将在24小时内删除。
本文标签:条件   KKT

发布评论

评论列表 (有 0 条评论)
草根站长

>www.elefans.com

编程频道|电子爱好者 - 技术资讯及电子产品介绍!