本文主要是介绍distance to convex cone,希望对大家解决编程问题提供一定的参考价值,需要的开发者们随着小编来一起学习吧!
基本定义
定义1: 给定 1 ≤ K ≤ N 1 \leq K \leq N 1≤K≤N,定义凸锥(convex cone):
D = { x ∣ 1 T x = 0 , x 1 : N − K ≤ 0 , x ∈ R N } \mathrm{D} = \{ \boldsymbol x | \boldsymbol 1^T \boldsymbol x =0, x_{1: N-K} \leq 0, \boldsymbol x \in \mathbb R^N \} D={x∣1Tx=0,x1:N−K≤0,x∈RN}
定义2: 若 g ∈ R N \boldsymbol g \in \mathbb R^N g∈RN是一个高斯向量,且 g ∼ N ( 0 , I N ) \boldsymbol g \sim \mathcal N(\boldsymbol 0, \boldsymbol I_N) g∼N(0,IN),定义 g \boldsymbol g g到凸锥 D \mathrm D D的映射为:
x = Π D ( g ) , g ∼ N ( 0 , I N ) \boldsymbol x = \Pi_{\mathrm D}(\boldsymbol g), \ \ \boldsymbol g \sim \mathcal N(\boldsymbol 0, \boldsymbol I_N) x=ΠD(g), g∼N(0,IN)
其中 x ∈ R N \boldsymbol x \in \mathbb R^N x∈RN为 g \boldsymbol g g映射到 D \mathrm D D的向量。
定义3:凸锥 D \mathrm D D的统计维数(Statistical Dimension)可以定义为:
δ ( D ) = E [ ∥ Π D ( g ) ∥ 2 ] , g ∼ N ( 0 , I N ) \delta(\mathrm D) = \mathbb E \left [ \left \Vert \Pi_{\mathrm D}(\boldsymbol g) \right \Vert^2 \right], \ \ \boldsymbol g \sim \mathcal N(\boldsymbol 0, \boldsymbol I_N) δ(D)=E[∥ΠD(g)∥2], g∼N(0,IN)
问题描述与建模:建模为凸优化问题
目标:给定 D \mathcal D D,计算得到统计维数(Statistical Dimension) δ ( D ) \delta(\mathrm D) δ(D)。
问题:如果得到映射向量 x = Π D ( g ) \boldsymbol x = \Pi_{\mathrm D}(\boldsymbol g) x=ΠD(g)?
思路:通过建立以下优化问题
min x ∈ D 1 2 ∥ x − g ∥ 2 \min_{\boldsymbol x \in \mathrm D} \frac{1}{2} \left \Vert \boldsymbol x - \boldsymbol g \right \Vert^2 x∈Dmin21∥x−g∥2
可以等价为
min x 1 2 ∥ x − g ∥ 2 subject to x i ≤ 0 , i = 1 , ⋯ , N − K 1 T x = 0 \begin{aligned} & \min_{\boldsymbol x} \frac{1}{2} \left \Vert \boldsymbol x - \boldsymbol g \right \Vert^2 \\ \text{subject to} & \\ & x_i \leq 0, \ \ \ \ i = 1, \cdots, N-K \\ & \boldsymbol 1^T \boldsymbol x =0 \end{aligned} subject toxmin21∥x−g∥2xi≤0, i=1,⋯,N−K1Tx=0
亦可以等价为
min x 1 2 ∥ x − g ∥ 2 subject to e i T x ≤ 0 , i = 1 , ⋯ , N − K 1 T x = 0 \begin{aligned} & \min_{\boldsymbol x} \frac{1}{2} \left \Vert \boldsymbol x - \boldsymbol g \right \Vert^2 \\ \text{subject to} & \\ & \boldsymbol e^T_i \boldsymbol x \leq 0, \ \ \ \ i = 1, \cdots, N-K \\ & \boldsymbol 1^T \boldsymbol x =0 \end{aligned} subject toxmin21∥x−g∥2eiTx≤0, i=1,⋯,N−K1Tx=0
上述问题是一个凸优化问题(满足强对偶性),我们可以借助拉格朗日方法进行求解。
优化问题求解
我们定义上述优化问题的拉格朗日函数,即 L : R N × R N − K × R → R L: \mathbb R^ N \times \mathbb R^{N-K} \times \mathbb R \rightarrow \mathbb R L:RN×RN−K×R→R
L ( x , λ , ν ) = 1 2 ∥ x − g ∥ 2 + ∑ i = 1 N − K λ i e i T x + ν 1 T x L(\boldsymbol x, \boldsymbol \lambda, \nu) = \frac{1}{2} \left \Vert \boldsymbol x - \boldsymbol g \right \Vert^2 + \sum_{i=1}^{N-K} \lambda_i \boldsymbol e^T_i \boldsymbol x + \nu \boldsymbol 1^T \boldsymbol x L(x,λ,ν)=21∥x−g∥2+i=1∑N−KλieiTx+ν1Tx
因为 L ( x , λ , ν ) L(\boldsymbol x, \boldsymbol \lambda, \nu) L(x,λ,ν)关于 x \boldsymbol x x是凸的,所以我们对 L L L求一阶导,得到
∇ x L = x − g + ∑ i = 1 N − K λ i e i + ν 1 \nabla_x L= \boldsymbol x -\boldsymbol g + \sum_{i=1}^{N-K} \lambda_i \boldsymbol e_i + \nu \boldsymbol 1 ∇xL=x−g+i=1∑N−Kλiei+ν1
因此,当 x = g − ∑ i = 1 N − K λ i e i − ν 1 \boldsymbol x = \boldsymbol g - \sum_{i=1}^{N-K} \lambda_i \boldsymbol e_i - \nu \boldsymbol 1 x=g−∑i=1N−Kλiei−ν1, L L L可以取最小值,进一步可以得到对偶函数
h ( λ , ν ) = inf x ∈ R N L ( x , λ , ν ) = L ( x , λ , ν ) ∣ x = g − ∑ i = 1 N − K λ i e i − ν 1 = 1 2 ∥ ∑ i = 1 N − K λ i e i + ν 1 ∥ 2 + ( ∑ i = 1 N − K λ i e i T ) ( g − ∑ i = 1 N − K λ i e i − ν 1 ) + ν 1 T ( g − ∑ i = 1 N − K λ i e i − ν 1 ) = 1 2 ( ∥ λ ∥ 2 + N ν 2 + 2 ν 1 T ( ∑ i = 1 N − K λ i e i ) ) + ( ∑ i = 1 N − K λ i e i T ) g − ∥ λ ∥ 2 − ν 1 T ( ∑ i = 1 N − K λ i e i ) + ν 1 T g − ν 1 T ( ∑ i = 1 N − K λ i e i ) − N ν 2 = − 1 2 ∥ λ ∥ 2 + ( ∑ i = 1 N − K λ i e i T ) g − 1 2 N ν 2 − ν 1 T ( ∑ i = 1 N − K λ i e i ) + ν 1 T g = − 1 2 ∥ λ ∥ 2 + ( ∑ i = 1 N − K λ i e i T ) g − 1 2 ν ( N ν + 2 ⋅ 1 T ( ∑ i = 1 N − K λ i e i ) − 2 ⋅ 1 T g ) \begin{aligned} h(\boldsymbol \lambda, \nu) &= \inf_{x \in \mathbb R^N} L(\boldsymbol x, \boldsymbol \lambda, \nu) \\ &= L(\boldsymbol x, \boldsymbol \lambda, \nu)|_{\boldsymbol x = \boldsymbol g - \sum_{i=1}^{N-K} \lambda_i \boldsymbol e_i - \nu \boldsymbol 1} \\ &= \frac{1}{2} \left \Vert \sum_{i=1}^{N-K} \lambda_i \boldsymbol e_i + \nu \boldsymbol 1 \right \Vert^2 + \left ( \sum_{i=1}^{N-K} \lambda_i \boldsymbol e^T_i \right) \left( \boldsymbol g - \sum_{i=1}^{N-K} \lambda_i \boldsymbol e_i - \nu \boldsymbol 1 \right) \\ & \ \ \ \ \ \ + \nu \boldsymbol 1^T \left( \boldsymbol g - \sum_{i=1}^{N-K} \lambda_i \boldsymbol e_i - \nu \boldsymbol 1 \right) \\ & = \frac{1}{2} \left( \Vert \boldsymbol \lambda \Vert^2 + N \nu^2 + 2 \nu \boldsymbol 1^T \left ( \sum_{i=1}^{N-K} \lambda_i \boldsymbol e_i \right) \right) + \left ( \sum_{i=1}^{N-K} \lambda_i \boldsymbol e^T_i \right) \boldsymbol g - \Vert \boldsymbol \lambda \Vert^2 - \nu \boldsymbol 1^T \left ( \sum_{i=1}^{N-K} \lambda_i \boldsymbol e_i \right) \\ & \ \ \ \ \ \ + \nu \boldsymbol 1^T \boldsymbol g - \nu \boldsymbol 1^T \left ( \sum_{i=1}^{N-K} \lambda_i \boldsymbol e_i \right) - N \nu^2 \\ &= -\frac{1}{2} \Vert \boldsymbol \lambda \Vert^2 + \left ( \sum_{i=1}^{N-K} \lambda_i \boldsymbol e^T_i \right) \boldsymbol g -\frac{1}{2}N \nu^2 - \nu \boldsymbol 1^T \left ( \sum_{i=1}^{N-K} \lambda_i \boldsymbol e_i \right) + \nu \boldsymbol 1^T \boldsymbol g \\ & = -\frac{1}{2} \Vert \boldsymbol \lambda \Vert^2 + \left ( \sum_{i=1}^{N-K} \lambda_i \boldsymbol e^T_i \right) \boldsymbol g - \frac{1}{2} \nu \left( N \nu + 2\cdot \boldsymbol 1^T \left ( \sum_{i=1}^{N-K} \lambda_i \boldsymbol e_i \right) - 2 \cdot \boldsymbol 1^T \boldsymbol g \right) \end{aligned} h(λ,ν)=x∈RNinfL(x,λ,ν)=L(x,λ,ν)∣x=g−∑i=1N−Kλiei−ν1=21 i=1∑N−Kλiei+ν1 2+(i=1∑N−KλieiT)(g−i=1∑N−Kλiei−ν1) +ν1T(g−i=1∑N−Kλiei−ν1)=21(∥λ∥2+Nν2+2ν1T(i=1∑N−Kλiei))+(i=1∑N−KλieiT)g−∥λ∥2−ν1T(i=1∑N−Kλiei) +ν1Tg−ν1T(i=1∑N−Kλiei)−Nν2=−21∥λ∥2+(i=1∑N−KλieiT)g−21Nν2−ν1T(i=1∑N−Kλiei)+ν1Tg=−21∥λ∥2+(i=1∑N−KλieiT)g−21ν(Nν+2⋅1T(i=1∑N−Kλiei)−2⋅1Tg)
又因为 1 T x = 0 \boldsymbol 1^T \boldsymbol x =0 1Tx=0,我们有
1 T ( g − ∑ i = 1 N − K λ i e i − ν 1 ) = 0 ⟹ N ν = 1 T g − 1 T ( ∑ i = 1 N − K λ i e i ) \begin{aligned} & \boldsymbol 1^T \left( \boldsymbol g - \sum_{i=1}^{N-K} \lambda_i \boldsymbol e_i - \nu \boldsymbol 1 \right) = 0 \\ \Longrightarrow & N \nu = \boldsymbol 1^T \boldsymbol g - \boldsymbol 1^T \left ( \sum_{i=1}^{N-K} \lambda_i \boldsymbol e_i \right) \end{aligned} ⟹1T(g−i=1∑N−Kλiei−ν1)=0Nν=1Tg−1T(i=1∑N−Kλiei)
因此,对偶函数 h ( λ , ν ) h(\boldsymbol \lambda, \nu) h(λ,ν)可以继续化简为
h ( λ ) = − 1 2 ∥ λ ∥ 2 + ( ∑ i = 1 N − K λ i e i T ) g − 1 2 1 T g − 1 T ( ∑ i = 1 N − K λ i e i ) N ( 1 T ( ∑ i = 1 N − K λ i e i ) − 1 T g ) = − 1 2 ∥ λ ∥ 2 + ( ∑ i = 1 N − K λ i e i T ) g + 1 2 N ( 1 T ( ∑ i = 1 N − K λ i e i ) − 1 T g ) 2 = − 1 2 ∥ λ ∥ 2 + g 1 : N − K T λ + 1 2 N ( λ T 1 N − K − 1 N T g ) 2 = − 1 2 ∥ λ ∥ 2 + g 1 : N − K T λ + 1 2 N ( λ T 1 N − K 1 N − K T λ − 2 g T 1 N 1 N − K T λ + ∣ 1 N T g ∣ 2 ) = − 1 2 λ T ( I − 1 N 1 N − K 1 N − K T ) λ + ( g 1 : N − K T − g T 1 N 1 N 1 N − K T ) λ + 1 2 N ∣ 1 N T g ∣ 2 = − 1 2 λ T ( I − 1 N 1 N − K 1 N − K T ) λ + ( g 1 : N − K T ( I − 1 N 1 N − K 1 N − K T ) − g N − K + 1 : N T 1 N 1 K 1 N − K T ) λ + 1 2 N ∣ 1 N T g ∣ 2 \begin{aligned} h(\boldsymbol \lambda) &= -\frac{1}{2} \Vert \boldsymbol \lambda \Vert^2 + \left ( \sum_{i=1}^{N-K} \lambda_i \boldsymbol e^T_i \right) \boldsymbol g - \frac{1}{2} \frac{ \boldsymbol 1^T \boldsymbol g - \boldsymbol 1^T \left ( \sum_{i=1}^{N-K} \lambda_i \boldsymbol e_i \right)}{N} \left( \boldsymbol 1^T \left ( \sum_{i=1}^{N-K} \lambda_i \boldsymbol e_i \right) - \boldsymbol 1^T \boldsymbol g \right) \\ &= -\frac{1}{2} \Vert \boldsymbol \lambda \Vert^2 + \left ( \sum_{i=1}^{N-K} \lambda_i \boldsymbol e^T_i \right) \boldsymbol g + \frac{1}{2N} \left( \boldsymbol 1^T \left ( \sum_{i=1}^{N-K} \lambda_i \boldsymbol e_i \right) - \boldsymbol 1^T \boldsymbol g \right)^2 \\ & = -\frac{1}{2} \Vert \boldsymbol \lambda \Vert^2 + \boldsymbol g_{1:N-K}^T \boldsymbol \lambda + \frac{1}{2N} \left( \boldsymbol \lambda^T \boldsymbol 1_{N-K} - \boldsymbol 1^T_N \boldsymbol g \right)^2 \\ &= -\frac{1}{2} \Vert \boldsymbol \lambda \Vert^2 + \boldsymbol g_{1:N-K}^T \boldsymbol \lambda + \frac{1}{2N} \left( \boldsymbol \lambda^T \boldsymbol 1_{N-K} \boldsymbol 1_{N-K}^T \boldsymbol \lambda - 2 \boldsymbol g^T \boldsymbol 1_N \boldsymbol 1^T_{N-K} \boldsymbol \lambda + |\boldsymbol 1^T_N \boldsymbol g|^2\right) \\ &= -\frac{1}{2} \boldsymbol \lambda^T \left( \boldsymbol I - \frac{1}{N} \boldsymbol 1_{N-K} \boldsymbol 1^T_{N-K} \right) \boldsymbol \lambda + \left( \boldsymbol g_{1:N-K}^T -\boldsymbol g^T \frac{1}{N} \boldsymbol 1_N \boldsymbol 1^T_{N-K} \right )\boldsymbol \lambda + \frac{1}{2N} |\boldsymbol 1^T_N \boldsymbol g|^2 \\ & = -\frac{1}{2} \boldsymbol \lambda^T \left( \boldsymbol I - \frac{1}{N} \boldsymbol 1_{N-K} \boldsymbol 1^T_{N-K} \right) \boldsymbol \lambda + \left( \boldsymbol g_{1:N-K}^T \left( \boldsymbol I - \frac{1}{N} \boldsymbol 1_{N-K} \boldsymbol 1^T_{N-K} \right) - \boldsymbol g_{N-K+1:N}^T \frac{1}{N} \boldsymbol 1_{K} \boldsymbol 1^T_{N-K} \right) \boldsymbol \lambda \\ & \ \ \ \ \ \ + \frac{1}{2N} |\boldsymbol 1^T_N \boldsymbol g|^2 \end{aligned} h(λ)=−21∥λ∥2+(i=1∑N−KλieiT)g−21N1Tg−1T(∑i=1N−Kλiei)(1T(i=1∑N−Kλiei)−1Tg)=−21∥λ∥2+(i=1∑N−KλieiT)g+2N1(1T(i=1∑N−Kλiei)−1Tg)2=−21∥λ∥2+g1:N−KTλ+2N1(λT1N−K−1NTg)2=−21∥λ∥2+g1:N−KTλ+2N1(λT1N−K1N−KTλ−2gT1N1N−KTλ+∣1NTg∣2)=−21λT(I−N11N−K1N−KT)λ+(g1:N−KT−gTN11N1N−KT)λ+2N1∣1NTg∣2=−21λT(I−N11N−K1N−KT)λ+(g1:N−KT(I−N11N−K1N−KT)−gN−K+1:NTN11K1N−KT)λ +2N1∣1NTg∣2
令 B = I − 1 N 1 N − K 1 N − K T ∈ R ( N − K ) × ( N − K ) \boldsymbol B = \boldsymbol I - \frac{1}{N} \boldsymbol 1_{N-K} \boldsymbol 1^T_{N-K} \in \mathbb R^{(N-K) \times (N-K)} B=I−N11N−K1N−KT∈R(N−K)×(N−K), C = 1 K 1 N − K T ∈ R K × ( N − K ) \boldsymbol C= \boldsymbol 1_{K} \boldsymbol 1^T_{N-K} \in \mathbb R^{K \times (N-K)} C=1K1N−KT∈RK×(N−K),则 h ( λ ) : R N − K → R h(\boldsymbol \lambda): \mathbb R^{N-K} \rightarrow \mathbb R h(λ):RN−K→R可以写为
h ( λ ) = − 1 2 λ T B λ + g 1 : N − K T B λ − 1 N g N − K + 1 : N T C λ + 1 2 N ∣ 1 N T g ∣ 2 \begin{aligned} h(\boldsymbol \lambda) &= -\frac{1}{2} \boldsymbol \lambda^T \boldsymbol B \boldsymbol \lambda + \boldsymbol g_{1:N-K}^T \boldsymbol B \boldsymbol \lambda - \frac{1}{N} \boldsymbol g_{N-K+1:N}^T \boldsymbol C \boldsymbol \lambda + \frac{1}{2N} |\boldsymbol 1^T_N \boldsymbol g|^2 \end{aligned} h(λ)=−21λTBλ+g1:N−KTBλ−N1gN−K+1:NTCλ+2N1∣1NTg∣2
最终,我们可以把对偶问题写为:
max λ ∈ R N − K − 1 2 λ T B λ + g 1 : N − K T B λ − 1 N g N − K + 1 : N T C λ + 1 2 N ∣ 1 N T g ∣ 2 subject to λ ≥ 0 \begin{aligned} & \max_{\boldsymbol \lambda \in \mathbb R^{N-K}} -\frac{1}{2} \boldsymbol \lambda^T \boldsymbol B \boldsymbol \lambda + \boldsymbol g_{1:N-K}^T \boldsymbol B \boldsymbol \lambda - \frac{1}{N} \boldsymbol g_{N-K+1:N}^T \boldsymbol C \boldsymbol \lambda + \frac{1}{2N} |\boldsymbol 1^T_N \boldsymbol g|^2 \\ \text{subject to} & \\ & \boldsymbol \lambda \geq 0 \end{aligned} subject toλ∈RN−Kmax−21λTBλ+g1:N−KTBλ−N1gN−K+1:NTCλ+2N1∣1NTg∣2λ≥0
进一步,我们分析矩阵 B \boldsymbol B B和 C \boldsymbol C C的相关性质。令 ϵ = K / N \epsilon = K/N ϵ=K/N, q 1 = 1 N − K \boldsymbol q_1 = \frac{1}{\sqrt{N-K}} q1=N−K1则 B \boldsymbol B B可以写为
B = I − 1 N 1 N − K 1 N − K T = I − ( 1 − ϵ ) q 1 q 1 T \begin{aligned} \boldsymbol B & = \boldsymbol I - \frac{1}{N} \boldsymbol 1_{N-K} \boldsymbol 1^T_{N-K} \\ &=\boldsymbol I - (1-\epsilon) \boldsymbol q_1 \boldsymbol q_1^T \end{aligned} B=I−N11N−K1N−KT=I−(1−ϵ)q1q1T
- B 2 = I − ( 1 − ϵ 2 ) q 1 q 1 T \boldsymbol B^{2} = \boldsymbol I - (1-{\epsilon}^2) \boldsymbol q_1 \boldsymbol q_1^T B2=I−(1−ϵ2)q1q1T
- B − 1 = I − ( 1 − 1 ϵ ) q 1 q 1 T \boldsymbol B^{-1} = \boldsymbol I - (1-\frac{1}{\epsilon}) \boldsymbol q_1 \boldsymbol q_1^T B−1=I−(1−ϵ1)q1q1T
- B 1 2 = I − ( 1 − ϵ ) q 1 q 1 T \boldsymbol B^{\frac{1}{2}} = \boldsymbol I - (1-\sqrt{\epsilon}) \boldsymbol q_1 \boldsymbol q_1^T B21=I−(1−ϵ)q1q1T
- B − 1 2 = I − ( 1 − 1 ϵ ) q 1 q 1 T \boldsymbol B^{-\frac{1}{2}} = \boldsymbol I - (1-\frac{1}{\sqrt{\epsilon}}) \boldsymbol q_1 \boldsymbol q_1^T B−21=I−(1−ϵ1)q1q1T
因为 C \boldsymbol C C是一个 K × ( N − K ) K \times (N-K) K×(N−K)的全1矩阵,我们可以得到
B C T = ( 1 − ϵ ) C T B − 1 2 C T = ( 1 + 1 1 + 1 ϵ ) C T \begin{aligned} \boldsymbol {BC}^T = (1-\epsilon) \boldsymbol C^T \\ \boldsymbol B^{-\frac{1}{2}} \boldsymbol {C}^T = (1 + \frac{1}{1+ \frac{1}{\sqrt{ \epsilon}}}) \boldsymbol C^T \end{aligned} BCT=(1−ϵ)CTB−21CT=(1+1+ϵ11)CT
尝试
∇ λ h = − B 1 2 λ + B g 1 : N − K − 1 N C T g N − K + 1 : N = − B 1 2 λ + [ B , − 1 N C T ] g \begin{aligned} \nabla_{\boldsymbol \lambda}h &= - \boldsymbol B^{\frac{1}{2}} \boldsymbol \lambda + \boldsymbol B \boldsymbol g_{1:N-K} - \frac{1}{N} \boldsymbol C^T \boldsymbol g_{N-K+1:N} \\ &= - \boldsymbol B^{\frac{1}{2}} \boldsymbol \lambda + \left[ \boldsymbol B, -\frac{1}{N} \boldsymbol C^T \right] \boldsymbol g \end{aligned} ∇λh=−B21λ+Bg1:N−K−N1CTgN−K+1:N=−B21λ+[B,−N1CT]g
如果令 ∇ λ h = 0 \nabla_{\boldsymbol \lambda}h=0 ∇λh=0,则
λ = B − 1 2 [ B , − 1 N C T ] g = [ B − 1 2 , − 1 N B − 1 2 C T ] g = [ B − 1 2 , − 1 N ( 1 + 1 1 + 1 ϵ ) C T ] g \begin{aligned} \boldsymbol \lambda &= \boldsymbol B^{-\frac{1}{2}} \left[ \boldsymbol B, -\frac{1}{N} \boldsymbol C^T \right] \boldsymbol g \\ &= \left[ \boldsymbol B^{-\frac{1}{2}}, -\frac{1}{N} \boldsymbol B^{-\frac{1}{2}} \boldsymbol C^T \right] \boldsymbol g \\ &= \left[ \boldsymbol B^{-\frac{1}{2}}, -\frac{1}{N}(1 + \frac{1}{1+ \frac{1}{\sqrt{ \epsilon}}}) \boldsymbol C^T \right] \boldsymbol g \end{aligned} λ=B−21[B,−N1CT]g=[B−21,−N1B−21CT]g=[B−21,−N1(1+1+ϵ11)CT]g
因为 g ∼ N ( 0 , I N ) \boldsymbol g \sim \mathcal N(\boldsymbol 0, \boldsymbol I_N) g∼N(0,IN),令 κ ( ϵ ) = ( 1 + 1 1 + 1 ϵ ) \kappa(\epsilon)=(1 + \frac{1}{1+ \frac{1}{\sqrt{ \epsilon}}}) κ(ϵ)=(1+1+ϵ11),我们可以计算得到 λ \boldsymbol \lambda λ的协方差矩阵
Σ λ = [ B − 1 2 , − 1 N κ ( ϵ ) C T ] [ B − 1 2 , − 1 N κ ( ϵ ) C T ] T = B − 1 + 1 N 2 κ ( ϵ ) 2 C T C = B − 1 + 1 N 2 κ ( ϵ ) 2 1 N − K 1 K T 1 K 1 N − K T = B − 1 + κ ( ϵ ) 2 K ( N − K ) N 2 1 N − K ( N − K ) 1 N − K T ( N − K ) = B − 1 + κ ( ϵ ) 2 ϵ ( 1 − ϵ ) 1 N − K ( N − K ) 1 N − K T ( N − K ) = I N − K − ( 1 − 1 ϵ − ϵ ( 1 − ϵ ) κ 2 ( ϵ ) ) q 1 q 1 T \begin{aligned} \boldsymbol \Sigma_{\boldsymbol \lambda} &= \left[ \boldsymbol B^{-\frac{1}{2}}, -\frac{1}{N} \kappa(\epsilon) \boldsymbol C^T \right] \left[ \boldsymbol B^{-\frac{1}{2}}, -\frac{1}{N} \kappa(\epsilon) \boldsymbol C^T \right]^T \\ &= \boldsymbol B^{-1} + \frac{1}{N^2} \kappa(\epsilon)^2 \boldsymbol C^T \boldsymbol C \\ &= \boldsymbol B^{-1} + \frac{1}{N^2} \kappa(\epsilon)^2 \boldsymbol 1_{N-K} \boldsymbol 1_{K}^T \boldsymbol 1_{K} \boldsymbol 1^T_{N-K} \\ & = \boldsymbol B^{-1} + \kappa(\epsilon)^2 \frac{K(N-K)}{N^2} \frac{\boldsymbol 1_{N-K}}{\sqrt(N-K)} \frac{\boldsymbol 1^T_{N-K}}{\sqrt(N-K)} \\ & = \boldsymbol B^{-1} + \kappa(\epsilon)^2 \epsilon(1-\epsilon) \frac{\boldsymbol 1_{N-K}}{\sqrt(N-K)} \frac{\boldsymbol 1^T_{N-K}}{\sqrt(N-K)} \\ & = \boldsymbol I_{N-K} -\left( 1 - \frac{1}{\epsilon} - \epsilon (1- \epsilon ) \kappa^2(\epsilon) \right ) \boldsymbol q_1 \boldsymbol q^T_1 \end{aligned} Σλ=[B−21,−N1κ(ϵ)CT][B−21,−N1κ(ϵ)CT]T=B−1+N21κ(ϵ)2CTC=B−1+N21κ(ϵ)21N−K1KT1K1N−KT=B−1+κ(ϵ)2N2K(N−K)(N−K)1N−K(N−K)1N−KT=B−1+κ(ϵ)2ϵ(1−ϵ)(N−K)1N−K(N−K)1N−KT=IN−K−(1−ϵ1−ϵ(1−ϵ)κ2(ϵ))q1q1T
原问题:
min x 1 2 ∥ x − g ∥ 2 subject to e i T x ≤ 0 , i = 1 , ⋯ , N − K 1 T x = 0 \begin{aligned} & \min_{\boldsymbol x} \frac{1}{2} \left \Vert \boldsymbol x - \boldsymbol g \right \Vert^2 \\ \text{subject to} & \\ & \boldsymbol e^T_i \boldsymbol x \leq 0, \ \ \ \ i = 1, \cdots, N-K \\ & \boldsymbol 1^T \boldsymbol x =0 \end{aligned} subject toxmin21∥x−g∥2eiTx≤0, i=1,⋯,N−K1Tx=0
对偶问题:
max λ ∈ R N − K − 1 2 λ T B λ + g 1 : N − K T B λ − 1 N g N − K + 1 : N T C λ + 1 2 N ∣ 1 N T g ∣ 2 subject to λ ≥ 0 \begin{aligned} & \max_{\boldsymbol \lambda \in \mathbb R^{N-K}} -\frac{1}{2} \boldsymbol \lambda^T \boldsymbol B \boldsymbol \lambda + \boldsymbol g_{1:N-K}^T \boldsymbol B \boldsymbol \lambda - \frac{1}{N} \boldsymbol g_{N-K+1:N}^T \boldsymbol C \boldsymbol \lambda + \frac{1}{2N} |\boldsymbol 1^T_N \boldsymbol g|^2 \\ \text{subject to} & \\ & \boldsymbol \lambda \geq 0 \end{aligned} subject toλ∈RN−Kmax−21λTBλ+g1:N−KTBλ−N1gN−K+1:NTCλ+2N1∣1NTg∣2λ≥0
这篇关于distance to convex cone的文章就介绍到这儿,希望我们推荐的文章对编程师们有所帮助!