These notes summarize the paper "Detecting sequences of system states in temporal networks" by Naoki Masuda and Petter Holme.
Contents
- Paper link
- bibtex
- Code
- Main content
- Distance measures between networks
- Graph edit distance
- DeltaCon
- The quantum spectral Jensen-Shannon divergence
- The other four spectral distances
Paper link
https://www.nature.com/articles/s41598-018-37534-2
bibtex
@article{DBLP:journals/corr/abs-1803-04755,
author = {Naoki Masuda and Petter Holme},
title = {Detecting sequences of system states in temporal networks},
journal = {CoRR},
volume = {abs/1803.04755},
year = {2018},
url = {http://arxiv.org/abs/1803.04755},
archivePrefix = {arXiv},
eprint = {1803.04755},
timestamp = {Mon, 13 Aug 2018 16:46:49 +0200},
biburl = {https://dblp.org/rec/journals/corr/abs-1803-04755.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
Code
https://github.com/naokimas/state_dynamics
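Main content
A temporal (dynamic) network is described by a sequence of network snapshots. The paper mainly considers networks whose links change over time. In a communication network, for example, contact between a pair of nodes is intermittent.
Suppose each snapshot covers a time window of length $T$. Node pairs that communicate at least once during that window are joined by an edge, and the snapshot is represented by its adjacency matrix. The temporal network is then the sequence of these snapshot adjacency matrices.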
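The snapshot construction can be sketched in a few lines of plain Python. This is only an illustration, not the authors' code: the event format `(t, i, j)`, the function name `build_snapshots`, and the choice to start the first window at the earliest time stamp are all assumptions made here.

```python
def build_snapshots(events, window, num_nodes):
    """Split time-stamped contacts (t, i, j) into consecutive windows of
    length `window` and return one binary adjacency matrix per window.
    Nodes are assumed to be integer indices 0..num_nodes-1 (hypothetical format)."""
    if not events:
        return []
    t_min = min(t for t, _, _ in events)
    t_max = max(t for t, _, _ in events)
    num_windows = int((t_max - t_min) // window) + 1
    snapshots = [[[0] * num_nodes for _ in range(num_nodes)]
                 for _ in range(num_windows)]
    for t, i, j in events:
        k = int((t - t_min) // window)  # index of the window containing t
        snapshots[k][i][j] = 1          # repeated contacts still give a single edge
        snapshots[k][j][i] = 1          # undirected network
    return snapshots


# Toy contact list: (time, node, node)
events = [(0, 0, 1), (1, 1, 2), (5, 0, 2), (6, 0, 2), (11, 1, 2)]
snaps = build_snapshots(events, window=5, num_nodes=3)
for k, snap in enumerate(snaps):
    print(k, snap)
```

With a window of 5 the toy events fall into three snapshots; the two contacts between nodes 0 and 2 in the second window collapse into one edge.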
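The next step is to identify the state that each adjacency matrix belongs to; the core idea is (hierarchical) clustering.
Clustering relies on a distance between elements, which here means a distance between the adjacency matrices of two networks.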
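To make the clustering step concrete, here is a minimal sketch of agglomerative clustering on a precomputed distance matrix. It is not the paper's implementation: average linkage and a fixed number of states are assumptions chosen for illustration, and `agglomerative_states` is a hypothetical name.

```python
def agglomerative_states(dist, num_states):
    """Naive average-linkage agglomerative clustering.
    `dist` is a symmetric matrix of pairwise snapshot distances.
    Returns one state label per snapshot, in time order."""
    n = len(dist)
    clusters = [[i] for i in range(n)]  # start with one cluster per snapshot
    while len(clusters) > num_states:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                avg = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or avg < best[0]:
                    best = (avg, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the closest pair
        del clusters[b]
    labels = [0] * n
    for state, members in enumerate(clusters):
        for i in members:
            labels[i] = state
    return labels


# Toy distance matrix for five snapshots
dist = [
    [0.0, 0.1, 0.9, 0.8, 0.2],
    [0.1, 0.0, 0.8, 0.9, 0.1],
    [0.9, 0.8, 0.0, 0.1, 0.9],
    [0.8, 0.9, 0.1, 0.0, 0.8],
    [0.2, 0.1, 0.9, 0.8, 0.0],
]
labels = agglomerative_states(dist, num_states=2)
print(labels)  # [0, 0, 1, 1, 0]
```

Reading the labels in time order gives the sequence of system states: the toy network starts in state 0, switches to state 1, and returns to state 0.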
Distance measures between networks
Graph edit distance
$$d = N(G_1) + N(G_2) - 2N(G_1 \cap G_2) + M(G_1) + M(G_2) - 2M(G_1 \cap G_2)$$
where $N(\cdot)$ and $M(\cdot)$ denote the number of nodes and the number of edges, respectively.
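The formula counts the nodes and edges that appear in only one of the two graphs. A small sketch with Python sets, where the function name and the input format (a node collection plus a list of undirected edges) are assumptions for illustration:

```python
def graph_edit_distance(nodes1, edges1, nodes2, edges2):
    """d = N(G1) + N(G2) - 2 N(G1 ∩ G2) + M(G1) + M(G2) - 2 M(G1 ∩ G2)."""
    n1, n2 = set(nodes1), set(nodes2)
    # frozenset makes (i, j) and (j, i) the same undirected edge
    e1 = {frozenset(e) for e in edges1}
    e2 = {frozenset(e) for e in edges2}
    node_term = len(n1) + len(n2) - 2 * len(n1 & n2)
    edge_term = len(e1) + len(e2) - 2 * len(e1 & e2)
    return node_term + edge_term


d = graph_edit_distance({0, 1, 2}, [(0, 1), (1, 2)],
                        {1, 2, 3}, [(1, 2), (2, 3), (1, 3)])
print(d)  # 5: two unshared nodes plus three unshared edges
```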
DeltaCon
@article{10.1145/2824443,
author = {Koutra, Danai and Shah, Neil and Vogelstein, Joshua T. and Gallagher, Brian and Faloutsos, Christos},
title = {DeltaCon: Principled Massive-Graph Similarity Function with Attribution},
year = {2016},
issue_date = {February 2016},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {10},
number = {3},
issn = {1556-4681},
url = {https://doi.org/10.1145/2824443},
doi = {10.1145/2824443},
journal = {ACM Trans. Knowl. Discov. Data},
month = feb,
articleno = {28},
numpages = {43},
keywords = {node attribution, anomaly detection, graph classification, culprit nodes and edges, Graph similarity, network monitoring, graph comparison, edge attribution}
}
The quantum spectral Jensen-Shannon divergence
The JS divergence fixes the asymmetry of the KL divergence.
KL divergence:
$$KL(P||Q) = \sum_x P(x)\log\frac{P(x)}{Q(x)}$$
The KL divergence is non-negative (it is zero only when $P = Q$) and asymmetric: in general $KL(P||Q) \neq KL(Q||P)$.
JS divergence:
$$JS(P||Q) = \frac{1}{2}KL(P||M) + \frac{1}{2}KL(Q||M), \quad M = \frac{1}{2}(Q+P)$$
The entropy is defined as:
$$H(P) = -\sum_x P(x)\log P(x)$$
In terms of entropy, and using $\frac{1}{2}(P+Q) = M$ in the last step, the JS divergence becomes:
$$\begin{array}{rl} JS(P||Q) =& \frac{1}{2}KL(P||M) + \frac{1}{2}KL(Q||M) \\ =& \frac{1}{2} \left(\sum_x P(x)\log P(x) - \sum_x P(x)\log M(x) + \sum_x Q(x)\log Q(x) - \sum_x Q(x)\log M(x) \right) \\ =& H(M)-\frac{1}{2} \left( H(P) + H(Q)\right) \end{array}$$
The JS divergence is:
- non-negative, with range $[0,1]$ when the logarithm is taken to base 2;
- symmetric.
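The identity between the KL form and the entropy form is easy to check numerically. The sketch below uses base-2 logarithms (an assumption that makes the range $[0,1]$) and hypothetical helper names:

```python
import math


def entropy(p):
    """Shannon entropy in bits; terms with p(x) = 0 contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def kl(p, q):
    """KL(P||Q) in bits; assumes q(x) > 0 wherever p(x) > 0."""
    return sum(x * math.log2(x / y) for x, y in zip(p, q) if x > 0)


def js(p, q):
    """JS divergence from its definition via the mixture M = (P + Q) / 2."""
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def js_from_entropy(p, q):
    """JS divergence as H(M) - (H(P) + H(Q)) / 2."""
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return entropy(m) - 0.5 * (entropy(p) + entropy(q))


p = [0.5, 0.5, 0.0]
q = [0.0, 0.5, 0.5]
print(js(p, q), js_from_entropy(p, q))            # both 0.5
print(kl([0.9, 0.1], [0.5, 0.5]),
      kl([0.5, 0.5], [0.9, 0.1]))                 # the two KL values differ
print(js([1.0, 0.0], [0.0, 1.0]))                 # 1.0 for disjoint supports
```

Note that `js` never divides by zero even when `q` has zeros, because the mixture $M$ is positive wherever $P$ or $Q$ is.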
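The JS divergence compares two probability distributions. How can it be used to measure the similarity of two networks?
First define the density matrix:
$$\rho = \frac{e^{-\beta L}}{\sum_{i=1}^N e^{-\beta \lambda_i}}$$
where $L = D-A$ is the graph Laplacian, $\lambda_i$ is the $i$-th eigenvalue of $L$, and $e^{-\beta L} = I -\beta L + \frac{1}{2!}\beta^2L^2 - \frac{1}{3!}\beta^3L^3 +\cdots$. How should this expression be read?
In fact, $e^{-tL}$ is the fundamental solution matrix of the diffusion process on the network,
$$\dot{x} = -Lx = (A-D)x,$$
whose general solution is $x = e^{-tL}x_0$, so $\beta$ sets how long diffusion runs on the network.
Hence $\rho$ reflects the diffusion process on the network and can serve as a feature representation of the network. On the other hand, $\rho$ is symmetric, its eigenvalues $e^{-\beta\lambda_i}/\sum_j e^{-\beta\lambda_j}$ are positive, and they sum to 1, so $\rho$ can be viewed as a density matrix in the quantum-mechanical sense (I do not fully understand the physical interpretation yet).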
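The two readings of $e^{-\beta L}$ can be checked numerically. This is a sketch using NumPy's symmetric eigendecomposition rather than the power series; the function names and the toy path graph are assumptions for illustration.

```python
import numpy as np


def heat_kernel(A, t):
    """e^{-tL} for L = D - A, computed via the eigendecomposition of L."""
    A = np.asarray(A, dtype=float)
    L = np.diag(A.sum(axis=1)) - A
    lam, U = np.linalg.eigh(L)               # L is symmetric for undirected graphs
    return (U * np.exp(-t * lam)) @ U.T      # sum_i e^{-t lam_i} u_i u_i^T


def density_matrix(A, beta):
    """rho = e^{-beta L} / sum_i e^{-beta lam_i}; the denominator is the trace."""
    K = heat_kernel(A, beta)
    return K / np.trace(K)


# Path graph 0 - 1 - 2
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]

x0 = np.array([1.0, 0.0, 0.0])               # all mass starts on node 0
x_short = heat_kernel(A, 0.5) @ x0           # partly diffused
x_long = heat_kernel(A, 50.0) @ x0           # close to uniform
rho = density_matrix(A, beta=1.0)

print(x_short, x_short.sum())                # total mass stays 1
print(x_long)                                # approximately [1/3, 1/3, 1/3]
print(np.trace(rho), np.linalg.eigvalsh(rho))
```

Diffusion conserves the total amount of $x$, and for large $t$ the state on a connected graph approaches the uniform vector. The eigenvalues of `rho` are positive and add up to 1, which is what lets it play the role of a density matrix.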
For a density matrix, define the von Neumann entropy:
$$S(\rho) = -\sum_{i=1}^N \tilde\lambda_i \log_2\tilde\lambda_i,$$
where $\tilde\lambda_i$ is the $i$-th eigenvalue of $\rho$.
Using the relation between entropy and the JS divergence, we obtain a distance between two density matrices:
$$d = \sqrt{S\left(\frac{\rho_1 + \rho_2}{2}\right) - \frac{1}{2}\left[S(\rho_1)+S(\rho_2)\right]}$$
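Putting the density matrix, the von Neumann entropy, and the distance together gives the following sketch. It is written from the formulas in these notes, not taken from the authors' repository; the default `beta=1.0` and all function names are assumptions.

```python
import numpy as np


def density_matrix(A, beta):
    """rho = e^{-beta L} / tr(e^{-beta L}) with L = D - A."""
    A = np.asarray(A, dtype=float)
    L = np.diag(A.sum(axis=1)) - A
    lam, U = np.linalg.eigh(L)
    w = np.exp(-beta * lam)
    return ((U * w) @ U.T) / w.sum()


def von_neumann_entropy(rho):
    """S(rho) = -sum_i l_i log2 l_i over the eigenvalues l_i of rho."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]                   # drop numerically zero eigenvalues
    return float(-(lam * np.log2(lam)).sum())


def quantum_js_distance(A1, A2, beta=1.0):
    """Square root of the quantum JS divergence between two networks
    that share the same node set (same matrix size)."""
    rho1 = density_matrix(A1, beta)
    rho2 = density_matrix(A2, beta)
    div = (von_neumann_entropy((rho1 + rho2) / 2)
           - 0.5 * (von_neumann_entropy(rho1) + von_neumann_entropy(rho2)))
    return float(np.sqrt(max(div, 0.0)))     # clip tiny negative rounding errors


path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]

print(quantum_js_distance(path, path))       # 0.0
print(quantum_js_distance(path, triangle))   # strictly between 0 and 1
```

Evaluating `quantum_js_distance` for every pair of snapshots yields the distance matrix that the clustering step consumes.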
The other four spectral distances
For the two Laplacian matrices (combinatorial and normalized):
$$L = D - A, \quad L' = I - D^{-1/2} A D^{-1/2}$$
each is combined with the following two spectral distances, which gives four measures in total:
$$d_1 = \sqrt{\sum_{i=1}^n\left(\lambda_i(G_1) - \lambda_i(G_2)\right)^2}$$
$$d_2 = \sqrt{\frac{\sum_{i=1}^n\left(\lambda_i(G_1) - \lambda_i(G_2)\right)^2}{\max\left\{\sum_{i=1}^n\lambda_i(G_1)^2 , \sum_{i=1}^n\lambda_i(G_2)^2 \right\}}}$$
where $\lambda_i$ denotes the $i$-th largest eigenvalue.
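A NumPy sketch of the four measures follows. The function names are hypothetical, both graphs are assumed to have the same number of nodes, and isolated nodes are handled by setting their $D^{-1/2}$ entry to 0, which is a convention chosen here rather than something stated in the notes.

```python
import numpy as np


def laplacian_spectrum(A, normalized=False):
    """Eigenvalues of L = D - A or L' = I - D^{-1/2} A D^{-1/2}, largest first."""
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)
    if normalized:
        inv_sqrt = np.zeros_like(deg)
        inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5   # assumption: 0 for isolated nodes
        L = np.eye(len(A)) - inv_sqrt[:, None] * A * inv_sqrt[None, :]
    else:
        L = np.diag(deg) - A
    return np.sort(np.linalg.eigvalsh(L))[::-1]


def spectral_distances(A1, A2, normalized=False):
    """Return (d1, d2) for the chosen Laplacian."""
    lam1 = laplacian_spectrum(A1, normalized)
    lam2 = laplacian_spectrum(A2, normalized)
    sq = float(((lam1 - lam2) ** 2).sum())
    denom = max(float((lam1 ** 2).sum()), float((lam2 ** 2).sum()))
    d1 = sq ** 0.5
    d2 = (sq / denom) ** 0.5 if denom > 0 else 0.0  # two empty graphs: distance 0
    return d1, d2


path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]

print(spectral_distances(path, triangle))                   # using L
print(spectral_distances(path, triangle, normalized=True))  # using L'
```

For the path and the triangle the spectra of $L$ are $(3, 1, 0)$ and $(3, 3, 0)$, so $d_1 = 2$ and $d_2 = \sqrt{4/18}$; with $L'$ the spectra are $(2, 1, 0)$ and $(1.5, 1.5, 0)$.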