Paper link: Few-shot domain-adaptive anomaly detection for cross-site brain images | IEEE Journals & Magazine | IEEE Xplore
1. TL;DR
1.1. Takeaways
1.2. Paper summary figure
2. Close reading of the paper, section by section
2.1. Abstract
2.2. Introduction
2.3. Related work
2.3.1. Classification of mental disorders
2.3.2. Few-shot learning for anomaly detection
2.3.3. Cross-domain few-shot learning
2.4. Materials
2.4.1. Demographic, clinical and imaging information of data
2.4.2. Preprocessing
2.4.3. Functional connectivity measures
2.5. Proposed algorithm
2.5.1. Problem definition
2.5.2. Deep semi-supervised anomaly detection (DSAD)
2.5.3. Residual correction block (RCB)
2.5.4. Conditional adversarial domain adaptation revisited
2.5.5. Overall formulation of the FAAD algorithm
2.6. Experiment
2.6.1. Baseline method
2.6.2. Implementation details
2.6.3. Results and analysis
2.7. Discussion
2.8. Conclusion
3. Background knowledge
3.1. Hypersphere
3.2. Meta-learning
3.3. Manifold
3.4. Canonical Correlation Analysis (CCA)
4. Reference List
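(3)The Related work section only cites by author name, which is honestly... hard to comment on. Why not give the model names too?
(5)The paper explains why ROI-based FC is used instead of voxel-wise FC: "At the voxel level, FC was not adopted because of its ultra-high dimensionality (on the order of billions) and low signal-to-noise ratio (SNR)."
1.2. Paper summary figure
2. Close reading of the paper, section by section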
2.1. Abstract
①To address the problem that fMRI data come from different sites, the authors propose few-shot domain-adaptive anomaly detection (FAAD)
②They first apply domain adaptation to reduce the differences between sites, and then combine the features from the different sites
③The source dataset is the Human Connectome Project (HCP)
2.2. Introduction
①It is hard to obtain a sufficient number of correctly labeled samples
②⭐Unsupervised methods carry a risk of overfitting because the dimension of functional connectivity is too high, the number of samples is limited and the differences between samples are large
③⭐In reality, healthy people far outnumber Alzheimer's patients. If the data follow this real-world ratio of AD to HC, the accuracy of binary classification may decrease
④⭐Accordingly, they take a large number of healthy samples as the pre-training set and then apply anomaly detection across the various sites
⑥The schematic of their FAAD:
⑦Their contributions: a) they are the first to apply anomaly detection to the classification of psychiatric disorders, b) with one class in the source dataset and two classes (only one new class) in the target dataset, they alleviate the distribution difference between the two classes, c) they align the overall feature distribution and the conditional distribution between the source and target datasets simultaneously
interrater adj. between raters; refers to the agreement or reliability among different raters
delineate v. to describe or explain in detail; to mark out (a boundary)
schematic adj. in outline form; rigorous; simplified; methodical n. a diagram
authenticity n. genuineness, reliability
2.3. Related work
2.3.1. Classification of mental disorders
①Shen et al. classified schizophrenia (SCZ) and HC by locally linear embedding and C-means clustering
②Zeng et al. classified depression and HC by whole brain FC and SVM
③Zeng et al. later classified SCZ and HC with a discriminant autoencoder network with sparsity constraint (DANS), combining data from different sites
④Sui et al. predicted the cognitive domain score of SCZ by extracting features from multimodal MRI images
⑤Li et al. classified posttraumatic stress disorder (PTSD) and HC by dynamic FC
⑥Gopinath et al. predicted the stage of AD with a new learnable graph pooling method
⑦Lian et al. extracted multi-scale features of AD with a hierarchical fully convolutional network (H-FCN)
⑧Mourao-Miranda et al. identified patients through anomaly detection with an SVM, but the study contained only 38 samples
morphometry n. the quantitative measurement of form (shape and size)
2.3.2. Few-shot learning for anomaly detection
①Anomaly detection, also called outlier detection or novelty detection, tries to enclose the training samples (normal samples) within a hypersphere as tightly as possible. Samples that fall outside the hypersphere are treated as abnormal
②A small number of anomalies helps to delineate the hypersphere better
③Lu et al. proposed a few-shot scene-adaptive outlier detection method
④Ding et al. put forward graph deviation networks (GDN) and a new cross-network meta-learning algorithm
⑤Koizumi et al. proposed a few-shot method to train a cascaded specific anomaly detector
⑥Meta-learning is hard to use here because there is only a single domain (diversity is needed) and because, in meta-learning, unseen labels can only be used for fine-tuning
a.k.a. abbr. also known as; used especially to introduce a nickname or stage name
2.3.3. Cross-domain few-shot learning
①Most cross-domain methods focus on the case where the source domain and the target domain share the same label space
②Guan et al. proposed the triplet autoencoder (TriAE) model
③Zhao et al. put forward the domain-adversarial prototypical network (DAPN) model with meta-learning and N-way k-shot classification. N-way k-shot means that the support set has N classes with k samples in each class. A query set that also contains these N classes is then used to measure performance. Because N classes are required, this method cannot be applied to disease classification
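To make the N-way k-shot setting concrete, here is a minimal NumPy sketch of how one episode could be sampled. The function name `sample_episode`, the toy labels and the query size are illustrative assumptions, not taken from the DAPN paper.

```python
import numpy as np

def sample_episode(labels, n_way, k_shot, n_query, rng):
    """Draw one N-way k-shot episode and return (support_idx, query_idx)."""
    labels = np.asarray(labels)
    # Pick N distinct classes for this episode.
    classes = rng.choice(np.unique(labels), size=n_way, replace=False)
    support, query = [], []
    for c in classes:
        idx = rng.permutation(np.flatnonzero(labels == c))
        support.extend(idx[:k_shot])                  # k labelled examples per class
        query.extend(idx[k_shot:k_shot + n_query])    # held-out examples of the same class
    return np.array(support), np.array(query)

rng = np.random.default_rng(0)
toy_labels = np.repeat(np.arange(5), 20)   # 5 toy classes, 20 samples each
support_idx, query_idx = sample_episode(toy_labels, n_way=3, k_shot=5, n_query=4, rng=rng)
```

Every episode needs N classes with at least k labelled samples each, which is what makes the scheme awkward when the source data contain HC only.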
2.4. Materials
①The overall pipeline:
(A)Get time series → FC → input vector
(B)Pretraining: input vector (dimension n(n-1)/2, where n is the number of ROIs) → output vector, trained with a reconstruction loss
(C)Apply three-repeat, three-trial validation, with a random seed in each repeat to shuffle the order of the samples. Randomly select a few normal and abnormal samples from each trial as labelled data; the remaining samples are regarded as the test set
(D)Retain the encoder from (B) and compensate for the differences between domains through the residual correction block and conditional adversarial domain adaptation. The training objective involves both the anomaly detection loss and the domain adaptation loss
②Finally, they measure performance by the AUC on the unlabelled target domain
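Step (C) can be pictured with a small sketch of a single few-shot split. This is only an illustration under assumptions: the label coding (0 = HC, 1 = patient), the function name `few_shot_split` and the toy class sizes are mine, and the repeat/trial bookkeeping of the paper is not reproduced.

```python
import numpy as np

def few_shot_split(y, k_shot, seed):
    """Pick k_shot normal and k_shot abnormal samples as labelled data; the rest is the test set.

    y: 0 = HC (normal), 1 = patient (abnormal); this coding is assumed for the sketch.
    """
    y = np.asarray(y)
    rng = np.random.default_rng(seed)   # a different seed gives a different split
    labelled = np.concatenate([
        rng.choice(np.flatnonzero(y == 0), size=k_shot, replace=False),
        rng.choice(np.flatnonzero(y == 1), size=k_shot, replace=False),
    ])
    test = np.setdiff1d(np.arange(len(y)), labelled)
    return labelled, test

y_toy = np.array([0] * 60 + [1] * 50)   # toy site with 60 HC and 50 patients
labelled_idx, test_idx = few_shot_split(y_toy, k_shot=10, seed=0)
```

With `k_shot=10` the labelled set holds 20 samples and the remaining 90 are evaluated without their labels.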
2.4.1. Demographic, clinical and imaging information of data
①Sites: 7
(1)Source domain
①dataset: The Human Connectome Project (HCP) dataset (HCP S1200)
②Samples: 1053 HC with 483 males and 570 females
③Scanning parameters: spatial resolution = 2×2×2 mm³, repetition time (TR) = 720 ms, echo time (TE) = 33.1 ms, field of view (FOV) = 208×180 mm², slices = 72, flip angle (FA) = 52°, number of TRs = 1200
(2)Target domain
①Datasets: AMU, FMMU#1, FMMU#2, PUTH, UCLA and COBRE (selection criteria: a) rs-fMRI data, b) the same scanner is used throughout a site, c) for each site, the sample size is > 100 when it contains HC and SCZ, and > 150 when it contains SCZ and MDD)
2.4.2. Preprocessing
①Software: SPM8
②Magnetic saturation: the first five frames of the scanned data are discarded
③Slice timing
④Motion correction: scans with excessive head motion during acquisition (>2.5 mm translation and/or >2.5° rotation) are excluded
⑤Normalization with an EPI template in the Montreal Neurological Institute (MNI) atlas space (3-mm isotropic voxels)
⑥Spatial smoothing with a 6-mm full-width at half-maximum (FWHM) Gaussian kernel
⑦Linear detrending and bandpass temporal filtering (0.01–0.08 Hz)
⑧Regression of nuisance variables, including the six parameters obtained by rigid body head motion correction, ventricular and white matter signals, and their first temporal derivatives, quadratic terms, and squares of derivatives
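The nuisance regression in step ⑧ can be sketched as follows: expand the eight base signals (six motion parameters plus ventricular and white-matter signals) with their derivatives, squares and squared derivatives, then remove their least-squares fit from every time series. This is a NumPy illustration on random toy data, not the SPM8 implementation; the function names and the backward-difference derivative are assumptions.

```python
import numpy as np

def expand_nuisance(base):
    """base: (T, k) nuisance signals -> (T, 4k): signals, derivatives, squares, squared derivatives."""
    # Backward difference as the first temporal derivative (first row set to zero).
    deriv = np.vstack([np.zeros((1, base.shape[1])), np.diff(base, axis=0)])
    return np.hstack([base, deriv, base ** 2, deriv ** 2])

def regress_out(ts, nuisance):
    """Remove the least-squares fit of the nuisance regressors (plus an intercept) from each column of ts."""
    X = np.hstack([np.ones((nuisance.shape[0], 1)), nuisance])
    beta, *_ = np.linalg.lstsq(X, ts, rcond=None)
    return ts - X @ beta

rng = np.random.default_rng(0)
T = 200
base = rng.standard_normal((T, 8))                    # toy: 6 motion parameters + 2 tissue signals
regressors = expand_nuisance(base)                    # 8 signals x 4 = 32 regressors
roi_ts = rng.standard_normal((T, 5)) + base[:, :1]    # toy ROI signals contaminated by one motion parameter
cleaned = regress_out(roi_ts, regressors)
```

After regression the cleaned series are orthogonal to every regressor, so the motion-related component injected into `roi_ts` is gone.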
2.4.3. Functional connectivity measures
①AAL atlas lacks information of functional organization
②The 17-network parcellation has a high SNR but does not contain some subcortical regions, such as the thalamus and amygdala, which are regarded as essential for memory, emotional control and various cognitive functions
③Thus, they use the BA512 atlas, which is built with eigen clustering (EIC), an unsupervised method
④Pearson correlation coefficients are computed between the ROI time series under each atlas and then converted with the Fisher r-to-z transformation so that they are approximately normally distributed
⑤Three atlases:
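striatum n. the corpus striatum, part of the telencephalon; thalamus n. [anatomy] the thalamus; (botany) receptacle; amygdala n. [anatomy] the amygdala; tonsil; bitter almond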
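The FC measure described above (Pearson correlation followed by the Fisher r-to-z transformation) can be sketched in a few lines of NumPy. Taking the upper triangle of the correlation matrix as the input vector is an assumption on my part, chosen because it yields the n(n-1)/2 dimension; the 116 ROIs and the random time series are toy values.

```python
import numpy as np

def fc_vector(roi_ts):
    """roi_ts: (T, n) ROI time series -> Fisher z-transformed FC vector of length n(n-1)/2."""
    r = np.corrcoef(roi_ts, rowvar=False)          # (n, n) Pearson correlation matrix
    iu = np.triu_indices_from(r, k=1)              # upper triangle, diagonal excluded
    r_vec = np.clip(r[iu], -0.999999, 0.999999)    # guard against arctanh(+-1) = inf
    return np.arctanh(r_vec)                       # Fisher r-to-z

rng = np.random.default_rng(0)
ts = rng.standard_normal((300, 116))   # toy data: 300 time points, 116 ROIs
x = fc_vector(ts)
```

For 116 ROIs the vector has 116 × 115 / 2 = 6670 entries, far more than the number of subjects at a typical site.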
2.5. Proposed algorithm
2.5.1. Problem definition
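①The source domain is the HCP dataset, which contains only HC
②The target domain consists of the AMU, FMMU#1, FMMU#2, PUTH, UCLA and COBRE datasets
③The labeled target set contains the few labeled target samples, each labeled as either HC or patient
④The unlabeled target set contains the remaining target samples
⑤Notation is introduced for four spaces: the feature space of the source domain, the feature space of the target domain, the label space of the source domain and the label space of the target domain
⑥The source and target feature spaces have the same dimension
⑦⭐The feature distributions of the source and target domains differ (I am actually not sure whether "feature distribution" here means a) the same measures with differently distributed values or b) the same number of measures but different measures)
⑧They aim to alleviate the distribution discrepancy between the source and target domains and to apply anomaly detection on the unlabeled target data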
2.5.2. Deep semi-supervised anomaly detection (DSAD)
①Deep support vector data description (deep SVDD) uses a multi-layer network to map the input space to an output space and pulls the mapped samples towards the center of a hypersphere. The objective minimizes the volume of the hypersphere that encloses all the HC: its first term encloses the HC and its second term is a standard weight decay regularizer with a hyperparameter
②Because only HC samples are available for pre-training, and in order to maximize the mutual information between the input and its encoding, the network is initialized as an autoencoder trained with a reconstruction loss
③The center of the hypersphere is set to the mean of the features of the encoded samples
④After training, the anomaly score of a sample can be its distance from the center in the output space
⑤"Hypersphere collapse" may occur when only HC are used: the radius of the hypersphere shrinks to 0, which eliminates the representation capability of the network. A few labeled abnormal samples can mitigate it
⑥Labeled samples of the two classes (HC and patients) are then introduced
⑦After adding the labeled samples, the objective is extended with a penalty so that the labeled abnormal samples are mapped away from the center
⑧The centers of source domain and target domain are shared
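As a rough illustration of the ideas in this subsection, the sketch below computes a hypersphere center, a distance-based anomaly score and a semi-supervised loss. It follows the formulation of the original Deep SAD paper by Ruff et al. (labels +1 for normal and -1 for anomalous, inverse distance as the penalty for labeled anomalies), which may differ in notation and detail from this paper. The encoder is replaced by random toy features and weight decay is omitted.

```python
import numpy as np

def hypersphere_center(z_normal):
    """Center c = mean of the encoded normal (HC) samples; z_normal: (n, d)."""
    return z_normal.mean(axis=0)

def anomaly_score(z, c):
    """Squared Euclidean distance to the center; larger means more anomalous."""
    return np.sum((z - c) ** 2, axis=1)

def deep_sad_loss(z_unlabelled, z_labelled, y_labelled, c, eta=1.0, eps=1e-6):
    """Deep SAD-style loss without weight decay.

    y_labelled: +1 = labelled normal, -1 = labelled anomaly (coding of the Deep SAD paper).
    """
    d_u = anomaly_score(z_unlabelled, c)
    d_l = anomaly_score(z_labelled, c)
    # Normal: the distance itself. Anomaly: inverse distance, so staying close to c is penalised.
    labelled_term = np.where(y_labelled == 1, d_l, 1.0 / (d_l + eps))
    return (d_u.sum() + eta * labelled_term.sum()) / (len(d_u) + len(d_l))

rng = np.random.default_rng(0)
z_hc = rng.normal(0.0, 1.0, size=(100, 32))                  # toy encodings of HC samples
c = hypersphere_center(z_hc)
z_lab = np.vstack([rng.normal(0.0, 1.0, size=(5, 32)),       # 5 labelled normal samples
                   rng.normal(4.0, 1.0, size=(5, 32))])      # 5 labelled anomalies, far from c
y_lab = np.array([1] * 5 + [-1] * 5)
loss = deep_sad_loss(z_hc, z_lab, y_lab, c)
scores = anomaly_score(z_lab, c)
```

The inverse-distance term is what counters hypersphere collapse: mapping everything onto the center would send the penalty for labeled anomalies towards a very large value.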
2.5.3. Residual correction block (RCB)
①Aligning distributions through a discrepancy loss alone may not completely eliminate the domain discrepancies
②Li et al. put forward the RCB, a two-layer fully connected neural network
③The network extracts task-specific features from both the source data and the target data
④“The source data only needs to go through the original network, while the target data needs to pass the RCB afterward.”
⑤The RCB learns an additional feature for the target data
⑥The integrated target feature combines the original task-specific target feature with the feature learned by the RCB
⑦They further update the objective function, i.e. the DSAD loss, accordingly
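A minimal sketch of a residual correction block, assuming the usual residual form in which the block output is added back to its input. The hidden size, ReLU activation, initialization and class name are assumptions for illustration; only the "two-layer fully connected" structure and the source/target routing come from the notes above.

```python
import numpy as np

def relu(a):
    return np.maximum(a, 0.0)

class ResidualCorrectionBlock:
    """Two-layer fully connected block whose output is added back to its input."""

    def __init__(self, dim, hidden, rng):
        self.W1 = rng.normal(0.0, 0.01, size=(dim, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0.0, 0.01, size=(hidden, dim))
        self.b2 = np.zeros(dim)

    def correction(self, f):
        """Feature learned by the block for the target features f: (B, dim)."""
        return relu(f @ self.W1 + self.b1) @ self.W2 + self.b2

    def __call__(self, f):
        """Integrated target feature = original feature + learned correction."""
        return f + self.correction(f)

rng = np.random.default_rng(0)
f_source = rng.standard_normal((4, 32))   # source features: used as they are
f_target = rng.standard_normal((4, 32))   # target features: passed through the block
rcb = ResidualCorrectionBlock(dim=32, hidden=16, rng=rng)
f_target_corrected = rcb(f_target)
```

Because the correction is additive, a block with zero output weights leaves the target features untouched, so the block only has to learn the domain-specific shift.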
2.5.4. Conditional adversarial domain adaptation revisited
①CDAN was designed for traditional domain adaptation, in which the source and target domains share the same label space
②The domain confusion error:
③They apply:
where denotes the distance between
④The adversarial network:
⑤The domain discriminator
⑥Then, the CDAN can be:
where the entropy criterion
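Two ingredients of the original CDAN formulation (Long et al.) can be sketched in NumPy: the multilinear map that conditions the discriminator input on the prediction, and the entropy-based weight of the "+E" variant. How this paper adapts them to anomaly detection is not recoverable from these notes, so treat the snippet as an illustration of standard CDAN only; the toy features and predictions are made up.

```python
import numpy as np

def multilinear_map(f, g):
    """Per-sample outer product of feature f and prediction g, flattened: (B, d), (B, C) -> (B, d*C)."""
    return np.einsum('bi,bj->bij', f, g).reshape(f.shape[0], -1)

def entropy_weight(g, eps=1e-12):
    """Entropy conditioning w = 1 + exp(-H(g)); confident predictions get larger weights."""
    H = -np.sum(g * np.log(g + eps), axis=1)
    return 1.0 + np.exp(-H)

rng = np.random.default_rng(0)
f = rng.standard_normal((4, 8))                          # toy features
g = np.array([[1.0, 0.0], [0.5, 0.5], [0.9, 0.1], [0.2, 0.8]])   # toy class probabilities
h = multilinear_map(f, g)                                # discriminator input, shape (4, 16)
w = entropy_weight(g)
```

A one-hot prediction has zero entropy and weight 2, while a uniform two-class prediction has entropy ln 2 and weight 1.5, so uncertain samples contribute less to the adversarial loss.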
2.5.5. Overall formulation of the FAAD algorithm
①The Few-shot domain-Adaptive Anomaly Detection (FAAD) combines DSAD and RCB:
③The pseudo code of FAAD+CDANE:
2.6. Experiment
①They compare their model with a) machine learning (SVM) and deep learning (FNN) baselines, b) the original anomaly detection method DSAD, c) domain adaptation models
②They evaluate their model's ability to detect a specific disease and its ability to differentiate across domains with various diseases
2.6.1. Baseline method
①They apply 95% PCA-SVM, presumably PCA retaining 95% of the variance followed by an SVM, because the feature dimension far exceeds the number of samples (is the feature dimension that n(n-1)/2 thing?)
②They construct a BC-DNN from an FNN combined with a fully connected layer and a Softmax layer, then pre-train the BC-DNN to obtain BC-DNN-p
③They go on to introduce other models... (omitted here)
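For the PCA-SVM baseline, the dimensionality reduction step can be sketched with an SVD, assuming "95%" refers to the fraction of variance retained. The function names and toy data are illustrative; the reduced features would then be passed to an SVM classifier, which is left out here.

```python
import numpy as np

def pca_fit(X, var_ratio=0.95):
    """Return (mean, components), keeping the fewest components that explain >= var_ratio of the variance."""
    mean = X.mean(axis=0)
    # SVD of the centred data also works when features far outnumber samples.
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = int(np.searchsorted(explained, var_ratio) + 1)
    return mean, Vt[:k]

def pca_transform(X, mean, components):
    """Project samples onto the retained principal components."""
    return (X - mean) @ components.T

rng = np.random.default_rng(0)
X = rng.standard_normal((40, 500))     # toy data: 40 samples, 500 features
mean, comps = pca_fit(X, var_ratio=0.95)
Z = pca_transform(X, mean, comps)      # low-dimensional features for the classifier
```

With 40 samples the centred data have rank at most 39, so the projection can never have more dimensions than samples, which is the point of applying PCA before the SVM.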
2.6.2. Implementation details
(1)Network and training setup
①Shot: 10-shot and 20-shot applied
②Measurement: AUC
③FNN: the input dimensions of layers 1, 2 and 3 are the original vector dimension, 128 and 32, respectively; learning rate = 0.001; optimizer: Adam
④FAAD and FAAD+CDANE: learning rate of the RCB = 1/10 of the original learning rate; 12 epochs of pre-training and 16 epochs of FAAD training; the learning rate is divided by 10 at the fourth and eighth epochs; batch size = 4; a weight that grows from 0 to 0.1 under the control of a coefficient, which depends on a progress variable that runs from 0 to 1 (I do not fully understand this); dropout ratio = 0.2 (a paragraph that will blow up your head if you look at it twice)
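One plausible reading of the "0 to 0.1" weight is the ramp-up schedule commonly used in adversarial domain adaptation (introduced with DANN), where the weight follows a sigmoid-shaped curve of the training progress p in [0, 1]. The sketch below is an assumption, not the paper's confirmed schedule: `gamma = 10` is the usual DANN default, and the base learning rate and epoch counting in `step_lr` are also assumed.

```python
import numpy as np

def ramp_weight(p, max_weight=0.1, gamma=10.0):
    """DANN-style ramp: 0 at p = 0 and close to max_weight at p = 1 (p = training progress)."""
    return max_weight * (2.0 / (1.0 + np.exp(-gamma * p)) - 1.0)

def step_lr(epoch, base_lr=1e-3, milestones=(4, 8), factor=0.1):
    """Divide the learning rate by 10 at each milestone epoch (epochs counted from 1)."""
    return base_lr * factor ** sum(epoch >= m for m in milestones)

n_epochs = 16
for epoch in range(1, n_epochs + 1):
    p = (epoch - 1) / (n_epochs - 1)   # progress from 0 to 1 over training
    weight, lr = ramp_weight(p), step_lr(epoch)
    # weight would scale the domain adaptation loss; lr would be handed to the optimizer
```

The ramp keeps the adversarial term weak while the features are still noisy and lets it reach nearly full strength by the end of training.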
(2)Data augmentation
②⭐They assume that a partial fMRI scan carries the same label as the full scan
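Under that assumption, augmentation can be sketched as cutting each scan into shorter segments that all inherit the subject's label. The segment length, the stride and the function names are hypothetical; the notes do not record how the paper actually crops the scans.

```python
import numpy as np

def crop_segments(ts, seg_len, stride):
    """Cut a (T, n) time series into overlapping (seg_len, n) segments."""
    starts = range(0, ts.shape[0] - seg_len + 1, stride)
    return [ts[s:s + seg_len] for s in starts]

def augment(ts, label, seg_len, stride):
    """Each partial scan inherits the label of the full scan."""
    return [(seg, label) for seg in crop_segments(ts, seg_len, stride)]

rng = np.random.default_rng(0)
scan = rng.standard_normal((200, 10))   # toy scan: 200 time points, 10 ROIs
samples = augment(scan, label=1, seg_len=100, stride=50)
```

Each segment can then be turned into its own FC vector, so one scan yields several training samples with the same label.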
2.6.3. Results and analysis
They compare the mean AUC of 9 trials
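For reference, the AUC of a set of anomaly scores can be computed directly from its rank interpretation: the probability that a randomly chosen abnormal sample scores higher than a randomly chosen normal one. The sketch below then averages it over 9 toy trials; the scores are random numbers, not results from the paper, and the label coding (1 = abnormal) is assumed.

```python
import numpy as np

def auc_from_scores(scores, y):
    """AUC = P(score of a random anomaly > score of a random normal); ties count 0.5.

    y: 1 = abnormal, 0 = normal.
    """
    scores, y = np.asarray(scores, dtype=float), np.asarray(y)
    pos, neg = scores[y == 1], scores[y == 0]
    diff = pos[:, None] - neg[None, :]   # all abnormal-vs-normal score pairs
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (len(pos) * len(neg))

example_auc = auc_from_scores([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])   # 3 of 4 pairs ordered correctly

rng = np.random.default_rng(0)
y = np.array([0] * 50 + [1] * 50)
trial_aucs = []
for _ in range(9):                                         # 3 repeats x 3 trials
    toy_scores = rng.normal(loc=y.astype(float), scale=1.0)   # anomalies score higher on average
    trial_aucs.append(auc_from_scores(toy_scores, y))
mean_auc, std_auc = np.mean(trial_aucs), np.std(trial_aucs)
```

Reporting the mean and standard deviation over the 9 trials shows how much the result depends on which few samples happened to be labelled.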
(1)FAAD for one mental disorder (SCZ only)
⑧Mean values and standard deviation of AUCs(%):
(2)FAAD for two mental disorders (SCZ & MDD)
(3)Discriminative FC and brain regions
①They combine all the FC vectors in each test set and apply canonical correlation analysis (CCA) to them, then take the mean weight of each FC over the test sets and select the top 10%
②SCZ visualization:
③SCZ or MDD:
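The selection step can be sketched as follows, assuming the per-test-set weights are already available (the CCA itself is not shown), that connections are ranked by mean absolute weight, and that the FC vector was built from the upper triangle of the connectivity matrix. All three are assumptions made for illustration, as are the toy sizes.

```python
import numpy as np

def top_connections(weights, n_roi, top_frac=0.10):
    """weights: (n_tests, n_roi*(n_roi-1)/2) FC weights, one row per test set.

    Returns the ROI pairs (i, j) of the top_frac connections, ranked by mean absolute weight.
    """
    mean_w = np.abs(weights).mean(axis=0)
    k = max(1, int(round(top_frac * mean_w.size)))
    top = np.argsort(mean_w)[::-1][:k]                 # indices of the k largest mean weights
    iu_rows, iu_cols = np.triu_indices(n_roi, k=1)     # map vector index back to an ROI pair
    return list(zip(iu_rows[top].tolist(), iu_cols[top].tolist()))

rng = np.random.default_rng(0)
n_roi = 21                                             # toy atlas: 21 ROIs, 210 connections
toy_weights = rng.normal(size=(9, n_roi * (n_roi - 1) // 2))
toy_weights[:, 0] = 100.0                              # plant one dominant connection, ROI pair (0, 1)
pairs = top_connections(toy_weights, n_roi)
```

Mapping the selected vector indices back to ROI pairs is what allows the discriminative connections to be drawn on a brain map.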
(4)Empirical analysis of parameters
①A grid search shows that FAAD+CDANE is not sensitive to the tuned hyperparameters
②Table of the tuning:
(5)Distribution of anomaly scores
①Anomaly scores in FMMU#1 with AAL:
(6)Brain parcellation and model performance
①Comparison of datasets and atlases:
2.7. Discussion
①This model can also be generalized to other networks
③Most of the HCP samples are young people, which might influence the results
④⭐They did not consider the different pre-processing pipelines used at different sites
2.8. Conclusion
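3. Background knowledge
3.1. Hypersphere
Further reading: Hypersphere, Baidu Baike (baidu.com)
3.2. Meta-learning
Further reading: An introduction to meta-learning in one article (with code), Zhihu (zhihu.com)
3.3. Manifold
Further reading 1: One of the greatest inventions in geometry: the manifold, and the geometric intuition and mathematical methods behind it (baidu.com)
Further reading 2: Manifold, Baidu Baike (baidu.com)
3.4. Canonical Correlation Analysis (CCA)
Further reading: Canonical Correlation Analysis, Zhihu (zhihu.com)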
4. Reference List
Su J. et al. (2021) 'Few-shot domain-adaptive anomaly detection for cross-site brain images', IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1. doi: 10.1109/TPAMI.2021.3125686
