[Paper close reading] Few-shot domain-adaptive anomaly detection for cross-site brain images
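Paper link: Few-shot domain-adaptive anomaly detection for cross-site brain images | IEEE Journals & Magazine | IEEE Xplore
The English here is typed entirely by hand, summarizing and paraphrasing the original paper. Spelling and grammar mistakes are hard to avoid, so corrections in the comments are welcome. This post is closer to personal notes than to a tutorial; read with caution!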
Contents
1. TL;DR
1.1. Takeaways
1.2. Paper summary figure
2. Paragraph-by-paragraph close reading
2.1. Abstract
2.2. Introduction
2.3. Related work
2.3.1. Classification of mental disorders
2.3.2. Few-shot learning for anomaly detection
2.3.3. Cross-domain few-shot learning
2.4. Materials
2.4.1. Demographic, clinical and imaging information of data
2.4.2. Preprocessing
2.4.3. Functional connectivity measures
2.5. Proposed algorithm
2.5.1. Problem definition
2.5.2. Deep semi-supervised anomaly detection (DSAD)
2.5.3. Residual correction block (RCB)
2.5.4. Conditional adversarial domain adaptation revisited
2.5.5. Overall formulation of the FAAD algorithm
2.6. Experiment
2.6.1. Baseline method
2.6.2. Implementation details
2.6.3. Results and analysis
2.7. Discussion
2.8. Conclusion
3. Background knowledge
3.1. Hypersphere
3.2. Meta-learning
3.3. Manifold
3.4. Canonical Correlation Analysis (CCA)
4. Reference List
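1. TL;DR
1.1. Takeaways
(1)This Intro suddenly shone in my dull days of reading one repetitive paper after another. Is this the charm of TPAMI?
(2)I now suspect that brain-graph classification keeps underperforming partly because the subjects may have other illnesses as well... (Oh my, and then... in Section 3.1 of the paper (not my 3.1; mine is 2.4.1) the authors actually state: "Patients had no history of neurological disease, severe medical illness, substance abuse or electroconvulsive therapy. All healthy controls were unrelated to the SCZ or MDD patients. They were also assessed according to the DSM-IV criteria. None of them had acute physical illness, substance abuse or dependence, a history of head injury resulting in loss of consciousness, or a major psychiatric or neurological illness." I don't know whether other papers do the same; if they do, it is most likely not in the main text)
(3)Listing only author names in Related work is really... hard to judge. Why not give the model names?
(4)The paper also explains why fMRI is used rather than sMRI: "Pathological changes caused by mental disorders are usually functional rather than structural, especially in the early stages."
(5)The paper explains why ROI-based FC is used rather than voxel-wise FC: "Voxel-wise FC was not adopted because of its ultra-high dimensionality (on the order of billions) and low signal-to-noise ratio (SNR)."
(6)I finally understand what a label space is: it is like the indicators measured at different hospitals not actually being the same
(7)My own discussion: it suddenly occurs to me that ROIs should be small for attention-based models and large for ordinary ones
1.2. Paper summary figure
2. Paragraph-by-paragraph close reading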
2.1. Abstract
①To handle fMRI data that come from different sites, the authors propose few-shot domain-adaptive anomaly detection (FAAD)
②They first apply domain adaptation to reduce the differences between sites, and then combine the features from the different sites
③The source-domain dataset is the Human Connectome Project (HCP)
2.2. Introduction
①It is hard to obtain a sufficient number of correctly labeled samples
②⭐Unsupervised methods carry a risk of overfitting, because the dimension of functional connectivity is very high, the number of samples is limited and the differences between samples are significant
③⭐In reality, healthy people far outnumber Alzheimer's patients. If a dataset follows this real-world ratio of AD to HC, the class imbalance may decrease the accuracy of binary classification
④⭐Accordingly, they take a large number of healthy samples as the pre-training set, then apply anomaly detection across the different sites
⑤The authors raise a label-space issue here: they argue that the label space of a purely healthy source domain may differ from that of a target domain containing both healthy and unhealthy subjects, so traditional adaptation methods cannot be used. In their words, "general and conditional domain adaptation need to be applied, so that the feature distributions of the two domains are aligned while the discriminative ability of the trained model is preserved"
⑥The schematic of their FAAD:
⑦Their contributions: a) they are the first to adopt anomaly detection for the classification of psychiatric disorders, b) with one class in the source dataset and two classes (only one of them new) in the target dataset, they alleviate the distribution difference between the two classes, c) they align the general feature distribution and the conditional distribution between the source and target datasets at the same time
interrater adj. between raters: refers to the agreement or reliability across different raters
delineate v. to describe or explain (in detail); to mark out (a boundary)
schematic adj. in outline form; rigorous; tabular; methodical n. a diagram
authenticity n. genuineness, reliability
2.3. Related work
2.3.1. Classification of mental disorders
①Shen et al. classified schizophrenia (SCZ) and HC with locally linear embedding and C-means clustering
②Zeng et al. classified depression and HC with whole-brain FC and an SVM
③Zeng et al. later classified SCZ and HC with a discriminant autoencoder network with sparsity constraint (DANS), combining data from different sites
④Sui et al. predicted the cognitive domain scores of SCZ by extracting features from multimodal MRI images
⑤Li et al. classified posttraumatic stress disorder (PTSD) and HC with dynamic FC
⑥Gopinath et al. predicted the stage of AD with a new learnable graph pooling method
⑦Lian et al. extracted multi-scale features of AD with a hierarchical fully convolutional network (H-FCN)
⑧Mourao-Miranda et al. classified patients through anomaly detection with an SVM, but their data contain only 38 samples
morphometry n. the measurement of form; morphometrics
2.3.2. Few-shot learning for anomaly detection
①Anomaly detection, also called outlier detection or novelty detection, tries to enclose as many of the training samples (normal samples) as possible in a hypersphere. Samples that fall outside the hypersphere are treated as abnormal
②A small number of labeled anomalies helps to delineate the hypersphere better
③Lu et al. proposed a few-shot scene-adaptive outlier detection method
④Ding et al. put forward graph deviation networks (GDN) and a new cross-network meta-learning algorithm
⑤Koizumi et al. proposed a few-shot method to train a cascaded specific anomaly detector
⑥Meta-learning is hard to use here because there is only a single domain (meta-learning needs diversity) and unseen labels can only be used during fine-tuning
a.k.a. abbr. also known as (used especially to introduce a person's nickname or stage name)
2.3.3. Cross-domain few-shot learning
①Most cross-domain methods assume that the source domain and the target domain share the same label space
②Guan et al. proposed the triplet autoencoder (TriAE) model
③Zhao et al. put forward the domain-adversarial prototypical network (DAPN), which uses meta-learning and N-way k-shot classification. N-way k-shot means that the support set has N classes with k samples in each class. There is also a query set, containing the same N classes, that is used to measure performance. Because N classes are required, this method cannot be applied to disease classification here
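To make the N-way k-shot constraint concrete, here is a minimal sketch of episode sampling. The function name `sample_episode` and its parameters are hypothetical and are not taken from DAPN or from the paper; the point is only to show why a source domain with a single class (HC only) cannot supply N-way episodes.

```python
import random
from collections import defaultdict


def sample_episode(labels, n_way, k_shot, n_query, seed=0):
    """Return (classes, support, query) index lists for one N-way k-shot episode."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    # Only classes with enough samples can take part in an episode.
    eligible = sorted(c for c, idxs in by_class.items() if len(idxs) >= k_shot + n_query)
    if len(eligible) < n_way:
        raise ValueError("not enough classes for an N-way episode")
    classes = rng.sample(eligible, n_way)
    support, query = [], []
    for c in classes:
        picked = rng.sample(by_class[c], k_shot + n_query)
        support += picked[:k_shot]   # k labeled examples per class
        query += picked[k_shot:]     # held-out examples used to score the episode
    return classes, support, query
```

With three classes of ten samples each, `sample_episode(labels, n_way=2, k_shot=3, n_query=2)` returns six support and four query indices. With an HC-only label list, the same call raises `ValueError`, which mirrors the limitation described above.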
2.4. Materials
①The overall pipeline:
(A)Compute the FC input vector from the time series
(B)Pretraining: the input vector (dimension $N(N-1)/2$, where $N$ is the number of ROIs) is mapped to an output vector under a reconstruction loss (I don't know how it is used)
(C)Apply three-repeat, three-trial validation to the samples, with a random seed in each repeat to shuffle the sample order. A few normal and abnormal samples are randomly selected in each trial as labeled data. The remaining samples are regarded as the test set
(D)Retain the encoder from (B) and compensate for the differences between domains through the residual correction block and conditional adversarial domain adaptation. The overall loss combines two terms,
where $L_{AD}$ denotes the loss of anomaly detection and $L_{DA}$ denotes the loss of domain adaptation.
②Finally, they measure the performance by the AUC on the unlabeled target domain
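The evaluation protocol above can be sketched roughly as follows. This is only one reading of the notes, under loud assumptions: each of the 3 repeats uses its own seed, each repeat runs 3 trials, `k_shot` labeled samples are drawn per class, and everything else is the unlabeled test set. The function names are hypothetical, and the AUC is computed with the plain rank definition rather than any particular library.

```python
import numpy as np


def few_shot_splits(labels, k_shot, n_repeats=3, n_trials=3, base_seed=0):
    """Yield (repeat, trial, labeled_idx, test_idx); labels are 0 = HC, 1 = patient."""
    labels = np.asarray(labels)
    for repeat in range(n_repeats):
        rng = np.random.default_rng(base_seed + repeat)  # one seed per repeat (assumption)
        for trial in range(n_trials):
            labeled = []
            for cls in (0, 1):
                cls_idx = np.flatnonzero(labels == cls)
                labeled.extend(rng.choice(cls_idx, size=k_shot, replace=False))
            labeled = np.sort(np.array(labeled))
            # Everything that was not picked as labeled data is used for testing.
            test = np.setdiff1d(np.arange(len(labels)), labeled)
            yield repeat, trial, labeled, test


def auc_from_scores(scores, labels):
    """AUC = P(score of a random patient > score of a random HC); ties count 0.5."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    pos, neg = scores[labels == 1], scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))
```

Iterating over `few_shot_splits(labels, k_shot=10)` gives 9 splits, which matches the "mean AUC of 9 trials" reported later in these notes.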
2.4.1. Demographic, clinical and imaging information of data
①Sites: 7
(1)Source domain
①dataset: The Human Connectome Project (HCP) dataset (HCP S1200)
②Samples: 1053 HC with 483 males and 570 females
③Scanning parameters: spatial resolution = 2×2×2 mm³, repetition time (TR) = 720 ms, echo time (TE) = 33.1 ms, field of view (FOV) = 208×180 mm², slices = 72, flip angle (FA) = 52°, number of TRs (frames) = 1200
(2)Target domain
①Datasets: AMU, FMMU#1, FMMU#2, PUTH, UCLA and COBRE (selection criteria: a) rs-fMRI data, b) the same scanner used throughout a site, c) a per-site sample size > 100 when the site contains HC and SCZ, and > 150 when it contains SCZ and MDD)
2.4.2. Preprocessing
①Software: SPM8
②Magnetic saturation: the first five frames of each scan are discarded
③Slice timing correction
④Motion correction: scans with excessive head motion during acquisition (>2.5 mm translation and/or 2.5° rotation) are excluded
⑤Normalization to an EPI template in Montreal Neurological Institute (MNI) space (3-mm isotropic voxels)
⑥Spatial smoothing with a 6-mm full-width at half-maximum (FWHM) Gaussian kernel
⑦Linear detrending and band-pass temporal filtering (0.01–0.08 Hz)
⑧Regression of nuisance variables, including the six parameters obtained by rigid-body head motion correction, ventricular and white matter signals, and their first temporal derivatives, quadratic terms, and squares of derivatives
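As an illustration of step ⑦ only, the sketch below detrends ROI time series and then applies an ideal FFT band-pass at 0.01–0.08 Hz. This is not the SPM8 implementation used in the paper; the FFT-mask filter and the function name `detrend_and_bandpass` are assumptions chosen to keep the example short.

```python
import numpy as np


def detrend_and_bandpass(ts, tr, low=0.01, high=0.08):
    """ts: array of shape (n_frames, n_rois); tr: repetition time in seconds."""
    ts = np.asarray(ts, dtype=float)
    n = ts.shape[0]
    # Linear detrending: subtract the least-squares line from every column.
    design = np.column_stack([np.arange(n), np.ones(n)])
    coef, *_ = np.linalg.lstsq(design, ts, rcond=None)
    detrended = ts - design @ coef
    # Ideal band-pass (assumption): zero every FFT bin outside [low, high] Hz.
    freqs = np.fft.rfftfreq(n, d=tr)
    spectrum = np.fft.rfft(detrended, axis=0)
    keep = (freqs >= low) & (freqs <= high)
    spectrum[~keep] = 0.0
    return np.fft.irfft(spectrum, n=n, axis=0)
```

For HCP-style data one would call it with `tr=0.72`; a real pipeline would more likely use a proper filter design, since an ideal mask can introduce ringing.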
2.4.3. Functional connectivity measures
①The AAL atlas lacks information about functional organization
②The 17-network parcellation has a high SNR but does not contain some subcortical regions, such as the thalamus and amygdala, which are regarded as essential for memory, emotional control and various cognitive functions
③Thus, they also use the BA512 atlas with eigen clustering (EIC), an unsupervised method
④Apply the Pearson correlation coefficient to the time series of each pair of ROIs under each atlas, then use the Fisher r-to-z transformation to bring the values closer to a normal distribution
⑤Three atlases:
striatum n. the striatum (corpus striatum), part of the telencephalon
thalamus n. [anatomy] thalamus; (botany) receptacle
amygdala n. [anatomy] amygdala; tonsil; bitter almond
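The FC computation in ④ can be sketched in a few lines: Pearson correlation between every pair of ROI time series, the upper triangle flattened into a vector of length N(N-1)/2, then the Fisher r-to-z transform (`arctanh`). The function name `fc_vector` and the clipping constant are my own choices, not the paper's.

```python
import numpy as np


def fc_vector(ts, eps=1e-7):
    """ts: (n_frames, n_rois) ROI time series -> Fisher-z FC vector of length N(N-1)/2."""
    r = np.corrcoef(np.asarray(ts, dtype=float), rowvar=False)  # (N, N) Pearson matrix
    upper = np.triu_indices(r.shape[0], k=1)                     # each ROI pair once
    # Clip so that arctanh stays finite if two ROIs are perfectly correlated.
    r_upper = np.clip(r[upper], -1 + eps, 1 - eps)
    return np.arctanh(r_upper)                                   # Fisher r-to-z
```

For example, 5 ROIs give a vector of 10 entries, and `np.tanh` of that vector recovers the original correlations.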
2.5. Proposed algorithm
2.5.1. Problem definition
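①$\mathcal{D}_s$ is the source domain, the HCP dataset, which contains HC samples only
②$\mathcal{D}_t$ is the target domain, the AMU, FMMU#1, FMMU#2, PUTH, UCLA and COBRE datasets
③$\mathcal{D}_t^l$ is the labeled target set, whose labels mark each sample as HC or patient
④$\mathcal{D}_t^u$ is the unlabeled target set
⑤Notation:
$\mathcal{X}_s$: the feature space of the source domain
$\mathcal{X}_t$: the feature space of the target domain
$\mathcal{Y}_s$: the label space of the source domain (HC only). Its class number is 1
$\mathcal{Y}_t$: the label space of the target domain (HC and patients). Its class number is 2
⑥$\mathcal{X}_s$ and $\mathcal{X}_t$ have the same dimension
⑦⭐The feature distributions of the source and target domains differ, namely $P_s(x) \neq P_t(x)$ (I actually don't know whether this feature distribution refers to a) the same indicators with unevenly distributed magnitudes, or b) the same number of indicators but different indicators)
⑧They aim to alleviate the distribution discrepancy between $\mathcal{D}_s$ and $\mathcal{D}_t$ and to apply anomaly detection to $\mathcal{D}_t^u$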
2.5.2. Deep semi-supervised anomaly detection (DSAD)
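①For a deep support vector data description (deep SVDD) network with $L$ layers:
$$\min_{\mathcal{W}} \; \frac{1}{n}\sum_{i=1}^{n} \lVert \phi(x_i;\mathcal{W}) - c \rVert^2 + \frac{\lambda}{2}\sum_{l=1}^{L} \lVert W^l \rVert_F^2$$
where $\phi(\cdot;\mathcal{W}): \mathcal{X} \to \mathcal{F}$, $\mathcal{X}$ denotes the input space and $\mathcal{F}$ denotes the output space;
$\mathcal{W} = \{W^1, \dots, W^L\}$ are the network weights, $x_i$ are the training samples, and $c \in \mathcal{F}$ denotes the center of the hypersphere;
This objective minimizes the volume of the hypersphere that encloses all the HC;
The left term pulls the HC towards the center and the right term is a standard weight decay regularizer with hyperparameter $\lambda$
②Because only HC samples are available for pretraining, and in order to maximize the mutual information $I(X;Z)$ between the input and its latent representation, the network is initialized with an autoencoder trained under a reconstruction loss
③The center $c$ is the mean of the encoded features of all samples:
$$c = \frac{1}{n}\sum_{i=1}^{n} \phi(x_i;\mathcal{W}_0)$$
④The anomaly score after training can be:
$$s(x) = \lVert \phi(x;\mathcal{W}^*) - c \rVert^2$$
⑤"Hypersphere collapse" may occur when only HC are used. It means that the radius of the hypersphere shrinks to 0, which eliminates the representation capability of the network. It can be mitigated with a few labeled abnormal samples
⑥For labeled samples of the two classes, there are $m$ pairs $(\tilde{x}_j, \tilde{y}_j)$ with $\tilde{y}_j \in \{-1, +1\}$, where $+1$ marks a normal sample and $-1$ an abnormal one:
⑦After adding the labeled samples, the objective of the network becomes:
$$\min_{\mathcal{W}} \; \frac{1}{n+m}\sum_{i=1}^{n} \lVert \phi(x_i;\mathcal{W}) - c \rVert^2 + \frac{\eta}{n+m}\sum_{j=1}^{m} \left( \lVert \phi(\tilde{x}_j;\mathcal{W}) - c \rVert^2 \right)^{\tilde{y}_j} + \frac{\lambda}{2}\sum_{l=1}^{L} \lVert W^l \rVert_F^2$$
the labeled abnormal samples are mapped away from the center $c$ by this penalization
⑧The centers of the source domain and the target domain are shared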
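A small numerical sketch may help with this subsection. It follows the standard Deep SVDD and Deep SAD objectives of Ruff et al. (labels +1 for normal, -1 for abnormal), which DSAD refers to; the paper's own notation may differ. The encoder is left out, so `z` stands for already-encoded features, weight decay is assumed to be handled by the optimizer, and all function names are hypothetical.

```python
import numpy as np


def hypersphere_center(z):
    """Center c: mean of the encoded (pretrained) features of the normal samples."""
    return z.mean(axis=0)


def squared_dist(z, c):
    return ((z - c) ** 2).sum(axis=1)


def deep_svdd_loss(z, c):
    """One-class term: mean squared distance of normal samples to the center."""
    return squared_dist(z, c).mean()


def deep_sad_loss(z_unl, z_lab, y_lab, c, eta=1.0, eps=1e-6):
    """Semi-supervised term: labeled normals (+1) are pulled in, labeled
    anomalies (-1) are pushed away through the inverse distance."""
    d_unl = squared_dist(z_unl, c)
    d_lab = squared_dist(z_lab, c)
    lab_term = np.where(y_lab == 1, d_lab, 1.0 / (d_lab + eps))
    return (d_unl.sum() + eta * lab_term.sum()) / (len(z_unl) + len(z_lab))


def anomaly_score(z, c):
    """Larger distance from the center means more anomalous."""
    return squared_dist(z, c)
```

The inverse-distance term is what counters hypersphere collapse in this formulation: mapping everything onto `c` would make the penalty of a labeled anomaly very large.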
2.5.3. Residual correction block (RCB)
①Aligning the distributions by adding a discrepancy loss may not completely eliminate the domain discrepancies
②Li et al. put forward the RCB, a two-layer fully connected neural network
③$f_s$ and $f_t$ are the task-specific features of the source data and the target data
④"The source data only needs to go through the original network, while the target data needs to pass the RCB afterward." Hence (I don't know what this means)
⑤The feature learned by the RCB is denoted as $\Delta f_t$
⑥The integrated target feature: $\hat{f}_t = f_t + \Delta f_t$
⑦They further update the objective function, i.e. the loss of DSAD:
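My current reading of the quoted sentence in ④ is sketched below: source features pass through unchanged, while target features receive an additive correction from a two-layer fully connected block. The hidden size, the ReLU activation, the initialization and the class name are all assumptions for illustration, not details confirmed by the paper.

```python
import numpy as np


class ResidualCorrectionBlock:
    """Two-layer fully connected block that adds a learned residual to target features."""

    def __init__(self, dim, hidden, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.standard_normal((dim, hidden)) * 0.01
        self.b1 = np.zeros(hidden)
        self.w2 = rng.standard_normal((hidden, dim)) * 0.01
        self.b2 = np.zeros(dim)

    def residual(self, f):
        h = np.maximum(f @ self.w1 + self.b1, 0.0)  # ReLU is an assumption
        return h @ self.w2 + self.b2

    def __call__(self, f, is_target):
        # Source features skip the block; target features get f + delta_f.
        return f + self.residual(f) if is_target else f
```

Usage: `rcb = ResidualCorrectionBlock(dim=32, hidden=16)`, then `rcb(f_s, is_target=False)` returns `f_s` itself and `rcb(f_t, is_target=True)` returns the corrected target feature.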
2.5.4. Conditional adversarial domain adaptation revisited
①CDAN was designed for traditional domain adaptation, in which the source and target domains share the same label space
②The domain confusion error:
③They apply:
where the distance term measures the discrepancy between the source and target distributions
④The adversarial network:
⑤The domain discriminator
⑥Then, the CDAN can be:
where the entropy criterion
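Since the formulas of this subsection are hard to follow from the notes alone, here is a sketch of the two ingredients of CDAN with entropy conditioning as described in the original CDAN paper (Long et al.): the multilinear map of features and predictions, and the entropy weight $w = 1 + e^{-H(g)}$. How FAAD defines the prediction vector `g` for anomaly detection is not recoverable from these notes, so a generic probability vector is assumed; the function names and the single global weight normalization are simplifications of mine.

```python
import numpy as np


def multilinear_map(f, g):
    """(n, d) features and (n, c) predictions -> (n, d*c) joint input for the discriminator."""
    return np.einsum("nd,nc->ndc", f, g).reshape(f.shape[0], -1)


def entropy(g, eps=1e-12):
    return -(g * np.log(g + eps)).sum(axis=1)


def entropy_weights(g):
    """Confident predictions (low entropy) get weights close to 2, uncertain ones less."""
    return 1.0 + np.exp(-entropy(g))


def weighted_domain_loss(d_out, domain_labels, weights, eps=1e-12):
    """Entropy-weighted binary cross-entropy of the domain discriminator output."""
    bce = -(domain_labels * np.log(d_out + eps)
            + (1 - domain_labels) * np.log(1 - d_out + eps))
    return (weights * bce).sum() / weights.sum()
```

The design idea is that uncertain samples contribute less to the adversarial alignment, so the discriminator focuses on samples whose predictions can be trusted.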
2.5.5. Overall formulation of the FAAD algorithm
①The Few-shot domain-Adaptive Anomaly Detection (FAAD) combines DSAD and RCB:
②FAAD+CDANE:
③The pseudo code of FAAD+CDANE:
2.6. Experiment
①They compare their model with a) machine learning (SVM) and deep learning (FNN) classifiers, b) the original anomaly detection method DSAD, c) domain adaptation models
②They evaluate their model's ability to detect a specific disease and its ability to differentiate between several diseases
2.6.1. Baseline method
①They apply 95% PCA-SVM because the feature dimension far exceeds the number of samples (is the feature dimension that n(n-1)/2 thing?)
②They construct BC-DNN from an FNN followed by a fully connected layer and a Softmax layer, then pre-train BC-DNN to obtain BC-DNN-p
③They go on to introduce other models... (I omit them here)
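For the PCA-SVM baseline, the dimensionality-reduction half can be sketched with plain NumPy. I am assuming that "95%" means keeping enough principal components to explain 95% of the variance; the SVM itself is left out, and the function names are hypothetical.

```python
import numpy as np


def fit_pca(x, var_ratio=0.95):
    """Return (mean, components) keeping the fewest components that reach var_ratio."""
    mean = x.mean(axis=0)
    _, s, vt = np.linalg.svd(x - mean, full_matrices=False)
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)
    # First index whose cumulative explained variance reaches the threshold.
    k = min(int(np.searchsorted(explained, var_ratio)) + 1, len(s))
    return mean, vt[:k]


def transform(x, mean, components):
    return (x - mean) @ components.T
```

With far more FC features than subjects, at most `n_samples - 1` components carry variance, so the classifier ends up working in a much smaller space than the raw FC vector.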
2.6.2. Implementation details
(1)Network and training setup
①Shot: 10-shot and 20-shot applied
②Measurement: AUC
③FNN: input dimensions of layer 1,2,3 are the original dimension of vector, 128, 32 respectively; learning rate=0.001; optimizer: Adam
④FAAD and FAAD+CDANE: learning rate of the RCB = 1/10 of the original learning rate; 12 epochs of pretraining and 16 epochs of FAAD training; learning rate divided by 10 at the fourth and eighth epochs; batch size = 4; a loss weight ramped from 0 to 0.1 under a schedule coefficient, with the training progress iterating from 0 to 1 (I don't quite understand this); dropout ratio = 0.2 (a paragraph that makes my head explode if I look at it any longer)
⑤DSAD-DANN:
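The two schedules in ④ become easier to read as code. The step schedule follows the notes directly (divide by 10 at epochs 4 and 8). For the weight that grows from 0 to 0.1, I assume the ramp commonly used in DANN, $2/(1+e^{-\gamma p}) - 1$ with $\gamma = 10$ and progress $p \in [0, 1]$; the notes do not preserve the paper's exact formula, so treat this as a guess. Function names are hypothetical.

```python
import math


def step_lr(base_lr, epoch, milestones=(4, 8), factor=0.1):
    """Learning rate at a 1-indexed epoch: divided by 10 at each milestone."""
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * factor ** drops


def ramp_weight(progress, max_weight=0.1, gamma=10.0):
    """DANN-style ramp (assumption): 0 at progress 0, close to max_weight at progress 1."""
    return max_weight * (2.0 / (1.0 + math.exp(-gamma * progress)) - 1.0)
```

Under these assumptions the RCB would simply use `step_lr(base_lr, epoch) / 10`, and the adaptation loss would be multiplied by `ramp_weight(iteration / total_iterations)`.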
(2)Data augmentation
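①Why does the paper say here, once again, that the feature dimension is smaller than the sample size!?
②⭐They assume that a partial fMRI scan carries the same label as the full scan
③⭐"During training, each time course is randomly cropped (the crop should start from the first frame of the scan and be longer than half of the original length) and then used to compute the whole-brain FC. During testing, the augmentation is dropped" (so this counts as augmentation... maybe I just never learned data augmentation properly)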
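The augmentation described above amounts to a random prefix crop followed by the usual FC computation. A minimal sketch, assuming the crop length is drawn uniformly from the lengths greater than half of the scan; the function names are hypothetical:

```python
import numpy as np


def random_prefix_crop(ts, rng):
    """Keep frames [0, length), with length drawn from (n/2, n]."""
    n = ts.shape[0]
    length = int(rng.integers(n // 2 + 1, n + 1))  # upper bound is exclusive
    return ts[:length]


def fisher_fc(ts):
    """Fisher-z transformed upper triangle of the Pearson correlation matrix."""
    r = np.corrcoef(ts, rowvar=False)
    upper = np.triu_indices(r.shape[0], k=1)
    return np.arctanh(np.clip(r[upper], -1 + 1e-7, 1 - 1e-7))
```

During training one would call `fisher_fc(random_prefix_crop(ts, rng))` on every pass; at test time `fisher_fc(ts)` is applied to the full time course.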
2.6.3. Results and analysis
They compare the mean AUC of 9 trials
(1)FAAD for one mental disorder (SCZ only)
①AMU
②FMMU#1
③FMMU#2
④PUTH
⑤UCLA
⑥COBRE
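⑦After this they spend a lot of space on discussion, but it is all based on the experimental results; for me, with no experimental results of my own, it does not mean much for now. So I only read through it once without taking notes
⑧Mean values and standard deviations of the AUCs (%):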
(2)FAAD for two mental disorders (SCZ & MDD)
①AMU
②FMMU#1
(3)Discriminative FC and brain regions
①They combine all the FC vectors in each test set and apply canonical correlation analysis (CCA) to them. They then take the mean weight of each FC across the test sets and select the top 10%
②SCZ visualization:
③SCZ or MDD:
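The selection step in ① can be sketched as below. The CCA itself is not reproduced; the input is assumed to be one weight vector per test set, and ranking by the mean absolute weight is my assumption, since the notes only say "mean weight". The helper that maps a vector index back to an ROI pair assumes the FC vector was built from the upper triangle of the correlation matrix. All names are hypothetical.

```python
import numpy as np


def top_fraction_indices(weights_per_test_set, fraction=0.10):
    """Indices of the FCs with the largest mean absolute weight across test sets."""
    mean_abs = np.mean(np.abs(weights_per_test_set), axis=0)
    k = max(1, int(round(fraction * mean_abs.size)))
    return np.argsort(mean_abs)[::-1][:k]


def vector_index_to_roi_pair(index, n_rois):
    """Map a position in the upper-triangle FC vector back to its (ROI_i, ROI_j) pair."""
    rows, cols = np.triu_indices(n_rois, k=1)
    return int(rows[index]), int(cols[index])
```

The returned indices can then be translated into ROI pairs for the kind of brain-region visualization shown in the figures.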
(4)Empirical analysis of parameters
①A grid search shows that FAAD+CDANE is not sensitive to the tuned hyperparameters
②Table of the tuning:
(5)Distribution of anomaly scores
①Anomaly scores in FMMU#1 with AAL:
(6)Brain parcellation and model performance
①Comparison of datasets and atlases:
2.7. Discussion
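①This model can also be generalized to other networks
②⭐"The definition of the graph and the Laplacian representation of the graph are not always satisfactory", hahahaha, that cracks me up. But the average accuracy here is not that high either: the best can reach 80, yet on average I feel it is around 60 to 70. For 2021 that is actually good enough
③Most of the samples in HCP are young people, which might influence the results
④⭐They did not consider the different preprocessing pipelines of the different sites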
2.8. Conclusion
I can't be bothered to conclude; it is what it is
3. Background knowledge
3.1. Hypersphere
Further reading: 超球面_百度百科 (baidu.com)
3.2. Meta-learning
Further reading: 一文入门元学习(Meta-Learning)(附代码) - 知乎 (zhihu.com)
3.3. Manifold
Further reading 1: 几何学中最伟大的发明之一——流形,其背后的几何直觉与数学方法 (baidu.com)
Further reading 2: 流形_百度百科 (baidu.com)
3.4. Canonical Correlation Analysis (CCA)
Further reading: Canonical Correlation Analysis - 知乎 (zhihu.com)
4. Reference List
Su J. et al. (2021) 'Few-shot domain-adaptive anomaly detection for cross-site brain images', IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1. doi: 10.1109/TPAMI.2021.3125686