[Paper Close Reading]Few-shot domain-adaptive anomaly detection for cross-site brain images


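Paper URL: Few-shot domain-adaptive anomaly detection for cross-site brain images | IEEE Journals & Magazine | IEEE Xplore

The English here was typed by hand, summarizing and paraphrasing the original paper. Spelling and grammar mistakes are hard to avoid entirely; if you spot any, corrections in the comments are welcome. This post is closer to personal notes, so read with caution!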

Contents

1. TL;DR

1.1. Takeaways

1.2. Paper summary figure

2. Section-by-section close reading

2.1. Abstract

2.2. Introduction

2.3. Related work

2.3.1. Classification of mental disorders

2.3.2. Few-shot learning for anomaly detection

2.3.3. Cross-domain few-shot learning

2.4. Materials

2.4.1. Demographic, clinical and imaging information of data

2.4.2. Preprocessing

2.4.3. Functional connectivity measures

2.5. Proposed algorithm

2.5.1. Problem definition

2.5.2. Deep semi-supervised anomaly detection (DSAD)

2.5.3. Residual correction block (RCB)

2.5.4. Conditional adversarial domain adaptation revisited

2.5.5. Overall formulation of the FAAD algorithm

2.6. Experiment

2.6.1. Baseline method

2.6.2. Implementation details

2.6.3. Results and analysis

2.7. Discussion

2.8. Conclusion

3. Background knowledge

3.1. Hypersphere

3.2. Meta-learning

3.3. Manifold

3.4. Canonical Correlation Analysis (CCA)

4. Reference List


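1. TL;DR

1.1. Takeaways

(1)This Intro suddenly shone in my dull days of reading one repetitive paper after another. Is this the charm of TPAMI?

(2)I now suspect that brain-graph classification keeps underperforming partly because the subjects may have other conditions as well... (Oh wait, in the paper's Section 3.1 (not my 3.1; mine is 2.4.1) they actually state that the patients had no history of neurological disease, severe medical illness, substance abuse or electroconvulsive therapy; that all healthy controls were unrelated to the SCZ or MDD patients and were also assessed according to DSM-IV criteria; and that none of them had acute physical illness, substance abuse or dependence, a history of head injury resulting in loss of consciousness, or major psychiatric or neurological illness. I don't know whether other papers do this; if they do, it is most likely not in the main text.)

(3)Citing Related work only by author names is... hard to comment on. Why not give the model names?

(4)The paper also explains why fMRI is used rather than sMRI: pathological changes caused by mental disorders are usually functional rather than structural, especially in the early stages.

(5)The paper explains why ROI-based FC is used rather than voxel-wise FC: voxel-wise FC was not adopted because of its ultra-high dimensionality (on the order of billions) and low signal-to-noise ratio (SNR).

(6)I finally understand what a label space is: it is like the indicators measured at different hospitals not actually being the same

(7)My own discussion: it suddenly seems to me that attention-based models need small ROIs while ordinary models need large ROIs

1.2. Paper summary figure

2. Section-by-section close reading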

2.1. Abstract

        ①To handle fMRI data collected at different sites, the authors propose few-shot domain-adaptive anomaly detection (FAAD)

        ②They first apply domain adaptation to reduce the differences between sites, and then combine the features from different sites

        ③The source-domain database is the Human Connectome Project (HCP)

2.2. Introduction

        ①It is hard to obtain a sufficient number of correctly labeled samples

        ②⭐Unsupervised methods carry a risk of overfitting because the dimension of functional connectivity is very high, the number of samples is limited and the differences between samples are large

        ③⭐In reality, healthy people far outnumber Alzheimer's patients. Keeping that real-world ratio (AD to HC) in the dataset may lower the accuracy of binary classification

        ④⭐Accordingly, they use a large number of healthy samples as the pre-training set and then apply anomaly detection across multiple sites

        ⑤The authors raise a label-space issue here: the label space of a source domain containing only healthy subjects may differ from that of a target domain containing both healthy and unhealthy subjects, so traditional adaptation methods cannot be used. They argue that both general and conditional domain adaptation need to be applied, so that the feature distributions of the two domains are aligned while the discriminative ability of the trained model is preserved

        ⑥The schematic of their FAAD:

        ⑦Their contributions: a) they are the first to apply anomaly detection to the classification of psychiatric disorders, b) with one class in the source dataset and two classes (only one of them new) in the target dataset, they alleviate the distribution difference between the two datasets, c) they align the general feature distribution and the conditional distribution between the source and the target datasets at the same time

interrater  adj. between raters; refers to the agreement or reliability among different raters

delineate  v. to describe or explain in detail; to mark (a boundary)

schematic  adj. in outline form; diagrammatic  n. a simplified diagram

authenticity  n. genuineness, reliability

2.3. Related work

2.3.1. Classification of mental disorders

        ①Shen et al. classified schizophrenia (SCZ) and HC with locally linear embedding and C-means clustering

        ②Zeng et al. classified depression and HC with whole-brain FC and an SVM

        ③Zeng et al. later classified SCZ and HC with a discriminant autoencoder network with sparsity constraint (DANS), combining data from different sites

        ④Sui et al. predicted the cognitive domain scores of SCZ by extracting features from multimodal MRI images

        ⑤Li et al. classified posttraumatic stress disorder (PTSD) and HC with dynamic FC

        ⑥Gopinath et al. predicted the stage of AD with a new learnable graph pooling method

        ⑦Lian et al. extracted multi-scale features of AD with a hierarchical fully convolutional network (H-FCN)

        ⑧Mourao-Miranda et al. classified patients using anomaly detection with an SVM, but the study contains only 38 samples

morphometry  n. the quantitative measurement of shape or form

2.3.2. Few-shot learning for anomaly detection

        ①Anomaly detection, also called outlier detection or novelty detection, tries to enclose all the training samples (normal samples) in a hypersphere as tightly as possible. All samples that fall outside the hypersphere are treated as abnormal

        ②A small number of anomalies helps to delineate the hypersphere better

        ③Lu et al. proposed a few-shot scene-adaptive outlier detection method

        ④Ding et al. put forward graph deviation networks (GDN) and a new cross-network meta-learning algorithm

        ⑤Koizumi et al. proposed a few-shot method to train a cascaded specific anomaly detector

        ⑥Meta-learning is hard to use here because there is only a single domain (it needs diversity) and because, in meta-learning, unseen labels can only be used during fine-tuning

a.k.a.  abbr. also known as (used especially to introduce a nickname or stage name)

2.3.3. Cross-domain few-shot learning

        ①Most cross-domain methods focus on the case where the source domain and the target domain share the same label space

        ②Guan et al. proposed the triplet autoencoder (TriAE) model

        ③Zhao et al. put forward the domain-adversarial prototypical network (DAPN) model with meta-learning and N-way k-shot classification. N-way k-shot means that the support set has N classes with k samples in each class. There is also a query set, covering the same N classes, that is used to measure performance. Because N classes are required, this method cannot be applied to disease classification
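To make the N-way k-shot setting concrete, here is a minimal sketch of how one episode could be sampled. The function name `sample_episode` and the parameter `n_query` are made up for illustration and are not from the paper or from DAPN.

```python
import random


def sample_episode(labels, n_way, k_shot, n_query, seed=0):
    """Sample one N-way k-shot episode from a list of class labels.

    Returns (support, query) as lists of sample indices. Each of the
    n_way chosen classes contributes k_shot support samples and
    n_query query samples.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, lab in enumerate(labels):
        by_class.setdefault(lab, []).append(idx)
    # Only classes with enough samples can take part in an episode.
    eligible = sorted(c for c, idxs in by_class.items()
                      if len(idxs) >= k_shot + n_query)
    if len(eligible) < n_way:
        raise ValueError("not enough classes for an N-way episode")
    support, query = [], []
    for c in rng.sample(eligible, n_way):
        picked = rng.sample(by_class[c], k_shot + n_query)
        support += picked[:k_shot]
        query += picked[k_shot:]
    return support, query


labels = ["a"] * 10 + ["b"] * 10 + ["c"] * 10
support, query = sample_episode(labels, n_way=3, k_shot=2, n_query=3)
```

A source domain that contains only healthy controls has a single class, so a call such as `sample_episode(["HC"] * 20, n_way=2, k_shot=2, n_query=3)` raises `ValueError`, which is the practical reason given above for not using this scheme here.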

2.4. Materials

        ①The overall pipeline:

(A)Get time series \overset{Pearson\, \, correlation}{\rightarrow} FC \overset{vectorize}{\rightarrow} input vector

(B)Pretraining: input vector (dimension N=\frac{n(n-1)}{2}, where n is the number of ROIs) \overset{three-layer\, \, autoencoder}{\rightarrow} output vector, trained with the reconstruction loss L_{reconstruction} (I'm not sure how it is used)

(C)Apply three-repeat three-trial validation to the samples, with a random seed in each repeat to shuffle the order of the samples. From each trial, randomly select a few normal and abnormal samples as labelled data. The rest are used as the test set

(D)Retain the encoder from (B) and compensate for the differences between domains through the residual correction block and conditional adversarial domain adaptation. The total loss is

L_{total}=L_{ad}+L_{da}\left ( \beta \right )

where L_{ad} denotes the loss of anomaly detection and L_{da} denotes the loss of domain adaptation. 

        ②Finally, they measure performance by the AUC on the unlabelled target domain
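Since everything is evaluated by AUC, here is a small sketch of how AUC can be computed from anomaly scores with the pairwise (Mann-Whitney) formulation. It follows the label convention used later in these notes (+1 for HC, -1 for patients); the function name `auc` and the toy numbers are my own.

```python
def auc(scores, labels):
    """AUC of anomaly scores, where a higher score means more abnormal.

    labels use +1 for HC (normal) and -1 for patients (abnormal).
    The AUC is the fraction of (patient, HC) pairs in which the patient
    has the higher score; ties count as half.
    """
    abnormal = [s for s, y in zip(scores, labels) if y == -1]
    normal = [s for s, y in zip(scores, labels) if y == +1]
    if not abnormal or not normal:
        raise ValueError("both classes are needed to compute AUC")
    wins = 0.0
    for a in abnormal:
        for h in normal:
            if a > h:
                wins += 1.0
            elif a == h:
                wins += 0.5
    return wins / (len(abnormal) * len(normal))


# Toy example: two HC and two patients.
toy_auc = auc([0.1, 0.4, 0.35, 0.8], [+1, +1, -1, -1])
```

In the toy example three of the four (patient, HC) pairs are ordered correctly, so `toy_auc` is 0.75.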

2.4.1. Demographic, clinical and imaging information of data

        ①Sites: 7

(1)Source domain

        ①Dataset: The Human Connectome Project (HCP) dataset (HCP S1200)

        ②Samples: 1053 HC with 483 males and 570 females

        ③Scanning parameters: spatial resolution = 2×2×2 mm³, repetition time (TR) = 720 ms, echo time (TE) = 33.1 ms, field of view (FOV) = 208×180 mm², slices = 72, flip angle (FA) = 52°, number of frames (TRs) = 1200

(2)Target domain

        ①Datasets: AMU, FMMU#1, FMMU#2, PUTH, UCLA and COBRE (selection criteria: a) rs-fMRI data, b) the same scanner is used throughout a site, c) per site, the sample size is > 100 when it contains HC and SCZ, and > 150 when it contains SCZ and MDD)

2.4.2. Preprocessing

        ①Software: SPM8

        ②Magnetic saturation: the first five frames of the scanned data are discarded

        ③Slice timing

        ④Motion correction: excluding scans with excessive head motion during acquisition (>2.5 mm translation and/or 2.5° rotation)

        ⑤Normalization with an EPI template in the Montreal Neurological Institute (MNI) atlas space (3-mm isotropic voxels)

        ⑥Spatial smoothing with a 6-mm full-width at half-maximum Gaussian kernel

        ⑦Linear detrending and bandpass temporal filtering (0.01–0.08 Hz)

        ⑧Regression of nuisance variables, including the six parameters obtained by rigid body head motion correction, ventricular and white matter signals, and their first temporal derivatives, quadratic terms, and squares of derivatives

2.4.3. Functional connectivity measures

        ①The AAL atlas lacks information about functional organization

        ②The 17-network parcellation has a high SNR but does not contain some subcortical regions, such as the thalamus and amygdala, which are regarded as essential for memory, emotional control and various cognitive functions

        ③Thus, they use the BA512 atlas with eigen clustering (EIC), an unsupervised method

        ④They compute the Pearson correlation coefficient between the time series under each atlas, then apply the Fisher r-to-z transformation so that the values approximate a normal distribution

        ⑤Three atlases:

striatum  n. corpus striatum    thalamus  n. [anatomy] thalamus; (botany) receptacle     amygdala  n. [anatomy] amygdala; tonsil; bitter almond
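The FC feature extraction described above (Pearson correlation, Fisher r-to-z, then vectorizing the upper triangle into N = n(n-1)/2 values) can be sketched in plain Python as follows. This is only an illustration under my own assumptions: the function names are invented, and clipping r just below ±1 is my choice to keep `atanh` finite, not something stated in the paper.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equally long time series.

    Assumes neither series is constant (non-zero variance).
    """
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def fc_vector(series):
    """Turn n ROI time series into a Fisher-z FC vector of length n(n-1)/2."""
    n = len(series)
    vec = []
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle only, no diagonal
            r = pearson(series[i], series[j])
            r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
            vec.append(math.atanh(r))  # Fisher r-to-z
    return vec


rois = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 1.0, 4.0, 3.0, 5.0],
    [5.0, 3.0, 4.0, 1.0, 2.0],
]
vec = fc_vector(rois)
```

With 3 ROIs the vector has 3 entries; with a few hundred ROIs it quickly reaches tens of thousands of features, which is why the feature dimension dwarfs the sample size in this kind of study.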

2.5. Proposed algorithm

2.5.1. Problem definition

        ①\mathcal{D}_{s}=\{(x_{si},y_{si})\}_{i=1}^{n_{s}}=\{\mathbf{X}_{s},y_{s}\} is the source domain, the HCP dataset, where y_{si}=+1

        ②\mathcal{D}_{t} is the target domain, the AMU, FMMU#1, FMMU#2, PUTH, UCLA and COBRE datasets

        ③\mathcal{D}_{l}=\{(x_{li},y_{li})\}_{i=1}^{n_{l}}=\{\mathbf{X}_{l},y_{l}\} is the labeled target set, where y_{li}=+1 for HC, y_{li}=-1 for patients

        ④\mathcal{D}_{u}=\{(x_{ui})\}_{i=1}^{n_{u}}=\{\mathbf{X}_{u}\} is the unlabeled target set

        ⑤Notation:

\mathcal{X}_{s}: the feature space of the source domain \mathcal{D}_{s}
\mathcal{X}_{t}: the feature space of the target domain \mathcal{D}_{t}
\mathcal{Y}_{s}: the label space of the source domain \mathcal{D}_{s}, with \mathcal{Y}_{s}\subset \mathcal{Y}_{t}. Its class number C_s=1
\mathcal{Y}_{t}: the label space of the target domain \mathcal{D}_{t}. Its class number C_t=2

        ⑥D\left ( \mathcal{X}_{s} \right )=D\left ( \mathcal{X}_{t} \right ) means they have the same dimension

        ⑦⭐The feature distributions of the source and target domains differ, namely P_{s}(X_{s})\neq P_{t}(X_{t}) (I'm actually not sure whether this difference in feature distribution means a) the same measures but with differently distributed values, or b) the same number of measures but different measures)

        ⑧They aim to alleviate the distribution discrepancy between \mathcal{D}_{s} and \mathcal{D}_{l} and then apply anomaly detection to \mathcal{D}_{u}

2.5.2. Deep semi-supervised anomaly detection (DSAD)

        ①The objective of deep support vector data description (deep SVDD) with L layers:

\begin{aligned}\min_{\mathcal{W}}\frac{1}{n}\sum_{i=1}^{n}||\phi(x_{i};\mathcal{W})-c||^{2}+\frac{\lambda}{2}\sum_{l=1}^{L}||\mathbf{W}^{l}||_{F}^{2}\end{aligned}

where \mathcal{X}\subset\mathbb{R}^{D} denotes the input space and \mathcal{Z}\subset\mathbb{R}^d denotes the output space;

\mathcal{W}=\{\mathbf{W}^{1},...,\mathbf{W}^{L}\} is the set of network weights, x_{1},...,x_{n}\in\mathcal{X} are the training samples, and c denotes the center of the hypersphere;

This objective minimizes the volume of the hypersphere that encloses all the HC;

The first term pulls the HC toward the center and the second term is a standard weight decay regularizer with hyperparameter \lambda > 0

        ②Because only HC samples are available for training, and in order to maximize the mutual information \mathcal{I}(\mathcal{X},\mathcal{Z}), the network is initialized with an autoencoder trained with a reconstruction loss

        ③The center c is the mean of the encoded features of all source samples under the initial weights \mathcal{W}_{0}:

c=\frac{1}{n}\sum_{i=1}^{n}\phi(x_{si};\mathcal{W}_{0})

        ④The anomaly score after training can be:

s(x)=\|\phi(x;\mathcal{W})-c\|^2

        ⑤"Hypersphere collapse" may occur when only HC are used: the radius of the hypersphere shrinks to 0, which destroys the representation capability of the network. It can be mitigated with a few labeled abnormal samples

        ⑥The labeled samples of the two classes are:

\begin{aligned}&(x_{t1},y_{t1}),...,(x_{tm},y_{tm}),\\&(x_{t(m+1)},y_{t(m+1)}),...,(x_{t(2m)},y_{t(2m)})\in\mathcal{X}_t\times\mathcal{Y}_t\end{aligned}

        ⑦After adding the labeled samples, the objective becomes:

\begin{aligned} \min_{\mathcal{W}}& \frac{1}{n}\sum_{i=1}^n(||\phi(x_{si};\mathcal{W})-c||^2)^{y_{si}} \\ &+\frac{1}{2m}\sum_{j=1}^{2m}(||\phi(x_{tj};\mathcal{W})-c||^{2})^{y_{tj}}+\frac{\lambda}{2}\sum_{l=1}^{L}||\mathbf{W}^{l}||_{F}^{2} \end{aligned}

The labeled abnormal samples are pushed away from the center: with y=-1 the term becomes the inverse of the squared distance, so a small distance is penalized

        ⑧The source domain and the target domain share the same center
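To see how the exponent y works in the DSAD objective, here is a minimal sketch that operates on already-encoded feature vectors (so no network is needed). The weight-decay term is left out, the function names are mine, and the small `eps` added before exponentiation is my own assumption for numerical safety rather than something taken from the paper.

```python
def center(features):
    """Hypersphere center c: the mean of the encoded source features."""
    dim = len(features[0])
    return [sum(f[k] for f in features) / len(features) for k in range(dim)]


def anomaly_score(z, c):
    """s(x) = ||phi(x) - c||^2 for an encoded sample z."""
    return sum((a - b) ** 2 for a, b in zip(z, c))


def dsad_loss(src_feats, tgt_feats, tgt_labels, c, eps=1e-6):
    """Distance terms of the DSAD objective (weight decay omitted).

    Source samples are all HC (y = +1). Target labels are +1 for HC and
    -1 for patients, so a patient contributes 1 / distance and is
    penalized for sitting close to the center.
    """
    src_term = sum(anomaly_score(z, c) for z in src_feats) / len(src_feats)
    tgt_term = sum((anomaly_score(z, c) + eps) ** y
                   for z, y in zip(tgt_feats, tgt_labels)) / len(tgt_feats)
    return src_term + tgt_term


src = [[0.0, 0.0], [2.0, 0.0]]
c = center(src)  # [1.0, 0.0]
far = dsad_loss(src, [[1.0, 1.0], [3.0, 0.0]], [+1, -1], c)
near = dsad_loss(src, [[1.0, 1.0], [1.5, 0.0]], [+1, -1], c)
```

Moving the labeled patient from distance 2 to distance 0.5 from the center raises the loss (`near > far`), which is the push-away effect described in ⑦.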

2.5.3. Residual correction block (RCB)

        ①Aligning distributions through a discrepancy loss alone may not completely eliminate the domain discrepancies

        ②Li et al. put forward the RCB, a two-layer fully connected neural network, for the setting where \mathcal{Y}_{t}\subset\mathcal{Y}_{s}

        ③\phi_{s}(x_{s}) and \phi_{t}(x_{t}) are the task-specific features of source data x_s and target data x_t

        ④“The source data x_s only needs to go through the original network, while the target data x_t needs to pass the RCB afterward.” Hence \phi_{s}(x_{s})=\phi(x_{s}) (I don't know what this means)

        ⑤The feature learned by the RCB is denoted as \Delta\phi_{s}(x_{t})

        ⑥The integrated target feature: \phi_{t}(x_{t})=\phi_{s}(x_{t})+\Delta\phi_{s}(x_{t})

        ⑦They further update the objective function, i.e. the loss of DSAD:

\begin{aligned} L_{ad}=& \begin{aligned}\frac{1}{n}\sum_{i=1}^{n}(||\phi_{s}(x_{si};\mathcal{W})-c||^{2})^{y_{si}}\end{aligned} \\ &+\frac1{2m}\sum_{j=1}^{2m}(||\phi_{t}(x_{tj};\mathcal{W})-c||^{2})^{y_{tj}}+\frac\lambda2\sum_{l=1}^{L}||\mathbf{W}^{l}||_{F}^{2} \end{aligned}
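My reading of the RCB, written as a tiny pure-Python sketch: source samples only go through the shared encoder, while target samples additionally get a learned correction added on top. The class name, the ReLU between the two layers, the random initialization and feeding the encoder output (rather than the raw input) into the RCB are all assumptions of mine, not details confirmed by the paper.

```python
import random


def linear(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


class ResidualCorrectionBlock:
    """Two fully connected layers that output the correction delta_phi_s(x_t)."""

    def __init__(self, dim, hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
                   for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.w2 = [[rng.uniform(-0.1, 0.1) for _ in range(hidden)]
                   for _ in range(dim)]
        self.b2 = [0.0] * dim

    def __call__(self, feat):
        hidden = [max(0.0, h) for h in linear(feat, self.w1, self.b1)]  # ReLU
        return linear(hidden, self.w2, self.b2)


def features(x, is_target, encoder, rcb):
    """phi_s(x) for source data, phi_s(x) + delta_phi_s(x) for target data."""
    feat = encoder(x)
    if not is_target:
        return feat
    return [f + d for f, d in zip(feat, rcb(feat))]


def toy_encoder(x):
    """Stand-in for the pretrained encoder (hypothetical)."""
    return [2.0 * v for v in x]


rcb = ResidualCorrectionBlock(dim=3, hidden=4)
src_feat = features([1.0, 2.0, 3.0], False, toy_encoder, rcb)
tgt_feat = features([1.0, 2.0, 3.0], True, toy_encoder, rcb)
```

Under this reading, \phi_{s}(x_{s})=\phi(x_{s}) simply says that the source path is the unmodified encoder, and only the target path carries the extra residual.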

2.5.4. Conditional adversarial domain adaptation revisited

        ①CDAN was designed for traditional domain adaptation, in which the source and target domains share the same label space

        ②The domain confusion error:

\begin{aligned}L_{dc}&=-\frac{1}{n}\sum_{i=1}^{n}\log[D(\phi_s(x_{si}),g(x_{si}))]\\&-\frac{1}{2m}\sum_{j=1}^{2m}\log[1-D(\phi_t(x_{tj}),g(x_{tj}))]\end{aligned}

        ③They apply:

\begin{aligned}&\{g(x_1),g(x_2),...,g(x_B)\}\\&=\text{softmax}(\{-s(x_1),-s(x_2),...,-s(x_B)\})\end{aligned}

where s\left ( x_i \right ) denotes the distance between x_i and c

        ④The adversarial objectives are:

\begin{aligned}&\min_\phi L_{ad}(\phi)-\beta L_{dc}(D,g)\\&\min_DL_{dc}(D,g)\end{aligned}

        ⑤The domain discriminator D(\phi,g)=D(\phi\otimes g)

        ⑥Then, the CDAN objective can be written as: 

\begin{aligned} &\min_{\phi}L_{ad}(\phi)+\beta(\frac{1}{n}\sum_{i=1}^{n}w(g(x_{si}))\log[D(\phi_{s}(x_{si})\otimes g(x_{si}))] \\ &+\frac{1}{2m}\sum_{j=1}^{2m}w(g(x_{tj}))\log[1-D(\phi_{t}(x_{tj})\otimes g(x_{tj}))]) \\ &\max_{D} \frac{1}{n}\sum_{i=1}^{n}w(g(x_{si}))\log[D(\phi_{s}(x_{si})\otimes g(x_{si}))] \\ &+\frac{1}{2m}\sum_{j=1}^{2m}w(g(x_{tj}))\log[1-D(\phi_{t}(x_{tj})\otimes g(x_{tj}))]. \end{aligned}

where the entropy criterion w(g)=1+e^{-g}
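The conditioning pieces above can be sketched in a few lines. This follows the formulas exactly as written in these notes: g is a softmax over the negated anomaly scores within a batch, and the weight is w(g)=1+e^{-g}. Because g is then a single number per sample, I treat \phi\otimes g as a plain scaling of the feature vector; that interpretation, like the function names, is my own assumption.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def conditioning(scores):
    """g(x_1..x_B) = softmax(-s(x_1), ..., -s(x_B)) across one batch."""
    return softmax([-s for s in scores])


def entropy_weight(g):
    """w(g) = 1 + exp(-g), as written in the notes."""
    return 1.0 + math.exp(-g)


def discriminator_input(feat, g):
    """phi (x) g for a scalar g: the feature vector scaled by g."""
    return [g * f for f in feat]


batch_scores = [0.5, 2.0, 1.0]  # squared distances to the center c
g = conditioning(batch_scores)
weights = [entropy_weight(v) for v in g]
```

The sample closest to the center (score 0.5) gets the largest g, so samples that look most normal dominate the conditioning within a batch.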

2.5.5. Overall formulation of the FAAD algorithm

        ①The Few-shot domain-Adaptive Anomaly Detection (FAAD) combines DSAD and RCB:

\begin{aligned} \min_{\phi}& \frac{1}{n}\sum_{i=1}^{n}(||\phi_{s}(x_{si};\mathcal{W})-c||^{2})^{y_{si}} \\ &+\frac{1}{2m}\sum_{j=1}^{2m}(||\phi_{t}(x_{tj};\mathcal{W})-c||^{2})^{y_{tj}}+\frac{\lambda}{2}\sum_{l=1}^{L}||\mathbf{W}^{l}||_{F}^{2} \end{aligned}

        ②FAAD+CDANE:

\begin{aligned} &\min_{\phi} \begin{aligned}\frac{1}{n}\sum_{i=1}^n(||\phi_s(x_{si};\mathcal{W})-c||^2)^{y_{si}}\end{aligned} \\ &+\frac1{2m}\sum_{j=1}^{2m}(||\phi_{t}(x_{tj};\mathcal{W})-c||^{2})^{y_{tj}}+\frac\lambda2\sum_{l=1}^{L}||\mathbf{W}^{l}||_{F}^{2} \\ &+\beta(\frac1n\sum_{i=1}^nw(g(x_{si}))\log[D(\phi_s(x_{si})\otimes g(x_{si}))] \\ &+\frac1{2m}\sum_{j=1}^{2m}w(g(x_{tj}))\log[1-D(\phi_{t}(x_{tj})\otimes g(x_{tj}))]) \\ &\max_{D} \begin{aligned}\frac{1}{n}\sum_{i=1}^nw(g(x_{si}))\log[D(\phi_s(x_{si})\otimes g(x_{si}))]\end{aligned} \\ &+\frac1{2m}\sum_{j=1}^{2m}w(g(x_{tj}))\log[1-D(\phi_{t}(x_{tj})\otimes g(x_{tj}))], \end{aligned}

        ③The pseudo code of FAAD+CDANE:

2.6. Experiment

        ①They compared their model with a) machine learning methods such as SVM and deep learning methods such as FNN, b) the original anomaly detection method DSAD, c) domain adaptation models

        ②They evaluate the model's ability to detect a specific disease and its ability to handle several diseases

2.6.1. Baseline method

        ①They apply 95% PCA-SVM because the feature dimension far exceeds the number of samples (is the feature dimension that n(n-1)/2 thing?)

        ②They construct a BC-DNN from an FNN combined with a fully connected layer and a Softmax layer, and then pre-train the BC-DNN to obtain BC-DNN-p

        ③They go on to introduce other models... (omitted here)

2.6.2. Implementation details

(1)Network and training setup

        ①Shot: 10-shot and 20-shot applied

        ②Measurement: AUC

        ③FNN: the input dimensions of layers 1, 2 and 3 are the original vector dimension, 128 and 32 respectively; learning rate = 0.001; optimizer: Adam

        ④FAAD and FAAD+CDANE: the learning rate of the RCB is 1/10 of the original learning rate; 12 epochs of pretraining and 16 epochs of FAAD training; the learning rate is divided by 10 at the fourth and eighth epochs; batch size = 4; \lambda =0.0001 and \beta =0.1 (ramped from 0 to 0.1 by the coefficient \begin{aligned}(1-\exp(-\delta p))/(1+\exp(-\delta p))\end{aligned},  where \delta =10 and p goes from 0 to 1) (I don't fully understand this); dropout ratio = 0.2 (a paragraph that makes your head explode if you look at it for too long)

        ⑤DSAD-DANN: \beta =1
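The \beta ramp-up in ④ is easier to digest as code. The coefficient (1-\exp(-\delta p))/(1+\exp(-\delta p)) equals \tanh(\delta p/2), so \beta starts at 0 and saturates near its maximum quickly. The sketch below also shows the step learning-rate decay; treating epochs as 1-indexed, applying the drop from the start of the fourth and eighth epochs, and using 0.001 as the base learning rate are my assumptions.

```python
import math


def beta_schedule(p, beta_max=0.1, delta=10.0):
    """Domain-adaptation weight for training progress p in [0, 1]."""
    coeff = (1.0 - math.exp(-delta * p)) / (1.0 + math.exp(-delta * p))
    return beta_max * coeff


def learning_rate(epoch, base_lr=0.001, drops=(4, 8)):
    """Divide the learning rate by 10 at each epoch listed in drops."""
    n_drops = sum(1 for d in drops if epoch >= d)
    return base_lr / (10 ** n_drops)


# beta over 16 epochs, with p taken as the fraction of training completed.
betas = [beta_schedule(epoch / 16) for epoch in range(17)]
```

The point of the ramp is that the adversarial term is almost switched off in the first iterations, when the features are still noisy, and only reaches full strength once training has progressed.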

(2)Data augmentation

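        ①Why do they say here, again, that the feature dimension is smaller than the sample size!?

        ②⭐They assume that a partial fMRI scan has the same label as the full scan

        ③⭐During training, each time course is randomly cropped (the crop should start from the first frame of the scan and be longer than half of the original length) and then used to compute the whole-brain FC. During testing, the augmentation is dropped (so this counts as augmentation... maybe I just never studied data augmentation)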
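A minimal sketch of the cropping augmentation as I understand it: every ROI time series of a subject is cut to the same random length, starting at frame 0 and keeping more than half of the frames. The function name `random_crop` is mine, and sampling the length uniformly is an assumption.

```python
import random


def random_crop(series, rng):
    """Crop all ROI time series of one subject to the same random length.

    The crop always starts at the first frame and keeps strictly more
    than half of the original frames.
    """
    total = len(series[0])
    length = rng.randint(total // 2 + 1, total)  # inclusive bounds
    return [roi[:length] for roi in series]


rng = random.Random(0)
subject = [[float(t + r) for t in range(10)] for r in range(3)]
cropped = random_crop(subject, rng)
```

The cropped series would then go through the usual Pearson/Fisher-z pipeline, so each training pass sees a slightly different FC vector for the same subject and label.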

2.6.3. Results and analysis

        They compare the mean AUC of 9 trials

(1)FAAD for one mental disorder (SCZ only)

        ①AMU

        ②FMMU#1

        ③FMMU#2

        ④PUTH

        ⑤UCLA

        ⑥COBRE

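        ⑦After this they spend a lot of space on discussion, but it is all based on the experimental results, which does not mean much to me for now since I have no experimental results of my own. So I only read through it without taking notes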

        ⑧Mean values and standard deviation of AUCs(%):

(2)FAAD for two mental disorders (SCZ & MDD)

        ①AMU

        ②FMMU#1

(3)Discriminative FC and brain regions

        ①They combine all the FC vectors in each test set and apply canonical correlation analysis (CCA) to them, then take the mean FC weight in each test set and select the top 10%

        ②SCZ visualization:

        ③SCZ or MDD:

(4)Empirical analysis of parameters

        ①A grid search over \beta =\left \{ 0,\, 0.05,\, 0.1,\, 0.15,\, 0.2,\, 0.25 \right \} shows that FAAD+CDANE is not sensitive to \beta

        ②Table of the tuning:

(5)Distribution of anomaly scores

        ①Anomaly scores in FMMU#1 with AAL:

(6)Brain parcellation and model performance

        ①Comparison of datasets and atlases:

2.7. Discussion

        ①This model can also be generalized to other networks

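        ②⭐The definition of a graph and the Laplacian representation of a graph are not always satisfactory, hahaha, that's hilarious. But your average accuracy isn't that high either: the best can reach 80, but on average I feel it is only 60 to 70. For 2021 that is actually quite enough

        ③Most of the samples in HCP are young people, which might influence the results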

        ④⭐They did not consider the different pre-processing pipeline of different sites

2.8. Conclusion

        I can't be bothered to write a conclusion; it is what it is

3. Background knowledge

3.1. Hypersphere

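Reference: Hypersphere, Baidu Baike (baidu.com)

3.2. Meta-learning

Reference: A Beginner's Introduction to Meta-Learning (with code), Zhihu (zhihu.com)

3.3. Manifold

Reference 1: One of the greatest inventions in geometry: manifolds, and the geometric intuition and mathematical methods behind them (baidu.com)

Reference 2: Manifold, Baidu Baike (baidu.com)

3.4. Canonical Correlation Analysis (CCA)

Reference: Canonical Correlation Analysis, Zhihu (zhihu.com)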

4. Reference List

Su J. et al. (2021)  'Few-shot domain-adaptive anomaly detection for cross-site brain images', IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1. doi: 10.1109/TPAMI.2021.3125686
