(五)单细胞数据分析——细胞亚群的表型特征刻画(补充工作)

本文主要是介绍(五)单细胞数据分析——细胞亚群的表型特征刻画(补充工作),希望对大家解决编程问题提供一定的参考价值,需要的开发者们随着小编来一起学习吧!

由于细胞表型特征这里,我们可以进行一些信息的补充,一个是从全局重新观察每个细胞亚群的基因表达,另一个是细胞亚群特征基因的小提琴图。

第一步:导入R包

library(scibet)
library(Seurat)
library(scater)
library(scran)
library(dplyr)
library(Matrix)
library(cowplot)
library(ggplot2)
library(harmony)

第二步:全局观察每个细胞亚群的基因表达

导入所有细胞亚群的数据

BRCA_SingleCell.Bcell <- readRDS("BRCA_SingleCell.B_Pcell.RDS")
BRCA_SingleCell.Tcell <- readRDS("BRCA_SingleCell.Tcell.RDS")
BRCA_SingleCell.Mycell <- readRDS("BRCA_SingleCell.Mycell.RDS")
BRCA_SingleCell.Fcell <- readRDS("BRCA_SingleCell.Fcell.RDS")

设置颜色种类

cb_palette <- c("antiquewhite", "yellowgreen", "thistle", "violet", "azure3", "azure4", "black","blue", "blue4","blueviolet", #B"brown4","brown1", "burlywood1", "burlywood4", "cadetblue1", "cadetblue4","chartreuse", "chartreuse3", #T"chartreuse4","chocolate1", "chocolate3", "chocolate4", "cyan3", "darkgoldenrod","darkgoldenrod1","darkgoldenrod4", "darkolivegreen", "khaki", "yellow" ,"hotpink" ,"hotpink4" , #My"slateblue1" ,"slateblue4", "yellow3","sienna4", "tomato1", "sienna1", "dodgerblue4", "darkmagenta","cornsilk4") #F

绘制每个细胞亚群的UMAP图

DimPlot(object = BRCA_SingleCell.Bcell,reduction = 'umap',group.by = c('cell_Subtype'),cols = cb_palette[1:10], label = TRUE)
DimPlot(object = BRCA_SingleCell.Tcell,reduction = 'umap',group.by = c('cell_Subtype'),cols = cb_palette[11:18], label = TRUE)
DimPlot(object = BRCA_SingleCell.Mycell,reduction = 'umap',group.by = c('cell_Subtype'),cols = cb_palette[19:31], label = TRUE)
DimPlot(object = BRCA_SingleCell.Fcell,reduction = 'umap',group.by = c('cell_Subtype'),cols = cb_palette[32:40], label = TRUE)

其中一个结果如下:

合并所有数据绘制总的UMAP图

BRCA_SingleCell <- merge(BRCA_SingleCell.Fcell,BRCA_SingleCell.Tcell)
BRCA_SingleCell <- merge(BRCA_SingleCell,BRCA_SingleCell.Bcell)
BRCA_SingleCell <- merge(BRCA_SingleCell,BRCA_SingleCell.Mycell)
DimPlot(object = BRCA_SingleCell,reduction = 'umap',group.by = c('cell_Subtype'), cols = cb_palette[1:40], label = FALSE)

结果如下:

 分析不同细胞亚群之间的差异基因,并绘制热图

cellType_markerGene <- findMarker(BRCA_SingleCell)
top <- cellType_markerGene %>% group_by(cluster) %>% top_n(n =10, wt = myAUC)
top$cluster %>% levels(.)
clusters <- unique(BRCA_SingleCell$cell_Subtype)
genes <- c()
for(i in clusters){gene <- as.vector(as.matrix(subset(top, cluster == i, gene)))gene <- gene[1:(ifelse(length(gene)>5,5,length(gene)))]names(gene) <- rep(i, times = length(gene))genes <- c(genes, gene)
}
table(names(genes))
table(unique(names(genes)))
B_P_gene <- c(genes[which(names(genes) == "naive B")],genes[which(names(genes) == "Memory B")],genes[which(names(genes) == "Germinal center B cell")],genes[which(names(genes) == "IgA Plasma")],genes[which(names(genes) == "IGHGP+ Plasma")],genes[which(names(genes) == "IGLC7+ Plasma")],genes[which(names(genes) == "IL7R+ Plasma")],genes[which(names(genes) == "NKG7+ Plasma")],genes[which(names(genes) == "IgG Plasma")])
T_gene <- c(genes[which(names(genes) == "CD8+ Tem")],genes[which(names(genes) == "CD4+ Tfh")],genes[which(names(genes) == "NK")],genes[which(names(genes) == "Th0")],genes[which(names(genes) == "GZMK+ Tc")],genes[which(names(genes) == "CD8+ Tex")],genes[which(names(genes) == "CD4+ Treg")],genes[which(names(genes) == "Tem")])
Stroma_gene <- c(genes[which(names(genes) == "MYH11+MCAM+ Myofibro")],genes[which(names(genes) == "RGS5+MCAM+ Myofibro")],genes[which(names(genes) == "VEGF+ Fibro")],genes[which(names(genes) == "CFD+ Fibro")],genes[which(names(genes) == "PLA2G2A+CFD+ Fibro")],genes[which(names(genes) == "PDGFC+ Fibro")],genes[which(names(genes) == "CXCL11+ Fibro")],genes[which(names(genes) == "MMP11+ Fibro")],genes[which(names(genes) == "MMP1+ Fibro")],genes[which(names(genes) == "MMP11+ Fibro")])
Myeloid_gene <- c(genes[which(names(genes) == "MM9+ Macro")],genes[which(names(genes) == "C1QC+MRC1-  Macro")],genes[which(names(genes) == "MRC1+ Macro")],genes[which(names(genes) == "cDC2")],genes[which(names(genes) == "VCAN+SPP1+ Macro")],genes[which(names(genes) == "MARCO+SPP1+ Macro")],genes[which(names(genes) == "cDC1")],genes[which(names(genes) == "Activated DCs")],genes[which(names(genes) == "Mast cell")],genes[which(names(genes) == "TACSTD2+ Macro")],genes[which(names(genes) == "CXCL10+ Macro")])
cellMarker <- c(B_P_gene,T_gene,Myeloid_gene,Stroma_gene)
length(cellMarker)
length(unique(cellMarker))
sum(table(cellMarker) == 1)
cellMarker.unique <- unique(cellMarker)
options(repr.plot.height = 14, repr.plot.width = 18)
test <- DotPlot(object = BRCA_SingleCell,features = cellMarker.unique,cols = c('blue','red'),group.by = 'cell_Subtype')+
theme_bw()+ theme(axis.text.x = element_text(angle = -90,hjust = 0,vjust = 0)) +
coord_flip()
test
test$data$id %>% unique(.) %>% setdiff(.,c('naive B','IL7R+ Plasma','memory B','Germinal center B','IGHGP+ Plasma','IGLC7+ Plasma','NKG7+ Plasma','STMN1+ Plasma','IgA Plasma','IgG Plasma','Th0','CD4+ Tfh','CD8+ Tem','CD4+ Treg','GZMK+ Tc','CD8+ Tex','NK','Tem','TACSTD2+ Macro','MARCO+SPP1+ Macro','VCAN+SPP1+ Macro','CXCL10+ Macro','cDC2','CXCL9+ Macro','MRC1+ Macro','cDC1','Activated DCs','C1QC+MRC1- Macro','MM9+ Macro','CPB1+ Macro','Mast cell','CXCL11+ Fibro','CFD+ Fibro','RGS5+MCAM+ Myofibro','MMP11+ Fibro','VEGF+ Fibro','PLA2G2A+CFD+ Fibro','MYH11+MCAM+ Myofibro','PDGFC+ CAFs','Endo'))
Mmarker_data  <- test$data
Mmarker_data$Zscore <- Mmarker_data$avg.exp.scaled
Mmarker_data$cluster <- factor(x = Mmarker_data$id,levels = c('naive B','IL7R+ Plasma','memory B','Germinal center B','IGHGP+ Plasma','IGLC7+ Plasma','NKG7+ Plasma','STMN1+ Plasma','IgA Plasma','IgG Plasma','Th0','CD4+ Tfh','CD8+ Tem','CD4+ Treg','GZMK+ Tc','CD8+ Tex','NK','Tem','TACSTD2+ Macro','MARCO+SPP1+ Macro','VCAN+SPP1+ Macro','CXCL10+ Macro','cDC2','CXCL9+ Macro','MRC1+ Macro','cDC1','Activated DCs','C1QC+MRC1- Macro','MM9+ Macro','CPB1+ Macro','Mast cell','CXCL11+ Fibro','CFD+ Fibroblast','RGS5+MCAM+ Myofibro','MMP11+ Fibro','VEGF+ Fibro','PLA2G2A+CFD+ Fibro','MYH11+MCAM+ Myofibro','PDGFC+ Fibro','Endo'),labels = c('naive B','IL7R+ Plasma','memory B','Germinal center B','IGHGP+ Plasma','IGLC7+ Plasma','NKG7+ Plasma','STMN1+ Plasma','IgA Plasma','IgG Plasma','Th0','CD4+ Tfh','CD8+ Tem','CD4+ Treg','GZMK+ Tc','CD8+ Tex','NK','Tem','TACSTD2+ Macro','MARCO+SPP1+ Macro','VCAN+SPP1+ Macro','CXCL10+ Macro','cDC2','CXCL9+ Macro','MRC1+ Macro','cDC1','Activated DCs','C1QC+MRC1- Macro','MM9+ Macro','CPB1+ Macro','Mast cell','CXCL11+ Fibro','CFD+ Fibro','RGS5+MCAM+ Myofibro','MMP11+ Fibro','VEGF+ Fibro','PLA2G2A+CFD+ Fibro','MYH11+MCAM+ Myofibro','PDGFC+ Fibro','Endo'))
unique(Mmarker_data$id)
Mmarker_data$geneClass <- NA
Mmarker_data$geneClass <- ifelse(Mmarker_data$features.plot %in% B_P_gene,'B',Mmarker_data$geneClass)
Mmarker_data$geneClass <- ifelse(Mmarker_data$features.plot %in% T_gene,'T',Mmarker_data$geneClass)
Mmarker_data$geneClass <- ifelse(Mmarker_data$features.plot %in% Myeloid_gene,'M',Mmarker_data$geneClass)
Mmarker_data$geneClass <- ifelse(Mmarker_data$features.plot %in% Stroma_gene,'S',Mmarker_data$geneClass)
library(aplot)
options(repr.plot.height = 14, repr.plot.width = 13)
group <- factor(c('naive B','IL7R+ Plasma','memory B','Germinal center B','IGHGP+ Plasma','IGLC7+ Plasma','NKG7+ Plasma','STMN1+ Plasma','IgA Plasma','IgG Plasma','Th0','CD4+ Tfh','CD8+ Tem','CD4+ Treg','GZMK+ Tc','CD8+ Tex','NK','Tem','TACSTD2+ Macro','MARCO+SPP1+ Macro','VCAN+SPP1+ Macro','CXCL10+ Macro','cDC2','CXCL9+ Macro','MRC1+ Macro','cDC1','Activated DCs','C1QC+MRC1- Macro','MM9+ Macro','CPB1+ Macro','Mast cell','CXCL11+ Fibro','CFD+ Fibro','RGS5+MCAM+ Myofibro','MMP11+ Fibro','VEGF+ Fibro','PLA2G2A+CFD+ Fibro','MYH11+MCAM+ Myofibro','PDGFC+ Fibro','Endo'),levels = c('naive B','IL7R+ Plasma','memory B','Germinal center B','IGHGP+ Plasma','IGLC7+ Plasma','NKG7+ Plasma','STMN1+ Plasma','IgA Plasma','IgG Plasma','Th0','CD4+ Tfh','CD8+ Tem','CD4+ Treg','GZMK+ Tc','CD8+ Tex','NK','Tem','TACSTD2+ Macro','MARCO+SPP1+ Macro','VCAN+SPP1+ Macro','CXCL10+ Macro','cDC2','CXCL9+ Macro','MRC1+ Macro','cDC1','Activated DCs','C1QC+MRC1- Macro','MM9+ Macro','CPB1+ Macro','Mast cell','CXCL11+ Fibro','CFD+ Fibro','RGS5+MCAM+ Myofibro','MMP11+ Fibro','VEGF+ Fibro','PLA2G2A+CFD+ Fibro','MYH11+MCAM+ Myofibro','PDGFC+ Fibro','Endo')) %>% as.data.frame() %>% mutate(group=c(rep("B cell",10),rep("T cell",8),rep("Myeloid",13),rep("Strom cell",9))) %>%mutate(p="group") %>%ggplot(aes(.,y=p,fill=group))+geom_tile() + scale_y_discrete(position="right") +theme_minimal()+xlab(NULL) + ylab(NULL) +theme(axis.text.x = element_blank())+labs(fill = "Group")
heatmap <- ggplot(Mmarker_data, aes(y=features.plot, x=cluster, fill=Zscore))+ 
geom_raster()+scale_fill_gradient2(low="#003366", high="#990033", mid="white",midpoint = 0)+
theme(axis.text.x = element_text(angle = -90,hjust = 0,vjust = 0.5, size = 15),axis.ticks.y = element_blank())+
xlab(NULL) + ylab(NULL)+theme(panel.border = element_rect(fill=NA,color="black", size=1, linetype="solid"))+geom_vline(xintercept=c(-0.5,10.5,18.5,31.5),size=.8)+geom_hline(yintercept=c(-0.5,18.5,48.5,76.5), size=.8)
heatmap <- heatmap %>% insert_top(group ,height = 0.05)
heatmap

结果如下:

第三步:绘制特征基因的小提琴图

构造绘制小提琴图的函数与特征基因筛选函数

modify_vlnplot<- function(obj, feature, pt.size = 0, plot.margin = unit(c(-0.75, 0, -0.75, 0), "cm"),...) {p<- VlnPlot(obj, features = feature, pt.size = pt.size,group.by = 'cell_Subtype', ... )  + xlab("") + ylab(feature) + ggtitle("") + theme(legend.position = "none", axis.text.x = element_blank(), axis.ticks.x = element_blank(), axis.title.y = element_text(size = rel(1), angle = 0), axis.text.y = element_text(size = rel(1)), plot.margin = plot.margin ) #+ rotate_x_text() return(p)
}
extract_max<- function(p){ymax<- max(ggplot_build(p)$layout$panel_scales_y[[1]]$range$range)return(ceiling(ymax))
}## main function
StackedVlnPlot<- function(obj, features,pt.size = 0, plot.margin = unit(c(-0.75, 0, -0.75, 0), "cm"),...) {plot_list<- purrr::map(features, function(x) modify_vlnplot(obj = obj,feature = x, ...))# Add back x-axis title to bottom plot. patchwork is going to support this?# rotate_x_text() 转置横坐标的字体,ggpubr 包,在这里添加才会只出现一次横坐标,不会出现多个横坐标plot_list[[length(plot_list)]]<- plot_list[[length(plot_list)]] +theme(axis.text.x=element_text(), axis.ticks.x = element_line()) + rotate_x_text(angle = 90)# change the y-axis tick to only max value ymaxs<- purrr::map_dbl(plot_list, extract_max)plot_list<- purrr::map2(plot_list, ymaxs, function(x,y) x + scale_y_continuous(breaks = c(y)) + expand_limits(y = y))p<- patchwork::wrap_plots(plotlist = plot_list, ncol = 1)return(p)
}
findMarker <- function(x){Idents(x) <- 'cell_Subtype'cellType_markerGene <- FindAllMarkers(x,logfc.threshold = 0.5,only.pos = T,test.use = 'roc',min.pct = 0.2)cellType_markerGene <- subset(cellType_markerGene,!grepl(pattern = 'RP[LS]',gene))cellType_markerGene <- subset(cellType_markerGene,!grepl(pattern = 'MT-',gene))top5 <- cellType_markerGene %>% group_by(cluster) %>% top_n(n = 10, wt = myAUC)print(top5)return(cellType_markerGene)
}

注:查找差异基因是因为,如果没有去了解特征基因,可以通过差异基因进行一个“预见”

绘制小提琴图(以髓系细胞为例)

cellMarker.Mycell <- findMarker(BRCA_SingleCell.Mycell)
features<- c("HLA-DOB","HLA-DQA2","HLA-DPB1","LGALS2","CLEC9A","WDFY4","IDO1","CCND1","IRF8","RGCC","SHTN1","BATF3","DNASE1L3","CD1C","XCR1","CADM1","LIMD2","CPNE3")
vlnplt <- StackedVlnPlot(obj = BRCA_SingleCell.Mycell, features = features)
vlnplt

结果如下:

综上所述,补充的工作就完成了,主要是对描述进行一个更细致的刻画,通过热图和总UMAP全局观察细胞亚群,通过小提琴图刻画特征基因在细胞亚群中的表达!

这篇关于(五)单细胞数据分析——细胞亚群的表型特征刻画(补充工作)的文章就介绍到这儿,希望我们推荐的文章对编程师们有所帮助!



http://www.chinasem.cn/article/205884

相关文章

工作常用指令与快捷键

Git提交代码 git fetch  git add .  git commit -m “desc”  git pull  git push Git查看当前分支 git symbolic-ref --short -q HEAD Git创建新的分支并切换 git checkout -b XXXXXXXXXXXXXX git push origin XXXXXXXXXXXXXX

嵌入式方向的毕业生,找工作很迷茫

一个应届硕士生的问题: 虽然我明白想成为技术大牛需要日积月累的磨练,但我总感觉自己学习方法或者哪些方面有问题,时间一天天过去,自己也每天不停学习,但总感觉自己没有想象中那样进步,总感觉找不到一个很清晰的学习规划……眼看 9 月份就要参加秋招了,我想毕业了去大城市磨练几年,涨涨见识,拓开眼界多学点东西。但是感觉自己的实力还是很不够,内心慌得不行,总怕浪费了这人生唯一的校招机会,当然我也明白,毕业

OmniGlue论文详解(特征匹配)

OmniGlue论文详解(特征匹配) 摘要1. 引言2. 相关工作2.1. 广义局部特征匹配2.2. 稀疏可学习匹配2.3. 半稠密可学习匹配2.4. 与其他图像表示匹配 3. OmniGlue3.1. 模型概述3.2. OmniGlue 细节3.2.1. 特征提取3.2.2. 利用DINOv2构建图形。3.2.3. 信息传播与新的指导3.2.4. 匹配层和损失函数3.2.5. 与Super

husky 工具配置代码检查工作流:提交代码至仓库前做代码检查

提示:这篇博客以我前两篇博客作为先修知识,请大家先去看看我前两篇博客 博客指路:前端 ESlint 代码规范及修复代码规范错误-CSDN博客前端 Vue3 项目开发—— ESLint & prettier 配置代码风格-CSDN博客 husky 工具配置代码检查工作流的作用 在工作中,我们经常需要将写好的代码提交至代码仓库 但是由于程序员疏忽而将不规范的代码提交至仓库,显然是不合理的 所

Python:豆瓣电影商业数据分析-爬取全数据【附带爬虫豆瓣,数据处理过程,数据分析,可视化,以及完整PPT报告】

**爬取豆瓣电影信息,分析近年电影行业的发展情况** 本文是完整的数据分析展现,代码有完整版,包含豆瓣电影爬取的具体方式【附带爬虫豆瓣,数据处理过程,数据分析,可视化,以及完整PPT报告】   最近MBA在学习《商业数据分析》,大实训作业给了数据要进行数据分析,所以先拿豆瓣电影练练手,网络上爬取豆瓣电影TOP250较多,但对于豆瓣电影全数据的爬取教程很少,所以我自己做一版。 目

【多系统萎缩患者必看】✨维生素补充全攻略,守护你的健康每一天!

亲爱的朋友们,今天我们要聊一个既重要又容易被忽视的话题——‌多系统萎缩患者如何科学补充维生素‌!🌟 在这个快节奏的生活中,健康成为了我们最宝贵的财富,而对于多系统萎缩(MSA)的患者来说,合理的营养补充更是维护身体机能、提升生活质量的关键一步。👇 🌈 为什么多系统萎缩患者需要特别关注维生素? 多系统萎缩是一种罕见且复杂的神经系统疾病,它影响身体的多个系统,包括自主神经、锥体外系、小脑及锥

未来工作趋势:零工小程序在共享经济中的作用

经济在不断发展的同时,科技也在飞速发展。零工经济作为一种新兴的工作模式,正在全球范围内迅速崛起。特别是在中国,随着数字经济的蓬勃发展和共享经济模式的深入推广,零工小程序在促进就业、提升资源利用效率方面显示出了巨大的潜力和价值。 一、零工经济的定义及现状 零工经济是指通过临时性、自由职业或项目制的工作形式,利用互联网平台快速匹配供需双方的新型经济模式。这种模式打破了传统全职工作的界限,为劳动

Smarty模板引擎工作机制(一)

深入浅出Smarty模板引擎工作机制,我们将对比使用smarty模板引擎和没使用smarty模板引擎的两种开发方式的区别,并动手开发一个自己的模板引擎,以便加深对smarty模板引擎工作机制的理解。 在没有使用Smarty模板引擎的情况下,我们都是将PHP程序和网页模板合在一起编辑的,好比下面的源代码: <?php$title="深处浅出之Smarty模板引擎工作机制";$content=

3.比 HTTP 更安全的 HTTPS(工作原理理解、非对称加密理解、证书理解)

所谓的协议 协议只是一种规则,你不按规则来就无法和目标方进行你的工作 协议说白了只是人定的规则,任何人都可以定协议 我们不需要太了解细节,这些制定和完善协议的人去做的,我们只需要知道协议的一个大概 HTTPS 协议 1、概述 HTTPS(Hypertext Transfer Protocol Secure)是一种安全的超文本传输协议,主要用于在客户端和服务器之间安全地传输数据

《计算机视觉工程师养成计划》 ·数字图像处理·数字图像处理特征·概述~

1 定义         从哲学角度看:特征是从事物当中抽象出来用于区别其他类别事物的属性集合,图像特征则是从图像中抽取出来用于区别其他类别图像的属性集合。         从获取方式看:图像特征是通过对图像进行测量或借助算法计算得到的一组表达特性集合的向量。 2 认识         有些特征是视觉直观感受到的自然特征,例如亮度、边缘轮廓、纹理、色彩等。         有些特征需要通