使用EnhancedVolcano包可视化火山图
火山图是用来展示差异表达的基因或者物种,其常常出现RNA-seq,amplicon-seq分析的结果中。其主要根据差异分析结果的p值和Fold change值进行展示结果。
介绍
标准的火山图常用来展示差异分析的结果,它具有两个指标:
- 差异表达的显著性值:
p
值或基于p
值得adjust-P
(差异分析过程涉及到多重校验,因此要对多次检验的p
进行校正,校正的方法一般有FDR, BH
等等); - 差异表达程度值:
Fold change
值也即使倍数变化值;
数据
Deseq2
的结果数据。
- 箭头指向的红框是组间比较,根据
Fold changes
值设置富集方向,并在图中表示出来; -
DataFrame
的红框部分是火山图的坐标轴,FoldChange是x
轴,padj
是y
轴(可以对其取);
实操
-
读取数据:
library(dplyr) library(tibble) library(ggplot2) library(EnhancedVolcano) # pre-run : save(miRNA_Deseq2, file="../../Result/Differential_expression/RNA_DEseq2.RData") load("../../Result/Differential_expression/RNA_DEseq2.RData")
-
画图函数:
设置富集方向标签;
-
设置差异RNA的名称;
Volcanofun <- function(prof, type="miRNA", FDR=0.05, lgFC=2){ # prof <- miRNA.des2 # type <- "miRNA" # FDR <- 0.05 # lgFC <- 2 geneid2symbolfun <- function(diff_data){ require(biomaRt) require(curl) mart <- useDataset("hsapiens_gene_ensembl", useMart("ensembl")) # origin ensembl_gene_id <- rownames(diff_data) if(!file.exists("../../Result/Differential_expression/geneid2symbol.csv")){ gene_symbols <- getBM(attributes=c('ensembl_gene_id','external_gene_name', "transcript_biotype","description"), filters='ensembl_gene_id', values=ensembl_gene_id, mart = mart) write.csv(gene_symbols, "../../Result/Differential_expression/geneid2symbol.csv", row.names = F) }else{ gene_symbols <- fread("../../Result/Differential_expression/geneid2symbol.csv") } return(gene_symbols) } if(type == "miRNA"){ dat <- prof label <- rownames(dat) }else{ gene_symbols <- geneid2symbolfun(prof) dat_cln <- gene_symbols %>% filter(transcript_biotype == type) dat <- subset(prof, rownames(prof)%in%dat_cln$ensembl_gene_id) label <- NA } # over- or under- represent RNA keyvals <- ifelse( dat$log2FoldChange < -lgFC, 'royalblue', ifelse(dat$log2FoldChange > lgFC, 'gold', 'black')) names(keyvals)[keyvals == 'gold'] <- 'high' names(keyvals)[keyvals == 'black'] <- 'mid' names(keyvals)[keyvals == 'royalblue'] <- 'low' pl <- EnhancedVolcano(dat, lab = label, # 显示差异RNA x = 'log2FoldChange', y = 'padj', xlim = c(-4, 4), ylim = c(-10, 50), xlab = bquote(~Log[2]~ 'fold change'), ylab = bquote(~-Log[10]~ adjusted~italic(P)), title = paste('RB versus PB', "in", type), pCutoff = FDR, FCcutoff = lgFC, pointSize = 2.0, labSize = 4.0, col = c('black', 'black', 'black', 'red3'), shape = 16, colAlpha = 1, drawConnectors = TRUE, legendLabSize = 16, legendIconSize = 5.0, selectLab = rownames(dat)[which(names(keyvals) %in% c('high', 'low'))], colCustom = keyvals)+ scale_x_continuous(breaks = seq(-4, 4, 1), limits = c(-4.5, 4.5))+ scale_y_continuous(breaks = seq(-5, 40, 5), limits = c(-6, 46))+ theme_bw()+ theme(plot.title = element_text(hjust = .5, size = 14)) return(pl) }
-
Run结果:
# miRNA Volcanofun(miRNA.des2, type="miRNA") # lncRNA Volcanofun(RNA.des2, type="lncRNA")