Correlation
0. Load data and R packages
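rm(list=ls())
library(ggpubr)
library(stringr)
# for_boxplot.Rdata is expected to provide exprSet, meta and mut, which are used below
load(file = "for_boxplot.Rdata")
# Keep the first 12 characters of the barcode for every sample that carries a VHL mutation
VHL_mut=str_sub(as.character(
  as.data.frame( mut[mut$Hugo_Symbol=='VHL','Tumor_Sample_Barcode'])[,1] ),
  1,12)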
2. Correlation analysis of any two genes
2.1 Simple plot
Use ggpubr.
dat=data.frame(gene1=exprSet['hsa-mir-10b',],
               gene2=exprSet['hsa-mir-143',],
               stage=meta$patient.stage_event.pathologic_stage)
sp1 <- ggscatter(dat, x = "gene1", y = "gene2",
                 add = "reg.line",  # Add regression line
                 add.params = list(color = "blue", fill = "lightgray"), # Customize reg. line
                 conf.int = TRUE # Add confidence interval
) + stat_cor(method = "pearson", label.x = 15, label.y = 20) # Print R and p on the plot
sp1
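The coefficient and p-value that stat_cor prints on the plot can be cross-checked in base R with cor.test. A minimal sketch on simulated values: the vectors gene1 and gene2 below are made-up stand-ins, not the rows of exprSet.

```r
# Simulated stand-ins for two expression vectors (hypothetical data, not exprSet)
set.seed(1)
gene1 <- rnorm(100, mean = 15, sd = 2)
gene2 <- 0.5 * gene1 + rnorm(100, sd = 1)

# Pearson correlation test, the same method requested in stat_cor(method = "pearson")
res <- cor.test(gene1, gene2, method = "pearson")
r <- unname(res$estimate)  # correlation coefficient
p <- res$p.value           # two-sided p-value
cat(sprintf("R = %.2f, p = %.2g\n", r, p))
```

With the real data, gene1 and gene2 would be dat$gene1 and dat$gene2, and the printed values should agree with the label on sp1.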
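2.2 Group by stage
Not only stage: any grouping that can be found in, or derived from, the meta information will work. (Here the question is how the correlation relates to stage.)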
sp2 <- ggscatter( dat, x = "gene1", y = "gene2",
                  color = "stage", palette = "jco",
                  add = "reg.line", conf.int = TRUE) +
  stat_cor(aes(color = stage), label.x = 15 )
sp2
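2.3 Group by mutation status
In theory, whether a given gene is mutated should not change the direction of the correlation between two other genes. If a particular mutation does disrupt the normal correlation between two genes, that points to a possible mechanism. A loop can be written to check whether any such mutation exists.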
# Keep only the expression samples that also appear in the mutation table
expm = exprSet[,str_sub(colnames(exprSet),1,12) %in% unique(str_sub(mut$Tumor_Sample_Barcode,1,12))]
dat=data.frame(gene1=expm['hsa-mir-10b',],
               gene2=expm['hsa-mir-143',],
               mut= str_sub(colnames(expm),1,12) %in% VHL_mut) # TRUE if the sample has a VHL mutation
sp3 <- ggscatter( dat, x = "gene1", y = "gene2",
                  color = "mut", palette = "jco",
                  add = "reg.line", conf.int = TRUE) +
  stat_cor(aes(color = mut), label.x = 15 )
sp3
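The loop mentioned in 2.3 could be sketched roughly as follows. Everything here is simulated and hypothetical: the sample IDs, the gene names GENE_A to GENE_C, the table mut_tab and the helper cor_by_mutation are illustrations, not objects from for_boxplot.Rdata. For each mutated gene, the samples are split into mutated and non-mutated, and the Pearson correlation of the two genes is computed in each group.

```r
# Simulated expression of two genes across samples (hypothetical data)
set.seed(2)
n <- 120
samples <- sprintf("S%03d", seq_len(n))
gene1 <- rnorm(n)
gene2 <- 0.6 * gene1 + rnorm(n, sd = 0.8)

# Hypothetical mutation table: one row per (gene, sample) mutation call
mut_tab <- data.frame(
  Hugo_Symbol = sample(c("GENE_A", "GENE_B", "GENE_C"), 150, replace = TRUE),
  sample_id   = sample(samples, 150, replace = TRUE),
  stringsAsFactors = FALSE
)

# Correlation of gene1 and gene2 in mutated vs non-mutated samples for one gene
cor_by_mutation <- function(gene, min_n = 10) {
  is_mut <- samples %in% unique(mut_tab$sample_id[mut_tab$Hugo_Symbol == gene])
  # Skip genes with too few samples in either group for a stable estimate
  if (sum(is_mut) < min_n || sum(!is_mut) < min_n) return(NULL)
  data.frame(gene  = gene,
             n_mut = sum(is_mut),
             r_mut = cor(gene1[is_mut], gene2[is_mut]),
             r_wt  = cor(gene1[!is_mut], gene2[!is_mut]),
             stringsAsFactors = FALSE)
}

res <- do.call(rbind, lapply(unique(mut_tab$Hugo_Symbol), cor_by_mutation))
res$diff <- abs(res$r_mut - res$r_wt)  # size of the change in correlation
res <- res[order(-res$diff), ]         # largest change first
print(res)
```

The diff column is only a descriptive ranking. A candidate picked this way would still need a proper test of the difference between the two correlations, and a plot like sp3 to confirm it.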
3. Faceting
sp2 + facet_wrap(~stage, scales = "free_x")
sp3 + facet_wrap(~mut, scales = "free_x")
Correlations between genes can be searched for with a loop (Teacher Xiaojie has a write-up on Jianshu; remember to study it).
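One possible shape for such a loop, as a sketch only and not the approach from the Jianshu write-up: test one target gene against every other row of an expression matrix with cor.test and rank the results. The matrix expr and the gene names are simulated and hypothetical; with real data, expr would be something like exprSet.

```r
set.seed(3)
# Hypothetical expression matrix: rows are genes, columns are samples
expr <- matrix(rnorm(6 * 50), nrow = 6,
               dimnames = list(paste0("gene", 1:6), paste0("S", 1:50)))
# Plant one correlated pair so that the loop has something to find
expr["gene2", ] <- expr["gene1", ] + rnorm(50, sd = 0.3)

target <- "gene1"
others <- setdiff(rownames(expr), target)

# Test the target gene against every other gene
res <- do.call(rbind, lapply(others, function(g) {
  ct <- cor.test(expr[target, ], expr[g, ], method = "pearson")
  data.frame(gene = g, r = unname(ct$estimate), p = ct$p.value,
             stringsAsFactors = FALSE)
}))
res$padj <- p.adjust(res$p, method = "BH")  # adjust for multiple testing
res <- res[order(res$p), ]                  # strongest evidence first
print(res)
```

Because many genes are tested at once, the adjusted p-value (padj) is the safer column to filter on.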
*Course notes from 生信技能树