1.更改目录setwd()
2.安装R包BiocManager::install()
3.加载R包
library(DESeq2)
library("tximport")
library("readr")
4.读取转录本与基因关系文件,trans2gene.csv第一列为转录本,第二列为基因,需要加表头:
tx2gene <- read_csv("trans2gene.csv")
5.设置sample矩阵,run为样本名称,condition 为分组名称
samples <- data.frame(run = c('T121', 'T122', 'T123', 'T161', 'T162', 'T163'), condition = c("T12","T12","T12","T16","T16","T16"))
samples
run condition
1 T01 T0
2 T02 T0
3 T03 T0
4 T81 T8
5 T82 T8
6 T83 T8
6. 读取salmon结果文件,结果文件为01.sf、02.sf、03.sf、81.sf、82.sf、83.sf,substring(samples$run, 2,4)为提取samples矩阵run那一列,每个字符串截取第2到第4个字符,paste(substring(samples$run, 2,4),".sf", sep = "")为把截取的字符加上文件扩展名:
files <- file.path('.', paste(substring(samples$run, 2,4),".sf", sep = ""))
7.给每个文件加上文件名,文件名从samples矩阵run那一列提取
names(files) <- samples$run
files
T01 T02 T03 T81 T82 T83
"./01.sf" "./02.sf" "./03.sf" "./81.sf" "./82.sf" "./83.sf"
8.tximport导入结果
txi <- tximport(files, type="salmon", tx2gene=tx2gene)
head(txi$counts) # 查看矩阵
9.tximport导入的结果传递给DESeq2
dds <- DESeqDataSetFromTximport(txi, colData=samples, design= ~ condition)
10.计算并输出
dds <- DESeq(dds)
res <- results(dds)
write.table(res,"result.csv", sep = ",", row.names = TRUE)