What needs to be done?
Comparing against the paper:
My own expression matrix has 60564 genes.
The paper's "GSE130437_genes.fpkm_table" has 60603 genes.
(So mine is 39 genes short?)
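One way to pin down that gap is to compare the two sets of gene IDs directly. This is only a minimal sketch: `ids_mine` and `ids_paper` are toy stand-ins rather than the real data, and it assumes both tables use the same kind of identifier (if one uses versioned IDs or gene symbols, the IDs would need harmonizing first).

```r
# Toy stand-ins for the two gene ID vectors (hypothetical values).
# In the real analysis these would be something like rownames(exprset_my)
# and the gene ID column of GSE130437_genes.fpkm_table.
ids_mine  <- c("geneA", "geneB", "geneC")
ids_paper <- c("geneA", "geneB", "geneC", "geneD", "geneE")

# IDs present in the paper's table but absent from mine, and vice versa
only_in_paper <- setdiff(ids_paper, ids_mine)
only_in_mine  <- setdiff(ids_mine, ids_paper)

length(only_in_paper)
length(only_in_mine)
```

Looking at the actual IDs in `only_in_paper` would show whether the missing genes share a pattern, for example a different annotation version.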
First, run the differential expression analysis on my own data and see whether it matches the paper.
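# Read my own count table (tab-separated, with a Geneid column)
exprset_my <- read.table('all.id.txt', header = TRUE, sep = '\t', fill = TRUE)
# Drop duplicated gene IDs, then use Geneid as the row names
exprset_my <- exprset_my[!duplicated(exprset_my$Geneid), ]
row.names(exprset_my) <- exprset_my$Geneid
exprset_my <- exprset_my[, -1]
# Keep the six sample columns (columns 6 to 11 once Geneid is removed)
exprset_my1 <- exprset_my[, 6:11]
# These labels assume the file lists the three pR samples before the three pS samples
colnames(exprset_my1) <- c("MCF7pR1","MCF7pR2","MCF7pR3","MCF7pS1","MCF7pS2","MCF7pS3")
# Sample annotation; its rows should be in the same order as the columns of exprset_my1
colData <- read.csv("pdata_溶瘤病毒耐药1.csv", header = TRUE)
row.names(colData) <- colData$X
# Second column only (not used below)
coldata2 <- colData[2]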
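# DESeq2 differential expression analysis
library(DESeq2)
dds <- DESeqDataSetFromMatrix(countData = exprset_my1, colData = colData, design = ~ condition)
dds <- DESeq(dds)
# contrast = c(factor, numerator, denominator): log2FoldChange here is control over treatment
res <- results(dds, contrast = c("condition", "control", "treatment"))
DEG <- as.data.frame(res)
# Drop genes with NA in any column (e.g. padj set to NA by independent filtering)
DEG <- na.omit(DEG)
# Significant genes: padj <= 0.05 and |log2FC| > 1
diff_gene <- subset(DEG, padj <= 0.05 & abs(log2FoldChange) > 1)
diff_gene_up <- subset(diff_gene, log2FoldChange > 1)
diff_gene_down <- subset(diff_gene, log2FoldChange < -1)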
The paper reports:
Using a q-value cutoff ≤ 0.05 with |log2FC| ≥1, we identified 2183 up-regulated genes and 1548 down-regulated transcripts in MCF7/pR cells
My own diff_gene set has 5227 genes:
up: 3538
down: 1689
So on the up side I have over a thousand more genes than the paper (3538 vs 2183)?
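The size of the discrepancy can be worked out from the numbers already noted. A small worked example, using only the counts quoted above (the vector names `paper` and `mine` are just labels for this sketch):

```r
# Counts reported in the paper and from my own run (taken from the notes above)
paper <- c(up = 2183, down = 1548)
mine  <- c(up = 3538, down = 1689)

# How many more genes I get than the paper, per direction
excess <- mine - paper
excess

# Totals across both directions
sum(mine)
sum(paper)
```

The excess is much larger on the up side (1355) than on the down side (141), so the disagreement is not symmetric.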
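Table 1 of the paper lists the top 20 up-regulated and down-regulated genes in MCF7/pR cells compared with MCF7/pS cells.
The ones I found do not match the paper's at all?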
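To make "does not match" concrete, the overlap between my top genes and the paper's list can be counted. This is a hedged sketch on toy data: `DEG_toy` and `paper_top_up` are hypothetical stand-ins, and it assumes both lists use the same identifiers (Table 1 may use gene symbols while Geneid may be another ID type). The sign of log2FoldChange also depends on the contrast order, so "up" in my table may correspond to "down" in the paper if the groups are reversed.

```r
# Toy DESeq2-like result table (hypothetical values, not real output)
DEG_toy <- data.frame(
  log2FoldChange = c(5.2, -4.1, 3.3, -0.5, 2.8),
  padj           = c(0.001, 0.002, 0.01, 0.5, 0.03),
  row.names      = c("geneA", "geneB", "geneC", "geneD", "geneE")
)

# Hypothetical stand-in for the gene names listed in the paper's Table 1
paper_top_up <- c("geneA", "geneE", "geneX")

# Rank my genes by fold change and keep the top n up-regulated ones
n <- 3
ranked <- DEG_toy[order(DEG_toy$log2FoldChange, decreasing = TRUE), ]
my_top_up <- rownames(head(ranked, n))

# Genes that appear in both lists
shared <- intersect(my_top_up, paper_top_up)
shared
```

With the real data, `n` would be 20 and the same comparison could be repeated for the down-regulated list by sorting in increasing order.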
Try it with the paper's own data?
The paper provides processed data on GEO.
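Since the processed file is an FPKM table rather than raw counts, it cannot go straight into DESeq2, which expects raw counts. A rough, descriptive alternative is a log2 fold change of group means. The sketch below uses a toy matrix; the sample column names and the pseudocount of 1 are assumptions, not taken from the GEO file, and it gives no p-values.

```r
# Toy FPKM matrix (hypothetical values); rows are genes, columns are samples
fpkm <- matrix(
  c(7, 7, 7, 1, 1, 1,
    3, 3, 3, 3, 3, 3),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("geneA", "geneB"),
                  c("MCF7pR1", "MCF7pR2", "MCF7pR3",
                    "MCF7pS1", "MCF7pS2", "MCF7pS3"))
)

# Pick out the two groups by column name (assumed naming scheme)
pR <- grepl("pR", colnames(fpkm))  # resistant samples
pS <- grepl("pS", colnames(fpkm))  # sensitive samples

# log2 fold change of mean FPKM, pR over pS, with a pseudocount of 1
log2fc <- log2((rowMeans(fpkm[, pR]) + 1) / (rowMeans(fpkm[, pS]) + 1))
log2fc
```

Here the ratio is pR over pS, so positive values mean higher expression in the resistant cells, which is the direction the paper's sentence describes.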