一、读取文件,ID转换
1.读取文件
library(clusterProfiler)
library(org.Hs.eg.db)
#读取文件,原始文件中使用空格分割的
go_ythdf2 <- read.table("./goList/RIP YTHDF2 RNAseq.txt",sep=" ")
go_ythdf2 <- t(go_ythdf2)
2.ID转换,ENTREZID是进行GO分析最好的ID类型(针对clusterProfiler)
ID转换用到的是bitr()
函数,bitr()的使用方法:
bitr(geneID, fromType, toType, OrgDb, drop = TRUE)
geneID:一个含有gene_name的矢量
orgDb:人类的注释包是 org.Hs.eg.db
fromType:输入的gene_name的类型
toType:需要转换成的gene_name的类型,可以是多种类型,用大写character类型向量表示
3.gene_name的类型查看
org.Hs.eg.db包含有多种gene_name的类型
#查看org.Hs.eg.db中可以被选择/使用的类型
keytypes(org.Hs.eg.db)
[1] "ACCNUM" "ALIAS" "ENSEMBL" "ENSEMBLPROT" "ENSEMBLTRANS" "ENTREZID"
[7] "ENZYME" "EVIDENCE" "EVIDENCEALL" "GENENAME" "GO" "GOALL"
[13] "IPI" "MAP" "OMIM" "ONTOLOGY" "ONTOLOGYALL" "PATH"
[19] "PFAM" "PMID" "PROSITE" "REFSEQ" "SYMBOL" "UCSCKG"
[25] "UNIGENE" "UNIPROT"
#ID转换,把gene_symbol转换成ENTREZID。
go_ythdf2_id_trance <- bitr(go_ythdf2,fromType = "SYMBOL",toType = "ENTREZID",OrgDb = "org.Hs.eg.db",drop = T)
4.org.Hs.eg.db的相关函数
keytypes():keytypes(x),查看注释包中可以使用的类型
columns():类似于keytypes(),针对org.Hs.eg.db两个函数返回值一致
select():select(x, keys, columns, keytype, ...) eg.
#类似于bitr()的注释方式
select(org.Hs.eg.db, keys=geneids, columns=c("SYMBOL", "GENENAME","GO"), keytype="ENSEMBL")
二、GO注释
函数enrichGO()进行GO富集分析,enrichGO()的使用方法:
enrichGO(gene, OrgDb, keyType = "ENTREZID", ont = "MF", pvalueCutoff = 0.05,
pAdjustMethod = "BH", universe, qvalueCutoff = 0.2, minGSSize = 10,
maxGSSize = 500, readable = FALSE, pool = FALSE)
Arguments:
gene a vector of entrez gene id.
OrgDb OrgDb
keyType keytype of input gene
ont One of "MF", "BP", and "CC" subontologies or 'ALL'.
pvalueCutoff Cutoff value of pvalue.
pAdjustMethod one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none"
universe background genes
qvalueCutoff qvalue cutoff
minGSSize minimal size of genes annotated by Ontology term for testing.
maxGSSize maximal size of genes annotated for testing
readable whether mapping gene ID to gene Name
pool If ont=’ALL’, whether pool 3 GO sub-ontologies
举例:
result_go_ythdf2 <- enrichGO(go_ythdf2_id_trance$ENTREZID,OrgDb = "org.Hs.eg.db",
keyType = "ENTREZID",ont = "ALL",readable = T)
结果处理
画图
dotplot(result_go_ythdf2,showCategory=50)
barplot(result_go_ythdf2,showCategory=20)
保存
df_go_ythdf2 <- as.data.frame(result_go_ythdf2)
write.table(df_go_ythdf2,"df_go_ythdf2.xls",quote = F,sep = "\t")
三、整体代码
library("clusterProfiler")
library("org.Hs.eg.db")
#读取数据
go_ythdf2 <- read.table("./goList/RIP YTHDF2 RNAseq.txt",sep=" ")
go_ythdf2 <- t(go_ythdf2)
#ID转换
go_ythdf2_id_trance <- bitr(go_ythdf2,fromType = "SYMBOL",toType = "ENTREZID",
OrgDb = "org.Hs.eg.db",drop = T)
#GO分析
result_go_ythdf2 <- enrichGO(go_ythdf2_id_trance$ENTREZID,OrgDb = "org.Hs.eg.db",
keyType = "ENTREZID",ont = "ALL",readable = T)
#绘图
dotplot(result_go_ythdf2,showCategory=50)
barplot(result_go_ythdf2,showCategory=20)
#保存
df_go_ythdf2 <- as.data.frame(result_go_ythdf2)
write.table(df_go_ythdf2,"df_go_ythdf2.xls",quote = F,sep = "\t")
我想建立并管理一个高质量的生信&统计相关的微信讨论群,如果你想参与讨论,可以添加微信:veryqun 。我会拉你进群,当然有问题也可以微信咨询我。