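There are many ways to convert gene IDs. For experts it is as routine as eating and drinking, but for beginners it can feel like an insurmountable mountain. I recommend two simple, fast approaches:
1. Web tool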
https://biodbnet-abcc.ncifcrf.gov/db/db2db.php
The interface is point-and-click: whatever you are unsure about, just click on it.
2. R: the org.Hs.eg.db family of packages
suppressMessages(library(org.Hs.eg.db)) # load the package
keytypes(org.Hs.eg.db) # list the supported ID types
[1] "ACCNUM" "ALIAS" "ENSEMBL" "ENSEMBLPROT" "ENSEMBLTRANS" "ENTREZID" "ENZYME"
[8] "EVIDENCE" "EVIDENCEALL" "GENENAME" "GO" "GOALL" "IPI" "MAP"
[15] "OMIM" "ONTOLOGY" "ONTOLOGYALL" "PATH" "PFAM" "PMID" "PROSITE"
[22] "REFSEQ" "SYMBOL" "UCSCKG" "UNIGENE" "UNIPROT"
#### Method 1
# keytype is the ID type of your input; columns are the ID types you want returned.
# your_genes is a placeholder for your own vector of IDs (how to load it is not covered here).
gene_list <- select(org.Hs.eg.db, keys = as.character(your_genes), columns = c("SYMBOL", "ENTREZID"), keytype = "ENSEMBL")
gene_list[1:4,1:3]
ENSEMBL SYMBOL ENTREZID
1 ENSG00000125352 RNF113A 7737
2 ENSG00000048162 NOP16 51491
3 ENSG00000205542 TMSB4X 7114
4 ENSG00000205542 TMSB4X 7114
The result is a data frame, so you can extract whatever rows and columns you need.
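Rows 3 and 4 of the output above are identical, and select() can also return several rows for one input ID when the mapping is one-to-many. Below is a minimal base-R sketch of one way to tidy such a table. It uses a hand-typed copy of the four rows shown above plus one made-up unmapped ID (`ENSG_HYPOTHETICAL` is purely illustrative), and "keep the first hit" is only one possible policy, not a recommendation from the package.

```r
# Toy copy of the table printed above, plus one hypothetical unmapped ID.
# ENSG_HYPOTHETICAL is made up; unmapped keys typically show up as NA.
gene_list <- data.frame(
  ENSEMBL  = c("ENSG00000125352", "ENSG00000048162",
               "ENSG00000205542", "ENSG00000205542", "ENSG_HYPOTHETICAL"),
  SYMBOL   = c("RNF113A", "NOP16", "TMSB4X", "TMSB4X", NA),
  ENTREZID = c("7737", "51491", "7114", "7114", NA),
  stringsAsFactors = FALSE
)

# Drop rows that are exact duplicates of an earlier row
gene_list_unique <- unique(gene_list)

# Drop input IDs that did not map to an Entrez ID
gene_list_mapped <- gene_list_unique[!is.na(gene_list_unique$ENTREZID), ]

# Keep only the first row per input ID (one possible policy for 1:many mappings)
gene_list_first <- gene_list_mapped[!duplicated(gene_list_mapped$ENSEMBL), ]

# Pull a single column out as a character vector
entrez_ids <- gene_list_first$ENTREZID
print(gene_list_first)
```

If you need every mapping rather than the first one, stop after the `unique()` step and keep all remaining rows.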
#### Method 2
# genenames is your own character vector of gene symbols; the result is a named list
entrezIDs <- mget(genenames, org.Hs.egSYMBOL2EG, ifnotfound=NA)
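Because mget() returns a named list rather than a table, it usually needs flattening before further use. The sketch below shows one way to do that in base R. The list is built by hand to imitate the shape of the result (so it runs without Bioconductor), and `NOT_A_GENE` is a made-up symbol standing in for an entry that `ifnotfound=NA` left as NA.

```r
# Hand-built stand-in for the list returned by mget() (shape only).
# "NOT_A_GENE" is a made-up symbol illustrating an unmapped entry.
entrezIDs <- list(TMSB4X = "7114", NOP16 = "51491", NOT_A_GENE = NA)

# One row per (symbol, Entrez ID) pair; a symbol with several IDs gets several rows
id_table <- data.frame(
  SYMBOL   = rep(names(entrezIDs), lengths(entrezIDs)),
  ENTREZID = as.character(unlist(entrezIDs, use.names = FALSE)),
  stringsAsFactors = FALSE
)

# Record the symbols that were not found, then drop them from the table
unmapped <- id_table$SYMBOL[is.na(id_table$ENTREZID)]
id_table <- id_table[!is.na(id_table$ENTREZID), ]

print(id_table)
print(unmapped)
```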
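#### Method 3
library(clusterProfiler)
data(geneList, package="DOSE") # example data: a named numeric vector whose names are Entrez IDs
gene <- names(geneList)[abs(geneList) > 2] # keep genes with an absolute value above 2
gene.df <- bitr(gene, fromType = "ENTREZID",
                toType = c("ENSEMBL", "SYMBOL"),
                OrgDb = org.Hs.eg.db)
head(gene.df)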
#### Method 4
suppressMessages(library(biomaRt))
# Goal: go from genomic regions (chromosome, start, end) to gene names.
# Chromosome names must not carry a "chr" prefix (use "1", not "chr1").
mart <- useEnsembl(biomart = "ensembl",
                   dataset = "hsapiens_gene_ensembl")
check_filters_rt <- listFilters(mart)   # filters you can query by
check_out_rt <- listAttributes(mart)    # attributes you can retrieve
# human_df_sig_pick is your own data frame with chr, start and end columns
results <- getBM(attributes = c("hgnc_symbol", "chromosome_name", "start_position", "end_position", "gene_biotype"),
                 filters = c("chromosome_name", "start", "end"),
                 values = list(human_df_sig_pick[, "chr"], human_df_sig_pick[, "start"], human_df_sig_pick[, "end"]),
                 mart = mart)
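When the data frame holds several regions, passing three separate vectors may not keep each start and end paired with its own chromosome. One alternative, sketched below under that assumption, is to build one "chr:start:end" string per row and pass those to Ensembl's `chromosomal_region` filter. The data frame here is a made-up stand-in for `human_df_sig_pick` with invented coordinates, and the getBM() call is left commented out because it needs a live connection.

```r
# Hypothetical stand-in for human_df_sig_pick; the coordinates are made up.
human_df_sig_pick <- data.frame(
  chr   = c("chr1", "chrX", "7"),
  start = c(1000000, 2500000, 500000),
  end   = c(1200000, 2600000, 750000),
  stringsAsFactors = FALSE
)

# Ensembl uses bare chromosome names ("1", "X"), so strip a leading "chr" if present
chr_clean <- sub("^chr", "", human_df_sig_pick$chr)

# Build one "chr:start:end" string per row.
# format(..., scientific = FALSE) stops 1000000 from being written as "1e+06".
regions <- paste(
  chr_clean,
  format(human_df_sig_pick$start, scientific = FALSE, trim = TRUE),
  format(human_df_sig_pick$end,   scientific = FALSE, trim = TRUE),
  sep = ":"
)
print(regions)

# Assumed usage with the mart object created earlier (not run here):
# results <- getBM(attributes = c("hgnc_symbol", "chromosome_name",
#                                 "start_position", "end_position", "gene_biotype"),
#                  filters = "chromosomal_region",
#                  values = regions,
#                  mart = mart)
```

It is worth checking `listFilters(mart)` to confirm that `chromosomal_region` is available for your dataset before relying on it.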