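Earlier posts covered three ways to obtain TCGA data: TCGA2STAT, TCGAbiolinks, and RTCGA. This post introduces one more package, RTCGAToolbox. It is the one I recommend most, because in my own use it downloaded data the fastest and was the most stable and reliable.
Installing RTCGAToolbox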
## Install from Bioconductor using BiocManager, the current installer
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("RTCGAToolbox")
Reference manual:
http://bioconductor.org/packages/release/bioc/manuals/RTCGAToolbox/man/RTCGAToolbox.pdf
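Worked example
# Install the package
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("RTCGAToolbox")
# Load the package
library(RTCGAToolbox)
# List the cancer datasets (cohorts) that can be downloaded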
> getFirehoseDatasets()
[1] "ACC" "BLCA" "BRCA" "CESC" "CHOL" "COADREAD" "COAD" "DLBC" "ESCA"
[10] "FPPP" "GBMLGG" "GBM" "HNSC" "KICH" "KIPAN" "KIRC" "KIRP" "LAML"
[19] "LGG" "LIHC" "LUAD" "LUSC" "MESO" "OV" "PAAD" "PCPG" "PRAD"
[28] "READ" "SARC" "SKCM" "STAD" "STES" "TGCT" "THCA" "THYM" "UCEC"
[37] "UCS" "UVM"
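Before starting a long download, it can help to confirm that the cohort code you plan to request really is in that list. The following is a minimal sketch in base R. The vector is copied from the output above so the snippet runs without a network connection, and `datasets` and `is_available` are names introduced here for illustration; they are not part of RTCGAToolbox.

```r
# Cohort codes copied from the getFirehoseDatasets() output above.
# In a live session you would use: datasets <- getFirehoseDatasets()
datasets <- c("ACC", "BLCA", "BRCA", "CESC", "CHOL", "COADREAD", "COAD",
              "DLBC", "ESCA", "FPPP", "GBMLGG", "GBM", "HNSC", "KICH",
              "KIPAN", "KIRC", "KIRP", "LAML", "LGG", "LIHC", "LUAD",
              "LUSC", "MESO", "OV", "PAAD", "PCPG", "PRAD", "READ", "SARC",
              "SKCM", "STAD", "STES", "TGCT", "THCA", "THYM", "UCEC",
              "UCS", "UVM")

# Hypothetical helper: TRUE if the cohort code is available.
# The comparison is case-insensitive because the codes are upper case.
is_available <- function(cohort, available = datasets) {
  toupper(cohort) %in% available
}

is_available("brca")   # TRUE
is_available("XYZ")    # FALSE
```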
# List the Firehose run dates (data snapshots) available in the database
> getFirehoseRunningDates()
[1] "20151101" "20150821" "20150601" "20150402" "20150204" "20141206" "20141017" "20140902" "20140715"
[10] "20140614" "20140518" "20140416" "20140316" "20140215" "20140115" "20131210" "20131114" "20131010"
[19] "20130923" "20130809" "20130715" "20130623" "20130606" "20130523" "20130508" "20130421" "20130406"
[28] "20130326" "20130309" "20130222" "20130203" "20130116" "20121221" "20121206" "20121114" "20121102"
[37] "20121024" "20121020" "20121018" "20121004" "20120913" "20120825" "20120804" "20120725" "20120707"
[46] "20120623" "20120606" "20120525" "20120515" "20120425" "20120412" "20120321" "20120306" "20120217"
[55] "20120124" "20120110" "20111230" "20111206" "20111128" "20111115" "20111026"
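The run dates are plain YYYYMMDD strings, so picking a snapshot can be done programmatically rather than by eye. Here is a rough sketch in base R, using a few dates copied from the output above. The names `runDates`, `latest`, and `latest_before` are illustrative assumptions, not functions from the package.

```r
# A few run dates copied from the getFirehoseRunningDates() output above.
# In a live session you would use: runDates <- getFirehoseRunningDates()
runDates <- c("20151101", "20150821", "20150601", "20150402", "20150204")

# Parse the YYYYMMDD strings so they can be compared as real dates
parsed <- as.Date(runDates, format = "%Y%m%d")

# Most recent snapshot, converted back to the string form used by runDate
latest <- format(max(parsed), "%Y%m%d")
latest  # "20151101"

# Hypothetical helper: newest run date on or before a given cutoff
latest_before <- function(dates, cutoff) {
  d <- as.Date(dates, format = "%Y%m%d")
  keep <- d[d <= as.Date(cutoff, format = "%Y%m%d")]
  if (length(keep) == 0) stop("no run date on or before ", cutoff)
  format(max(keep), "%Y%m%d")
}

latest_before(runDates, "20150501")  # "20150402"
```

Pinning a specific run date, instead of always taking the newest one, keeps an analysis reproducible when new snapshots appear.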
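# Download the data you need, using breast cancer (BRCA) as the example. Once the download finishes, the files are placed directly in your working directory. Download speed varies by location; in my case it took quite a while to finish.
# Note: newer releases of the package appear to spell the Clinic argument as `clinical`; check ?getFirehoseData for your installed version.
brcaData = getFirehoseData(dataset="BRCA", runDate="20150402", forceDownload=TRUE,
                           Clinic=TRUE, Mutation=TRUE)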
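Because the download can be slow and may fail partway, one option is to wrap the call so that bad input is caught before any network access and a failed download does not abort the whole script. The sketch below is only an illustration: `safe_firehose`, its checks, and its messages are hypothetical and not part of RTCGAToolbox. It relies only on `getFirehoseDatasets()` and `getFirehoseData()`, which appear above, and passes any data-type flags through unchanged.

```r
# Hypothetical wrapper around getFirehoseData(); the name, checks and
# messages are illustrative and not part of RTCGAToolbox.
safe_firehose <- function(cohort, run_date, ...) {
  # Validate the run date format before any network access
  if (!grepl("^[0-9]{8}$", run_date))
    stop("run_date must look like YYYYMMDD, got: ", run_date)
  if (!requireNamespace("RTCGAToolbox", quietly = TRUE))
    stop("RTCGAToolbox is not installed")
  # This lookup queries the Firehose server, so it needs a connection
  if (!cohort %in% RTCGAToolbox::getFirehoseDatasets())
    stop("unknown cohort: ", cohort)

  # On a download error, warn and return NULL instead of stopping
  tryCatch(
    RTCGAToolbox::getFirehoseData(dataset = cohort, runDate = run_date, ...),
    error = function(e) {
      warning("download failed: ", conditionMessage(e))
      NULL
    }
  )
}

# Usage (needs a network connection, so it is left commented out):
# brcaData <- safe_firehose("BRCA", "20150402", Mutation = TRUE)
```

Returning `NULL` on failure lets a loop over several cohorts continue and retry the failed ones later.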
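Follow the code above and you will have the TCGA data in hand. Then go run your awesome experiments and publish first-rate work. Good luck, and congratulations!
Personal take
I strongly recommend this method for downloading TCGA data, because it makes your downloads more dependable. By dependable I mean stable and fast.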