library(TCGAbiolinks)
# Retrieve metadata for all GDC projects, such as the tumor type
GDCprojects <- getGDCprojects()
saveRDS(object = GDCprojects,
file = "G:\\GDCdata\\GDCprojects.rds")
Build a query
GDCquery(
project,
data.category,
data.type,
workflow.type,
legacy = FALSE,
access,
platform,
file.type,
barcode,
data.format,
experimental.strategy,
sample.type
)
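project: the tumor type (GDC project ID). The accepted values can be listed with TCGAbiolinks:::getGDCprojects()$project_id
data.category: the accepted values can be looked up with TCGAbiolinks:::getProjectSummary("TCGA-BRCA"), where TCGA-BRCA is one of the project IDs returned by the previous query
It is usually one of the following: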
- Biospecimen
- Clinical
- Copy Number Variation
- DNA Methylation
- Sequencing Reads
- Simple Nucleotide Variation
- Transcriptome Profiling
sample.type: the sample type. Any value from the three columns is accepted
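As a rough illustration of how these parameters fit together, the sketch below lists the available project IDs, inspects one project, and builds a query named query_TCGA, the name used in the download step. The specific values ("TCGA-BRCA", "Gene Expression Quantification", "STAR - Counts", and the two sample types) are assumptions chosen for illustration. The accepted workflow.type values depend on the installed TCGAbiolinks version and the current GDC data release, and all of these calls need network access to the GDC API.

```r
library(TCGAbiolinks)

# List the project IDs that can be passed as `project`.
project_ids <- getGDCprojects()$project_id
print(head(project_ids))

# Inspect one project to see which data categories it offers.
# "TCGA-BRCA" is an assumed example project.
brca_summary <- getProjectSummary("TCGA-BRCA")
print(brca_summary)

# Build the query. All argument values below are illustrative assumptions;
# adjust them to the project and data release you are working with.
query_TCGA <- GDCquery(
  project = "TCGA-BRCA",
  data.category = "Transcriptome Profiling",
  data.type = "Gene Expression Quantification",
  workflow.type = "STAR - Counts",
  sample.type = c("Primary Tumor", "Solid Tissue Normal")
)
```

Only project is strictly needed to identify the cohort; the remaining arguments narrow the set of files that will later be downloaded.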
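Download and save
# Download the files referenced by the query
GDCdownload(query = query_TCGA, directory = "G:\\GDCdata")
# Load the downloaded files; the directory must match the one used for the download
tcga_data <- GDCprepare(query_TCGA, directory = "G:\\GDCdata")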
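Save the query results. Note that this is not the actual data but a description of what the query matched, roughly equivalent to pData
data <- getResults(query_TCGA)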
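To actually persist this description, one option is to store it with saveRDS, as was done for the project list. The following is a minimal sketch under stated assumptions: the fallback query arguments, the selected columns ("cases" and "sample_type"), and the output file name query_results.rds are illustrative choices, not part of the original workflow.

```r
library(TCGAbiolinks)

# Reuse query_TCGA if it already exists; otherwise build an assumed example query.
if (!exists("query_TCGA")) {
  query_TCGA <- GDCquery(
    project = "TCGA-BRCA",
    data.category = "Transcriptome Profiling",
    data.type = "Gene Expression Quantification",
    workflow.type = "STAR - Counts"
  )
}

# Full description of the matched files (one row per file).
query_results <- getResults(query_TCGA)
print(dim(query_results))

# Optionally keep only a few columns, here the barcode and the sample type.
sample_info <- getResults(query_TCGA, cols = c("cases", "sample_type"))
print(head(sample_info))

# Save the description next to the downloaded data (hypothetical file name).
saveRDS(object = query_results,
        file = "G:\\GDCdata\\query_results.rds")
```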