Journal: Clinical Cancer Research
Introduction
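PD-1/PD-L1 blockade has shown unprecedented potential in the clinical treatment of patients with NSCLC (non-small cell lung cancer). However, only a minority of patients respond to current immunotherapy, and the mechanisms that drive drug sensitivity or resistance are not fully understood. Identifying biomarkers that can raise the response rate is therefore very important. Existing studies have described tumor mutational load, DNA mismatch repair deficiency, the intensity of CD8+ cell infiltrates, and intratumoral PD-L1 expression as biomarkers of response to anti-PD-1/PD-L1 therapy. Because these factors are interrelated and are often found together in the same individual, is there some other factor that stimulates (regulates) all of them at once? If such a factor exists, it would be more valuable than the four biomarkers above, because it reflects a more fundamental underlying driver. This question is the core of the paper.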
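Several studies have shown that TP53 and STK11 mutations are very common in LUAD and frequently co-occur with KRAS mutations. Based on this, the authors hypothesize that activating specific oncogenic pathways has broad effects on gene expression, so mutations in the genome can be expected to have an important impact on the tumor microenvironment. Some studies have also found that TP53- or KRAS-mutant NSCLC expresses higher levels of PD-L1 protein than wild-type tumors. Loss of TP53 function increases genomic instability and is linked to DNA damage repair, which is consistent with TP53-mutant tumors carrying a higher mutational burden.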
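Taken together, this background shows that TP53/KRAS are closely linked to the four biomarkers mentioned earlier, so the authors speculate that TP53/KRAS are the "other factor" they set out to find. Analyzing and understanding these mutations could distinguish patient subtypes (allowing more targeted treatment) and thereby improve the efficacy of immunotherapy.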
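Study Objective
Clinical studies have shown that targeting the PD-1/PD-L1 pathway holds great promise for treating NSCLC, but the factors that identify and distinguish the patient subgroups that respond to immunotherapy are not yet fully understood.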
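Experimental Design
Data from multiple dimensions (genomic, transcriptomic, proteomic, and clinical) were integrated and analyzed, drawing on public LUAD databases (discovery set), an in-house database (validation set), and data from patients treated with immunotherapy. Gene set enrichment analysis (GSEA) was used to determine potentially relevant gene expression signatures in specific patient subgroups.
Methods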
Clinical Cohorts
- TCGA: 462 patients with mRNA expression profiling and gene mutation data.
- GSE72094: 442 patients with detailed mRNA expression data and EGFR/KRAS/TP53/STK11 Sanger sequencing analysis.
- Broad cohort: 183 lung adenocarcinomas and matched normal tissues with detailed information about mutation load and mutation spectrum.
- A total of 85 lung adenocarcinomas from the Guangdong Lung Cancer Institute (GLCI), Guangdong General Hospital (GCH), underwent whole genome sequencing (WGS).
Immunotherapeutic patients
Clinical and mutation data for 34 NSCLC (29 ADC) patients were retrieved from cBioPortal (http://www.cbioportal.org/study.do?cancer_study_id=luad_mskcc_2015). Another group of 20 NSCLC (15 ADC) patients was collected prospectively at the Guangdong Lung Cancer Institute from August 2015 to August 2016. Tumor specimens were obtained for Sanger sequencing and IHC analysis.
mRNA Expression Profiling and Reverse Phase Protein Array (RPPA) analysis
Gene expression data for the GSE72094 lung adenocarcinomas have been deposited in the GEO repository (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE72094). Proteomic analysis was based on Reverse Phase Protein Array (RPPA) data from the TCGA database. The RPPA methodology and data analysis pipeline have been previously described (ref 21). For TCGA, level 3 data were downloaded directly from the TCGA portal and used in subsequent analyses.
Mutation Data Analysis
For the discovery set, somatic mutation data (level 2) of the 462 lung adenocarcinomas were retrieved from the TCGA data portal (https://tcga-data.nci.nih.gov/tcga/findArchives.htm). To assess the mutational load, the number of mutated genes carrying at least one nonsynonymous mutation in the coding region was computed for each tumor. Somatic mutation data of the 183 lung adenocarcinomas in the Broad cohort were retrieved from cBioPortal (http://www.cbioportal.org/study.do?cancer_study_id=luad_broad). Somatic substitutions and covered bases within their trinucleotide sequence context were analyzed to characterize the mutation spectrum of the 183 lung adenocarcinomas. The mutation spectrum for each sample was calculated as the percentage of each of six possible single nucleotide changes (AT>CG, AT>GC, AT>TA, GC>AT, GC>CG, GC>TA) among all single nucleotide substitutions. The most frequent mutation signatures were C→T transitions and C→A transversions.
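The two per-tumor quantities defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record fields (`sample`, `gene`, `classification`, `ref`, `alt`), the example mutations, and the set of variant classes treated as nonsynonymous are all hypothetical choices, not taken from the paper.

```python
from collections import Counter, defaultdict

# Assumption: the text does not list which variant classes count as
# "nonsynonymous"; this set is an illustrative choice.
NONSYNONYMOUS = {
    "Missense_Mutation", "Nonsense_Mutation", "Nonstop_Mutation",
    "Frame_Shift_Del", "Frame_Shift_Ins", "In_Frame_Del", "In_Frame_Ins",
}

# Each of the six classes covers a base change and its reverse-complement
# equivalent, e.g. GC>TA covers both G>T and C>A.
CLASS_OF = {
    ("A", "C"): "AT>CG", ("T", "G"): "AT>CG",
    ("A", "G"): "AT>GC", ("T", "C"): "AT>GC",
    ("A", "T"): "AT>TA", ("T", "A"): "AT>TA",
    ("G", "A"): "GC>AT", ("C", "T"): "GC>AT",
    ("G", "C"): "GC>CG", ("C", "G"): "GC>CG",
    ("G", "T"): "GC>TA", ("C", "A"): "GC>TA",
}
CLASSES = ["AT>CG", "AT>GC", "AT>TA", "GC>AT", "GC>CG", "GC>TA"]


def mutational_load(records):
    """Count genes with at least one nonsynonymous mutation, per sample."""
    genes = defaultdict(set)
    for r in records:
        if r["classification"] in NONSYNONYMOUS:
            genes[r["sample"]].add(r["gene"])  # a gene is counted once
    return {sample: len(g) for sample, g in genes.items()}


def mutation_spectrum(records):
    """Percentage of each substitution class among all SNVs, per sample."""
    counts = defaultdict(Counter)
    for r in records:
        cls = CLASS_OF.get((r["ref"], r["alt"]))
        if cls is not None:  # skips indels and other non-SNV records
            counts[r["sample"]][cls] += 1
    return {
        sample: {cls: 100.0 * c[cls] / sum(c.values()) for cls in CLASSES}
        for sample, c in counts.items()
    }


# Hypothetical records for two tumors.
records = [
    {"sample": "T1", "gene": "TP53", "classification": "Missense_Mutation", "ref": "C", "alt": "A"},
    {"sample": "T1", "gene": "TP53", "classification": "Nonsense_Mutation", "ref": "G", "alt": "T"},
    {"sample": "T1", "gene": "KRAS", "classification": "Missense_Mutation", "ref": "G", "alt": "T"},
    {"sample": "T1", "gene": "EGFR", "classification": "Silent", "ref": "C", "alt": "T"},
    {"sample": "T2", "gene": "STK11", "classification": "Frame_Shift_Del", "ref": "CG", "alt": "C"},
    {"sample": "T2", "gene": "KEAP1", "classification": "Missense_Mutation", "ref": "A", "alt": "G"},
]
load = mutational_load(records)
spectrum = mutation_spectrum(records)
print(load)
print(spectrum["T1"])
```

Note that the load counts genes rather than individual mutations, so the two TP53 hits in the hypothetical tumor T1 contribute once, while the spectrum uses every single nucleotide substitution, silent ones included.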
For the validation set (GLCI), we conducted whole-exome sequencing of DNA from tumors and matched normal blood from 85 lung adenocarcinoma patients. Enriched exome libraries were sequenced on the HiSeq 2000 platform (Illumina) to >100X coverage. Alignment, base-quality score recalibration, and duplicate-read removal were performed, germline variants were excluded, mutations were annotated, and indels were evaluated as previously described (ref 4, 5, 24). Mutations between clinical groups were compared using the Mann-Whitney test. Note: a nonparametric test is used here, which makes sense because mutation counts are poorly suited to parametric tests.
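To make the Mann-Whitney comparison concrete, here is a minimal pure-Python sketch. It uses the normal approximation with tie and continuity corrections, which is an assumption of this sketch: statistical packages may report an exact p-value for small samples, so numbers can differ slightly. The mutation counts are invented for illustration.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation).

    Returns (U statistic for x, approximate two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    # Assign midranks to tied values and accumulate the tie correction term.
    ranks = [0.0] * len(pooled)
    tie_term = 0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j

    rank_sum_x = sum(r for r, (_, grp) in zip(ranks, pooled) if grp == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0

    n = n1 + n2
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0  # all values identical
    z = max(abs(u_x - mean_u) - 0.5, 0.0) / math.sqrt(var_u)
    p = math.erfc(z / math.sqrt(2.0))  # two-sided tail of the normal
    return u_x, p


# Hypothetical mutation counts for mutant vs. wild-type tumors.
mutant = [12, 15, 20, 31, 44]
wild_type = [3, 5, 7, 9, 10]
u, p = mann_whitney_u(mutant, wild_type)
print(u, p)
```

Because the test works on ranks only, a handful of hypermutated tumors cannot dominate the comparison the way they would in a t-test on raw counts.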
Gene Set Enrichment Analysis (GSEA) and Pathway Analysis
For Gene Set Enrichment Analysis (GSEA) (ref 25), the javaGSEA Desktop Application was downloaded from http://software.broadinstitute.org/gsea/index.jsp. GSEA was used to associate the gene signature with the TP53 or KRAS mutation status (TP53-mut vs. TP53-wt; KRAS-mut vs. KRAS-wt). The genes identified to be on the leading edge of the enrichment profile were subjected to pathway analysis. Fold change values were exported for all genes and analyzed with version 2.2.0 of GSEA, using the GseaPreranked module. The normalized enrichment score (NES) is the primary statistic for examining gene set enrichment results. The nominal P value estimates the statistical significance of the enrichment score. A gene set with nominal P ≤ 0.05 was considered to be significantly enriched in genes.
Immunohistochemistry and Sanger sequencing are skipped here.
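For intuition about what a preranked GSEA run computes, the sketch below implements the weighted running-sum enrichment score and a gene-set permutation null in plain Python. It is a simplified illustration, not the javaGSEA implementation: the gene names, fold changes, permutation count, and seed are hypothetical, and details such as multiple-testing correction are omitted.

```python
import random


def enrichment_score(ranked, gene_set, weight=1.0):
    """Running-sum enrichment score for a list of (gene, score) pairs
    sorted by score in descending order."""
    hits = [gene in gene_set for gene, _ in ranked]
    n_miss = len(ranked) - sum(hits)
    hit_norm = sum(abs(score) ** weight
                   for (_, score), hit in zip(ranked, hits) if hit)
    if hit_norm == 0 or n_miss == 0:
        raise ValueError("need nonzero-scoring hits and at least one miss")
    running = best = 0.0
    for (_, score), hit in zip(ranked, hits):
        # Step up at set members (weighted by |score|), down otherwise.
        running += abs(score) ** weight / hit_norm if hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running  # keep the maximum deviation from zero
    return best


def preranked_gsea(ranked, gene_set, n_perm=1000, seed=0):
    """Return (ES, NES, nominal p) using random gene sets of equal size."""
    rng = random.Random(seed)
    genes = [gene for gene, _ in ranked]
    size = sum(1 for gene in genes if gene in gene_set)
    es = enrichment_score(ranked, gene_set)
    null = [enrichment_score(ranked, set(rng.sample(genes, size)))
            for _ in range(n_perm)]
    # Compare only against null scores with the same sign as the observed ES.
    same_sign = [abs(e) for e in null if (e > 0) == (es > 0)]
    if not same_sign:
        return es, float("nan"), float("nan")
    nes = es / (sum(same_sign) / len(same_sign))
    p = sum(1 for e in same_sign if e >= abs(es)) / len(same_sign)
    return es, nes, p


# Hypothetical fold changes (e.g. TP53-mut vs. TP53-wt), already ranked.
ranked = [("G1", 3.0), ("G2", 2.0), ("G3", 1.0),
          ("G4", -1.0), ("G5", -2.0), ("G6", -3.0)]
es, nes, p = preranked_gsea(ranked, {"G1", "G2"})
print(es, nes, p)
```

Dividing the observed score by the mean of the same-sign null scores is what makes the NES comparable across gene sets of different sizes.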
Statistical Analyses
Statistical analyses were conducted using GraphPad Prism (version 7.01, La Jolla, CA) and SPSS version 22.0 (SPSS, Inc., Chicago, IL). Scatter dot plots and box-and-whisker plots indicate the median and 95% confidence interval (CI). Statistical tests used to analyze the clinical and genomic data included the Mann-Whitney U, Chi-square, Fisher’s exact, and Kruskal-Wallis tests. Kaplan-Meier curves of progression-free survival (PFS) were compared using the log-rank test. All reported P values are two-tailed, and for all analyses, P ≤ 0.05 is considered statistically significant, unless otherwise specified.
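As an illustration of how two PFS curves are compared, the following sketch implements a two-group log-rank test in plain Python. The PFS times and event indicators are invented, and the p-value relies on the chi-square approximation with one degree of freedom; GraphPad Prism or SPSS output may differ in minor details.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test.

    times_*: follow-up times; events_*: 1 for progression, 0 for censored.
    Returns (chi-square statistic, p-value with 1 degree of freedom).
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        at_risk_a = sum(1 for ti, _, g in data if g == 0 and ti >= t)
        at_risk_b = sum(1 for ti, _, g in data if g == 1 and ti >= t)
        deaths_a = sum(1 for ti, e, g in data if g == 0 and ti == t and e)
        deaths_b = sum(1 for ti, e, g in data if g == 1 and ti == t and e)
        n = at_risk_a + at_risk_b
        d = deaths_a + deaths_b
        observed_a += deaths_a
        expected_a += d * at_risk_a / n  # expected events under the null
        if n > 1:
            variance += (d * (at_risk_a / n) * (at_risk_b / n)
                         * (n - d) / (n - 1))

    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    p = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square survival, 1 df
    return chi2, p


# Hypothetical PFS in months for two mutation-defined groups.
chi2, p = logrank_test([2, 3, 4, 5], [1, 1, 1, 1],
                       [8, 10, 12, 14], [1, 1, 0, 1])
print(chi2, p)
```

Censored patients still count toward the at-risk totals until their last follow-up, which is why the test can use incomplete observations instead of discarding them.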