Summary of Immune Infiltration Analysis Methods

These notes collect the key points from the papers I have read over the past few days.

1. Deconvolution methods

CIBERSORT

CIBERSORT is a method published in Nature Methods in 2015; at the time of writing its citation count was listed as 84. The abstract describes it as:

a method for characterizing cell composition of complex tissues from their gene expression profiles. When applied to enumeration of hematopoietic subsets in RNA mixtures from fresh, frozen, and fixed tissues, including solid tumors, CIBERSORT outperformed other methods with respect to noise, unknown mixture content, and closely related cell types. CIBERSORT should enable large-scale analysis of RNA mixtures for cellular biomarkers and therapeutic targets (http://cibersort.stanford.edu).

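The TCIA website provided by the Cell Reports paper supplies cell-fraction data for LM22. I did not understand these data at first, but the CIBERSORT paper explains them in detail: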

To assess the feasibility of leukocyte deconvolution from bulk tumors, we designed and validated a leukocyte gene signature matrix, termed LM22. It contains 547 genes that distinguish 22 human hematopoietic cell phenotypes, including seven T cell types, naïve and memory B cells, plasma cells, NK cells, and myeloid subsets (Supplementary Table 1, Supplementary Fig. 2, and Online Methods). Cell subsets can be further grouped into 11 major leukocyte types based on shared lineage (Supplementary Table 1).

The end of the paper notes:

We anticipate that CIBERSORT will prove valuable for analysis of cellular heterogeneity in microarray or RNA-Seq data derived from fresh, frozen, and fixed specimens, thereby complementing methods that require living cells as input.

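This suggests that when the paper was published, CIBERSORT had not yet been applied to RNA-seq data (TIMER cites this as one of its own advantages). However, judging from the TCIA website built by the Cell Reports paper "Pan-cancer Immunogenomic Analyses Reveal Genotype-Immunophenotype Relationships and Predictors of Response to Checkpoint Blockade", CIBERSORT has since been applied to the TCGA RNA-seq data.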

TIMER

"Comprehensive analyses of tumor immunity: implications for cancer immunotherapy" is a 2016 Genome Biology paper that presents a method for estimating immune infiltration.

Its main result is:

We developed a computational method to estimate the abundance of six tumor-infiltrating immune cell types (B cells, CD4 T cells, CD8 T cells, neutrophils, macrophages, and dendritic cells) to study 23 cancer types in The Cancer Genome Atlas (TCGA)

The overall workflow is as follows:

The prerequisite data include tumor purity, tumor gene expression profiles (as transcript per million reads (TPM)) from TCGA, and an external reference dataset of purified immune cells. Tumor purity is critical to selecting genes informative for deconvolving immune cells in the tumor tissue and was inferred from copy number alteration data using an R package, CHAT, we have developed [15] (Fig. 1a). Our purity estimation method has been validated using diluted series with known tumor/normal mixture ratios and agreed with previous methods and orthogonal estimations [16]. The distributions of tumor purity showed large variations among different samples across most of the 23 TCGA cancer types (Additional file 1: Figure S1). For each cancer dataset, batch effects between TCGA and the external reference data sets were removed using ComBat [17] (Fig. 1b). Next we selected genes whose expression levels are negatively correlated with tumor purity (Fig. 1c; Additional file 1: Figure S2; Additional file 2: Table S1), an indication that these genes are expressed by stromal cells in the tumor microenvironment. Across all 23 cancers, informative genes selected from the above steps are significantly enriched for a predefined immune signature [18] (Fig. 1d). This result indicates that large numbers of immune cell-specific genes are highly expressed in the tumor microenvironment. Finally, we used constrained least squares fitting [19] on the informative immune signature genes to infer the abundance of the six immune cell types (Fig. 1e).
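The gene-selection step in the workflow above (keeping genes whose expression is negatively correlated with tumor purity) can be sketched roughly as follows. This is only an illustration in plain Python: the gene names, the toy values, and the correlation cutoff of -0.2 are assumptions made for the example, not the criteria used by the TIMER authors.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant vector carries no correlation information
    return cov / (sx * sy)


def purity_informative_genes(expression, purity, threshold=-0.2):
    """Return genes whose expression across samples correlates negatively with purity.

    `threshold` is a hypothetical cutoff chosen for this sketch.
    """
    return sorted(
        gene
        for gene, values in expression.items()
        if pearson(values, purity) < threshold
    )


# Toy data: four samples with decreasing tumor purity (hypothetical values).
purity = [0.9, 0.7, 0.5, 0.3]
expression = {
    "GENE_IMMUNE": [1.0, 3.0, 5.0, 7.0],  # rises as purity falls
    "GENE_TUMOR": [8.0, 6.0, 4.0, 2.0],   # falls as purity falls
    "GENE_FLAT": [2.0, 2.0, 2.0, 2.0],    # unrelated to purity
}

selected = purity_informative_genes(expression, purity)
print(selected)
```

Only the gene whose expression rises as purity falls is kept, which matches the intuition in the quoted passage that such genes are expressed by non-tumor cells in the microenvironment.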

The paper points out the advantages of its method over CIBERSORT:

Our work first provided a systematic prognostic landscape of different tumor-infiltrating immune cells in diverse cancer types. We compared our results with two recent studies on the same topic [6, 11]. The method used in Gentles et al., CIBERSORT [35], is currently only applicable to microarray data, thus unable to analyze the TCGA RNA-seq data. Therefore, our immune component estimation is a unique addition to TCGA for future integrative analyses of tumor–immune interactions. By including more immune cell types into regression, CIBERSORT inference also suffered from statistical co-linearity that might have resulted in biased estimations (Additional file 7: Table S6; Additional file 8). Due to this limitation, although Gentles et al. studied more cell types, they reported few significant prognostic immune predictors, without correction for other clinical confounders. In contrast, we observed many more significant clinical associations with the correction of multiple cofactors.

Notably, the Methods section at the end of the paper describes the computation in detail and emphasizes the following:

To note, coefficients f estimated using this method are the relative abundance of immune cells. The scale of the estimation of an individual immune cell type is determined by the variance of the corresponding reference data Xr. Therefore, f are not comparable between cancer types or different immune cells. Source codes for TIMER and downstream statistical analysis as well as related data files are available at http://cistrome.org/TIMER/download.html.

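In other words, the final output is a coefficient that reflects only the relative abundance of immune cells; it cannot be compared across cancer types or across immune cell types.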
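To see why the coefficients f depend on the scale of the reference data Xr, here is a minimal sketch of a non-negative least squares fit. It uses plain projected gradient descent as a stand-in; it is not the constrained least squares routine cited in the TIMER paper, and the reference matrix and fractions are made-up toy numbers.

```python
def predict(X, f):
    """Mixture expression per gene: X (genes x cell types) times f."""
    return [sum(x * w for x, w in zip(row, f)) for row in X]


def nnls_projected_gradient(X, y, n_iter=5000):
    """Minimise 0.5 * ||X f - y||^2 subject to f >= 0 by projected gradient descent."""
    n_types = len(X[0])
    # The trace of X^T X bounds its largest eigenvalue, so 1/trace is a safe step size.
    step = 1.0 / sum(v * v for row in X for v in row)
    f = [0.0] * n_types
    for _ in range(n_iter):
        residual = [p - t for p, t in zip(predict(X, f), y)]
        grad = [sum(row[k] * r for row, r in zip(X, residual)) for k in range(n_types)]
        # Gradient step followed by projection onto the non-negative orthant.
        f = [max(0.0, w - step * g) for w, g in zip(f, grad)]
    return f


# Hypothetical reference profiles: 4 genes x 2 immune cell types.
X_ref = [[10.0, 1.0], [8.0, 2.0], [1.0, 9.0], [2.0, 7.0]]
true_f = [0.3, 0.5]
y_tumor = predict(X_ref, true_f)

f_hat = nnls_projected_gradient(X_ref, y_tumor)

# Same biology, but the second cell type's reference is measured on a 10x larger scale.
X_scaled = [[row[0], row[1] * 10.0] for row in X_ref]
f_scaled = nnls_projected_gradient(X_scaled, y_tumor)

print(f_hat, f_scaled)
```

With the rescaled reference, the second coefficient shrinks by roughly a factor of ten while the first stays the same, so comparing coefficients between cell types (or between datasets with different references) is not meaningful.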

2. GSEA

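In my view, the paper most worth attention is the Cell Reports article "Pan-cancer Immunogenomic Analyses Reveal Genotype-Immunophenotype Relationships and Predictors of Response to Checkpoint Blockade". It was published in 2017, and its background section already discusses the two deconvolution methods above.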

The paper's view of TIMER is:

RNA expression data corrected for tumor purity were used to estimate infiltration of B cells, CD4+ T cells, CD8+ T cells, neutrophils, macrophages, and dendritic cells (Li et al., 2016). However, although such analyses of few major cell types are helpful for identifying clinical associations, higher resolution of the TIL landscape is required in order to dissect tumor-immune cell interactions and identify prognostic and predictive markers.

The paper therefore provides a finer-grained picture of the immune infiltrate, covering nearly 30 cell types:

Thus, it is of utmost importance to provide a comprehensive view of the intratumoral immune landscape including memory cells, cytotoxic cells (CD8+ T cells, natural killer [NK] cells, and NK T [NKT] cells), as well as immunosuppressive cells (Tregs and myeloid-derived suppressor cells [MDSCs]).

we estimated 28 subpopulations of TILs including major types related to adaptive immunity: activated T cells, central memory (Tcm), effector memory (Tem) CD4+ and CD8+ T cells, gamma delta T (Tγδ) cells, T helper 1 (Th1) cells, Th2 cells, Th17 cells, regulatory T cells (Treg), follicular helper T cells (Tfh), activated, immature, and memory B cells, as well as cell types related to innate immunity, such as macrophages, monocytes, mast cells, eosinophils, neutrophils, activated, plasmacytoid, and immature dendritic cells (DCs), NK cells, natural killer T (NKT) cells, and MDSCs.

For TCIA, the authors also computed immune infiltration with the CIBERSORT deconvolution method, by converting the TCGA RNA-seq data into microarray-like data that CIBERSORT can process.

The key point is that the paper offers a new way to estimate the composition of infiltrating immune cells: gene set enrichment analysis.

The paper describes the method and its advantages as follows:

Our approach is based on the use of metagenes, i.e., non-overlapping sets of genes that are representative for specific immune cell subpopulations and are neither expressed in CRC cell lines nor in normal tissue. The expression of these sets of metagenes is then used to analyze statistical enrichment using gene set enrichment analysis (GSEA). The advantage of the metagene approach is the robustness of the method due to two characteristics: (1) the use of a set of genes instead of single genes that represent one immune subpopulation, because the use of single genes as markers for immune subpopulations can be misleading as many genes are expressed in different cell types; and (2) the assessment of relative expression changes of a set of genes in relation to the expression of all other genes in a sample. Thus, the calculations are less sensitive to noise resulting from sample impurity or sample preparation compared with the deconvolution methods.
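The enrichment idea described above can be illustrated with a simplified running-sum enrichment score in the style of GSEA. This sketch covers only the raw score for one sample and one gene set; the normalization to NES and the FDR estimate require permutations and are not shown. The gene names and expression values are illustrative placeholders, not the metagenes defined in the paper.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Running-sum enrichment score of `gene_set` over a ranked gene list.

    Walking down the ranking, the sum rises at genes in the set (weighted by
    their score) and falls at genes outside it; the score is the largest
    deviation from zero.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    hit_total = sum(abs(s) ** weight for s, h in zip(scores, hits) if h)
    if hit_total == 0 or n_miss == 0:
        return 0.0  # the score is undefined for empty or all-inclusive sets
    running, best = 0.0, 0.0
    for s, h in zip(scores, hits):
        if h:
            running += abs(s) ** weight / hit_total
        else:
            running -= 1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical expression values for one sample.
expr = {"CD8A": 9.0, "GZMB": 8.0, "ACTB": 5.0, "GAPDH": 4.0, "ALB": 1.0, "KRT5": 0.5}
ranked = sorted(expr, key=expr.get, reverse=True)  # highest expression first
scores = [expr[g] for g in ranked]

es_top = enrichment_score(ranked, scores, {"CD8A", "GZMB"})    # set at the top of the ranking
es_bottom = enrichment_score(ranked, scores, {"ALB", "KRT5"})  # set at the bottom
print(es_top, es_bottom)
```

A gene set concentrated at the top of the ranking gets a positive score and one concentrated at the bottom gets a negative score, which is why the score depends on the set's position relative to all other genes rather than on any single marker.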

Results from both methods are available on TCIA:

We built a model to transform RNA-sequencing data to microarray data considering the TCGA samples for which microarray and RNA-sequencing data were available (n = 550; see Experimental Procedures). We present here only the results from the GSEA method, which depicts a more comprehensive picture of the tumor-suppressive or tumor-promoting roles of TILs. However, we make both GSEA and deconvolution data available on the TCIA website.

[Figure: Enrichment bubble plot]

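This method reports the enrichment of immune cells in patient samples (the figure above shows absolute counts; relative proportions can be shown instead). Samples are pre-filtered by NES (averaged normalized enrichment score) and FDR, with default thresholds of NES > 0 and Q-value (FDR) < 0.1. The corresponding file can be downloaded; it contains three columns: sample name (Barcode), NES, and Q-value.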
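Applying the default pre-filter (NES > 0 and Q-value < 0.1) to a downloaded file could look like the sketch below. The header names `Barcode`, `NES`, and `Q-value`, the tab delimiter, and the sample identifiers are assumptions for illustration; the actual layout of the TCIA download should be checked before reusing this.

```python
import csv
import io


def filter_enriched(rows, nes_min=0.0, q_max=0.1):
    """Return barcodes of samples that pass NES > nes_min and Q-value < q_max."""
    return [
        row["Barcode"]
        for row in rows
        if float(row["NES"]) > nes_min and float(row["Q-value"]) < q_max
    ]


# Stand-in for a downloaded file (hypothetical barcodes and values).
toy_file = io.StringIO(
    "Barcode\tNES\tQ-value\n"
    "SAMPLE-01\t1.85\t0.02\n"
    "SAMPLE-02\t-1.20\t0.01\n"
    "SAMPLE-03\t0.90\t0.35\n"
    "SAMPLE-04\t2.10\t0.09\n"
)

rows = list(csv.DictReader(toy_file, delimiter="\t"))
enriched = filter_enriched(rows)
print(enriched)
```

In this toy input, SAMPLE-02 is dropped for a negative NES and SAMPLE-03 for a Q-value above the cutoff. For a real file, replace the `io.StringIO` object with an opened file handle.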
