Author: Evil Genius
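It's Monday. In this post we discuss methods for multi-sample, multi-omic integration of single-cell RNA + ATAC data.
There are two scenarios. In the first, each sample is split into two portions, one for RNA and one for ATAC. In the second, single-nucleus RNA and ATAC are profiled together as a multiome assay.
Let's start with the first scenario: multiple samples, each with RNA and ATAC run separately. How should we integrate that many RNA and ATAC datasets?
Seurat provides a tutorial for joint scRNA-seq and scATAC-seq analysis that integrates the two modalities by identifying anchors. There is one small problem, though: the tutorial only demonstrates a single sample (one sample split in two). What do we do when there are multiple samples?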
The Molecular Cell paper "A multi-omic single-cell landscape of human gynecologic malignancies" preprocessed each scRNA dataset separately. Note, however, that after per-sample preprocessing the samples were combined with Seurat's merge() function, and the downstream analysis did not account for mitochondrial effects. The scATAC data were also processed per sample and then analyzed jointly in ArchR (note that this is again a merge). For the joint scRNA + scATAC analysis, however:
Seurat’s CCA implementation in FindTransferAnchors() and TransferData() was used to assign each of the scATAC-seq cells a cell type subcluster identity from the matching scRNA-seq data and an associated label prediction score. This label transferring procedure was constrained to only align cells of the same patient dataset (e.g., Patient 1 scATAC-seq cells were assigned only to cell type subclusters represented by Patient 1 scRNA-seq cells). The results of the joint analysis were then filtered further.
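The per-patient constraint and the follow-up filter can be illustrated with a small Python sketch. This is not the paper's code and not Seurat: a cosine similarity to per-patient cluster centroids stands in for CCA anchors, and the field names (`patient`, `label`, `vec`), the toy data, and the `min_score=0.5` cutoff are all assumptions made up for illustration.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centroids_by_patient(rna_cells):
    """Average the expression vectors of each (patient, cluster) group."""
    groups = defaultdict(list)
    for cell in rna_cells:
        groups[(cell["patient"], cell["label"])].append(cell["vec"])
    result = defaultdict(dict)
    for (patient, label), vecs in groups.items():
        result[patient][label] = [sum(col) / len(vecs) for col in zip(*vecs)]
    return result


def transfer_labels(rna_cells, atac_cells, min_score=0.5):
    """Label each ATAC cell using only RNA clusters from the same patient.

    Cells whose best score falls below min_score (a hypothetical cutoff)
    are dropped, mimicking the post-transfer filtering step.
    """
    centroids = centroids_by_patient(rna_cells)
    kept = []
    for cell in atac_cells:
        candidates = centroids.get(cell["patient"], {})
        if not candidates:
            continue  # no matched RNA data for this patient
        scores = {lab: cosine(cell["vec"], c) for lab, c in candidates.items()}
        best = max(scores, key=scores.get)
        if scores[best] >= min_score:
            kept.append({**cell, "predicted": best, "score": scores[best]})
    return kept


# Toy data: "vec" plays the role of gene expression (RNA) or gene activity (ATAC).
rna = [
    {"patient": "P1", "label": "epithelial", "vec": [5.0, 0.0, 1.0]},
    {"patient": "P1", "label": "epithelial", "vec": [4.0, 1.0, 0.0]},
    {"patient": "P1", "label": "T cell", "vec": [0.0, 6.0, 1.0]},
    {"patient": "P2", "label": "fibroblast", "vec": [0.0, 1.0, 7.0]},
]
atac = [
    {"id": "a1", "patient": "P1", "vec": [3.0, 0.5, 0.2]},
    {"id": "a2", "patient": "P2", "vec": [4.0, 0.0, 0.0]},
    {"id": "a3", "patient": "P2", "vec": [0.0, 0.5, 3.0]},
]

for cell in transfer_labels(rna, atac):
    print(cell["id"], cell["predicted"], round(cell["score"], 2))
```

In the toy data, cell `a2` resembles Patient 1's epithelial cluster, but it comes from Patient 2, whose RNA data has no such cluster. Under the constraint it can only be compared with Patient 2's labels, scores poorly, and is filtered out rather than borrowing a label from another patient.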
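The Cell Discovery paper "Single-cell multiomics analysis reveals regulatory programs in clear cell renal cell carcinoma" also processed the scRNA data separately first, but used Harmony for batch correction. The scATAC data were likewise analyzed separately, with Signac. But note: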
Clustering and dimensionality reduction were then performed on the corrected LSI components by the Harmony method.
Finally, the joint analysis likewise used
Seurat’s integration framework, to identify the pairs of corresponding cells between two modalities data. The shared correlation patterns between scATAC-seq gene activity and scRNA-seq gene expression were identified by the “FindTransferAnchors” function (reduction = ‘cca’). Then the cell type label of each cell in scATAC-seq data was predicted by “TransferData” function (weight.reduction = ‘lsi’ and dim = 2:15).
After the joint analysis, the cells were further selected, and the selected scATAC data were then jointly analyzed again and corrected with Harmony.
The Nature paper "Single-cell roadmap of human gonadal development" likewise processed the scRNA and scATAC data separately first. The scRNA samples were integrated and corrected with Seurat, the scATAC samples were also analyzed jointly, and then scRNA and scATAC were analyzed together.
To annotate cell types in scATAC-seq data, we first performed label transfer from scRNA-seq data of matched individuals. We used canonical correlation analysis as a dimensionality reduction method and vst as a selection method, along with 3,000 variable features and 25 dimensions for finding anchors between the two datasets and transferring the annotations. The predicted cell type annotations by label transfer were validated by importing annotations of the multiomic snRNA-seq/scATAC-seq profiling data. To visualize the correspondence between scATAC-seq final annotations and predictions from label transfer, we plotted the average label transfer score (value between 0 and 1) of each cell type in the annotated cell types in scATAC-seq data. Here too, the joint annotation was done on matched samples.
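The "average label transfer score of each cell type in the annotated cell types" amounts to a grouped mean. Below is a minimal Python sketch of that summary, assuming each scATAC cell carries its final annotation plus one transfer score per candidate label. The field names and toy values are hypothetical, and each cell is assumed to have a score for every label.

```python
from collections import defaultdict


def mean_transfer_scores(cells):
    """Average each label's transfer score within every final-annotation group.

    Returns {final_annotation: {transferred_label: mean_score}}.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for cell in cells:
        counts[cell["annotation"]] += 1
        for label, score in cell["scores"].items():
            sums[cell["annotation"]][label] += score
    return {
        ann: {label: total / counts[ann] for label, total in by_label.items()}
        for ann, by_label in sums.items()
    }


# Toy input: two cells annotated "Sertoli", one annotated "Leydig".
cells = [
    {"annotation": "Sertoli", "scores": {"Sertoli": 0.9, "Leydig": 0.1}},
    {"annotation": "Sertoli", "scores": {"Sertoli": 0.7, "Leydig": 0.3}},
    {"annotation": "Leydig", "scores": {"Sertoli": 0.2, "Leydig": 0.8}},
]

for annotation, row in mean_transfer_scores(cells).items():
    print(annotation, {label: round(value, 2) for label, value in row.items()})
```

Plotted as a heatmap, a matrix like this should be dominated by its diagonal when the final annotations agree with the transferred labels.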
Another Nature paper, "Spatial multi-omic map of human myocardial infarction", also processed the snRNA and snATAC data separately first. The snRNA data were analyzed with Seurat and corrected with Harmony, the snATAC data were analyzed with ArchR, and the two were then combined:
The Seurat label-transferring approach was used to compare the annotation of snRNA-seq and snATAC-seq. To do so, the snRNA-seq data were used as reference and the function FindTransferAnchors was applied to identify a set of anchors using gene expression from snRNA-seq and gene activity score from snATAC-seq. Next, the cell labels from snRNA-seq were transferred to snATAC-seq by running the function TransferData. An adjusted rand index was calculated to evaluate the agreement between annotated and predicted cell labels for snATAC-seq data.
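For readers unfamiliar with the adjusted Rand index, here is a minimal pure-Python sketch of the standard pair-counting formula. The function name and the toy labels are made up for illustration; the paper does not say which implementation it used.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same cells.

    1.0 means identical partitions (label names may differ);
    values near 0 mean agreement no better than chance.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two cells: nothing to disagree on
    # Pairs of cells grouped together in both labelings (contingency table cells).
    sum_cells = sum(comb(n, 2) for n in Counter(zip(labels_a, labels_b)).values())
    # Pairs grouped together in each labeling on its own.
    sum_a = sum(comb(n, 2) for n in Counter(labels_a).values())
    sum_b = sum(comb(n, 2) for n in Counter(labels_b).values())
    expected = sum_a * sum_b / total_pairs
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings use a single group
    return (sum_cells - expected) / (max_index - expected)


# Toy comparison: manual snATAC annotation vs. labels transferred from snRNA.
annotated = ["CM", "CM", "Fib", "Endo"]
predicted = ["CM", "CM", "Fib", "Fib"]
print(round(adjusted_rand_index(annotated, predicted), 3))
```

The index corrects for the agreement expected by chance, so it is a stricter summary than the raw fraction of cells whose labels match.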
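So, to summarize: in the first scenario, where scRNA and scATAC are run separately, the approach is to analyze each modality on its own first, integrating the RNA samples and the ATAC samples separately (usually with some batch correction), then to perform joint annotation on matched samples, and finally to filter the ATAC annotation results before moving on to downstream analysis.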