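Hello everyone. Today's post covers something new: using the results of trajectory analysis to infer communication between cells. The paper is "Inferring cell-cell interactions from pseudotime ordering of scRNA-Seq data". For cell types related by differentiation, this method offers a new angle on cell-cell communication, and in my view a more important one. We will go through the paper first and finish with example code.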
Abstract
1. A major advantage of single cell RNA-Sequencing (scRNA-Seq) data is the ability to reconstruct continuous ordering and trajectories for cells.
2. To date, such ordering was mainly used to group cells and to infer interactions within cells.
3. Unlike prior methods that only focus on the average expression levels of genes in clusters or cell types, TraSig fully utilizes the dynamic information to identify significant ligand-receptor pairs with similar trajectories, which in turn are used to score interacting cell clusters. (This sentence is the essence: communication between cell types that are close in differentiation is what matters most.)
Introduction
1. Single-cell trajectory analysis has mainly focused on the expression similarity between cells in the same cluster or at consecutive time points and on the differences in transcriptional regulation between cell types and over time.
2. Single-cell communication analysis usually identifies ligands in one of the clusters or cell types and corresponding receptors in another cluster, and then infers interactions based on the average expression of these ligand-receptor pairs (tools such as CellPhoneDB and SingleCellSignalR). While successful, most current methods for inferring cell-cell interactions from scRNA-Seq data only make use of the average expression levels of ligands and receptors in the two clusters or cell types they test (a major limitation).
3. Current communication methods rely on cluster averages. While this may be fine for steady state populations (for example, different cell types in adult tissues), for studies that focus on development or response modeling, such averages do not take full advantage of the available data in scRNA-Seq studies.
4. In trajectory analysis results, cells on the same branch (or cluster) cannot be assumed to be homogeneous with respect to the expression of key genes. Using average analysis for such clusters may lead to inaccurate predictions about the relationship between ligands and receptors in two different (though parallel in terms of timing) branches.
5. (See the paper's figure.) While the average expression of a ligand and receptor in two different branches is the same, the first two cases are unlikely to strongly support an interaction between these two cell types, while the third and fourth, where both are either increasing or decreasing in their respective ordering, are much more likely to hint at real interactions between the groups. (This is the key point.) In other words, if two groups of cells are interacting, then we expect to see the genes encoding signaling molecules in these groups co-express at a similar pace along the pseudotime. (This makes good sense.) So the best way to run communication analysis on top of trajectory results is a sliding window approach.
6. TraSig performs cell communication analysis on top of trajectory analysis: it extracts expression patterns for ligands and receptors in different edges of the trajectory using a sliding window approach. It then uses these profiles to score temporal interactions between ligands and their known receptors in different edges corresponding to the same time. Significance is again assessed with a permutation test.
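To make points 5 and 6 concrete, here is a minimal sketch of the idea, not TraSig's actual implementation. The function names, the window and step sizes, and the toy expression values are all assumptions chosen for illustration. It builds sliding-window profiles along pseudotime and scores a ligand-receptor pair with a dot product, showing that two receptors with the same average expression can receive very different scores.

```python
import numpy as np

def sliding_window_profile(expr, pseudotime, window=20, step=10):
    """Mean expression in overlapping windows of cells ordered by pseudotime.

    Hypothetical helper: window/step are counted in cells, which is a
    simplification and may differ from how TraSig bins an edge.
    """
    order = np.argsort(pseudotime)
    sorted_expr = np.asarray(expr, dtype=float)[order]
    starts = range(0, len(sorted_expr) - window + 1, step)
    return np.array([sorted_expr[s:s + window].mean() for s in starts])

def dot_score(ligand_profile, receptor_profile):
    """Dot product of two profiles of equal length."""
    return float(np.dot(ligand_profile, receptor_profile))

# Toy data (made up): 200 cells per edge, pseudotime scaled to [0, 1].
n_cells = 200
t = np.linspace(0.0, 1.0, n_cells)
rising = t            # expression increases along the edge
falling = 1.0 - t     # expression decreases, but has the same average

ligand = sliding_window_profile(rising, t)
receptor_same = sliding_window_profile(rising, t)
receptor_opposite = sliding_window_profile(falling, t)

score_same = dot_score(ligand, receptor_same)
score_opposite = dot_score(ligand, receptor_opposite)

print("mean expression:", rising.mean(), falling.mean())
print("score, same trend:", score_same)
print("score, opposite trend:", score_opposite)
```

An average-based method would treat the two receptors identically, because their mean expression is equal. The window-based score is higher when the ligand and the receptor change in the same direction along pseudotime.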
Result
TraSig workflow. Top Left: For a time series scRNA-seq dataset, we use the reconstructed pseudotime, trajectory and the expression data as inputs. Bottom Left: We next determine expression profiles for genes along each of the edges (clusters) using sliding windows and compute dot product scores for pairs of genes in edges. Right: Finally, we use permutation tests to assign significance levels to the scores we computed.
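The last step of the workflow, assigning significance with permutation tests, can be sketched as follows. This is only an illustration under stated assumptions: the null distribution here is built by shuffling the receptor's cells along pseudotime, which may not match the permutation scheme TraSig actually uses, and the helper names and toy data are hypothetical.

```python
import numpy as np

def window_means(values, window=20, step=10):
    """Mean of overlapping windows over values already ordered by pseudotime."""
    values = np.asarray(values, dtype=float)
    starts = range(0, len(values) - window + 1, step)
    return np.array([values[s:s + window].mean() for s in starts])

def permutation_pvalue(ligand_expr, receptor_expr, n_perms=1000, seed=0):
    """One-sided permutation p-value for a dot-product score.

    Assumption: the null is generated by shuffling receptor expression
    across cells, which destroys any trend along pseudotime.
    """
    rng = np.random.default_rng(seed)
    ligand_profile = window_means(ligand_expr)
    observed = float(np.dot(ligand_profile, window_means(receptor_expr)))
    exceed = 0
    for _ in range(n_perms):
        shuffled = rng.permutation(receptor_expr)
        null_score = float(np.dot(ligand_profile, window_means(shuffled)))
        if null_score >= observed:
            exceed += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return observed, (exceed + 1) / (n_perms + 1)

# Toy data (made up): a rising ligand, a rising receptor, a flat receptor.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 200)
ligand = t + rng.normal(0.0, 0.05, t.size)
receptor = t + rng.normal(0.0, 0.05, t.size)
flat = rng.normal(0.5, 0.05, t.size)

score_trend, p_trend = permutation_pvalue(ligand, receptor)
score_flat, p_flat = permutation_pvalue(ligand, flat)
print("co-varying pair p-value:", p_trend)
print("flat receptor p-value:", p_flat)
```

The pair whose expression rises together gets a small p-value, because shuffling removes the shared trend and lowers the score. A flat receptor has no trend to destroy, so its score is not distinguishable from the shuffled ones.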
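Now for the example results. A CSHMM (a hidden Markov model) is used to build the developmental trajectory between cell types. For background on CSHMM, see the post "Understanding HMM (hidden Markov models) in one article" and my earlier post "10X single-cell (10X spatial transcriptomics) basic algorithms: KL divergence".
Next is "Inferring cell type interactions for liver development", which uses the sliding window approach described above.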
Results from comparing TraSig with SingleCellSignalR and CellPhoneDB. Top: Heatmaps for scores assigned by the three different methods for all cluster pairs representing cells sampled at the same time. TraSig and SingleCellSignalR identified more ligand-receptor pairs leading to higher scores. Bottom left: -log10 p-value for enriched GO terms related to endothelial cells and vascular development. Bottom right: Venn diagrams for the overlap in identified ligands and receptors among the three methods. The overlap between TraSig and SingleCellSignalR is high though roughly 50% of the identified proteins by each method are not identified by the other. What these results show is the difference between the methods.
TraSig identifies ligand-receptor interactions important to vascular development. In my reading, this approach captures the communication that takes place during cell differentiation more accurately.
Figure legend: Ligand-receptor interaction predictions from TraSig of interest for functional studies. (a) Cartoon of cell signaling interaction between different DesLO cell types (HLC, hepatocyte-like cells; CLC, cholangiocyte-like cells; SLC, stellate-like cells; ELC, endothelial-like cells). (b) Trajectory plot showing cell type assignments with key identifying genes highlighted by different colors (Red = SOX2+ non induced cells, Yellow = SOX9 cholangiocyte-like cells, Blue = Hepatocyte-like cells, Purple = Stellate-like cells, Green = Endothelial-like cells). (c) Sender CXCL12 cells from the Cholangiocyte and Stellate populations in red shown with the receiver CXCR4 expressing endothelial cell population in blue. (d) Sender and receiver signaling populations (red = senders/ligands; blue = receivers/receptors).
In fact, the most valuable part is that the communication analysis is computed with the sliding window approach.
Finally, let's look at the example code.
import pickle
import sys
import os
import gc
import requests
import numpy as np
import bottleneck as bn
import pandas as pd
# load packages required for analysis
import statsmodels.api  # loads the statsmodels submodules
import statsmodels as sm  # note: sm is the top-level package here, not statsmodels.api
import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
Run TraSig on the example data
python main.py -i input -o output -d oligodendrocyte-differentiation-clusters_marques -g None -b ti_slingshot -n 1000 -s smallerWindow
usage: main.py [-h] -i INPUT -o OUTPUT -d PROJECT -g PREPROCESS -b MODELNAME
               [-t LISTTYPE] [-l NLAP] [-m METRIC] [-z NAN2ZERO] [-n NUMPERMS]
               [-p MULTIPROCESS] [-c NCORES] [-s STARTINGTREATMENT]

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT, --input INPUT
                        string, folder to find inputs
  -o OUTPUT, --output OUTPUT
                        string, folder to put outputs
  -d PROJECT, --project PROJECT
                        string, project name
  -g PREPROCESS, --preprocess PREPROCESS
                        string, preprocessing steps applied to the data /
                        project, default None
  -b MODELNAME, --modelName MODELNAME
                        string, name of the trajectory model
  -t LISTTYPE, --listType LISTTYPE
                        string, optional, interaction list type, default
                        ligand_receptor
  -l NLAP, --nLap NLAP  integer, optional, sliding window size, default 20
  -m METRIC, --metric METRIC
                        string, optional, scoring metric, default dot
  -z NAN2ZERO, --nan2zero NAN2ZERO
                        boolean, optional, if treat nan as zero, default True
  -n NUMPERMS, --numPerms NUMPERMS
                        integer, optional, number of permutations, default
                        10000
  -p MULTIPROCESS, --multiProcess MULTIPROCESS
                        boolean, optional, if use multi-processing, default
                        True
  -c NCORES, --ncores NCORES
                        integer, optional, number of cores to use for multi-
                        processing, default 4
  -s STARTINGTREATMENT, --startingTreatment STARTINGTREATMENT
                        string, optional, way to treat values at the beginning
                        of an edge with sliding window size smaller than nLap,
                        None/parent/discard/smallerWindow, default
                        smallerWindow, need to provide an extra input
                        'path_info.pickle' for 'parent' option
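The help text does not spell out how the -s options behave, so the following is only one possible reading, written as a small numpy sketch rather than TraSig's own code. The function name and the trailing-window layout are assumptions. It covers three of the four options: None (leave the incomplete windows as NaN), discard (drop them), and smallerWindow (average whatever cells are available). The parent option needs trajectory path information and is left out.

```python
import numpy as np

def trailing_window_mean(values, window, starting_treatment="smallerWindow"):
    """Trailing moving average with different treatments of the first positions.

    Hypothetical helper for illustration only; option names mirror the CLI
    strings, but the behaviour is an assumption about what they mean.
    """
    allowed = ("None", "discard", "smallerWindow")
    if starting_treatment not in allowed:
        raise ValueError(f"unsupported treatment: {starting_treatment}")
    values = np.asarray(values, dtype=float)
    cumsum = np.concatenate(([0.0], np.cumsum(values)))
    out = np.full(values.size, np.nan)
    for i in range(values.size):
        start = max(0, i - window + 1)
        n = i - start + 1  # number of cells available for this window
        if n == window or starting_treatment == "smallerWindow":
            out[i] = (cumsum[i + 1] - cumsum[start]) / n
    if starting_treatment == "discard":
        return out[window - 1:]  # drop positions with an incomplete window
    return out

values = [1.0, 2.0, 3.0, 4.0, 5.0]
print(trailing_window_mean(values, 3, "smallerWindow"))
print(trailing_window_mean(values, 3, "discard"))
print(trailing_window_mean(values, 3, "None"))
```

With smallerWindow the profile keeps its full length, which is convenient when two edges must be compared position by position. With discard the profile is shorter, and with None the leading NaN values would have to be handled later (the -z option suggests they can be treated as zero).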
Prepare inputs for TraSig (from dynverse outputs)
python prepare_inputs.py -i ../trajectory/input -o ../example/input -d oligodendrocyte-differentiation-clusters_marques -t ../trajectory/output/output.h5 -g None -b ti_slingshot -e None
usage: prepare_inputs.py [-h] -i INPUT -o OUTPUT -d PROJECT -t TRAJECTORYFILE
                         -g PREPROCESS -b MODELNAME [-e OTHERIDENTIFIER]

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT, --input INPUT
                        string, folder to find inputs for trajectory inference
  -o OUTPUT, --output OUTPUT
                        string, folder to save inputs for TraSig
  -d PROJECT, --project PROJECT
                        string, project name
  -t TRAJECTORYFILE, --trajectoryFile TRAJECTORYFILE
                        string, trajectory output file from dynverse, default
                        ../trajectory/output/output.h5
  -g PREPROCESS, --preprocess PREPROCESS
                        string, preprocessing steps applied to the data /
                        project, default None
  -b MODELNAME, --modelName MODELNAME
                        string, name of the trajectory model
  -e OTHERIDENTIFIER, --otherIdentifier OTHERIDENTIFIER
                        string, optional, other identifier for the output,
                        default None
Analyze outputs from TraSig
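The rest is left for you to explore; the details are in the TraSig repository. For those of you studying development, communication analysis that follows the cells in time is the best way to go.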
Life is good, and even better with you.