Author: 追风少年i
It's the last week before the National Day holiday, so let's all put in a good effort. As one well-known line goes: "Yesterday is history, tomorrow is a mystery, but today is a gift. That is why it's called the present."
This post answers some questions that have come up in projects and briefly shares how to measure heterogeneity between samples.
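Recently many researchers have come to me for analysis. Most of them bring disease samples and simply ask me to "do the analysis", without knowing what should actually be analyzed, which leaves me with very little to work from. Even a simple study design needs to be prepared in advance.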
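Disease samples: single-cell analysis is still, at its core, about splitting samples into groups and finding the differences between them, only at a higher-dimensional level. Whether the difference lies in cell-cell communication or in transcriptional state, the prerequisite is that there are groups to compare. The ideal design is normal versus disease. Many researchers cannot obtain normal samples, and in that case clinical information becomes especially important: differences between good-prognosis and poor-prognosis patients are what carry clinical value.
With only disease samples and no clinical information, it is very hard to extract anything useful. At this point many researchers turn to public single-cell databases and look for normal data in existing cohorts. The approach is reasonable, but pay attention to how well the data match, to platform differences, and so on.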
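Sample annotation: fully pipelined annotation is not advisable, as I have stressed many times. People say 新格元 (Singleron) does annotation well, but that comes from a great deal of manual work, not from a pipeline. Reference material and experience are essential assets for any researcher.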
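Sample size: at this stage of the field, publishing with only one or two samples is unrealistic (for Q2 journals and above). From whatever angle you analyze the data, the value of the finding has to be validated across multiple samples, so think this through before committing to a single-cell study.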
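That covers study design, which is a big subject in its own right and is usually handled by technical support. Next, a simple topic: measuring heterogeneity between samples.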
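- Metric 1: Cluster entropy. To measure how well cells from different samples mix within cell-type clusters, the normalized relative cluster entropy of the dataset is quantified, weighted by cluster size. A cluster entropy of 1 means the samples are perfectly mixed across clusters.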
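Reference post: "10X单细胞(10X空间转录组)基因表达的熵值分析" (entropy analysis of gene expression in 10X single-cell and 10X spatial transcriptomics data)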
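The cluster entropy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the exact formula from the original study: the function name `cluster_entropy` is hypothetical, each sample's counts are first divided by that sample's total cell number (so unequal sample sizes do not lower the score), and each cluster's entropy is divided by the log of the number of samples so that 1 corresponds to perfect mixing.

```python
import numpy as np


def cluster_entropy(sample_labels, cluster_labels):
    """Size-weighted, normalized entropy of sample mixing across clusters.

    Assumption (not from the source): counts are corrected for sample size
    and each cluster's entropy is divided by log(number of samples).
    """
    sample_labels = np.asarray(sample_labels).ravel()
    cluster_labels = np.asarray(cluster_labels).ravel()
    samples, s_idx = np.unique(sample_labels, return_inverse=True)
    clusters, c_idx = np.unique(cluster_labels, return_inverse=True)
    if len(samples) < 2:
        raise ValueError("need at least two samples to measure mixing")

    # Contingency table: rows are clusters, columns are samples.
    counts = np.zeros((len(clusters), len(samples)))
    np.add.at(counts, (c_idx, s_idx), 1)
    cluster_sizes = counts.sum(axis=1)

    # Fraction of each sample that falls into each cluster, then
    # renormalized within each cluster to a distribution over samples.
    rel = counts / counts.sum(axis=0, keepdims=True)
    p = rel / rel.sum(axis=1, keepdims=True)

    # Shannon entropy per cluster, treating 0 * log(0) as 0.
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(p > 0, p * np.log(p), 0.0)
    entropy = -terms.sum(axis=1) / np.log(len(samples))

    # Average over clusters, weighted by cluster size.
    return float(np.sum(entropy * cluster_sizes) / cluster_sizes.sum())
```

In practice the two label vectors would come from the sample and cluster columns of the cell metadata. A score near 0 means most clusters are dominated by a single sample.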
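- Metric 2: Similarity scores/alignment. To measure transcriptional variation in cell state within a cell type, between cells from the same versus different batches and/or samples, the pairwise alignment (that is, the integration) between each pair of samples/batches is measured, where a batch consists of the group of samples processed on the same day. This "similarity score" examines the local neighborhood of every cell in a given sample/batch, asks how many of its k nearest neighbors (in PC or iNMF space, which most readers will know) belong to a second sample/batch, and then averages over all cells. Here k is chosen as 1% of the total number of cells in the cluster. The result is normalized by the expected number of cells from each sample/batch.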
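The similarity score described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the original implementation: the function name `similarity_score` and its parameters are hypothetical, the embedding passed in is assumed to be already restricted to one cluster, only cells from the two compared batches are pooled for the neighbor search, and "normalized by the expected number of cells" is interpreted as dividing by the fraction of the other cells that belong to the second batch. Distances are computed by brute force, which is only suitable for small clusters.

```python
import numpy as np


def similarity_score(embedding, batch_labels, batch_a, batch_b, k_fraction=0.01):
    """Average share of batch_b cells among the k nearest neighbors of
    batch_a cells, divided by the share expected under random mixing.

    Assumptions (not from the source): `embedding` holds the cells of a
    single cluster in PC or iNMF space; only the two batches are pooled.
    """
    embedding = np.asarray(embedding, dtype=float)
    batch_labels = np.asarray(batch_labels).ravel()
    keep = np.isin(batch_labels, [batch_a, batch_b])
    x = embedding[keep]
    is_b = batch_labels[keep] == batch_b
    n = x.shape[0]
    n_b = int(is_b.sum())
    if n_b == 0 or n_b == n:
        raise ValueError("both batches must contain at least one cell")

    # k is 1% of the pooled cells by default, clamped to a valid range.
    k = min(max(1, int(round(k_fraction * n))), n - 1)

    # Brute-force squared Euclidean distances; a cell is never its own neighbor.
    sq = (x ** 2).sum(axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    np.fill_diagonal(dist, np.inf)

    # Indices of the k nearest neighbors of every cell (unordered).
    neighbors = np.argpartition(dist, k - 1, axis=1)[:, :k]

    # Observed share of batch_b neighbors, averaged over batch_a cells.
    observed = is_b[neighbors].mean(axis=1)[~is_b].mean()

    # Expected share if batch labels were assigned at random.
    expected = n_b / (n - 1)
    return float(observed / expected)
```

A value near 1 suggests the two batches are about as mixed as chance would predict, while a value near 0 suggests that cells of the first batch have almost no neighbors from the second. For large clusters an approximate nearest-neighbor index would be a more practical choice than the full distance matrix used here.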
Finally, here is a summary of all the functions in the Seurat package.
Function | Purpose |
---|---|
AddModuleScore | Calculate module scores for feature expression programs in single cells |
AggregateExpression | Aggregated feature expression by identity class |
AnchorSet-class | The AnchorSet Class |
AnnotateAnchors | Add info to anchor matrix |
Assay-class | The Assay Class |
AugmentPlot | Augments ggplot2-based plot with a PNG image. |
AverageExpression | Averaged feature expression by identity class |
BGTextColor | Determine text color based on background color |
BarcodeInflectionsPlot | Plot the Barcode Distribution and Calculated Inflection Points |
BlackAndWhite | Create a custom color palette |
BuildClusterTree | Phylogenetic Analysis of Identity Classes |
CalcPerturbSig | Calculate a perturbation Signature |
CalculateBarcodeInflections | Calculate the Barcode Distribution Inflection |
CaseMatch | Match the case of character vectors |
CellCycleScoring | Score cell cycle phases |
CellScatter | Cell-cell scatter plot |
CellSelector | Cell Selector |
Cells.SCTModel | Get Cell Names |
CellsByImage | Get a vector of cell names associated with an image (or set of images) |
CollapseEmbeddingOutliers | Move outliers towards center on dimension reduction plot |
CollapseSpeciesExpressionMatrix | Slim down a multi-species expression matrix, when only one species is primarily of interest. |
ColorDimSplit | Color dimensional reduction plot by tree split |
CombinePlots | Combine ggplot2-based plots into a single plot |
CreateSCTAssayObject | Create a SCT Assay object |
CustomDistance | Run a custom distance function on an input data matrix |
DEenrichRPlot | DE and EnrichR pathway visualization barplot |
DietSeurat | Slim down a Seurat object |
DimHeatmap | Dimensional reduction heatmap |
DimPlot | Dimensional reduction plot |
DimReduc-class | The DimReduc Class |
DiscretePalette | Discrete colour palettes from the pals package |
DoHeatmap | Feature expression heatmap |
DotPlot | Dot plot visualization |
ElbowPlot | Quickly Pick Relevant Dimensions |
ExpMean | Calculate the mean of logged values |
ExpSD | Calculate the standard deviation of logged values |
ExpVar | Calculate the variance of logged values |
FastRowScale | Scale and/or center matrix rowwise |
FeaturePlot | Visualize 'features' on a dimensional reduction plot |
FeatureScatter | Scatter plot of single cell data |
FilterSlideSeq | Filter stray beads from Slide-seq puck |
FindAllMarkers | Gene expression markers for all identity classes |
FindClusters | Cluster Determination |
FindConservedMarkers | Finds markers that are conserved between the groups |
FindIntegrationAnchors | Find integration anchors |
FindMarkers | Gene expression markers of identity classes |
FindMultiModalNeighbors | Construct weighted nearest neighbor graph |
FindNeighbors | (Shared) Nearest-neighbor graph construction |
FindSpatiallyVariableFeatures | Find spatially variable features |
FindSubCluster | Find subclusters under one cluster |
FindTransferAnchors | Find transfer anchors |
FindVariableFeatures | Find variable features |
FoldChange | Fold Change |
GetAssay | Get an Assay object from a given Seurat object. |
GetImage.SlideSeq | Get Image Data |
GetIntegrationData | Get integration data |
GetResidual | Calculate pearson residuals of features not in the scale.data |
GetTissueCoordinates.SlideSeq | Get Tissue Coordinates |
GetTransferPredictions | Get the predicted identity |
Graph-class | The Graph Class |
GroupCorrelation | Compute the correlation of features broken down by groups with another covariate |
GroupCorrelationPlot | Boxplot of correlation of a variable (e.g. number of UMIs) with expression data |
HTODemux | Demultiplex samples based on data from cell 'hashing' |
HTOHeatmap | Hashtag oligo heatmap |
HVFInfo.SCTAssay | Get Variable Feature Information |
HoverLocator | Hover Locator |
IFeaturePlot | Visualize features in dimensional reduction space interactively |
ISpatialDimPlot | Visualize clusters spatially and interactively |
ISpatialFeaturePlot | Visualize features spatially and interactively |
IntegrateData | Integrate data |
IntegrateEmbeddings | Integrate low dimensional embeddings |
IntegrationAnchorSet-class | The IntegrationAnchorSet Class |
IntegrationData-class | The IntegrationData Class |
JackStraw | Determine statistical significance of PCA scores. |
JackStrawData-class | The JackStrawData Class |
JackStrawPlot | JackStraw Plot |
L2CCA | L2-Normalize CCA |
L2Dim | L2-normalization |
LabelClusters | Label clusters on a ggplot2-based scatter plot |
LabelPoints | Add text labels to a ggplot2 plot |
LinkedPlots | Visualize spatial and clustering (dimensional reduction) data in a linked, interactive framework |
Load10X_Spatial | Load a 10x Genomics Visium Spatial Experiment into a 'Seurat' object |
LoadAnnoyIndex | Load the Annoy index file |
LoadSTARmap | Load STARmap data |
LocalStruct | Calculate the local structure preservation metric |
LogNormalize | Normalize raw data |
LogVMR | Calculate the variance to mean ratio of logged values |
MULTIseqDemux | Demultiplex samples based on classification method from MULTI-seq |
MapQuery | Map query cells to a reference |
MappingScore | Metric for evaluating mapping success |
MetaFeature | Aggregate expression of multiple features into a single feature |
MinMax | Apply a ceiling and floor to all values in a matrix |
MixingMetric | Calculates a mixing metric |
MixscapeHeatmap | Differential expression heatmap for mixscape |
MixscapeLDA | Linear discriminant analysis on pooled CRISPR screen data. |
ModalityWeights-class | The ModalityWeights Class |
NNPlot | Highlight Neighbors in DimPlot |
Neighbor-class | The Neighbor Class |
NormalizeData | Normalize Data |
PCASigGenes | Significant genes from a PCA |
PercentageFeatureSet | Calculate the percentage of all counts that belong to a given set of features |
PlotClusterTree | Plot clusters as a tree |
PlotPerturbScore | Function to plot perturbation score distributions. |
PolyDimPlot | Polygon DimPlot |
PolyFeaturePlot | Polygon FeaturePlot |
PredictAssay | Predict value from nearest neighbors |
PrepLDA | Function to prepare data for Linear Discriminant Analysis. |
PrepSCTIntegration | Prepare an object list normalized with sctransform for integration. |
ProjectDim | Project Dimensional reduction onto full dataset |
ProjectUMAP | Project query into UMAP coordinates of reference |
Radius.SlideSeq | Get Spot Radius |
Read10X | Load in data from 10X |
Read10X_Image | Load a 10X Genomics Visium Image |
Read10X_h5 | Read 10X hdf5 file |
ReadMtx | Load in data from remote or local mtx files |
ReadSlideSeq | Load Slide-seq spatial data |
RegroupIdents | Regroup idents based on meta.data info |
RelativeCounts | Normalize raw data to fractions |
RenameCells.SCTAssay | Rename Cells in an Object |
RidgePlot | Single cell ridge plot |
RunCCA | Perform Canonical Correlation Analysis |
RunICA | Run Independent Component Analysis on gene expression |
RunLDA | Run Linear Discriminant Analysis |
RunMarkVario | Run the mark variogram computation on a given position matrix and expression matrix. |
RunMixscape | Run Mixscape |
RunMoransI | Compute Moran's I value. |
RunPCA | Run Principal Component Analysis |
RunSPCA | Run Supervised Principal Component Analysis |
RunTSNE | Run t-distributed Stochastic Neighbor Embedding |
RunUMAP | Run UMAP |
SCTAssay-class | The SCTModel Class |
SCTResults | Get SCT results from an Assay |
SCTransform | Use regularized negative binomial regression to normalize UMI count data |
STARmap-class | The STARmap class |
SampleUMI | Sample UMI |
SaveAnnoyIndex | Save the Annoy index |
ScaleData | Scale and center the data. |
ScaleFactors | Get image scale factors |
ScoreJackStraw | Compute Jackstraw scores significance. |
SelectIntegrationFeatures | Select integration features |
SetIntegrationData | Set integration data |
Seurat-class | The Seurat Class |
Seurat-package | Seurat: Tools for Single Cell Genomics |
SeuratCommand-class | The SeuratCommand Class |
SeuratTheme | Seurat Themes |
SlideSeq-class | The SlideSeq class |
SpatialImage-class | The SpatialImage Class |
SpatialPlot | Visualize spatial clustering and expression data. |
SplitObject | Splits object into a list of subsetted objects. |
SubsetByBarcodeInflections | Subset a Seurat Object based on the Barcode Distribution Inflection Points |
TopCells | Find cells with highest scores for a given dimensional reduction technique |
TopFeatures | Find features with highest scores for a given dimensional reduction technique |
TopNeighbors | Get nearest neighbors for given cell |
TransferAnchorSet-class | The TransferAnchorSet Class |
TransferData | Transfer data |
UpdateSCTAssays | Update pre-V4 Assays generated with SCTransform in the Seurat to the new SCTAssay class |
UpdateSymbolList | Get updated synonyms for gene symbols |
VariableFeaturePlot | View variable features |
VisiumV1-class | The VisiumV1 class |
VizDimLoadings | Visualize Dimensional Reduction genes |
VlnPlot | Single cell violin plot |
as.CellDataSet | Convert objects to CellDataSet objects |
as.Seurat.CellDataSet | Convert objects to 'Seurat' objects |
as.SingleCellExperiment | Convert objects to SingleCellExperiment objects |
as.sparse.H5Group | Cast to Sparse |
cc.genes | Cell cycle genes |
cc.genes.updated.2019 | Cell cycle genes: 2019 update |
contrast-theory | Get the intensity and/or luminance of a color |
merge.SCTAssay | Merge SCTAssay objects |
subset.AnchorSet | Subset an AnchorSet object |
It's Monday, so I'm taking it easy. Life is good, and it's even better with you.