寒露:露水以寒,将要结冰。
scRNA-seq整合简介
对两个或多个单细胞数据集的联合分析提出了独特的挑战。特别是,在标准工作流程下,识别多个数据集中存在的细胞群体可能会成问题。Seurat v4包括一组用于匹配(或“对齐”)跨数据集的共享细胞群体的方法。这些方法首先确定处于匹配生物学状态(“锚”)的细胞的跨数据集对,既可以用于校正数据集之间的技术差异(即批效应校正),也可以用于对基因组进行比较性scRNA-seq分析跨实验条件。
下面,我们展示了Stuart *,Butler *等人,2019中所述的scRNA-seq整合方法,以对处于静止或干扰素刺激状态的人免疫细胞(PBMC)进行比较分析。
整合目标
以下教程旨在概述使用Seurat集成过程可能进行的复杂细胞类型的比较分析。在这里,我们解决了一些关键目标:
- 创建“集成”数据分析以进行下游分析
- 识别两个数据集中都存在的单元格类型
- 获得在对照和刺激细胞中均保守的细胞类型标记
- 比较数据集以找到对刺激的细胞类型特异性反应
设置Seurat对象
为了方便起见,我们通过SeuratData软件包分发此数据集。
library(Seurat)
Seurat v4 will be going to CRAN in the near future;
for more details, please visit https://satijalab.org/seurat/v4_changes
> library(SeuratData)
Registered S3 method overwritten by 'cli':
method from
print.boxx spatstat
Using cached data manifest, last updated at 2021-03-13 20:33:46
-- Installed datasets ---------------------------------------------- SeuratData v0.2.1 --
√ pbmc3k 3.1.4
------------------------------------------ Key ------------------------------------------
√ Dataset loaded successfully
> Dataset built with a newer version of Seurat than installed
(?) Unknown version of Seurat installed
> library(patchwork)
> # install dataset
> InstallData("ifnb")
Installing package into ‘C:/Users/zzu/Documents/R/win-library/4.0’
(as ‘lib’ is unspecified)
试开URL’http://seurat.nygenome.org/src/contrib/ifnb.SeuratData_3.1.0.tar.gz'
Content type 'application/octet-stream' length 413266233 bytes (394.1 MB)
downloaded 394.1 MB
* installing *source* package 'ifnb.SeuratData' ...
** using staged installation
** R
** data
*** moving datasets to lazyload DB
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
converting help for package 'ifnb.SeuratData'
finding HTML links ... 好了
ifnb html
** building package indices
** testing if installed package can be loaded from temporary location
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (ifnb.SeuratData)
The downloaded source packages are in
‘C:\Users\zzu\AppData\Local\Temp\RtmpO0TeZF\downloaded_packages’
> # load dataset
> LoadData("ifnb")
An object of class Seurat
14053 features across 13999 samples within 1 assay
Active assay: RNA (14053 features, 0 variable features)
> class(ifnb)
[1] "Seurat"
attr(,"package")
[1] "Seurat"
> str(ifnb)
Formal class 'Seurat' [package "Seurat"] with 13 slots
..@ assays :List of 1
.. ..$ RNA:Formal class 'Assay' [package "Seurat"] with 8 slots
.. .. .. ..@ counts :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
.. .. .. .. .. ..@ i : int [1:9787436] 20 27 37 64 65 83 87 131 139 175 ...
.. .. .. .. .. ..@ p : int [1:14000] 0 877 1590 2440 3549 4183 4740 5720 6301 7181 ...
.. .. .. .. .. ..@ Dim : int [1:2] 14053 13999
.. .. .. .. .. ..@ Dimnames:List of 2
.. .. .. .. .. .. ..$ : chr [1:14053] "AL627309.1" "RP11-206L10.2" "LINC00115" "NOC2L" ...
.. .. .. .. .. .. ..$ : chr [1:13999] "AAACATACATTTCC.1" "AAACATACCAGAAA.1" "AAACATACCTCGCT.1" "AAACATACCTGGTA.1" ...
.. .. .. .. .. ..@ x : num [1:9787436] 1 1 1 1 1 2 1 1 1 1 ...
.. .. .. .. .. ..@ factors : list()
.. .. .. ..@ data :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
.. .. .. .. .. ..@ i : int [1:9787436] 20 27 37 64 65 83 87 131 139 175 ...
.. .. .. .. .. ..@ p : int [1:14000] 0 877 1590 2440 3549 4183 4740 5720 6301 7181 ...
.. .. .. .. .. ..@ Dim : int [1:2] 14053 13999
.. .. .. .. .. ..@ Dimnames:List of 2
.. .. .. .. .. .. ..$ : chr [1:14053] "AL627309.1" "RP11-206L10.2" "LINC00115" "NOC2L" ...
.. .. .. .. .. .. ..$ : chr [1:13999] "AAACATACATTTCC.1" "AAACATACCAGAAA.1" "AAACATACCTCGCT.1" "AAACATACCTGGTA.1" ...
.. .. .. .. .. ..@ x : num [1:9787436] 1 1 1 1 1 2 1 1 1 1 ...
.. .. .. .. .. ..@ factors : list()
.. .. .. ..@ scale.data : num[0 , 0 ]
.. .. .. ..@ key : chr "rna_"
.. .. .. ..@ var.features : logi(0)
.. .. .. ..@ meta.features:'data.frame': 14053 obs. of 0 variables
.. .. .. ..@ misc : symbol �NULL�
.. .. .. ..@ NA : NULL
..@ meta.data :'data.frame': 13999 obs. of 5 variables:
.. ..$ orig.ident : chr [1:13999] "IMMUNE_CTRL" "IMMUNE_CTRL" "IMMUNE_CTRL" "IMMUNE_CTRL" ...
.. ..$ nCount_RNA : num [1:13999] 3017 2481 3420 3156 1868 ...
.. ..$ nFeature_RNA : int [1:13999] 877 713 850 1109 634 557 980 581 880 669 ...
.. ..$ stim : chr [1:13999] "CTRL" "CTRL" "CTRL" "CTRL" ...
.. ..$ seurat_annotations: Factor w/ 13 levels "CD14 Mono","CD4 Naive T",..: 1 1 1 12 3 1 7 2 6 1 ...
..@ active.assay: chr "RNA"
..@ active.ident: Factor w/ 2 levels "IMMUNE_CTRL",..: 1 1 1 1 1 1 1 1 1 1 ...
.. ..- attr(*, "names")= chr [1:13999] "AAACATACATTTCC.1" "AAACATACCAGAAA.1" "AAACATACCTCGCT.1" "AAACATACCTGGTA.1" ...
..@ graphs : list()
..@ neighbors : list()
..@ reductions : list()
..@ project.name: chr "ifnb"
..@ misc : list()
..@ version :Classes 'package_version', 'numeric_version' hidden list of 1
.. ..$ : int [1:3] 3 0 0
..@ commands : list()
..@ tools : list()
..@ NA : NULL
> ls <- GetAssayData(object = ifnb, slot = "counts")
> dim(ls)
[1] 14053 13999
> dim(ifnb@meta.data)
[1] 13999 5
> head(ifnb@meta.data)
orig.ident nCount_RNA nFeature_RNA stim seurat_annotations
AAACATACATTTCC.1 IMMUNE_CTRL 3017 877 CTRL CD14 Mono
AAACATACCAGAAA.1 IMMUNE_CTRL 2481 713 CTRL CD14 Mono
AAACATACCTCGCT.1 IMMUNE_CTRL 3420 850 CTRL CD14 Mono
AAACATACCTGGTA.1 IMMUNE_CTRL 3156 1109 CTRL pDC
AAACATACGATGAA.1 IMMUNE_CTRL 1868 634 CTRL CD4 Memory T
AAACATACGGCATT.1 IMMUNE_CTRL 1581 557 CTRL CD14 Mono
> # split the dataset into a list of two seurat objects (stim and CTRL)
> ifnb.list <- SplitObject(ifnb, split.by = "stim")
> ifnb.list
$CTRL
An object of class Seurat
14053 features across 6548 samples within 1 assay
Active assay: RNA (14053 features, 0 variable features)
$STIM
An object of class Seurat
14053 features across 7451 samples within 1 assay
Active assay: RNA (14053 features, 0 variable features)
> str(ifnb.list$CTRL)
Formal class 'Seurat' [package "Seurat"] with 13 slots
..@ assays :List of 1
.. ..$ RNA:Formal class 'Assay' [package "Seurat"] with 8 slots
.. .. .. ..@ counts :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
.. .. .. .. .. ..@ i : int [1:4626015] 20 27 37 64 65 83 87 131 139 175 ...
.. .. .. .. .. ..@ p : int [1:6549] 0 877 1590 2440 3549 4183 4740 5720 6301 7181 ...
.. .. .. .. .. ..@ Dim : int [1:2] 14053 6548
.. .. .. .. .. ..@ Dimnames:List of 2
.. .. .. .. .. .. ..$ : chr [1:14053] "AL627309.1" "RP11-206L10.2" "LINC00115" "NOC2L" ...
.. .. .. .. .. .. ..$ : chr [1:6548] "AAACATACATTTCC.1" "AAACATACCAGAAA.1" "AAACATACCTCGCT.1" "AAACATACCTGGTA.1" ...
.. .. .. .. .. ..@ x : num [1:4626015] 1 1 1 1 1 2 1 1 1 1 ...
.. .. .. .. .. ..@ factors : list()
.. .. .. ..@ data :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
.. .. .. .. .. ..@ i : int [1:4626015] 20 27 37 64 65 83 87 131 139 175 ...
.. .. .. .. .. ..@ p : int [1:6549] 0 877 1590 2440 3549 4183 4740 5720 6301 7181 ...
.. .. .. .. .. ..@ Dim : int [1:2] 14053 6548
.. .. .. .. .. ..@ Dimnames:List of 2
.. .. .. .. .. .. ..$ : chr [1:14053] "AL627309.1" "RP11-206L10.2" "LINC00115" "NOC2L" ...
.. .. .. .. .. .. ..$ : chr [1:6548] "AAACATACATTTCC.1" "AAACATACCAGAAA.1" "AAACATACCTCGCT.1" "AAACATACCTGGTA.1" ...
.. .. .. .. .. ..@ x : num [1:4626015] 1 1 1 1 1 2 1 1 1 1 ...
.. .. .. .. .. ..@ factors : list()
.. .. .. ..@ scale.data : num[0 , 0 ]
.. .. .. ..@ key : chr "rna_"
.. .. .. ..@ var.features : chr(0)
.. .. .. ..@ meta.features:'data.frame': 14053 obs. of 0 variables
.. .. .. ..@ misc : symbol �NULL�
.. .. .. ..@ NA : NULL
..@ meta.data :'data.frame': 6548 obs. of 5 variables:
.. ..$ orig.ident : chr [1:6548] "IMMUNE_CTRL" "IMMUNE_CTRL" "IMMUNE_CTRL" "IMMUNE_CTRL" ...
.. ..$ nCount_RNA : num [1:6548] 3017 2481 3420 3156 1868 ...
.. ..$ nFeature_RNA : int [1:6548] 877 713 850 1109 634 557 980 581 880 669 ...
.. ..$ stim : chr [1:6548] "CTRL" "CTRL" "CTRL" "CTRL" ...
.. ..$ seurat_annotations: Factor w/ 13 levels "CD14 Mono","CD4 Naive T",..: 1 1 1 12 3 1 7 2 6 1 ...
..@ active.assay: chr "RNA"
..@ active.ident: Factor w/ 1 level "IMMUNE_CTRL": 1 1 1 1 1 1 1 1 1 1 ...
.. ..- attr(*, "names")= chr [1:6548] "AAACATACATTTCC.1" "AAACATACCAGAAA.1" "AAACATACCTCGCT.1" "AAACATACCTGGTA.1" ...
..@ graphs : list()
..@ neighbors : list()
..@ reductions : list()
..@ images : list()
..@ project.name: chr "ifnb"
..@ misc : list()
..@ version :Classes 'package_version', 'numeric_version' hidden list of 1
.. ..$ : int [1:3] 3 0 0
..@ commands : list()
..@ tools : list()
> # normalize and identify variable features for each dataset independently
> ifnb.list <- lapply(X = ifnb.list, FUN = function(x) {
+ x <- NormalizeData(x)
+ x <- FindVariableFeatures(x, selection.method = "vst", nfeatures = 2000)
+ })
Performing log-normalization
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Calculating gene variances
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Calculating feature variances of standardized and clipped values
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Performing log-normalization
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Calculating gene variances
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Calculating feature variances of standardized and clipped values
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
> ifnb.list
$CTRL
An object of class Seurat
14053 features across 6548 samples within 1 assay
Active assay: RNA (14053 features, 2000 variable features)
$STIM
An object of class Seurat
14053 features across 7451 samples within 1 assay
Active assay: RNA (14053 features, 2000 variable features)
> # select features that are repeatedly variable across datasets for integration
> features <- SelectIntegrationFeatures(object.list = ifnb.list)
> features
[1] "HBB" "HBA2" "HBA1" "CCL4" "CCL3"
[6] "CCL7" "TXN" "GNLY" "PPBP" "APOBEC3B"
[11] "VMO1" "ALAS2" "CCL2" "CXCL3" "IFI27"
[16] "MIR155HG" "GZMB" "PTGDS" "C1QB" "IGLL5"
[21] "CCL22" "IL1B" "IGJ" "CXCL1" "CCL8"
[26] "G0S2" "CXCL10" "HBD" "LYZ" "SERPINB2"
[31] "IL8" "CCL5" "HLA-DPB1" "RAMP1" "SPP1"
[36] "AL928768.3" "HLA-DRA" "TSPAN13" "HLA-DQA1" "MYC"
[41] "CD74" "FABP5" "XCL2" "FCER1A" "HSPA6"
[46] "HLA-DPA1" "XCL1" "IDO1" "SPINK1" "NKG7"
[51] "TVP23A" "HSPA1B" "HLA-DRB5" "TIMP1" "FCGR3A"
[56] "SDS" "HLA-DRB1" "TNFAIP6" "MZB1" "S100A8"
[61] "S100A9" "DNASE1L3" "DUSP2" "FTL" "CTSC"
[66] "HSPA1A" "AHSP" "IL1RN" "HBM" "CD69"
[71] "ID2" "PF4" "GZMA" "CD83" "CTSL"
[76] "SOD2" "TCL1A" "CA1" "CCL3L1" "GZMK"
[81] "ID3" "C1QA" "CCR7" "EDN1" "FAM26F"
[86] "CXCL11" "UBE2C" "TRH" "CLIC3" "PTX3"
[91] "BIRC3" "APOBEC3A" "SERPINB1" "CXCL2" "C15orf48"
[96] "CD14" "MS4A7" "GADD45B" "FSCN1" "FGFBP2"
[101] "GNG11" "HLA-DQB1" "CD79A" "KRT1" "LGALS3"
[106] "MKI67" "PLAC8" "TOP2A" "IL6" "IFNG"
[111] "CSF2" "CCL13" "RAB9A" "GZMH" "CST3"
[116] "CYP1B1" "CCL23" "LILRA4" "SERPINF1" "CD70"
[121] "HSPB1" "NMB" "ISG15" "CCNB2" "IL1R2"
[126] "TPSAB1" "EBI3" "CKB" "GADD45G" "RGS2"
[131] "RSAD2" "CST7" "DDIT4" "MS4A1" "TNFRSF4"
[136] "RGS1" "ADA" "MS4A4A" "CD207" "MYBL2"
[141] "GATA2" "CSTB" "ITM2C" "SDPR" "NR4A2"
[146] "KLK1" "AVP" "SNCA" "ACRBP" "GPR183"
[151] "PKIB" "HSPA5" "KIAA0101" "PLA2G7" "MARCKSL1"
[156] "FUCA1" "TNFRSF13B" "HMOX1" "CLEC10A" "S100A4"
[161] "KLRD1" "DUSP4" "INSIG1" "AC006129.4" "PRF1"
[166] "CTSB" "BCL2A1" "CENPF" "CD9" "FOS"
[171] "HBG2" "FGL2" "CXCL9" "TNFSF10" "OSM"
[176] "TUBB1" "HLA-DMA" "AKR1C1" "PMAIP1" "NME1"
[181] "HSPH1" "BUB1" "HERPUD1" "FTH1" "LTA"
[186] "RP11-354E11.2" "TSC22D1" "CSRP2" "NFKBIA" "EGR4"
[191] "CA2" "CCL20" "GATA1" "AIF1" "TNFRSF17"
[196] "RP11-701P16.5" "EREG" "CDKN1C" "HSPE1" "IRF8"
[201] "SERPINA9" "FCER1G" "ACTB" "SAT1" "GLUL"
[206] "S100B" "HES1" "IGSF6" "SYNGR2" "KCNE1L"
[211] "CTSD" "RNASE1" "CTSW" "GPR137B" "CXCL16"
[216] "TYMS" "KRT81" "KLRC1" "LGALS1" "THBS1"
[221] "CD63" "PLAUR" "HPSE" "NRGN" "STMN1"
[226] "CD79B" "SLC16A7" "HMGB2" "IFITM3" "HES4"
[231] "ANXA5" "IFIT2" "CALCRL" "SNHG15" "TNFRSF9"
[236] "S100A12" "LAD1" "CFP" "MEG3" "ZNF664"
[241] "C12orf45" "TPRG1" "TMEM176B" "IL32" "CLEC4E"
[246] "INHBA" "HSPD1" "LTB" "IL18" "LILRA3"
[251] "GALR2" "PLIN2" "GBP1" "ZWINT" "KLF1"
[256] "TIGIT" "ARHGAP26" "CYTL1" "IL4I1" "PLEK"
[261] "C3AR1" "PPM1N" "CDK1" "LY9" "CRHBP"
[266] "LGMN" "PTCRA" "GJB2" "NQO1" "DNAJB1"
[271] "TNFSF13B" "LMNA" "MS4A6A" "ALDH2" "NPW"
[276] "CFD" "FCN1" "C1QC" "ZNF331" "PPA1"
[281] "LYPD2" "GPR157" "ARHGAP24" "MSLN" "MAFB"
[286] "RP11-489E7.4" "ACP5" "ANXA2" "HPS4" "PHLDA1"
[291] "AKR1C3" "WARS" "KLRB1" "GPX1" "NT5DC1"
[296] "CD36" "CXCR3" "ZFAND2A" "GEM" "TEX9"
[301] "CPVL" "ALOX5AP" "CD8B" "C12orf75" "TNIP3"
[306] "LST1" "IL10" "TYROBP" "S100A10" "CLK1"
[311] "ZNF700" "CD200" "MAP1A" "GBP5" "HCAR3"
[316] "REL" "SLAMF7" "PLD4" "TUBA4A" "DUSP5"
[321] "NCF1" "CASP5" "CD86" "EMP1" "IFIT1"
[326] "CREM" "NR4A1" "ESAM" "RGS18" "RP11-706O15.3"
[331] "MYL4" "HN1" "TGFBI" "FEZ1" "IER3"
[336] "PSAP" "TREML1" "SGK1" "AC002456.2" "APBB2"
[341] "SRSF7" "RGS16" "SNHG8" "ANKRD37" "VCAN"
[346] "MGST1" "PALLD" "ANXA1" "SLC39A8" "MAP3K7CL"
[351] "CD8A" "HS3ST1" "ADPRM" "CCL19" "PASK"
[356] "LAIR2" "TYMP" "KYNU" "KCNMA1" "HOPX"
[361] "DAPP1" "ICOS" "RNASE6" "FAM49A" "CXCR5"
[366] "TNFRSF18" "DUSP1" "ITM2A" "TMCO4" "CCR9"
[371] "RAB3GAP2" "FCGR3B" "SEC61B" "KRT86" "BATF"
[376] "CKS2" "NUFIP2" "EFHC1" "VAMP5" "MLH3"
[381] "ADORA3" "PRKCDBP" "GADD45A" "TYW3" "PPIF"
[386] "ZBTB32" "CMSS1" "P2RY6" "KLRC2" "GPNMB"
[391] "CLIC2" "IRG1" "SERPINA1" "TIFAB" "DHRS9"
[396] "CEBPD" "CLU" "TRAT1" "FCGR1A" "MGLL"
[401] "MMP9" "RDX" "AGPAT9" "CLEC12A" "CREG1"
[406] "GP9" "REG4" "CXCL5" "POLE" "HLA-DMB"
[411] "JUN" "EGR3" "NUSAP1" "ADM" "TCTN3"
[416] "SPIB" "S100A11" "ANKRD22" "TUBA1B" "AQP9"
[421] "TNF" "KIAA1598" "C5orf51" "CACYBP" "PERP"
[426] "WTAP" "TSPAN2" "SNHG12" "SNX10" "BASP1"
[431] "AQP3" "CDC20" "DERL3" "SLC31A2" "CMTM5"
[436] "PPP1R14B" "RPS6KB1" "HLA-DQA2" "RP11-117D22.2" "RP5-887A10.1"
[441] "SLC27A2" "LDLRAD4" "SRM" "CLEC4F" "SPON2"
[446] "CDT1" "COLGALT2" "SLC25A37" "BID" "MMD"
[451] "SLC7A11" "OLR1" "C5AR1" "PTGR1" "HMMR"
[456] "TFPI" "PPP1R17" "IL19" "MNDA" "WBP5"
[461] "CLEC5A" "CAPG" "GPR171" "EFNA5" "CLEC4D"
[466] "CENPA" "TK1" "PRKCZ" "IDI1" "FPR1"
[471] "SRSF2" "BANK1" "CTCF" "C22orf42" "LINC00996"
[476] "TALDO1" "SPRED2" "RP11-367G6.3" "RP11-290F20.3" "ALDH1A1"
[481] "HSP90AB1" "MCOLN3" "ADTRP" "SYNJ1" "NINJ1"
[486] "CEBPB" "HIST1H2AC" "IDO2" "MMP19" "CHORDC1"
[491] "CLEC1B" "GCLM" "THAP4" "NOP58" "DUSP6"
[496] "CD68" "TSC22D3" "EAF2" "CAV1" "LILRB4"
[501] "CENPE" "CD72" "MYL9" "CCL4L1" "RASL11A"
[506] "LGALS2" "CD163" "HSP90B1" "CD27" "CPNE2"
[511] "STEAP4" "HSP90AA1" "L1TD1" "NIPSNAP1" "TXNRD1"
[516] "PDZK1IP1" "SERPINB9" "KPNA2" "TBC1D4" "MAP3K4"
[521] "KMO" "SDF2L1" "SMC2" "GNPDA1" "RCBTB2"
[526] "HIST1H2AL" "CCDC50" "KLF10" "SPARC" "WDR47"
[531] "TMEM177" "PILRA" "PHLDA2" "PLAU" "ATP1B3"
[536] "S100A6" "BIK" "CTLA4" "CDK19" "OASL"
[541] "NPM3" "CDKN2A" "METTL7B" "RP11-10C24.1" "LAG3"
[546] "CXorf21" "KDM5B" "SDCBP" "GSN" "ACTG1"
[551] "LILRB2" "MARCO" "GLRX" "TNNT1" "IGSF9B"
[556] "TRAF4" "ADK" "CYP27A1" "NPC2" "FPR3"
[561] "TCF4" "RGP1" "RGCC" "ATP2B1" "HOXA9"
[566] "RAB20" "RP11-572O17.1" "GPR35" "FOXJ1" "GAPT"
[571] "IL2RA" "DRAM1" "FCRLA" "CCND2" "WASL"
[576] "CDO1" "CA6" "GAPDH" "C19orf52" "HIST1H1C"
[581] "A2M" "SDSL" "LRRN3" "LILRA5" "OSGIN1"
[586] "RPL22L1" "SMIM20" "GZMM" "ARL4D" "PSAT1"
[591] "RBM4B" "SGTB" "SMC4" "WNT7A" "RSRC2"
[596] "CCRL2" "CRLS1" "FNIP2" "CTB-61M7.2" "MPP1"
[601] "PTGER2" "TMEM194B" "GPR97" "SMIM14" "DOK2"
[606] "LSP1" "CYP1A1" "HCG18" "DCSTAMP" "SCML1"
[611] "C1orf162" "IFI30" "KCTD5" "AC007228.11" "ATF3"
[616] "TM4SF19.1" "VIM" "LYPD3" "KIR2DL3" "RP11-804H8.6"
[621] "AMICA1" "IFNGR2" "CD7" "MUC12" "DNAJB4"
[626] "SRD5A3" "ACSL1" "SAC3D1" "TCEAL2" "AVPI1"
[631] "NPM1" "ARL5B" "MRPL18" "GCHFR" "NPL"
[636] "IGFBP3" "NMD3" "SHPRH" "LONRF1" "RHOH"
[641] "ATF5" "CD300E" "TESC" "FN1" "STXBP2"
[646] "CCR1" "MUM1" "RP11-51J9.5" "BZW2" "WBP4"
[651] "SOX4" "QPCT" "HPD" "TPST1" "SLC19A1"
[656] "CRTAM" "CD38" "LPAR6" "RP11-1191J2.5" "CD80"
[661] "SOCS3" "PDE4DIP" "SUCNR1" "ENDOG" "BCAT1"
[666] "GNA12" "PRR7" "CHST7" "LRIF1" "SQLE"
[671] "CYBB" "PID1" "LILRB5" "KIR3DL1" "COQ9"
[676] "PIM3" "ASGR1" "ASAH1" "RP11-115C21.2" "TLR2"
[681] "IL7R" "GPR18" "CDK4" "HSPA8" "FKBP4"
[686] "MRPS23" "IL15" "NF1" "MIS18BP1" "CHN1"
[691] "TTC39A" "JUNB" "RAB13" "CD40" "COL9A3"
[696] "CLEC2B" "AIG1" "RARRES3" "SPI1" "TCP1"
[701] "CKLF" "NRP2" "ITGA2B" "IL13RA1" "IFRD1"
[706] "FFAR2" "ZC3H13" "CD300LB" "RP11-505K9.1" "TM7SF3"
[711] "CENPU" "PPARG" "TUBA1A" "MANF" "LIMS1"
[716] "UBE2J1" "GYPC" "PRMT1" "CACFD1" "ATP6V1F"
[721] "COCH" "TREM1" "MXD1" "RGS19" "PLEKHA1"
[726] "TXNIP" "ALKBH2" "RABL5" "NLRP7" "GNG7"
[731] "BCL2L14" "SPHK1" "GATA3" "PYCR1" "GM2A"
[736] "CD5" "SEMA4A" "SLC12A2" "ABHD12" "NCF2"
[741] "CITED2" "TRIB2" "RNF146" "CD59" "GRN"
[746] "SH2D1B" "KLF6" "PIM2" "MTMR11" "RIPK2"
[751] "ZKSCAN8" "TPST2" "SH2D2A" "GCLC" "USP48"
[756] "CD2" "KRTCAP3" "DUSP10" "ZNF845" "LEF1"
[761] "ANXA4" "ABI3" "TSHZ2" "TNKS2-AS1" "KIAA0226L"
[766] "PDXK" "SNRNP200" "EIF4EBP1" "GPR34" "RASSF4"
[771] "TPM4" "PNRC1" "CDKN1A" "NHP2" "C1orf21"
[776] "SERTAD1" "CD3G" "ALG2" "RPS6KA5" "PDLIM1"
[781] "DNAJC2" "CAT" "SSBP3" "ARID5B" "APOL3"
[786] "C1orf122" "FBXO32" "CXCL13" "NUPR1" "IFNB1"
[791] "CCL18" "PNOC" "FUT7" "HRASLS2" "BACE2"
[796] "FANCL" "IL24" "IL36RN" "SCT" "MRGBP"
[801] "CLEC4C" "CH25H" "DNAJC25" "IFIT3" "L3MBTL3"
[806] "MMP7" "BIRC5" "AURKB" "HBG1" "PRSS57"
[811] "CENPJ" "ZGLP1" "NAALADL1" "MT1G" "FABP3"
[816] "RRM2" "ERMP1" "GPATCH4" "SMPD3" "ZMYM1"
[821] "VPS13A" "TMEM176A" "VPS8" "TFPI2" "RP11-35G9.3"
[826] "S100PBP" "CCNA2" "SDC2" "IGFBP6" "ACKR4"
[831] "TACSTD2" "LMBR1" "WDR61" "LAMP3" "TMEM218"
[836] "TMEM40" "ZMYM4" "CUL4B" "SCARF1" "PHF8"
[841] "PARM1" "HPGD" "HIST1H4C" "GLCCI1" "EGR1"
[846] "C17orf58" "PRR16" "BAG2" "SLC9A7" "EGR2"
[851] "MT1X" "CTA-445C9.15" "IL36G" "LY6E" "RMI2"
[856] "IL23A" "POU2AF1" "APOC1" "NLRP2" "RBM26"
[861] "RETN" "MYOM2" "TCP11L1" "SUSD1" "CCNA1"
[866] "MT2A" "CHRNA6" "FAM172A" "TNFRSF10A" "MLLT11"
[871] "HES6" "PROSER1" "C9orf41" "CDKN3" "PANK3"
[876] "BEX5" "SYT11" "ACAD9" "ZBP1" "ANKAR"
[881] "BYSL" "RRAGB" "ALKBH4" "FRMD8" "FCHO2"
[886] "CERS4" "MAP3K5" "RAPGEF2" "FOSB" "ZNF395"
[891] "RP11-707M3.3" "CCNB1" "XPOT" "CCDC18" "BACH2"
[896] "RASD1" "VAV3-AS1" "AP4B1" "TMEM53" "MFSD8"
[901] "SSTR2" "SLMAP" "METTL25" "KLF4" "SECTM1"
[906] "SENP7" "NAA25" "ATF2" "JSRP1" "PRDM1"
[911] "ATAD2" "NDUFAF6" "POLR3A" "SMARCA2" "CCND1"
[916] "VASH1" "TCHP" "TBL1XR1" "HEATR6" "SERPINH1"
[921] "SHOX2" "TTC33" "UNKL" "HSD17B12" "NGFRAP1"
[926] "BTBD10" "TDRD7" "NDC80" "RAD18" "TLDC1"
[931] "PINK1" "RECK" "PATZ1" "C19orf59" "AC093391.2"
[936] "FIGNL1" "ADAMDEC1" "ZNF124" "IMPA2" "CXCL12"
[941] "NKIRAS1" "LDB1" "PIGB" "PIGG" "AP3B1"
[946] "NKAPL" "LINC00662" "SPATS2" "PIBF1" "SH3BGR"
[951] "MTMR9" "ZNF529" "FGFBP3" "NIPA1" "RP11-819C21.1"
[956] "USPL1" "TCEAL1" "FAM76B" "RAB39B" "SMPDL3A"
[961] "STPG1" "TRAPPC2" "CD1E" "ZNF548" "RP3-508I15.14"
[966] "HNRNPU-AS1" "ZNF623" "NCOR2" "KTI12" "HIST1H2AG"
[971] "ZHX1" "DDHD1" "MIR29A" "ZNF235" "PXDC1"
[976] "ZNF419" "APPL2" "ITGAV" "NDUFAF5" "ZADH2"
[981] "RPL39L" "TMCO3" "CDADC1" "METTL8" "NTPCR"
[986] "UCK2" "DHX40" "FAM168B" "LRRC61" "ZNF627"
[991] "PIAS2" "MAP3K3" "CCDC41" "ERMAP" "PAXIP1"
[996] "SPINK2" "FPGT" "ZNF382" "ZNF852" "C11orf63"
[ reached getOption("max.print") -- omitted 1000 entries ]
> immune.anchors <- FindIntegrationAnchors(object.list = ifnb.list, anchor.features = features)
Scaling features for provided objects
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=02s
Finding all pairwise anchors
| | 0 % ~calculating Running CCA
Merging objects
Finding neighborhoods
Finding anchors
Found 16407 anchors
Filtering anchors
Retained 6750 anchors
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=03m 15s
> #CCA
> # this command creates an 'integrated' data assay
> immune.combined <- IntegrateData(anchorset = immune.anchors)
Merging dataset 1 into 2
Extracting anchors for merged samples
Finding integration vectors
Finding integration vector weights
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Integrating data
Warning: Adding a command log without an assay associated with it
> # specify that we will perform downstream analysis on the corrected data note that the original
> # unmodified data still resides in the 'RNA' assay
> DefaultAssay(immune.combined) <- "integrated"
> immune.combined <- ScaleData(immune.combined, verbose = FALSE)
> immune.combined <- RunPCA(immune.combined, npcs = 30, verbose = FALSE)
> immune.combined <- RunUMAP(immune.combined, reduction = "pca", dims = 1:30)
Warning: The default method for RunUMAP has changed from calling Python UMAP via reticulate to the R-native UWOT using the cosine metric
To use Python UMAP via reticulate, set umap.method to 'umap-learn' and metric to 'correlation'
This message will be shown once per session
10:39:30 UMAP embedding parameters a = 0.9922 b = 1.112
10:39:30 Read 13999 rows and found 30 numeric columns
10:39:30 Using Annoy for neighbor search, n_neighbors = 30
10:39:30 Building Annoy index with metric = cosine, n_trees = 50
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
10:39:33 Writing NN index file to temp file C:\Users\zzu\AppData\Local\Temp\RtmpO0TeZF\file34646dee7baf
10:39:33 Searching Annoy index using 1 thread, search_k = 3000
10:39:38 Annoy recall = 100%
10:39:38 Commencing smooth kNN distance calibration using 1 thread
10:39:39 Initializing from normalized Laplacian + noise
10:39:40 Commencing optimization for 200 epochs, with 618846 positive edges
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
10:39:59 Optimization finished
> immune.combined <- FindNeighbors(immune.combined, reduction = "pca", dims = 1:30)
Computing nearest neighbor graph
Computing SNN
> immune.combined <- FindClusters(immune.combined, resolution = 0.5)
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
Number of nodes: 13999
Number of edges: 580792
Running Louvain algorithm...
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Maximum modularity in 10 random starts: 0.9047
Number of communities: 14
Elapsed time: 3 seconds
> # Visualization
> p1 <- DimPlot(immune.combined, reduction = "umap", group.by = "stim")
> p2 <- DimPlot(immune.combined, reduction = "umap", label = TRUE, repel = TRUE)
> p1 + p2
Error in grid.Call(C_convert, x, as.integer(whatfrom), as.integer(whatto), :
Viewport has zero dimension(s)
> DimPlot(immune.combined, reduction = "umap", split.by = "stim")
> # For performing differential expression after integration, we switch back to the original data
> DefaultAssay(immune.combined) <- "RNA"
> nk.markers <- FindConservedMarkers(immune.combined, ident.1 = 6, grouping.var = "stim", verbose = FALSE)
错误: Please install the metap package to use FindConservedMarkers.
This can be accomplished with the following commands:
----------------------------------------
install.packages('BiocManager')
BiocManager::install('multtest')
install.packages('metap')
----------------------------------------
> install.packages('BiocManager')
Installing package into ‘C:/Users/zzu/Documents/R/win-library/4.0’
(as ‘lib’ is unspecified)
试开URL’https://mirrors.tuna.tsinghua.edu.cn/CRAN/bin/windows/contrib/4.0/BiocManager_1.30.10.zip'
Content type 'application/zip' length 100117 bytes (97 KB)
downloaded 97 KB
package ‘BiocManager’ successfully unpacked and MD5 sums checked
The downloaded binary packages are in
C:\Users\zzu\AppData\Local\Temp\RtmpO0TeZF\downloaded_packages
> BiocManager::install('multtest')
Bioconductor version 3.12 (BiocManager 1.30.10), R 4.0.3 (2020-10-10)
Installing package(s) 'multtest'
试开URL’https://bioconductor.org/packages/3.12/bioc/bin/windows/contrib/4.0/multtest_2.46.0.zip'
Content type 'application/zip' length 953873 bytes (931 KB)
downloaded 931 KB
package ‘multtest’ successfully unpacked and MD5 sums checked
The downloaded binary packages are in
C:\Users\zzu\AppData\Local\Temp\RtmpO0TeZF\downloaded_packages
Installation path not writeable, unable to update packages: boot, class, cluster,
codetools, foreign, KernSmooth, MASS, Matrix, mgcv, nlme, nnet, spatial
Old packages: 'backports', 'BH', 'biomaRt', 'brio', 'broom', 'caTools', 'cli',
'clusterProfiler', 'coin', 'correlation', 'cpp11', 'crayon', 'crosstalk',
'data.table', 'DBI', 'dbplyr', 'DelayedArray', 'deldir', 'desc', 'DESeq2', 'diffobj',
'dplyr', 'DT', 'edgeR', 'enrichplot', 'fansi', 'farver', 'fastmap', 'findpython',
'flextable', 'forcats', 'formatR', 'future.apply', 'GenomeInfoDb', 'gert', 'ggforce',
'ggraph', 'ggrepel', 'ggridges', 'ggsignif', 'ggstatsplot', 'glmnet', 'GSVA',
'hexbin', 'Hmisc', 'hms', 'htmltools', 'httpuv', 'insight', 'ipmisc', 'ipred',
'isoband', 'knitr', 'lava', 'leiden', 'libcoin', 'lifecycle', 'lubridate', 'maftools',
'magick', 'MatrixGenerics', 'MatrixModels', 'matrixStats', 'memoise', 'mime',
'multcomp', 'officer', 'pairwiseComparisons', 'parallelly', 'parameters', 'pbkrtest',
'performance', 'pillar', 'pkgload', 'plotly', 'promises', 'ps', 'quantreg',
'R.devices', 'rappdirs', 'Rcpp', 'RcppArmadillo', 'RcppParallel', 'reprex', 'rio',
'rlang', 'rmarkdown', 'rms', 'RSQLite', 'rstatix', 'rtracklayer', 'rvest', 'Seurat',
'shiny', 'sp', 'SparseM', 'spatstat', 'spatstat.data', 'spatstat.utils',
'statsExpressions', 'survminer', 'systemfonts', 'testthat', 'tibble', 'tidyr',
'tinytex', 'usethis', 'utf8', 'waldo', 'WGCNA', 'withr', 'WRS2', 'xfun', 'zoo'
Update all/some/none? [a/s/n]: install.packages('metap')
Update all/some/none? [a/s/n]:
n
> install.packages('metap')
Installing package into ‘C:/Users/zzu/Documents/R/win-library/4.0’
(as ‘lib’ is unspecified)
also installing the dependencies ‘tmvnsim’, ‘mnormt’, ‘rbibutils’, ‘sn’, ‘Rdpack’, ‘TFisher’, ‘mutoss’, ‘mathjaxr’
试开URL’https://mirrors.tuna.tsinghua.edu.cn/CRAN/bin/windows/contrib/4.0/tmvnsim_1.0-2.zip'
Content type 'application/zip' length 36929 bytes (36 KB)
downloaded 36 KB
试开URL’https://mirrors.tuna.tsinghua.edu.cn/CRAN/bin/windows/contrib/4.0/mnormt_2.0.2.zip'
Content type 'application/zip' length 193526 bytes (188 KB)
downloaded 188 KB
试开URL’https://mirrors.tuna.tsinghua.edu.cn/CRAN/bin/windows/contrib/4.0/rbibutils_2.0.zip'
Content type 'application/zip' length 1004590 bytes (981 KB)
downloaded 981 KB
试开URL’https://mirrors.tuna.tsinghua.edu.cn/CRAN/bin/windows/contrib/4.0/sn_1.6-2.zip'
Content type 'application/zip' length 1590456 bytes (1.5 MB)
downloaded 1.5 MB
试开URL’https://mirrors.tuna.tsinghua.edu.cn/CRAN/bin/windows/contrib/4.0/Rdpack_2.1.1.zip'
Content type 'application/zip' length 924593 bytes (902 KB)
downloaded 902 KB
试开URL’https://mirrors.tuna.tsinghua.edu.cn/CRAN/bin/windows/contrib/4.0/TFisher_0.2.0.zip'
Content type 'application/zip' length 80138 bytes (78 KB)
downloaded 78 KB
试开URL’https://mirrors.tuna.tsinghua.edu.cn/CRAN/bin/windows/contrib/4.0/mutoss_0.1-12.zip'
Content type 'application/zip' length 1092928 bytes (1.0 MB)
downloaded 1.0 MB
试开URL’https://mirrors.tuna.tsinghua.edu.cn/CRAN/bin/windows/contrib/4.0/mathjaxr_1.4-0.zip'
Content type 'application/zip' length 968304 bytes (945 KB)
downloaded 945 KB
试开URL’https://mirrors.tuna.tsinghua.edu.cn/CRAN/bin/windows/contrib/4.0/metap_1.4.zip'
Content type 'application/zip' length 683028 bytes (667 KB)
downloaded 667 KB
package ‘tmvnsim’ successfully unpacked and MD5 sums checked
package ‘mnormt’ successfully unpacked and MD5 sums checked
package ‘rbibutils’ successfully unpacked and MD5 sums checked
package ‘sn’ successfully unpacked and MD5 sums checked
package ‘Rdpack’ successfully unpacked and MD5 sums checked
package ‘TFisher’ successfully unpacked and MD5 sums checked
package ‘mutoss’ successfully unpacked and MD5 sums checked
package ‘mathjaxr’ successfully unpacked and MD5 sums checked
package ‘metap’ successfully unpacked and MD5 sums checked
The downloaded binary packages are in
C:\Users\zzu\AppData\Local\Temp\RtmpO0TeZF\downloaded_packages
> nk.markers <- FindConservedMarkers(immune.combined, ident.1 = 6, grouping.var = "stim", verbose = FALSE)
> FeaturePlot(immune.combined, features = c("CD3D", "SELL", "CREM", "CD8A", "GNLY", "CD79A", "FCGR3A",
+ "CCL2", "PPBP"), min.cutoff = "q9")
> immune.combined <- RenameIdents(immune.combined, `0` = "CD14 Mono", `1` = "CD4 Naive T", `2` = "CD4 Memory T",
+ `3` = "CD16 Mono", `4` = "B", `5` = "CD8 T", `6` = "NK", `7` = "T activated", `8` = "DC", `9` = "B Activated",
+ `10` = "Mk", `11` = "pDC", `12` = "Eryth", `13` = "Mono/Mk Doublets", `14` = "HSPC")
Warning: Cannot find identity 14
> DimPlot(immune.combined, label = TRUE)
> Idents(immune.combined) <- factor(Idents(immune.combined), levels = c("HSPC", "Mono/Mk Doublets",
+ "pDC", "Eryth", "Mk", "DC", "CD14 Mono", "CD16 Mono", "B Activated", "B", "CD8 T", "NK", "T activated",
+ "CD4 Naive T", "CD4 Memory T"))
> markers.to.plot <- c("CD3D", "CREM", "HSPH1", "SELL", "GIMAP5", "CACYBP", "GNLY", "NKG7", "CCL5",
+ "CD8A", "MS4A1", "CD79A", "MIR155HG", "NME1", "FCGR3A", "VMO1", "CCL2", "S100A9", "HLA-DQA1",
+ "GPR183", "PPBP", "GNG11", "HBA2", "HBB", "TSPAN13", "IL3RA", "IGJ", "PRSS57")
> DotPlot(immune.combined, features = markers.to.plot, cols = c("blue", "red"), dot.scale = 8, split.by = "stim") +
+ RotatedAxis()
> library(ggplot2)
> library(cowplot)
载入程辑包:‘cowplot’
The following object is masked from ‘package:patchwork’:
align_plots
> theme_set(theme_cowplot())
> t.cells <- subset(immune.combined, idents = "CD4 Naive T")
> Idents(t.cells) <- "stim"
> avg.t.cells <- as.data.frame(log1p(AverageExpression(t.cells, verbose = FALSE)$RNA))
> avg.t.cells$gene <- rownames(avg.t.cells)
> cd14.mono <- subset(immune.combined, idents = "CD14 Mono")
> Idents(cd14.mono) <- "stim"
> avg.cd14.mono <- as.data.frame(log1p(AverageExpression(cd14.mono, verbose = FALSE)$RNA))
> avg.cd14.mono$gene <- rownames(avg.cd14.mono)
> genes.to.label = c("ISG15", "LY6E", "IFI6", "ISG20", "MX1", "IFIT2", "IFIT1", "CXCL10", "CCL8")
> p1 <- ggplot(avg.t.cells, aes(CTRL, STIM)) + geom_point() + ggtitle("CD4 Naive T Cells")
> p1 <- LabelPoints(plot = p1, points = genes.to.label, repel = TRUE)
When using repel, set xnudge and ynudge to 0 for optimal results
> p2 <- ggplot(avg.cd14.mono, aes(CTRL, STIM)) + geom_point() + ggtitle("CD14 Monocytes")
> p2 <- LabelPoints(plot = p2, points = genes.to.label, repel = TRUE)
When using repel, set xnudge and ynudge to 0 for optimal results
> p1 + p2
> immune.combined$celltype.stim <- paste(Idents(immune.combined), immune.combined$stim, sep = "_")
> immune.combined$celltype <- Idents(immune.combined)
> Idents(immune.combined) <- "celltype.stim"
> b.interferon.response <- FindMarkers(immune.combined, ident.1 = "B_STIM", ident.2 = "B_CTRL", verbose = FALSE)
> head(b.interferon.response, n = 15)
p_val avg_logFC pct.1 pct.2 p_val_adj
ISG15 1.120549e-159 3.1952344 0.998 0.237 1.574707e-155
IFIT3 2.512528e-155 3.1291572 0.964 0.051 3.530856e-151
IFI6 2.247524e-153 2.9224641 0.966 0.080 3.158446e-149
ISG20 2.708899e-150 2.0479198 1.000 0.666 3.806816e-146
IFIT1 2.311096e-141 2.8706200 0.911 0.031 3.247783e-137
MX1 4.449549e-124 2.2679072 0.906 0.114 6.252952e-120
LY6E 1.303073e-120 2.1589250 0.897 0.150 1.831208e-116
TNFSF10 3.006065e-112 2.6237683 0.785 0.022 4.224423e-108
IFIT2 6.524713e-111 2.5339635 0.794 0.036 9.169179e-107
B2M 1.115282e-97 0.4215099 1.000 1.000 1.567306e-93
PLSCR1 3.806809e-97 1.9635015 0.798 0.116 5.349708e-93
IRF7 9.445721e-96 1.8084515 0.838 0.186 1.327407e-91
CXCL10 1.548568e-85 3.6448641 0.647 0.012 2.176203e-81
UBE2L6 1.979783e-82 1.4825313 0.847 0.300 2.782189e-78
PSMB9 1.422315e-77 1.1334344 0.938 0.574 1.998779e-73
> FeaturePlot(immune.combined, features = c("CD3D", "GNLY", "IFI6"), split.by = "stim", max.cutoff = 3,
+ cols = c("grey", "red"))
> plots <- VlnPlot(immune.combined, features = c("LYZ", "ISG15", "CXCL10"), split.by = "stim", group.by = "celltype",
+ pt.size = 0, combine = FALSE)
The default behaviour of split.by has changed.
Separate violin plots are now plotted side-by-side.
To restore the old behaviour of a single split violin,
set split.plot = TRUE.
This message will be shown once per session.
> wrap_plots(plots = plots, ncol = 1)
>