When you are quietly learning inferCNV on your own, it can help to look at someone else's run log, so here is a complete record of one run.
1. Code
> rm(list=ls())
> options(stringsAsFactors = F)
> library(Seurat)
> library(ggplot2)
> library(infercnv)
> expFile='expFile.txt'
> groupFiles='groupFiles.txt'
> geneFile='geneFile.txt'
> infercnv_obj = CreateInfercnvObject(raw_counts_matrix=expFile,
+ annotations_file=groupFiles,
+ delim="\t",
+ gene_order_file= geneFile,
+ ref_group_names=c('ref-fib')) ## this depends on the group names in your own annotations file
INFO [2021-03-11 10:51:17] Parsing matrix: expFile.txt
INFO [2021-03-11 10:51:18] Parsing gene order file: geneFile.txt
INFO [2021-03-11 10:51:18] Parsing cell annotations file: groupFiles.txt
INFO [2021-03-11 10:51:18] ::order_reduce:Start.
INFO [2021-03-11 10:51:18] .order_reduce(): expr and order match.
INFO [2021-03-11 10:51:18] ::process_data:order_reduce:Reduction from positional data, new dimensions (r,c) = 8259,161 Total=657080 Min=0 Max=866.
INFO [2021-03-11 10:51:18] num genes removed taking into account provided gene ordering list: 302 = 3.65661702385277% removed.
INFO [2021-03-11 10:51:18] -filtering out cells < 100 or > Inf, removing 0 % of cells
INFO [2021-03-11 10:51:18] validating infercnv_obj
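For readers who do not yet have the three input files, here is a minimal sketch of their layout. It assumes the tab-delimited formats that inferCNV documents: a genes-by-cells count matrix, a headerless two-column annotations file (cell, group), and a headerless four-column gene order file (gene, chromosome, start, stop). The gene names, cell names, counts, coordinates and the `toy_dir` folder are invented for illustration; only the group names `ref-fib` and `epi` come from the run recorded here.

```r
# Toy inputs only: every name, count and coordinate below is made up.
toy_dir <- file.path(tempdir(), "infercnv_toy")
dir.create(toy_dir, showWarnings = FALSE)

# 1) Count matrix: genes in rows, cells in columns.
counts <- matrix(c(5L, 0L, 3L, 2L, 1L, 0L, 0L, 4L, 7L, 1L, 2L, 6L),
                 nrow = 3,
                 dimnames = list(c("GENE_A", "GENE_B", "GENE_C"),
                                 c("cell1", "cell2", "cell3", "cell4")))
write.table(counts, file.path(toy_dir, "expFile.txt"),
            sep = "\t", quote = FALSE)

# 2) Annotations: cell name, group name; no header line.
groups <- data.frame(cell  = colnames(counts),
                     group = c("ref-fib", "ref-fib", "epi", "epi"))
write.table(groups, file.path(toy_dir, "groupFiles.txt"),
            sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE)

# 3) Gene order: gene, chromosome, start, stop; no header line.
genes <- data.frame(gene  = rownames(counts),
                    chr   = c("chr1", "chr1", "chr2"),
                    start = c(1000L, 5000L, 2000L),
                    stop  = c(2000L, 6000L, 3000L))
write.table(genes, file.path(toy_dir, "geneFile.txt"),
            sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE)

# Read the files back to confirm their shapes.
# The matrix header has one field fewer than the data rows, so read.table
# treats the first column as row names automatically.
counts_back <- read.table(file.path(toy_dir, "expFile.txt"),
                          sep = "\t", check.names = FALSE)
groups_back <- read.table(file.path(toy_dir, "groupFiles.txt"),
                          sep = "\t", header = FALSE, stringsAsFactors = FALSE)
genes_back  <- read.table(file.path(toy_dir, "geneFile.txt"),
                          sep = "\t", header = FALSE, stringsAsFactors = FALSE)
print(dim(counts_back))
print(groups_back)
print(genes_back)
```

Whatever string is passed to `ref_group_names` has to appear in the second column of the annotations file.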
> dir.create("plot_out") # create the output folder; it did not exist before
> ## Code from the paper: # started at 14:58
> start_time <- Sys.time()
> infercnv_obj2 = infercnv::run(infercnv_obj,
+ cutoff=0.1, # cutoff=1 works well for Smart-seq2, and cutoff=0.1 works well for 10x Genomics
+ out_dir= 'plot_out/' ,
+ cluster_by_groups=T, # cluster cells within each annotation group separately
+ hclust_method="ward.D2",
+ plot_steps=F,
+ denoise=F,
+ HMM=F)
INFO [2021-03-11 10:54:22] ::process_data:Start
INFO [2021-03-11 10:54:22] Creating output path plot_out/
INFO [2021-03-11 10:54:22] Checking for saved results.
2. Process
INFO [2021-03-11 10:54:22]
STEP 1: incoming data
INFO [2021-03-11 10:54:22]
STEP 02: Removing lowly expressed genes
INFO [2021-03-11 10:54:22] ::above_min_mean_expr_cutoff:Start
INFO [2021-03-11 10:54:22] Removing 5199 genes from matrix as below mean expr threshold: 0.1
INFO [2021-03-11 10:54:22] validating infercnv_obj
INFO [2021-03-11 10:54:22] There are 2758 genes and 161 cells remaining in the expr matrix.
INFO [2021-03-11 10:54:22] no genes removed due to min cells/gene filter
INFO [2021-03-11 10:54:22]
STEP 03: normalization by sequencing depth
INFO [2021-03-11 10:54:22] normalizing counts matrix by depth
INFO [2021-03-11 10:54:22] Computed total sum normalization factor as median libsize: 1424.000000
INFO [2021-03-11 10:54:23]
STEP 04: log transformation of data
INFO [2021-03-11 10:54:23] transforming log2xplus1()
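STEP 03 and STEP 04 can be illustrated with a small base-R sketch. It follows what the log reports (scale each cell to the median library size, then apply log2(x + 1)), but it is a simplified stand-in written for this note, not inferCNV's own code, and the toy matrix is invented.

```r
# Toy counts: 4 genes x 3 cells (values invented for illustration).
counts <- matrix(c(0, 2, 4, 6,
                   1, 3, 5, 1,
                   10, 0, 20, 10),
                 nrow = 4,
                 dimnames = list(paste0("gene", 1:4), paste0("cell", 1:3)))

# Library size = total counts per cell (column sums).
libsize <- colSums(counts)

# The log line "normalization factor as median libsize" corresponds to this.
norm_factor <- median(libsize)

# Scale every cell so that its total equals the median library size.
normalized <- sweep(counts, 2, libsize, "/") * norm_factor

# STEP 04: log2(x + 1) transform.
logged <- log2(normalized + 1)

print(libsize)
print(norm_factor)
print(round(logged, 3))
```

After scaling, every column sums to `norm_factor`, so cells sequenced at different depths become comparable before the log transform.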
INFO [2021-03-11 10:54:24]
STEP 08: removing average of reference data (before smoothing)
INFO [2021-03-11 10:54:24] ::subtract_ref_expr_from_obs:Start inv_log=FALSE, use_bounds=TRUE
INFO [2021-03-11 10:54:24] subtracting mean(normal) per gene per cell across all data
INFO [2021-03-11 10:54:25] -subtracting expr per gene, use_bounds=TRUE
INFO [2021-03-11 10:54:26]
STEP 09: apply max centered expression threshold: 3
INFO [2021-03-11 10:54:26] ::process_data:setting max centered expr, threshold set to: +/-: 3
INFO [2021-03-11 10:54:27]
STEP 10: Smoothing data per cell by chromosome
INFO [2021-03-11 10:54:27] smooth_by_chromosome: chr: chr1
INFO [2021-03-11 10:54:28] smooth_by_chromosome: chr: chr10
INFO [2021-03-11 10:54:28] smooth_by_chromosome: chr: chr11
INFO [2021-03-11 10:54:28] smooth_by_chromosome: chr: chr12
INFO [2021-03-11 10:54:28] smooth_by_chromosome: chr: chr13
INFO [2021-03-11 10:54:28] smooth_by_chromosome: chr: chr14
INFO [2021-03-11 10:54:29] smooth_by_chromosome: chr: chr15
INFO [2021-03-11 10:54:29] smooth_by_chromosome: chr: chr16
INFO [2021-03-11 10:54:29] smooth_by_chromosome: chr: chr17
INFO [2021-03-11 10:54:29] smooth_by_chromosome: chr: chr18
INFO [2021-03-11 10:54:29] smooth_by_chromosome: chr: chr19
INFO [2021-03-11 10:54:29] smooth_by_chromosome: chr: chr2
INFO [2021-03-11 10:54:30] smooth_by_chromosome: chr: chr20
INFO [2021-03-11 10:54:30] smooth_by_chromosome: chr: chr21
INFO [2021-03-11 10:54:30] smooth_by_chromosome: chr: chr22
INFO [2021-03-11 10:54:30] smooth_by_chromosome: chr: chr3
INFO [2021-03-11 10:54:30] smooth_by_chromosome: chr: chr4
INFO [2021-03-11 10:54:30] smooth_by_chromosome: chr: chr5
INFO [2021-03-11 10:54:31] smooth_by_chromosome: chr: chr6
INFO [2021-03-11 10:54:31] smooth_by_chromosome: chr: chr7
INFO [2021-03-11 10:54:31] smooth_by_chromosome: chr: chr8
INFO [2021-03-11 10:54:31] smooth_by_chromosome: chr: chr9
INFO [2021-03-11 10:54:33]
STEP 11: re-centering data across chromosome after smoothing
INFO [2021-03-11 10:54:33] ::center_smooth across chromosomes per cell
INFO [2021-03-11 10:54:34]
STEP 12: removing average of reference data (after smoothing)
INFO [2021-03-11 10:54:34] ::subtract_ref_expr_from_obs:Start inv_log=FALSE, use_bounds=TRUE
INFO [2021-03-11 10:54:34] subtracting mean(normal) per gene per cell across all data
INFO [2021-03-11 10:54:35] -subtracting expr per gene, use_bounds=TRUE
INFO [2021-03-11 10:54:37]
STEP 14: invert log2(FC) to FC
INFO [2021-03-11 10:54:37] invert_log2(), computing 2^x
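The idea behind STEP 08, STEP 09 and STEP 14 can be sketched in a few lines of base R: subtract the per-gene mean of the reference cells, cap the centered values at +/- 3, and convert log2 fold change back to fold change with 2^x. This is a deliberately simplified illustration with invented numbers. It leaves out the chromosome smoothing and re-centering of STEP 10 to STEP 12, and it assumes a single reference group so that the `use_bounds=TRUE` behaviour reduces to subtracting the mean.

```r
# Toy log2-scale expression: 3 genes x 4 cells (values invented).
log_expr <- matrix(c(1.0, 2.0, 3.0,
                     1.2, 1.8, 3.2,
                     2.1, 2.0, 7.5,
                     1.9, 2.2, 0.5),
                   nrow = 3,
                   dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4)))

# Assumption: the first two cells play the role of the reference group.
is_ref <- c(TRUE, TRUE, FALSE, FALSE)

# Subtract the per-gene mean of the reference cells from every cell.
ref_mean <- rowMeans(log_expr[, is_ref, drop = FALSE])
centered <- log_expr - ref_mean  # ref_mean is recycled down each column

# Cap the centered values at +/- 3, the threshold shown in the log.
threshold <- 3
clipped <- pmin(pmax(centered, -threshold), threshold)

# Invert log2(FC) to FC.
fold_change <- 2^clipped

print(round(fold_change, 3))
```

Values near 1 in `fold_change` mean expression similar to the reference; this is the scale of the final heatmap, whose plotted range in this run is roughly 0.88 to 1.12.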
INFO [2021-03-11 10:54:38]
STEP 15: Clustering samples (not defining tumor subclusters)
INFO [2021-03-11 10:54:38] define_signif_tumor_subclusters(p_val=0.1
INFO [2021-03-11 10:54:38] define_signif_tumor_subclusters(), tumor: epi
INFO [2021-03-11 10:54:38] cut tree into: 1 groups
INFO [2021-03-11 10:54:38] -processing epi,epi_s1
INFO [2021-03-11 10:54:38] define_signif_tumor_subclusters(), tumor: ref-fib
INFO [2021-03-11 10:54:38] cut tree into: 1 groups
INFO [2021-03-11 10:54:38] -processing ref-fib,ref-fib_s1
INFO [2021-03-11 10:54:40] ::plot_cnv:Start
INFO [2021-03-11 10:54:40] ::plot_cnv:Current data dimensions (r,c)=2758,161 Total=444261.448584115 Min=0.7259085460275 Max=1.27974740879628.
INFO [2021-03-11 10:54:40] ::plot_cnv:Depending on the size of the matrix this may take a moment.
INFO [2021-03-11 10:54:41] plot_cnv(): auto thresholding at: (0.882791 , 1.118216)
INFO [2021-03-11 10:54:41] plot_cnv_observation:Start
INFO [2021-03-11 10:54:41] Observation data size: Cells= 57 Genes= 2758
INFO [2021-03-11 10:54:42] plot_cnv_observation:Writing observation groupings/color.
INFO [2021-03-11 10:54:42] plot_cnv_observation:Done writing observation groupings/color.
INFO [2021-03-11 10:54:42] plot_cnv_observation:Writing observation heatmap thresholds.
INFO [2021-03-11 10:54:42] plot_cnv_observation:Done writing observation heatmap thresholds.
INFO [2021-03-11 10:54:42] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000
INFO [2021-03-11 10:54:42] Quantiles of plotted data range: 0.882790650772113,0.966942481185594,0.998792930197712,1.03387941341448,1.11821578824489
INFO [2021-03-11 10:54:42] plot_cnv_observations:Writing observation data to plot_out//infercnv.preliminary.observations.txt
INFO [2021-03-11 10:54:43] plot_cnv_references:Start
INFO [2021-03-11 10:54:43] Reference data size: Cells= 104 Genes= 2758
INFO [2021-03-11 10:54:43] plot_cnv_references:Number reference groups= 1
INFO [2021-03-11 10:54:43] plot_cnv_references:Plotting heatmap.
INFO [2021-03-11 10:54:43] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000
INFO [2021-03-11 10:54:43] Quantiles of plotted data range: 0.882790650772113,0.978846385330733,0.998778646338114,1.02034944189329,1.11821578824489
INFO [2021-03-11 10:54:43] plot_cnv_references:Writing reference data to plot_out//infercnv.preliminary.references.txt
INFO [2021-03-11 10:54:45]
Making the final infercnv heatmap
INFO [2021-03-11 10:54:46] ::plot_cnv:Start
INFO [2021-03-11 10:54:46] ::plot_cnv:Current data dimensions (r,c)=2758,161 Total=444261.448584115 Min=0.7259085460275 Max=1.27974740879628.
INFO [2021-03-11 10:54:46] ::plot_cnv:Depending on the size of the matrix this may take a moment.
INFO [2021-03-11 10:54:48] plot_cnv(): auto thresholding at: (0.881784 , 1.118216)
INFO [2021-03-11 10:54:48] plot_cnv_observation:Start
INFO [2021-03-11 10:54:48] Observation data size: Cells= 57 Genes= 2758
INFO [2021-03-11 10:54:48] plot_cnv_observation:Writing observation groupings/color.
INFO [2021-03-11 10:54:48] plot_cnv_observation:Done writing observation groupings/color.
INFO [2021-03-11 10:54:48] plot_cnv_observation:Writing observation heatmap thresholds.
INFO [2021-03-11 10:54:48] plot_cnv_observation:Done writing observation heatmap thresholds.
INFO [2021-03-11 10:54:49] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000
INFO [2021-03-11 10:54:49] Quantiles of plotted data range: 0.881784211755114,0.966942481185594,0.998792930197712,1.03387941341448,1.11821578824489
INFO [2021-03-11 10:54:49] plot_cnv_observations:Writing observation data to plot_out//infercnv.observations.txt
INFO [2021-03-11 10:54:49] plot_cnv_references:Start
INFO [2021-03-11 10:54:49] Reference data size: Cells= 104 Genes= 2758
INFO [2021-03-11 10:54:50] plot_cnv_references:Number reference groups= 1
INFO [2021-03-11 10:54:50] plot_cnv_references:Plotting heatmap.
INFO [2021-03-11 10:54:50] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000
INFO [2021-03-11 10:54:50] Quantiles of plotted data range: 0.881784211755114,0.978846385330733,0.998778646338114,1.02034944189329,1.11821578824489
INFO [2021-03-11 10:54:50] plot_cnv_references:Writing reference data to plot_out//infercnv.references.txt
Warning messages:
1: In dir.create(out_dir) : 'plot_out' already exists
2: In dir.create(out_dir) : 'plot_out' already exists
3: In dir.create(out_dir) : 'plot_out' already exists
> end_time <- Sys.time()
> end_time - start_time
Time difference of 29.92654 secs
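This run was finished once the log reached STEP 15. With denoise=F and HMM=F, the later denoising and HMM steps appear to be skipped.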
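As a quick sanity check, the gene and cell counts reported in the log can be reproduced with plain arithmetic. The numbers below are copied from the log lines; the variable names are only labels chosen for this sketch.

```r
# Numbers copied from the log above.
genes_start       <- 8259  # rows after order_reduce
genes_no_position <- 302   # removed using the gene ordering list
genes_low_expr    <- 5199  # removed as below the mean expression cutoff of 0.1
cells_observation <- 57    # epi
cells_reference   <- 104   # ref-fib

# Percentage of genes dropped by the gene ordering filter.
pct_removed <- genes_no_position / genes_start * 100

# Genes left for the heatmap after both filters.
genes_left <- genes_start - genes_no_position - genes_low_expr

# Total number of cells in the matrix.
cells_total <- cells_observation + cells_reference

cat(sprintf("removed by gene order: %.4f%%\n", pct_removed))
cat(sprintf("genes remaining: %d\n", genes_left))
cat(sprintf("cells in total: %d\n", cells_total))
```

The results agree with the log: about 3.66% of genes removed, 2758 genes remaining, and 161 cells.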