前面我们已经介绍了cosinor差异分析最常用的两个包:cosinor
和cosinor2
包,这两个包其实已经做得非常好了,就是还有一点点缺陷,下面一起来看看它们的缺陷:
1.cosinor包
示例来自官网 vignettes
library(cosinor)
head(vitamind)
# X Y time
# 1 0 16.12091 11.439525
# 2 0 29.90624 5.807104
# 3 1 39.17572 1.045492
# 4 1 35.15403 4.082983
# 5 1 43.67065 10.606247
# 6 0 31.20360 5.126054
fit <- cosinor.lm(Y ~ time(time) + X + amp.acro(X), data = vitamind, period = 12)
summary(fit)
# Raw model coefficients:
# estimate standard.error lower.CI upper.CI p.value
# (Intercept) 29.6898 0.4654 28.7776 30.6020 0.0000
# X 1.9019 0.8041 0.3258 3.4779 0.0180
# rrr 0.9308 0.6357 -0.3151 2.1767 0.1431
# sss 6.2010 0.6805 4.8673 7.5347 0.0000
# X:rrr 5.5795 1.1386 3.3479 7.8111 0.0000
# X:sss -1.3825 1.1364 -3.6097 0.8447 0.2237
#
# ***********************
#
# Transformed coefficients:
# estimate standard.error lower.CI upper.CI p.value
# (Intercept) 29.6898 0.4654 28.7776 30.6020 0.000
# [X = 1] 1.9019 0.8041 0.3258 3.4779 0.018
# amp 6.2705 0.6799 4.9378 7.6031 0.000
# [X = 1]:amp 8.0995 0.9095 6.3170 9.8820 0.000
# acr 1.4218 0.1015 1.2229 1.6207 0.000
# [X = 1]:acr 0.6372 0.1167 0.4084 0.8659 0.000
test_cosinor(fit, "X", param = "amp")#param 命名要测试参数的字符串,“amp”表示振幅,“acr”表示顶相(峰值相位)
# Global test:
# Statistic:
# [1] 2.59
#
#
# P-value:
# [1] 0.1072
#
# Individual tests:
# Statistic:
# [1] 1.61
#
#
# P-value:
# [1] 0.1072
#
# Estimate and confidence interval[1] "1.83 (-0.4 to 4.05)"
发现没,其实这个包的核心是检测相位和振幅的差异,但是我们就无法推测基线(均值)差异了吗?其实不然,由summary(fit)
我们其实已经得到了均值差异的结果
1
处代表默认状态下的均值水平,2
处代表因子X=1时与默认状态下的均值差异,所以3
处代表的就是均值差异的p值因此,
cosinor
包中的比较是分散的,整理结果时需要花一番手脚
2.cosinor2
#构造测试数据集
set.seed(100)
mat1 <- data.frame(row.names = paste0("subject",1:10),
x1=1+rnorm(10,0,0.15),
x2=2+rnorm(10,0,0.2),
x3=3+rnorm(10,0,0.3),
x4=4+rnorm(10,0,0.4),
x5=3+rnorm(10,0,0.35),
x6=2+rnorm(10,0,0.25))
mat2 <- data.frame(row.names = paste0("subject",1:12),
x1=8+rnorm(12,0,0.85),
x2=6+rnorm(12,0,0.6),
x3=4+rnorm(12,0,0.45),
x4=2+rnorm(12,0,0.25),
x5=4+rnorm(12,0,0.4),
x6=6+rnorm(12,0,0.65))
time=seq(1,21,4)
library(cosinor2)
fit.pa_ext.cosinor <- population.cosinor.lm(data = mat1 , time = time, period = 24)
# MESOR Amplitude Acrophase
# 1 2.520422 1.364075 -3.399919
fit.pa_int.cosinor <- population.cosinor.lm(data = mat2, time = time, period = 24)
# MESOR Amplitude Acrophase
# 1 4.954428 2.617844 -0.2602902
cosinor.poptests(fit.pa_ext.cosinor, fit.pa_int.cosinor)
# F df1 df2 p 1st population 2nd population
# MESOR 921.87276 1 20 3.315962e-18 2.520422 4.9544275
# Amplitude 14.83446 1 20 9.952569e-04 1.364075 2.6178436
# Acrophase 124.81985 1 20 4.755292e-10 -3.399919 -0.2602902
# Warning message:
# In cosinor.poptests(fit.pa_ext.cosinor, fit.pa_int.cosinor) :
# Results of population amplitude difference test are not reliable due to different acrophases.
看到最后三行没,有一个大大的警告:如果两个种群的峰值相位显著不同,则振幅差异检验的结果不可靠,不可靠,不可靠,重要的结果说三遍
因此,这两个包都有一点点缺陷,不影响正常使用,但使用起来总觉得有点不舒服
3.终极R包circacompare
circacompare
is an R package that allows for the statistical analyses and comparison of two circadian rhythms. This work is published here and can be cited as:
Rex Parsons, Richard Parsons, Nicholas Garner, Henrik Oster, Oliver Rawashdeh, CircaCompare: A method to estimate and statistically support differences in mesor, amplitude, and phase, between circadian rhythms, Bioinformatics, https://doi.org/10.1093/bioinformatics/btz730
示例1:检测单个基因是否振荡
library(circacompare)
library(ggplot2)
set.seed(42)
data_grouped <- make_data(phi1=6, noise_sd=1)
df <- data_grouped[data_grouped$group == "g1",]
head(df)
# time measure group
# 1 1 11.0302167 g1
# 2 2 8.0955559 g1
# 3 3 7.4341962 g1
# 4 4 5.6328626 g1
# 5 5 2.9924588 g1
# 6 6 -0.1061245 g1
out <- circa_single(x = df, col_time = "time", col_outcome = "measure")# call circacompare.
out[[3]] # view the graph.
out[[2]] # view the results.
# parameter value
# 1 rhythmic_p 1.744001e-38
# 2 mesor -4.182901e-02
# 3 amplitude 1.006016e+01
# 4 phase_radians 5.352336e-02
# 5 peak_time_hours 2.044442e-01
# 6 period 2.400000e+01
out[[1]] #view the statistics
# Nonlinear regression model
# model: measure ~ k + (alpha) * cos((1/period) * time_r - (phi))
# data: x
# k alpha phi
# -0.04183 10.06016 0.05352
# residual sum-of-squares: 57.33
#
# Number of iterations to convergence: 4
# Achieved convergence tolerance: 3.492e-06
summary(out[[1]])
# Formula: measure ~ k + (alpha) * cos((1/period) * time_r - (phi))
#
# Parameters:
# Estimate Std. Error t value Pr(>|t|)
# k -0.04183 0.16292 -0.257 0.7985
# alpha 10.06016 0.23040 43.664 <2e-16 ***
# phi 0.05352 0.02290 2.337 0.0239 *
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 1.129 on 45 degrees of freedom
#
# Number of iterations to convergence: 4
# Achieved convergence tolerance: 3.492e-06
nlstools::confint2(out[[1]],level = 0.95)
# 2.5 % 97.5 %
# k -0.369961082 0.28630306
# alpha 9.596113955 10.52421133
# phi 0.007395989 0.09965073
解读:out[[3]]
是ggplot2对象,可以直接输出,也可以修改各种基于ggplot2的参数,图形左上角表明基因振荡是否显著
out[[2]]
是拟合主要统计值,有6行2列,分别为振荡p值(也就是振荡显著性),均值(基线),振幅,峰值相位弧度值,峰值相位小时值,周期(默认为24h)out[[1]]
为拟合结果,summary
查看总结输出更详细的统计资料,nlstools::confint2
获得置信区间
示例2 检测单个基因在两组间是否振荡:核心内容
df <- data_grouped
table(df$group)# 2 groups
# g1 g2
# 48 48
out <- circacompare(x = df, col_time = "time", col_group = "group", col_outcome = "measure")# call circacompare.
out[[1]] # view the graph.
out[[2]]$value <- round(out[[2]]$value,4)
out[[2]] # view the results.
# parameter value
# 1 Presence of rhythmicity (p-value) for g1 0.0000
# 2 Presence of rhythmicity (p-value) for g2 0.0000
# 3 g1 mesor estimate -0.0418
# 4 g2 mesor estimate 3.1483
# 5 Mesor difference estimate 3.1901
# 6 P-value for mesor difference 0.0000
# 7 g1 amplitude estimate 10.0602
# 8 g2 amplitude estimate 14.1477
# 9 Amplitude difference estimate 4.0875
# 10 P-value for amplitude difference 0.0000
# 11 g1 peak time hours 0.2044
# 12 g2 peak time hours 22.8717
# 13 Phase difference estimate -1.3328
# 14 P-value for difference in phase 0.0000
# 15 Shared period estimate 24.0000
out[[3]]
summary(out[[3]])
nlstools::confint2(out[[3]])
解读:out[[1]]
同样返回的为ggplot2对象,如图:
out[[2]]为拟合主要统计值,有15行2列:
1.g1组振荡显著性的p值
2.g2组振荡显著性的p值
3~6.g1组均值,g2组均值,两组均值差异,均值差异显著性
7-10.g1组振幅,g2组振幅,两组振幅差异,振幅差异显著性
11-14.g1组峰值相位,g2组峰值相位,两组峰值相位差异,峰值相位差异显著性
15.两组的统一周期,默认为24h
out[[3]]
为拟合结果,summary
查看总结输出更详细的统计资料,
nlstools::confint2
获得置信区间
注意:单个时out[[3]]
为图像,out[[1]]
为拟合结果
比较时out[[1]]
为图像,out[[3]]
为拟合结果
ps:怎么样,是不是心动了,还有比这更好使的做cosinor工具吗?只恨没有早点发现啊,赶紧install.packages("circacompare")
或devtools::install_github("RWParsons/circacompare")
安装起来吧
不喜欢代码的,也可以试试Shiny版本哦,链接在此:https://rwparsons.shinyapps.io/circacompare/
注意:
您上传的数据应为csv格式,同时您上传的文件应包含以下内容的列:
- 1.时间点(应为数字,以小时为单位)
- 2.分组变量(可以是任何格式,但只能有两个可能的值)
- 3.测量值(应为数字)
上传您的csv文件,然后从下拉菜单中选择相应的列,单击“运行”进行比较。