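ggplot2, developed by Hadley Wickham, is currently the mainstream tool for data visualization in R. Compared with R's base graphics system, ggplot2, which is built on the grid graphics system, has a much more understandable syntax. Even so, producing figures for academic journals with ggplot2 still takes a fair number of plotting function calls (or loading ready-made template code). To address this, Alboukadel Kassambara developed ggpubr on top of ggplot2 and ggsci for drawing figures that meet publication requirements. The package wraps many ggplot2 plotting functions and has many of ggsci's well-regarded journal color palettes built in, so it is well worth learning.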
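ggpubr has several key features:
- It helps researchers quickly create publication-ready plots;
- It can add p-values and significance levels to plots automatically, with no manual post-editing;
- It makes it easy to annotate plots and arrange them on a page;
- It makes it easy to change graphical parameters such as colors and labels.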
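As a rough illustration of the last point, the sketch below draws one box plot and then changes only its colors and labels. The palette names "jco" and "npg" are journal palettes from ggsci that ggpubr accepts by name; the object names p_jco and p_npg and the label texts are made up for this example.

```r
library(ggpubr)

# ToothGrowth ships with base R (datasets package)
df <- ToothGrowth

# A box plot colored with the "jco" journal palette
p_jco <- ggboxplot(df, x = "dose", y = "len", color = "dose", palette = "jco")

# Same plot, switching only the palette and the axis labels via ggpar()
p_npg <- ggpar(p_jco, palette = "npg",
               xlab = "Dose", ylab = "Tooth length")

# Show both versions side by side
ggarrange(p_jco, p_npg, ncol = 2)
```

Both objects are ordinary ggplot objects, so further ggplot2 layers and themes can still be added with `+`.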
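Installation
Install from CRAN:
install.packages("ggpubr")
Or install the latest development version from GitHub:
if(!require(devtools)) install.packages("devtools")
devtools::install_github("kassambara/ggpubr")
Load the ggpubr package:
library("ggpubr")
Plots that ggpubr can draw
Generate the example data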
library(ggpubr)
set.seed(1234)
wdata <- data.frame(
  sex = factor(rep(c("F", "M"), each = 200)),
  weight = c(rnorm(200, 55), rnorm(200, 58)))
head(wdata, 4)
##   sex weight
## 1   F   53.8
## 2   F   55.3
## 3   F   56.1
## 4   F   52.7
Density plot
# Density plot with mean lines and a marginal rug
ggdensity(wdata, x = "weight",
          add = "mean", rug = TRUE,
          color = "sex", fill = "sex",
          palette = c("#00AFBB", "#E7B800"))
Histogram
# Histogram with mean lines and a marginal rug
gghistogram(wdata, x = "weight",
            add = "mean", rug = TRUE,
            color = "sex", fill = "sex",
            palette = c("#00AFBB", "#E7B800"))
Box plots and violin plots
# Load the data
data("ToothGrowth")
df <- ToothGrowth
head(df, 4)
##    len supp dose
## 1  4.2   VC  0.5
## 2 11.5   VC  0.5
## 3  7.3   VC  0.5
## 4  5.8   VC  0.5
# Box plot with jittered points
p <- ggboxplot(df, x = "dose", y = "len",
               color = "dose", palette = c("#00AFBB", "#E7B800", "#FC4E07"),
               add = "jitter", shape = "dose")
p
# Add p-values
my_comparisons <- list(c("0.5", "1"), c("1", "2"), c("0.5", "2"))
p + stat_compare_means(comparisons = my_comparisons) +  # p-value for each pairwise comparison
  stat_compare_means(label.y = 50)                      # global p-value
# Violin plot with a box plot inside
ggviolin(df, x = "dose", y = "len", fill = "dose",
         palette = c("#00AFBB", "#E7B800", "#FC4E07"),
         add = "boxplot", add.params = list(fill = "white")) +
  stat_compare_means(comparisons = my_comparisons, label = "p.signif") +  # significance levels
  stat_compare_means(label.y = 50)                                        # global p-value
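To see the numbers behind these annotations, the comparisons can also be computed as a table with ggpubr's compare_means(). This is a small sketch, not part of the original walkthrough: it assumes the default test (a Wilcoxon test for pairwise comparisons) and uses a Kruskal-Wallis test for the global comparison; the object names pairwise and global are made up here.

```r
library(ggpubr)

df <- ToothGrowth

# Pairwise comparisons between the three dose levels
# (method defaults to "wilcox.test")
pairwise <- compare_means(len ~ dose, data = df)
pairwise

# One global test across all dose levels
global <- compare_means(len ~ dose, data = df, method = "kruskal.test")
global
```

The p.signif column of the result holds the significance symbols that label = "p.signif" prints on the plot.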
Bar plots
Load the data
data("mtcars")
dfm <- mtcars
dfm$cyl <- as.factor(dfm$cyl)
dfm$name <- rownames(dfm)
head(dfm[, c("name", "wt", "mpg", "cyl")])
##                                name   wt  mpg cyl
## Mazda RX4                 Mazda RX4 2.62 21.0   6
## Mazda RX4 Wag         Mazda RX4 Wag 2.88 21.0   6
## Datsun 710               Datsun 710 2.32 22.8   4
## Hornet 4 Drive       Hornet 4 Drive 3.21 21.4   6
## Hornet Sportabout Hornet Sportabout 3.44 18.7   8
## Valiant                     Valiant 3.46 18.1   6
Ordered bar plots
Change the fill color by the grouping variable cyl and sort all values globally rather than within each group.
ggbarplot(dfm, x = "name", y = "mpg",
          fill = "cyl",
          color = "white",
          palette = "jco",
          sort.val = "desc",
          sort.by.groups = FALSE,
          x.text.angle = 90
)
To sort the values within each group instead, set sort.by.groups = TRUE.
ggbarplot(dfm, x = "name", y = "mpg",
          fill = "cyl",
          color = "white",
          palette = "jco",
          sort.val = "asc",
          sort.by.groups = TRUE,
          x.text.angle = 90
)
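To make the difference between the two sort modes concrete, here is a small base-R sketch of the two orderings. It only mimics the bar order that the two settings are meant to produce; it is not ggpubr's internal code, and the names global_order and grouped_order are made up for illustration.

```r
dfm <- mtcars
dfm$cyl <- as.factor(dfm$cyl)
dfm$name <- rownames(dfm)

# Global ordering: ignore the groups and sort by mpg, largest first
global_order <- dfm$name[order(-dfm$mpg)]

# Grouped ordering: sort by cyl first, then by mpg (ascending) within each cyl
grouped_order <- dfm$name[order(dfm$cyl, dfm$mpg)]

head(global_order, 3)
head(grouped_order, 3)
```

In the global ordering, cars with different cylinder counts are interleaved; in the grouped ordering, all 4-cylinder cars come first, then the 6- and 8-cylinder cars.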
Deviation graphs
A deviation graph shows how far each value deviates from a reference value. Below, the z-scores of mpg from the mtcars dataset are used to draw one.
Compute the z-scores of mpg:
dfm$mpg_z <- (dfm$mpg - mean(dfm$mpg)) / sd(dfm$mpg)
dfm$mpg_grp <- factor(ifelse(dfm$mpg_z < 0, "low", "high"),
                      levels = c("low", "high"))
head(dfm[, c("name", "wt", "mpg", "mpg_z", "mpg_grp", "cyl")])
##                                name   wt  mpg  mpg_z mpg_grp cyl
## Mazda RX4                 Mazda RX4 2.62 21.0  0.151    high   6
## Mazda RX4 Wag         Mazda RX4 Wag 2.88 21.0  0.151    high   6
## Datsun 710               Datsun 710 2.32 22.8  0.450    high   4
## Hornet 4 Drive       Hornet 4 Drive 3.21 21.4  0.217    high   6
## Hornet Sportabout Hornet Sportabout 3.44 18.7 -0.231     low   8
## Valiant                     Valiant 3.46 18.1 -0.330     low   6
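As a quick sanity check on the z-score formula, the manual calculation can be compared with base R's scale(), which centers and scales a vector in the same way by default. This is a minimal sketch; the names z_manual and z_scale are made up for this example.

```r
# Manual z-score: subtract the mean, divide by the standard deviation
z_manual <- (mtcars$mpg - mean(mtcars$mpg)) / sd(mtcars$mpg)

# scale() returns a one-column matrix, so convert it back to a plain vector
z_scale <- as.numeric(scale(mtcars$mpg))

# The two results should agree up to floating-point tolerance
all.equal(z_manual, z_scale)

# By construction, z-scores have mean 0 and standard deviation 1
round(c(mean = mean(z_manual), sd = sd(z_manual)), 3)
```

Because the z-scores are centered on 0, splitting them at 0 into "low" and "high" separates cars below the average mpg from those above it.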
Draw an ordered bar plot, colored by mpg group (low or high) and sorted across all values:
ggbarplot(dfm, x = "name", y = "mpg_z",
          fill = "mpg_grp",
          color = "white",
          palette = "jco",
          sort.val = "asc",
          sort.by.groups = FALSE,
          x.text.angle = 90,
          ylab = "MPG z-score",
          xlab = FALSE,
          legend.title = "MPG Group"
)
Rotate the plot:
ggbarplot(dfm, x = "name", y = "mpg_z",
          fill = "mpg_grp",
          color = "white",
          palette = "jco",
          sort.val = "desc",
          sort.by.groups = FALSE,
          x.text.angle = 90,
          ylab = "MPG z-score",
          legend.title = "MPG Group",
          rotate = TRUE,
          ggtheme = theme_minimal()
)
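Dot charts
Lollipop chart
A lollipop chart is an alternative to the bar plots above when you have a large number of values to display.
The lollipops can be colored by the grouping variable "cyl":
ggdotchart(dfm, x = "name", y = "mpg",
           color = "cyl",
           palette = c("#00AFBB", "#E7B800", "#FC4E07"),
           sorting = "ascending",
           add = "segments",
           ggtheme = theme_pubr()
)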
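For readers who want to see what a lollipop chart is made of, the sketch below builds a similar figure with plain ggplot2: one segment per car (the stick) plus one point (the candy). It is an illustration under my own assumptions, not how ggdotchart() is implemented, and the object name p_lollipop is made up.

```r
library(ggplot2)

dfm <- mtcars
dfm$cyl <- as.factor(dfm$cyl)
dfm$name <- rownames(dfm)

# Order the x axis by mpg so the sticks rise from left to right
dfm$name <- factor(dfm$name, levels = dfm$name[order(dfm$mpg)])

p_lollipop <- ggplot(dfm, aes(x = name, y = mpg, color = cyl)) +
  geom_segment(aes(xend = name, y = 0, yend = mpg)) +  # the stick
  geom_point(size = 3) +                               # the candy
  scale_color_manual(values = c("#00AFBB", "#E7B800", "#FC4E07")) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))

p_lollipop
```

Compared with this version, the ggdotchart() call needs only the sorting and add = "segments" arguments to get the ordering and the sticks.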
Rotate the plot, sort within each cyl group, enlarge the dots, and label them with the mpg values:
ggdotchart(dfm, x = "name", y = "mpg",
           color = "cyl",
           palette = c("#00AFBB", "#E7B800", "#FC4E07"),
           sorting = "descending",
           add = "segments",
           rotate = TRUE,
           group = "cyl",
           dot.size = 6,
           label = round(dfm$mpg),
           font.label = list(color = "white", size = 9,
                             vjust = 0.5),
           ggtheme = theme_pubr()
)
Deviation graph
ggdotchart(dfm, x = "name", y = "mpg_z",
           color = "cyl",
           palette = c("#00AFBB", "#E7B800", "#FC4E07"),
           sorting = "descending",
           add = "segments",
           add.params = list(color = "lightgray", size = 2),
           group = "cyl",
           dot.size = 6,
           label = round(dfm$mpg_z, 1),
           font.label = list(color = "white", size = 9,
                             vjust = 0.5),
           ggtheme = theme_pubr()
) +
  geom_hline(yintercept = 0, linetype = 2, color = "lightgray")
Cleveland dot plot
ggdotchart(dfm, x = "name", y = "mpg",
           color = "cyl",
           palette = c("#00AFBB", "#E7B800", "#FC4E07"),
           sorting = "descending",
           rotate = TRUE,
           dot.size = 2,
           y.text.col = TRUE,
           ggtheme = theme_pubr()
) +
  theme_cleveland()
Session information
sessionInfo()
R version 3.6.2 (2019-12-12)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)
Matrix products: default
locale:
[1] LC_COLLATE=Chinese (Simplified)_China.936
[2] LC_CTYPE=Chinese (Simplified)_China.936
[3] LC_MONETARY=Chinese (Simplified)_China.936
[4] LC_NUMERIC=C
[5] LC_TIME=Chinese (Simplified)_China.936
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] ggpubr_0.2.4 magrittr_1.5 ggplot2_3.2.1
loaded via a namespace (and not attached):
[1] Rcpp_1.0.3 rstudioapi_0.10 tidyselect_0.2.5 munsell_0.5.0
[5] colorspace_1.4-1 R6_2.4.1 rlang_0.4.3 dplyr_0.8.3
[9] tools_3.6.2 grid_3.6.2 gtable_0.3.0 withr_2.1.2
[13] lazyeval_0.2.2 assertthat_0.2.1 digest_0.6.23 tibble_2.1.3
[17] lifecycle_0.1.0 ggsignif_0.6.0 crayon_1.3.4 ggsci_2.9
[21] purrr_0.3.3 farver_2.0.3 glue_1.3.1 labeling_0.3
[25] compiler_3.6.2 pillar_1.4.3 scales_1.1.0 pkgconfig_2.0.3