ggplot2
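I recently wanted to draw a combined point-and-line chart with ggplot2. It turned out to be harder than I expected, and it took me quite a while to work it out.
Requirements:
- Separate legends for the points and the lines
- Different colours for both the points and the lines
- No border around the points
- A secondary axis
Key points
- Keep the dataset in long format where possible. In other words, the dataset should preferably not be structured like this (wide format):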
DATE a b c
1: 2021-01-01 1 11 21
2: 2021-01-02 2 12 20
3: 2021-01-03 3 13 21
4: 2021-01-04 4 14 20
5: 2021-01-05 5 15 21
6: 2021-01-06 6 16 20
7: 2021-01-07 7 17 21
8: 2021-01-08 8 18 20
9: 2021-01-09 9 19 21
10: 2021-01-10 10 20 20
but like this (long format):
DATE variable value
1: 2021-01-01 a 1
2: 2021-01-02 a 2
3: 2021-01-03 a 3
4: 2021-01-04 a 4
5: 2021-01-05 a 5
6: 2021-01-06 a 6
7: 2021-01-07 a 7
8: 2021-01-08 a 8
9: 2021-01-09 a 9
10: 2021-01-10 a 10
11: 2021-01-01 b 11
12: 2021-01-02 b 12
13: 2021-01-03 b 13
14: 2021-01-04 b 14
15: 2021-01-05 b 15
16: 2021-01-06 b 16
17: 2021-01-07 b 17
18: 2021-01-08 b 18
19: 2021-01-09 b 19
20: 2021-01-10 b 20
21: 2021-01-01 c 21
22: 2021-01-02 c 20
23: 2021-01-03 c 21
24: 2021-01-04 c 20
25: 2021-01-05 c 21
26: 2021-01-06 c 20
27: 2021-01-07 c 21
28: 2021-01-08 c 20
29: 2021-01-09 c 21
30: 2021-01-10 c 20
DATE variable value
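- Adding a secondary axis is rather counterintuitive. The secondary-axis data has to be mapped into the range of the primary axis, and the secondary axis itself is then defined through the inverse of that mapping. For example, suppose the primary-axis data lies in [10, 20] and the secondary-axis data lies in [-100, 200]. Find the linear relation between the two, y = kx + b, where x is the secondary-axis data and y is the primary-axis data; in this example k = (20 - 10) / (200 - (-100)) = 1/30 and b = 10 - k * (-100) = 40/3. When plotting, use y = kx + b to map the secondary-axis data into the primary-axis range, and when adding the secondary axis, use the inverse x = (y - b) / k to build it.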
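- When both geom_point and geom_line carry a colour, the point and line legends can be separated by using the fill aesthetic in geom_point (not colour, which is left for the lines). Note that fill in geom_point only takes effect when shape is between 21 and 25 (I keep forgetting this, and then wonder why fill does nothing). However, shapes 21 to 25 are drawn with a border by default (controlled by the stroke parameter). Many sources online suggest stroke = 0 to remove the border, but that did not work for me; stroke = NA did (my guess is that this depends on the machine or the R version).
- The stroke setting in geom_point is not reflected in the legend. With geom_point(aes(x = DATE, y = VALUE, fill = POINT, stroke = NA)), the points in the legend still had a black border. To fix this, add p <- p + guides(fill = guide_legend(override.aes = list(stroke = NA))), which forces the border out of the legend as well.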
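For the first key point above (long format), here is a minimal sketch of how the wide table could be reshaped with data.table::melt. The names wide_dt and long_dt are made up for illustration, and the values simply mirror the two tables shown above.

```r
library(data.table)

# Wide table: one column per series (same shape as the first table above)
wide_dt <- data.table(
  DATE = as.Date("2021-01-01") + 0:9,
  a = 1:10,
  b = 11:20,
  c = rep(c(21L, 20L), 5)
)

# Long table: one row per (DATE, variable) pair
long_dt <- melt(
  wide_dt,
  id.vars = "DATE",
  variable.name = "variable",
  value.name = "value"
)

long_dt
```

With the data in this shape, a single column such as variable can be mapped to colour, fill, or linetype, which is what makes the separate legends possible.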
Finally, here is a full code example:
library(data.table)
library(ggplot2)
sec_dt <- data.table::data.table(
DATE = c("2021/10/11","2021/10/12","2021/10/13",
"2021/10/14","2021/10/15","2021/10/18","2021/10/19",
"2021/10/20","2021/10/21","2021/10/22","2021/10/25","2021/10/26",
"2021/10/27","2021/10/28","2021/10/29","2021/10/31",
"2021/11/1","2021/11/2","2021/11/3","2021/11/4","2021/11/5",
"2021/11/8","2021/10/11","2021/10/12","2021/10/13","2021/10/14",
"2021/10/15","2021/10/18","2021/10/19","2021/10/20",
"2021/10/21","2021/10/22","2021/10/25","2021/10/26","2021/10/27",
"2021/10/28","2021/10/29","2021/10/31","2021/11/1",
"2021/11/2","2021/11/3","2021/11/4","2021/11/5","2021/11/8",
"2021/10/11","2021/10/12","2021/10/13","2021/10/14","2021/10/15",
"2021/10/18","2021/10/19","2021/10/20","2021/10/21",
"2021/10/22","2021/10/25","2021/10/26","2021/10/27","2021/10/28",
"2021/10/29","2021/10/31","2021/11/1","2021/11/2",
"2021/11/3","2021/11/4","2021/11/5","2021/11/8","2021/10/11",
"2021/10/12","2021/10/13","2021/10/14","2021/10/15","2021/10/18",
"2021/10/19","2021/10/20","2021/10/21","2021/10/22",
"2021/10/25","2021/10/26","2021/10/27","2021/10/28","2021/10/29",
"2021/10/31","2021/11/1","2021/11/2","2021/11/3",
"2021/11/4","2021/11/5","2021/11/8"),
LINE = c("QUANTITY","QUANTITY","QUANTITY",
"QUANTITY","QUANTITY","QUANTITY","QUANTITY","QUANTITY","QUANTITY",
"QUANTITY","QUANTITY","QUANTITY","QUANTITY","QUANTITY",
"QUANTITY","QUANTITY","QUANTITY","QUANTITY","QUANTITY",
"QUANTITY","QUANTITY","QUANTITY","PRICE","PRICE","PRICE",
"PRICE","PRICE","PRICE","PRICE","PRICE","PRICE","PRICE",
"PRICE","PRICE","PRICE","PRICE","PRICE","PRICE","PRICE",
"PRICE","PRICE","PRICE","PRICE","PRICE","BUY_POINT",
"BUY_POINT","BUY_POINT","BUY_POINT","BUY_POINT","BUY_POINT",
"BUY_POINT","BUY_POINT","BUY_POINT","BUY_POINT","BUY_POINT",
"BUY_POINT","BUY_POINT","BUY_POINT","BUY_POINT","BUY_POINT",
"BUY_POINT","BUY_POINT","BUY_POINT","BUY_POINT","BUY_POINT",
"BUY_POINT","SELL_POINT","SELL_POINT","SELL_POINT",
"SELL_POINT","SELL_POINT","SELL_POINT","SELL_POINT","SELL_POINT",
"SELL_POINT","SELL_POINT","SELL_POINT","SELL_POINT",
"SELL_POINT","SELL_POINT","SELL_POINT","SELL_POINT","SELL_POINT",
"SELL_POINT","SELL_POINT","SELL_POINT","SELL_POINT",
"SELL_POINT"),
POINT = c("QUANTITY","QUANTITY","QUANTITY",
"QUANTITY","QUANTITY","QUANTITY","QUANTITY","QUANTITY","QUANTITY",
"QUANTITY","QUANTITY","QUANTITY","QUANTITY","QUANTITY",
"QUANTITY","QUANTITY","QUANTITY","QUANTITY","QUANTITY",
"QUANTITY","QUANTITY","QUANTITY","PRICE","PRICE","PRICE",
"PRICE","PRICE","PRICE","PRICE","PRICE","PRICE","PRICE",
"PRICE","PRICE","PRICE","PRICE","PRICE","PRICE","PRICE",
"PRICE","PRICE","PRICE","PRICE","PRICE","BUY_POINT",
"BUY_POINT","BUY_POINT","BUY_POINT","BUY_POINT","BUY_POINT",
"BUY_POINT","BUY_POINT","BUY_POINT","BUY_POINT","BUY_POINT",
"BUY_POINT","BUY_POINT","BUY_POINT","BUY_POINT","BUY_POINT",
"BUY_POINT","BUY_POINT","BUY_POINT","BUY_POINT","BUY_POINT",
"BUY_POINT","SELL_POINT","SELL_POINT","SELL_POINT",
"SELL_POINT","SELL_POINT","SELL_POINT","SELL_POINT","SELL_POINT",
"SELL_POINT","SELL_POINT","SELL_POINT","SELL_POINT",
"SELL_POINT","SELL_POINT","SELL_POINT","SELL_POINT","SELL_POINT",
"SELL_POINT","SELL_POINT","SELL_POINT","SELL_POINT",
"SELL_POINT"),
VALUE = c(12500,12500,12500,12500,12500,12500,
12500,12500,12500,12500,10100,10100,10100,10100,10100,
10100,10100,10100,10100,10100,10100,0,49.63,50.36,50.06,
52.8,52.5,50.9,50.99,50.91,50.71,50.06,49.71,50.8,
51.09,51.27,52.8,52.8,51.07,50.35,50.25,49.24,49.03,
52.91,49.63,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,
NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,
NA,NA,49.71,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,52.91)
)
# Parse the date strings into Date objects (updates sec_dt by reference)
sec_dt[, DATE := lubridate::ymd(DATE)]

# Two anchor points for the linear map y = k * x + b
# (x: quantity on the secondary axis, y: price on the primary axis).
# The quantity maximum is doubled so the quantity line only fills
# the lower half of the panel.
x1 <- max(sec_dt[LINE == 'QUANTITY', VALUE]) * 2
y1 <- max(sec_dt[LINE == 'PRICE', VALUE])
x2 <- min(sec_dt[LINE == 'QUANTITY', VALUE])
y2 <- min(sec_dt[LINE == 'PRICE', VALUE])
k <- (y2 - y1) / (x2 - x1)
b <- y1 - k * x1
stopifnot(!is.na(k), !is.na(b))

p <- ggplot(sec_dt, aes(x = DATE)) +
  # Points use fill (shape 21), so colour stays free for the lines
  geom_point(
    aes(y = VALUE, size = POINT, fill = POINT, shape = POINT, stroke = NA),
    data = sec_dt[POINT %in% c('BUY_POINT', 'SELL_POINT')], na.rm = TRUE
  ) +
  # Price is drawn directly on the primary axis
  geom_line(
    aes(y = VALUE, linetype = LINE, colour = LINE),
    data = sec_dt[LINE %in% c('PRICE')]
  ) +
  # Quantity is mapped into the price range with y = k * x + b
  geom_line(
    aes(y = VALUE * k + b, linetype = LINE, colour = LINE),
    data = sec_dt[LINE %in% c('QUANTITY')]
  ) +
  scale_linetype_manual('', values = c('PRICE' = 'solid', 'QUANTITY' = 'dashed')) +
  scale_size_manual('', values = c('BUY_POINT' = 3, 'SELL_POINT' = 3)) +
  scale_fill_manual('', values = c('BUY_POINT' = '#FF0000', 'SELL_POINT' = '#99FF66')) +
  scale_colour_manual('', values = c('PRICE' = '#000000', 'QUANTITY' = '#CC0000')) +
  scale_shape_manual('', values = c('BUY_POINT' = 21, 'SELL_POINT' = 21)) +
  scale_y_continuous(
    name = 'price',
    limits = c(min(sec_dt[LINE == 'PRICE', VALUE]), max(sec_dt[LINE == 'PRICE', VALUE])),
    # The secondary axis uses the inverse map x = (y - b) / k
    sec.axis = sec_axis(~ (. - b) / k, name = "quantity")
  ) +
  # Remove the point border in the legend as well
  guides(
    fill = guide_legend(override.aes = list(stroke = NA))
  ) +
  labs(x = NULL, y = NULL) +
  theme_minimal() +
  theme(
    plot.title = element_text(hjust = 0.5),
    panel.grid.minor = element_blank(),
    panel.border = element_blank(),
    axis.title.x = element_text(size = 12),
    axis.title.y = element_text(size = 12),
    legend.text = element_text(size = 8),
    legend.position = "top"
  )
p
The result is shown in the figure below.
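The k and b in the example above come from the linear mapping described in the key points. As a quick sanity check, here is a minimal sketch that works through the simpler ranges from that discussion (primary axis [10, 20], secondary axis [-100, 200]). The helper names to_primary and to_secondary are made up for illustration and are not part of ggplot2.

```r
# Ranges taken from the example in the key points
primary   <- c(10, 20)     # primary-axis range
secondary <- c(-100, 200)  # secondary-axis range

# Slope and intercept of y = k * x + b
k <- diff(primary) / diff(secondary)  # 10 / 300 = 1/30
b <- primary[1] - k * secondary[1]    # 10 + 100/30 = 40/3

# Forward map (used inside aes) and inverse map (used inside sec_axis)
to_primary   <- function(x) k * x + b
to_secondary <- function(y) (y - b) / k

to_primary(secondary)  # endpoints land on the primary range
to_secondary(primary)  # and map back to the secondary range
```

In a plot, to_primary corresponds to the transformation applied to the y aesthetic of the secondary series, and to_secondary corresponds to the formula passed to sec_axis.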
echarts4r
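Related links
echarts4r official site: https://echarts4r.john-coene.com/index.html
A Chinese-language walkthrough from 统计之都 (Capital of Statistics); it is fairly comprehensive and covers most plotting needs: echarts4r: 从入门到应用
The Chinese site for the ECharts API, useful when parts of the English API documentation are hard to follow: https://echarts.apache.org/zh/
Usage
While using the echarts4r package in R, I found some of its functions rather counterintuitive. Adding a second x axis, for example, took me a long time and I never got it to work, so I strongly suspect a bug in the package. The most powerful function in the whole package, however, is e_list(): anything that can be expressed as a JSON option can be done through it. So if a particular plotting need is hard to meet with the package's own functions, go straight to e_list. Take this example from the echarts4r website: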
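library(echarts4r)

N <- 20 # data points

# The chart is described as a plain ECharts option list
opts <- list(
  xAxis = list(
    type = "category",
    data = LETTERS[1:N]
  ),
  yAxis = list(
    type = "value"
  ),
  series = list(
    list(
      type = "line",
      data = round(runif(N, 5, 20))
    )
  )
)

# Pass the option list to an empty chart
p <- e_charts() |>
  e_list(opts)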
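Another handy function is e_inspect, which returns the JSON options behind a chart. If a chart comes out wrong, this source can be used to check where the problem is. Continuing the example above: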
library(magrittr)

json <- p %>%
  e_inspect(
    json = TRUE,
    pretty = TRUE
  )
json
In summary, combining e_inspect and e_list solves almost any plotting problem.
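As an illustration of the e_list approach for the dual x-axis case mentioned above, here is a rough sketch. It assumes the standard ECharts option layout, where xAxis is an array of axis definitions and each series selects its axis through xAxisIndex. The data is random and purely illustrative, the names opts2 and p2 are made up, and I have not checked this against every echarts4r version.

```r
N <- 10  # data points (illustrative)

opts2 <- list(
  # Two category axes: the first at the bottom, the second at the top
  xAxis = list(
    list(type = "category", data = LETTERS[1:N], position = "bottom"),
    list(type = "category", data = letters[1:N], position = "top")
  ),
  yAxis = list(type = "value"),
  # Each series picks its x axis through the zero-based xAxisIndex
  series = list(
    list(type = "line", xAxisIndex = 0, data = round(runif(N, 5, 20))),
    list(type = "line", xAxisIndex = 1, data = round(runif(N, 5, 20)))
  )
)

# Build the chart only if echarts4r is installed; print p2 to render it
if (requireNamespace("echarts4r", quietly = TRUE)) {
  p2 <- echarts4r::e_list(echarts4r::e_charts(), opts2)
}
```

If the chart does not render as expected, passing p2 to e_inspect should show whether the two xAxis entries reached the final JSON.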