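2019-08-26 Drawing "Beautiful Plots" with ggplot2

A ggplot2 Tutorial for Beautiful Plotting in R

Original article: https://cedricscherer.netlify.com/2019/08/05/a-ggplot2-tutorial-for-beautiful-plotting-in-r/

Introduction

Back in 2016, I had to prepare my PhD introductory talk. Because I strongly disliked the syntax and style of base R plots, I started using ggplot2 to visualize my data from then on. Short on time, I went through a lot of trial and error to produce those figures, with plenty of help from Google searches. The resource I kept coming back to was the blog post Beautiful plotting in R: A ggplot2 cheatsheet by Zev Ross, published on August 4, 2014 and last updated in January 2016. Thanks to that post, my talk contained a number of very good-looking plots, so I decided to work through the tutorial step by step. I learned a great deal from it; I started by modifying the code directly and, over time, added many code snippets, chart types, and resources.

Since Zev Ross's blog post had not been updated for years, I hosted an updated version on my GitHub (last updated in January 2017). It has now found its proper place on this homepage. (I also added several updates, such as the fantastic patchwork and ggforce packages. And a pie chart, because everyone looooves pie charts!)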

image.png

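Here are the main changes I made:

  • followed R style guides (e.g. those by Hadley Wickham, Google, or the Coding Club),
  • changed the style and look of the plots (not just small tweaks: I changed the axis titles, legends, and colors of all plots),
  • updated the code to reflect changes in newer versions of ggplot2,
  • modified the data import (the data now comes from a GitHub source),
  • provided executable R scripts for hands-on practice and workshops,
  • added extra tips, such as:
    • additional chart types (e.g. contour plots, rug plots, ridge plots)
    • how and why to use the viridis color palettes
    • how to create minimal plots in the Tufte style
    • how to adjust the plot title, subtitle, and caption
    • how to add different types of lines to a plot
    • how to change the order of the legend keys and the legend title
    • how to add labels to your data (and how to make them look better)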

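Table of Contents

See the original English article (linked above) for the table of contents, ha ha! 😝😝😝

Preparation

  • Download the data here.
  • The R Markdown script with executable code can be downloaded here.
  • The following packages need to be installed: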
    • ggplot2
    • ggthemes
    • tidyverse
    • extrafont
    • patchwork
    • cowplot
    • grid
    • gridExtra
    • ggrepel
    • reshape2
    • ggforce
    • ggridges
    • shiny
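# grid ships with base R, so it does not need to be installed from CRAN
install.packages(c("ggplot2", "ggthemes", "tidyverse", "extrafont", 
                   "cowplot", "gridExtra", "ggrepel", 
                   "reshape2", "ggforce", "ggridges", "shiny"))

devtools::install_github("thomasp85/patchwork")

(For teaching purposes, the packages needed in each plotting section, apart from ggplot2, are loaded in the respective code chunk.)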

The Dataset

We are using data from the National Morbidity and Mortality Air Pollution Study (NMMAPS). To keep the plots manageable, we limit the data to Chicago and the years 1997 to 2000. For more details on the dataset, see Roger Peng's book Statistical Methods in Environmental Epidemiology with R.

chic <- readr::read_csv("https://raw.githubusercontent.com/Z3tt/R-Tutorials/master/ggplot2/chicago-nmmaps.csv")
tibble::glimpse(chic)
## Observations: 1,461
## Variables: 10
## $ city     <chr> "chic", "chic", "chic", "chic", "chic", "chic", "chic", "chic", ...
## $ date     <date> 1997-01-01, 1997-01-02, 1997-01-03, 1997-01-04, 1997-01-05, ...
## $ death    <dbl> 137, 123, 127, 146, 102, 127, 116, 118, 148, 121, 110, 127, ...
## $ temp     <dbl> 36.0, 45.0, 40.0, 51.5, 27.0, 17.0, 16.0, 19.0, 26.0, 16.0, ...
## $ dewpoint <dbl> 37.50000, 47.25000, 38.00000, 45.50000, 11.25000, 5.75000, ...
## $ pm10     <dbl> 13.052268, 41.948600, 27.041751, 25.072573, 15.343121, ...
## $ o3       <dbl> 5.659256, 5.525417, 6.288548, 7.537758, 20.760798, ...
## $ time     <dbl> 3654, 3655, 3656, 3657, 3658, 3659, 3660, 3661, 3662, 3663, ...
## $ season   <chr> "Winter", "Winter", "Winter", "Winter", "Winter", "Winter", ...
## $ year     <dbl> 1997, 1997, 1997, 1997, 1997, 1997, 1997, 1997, 1997, 1997, ...
head(chic, 10)
## # A tibble: 10 x 10
##    city  date       death  temp dewpoint  pm10    o3  time season  year
##    <chr> <date>     <dbl> <dbl>    <dbl> <dbl> <dbl> <dbl> <chr>  <dbl>
##  1 chic  1997-01-01   137  36      37.5  13.1   5.66  3654 Winter  1997
##  2 chic  1997-01-02   123  45      47.2  41.9   5.53  3655 Winter  1997
##  3 chic  1997-01-03   127  40      38    27.0   6.29  3656 Winter  1997
##  4 chic  1997-01-04   146  51.5    45.5  25.1   7.54  3657 Winter  1997
##  5 chic  1997-01-05   102  27      11.2  15.3  20.8   3658 Winter  1997
##  6 chic  1997-01-06   127  17       5.75  9.36 14.9   3659 Winter  1997
##  7 chic  1997-01-07   116  16       7    20.2  11.9   3660 Winter  1997
##  8 chic  1997-01-08   118  19      17.8  33.1   8.68  3661 Winter  1997
##  9 chic  1997-01-09   148  26      24    12.1  13.4   3662 Winter  1997
## 10 chic  1997-01-10   121  16       5.38 24.8  10.4   3663 Winter  1997

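The ggplot2 Package

ggplot2 is a system for declaratively creating graphics, based on the book The Grammar of Graphics. You provide the data, tell ggplot2 how to map variables to aesthetics and which graphical primitives to use, and it takes care of the details.

A ggplot is therefore built up from a few basic elements:

1. Data:
The raw data that you want to plot.
2. Geometries geom_:
The geometric shapes that represent the data.
3. Aesthetics aes():
The aesthetics of the geometric and statistical objects, such as color, size, shape, transparency, and position.
4. Scales scale_:
The mapping between the data and the aesthetic dimensions, such as data range to plot width or factor values to colors.
5. Statistical transformations stat_:
Statistical summaries of the data, such as quantiles, fitted curves, and sums.
6. Coordinate system coord_:
The transformation used for mapping data coordinates into the plane of the data rectangle.
7. Facets facet_:
The arrangement of the data into a grid of plots.
8. Visual themes theme():
The overall visual defaults of a plot, such as background, grid lines, axes, default typeface, sizes, and colors.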
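To make the eight building blocks concrete, here is a minimal sketch that touches each of them once. It uses R's built-in mtcars data instead of the tutorial's dataset, and the particular geoms, color scale, and theme are arbitrary choices for illustration only.

```r
library(ggplot2)

# 1. data and 3. aesthetics go into the ggplot() call
p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +                                             # 2. geometry
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE) +  # 5. statistical transformation
  scale_color_brewer(palette = "Dark2") +                    # 4. scale
  coord_cartesian(ylim = c(10, 35)) +                        # 6. coordinate system
  facet_wrap(~ am) +                                         # 7. facets
  theme_minimal()                                            # 8. visual theme

p
```

Each `+` adds one component to the plot object, so any of these lines can be removed or swapped without touching the others.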

A Default ggplot

First, we load the ggplot2 package (it is also loaded as part of the tidyverse):

library(ggplot2)
#library(tidyverse)

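The syntax of ggplot2 differs from base R graphics. Following the basic elements listed above, we usually start by defining the plot with ggplot(data = df, aes(x = var1, y = var2)), which tells ggplot2 which data we are going to work with. Running this command on its own only creates an empty panel, because ggplot2 does not yet know how we want to plot the data.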

(g <- ggplot(chic, aes(x = date, y = temp)))
image.png

Tip: Wrapping an assignment in parentheses creates the object and prints it immediately (instead of writing g <- ggplot(...) and then g).
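The parentheses trick is plain R behavior and is not specific to ggplot2. A small sketch with throwaway vectors (the names x and y are only for illustration):

```r
# Plain assignment returns its value invisibly, so nothing is printed
x <- 1:3

# Wrapping the assignment in parentheses makes the value visible, so it prints
(y <- 1:3)

# withVisible() reports whether a value would be auto-printed
withVisible(x <- 1:3)$visible
withVisible((y <- 1:3))$visible
```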

Let's tell ggplot which style we want to use:

g + geom_point()
image.png

Don't worry, we will learn about several plot types later.

Change Color of Points

Within the geom call, you can change its aesthetics, for example the color of the points:

g + geom_point(color = "firebrick")
image.png

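If we applied this to our plot object g, all following plots based on g would have red points.

By setting a different built-in theme, e.g. theme_bw(), we get rid of the default gray look: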

theme_set(theme_bw())

g + geom_point(color = "firebrick")

image.png

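(You can find more on how to use built-in themes and how to customize themes in the section "Working with Themes".)

Working with Axes

Add Axis Labels

Let's add well-written labels to the axes: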

ggplot(chic, aes(x = date, y = temp)) +
  geom_point(color = "firebrick") + 
  labs(x = "Year", y = expression(paste("Temperature (", degree ~ F, ")")))
image.png

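Move Labels Away from the Plot & Change Color

theme() is an essential command for modifying all kinds of theme elements (texts and titles, boxes, symbols, backgrounds, and so on). We are going to use it a lot. For the available options, see here.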

ggplot(chic, aes(x = date, y = temp)) +
  geom_point(color = "firebrick") + 
  labs(x = "Year", y = "Temperature (°F)") + 
  theme(axis.title.x = element_text(color = "sienna", 
                                    size = 15, vjust = -0.35),
        axis.title.y = element_text(color = "orangered", 
                                    size = 15, vjust = 0.35))
image.png

Change Size & Angle of Tick Text

Use angle to rotate the tick text and hjust or vjust to adjust its position afterwards (for hjust: 0 = left-aligned, 0.5 = centered, 1 = right-aligned):

ggplot(chic, aes(x = date, y = temp)) +
  geom_point(color = "firebrick") + 
  labs(x = "Year", y = "Temperature (°F)") +
  theme(axis.text.x = element_text(angle = 50, size = 16, 
                                   vjust = 0.5))
image.png

Remove Axis Ticks & Tick Text

There is rarely a reason to do so, but this is how it works:

ggplot(chic, aes(x = date, y = temp)) +
  geom_point(color = "firebrick") + 
  labs(x = "Year", y = "Temperature (°F)") +
  theme(axis.ticks.y = element_blank(), 
        axis.text.y = element_blank())

image.png

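If you want to get rid of a theme element, the element to use is always element_blank().

Limit Axis Range

Sometimes you want to zoom into part of your data. You can do this without subsetting the data: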

ggplot(chic, aes(x = date, y = temp)) +
  geom_point(color = "firebrick") + 
  labs(x = "Year", y = "Temperature (°F)") +
  ylim(c(0, 50))
image.png

Alternatively, you can use g + scale_y_continuous(limits = c(0, 50)) or g + coord_cartesian(ylim = c(0, 50)). The former removes all data points outside the range, while the latter only adjusts the visible area.
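The difference between the two approaches can be checked on a toy example. This is only a sketch: the data frame df is made up for illustration, and layer_data() is used to inspect the data that ggplot2 keeps for the point layer.

```r
library(ggplot2)

# Hypothetical data: 100 points with y running from -20 to 79
df <- data.frame(x = 1:100, y = -20:79)
p <- ggplot(df, aes(x = x, y = y)) + geom_point()

# Scale limits: values outside 0..50 are turned into NA and later dropped
p_scale <- p + scale_y_continuous(limits = c(0, 50))

# Coordinate limits: all data are kept, only the visible window changes
p_zoom <- p + coord_cartesian(ylim = c(0, 50))

# Count the points that remain available to each plot
count_points <- function(plot) sum(!is.na(layer_data(plot, 1)$y))
n_scale <- count_points(p_scale)
n_zoom  <- count_points(p_zoom)

c(scale_limits = n_scale, coord_limits = n_zoom)
```

This matters most for layers that compute statistics, such as smoothers or box plots: with scale limits the statistic is computed on the reduced data, with coord_cartesian() it is computed on all of it.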

Force Plot to Start at Origin

Related to that, you can force R to plot the graph starting at the origin:

library(tidyverse)

chic %>% 
  dplyr::filter(temp > 25, o3 > 20) %>% 
  ggplot(aes(x = temp, y = o3)) + 
    geom_point() + 
    labs(x = expression(paste("Temperature higher than 25 ", degree ~ F, "")), 
         y = "Ozone higher than 20 ppb") + 
   expand_limits(x = 0, y = 0)

image.png

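coord_cartesian(xlim = c(0, max(chic_red$temp)), ylim = c(0, max(chic_red$o3))) gives the same result, where chic_red stands for the filtered data produced by the dplyr::filter() call above.
You can also force the plot to really start at the origin: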

chic %>% 
  dplyr::filter(temp > 25, o3 > 20) %>% 
  ggplot(aes(x = temp, y = o3)) + 
    geom_point() + 
    labs(x = expression(paste("Temperature higher than 25 ", degree ~ F, "")), 
         y = "Ozone higher than 20 ppb") + 
    expand_limits(x = 0, y = 0) + 
    scale_x_continuous(expand = c(0, 0)) + 
    scale_y_continuous(expand = c(0, 0)) +
    coord_cartesian(clip = "off")
image.png

Axes with Same Scaling

For demonstration purposes, we plot temperature against temperature with some random noise added:

ggplot(chic, aes(x = temp, y = temp + rnorm(nrow(chic), sd = 20))) +
   geom_point() +
   labs(x = "Temperature (°F)") +
   xlim(c(0, 100)) + ylim(c(0, 150)) +
   coord_equal()
image.png

Use a Function to Alter Labels

Sometimes it is handy to alter your labels a little, perhaps adding units or percent signs, without adding them to your data. In this case you can use a function. Here is an example:

ggplot(chic, aes(x = date, y = temp)) +
   geom_point(color = "firebrick") +
   labs(x = "Year", y = "Temperature (°F)") +
   scale_y_continuous(labels = function(x) paste(x, "Degrees Fahrenheit"))
image.png
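The labelling function can also be defined separately and tried out on its own before it is handed to a scale. A small sketch, in which the helper name add_unit and the use of the built-in mtcars data are assumptions made for illustration:

```r
library(ggplot2)

# Hypothetical helper: append a unit to each axis break
add_unit <- function(x) paste(x, "Degrees Fahrenheit")

# Try it on a few values first
add_unit(c(0, 25, 50))

# Any function of the breaks can be passed to the `labels` argument of a scale
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  scale_y_continuous(labels = add_unit)
```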

Working with Titles

Add a Title

You can add a title with the ggtitle() function:

ggplot(chic, aes(x = date, y = temp)) +
  geom_point(color = "firebrick") + 
  labs(x = "Year", y = "Temperature (°F)") + 
  ggtitle("Temperatures in Chicago")
image.png

Alternatively, you can use g + labs(title = "Temperatures in Chicago"). Here you can pass several arguments, for example to add a subtitle and a caption as well:

ggplot(chic, aes(x = date, y = temp)) +
  geom_point(color = "firebrick") + 
  labs(x = "Year", y = "Temperature (°F)",
       title = "Temperatures in Chicago", 
       subtitle = "Seasonal pattern of daily temperatures from 1997 to 2001", 
       caption = "Data: NMMAPS")
image.png

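Make Title Bold & Add a Space at the Baseline

The face argument can be used to make the font bold or italic. The margin argument takes the margin() function, which sets the top, right, bottom, and left margins (the default unit is points).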

ggplot(chic, aes(x = date, y = temp)) +
  geom_point(color = "firebrick") + 
  labs(x = "Year", y = "Temperature (°F)", 
       title = "Temperatures in Chicago") + 
  theme(plot.title = element_text(size = 15, face = "bold", 
                                  margin = margin(10, 0, 10, 0)))
image.png

(A good way to remember the order of the margin sides is "t-r-ou-b-l-e": top, right, bottom, left.)
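If the positional order is hard to remember, the sides can also be named explicitly. A minimal sketch using the same values as in the plot above:

```r
library(ggplot2)

# Naming the arguments makes the top, right, bottom, left order explicit
m <- margin(t = 10, r = 0, b = 10, l = 0, unit = "pt")

# The values are stored in the order top, right, bottom, left
as.numeric(m)
```

margin(10, 0, 10, 0) and the named call describe the same margins.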

Adjust Position of Titles

hjust controls the horizontal alignment (it stands for horizontal adjustment):

ggplot(chic, aes(x = date, y = temp)) +
  geom_point(color = "firebrick") +  
  labs(x = "Year", y = "Temperature (°F)", 
       title = "Temperatures in Chicago") + 
  theme(plot.title = element_text(size = 15, face = 4, hjust = 1))

image.png

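Of course, vjust adjusts the vertical alignment.

Use a Non-Traditional Font in Your Title

You can also use different fonts. To use fonts installed on your machine (perhaps the ones you use in Office programs), we get help from the extrafont package. After loading the package, you need to import and load the fonts installed on your machine:

library(extrafont)
extrafont::font_import()
## Importing fonts may take a few minutes, depending on the number of fonts and the speed of the system.
## Continue? [y/n]
extrafont::loadfonts(device = "win")  # "win" registers the fonts for the Windows graphics device

You can check which fonts were imported with fonts() or fonttable().

Now we use one of these font families: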

ggplot(chic, aes(x = date, y = temp)) +
  geom_point(color = "firebrick") + 
  labs(x = "Year", y = "Temperature (°F)", title = "Temperatures in Chicago") + 
  theme(plot.title = element_text(size = 18, family = "Merriweather"))
image.png

(You can also set a non-default font for all your plots, see the section "Working with Themes". I am going to use Roboto Condensed as the new default font for the plots.)

theme_set(theme_bw(base_size = 12, base_family = "Roboto Condensed"))

Change Spacing in Multi-Line Text

You can use the lineheight argument to change the spacing between lines. In this example, I have squished the lines together a bit (lineheight < 1).

ggplot(chic, aes(x = date, y = temp)) +
  geom_point(color = "firebrick") + 
  labs(x = "Year", y = "Temperature (°F)") + 
  ggtitle("Temperatures in Chicago\nfrom 1997 to 2001") + 
  theme(plot.title = element_text(size = 16, face = "bold", 
                                  vjust = 1, lineheight = 0.75))
image.png

Working with Legends

We will color-code the plot by season. You can see that, by default, the legend title is whatever we specified in the color argument:

ggplot(chic, aes(x = date, y = temp, color = factor(season))) +
  geom_point() +
  labs(x = "Year", y = "Temperature (°F)")
image.png

Turn Off the Legend

One of the first questions people often ask is: "How can I get rid of the legend?"
That is easily done with legend.position = "none":

ggplot(chic, aes(x = date, y = temp, color = factor(season))) +
  geom_point() +
  labs(x = "Year", y = "Temperature (°F)") + 
  theme(legend.position = "none")
image

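Depending on the case, you can also use guides(color = "none") or scale_color_discrete(guide = "none"). Here the mapped aesthetic is color; for a legend created by fill, use guides(fill = "none") or scale_fill_discrete(guide = "none") instead.

Turn Off Legend Titles

As we have already learned, use element_blank() to draw nothing: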

ggplot(chic, aes(x = date, y = temp, color = factor(season))) +
  geom_point() +
  labs(x = "Year", y = "Temperature (°F)") + 
  theme(legend.title = element_blank())
image

Change Legend Position

If you do not want the legend on the right, use the legend.position argument of theme():

ggplot(chic, aes(x = date, y = temp, color = factor(season))) +
  geom_point() +
  labs(x = "Year", y = "Temperature (°F)") + 
  theme(legend.position = "bottom")
image

Possible positions are "top", "right", "bottom", and "left".

Change Style of Legend Titles

You can change the appearance of the legend title by adjusting the theme element legend.title:

ggplot(chic, aes(x = date, y = temp, color = factor(season))) +
  geom_point() +
  labs(x = "Year", y = "Temperature (°F)") + 
  theme(legend.title = element_text(color = "chocolate", 
                                    size = 14, face = "bold"))
image

Change Legend Title

The easiest way to change the legend title is the labs() function:

ggplot(chic, aes(x = date, y = temp, color = factor(season))) +
  geom_point() +
  labs(x = "Year", y = "Temperature (°F)", color = "Seasons\nindicated\nby colors:") + 
  theme(legend.title = element_text(color = "chocolate", 
                                    size = 14, face = "bold"))

image.png

Depending on the type of the displayed variable, the legend details can also be changed via scale_color_discrete or scale_color_continuous:

ggplot(chic, aes(x = date, y = temp, color = factor(season))) +
  geom_point() +
  labs(x = "Year", y = "Temperature (°F)") + 
  theme(legend.title = element_text(color = "chocolate", 
                                    size = 14, face = "bold")) +
  scale_color_discrete(name = "Seasons\nindicated\nby colors:")

Change Order of Legend Keys

We can achieve this by changing the levels of season:

chic$season <- factor(chic$season, levels = c("Spring", "Summer", 
                                              "Autumn", "Winter"))

ggplot(chic, aes(x = date, y = temp, color = factor(season))) +
  geom_point() +
  labs(x = "Year", y = "Temperature (°F)")
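The effect of the levels argument can be seen without plotting anything. A small base R sketch, where the short season vector is a made-up stand-in for chic$season:

```r
# Hypothetical season vector, standing in for chic$season
season <- c("Winter", "Spring", "Summer", "Autumn", "Winter")

# Without explicit levels, factor() sorts the levels alphabetically
levels(factor(season))

# With explicit levels, the given order is kept (and is used for the legend keys)
season_ordered <- factor(season, levels = c("Spring", "Summer", "Autumn", "Winter"))
levels(season_ordered)
```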

Change Legend Labels

We are going to replace the season labels with the months they cover:

ggplot(chic, aes(x = date, y = temp, color = factor(season))) +
  geom_point() +
  labs(x = "Year", y = "Temperature (°F)") + 
  theme(legend.title = element_text(color = "chocolate", 
                                    size = 14, face = 2)) +
  scale_color_discrete("Seasons:", labels = c("Mar - May", "Jun - Aug", 
                                              "Sep - Nov", "Dec - Feb"))
image.png

Change Background Boxes in the Legend

To change the background color (fill) of the legend keys, we adjust the setting for the theme element legend.key:

ggplot(chic, aes(x = date, y = temp, color = factor(season))) +
  geom_point() +
  labs(x = "Year", y = "Temperature (°F)") + 
  theme(legend.key = element_rect(fill = "darkgoldenrod1"),
        legend.title = element_text(color = "chocolate", 
                                    size = 14, face = 2)) +
  scale_color_discrete("Seasons:")
image

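If you want to get rid of the background completely, use fill = NA.

Change Size of Legend Symbols

Points in the legend can get a little lost, especially without the boxes. To override the default, try: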

ggplot(chic, aes(x = date, y = temp, color = factor(season))) +
  geom_point() +
  labs(x = "Year", y = "Temperature (°F)") + 
  theme(legend.key = element_rect(fill = NA),
        legend.title = element_text(color = "chocolate", 
                                    size = 14, face = 2)) +
  scale_color_discrete("Seasons:") +
  guides(color = guide_legend(override.aes = list(size = 6)))
image

Leave a Layer Off the Legend

Say you have a point layer and you add a rug plot of the same data. By default, both the points and the "line" end up in the legend like this:

ggplot(chic, aes(x = date, y = temp, color = factor(season))) +
  geom_point() +
  labs(x = "Year", y = "Temperature (°F)") + 
  geom_rug() +
  theme(legend.title = element_text(color = "chocolate", 
                                    size = 14, face = 2)) +
  scale_color_discrete("Seasons:")

image.png

You can use show.legend = FALSE to turn a layer off in the legend:

ggplot(chic, aes(x = date, y = temp, color = factor(season))) +
  geom_point() +
  labs(x = "Year", y = "Temperature (°F)") + 
  geom_rug(show.legend = F) +
  theme(legend.title = element_text(color = "chocolate", size = 14, face = 2)) +
  scale_color_discrete("Seasons:")
image.png

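Manually Adding Legend Items

ggplot2 will not add a legend automatically unless you map an aesthetic (color, size, etc.) to a variable. More than once, though, I have wanted a legend to make clear what I am plotting.
Here is the default plot: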

ggplot(chic, aes(x = date, y = o3)) +
  geom_line(color = "gray") +
  geom_point(color = "darkorange2") +
  labs(x = "Year", y = "Ozone")
image

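We can force a legend by mapping the aesthetic to a "variable". We map the lines and the points using aes(), but instead of mapping to a variable in our dataset we map to a single string (so that each layer gets exactly one color).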

ggplot(chic, aes(x = date, y = o3)) +
  geom_line(aes(color = "line")) +
  geom_point(aes(color = "points")) +
  labs(x = "Year", y = "Ozone") +
  scale_color_discrete("Type:")
image

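We are getting close, but this is not what we want. We want gray and orange. To change the colors, we use scale_color_manual(). In addition, we override the legend appearance using the guides() function.

Voila! Now we have a plot with gray lines and orange points, as well as a gray line and an orange point as legend symbols: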

ggplot(chic, aes(x = date, y = o3)) + 
  geom_line(aes(color = "line")) +  
  geom_point(aes(color = "points")) +
  labs(x = "Year", y = "Ozone") +
  scale_color_manual("", guide = "legend", 
                     values = c("points" = "darkorange2", 
                                "line" = "gray")) +
  guides(color = guide_legend(override.aes = list(linetype = c(1, 0), 
                                                  shape = c(NA, 16))))
image

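Working with Backgrounds & Grid Lines

There are ways to change the entire look of your plot with one function (see below), but if you simply want to change the colors of some elements you can do that as well.

Change the Panel Color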

ggplot(chic, aes(x = date, y = temp)) +
  geom_point(color = "firebrick") +
  labs(x = "Year", y = "Temperature (°F)") +
  theme(panel.background = element_rect(fill = "moccasin"))
image

Change Grid Lines

There are two types of grid lines: major grid lines that mark the ticks and minor grid lines between the major ones.

ggplot(chic, aes(x = date, y = temp)) +
  geom_point(color = "firebrick") +
  labs(x = "Year", y = "Temperature (°F)") + 
  theme(panel.background = element_rect(fill = "grey90"),
        panel.grid.major = element_line(color = "gray10", size = 0.5),
        panel.grid.minor = element_line(color = "gray70", size = 0.25))
image

Furthermore, you can also specify the breaks for both the major and minor grid lines:

ggplot(chic, aes(x = date, y = temp)) +
  geom_point(color = "firebrick") +
  labs(x = "Year", y = "Temperature (°F)") + 
  scale_y_continuous(breaks = seq(0, 100, 10),
                     minor_breaks = seq(0, 100, 2.5))
image.png

Change the Plot Background Color

To change the background color (fill) of the plot area, modify the theme element plot.background:

ggplot(chic, aes(x = date, y = temp)) +
  geom_point(color = "firebrick") +
  labs(x = "Year", y = "Temperature (°F)") + 
  theme(plot.background = element_rect(fill = "gray60"))
image

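Working with Margins

Sometimes it is useful to add a little space to the plot margin. Similar to the previous examples, we can use an argument of the theme() function, in this case plot.margin. The default margin was already visible above, where we changed the background color using plot.background.

Now let us add extra space on both the left and the right. The plot.margin argument can handle a variety of units (cm, inches, etc.), but the units must be specified with the unit() function from the grid package (ggplot2 re-exports unit(), so the code below runs without attaching grid). Here I am using a 5 cm margin on the left and right sides.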

ggplot(chic, aes(x = date, y = temp)) +
  geom_point(color = "firebrick") +
  labs(x = "Year", y = "Temperature (°F)") + 
  theme(plot.background = element_rect(fill = "gray60"),
        plot.margin = unit(c(1, 5, 1, 5), "cm"))
image

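The order of the margin sides is top, right, bottom, left. An easy way to remember this is the word "trouble": t, r, b, and l are the first letters of the four sides.

Working with Multi-Panel Plots

The ggplot2 package has two nice functions for creating multi-panel plots, called facets. They are related but differ: facet_wrap essentially creates a ribbon of plots based on a single variable, while facet_grid can take two variables.

Create a Single Row of Plots Based on One Variable

facet_wrap creates a facet for a variable, which is preceded by a tilde: facet_wrap(~ variable). The layout of these subplots is controlled by the ncol and nrow arguments: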

g <- ggplot(chic, aes(x = date, y = temp)) +
       geom_point(color = "chartreuse4") +
       labs(x = "Year", y = "Temperature (°F)")

g + facet_wrap(~ year, nrow = 1) +
    theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1))
image

Create a Matrix of Plots Based on One Variable

g + facet_wrap(~ year, nrow = 2)
image.png

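Allow Scales to Roam Free

By default, ggplot2 uses the same scales for every panel of a faceted plot. Sometimes, however, you want each panel to have its own scales. This is generally not a good idea, since it may give readers a wrong impression of the data. You can do it by setting scales = "free":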

g + facet_wrap(~ year, nrow = 2, scales = "free")
image

Note that both the x and y axes now differ in their range between panels.
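Besides "free", facet_wrap() also accepts "free_x" and "free_y", which release only one of the two axes. A minimal sketch on the built-in mtcars data (an assumption made to keep the example self-contained, not part of the tutorial's dataset):

```r
library(ggplot2)

base <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

# Only the y axis varies between panels; the x axis stays shared
p_free_y <- base + facet_wrap(~ cyl, scales = "free_y")

# Only the x axis varies between panels; the y axis stays shared
p_free_x <- base + facet_wrap(~ cyl, scales = "free_x")
```

Freeing only one axis keeps the panels comparable along the other one, which can soften the problem of misleading comparisons mentioned above.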

Create a Grid of Plots Based on Two Variables

With two variables, facet_grid does the job. Here, the order of the variables determines which one defines the rows and which one the columns:

ggplot(chic, aes(x = date, y = temp)) +
  geom_point(color = "orangered") +
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1)) +
  labs(x = "Year", y = "Temperature (°F)") +
  facet_grid(year ~ season)
image

To swap rows and columns, change facet_grid(year ~ season) to facet_grid(season ~ year).

Put Two (Different) Plots Side by Side

There are several ways to combine plots. In my opinion, the easiest approach is the patchwork package by Thomas Lin Pedersen:

p1 <- ggplot(chic, aes(x = date, y = temp, 
                       color = factor(season))) + 
        geom_point() + 
        geom_rug() +
        labs(x = "Year", y = "Temperature (°F)")

p2 <- ggplot(chic, aes(x = date, y = o3)) + 
        geom_line(color = "gray") + 
        geom_point(color = "darkorange2") + 
        labs(x = "Year", y = "Ozone")

library(patchwork)
p1 + p2
image

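We can change the arrangement by "dividing" the two plots, which stacks them (note that the panels stay aligned even though one plot has a legend and the other does not):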

p1 / p2
image.png

Nested layouts are possible as well!

(g + p2) / p1
image.png

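(Note how the plots are aligned even though only one row contains a legend.)
Alternatively, the cowplot package by Claus Wilke provides similar functionality (and many other useful utilities):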

library(cowplot)
plot_grid(p1, p2)

The gridExtra package works as well:

library(gridExtra)
grid.arrange(p1, p2, ncol = 2)

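Working with Themes

Change the Overall Plotting Style

You can change the entire look of your plots by using themes. For example, Jeffrey Arnold has put together the ggthemes library, which collects several custom themes. For a list, see the ggthemes site. Without any further coding you can adopt several styles, some of them well known for their look and aesthetics.

Here is an example that copies the plot style of The Economist magazine: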

library(ggthemes)

ggplot(chic, aes(x = date, y = temp, color = factor(season))) +
  geom_point() +
  labs(x = "Year", y = "Temperature (°F)") + 
  ggtitle("Ups and Downs of Chicago's Daily Temperatures") +
  theme_economist() + 
  scale_color_economist(name = "Seasons:") +
  theme(legend.title = element_text(size = 12, face = "bold"))

image.png

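Another example is the Tufte plotting style, a minimal-ink theme based on Edward Tufte's book The Visual Display of Quantitative Information. This is the book that popularized Minard's chart depicting Napoleon's march on Russia as one of the best statistical graphics ever created. Tufte's plots became famous for the purism of their style. See for yourself: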

set.seed(2019)
chic.red <- chic[sample(nrow(chic), 50), ]

ggplot(chic.red, aes(x = temp, y = o3)) +
  geom_point() +
  labs(x = "Temperature (°F)", y = "Ozone") + 
  ggtitle("Temperature and Ozone Levels in Chicago") +
  theme_tufte() +
  stat_smooth(method = "lm", col = "black", size = 0.7, 
              fill = "gray60", alpha = 0.2)
image.png

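Since Tufte's style is minimalist, I first reduced the number of data points shown so that the plot fits this principle. (Never mind the stat_smooth() command, which I explain later. I only added it to make the plot more interesting.) Adding geom_rangeframe() from ggthemes additionally draws axis lines that span only the range of the data: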

ggplot(chic.red, aes(x = temp, y = o3)) +
  geom_point() +
  labs(x = "Temperature (°F)", y = "Ozone") + 
  ggtitle("Temperature and Ozone Levels in Chicago") +
  theme_tufte() +
  stat_smooth(method = "lm", col = "black", size = 0.7, 
              fill = "gray60", alpha = 0.2) + 
  geom_rangeframe()

image.png

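If you like this way of plotting, have a look at this blog post, which recreates several Tufte plots in R.

Change the Size of All Plot Text Elements

It is easy to change the size of all text elements at once. If you take a closer look at the default theme (see the section "Create and Use Your Custom Theme" below), you will notice that the sizes of all elements are relative to the base size. So if you want to increase the readability of your plots, you can simply change base_size: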

theme_set(theme_gray(base_size = 30, base_family = "Roboto Condensed"))

ggplot(chic, aes(x = date, y = temp, color = factor(season))) + 
  geom_point() + 
  labs(x = "Year", y = "Temperature (°F)") + 
  guides(color = F) 
image.png
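To see how relative sizes follow base_size, one can resolve a theme element for two different base sizes. This is only a sketch: it relies on calc_element() from ggplot2 and on the default theme defining axis.text with size = rel(0.8).

```r
library(ggplot2)

# Build the default theme with two different base sizes
small <- theme_gray(base_size = 10)
large <- theme_gray(base_size = 30)

# calc_element() resolves inherited and relative (rel()) settings to absolute values
size_small <- calc_element("axis.text", small)$size
size_large <- calc_element("axis.text", large)$size

c(small = size_small, large = size_large)
```

Because axis.text is defined as rel(0.8), its resolved size is 0.8 times whatever base_size is passed in.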

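Create and Use Your Custom Theme

If you want to change the theme for an entire session, you can use theme_set, as in theme_set(theme_bw()). The default theme is called theme_gray. If you want to create your own custom theme, you can extract the code from the gray theme and modify it. Note that the rel() function sets sizes relative to base_size.

theme_gray
## function (base_size = 11, base_family = "", base_line_size = base_size/22, 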
##     base_rect_size = base_size/22) 
## {
##     half_line <- base_size/2
##     theme(line = element_line(colour = "black", size = base_line_size, 
##         linetype = 1, lineend = "butt"), rect = element_rect(fill = "white", 
##         colour = "black", size = base_rect_size, linetype = 1), 
##         text = element_text(family = base_family, face = "plain", 
##             colour = "black", size = base_size, lineheight = 0.9, 
##             hjust = 0.5, vjust = 0.5, angle = 0, margin = margin(), 
##             debug = FALSE), axis.line = element_blank(), axis.line.x = NULL, 
##         axis.line.y = NULL, axis.text = element_text(size = rel(0.8), 
##             colour = "grey30"), axis.text.x = element_text(margin = margin(t = 0.8 * 
##             half_line/2), vjust = 1), axis.text.x.top = element_text(margin = margin(b = 0.8 * 
##             half_line/2), vjust = 0), axis.text.y = element_text(margin = margin(r = 0.8 * 
##             half_line/2), hjust = 1), axis.text.y.right = element_text(margin = margin(l = 0.8 * 
##             half_line/2), hjust = 0), axis.ticks = element_line(colour = "grey20"), 
##         axis.ticks.length = unit(half_line/2, "pt"), axis.ticks.length.x = NULL, 
##         axis.ticks.length.x.top = NULL, axis.ticks.length.x.bottom = NULL, 
##         axis.ticks.length.y = NULL, axis.ticks.length.y.left = NULL, 
##         axis.ticks.length.y.right = NULL, axis.title.x = element_text(margin = margin(t = half_line/2), 
##             vjust = 1), axis.title.x.top = element_text(margin = margin(b = half_line/2), 
##             vjust = 0), axis.title.y = element_text(angle = 90, 
##             margin = margin(r = half_line/2), vjust = 1), axis.title.y.right = element_text(angle = -90, 
##             margin = margin(l = half_line/2), vjust = 0), legend.background = element_rect(colour = NA), 
##         legend.spacing = unit(2 * half_line, "pt"), legend.spacing.x = NULL, 
##         legend.spacing.y = NULL, legend.margin = margin(half_line, 
##             half_line, half_line, half_line), legend.key = element_rect(fill = "grey95", 
##             colour = "white"), legend.key.size = unit(1.2, "lines"), 
##         legend.key.height = NULL, legend.key.width = NULL, legend.text = element_text(size = rel(0.8)), 
##         legend.text.align = NULL, legend.title = element_text(hjust = 0), 
##         legend.title.align = NULL, legend.position = "right", 
##         legend.direction = NULL, legend.justification = "center", 
##         legend.box = NULL, legend.box.margin = margin(0, 0, 0, 
##             0, "cm"), legend.box.background = element_blank(), 
##         legend.box.spacing = unit(2 * half_line, "pt"), panel.background = element_rect(fill = "grey92", 
##             colour = NA), panel.border = element_blank(), panel.grid = element_line(colour = "white"), 
##         panel.grid.minor = element_line(size = rel(0.5)), panel.spacing = unit(half_line, 
##             "pt"), panel.spacing.x = NULL, panel.spacing.y = NULL, 
##         panel.ontop = FALSE, strip.background = element_rect(fill = "grey85", 
##             colour = NA), strip.text = element_text(colour = "grey10", 
##             size = rel(0.8), margin = margin(0.8 * half_line, 
##                 0.8 * half_line, 0.8 * half_line, 0.8 * half_line)), 
##         strip.text.x = NULL, strip.text.y = element_text(angle = -90), 
##         strip.placement = "inside", strip.placement.x = NULL, 
##         strip.placement.y = NULL, strip.switch.pad.grid = unit(half_line/2, 
##             "pt"), strip.switch.pad.wrap = unit(half_line/2, 
##             "pt"), plot.background = element_rect(colour = "white"), 
##         plot.title = element_text(size = rel(1.2), hjust = 0, 
##             vjust = 1, margin = margin(b = half_line)), plot.subtitle = element_text(hjust = 0, 
##             vjust = 1, margin = margin(b = half_line)), plot.caption = element_text(size = rel(0.8), 
##             hjust = 1, vjust = 1, margin = margin(t = half_line)), 
##         plot.tag = element_text(size = rel(1.2), hjust = 0.5, 
##             vjust = 0.5), plot.tag.position = "topleft", plot.margin = margin(half_line, 
##             half_line, half_line, half_line), complete = TRUE)
## }
## <bytecode: 0x0000000006c89660>
## <environment: namespace:ggplot2>

Now, let us modify the default theme function and have a look at the result:

theme_custom <- function (base_size = 12, base_family = "Roboto Condensed") {
  half_line <- base_size/2
  theme(line = element_line(color = "black", size = 0.5, linetype = 1, lineend = "butt"), 
        rect = element_rect(fill = "white", color = "black", size = 0.5, linetype = 1), 
        text = element_text(family = base_family, face = "plain", color = "black", 
                            size = base_size, lineheight = 0.9, hjust = 0.5, vjust = 0.5, 
                            angle = 0, margin = margin(), debug = F), 
        axis.line = element_blank(), 
        axis.line.x = NULL, 
        axis.line.y = NULL, 
        axis.text = element_text(size = base_size * 1.1, color = "gray30"), 
        axis.text.x = element_text(margin = margin(t = 0.8 * half_line/2), vjust = 1), 
        axis.text.x.top = element_text(margin = margin(b = 0.8 * half_line/2), vjust = 0), 
        axis.text.y = element_text(margin = margin(r = 0.8 * half_line/2), hjust = 1), 
        axis.text.y.right = element_text(margin = margin(l = 0.8 * half_line/2), hjust = 0), 
        axis.ticks = element_line(color = "gray30", size = 0.7), 
        axis.ticks.length = unit(half_line / 1.5, "pt"), 
        axis.title.x = element_text(margin = margin(t = half_line), vjust = 1, 
                                    size = base_size * 1.3, face = "bold"), 
        axis.title.x.top = element_text(margin = margin(b = half_line), vjust = 0), 
        axis.title.y = element_text(angle = 90, margin = margin(r = half_line), 
                                    vjust = 1, size = base_size * 1.3, face = "bold"), 
        axis.title.y.right = element_text(angle = -90, vjust = 0, 
                                          margin = margin(l = half_line)), 
        legend.background = element_rect(color = NA), 
        legend.spacing = unit(0.4, "cm"), 
        legend.spacing.x = NULL, 
        legend.spacing.y = NULL, 
        legend.margin = margin(0.2, 0.2, 0.2, 0.2, "cm"), 
        legend.key = element_rect(fill = "gray95", color = "white"), 
        legend.key.size = unit(1.2, "lines"), 
        legend.key.height = NULL, 
        legend.key.width = NULL, 
        legend.text = element_text(size = rel(0.8)), 
        legend.text.align = NULL, 
        legend.title = element_text(hjust = 0), 
        legend.title.align = NULL, 
        legend.position = "right", 
        legend.direction = NULL, 
        legend.justification = "center", 
        legend.box = NULL, 
        legend.box.margin = margin(0, 0, 0, 0, "cm"), 
        legend.box.background = element_blank(), 
        legend.box.spacing = unit(0.4, "cm"), 
        panel.background = element_rect(fill = "white", color = NA),
        panel.border = element_rect(color = "gray30", 
                                    fill = NA, size = 0.7),
        panel.grid.major = element_line(color = "gray90", size = 1),
        panel.grid.minor = element_line(color = "gray90", size = 0.5, 
                                        linetype = "dashed"),
        panel.spacing = unit(base_size, "pt"), 
        panel.spacing.x = NULL, 
        panel.spacing.y = NULL, 
        panel.ontop = F, 
        strip.background = element_rect(fill = "white", color = "gray30"), 
        strip.text = element_text(color = "black", size = base_size), 
        strip.text.x = element_text(margin = margin(t = half_line, 
                                                    b = half_line)), 
        strip.text.y = element_text(angle = -90, margin = margin(l = half_line, 
                                                                 r = half_line)), 
        strip.placement = "inside", 
        strip.placement.x = NULL, 
        strip.placement.y = NULL, 
        strip.switch.pad.grid = unit(0.1, "cm"), 
        strip.switch.pad.wrap = unit(0.1, "cm"), 
        plot.background = element_rect(color = NA), 
        plot.title = element_text(size = base_size * 1.8, hjust = 0.5, 
                                  vjust = 1, face = "bold", 
                                  margin = margin(b = half_line * 1.2)), 
        plot.subtitle = element_text(size = base_size * 1.3, hjust = 0.5, vjust = 1, 
                                     margin = margin(b = half_line * 0.9)), 
        plot.caption = element_text(size = rel(0.9), hjust = 1, vjust = 1, 
                                    margin = margin(t = half_line * 0.9)),
        plot.tag = element_text(size = rel(1.2), hjust = 0.5, vjust = 0.5), 
        plot.tag.position = "topleft", 
        plot.margin = margin(base_size, base_size, base_size, base_size), complete = T)
}

Have a look at the modified appearance with its new panel and grid lines, as well as the axis ticks, text, and titles:

theme_set(theme_custom())

ggplot(chic, aes(x = date, y = temp, color = factor(season))) + 
  geom_point() + labs(x = "Year", y = "Temperature (°F)") + guides(color = F)
image.png

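This way of changing the plot design is highly recommended: it lets you change any element of all your plots in one go. Within seconds you can restyle your results to match a given style or any other requirement (e.g. larger font sizes for presentations or journal requirements).

You can also make quick changes using theme_update():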

theme_custom <- theme_update(panel.background = element_rect(fill = "gray60"))

ggplot(chic, aes(x = date, y = temp, color = factor(season))) + 
  geom_point() + labs(x = "Year", y = "Temperature (°F)") + guides(color = F)
image.png

For the following exercises, we use our own theme with a white panel fill and without the minor grid lines:

theme_custom <- theme_update(panel.background = element_rect(fill = "white"),
                             panel.grid.major = element_line(size = 0.5),
                             panel.grid.minor = element_blank())
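
One detail worth knowing here: theme_update() modifies the active theme in place and, like theme_set(), invisibly returns the previous theme. The name it is assigned to therefore holds the old settings rather than a theme function. A minimal sketch of this behavior, in which the built-in theme_gray() stands in for the custom theme so that no extra fonts are needed:

```r
library(ggplot2)

# Start from a known state; theme_gray() is only a stand-in for theme_custom()
theme_set(theme_gray())

# theme_update() changes the active theme and returns the PREVIOUS one
old_theme <- theme_update(panel.grid.minor = element_blank())

# The active theme now has no minor grid lines
minor_now <- theme_get()$panel.grid.minor

# Restoring the returned object brings back the earlier setting
theme_set(old_theme)
minor_restored <- theme_get()$panel.grid.minor
```

Keeping the returned object around is a convenient way to undo a temporary tweak.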

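Working with Colors

For simple applications, working with colors is straightforward in ggplot2, but it can be challenging when you have more advanced needs. For a more advanced treatment of the topic, have a look at Hadley's book, which covers it well. Other good resources include the R Cookbook and the ggplot2 online docs. Tian Zheng at Columbia has also created a useful PDF of R colors.

To use color with your data, the most important thing is to know whether you are dealing with a categorical or a continuous variable.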
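
A quick way to check which case applies is to look at the column types before choosing a scale. The sketch below uses a small hypothetical data frame (the values are made up; only the column names mirror the tutorial data):

```r
# Hypothetical toy data mimicking two columns of the tutorial dataset
df <- data.frame(
  season = c("Winter", "Spring", "Summer", "Autumn"),
  o3     = c(3.3, 20.5, 25.1, 10.2),
  stringsAsFactors = FALSE
)

# TRUE  -> continuous variable (gradient or viridis scales)
# FALSE -> categorical variable (manual, brewer, or other discrete scales)
is_continuous <- vapply(df, is.numeric, logical(1))
is_continuous
```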

Categorical Variables: Manually Select Colors

(g <- ggplot(chic, aes(x = date, y = temp, color = factor(season))) +
        geom_point() + 
        labs(x = "Year", y = "Temperature (°F)") +
        theme(legend.title = element_blank()) +
        scale_color_manual(values = c("dodgerblue4", "darkolivegreen4", 
                                      "darkorchid3", "goldenrod1")))
image.png
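
With an unnamed vector, colors are assigned in the order of the factor levels. If you would rather tie each color to a specific level, scale_color_manual() also accepts a named vector. A small sketch with hypothetical toy data; the level names are an assumption and must match the values in your season column:

```r
library(ggplot2)

# Hypothetical toy data; level names are assumed to match the season column
toy <- data.frame(
  day    = 1:4,
  temp   = c(30, 55, 80, 50),
  season = c("Winter", "Spring", "Summer", "Autumn")
)

# A named vector ties each color to a level regardless of level order
season_colors <- c(Winter = "dodgerblue4", Spring = "darkolivegreen4",
                   Summer = "darkorchid3", Autumn = "goldenrod1")

p <- ggplot(toy, aes(x = day, y = temp, color = season)) +
  geom_point() +
  scale_color_manual(values = season_colors)

# Inspect the colors that were actually assigned to each point
ld <- layer_data(p)
ld[, c("x", "colour")]
```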

Categorical Variables: Use Built-In Palettes

You can use the ColorBrewer palettes via the scale_*_brewer() functions, which are built into ggplot2:

g + scale_color_brewer(palette = "Set1")
image.png

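You can ignore the message printed to the console: it only says that the existing color scale is being replaced, which is exactly what we want.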

Categorical Variables: Use Tableau Colors

Tableau is a well-known visualization software with widely recognized color palettes. R users can access them via scale_color_tableau() from the ggthemes package:

library(ggthemes)
g + scale_color_tableau()
image.png

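Continuous Variables: Default Color Schemes

In this example we change the variable mapped to color: we now color by ozone, a continuous variable that is strongly related to temperature (higher temperature = higher ozone). The function scale_color_gradient() creates a sequential gradient, while scale_color_gradient2() creates a diverging one.

This is the default ggplot2 sequential color scheme: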

ggplot(chic, aes(x = date, y = temp, color = o3)) + 
  geom_point() + 
  labs(x = "Year", y = "Temperature (°F)") +
  scale_color_continuous("Ozone:")
image.png

This code produces the same plot:

ggplot(chic, aes(x = date, y = temp, color = o3)) +  
  geom_point() + 
  labs(x = "Year", y = "Temperature (°F)") +
  scale_color_gradient()

And here is the default diverging color scheme:

ggplot(chic, aes(x = date, y = temp, color = o3)) +  
  geom_point() + 
  labs(x = "Year", y = "Temperature (°F)") +
  scale_color_gradient2()
image.png

Continuous Variables: Manually Set a Sequential Color Scheme

For continuous variables you can set a gradient palette manually via scale_*_gradient():

ggplot(chic, aes(x = date, y = temp, color = o3)) + 
  geom_point() + 
  labs(x = "Year", y = "Temperature (°F)") +
  scale_color_gradient(low = "darkkhaki", high = "darkgreen", "Ozone:")
image.png

Temperature data is normally distributed, so how about a diverging color scheme (rather than a sequential one)? For diverging colors, use the scale_color_gradient2() function:

mid <- max(chic$o3) / 2  # or mid <- mean(chic$o3)

ggplot(chic, aes(x = date, y = temp, color = o3)) + 
  geom_point() + 
  labs(x = "Year", y = "Temperature (°F)") + 
  scale_color_gradient2(midpoint = mid, low = "blue4", 
                        mid = "white", high = "red4", "Ozone:")
image.png

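Continuous Variables: The Beautiful Viridis Color Palette

The viridis color palettes not only make your plots prettier and easier to read, they also remain legible for readers with color blindness and when printed in grayscale. You can test how your plots would appear under various forms of color blindness using the dichromat package.
The following multi-panel plot illustrates three of the viridis palettes: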

g <- ggplot(chic, aes(x = date, y = temp, color = o3)) + 
       geom_point() + 
       labs(x = "Year", y = "Temperature (°F)")

library(viridis)
p1 <- g + scale_color_viridis("Ozone:") + ggtitle("'viridis' (default)")
p2 <- g + scale_color_viridis(option = "inferno", "Ozone:") + ggtitle("'inferno'")
p3 <- g + scale_color_viridis(option = "cividis", "Ozone:") + ggtitle("'cividis'")

library(patchwork)
(p1 + p2 + p3) * theme(legend.position = "bottom")
image.png

The viridis palettes can also be used for discrete variables:

ggplot(chic, aes(x = date, y = temp, color = factor(season))) +
  geom_point() + 
  labs(x = "Year", y = "Temperature (°F)") +
  theme(legend.title = element_blank()) +
  scale_color_viridis(discrete = T, end = 1)
image.png

Working with Lines

Add Horizontal or Vertical Lines to a Plot

You might want to highlight a given range or threshold. Use geom_hline() (for horizontal lines) or geom_vline() (for vertical lines) to draw a line at the given coordinates:

ggplot(chic, aes(x = date, y = temp, color = o3)) + 
  geom_point() + 
  labs(x = "Year", y = "Temperature (°F)") + 
  geom_hline(yintercept = c(0, 73))
image.png

If you want to add a line that is neither horizontal nor vertical, you need geom_abline(). Here is an example that adds a regression line:

reg <- lm(o3 ~ temp, data = chic)
ggplot(chic, aes(x = temp, y = o3)) +
  geom_point(alpha = 0.5) +
  labs(caption = paste0("y = ", round(coefficients(reg)[2], 2), 
                        " * x + ", round(coefficients(reg)[1], 2)), 
       x = "Temperature (°F)", y = "Ozone") + 
  geom_abline(intercept = coefficients(reg)[1], slope = coefficients(reg)[2], 
              color = "darkorange2", size = 1.5)
image.png
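
The intercept and slope passed to geom_abline() come straight from the fitted model. As a small worked example on hypothetical toy data that lies exactly on the line y = 1 + 2x, coef() returns the intercept first and the slope second:

```r
# Toy data that lies exactly on the line y = 1 + 2 * x
toy <- data.frame(x = 1:5)
toy$y <- 1 + 2 * toy$x

fit <- lm(y ~ x, data = toy)

# coef() returns a named vector: "(Intercept)" first, then the slope for x
intercept <- unname(coef(fit)["(Intercept)"])
slope     <- unname(coef(fit)["x"])
c(intercept = intercept, slope = slope)
```

Indexing by name rather than by position ([1], [2]) is a little more robust if the model formula changes later.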

Later, we will learn how to add a linear fit with a single command via stat_smooth(method = "lm"). However, there might be other reasons to add a line with a given slope.

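Working with Text

Add Labels to Your Data

Sometimes we want to label our data points. To avoid overlapping and crowded text labels, we use a 1% sample of the original data that represents the four seasons equally.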

set.seed(1)

library(tidyverse)
sample <- chic %>% 
  dplyr::group_by(season) %>% 
  dplyr::sample_frac(0.01)

## code without pipes: 
## sample <- sample_frac(group_by(chic, season), 0.01)

chic %>% 
  group_by(season) %>% 
  sample_frac(0.01) %>% 
  ggplot(aes(x = date, y = temp, label = season)) +
    geom_point() + 
    geom_text(aes(color = factor(temp)), hjust = 0.5, vjust = -0.5) +
    labs(x = "Year", y = "Temperature (°F)") +
    xlim(as.Date(c('1997-01-01', '2000-12-31'))) + 
    ylim(c(0, 90)) +
    theme(legend.position = "none")
image.png
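
In dplyr 1.0.0 and later, sample_frac() is superseded by slice_sample(). The grouped sampling step can be sketched with the newer function as follows; the toy data frame is hypothetical and the 25% fraction is chosen only so that the small example returns a few rows per group:

```r
library(dplyr)

set.seed(1)

# Hypothetical toy data: 8 rows per season
toy <- data.frame(
  season = rep(c("Winter", "Spring", "Summer", "Autumn"), each = 8),
  temp   = seq_len(32)
)

# Draw 25% of the rows within each season
toy_sample <- toy %>%
  group_by(season) %>%
  slice_sample(prop = 0.25) %>%
  ungroup()

# Every season contributes the same number of rows
season_counts <- count(toy_sample, season)
season_counts
```

Note also that naming the result sample, as in the code above, masks base R's sample() function in your workspace; a name such as chic_sample would avoid that.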

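Okay, avoiding crowded labels did not quite work out. Don't worry, we will fix that in a minute.
You can also use geom_label() to draw boxes around the labels: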

ggplot(sample, aes(x = date, y = temp, label = season)) +
  geom_point() + 
  geom_label(aes(fill = factor(temp)), color = "white", 
             fontface = "bold", hjust = 0.5, vjust = -0.25) +
  labs(x = "Year", y = "Temperature (°F)") +
  xlim(as.Date(c('1997-01-01', '2000-12-31'))) +
  ylim(c(0, 90)) +
  theme(legend.position = "none")

image.png

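A very cool package is ggrepel, which provides geoms for ggplot2 that push overlapping labels, like the ones above, apart. Here we show both the original data and our labeled sample: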

library(ggrepel)
ggplot(chic, aes(x = date, y = temp, label = season)) +
  geom_point(alpha = 0.5) +
  geom_point(data = sample, aes(color = factor(temp)), size = 2.5) +
  geom_label_repel(data = sample, aes(fill = factor(temp)), 
                   color = "white", fontface = "bold") +
  labs(x = "Year", y = "Temperature (°F)") +
  theme(legend.position = "none")
image.png

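For plain text labels, geom_text_repel() works the same way. Have a look at all the usage examples.

Add Text Annotation in the Top-Right, Top-Left etc.

ggplot2 lets you set annotation coordinates to Inf, but this is of limited use. Here is an example (based on code from this Google group) that uses the grid package to specify the position in relative coordinates, where 0 is the lowest and 1 the highest position.

The grobTree() function from the grid package creates a grid graphical object, and textGrob() creates a text graphical object. The annotation_custom() function comes from ggplot2 and is designed to take a grob as input.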

library(grid)
my_grob <- grobTree(textGrob("This text stays in place!", 
                             x = 0.1, y = 0.9, hjust = 0, 
                             gp = gpar(col = "black", 
                                       fontsize = 15, 
                                       fontface = "bold")))

ggplot(chic, aes(x = temp, y = o3)) +
  geom_point(color = "tan", alpha = 0.5) + 
  labs(x = "Temperature (°F)", y ="Ozone") +
  annotation_custom(my_grob)
image.png

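This is particularly valuable when you have multiple plots with different scales. In the plot below the axis ranges vary widely, yet the same code as above places the annotation at the same position in every facet.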

ggplot(chic, aes(x = temp, y = o3)) +
  geom_point(color = "tan") + 
  labs(x = "Temperature (°F)", y ="Ozone") +
  facet_wrap(~ season, scales = "free") +
  annotation_custom(my_grob)
image.png

Working with Coordinates

Flip a Plot

Flipping a plot on its side is very easy. Here we add coord_flip(), which is all you need to flip the plot (along the way, we use geom_boxplot() to draw a new type of plot):

ggplot(chic, aes(x = season, y = o3)) +
  geom_boxplot(fill = "indianred") + 
  labs(x = "Season", y = "Ozone") +
  coord_flip()
image.png

Reverse an Axis

You can just as easily reverse an axis using scale_x_reverse() or scale_y_reverse(), respectively:

ggplot(chic, aes(x = date, y = temp, color = o3)) + 
  geom_point() + 
  labs(x = "Year", y = "Temperature (°F)") + 
  scale_y_reverse()
image.png

Transform an Axis

You can transform the default linear mapping using scale_y_log10() or scale_y_sqrt(). As an example, here is a log10-transformed axis (note that this introduces NAs):

ggplot(chic, aes(x = date, y = temp, color = o3)) + 
  geom_point() + 
  labs(x = "Year", y = "Temperature (°F)") + 
  scale_y_log10(limits = c(0.1, 100))
image.png
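
The NAs come from values for which the logarithm is not defined. A quick base R check on a few hypothetical temperatures shows what happens to negative values and zeros:

```r
temps <- c(-5, 0, 10, 100)  # hypothetical temperatures in °F

# log10() is undefined for negative values (NaN, with a warning)
# and gives -Inf for zero
logged <- suppressWarnings(log10(temps))
logged

# Number of values that a log10 axis cannot display
n_lost <- sum(!is.finite(logged))
n_lost
```

Points like these are dropped from the plot, which is worth keeping in mind for a temperature series that includes days at or below 0 °F.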

Circularize a Plot

You can make the coordinate system circular (polar) with coord_polar().

library(tidyverse)

chic %>% 
  dplyr::group_by(season) %>% 
  dplyr::summarize(o3 = median(o3)) %>% 
  ggplot(aes(x = season, y = o3)) +
    geom_col(aes(fill = factor(season))) + 
    labs(x = "", y = "Median Ozone Level") +
    coord_polar() +
    guides(fill = F)
image.png

This coordinate system also lets you draw pie charts:

chic %>% 
  dplyr::mutate(o3_avg = median(o3)) %>% 
  dplyr::filter(o3 > o3_avg) %>% 
  dplyr::mutate(n_all = n()) %>% 
  dplyr::group_by(season) %>% 
  dplyr::summarize(rel = n() / unique(n_all)) %>% 
  ggplot(aes(x = "", y = rel)) +
    geom_col(aes(fill = factor(season)), width = 1) + 
    labs(x = "", 
         y = "Proportion of Days Exceeding\nthe Median Ozone Level") +
    coord_polar("y") +
    scale_fill_brewer(palette = "Set1", name = "Season:") +
    theme(axis.ticks = element_blank())
image.png

Working with Chart Types

Alternatives to a Box Plot

Box plots are great, but they can be incredibly boring. There are several alternatives, but first let us plot a common box plot:
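
The code for this box plot is missing from this copy of the tutorial, and the snippets that follow build on an object g that maps season to the x axis and ozone to the y axis. The sketch below is a reconstruction under that assumption, not the original code: the helper season_base(), the axis labels, the fill color, and the toy data are all hypothetical.

```r
library(ggplot2)

# Hypothetical helper: builds the base plot that the following snippets add layers to
season_base <- function(data) {
  ggplot(data, aes(x = season, y = o3)) +
    labs(x = "Season", y = "Ozone")
}

# Toy stand-in so the sketch runs on its own; with the tutorial data use
# g <- season_base(chic)
toy <- data.frame(
  season = rep(c("Winter", "Summer"), each = 5),
  o3     = c(5, 8, 10, 12, 30, 15, 22, 25, 28, 40)
)

g <- season_base(toy)
box <- g + geom_boxplot(fill = "darkgoldenrod2")
box
```

With the tutorial data loaded, g <- season_base(chic) gives an object on which the geoms in the following sections can be layered.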

image

Effective? Yes.
Interesting? No.

1. Alternative: Plot of Points

Let us plot every data point of the raw data:

g + geom_point(color = "firebrick")
image

Not only boring but also uninformative. To improve the plot, we can add transparency to deal with overplotting:

g + geom_point(color = "firebrick", alpha = 0.1)
image.png

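However, transparency does not help here either: the overlap is still severe, and high or extreme values remain barely visible. Bad, so let us try something else.

2. Alternative: Jitter the Points

Try adding a little jitter to the data. I like this for in-house visualizations, but be careful: jittering deliberately adds noise to your data, which can lead to misinterpretation.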

g + geom_jitter(aes(color = season), alpha = 0.25, 
                position = position_jitter(width = 0.3)) +
    theme(legend.position = "none")
image.png
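
Two things help keep jitter under control. By default position_jitter() also shifts points vertically, so setting height = 0 leaves the measured values untouched, and the seed argument makes the random offsets reproducible. A minimal sketch on hypothetical toy data:

```r
library(ggplot2)

# Hypothetical toy data with identical values within each season
toy <- data.frame(
  season = rep(c("Winter", "Summer"), each = 20),
  o3     = rep(c(10, 25), each = 20)
)

# height = 0 jitters horizontally only; seed fixes the random offsets
p <- ggplot(toy, aes(x = season, y = o3)) +
  geom_jitter(position = position_jitter(width = 0.3, height = 0, seed = 1),
              alpha = 0.25)

# Two independent builds yield the same jittered x positions
x_first  <- layer_data(p)$x
x_second <- layer_data(p)$x
identical(x_first, x_second)

# The y values are unchanged because height = 0
y_built <- layer_data(p)$y
```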

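3. Alternative: Violin Plots

Violin plots are similar to box plots, with the notable advantage that they use a kernel density estimate to show where most of the data lies. They are a useful visualization.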

g + geom_violin(color = "sienna", fill = "red", alpha = 0.4)
image.png

4. Alternative: Combining Violin Plots with Jitter

We can of course combine both, showing the estimated densities together with the raw data points:

g + geom_violin(aes(color = season), fill = "gray80", alpha = 0.5) +
    geom_jitter(aes(color = season), alpha = 0.25, 
                position = position_jitter(width = 0.3)) +
    theme(legend.position = "none") +
    coord_flip()

image.png

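The ggforce package provides so-called sina plots via geom_sina(), in which the width of the jitter is controlled by the density of the data. This makes the jittering more visually appealing: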

library(ggforce)

g + geom_violin(aes(color = season), fill = "gray80", alpha = 0.5) +
    geom_sina(aes(color = season), alpha = 0.25) +
    theme(legend.position = "none") +
    coord_flip()
image.png

5. Alternative: Combining Violin Plots with Box Plots

To make the quantiles easy to estimate, we can add a box plot inside each violin that indicates the 25% quartile, the median, and the 75% quartile:

g + geom_violin(aes(fill = season), color = "transparent", alpha = 0.5) +
    geom_boxplot(outlier.alpha = 0, coef = 0, 
                 color = "gray40", width = 0.1) +
    theme(legend.position = "none") +
    coord_flip()
image.png
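
The three lines of each box correspond to quantiles that you can also compute directly. A small worked example on hypothetical values, using base R's quantile() with its default settings:

```r
o3_toy <- c(2, 4, 6, 8, 10)  # hypothetical ozone values

# 25% quartile, median, and 75% quartile
q <- quantile(o3_toy, probs = c(0.25, 0.5, 0.75))
q
```

In the plot above, coef = 0 shrinks the whiskers to nothing and outlier.alpha = 0 hides the outliers, so only the box with these three values remains visible inside the violin.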

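Create a Rug Representation to a Plot

A rug represents the data of a single quantitative variable as marks along an axis. In most cases it is used alongside scatter plots or heatmaps to visualize the overall distribution of one or both variables: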

ggplot(chic, aes(x = date, y = temp, color = factor(season))) +
  geom_point() +
  geom_rug() +
  labs(x = "Year", y = "Temperature (°F)") +
  theme(legend.position = "none")
image.png
ggplot(chic, aes(x = date, y = temp, color = factor(season))) +
  geom_point() +
  geom_rug(sides = "r", alpha = 0.3) +
  labs(x = "Year", y = "Temperature (°F)") +
  theme(legend.position = "none")
image.png

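Create a Tiled Correlation Plot

The first step is to create the correlation matrix. We use Pearson correlations because all variables are fairly normally distributed (consider Spearman if your variables follow a different distribution). Note that since a correlation matrix contains redundant information, we set half of it to NA: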

corm <- round(cor(chic[ , sort(c("death", "temp", "dewpoint", "pm10", "o3"))], 
                  method = "pearson", use = "pairwise.complete.obs"), 2)
corm[lower.tri(corm)] <- NA
corm
##          death dewpoint    o3 pm10  temp
## death        1    -0.47 -0.24 0.00 -0.49
## dewpoint    NA     1.00  0.45 0.33  0.96
## o3          NA       NA  1.00 0.21  0.53
## pm10        NA       NA    NA 1.00  0.37
## temp        NA       NA    NA   NA  1.00

Now we use the melt() function from the reshape2 package to put the matrix into long format and drop the records with NA values:

library(reshape2)
corm <- melt(corm)
corm$Var1 <- as.character(corm$Var1)
corm$Var2 <- as.character(corm$Var2)
corm <- na.omit(corm)
head(corm, 10)
##        Var1     Var2 value
## 1     death    death  1.00
## 6     death dewpoint -0.47
## 7  dewpoint dewpoint  1.00
## 11    death       o3 -0.24
## 12 dewpoint       o3  0.45
## 13       o3       o3  1.00
## 16    death     pm10  0.00
## 17 dewpoint     pm10  0.33
## 18       o3     pm10  0.21
## 19     pm10     pm10  1.00
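
If you prefer not to depend on reshape2, the same reshaping can be sketched in base R with as.table() and as.data.frame(). This is an alternative technique rather than the tutorial's approach; the toy data and its column names are placeholders:

```r
# Toy data; the column names are placeholders
toy <- data.frame(a = c(1, 2, 3, 4), b = c(2, 4, 5, 9), c = c(9, 7, 4, 1))

corm_toy <- round(cor(toy, method = "pearson"), 2)
corm_toy[lower.tri(corm_toy)] <- NA  # drop the redundant lower triangle

# as.table() + as.data.frame() gives one row per matrix cell
corm_long <- as.data.frame(as.table(corm_toy), stringsAsFactors = FALSE)
names(corm_long)[3] <- "value"  # the default name of this column is "Freq"
corm_long <- na.omit(corm_long)
corm_long
```

For a 3 x 3 matrix this leaves six rows: the three diagonal cells plus the three cells of the upper triangle.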

For the plot we use geom_tile(). If you have a lot of data, consider geom_raster(), which can be much faster.

ggplot(corm, aes(x = Var2, y = Var1)) +
   geom_tile(data = corm, aes(fill = value), color = "white") +
   labs(x = "Variable 2", y = "Variable 1") +
   scale_fill_gradient2(low = "blue", high = "red", mid = "white", 
                        midpoint = 0, limit = c(-1, 1), 
                        name = "Correlation\n(Pearson)") +
   theme(axis.text.x = element_text(angle = 45, size = 11, 
                                    vjust = 1, hjust = 1)) +
   coord_equal()
image

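Create a Contour Plot

Contour plots are a nice way to display three-dimensional data by marking threshold values. Here we plot the dew point (i.e. the temperature at which water vapor in the air condenses into liquid dew) in relation to temperature and ozone level: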

## interpolate data
library(akima)
fld <- with(chic, interp(x = temp, y = o3, z = dewpoint))

## prepare data in long format
library(reshape2)
df <- melt(fld$z, na.rm = T)
names(df) <- c("x", "y", "Dewpoint")
df$Temperature <- fld$x[df$x]
df$Ozone <- fld$y[df$y]

g <- ggplot(data = df, aes(x = Temperature, y = Ozone, z = Dewpoint)) +
         theme(panel.background = element_rect(fill = "white"),
               panel.border = element_rect(color = "black", fill = NA),
               legend.title = element_text(size = 15),
               axis.text = element_text(size = 12),
               axis.title.x = element_text(size = 15, vjust = -0.5),
               axis.title.y = element_text(size = 15, vjust = 0.2),
               legend.text = element_text(size = 12))
         
g + stat_contour(aes(color = ..level.., fill = Dewpoint))
image.png

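Surprise! As it is defined, the dew point is in most cases equal to the measured temperature.

The lines indicate different levels of dew point, but this is not a pretty plot, and it is hard to read because the borders are missing. Let us try a tile plot that uses the viridis palette to encode the dew point for each combination of ozone level and temperature: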

g + geom_tile(aes(fill = Dewpoint)) +
    scale_fill_viridis(option = "inferno")
image.png

How does it look if we combine the contour plot with the tile plot to fill the areas under the contour lines?

g + geom_tile(aes(fill = Dewpoint)) + 
    stat_contour(color = "white", size = 0.7, bins = 5) + 
    scale_fill_viridis()
image.png

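Create a Joyplot aka Ridge Plot

Joyplots (also called ridgeline plots) are a new type of plot that is quite popular at the moment. (Fun fact: the name refers to the cover of Joy Division's "Unknown Pleasures" LP.)

You can create these plots with basic ggplot commands, but their popularity led to a package that makes them much easier to create: ggridges. We use this package here.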

library(ggridges)
ggplot(chic, aes(x = temp, y = factor(year))) + 
   geom_density_ridges(fill = "gray90") +
   labs(x = "Temperature (°F)", y = "Year")

image.png

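You can easily specify the overlap and the trailing tails using the arguments scale and rel_min_height, respectively. The package also comes with its own theme (but I prefer to build my own, see "Create and Use Your Custom Theme"). In addition, we vary the fill color by year to make the plot more appealing.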

ggplot(chic, aes(x = temp, y = factor(year), fill = year)) + 
  geom_density_ridges(alpha = 0.8, color = "white", 
                      scale = 2.5, rel_min_height = 0.01) + 
  labs(x = "Temperature (°F)", y = "Year") + 
  guides(fill = F) + 
  theme_ridges()
image.png

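You can also remove the overlap by setting scale to a value below 1 (although this somewhat defeats the purpose of a ridgeline plot). Here is an example that also uses the viridis palette: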

ggplot(chic, aes(x = temp, y = season, fill = ..x..)) + 
  geom_density_ridges_gradient(scale = 0.9, gradient_lwd = 0.5, 
                               color = "black") + 
  scale_fill_viridis(option = "plasma", name = "") + 
  labs(x = "Temperature (°F)", y = "Season:") +
  theme_ridges(font_family = "Roboto Condensed", grid = F)
image.png

We can also compare several groups per ridgeline and color them by group. This follows an idea by Marc Belzunces.

library(tidyverse)

## only plot extreme season using dplyr from the tidyverse
ggplot(data = filter(chic, season %in% c("Summer", "Winter")), 
         aes(x = temp, y = year, fill = paste(year, season))) +
  geom_density_ridges(alpha = 0.7, rel_min_height = 0.01, 
                      color = "white", from = -5, to = 95) +
  scale_fill_cyclical(breaks = c("1997 Summer", "1997 Winter"),
                      labels = c(`1997 Summer` = "Summer", 
                                 `1997 Winter` = "Winter"),
                      values = c("tomato", "dodgerblue"),
                      name = "Season:", guide = "legend") +
  theme_ridges(font_family = "Roboto Condensed") + 
  labs(x = "Temperature (°F)", y = "Year")

image.png

The ggridges package can also create histograms by setting stat = "binline" in geom_density_ridges():

ggplot(chic, aes(x = temp, y = factor(year), fill = year)) + 
  geom_density_ridges(stat = "binline", bins = 25, scale = 0.9, 
                      draw_baseline = F, show.legend = F) + 
  theme_ridges(font_family = "Roboto Condensed") +
  labs(x = "Temperature (°F)", y = "Year")
image.png

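Working with Ribbons (AUC, CI, etc.)

This dataset is not ideal for demonstrating ribbons, but they are really useful. In this example we use the filter() function to compute a 30-day running average so that our ribbon is not too noisy.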

chic$o3run <- as.numeric(stats::filter(chic$o3, rep(1/30, 30), sides = 2))

ggplot(chic, aes(x = date, y = o3run)) +
   geom_line(color = "chocolate", lwd = 0.8) +
   labs(x = "Year", y = "Ozone (30-day running mean)")
image
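
To see what stats::filter() does with equal weights, here is a small worked example with a 3-point window on a hypothetical vector. With sides = 2 the window is centered, so positions without a complete window become NA:

```r
x <- c(1, 2, 3, 4, 5)

# Centered 3-point moving average: each weight is 1/3
running <- as.numeric(stats::filter(x, rep(1 / 3, 3), sides = 2))
running

# The first and last positions have no complete window and become NA
na_positions <- which(is.na(running))
na_positions
```

The same happens at both ends of the 30-day average above, which is why later computations on o3run need na.rm = TRUE.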

How does it look if we fill in the area below the curve using geom_ribbon()?

ggplot(chic, aes(x = date, y = o3run)) +
   geom_ribbon(aes(ymin = 0, ymax = o3run), fill = "orange", 
               color = "orange", alpha = 0.4) +
   geom_line(color = "chocolate", lwd = 0.8) +
   labs(x = "Year", y = "Ozone (30-day running mean)")
image.png

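This nicely indicates the area under the curve (AUC), but it is not the conventional way to use geom_ribbon(). Instead, we draw a ribbon that spans one standard deviation above and below our data: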

chic$mino3 <- chic$o3run - sd(chic$o3run, na.rm = T)
chic$maxo3 <- chic$o3run + sd(chic$o3run, na.rm = T)

ggplot(chic, aes(x = date, y = o3run)) +
   geom_ribbon(aes(ymin = mino3, ymax = maxo3), alpha = 0.5, 
               fill = "darkseagreen3", color = "transparent") +
   geom_line(color = "aquamarine4", lwd = 0.7) +
   labs(x = "Year", y = "Ozone (30-day running mean)")
image.png

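Working with Smoothings

Adding a smoothing to your data with ggplot2 is extremely easy.

Default: Adding a LOESS or GAM Smoothing

You can simply use stat_smooth(); it does not even require a formula. This adds a LOESS (locally weighted scatterplot smoothing, method = "loess") if you have fewer than 1000 points and a GAM (generalized additive model, method = "gam") otherwise. Since we have more than 1000 points, the smoothing is based on a GAM.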

ggplot(chic, aes(x = date, y = temp)) + 
  geom_point(color = "gray40", alpha = 0.5)+
  labs(x = "Year", y = "Temperature (°F)") +
  stat_smooth()
image.png

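Specifying the Formula for Smoothing

ggplot2 lets you specify the model you want it to use. For example, you can increase the basis dimension k of the GAM (which adds additional wiggles to the smooth):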

ggplot(chic, aes(x = date, y = temp)) + 
   geom_point(color = "gray40", alpha = 0.3) +
   labs(x = "Year", y = "Temperature (°F)") +
   stat_smooth(method = "gam", formula = y ~ s(x, k = 1000), 
               se = F, size = 1.3, aes(col = "1000")) +
   stat_smooth(method = "gam", formula = y ~ s(x, k = 100), 
               se = F, size = 1, aes(col = "100")) +
   stat_smooth(method = "gam", formula = y ~ s(x, k = 10), 
               se = F, size = 0.8, aes(col = "10")) +
   scale_color_manual(name = "k", values = c("darkorange2", 
                                             "firebrick", 
                                             "dodgerblue3"))
image.png

Adding a Linear Fit

Though the default is a LOESS or GAM smoothing, it is also easy to add a standard linear fit:

ggplot(chic, aes(x = temp, y = death)) +
   geom_point(color = "gray40", alpha = 0.5) +
   labs(x = "Temperature (°F)", y = "Deaths") +
   stat_smooth(method = "lm", col = "firebrick", se = F, size = 1.3)
image

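Working with Interactive Plots

Shiny

Shiny is a package from RStudio that makes it very easy to build interactive web applications with R. For an introduction and live examples, visit the Shiny homepage.

To get a feel for the potential uses, have a look at the Hello Shiny examples. This is the first one: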
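
The embedded app is not reproduced in this copy. As a rough idea of what such an app involves, here is a minimal sketch loosely modeled on the first Hello Shiny example (a histogram of the built-in faithful data with a slider for the number of bins). The layout and labels are simplified assumptions, not the official example code:

```r
library(shiny)

# UI: one slider and one plot
ui <- fluidPage(
  titlePanel("Hello Shiny!"),
  sliderInput("bins", "Number of bins:", min = 1, max = 50, value = 30),
  plotOutput("distPlot")
)

# Server: redraw the histogram whenever the slider changes
server <- function(input, output) {
  output$distPlot <- renderPlot({
    x <- faithful$waiting
    breaks <- seq(min(x), max(x), length.out = input$bins + 1)
    hist(x, breaks = breaks, col = "gray70", border = "white",
         xlab = "Waiting time to next eruption (min)", main = "")
  })
}

app <- shinyApp(ui = ui, server = server)
# runApp(app)  # uncomment to start the app in an interactive session
```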

Plot.ly

Plot.ly lets you easily create online interactive graphics from your ggplot2 plots. The process is surprisingly easy and can be done from within R.
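
A minimal sketch of the conversion, assuming the plotly R package is installed; the built-in mtcars data is used here only so the example runs on its own:

```r
library(ggplot2)
library(plotly)

# An ordinary static ggplot2 plot
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# ggplotly() converts it into an interactive HTML widget
ip <- ggplotly(p)
# ip  # print in an interactive session or R Markdown document to view it
```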

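Remarks, Tips & Tricks

Using ggplot2 in Loops and Functions

The grid-based graphics functions in lattice and ggplot2 create a graph object. When you use these functions interactively at the command line, the result is printed automatically, but inside source() or your own functions you need an explicit print() statement, i.e. print(g) in our examples. See also the Q&A page of R.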
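
A minimal sketch of the loop case, using the built-in mtcars data so that it runs on its own; the variable names are placeholders:

```r
library(ggplot2)

plots <- list()
for (v in c("mpg", "hp")) {
  # .data[[v]] looks up the column whose name is stored in v
  p <- ggplot(mtcars, aes(x = wt, y = .data[[v]])) +
    geom_point() +
    labs(x = "wt", y = v)
  print(p)         # without print(), nothing is drawn inside the loop
  plots[[v]] <- p  # keep the object for later use, e.g. with ggsave()
}
```

Storing the plots in a list as well as printing them makes it easy to save or rearrange them after the loop has finished.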

Additional Resources
