[28] R for Data Science: Workflow: Projects

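This post is excerpted from R for Data Science. It mainly covers how to set R's working directory in RStudio, which lets you designate a working folder and read files quickly. It also covers saving a PDF with ggsave(). The original text follows: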

One day you will need to quit R, go do something else, and return to your analysis the next day. One day you will be working on multiple analyses simultaneously that all use R and you want to keep them separate. One day you will need to bring data from the outside world into R and send numerical results and figures from R back out into the world. To handle these real-life situations, you need to make two decisions:

  1. What about your analysis is “real,” i.e., what will you save as your lasting record of what happened?
  2. Where does your analysis “live”?

What Is Real?

As a beginning R user, it’s OK to consider your environment (i.e., the objects listed in the environment pane) “real.” However, in the long run, you’ll be much better off if you consider your R scripts as “real.”
With your R scripts (and your data files), you can re-create the environment.
It’s much harder to re-create your R scripts from your environment! You’ll either have to retype a lot of code from memory (making mistakes all the way) or you’ll have to carefully mine your R history.
To foster this behavior, I highly recommend that you instruct RStudio not to preserve your workspace between sessions:


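[Screenshot: RStudio global options, with "Restore .RData into workspace at startup" unchecked and "Save workspace to .RData on exit" set to "Never"]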

This will cause you some short-term pain, because now when you restart RStudio it will not remember the results of the code that you ran last time. But this short-term pain will save you long-term agony because it forces you to capture all important interactions in your code. There’s nothing worse than discovering three months after the fact that you’ve only stored the results of an important calculation in your workspace, not the calculation itself in your code.
There is a great pair of keyboard shortcuts that will work together to make sure you’ve captured the important parts of your code in the editor:
• Press Cmd/Ctrl-Shift-F10 to restart R (the R session running inside RStudio).
• Press Cmd/Ctrl-Shift-S to rerun the current script.
I use this pattern hundreds of times a week.
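
The restart-and-rerun check can be approximated inside a single session. The following is a minimal sketch, not what RStudio does internally: it writes a small hypothetical script to a temporary file, sources it into an empty environment, and confirms that the script alone re-creates the objects. The script contents and the object names x and total are made up for illustration.

```r
# Hypothetical two-line analysis script, written to a temporary file
script <- tempfile(fileext = ".R")
writeLines(c(
  "x <- c(1, 2, 3)",
  "total <- sum(x)"
), script)

# An empty environment stands in for a freshly restarted R session
fresh <- new.env()
source(script, local = fresh)

# Everything the script defines is back, rebuilt from code alone
recreated <- sort(ls(fresh))
print(recreated)
print(fresh$total)
```

If an object you rely on is missing after a run like this, it existed only in the workspace and its code still needs to be captured in the script.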

Where Does Your Analysis Live?

R has a powerful notion of the working directory. This is where R looks for files that you ask it to load, and where it will put any files that you ask it to save. RStudio shows your current working directory at the top of the console:


[Screenshot: the RStudio console, with the current working directory shown at the top]

And you can print this out in R code by running getwd():

getwd()
#> [1] "/Users/hadley/Documents/r4ds/r4ds"

As a beginning R user, it’s OK to let your home directory, documents directory, or any other weird directory on your computer be R’s working directory. But you’re six chapters into this book, and you’re no longer a rank beginner. Very soon now you should evolve to organizing your analytical projects into directories and, when working on a project, setting R’s working directory to the associated directory.
I do not recommend it, but you can also set the working directory from within R:

setwd("/path/to/my/CoolProject")

But you should never do this because there’s a better way; a way that also puts you on the path to managing your R work like an expert.
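
As a small illustration of how the working directory anchors relative paths, the sketch below writes a file using a relative path and then finds the same file through the equivalent absolute path built from getwd(). The file name wd-demo.txt is hypothetical, and the sketch assumes the current working directory is writable.

```r
# Hypothetical file name, used only for this demonstration
rel_path <- "wd-demo.txt"

# Write using a relative path: R resolves it against getwd()
writeLines("hello", rel_path)

# The same file is reachable through the equivalent absolute path
abs_path <- file.path(getwd(), rel_path)
found <- file.exists(abs_path)
print(found)

# Clean up the demonstration file
unlink(rel_path)
```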

Paths and Directories

Paths and directories are a little complicated because there are two basic styles of paths: Mac/Linux and Windows. There are three chief ways in which they differ:
• The most important difference is how you separate the components of the path. Mac and Linux use slashes (e.g., plots/diamonds.pdf) and Windows uses backslashes (e.g., plots\diamonds.pdf). R can work with either type (no matter what platform you’re currently using), but unfortunately, backslashes mean something special to R, and to get a single backslash in the path, you need to type two backslashes! That makes life frustrating, so I recommend always using the Linux/Mac style with forward slashes.
• Absolute paths (i.e., paths that point to the same place regardless of your working directory) look different. In Windows they start with a drive letter (e.g., C:) or two backslashes (e.g., \\servername) and in Mac/Linux they start with a slash “/” (e.g., /users/hadley). You should never use absolute paths in your scripts, because they hinder sharing: no one else will have exactly the same directory configuration as you.
• The last minor difference is the place that ~ points to. ~ is a convenient shortcut to your home directory. Windows doesn’t really have the notion of a home directory, so it instead points to your documents directory.
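
The three differences can be tried out in a few lines of base R. This is a sketch for illustration only: is_absolute() is a hypothetical helper, not a base R function, and it is a rough check that covers just the forms named above (leading slash, drive letter, two leading backslashes).

```r
# Windows style: two backslashes typed, one backslash stored
win_style <- "plots\\diamonds.pdf"
# Mac/Linux style: forward slashes need no escaping
unix_style <- "plots/diamonds.pdf"

# Both strings have the same length: each separator is a single character
print(nchar(win_style) == nchar(unix_style))

# file.path() builds paths with forward slashes on every platform
built <- file.path("plots", "diamonds.pdf")
print(identical(built, unix_style))

# Converting a Windows-style path to the recommended style
converted <- gsub("\\", "/", win_style, fixed = TRUE)
print(converted)

# Hypothetical helper: a rough check for absolute paths
is_absolute <- function(path) {
  grepl("^(/|[A-Za-z]:|\\\\\\\\)", path)
}
print(is_absolute(c("/users/hadley", "C:/data", "plots/diamonds.pdf")))

# ~ expands to the home directory (the documents directory on Windows)
print(path.expand("~"))
```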

RStudio Projects

R experts keep all the files associated with a project together—input data, R scripts, analytical results, figures. This is such a wise and common practice that RStudio has built-in support for this via projects.
Let’s make a project for you to use while you’re working through the rest of this book. Click File → New Project, then:


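[Screenshots: the three steps of the New Project wizard: choose New Directory, choose the project type, then enter the directory name and its parent subdirectory]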

Call your project r4ds and think carefully about which subdirectory you put the project in. If you don’t store it somewhere sensible, it will be hard to find it in the future!
Once this process is complete, you’ll get a new RStudio project just for this book. Check that the “home” directory of your project is the current working directory:

getwd()
#> [1] "/Users/hadley/Documents/r4ds/r4ds"

Whenever you refer to a file with a relative path it will look for it here.
Now enter the following commands in the script editor, and save the file, calling it diamonds.R. Next, run the complete script, which will save a PDF and a CSV file into your project directory. Don’t worry about the details; you’ll learn them later in the book:

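library(tidyverse)

# Hexagonal heatmap of price against carat (geom_hex() needs the hexbin package)
ggplot(diamonds, aes(carat, price)) +
  geom_hex()
# Save the most recent plot as a PDF in the working directory
ggsave("diamonds.pdf")
# Write the data to a CSV file alongside it
write_csv(diamonds, "diamonds.csv")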

Quit RStudio. Inspect the folder associated with your project and notice the .Rproj file. Double-click that file to reopen the project. Notice you get back to where you left off: it’s the same working directory and command history, and all the files you were working on are still open. Because you followed my instructions above, you will, however, have a completely fresh environment, guaranteeing that you’re starting with a clean slate.
In your favorite OS-specific way, search your computer for diamonds.pdf and you will find the PDF (no surprise) but also the script that created it (diamonds.R). This is a huge win! One day you will want to remake a figure or just understand where it came from. If you rigorously save figures to files with R code and never with the mouse or the clipboard, you will be able to reproduce old work with ease!
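
A slightly more explicit variant of the script, offered as a sketch: it stores the plot in a variable and passes it to ggsave() with a fixed size, so the saved figure does not depend on which plot was drawn last or on the size of the current graphics device. Assumptions: the figures subdirectory, the file name, and the dimensions are hypothetical, and geom_point() is swapped in for geom_hex() only to avoid needing the hexbin package.

```r
library(ggplot2)

# Hypothetical output folder inside the project, referenced by a relative path
dir.create("figures", showWarnings = FALSE)

# Keep the plot in a variable instead of relying on the last plot drawn
p <- ggplot(diamonds, aes(carat, price)) +
  geom_point(alpha = 0.1)

# Fixed width and height (in inches) keep the PDF the same size on every run
out_file <- file.path("figures", "diamonds-scatter.pdf")
ggsave(out_file, plot = p, width = 6, height = 4)
```

Because the output path is relative, the PDF lands inside the project directory no matter where the project folder sits on disk.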

Summary

In summary, RStudio projects give you a solid workflow that will serve you well in the future:
• Create an RStudio project for each data analysis project.
• Keep data files there; we’ll talk about loading them into R in Chapter 8.
• Keep scripts there; edit them, and run them in bits or as a whole.
• Save your outputs (plots and cleaned data) there.
• Only ever use relative paths, not absolute paths.
Everything you need is in one place, and cleanly separated from all the other projects that you are working on.
