packageMaintaining.Rmd
The basic workflow for creating vignettes is the following:
Initialize vignette with usethis::use_vignette(name = "vignette_name",title = "Vignette Title")
in the Console.
Edit the contents of the vignette. Preview the vignette by clicking Knit
.
Once you’re happy with the vignette’s contents, build the package website with pkgdown::build_site()
.
Push the local changes to GitHub.
The rest of this section will detail vignette-writing within the dspgWork R package.
Vignettes are helpful to demonstrate or explain R code. Writing vignettes in .Rmd
files allow the author to combine prose (like what you’re reading right now) with executable R code. An author can include additional exposition to help others understand the purpose of a piece of code.
For example: “the code below will install (if you haven’t already) three important R packages for R package development: usethis
, devtools
, and pkgdown
.” Note that pkgdown
is smart enough to automatically link in-line code to the appropriate online documentation page: for example, clicking this -> dplyr::mutate
will take you to the the mutate function’s doc page.
# install.packages("usethis")
library(usethis)
# install.packages("devtools")
library(devtools)
# install.packages("pkgdown")
library(pkgdown)
The usethis
package includes a function usethis::use_vignette
that will automatically initialize a vignette folder (if one doesn’t already exist) and an .Rmd
file. Run the following code in your Console to initialize an example_vignette.Rmd
file. NOTE: make sure that your working directory is set to the package’s root directory (containing the R
, data_raw
, data_clean
, and other folders). Use the setwd
function to change your working directory if needed.
usethis::use_vignette(name = "example_vignette",title = "Example Vignette")
Writing vignettes is somewhat of an art. The following section shows an example of a vignette, but you are highly encouraged to read Hadley Wickhams’ take on vignettes in his R Packages book available here.
Suppose we want to use this vignette to explain a particular plot. As a boring example, we will consider the mtcars
data set available in RStudio. We not only want to show the code used to create the actual plot, but also any code used beforehand to “clean” the data. You can use text to describe any choices you made while cleaning data. For example, if you decide to aggregate or filter data beforehand, you can use the text to explain your motivation.
It is often vignette writing best practice to load any packages you use in the beginning of the document.
library(tidyverse)
#> -- Attaching packages --------------------------------------- tidyverse 1.3.0 --
#> v ggplot2 3.3.3 v purrr 0.3.4
#> v tibble 3.1.2 v dplyr 1.0.6
#> v tidyr 1.1.2 v stringr 1.4.0
#> v readr 1.4.0 v forcats 0.5.1
#> -- Conflicts ------------------------------------------ tidyverse_conflicts() --
#> x dplyr::filter() masks stats::filter()
#> x dplyr::lag() masks stats::lag()
library(ggrepel)
We want to plot the relationship between miles/gallon and # of cylinders for all Mercedes cars. In the mtcars
data set, the Make/Model of the car is stored as row names (as opposed to in its own column). We will add two columns containing the Make and Model information, respectively, instead. The tidyr::separate
function allows use to split a column containing strings by a given separator (in this case, an empty space " ").
pltData <- mtcars %>%
mutate(make_model = row.names(mtcars)) %>%
tidyr::separate(col = make_model,into = c("make","model"),sep = " ",remove = TRUE) %>%
filter(make == "Merc")
#> Warning: Expected 2 pieces. Additional pieces discarded in 3 rows [2, 4, 29].
#> Warning: Expected 2 pieces. Missing pieces filled with `NA` in 1 rows [6].
pltData
#> mpg cyl disp hp drat wt qsec vs am gear carb make model
#> Merc 240D 24.4 4 146.7 62 3.69 3.19 20.0 1 0 4 2 Merc 240D
#> Merc 230 22.8 4 140.8 95 3.92 3.15 22.9 1 0 4 2 Merc 230
#> Merc 280 19.2 6 167.6 123 3.92 3.44 18.3 1 0 4 4 Merc 280
#> Merc 280C 17.8 6 167.6 123 3.92 3.44 18.9 1 0 4 4 Merc 280C
#> Merc 450SE 16.4 8 275.8 180 3.07 4.07 17.4 0 0 3 3 Merc 450SE
#> Merc 450SL 17.3 8 275.8 180 3.07 3.73 17.6 0 0 3 3 Merc 450SL
#> Merc 450SLC 15.2 8 275.8 180 3.07 3.78 18.0 0 0 3 3 Merc 450SLC
Now that the data have been cleaned to our specification, we can create the desired scatter plot.
pltData %>%
ggplot(aes(x = cyl,y = mpg)) +
geom_point() +
ggrepel::geom_text_repel(aes(label = model)) +
theme_bw() +
labs(x = "# of cylinders",
y = "Miles per gallon",
title = "MPG vs. # of cylinders",
subtitle = "Mercedes models")
To preview the vignette, you can click the Knit
button in the toolbar above the .Rmd editing window. The result will be an .html
document containing text, R code, and the R code’s output.
We’re happy with the contents of the above vignette and want to display the vignette on the package website. The pkgdown
package contains various tools to create a package website. Run the following in your Console (assuming your working directory is still the project’s root directory) to build the package website. A local version of the website should automatically open in your default browser once it is done building. You will then need to push this updated local version to GitHub.
pkgdown::build_site()
The rest of the vignette is for individuals who are tasked with maintaining the package. It discusses into what goes on “behind-the-scenes” to generate the pkgdown site and other package idiosyncrasies.
The method by which data are documented in the dspgWork package is different than “normal” R packages. Typically, for any package that contains loaded data (e.g., for function demonstrations), there is a single folder called data
that houses the .rda
or .RData
files. Associated with this folder is normally a script called data.R
. This is the file structure oftentimes assumed by R packages that aid in the development of other R packages.
To maintain a separation between “raw” and “processed” data, we have split the saved data into data_raw
and data_clean
folders. This unorthodox file structure requires a few tricks to be pulled to, for example, build data documentation files from both data_raw
and data_clean
and have those files display nicely on the pkgdown site. For instance, it is assumed that the each data file name will begin with dataRaw_
or dataClean_
(this is so that the pkgdown site builder organizes the data sets correctly - see below).