Managing R Packages with roxyPackage

Getting started

Apart from installing roxyPackage and having written some R code, no further requirements for building your first R package need to be fulfilled. I tried to keep the dependencies for roxyPackage to a minimum, and basically they come down to roxygen21 (used to generate the documentation) and XiMpLe2 (used to generate the HTML and RSS files of the repository).

To demonstrate this, we will start with an extremely simple one: Let’s assume you would like to have an R package called “examplePackage” and have just written a first function for it, called “simpleExample”, which does nothing but return NULL:

simpleExample <- function(){NULL}

To transform this code into a package, save it as a file called "examplePackage/R/simpleExample.R", that is, create a directory with the package name and in it another one named "R". If you were to create the package manually,3 you would now at least write a DESCRIPTION file with meta information like the license, version number and release date, a NAMESPSACE file to export the function so it can acually be seen and used after the package was loaded, and a *-package.Rd file, which has the meta information in Rd format so it’s accessible via the R help system. In a minute, all of this will be done automatically by roxyPackage.

Before this, we need to add at least one roxygen2 command to the function code, which will make sure that the function is exported and user accessible:

#' @export
simpleExample <- function(){NULL}

This must be done for every function, class etc. you would like to export. But that’s about it – we can now call roxy.package() to create a valid R package. You only have to tell the function a few things:

  • where is the source directory of the package (e.g., "~/myRcode/examplePackage")?
  • where is the R library the package can be installed to (e.g., "~/R")?
  • in which directory should the repository be created (e.g., "~/myRcode/repo")?
  • what is the package version number, description, author, its dependencies etc.?

This is what such a call could look like:

roxy.package(
  pck.source.dir="~/myRcode/examplePackage",
  pck.version="0.01-1",
  R.libs="~/R",
  repo.root="~/myRcode/repo",
  pck.description=package_description(
    Package="examplePackage",
    Type="Package",
    Title="An R Example Package",
    AuthorsR="c(person(given=\"Ernst\", family=\"Dölle\",
      email=\"[email protected]\",
      role=c(\"aut\", \"cre\")))",
    Depends="R (>= 3.0.0)",
    Description="Provides a great function to produce NULL results.",
    License="GPL (>= 3)",
    Encoding="UTF-8",
    LazyLoad="yes",
    URL="https://example.com/doelleInnovations"
  )
)

We’re using a function called package_description() to set pck.description. It returns a simple data.frame which is used for the DESCRIPTION file. You can alternative use any ordinary data.frame as long as it only includes valid DESCRIPTION fields4 except “Version:” and “Date:”, as these have their own parameters (pck.version and pck.date, which defaults to Sys.Date()), and Authors@R was simplified to AuthorsR.

At a first glance this maybe look like a lot. But as you can see, except for of course the version number and release date, most of this information will probably remain the same over time, so you can write all of this in a script and then run the script to generate/update your package.

If you run the above call to roxy.package(), the R console will display verbose status information:

R environment
  R.home: /usr/lib/R 
  R.libs: ~/R
repo: created ~/myRcode/repo/pckg/examplePackage.
Updating collate directive in /home/user/myRcode/examplePackage/DESCRIPTION
Updating namespace directives
Writing examplePackage-package.Rd
repo: created ~/myRcode/repo/src/contrib.

repo: examplePackage_0.01-1.tar.gz copied to src/contrib.
* installing *source* package ‘examplePackage’ ...
** R
** preparing package for lazy loading
** help
*** installing help indices
** building package indices ...
** testing if installed package can be loaded

* DONE (examplePackage)
build: package built and installed.
Processing packages:
 /home/user/myRcode/repo/src/contrib/examplePackage_0.01-1.tar.gz
done
repo: src/contrib/PACKAGES (unix) updated.

Now, if you inspect the project directory, you will see that the new files DESCRIPTION, NAMESPACE, R/examplePackage-package.R and man/examplePackage-package.Rd have been added, and you will also find a directory structure in the specified repository location.

Why the local repository?

If you build a package of your R code you probably want to share your work with others. Of course you can upload your package to CRAN. But there might be reasons to have your own repository, for example, if your release pace is considerably higher than the volunteers running CRAN are able to cope with.

Since roxyPackage needed to save its packages somewhere anyway, i chose to organize this place like a proper R package repository. So effectively, you can simply copy the generated repository directory to your webserver, and you’re done. Packages in that repository can then easily be installed from an R session:

install.packages("examplePackage", repos="https://myRepoURL.example.com")

If you’re running a Debian GNU/Linux based system, roxyPackage can also “debianize”5 your package, build Debian source and binary packages, and add an additional Debian package repository hosting your R packages (see "deb" action below).

More actions

The tasks performed at each call of roxy.package() are called “actions”. The default is to run the actions "roxy" and "package", i.e., roxygenize the code documentation and build/install the source package.

There are a number of additional actions, and most of them run out of the box without the need to provide more information:

  • "cite": Use the data from pck.description to write a proper CITATION file.
  • "license": Try to detect the software license of this package from the “License” info in pck.description, and add a copy as the LICENSE file; will let you know if it doesn’t know the license you specified.
  • "doc": Update pdf documentation, including vignettes if found in inst/doc.
  • "buildVignettes": If you’ve added vignettes in RMarkdown format and want the vignette index to be updated, use this action. Have a look at rnw2rmd() if you want to translate older SWeave vignettes into RMarkdown.
  • "cleanRd": If some lines in the generated *.Rd files torn out to be too long, this action reformats the file properly.
  • "readme": Generate initial README.md file (optionally including a GitHub URL).
  • "log": Generate initial ChangeLog or update a present ChangeLog file.
  • "cl2news": Look for a ChangeLog file in the package root directory, and translate it into an inst/NEWS.Rd file.
  • "news2rss": Look for an inst/NEWS.Rd file and translate it into an RSS feed for the repository; you need to provide a value for the URL parameter for this to work, where the URL should point to the root location where your repository is available.
  • "html": Generate updated HTML index files6 throughout the repository, very similar to the HTML pages CRAN provides for its packages; will also include links to the RSS feed and provide HTML versions of the citation info and news/changelog.
  • "win": Build a Windows binary package (only works for pure R packages).
  • "macosx": Build a macOS binary package (only works for pure R packages).
  • "deb": Build Debian source and/or binary packages (only works on Debian based operating systems); this will also set up a Debian package repository inside the defined repository directory and generate the documentation7 on how to use it, and also needs the URL parameter to work, as well as probably additional configuration via the deb.options parameter.
  • "check": Run R CMD check on the package.
  • "binonly": (Re-)build & install the package, binaries only. Useful if you need to repackage your Windows, macOS or Debian packages but don’t want to touch the sources.

So, let’s re-run roxy.package(), to get a copy of the GPL into our sources, HTML index files for the repository and information on how to cite the package properly:

roxy.package(
  pck.source.dir="~/myRcode/examplePackage",
  pck.version="0.01-1",
  R.libs="~/R",
  repo.root="~/myRcode/repo",
  pck.description=package_description(
    Package="examplePackage",
    Type="Package",
    Title="An R Example Package",
    AuthorsR="c(person(given=\"Ernst\", family=\"Dölle\",
      email=\"[email protected]\",
      role=c(\"aut\", \"cre\")))",
    Depends="R (>= 3.0.0)",
    Description="Provides a great function to produce NULL results.",
    License="GPL (>= 3)",
    Encoding="UTF-8",
    LazyLoad="yes",
    URL="https://example.com/doelleInnovations"
  ),
  actions=c(
    "roxy",
    "cite",
    "html",
    "license",
    "package"
  ))

This is what we’re told:

R environment
  R.home: /usr/lib/R 
  R.libs: ~/R
license: saved a copy of the GNU General Public License (GPL) as LICENSE
Updating collate directive in /home/user/myRcode/examplePackage/DESCRIPTION
cite: updated ~/myRcode/examplePackage/inst/CITATION.
cite: updated ~/myRcode/repo/pckg/examplePackage/citation.html.

repo: examplePackage_0.01-1.tar.gz copied to src/contrib.
* installing *source* package ‘examplePackage’ ...
** R
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices ...
** testing if installed package can be loaded

* DONE (examplePackage)
build: package built and installed.
Processing packages:
 /home/user/myRcode/repo/src/contrib/examplePackage_0.01-1.tar.gz
done
repo: src/contrib/PACKAGES (unix) updated.
html: created CSS file ~/myRcode/repo/pckg/web.css
html: copied RSS image to ~/myRcode/repo/pckg/feed-icon-14x14.png
html: updated ~/myRcode/repo/pckg/examplePackage/index.html
html: updated pckg index ~/myRcode/repo/pckg/index.html
html: updated global index ~/myRcode/repo/index.html

Handling ChangeLogs

roxyPackage will try its best to be of help when it comes to handling the ChangeLog of your package. The "log" action can generate an initial log file in a valid format that the other actions can parse (see ChangeLog format). It can also update log entries and add new ones (see Maintaining a ChangeLog).

ChangeLog format

It’s a good idea to document the changes you make to your packages from one version to another, i.e., keeping a ChangeLog. R packages can have this information in Rd markup format as well, usually in NEWS.Rd. roxyPackage can translate ChangeLogs to NEWS.Rds for you, if the structure of your changelog is understood. A working format is to start each new entry with “Changes in version <version number>”, optionally followed by a date string in the format “(YYYY-MM-DD)”. After that, each log item needs to be indented with spaces and itemized, e.g., by a dash or asterisk. You can categorize log items by writing the category name with a colon at the beginning of a line. This is what the inital changelog of our example package could look like:

ChangeLog for package examplePackage

Changes in version 0.01-1 (2012-04-22)
added:
  - LICENSE and CITATION files
changed:
  - initial release

You might also look at the function initChangeLog() to get started with a valid file.

Maintaining a ChangeLog

It’s OK to maintain your ChangeLog manually with an editor of your choice, as long as you stick to a valid format. To make this even easier, roxyPackage has some dedicated functions and methods to deal with ChangeLogs 8. Typically, you want to add entries to the package version you are currently working on, to document what has been changed, added or fixed. You can do this directly within a roxy.package() call, using the "log" action in combination with a list of items given to the ChangeLog argument:

roxy.package(
  pck.source.dir="~/myRcode/examplePackage",
  pck.version="0.01-1",
  R.libs="~/R",
  repo.root="~/myRcode/repo",
  pck.description=package_description(
    Package="examplePackage",
    Type="Package",
    Title="An R Example Package",
    AuthorsR="c(person(given=\"Ernst\", family=\"Dölle\",
      email=\"[email protected]\",
      role=c(\"aut\", \"cre\")))",
    Depends="R (>= 3.0.0)",
    Description="Provides a great function to produce NULL results.",
    License="GPL (>= 3)",
    Encoding="UTF-8",
    LazyLoad="yes",
    URL="https://example.com/doelleInnovations"
  ),
  actions=c(
    "roxy",
    "cite",
    "html",
    "license",
    "log",
    "package"
  ),
  ChangeLog=list(
    added=c("new extra NULL feature", "new oblivion matrix"),
    fixed=c("finally resolved the reandom results bug")
  )
)

This would generate a new log entry for version 0.01-1 if none is present, otherwise the entries in the named ChangeLog list would be appended to the existing entry and its date stamp updated. In the list, the element names are treated as the names of log sections, and each character string as a log entry item of that section. Duplicate items will be ignored automatically, so you can simply append new ones to the list as you see fit. However, it wouldn’t hurt to check the log from time to time, to convince yourself it’s still in shape.

Generating RSS feeds

RSS technology allows you to spread the news on package updates instantly, after interested users subscribed to the news feed. All you need to define additionally is the URL parameter with the web address of your repository. After that, you can combine the actions "cl2news", "news2rss" and "html" to generate an up-to-date RSS feed9 from your ChangeLog file.

Archiving packages

At some point, you might want to move old packages to an archive location. This can also be done automatically with this package, by calling archive.packages(). If you do that and only include your local repository location as an argument, this function will do nothing but print what it would actually do if you want it to. You can then tweak its parameters to your liking, and if you would like to go for it, set reallyDoIt=TRUE.

No more excuses: Generating template files

Imagine you just wrote a new function or two and would like to share your work with others. But when you try to remember how to write sufficient roxygen2 documentation, your motivation disappears. Well, roxyPackage comes to the rescue: It can write template files for you, prefilled with some usefull tags and hints which should leave you with no more excuses. Depending on the arguments given to templateFile(), you can see the script code on screen or write it directly to a file:

# you can re-use the package description here, but only a
# subset of it's data is needed for templates
pckg.dscrptn <- data.frame(
  Package="examplePackage",
  Author="Ernst A. Dölle <[email protected]>",
  License="GPL (>= 3)",
  stringsAsFactors=FALSE
)
cat(
  templateFile(
    name="exampleFunction",
    pck.description=pckg.dscrptn
  )
)
# Copyright (C) 2017 Ernst A. Dölle <[email protected]>
#
# This file is part of the R package examplePackage.
#
# examplePackage is free software: You can redistribute it and/or modify
# it under the terms of the GNU General Public License (GPL) as published
# by the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# examplePackage is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# licence terms for more details.
#
# You should have received a copy of the license with the source
# package as the file COPYING or LICENSE.  If not, see
# <http://www.gnu.org/licenses/>.

#' Function exampleFunction
#' 
#' \code{\link[:]{}}
#' 
#' @section <section>: 
#'   \describe{
#'     \item{\code{}:} {}
#'   }
#' 
#'   itemize{
#'     \item{\code{}} {:}
#'   }
#' 
#' @note 
#' 
#' @param obj An object of class someClass.
#' @param ... Additional arguments passed through to \code{\link[somepackage]{somefunction}}.
#' @return An object of class \code{\link[aPackage]{aFunction-class}}.
#' @references <ref>
#' @seealso \code{\link[aPackage]{aFunction}}.
#' @keywords functions
#' @rdname exampleFunction
#' @export
#' @examples
#' \dontrun{
#' exampleFunction()
#' }

exampleFunction <- function(
  obj=someClass(),
  ...
){
}

In addition tu documenting R functions, templateFile() can also produce useful templates for S4 classes and methods.