Contributing to ‘box’

Installation

Compared to many other R packages, ‘box’ uses a somewhat different development workflow:

While ‘box’ uses ‘devtools’ and ‘roxygen2’ to generate documentation and package infrastructure, generated code and data is not checked into version control! This means in particular that the project does not contain a NAMESPACE file, since that is auto-(re)generated by tools. Therefore, attempting to install the current development version of ‘box’ via pak::pak('klmr/box') or equivalent means will fail!

Instead, an up-to-date, automatically generated build of the development version of ‘box’ should be installed from R-Universe:

install.packages('box', repos = 'https://klmr.r-universe.dev')

… or from the corresponding build branch:

pak::pak('klmr/box@build')

Alternatively it can be built manually using the instructions below.

Development

Building

The project contains a Makefile written in the GNU Make dialect that contains various development utilities. Invoking make without target will show a list of available targets with short descriptions. Using this Makefile isn’t necessary, but it helps. In particular:

  • make documentation builds the NAMESPACE file, the shared C library and the package documentation.
  • make test runs the unit tests.
  • make check runs checks, and should run cleanly before submitting a pull request: of note, make check performs additional checks that are not performed by either R CMD check or rcmdcheck::rcmdcheck() (these checks can be found as individual scripts under scripts/; they roughly correspond to some undocumented checks performed internally on CRAN).

Branches

All new code should be developed on a new branch with a name prefixed fix/ (for bug fixes), feature/ (for new features/enhancements), and chore/ (for any other contributions: fixed typos, project infrastructure, tests, etc.). Branches for pull requests will be merged into the main branch.

Code style

The code style of ‘box’ is similar to the Tidyverse style guide, but with several notable differences:

  • The file extension for R code files is lowercase .r (not uppercase .R). The file extension for R Markdown files is lowercase .rmd (not uppercase .Rmd).

    Rationale: using capital letters in file extensions is a pointless violation of established convention that creates unnecessary inconsistencies.
  • Use = for assignment, not <-.

    • On the (very rare!) occasions where assignment inside a function call is required, use additional parentheses to make = syntactically an assignment rather than named argument passing.

      # GOOD:
      x = 5
      
      # OK (rarely):
      if (is.null((x = function_call()))) {}
      
      # BAD:
      x <- 5
      if (is.null(x <- function_call())) {}
    • <<- is banned. Instead of name <<- value, use env$name = value, where env is a previously-defined name referring to the desired target environment. This pattern is used in various places in the code, and good, context-dependent names for env are ns, self, or similar. assign() should generally not be used when the name of the assignee is statically known.

      # GOOD:
      self = parent.frame()
      # … later, inside a nested closure:
      self$x = TRUE
      
      # BAD:
      x <<- TRUE
      
      # BAD:
      assign('x', TRUE, envir = caller)
      Rationale: <<- leaves it unclear where assignment will happen. Subset-assignment via $ makes the assignment target scope explicit, which makes the code clearer and less error-prone.
  • Leave a space between function and the following opening parenthesis.

    # GOOD:
    f = function () {}
    
    # BAD:
    f = function() {}
    Rationale: a function declaration in R does not use function call syntax; rather, it’s a special syntactic construct equivalent to if and while, so the same formatting conventions apply.
  • Use single quotes, not double quotes, around strings. — Even when the string contains ', which should be escaped.

    # GOOD:
    'text'
    'text with \'quotes\''
    
    # BAD:
    "text"
    "text with 'quotes'"
    r'(text with 'quotes')'
    Rationale: raw strings cannot be used since ‘box’ supports R versions pre R 4.0.
  • Use four spaces for indentation. Do not add extra spaces to align assignments or named function call arguments.

    # GOOD:
    first = 1
    second = 2
    
    # BAD:
    first  = 1
    second = 2
  • The line length is hard-limited to 120 columns. Most lines should be shorter, but there is no (soft, or otherwise) limit at 80 columns.

  • Explicitly use integer literals where the logical type of the expression is integer.

    # GOOD:
    if (length(x) == 0L) {}
    
    # BAD:
    if (length(x) == 0) {}