From cee8f1a72224db9fef786b3bdc45367fb917397f Mon Sep 17 00:00:00 2001 From: Michael Quinn Date: Wed, 17 Jul 2019 21:28:05 -0700 Subject: [PATCH] Replace external R style guide Internally, we've moved to the Tidyverse Style Guide: style.tidyverse.org The new file for Rguide.md reflects this fact and notes where we differ. --- README.md | 2 +- Rguide.md | 109 +++++++++++++ Rguide.xml | 447 ----------------------------------------------------- 3 files changed, 110 insertions(+), 448 deletions(-) create mode 100644 Rguide.md delete mode 100644 Rguide.xml diff --git a/README.md b/README.md index b8f87fc..581ec2b 100644 --- a/README.md +++ b/README.md @@ -39,7 +39,7 @@ The following Google style guides live outside of this project: [objc]: objcguide.md [java]: https://google.github.io/styleguide/javaguide.html [py]: https://google.github.io/styleguide/pyguide.html -[r]: https://google.github.io/styleguide/Rguide.xml +[r]: https://google.github.io/styleguide/Rguide.md [sh]: https://google.github.io/styleguide/shell.xml [htmlcss]: https://google.github.io/styleguide/htmlcssguide.html [js]: https://google.github.io/styleguide/jsguide.html diff --git a/Rguide.md b/Rguide.md new file mode 100644 index 0000000..de27371 --- /dev/null +++ b/Rguide.md @@ -0,0 +1,109 @@ +# Google's R Style Guide + +R is a high-level programming language used primarily for statistical computing +and graphics. The goal of the R Programming Style Guide is to make our R code +easier to read, share, and verify. + +The Google R Style Guide is a fork of the +[Tidyverse Style Guide](https://style.tidyverse.org/) by Hadley Wickham +[license](https://creativecommons.org/licenses/by-sa/2.0/). Google modifications +were developed in collaboration with the internal R user community. The rest of +this document explains Google's primary differences with the Tidyverse guide, +and why these differences exist. + +## Syntax + +### Naming conventions + +Google prefers identifying functions with `BigCamelCase` to clearly distinguish +them from other objects. + +``` +# Good +DoNothing <- function() { + return(invisible(NULL)) +} +``` + +The names of private functions should begin with a dot. This helps communicate +both the origin of the function and its intended use. + +``` +# Good +.DoNothingPrivately <- function() { + return(invisible(NULL)) +} +``` + +We previously recommended naming objects with `dot.case`. We're moving away from +that, as it creates confusion with S3 methods. + +### Don't use attach() + +The possibilities for creating errors when using `attach()` are numerous. + +## Pipes + +### Right-hand assignment + +We do not support using right-hand assignment. + +``` +# Bad +iris %>% + dplyr::summarize(max_petal = max(Petal.Width)) -> results +``` + +This convention differs substantially from practices in other languages and +makes it harder to see in code where an object is defined. E.g. searching for +`foo <-` is easier than searching for `foo <-` and `-> foo` (possibly split over +lines). + +### Use explicit returns + +Do not rely on R's implicit return feature. It is better to be clear about your +intent to `return()` an object. + +``` +# Good +AddValues <- function(x, y) { + return(x + y) +} + +# Bad +AddValues <- function(x, y) { + x + y +} +``` + +### Qualifying namespaces + +Users should explicitly qualify namespaces for all external functions. + +``` +# Good +purrr::map() +``` + +We discourage using the `@import` Roxygen tag to bring in all functions into a +NAMESPACE. Google has a very big R codebase, and importing all functions creates +too much risk for name collisions. + +While there is a small performance penalty for using `::`, it makes it easier to +understand dependencies in your code. There are some exceptions to this rule. + +* Infix functions (`%name%`) always need to be imported. +* Certain `rlang` pronouns, notably `.data`, need to be imported. +* Functions from default R packages, including `datasets`, `utils`, + `grDevices`, `graphics`, `stats` and `methods`. If needed, you can `@import` + the full package. + +When importing functions, place the `@importFrom` tag in the Roxygen header +above the function where the external dependency is used. + +## Documentation + +### Package-level documentation + +All packages should have a package documentation file, in a +`packagename-package.R` file. diff --git a/Rguide.xml b/Rguide.xml deleted file mode 100644 index ef87872..0000000 --- a/Rguide.xml +++ /dev/null @@ -1,447 +0,0 @@ - - - - - - Google's R Style Guide - - - - -

Google's R Style Guide

- -

- R is a high-level programming language used primarily for - statistical computing and graphics. The goal of the R - Programming Style Guide is to make our R code easier to read, - share, and verify. The rules below were designed in - collaboration with the entire R user community at Google. - -

- - - - - - -

Summary: R Style Rules

- -
    -
  1. File Names: end in .R
  2. -
  3. Identifiers: variable.name - (or variableName), - FunctionName, kConstantName
  4. -
  5. Line Length: maximum 80 characters
  6. -
  7. Indentation: two spaces, no tabs
  8. -
  9. Spacing
  10. -
  11. Curly Braces: first on same line, last on - own line
  12. -
  13. else: Surround else with braces
  14. -
  15. Assignment: use <-, not - =
  16. -
  17. Semicolons: don't use them
  18. -
  19. General Layout and Ordering
  20. -
  21. Commenting Guidelines: all comments begin - with # followed by a space; inline comments need two - spaces before the #
  22. -
  23. Function Definitions and Calls
  24. -
  25. Function Documentation
  26. -
  27. Example Function
  28. -
  29. TODO Style: TODO(username)
  30. -
- -

Summary: R Language Rules

-
    -
  1. attach: avoid using it
  2. -
  3. Functions: - errors should be raised using stop()
  4. -
  5. Objects and Methods: avoid S4 objects and - methods when possible; never mix S3 and S4
  6. -
- -

Notation and Naming

- -

File Names

-

- File names should end in .R and, of course, be - meaningful. -
GOOD: predict_ad_revenue.R -
BAD: foo.R -

- -

Identifiers

-

- Don't use underscores ( _ ) or hyphens - ( - ) in identifiers. - Identifiers should be named according to the following conventions. - The preferred form for variable names is all lower case - letters and words separated with dots - (variable.name), but variableName - is also accepted; - function names have initial capital letters and no dots - (FunctionName); - constants are named like functions but with an initial - k. -

- - - -

Syntax

- -

Line Length

-

- The maximum line length is 80 characters. -

- -

Indentation

-

- When indenting your code, use two spaces. Never use tabs or mix - tabs and spaces. -
Exception: When a line break occurs inside parentheses, - align the wrapped line with the first character inside the - parenthesis. -

- -

Spacing

-

- Place spaces around all binary operators (=, - +, -, <-, etc.). -
Exception: Spaces around ='s are - optional when passing parameters in a function call. -

-

- Do not place a space before a comma, but always place one after a - comma. -

GOOD: -

-
tab.prior <- table(df[df$days.from.opt < 0, "campaign.id"])
-total <- sum(x[, 1])
-total <- sum(x[1, ])
-

- BAD: -

-
tab.prior <- table(df[df$days.from.opt<0, "campaign.id"])  # Needs spaces around '<'
-tab.prior <- table(df[df$days.from.opt < 0,"campaign.id"])  # Needs a space after the comma
-tab.prior<- table(df[df$days.from.opt < 0, "campaign.id"])  # Needs a space before <-
-tab.prior<-table(df[df$days.from.opt < 0, "campaign.id"])  # Needs spaces around <-
-total <- sum(x[,1])  # Needs a space after the comma
-total <- sum(x[ ,1])  # Needs a space after the comma, not before
-
-

- Place a space before left parenthesis, except in a function call. -

-

- GOOD: -
if (debug) -

-

- BAD: -
if(debug) -

-

- Extra spacing (i.e., more than one space in a row) is okay if it - improves alignment of equals signs or arrows (<-). -

-
plot(x    = x.coord,
-     y    = data.mat[, MakeColName(metric, ptiles[1], "roiOpt")],
-     ylim = ylim,
-     xlab = "dates",
-     ylab = metric,
-     main = (paste(metric, " for 3 samples ", sep = "")))
-
-

- Do not place spaces around code in parentheses or square brackets. -
Exception: Always place a space after a comma. -

-

- GOOD:

if (debug)
-x[1, ]
-

- BAD:

if ( debug )  # No spaces around debug
-x[1,]  # Needs a space after the comma 
- -

Curly Braces

-

- An opening curly brace should never go on its own line; a closing - curly brace should always go on its own line. You may omit curly - braces when a block consists of a single statement; however, you - must consistently either use or not use curly braces for - single statement blocks. -

-
-if (is.null(ylim)) {
-  ylim <- c(0, 0.06)
-}
-

- xor (but not both) -

-
-if (is.null(ylim))
-  ylim <- c(0, 0.06)
-

- Always begin the body of a block on a new line. -

-

- BAD: -
if (is.null(ylim)) - ylim <- c(0, 0.06) -
if (is.null(ylim)) - {ylim <- c(0, 0.06)} -

- -

Surround else with braces

-

- An else statement should always be surrounded on the - same line by curly braces.

-
-if (condition) {
-  one or more lines
-} else {
-  one or more lines
-}
-
-

- BAD:
-

-
-if (condition) {
-  one or more lines
-}
-else {
-  one or more lines
-}
-
-

- BAD:
-

-
-if (condition)
-  one line
-else
-  one line
-
- -

Assignment

-

- Use <-, not =, for assignment. -

-

- GOOD: -
x <- 5 -

-

- BAD: -
x = 5 -

-

Semicolons

-

- Do not terminate your lines with semicolons or use semicolons to - put more than one command on the same line. (Semicolons are not - necessary, and are omitted for consistency with other Google style - guides.) -

- - -

Organization

-

General Layout and Ordering

-

- If everyone uses the same general ordering, we'll be able to - read and understand each other's scripts faster and more easily. -

-
    -
  1. Copyright statement comment
  2. -
  3. Author comment
  4. -
  5. File description comment, including purpose of - program, inputs, and outputs
  6. -
  7. source() and library() statements
  8. -
  9. Function definitions
  10. -
  11. Executed statements, if applicable (e.g., - print, plot)
  12. -
-

- Unit tests should go in a separate file named - originalfilename_test.R. -

-

Commenting Guidelines

-

- Comment your code. Entire commented lines should begin with - # and one space. -

-

- Short comments can be placed after code preceded by two spaces, - #, and then one space. -

-
# Create histogram of frequency of campaigns by pct budget spent.
-hist(df$pct.spent,
-     breaks = "scott",  # method for choosing number of buckets
-     main   = "Histogram: fraction budget spent by campaignid",
-     xlab   = "Fraction of budget spent",
-     ylab   = "Frequency (count of campaignids)")
-
-

Function Definitions and - Calls

-

- Function definitions should first list arguments without default - values, followed by those with default values. -

-

- In both function definitions and function calls, multiple - arguments per line are allowed; line breaks are only allowed - between assignments. -
GOOD: -

-
PredictCTR <- function(query, property, num.days,
-                       show.plot = TRUE)
-
- BAD: -
PredictCTR <- function(query, property, num.days, show.plot =
-                       TRUE)
-
-

Ideally, unit tests should serve as sample function calls (for - shared library routines). -

-

Function Documentation

-

Functions should contain a comments section immediately below - the function definition line. These comments should consist of a - one-sentence description of the function; a list of the function's - arguments, denoted by Args:, with a description of - each (including the data type); and a description of the return - value, denoted by Returns:. The comments should be - descriptive enough that a caller can use the function without - reading any of the function's code. -

- -

Example Function

-
-CalculateSampleCovariance <- function(x, y, verbose = TRUE) {
-  # Computes the sample covariance between two vectors.
-  #
-  # Args:
-  #   x: One of two vectors whose sample covariance is to be calculated.
-  #   y: The other vector. x and y must have the same length, greater than one,
-  #      with no missing values.
-  #   verbose: If TRUE, prints sample covariance; if not, not. Default is TRUE.
-  #
-  # Returns:
-  #   The sample covariance between x and y.
-  n <- length(x)
-  # Error handling
-  if (n <= 1 || n != length(y)) {
-    stop("Arguments x and y have different lengths: ",
-         length(x), " and ", length(y), ".")
-  }
-  if (TRUE %in% is.na(x) || TRUE %in% is.na(y)) {
-    stop(" Arguments x and y must not have missing values.")
-  }
-  covariance <- var(x, y)
-  if (verbose)
-    cat("Covariance = ", round(covariance, 4), ".\n", sep = "")
-  return(covariance)
-}
-
- -

TODO Style

- -

- Use a consistent style for TODOs throughout your code. -
TODO(username): Explicit description of action to - be taken -

- - -

Language

- -

Attach

-

The possibilities for creating errors when using - attach are numerous. Avoid it.

-

Functions

-

Errors should be raised using stop().

-

Objects and Methods

-

The S language has two object systems, S3 and S4, both of which - are available in R. S3 methods are more interactive and flexible, - whereas S4 methods are more formal and rigorous. (For an illustration - of the two systems, see Thomas Lumley's - "Programmer's Niche: A Simple - Class, in S3 and S4" in R News 4/1, 2004, pgs. 33 - 36: - - https://cran.r-project.org/doc/Rnews/Rnews_2004-1.pdf.) -

-

Use S3 objects and methods unless there is a strong reason to use - S4 objects or methods. A primary justification for an S4 object - would be to use objects directly in C++ code. A primary - justification for an S4 generic/method would be to dispatch on two - arguments. -

-

Avoid mixing S3 and S4: S4 methods ignore S3 inheritance and - vice-versa. -

- - -

Exceptions

- -

- The coding conventions described above should be followed, unless - there is good reason to do otherwise. Exceptions include legacy - code and modifying third-party code. -

- - -

Parting Words

- -

- Use common sense and BE CONSISTENT. -

-

- If you are editing code, take a few minutes to look at the code around - you and determine its style. If others use spaces around their - if - clauses, you should, too. If their comments have little boxes of stars - around them, make your comments have little boxes of stars around them, - too. -

-

- The point of having style guidelines is to have a common vocabulary of - coding so people can concentrate on what you are saying, - rather than on how you are saying it. We present global style - rules here so people - know the vocabulary. But local style is also important. If code you add - to a file looks drastically different from the existing code around it, - the discontinuity will throw readers out of their rhythm when they go to - read it. Try to avoid this. -

- -

- OK, enough writing about writing code; the code itself is much more - interesting. Have fun! -

- - -

References

- -

- - http://www.maths.lth.se/help/R/RCC/ - R Coding Conventions -

-

- http://ess.r-project.org/ - For - emacs users. This runs R in your emacs and has an emacs mode. -

- - - -