2009-07-22 01:04:05 +08:00
|
|
|
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
|
|
|
|
"http://www.w3.org/TR/REC-html4/strict.dtd">
|
2009-07-22 00:56:19 +08:00
|
|
|
<html>
|
|
|
|
<head>
|
|
|
|
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
|
|
|
|
<link rel="stylesheet" type="text/css"
|
|
|
|
href="http://www.corp.google.com/eng/docstyle.css">
|
|
|
|
<title>Google's R Style Guide</title>
|
|
|
|
<style type="text/css">
|
|
|
|
ul.NoBullet {list-style-type: none}
|
|
|
|
</style>
|
|
|
|
</head>
|
|
|
|
|
|
|
|
<body>
|
|
|
|
|
|
|
|
<!---------------------------------------------------------------------------->
|
|
|
|
|
|
|
|
<h1>Google's R Style Guide</h1>
|
|
|
|
|
|
|
|
<p>
|
|
|
|
R is a high-level programming language used primarily for statistical
|
|
|
|
computing and graphics. The goal of the R Programming Style Guide
|
|
|
|
is to make our R code easier to read, share, and verify. The rules
|
|
|
|
below were designed in collaboration with the entire R user community
|
|
|
|
at Google.
|
|
|
|
</p>
|
|
|
|
|
|
|
|
<!---------------------------------------------------------------------------->
|
|
|
|
|
|
|
|
<ul class="NoBullet">
|
|
|
|
<h2><li>Summary: R Style Rules</li></h2>
|
|
|
|
<ol>
|
|
|
|
<li><a href=#filenames>File Names</a>: end in <code>.R</code></li>
|
|
|
|
<li><a href=#identifiers>Identifiers</a>: <code>variable.name</code>,
|
|
|
|
<code>FunctionName</code>, <code>kConstantName</code></li>
|
|
|
|
|
|
|
|
<li><a href=#linelength>Line Length</a>: maximum 80 characters</li>
|
|
|
|
<li><a href=#indentation>Indentation</a>: two spaces, no tabs</li>
|
|
|
|
<li><a href=#spacing>Spacing</a></li>
|
|
|
|
<li><a href=#curlybraces>Curly Braces</a>: first on same line, last on
|
|
|
|
own line</li>
|
|
|
|
<li><a href=#assignment>Assignment</a>: use <code><-</code>, not
|
|
|
|
<code>=</code></li>
|
|
|
|
|
|
|
|
<li><a href=#semicolons>Semicolons</a>: don't use them</li>
|
|
|
|
<li><a href=#generallayout> General Layout and Ordering</a></li>
|
|
|
|
<li><a href=#comments> Commenting Guidelines</a>: all comments begin
|
|
|
|
with <code>#</code> followed by a space; inline comments need two
|
|
|
|
spaces before the <code>#</code></li>
|
|
|
|
|
|
|
|
<li><a href=#functiondefinition>Function Definitions and Calls</a></li>
|
|
|
|
<li><a href=#functiondocumentation> Function Documentation</a></li>
|
|
|
|
<li><a href=#examplefunction> Example Function</a></li>
|
|
|
|
<li><a href=#todo> TODO Style</a>: <code>TODO(username)</code></li>
|
|
|
|
|
|
|
|
</ol>
|
|
|
|
<h2><li>Summary: R Language Rules</li></h2>
|
|
|
|
<ol>
|
|
|
|
<li><a href=#attach> <code>attach</code></a>: avoid using it</li>
|
|
|
|
<li><a href=#functionlanguage> Functions</a>:
|
|
|
|
errors should be raised using <code>stop()</code></li>
|
|
|
|
|
|
|
|
<li><a href=#object> Objects and Methods</a>: avoid S4 objects and
|
|
|
|
methods when possible; never mix S3 and S4 </li>
|
|
|
|
</ol>
|
|
|
|
</ul>
|
|
|
|
|
|
|
|
<br>
|
|
|
|
<ol>
|
|
|
|
<!---------------------------------------------------------------------------->
|
|
|
|
|
|
|
|
<h3><li>Notation and Naming</li></h3>
|
|
|
|
|
|
|
|
<ul class="NoBullet">
|
|
|
|
<h4 id="filenames"><li>File Names</li></h4>
|
|
|
|
<p>
|
|
|
|
File names should end in <code>.R</code> and, of course, be
|
|
|
|
meaningful.
|
|
|
|
<br> GOOD: <code>predict_ad_revenue.R</code>
|
|
|
|
<br> BAD: <code><span style="color:red">foo.R</span></code>
|
|
|
|
|
|
|
|
</p>
|
|
|
|
<h4 id="identifiers"><li>Identifiers</li></h4>
|
|
|
|
<p>
|
|
|
|
Don't use underscores ( <code>_</code> ) or hyphens
|
|
|
|
( <code>-</code> ) in identifiers.
|
|
|
|
Identifiers should be named according to the following conventions.
|
|
|
|
Variable names should have all lower case letters and words
|
|
|
|
separated with dots (<code>.</code>);
|
|
|
|
function names have initial capital letters and no dots
|
|
|
|
(CapWords);
|
|
|
|
constants are named like functions but with an initial
|
|
|
|
<code>k</code>.
|
|
|
|
</p>
|
|
|
|
|
|
|
|
<ul>
|
|
|
|
<li><code>variable.name </code>
|
|
|
|
<br> GOOD: <code>avg.clicks</code>
|
|
|
|
<br> BAD: <code><span style="color:red">avg_Clicks
|
|
|
|
</span></code>, <code><span style="color:red">avgClicks
|
|
|
|
</span></code>
|
|
|
|
|
|
|
|
</li>
|
|
|
|
<li><code>FunctionName </code>
|
|
|
|
<br> GOOD: <code>CalculateAvgClicks</code>
|
|
|
|
<br> BAD: <code><span style="color:red">calculate_avg_clicks
|
|
|
|
</span></code>,
|
|
|
|
<code><span style="color:red">calculateAvgClicks</span></code>
|
|
|
|
|
|
|
|
<br> Make function names verbs.
|
|
|
|
<br><em>Exception: When creating a classed object, the function
|
|
|
|
name (constructor) and class should match (e.g., lm).</em>
|
|
|
|
<li><code>kConstantName </code></li>
|
|
|
|
</ul>
|
|
|
|
</ul>
|
|
|
|
|
|
|
|
<!---------------------------------------------------------------------------->
|
|
|
|
|
|
|
|
<h3><li>Syntax</li></h3>
|
|
|
|
|
|
|
|
<ul class="NoBullet">
|
|
|
|
<h4 id="linelength"><li>Line Length</li></h4>
|
|
|
|
<p>
|
|
|
|
The maximum line length is 80 characters.
|
|
|
|
</p>
|
|
|
|
<h4 id="indentation"><li>Indentation</li></h4>
|
|
|
|
<p>
|
|
|
|
When indenting your code, use two spaces. Never use tabs or mix
|
|
|
|
tabs and spaces.
|
|
|
|
<br><em>Exception: When a line break occurs inside parentheses,
|
|
|
|
align the wrapped line with the first character inside the
|
|
|
|
parenthesis.</em>
|
|
|
|
|
|
|
|
<p>
|
|
|
|
<h4 id="spacing"><li>Spacing</li></h4>
|
|
|
|
<p>
|
|
|
|
Place spaces around all binary operators (<code>=</code>,
|
|
|
|
<code>+</code>, <code>-</code>, <code><-</code>, etc.).
|
|
|
|
<br><em> Exception: Spaces around <code>=</code>'s are
|
|
|
|
optional when passing parameters in a function call.</em>
|
|
|
|
|
|
|
|
</p>
|
|
|
|
<p>
|
|
|
|
Do not place a space before a comma, but always place one after a
|
|
|
|
comma.
|
|
|
|
<br><br> GOOD:<pre><code>tabPrior <- table(df[df$daysFromOpt < 0, "campaignid"])
|
|
|
|
total <- sum(x[, 1])
|
|
|
|
total <- sum(x[1, ])</code></pre>
|
|
|
|
|
|
|
|
</p>
|
|
|
|
<p>
|
|
|
|
BAD:<pre><code><span style="color:red">tabPrior <- table(df[df$daysFromOpt<0, "campaignid"]) # Needs spaces around '<'
|
|
|
|
tabPrior <- table(df[df$daysFromOpt < 0,"campaignid"]) # Needs a space after the comma
|
|
|
|
tabPrior<- table(df[df$daysFromOpt < 0, "campaignid"]) # Needs a space before <-
|
|
|
|
tabPrior<-table(df[df$daysFromOpt < 0, "campaignid"]) # Needs spaces around <-
|
|
|
|
total <- sum(x[,1]) # Needs a space after the comma
|
|
|
|
total <- sum(x[ ,1]) # Needs a space after the comma, not before</span></code></pre>
|
|
|
|
|
|
|
|
</p>
|
|
|
|
<p>
|
|
|
|
Place a space before left parenthesis, except in a function call.
|
|
|
|
</p>
|
|
|
|
<p>
|
|
|
|
GOOD:
|
|
|
|
<br><code>if (debug)</code>
|
|
|
|
</p>
|
|
|
|
<p>
|
|
|
|
|
|
|
|
BAD:
|
|
|
|
<br><code><span style="color:red">if(debug)</span></code>
|
|
|
|
</p>
|
|
|
|
<p>
|
|
|
|
Extra spacing (i.e., more than one space in a row) is okay if it
|
|
|
|
improves alignment of equals signs or arrows (<code><-</code>).
|
|
|
|
</p>
|
|
|
|
<pre><code>plot(x = xCoord,
|
|
|
|
y = dataMat[, makeColName(metric, ptiles[1], "roiOpt")],
|
|
|
|
ylim = ylim,
|
|
|
|
xlab = "dates",
|
|
|
|
ylab = metric,
|
|
|
|
main = (paste(metric, " for 3 samples ", sep="")))
|
|
|
|
|
|
|
|
</pre></code>
|
|
|
|
<p>
|
|
|
|
Do not place spaces around code in parentheses or square brackets.
|
|
|
|
<br><em> Exception: Always place a space after a comma.</em>
|
|
|
|
</p>
|
|
|
|
<p>
|
|
|
|
GOOD: <pre><code>if (debug)
|
|
|
|
x[1, ]</code></pre>
|
|
|
|
</p>
|
|
|
|
|
|
|
|
<p>
|
|
|
|
BAD:<pre><code><span style="color:red">if ( debug ) # No spaces around debug
|
|
|
|
x[1,] # Needs a space after the comma </span></code></pre>
|
|
|
|
</p>
|
|
|
|
<h4 id="curlybraces"><li>Curly Braces</li></h4>
|
|
|
|
<p>
|
|
|
|
An opening curly brace should never go on its own line; a closing
|
|
|
|
curly brace should always go on its own line. You may omit curly
|
|
|
|
braces when a block consists of a single statement; however, you
|
|
|
|
must <em>consistently</em> either use or not use curly braces for
|
|
|
|
single statement blocks.
|
|
|
|
</p>
|
|
|
|
|
|
|
|
<code><pre>
|
|
|
|
if (is.null(ylim)) {
|
|
|
|
ylim <- c(0, 0.06)
|
|
|
|
}</pre></code>
|
|
|
|
<p>
|
|
|
|
xor (but not both)
|
|
|
|
</p>
|
|
|
|
<code><pre>
|
|
|
|
if (is.null(ylim))
|
|
|
|
ylim <- c(0, 0.06)</pre></code>
|
|
|
|
<p>
|
|
|
|
|
|
|
|
Always begin the body of a block on a new line.
|
|
|
|
</p>
|
|
|
|
<p>
|
|
|
|
BAD:
|
|
|
|
<br><code><span style="color:red"> if (is.null(ylim))
|
|
|
|
ylim <- c(0, 0.06)</span></code>
|
|
|
|
<br><code><span style="color:red"> if (is.null(ylim))
|
|
|
|
{ylim <- c(0, 0.06)} </span></code>
|
|
|
|
</p>
|
|
|
|
|
|
|
|
<h4 id="assignment"><li>Assignment</li></h4>
|
|
|
|
<p>
|
|
|
|
Use <code><-</code>, not <code>=</code>, for assignment.
|
|
|
|
</p>
|
|
|
|
<p>
|
|
|
|
GOOD:
|
|
|
|
<br><code> x <- 5 </code>
|
|
|
|
|
|
|
|
</p>
|
|
|
|
<p>
|
|
|
|
BAD:
|
|
|
|
<br><code><span style="color:red"> x = 5</span></code>
|
|
|
|
</p>
|
|
|
|
<h4 id="semicolons"><li>Semicolons</li></h4>
|
|
|
|
<p>
|
|
|
|
Do not terminate your lines with semicolons or use semicolons to
|
|
|
|
put more than one command on the same line. (Semicolons are not
|
|
|
|
necessary, and are omitted for consistency with other Google style
|
|
|
|
guides.)
|
|
|
|
</p>
|
|
|
|
|
|
|
|
</ul>
|
|
|
|
|
|
|
|
<!---------------------------------------------------------------------------->
|
|
|
|
|
|
|
|
<h3><li> Organization </li></h3>
|
|
|
|
<ul class="NoBullet">
|
|
|
|
<h4 id="generallayout"><li>General Layout and Ordering</li></h4>
|
|
|
|
<p>
|
|
|
|
If everyone uses the same general ordering, we'll be able to
|
|
|
|
read and understand each other's scripts faster and more easily.
|
|
|
|
</p>
|
|
|
|
|
|
|
|
<ol>
|
|
|
|
<li>Copyright statement comment
|
|
|
|
<li>Author comment
|
|
|
|
<li>File description comment, including purpose of
|
|
|
|
program, inputs, and outputs
|
|
|
|
<li><code>source()</code> and <code>library()</code> statements
|
|
|
|
<li>Function definitions
|
|
|
|
<li>Executed statements, if applicable (e.g.,
|
|
|
|
<code> print</code>, <code>plot</code>)
|
|
|
|
</ol>
|
|
|
|
|
|
|
|
<p>
|
|
|
|
Unit tests should go in a separate file named
|
|
|
|
<code>originalfilename_unittest.R</code>.
|
|
|
|
<p>
|
|
|
|
<h4 id="comments"><li>Commenting Guidelines</li></h4>
|
|
|
|
<p>
|
|
|
|
Comment your code. Entire commented lines should begin with
|
|
|
|
<code>#</code> and one space.
|
|
|
|
</p>
|
|
|
|
|
|
|
|
<p>
|
|
|
|
Short comments can be placed after code preceded by two spaces,
|
|
|
|
<code>#</code>, and then one space.
|
|
|
|
</p>
|
|
|
|
<pre><code># Create histogram of frequency of campaigns by pct budget spent.
|
|
|
|
hist(df$pctSpent,
|
|
|
|
breaks = "scott", # method for choosing number of buckets
|
|
|
|
main = "Histogram: fraction budget spent by campaignid",
|
|
|
|
xlab = "Fraction of budget spent",
|
|
|
|
ylab = "Frequency (count of campaignids)")
|
|
|
|
</code></pre>
|
|
|
|
|
|
|
|
<h4 id="functiondefinition"><li> Function Definitions and
|
|
|
|
Calls</li></h4>
|
|
|
|
<p>
|
|
|
|
Function definitions should first list arguments without default
|
|
|
|
values, followed by those with default values.
|
|
|
|
</p>
|
|
|
|
<p>
|
|
|
|
In both function definitions and function calls, multiple
|
|
|
|
arguments per line are allowed; line breaks are only allowed
|
|
|
|
between assignments.
|
|
|
|
<br>GOOD:
|
|
|
|
<pre><code>PredictCTR <- function(query, property, numDays,
|
|
|
|
showPlot = TRUE)
|
|
|
|
</code></pre>
|
|
|
|
|
|
|
|
BAD:
|
|
|
|
<pre><code><span style="color:red">PredictCTR <- function(query, property, numDays, showPlot =
|
|
|
|
TRUE)
|
|
|
|
</span></code></pre>
|
|
|
|
<p> Ideally, unit tests should serve as sample function calls (for
|
|
|
|
shared library routines).
|
|
|
|
<h4 id="functiondocumentation"><li> Function Documentation </li></h4>
|
|
|
|
<p> Functions should contain a comments section immediately below
|
|
|
|
the function definition line. These comments should consist of a
|
|
|
|
one-sentence description of the function; a list of the function's
|
|
|
|
arguments, denoted by <code>Args:</code>, with a description of
|
|
|
|
each (including the data type); and a description of the return
|
|
|
|
value, denoted by <code>Returns:</code>. The comments should be
|
|
|
|
descriptive enough that a caller can use the function without
|
|
|
|
reading any of the function's code.
|
|
|
|
|
|
|
|
|
|
|
|
<h4 id="examplefunction"><li> Example Function </li></h4><pre><code>
|
|
|
|
|
|
|
|
CalculateSampleCovariance <- function(x, y, verbose = TRUE) {
|
|
|
|
# Computes the sample covariance between two vectors.
|
|
|
|
#
|
|
|
|
# Args:
|
|
|
|
# x: One of two vectors whose sample covariance is to be calculated.
|
|
|
|
# y: The other vector. x and y must have the same length, greater than one,
|
|
|
|
# with no missing values.
|
|
|
|
# verbose: If TRUE, prints sample covariance; if not, not. Default is TRUE.
|
|
|
|
#
|
|
|
|
# Returns:
|
|
|
|
# The sample covariance between x and y.
|
|
|
|
n <- length(x)
|
|
|
|
# Error handling
|
|
|
|
if (n <= 1 || n != length(y)) {
|
|
|
|
stop("Arguments x and y have invalid lengths: ",
|
|
|
|
length(x), " and ", length(y), ".")
|
|
|
|
}
|
|
|
|
if (TRUE %in% is.na(x) || TRUE %in% is.na(y)) {
|
|
|
|
stop(" Arguments x and y must not have missing values.")
|
|
|
|
}
|
|
|
|
covariance <- var(x, y)
|
|
|
|
if (verbose)
|
|
|
|
cat("Covariance = ", round(covariance, 4), ".\n", sep = "")
|
|
|
|
return(covariance)
|
|
|
|
}
|
|
|
|
|
|
|
|
</code></pre>
|
|
|
|
|
|
|
|
<h4 id="todo"><li> TODO Style </li></h4>
|
|
|
|
<p> Use a consistent style for TODOs throughout your code.
|
|
|
|
<br><CODE>TODO(username): Explicit description of action to
|
|
|
|
be taken</CODE>
|
|
|
|
</ul>
|
|
|
|
|
|
|
|
<!---------------------------------------------------------------------------->
|
|
|
|
|
|
|
|
<h3><li> Language </li></h3>
|
|
|
|
|
|
|
|
<ul class="NoBullet">
|
|
|
|
<h4 id="attach"><li> Attach </li></h4>
|
|
|
|
<p> The possibilities for creating errors when using
|
|
|
|
<code>attach</code> are numerous. Avoid it.
|
|
|
|
<h4 id="functionlanguage"><li>Functions </li></h4>
|
|
|
|
<p> Errors should be raised using <code>stop()</code>.
|
|
|
|
<h4 id="object"><li>Objects and Methods</li></h4>
|
|
|
|
|
|
|
|
<p> The S language has two object systems, S3 and S4, both of which
|
|
|
|
are available in R. S3 methods are more interactive and flexible,
|
|
|
|
whereas S4 methods are more formal and rigorous. (For an illustration
|
|
|
|
of the two systems, see Thomas Lumley's
|
|
|
|
"Programmer's Niche: A Simple
|
|
|
|
Class, in S3 and S4" in R News 4/1, 2004, pgs. 33 - 36:
|
|
|
|
<a href="http://cran.r-project.org/doc/Rnews/Rnews_2004-1.pdf">
|
|
|
|
http://cran.r-project.org/doc/Rnews/Rnews_2004-1.pdf</a>.)
|
|
|
|
<p>Use S3 objects and methods unless there is a strong reason to use
|
|
|
|
S4 objects or methods. A primary justification for an S4 object
|
|
|
|
would be to use objects directly in C++ code. A primary
|
|
|
|
justification for an S4 generic/method would be to dispatch on two
|
|
|
|
arguments.
|
|
|
|
<p>Avoid mixing S3 and S4: S4 methods ignore S3 inheritance and
|
|
|
|
vice-versa.
|
|
|
|
</ul>
|
|
|
|
|
|
|
|
|
|
|
|
<!---------------------------------------------------------------------------->
|
|
|
|
<h3><li> Exceptions </li></h3>
|
|
|
|
|
|
|
|
The coding conventions described above should be followed, unless
|
|
|
|
there is good reason to do otherwise. Exceptions include
|
|
|
|
legacy code and modifying third-party code.
|
|
|
|
|
|
|
|
<!---------------------------------------------------------------------------->
|
|
|
|
<h3><li> Parting Words </li></h3>
|
|
|
|
|
|
|
|
Use common sense and BE CONSISTENT.
|
|
|
|
<p>
|
|
|
|
If you are editing code, take a few minutes to look at the code around
|
|
|
|
you and determine its style. If others use spaces around their
|
|
|
|
<code>if </code>
|
|
|
|
clauses, you should, too. If their comments have little boxes of stars
|
|
|
|
around them, make your comments have little boxes of stars around them,
|
|
|
|
too.
|
|
|
|
<p>
|
|
|
|
|
|
|
|
The point of having style guidelines is to have a common vocabulary of
|
|
|
|
coding so people can concentrate on <em>what</em> you are saying,
|
|
|
|
rather than on <em>how</em> you are saying it. We present global style
|
|
|
|
rules here so people
|
|
|
|
know the vocabulary. But local style is also important. If code you add
|
|
|
|
to a file looks drastically different from the existing code around it,
|
|
|
|
the discontinuity will throw readers out of their rhythm when they go to
|
|
|
|
read it. Try to avoid this.
|
|
|
|
|
|
|
|
OK, enough writing about writing code; the code itself is much more
|
|
|
|
interesting. Have fun!
|
|
|
|
|
|
|
|
<!---------------------------------------------------------------------------->
|
|
|
|
<h3><li> References </li></h3>
|
|
|
|
|
|
|
|
<a href="http://www.maths.lth.se/help/R/RCC/">
|
|
|
|
http://www.maths.lth.se/help/R/RCC/</a> - R Coding Conventions
|
|
|
|
<br>
|
|
|
|
<a href="http://ess.r-project.org/">http://ess.r-project.org/</a> - For
|
|
|
|
emacs users. This runs R in your emacs and has an emacs mode.
|
|
|
|
|
|
|
|
</ol>
|
|
|
|
|
|
|
|
<!---------------------------------------------------------------------------->
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
</html>
|