3 Functions
3.1 Naming
Function names should be meaningful and descriptive and provide insight to what the function does. Typically, functions names are VERBS and describe the action the function performs. This style agrees with Google’s recommendation of using BigCamelCase for function names. Do not use special characters in the function name, but using letters or numbers is acceptable. That is, -, . and _ should not be included in a function name. While - or _ do make the function names more readable than running words together like functionname, the use of - or _ requires extra key strokes when compared to BigCamelCase where the first letter of each word is capitalized. To avoid confusion with S3 methods a . should not be used in a function name unless it is to define an S3 method, see example below.
In addition, it is recommended that function names should contain verbs or a word that describes the action the function performs. Vague generic function names like PerformTasks() or Com() should not be used.
# Good Example
ComputePosteriorParameters()
ComputePosteriorParameters.BetaBinom() # If ComputePosteriorParameters is a generic function this is fine
SimulatePatientArrivalTimes()
# Bad Example
post.par() # Incorrect name capitalization and including ., non-descriptive
arrival-time() # Incorrect name capitalization and including -
addhr() # Capitalizaiton and vague
ComputePosteriorParameters.BetaBinom() # If ComputePosteriorParameters is NOT a generic function this is not acceptable
3.2 Structure
Functions should be self contained and only depend on the arguments. They should NOT use variables in a higher scope. A call to a function with the same arguments should always return the same value, unless the function is supposed to produce random variables ect. [QUESTION: What does everyone thing about there not being a preceding and trailing space before the first argument and after the last argument? I prefer function( x, y ), but Tidyverse suggests not putting spaces inside or outside side of parentheses see https://style.tidyverse.org/syntax.html#spacing so it would allow function(x, y) but not function( x, y ).
Also, I have always put the { } for a function in line with each other and the first letter of the function name like the examples below but both TV and Google recommend the opening { after function, for example Add2 <- function(x) { X <- x + 2 return(x) }
I am okay with either approach. ]
# Example
#--------------------------------------------------------------------------------------
# VERY BAD STYLE - What this function returns depends on what y was before the call
#--------------------------------------------------------------------------------------
MyFunction <- function(x)
{
x <- x + y
return(x)
}
y <- 5
MyFunction(4) # return 9
y <- 10
MyFunction(4) #returns 14
#--------------------------------------------------------------------------------------
# Acceptable example - a call to the function only depends on arguments
#--------------------------------------------------------------------------------------
MyFunction2 <- function( x, y )
{
x <- x + y
return( x )
}
y <- 5
MyFunction2( 4, 7 ) # return 11
y <- 10
MyFunction2( 4, 7 ) #returns 7
3.3 Explicit Returns
Do NOT rely on R’s implicit return feature as an explicit return is much clearer for the reader. Tidyverse and Google do not agree on this recommendation and this guide follows Google as it is much more transparent to developers that may not be as familiar with R.
# Good Example - explicit return
MyFunction(x, y)
{
return(x + y)
}
# Bad Example - rely on explicit return can be confusing and more error prone
MyFunction(x, y)
{
x + y
}
3.4 Local function
A local function refers to defining a function within another function. Local functions are difficult to test, understand and prone to errors. Local functions should be avoided unless necessary.
## Good Example
RunAnalysis1 <- function( dfData )
{
return( lm( dfData$y ~ dfData$x ) )
}
RunAnalysis2 <- function( dfData )
{
return( lm( dfData$y ~ dfData$x + dfData$x*dfData$Trt ) )
}
RunAnalysis3 <- function( dfData )
{
return( lm( dfData$y ~ dfData$x + dfData$x2 ) )
}
AnalyzeData <- function( strType, dfData )
{
if( strType == "ONE" ) fit <- RunAnalysis1( dfData )
if( strType == "TWO" ) fit <- RunAnalysis2( dfData )
if( strType == "THREE" ) fit <- RunAnalysis3( dfData )
return( fit )
}
## Bad Example
AnalyzeData <- function( strType, dfData ){
if( strType == "ONE" ) ana <- function( dfData ){ lm( dfData$y ~ dfData$x ) }
if( strType == "TWO" ) ana <- function( dfData ){ lm( dfData$y ~ dfData$x + dfData$x*dfData$Trt )}
if( strType == "THREE" ) ana <- function( dfData ){ lm( dfData$y ~ dfData$x + dfData$x2 )}
fit <- ana( dfData )
return( fit )
}
The above “Good” example provides functions RunAnalysis1, RunAnalysis2, RunAnalysis3 that are much easier to test. However, AnalyzeData could be improved further in this example with the use of s3 classes where the class( dfData ) determines which analysis to call thus eliminating the need for the if statements.