vignettes/Developer_Style_Guide.Rmd
Developer_Style_Guide.Rmd
This document describes the coding style used within the package. Having a consistent style enhances the readability and “understandability” of the code and makes it easier for users and developers to work with this package and with other, related Mazama Science packages.
Naming variables is one of the most important things to get right to make your code readable and understandable to future readers of the code. (Perhaps even yourself!). Having a system for creating names also makes it easier to come up with new ones.
Mazama Science packages embrace
lowerCamelCase
for object names.
With the casing settled, we use an ornithologist’s sensibility for how to identify things:
bird
blackBird
redwingedBlackBird
It’s a simple system: start with a noun and prefix it with descriptors until it is uniquely identified.
In this system we would never have a variable called:
num_hours
. Instead we go through our process:
count
hourCount
.For complex objects it is often helpful to give readers of the code a hint as to what type of object it is so they will know how to work with it. We often use variable names like:
monitor
— a mts_monitor objectWe occasionally use ’_’ to create classes of similar variables that are otherwise hard to name, e.g.:
QC_negativeValues
Most functions should strive to be atomic in nature and should do one thing really well. Think of them as functional Lego bricks that we click together to achieve more advanced functionality. Where objects are well described nouns, functions are well described verbs that describe what they do as in:
monitor_collapse()
monitor_combine()
monitor_dropEmpty()
monitor_filter()
monitor_filterDate()
monitor_select()
...
All of these functions begin with monitor_
because they
are for creating or working with mts_monitor objects. Many of
these functions accept a mts_monitor object as their first
argument and return a modified mts_monitor. This means that
they can be used with the %>%
“pipe” operator and
chained together as in:
# Create a daily average dataframe (aka tibble)
Sacramento_area_daily_avg <-
# get all data for 2018
monitor_loadAnnual(2018) %>%
# filter to a 2-week period
monitor_filterDate(
startdate = 20181108,
enddate = 20181123,
timezone = "America/Los_Angeles"
) %>%
# filter to monitors near Sacramento
monitor_filterByDistance(
longitude = -121.4931,
latitude = 38.56844 ,
radius = 50000
) %>%
# combine by averaging hourly values from different monitors
monitor_collapse() %>%
# calculate local time daily averages
monitor_dailyStatistic() %>%
# extract daily average values from 'monitor' object
monitor_getData()
Each file should contain a single function of the same name. Thus,
the function named monitor_filterDate()
is defined in
monitor_filterDate.R
. An exception is made for small,
mostly internal functions used in conjunction with a particular type of
object or activity. These can be stored together in a file named
utils-~
:
utils-monitor.R
We generally adhere to the Wickham Style Guide for syntax with a few exceptions:
Do place spaces around code in parentheses if it is
an if
test:
if ( <logical expression part1> && <logical expression part2> ) {
...
}
When debugging, this makes it much easier to select the logical test with a cursor and paste it into the RStudio console.
We generally like to specify R lists with each
parameter = value
pair on a separate line. This goes for
regular lists and for named argument lists passed to a function:
# Filter to a 2-week period
monitor_filterDate(
startdate = 20181108,
enddate = 20181123,
timezone = "America/Los_Angeles"
) %>%
Coding this way makes it easy to see which function arguments are being passed. It also eases future refactoring of the code when additional arguments need to be added or the order of arguments needs to be changed.
It is our belief that good code should be both readable and understandable and should inspire others to copy and innovate on their own.