vignettes/AirSensor2.Rmd
Install the latest version from GitHub with:
devtools::install_github('mazamascience/AirSensor2')
The AirSensor2 package uses the MazamaSpatialUtils package to enhance data downloaded from PurpleAir with additional spatial metadata. For this to work properly, you must specify a spatial data directory and install additional datasets.
If you don’t already have a spatial data directory specified, the following setup is recommended:
# Create the spatial data directory (and any missing parent directories)
if ( !dir.exists("~/Data/Spatial") ) {
  dir.create("~/Data/Spatial", recursive = TRUE)
}
MazamaSpatialUtils::setSpatialDataDir("~/Data/Spatial")
# Install datasets
MazamaSpatialUtils::installSpatialData("NaturalEarthAdm1")
MazamaSpatialUtils::installSpatialData("USCensusCounties")
# Check installed datasets
MazamaSpatialUtils::installedSpatialData()
Please see the article on Working with PurpleAir API Keys.
The following code chunks assume that you have set up a global_vars.R file which sets your PurpleAir_API_READ_KEY.
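As a rough sketch, a global_vars.R file could look like the following. The environment variable name PURPLEAIR_API_READ_KEY is an assumption made for illustration; any approach that ends with a character string assigned to PurpleAir_API_READ_KEY fits the code chunks in this article.

```r
# global_vars.R -- keep this file out of version control.
# Assumption: the key is stored in an environment variable named
# PURPLEAIR_API_READ_KEY (a hypothetical name chosen for this sketch).
# Sys.getenv() returns an empty string if the variable is not set.
PurpleAir_API_READ_KEY <- Sys.getenv("PURPLEAIR_API_READ_KEY")

if ( PurpleAir_API_READ_KEY == "" ) {
  warning("PurpleAir_API_READ_KEY is empty; set it before requesting data.")
}
```

Reading the key from the environment keeps it out of the source file itself. Assigning the key as a literal string in global_vars.R works equally well for local use.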
Synoptic data (many measurements at a single point in time) are stored as a tibble (modern dataframe) for easy use with dplyr.
Data for stationary time series are stored using the MazamaTimeSeries SingleTimeSeries (sts) data model:
Time series data from a single environmental sensor typically consist of multiple parameters measured at successive times. These data are stored in an R list containing two dataframes:
sts$meta – 1 row = unique device-deployment; cols = device/location metadata
sts$data – rows = UTC times; cols = measured parameters (plus an additional datetime column)
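To make the layout concrete, here is a minimal base R sketch of a list with the same two-dataframe shape. It is a toy illustration only: the column names and values are invented, and the object is not created or validated by MazamaTimeSeries.

```r
# Toy illustration of the two-dataframe layout; all values are made up.

# meta: one row per device-deployment, columns hold device/location metadata
meta <- data.frame(
  deviceDeploymentID = "example_device_01",  # hypothetical identifier
  longitude = -120.5,
  latitude = 47.5,
  stringsAsFactors = FALSE
)

# data: one row per UTC time, columns hold measured parameters
# plus the datetime column
data <- data.frame(
  datetime = as.POSIXct("2024-01-01 00:00:00", tz = "UTC") + 3600 * 0:2,
  pm25 = c(4.1, 5.3, 6.0),
  humidity = c(60, 62, 65)
)

toy_sts <- list(meta = meta, data = data)

# Inspect the structure
str(toy_sts)
```

Keeping metadata and measurements in separate dataframes means the location information is stored once rather than repeated on every time step.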
The following example creates a PurpleAir_synoptic (or “pas”) object with recent measurements for all PurpleAir sensors in Washington state.
# AirSensor2 package
library(AirSensor2)
# Set user's PurpleAir_API_READ_KEY
source('global_vars.R')
setAPIKey("PurpleAir-read", PurpleAir_API_READ_KEY)
# Initialize spatial datasets
initializeMazamaSpatialUtils()
# All sensors in Washington state
WA_pas <-
  pas_createNew(
    countryCodes = "US",
    stateCodes = "WA",
    lookbackDays = 1,   # only sensors reporting within the last day
    location_type = 0   # 0 = outdoor sensors
  )
pas_leaflet(WA_pas)
The AirSensor2 package supports several sources of low-cost sensor data, with functions tailored to the specifics of each source.
Functions with names of the form <source>_<action>() are designed to perform a particular <action> on data provided by a specific <source>.
Currently supported sources of data include:
Clarity
PurpleAir
In addition, functions with names of the form <structure>_<action>() are designed to perform a particular <action> on data that has been converted into a particular <structure> (aka “class”).
Currently supported data structures include:
pas – PurpleAirSynoptic. Data for many PurpleAir sensors at a specific point in time. Each pas object is a simple tibble.
pat – PurpleAirTimeseries. Time series data for a specific PurpleAir sensor. Each pat object is a list with two tibbles: meta and data.
synoptic – Generic synoptic data for many sensors at a specific point in time. Each synoptic object is a simple tibble.
We encourage “data pipeline” style coding as promoted by dplyr and related packages. The special %>% operator passes the output of one function as the first argument of the next function, allowing easy “chaining” of results. Many of the functions supporting a particular data class take an object of that class as their first argument and return an object of that class. As a result, they can be chained together as in:
# Okanogan county high elevation sites
okanogan_high_pas <-
  WA_pas %>%
  pas_filter(countyName %in% c("Okanogan")) %>%
  pas_filter(elevation >= 800)
pas_leaflet(okanogan_high_pas)
# Okanogan county low elevation sites
okanogan_low_pas <-
  WA_pas %>%
  pas_filter(countyName %in% c("Okanogan")) %>%
  pas_filter(elevation < 800)
pas_leaflet(okanogan_low_pas)
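The chaining pattern can also be tried without an API key. The sketch below is self-contained base R with invented data: toy_pas and toy_filter() are hypothetical stand-ins, not part of AirSensor2, and the code uses R's native |> pipe (R 4.1 or later) in place of %>% so that no extra packages are needed. For this kind of call both operators pass the left-hand value as the first argument.

```r
# Toy synoptic table; the values are invented for illustration.
toy_pas <- data.frame(
  sensor_index = 1:4,
  countyName = c("Okanogan", "Okanogan", "King", "Okanogan"),
  elevation = c(950, 400, 50, 1200)
)

# Hypothetical stand-in for a class-preserving filter: it takes a data frame
# as its first argument and returns a data frame, so calls can be chained.
# 'keep' is a function that returns a logical vector, one value per row.
toy_filter <- function(df, keep) {
  df[keep(df), , drop = FALSE]
}

# Nested calls ...
nested <- toy_filter(
  toy_filter(toy_pas, function(d) d$countyName == "Okanogan"),
  function(d) d$elevation >= 800
)

# ... and the equivalent pipeline using the native pipe.
piped <-
  toy_pas |>
  toy_filter(function(d) d$countyName == "Okanogan") |>
  toy_filter(function(d) d$elevation >= 800)

# Both forms produce the same result
identical(nested, piped)
```

The pipeline reads top to bottom in the order the steps are applied, which is the main reason for preferring it over nested calls.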
Best of luck analyzing your local air quality data!