NEWS.md
PURPLE_AIR_API_READY_KEY instead of MY_API_READ_KEY.
show_only is used, spatial bounding information is ignored in: pa_getSensorsData(), pas_downloadParseRawData() and pas_createNew().
pas_createNew() to be backwards compatible with MazamaSpatialUtils version 0.7.6.
Updates to accommodate data access through the PurpleAir API released in July of 2022. This API requires an API_KEY. Users of the package are required to obtain their own credentials from PurpleAir.
For information, see:
Updated Mazama package dependency versions.
Support for API keys.
Replace sp package dependency with sf package.
New versions of pas_createNew(), pas_downloadParseData(), pas_enhanceRawData() and pas_leaflet().
Some column names in generated pas dataframes have changed to be more consistent with PurpleAir documentation.
New versions of pat_downloadParseRawData(), pat_createNew().
Updated package data and function reference examples.
Removed package dependencies: httpcode
pas_addCommunityRegion().
pas_load() so that current and archival ‘pas’ objects can be easily merged. (Some early ‘pas’ objects contained some “Voc” data but this has never been a focus of the AirSensor package. Later ‘pas’ objects do not contain the “Voc” parameter.)
pas_load() now properly handles loading of archival ‘pas’ objects prior to 2019-08-02.
pat_externalFit() and pat_monitorComparison().
sensor_calendarPlot().
pat_join() to handle older ‘pat’ objects with integer deviceID.
sensor_pollutionRose() now searches for the nearest 2 met sites in case the first one has no data. This matches the behavior in sensor_polarPlot().
example_~ data files with AirSensor 0.9.17.
.json file.
plantowerAlgorithm argument from pat_createPATimeseriesObject() as an attractive nuisance that could generate much confusion.
sensor_~() functions.
sensor_polarPlot() and sensor_pollutionRose() functionality by upgrading to worldmet 0.9.1.
sensor_join().
sensor_load().
meta$ID and meta$deviceID for both pat and airsensor objects are of type “character”. This fixes errors during joining of monthly files that may have been created with an earlier version of the package.
PurpleAirQC_hourly_~() functions, returnAllColumns now returns a column of data for every parameter used in the QC algorithm.
stats::t.test() inside of PurpleAirQC_hourly_AB_01().
PurpleAirQC_hourly_AB_01() to properly use the t-test p_value.
PurpleAirQC_hourly_AB_01() to gracefully handle time periods when R’s stats::t.test() function returns an error – e.g. when an hourly subset contains insufficient data.
pas_downloadParseRawData() to guarantee that all “ID” parameters are of “character” type and not “integer”.
sensor_filterDatetime() function for times that are not aligned to date boundaries.
sensor_load() to use sensor_join() and other internal functions for package consistency.
PWFSLSmoke::monitor_isMonitor() with package internal sensor_isSensor().
pat_join() to remove overlapping data in incoming PAT_pat objects before joining.
pat_filterDatetime() function for times that are not aligned to date boundaries.
pat_createNew() to not trim data at date boundaries.
pat_load() to not trim data at date boundaries and removed the days parameter.
sensor_join() function to “join” airsensor objects from different months.
pat_distinct() to remove duplicated time stamps and guarantee proper ordering by datetime.
pat_isPat() and sensor_isSensor().
idPattern argument to all pas_get~() functions.
sensor_load().
round(1) from internal pat_aggregate() calculations.
patData_aggregate() to work properly. The passed in FUN must now return a matrix.
sensor_load~() functions.
pat_createPATimeseriesObject.R.
Changes requested by CRAN maintainers:
option(warn = -1).
sensor_videoFrame(). This function is not general and has been moved to the AQ-SPEC-sensor-data-ingest-v1 private repository.
local_Rmd/PurpleAirQC_Comparison.Rmd.
PurpleAirQC_hourly_AB_01() as the default QC algorithm throughout the code. (The _02 version removed too much data.)
sensor_pollutionRose() and sensor_polarPlot().
Pre-release candidate.
setArchiveBaseDir(NULL) now works identically to removeArchiveBaseDir(). Ditto for setArchiveBaseUrl(NULL).
pas_load() returns helpful message when the archive BASE_DIR is set and no requested pas data is found: removeArchiveBaseDir().
verbose parameter to sensor_polarPlot() and sensor_pollutionRose().
pat_multiplot() as an alias for pat_multiPlot().
pat_dailySoH() bug that showed up with dplyr 1.0.0.
pat_aggregate() to always return a “regular” time axis with no missing timesteps.
example_sensor dataset.
PurpleAirQC_hourly_AB_03().
pat_monitorComparison()
FUN to specify which QC algorithm to apply
distanceCutoff specifying the radius within which to search for a nearby monitor.
pat_filterDate().
pat_createNew() check for spatial data.
pat_upgrade() function to bring version 0.6 ‘pat’ objects up to the version 0.8 data model.
pas_createNew().
pat_dygraph() to include support for parameter = "pressure".
pat_aggregateOutlierCounts().
This release completely refactors the way that ‘pat’ data are downloaded from ThingSpeak. Additional data columns are added to the standard pat$data dataframe.
pat_downloadParseRawData() data access times and also reduced the volume of data downloaded by switching from the .json to the .csv API from ThingSpeak. This function now returns a list of five dataframes.
pat_createPATimeseriesObject() to work with the new list of dataframes from pat_downloadParseRawData(). New fields pressure and bsec_iaq are included in the created pat object.
pat_multiplot() to pat_multiPlot() (capital ‘P’).
pat_scatterMatrixPlot to pat_scatterPlotMatrix().
pat_createAirSensor()
pas_load() now ignores duplicated warning messages.
pm25 in pat_qc() was increased to 2,000 based on a 2020-05-14 conversation with PurpleAir founder Adrian Dybwad.
This release completely refactors the way that aggregation and QC are performed. The changes should make the code easier to maintain, faster and much more flexible for those developing QC algorithms.
pat_aggregate() and pat_createAirSensor() functions.
PurpleAirQC_hourly_AB_02().
pat_externalFit() and pat_monitorComparison() updated to use new aggregation/QC functions.
pat_aggregate() and pat_createAirSensor() are now available as pat_aggregate_old and pat_createAirSensor_old().
PurpleAirQC_validationPlot() renamed to PurpleAirQC_validationPlot_old().
pat_outliers() defaults updated to reflect current default 120 second sample interval.
pat_multiPlot() now returns a ggplot2 object which can be manipulated.
downloadParseTimeseriesData() to pat_downloadParseRawData().
pas_load().
pas_upgrade() to add sensorManufacturer, targetPollutant and technologyType columns.
archiveBaseUrl from http://smoke.mazamascience.com/data/PurpleAir to https://airsensor.aqmd.gov/PurpleAir/v1.
pat_scatterPlot() to pat_scatterPlotMatrix().
downloadParseSynopticData() to pas_downloadParseRawData().
enhanceSynopticData() to pas_enhanceData().
pas objects: pas_addAirDistrict(), pas_addSpatialMetadata(), pas_addUniqueIDs(), pas_hasSpatial().
pas_upgrade() function to upgrade pas files created with AirSensor version 0.5.
enhanceSynopticData() when countryCodes = NULL. This was encountered when running pas_createNew() with countryCodes = NULL.
MazamaCoreUtils::loadDatafile() function.
pas_~() functions.
pat_multiPlot() and pat_monitorComparison()
pat_isEmpty() so that it tests for an empty data dataframe rather than an empty meta dataframe.
pat_trimDate() now preserves the first day if it begins at local midnight.
pat/YYYY/. The pat_loadMonth() function was modified to search in this location.
pat$meta$pm25_A/B values now come from the raw data outdoor value PM2.5 (ATM) rather than the indoor value PM2.5 (CF=1). See https://www2.purpleair.com/community/faq#!hc-what-is-the-difference-between-cf-1-and-cf-atm
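The pat_isEmpty() change noted above (emptiness judged by the data dataframe rather than by meta) can be sketched in base R. This is only an illustration: the helper name is_empty_pat() and the toy list below are hypothetical stand-ins, not the package's own code or data model.

```r
# Hypothetical stand-in for a 'pat' object: a list holding 'meta' and 'data'.
pat <- list(
  meta = data.frame(ID = "1234", label = "example", stringsAsFactors = FALSE),
  data = data.frame(
    datetime = as.POSIXct(numeric(0), origin = "1970-01-01", tz = "UTC"),
    pm25_A = numeric(0)
  )
)

# Hypothetical helper: a pat is "empty" when its 'data' dataframe has no rows,
# even if 'meta' still describes a sensor.
is_empty_pat <- function(pat) {
  nrow(pat$data) == 0
}

is_empty_pat(pat)
```

Here meta has one row but data has none, so the sketch reports the object as empty.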
downloadParseTimeseriesData().
countryCodes = NULL in pas_createNew() and enhanceSynopticData() so that it is easier to create pas objects for other countries.
countryCode and stateCode arguments from pas_get~ functions. Filtering should be done with pas_filter() before these functions are called.
enhanceSynopticData() which failed to add pwfsl~ columns when includePWFSL = FALSE. (The columns should still exist but with all NA.)
pas_get~() functions now support subsetting by countryCode and US stateCode.
days argument to pat_loadLatest().
pat_load() and sensor_load() default to the last 7 days when no dates are provided.
pat_trimDate() function trims data to local time full days.
pas_isPas() no longer requires presence of optional pwfsl_~ columns.
pas_filterNear() to package standard longitude, latitude.
deviceID, locationID, deviceDeploymentID.
pat_createNew() and pat_load~() functions so that the first id argument is used as the deviceDeploymentID unique time series identifier.
label and pas and provide an alternative to specifying id.
example_pat~ data files so they have the new metadata fields: sensorID, locationID and deviceDeploymentID.
Revisited everything at the pas object level:
pas_sensorDeploymentID() to pas_deviceDeploymentID()
Version 0.6 is a backwards incompatible release. It replaces the label identifier with a proper “sensor-deployment” identifier and relies on the MazamaLocationUtils package to create location identifiers. Because of this change, archives must be regenerated and many functions must be refactored.
New fields in pat$meta include:
sensorID – same as ID but more explicit
locationID – generated with MazamaLocationUtils::location_createID()
sensorDeploymentID – combination of the above
The following functions were added/refactored to use “sensor-deployment” identifiers:
pas_sensorDeploymentID() creates a unique “sensor-deployment” identifier. This identifier is used in creation of file names in the archive database and as the unique time series identifier in airsensor objects.
pas_getColumn(), pas_getLabels(), pas_getIDs()
pat_createNew()
pat_load(), pat_loadLatest(), pat_loadMonth()
pat_createAirSensor()
sensor_polarPlot() now uses the second-nearest met station if no data are found at the nearest station.
pat_filterDate() now issues a warning if requested data range is outside available date range
downloadParseTimeseriesData()
Additional functionality for calculating a state of health index to be used as an overall assessment of sensor functioning.
pat_dailyStateOfHealth() to pat_dailySoH()
pat_dailySoHPlot(), pat_dailySoHIndex_00(), pat_dailySoHIndexPlot()
Modified how pat_filterDate() obtains the timezone used to interpret the incoming startdate and enddate:
startdate timezone if it is POSIXct
timezone if it is passed in
pat$meta$timezone otherwise
pat_outliers() no longer fails in the face of DC signals
pas_getLabels() function simplifies the creation of vectors of sensor labels (currently used as unique identifiers)
Refactored PurpleAirSoH_daily~() functions to use dplyr resulting in code that runs faster and is easier to understand.
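The timezone precedence listed for pat_filterDate() (startdate timezone if POSIXct, then the timezone argument, then pat$meta$timezone) can be sketched as a small base R function. The name resolve_timezone() and its arguments are hypothetical and this is not the package's internal code. Falling through to the next rule when a POSIXct value carries no timezone attribute is an assumption made for the sketch.

```r
# Hypothetical helper illustrating the precedence described above.
resolve_timezone <- function(startdate, timezone = NULL, meta_timezone) {
  # 1) A POSIXct startdate carries its own timezone in the 'tzone' attribute.
  if (inherits(startdate, "POSIXct")) {
    tz <- attr(startdate, "tzone")
    # Assumption: if the attribute is missing or empty, fall through to rule 2.
    if (!is.null(tz) && nzchar(tz[1])) return(tz[1])
  }
  # 2) An explicitly passed timezone comes next.
  if (!is.null(timezone)) return(timezone)
  # 3) Otherwise use the timezone stored in the sensor metadata.
  meta_timezone
}

# POSIXct input: its own timezone wins over the other two.
resolve_timezone(as.POSIXct("2019-08-01", tz = "UTC"), "America/Los_Angeles", "America/Denver")

# Non-POSIXct input with an explicit timezone.
resolve_timezone(20190801, "America/Los_Angeles", "America/Denver")

# Non-POSIXct input with no timezone argument: metadata timezone is used.
resolve_timezone(20190801, NULL, "America/Denver")
```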
~SoH_dailyCorrelation() to ~SoH_dailyMetFit().
timeseriesTbl_multiPlot() argument parameterPattern to pattern.
Fixed bug in datetime axis that caused SoH functions to return missing values after the switch from PST to PDT.
Removed logger.error() statements from the following low-level functions as they couldn’t be turned off and ended up cluttering the log files:
loadDataFile()
pas_load()
pat_createNew()
pat_loadLatest()
pat_loadMonth()
sensor_loadLatest()
sensor_loadMonth()
sensor_loadYear()
Improved pat_join() logic to deal with metadata changes associated with the value pwfsl_closestMonitorID when temporary monitors come and go in the PWFSL database.
New PurpleAirSoH~ functions calculate daily state-of-health metrics from PAT data. These metrics can be combined to create multi-metric indices that provide an overall assessment of the health of a Purple Air sensor.
PurpleAirSoH_dailyCorrelation()
PurpleAirSoH_dailyPctDC()
PurpleAirSoH_dailyPctReporting()
PurpleAirSoH_dailyPctValid()
pat_dailyStateOfHealth()
sensor_loadYear() to load annual files (~2-3 MB).
sensor_load() now attempts to load and join annual files before attempting the slower process of loading and joining monthly files.
timeseriesTbl_multiPlot() now supports a style parameter which can be set to "point", "line" or "area".
SoH_pctValid(), SoH_pctDC().
pat_createNew() and downloadParseTimeseriesData() to support the id parameter.
PurpleAirQC_aggregationPlot() to work with any tibble or dataframe and renamed it to tbl_multiplot().
SoH_pctReporting().
pat_aggregate() no longer stops in the face of data with a DC signal on one of the pm25 channels.
aggregation_FUN argument to pat_createAirSensor() to allow for custom aggregation statistics.
PurpleAirQC_aggregationPlot() function.
datestamp handling in pas_load().
make.names argument to pat_loadMonth() and pat_loadLatest().
pat_dygraph() which referenced PWFSLSmoke::parseDatetime() which is now deprecated in favor of MazamaCoreUtils::parseDatetime().
loadDataFile(), getArchiveBaseDir(), setArchiveBaseDir().
~_load() and ~_loadLatest() functions now call loadDataFile() under the hood and will work with data archives found at archiveBaseDir.
pat_load() now continues working in the face of missing data files.
pat_calendarPlot() and sensor_calendarPlot() from the package. These work-in-progress functions are now found in local_R/. We anticipate including calendar plotting functionality in the AirMonitorPlots package.
Version 0.5.x represents the version of the package that is ready for public release.
sensor_filterMeta() where filtering the number of rows in sensor$meta did not also filter columns in sensor$data.
sensor_videoFrame() with new arguments and new defaults for colorPalette and colorBins.
pat_sample() to handle cases where the A or B channel contains only missing values.
sensor_loadLatest() now supports a days argument and defaults to loading a 45-day file.
local_executables/ scripts.
timezone argument to downloadParseTimeseriesData().
local_executables/crontab~ files.
downloadParseTimeseriesData() to avoid using day boundaries when explicit times are specified.
pas_load() default retries was increased from 10 to 30.
pat_createNew() created time ranges that ended (UTC - local) hours short of the requested enddate.
pat_externalFit(), pat_internalFit(), pat_monitorComparison() and pat_outliers() now have time axes with sensor local time rather than UTC.
pat_externalFit() and pat_monitorComparison() labeling now includes PWFSL monitor siteName
base::Sys.timezone().
enhanceSynopticData() now handles presence or absence of State column in raw data obtained from PurpleAir. This was apparently removed some time during the summer of 2019.
pat_calendarPlot() and sensor_calendarPlot() to handle discrete color scales.
pat_createAirSensor() now uses pat$meta$label to populate sensor$meta$siteName.
sensor_calendarPlot().
aspectRatio argument to ~calendarPlot() functions.
sensor_load() no longer stops when monthly files cannot be found. This helps when asking for the current year’s worth of data for a calendar plot.
MazamaCoreUtils::dateRange() to reflect change from “end of enddate” to “start of enddate”.
setArchiveBaseUrl() and getArchiveBaseUrl() functions allow per session specification of the location of pre-generated data files.
baseUrl parameter from all data loading functions. Now users must begin a session with setArchiveBaseUrl().
sensor_load() to trim data to the requested time range.
pat_scatterPlot() that generated an error when the number of records in the pat object was fewer than the sampleSize parameter.
pat_calendarPlot() tailored to full-year calendar heatmaps.
pat_distinct() to remove duplicate data records.
pat_~() functions now remove duplicate records to guarantee proper functioning of chained functions.
pat_loadLatest() to always get the most recent 7 days.
pat_loadSensor() to always get the most recent 7 days.
timezone argument to pat_createNew().
pat_createNew() now accepts start and end points not on date boundaries.
AirShiny_leaflet()
pas_leaflet() coloring issue.
pwd
pat_load() which didn’t trim data to date boundaries when a local timezone was passed in.
pat_calendarPlot() to plot annual daily heat map.
pas_loadLatest() to pas_createNew().
pat_loadLatest() to pat_createNew().
download~() functions.
timezone parameter to pas_load().
timezone information in every internal function call where it can be specified so that R’s default “system local timezone” does not accidentally get used.
sensor_polarplot() and sensor_pollutionRose().
PurpleAirQC_algorithm to sensor objects created by pat_createAirSensor() so that they self-document how they were created.
archival argument to pas_load(). When archival = TRUE a pas object will be loaded that contains metadata for all PurpleAir sensors including those that have ceased reporting.
downloadParseSensorList().
name to label in pat_createNew() and downloadParseTimeseriesData().
local_executables/ to work with the latest release.
pas_staticMap() and pas_filterNear().
downloadParseSensorList() to download a list of archived PurpleAir sensors.
sensor_videoFrame() function to create a single frame map for a network of sensors. These can be used to create videos showing the evolution of PM2.5 levels over several days.
local_executables/createVideo_exec.R script to generate mp4 videos.
ylim argument to pat_multiPlot().
sensor_filter(), sensor_filterDate() and sensor_filterMeta().
local_examples/downloadSpeeds.Rmd to benchmark data download times from ThingSpeak.
pat_aggregateOutlierCounts() to count outliers per aggregation period.
pat_aggregate() to fix warnings and optimize
wind_loadMonth() to load pre-generated monthly wind data
wind_load() to load pre-generated wind data from timestamps
sensor_pollutionRose() to accept new wind data model
sensor_polarPlot() to plot bivariate polar plots
airsensor_load~() to sensor_load~().
sensor_~ utility functions: isSensor(), isEmpty(), extractMeta(), extractData().
example_sensor dataset for use in documentation examples.
local_examples/LA_fireworks_2019.R
min_count = 20).
pat_scatterPlot().
returnAllColumns option to PurpleAirQC_~ functions.
PurpleQC_validationPlot() function.
pat_createPATimeseriesObject() now retains additional metadata: sensorManufacturer, targetPollutant, technologyType, communityRegion
airsensor_load() so that it includes monitors found in any month rather than those found in every month.
pat_createPATimeseriesObject() and pat_createAirSensor() so that they no longer generate NaN or Inf values.
as_pollutionRose()
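The difference between "monitors found in any month" and "monitors found in every month" noted for airsensor_load() is the difference between a set union and a set intersection. A minimal base R sketch, using made-up monitor IDs rather than real archive contents:

```r
# Hypothetical monitor IDs present in three monthly files.
monthly_ids <- list(
  jan = c("A", "B", "C"),
  feb = c("B", "C", "D"),
  mar = c("C", "D")
)

# Monitors present in every month: intersection across all months.
ids_every_month <- Reduce(intersect, monthly_ids)

# Monitors present in any month: union across all months.
ids_any_month <- Reduce(union, monthly_ids)

ids_every_month
ids_any_month
```

With these toy inputs the intersection keeps only "C", while the union keeps all four IDs, which is why the "any month" behavior retains sensors that were deployed or retired partway through the period.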
createMonthlyWind_exec()
example_as as an example “airsensor” object
initializeMazamaSpatialUtils() now only sets up logging if it hasn’t already been set up.
local_executables scripts.
pat_loadMonth() to use the newer pat_<label>_<monthstamp>.rda naming system.
pat_monitorComparison().
local_examples/bikesgv_story.Rmd.
pas_filter().
"pm25_a" and "pm25_b" plot types to pat_multiPlot().
pas_within() to pas_filterNear()
airSensorQC_~ functions to PurpleAirQC_~
local_executables/ to be more similar
airsensor data files with pre-generated, hourly-aggregated data suitable for use with the PWFSLSmoke package.
airsensor_load() to load pre-generated, hourly-aggregated data files suitable for use with the PWFSLSmoke package.
hourly_AB_00 and hourly_AB_01.
pat_createAirSensor() to accept arguments that impact the conversion of pat data into aggregated period-averages: period, channel, qc_algorithm, min_count.
pat_aggregate() so that any NaN or Inf values are converted to NA.
The AirSensor package is now almost feature complete with functions for QC and aggregation to an hourly axis.
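The NaN and Inf handling mentioned for pat_aggregate() can be illustrated with a short base R sketch. The helper clean_nonfinite() and the toy dataframe are hypothetical. The package's actual implementation may differ.

```r
# Hypothetical hourly statistics containing NaN and Inf values, as can happen
# when a mean or ratio is computed over an empty or degenerate subset.
hourly <- data.frame(
  pm25_mean = c(12.5, NaN, Inf, 8.1),
  pm25_sd = c(1.2, NA, -Inf, 0.4)
)

# Hypothetical helper: replace every non-finite value in numeric columns with NA.
clean_nonfinite <- function(df) {
  numeric_cols <- vapply(df, is.numeric, logical(1))
  df[numeric_cols] <- lapply(df[numeric_cols], function(x) {
    x[!is.finite(x)] <- NA
    x
  })
  df
}

cleaned <- clean_nonfinite(hourly)
cleaned
```

Using is.finite() covers NaN, Inf and -Inf in one test, and values that were already NA simply stay NA.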
pat_aggregate() that occasionally returned empty columns of data.
local_examples/Jons_qc_1.R with Jon’s best take on appropriate QC of the hourly aggregated data.
pat_createAirSensor() added to barplot
global.R to improve clarity of scope
pas_within() for spatial analysis
pat_createAirSensor() function converts from pat object to airsensor object that is compatible with the PWFSLSmoke package.
pat_qc() argument humidityMax –> max_humidity.
pat_aggregate() to fix an issue with t-test statistics. Also simplified the function signature to accept just pat and period arguments.
pat_load() to default to the most recent week of data
pat_qc() function applies low level QC
pat_outliers() to retain records with missing PM2.5 values when replace = TRUE
pat_aggregate() defaults to: period = "1 hour", quickStats = TRUE
pat_aggregate() dataThreshold argument
example_pat_failure_B dataset with severe A channel errors
pat_loadLatest()
createMonthlyPAT_exec.R
threePlot web-app
dailyAveragePlot web-app
sample() to .sample() to avoid confusion with base::sample()
utils-plot.R added to support general functions
utils-gen.R added to support general functions
local_examples/07_pat_archive.R demonstrates how to efficiently work with pre-generated pat files from the archive
pat_loadMonth() loads pre-generated “pat” objects from a data archive
pat_aggregate() – it now always returns all statistics
plotList parameter from pat_multiPlot()
pat_join() can now accept either individual pat objects or a list of pat objects
local_executables/createMonthlyPAT_exec.R script for populating an archive with pat data files
pat_load() to pat_loadLatest()
subset and weights parameters from pat_internalFit()
param parameter from pat_join()
ast_createAirSensor() converts ast objects into “airsensor” objects that are compatible with the PWFSLSmoke package
initializeMazamaSpatialUtils() now imports all datasets needed to create pas objects
example_pas data file has additional fields introduced by the 0.2.8 version of enhanceSynopticData()
pat_timeAverage() function
pat_aggregate() function performs temporal aggregation
enhanceSynopticData() now handles changing order of json properties and validates locations before adding spatial metadata
pat_createASTimeseries() function handles conversion of Purple Air-specific “pat” objects into sensor-generic “ast” objects.
ast_createAirSensor() converts “ast” objects into an “as” data type that is compatible with the “ws_monitor” data type used in the PWFSLSmoke package
pat_sample(forGraphics = TRUE)
pas_leaflet() and pas_staticMap()
enhanceSynopticMetadata() adds the following columns to a pas object:
airDistrict – CARB air district
sensorManufacturer = "Purple Air"
targetPollutant = "PM"
technologyType = "consumer-grade"
communityRegion – (where known)
example_pat_failure dataset
createASTimeseriesObject()
pas_staticMap() function with customizable base maps and color schemes
pas_esriMap() because the ESRI map service we were using started requiring tokens on April 25, 2019
param to parameter in pas_leaflet()
pat_sample() outlier detection window size to n = 23 to match pat_outliers()
pat~ functions
pat_sample() function
pat~ plot functions
pat_sample() included to sample pat datasets
pat_dygraphs() included to plot JavaScript based “dygraphs”
pat_multiPlot() time axis now in sensor local time
pat_multiPlot() has a new pm25_over plottype
pat_scatterPlot()
pas_esriMap()
local_examples/example_02_pas-filtering.R
pas_filterArea()
pas_isPas(), pas_isEmpty()
pat_internalData() to pat_scatterPlot() with improved functionality
pat_outliers() to pat_outliers() with improved functionality
pat_isPat(), pat_isEmpty(), pat_extractMeta(), pat_extractData()
pat_internalFit() function
pas_filter() to filter toolbox
pas_esriMap()
multi_ggplot()
pas_esriMap()
pat_filterData
pat_subdate() has been renamed to pat_filterDate() and defaults to the America/Los_Angeles timezone
pat_loadLatest() – all requests are assumed to be in the sensor’s local timezone
pas_load() function now downloads pre-generated pas objects
pas_loadLatest() function downloads raw synoptic data from Purple Air and generates a pas object
example_pas and example_raw_pas
data/ directory with sample pas objectvignettes/purple-air-synoptic.Rmd
Initial functions to download and map Purple Air synoptic data.