Reduce the number of records (timesteps) in the dataframe of the incoming mts through random sampling.
mts_sample(mts = NULL, sampleSize = 5000, seed = NULL, keepOutliers = FALSE, width = 5, thresholdMin = 3)
sampleSize: Non-negative integer giving the number of rows to choose.
seed: Integer passed to set.seed for reproducible sampling.
keepOutliers: Logical specifying a graphics-focused sampling algorithm that retains outliers (see Details).
width: Integer width of the rolling window used for outlier detection.
thresholdMin: Numeric threshold for outlier detection.
A subset of the given mts object: an mts time series object with fewer timesteps.
(A list with meta and data dataframes.)
When keepOutliers = FALSE, random sampling is used to provide
a statistically relevant subsample of the data.
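The random sampling case can be illustrated with a minimal base R sketch. It does not reproduce the package's internal code: the helper sample_rows and the synthetic dataframe are hypothetical, and returning the data unchanged when it already has sampleSize rows or fewer is an assumption made here.

```r
# Synthetic stand-in for the time-indexed dataframe of an mts object:
# a datetime column plus one column of hourly values (one year of data).
set.seed(42)
n <- 24 * 365
data <- data.frame(
  datetime = seq(as.POSIXct("2023-01-01 00:00", tz = "UTC"),
                 by = "hour", length.out = n),
  value = cumsum(rnorm(n))
)

# Hypothetical helper (not part of the package): keep `sampleSize` rows
# chosen at random, preserving the original time order.
sample_rows <- function(data, sampleSize = 5000, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)            # reproducible sampling
  if (nrow(data) <= sampleSize) return(data)    # assumption: nothing to reduce
  keep <- sort(sample.int(nrow(data), size = sampleSize))
  data[keep, , drop = FALSE]
}

subsample <- sample_rows(data, sampleSize = 5000, seed = 123)
nrow(subsample)
```

Sorting the sampled row indices keeps the result in time order, so it can still be plotted as a time series.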
When keepOutliers = TRUE, a customized sampling algorithm is used that
attempts to create subsets whose plots are visually identical to plots
using all data. This is accomplished by preserving outliers and only
sampling data in regions where overplotting is expected.
The process is as follows:
1. find outliers using the rolling-window detection controlled by width and thresholdMin
2. create a subset consisting of only outliers
3. sample the remaining data
4. merge the outliers and sampled data
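The four steps can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: flag_outliers is a stand-in rolling median/MAD detector that merely takes width and threshold arguments analogous to width and thresholdMin, sample_keep_outliers is a hypothetical helper, and capping the total at sampleSize rows where possible is a choice made for this sketch.

```r
# Stand-in outlier detector (an assumption, not the package's detector):
# flag points lying more than `threshold` scaled MADs from the median of
# a centered rolling window of `width` points.
flag_outliers <- function(x, width = 5, threshold = 3) {
  half <- width %/% 2
  n <- length(x)
  flagged <- logical(n)
  if (n < width) return(flagged)
  for (i in seq(half + 1, n - half)) {
    window <- x[(i - half):(i + half)]
    center <- median(window)
    spread <- mad(window, center = center)
    flagged[i] <- isTRUE(spread > 0 && abs(x[i] - center) > threshold * spread)
  }
  flagged
}

# Hypothetical helper following the four steps; expects a `value` column.
sample_keep_outliers <- function(data, sampleSize = 5000, seed = NULL,
                                 width = 5, thresholdMin = 3) {
  if (nrow(data) <= sampleSize) return(data)
  if (!is.null(seed)) set.seed(seed)
  is_outlier <- flag_outliers(data$value, width, thresholdMin)  # step 1
  outlier_rows <- which(is_outlier)                             # step 2
  other_rows <- which(!is_outlier)
  # Step 3: fill the remaining budget; if outliers alone exceed
  # sampleSize, all of them are kept and nothing else is sampled.
  n_other <- min(length(other_rows),
                 max(sampleSize - length(outlier_rows), 0))
  sampled_rows <- other_rows[sample.int(length(other_rows), size = n_other)]
  # Step 4: merge and restore time order.
  data[sort(c(outlier_rows, sampled_rows)), , drop = FALSE]
}

# Synthetic hourly data: a daily cycle, slight noise, five injected spikes.
set.seed(1)
n <- 24 * 365
data <- data.frame(
  datetime = seq(as.POSIXct("2023-01-01 00:00", tz = "UTC"),
                 by = "hour", length.out = n),
  value = sin(2 * pi * seq_len(n) / 24) + rnorm(n, sd = 0.01)
)
spikes <- c(500, 2000, 3500, 5000, 6500)
data$value[spikes] <- data$value[spikes] + 10

subsample <- sample_keep_outliers(data, sampleSize = 2000, seed = 123)
```

Because the flagged rows bypass the random draw, the injected spikes remain in the subsample even though fewer than a quarter of the rows are kept.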
This algorithm works best when the mts object has only one or two timeseries.
The width and thresholdMin parameters determine the number of
outliers detected. For hourly data, a width of 5 and a thresholdMin
of 3 or 4 seem to find many visually obvious outliers.
Users attempting to optimize plotting speed for lengthy time series are
encouraged to experiment with these two parameters along with
sampleSize and review the results visually.
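One way to structure such an experiment is to tabulate how many points are flagged for each combination of window width and threshold before comparing plots. The sketch below is only illustrative: count_flagged is a hypothetical helper built on a rolling median/MAD rule, not the package's detector, so its counts will differ from those produced inside mts_sample; the synthetic series is also an assumption.

```r
# Hypothetical helper: count points lying more than `threshold` scaled
# MADs from the median of a centered rolling window of `width` points.
count_flagged <- function(x, width, threshold) {
  half <- width %/% 2
  idx <- seq(half + 1, length(x) - half)
  sum(vapply(idx, function(i) {
    w <- x[(i - half):(i + half)]
    s <- mad(w)
    isTRUE(s > 0 && abs(x[i] - median(w)) > threshold * s)
  }, logical(1)))
}

# Synthetic noisy hourly series (60 days).
set.seed(7)
x <- sin(2 * pi * seq_len(24 * 60) / 24) + rnorm(24 * 60, sd = 0.3)

# Every combination of the two parameters; width varies fastest.
grid <- expand.grid(width = c(5, 9, 25), threshold = c(3, 4, 6))
grid$n_flagged <- mapply(count_flagged,
                         width = grid$width, threshold = grid$threshold,
                         MoreArgs = list(x = x))
print(grid)
```

For a fixed width, raising the threshold can only reduce the number of flagged points, so a table like this helps narrow the range of settings worth reviewing visually.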