Introduction
In this post I will take you through the editing I have done to the package seasonal by Will Welch, for fitting it to the needs of Karttur’s GeoImagine Framework.
Install Seasonal
To install the seasonal pacakge, either use pip or conda. At time of writing the conda installer was a bit outdate, and I used pip for installing seasonal:
$ pip install seasonal
Instructions for installations using conda :
$ conda install -c derickl seasonal
Once installed, you can use the seasonal package as instructed on the GitHub page.
The original package, however, includes some implicit assumptions that are converted to explicit parameters when used with Karttur´s GeoImagine Framework. I then also added the option of filtering using a customised kernel.
Editing the Seasonal source code
For use with karttur´s GeoImagine Framework, changes are made to the following modules of the seasonal package:
- seasonal.py
- trend.py
If you have installed the complete Spatial Data Integrated Development Environment (SPIDE) using Anaconda, the path to your site-packages should be something like:
$ /Users/youruser/anaconda3/lib/python3.6/site-packages
under the site-packages you should now have the folder seasonal.
To edit the modules, you can use Eclipse. Use the menu system .
seasonal.py
def fit_seasons
The function fit_seasons is expanded and includes three additional boolean variables trendonly, forceseason, and kernel, all set to False as default.
If terndonly is set to True, fit_seasons only returns the trend (speeding up processing). If forceseason is set to True, the seasonal component is calculated regardless of the significance with which it is calculated. If the kernel parameter is set to a list ([i,j,k,j,i]), this kernel will be used for filtering the time series.
def fit_seasons(data, trend="spline", period=None, min_ev=0.05,
periodogram_thresh=0.5, ptimes=1, trendonly=False, forceseason=False, kernel=False):
trendtype = trend
if trend is None:
trend = np.zeros(len(data))
elif not isinstance(trend, np.ndarray):
trend = fit_trend(data, kind=trend, period=period, ptimes=ptimes, kernel=kernel)
else:
assert isinstance(trend, np.ndarray)
if trendonly:
return (None, trend)
data = data - trend
var = data.var()
if np.isclose(var, 0.0):
return (None, trend)
if period:
# compute seasonal offsets for given period
start= period; stop = len(data)-period
cv_mse, cv_seasons = gcv(data[start:stop], period)
'''
if trendtype == 'kernel':
start= int(period/2); stop = len(data)-int(period/2)
cv_mse, cv_seasons = gcv(data[start:stop], period)
else:
cv_mse, cv_seasons = gcv(data, period)
'''
if forceseason:
return (cv_seasons, trend)
fev = 1 - cv_mse / var
if np.isclose(cv_mse, 0.0) or fev >= min_ev:
return (cv_seasons, trend)
else:
return (None, trend)
if periodogram_thresh and period is None:
# find intervals containing best period
peaks = periodogram_peaks(data, thresh=periodogram_thresh)
if peaks is None:
return (None, trend)
peaks = sorted(peaks)
else:
# search everything (XXX parameterize this)
peaks = [(0, 0, 4, len(data) // 2)]
cv_mse, cv_seasons = np.inf, []
period = 0
for interval in peaks:
period = max(period, interval[2])
while period <= interval[3]:
_mse, _seasons = gcv(data, period)
if _mse < cv_mse:
cv_mse, cv_seasons = _mse, _seasons
period += 1
if np.isclose(cv_mse, 0.0) or min_ev <= 1 - cv_mse / var:
return (cv_seasons, trend)
else:
return (None, trend)
trend.py
def fit_seasons
In Karttur´s version the function fit_seasons in the module trend.py expects the additional variable kernel. If kernel is set to False (default) the parameter has no effect. If, however, a kernel is defined, this array will be used for filtering the time series data.
Also the parameter ptimes is changed in Karttur´s version. In the original package ptimes is hardcoded to 2, whereas is Karttur’s version it is defaulted to 1, but can be set to any integer.
def fit_trend(data, kind="spline", period=None, ptimes=1, kernel=np.array([])):
print 'ptimes',ptimes
"""Fit a trend for a possibly noisy, periodic timeseries.
Trend may be modeled by a line, cubic spline, or mean or median
filtered series.
Parameters
----------
data : ndarray
list of observed values
kind : string ("mean", "median", "line", "spline","kernel", None)
if mean, apply a period-based mean filter
if median, apply a period-based median filter
if line, fit a slope to median-filtered data.
if spline, fit a piecewise cubic spline to the data
if None, return zeros
period : number
seasonal periodicity, for filtering the trend.
if None, will be estimated.
ptimes : number
multiple of period to use as smoothing window size
kernel : ndarray
kernel array
Returns
-------
trend : ndarray
"""
if kind is None:
return np.zeros(len(data)) + np.mean(data)
if period is None:
period = guess_trended_period(data)
window = (int(period * ptimes) // 2) * 2 - 1 # odd window
if kind == "median":
filtered = aglet(median_filter(data, window), window)
elif kind == "mean":
filtered = aglet(mean_filter(data, window), window)
elif kind == "line":
filtered = line_filter(data, window)
elif kind == "spline":
#changed by tg to just 1
'''
nsegs = len(data) // (window * 2) + 1
filtered = aglet(spline_filter(data, nsegs), window)
'''
nsegs = len(data) // (window * 1) + 1
filtered = aglet(spline_filter(data, nsegs), window)
'''
nsegs = len(data) / window
print 'window',window
print 'len(data)',len(data)
print 'nsegs',nsegs
#BALLE
#win = int(window/2)
filtered = aglet(spline_filter(data, nsegs), window)
'''
elif kind == "kernel":
#ADDED BY TG
kernel /= kernel.sum()
filtered = np.convolve(data, kernel, mode='same')
else:
raise Exception("adjust_trend: unknown filter type {}".format(kind))
return filtered