Average rainfall 2001-2016, global tropics

Map: Average rainfall 2001-2016, global tropics

The Seasonal python package

Thomas Gumbricht bio photo By Thomas Gumbricht

Introduction

In this post I will take you through the editing I have done to the package seasonal by Will Welch, for fitting it to the needs of Karttur’s GeoImagine Framework.

Install Seasonal

To install the seasonal pacakge, either use pip or conda. At time of writing the conda installer was a bit outdate, and I used pip for installing seasonal:

$ pip install seasonal

Instructions for installations using conda :

$ conda install -c derickl seasonal

Once installed, you can use the seasonal package as instructed on the GitHub page.

The original package, however, includes some implicit assumptions that are converted to explicit parameters when used with Karttur´s GeoImagine Framework. I then also added the option of filtering using a customised kernel.

Editing the Seasonal source code

For use with karttur´s GeoImagine Framework, changes are made to the following modules of the seasonal package:

  • seasonal.py
  • trend.py

If you have installed the complete Spatial Data Integrated Development Environment (SPIDE) using Anaconda, the path to your site-packages should be something like:

$ /Users/youruser/anaconda3/lib/python3.6/site-packages

under the site-packages you should now have the folder seasonal.

To edit the modules, you can use Eclipse. Use the menu system File: Open File….

seasonal.py

def fit_seasons

The function fit_seasons is expanded and includes three additional boolean variables trendonly, forceseason, and kernel, all set to False as default.

If terndonly is set to True, fit_seasons only returns the trend (speeding up processing). If forceseason is set to True, the seasonal component is calculated regardless of the significance with which it is calculated. If the kernel parameter is set to a list ([i,j,k,j,i]), this kernel will be used for filtering the time series.

def fit_seasons(data, trend="spline", period=None, min_ev=0.05,
                periodogram_thresh=0.5, ptimes=1, trendonly=False, forceseason=False, kernel=False):

    trendtype = trend
    if trend is None:
        trend = np.zeros(len(data))
    elif not isinstance(trend,  np.ndarray):
        trend = fit_trend(data, kind=trend, period=period, ptimes=ptimes, kernel=kernel)
    else:
        assert isinstance(trend,  np.ndarray)
    if trendonly:
        return (None, trend)
    data = data - trend
    var = data.var()
    if np.isclose(var, 0.0):
        return (None, trend)

    if period:
        # compute seasonal offsets for given period
        start= period; stop = len(data)-period
        cv_mse, cv_seasons = gcv(data[start:stop], period)
        '''
        if trendtype == 'kernel':
            start= int(period/2); stop = len(data)-int(period/2)
            cv_mse, cv_seasons = gcv(data[start:stop], period)
        else:
            cv_mse, cv_seasons = gcv(data, period)
        '''
        if forceseason:
            return (cv_seasons, trend)
        fev = 1 - cv_mse / var
        if np.isclose(cv_mse, 0.0) or fev >= min_ev:
            return (cv_seasons, trend)
        else:
            return (None, trend)
    if periodogram_thresh and period is None:
        # find intervals containing best period
        peaks = periodogram_peaks(data, thresh=periodogram_thresh)
        if peaks is None:
            return (None, trend)
        peaks = sorted(peaks)
    else:
        # search everything (XXX parameterize this)
        peaks = [(0, 0, 4, len(data) // 2)]
    cv_mse, cv_seasons = np.inf, []
    period = 0
    for interval in peaks:
        period = max(period, interval[2])
        while period <= interval[3]:
            _mse, _seasons = gcv(data, period)
            if _mse < cv_mse:
                cv_mse, cv_seasons = _mse, _seasons
            period += 1
    if np.isclose(cv_mse, 0.0) or min_ev <= 1 - cv_mse / var:
        return (cv_seasons, trend)
    else:
        return (None, trend)

trend.py

def fit_seasons

In Karttur´s version the function fit_seasons in the module trend.py expects the additional variable kernel. If kernel is set to False (default) the parameter has no effect. If, however, a kernel is defined, this array will be used for filtering the time series data.

Also the parameter ptimes is changed in Karttur´s version. In the original package ptimes is hardcoded to 2, whereas is Karttur’s version it is defaulted to 1, but can be set to any integer.

def fit_trend(data, kind="spline", period=None, ptimes=1, kernel=np.array([])):
    print 'ptimes',ptimes

    """Fit a trend for a possibly noisy, periodic timeseries.

    Trend may be modeled by a line, cubic spline, or mean or median
    filtered series.

    Parameters
    ----------
    data : ndarray
        list of observed values
    kind : string ("mean", "median", "line", "spline","kernel", None)
        if mean, apply a period-based mean filter
        if median, apply a period-based median filter
        if line, fit a slope to median-filtered data.
        if spline, fit a piecewise cubic spline to the data
        if None, return zeros
    period : number
        seasonal periodicity, for filtering the trend.
        if None, will be estimated.
    ptimes : number
        multiple of period to use as smoothing window size
    kernel : ndarray
        kernel array

    Returns
    -------
    trend : ndarray

    """
    if kind is None:
        return np.zeros(len(data)) + np.mean(data)
    if period is None:
        period = guess_trended_period(data)
    window = (int(period * ptimes) // 2) * 2 - 1 # odd window
    if kind == "median":
        filtered = aglet(median_filter(data, window), window)
    elif kind == "mean":
        filtered = aglet(mean_filter(data, window), window)
    elif kind == "line":
        filtered = line_filter(data, window)
    elif kind == "spline":
        #changed by tg to just 1  
        '''      
        nsegs = len(data) // (window * 2) + 1
        filtered = aglet(spline_filter(data, nsegs), window)
        '''
        nsegs = len(data) // (window * 1) + 1
        filtered = aglet(spline_filter(data, nsegs), window)
        '''
        nsegs = len(data) / window
        print 'window',window
        print 'len(data)',len(data)
        print 'nsegs',nsegs
        #BALLE
        #win = int(window/2)
        filtered = aglet(spline_filter(data, nsegs), window)
        '''
    elif kind == "kernel":
        #ADDED BY TG
        kernel /= kernel.sum()

        filtered = np.convolve(data, kernel, mode='same')

    else:
        raise Exception("adjust_trend: unknown filter type {}".format(kind))
    return filtered