
3 Ways to Deal with Heteroskedasticity in Time Series | by Vitor Cerqueira | Dec, 2022


Photo by Samuel Ferrara on Unsplash

This article is a follow-up to my previous post. There, I describe how to detect heteroskedasticity in time series.

We continue to study the issue of non-constant variance here. You'll learn three approaches for dealing with it.

Heteroskedasticity affects the fitting of forecasting models.

Time series with non-constant variance often have a long-tailed distribution. The data is left- or right-skewed. This can impair the learning process of some algorithms. Methods such as deep neural networks are affected by this issue, unlike tree-based approaches.

The estimated coefficients of a linear model will still be unbiased. This means they will be correct, on average. But they won't be the most precise.

Heteroskedasticity also invalidates any statistical test that assumes constant variance among observations.

All in all, you'll leave performance on the table by not dealing with heteroskedasticity.

Suppose you ran a statistical test that confirms the time series is heteroskedastic.

What can you do about that?

Let's look at three possible approaches.

1. Log or Power Transformations

Transforming the data is the go-to approach for removing heteroskedasticity. The goal is to stabilize the variance and bring the distribution closer to the Normal distribution.

The log is an effective transformation for doing this. Taking the square root or cube root are two possible alternatives. These are particular cases of Box-Cox transformations.

Here's how to apply Box-Cox to a time series using Python:

import numpy as np
from pmdarima.datasets import load_airpassengers
from scipy.stats import boxcox
from scipy.special import inv_boxcox

# loading the data
series = load_airpassengers(True)

# transforming the series
# lambda_ is the transformation parameter
series_transformed, lambda_ = boxcox(series)

# reverting to the original scale
original_series = inv_boxcox(series_transformed, lambda_)

# check if it is the same as the original data
np.allclose(original_series, series)
# True

Box-Cox relies on a transformation parameter lambda. But it's optimized under the hood by scipy automatically, so there's no need to worry about it. When the value of lambda is equal to 0, applying the Box-Cox transformation is the same as log scaling.

You can use the function inv_boxcox to revert the transformed data to its original scale.

One limitation of Box-Cox is that it's only defined for positive data. The Yeo-Johnson transformation is similar to the Box-Cox method and addresses this problem.
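To make both points concrete, here's a small sketch (the toy arrays are made up for illustration): fixing lambda at 0 reproduces the log transform, and scipy's yeojohnson accepts non-positive values that would break boxcox.

import numpy as np
from scipy.stats import boxcox, yeojohnson

# toy positive-valued data (illustrative only)
x = np.array([1.0, 2.0, 4.0, 8.0, 16.0])

# Box-Cox with lambda fixed at 0 is exactly the log transform
np.allclose(boxcox(x, lmbda=0), np.log(x))
# True

# Box-Cox is undefined for non-positive data; Yeo-Johnson is not
x_with_zero = np.array([0.0, 1.0, 2.0, 4.0])
x_yj, lambda_yj = yeojohnson(x_with_zero)  # lambda optimized automatically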

Here's the impact of Box-Cox on the example time series used in the script.

Figure 1: Original time series (upper tile) and the respective Box-Cox transformation (lower tile). The variance becomes stable after the transformation. Image by author.

2. Volatility Standardization

Volatility standardization is another way to deal with non-constant variance.

The volatility of a time series is the recent level of variability in the data. This variability is often quantified with the standard deviation. The idea of volatility standardization is to normalize the series based on its volatility, which leads to observations with the same level of variation.

Volatility standardization can be especially useful in data sets with many time series, for example, when training global forecasting models.
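Before the full example below, here's a minimal sketch of the idea on made-up data, assuming each column holds one series: dividing each series by its own standard deviation brings them to a comparable level of variability.

import numpy as np
import pandas as pd

# toy data set: three series with very different scales (made up for illustration)
rng = np.random.default_rng(1)
df = pd.DataFrame({'series_a': rng.normal(0, 1, 100),
                   'series_b': rng.normal(0, 10, 100),
                   'series_c': rng.normal(0, 100, 100)})

# dividing each series by its standard deviation
df_standardized = df / df.std()

# the standard deviation of each column is now 1
print(df_standardized.std())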

Example

Let's look at an example using a multivariate time series.

The goal is to forecast the future values of the different variables in the series. We'll use a global forecasting model to do this. You can learn more about global forecasting models in this introductory article. The basic idea is to build a single forecasting model using several time series as input.

The data set in this example is the wine sales time series. Here's what it looks like:

Figure 2: Wine sales multivariate time series. The goal is to forecast all observations past the testing start mark. The data is publicly available; check reference [3] for the original source. Image by author.

Within each variable (wine type), the variance looks stable over the series. But it's clear that different variables show distinct levels of variability. Volatility standardization can be used to stabilize the variance across variables.

Here's how to apply volatility standardization when building a forecasting model. Check the comments for more context.

import re
import numpy as np
import pandas as pd

# using xgboost as the regression algorithm
from xgboost import XGBRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error as mae

# https://github.com/vcerqueira/blog/
from src.tde import time_delay_embedding

# https://github.com/vcerqueira/blog/tree/main/data
wine = pd.read_csv('data/wine_sales.csv', parse_dates=['date'])
wine.set_index('date', inplace=True)

# train test split
# using the last 20% of data as testing
train, test = train_test_split(wine, test_size=0.2, shuffle=False)

# transforming the time series for supervised learning
train_df, test_df = [], []
for col in wine:
    # using 12 lags to forecast the next value (horizon=1)
    col_train_df = time_delay_embedding(train[col], n_lags=12, horizon=1)
    col_train_df = col_train_df.rename(columns=lambda x: re.sub(col, 'Series', x))
    train_df.append(col_train_df)

    col_test_df = time_delay_embedding(test[col], n_lags=12, horizon=1)
    col_test_df = col_test_df.rename(columns=lambda x: re.sub(col, 'Series', x))
    test_df.append(col_test_df)

# different series are concatenated on rows
# to train a global forecasting model
train_df = pd.concat(train_df, axis=0)
test_df = pd.concat(test_df, axis=0)

# splitting the explanatory variables from target variables
predictor_variables = train_df.columns.str.contains(r'\(t-')
target_variables = train_df.columns.str.contains(r'Series\(t\+')
X_train = train_df.iloc[:, predictor_variables]
Y_train = train_df.iloc[:, target_variables]
X_test = test_df.iloc[:, predictor_variables]
Y_test = test_df.iloc[:, target_variables]

# volatility standardization
# dividing the lags by their standard deviation
X_train_vs = X_train.apply(lambda x: x / x.std(), axis=0)
X_test_vs = X_test.apply(lambda x: x / x.std(), axis=0)

# testing three approaches
## no normalization/preprocessing
mod_raw = XGBRegressor()
## volatility standardization
mod_vs = XGBRegressor()
## log transformation
mod_log = XGBRegressor()

# fitting on raw data
mod_raw.fit(X_train, Y_train)
# fitting with log-scaled data
mod_log.fit(np.log(X_train), np.log(Y_train))
# fitting with vol. std. data
mod_vs.fit(X_train_vs, Y_train)

# making predictions
preds_raw = mod_raw.predict(X_test)
preds_log = np.exp(mod_log.predict(np.log(X_test)))
preds_vs = mod_vs.predict(X_test_vs)

print(mae(Y_test, preds_raw))
# 301.73
print(mae(Y_test, preds_vs))
# 294.74
print(mae(Y_test, preds_log))
# 308.41

Preprocessing the data with volatility standardization improves forecasting performance. It also does better than the log transformation on this problem.

3. Weighted Regression

Another way to handle heteroskedasticity is to select an appropriate method; for example, one that doesn't assume equal variance among observations.

You can also assign weights to observations based on their variability. By default, learning algorithms give equal weight to all observations in a data set. Yet, cases with higher variability carry less information. You can decrease their importance to a method by reducing their weight.

These weights are computed according to the variance of the fitted values. A larger variance leads to a smaller weight. Here's an example from the statsmodels Python library.

Several algorithms in scikit-learn have a sample_weight parameter that you can use to set the weights.
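As a rough sketch of both routes (the data and the weighting rule below are assumptions made up for illustration, not a prescribed recipe): statsmodels implements weighted least squares via WLS, and many scikit-learn estimators accept weights in fit.

import numpy as np
import statsmodels.api as sm
from sklearn.linear_model import LinearRegression

# toy data where the noise grows with x (heteroskedastic by construction)
rng = np.random.default_rng(0)
x = np.linspace(1, 10, 200)
y = 2 * x + rng.normal(0, x)

# statsmodels WLS: weights inversely proportional to the assumed variance
X = sm.add_constant(x)
wls_fit = sm.WLS(y, X, weights=1.0 / x ** 2).fit()
print(wls_fit.params)

# scikit-learn: pass the same weights through the sample_weight parameter
lin_reg = LinearRegression()
lin_reg.fit(x.reshape(-1, 1), y, sample_weight=1.0 / x ** 2)
print(lin_reg.intercept_, lin_reg.coef_)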

Photo by Nikola Knezevic on Unsplash

You can model the changes in variance instead of stabilizing it.

This can be done with models such as ARCH (Auto-Regressive Conditional Heteroskedasticity). This type of model is used to forecast the variance based on the past values of the time series.

GARCH (Generalized Auto-Regressive Conditional Heteroskedasticity) extends ARCH. Besides using the past values of the series, it also uses past variances.

The arch library provides a Python implementation of these methods.
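Here's a minimal sketch of fitting a GARCH(1, 1) model with the arch library; the simulated returns below are a stand-in for a real series such as stock returns.

import numpy as np
from arch import arch_model

# simulated returns as a stand-in for a real time series
rng = np.random.default_rng(0)
returns = rng.normal(0, 1, 1000)

# GARCH(1, 1): the variance is modeled from past squared shocks and past variances
model = arch_model(returns, vol='GARCH', p=1, q=1)
result = model.fit(disp='off')

# forecasting the variance for the next 12 periods
print(result.forecast(horizon=12).variance)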

In this article, you learned how to deal with heteroskedasticity in time series. We covered three approaches:

  1. Log or power transformations;
  2. Volatility standardization;
  3. Weighted regression.

My personal preference is to transform the data. I go for option 1 when I'm dealing with a single time series. For several time series, I tend to use option 2. But this is something you can optimize using cross-validation.

You can also model the variance directly using ARCH/GARCH methods.

Thanks for reading, and see you in the next story!

Further Readings

[1] On power transformations, Forecasting: Principles and Practice

[2] On weighted regression, Engineering Statistics Handbook

[3] Rob Hyndman and Yangzhuoran Yang (2018). tsdl: Time Series Data Library. v0.1.0. https://pkg.yangzhuoranyang.com/tsdl/ (GPL-3)
