Exploring the Python Aeon Toolkit

Aeon is a scikit-learn compatible toolkit for time series tasks such as forecasting, classification and clustering. One of my main interests in machine learning is forecasting. Time series data has properties that can make it harder to model unless you first massage the dates or timestamps. Aeon wraps many scikit-learn algorithms as well as more advanced algorithms, such as ARIMA, from other packages.

I collected daily financial time series data with a small amount of seasonality in it. There are some small gaps in the data that I did not clean up. My first attempts to model it using a neural network from scikit-learn showed poor results, especially when forecasting 3-6 months ahead. My first test with Aeon was to use the TrendForecaster. I had about 18 months of data that looked like this:

the_date      amount
04/26/2022    4494.14
04/27/2022    4494.74
04/28/2022    4495.34
04/29/2022    4495.94
05/02/2022    4497.74
05/03/2022    4498.34
05/04/2022    4498.94
05/05/2022    4499.54
05/09/2022    4501.94
05/10/2022    4502.53
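
The gaps are visible even in this short sample. Here is a minimal sketch, using only the standard library and the ten dates shown above, that lists the weekdays missing between consecutive rows. The helper name missing_weekdays is my own for illustration, and a missing weekday could just as well be a holiday as a real hole in the data.

```python
from datetime import datetime, timedelta

# The ten dates from the sample rows above (MM/DD/YYYY).
sample_dates = [
    "04/26/2022", "04/27/2022", "04/28/2022", "04/29/2022", "05/02/2022",
    "05/03/2022", "05/04/2022", "05/05/2022", "05/09/2022", "05/10/2022",
]


def missing_weekdays(date_strings):
    """Return the weekdays (Mon-Fri) that fall between consecutive observations."""
    days = [datetime.strptime(s, "%m/%d/%Y").date() for s in date_strings]
    missing = []
    for prev, curr in zip(days, days[1:]):
        day = prev + timedelta(days=1)
        while day < curr:
            if day.weekday() < 5:  # 0-4 are Monday through Friday
                missing.append(day)
            day += timedelta(days=1)
    return missing


missing = missing_weekdays(sample_dates)
print(missing)
```

Weekends are skipped, so only genuine weekday gaps are reported. Keep in mind that the scripts in this post model the series by row position, so a horizon of 90 means 90 rows ahead, not 90 calendar days.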

It was easy to load the amount as a series in pandas, then run the trend forecaster:

import pandas as pd 
from aeon.forecasting.trend import TrendForecaster

# Load your dataset
data = pd.read_csv('time-series.csv')

y = data.iloc[:, 1]

forecaster = TrendForecaster()
forecaster.fit(y)  # fit the forecaster

predicted_value = forecaster.predict(fh=[90])  # forecast the 90th future value
print(f"The 90th future value is: {predicted_value}")

The result was:

The 90th future value is: 586    5241.873509
dtype: float64

I had held back some of the actual data rather than feeding all of it to the forecaster, and the forecast was very close to the real values. I wasn't sure what the forecaster was doing behind the scenes, so I set up another script using scikit-learn's LinearRegression:

import pandas as pd
from sklearn.linear_model import LinearRegression
import numpy as np

# Load your dataset
df = pd.read_csv('time-series.csv')

amount_series = df['amount']

# Using index as independent variable
X = amount_series.index.values.reshape(-1, 1)  # Reshape necessary for sklearn
y = amount_series.values

# Create and fit the model
model = LinearRegression()
model.fit(X, y)

# Display model coefficients
print(f"Model Coefficient: {model.coef_[0]}")
print(f"Model Intercept: {model.intercept_}")

# Making predictions (predicting y for new indices)
index_to_predict = np.array([[X[-1, 0] + 90]])
predicted_value = model.predict(index_to_predict)
print(f"The 90th future value is: {predicted_value[0]}")

The result was:

Model Coefficient: 1.2591998385571888
Model Intercept: 4503.982403820514 
The 90th future value is: 5241.873509215027
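
As a quick sanity check, the prediction can be reproduced by hand from the printed coefficient and intercept, since a fitted line is just intercept + coefficient * index. The target index of 586 is taken from the label printed next to the Aeon forecast; that it equals X[-1, 0] + 90 here (a last row index of 496) is my inference from the output, not something the scripts print directly.

```python
# Values copied from the scikit-learn output above.
coef = 1.2591998385571888
intercept = 4503.982403820514

# Index label shown beside the Aeon forecast (assumed to match X[-1, 0] + 90).
target_index = 586

# A straight-line model predicts: intercept + slope * index.
prediction = intercept + coef * target_index
print(f"Prediction at index {target_index}: {prediction:.6f}")
```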

I didn't expect the Aeon and scikit-learn results to be exactly the same! Aeon was using a linear regression on the index, but with much less code and without my having to reshape the index explicitly. Since it was not much extra work, I also tested the Aeon PolynomialTrendForecaster: with degree=1 it gave the same result as the TrendForecaster, while with degree=2 the fit was not as good and the forecasts came out too low. The Aeon toolkit is vast and I look forward to trying other parts of the kit on more complicated problems.
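
For anyone who wants to try the degree comparison without Aeon installed, here is a rough sketch that uses numpy.polyfit as a stand-in for the polynomial trend forecaster. The series is synthetic (a straight line plus noise, with parameters I made up), so the numbers it prints say nothing about my real data; it only shows how to hold back the last 90 rows and compare each degree's forecast against the held-back value.

```python
import numpy as np

# Synthetic stand-in for the real series: linear trend plus small noise.
# All parameters here are invented for illustration.
rng = np.random.default_rng(1)
t = np.arange(500)
series = 4500.0 + 1.25 * t + rng.normal(0.0, 2.0, size=t.size)

horizon = 90
train_t, train_y = t[:-horizon], series[:-horizon]  # rows the model sees
target_t, actual = t[-1], series[-1]                # held-back row, 90 steps past the training data

errors = {}
for degree in (1, 2):
    # Least-squares polynomial fit on the row index, then extrapolate.
    coeffs = np.polyfit(train_t, train_y, deg=degree)
    forecast = np.polyval(coeffs, target_t)
    errors[degree] = abs(forecast - actual)
    print(f"degree={degree}: forecast={forecast:.2f}, abs error={errors[degree]:.2f}")
```

A higher degree always fits the training rows at least as closely, but the extra curvature gets amplified when extrapolating, which is one plausible reason a degree=2 forecast can drift away from the actual values.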