Predicting gold prices with SARIMAX

I had a little fun with data from the St. Louis Fed (FRED). Note: this is not investment advice.

I collected 54 years of gold price data starting 9/30/71 (the end of the US federal government fiscal year after gold was allowed to float). I also collected the 10-year Treasury rate, US public debt, and inflation data for the same dates. I used Python to run some models and predict the end-of-fiscal-year prices for the next three years. This is what the data looked like in a pandas DataFrame:

I did a little cleaning and scaled the debt down to billions. Since this is a time series, a histogram doesn’t tell us much beyond historical reference. The scatter plot shows where prices have gone, including the recent run-up.

The initial correlations show a strong positive relationship between price and US debt, and negative correlations with the 10-year Treasury rate and inflation. However, the relationships are somewhat more complex than these single numbers suggest.

Correlation of features

Feature            Correlation with price
Price               1.000000
debt (billions)     0.951928
treasury_rate      -0.634155
inflation          -0.288526
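
For readers who want to reproduce a table like this, here is a minimal sketch of one way to get it with pandas. The data below is synthetic and made up purely for illustration (it is not the FRED data), and the column names are assumptions modeled on the labels in the table above.

```python
import numpy as np
import pandas as pd

# Synthetic placeholder data, NOT the FRED series used in this post.
rng = np.random.default_rng(0)
n = 54  # one row per fiscal year end
debt = np.linspace(400, 35000, n)  # hypothetical debt in billions

df = pd.DataFrame({
    "price": 0.08 * debt + rng.normal(0, 100, n),
    "debt (billions)": debt,
    "treasury_rate": np.linspace(8, 3, n) + rng.normal(0, 1, n),
    "inflation": rng.normal(3, 1.5, n),
})

# Pearson correlation of every column with the price, strongest first
corr_with_price = df.corr()["price"].sort_values(ascending=False)
print(corr_with_price)
```

Keep in mind that two series that both trend upward over time will show a high correlation whether or not one drives the other, which is part of why the raw numbers need a closer look.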

I initially built two machine learning models that fit the historical prices quite well but did poorly at predicting future values. Time series data is better handled with other methods. I made one attempt using Meta’s Prophet model, but could not find very good documentation on it. I ended up using the statsmodels SARIMAX model (Seasonal AutoRegressive Integrated Moving Average with eXogenous regressors), which is designed for time series data. I trained the model on all of the price data and treated the extra features as exogenous variables. For future predictions, I created a set of estimated data based on the following assumptions:

  • the US debt continues to grow at the mean rate from 2017-2024

  • inflation drops to 2.7% and stays there

  • the 10-year treasury rate drops to 4.2% in 2025, then to 3.8% for the following two years
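
The assumptions above can be turned into a small table of future exogenous values. The sketch below shows one way to do it. The debt figures are placeholder numbers, not the real FRED values, and reading "mean rate" as the average year-over-year percentage change is my assumption here.

```python
import pandas as pd

# Placeholder fiscal-year-end debt figures in billions (NOT real data)
debt = pd.Series(
    [20000.0, 21000.0, 22500.0, 26500.0, 28000.0, 30500.0, 33000.0, 35000.0],
    index=range(2017, 2025),
)

# Mean year-over-year growth rate across 2017-2024
mean_growth = debt.pct_change().dropna().mean()

# Compound the last known debt value forward three years
last_debt = debt.iloc[-1]
future_exog = pd.DataFrame(
    {
        "debt (billions)": [last_debt * (1 + mean_growth) ** k for k in (1, 2, 3)],
        "treasury_rate": [4.2, 3.8, 3.8],  # drops to 4.2%, then 3.8%
        "inflation": [2.7, 2.7, 2.7],      # drops to 2.7% and stays there
    },
    index=[2025, 2026, 2027],
)
print(future_exog)
```

The columns need to be in the same order as the exogenous columns used to train the model, since the forecast step matches them by position.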

I used the get_forecast() method on the trained model and got predicted prices for the next three years. The model found that only US debt was statistically significant in determining gold prices, which was a little surprising. The kurtosis was 4.8, well above the normal distribution value of 3, so the distribution has heavier tails and more room for large fluctuations. This created a pretty wide confidence interval in the predictions.

Predicted Gold Prices through 2027

Date         Predicted price
9/30/2025    $2,838.27
9/30/2026    $2,985.15
9/30/2027    $3,130.70

If you are interested in playing around with the Jupyter notebook for this model, you can find it on GitHub.