Typical Year Forecasting of Electricity Prices

Improve your energy project modelling with this simple & flexible forecasting technique.

12 minute read

created: 2022-11-19, updated: 2022-12-04

Energy prices are volatile - the prices of energy commodities such as gas, oil and electricity all change year on year, driven by politics, technology, climate and the weather.

Prices are signals that guide capital allocation to energy projects - the economic viability of solar, battery and energy efficiency all depend on the energy price assumptions used in to model their economic return.

This post will show why the standard industry approaches for handling energy prices for investments in energy assets are hiding a huge source of error - variance in estimations of project performance that occur due to improper handling of energy price data.

You can find supporting materials (source code & data) for this work at adgefficiency/typical-year-forecasting-electricity-prices.

What is a Typical Year Forecast?

A typical year forecast uses historical data to create a single, synthetic year of data.

This single year forecast is suitable for use in business case modelling of energy projects - it’s not suitable for short term dispatch of energy assets.

A typical year forecast has the following advantages:

  • simple to create - no machine learning, gradients or iterative calculations,
  • interpretable - easy to understand why one sample is selected over others,
  • realistic - the forecast is made from real historical data,
  • domain flexible - can be used with any time series,
  • statistically flexible - can use a range of statistics to define what typical means.

A typical year forecast has the following disadvantages:

  • data quantity - requires at least 2 years of historical data,
  • domain knowledge - requires selection & weighting of statistics.

An example of a typical year forecast is a typical metrological year (TMY) forecast, used to create a dataset of typical year of weather. TMY forecasts are commonly used in modelling solar generation or building energy use.

The idea & inspiration for this post came from using the TMY forecast produced by Solcast - thanks Solcast for the inspiration!

The Problem with the Standard Industry Approach

Estimating the economic performance (simple payback, IRR, NPV or rate of return on capital) of an investment in an energy project requires combining two models - a technical model and a financial model.

Commonly the technical model will model a single year in isolation, and is used as an input to the financial model.

The financial model will model multiple years over time (to model economic return over time), using the technical results as the basis for the first year with the financial inputs (such as prices) forecasted forward based on the single year technical results.

In the absence of forecasted energy prices across the future project lifetime, energy prices are often modelled in a similar way to the technical model - taking a single reference year of prices and forecasting them forward with assumptions of inflation.

A simple example of how a technical & financial model combine is given below:

  • a technical model outputs annual savings of 150 MWh of electricity,
  • we assume electricity prices at 100 $/MWh
  • capital investment is estimated at $ 25,000.

The technical inputs & price assumptions are then forecast forward (here without inflation) to calculate cumulative savings:

year capex savings_mwh price savings_$ cumulative_savings_$
0 25000 150 100 15000 -10000
1 0 150 100 15000 5000
2 0 150 100 15000 20000
3 0 150 100 15000 35000

It’s not common to see both the project capex and savings in the same year (usually you need to build something before it gives a saving) - for this simple example please forgive this!

Why Using The Most Recent Prices is Wrong

Choosing the reference year for prices is commonly done by:

  • taking the most recent prices,
  • taking the most recent full calendar year of prices,
  • taking the prices that align with the technical model.

If we were setting up our model in November 2022 with a technical model based on 2019 data, the standard industry approach would likely be one of the following:

  • the most recent prices - October 2021 to September 2022,
  • the most recent calendar year - January 2021 to December 2021,
  • prices that align with the technical data - January 2019 to December 2019.

Below we will demonstrate why all of these commonly used methodologies introduce a large source of error.

Error of Using Recent Prices

In our example above, we assumed prices at 100 $/MWh. The figure below uses the same financial model with the actual annual average electricity prices for South Australia:

Project savings versus annual average electricity prices.

Look at the variance of these results! Around half of our projects lose money, with the other half being profitable.

This variance error that the standard industry approaches are hiding - normally we only get a single estimate, without seeing the spread across different years of price data.

This variance in project performance is only occurring based on when we do our modelling - not based on the fundamental, underlying economics of the project.

We can do better!

Creating a Typical Year Forecast

Creating a typical year forecast requires defining what typical means.

For these forecasts we will define typical as similarity - our typical year forecast will be made of samples of data that are most similar to all the other data.

We can quantify similarity by defining an error metric - the error between statistics measured across all our data and statistics measured across a candidate sample. The samples that minimize this error will be selected and used in our forecast.

For our first typical year forecast, we will create a forecast based on a single statistic - the average price within a month.

The basic idea is as follows:

#  Creating a Typical Year Forecast based on the Mean with 5 Years of Historical Data.

#  Iterate across each month in a year (12 months in total).
for each month in a year (Jan, Feb ... Nov, Dec)
 
  #  Calculate one long term statistic across all 5 years for this one month.
  long_term_mean = historical_data[month].mean()

  #  Iterate across our historical data, selecting this one month,
  #  5 months across 5 years, all the same month.
  for year in historical_data
    sample_mean = year[month].mean()

    #  Calculate the error of this month versus the long term statistic.
    sample_error = absolute(sample_mean - long_term_mean)

  #  Select the sample with the lowest sample error,
  #  this is the historical month we will use in our typical year forecast.
  selected_sample = argmin(sample_errors)

After following this procedure, we will select 12 monthly samples - one for each month in a year, creating our typical year forecast.

Typical Year Forecast for South Australian Electricity Prices

To further demonstrate the idea, we will first limit ourselves to forecasting a single month - January, for electricity prices in South Australia, using 10 years of historical data.

Let’s first start by calculating our long term statistic - the average price in January across the entire dataset, which is 85.449 $/MWh.

We can then look at what the average price was in each January and calculate the error versus the long term statistic.

This leads us to selecting January 2017 as our typical month of electricity prices:

year month price-mean long-term-mean error-mean
2012 January 25.6153 85.449 59.8337
2013 January 59.1246 85.449 26.3244
2014 January 88.8675 85.449 3.41845
2015 January 34.68 85.449 50.769
2016 January 50.2573 85.449 35.1917
2017 January 84.2589 85.449 1.19009
2018 January 158.757 85.449 73.3081
2019 January 241.025 85.449 155.576
2020 January 83.2037 85.449 2.24526
2021 January 28.7008 85.449 56.7482

We can then repeat the procedure above to forecast the remaining 11 months of the year, ending up with 12 months that make up our typical year forecast:

month year price-mean long-term-mean error-mean
January 2017 84.2589 85.449 1.19009
February 2020 64.1771 71.2239 7.04685
March 2021 68.7727 66.6858 2.08692
April 2021 52.1361 64.1214 11.9854
May 2016 70.6976 70.1316 0.565976
June 2021 84.3886 81.6753 2.71335
July 2021 91.1873 94.7737 3.58638
August 2016 66.2397 64.8625 1.37717
September 2012 53.7977 54.7594 0.961707
October 2012 50.9616 52.3186 1.35705
November 2016 61.8883 57.3279 4.56045
December 2015 66.8321 67.2765 0.444369

Our typical year forecast, in all it’s light blue glory:

Typical year forecast using the mean as a statistic.

We can compare this typical year forecast to actual historical prices - for the years where we have sampled our typical month from, our forecast directly overlaps the historical data:

Comparing our typical year forecast using the mean as a statistic to historical data.

Extending the Forecast With More Statistics

Above we only considered the mean when selecting a month. The mean is a measurement of the central tendency of a distribution - using the mean to select a month will mean our forecast has a similar central point to the long term average.

For some energy models, the variance is more important than the average.

The variance is how spread out prices are - it’s important for batteries operating in wholesale arbitrage, as this spread puts an upper limit on the profitability of shifting of electricity between intervals can be.

Our procedure for creating a typical year forecast based on both the mean and the variance is similar to only considering the mean.

We instead calculate two additional statistics (the long term standard deviation and the sample standard deviation), and include them in our sample error:

#  Creating a typical year forecast based on the mean & standard deviation

#  Iterate across each month in a year.
for month in (Jan, Feb ... Nov, Dec):

  #  Calculate two statistics - long term mean & standard deviation.
  long_term_mean = data.mean()
  long_term_std = data.std()

  #  Iterate across historical data & calculate sample errors,
  #  using both long term statistics
  for year in (historical data):
    sample_mean = month.year.mean()
    sample_std = month.year.std()
    sample_error = absolute(long_term_mean - sample_mean) + absolute(long_term_std - sample_std)

  #  Select sample that minimizes error.
  selected_sample = argmin(sample_errors)

Taking this approach again, we end up with our typical year forecast - different from our previous forecast where we only used the mean:

month year price-mean long-term-mean price-std long-term-std error
January 2020 83.2037 85.449 519.785 504.705 17.3251
February 2018 109.17 71.2239 290.873 300.955 48.0282
March 2020 46.9517 66.6858 225.829 271.301 65.2057
April 2015 39.9493 64.1214 100.387 99.2508 25.3085
May 2016 70.6976 70.1316 132.686 133.63 1.5091
June 2021 84.3886 81.6753 96.1186 130.305 36.8999
July 2015 73.5053 94.7737 226.191 236.491 31.5684
August 2013 71.2364 64.8625 88.1036 103.648 21.9185
September 2012 53.7977 54.7594 62.1015 75.617 14.4772
October 2019 67.3398 52.3186 92.2279 108.001 30.7947
November 2019 50.8623 57.3279 88.3317 109.014 27.1474
December 2013 79.5734 67.2765 372.848 318.756 66.3892

We can compare our two typical year forecasts directly:

Typical year forecast using the mean as a statistic.

Typical year forecasting based on both the mean and the variance is selecting months with higher prices - including more of the tasty price spikes that makes Australia’s National Electricity Market (NEM) so interesting for battery storage.

Evaluating the Typical Year Forecast

Let’s return to our original motivating example, with an additional estimate of our project cumulative savings using our typical year forecast based on using the mean (show as 2052 in green):

Typical year forecast using the mean as a statistic.

How great is that!

Our typical year forecast does a fantastic job of cutting through the variance - modelling our project right in the middle of the high variance estimates we get when taking the traditional, industry standard approaches of using historical price data.

No longer are we slaves to the cruel master of time (well, perhaps we still are) - as the years go by, our estimation of project economics will stay stable and consistent, rather than varying wildly based on when we are doing our modelling.

As new price data becomes available, our typical year forecast will change (due to both the long term statistics changing, or recent data being more typical), but the variance from these changes will be minor compared to the massive year on year swings we get with the standard industry approaches.

Discussion

Above we have seen how great our typical year forecast is at reducing the variance of our estimates of project performance - let’s now discuss some challenges and potential extensions to this simple typical year forecasting method.

Challenges

Data Quantity

This methodology requires multiple years of data - if we only have access to a single year, this method is not appropriate.

Alignment

One problem that arises when concatenating interval data from different time periods together is alignment at the intersection - the sample below from the typical year forecast produced above shows the issue - our forecast jumps from Tuesday in January 2017 to Friday 2020:

forecast original-timestamps price day-of-week-forecast day-of-week-original
2052-01-31 23:50:00 2017-01-31 23:50:00 39.52 2 1
2052-01-31 23:55:00 2017-01-31 23:55:00 39.52 2 1
2052-02-01 00:00:00 2020-02-01 00:00:00 299.2 3 5
2052-02-01 00:05:00 2020-02-01 00:05:00 299.2 3 5

This misalignment will cause issues with the incorrect number of weekdays or weekends in a year - important as energy demand and price has strong weekly seasonality.

This alignment problem also occurs when you don’t use a typical year forecast - for example if you use price data from 2022 with technical data from 2010.

Domain Expertise

Domain expertise is required to setup a typical year forecast - primarily in defining the appropriate statistics.

Using multiple statistics can also require weighting - for example if the standard deviation is orders of magnitude higher than the mean, we may want to weight the mean higher.

Extensions & Improvements

Higher Frequency Sampling

In the examples above we have selected samples on a monthly basis - it is possible to instead select samples on a different frequency, such as week of the year (52 weeks) or day of the year (365 days).

More Statistics

One advantage of this methodology are flexibility of statistics we choose - unlike a loss function for a neural network, they do not need to be differentiable.

For example, we could use statistics like:

  • mean, median, mode,
  • number of time periods above a threshold price,
  • number of negative prices.

This is an exciting feature of typical year forecasting - the flexibility and simplicity of using any statistic that aligns with what your technical and financial models need to align with your business goals.

Summary

In this post we introduced typical year forecasting - a flexible, powerful forecasting method suitable for use in energy project business case modelling.

Typical year forecast address a hidden flaw in the price assumptions commonly used in industry - the large errors introduced by using recent price data.

A typical year forecast addresses these issues by selecting historical price data that is most similar to all the historical data.

Typical year forecasts have the following advantages:

  • simple to create - no machine learning, gradients or iterative calculations,
  • interpretable - easy to understand why one sample is selected over others,
  • realistic - the forecast is made from actual historical data,
  • domain flexible - can be used with any time series (not just electricity prices),
  • statistically flexible - can use a range of statistics to define what typical means.

A typical year forecast has the following disadvantages:

  • data quantity - requires at least 2 years of historical data,
  • domain knowledge - requires selecting & weighting of statistics based on problem understanding.

Further extensions on the methods shown above include:

  • higher frequency sampling on a weekly or daily basis,
  • using a variety of statistics to define similarity, such as the number of price spikes or the number of negative prices.

Thanks for reading!

If you enjoyed this post, make sure to check out Measuring Forecast Quality using Linear Programming.

You can find the materials to reproduce this analysis at adgefficiency/typical-year-electricity-price-forecasting.

Updated: