This post is the third in a series applying machine learning techniques to an energy problem. The goal of this series is to develop models to forecast the UK Imbalance Price.
In the first post we introduced the UK Imbalance Price. We then showed how to gather UK grid data from the Elexon API using Python.
This third post will visualize the data we have gathered. Spending the time to explore the data is an important step in model building. All of the charts in this post were created in Python using pandas and matplotlib.
Figure 1 shows the volatile nature of the Imbalance Price. Major positive spikes are observed throughout the year, with even more significant spikes occurring in November. UK electricity demand is highest in winter, so higher demand is likely forcing National Grid to use more expensive plant to balance the system.
Figure 2 – Monthly summary statistics
Figure 2 shows how extreme the month of November was – large peaks and also a very high standard deviation. It will be interesting to see how December compares in terms of volatility.
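Monthly statistics of this kind can be computed with a pandas groupby. The snippet below is a minimal sketch rather than the code behind Figure 2: the series is synthetic random data, and the name imbalance_price, the date range and the choice of statistics are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic half-hourly series standing in for the real Imbalance Price data.
# The values are random placeholders, NOT Elexon data; the year is arbitrary.
rng = np.random.default_rng(0)
index = pd.date_range("2016-01-01", "2016-03-31 23:30", freq="30min")
prices = pd.Series(rng.normal(40, 15, len(index)), index=index,
                   name="imbalance_price")  # hypothetical column name


def monthly_summary(series):
    """Return mean, median, std, min and max for each calendar month."""
    by_month = series.groupby(series.index.to_period("M"))
    return by_month.agg(["mean", "median", "std", "min", "max"])


summary = monthly_summary(prices)
print(summary.round(2))
```

A month like November would stand out in such a table through a large max and a large std relative to the other rows.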
Figure 3 – Correlogram
The correlogram shows the autocorrelation function (ACF) computed at different lags of the time series. This can be used to identify any seasonality in the time series. There is clear seasonality at a lag of around 48, equivalent to one day in this half-hourly time series.
It makes sense that the Imbalance Price would be similar to the day before, since many of the causes of imbalance (such as forecasting errors) are likely to recur from one day to the next. Interestingly, the ACF does not peak at a lag of 336 (corresponding to one week).
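To make the lag arithmetic concrete, here is a rough sketch of how the sample autocorrelation at a given lag can be computed with NumPy. It is not the code used for Figure 3. The input is a synthetic series with a built-in 48-period cycle, so it peaks at every multiple of 48 (including 336) and does not reproduce the weekly result seen in the real data.

```python
import numpy as np


def autocorrelation(values, max_lag):
    """Sample autocorrelation for lags 0..max_lag (biased estimator)."""
    x = np.asarray(values, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[:len(x) - k], x[k:]) / denom
                     for k in range(max_lag + 1)])


# Synthetic half-hourly data: a daily (48-period) sine wave plus noise.
# This is an illustrative stand-in, NOT the Imbalance Price series.
rng = np.random.default_rng(1)
n = 48 * 60  # 60 days of half-hourly settlement periods
t = np.arange(n)
series = 10 * np.sin(2 * np.pi * t / 48) + rng.normal(0, 5, n)

acf = autocorrelation(series, max_lag=336)
print("lag 24 (half a day):", round(acf[24], 3))
print("lag 48 (one day):   ", round(acf[48], 3))
```

For plotting, pandas.plotting.autocorrelation_plot and statsmodels' plot_acf both draw a correlogram directly from a series.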
The box plot clearly shows how much of the data is classed as outliers (i.e. lying outside the inner and outer fences). Also of interest is how close the median and first quartile are in most months.
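The share of points a box plot flags as outliers can be checked numerically. The sketch below assumes the common Tukey convention, where the inner fences sit 1.5 times the interquartile range beyond the quartiles (use k=3 for the outer fences). The data is synthetic and skewed, and the names are hypothetical.

```python
import numpy as np
import pandas as pd


def outlier_share(values, k=1.5):
    """Fraction of points outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    x = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    outside = (x < q1 - k * iqr) | (x > q3 + k * iqr)
    return outside.mean()


# Synthetic right-skewed half-hourly prices (placeholder values, not real data).
rng = np.random.default_rng(2)
index = pd.date_range("2016-01-01", "2016-03-31 23:30", freq="30min")
prices = pd.Series(rng.lognormal(mean=3.5, sigma=0.6, size=len(index)),
                   index=index, name="imbalance_price")

# Share of each month's points that fall beyond the inner fences.
shares = prices.groupby(prices.index.month).apply(outlier_share)
print(shares.round(3))
```

Comparing prices.groupby(prices.index.month).quantile(0.25) with the monthly median in the same way would quantify how close the first quartile and median are.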
In the next post we will begin to forecast the Imbalance Price using a multi-layer perceptron in scikit-learn.