# Tuning regularization strength

This post is the fifth in a series applying machine learning techniques to an energy problem. The goal of this series is to teach myself machine learning by developing models to forecast the UK Imbalance Price.

I see huge promise in what machine learning can do in the energy industry.  This series details my initial efforts in gaining an understanding of machine learning.

In the previous post in this series we introduced a multi-layer perceptron neural network to predict the UK Imbalance Price.  This post digs a bit deeper into controlling the degree of overfitting of our model by tuning the strength of regularization.

## What is regularization?

Regularization is a tool used to combat the problem of overfitting a model.  Overfitting occurs when a model starts to fit the training data too well – meaning that performance on unseen data is poor.

To prevent overfitting to the training data we can try to keep the model parameters small using regularization.  If we include a regularization term in the cost function the model minimizes, we encourage the model to use smaller parameters.

The equation below shows the loss function minimized during model training.  The first term is the square of the error.  The second term is the regularization term, with lambda as the parameter controlling the strength of regularization.  To be consistent with scikit-learn, we will refer to this parameter as alpha.
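A standard way of writing this loss (a reconstruction – the exact scaling of the penalty term varies between implementations; scikit-learn's MLPRegressor, for example, applies alpha to half the squared L2 norm of the weights, divided by the number of samples):

```latex
J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2
          + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^2
```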

Regularization penalizes large values of the model parameters (theta) based on the size of the regularization parameter.  Regularization comes in two flavours – L1 and L2.  The MLPRegressor model in scikit-learn uses L2 regularization.

Setting alpha too large will result in underfitting (also known as a high bias problem).  Setting alpha too small may lead to overfitting (a high variance problem).

## Setting alpha in the UK Imbalance Price model

Here we will optimize alpha by iterating through a number of different values.

We can then evaluate the degree of overfitting by looking at how alpha affects the loss function and the Mean Absolute Scaled Error (MASE).  The loss function is the cost function the model minimizes during training.  The MASE is the metric we used to judge model performance.
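As a reminder of the metric, here is a minimal sketch of MASE – the model's mean absolute error scaled by the MAE of a naive seasonal forecast.  The 48-period (one day) seasonality is an assumption that fits this half-hourly series:

```python
import numpy as np

def mase(y_true, y_pred, seasonality=48):
    """Mean Absolute Scaled Error: the model's MAE scaled by the MAE
    of a naive forecast (the value one season earlier)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    # MAE of the naive forecast y[t] = y[t - seasonality]
    naive_mae = np.mean(np.abs(y_true[seasonality:] - y_true[:-seasonality]))
    return np.mean(np.abs(y_true - y_pred)) / naive_mae
```

A MASE below 1 means the model beats the naive seasonal forecast on average.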

We use K-fold cross validation to get a sense of the degree of overfitting.  Comparing the cross validation to training performance gives us an idea of how much our model is overfitting.  Using K-fold cross validation allows us to leave the test data free for evaluating model performance only.
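A sketch of the tuning loop, using a small synthetic dataset in place of the Imbalance Price features (the real model uses the previous week of prices as inputs):

```python
import warnings

import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor

warnings.filterwarnings("ignore", category=ConvergenceWarning)

# stand-in data: 200 samples of 5 features with a noisy linear signal
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))
y = X @ rng.normal(size=5) + rng.normal(scale=0.1, size=200)

cv_mae = {}
for alpha in [0.0001, 0.001, 0.01, 0.1, 1.0]:
    model = MLPRegressor(hidden_layer_sizes=(10,), alpha=alpha,
                         max_iter=500, random_state=42)
    # 5-fold cross validation leaves the test set free for final evaluation
    scores = cross_val_score(model, X, y, cv=5,
                             scoring="neg_mean_absolute_error")
    cv_mae[alpha] = -scores.mean()
```

Comparing the cross validation error against the training error at each alpha gives the overfitting gap.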

Figures 1 & 2 show the results of optimizing alpha for an MLPRegressor with five hidden layers of 1344 nodes each.  The input feature set is the previous one week of Imbalance Price data.

Figure 1 shows the effect of alpha on the loss function for the training and cross validation sets.  We would expect to see the training loss increase as alpha increases, as small values of alpha should allow the model to overfit.  We would also expect the losses for the training and CV sets to converge as alpha gets large.

###### Figure 1 – The effect of alpha on the loss function

Figure 1 shows the expected trend of training loss increasing as alpha increases – except for alpha = 0.0001, which shows a high training loss.  This I don't understand!  I was expecting training loss to keep decreasing as alpha decreased.

Figure 2 shows the effect of alpha on the Mean Absolute Scaled Error (MASE) for the training, cross validation and test sets.

###### Figure 2 – The effect of alpha on the MASE for the training, cross validation and test data

Figure 2 also shows a confusing result.  I was expecting the MASE to be at a minimum at the smallest alpha and to increase as alpha increased, because small values of alpha should allow the model to overfit (and so improve training performance).  Instead we see that the best training MASE is at alpha = 0.01.

Figure 2 shows a minimum for the test MASE at alpha = 0.01 – this is also the minimum for the training data.

Going forward I will be using a value of 0.01 for alpha as this shows a good balance between minimizing the loss for the training and cross validation sets.

Table 1 shows the results for the model as it currently stands.

###### Table 1 – Model performance with alpha = 0.01

| Dataset | MASE |
| --- | --- |
| Training | 0.3345 |
| Cross validation | 0.589 |
| Test | 0.5212 |

The next step in this project is to look at previously unseen data for December 2016 – stay tuned.

# Imbalance Price Visualization

This post is the third in a series applying machine learning techniques to an energy problem. The goal of this series is to develop models to forecast the UK Imbalance Price.

In the first post we gave an introduction to what the UK Imbalance Price is.  We then showed how to gather UK grid data from the Elexon API using Python.

This third post will visualize the data we have gathered.  Spending the time to explore the data is an important step in model building.  All of the charts in this post were created in Python using pandas and matplotlib.

###### Figure 1 – The UK Imbalance Price January to November 2016

Figure 1 shows the volatile nature of the Imbalance Price.  Major positive spikes are observed throughout the year, with even more significant spikes occurring in November.  UK electricity demand is highest in winter, so it is likely that higher demand led to National Grid using more expensive plant to balance the system.

###### Figure 2 – Monthly summary statistics

Figure 2 shows how extreme the month of November was – large peaks and also a very high standard deviation.  It will be interesting to see how December compares in terms of volatility.

###### Figure 3 – Correlogram

The correlogram shows the autocorrelation function computed at different lags of the time series.  This can be used to identify any seasonality in the time series.  Clearly we have some seasonality present at a lag of around 48 – equivalent to one day in this half-hourly time series.

It makes sense that the Imbalance Price would be similar to the day before – many of the causes of imbalance (such as forecasting errors) are likely to recur from day to day.  Interestingly, the ACF does not peak at a lag of 336 (corresponding to one week).
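The daily lag can be checked directly with pandas – a sketch on a synthetic half-hourly series with an assumed daily cycle standing in for the real price data:

```python
import numpy as np
import pandas as pd

# synthetic half-hourly series with a daily (48 half-hour) cycle plus noise
idx = pd.date_range("2016-01-01", periods=48 * 60, freq="30min")
t = np.arange(len(idx))
rng = np.random.default_rng(0)
price = 40 + 10 * np.sin(2 * np.pi * t / 48) + rng.normal(0, 2, len(idx))
series = pd.Series(price, index=idx)

acf_day = series.autocorr(lag=48)    # one day
acf_week = series.autocorr(lag=336)  # one week
```

The full correlogram in Figure 3 can be produced with statsmodels' `plot_acf`.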

###### Figure 4 – Box plot (note the y-axis maximum was limited at 200 £/MWh)

The box plot clearly shows how much of the data is classed as outliers (i.e. being outside the inner & outer fences).  Also of interest is how close the median and first quartile are in most months!

In the next post we will begin to forecast the Imbalance Price using a multi-layer perceptron in scikit-learn.

# Elexon API Web Scraping using Python

NOTE – the code in this post is now superseded – please see my updated post, or just go straight to my GitHub repository for this project.

This post is the second in a series applying machine learning techniques to an energy problem.  The goal of this series is to develop models to forecast the UK Imbalance Price.

Part One – What is the UK Imbalance Price?

The first post in this series gave an introduction to what the UK Imbalance Price is.

This post will show how to scrape UK grid data from Elexon using their API.  Elexon make UK grid and electricity market data available to utilities and traders.

Data available includes technical information such as weather or generation.  Market data like prices and volumes is also available.  Full details of the available data are given in the Elexon API guide.

Accessing data requires an API key, available by setting up a free Elexon account.  The API is accessed by passing a URL with the API key and report parameters.  The API will return either an XML or a CSV document.

## Features of the Python code

The Python code below is a modified version of code supplied by the excellent Energy Analyst website.  Some features of the code are:

• The script iterates through a dictionary of reports.  I have set it up to run two different reports – B1770 for Imbalance Price data and B1780 for Imbalance Volume data.
• The script then iterates through a pandas date_range object of days.  This object is created by setting the start date and the number of days.
• Two functions written by Energy Analyst are used:
  • BMRS_GetXML – returns an XML object for a given set of keyword arguments & API key.
  • BMRS_Dataframe – creates a pandas DataFrame from the XML object.
• Data for each iteration is indexed using UTC time.  Two columns are added to the data_DF with the UTC and UK time stamps.
• The results of each iteration are saved in an SQL database, with each report saved in its own table named after the report.
• The results of the entire script run are saved in a CSV file (named output.csv).

## The Python code
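The original listing is superseded (see the note above, or the GitHub repository for the current version).  The sketch below shows the shape of the loop described in the features list.  The endpoint, parameter names and CSV layout are assumptions to be checked against the Elexon API guide, and the XML parsing done by the BMRS_GetXML / BMRS_Dataframe helpers is replaced here with a plain CSV request:

```python
from io import StringIO
from urllib.parse import urlencode
from urllib.request import urlopen
import sqlite3

import pandas as pd

API_KEY = "your-api-key"  # from a free Elexon account
REPORTS = {"B1770": "imbalance_price", "B1780": "imbalance_volume"}


def get_report(report, settlement_date):
    """Request one day of one report from the BMRS API as CSV text.

    The endpoint and parameter names are assumptions - check them
    against the Elexon API guide.
    """
    params = urlencode({"APIKey": API_KEY, "SettlementDate": settlement_date,
                        "Period": "*", "ServiceType": "csv"})
    url = f"https://api.bmreports.com/BMRS/{report}/v1?{params}"
    with urlopen(url) as response:
        return response.read().decode()


def add_time_columns(df, utc_index):
    """Index a day's data by UTC and add explicit UTC & UK time columns."""
    df = df.copy()
    df.index = utc_index
    df["time_utc"] = utc_index
    df["time_uk"] = utc_index.tz_convert("Europe/London")
    return df


def scrape(start_date, days, db_path="bmrs.sqlite"):
    dates = pd.date_range(start=start_date, periods=days, freq="D")
    frames = []
    with sqlite3.connect(db_path) as conn:
        for report, table in REPORTS.items():
            for date in dates:
                text = get_report(report, date.strftime("%Y-%m-%d"))
                # the CSV layout varies by report - the real script parses
                # XML with the BMRS_Dataframe helper instead
                df = pd.read_csv(StringIO(text), skiprows=1)
                idx = pd.date_range(date, periods=len(df),
                                    freq="30min", tz="UTC")
                df = add_time_columns(df, idx)
                # stringify timestamps so sqlite3 accepts them
                df.astype(str).to_sql(table, conn, if_exists="append",
                                      index=False)
                frames.append(df)
    # one CSV for the entire run
    pd.concat(frames).to_csv("output.csv")


if __name__ == "__main__":
    scrape("2016-01-01", days=2)
```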

## Next steps

The next post in this series will visualize and analyze the Imbalance Price data for 2016.

# What is the UK Imbalance Price?

This post is the first in a series applying machine learning techniques to an energy problem.  The goal of this series is to develop models to forecast the UK Imbalance Price.

## What is the Imbalance Price?

The Imbalance Price is what generators or suppliers pay for any unexpected imbalance.

In the UK generators and suppliers (known as Parties) contract with each other for the supply of electricity.  Generators sell electricity to suppliers who then sell power to end use customers.

As System Operator, National Grid handles real-time balancing of the UK grid.  Parties submit details of their contracts to National Grid one hour before delivery.  This allows National Grid to understand the expected imbalance.

National Grid will then take actions to correct any predicted imbalance.  For example the Balancing Mechanism allows Parties to submit Bids or Offers to change their position by a certain volume at a certain price.

National Grid also has the ability to balance the system using actions outside the Balancing Mechanism.  Examples include:

• Short Term Operating Reserve power plants.
• Frequency Response plants used to balance the system in real time.
• Reserve Services.

In more drastic scenarios National Grid may call upon closed power plants or disconnect customers.  National Grid will always aim to minimize the cost of balancing within technical constraints.

Parties submit their expected positions one hour before delivery –  but they do not always meet these contracted positions!

A supplier may underestimate their customers' demand.  A power plant might face an unexpected outage.  The difference between the contracted and actual position is charged using the Imbalance Price.

ELEXON uses the costs that National Grid incurs in correcting imbalance to calculate the Imbalance Price.  This is then used to charge Parties for being out of balance with their contracts. ELEXON details the process for the calculation of the Imbalance Price here.

## What data is available?

ELEXON make available a significant amount of data online.  This includes data for the Imbalance Price calculation as well as data related to the UK grid.  We will make use of the ELEXON API to access data.

The first iteration of this model will be auto-regressive.  We will use only the previous values of the Imbalance Price to predict future values.

As we continue to develop the model we will add more data and explain its relevance to the Imbalance Price.  Adding data iteratively will allow us to understand what value each addition brings to the model.
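An autoregressive feature set can be sketched with pandas, using a random series in place of the real price data.  The choice of 48 lags (one day of half-hourly values) here is purely illustrative:

```python
import numpy as np
import pandas as pd

# stand-in series for the half-hourly Imbalance Price
prices = pd.Series(np.random.default_rng(1).normal(45, 10, 500), name="price")

# one day (48 half-hours) of lagged values as features
n_lags = 48
features = pd.concat({f"lag_{k}": prices.shift(k)
                      for k in range(1, n_lags + 1)}, axis=1)

# drop the first rows, which have incomplete lag histories
data = pd.concat([prices, features], axis=1).dropna()

X = data.drop(columns="price").to_numpy()
y = data["price"].to_numpy()
```

Each row of X holds the previous 48 prices and y holds the value to predict.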

## Next steps

The next post will be the Python code used to scrape data using the Elexon API.   We will then do some visualization to analyze the Imbalance Price data.

Posts after that will develop models in Python to predict the Imbalance Price.

Part Two – Elexon API Web Scraping using Python