Elexon API Web Scraping using Python

NOTE – the code in this post is now superseded – please see my update post – or just go straight to my GitHub repository for this project.

This post is the second in a series applying machine learning techniques to an energy problem.  The goal of this series is to develop models to forecast the UK Imbalance Price. 

Part One – What is the UK Imbalance Price?

The first post in this series gave an introduction to what the UK Imbalance Price is.

This post will show how to scrape UK grid data from Elexon using their API.  Elexon make UK grid and electricity market data available to utilities and traders.

Data available includes technical information such as weather or generation.  Market data such as prices and volumes is also available.  Full details of the available data are given in the Elexon API guide.

Accessing data requires an API key, available by setting up a free Elexon account.  The API is accessed by passing a URL with the API key and report parameters.  The API will return either an XML or a CSV document.
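
As a rough illustration of what "passing a URL with the API key and report parameters" can look like, here is a minimal sketch.  The base URL, the version segment and the parameter names (APIKey, SettlementDate, Period, ServiceType) are my assumptions about the request format, not something confirmed in this post, so check them against the Elexon API guide.  The build_url helper is hypothetical.

```python
from urllib.parse import urlencode

# ASSUMPTION: base URL and parameter names are illustrative only;
# confirm them against the Elexon API guide before use.
BASE_URL = "https://api.bmreports.com/BMRS"


def build_url(report, api_key, version="v1", **params):
    """Build a request URL for one report (hypothetical helper)."""
    query = {"APIKey": api_key, **params}
    # safe="*" keeps a wildcard such as Period=* readable in the URL
    return f"{BASE_URL}/{report}/{version}?{urlencode(query, safe='*')}"


url = build_url(
    "B1770",
    "YOUR_API_KEY",  # placeholder, replace with the key from your account
    SettlementDate="2016-01-01",
    Period="*",
    ServiceType="xml",  # or "csv"
)
print(url)
```

Keeping the report parameters as keyword arguments means the same helper can serve any report, with only the report name and parameters changing.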

Features of the Python code

The Python code below is a modified version of code supplied by the excellent Energy Analyst website.  Some features of the code are:

  • The script iterates through a dictionary of reports.  I have set up the script to iterate through two different reports – B1770 for Imbalance Price data and B1780 for Imbalance Volume data.
  • The script then iterates through a pandas date_range object of days.  This object is created by setting the startdate and the number of days.
  • Two functions written by Energy Analyst are used:
    • BMRS_GetXML – returns an XML object for a given set of keyword arguments & API key.
    • BMRS_Dataframe – creates a pandas DataFrame from the XML object.
  • Data for each iteration is indexed using UTC time.  Two columns are added to the data_DF with the UTC and UK time stamps.
  • The results of each iteration are saved in an SQL database.  Each report is saved in its own table named with the report name.
  • The results of the entire script run are saved in a CSV file (named output.csv).
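
The flow described in the list above can be sketched roughly as follows.  This is not the Energy Analyst code: fetch_report is a stand-in for BMRS_GetXML and BMRS_Dataframe that returns dummy rows instead of calling the API.  The column names, the report parameters, the use of SQLite and the rule for turning a settlement period into a timestamp (half-hour periods counted from midnight UK time) are all assumptions made for illustration.

```python
import sqlite3

import pandas as pd

# Report name -> extra query parameters (parameters are assumptions).
REPORTS = {"B1770": {"Period": "*"}, "B1780": {"Period": "*"}}


def fetch_report(report, settlement_date, **params):
    """Stand-in for BMRS_GetXML + BMRS_Dataframe; makes no network call.

    Returns one dummy row per half-hour settlement period.
    """
    return pd.DataFrame({
        "settlementDate": settlement_date,
        "settlementPeriod": range(1, 49),
        "value": 0.0,
    })


def add_timestamps(df):
    """Add UTC and UK time stamps for the start of each period.

    ASSUMPTION: period 1 starts at midnight UK local time.
    """
    df = df.copy()
    midnight_uk = pd.to_datetime(df["settlementDate"]).dt.tz_localize("Europe/London")
    offset = pd.to_timedelta((df["settlementPeriod"] - 1) * 30, unit="min")
    df["time_utc"] = midnight_uk.dt.tz_convert("UTC") + offset
    df["time_uk"] = df["time_utc"].dt.tz_convert("Europe/London")
    return df


def run(start_date, days, conn):
    """Loop over reports and days, storing each result in its own table."""
    frames = []
    for report, params in REPORTS.items():
        for day in pd.date_range(start_date, periods=days, freq="D"):
            df = add_timestamps(fetch_report(report, day.strftime("%Y-%m-%d"), **params))
            df["report"] = report  # added here so the combined CSV stays readable
            # Store time stamps as text so the table is easy to inspect.
            stored = df.astype({"time_utc": str, "time_uk": str})
            stored.to_sql(report, conn, if_exists="append", index=False)
            frames.append(df)
    return pd.concat(frames, ignore_index=True)


conn = sqlite3.connect(":memory:")  # use a file path to keep the data
output = run("2016-01-01", 2, conn)
output.to_csv("output.csv", index=False)
```

Localizing to UK time first and then converting to UTC means the UTC column shifts by an hour during British Summer Time, while the UK column keeps following the settlement day.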

The Python code

Next steps
The next post in this series will visualize and analyze the Imbalance Price data for 2016.