A Glance at Q-Learning

‘A Glance at Q-Learning’ is a talk I recently gave at the Data Science Festival in London. I also gave this talk in Berlin at the Berlin Machine Learning group.

Q-Learning is a reinforcement learning algorithm that DeepMind used to play Atari games – work which some call the first step towards a general artificial intelligence. The original 2013 paper is available here (I cover this paper in the talk).

It was a wonderful experience being able to present – I recommend checking out more of the talks on the Data Science Festival YouTube – all of which are higher quality, more interesting and better presented than mine!

You can download a copy of my slides here – A Glance at Q-Learning slides.

Thanks for reading!

Elon Musk on autonomous cars – Energy Insights

Energy Insights highlights interesting energy content from around the web.

Previous posts include The Complexity of a Zero Carbon Grid and the CAISO Stage 1 Grid Emergency.


In this Energy Insights post I highlight an interesting insight from a TED talk with Elon Musk. The talk is as wide ranging as Musk’s talents – well worth a watch.

Impact of autonomous cars

Musk highlights that shared autonomous vehicles will likely become more affordable than a bus, leading to an increase in miles driven in cars.

“the amount of driving that will occur will be greater with shared autonomy and traffic will get far worse” – Elon Musk

Increasing traffic is not as significant a problem for passengers in autonomous cars. Time in an autonomous car can be spent productively. But what will be the impact on transport carbon emissions?

Let’s have a look at one positive and one neutral scenario. Both of these scenarios require a clean electricity grid to power electric vehicles.

If the increased miles driven are spread across the fleet, then the only way for autonomous cars to lead to a global carbon saving is for electric cars to replace fossil fuels. The absolute number of fossil fuel cars must go down.

If all the increased miles driven are taken up by electric cars, then carbon emissions would stay stagnant. For me this seems like the route we must take – making sure that the bulk of the driving load is being done by electric vehicles. Ideally all autonomous cars are electric.

The number of fossil fuel cars will increase. BP expect an increase from 1 billion to 2 billion from now until 2050. This trend can’t be stopped. But we can smartly operate our passenger fleet to favour electric cars over fossil fuels.

Thanks for reading!

Machine Learning in Energy – Part Two

Machine Learning in Energy – Part One
Introduction, why it’s so exciting, challenges.

Machine Learning in Energy – Part Two
Time series forecasting, energy disaggregation, reinforcement learning, Google data centre optimization


This is part two of the Machine Learning in Energy series.

This post will detail specific applications of machine learning in energy that I’m excited about.

Forecasting of electricity generation, consumption and price

What’s the problem

The time of electricity consumption has massive economic and environmental impact. The temporal variation in electricity generation and consumption can be significant. Periods of high consumption mean generating electricity using expensive & inefficient peaking plants. In periods of low consumption electricity can be so abundant that the price becomes negative.

Electric grid stability requires a constant balance between generation and consumption. Understanding future balancing actions requires accurate forecasts by the system operator.

Our current energy transition is moving us away from dispatchable, centralized and large-scale generation towards intermittent, distributed and small scale generation.

Historically the majority of generation was dispatchable and predictable – making forecasting easy. The only uncertainty was plant outages for unplanned maintenance.

Intermittent generation is by nature hard to forecast. Wind turbine power generation depends on forecasting wind speeds over vast areas. Solar power is more predictable but can still see variation as cloud cover changes.

As grid scale wind & solar penetration increases, balancing the grid becomes more difficult. Higher levels of renewables can lead to more fossil fuel backup being kept in reserve in case forecasts are wrong.

It’s not just the generation side that has become more challenging. The distributed and small scale of many wind & solar plants is also making consumption forecasting more difficult.

A solar panel sitting on a residential home is not directly metered – the system operator has no idea it is there. As this solar panel generates throughout the day it appears to the grid as reduced consumption.

Our current energy transition is a double whammy for grid balancing. Forecasting of both generation and consumption is becoming more challenging.

This has a big impact on electricity prices. In a wholesale electricity market price is set by the intersection of generation and consumption. Volatility and uncertainty on both sides spill over into more volatile electricity prices.

How machine learning will help

Many supervised machine learning models can be used for time series forecasting. Both regression and classification models are able to help understand the future.

Regression models can directly forecast electricity generation, consumption and price. Classification models can forecast the probability of a spike in electricity prices.

Well trained random forests, support vector machines and neural networks can all be used to solve these problems.

A key challenge is data. As renewables are weather driven, forecasts of weather can be useful exogenous variables. It’s key that we only train models on data that will be available at the time of the forecast. This means that historical weather forecasts can be more useful than the actual weather data.
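To make the point about data availability concrete, here is a minimal sketch of building training rows for a day-ahead demand forecast. The `Observation` fields, the `build_training_rows` helper and the numbers are all hypothetical, chosen only for illustration. The features use the latest metered demand and the weather forecast for the target hour, and deliberately leave out the measured temperature, which would not be known when the forecast is made.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    hour: int             # index of the hour being described
    demand_mw: float      # metered demand, only known after the hour ends
    temp_forecast: float  # temperature forecast for this hour, issued in advance
    temp_actual: float    # measured temperature, only known after the hour ends


def build_training_rows(history, horizon=24):
    """Build (features, target) pairs for a forecast made `horizon` hours ahead.

    Features use only values known when the forecast is issued: the latest
    metered demand and the weather forecast for the target hour.
    """
    rows = []
    for t in range(horizon, len(history)):
        target = history[t]
        known = history[t - horizon]
        features = [known.demand_mw, target.temp_forecast]
        rows.append((features, target.demand_mw))
    return rows


# Hypothetical two days of hourly data
history = [
    Observation(hour=h, demand_mw=1000.0 + 10 * h,
                temp_forecast=15.0 + h % 5, temp_actual=14.0 + h % 5)
    for h in range(48)
]
rows = build_training_rows(history, horizon=24)
print(len(rows), rows[0])
```

The rows produced here could feed any of the regression models mentioned above. The design choice that matters is that `temp_actual` never appears in the features.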

What’s the value to the world

Improving forecasts allows us to better balance the grid, reduce fossil fuels and increase renewables.

It’s not only the economic & environmental cost of keeping backup plant spinning. Incorrect forecasts can lead to fossil fuel generators being paid to reduce output. This increases the cost to supply electricity to customers.

There are benefits for end consumers of electricity as well. Improved prediction can also allow flexible electricity consumption to respond to market signals.

More accurate forecasts that can look further ahead will allow more electricity consumers to be flexible. Using flexible assets to manage the grid will reduce our reliance on fossil fuels for grid balancing.

See:
– Forecasting UK Imbalance Price using a Multilayer Perceptron Neural Network
– Machine Learning in Energy (Fayadhoi Ibrahima)
– 7 reasons why utilities should be using machine learning
– Germany enlists machine learning to boost renewables revolution
– Weron (2014) Electricity price forecasting: A review of the state-of-the-art with a look into the future

Energy disaggregation

What’s the problem

Imagine if every time you went to a restaurant you only got the total bill. Understanding the line by line breakdown of where your money went is valuable. Energy disaggregation can help give customers this level of information about their utility bill.

Energy disaggregation estimates appliance level consumption using only total consumption.

In an ideal world we would have visibility of each individual consumer of energy. We would know when a TV is on or a pump is running in an industrial process. One solution would be to install metering on every consumer – a very expensive and complex process.

Energy disaggregation is a more elegant solution. A good energy disaggregation model can estimate appliance level consumption through a single aggregate meter.
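To show the shape of the problem, here is a deliberately simple, rule-based sketch that attributes step changes in a single aggregate meter reading to appliances. It is not a machine learning model, and the appliance ratings, readings and the `disaggregate_steps` helper are hypothetical. A learned model would replace the hand-written matching rule.

```python
def disaggregate_steps(aggregate_w, appliances_w, tolerance_w=50.0):
    """Attribute step changes in an aggregate meter reading to appliances.

    aggregate_w: list of total power readings (W), one per time step
    appliances_w: dict of hypothetical appliance name -> rated power (W)
    Returns a list of (time_step, appliance, "on" or "off") events.
    """
    events = []
    for t in range(1, len(aggregate_w)):
        step = aggregate_w[t] - aggregate_w[t - 1]
        if abs(step) < tolerance_w:
            continue  # ignore small fluctuations
        # Pick the appliance whose rated power is closest to the step size
        name, rated = min(appliances_w.items(),
                          key=lambda item: abs(abs(step) - item[1]))
        if abs(abs(step) - rated) <= tolerance_w:
            events.append((t, name, "on" if step > 0 else "off"))
    return events


# Hypothetical appliance ratings and aggregate readings in watts
appliances = {"kettle": 2000.0, "fridge": 150.0, "dishwasher": 1200.0}
meter = [100, 250, 250, 2250, 2250, 250, 1450, 1450, 250, 100]
events = disaggregate_steps(meter, appliances)
for event in events:
    print(event)
```

Real household data is far messier than this: appliances overlap, have variable loads and share similar power ratings, which is where learned models earn their keep.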

How machine learning will help

Supervised machine learning is all about learning patterns in data. Many supervised machine learning algorithms can learn the patterns in the total consumption. Kelly & Knottenbelt (2015) used recurrent and convolutional neural networks to disaggregate residential energy consumption.

A key challenge is data. Supervised learning requires labeled training data. Measurement and identification of sub-consumers forms training data for a supervised learner. Data is also required at a very high temporal frequency – ideally with less than one second between readings.

What’s the value to the world

Energy disaggregation has two benefits for electricity consumers. It can identify & verify savings opportunities. It can also increase customer engagement.

Imagine if you got an electricity bill that told you how much it cost you to run your dishwasher that month. The utility could help customers understand what they could have saved if they ran their dishwasher at different times.

This kind of feedback can be very effective in increasing customer engagement – which is a key challenge for utilities around the world.

See:
– 7 reasons why utilities should be using machine learning
– Neural NILM: Deep Neural Networks Applied to Energy Disaggregation
– Energy Disaggregation: The Holy Grail (Carrie Armel)
– Putting Energy Disaggregation Tech to the Test

Reinforcement learning

What’s the problem

Controlling energy systems is hard. Key variables such as price and energy consumption constantly change. Operators control systems with a large number of actions, with the optimal action changing throughout the day.

Our current energy transition is making this problem even harder. The transition is increasing volatility in key variables (such as electricity prices) and the number of actions to choose from.

Today deterministic sets of rules or abstract models are used to guide operation. Deterministic rules for operating any non-stationary system can’t guarantee optimality. Changes in key variables can turn a profitable operation into one that loses money.

Abstract models (such as linear programming) can account for changes in key variables. But abstract models often force the use of unrealistic models of energy systems. More importantly the performance of the model is limited by the skill and experience of the modeler.

How machine learning will help

Reinforcement learning gives a machine the ability to learn to take actions. The machine takes actions in an environment to optimize a reward signal. In the context of an energy system that reward signal could be energy cost, carbon or safety – whatever behavior we want to incentivize.

reinforcement_learning_in_energy

What is exciting about reinforcement learning is that we don’t need to build any domain knowledge into the model. A reinforcement learner learns from its own experience of the environment. This allows a reinforcement learner to see patterns that we can’t see – leading to superhuman levels of performance.

Another exciting thing about reinforcement learning is that you don’t need a data set. All you need is an environment (real or virtual) that the learner can interact with.
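As a rough sketch of what "an environment the learner can interact with" looks like in code, here is a toy battery that trades against a fixed price profile. The `ToyBatteryEnv` class, the prices and both policies are hypothetical and written only for illustration. The reward is the negative of the energy cost, so maximizing reward means minimizing cost.

```python
import random


class ToyBatteryEnv:
    """Hypothetical environment: a battery trading against a fixed price profile."""

    def __init__(self, prices, capacity_mwh=2.0):
        self.prices = prices
        self.capacity_mwh = capacity_mwh

    def reset(self):
        self.step_index = 0
        self.charge_mwh = 0.0
        return (self.step_index, self.charge_mwh)

    def step(self, action_mwh):
        """action_mwh > 0 charges (buys), action_mwh < 0 discharges (sells)."""
        new_charge = min(max(self.charge_mwh + action_mwh, 0.0), self.capacity_mwh)
        energy_bought = new_charge - self.charge_mwh
        reward = -energy_bought * self.prices[self.step_index]
        self.charge_mwh = new_charge
        self.step_index += 1
        done = self.step_index >= len(self.prices)
        return (self.step_index, self.charge_mwh), reward, done


def run_episode(env, policy):
    """Run one episode and return the total reward collected."""
    observation = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = policy(observation)
        observation, reward, done = env.step(action)
        total_reward += reward
    return total_reward


prices = [20.0, 15.0, 60.0, 80.0]  # hypothetical price for each time step


def random_policy(observation):
    return random.choice([-1.0, 0.0, 1.0])


def simple_rule(observation):
    step_index, _ = observation
    return 1.0 if prices[step_index] < 30.0 else -1.0


random.seed(0)
print(run_episode(ToyBatteryEnv(prices), random_policy))
print(run_episode(ToyBatteryEnv(prices), simple_rule))
```

The hand-written rule earns a total reward of 105.0 on this price profile. A reinforcement learner would aim to discover a policy at least that good from interaction alone, without being told the rule.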

What’s the value to the world

Better control of our energy systems will allow us to reduce cost, reduce environmental impact and improve safety. Reinforcement learning allows us to do this at superhuman levels of performance.

See:
– energy_py – reinforcement learning in energy systems
– Mnih et al. (2015) Human-level control through deep reinforcement learning
– Reinforcement learning course by David Silver (Google DeepMind)

Alphabet/Google data centre optimization

One of the most famous applications of machine learning in an energy system is Google’s work in their own data centers.

In 2014 Google used supervised machine learning to predict the Power Usage Effectiveness (PUE) of data centres.

This supervised model did no control of its own. Operators used the predictive model to create a target PUE for the plant. The predictive model also allowed operators to simulate the impact of changes in key parameters on PUE.

In 2016 DeepMind published details of how they applied machine learning to optimizing data centre efficiency. The technical details of this implementation are not as clear as the 2014 work. It is pretty clear that both supervised and reinforcement learning techniques were used.

The focus of the project was again on improving PUE. Deep neural networks predicted future PUE as well as future temperatures & pressures. The predictions of future temperatures & pressures were used to simulate the effect of recommended actions.

DeepMind claim a ’40 percent reduction in the amount of energy used for cooling’ which equates to a ’15 percent reduction in overall PUE overhead after accounting for electrical losses and other non-cooling inefficiencies’. Without seeing actual data it’s hard to know exactly what this means.
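One way to read these two figures together is sketched below. PUE is total facility energy divided by IT equipment energy. The sketch assumes that ‘PUE overhead’ means PUE minus one and that the whole 15 percent overhead reduction comes from the cooling saving. Both are my assumptions, and the starting PUE of 1.12 is a hypothetical input rather than a reported figure.

```python
def pue(total_facility_kwh, it_equipment_kwh):
    """Power Usage Effectiveness: total facility energy over IT equipment energy."""
    return total_facility_kwh / it_equipment_kwh


def implied_cooling_share(cooling_reduction, overhead_reduction):
    """Share of the overhead that cooling must represent for both claims to hold."""
    return overhead_reduction / cooling_reduction


def pue_after(pue_before, overhead_reduction):
    """New PUE after reducing the overhead (assumed to be PUE - 1) by a fraction."""
    overhead = pue_before - 1.0
    return 1.0 + overhead * (1.0 - overhead_reduction)


share = implied_cooling_share(cooling_reduction=0.40, overhead_reduction=0.15)
new_pue = pue_after(pue_before=1.12, overhead_reduction=0.15)  # 1.12 is hypothetical

print(round(share, 3))    # cooling as a fraction of the overhead
print(round(new_pue, 3))  # PUE after the reduction
```

Under these assumptions cooling would make up about 37.5 percent of the overhead, and a site starting at a PUE of 1.12 would end up at roughly 1.102.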

What I am able to understand is that this ‘produced the lowest PUE the site had ever seen’.

This is why as an energy engineer I’m so excited about machine learning. Google’s data centers were most likely well optimized before these projects. The fact that machine learning was able to improve PUE beyond what human operators had been able to achieve before is inspiring.

The potential level of savings across the rest of our energy systems is exciting to think about. The challenges & impact of our energy systems are massive – we need the intelligence of machine learning to help us solve these challenges.

See:
– Jim Gao (Google) – Machine Learning Applications for Data Center Optimization
– DeepMind AI Reduces Google Data Centre Cooling Bill by 40%

Thanks for reading!

 

CAISO Stage 1 Grid Emergency – Energy Insights

Energy Insights highlights interesting energy content from around the web.

Previous posts include The Complexity of a Zero Carbon Grid and the 2017 BP Energy Outlook.


California Grid Emergency Comes Days After Reliability Warning


On May 3rd 2017 the California grid experienced its first Stage 1 grid emergency in nearly a decade.

The reasons for this emergency notice were:
– a 330 MW gas-fired plant outage
– 800 MW of imports that were unavailable
– a demand forecasting error of 2 GW

A Stage 1 grid emergency doesn’t mean a blackout – it forces the ISO to dip into reserves and slip below required reserve margins. It allows CAISO to access interruptible demand side management programs.

I wanted to highlight two features of this event I found interesting.

1 – The demand forecasting error of 2 GW

This is a massive error in absolute terms – equivalent to a large power station!

To put this error in perspective, demand on the 11th of May for the same time period was around 28 GW – giving a relative error of around 7%.

It’s important to note that this error isn’t actually an error in forecasting the actual demand – it’s distributed & small scale solar that is appearing to the ISO as reduced demand.

2 – Lack of flexibility

“It was unusual that the issues began developing around the peak, and demand wasn’t ramping down much, but solar was ramping off faster than what the thermal units online at the time could keep up with in serving load” – CAISO spokesperson Steven Greenlee

In a previous post I highlighted the concept of flexibility.  This event demonstrates why flexibility is so important for managing a modern electric grid.

Even if you have the capacity (MW) you might not have the flexibility (MW/min) to cope with the intermittent nature of renewables.
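The difference between capacity and flexibility can be shown with a small calculation. The ramp rates and the `ramp_shortfall_mw` helper below are hypothetical and are not figures from the CAISO event. They only show how a shortfall can build up even when plenty of capacity is installed.

```python
def ramp_shortfall_mw(solar_ramp_down_mw_per_min, thermal_ramp_up_mw_per_min, minutes):
    """Shortfall after `minutes` if thermal units ramp up at their maximum rate."""
    gap_per_min = solar_ramp_down_mw_per_min - thermal_ramp_up_mw_per_min
    return max(gap_per_min, 0.0) * minutes


# Hypothetical numbers: solar falls by 70 MW/min, thermal can add 50 MW/min
shortfall = ramp_shortfall_mw(70.0, 50.0, minutes=60)
print(shortfall)  # MW that must come from reserves or demand side response
```

With these assumed rates the system is 1,200 MW short after an hour, regardless of how much thermal capacity is sitting idle.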

It’s also made clear in the RTO article that interruptible demand side management programs are only called upon in a Stage 1 emergency.  Prior to this thermal units are used to balance the system.

Using flexible demand side assets as a first step to balance the grid could be a more optimal way to deal with this problem.

Thanks for reading!

Machine Learning in Energy – Part One

Machine Learning in Energy – Part One
Introduction, why it’s so exciting, challenges.

Machine Learning in Energy – Part Two
Time series forecasting, energy disaggregation, reinforcement learning, Google data centre optimization


Technological innovation, environmental politics and international relations all influence the development of our global energy system.

Yet there is one less visible trend that may come to dominate all the others. Machine learning is blowing past previous barriers for a wide range of problems.  Many results that were expected to take decades have already been achieved.

I’m really excited about the potential of machine learning in the energy industry.

I see machine learning as fundamentally new. Up until now all the intelligence humanity had access to originated in our brains. Today we have access to a new source of intelligence – computers that can learn patterns that we can’t see.

Part One of this series will introduce what machine learning is, why it’s so exciting and some of the challenges of modern machine learning. Part Two will highlight some energy industry specific applications of machine learning.

What is machine learning

Machine learning gives computers the ability to learn without being explicitly programmed. Computers use this ability to learn patterns in large, high-dimensionality datasets. Seeing these patterns allows computers to achieve results at superhuman levels – literally better than what a human expert can achieve.

Machine learning is now state of the art for a wide range of problems. The fields of computer vision, natural language processing and robotics have all been moved forward by machine learning.

To demonstrate what is different about machine learning, we can compare two landmark achievements in computing & artificial intelligence.

In 1997 IBM’s Deep Blue defeated World Chess Champion Garry Kasparov. Deep Blue ‘derived its playing strength mainly from brute force computing power’. But all of Deep Blue’s intelligence originated in the brains of a team of programmers and chess Grandmasters.

In 2016 Alphabet’s AlphaGo defeated Go legend Lee Sedol 4-1. AlphaGo also made use of a massive amount of computing power. But the key difference is that AlphaGo’s playing strength was not hand-coded by its programmers. AlphaGo used reinforcement learning to learn from its own experience of the game.

Both of these achievements are important landmarks in computing and artificial intelligence. Yet they are also fundamentally different because machine learning allowed AlphaGo to learn on its own.

There are a number of exciting applications of machine learning in the energy industry:
– forecasting of generation, demand & price
– energy disaggregation
– reinforcement learning to control energy systems

Part Two of this series will flesh out some of these applications.

Why now

Three broad trends have led to machine learning being the powerful force it is today.

One – Data

It’s hard to overestimate the importance of data to modern machine learning. Larger data sets tend to make machine learning models more powerful. A weaker algorithm with more data can outperform a stronger algorithm with less data.

The internet has brought about a massive increase in the growth rate of data. This data is enabling machine learning models to achieve superhuman performance.

For many large technology companies such as Alphabet or Facebook their data has become a major source of the value of their businesses. A lot of this value comes from the insights that machines can learn from such large data sets.

Two – Hardware

There are two distinct trends in hardware that have been fundamental to moving modern machine learning forward.

The first is the use of graphics processing units (GPUs) and the second is the increased availability of computing power.

In the early 2000’s computer scientists pioneered the use of graphics cards, originally designed for gamers, for machine learning. They found massive reductions in training times – from months to weeks or even days.

This speed up is important. Most of our understanding of machine learning is empirical (based on experiment). This knowledge is built up a lot faster by reducing the iteration time for training machine learning models.

The second trend is the availability of computing power. Platforms such as Amazon Web Services or Google Cloud allow on-demand access to a large amount of GPU-enabled computing power.

Access to computing power on demand allows more companies to build machine learning products. It enables companies to shift a capital expense (building data centres) into an operating expense, with all the balance sheet benefits that brings.

Three – Algorithms & tools

I debated whether to include this third trend. It’s really the first two trends (data & hardware) that have unlocked the latent power of machine learning algorithms, many of which are decades old. Yet I still think it’s worth touching on algorithms and tools.

Neural networks form the basis of many state of the art machine learning applications. Neural networks with multiple layers of non-linear processing units (known as deep learning) form the backbone of the most impressive applications of machine learning today. These artificial neural networks are inspired by the biological neural networks inside our brains.

Convolutional neural networks have revolutionised computer vision through a design based on the structure of our own visual cortex. Recurrent neural networks (specifically the LSTM implementation) have transformed natural language processing by allowing the network to hold state and ‘remember’.

Another key trend in machine learning is the availability of open source tools. Companies such as Alphabet or Facebook make many of their machine learning tools open source and freely available.

It’s important to note that while these technology companies share their tools, they don’t share their data. This is because data is the crucial element in producing value from machine learning. World-class tools and computing power are not enough to deliver value from machine learning – you need data to make the magic happen.

Challenges

Any powerful technology has downsides and drawbacks.

By this point in the article the importance of data to modern machine learning is clear. In fact large datasets are so important for the supervised machine learning algorithms used today that this dependence is a weakness. Many techniques don’t work on small datasets.

Human beings are able to learn from small amounts of training data – burning yourself once on the oven is enough to learn not to touch it again. Many machine learning algorithms are not able to learn in this way.

Another problem in machine learning is interpretability. A model such as a neural network doesn’t immediately lend itself to explanation. The high dimensionality of the input and parameter space means that it’s hard to pin down cause to effect. This can be difficult when considering using a machine learner in a real world system. It’s a challenge the financial industry is struggling with at the moment.

Related to this is the challenge of a solid theoretical understanding. Many academics and computer scientists are uncomfortable with machine learning. We can empirically test if machine learning is working, but we don’t really know why it is working.

Worker displacement from the automation of jobs is a key challenge for humanity in the 21st century. Machine learning is not required for automation, but it will magnify the impact of automation. Political innovations (such as the universal basic income) are needed to fight the inequality that could emerge from the power of machine learning.

I believe it is possible for us to deploy automation and machine learning while increasing the quality of life for all of society. The move towards a machine intelligent world will be a positive one if we share the value created.

In the specific context of the energy industry I see digitisation as a major challenge. By digitisation I mean a system where everything from sensor level data to prices is accessible to employees worldwide in near real time.

It’s not about having a local site plant control system and historian setup. The 21st-century energy company should have all data available in the cloud in real time. This will allow machine learning models deployed to the cloud to help improve the performance of our energy system. It’s easier to deploy a virtual machine in the cloud than to install & maintain a dedicated system on site.

Data is one of the most strategic assets a company can own. It’s valuable not only because of the insights it can generate today, but also the value that will be created in the future. Data is an investment that will pay off.

Stay tuned for Part Two of this series where I will go into detail on some of the applications of machine learning in the energy industry.

Thanks for reading!

CHP Feasibility & Optimization Model v0.3

See the introductory post for this model here.  


This is v0.3 of the open source CHP feasibility and optimization model I am developing. The model is set up with some dummy data. This model is in beta – it is a work in progress!

If you want to get it working for your project all you need to do is change:

  • heat & power demands (Model : Column E-G)
  • prices (Model : Column BD-BF)
  • CHP engine (Input : Engine Library).

You can also optimize the operation of the CHP using a parametric optimization VBA script (Model : Column BQ).

You can download the latest version of the CHP scoping model here.

If you would like to get involved in working on this project please get in touch.

Thanks for reading!

 

Oil Reserves Growth – Energy Basics

Energy Basics is a series covering fundamental energy concepts.


As we consume non-renewable resources, the amount of that resource depletes. Makes sense right?

Yet when it comes to oil reserves we find that oil reserves actually grow over time! This is known as ‘oil reserves growth’. Why does this phenomenon occur?

First, let’s start by defining some relevant terms.

Oil reserves are the amount of oil that can be technically recovered at the current price of oil.

Oil resources are all oil that can be technically recovered at any price.

Oil in place is all the oil in a reservoir (both technically recoverable & unrecoverable oil).

oil reserves growth
Figure 1 – Proved oil reserves and Brent crude oil price (BP Statistical Review 2016)

So why do reserves grow over time? There are three reasons.

One – Geological estimates

Initial estimates of the oil resource are often low. It’s very difficult to estimate the amount of oil in a reservoir as you can’t directly measure it. Often a lot of computing power is thrown at trying to figure out how much oil is underground.

It’s also good engineering practice to stay on the low side when estimating for any project. I expect geologists intentionally do the same for geological estimates.

Two – Oil prices

Oil reserves are a direct function of the current oil price. Increasing oil prices means that more of the oil resource can be classed as an oil reserve.

Historically we have seen oil prices increase – leading to growth in oil reserves (even with the oil resource being depleted at the same time).
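The relationship between price, resource and reserves can be sketched with a toy model. The resource is split into blocks, each with a volume and an extraction cost, and reserves are the blocks that are economic at the current price. All of the volumes, costs and prices below are hypothetical.

```python
def reserves(resource_blocks, oil_price):
    """Sum the volumes whose extraction cost is at or below the oil price."""
    return sum(volume for volume, cost in resource_blocks if cost <= oil_price)


def deplete(resource_blocks, volume):
    """Produce `volume` from the cheapest blocks first and return what remains."""
    remaining = []
    for block_volume, cost in sorted(resource_blocks, key=lambda block: block[1]):
        produced = min(block_volume, volume)
        volume -= produced
        remaining.append((block_volume - produced, cost))
    return remaining


# Hypothetical resource: (billion barrels, extraction cost in $ per barrel)
blocks = [(100, 20.0), (80, 45.0), (60, 70.0), (40, 110.0)]

before = reserves(blocks, oil_price=50.0)
after_production = deplete(blocks, volume=30)
after = reserves(after_production, oil_price=80.0)

print(before, after)
```

In this toy example 30 billion barrels are produced, so the resource falls from 280 to 250, yet reserves grow from 180 to 210 because the price rose from $50 to $80.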

But increasing prices can also have secondary effects. A higher price might incentivise an oil company to invest more into an existing field – leading to an increase in oil recovery.

The reserves growth of existing fields can actually be responsible for the majority of additions to reserves. Between 1977 and 1995 approximately 89% of the additions to US proved reserves of crude oil were due to oil reserves growth rather than the discovery of new fields.

Three – Technology

Improvements in technology have two effects. The first is to make more of the oil in place technically recoverable at any price (i.e. to increase the oil resource). Hydraulic fracturing (fracking) and horizontal drilling now allow access to oil that previously was technically unrecoverable.

The second is that as technology improves it also gets cheaper. This improvement in economics means that more of the oil resource can be classed as an oil reserve (even at constant or falling prices).


Thanks for reading!

The Complexity of a Zero Carbon Grid – Energy Insights

Energy Insights is a series highlighting interesting energy content from around the web.

Previous posts in this series include Automated Cars and How to save the energy system.


I’m excited to present this Energy Insights post. I’m highlighting a few interesting insights from the ‘The Complexity of a Zero Carbon Grid’ show.

This is very special as The Interchange podcast has only recently been publicly relaunched.

The show considers what may be necessary to get to levels of 80-100% renewables. Stephen Lacey and Shayle Kann host the show with Jesse Jenkins as the guest.

The concept of flexibility

Jenkins observes that the concept of flexibility of electrical capacity is appearing in the literature. Flexibility means how quickly an asset is able to respond to change.

A combined cycle gas turbine plant is usually more flexible than a coal or nuclear generator. One reason for this is the ability to control plant electric output by modulating the supplementary burner gas consumption.


We will need flexibility on a second, minute, hourly or seasonal basis.

This concept of flexibility was also recently touched on by the excellent Energy Analyst blog. Patrick Avis notes that we need both flexibility (kW or kW/min) and capacity (kWh) for a high renewables scenario.

The post ‘Flexibility in Europe’s power sector’ could easily be enough material for a few Energy Insights posts. Well worth a read.

One investment cycle away

Jenkins observes that the investment decisions we make today will affect how we decarbonise in the future. Considering the lifetime of many electricity generation assets, we find that we are only a single investment cycle away from building plants that will be operating in 2050.

Most deep decarbonisation roadmaps include essentially zero carbon electricity by 2050. We need to ensure that when the next investment cycle begins we are not installing carbon intense generation as it would still be operating in 2050.

For both gas and coal the implied cutoff date for plant operation to begin is between 2010 and 2020.
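The cutoff range follows from simple arithmetic if we assume plant lifetimes of roughly 30 to 40 years. That lifetime range is my assumption for illustration, not a figure quoted from the show.

```python
def latest_start_year(target_year, lifetime_years):
    """Latest year a plant can start operating and still retire by the target year."""
    return target_year - lifetime_years


# Assumed lifetimes of 30 and 40 years against a 2050 zero carbon target
cutoffs = [latest_start_year(2050, lifetime) for lifetime in (40, 30)]
print(cutoffs)  # [2010, 2020]
```

Any carbon intense plant that starts operating after its cutoff year would need to retire early, or be retrofitted, to fit within a 2050 target.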

Increasing marginal challenge of renewables deployment

The inverse relationship between the level of deployment of renewables and the marginal value added is well known. Jenkins notes that this relationship also applies to the deployment of storage and demand side response.

As renewable deployment increases the challenges for both storage and demand side response also increase.

Seasonal storage technologies

1 – Power to gas

Electricity -> hydrogen -> synthetic methane.

Figure 3 – Apros Power to Gas

Intermittency of the supply of excess renewable generation means that a power to gas asset wouldn’t be fully utilized.

The show didn’t cover the possibility of storing electricity to allow a constant supply of electricity to the power to gas asset.

2 – Underground thermal

Limited to demonstration scale.

The show didn’t cover the feasibility of generating electricity from the stored heat.

I would expect that the temperature of the stored heat is low.  Perhaps the temperature could be increased with renewable powered heat pumps.


Thanks for reading!

 

energy_py – reinforcement learning in energy systems

energy_py is reinforcement learning in energy systems.  It’s a reinforcement learning agent and environment built in Python.

I have a vision of using reinforcement learners to optimally operate energy systems.  energy_py is a step towards this vision.  I’ve built this because I’m so excited about the potential of reinforcement learning in the energy industry.

Reinforcement learning in energy systems requires first proving the concepts in a virtual environment.  This project demonstrates the ability of reinforcement learning to control a virtual energy environment.

What is reinforcement learning

supervised vs unsupervised vs reinforcement

Reinforcement learning is the branch of machine learning where learning occurs through action.  Reinforcement learning will give us the tools to operate our energy systems at superhuman levels of performance.

It’s quite different from supervised learning. In supervised learning we start out with a big data set of features and our target. We train a model to replicate this target from patterns in the data.

In reinforcement learning we start out with no data. The agent generates data by interacting with the environment. The agent then learns patterns in this data. These patterns help the agent to choose actions that maximize total reward.

Why do we need reinforcement learning in energy systems

Optimal operation of energy assets is already very challenging. Our current energy transition is making this difficult problem even harder. The rise of intermittent and distributed generation is introducing volatility and increasing the number of actions available to operators.

For a wide range of problems machine learning results are both state of the art and better than human experts. We can get this level of performance using reinforcement learning in our energy systems.

Today many operators use rules or abstract models to dispatch assets. A set of rules is not able to guarantee optimal operation in many energy systems.

Optimal operating strategies can be developed from abstract models. Yet abstract models (such as linear programming) are often constrained. These models are limited to approximations of the actual plant.  Reinforcement learners are able to learn directly from their experience of the actual plant.

Reinforcement learning can also deal with non-linearity. Most energy systems exhibit non-linear behavior (in fact an energy balance is bi-linear!). Reinforcement learning can model non-linearity using neural networks. It is also able to deal with the non-stationary and hidden environment in many energy systems.


There are challenges to be overcome. The first and most important is safety. Safety is the number one concern in any engineering discipline. What is important to understand is we limit the actions available to the agent. All lower levels or systems of controls would remain exactly the same.

There is also the possibility to design the reward function to incentivize safety. A well-designed reinforcement learner could actually reduce hazards to operators.

A final challenge worth addressing is the impact such a learner could have on employment. Machine learning is not a replacement for human operators. A reinforcement learner would not need a reduction in employees to be a good investment.

The value of using a reinforcement learner is to let operations teams do their jobs better. It will allow them to spend more time on their remaining responsibilities, such as maintaining the plant, and to perform better at them. The value created here is a better-maintained plant and a happier workforce – in a plant that is operating with superhuman levels of economic and environmental performance.

Any machine requires downtime – a reinforcement learner is no different. There will still be time periods where the plant will operate in manual or semi-automatic modes with human guidance.

energy_py is one step on a long journey of getting reinforcement learners helping us in the energy industry. The fight against climate change is the greatest that humanity faces. Reinforcement learning will be a key ally in fighting it.

Guide to the project

energy_py is built using Python. You can check out the repository on GitHub here.

core.py
– Creates environment, agent
– episode 0 = naive episode where all assets at 100% load for entire episode
– episodes 1 to n = user defined number of episodes (n) with epsilon-greedy policy
– episode n+1 = greedy episode, epsilon = 0 (no exploration)
– outputs saved every x episodes (user defined)
– saves Keras model weights after episode n+1

agents.py – Q_learner(class)
– approximating Q(s,a) using a Keras neural network
– single episode runs within agent
– trains after each step using replay memory
– policy can be set to:
     naive – action is always the maximum of the available action space
     e-greedy – with probability e select random action, else optimal
     greedy – always select optimal action (as per current value function)
– episode stops when env.done == True
– a number of helper functions deal with generating the variable action space

– Q_learner.output() function generates charts and CSVs of data
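
The e-greedy and greedy policies listed above can be sketched in a few lines of Python. This is a minimal illustration rather than the actual Q_learner code: the function name select_action and its arguments are assumptions made for this example.

```python
import random


def select_action(q_values, actions, epsilon, rng=random):
    """Pick one of `actions` using an epsilon-greedy rule.

    q_values holds one Q(s,a) estimate per entry in `actions`.
    Setting epsilon to 0 gives the purely greedy policy.
    """
    if rng.random() < epsilon:
        # explore: uniform random choice over the available actions
        return rng.choice(actions)
    # exploit: the action with the highest estimated value
    best_index = max(range(len(actions)), key=lambda i: q_values[i])
    return actions[best_index]


# greedy selection (epsilon = 0) always returns the highest valued action
greedy_choice = select_action([1.0, 3.0, 2.0], ["a", "b", "c"], epsilon=0.0)
```

With epsilon = 1 every action is random, which is roughly where an agent with no experience would start.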


environments.py – energy_py(class)
– energy system environment
– ability to use multiple energy asset models – see assets/library.py
– reward based on net energy cost
– state = energy demands & prices
– actions = load and binary variable for each energy asset

– episode length = maximum depends on amount of data in assets/time_series.csv


assets/library.py
– models of energy assets
– currently two available – gas engine or gas turbine (user defined size)
– each class has the same set of methods (would be ideal for a base class)
– allows iteration over list of assets to get technical outputs (power generated, gas consumed etc)

assets/value_functions.py

– Keras models to approximate the action value function Q(s,a)
– currently only one model – a Sequential Keras model
– Keras model structured with
     input of [state, action]
     output of Q(s,a) (single node)
     input is a normalized 1-D numpy array of [state, action]
– agent is agnostic to the specific Keras model


Structuring the Keras model to output a single Q(s,a) means we need n forward passes to consider n [state, actions]. I’ve done this because I deal with a variable set of [state, actions].
 
This makes action selection and training expensive as both require estimating Q(s,a) for a large number of [state, actions]. This means a lot of forward passes across the network!
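
To make the cost concrete, here is a rough sketch of action selection with a single-output value function. The function toy_q_function is a hypothetical stand-in for the Keras model, and best_action is a name invented for this example. The point is only that n candidate actions need n evaluations.

```python
def best_action(state, actions, q_function):
    """Evaluate Q(s,a) once per available action and return the best one."""
    # one input vector [state, action] per candidate action
    inputs = [list(state) + list(action) for action in actions]
    # n candidate actions means n evaluations of the value function
    q_values = [q_function(x) for x in inputs]
    best_index = max(range(len(actions)), key=lambda i: q_values[i])
    return actions[best_index], q_values


def toy_q_function(x):
    # hypothetical stand-in for the Keras model: any callable that maps
    # a [state, action] vector to a single number would work here
    price, demand, load = x
    return price * load - 0.5 * load ** 2


action, q_values = best_action(
    state=(3.0, 10.0),
    actions=[(0.0,), (2.0,), (5.0,)],
    q_function=toy_q_function,
)
```

A network with one output node per action would need only one forward pass per state, but that design assumes a fixed set of actions.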
 

You can see how the action space is generated by looking at environments.energy_py.create_action_space(last_actions).

Two important points

1 – reward calculated using next state (s’)
– actions a are applied to the next state (s’) (unseen by the agent until the next time step)

– this means the agent is forced to do some time series forecasting.  This is intended behavior

2 – variable action space
– the actions available to the agent are dependent on the previous actions

– to account for this I have added a number of different methods to the energy_py() environment

Open AI gym integration

The Open AI gym paradigm has inspired the design of energy_py.  I would love to one day have an energy_py environment in the Open AI GitHub repo!  

I’ve integrated with the Open AI gym project in the following ways:
– inherits the gym.Env base class from the Open AI gym project
– makes use of step & reset methods as per gym env objects

– makes use of the gym.spaces objects
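
As an illustration of the step and reset convention, below is a toy environment that follows the same interface. It deliberately does not import gym so that it runs on its own. The class ToyEnergyEnv, its price-only state and its reward are all simplifications invented for this sketch, not the energy_py environment.

```python
import random


class ToyEnergyEnv:
    """Toy environment following the reset/step convention of gym.Env."""

    def __init__(self, prices):
        self.prices = prices
        self.step_index = 0
        self.done = False

    def reset(self):
        # start a new episode and return the first observation
        self.step_index = 0
        self.done = False
        return self.prices[self.step_index]

    def step(self, action):
        # the action is applied to the next state, which the agent
        # has not seen at the time it chooses the action
        self.step_index += 1
        next_price = self.prices[self.step_index]
        reward = next_price * action
        self.done = self.step_index == len(self.prices) - 1
        return next_price, reward, self.done, {}


env = ToyEnergyEnv(prices=[10.0, 20.0, 30.0])
rng = random.Random(0)
state = env.reset()
total_reward = 0.0
steps = 0
while not env.done:
    action = rng.choice([0.0, 1.0])  # random policy, for illustration only
    state, reward, done, info = env.step(action)
    total_reward += reward
    steps += 1
```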

Energy engineering modeling

It’s easy to change the energy engineering parameters of the energy_py environment. The user can add more or different types of assets. The environment is flexible enough to model an electricity only generator all the way through to a Combined Heat, Cooling & Power plant.
 
Features of the energy engineering modeling
– add or remove assets from the list of asset_models
– add, remove or change state variables by changing energy_py.state_models and assets/time_series.csv.  Note that the order of the dictionaries in energy_py.state_models should match the headers of the time_series.csv
– actions are applied to next state (as detailed above)
– the state includes demands for high-grade heat, low-grade heat, electricity and cooling
– energy balances are done for electricity, high-grade heat, low-grade heat and cooling
     Any heat not supplied by the assets is supplied by a gas boiler operating at 80 % HHV
     Any cooling not supplied by the assets is supplied by an electric chiller operating at a COP of 3
– no operation & maintenance costs are modeled
– all gas prices, efficiencies are on a higher heating value (ie gross) basis
– heat, electricity and cooling measured in MW
– reward is calculated based on the net energy cost

net energy cost = export electricity revenue – (gas cost + import electricity cost)
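
The energy balance and reward for a single time step could be sketched as below. This is a simplified illustration, not the energy_py implementation: it uses a single heat stream rather than separate high-grade and low-grade heat, and the function name, argument names, prices and the half-hour step length are assumptions. The boiler efficiency and chiller COP are the values given in the list above.

```python
BOILER_EFFICIENCY = 0.80  # gas boiler, HHV basis
CHILLER_COP = 3.0         # electric chiller


def net_energy_cost(elec_demand, heat_demand, cooling_demand,
                    elec_generated, heat_generated, cooling_generated,
                    asset_gas, gas_price, import_price, export_price,
                    step_hours=0.5):
    """Reward for one step. Energy flows in MW, prices per MWh."""
    # any heat not supplied by the assets comes from the gas boiler
    boiler_heat = max(heat_demand - heat_generated, 0.0)
    boiler_gas = boiler_heat / BOILER_EFFICIENCY
    # any cooling not supplied by the assets comes from the electric
    # chiller, which adds to the site electricity demand
    chiller_cooling = max(cooling_demand - cooling_generated, 0.0)
    chiller_elec = chiller_cooling / CHILLER_COP
    # electricity balance: a shortfall is imported, a surplus is exported
    net_elec = elec_demand + chiller_elec - elec_generated
    elec_import = max(net_elec, 0.0)
    elec_export = max(-net_elec, 0.0)
    gas_cost = (asset_gas + boiler_gas) * gas_price * step_hours
    import_cost = elec_import * import_price * step_hours
    export_revenue = elec_export * export_price * step_hours
    return export_revenue - (gas_cost + import_cost)


# hypothetical numbers for a single half-hour step
reward = net_energy_cost(
    elec_demand=10.0, heat_demand=8.0, cooling_demand=3.0,
    elec_generated=7.5, heat_generated=6.0, cooling_generated=0.0,
    asset_gas=20.0, gas_price=20.0, import_price=50.0, export_price=40.0,
)
```

In this example the site is a net importer of electricity, so the reward is negative. The agent can only improve it by lowering the total gas and import bill.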

Results

I’m really looking forward to getting to know energy_py.  There are a number of parameters to tune. For example the structure of the environment or the design of the reward function can be modified to make the reinforcement learning problem more challenging.

One key design choice is the number of assets the agent has to control.  The more choices available to the agent the more complex the shape of the value function becomes.  To approximate a more complex value function we may need a more complex neural network.

There is also a computational cost incurred with increasing the number of actions.  More actions means more [state, actions] to consider during action selection and training (both of which require value function predictions).

So far I’ve been experimenting with an environment based on two assets – a 7.5 MWe gas turbine and a 7.5 MWe gas engine.  The episode length is set to 336 steps (one week).  I run a single naive episode, 30 ε-greedy episodes and a single greedy episode.

Figure 1 below shows the total reward per episode increasing as the agent improves its estimate of the value function and spends less time exploring.

Figure 1 – Epsilon decay and total reward per episode

Figure 2 shows the Q-test and the network training history.  Q-test is the average of three random [state, actions] evaluated by the value function.  It shows how the value function approximation changes over time.

Figure 2 – Q-Test and the network training history

Figure 3 shows some energy engineering outputs. I’m pretty happy with this operating regime – the model is roughly following both the electricity price and the heat demand which is expected behaviour.

Figure 3 – Energy engineering outputs for the final greedy run

 One interesting decision to make is how often to improve the approximation of the value function.  David Silver makes the point that you should make use of the freshest data when training – hence training after each step through the environment.  He also makes the point that you don’t need to fully train the network – just train a ‘little bit’.

This makes sense as the distribution of the data (the replay memory) will change as the learner trains its value function. I train on a 64 sample batch of the replay memory. I perform 100 passes over the entire data set (i.e. 100 epochs). Both values can be optimized. It could make sense to train for more epochs in later episodes as we want to fit the data more than in earlier episodes.
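
A replay memory with random batch sampling can be sketched with the standard library. The class name ReplayMemory, its capacity argument and the tuple layout are assumptions for this example rather than the structure used in energy_py. Only the batch size of 64 comes from the text above.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-size store of experience tuples with uniform random sampling."""

    def __init__(self, capacity, seed=None):
        # the oldest experiences are dropped once capacity is reached
        self.memory = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size=64):
        # early in training there can be fewer experiences than batch_size
        size = min(batch_size, len(self.memory))
        return self.rng.sample(list(self.memory), size)


memory = ReplayMemory(capacity=1000, seed=0)
for step in range(100):
    memory.add(state=step, action=0, reward=0.0, next_state=step + 1, done=False)
batch = memory.sample()
```

Each batch would then be used to fit the value function for a small number of epochs before the agent takes its next step.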

Another challenge in energy_py is balancing exploration versus exploitation.  The Q_learner algorithm handles this dilemma using ε-greedy action selection.  I decay epsilon at a fixed rate – the optimal selection of this parameter is something I’ll take a look at in the future.
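
One simple way to decay epsilon at a fixed rate is a linear schedule with a floor. The sketch below assumes a linear decay, and the decay rate and minimum value are hypothetical numbers chosen for illustration. Only the count of 30 episodes comes from the experiment described above.

```python
def decay_epsilon(epsilon, decay_rate=0.03, minimum=0.1):
    """Reduce epsilon by a fixed amount per episode, never below minimum."""
    return max(epsilon - decay_rate, minimum)


epsilons = []
epsilon = 1.0
for episode in range(30):
    epsilons.append(epsilon)  # epsilon used during this episode
    epsilon = decay_epsilon(epsilon)
```

A faster decay means less exploration before the agent starts to rely on its own value function.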

There are many exciting innovations developed recently in reinforcement learning that I’m keen to add to energy_py.  One example is the idea of Prioritized Experience Replay – where the batch is not taken randomly from the replay memory but instead prioritizes some samples over others.

It’s unlikely that I’ll ever catch up to the state of the art in reinforcement learning – what I hope to find is that we don’t need state of the art techniques to get superhuman performance from energy systems!

Future changes for energy_py

Python/programming
– create a base class for assets
– add more complex value function approximations (1-D convolutional or recurrent neural networks)
– learner based on policy gradients
– action space to only have [0,0] rather than [67,0]
– ability to select which variables to use as state

Energy engineering model

– abstraction of steam headers – ability to link with heat demands etc
– heat and mass balances
– deaerator modeling
– energy storage (thermal or electrical), which would be another asset to operate
– expanded engine library (prime mover types & engine models)
– penalty for start/stops
– O&M costs

My reinforcement learning journey

I’m a chemical engineer by training (B.Eng, MSc) and an energy engineer by profession. I’m really excited about the potential of machine learning in the energy industry – in fact that’s what this blog is about!

My understanding of reinforcement learning has come from a variety of resources. I’d like to give credit to all of the wonderful resources I’ve used to understand reinforcement learning.

Sutton & Barto – Reinforcement Learning: An Introduction – the bible of reinforcement learning and a classic machine learning text.

Playing Blackjack with Monte Carlo Methods – I built my first reinforcement learning model to operate a battery using this post as a guide. This post is part two of an excellent three part series. Many thanks to Brandon of Δ ℚuantitative √ourney.

RL Course by David Silver – over 15 hours of lectures from Google DeepMind’s lead programmer – David Silver. Amazing resource from a brilliant mind and brilliant teacher.

Deep Q-Learning with Keras and gym – great blog post that showcases code for a reinforcement learning agent to control an Open AI Gym environment. Useful both for the gym integration and using Keras to build a non-linear value function approximation. Many thanks to Keon Kim – check out his blog here.

Artificial Intelligence and the Future – Demis Hassabis is the co-founder and CEO of Google DeepMind.  In this talk he gives some great insight into the AlphaGo project.

Mnih et al. (2013) Playing Atari with Deep Reinforcement Learning – to give you an idea of the importance of this paper – Google purchased DeepMind after this paper was published. DeepMind was a company with no revenue, no customers and no product – valued by Google at $500M! This is a landmark paper in reinforcement learning.

Mnih et al. (2015) Human-level control through deep reinforcement learning – an update to the 2013 paper published in Nature.

I would also like to thank Data Science Retreat.  I’m just finishing up the three month immersive program – energy_py is my project for the course.  Data Science Retreat has been a fantastic experience and I would highly recommend it.  The course is a great way to invest in yourself, develop professionally and meet amazing people.

That’s it from me – thanks for reading!

Energy Basics – Capacity Factor

All men & women are created equal. Unfortunately the same is not true for electricity generating capacity.
 
Capacity on its own is worthless – what counts is the electricity generated (kWh) from that capacity (kW). If the distinction between kW and kWh is not clear this previous post will be useful.
 

Capacity factor is one way to quantify the value of capacity. It’s the actual electricity (kWh) generated as a percentage of the theoretical maximum (operation at maximum kW).

For example to calculate the capacity factor on an annual basis:

annual capacity factor = electricity generated in the year (kWh) / (capacity (kW) x 8,760 hours)
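
As a minimal sketch, the annual calculation is actual generation divided by the generation from running at full capacity for all 8,760 hours of a non-leap year. The function name and the 1 MW plant below are hypothetical and chosen only for illustration.

```python
HOURS_PER_YEAR = 8760  # non-leap year


def capacity_factor(generation_kwh, capacity_kw, hours=HOURS_PER_YEAR):
    """Actual generation as a fraction of the theoretical maximum."""
    return generation_kwh / (capacity_kw * hours)


# hypothetical 1 MW (1,000 kW) plant that generated 2,500 MWh in a year
cf = capacity_factor(generation_kwh=2_500_000, capacity_kw=1_000)
```

For this made-up plant the result is a capacity factor of about 29%.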
 
 

There are many reasons why capacity will not generate as much as it could.

Three major reasons are maintenance, unavailability of fuel and economics.

 
Maintenance
 
Burning fossil fuels creates a challenging engineering environment. The core of a gas turbine is high pressure & temperature gases rapidly rotating blazing hot metal. Coal power stations generate electricity by high pressure steam forcing a steam turbine to spin incredibly fast.
 
These challenges mean that fossil fuel plants need a lot of maintenance. The time when the plant is being maintained is time the capacity isn’t generating electricity.
 
Renewables plants need a lot less maintenance than a fossil fuel generator. No combustion means there is a lot less stress on equipment.
 
Availability of fuel
 
Yet while renewables come ahead in terms of maintenance, they fall behind due to a constraint that fossil fuel generation usually doesn’t suffer from – unavailability of fuel.
 
This is why renewables like wind & solar are classed as intermittent. Fuel is often not available meaning generation is often not possible.
 
Solar panels can’t generate at night. Wind turbines need wind speeds to be within a certain range – not too low, not too high – just right.
 
This means that wind & solar plants are often not able to generate at full capacity – or even to generate at all. This problem isn’t common for fossil fuel generation. Fossil fuels are almost always available through natural gas grids or on site coal storage.
 
Economics
 
The final reason for capacity to not generate is economics.
 
The relative price of energy and regulations change how fossil fuel capacity is dispatched. Today’s low natural gas price environment is the reason why coal capacity factors have been dropping.
 
Here renewables come out way ahead of fossil fuels. As the fuel is free renewables can generate electricity at much lower marginal cost than fossil fuels. Wind & solar almost always take priority over fossil fuel generation.
 
Typical capacity factors
 
The capacity factor wraps up and quantifies all of the factors discussed above.
 
Table 1 – Annual capacity factors (2014-2016 US average)

Technology               Coal     CCGT     Wind     Solar PV   Nuclear
Annual capacity factor   56.13%   53.40%   33.63%   26.30%     92.17%

 

Table 1 gives us quite a bit of insight into the relative value of different electricity generating technologies. The capacity factor for natural gas is roughly twice as high as solar PV.

 

We could conclude that 1 MW of natural gas capacity is worth around twice as much as 1 MW of solar PV.

How useful is the capacity factor?

Yet the capacity factor is not a perfect measure of how valuable capacity is. Taking the average of anything loses information – capacity factor is no different.
 
Two plants operating in quite different ways can have the same capacity factor. A plant that operated at 50% of capacity for the entire year and a plant that generated for half of the year at full capacity will both have an identical capacity factor.
 
The capacity factor loses information about the time of energy generation. The time of generation & demand is a crucial element in almost every energy system.
 
Generation during a peak can be a lot more valuable to the world than generation at other times. Because of the nature of dispatchable generation it is more likely to be running during a peak.
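
A small worked example shows how two plants with the same capacity factor can earn different amounts. Everything here is hypothetical: a single 24 hour day, two 1 MW plants and a made-up price profile with an evening peak.

```python
def capacity_factor(output_mw, capacity_mw=1.0):
    """Average output as a fraction of capacity over the period."""
    return sum(output_mw) / (capacity_mw * len(output_mw))


def revenue(output_mw, prices):
    """Revenue over hourly steps, with output in MW and prices per MWh."""
    return sum(mw * price for mw, price in zip(output_mw, prices))


hours = range(24)
# hypothetical prices: 100 per MWh during an evening peak, 30 otherwise
prices = [100.0 if 17 <= h <= 19 else 30.0 for h in hours]

# plant A runs at half load all day
plant_a = [0.5 for h in hours]
# plant B runs at full load for twelve hours that include the peak
plant_b = [1.0 if 8 <= h <= 19 else 0.0 for h in hours]

cf_a, cf_b = capacity_factor(plant_a), capacity_factor(plant_b)
revenue_a, revenue_b = revenue(plant_a, prices), revenue(plant_b, prices)
```

Both plants have a capacity factor of 50%, yet plant B earns more because a larger share of its output lands in the peak hours.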
 
This leads us to conclude that low capacity factor generation could be more valuable than higher capacity factor generation, provided it runs at the right times. Timing is a particular problem for solar in many countries as a) the peak often occurs when the sun is down and b) all solar generation is coincident.
 
The solution to the intermittency problem of renewables is storage. Storage will allow intermittent generation to be used when it’s most valuable – not just whenever it happens to be windy or sunny.
 
Thanks for reading!