energy_py – reinforcement learning for energy systems

If you just want to skip to the code, the energy_py library is here.

energy_py is reinforcement learning for energy systems.  

Using reinforcement learning agents to control virtual energy environments is the first step towards using reinforcement learning to optimize real-world energy systems. This is a professional mission of mine – to use reinforcement learning to control real world energy systems.

energy_py supports this goal by providing a collection of reinforcement learning agents, energy environments and tools to run experiments.

What is reinforcement learning

supervised vs unsupervised vs reinforcement

Reinforcement learning is the branch of machine learning where an agent learns to interact with an environment.  Reinforcement learning can give us generalizable tools to operate our energy systems at superhuman levels of performance.

It’s quite different from supervised learning. In supervised learning we start out with a big data set of features and our target. We train a model to replicate this target from patterns in the data.

In reinforcement learning we start out with no data. The agent generates data (sequences of experience) by interacting with the environment. The agent uses it’s experience to learn how to interact with the environment. In reinforcement learning we not only learn patterns from data, we also generate our own data.

This makes reinforcement learning more democratic than supervised learning. The reliance on massive amounts of labelled training data gives companies with unique datasets an advantage. In reinforcement learning all that is needed is an environment (real or virtual) and an agent.

If you are interested in reading more about reinforcement learning, the course notes from a one-day introductory course I teach are hosted here.

Why do we need reinforcement learning in energy systems

Optimal operation of energy assets is already very challenging. Our current energy transition makes this difficult problem even harder.

The rise of intermittent generation is introducing uncertainty on the generation and demand side. The rise of distributed generators and increasing the number of actions available to operators.

For a wide range of problems machine learning results are both state of the art and better than human experts. We can get this level of performance using reinforcement learning in our energy systems.

Today many operators use rules or abstract models to dispatch assets. A set of rules is not able to guarantee optimal operation in many energy systems.

Optimal operating strategies can be developed from abstract models. Yet abstract models (such as linear programming) are often constrained. These models are limited to approximations of the actual plant.  Reinforcement learners are able to learn directly from their experience of the actual plant. These abstract models also require significant amount of bespoke effort by an engineer to setup and validate.

With reinforcement learning we can use the ability of the same agent to generalize to a number of different environments. This means we can use a single agent to both learn how to control a battery and to dispatch flexible demand. It’s a much more scalable solution than developing site by site heuristics or building an abtract model for each site.

beautiful wind turbines

There are challenges to be overcome. The first and most important is safety. Safety is the number one concern in any engineering discipline.

I believe that by reinforcement learning should be first applied on as high a level of the control system as possible. This allows the number of actions to be limited and existing lower level safety & control systems can remain in place. The agent is limited to only making the high level decisions operators make today.

There is also the possibility to design the reward function to incentivize safety. A well-designed reinforcement learner could actually reduce hazards to operators. Operators also benefit from freeing up more time for maintenance.

A final challenge worth addressing is the impact such a learner could have on employment. Machine learning is not a replacement for human operators. A reinforcement learner would not need a reduction in employees to be a good investment.

The value of using a reinforcement learner is to let operations teams do their jobs better.
It will allow them to spend more time and improve performance for their remaining responsibilities such as maintaining the plant.  The value created here is a better-maintained plant and a happier workforce – in a plant that is operating with superhuman levels of economic and environmental performance.

Any machine requires downtime – a reinforcement learner is no different. There will still be time periods where the plant will operate in manual or semi-automatic modes with human guidance.

energy_py is one step on a long journey of getting reinforcement learners helping us in the energy industry. The fight against climate change is the greatest that humanity faces. Reinforcement learning will be a key ally in fighting it. You can checkout the repository on GitHub here.

Results

The best place to take a look at the library is the example of using Q-Learning to control a battery. The example is well documented in this Jupyter Notebook and this blog post.

My reinforcement learning journey

I’m a chemical engineer by training (B.Eng, MSc) and an energy engineer by profession. I’m really excited about the potential of machine learning in the energy industry – in fact that’s what this blog is about!

My understanding of reinforcement learning has come from a variety of resources. I’d like to give credit to all of the wonderful resources I’ve used to understand reinforcement learning.

Sutton & Barto – Reinforcement Learning: An Introduction – the bible of reinforcement learning and a classic machine learning text.

Playing Blackjack with Monte Carlo Methods – I built my first reinforcement learning model to operate a battery using this post as a guide. This post is part two of an excellent three part series. Many thanks to Brandon of Δ ℚuantitative √ourney.

RL Course by David Silver – over 15 hours of lectures from Google DeepMind’s lead programmer – David Silver. Amazing resource from a brilliant mind and brillaint teacher.

Deep Q-Learning with Keras and gym – great blog post that showcases code for a reinforcement learning agent to control a Open AI Gym environment. Useful both for the gym integration and using Keras to build a non-linear value function approximation. Many thanks to Keon Kim – check out his blog here.

Artificial Intelligence and the Future – Demis Hassabis is the co-founder and CEO of Google DeepMind.  In this talk he gives some great insight into the AlphaGo project.

Minh et. al (2013) Playing Atari with Deep Reinforcement Learning – to give you an idea of the importance of this paper – Google purchased DeepMind after this paper was published.  DeepMind was a company with no revenue, no customers and no product – valued by Google at $500M!  This is a landmark paper in reinforcement learning.

Minh et. al (2015) Human-level control through deep reinforcement learning – an update to the 2013 paper published in Nature.

I would also like to thank Data Science Retreat.  I’m just finishing up the three month immersive program – energy_py is my project for the course.  Data Science Retreat has been a fantastic experience and I would highly recommend it.  The course is a great way to invest in yourself, develop professionally and meet amazing people.