All hail OpenAI Five

Dota fans are underestimating OpenAI, making the same mistake Go fans made with AlphaGo.


One of the most exciting areas in reinforcement learning today is OpenAI’s work on playing Dota 2. The OpenAI Dota 2 agent is currently facing a sequence of progressively more difficult human opponents. Dota 2 is a video game that presents reinforcement learning challenges such as partially observed states and high-dimensional state and action spaces.

OpenAI previously created a world-class bot with significant gameplay restrictions - the most important being a restriction to a 1v1 game. Dota 2 is a 5v5 game. The current OpenAI agent (based on Proximal Policy Optimization, or PPO, and powered by gigantic amounts of experience and computation) plays the 5v5 game with some limits on gameplay.
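For readers who have not met PPO, here is a minimal sketch of its clipped surrogate objective in plain Python. This is an illustration of the general technique only, not OpenAI's implementation. The function name `ppo_clip_objective`, its arguments and the default `clip_eps=0.2` are assumptions made for this example.

```python
import math


def ppo_clip_objective(new_logps, old_logps, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch of sampled actions.

    new_logps / old_logps: log-probabilities of the sampled actions under
    the current policy and under the policy that collected the data.
    advantages: advantage estimates for the same actions.
    All names here are hypothetical, chosen for this sketch.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio between the new and old policy for this action.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

The clipping removes the incentive to move the policy far from the one that generated the experience in a single update, which is part of what makes PPO stable enough to run at very large scale.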

A common comment among Dota 2 fans is that the restrictions on the gameplay are the reason why the machine can compete with humans. This is incorrect.

A more complex game would actually allow more dimensions in which a machine could outperform a human. Being able to see patterns in high dimensional spaces is what modern machine learning is all about. Allowing the use of core mechanics such as wards will give machines additional ways to outperform humans.

The more complex the game, the higher the skill cap. Because the OpenAI agent can learn from so much experience (well beyond what any human being could achieve), the sample inefficiency of modern reinforcement learning can be ignored. Sample inefficiency is the key challenge in modern RL, and OpenAI is overcoming it through supervillain levels of CPU and GPU compute.

It is likely that this version of the OpenAI agent will beat every human it faces. Public consensus was in the same incorrect position before each of the AlphaGo matches. AlphaGo never lost a match to a human; its only loss was a single game against Lee Sedol.

OpenAI will be able to quantify the performance of its system. This quantification is a key part of the learning process - if the agent is learning, then metrics exist that track its progress, and gradients can point towards improving them.

It is unlikely that OpenAI would let its agent play against humans if it didn’t believe the agent had a good chance. This informational asymmetry is massively in favour of OpenAI.

Machine learners and Dota fans should all enjoy the series of matches to come - with the OpenAI agent winning every single one. Enjoy the journey, even if you know the destination.

Thanks for reading!

further reading

OpenAI Five

OpenAI 1v1

AlphaGo, in context

Zyori Podcast #130 w/ @trentPax

machine learning in energy - part one - part two