From Google’s Deep Mind:
In recent years, artificial intelligence agents have succeeded in a range of complex game environments. For instance, AlphaZero beat world-champion programs in chess, shogi, and Go after starting out with knowing no more than the basic rules of how to play. Through reinforcement learning (RL), this single system learnt by playing round after round of games through a repetitive process of trial and error. But AlphaZero still trained separately on each game — unable to simply learn another game or task without repeating the RL process from scratch.
…Today, we published “Open-Ended Learning Leads to Generally Capable Agents,” a preprint detailing our first steps to train an agent capable of playing many different games without needing human interaction data. We created a vast game environment we call XLand, which includes many multiplayer games within consistent, human-relatable 3D worlds. This environment makes it possible to formulate new learning algorithms, which dynamically control how an agent trains and the games on which it trains. The agent’s capabilities improve iteratively as a response to the challenges that arise in training, with the learning process continually refining the training tasks so the agent never stops learning. The result is an agent with the ability to succeed at a wide spectrum of tasks — from simple object-finding problems to complex games like hide and seek and capture the flag, which were not encountered during training. We find the agent exhibits general, heuristic behaviours such as experimentation, behaviours that are widely applicable to many tasks rather than specialised to an individual task. This new approach marks an important step toward creating more general agents with the flexibility to adapt rapidly within constantly changing environments. (Bold added, AT).
In other news, South Africa awarded the first ever patent to an AI.