Neural networks discovering quantum feedback strategies from scratch

Florian Marquardt
Max Planck Institute for the Science of Light, Erlangen, Germany

Machine learning with artificial neural networks is revolutionizing science. The most advanced challenges require discovering answers autonomously. This is the domain of reinforcement learning, where control strategies are improved according to a reward function. The power of neural-network- based reinforcement learning has been highlighted by spectactular recent successes, such as playing Go, but its benefits for physics are yet to be demonstrated. Here, we show how a network-based “agent” can discover complete quantum-error-correction strategies, protecting a collection of qubits against noise. These strategies require feedback adapted to measurement outcomes. Finding them from scratch, without human guidance, tailored to different hardware resources, is a formidable challenge due to the combinatorially large search space. To solve this, we develop two ideas: two-stage learning with teacher/student networks and a reward quantifying the capability to recover the quantum information stored in a multi-qubit system. Beyond its immediate impact on quantum computation, our work more generally demonstrates the promise of neural-network-based reinforcement learning in physics.