Reinforcement Learning in Different Phases of Quantum Control

On this page, you can find the supplementary videos to the paper Reinforcement Learning in Different Phases of Quantum Control [1].

For the single qubit, we present three movies showing protocols found by the Reinforcement Learning (RL) agent (Videos 1-3) and three movies for the one-parameter variational protocols (Videos 4-6), for ramp durations $T=0.5, 1.0, 3.0$, respectively. The target state on the Bloch sphere is shown in red and the instantaneous state in green. The protocol time step size is $\delta t=0.05$.

The final two movies show the learning dynamics for a single qubit with $T=2.4$ (Video 7) and for a system of ten coupled qubits with $T=4.0$ (Video 8).

Single qubit protocols learned by the RL agent. Videos 1-3 show the protocols found by the RL agent for a single qubit for protocol durations of $T=0.5, 1.0, 3.0$, respectively: Video 1, Video 2, Video 3.
Single qubit variational protocols. Videos 4-6 show variational protocols inspired by those found by the RL agent for a single qubit for protocol durations of $T=0.5, 1.0, 3.0$, respectively: Video 4, Video 5, Video 6.
Learning dynamics. Videos 7-8 show the learning dynamics of the single qubit with $T=2.4$ and of ten coupled qubits with $T=3.0$, respectively. The reward in the many-body Video 8, $\tilde F_h(T) = \frac{1}{L}\log F_h(T)$, can be thought of as the logarithm of the fidelity per site: Video 7, Video 8.
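
To illustrate why the rescaled reward $\tilde F_h(T)$ behaves like a per-site quantity, here is a minimal Python sketch. It is not taken from the paper's code: the function names `fidelity` and `log_fidelity_per_site` are hypothetical, $L$ is assumed to be the number of qubits, and the states are simple product states chosen only for illustration. For such states the many-body fidelity is the single-site fidelity raised to the power $L$, so dividing its logarithm by $L$ gives back the logarithm of the single-site fidelity.

```python
import numpy as np

def fidelity(psi, psi_target):
    """Squared overlap |<psi_target|psi>|^2 of two normalized state vectors."""
    return abs(np.vdot(psi_target, psi)) ** 2

def log_fidelity_per_site(F, L):
    """Rescaled reward (1/L) * log F for a system of L sites (L assumed to be the qubit count)."""
    return np.log(F) / L

# Illustrative product states (an assumption, not the states used in the paper):
# every qubit is tilted by a small angle away from its target state.
L = 10
theta = 0.1
single = np.array([np.cos(theta), np.sin(theta)])
single_target = np.array([1.0, 0.0])

psi, target = single, single_target
for _ in range(L - 1):
    psi = np.kron(psi, single)              # build the L-qubit product state
    target = np.kron(target, single_target)

F_single = fidelity(single, single_target)  # single-site fidelity
F_many = fidelity(psi, target)              # many-body fidelity, equals F_single**L
reward = log_fidelity_per_site(F_many, L)   # equals log(F_single) for product states

print(F_single, F_many, reward)
```

The many-body fidelity shrinks exponentially with $L$ even when each site is close to its target, whereas the rescaled reward stays of order one, which is presumably why it is the more convenient quantity to plot for the ten-qubit system.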

References:
[1] M. Bukov, A. G. R. Day, D. Sels, P. Weinberg, A. Polkovnikov, and P. Mehta, arXiv:1705.00565 (2017).