On the optimality of quantum circuit initial mapping using reinforcement learning

Norhan Elsayed Amer; Walid Gomaa; Keiji Kimura; Kazunori Ueda; Ahmed El-Mahdy

doi:10.1140/epjqt/s40507-024-00225-1

Open Access

EPJ Quantum Technol. (2024) 11: 19
https://doi.org/10.1140/epjqt/s40507-024-00225-1

Research

Norhan Elsayed Amer¹,ᵃ, Walid Gomaa¹,², Keiji Kimura³, Kazunori Ueda³ and Ahmed El-Mahdy¹,⁴,²

¹ Department of Computer Science and Engineering, Egypt-Japan University for Science and Technology, Alexandria, Egypt
² Department of Computer and Systems Engineering, Faculty of Engineering, Alexandria University, Alexandria, Egypt
³ Waseda University, Tokyo, Japan
⁴ School of Information Technology and Computer Science, Nile University, Cairo, Egypt

ᵃ norhan.elsayed@ejust.edu.eg

Received: 14 December 2023
Accepted: 21 February 2024
Published online: 13 March 2024

Abstract

Quantum circuit optimization is unavoidable on today's noisy quantum backends. The task is non-trivial because circuits vary in complexity and because of hardware-specific noise, topology, and limited connectivity. Currently available methods rely either on heuristics or on reinforcement learning with complex, unscalable neural networks such as transformers. In this paper, we focus on selecting the initial logical-to-physical mapping. Specifically, we investigate whether a reinforcement learning agent with a simple, scalable neural network can find, from a fixed-length feature vector alone, a near-optimal logical-to-physical mapping that minimizes the number of additional CNOT gates. To answer this question, we train a Maskable Proximal Policy Optimization agent to progressively take steps towards a near-optimal logical-to-physical mapping on a 20-qubit hardware architecture. Our results show that our agent, coupled with a simple routing evaluation, outperforms other available reinforcement learning and heuristic approaches on 12 out of 19 test benchmarks, achieving geometric mean improvements of 2.2% and 15% over the best available related work and two heuristic approaches, respectively. Additionally, our neural network model scales linearly with the number of qubits.
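To illustrate why the choice of initial mapping affects the number of additional CNOT gates, the following minimal Python sketch scores mappings on a toy device. It is not the method of the paper: the 4-qubit line topology, the example CNOT list, the function names, and the cost proxy (each CNOT between physical qubits at distance d is charged d - 1 SWAPs, each SWAP counted as 3 CNOTs, with qubit movement during routing ignored) are all assumptions made for illustration. The exhaustive search stands in for the trained agent and is only feasible at this tiny size.

```python
from collections import deque
from itertools import permutations


def distances(n, edges):
    """All-pairs shortest-path lengths on an undirected coupling graph (BFS)."""
    adj = {q: set() for q in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    dist = {}
    for src in range(n):
        seen = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        dist[src] = seen
    return dist


def extra_cnots(mapping, cnots, dist):
    """Proxy cost (an assumption, not the paper's routing evaluation).

    mapping[i] is the physical qubit assigned to logical qubit i.
    A CNOT whose endpoints sit at distance d is charged d - 1 SWAPs,
    and each SWAP is counted as 3 CNOTs.
    """
    return sum(3 * (dist[mapping[c]][mapping[t]] - 1) for c, t in cnots)


# Hypothetical device: four physical qubits on a line, 0 - 1 - 2 - 3.
edges = [(0, 1), (1, 2), (2, 3)]
dist = distances(4, edges)

# Hypothetical circuit: logical CNOTs given as (control, target) pairs.
cnots = [(0, 2), (0, 2), (1, 3)]

identity = (0, 1, 2, 3)
identity_cost = extra_cnots(identity, cnots, dist)

# Brute force over all 4! mappings; a learned policy would replace this search.
best = min(permutations(range(4)), key=lambda m: extra_cnots(m, cnots, dist))
best_cost = extra_cnots(best, cnots, dist)

print("identity mapping cost:", identity_cost)
print("best mapping:", best, "cost:", best_cost)
```

In this toy case the identity mapping places every interacting pair two hops apart, whereas a mapping that puts each pair on adjacent physical qubits needs no extra CNOTs under the proxy. The search space grows factorially with the number of qubits, which is why learned or heuristic selection is used at realistic sizes.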

Key words: Quantum computing / Controlled-NOT reduction / Proximal Policy Optimization / Classical optimization / Quantum circuit / Optimal initial logical-to-physical mapping / Qubit routing / Transpilation

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
