Quick Navigation

Topics

Quantum Machine Learning Quantum Chemistry

Replay-buffer engineering for noise-robust quantum circuit optimization

arXiv
Authors: Akash Kundu, Sebastian Feld

Year

2026

Paper ID

52127

Status

Preprint

Abstract Read

~2 min

Abstract Words

227

Citations

N/A

Abstract

Deep reinforcement learning (RL) for quantum circuit optimization faces three fundamental bottlenecks: replay buffers that ignore the reliability of temporal-difference (TD) targets, curriculum-based architecture search that triggers a full quantum-classical evaluation at every environment step, and the routine discard of noiseless trajectories when retraining under hardware noise. We address all three by treating the replay buffer as a primary algorithmic lever for quantum optimization. We introduce ReaPER+, an annealed replay rule that transitions from TD error-driven prioritization early in training to reliability-aware sampling as value estimates mature, achieving 4-32times gains in sample efficiency over fixed PER, ReaPER, and uniform replay while consistently discovering more compact circuits across quantum compilation and QAS benchmarks; validation on LunarLander-v3 confirms the principle is domain-agnostic. Furthermore we eliminate the quantum-classical evaluation bottleneck in curriculum RL by introducing OptCRLQAS which amortizes expensive evaluations over multiple architectural edits, cutting wall-clock time per episode by up to 67.5\% on a 12-qubit optimization problem without degrading solution quality. Finally we introduce a lightweight replay-buffer transfer scheme that warm-starts noisy-setting learning by reusing noiseless trajectories, without network-weight transfer or ε-greedy pretraining. This reduces steps to chemical accuracy by up to 85-90\% and final energy error by up to 90\% over from-scratch baselines on 6-, 8-, and 12-qubit molecular tasks. Together, these results establish that experience storage, sampling, and transfer are decisive levers for scalable, noise-robust quantum circuit optimization.

Why This Paper Matters

  • This paper contributes to the Quantum Machine Learning research area in the Quantum Articles archive.
  • It adds a 2026 reference point for readers tracking recent quantum research.
  • Deep reinforcement learning (RL) for quantum circuit optimization faces three fundamental bottlenecks: replay buffers that ignore the reliability of temporal-difference (TD)...

Paper Tools

Become a member to use research tools

Sign in to open papers, visit source links, share, cite, compare, copy DOI links, request category corrections, and build your reading list.

Show Paper arXiv Publisher Share Cite This Paper Copy URL Compare Copy DOI Add to Reading List Category Correction Request

References & Citation Signals

Local Citation Graph (Related-Paper Links)

Current Paper #52127 #69042 Simultaneous Fragment Docking f... #69037 Spin dynamics and ortho-para co... #69034 Hardware-aware Low-latency Quan... #69025 Machine-Learning Optimization a...

External citation index: OpenAlex citation signal

Community Reactions

Quick sentiment from readers on this paper.

Score: 0
Likes: 0 Dislikes: 0

Sign in to react to this paper.

Discussion & Reviews (Moderated)

Average Rating: 0.0 / 5 (0 ratings)

No written reviews yet.