Topics

Quantum Machine Learning, Quantum Chemistry

Replay-buffer engineering for noise-robust quantum circuit optimization

arXiv

Authors: Akash Kundu, Sebastian Feld

Year

2026

Paper ID

52127

Status

Preprint

Abstract

Deep reinforcement learning (RL) for quantum circuit optimization faces three fundamental bottlenecks: replay buffers that ignore the reliability of temporal-difference (TD) targets, curriculum-based architecture search that triggers a full quantum-classical evaluation at every environment step, and the routine discard of noiseless trajectories when retraining under hardware noise. We address all three by treating the replay buffer as a primary algorithmic lever for quantum optimization. We introduce ReaPER+, an annealed replay rule that transitions from TD-error-driven prioritization early in training to reliability-aware sampling as value estimates mature, achieving 4-32× gains in sample efficiency over fixed PER, ReaPER, and uniform replay while consistently discovering more compact circuits across quantum compilation and QAS benchmarks; validation on LunarLander-v3 confirms the principle is domain-agnostic. Furthermore, we eliminate the quantum-classical evaluation bottleneck in curriculum RL by introducing OptCRLQAS, which amortizes expensive evaluations over multiple architectural edits, cutting wall-clock time per episode by up to 67.5% on a 12-qubit optimization problem without degrading solution quality. Finally, we introduce a lightweight replay-buffer transfer scheme that warm-starts noisy-setting learning by reusing noiseless trajectories, without network-weight transfer or ε-greedy pretraining. This reduces steps to chemical accuracy by up to 85-90% and final energy error by up to 90% relative to from-scratch baselines on 6-, 8-, and 12-qubit molecular tasks. Together, these results establish that experience storage, sampling, and transfer are decisive levers for scalable, noise-robust quantum circuit optimization.
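
The abstract does not state the ReaPER+ rule itself, so the following is only a minimal sketch of what an annealed replay priority could look like. It assumes a linear schedule and a per-transition reliability score in [0, 1]. The function names, the blending formula, the `alpha` exponent, and the schedule are all hypothetical illustrations, not the authors' method.

```python
import random


def anneal_weight(step, total_steps):
    """Hypothetical linear schedule: 0 at the start of training, 1 at the end."""
    if total_steps <= 0:
        return 1.0
    return min(1.0, max(0.0, step / total_steps))


def sampling_probs(td_errors, reliabilities, step, total_steps,
                   alpha=0.6, eps=1e-6):
    """Blend TD-error priority with a reliability-weighted priority.

    Assumption: reliabilities are scores in [0, 1] for each stored transition.
    Early in training (weight near 0) the priority is |td_error|^alpha, as in
    standard prioritized replay. Late in training (weight near 1) the same
    priority is scaled by the reliability score.
    """
    lam = anneal_weight(step, total_steps)
    priorities = []
    for delta, rel in zip(td_errors, reliabilities):
        base = (abs(delta) + eps) ** alpha
        priorities.append((1.0 - lam) * base + lam * rel * base)
    total = sum(priorities)
    if total == 0.0:
        # Every priority vanished; fall back to uniform sampling.
        return [1.0 / len(priorities)] * len(priorities)
    return [p / total for p in priorities]


def sample_indices(probs, batch_size, rng=random):
    """Draw a batch of buffer indices (with replacement) from the distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=batch_size)


if __name__ == "__main__":
    td = [0.5, 2.0, 0.1]
    rel = [0.9, 0.2, 0.5]
    for step in (0, 500, 1000):
        print(step, sampling_probs(td, rel, step, total_steps=1000))
```

In this sketch, a transition with a large TD error but low reliability dominates sampling early and is gradually down-weighted as the schedule advances, which is the qualitative behavior the abstract describes.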

Why This Paper Matters

This paper contributes to the Quantum Machine Learning research area in the Quantum Articles archive.
It adds a 2026 reference point for readers tracking recent quantum research.

References & Citation Signals

[1] DOI: https://doi.org/arXiv:2604.21863
[2] arXiv: https://arxiv.org/abs/2604.21863
[3] Publisher: https://arxiv.org/abs/2604.21863
