Compare Papers

Paper 1

Chipmunq: A Fault-Tolerant Compiler for Chiplet Quantum Architectures

Peter Wegmann, Aleksandra Świerkowska, Emmanouil Giortamis, Pramod Bhatotia

Year: 2026
Journal: arXiv preprint
DOI: arXiv:2603.16389
arXiv: 2603.16389

As quantum computing advances toward fault-tolerance through quantum error correction, modular chiplet architectures have emerged to provide the massive qubit counts required while overcoming fabrication limits of monolithic chips. However, this transition introduces a critical compilation gap: existing frameworks cannot handle the scale of fault-tolerant quantum circuits while managing the noisy, sparse interconnects of chiplet backends. We present Chipmunq, the first hardware-aware compiler for mapping and routing fault-tolerant circuits onto modular architectures. Chipmunq employs a quantum-error-correction-aware partitioning strategy that preserves the integrity of logical qubit patches, preventing prohibitive gate overheads common in general-purpose compilers. Our evaluation demonstrates that Chipmunq achieves a 13.5x speedup in compilation time compared to state-of-the-art tools. By incorporating chiplet constraints and defective qubits, it reduces circuit depth by 86.4% and SWAP gate counts by 91.4% across varying code distances. Crucially, Chipmunq overcomes heterogeneous inter-chiplet links, improving logical error rates by up to two orders of magnitude.


Paper 2

Lottery BP: Unlocking Quantum Error Decoding at Scale

Yanzhang Zhu, Chen-Yu Peng, Yun Hao Chen, Yeong-Luh Ueng, Di Wu

Year: 2026
Journal: arXiv preprint
DOI: arXiv:2605.00038
arXiv: 2605.00038

Enabling real-time fault tolerance on millions of qubits requires scalable decoding, which motivates this paper. Existing decoding algorithms (decoders), such as clustering, matching, belief propagation (BP), and neural networks, suffer from one or more of inaccuracy, high cost, and incompatibility across a broad set of quantum error correction codes, such as the surface code, toric code, and bivariate bicycle code. A gap therefore remains between existing decoders and an ideal decoder that is simultaneously accurate, fast, general, and scalable. This paper makes three contributions: a decoder, a decoder architecture, and a decoding simulator. First, we propose Lottery BP, a decoder that introduces randomness during decoding. Lottery BP improves decoding accuracy over BP by 2 to 8 orders of magnitude for topological codes. To efficiently decode multi-round measurement errors, we propose syndrome vote as a pre-processing step before Lottery BP, which compresses multiple rounds of syndromes into one. Syndrome vote increases the latency margin of decoding and mitigates the backlog problem. Second, we design the PolyQec architecture, which implements Lottery BP as a local decoder and ordered statistics decoding (OSD) as a global decoder and is configurable for surface/toric codes and X/Z checks. Because Lottery BP boosts local decoding accuracy, PolyQec invokes the costly global OSD decoder less frequently than BP+OSD, which improves scalability, e.g., 3 to 5 orders of magnitude less frequently for topological codes. Third, to evaluate decoders fairly, we develop a PyTorch-based decoding simulator, Syndrilla, that modularizes the simulation pipeline and allows new decoders to be added flexibly. We formulate multiple metrics to quantify decoder performance and integrate them into Syndrilla. Running on GPUs, Syndrilla is 1 to 2 orders of magnitude faster than on CPUs.
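The abstract says that syndrome vote compresses multiple rounds of syndromes into one but does not state the voting rule. As a rough illustration only, the sketch below assumes a per-check strict majority vote over binary syndrome rounds, with ties resolved to 0. The function name `syndrome_vote`, the input layout (one list per round), and the tie rule are all assumptions made for this example and are not taken from the paper.

```python
from typing import List, Sequence


def syndrome_vote(rounds: Sequence[Sequence[int]]) -> List[int]:
    """Compress several rounds of binary syndromes into a single syndrome.

    ASSUMPTION: a per-check strict majority vote with ties resolved to 0.
    The paper's actual voting rule may differ.
    """
    if not rounds:
        raise ValueError("need at least one syndrome round")
    num_checks = len(rounds[0])
    if any(len(r) != num_checks for r in rounds):
        raise ValueError("all rounds must have the same number of checks")

    num_rounds = len(rounds)
    voted = []
    for i in range(num_checks):
        # Count how many rounds reported check i as triggered.
        ones = sum(r[i] for r in rounds)
        # Keep the check only if a strict majority of rounds agree.
        voted.append(1 if 2 * ones > num_rounds else 0)
    return voted


# Three measurement rounds of four checks each (made-up values).
example_rounds = [
    [0, 1, 0, 1],
    [0, 1, 1, 1],
    [1, 1, 0, 0],
]
voted_syndrome = syndrome_vote(example_rounds)
```

Under this assumed rule, a check that fires in only one of the three rounds is treated as a likely measurement error and dropped, so the downstream decoder receives one syndrome instead of three.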
