You're viewing papers too quickly. Please wait a moment.<br>This helps keep the archive available for everyone.

Quick Navigation

Overview Topics Abstract Importance Paper Tools References Reactions Discussion Publication Related

Topics

Quantum Compilation Routing Architecture Quantum Machine Learning Quantum Simulation Quantum Foundations

GPU-Accelerated Quantum Simulation: Empirical Backend Selection, Gate Fusion, and Adaptive Precision

arXiv

Authors: Poornima Kumaresan, Pavithra Muruganantham, Lakshmi Rajendran, Santhosh Sivasubramani

Year

2026

Paper ID

45207

Status

Preprint

Abstract Read

~2 min

Abstract Words

236

Citations

N/A

Abstract

Classical simulation of quantum circuits remains indispensable for algorithm development, hardware validation, and error analysis in the noisy intermediate-scale quantum (NISQ) era. However, state-vector simulation faces exponential memory scaling, with an n-qubit system requiring O2ⁿ complex amplitudes, and existing simulators often lack the flexibility to exploit heterogeneous computing resources at runtime. This paper presents a GPU-accelerated quantum circuit simulation framework that introduces three contributions: (1) an empirical backend selection algorithm that benchmarks CuPy, PyTorch-CUDA, and NumPy-CPU backends at runtime and selects the optimal execution path based on measured throughput; (2) a directed acyclic graph (DAG) based gate fusion engine that reduces circuit depth through automated identification of fusible gate sequences, coupled with adaptive precision switching between complex64 and complex128 representations; and (3) a memory-aware fallback mechanism that monitors GPU memory consumption and gracefully degrades to CPU execution when resources are exhausted. The framework integrates with Qiskit, Cirq, PennyLane, and Amazon Braket through a unified adapter layer. Benchmarks on an NVIDIA A100-SXM4 (40 GiB) GPU demonstrate speedups of 64x to 146x over NumPy CPU execution for state-vector simulation of circuits with 20 to 28 qubits, with speedups exceeding 5x from 16 qubits onward. Hardware validation on an IBM quantum processing unit (QPU) confirms Bell state fidelity of 0.939, a five-qubit Greenberger-Horne-Zeilinger (GHZ) state fidelity of 0.853, and circuit depth reduction from 42 to 14 gates through the fusion pipeline. The system is designed for portability across NVIDIA consumer and data-center GPUs, requiring no vendor-specific compilation steps.

Why This Paper Matters

This paper contributes to the Quantum Machine Learning research area in the Quantum Articles archive.
It adds a 2026 reference point for readers tracking recent quantum research.
Classical simulation of quantum circuits remains indispensable for algorithm development, hardware validation, and error analysis in the noisy intermediate-scale quantum (NISQ)...

Paper Tools

Become a member to use research tools

Sign in to open papers, visit source links, share, cite, compare, copy DOI links, request category corrections, and build your reading list.

Become a member Sign in

Show Paper arXiv Publisher Share Cite This Paper Copy URL Compare Copy DOI Add to Reading List Category Correction Request

References & Citation Signals

[1] DOI https://doi.org/arXiv:2604.03816 [2] arXiv https://arxiv.org/abs/2604.03816 [3] Publisher https://arxiv.org/abs/2604.03816

Local Citation Graph (Related-Paper Links)

External citation index: OpenAlex citation signal

Community Reactions

Quick sentiment from readers on this paper.

Score: 0

Likes: 0 Dislikes: 0

Discussion & Reviews (Moderated)

Average Rating: 0.0 / 5 (0 ratings)

No written reviews yet.