Gatekeepers and Hallucinations: A Layered Evaluation Framework for LLM-Driven Quantum Circuit Generation

arXiv
Authors: Christopher Coleman, Sharon Marfatia

Year

2026

Paper ID

69422

Status

Preprint

Abstract

As large language models (LLMs) become embedded in quantum simulation workflows (IDE copilots, notebook assistants, agentic pipelines), evaluation must move beyond functional correctness to anticipate and catch structured failures before they propagate through expensive pipelines. We present a layered evaluation framework for materials-informed Variational Quantum Eigensolver (VQE) circuit generation: (i) a gatekeeper screening rubric across seven physical and framework criteria; (ii) a circuit fidelity analysis comparing model outputs against analytical and reference-implementation values for H2/STO-3G/Jordan-Wigner/UCCSD, with ansatz classification and gate-composition breakdown; and (iii) design entropy, a run-to-run behavioral consistency metric. We surface a taxonomy of five distinct LLM failure modes (geometry hallucination, nonexistent API usage, runtime integration failures, constraint violations, and plausible-but-unverifiable output). Each has a distinct detectability profile, and each is structural to the task rather than specific to any one model. A forensic audit of the evaluation platform's own source code further establishes that two apparent model failures originated in the harness through silent fallback-template substitution, demonstrating that evaluation infrastructure belongs inside the same trust boundary as the models it tests. Applied across multiple foundation models on a Materials Project-integrated pipeline, the framework shows that gatekeeper-style validation is necessary, not optional, for reliable deployment.
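
The abstract names design entropy as a run-to-run consistency metric but does not give its formula. One plausible reading, offered here only as an assumption and not as the authors' definition, is the Shannon entropy of the design labels a model produces across repeated runs of the same prompt. The function name `design_entropy` and the example labels below are hypothetical.

```python
import math
from collections import Counter


def design_entropy(labels):
    """Shannon entropy (in bits) of design labels observed across runs.

    ASSUMPTION: this is an illustrative stand-in, not the paper's definition.
    Each label is any hashable summary of one run's circuit design,
    for example an ansatz class name.
    """
    if not labels:
        raise ValueError("need at least one run")
    n = len(labels)
    counts = Counter(labels)
    # Sum p * log2(1/p) over the empirical label distribution.
    return sum((c / n) * math.log2(n / c) for c in counts.values())


# Hypothetical ansatz labels from four runs of the same prompt.
consistent = ["UCCSD", "UCCSD", "UCCSD", "UCCSD"]
mixed = ["UCCSD", "hardware-efficient", "UCCSD", "hardware-efficient"]

print(design_entropy(consistent))  # identical runs give zero entropy
print(design_entropy(mixed))       # an even two-way split gives one bit
```

Under this reading, a value of zero means every run produced the same design, and larger values mean the model's output varies more from run to run.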
