Publication

ConfiBench: Automatic Testbench Generation with Confidence-Based Scenario Mask and Testbench Ensemble using LLMs for HDL Design

Ruidi Qiu; Yalin Zhang; Rolf Drechsler; Tsungyi Ho; Ulf Schlichtmann; Bing Li

In: ACM Transactions on Design Automation of Electronic Systems (TODAES), ACM, 2025.

Abstract

Functional simulation is an essential step in digital hardware design. Recently, there has been growing interest in leveraging Large Language Models (LLMs) for hardware testbench generation. However, the inherent instability of LLMs often leads to functional errors in the generated testbenches. Previous methods lack functional correction mechanisms that run automatically, without human intervention, and still suffer from low success rates, especially for sequential tasks. To address this issue, we propose ConfiBench, an automatic testbench generation framework with functional self-validation, self-correction, scenario masking, and an ensemble of multiple testbenches. Using only the RTL specification in natural language, the proposed approach can validate the correctness of the generated testbenches, perform functional self-correction on them, and construct an ensemble of them with effective masks. Comparative analysis shows that our method achieves a pass rate of 72.22% across all evaluated tasks, compared with 52.18% for the previous LLM-based testbench generation framework and 33.33% for a direct LLM-based generation method. For sequential circuits specifically, our method achieves a testbench pass rate that is 21.06 percentage points higher than that of the previous work AutoBench and almost 5 times that of the direct method. More importantly, ConfiBench significantly improves the correctness of generated testbenches under golden RTLs, particularly for sequential circuits, which had been a major weakness of the previous work CorrectBench. The code and experimental results are open-sourced at https://github.com/AutoBench/ConfiBench.