Publication
AuditCopilot: Leveraging LLMs for Fraud Detection in Double-Entry Bookkeeping
Md Abdul Kadir; Sai Suresh Macharla Vasu; Sidharth S. Nair; Daniel Sonntag
In: Proceedings of the NeurIPS 2025 Workshop on Generative AI in Finance. Workshop on Generative AI in Finance, located at NeurIPS-2025, December 6, San Diego, CA, USA, 2025.
Abstract
Auditors rely on Journal Entry Tests (JETs) to detect anomalies in tax-related ledger records, but rule-based methods generate overwhelming false positives and struggle with subtle irregularities. We investigate whether large language models (LLMs) can serve as anomaly detectors in double-entry bookkeeping. Benchmarking state-of-the-art LLMs such as LLaMA and Gemma on both synthetic and real-world anonymized ledgers, we compare them against JETs and machine learning baselines. Our results show that LLMs consistently outperform traditional rule-based JETs and classical ML baselines, while also providing natural-language explanations that enhance interpretability. These results highlight the potential of AI-augmented auditing, where human auditors collaborate with foundation models to strengthen financial integrity.
