Thermostat: A Large Collection of NLP Model Explanations and Analysis Tools

Nils Feldhus, Robert Schwarzenberg, Sebastian Möller

In: Heike Adel, Shuming Shi (editors). Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations (EMNLP-2021), November 7-11, 2021, Punta Cana, Dominican Republic. Association for Computational Linguistics, 2021.


In the language domain, as in other domains, neural explainability plays an ever more important role, with feature attribution methods at the forefront. Many such methods require considerable computational resources and expert knowledge about implementation details and parameter choices. To facilitate research, we present Thermostat, which consists of a large collection of model explanations and accompanying analysis tools. Thermostat allows easy access to over 200k explanations for the decisions of prominent state-of-the-art models spanning different NLP tasks, generated with multiple explainers. The dataset took over 10k GPU hours (more than one year) to compile, compute time that the community now saves. The accompanying software tools allow explanations to be analysed instance-wise as well as cumulatively at the corpus level. Users can investigate and compare models, datasets and explainers without having to orchestrate implementation details. Thermostat is fully open source, democratizes explainability research in the language domain, circumvents redundant computations and increases comparability and replicability.
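
To make the distinction between instance-wise and corpus-level analysis concrete, the following is a minimal sketch in plain Python. It does not use the Thermostat library or reproduce its API. The toy attribution scores and the helper names `top_tokens` and `corpus_mean_attribution` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical toy data (not taken from Thermostat): each instance is a list
# of (token, attribution score) pairs, as a feature attribution method
# might produce for one model decision.
explanations = [
    [("the", 0.05), ("movie", 0.10), ("was", 0.02), ("great", 0.80)],
    [("a", 0.01), ("great", 0.60), ("cast", 0.20)],
    [("the", 0.03), ("plot", -0.10), ("was", 0.01), ("dull", -0.70)],
]


def top_tokens(instance, k=2):
    """Instance-wise view: the k tokens with the largest absolute attribution."""
    return sorted(instance, key=lambda pair: abs(pair[1]), reverse=True)[:k]


def corpus_mean_attribution(instances):
    """Corpus-level view: mean attribution per token over all instances."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for instance in instances:
        for token, score in instance:
            totals[token] += score
            counts[token] += 1
    return {token: totals[token] / counts[token] for token in totals}


if __name__ == "__main__":
    # Inspect a single explanation, then aggregate over the whole toy corpus.
    print(top_tokens(explanations[0]))
    print(corpus_mean_attribution(explanations))
```

The instance-wise function answers which tokens mattered for one decision, while the corpus-level function averages each token's score across all decisions; averaging is only one of several plausible aggregation choices.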


Further Links

Thermostat_EMNLP2021Demos.pdf (PDF, 226 KB)

German Research Center for Artificial Intelligence
Deutsches Forschungszentrum für Künstliche Intelligenz