Publication
Adapters for the Resource Efficient Deployment of Natural Language Understanding Models
Jan Nehring; Nils Feldhus; Akhyar Ahmed
In: Studientexte zur Sprachkommunikation: Elektronische Sprachsignalverarbeitung 2023. Elektronische Sprachsignalverarbeitung (ESSV-2023), ESSV, 2023.
Abstract
Modern Transformer-based language models such as BERT are huge and therefore expensive to deploy in practical applications. In environments such as commercial chatbot-as-a-service platforms, which deploy many NLP models in parallel, less powerful models with fewer parameters are an alternative to transformers that keeps deployment costs down, at the cost of lower accuracy. This paper compares different models for Intent Detection with respect to their memory footprint, Intent Detection quality, and processing speed. With the Adapter framework, many task-specific Adapters can share one large transformer model. Deploying 100 NLU models requires 1 GB of memory with the proposed BERT+Adapter architecture, compared to 41.78 GB for a BERT-only architecture.
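To illustrate the sharing idea, the following is a minimal sketch using the AdapterHub "adapters" library: one BERT backbone is loaded once, and each NLU model is added as a small task-specific adapter plus classification head. The tenant names, label counts, and routing function are hypothetical illustrations, not the paper's code.

    import torch
    from transformers import AutoTokenizer
    from adapters import AutoAdapterModel

    # One shared BERT backbone; each NLU model gets its own small adapter.
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoAdapterModel.from_pretrained("bert-base-uncased")

    # Hypothetical intent label counts for two chatbot tenants.
    for name, num_labels in [("tenant_a_intents", 12), ("tenant_b_intents", 7)]:
        model.add_adapter(name)                                # small bottleneck layers
        model.add_classification_head(name, num_labels=num_labels)

    def detect_intent(text: str, adapter_name: str) -> int:
        # Route a request through the shared backbone with one tenant's adapter active.
        model.set_active_adapters(adapter_name)
        model.active_head = adapter_name
        inputs = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits
        return int(logits.argmax(dim=-1))

    print(detect_intent("I want to book a flight to Berlin", "tenant_a_intents"))

In such a setup, the memory cost of adding another NLU model is only the adapter and head parameters, while the full BERT weights are loaded once, which is the effect quantified in the abstract.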