Publication

Exploring the Potential of Vision Language Models for Interpreting Technical Drawings

Leonhard Kunz; Mario Klostermeier; Kokulan Thanabalan; Tatjana Legler; Martin Ruskowski
In: DS 140: Proceedings of the 36th Symposium Design for X (DFX2025). Design for X (DFX) Symposium (DFX-2025), The Design Society, 2025.

Abstract

Vision Language Models (VLMs) have gained widespread adoption among end users, and their versatility has sparked interest in applying them to more domain-specific challenges. This paper investigates whether small-scale VLMs are suitable in principle for evaluating the manufacturability of parts from a technical drawing. To this end, it introduces the Technical drawings for Manufacturability Benchmark (TechMB) and tests a selection of small-scale VLMs against it. The results indicate that the models show potential for text extraction and for interpreting domain-specific terminology. However, they struggle to reason about how the depicted parts would be manufactured and, in some cases, even to deliver the concise and precise answers that the targeted task requires.