

Development of Language and Task Models for Skill-based Selection and Operation of Robots using Speech Dialogue

Julian Wolter
Master's thesis, Universität des Saarlandes, DFKI, 7/2020.


Industry 4.0 is under way. Robots are taking over more and more tasks, even in industries where they were previously absent. Products are becoming increasingly individual rather than off-the-shelf mass products, which leads to a more customized production process. To produce high-quality products efficiently, human-machine collaboration is often essential, in particular the transfer of control from the human to the machine. There is therefore a continually growing need for intuitive and flexible robot control that does not require extensive training of the worker. In my thesis, I develop an intuitive system for robot control with the help of natural language understanding (NLU) tools. For this purpose, I first abstract and summarize the many different abilities of various robots in a skill model. In addition, a task model defines which tasks can be solved and how. There are often several ways to approach a task, and the most suitable alternative has to be chosen depending on the capabilities of the available robots. The task model can later be mapped to real robots using my skill model. Objects in the environment are also modelled and given properties so that the robots can interact with them. Once the tasks, robots and objects are defined, the system creates a training corpus, trains the NLU system on it and makes it ready for use. An appropriate dialogue model then makes it possible to trigger a task intuitively by voice. To put it all together, I show how the generated task is assigned to the available robots with the help of the skill model. Finally, my system sends out high-level robot commands to fulfil the user's request. For evaluation, I modelled the MRK4.0 laboratory and demonstrated some possible applications. The model was able to represent all scenarios, and the system distributed the tasks sensibly. I evaluated the intuitiveness of the language component in an exploratory user study. For this purpose, I created a simulated environment in which participants were asked to solve a set of predefined jobs. There was no training in advance. Most participants could solve all tasks after a short exploratory phase at the beginning. In both evaluations, the system fully met expectations.
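
The interplay of skill model and task model described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the class names `Robot` and `Task`, the function `assign`, and all skill labels are hypothetical, and "most suitable alternative" is simplified here to "first alternative whose steps are all covered by some available robot".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Robot:
    name: str
    skills: frozenset  # abstract skills this robot offers (hypothetical labels)


@dataclass(frozen=True)
class Task:
    name: str
    # Each alternative is an ordered sequence of skills that solves the task.
    alternatives: tuple


def assign(task, robots):
    """Return (skill, robot name) pairs for the first alternative in which
    every step is covered by some available robot, or None if none fits."""
    for alternative in task.alternatives:
        plan = []
        for skill in alternative:
            robot = next((r for r in robots if skill in r.skills), None)
            if robot is None:
                break  # alternative not feasible with these robots, try next
            plan.append((skill, robot.name))
        else:
            return plan  # loop finished without break: all steps are covered
    return None


# Hypothetical example data, not taken from the MRK4.0 laboratory model.
robots = [
    Robot("arm", frozenset({"grasp", "place"})),
    Robot("cart", frozenset({"drive", "carry"})),
]
fetch = Task(
    "fetch",
    alternatives=(
        ("fly", "grasp"),             # needs a skill that no robot offers
        ("grasp", "carry", "drive"),  # feasible with arm and cart together
    ),
)
plan = assign(fetch, robots)
```

In this sketch the first alternative is rejected because no robot offers `fly`, so the second one is mapped step by step to the robots. A real selection would presumably rank the feasible alternatives rather than take the first one.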