Speeding up Data Annotations for Handwritten Text Recognition

Moritz Jonathan Wolf
Master's thesis, Universität des Saarlandes, 9/2021.


Because handwriting varies so widely, training supervised machine learning models requires transcribing huge amounts of handwritten text. This, however, is costly in both time and manpower. A common strategy to speed up annotation is to use human-in-the-loop methods that give the annotator supporting suggestions. In this thesis, a combination of Transfer Learning and Active Learning with uncertainty sampling is proposed to incrementally train a well-performing model that produces these suggestions after seeing only a few samples. We compare this strategy to Active Learning with random sampling and find that uncertainty sampling leads to better suggestions and reduces the time humans spend on annotation by more than half. We also find that Active Learning converges to better performance than conventional methods that learn from all training data at once.
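To illustrate the difference between the two sampling strategies compared above, here is a minimal Python sketch of one selection step. It is not the thesis's implementation. The abstract does not say which uncertainty measure is used, so the sketch assumes a length-normalised least-confidence score. The function names (`sequence_confidence`, `select_uncertain`, `select_random`) and the per-character probabilities in `pool` are hypothetical and exist only for illustration.

```python
import math
import random


def sequence_confidence(char_probs):
    """Length-normalised confidence of one predicted transcription.

    char_probs holds the model's probability for each predicted character.
    Assumption: the geometric mean is used as the confidence score.
    """
    if not char_probs:
        return 0.0
    # Average in log space so long lines are not penalised for length alone.
    log_sum = sum(math.log(max(p, 1e-12)) for p in char_probs)
    return math.exp(log_sum / len(char_probs))


def select_uncertain(pool, k):
    """Uncertainty sampling: ids of the k lines with the lowest confidence."""
    ranked = sorted(pool, key=lambda line_id: sequence_confidence(pool[line_id]))
    return ranked[:k]


def select_random(pool, k, seed=0):
    """Random-sampling baseline: k line ids drawn uniformly at random."""
    rng = random.Random(seed)
    return rng.sample(sorted(pool), k)


# Made-up per-character probabilities for four unlabelled text lines.
pool = {
    "line_a": [0.99, 0.98, 0.97],
    "line_b": [0.60, 0.55, 0.70],
    "line_c": [0.90, 0.40, 0.95],
    "line_d": [0.30, 0.35, 0.20],
}

to_annotate = select_uncertain(pool, 2)
baseline = select_random(pool, 2)
```

In a full loop, the annotator would correct the model's suggestions for the selected lines, the model would be fine-tuned on the enlarged labelled set, and the selection would repeat on the remaining pool.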