
Publication

Speeding up Data Annotations for Handwritten Text Recognition

Moritz Jonathan Wolf
Master's thesis, Universität des Saarlandes, 9/2021.

Abstract

Because handwriting is highly variable, training supervised machine learning models requires transcribing large amounts of handwritten text. This, however, is costly in both time and manpower. A common strategy for speeding up annotation is to use human-in-the-loop methods that give the annotator supporting suggestions. In this thesis, a combination of Transfer Learning and Active Learning with uncertainty sampling is proposed to incrementally train a well-performing model that produces these suggestions after seeing only a few samples. We compare this strategy to Active Learning with random sampling and find that uncertainty sampling leads to better suggestions and reduces the time humans spend on annotation by more than half. We also find that Active Learning converges to better performance than conventional methods that learn from all training data at once.
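
The difference between uncertainty sampling and random sampling can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract does not state which uncertainty measure is used, so the least-confidence score below is an assumption, and the names `least_confidence`, `select_uncertain`, `select_random`, and the toy `pool` are hypothetical. Each unlabelled line is represented by the model's per-step probability distributions over characters.

```python
import random
from typing import Dict, List, Sequence

# Hypothetical type: one probability distribution per decoding step.
StepProbs = Sequence[Sequence[float]]


def least_confidence(step_probs: StepProbs) -> float:
    """Assumed uncertainty score: 1 minus the mean top-1 probability."""
    if not step_probs:
        return 1.0  # treat a missing prediction as maximally uncertain
    top1 = [max(p) for p in step_probs]
    return 1.0 - sum(top1) / len(top1)


def select_uncertain(pool: Dict[str, StepProbs], k: int) -> List[str]:
    """Pick the k unlabelled samples the model is least sure about."""
    ranked = sorted(pool, key=lambda sid: least_confidence(pool[sid]), reverse=True)
    return ranked[:k]


def select_random(pool: Dict[str, StepProbs], k: int, seed: int = 0) -> List[str]:
    """Baseline: pick k unlabelled samples uniformly at random."""
    rng = random.Random(seed)
    return rng.sample(sorted(pool), min(k, len(pool)))


# Toy pool of three unlabelled lines with made-up model outputs.
pool = {
    "line_a": [[0.9, 0.1], [0.8, 0.2]],  # confident prediction
    "line_b": [[0.5, 0.5], [0.6, 0.4]],  # uncertain prediction
    "line_c": [[0.7, 0.3], [0.7, 0.3]],
}

# The selected lines would be sent to the annotator next, with the model's
# prediction shown as a suggestion; the model is then retrained on the result.
next_batch = select_uncertain(pool, k=2)
```

In this toy pool, `select_uncertain` ranks `line_b` first and `line_c` second, whereas `select_random` ignores the model's confidence entirely.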