Handwritten Text Recognition (HTR) aims at making handwritten text readable by the computer, just as Optical Character Recognition (OCR) does for printed text. However, the handwritten text is usually much more challenging as the handwriting varies in form in the same manuscript.

Many believe that OCR is a solved problem, but it is really not. It can still be quite challenging depending on the text source and the typeface used. HTR is generally much more challenging because of such things as document degradation (manuscripts are often much older than printed text) inter personal and personal variation in script and style.

On this page our main contributions to this field are presented in a logical order, describing the necessary pipeline for HTR.

Background Removal and Binarisation

bg2Since documents often are somewhat degraded it is important to be able to efficiently remove the disturbing background from the text. The next step would be to binarise the segmented text, but in our word spotter we prefer to work on the background removed text. We have published two papers dealing with these problems.

User Annotation and Text Boxes

cropped-markedup2.pngIt is quite a challenging task to make a prefect bounding box for a word by hand. This paper deals with this problem, but it can also be used in the word spotter for finding a perfectly fitting box of the found word.
The image shows how the user has marked the red box, while the algorithm finds the “perfectly” fitting green box.

Segmentation-free Word Spotting


We developed a segmentation free word spotter, which means that each word do not have to be extracted from the text. Instead a sliding window is used to traverse the document.  Key point matching was performed on four sets of different key points, using a descriptor based on the Fourier transform followed by a relaxed outlier removal method in two steps.

Semi-Automatic Transcription

One of our main goals is to create a tool for collaborative semi-automatic transcription. The idea is to use a human in the loop concept where the user decides what word to transcribe and the transcription is made once for all occurrences with the help of an interactive visualisation tool.

Arbitrary word order transcription of all occurrences of the word

a1Word spotting can be used to make a semi-automatic transcription of the text. The idea is that the user should mark up the word that needs to be transcribed and then the word spotter finds all occurrences of that word. Hence, the word needs to be transcribed only once, and the process can be performed in arbitrary word order by multiple transcribers.

Human in the Loop


In this paper we propose to use visualisation of word spotting results as a powerful tool to be able to manually determine in an efficient manner which found words are correct and which are incorrect. This will make it possible to quickly make a reliable transcription of each word.