I presented our ongoing work at the Higher Seminar in Economic History at Stockholm University (“Högre seminariet i ekonomisk historia på Stockholms universitet”) on 4 June 2020.
The title was “Fast and easy transcription of handwritten documents”.

Abstract

Printed books can be converted into searchable, machine-encoded text using Optical Character Recognition (OCR). Handwritten text, however, is much harder to convert because handwriting style varies widely between writers, and even a single writer never writes the same word in exactly the same way: size, shape and so on differ slightly each time. Handwritten Text Recognition (HTR) has therefore emerged as an active research field that addresses automatic word recognition and text conversion.

Transcription is a tedious, time-consuming task. Several applications exist that facilitate the process, but they usually require rather large amounts of training data that must first be transcribed and annotated by hand. We are therefore developing a framework for fast, semi-automatic collection of words, which even allows a group of users to transcribe a text in arbitrary word order. This helps in finding linked words much faster for subsequent learning, and it also makes it possible to search in document collections that have not yet been transcribed. Examples from ongoing research projects will be presented.
