Title record

Title
Word Beam Search: A Connectionist Temporal Classification Decoding Algorithm
Authors: Scheidl, Harald; Fiel, Stefan; Sablatnig, Robert
Published in
16th International Conference on Frontiers in Handwriting Recognition (ICFHR 2018), Niagara Falls, New York, USA, 2018, pp. 253-258
Published: 2018
Language: English
Document type: Article in a collected work
Keywords (EN): connectionist temporal classification / decoding / language model / recurrent neural network / speech recognition / handwritten text recognition
Project/report number: European Union's Horizon 2020: 674943
ISBN: 9781538658758
URN: urn:nbn:at:at-ubtuw:3-3778
DOI: 10.1109/ICFHR-2018.2018.00052
Access restriction
The work is freely available
Files
Word Beam Search: A Connectionist Temporal Classification Decoding Algorithm [0.59 MB]
Abstract (English)

Recurrent Neural Networks (RNNs) are used for sequence recognition tasks such as Handwritten Text Recognition (HTR) or speech recognition. If trained with the Connectionist Temporal Classification (CTC) loss function, the output of such an RNN is a matrix containing character probabilities for each time-step. A CTC decoding algorithm maps these character probabilities to the final text. Token passing is one such algorithm and is able to constrain the recognized text to a sequence of dictionary words. However, the running time of token passing depends quadratically on the dictionary size, and it is not able to decode arbitrary character strings such as numbers. This paper proposes word beam search decoding, which addresses both problems. It constrains words to those contained in a dictionary, allows arbitrary non-word character strings between words, optionally integrates a word-level language model, and has a better running time than token passing. The proposed algorithm outperforms best path decoding, vanilla beam search decoding and token passing on the IAM and Bentham HTR datasets. An open-source implementation is provided.

Statistics
The PDF document has been downloaded 764 times.