Recognizing structure in report transcripts : an approach based on conditional random fields (CRFs)

Jancsary, Jeremy Martin

Record link:

https://resolver.obvsg.at/urn:nbn:at:at-ubtuw:1-14542
http://hdl.handle.net/20.500.12708/14009

Title:

Recognizing structure in report transcripts : an approach based on conditional random fields (CRFs)

Citation:

Jancsary, J. M. (2008). Recognizing structure in report transcripts : an approach based on conditional random fields (CRFs) [Master Thesis, Technische Universität Wien]. reposiTUm. https://resolver.obvsg.at/urn:nbn:at:at-ubtuw:1-14542

CatalogPlus:

AC05036576

Publication Type:

Thesis - Master's Thesis

Language:

English

Authors:

Jancsary, Jeremy Martin

Advisor:

Trost, Harald

Organisational Unit:

E180 - Fakultät für Informatik

Date (published):

2008

Number of Pages:

111

Keywords:

Strukturerkennung; Conditional Random Fields; Berichte; Graphische Modelle

structure identification; Conditional Random Fields; reports; graphical models

Abstract:

Typically, automatic speech recognition outputs merely a sequence of words. This view may be sufficient for some applications; others, however, require a somewhat more structured approach. This thesis presents a framework that makes it possible to recognize underlying structures in dictated reports. Explicitly marking structural elements such as sections, headings, and enumerations is an important step towards automatic post-processing of dictations. The scientific contribution of this thesis is, on the one hand, the development of a generic approach that extends existing speech recognition systems so that structured output can be generated; on the other hand, it lies in the publication of a freely available CRF toolkit that underlies this approach but can also be used for many other problems.

Typically, the output of automatic speech recognition (ASR) is a mere sequence of words. This view may be sufficient for some tasks, whereas others require a more structured approach. This thesis presents a framework that allows for the identification of deep, underlying structure in report dictations. Identifying structural elements such as headings, sections, and enumerations is an important step towards automatic post-processing of dictated speech. The contributions of this thesis include a generic approach that can be integrated seamlessly with existing ASR solutions and provides structured output, as well as a freely available CRF toolkit that forms the basis of the aforementioned approach and may also be applicable to numerous other problems.

Additional information:

Includes an abstract in German

License:

In Copyright

Appears in Collections:

Thesis