Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Alignment of lecture speech data and presentation documents based on discourse markers and text length
SATOSHI NAKAZAWA, KENJI SATOH, AKITOSHI OKUMURA

2005 Volume 12 Issue 2 Pages 133-156

Abstract
Keyword search for specific scenes in video data appears to be in as much demand as text search. A conventional approach to video search is to apply speech recognition to the audio track and use the results as a text index with time information. However, speech recognition suffers from recognition errors and unknown words, so the recognition results alone do not serve as a precise index. If a detailed script or transcript of a video is available, a precise index synchronized with the video can be built by aligning the script with the speech recognition results, but not every video comes with a detailed script. We propose a new approach that builds a text index without detailed scripts, using presentation slides instead. Focusing on lecture videos, we explain how to build a text index by aligning two different materials: speech recognition results and presentation slides. We align them slide by slide so that keyword search over lecture videos can be performed at the slide level.
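As a rough illustration of slide-level alignment, the sketch below assigns time-stamped speech recognition segments to slides in presentation order. It is not the authors' method: the title indicates that the paper relies on discourse markers and text length, whereas this example substitutes plain word overlap with a monotonic dynamic program. The function names, the `(start, end, text)` segment layout, the sample data, and the assumption that the lecture starts on the first slide and only moves forward one slide at a time are all hypothetical.

```python
import re
from typing import Dict, List, Tuple

Segment = Tuple[float, float, str]  # (start_sec, end_sec, recognized text), assumed layout


def tokens(text: str) -> set:
    """Lowercased word set; a crude stand-in for real term extraction."""
    return set(re.findall(r"\w+", text.lower()))


def align_segments_to_slides(segments: List[Segment], slides: List[str]) -> List[int]:
    """Return one slide index per segment, non-decreasing over time.

    Assumes the talk starts on slide 0 and advances by at most one slide
    between consecutive segments (an assumption of this sketch).
    """
    if not segments or not slides:
        return []
    n, m = len(segments), len(slides)
    slide_tokens = [tokens(s) for s in slides]
    # score[i][j]: number of words shared by segment i and slide j
    score = [[len(tokens(seg[2]) & st) for st in slide_tokens] for seg in segments]

    neg = float("-inf")
    best = [[neg] * m for _ in range(n)]  # best total overlap ending at (i, j)
    back = [[0] * m for _ in range(n)]    # slide index chosen for segment i-1
    best[0][0] = score[0][0]
    for i in range(1, n):
        for j in range(m):
            stay = best[i - 1][j]
            advance = best[i - 1][j - 1] if j > 0 else neg
            prev, src = (stay, j) if stay >= advance else (advance, j - 1)
            if prev > neg:
                best[i][j] = prev + score[i][j]
                back[i][j] = src

    # Backtrace from the best-scoring slide for the last segment.
    j = max(range(m), key=lambda k: best[n - 1][k])
    path = [0] * n
    for i in range(n - 1, -1, -1):
        path[i] = j
        j = back[i][j]
    return path


def slide_time_ranges(segments: List[Segment], path: List[int]) -> Dict[int, Tuple[float, float]]:
    """Collapse the per-segment assignment into one time range per slide."""
    ranges: Dict[int, Tuple[float, float]] = {}
    for (start, end, _), j in zip(segments, path):
        s, e = ranges.get(j, (start, end))
        ranges[j] = (min(s, start), max(e, end))
    return ranges


# Hypothetical sample data, for illustration only.
segments = [
    (0.0, 5.0, "today we talk about speech recognition errors"),
    (5.0, 10.0, "recognition errors and unknown words are problems"),
    (10.0, 15.0, "our approach aligns slides with the speech"),
    (15.0, 20.0, "keyword search is then done by slide"),
]
slides = [
    "Problems: recognition errors, unknown words",
    "Approach: align slides and speech",
    "Result: keyword search by slide",
]

path = align_segments_to_slides(segments, slides)
print(path)                               # [0, 0, 1, 2]
print(slide_time_ranges(segments, path))  # {0: (0.0, 10.0), 1: (10.0, 15.0), 2: (15.0, 20.0)}
```

With such a mapping, a keyword found in a slide's text can be resolved to the time range of that slide in the video, so the error-prone recognition output is used only for alignment and not as the index itself.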
© The Association for Natural Language Processing